The New C: It All Began with FORTRAN

新的C语言:一切从FORTRAN开始

By Randy Meyers, November 01, 2000


Sometimes the best way to improve a language is to make it look more like the one it set out to obsolete 30 years earlier.

有时候改进一种语言最好的方式就是,让它更像一种快要过时的30年前的语言。


It all began with FORTRAN.


In this case, I am not referring to the fact that FORTRAN was the first programming language, but that a disagreement in the 1960s on how to implement parameter passing in FORTRAN accidentally gave FORTRAN a huge performance advantage on supercomputers in the 1970s, and eventually led to a new feature in C99 in the 1990s. The new C99 feature is restricted pointers, and one of the best ways to understand the motivation for restricted pointers is to follow the history of the accident in FORTRAN.


In FORTRAN, unlike C, if a function assigns a new value to a parameter, the value of the actual argument passed to the function will change, and after the function returns, the caller will see the new value of the argument. Consider the FORTRAN code in Example 1 below. If you call F passing a variable Y as an argument, after F returns, Y will equal 6.


一切从FORTRAN开始


    在这里,我并不是在陈述FORTRAN是第一种编程语言这个事实[a]。但是,在20世纪60年代一个关于如何实现FORTRAN中参数传递的分歧,意外的使FORTRAN在70年代的巨型机上有了巨大的性能提升,并最终导致90年代的C99增加一个新特性。该特性就是受限(restricted)指针,一个了解受限指针来由的最好方式是,回顾FORTRAN中有关这个意外的历史。


    在FORTRAN中,跟C不一样,如果一个函数给形参赋了一个新值,被传入的实参数值也会被改变,在函数返回以后,调用者得到这个参数的新值[b]。考虑下面Example 1中的FORTRAN代码。如果你使用变量Y作为参数调用函数F,在F返回以后,Y的值是6。


Example 1:

SUBROUTINE F(X)
INTEGER X
X = 6
END

Here is where the disagreement came in. Different FORTRAN compilers chose one of two different ways to achieve the FORTRAN semantics for parameter passing. The first way was "by reference" parameter passing, or the way that your typical C programmer would write a function that modifies a variable in its caller. The function is passed the address of the argument and uses indirection everywhere when accessing the parameter. If the FORTRAN compiler produced C code as its output, the result would be something like Example 2.

    这就是分歧所在。不同的FORTRAN编译器选择了两种不同方法中的一种来实现FORTRAN语义中的参数传递。第一种方法是“引用(by reference)”传递参数,也就是典型的C程序员使用的:编写一个函数修改调用者的变量[c]。参数的地址被传入这个函数,任何地方存取这个参数都需要间接存取。如果FORTRAN编译器产生C代码输出的话,结果会像Example 2那样。

Example 2:
void f(int *x)
{
    *x = 6;
}
However, on some machines, indirection was fairly expensive compared to directly referencing a local variable. This led to the second way to implement FORTRAN parameter passing. The address of the actual argument is still passed to the function, but the function when entered makes a local copy of the argument, uses the local copy throughout the body of the function, and then copies the local copy back to the original argument before returning. Such a FORTRAN compiler would produce code that looks something like Example 3. While "copy in/copy out" parameter passing adds overhead to entering and returning from a function, if the parameter is referenced just a few times, and the (expensive on some machines) indirection is eliminated, the net result is increased performance (on some machines).

    然而,在一些机器上,比起直接引用一个局部变量,间接访问的代价要大得多。这就导致了实现FORTRAN参数传递的第二种方法。实际参数的地址仍然被传出函数,但是在进入这个函数的时候产生一个该参数的局部副本(copy)[d],在整个函数体中使用这个局部副本,并在返回之前把这个局部副本复制回原来的参数。这样的FORTRAN编译器会产生像Example 3的代码。尽管“复制入/复制出(copy in/copy out)”传递参数增加了进入和返回函数时的开销,但是如果一个参数仅仅引用数次,并且排除了(一些机器上昂贵的)间接访问,最终的结果是提高了性能(在一些机器上)。

 Example 3:
 void f(int *x)
{
    int xcopy = *x;
    xcopy = 6;
    *x = xcopy;
}
Many times, the details of how a compiler implements a language feature are just "implementation details" in the usual sense. They do not affect the way programmers write programs, and programming language standards committees allow implementations to freely choose among the alternatives. However, FORTRAN programs could produce different results depending on the parameter passing mechanism used. Consider the FORTRAN code in Example 4, and the two different translations of that code into C.

   
许多时候,编译器如果实现一个语言特性的细节仅仅是通常意义上的“实现细节”。它们并不影响程序员编写程序的方式,编程语言标准委员会允许实现自由的在各个可选方案中选择。然而,跟据不同的参数传递实现机制,FORTRAN程序会产生不同的结果。考虑Example 4中的FORTRAN代码,以及C代码的两种不同翻译。

Example 4:
SUBROUTINE G(X, Y)
INTEGER X, Y
X = 1 + X
Y = 5 + Y
END

// Translation using "by reference"
// “引用”的翻译
void g(int *x, int *y)
{
    *x = 1 + *x;
    *y = 5 + *y;
}

// Translation using
// "copy in/copy out"
// “复制入/复制出”的翻译
void g(int *x, int *y)
{
    int xcopy = *x;
    int ycopy = *y;
    xcopy = 1 + xcopy;
    ycopy = 5 + ycopy;
    *x = xcopy;
    *y = ycopy;
}
The G subroutine adds different constants to both its parameters. If you pass two arguments, variables A and B, to the function G, and before the call A has the value 1 and B had the value 10, then as you would suspect, after the call, A would have the value 2 and B would have the value 15. This would be true regardless of which implementation of FORTRAN parameter passing was used.

However, consider what happens if you pass A (initial value 1) as both parameters. If "by reference" parameter passing is used, after the call A has the value of 7. A is updated during the assignment to *x, and then again during the assignment to *y, since both x and y point to A. In contrast, if "copy in/copy out" parameter passing is used, then after the call to G, A has the value 6. Distinct copies of A were made inside G, and the additions were made separately to each copy. Both copies were eventually assigned back to A, but the last assignment "won."

The two alternative parameter passing mechanisms are an example of the types of tradeoffs that the definers of any programming language have to face. Should the language require a particular implementation technique, possibly at the expense of efficiency? Should the language features be changed to avoid some controversy? The FORTRAN definition decided in favor of efficiency to allow either parameter passing mechanism. Once that decision was made, certain types of programs became non-portable, and were "outlawed."

The FORTRAN 66 standard contained all sorts of rules that seemed completely unmotivated to many programmers. You can pass any one variable to a function at most once in the argument list. If you pass a variable as an argument to a function, that function cannot also reference the variable as a global (FORTRAN COMMON). If you pass a variable to a function, you cannot also pass anything, and the function can not reference anything, that overlaps it in storage (FORTRAN EQUIVALENCE). No program that follows those rules can determine which parameter passing mechanism is used.

About ten years later, the Cray 1 supercomputer required highly optimizing compilers to allow traditional programs to make use of the machine's vector registers, and thus achieve supercomputer performance. Consider the program in Example 5. The most efficient code for the function is to load the array pointed to by x into a vector register, then load the array pointed to by y into a vector register, and then do a vector add instruction adding the two vector registers together. If the compiler generates vector instructions, rather than a traditional loop accessing the array an element at a time, the code may run an order of magnitude faster.

   
子例程G给它的两个参数加上两个不同的常量,如果你传递两个参数,变量AB给函数G,在调用前A的值是1,B的值是10。如你所料,在这次调用以后,A的值是2,B的值是15。这里确实不需要考虑FORTRAN采用了哪种参数传递的实现。

    不过,想象一下如果你同时以A(初始值为1)作为两个参数。如果使用“引用”参数传递,在调用以后A的值是7。当对*x赋值的时候更新了A的值,然后在对*y赋值的时候A的值再次被更新了,因为xy同样指向A。相反的,如果使用“复制入/复制出”参数传递,在对G的调用以后,A的值是6。在G中产生了A的不同副本,每个副本的加法都单独的进行。两个副本最终都被复制回A,但是最后一次的赋值“获胜”。

    这两种可选的参数传递机制是一个例子,权衡要采用哪一种方式,任何编程语言的定义者都不得不面对。一种语言需要采用特定的实现技术,甚至可能以效率为代价吗?需要改变语言的特性来避免争议吗?Fortran语言的定义为了有利于提高效率,允许两种参数传递机制。一旦做出决定,某些程序变成不可以移植的,并且是“不合法的”。

    FORTRAN66标准包含了各种各样的规则,可能会误导许多程序员。你最多只能把任一变量传入参数表中一次。如果你把一个变量作为参数传递给函数,这个函数就不能在全局范围内引用这个变量(FORTRAN中的全局变量叫做COMMON)。如果你传递了一个变量给函数,你就不能再传递任何与该变量在存储空间上重叠的变量(FORTRAN中叫做EQUIVALENCE),该函数也不能再引用任何这些重叠的变量。遵循这些规则的的程序,都无法知道到使用的是哪一种参数传递机制[e]

    大约10年以后,巨型机Cray 1需要高度优化的编译器,使传统的程序能够利用该机器的向量寄存器,来提高巨型机的性能。考虑Example 5的程序。对于这个函数来说,最高效的代码是把x指向的数组装载入一个向量寄存器,然后把y指向的数组装加入另一个向量寄存器,然后执行一次向量加法指令把两个向量寄存器相加。如果编译器产生向量指令,而不是传统的每次循环都存取数组中的一个元素,这样的代码运行起来可能快一个数量级。

Example 5:
void
vector_add(float *x, float *y,
float *result)
{
    int i;
    for (i = 0; i < 64; ++i)
        result[i] = x[i] + y[i];
}
The optimizer in a compiler can certainly transform a loop into a sequence of vector instructions, but the question is whether the sequence of vector instructions is really equivalent to the original loop. Loading all the x array and the y array into vector registers before doing any stores into the result array works only if you know that the result array is distinct from both x and y. Consider what happens if result points to x[1]. Then result[0] is really x[1], and result[i] is the same as x[i+1]. Each iteration of the loop stores values referenced in the next iteration. If you load x into a vector register before doing stores into result, the values calculated change. It is at this point the accident in the definition of FORTRAN comes in. In order to avoid requiring a particular parameter passing mechanism, the FORTRAN Standard had exactly the right set of rules to allow a vectorizing compiler to assume that x, y, and result are all distinct, non-overlapping arrays. Thus, by accident, FORTRAN had a major performance advantage on vector machines.

 
    编译器的优化程序确实可以把一个循环转换成一系列的向量指令,但问题是,这一系列的向量指令是否真的跟原来的循环等价。把数组xy完全装载入向量寄存器,只有知道结果数组跟xy都不同的情况下,才能对结果数组进行任何存储操作。想想如果result指向x[1]的话会怎么样。那么result[0]实际上就是x[1],而result[i]与x[i+1]相同。循环中每一次迭代存储的值是下一次迭代要引用的值。如果你在把存入结果数组以前就已经把x载入了向量寄存器,这个数值的计算结果就改变了。这就是FORTRAN的定义带来的意外。为了避免使用某种特定的参数传递机制,FORTRAN标准已经精确地制定了恰当的规则,使向量编译器假设xyresult都是不同的,非重叠的数组。因而FORTRAN意外地在向量计算机上有了巨大的性能优势。

This performance advantage also carried over into parallel processor machines. The goal for an optimizing compiler for a parallel processor is to divide a loop into several ranges of iterations that can be done by separate processors working independently of each other. Thus, the loop in vector_add might be divided into two ranges: one processor might handle iterations 0 to 31 while another processor simultaneously does iterations 32 to 63. Work can be divided up among different processors only if the compiler knows that the results of the iterations done by one processor are not needed by a different processor working simultaneously. Proving this usually means the compiler needs to determine that different arrays are distinct, non-overlapping objects. Again, the rules in FORTRAN are exactly what a compiler needs in order to prove it can automatically parallelize a program.
 
    这种性能优势也延续到并行处理计算机上。对并行处理器上的编译器的优化目标是,把一个循环划分为若干个能由独立的处理器单独处理的迭代范围。因此,在函数vector_add中的循环可能划分成两个范围:一个处理器处理0至31的迭代,同时另一个处理器也在处理32到63的迭代。只有在编译器知道一个处理器不需要另一个处理器迭代计算结果的时候,才能把工作分给不同的处理器完成。要保证这些意味着编译器需要知道不同的数组都是单独的、不重叠的。又一次的,FORTRAN的规则正好满足编译器的需要,确保它能够自动的并行化一个程序。

While the FORTRAN performance advantage started out on supercomputers, it is now present on your typical, modern, general-purpose microprocessor. These days, most general-purpose microprocessors have a superscalar design:
 
    FORTRAN在巨型机上显现出来的性能优势,现在正展现在你的典型、现代、通用微处理器上。在这个时候,大多数通用微处理器使用了超标量设计:

A compiler for a superscalar machine has to carefully schedule the instructions it generates, and interleave instructions from different computations, to achieve maximum performance. For example, loading a memory value into a register takes a long time. If you attempt to use the value in the register before the value from memory arrives, the processor will stall. So, the compiler will try to generate a load instruction, followed by several instructions that do not need the result of the load, followed by the instructions that use the result of the load. This gives the load time to finish, and keeps the processor busy.

Consider vector_add from Example 5. A typical optimization performed for a superscalar processor is to load x[i+1] and y[i+1] into registers at the start of the iteration for i. This means that each iteration of the loop conveniently has its operands fetched from memory for it by the previous iteration, and thus never has to wait for the slow memory system to provide a value.

However, this prefetching works only if the results of one iteration of the loop does not affect the operands fetched by the next iteration of the loop. Proving this usually means knowing that the objects operated upon by the loop are distinct and non-overlapping.

    一个超标量计算机的编译器必须仔细的安排它产生的指令,交错不同计算的指令,以达到最好的性能。例如,从主存总装载一个值到寄存器中需要较长的时间。如果 你在主存中的值到达以前,就尝试使用寄存器中的指,处理器将会暂停。因此,编译器将会尝试产生一个装载指令,后面接着数条不需要该装载结果的指令,然后才 是需要该装载结果的指令。这给了装载操作一些时间来完成,并保持处理器一直在工作状态。

    考虑Example 5中的vector_add,一种超标量处理器的典型优化措施是,在迭代 i 开始的时候就装载 x[i+1]y[i+1] 到寄存器中。这意味着,循环中的每一次迭代已经在前一次迭代中从主存中取出了操作数,因此永远不必等待缓慢的主存系统来提供一个值。

    然而,只有在循环中一条迭代指令的结果不影响下一次迭代提取的操作数的时候,这个预取工作才能进行。要提供这些通常意味着得知道循环中要处理的对象是单独并且不重叠的。

Clearly, C could not concede this performance advantage to FORTRAN forever. Note that FORTRAN's advantage over C comes only because C programs use pointers. A C compiler can perform any of the optimizations discussed when it sees individual objects being operated upon directly. The problem is when objects are operated upon indirectly through pointers. Since, in general, the compiler has no idea which object a pointer is referencing, the compiler cannot prove a pointer references a distinct object.

    显然,C比起FORTRAN永远不能在性能占优势。注意,FORTRAN相比C的优势只是因为C程序使用指针。当直接操作单独的对象时,C编译器可以采取任何上述的优化措施。问题是,这些对象是通过指针间接的操作的。因为,通常来说,编译器无法知道哪个对象是一个引用[g],编译器无法保证一个指针引用的是一个独特的对象。

One approach used by some C compilers is to provide an optional, "unsafe" level of optimization beyond normal optimization. If unsafe optimization is enabled when the program is compiled, the compiler will assume that all pointers point to distinct objects, and the resulting program will run faster. Unfortunately, if the programmer meant for some pointers to reference the same object, the program will likely also get incorrect results.
 
    某些C编译器的解决办法是,在通常的优化以外,提供一个可选的,“不安全”的优化级别。如果在编译程序的时候启用了不安全的优化,编译器会假设所有指针都指向不同的对象,并最终使程序运行得更快。不幸的是,如果程序员的本意是使用一些不同的指针来引用一个相同的对象,这个程序还可能得到错误的结果。

A better solution is to allow the programmer to mark which pointers in the program point to distinct objects. This is the new "restricted pointer" feature in C99. (Some C++ compilers also support restricted pointers; however, they are not part of Standard C++.) C99 has a new keyword, restrict, that is a type-qualifier like const and volatile. Syntactically, restrict can appear wherever const and volatile can. However, semantically, restrict must modify a pointer type. This usually means that restrict follows the * in a pointer declaration. The following snippet shows some examples.

     一个更好的解决办法是,允许程序员标记程序中的哪些指针是指向独特的对象。这就是C99的新特性“受限指针”。(一些C++编译器也支持受限指针;然而,它们不是C++标准的一部分)。C99有一个新的关键词restrict,这是像constvolatile 那样的类型限定符。在语法上, restrict 可以出现在任何 constvolatile 能出现的地方。然而,在语义上,restrict 只能修饰指针类型。这通常表示 restrict 出现在指针声明的 * 后面。下面的片段展示了一些例子。

Example 6:

int *restrict p1; // OK, p1 is a restricted pointer to int
                             // 合法,p1是一个指向整型的受限指针
restrict int *p2; // invalid: pointer to restricted int
                             // 非法:指向一个受限整型
typedef int *intptr;
restrict intptr p3; //OK, p3 is a restricted pointer to int
                                // 合法,p3是一个指向整型的受限指针


C99 also adds a new syntactic spot where any type qualifier can appear: after the [ in an array parameter declaration. Remember, a parameter of type array is really a parameter of type pointer. The type qualifiers following the [ act as if they followed the * when the parameter's type is rewritten as a pointer type. Thus:

C99 还增加了一个语法点:任何类型限定符可以出现在数组声明的 [ 后面。记住,一个某种类型的数组参数,实际上就是这种类型的指针。在 [ 后面的类型限定符,当参数类型重写成指针类型时,就如同它们出现在 * 后面。

 int f(float array[const restrict]);

is the same as:

等同于:

int f(float *const restrict array);

There's a bug in the C99 Standard — the change to the grammar to allow type qualifiers following thein an array parameter declaration was made only to the grammar rule when the parameter is named. The change should have also been made for unnamed parameters. I hope that C compilers will accept the unnamed parameter case as a common extension until the C Standard can be corrected:

    在C99标准中存在一个bug——从语法上允许类型限定符出现在数组参数声明的 [ 后面,这个改动只有该参数被命名的时候该语法规则才生效。这个改动本应该也支持未命名参数。我希望在标准正确以前,C编译器能针对未命名参数的情况接受一个常用的扩展。

 int f(float [restrict]); // common extension
                                         // 常用的扩展

As you probably suspect, the rough meaning of a restricted pointer is that it provides a unique way to access a unique, non-overlapping object in the program. In other words, the object accessed through this pointer can be assumed to be unique from any other object being referenced in the code during the lifetime of the restricted pointer, or until the restricted pointer's value changes. This allows an optimizer to safely make the sorts of optimizations discussed above, as well as other optimizations such as avoiding recomputing common (duplicate) subexpressions involving access through pointers.

    正如你可能猜测的那样,受限指针的粗略意义在于在程序中提供一个唯一的方法来存取一个唯一的、非重叠的对象。换句话说,通过该指针访问的对象,在受限指针的生存期内,或者该受限指针的值改变以前,都可以被假定为不同于代码中任何其他引用的对象。这让优化程序能够安全的采用上述的各种优化措施,以及其他的优化措施,例如:避免重复的计算涉及通过指针访问的通用(相同)子表达式[h]

There is an exception to the rule that a restricted pointer during its lifetime provides the sole access to an object. If the object is never modified in any fashion during the lifetime of the restricted pointer, then the object may be accessed not only by the restricted pointer but other ways as well. Such access does not interfere with any optimizations.

    受限指针在它的生存期内提供了存取某个对象的唯一途径,这个规则有个例外。如果在受限指针的生存期内没有对对象做过任何形式的修改,这个对象也许不只是通过这个受限指针,还可以通过别的途径来访问。这样的访问不影响任何优化措施。

In order to understand the "uniqueness" rules of restricted pointers, you might want to think of them in terms of "copy in/copy out" semantics. Pretend that when you assign an address to a restricted pointer, a copy is made of the entire part of the object that is being referenced though the restricted pointer. The restricted pointer then points to the copy instead of the original object. Magically, the restricted pointer remembers the address of the original object, and if you ever change the value of the restricted pointer, or the restricted pointer goes out of scope, then the copy of the object is copied back to the original object referenced. This pretend "copy in/copy out" transformation breaks your program only if the program violates the uniqueness rules of restricted pointers.

    为了理解受限指针的“唯一性”规则,你可以把它们想象成某种“复制出/复制如”语义。假如当你给一个受限指针赋了一个地址的时候,通过这个受限指针引用的对象被完整的复制了一个副本。这个受限指针接下来就指向这个副本,而不是原来的对象。奇妙的是,这个受限指针记得原有对象的地址,如果你在任何时候改动了该受限指针的值,或者该受限指针超出了生存期,对象的副本就被复制回所引用的原始对象。只有在违反受限指针唯一性规则的情况下,假设的“复制入/复制出”变换才会打破你的程序[i]

Here are some examples. If someone stores into the original object during the lifetime of the restricted pointer, then accesses through the restricted pointer will fetch the unmodified values from the copy, not the updated values from the original object. If the restricted pointer is used to modify the object, then accesses made not through the restricted pointer during the lifetime of the restricted pointer will see the old values from the original object, not the updated values in the copy. On the other hand, if your program works even if the pretend copy in/copy out transformation was performed, then it is safe to use restricted pointers, and doing so might enable many useful optimizations.

    这里有一些例子。如果有人在受限指针的生存期内,(把数据)存入了原始对象, 那么通过访问这个受限指针取得的将是来自副本中未修改的值,而不是原始对象中更新过的值。如果通过这个受限指针来修改这个对象,在该受限指针的生产期内,不是通过该受限指针(对对象)的访问将会看到来自原始对象的旧值,而不是副本中更新过的值。另一方面,如果在假设实施了“复制入/复制出”变换的情况下,你的程序仍然工作,那么使用受限指针是安全的,这样做也许能够开启许多有用的优化。

The lifetime of a restricted pointer is the period of time during the execution of the program that the pointer itself would be allocated. For an automatic variable, it is the block in which it is declared is active. For a parameter, it is until the function body returns. For a file-scope variable, it is until main returns.

    受限指针的生存期就是程序执行时该指针本身被分配的那段时间。对于自动变量来说,就是其定义所在的代码块活动的时候。对于参数来说,直到函数体返回。对于文件范围内的变量来说,直到main函数返回。

As I stated above, the uniqueness property applies only to the parts of the object actually accessed through the restricted pointer. For example, if vector_add from Example 5 declared all three of its parameters to be restricted pointers, it would be valid to make the call:

    如我在上面所述的,唯一性属性仅仅作用于对象中实际通过受限指针存取的部分。例如:如果Example 5中的 vector_add 把它的三个参数都声明为受限指针,这样调用是合法的:

float data[192];
vector_add(&data[0], &data[64], &data[128]);

Since vector_add accesses only 64 elements from any of the arrays pointed to by its parameters, all of the accesses of the data array are unique to one of the three restricted pointer parameters. The data array is really being treated as three, unique, non-overlapping arrays. However, if the second argument was &array[63], the call would be invalid since the x parameter and the y parameter were both accessing the same storage.


    因为vector_add 只从它的参数指向数组中存取64个元素, 所有对数据数组的访问,对于三个受限指针参数中的任意一个来说都是唯一的。该数据数组真的被看作3个不同的、不重叠的数组。然而,如果第二个参数是 &array[63],这样的调用是不合法的,因为 参数 x 和参数 y 同时访问了相同的存储位置。


It is best to confine your use of restricted pointers to code that is fairly straightforward. If you have a hard time following how your pointers are used, there is an increased danger that you are inadvertently making invalid references to the object referenced by the restricted pointer not using the restricted pointer (you have invalid aliasing).


    使用受限指针最好限制在相当简单的地方。如果你对跟踪你的指针是怎么使用的有困难,会增加一些危险:你不慎没用通过受限指针,却对受限指针引用的对象作了一个非法的引用(你有一个非法的别名)[j]


This brings us to the subject of debugging. If your program doesn't work, or stops working in the future, and you use restricted pointers, you might want to see if the problem is invalid aliasing. Some compilers have an option to merely ignore restrict, which allows you to test if aliasing is the problem. If your compiler lacks such an option, define restrict to be a macro with no body when compiling your program. Since the only meaning of restrict is to permit additional optimization, it can be safely deleted from any program.


    这给我们带来一些调试的话题。如果你的程序在未来不工作了,并且你使用了受限指针,你可能想要看看问题时不是出在非法的别名上。一些编译器提供了一个仅仅忽略 restrict t选项,允许你测试问题是否在别名上。如果你的编译器缺少这样一个功能,在编译你的程序的时候把 restrict定义为一个空的宏。因为 restrict 的唯一意义是允许额外的优化,它能够安全地从任何程序中删除。


In addition to optimization, you can also use restrict to warn a user of your code that the code just will not work if you pass in pointers to the same or overlapping objects. For example, the C89 standard contained a statement at the beginning of the library section that, unless stated otherwise, you must not pass pointers to the same or overlapping objects to the functions in the standard library. This ruled out lots of silly uses of the library, like calling sprintf with the pointers to the format string and the result string being the same. The C99 Standard still contains that prose prohibition, but emphasizes the requirement by making most of the pointer arguments in the prototypes in the library restricted pointers. The restrict keyword allows the specification in the C Language itself a prohibition that previously required English text.


    除了优化,你还可以使用 restrict来警告你的代码的用户:在你使用指针传递相同的或重叠的对象时,这些代码是行不通的。例如,C89标准在标准库部分(library section)的开头包含了一个声明:除非另有说明,否则你不得往标准库中的函数传递指向相同或重叠对象的指针。这样排除了许多对标准库愚蠢的使用, 像是调用 sprintf 而指向格式字符串和结果字符串的指针却是相同的。C99仍然包含了该散文禁令(prose prohibition),但是强调了这个需求:把大部分标准库原形中的指针参数标记为受限指针。 restrict 关键字允许C语言本身作为一个规格说明禁令,之前需要用英语文本。


There you have it. Thirty-three years of programming language history and a new keyword in C.


    你了解到了,33年的编程语言历史,以及C语言中的一个新关键字。


Randy Meyers is consultant providing training and mentoring in C, C++, and Java. He is the current chair of J11, the ANSI C committee, and previously was a member of J16 (ANSI C++) and the ISO Java Study Group. He worked on compilers for Digital Equipment Corporation for 16 years and was Project Architect for DEC C and C++. He can be reached at rmeyers@ix.netcom.com.


Randy Meyers 是为C、C++和JAVA提供培训和指导的顾问。他目前是ANSI C委员会J11的主席,之前是J16(ANSI C++)和ISO JAVA学习小组(ISO Java Study Group)的成员。他曾经在DEC公司(Digital Equipment Corporation)研究编译器长达16年,并且是DEC C和C++的项目架构师。可以通过以下地址与他联系:rmeyers@ix.netcom.com。

注释
[a] FORTRAN 是第一种高级程序语言。
[b] C语言中所有的参数传递都是传值。因而直接修改参数的值不会改变调用者中原来的值。
[c] 如上面所说的,C语言只有传值,没有引用,但是可以用指针来模拟这个特性。
[d] 我不喜欢“拷贝”这个翻译,但是“副本”可能用得少,似乎也并不见得比前者好。
[e] 也不必知道,因为遵循这些规则以后写出来的程序是无歧义的。
[f] 我猜,以ia32为例,指的是类似leal这种指令,例如取整型数组ai个位置的值附给j ,指令可能是这样的:
;int j = a[i];
;%eax a的地址
;%ecx i的值
;%edx j的地址

leal (%eax, 4, %ecx), %edx
该指令把%eax + 4 * %ecx所指地址的值(也就是a[i])写入%edx指向的地址。
[g] 我猜这个可以用C++的“引用”来理解。
[h] 这有一个引申自《深入理解计算机系统》的小例子:
void func1 (int *x, int *y)
{
    *x += *y;
    *x += *y;
}

void func2 (int *x, int *y)
{
    *x += 2 * (*y);
}

void func3 (int *restrict x, int *restrict y)
{
    *x += *y;
    *x += *y;
}
在这里,func2的效率比func1要高,但编译器却不能把func1优化成func2的形式,想想如果x和y指向同一个对象。但是有了restrict 关键字,func3可以优化成func2的形式。
[i] 这句话翻译不好,意思是程序可能产生错误的结果。
[j] 什么是不合法的别名?这里有一个译者编造的例子:
int a[10];

void func1 (void)
{
    func2 (a);
}

void func2 (int *restrict p)
{
    p[0] = 0;
    a[0] = 1;    /* 不合法的别名:通过全局变量访问a,受限指针p同时也指向a */
    p[1] = 1;

    /* ... */
}

原文地址
http://www.ddj.com/cpp/184401313