The New C: Declarations & Initializations


By Randy Meyers, April 01, 2001

C99 provides many new options for when and how a variable begins its life. The benefit is code that is less buggy and more readable.



Oh, April! The first stirrings of Spring and new beginnings. It is appropriate therefore that this month’s column should be on the subject of initialization and declarations.


Unlike C90, C99 allows declarations to be intermixed with executable statements. Since declarations can have initializers that in some cases are run-time expressions, this raises questions about when those expressions are evaluated and when the initialization takes place.


C99 also makes two changes concerning the initialization of structs, unions, and arrays. The first change is that aggregates and unions are no longer limited to being initialized with constant expressions. The second change is a new syntax that allows you to initialize individual members or elements by name, and thus more easily initialize sparse objects and avoid the errors common to the traditional, positional brace-enclosed initializer syntax.




Constant Expressions

In this article, the term “constant expression” is going to come up several times. A constant expression is an expression whose value can be calculated before running the program. Typically, the compiler determines the value for constant expressions involving arithmetic types, and the linker or program loader determines the value of constant expressions involving addresses.


Not surprisingly, a constant expression cannot depend upon the value of any variable, since in general you must run the program to determine a variable’s value. The operands of constant expressions are constants, enumeration literals, sizeof where the operand is not a variable length array, or addresses of functions or static variables. (Such addresses are constant by the time the program is loaded into memory.)


A constant expression cannot have side effects, and so cannot use the assignment, increment, decrement, function call, or comma operators, unless those operators appear inside the operand of a sizeof operator, since sizeof does not evaluate the expression that is its operand.


There are places (particularly during initialization) where C requires expressions to be constant expressions. As discussed below, C99 requires that an expression be a constant expression in fewer places than C90.
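
To make these rules concrete, here is a minimal sketch. All names in it (BUFFER_COUNT, table, first_slot, table_bytes, counter_after_sizeof) are hypothetical and not from the original column. It shows an arithmetic constant expression, an address constant, and the fact that sizeof leaves a non-VLA operand unevaluated.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum { BUFFER_COUNT = 4 };

/* Arithmetic constant expression: the compiler computes the array size. */
int table[BUFFER_COUNT * 2];

/* Address constant: the value is fixed by the linker or program loader. */
int *const first_slot = &table[0];

/* sizeof applied to a non-VLA object is also a constant expression. */
const size_t table_bytes = sizeof table;

/* sizeof does not evaluate its operand here, so the increment never runs. */
int counter_after_sizeof(void)
{
    int counter = 0;
    size_t unused = sizeof(counter++);
    (void)unused;
    return counter;
}
```

Because the operand of sizeof is not evaluated, counter_after_sizeof returns 0. Some compilers warn about the unevaluated side effect, which is a reasonable hint that such code is usually a mistake.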



Mixed Declarations and Statements

C99, like C++, does not require that all declarations appear at the start of a block. The items declared by a declaration are not visible until after their declaration, and so cannot be used by any statements or declarations that precede it in the block.


Objects with static storage duration (objects declared with the static or extern keywords or declared at file scope) must be initialized with constant expressions. Such objects are initialized once, which happens before the program starts. (C++ treats these initializations the same way, but also permits run-time initializers with different rules.)


Objects with automatic storage duration (objects declared local to a function or block without the static or extern keywords) are not initialized until their declarations are reached. In fact, the declarations act like assignment statements and evaluate the initializer expression each time the declaration is reached and assign the value to the object declared. If the same declaration declares multiple initialized objects, the “assignments” occur in the order the objects are declared. (C++ also behaves this way.)
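
The contrast between the two storage durations can be sketched as follows. The function names bump_static and bump_automatic are hypothetical, chosen only for this illustration.

```c
/* Hypothetical helper names, for illustration only. */

/* Static storage duration: calls is initialized once, before the program starts. */
int bump_static(void)
{
    static int calls = 0;
    return ++calls;
}

/* Automatic storage duration: calls is initialized each time its declaration is reached. */
int bump_automatic(void)
{
    int calls = 0;
    return ++calls;
}
```

Three successive calls to bump_static return 1, 2, and 3, while bump_automatic returns 1 on every call.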


Consider Example 1 below. The program prints the values 0, 2, and 4 for j; and 0, 6, and 12 for k.

// Example 1
#include <stdio.h>

int main(void)
{
    int i = 0;

loop: ; // a label must precede a statement, so a null statement follows it
    // j and k are initialized each
    // time their declarations are
    // reached
    int j = 2*i, k = 3*j;
    printf("%d %d\n", j, k);
    if (++i < 3) goto loop;

    return 0;
}
Now for some advice that many of you will resist: when possible, turn the first assignment to a variable into an initialized declaration of that variable. If you delay declaring a variable until it is needed and you can initialize it with its first value, you can eliminate uninitialized variable bugs. Many C programmers hate this advice when they first hear it. (I know I did.) The usual reasons for resisting are habit and readability. Habit (“I always did it this way”) is poor justification for even an occasional bug. As to readability, this style is already common among C++ and Java programmers, and is quite readable once learned.
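
As a rough illustration of this style, here is a small sketch. The function average and its parameters are hypothetical and not from the original column. Each variable is declared at the point where its first value is known, so none of them ever exists uninitialized.

```c
#include <stddef.h>

/* Hypothetical function, for illustration only. */
double average(const int *values, size_t count)
{
    if (count == 0)
        return 0.0;

    /* Declared where it is first needed, with its first value. */
    long sum = 0;
    for (size_t i = 0; i < count; ++i)  /* declaring i in the for statement is also C99 */
        sum += values[i];

    /* mean is created already holding its final value. */
    double mean = (double)sum / (double)count;
    return mean;
}
```

In the C90 style, sum, i, and mean would all be declared at the top of the function, before the early return, and a later edit could easily read one of them before it was assigned.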

Of course, programs are not only art but also commercial artifacts. If you need your program to compile on platforms whose compilers do not support this C99 feature, you have a good excuse not to use it. However, be aware that this excuse is only a temporary one until the feature is more widespread.


Non-Constant Initializers

C90 required all the expressions in a brace-enclosed initializer list for a struct, union, or array to be constant expressions. For example,
int g(void);
int h(void);

void f(void)
{
    // line below is bad C90, but good C99
    int array[2] = {g(), h()};
}

In C90, the initialization of array is invalid, since function calls are not constant expressions. C99 removes this restriction for objects with automatic storage duration and allows any run-time expression to be used in the brace-enclosed initializer for such an object. Objects with static storage duration, regardless of type, still must be initialized with constant expressions in C99.

C99 does not require that the expressions in a brace-enclosed initializer list be evaluated in any particular order. In the last example, function g might be called before h, or h might be called before g. Be careful if your initializer list expressions have side effects, since you will not know the order in which they occur.
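
One way to cope with the unspecified order is sketched below, under the assumption that the order of side effects matters to the caller. The helpers next_id, take_id, fill_unordered, and fill_ordered are hypothetical names used only for this illustration.

```c
/* Hypothetical helpers, for illustration only. */
static int next_id = 0;

static int take_id(void)
{
    return next_id++;   /* side effect: every call changes next_id */
}

/* The two calls may happen in either order, so out[0] and out[1]
   may come out swapped from one compiler to another. */
void fill_unordered(int out[2])
{
    int pair[2] = { take_id(), take_id() };
    out[0] = pair[0];
    out[1] = pair[1];
}

/* Separate initialized declarations are evaluated in the order written,
   so first always receives the smaller id. */
void fill_ordered(int out[2])
{
    int first = take_id();
    int second = take_id();
    int pair[2] = { first, second };
    out[0] = pair[0];
    out[1] = pair[1];
}
```

Pulling the side effects into their own declarations costs two extra lines but makes the result independent of the compiler's evaluation order.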


Designated Initializers

A new feature, designated initializers, permits you to name the particular member or array element being initialized. For example:

struct S1 {
    int i;
    float f;
    int a[2];
};

struct S1 x = {
    .f=3.1,
    .i=2,
    .a[1]=9
};
As you can see, designators have two different forms. The first form is .member-name to indicate a member of a struct or union. The second form is [constant-expression] to indicate an array element. Just as you can combine multiple . and [] operators in expressions, you may combine multiple designators to indicate a member of a member, a member of an array element, an array element of a member, or an array element of an array element, and so on.
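
A short sketch of combined designators follows. The types point and shape, the object triangle, and the function coordinate_sum are hypothetical and exist only for this illustration.

```c
/* Hypothetical types and names, for illustration only. */
struct point {
    int x, y;
};

struct shape {
    const char *name;
    struct point corners[3];
};

const struct shape triangle = {
    .name = "triangle",
    /* member, then array element of that member, then member of that element */
    .corners[2].y = 7,
    /* member, then array element, initialized with its own nested list */
    .corners[0] = { .x = 1, .y = 2 }
};

/* Adds up every coordinate; anything not named above is zero. */
int coordinate_sum(const struct shape *s)
{
    int sum = 0;
    for (int i = 0; i < 3; ++i)
        sum += s->corners[i].x + s->corners[i].y;
    return sum;
}
```

Only three coordinates are named in the initializer, so coordinate_sum(&triangle) yields 1 + 2 + 7 = 10, with corners[1] and corners[2].x left at zero.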

In C, expressions written using the . and [] operators usually begin with an identifier that is the name of the object against which the . and [] operators are applied. Unlike expressions, designators do not begin with an indication of the object against which the . and [] designators are applied; designators are applied against the current object being initialized.

The current object being initialized changes if you nest brace-enclosed initializers within brace-enclosed initializers. The outermost brace-enclosed initializer initializes the complete object being declared, and any designated initializers that appear at that level of nesting are relative to the complete object. A nested brace-enclosed initializer whose position corresponds to an element or member whose type is array, struct, or union initializes that nested array, struct, or union subobject, and any designated initializer at that level of nesting is relative to that subobject.

Normally, each initializer in an initializer list initializes the next element or member of the object. However, designated initializers permit you to initialize the elements or members in any order. You may mix using and not using designators for the initializers. If an initializer without a designator follows an initializer with a designator, the element or member initialized by the initializer without the designator is the element or member that would normally follow the one specified by the designator.

Consider Example 2 below. All of the arrays in Example 2 are initialized to the same value. The use of designated initializers for a3 permits the initializers to be given in reverse order. The initialization of a4 illustrates how nested brace-enclosed initializers affect the current object to which designators are relative. The initialization for a5 shows the mixing of initializers with and without designators.

//Example 2

struct S2 {
    int x, y;
};

// Arrays a1, a2, a3, a4, and a5 are
// initialized to the same values.

struct S2 a1[3] = {1, 2, 3, 4, 5, 6};

struct S2 a2[3] =
{
    {1, 2},
    {3, 4},
    5, 6
};

struct S2 a3[3] =
{
    [2].y=6, [2].x=5,
    [1].y=4, [1].x=3,
    [0].y=2, [0].x=1
};

struct S2 a4[3] =
{
    // Current object is all of a4
    [0].x=1, [0].y=2,
    {
        // Current object is a4[1]
        .x=3, .y=4
    },
    // Current object is all of a4
    // Current position is [2]
    5, [2].y=6
};

struct S2 a5[3] =
{
    // After [2].x follows [2].y
    [2].x=5, 6,
    // After [0].x follows [0].y
    [0].x=1, 2,
    // after [0].y is [1]
    3, 4
}; 
You are permitted to initialize the same subobject more than once in the initializer for an object. The last initializer in the initializer list for the subobject overrides previous initializers. If you fail to initialize a subobject, the subobject is initialized with zeros of the appropriate types, as if the subobject was an object with static storage duration without an explicit initializer. In the following example, a6[0] and a6[2] are both initialized to 0, and a6[1] is initialized to 100.

int a6[3] = {[1]=5, [1]=100}; 
In a similar fashion, the square matrix a7 is initialized with all zeros except for ones along the diagonal.

double a7[3][3] =
    {[0][0]=1., [1][1]=1., [2][2]=1.}; 
You may use designated initializers to initialize an array of unknown size. The size of the array will be determined by the highest numbered element that has an explicit initializer. In the following example, a8 has 100 elements (maximum index value of 99); and a8[0] is initialized with 50, a8[1] is initialized with 51, a8[99] is initialized with 149, and all other elements are initialized with zero.

int a8[] = {[99]=149, [0]=50, 51}; 
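
A common application of this rule is a lookup table indexed by an enumeration. The sketch below is only an illustration; the names color, color_names, and color_name_count are hypothetical and not from the original column.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

/* The array size is taken from the highest designated index, COLOR_BLUE,
   and the entries may be listed in any order. */
const char *const color_names[] = {
    [COLOR_BLUE]  = "blue",
    [COLOR_RED]   = "red",
    [COLOR_GREEN] = "green"
};

/* Number of elements the compiler deduced for the array. */
size_t color_name_count(void)
{
    return sizeof color_names / sizeof color_names[0];
}
```

Because each entry names its own index, reordering the enumerators does not silently pair a name with the wrong value, which is the kind of error the positional syntax invites.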
Designated initializers are particularly useful when initializing a union. Under C90, if you initialized a union, the initializer was for the first member of the union, and there was no way to specify that the initializer was intended for a different union member. Designated initializers permit you to name the union member you are initializing. In the following, x.f is initialized to 3.14.

union {int i; float f;} x =
    {.f=3.14};
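
The difference from the C90 rule can be sketched as follows. The union tag number and both function names are hypothetical, introduced only for this illustration.

```c
/* Hypothetical type and function names, for illustration only. */
union number {
    int i;
    float f;
};

/* C90 style: an undesignated initializer always applies to the first member, i. */
int first_member_value(void)
{
    union number n = { 3 };
    return n.i;
}

/* C99 style: the designator selects the member f explicitly. */
float float_member_value(void)
{
    union number n = { .f = 3.14f };
    return n.f;
}
```

Without the designator, an initializer such as { 3.14 } would be converted to int and stored in i, which is rarely what was intended.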


Summary

C99 removes some of the restrictions in C90. Declarations need not be grouped at the beginning of a block. Instead they may appear at the point where the item declared is actually needed. Initialization of variables with automatic storage duration acts like assignment, and more of these initializations can use run-time expressions than in C90, where constant expressions were required. This promotes program correctness by allowing the elimination of uninitialized variables.

C99 also adds designated initializers, which allow the programmer to express more explicitly which members or elements are initialized, removes the dependency on initializers appearing in a particular order, and allows initializer lists to be more succinct by only containing initializers for non-zero subobjects.



Randy Meyers is a consultant providing training and mentoring in C, C++, and Java. He is the current chair of J11, the ANSI C committee, and previously was a member of J16 (ANSI C++) and the ISO Java Study Group. He worked on compilers for Digital Equipment Corporation for 16 years and was Project Architect for DEC C and C++. He can be reached at rmeyers@ix.netcom.com.




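Translator's notes

[a] "Aggregates" means the aggregate types, that is, structs and arrays.

[b] On the sentence about the sizeof exception in "Constant Expressions": the translator did not understand this sentence.

[c] On the sentence in "Designated Initializers" saying that designators are applied against the current object being initialized: the translator still did not understand it.

Original article

http://www.ddj.com/cpp/184401377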