The creators and critics of programming languages sometimes classify the data types in a programming language as to whether they are first class types or not. A first class type is one that has the full set of reasonable operations and possible uses defined for it. For example, arrays in C are not first class types because you cannot perform array assignment using the assignment operator or pass an entire array by value as an argument or return an array as a result from a function. In contrast, int in most programming languages is the quintessential first class type: Not only are all reasonable operators defined upon int, but you can also have arrays of int, pass int as an argument, return int from a function, and so on.
Originally, structs in C suffered many of the same deficiencies as arrays, but it was commonplace even before the ANSI C Standard for compilers to support struct assignments, struct arguments, and struct function return values. Structs in modern C are almost first class types, but they still lack support for comparisons for equality or inequality using the == and != operators. The C committee has entertained proposals for supporting == and != for structs, but the debate over how to treat union members of structs caused the proposal to be shelved.
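Because == and != are not defined for structs, equality has to be written out member by member. The following is only a sketch: struct PAIR and the helpers pair_equal and pair_copy are hypothetical names made up for this illustration, not part of any standard library.

```c
#include <stdbool.h>

/* Hypothetical struct used only for this illustration. */
struct PAIR { int a, b; };

/* Member-wise equality, written by hand because == is not defined
   for structs. memcmp is avoided here because any padding bytes
   have unspecified values. */
static bool pair_equal(struct PAIR lhs, struct PAIR rhs)
{
    return lhs.a == rhs.a && lhs.b == rhs.b;
}

/* Struct arguments, struct assignment, and struct return values
   all work in modern C. */
static struct PAIR pair_copy(struct PAIR src)
{
    struct PAIR dst;
    dst = src;      /* whole-struct assignment */
    return dst;     /* struct returned by value */
}
```

A caller would write pair_equal(p, q) where it might have wished to write p == q.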
You might wonder, if structs need the equality operators defined in order to be first class types, do they also need the relational operators, e.g. < or >, to be defined in order to be first class types? This brings us to the “reasonable” in the definition of first class types above. Consider:
struct S {int a, b;};
struct S x = {1,2};
struct S y = {2,1};
Given that x.a < y.a but x.b > y.b, is it more reasonable to say that x < y, or that x > y, or that no automatic, general definition of < and > on structs is reasonable? I would argue that it is unreasonable to provide a standard definition of < and > in the C language, since programmers lay out structs in order to minimize padding, or to match an externally declared layout, or in the order that members occur to them, and not in an order that results in a natural comparison order for < and >.
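When an ordering is needed, the programmer has to pick one explicitly. As a sketch of why no single choice is obviously right, here are two equally plausible orderings for struct S; the helper names cmp_int, compare_a_then_b, and compare_b_then_a are invented for this illustration.

```c
struct S { int a, b; };

/* Three-way compare of two ints: negative, zero, or positive. */
static int cmp_int(int l, int r)
{
    return (l > r) - (l < r);
}

/* Order by member a first, using b only to break ties. */
static int compare_a_then_b(struct S l, struct S r)
{
    int c = cmp_int(l.a, r.a);
    return c != 0 ? c : cmp_int(l.b, r.b);
}

/* Order by member b first, using a only to break ties. */
static int compare_b_then_a(struct S l, struct S r)
{
    int c = cmp_int(l.b, r.b);
    return c != 0 ? c : cmp_int(l.a, r.a);
}
```

For x = {1,2} and y = {2,1}, the first function reports x before y and the second reports y before x, so the "natural" answer depends entirely on which member the programmer considers more significant.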
Not surprisingly, students of programming language design at times disagree whether a particular operation or use of a type is reasonable or necessary in order to be a first class type. Dennis Ritchie pointed out [1] that some might not consider structs in C90 to be first class types because there are no constants of type struct.
C99 [2] added exactly that feature: constants of almost any type including struct, union, and array. This feature, called compound literals, is based on the brace-enclosed initializer syntax. The motivation for adding this feature to C99 was its notational conciseness, convenience, and usefulness, rather than an abstract desire to make struct a first class type.
Compound literals might or might not be constant, depending upon whether their programmer-specified type is const. Unlike a string literal, a non-const compound literal can be modified portably.
Here are some examples of compound literals:
struct POINT {int x, y;};
union U {float f; int i;};
(int) {1}
(const int) {2}
(float[2]) {2.7, 3.1}
(struct POINT) {0, 0}
(union U) {1.4}
The value of the compound literal is an anonymous object whose type is specified by the “cast.” The anonymous object has been initialized by the brace-enclosed initializer list. As the last three compound literals in the above example show, compound literals give you a constant-like notation for arrays, structs, unions, as well as any other object type (except for C99 variable length arrays).
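To see these literals in a context that compiles, the sketch below uses each one as an ordinary expression inside a function. The function name literal_demo and its local variable names are assumptions made for this illustration, and the floating constants carry an f suffix only to avoid double-to-float conversion warnings.

```c
struct POINT { int x, y; };
union U { float f; int i; };

/* Uses each example literal as an ordinary expression. */
static int literal_demo(void)
{
    int one = (int) {1};
    const int two = (const int) {2};
    float *pair = (float[2]) {2.7f, 3.1f};       /* array converts to a pointer */
    struct POINT origin = (struct POINT) {0, 0};
    union U u = (union U) {1.4f};                /* initializes the first member, f */

    /* 1 + 2 + 0 + 0 + 1 + 1 */
    return one + two + origin.x + origin.y
         + (pair[1] > pair[0]) + (u.f > 1.0f);
}
```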
A compound literal can be used anywhere an object with the same type as the compound literal could be used. For example,
int x;
x = (int) {1} + (int) {3};
is equivalent to
int x;
int unnamed1 = {1};
int unnamed2 = {3};
x = unnamed1 + unnamed2;
Compound literals are particularly useful as function arguments. For example, suppose you were using a graphics library that used struct POINTs to express coordinates. You might draw a pixel in a window like this:
extern void drawpixel(struct POINT where);
drawpixel((struct POINT) {5, 5});
Compound literals yield lvalues. This means that you can take the address of a compound literal, which is the address of the unnamed object declared by the compound literal. As long as the compound literal does not have a const-qualified type, you can use the pointer to modify it.
struct POINT *p;
p = &(struct POINT) {1, 1};
p->x = 2; p->y = 2;
printf("*p = %d, %d\n", p->x, p->y);
causes *p = 2, 2 to be printed.
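Because the address of a compound literal is an ordinary pointer, it can also be handed to a function that modifies the object it points to. A minimal sketch, assuming a hypothetical helper double_point and wrapper doubled_sum that exist only for this illustration:

```c
struct POINT { int x, y; };

/* Hypothetical helper: doubles both coordinates in place. */
static void double_point(struct POINT *p)
{
    p->x *= 2;
    p->y *= 2;
}

static int doubled_sum(void)
{
    /* The unnamed object lives until the end of this function body,
       so the pointer stays valid for the calls below. */
    struct POINT *p = &(struct POINT) {3, 4};

    double_point(p);        /* modifies the unnamed object */
    return p->x + p->y;     /* 6 + 8 */
}
```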
Compound literals are in effect declarations and initializations of unnamed objects that can appear in expressions. The unnamed objects and their initializations follow the same rules [5] as normal declarations, and have the same special treatment depending upon whether the compound literal appears within a function body or not.
If a compound literal appears outside of a function body, then the unnamed object has static storage duration, just like all other objects declared outside of a function. It is allocated and initialized once before the program begins to run and remains allocated as long as the program is running. Since the initialization occurs before running the program, all of the initializers in the brace-enclosed list must be constant expressions [5].
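As a sketch of the file-scope case, the pointer below is initialized with the address of a compound literal written outside any function. The names home and move_home are hypothetical; the point is only that the unnamed object persists, and keeps its modifications, across calls.

```c
struct POINT { int x, y; };

/* File scope: the unnamed object has static storage duration, and its
   initializers (10 and 20) must be constant expressions. */
static struct POINT *const home = &(struct POINT) {10, 20};

/* Each call modifies the same unnamed object. */
static int move_home(int dx)
{
    home->x += dx;
    return home->x;
}
```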
If a compound literal appears inside the body of a function, then the unnamed object has automatic storage duration and acts like a local variable of the immediately enclosing block. It is allocated and initialized when its “declaration” is reached in the block and deallocated upon exiting the block [5]. The expressions in the brace-enclosed initializer list can be any run-time expressions.
void f()
{
    int *p;
    extern int g(void);
    {
        p = &(int) {g()};
        *p = 1; //OK
    }
    // p points to deallocated
    // stack space
    *p = 2; //BAD
}
In the same way that the declaration and initialization of an automatic variable acts like an assignment to that variable [5], every time control passes through the body of a compound literal with automatic storage duration, the unnamed variable is reinitialized. Thus, the following function draws a diagonal line from (0, 0) to (9, 9).
void line()
{
    int i;
    for (i = 0; i < 10; ++i)
        drawpixel((struct POINT) {i, i});
}
The brace-enclosed initializer list for a compound literal has the same semantics as a brace-enclosed initializer list in a declaration. If you only provide initializers in the list for some of the members of a struct or elements of an array, the other members or elements are implicitly initialized with zeros of the appropriate type. Thus, (int [10]) {0} is an array of ten integers all initialized to zero. This means that it might be safer to assign to a struct using a compound literal rather than assigning its members individually. Contrast the following lines in a function:
struct POINT p;
p.x = x; p.y = y;
versus:
struct POINT p;
p = (struct POINT) {x, y};
Suppose in the future you add a z member to POINT to make it a three-dimensional point. When you assign the members individually, the z member never receives a value and contains stack trash. When a compound literal is used to assign p, the z member is assigned the default value of zero (probably a reasonable default for a 3-D graphics package).
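The effect of the implicit zero can be sketched with a three-member struct. The names POINT3 and assign_with_literal are assumptions for this illustration, standing in for the future version of POINT described above.

```c
/* Hypothetical 3-D version of POINT with an added z member. */
struct POINT3 { int x, y, z; };

static struct POINT3 assign_with_literal(int x, int y)
{
    struct POINT3 p;

    /* Only x and y are listed, so z is implicitly initialized to zero. */
    p = (struct POINT3) {x, y};
    return p;
}
```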
Like any other brace-enclosed initializer list, the initializer list in a compound literal may use the new C99 feature of designated initializers [5], where the member or array element being initialized may be named. When a function takes a struct as an argument, compound literals and designated initializers together give you a poor man’s version of keyword arguments and default argument values, as in:
drawpixel((struct POINT) {.y=12});
Here, the designated initializer .y acts like a keyword argument to the function, and the .x “argument” to the function receives a default value of zero.
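To make the defaulting visible, the sketch below replaces the graphics library's drawpixel with a stand-in that merely records its argument. The stand-in body, the last_drawn variable, and the two wrapper functions are assumptions made for this illustration and are not how a real library would behave.

```c
struct POINT { int x, y; };

/* Stand-in for the library routine: remembers the last point passed. */
static struct POINT last_drawn;

static void drawpixel(struct POINT where)
{
    last_drawn = where;
}

/* Names only y; x takes the implicit default of zero. */
static struct POINT draw_on_y_axis(void)
{
    drawpixel((struct POINT) {.y = 12});
    return last_drawn;
}

/* Designators may appear in any order, like keyword arguments. */
static struct POINT draw_reordered(void)
{
    drawpixel((struct POINT) {.y = 7, .x = 3});
    return last_drawn;
}
```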
Like normal declarations, if the type inside of the “cast” of a compound literal is an array of unknown size, then the number of elements of the array is determined by the brace-enclosed initializer. A compound literal with type array has the same semantics as a variable with type array. Except when used as the operand of sizeof or &, an array used in an expression is converted to a pointer to the first element of the array. In the following, p points to the first element of an array of three ints.
int *p;
p = (int []) {1, 2, 3};
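Both behaviors can be sketched together: under sizeof the literal keeps its array type, while as a function argument it is converted to a pointer to its first element. The helper names sum, literal_length, and literal_sum are hypothetical.

```c
#include <stddef.h>

/* Adds up n ints starting at v. */
static int sum(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; ++i)
        total += v[i];
    return total;
}

/* As the operand of sizeof, the literal is not converted to a pointer:
   its type is int[3], with the size taken from the initializer list. */
static size_t literal_length(void)
{
    return sizeof ((int []) {1, 2, 3}) / sizeof (int);
}

/* As an argument, the literal is converted to a pointer to its first element. */
static int literal_sum(void)
{
    return sum((int []) {1, 2, 3}, 3);
}
```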
Normally, every compound literal that you write results in a distinct unnamed object. However, if the type of the compound literal is const-qualified, and the compound literal is initialized with constant expressions, then the compiler is free to pool the compound literals (only store one copy) and to place the unnamed object(s) in write-locked storage. Such compound literals are true constants, not just literals.
Thus, programmers who worry about whether their types are first class types, and who consider “having a constant representation” to be a requirement, have one less thing to worry about.
[1] Dennis Ritchie. “The Development of the C Programming Language,” in Bergin and Gibson, editors, History of Programming Languages (Addison Wesley, 1996).
[2] ANSI/ISO/IEC 9899:1999, Programming Languages - C. 1999. Available in Adobe PDF format for $18 from <http://www.techstreet.com/ncitsgate.html>.
[3] ANSI/ISO/IEC 9899:1990, Programming Languages - C. 1990.
[4] ANSI/ISO/IEC 14882:1998, Programming Languages - C++. 1998. Available in Adobe PDF format for $18 from <http://www.techstreet.com/ncitsgate.html>.
[5] Randy Meyers. “The New C: Declarations and Initializations,” C/C++ Users Journal, April 2001.
Randy Meyers is a consultant providing training and mentoring in C, C++, and Java. He is the current chair of J11, the ANSI C committee, and previously was a member of J16 (ANSI C++) and the ISO Java Study Group. He worked on compilers for Digital Equipment Corporation for 16 years and was Project Architect for DEC C and C++. He can be reached at rmeyers@ix.netcom.com.
http://www.drdobbs.com/cpp/184401404