The rest of the story on C99 variable length arrays. Yes, they are well behaved and very flexible, but they still take some care to use.
My last few columns have dealt with VLAs (Variable Length Arrays) in C99 [1, 2, 3]. VLAs are arrays with run-time expressions instead of compile-time constant expressions for the bounds of the array. The bounds expression is evaluated when the declaration of a VLA is reached inside of a block, and the array has the calculated bounds until its lifetime ends (usually by exiting the block).
This column discusses the remaining feature of VLAs, VLA typedefs. I will also discuss flexible array members, a C99 feature similar to VLAs.
As I discussed in previous columns, the size of a VLA is needed at run time to perform indexing and address arithmetic, so the compiler must make arrangements to store the size of the array somewhere. However, the size is not stored in the array object itself. It is not stored as part of the pointer if you have a pointer to a VLA. The size of a VLA is an attribute of the VLA type [3].
Consider the following:

    void ex1(int n)
    {
        char (*pvla)[n];
        n += 10;
        printf("%zu", sizeof *pvla);
    }

pvla is a pointer to a VLA of n chars. In order to do pointer arithmetic with pvla or in order to be able to return the size of the objects to which pvla points, the compiler must calculate the size of a VLA of n elements of type char. Since the C99 rules say that a VLA’s size is fixed at the point the declaration of its type is encountered, the compiler must perform the calculation of the array’s size at the point of the declaration to protect against the value of the bounds expression changing later in the program. The function ex1 prints the size of the array to which pvla points. Since the size of an array of n elements of type char is just n, that is the value that the function prints. However, it prints the original value of n passed to the function, not the value of n after 10 has been added to it.
Note that the function ex1 works even though pvla is uninitialized stack trash. The program is perfectly valid because sizeof does not actually evaluate its argument: the uninitialized pointer pvla is not actually dereferenced. The sizeof operator only inspects its operand in order to determine the resulting type, and in C, size is an attribute of the type of an expression.
The function ex1 makes this clear. pvla does not actually point at an array, so the size information could not be stored as part of the array object. Likewise, pvla is uninitialized stack trash, so the size information could not be part of its value.
Instead, compilers generate code to record the size of VLA types in the program, not the VLA objects themselves. For every VLA type that occurs in a block, the compiler creates an unnamed automatic temporary variable that holds the size of that VLA type during its lifetime. When the type is executed by program flow of control reaching a declaration or cast involving a VLA type, the size of the VLA type is stored in the temporary variable. If the size of a VLA is needed, then the value is fetched from the temporary variable associated with the VLA type. When the block containing the VLA type exits, then the temporary variable is deallocated along with all of the other automatic variables.
Of course, a clever compiler might not create a temporary for every VLA type in a block. If the compiler can prove that several of the temporaries always hold the same value or that the temporaries are not used later in the block, the compiler might optimize them away.
Clearly, C99 compilers are proficient in handling the bookkeeping associated with VLA types. The C99 language builds upon that by allowing VLA typedefs.
Consider the following function:

    void ex2(int n)
    {
        typedef int VARRAY[n];
        n += 10;
        VARRAY a1, a2;
    }

The typedef declares VARRAY to be the name of the type “variable length array of n elements of type int,” where n has the value it had at the point the typedef declaration was executed. VARRAY is used to declare a1 and a2 to be VLAs of n elements of type int where n has the value it had when the typedef was executed. Thus, if you make the call ex2(5), a1 and a2 are both VLAs of five ints even though the value of n has been changed to 15 by the time a1 and a2 are declared. Of course, a1 and a2 can be used like any other arrays of five ints.
VLA typedefs follow the same rules as other VLA types. They can only appear in a block: they cannot appear at file scope. (VLA parameters are permitted because parameters are considered to be local to the function body.) The size of a VLA typedef is constant during its lifetime. The size is fixed when the typedef is executed. The size is no longer associated with the VLA typedef when the lifetime ends by either exiting the block or branching backwards in the block to a point before the typedef declaration [2]. VLA typedefs, like other VLAs, cannot be struct or union members.
If you run the example program with the command line:

    listing1 this is a test

you get the output (the first line is system specific):

    count=12, s="listing1.exe"
    count=4, s="this"
    count=2, s="is"
    count=1, s="a"
    count=4, s="test"

There are various rules that flexible array members must follow:
Unlike VLAs, the C implementation keeps no run-time information about the size of a flexible array member. It is the programmer’s responsibility to allocate the space for the array and remember the number of elements in the array. If you assign a struct with a flexible array member or pass it as an argument to a function (not through a pointer), then the compiler generates code based on its compile-time information about the struct type. Since the compiler believes that the flexible array member has no elements, no elements will be copied during assignment. If you want to assign structs that contain flexible array elements, you must make sure the target has the proper amount of memory allocated and then use memcpy or a loop to copy the flexible array elements.
As mentioned before, some pre-C99 compilers permit flexible array members. Some of those compilers use a slightly different syntax: rather than the flexible array member having no bounds inside the [], the compilers permit the array bounds to be zero. (Officially, arrays of zero elements are not permitted in C.) Programs that use the [0] form of the extension can be converted to C99 merely by removing the 0.
Unfortunately, in some cases, programmers relied on tricks before C99 to get the effect of flexible array members. Perhaps the most common form of that trick is to declare the fake flexible array with bounds 1. When allocating the struct with malloc, extra space for an array of one less than the desired number of elements was allocated since the struct already had one element built in. While this technique is likely to work for most C and C++ implementations, it does break the rules. A small number of C implementations generate code to check if array indexes are in bounds, and they will complain about any index other than 0 being used with the fake flexible array. (Such checking is automatically turned off when using a real C99 flexible array.)
[1] Randy Meyers. “The New C: Why Variable Length Arrays?” C/C++ Users Journal, October 2001.
[2] Randy Meyers. “The New C: Variable Length Arrays, Part 2,” C/C++ Users Journal, December 2001.
[3] Randy Meyers. “The New C: Variable Length Arrays, Part 3: Pointers and Parameters,” C/C++ Users Journal, January 2002.
Randy Meyers is a consultant providing training and mentoring in C, C++, and Java. He is the current chair of J11, the ANSI C committee, and previously was a member of J16 (ANSI C++) and the ISO Java Study Group. He worked on compilers for Digital Equipment Corporation for 16 years and was Project Architect for DEC C and C++. He can be reached at rmeyers@ix.netcom.com.
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// The representation of a PL/I string
struct PLIstring {
    unsigned short count;
    // s is a flexible array member
    char s[];
};

// Convert the C language string cstr to a PL/I string
// allocated on the heap
struct PLIstring *toPLI(const char *cstr)
{
    struct PLIstring *pli;
    size_t len = strlen(cstr);

    // We allocate len extra bytes as storage for the s array
    pli = malloc(sizeof (struct PLIstring) + len);
    assert(pli != NULL);
    // count is an unsigned short, so longer strings would be truncated
    pli->count = (unsigned short)len;
    // Copy len bytes into the flexible array s. Note the zero byte
    // ending the C string is not copied.
    memcpy(pli->s, cstr, len);
    return pli;
}

int main(int argc, char **argv)
{
    int i;

    // Convert our program arguments to PL/I strings and print them
    for (i = 0; i < argc; ++i) {
        struct PLIstring *pli = toPLI(argv[i]);
        // print the PL/I string. By specifying a precision for %s, we
        // can force it to stop printing before finding a zero byte.
        // By making the precision be *, we can pass it as an argument
        // to printf.
        printf("count=%hu, s=\"%.*s\"\n", pli->count, (int)pli->count,
               pli->s);
        // Release the string allocated by toPLI
        free(pli);
    }
    return EXIT_SUCCESS;
}
— End of Listing —