The New C: X Macros

By Randy Meyers, May 01, 2001

Assembly language programmers of the 1960s had to develop some great tools just to preserve their sanity. Some of those tools, such as X macros, are still potentially useful today.

Sometimes a good idea just gets lost. Assembly language programmers of the 1960s invented a clever way to use macros to initialize parallel tables. This technique works perfectly well in C and C++, but seems almost unknown. In some cases it can also replace, or enhance, the C99 feature known as designated initializers.

    

Designated Initializers

Designated initializers permit you to name in the brace-enclosed initializer the particular array element or member being initialized when you initialize an aggregate or union. (See my April 2001 column for more details.) As such, designated initializers are a partial solution to the software engineering problem of maintaining an enum type definition and a parallel array giving the names of the enum literals:


#include <stdio.h> 

enum COLOR {
    red,
    green,
    blue
};

char *color_name[] = {
    [red]="red",
    [green]="green",
    [blue]="blue"
};

int main() {
    enum COLOR c = red;
    printf("c=%s\n", color_name[c]);
    return 0;
}

Using designated initializers when initializing the array color_name has two advantages. First, it makes the initialization of the array independent of the order in which the enum literals are declared in the definition of enum COLOR. Even if you rearrange the definition of the enum literals so that red has the integer value 2, green has the value 0, and blue has the value 1, the array color_name will have its elements initialized with the proper values in the right elements.

Second, since the names of the enumeration literals are explicitly used in the initialization of color_name, they serve as a clue that, if you add a new enum literal to the definition of COLOR, you should also add a new initializer to color_name. (A comment, of course, would also be helpful.) Unfortunately, too often adding a new enum literal to a large program requires searching the program source files for the names of the old enum literals to find all of the switch statements, arrays, and other places that need to be aware of the new enum name. Note that in a large program, it is likely that the definition of enum COLOR would be in a header file along with an extern declaration for array color_name, and the definition and initialization of color_name would be in a separate source file since it must be compiled only once. An explicit reference to an enum literal is thus much better than an implicit one when you have to search several source files for uses of the enum.

The use of designated initializers by color_name can also be a mixed blessing. Consider what happens if a new enum literal white is added to COLOR between red and green, but no corresponding initializer is added to color_name. The array color_name will have the right number of elements, since [blue], the highest-numbered element, is still initialized. The elements [red], [green], and [blue] will be initialized correctly. However, the element [white] will not have been initialized and will default to a null pointer. (The use of designated initializers does not change the fact that you need not initialize all of the elements of an array, and compilers need not issue a message if you only initialize part of an array.) The program will work as long as you do not attempt to reference the name of white. Hence the mixed blessing: if you manage to put the program with this bug into production, you would probably prefer that it mostly work. However, during testing, you would probably prefer that the program fail spectacularly and obviously.


X Macros

A better solution would be to automatically obligate programmers to add the initializer when adding a new enum literal, and to provide a single place to add both, even if the enum definition was in a header file and the array initialization was in a separate source file. The assembly language programmers had a preprocessor-based solution to this problem.

Their solution depended upon features common to most preprocessors for most languages, and that are part of the traditional and standard preprocessors for C and C++. Those features are that macros can call other macros, that macros may be redefined, and that a nested macro call is not expanded until the macro that calls it is expanded during one of its calls. Consider a use of this technique:

#include <stdio.h>

#define COLOR_TABLE \
X(red, "red") \
X(green, "green") \
X(blue, "blue")

#define X(a, b) a,
enum COLOR {
    COLOR_TABLE
};
#undef X

#define X(a, b) b,
char *color_name[] = {
    COLOR_TABLE
};
#undef X

int main() {
    enum COLOR c = red;
    printf("c=%s\n", color_name[c]);
    return 0;
}

In this example, COLOR_TABLE is a macro that expands into a series of calls to a two-argument macro named X. At the time COLOR_TABLE is defined, the macro X has not been defined, but that is okay since the definition of macro X is not needed until COLOR_TABLE is actually used and expanded.

COLOR_TABLE is expanded twice by this program, and a different definition of the macro X is used by each expansion. The first time COLOR_TABLE is expanded, the macro X is defined to expand to its first argument followed by a comma, and thus the result of expanding COLOR_TABLE is a comma-separated list of the enum literals inside the definition of enum COLOR. After the expansion of COLOR_TABLE, I undefine the macro X, which prevents accidental use of the macro name and also allows X to be redefined later with a different body.

The second time COLOR_TABLE is expanded, the macro X is defined to expand to its second argument followed by a comma, which results in a comma-separated list of string literals that properly initialize the array color_name. Again, after the expansion of COLOR_TABLE, the macro X is undefined to prevent accidental uses of the name.

The macro COLOR_TABLE, and its use of “X macros,” improves program maintainability in several ways. If you add a new line (X macro call) to COLOR_TABLE in order to declare a new enum literal, you are forced to add the initializer as well. You can reorder the lines in the table, or insert lines, or delete lines, and the color_name array will still be initialized properly. You have one location to edit to add a new enum literal even if the declaration of the enum type is in a header file and the initialization of the array is in a separate source file. For example, the definitions of COLOR_TABLE and enum COLOR could be in a header, and the source file that initializes the array color_name would include that header so it could expand COLOR_TABLE.

There is a potential problem with defining a macro that expands into a series of X macro calls. If the table gets big, you may exceed the compiler limit on the number of characters in a source line when defining the table macro or when expanding the macro. C90 only required that compilers accept lines with fewer than 510 characters after line continuation and macro expansion. C99 increases that limit to 4,095 or fewer characters. In practice, many compilers allow source lines much larger than the minimum limits, but it is wise to avoid defining very large macros.


To avoid the line length problem, you can put the X macro calls in a header file and include that header file in the places that would have expanded the table macro, as in the following.

// File: color_table.h
X(red, "red")
X(green, "green")
X(blue, "blue")

// File: main.c
#include <stdio.h>

#define X(a, b) a,
enum COLOR {
#include "color_table.h"
};
#undef X

#define X(a, b) b,
char *color_name[] = {
#include "color_table.h"
};
#undef X

int main() {
    enum COLOR c = red;
    printf("c=%s\n", color_name[c]);
    return 0;
}

Putting the X macro calls in a header file also has the advantage that you can use conditional compilation in building the table of X macro calls. For example, if some systems have the color purple instead of red, the header color_table.h could be:

// File: color_table.h
#ifdef NO_RED
X(purple, "purple")
#else
X(red, "red")
#endif
X(green, "green")
X(blue, "blue")

Note that there is no requirement that X macros take two arguments. The alert reader has probably realized that in the previous examples the second argument to the X macro was a quoted string whose value was the same as the first argument. The X macro calls in COLOR_TABLE could have taken only one argument (the enum name). When COLOR_TABLE was expanded to initialize color_name, the X macro could have been defined to use the stringize operator to create the needed string literal. On the other hand, if you have more than one parallel table indexed by an enum, you might use X macros that take more than two arguments. Since the X macro is defined immediately before expanding calls to it, and undefined immediately afterwards, you might have a table macro containing X macro calls with two arguments and a different table macro containing X macro calls with five arguments. There is no conflict since a different definition of the X macro would be used when expanding the two table macros.

X macros might expand into something more complex than a copy of one of their arguments. Suppose we wanted to have a small gap in the value of our enum literals. We might use three-argument X macros as follows:

#define COLOR_TABLE \
X(green, , "green") \
X(red, =3, "red") \
X(blue, , "blue")

#define X(a, b, c) a b,
enum COLOR {
    COLOR_TABLE
};
#undef X

#define X(a, b, c) [a]=c,
char *color_name[] = {
    COLOR_TABLE
};
#undef X

The second parameter in the X macro calls is an optional initializer for the enum literal itself to set its value. This example also shows the C99 feature wherein a macro argument is allowed to consist of no tokens. Such an argument expands into nothing in the macro body. Thus, the definition for enum COLOR is:

enum COLOR {
green, red=3, blue,
};

Since our enum literals now have gaps in their values, the X macro used during the initialization of color_name expands into designated initializers.

The above definition of enum COLOR shows another C99 feature I have been quietly using in this article. C traditionally allows a trailing comma in a brace-enclosed initializer list as an aid to machine-generated source text. C90, however, did not allow a corresponding trailing comma in an enum literal definition list, although many C compilers allowed the comma as an extension. C99 now permits the trailing comma for enum literal definitions. As shown above, the X macro expansions in all of my previous examples generated a trailing comma after the last enum literal definition. If you are using a compiler that does not accept the trailing comma, you can comma-separate the X macro calls in the table macro and not produce a comma in the X macro expansion as in the following.

#define COLOR_TABLE \
X(red, "red"), \
X(green, "green"), \
X(blue, "blue")

#define X(a, b) a
enum COLOR {
    COLOR_TABLE
};
#undef X

Acknowledgments

The X macro technique was used extensively in the operating system and utilities for the DECsystem-10 as early as 1968, and probably dates back further to PDP-1 and TX-0 programmers at MIT. Alan Martin introduced X macros to me in 1984. I wish to thank Alan Martin, Phil Budne, Bob Clements, Tom Hastings, Alan Kotok, Dave Nixon, and Pete Samson for providing me with historical background for this article.


Randy Meyers is a consultant providing training and mentoring in C, C++, and Java. He is the current chair of J11, the ANSI C committee, and previously was a member of J16 (ANSI C++) and the ISO Java Study Group. He worked on compilers for Digital Equipment Corporation for 16 years and was Project Architect for DEC C and C++. He can be reached at rmeyers@ix.netcom.com.


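Translator's notes

[a] The term "X macros" is kept in English; translating it seemed unnecessary.

[b] Uncertain: "line continuation" appears to mean source lines joined with a trailing '\'.

[c] The "stringize operator" is the # operator.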

Original article

http://www.drdobbs.com/cpp/184401387