C99 is much like C89, but with more of the same - lots more of the same.
You might not have noticed, but a major revision [1] to the ANSI/ISO C Standard, called C99, was approved last December. You also might not have noticed that you may already be using the new C language, or at least parts of it.
The reason for this is that the committee took a pretty conservative approach in adding features to C. Almost all of the new features have been implemented and have proved their worth in existing C implementations. Although no implementation yet supports all of C99, many implementations have supported different parts of C99 for years.
This is good news for C programmers. Perhaps you have been avoiding an extension in your favorite compiler because it was not portable. If that extension is a feature of the new C Standard, you can start using the feature knowing it will spread to other compilers as the industry rolls out C99-compliant compilers.
It almost goes without saying that the new Standard is upwardly compatible with the old. There are a few incompatibilities, but they are very minor, and the committee worked very hard to minimize problems. For example, see the discussion of new keywords below.
Programming languages evolve over time, and the usual practice is to refer to a language not only by its name, but also the year in which it was defined. (In lectures five years ago, I could get a laugh by giving some examples of this rule: ALGOL 68, C89, Fortran 77, and Fortran 4. Alas, this geeky humor has a Y2K bug: these days 2004 springs to mind before 1904.) Thus, the new language and the Standard that defines it are called C99. The original C Standard [2] is called either C89 or C90. (ANSI published the document in 1989, but ISO renumbered the sections and published the document in 1990.) There was a minor update [3] to C89 called C95 that you probably did not notice unless you process Japanese, Korean, or Chinese text, since it mostly added more library functions that process wide and multibyte characters. (Java proponents sometimes erroneously claim that Java was the first language to support large character sets. Such support was in Standard C in 1989.)
Perhaps the greatest influence on C99 was the Numerical C Extensions Group, or NCEG. The NCEG was a subcommittee of J11, the ANSI C committee, that started working on a technical report [4] after C89 was finalized. The NCEG Tech Report was not a standard; it was a call for implementations to experiment and gain experience with a set of well-described extensions. The majority of these extensions dealt with numerical programming in C (IEEE arithmetic, complex numbers), but some had a more general purpose or promoted optimization (variable length arrays, parallel processing, the restrict keyword).
In some cases, NCEG extensions were invented by the subcommittee. In others, vendors brought extensions already implemented in their compilers to the committee for review and feedback. Since the tech report was not a standard, vendors were free to pick and choose which extensions to implement, and to modify the extensions based on customer experience.
This real-world experience is very valuable. Language features can interact in surprising ways, and sometimes a language feature will cause a run-time penalty even if the feature is never used. (For example, on some C++ implementations, the mere existence of multiple inheritance as a feature slows down programs that use only single inheritance.) The experimentation with NCEG extensions not only improved the extensions themselves, but also improved their specification, and gave the C committee confidence that the interactions and costs of the language features were known.
Not all NCEG extensions were added to C99. Perhaps the biggest example is the NCEG parallel-processing support, which was based on the C* Language (pronounced C-Star) from Thinking Machines. The manufacturers of parallel computers have various idiosyncratic extensions for writing explicitly parallel programs, and the NCEG Technical Report did not change this. Since there is still little consensus on the best way to program parallel computers, such a feature is not yet suitable for inclusion in Standard C.
In other cases, NCEG extensions were modified when added to C99. The NCEG support for complex numbers included separate imaginary datatypes, such as double_imaginary. The imaginary data types were made optional in C99.
However, the biggest feature considered for, but not included in, C99 did not come from the NCEG, but from C++. For about a year, the committee worked on a subset of C++ object-oriented features. Included in the subset were single (but not multiple) inheritance, virtual functions, member access control (public, private, protected), constructors, and destructors. This mix of features was similar to C++ in the late 1980s.
This resemblance to early C++ was both a plus and a minus. On the positive side, this set of features was responsible for the initial popularity of C++, and the set of features was known to work well together with well understood costs and interactions. On the negative side was the question, "Isn't the natural evolution of the C++ of the 1980s the C++ of the 1990s? If so, what is the value in C starting down that path since the C++ of the 1990s already exists?" Ultimately, for a variety of reasons, some logistical, the committee abandoned adding object-oriented features to C.
The remainder of this article will briefly list the different features that are in C99. Future articles in this series will describe individual features in greater detail.
C99 has the following new keywords: inline, restrict, _Bool, _Complex, and _Imaginary.
The last three of the new keywords start with an underscore followed by an upper case letter in order to avoid conflicts with user identifiers in existing programs (bool in particular is common). However, the header <stdbool.h> defines a macro named bool that expands to _Bool, and the header <complex.h> defines a macro complex that expands to _Complex, and (if supported) a macro imaginary that expands to _Imaginary. The preferred style (once you determine it will not cause conflicts with identifiers in your program) is to include the appropriate header and use bool, complex, or imaginary rather than the underscore keywords. In the rest of this column, I will assume the proper headers have been included.
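As a small illustration of the preferred style, the sketch below includes <stdbool.h> and uses bool rather than _Bool. The function names are hypothetical and exist only for this example. The point it tries to show is that converting a value to bool yields 1 for any nonzero value, unlike a narrowing conversion to a small integer type.

```c
#include <stdbool.h>

/* Hypothetical helper: true if any bit of flags is set.
   Conversion to bool yields 1 for every nonzero value. */
static bool any_flag_set(unsigned int flags)
{
    return flags;
}

/* For contrast: converting to unsigned char keeps only the low-order
   bits, so 0x100 becomes 0 when unsigned char is 8 bits wide. */
static unsigned char low_byte(unsigned int flags)
{
    return (unsigned char)flags;
}
```

With these definitions, any_flag_set(0x100u) is true even though low_byte(0x100u) is zero on a machine with 8-bit bytes.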
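C99 has the following additional new types: bool, long long, unsigned long long, float complex, double complex, and long double complex, plus (where imaginary types are supported) float imaginary, double imaginary, and long double imaginary.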
C99 allows implementations to define additional integer datatypes. All of the semantic rules dealing with integers in the C Standard were generalized to allow such "extended integers" to follow predictable rules and behave like any other integer type.
The new header <stdint.h> contains typedefs for integers of various sizes in bits (e.g., int32_t) or properties (like fast computation). The header also contains typedefs for the largest signed and unsigned integer types supported by the implementation, and (if such a type exists) a typedef for the integer type capable of holding the value of a pointer.
The header <inttypes.h> defines macros that are the printf and scanf format specifiers suitable for reading or writing values of all the different types named by typedefs in <stdint.h>.
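A minimal sketch of how the two headers work together follows. The helper names format_i32 and parse_i32 are assumptions made for this example; PRId32 and SCNd32 are the <inttypes.h> macros that supply the printf and scanf conversion specifiers matching int32_t.

```c
#include <inttypes.h>   /* includes <stdint.h> and adds the format macros */
#include <stdio.h>

/* Hypothetical helper: writes value into buf as decimal text and
   returns the number of characters the full text requires. */
static int format_i32(char *buf, size_t size, int32_t value)
{
    return snprintf(buf, size, "%" PRId32, value);
}

/* Hypothetical helper: reads an int32_t back from decimal text.
   Returns 0 if no number could be read. */
static int32_t parse_i32(const char *text)
{
    int32_t value = 0;
    if (sscanf(text, "%" SCNd32, &value) != 1)
        return 0;
    return value;
}
```

Because the macros expand to whatever specifier the implementation needs, the same source works whether int32_t is a typedef for int or for long.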
The new header <fenv.h> defines functions to allow you to control the floating-point environment, including rounding modes, status flags, and exception state.
The header <math.h> contains many new library functions. The new header <complex.h> contains math functions for complex numbers.
The new header <tgmath.h> defines type-generic function-like macros for many math functions (like intrinsic functions in Fortran or overloaded functions in C++). For example, after including <tgmath.h>, the call sin(x) will expand into a call to whichever sine function in the library takes an argument whose type is the same as the type of x.
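The type-generic selection can be sketched with fabs, which after including <tgmath.h> resolves to fabsf, fabs, or fabsl according to the argument type. The wrapper names below are hypothetical. On some systems the math library must be linked explicitly (for example with -lm) when other math functions are used.

```c
#include <tgmath.h>

/* Hypothetical wrappers: the same spelling, fabs, selects a different
   library function depending on the type of the argument. */
static float magnitude_f(float x)
{
    return fabs(x);     /* selects fabsf; the result has type float */
}

static long double magnitude_ld(long double x)
{
    return fabs(x);     /* selects fabsl; the result has type long double */
}
```

One way to observe the selection without running anything is sizeof: sizeof(fabs(1.0f)) equals sizeof(float), while sizeof(fabs(1.0L)) equals sizeof(long double).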
C99 provides an optional specification of exactly how C behaves on a machine with IEEE floating-point arithmetic, including the rules for handling infinities, NaNs, signed zeroes, conversions, and expression evaluation.
C99 supports a new hexadecimal floating-point constant, which allows floating-point constants to be written without any loss of accuracy due to the decimal-to-binary conversion of traditional floating-point constants.
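A few hexadecimal floating-point constants are sketched below. The digits after 0x are a hexadecimal significand and the p exponent is a power of two, so each constant names one binary value exactly. The variable and function names are illustrative only, and the last constant assumes that float uses the IEEE single format.

```c
#include <float.h>

/* 0x1.8p1 is (1 + 8/16) * 2^1, which is exactly 3.0. */
static const double three = 0x1.8p1;

/* 0x1p-1 is 1 * 2^-1, which is exactly 0.5. */
static const double half = 0x1p-1;

/* Assuming IEEE single format: 24 significand bits, all ones, with the
   largest exponent. This is the largest finite float. */
static const float largest_float = 0x1.fffffep127f;

/* Returns 1 if the constant above matches FLT_MAX on this implementation. */
static int largest_float_matches(void)
{
    return largest_float == FLT_MAX;
}
```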
The header <float.h> contains additional information about the implementation.
New standard pragmas allow control of certain aspects of expression evaluation.
The bounds of an array can now be a run-time expression; such arrays are called "variable length arrays," or VLAs for short. VLAs may not have static storage duration, and thus may not be declared at file scope, but they may be function parameters or local to a function. If local to a function, the correct amount of space for a VLA is allocated when the block containing the array is entered and the declaration of the VLA is reached. The storage is deallocated when leaving the block.
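The two permitted uses of VLAs described above, as a function parameter and as a local array, can be sketched as follows. The function names are hypothetical, and the local array assumes n is greater than zero and small enough for automatic storage.

```c
#include <stddef.h>

/* VLA parameter: the bounds rows and cols are ordinary parameters that
   appear earlier in the parameter list. */
static double sum_matrix(size_t rows, size_t cols, double m[rows][cols])
{
    double total = 0.0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r][c];
    return total;
}

/* Local VLA: storage for table is allocated when its declaration is
   reached and released when the enclosing block is left.
   Assumes n > 0. */
static double sum_of_products(size_t n)
{
    double table[n][n];
    for (size_t r = 0; r < n; r++)
        for (size_t c = 0; c < n; c++)
            table[r][c] = (double)((r + 1) * (c + 1));
    return sum_matrix(n, n, table);
}
```

An ordinary fixed-size array such as double a[2][3] can also be passed to sum_matrix(2, 3, a).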
The last member of a struct may be an array with no bounds expression, called a flexible array member. If such a struct is allocated using malloc, the programmer can request additional storage to allow the flexible array member to be an array of any desired size.
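The malloc pattern for a flexible array member might look like the sketch below. The struct message type and the message_new function are assumptions made for illustration; the essential step is adding the desired array size to sizeof the struct.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a length followed by a flexible array member. */
struct message {
    size_t length;
    char text[];        /* no bound: must be the last member */
};

/* Allocates a message large enough to hold a copy of s, including the
   terminating null. Returns NULL if the allocation fails. */
static struct message *message_new(const char *s)
{
    size_t n = strlen(s);
    struct message *m = malloc(sizeof *m + n + 1);
    if (m == NULL)
        return NULL;
    m->length = n;
    memcpy(m->text, s, n + 1);
    return m;
}
```

The caller releases the whole object, array included, with a single call to free.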
Type qualifiers may appear after the [ in the declaration of an array parameter to a function. Once the compiler has changed the type of the parameter from array to pointer, the type qualifiers modify the pointer type.
// comments are in C99.
Declarations do not have to appear at the start of a block. They may be intermixed with executable statements.
The if, switch, while, do, and for statements are now all blocks, as if a { preceded the statement and a } followed it. The first expression in a for statement (the initialization expression) may now be a declaration of the loop variable, whose scope is just the for statement.
Type qualifiers (const, volatile, restrict) may be redundantly specified.
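Several of these statement and declaration changes can be seen together in one short function. This is only a sketch, and count_even is a hypothetical name.

```c
// Hypothetical helper: counts the even values in v[0] through v[n-1].
static int count_even(const int *v, int n)
{
    if (v == 0 || n <= 0)
        return 0;

    int count = 0;                  // declared after an executable statement
    for (int i = 0; i < n; i++) {   // i is in scope only within the loop
        if (v[i] % 2 == 0)
            count++;
    }
    return count;
}
```

A C89 compiler would reject this function three times over: for the // comments, for the declaration of count after the if statement, and for the declaration of i inside the for statement.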
The enumerator list in the declaration of an enum type may have a trailing comma.
The minimum translation limits have been increased. Compilers are required to translate more complex programs.
Implementations must now support mixed-case external names. (C89 permitted implementations to force all external names to all upper case or all lower case.)
Functions can be declared inline to encourage the implementation to eliminate function call overhead by inline substituting the body of the function at a call site. Inline functions may be extern.
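A minimal sketch of an inline function follows. It uses static inline, which gives the function internal linkage and keeps the example within a single source file; an extern inline function also needs an external definition in exactly one translation unit, which is not shown here. The function names are hypothetical.

```c
/* Hypothetical helper: limits value to the range low through high.
   The inline specifier is a hint that calls should be expanded in place. */
static inline int clamp_int(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

/* A call site where the compiler may substitute the body of clamp_int. */
static int clamp_percent(int value)
{
    return clamp_int(value, 0, 100);
}
```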
A pointer can be declared with the restrict keyword. For example:
int *restrict p;
tells the optimizer that the pointer p is the only way to access the object to which p points. This potentially permits the compiler to produce much better code.
The keyword static may appear after the [ in the declaration of an array parameter. This tells the optimizer that the array has at least as many elements as specified, and may permit better code to be generated.
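Both optimization hints can be sketched in function parameters, where they are most often useful. The function names are hypothetical. In each case the caller makes a promise that the compiler is allowed to rely on, and breaking the promise is undefined behavior.

```c
#include <stddef.h>

/* The caller promises that dst and src do not overlap, so the compiler
   may load from src without worrying about earlier stores to dst. */
static void add_into(size_t n, double *restrict dst, const double *restrict src)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* The caller promises that v points to at least 4 elements. */
static int sum_first_four(const int v[static 4])
{
    return v[0] + v[1] + v[2] + v[3];
}
```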
Designated initializers let you name the member or array element that each initializer applies to. For example:
struct S {
    int i;
    float f;
    int a[2];
};
struct S x = {
    .f=3.1,
    .i=2,
    .a[1]=9
};
Compound literals permit you to create an anonymous object and initialize it anywhere the value of such an object could appear. Syntactically, a compound literal is a cast followed by a brace-enclosed initializer. For example, if f is a function that takes an argument of type struct S above, you could write:
f((struct S) {2, 3.1, {0,9}});
A new preprocessing operator has the form:
_Pragma ( string-literal )
This pragma operator behaves exactly as if a normal #pragma directive were encountered with the value of the string literal as its argument. However, the _Pragma operator may appear anywhere (not just at the beginning of a line) and macro bodies may contain _Pragma.
There are additional predefined macro names indicating the version of the C Standard supported, which optional parts of the C Standard are supported, and whether the implementation is hosted or freestanding (whether there is an operating system and C library the program can call).
Every function has the following implicit local variable:
static const char __func__[] = "function-name";
where function-name is the name of the function. (Actually, __func__ does not exist unless referenced in the function.) The assert macro uses __func__ to report the function containing a failing assertion. (This is not a preprocessor feature, but it is similar to __FILE__ and __LINE__.)
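A short sketch of __func__ and of one predefined macro follows. DESCRIBE, open_valve, and is_c99_or_later are hypothetical names. __STDC_VERSION__ is the predefined macro that reports the version of the Standard, with the value 199901L for C99.

```c
#include <stdio.h>

/* Hypothetical logging macro: because __func__ is an implicit local
   variable, it names whichever function the macro is expanded in. */
#define DESCRIBE(buf, size, msg) \
    snprintf((buf), (size), "%s: %s", __func__, (msg))

static int open_valve(char *buf, size_t size)
{
    return DESCRIBE(buf, size, "opened");   /* writes "open_valve: opened" */
}

/* Returns 1 when the compiler reports C99 or a later Standard. */
static int is_c99_or_later(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return 1;
#else
    return 0;
#endif
}
```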
Preprocessor arithmetic is performed in the largest signed and unsigned integer types the implementation supports.
C89 assumes an implicit type of int when a type is needed but never specified. This might happen when a variable is declared without a type, or a function is declared without a return type. C99 requires a diagnostic be issued for these cases. Most implementations will issue a warning message, and then assume int in order to avoid breaking programs that relied on implicit int.
C99 requires a diagnostic if a return statement fails to return a value in a non-void function. It also requires a diagnostic if a return statement returns a value in a void function.
Accented letters, non-English letters, and ideograms from languages like Chinese may be used in identifier names, including external identifiers.
The ISO 10646 Standard [5] is a universal character set whose goal is to have character codes for all characters in all languages. ISO 10646 has both two-byte and four-byte character codes, and is a superset of Unicode. C99 permits you to represent any character in ISO 10646 by \u followed by four hex digits or \U followed by eight hex digits, where the hex digits are the character code in ISO 10646 for the character. These constructs are called Universal Character Names, or UCNs. You may use UCNs in strings, character constants, or identifiers.
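As a small sketch, the UCN \u00E9 below names the letter e with an acute accent. The variable and function names are hypothetical. How the character is encoded inside a narrow string is implementation-defined (it may occupy one byte or several), so the example avoids assuming a particular length.

```c
#include <stddef.h>

/* The word "cafe" with an accented final letter, written as a UCN. */
static const char cafe[] = "caf\u00E9";

/* Number of bytes the string occupies, excluding the terminating null.
   The result depends on the execution character set. */
static size_t cafe_length(void)
{
    return sizeof cafe - 1;
}

/* The four-digit and eight-digit forms name the same character. */
static int same_character(void)
{
    return L'\u00E9' == L'\U000000E9';
}
```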
Additional functions to process multibyte characters and wide characters are in the library.
The header <iso646.h> contains macros for some operators in C that require trigraphs to be used in some character sets.
Digraphs that are synonyms for some trigraphs are provided.
In addition to new functions mentioned elsewhere in this article, the library contains some new specialized forms of printf and scanf. All printf and scanf family functions support new format conversion specifiers.
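The article does not name the new functions and specifiers here. As one illustration, snprintf is a bounded form of sprintf added in C99, and the ll and z length modifiers format long long and size_t values. The helper names in this sketch are assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* With a null buffer and size 0, snprintf writes nothing and returns
   the number of characters the complete output would need. */
static int formatted_length(long long big, size_t size)
{
    return snprintf(NULL, 0, "%lld %zu", big, size);
}

/* Writes at most n - 1 characters plus a terminating null into buf.
   The return value is still the length of the complete output. */
static int format_values(char *buf, size_t n, long long big, size_t size)
{
    return snprintf(buf, n, "%lld %zu", big, size);
}
```

Comparing the return value with the buffer size is a simple way to detect truncation.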
The strftime function supports additional conversion specifiers.
A new macro, va_copy, is added to <stdarg.h>. It makes a copy of a va_list, the variable argument pointer, so that the arguments can be traversed more than once.
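A sketch of why a copy is useful: the hypothetical function spread below walks its int arguments twice, once through the original va_list and once through a copy, and returns the difference between the largest and smallest value.

```c
#include <limits.h>
#include <stdarg.h>

/* Hypothetical helper: returns max - min of count int arguments,
   or 0 when count is 0. */
static int spread(int count, ...)
{
    va_list args, again;
    va_start(args, count);
    va_copy(again, args);           /* second, independent cursor */

    int lowest = INT_MAX;
    for (int i = 0; i < count; i++) {
        int v = va_arg(args, int);
        if (v < lowest)
            lowest = v;
    }

    int highest = INT_MIN;
    for (int i = 0; i < count; i++) {
        int v = va_arg(again, int);
        if (v > highest)
            highest = v;
    }

    va_end(again);                  /* every va_copy needs its own va_end */
    va_end(args);
    return count > 0 ? highest - lowest : 0;
}
```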
Next month, we'll take a look at C99's new restrict keyword, and examine what restricted pointers can do to improve performance.
[1] ISO/IEC 9899:1999, Programming Languages — C. 1999.
[2] ISO/IEC 9899:1990, Programming Languages — C. 1990.
[3] ISO/IEC 9899 Amendment 1, Programming Languages — C Integrity. 1995.
[4] X3/TR-17:1997, Numerical C Extensions. 1997.
[5] ISO/IEC 10646, Information technology — Universal Multiple-Octet Coded Character Set (UCS).
Randy Meyers is a consultant providing training and mentoring in C, C++, and Java. He is the current chair of J11, the ANSI C committee, and previously was a member of J16 (ANSI C++) and the ISO Java Study Group. He worked on compilers for Digital Equipment Corporation for 16 years and was Project Architect for DEC C and C++. He can be reached at rmeyers@ix.netcom.com.
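Translator's notes (from the Chinese edition of this article):
[a] Implementations: I have seen this word translated into Chinese as "compiler" (编译器), but its literal meaning is "implementation." How is an implementation of the C language realized? Through a compiler. That translation is therefore also acceptable; I am not sure which is better, so I kept the literal meaning.
On <tgmath.h>: C99 in fact has at least three functions that compute a sine (note contributed by pmerofc):
double sin(double);
float sinf(float);
long double sinl(long double);
(There are also versions that take a complex argument.) Including <tgmath.h> lets the appropriate one of these functions be selected according to the type of the argument.
On flexible array members: given a struct such as
struct _sample_struct
{
    char other_data;
    char array[];
};
storage can be allocated as
struct _sample_struct *abc = malloc(sizeof(struct _sample_struct) + array_size);
where sizeof(struct _sample_struct) covers other_data and array_size is the number of bytes wanted for array.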