[05] The New C: Integers, Part 3

By Randy Meyers, February 01, 2001

At first glance, C99’s new integral types seem to threaten its portability. But a few added headers and typedefs improve the outlook dramatically.

In my January 2001 column, I discussed how the introduction of 64-bit machines motivated C99 to generalize the rules for integer types and to allow new integer types as extensions. But C99 did not describe any syntax for naming these new integer types. This seems to put programmers in an awkward position. Your implementation might support additional integer types. You might find such types useful. However, lacking a standardized syntax, it appears that you must learn about implementation-specific extensions to use new integer types, and that such uses automatically eliminate your chances for portability. This month, we see that C99 actually provides a solution to this problem, one that might even help with related portability problems. Interestingly, the 64-bit machines again thrust the issue before the standards committee.

Integers and Portability

Initially, the vendors of 64-bit hardware and software disagreed wildly about the mapping from C keywords to integer types. While much of the focus was on the sizes of the int and long types, some vendors even wanted to change the size of short. By contemplating changing the mapping to integer types, the 64-bit vendors magnified an old problem in writing portable C. Which keyword should you use when you want an integer of a particular size?

The C89 and C99 Standards do make certain guarantees. For example, short and int are at least a 16-bit integer type, and long is at least a 32-bit integer type. Thus, if you specifically want a 32-bit type, you might be tempted to just use long. However, on most 64-bit machines, long is 64 bits. If this causes problems for you, there is a simple solution. Use a typedef for your type, and change the definition of the typedef when you port your program.
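The typedef technique can be sketched as follows. The names my_int32 and add_counts are hypothetical, and choosing the definition with <limits.h> is only one possible way to automate the change that the text describes making by hand when porting.

```c
#include <limits.h>

/* Hypothetical project-wide typedef: a signed type at least 32 bits wide.
   The rest of the program uses my_int32 and never names int or long directly. */
#if INT_MAX >= 2147483647
typedef int my_int32;   /* int already holds 32 bits on this platform */
#else
typedef long my_int32;  /* long is guaranteed to hold at least 32 bits */
#endif

/* Example use: add two counters stored in the portable type. */
my_int32 add_counts(my_int32 a, my_int32 b)
{
    return a + b;
}
```

Porting then means revisiting this one definition rather than every declaration in the program.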

The C committee realized that this simple technique can produce high payoff for little work, and that there was an advantage in standardizing the names of the typedefs. Thus, two new standard headers were added to C99, <stdint.h> and <inttypes.h>. Some C and C++ implementations have been providing these headers for years. If your implementation does not provide them, and they would be useful to you, they are simple enough that you could write your own version of the headers. They are not in ANSI C++98, but they are likely to be part of a future revision.

Header <stdint.h>

The header <stdint.h> defines typedefs for integers of various sizes, macros that expand into the maximum value (and for the signed types, the minimum value) for those types, and a few other utility macros.

Most of the typedefs in the header are named according to the following patterns:

int_leastN_t
uint_leastN_t
int_fastN_t
uint_fastN_t
intN_t
uintN_t

where N is replaced with a decimal integer giving the size in bits (excluding pad bits) of that integer type. Typically, you will find typedefs defined in the header where N in the above patterns has been replaced by 8, 16, 32, and 64, yielding 24 different typedef names. The Standard permits an implementation to define other typedefs in the header fitting the patterns. The typedefs may be defined to be the traditional integer types or any extended integer types supported by the implementation. For example, an implementation supporting 128-bit integers would provide int128_t, uint128_t, and so on.

The above typedef names that start with "u" are typedefs for unsigned types. The typedef names that do not start with "u" are the corresponding signed types.

The typedef names break down into three families. The first family is the types formed from the patterns that contain "least" in their names. These types are at least the specified size in bits given by N, but they might be larger if the hardware makes it necessary. For example, on a 36-bit machine where int was 36 bits and short was 18 bits, the header would declare:

typedef short int_least16_t;
typedef int int_least32_t;
typedef short int_least18_t;
typedef int int_least36_t;

As we will see below, the first two of these typedefs are required. The last two are the implementation taking advantage of the ability to add extra typedefs to the header for "extra" integer types it supports. The least types are required to be the smallest integer type holding the required number of bits. Thus, the least types are the most space-efficient types that can represent integers with the requested number of bits.

The second family is the types formed from the patterns that contain "fast" in their names. Like the least types, these types can represent an integer with the requested number of bits, but the fast types may be as large as necessary to provide efficient computation. Consider a 64-bit machine that can operate on 16-bit integers, but whose operations on 16-bit integers are more expensive than operations on a 32- or 64-bit integer. Assume that short is 16 bits and int is 32 bits on that machine. The header would declare:

typedef short int_least16_t;
typedef int int_fast16_t;
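A minimal sketch of how the two families might be used together, with the least type for a large table and the fast type for a loop index. The names samples, fill_samples, and sum_samples are hypothetical.

```c
#include <stdint.h>

#define SAMPLE_COUNT 1024

/* A large table favors the least type: the smallest type with at least 16 bits. */
static int_least16_t samples[SAMPLE_COUNT];

/* Store the same value in every slot of the table. */
void fill_samples(int_least16_t value)
{
    for (uint_fast16_t i = 0; i < SAMPLE_COUNT; i++)
        samples[i] = value;
}

/* Sum the table. The index uses the fast type, which may be wider than
   16 bits if that makes the loop cheaper on the target machine. */
long sum_samples(void)
{
    long total = 0;
    for (uint_fast16_t i = 0; i < SAMPLE_COUNT; i++)
        total += samples[i];
    return total;
}
```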

Both the least and fast types can store all of the bits of a 16-bit integer, but the least types are optimized for space (like a big array of integers) while the fast types are optimized for speed (like a loop index).

The third family is the types that contain neither "least" nor "fast" in their names. These are the exact-sized, two’s complement integers with no padding bits. For the least and fast types, the Standard requires implementations to provide typedefs where N has been replaced with 8, 16, 32, and 64. However, for the exact-sized types, the requirements on the types are so demanding that the Standard makes these types optional. (This is discussed further below.) These types should be used sparingly. Typical uses are laying out a struct to match an externally defined data layout, such as a binary file from another system or a network packet. A 36-bit machine would not be able to provide typedefs where N was 8, 16, 32, and 64, but would be able to provide names where N was 9, 18, 36, and 72.

An implementation is required to define macros in <stdint.h> giving the maximum (and for signed types, the minimum) values that can be stored in the types defined in the header. The name patterns for these limit macros match the patterns for the typedef names:

INT_LEASTN_MAX
INT_LEASTN_MIN
UINT_LEASTN_MAX
INT_FASTN_MAX
INT_FASTN_MIN
UINT_FASTN_MAX
INTN_MAX
INTN_MIN
UINTN_MAX
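As an illustration of what the limit macros are for, here is a sketch of an addition that clamps to the range of int_least32_t instead of overflowing. The function name saturating_add is hypothetical.

```c
#include <stdint.h>

/* Add two int_least32_t values, clamping the result to the type's range.
   The range checks are done before the addition so no overflow occurs. */
int_least32_t saturating_add(int_least32_t a, int_least32_t b)
{
    if (b > 0 && a > INT_LEAST32_MAX - b)
        return INT_LEAST32_MAX;   /* sum would exceed the maximum */
    if (b < 0 && a < INT_LEAST32_MIN - b)
        return INT_LEAST32_MIN;   /* sum would fall below the minimum */
    return a + b;
}
```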

Note that an implementation defines limit macros if and only if the corresponding typedef is defined. Thus, you can use the limit macros to test whether a typedef name is defined. For example:

#include <stdint.h>
#ifdef INT_LEAST24_MAX
int_least24_t x;
#else
int_least32_t x;
#endif

The above defines x to have type int_least24_t if that type is supported. Otherwise, x is defined to have the type int_least32_t, a type the Standard requires to exist.

The <stdint.h> header also defines two function-like macros for forming integer constants whose type is one of the int_leastN_t or uint_leastN_t types, respectively:

INTN_C(constant)
UINTN_C(constant)

where N is replaced with a decimal integer corresponding to one of the "least" types. The argument to these macros should be an unsuffixed constant. The macro typically uses the preprocessor paste operator ## to add a constant suffix to produce a constant with the proper type. For example, UINT64_C(0xA) might expand to 0xAULL, if the uint_least64_t type was unsigned long long.
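A short sketch of why the constant macros matter. The function name high_bit_mask is hypothetical. Writing the shift as 1 << 63 would be undefined wherever int is narrower than 64 bits, whereas UINT64_C(1) has type uint_least64_t.

```c
#include <stdint.h>

/* Build a mask with only bit 63 set. UINT64_C gives the constant the type
   uint_least64_t, so the shift is carried out in at least 64 bits. */
uint_least64_t high_bit_mask(void)
{
    return UINT64_C(1) << 63;
}
```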

Special Integer Types

The <stdint.h> header also defines a few integer types useful for special purposes. The typedefs intptr_t and uintptr_t are respectively a signed integer type and the corresponding unsigned integer type large enough to hold a pointer to an object without losing any information. (These integer types might not be large enough to hold a pointer to a function.) Specifically, the Standard demands that if you cast a pointer to void to one of these integer types, then cast the integer back to pointer to void, the result of the two casts must equal the original pointer. There are a few systems where no such integer types exist, so C99 makes these typedefs optional in the header. You can test whether these types are declared by testing if their limit macros are defined: INTPTR_MIN, INTPTR_MAX, and UINTPTR_MAX.

Perhaps the two most important typedefs defined in <stdint.h> are intmax_t and uintmax_t. These are respectively the largest signed integer type and the corresponding unsigned integer type supported by the implementation. Obviously such types are useful whenever you want your program to be able to process the largest numbers supported on the machine. As a less obvious use, such types are handy when working with typedefs for integer types from different sources. Consider:

apple_t num_apples();
orange_t num_oranges();
intmax_t num_fruit;
num_fruit = num_apples() + num_oranges();

where apple_t is a signed integer typedef used for storing the number of apples, and orange_t is a signed integer typedef used for storing the number of oranges. If you want to store the total number of fruit, neither apple_t nor orange_t might be appropriate. Perhaps you do not have many apples so that apple_t is short while you do have lots of oranges so that orange_t is long. Perhaps the situation is reversed. In the absence of a typedef specifically for storing numbers of any type of fruit, using the largest integer type, intmax_t, is a good idea. Luckily, comparing apples to oranges causes no similar problems.

There are limit macros for intmax_t and uintmax_t: INTMAX_MIN, INTMAX_MAX, and UINTMAX_MAX. There are also function-like macros, similar to the ones described above, for writing constants of type intmax_t and uintmax_t: INTMAX_C(constant) and UINTMAX_C(constant). The Standard requires intmax_t, uintmax_t, and their limit and constant macros to be defined. On many implementations, intmax_t will be long long and uintmax_t will be unsigned long long.

Finishing out <stdint.h> are limit macros for integer typedefs defined in other headers. The other headers failed to define macros giving the minimum and maximum values. Rather than add new names to popular headers from C89, C99 defined the limit macros in <stdint.h>. For ptrdiff_t, the limits are PTRDIFF_MIN and PTRDIFF_MAX. For sig_atomic_t, the limits are SIG_ATOMIC_MIN and SIG_ATOMIC_MAX. For size_t, the limit is SIZE_MAX. For wchar_t, the limits are WCHAR_MIN and WCHAR_MAX. For wint_t, the limits are WINT_MIN and WINT_MAX.
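The two special-purpose uses above might be sketched like this. The definitions given for apple_t and orange_t are assumptions standing in for typedefs that would come from elsewhere, and the function names are hypothetical. Unlike the fragment in the text, this version converts each count to intmax_t before adding, so the addition itself is done in the widest signed type.

```c
#include <stdint.h>

/* Assumed definitions; in practice these would come from other headers. */
typedef short apple_t;
typedef long orange_t;

/* Total two counts of different types in the widest signed integer type. */
intmax_t total_fruit(apple_t apples, orange_t oranges)
{
    return (intmax_t)apples + (intmax_t)oranges;
}

#ifdef UINTPTR_MAX   /* uintptr_t is optional, so test its limit macro first */
/* Convert an object pointer to uintptr_t and back again.
   Returns 1 if the result compares equal to the original pointer. */
int pointer_round_trips(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    void *back = (void *)bits;
    return back == p;
}
#endif
```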

Header <inttypes.h>

The header <inttypes.h> includes the header <stdint.h>, and then defines a few functions (which I will cover in a future column) and many additional macros. Even though <inttypes.h> includes <stdint.h>, you may explicitly include both headers if you wish. All standard C99 headers may be included multiple times without problem; the one special case is <assert.h>, whose effect depends on whether NDEBUG is defined each time it is included.

The main purpose of <inttypes.h> is to allow printf and scanf to be used with the integer types defined in <stdint.h>. This is accomplished by defining macros that expand into quoted strings that are a particular printf or scanf format conversion specifier prefixed with any needed length modifiers. I will discuss the patterns of these macro names for format conversion specifiers later. For now, be aware that there is a separate macro for every integer type in <stdint.h> and for every printf and scanf format conversion specifier that operates on integers, and that there is a separate such macro for printf versus scanf.

Using string literal concatenation, these macros can be used to form a printf or scanf format string. String literal concatenation is the C feature where two string literals separated only by whitespace will be combined into one large string literal by the compiler. Consider the small program in Listing 1. <inttypes.h> defines PRIdFAST64, which expands to the printf d conversion specifier for printing an int_fast64_t, and SCNdFAST64, which expands to the scanf d conversion specifier for reading an int_fast64_t. String literal concatenation combines the string literals from the two macros with the other adjacent string literals to produce a single string literal for the scanf format and a string literal for the printf format. Assuming the int_fast64_t type is really long long, both macros expand into the quoted string "lld", which is the d format conversion specifier prefixed by the ll modifier to say the type is long long. Thus, the resulting format string for scanf is "%lld" and the resulting format string for printf is "x=%08lld\n".

Note that a leading percent sign is not part of the strings from the conversion specifier macros. Thus, you can specify special flags, width, or precision arguments to the conversion specifier in the string concatenated to the front of the string from the macro. In the printf format for Listing 1, the leading zero flag was given to cause leading zeros to be written in front of the number, and a field width of 8 was specified.

The macros for printf format conversion specifiers follow the naming patterns:

PRIsN
PRIsLEASTN
PRIsFASTN
PRIsMAX
PRIsPTR

where s is replaced by a printf integer format conversion specifier, one of d, i, o, u, x, or X, and N is replaced by a decimal number that is the same N used to form typedef names in <stdint.h>. The first macro name pattern above is used with the exact width types. The pattern containing "LEAST" is used with the least integer types. The pattern containing "FAST" is used with the fast integer types. The pattern containing "MAX" is used with the intmax_t and uintmax_t types. The pattern containing "PTR" is used with the intptr_t and uintptr_t types.

The macros for the scanf format conversion specifiers follow the same naming patterns as printf, except that the macros begin with SCN instead of PRI, and s cannot be replaced with X. There are separate macros for printf and scanf because printf versus scanf format conversion specifiers sometimes need different length modifiers for the same type. For example, you print a short with %d and read it with %hd.

Just as intmax_t and uintmax_t are useful when storing integers of unknown size, they can be useful when printing an integer of unknown size. Assume that the variable x has some signed integer type, but you do not know which type. You can print x by casting it to intmax_t and printing it with the PRIdMAX conversion macro.
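The statement that originally followed this paragraph is not reproduced in this copy. A minimal sketch of the idea, using snprintf in place of printf so the result can be inspected, might look like the following. The typedef mystery_t and the function format_mystery are hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical typedef whose underlying type the caller is assumed not to know. */
typedef long mystery_t;

/* Format x as decimal text in buf. Converting to intmax_t means PRIdMAX
   matches the argument whatever signed type mystery_t really is.
   Returns the value returned by snprintf. */
int format_mystery(char *buf, size_t size, mystery_t x)
{
    return snprintf(buf, size, "%" PRIdMAX, (intmax_t)x);
}
```

The same cast works with printf directly, for example printf("%" PRIdMAX "\n", (intmax_t)x).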

The <inttypes.h> header was invented by some of the 64-bit vendors and predates C99. Originally, <inttypes.h> directly included the contents of <stdint.h>. The C99 committee decided to divide the header into two in order to separate the printf and scanf format macros (which do not interest C++ programmers) from the integer types and limit macros (which do interest C++ programmers). Some systems that are not fully compatible with C99 lack a <stdint.h> but have the older version of <inttypes.h>.

Conclusion

As many programmers have independently discovered, a simple solution to portability and different integer types is to use typedefs. C99 has standardized the names and uses of such typedefs in the new headers <stdint.h> and <inttypes.h>. These headers not only solve the problem of the mapping from C keywords to different sized integers, but also might give you access to an implementation’s extended integer types in a way that does not automatically disallow portability. These headers are also simple enough that if your implementation does not yet provide them, it is a simple matter to write your own.

Listing 1: Using string literal concatenation with C99 <inttypes.h> macros to create format specifiers for scanf and printf
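The body of Listing 1 is not included in this copy. The sketch below is a reconstruction from the description in the text, not the author's original code. It builds the same two format strings by string literal concatenation, but passes them to sscanf and snprintf rather than scanf and printf, a substitution made here so the result can be checked without console input. The function name format_padded is hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* Read a decimal integer from text, then write "x=" followed by the value
   zero-padded to a width of 8 and a newline into out.
   Returns 0 on success, -1 if parsing fails or out is too small. */
int format_padded(const char *text, char *out, size_t size)
{
    int_fast64_t x;
    int n;

    /* "%" SCNdFAST64 becomes "%lld" if int_fast64_t is long long. */
    if (sscanf(text, "%" SCNdFAST64, &x) != 1)
        return -1;

    /* "x=%08" PRIdFAST64 "\n" becomes "x=%08lld\n" in that same case. */
    n = snprintf(out, size, "x=%08" PRIdFAST64 "\n", x);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```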
