Variadic Functions and Default Argument Promotions in C

1. Overview

The C standard defines a rule called the default argument promotions.
The rule is easy to trip over. This article uses C variadic functions to show where the trap lies.

2. Definition of the Default Argument Promotions

The standard defines them as follows:

If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. -- C11 6.5.2.2 Function calls (6)

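Roughly: when the compiler does not know the parameter types, as with an old-style (K&R) function declaration or for the arguments that match the ... in a parameter list, the caller applies the integer promotions to those arguments and also converts arguments of type float to double. Together these conversions are called the default argument promotions.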
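The integer-promotion half of the rule can be observed without calling any function, because unary plus applies the same integer promotions to its operand. The following is a minimal sketch, assuming a C11 compiler (for _Generic) and a mainstream platform where int can represent every char value. TYPE_NAME and narrow_integers_promote_to_int are hypothetical names introduced only for this illustration. The float-to-double step cannot be shown this way, since it belongs to the default argument promotions only.

```c
#include <string.h>

/* Hypothetical helper macro (not from the article): names the type of an
   expression. Requires C11 for _Generic. */
#define TYPE_NAME(x) _Generic((x),                      \
    char: "char", signed char: "signed char",           \
    unsigned char: "unsigned char",                     \
    short: "short", unsigned short: "unsigned short",   \
    int: "int", unsigned int: "unsigned int",           \
    float: "float", double: "double",                   \
    default: "other")

/* Returns 1 if char and short operands are promoted to int. */
int narrow_integers_promote_to_int(void)
{
    char c = 'A';
    short s = 1;

    /* Unary plus applies the integer promotions to its operand,
       so +c and +s have the promoted type while c keeps type char. */
    return strcmp(TYPE_NAME(c), "char") == 0 &&
           strcmp(TYPE_NAME(+c), "int") == 0 &&
           strcmp(TYPE_NAME(+s), "int") == 0;
}
```

On such a platform narrow_integers_promote_to_int() returns 1: the declared types are narrow, but the values that take part in an expression, or that are passed through ..., have type int.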

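3. Variadic Functions

C supports variadic functions (variable argument functions), which accept a varying number of arguments. The definition marks the variable part with an ellipsis (...); familiar examples are printf() and execl(). The prototype of printf is: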

int printf(const char *format, ...);

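Note that a variadic function defined this way needs at least one named parameter, such as format in the prototype above. The trailing ellipsis is part of the function prototype.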
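As an illustration of why the named parameter matters, here is a minimal sketch of a function in the style of execl(), whose argument list ends with a null pointer. It uses the stdarg.h macros that the article introduces next. count_strings is a hypothetical helper, not a library function, and it assumes that every variable argument is passed as char * and that the list ends with (char *)NULL.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts the string arguments that precede the
   terminating null pointer. The caller must end the list with (char *)NULL. */
int count_strings(const char *first, ...)
{
    va_list ap;
    const char *s;
    int n = 0;

    va_start(ap, first);    /* the named parameter anchors the traversal */
    for (s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A call such as count_strings("ls", "-l", (char *)NULL) counts two strings. The cast on the terminator matters because a bare NULL may be an int constant, which would not match the char * that va_arg asks for.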

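C provides a set of macros for reading the variable arguments: va_start, va_arg and va_end. In ANSI C they are defined in stdarg.h. Written as pseudo-prototypes, the three macros look like this: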

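void va_start(va_list ap, last); // initializes ap so that it refers to the first variable argument;
// last is the final named parameter of the function (format in the printf prototype above)
type va_arg(va_list ap, type); // returns the variable argument that ap currently refers to, then advances ap to the next one;
// type must be the argument's type after the default argument promotions (for example int or double, never char or float)
void va_end(va_list ap); // ends the traversal; ap must not be used again until va_start reinitializes it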

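A variadic function first declares an object of type va_list in its body; here it is called ap, as in the pseudo-prototypes. va_start makes ap refer to the first variable argument. va_arg returns the current argument and advances ap to the next one. va_end finishes the traversal and leaves ap unusable until the next va_start (some implementations simply set it to a null pointer). The function body may traverse the arguments more than once, but every traversal must begin with va_start and end with va_end.

Here is a concrete example: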

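#include <stdarg.h>

/* Returns the average of `count` arguments of type double. */
double average(int count, ...)
{
    va_list ap;
    int j;
    double tot = 0;

    va_start(ap, count);            /* make ap refer to the first variable argument */
    for (j = 0; j < count; j++)
        tot += va_arg(ap, double);  /* fetch an argument; its type must be named explicitly */
    va_end(ap);                     /* finish with ap */
    return tot / count;
}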
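The remark that the arguments may be traversed more than once can be sketched as follows. spread is a hypothetical function written for this illustration; it assumes count is at least 1 and that every variable argument arrives as double. Each pass is bracketed by its own va_start and va_end.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: returns (maximum - minimum) of `count` double
   arguments, walking the argument list twice. */
double spread(int count, ...)
{
    va_list ap;
    double lo, hi, v;
    int j;

    assert(count > 0);

    /* First traversal: find the minimum. */
    va_start(ap, count);
    lo = va_arg(ap, double);
    for (j = 1; j < count; j++) {
        v = va_arg(ap, double);
        if (v < lo)
            lo = v;
    }
    va_end(ap);

    /* Second traversal: start over and find the maximum. */
    va_start(ap, count);
    hi = va_arg(ap, double);
    for (j = 1; j < count; j++) {
        v = va_arg(ap, double);
        if (v > hi)
            hi = v;
    }
    va_end(ap);

    return hi - lo;
}
```

For example, spread(3, 1.0, 4.0, 2.5) yields 3.0. Passing float values such as 1.5f also works, because the caller promotes them to double before the call.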

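4. The Default Argument Promotion Trap in Variadic Functions

With variadic functions understood, let us implement a simplified my_printf:
1. It returns void and does not count the characters written.
2. It accepts only "%d" (print an int), "%c" (print a character) and "%%" (print '%' itself).
Many people come up with something like this: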

#include <stdio.h>
#include <assert.h>
#include <stdarg.h>

void my_printf(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt); /* initialize ap from the last named parameter */
    for (; *fmt; ++fmt)
    {
        /* not the start of a conversion specification */
        if (*fmt != '%')
        {
            putchar(*fmt); /* print it as is */
            continue;
        }
        /* a '%' was found; look at the next character */
        ++fmt;
        if ('\0' == *fmt) /* end of the string */
        {
            assert(0); /* this is an error */
            break;
        }
        switch (*fmt)
        {
        case '%': /* two consecutive '%' print one '%' */
            putchar('%');
            break;
        case 'd': /* print as int */
            {
                /* the next argument is an int; fetch it */
                int i = va_arg(ap, int);
                printf("%d", i);
            }
            break;
        case 'c': /* print as a character */
            {
                /* but is the next argument really a char? */
                /* can it be fetched like this? */
                char c = va_arg(ap, char);
                printf("%c", c);
            }
            break;
        }
    }
    va_end(ap); /* release ap; this is required */
}

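Unfortunately, this code is wrong!

Put simply, when an argument is fetched with va_arg(ap, type), type must never be any of the following:
- char, signed char, unsigned char
- short, signed short, unsigned short (including the spellings short int, signed short int and unsigned short int)
- float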

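The reason is simple:
- The caller never passes an argument of any of these types to my_printf.

Why not? This is where the default argument promotions come in.

From the standard: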

If the expression that denotes the called function has a type that does include a prototype, the arguments are implicitly converted, as if by assignment, to the types of the corresponding parameters, taking the type of each parameter to be the unqualified version of its declared type. The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter. The default argument promotions are performed on trailing arguments. -- C11 6.5.2.2 Function calls (7)

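When do the default argument promotions apply in C?

When a function without a prototype declaration is called, the caller performs the default argument promotions on every argument.

The same promotions are applied to every argument of a variadic function that comes after the last declared parameter.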

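The promotions are:
- an argument of type float is promoted to double
- an argument of type char or short, or of their signed or unsigned variants, is promoted to int
- if int cannot represent every value of the original type, the argument is promoted to unsigned int instead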

The caller then passes the promoted arguments to the callee.
So my_printf can never receive an argument of any of the types listed above.
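The caller-side effect of these rules can be sketched with a small function that reads every variable argument as int, whatever narrow integer type the caller declared. sum_as_int is a hypothetical helper introduced for this illustration. It assumes a platform where int can represent every value of the narrow types involved, which holds on mainstream systems.

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` integer arguments. Each one is read
   as int, because that is the type the caller actually passes after the
   default argument promotions. */
long sum_as_int(int count, ...)
{
    va_list ap;
    long total = 0;
    int j;

    va_start(ap, count);
    for (j = 0; j < count; j++)
        total += va_arg(ap, int);   /* promoted type, not the declared one */
    va_end(ap);
    return total;
}
```

A call such as sum_as_int(3, (char)1, (short)2, (unsigned char)200) mixes three narrow types, yet the callee sees three int values and can read them uniformly.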

In the code above, the two lines in the 'c' case that fetch and print the character should be changed to:
int c = va_arg(ap,int);
printf("%c",c);

Likewise, values that are meant to be short or float should be fetched like this:
short s = (short)va_arg(ap,int);
float f = (float)va_arg(ap,double);
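Put together, the pattern is to fetch the promoted type and then narrow the result. The following minimal sketch assumes the caller passes exactly a char, a short and a float, in that order; struct narrow_values and read_narrow are hypothetical names used only for this example.

```c
#include <stdarg.h>

/* Hypothetical container for the three narrowed values. */
struct narrow_values {
    char c;
    short s;
    float f;
};

/* Hypothetical helper: expects a char, a short and a float after `tag`.
   Each value is fetched with its promoted type and then converted back. */
struct narrow_values read_narrow(int tag, ...)
{
    struct narrow_values out;
    va_list ap;

    va_start(ap, tag);
    out.c = (char)va_arg(ap, int);      /* char arrives as int */
    out.s = (short)va_arg(ap, int);     /* short arrives as int */
    out.f = (float)va_arg(ap, double);  /* float arrives as double */
    va_end(ap);
    return out;
}
```

The narrowing conversions are lossless here because each value started out in the narrow type before the caller promoted it.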

Let us look at another concrete example:

#include <stdarg.h>
#include <stdio.h>

void read_args_from_va_good(int i, ...)
{
    va_list arg_ptr;
    va_start(arg_ptr, i);
    /* This is right. */
    printf("%c\n", va_arg(arg_ptr, int));
    printf("%d\n", va_arg(arg_ptr, int));
    printf("%f\n", va_arg(arg_ptr, double));
    va_end(arg_ptr);
}

void read_args_from_va_bad(int i, ...)
{
    va_list arg_ptr;
    va_start(arg_ptr, i);
    /* This is wrong. */
    printf("%c\n", va_arg(arg_ptr, char));
    printf("%d\n", va_arg(arg_ptr, short));
    printf("%f\n", va_arg(arg_ptr, float));
    va_end(arg_ptr);
}

int main()
{
    char c = 'c';
    short s = 0;
    float f = 1.1f;
    read_args_from_va_good(0, c, s, f);
    read_args_from_va_bad(0, c, s, f);
    return 0;
}

Compiling the code above with gcc 4.4.0 produces warnings:

va_arg.c: In function ‘read_args_from_va_bad’:
va_arg.c:47: warning: ‘char’ is promoted to ‘int’ when passed through ‘...’
va_arg.c:47: note: (so you should pass ‘int’ not ‘char’ to ‘va_arg’)
va_arg.c:47: note: if this code is reached, the program will abort
va_arg.c:48: warning: ‘short int’ is promoted to ‘int’ when passed through ‘...’
va_arg.c:48: note: if this code is reached, the program will abort
va_arg.c:49: warning: ‘float’ is promoted to ‘double’ when passed through ‘...’
va_arg.c:49: note: if this code is reached, the program will abort

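Running the program built with gcc 4.4.6 ends with Illegal instruction as soon as execution enters read_args_from_va_bad(), and the program exits. A look at the assembly generated by gcc 4.4.6 shows that no real code was emitted for read_args_from_va_bad().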

astrol@astrol:~/c$ gdb va_arg -q
Reading symbols from /home/astrol/c/va_arg...done.
(gdb) run
Starting program: /home/astrol/c/va_arg
c 0 1.100000
c
0
1.100000
Program received signal SIGILL, Illegal instruction.
0x08048452 in read_args_from_va_bad (i=0) at va_arg.c:44
44 va_start(arg_ptr, i);
(gdb) x/i $pc
=> 0x8048452 <read_args_from_va_bad+12>: ud2
(gdb)

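UD2 is an instruction whose only purpose is to make the CPU raise an invalid opcode exception. When the kernel sees this exception, it stops the program immediately, here by delivering SIGILL.

When built with VC, the program runs but its output is incorrect.

The following is from C Traps and Pitfalls (《C陷阱与缺陷》):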

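There is a trap to avoid here:
The second argument of the va_arg macro must not be char, short or float.
This is because char and short arguments are converted to int, and float arguments are converted to double ...
For example, this is certainly wrong:
c = va_arg(ap,char);
because an argument of type char can never be passed; if one is supplied, it is automatically converted to int. The expression should be written as:
c = va_arg(ap,int);
-- C Traps and Pitfalls (《C陷阱与缺陷》), p. 164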

Some readers may ask: don't the three macros in VC already round everything up to int size automatically? They look like this:

#define _INTSIZEOF(n) ( (sizeof(n) + sizeof(int) - 1) & ~(sizeof(int) - 1) )
#define va_start(ap,v) ( ap = (va_list)&v + _INTSIZEOF(v) )
#define va_arg(ap,t) ( *(t *)((ap += _INTSIZEOF(t)) - _INTSIZEOF(t)) )
#define va_end(ap) ( ap = (va_list)0 )

Below is the implementation from Linux 2.6.22, which amounts to the same thing:

#define _AUPBND (sizeof (acpi_native_int) - 1)
#define _ADNBND (sizeof (acpi_native_int) - 1)
/*
* Variable argument list macro definitions
*/
#define _bnd(X, bnd) (((sizeof (X)) + (bnd)) & (~(bnd)))
#define va_arg(ap, T) (*(T *)(((ap) += (_bnd (T, _AUPBND))) - (_bnd (T,_ADNBND))))
#define va_end(ap) (void) 0
#define va_start(ap, A) (void) ((ap) = (((char *) &(A)) + (_bnd (A,_AUPBND))))
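Reading these macros closely suggests why the rounding alone does not rescue va_arg(ap, char) or va_arg(ap, float). The size passed to the rounding macro decides how far ap advances, but the same type t is also used for the read *(t *). The sketch below is an illustration only: it assumes 4-byte int and float, 8-byte double and IEEE 754 floating point. SLOT_SIZE is a renamed copy of the rounding rule above, and leading_bytes_as_float is a hypothetical helper that mimics what a float-sized read finds in a slot where the caller stored a double.

```c
#include <stddef.h>
#include <string.h>

/* Renamed copy of the rounding rule from the macros above: the size of
   the argument slot, rounded up to a multiple of sizeof(int). */
#define SLOT_SIZE(t) ((sizeof(t) + sizeof(int) - 1) & ~(sizeof(int) - 1))

/* Hypothetical helper: copies the first sizeof(float) bytes of a double
   into a float, as *(float *) would do on a slot that holds a double. */
float leading_bytes_as_float(double d)
{
    float f;

    memcpy(&f, &d, sizeof f);
    return f;
}
```

Under these assumptions SLOT_SIZE(char) and SLOT_SIZE(short) both equal sizeof(int), so the pointer still advances correctly for those types and only the width of the read is wrong. For float the situation is worse: SLOT_SIZE(float) is smaller than the slot of the double that was actually passed, so ap ends up in the middle of that argument, and leading_bytes_as_float(1.1) is not 1.1f. This is consistent with the incorrect output reported for VC.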