As I have mentioned in earlier posts, we are building a cross-platform mobile chat client. One of the platforms is the iPhone, which means the team has to learn Objective-C, and therefore C. For the last couple of days I have been helping my team understand and appreciate C better. Yesterday we went deep into how to read a C declaration, and it was a lot of fun taking people through the rules and some complex examples. I am summarizing the key points here.

PS: I could not cross-link to references because there is no authoritative HTML version of the C99 standard; the standard documents are all PDFs.

C Declaration Grammar

Understanding a C declaration requires that we understand what elements a C declaration can have. I will describe the overall syntax here, but I will not go into semantics:

declaration:
	declaration-specifiers init-declarator-list_opt ;
declaration-specifiers:
	storage-class-specifier declaration-specifiers_opt
	type-specifier declaration-specifiers_opt
	type-qualifier declaration-specifiers_opt
	function-specifier declaration-specifiers_opt
init-declarator-list:
	init-declarator
	init-declarator-list , init-declarator
init-declarator:
	declarator
	declarator = initializer

This is what the above grammar is saying:

  • A declaration is:
    • declaration-specifiers followed by an optional list of init-declarators
  • The declaration-specifiers are:
    • A storage-class specifier, followed by other optional declaration specifiers
    • A type-specifier, followed by other optional declaration specifiers
    • A type-qualifier, followed by other optional declaration specifiers
    • A function-specifier, followed by other optional declaration specifiers
  • List of init-declarators is:
    • An init-declarator
    • An init-declarator list, followed by a comma, followed by an init-declarator
  • An init-declarator is:
    • A declarator
    • A declarator followed by an assignment operator (=),  followed by an initializer
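
To make the grammar concrete, here is a minimal sketch of one declaration split into its parts. The names mask, pmask and low_byte are made up for illustration and do not come from the standard.

```c
/* One declaration, two init-declarators.
 *
 *   declaration-specifiers : static const unsigned long
 *   init-declarator #1     : mask = 0xFFUL    (declarator = initializer)
 *   init-declarator #2     : *pmask = &mask   (pointer declarator = initializer)
 */
static const unsigned long mask = 0xFFUL, *pmask = &mask;

/* Hypothetical helper: reads mask through the pointer declared above. */
unsigned long low_byte(unsigned long value)
{
    return value & *pmask;
}
```

Note that the specifiers are shared by both declarators, while the asterisk belongs only to pmask.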

Declaration Specifiers

As per the grammar defined above, a declaration starts with a declaration specifier. A declaration-specifier can be:

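  • Storage-class specifier: One of auto, static, extern, register and typedef. Inside a block, an object declared without one has automatic storage, as if auto had been written; at file scope the default is external linkage with static storage duration.
  • Type-specifier: C89 assumed int when no type specifier was given, but C99 requires at least one. The valid combinations are: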
    • void
    • char
    • signed char
    • unsigned char
    • short, signed short, short int, or signed short int
    • unsigned short or unsigned short int
    • int, signed or signed int
    • unsigned or unsigned int
    • long, signed long, long int or signed long int
    • unsigned long or unsigned long int
    • long long, signed long long, long long int, or signed long long int
    • unsigned long long or unsigned long long int
    • float
    • double
    • long double
    • _Bool
    • float _Complex
    • double _Complex
    • long double _Complex
    • struct-or-union-specifier
    • enum-specifier
    • typedef-name
  • Type-qualifier: const, volatile and restrict. Qualifiers are meaningful only for l-values, and restrict applies only to pointer types
  • Function-specifier: There is only one function specifier: inline
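
As a small sketch of how specifiers combine, the following declarations spell the same type three ways and then mix a storage class, two qualifiers and a type in one specifier list. All names here are hypothetical.

```c
/* Three spellings of one type: the order of type specifiers is free. */
unsigned long int first = 1UL;
long unsigned int second = 2UL;
int long unsigned third = 3UL;

/* Storage class, qualifiers and type specifiers in one specifier list. */
static const volatile unsigned char ready_flag = 0x80;

unsigned long specifier_sum(void)
{
    unsigned long *p = &second;   /* no cast needed: the types are identical */
    unsigned long *q = &third;
    return first + *p + *q + (ready_flag >> 7);
}
```

The pointer assignments compile without a cast precisely because all three variables have the type unsigned long.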

Declarators

What follows the declaration-specifiers is an init-declarator-list – a comma-separated sequence of declarators that may optionally be initialized. A declarator [2] is defined as:

declarator:
	pointer_opt direct-declarator
direct-declarator:
	identifier
	( declarator )
	direct-declarator [ type-qualifier-list_opt assignment-expression_opt ]
	direct-declarator [ static type-qualifier-list_opt assignment-expression ]
	direct-declarator [ type-qualifier-list static assignment-expression ]
	direct-declarator [ type-qualifier-list_opt * ]
	direct-declarator ( parameter-type-list )
	direct-declarator ( identifier-list_opt )
pointer:
	* type-qualifier-list_opt
	* type-qualifier-list_opt pointer
type-qualifier-list:
	type-qualifier
	type-qualifier-list type-qualifier
parameter-type-list:
	parameter-list
	parameter-list , ...
parameter-list:
	parameter-declaration
	parameter-list , parameter-declaration
parameter-declaration:
	declaration-specifiers declarator
	declaration-specifiers abstract-declarator_opt
identifier-list:
	identifier
	identifier-list , identifier

We can break down the above grammar the same way we did for the grammar of a declaration, but I’m gonna skip that and give some examples of declarators, just to give an idea of what declarators are supposed to be:

  • i – An identifier (and hence a direct-declarator)
  • i, j – A list of two declarators
  • i = 10 – A declarator followed by an initializer
  • *p – A pointer declarator
  • * const p – A pointer declarator with a type-qualifier (const)
  • a[10] – An array declarator with a constant size
  • a[*] – A variable-length array declarator of unspecified size (allowed only in function prototypes)
  • a[] – An array declarator with an unspecified size – the size will need to be defined somewhere else
  • f() – A function declarator with an unspecified parameter list
  • f(void) – A function declarator with no parameters
  • f(int i) – A function declarator with a parameter
  • f(int i, int j) – A function declarator with a parameter-list
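
The declarator forms above can be seen together in a short compilable sketch. It assumes a C99 compiler with variable-length array support (for example gcc or clang); the names table, cp and sum are made up for illustration.

```c
/* Each line pairs the specifier int with one declarator form. */
int i;                      /* identifier */
int j = 10;                 /* declarator with an initializer */
int *p = &j;                /* pointer declarator */
int * const cp = &j;        /* const-qualified pointer declarator */
int a[10];                  /* array of 10 int */
extern int table[];         /* unspecified size, completed below */
int table[4] = {1, 2, 3, 4};
int sum(int n, int v[*]);   /* [*] is allowed only in a prototype */
int f(void);                /* function taking no parameters */
int g(int x, int y);        /* function with a parameter list */

/* In the definition, [*] must be replaced by a real size expression. */
int sum(int n, int v[n])
{
    int total = 0;
    for (int k = 0; k < n; k++)
        total += v[k];
    return total;
}

int f(void) { return *cp; }
int g(int x, int y) { return x + y; }
```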

Structure of a Declaration

Now that we have developed an idea of declaration-specifiers and declarators, we can see that the overall structure of a declaration is of the following form:

  1. One or more declaration-specifiers followed by
  2. Zero or more declarators (separated by commas), each optionally followed by an initializer

However, not all combinations are valid. The following are not allowed:

f()[] // function can't return an array
f()() // function can't return a function
a[]() // array can't hold a function

The following, however, are allowed:

(* f())[] // function returning pointer to an array
(* f())() // function returning pointer to a function
(* a[])() // array holding pointers to functions
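
Here is a minimal sketch of the three allowed forms as real definitions. The names get_row, pick and ops are hypothetical, and the array size is fixed at 3 only to keep the example small.

```c
static int row[3] = {10, 20, 30};

static int twice(void) { return 2; }
static int thrice(void) { return 3; }

/* function returning pointer to an array of 3 int */
int (*get_row(void))[3] { return &row; }

/* function returning pointer to a function returning int */
int (*pick(int want_three))(void) { return want_three ? thrice : twice; }

/* array holding pointers to functions returning int */
int (*ops[])(void) = { twice, thrice };
```

A caller would write (*get_row())[1] to reach an element, pick(1)() to call the selected function, and ops[0]() to call through the array.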

If you are flummoxed by the last three examples of combining pointers, functions and arrays, you are not alone. Such declarations can look scary until you develop a technique for deciphering them. To do that, however, we need to know the basic order of precedence.

Operator Precedence

  1. Parentheses grouping together parts of a declaration
  2. Postfix operators: parentheses (for a function), square brackets (for an array)
  3. Prefix operator: Asterisk (for a pointer)

Here are some examples of how this order applies:

Declaration     Same as          Not same as
int * f();      int * (f());     int (* f());
int * a[];      int * (a[]);     int (* a[]);
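
One way to convince yourself of the difference is to compare sizes. This is a small sketch with made-up names (ptrs, block); it relies only on sizeof, which does not evaluate its operand.

```c
#include <stddef.h>

int *ptrs[4];        /* same as int *(ptrs[4]): array of 4 pointers to int */
int (*block)[4];     /* pointer to an array of 4 int */

/* Number of elements in the array of pointers. */
size_t ptrs_count(void) { return sizeof ptrs / sizeof ptrs[0]; }

/* Number of int elements in the array that block points to. */
size_t block_elems(void) { return sizeof *block / sizeof (*block)[0]; }
```

ptrs itself is four pointers wide, while block is a single pointer whose target is four ints wide.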

Now the way to use these rules is as follows:

  1. Read from left to right
  2. For the first identifier you encounter (or if there is no identifier, then look for the inner-most construct), look to the immediate right
    • If there is nothing, or if you have a closing parenthesis, go to 3
    • Otherwise you have a function declarator indicated by () or an array declarator indicated by [] to the right of the identifier
    • Read left to right – you will have a “function returning” or “array of”
    • This would typically end with a right parenthesis, or the end of the declarator (semi-colon or assignment)
  3. Look to the left
    • If you find nothing on the left, or if you find an opening parenthesis, go to 4
    • Otherwise you have a pointer declarator indicated by an asterisk to the left of the identifier. Read right to left – you will get a “pointer to”
    • This would end with a left parenthesis, or start of declarator
  4. At this point you have either
    • A complete declarator – then you are done
    • Or an expression in parentheses – go back to step 2.

If you did not realize this on reading the above description, what we are really doing in steps 2 and 3 is converting the C declaration into a postfix expression by taking into account the lower precedence of the asterisk. The same approach is used in the dcl program given in K&R (section 5.12).

This will become clear with a few examples:

int * f();
  • Start with identifier – f
  • Parentheses to the right – “function returning”
  • Asterisk to the left – “pointer to”
  • int

The postfix expression is:

f () * int // "f is" "a function returning" "pointer to int"

Here are a few more examples:

int * a[10];
// postfix expression: a [10] * int
// "a is" "array of 10" "pointers to int"
int (* a)[10];
// postfix: a * [10] int
// "a is" "pointer to" "array of 10" "int"
int **p;
// postfix: p * * int
// "p is" "pointer to" "pointer to" int
int **p[10];
// postfix: p [10] * * int
// "p is" "array of 10" "pointer to" "pointer to" int
int *f();
// postfix: f () * int
// "f is" "function returning" "pointer to" int
int (*f)();
// postfix: f * () int
// "f is" "a pointer to" "function returning" int
int (* vtable[])();
// postfix: vtable [] * () int
// vtable is an array of pointers to functions returning int
int (* a[])(int, int);
// postfix: a [] * (int, int) int
// a is an array of pointers to functions returning int
// and taking (int, int) as parameters
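
The reading steps can also be sketched as a tiny recursive-descent routine. This is only an illustration loosely in the spirit of K&R's dcl, not a copy of it: the function describe and its helpers are hypothetical, and it assumes a single-word type, no qualifiers, no abstract declarators, and no nested parentheses inside a parameter list.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

static const char *src;   /* cursor into the declaration text */
static char name[32];     /* identifier being declared */
static char out[256];     /* description accumulated so far */

static void append(const char *text)
{
    strncat(out, text, sizeof out - strlen(out) - 1);
}

static void skip_spaces(void)
{
    while (*src == ' ')
        src++;
}

static void declarator(void);

/* Step 2: find the identifier (or a parenthesized group), then read rightwards. */
static void direct_declarator(void)
{
    skip_spaces();
    if (*src == '(') {                      /* grouping parentheses */
        src++;
        declarator();
        skip_spaces();
        if (*src == ')')
            src++;
    } else {                                /* the identifier itself */
        size_t n = 0;
        while ((isalnum((unsigned char)*src) || *src == '_') && n < sizeof name - 1)
            name[n++] = *src++;
        name[n] = '\0';
    }
    for (;;) {                              /* postfix () and [] bind first */
        skip_spaces();
        if (*src == '(') {
            while (*src && *src != ')')     /* parameter list is skipped */
                src++;
            if (*src == ')')
                src++;
            append(" function returning");
        } else if (*src == '[') {
            src++;
            append(" array of");
            if (isdigit((unsigned char)*src))
                append(" ");
            while (isdigit((unsigned char)*src)) {
                char digit[2] = { *src++, '\0' };
                append(digit);
            }
            while (*src && *src != ']')
                src++;
            if (*src == ']')
                src++;
        } else {
            break;
        }
    }
}

/* Step 3: asterisks on the left are applied after everything on the right. */
static void declarator(void)
{
    int stars = 0;
    skip_spaces();
    while (*src == '*') {
        stars++;
        src++;
        skip_spaces();
    }
    direct_declarator();
    while (stars-- > 0)
        append(" pointer to");
}

/* Hypothetical entry point: returns a pointer to a static result buffer. */
const char *describe(const char *declaration)
{
    static char result[384];
    char type[32];
    size_t n = 0;

    src = declaration;
    name[0] = out[0] = '\0';
    skip_spaces();
    while (isalpha((unsigned char)*src) && n < sizeof type - 1)
        type[n++] = *src++;
    type[n] = '\0';
    declarator();
    snprintf(result, sizeof result, "%s:%s %s", name, out, type);
    return result;
}
```

For example, describe("int (*a)[10]") returns "a: pointer to array of 10 int". The routine keeps its state in static buffers, so it is not reentrant; that keeps the sketch short rather than being a recommendation.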

Applying Type-Qualifiers

The way you apply a type-qualifier (const, volatile) is that:

  1. If next to a type-specifier, it applies to the type-specifier
  2. Otherwise applies to the asterisk (pointer) on its immediate left

Examples:

const int * p;
// postfix: p * const int
// p is a pointer to a const int
int const * p;
// postfix: p * const int
// p is a pointer to a const int
int * const p;
// postfix: p const * int
// p is a const pointer to int
int * const * p;
// postfix: p * const * int
// p is a pointer to a const pointer to int
int * const * (* p)();
// postfix: p * () * const * int
// p is a pointer to a function returning a
// pointer to const-pointer-to-int
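
A short sketch shows what each qualifier placement actually forbids. The variable and function names are made up; the commented-out lines are the ones a compiler would reject.

```c
static int x = 1, y = 2;

const int *p_to_const = &x;        /* pointer may change, pointee is read-only */
int * const const_p = &x;          /* pointer is fixed, pointee is writable */
int * const * pp = &const_p;       /* pointer to a const pointer to int */

int qualifier_demo(void)
{
    p_to_const = &y;               /* OK: the pointer itself is not const */
    /* *p_to_const = 5; */         /* error: the pointee is const */
    *const_p = 10;                 /* OK: the int is not const */
    /* const_p = &y; */            /* error: the pointer is const */
    **pp = **pp + 1;               /* OK: writes x through both levels */
    /* *pp = &y; */                /* error: *pp is a const pointer */
    return x + *p_to_const;        /* x is now 11, *p_to_const is y */
}
```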

Abstract Declarators

An abstract declarator is a declarator without an identifier. However, the lack of an identifier does not mean that we can’t interpret such declarators, since the missing identifier’s position can be determined from the placement of (), [] and *. Some examples will help clarify this:

int *
// pointer to int
int * [10]
// array of 10 pointers to int
int * (void)
// function returning a pointer to int and taking no arguments
int (*) [10]
// pointer to array of 10 int
int (*) (void)
// pointer to a function returning an int and taking no arguments
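
Abstract declarators show up in three everyday places: prototypes without parameter names, sizeof, and casts. The sketch below uses hypothetical names (apply, negate, abstract_demo) to show one of each.

```c
/* Prototype with abstract declarators: parameter names are omitted. */
int apply(int (*)(int), int);

static int negate(int v) { return -v; }

int apply(int (*fn)(int), int v) { return fn(v); }

/* Type names inside sizeof are abstract declarators too. */
int abstract_sizes_match(void)
{
    int *row_ptrs[10];
    int (*grid)[10];
    return sizeof(int *[10]) == sizeof row_ptrs
        && sizeof(int (*)[10]) == sizeof grid;
}

int abstract_demo(void)
{
    int data[10] = {0};
    void *raw = &data;
    int (*grid)[10] = (int (*)[10])raw;   /* cast to "pointer to array of 10 int" */
    (*grid)[3] = 7;
    return data[3] + apply(negate, 2);    /* 7 + (-2) */
}
```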

With this background, let’s tackle this gem from Andrew Koenig’s C Traps and Pitfalls (PDF):

(*(void(*)())0)();

Clearly, we are casting 0 to some type here. The type being:

void(*)()

We know this type:

// postfix: * () void
// pointer to function returning void

In the next step, this pointer is de-referenced and then called. So what is happening here is that zero is being cast to a pointer to a function returning void, then dereferenced and called!
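
Calling whatever lives at address 0 only makes sense on particular hardware, so here is a sketch of the same cast, dereference and call pattern applied to a real function instead. The names generic_fn, square and call_through_cast are hypothetical.

```c
typedef void (*generic_fn)(void);     /* hypothetical "any function" pointer type */

static int square(int v) { return v * v; }

int call_through_cast(int v)
{
    generic_fn stored = (generic_fn)square;   /* forget the real type */

    /* Same shape as (*(void(*)())0)(): cast, dereference, call. */
    return (*(int (*)(int))stored)(v);
}
```

Converting a function pointer to another function pointer type and back is well defined, as long as the call is made through the original type, which is what the cast inside the return statement restores.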

  • http://twitter.com/dhruvbird Dhruv Matani

    a[*] – An array declarator with a variable size
    This is something new for me. Is it part of the standard? Seeing it used for the first time.

    • http://www.vineetgupta.com Vineet Gupta

      Referring C-99 std: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

      From section 6.7.5.2 Array declarators, Semantics, point 4:

      If the size is not present, the array type is an incomplete type. If the size is * instead of being an expression, the array type is a variable length array type of unspecified size, which can only be used in declarations with function prototype scope; such arrays are nonetheless complete types. If the size is an integer constant expression and the element type has a known constant size, the array type is not a variable length array type; otherwise, the array type is a variable length array type.

  • MiniMind

    Relevant http://cdecl.org/
