Saturday, March 7, 2015

Reading C type declarations

1. cdecl is a program for encoding and decoding C (or C++) type declarations.

2. An online cdecl tool is available at http://cdecl.org/

C is not an easy language to parse. A cursory glance at its BNF (Backus–Naur Form) grammar should convince anyone immediately. One of the hairiest parts of the grammar is type declarations.

Brian Kernighan and Dennis Ritchie (the creator of C) admit it themselves at the beginning of section 5.12 of K&R2 ("The C Programming Language", 2nd Ed.):

C is sometimes castigated for the syntax of its declarations, particularly ones that involve pointers to functions. The syntax is an attempt to make the declaration and the use agree; it works well for simple cases, but it can be confusing for the harder ones, because declarations cannot be read left to right, and because parentheses are over-used.

Quick, what is the type of foo here:

char *(*(**foo [][8])())[];

Oh, you didn't know that foo is an array of array of 8 pointer to pointer to function returning pointer to array of pointer to char? Shame on you...

Seriously, though, type declarations in C are complex, and sometimes aren't intuitive. There is, however, a relatively simple method of reading them.

First of all, declarations consist of a basic type and modifiers:

/* int is a basic type */
int x;     

/* [] is the 'array of' modifier */
int x[5]; 

/* * is the 'pointer to' modifier */
int *x;

/* () is the 'function returning...' modifier */
int (*fptr)(void);

When you see a complex declaration, first recognize the basic type and the variable name. In:

int (*x)[10][20];

The basic type is int and the variable name is x. So the declaration means "x is ... int", where the "..." still has to be filled in by reading the modifiers.

To read the modifiers, go right from the variable name for as long as you can, that is, until you run into a semicolon or a closing parenthesis. When you reach one of these stops, start going left until you reach an opening parenthesis (or the basic type, in which case you're done). After an opening parenthesis, jump back to the right, just past its matching closing parenthesis, and continue from there. Each time you see a new modifier (either going right or left), attach it to the end of the current declaration sentence.

Let's see some examples:

/* x is int (but that was easy...) */
int x;

/* go right from 'x' - we hit the array
   and then get stuck on the ';', so 
   we start going left, where there's
   nothing.
   
   so:
   
   x is an array[5] of int 
*/
int x[5];

/* there's nothing to the right, but a '*'
   to the left, so:
   
   x is a pointer to int
*/
int *x;

/* now, combining these cases:
   
   x is an array[5] of pointer to int
*/
int *x[5];

/* how about this ?
  
   x is an array[5] of array[2] of int
*/
int x[5][2];

/* hey, this is becoming easy...

   x is an array[5] of array[2] of pointer
     to pointer to int
*/
int **x[5][2];

/* grouping parentheses complicate things,
   but not too much.
   trying to go right from 'x', we hit the
   closing paren, so we go left. After
   we attach the pointer we see an opening
   paren, so we can go right again:
   
   x is a pointer to array[5] of int
*/
int (*x)[5];

/* function declarations are just like arrays:
   we go right from 'x', and attach 'array[4] of'
   then we hit the paren, and go left, attaching
   'pointer to'. Then, we hit the left paren, so
   we go right again, attaching 
   'function(char, int) returning'
   
   And eventually:
   
   x is an array[4] of pointer to 
     function(char, int) returning int
*/
int (*x[4])(char, int);

I hope you're now convinced that the task of understanding C type declarations isn't that difficult.

Some final notes:

1. If you really want to understand what's going on under the hood of C type declarations, read sections A.8.5 and A.8.6 of K&R2. Also, section 5.12 contains a program that translates declarations into words.

2. This page was very useful in the preparation of the article. Thanks to Steve Friedl for sharing it.

3. As some commenters kindly noted, other good sources of information on this topic are the book "Expert C Programming" by Peter van der Linden (chapter 3) and the Unix command cdecl(1).

4. I can't imagine why you would ever need a type as complex as the initial example of this article, but if you do, the best way is to build the type incrementally using typedef declarations.

Reference:

http://stackoverflow.com/questions/89056/how-do-you-read-c-declarations
http://eli.thegreenplace.net/2008/07/18/reading-c-type-declarations/
http://blog.ijun.org/2013/05/lint-statically-checking-c-programs.html
http://blog.ijun.org/2013/04/parse-computer-langauge-syntax.html
http://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form
