Friday, December 28, 2012

How To Read C Declarations


Even experienced C programmers have difficulty reading declarations that go beyond simple arrays and pointers. For example, is the following an array of pointers or a pointer to an array?
int *a[10];
What the heck does the following mean?
int (*(*vtable)[])();
Naturally, it's a pointer to an array of pointers to functions returning integers. ;)
This short article tells you how to read any C declaration correctly using a very simple technique. I am 99% certain that I read this in a book in the late 1980s, but I can't remember where. I doubt that I discovered this on my own (even though I've always been delighted by computer language structure and esoterica). I do remember, however, building a simple program that would translate any declaration into English.

The golden rule

The rule goes like this:
Start at the variable name (or innermost construct if no identifier
is present). Look right without jumping over a right parenthesis; say
what you see. Look left again without jumping over a parenthesis; say
what you see. Jump out a level of parentheses if any. Look right;
say what you see. Look left; say what you see. Continue in this
manner until you say the variable type or return type.
The degenerate case is:
int i;
Starting at i, you look right and find nothing. You look left and find the type int, which you say. Done.
Ok, now a more complicated one:
int *a[3];
Start at a. Look right, say array of size 3. Look left and say pointer. Look right and see nothing. Look left and say int. All together you say a is an array of 3 pointers to int.
Adding parentheses is when it gets weird:
int (*a)[3];
The parentheses change the order just like in an expression. When you look right after a, you see the right parenthesis, which you cannot jump over until you look left. Hence, you would say a is a pointer to an array of 3 ints.

Function pointers

The C "forward" declaration:
extern int foo();
just says that foo is a function returning int. This follows the same pattern for reading declarators as you saw in the previous section. Start at foo and look right. You see () so say function. You look left and see int. Say int.
Now, try this one:
extern int *foo();
Yep, you say foo is a function returning a pointer to int.
Now for the big leap. Just like we can make a pointer to an int or whatever, let's make a pointer to a function. In this case, we can drop the extern as it's no longer a function forward reference, but a data variable declaration. Here is the basic pattern for a function pointer:
int (*foo)();
You start at foo and see nothing to the right inside the parentheses. So, to the left, you say pointer. Then to the right outside you see function. Then left you see int. So you say foo is a pointer to a function returning int.

Combinations

Here is an array of pointers to functions returning int, which we'll need for vtables below:
int (*Object_vtable[])();
You need one last, incredibly bizarre declaration, for the lab:
int (*(*vtable)[])();
This is the pointer to the vtable you will need in each "object" you define.
This pointer to a vtable is set to the address of a vtable; for example, &Truck_vtable.

Summary

The following examples summarize the cases needed for building virtual tables à la C++ to implement polymorphism (like the original cfront C++ to C translator).
int *ptr_to_int;
int *func_returning_ptr_to_int();
int (*ptr_to_func_returning_int)();
int (*array_of_ptr_to_func_returning_int[])();
int (*(*ptr_to_an_array_of_ptr_to_func_returning_int)[])();

Anonymous
Genius! Thanks for this.
Permalink
Feb 06, 2008
Anonymous
With respect to the book, you probably mean "Deep C Secrets" by Van der Linden. It contains an algorithm for reading declarations.

Permalink
Feb 06, 2008
Anonymous
Something similar is in K&R (section 5.12).
Permalink
Feb 06, 2008
Anonymous
The dcl and undcl programs in K&R.

The declarations are converted into text and vice versa.

And the table in an earlier chapter with precedence rules of operators.
Permalink
Feb 06, 2008
Anonymous
This is simply awesome...
Permalink
Feb 06, 2008
Anonymous
tip (*article)[]
Permalink
Feb 06, 2008
Anonymous
I saw a similar article in the Amiga Transactor magazine in the late 80s. I have never been able to find a copy of it online.
Permalink
Feb 06, 2008
Anonymous
Just found an online version of the Transactor article

http://untroubled.org/articles/cdecls.txt
Permalink
Feb 06, 2008
Anonymous
This power point might help if you're stuck reading declarations. http://ieng6.ucsd.edu/~cs12x/RightLeft.ppt
Permalink
Feb 06, 2008
Anonymous
In short: Skip the initial word, then read the rest as an expression involving the declared variable (with array indices missing and types instead of function arguments). The type of this expression is the initially skipped word. Now, if you really want to (you don't need this at this point) you can reconstruct backwards the variable's type.
Permalink
Feb 06, 2008
Anonymous
Look right, look left, it is so simple! Like crossing a street!

Thanks a lot!
Permalink
Feb 06, 2008
Anonymous
Thanks - this is a great technique for understanding existing code, but I believe any declaration that requires you to think this hard should be rewritten. Take this declaration of vtable, which is a pointer to an array of pointers to functions that return int:

int (*(*vtable)[])();

This could be rewritten using typedefs like so:

typedef int (*IntFunctionPtr)( void ); // IntFunctionPtr is a pointer to a function that returns an int
typedef IntFunctionPtr IntFunctionPtrArray[]; // IntFunctionPtrArray is an array of IntFunctionPtr's

IntFunctionPtrArray * vtable;

It may be wordier, but it compiles to the same code, and it's much easier to parse and maintain IMO.
Permalink
Feb 06, 2008
Anonymous
The book you are talking about is probably C traps and pitfalls by Andrew Koenig, published in 1989. Deep C secrets (AKA "Expert C programming") was published in 1994.
Permalink
Feb 06, 2008
Anonymous
I completely agree about using typedefs to make the code more readable. In fact, if I see definitions like the ones above, my mechanism for parsing them is entirely to decompose them into typedefs!!
Permalink
Feb 06, 2008
Anonymous
No no NO!

The rule for reading type declarations isn't some weird combination of look left/right do/not jump over parenthesis or whatever. That's insane!

The rule is much simpler: read the declaration as an expression that extracts a basic type out of the variable. This is the way the C type declarations were intentionally designed almost forty years ago! The fact many C programmers don't know this simple rule is an indication of the sorry state of humanity.

int i; // To get an integer out of "i", write "i". So "i" is an integer.
int *p; // To get an integer out of "p", write "*p". So "p" is a pointer to integer.
int a[3]; // To get an integer out of "a", write "a[index]". So "a" is an array of integers.

So to get an integer out of vtable where "int (*(*vtable)[])()"...

1. Dereference it "***vtable". So it is a pointer.
2. Index the result "(*vtable)[]". So it is a pointer to an array.
3. Dereference the result "*(*vtable)[]". So it is a pointer to an array of pointers.
4. Invoke the result "(*(*vtable)[])()". So it is a pointer to an array of pointers to functions.
5. The result is an integer "int (*(*vtable)[])()". So it is a pointer to an array of pointers to functions returning int.

This explains why there are parentheses sprinkled through the declaration; they are used to ensure the expression works correctly given the precedence of the operators.
Permalink
Feb 06, 2008
Kay Röpke
Re: "No no NO!":

You have stated the exact same thing, obviously.
However, the derivation of "int (*(*vtable)[])()" you give, already assumes that you know the types, because you already start with "***vtable" not just the identifier.

So your derivation is correct, but gives no "rule" to follow when approaching that declaration. Of course, in your head you already know the type or part of it (you start by stating that there are three levels of indirection...).

I fail to see how the given advice is "insane". It's an easy recipe for reading type declarations; if you will, it's a natural language description of the grammar for those things. And it is pretty useful, or how do you read type decls?

You probably do the very same thing, just without stating it explicitly. Or could you share your exact approach when trying to figure out the exact type of something you've never seen before?
Permalink
Feb 04, 2009
Anonymous
awesomeness *readarticle()
Permalink
Jun 14, 2009
Anonymous
Really Cool Thanks!!!!
Permalink
Jul 27, 2009
Anonymous
It only gets interesting when you start doing pointers to functions taking function pointers as arguments, etc.
Permalink
Oct 10, 2009
Anonymous
How often are such complex declarations used in practical programs?
Permalink
Nov 16, 2009
Anonymous
Good Job Man This Is So Easy Now {Awsomeness}
Permalink
Feb 24, 2010
Anonymous
Truly this more than deserves number 4 on the google search "reading c code", and the first relevant one at that. Brilliantly useful.
Permalink
May 27, 2010
Anonymous
Automate it!

http://www.muquit.com/muquit/software/cdcl/cdcl.html

Enter the declaration, return, and it displays the C expression in English.

Reference:
http://www.antlr.org/wiki/display/CS652/How+To+Read+C+Declarations

1 comment:

Anonymous said...

Love this post - thanks! Since I'm a new programmer, C declarations always confused me, this helped clarify them.

This article helped me as well:

How to read C declarations