Sunday, November 22, 2009

Dealing with Compiler Errors

Dealing with Compiler Errors - Surviving the Compilation Process

It's your first C (or C++) program--it's not that long, and you're about to compile it. You hit compile (or enter the build command) and wait. Your compiler spits out fifty lines of text. You pick out words like "warning" and "error". Does that mean it worked? you wonder. You look for the resulting executable. Nothing. Damn, you think, I guess I have to figure out what this all means...

The Types of Compilation Errors

First, let's distinguish between the types of errors. Most compilers will give three types of compile-time alerts: compiler warnings, compiler errors, and linker errors.

Although you don't want to ignore them, compiler warnings aren't severe enough to actually keep your program from compiling. Usually, compiler warnings are an indication that something might go wrong at runtime. How can the compiler know this at all? You might be making a typical mistake that the compiler knows about. A common example is using the assignment operator ('=') instead of the equality operator ('==') inside an if statement. Your compiler may also warn you about using variables that haven't been initialized and other similar mistakes. Generally, you can set the warning level of your compiler--I like to keep it at its highest level so that my compiler warnings don't turn into bugs in the running program ('runtime bugs').
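As a rough sketch of the assignment-versus-equality mistake, consider the two functions below. The names isZeroBuggy and isZero are made up for illustration. Both compile, but the first one never returns true. With GCC or Clang, building with the -Wall option should produce a warning about the assignment being used as a condition, and adding -Werror turns that warning into an error.

```cpp
// Hypothetical example: both function names are invented for illustration.

// BUG: '=' assigns 0 to x, and the value of that assignment (0) is what
// gets tested, so the condition is always false.
bool isZeroBuggy(int x)
{
    if (x = 0)
        return true;
    return false;
}

// Fixed: '==' compares x with 0 and leaves x unchanged.
bool isZero(int x)
{
    if (x == 0)
        return true;
    return false;
}
```

Calling isZeroBuggy(0) returns false even though the argument is zero, which is exactly the kind of runtime bug the warning is trying to prevent.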

Nevertheless, compiler warnings aren't going to stop you from getting your program working (unless you tell your compiler to treat warnings as errors), so they're probably a bit less frustrating than errors. Errors are conditions that prevent the compiler from completing the compilation of your files. Compiler errors are restricted to single source code files and are most often the result of 'syntax errors'. What this really means is that you've done something that the compiler cannot understand. For instance, the statement "for(;)" isn't correct syntax because the header of a for loop always needs two semicolons separating its three parts (even when those parts are empty). The compiler was expecting a second semicolon before the closing parenthesis, so the error message you get might be something like "line 53, unexpected parenthesis ')'". Note, also, that compiler errors will almost always include a line number at which the error was detected.
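To make the for loop rule concrete, here is a minimal sketch of a loop whose header has all three parts empty. The function name countHalvings is invented for this example. The point is only that both semicolons must still be present.

```cpp
// Hypothetical helper, named here only for illustration.
// Counts how many times n can be halved before it reaches zero.
int countHalvings(int n)
{
    int steps = 0;
    // All three parts of the header are empty, but both semicolons are
    // still required. Writing "for (;)" here would be a syntax error.
    for (;;)
    {
        if (n <= 0)
            break;
        n /= 2;
        ++steps;
    }
    return steps;
}
```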

Even if you make it through the compilation process successfully, you may run into linker errors. Linker errors, unlike compiler errors, have nothing to do with incorrect syntax. Instead, linker errors are usually problems with finding the definitions for functions, member functions of structs or classes, or global variables that were declared, but never actually defined, in a source code file. Generally, these errors will be of the form "could not find definition for X"; GNU toolchains, for instance, phrase this as "undefined reference to X".
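The difference between declaring and defining can be sketched as follows. The function names square and sumOfSquares are hypothetical. The declaration alone is enough to satisfy the compiler, while the definition is what the linker needs to find.

```cpp
// Declaration only: tells the compiler that square exists somewhere.
int square(int value);

// Code that uses square compiles with just the declaration in view.
int sumOfSquares(int a, int b)
{
    return square(a) + square(b);
}

// Definition: this is what the linker looks for. Delete it (and do not
// provide it in any other object file) and the file still compiles, but
// linking fails with a message along the lines of
// "undefined reference to square(int)".
int square(int value)
{
    return value * value;
}
```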

Usually, the compilation process will begin with a series of compiler errors and warnings and, once you've fixed all of them, you'll then be faced with any linker errors. In turn, I'll first cover dealing with compiler errors and then with linker errors.

Compiler Errors - Where do you start?

If you're faced with a list of fifty or sixty error and warning messages, it can be daunting to even try to figure out where to start. The best place, though, is at the beginning--as in, the beginning of the list. In fact, you should almost never try to fix errors by working from the end of the list back to the beginning, for one simple reason: you don't know if the later ones are actually errors!

A single error near the top of your program can cause a cascade of other compiler errors because those lines might rely on something early in the program that the compiler couldn't understand. For instance, if you declare a variable with improper syntax, the compiler will complain about that syntax error and also that it cannot find a declaration for the variable. Leaving off a semicolon in the wrong place can result in an astonishing number of errors. Things like this can happen because C and C++ syntax allows for things like declaring a variable of a type immediately after the type definition:

struct
{
        int x;
        int y;
} myStruct;

This would create a variable, myStruct, with room to store a struct containing two integers. Unfortunately, this means that if you leave off a semicolon, the compiler will interpret it as though the next thing in the program is intended to be a struct (or return a struct). Something like this

struct MyStructType
{
        int x;
        int y;
}

int foo()
{}

can result in a surprising number of errors (possibly including a complaint about an extraneous "int" being ignored). All this for a single character! It's best to start at the top.
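For comparison, here is a corrected version of that program. The only change that matters for the syntax error is the semicolon after the closing brace of the struct. The return statement in foo is an addition for this sketch, since a function declared to return int should return a value.

```cpp
struct MyStructType
{
    int x;
    int y;
};  // this semicolon ends the struct definition

int foo()
{
    return 0;  // added for this sketch; the original body was empty
}
```

With the semicolon in place, the compiler reads "int foo()" as the start of a new function rather than as a continuation of the struct declaration.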

Dissecting an Error Message

Most messages from the compiler will consist of at least four things: the type of message (warning or error), the source code file in which the error appeared, the line of the error, and a brief description of what was wrong. Output from g++ for the above program might look something like this (your results with other compilers may vary):

foo.cc:7: error: semicolon missing after struct declaration

foo.cc is the name of the file. 7 is the line number in question, and it is clear that this is an error. The brief message here is quite helpful because it says exactly what was wrong. Notice, however, that the message makes sense only in the context of the program. It doesn't say which struct was missing a semicolon.

More cryptic was another error message from the same compilation attempt: "extraneous 'int' ignored". It's up to the programmer to figure out exactly why it was extraneous. Notice again that this was an error caused by a problem earlier in the program, not on line 8, but earlier, when the struct lacked a semicolon terminator. Fortunately, it's pretty clear that the function definition for foo was OK; this tells us that the error must have been caused somewhere else in the program. In fact, it had to be earlier in the program--you won't get an error message that indicates a syntax error prior to the line on which the error actually occurred.

This brings up another guiding principle of hunting down compiler errors: when in doubt, look earlier in the program. Since syntax errors can have mysterious repercussions later, it's possible that the compiler is giving a line number that doesn't actually have a syntax error! Worse, many times, the compiler won't be as friendly in telling you exactly what happened earlier in the program. Even the first compiler error you get might be due to something several lines before the indicated line.

Handling Cryptic or Bizarre Messages

There are several types of compiler errors that are especially frustrating. The first is the case of an undeclared variable that you swear you declared. Oftentimes, you can actually point out exactly where the variable was declared! The problem is often that the variable is simply misspelled. Unfortunately, this can be very hard to see since the mind typically reads what it expects rather than what is actually there. Worse, there are other reasons why this could be a problem too--scoping issues, for instance!

To sort through the possible problems, one trick I like to use is to go to the line of the supposedly undeclared variable and have my text editor perform a search for the word under the cursor (alternatively, you could copy the variable name and perform a search); this guarantees that if I spelled it incorrectly, it will not find a match for my search. This also keeps me from having to type the word, which could result in my correctly spelling the variable name.
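Both causes of an "undeclared variable" complaint, a misspelling and a scoping issue, can be sketched in a few lines. The function and variable names are invented for this example, and the two offending lines are left as comments so that the code still compiles.

```cpp
// Hypothetical example; the names are invented for illustration.
int sumUpTo(int limit)
{
    int total = 0;
    for (int index = 0; index <= limit; ++index)
    {
        total += index;
    }
    // return totl;   // misspelled: the compiler reports totl as undeclared
    // return index;  // out of scope: index exists only inside the for loop
    return total;
}
```

Searching for the word under the cursor on the "totl" line would find no other match, which is the quickest hint that the name was misspelled rather than never declared.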

A second cryptic message is the "unexpected end of file". What's going on here? Why would the end of the file be "unexpected"? Well, the key here is to think like the compiler; if the end of the file is unexpected, then it must be that it's waiting for something. What could it be waiting for? The answer is usually "closure". For instance, closing curly braces or closing quotes. A good text editor that performs syntax highlighting and automatic indentation should help fix some of these issues by making it easier to spot problems when writing code.

Ultimately, when a message is cryptic, the way to approach the problem is to think about how the compiler is trying to interpret the file. This can be hard when you're just starting out, but if you pay attention to the messages and try to pick out what they could mean, you'll quickly get used to the general patterns.

Finally, if nothing else works, you can always just rewrite a few lines of code to clear out any hidden syntax errors that might be hard for the eye to catch. This can be dangerous if you don't end up rewriting the right section of code, but it can be helpful.

Linker Errors

Once you've finally cleaned up all those frustrating syntax errors, taken a nap, had a meal or two, and mentally prepared yourself for the program to build correctly, you may still need to deal with linker errors. These can often be more frustrating because they aren't necessarily the result of something written in your program. I'll briefly cover some of the typical types of linker errors you can expect and some of the ways to fix them.

You may have issues with how you set up your compiler. For instance, even if you include the correct header files for all of your functions, you still need to provide your linker with the correct path to the library that has the actual implementation. Otherwise, you will get "undefined function" error messages. Be aware that your compiler might not actually support these functions at all (you might not notice this if you included your own declaration of a function to get around a compile-time error). If your compiler should support the function, then fixing this problem usually requires compiler-specific settings. You'll generally want to look for how to tell the compiler where to look for libraries and make sure that the libraries were actually installed correctly.

Linker errors can also come about in functions that you have declared and defined if you fail to include all of the necessary object files in the linking process. For example, if you write your class definition in myClass.cc, and your main function is in myMain.cc, your compiler will create two object files, myClass.o and myMain.o, and the linker will need both of them to finish the creation of the new program. If you leave out myClass.o, then the linker will not have the definitions of the class's member functions even if you correctly included myClass.h!
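The layout described above might look roughly like this. The three files are shown in one listing for brevity, the member function doubled and the helper useMyClass are hypothetical, and the g++ commands in the comments are one assumed way of building the files separately.

```cpp
// Sketch of three files shown in one listing; file names follow the text.
// Assumed build commands with g++:
//   g++ -c myClass.cc                      (produces myClass.o)
//   g++ -c myMain.cc                       (produces myMain.o)
//   g++ myMain.o myClass.o -o myProgram    (links both object files)
// Leaving myClass.o out of the last command is what triggers the error.

// ---- myClass.h: declares the class (enough for the compiler) ----
class MyClass
{
public:
    int doubled(int value) const;
};

// ---- myClass.cc: defines the member function (needed by the linker) ----
// #include "myClass.h"
int MyClass::doubled(int value) const
{
    return value * 2;
}

// ---- myMain.cc: uses the class through the header only ----
// #include "myClass.h"
int useMyClass()
{
    MyClass object;
    return object.doubled(21);
}
// In the real myMain.cc, main() would call useMyClass().
```

Each source file compiles on its own because the header supplies the declaration. Only the final link step needs the compiled body of MyClass::doubled, and that body lives in myClass.o.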

A sometimes subtle error is when the linker complains about there being more than one definition for a class, function, or variable. This issue can come up in one of several ways: first, there might actually be two definitions of an object--for instance, two global variables both declared as external variables to be accessible outside of the source code file. This is a legitimate concern for both functions and variables, and it definitely can happen. On the other hand, sometimes the problem is with the directives to the linker; on more than one occasion, I've seen people include multiple copies of the same object file in the linking process. And bingo, you've got multiple definitions. A typical giveaway for this problem is that a whole host of functions have multiple definitions.
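A common way to avoid the first kind of duplicate is to keep a single definition in one source file and use only declarations everywhere else. The sketch below assumes two hypothetical files, counter.h and counter.cc, shown together in one listing.

```cpp
// ---- counter.h (hypothetical): declarations, safe to include anywhere ----
extern int sharedCounter;
int nextCount();

// ---- counter.cc (hypothetical): the one and only definition of each ----
int sharedCounter = 0;

int nextCount()
{
    return ++sharedCounter;
}

// If a second source file also contained "int sharedCounter = 0;", each
// file would compile on its own, but the linker would report a multiple
// definition of sharedCounter.
```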

The last bizarre type of linker error is a complaint about an "undefined reference to main". This particular linker error differs from the others in that it may have nothing to do with including object files or having the correct paths to your libraries. Instead, it means that the linker tried to create an executable and couldn't figure out where the main() function was located. This can happen if you forget to include the main function at all, or if you attempt to compile code that was never meant to be a stand-alone executable (for instance, if you tried to compile a library).

Related articles

Learn more about dealing with compiler warnings: Compiler warnings can indicate future bugs!

Compiling and Linking: A brief description of the compiling and linking process

The Static Keyword: Covers the static keyword and how it can change the accessibility of global variables

Using Namespaces: Learn how namespaces can hide function and variable declarations
