Sunday, November 22, 2009

Debuggers and Debugging Techniques

Writing code is only the beginning of a programming project. Once the original implementation is complete, it is time to test the program. Unfortunately, only the rare (and usually non-priority) code project is completed without a single bug. Hence, debugging takes on great importance: the earlier you find an error, the less it will cost to fix. A major bug found before distribution is much, much cheaper to fix than a major bug found by thousands of your users.

Programmers have three general techniques for finding bugs, all of which revolve around getting a good idea of how the program is actually behaving between interactions with the user. The first is to intersperse printf or cout statements throughout the code to output the values stored in variables before and after the suspected location of the bug. This approach works for smaller programs and can help isolate the bug between one output statement and the next. Unfortunately, it scales poorly to longer programs: finding a single bug may require many output statements, and the longer the program, the more likely it is that one of them will be left in for the end user to find. (Adding code while you are hunting bugs also tends to increase the number of bugs you later have to deal with, such as when you wonder why a seemingly random number is being printed to the screen without warning, only to discover that it was one of your own debugging outputs.)

A second, more structured approach is to include a command-line debug switch that makes the program output debugging information at various points during execution. (In the code, you can write if (debug) { /* output statements */ }.) Because this is a runtime argument, one benefit of this approach is that end users can enable the debugging switch and send you the output. Another advantage is that by structuring the debugging information in your code, you can easily use it whenever you want to know what is going on. The disadvantages are that you add complexity to your code and that you have to remember to insert these statements wherever it might be useful to know the values of certain variables. A lazy or tired programmer may leave out statements at crucial points and be forced to add them later. It can also be tedious to update the debugging output when the program changes, which can defeat the purpose of adding it in the first place.

Along the lines of this second approach is an even more sophisticated technique, which is to allow certain debug options to be set at run time, meaning that information about only specific variables may be output, rather than an entire snapshot of the program's state. You can do this by having a command-line switch such as -Dabc, where a, b, and c correspond to different sets of variables to be output while debugging. The benefit of this approach is that it makes managing the information easier, but the cost is that it makes maintaining the code even more difficult than the less sophisticated version of this technique.

For either of the two previous methods, it's important to note that the location of the printouts in the source code is as important as the values printed; if you don't know where the error occurs, you can't do much to fix it. It helps to put markers in your printouts, such as "inside function X." You should also be aware that if a program using buffered I/O crashes, your printouts might get stuck in the buffer. Therefore, flush the buffer after each printf or cout; otherwise the program may appear never to reach the problem line, when in fact the output was still sitting in the buffer at the moment that line crashed the program.

The third technique is simply to use a debugger. A debugger lets the programmer interact with a running program: setting breakpoints to halt execution where it would be useful to check the values of variables, stepping through the code line by line, and testing the effect of executing statements in the current program's environment. (For instance, if you were testing a program and wanted to see whether a function would return the correct value at some instant, you could use the debugger to run that function, even if it isn't the next line to be executed in the program.)

For more information about using a debugger, check out the GDB debugger tutorial, which works through examples in GDB.

Also see the Visual Studio debugging tutorial series:

Debugging with Visual Studio, Part 1: Debugging Concepts

Debugging with Visual Studio, Part 2: Setting up the IDE

Debugging with Visual Studio, Part 3: Using Breakpoints Effectively

Debugging with Visual Studio, Part 4: Setting up Code for the Debugger

Debugging with Visual Studio, Part 5: Using Trace and Log Messages

Debugging with Visual Studio, Part 6: Remote Debugging

5 Awesome Visual Studio Debugger Features

Help prevent small problems from becoming big headaches: bug prevention, debugging strategies, tips, and gotchas.

To learn more about hunting segmentation faults and pointer errors, check out this tutorial.

Another excellent tool for tracking down memory problems is Valgrind, which helps find memory leaks and invalid memory usage.

If you use Visual Studio and you'd like to learn how to keep your debugger from stepping into common functions, check out Skip Stepping Into Functions with Visual Studio's NoStepInto Option.
