This article is part of a series on working with bug reports.
Following the call stack... or not?
In many cases the cause of a problem is obvious: you follow the call stack in the bug report and it leads you to the buggy code. All that is left is to analyze the variables on those lines, check assumptions and conditions, and figure out how to fix it.
However, the call stack does not always point to the problem. This is especially true for the Access Violation exception. Suppose, for example, that you have a memory-corruption bug in your application. The buggy code may run without any visible problem and silently corrupt some other memory. The actual exception is raised later, when completely unrelated code is executed. Now you have an Access Violation and a call stack that points to innocent code, which has nothing to do with the original problem.
So analyze the situation carefully, and do not blame unrelated code just because it appears in the call stack.
Some types of bug reports cannot point to the problem by definition. Memory problems (leaks and corruption) are an example. The memory manager simply cannot scan the entire memory pool for problems after every machine instruction in your application. What’s more, it cannot scan the entire pool on every request to the memory manager either: every application uses a lot of memory, so scanning all of it on every call to the memory manager would reduce performance to near zero.
That’s why the memory manager usually checks only the blocks that are directly related to the current operation. For example, when we ask it to free memory, the memory manager checks only that block for corruption. Note: it checks only that block, not all allocated blocks. Another example: we ask for memory. The memory manager goes through the available “free” blocks and picks a suitable one. Before returning it to us, the memory manager scans it to check that no one has written into the block while it was free. See how to solve memory problems.
That is why memory problems are reported later, and not at the moment when they actually occurred.
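Another example is a memory leak: it is typically reported only when the application exits, long after the code that allocated the memory and failed to free it has run.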
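To make the delayed detection concrete, here is a minimal sketch of a debugging wrapper around malloc and free. It is not how any particular memory manager is implemented; the names dbg_alloc, dbg_block_ok, dbg_free and corruption_reports, as well as the guard size and guard value, are assumptions made up for this illustration. The wrapper places guard bytes after each block and verifies them only when that particular block is freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical debug wrapper around malloc/free. Each block is laid out as
   [size_t size][user data ...][GUARD_LEN guard bytes].
   The user pointer is aligned for size_t only, which is enough for a sketch. */
#define GUARD_LEN  8
#define GUARD_BYTE 0xFD

static int corruption_reports = 0;  /* how many damaged blocks were detected */

void *dbg_alloc(size_t size)
{
    unsigned char *raw = malloc(sizeof(size_t) + size + GUARD_LEN);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof(size_t));                          /* remember the user size */
    memset(raw + sizeof(size_t) + size, GUARD_BYTE, GUARD_LEN);  /* trailing guard bytes */
    return raw + sizeof(size_t);
}

/* Returns 1 if the guard bytes after this one block are intact, 0 otherwise. */
int dbg_block_ok(const void *ptr)
{
    const unsigned char *user = ptr;
    size_t size;
    memcpy(&size, user - sizeof(size_t), sizeof(size_t));
    for (size_t i = 0; i < GUARD_LEN; i++)
        if (user[size + i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Checks only the block being freed; other live blocks are not scanned. */
void dbg_free(void *ptr)
{
    assert(ptr != NULL);
    if (!dbg_block_ok(ptr))
        corruption_reports++;   /* a real memory manager would raise an error here */
    free((unsigned char *)ptr - sizeof(size_t));
}
```

If the application writes 17 bytes into a 16-byte block obtained from dbg_alloc, nothing happens at the moment of the write, and freeing any other block does not reveal the damage either. The counter changes only when the damaged block itself is passed to dbg_free, and by then the call stack points to the code that frees the block, not to the code that overwrote it.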
Using other information in the bug report
Apart from the call stack, there may be other information available in the bug report. You can extract hints about the problem from this auxiliary information. The most important pieces are the CPU registers and memory dumps. Sometimes you can recover the values of variables from them. Of course, this requires knowledge of assembler, but it can be very handy.
Actually, memory dumps can be useful even for an inexperienced programmer. For example, the dump of a memory leak can tell you what data was there at run-time. Or you can see, say, a string’s data instead of an object’s layout, so you can trace where this string is used and find the source of the memory corruption.
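If you have never looked at a memory dump, the following sketch shows the usual layout: raw bytes in hexadecimal on the left and the same bytes as text on the right. The function name hex_dump and the exact column layout are assumptions for this illustration; the dump in your bug report may be formatted differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints `len` bytes starting at `data` as rows of
   16 bytes, with an offset, the hex values and the printable characters. */
void hex_dump(FILE *out, const void *data, size_t len)
{
    const unsigned char *bytes = data;
    assert(out != NULL);
    for (size_t row = 0; row < len; row += 16) {
        fprintf(out, "%04zx  ", row);              /* offset of the row */
        for (size_t i = 0; i < 16; i++) {
            if (row + i < len)
                fprintf(out, "%02X ", bytes[row + i]);
            else
                fprintf(out, "   ");               /* pad a short last row */
        }
        fputc(' ', out);
        for (size_t i = 0; i < 16 && row + i < len; i++) {
            unsigned char c = bytes[row + i];
            fputc(isprint(c) ? c : '.', out);      /* non-printable bytes become dots */
        }
        fputc('\n', out);
    }
}
```

Text stored in memory stands out in the right-hand column even if you cannot read the hex values, which is often enough to recognize a file name, a message or a piece of user input and to search your code for the place that produces it.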
Unfortunately, it is really difficult to give general advice here: almost every case requires an individual approach. Once you have a guess about what the situation in your application could have been at run-time, the additional information in the report can help you check that guess.
Here is an example. You get a very strange access violation at a seemingly ordinary place: a function call. You notice that your application was running on Windows 2000 (this is a field in the bug report), and you know that you have not tested your application on this system. So you dig a little deeper and discover that the function you call does not exist in Windows 2000, and this is the reason for the problem (say, you imported the function via GetProcAddress, but did not check for errors).
You can add extra information to your bug reports via logging.