Saturday, November 13, 2010

Debugging Is More Than “Making the Bug Go Away”

Ask an inexperienced programmer to define debugging, and they might answer that it is “finding a fix.” In fact, that is only one of several goals, and not even the most important of them.

Effective debugging requires that we take these steps:
1. Work out why the software is behaving unexpectedly.

2. Fix the problem.

3. Avoid breaking anything else.

4. Maintain or improve the overall quality (readability, architecture, test coverage, performance, and so on) of the code.

5. Ensure that the same problem does not occur elsewhere and cannot occur again.

Of these, by far the most important is the first—identifying the root cause of the problem is the cornerstone upon which everything else depends.

Understanding Is Everything
Inexperienced developers (and sometimes, unfortunately, those of us who should know better) often skip diagnosis altogether. Instead, they immediately implement what they think might be a fix. If they’re lucky, it won’t work, and all they will have done is waste their time. The real danger comes if it works, or seems to work, because now they’ve made a change to the source that they don’t really understand. It might fix the bug, but there is a real chance that it only masks the true underlying cause. Worse, there is a good chance that this kind of change will introduce regressions, breaking something that used to work correctly.
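
To make the difference between masking a bug and fixing its cause concrete, here is a minimal sketch in Python. The scenario is invented for illustration and does not come from the book: a hypothetical function totals a comma-separated list of prices and crashes when the input ends with a trailing comma. The names total_masked and total_diagnosed are assumptions made up for this example.

```python
def total_masked(raw: str) -> float:
    """Guess-fix: swallow the exception so the crash "goes away"."""
    try:
        return sum(float(field) for field in raw.split(","))
    except ValueError:
        # The symptom is gone, but every price on the line is silently lost.
        return 0.0


def total_diagnosed(raw: str) -> float:
    """Fix based on diagnosis: a trailing comma produces an empty field,
    and float("") raises ValueError. Skip empty fields only."""
    fields = [field.strip() for field in raw.split(",")]
    return sum(float(field) for field in fields if field)


if __name__ == "__main__":
    line = "1.50,2.25,"
    print(total_masked(line))     # 0.0  (no crash, wrong answer)
    print(total_diagnosed(line))  # 3.75
```

Both versions stop the crash, but only the second was written with an understanding of why the crash happened. It also still raises ValueError for genuinely malformed input such as "1.50,abc", whereas the first version would hide that too.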

Wasted Time and Effort
Some years ago, I found myself working in a team containing a number of very experienced and talented developers. Most of their experience was with UNIX, but when I joined the team, they were in the late stages of porting the software to Windows. One of the bugs found during the port was a performance issue when running many threads simultaneously. Some threads were being starved, while others were running just fine.

Given that everything worked just fine under UNIX, the team concluded that the problem was clearly broken threading in Windows, so the decision was made to implement a custom thread scheduling system rather than use the one provided by the operating system. This would be a lot of work, obviously, but quite within the capabilities of a team of this caliber.

I joined the team when they were some way into the implementation, and sure enough, threads were no longer suffering from starvation. But thread scheduling is subtle, and they were still working through a number of issues that had been caused by the change (not least of which was that the changes had slowed the whole system down somewhat).

I was intrigued by this bug, because I’d previously experienced no problems with Windows’ threading. A little investigation demonstrated that the performance issue was caused by the fact that Windows implements a dynamic thread priority boost. The bug could be fixed by disabling this with a single line of code (a call to SetThreadPriorityBoost()).
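
For readers who have not met this API, the following is a rough sketch of what such a call can look like. The original project was almost certainly not written in Python; ctypes is used here only to keep the example short, and the wrapper name disable_priority_boost_for_current_thread is an assumption made up for illustration. SetThreadPriorityBoost and GetCurrentThread are real Win32 functions in kernel32; passing TRUE as the second argument disables the dynamic boost for that thread.

```python
import ctypes
import sys


def disable_priority_boost_for_current_thread() -> bool:
    """Ask Windows not to boost the calling thread's priority dynamically.

    Returns True if the call succeeded, and False if it failed or if the
    code is not running on Windows (where the API does not exist).
    """
    if sys.platform != "win32":
        return False

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    # Declare signatures explicitly so handles are not truncated on 64-bit.
    kernel32.GetCurrentThread.argtypes = []
    kernel32.GetCurrentThread.restype = wintypes.HANDLE
    kernel32.SetThreadPriorityBoost.argtypes = [wintypes.HANDLE, wintypes.BOOL]
    kernel32.SetThreadPriorityBoost.restype = wintypes.BOOL

    # GetCurrentThread returns a pseudo-handle that needs no closing.
    thread = kernel32.GetCurrentThread()

    # Second argument is bDisablePriorityBoost: True turns the boost off.
    return bool(kernel32.SetThreadPriorityBoost(thread, True))


if __name__ == "__main__":
    print(disable_priority_boost_for_current_thread())
```

The call affects only the thread whose handle is passed in, so in a multithreaded program each worker would need to make it for itself (or the process-wide SetProcessPriorityBoost could be used instead). Whether disabling the boost is the right fix depends on the diagnosis, which is the point of the story.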

The moral? The team had decided that Windows’ threads were broken without really investigating the behavior they were seeing. In part, this might have been a cultural issue—Windows doesn’t have a good reputation among UNIX hackers. Nevertheless, if they had taken the time to identify the root cause, they would have saved themselves a great deal of work and avoided introducing complications that made the system both less efficient and more error-prone.

Without first understanding the true root cause of the bug, we are outside the realms of software engineering and delving instead into voodoo programming or programming by coincidence.

Source of Information: Paul Butcher, Debug It!: Find, Repair, and Prevent Bugs in Your Code