§ Being on both sides of the bug database

Bug databases are fun. As a software developer, it's where you get to hear all of the ways in which you were stoned and the code you checked in last night was utter crap. If you're lucky, it just causes a minor problem, you check in a fix, quality assurance (QA) reports it as good, and you're done. If you're unlucky, you get the mystery bug from hell that everyone encounters on a daily basis but for which no one can identify the true cause. As no software is bug-free, handling bugs can seem Sisyphean at times, the list of bugs continuously growing almost as fast as you can shrink it.

Submitting bugs on someone else's product is an interesting experience, because in that case I'm on the other side of the fence: the tester instead of the developer. It's a bit humbling to have your bug kicked back as By Design because you missed something obvious in the UI. You learn to be a bit friendlier to testers when you have to fight to get a bug fixed the same way that they do. It also quickly shows that the worst kind of tester you can have is another developer, because they keep wondering why you can't fix the bug that seems so blindingly obvious to them (even though they don't have the source code).

Reproducing a bug

One of the major prerequisites for fixing a bug is getting it to happen locally. Occasionally you get a bug that is so blindingly obvious that you don't even need to run the program to find broken code, but that's rare, and even then that might not be enough because the code bug you found might not actually be the one you're looking for. Besides, if you can't reproduce the bug, it's hard to tell if you've actually fixed it. I can't stress enough that finding a solid reproduction case greatly increases the chances of a bug getting fixed. At a minimum, it takes two runs to fix a bug: one to reproduce the bug, and a second to verify that it doesn't happen with the fix. It'll usually take a lot more: to ensure that the bug doesn't happen randomly, because the first few fixes don't work, or to better verify the one that does. This is much more likely to go smoothly if the bug happens every time and in thirty seconds than if it only occurs 10% of the time and after three hours.

The best reproduction steps are those that are really easy and really quick. For instance, let's say VirtualDub crashes if you hit 'x' in the main window. Chances are I can fix this bug in five minutes, because I can get it to happen really quickly, there isn't a lot of code involved, and I can very quickly verify that it doesn't happen any more after I've fixed it. If I get a bug report that a crash happens about every tenth run of a 100GB file, that's bad news, because not only does it take forever to test, but odds are I can't create the same 100GB file, and I'd have to repeat the test about fifty times to be sure it didn't happen anymore. I rarely ever fix a bug on such a description; usually I have to go back and forth with the user to narrow down the cause, determine a smaller and faster set of repro steps, or meditate on the crash report and hope that it includes additional clues to narrow down the faulty code path.
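
To put a rough number on "about fifty times": the sketch below assumes each run triggers the bug independently with a fixed probability, which real crashes don't necessarily obey. The function names are made up for illustration. It computes how likely an unfixed one-in-ten bug is to stay hidden through a series of clean runs, and how many clean runs it takes to push that chance below a chosen threshold.

```python
import math


def false_pass_probability(failure_rate: float, runs: int) -> float:
    """Chance that an unfixed bug stays hidden through `runs` clean runs.

    Assumes every run fails independently with probability `failure_rate`.
    """
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, max_false_pass: float) -> int:
    """Smallest number of clean runs that leaves at most `max_false_pass`
    chance of an unfixed bug slipping through unnoticed."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < max_false_pass < 1.0:
        raise ValueError("max_false_pass must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_pass) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # A bug that shows up on one run in ten.
    print(runs_needed(0.10, 0.01))
    print(false_pass_probability(0.10, 50))
```

Under those assumptions, fifty clean runs leave roughly a half-percent chance that the bug is still there and simply didn't show up, so fifty is a sensible ballpark. If each run takes three hours, that is also about six days of testing.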

Involving third-party code in a repro case is a bad idea. It makes the bug harder to test, because you have to get the third-party component and figure out how to use it, and it's not yours so you usually can't debug it, modify it, or look at the source code. Sure, it might be open source, but even then it's not necessarily code that you're familiar with or that you can change (especially if it comes with the OS). Worst of all, third-party components can have bugs; vendors write and ship buggy code like everyone else does. Some crashes, most notably resource leaks or memory trashing problems, can be extremely difficult to track to the culprit, even with a debugger. Thus, whenever I report or try to reproduce a bug, I always attempt to exclude third-party components from the repro case whenever possible. That way, nobody is wondering whether the bug is in the main program or in XYZ.DLL; it's undoubtedly your fault.

Incidentally, a sure way to piss off a developer is to report that a bug occurs 100% of the time and then admit later that you only saw it happen once and didn't try to reproduce the problem. Knowing whether a bug consistently occurs or not is valuable information, because it can indicate a random or timing-sensitive condition that may help isolate the location of the buggy code. Also, this generally means that there is irrelevant information in the bug report that can mislead, such as saying to press Q/R/T keys in a specific order when the bug occurs whether you do that or not.

Finally, if you know that only specific versions of a program have a particular problem, that can be critical information. For instance, say a user knows that only versions of VirtualDub between 1.5.6 and 1.5.8 trash video with a particular filter. That is very valuable information, because I keep source code for all versions of VirtualDub in a Perforce depot and can immediately isolate the problem to changes that went into those versions. The narrower the range, the fewer diffs in the suspect list. If that is not enough to find the bad code, I can start with the 1.5.5 code and begin introducing diffs until the problem occurs. This can greatly shorten the time to fix. People looking at bugs I've submitted at the Microsoft Product Feedback Center might wonder why I indicate whether Visual C++ 6.0 and Visual Studio .NET 2003 also have a bug that I report on Visual Studio 2005; this is the reason.
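
The search through a suspect range can be sketched as follows. Note that this swaps in a binary search rather than the one-diff-at-a-time approach described above, and it assumes the changes are ordered, the first entry is known good, the last is known bad, and the bug stays present once introduced. The change labels and the `is_bad` callback are hypothetical stand-ins for "build this revision and run the repro steps".

```python
from typing import Callable, Sequence


def first_bad_version(versions: Sequence[str],
                      is_bad: Callable[[str], bool]) -> str:
    """Return the earliest entry in `versions` that shows the bug.

    Assumes `versions` is ordered oldest to newest, versions[0] is known
    good, versions[-1] is known bad, and the bug never disappears again
    once it has been introduced.
    """
    if len(versions) < 2:
        raise ValueError("need at least one good and one bad version")
    good, bad = 0, len(versions) - 1  # indices of known-good / known-bad
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(versions[mid]):
            bad = mid    # bug already present: culprit is at mid or earlier
        else:
            good = mid   # still clean: culprit is after mid
    return versions[bad]


if __name__ == "__main__":
    # Hypothetical change labels between two releases.
    changes = ["1.5.5", "change-a", "change-b", "change-c", "change-d", "1.5.6"]
    broken_from = changes.index("change-c")
    culprit = first_bad_version(
        changes, lambda v: changes.index(v) >= broken_from)
    print(culprit)
```

Each test halves the suspect list, so a range of a hundred changes needs about seven builds instead of up to a hundred. The narrower the reported version range, the fewer builds it takes.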

Forwarding bugs

Ah, shifting the blame. Always fun to do, especially if you supply comments along with the forward. Sure, you could send the bug back to the testers as just "I can't reproduce," but why not push the stake in farther and include comments to the effect of "we think the problem may have something to do with your testing methodology" or "we would like you to reconfirm that this is happening"? Or even better yet, forward it to some unsuspecting peer whose list is too short for his own good. The best is when you can put on the asbestos suits and experience a full-fledged war of "it's a bug - it's not a bug" between engineering and QA.

Or not.

Although there is a lot of variation in the way different bug databases tend to classify bug status, there are also a lot of common themes. There are New bugs, bugs that have been Reproduced, and bugs that Couldn't be Reproduced. Then there are bugs for which new code has been checked in and are Probably Fixed, bugs that have been Fixed, and then bugs that still Aren't Fixed even with the new code. And then, there are the dreaded bugs that Can't be Fixed or Won't be Fixed in time for ship, because it's too risky. And finally, for the ultimate slap-from-a-white-glove, there's Not a Bug.

It can be very tempting to think that a tester is simply smoking hemp and punt the bug back as doesn't-happen, but if you're the submitter of a bug, this feels like you've been called a liar to your face. Machine configurations and usage patterns are varied enough that usually both sides are right: it always happens to the tester, and it never happens for the developer. Where conflicts can really occur is when development admits to a bad bug but says it won't be fixed in time, which can lead to QA massing a campaign to reverse that decision.

The tags used to forward bugs can contribute to this. Personally, I think that "Won't Fix" was a bad forwarding option to use on the Microsoft Product Feedback Center, because it implies that development doesn't want to fix a problem, whereas "Can't Fix" would indicate that there are reasons why the bug can't be fixed, such as someone depending on the bug. If I could set up a bug database, I would want the following set of tags:

Such tags would be clear and unambiguous in their meaning.

Yes, this is a bug, but I won't fix it. Nyaa-nyaa!

Q: If you can verify that a bug occurs and have a confirmed fix ready to go, why wouldn't you fix the bug?
A: Because you don't know what else would break.

A regression occurs when a change is made to a code base that introduces a new bug or causes a previously fixed one to reappear. Regressions are bad news, because they not only mean you aren't making progress toward a less buggy release, but you actually might be going backwards. Often these happen because of oversights when writing the fix, but sometimes they also happen indirectly: the fix might be correct, but it might change conditions such that another dormant bug pops up much more often. The more frequently regressions occur, the harder it is to fix bugs. A gnarly code base with lots of hidden and unexpected connections between code is more prone to accidental regressions, so regression rate can be an indicator of how bad a code base is.

Amusingly, regressions can also occur because two bugs cancel each other out. For historical reasons, VirtualDub's filter system works with bitmaps that are stored upside-down, because that is the memory order for scanlines in a Windows device independent bitmap (DIB). Well, some filters ignore this and process the bitmaps as if they were stored right-side-up. This works because there are two oppositional bugs: the bitmap is read backwards, but it is also written backwards. There is no harm in this case, but let's say one of the filters had a mode that tried to put a mark in the upper-left corner, and due to the flip put it in the lower-left instead. If someone were to try to "fix" this by making the filter read the bitmap correctly, the output would flip and be wrong. The person would have to find the other bug in the output code to make a correct fix. Of course, the other alternative would be to make the marking code wrong too, which would fix the bug but add another land mine onto the pile. Yes, this is lame, but it can happen if you're up against a shipping deadline.
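
A toy model of the cancelling bugs, not VirtualDub's actual filter code: the bitmap is a list of rows stored bottom-up, the "filter" just inverts pixel values, and the `MARK` value, function names and flags are invented for illustration. Skipping the flip on both read and write reproduces the original pair of bugs; flipping only on read is the naive half-fix.

```python
MARK = -1  # stand-in pixel value for the corner mark


def to_display(stored_rows):
    """Bottom-up storage (DIB order) to top-to-bottom display order."""
    return [row[:] for row in reversed(stored_rows)]


def run_filter(stored_rows, flip_on_read, flip_on_write, mark=False):
    """Invert every pixel and optionally mark what the filter believes
    is the upper-left corner.

    flip_on_read=False  models the first bug: storage is treated as top-down.
    flip_on_write=False models the second bug: output is written the same way.
    """
    rows = [row[:] for row in stored_rows]
    if flip_on_read:
        rows.reverse()              # now really top-down
    out = [[255 - p for p in row] for row in rows]
    if mark:
        out[0][0] = MARK            # "upper-left" as the filter sees it
    if flip_on_write:
        out.reverse()               # back to bottom-up storage
    return out


if __name__ == "__main__":
    stored = [[10, 20],   # bottom scanline
              [30, 40]]   # top scanline
    both_bugs = run_filter(stored, False, False, mark=True)
    half_fixed = run_filter(stored, True, False, mark=True)
    fully_fixed = run_filter(stored, True, True, mark=True)
    for name, result in [("both bugs", both_bugs),
                         ("half fixed", half_fixed),
                         ("fully fixed", fully_fixed)]:
        print(name, to_display(result))
```

With both bugs in place, the picture comes out right and only the mark lands in the lower-left. Fixing just the read flips the whole picture, which is the regression. Only fixing both ends puts the image and the mark where they belong.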

What this means is that fixing a bug isn't limited to testing the portion of code that was buggy. It also requires general testing of the area in question to ensure that a bug hasn't cropped up anywhere else. This means that the bottleneck in fixing bugs isn't always the software developers; it can also be testing. If the bug happens to be in low-level code, such as a file I/O layer, it's possible that the entire program may have to be re-tested. Considering that the number of possible code paths can increase exponentially with the number of variables (particularly configuration options), it shouldn't be a surprise that regressions can and often do occur. The closer a program is to shipping, the more reluctant everyone has to be to accept a fix, because it may make the program less stable. Sucks, doesn't it?
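
To make the exponential growth concrete, here is a small sketch that counts and enumerates the combinations of independent settings. The option names and values are hypothetical, not a list of real VirtualDub settings.

```python
import itertools
import math


def count_configurations(options):
    """Number of distinct combinations of independent options."""
    return math.prod(len(values) for values in options.values())


def all_configurations(options):
    """Yield every combination as a dict mapping option name to value."""
    names = list(options)
    for combo in itertools.product(*(options[name] for name in names)):
        yield dict(zip(names, combo))


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    options = {
        "color_depth": [16, 24, 32],
        "audio": ["none", "direct", "full"],
        "preview": [False, True],
        "threads": [1, 2, 4],
    }
    print(count_configurations(options))          # 3 * 3 * 2 * 3 = 54
    print(next(all_configurations(options)))
    # Twenty independent on/off switches already give over a million cases.
    print(count_configurations({f"flag{i}": [False, True] for i in range(20)}))
```

Four modest settings already give 54 combinations, and every additional on/off switch doubles the total. Nobody tests all of them, which is exactly why a low-level fix late in the cycle is scary.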

There are some bugs, of course, that will always be fixed regardless of regression risks. If the next build of a program starts crashing on startup on all machines, it's hard to imagine how a fix could make the situation any worse.

You might say that the reluctance to fix bugs is due to having rigid ship dates, and that open-source development doesn't have this constraint: the program ships when it's ready. In some ways, that's true. However, open-source software can suffer from the opposite problem, where the lack of ship dates means that there are no stabilization phases, and thus the project has a constant flow of bugs as well as bug fixes. Larger projects that are well-managed tend to have milestones, alpha and beta periods, and check-in procedures to address this problem. If you're a small team, though, or even a single developer, it can be hard to justify having strict beta periods and parallel stable/unstable branches. It's really a question of development and release processes, not an open source / closed source question.

Finally, you might ask if there is a way to prove that code is bug-free, instead of hoping that it is and relying on testing to catch the failures. It isn't possible in the fully general case, but with appropriate restrictions on coding techniques it is possible to formally prove a program correct. Doing so can very quickly become prohibitively expensive, though: verifying a simple thread synchronization algorithm can involve a 30+ node sequence graph. The cost can be justified if you're writing control software for a missile, but not for most consumer-level software. The software industry will have to move toward more proactive strategies for avoiding bugs as software becomes more complex, but it isn't feasible to formally prove real-world applications right now.

Comments

Comments posted:


Avery Said:

"A regression occurs when a change is made to a code base that introduces a new bug or causes a previously fixed one to reappear. Regressions are bad news, because they not only mean you aren't making progress toward a less buggy release, but you actually might be going backwards. Often these happen because of oversights when writing the fix, but sometimes they also happen indirectly: the fix might be correct, but it might change conditions such that another dormant bug pops up much more often. The more frequently regressions occur, the harder it is to fix bugs. A gnarly code base with lots of hidden and unexpected connections between code is more prone to accidental regressions, so regression rate can be an indicator of how bad a code base is."

I recall reading (details below) about problems Microsoft experienced with a release of Word (I think) for Mac. The project spun completely out of control, deteriorating to the point where fixing one bug would create ten new ones. The term 'INFINITE DEFECTS' was coined to describe the situation. The code base ultimately had to be scrapped and the whole project rewritten from scratch.

There are a lot of interesting stories about how Microsoft does stuff in "Microsoft Secrets" by Cusumano and Selby - ISBN 0 00 638778 0 - well worth a read.

Calvin.

CalvinOZ - 23 12 05 - 03:03


Low Level Bugs - Is that the real reason why Bilbo Baggins once said in "The Hobbit": "Never leave home without a good rope"?
;)

Murmel - 23 12 05 - 12:12


The processes of unit testing, functional testing, regression testing, and automated running of these tests greatly help reduce the incidence of regressions. I think the entire industry is moving towards test driven processes to tackle this very problem.

For example: if you make it a rule that you have to write a script that can reproduce a bug before you fix that bug, then you can always run that script on every successive build, to make sure the bug hasn't come back. Re-factoring an existing codebase to support scripted testing is sometimes a lot of work, but them's probably the ropes if you want to climb to a higher level of development process.

Jon - 01 01 06 - 15:11


Oh, definitely. The best benefit IMO of introducing unit testing is not so much the decrease in bugs, but the increased confidence that a fix can be made without creating new ones.

There are, however, some classes of bugs for which feasible unit tests cannot be made. Thread synchronization issues often cannot, and should not, be reliably verified by a test harness, although they can sometimes be identified. They should always be worked through by proof instead.

Phaeron - 01 01 06 - 18:27


Vdub crash when using Smart rendering.
xvidvfw division by zero. Sh_T!

version 1.8.6 does not crash.
1.9.5, 1.9.8, 1.9.9 and 1.9.10 crash. why?

Snaps
http://img258.imageshack.us/img258/741/c..
CrashFileS
http://www.sendspace.com/file/8zhfyi
http://www.sendspace.com/file/5yb7hh
!

BuGGeRs - 05 11 10 - 04:51
