Current version

v1.10.4 (stable)

Navigation

Main page
Archived news
Downloads
Documentation
   Capture
   Compiling
   Processing
   Crashes
Features
Filters
Plugin SDK
Knowledge base
Donate
Contact info
Forum
 
Other projects
   Altirra

Search

Archives

01 Dec - 31 Dec 2013
01 Oct - 31 Oct 2013
01 Aug - 31 Aug 2013
01 May - 31 May 2013
01 Mar - 31 Mar 2013
01 Feb - 29 Feb 2013
01 Dec - 31 Dec 2012
01 Nov - 30 Nov 2012
01 Oct - 31 Oct 2012
01 Sep - 30 Sep 2012
01 Aug - 31 Aug 2012
01 June - 30 June 2012
01 May - 31 May 2012
01 Apr - 30 Apr 2012
01 Dec - 31 Dec 2011
01 Nov - 30 Nov 2011
01 Oct - 31 Oct 2011
01 Sep - 30 Sep 2011
01 Aug - 31 Aug 2011
01 Jul - 31 Jul 2011
01 June - 30 June 2011
01 May - 31 May 2011
01 Apr - 30 Apr 2011
01 Mar - 31 Mar 2011
01 Feb - 29 Feb 2011
01 Jan - 31 Jan 2011
01 Dec - 31 Dec 2010
01 Nov - 30 Nov 2010
01 Oct - 31 Oct 2010
01 Sep - 30 Sep 2010
01 Aug - 31 Aug 2010
01 Jul - 31 Jul 2010
01 June - 30 June 2010
01 May - 31 May 2010
01 Apr - 30 Apr 2010
01 Mar - 31 Mar 2010
01 Feb - 29 Feb 2010
01 Jan - 31 Jan 2010
01 Dec - 31 Dec 2009
01 Nov - 30 Nov 2009
01 Oct - 31 Oct 2009
01 Sep - 30 Sep 2009
01 Aug - 31 Aug 2009
01 Jul - 31 Jul 2009
01 June - 30 June 2009
01 May - 31 May 2009
01 Apr - 30 Apr 2009
01 Mar - 31 Mar 2009
01 Feb - 29 Feb 2009
01 Jan - 31 Jan 2009
01 Dec - 31 Dec 2008
01 Nov - 30 Nov 2008
01 Oct - 31 Oct 2008
01 Sep - 30 Sep 2008
01 Aug - 31 Aug 2008
01 Jul - 31 Jul 2008
01 June - 30 June 2008
01 May - 31 May 2008
01 Apr - 30 Apr 2008
01 Mar - 31 Mar 2008
01 Feb - 29 Feb 2008
01 Jan - 31 Jan 2008
01 Dec - 31 Dec 2007
01 Nov - 30 Nov 2007
01 Oct - 31 Oct 2007
01 Sep - 30 Sep 2007
01 Aug - 31 Aug 2007
01 Jul - 31 Jul 2007
01 June - 30 June 2007
01 May - 31 May 2007
01 Apr - 30 Apr 2007
01 Mar - 31 Mar 2007
01 Feb - 29 Feb 2007
01 Jan - 31 Jan 2007
01 Dec - 31 Dec 2006
01 Nov - 30 Nov 2006
01 Oct - 31 Oct 2006
01 Sep - 30 Sep 2006
01 Aug - 31 Aug 2006
01 Jul - 31 Jul 2006
01 June - 30 June 2006
01 May - 31 May 2006
01 Apr - 30 Apr 2006
01 Mar - 31 Mar 2006
01 Feb - 29 Feb 2006
01 Jan - 31 Jan 2006
01 Dec - 31 Dec 2005
01 Nov - 30 Nov 2005
01 Oct - 31 Oct 2005
01 Sep - 30 Sep 2005
01 Aug - 31 Aug 2005
01 Jul - 31 Jul 2005
01 June - 30 June 2005
01 May - 31 May 2005
01 Apr - 30 Apr 2005
01 Mar - 31 Mar 2005
01 Feb - 29 Feb 2005
01 Jan - 31 Jan 2005
01 Dec - 31 Dec 2004
01 Nov - 30 Nov 2004
01 Oct - 31 Oct 2004
01 Sep - 30 Sep 2004
01 Aug - 31 Aug 2004

Stuff

Powered by Pivot  
XML: RSS feed 
XML: Atom feed 

§ Undoing indentation hell

Have you ever looked at a piece of code and seen something like this?

if (...) {
if (...) {
...
} else {
if (...) {
...
} else if (...) {
...
} else {
...
}
}
...

I'm going into style matters again and I know this is going to be contentious, but personally, I find that code is hard to read when it looks like a big tree view. You have to mentally match the braces or highlight them individually to have the editor show the matching, and that's hard to do when there are so many of them. It's even harder when the top-level statements don't even fit on one screen. The worst case I ever saw had something like 12 levels of indenting and was more than 17 pages long... and I use a small font size. Most of the time this happens because a routine snowballs until it grows out of control, and no one has the time/experience/constitution to refactor it. This happens to me as well, and eventually I get annoyed enough that I have to perform a nesting exorcism.

I attack such monstrosities by a series of small, incremental transformations. This increases the chances that the code will actually still work by the time I'm done with it. I should note that these are best done only if you own the code or already have to do major work on it for other reasons; doing transformations like this willy-nilly is a good way to completely scramble history in a source code control system and nuke any chances of your team members successfully merging your changes with theirs.

Comments

Comments posted:


Amazing, except the last point, which I never had to resort to, you describe my coding/refactoring habits to the dot. I applaud you for your taste :-)

Marcel - 04 01 09 - 20:31


It's probably more likely that we just read the same books. :)

Phaeron - 04 01 09 - 20:41


Yes.

Another annoying side effect of deep nesting is the tendency to aggressively shorten variables names to fit long indented lines on the screen. Then the code becomes doubly unreadable. And then combine that with code bases that mandate 8-column tabs (e.g. Linux) ...

Jon (link) - 04 01 09 - 20:53


A bit of an elementary topic for you, isn't it? :)

It seems like some people were instructed, in some CS101 class by a professor with programming habits from the 70s, that it's best to have only one exit path from a function. That's got to be a remnant from compilers and languages long dead. I remember reading some OpenAL code years back, and functions with six error checks looked like this:

if(a()) {
if(b)) {
if(c)) {
if(d)) {
if(e)) {
if(f)) {

... with error returns for each, way down the other side. Functions with a dozen indentation levels were common.

This was years ago, and I've searched for it just as a dismally amusing example of really terrible code, but I guess whoever wrote that code woke up one day and eradicated it from the net--an understandable thing to do--and I havn't been able to find it again.

Glenn Maynard - 04 01 09 - 21:27


(Use your imagination for the indentations, which were kindly stripped from the comment.)

Glenn Maynard - 04 01 09 - 21:28


> A bit of an elementary topic for you, isn't it? :)

It may be an elementary topic, but the code I'm trying to refactor right now definitely isn't. Fixing the deep nesting and multi-page functions is the least of the issues. I basically wrote this entry while postponing actually trying to get this code to work again....

As for the recommendation, from a code generation standpoint, I'd guess that having only one exit is likely to give you fairly efficient code simply because that's the most straightforward thing for a simple code generator to do given the input. In a modern compiler you'll likely get the same basic block graph regardless, so you might as well err on the side of readability. Besides, if you're allocating a bunch of resources, you probably aren't in a performance-critical path.

Phaeron - 04 01 09 - 21:43


@Glenn Maynard: well, the "only one exit path from a function" rule is part of what is called "structured programming", which is indeed taught in a lot of computer science classes because it should allow you, in some cases, to prove that an algorithm is correct. The introductory programming course in C I attended at university included some basic techniques to prove algorithms, like Hoare triples, which relied on the program being "structured".
But from a pratical standpoint, this has very little use and almost always makes the program much less readable.

I really like the list of coding guidelines. :)

Lorenz Cuno Klopfenstein (link) - 04 01 09 - 22:34


That seems suspect to me, as I'd think that any formal representation that could support asymmetric if() statements would also support predication, and once you can support prediction, it's possible to transform multiple exits into a single exit. There are several assembly language execution environments where it is advantageous or required to do such a transformation, like ARM and pixel shaders.

Phaeron - 04 01 09 - 22:57


Maybe the reason many people don't know about "continue" is that it was seen as a big fault in structured programming by many professors.
I remember back when I took classes in college that you may fail an exam only because you used "goto", "continue" or "break" (inside a loop). Well, that's kind of a strong negative reinforcement to that programming behaviour.

sarkw - 05 01 09 - 10:13


In PHP, you can combine multiple if / elseif / else statements into this:

switch (true) {
case cond1: ... break;
case cond2: ... break;
case cond3: ... break;
case cond4: ... break;
default: ...
}

imho, it makes for much more readable code than this:

if (cond1) {
...
} else if (cond2) {
...
} else if (cond3} {
...
} else if (cond4) {
...
} else {
...
}

yawnmoth - 05 01 09 - 14:35


interesting, how many comments are posted for a post about the structuring of control flow blocks ;-) next, try to post a statement about what language is "best" and see if the database survives that.

Tobias Rieper - 05 01 09 - 15:53


Your Idea to use "return" is a step in the right direction, but one that stops to early.

I tend to use loops like this:

while(1) {
if (failing condition) {
// some work
break;
}
// more conditions

// Work to be done when al condtions are meet
break;
}

This has a few advantages:

* it flattens the code
* it does not require a subfunction (like returns do)
* early bail out jumps behind the loop
* business logic is central at the end of the loop
* and as a bonus, all variables defined inside the loop are gone after the loop

--tf

Thomas Fanslau - 06 01 09 - 09:06


I also think that is better write brackets in this way:
if()
{
}
else
{
}

rather than this:
if(){
} else {
}

It make easier to match the brackets.

ale5000 - 06 01 09 - 12:47


Ok ale5000 usually I use the same for easy matching, just save one line putting the else after the previous closing bracket:
if()
{
} else
{
}

isidroco - 06 01 09 - 15:17


@Thomas Fanslau:
I've used that myself, although I use a do/while(false) instead of while+break. I don't use it too often because it can be confusing to have a loop that doesn't loop. I do often use infinite loops and if+break in the loop instead of trying to cram too much in the looping condition, though.

@ale5000:
This was supposed to just be about control flow, but I KNEW someone was eventually going to drag bracing style into this discussion. I prefer K&R myself, but I can write code comfortably in both styles. Most of these transformations work equally well either way, although some of them don't work as well if you enforce braces around single controlled statements -- which I'm not in favor of for just a break or continue statement.

@isidroco:
I think you may be unique with that style....

Phaeron - 07 01 09 - 01:38


> interesting, how many comments are posted for a post about the structuring of control flow blocks

More people can talk about code structure than optimization. :)

> I remember back when I took classes in college that you may fail an exam only because you used "goto", "continue" or "break" (inside a loop). Well, that's kind of a strong negative reinforcement to that programming behaviour.

In a compiler class, I had a professor tell the class that the appropriate behavior for a parser, if it receives unexpected input, is to just ignore it and move on. No, I'm not making this up.

Incompetent professors are common.

> * it does not require a subfunction (like returns do)

In C++, it's usually better to have cleanup stuff handled with destructors:

int fd = -1;
do { fd = fopen(); if(fd == -1) break; ... ] while(0);
if(fd != -1) close(fd);

is better off as an object:
file f;
if(!f.open()) return;
...

It's not only clearer--you also don't have to worry about cleanup on exception.

That said, I use the fake-loop idiom often; it's much better than goto for error cleanup. The reason people say "goto is evil" (even if people don't often seem to consider why) is that goto is poorly-scoped; you might arrive there from anywhere in the function and you have to scour a function to understand control flow, and this avoids that problem.

Glenn Maynard - 07 01 09 - 04:01


This is where ruby shines.

x = "some string"

case x
when /str/
puts "yup contains the chars str"
when 'abc','def','ghi'
puts "Yup the string has three chars, being of the allowed solution `"+x+"`"
else
puts "Nope, does not contain it."
end

granted, the above code example is trivial and stupid, but i only want to state that case/when structures can be MUCH more terse

mark - 07 01 09 - 12:24


i said above:
"interesting, how many comments are posted for a post about the structuring of control flow blocks ;-) next, try to post a statement about what language is "best" and see if the database survives that.
Tobias Rieper - 05 01 09 - 15:53"

and i was right!

so in order to keep the discussion alive: clearly, C# is superior to c++!

;-)

Tobias Rieper - 07 01 09 - 15:59


The primary reason to have one exit point for a function is that when debugging, you can set a single break point on that exit line instead of having to trace and debug multiple exit paths from the function.

I think the argument is that if you need multiple exit points in a function, then you're doing too much in that one function and it needs to be factored some more.

krick (link) - 08 01 09 - 15:56


Instead of:

HDC hdc = GetDC(hwnd);
if (hdc) {
....HDC hdc2 = GetCompatibleDC(hdc);
....if (hdc2) {
........HBITMAP hbm = CreateBitmap(...);
........if (hbm) {
............CODE BLOCK
............DeleteObject(hbm);
........}
........DeleteDC(hdc2);
....}
....DeleteDC(hdc);
}

How about the D scope statement:

HDC hdc = GetDC(hwnd);
if (!hdc) return;
scope(exit) DeleteDC(hdc);

HDC hdc2 = GetCompatibleDC(hdc);
if (!hdc2) return;
scope(exit) DeleteDC(hdc2);

HBITMAP hbm = CreateBitmap(...);
if (!hbm) return;
scope(exit) DeleteObject(hbm);

...CODE BLOCK...


This is exception safe (unlike the nested if and the goto cleanup) and allows you to put the clean up code right next to the start up code, unlike C++ RAII and the others. It also doesn't require creating a wrapper class for each type just to execute a single statement, like C++ RAII does.

glengarry - 08 01 09 - 22:42


Ew. scope() looks too much like INTERCAL's COME FROM.

Phaeron - 08 01 09 - 23:40


> The primary reason to have one exit point for a function is that when debugging, you can set a single break point on that exit line instead of having to trace and debug multiple exit paths from the function.

Eh, just use your debugger's "step out" function, which will also stop if an exception is thrown (which a breakpoint at the end of a function won't do). I doubt that's where the "one exit path" notion comes from.

Nice. gdb "finish" doesn't stop if an exception unwinds the frame you're finishing. That's a bug (VS's debugger does this correctly). Leads to funny things:

void foo(int x) { if(!x) throw 1; }
main() { for(int i = 0; i < 2; ++i) { try { foo(i); } catch(...) { } } }

... step into the first foo() call, do a "finish", and you'll come out with i = 1, having finished an entirely different invocation of the function you tried to step out of. Debugging with a buggy debugger is not much fun.

> I think the argument is that if you need multiple exit points in a function, then you're doing too much in that one function and it needs to be factored some more.

A function needs to be refactored if it's too big to be defined and understood clearly. The style at the top of the original post is proof that counting exit points isn't an indicator of this.

Glenn Maynard - 09 01 09 - 05:50


If like use modularity...

think about it

if (test1()){
// lot of expressions
if (test2()){
// More expressions
// WTH!!! What make this code
}
}
else {
if (test3()){
// The depuration in this time is terrible...
}
}

transform it on

if (test1()){
take_easy_block(); // Because I always forgot comment the code, this help some
// And I can jump this function when I debug, great, all this block is perfec, so continue...
}
else{
make_something_block();
}

....

void take_easy_block(){
double i;
if (!test2()){
return; // Using the Java source code format, use {} to all...
}
// More expressions
// Yea, I can jump this block, on debug stack
}

void make_something_block(){
// Great, I can use the same variable names to differents things...
char i;
if (!test3()){
return; // Again using the Java source code format
}
}

Great, you have the ciclomatyc level low, and you can forgot comment the code, if you use very descriptive function names.

Cesar (link) - 07 02 09 - 00:04


There is a downside to hoisting out code like that -- it can obscure the flow of code since the function call makes control flow non-linear in the source. It can also make code maintenance more difficult when it isn't clear which functions can be used in which context, or more precisely, whether they're ever called in more than one. If a fragment is very simple, I prefer keeping it inline.

Phaeron - 07 02 09 - 00:23


use layouts... and every layout with own source code files... :D

First Layout, command_play();
Second layout, avi_ini_to_play(); avi_begin_play(); avi_end_play();
Thirt layout, directx_begin_avi(); directx_refresh_output_screen(); more_dirty_work();

or, change coffee provider...

Cesar (link) - 07 02 09 - 01:05


(!test()) ? handle_failure()
:(!test1()) ? handle_failure_1()
:(!test2()) ? handle_failure_2()
:handle_success()

ekanna - 01 06 12 - 06:42


Don't know why it's been completely omitted here, but multiple ifs (be it nested or in serial) is not usually a good idea (unless you're doing something freaky where you need to do extreme optimizing).

Instead of having screenfuls of if/else/continue structures, split the algorithms to functions/methods and, by doing so, give a descriptive names for the individual parts. That way the code becomes significantly easier to read and you don't even need to add comments if the function names describe what the function does.

You'll end up with a domain specific language that in best cases also helps the coders to talk to customers with correct terminology.

This pattern has been discussed in great detail in Clean Code by Robert C. Martin, which -- in case you haven't already done so -- is something I think anyone interested in code craftmanship should read.

Jorma - 01 06 12 - 07:11


Single-exit can be handled tidily using goto (stop freaking out; think of it as an in-function exception).

resource1 = NULL
resource2 = NULL
result = FAIL;

if (!get_resource(&resource1))
goto out
if (!get_resource(&resource2))
goto out

... do stuff ...
result = SUCCESS

if (resource1)
free(resource1)
if (resource2)
free(resource2)

return result

This approach has a couple of beneficial side-effects; it encourages separation between allocation / work / teardown, it emphasises the acquire/release model, and in the case where the function interacts with locks it also provides a path that lets you unlock on the exit side.

Mike - 01 06 12 - 08:52


The goto cleanup strategy can be improved upon by using multiple goto labels, like here: http://vilimpoc.org/research/raii-in-c/

Christoffer S. (link) - 01 06 12 - 09:09


Interesting - this is exactly what I've come up with on my own. I'm a big fan of 'continue' and 'break' in a loop.

One exception - I'm a committed "Horstmann style" indenter. I cannot grok braces that are not aligned.

Pierre Clouthier (link) - 01 06 12 - 09:45


I tend to avoid early returns from functions in favour of the style expressed with removing cleanup pyramids. I do this typically because I often run across cases where each early return needs to also free resources that were allocated on the way to it, so it plays together nicely.

It also doesn't hurt that from what (relatively little I admit) I have seen of the code that gets spat out of compilers, every function _does_ only have one exit point, so in keeping the C/C++ like this it reminds you of what the chip will actually end up doing, which is always a good thing IMHO.

In fact, some code I have seen (windows kernel internals) reuses common cleanup blocks across multiple functions. So you have a few functions that do not end in a 'ret', but rather a 'jmp' off to the cleanup block. I assume this is done so the chip can keep more code in it's icache?

Dave

thehelix (link) - 01 06 12 - 10:17


Since the actual logic / function of these different source code formatting choices are strictly equivalent, then the development environment should provide a View As... menu where the coder could switch back and forth between different preferences. No human should need to manually reformat source code. There should be an underlying pseudo code and a mature calculas that defines permitted transformations (much like hidden optimizations in many compilers). Ultimately, the View As... system should permit display of the source code in any formatting style you like.

JeffyPooh - 02 06 12 - 05:54

Comment form


Please keep comments on-topic for this entry. If you have unrelated comments about VirtualDub, the forum is a better place to post them.
Name:  
Remember personal info?

Email (Optional):
Your email address is only revealed to the blog owner and is not shown to the public.
URL (Optional):
Comment: /

An authentication dialog may appear when you click Post Comment. Simply type in "post" as the user and "now" as the password. I have had to do this to stop automated comment spam.



Small print: All html tags except <b> and <i> will be removed from your comment. You can make links by just typing the url or mail-address.