Writing custom visualizers for Visual Studio 2005

The native debugger in Visual Studio has long had an underadvertised feature called autoexp.dat, which is a file in the Packages\Debugger folder that allows you to control several aspects of the debugger. Among the things you can control in autoexp.dat are: the string that is displayed for types in the variable panes, which functions the debugger will skip when stepping, and string names for COM HRESULTs. The first is the most interesting and useful, but unfortunately it doesn't support complex expressions. If you try to access more than one field in an expression:

MyType=size=<m_last - m_first, i>

...the debugger simply displays ??? instead. You can get around these limitations by writing an expression evaluator plugin, which can read any process memory, but in the end you're still limited to outputting only a single short string.

In Visual Studio 2005, a powerful feature has been added to autoexp.dat in the form of the [Visualizer] section. This section too contains mappings from types to display form, but these have a whole language for evaluating the object -- and unlike the regular [AutoExpand] templates, you can also affect the contents when the object is expanded. This means you can actually view the contents of an STL set, not just see raw red/black tree nodes.

Now, the [Visualizer] section is undocumented, and there's a comment at the top of the section that says DO NOT MODIFY. For those of you who are new to Windows programming, that means "edit at will." The problem is then deciphering the visualizer language.

P.S. Back up autoexp.dat before proceeding.

Basic structure

A visualizer's basic structure is as follows:

typename[|typename...] {
    preview (
        preview-string-expression
    )
    stringview (
        text-visualizer-expression
    )
    children (
        expanded-contents-expression
    )
}

All three of these expressions are optional.

Like the AutoExpand templates, it is possible to match templates by using type<*> syntax, and also possible to do partial template matches, i.e. Foo<Bar, *>. The template parameters will show up as $T1, $T2, etc. You can also supply multiple types using Foo|Bar syntax.

autoexp.dat is reloaded every time the debugger starts. It is not necessary to restart Visual Studio to see changes; restarting the debugger, or detaching and reattaching to the target, is sufficient to effect changes.

The preview expression

The preview expression consists of a single expression to display in the one-line preview of a variable in Watch, QuickWatch, or the Command Window. This can either be a literal string or an expression, which is surrounded in brackets if it contains a formatting specifier. Unlike AutoExpand, however, the expression must dereference the current object using the $c variable. Since a single expression isn't usually useful, you will want to use #( and ) to bracket a comma-delimited list of strings and expressions.

Example:

// code
template<class T> struct MyType {
    T *start, *end;
};
// autoexp.dat
MyType<*> {
    preview (
        #(
            "ptr=", (void *)$c.start, ", size=", [$c.end-$c.start, i]
        )
    )
}
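
If it helps to see what that preview is computing, here is a rough C++ equivalent that builds the same string by hand. The helpers elementCount and previewOf are made up for illustration; nothing like them exists in the debugger, and the exact pointer formatting will differ from what the watch window shows.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

template<class T> struct MyType {
    T *start, *end;
};

// Hypothetical helper: the value that [$c.end-$c.start, i] displays.
template<class T>
std::ptrdiff_t elementCount(const MyType<T>& c) {
    assert(c.start <= c.end);
    return c.end - c.start;   // pointer subtraction counts elements, not bytes
}

// Hypothetical helper: the pieces of the #( ... ) list, concatenated in order.
template<class T>
std::string previewOf(const MyType<T>& c) {
    std::ostringstream out;
    out << "ptr=" << static_cast<const void *>(c.start)
        << ", size=" << elementCount(c);
    return out.str();
}
```

For a MyType<int> wrapping a five-element buffer, previewOf returns "ptr=" followed by the buffer address and ", size=5".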

The parser isn't very robust and you'll often get weird results if you make mistakes. So don't do that. Another downer with regard to expressions here is that the sizeof() keyword doesn't work, which is unfortunate for template visualizers. Finally, colons, parens, and brackets within string and character literals can confuse the parser and give you inexplicable mismatch failures.

Displaying children

Now, for the real meat of the visualizer: showing children.

The children block also expects either a single item or a #() delimited list of items; when present, the object will have a [+] next to it and the expanded view will show the contents produced by this block. Each expression will show up as ascending indices, i.e. [0], [1], [2], etc. You can also name specific elements using name : expr syntax; the rules for the name are a little looser than C identifiers, with brackets in particular being accepted. The order of named fields is irrelevant; the contents are always sorted.

Example:

MyType<*> {
    children (
        #(
            ptr: (void *)$c.start,
            [size]: [$c.end-$c.start, i]
        )
    )
}

The ability of a visualizer to construct hierarchies in the watch window is limited. For the most part, you can only add items immediately below the item being visualized; the only ways available to add expandable objects are either to reference an aggregate or other object with a visualizer, or to use an array expression, i.e. [$c.start, 4x].

Note that the contents of the children block will replace the struct fields that would normally be displayed. It's a good idea to leave yourself an escape hatch by including a field with the value [$c,!], whose expanded view re-evaluates the current object without any visualizers or AutoExpand expressions.

Arrays

Being able to plop new fields in is neat, but the above doesn't allow you to view the contents of the data structure, which is hidden between those two pointers. Enter the #array statement:

MyType<*> {
    children (
        #(
            [raw members]: [$c,!],
            [size]: [$c.end-$c.start, i],
            #array (
                expr: $c.start[$i],
                size: $c.end - $c.start
            )
        )
    )
}

Here's what it looks like in the debugger:

[Custom vector in debugger]

The #array statement looks like it evaluates an array. Actually, it doesn't -- all it does is count the $i variable up from 0 to size-1, which you can then use in the expr field. This is actually better, because you can use it to evaluate chunked or triangular arrays. You don't even have to evaluate fields at all; you could generate a table with it if you were sufficiently bored. Any fields generated from #array are unnamed and will appear as ascending indices.
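
To make the chunked case concrete, here is a C++ sketch of the index arithmetic an expr would need to do. ChunkedArray and its fields are invented for this example, and the visualizer expression quoted in the comment is an untested guess at how the same arithmetic would be written in autoexp.dat.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container: fixed-size chunks reached through a table of pointers.
struct ChunkedArray {
    int **chunks;           // chunks[k] points at chunkSize ints
    std::size_t chunkSize;  // elements per chunk
    std::size_t size;       // total element count
};

// Element for flat index i, as #array would request it for $i = 0 .. size-1.
// The matching expr would be something along the lines of
//   $c.chunks[$i / $c.chunkSize][$i % $c.chunkSize]
// (an assumption, not verified against the debugger).
inline int elementAt(const ChunkedArray& c, std::size_t i) {
    assert(c.chunkSize > 0 && i < c.size);
    return c.chunks[i / c.chunkSize][i % c.chunkSize];
}
```

The point is only that $i is a plain counter: any mapping from a flat index to an element that you can write as one expression will do.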

It is also possible to customize the value display for the array, by supplying an expression after the #array statement. The $e variable is set to the value produced by expr for that element. For instance, to display the address of each element in the data structure:

MyType<*> {
    children (
        #(
            [size]: [$c.end-$c.start, i],
            #array (
                expr: $c.start[$i],
                size: $c.end - $c.start
            ) : &$e
        )
    )
}

The bummer about doing this is that any other fields that you include will bump the indices produced by the array. I haven't found a way around this, yet. If you wrap the value expression in a nested #() it is possible to override the name for each element, although the name will be the same for all elements. Also, if you supply multiple items per element, all of them will show up. If they're unnamed they'll get successive indices, so the first element will produce [0] and [1], the second [2] and [3], etc. One use for this is to display a packed array of 4-bit fields.
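
Here is what the packed 4-bit case works out to, sketched in C++. The function unpackNibbles is hypothetical, and the low-nibble-first order is an assumption; swap the two push_back lines if your data packs the other way. Each byte produces two values, the same way a value expression with two unnamed items produces two rows per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unpack 4-bit fields, low nibble first (the nibble order is an assumption).
// A value expression doing the same job might look roughly like
//   #( $e & 0xf, ($e >> 4) & 0xf )
// which is untested and shown only to connect the two notations.
inline std::vector<unsigned> unpackNibbles(const unsigned char *bytes, std::size_t count) {
    assert(bytes != 0 || count == 0);
    std::vector<unsigned> fields;
    fields.reserve(count * 2);
    for (std::size_t i = 0; i < count; ++i) {  // i plays the role of $i
        unsigned e = bytes[i];                  // e plays the role of $e
        fields.push_back(e & 0xfu);             // would be shown as [2*i]
        fields.push_back((e >> 4) & 0xfu);      // would be shown as [2*i + 1]
    }
    return fields;
}
```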

There are two additional options for #array: the base and rank fields. The base field allows you to change the starting array index. It doesn't change $i, but it does offset the array indices shown. (This only works if you don't override the display for $e, in which case single incrementing indices are always used.) The rank field allows you to do regular multi-dimensional arrays; when it is present and evaluates to greater than one, the size and base expressions are evaluated once per dimension with $r indicating the zero-based dimension, and expr is then evaluated (product of all sizes) times with $i counting up contiguously. Thus:

// code
struct MyArray2D {
    int *start;
    int innerSize;
    int outerSize;
};

// autoexp.dat
MyArray2D {
    children (
        #array (
            expr: $c.start[$i],
            rank: 2,
            size: ($r==1)*$c.outerSize+($r==0)*$c.innerSize,
            base: 1
        )
    )
}

Sadly, the order of indices is reversed in the names produced, i.e. [1,1], [2,1], [3,1], etc. If you don't mind the wonked ordering, you could fix this by doing some division and modulus on the index.
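
The division-and-modulus fix is easier to see written out. This C++ sketch is my reading of the behavior described above, not something lifted from the debugger: labelFor assumes dimension 0 varies fastest (which matches the [1,1], [2,1], [3,1] sequence), and transposedIndex assumes you swap the two sizes so that dimension 0 is outerSize and then read start[transposedIndex($i)] instead of start[$i]. Both function names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Label shown for flat index i when dimension 0 has dim0Size entries
// and varies fastest. base is the value of the base: field.
inline std::pair<std::size_t, std::size_t>
labelFor(std::size_t i, std::size_t dim0Size, std::size_t base) {
    assert(dim0Size > 0);
    return std::make_pair(i % dim0Size + base, i / dim0Size + base);
}

// With dimension 0 set to outerSize, label [a,b] means row a, column b.
// This returns the row-major offset of that element, so the names come
// out right while the display order becomes column-major.
inline std::size_t transposedIndex(std::size_t i, std::size_t innerSize, std::size_t outerSize) {
    assert(innerSize > 0 && outerSize > 0 && i < innerSize * outerSize);
    std::size_t row = i % outerSize;  // first label component, minus base
    std::size_t col = i / outerSize;  // second label component, minus base
    return row * innerSize + col;
}
```

With innerSize 3 and outerSize 2, the transposed offsets run 0, 3, 1, 4, 2, 5: correct names, odd order.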

Conditionals

The #if() statement allows you to conditionally include expressions in your visualizer:

MyType<*> {
    preview (
        #if (($c.end - $c.start) > 0) (
            #("size=", $c.end - $c.start)
        ) #else (
            "empty"
        )
    )
}

There are also #else and #elif() statements to do more complex if statements; these are analogous to the C equivalents.

The #switch() statement lets you... well, switch. #case and #default are present, except that there are no fall-throughs and no break. You can do some nice hacks using a #switch:

MyType<*> {
    children (
        #switch($c.start[0].flags & 0xf000)
        #case 0x0000 ( #( freed: $c.start[0].value ) )
        #case 0x2000 ( #( alloc: $c.start[0].value ) )
        #case 0x3000 ( #( active: $c.start[0].value ) )
        #default ( #( huh: $c.start[0].value ) )
    )
}

Two things to note here. One, the #switch statement doesn't have a scope around its #case statements. Two, it doesn't work within an #array. If you try it, devenv will hang at 100% CPU. For those cases, use #if instead.
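
For comparison, this is the same selection written as ordinary C++. The function stateName is hypothetical; it returns the child name the #switch above would pick for a given flags word. Every case returns immediately, which is the C++ way of spelling "no fall-through".

```cpp
#include <cassert>
#include <string>

// Child name chosen for a flags word, mirroring the #switch example.
// Only the top four bits of the low 16 take part in the comparison.
inline std::string stateName(unsigned flags) {
    switch (flags & 0xf000u) {
        case 0x0000u: return "freed";
        case 0x2000u: return "alloc";
        case 0x3000u: return "active";
        default:      return "huh";   // anything else, including 0x1000
    }
}
```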

Lists

Arrays are fine, but sometimes you want to use a linked list. Well, the visualizer supports that:

// cpp file
struct MyNode {
    MyNode *prev, *next;
    int value;
};
struct MyList {
    MyNode *head;
};
// autoexp.dat
MyList {
    children (
        #list (
            head: $c.head,
            next: next
        )
    )
}

The visualizer will then proceed to follow the singly-linked list and place each element in the output. Note that the next portion denotes a field name in the child node and not an expression. Like #array, you can use #list() : <expr> syntax to change the way each $e is displayed.

#list is protected against infinite traversals and will cope gracefully with a circular list. Also, you can use a skip: expression to denote a sentinel node that should not be reported. Although the name implies that the node will be skipped, it actually causes traversal to stop, so if your sentinel node is first you should start traversal after it.

You can also supply a size: expression to limit the number of elements displayed.
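
Putting head:, next:, skip: and size: together, the traversal behaves roughly like the C++ sketch below. walkList is a made-up function, and the repeat check is only one plausible way to survive a circular list; how the debugger actually protects itself isn't documented.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

struct MyNode {
    MyNode *prev, *next;
    int value;
};

// Collect values the way the #list description reads: start at head, follow
// next, and stop at a null pointer, at the skip (sentinel) node, after
// maxCount nodes, or when a node repeats. The repeat check is an assumption.
inline std::vector<int> walkList(const MyNode *head, const MyNode *skip, std::size_t maxCount) {
    std::vector<int> values;
    std::set<const MyNode *> seen;
    for (const MyNode *n = head; n != 0 && n != skip; n = n->next) {
        if (values.size() >= maxCount || !seen.insert(n).second)
            break;
        values.push_back(n->value);
    }
    return values;
}
```

Note that passing the sentinel itself as head yields nothing, which is why traversal has to start at the node after it.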

Trees

Ah, now for the really evil stuff: trees.

struct TreeNode {
    TreeNode *left, *right;
    int value;
};
struct TreeTest {
    TreeNode *head;
    int size;
};
TreeTest {
    children
    (
        #(
            #tree (
                head : $c.head,
                left : left,
                right : right,
                size : $c.size
            ) : &$e
        )
    )
}

In this case, Visual Studio trawls the tree using in-order traversal: that is, left, then current, then right. As with #list, size: can be an expression limiting the node count, skip: can avoid sentinels, and a value expression is possible.
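
In case in-order traversal is hazy, here it is as a few lines of C++. walkInOrder is a hypothetical function that visits nodes in the order just described and stops collecting once it has maxCount values, which is the job size: does in the visualizer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct TreeNode {
    TreeNode *left, *right;
    int value;
};

// In-order walk: left subtree, then the node itself, then the right subtree.
// Collection stops once maxCount values have been gathered.
inline void walkInOrder(const TreeNode *node, std::size_t maxCount, std::vector<int>& out) {
    if (node == 0 || out.size() >= maxCount)
        return;
    walkInOrder(node->left, maxCount, out);
    if (out.size() < maxCount)
        out.push_back(node->value);
    walkInOrder(node->right, maxCount, out);
}
```

For a binary search tree this produces the values in sorted order, which is why an STL set shows up sorted.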

There is a nasty bug with the #list and #tree statements you should be aware of: they do not work if the nodes exist in static storage in a module. The traversal will work OK, but the $e variable will point to an invalid offset instead of the actual item, and you won't be able to see any of the elements. I got burned by this when I tried to write a visualizer to dump the list of critical sections in the process, which turns out to reside in a static block in ntdll.dll, and I don't have a workaround yet. This bug does not affect the #array primitive.

The string view expression

stringview is an odd one: it sets the string that is displayed in the Text, XML, or HTML visualizer. This is what shows up when you click the little magnifying glass on the right side of some entries, like strings. Sadly, this is not terribly useful as all of the text visualizers are modal, but otherwise, it has the same syntax as the regular preview block.

StringTest {
    stringview ( "<HTML><FONT color=red>HTML output!!</FONT>" )
}

The usefulness of this is unfortunately damped by two limitations: you can't escape HTML symbols in strings you output, and you can't put literal strings in the value expressions evaluated by, say, #array or #list, as they evaluate to NULL. You can't generate an XML or HTML report from a data structure. It's really only useful if you already have HTML or XML somewhere in memory.

Real examples

One common container that you might write yourself, and want to view, is a hash table: a nice hash table, consisting of a prime number of buckets, and a linked list at each one. Good luck finding anything in a hash table that has 500 buckets, though, so it's perfect for a visualizer. Unfortunately, if you look at the visualizer for stdext::hash_set, you'll discover that the VC guys cheated -- the Dinkumware STL has a single linked list binding all nodes in the hash_set, and the visualizer just trawls it with a #list. We, on the other hand, have an array of singly linked lists of elements.

Can we crawl such a structure? You bet!

struct Node {
    Node *next;
    int value;
};
struct HashSet {
    Node **buckets;
    int bucketCount;
};
// autoexp.dat
HashSet {
    children (
        #(
            #array (
                expr : ($c.buckets)[$i],
                size : $c.bucketCount
            ) : #(
                #list (
                    head : $e,
                    next : next
                ) : $e.value
            )
        )
    )
}
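
The nesting reads more naturally as two loops. This C++ sketch walks the same structure in the order the visualizer above would list it; flatten is a hypothetical function, with the outer loop standing in for #array and the inner one for #list.

```cpp
#include <cassert>
#include <vector>

struct Node {
    Node *next;
    int value;
};
struct HashSet {
    Node **buckets;
    int bucketCount;
};

// Outer loop = #array over the buckets; inner loop = #list over one chain.
// Empty buckets (null heads) contribute nothing to the output.
inline std::vector<int> flatten(const HashSet& set) {
    assert(set.bucketCount >= 0);
    std::vector<int> values;
    for (int i = 0; i < set.bucketCount; ++i)                      // $i
        for (const Node *e = set.buckets[i]; e != 0; e = e->next)  // $e
            values.push_back(e->value);
    return values;
}
```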

Another tricky example: say you have an array of pointers, but want to skip the entries that are NULL. Can we do it? Sure!

struct SparseArray {
    void **start, **end;
};
// autoexp.dat
SparseArray {
    children(
        #array (
            expr: $c.start[$i],
            size: $c.end - $c.start
        ) : #if ($e != 0) (
            $e
        )
    )
}

And here it is in the debugger:

[Visual Studio debugger crash]

Ehh... hmmm. Well, as it turns out, doing an asymmetric #if within an #array causes the debugger to blow up. It is possible to do this, but only with a big hack:

SparseArray {
    children(
        #(
            #array (
                expr: &$c.start[$i],
                size: $c.end - $c.start
            ) : #(
                #array (
                    expr: &$e,
                    size: $e != 0
                ) : $e
            )
        )
    )
}

Hee-haw!
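
The trick is that an array of size zero or one acts as a conditional. Here is the counting idea in C++; nonNullEntries is a made-up function, and it models only the loop structure, since exactly how the debugger binds $e at each nesting level isn't something I can vouch for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SparseArray {
    void **start, **end;
};

// The outer loop visits every slot; the inner loop runs (*slot != 0) times,
// i.e. once for a non-null entry and not at all for a null one, so the
// filtering happens without any conditional statement around the output.
inline std::vector<void *> nonNullEntries(const SparseArray& c) {
    assert(c.start <= c.end);
    std::vector<void *> shown;
    for (void **slot = c.start; slot != c.end; ++slot) {
        std::size_t innerSize = (*slot != 0) ? 1 : 0;  // the inner size: field
        for (std::size_t j = 0; j < innerSize; ++j)
            shown.push_back(*slot);                     // the inner value
    }
    return shown;
}
```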

Conclusion

The new visualizers are much more powerful than the old AutoExpand templates, and open up a lot of possibilities for interesting debug displays within the Visual Studio debugger. The feature sorely needs to be documented, and it can be a bit unstable when you work with it, but in general it makes data structures a lot nicer to work with.

(Update 9/27/2007: See my followup note.)

82 comments | Jul 24, 2006 at 02:08 | default
