Drawing text in a 3D program
Every once in a while I hear someone say that 2D APIs are old and deprecated and that 3D APIs are the way to go for 2D rendering. This drives me nuts. Sure, in a 3D API like OpenGL or Direct3D, it's easy to render lines and boxes and to blit images. You even get alpha blending and rotation almost for free.
Unfortunately, another essential drawing primitive in a 2D layer, and one that is a pain in the posterior to implement, is drawing text.
By this, I don't mean drawing monospace 8x8 bitmap characters that look like they were extracted using int 10h / AX=1130h and prerendered into a 16x16 grid in a texture. Nor do I mean the ugly line based “vector fonts” that seemed to ship with the graphics library for every C compiler for DOS. I mean hinted, antialiased, properly spaced, proportional text written in a Unicode-supporting font as you'd see in any modern program.
When faced with a problem like this, an appropriate response is to try to get someone else to solve it for you, and that's actually what I would recommend first: the complexity of modern font rendering combined with the scary size of the Unicode standard book should give anyone pause. There are a few readily available libraries for Windows worth calling out here:
D3DXFont: Font rendering path in the Direct3D helper library (D3DX). Works decently well, but has a few major downsides: requires Direct3D, doesn't work with bitmap fonts, and doesn't support subpixel (ClearType) antialiasing. The last point is particularly problematic if you are doing UI with traditionally-sized fonts.
DirectWrite: Accelerated, non-GDI font rendering solution introduced with Windows 7 and also available on Windows Vista through the Platform Update. Its biggest problem is not being available on XP, and its second biggest problem is that every single program that migrates to it seems to incur a large number of font rendering quality issues. If you use this, I implore you to turn on the whole pixel positioning mode and spare everyone from unusably blurry text.
Uniscribe: Complex script processing library. The rendering path won't be usable in a 3D-based program, but the layout and shaping facilities can still be useful. API looks a bit scary, but I haven't actually tried it yet.
And then, there are a whole bunch of third-party libraries too numerous to list.
When faced with this issue myself, I decided to see how much I could leverage ye olde Graphics Device Interface (GDI), the foundation of 2D graphics in Win32. This approach has a couple of advantages. The first is its universal availability. A much more important one is that by leveraging GDI it's much easier to support any font GDI does and render text the same way as GDI, which I consider important. In particular, many approaches I've seen either don't support bitmap fonts, don't use hinting, or don't do ClearType antialiasing, which I consider showstoppers for rendering UI.
As it turns out, this was a lot more painful than I had expected due to the usual number of quirks in Win32. Here's the story of all the approaches I went through....
The GetGlyphOutline() function
A simple way to leverage GDI for drawing text in a 3D API is to have GDI draw the text into an offscreen buffer and upload that to a texture. This lets GDI handle both positioning and rendering. Downside is that performance sucks due to the constant texture uploads, so I opted for a glyph-based rendering approach instead. That means finding a way to get glyph images out of GDI.
Many examples you'll find for drawing text in Direct3D 9 use GetGlyphOutline() to get images of individual glyphs. D3DXFont uses this method, for instance. It works, but has a couple of limitations. The first is that it simply doesn't work with bitmap fonts. Most bitmap fonts are not worth using nowadays and this isn't a problem if you get to choose the fonts. It's more of an issue if you're making a tool that either adapts to the system font settings or allows the user to choose a font. I wouldn't recommend shipping a text editor or terminal emulator with this limitation, for instance.
A more serious limitation is in output format. GetGlyphOutline() at least supports hinted bi-level output – not the unhinted garbage that some font frameworks try to pass off as a non-antialiased mode – and grayscale antialiased images up to 6-bit depth. It does not support subpixel (ClearType) antialiasing. This means that on Windows XP or later, a GGO-based text renderer cannot match the quality of GDI's own ClearType output. ClearType isn't for everyone, but I consider support for it a requirement for a production quality text renderer, and so I dismissed a GGO-based approach.
There is one other quirk with GetGlyphOutline(): it will not render underlining even if the font was created with that option. The solution is to retrieve the outline font metrics to get the underline position and to draw it yourself.
How to get ClearType-antialiased output
There are two requirements for getting rasterized glyph output with ClearType antialiasing out of GDI. The first is that the font must be enabled for it, either by default or by specifying the CLEARTYPE_QUALITY setting to CreateFont(). The second is that it must be rendered to a screen-compatible display context. This is pretty easy – get a screen DC, create a compatible DC, bind a 32-bit DIB to it, and do ExtTextOut(). Then, do GdiFlush() and read out the bits.
Somewhat more tricky is determining the bounds of the glyph. The first ways you'd think of to do this, like GetTextExtentPoint32() and DrawText(DT_CALCRECT), all give you the wrong answer. The reason is that they only give you the advance width of the text, or how far the text advances the positioning origin, and not the actual width of the text. In particular, the advance width does not include overhangs at the ends, which are most serious with italic fonts. This means that the text can render outside of the given bounds, which makes the advance width useless for telling how much space we need to capture the glyph image. It's possible to forgo this and just use the worst-case width for all glyphs, but this results in excessive wasted space and overdraw: Tahoma 11, for instance, reports a maximum width of 19 pixels.
To fix this you need to use one of the functions that gives you the overhang amounts. Annoyingly, this is different for bitmap fonts and for TrueType fonts: for bitmap fonts it comes from tmOverhang as returned by GetTextMetrics(), and for TrueType fonts you need to use a function like GetGlyphOutline() or one of the functions that returns the ABC widths.
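To make the difference between advance width and actual ink width concrete, here is a minimal sketch in portable C++. It assumes you already have ABC widths for each glyph from one of the functions above; AbcWidths, InkExtent, and MeasureInk() are hypothetical names made up for illustration, not GDI APIs.

```cpp
#include <algorithm>
#include <vector>

// Per-glyph spacing in the style of GDI's ABC widths (hypothetical struct):
// a = offset from the pen position to the left edge of the ink (may be negative),
// b = width of the ink,
// c = spacing from the right edge of the ink to the next pen position (may be negative).
struct AbcWidths {
    int a;
    int b;
    int c;
};

struct InkExtent {
    int left;     // leftmost ink column, relative to the starting pen position
    int right;    // one past the rightmost ink column
    int advance;  // total pen movement, i.e. the sum of a + b + c
};

// Walks the glyphs left to right and tracks where ink actually lands.
// Glyphs with no ink (b <= 0, such as a space) only move the pen.
InkExtent MeasureInk(const std::vector<AbcWidths>& glyphs) {
    InkExtent ext = {0, 0, 0};
    bool haveInk = false;
    int pen = 0;
    for (const AbcWidths& g : glyphs) {
        if (g.b > 0) {
            const int inkLeft = pen + g.a;
            const int inkRight = inkLeft + g.b;
            ext.left = haveInk ? std::min(ext.left, inkLeft) : inkLeft;
            ext.right = haveInk ? std::max(ext.right, inkRight) : inkRight;
            haveInk = true;
        }
        pen += g.a + g.b + g.c;
    }
    ext.advance = pen;
    return ext;
}
```

For a single italic-style glyph with A = -1, B = 9, and C = -2, the advance is only 6 pixels but the ink covers columns -1 through 7, so a capture bitmap sized from the advance width alone would clip on both sides.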
There's one additional gotcha: ClearType antialiasing can result in an additional pixel of overhang that the functions don't tell you about. This is one cause of colored slivers at the edges of text in various applications. I don't know of any way to detect this other than to just add one pixel on both sides when capturing the glyph and trimming it back off by scanning the bitmap afterward if needed.
Rendering text using the glyphs
The next step is to actually get the text on screen, which involves uploading the glyph images to a texture and splatting out a series of quads for each glyph. Constructing the texture is left as an exercise for the reader; in this era of Unicode and 7K+ glyphs per font it's not a good idea to preload the entire font into a texture, so this requires dynamically updating a texture based on the current required subset. This works fine as long as the font isn't too big, beyond which more drastic measures are needed like a viable scaling algorithm (bilinear or bicubic do not count) or polygonal rendering.
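For what it's worth, one simple way to hand out space in a dynamically updated glyph texture is a "shelf" allocator, sketched below in portable C++. This is only an illustration of one possible approach and not necessarily what any particular renderer does; ShelfAtlas and AtlasSlot are hypothetical names, and eviction is reduced to resetting the whole atlas when it fills up.

```cpp
// Top-left texel of a region handed out by the atlas (hypothetical struct).
struct AtlasSlot {
    int x;
    int y;
};

// Minimal shelf allocator: glyphs are placed left to right in rows
// ("shelves"), and a new shelf is started when the current one is full.
class ShelfAtlas {
public:
    ShelfAtlas(int width, int height)
        : width_(width), height_(height), cursorX_(0), shelfY_(0), shelfHeight_(0) {}

    // Reserves a w x h region. Returns false when the glyph does not fit,
    // in which case the caller would flush pending draws, call Reset(),
    // and re-add whatever glyphs it still needs.
    bool Allocate(int w, int h, AtlasSlot& slot) {
        if (w <= 0 || h <= 0 || w > width_ || h > height_) return false;
        if (cursorX_ + w > width_) {
            // Current shelf is full: open a new one below it.
            shelfY_ += shelfHeight_;
            cursorX_ = 0;
            shelfHeight_ = 0;
        }
        if (shelfY_ + h > height_) return false;
        slot.x = cursorX_;
        slot.y = shelfY_;
        cursorX_ += w;
        if (h > shelfHeight_) shelfHeight_ = h;
        return true;
    }

    // Forgets all allocations so the texture can be refilled from scratch.
    void Reset() {
        cursorX_ = 0;
        shelfY_ = 0;
        shelfHeight_ = 0;
    }

private:
    int width_;
    int height_;
    int cursorX_;      // next free column on the current shelf
    int shelfY_;       // top row of the current shelf
    int shelfHeight_;  // tallest glyph placed on the current shelf so far
};
```

A real cache would pair this with a map from character (and font) to slot so that each glyph is only rendered and uploaded the first time it is needed.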
Doing this also requires positioning the glyphs. The simplest way is to just add the advance widths of each character starting from the left. This definitely won't win you any awards for i18n support but is at least a viable start. On the GDI side, GetTextExtentExPoint() and GetCharacterPlacement() will produce positioning arrays from a string to make this easier.
Once you have the glyphs cached in a texture and all laid out, it's just a matter of blasting out some quads. Alpha test works for bi-level and blending will handle both bi-level and grayscale.
ClearType antialiased glyphs are a bit more troublesome as they require a separate alpha value for each RGB channel. Since the output of the shader is only a 4-vector and six components are needed (a color and an alpha for each of red, green, and blue), this means multiple passes. The tricky part is that the glyphs can and do overlap. My first attempt involved doing a masking pass followed by an additive pass, which resulted in overlap artifacts; doing ordinary blending one color channel at a time, with only that channel enabled in the destination color mask for each pass, worked better. The RGB alpha requirement also needs to be kept in mind if the text is being prerendered into a translucent image, as that image will also need RGB alpha channels unless the background is opaque.
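As a CPU-side reference for what the per-channel blend is supposed to compute, here is a minimal sketch in portable C++. It assumes the glyph image supplies one 8-bit coverage value per color channel and, like the approach described here, it does no gamma correction; Rgb8, BlendChannel(), and BlendSubpixel() are hypothetical names for illustration.

```cpp
#include <cstdint>

struct Rgb8 {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

// Blends one 8-bit channel with rounding: coverage 0 keeps the destination,
// coverage 255 replaces it with the text color.
inline std::uint8_t BlendChannel(std::uint8_t dst, std::uint8_t src, std::uint8_t coverage) {
    const int result = (dst * (255 - coverage) + src * coverage + 127) / 255;
    return static_cast<std::uint8_t>(result);
}

// Subpixel blend: each color channel uses its own coverage value instead of
// sharing a single alpha, which is why one RGBA shader output is not enough.
inline Rgb8 BlendSubpixel(Rgb8 dst, Rgb8 text, Rgb8 coverage) {
    Rgb8 out;
    out.r = BlendChannel(dst.r, text.r, coverage.r);
    out.g = BlendChannel(dst.g, text.g, coverage.g);
    out.b = BlendChannel(dst.b, text.b, coverage.b);
    return out;
}
```

Drawing black text over a white pixel with coverage (255, 128, 0) yields (0, 127, 255): the red channel is fully covered, green is half covered, and blue is untouched, which is the kind of colored edge pixel that a single shared alpha cannot express.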
Incidentally, this method of rendering ignores the gamma correction that GDI does when rendering text and which is required of display drivers when accelerating it. In my experience the result is acceptable without it, but you'll need it if you want to match GDI's output quality. This is expensive to do as shaders can't read from the destination surface and the hardware blender isn't nearly powerful enough to do gamma correction with configurable gamma.
GetCharacterPlacement() doesn't work so well
I glossed over glyph placement earlier. As it turns out, GDI has an attractive-looking function called GetCharacterPlacement() to do this... but it doesn't work as well as you'd think.
The GetCharacterPlacement() function does a lot of work for you in that it both converts characters to glyph indices and computes positions. It is less useful than it looks, however. The first problem I found is that it doesn't appear to handle diacritics properly despite having a flag to enable them; with fonts like Tahoma it seems to just place them in totally wrong positions.
A bigger problem with GCP() has to do with font substitution. Starting with Windows 2000, font drawing functions like TextOut() can draw characters even if they're not supported by the current font by pulling in glyphs from other linked fonts. For instance, it will draw katakana from a Unicode string with Tahoma selected in the display context. Not only does GCP() not handle font substitution, but it will also give you garbage glyph indices back instead with no error. This causes your text routine to draw a 'q' character where another character should be showing up.
This problem actually exists with any of the APIs that work in glyph indices as those are specific to each font, although some APIs are better behaved: GetGlyphIndices() at least returns the invalid character. Unfortunately, this means that if you are bypassing GDI's rendering, you need to do the font substitution yourself.
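If you do go down the do-it-yourself substitution road, the first step is splitting the text into runs the selected font can draw and runs that need another font. The sketch below is portable C++ and assumes the glyph indices came from GetGlyphIndices() called with GGI_MARK_NONEXISTING_GLYPHS, which marks unsupported characters with 0xFFFF, one index per UTF-16 code unit. TextRun and SplitByCoverage() are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Marker assumed to come from GetGlyphIndices() with
// GGI_MARK_NONEXISTING_GLYPHS for characters the font lacks.
const std::uint16_t kMissingGlyph = 0xFFFF;

struct TextRun {
    std::size_t start;    // index of the first character in the run
    std::size_t length;   // number of characters in the run
    bool needsFallback;   // true if the selected font has no glyphs for this run
};

// Groups consecutive characters by whether the selected font covers them.
std::vector<TextRun> SplitByCoverage(const std::vector<std::uint16_t>& glyphs) {
    std::vector<TextRun> runs;
    for (std::size_t i = 0; i < glyphs.size(); ++i) {
        const bool missing = (glyphs[i] == kMissingGlyph);
        if (!runs.empty() && runs.back().needsFallback == missing) {
            ++runs.back().length;  // extend the current run
        } else {
            TextRun run = {i, 1, missing};
            runs.push_back(run);   // coverage changed: start a new run
        }
    }
    return runs;
}
```

Each run flagged needsFallback would then be handed to whatever font substitution logic you have, which is exactly the part that turns out to be hard.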
MLang to the rescue... not
The Old New Thing has a post about using MLang to do font substitution to handle missing glyphs (http://blogs.msdn.com/b/oldnewthing/archive/2004/07/16/185261.aspx). I tried this and found it problematic.
The basic approach involves checking the charsets supported by the font and having MLang remap to provide alternate fonts for any missing ones. Well, the first trouble I hit was with the Marlett font. Marlett is a strange but useful font in Windows that provides glyphs to represent common window decoration symbols in the classic theme, such as close buttons and menu checkmarks. MLang just totally dies on Marlett: GetFontCodePages() returns a charset mask of zero, and GetFontUnicodeRanges() returns E_FAIL. The result is that your text renderer can't properly tell what is in the Marlett font and helpfully substitutes a different one to render a submenu icon as an actual 8 on screen. Doh.
Don't care about Marlett? Okay, then try the pseudo-font MS Shell Dlg instead. CreateFont() succeeds, you can draw with it, and GetFontCodePages() will work with it. GetFontUnicodeRanges() returns... you guessed it, E_FAIL. Wonderful.
The backslash problem
Okay, so perhaps you don't care about drawing window decorations or drawing characters outside of the selected font. This was the way I was leaning, and with those limitations everything seemed to be working OK with using GetCharacterPlacement() for glyph conversion and positioning. That's when I ran into a brick wall with, of all things, the backslash character.
As you may know, I have been running with the system code page set to Japanese. A historical quirk of running this way is that backslashes appear as yen signs, which makes file paths look pretty strange. This is something I've gotten used to and didn't think about when working on this until it came time to render a file path on screen and ended up with 'y' characters where the backslashes should have been. That's when I discovered the serious problem with the way this substitution occurs.
Here's what happens: when you are running with the system code page set to Japanese and only with certain fonts like Tahoma and Microsoft Sans Serif, GDI checks for the backslash character (U+005C) and draws the yen sign instead (U+00A5). So far, so good... except that for some reason, it doesn't pull U+00A5 from Tahoma, but from a different font. Strange, but with the font substitution in TextOut() it works.
What doesn't work so well is trying to draw U+005C in a custom renderer. Because GDI blocks U+005C from the font, GetGlyphIndices() fails on this character and GetCharacterPlacement() returns a garbage glyph index. If you're using MLang to check if font substitution is needed, it won't catch this because neither it nor your program knows this is happening: the charset test still passes, and GetFontUnicodeRanges() still reports that U+005C is in the font, because it is. As a result, your text renderer proceeds anyway and still fails to render backslashes properly.
And where are backslashes used? Oh yeah, file paths.
You can detect and special case this situation by checking for the GetGlyphIndices() failure and calling MapFont() to remap, but then it gets even more weird. MLang still doesn't know about the substitution, so it looks for a backslash and not a yen glyph. This leads to the bizarre behavior of it pulling a backslash from a different font, so you get a backslash from Arial in the middle of your text and it doesn't match all the other programs that are rendering a yen sign. Great.
(By the way, the GetGlyphOutline() function does handle all of these substitutions if you are using it to extract bitmaps, so you're spared this problem if you're OK with its limitations.)
The outlier font
The basic problem here is that GDI has two separate paths for handling text, one that uses glyph indices and another that uses characters, and the opaque character and font substitutions are happening only on the latter path. Therefore, the solution is to avoid glyph indices entirely and use only the character APIs. This means ditching GetCharacterPlacement() and GetGlyphIndices().
In order to correctly capture and position each glyph, we still need to know its size parameters, specifically the ABC widths. GetCharABCWidths() will do this although it has the annoying restriction of only working with outline fonts; that's OK, because bitmap fonts don't have non-zero A and C widths and we can fall back to GetCharWidth() instead. Allocate bitmap space using the B width along with a little gutter, call TextOut() to draw the glyph, copy that to the texture, and use the ABC widths to position the quads... done.
When I tried doing this, it seemed to work across the board, except there was one font on my system for which it failed: Calibri. Inexplicably, GetCharABCWidths() fails to handle font substitutions with this specific font even though other functions like GetTextExtentPoint32() work fine. This leaves us again without a reliable way to get the ABC widths of a character. Argh!
The final solution
After almost tearing my hair out over all of these issues, I resorted to the solution I always hate using: image scanning.
Basically, I ripped out the code to try to extract ABC widths from GDI and instead wrote code to allocate a worst-case sized bitmap, render a glyph into it with TextOut(), and then scan the pixels in the bitmap to obtain a bounding rectangle. From this bounding rectangle, it then determined the A width from the offset and the B width from the rectangle width. Afterward, GetTextExtentPoint32() was used to obtain the advance width (A+B+C), from which the C width was derived. The result: a solution that finally worked with ClearType antialiasing, bitmap fonts, font substitution, and backslashes.
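The scanning step can be sketched roughly as follows in portable C++. This is an illustration under stated assumptions rather than the actual code: it assumes the glyph has already been drawn with TextOut() into a top-down 32-bit buffer that was cleared to a known background color, that the top byte of each pixel can be ignored, and that the advance width was measured separately. InkBounds, ScanInk(), AbcWidths, and DeriveAbc() are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct InkBounds {
    bool empty;   // true if no pixel differs from the background
    int left;     // inclusive
    int top;      // inclusive
    int right;    // exclusive
    int bottom;   // exclusive
};

// Finds the bounding rectangle of all pixels that differ from the background.
// Only the low 24 bits (the color channels) are compared.
InkBounds ScanInk(const std::vector<std::uint32_t>& pixels, int width, int height,
                  std::uint32_t background) {
    InkBounds b = {true, width, height, 0, 0};
    const std::uint32_t bg = background & 0x00FFFFFFu;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t index = static_cast<std::size_t>(y) * width + x;
            if ((pixels[index] & 0x00FFFFFFu) == bg) continue;
            b.empty = false;
            if (x < b.left) b.left = x;
            if (x + 1 > b.right) b.right = x + 1;
            if (y < b.top) b.top = y;
            if (y + 1 > b.bottom) b.bottom = y + 1;
        }
    }
    return b;
}

struct AbcWidths {
    int a;
    int b;
    int c;
};

// originX is the column where the pen origin was placed when drawing the
// glyph; advance is the separately measured advance width (A + B + C).
AbcWidths DeriveAbc(const InkBounds& ink, int originX, int advance) {
    AbcWidths w = {0, 0, advance};  // blank glyph (e.g. a space): all spacing
    if (!ink.empty) {
        w.a = ink.left - originX;
        w.b = ink.right - ink.left;
        w.c = advance - w.a - w.b;
    }
    return w;
}
```

Drawing the glyph with its origin a few columns in from the left edge of the bitmap leaves room for negative A widths and for the extra pixel of ClearType overhang to show up in the scan.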