## "If I were king for a day, and giving answers to the test...."

Here are my answers to the Win32 programmer brain teasers I posed last time. If you haven't seen the questions yet, you may want to look at the original blog post before you read the spoilers.

As usual, I might make mistakes, so if you think one of my answers is incorrect, feel free to comment.

Avery's Win32 Brain Teasers (answers)

#1:
Consider the two C++ expressions (x>=y) and !(x<y). Is there a functional difference between them? Why might you choose one over the other?

I forgot to say that x and y are floating point variables, so full credit if you say that operator>=() is not available. You'll see the latter construct often in STL-like code, where weak and strict orderings are always defined in terms of less-than comparisons, and all other inequalities are derived by negation and operand exchange.

If they are floating-point values, the difference is significant because the two will differ in result if one of the values is a NaN (not a number). Well, and assuming you are using at least Visual Studio .NET, or a compiler that implements NaN comparisons properly. This becomes really important if you are trying to write safety code. If you attempt to do a bounded array lookup like this:

`if (f > 4.0f) f = 4.0f; if (f < 0.0f) f = 0.0f; int i = (int)f; return values[i] + (values[i+1] - values[i])*(f - i);`

...you might be surprised when it still crashes on a NaN, because every ordered comparison involving a NaN (<, >, <=, >=) returns false, so neither clamp fires. Then again, if the array contains floats, the address computation might shift bits out of the integer indefinite that results (80000000) and convert that to zero, and then the routine will just compute a NaN for the next poor sucker to consume.
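A minimal sketch of the difference, assuming IEEE-compliant comparisons (no fast-math style options). The helper names `ClampNaive` and `ClampNaNSafe` are made up for illustration; the second one uses the negated form so that a NaN is caught by the first test.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Clamp written like the snippet above: both comparisons are false for a
// NaN, so the NaN passes straight through.
inline float ClampNaive(float f, float hi) {
    if (f > hi)   f = hi;
    if (f < 0.0f) f = 0.0f;
    return f;
}

// Negated form: !(f >= 0.0f) is true for negative values and for NaN,
// so a NaN is forced to 0 before it can reach the index computation.
inline float ClampNaNSafe(float f, float hi) {
    assert(hi >= 0.0f);
    if (!(f >= 0.0f)) f = 0.0f;
    if (!(f <= hi))   f = hi;
    return f;
}
```

With the second version, the value handed to the `(int)` cast is always inside the clamped range, whatever garbage came in.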

#2:
Explain the significance of at least three of the following constants:

• 0xC0000005 STATUS_ACCESS_VIOLATION. Congratulations, you just crashed.
• 0x8876086c D3DERR_INVALIDCALL -- check your caps bits again. If you aren't a Direct3D programmer, you should at least have recognized this as a failure HRESULT.
• 0xDDDDDDDD Freed heap memory.
• 0x027F     Standard FPU control word for Win32.
• 0x3F800000 Bit pattern for 1.0f.
• -858993460 A.k.a. 0xCCCCCCCC -- uninitialized local variable with VC++ runtime checks enabled.
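Three of these are easy to check mechanically. The sketch below assumes a 32-bit IEEE `float` and a two's complement `int`; the helper names are hypothetical and nothing here comes from the Windows headers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float without breaking aliasing rules.
inline float FloatFromBits(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// View a 32-bit pattern as a signed value, the way a debugger shows an int.
inline std::int32_t SignedFromBits(std::uint32_t bits) {
    std::int32_t v;
    std::memcpy(&v, &bits, sizeof v);
    return v;
}

// A failure HRESULT has the top (severity) bit set.
inline bool IsFailureHresult(std::uint32_t hr) {
    return (hr & 0x80000000u) != 0;
}

inline void CheckConstants() {
    assert(FloatFromBits(0x3F800000u) == 1.0f);
    assert(SignedFromBits(0xCCCCCCCCu) == -858993460);
    assert(IsFailureHresult(0x8876086Cu));
}
```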

#3:
When could the following code cause a deadlock?

MessageBox(hwnd, "Hello!", "Test", MB_OK);

When you call it from the DllMain() of a DLL, especially if a thread is spun up somewhere along the way, which then causes thread attach notifications to attempt to be sent... under loader lock. Whoops. And since MessageBox() spawns a modal message loop, nearly anything can happen (think hooks). Asserts on exit are great for tripping this....

#4:
A colleague is receiving the following link error.

linkerr.obj : error LNK2001: unresolved external symbol "public: int __thiscall Foo::GetCurrentTime(void)" (?GetCurrentTime@Foo@@QAEHXZ)
linkerr.exe : fatal error LNK1120: 1 unresolved externals

The method Foo::GetCurrentTime() appears to be both declared and defined correctly as a non-inline method, and other modules in the same project are calling it without problems. Yet, attempting to call it from the module that the colleague is working with results in this link error. Explain a possible reason for the error and a strategy for fixing it.

The modules that currently define or use the method include <windows.h>. Your colleague's doesn't. The link error is caused by this line in the nested header <winbase.h>:

`#define GetCurrentTime()                GetTickCount()`

Everyone is really seeing GetTickCount(), except him. This is especially fun with the semi-random ANSI/Unicode #defines. I hear a bug like this has bitten MFC one or two times, because some MFC routines have names that hit a #define, and no one noticed for a long time because MFC code usually includes <windows.h>.
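Here is a self-contained sketch of the effect, plus the parenthesized-name workaround that also comes up in the comments. It does not include <windows.h>; the `GetTickCount()` function and the `#define` are stand-ins that mimic the real ones, and the two structs are hypothetical.

```cpp
#include <cassert>

// Stand-ins for what <windows.h> provides, so the sketch compiles anywhere.
inline unsigned long GetTickCount() { return 0; }
#define GetCurrentTime() GetTickCount()

struct Unprotected {
    // The preprocessor rewrites this declaration to "int GetTickCount()".
    int GetCurrentTime() { return 1; }
};

struct Protected {
    // The name is followed by ')' instead of '(', so the function-like macro
    // is not expanded and the method really is named GetCurrentTime.
    int (GetCurrentTime)() { return 2; }
};

inline void MacroDemo() {
    Unprotected u;
    Protected p;
    assert(u.GetTickCount() == 1);      // the method was silently renamed
    assert((p.GetCurrentTime)() == 2);  // the call site needs parentheses too
}
```

The catch with the workaround is that every declaration and every call has to use the parentheses, so it only helps if you control all of the source.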

#5:
Identify the most likely usage for memory pointed to by the following addresses in 32-bit Windows NT (not Win9x/ME):

• 0x0012f4cc Main thread stack (normally from 0012FFFF down to 00030000 for a normal 1MB stack reserve).
• 0x004039cf Main executable code (the first segment with a default base of 00400000).
• 0x038d02c0 Heap.
• 0x10004310 Default DLL load address, meaning random unbased DLL not part of the OS (since Microsoft is meticulous at rebasing all of the OS ones).
• 0x77f70218 System DLL. Most of the core DLLs in Windows are based in the 60000000-7FF00000 range.

Addresses are a bit different for Win9x. The executable still starts at 00400000 -- the OS can't relocate it anyway -- but system DLLs are in the shared range at 80000000-BFFFFFFF, and the main thread stack can't be that low due to the 1MB DOS arena at address zero.

#6:
Which implementation of a general-purpose memcpy() would be faster for large blocks: one that uses type (char) for memory movement, or one that uses type (double)? Why? You may assume that pointers and sizes are nicely aligned so that boundary fixups are not necessary.

This was a trick question. The answer is (char), because it's the one that actually works. If you attempt to write a memcpy() for arbitrary bit patterns that uses (double) on x86, it may either crash when it sees a bit pattern that corresponds to a NaN, or even worse, silently change a signaling NaN (SNaN) into a quiet NaN (QNaN) by setting the highest significand bit. Whether or not this happens can depend on your compiler's mood -- when I tested this, VC6 changed the copy loop to use regular integer moves, which would avoid this, but VC8 still used floating-point loads and stores. You didn't need bit 51, did you?

The chances of getting away with this are better if SSE2 code generation is enabled, as the SSE/SSE2 load instructions do not flag exceptions on NaNs like the x87 ones do. Even then, though, you can't depend on the compiler generating SSE2 moves, as sometimes it will still use x87 instructions in 32-bit for various reasons, and it's still a great bear trap given that the code looks portable.

You can actually do copies safely with the FPU using FILD/FISTP to load and store 64-bit integers. On the original Pentium, this was one of the few ways to do a 64-bit write transaction, and one of the faster ways to write to uncached memory. This is no longer the case, though, with fast string operations and MMX, and in any case it's not exactly what the compiler would generate from type (double).
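If the goal is simply to move 64 bits at a time from portable code, an integer type avoids the problem, since integer loads and stores never interpret the bits. A rough sketch, assuming aligned, non-overlapping buffers; `CopyWords64` is a hypothetical name, and whether this actually beats a byte loop depends entirely on the compiler and CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies 'count' 64-bit words using an integer type, so NaN bit patterns
// (signaling or quiet) come out exactly as they went in.
inline void CopyWords64(std::uint64_t* dst, const std::uint64_t* src, std::size_t count) {
    assert(count == 0 || (dst != nullptr && src != nullptr));
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = src[i];
}
```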

#7:
What is wrong with this implementation of IUnknown::AddRef()? When will it work without any problems, and when will it fail? You may assume that COM aggregation is not involved.

ULONG STDMETHODCALLTYPE Foo::AddRef() {
return ++mRefCount;
}

Partial credit if you said that it may fail on a single CPU system, and will always eventually fail on a multi-CPU system, because there is a race condition in the increment that can goof if multiple threads call AddRef() on the same object. IUnknown::AddRef() must be thread-safe in COM, and thus the code should have used InterlockedIncrement() or equivalent.

Full credit if you also noted that it will work perfectly fine if mRefCount is actually an instance of an object whose operator++() does a thread-safe increment. (Highly recommended.)
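One way to get that behavior without touching the body of AddRef() is to make the counter a `std::atomic`. This is only a sketch under assumptions: it needs a C++11 compiler, the `ULONG` typedef is a stand-in for the one in <windows.h>, `STDMETHODCALLTYPE` is left out, and the class is not a complete IUnknown implementation. InterlockedIncrement() on a plain LONG is the classic Win32 equivalent.

```cpp
#include <atomic>
#include <cassert>

typedef unsigned long ULONG;   // stand-in for the <windows.h> typedef

class Foo {
public:
    ULONG AddRef() {
        return ++mRefCount;    // atomic increment; returns the new count
    }
    ULONG Release() {
        const ULONG n = --mRefCount;
        assert(n != ~ULONG(0));   // wrapped: Release() without a matching AddRef()
        if (n == 0)
            delete this;
        return n;
    }
private:
    ~Foo() {}                  // heap-only; destroyed through Release()
    std::atomic<ULONG> mRefCount{1};
};
```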

#8:
Write a function to return the size of a Win32 GDI bitmap header, given a (const BITMAPINFOHEADER *).

If you said sizeof(BITMAPINFOHEADER), nice try. If you said sizeof(BITMAPINFO), go back to start. You've been reading too many MSDN samples.

This is one question I'm pretty sure I don't have a complete answer to, because there are so many special cases that have to be handled. Off the top of my head:

• Start with the biSize field, because the BITMAPINFOHEADER might actually be a BITMAPV4HEADER, or a BITMAPV5HEADER... or even the dreaded BITMAPCOREHEADER.
• Is biCompression == BI_BITFIELDS? Well, then remember to add the three DWORDs at the end indicating the red/green/blue masks... only if it's a BITMAPINFOHEADER. Otherwise, they're included already.
• Does the bitmap header have a palette attached? If so, biClrUsed indicates the number of RGBQUAD entries that follow.
• If biClrUsed == 0, it actually means the maximum.
• Oh yeah... if you're actually calling a function that takes either DIB_RGB_COLORS or DIB_PAL_COLORS as an argument and the latter is used, the palette is actually made of 16-bit words rather than RGBQUADs.

Good luck with this one, as I'm sure there are more oddball cases. Give yourself full credit if you got at least a couple of these.
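For what it's worth, here is a sketch that covers only the cases listed above, and it is almost certainly still incomplete. `BitmapInfoHeader` is a stand-in that mirrors the field layout of BITMAPINFOHEADER so the code compiles without <windows.h>; the function name, the `palColors` flag (meaning the caller is using DIB_PAL_COLORS), and the constants are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in with the same field layout as the Win32 BITMAPINFOHEADER.
struct BitmapInfoHeader {
    std::uint32_t biSize;
    std::int32_t  biWidth, biHeight;
    std::uint16_t biPlanes, biBitCount;
    std::uint32_t biCompression, biSizeImage;
    std::int32_t  biXPelsPerMeter, biYPelsPerMeter;
    std::uint32_t biClrUsed, biClrImportant;
};
static_assert(sizeof(BitmapInfoHeader) == 40, "layout must match the 40-byte header");

const std::uint32_t kBiBitfields    = 3;   // value of BI_BITFIELDS
const std::uint32_t kCoreHeaderSize = 12;  // sizeof(BITMAPCOREHEADER)

inline std::size_t BitmapHeaderSize(const BitmapInfoHeader* hdr, bool palColors = false) {
    assert(hdr != nullptr);
    const std::uint32_t size = hdr->biSize;   // first DWORD in every variant

    if (size == kCoreHeaderSize) {
        // BITMAPCOREHEADER: bcBitCount is the WORD at byte offset 10, and the
        // palette is always full size, made of 3-byte RGBTRIPLEs.
        std::uint16_t bits;
        std::memcpy(&bits, reinterpret_cast<const unsigned char*>(hdr) + 10, sizeof bits);
        const std::size_t entries = bits <= 8 ? (std::size_t(1) << bits) : 0;
        return size + entries * (palColors ? 2 : 3);
    }

    std::size_t total = size;

    // The three masks trail only a plain 40-byte header; V4/V5 embed them.
    if (hdr->biCompression == kBiBitfields && size == sizeof(BitmapInfoHeader))
        total += 3 * sizeof(std::uint32_t);

    // biClrUsed == 0 means "the maximum" for palettized depths.
    std::size_t entries = hdr->biClrUsed;
    if (entries == 0 && hdr->biBitCount <= 8)
        entries = std::size_t(1) << hdr->biBitCount;

    // RGBQUADs normally, 16-bit indices with DIB_PAL_COLORS.
    return total + entries * (palColors ? 2 : 4);
}
```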

#9:
Describe a reliable, efficient way to wait for a worker thread to exit from a user interface thread, i.e. a thread that has windows.

If you said Sleep(), you not only failed, but I'm going to have you expelled.

If you said "wait for a signal/semaphore" or "wait for a message," not quite. I said thread exit, not when the worker's job is finished. The distinction's important if you absolutely need to guarantee that the thread has exited... say, if you're going to unload the DLL that holds its code. (There's FreeLibraryAndExitThread() for that case, but pretend something else is going to happen besides just freeing a DLL.)

If you said PostQuitMessage() and a modal loop, you still failed, but I'll give you points for creativity.

If you said GetExitCodeThread(), good try, but there's that little problem of its return value being overloaded for both status and the thread's return value. And you're polling.

If you said WaitForSingleObject() on the thread, that would be a better way... but again, uh, don't you have to pump messages? WaitForSingleObject(hThread, 0) is indeed a safer way to check for thread exit, but it's still polling.

MsgWaitForMultipleObjects() or MsgWaitForMultipleObjectsEx() is the way to go here, because it allows you to wait for either thread exit or a message to arrive.
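A rough sketch of such a wait loop follows. The function name `WaitForThreadExitPumping` and the way it handles WM_QUIT are my own assumptions rather than anything prescribed; the `#ifdef` is only there so the snippet is skipped on non-Windows toolchains.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <cassert>

// Blocks until hThread is signaled, dispatching window messages meanwhile.
// Returns false if the wait fails or a WM_QUIT shows up first.
inline bool WaitForThreadExitPumping(HANDLE hThread) {
    assert(hThread != NULL);
    for (;;) {
        const DWORD r = MsgWaitForMultipleObjects(1, &hThread, FALSE,
                                                  INFINITE, QS_ALLINPUT);
        if (r == WAIT_OBJECT_0)
            return true;                // the thread has really exited
        if (r != WAIT_OBJECT_0 + 1)
            return false;               // WAIT_FAILED or something unexpected

        // WAIT_OBJECT_0 + 1: input is pending, so drain the queue.
        MSG msg;
        while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) {
            if (msg.message == WM_QUIT) {
                PostQuitMessage((int)msg.wParam);   // hand it to the outer loop
                return false;
            }
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
    }
}
#endif
```

Draining the queue with PeekMessage() before waiting again matters, because MsgWaitForMultipleObjects() only wakes for input that arrived since the queue was last checked.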

#10:
Write a function that determines the HMODULE of its own DLL or EXE. Complications: You cannot call GetModuleHandle() or GetModuleHandleEx(), use a cached value, or reference __ImageBase, and the implementation must not be hardcoded to a particular DLL or EXE.

Call VirtualQuery() on your own function pointer, or any other static object within your module. Cast the AllocationBase field of the MEMORY_BASIC_INFORMATION structure to (HMODULE).
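In code, the idea looks roughly like this. `GetOwnModule` and the `anchor` variable are hypothetical names, and the `#ifdef` is only there so the snippet is skipped on non-Windows toolchains.

```cpp
#ifdef _WIN32
#include <windows.h>

// Returns the HMODULE of whichever DLL or EXE this function was linked
// into, or NULL if the query fails.
static HMODULE GetOwnModule() {
    static const char anchor = 0;   // any static object inside this module
    MEMORY_BASIC_INFORMATION mbi;
    if (VirtualQuery(&anchor, &mbi, sizeof mbi) == 0)
        return NULL;
    // AllocationBase, not BaseAddress: the latter is only the start of the
    // region containing 'anchor', not the start of the image.
    return (HMODULE)mbi.AllocationBase;
}
#endif
```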

I'll give you partial credit for creativity and for taking the unnecessarily hard road if you used ToolHelp or PSAPI to iterate over the process's module list.

Why would you want to do this? Some UI operations, particularly registering window classes and creating windows, require your HMODULE or HINSTANCE (the two are equivalent). If you're making a static library, it's best if you can avoid having to depend on whether you are in a DLL or EXE, or force clients to pass the HMODULE to you. GetModuleHandleEx() is a cleaner way to do this -- as long as you have Windows XP. Otherwise, the VirtualQuery() trick can come in handy. __ImageBase is a really clean way to do it, but it was only introduced in VS.NET, and it's specific to the VC++ linker.

### Comments

Comments posted:

Regarding #1, you didn't even need to mention float variables. Given that it's C++, which has operator overloading, there's no way to tell from an expression what it does. (x >= y) might download updates from the internet and !(x < y) might trigger your screensaver... That'd be very bad practice, of course, but it is quite possible (and I've seen it done).

Steven - 21 06 06 - 04:11

Yeah, I've got one right (#9)
*celebrating myself*
Luckily I dumped Windows 3 years ago (for private use) and will stop using it professionally in a month ... no more stupid Windows API, but instead: probably even worse documented open source libraries (but at least I'll get to look at the code ^_^)

Apexo - 21 06 06 - 11:19

Regarding #10: that is exactly what I tried yesterday and wanted to post, but it returns a slightly higher number than my image base (or HMODULE). My guess (although I didn't have time to check this) is that Windows allocates the memory the following way (going from low to high):

[MZ header] --- This is the image header
[Section one]
[Section two]
...

This is probably because the memory permissions of the header differ from those of section 1 and so on. So with VirtualQuery() you would actually be getting the base of the current section (which is even worse if you were using some custom-crafted binary and were in some section other than the first).

I was reading now through the ReactOS documentation (which is always a nice way to get undocumented windows information :) ), and I've found the TEB and PEB data structures description here: http://www.reactos.org/generated/doxygen.. (if it goes away, just use "reactos teb peb" on google)

From what I see there, you can do a TEB -> PEB -> ImageBase lookup (you can get the address of the TEB from FS:[0x18]). But this seemingly won't work for third-party DLLs, only for the main executable, since (afaik) the PEB is shared by the whole process (that's why it's called the process environment block :) ) and from a DLL you would still get the same PEB.

Cd0MaN - 22 06 06 - 02:40

Just tested it again, and VirtualQuery() is giving 00400000 for the allocation base. Dunno what's going on in your case. Protection can still be different even if the memory range is allocated for the whole DLL, because calling VirtualProtect() doesn't affect AllocationBase.

As for returning the image base of the main EXE, well, you can just use GetModuleHandle(NULL) for that.

Phaeron - 22 06 06 - 03:38

I've done some further testing and it seems that I used BaseAddress instead of AllocationBase, so that's where the problem came from. However, a colleague of mine who works on a process dumper says that on really large executables Windows may allocate the memory with several calls to VirtualAlloc, so that is something one should watch out for.

Cd0MaN - 22 06 06 - 14:25

It depends on how VirtualAlloc() is being used. If memory is being committed straight, then yes, AllocationBase will be different for some parts of the image. However, it's more likely that the loader does a MEM_RESERVE allocation for the entire address range first, in order to atomically allocate a contiguous block of address space, and then MEM_COMMIT parts of it later. That way, no other thread can sneak in and allocate address space in the middle. Committing memory in the middle of a reserved block does not change the AllocationBase of those pages.

In any case, here's an MSJ article by Jeffrey Richter describing the method:
http://www.microsoft.com/msj/archive/S1D..

Phaeron - 23 06 06 - 00:24

You forgot the strategy for fixing #4. The two common ways are to either #undef GetCurrentTime or suppress function macros by using parens: int (Foo::GetCurrentTime)()

Regarding #1, the only place I could find information about the semantics of the comparison operators is /fp:precise - "The compiler will properly handle comparisons involving NaN. For example, x != x evaluates to true if x is NaN and ordered comparisons involving NaN raise an exception". Other than that, it doesn't define what "properly handling NaN" means, my guess is (I use VC6, so I can't verify) that these operators bind to:

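| C++ | IEEE | G | L | E | U | I |
|---|---|---|---|---|---|---|
| == | = | F | F | T | F | n |
| != | ?<> | T | T | F | T | n |
| > | > | T | F | F | F | y |
| >= | >= | T | F | T | F | y |
| < | < | F | T | F | F | y |
| <= | <= | F | T | T | F | y |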

G = greater-than
L = less-than
E = equal
U = either operand is NaN
I = raises exception if either operand is NaN

Other than using /fp:precise (which is the default) or #pragma float_control(precise,on), I wouldn't rely on these semantics nor do I expect Microsoft not to bungle this somewhere, somehow.

asdf - 22 08 06 - 00:09

#undef'ing symbols from windows.h is not recommended, as it can break otherwise valid code. I always discourage this technique. Parens do work in this case, because the macro is defined with an argument list, but that assumes you have access to the source and haven't gotten a precompiled lib or DLL.

NaN-compliant comparison behavior is described at: http://msdn2.microsoft.com/en-us/library..

It was first introduced in VS2002; VC6 does not handle this correctly. The NaN-correct comparisons are actually handled by the FPU; the change in the compiler was to use different conditional jump instructions that test the correct combination of flags. I think it's fairly safe to rely upon, if you are using VS2003 or VS2005.

Phaeron - 22 08 06 - 02:14
