Eight general purpose registers on x86

An x86 CPU has eight main registers in its scalar register file in 32-bit mode: EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP. All of these have various special uses, but of them, the eighth, ESP, has the most special status as the stack pointer.

I did say eight general purpose registers.

It is possible, in some cases, to temporarily reuse ESP as a general purpose register. Since the x86 architecture is so register-starved, adding an eighth register can eliminate the need to spill variables to memory and boost the speed of a critical inner loop. I've used this in a couple of places in VirtualDub when I really needed it, and some have asked if it is actually safe to do this. The answer is yes.

How to do it

Simply reusing the stack pointer register is easy, as you just modify and use ESP directly. You do have to save and restore it, unless you expect to run for all eternity with no stack (good luck). The easiest way to do this is to just use a global variable:

mov stacksave, esp
...
mov esp, stacksave

Forget about using any high-level language within this scope — you will be writing the body in assembly language, whether it's inline assembly or external.

This works, but has the serious disadvantage of making your routine non-reentrant, as only one executing instance of the function can save its stack pointer at a time. (This is really only a problem for multithreaded scenarios: a routine that has given up its stack can't make calls, so it is unlikely to be reentered on the same thread.) One way around this is to stash it in an MMX register, but that assumes you have MMX and can spare one of the registers. What I do on Win32 is to stash the stack pointer in the Structured Exception Handling (SEH) chain instead:

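push 0                        ; dummy node: null handler
push fs:dword ptr [0]         ; dummy node: link to the previous head of the SEH chain
mov fs:dword ptr [0], esp     ; chain head now points at our node, which is the saved ESP
...
mov esp, fs:dword ptr [0]     ; recover the saved ESP from the chain head
pop fs:dword ptr [0]          ; restore the previous chain head
pop eax                       ; discard the null handler (clobbers EAX)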

The SEH chain is a linked list of active exception handling scopes in the current thread; it is used both for C++ exceptions and for system exceptions, and is pointed to by the first location in the thread environment block (TEB). The TEB is in turn pointed to by the FS: selector. We link a dummy node into the SEH chain to hold the stack pointer, and since there is a unique TEB for each thread, this allows our routine to run concurrently on multiple threads.
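The linking and unlinking described above can be sketched as a small portable C model. This is only an illustration of the list manipulation, not Windows code: `seh_record`, `teb_model`, `link_dummy`, and `unlink_dummy` are hypothetical names, and the struct merely imitates the link-then-handler layout of a registration record. In the assembly version the node lives on the stack, so the node's address is the saved ESP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one SEH registration record: a link to the
   previous record, followed by a handler slot (left null here). */
struct seh_record {
    struct seh_record *next;
    void *handler;
};

/* Hypothetical stand-in for the first field of a thread's TEB,
   i.e. the value that fs:[0] reads and writes. */
struct teb_model {
    struct seh_record *chain_head;
};

/* Models the entry sequence: build a dummy node and make it the head. */
static void link_dummy(struct teb_model *teb, struct seh_record *node) {
    node->handler = NULL;            /* push 0 */
    node->next = teb->chain_head;    /* push fs:[0] */
    teb->chain_head = node;          /* mov fs:[0], esp */
}

/* Models the exit sequence: fetch the node from the head, then put the
   previous head back. The returned address plays the role of saved ESP. */
static struct seh_record *unlink_dummy(struct teb_model *teb) {
    struct seh_record *node = teb->chain_head;
    teb->chain_head = node->next;
    return node;
}

/* Two modeled threads, each with its own TEB, save and restore
   independently. Returns 1 if both chains end up as they started. */
static int demo_two_threads(void) {
    struct seh_record outer = { NULL, NULL };
    struct teb_model teb_a = { &outer };
    struct teb_model teb_b = { NULL };
    struct seh_record dummy_a, dummy_b;

    link_dummy(&teb_a, &dummy_a);
    link_dummy(&teb_b, &dummy_b);    /* does not disturb thread A's slot */

    return unlink_dummy(&teb_a) == &dummy_a
        && unlink_dummy(&teb_b) == &dummy_b
        && teb_a.chain_head == &outer
        && teb_b.chain_head == NULL;
}
```

Calling `demo_two_threads()` returns 1 in this model, because each save goes through its own `teb_model`, unlike a single shared global.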

Now you can reuse ESP.

Aren't you screwed if an interrupt occurs?

Those of you who have programmed in DOS are likely squirming at this point about the possibility of interrupts. Ordinarily, reusing the stack pointer like this is a really bad idea because you have no idea when an interrupt might strike, and when one does, the CPU dutifully pushes the current program counter and flags onto the stack. If you have reused ESP, this would cause random data structures to be trashed. In this kind of environment, ESP must always point to valid and sufficient stack space to service an interrupt, and whenever this does not hold, interrupts must be disabled. Running with interrupts disabled for a long time lowers system responsiveness (lost interrupts and bad latency), and isn't practical for a big routine.
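To make the hazard concrete, here is a toy C model of what happens when an interrupt arrives while the stack pointer is aimed at data. It is a sketch under simplifying assumptions, not a simulation of any real CPU: `toy_cpu`, `push_word`, `take_interrupt`, and `count_trashed_words` are hypothetical names, memory is word-addressed, and the interrupt entry pushes flags followed by a two-word return address in the style of real mode.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_WORDS = 16 };

/* Hypothetical toy machine: word-addressed memory and a stack pointer
   stored as a word index. */
struct toy_cpu {
    uint16_t mem[MEM_WORDS];
    unsigned sp;
    uint16_t flags, cs, ip;
};

/* Push one word: decrement SP first, then store, as an x86 push does. */
static void push_word(struct toy_cpu *cpu, uint16_t value) {
    cpu->sp -= 1;
    cpu->mem[cpu->sp] = value;
}

/* Interrupt entry: the CPU pushes flags, then the return address,
   wherever SP happens to point. */
static void take_interrupt(struct toy_cpu *cpu) {
    push_word(cpu, cpu->flags);
    push_word(cpu, cpu->cs);
    push_word(cpu, cpu->ip);
}

/* Fill memory with a known pattern, aim SP into the middle of it as if
   it were a data pointer, take one interrupt, and count damaged words. */
static int count_trashed_words(void) {
    struct toy_cpu cpu = { {0}, 8, 0x0202, 0x1234, 0x0100 };
    int trashed = 0;

    for (unsigned i = 0; i < MEM_WORDS; ++i)
        cpu.mem[i] = 0xAAAA;

    take_interrupt(&cpu);

    for (unsigned i = 0; i < MEM_WORDS; ++i)
        if (cpu.mem[i] != 0xAAAA)
            ++trashed;
    return trashed;
}
```

In this model a single interrupt overwrites the three words just below the reused stack pointer, which is exactly the kind of silent data corruption described above.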

However, we're running in protected mode here.

When running in user space in Win32, interrupts do not push onto the user stack, but onto a kernel stack instead. If you think about it, it isn't possible for the user stack to be used. If the thread were out of stack space, or even just had an invalid stack, when the CPU tried to push EIP and EFLAGS, it would page fault, and you can't page fault in an interrupt handler. Thus, the scheduler can do any number of context switches while a no-stack routine is running, and any data structures that are being pointed to by ESP will not be affected.

There is one case where the OS will try to push data onto the invalid stack, and that is if an exception occurs. The most likely exception is an access violation (C0000005). The good news is that this will never happen for several reasons:

The bad news is that violating any one of these is enough to make the application toast when an exception occurs on that thread. Whenever an exception unwind fails like this, Windows will simply kill the task outright and the application disappears. No unhandled exception handler, no crash dialog, nothing. This means that you had better make very sure that your routine is debugged before you ship it out to users, because no amount of in-process exception trapping is going to be able to fire a report if the routine dies. The good news is that a debugger can intercept such exceptions before the OS tries to resolve them in-thread, so you can still debug a stackless routine without problems.

Comments

This blog was originally open for comments when this entry was first posted, but was later closed and then removed due to spam and after a migration away from the original blog software. Unfortunately, it would have been a lot of work to reformat the comments to republish them. The author thanks everyone who posted comments and added to the discussion.