§ FPO and a callee-pops parameter passing convention make perfect stack walks impossible

There's a bit of discussion over at Larry Osterman's blog, which I've been participating in, about the Frame Pointer Omission (FPO) optimization in the Visual C++ compiler and how it affects stack walking. I figured I'd expound on it a bit more here.

The basic problem to be solved when doing a stack walk is finding the locations of return addresses in the stack, which are also the locations of the stack pointer upon entry to each function in the call stack. If you can somehow determine how much local data is present at each stack frame, you can maintain a virtual stack pointer and hop from stack frame to stack frame until the call stack is determined. On x86, the steps involved are generally as follows:

  1. Obtain the instruction pointer (EIP) and the stack pointer (ESP) of the thread.
  2. Look up the current virtual EIP in debugging information to determine the current function.
  3. Obtain the base of the stack frame, either by reading the frame pointer on the stack, or offsetting ESP if there is no frame pointer. This is now the new virtual ESP.
  4. Read the return address into the virtual EIP.
  5. Go to step 2.
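
As a rough illustration, the loop above can be sketched over a simulated stack. Everything here is an assumption made for the example rather than any real debug format or API: the FunctionInfo record, the Memory map, and the simplification that each FPO function has a single ESP-to-return-address offset (in real code it varies per instruction) and leaves EBP untouched.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical per-function debug record (not any real symbol format).
struct FunctionInfo {
    uint32_t begin, end;      // code address range [begin, end)
    bool hasFramePointer;     // true if the function maintains an EBP frame
    uint32_t espToReturn;     // FPO only: bytes from ESP up to the return address
    uint32_t calleePopBytes;  // argument bytes removed by the function's own "ret n"
};

struct ThreadContext { uint32_t eip, esp, ebp; };

// Simulated stack memory: address -> 32-bit value.
using Memory = std::map<uint32_t, uint32_t>;

const FunctionInfo* FindFunction(const std::vector<FunctionInfo>& fns, uint32_t eip) {
    for (const FunctionInfo& f : fns)
        if (eip >= f.begin && eip < f.end) return &f;
    return nullptr;
}

// Returns the virtual EIP recorded at each frame, innermost first.
std::vector<uint32_t> WalkStack(ThreadContext ctx, const Memory& mem,
                                const std::vector<FunctionInfo>& fns) {
    std::vector<uint32_t> trace;
    while (const FunctionInfo* fn = FindFunction(fns, ctx.eip)) {  // step 2
        trace.push_back(ctx.eip);

        // Step 3: locate the slot that holds the return address.
        uint32_t returnSlot;
        if (fn->hasFramePointer) {
            assert(mem.count(ctx.ebp));
            returnSlot = ctx.ebp + 4;   // [EBP] = saved EBP, [EBP+4] = return address
            ctx.ebp = mem.at(ctx.ebp);  // follow the saved frame pointer chain
        } else {
            returnSlot = ctx.esp + fn->espToReturn;
        }

        // Step 4: the return address becomes the new virtual EIP, and the
        // virtual ESP moves past it and past any arguments the callee popped.
        assert(mem.count(returnSlot));
        ctx.eip = mem.at(returnSlot);
        ctx.esp = returnSlot + 4 + fn->calleePopBytes;
    }  // step 5: repeat until EIP falls outside every known function
    return trace;
}
```

Note that the sketch only works because espToReturn and calleePopBytes are assumed to be known exactly for every frame; the rest of this post is about why that assumption fails.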

The trick is trying to determine the base of each stack frame. When EBP frame pointers are present, this is easy -- just keep following the saved frame pointers next to each return address. What's not so easy is the FPO case, where ESP is used directly, because the offset from ESP to the return address depends on how much local variable space is allocated, and how many parameters for called functions are present.

I claim that it is impossible to reliably stack walk in the general case with the __stdcall or thiscall calling convention and FPO involved -- even with full debugging information! And no, the code doesn't have to be that weird.

Consider the following function disassembly:

  00000000: 8B 01              mov         eax,dword ptr [ecx]
  00000002: 6A 02              push        2
  00000004: 6A 01              push        1
  00000006: FF D0              call        eax
  00000008: FF D0              call        eax
  0000000A: C3                 ret

What would be the appropriate debug information for this function? Ideally, you would want to encode an ESP-to-return-address offset for each instruction, so that, given the instruction pointer, you could unambiguously determine the offset at every instruction that could crash. In some cases, you wouldn't even need to encode this information, if you could walk the instruction stream and update a virtual ESP based on executed instructions. This is frequently possible with compiler-generated code, since the compiler uses well-defined and simple patterns to maintain the stack. This is often done with RISC CPUs that have very easy-to-parse instruction streams. It's also done on x64, with the help of restrictions on prolog/epilog code and unwind bytecode. x86, however, has neither of these advantages.
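
To make the "walk the instruction stream" idea concrete, here is a minimal sketch that tracks the ESP-to-return-address offset across a straight-line sequence. The Insn model is hypothetical and reduces each instruction to its effect on ESP; a real walker would have to decode actual x86 and follow branches.

```cpp
#include <cassert>
#include <vector>

constexpr int kUnknown = -1;  // sentinel: the offset can no longer be determined

enum class Op { Push, Pop, SubEsp, AddEsp, Call, Other };

// Hypothetical simplified instruction: only its effect on ESP is modeled.
struct Insn {
    Op op;
    int bytes;  // SubEsp/AddEsp: immediate; Call: bytes popped by the callee's
                // "ret n", or kUnknown if the callee cannot be identified
};

// Returns the offset from ESP to the function's return address as it stands
// just before each instruction executes.
std::vector<int> TrackEspOffsets(const std::vector<Insn>& code) {
    std::vector<int> offsets;
    int cur = 0;  // on entry, ESP points directly at the return address
    for (const Insn& insn : code) {
        offsets.push_back(cur);
        if (cur == kUnknown) continue;  // once lost, the offset stays lost
        switch (insn.op) {
        case Op::Push:   cur += 4; break;
        case Op::Pop:    cur -= 4; break;
        case Op::SubEsp: cur += insn.bytes; break;
        case Op::AddEsp: cur -= insn.bytes; break;
        case Op::Call:
            // The pushed return address is removed again by the callee's RET,
            // so the net effect is whatever argument bytes the callee pops.
            cur = (insn.bytes == kUnknown) ? kUnknown : cur - insn.bytes;
            break;
        case Op::Other:  break;
        }
        assert(cur == kUnknown || cur >= 0);
    }
    return offsets;
}
```

With a caller-pops call (bytes = 0 followed by an explicit AddEsp), every offset stays known. An indirect call whose callee pops an unknown number of bytes makes every later offset unknown.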

Let's say the second CALL instruction in the above code crashes, due to EAX=0 -- which would mean that the first function call returned a null pointer. What would be the proper offset to add to ESP to get to the return address? You can't tell from the called function, because the call is indirect and you don't know which function was called.

If you answered 0, you were wrong. If you answered 8, you were wrong. In fact, no matter what value you picked, you would be wrong.

Here's the source code that produced the above machine code, when compiled with Visual Studio 2005 SP1 at /O2:

typedef void *(__stdcall *StateFunction0)();
typedef void *(__stdcall *StateFunction1)(int a);
typedef void *(__stdcall *StateFunction2)(int a, int b);
struct IState { virtual void RunState() = 0; };
struct State0 : public IState { virtual void RunState(); StateFunction0 fn; };
struct State1 : public IState { virtual void RunState(); StateFunction1 fn; };
void State0::RunState() { ((StateFunction2)fn())(1, 2); }
void State1::RunState() { ((StateFunction1)fn(1))(2); }

(The unusual casting is due to C++'s inability to create a recursive typedef. Returning a function pointer as a void* is common when programming state machines, for this reason.)

You might say, aren't there two methods there? Yes, but they compile to the exact same code, and the Visual C++ linker will collapse two functions that have the same code even if they do completely different things. Essentially, the correct ESP offset at the second CALL instruction can be either +8 or +4, depending on whether State0::RunState() or State1::RunState() was executing. Both of these are implementations of the same virtual call on the same interface, so knowing the parent function doesn't help; the only way you could tell is by examining the type of this by checking the vtable pointer, and unfortunately after the first CALL instruction this is no longer available (ECX is a volatile register in the thiscall calling convention). I'm pretty sure that this is unsolvable in the general case except by knowing the entire execution history of the program up to this point.
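
The arithmetic behind the two offsets can be replayed in a few lines. This is only a bookkeeping sketch of the disassembly shown earlier; ReplayRunState and SharedCodeTrace are names made up for the illustration.

```cpp
#include <cassert>

// Offsets are bytes from ESP up to RunState's own return address.
struct SharedCodeTrace {
    int atSecondCall;  // offset when the second CALL executes
    int atRet;         // offset when RET executes
};

// Replays the shared function body. The only inputs are how many argument
// bytes each of the two indirect callees removes with its "ret n".
SharedCodeTrace ReplayRunState(int firstCalleePops, int secondCalleePops) {
    int offset = 0;              // on entry, ESP points at the return address
    offset += 4;                 // push 2
    offset += 4;                 // push 1
    offset -= firstCalleePops;   // first "call eax" returns via "ret n"
    const int atSecondCall = offset;
    offset -= secondCalleePops;  // second "call eax" returns via "ret n"
    return {atSecondCall, offset};
}
```

ReplayRunState(0, 8) models State0::RunState() and gives an offset of 8 at the second CALL, while ReplayRunState(4, 4) models State1::RunState() and gives 4. Both arrive at RET with an offset of 0, which is why the same bytes are valid code for both methods.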

Moral of this story: Callee-pops calling conventions are absolutely evil with regard to accurate stack walks.

Comments


I wish VC++ fastcall was like GCC regparm(3). In my experience, using regparm(3) and FPO with GCC leads to very good code generation in terms of prolog and epilog, with the added bonus of easy debugging.

But whoever came up with VC++ fastcall and thiscall are morons. I just don't say the same thing about stdcall because that was probably a necessary evil in the days when memory was tight, but Microsoft still insists on using stdcall in new APIs. I think that's the reason why 32-bit Windows uses stack-based (yuck) exception handling instead of table-based!
But I can live with this VC++ brain damage stuff because most of the time I can rely on WinDbg, probably the best debugger I've ever used.

Anyway, in x64 "ret n" was put to death :)

nayart3 - 16 03 07 - 07:03


Oops, I mean in x64 usage of "ret n" ...

nayart3 - 16 03 07 - 07:06


But isn't it "just" the VC++ linker's fault, since it collapsed the functions? I mean, the functions happen to compile to the exact same machine instructions but obviously have different semantics.

macin - 28 03 07 - 11:10


Yes, you could say that -- the compiler and linker can be constrained to avoid making optimizations like this that prevent absolute determination of the stack offset. Unlike caller-pops (cdecl), however, you'd still have to encode significant debugging information to handle these cases. Visual C++ still doesn't encode enough to reliably unwind in the presence of FPO, even with /opt:noicf during the link.

Phaeron - 29 03 07 - 02:04


I wonder what Cygwin/gcc would output? Have you tried to see what would happen to VirtualDub's code with such a setup?

Mitch 74 - 29 03 07 - 04:50


What you'd get is a lot of errors. VirtualDub still uses some VC++isms in places, most notably inline assembly. GCC has some advantages over VC++, but in general I like the latter better. I've had some experience with GCC 4 professionally and in my experience, VC++ is less trouble.

Phaeron - 31 03 07 - 16:44
