An absolutely brilliant way to make sure buggy programs don't get fixed
I just spent the last hour or so trying to figure out why the newly added 64-bit configuration of my program didn't run, only to discover it was because Visual Studio happily linked in all *.manifest files it found in the project even though I don't see any docs or options mentioning this, and the manifest command line in project settings doesn't show it. Okay, we can deal with that, just set "exclude from build" and hope it works. Run it from the command line, and, yup, it now launches.
Crash. Oh well, the first launch never works anyway, probably a simple 64-bit porting issue. Visual Studio JIT dialog comes up, choose to debug.
To my horror, I then saw this dialog (this is Windows 7, btw):
Uh, WTF? I'm trying to debug the program, why would I want to enable a compatibility mode for it? And where's the cancel button??
I spent the next ten minutes trying to find out how to check a program's status or remove it from the compatibility list, and I couldn't find one. After a bit of spelunking in the Registry, I was horrified to find this:
A bit of explanation: user callbacks happen whenever the kernel-mode OS code needs to call back into the program in user space. The most common case of this is a window procedure, where the window manager running in kernel mode issues a callback to invoke the program's winproc code. I can't find concrete information on it, but what this mode appears to do is eat exceptions thrown from the user code, as if a try/catch with an empty handler were wrapped around it. And sure enough, when I ran the program, it didn't crash. It simply proceeded to silently not work instead. I could at least reproduce the crash under the debugger, but no longer out of it.
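To make the consequence concrete, here is a minimal Python sketch of what the mode appears to do. It is only a simulation of the behavior described above, not of the actual Windows mechanism: `wndproc`, `dispatch`, `dispatch_swallowing`, and `run` are hypothetical names, and no real Windows API is involved.

```python
def wndproc(state, message):
    """Hypothetical window procedure with a bug in its 'paint' handler."""
    if message == "paint":
        raise RuntimeError("bug: bad pointer in paint handler")
    state.append(message)


def dispatch(state, message):
    """Normal dispatch: an exception in the callback propagates to the caller."""
    wndproc(state, message)


def dispatch_swallowing(state, message):
    """Dispatch as the compatibility mode appears to behave: the callback is
    wrapped in a catch-all with an empty handler, so the failure vanishes."""
    try:
        wndproc(state, message)
    except Exception:
        pass  # exception eaten; the caller never learns the handler failed


def run(dispatcher, messages):
    """Feed messages through a dispatcher and return (handled, crashed)."""
    state = []
    try:
        for message in messages:
            dispatcher(state, message)
    except RuntimeError:
        return state, True
    return state, False
```

Running `run(dispatch, ["create", "paint", "close"])` stops at the faulty message and reports a crash, while the same messages through `dispatch_swallowing` report no crash even though the paint handler never completed. That is the "silently not work" outcome: the bug is still there, only the evidence is gone.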
If I get all of this correctly, I find this to be a terrible design. The reason for the PCA is understandable, but the execution is lousy. Not only is there no apparent way to tell Windows not to apply a compatibility mode, but it also permanently enables that behavior for the program. Not even deleting and recreating the program will work, because the path is now embedded in the Registry. There does appear to be a supported way to disable the PCA (Group Policy), but it looks like the Registry Editor is the only way to remove a program entry. This has a number of bad implications for developers, because it means that you will only see the bug once, your QA and beta testers will only see the bug once, and after you've shipped a buggy app that you think is OK, your end users will see the bug once. The end result is that programmers keep shipping broken code, because the bugs are being masked.
This is true and I have noticed it too. Unbelievably short-sighted. The fault-tolerant heap that is new in (I believe) Win7 is of the same nature: Heap bugs get masked by adding padding bytes to allocations and not freeing the last 1000 allocations.
The problem inside MS is not the idiot who decided this (he is just inexperienced). The problem is that there seems to be a lack of oversight on such important decisions. The fault-tolerant heap was even praised on Channel 9, so it must have gotten some attention internally.
tobi - 21 10 10 - 23:14
When it comes to making programs work, the often-cited concept of a Microsoft app store would come into play too: apps could be publicly penalized in the app store for having high crash rates, for example with a warning sign such as "This app might destabilize your system." _That_ would be a way to ensure quality.
tobi - 21 10 10 - 23:21
You can stop this happening by adding the Windows version compatibility stuff to your app's manifest.
See the (horrendously named) "Leveraging Feature Capabilities" section of this document for instructions:
Here is a full example manifest for added clarity (the manifest format is such a goddamn mess already that it's sometimes difficult to know where to put things and what to combine):
I don't know if this effect is explicitly documented but it definitely works. After adding it I have not had to go in and delete those registry values again.
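For readers who want something concrete, the following is a sketch of what a manifest with the compatibility section typically looks like, wrapped in a small Python check that could run as a build step. It is not the manifest from this comment: the manifest text is an assumed minimal example, the two `supportedOS` GUIDs are the published IDs for Windows Vista and Windows 7, and `supported_os_ids` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed minimal manifest; only the compatibility section is shown.
MANIFEST = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <compatibility xmlns="urn:schemas-microsoft-com:compatibility.v1">
    <application>
      <!-- Windows Vista -->
      <supportedOS Id="{e2011457-1546-43c5-a5fe-008deee3d3f0}"/>
      <!-- Windows 7 -->
      <supportedOS Id="{35138b9a-5d96-4fbd-8e2d-a2440225f93a}"/>
    </application>
  </compatibility>
</assembly>
"""

COMPAT_NS = "urn:schemas-microsoft-com:compatibility.v1"
WIN7_ID = "{35138b9a-5d96-4fbd-8e2d-a2440225f93a}"


def supported_os_ids(manifest_text):
    """Return the lower-cased supportedOS Id values declared in a manifest."""
    # Parse from bytes so the encoding declaration in the XML is honored.
    root = ET.fromstring(manifest_text.encode("utf-8"))
    tag = "{%s}supportedOS" % COMPAT_NS
    return [element.get("Id", "").lower() for element in root.iter(tag)]
```

A build script could then fail early with something like `assert WIN7_ID in supported_os_ids(MANIFEST)`, which at least catches the case where the section ends up in the wrong place or namespace.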
It is ridiculous that there isn't an easier way to prevent this happening with programs you are trying to debug. A checkbox (hell, even if it only showed when a debug build was detected) would be nice. Applying the settings automatically without so much as explaining how to remove them or how to prevent them coming back... it wasted many hours of my life before someone told me the solution.
The worst thing about all the app-compat stuff is that it can get turned on without you really realising it, and from then on you are not testing your app under the conditions that new users will run it. The first time they run your app it'll crash -- what a great first impression -- and after that it'll probably work.
Leo Davidson (link) - 22 10 10 - 00:43
By the way, I think there are two registry areas that you may need to clear of references to your exe (2nd one you already know):
HKEY_CURRENT_USER\Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\Compatibility Assistant\Persisted
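Clearing these entries by hand gets tedious during development, so here is a hedged Python sketch of how it might be scripted. Several things are assumptions rather than documented facts: that the second area is the `Layers` key under the same `AppCompatFlags` path, that each entry is a value named after the full path of the exe, and the helper names `clear_compat_entries` and `_delete_with_winreg`. The standard `winreg` module only exists on Windows, so it is imported lazily.

```python
APPCOMPAT = r"Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags"

# Keys under HKEY_CURRENT_USER that may reference the exe.
KEYS = [
    APPCOMPAT + r"\Compatibility Assistant\Persisted",
    APPCOMPAT + r"\Layers",  # assumed location of the Layers entries
]


def _delete_with_winreg(subkey, value_name):
    """Delete one value under HKCU; return True if it existed (Windows only)."""
    import winreg  # imported here because the module is Windows-only

    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, subkey, 0,
                            winreg.KEY_SET_VALUE) as key:
            winreg.DeleteValue(key, value_name)
        return True
    except FileNotFoundError:
        return False  # key or value not present, nothing to clear


def clear_compat_entries(exe_path, delete_value=_delete_with_winreg):
    """Remove values named after exe_path; return the keys that had one.

    Assumes each entry is stored as a value whose name is the full exe path.
    delete_value can be replaced with a fake for dry runs or tests.
    """
    return [subkey for subkey in KEYS if delete_value(subkey, exe_path)]
```

Passing a fake `delete_value` that only logs its arguments gives a dry run, which is worth doing before letting a script touch the Registry.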
Leo Davidson (link) - 22 10 10 - 00:47
Oops, the URLs in my first post got mangled. (The Preview Comment button seemed to lose the form text so I had to copy it out of the preview.)
Here they are:
(They'll still appear truncated, of course, but the links behind them should work now. :))
Leo Davidson (link) - 22 10 10 - 00:53
I noticed the Persisted entry too, but the Layers entry seems to be the one that actually activates the mode.
Ironically, what I was trying to do when I hit all of this was to add a manifest that includes the Win7 compatible annotation. The REALLY bad part is that it turned out to be caused by the thunk generator being totally broken on AMD64, which means there's also probably some UI within VirtualDub that is broken on 64-bit right now (and might have been getting masked by this).
The problem with tagging or banning programs that have high crash rates is that it doesn't work for programs that include third party plugins or modules. Any video apps would be particularly at risk, because some popular video codecs are not very robust and will crash at any problem with input data. Similarly, it would be a bad idea to ban Firefox due to hangs caused by the Adobe Reader plugin.
Phaeron - 22 10 10 - 07:31
Plugins: I did not consider that! I wonder how reliable faults can be traced to the module in practice. You could even let the app store team do the manual work and silently exclude programs with plugins in it. Or you let them still be flagged and create an incentive for out-of-process plugins.
tobi - 22 10 10 - 20:54
The whole PCA component can be disabled (on Windows 7 Professional, anyway), although it's a bit complicated.
If you open the Group Policy Editor and go to 'Administrative Templates > Windows Components > Application Compatibility' there is an option to "Turn off Program Compatibility Assistant".
That should help with development, I'd think, but I'm not sure if there are any cases where it actually helps to have PCA enabled.
Billy - 23 10 10 - 15:00
If they could only make Windows XP page more than 4 gigs of RAM.
Ah, we all have dreams.
evropej - 25 10 10 - 06:06
This PCA stuff is on par with Microsoft's history of "solving" problems. These guys make Bell (the telephone company before the breakup) look good in comparison.
a009 - 25 10 10 - 17:31
"That 32-bit editions of Windows Vista are limited to 4GB is not because of any technical constraint on 32-bit operating systems. The 32-bit editions of Windows Vista all contain code for using physical memory above 4GB. Microsoft just doesn’t license you to use that code."
The same applies to 32-bit versions of Windows XP after SP1. More info...
krick (link) - 28 10 10 - 09:29
"PAE mode can be enabled on Windows XP SP2, Windows Server 2003 SP1 and later versions of Windows to support hardware-enforced DEP. However, many device drivers designed for these systems may not have been tested on system configurations with PAE enabled. In order to limit the impact to device driver compatibility, changes to the hardware abstraction layer (HAL) were made to Windows XP SP2 and Windows Server 2003 SP1 Standard Edition to limit physical address space to 4 GB."
stanislav - 31 10 10 - 10:15
Yuhong Bao (link) - 01 12 11 - 06:42
Much frightening background info about this problem can be found here:
Apparently this stuff (propagating the exceptions into/out of kernel mode in native x64 apps) was broken in certain ways on Server 2003 and Vista. Then they "fixed" it for Windows 7, with the result that old apps that had been written against the old broken behaviour could now crash in new ways in which they did not crash on 2003 and Vista. Result: they added this PCA crap to make Windows 7 revert to the old silent-eating behaviour after the app's first crash of this type, unless your app jumps through the manifest hoops to tell it not to do that.
BTW, I found that link from Charles Bloom's blog, which also has many interesting posts over the years.
The link was on this page:
moo - 08 12 11 - 17:34