Beware of the CPU-specific optimizations

Every once in a while, I get a crash report whose diagnosis looks like this:

10116789: 0ff6c1          psadbw mm0, mm1      <-- FAULT

Crash context:
An integer SSE (Pentium III/Athlon) instruction not supported by the CPU was executed in module '****'...
...while decompressing video frame 0 with "********* Codec" [biCompression=********] (VideoSource.cpp:1772).

I blocked out the codec ID information so as to not single out a video codec manufacturer.

A crash like this usually means that the video codec you were attempting to use was compiled with CPU-specific optimizations that your CPU doesn't support, which generally means that your CPU is below the minimum requirements for the codec. Unlike ordinary CPU speed requirements, a missing instruction doesn't mean the codec will run really slowly -- it means the codec won't work at all. There's nothing I can do about this in VirtualDub; if you're seeing something like this and are indeed below the minimum spec, you need to either upgrade your CPU or beg the codec vendor to support it (if it is actually fast enough to handle the video format).

Note that crashing here also means that the codec didn't properly check for the availability of special CPU instructions before attempting to use them. This is unwise from a customer support standpoint, and I would encourage adding CPU detection code and an error dialog instead. One trap that a lot of coders fall into is using the Pentium Pro conditional move instructions (CMOVcc) on the assumption that they are always available, since no one will be using a CPU below 300MHz. Unfortunately, the AMD K6 series of CPUs doesn't support these instructions and is available at least as high as 400MHz, so this is a bad assumption. Similarly, Athlons exist as fast as 1GHz that don't support SSE. Also, you would be surprised how slow a system people will try your code on; I got a crash report recently from someone who tried using modern video codecs on a Pentium without MMX!
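A minimal sketch of the "detect and report instead of crashing" idea, assuming the EDX value from CPUID leaf 1 has already been read by the caller. The function name, the constants, and the message text are hypothetical and only illustrate the approach; the bit positions (15 for CMOV, 23 for MMX, 25 for SSE) are the standard leaf 1 feature flags.

```cpp
#include <cstdint>
#include <string>

// Feature bits in EDX as returned by CPUID leaf 1.
constexpr std::uint32_t kCpuidCmov = 1u << 15;  // CMOVcc
constexpr std::uint32_t kCpuidMmx  = 1u << 23;  // MMX
constexpr std::uint32_t kCpuidSse  = 1u << 25;  // SSE

// Hypothetical helper: returns an empty string when every required bit is
// set in leaf1Edx, otherwise a message naming the missing instruction sets.
std::string describeMissingFeatures(std::uint32_t leaf1Edx, std::uint32_t required) {
    struct Named { std::uint32_t bit; const char* name; };
    static const Named kNames[] = {
        { kCpuidCmov, "CMOV" }, { kCpuidMmx, "MMX" }, { kCpuidSse, "SSE" },
    };

    std::string missing;
    for (const Named& n : kNames) {
        // A feature is missing if it is required but not reported by the CPU.
        if ((required & n.bit) && !(leaf1Edx & n.bit)) {
            if (!missing.empty()) missing += ", ";
            missing += n.name;
        }
    }
    if (missing.empty()) return missing;
    return "This codec requires CPU features that are not available: " + missing;
}
```

A codec could call something like this once at load time and show the returned text in its error dialog, rather than letting the first unsupported instruction fault.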

Embrace the CPUID instruction. CPUID is your friend. If you are targeting the integer SSE instructions -- such as pshufw, psadbw, pavgb, pavgw, and movntq -- remember to check for either the SSE bit (for PIII and Athlon XP or higher) or the 3DNow! extensions bit (for the original Athlon).
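The rule above could be sketched roughly as follows. This is an illustration under stated assumptions, not VirtualDub's actual detection code: the function and constant names are made up, the check follows the post's rule literally (SSE bit 25 of CPUID leaf 1 EDX, or the AMD 3DNow! extensions bit 30 of extended leaf 0x80000001 EDX), and the wrapper relies on the `__get_cpuid` helper from GCC/Clang's `<cpuid.h>`, so it is compiled only for x86 targets on those compilers. A stricter version would confirm the vendor before trusting the AMD-specific bit.

```cpp
#include <cstdint>

constexpr std::uint32_t kLeaf1EdxSse    = 1u << 25;  // CPUID leaf 1, EDX: SSE
constexpr std::uint32_t kExtEdx3DNowExt = 1u << 30;  // CPUID leaf 0x80000001, EDX: 3DNow! extensions (AMD)

// Pure decision function: integer SSE is treated as usable if either
// the SSE bit or the 3DNow! extensions bit is reported.
bool hasIntegerSse(std::uint32_t leaf1Edx, std::uint32_t extLeafEdx) {
    return (leaf1Edx & kLeaf1EdxSse) != 0 || (extLeafEdx & kExtEdx3DNowExt) != 0;
}

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

// Reads the two CPUID leaves and applies the rule. A leaf that the CPU
// does not implement is treated as reporting no features.
bool detectIntegerSse() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    std::uint32_t leaf1Edx = 0, extEdx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) leaf1Edx = edx;
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx)) extEdx = edx;
    return hasIntegerSse(leaf1Edx, extEdx);
}
#endif
```

Keeping the bit test separate from the CPUID read makes the rule easy to check against known register values from CPUs that are no longer on hand.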

In VirtualDub, I generally write scalar C versions of processing routines and then keep those around even after writing CPU-specific, assembly-optimized versions. I do this for two reasons: one is for compatibility with all CPUs, and another is so that I don't have to write the AMD64 version immediately. (The AMD64 compiler doesn't support inline assembly, which was a pain at least during the initial port.) The scalar code also serves as a reference test for the optimized code. VirtualDub queries for CPU capabilities at the start of an operation and automatically chooses the appropriate optimized routine.
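The pattern might look something like the sketch below. It is not VirtualDub's code: all names are hypothetical, the routine (a sum of absolute differences, the operation psadbw performs) is chosen only for illustration, and the "optimized" variant is a portable stand-in that walks 8 bytes at a time where a real build would call an assembly routine.

```cpp
#include <cstddef>
#include <cstdint>

using SadFn = std::uint32_t (*)(const std::uint8_t*, const std::uint8_t*, std::size_t);

// Scalar reference version: works on every CPU and defines the expected result.
std::uint32_t sadScalar(const std::uint8_t* a, const std::uint8_t* b, std::size_t n) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += a[i] > b[i] ? std::uint32_t(a[i] - b[i]) : std::uint32_t(b[i] - a[i]);
    return sum;
}

// Stand-in for a CPU-specific version: handles 8-byte groups, the width of
// an MMX register, and leaves the tail to the scalar routine.
std::uint32_t sadBlocked8(const std::uint8_t* a, const std::uint8_t* b, std::size_t n) {
    std::uint32_t sum = 0;
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        for (std::size_t j = 0; j < 8; ++j) {
            int d = int(a[i + j]) - int(b[i + j]);
            sum += std::uint32_t(d < 0 ? -d : d);
        }
    }
    return sum + sadScalar(a + i, b + i, n - i);
}

// Chosen once at the start of an operation, based on detected CPU capabilities.
SadFn chooseSad(bool integerSseAvailable) {
    return integerSseAvailable ? sadBlocked8 : sadScalar;
}
```

Because both routines share one signature, a test can run the same buffers through each and compare the results, which is the reference-test role the scalar code plays.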

Comments

Comments posted:


The AMD K6 series reached 550MHz. You can still buy a 533MHz K6-II on Pricewatch.

Cyberia - 17 12 04 - 21:09


Not quite so simple. CPUID SSE detection is flawed on Athlon processors because the motherboard can disable SSE support.

A bit of a history lesson... Many years ago, when 3DNow! was popular and SSE was the new kid on the block, some programs had code paths for both instruction sets. In general, most applications will use the newest instruction set: if a processor supports SSE, then that code path will be used; otherwise the program will check for 3DNow!, then MMX+, then MMX, and gracefully fall back to that support level.

Some astute motherboard manufacturers found that on the newest Athlons the 3DNow! path was faster than SSE in certain programs, and that they could get a small but noticeable improvement in the common benchmarks by disabling SSE in the CPU. In other words, they could appear to have a competitive advantage over other manufacturers by killing this feature set.

All of a sudden you end up in a situation where there are three classes of Athlon BIOS. Those that leave SSE alone at all times (enabled), those that provide a user option to toggle SSE support, and finally those that force SSE off and give no option to the user. In this case the only option is to use a program like WCPUID to enable it each and every time the PC boots. Many companies have support pages to cover this such as: http://www.adobe.com/support/techdocs/32..

Anyway, this issue has been a pain for many developers big and small, not to mention pissing AMD off too. Fast forward to 2004, where 3DNow! is dead (removed in 64-bit) and there is little reason for developers to implement it. So instead they just do an SSE path that should cover all processors for the last 4 or 5 years. And all of a sudden their support gets a flood of irate emails from people with brand new Athlon 2GHz+ machines that are being told they are missing features...

So the question you now have to ask yourself is how many VirtualDub users with Athlon CPUs have had their performance crippled over the past few years because they've been forced to use older code paths. Maybe you should add an "Always assume SSE" option to the program ;)

Ok maybe it's not that bad. You can ignore the feature bits and build your own tables based upon the CPU manufacturer/family/model/stepping. However I certainly understand why some companies have just put the onus on the end user to make sure they meet minimum specs.

anon - 19 12 04 - 02:24


Interesting, I didn't know this. How lame. Fortunately, VirtualDub has very little floating-point SSE code, which is where the SSE/3DNow conflict occurs; it has some code that uses the integer set, which CPUID will still flag as available as long as 3DNow Extensions is enabled, but that is only a minor improvement over the MMX path. Floating-point SSE simply has low throughput compared to integer MMX. In the end, I'm not too worried about it.

Btw, it isn't safe to assume SSE support just from CPU model identification, because the OS actually has to set a bit in the CPU to indicate support for context-switching the extra registers. You have to also actually execute an SSE instruction and possibly catch the exception. You can't just check CPUID or model ID because the OS may not have support for saving/restoring the registers (some of us aren't lazy and can still run on Windows 95), and you can't just test-execute an SSE FP instruction because older CPUs might simply mis-execute the instructions as they look like older instructions with a size override or REPZ/REPNZ prefix.

Phaeron - 19 12 04 - 02:47


Another example, BTW, is PAE. You'd think that, since all Intel processors since the Pentium Pro have supported PAE, CPUs that do not support PAE would nowadays be completely obsolete. Except that AMD's CPUs did not implement PAE until the Athlon, VIA's CPUs did not support PAE until the VIA C7, and Transmeta's older Crusoe CPU did not support PAE; only the newer Efficeon did. In fact, even AMD's own Geode GX and LX, and the versions of Intel's own Pentium M and Celeron M without NX support, did not support PAE! I think NX was what pushed VIA and Transmeta to add PAE support to their CPUs. Also, even if the CPU problem were solved, I am sure there are some buggy BIOSes that can crash when PAE is on. That is why most Linux distros do not make the PAE kernel the default just to take advantage of NX. Windows by default automatically selects a kernel based on whether PAE is going to be on or not; Linux can't do this. BTW, see this blog article for more info on the relationship between PAE and NX in Windows XP SP2 and later:
http://yuhong386.spaces.live.com/blog/cn..

Yuhong Bao - 09 02 08 - 18:31


Also, older versions of VMware and even the current versions of Microsoft Virtual PC and Parallels do not support PAE.

Yuhong Bao - 13 02 08 - 22:18
