Beware of the CPU-specific optimizations

Every once in a while, I get a crash report whose diagnosis looks like this:

10116789: 0ff6c1          psadbw mm0, mm1      <-- FAULT

Crash context:
An integer SSE (Pentium III/Athlon) instruction not supported by the CPU was executed in module '****'...
...while decompressing video frame 0 with "********* Codec" [biCompression=********] (VideoSource.cpp:1772).

I blocked out the codec ID information so as to not single out a video codec manufacturer.

A crash like this usually means that the video codec you were attempting to use was compiled with CPU-specific optimizations that your CPU doesn't support. This generally means that your CPU is below the minimum requirements for the codec. Unlike ordinary CPU speed requirements, missing instructions don't mean the codec will run really slowly -- they mean the codec won't work at all. There's nothing I can do about this in VirtualDub; if you're seeing something like this and are indeed below the minimum spec, you need to either upgrade your CPU or beg the codec vendor to support it (if it is actually fast enough to handle the video format).

Note that crashing here also means that the codec didn't properly check for the availability of special CPU instructions before attempting to use them. This is unwise from a customer support standpoint, and I would encourage adding CPU detection code and an error dialog instead. One trap that a lot of coders fall into is using the Pentium Pro conditional move instructions (CMOVcc) on the assumption that they are always available, since no one will be using a CPU below 300MHz. Unfortunately, the AMD K6 series of CPUs doesn't support these instructions and is available at least as high as 400MHz, so this is a bad assumption. Similarly, Athlons exist as fast as 1GHz that don't support SSE. Also, you would be surprised how slow a system people will try your code on; I got a crash report recently from someone who tried using modern video codecs on a Pentium without MMX!

Embrace the CPUID instruction. CPUID is your friend. If you are targeting the integer SSE instructions -- such as pshufw, psadbw, pavgb, pavgw, and movntq -- remember to check for either the SSE bit (for PIII and Athlon XP or higher) or the AMD extensions-to-MMX bit in the extended feature flags (for the original Athlon, which introduced it alongside the 3DNow! extensions).
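As a rough illustration of the kind of check described above, here is a minimal sketch in C. It is not VirtualDub's actual detection code; the flag names and function names are made up for this example. It assumes the documented CPUID bit positions (leaf 1 EDX bit 15 for CMOV, bit 23 for MMX, bit 25 for SSE) and assumes that on AMD parts the integer SSE subset is reported by bit 22 of EDX in extended leaf 0x80000001 ("AMD extensions to MMX"). The query itself uses the `__get_cpuid` helper from the GCC/Clang `<cpuid.h>` header; other compilers would need their own equivalent.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  /* GCC/Clang helper header; an assumption of this sketch */
#endif

/* Hypothetical flag names, invented for this example. */
enum {
    FEAT_CMOV        = 1 << 0,
    FEAT_MMX         = 1 << 1,
    FEAT_INTEGER_SSE = 1 << 2,  /* pshufw, psadbw, pavgb, pavgw, movntq */
    FEAT_SSE         = 1 << 3
};

/* Decode EDX from CPUID leaf 1 and EDX from extended leaf 0x80000001.
   Pass ext_edx = 0 when the CPU is not an AMD part. */
unsigned decode_features(uint32_t std_edx, uint32_t ext_edx)
{
    unsigned f = 0;
    if (std_edx & (1u << 15)) f |= FEAT_CMOV;
    if (std_edx & (1u << 23)) f |= FEAT_MMX;
    if (std_edx & (1u << 25)) f |= FEAT_SSE | FEAT_INTEGER_SSE;
    if (ext_edx & (1u << 22)) f |= FEAT_INTEGER_SSE;  /* AMD extensions to MMX */
    return f;
}

/* Query the running CPU. Returns 0 (no optional features) on non-x86
   builds or when CPUID is unavailable, so callers fall back to plain code. */
unsigned detect_features(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned eax, ebx, ecx, edx;
    uint32_t std_edx = 0, ext_edx = 0;
    int is_amd = 0;

    /* Leaf 0 returns the vendor string in EBX, EDX, ECX: "AuthenticAMD". */
    if (__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        is_amd = (ebx == 0x68747541u && edx == 0x69746e65u &&
                  ecx == 0x444d4163u);
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        std_edx = edx;
    /* Only trust the extended bit on AMD CPUs. */
    if (is_amd && __get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        ext_edx = edx;
    return decode_features(std_edx, ext_edx);
#else
    return 0;
#endif
}
```

A codec would call something like `detect_features()` once at load time and, if a required flag such as `FEAT_INTEGER_SSE` is missing, report an error instead of executing the instruction and faulting.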

In VirtualDub, I generally write scalar C versions of processing routines and then keep those around even after writing CPU-specific, assembly-optimized versions. I do this for two reasons: one is for compatibility with all CPUs, and the other is so that I don't have to write the AMD64 version immediately. (The AMD64 compiler doesn't support inline assembly, which was a pain at least during the initial port.) The scalar code also serves as a reference test for the optimized code. VirtualDub queries for CPU capabilities at the start of an operation and automatically chooses the appropriate optimized routine.
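The pattern in the previous paragraph might look roughly like the following sketch. This is not VirtualDub's code: the routine (a sum of absolute differences, the operation psadbw performs), the function names, and the use of SSE2 intrinsics in place of hand-written MMX/integer SSE assembly are all choices made for this illustration. It also assumes a GCC- or Clang-style compiler for the `__builtin_cpu_supports` runtime check; on anything else the sketch simply uses the scalar routine.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

typedef uint32_t (*sad_fn)(const uint8_t *a, const uint8_t *b, size_t n);

/* Scalar reference version: runs on any CPU and doubles as the
   known-good answer when testing optimized versions. */
uint32_t sad_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += (uint32_t)abs((int)a[i] - (int)b[i]);
    return sum;
}

#if defined(__SSE2__)
/* Optimized version: 16 bytes per iteration via the 128-bit psadbw. */
uint32_t sad_sse2(const uint8_t *a, const uint8_t *b, size_t n)
{
    __m128i acc = _mm_setzero_si128();
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        acc = _mm_add_epi64(acc, _mm_sad_epu8(va, vb));
    }
    /* psadbw leaves one partial sum in each 64-bit half; add them. */
    uint32_t sum = (uint32_t)_mm_cvtsi128_si32(acc) +
                   (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(acc, 8));
    /* Leftover bytes go through the scalar routine. */
    return sum + sad_scalar(a + i, b + i, n - i);
}
#endif

/* Pick a routine once, at the start of an operation. */
sad_fn select_sad(void)
{
#if defined(__SSE2__) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return sad_sse2;
#endif
    return sad_scalar;
}
```

The caller stores the result of `select_sad()` in a function pointer and uses it for the whole operation. Comparing its output against `sad_scalar` on the same buffers is a cheap way to use the scalar code as the reference test.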
