Current version

v1.10.4 (stable)


Main page
Archived news
Plugin SDK
Knowledge base
Contact info
Other projects


Blog Archive

Wikipedia and AVI fractions

There seems to have been a number of attacks on Wikipedia lately with regard to the credibility of its articles. Personally, I think Wikipedia deserves a lot of credit for the quality of its articles despite the problems inherent in allowing public posting and editing. Some classes of articles I would definitely not trust without double-checking sources; I'm betting that the article on George W. Bush will be either still protected or in an edit war when I die. With regard to science and math articles, however, Wikipedia tends to be a very excellent starting source. Sometimes when I get bored, I start digging around the Internet for information on the most random of topics, and frankly, Wikipedia can give me the most convenient and densest amount of reasonably correct information on polystyrene in short notice.

My main problem with Wikipedia now is that it doesn't have a convenient offline form; it was really annoying when I was without Internet for a couple of days and couldn't get to the continued fractions page that I had forgotten to locally save. It's possible to download the database dumps and import them into a local Mediawiki, but that seems like a lot of work. Personally, I'd love to pay the Wikimedia Foundation for a personal dump of the pages on a DVD, but I know that some contributors are upset with some potential plans for doing so.

Another good example of Wikipedia's helpfulness would be how to handle fractions.

The AVI format stores the sample rate for all streams in rational form — that is, as the ratio of two 32-bit unsigned integers. This means that a number of frame rates that would otherwise be inaccurately stored as a single integer can be stored exactly, most notably the annoying 30000/1001 frame rate of NTSC video. The need to manipulate fractions, however, necessitates arithmetic routines that are a bit more complicated than the standard scalar routines. Everyone knows the rules for handling basic fraction multiplication and division, but there are some corner cases that require much nastier algorithms to solve. I noted this in passing on one of my earlier blog entries, and a reader helpfully pointed me to the Wikipedia article for continued fractions, of which I had never heard before but offered a direct solution to the problem.

The AVI fraction problem

As I said earlier, AVI fractions are stored as two 32-bit unsigned integers in the header, a numerator (dwRate) and a denominator (dwScale). dwRate/dwScale exactly defines the rate of the stream in samples per second. The astute reader will immediately ask what it means if dwScale=0. I don't know, but I suspect most AVI applications will hork, so don't do that.

This is fine if you just want to interpret the stream or output another with the same rate, but if you want to actually modify the stream rate, then you need fraction-safe arithmetic. Really early versions of VirtualDub used to double dwScale in order to halve the frame rate without checking for overflow, which of course was a bad idea — it is perfectly legal to use a fraction of FFFFFFFF/FFFFFFFF in the stream header. I think I then added code to check for dwScale overflow and halve dwRate instead, which avoids catastrophic failure but can create inaccurate fractions. Sometime around the 1.6 era I bit the bullet and wrote a fraction library instead to properly solve the problem.

As a side note, using these fractions can be tricky. A large AVI file can have more than 2^32 audio samples, so at least a 96-bit intermediate result is required to correctly convert between sample numbers of different streams or between sample numbers and time. VirtualDub uses a custom set of assembly language routines to do this efficiently.

Multiplying fractions

Multiplying a fraction (A/B) by (C/D) is straightforward: the result is AC/BD. Well, almost. It's simple until you realize that if all inputs are 32-bit unsigned, the output values are 64-bit unsigned, and the resultant fraction needs to be reduced to use 32-bit values. As I said before, it's perfectly legal to use FFFFFFFF/FFFFFFFF to indicate one sample/second, so the ostrich method (stick your head in the sand and hope no one uses such huge values in practice) is not viable.

Fortunately, there is usually at least one common factor in the resultant fraction such that reducing it to lowest terms is sufficient. To do this, the greatest common factor (GCF) of the numerator and denominator needs to be computed. 32-bit fractions are far too big to brute force, so one way to attack this is with Euclid's GCD algorithm. Euclid's GCD algorithm uses the fact that, given two numbers A and B which have a common factor greater than 1 and where A > B, the value C = A - nB has the same common factors. The GCF can then be found by repeatedly replacing (A, B) with (B, A mod B). You wouldn't want to do this all of the time, since it involves a number of divisions and could take thousands of cycles, but it's still fairly fast nevertheless.

There's still the problem of how to handle the case where the fraction can't be exactly reduced two 32/32 form, however. In current versions I handle this by crudely divide numerator and divisor by powers of two until it fits. This introduces a slight sync error into the output, but the maximum error by this is 2^-31, which is usually neglegible. (I believe DirectShow already reduces the fraction to a frame period with 100ns precision during playback, which is a much larger potential error.) However, we can do slightly better than that.

Continued fractions to the rescue

Continued fractions provide the answer. Basically, continued fractions are an alternate way to represent numbers, with the very significant bonus that truncating the continued fraction gives the best possible rational approximation to that number. (I've seen some comments that this is not always the case, but the range of vulnerability looks small enough to be acceptable.) The pertinent algorithm for fractional approximation using continued fractions starts generating successively closer fractional approximations with larger and larger denominators, and stops when the denominator goes over the limit. It's fairly fast, too, from what I can tell.

The trick is what to do when the denominator limit of 2^32 is hit. It turns out that when the algorithm stops, neither the last fraction before 2^32 or the next one over it may be the best approximation for that range; the fraction then has to be backstepped to an intermediate denominator to determine the best one. I found several places on the Internet that described how to decrement the last component of the continued fraction to determine alternates, but the Wikipedia article was the only source I found that even mentioned the ½ak admissibility test that is crucial for determing the lower limit without actually testing if the resultant fraction is closer than the last truncated result. I'm sure I would have encountered it if I had read the appropriate books, but those are obviously not nearly as convenient.

Here's a program that implements the continued fraction based algorithm for rational approximations. It's written in new-style managed C++, so you will need Visual Studio 2005 or VC++ Express 2005 to compile it. There wasn't any particular reason for using MC++ except that I wanted to try it and see how improved it was over the old VS2003 syntax. It's definitely a lot less underlineful, but it didn't have any real advantage here than if I'd just used straight ANSI C/C++.

#using <mscorlib.dll>
using namespace System;
int main(array<String^>^ args) {
    try {
        __int64 nx, dx;
        __int64 limit = Int64::Parse(args[0]);
        if (args->Length == 2) {
            union {
                double d;
                unsigned __int64 i;
            } conv;
            conv.d = Double::Parse(args[1]);
            nx = (conv.i & 0xFFFFFFFFFFFFF) + 0x10000000000000;
            dx = 0x10000000000000;
            int exp = (conv.i >> 52) - 1023;
            if (exp < 0)
                dx <<= -exp;
                dx >>= exp;
        } else {
            nx = Int64::Parse(args[1]);
            dx = Int64::Parse(args[2]);
        // Algorithm from Wikipedia, Continued Fractions:
        __int64 n0 = 0;
        __int64 d0 = 1;
        __int64 n1 = 1;
        __int64 d1 = 0;
        __int64 fp = 0;
        double orig = (double)nx / (double)dx;
        for(;;) {
            __int64 a = nx/dx;
            __int64 f = nx - dx*a;
            __int64 n2 = n0 + n1*a;
            __int64 d2 = d0 + d1*a;
            if (d2 > limit) {
                // reduce last component until denominator is within range
                __int64 a2 = (limit - d0)/d1;
                if (a2 < (a+1)/2)
                // 1/2a_k admissibility test
                if (a2*2 == a && d0*fp <= f*d1)
                a = a2;
                n2 = n0 + n1*a2;
                d2 = d0 + d1*a2;
                f = 0;
            double approx = (double)n2 / (double)d2;
            Console::WriteLine("{0} | {1}/{2} = {3} (err={4})", a, n2,
d2, approx, Math::Abs(approx - orig)); if (!f) break; n0 = n1; n1 = n2; d0 = d1; d1 = d2; fp = f; nx = dx; dx = f; } } catch(Exception^ e) { Console::WriteLine(e); } }

The first argument to the program is the denominator limit; follow that with either a real number or a 32-bit integer numerator and denominator. I plan to use a similar algorithm to improve the fraction accuracy in VirtualDub, but I probably won't add it until the next major feature release.

If you try real number inputs you'll very quickly see the downsides to converting real numbers using this algorithm, at least with regard to frame rates. Running "frac 4294967295 29.97," for instance, gives the both exact and obviously useless answer of 2997/100. You have to enter in at least 29.97003 before 30000/1001 even appears as an approximation, and 29.97002997003 before it becomes the closest within 32-bit range. This makes the algorithm somewhat unuseful for converting decimal frame rates to fractions for UI purposes.


This blog was originally open for comments when this entry was first posted, but was later closed and then removed due to spam and after a migration away from the original blog software. Unfortunately, it would have been a lot of work to reformat the comments to republish them. The author thanks everyone who posted comments and added to the discussion.