§ It's good to know I'm not crazy

Going through my weekly backlog of email, I found a crash report on this assembly code:

004e19f3: 0f73d430        psrlq  mm4, 30h
004e19f7: 0f7ee0          movd   eax, mm4
004e19fa: 0fe504c5d89f5c  pmulhw mm0, [eax*8+005c9fd8]      <-- FAULT
          00

This is the division approximation code for the temporal smoother filter in VirtualDub. It essentially computes:

result.rgb = color.rgb * div_table[sum >> 48];

The crash was an access violation, indicating a bad pointer. Problem number one is that, the way the code's structured, the table index never exceeds 128. Problem number two:

EAX = 00800000

Extracting the top 16 bits of a 64-bit unsigned quantity gave a value bigger than 0xFFFF. That's... not possible.
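To make the "not possible" part concrete, here is a minimal C++ sketch of the same index computation. The names `table_index`, `plausible_index`, and `checked_index` are hypothetical stand-ins for illustration, not VirtualDub code; the only thing taken from the listing above is the shift by 48 (30h) followed by a move of the low 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the index computation: keep only the top 16 bits
// of a 64-bit unsigned sum, as "psrlq mm4, 30h" followed by "movd eax, mm4" does.
constexpr std::uint32_t table_index(std::uint64_t sum) {
    return static_cast<std::uint32_t>(sum >> 48);
}

// Even the largest possible 64-bit input yields at most 0xFFFF.
static_assert(table_index(UINT64_MAX) == 0xFFFFu, "top 16 bits fit in 16 bits");

// True when a register value could have been produced by table_index().
constexpr bool plausible_index(std::uint32_t eax) {
    return eax <= 0xFFFFu;
}

// Hypothetical guarded variant: the assert cannot fire on working hardware.
inline std::uint32_t checked_index(std::uint64_t sum) {
    const std::uint32_t idx = table_index(sum);
    assert(plausible_index(idx));
    return idx;
}
```

With these definitions, `plausible_index(0x00800000u)` is false: no 64-bit input can produce the EAX value from the crash report.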

I couldn't figure out how this could happen, so I wrote back to the user asking if the crash was reproducible. As it turned out, he'd already diagnosed the problem: bad RAM. My guess is that the OS had done a context switch in the middle of these instructions, giving the opportunity for EAX to be dumped to memory and be corrupted. Sometimes the impossible does actually happen... well, at least when hardware failure is involved.

(Read more....)

§ Field order confusion in VirtualDub

Why are the "even field first" and "odd field first" labels in VirtualDub reversed?

Well, actually, they're not, depending on how you label scan lines.

The even field in VirtualDub corresponds to the field that has the upper set of scan lines, and the odd field is the one with the lower set. The reason is that internally VirtualDub numbers scan lines so that the top scan line is scan line 0. That means the even field consists of 0, 2, 4... and the odd field is 1, 3, 5..., meaning that the even field is positioned higher than the odd field. If you consider scan lines to be numbered starting from 1, then this would be backwards and thus confusing. When this issue was raised, I did some searching around and didn't see a clear consensus on scan line numbering, so the plan is to abandon even/odd terminology in the UI and just use top/bottom instead across the board.
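The numbering issue can be sketched in a few lines of C++. The types and function names here are hypothetical and exist only to show how the same physical scan line gets opposite even/odd labels under the two conventions, while the top/bottom label stays put.

```cpp
#include <cassert>

enum class Field { Top, Bottom };
enum class Parity { Even, Odd };

// row is the physical scan line position, counted from 0 at the top of the frame.
inline Field field_of_row(int row) {
    assert(row >= 0);
    return (row % 2 == 0) ? Field::Top : Field::Bottom;
}

// Label when scan lines are numbered from 0 (the internal convention described above).
inline Parity parity_zero_based(int row) {
    assert(row >= 0);
    return (row % 2 == 0) ? Parity::Even : Parity::Odd;
}

// Label when the same scan lines are numbered from 1 instead.
inline Parity parity_one_based(int row) {
    assert(row >= 0);
    return ((row + 1) % 2 == 0) ? Parity::Even : Parity::Odd;
}
```

The topmost line is always in the top field, but it is "even" when counting from 0 and "odd" when counting from 1, which is why top/bottom is the less ambiguous pair of terms.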

If you're confused about the field order of a clip, the best way to check it is to use bob deinterlacing to double it to field rate and then check if you get motion juddering. The mode that gives you the smooth output is the correct one. You can do this in VirtualDub via the "bob doubler" filter in 1.8.0+, or the "deinterlace" filter in 1.9.2+. Unfortunately, there are a few places where I've goofed the field order settings at times; the bob doubler had this backwards until 1.8.2, and I've just been informed that it's currently backwards in the new IVTC filter in 1.9.x. I'm working on making sure everything's correct for 1.9.3.
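For readers unfamiliar with bob deinterlacing, the idea can be sketched as follows. This is a deliberately crude line-doubling version under stated assumptions (a hypothetical 8-bit grayscale `Frame` type, rows stored top to bottom); it is not how VirtualDub's filters are implemented, and real bob filters usually interpolate rather than repeat lines.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical frame type: 8-bit grayscale, rows stored top to bottom.
struct Frame {
    std::size_t width = 0, height = 0;
    std::vector<std::uint8_t> pixels;  // width * height bytes
};

// Build a full-height frame from one field by repeating each field line.
// firstRow is 0 for the top field and 1 for the bottom field.
inline Frame line_double_field(const Frame& src, std::size_t firstRow) {
    assert(firstRow < 2 && src.width > 0 && src.height >= 2);
    assert(src.pixels.size() == src.width * src.height);
    Frame out = src;
    for (std::size_t y = 0; y < src.height; ++y) {
        // Nearest source row that belongs to the chosen field.
        std::size_t sy = (y & ~std::size_t{1}) + firstRow;
        if (sy >= src.height) sy -= 2;  // odd height: step back to the last row of this field
        std::copy_n(src.pixels.data() + sy * src.width, src.width,
                    out.pixels.data() + y * src.width);
    }
    return out;
}

// One interlaced frame in, two progressive frames out, in display order.
inline std::pair<Frame, Frame> bob(const Frame& src, bool topFieldFirst) {
    Frame top = line_double_field(src, 0);
    Frame bottom = line_double_field(src, 1);
    return topFieldFirst ? std::make_pair(top, bottom) : std::make_pair(bottom, top);
}
```

Playing back the output of `bob` with the wrong `topFieldFirst` value shows the two fields of each frame in reversed time order, which is the motion judder described above.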

(Read more....)

§ Making a hash of floating point numbers

I've always thought that hash tables were well named, because often when you see how people have used them you wonder what they were smoking at the time. Often the problem revolves around a mistaken notion that switching a binary search tree for a hashed container bestows some sort of magical constant time lookup behavior, but sometimes it's more subtle. One case has to do with the choice of hash function.

The hash function for a hashed container converts a key to a bucket index with the intent of trying to distribute data items as evenly as possible. Given a decent distribution for input values, the hash function for an integral key can be as simple as just using the integer value itself, with the container then applying a modulus operation to wrap it within the bucket count. Anyone who's gone down this route, however, then discovers the problem of trying to do this for a key that is of floating point type. Usually the first thing they try is something like this:

size_t operator()(float value) const {
    return (size_t)(value * 100);
}

This is unfortunately usually fairly slow due to poor performance in the float-to-int conversion. There's also the matter of slightly worse behavior around zero due to truncation toward zero instead of negative infinity.
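The behavior around zero is easiest to see side by side. The sketch below uses hypothetical helper names and converts to a signed 32-bit integer rather than `size_t` so that negative inputs stay well defined; it assumes finite inputs small enough that `value * 100` fits in that type.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Truncating conversion, as in the cast above: rounds toward zero.
inline std::int32_t scaled_trunc(float value) {
    assert(std::isfinite(value));
    return static_cast<std::int32_t>(value * 100.0f);
}

// Flooring conversion: rounds toward negative infinity.
inline std::int32_t scaled_floor(float value) {
    assert(std::isfinite(value));
    return static_cast<std::int32_t>(std::floor(value * 100.0f));
}
```

With truncation, both `-0.005f` and `0.005f` land on 0, so the bucket around zero covers roughly twice the range of its neighbors. With flooring, `-0.005f` maps to -1 and every bucket covers the same width.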

At this point, the inclination is probably to just give up and either deal with it or use a different container. Others go "aha!" and use this hack instead:

size_t operator()(float value) const {
    return (size_t)*(const unsigned int *)&value;
}

This code uses the bit pattern of the float as the hash value. Yeah, it's non-portable. It's also got problems with the aliasing rules of the C language. In the not so unusual case of being able to depend on a 32-bit integral type and IEEE single precision floating point, though, it's a really neat and fast trick. And, sadly, it's also wrong. If you've done this or thought about it, don't feel bad. The .NET Framework team almost made this mistake, too.
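On the aliasing point specifically, the usual way to read a float's bit pattern without the pointer cast is to copy the bytes. The sketch below does only that, under the same assumption of a 32-bit IEEE single precision `float`; `float_bits` is a hypothetical helper name, and the result is the same bit-pattern value as the hack above, so it inherits whatever correctness problem the hack has.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Copies the object representation of a float into an unsigned integer.
// Compilers typically turn this memcpy into a single register move.
inline std::uint32_t float_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

On an IEEE 754 platform, `float_bits(1.0f)` is `0x3F800000`.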

Reason after the jump.

(Read more....)

§ Exploring CUDA

After experimenting with pixel shaders for video acceleration, I decided to give NVIDIA's CUDA a shot to see how it performed on the same hardware. As I've noted before, Direct3D 9 can be used for video acceleration, but has some performance problems in specific areas. CUDA uses a more general programming model and also provides more direct access to the video driver, making it potentially more interesting for video acceleration.

The first thing I have to say about CUDA is that it's fairly easy to use. Kernels are written in C using a special nvcc precompiler that allows you to mix both host and device code; the precompiler automatically splits the two apart, compiling the host code using VC++ or GCC and compiling the device code separately, and then linking the two together. This means that you can compile a CUDA application from the command line as easily as a standard Hello World application. Invoking a kernel on the device is also easy:

Filter<<<M, N>>>(dy, py, dx, px, 1024, 1024);

This special syntax looks just like a function call, with the added fields in angle brackets describing the grid and block dimensions for threading purposes. This is an asynchronous call, so it's easy to do the right thing of queuing up kernel calls on the device and only waiting when you need to do data transfers. Those are also easy, because CUDA has device analogs for malloc(), free(), memcpy(), and memset() operations. The API is also nicely partitioned, with the runtime API being suitable for writing CUDA applications directly, and the explicit driver API being better for integration into frameworks. You can even use PTX assembly language if you need to check the compiler's output or generate code for CUDA directly.
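To show what the grid and block dimensions mean, here is a plain C++ emulation of a one-dimensional launch. It is not CUDA code and calls no CUDA API: `scale_kernel` and `launch_emulated` are hypothetical names, the loops run sequentially on the CPU, and the parameter names merely mirror CUDA's built-in index variables. A real launch runs the threads in parallel on the device and returns to the host immediately.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel body: each "thread" handles one array element.
inline void scale_kernel(std::size_t blockIdx, std::size_t blockDim, std::size_t threadIdx,
                         float* data, std::size_t n, float factor) {
    const std::size_t i = blockIdx * blockDim + threadIdx;  // global element index
    if (i < n) data[i] *= factor;  // guard: the last block may overhang the array
}

// Sequential stand-in for a launch of gridDim blocks with blockDim threads each.
inline void launch_emulated(std::size_t gridDim, std::size_t blockDim,
                            float* data, std::size_t n, float factor) {
    assert(gridDim * blockDim >= n);  // enough threads to cover every element
    for (std::size_t b = 0; b < gridDim; ++b)
        for (std::size_t t = 0; t < blockDim; ++t)
            scale_kernel(b, blockDim, t, data, n, factor);
}
```

For example, covering 10 elements with blocks of 4 threads takes a grid of 3 blocks; the two surplus threads in the last block fail the `i < n` test and do nothing.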

My first attempt at CUDA didn't give stellar performance, but the result was reasonable:

(Read more....)