
§ Video shaders

As I said in the previous blog entry, one of the new features in 1.6.11 is the ability to bind custom vertex and pixel shaders to a video display pane. GPUs are great for massive image manipulation (unless you happen to have one with "Extreme" in the name), and thus it's only natural that they'd be useful for video. More importantly, though, it is easier to quickly write an optimized shader than to write an optimized software filter. This makes GPU shaders attractive also for rapid prototyping, which I hope is what the new shader support in 1.6.11 will enable.

To activate custom video shader mode, select Options > Preferences from the menu, jump to Display, then enable DirectX, Direct3D, and effect support. Then enter the .fx filename — relative paths are from the program directory, absolute paths are used directly. Dismiss the dialog, then deselect and reselect VirtualDub for the change to take effect.

D3DX DLL hassles

Before I begin, I have to rant about D3DX.

D3DX is the library that ships with the Microsoft DirectX SDK and which includes several useful, or rather, nearly essential, components such as the shader compiler and assembler. Starting sometime around December 2004, the D3DX library was changed from being a statically linked library to a DLL. This has the advantages of avoiding compatibility issues between C++ compilers and reducing working set footprint. However, Microsoft also:

So, basically, the D3DX DLLs are treated as system DLLs, but they don't come with the OS or any patches to the OS, and every application is supposed to include one, even though it's often bigger than the application that uses it. Supposedly, part of the reasoning behind this is that it allows Microsoft to update a D3DX version after the fact if it happens to contain a security flaw.

I don't know about other ISVs, but personally, I like to ship applications in as few locally-contained files as possible and for them to never change unless I explicitly update them. From what I gather, this is becoming quite a mess, because applications are forced to start the DirectX installer in their setup process, and lots of people are canceling it because they think they already have DirectX 9.0c installed, only to find that the application doesn't start because of a cryptic link error on the missing D3DX library.

VirtualDub 1.6.11 uses d3dx9_25.dll, which is used by the April 2005 SDK. It dynamically links to D3DX so that the DLL is only required if you are using the video shader support. The distribution doesn't include the D3DX DLL for various reasons, but there is a link in the help file to download the installer from Microsoft. As of today, it's at this location:

Some of you will already have it, as I think it's installed by some popular games.

Video shader compilation and invocation

The video shader support makes use of the D3DX effect system, which means that it takes standard .fx files. The format of these files is documented in the Microsoft DirectX SDK documentation; they allow vertex shaders, pixel shaders, and multiple passes to be defined. This makes them very flexible and powerful. In addition, VirtualDub will reparse and recompile the .fx file every time it gains focus, so iterating on a shader simply requires that you task switch to your editor, save the file, and task switch back. Effect compilation is very fast so this is much less of a problem than you might think.

The effect file can supply up to three techniques, which are bound to the point, bilinear, and bicubic filtering modes on the context menu (right-click menu) of the display panes. These techniques do not have to actually implement those filtering modes; this is simply a convenient hack to allow three different sets of shaders to be used. Once a technique is selected, VirtualDub uses it to render a single quad on screen with the source image bound as a texture. The shader can then do whatever it wants to the backbuffer, using multiple passes if necessary.

As documented in the help file, there is support for texture shaders to pre-initialize textures. These are software-emulated "shaders" that are used to initialize custom lookup textures. When I first discovered this feature I thought it would be horribly slow, as some of the D3DX image processing routines have been in the past, but it turns out they're decently fast, and generating several small 32x32 lookup textures at startup is no big deal.

Video shaders are, unfortunately, only really usable on hardware that supports pixel shader 2.0 or higher. For NVIDIA cards, this means GeForce FX or later, and for ATI, Radeon 9500 or higher. It is possible to write video shaders for ps1.4 (Radeon 8500), although that is rather limiting, and it can become quite difficult to do useful effects with pixel shader 1.1-1.3 (GeForce3/4), although it is still possible, especially with multipass and precomputed textures.

Writing video shaders

The simplest, actually useful shader is to just sample the entire source texture. This gives point and bilinear displays. After that, you can do a lot of interesting but not-so-useful effects using Sobel edge-detection filters and contrast/brightness/gamma filters, but you can't make too many effects that you'd actually want to use for preview or watching a video just with one texture lookup.

A custom pixel shader, as it turns out, isn't enough to do very interesting (and actually usable) video shading effects. The main problem is that if you only have the pixel shader, you have to do a lot of arithmetic in the pixel shader that could otherwise be factored out. For ps2.0 and higher profiles this is merely a performance annoyance in wasted clocks, but for ps1.4 and earlier it can be a deal-killer: among the problems is that it can force all of your texture lookups to be dependent texture reads, which can seriously cripple image processing capability. So, despite only processing four vertices, the vertex shader is quite important, as it can precompute as many as 40 interpolated values that then don't have to be computed in the pixel shader. Convolution filters, in particular, can have up to eight tap locations computed by the vertex shader.
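To see why it is safe to move tap computation into the vertex shader, here is a minimal CPU-side sketch in Python. It assumes a horizontal 8-tap filter and a screen-aligned quad, and every name in it is made up for illustration. The point is only that tap coordinates computed at the quad's corners and then linearly interpolated match the coordinates you would have computed per pixel.

```python
# Hypothetical sketch: function names and the 8-tap layout are assumptions,
# not taken from VirtualDub or from any shader in this entry.

def lerp(a, b, t):
    """Linear interpolation, which is what the rasterizer does to vertex outputs."""
    return a + (b - a) * t

def vertex_taps(u, texel_w, num_taps=8):
    """Per-vertex work: one U coordinate per filter tap, centered on u."""
    first = -(num_taps - 1) / 2.0
    return [u + (first + k) * texel_w for k in range(num_taps)]

def interpolated_taps(left, right, t):
    """Per-pixel values the pixel shader receives for free from the interpolators."""
    return [lerp(a, b, t) for a, b in zip(left, right)]

texel_w = 1.0 / 1024.0
left_edge = vertex_taps(0.0, texel_w)    # computed once at the left vertices
right_edge = vertex_taps(1.0, texel_w)   # computed once at the right vertices

# A pixel a quarter of the way across gets the same taps either way.
from_interpolators = interpolated_taps(left_edge, right_edge, 0.25)
computed_per_pixel = vertex_taps(0.25, texel_w)
```

Because the offsets are constant across the quad, interpolation preserves them exactly, so the pixel shader can use the coordinates directly for its texture reads.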

ps1.4, by the way, is very annoying. You would never know it from the SDK documentation, but this shader model is only necessary for one range of video cards, the RADEON 8xxx series. Everything else either doesn't support this pixel shader model or has ps2.0, which is vastly more powerful. ps1.4 has a bunch of useful abilities that aren't in ps1.1-1.3, but when I tried recoding some of my shaders into ps1.4 so a friend could try them, I ran into precision problems with dependent texture reads. The problem is that in ps1.4, dependent texture coordinates are computed using fixed-point color ALUs instead of floating-point addressing ALUs. While the 8500 does have significantly better precision and range, its 4.12 fixed-point format isn't enough for large textures. With a 1024x512 texture, that only gives two bits of subtexel positioning in the U axis, which shows clearly in warpsharp algorithms.
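The two-bit figure follows from simple arithmetic, sketched below in Python. The only assumption is that the 12 fraction bits of a 4.12 value hold a normalized coordinate, so the bits needed to address whole texels come out of that budget. The function names are made up for illustration.

```python
FRACTION_BITS = 12  # the ".12" part of the 4.12 format

def subtexel_bits(texture_size, fraction_bits=FRACTION_BITS):
    """Fraction bits left after addressing whole texels along one axis."""
    texel_bits = (texture_size - 1).bit_length()  # ceil(log2(size)) for size >= 1
    return fraction_bits - texel_bits

def texel_step(texture_size, fraction_bits=FRACTION_BITS):
    """Smallest coordinate step, measured in texels."""
    return texture_size / float(1 << fraction_bits)

u_bits = subtexel_bits(1024)   # U axis of a 1024x512 texture
v_bits = subtexel_bits(512)    # V axis of a 1024x512 texture
u_step = texel_step(1024)      # a quarter of a texel per step
```

With only four positions per texel along U, a displacement map cannot move a sample by less than a quarter of a texel, which is coarse for a warp that depends on small offsets.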

Bicubic interpolation turns out to be difficult to do efficiently in a pixel shader because it requires 16 taps to be sampled. The easiest way to do it is to just compute the entire filter kernel in the pixel shader and sample 16 times, but that results in a pixel shader of about 40-50 instruction slots. NVIDIA used to have a sample shader for this in the FX Composer distribution. This can be reduced dramatically by precomputing the taps in a filter texture and doing two separable filter passes, dropping it down to 10 texlds + 8 ALU ops per pixel. This is what VirtualDub's normal DX9 display minidriver does in bicubic mode. That requires that the sampling pattern be regular, though, which won't work for displacement mapping. For bicubic kernels that consist only of positive values, such as a B-spline kernel, it is possible to do bicubic with a per-pixel varying sample location, by doing four bilinear samples with the appropriate offsets; GPU Gems 2 has an example. It can be done for cardinal spline kernels with negative lobes, but it requires a rather nasty hack (complementing every other sample in a checkerboard pattern).
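As a rough one-dimensional illustration of the two kernel families, the Python sketch below computes the four tap weights for a cardinal cubic kernel (with A=-0.75 as an assumed default) and for a B-spline kernel, then shows the B-spline taps collapsing into two linear fetches. The function names are made up for illustration, and `fetch_linear` only emulates what a bilinear texture read does along one axis.

```python
def cubic_weights(t, a=-0.75):
    """Four tap weights of the cardinal cubic kernel at fractional offset t."""
    def kernel(x):
        x = abs(x)
        if x <= 1.0:
            return (a + 2.0) * x ** 3 - (a + 3.0) * x ** 2 + 1.0
        if x < 2.0:
            return a * x ** 3 - 5.0 * a * x ** 2 + 8.0 * a * x - 4.0 * a
        return 0.0
    return [kernel(1.0 + t), kernel(t), kernel(1.0 - t), kernel(2.0 - t)]

def bspline_weights(t):
    """Four tap weights of the cubic B-spline kernel; all are positive."""
    return [(1.0 - t) ** 3 / 6.0,
            (3.0 * t ** 3 - 6.0 * t ** 2 + 4.0) / 6.0,
            (-3.0 * t ** 3 + 3.0 * t ** 2 + 3.0 * t + 1.0) / 6.0,
            t ** 3 / 6.0]

def fetch_linear(row, pos):
    """Emulated linear texture fetch; no clamping, caller stays in range."""
    i = int(pos)  # pos is non-negative here, so int() floors
    f = pos - i
    return row[i] + (row[i + 1] - row[i]) * f

def four_tap(row, i, t):
    """Reference result: four point samples around texel i."""
    w = bspline_weights(t)
    return sum(w[k] * row[i - 1 + k] for k in range(4))

def two_fetch(row, i, t):
    """Same result from two linear fetches with adjusted positions and weights."""
    w0, w1, w2, w3 = bspline_weights(t)
    g0, g1 = w0 + w1, w2 + w3
    return (g0 * fetch_linear(row, (i - 1) + w1 / g0) +
            g1 * fetch_linear(row, (i + 1) + w3 / g1))

row = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
reference = four_tap(row, 2, 0.3)
collapsed = two_fetch(row, 2, 0.3)
```

The collapse works because `w1 / g0` and `w3 / g1` stay between 0 and 1 when every weight is positive. With the cardinal kernel the outer weights are negative, so the same ratio lands outside the texel pair, which is why that case needs extra tricks. In two dimensions the same pairing gives four bilinear samples instead of sixteen point samples.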

Warpedge, the test case

The primary test case I used for the video shader support was an experimental algorithm I'd been working on called warpedge, which is a variant of warpsharp. Warpsharp is an algorithm I originally found in a filter for the GIMP image processing tool, which sharpens an image by warping it toward the interior of edges. It is implemented by computing a bump map from edges in the image, then taking the gradient of the bump map to produce a displacement map. Warpsharp can produce some very impressive results, but it is also sensitive to noise, and can't really darken or brighten edges, only narrow them. There's a software version of the algorithm on my filters page.
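The bump map, gradient, and displacement steps can be sketched in one dimension with plain Python. This is a simplification under stated assumptions, not the GIMP filter or my software filter: it skips the blur that a real implementation would apply to the bump map, it uses a central difference for both the edge detector and the gradient, and the `depth` parameter and all function names are made up for illustration.

```python
def clamp_index(i, n):
    return min(max(i, 0), n - 1)

def sample_linear(signal, pos):
    """Linearly interpolated read with clamped addressing."""
    n = len(signal)
    pos = min(max(pos, 0.0), n - 1.0)
    i = int(pos)
    f = pos - i
    return signal[i] + (signal[clamp_index(i + 1, n)] - signal[i]) * f

def central_difference(signal):
    n = len(signal)
    return [(signal[clamp_index(i + 1, n)] - signal[clamp_index(i - 1, n)]) * 0.5
            for i in range(n)]

def warpsharp_1d(signal, depth=8.0):
    # Edge strength acts as the height of the bump map.
    bump = [abs(d) for d in central_difference(signal)]
    # The gradient of the bump map is the displacement map.
    displacement = central_difference(bump)
    # Each output sample reads from a position pushed away from the edge center.
    return [sample_linear(signal, i - depth * displacement[i])
            for i in range(len(signal))]

soft_edge = [0.0, 0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0]
sharpened = warpsharp_1d(soft_edge)
```

On this soft ramp, the samples below the midpoint get darker and the ones above get brighter while the midpoint stays put, so the edge narrows without overshooting. That matches the behavior described above: the edge gets thinner, but nothing is darkened or brightened beyond the original range.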

Warpedge doesn't try to narrow edges; instead, it simply attempts to sharpen them. This is done primarily in two ways: by normalizing and limiting the length of gradient vectors, and by re-applying the difference between a straight interpolated image and the warped image. This seems to work decently well, although admittedly I tested it much more on anime than on real-world material. Also, warpedge is meant for interpolation; a stretch operation is part of the algorithm, and it doesn't work very well on 1:1 processing.
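The two ingredients can be sketched in Python as follows. This is only one possible reading of the description above, not the shader itself: the `max_length` and `strength` parameters, the clamp to the 0-1 range, and the function names are all assumptions made for illustration.

```python
import math

def limit_gradient(gx, gy, max_length=1.0, eps=1e-6):
    """Keep the gradient's direction but cap its length at max_length."""
    length = math.hypot(gx, gy)
    if length < eps:
        return (0.0, 0.0)  # flat area: no displacement
    scale = min(length, max_length) / length
    return (gx * scale, gy * scale)

def reapply_difference(straight, warped, strength=1.0):
    """Add the (warped - straight) difference back onto the warped value."""
    value = warped + strength * (warped - straight)
    return min(max(value, 0.0), 1.0)  # keep the result in displayable range

capped = limit_gradient(3.0, 4.0)        # long vector gets shortened
untouched = limit_gradient(0.3, 0.4)     # short vector passes through
boosted = reapply_difference(0.5, 0.6)   # pushed further from the straight value
```

Capping the vector length keeps strong edges from producing large displacements, and adding the difference back is what lets the result darken or brighten an edge instead of only moving it.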

Here's the main version:

This is a four-pass shader; it could probably be reduced to three by merging the blur and gradient passes, but I don't know if it's worthwhile. The final pass barely fits in ps_2_0, because it computes a gradient vector and two bicubic A=-0.75 fetches — I had to hack on it a bit to get it to fit. (Such is the curse of prototyping on a GeForce 6800.) As it is, it takes 18 texture and 59 arithmetic instruction slots. The "bicubic" mode on the right-click context menu of the display pane selects the warpedge algorithm, while the "bilinear" mode selects a bicubic algorithm, and "point" selects bilinear. Yes, this is confusing, but you can't change the names in the context menu from the shader effect file. At least this way you can quickly flip between the modes for comparison.

Once I had finished writing the ps_2_0 version... and, uh, testing it on several episodes of Mai-HiME... I started working on the ps_1_1 version for a challenge. The problem with ps_1_1 is the pitiful support for dependent texture reads. I tried texbem and texbeml at first, but while the shader worked fine on a 6800, it didn't work correctly on a GeForce 4 Ti4600, because the older hardware expects a signed texture format, not the unsigned format that can be rendered to. In the end I found that texm3x2pad + texm3x2tex doesn't suffer from the same issue, so I was able to get a version of warpedge to work:

This version isn't quite as effective as it only uses bilinear interpolation. The mode mappings are: point -> point, bilinear -> bilinear, bicubic -> warpedge. In theory I could get bicubic to work, but it'd take a lot more passes — I'd estimate something like two more passes for the base bicubic layer, and four more passes for the displacement layer — and unfortunately, it'd probably need three render target textures, whereas VirtualDub currently only allows access to two. It might be possible to use the backbuffer as the third, but precision when blending in the framebuffer isn't great.

Also, I continued my tradition of breaking the compiler while writing this one. Attempting to do "dcl_2d t0.xy" in a ps_1_1 asm shader resulted in this nice error:

dep.fx(456): error X5487: Reserved bit(s) set in dcl info token! Aborting validation.

I also managed to trip an internal compiler error about uninitialized components in the vertex shader, but the error went away as mysteriously as it came.

Overall, though, I'm pretty satisfied with the results for a first pass, especially since it runs in real-time at 1920x1200, and isn't overly sensitive to noise.


Comments posted:

Seems like both filters won't work with an ATI X850 Pro after copying only the d3dx9_25.dll to system32.
ps1.1: only draws the upper half of the picture with no visible effect
ps2.0: same as above + upper half looks weird (more or less damaged)

Murmel - 05 10 05 - 14:29

I saw this once on my GeForce 4 but was unable to reproduce it — I think it's caused by the shaders occasionally receiving the wrong size vector for the render targets. You might be able to work around this by hardcoding the RTT sizes as static const; they're normally the smallest powers of two that encompass the desktop resolution. For 1280x1024, that'd be 2048x1024, and thus you'd want something like this:

static const float4 vd_tempsize = { 2048.0f, 1024.0f, 1.0f / 1024.0f, 1.0f / 2048.0f };
static const float4 vd_tvpcorrect2 = { 2.0f / 2048.0f, -2.0f / 1024.0f, 1.0f + 1.0f / 1024.0f, -1.0f - 1.0f / 1024.0f };

Try fiddling with your desktop resolution first, though, to see if that changes the behavior. I'll try to track this down in the meantime.
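For other desktop resolutions, the sizes to hardcode can be worked out with a few lines of Python. This sketch assumes the power-of-two rule described above and copies the component order (width, height, 1/height, 1/width) from the vd_tempsize line above. The function names are made up for illustration.

```python
def next_pow2(n):
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()

def temp_size_constant(desktop_w, desktop_h):
    """Values for a hardcoded vd_tempsize, in the same order as the line above."""
    w = next_pow2(desktop_w)
    h = next_pow2(desktop_h)
    return (float(w), float(h), 1.0 / h, 1.0 / w)

size_1280 = temp_size_constant(1280, 1024)
size_1152 = temp_size_constant(1152, 864)
```

Both 1280x1024 and 1152x864 come out as 2048x1024 under this rule.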

Phaeron - 06 10 05 - 00:42

Ah, gomen... I made a mistake in the shader hosting code and inadvertently swapped U/V sizes for some of the constants. As a result, the vd_tvpcorrect, vd_tvpcorrect2, vd_t2vpcorrect, and vd_t2vpcorrect2 constants are wrong — you have to use -c.yxwz to get the correct values. I missed it because, as it turns out, there is only one desktop resolution on my system where the resulting textures are not powers of two: 1152x864. Here are versions that are modified to work around the bug in 1.6.11:

Phaeron - 06 10 05 - 02:20

it's working.
It produces moiré artifacts in locations with noise or low contrast, though.
Where can I get this as a real plugin? ;)


Murmel - 07 10 05 - 14:00

will you be working this into the main filter pipeline? would be niiiice

coder - 10 10 05 - 09:11

Where can we find more .fx shaders? googling doesn't turn much up :(


Xaro - 10 10 05 - 09:47

Guys, let me tell you that because of internal problems with windows xp sp2 or later, no matter how many times you try to update directx 9.0b to direct x 9.0c it will not work.

I found a very useful site where you can fix this problem:

olddocks (link) - 01 12 05 - 04:49

Yeah, that's simply not true dude.
The issues with missing directx dll files were described earlier by Phaeron.
Every so often, the dlls are updated for directx9, and a new redist package is released. Most people get these from the newest game discs, and they're installed then.
The newest directx9 redist is easily obtainable from MS, and will install onto ANY version of windows XP.
I don't know where you got the idea that it's impossible to install directx9c onto WinXP SP2.

Matariel - 26 12 05 - 13:24


I'm developing the new Direct3D driver in MPlayer. I just saw this comment about pixel shaders. Maybe this is not the right place to ask, but can I implement brightness/contrast/hue/etc. correction in Direct3D 9 without using pixel shaders? I mean, what's the way this kind of thing is done? Where should I start?

Georgi Petrov - 06 02 09 - 09:21

My answer:

Phaeron - 07 02 09 - 02:31
