VirtualDub 1.6.7 released
I had intended for 1.6.6 to be the first stable release in the 1.6.x series, but a couple of bugs were reported that required potentially risky changes, so I decided to push 1.6.7 out as experimental again and reset the clock. Again, barring any major issues, the next release, 1.6.8, will be released as stable.
1.6.7 mostly contains a bunch of minor crash fixes, but has two notable changes. One is that the AVI indices have been moved inward a couple of chunks to match the OpenDML spec; they were formerly placed at the very end, outside of any other chunk. The only reason this got noticed is that someone was comparing VirtualDub's output to their own AVI writer; I doubt anyone will notice a difference otherwise, but if there is a problem with the old index placement, it can be rectified by re-running the AVI through VirtualDub in direct/direct mode. The other notable change is that I fixed a couple of issues with audio samples being dropped at the very end of the output file.
In the other branch....
I'm still working on the 1.7.0 branch, albeit slowly. One of the fixes in 1.6.7, the lowpass crash, I actually discovered by accident while merging the audio filter system's scheduler into the main scheduler in 1.7.0. Doing this wouldn't have been too bad except that the way I pulled audio out of the filter graph wasn't multithread-savvy in that it couldn't re-wake the sleeping graph, so I had to add additional rescheduling calls. After the audio filter work is complete I want to get the video filter system running multithreaded. In theory this will Just Work(tm) for most filters, since the scheduler guarantees that execution within a particular filter instance is always serialized. A few third-party ones, though, do naughty things with VirtualDub's internal structures or otherwise are non-reentrant, and would need to be serialized.
Re-threading the system is tricky work, and much harder to do while rewriting an existing system than when writing a new one from scratch. One of my pet peeves with regard to multithreaded code is seeing code that waits for a condition by polling it in a Sleep() loop.
A Sleep() loop is not a proper synchronization mechanism and fares badly as one, both in latency and scalability. It also doesn't work in a user-space scheduling system like mine where tasks can't sleep in-place, since they have to return back to the scheduler to suspend. Unfortunately, I can't use regular synchronization primitives either since they can't be used in a partially cooperatively-tasked system, so I have to manually have signallers reschedule the waiters instead. One nice side effect of having to do this, though, is that all waits occur in the scheduler, meaning that I no longer have to worry about providing an abort path for every wait — I can simply set the abort flag in the scheduler and have all processing threads exit from there.
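To make the "signallers reschedule the waiters" idea concrete, here is a minimal sketch of a scheduler in which tasks never block in place: a task with nothing to do returns to the scheduler and asks to be parked, and whoever produces the data puts it back on the ready queue. Every name here (Scheduler, Task, StepResult, RunDemo) is hypothetical and made up for illustration, and the sketch runs on a single thread for clarity; it is not the 1.7.0 scheduler code.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <vector>

class Scheduler;

// What a task tells the scheduler when its step returns (hypothetical names).
enum class StepResult { Done, Yield, Wait };

// A task runs one short step and returns; it never sleeps in place.
struct Task {
    std::function<StepResult(Scheduler&)> step;
};

class Scheduler {
public:
    void Add(Task* task) {
        assert(task != nullptr);
        ready_.push_back(task);
    }

    // Called by a signaller: move a parked task back to the ready queue.
    // Does nothing if the task is not currently parked.
    void Reschedule(Task* task) {
        for (auto it = waiting_.begin(); it != waiting_.end(); ++it) {
            if (*it == task) {
                waiting_.erase(it);
                ready_.push_back(task);
                return;
            }
        }
    }

    // Single abort path: Run() checks this flag between steps.
    void Abort() { abort_ = true; }

    void Run() {
        while (!abort_ && !ready_.empty()) {
            Task* task = ready_.front();
            ready_.pop_front();
            switch (task->step(*this)) {
                case StepResult::Done:  break;
                case StepResult::Yield: ready_.push_back(task); break;
                case StepResult::Wait:  waiting_.push_back(task); break;
            }
        }
    }

private:
    std::deque<Task*> ready_;      // tasks that can run now
    std::vector<Task*> waiting_;   // tasks parked until someone reschedules them
    bool abort_ = false;
};

// Demo: a consumer waits for a flag that a producer sets later.
std::vector<std::string> RunDemo() {
    Scheduler sched;
    std::vector<std::string> log;
    bool ready = false;

    Task consumer{[&](Scheduler&) -> StepResult {
        if (!ready) {
            log.push_back("consumer waits");
            return StepResult::Wait;   // suspend by returning, not by sleeping
        }
        log.push_back("consumer runs");
        return StepResult::Done;
    }};
    Task producer{[&](Scheduler& s) -> StepResult {
        ready = true;
        log.push_back("producer signals");
        s.Reschedule(&consumer);       // the signaller wakes the waiter
        return StepResult::Done;
    }};

    sched.Add(&consumer);
    sched.Add(&producer);
    sched.Run();
    return log;
}
```

Because every wait is just a task sitting in the scheduler's parked list, aborting comes down to one flag that Run() checks between steps, rather than a separate abort path for each wait.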
Build 23538 (1.6.7, experimental): [June 12, 2005]
* Script: Added support for cast expressions to int/long/double.
* Video direct-stream path now treats zero-byte (drop) frames as
  non-essential. A video stream that has been spaced with drop frames to a
  higher frame rate can now be cleanly converted down as well.
* Fixed crash when decoding audio from broken or mismatched DV frame.
* Hierarchical AVI index blocks were being placed outside of RIFF chunks.
  No known actual incompatibility cases (yet).
* Fixed occasional crash when using lowpass or highpass filters with a
  large tap count.
* Fixed crash on attempting to abort a render when an error had already
  occurred.
* Fixed "cut off audio stream when video stream ends" sometimes being
  active even after being disabled.
* Fixed audio stream being shortened slightly when using audio compression.
* Capture: Fixed crash in volume meter code triggered by load of DLLs that
  whack the floating-point control word.
* Capture: Fixed crash on exit in DirectShow layer caused by shutting down
  COM too early.
* Script: Fixed "method not found" errors loading .vcfs caused by failure
  to convert longs to ints.
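For the floating-point control word item in the list above, the changelog doesn't say how the fix was made. One common defensive pattern, sketched here purely as an illustration, is to save the floating-point environment before calling code you don't trust and restore it afterwards. This sketch uses the portable `<cfenv>` functions; `FpEnvGuard` and `MisbehavingInit` are hypothetical names, and the latter merely stands in for a DLL that changes the rounding mode on load.

```cpp
#include <cassert>
#include <cfenv>

// RAII guard (hypothetical name): saves the floating-point environment on
// construction and restores it on destruction.
class FpEnvGuard {
public:
    FpEnvGuard() {
        int rc = std::fegetenv(&saved_);
        assert(rc == 0);
        (void)rc;  // keep release builds free of unused-variable warnings
    }
    ~FpEnvGuard() { std::fesetenv(&saved_); }

    FpEnvGuard(const FpEnvGuard&) = delete;
    FpEnvGuard& operator=(const FpEnvGuard&) = delete;

private:
    std::fenv_t saved_;
};

// Stand-in for third-party code that leaves the rounding mode changed.
void MisbehavingInit() {
    std::fesetround(FE_TOWARDZERO);
}

// Returns the rounding mode seen by the caller after a guarded call.
int RoundingModeAfterGuardedCall() {
    std::fesetround(FE_TONEAREST);
    {
        FpEnvGuard guard;     // snapshot taken here
        MisbehavingInit();    // rounding mode is now FE_TOWARDZERO
    }                         // guard restores the snapshot
    return std::fegetround();
}
```

On Windows with the Microsoft compiler the analogous control is `_controlfp` from `<float.h>`; the portable version is used here only so the sketch builds anywhere.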
I work with VERY large AVI files on my computer and have 1GB of RAM for doing so. File operations are much faster, with the least hard drive clicking, when the performance settings are set to their highest values. I would appreciate it if the next release of VirtualDub saved all the performance settings so I don't have to keep resetting them every time I launch VirtualDub. I would also like it to save my setting for Audio and Video Direct Stream Copy. Your VirtualDub program is great and allows me to splice and join DV-AVI files I captured via FireWire.
Jack - 13 06 05 - 07:06
I strongly second the above recommendation and have been hoping to see it in one of the last half-dozen or so releases, but no luck so far. It was one thing from the (currently no longer developed) VDubMod that I really liked: you could set your output memory to 8MB (or whatever) and increase stream buffers, and it would carry over to future usage.
I use a .vcf autoloaded from the command line to set most other things like my codecs, filters, etc., but I still have to set performance changes manually.
Mark - 14 06 05 - 01:27
I also agree that saving the options would be a nice addition.
Also, I find that the 1.6.x addition of the ability to resize the video is very useful, but I can't seem to find any way to maintain the aspect ratio. The convention in many other programs is to hold shift or ctrl while resizing to lock the aspect ratio; this would be a good addition. A "lock aspect" menu option would be good too, because for the most part, changing the aspect ratio is simply a form of unwanted distortion rather than a useful feature.
Robert - 14 06 05 - 13:01
Robert - Right click the image.
Simon E. - 14 06 05 - 21:13
Is it just me, or does "load linked segments" not work when you first load a file? At least on my computer, it only works when I go through "append AVI segment".
spock1104 - 15 06 05 - 16:58
Ver 1.6.7 is working well with the D-Link DUB-AV300 USB box.
This device captures using a Philips SAA chip (9-bit), then converts to DivX 5.0 on the fly (in hardware).
Low PC overhead and a high-quality result. VirtualDub needs to be kick-started into capturing audio by toggling the volume meter! A bit dodgy, but it eventually works. Editing the finished file is somewhat unpredictable. Thanks for the excellent work, rgds.
Michael J - 30 06 05 - 09:50
Glad to see you plan on making VirtualDub multithread ready. It'll be great for those dual-core processors when they become commonplace.
A suggestion though: When people make their applications multithread ready, they usually only do so with two cores in mind. I anticipate there'll be quite a few workstation people using two dual-core processors (such as myself). Unfortunately, most apps only make use of two cores.
Would it be extra trouble to make VirtualDub scalable to more than two cores?
XStylus - 29 09 05 - 06:07
Oh yeah, I agree! Optimizing your code for a variable number of CPUs is much trickier than most people think!
I was in charge of programming an important data processing tool (signal processing, vibration analysis, image processing) for the aeronautics industry, and very soon we desperately needed more horsepower to compute data. Even with a friendly framework like MATLAB, multithreaded optimization was not simple to deal with. Two lines of work were started: the first was a grid computing solution using MPI, with several computers acting as a cluster; the second was a request-for-processing index creation module, patterned after the Cooley-Tukey fast Fourier transform algorithm, i.e. assuming that the number of cores will always be a power of two. When the program is launched, the module starts by creating several layers of request-for-processing indexes. The first layer contains two indexes directly accessible to the user; the second contains another set of two indexes per first-layer index, meaning four indexes. So when you send a request-for-processing through index 1 (layer 1), the request is split again to index 2.1 (layer 2, index 1) and 2.2 (layer 2, index 2). And so on.
Why not make all the cores directly accessible for processing? It's hard to send data DIRECTLY to a CPU for processing because of the OS structure. Very often you have to rely on how Windows, Linux (or whatever else) deals with multiple cores, and it's not so good... sometimes =)
Addressing processing to a non-fixed number of CPUs is a BIG problem; one of my pals is doing his PhD on this subject.
The second part of the problem is to find a way to distribute and collect data within the cluster (even if it is a single multiprocessor workstation). The multilayer request-for-processing index model I am proposing is a simple but extremely effective way to distribute and collect data efficiently, because making all the cores accessible from the same addressing level would make the distribution module extremely hard to program efficiently. Repeated divisions by 2 are always simpler.
We assume that a workstation will always have a power-of-two number of cores.
For an N-core system (where N = 2^k), the software must emulate 2^(k-1) distribution modules at first glance. Taking the example of a system with two Core 2 Quads (8 cores = 2^3 cores), you will have 2^(3-1) = 2^2 = 4 distribution modules and three layers for the request-for-processing index.
In the end, this means that when you perform a huge operation on your computer, the software sends requests-for-processing through index 1.1 (layer 1, index 1), which is the only one directly accessible by the user. Distributor 1.1 (associated with index 1.1) distributes the processing (think of it simply as a split) to the two layer 2 indexes, 2.1 and 2.2. Behind each of those indexes there is again a splitter that distributes processing to the two indexes hierarchically associated with it in the next layer: 3.1 and 3.2 for the 2.1 splitter, and 3.3 and 3.4 for the 2.2 splitter.
Is it complex? Nooo... Complex to code? Yeesss!
But this way you efficiently use all of your cores. The second major point of this pattern is that once you have coded a by-2 splitter module, which at first sight is just a simple dual-core optimization, you only have to reuse it several times. You don't always emulate 2^(k-1) splitters; instead you run your program as several loops of by-2 splitting, i.e. for the first loop you call it 1 time, for loop 2 you call it 2 times (2.1 and 2.2; by now you understand the idea of layers, since each layer is a loop), and so on: for layer R you call it 2^(R-1) times.
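The layered by-2 splitting described above can be sketched roughly as follows. This is only an illustration under some assumptions: summing a vector stands in for the real workload, threads started with `std::async` stand in for cores, and the names `SplitStats`, `SplitSum` and `ParallelSum` are hypothetical. The same splitter is reused at every layer, and the sketch counts how often it is called per layer.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <mutex>
#include <numeric>
#include <vector>

// Records how many times the by-2 splitter ran in each layer (index 0 = layer 1).
struct SplitStats {
    std::mutex m;
    std::vector<int> splits_per_layer;

    void Record(int layer) {
        std::lock_guard<std::mutex> lock(m);
        if (splits_per_layer.size() < static_cast<std::size_t>(layer))
            splits_per_layer.resize(static_cast<std::size_t>(layer), 0);
        ++splits_per_layer[static_cast<std::size_t>(layer) - 1];
    }
};

// Sums data[begin, end). Layers 1..total_layers split the range in two;
// below the last layer, each leaf sums its own share directly.
long long SplitSum(const std::vector<int>& data, std::size_t begin,
                   std::size_t end, int layer, int total_layers,
                   SplitStats& stats) {
    if (layer > total_layers) {
        // Leaf: one worker's share of the data.
        return std::accumulate(
            data.begin() + static_cast<std::ptrdiff_t>(begin),
            data.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
    }

    stats.Record(layer);
    const std::size_t mid = begin + (end - begin) / 2;

    // Hand the first half to another thread, keep the second half here.
    auto left = std::async(std::launch::async, [&, begin, mid, layer] {
        return SplitSum(data, begin, mid, layer + 1, total_layers, stats);
    });
    const long long right =
        SplitSum(data, mid, end, layer + 1, total_layers, stats);
    return left.get() + right;
}

// k layers of splitting give 2^k leaf workers.
long long ParallelSum(const std::vector<int>& data, int k, SplitStats& stats) {
    assert(k >= 0);
    return SplitSum(data, 0, data.size(), 1, k, stats);
}
```

With k = 3 (eight leaf workers), the splitter runs once in layer 1, twice in layer 2 and four times in layer 3, matching the 2^(R-1) calls for layer R mentioned above. Load-aware routing and non-power-of-two worker counts are deliberately left out.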
This is a very simple but effective way of computing when applied to simple applications, but trust me, it is far more complicated when you must address a non-power-of-two number of CPUs and when you must consider the load as a routing factor. But that is another story =)
I hope I made myself clear.
I can't wait for vdub to be multithreaded!!!
Great work! Go on!
dariel - 22 01 07 - 04:45