§ Learning DirectSound and audio with more than 2 channels

About a week ago, I acquired a Creative Labs Audigy 2 ZS Notebook card, partly out of the need to do some testing with 24-bit audio support, and partly due to disdain for the wonderfully fancy AC97 codec in my laptop. As far as I can tell, it has essentially the same feature set as a regular Audigy 2 ZS, including the EMU10K2-based effects engine and the separate 24-bit/96KHz support chip. It has a single headphone port on the side that doubles as both a regular wired connection and an optical output. An unexpected feature, however, is that if you don't plug anything into the Audigy 2 ZS Notebook, its drivers can take the output of the effects engine and push it through your regular sound card. The latency is a bit high, but this means that I can get EAX environmental effects through the onboard speakers. Also, the built-in sound card's mixer and the Audigy 2's mixer are both active, so I can set the built-in one's mixer such that the volume control that most programs see has a normal range instead of loud to extremely loud to ear-shatteringly loud. Neat.

Now, I had another agenda with getting this card, and that is to try a surround-sound hack. The idea was to use the "center cut" algorithm to split the center audio from the sides and run two speakers with the side audio and one or two more in front with the center audio. (With major help from Moitah on the forums, the algorithm has been improved -- the new version will appear in 1.7.0.) In order to do this, however, I needed to output sound with more than two channels, which I had never done before. I figured that while I was mucking around with this I might as well learn DirectSound as well.

The surround-sound hack sounded terrible, but DirectSound turned out to not be as bad, and it was interesting learning about the current state of the Windows sound system.

"Direct"Sound?

First, DirectSound isn't really direct anymore, joining DirectInput in the list of DirectX APIs that aren't. In Windows XP, both waveOut and DirectSound run through the kernel mixer API, and in Vista, they're both being layered on top of a new user-space API called WASAPI. The bad part is that calls are going through extra translations and overhead, but on the other hand, the Windows audio team has been doing a good job of maintaining all of the APIs -- in fact, waveOut code I wrote years ago still runs on Windows XP, but with more formats and with lower latency than when I developed it on Windows 95. This is in stark contrast to other teams at Microsoft. *glares at GDI and Direct3D teams*

DirectSound, or at least DirectSound 8, is actually fairly simple to set up for playback. For the most part, it is a lot like programming an old SoundBlaster 16 -- select a wave format, initialize an audio buffer, start playback, and periodically poll the position so you can lock and fill data ahead of the DMA point. The Lock() call simply gives you two pointers to write into (two are required to handle buffer wrap). As with programming on the bare metal, failure to stay ahead of hardware causes the playback pointer to loop around. You can also register for notifications for when the playback pointer crosses certain thresholds, allowing for non-polled loads, but apparently these aren't reliable on some drivers. Bummer. You can, however, retrieve an approximate read pointer, so you can at least periodically check the buffer status. In some ways, DirectSound is actually easier to use than waveOut, where you have to create a bunch of buffer headers, allocate memory for them, "prepare" the headers, and manage separate pools of active and pending buffers. Well, I guess there are looping buffers in waveOut too, but I never checked if they could be backfilled like hardware or DirectSound streaming buffers.
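To make the two-pointer lock concrete, here is a minimal sketch of the wrap handling on a plain in-memory ring buffer. It models the idea only and is not the DirectSound API: RingBuffer, LockedRegion, writeWrapped, and writableBytes are hypothetical names, and the cursors are plain byte offsets rather than values read back from a device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Two writable spans of a circular buffer. The second span is empty
// (null pointer, zero length) unless the range wraps past the end.
struct LockedRegion {
    std::uint8_t* ptr1; std::size_t bytes1;
    std::uint8_t* ptr2; std::size_t bytes2;
};

// Hypothetical stand-in for a looping playback buffer (not a DirectSound type).
class RingBuffer {
public:
    explicit RingBuffer(std::size_t size) : data_(size) {}
    std::size_t size() const { return data_.size(); }
    const std::uint8_t* data() const { return data_.data(); }

    // Split [offset, offset + bytes) into a span up to the end of the
    // buffer and, if needed, a second span starting at the beginning.
    LockedRegion lock(std::size_t offset, std::size_t bytes) {
        assert(offset < size() && bytes <= size());
        std::size_t first = std::min(bytes, size() - offset);
        std::size_t second = bytes - first;
        return { data_.data() + offset, first,
                 second ? data_.data() : nullptr, second };
    }

private:
    std::vector<std::uint8_t> data_;
};

// Bytes that can be written starting at writePos without passing playPos.
// Equal cursors are treated as "nothing writable" in this sketch.
std::size_t writableBytes(std::size_t bufSize, std::size_t playPos,
                          std::size_t writePos) {
    return (playPos + bufSize - writePos) % bufSize;
}

// Copy src into the ring at writePos, handling wrap, and return the
// advanced write offset.
std::size_t writeWrapped(RingBuffer& rb, std::size_t writePos,
                         const std::uint8_t* src, std::size_t bytes) {
    LockedRegion r = rb.lock(writePos, bytes);
    std::memcpy(r.ptr1, src, r.bytes1);
    if (r.bytes2 != 0)
        std::memcpy(r.ptr2, src + r.bytes1, r.bytes2);
    return (writePos + bytes) % rb.size();
}
```

A polling loop built on this would read the play position, call writableBytes() to see how far it can fill, and then call writeWrapped() with that much decoded audio.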

The DirectSound API does strike one of my pet peeves, which is that it has the ever annoying SetCooperativeLevel(), with a non-NULL window handle requirement. This means that your DirectSound objects have thread affinity, which I hate. I much prefer APIs that are thread agnostic, which are much easier to deal with because then you can use synchronization, instead of having a library dictate your threading model. One saving grace is that if you are creating buffers that only have global focus, apparently you can use GetDesktopWindow() as the handle. Before you say that this is naughty, note that a software developer on the DirectSound team said we could do it. Therefore, they can't complain later. :)

Multi-channel and high precision audio formats

The next question is how you blast out more than two channels of audio. There are two ways of requesting this. One is to just extend PCMWAVEFORMAT to use more than two channels. The other method is to use the newer WAVEFORMATEXTENSIBLE format, which is preferred since it contains a channel mask for remapping channels. Which one to use? The Audigy 2 ZS Notebook doesn't seem to care one way or the other, as it accepted either form for everything I tested; the Sigmatel C-Major audio device, however, only accepted WAVEFORMATEXTENSIBLE for 24-bit and 32-bit formats. For recording, however, none of the devices supported WAVEFORMATEXTENSIBLE, even for stereo formats that did work for playback.
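As a rough sketch of what goes into the extensible form, the code below fills in the derived fields for a multi-channel PCM format. The structs are simplified portable mirrors that I made up for illustration: they are not layout-compatible with the real WAVEFORMATEX / WAVEFORMATEXTENSIBLE from the Windows headers, the SubFormat GUID is left out, and makeExtensiblePcm is a hypothetical helper. The field names, the 0xFFFE format tag, and the speaker bit values follow the Windows definitions.

```cpp
#include <cassert>
#include <cstdint>

// Simplified mirrors of the Windows structures, for illustration only.
struct WaveFormatEx {
    std::uint16_t wFormatTag;
    std::uint16_t nChannels;
    std::uint32_t nSamplesPerSec;
    std::uint32_t nAvgBytesPerSec;
    std::uint16_t nBlockAlign;
    std::uint16_t wBitsPerSample;
    std::uint16_t cbSize;
};

struct WaveFormatExtensible {
    WaveFormatEx  Format;
    std::uint16_t wValidBitsPerSample;
    std::uint32_t dwChannelMask;
    // The real structure ends with a SubFormat GUID (omitted here).
};

constexpr std::uint16_t kWaveFormatExtensible = 0xFFFE;

// Speaker position bits used in dwChannelMask.
constexpr std::uint32_t kSpeakerFrontLeft  = 0x1;
constexpr std::uint32_t kSpeakerFrontRight = 0x2;
constexpr std::uint32_t kSpeakerBackLeft   = 0x10;
constexpr std::uint32_t kSpeakerBackRight  = 0x20;
constexpr std::uint32_t kSpeakerQuad =
    kSpeakerFrontLeft | kSpeakerFrontRight |
    kSpeakerBackLeft  | kSpeakerBackRight;

// Hypothetical helper: describe integer PCM with the given layout.
WaveFormatExtensible makeExtensiblePcm(std::uint16_t channels,
                                       std::uint32_t sampleRate,
                                       std::uint16_t bitsPerSample,
                                       std::uint32_t channelMask) {
    WaveFormatExtensible f{};
    f.Format.wFormatTag      = kWaveFormatExtensible;
    f.Format.nChannels       = channels;
    f.Format.nSamplesPerSec  = sampleRate;
    f.Format.wBitsPerSample  = bitsPerSample;
    // One block holds one sample for every channel.
    f.Format.nBlockAlign =
        static_cast<std::uint16_t>(channels * (bitsPerSample / 8));
    f.Format.nAvgBytesPerSec = sampleRate * f.Format.nBlockAlign;
    // Size of the real extension: 2 + 4 + 16 (GUID) bytes.
    f.Format.cbSize          = 22;
    f.wValidBitsPerSample    = bitsPerSample;
    f.dwChannelMask          = channelMask;
    return f;
}
```

For example, makeExtensiblePcm(4, 48000, 24, kSpeakerQuad) describes four-channel 24-bit audio at 48KHz with 12 bytes per sample frame.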

That leaves the matter of which bit depths and frequencies are supported. Thanks to the kX Project and the sources to the ALSA drivers, quite a bit is known about the Audigy series hardware. The primary audio path is the EMU10K2 chip, which runs on a fixed regimen of 48KHz, 16-bit audio, to which all voices are resampled. The Audigy 2 series adds an additional chip ("p16v") with 24-bit/96KHz support, which can either feed into the regular 48KHz effects path or output directly. The EMU10K2 is much more flexible with regard to sampling rates than the p16v, which only handles specific sampling rates; I believe 44.1KHz, 48KHz, 96KHz, and 192KHz are available. Yet, trying various audio formats with DirectSound produces some interesting results: I can believe that the EMU10K2 would be very flexible with regard to voice input formats, but I have some skepticism that either sound card can really render 32-bit, 145KHz, 68 channel audio.

My theory is that the kernel mixer in Windows is very flexible at resampling from unsupported to supported formats, and that many of these formats aren't native, even in the sound card driver software. The sampling rate conversion is filtered, but I'm guessing that it simply chops off anything beyond 16 bits and two channels. Trying various formats for recording is more interesting, because requests for 24-bit and 32-bit formats only succeed when hardware support is available, i.e. 96KHz/24-bit and 192KHz/24-bit work on the Audigy 2, but not 145KHz/24-bit -- but just about any sampling rate at 192KHz or lower passes at 8-bit or 16-bit depth on any sound card. Unfortunately, there aren't caps or query functions in DirectSound, so there isn't a way to tell if a format is actually supported in hardware. (This is probably one reason that the DirectSound capture filter in DirectShow doesn't expose 24-bit formats on its output pin, which in turn is why they don't show up in VirtualDub's "raw audio format" dialog box.) I have heard that this is going to become worse in Vista, with all audio capture being resampled from a single default recording format that is user-specified in Control Panel, and with that format only being discernible through the new WASAPI. This is good from the standpoint of things Just Working(tm), but it's bad from the standpoint of not lying to the user about what is actually supported.
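To show what "chops off anything beyond 16 bits and two channels" would mean in practice, here is a small sketch of that kind of lossy conversion. It illustrates the guess above and is not the actual kernel mixer code: truncateToStereo16 is a hypothetical function, and it assumes interleaved samples stored as 24-bit signed values in 32-bit integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Keep only the first two channels of each interleaved frame and drop the
// low 8 bits of each 24-bit sample (no dithering, no downmixing).
// Assumes every input value lies in [-2^23, 2^23 - 1].
std::vector<std::int16_t> truncateToStereo16(
        const std::vector<std::int32_t>& in, std::size_t channels) {
    assert(channels > 0 && in.size() % channels == 0);
    const std::size_t frames = in.size() / channels;
    const std::size_t outChannels = std::min<std::size_t>(channels, 2);

    std::vector<std::int16_t> out;
    out.reserve(frames * outChannels);
    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < outChannels; ++c) {
            // Arithmetic right shift; this is guaranteed from C++20 and is
            // what mainstream compilers do for signed values before that.
            std::int32_t s = in[f * channels + c] >> 8;
            out.push_back(static_cast<std::int16_t>(s));
        }
    }
    return out;
}
```

Anything carried in channels three and up, or in the bottom eight bits of each sample, simply disappears in a conversion like this, which is why a format being accepted doesn't say much about what actually reaches the speakers.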

Comments

Comments posted:


I'm looking at new 24-bit sound cards. You said the EMU10K2 (Audigy?) is fixed at 48kHz and 16-bit ... wouldn't that make the card 16-bit?

meanie - 13 09 06 - 21:40


Partially -- it's a dual personality card. The EMU10K2 refers to the primary voice/sfx chip, but the p16v/p17v chip does a single 24-bit/192KHz channel and can bypass the EMU10K2. You can do true 24-bit audio, but not all of the features will be available. As I noted above, though, you can access the 24-bit capability through regular Windows waveOut/DirectSound APIs; ASIO isn't necessary, it seems.

Phaeron - 14 09 06 - 02:03


check out the KX Project @ http://kxproject.lugosoft.com/index.php?.. . Cool stuff and a lot 2 learn ;) I'm sure you must know about it, but if not... hope u like.

Mark Conway - 10 10 06 - 13:42
