Current version

v1.10.4 (stable)

Navigation

Main page
Archived news
Downloads
Documentation
   Capture
   Compiling
   Processing
   Crashes
Features
Filters
Plugin SDK
Knowledge base
Donate
Contact info
Forum
 
Other projects
   Altirra

Search

Archives

01 Dec - 31 Dec 2013
01 Oct - 31 Oct 2013
01 Aug - 31 Aug 2013
01 May - 31 May 2013
01 Mar - 31 Mar 2013
01 Feb - 29 Feb 2013
01 Dec - 31 Dec 2012
01 Nov - 30 Nov 2012
01 Oct - 31 Oct 2012
01 Sep - 30 Sep 2012
01 Aug - 31 Aug 2012
01 June - 30 June 2012
01 May - 31 May 2012
01 Apr - 30 Apr 2012
01 Dec - 31 Dec 2011
01 Nov - 30 Nov 2011
01 Oct - 31 Oct 2011
01 Sep - 30 Sep 2011
01 Aug - 31 Aug 2011
01 Jul - 31 Jul 2011
01 June - 30 June 2011
01 May - 31 May 2011
01 Apr - 30 Apr 2011
01 Mar - 31 Mar 2011
01 Feb - 29 Feb 2011
01 Jan - 31 Jan 2011
01 Dec - 31 Dec 2010
01 Nov - 30 Nov 2010
01 Oct - 31 Oct 2010
01 Sep - 30 Sep 2010
01 Aug - 31 Aug 2010
01 Jul - 31 Jul 2010
01 June - 30 June 2010
01 May - 31 May 2010
01 Apr - 30 Apr 2010
01 Mar - 31 Mar 2010
01 Feb - 29 Feb 2010
01 Jan - 31 Jan 2010
01 Dec - 31 Dec 2009
01 Nov - 30 Nov 2009
01 Oct - 31 Oct 2009
01 Sep - 30 Sep 2009
01 Aug - 31 Aug 2009
01 Jul - 31 Jul 2009
01 June - 30 June 2009
01 May - 31 May 2009
01 Apr - 30 Apr 2009
01 Mar - 31 Mar 2009
01 Feb - 29 Feb 2009
01 Jan - 31 Jan 2009
01 Dec - 31 Dec 2008
01 Nov - 30 Nov 2008
01 Oct - 31 Oct 2008
01 Sep - 30 Sep 2008
01 Aug - 31 Aug 2008
01 Jul - 31 Jul 2008
01 June - 30 June 2008
01 May - 31 May 2008
01 Apr - 30 Apr 2008
01 Mar - 31 Mar 2008
01 Feb - 29 Feb 2008
01 Jan - 31 Jan 2008
01 Dec - 31 Dec 2007
01 Nov - 30 Nov 2007
01 Oct - 31 Oct 2007
01 Sep - 30 Sep 2007
01 Aug - 31 Aug 2007
01 Jul - 31 Jul 2007
01 June - 30 June 2007
01 May - 31 May 2007
01 Apr - 30 Apr 2007
01 Mar - 31 Mar 2007
01 Feb - 29 Feb 2007
01 Jan - 31 Jan 2007
01 Dec - 31 Dec 2006
01 Nov - 30 Nov 2006
01 Oct - 31 Oct 2006
01 Sep - 30 Sep 2006
01 Aug - 31 Aug 2006
01 Jul - 31 Jul 2006
01 June - 30 June 2006
01 May - 31 May 2006
01 Apr - 30 Apr 2006
01 Mar - 31 Mar 2006
01 Feb - 29 Feb 2006
01 Jan - 31 Jan 2006
01 Dec - 31 Dec 2005
01 Nov - 30 Nov 2005
01 Oct - 31 Oct 2005
01 Sep - 30 Sep 2005
01 Aug - 31 Aug 2005
01 Jul - 31 Jul 2005
01 June - 30 June 2005
01 May - 31 May 2005
01 Apr - 30 Apr 2005
01 Mar - 31 Mar 2005
01 Feb - 29 Feb 2005
01 Jan - 31 Jan 2005
01 Dec - 31 Dec 2004
01 Nov - 30 Nov 2004
01 Oct - 31 Oct 2004
01 Sep - 30 Sep 2004
01 Aug - 31 Aug 2004

Stuff

Powered by Pivot  
XML: RSS feed 
XML: Atom feed 

§ stdext::hash_set

A couple of days ago I found myself writing a breadth-first search routine in C++ for a particular puzzle (Klotski, to be specific). Since I was using Visual C++, I needed a set for the board database, and decided to use its stdext::hash_set.

stdext::hash_set... sucks.

See, Visual C++'s Standard Template Library (STL) is based on the implementation from Dinkumware, and has the following prototype:

template<class Key,
class Traits = hash_compare<Key, less<Key> >,
class Allocator = allocator<Key> >
class hash_set;

Notice two things about this hash set implementation: it takes a less-than predicate, and it doesn't take an equality predicate. What the frick kind of hash table requires a less-than predicate??? And we're not talking about a predicate for the hash code here, but a less-than comparison operator for the whole key. It defeats one of the main advantages of a hash container, since IMO it's often easier to write hash and equality predicates instead of a less-than predicate, as long as you aren't dealing with floats. I've never seen a hashed container that required this -- not in SGI-STL/STLport, not in Java, not in the .NET Framework. As you might expect, this means that VC8's hash_set is a pain in the butt to use and is slow. Thankfully, C++ TR1's unordered_set follows the SGI-STL/STLport path rather than Dinkumware's.

I haven't really had a need for a full blown hash_map or a hash_set in VirtualDub, but I'm thinking that if I do, I'll be better off writing my own....

Comments

Comments posted:


Well, the find function just calls lower_bound, no surprise it needs the less than comparison. If it had a great than it would call upper_bound surely :) Why does a hash_set (or hash_map) have such a functions is another question.

Gabest - 03 05 08 - 23:52


I'm not sure how it works for a regular hash_set or hash_map given that the hash tables are unordered at the hash bucket level -- they can be ordered in the hash chains depending on the implementation -- but lower_bound(), upper_bound(), and equal_range() are useful for a hash_multimap or hash_multiset. You can use equal_range() to extract a range of identically keyed elements within the container. This does require the overhead of at least partially sorting the hash chain, however. A lot of the time I only need one operation in a hash container, insert(), and so a reduced hash table can be faster than a full-blown STL one.

A less-than predicate suffices to give all six relational operators: (x > y) = (y < x), (x == y) == !(x < y) && !(y < x), (x y), (x >= y) == !(x < y), and (x != y) == !(x == y). STL implementations commonly take advantage of this. Problem is, if your predicate isn't fully consistent, you can get bad behavior from garbled output to even memory corruption and a crash. One of the easiest ways to goof up is to implement (x (x < y) && (y < x). A more subtle way is to use floating point arithmetic such that the compiler generates code differently for two identical expressions, leading to order of evaluation induced errors.

Phaeron - 04 05 08 - 01:56


Correct me if I am wrong, but if you use chaining for collision resolution of the hash table then it makes sense to me to have the less than predicate for sorting the colliding entries (assuming you do not use linked lists for it).

macin - 04 05 08 - 06:16


The Dinkumware hash container maintains the buckets as iterators within a master list, yeah, so any gain from sorting the chains is minimal at best. If the "chains" were maintained as sorted trees, then yeah, you'd have a safeguard against O(N) worst case collision behavior... but the complexity constant and memory overhead would be so much worse that I doubt anyone would want to use it. If you really need guaranteed super-linear behavior then it's best to just use an rbtree based map or set. Generally the idea with a hash container is simply to try to keep the bucket sizes from going above tiny single digits.

There's a peculiar feature in the Dinkumware implementation where it can supposedly do incremental rehashing in order to avoid one big hitch when the hash table overflows. Not worth the ~4x penalty over SGI-STL/STLport that I've seen in benchmarks, though.

Phaeron - 04 05 08 - 15:30


Your father was gifted--at least compared to my understanding of computers, hardware and software.

I am converting 8mm file to DVD. I project the 8mm film onto a screen, recording that visual on a 8mm digital camcorder, and then puting that into my computer to edit with Pinnacle STUDIO 9. I am using MPEG 2. Because the projector projects at a slower frame per second that the camcorder records there is at times a flicker. Upon reviewing the internet I found a reference to using the VirtualDub, the Deflicker filter, in removing or reducing the flicker. Are you familiar with this? Do you recommend it? What are the pro and cons?
Thank you so much for assisting this senior.

John Ogden - 21 05 08 - 18:04


I have started writing an image viewer which uses OpenGL. Problem was not loading textures or something graphics related but keeping a sorted image list up to date if files are added or removed from the watched folder. Suffice to say I still haven't decided what to use for such a list. I need it sorted in "human-alphabet" mode and I need speedy insertions and deletions. I also need to keep track of current image so I can do prev/next navigation. If you have any advice when it comes to those sets, maps and containers it would be very appreciated.

Igor Levicki (link) - 28 05 08 - 12:17


@John:
Your best bet is to follow the directions that come with the Deflicker filter -- I'm aware of it but haven't actually used it much. I do know that you do need to do an analysis pass for it to measure the flickering, following by a second pass to mitigate it.

@Igor:
The best way to handle a situation like that is to store your items with multiple keys. I think databases can do this fairly easily, but unfortunately container libraries typically can't, at least not efficiently. STL definitely can't. If you have intrusive containers, where you incorporate the nodes into your data items rather than having the container library aggregate them on for you, then it's possible to insert the same data item into multiple associative containers at the same time with different keys and sort/equality/hash predicates, which makes things easy because then it's O(1) to switch between the different iterators to the same item. Sadly, intrusive_list and intrusive_unordered_map aren't common containers.

Your best bet, I think, would be to abstract it and then see if perf becomes a problem. Container updates driven by user interaction and filesystem activity aren't likely to be much of a performance problem even up to 1K-10K entries, even if you use dumb linear search algorithms. One starting approach would be to keep your data items in a slot array with unique IDs and keep pairs of forward and reverse maps for all the indices you need, such as one keyed by filesystem entry, and another for the UI list. I doubt one or two extra O(log N) lookups when you need to remove an item would be a problem. This way you can also be flexible about at which layers you maintain the various mappings. For example, you don't necessarily want to insert items into the UI in sorted order as soon as they come in, as it can lead to surprises when the list updates right when the user is trying to select an item. Windows Explorer adds new items at the bottom until you manually refresh the pane.

Phaeron - 30 05 08 - 02:12


Avery, this is a bit different because there is no visible list. There is a list user navigates using PGUP/PGDN/HOME/END keys. That list is a list of images from the active folder sorted alphabetically. If you copy new items into that folder (that may happen asynchronously by another process such as download manager) they have to be inserted in the right postition in the list and the current index (the picture shown) has to be preserved. Same goes if images are deleted unless of course the currently shown image gets deleted too when the program should keep displaying it until the user manually advances to another image.
Then there is also a selection list. I have an option to tag images (using spacebar, they turn red) and copy or move them to another folder.
I got all tangled up attempting to solve all that. :-)

Igor Levicki (link) - 07 06 08 - 12:43

Comment form


Please keep comments on-topic for this entry. If you have unrelated comments about VirtualDub, the forum is a better place to post them.
Name:  
Remember personal info?

Email (Optional):
Your email address is only revealed to the blog owner and is not shown to the public.
URL (Optional):
Comment: /

An authentication dialog may appear when you click Post Comment. Simply type in "post" as the user and "now" as the password. I have had to do this to stop automated comment spam.



Small print: All html tags except <b> and <i> will be removed from your comment. You can make links by just typing the url or mail-address.