Re: [libvirt] [PATCH] util: hash: Append to hash buckets when adding new entries

Tuesday, 16 April 2019

On Tue, Apr 16, 2019 at 04:26:49PM +0200, Peter Krempa wrote:
...
 On Tue, Apr 16, 2019 at 15:17:31 +0100, Daniel Berrange wrote:
 > On Tue, Apr 16, 2019 at 04:09:54PM +0200, Peter Krempa wrote:
 > > In cases when the hash function for a name collides with other entry
 > > already in the hash we prepend to the bucket. This creates a 'stack
 > > effect' on the buckets if we then iterate through the hash. Normally
 > > this is not a problem, but in tests we want deterministic results.
 > > 
 > > Since it does not matter where we add the entry and it's usually more
 > > probable that a different entry will be accessed next change it to
 > > append to the end of the bucket. Luckily we already iterate throught the
 > > bucket once thus we can easily find the last entry and just connect the
 > > new entry after it.
 > 
 > I'm not understanding how this change makes the tests any more
 > deterministic than before.
 > 
 > The determinism for hash bucket placement is already ensured by
 > having a non-random impl forvirHashCodeGen in
 > tests/virdeterministichashmock.c
 > 
 > This change simply reverses the order in which we iterate over values
 > when many keys hash to the same bucket.  If this mattered for tests
 > then surely, this change would have to update the test suite too to
 > take account of the new reversed ordering ?

 If you have a hash collision when adding into the hash and then use
 virHashForeach the items are formatted in opposite order.

 Thus if you use this to e.g. store things parsed from XML and then
 format it back, every single iteration reverses the order. This is not a
 problem per-se since we can use a different output file, but it's
 inconvenient since we can't use the same output file.

 Currently there aren't any collisions which would cause problems, but I
 have one on my private branch.

 Given that this change is rather simple I figured we could just stop
 inverting them. 
Ok, that does make a little more sense, though generally my view would
be that if we're doing something that is order sensitive like XML
parsing & formatting, then we shouldn't be using a hash table to
represent the data - at least not as the primary record. 

Which bit of code is using the hash in this case ? Is it something
purely inside the test suite, or in the real XML handling code ?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [libvirt] [PATCH] util: hash: Append to hash buckets when adding new entries