
On Tue, Oct 27, 2020 at 01:09:38PM +0100, Peter Krempa wrote:
On Tue, Oct 27, 2020 at 10:04:33 +0000, Daniel Berrange wrote:
On Tue, Oct 27, 2020 at 10:53:12AM +0100, Peter Krempa wrote:
On Mon, Oct 26, 2020 at 16:08:34 +0000, Daniel Berrange wrote:
On Mon, Oct 26, 2020 at 04:45:50PM +0100, Peter Krempa wrote:
Glib's hash table provides basically the same functionality as our hash table.
In most cases the only thing that remains in the virHash* wrappers is NULL-checks of '@table' argument as glib's hash functions don't tolerate NULL.
In case of iterators, we adapt the existing API of iterators to glibs to prevent having rewrite all callers at this point.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/libvirt_private.syms | 4 - src/util/meson.build | 1 - src/util/virhash.c | 416 ++++++++++----------------------------- src/util/virhash.h | 4 +- src/util/virhashcode.c | 125 ------------ src/util/virhashcode.h | 33 ----
Our hash code impl uses Murmurhash which makes some efforts to be robust against malicious inputs triggering collisons, notably using a random seed.
The new code uses g_str_hash which is much weaker, and the API docs explicitly recommend against using it if the input can be from an untrusted user.
Yes, I've noticed that, but didn't consider it to be that much of a problem as any untrusted input which is stored in a hash table (so that the attacker can use crafted keys) must be in the first place safeguarded against OOM condition by limiting the input count/size.
The problem isn't OOM, rather it is algorithmic complexity. With malicious hash collisions the runtime lookup performance degrades to O(n) which can cause scalability concerns in some cases.
I was pointing out that limiting the input size needed for OOM limit conveniently limits the size of 'n'.
The worst case for a malicious actor that I can see is the block device statistics code, where the worst case input would be based on 2 * 10 MiB of json, where based on 200 bytes per entry you could achieve 100k hash comparisons.
As noted though, I think we can use the better hash function we have.
The only difference will be probably that the seed will be global and not per-table since glibs table doesn't support that. If that's not acceptable we need to keep all the code since glibs hash table's hash function prototype is:
A one-time seed is likely fine. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|