On Tue, Oct 27, 2020 at 01:09:38PM +0100, Peter Krempa wrote:
On Tue, Oct 27, 2020 at 10:04:33 +0000, Daniel Berrange wrote:
> On Tue, Oct 27, 2020 at 10:53:12AM +0100, Peter Krempa wrote:
> > On Mon, Oct 26, 2020 at 16:08:34 +0000, Daniel Berrange wrote:
> > > On Mon, Oct 26, 2020 at 04:45:50PM +0100, Peter Krempa wrote:
> > > > Glib's hash table provides basically the same functionality as
our hash
> > > > table.
> > > >
> > > > In most cases the only thing that remains in the virHash* wrappers
is
> > > > NULL-checks of '@table' argument as glib's hash functions
don't tolerate
> > > > NULL.
> > > >
> > > > In case of iterators, we adapt the existing API of iterators to glibs
to
> > > > prevent having rewrite all callers at this point.
> > > >
> > > > Signed-off-by: Peter Krempa <pkrempa(a)redhat.com>
> > > > ---
> > > > src/libvirt_private.syms | 4 -
> > > > src/util/meson.build | 1 -
> > > > src/util/virhash.c | 416
++++++++++-----------------------------
> > > > src/util/virhash.h | 4 +-
> > > > src/util/virhashcode.c | 125 ------------
> > > > src/util/virhashcode.h | 33 ----
> > >
> > > Our hash code impl uses Murmurhash which makes some efforts to be
> > > robust against malicious inputs triggering collisons, notably using
> > > a random seed.
> > >
> > > The new code uses g_str_hash which is much weaker, and the API
> > > docs explicitly recommend against using it if the input can be from
> > > an untrusted user.
> >
> > Yes, I've noticed that, but didn't consider it to be that much of a
> > problem as any untrusted input which is stored in a hash table (so that
> > the attacker can use crafted keys) must be in the first place
> > safeguarded against OOM condition by limiting the input count/size.
>
> The problem isn't OOM, rather it is algorithmic complexity. With malicious
> hash collisions the runtime lookup performance degrades to O(n) which can
> cause scalability concerns in some cases.
I was pointing out that limiting the input size needed for OOM limit
conveniently limits the size of 'n'.
The worst case for a malicious actor that I can see is the block device
statistics code, where the worst case input would be based on 2 * 10 MiB
of json, where based on 200 bytes per entry you could achieve 100k hash
comparisons.
As noted though, I think we can use the better hash function we have.
The only difference will be probably that the seed will be global and
not per-table since glibs table doesn't support that. If that's not
acceptable we need to keep all the code since glibs hash table's hash
function prototype is:
A one-time seed is likely fine.
Regards,
Daniel
--
|:
https://berrange.com -o-
https://www.flickr.com/photos/dberrange :|
|:
https://libvirt.org -o-
https://fstop138.berrange.com :|
|:
https://entangle-photo.org -o-
https://www.instagram.com/dberrange :|