On Fri, Aug 10, 2012 at 08:58:04AM -0600, Eric Blake wrote:
On 08/10/2012 07:47 AM, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange(a)redhat.com>
>
> The current virRandomBits() API is only usable if the caller wants
> a random number in the range [0, (n-1)] where n is a power of two.
> This adds a virRandom() API which works for upper limits which are
> not a power of two. It works by using virRandomBits() to generate
> a number to the next nearest power of 2 limit, and then scales it
> down.
I like the idea, but NAK to this version of the patch. That algorithm
is not uniform.
> +/**
> + * virRandom:
> + * @max: Upper bound on random number (not inclusive)
> + *
> + * Generate an evenly distributed random number between [0,max-1]
> + * If @max is a power of 2, then use virRandomBits instead
> + *
> + * Return: a random number with @nbits entropy
> + */
> +uint64_t virRandom(unsigned long long max)
> +{
> + int bits = 0;
> + unsigned long long tmp = max - 1;
> + uint64_t ret;
> +
> + while (tmp) {
> + tmp >>= 1;
> + bits++;
> + }
Slow. I would suggest using gcc's __builtin_clz (the way qemu does),
except that gnulib doesn't yet have a module exposing it. I guess that
means I'd better write a quick gnulib module. Once you have clz(), this
is much simpler as:
assert(max);
bits = 64 - clz(max);
> +
> + ret = virRandomBits(bits);
> +
> + if ((1 << bits) != max) {
> + double d = ret;
> + d *= max;
> + d /= (1 << bits);
> + ret = (uint64_t)d;
> + }
Wrong. Consider when max == 3 (i.e. generate a random choice between 0,
1, or 2). bits is 2, so d starts as 0, 1, 2, or 3, each with 25%
probability. Then after this computation:
0 -> 0*3 / 4.0 -> 0
1 -> 1*3 / 4.0 -> 0
2 -> 2*3 / 4.0 -> 1
3 -> 3*3 / 4.0 -> 2
that is, you have a 50% chance of getting 0, but only a 25% chance of
getting 1 or 2 - that is not a uniform distribution. The ONLY way to do
this is:
do {
ret = virRandomBits(bits);
} while (ret >= max);
Then, with the same input of max == 3, you have a 25% chance of calling
virRandomBits at least twice, a 12.5% chance of calling it at least
three times, and so on, but you are guaranteed that you will eventually
get a result that is uniformly distributed.
Hmm, it was the looping that I used originally, but I wanted to get
something that had a deterministic upper bound, instead of just a
probablistic bound.
The alternative I wanted to use was drand48_t() which returns a double
in the range 0.0 -> 1.0, with 48 bits of entropy. You could multiply by
'max', and cast to int. But this isn't portable and is not fixed by
GNULIB.
I'm wondering if we could emulate drand48_t() ourselves by doing
something like this:
#include <ieee754.h>
double virRandom(void) {
union ieee754_double val;
val.ieee.negative = 0;
val.ieee.exponent = IEEE754_DOUBLE_BIAS;
val.ieee.mantissa0 = virRandomBits(20);
val.ieee.mantissa0 = virRandomBits(32);
return val.d - 1.0;
}
int virRandomInt(int max) {
return virRandom() * max;
}
Guess it depends whether ieee754.h is portable ? Or can we in fact just
define that union ourselves ?
union ieee754_double
{
double d;
/* This is the IEEE 754 double-precision format. */
struct
{
#if __BYTE_ORDER == __BIG_ENDIAN
unsigned int negative:1;
unsigned int exponent:11;
/* Together these comprise the mantissa. */
unsigned int mantissa0:20;
unsigned int mantissa1:32;
#endif /* Big endian. */
#if __BYTE_ORDER == __LITTLE_ENDIAN
# if __FLOAT_WORD_ORDER == __BIG_ENDIAN
unsigned int mantissa0:20;
unsigned int exponent:11;
unsigned int negative:1;
unsigned int mantissa1:32;
# else
/* Together these comprise the mantissa. */
unsigned int mantissa1:32;
unsigned int mantissa0:20;
unsigned int exponent:11;
unsigned int negative:1;
# endif
#endif /* Little endian. */
} ieee;
};
Regards,
Daniel
--
|:
http://berrange.com -o-
http://www.flickr.com/photos/dberrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|:
http://entangle-photo.org -o-
http://live.gnome.org/gtk-vnc :|