On Tue, Apr 30, 2019 at 03:45:46PM +0100, Dr. David Alan Gilbert wrote:
* Daniel P. Berrangé (berrange(a)redhat.com) wrote:
> The QEMU QMP service is based on JSON which is nice because that is a
> widely supported "standard" data format.....
>
> ....except QEMU's implementation (and indeed most impls) are not strictly
> standards compliant.
>
> Specifically the problem is around representing 64-bit integers, whether
> signed or unsigned.
>
> The JSON standard declares that the largest integer is 2^53-1 and
> likewise the smallest is -(2^53-1):
>
> http://www.ecma-international.org/ecma-262/6.0/index.html#sec-number.max_...
>
> A crazy limit inherited from its javascript origins IIUC.
Ewwww.
Looking a bit deeper it seems this limit comes from the use of double
precision floating point for storing integers. 2^53-1 is the largest
integer value that can be stored in a 64-bit float without loss of
precision.
The Golang JSON parser decodes JSON numbers to float64 by default so
will have this precision limitation too, though at least they provide
a backdoor for custom parsing from the original serialized representation.
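
To make the precision limit concrete, here is a small Python sketch (not
taken from QEMU or libvirt). It checks whether an integer survives a round
trip through a 64-bit float, then mimics a decoder that stores every JSON
number as a double by passing parse_int=float to the standard json module.
The "addr" key is a made-up name used only for illustration.

```python
import json


def survives_float64(n: int) -> bool:
    """Return True if n is unchanged after a round trip through a 64-bit float."""
    return int(float(n)) == n


# 2^53-1 is exactly representable as a double.
print(survives_float64(2**53 - 1))   # True
# 2^53+1 is the first positive integer that is not.
print(survives_float64(2**53 + 1))   # False
# The largest unsigned 64-bit value is rounded up to 2^64.
print(survives_float64(2**64 - 1))   # False

# Mimic a parser that decodes all JSON numbers to float64.
# "addr" is a hypothetical field name, not a real QMP field.
decoded = json.loads('{"addr": 18446744073709551615}', parse_int=float)
print(int(decoded["addr"]) == 2**64 - 1)   # False: the value came back as 2^64
```

Python's own json module keeps arbitrary-precision integers by default, so
the parse_int=float override is only there to imitate the float64 behaviour
described above.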
> QEMU, and indeed many applications, want to handle 64-bit integers.
> The C JSON library impls have traditionally mapped integers to the
> data type 'long long int' which gives a min/max of -(2^63) / 2^63-1.
>
> QEMU however /really/ needs 64-bit unsigned integers, ie a max 2^64-1.
>
> Libvirt has historically used the YAJL library which uses 'long long int'
> and thus can't officially go beyond 2^63-1 values. Fortunately it lets
> libvirt get at the raw json string, so libvirt can re-parse the value
> to get an 'unsigned long long'.
>
> We recently tried to switch to Jansson because YAJL has had a dead upstream
> for many years, with countless unanswered bugs & patches. Unfortunately we
> forgot about this need for 2^64-1 max, and Jansson also uses 'long long int'
> and raises a fatal parse error for unsigned 64-bit values above 2^63-1. It
> also provides no backdoor for libvirt to do its own integer parsing. Thus
> we had to abort our switch to Jansson as it broke parsing QEMU's JSON:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1614569
>
> Other JSON libraries we've investigated have similar problems. I imagine
> the same may well be true of non-C based JSON impls, though I've not
> investigated in any detail.
>
> Essentially libvirt is stuck with either using the dead YAJL library
> forever, or writing its own JSON parser (most likely copying QEMU's
> JSON code into libvirt's git).
>
> This feels like a very unappealing situation to be in as not being
> able to use a JSON library of our choice is losing one of the key
> benefits of using a standard data format.
>
> Thus I'd like to see a solution to this to allow QMP to be reliably
> consumed by any JSON library that exists.
>
> I can think of some options:
>
> 1. Encode unsigned 64-bit integers as signed 64-bit integers.
>
> This follows the example that most C libraries map JSON ints
> to 'long long int'. This is still relying on undefined
> behaviour as apps don't need to support > 2^53-1.
>
> Apps would need to cast back to 'unsigned long long' for
> those QMP fields they know are supposed to be unsigned.
>
>
> 2. Encode all 64-bit integers as a pair of 32-bit integers.
>
> This is fully compliant with the JSON spec as each half
> is fully within the declared limits. App has to split or
> assemble the 2 pieces from/to a signed/unsigned 64-bit
> int as needed.
>
>
> 3. Encode all 64-bit integers as strings
>
> The application has to do all parsing/formatting client
> side.
>
>
> None of these changes are backwards compatible, so I doubt we could make
> the change transparently in QMP. Instead we would have to have a
> QMP greeting message capability where the client can request enablement
> of the enhanced integer handling.
>
> Any of the three options above would likely work for libvirt, but I
> would have a slight preference for either 2 or 3, so that we become
> 100% standards compliant.
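
As a rough illustration of the three options listed above, the Python sketch
below encodes and decodes the same unsigned 64-bit value each way. Nothing
here is an agreed QMP format: the "hi"/"lo" key names for option 2 and the
decimal string form for option 3 are assumptions made for the example.

```python
import json

U64_MAX = (1 << 64) - 1


# Option 1: send the unsigned value reinterpreted as a signed 64-bit integer.
def u64_to_s64(v: int) -> int:
    return v - (1 << 64) if v >= (1 << 63) else v


def s64_to_u64(v: int) -> int:
    # The client "casts back" for fields it knows are unsigned.
    return v & U64_MAX


# Option 2: split into two 32-bit halves (key names are hypothetical).
def u64_to_pair(v: int) -> dict:
    return {"hi": v >> 32, "lo": v & 0xFFFFFFFF}


def pair_to_u64(p: dict) -> int:
    return (p["hi"] << 32) | p["lo"]


# Option 3: send the value as a string (decimal is assumed here).
def u64_to_str(v: int) -> str:
    return str(v)


def str_to_u64(s: str) -> int:
    return int(s, 10)


value = U64_MAX

wire1 = json.dumps(u64_to_s64(value))    # "-1", still outside +/-(2^53-1) in general
wire2 = json.dumps(u64_to_pair(value))   # both halves fit well within 2^53-1
wire3 = json.dumps(u64_to_str(value))    # a JSON string, no numeric limit applies

assert s64_to_u64(json.loads(wire1)) == value
assert pair_to_u64(json.loads(wire2)) == value
assert str_to_u64(json.loads(wire3)) == value
```

Only options 2 and 3 keep every number on the wire within the 2^53-1 range,
which is why they are the ones that work with any JSON library.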
My preference would be 3, with the strings defined as being
%x lower case hex formatted with a 0x prefix and no longer than 18 characters
("0x" + 16 nybbles). Zero padding allowed but not required.
It's readable and unambiguous when dealing with addresses; I don't want
to have to start decoding (2) by hand when debugging.
Yep, that's a good point about readability.
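
A minimal Python sketch of the hex string format proposed above: lower case
hex, a "0x" prefix, at most 16 hex digits, zero padding accepted but not
emitted. The function names are made up for illustration and rejecting
upper case digits on input is an assumption about how strict a client
would be.

```python
import re

# "0x" followed by 1 to 16 lower case hex digits, so at most 18 characters.
_HEX_U64 = re.compile(r"0x[0-9a-f]{1,16}")


def encode_u64(v: int) -> str:
    """Format an unsigned 64-bit value as an unpadded lower case hex string."""
    if not 0 <= v <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value does not fit in 64 bits")
    return f"{v:#x}"


def decode_u64(s: str) -> int:
    """Parse a hex string in the format above, with or without zero padding."""
    if not _HEX_U64.fullmatch(s):
        raise ValueError(f"not a valid 64-bit hex string: {s!r}")
    return int(s, 16)


print(encode_u64(2**64 - 1))               # 0xffffffffffffffff
print(decode_u64("0x00000000000000ff"))    # 255
```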
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|