Re: [libvirt] [Qemu-devel] QMP; unsigned 64-bit ints; JSON standards compliance

13 May 2019

On Wed, May 08, 2019 at 02:44:07PM +0200, Markus Armbruster wrote:
...
Daniel P. Berrangé <berrange@redhat.com> writes:
...
On Tue, May 07, 2019 at 10:47:06AM +0200, Markus Armbruster wrote:
...
...
...
...
I can think of some options:
1. Encode unsigned 64-bit integers as signed 64-bit integers.
     This follows the example that most C libraries map JSON ints
     to 'long long int'. This is still relying on undefined
     behaviour as apps don't need to support > 2^53-1.
     Apps would need to cast back to 'unsigned long long' for
     those QMP fields they know are supposed to be unsigned.
Ugly.  It's also what we did until v2.10, August 2017.  QMP's input
direction still does it, for backward compatibility.
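
To make the cast in option 1 concrete, here is a minimal Python sketch. The function names are made up for illustration and nothing here is QEMU or libvirt code; it only shows the two's-complement reinterpretation a client would have to apply to fields it knows are unsigned.

```python
MASK64 = (1 << 64) - 1

def u64_to_wire(value: int) -> int:
    """Reinterpret a uint64 as the int64 with the same bit pattern."""
    if not 0 <= value <= MASK64:
        raise ValueError("not an unsigned 64-bit value")
    # Values with the top bit set wrap around to negative numbers.
    return value - (1 << 64) if value >= (1 << 63) else value

def wire_to_u64(value: int) -> int:
    """Undo u64_to_wire() for a field known to be unsigned."""
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("not a signed 64-bit value")
    return value & MASK64
```

With this mapping 2^64-1 travels as -1, and the client has to know from the schema that the field is unsigned before converting back.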
...
...
...
2. Encode all 64-bit integers as a pair of 32-bit integers.
     This is fully compliant with the JSON spec as each half
     is fully within the declared limits. App has to split or
     assemble the 2 pieces from/to a signed/unsigned 64-bit
     int as needed.
Differently ugly.
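
A rough Python sketch of the split/assemble step in option 2, for the unsigned case only. The "hi"/"lo" key names are an assumption for illustration; the thread does not define a wire layout.

```python
import json

def split_u64(value: int) -> dict:
    """Split a uint64 into two 32-bit halves (hypothetical hi/lo layout)."""
    if not 0 <= value < (1 << 64):
        raise ValueError("not an unsigned 64-bit value")
    return {"hi": value >> 32, "lo": value & 0xFFFFFFFF}

def join_u64(pair: dict) -> int:
    """Reassemble the uint64 from its two halves."""
    hi, lo = pair["hi"], pair["lo"]
    for half in (hi, lo):
        if not 0 <= half <= 0xFFFFFFFF:
            raise ValueError("half out of 32-bit range")
    return (hi << 32) | lo

# Each half stays well inside what a double can represent exactly.
wire = json.dumps(split_u64(0xFFFF800000001000))
roundtrip = join_u64(json.loads(wire))
```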
...
...
...
3. Encode all 64-bit integers as strings
     The application has to do all parsing/formatting client
     side.
Yet another ugly.
...
...
...
None of these changes are backwards compatible, so I doubt we could make
the change transparently in QMP.  Instead we would have to have a
QMP greeting message capability where the client can request enablement
of the enhanced integer handling.
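
As a sketch of how such a negotiation could look from the client side: the QMP greeting carries a "capabilities" list, and the client answers with qmp_capabilities. The capability name "int64-as-string" below is purely hypothetical and does not exist in QEMU.

```python
import json

HYPOTHETICAL_CAP = "int64-as-string"  # made-up name, not a real QMP capability

def build_negotiation(greeting_text: str) -> str:
    """Build the qmp_capabilities command for a given server greeting."""
    greeting = json.loads(greeting_text)
    offered = greeting["QMP"]["capabilities"]
    # Only request the capability if the server advertised it.
    enable = [HYPOTHETICAL_CAP] if HYPOTHETICAL_CAP in offered else []
    return json.dumps({"execute": "qmp_capabilities",
                       "arguments": {"enable": enable}})
```

A server that does not advertise the capability would simply get an empty "enable" list, and the client would keep the existing integer handling.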
We might be able to do option 1 without capability negotiation.  v2.10's
change from option 1 to what we have now produced zero complaints.
On the other hand, we made that change for a reason, so we may want a
"send large integers as negative integers" capability regardless.
...
...
...
Any of the three options above would likely work for libvirt, but I
would have a slight preference for either 2 or 3, so that we become
100% standards compliant.
There's no such thing.  You mean "we maximize interoperability with
common implementations of JSON".
s/common/any/
info: error correction applied, future applications will be silent ;-P
...
...
Let's talk implementation for a bit.
Encoding and decoding integers in funny ways should be fairly easy in
the QObject visitors.  The generated QMP marshallers all use them.
Trouble is a few commands still bypass the generated marshallers, and
mess with the QObject themselves:
* query-qmp-schema: minor hack explained in qmp_query_qmp_schema()'s
  comment.  Should be harmless.
* netdev_add: not QAPIfied.  Eric's patches to QAPIfy it got stuck
  because they reject some abuses like passing numbers and bools as
  strings.
* device_add: not QAPIfied.  We're not sure QAPIfication is feasible.
netdev_add and device_add both use qemu_opts_from_qdict().  Perhaps we
could hack that to mirror what the QObject visitors do.
Else, we might have to do it in the JSON parser.  Should be possible,
but I'd rather not.
...
...
My preference would be 3 with the strings defined as being
%x lower case hex formatted with a 0x prefix and no longer than 18 characters
("0x" + 16 nybbles). Zero padding allowed but not required.
It's readable and unambiguous when dealing with addresses; I don't want
to have to start decoding (2) by hand when debugging.
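
The hex string format described above could be checked and converted roughly like this in Python. It is only a sketch of the stated rules ("0x" prefix, lower case, at most 16 nybbles, optional zero padding); the function names are invented for illustration.

```python
import re

# "0x" followed by 1 to 16 lower-case hex digits, 18 characters at most.
_HEX_U64 = re.compile(r"0x[0-9a-f]{1,16}")

def encode_hex_u64(value: int) -> str:
    """Format a uint64 as a lower-case hex string without zero padding."""
    if not 0 <= value < (1 << 64):
        raise ValueError("not an unsigned 64-bit value")
    return f"{value:#x}"

def decode_hex_u64(text: str) -> int:
    """Parse the string form, accepting optional zero padding."""
    if not _HEX_U64.fullmatch(text):
        raise ValueError(f"not a valid hex uint64: {text!r}")
    return int(text, 16)
```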
Yep, that's a good point about readability.
QMP sending all integers in decimal is inconvenient for some values,
such as addresses.  QMP sending all (large) integers in hexadecimal
would be inconvenient for other values.
Let's keep it simple & stupid.  If you want sophistication, JSON is the
wrong choice.
Option 1 feels simplest.
But will still fail with any JSON impl that uses double precision floating
point for integers as it will lose precision.
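
The precision loss is easy to reproduce. The Python sketch below emulates a JSON implementation that stores every number as a double by passing parse_int=float to the standard json module; the "addr" key is just an example name.

```python
import json

def loads_with_double_ints(text: str):
    """Emulate a JSON parser that keeps all numbers as doubles."""
    return json.loads(text, parse_int=float)

exact = 2**53 + 1  # first integer a double cannot represent
decoded = loads_with_double_ints(json.dumps({"addr": exact}))
lost = int(decoded["addr"]) != exact  # True: the value came back as 2**53
```

The same happens to large values sent as negative integers under option 1, since anything outside -(2^53-1) .. 2^53-1 is at risk.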
...
Option 2 feels ugliest.  Less simple, more interoperable than option 1.
If we assume any JSON impl can do 32-bit integers without loss of
precision, then I think we can say it is guaranteed portable, but
it is certainly horrible / ugly.
...
Option 3 is like option 2, just not quite as ugly.
I think option 3 can be guaranteed to be loss-less with /any/ JSON impl
that exists, since you're delegating all string -> int conversion to
the application code taking the JSON parser/formatter out of the equation.
Double-checking: do you propose to encode *all* numbers as strings, or
just certain "problematic" numbers?
If the latter, I guess your idea of "problematic" is "not representable
exactly as double precision floating-point".
We have a few options

 1. Use string format for values > 2^53-1, int format below that
 2. Use string format for all fields which are 64-bit ints whether
    signed or unsigned
 3. Use string format for all fields which are integers, even 32-bit
    ones

I would probably suggest option 2. It would make the QEMU impl quite
easy IIUC, we'd just change the QAPI visitor's impl for the int64
and uint64 fields to use string format (when the right capability is
negotiated by QMP).

I include 3 only for completeness - I don't think there's a hugely
compelling reason to mess with 32-bit ints.

Option 1 is the bare minimum needed to ensure precision, but to me
it feels a bit dirty to say a given field will have different encoding
depending on the value. If apps need to deal with string encoding, they
might as well just use it for all values in a given field.
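
A small Python sketch of what option 2 would mean for a client: every field the schema declares as a 64-bit integer is a string on the wire, regardless of its value. The field names, the decimal string form and the flag name are assumptions for illustration only.

```python
import json

INT64_FIELDS = {"offset", "size"}  # hypothetical 64-bit fields from a schema

def encode_reply(reply: dict, ints_as_strings: bool) -> str:
    """Serialize a reply, sending 64-bit fields as decimal strings if asked."""
    out = {}
    for key, val in reply.items():
        if ints_as_strings and key in INT64_FIELDS:
            out[key] = str(val)
        else:
            out[key] = val
    return json.dumps(out, sort_keys=True)

def decode_reply(text: str, ints_as_strings: bool) -> dict:
    """Parse a reply and convert the string-encoded fields back to ints."""
    data = json.loads(text)
    if ints_as_strings:
        for key in INT64_FIELDS:
            if key in data:
                data[key] = int(data[key])
    return data
```

Because the encoding depends only on the field's declared type, the client never has to branch on the value, which is the point made against option 1 above.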
...
...
I guess I'd have a preference for option 3 given that it has better
interoperability.
If we add a QMP capability for interoperability with JSON
implementations that set limits on range and precision that are
incompatible with the ones QMP sets, one could argue we effectively pay
the price for option 3, so we should take it for its benefits.
Option 1 without a QMP capability merely reverts the change we made in
2.10.  We can do that if we think it's sufficient.
You expressed a strong preference for maximizing interoperability (via
option 3).  Acknowledged.  However, I care a lot more about issues we
know we have than about issues somebody might have.
You mentioned libvirt's switch to Jansson that you had to abort due to
QMP sending numbers Jansson refuses to parse.  That's the kind of
non-hypothetical issue that can make me mess with the QMP language.
You wrote Jansson "raises a fatal parse error for unsigned 64-bit values
above 2^63-1".  Does that mean it rejects 9223372036854775808, but
accepts 9223372036854775808.0 (with loss of precision)?
If it sees a '.' in the number, then it calls strtod() and checks for
the overflow conditions.

If it doesn't see a '.' in the number then it calls strtoll() and checks
for the overflow conditions.

So to answer your question, yes, it looks like it will reject
9223372036854775808 and accept 9223372036854775808.0 with loss of
precision.
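
For illustration, the behaviour described above can be emulated in a few lines of Python. This is not Jansson's code, only a sketch of the '.' test and the two overflow checks as described here; exponent forms such as 1e5 are ignored.

```python
import math

LLONG_MIN, LLONG_MAX = -(1 << 63), (1 << 63) - 1

def parse_number(token: str):
    """Mimic the described logic: reals via strtod(), integers via strtoll()."""
    if "." in token:
        value = float(token)
        if math.isinf(value):
            raise OverflowError("real number overflow")
        return value
    value = int(token)
    if not LLONG_MIN <= value <= LLONG_MAX:
        raise OverflowError("too big integer")
    return value
```

With this, "9223372036854775808" raises an error while "9223372036854775808.0" is accepted as a double.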

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|