Re: [PATCH V1 0/6] fast qom tree get

Markus Armbruster

9 Apr 2025

7:39 a.m.

Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...

Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.
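To make the cost concrete, here is a minimal sketch of the per-property walk the quoted text describes. It is written against a hypothetical `qmp(command, arguments)` callable instead of a real monitor socket. It assumes that qom-list returns entries carrying `name` and `type`, and that child nodes have a type of the form `child<...>`. The in-memory tree and the request counter are illustrative only.

```python
def walk(qmp, path, counter):
    """Fetch every property under `path`, one qom-get per property."""
    counter["requests"] += 1                      # one qom-list per node
    props = qmp("qom-list", {"path": path})
    tree = {}
    for prop in props:
        name, typ = prop["name"], prop["type"]
        if typ.startswith("child<"):
            # Descend into the child node.
            child = path.rstrip("/") + "/" + name
            tree[name] = walk(qmp, child, counter)
        else:
            counter["requests"] += 1              # one qom-get per property
            tree[name] = qmp("qom-get", {"path": path, "property": name})
    return tree


# Hypothetical stand-in for a QMP connection, backed by a tiny fake tree.
FAKE = {
    "/": [{"name": "machine", "type": "child<container>"},
          {"name": "type", "type": "string"}],
    "/machine": [{"name": "cpu", "type": "child<device>"},
                 {"name": "type", "type": "string"}],
    "/machine/cpu": [{"name": "realized", "type": "bool"},
                     {"name": "hotplugged", "type": "bool"}],
}


def fake_qmp(command, arguments):
    if command == "qom-list":
        return FAKE[arguments["path"]]
    return "value-of-" + arguments["property"]


counter = {"requests": 0}
tree = walk(fake_qmp, "/", counter)
print(counter["requests"], "round trips for", len(FAKE), "nodes")
```

Even this three-node toy tree needs one request per node plus one per property, which is why a real tree runs into the thousands.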

"Some managers"... could you name one?

...

To reduce this cost, consider QAPI calls that fetch more information in each call:
* qom-list-get: given a path, return a list of properties and values.
* qom-list-getv: given a list of paths, return a list of properties and values for each path.
* qom-tree-get: given a path, return all descendant nodes rooted at that path, with properties and values for each.

Libvirt developers, would you be interested in any of these?

...

In all cases, a returned property is represented by ObjectPropertyValue, with fields name, type, value, and error. If an error occurs when reading a value, the value field is omitted, and the error message is returned in the error field. Thus an error for one property will not cause a bulk fetch operation to fail.

Returning errors this way is highly unusual. Observation; I'm not rejecting this out of hand. Can you elaborate a bit on why it's useful?
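For readers following along, a small sketch of what a client would have to do with such per-property errors. Only the field names (name, type, value, error) come from the quoted description. The sample entries, the property name `hypothetical-prop`, and the helper `split_results` are made up for illustration, and the actual wire format of the proposed commands is not shown in this thread.

```python
def split_results(properties):
    """Separate successfully read values from per-property errors."""
    values, errors = {}, {}
    for prop in properties:
        if "error" in prop:
            # Per the description, 'value' is omitted when 'error' is set.
            errors[prop["name"]] = prop["error"]
        else:
            values[prop["name"]] = prop.get("value")
    return values, errors


# Invented sample data in the shape described above.
sample = [
    {"name": "realized", "type": "bool", "value": True},
    {"name": "hypothetical-prop", "type": "str",
     "error": "cannot read property"},
]

values, errors = split_results(sample)
print(values)
print(errors)
```

The design consequence is that a successful reply no longer means every value was read, so the caller has to check each entry.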

...

To evaluate each method, I modified scripts/qmp/qom-tree to use the method, verified all methods produce the same output, and timed each using:

  qemu-system-x86_64 -display none \
      -chardev socket,id=monitor0,path=/tmp/vm1.sock,server=on,wait=off \
      -mon monitor0,mode=control &

  time qom-tree -s /tmp/vm1.sock > /dev/null

Cool!

...

I only measured once per method, but the variation is low after a warm up run. The 'real - user - sys' column is a proxy for QEMU CPU time.

  method               real(s)  user(s)  sys(s)  (real - user - sys)(s)
  qom-list / qom-get   2.048    0.932    0.057   1.059
  qom-list-get         0.402    0.230    0.029   0.143
  qom-list-getv        0.200    0.132    0.015   0.053
  qom-tree-get         0.143    0.123    0.012   0.008

qom-tree-get is the clear winner, reducing elapsed time by a factor of 14X, and reducing QEMU CPU time by 132X.
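The two factors can be rechecked from the table with a few lines of arithmetic. The sketch below copies the reported timings and recomputes the 'real - user - sys' proxy and both ratios; the dictionary and function names are arbitrary.

```python
# (real, user, sys) in seconds, copied from the table above.
timings = {
    "qom-list / qom-get": (2.048, 0.932, 0.057),
    "qom-list-get": (0.402, 0.230, 0.029),
    "qom-list-getv": (0.200, 0.132, 0.015),
    "qom-tree-get": (0.143, 0.123, 0.012),
}


def qemu_proxy(real, user, system):
    """Time not spent in the client: a rough proxy for QEMU CPU time."""
    return round(real - user - system, 3)


baseline = timings["qom-list / qom-get"]
best = timings["qom-tree-get"]

elapsed_speedup = baseline[0] / best[0]
cpu_speedup = qemu_proxy(*baseline) / qemu_proxy(*best)
print(f"elapsed: {elapsed_speedup:.1f}x, QEMU CPU proxy: {cpu_speedup:.1f}x")
```

This gives roughly 14.3x for elapsed time and 132.4x for the proxy, consistent with the rounded 14X and 132X.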

qom-list-getv is slower when fetching the entire tree, but can beat qom-tree-get when only a subset of the tree needs to be fetched (not shown).

qom-list-get is shown for comparison only, and is not included in this series.

If we have qom-list-getv, then qom-list-get is not worth having.

Peter Krempa

9 Apr 2025

7:58 a.m.

New subject: [PATCH V1 0/6] fast qom tree get

On Wed, Apr 09, 2025 at 09:39:02 +0200, Markus Armbruster via Devel wrote:

...

Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

libvirt is at ~500 qom-get calls during an average startup ...

...

...
To reduce this cost, consider QAPI calls that fetch more information in each call:
* qom-list-get: given a path, return a list of properties and values.
* qom-list-getv: given a list of paths, return a list of properties and values for each path.
* qom-tree-get: given a path, return all descendant nodes rooted at that path, with properties and values for each.

Libvirt developers, would you be interested in any of these?

YES!!!

The getter with value could SO MUCH optimize the startup sequence of a VM where libvirt needs to probe CPU flags:

(note the 'id' field in libvirt's monitor is sequential)

buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"realized"},"id":"libvirt-8"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"hotplugged"},"id":"libvirt-9"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"hotpluggable"},"id":"libvirt-10"}

[...]

buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"hv-apicv"},"id":"libvirt-470"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"xd"},"id":"libvirt-471"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"sse4_1"},"id":"libvirt-472"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"unavailable-features"},"id":"libvirt-473"}

First and last line's timestamps:

2025-04-08 14:44:28.882+0000: 1481190: info : qemuMonitorIOWrite:340 : QEMU_MONITOR_IO_WRITE: mon=0x7f4678048360 buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"realized"},"id":"libvirt-8"}

2025-04-08 14:44:29.149+0000: 1481190: info : qemuMonitorIOWrite:340 : QEMU_MONITOR_IO_WRITE: mon=0x7f4678048360 buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"unavailable-features"},"id":"libvirt-473"}

Libvirt spent ~170 ms probing cpu flags.
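As a rough cross-check of the log excerpt, the sketch below parses the two write timestamps shown and counts the command ids between them. It assumes that every id from libvirt-8 through libvirt-473 belongs to the probe, which the elided middle of the log cannot confirm.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f%z"
first = datetime.strptime("2025-04-08 14:44:28.882+0000", FMT)
last = datetime.strptime("2025-04-08 14:44:29.149+0000", FMT)

# Whole milliseconds between the first and the last write.
elapsed_ms = (last - first) // timedelta(milliseconds=1)

# ids libvirt-8 .. libvirt-473, assuming no gaps and no unrelated commands.
commands = 473 - 8 + 1

print(elapsed_ms, "ms between writes,", commands, "commands,",
      round(elapsed_ms / commands, 2), "ms per command")
```

The two timestamps shown are 267 ms apart, so the ~170 ms figure does not follow directly from them; the message does not say how it was obtained. Either way, the cost is on the order of half a millisecond per qom-get round trip.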

Daniel P. Berrangé

11 Apr 2025

10:11 a.m.

New subject: [PATCH V1 0/6] fast qom tree get

On Wed, Apr 09, 2025 at 09:58:13AM +0200, Peter Krempa via Devel wrote:

...

On Wed, Apr 09, 2025 at 09:39:02 +0200, Markus Armbruster via Devel wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

libvirt is at ~500 qom-get calls during an average startup ...

...
...
To reduce this cost, consider QAPI calls that fetch more information in each call:
* qom-list-get: given a path, return a list of properties and values.
* qom-list-getv: given a list of paths, return a list of properties and values for each path.
* qom-tree-get: given a path, return all descendant nodes rooted at that path, with properties and values for each.

Libvirt developers, would you be interested in any of these?

YES!!!

Not necessarily, see below... !!!!

...

The getter with value could SO MUCH optimize the startup sequence of a VM where libvirt needs to probe CPU flags:

(note the 'id' field in libvirt's monitor is sequential)

buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"realized"},"id":"libvirt-8"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"hotplugged"},"id":"libvirt-9"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"hotpluggable"},"id":"libvirt-10"}

[...]

buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"hv-apicv"},"id":"libvirt-470"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"xd"},"id":"libvirt-471"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"sse4_1"},"id":"libvirt-472"}
buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"unavailable-features"},"id":"libvirt-473"}

First and last line's timestamps:

2025-04-08 14:44:28.882+0000: 1481190: info : qemuMonitorIOWrite:340 : QEMU_MONITOR_IO_WRITE: mon=0x7f4678048360 buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"realized"},"id":"libvirt-8"}

2025-04-08 14:44:29.149+0000: 1481190: info : qemuMonitorIOWrite:340 : QEMU_MONITOR_IO_WRITE: mon=0x7f4678048360 buf={"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]","property":"unavailable-features"},"id":"libvirt-473"}

Libvirt spent ~170 ms probing cpu flags.

One thing I would point out is that qom-get can be considered an "escape hatch" to get information when no better QMP command exists. In this case, libvirt has made the assumption that every CPU feature is a QOM property.

Adding qom-list-get doesn't appreciably change that, just makes the usage more efficient.

Considering the bigger picture QMP design, when libvirt is trying to understand QEMU's CPU feature flag expansion, I would ask why we don't have something like a "query-cpu" command to tell us the current CPU expansion, avoiding the need for poking at QOM properties directly.

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Markus Armbruster

10:40 a.m.

New subject: Management applications and CPU feature flags (was: [PATCH V1 0/6] fast qom tree get)

Daniel P. Berrangé <berrange@redhat.com> writes:

...

One thing I would point out is that qom-get can be considered an "escape hatch" to get information when no better QMP command exists. In this case, libvirt has made the assumption that every CPU feature is a QOM property.

Adding qom-list-get doesn't appreciably change that, just makes the usage more efficient.

Considering the bigger picture QMP design, when libvirt is trying to understand QEMU's CPU feature flag expansion, I would ask why we don't have something like a "query-cpu" command to tell us the current CPU expansion, avoiding the need for poking at QOM properties directly.

How do the existing query-cpu-FOO fall short of what management applications such as libvirt need?

Daniel P. Berrangé

10:43 a.m.

New subject: Management applications and CPU feature flags (was: [PATCH V1 0/6] fast qom tree get)

On Fri, Apr 11, 2025 at 12:40:46PM +0200, Markus Armbruster wrote:

...

Daniel P. Berrangé <berrange@redhat.com> writes:

...

One thing I would point out is that qom-get can be considered an "escape hatch" to get information when no better QMP command exists. In this case, libvirt has made the assumption that every CPU feature is a QOM property.

Adding qom-list-get doesn't appreciably change that, just makes the usage more efficient.

Considering the bigger picture QMP design, when libvirt is trying to understand QEMU's CPU feature flag expansion, I would ask why we don't have something like a "query-cpu" command to tell us the current CPU expansion, avoiding the need for poking at QOM properties directly.

How do the existing query-cpu-FOO fall short of what management applications such as libvirt need?

It has been a long while since I looked at them, but IIRC they were returning static info about CPU models, whereas libvirt wanted info on the currently requested '-cpu ARGS'.

With regards,
Daniel

Markus Armbruster

11:43 a.m.

New subject: Management applications and CPU feature flags

Daniel P. Berrangé <berrange@redhat.com> writes:

...

On Fri, Apr 11, 2025 at 12:40:46PM +0200, Markus Armbruster wrote:

...
Daniel P. Berrangé <berrange@redhat.com> writes:

...

One thing I would point out is that qom-get can be considered an "escape hatch" to get information when no better QMP command exists. In this case, libvirt has made the assumption that every CPU feature is a QOM property.

Adding qom-list-get doesn't appreciably change that, just makes the usage more efficient.

Considering the bigger picture QMP design, when libvirt is trying to understand QEMU's CPU feature flag expansion, I would ask why we don't have something like a "query-cpu" command to tell us the current CPU expansion, avoiding the need for poking at QOM properties directly.

How do the existing query-cpu-FOO fall short of what management applications such as libvirt need?

It has been a long while since I looked at them, but IIRC they were returning static info about CPU models, whereas libvirt wanted info on the currently requested '-cpu ARGS'

Libvirt developers, please work with us on design of new commands or improvements to existing ones to better meet libvirt's needs in this area.

David Hildenbrand

noon

New subject: Management applications and CPU feature flags

On 11.04.25 13:43, Markus Armbruster wrote:

...

Daniel P. Berrangé <berrange@redhat.com> writes:

...
On Fri, Apr 11, 2025 at 12:40:46PM +0200, Markus Armbruster wrote:

...
Daniel P. Berrangé <berrange@redhat.com> writes:

...

One thing I would point out is that qom-get can be considered an "escape hatch" to get information when no better QMP command exists. In this case, libvirt has made the assumption that every CPU feature is a QOM property.

Adding qom-list-get doesn't appreciably change that, just makes the usage more efficient.

Considering the bigger picture QMP design, when libvirt is trying to understand QEMU's CPU feature flag expansion, I would ask why we don't have something like a "query-cpu" command to tell us the current CPU expansion, avoiding the need for poking at QOM properties directly.

How do the existing query-cpu-FOO fall short of what management applications such as libvirt need?

It has been a long while since I looked at them, but IIRC they were returning static info about CPU models, whereas libvirt wanted info on the currently requested '-cpu ARGS'

Not sure about the exact requirements and other archs, but at least on s390x I think that's exactly what we do. If you expand a non-static model (e.g., z14) you'd get the expansion as if you would specify "-cpu z14" on the cmdline for a specific QEMU machine.

Looking at CPU properties is really a nasty hack.

...

Libvirt developers, please work with us on design of new commands or improvements to existing ones to better meet libvirt's needs in this area.

Yes, knowing about requirements and why the existing APIs don't work would be great.

-- 
Cheers,

David / dhildenb
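For reference, a sketch of the kind of request the s390x remark above refers to. query-cpu-model-expansion and its `type` and `model` arguments are part of QEMU's QMP schema; the helper name `expansion_request` is invented here, and which expansion types and models are accepted depends on the target and accelerator.

```python
import json


def expansion_request(model_name, expansion_type="static", props=None):
    """Build a query-cpu-model-expansion request for one CPU model."""
    model = {"name": model_name}
    if props:
        # Optional per-feature overrides, e.g. {"some-feature": False}.
        model["props"] = props
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": expansion_type, "model": model},
    }


request = expansion_request("z14")
print(json.dumps(request))
```

This probes a model by name before or without a running guest CPU, which is the distinction the following message draws against inspecting the instantiated CPU.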

Jiri Denemark

1:23 p.m.

New subject: Management applications and CPU feature flags

On Fri, Apr 11, 2025 at 13:43:39 +0200, Markus Armbruster wrote:

...

Daniel P. Berrangé <berrange@redhat.com> writes:

...
On Fri, Apr 11, 2025 at 12:40:46PM +0200, Markus Armbruster wrote:

...
Daniel P. Berrangé <berrange@redhat.com> writes:

...
Considering the bigger picture QMP design, when libvirt is trying to understand QEMU's CPU feature flag expansion, I would ask why we don't have something like a "query-cpu" command to tell us the current CPU expansion, avoiding the need for poking at QOM properties directly.

How do the existing query-cpu-FOO fall short of what management applications such as libvirt need?

It has been a long while since I looked at them, but IIRC they were returning static info about CPU models, whereas libvirt wanted info on the currently requested '-cpu ARGS'

Libvirt developers, please work with us on design of new commands or improvements to existing ones to better meet libvirt's needs in this area.

The existing commands (query-cpu-definitions, query-cpu-model-expansion) are useful for probing before starting a domain. But what we use qom-get for is to get a view of the currently instantiated virtual CPU created by QEMU according to -cpu when we're starting a domain. In other words, we start QEMU with -S and before starting vCPUs we need to know exactly what features were enabled and if any feature we requested was disabled by QEMU. Currently we query QOM for CPU properties as that's what we were advised to use ages ago.

The reason behind querying such info is ensuring stable guest ABI during migration. Asking QEMU for a specific CPU model and features does not mean we'll get exactly what we asked for (this is not a bug) so we need to record the differences so that we can start QEMU for incoming migration with a CPU matching exactly the one provided on the source.

As Peter said, the current way is terribly inefficient as it requires several hundreds of QMP commands so the goal is to have a single QMP command that would tell us all we need to know about the virtual CPU. That is all enabled features and all features that could not be enabled even though we asked for them.

Jirka
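The bookkeeping described above amounts to a diff between what was requested and what QEMU instantiated. A minimal sketch follows, assuming both sides are already available as name-to-bool maps. The feature values are invented, and the function is not libvirt's actual code.

```python
def cpu_feature_diff(requested, actual):
    """Compare requested CPU features with the instantiated CPU.

    Both arguments map feature name -> bool (enabled or not).
    """
    disabled = sorted(name for name, on in requested.items()
                      if on and not actual.get(name, False))
    extra = sorted(name for name, on in actual.items()
                   if on and not requested.get(name, False))
    return {"requested_but_disabled": disabled,
            "enabled_but_not_requested": extra}


# Invented example values; the names echo properties seen in the log above.
requested = {"sse4_1": True, "xd": True, "hv-apicv": True}
actual = {"sse4_1": True, "xd": True, "hv-apicv": False, "hypervisor": True}

diff = cpu_feature_diff(requested, actual)
print(diff)
```

Recording both lists is what lets the destination of a migration be started with a CPU that matches the source exactly.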

Cornelia Huck

1:58 p.m.

New subject: Management applications and CPU feature flags

On Fri, Apr 11 2025, Jiri Denemark <jdenemar@redhat.com> wrote:

...

On Fri, Apr 11, 2025 at 13:43:39 +0200, Markus Armbruster wrote:

...
Daniel P. Berrangé <berrange@redhat.com> writes:

...
On Fri, Apr 11, 2025 at 12:40:46PM +0200, Markus Armbruster wrote:

...
Daniel P. Berrangé <berrange@redhat.com> writes:

...
Considering the bigger picture QMP design, when libvirt is trying to understand QEMU's CPU feature flag expansion, I would ask why we don't have something like a "query-cpu" command to tell us the current CPU expansion, avoiding the need for poking at QOM properties directly.

How do the existing query-cpu-FOO fall short of what management applications such as libvirt need?

It has been a long while since I looked at them, but IIRC they were returning static info about CPU models, whereas libvirt wanted info on the currently requested '-cpu ARGS'

Libvirt developers, please work with us on design of new commands or improvements to existing ones to better meet libvirt's needs in this area.

The existing commands (query-cpu-definitions, query-cpu-model-expansion) are useful for probing before starting a domain. But what we use qom-get for is to get a view of the currently instantiated virtual CPU created by QEMU according to -cpu when we're starting a domain. In other words, we start QEMU with -S and before starting vCPUs we need to know exactly what features were enabled and if any feature we requested was disabled by QEMU. Currently we query QOM for CPU properties as that's what we were advised to use ages ago.

The reason behind querying such info is ensuring stable guest ABI during migration. Asking QEMU for a specific CPU model and features does not mean we'll get exactly what we asked for (this is not a bug) so we need to record the differences so that we can start QEMU for incoming migration with a CPU matching exactly the one provided on the source.

As Peter said, the current way is terribly inefficient as it requires several hundreds of QMP commands so the goal is to have a single QMP command that would tell us all we need to know about the virtual CPU. That is all enabled features and all features that could not be enabled even though we asked for them.

Wandering in here from the still-very-much-in-progress Arm perspective (current but not yet posted QEMU code at https://gitlab.com/cohuck/qemu/-/tree/arm-cpu-model-rfcv3?ref_type=heads):

We're currently operating at the "writable ID register fields" level with the idea of providing features (FEAT_xxx) as an extra layer on top (as they model a subset of what we actually need) and have yet to come up with a good way to do named models for KVM. The query-cpu-model-expansion command will yield a list of all writable ID register fields and their values (as for now, for the 'host' model.) IIUC you want to query (a) what is actually available for configuration (before starting a domain) and (b) what you actually got (when starting a domain).

Would a dump of the current state of the ID register fields before starting the vcpus work for (b)? Or is that too different from what other archs need/want?

How much wriggle room do we have for special handling (different commands, different output, ...?)

Jiří Denemark

15 Apr 2025

11:33 a.m.

New subject: Management applications and CPU feature flags

On Fri, Apr 11, 2025 at 15:58:54 +0200, Cornelia Huck wrote:

...

On Fri, Apr 11 2025, Jiri Denemark <jdenemar@redhat.com> wrote:

...
On Fri, Apr 11, 2025 at 13:43:39 +0200, Markus Armbruster wrote:

...
Daniel P. Berrangé <berrange@redhat.com> writes:

...
On Fri, Apr 11, 2025 at 12:40:46PM +0200, Markus Armbruster wrote:

...
Daniel P. Berrangé <berrange@redhat.com> writes:

...
Considering the bigger picture QMP design, when libvirt is trying to understand QEMU's CPU feature flag expansion, I would ask why we don't have something like a "query-cpu" command to tell us the current CPU expansion, avoiding the need for poking at QOM properties directly.

How do the existing query-cpu-FOO fall short of what management applications such as libvirt need?

It has been a long while since I looked at them, but IIRC they were returning static info about CPU models, whereas libvirt wanted info on the currently requested '-cpu ARGS'

Libvirt developers, please work with us on design of new commands or improvements to existing ones to better meet libvirt's needs in this area.

The existing commands (query-cpu-definitions, query-cpu-model-expansion) are useful for probing before starting a domain. But what we use qom-get for is to get a view of the currently instantiated virtual CPU created by QEMU according to -cpu when we're starting a domain. In other words, we start QEMU with -S and before starting vCPUs we need to know exactly what features were enabled and if any feature we requested was disabled by QEMU. Currently we query QOM for CPU properties as that's what we were advised to use ages ago.

The reason behind querying such info is ensuring stable guest ABI during migration. Asking QEMU for a specific CPU model and features does not mean we'll get exactly what we asked for (this is not a bug) so we need to record the differences so that we can start QEMU for incoming migration with a CPU matching exactly the one provided on the source.

As Peter said, the current way is terribly inefficient as it requires several hundreds of QMP commands so the goal is to have a single QMP command that would tell us all we need to know about the virtual CPU. That is all enabled features and all features that could not be enabled even though we asked for them.
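The comparison described above (what was requested versus what the instantiated vCPU actually has) can be sketched as follows. This is only an illustration: the function name, the feature names, and the dict-of-booleans data shape are all assumptions made for the example, not libvirt or QEMU code.

```python
# Hypothetical sketch: names and data shapes are assumptions,
# not libvirt or QEMU code.

def diff_cpu_features(requested, actual):
    """Compare requested CPU features against what the vCPU actually has.

    Both arguments map a feature name to True (enabled) or False (disabled).
    Returns (missing, extra): features requested as enabled that are not
    enabled, and features that are enabled without having been requested.
    """
    missing = sorted(
        name for name, wanted in requested.items()
        if wanted and not actual.get(name, False)
    )
    extra = sorted(
        name for name, enabled in actual.items()
        if enabled and not requested.get(name, False)
    )
    return missing, extra


# Made-up feature names and values, for illustration only.
requested = {"feat-a": True, "feat-b": True, "feat-c": False}
actual = {"feat-a": True, "feat-b": False, "feat-c": False, "feat-d": True}

missing, extra = diff_cpu_features(requested, actual)
print("requested but not enabled:", missing)
print("enabled but not requested:", extra)
```

Recording both lists is what would let the destination of a migration be started with a CPU matching the source; how the "actual" side is obtained in one QMP command is exactly the open question in this thread.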

Wandering in here from the still-very-much-in-progress Arm perspective (current but not yet posted QEMU code at https://gitlab.com/cohuck/qemu/-/tree/arm-cpu-model-rfcv3?ref_type=heads):

We're currently operating at the "writable ID register fields" level with the idea of providing features (FEAT_xxx) as an extra layer on top (as they model a subset of what we actually need) and have yet to come up with a good way to do named models for KVM. The query-cpu-model-expansion command will yield a list of all writable ID register fields and their values (as for now, for the 'host' model.) IIUC you want to query (a) what is actually available for configuration (before starting a domain) and (b) what you actually got (when starting a domain).

I guess it will be possible for QEMU to actually set something different from what we tell it to do (for example, a dependency of a specific setting on something else which was not set, etc.)? If so, we indeed need both (a) and (b).

...

Would a dump of the current state of the ID register fields before starting the vcpus work for (b)?

I guess so. Originally for x86_64 we got a dump of CPUID data, but that changed when some features started to be described by MSRs.

...

Or is that too different from what other archs need/want?

Each arch has some specifics in CPU configuration and the way we talk with QEMU about it. So having the same QMP interface is not a requirement. It depends how well the existing interface maps to details that need to be expressed. That said, a common interface is better if it makes sense.

Jirka

Steven Sistare

9 Apr

12:42 p.m.

New subject: [PATCH V1 0/6] fast qom tree get

On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...

Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

...

...
To reduce this cost, consider QAPI calls that fetch more information in each call:

* qom-list-get: given a path, return a list of properties and values.
* qom-list-getv: given a list of paths, return a list of properties and values for each path.
* qom-tree-get: given a path, return all descendant nodes rooted at that path, with properties and values for each.
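To see why fetching more per call matters, here is a toy in-memory model that counts requests for each approach. It does not talk to QEMU: the method names mirror the commands discussed here, but the signatures, return shapes, and the tree contents are assumptions made for this sketch.

```python
# Toy in-memory model. Method names mirror the commands discussed above, but
# the signatures and return shapes are assumptions made for this sketch.

# path -> (properties, child names); all contents are made up.
TREE = {
    "/machine": ({"type": "toy-machine"}, ["cpu", "bus"]),
    "/machine/cpu": ({"type": "toy-cpu", "id": 0}, []),
    "/machine/bus": ({"type": "toy-bus"}, ["dev0", "dev1"]),
    "/machine/bus/dev0": ({"type": "toy-dev"}, []),
    "/machine/bus/dev1": ({"type": "toy-dev"}, []),
}


class ToyQom:
    """Answers queries from TREE and counts how many requests it served."""

    def __init__(self, tree):
        self.tree = tree
        self.requests = 0

    def _children(self, path):
        return [f"{path}/{name}" for name in self.tree[path][1]]

    def qom_list(self, path):
        self.requests += 1
        return list(self.tree[path][0]), self._children(path)

    def qom_get(self, path, prop):
        self.requests += 1
        return self.tree[path][0][prop]

    def qom_list_get(self, path):
        self.requests += 1
        return dict(self.tree[path][0]), self._children(path)

    def qom_list_getv(self, paths):
        self.requests += 1
        return [(dict(self.tree[p][0]), self._children(p)) for p in paths]

    def qom_tree_get(self, path):
        self.requests += 1
        out, stack = {}, [path]
        while stack:
            p = stack.pop()
            out[p] = dict(self.tree[p][0])
            stack.extend(self._children(p))
        return out


def walk_list_and_get(q, root):
    out, stack = {}, [root]
    while stack:
        p = stack.pop()
        names, children = q.qom_list(p)
        out[p] = {n: q.qom_get(p, n) for n in names}
        stack.extend(children)
    return out


def walk_list_get(q, root):
    out, stack = {}, [root]
    while stack:
        p = stack.pop()
        out[p], children = q.qom_list_get(p)
        stack.extend(children)
    return out


def walk_list_getv(q, root):
    out, frontier = {}, [root]
    while frontier:  # one request per tree level
        replies = q.qom_list_getv(frontier)
        next_frontier = []
        for p, (props, children) in zip(frontier, replies):
            out[p] = props
            next_frontier.extend(children)
        frontier = next_frontier
    return out


def walk_tree_get(q, root):
    return q.qom_tree_get(root)


counts, results = {}, {}
for label, walker in [
    ("qom-list + qom-get", walk_list_and_get),
    ("qom-list-get", walk_list_get),
    ("qom-list-getv", walk_list_getv),
    ("qom-tree-get", walk_tree_get),
]:
    q = ToyQom(TREE)
    results[label] = walker(q, "/machine")
    counts[label] = q.requests

print(counts)
```

In this toy tree the four strategies need 11, 5, 3 and 1 requests respectively while returning the same data. The per-level batching in walk_list_getv is one plausible way a client might use a list-of-paths call, not necessarily how any real client does.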

Libvirt developers, would you be interested in any of these?

...
In all cases, a returned property is represented by ObjectPropertyValue, with fields name, type, value, and error. If an error occurs when reading a value, the value field is omitted, and the error message is returned in the error field. Thus an error for one property will not cause a bulk fetch operation to fail.

Returning errors this way is highly unusual. Observation; I'm not rejecting this out of hand. Can you elaborate a bit on why it's useful?

It is considered an error to read some properties if they are not valid for the configuration. And some properties are write-only and return an error if they are read. Examples:

  legacy-i8042: <EXCEPTION: Property 'vmmouse.legacy-i8042' is not readable> (str)
  legacy-memory: <EXCEPTION: Property 'qemu64-x86_64-cpu.legacy-memory' is not readable> (str)
  crash-information: <EXCEPTION: No crash occurred> (GuestPanicInformation)

With conventional error handling, if any of these poison pills falls in the scope of a bulk get operation, the entire operation fails.
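The poison-pill problem can be shown with a small toy model. Nothing here is QEMU code: the property table, the exception class, and both helper functions are hypothetical, and only the error string is taken from the examples above.

```python
# Toy model only: the property table and helper names are hypothetical.

class PropertyNotReadable(Exception):
    """Stands in for the error a single property read can raise."""


# Values are either plain data or an exception to raise on read.
TOY_PROPERTIES = {
    "type": "vmmouse",
    "realized": True,
    "legacy-i8042": PropertyNotReadable(
        "Property 'vmmouse.legacy-i8042' is not readable"),
}


def read_property(name):
    value = TOY_PROPERTIES[name]
    if isinstance(value, Exception):
        raise value
    return value


def bulk_get_conventional(names):
    """The first unreadable property aborts the whole operation."""
    return [{"name": n, "value": read_property(n)} for n in names]


def bulk_get_tolerant(names):
    """Record each failure on its own entry and keep going."""
    result = []
    for n in names:
        entry = {"name": n}
        try:
            entry["value"] = read_property(n)
        except PropertyNotReadable as exc:
            entry["error"] = str(exc)
        result.append(entry)
    return result


names = list(TOY_PROPERTIES)

try:
    bulk_get_conventional(names)
    conventional_failed = False
except PropertyNotReadable:
    conventional_failed = True

tolerant = bulk_get_tolerant(names)
print("conventional bulk get failed:", conventional_failed)
for entry in tolerant:
    print(entry)
```

The tolerant variant returns an entry for every property, with either a value or an error, which is the behaviour the cover letter describes for ObjectPropertyValue.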

...

...
To evaluate each method, I modified scripts/qmp/qom-tree to use the method, verified all methods produce the same output, and timed each using:

qemu-system-x86_64 -display none \
    -chardev socket,id=monitor0,path=/tmp/vm1.sock,server=on,wait=off \
    -mon monitor0,mode=control &

time qom-tree -s /tmp/vm1.sock > /dev/null

Cool!

...
I only measured once per method, but the variation is low after a warm up run. The 'real - user - sys' column is a proxy for QEMU CPU time.

method               real(s)  user(s)  sys(s)  (real - user - sys)(s)
qom-list / qom-get   2.048    0.932    0.057   1.059
qom-list-get         0.402    0.230    0.029   0.143
qom-list-getv        0.200    0.132    0.015   0.053
qom-tree-get         0.143    0.123    0.012   0.008

qom-tree-get is the clear winner, reducing elapsed time by a factor of 14X, and reducing QEMU CPU time by 132X.
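The two factors quoted above follow directly from the table. A short calculation, using only the published figures (the variable and function names are mine), reproduces them:

```python
# Figures copied from the timing table: (real, user, sys) in seconds.
timings = {
    "qom-list / qom-get": (2.048, 0.932, 0.057),
    "qom-list-get": (0.402, 0.230, 0.029),
    "qom-list-getv": (0.200, 0.132, 0.015),
    "qom-tree-get": (0.143, 0.123, 0.012),
}


def qemu_cpu_proxy(real, user, sys_time):
    """'real - user - sys', used in the thread as a proxy for QEMU CPU time."""
    return real - user - sys_time


baseline = timings["qom-list / qom-get"]

# method -> (elapsed-time speedup, proxy speedup), both relative to baseline.
speedups = {}
for method, (real, user, sys_time) in timings.items():
    elapsed = baseline[0] / real
    proxy = qemu_cpu_proxy(*baseline) / qemu_cpu_proxy(real, user, sys_time)
    speedups[method] = (elapsed, proxy)
    print(f"{method:20s} elapsed {elapsed:6.1f}x   proxy {proxy:6.1f}x")
```

For qom-tree-get this gives roughly 2.048 / 0.143 ≈ 14 for elapsed time and 1.059 / 0.008 ≈ 132 for the proxy. Note that the proxy denominator is only 8 ms from a single measurement, so the second factor is sensitive to small timing noise.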

qom-list-getv is slower when fetching the entire tree, but can beat qom-tree-get when only a subset of the tree needs to be fetched (not shown).

qom-list-get is shown for comparison only, and is not included in this series.

If we have qom-list-getv, then qom-list-get is not worth having.

Exactly.

- Steve

Markus Armbruster

1:34 p.m.

New subject: [PATCH V1 0/6] fast qom tree get

Steven Sistare <steven.sistare@oracle.com> writes:

...

On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

Peter Krempa tells us libvirt would benefit.

...

...
...
To reduce this cost, consider QAPI calls that fetch more information in each call:

* qom-list-get: given a path, return a list of properties and values.
* qom-list-getv: given a list of paths, return a list of properties and values for each path.
* qom-tree-get: given a path, return all descendant nodes rooted at that path, with properties and values for each.

Libvirt developers, would you be interested in any of these?

...
In all cases, a returned property is represented by ObjectPropertyValue, with fields name, type, value, and error. If an error occurs when reading a value, the value field is omitted, and the error message is returned in the error field. Thus an error for one property will not cause a bulk fetch operation to fail.

Returning errors this way is highly unusual. Observation; I'm not rejecting this out of hand. Can you elaborate a bit on why it's useful?

It is considered an error to read some properties if they are not valid for the configuration. And some properties are write-only and return an error if they are read. Examples:

legacy-i8042: <EXCEPTION: Property 'vmmouse.legacy-i8042' is not readable> (str) legacy-memory: <EXCEPTION: Property 'qemu64-x86_64-cpu.legacy-memory' is not readable> (str) crash-information: <EXCEPTION: No crash occurred> (GuestPanicInformation)

With conventional error handling, if any of these poison pills falls in the scope of a bulk get operation, the entire operation fails.

I suspect many of these poison pills are design mistakes.

If a property is not valid for the configuration, why does it exist? QOM is by design dynamic. I wish it wasn't, but as long as it is dynamic, I can't see why we should create properties we know to be unusable.

Why is reading crash-information an error when no crash occurred? This is the *normal* case. Errors are for the abnormal.

Anyway, asking you to fix design mistakes all over the place wouldn't be fair. So I'm asking you something else instead: do you actually need the error information?

[...]

Steven Sistare

2:06 p.m.

New subject: [PATCH V1 0/6] fast qom tree get

On 4/9/2025 9:34 AM, Markus Armbruster wrote:

...

Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

Peter Krempa tells us libvirt would benefit.

...
...
...
To reduce this cost, consider QAPI calls that fetch more information in each call:

* qom-list-get: given a path, return a list of properties and values.
* qom-list-getv: given a list of paths, return a list of properties and values for each path.
* qom-tree-get: given a path, return all descendant nodes rooted at that path, with properties and values for each.

Libvirt developers, would you be interested in any of these?

...
In all cases, a returned property is represented by ObjectPropertyValue, with fields name, type, value, and error. If an error occurs when reading a value, the value field is omitted, and the error message is returned in the error field. Thus an error for one property will not cause a bulk fetch operation to fail.

Returning errors this way is highly unusual. Observation; I'm not rejecting this out of hand. Can you elaborate a bit on why it's useful?

It is considered an error to read some properties if they are not valid for the configuration. And some properties are write-only and return an error if they are read. Examples:

legacy-i8042: <EXCEPTION: Property 'vmmouse.legacy-i8042' is not readable> (str) legacy-memory: <EXCEPTION: Property 'qemu64-x86_64-cpu.legacy-memory' is not readable> (str) crash-information: <EXCEPTION: No crash occurred> (GuestPanicInformation)

With conventional error handling, if any of these poison pills falls in the scope of a bulk get operation, the entire operation fails.

I suspect many of these poison pills are design mistakes.

If a property is not valid for the configuration, why does it exist? QOM is by design dynamic. I wish it wasn't, but as long as it is dynamic, I can't see why we should create properties we know to be unusable.

Why is reading crash-information an error when no crash occurred? This is the *normal* case. Errors are for the abnormal.

Anyway, asking you to fix design mistakes all over the place wouldn't be fair. So I'm asking you something else instead: do you actually need the error information?

I don't need the specific error message.

I could return a boolean meaning "property not available" instead of returning the exact error message, as long as folks are OK with the output of the qom-tree script changing for these properties.

- Steve

Markus Armbruster

2:44 p.m.

New subject: [PATCH V1 0/6] fast qom tree get

Steven Sistare <steven.sistare@oracle.com> writes:

...

On 4/9/2025 9:34 AM, Markus Armbruster wrote:

...
Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

Peter Krempa tells us libvirt would benefit.

...
...
...
To reduce this cost, consider QAPI calls that fetch more information in each call:

* qom-list-get: given a path, return a list of properties and values.
* qom-list-getv: given a list of paths, return a list of properties and values for each path.
* qom-tree-get: given a path, return all descendant nodes rooted at that path, with properties and values for each.

Libvirt developers, would you be interested in any of these?

...
In all cases, a returned property is represented by ObjectPropertyValue, with fields name, type, value, and error. If an error occurs when reading a value, the value field is omitted, and the error message is returned in the error field. Thus an error for one property will not cause a bulk fetch operation to fail.

Returning errors this way is highly unusual. Observation; I'm not rejecting this out of hand. Can you elaborate a bit on why it's useful?

It is considered an error to read some properties if they are not valid for the configuration. And some properties are write-only and return an error if they are read. Examples:

legacy-i8042: <EXCEPTION: Property 'vmmouse.legacy-i8042' is not readable> (str) legacy-memory: <EXCEPTION: Property 'qemu64-x86_64-cpu.legacy-memory' is not readable> (str) crash-information: <EXCEPTION: No crash occurred> (GuestPanicInformation)

With conventional error handling, if any of these poison pills falls in the scope of a bulk get operation, the entire operation fails.

I suspect many of these poison pills are design mistakes.

If a property is not valid for the configuration, why does it exist? QOM is by design dynamic. I wish it wasn't, but as long as it is dynamic, I can't see why we should create properties we know to be unusable.

Why is reading crash-information an error when no crash occurred? This is the *normal* case. Errors are for the abnormal.

Anyway, asking you to fix design mistakes all over the place wouldn't be fair. So I'm asking you something else instead: do you actually need the error information?

I don't need the specific error message.

I could return a boolean meaning "property not available" instead of returning the exact error message, as long as folks are OK with the output of the qom-tree script changing for these properties.

Let's put aside the qom-tree script for a moment.

In your patches, the queries return an object's properties as a list of ObjectPropertyValue, defined as

    { 'struct': 'ObjectPropertyValue',
      'data': { 'name': 'str',
                'type': 'str',
                '*value': 'any',
                '*error': 'str' } }

As far as I understand, exactly one of @value and @error is present.

The list has no duplicates, i.e. no two elements have the same value of "name".

Say we're interested in property "foo". Three cases:

* The list has an element with "name": "foo", and the element has member "value": the property exists and "value" has its value.

* The list has an element with "name": "foo", and the element does not have member "value": the property exists, but its value cannot be gotten; member "error" has the error message.

* The list has no element with "name": "foo": the property does not exist.

If we simply drop ObjectPropertyValue member @error, we lose 'member "error" has the error message'. That's all.

If a need for more error information should arise later, we could add member @error. Or something else entirely. Or tell people to qom-get any properties qom-tree-get couldn't get for error information. My point is: dropping @error now does not tie our hands as far as I can tell.

Back to qom-tree. I believe this script is a development aid that exists because qom-get is painful to use for humans. Your qom-tree command would completely obsolete it. I wouldn't worry about it. If you think I'm wrong there, please speak up!
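The three cases for a property "foo" translate into a short client-side check. This is a sketch under assumptions: the reply is taken to be an already-decoded list of dicts with the ObjectPropertyValue members, and the function name, the returned labels, and the sample data are all made up for illustration.

```python
# Hypothetical client-side helper; the sample reply below is made up.

def classify_property(props, name):
    """Return (state, detail) for one property in a list of ObjectPropertyValue.

    state is "value" (detail is the value), "unreadable" (detail is the
    error message, or None if the reply carries no "error" member), or
    "absent" (the property does not exist; detail is None).
    """
    for entry in props:
        if entry["name"] == name:
            if "value" in entry:
                return "value", entry["value"]
            return "unreadable", entry.get("error")
    return "absent", None


sample = [
    {"name": "realized", "type": "bool", "value": True},
    {"name": "crash-information", "type": "GuestPanicInformation",
     "error": "No crash occurred"},
]

for wanted in ("realized", "crash-information", "foo"):
    print(wanted, "->", classify_property(sample, wanted))
```

Because the second case keys on the absence of "value" rather than the presence of "error", the same check keeps working if @error is dropped from the struct; only the detail for unreadable properties becomes None.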

Steven Sistare

3:14 p.m.

New subject: [PATCH V1 0/6] fast qom tree get

On 4/9/2025 10:44 AM, Markus Armbruster wrote:

...

Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 9:34 AM, Markus Armbruster wrote:

...
Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

Peter Krempa tells us libvirt would benefit.

...
...
...
To reduce this cost, consider QAPI calls that fetch more information in each call:

* qom-list-get: given a path, return a list of properties and values.
* qom-list-getv: given a list of paths, return a list of properties and values for each path.
* qom-tree-get: given a path, return all descendant nodes rooted at that path, with properties and values for each.

Libvirt developers, would you be interested in any of these?

...
In all cases, a returned property is represented by ObjectPropertyValue, with fields name, type, value, and error. If an error occurs when reading a value, the value field is omitted, and the error message is returned in the error field. Thus an error for one property will not cause a bulk fetch operation to fail.

Returning errors this way is highly unusual. Observation; I'm not rejecting this out of hand. Can you elaborate a bit on why it's useful?

It is considered an error to read some properties if they are not valid for the configuration. And some properties are write-only and return an error if they are read. Examples:

legacy-i8042: <EXCEPTION: Property 'vmmouse.legacy-i8042' is not readable> (str) legacy-memory: <EXCEPTION: Property 'qemu64-x86_64-cpu.legacy-memory' is not readable> (str) crash-information: <EXCEPTION: No crash occurred> (GuestPanicInformation)

With conventional error handling, if any of these poison pills falls in the scope of a bulk get operation, the entire operation fails.

I suspect many of these poison pills are design mistakes.

If a property is not valid for the configuration, why does it exist? QOM is by design dynamic. I wish it wasn't, but as long as it is dynamic, I can't see why we should create properties we know to be unusable.

Why is reading crash-information an error when no crash occurred? This is the *normal* case. Errors are for the abnormal.

Anyway, asking you to fix design mistakes all over the place wouldn't be fair. So I'm asking you something else instead: do you actually need the error information?

I don't need the specific error message.

I could return a boolean meaning "property not available" instead of returning the exact error message, as long as folks are OK with the output of the qom-tree script changing for these properties.

Let's put aside the qom-tree script for a moment.

In your patches, the queries return an object's properties as a list of ObjectPropertyValue, defined as

{ 'struct': 'ObjectPropertyValue', 'data': { 'name': 'str', 'type': 'str', '*value': 'any', '*error': 'str' } }

As far as I understand, exactly one of @value and @error is present.

The list has no duplicates, i.e. no two elements have the same value of "name".

Say we're interested in property "foo". Three cases:

* The list has an element with "name": "foo", and the element has member "value": the property exists and "value" has its value.

* The list has an element with "name": "foo", and the element does not have member "value": the property exists, but its value cannot be gotten; member "error" has the error message.

* The list has no element with "name": "foo": the property does not exist.

If we simply drop ObjectPropertyValue member @error, we lose 'member "error" has the error message'. That's all.

If a need for more error information should arise later, we could add member @error. Or something else entirely. Or tell people to qom-get any properties qom-tree-get couldn't get for error information. My point is: dropping @error now does not tie our hands as far as I can tell.

Agreed. I forgot that I had defined value as an optional parameter, so simply omitting it means "property not available".

...

Back to qom-tree. I believe this script is a development aid that exists because qom-get is painful to use for humans. Your qom-tree command would completely obsolete it. I wouldn't worry about it. If you think I'm wrong there, please speak up!

Regarding dropping the error messages, I agree, I was just pointing it out in case anyone objected.

Yes, the new command plus a formatter like jq obsoletes the qom-tree script. Just to be clear, I do not propose to delete the script, since folks are accustomed to it being available, and are accustomed to its output. It also serves as a nice example for how to use the new command.

Do you want to review any code and specification now, or wait for me to send V2 that deletes the error member? The changes will be minor.

- Steve

Markus Armbruster

10 Apr

5:57 a.m.

New subject: [PATCH V1 0/6] fast qom tree get

Steven Sistare <steven.sistare@oracle.com> writes:

...

On 4/9/2025 10:44 AM, Markus Armbruster wrote:

...
Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 9:34 AM, Markus Armbruster wrote:

...
Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 3:39 AM, Markus Armbruster wrote:

[...]

...

...
...
...
Anyway, asking you to fix design mistakes all over the place wouldn't be fair. So I'm asking you something else instead: do you actually need the error information?

I don't need the specific error message.

I could return a boolean meaning "property not available" instead of returning the exact error message, as long as folks are OK with the output of the qom-tree script changing for these properties.

Let's put aside the qom-tree script for a moment.

[...]

...

...
Back to qom-tree. I believe this script is a development aid that exists because qom-get is painful to use for humans. Your qom-tree command would completely obsolete it. I wouldn't worry about it. If you think I'm wrong there, please speak up!

Regarding dropping the error messages, I agree, I was just pointing it out in case anyone objected.

Appreciated.

...

Yes, the new command plus a formatter like jq obsoletes the qom-tree script. Just to be clear, I do not propose to delete the script, since folks are accustomed to it being available, and are accustomed to its output. It also serves as a nice example for how to use the new command.

I have little use for scripts/qmp/ myself. Since nothing there adds to my maintenance load appreciably, I don't mind keeping the scripts. qom-fuse is rather cute.

...

Do you want to review any code and specification now, or wait for me to send V2 that deletes the error member? The changes will be minor.

v1 should do for review. Thanks!

Markus Armbruster

28 Apr

8:04 a.m.

New subject: [PATCH V1 0/6] fast qom tree get

Steven Sistare <steven.sistare@oracle.com> writes:

...

On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

Elsewhere in this thread, we examined libvirt's use of qom-get. Its use of qom-get is also noticeably slow, and your work could speed it up. However, most of its use is for working around QMP interface shortcomings around probing CPU flags. Addressing these would help it even more.

This makes me wonder what questions Oracle's OCI answers with the help of qom-get. Can you briefly describe them?

Even if OCI would likewise be helped more by better QMP queries, your fast qom tree get work might still be useful.

Steven Sistare

4:18 p.m.

New subject: [PATCH V1 0/6] fast qom tree get

On 4/28/2025 4:04 AM, Markus Armbruster wrote:

...

Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

Elsewhere in this thread, we examined libvirt's use of qom-get. Its use of qom-get is also noticeably slow, and your work could speed it up. However, most of its use is for working around QMP interface shortcomings around probing CPU flags. Addressing these would help it even more.

This makes me wonder what questions Oracle's OCI answers with the help of qom-get. Can you briefly describe them?

Even if OCI would likewise be helped more by better QMP queries, your fast qom tree get work might still be useful.

We already optimized our queries as a first step, but what remains is still significant, which is why I submitted this RFE.

- Steve

Markus Armbruster

29 Apr

6:02 a.m.

New subject: [PATCH V1 0/6] fast qom tree get

Steven Sistare <steven.sistare@oracle.com> writes:

...

On 4/28/2025 4:04 AM, Markus Armbruster wrote:

...
Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

Elsewhere in this thread, we examined libvirt's use of qom-get. Its use of qom-get is also noticeably slow, and your work could speed it up. However, most of its use is for working around QMP interface shortcomings around probing CPU flags. Addressing these would help it even more.

This makes me wonder what questions Oracle's OCI answers with the help of qom-get. Can you briefly describe them?

Even if OCI would likewise be helped more by better QMP queries, your fast qom tree get work might still be useful.

We already optimized our queries as a first step, but what remains is still significant, which is why I submitted this RFE.

I understand your motivation. I'd like to learn more on what OCI actually needs from QMP, to be able to better serve it and potentially other management applications.

Steven Sistare

2 May

4:19 p.m.

New subject: [PATCH V1 0/6] fast qom tree get

On 4/29/2025 2:02 AM, Markus Armbruster wrote:

...

Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/28/2025 4:04 AM, Markus Armbruster wrote:

...
Steven Sistare <steven.sistare@oracle.com> writes:

...
On 4/9/2025 3:39 AM, Markus Armbruster wrote:

...
Hi Steve, I apologize for the slow response.

Steve Sistare <steven.sistare@oracle.com> writes:

...
Using qom-list and qom-get to get all the nodes and property values in a QOM tree can take multiple seconds because it requires 1000's of individual QOM requests. Some managers fetch the entire tree or a large subset of it when starting a new VM, and this cost is a substantial fraction of start up time.

"Some managers"... could you name one?

My personal experience is with Oracle's OCI, but likely others could benefit.

Elsewhere in this thread, we examined libvirt's use of qom-get. Its use of qom-get is also noticeably slow, and your work could speed it up. However, most of its use is for working around QMP interface shortcomings around probing CPU flags. Addressing these would help it even more.

This makes me wonder what questions Oracle's OCI answers with the help of qom-get. Can you briefly describe them?

Even if OCI would likewise be helped more by better QMP queries, your fast qom tree get work might still be useful.

We already optimized our queries as a first step, but what remains is still significant, which is why I submitted this RFE.

I understand your motivation. I'd like to learn more on what OCI actually needs from QMP, to be able to better serve it and potentially other management applications.

OCI uses qemu as the single source of truth during hot plug operations, comparing what is actually there to what the user desires, and as such must request many nodes and their properties from qemu.

Regarding being potentially useful to other management applications, my proposed interface is quite general and easy to use, and I don't believe we will improve it by examining more use cases in detail.

I'd appreciate it if we could continue reviewing the code. I made a concerted effort to dot all the i's and cross all the t's, based on what you look for in our past discussions of other patches.

- Steve

participants (8)

Cornelia Huck
Daniel P. Berrangé
David Hildenbrand
Jiri Denemark
Jiří Denemark
Markus Armbruster
Peter Krempa
Steven Sistare
