[libvirt] clean/simple Q35 support in libvirt+QEMU for guest OSes that don't support virtio-1.0

(Several of us started an offline discussion on this topic, and it quickly became complicated, so we decided it should continue upstream. Here is a synopsis of the discussion so far (as *I've* interpreted it, so corrections are welcome and apologies in advance for anything I got wrong!) Some of the things are stated here as givens, but feel free to rip them apart.)

Summary of the problem:

1) We want to persuade libvirt+QEMU users to move away from the i440fx machinetype in favor of Q35. (NB: Someday this *might* lead to the ability to deprecate and even remove the 440fx machinetype, but even if that were to happen, it would be a *very long* time from now, so this discussion is *not* about that!)

2) When Q35 machinetype is used, libvirt assigns virtio devices to a slot on a PCI Express controller (because why have modern PCIe controllers/slots available but force everything onto clunky old legacy controllers?).

3) When a virtio device is plugged into an Express controller, QEMU disables the device's IO port space, and it is put into "modern-only" mode (this is done to avoid a rapid exhaustion of limited IO port space).

4) modern-only virtio devices won't work with a legacy (virtio-0.9-only) guest driver, because virtio-0.9 requires IO port space.

5) Some guest OSes that we still want to support (and which would otherwise work okay on a Q35 virtual machine) have virtio drivers too old to support virtio-1.0 (CentOS6 and RHEL6 are examples of such OSes), but due to the chain of reasons listed above, the "standard" config for a Q35 guest generated by libvirt doesn't support virtio-0.9, hence doesn't support these guest OSes.

And here's a list of possible solutions to this problem (note that "consumers" means management applications such as OpenStack, oVirt, virt-manager, virt-install, gnome-boxes, etc. In all cases, it's assumed that the consumer's decision on the action to take will be based on information from libosinfo).
For completeness, I've included even the possibilities that have been rejected, along with a brief synopsis of (at least part of) the reason for rejection:

(1) Add some way libvirt consumers can ask libvirt to place virtio devices on a legacy pci slot instead of pcie when the machinetype is q35 (qemu sets virtio devices in legacy PCI slots to transitional mode, so io port space is enabled and virtio-0.9 drivers will work).

This has been proposed on libvir-list, but rejected. Here is the most eloquently stated reasoning for the rejection I could find (with thanks to Dan Berrange):

The domain XML is a way to express the configuration of the guest virtual machine. What we're talking about here is a policy tunable for an internal libvirt QEMU driver algorithm, as so does not belong anywhere in the domain XML.

(2) Add full-blown pci enumeration support to all libvirt consumers (i.e. they will need to build a model of the PCI bus topology of each guest, and keep track of which addresses are in use). They can then manually place virtio devices on legacy pci slots (again, triggering transitional mode) when the intended guest OS doesn't support virtio-1.0.

(This is seen as requiring too much duplicated effort for development and support/maintenance, since up until now libvirt has been the single point of action for PCI address assignment (well, QEMU can do it too, but for > 10 years libvirt has *always* provided full PCI addresses for all devices))

(3) Add virtio-1.0 support to all guest OSes. If this is done, existing libvirt configs will work.

(Aside from the difficulty of backporting, and the fact that there are going to be some OSes that don't get it *at all*, there will always be older releases that haven't gotten the backport. So this isn't a complete solution).
(4) Consumers can continue using the 440fx machinetype for guest OSes that don't support virtio-1.0

(This would work, but perpetuates use of the 440fx machinetype, and all for just this one reason (at least in the case of CentOS6/RHEL6, which otherwise work just fine with Q35)).

(5) Introduce virtio-0.9, virtio-1.0 models in libvirt which are explicitly legacy-only and modern-only. QEMU doesn't need to change, as libvirt can simply set the right params on existing QEMU models to force the behavior.

(NB: it's unclear to me whether virtio-0.9 simply won't work without forcing the device to be on a legacy PCI slot, or if that's just "a very bad idea" because it will mean that the device uses up extra io port space)

The offline discussion had basically come to the point of saying that options (4) and (5) were the only reasonable ones, with option (5) being preferred (I think).

As a starter for continuing the discussion, it seems to me that for option (5):

a) we don't really need the virtio-1.0 model, since that's what you currently get anyway when you ask for "virtio" on Q35 (and on 440fx, "virtio" gives you transitional, which works for everybody).

b) Rather than a "legacy-only" model for virtio-0.9, it would be more useful to have "transitional". This way the config would work for older OSes that don't support virtio-1.0, and when/if the OS was upgraded such that it supported virtio-1.0, that would be automatically used without needing to change the config.

c) Even if it's possible to force a device on an Express slot into transitional mode, this is extremely wasteful of io port space, so libvirt should consider virtio-0.9 devices to be legacy PCI, and thus plug them into legacy PCI slots. And once we're doing this, it's unnecessary to add any extra option to the qemu commandline to force legacy support (i.e. transitional mode), as that is what QEMU already does when the device is connected to a legacy PCI slot.
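To make the chain of reasoning in the problem summary and in point c) concrete, here is a minimal sketch of the slot-dependent default as it is described in this mail. It is only a model of the description, not QEMU or libvirt code, and the names `Slot`, `Mode`, `default_virtio_mode` and `guest_driver_works` are made up for illustration.

```python
from enum import Enum


class Slot(Enum):
    PCI = "pci"    # legacy PCI slot
    PCIE = "pcie"  # PCI Express slot


class Mode(Enum):
    TRANSITIONAL = "transitional"  # legacy and modern interfaces both offered
    MODERN_ONLY = "modern-only"    # IO port space disabled, virtio-1.0 only


def default_virtio_mode(slot):
    """Mode a virtio device ends up in by default, per the summary above."""
    return Mode.MODERN_ONLY if slot is Slot.PCIE else Mode.TRANSITIONAL


def guest_driver_works(mode, driver_speaks_1_0):
    """A virtio-0.9-only driver needs IO port space, i.e. a transitional device."""
    return driver_speaks_1_0 or mode is Mode.TRANSITIONAL
```

With these definitions, a virtio-0.9-only guest (RHEL6, say) fails on the Express slot libvirt picks for Q35 and works on a legacy PCI slot, which is the whole problem in two function calls.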
So making the naive assumption that we agree on implementing option (5) and there are no objections to my points a-c (Hah! As if!), how does this sound as a plan:

A) libosinfo starts telling consumers that the preferred virtio device model for the relevant OSes is "virtio-0.9", and leaves the recommendation for other OSes as "virtio".

B) libvirt adds a "virtio-0.9" model for all virtio devices that actually have virtio-0.9 support (a couple of devices never existed prior to virtio-1.0 (rng and ???) so virtio-0.9 would be nonsensical for them).

C) inside libvirt, the implementation of the "virtio-0.9" model is identical to "virtio", except that the VIR_PCI_CONNECT_TYPE flags for these devices contain VIR_PCI_CONNECT_TYPE_PCI rather than VIR_PCI_CONNECT_TYPE_PCIE, resulting in those devices being assigned to a legacy PCI slot, and thus they would be transitional mode by default.

(If there is disagreement about putting these devices on a legacy PCI slot, then (C) could be changed to add "disable-legacy=off" to the qemu commandline. But again, even if that works, it would use up 4k of IO port space for each device, causing it to rapidly run out, and I don't think that should be the default mode of operation).
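As a rough back-of-the-envelope check on "rapidly run out": x86 port IO addresses are 16 bits wide, so the whole space is 64 KiB, and at the 4k per device mentioned above that caps the number of such devices at 16 before anything else has claimed a single port. The sketch below just does that division; the function name and the `already_reserved` parameter are illustrative assumptions, and the real limit is lower because other devices also consume ports.

```python
IO_PORT_SPACE = 64 * 1024         # x86 port IO addresses are 16 bits wide
IO_PER_FORCED_DEVICE = 4 * 1024   # the 4k-per-device figure quoted above


def max_forced_transitional_devices(already_reserved=0):
    """Upper bound on Express-slot devices that can keep their IO port space.

    already_reserved is a hypothetical number of bytes of port space that
    other hardware has already claimed.
    """
    return (IO_PORT_SPACE - already_reserved) // IO_PER_FORCED_DEVICE
```

Even the optimistic bound is small next to the number of virtio devices a large guest can have, which is the argument for using legacy PCI slots instead.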

Hi,
b) Rather than a "legacy-only" model for virtio-0.9, it would be more useful to have "transitional". This way the config would work for older OSes that don't support virtio-1.0, and when/if the OS was upgraded such that it supported virtio-1.0, that would be automatically used without needing to change the config.
Having a legacy-only (instead of transitional) model could be useful for regression-testing (i.e. checking whether virtio-0.9 mode still works properly in guests with virtio-1.0 support). But that is pretty much the only reason I can think of to prefer the virtio-0.9 devices being legacy-only instead of transitional.
A) libosinfo starts telling consumers that the preferred virtio device model for the relevant OSes is "virtio-0.9", and leaves the recommendation for other OSes as "virtio".
B) libvirt adds a "virtio-0.9" model for all virtio devices that actually have virtio-0.9 support (a couple of devices never existed prior to virtio-1.0 (rng and ???) so virtio-0.9 would be nonsensical for them).
input, gpu are 1.0 only too.
C) inside libvirt, the implementation of the "virtio-0.9" model is identical to "virtio", except that the VIR_PCI_CONNECT_TYPE flags for these devices contain VIR_PCI_CONNECT_TYPE_PCI rather than VIR_PCI_CONNECT_TYPE_PCIE, resulting in those devices being assigned to a legacy PCI slot, and thus they would be transitional mode by default.
Looks good to me. cheers, Gerd

On Thu, Aug 16, 2018 at 06:20:29PM -0400, Laine Stump wrote:
Summary of the problem:
1) We want to persuade libvirt+QEMU users to move away from the i440fx machinetype in favor of Q35. (NB: Someday this *might* lead to the ability to deprecate and even remove the 440fx machinetype, but even if that were to happen, it would be a *very long* time from now, so this discussion is *not* about that!)
There are plenty of OSes that will never support Q35 and are still interesting to use under QEMU. The set which could use Q35, but lacks virtio-1.0, is fairly small. So removal of i440fx is really only something for downstream KVM vendors to consider. Those vendors only care about modern OSes, but upstream is much more open-minded about what QEMU is used for, so I see it probably living forever in upstream, or at least long enough that current maintainers will be retired ;-P.
2) When Q35 machinetype is used, libvirt assigns virtio devices to a slot on a PCI Express controller (because why have modern PCIe controllers/slots available but force everything onto clunky old legacy controllers?).
3) When a virtio device is plugged into an Express controller, QEMU disables the device's IO port space, and it is put into "modern-only" mode (this is done to avoid a rapid exhaustion of limited IO port space).
4) modern-only virtio devices won't work with a legacy (virtio-0.9-only) guest driver, because virtio-0.9 requires IO port space.
5) Some guest OSes that we still want to support (and which would otherwise work okay on a Q35 virtual machine) have virtio drivers too old to support virtio-1.0 (CentOS6 and RHEL6 are examples of such OSes), but due to the chain of reasons listed above, the "standard" config for a Q35 guest generated by libvirt doesn't support virtio-0.9, hence doesn't support these guest OSes.
Note when talking about "support" you're really saying it from the downstream vendor, specifically RHEL, POV. From upstream or Fedora POV essentially all x86 OS ever made are in scope for running under QEMU if suitable virtual hardware models have been provided. QEMU doesn't maintain any whitelist of "supported" OS that differs from what is technically capable of being run, in the way downstream vendors do.
And here's a list of possible solutions to this problem (note that "consumers" means management applications such as OpenStack, oVirt, virt-manager, virt-install, gnome-boxes, etc. In all cases, it's assumed that the consumer's decision on the action to take will be based on information from libosinfo). For completeness, I've included even the possibilities that have been rejected, along with a brief synopsis of (at least part of) the reason for rejection:
(1) Add some way libvirt consumers can ask libvirt to place virtio devices on a legacy pci slot instead of pcie when the machinetype is q35 (qemu sets virtio devices in legacy PCI slots to transitional mode, so io port space is enabled and virtio-0.9 drivers will work).
This has been proposed on libvir-list, but rejected. Here is the most eloquently stated reasoning for the rejection I could find (with thanks to Dan Berrange):
The domain XML is a way to express the configuration of the guest virtual machine. What we're talking about here is a policy tunable for an internal libvirt QEMU driver algorithm, as so does not belong anywhere in the domain XML.
Indeed, that's a guiding principle in general, not just for this PCI question.
(2) Add full-blown pci enumeration support to all libvirt consumers (i.e. they will need to build a model of the PCI bus topology of each guest, and keep track of which addresses are in use). They can then manually place virtio devices on legacy pci slots (again, triggering transitional mode) when the intended guest OS doesn't support virtio-1.0.
(This is seen as requiring too much duplicated effort for development and support/maintenance, since up until now libvirt has been the single point of action for PCI address assignment (well, QEMU can do it too, but for > 10 years libvirt has *always* provided full PCI addresses for all devices))
It really depends on the scope of the mgmt app - at some point the mgmt app needs to take charge to some degree if it has particular ideas about how a machine should look. Libvirt's placement strategy is a good default for 95% of use cases, but it'll never be 100%. An example is setting up a particular PCI topology that is guest NUMA node aware, using expander buses. So some apps might take this option, but in the common case it is undesirable.
(3) Add virtio-1.0 support to all guest OSes. If this is done, existing libvirt configs will work.
(Aside from the difficulty of backporting, and the fact that there are going to be some OSes that don't get it *at all*, there will always be older releases that haven't gotten the backport. So this isn't a complete solution).
Yep, there will always be guest OS that don't support 1.0. So that's only a solution if the person who cares about Q35 support also controls the guest OS in question.
(4) Consumers can continue using the 440fx machinetype for guest OSes that don't support virtio-1.0
(This would work, but perpetuates use of the 440fx machinetype, and all for just this one reason (at least in the case of CentOS6/RHEL6, which otherwise work just fine with Q35)).
From an upstream POV this is always going to be the case. This is really only an undesirable thing for downstream vendors who are trying to artificially restrict what QEMU features users have available to them.
(5) Introduce virtio-0.9, virtio-1.0 models in libvirt which are explicitly legacy-only and modern-only. QEMU doesn't need to change, as libvirt can simply set the right params on existing QEMU models to force the behavior.
(NB: it's unclear to me whether virtio-0.9 simply won't work without forcing the device to be on a legacy PCI slot, or if that's just "a very bad idea" because it will mean that the device uses up extra io port space)
As a starter for continuing the discussion, it seems to me that for option (5):
a) we don't really need the virtio-1.0 model, since that's what you currently get anyway when you ask for "virtio" on Q35 (and on 440fx, "virtio" gives you transitional, which works for everybody).
At some point we might have a virtio-2.0 and find ourselves in a similar problem again. IMHO it is preferable to have both explicit versioned models, and discourage use of the magical 'virtio' model from mgmt apps. Use libosinfo to identify which virtio model is supported for the OS in question and use that explicitly. Only use the magical 'virtio' model if there's no information about what version the OS supports.
b) Rather than a "legacy-only" model for virtio-0.9, it would be more useful to have "transitional". This way the config would work for older OSes that don't support virtio-1.0, and when/if the OS was upgraded such that it supported virtio-1.0, that would be automatically used without needing to change the config.
I don't think the case of OS suddenly gaining support for 1.0 in an update is frequent enough to be worth worrying about.
c) Even if it's possible to force a device on an Express slot into transitional mode, this is extremely wasteful of io port space, so libvirt should consider virtio-0.9 devices to be legacy PCI, and thus plug them into legacy PCI slots. And once we're doing this, it's unnecessary to add any extra option to the qemu commandline to force legacy support (i.e. transitional mode), as that is what QEMU already does when the device is connected to a legacy PCI slot.
Yes, it should plug them into legacy PCI slots by default, but if a mgmt app has done explicit placement itself, it should honour that even if it means wasting IO space.
So making the naive assumption that we agree on implementing option (5) and there are no objections to my points a-c (Hah! As if!), how does this sound as a plan:
A) libosinfo starts telling consumers that the preferred virtio device model for the relevant OSes is "virtio-0.9", and leaves the recommendation for other OSes as "virtio".
Libosinfo already uses 'virtio' as the prefix identifying virtio-0.9 support (the old PCI product IDs), and 'virtio-1.0' as the prefix for identifying virtio-1.0 support (the new PCI product IDs). That these don't match libvirt model names doesn't matter.
B) libvirt adds a "virtio-0.9" model for all virtio devices that actually have virtio-0.9 support (a couple of devices never existed prior to virtio-1.0 (rng and ???) so virtio-0.9 would be nonsensical for them).
C) inside libvirt, the implementation of the "virtio-0.9" model is identical to "virtio", except that the VIR_PCI_CONNECT_TYPE flags for these devices contain VIR_PCI_CONNECT_TYPE_PCI rather than VIR_PCI_CONNECT_TYPE_PCIE, resulting in those devices being assigned to a legacy PCI slot, and thus they would be transitional mode by default.
For 'virtio-0.9' libvirt should set "disable-modern=yes" in QEMU args
For 'virtio-1.0' libvirt should set "disable-legacy=yes" in QEMU args

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
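Putting the suggestions in this reply together (explicit versioned models, the magical 'virtio' only as a fallback, and one QEMU property per model), a minimal sketch could look like the following. The function names and the tri-state `os_supports_1_0` input are assumptions made for illustration; the property strings are copied as written in this mail, and `virtio-net-pci` is used only as an example device.

```python
# Per-model QEMU properties as proposed in this mail (spelling follows the mail).
MODEL_PROPS = {
    "virtio-0.9": {"disable-modern": "yes"},  # legacy-only
    "virtio-1.0": {"disable-legacy": "yes"},  # modern-only
    "virtio": {},                             # leave the decision to QEMU
}


def pick_model(os_supports_1_0):
    """Choose a model from what is known about the guest OS.

    os_supports_1_0 is True, False, or None when nothing is known,
    in which case the unversioned 'virtio' model is the fallback.
    """
    if os_supports_1_0 is None:
        return "virtio"
    return "virtio-1.0" if os_supports_1_0 else "virtio-0.9"


def device_arg(qemu_device, model):
    """Build the value of a -device argument for the chosen model."""
    props = MODEL_PROPS[model]
    return ",".join([qemu_device] + [f"{k}={v}" for k, v in props.items()])
```

For a guest known to lack virtio-1.0, `device_arg("virtio-net-pci", pick_model(False))` yields `virtio-net-pci,disable-modern=yes`.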

On Fri, 2018-08-17 at 10:29 +0100, Daniel P. Berrangé wrote:
On Thu, Aug 16, 2018 at 06:20:29PM -0400, Laine Stump wrote:
5) Some guest OSes that we still want to support (and which would otherwise work okay on a Q35 virtual machine) have virtio drivers too old to support virtio-1.0 (CentOS6 and RHEL6 are examples of such OSes), but due to the chain of reasons listed above, the "standard" config for a Q35 guest generated by libvirt doesn't support virtio-0.9, hence doesn't support these guest OSes.
Note when talking about "support" you're really saying it from the downstream vendor, specifically RHEL, POV. From upstream or Fedora POV essentially all x86 OS ever made are in scope for running under QEMU if suitable virtual hardware models have been provided. QEMU doesn't maintain any whitelist of "supported" OS that differs from what is technically capable of being run, in the way downstream vendors do.
Well, at least in the case of RHEL 6, "not supported" means that it will not boot at all on q35 with the default guest topology created by libvirt, so that's not really a downstream-only problem :)
a) we don't really need the virtio-1.0 model, since that's what you currently get anyway when you ask for "virtio" on Q35 (and on 440fx, "virtio" gives you transitional, which works for everybody).
At some point we might have a virtio-2.0 and find ourselves in a similar problem again. IMHO it is preferable to have both explicit versioned models, and discourage use of the magical 'virtio' model from mgmt apps. Use libosinfo to identify which virtio model is supported for the OS in question and use that explicitly. Only use the magical 'virtio' model if there's no information about what version the OS supports.
Agreed.
c) Even if it's possible to force a device on an Express slot into transitional mode, this is extremely wasteful of io port space, so libvirt should consider virtio-0.9 devices to be legacy PCI, and thus plug them into legacy PCI slots. And once we're doing this, it's unnecessary to add any extra option to the qemu commandline to force legacy support (i.e. transitional mode), as that is what QEMU already does when the device is connected to a legacy PCI slot.
Yes, it should plug them into legacy PCI slots by default, but if a mgmt app has done explicit placement itself, it should honour that even if it means wasting IO space.
This is consistent with what already happens with PCI addresses: a PCIe device will never be assigned to a PCI slot automatically, but the user can still force that to happen by providing the PCI address themselves.
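The rule being agreed on here (automatic placement follows the model, explicit placement always wins) can be sketched as below for a Q35 guest. This is not libvirt's address-assignment code; `connect_type` and its return shape are hypothetical, and the address string is only an example.

```python
def connect_type(model, explicit_address=None):
    """Decide where a virtio device goes on a Q35 guest.

    Returns ("explicit", address) when the caller supplied a PCI address,
    otherwise ("auto", slot_type) with the slot type implied by the model.
    """
    if explicit_address is not None:
        # Honour the caller's placement even if it wastes IO port space.
        return ("explicit", explicit_address)
    return ("auto", "pci" if model == "virtio-0.9" else "pcie")
```

So `connect_type("virtio-0.9")` asks for a legacy PCI slot, while `connect_type("virtio-0.9", "0000:01:00.0")` keeps whatever address was given.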
C) inside libvirt, the implementation of the "virtio-0.9" model is identical to "virtio", except that the VIR_PCI_CONNECT_TYPE flags for these devices contain VIR_PCI_CONNECT_TYPE_PCI rather than VIR_PCI_CONNECT_TYPE_PCIE, resulting in those devices being assigned to a legacy PCI slot, and thus they would be transitional mode by default.
For 'virtio-0.9' libvirt should set "disable-modern=yes" in QEMU args
For 'virtio-1.0' libvirt should set "disable-legacy=yes" in QEMU args
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have

virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no

There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.

--
Andrea Bolognani / Red Hat / Virtualization
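Since the two proposals differ only in which property combination "virtio-0.9" maps to, it may help to spell out what each combination of the two flags means. The sketch below is a plain truth table with a hypothetical function name; it treats "both disabled" as an error because that would leave the device with no usable interface.

```python
def virtio_mode(disable_legacy, disable_modern):
    """Name the mode implied by the two flags discussed in this thread."""
    if disable_legacy and disable_modern:
        raise ValueError("neither legacy nor modern interface would be enabled")
    if disable_legacy:
        return "modern-only"
    if disable_modern:
        return "legacy-only"
    return "transitional"
```

Read this way, disable-legacy=no,disable-modern=no makes "virtio-0.9" transitional, whereas disable-modern=yes on its own makes it legacy-only.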

On Fri, Aug 17, 2018 at 12:35:11PM +0200, Andrea Bolognani wrote:
On Fri, 2018-08-17 at 10:29 +0100, Daniel P. Berrangé wrote:
On Thu, Aug 16, 2018 at 06:20:29PM -0400, Laine Stump wrote:
5) Some guest OSes that we still want to support (and which would otherwise work okay on a Q35 virtual machine) have virtio drivers too old to support virtio-1.0 (CentOS6 and RHEL6 are examples of such OSes), but due to the chain of reasons listed above, the "standard" config for a Q35 guest generated by libvirt doesn't support virtio-0.9, hence doesn't support these guest OSes.
Note when talking about "support" you're really saying it from the downstream vendor, specifically RHEL, POV. From upstream or Fedora POV essentially all x86 OS ever made are in scope for running under QEMU if suitable virtual hardware models have been provided. QEMU doesn't maintain any whitelist of "supported" OS that differs from what is technically capable of being run, in the way downstream vendors do.
Well, at least in the case of RHEL 6, "not supported" means that it will not boot at all on q35 with the default guest topology created by libvirt, so that's not really a downstream-only problem :)
I mean from an upstream POV we still support RHEL-6 fine in i440fx, so there's no reason to particularly care about RHEL-6 with q35 upstream. It is only a downstream decision to try to force it to use q35, despite it not working right today.
C) inside libvirt, the implementation of the "virtio-0.9" model is identical to "virtio", except that the VIR_PCI_CONNECT_TYPE flags for these devices contain VIR_PCI_CONNECT_TYPE_PCI rather than VIR_PCI_CONNECT_TYPE_PCIE, resulting in those devices being assigned to a legacy PCI slot, and thus they would be transitional mode by default.
For 'virtio-0.9' libvirt should set "disable-modern=yes" in QEMU args
For 'virtio-1.0' libvirt should set "disable-legacy=yes" in QEMU args
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
From a testing POV it is desirable to be able to have legacy-only. There is also the possibility that a guest OS implements 1.0 in a buggy manner, so forcing legacy-only is desirable.
The existing device already provides a transitional option on i440fx, and on Q35 if you do explicit placement in a PCI slot. So I don't think there's a good reason to have a second transitional device type, especially if we're naming it virtio-0.9; it would be rather misleading if it were in fact able to run in virtio-1.0 mode.

Regards,
Daniel

Daniel P. Berrangé <berrange@redhat.com> writes:
On Fri, Aug 17, 2018 at 12:35:11PM +0200, Andrea Bolognani wrote:
On Fri, 2018-08-17 at 10:29 +0100, Daniel P. Berrangé wrote:
On Thu, Aug 16, 2018 at 06:20:29PM -0400, Laine Stump wrote:
5) Some guest OSes that we still want to support (and which would otherwise work okay on a Q35 virtual machine) have virtio drivers too old to support virtio-1.0 (CentOS6 and RHEL6 are examples of such OSes), but due to the chain of reasons listed above, the "standard" config for a Q35 guest generated by libvirt doesn't support virtio-0.9, hence doesn't support these guest OSes.
Note when talking about "support" you're really saying it from the downstream vendor, specifically RHEL, POV. From upstream or Fedora POV essentially all x86 OS ever made are in scope for running under QEMU if suitable virtual hardware models have been provided. QEMU doesn't maintain any whitelist of "supported" OS that differs from what is technically capable of being run, in the way downstream vendors do.
Well, at least in the case of RHEL 6, "not supported" means that it will not boot at all on q35 with the default guest topology created by libvirt, so that's not really a downstream-only problem :)
I mean from an upstream POV we still support RHEL-6 fine in i440fx, so there's no reason to particularly care about RHEL-6 with q35 upstream.
Only true if Q35 provides nothing of value over i440FX for RHEL-6 guests. Does it?
It is only a downstream decision to try to force it to use q35, despite it not working right today.

On Fri, Aug 17, 2018 at 03:13:22PM +0200, Markus Armbruster wrote:
Daniel P. Berrangé <berrange@redhat.com> writes:
On Fri, Aug 17, 2018 at 12:35:11PM +0200, Andrea Bolognani wrote:
On Fri, 2018-08-17 at 10:29 +0100, Daniel P. Berrangé wrote:
On Thu, Aug 16, 2018 at 06:20:29PM -0400, Laine Stump wrote:
5) Some guest OSes that we still want to support (and which would otherwise work okay on a Q35 virtual machine) have virtio drivers too old to support virtio-1.0 (CentOS6 and RHEL6 are examples of such OSes), but due to the chain of reasons listed above, the "standard" config for a Q35 guest generated by libvirt doesn't support virtio-0.9, hence doesn't support these guest OSes.
Note when talking about "support" you're really saying it from the downstream vendor, specifically RHEL, POV. From upstream or Fedora POV essentially all x86 OS ever made are in scope for running under QEMU if suitable virtual hardware models have been provided. QEMU doesn't maintain any whitelist of "supported" OS that differs from what is technically capable of being run, in the way downstream vendors do.
Well, at least in the case of RHEL 6, "not supported" means that it will not boot at all on q35 with the default guest topology created by libvirt, so that's not really a downstream-only problem :)
I mean from an upstream POV we still support RHEL-6 fine in i440fx, so there's no reason to particularly care about RHEL-6 with q35 upstream.
Only true if Q35 provides nothing of value over i440FX for RHEL-6 guests. Does it?
Q35 has little technical benefit over i440fx for the majority of guest deployments, regardless of guest OS. It provides a more modern-looking platform (nice, but few users are going to especially care about that), and lets you do secure boot with OVMF firmware (a blocker if you want that feature). The desire to have everything use Q35 instead of i440fx is more about downstream vendor testing / support, rather than a strong technical feature gap requiring it.
It is only a downstream decision to try to force it to use q35, despite it not working right today.
Regards, Daniel

Daniel P. Berrangé <berrange@redhat.com> writes:
On Fri, Aug 17, 2018 at 03:13:22PM +0200, Markus Armbruster wrote:
Daniel P. Berrangé <berrange@redhat.com> writes:
On Fri, Aug 17, 2018 at 12:35:11PM +0200, Andrea Bolognani wrote:
On Fri, 2018-08-17 at 10:29 +0100, Daniel P. Berrangé wrote:
On Thu, Aug 16, 2018 at 06:20:29PM -0400, Laine Stump wrote:
5) Some guest OSes that we still want to support (and which would otherwise work okay on a Q35 virtual machine) have virtio drivers too old to support virtio-1.0 (CentOS6 and RHEL6 are examples of such OSes), but due to the chain of reasons listed above, the "standard" config for a Q35 guest generated by libvirt doesn't support virtio-0.9, hence doesn't support these guest OSes.
Note when talking about "support" you're really saying it from the downstream vendor, specifically RHEL, POV. From upstream or Fedora POV essentially all x86 OS ever made are in scope for running under QEMU if suitable virtual hardware models have been provided. QEMU doesn't maintain any whitelist of "supported" OS that differs from what is technically capable of being run, in the way downstream vendors do.
Well, at least in the case of RHEL 6, "not supported" means that it will not boot at all on q35 with the default guest topology created by libvirt, so that's not really a downstream-only problem :)
I mean from an upstream POV we still support RHEL-6 fine in i440fx, so there's no reason to particularly care about RHEL-6 with q35 upstream.
Only true if Q35 provides nothing of value over i440FX for RHEL-6 guests. Does it?
Q35 has little technical benefit over i440fx for the majority of guest deployments, regardless of guest OS.
Alright, I can look it up myself. This list is from Marcel's slide deck "Q35 - QEMU" <https://wiki.qemu.org/images/4/4e/Q35.pdf>, August 2016, page 13:

Q35-only features
● PCIe “goodies”
  – Extended configuration space (MMCFG)
  – PCIe native hotplug
  – Advanced Error Reporting (AER)
  – Alternative Routing-ID Interpretation (ARI)
  – Native Power Management
  – Function Level Reset (FLR)
  – Address Translation Services (ATS)
● AHCI storage controller
● vIOMMU emulation
● “Secure” Secure Boot

We can debate the actual value of these items. Perhaps this will then result in a "little technical benefit over i440fx for the majority of guest deployments, regardless of guest OS" verdict. That's okay. What doesn't work for me is making such sweeping claims without presenting the evidence.

On Fri, 2018-08-17 at 11:43 +0100, Daniel P. Berrangé wrote:
On Fri, Aug 17, 2018 at 12:35:11PM +0200, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
From a testing POV it is desirable to be able to have legacy-only. There is also the possibility that a guest OS implements 1.0 in a buggy manner, so forcing legacy-only is desirable.
The existing device already provides a transitional option on i440fx, and on Q35 if you do explicit placement in a PCI slot. So I don't think there's a good reason to have a second transitional device type, especially if we're naming it virtio-0.9; it would be rather misleading if it were in fact able to run in virtio-1.0 mode.
Sounds reasonable. -- Andrea Bolognani / Red Hat / Virtualization

On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
On Fri, 2018-08-17 at 10:29 +0100, Daniel P. Berrangé wrote:
On Thu, Aug 16, 2018 at 06:20:29PM -0400, Laine Stump wrote:
5) Some guest OSes that we still want to support (and which would otherwise work okay on a Q35 virtual machine) have virtio drivers too old to support virtio-1.0 (CentOS6 and RHEL6 are examples of such OSes), but due to the chain of reasons listed above, the "standard" config for a Q35 guest generated by libvirt doesn't support virtio-0.9, hence doesn't support these guest OSes.
Note when talking about "support" you're really saying it from the downstream vendor, specifically RHEL, POV. From upstream or Fedora POV essentially all x86 OS ever made are in scope for running under QEMU if suitable virtual hardware models have been provided. QEMU doesn't maintain any whitelist of "supported" OS that differs from what is technically capable of being run, in the way downstream vendors do.
Well, at least in the case of RHEL 6, "not supported" means that it will not boot at all on q35 with the default guest topology created by libvirt, so that's not really a downstream-only problem :)
a) we don't really need the virtio-1.0 model, since that's what you currently get anyway when you ask for "virtio" on Q35 (and on 440fx, "virtio" gives you transitional, which works for everybody).
At some point we might have a virtio-2.0 and find ourselves in a similar problem again. IMHO it is preferable to have both explicit versioned models, and discourage use of the magical 'virtio' model from mgmt apps. Use libosinfo to identify which virtio model is supported for the OS in question and use that explicitly. Only use the magical 'virtio' model if there's no information about what version the OS supports.
Agreed.
c) Even if it's possible to force a device on an Express slot into transitional mode, this is extremely wasteful of io port space, so libvirt should consider virtio-0.9 devices to be legacy PCI, and thus plug them into legacy PCI slots. And once we're doing this, it's unnecessary to add any extra option to the qemu commandline to force legacy support (i.e. transitional mode), as that is what QEMU already does when the device is connected to a legacy PCI slot.
Yes, it should plug them into legacy PCI slots by default, but if a mgmt app has done explicit placement itself, it should honour that even if it means wasting IO space.
This is consistent with what already happens with PCI addresses: a PCIe device will never be assigned to a PCI slot automatically, but the user can still force that to happen by providing the PCI address themselves.
C) inside libvirt, the implementation of the "virtio-0.9" model is identical to "virtio", except that the VIR_PCI_CONNECT_TYPE flags for these devices contain VIR_PCI_CONNECT_TYPE_PCI rather than VIR_PCI_CONNECT_TYPE_PCIE, resulting in those devices being assigned to a legacy PCI slot, and thus they would be transitional mode by default.
For 'virtio-0.9' libvirt should set "disable-modern=yes" in QEMU args
For 'virtio-1.0' libvirt should set "disable-legacy=yes" in QEMU args
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
The problem I can see with the virtio-1.0 model name is that if management applications start putting that into their XML (even though "virtio" would yield a working guest), the guests will be unable to migrate to another machine that has the same version of qemu, but an older libvirt that doesn't understand the virtio-1.0 model number. If that's acceptable, then management apps can begin always specifying the version for virtio whether it's old or new. If not, then they should continue to use plain "virtio" unless they specifically need to force virtio-0.9.
An additional task for this is that I've noticed that libvirt's domain capabilities info only contains info for devices of type disk, graphics, video, and hostdev. So even if oVirt learns from libosinfo that a particular guest OS only supports virtio-0.9 netdevs (from libosinfo's POV this means that it shows support for "virtio-net", but not "virtio1.0-net"), it currently has no way of determining if the version of libvirt it's working with supports the "virtio-0.9" model of an interface device (or the "virtio-1.0" model - see above). So we'll need to add sections to the domain capabilities info for all devices that have both virtio-0.9 and virtio-1.0 support (everything but gpu and input, right?)
Is anything else needed (other than the code in the management apps) to make this workable?
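To make the management-application side of this concrete, here is a minimal sketch of the selection logic described above: prefer an explicit versioned model when libosinfo says what the guest supports, and fall back to plain "virtio" otherwise. Everything here is hypothetical: libvirt's domain capabilities do not report these models today, so the set of models libvirt understands is simply passed in, and the function name and parameters are made up for illustration.

```python
from typing import Optional, Set


def choose_virtio_model(os_supports_modern: Optional[bool],
                        libvirt_models: Set[str]) -> str:
    """Pick a model name for a virtio device (illustrative only).

    os_supports_modern: True/False if libosinfo knows whether the guest
        has virtio-1.0 drivers, None if there is no information.
    libvirt_models: model names this libvirt is assumed to understand,
        e.g. as it might be reported by extended domain capabilities.
    """
    if os_supports_modern is None:
        # No information about the guest: keep the "magical" model.
        return "virtio"
    wanted = "virtio-1.0" if os_supports_modern else "virtio-0.9"
    if wanted in libvirt_models:
        return wanted
    # Older libvirt: the versioned name would be rejected, so fall back.
    return "virtio"
```

Note that the fallback does not solve the original problem: a guest with only legacy drivers that ends up with plain "virtio" on Q35 would still need explicit placement in a legacy PCI slot (or the i440fx machine type) to get a usable device.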

On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
The problem I can see with the virtio-1.0 model name is that if management applications start putting that into their XML (even though "virtio" would yield a working guest), the guests will be unable to migrate to another machine that has the same version of qemu, but an older libvirt that doesn't understand the virtio-1.0 model number. If that's acceptable, then management apps can begin always specifying the version for virtio whether it's old or new. If not, then they should continue to use plain "virtio" unless they specifically need to force virtio-0.9.
Well, even using virtio-0.9 could be considered problematic, because at least from the QEMU point of view there's nothing preventing the guest from working correctly as long as the version is recent enough that disable-legacy/disable-modern are available.
AFAIK management applications such as oVirt and OpenStack usually require specific, reasonably recent versions of QEMU and libvirt, so they could make sure virtio-0.9 and virtio-1.0 are understood by all nodes in the cluster that way.
For something like virt-manager where the coupling is loose perhaps it would make sense to use virtio-0.9 only on q35 when the OS requires it and plain virtio everywhere else for the time being, then switch to always using virtio-0.9 and virtio-1.0 after a Reasonable Amount of Time™ has passed.
-- Andrea Bolognani / Red Hat / Virtualization

On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
The problem I can see with the virtio-1.0 model name is that if management applications start putting that into their XML (even though "virtio" would yield a working guest), the guests will be unable to migrate to another machine that has the same version of qemu, but an older libvirt that doesn't understand the virtio-1.0 model number. If that's acceptable, then management apps can begin always specifying the version for virtio whether it's old or new. If not, then they should continue to use plain "virtio" unless they specifically need to force virtio-0.9.
Well, even using virtio-0.9 could be considered problematic, because at least from the QEMU point of view there's nothing preventing the guest from working correctly as long as the version is recent enough that disable-legacy/disable-modern are available.
AFAIK management applications such as oVirt and OpenStack usually require specific, reasonably recent versions of QEMU and libvirt, so they could make sure virtio-0.9 and virtio-1.0 are understood by all nodes in the cluster that way.
Of course this is not a new scenario - any time an app makes use of a new feature exposed in libvirt there's a chance that guests using that feature will not be migratable to older libvirt. The apps, and/or the administrators deploying them, have to decide on the cost/benefit tradeoff.
I think this will have an impact on the ability of apps to adopt use of the virtio-0.9/1.0 device model variants though. Both oVirt and OpenStack do care about live migration to older versions, at least at certain periods in time. For example, during a live upgrade scenario,
This dovetails into Laine querying about the domain capabilities not currently reporting info on the available device models. In the case that migration to older versions is needed, the dom capabilities info won't be looked at anyway, as they don't wish to blindly use a feature that just happens to exist in the current version.
Regards, Daniel
-- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
-- Eduardo
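The three kinds of device being compared here (legacy-only, transitional, modern-only) can be sketched as a function of where the device is plugged and which options are set. This is a simplified model of the behavior described in this thread, not QEMU code: it assumes disable-legacy defaults to "auto" (resolved from the slot type) and disable-modern defaults to off, and the function name is hypothetical.

```python
def effective_virtio_mode(on_pcie_port, disable_legacy="auto",
                          disable_modern=False):
    """Return 'legacy-only', 'transitional' or 'modern-only'.

    on_pcie_port: True if the device sits in a PCI Express port (the
        libvirt default on Q35), False for a conventional PCI slot
        (i440fx, or explicit placement on Q35).
    """
    if disable_legacy not in ("auto", "on", "off"):
        raise ValueError(f"bad disable-legacy value: {disable_legacy!r}")
    if disable_legacy == "auto":
        # Assumed default: the legacy interface (which needs IO port
        # space) is dropped only when plugged into an Express port.
        legacy_disabled = on_pcie_port
    else:
        legacy_disabled = disable_legacy == "on"
    if legacy_disabled and disable_modern:
        raise ValueError("device would expose neither interface")
    if legacy_disabled:
        return "modern-only"
    if disable_modern:
        return "legacy-only"
    return "transitional"
```

Under these assumptions, plain "virtio" is transitional on i440fx and modern-only on Q35, and a transitional device on Q35 requires either a conventional PCI slot or an explicit disable-legacy=off, which is the case being asked about.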

On Wed, Aug 22, 2018 at 09:01:35AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
You can already get a transitional device on Q35, albeit with manual placement. Adding flags for magic placement for the existing devices is not something that is suitable for the XML. The ability to get legacy-only, or modern-only doesn't exist today in any way, so that would be a valid new feature.
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35. I think we should just say that RHEL-6 should use i440fx forever and be done with it.
Regards, Daniel

On Wed, Aug 22, 2018 at 01:26:01PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:01:35AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
You can already get a transitional device on Q35, albeit with manual placement. Adding flags for magic placement for the existing devices is not something that is suitable for the XML. The ability to get legacy-only, or modern-only doesn't exist today in any way, so that would be a valid new feature.
Transitional devices and modern-only devices are different kinds of devices. Making the guest see a different type of device depending on where it's plugged is why we got into this mess, why would we recommend applications to rely on this behavior? That's why I like your virtio-0.9/virtio-1.0 proposal. I just don't see why you think virtio-transitional should be out of it.
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35. I think we should just say that RHEL-6 should use i440fx forever and be done with it.
I'm not sure if you are saying that we (Red Hat) shouldn't spend time implementing it, or that the libvirt upstream project should reject the patches if somebody implements it. I would understand the former, but not the latter. -- Eduardo

On Wed, Aug 22, 2018 at 09:54:55AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 01:26:01PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:01:35AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
You can already get a transitional device on Q35, albeit with manual placement. Adding flags for magic placement for the existing devices is not something that is suitable for the XML. The ability to get legacy-only, or modern-only doesn't exist today in any way, so that would be a valid new feature.
Transitional devices and modern-only devices are different kinds of devices. Making the guest see a different type of device depending on where it's plugged is why we got into this mess, why would we recommend applications to rely on this behavior?
That's why I like your virtio-0.9/virtio-1.0 proposal. I just don't see why you think virtio-transitional should be out of it.
An explicit virtio-transitional device is still two separate devices pretending to be the same thing, but magically changing their identity at runtime. We've already got that situation with existing device models, and I'm loath to see us add a 2nd device model with that same behaviour, just for the sake of having a slightly different PCI bus placement strategy to support outdated guest OS.
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35. I think we should just say that RHEL-6 should use i440fx forever and be done with it.
I'm not sure if you are saying that we (Red Hat) shouldn't spend time implementing it, or that the libvirt upstream project should reject the patches if somebody implements it. I would understand the former, but not the latter.
Even if someone is willing to implement it in libvirt, we have to consider the cost of supporting it in both libvirt and applications using libvirt and the complexity it adds to our story about the docs / best practices for configuring guests.
Even though I do kind of like the virtio-0.9/virtio-1.0 device model as concepts, I'm yet to be convinced that implementing them in libvirt and then also in all the downstream applications (oVirt, OpenStack, virt-manager, cockpit, etc) is actually worth the cost.
There's little compelling reason to care about running outdated OS like RHEL-6 on Q35 in general. The motivation behind it is just coming from an artificially created problem downstream, by wishing to drop the i440fx machine at some still undetermined point in the future. By the time that future comes, RHEL-6 may well even be end of life, making the entire exercise pointless.
Regards, Daniel

On Wed, Aug 22, 2018 at 02:44:40PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:54:55AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 01:26:01PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:01:35AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
You can already get a transitional device on Q35, albeit with manual placement. Adding flags for magic placement for the existing devices is not something that is suitable for the XML. The ability to get legacy-only, or modern-only doesn't exist today in any way, so that would be a valid new feature.
Transitional devices and modern-only devices are different kinds of devices. Making the guest see a different type of device depending on where it's plugged is why we got into this mess, why would we recommend applications to rely on this behavior?
That's why I like your virtio-0.9/virtio-1.0 proposal. I just don't see why you think virtio-transitional should be out of it.
An explicit virtio-transitional device is still two separate devices pretending to be the same thing, but magically changing their identity at runtime. We've already got that situation with existing device models, and I'm loath to see us add a 2nd device model with that same behaviour, just for the sake of having a slightly different PCI bus placement strategy to support outdated guest OS.
You seem to be describing what the current "virtio" device is: it becomes a non-transitional (modern-only) Virtio device in some cases, and becomes a transitional Virtio device in other cases. It is two devices pretending to be the same thing. That's exactly what I would like applications to get rid of.
Now, the above is really not an accurate description of transitional Virtio devices. A transitional Virtio device is something clearly specified in the Virtio spec, and it just means a device that supports two types of drivers. It's not different from an x86_64 CPU that can run 32-bit OSes.
See: http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-... http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-...
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35. I think we should just say that RHEL-6 should use i440fx forever and be done with it.
I'm not sure if you are saying that we (Red Hat) shouldn't spend time implementing it, or that the libvirt upstream project should reject the patches if somebody implements it. I would understand the former, but not the latter.
Even if someone is willing to implement it in libvirt, we have to consider the cost of supporting it in both libvirt and applications using libvirt and the complexity it adds to our story about the docs / best practices for configuring guests.
Even though I do kind of like the virtio-0.9/virtio-1.0 device model as concepts, I'm yet to be convinced that implementing them in libvirt and then also in all the downstream applications (oVirt, OpenStack, virt-manager, cockpit, etc) is actually worth the cost.
There's little compelling reason to care about running outdated OS like RHEL-6 on Q35 in general. The motivation behind it is just coming from an artificially created problem downstream, by wishing to drop the i440fx machine at some still undetermined point in the future. By the time that future comes, RHEL-6 may well even be end of life, making the entire exercise pointless.
I'm all for making a cost/benefit analysis, but I don't think you are taking into account the costs of keeping the confusing semantics of existing "virtio" devices. If you still want to refuse to provide a sane way to configure transitional Virtio devices, I really won't care. But I believe the interface you are trying to keep is actually the one you are criticizing ("two separate devices pretending to be the same thing, but magically changing their identity at runtime"). -- Eduardo

On Wed, Aug 22, 2018 at 11:18:28AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 02:44:40PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:54:55AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 01:26:01PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:01:35AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
You can already get a transitional device on Q35, albeit with manual placement. Adding flags for magic placement for the existing devices is not something that is suitable for the XML. The ability to get legacy-only, or modern-only doesn't exist today in any way, so that would be a valid new feature.
Transitional devices and modern-only devices are different kinds of devices. Making the guest see a different type of device depending on where it's plugged is why we got into this mess, why would we recommend applications to rely on this behavior?
That's why I like your virtio-0.9/virtio-1.0 proposal. I just don't see why you think virtio-transitional should be out of it.
An explicit virtio-transitional device is still two separate devices pretending to be the same thing, but magically changing their identity at runtime. We've already got that situation with existing device models, and I'm loath to see us add a 2nd device model with that same behaviour, just for the sake of having a slightly different PCI bus placement strategy to support outdated guest OS.
Your seem to be describing what the current "virtio" device is: it becomes a non-transitional (modern-only) Virtio device on some cases, and becomes a transitional Virtio device on other cases. It is two devices pretending to be the same thing. That's exactly what I would like applications to get rid of.
Now, the above is really not an accurate description of transitional Virtio devices. A transitional Virtio device is something clearly specified in the Virtio spec, and it just means a device that supports two types of drivers. It's not different from a x86_64 CPU that can run 32-bit OSes.
See: http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-... http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-...
When I say a device pretending to be 2 different devices, I'm generally referring to the fact that a single QEMU device model can expose two different PCI device IDs depending on how it is configured and/or placed.
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35. I think we should just say that RHEL-6 should use i440fx forever and be done with it.
I'm not sure if you are saying that we (Red Hat) shouldn't spend time implementing it, or that the libvirt upstream project should reject the patches if somebody implements it. I would understand the former, but not the latter.
Even if someone is willing to implement it in libvirt, we have to consider the cost of supporting it in both libvirt and applications using libvirt and the complexity it adds to our story about the docs / best practices for configuring guests.
Even though I do kind of like the virtio-0.9/virtio-1.0 device model as concepts, I'm yet to be convinced that implementing them in libvirt and then also in all the downstream applications (oVirt, OpenStack, virt-manager, cockpit, etc) is actually worth the cost.
There's little compelling reason to care about running an outdated OS like RHEL-6 on Q35 in general. The motivation behind it just comes from an artificially created problem downstream, by wishing to drop the i440fx machine at some still undetermined point in the future. By the time that future comes, RHEL-6 may well be end of life, making the entire exercise pointless.
I'm all for making a cost/benefit analysis, but I don't think you are taking into account the costs of keeping the confusing semantics of existing "virtio" devices.
If you still want to refuse to provide a sane way to configure transitional Virtio devices, I really won't care. But I believe the interface you are trying to keep is actually the one you are criticizing ("two separate devices pretending to be the same thing, but magically changing their identity at runtime").
Yeah, I guess I should make a distinction between what I would do if it was a clean slate, and what we should do given our existing practice.
If we had a clean slate I would not like to see our current impl done. Given that it already exists, however, we're stuck with that forever. So the question is whether implementing any of the alternative options is a net benefit for libvirt & mgmt apps overall. My gut feeling is that despite the downsides of the current impl, it is not worth trying to do something different.
The thing that has really tipped my mind this way is that even if we provide new device models, mgmt apps will be loath to actually use them because it will prevent live migration of those guests to hosts with older libvirt.
So my feeling is we should do the work to enable use of Q35 by default in mgmt apps, for guest OS where it is known to work correctly today. Every other OS should just stick with i440fx as we already know that works for them today and Q35 doesn't offer legacy OS compelling enough benefits to switch.
Regards, Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, Aug 22, 2018 at 03:57:20PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 11:18:28AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 02:44:40PM +0100, Daniel P. Berrangé wrote: [...]
An explicit virtio-transitional device is still two separate devices pretending to be the same thing, but magically changing their identity at runtime. We've already got that situation with existing device models, and I'm loath to see us add a 2nd device model with that same behaviour, just for the sake of having a slightly different PCI bus placement strategy to support an outdated guest OS.
You seem to be describing what the current "virtio" device is: it becomes a non-transitional (modern-only) Virtio device in some cases, and becomes a transitional Virtio device in other cases. It is two devices pretending to be the same thing. That's exactly what I would like applications to get rid of.
Now, the above is really not an accurate description of transitional Virtio devices. A transitional Virtio device is something clearly specified in the Virtio spec, and it just means a device that supports two types of drivers. It's not different from a x86_64 CPU that can run 32-bit OSes.
See: http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-... http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-...
When I say a device pretending to be 2 different devices, I'm generally referring to the fact that a single QEMU device model can expose two different PCI device IDs depending on how it is configured and/or placed.
Understood. Then you are not describing transitional Virtio, you are just describing QEMU's disable-legacy=auto behavior (which is the default).
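To make the distinction concrete, here is a rough Python sketch (not QEMU code) of how one device model with disable-legacy=auto can present two different PCI device IDs depending on where it is plugged. The device type numbers and transitional PCI IDs for network and block are taken from the Virtio 1.0 spec; the function names and the in_pcie_port flag are made up for illustration, and disable-modern is ignored to keep it short.

```python
# Illustrative model only; names below are hypothetical, not QEMU APIs.

# Virtio device type numbers and transitional PCI device IDs
# (network and block only), per the Virtio 1.0 spec.
VIRTIO_ID = {"net": 1, "block": 2}
TRANSITIONAL_PCI_ID = {"net": 0x1000, "block": 0x1001}


def resolve_disable_legacy(disable_legacy: str, in_pcie_port: bool) -> bool:
    """Resolve on/off/auto; 'auto' simply follows the slot type."""
    if disable_legacy == "auto":
        return in_pcie_port
    if disable_legacy in ("on", "off"):
        return disable_legacy == "on"
    raise ValueError(f"unexpected value: {disable_legacy!r}")


def pci_device_id(kind: str, disable_legacy: str = "auto",
                  in_pcie_port: bool = False) -> int:
    """Return the PCI device ID the guest would see for this device."""
    if resolve_disable_legacy(disable_legacy, in_pcie_port):
        # Modern-only devices use 0x1040 + the virtio device type number.
        return 0x1040 + VIRTIO_ID[kind]
    # Legacy-capable devices keep the transitional ID.
    return TRANSITIONAL_PCI_ID[kind]
```

With "auto", the same "net" device resolves to 0x1000 in a conventional slot and to 0x1041 in a PCIe port, which is the identity change being argued about; forcing disable-legacy=off pins the transitional ID regardless of placement.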
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35. I think we should just say that RHEL-6 should use i440fx forever and be done with it.
I'm not sure if you are saying that we (Red Hat) shouldn't spend time implementing it, or that the libvirt upstream project should reject the patches if somebody implements it. I would understand the former, but not the latter.
Even if someone is willing to implement it in libvirt, we have to consider the cost of supporting it in both libvirt and applications using libvirt and the complexity it adds to our story about the docs / best practices for configuring guests.
Even though I do kind of like the virtio-0.9/virtio-1.0 device model as concepts, I'm yet to be convinced that implementing them in libvirt and then also in all the downstream applications (oVirt, OpenStack, virt-manager, cockpit, etc) is actually worth the cost.
There's little compelling reason to care about running an outdated OS like RHEL-6 on Q35 in general. The motivation behind it just comes from an artificially created problem downstream, by wishing to drop the i440fx machine at some still undetermined point in the future. By the time that future comes, RHEL-6 may well be end of life, making the entire exercise pointless.
I'm all for making a cost/benefit analysis, but I don't think you are taking into account the costs of keeping the confusing semantics of existing "virtio" devices.
If you still want to refuse to provide a sane way to configure transitional Virtio devices, I really won't care. But I believe the interface you are trying to keep is actually the one you are criticizing ("two separate devices pretending to be the same thing, but magically changing their identity at runtime").
Yeah, I guess I should make a distinction between what I would do if it was a clean slate, and what we should do given our existing practice.
If we had a clean slate I would not like to see our current impl done. Given that it already exists, however, we're stuck with that forever. So the question is whether implementing any of the alternative options is a net benefit for libvirt & mgmt apps overall. My gut feeling is that despite the downsides of the current impl, it is not worth trying to do something different.
Fair enough.
The thing that has really tipped my mind this way is that even if we provide new device models, mgmt apps will be loath to actually use them because it will prevent live migration of those guests to hosts with older libvirt.
This might be an issue for some apps, but is it going to happen in practice? Don't they all need mechanisms to flip a switch and enable features that require newer host software, already?
So my feeling is we should do the work to enable use of Q35 by default in mgmt apps, for guest OS where it is known to work correctly today. Every other OS should just stick with i440fx as we already know that works for them today and Q35 doesn't offer legacy OS compelling enough benefits to switch.
I'd still prefer if libvirt provided a saner configuration mechanism, and let libosinfo and management apps decide what works best for them. If it helps, I can send QEMU patches to make virtio-0.9/virtio-1.0-non-transitional/virtio-1.0-transitional appear as different device types. libvirt would then be able to check if the device type implements "pci-express-device" or "conventional-pci-device", instead of adding device-specific placement logic. -- Eduardo

On Wed, Aug 22, 2018 at 12:49:48PM -0300, Eduardo Habkost wrote:
The thing that has really tipped my mind this way is that even if we provide new device models, mgmt apps will be loath to actually use them because it will prevent live migration of those guests to hosts with older libvirt.
This might be an issue for some apps, but is it going to happen in practice? Don't they all need mechanisms to flip a switch and enable features that require newer host software, already?
That is true, but most features that get added to virt these days are things which are opt-in for specific use cases. E.g., when we added mdev support, if a guest gets given an mdev VGPU it obviously won't be migratable to older libvirt lacking mdev support. The mitigation is that only $TINY % of guests will be using mdev, so the compat problem won't widely affect things. With the alternative virtio models we're discussing here, the idea is that they'd be used by default for all new guest deployments, so the impact will be felt on every guest.
So my feeling is we should do the work to enable use of Q35 by default in mgmt apps, for guest OS where it is known to work correctly today. Every other OS should just stick with i440fx as we already know that works for them today and Q35 doesn't offer legacy OS compelling enough benefits to switch.
I'd still prefer if libvirt provided a saner configuration mechanism, and let libosinfo and management apps decide what works best for them.
If it helps, I can send QEMU patches to make virtio-0.9/virtio-1.0-non-transitional/virtio-1.0-transitional appear as different device types. libvirt would then be able to check if the device type implements "pci-express-device" or "conventional-pci-device", instead of adding device-specific placement logic.
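As a rough illustration of the interface-based check suggested above, here is a small Python sketch. The interface names "pci-express-device" and "conventional-pci-device" come from the message itself; the function, the slot labels and the example interface sets are assumptions made up for this sketch, not libvirt or QEMU code.

```python
# Hypothetical placement helper: decide which slot types a device may use
# purely from the interfaces its QEMU type reports, with no per-device rules.

PCIE_IFACE = "pci-express-device"
CONVENTIONAL_IFACE = "conventional-pci-device"


def allowed_slots(interfaces):
    """Return slot types in order of preference (PCIe first when possible)."""
    slots = []
    if PCIE_IFACE in interfaces:
        slots.append("pcie")
    if CONVENTIONAL_IFACE in interfaces:
        slots.append("pci")
    return slots


# Example interface sets; which interfaces each split-out device type would
# actually implement is an assumption here.
legacy_capable = {CONVENTIONAL_IFACE}
modern_only = {PCIE_IFACE, CONVENTIONAL_IFACE}
```

Under these assumptions a legacy-capable type would only ever be offered a conventional PCI slot, while a modern-only type would default to PCIe, and the placement code would not need to know it is dealing with virtio at all.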
I don't think it makes a big difference from the pov of the libvirt impl, as it's just the difference between "-device virtio-net,modern=off" and "-device virtio-net-0.9". It still has the same amount of extra work and complexity rippling out from libvirt to mgmt apps, to address a problem (make old RHEL6 use Q35 instead of i440fx) that shouldn't really need to exist.
Regards, Daniel

On 08/22/2018 09:44 AM, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:54:55AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 01:26:01PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:01:35AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
You can already get a transitional device on Q35, albeit with manual placement. Adding flags for magic placement for the existing devices is not something that is suitable for the XML. The ability to get legacy-only, or modern-only doesn't exist today in any way, so that would be a valid new feature.
Transitional devices and modern-only devices are different kinds of devices. Making the guest see a different type of device depending on where it's plugged is why we got into this mess, why would we recommend applications to rely on this behavior?
That's why I like your virtio-0.9/virtio-1.0 proposal. I just don't see why you think virtio-transitional should be out of it.
An explicit virtio-transitional device is still two separate devices pretending to be the same thing, but magically changing their identity at runtime. We've already got that situation with existing device models, and I'm loath to see us add a 2nd device model with that same behaviour, just for the sake of having a slightly different PCI bus placement strategy to support an outdated guest OS.
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35. I think we should just say that RHEL-6 should use i440fx forever and be done with it.
I'm not sure if you are saying that we (Red Hat) shouldn't spend time implementing it, or that the libvirt upstream project should reject the patches if somebody implements it. I would understand the former, but not the latter.
Even if someone is willing to implement it in libvirt, we have to consider the cost of supporting it in both libvirt and applications using libvirt and the complexity it adds to our story about the docs / best practices for configuring guests.
Even though I do kind of like the virtio-0.9/virtio-1.0 device model as concepts, I'm yet to be convinced that implementing them in libvirt and then also in all the downstream applications (oVirt, OpenStack, virt-manager, cockpit, etc) is actually worth the cost.
I'm starting to lean towards this opinion too - I was thinking about this over the weekend, and it does seem like the code in the management apps will be convoluted/complex/your favorite adjective. Going into this I had the naive impression that a simple bit of logic in the management application could just take the union of supported devices from libosinfo(guestOS) and domaincapabilities(qemu), then pick the top model name from the list. It's unfortunately not that simple, so we're going to end up with a bunch of extra code in the management application (multiplied by the number of management apps, multiplied by the number of different virtio devices) and that code will need to be maintained.
In the meantime, the only advantages over just giving up and using 440fx for RHEL6 would be 1) consistent support for using Q35 on all "supported" guest OSes, 2) the possibility of doing SecureBoot, and 3) being able to someday in the future eliminate all 440fx-specific code from the set of code that needs to be tested/maintained by downstream maintainers. (1) is nice and clean, but the value is dubious if it's achieved by "unclean" code elsewhere. (2) is a non-feature for almost everyone, and (3) isn't going to happen anyway, since any existing guests have already been set up using 440fx as the machinetype, and we can't force people to change the machinetype of an existing guest (as Dan says, over the amount of time needed to make that an acceptable requirement, the guest OSes in question could likely reach EOL anyway).
So, while technically it's just a tiny change in the configuration of the guest that's needed to make RHEL6+Q35+virtio work, all the trappings around it turn it into a big mess that will likely be more trouble than the trouble of just using 440fx for a couple of guest OSes. (By comparison, I think it might be cleaner/simpler/less bug prone to simply backport the virtio-1.0 drivers to RHEL6.)
Am I giving up too easily?
(NB: of course if we want to require 440fx for RHEL6, we'll still need to do the work on libosinfo to report "supported machinetype", and on all the management applications to honor that information)

On Wed, Aug 22, 2018 at 10:37:12AM -0400, Laine Stump wrote:
On 08/22/2018 09:44 AM, Daniel P. Berrangé wrote:
Even if someone is willing to implement it in libvirt, we have to consider the cost of supporting it in both libvirt and applications using libvirt and the complexity it adds to our story about the docs / best practices for configuring guests.
Even though I do kind of like the virtio-0.9/virtio-1.0 device model as concepts, I'm yet to be convinced that implementing them in libvirt and then also in all the downstream applications (oVirt, OpenStack, virt-manager, cockpit, etc) is actually worth the cost.
I'm starting to lean towards this opinion too - I was thinking about this over the weekend, and it does seem like the code in the management apps will be convoluted/complex/your favorite adjective. Going into this I had the naive impression that a simple bit of logic in the management application could just take the union of supported devices from libosinfo(guestOS) and domaincapabilities(qemu), then pick the top model name from the list. It's unfortunately not that simple, so we're going to end up with a bunch of extra code in the management application (multiplied by the number of management apps, multiplied by the number of different virtio devices) and that code will need to be maintained.
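For reference, the "naive" selection logic described above might look roughly like the Python sketch below. Everything in it is hypothetical: the function name, the model names and the preference order are made up, and it intersects the two lists (a model has to be acceptable to both the guest OS and the QEMU binary) where the text above says "union".

```python
# Naive model selection: what the guest OS supports (as libosinfo might
# report) intersected with what the host QEMU supports (as libvirt's
# domain capabilities might report), then the most preferred survivor.

def pick_model(os_supported, qemu_supported, preference):
    """Return the most preferred model usable by both sides, or None."""
    usable = set(os_supported) & set(qemu_supported)
    for model in preference:
        if model in usable:
            return model
    return None


# Hypothetical preference order, most desirable first.
PREFERENCE = ["virtio-1.0", "virtio", "virtio-0.9"]
```

The sketch deliberately leaves out the parts that make the real problem harder, such as machine type, slot placement and migration compatibility with older hosts, and would have to be repeated per device class in every management application.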
In the meantime, the only advantages over just giving up and using 440fx for RHEL6 would be 1) consistent support for using Q35 on all "supported" guest OSes, 2) the possibility of doing SecureBoot, and 3) being able to someday in the future eliminate all 440fx-specific code from the set of code that needs to be tested/maintained by downstream maintainers. (1) is nice and clean, but the value is dubious if it's achieved by "unclean" code elsewhere. (2) is a non-feature for almost everyone, and (3) isn't going to happen anyway, since any existing guests have already been set up using 440fx as the machinetype, and we can't force people to change the machinetype of an existing guest (as Dan says, over the amount of time needed to make that an acceptable requirement, the guest OSes in question could likely reach EOL anyway).
Two notes:
- The notion of "supported" guest OS is an invention of downstream distro vendors.
- RHEL-6 doesn't support SecureBoot at all, even on bare metal. IOW the possibility of SecureBoot with KVM for RHEL-6 doesn't even arise to begin with.
(NB: of course if we want to require 440fx for RHEL6, we'll still need to do the work on libosinfo to report "supported machinetype", and on all the management applications to honor that information)
Yes, we still have plenty of work to do across the mgmt apps just to get Q35 used by default in the first place.
Regards, Daniel

Eduardo Habkost <ehabkost@redhat.com> writes:
On Wed, Aug 22, 2018 at 01:26:01PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:01:35AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
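The three-way split described above can be written down as a small table. This Python sketch maps the proposed (hypothetical) model names onto QEMU's disable-legacy and disable-modern virtio-pci properties; the helper name and the idea of building a -device argument string this way are assumptions for illustration, not how libvirt generates its command line.

```python
# Proposed model names mapped to virtio-pci properties. The model names are
# proposals from this thread, not existing libvirt models.
MODEL_PROPS = {
    # legacy-only: keep the legacy interface, drop the modern one
    "virtio-0.9": {"disable-legacy": "off", "disable-modern": "on"},
    # modern-only: drop the legacy interface, keep the modern one
    "virtio-1.0": {"disable-legacy": "on", "disable-modern": "off"},
    # plain "virtio": nothing enforced, QEMU's defaults apply
    "virtio": {},
}


def device_arg(qemu_device: str, model: str) -> str:
    """Build the value for a -device option from a model name."""
    props = MODEL_PROPS[model]
    return ",".join([qemu_device] + [f"{k}={v}" for k, v in props.items()])
```

Note that this table has no entry that forces a transitional device (both interfaces enabled), which is the gap raised in the reply that follows.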
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
You can already get a transitional device on Q35, albeit with manual placement. Adding flags for magic placement for the existing devices is not something that is suitable for the XML. The ability to get legacy-only, or modern-only doesn't exist today in any way, so that would be a valid new feature.
Transitional devices and modern-only devices are different kinds of devices. Making the guest see a different type of device depending on where it's plugged is why we got into this mess,
Every time we make -device FOO result in a different device depending on context, device configuration or placement, it eventually joins our collection of Very Bad Ideas. Different PCI device IDs are a clear indicator of device difference.
Instances of this class of Very Bad Ideas I've addressed myself:
* I deprecated "ivshmem" in favor of "ivshmem-plain" and "ivshmem-doorbell".
* I split "ide-drive" into "ide-hd" and "ide-cd" (deprecation wasn't fashionable back then)
* I split "scsi-disk" into "scsi-hd" and "scsi-cd" (likewise)
One time pain, long term gain.
We should consider addressing virtio devices, too: deprecate the chameleon device models after an adequate grace period.
why would we recommend applications to rely on this behavior?
That's why I like your virtio-0.9/virtio-1.0 proposal. I just don't see why you think virtio-transitional should be out of it.
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35.
I strenuously disagree. This is first and foremost about correcting a design mistake we made.
I think we should just say that RHEL-6 should use i440fx forever and be done with it.
I'm not sure if you are saying that we (Red Hat) shouldn't spend time implementing it, or that the libvirt upstream project should reject the patches if somebody implements it. I would understand the former, but not the latter.
I would be willing to listen to a reasoned argument why correcting the design mistake is not worthwhile. I'm unwilling to listen to more downstream blaming. Please stop it.

On Thu, Aug 23, 2018 at 06:08:55PM +0200, Markus Armbruster wrote:
Eduardo Habkost <ehabkost@redhat.com> writes:
On Wed, Aug 22, 2018 at 01:26:01PM +0100, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 09:01:35AM -0300, Eduardo Habkost wrote:
On Wed, Aug 22, 2018 at 12:36:27PM +0200, Andrea Bolognani wrote:
On Tue, 2018-08-21 at 14:21 -0400, Laine Stump wrote:
On 08/17/2018 06:35 AM, Andrea Bolognani wrote:
If we decide we want to explicitly spell out the options instead of relying on QEMU changing behavior based on the slot type, which is probably a good idea anyway, I think we should have
virtio-0.9 => disable-legacy=no,disable-modern=no
virtio-1.0 => disable-legacy=yes,disable-modern=no
There's basically no reason to have a device legacy-only rather than transitional, and spelling out both options instead of only one of them just seems more robust.
I agree with both of those, but the counter-argument is that "virtio" already describes a transitional device like your proposal for virtio-0.9 (at least today), and it makes the versioned models less orthogonal. In the end, I could go either way...
Yeah, Dan already made that argument and convinced me that we should use virtio-0.9 for legacy only, virtio-1.0 for modern only and plain virtio for no enforced behavior / transitional.
I don't understand why we are optimizing the new system for the less useful use cases:
I don't see a use case where virtio-0.9 (legacy-only) would be more useful than virtio-transitional. I don't see why anybody would prefer a legacy-only device instead of a transitional device. Even if your guest has only legacy drivers, it might be upgraded and get new drivers in the future.
I don't see a use case where virtio-1.0 (modern-only) would be more useful than "virtio". If you are running i440fx, you get a transitional device with "virtio", and I don't see why anybody would prefer a modern-only device. If you are running Q35, you already get a modern-only device with "virtio".
The most useful feature users need is the ability to ask for a transitional virtio device on Q35, and this use case is explicitly being left out of the proposal. Why?
You can already get a transitional device on Q35, albeit with manual placement. Adding flags for magic placement for the existing devices is not something that is suitable for the XML. The ability to get legacy-only, or modern-only doesn't exist today in any way, so that would be a valid new feature.
Transitional devices and modern-only devices are different kinds of devices. Making the guest see a different type of device depending on where it's plugged is why we got into this mess,
Every time we make -device FOO result in a different device depending on context, device configuration or placement, it eventually joins our collection of Very Bad Ideas. Different PCI device IDs are a clear indicator of device difference.
Instances of this class of Very Bad Ideas I've addressed myself:
* I deprecated "ivshmem" in favor of "ivshmem-plain" and "ivshmem-doorbell".
* I split "ide-drive" into "ide-hd" and "ide-cd" (deprecation wasn't fashionable back then)
* I split "scsi-disk" into "scsi-hd" and "scsi-cd" (likewise)
One time pain, long term gain.
The pain in those three cases was largely non-existent and/or hidden. Almost no one ever used ivshmem, so by extension almost no one was impacted by the need to use different names. It was also not migratable, so there was no need to care about compatibility with older versions. For ide-drive/scsi-disk, the change was completely hidden inside libvirt and wasn't ABI sensitive so didn't affect apps or migration.
We should consider addressing virtio devices, too: deprecate the chameleon device models after an adequate grace period.
The change discussed here would have major impact by comparison as it would be telling every single app to change what they do, and the change would prevent their guests being compatible with older libvirt. I don't think the gain outweighs the costs of making the changes across every mgmt app, even ignoring the cost of the backward-compatibility problems.
Honestly though, the longer this discussion goes on, the more I think the answer is just "do nothing". All this time spent on discussion, and future time spent on implementing new logic in apps, is merely to support running RHEL-6 on Q35.
I strenuously disagree. This is first and foremost about correcting a design mistake we made.
I'm not sure if you are saying that we (Red Hat) shouldn't spend time implementing it, or that the libvirt upstream project should reject the patches if somebody implements it. I would understand the former, but not the latter.
I would be willing to listen to a reasoned argument why correcting the design mistake is not worthwhile. I'm unwilling to listen to more downstream blaming. Please stop it.
There are countless mistakes in both QEMU & libvirt, but only some of them are worth the cost of changing. I'm not seeing a compelling reason why this change is worthwhile. The impact of the design mistake is narrow and only raised because of downstream desire to change even legacy OS to use Q35 when there's no benefit to those OS of such a change.
Regards, Daniel

On Thu, Aug 23, 2018 at 05:26:47PM +0100, Daniel P. Berrangé wrote: [...]
There are countless mistakes in both QEMU & libvirt, but only some of them are worth the cost of changing. I'm not seeing a compelling reason why this change is worthwhile. The impact of the design mistake is narrow and only raised because of downstream desire to change even legacy OS to use Q35 when there's no benefit to those OS of such a change.
I think you underestimate the impact of the design mistake. Maintaining and working around badly designed interfaces has costs.
The virtio device model was already an obstacle when designing new bus/device introspection interfaces. It will be an obstacle for adding mechanisms to tell applications that legacy virtio devices can't be plugged on PCI Express slots.
Anyway, if we want to fix the design mistake it wouldn't make sense to do it only on the libvirt side and not on QEMU. We can address that on QEMU first, and then let libvirt decide how to handle it.
-- Eduardo

Eduardo Habkost <ehabkost@redhat.com> writes:
On Thu, Aug 23, 2018 at 05:26:47PM +0100, Daniel P. Berrangé wrote: [...]
There are countless mistakes in both QEMU & libvirt, but only some of them are worth the cost of changing.
Agreed.
I'm not seeing a compelling reason why this change is worthwhile. The impact of the design mistake is narrow and only raised because of downstream desire to change even legacy OS to use Q35 when there's no benefit to those OS of such a change.
I think you underestimate the impact of the design mistake.
And overstate the "this is just for a downstream need".
Maintaining and working around badly designed interfaces has costs.
The virtio device model was already an obstacle when designing new bus/device introspection interfaces. It will be an obstacle for adding mechanisms to tell applications that legacy virtio devices can't be plugged on PCI Express slots.
Thus, there's a genuine upstream motivation to clean up this mess. Whether it's worthwhile is of course a fair question. The argument for "it is worthwhile" I like to see in general is patches.
Anyway, if we want to fix the design mistake it wouldn't make sense to do it only on the libvirt side and not on QEMU. We can address that on QEMU first, and then let libvirt decide how to handle it.
Yes.
participants (7):
- Andrea Bolognani
- Daniel P. Berrangé
- Eduardo Habkost
- Gerd Hoffmann
- Laine Stump
- Laine Stump
- Markus Armbruster