On Wed, 2016-08-10 at 00:32 -0400, Laine Stump wrote:
On 08/09/2016 09:11 AM, Andrea Bolognani wrote:
> On Mon, 2016-08-08 at 04:56 -0400, Laine Stump wrote:
> >
> > These patches use three methods to get more of the PCI devices onto PCIe
> > slots on Q35 and aarch64/virt machinetypes:
> >
> > 1) When virtio devices can present themselves as PCIe if they're
> > plugged into a PCIe controller, do that. (This capability is
> > detected by looking for presence of the "disable-modern" option
on
> > the virtio-net device. If you have a better idea for how to detect
> > it, please let me know.)
> How does this play along with Ján's series[1] implementing a
> way to control the protocol revision for virtio devices?
It doesn't, because those patches haven't been pushed yet. But it
doesn't conflict either (except that Jan and I are both checking for
essentially the same epoch difference in virtio by looking at different
options - he looks for "disable-legacy" in any virtio device, while I
look for "disable-modern" in virtio-net).
Note that setting disable-modern/disable-legacy does *not* necessarily
do what you think, and we don't want to always have to set virtio
revision='1.0' in order to get a PCIe device, nor do we want all PCIe
devices to be revision 1.0-only. There is a case for having
disable-modern=off,disable-legacy=off even for a device installed on a
PCIe slot (it's compatible with old guest drivers, but new guest drivers
can take advantage of virtio 1.0, and it is legacy-PCI-free. On the
other hand, it requires the PCIe controller to reserve IO port space,
which we try to avoid.)
Sheesh, you're right, I got it backwards - it's not that you
need 1.0-only for PCIe, you *can't* have PCIe unless you also
enable 1.0. Thanks for pointing out the mistake.
For example, I just tried every combination of:
disable-modern=off,disable-legacy=off (i.e. "all versions")
disable-modern=on,disable-legacy=off ("0.9")
disable-modern=off,disable-legacy=on ("1.0")
with these three types of slots:
pcie-root (PCIe root complex "integrated" device)
pcie-root-port (pcie)
pci-bridge (legacy PCI)
I found that setting "1.0" doesn't force the device to be PCIe - if
it's
on either pci-bridge or pcie-root, it will still be properly advertised
as a PCI device (the "Express Device" capability has been removed from
the PCI capabilities data). The difference is that it will no longer
request/require IO port space, and the device ID changes from 1af4:1000
to 1af4:1041. So there is no problem with putting a device with
revision='1.0' into a legacy PCI slot.
The pcie-root bit is weird - I know legacy PCI devices can be
assigned directly to it, but I'd expect a device that can do
both PCI and PCIe to present itself as PCIe when plugged into
a PCIe-capable slot.
Does this mean we have to be careful *not* to assign virtio
devices to pcie-root unless they're 0.9-only, in order to
chase our goal of having PCI-free guests?
Setting 0.9 *does* force the device to be a PCI device even if
it's
plugged into a PCIe slot - the "Express is removed from the PCI
capabilities. So definitely if 0.9 is set in order to disable
virtio-1.0, we should not put it in a PCIe slot. This leads to the
conclusion that any guest OS on Q35 that doesn't have virtio-1.0 drivers
will require one of the following:
1) require that we can force both disables to OFF (that's the only way
to get a true PCIe device that supports virtio-0.9)
That sounds like a good idea, and Ján's series already covers
that use case IIRC: you'd just have to add
<virtio revision='0.9'/>
<virtio revision='1.0'/>
to the device. That gives you full control.
(NB: starting with
qemu-2.7.0, the default for virtio devices plugged into a pcie-*-port
slot will be "disable-legacy=on" (in order to conserve IO port space),
and the only way for us to set disable-legacy=off will be to force
virtio-0.9, i.e. there will be *no method* of getting a PCIe device that
supports old virtio.)
Even though disable-legacy will default to on, I assume we
would still be able to set it to off explicitly? Or would
that be prevented altogether?
2) live with having a dmi-to-pci-bridge and pci-bridge so that
we'll
have a proper place to plug in an 0.9-only device.
3) provide a simple way to force the 0.9-only virtio device onto
pcie-root (pcie.0) (but of course that doesn't support hotplug...)
Anyway, back to the topic - we *don't* want to be using the virtio
revision setting to force PCI vs. PCIe (or think that we have to set it
in order to get one or the other); the only information we can take from
that setting to aid us in the PCI/PCIe decision is that when 0.9 is set,
we definitely can't put the device on a PCIe slot; otherwise, it can be
put on either type of slot and it will work correctly.
Indeed.
> At the very least, the current approaches don't mix - if
both
> your series and Ján's were to be merged now, a device that has
> been configured to use virtio 0.9 would end up assigned to a
> PCIe slot.
The only reason they don't appear to mix is because Jan's patches don't
account for mine, and my patches don't account for his, because none of
them are pushed yet. There's no intrinsic reason why they couldn't
though.
Just to be clear, I didn't mean to suggest that you were doing
anything wrong: I just noticed both you and Ján were working on
partially overlapping features that don't necessarily work well
together (see example above). Discussing how to make them work
well together was the whole point of me bringing that up :)
> > 2) Any devices that aren't hotpluggable anyway will no
longer request
> > a hotplug-capable slot. This, along with a change to auto-assign
> > legacy PCI devices to pcie-root (as long as they don't require
> > hotplug) means the devices will now be assigned to pcie-root.
> >
> > 3) Also using the fact that devices that won't be hotplugged can be
> > assigned to pcie-root, the devices that *do* support hotplug have a
> > new optional subelement "<hotplug require='no'/>"
(it defaults to
> > "yes" for historical reasons). If hotplug require is set to
'no',
> > even a PCI device can be auto-assigned to pcie-root.
> >
> > I haven't yet removed the dmi-to-pci-bridge that is added by default,
> > and we still will only auto-add pci-bridge (so if you're adding a
> > virtio device without <hostplug require='no'/> then you will also
need
> > to add a <controller type='pci' model='pcie-root-port'/>
to plug it
> > into).
>
> I don't get any dmi-to-pci-bridge on aarch64 virt guests,
> which is good because we don't want it at the moment :)
Yes, because you've already removed it, in the way that I want to remove
it from Q35. (but because Q35 guests are more likely to need a PCI-only
device than aarch64 guests, I can't really do that until I've updated
the "auto-create more slots" code to create a dmi-to-pci-bridge when
necessary.
(Hmm, but here's a problem - if there aren't currently any legacy PCI
devices in a config (and thus no dmi-to-pci-bridge or pci-bridge), and
someone decides to hotplug a legacy-PCI device, *then* what do you do? I
don't want to have these legacy controllers in *every damn config in the
world* just in case some moro... er "well meaning user" thinks they want
to hotplug an rtl_8139 emulated ethernet...)
I don't really have an answer for that, I'm afraid.
What I can say is: there are many areas in libvirt where we
choose the setup that ensures the widest compatibility by
default, so that users who don't want to care too much about
the details can just stick to that and have very high chances
of never running into trouble.
I think that's a sensible course of action, as long as you
also allow users who know what they're doing and want to
build a very precise setup to do so.
With that in mind, auto-adding a dmi-to-pci-bridge/pci-bridge
combo by default is probably not too bad *assuming* you have
a way to opt out of it.
--
Andrea Bolognani / Red Hat / Virtualization