[libvirt] How should libvirt apps enable virtio-pci for aarch64?

Hi all,

I'm trying to figure out how apps should request virtio-pci for libvirt + qemu + arm/aarch64. Let me provide some background.

qemu's original arm/aarch64 virtio support is via virtio-mmio, libvirt XML <address type='virtio-mmio'/>. Currently this is what libvirt sets as the address default for all arm/aarch64 virtio devices in the XML. Long term, though, all arm virt will likely be using virtio-pci: it's faster, enables hotplug, is more x86-like, etc.

Support for virtio-pci is newer and not as widespread. qemu has had the necessary support since 2.4 at least, but the guest side isn't well distributed yet. For example, Fedora 23 and earlier don't work out of the box with virtio-pci. Internal RHELSA (RHEL Server for Aarch64) builds have it working recently, AFAIK.

Libvirt has some support for enabling virtio-pci with aarch64, commits added by Pavel Fedin in v1.2.19 (see e8d55172544c1fafe31a9e09346bdebca4f0d6f9). The patches add a PCIe controller automatically to the XML (and qemu commandline) if qemu-system-aarch64 supports it. However, virtio-mmio is still used as the default virtio address, given the current lack of OS support.

So we are at the point where libvirt apps want to enable this, but presently there isn't a good solution; the only option is to fully allocate <address type='pci' ...> for each virtio device in the XML. This is suboptimal for 2 reasons:

#1) apps need to duplicate libvirt's non-trivial address type=pci allocation logic

#2) apps have to add an <address> block for every virtio device, which is less friendly than the x86 case where this is rarely required. Any XML device snippets that work for x86 likely won't give the desired result for aarch64, since they will default to virtio-mmio. Think virsh attach-device/attach-disk commands.

Here are some possible solutions:

* Drop the current behavior of adding a PCIe controller unconditionally, and instead require apps to specify it in the XML. Then, if libvirt sees a PCIe controller in the XML, default the virtio address type to pci. Apps will know if the OS they are installing supports virtio-pci (eventually via libosinfo), so this is the way we can implicitly ask libvirt 'allocate us pci addresses'.

Upsides:
- Solves both the stated problems.
- Simplest addition for applications IMO

Downsides:
- Requires a libvirt behavior change, no longer adding the PCIe controller by default. But in practice I don't think it will really affect anyone, since there isn't really any OS support for virtio-pci yet, and no apps support it either AFAIK.
- The PCIe controller is not strictly about virtio-pci, it's for enabling plain emulated PCI devices as well. So there is a use case for using the PCIe controller for a graphics card even while your OS doesn't yet support virtio-pci. In the big picture, though, this is a small time window with current OSes, and users can work around it by manually requesting <address type='virtio-mmio'/>, so medium/long term this isn't a big deal IMO.
- The PCIe controller XML is:

  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='dmi-to-pci-bridge'/>
  <controller type='pci' index='2' model='pci-bridge'/>

  I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps. Laine?

* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.

Upsides:
- We can stick with the current PCIe controller default and avoid some of the problems mentioned above.
- An auto address feature may be useful in other contexts as well.

Downsides:
- Seems potentially tricky to implement in libvirt code. There are many places that check type=pci and key off that; it seems like it would be easy to miss updating a check and cause regressions. Maybe we could add a new type like auto-pci to make it explicit. There's probably some implementation trick to make this safe, but at first glance it looked a little dicey.
- Doesn't really solve problem #2 mentioned up above... maybe we could change the address allocation logic to default to virtio-pci if there's already a virtio-pci device in the XML. But it's more work.
- More work for apps, but nothing horrible.

* Change the default address type from virtio-mmio to pci, if qemu supports it. I'm listing this for completeness. In the short term this doesn't make sense as there aren't any OS releases that will work with this default. However it might be worth considering for the future, maybe keying off a particular qemu version or machine type. I suspect 2 years from now no one is going to be using virtio-mmio, so long term it's not an ideal default.

I think the first option is best (keying off the PCIe controller specified by the user), with a longer term plan to change the default from mmio to pci. But I'm not really sold on anything either way. So I'm interested if anyone else has ideas.

Thanks,
Cole

Hello!
- The PCIe controller XML is:

  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='dmi-to-pci-bridge'/>
  <controller type='pci' index='2' model='pci-bridge'/>

I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps.
Since we are discussing this, I have a question. Why do we construct exactly this thing? What is "dmi-to-pci-bridge" and why do we need it? I guess this is something PC-specific; maybe this layout has been copied from some real PC model, but I don't see any practical sense in it.

Also, there are two problems with "pci-bridge":

1. Live migration of this thing in qemu is broken. After migration the bridge screws up, lspci says "invalid header", and I don't know whether it actually works, because I never attach anything behind it, because of (2).

2. After pcie-root we have PCI-X, which supports MSI-X. And after pci-bridge we seem to have a plain PCI bus, which supports only plain MSI. The problem here is that virtio seems to work only with MSI-X in any advanced mode (multiqueue, vhost, etc). If I place a device behind the bridge (and libvirt's default logic is to place the device there), MSI-X will not work. The same applies to passthrough VFIO devices. This is especially painful because on real-life ARM64 platforms builtin hardware seems to mandate MSI-X. For example, on ThunderX the NIC driver simply does not support anything except MSI-X.
* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.
I like this one, and IMHO this would be nice to have regardless of the default. Manual assignment of the PCI layout is a tedious process, which is not always necessary. I think it is quite logical to let the user just say "I want this device on the PCI bus" and do the rest for him. Also, this would make it possible to have a simple "Address type" switch in e.g. virt-manager, so that I would be able to easily choose the way a device connects to the system.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On 12/07/2015 03:19 AM, Pavel Fedin wrote:
Hello!
- The PCIe controller XML is: <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='dmi-to-pci-bridge'/> <controller type='pci' index='2' model='pci-bridge'/> I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps.
Since we are discussing this, I have a question. Why do we construct exactly this thing? What is "dmi-to-pci-bridge" and why do we need it? I guess this is something PC-specific; maybe this layout has been copied from some real PC model, but I don't see any practical sense in it.
This is likely just a side effect of the libvirt code requesting PCIe for aarch64, but the original PCIe support was added for the x86 q35 layout.
Also, there are two problems with "pci-bridge": 1. Live migration of this thing in qemu is broken. After migration the bridge screws up, lspci says "invalid header", and i don't know whether it actually works because i never attach anything behind it, because of (2).
I didn't even know aarch64 migration was working...
2. After pcie-root we have PCI-X, which supports MSI-X. And after pci-bridge we seem to have a plain PCI, which supports only plain MSI. The problem here is that virtio seems to work only with MSI-X in any advanced mode (multiqueue, vhost, etc). If i place it behind the bridge (and default libvirt's logic is to place the device there), MSI-X will not work. The same applies to passthrough VFIO devices. This is especially painful because on real-life ARM64 platforms builtin hardware seems to mandate MSI-X. For example on ThunderX NIC driver simply does not support anything except MSI-X.
Maybe this is just a factor of libvirt specifying the wrong bits on the aarch64 command line. Do you have a working qemu commandline outside of libvirt?
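For reference, a minimal hand-built qemu command line along these lines should exercise virtio-pci on the "virt" machine outside of libvirt. This is only a sketch: the firmware image name, disk path and addr= values are placeholders, but the virtio-*-pci device models and bus=pcie.0 are the standard qemu spellings:

  qemu-system-aarch64 -M virt -cpu cortex-a57 -m 1024 -nographic \
      -bios QEMU_EFI.fd \
      -drive if=none,id=hd0,file=guest.img,format=raw \
      -device virtio-blk-pci,drive=hd0,bus=pcie.0,addr=0x2 \
      -netdev user,id=net0 \
      -device virtio-net-pci,netdev=net0,bus=pcie.0,addr=0x3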
* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.
I like this one, and IMHO this would be nice to have regardless of the default. Manual assignment of PCI layout is a tedious process, which is not always necessary. I think it is quite logical to allow the user just to say: "I want this device on the PCI bus", and do the rest for him.
Agreed, I'll look into it in addition to the user PCI controller bits.
Also, this would allow to have a simple "Address type" switch in e.g. virt-manager, so that i would be able to easily choose a way which the device uses to connect to the system.
Currently I don't see a compelling reason to add this to the UI, really. Sure, it's probably useful for virt dev testing to switch between mmio and pci, but long term everyone is going to use pci, so I don't think end users are going to be tweaking this. I think extending virt-xml to handle the address bits should cover it.

Thanks,
Cole

On 12/07/2015 10:33 AM, Cole Robinson wrote:
On 12/07/2015 03:19 AM, Pavel Fedin wrote:
Hello!
- The PCIe controller XML is: <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='dmi-to-pci-bridge'/> <controller type='pci' index='2' model='pci-bridge'/> I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps. Since we are discussing this, i have a question. Why do we construct exactly this thing? What is "dmi-to-pci-bridge" and why do we need it?
A dmi-to-pci-bridge plugs into a PCIe port (on real hardware at least, it isn't allowed in a normal PCI slot) and provides standard PCI slots downstream. So it is a way to convert from PCIe to PCI. However, you can't hot-plug devices into these standard PCI slots, which is why a pci-bridge is plugged into one of the slots of the dmi-to-pci-bridge - to provide standard PCI slots that can accept hot-plugged devices (which is what most management apps expect to be available).

There was considerable discussion about this when support for the Q35 chipset was added, and this bus structure was taken directly from the advice I was given by qemu/pci people. (At the time I was concerned about whether or not it should be allowed to plug standard PCI devices into PCIe slots and vice versa, since that is physically not possible on real hardware; we've since learned that qemu doesn't have much problem with this in most cases, and I've loosened up the restrictions in libvirt - auto-assign will match types, but you can force most endpoint devices into any PCI or PCIe slot you like.)

For AARCH64, though... well, if you want to know why it's added for that machinetype, I guess you'd need to talk to the person who turned on addPCIeRoot for AARCH64 :-). I actually wondered about that recently when I was tinkering with auto-adding USB2 controllers when the machinetype is Q35 (i.e. "why are we adding an Intel/x86-specific controller to AARCH64 machines?").

(BTW, the name "dmi-to-pci-bridge" was chosen specifically to *not* be platform-specific. It happens that currently the only example of this type of controller is i82801b11-bridge, as can be seen in the XML here:

  <model name='i82801b11-bridge'/>

but if some other pci controller in the future behaves in the same way, it could also be classified as a dmi-to-pci-bridge, with a corresponding different <model name .../>.)
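For reference, once libvirt fills in the addresses the three controllers chain together roughly like this (the slot values here are only illustrative; libvirt may pick different ones depending on the machinetype):

  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='dmi-to-pci-bridge'>
    <model name='i82801b11-bridge'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
  </controller>
  <controller type='pci' index='2' model='pci-bridge'>
    <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
  </controller>

i.e. the dmi-to-pci-bridge hangs off pcie-root (bus 0), the pci-bridge hangs off the dmi-to-pci-bridge (bus 1), and endpoint devices then get auto-placed on the pci-bridge's bus (bus 2).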
I guess this is something PC-specific; maybe this layout has been copied from some real PC model, but I don't see any practical sense in it.

This is likely just a side effect of the libvirt code requesting PCIe for aarch64, but the original PCIe support was added for the x86 q35 layout.
Correct. the "addPCIeRoot" bool, created only with thought to the Q35 bus structure was overloaded to also create the other controllers, and that detail was missed when support for pcie-root on aarch64 virt machinetypes was added.
Also, there are two problems with "pci-bridge": 1. Live migration of this thing in qemu is broken. After migration the bridge screws up, lspci says "invalid header", and i don't know whether it actually works because i never attach anything behind it, because of (2). I didn't even know aarch64 migration was working...
Is this only a problem on aarch64, or is there a migration problem with pci-bridge on x86 as well? (It's possible there is but it hasn't been noticed, because pci-bridge likely isn't used much outside of q35 machines, and q35 was prohibited from migrating until qemu 2.4 due to the embedded sata driver which didn't support migration.)
2. After pcie-root we have PCI-X, which supports MSI-X. And after pci-bridge we seem to have a plain PCI, which supports only plain MSI. The problem here is that virtio seems to work only with MSI-X in any advanced mode (multiqueue, vhost, etc). If i place it behind the bridge (and default libvirt's logic is to place the device there), MSI-X will not work.
libvirt's PCI address allocation logic can certainly be changed, as long as it's done in a way that won't break auto-allocation for any older machinetypes. For example, we could make virtio devices prefer to be plugged into a PCIe port (probably a pcie-downstream-switch-port or pcie-root-port). BTW, does the aarch64 virt machinetype support any controller aside from the embedded pcie-root? Normally the ports on pcie-root don't support hotplug directly, but need a pcie-root-port (or a set of switch ports) plugged into them at boot time. The only examples of these types of controllers that libvirt knows about are based on Intel chips (ioh3420, x3130-whatever).
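In XML terms that preference would look something like the following sketch - assuming the ioh3420-based pcie-root-port can actually be used on the aarch64 virt machinetype, which is exactly the open question here (addresses illustrative):

  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='pcie-root-port'>
    <model name='ioh3420'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
  </controller>
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
  </interface>

The root port provides a single hotplug-capable PCIe slot (slot 0 on its bus), which is where we would want virtio devices to land by default.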
The same applies to passthrough VFIO devices. This is especially painful because on real-life ARM64 platforms builtin hardware seems to mandate MSI-X. For example on ThunderX NIC driver simply does not support anything except MSI-X.
Maybe this is just a factor of libvirt specifying the wrong bits on the aarch64 command line. Do you have a working qemu commandline outside of libvirt?
Similar question - is this problem present on x86 as well?
* Next idea: Users specify something like like <address type='pci'/> and libvirt fills in the address for us. I like this one, and IMHO this would be nice to have regardless of the default. Manual assignment of PCI layout is a tedious process, which is not always necessary. I think it is quite logical to allow the user just to say: "I want this device on the PCI bus", and do the rest for him. Agreed, I'll look into it in addition to the user PCI controller bits.
Right now we will auto-add only a pci-bridge if no available slot is found for a pci device, but we will (should anyway) auto-assign a slot on an *existing* PCIe controller if the device has PCIE as the preferred slot type. It would be really cool if the "current machinetype" had a "preferred controller type" for pci devices with no manually assigned address, and would auto-add the appropriate controller according to that (with parents auto-added as necessary).

Hello!
For AARCH64, though... well, if you want to know why it's added for that machinetype, I guess you'd need to talk to the person who turned on addPCIeRoot for AARCH64 :-).
That was me, and I simply reused the existing code. Actually, I initially tried to use addPCIRoot, but it ended up with a plain PCI bus, which does not support MSI-X. Or it did not work at all, because libvirt refers to bus #0 as just "pci", which does not work with the virt machine. I don't remember the exact outcome. So I decided to use addPCIeRoot instead, and everything just worked.
I actually wondered about that recently when I was tinkering with auto-adding USB2 controllers when machinetype is Q35 (i.e. "why are we adding an Intel/x86-specific controller to AARCH64 machines?")
I guess we could tweak it so that on non-x86 we have only pcie-root, without all the downstream chain. For qemu it will work perfectly, and actually in my XML I put all my devices on bus #0 so that MSI-X works. I don't use the downstream buses at all. But, indeed, I never tested hotplugging.
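In domain XML terms that just means overriding the auto-assigned address with something on bus 0, e.g. (slot value illustrative):

  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
  </interface>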
I didn't even know aarch64 migration was working...
It should work out of the box with GICv2, and for GICv3 I posted RFC patches. RFC because the kernel API for that is not ready yet.
Is this only a problem on aarch64, or is there a migration problem with pci-bridge on x86 as well? (It's possible there is but it hasn't been noticed, because pci-bridge likely isn't used much outside of q35 machines, and q35 was prohibited from migrating until qemu 2.4 due to the embedded sata driver which didn't support migration.)
I think it should be a generic problem, because I don't see how the PCI bridge implementation could contain any arch-specific code.
BTW, does the aarch64 virt machinetype support any controller aside from the embedded pcie-root?
No. The actual controller used by the "virt" machine is called "generic PCI host", and it registers itself as "pcie.0".
Right now we will auto-add only a pci-bridge if no available slot is found for a pci device, but we will (should anyway) auto-assign a slot on an *existing* PCIe controller if the device has PCIE as the preferred slot type.
AFAIR we already do this. The only problem is that the preferred slot type for a device is currently hardcoded to be PCIe for video cards and, IIRC, SCSI, and plain PCI for everything else.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On 12/08/2015 02:22 AM, Pavel Fedin wrote:
Hello!
For AARCH64, though... well, if you want to know why it's added for that machinetype, I guess you'd need to talk to the person who turned on addPCIeRoot for AARCH64 :-). That was me, and i simply reused the existing code.
Yeah, I know :-) (or rather "git-blame-mode" in emacs knows!) Just some good-natured poking :-) (Actually I wouldn't be surprised if it was me who reviewed your patch, and I just assumed that if you were adding the other controllers, you must actually want them.)
Actually, initially i tried to use addPCIRoot first, but it ended up in a plain PCI bus which does not support MSI-X. Or, even, it did not work at all, because libvirt refers to bus#0 as to just "pci", which does not work with virt.
qemu has hardcoded the PCI root bus with the name "pci.0" and the PCIe root complex with the name "pcie.0", and libvirt sticks with those names (but names all optional buses as "pci.n").
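So, roughly, the controller index in the XML maps onto the qemu bus name that devices get attached to. A sketch, with the mapping for the optional buses inferred from the pci.n naming above:

  <controller type='pci' index='0' model='pcie-root'/>          <!-- qemu's built-in "pcie.0" -->
  <controller type='pci' index='1' model='dmi-to-pci-bridge'/>  <!-- exposed as bus "pci.1" -->
  <controller type='pci' index='2' model='pci-bridge'/>         <!-- exposed as bus "pci.2" -->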
I don't remember the exact outcome. So, i decided to use addPCIeRoot instead, and everything just worked.
Except the extra stuff. The origin of the problem was in my overloading "addPCIeRoot" to indicate that the other two controllers should also be added. You only made the mistake of thinking that the name of the variable was actually an accurate/complete description of what it did :-)
I actually wondered about that recently when I was tinkering with auto-adding USB2 controllers when machinetype is Q35 (i.e. "why are we adding an Intel/x86-specific controller to AARCH64 machines?") I guess we could tweak it so that on non-x86 we have only pcie-root without all the downstream chain.
That sounds fine to me...
For qemu it will perfectly work, and actually in my XML i put all my devices to bus#0, so that MSI-X works. I don't use downstream buses at all. But, indeed, i never tested hotplugging.
...as long as something is done about hotplug (assuming aarch64 supports that, or plans to support it). (I wonder if maybe we could/should add an option somewhere saying "hotplugCapable=yes|no". No idea where or how that would go, or *exactly* what it would do, but I guess it would end up being shorthand for adding a controller or two, and would then be swallowed by the parser...)
I didn't even know aarch64 migration was working... It should work out of the box with GICv2, and for GICv3 i posted RFC patches. RFC because kernel API for that is not ready yet.
Is this only a problem on aarch64, or is there a migration problem with pci-bridge on x86 as well? (It's possible there is but it hasn't been noticed, because pci-bridge likely isn't used much outside of q35 machines, and q35 was prohibited from migrating until qemu 2.4 due to the embedded sata driver which didn't support migration.) I think it should be a generic problem because i don't see how PCI bridge implementation could contain any arch-specific code.
BTW, does the aarch64 virt machinetype support any controller aside from the embedded pcie-root? No. The actual controller used by "virt" machine is called "generic PCI host", and it registers itself as "pcie.0".
But apparently qemu is accepting "-device i82801b11-bridge" and -device pci-bridge, since that's what you get from libvirt. If these devices aren't supported for aarch64, they should be disabled in qemu (and not listed in the devices when libvirt asks). (An aside - it's really annoying that so many of the different qemu binaries don't accept "-device ?" to return a list of valid devices; this is very useful when exploring...)
Right now we will auto-add only a pci-bridge if no available slot is found for a pci device, but we will (should anyway) auto-assign a slot on an *existing* PCIe controller if the device has PCIE as the preferred slot type. AFAIR we already do this. The only problem is that preferred slot type for the device is currently hardcoded to be PCIe for video cards and, IIRC, SCSI; and plain PCI for everything else.
Right (at least for video; I don't know about SCSI - if that's the case then it was added by someone after my initial implementation). These are hardcoded because there is currently no information available via qemu introspection that can tell us which type of bus is better for any device (there was a thread about that on the qemu mailing list a year or so ago, but I don't think anything ever came of it). Since we can't learn the proper information from qemu, the best we can do is modify that one horribly ugly function (making it more horrible, even uglier, and bigger) to change the preferred bus type for each device depending on the arch/machinetype and the device itself.

Hello!
I don't remember the exact outcome. So, i decided to use addPCIeRoot instead, and everything just worked.
Except the extra stuff. The origin of the problem was in my overloading "addPCIeRoot" to indicate that the other two controllers should also be added. You only made the mistake of thinking that the name of the variable was actually an accurate/complete description of what it did :-)
No, it was fine, and I don't consider it a mistake. All I needed was to tell libvirt somehow that the machine has PCIe, and that did the job perfectly. I knew that it adds two more devices for some reason, but decided just to leave it as it is, because I assumed that there was a reason for them to be there; I just didn't care what exactly the reason was. I am only questioning it now.
But apparently qemu is accepting "-device i82801b11-bridge" and -device pci-bridge, since that's what you get from libvirt. If these devices aren't supported for aarch64, they should be disabled in qemu (and not listed in the devices when libvirt asks).
It's actually supported, and everything just works. Here is what I get in the machine:

--- cut ---
[root@localhost ~]# lspci -v
00:00.0 Host bridge: Red Hat, Inc. Device 0008
        Subsystem: Red Hat, Inc Device 1100
        Flags: fast devsel
lspci: Unable to load libkmod resources: error -12

00:01.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92) (prog-if 01 [Subtractive decode])
        Flags: 66MHz, fast devsel
        Bus: primary=00, secondary=01, subordinate=02, sec-latency=0
        Memory behind bridge: 10000000-100fffff
        Capabilities: [50] Subsystem: Device 0000:0000

00:02.0 SCSI storage controller: Red Hat, Inc Virtio SCSI
        Subsystem: Red Hat, Inc Device 0008
        Flags: bus master, fast devsel, latency 0, IRQ 39
        I/O ports at 1000 [size=64]
        Memory at 10140000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [40] MSI-X: Enable+ Count=4 Masked-
        Kernel driver in use: virtio-pci

00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
        Subsystem: Red Hat, Inc Device 0001
        Flags: bus master, fast devsel, latency 0, IRQ 40
        I/O ports at 1040 [size=32]
        Memory at 10141000 (32-bit, non-prefetchable) [size=4K]
        [virtual] Expansion ROM at 10100000 [disabled] [size=256K]
        Capabilities: [40] MSI-X: Enable+ Count=3 Masked-
        Kernel driver in use: virtio-pci

00:04.0 Ethernet controller: Cavium, Inc. Device 0011
        Subsystem: Cavium, Inc. Device a11e
        Flags: bus master, fast devsel, latency 0
        Memory at 8000000000 (64-bit, non-prefetchable) [size=2M]
        Memory at 8000200000 (64-bit, non-prefetchable) [size=2M]
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable+ Count=20 Masked-
        Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
        Kernel driver in use: thunder-nicvf

01:01.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge (prog-if 00 [Normal decode])
        Flags: 66MHz, fast devsel
        Memory at 10000000 (64-bit, non-prefetchable) [disabled] [size=256]
        Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
        Capabilities: [4c] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [48] Slot ID: 0 slots, First+, chassis 02
        Capabilities: [40] Hot-plug capable
--- cut ---

00:04.0 is VFIO passthrough, and I put everything on bus #0 by hand, for MSI-X to work. I could also leave devices on bus #2, just in that case I don't get MSI-X, and this, for example, makes vhost-net unable to use irqfds (or it cannot initialize at all; again, I don't remember exactly), because this requires per-event irqs.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

Hello!
00:04:0 is VFIO passthrough, and i put everything to bus#0 by hands, for MSI-X to work. I could also leave devices at bus#2, just in this case i don't get MSI-X, and this, for example, makes vhost-net unable to use irqfds (or it cannot initialize at all, again, i already don't remember), because this requires per-event irqs.
I just tested a Q35 machine on x86-64, and retested the same on ARM64. Sorry, but please ignore what I said before about MSI-X. Everything works fine, MSI-X is OK on bus #2 behind the bridge, and this was either some old bug in an old qemu, or I was simply doing something crappy and wrong. So I don't care anymore about this bridge complex on the PCIe bus; it does not make my life any worse. If libvirt strictly distinguishes between PCI and PCIe, let it just be there.

And, to complete the testing, I've just migrated the Q35 machine and the bridge also breaks down. 'lspci -v' starts reporting "Unknown header type: 7f" for the bridge and for all devices behind it (and all those devices fail). But that just means it should be fixed in qemu.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On Sun, Dec 06, 2015 at 09:46:56PM -0500, Cole Robinson wrote:
Hi all,
[...]
Here are some possible solutions:
* Drop the current behavior of adding a PCIe controller unconditionally, and instead require apps to specify it in the XML. Then, if libvirt sees a PCIe controller in the XML, default the virtio address type to pci. Apps will know if the OS they are installing supports virtio-pci (eventually via libosinfo), so this is the way we can implicitly ask libvirt 'allocate us pci addresses'
For now this is probably the better solution, considering the fact that there is minimal support from OSes. But eventually it would be better to switch to PCIe as the default. From the libvirt POV we should allow choosing between mmio and PCIe to override the default. This would cover the current situation and also the future state, where PCIe will become the preferred one.
Upsides: - Solves both the stated problems. - Simplest addition for applications IMO
Downsides: - Requires a libvirt behavior change, no longer adding the PCIe controller by default. But in practice I don't think it will really affect anyone, since there isn't really any OS support for virtio-pci yet, and no apps support it either AFAIK. - The PCIe controller is not strictly about virtio-pci, it's for enabling plain emulated PCI devices as well. So there is a use case for using the PCIe controller for a graphics card even while your OS doesn't yet support virtio-pci. In the big picture though this is a small time window with current OS, and users can work around it by manually requesting <address type='virtio-mmio'/>, so medium/long term this isn't a big deal IMO - The PCIe controller XML is: <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='dmi-to-pci-bridge'/> <controller type='pci' index='2' model='pci-bridge'/> I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps. Laine?
No, hardcoded stuff tends to be a short-term solution and makes it more painful in the future to get rid of it without breaking anything. The other thing is, aarch64 doesn't have the i82801b11-bridge controller, which is used for dmi-to-pci-bridge. I'm not even sure whether pci-bridge is required for aarch64 to use pci devices.
* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.
This would be a nice feature: specify the address type and let libvirt fill in the remaining address information and add the required controllers, or report that required controllers are missing.

Pavel

[...]

On 12/07/2015 06:38 AM, Pavel Hrdina wrote:
On Sun, Dec 06, 2015 at 09:46:56PM -0500, Cole Robinson wrote:
Hi all,
[...]
Here are some possible solutions:
* Drop the current behavior of adding a PCIe controller unconditionally, and instead require apps to specify it in the XML. Then, if libvirt sees a PCIe controller in the XML, default the virtio address type to pci. Apps will know if the OS they are installing supports virtio-pci (eventually via libosinfo), so this is the way we can implicitly ask libvirt 'allocate us pci addresses'
For now this is probably the better solution, considering the fact that there is minimal support from OSes. But eventually it would be better to switch to PCIe as the default. From the libvirt POV we should allow choosing between mmio and PCIe to override the default. This would cover the current situation and also the future state, where PCIe will become the preferred one.
Upsides: - Solves both the stated problems. - Simplest addition for applications IMO
Downsides: - Requires a libvirt behavior change, no longer adding the PCIe controller by default. But in practice I don't think it will really affect anyone, since there isn't really any OS support for virtio-pci yet, and no apps support it either AFAIK. - The PCIe controller is not strictly about virtio-pci, it's for enabling plain emulated PCI devices as well. So there is a use case for using the PCIe controller for a graphics card even while your OS doesn't yet support virtio-pci. In the big picture though this is a small time window with current OS, and users can work around it by manually requesting <address type='virtio-mmio'/>, so medium/long term this isn't a big deal IMO - The PCIe controller XML is: <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='dmi-to-pci-bridge'/> <controller type='pci' index='2' model='pci-bridge'/> I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps. Laine?
No, hardcoded stuff tends to be a short-term solution and makes it more painful in the future to get rid of it without breaking anything. The other thing is, aarch64 doesn't have the i82801b11-bridge controller, which is used for dmi-to-pci-bridge. I'm not even sure whether pci-bridge is required for aarch64 to use pci devices.
Yeah, maybe we can add a convenience option like <controller type='pcie'/> that libvirt converts to the correct topology for qemu. Dunno. Step 1 is figuring out if that XML even makes sense for qemu aarch64.
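Purely as a strawman (this shorthand does not exist in libvirt today), the idea would be that an app writes only:

  <controller type='pcie'/>

and libvirt expands it to whatever topology is right for the machinetype - possibly just the following on the aarch64 virt machine:

  <controller type='pci' index='0' model='pcie-root'/>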
* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.
This would be a nice feature: specify the address type and let libvirt fill in the remaining address information and add the required controllers, or report that required controllers are missing.
Yeah I'll look into it. - Cole

On Sun, Dec 06, 2015 at 09:46:56PM -0500, Cole Robinson wrote:
Hi all,
I'm trying to figure out how apps should request virtio-pci for libvirt + qemu + arm/aarch64. Let me provide some background.
qemu's arm/aarch64 original virtio support is via virtio-mmio, libvirt XML <address type='virtio-mmio'/>. Currently this is what libvirt sets as the address default for all arm/aarch64 virtio devices in the XML. Long term though all arm virt will likely be using virtio-pci: it's faster, enables hotplug, is more x86 like, etc.
Support for virtio-pci is newer and not as widespread. qemu has had the necessary support since 2.4 at least, but the guest side isn't well distributed yet. For example, Fedora 23 and earlier don't work out of the box with virtio-pci. Internal RHELSA (RHEL Server for Aarch64) builds have it recently working AFAIK.
Libvirt has some support for enabling virtio-pci with aarch64, commits added by Pavel Fedin in v1.2.19. (See e8d55172544c1fafe31a9e09346bdebca4f0d6f9). The patches add a PCIe controller automatically to the XML (and qemu commandline) if qemu-system-aarch64 supports it. However virtio-mmio is still used as the default virtio address, given the current lack of OS support.
So we are at the point where libvirt apps want to enable this, but presently there isn't a good solution; the only option is to fully allocate <address type='pci' ...> for each virtio device in the XML. This is suboptimal for 2 reasons:
#1) apps need to duplicate libvirt's non-trivial address type=pci allocation logic
#2) apps have to add an <address> block for every virtio device, which is less friendly than the x86 case where this is rarely required. Any XML device snippets that work for x86 likely won't give the desired result for aarch64, since they will default to virtio-mmio. Think virsh attach-device/attach-disk commands
Yeah this is very undesirable for a default out of the box config - we should always strive to "do the best thing" when no address is given.
Here are some possible solutions:
* Drop the current behavior of adding a PCIe controller unconditionally, and instead require apps to specify it in the XML. Then, if libvirt sees a PCIe controller in the XML, default the virtio address type to pci. Apps will know if the OS they are installing supports virtio-pci (eventually via libosinfo), so this is the way we can implicitly ask libvirt 'allocate us pci addresses'
Yes, clearly we need to record in libosinfo whether an OS can do PCI vs MMIO.
Upsides: - Solves both the stated problems. - Simplest addition for applications IMO
Downsides: - Requires a libvirt behavior change, no longer adding the PCIe controller by default. But in practice I don't think it will really affect anyone, since there isn't really any OS support for virtio-pci yet, and no apps support it either AFAIK. - The PCIe controller is not strictly about virtio-pci, it's for enabling plain emulated PCI devices as well. So there is a use case for using the PCIe controller for a graphics card even while your OS doesn't yet support virtio-pci. In the big picture though this is a small time window with current OS, and users can work around it by manually requesting <address type='virtio-mmio'/>, so medium/long term this isn't a big deal IMO - The PCIe controller XML is: <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='dmi-to-pci-bridge'/> <controller type='pci' index='2' model='pci-bridge'/> I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps. Laine?
* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.
Upsides: - We can stick with the current PCIe controller default and avoid some of the problems mentioned above. - An auto address feature may be useful in other contexts as well.
Downsides: - Seems potentially tricky to implement in libvirt code. There's many places that check type=pci and key off that, seems like it would be easy to miss updating a check and cause regressions. Maybe we could add a new type like auto-pci to make it explicit. There's probably some implementation trick to make this safe, but at first glance it looked a little dicey.
I'm not sure it is actually all that hairy - it might be as simple as updating only qemuAssignDevicePCISlots so that instead of:

    if (def->controllers[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
        continue;

it handles type=pci with no values set too.
- Doesn't really solve problem #2 mentioned up above... maybe we could change the address allocation logic to default to virtio-pci if there's already a virtio-pci device in the XML. But it's more work. - More work for apps, but nothing horrible.
* Change the default address type from virtio-mmio to pci, if qemu supports it. I'm listing this for completeness. In the short term this doesn't make sense as there aren't any OS releases that will work with this default. However it might be worth considering for the future, maybe keying off a particular qemu version or machine type. I suspect 2 years from now no one is going to be using virtio-mmio, so long term it's not an ideal default.
Yeah, when QEMU starts doing versioned machine types for AArch64 we could do this, but then this just kind of flips the problem around - apps now need to manually add <address type="mmio"> for every device if deploying an OS that can't do PCI. Admittedly this is slightly easier, since address rules for mmio are simpler than address rules for PCI.
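In current libvirt syntax that per-device override is the existing <address type='virtio-mmio'/> element, e.g. for a disk (the source path here is just a placeholder):

  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source file='/var/lib/libvirt/images/guest.qcow2'/>
    <target dev='vda' bus='virtio'/>
    <address type='virtio-mmio'/>
  </disk>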
I think the first option is best (keying off the PCIe controller specified by the user), with a longer term plan to change the default from mmio to pci. But I'm not really sold on anything either way. So I'm interested if anyone else has ideas.
I guess I'd tend towards option 1 too - only adding PCI controller if we actually want to use PCI with the guest.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 12/07/2015 07:27 AM, Daniel P. Berrange wrote:
On Sun, Dec 06, 2015 at 09:46:56PM -0500, Cole Robinson wrote:
Hi all,
I'm trying to figure out how apps should request virtio-pci for libvirt + qemu + arm/aarch64. Let me provide some background.
qemu's arm/aarch64 original virtio support is via virtio-mmio, libvirt XML <address type='virtio-mmio'/>. Currently this is what libvirt sets as the address default for all arm/aarch64 virtio devices in the XML. Long term though all arm virt will likely be using virtio-pci: it's faster, enables hotplug, is more x86 like, etc.
Support for virtio-pci is newer and not as widespread. qemu has had the necessary support since 2.4 at least, but the guest side isn't well distributed yet. For example, Fedora 23 and earlier don't work out of the box with virtio-pci. Internal RHELSA (RHEL Server for Aarch64) builds have it recently working AFAIK.
Libvirt has some support for enabling virtio-pci with aarch64, commits added by Pavel Fedin in v1.2.19. (See e8d55172544c1fafe31a9e09346bdebca4f0d6f9). The patches add a PCIe controller automatically to the XML (and qemu commandline) if qemu-system-aarch64 supports it. However virtio-mmio is still used as the default virtio address, given the current lack of OS support.
So we are at the point where libvirt apps want to enable this, but presently there isn't a good solution; the only option is to fully allocate <address type='pci' ...> for each virtio device in the XML. This is suboptimal for 2 reasons:
#1) apps need to duplicate libvirt's non-trivial address type=pci allocation logic
#2) apps have to add an <address> block for every virtio device, which is less friendly than the x86 case where this is rarely required. Any XML device snippets that work for x86 likely won't give the desired result for aarch64, since they will default to virtio-mmio. Think virsh attach-device/attach-disk commands
Yeah this is very undesirable for a default out of the box config - we should always strive to "do the best thing" when no address is given.
Here are some possible solutions:
* Drop the current behavior of adding a PCIe controller unconditionally, and instead require apps to specify it in the XML. Then, if libvirt sees a PCIe controller in the XML, default the virtio address type to pci. Apps will know if the OS they are installing supports virtio-pci (eventually via libosinfo), so this is the way we can implicitly ask libvirt 'allocate us pci addresses'
Yes, clearly we need to record in libosinfo whether an OS can do PCI vs MMIO.
Upsides: - Solves both the stated problems. - Simplest addition for applications IMO
Downsides: - Requires a libvirt behavior change, no longer adding the PCIe controller by default. But in practice I don't think it will really affect anyone, since there isn't really any OS support for virtio-pci yet, and no apps support it either AFAIK. - The PCIe controller is not strictly about virtio-pci, it's for enabling plain emulated PCI devices as well. So there is a use case for using the PCIe controller for a graphics card even while your OS doesn't yet support virtio-pci. In the big picture though this is a small time window with current OS, and users can work around it by manually requesting <address type='virtio-mmio'/>, so medium/long term this isn't a big deal IMO - The PCIe controller XML is: <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='dmi-to-pci-bridge'/> <controller type='pci' index='2' model='pci-bridge'/> I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps. Laine?
* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.
Upsides: - We can stick with the current PCIe controller default and avoid some of the problems mentioned above. - An auto address feature may be useful in other contexts as well.
Downsides: - Seems potentially tricky to implement in libvirt code. There's many places that check type=pci and key off that, seems like it would be easy to miss updating a check and cause regressions. Maybe we could add a new type like auto-pci to make it explicit. There's probably some implementation trick to make this safe, but at first glance it looked a little dicey.
I'm not sure it is actually all that hairy - it might be as simple as updating only qemuAssignDevicePCISlots so that instead of:
if (def->controllers[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) continue;
It handles type=pci with no values set too
I think my fear was that there are other places in domain_conf that check for ADDRESS_TYPE_PCI before we even get to assigning PCI slots. But I'll poke at it.
- Doesn't really solve problem #2 mentioned up above... maybe we could change the address allocation logic to default to virtio-pci if there's already a virtio-pci device in the XML. But it's more work. - More work for apps, but nothing horrible.
* Change the default address type from virtio-mmio to pci, if qemu supports it. I'm listing this for completeness. In the short term this doesn't make sense as there aren't any OS releases that will work with this default. However it might be worth considering for the future, maybe keying off a particular qemu version or machine type. I suspect 2 years from now no one is going to be using virtio-mmio, so long term it's not an ideal default.
Yeah, when QEMU starts doing versioned machine types for AArch64 we could do this, but then this just kind of flips the problem around - apps now need to manually add <address type="mmio"> for every device if deploying an OS that can't do PCI. Admittedly this is slightly easier, since address rules for mmio are simpler than address rules for PCI.
I think the first option is best (keying off the PCIe controller specified by the user), with a longer term plan to change the default from mmio to pci. But I'm not really sold on anything either way. So I'm interested if anyone else has ideas.
I guess I'd tend towards option 1 too - only adding PCI controller if we actually want to use PCI with the guest.
Sounds good, thanks for the comments. - Cole

On Mon, Dec 07, 2015 at 10:37:38AM -0500, Cole Robinson wrote:
On 12/07/2015 07:27 AM, Daniel P. Berrange wrote:
On Sun, Dec 06, 2015 at 09:46:56PM -0500, Cole Robinson wrote:
Hi all,
I'm trying to figure out how apps should request virtio-pci for libvirt + qemu + arm/aarch64. Let me provide some background.
qemu's arm/aarch64 original virtio support is via virtio-mmio, libvirt XML <address type='virtio-mmio'/>. Currently this is what libvirt sets as the address default for all arm/aarch64 virtio devices in the XML. Long term though all arm virt will likely be using virtio-pci: it's faster, enables hotplug, is more x86 like, etc.
Support for virtio-pci is newer and not as widespread. qemu has had the necessary support since 2.4 at least, but the guest side isn't well distributed yet. For example, Fedora 23 and earlier don't work out of the box with virtio-pci. Internal RHELSA (RHEL Server for Aarch64) builds have it recently working AFAIK.
Libvirt has some support for enabling virtio-pci with aarch64, commits added by Pavel Fedin in v1.2.19. (See e8d55172544c1fafe31a9e09346bdebca4f0d6f9). The patches add a PCIe controller automatically to the XML (and qemu commandline) if qemu-system-aarch64 supports it. However virtio-mmio is still used as the default virtio address, given the current lack of OS support.
So we are at the point where libvirt apps want to enable this, but presently there isn't a good solution; the only option is to fully allocate <address type='pci' ...> for each virtio device in the XML. This is suboptimal for 2 reasons:
#1) apps need to duplicate libvirt's non-trivial address type=pci allocation logic
#2) apps have to add an <address> block for every virtio device, which is less friendly than the x86 case where this is rarely required. Any XML device snippets that work for x86 likely won't give the desired result for aarch64, since they will default to virtio-mmio. Think virsh attach-device/attach-disk commands
Yeah this is very undesirable for a default out of the box config - we should always strive to "do the best thing" when no address is given.
Here are some possible solutions:
* Drop the current behavior of adding a PCIe controller unconditionally, and instead require apps to specify it in the XML. Then, if libvirt sees a PCIe controller in the XML, default the virtio address type to pci. Apps will know if the OS they are installing supports virtio-pci (eventually via libosinfo), so this is the way we can implicitly ask libvirt 'allocate us pci addresses'
Yes, clearly we need to record in libosinfo whether an OS can do PCI vs MMIO.
Upsides: - Solves both the stated problems. - Simplest addition for applications IMO
Downsides: - Requires a libvirt behavior change, no longer adding the PCIe controller by default. But in practice I don't think it will really affect anyone, since there isn't really any OS support for virtio-pci yet, and no apps support it either AFAIK. - The PCIe controller is not strictly about virtio-pci, it's for enabling plain emulated PCI devices as well. So there is a use case for using the PCIe controller for a graphics card even while your OS doesn't yet support virtio-pci. In the big picture though this is a small time window with current OS, and users can work around it by manually requesting <address type='virtio-mmio'/>, so medium/long term this isn't a big deal IMO - The PCIe controller XML is: <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='dmi-to-pci-bridge'/> <controller type='pci' index='2' model='pci-bridge'/> I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps. Laine?
* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.
Upsides: - We can stick with the current PCIe controller default and avoid some of the problems mentioned above. - An auto address feature may be useful in other contexts as well.
Downsides: - Seems potentially tricky to implement in libvirt code. There's many places that check type=pci and key off that, seems like it would be easy to miss updating a check and cause regressions. Maybe we could add a new type like auto-pci to make it explicit. There's probably some implementation trick to make this safe, but at first glance it looked a little dicey.
I'm not sure it is actually all that hairy - it might be as simple as updating only qemuAssignDevicePCISlots so that instead of:
if (def->controllers[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) continue;
It handles type=pci with no values set too
I think my fear was that there are other places in domain_conf that check for ADDRESS_TYPE_PCI before we even get to assigning PCI slots. But i'll poke at it.
It's always possible, but I'd rather hope nothing actually is. The only thing that should care about specific addresses is QEMU itself - the specific choice of bus/slot/func shouldn't be relevant to any logic inside libvirt except the command line generator - at most, code should care whether it is PCI or not.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 12/07/2015 10:37 AM, Cole Robinson wrote:
On 12/07/2015 07:27 AM, Daniel P. Berrange wrote:
On Sun, Dec 06, 2015 at 09:46:56PM -0500, Cole Robinson wrote:
Hi all,
I'm trying to figure out how apps should request virtio-pci for libvirt + qemu + arm/aarch64. Let me provide some background.
qemu's arm/aarch64 original virtio support is via virtio-mmio, libvirt XML <address type='virtio-mmio'/>. Currently this is what libvirt sets as the address default for all arm/aarch64 virtio devices in the XML. Long term though all arm virt will likely be using virtio-pci: it's faster, enables hotplug, is more x86 like, etc.
Support for virtio-pci is newer and not as widespread. qemu has had the necessary support since 2.4 at least, but the guest side isn't well distributed yet. For example, Fedora 23 and earlier don't work out of the box with virtio-pci. Internal RHELSA (RHEL Server for Aarch64) builds have it recently working AFAIK.
Libvirt has some support for enabling virtio-pci with aarch64, commits added by Pavel Fedin in v1.2.19. (See e8d55172544c1fafe31a9e09346bdebca4f0d6f9). The patches add a PCIe controller automatically to the XML (and qemu commandline) if qemu-system-aarch64 supports it. However virtio-mmio is still used as the default virtio address, given the current lack of OS support.
So we are at the point where libvirt apps want to enable this, but presently there isn't a good solution; the only option is to fully allocate <address type='pci' ...> for each virtio device in the XML. This is suboptimal for 2 reasons:
#1) apps need to duplicate libvirt's non-trivial address type=pci allocation logic
#2) apps have to add an <address> block for every virtio device, which is less friendly than the x86 case where this is rarely required. Any XML device snippets that work for x86 likely won't give the desired result for aarch64, since they will default to virtio-mmio. Think virsh attach-device/attach-disk commands Yeah this is very undesirable for a default out of the box config - we should always strive to "do the best thing" when no address is given.
Here are some possible solutions:
* Drop the current behavior of adding a PCIe controller unconditionally, and instead require apps to specify it in the XML. Then, if libvirt sees a PCIe controller in the XML, default the virtio address type to pci. Apps will know if the OS they are installing supports virtio-pci (eventually via libosinfo), so this is the way we can implicitly ask libvirt 'allocate us pci addresses' Yes, clearly we need to record in libosinfo whether an OS can do PCI vs MMIO.
Upsides: - Solves both the stated problems. - Simplest addition for applications IMO
Downsides: - Requires a libvirt behavior change, no longer adding the PCIe controller by default. But in practice I don't think it will really affect anyone, since there isn't really any OS support for virtio-pci yet, and no apps support it either AFAIK. - The PCIe controller is not strictly about virtio-pci, it's for enabling plain emulated PCI devices as well. So there is a use case for using the PCIe controller for a graphics card even while your OS doesn't yet support virtio-pci. In the big picture though this is a small time window with current OS, and users can work around it by manually requesting <address type='virtio-mmio'/>, so medium/long term this isn't a big deal IMO - The PCIe controller XML is: <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='dmi-to-pci-bridge'/> <controller type='pci' index='2' model='pci-bridge'/> I have no idea if that's always going to be the expected XML, maybe it's not wise to hardcode that in apps. Laine?
That was only intended for the Q35 machinetype (but somehow all of them got turned on by a patch to add pcie-root to aarch64 virt machinetypes). pcie-root is included in the hardware by qemu on Q35 and can't be removed. dmi-to-pci-bridge translates from the PCIe ports of pcie-root to standard pci ports (but non-hotpluggable), and pci-bridge converts from non-hotpluggable PCI to hotpluggable PCI, which is the kind of slots that management applications expect to be available.

Other machinetypes don't need to do this same thing. (For that matter, in the future this may not be the most desirable way to go for Q35 - in the 2 (or is it 3) years since Q35 support was added, I've learned that pretty much every emulated PCI device in qemu can be plugged into a PCIe port (on the Q35 machinetype at least) with no complaints from qemu, and we now have pcie-root-port and pcie-switch-downstream-port (with matching pcie-switch-upstream-port) that can accept hotplugged devices, so a Q35 machine could now be constructed as:

  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='pcie-root-port'/>
  <controller type='pci' index='2' model='pcie-switch-upstream-port'/>
  <controller type='pci' index='3' model='pcie-switch-downstream-port'/>
  <controller type='pci' index='4' model='pcie-switch-downstream-port'/>
  ...

and the address assignment could be modified to allow auto-selection of PCIe ports for PCI devices. The downstream ports support hotplugging devices, but can't be hotplugged themselves, and they can only be plugged into an upstream port, which can only be plugged into a root-port or a downstream-port.)

At any rate, I don't think the current PCIe bus structure should be hardcoded anywhere. We can change what libvirt does by default any time; existing configs will continue with what we set up in the past (and thus won't suffer "your hardware has changed! Reactivate!!" problems), but newly created ones will use whatever new model we come up with.
* Next idea: Users specify something like <address type='pci'/> and libvirt fills in the address for us.
Upsides: - We can stick with the current PCIe controller default and avoid some of the problems mentioned above. - An auto address feature may be useful in other contexts as well.
Downsides: - Seems potentially tricky to implement in libvirt code. There's many places that check type=pci and key off that, seems like it would be easy to miss updating a check and cause regressions. Maybe we could add a new type like auto-pci to make it explicit. There's probably some implementation trick to make this safe, but at first glance it looked a little dicey.
I'm not sure it is actually all that hairy - it might be as simple as updating only qemuAssignDevicePCISlots so that instead of:
if (def->controllers[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) continue;
It handles type=pci with no values set too
I think my fear was that there are other places in domain_conf that check for ADDRESS_TYPE_PCI before we even get to assigning PCI slots. But i'll poke at it.
This is very possible. ADDRESS_TYPE_PCI means that the PCI address is "valid". 0000:00:00.0 is a valid PCI address (although it happens to be reserved on any x86 architecture). For the bit early on after the parse we may need a separate "valid address" flag beyond the type, to prevent confusion. I do like the idea of being able to say <address type='pci'/> to select the bus without specifying an address, though.
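Concretely, the difference for an application would be writing just the first form below and letting libvirt compute the second (the bare form is the proposal, not something libvirt accepts today; the numbers are only illustrative):

  <address type='pci'/>

  <address type='pci' domain='0x0000' bus='0x02' slot='0x05' function='0x0'/>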
- Doesn't really solve problem #2 mentioned up above... maybe we could change the address allocation logic to default to virtio-pci if there's already a virtio-pci device in the XML. But it's more work. - More work for apps, but nothing horrible. * Change the default address type from virtio-mmio to pci, if qemu supports it. I'm listing this for completeness. In the short term this doesn't make sense as there isn't any OS releases that will work with this default. However it might be worth considering for the future, maybe keying off a particular qemu version or machine type. I suspect 2 years from now no one is going to be using virtio-mmio so long term it's not an ideal default. Yeah, when QEMU starts doing versioned machine types for AArch64 we could do this, but then this just kind of flips the problem around - apps now need to manually add <address type="mmio"> for every device if deploying an OS that can't do PCI. Admittedly this is slightly easier, since address rules for mmio are simpler than address rules for PCI.
I think the first option is best (keying off the PCIe controller specified by the user), with a longer term plan to change the default from mmio to pci. But I'm not really sold on anything either way. So I'm interested if anyone else has ideas. I guess I'd tend towards option 1 too - only adding PCI controller if we actually want to use PCI with the guest.
I *kind of* agree with this, but not completely. I think that if we know for sure, based on introspection of the virtual machine (or based on verified knowledge that we hardcode into libvirt), that there is a PCI controller of some type implemented in the machine with no way to remove it, we should put that information in the XML. Any controllers that are optional, and don't exist in the virtual machine if they're not added to the commandline, can be auto-added only if needed.

So for example, if a Q35 domain was created that had just 2 emulated PCI devices, we could auto-add just enough ports to accommodate them. (For that matter, adding an emulated PCI device would cause the auto-add of a pcie-switch-downstream-port, which might cause the auto-add of a pcie-switch-upstream-port, which might cause the auto-add of a pcie-root-port; the 2nd emulated PCI device would cause an auto-add of another pcie-switch-downstream-port, which would find a spot in the existing pcie-switch-upstream-port.)

The one problem with doing this would be that there would be no free ports for hotplugging. I'm not sure what the best way to deal with that is; I suppose any attempt at hotplugging a new device would lead to an error, after which you would shut down the virtual machine, manually add in a port, then start it up again; or maybe when there are *any* PCI devices, we could always make sure there were 'several' extra ports for hotplugging? Neither sounds ideal...
participants (5):
- Cole Robinson
- Daniel P. Berrange
- Laine Stump
- Pavel Fedin
- Pavel Hrdina