Changes from previous version [1], all of them result of
feedback from Alex Williamson and Abdulla Bubshait:
- use <address type='none'/> instead of creating a new subsys
attribute;
- expand the change to all PCI hostdevs. Former patch 01 was
discarded because we don't need the PCI Multifunction helpers
for now;
- series changed name to reflect what it's being done
- new patch 04: add documentation to formatdomain.html.in
To avoid a huge wall of text please refer to [1] for context
about the road up to here. Commit msgs of the first 3 patches
tells the story as well.
[1]
https://www.redhat.com/archives/libvir-list/2019-October/msg00298.html
What I want to discuss here instead is a caveat that I've found
while testing this work, since its first version. This test was
done in a Power 8 system with a Broadcom BCM5719 PCIe Multifunction
card, with 4 virtual functions. This series enables Libvirt to
declare PCI hostdevs that will not be visible to the guest using
address type='none'. During the tests I faced a scenario that I
expected to fail, but it didn't. This is the relevant XML except:
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='vfio'/>
<source>
<address domain='0x0001' bus='0x09' slot='0x00'
function='0x0'/>
</source>
<address type='none'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='vfio'/>
<source>
<address domain='0x0001' bus='0x09' slot='0x00'
function='0x1'/>
</source>
<address type='none'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='vfio'/>
<source>
<address domain='0x0001' bus='0x09' slot='0x00'
function='0x2'/>
</source>
<address type='pci' domain='0x0000' bus='0x01'
slot='0x01' function='0x2'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='vfio'/>
<source>
<address domain='0x0001' bus='0x09' slot='0x00'
function='0x3'/>
</source>
<address type='none'/>
</hostdev>
I'm declaring all the BCM5719 functions in the XML, but I am making
functions 0, 1 and 3 unassignable by the guest using address type='none'.
This test was meant to fail, but it didn't. To my surprise the guest
booted and the device is functional:
$ lspci
0000:00:01.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon
0000:00:03.0 USB controller: Red Hat, Inc. QEMU XHCI Host Controller (rev 01)
0000:00:04.0 SCSI storage controller: Red Hat, Inc Virtio block device
0001:00:01.2 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe
(rev 01)
$
I've talked with Michael Roth (QEMU PPC64 developer that worked with
the PCI multifunction hotplug/unplug support in the PPC64 machine)
about this. He mentioned that this is intended. I'll quote here what he
had to say about it:
"The current logic is that we only emit the hotplug event when function
0 is attached, but if some other function is attached at boot-time the
guest will still see it on the bus, and whether that works or not I
think is up to the device/driver"
This explains why this test didn't fail as I expected. At least for
the PPC64 machine, depending on the device support, this setup is
allowed. PPC64 machine uses function 0 hotplug as a signal of
'plug all the queue functions and function 0', but function 0
isn't required at boot time. I would like to hear other opinions
in this because I can't say whether this is allowed in x86.
I am mentioning all this now because this had a direct impact on the
design of this work since the previous version, and I failed
to bring it up back then. I am *not* checking for the assignment of
function 0 at guest boot time in Libvirt, leaving the user free to
decide what to do. I am aware that this will be inconsistent to the
logic of the PCI multifunction hotplug/unplug support, where
function 0 is required. This also puts a lot of faith in the user,
relying that the user is fully aware of the capabilities of the
hardware.
My question is: should Libvirt force function 0 to be present in
boot time as well, regardless of whether the PPC64 guest or some
cards are able to boot without it?
Thanks,
DHB
Daniel Henrique Barboza (4):
domain_conf: allow address type='none' to unassign PCI hostdevs
qemu: handle unassigned PCI hostdevs in command line and alias
virhostdev.c: check all IOMMU devs in virHostdevPreparePCIDevices
formatdomain.html.in: document <address type='none'/>
docs/formatdomain.html.in | 15 +++++
docs/schemas/domaincommon.rng | 5 ++
src/conf/domain_conf.c | 56 ++++++++++++++--
src/conf/domain_conf.h | 3 +
src/qemu/qemu_alias.c | 6 ++
src/qemu/qemu_command.c | 4 ++
src/qemu/qemu_domain_address.c | 6 ++
src/util/virhostdev.c | 64 +++++++++++++++++--
.../hostdev-pci-address-none.args | 31 +++++++++
.../hostdev-pci-address-none.xml | 42 ++++++++++++
...ostdev-pci-multifunction-partial-fail.args | 31 +++++++++
...hostdev-pci-multifunction-partial-fail.xml | 35 ++++++++++
tests/qemuxml2argvtest.c | 8 +++
.../hostdev-pci-address-none.xml | 58 +++++++++++++++++
tests/qemuxml2xmltest.c | 1 +
15 files changed, 352 insertions(+), 13 deletions(-)
create mode 100644 tests/qemuxml2argvdata/hostdev-pci-address-none.args
create mode 100644 tests/qemuxml2argvdata/hostdev-pci-address-none.xml
create mode 100644 tests/qemuxml2argvdata/hostdev-pci-multifunction-partial-fail.args
create mode 100644 tests/qemuxml2argvdata/hostdev-pci-multifunction-partial-fail.xml
create mode 100644 tests/qemuxml2xmloutdata/hostdev-pci-address-none.xml
--
2.21.0