[libvirt] [PATCH v5 0/4] PCI hostdev partial assignment support

changes from v4 [1]: - previous patch 3 was removed. The validation it was implementating proved to be too restrict, while providing no tangible benefits for the trouble of having existing domains failing to launch. This makes this series all about the new address type implementation and its benefits. More info about the rationale can be found at [1]. - documentation was changed to reflect this new tone [1] https://www.redhat.com/archives/libvir-list/2019-December/msg01016.html Daniel Henrique Barboza (4): Introducing new address type='unassigned' for PCI hostdevs qemu: handle unassigned PCI hostdevs in command line formatdomain.html.in: document <address type='unassigned'/> news.xml: add address type='unassigned' entry docs/formatdomain.html.in | 13 +++++ docs/news.xml | 14 +++++ docs/schemas/domaincommon.rng | 5 ++ src/conf/device_conf.c | 2 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 7 ++- src/qemu/qemu_command.c | 5 ++ src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain_address.c | 5 ++ .../hostdev-pci-address-unassigned.args | 31 ++++++++++ .../hostdev-pci-address-unassigned.xml | 42 ++++++++++++++ tests/qemuxml2argvtest.c | 4 ++ .../hostdev-pci-address-unassigned.xml | 58 +++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 14 files changed, 188 insertions(+), 1 deletion(-) create mode 100644 tests/qemuxml2argvdata/hostdev-pci-address-unassigned.args create mode 100644 tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml create mode 100644 tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml -- 2.23.0

Today, to use a PCI hostdev "A" in a domain, all PCI devices that belongs to the same IOMMU group must also be declared in the domain XML, meaning that all IOMMU devices are detached from the host and all of them are visible to the guest. The result is that the guest will have access to all devices, but this is not necessarily what the administrator wanted. If only the hostdev "A" was intended for guest usage, but hostdevs "B" and "C" happened to be in the same IOMMU group of "A", the guest will gain access to all 3 devices. This makes the administrator rely on alternative solutions, such as use all hostdevs with un-managed mode and detached all the IOMMU before the guest starts. If use un-managed mode is not an option, the only alternative left is an ACS patch to deny guest access to "B" and "C". This patch introduces a new address type called "unassigned" to handle this situation where a hostdev will be owned by a domain, but not visible to the guest OS. This allows the administrator to declare all the IOMMU while also choosing which hostdevs will be usable by the guest. This new mechanic applies to all PCI hostdevs, regardless of whether they are a PCI multifunction hostdev or not. Using <address type='unassigned'/> in any case other than a PCI hostdev will result in error. Next patch will use this new address type in the QEMU driver to avoid adding unassigned devices to the QEMU launch command line. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- docs/schemas/domaincommon.rng | 5 ++ src/conf/device_conf.c | 2 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 7 ++- src/qemu/qemu_command.c | 1 + src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain_address.c | 5 ++ .../hostdev-pci-address-unassigned.xml | 42 ++++++++++++++ .../hostdev-pci-address-unassigned.xml | 58 +++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 10 files changed, 122 insertions(+), 1 deletion(-) create mode 100644 tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml create mode 100644 tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index e964773f5e..5f1d4a34a4 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -5502,6 +5502,11 @@ </attribute> <ref name="dimmaddress"/> </group> + <group> + <attribute name="type"> + <value>unassigned</value> + </attribute> + </group> </choice> </element> </define> diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c index 4c57f0995f..4dbd5c1ac9 100644 --- a/src/conf/device_conf.c +++ b/src/conf/device_conf.c @@ -45,6 +45,7 @@ VIR_ENUM_IMPL(virDomainDeviceAddress, "virtio-mmio", "isa", "dimm", + "unassigned", ); static int @@ -120,6 +121,7 @@ virDomainDeviceInfoAddressIsEqual(const virDomainDeviceInfo *a, /* address types below don't have any specific data */ case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: break; case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI: diff --git a/src/conf/device_conf.h b/src/conf/device_conf.h index d98fae775c..e091d7cfe2 100644 --- a/src/conf/device_conf.h +++ b/src/conf/device_conf.h @@ -45,6 +45,7 @@ typedef enum { VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ISA, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM, + VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST } virDomainDeviceAddressType; diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e46f423b19..afa072e17d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -6352,9 +6352,11 @@ virDomainHostdevDefValidate(const virDomainHostdevDef *hostdev) switch ((virDomainHostdevSubsysType) hostdev->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: if (hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED && hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("PCI host devices must use 'pci' address type")); + _("PCI host devices must use 'pci' or " + "'unassigned' address type")); return -1; } break; @@ -7371,6 +7373,7 @@ virDomainDeviceInfoFormat(virBufferPtr buf, case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; } @@ -7571,6 +7574,7 @@ virDomainDeviceAddressParseXML(xmlNodePtr address, break; case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; } @@ -21951,6 +21955,7 @@ virDomainDeviceInfoCheckABIStability(virDomainDeviceInfoPtr src, case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 836057a4ff..864967f375 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -459,6 +459,7 @@ qemuBuildVirtioDevStr(virBufferPtr buf, return -1; case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: default: virReportEnumRangeError(virDomainDeviceAddressType, info->type); diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 20a9629760..cce99f831f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8242,6 +8242,7 @@ qemuDomainDeviceDefValidateAddress(const virDomainDeviceDef *dev, case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ISA: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: /* No validation for these address types yet */ break; diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 605984f80f..b077240899 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -2313,6 +2313,11 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, continue; } + /* do not reserve address for info->type='unassigned' */ + if (def->hostdevs[i]->info->type == + VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED) + continue; + if (qemuDomainPCIAddressReserveNextAddr(addrs, def->hostdevs[i]->info) < 0) goto error; diff --git a/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml b/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml new file mode 100644 index 0000000000..9a2685ca0e --- /dev/null +++ b/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml @@ -0,0 +1,42 @@ +<domain type='kvm'> + <name>delete</name> + <uuid>583a8e8e-f0ce-4f53-89ab-092862148b25</uuid> + <memory unit='KiB'>262144</memory> + <vcpu placement='static'>4</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + </os> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x0'/> + </source> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x1'/> + </source> + <address type='unassigned'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x2'/> + </source> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x3'/> + </source> + </hostdev> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml b/tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml new file mode 100644 index 0000000000..2341e8432b --- /dev/null +++ b/tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml @@ -0,0 +1,58 @@ +<domain type='kvm'> + <name>delete</name> + <uuid>583a8e8e-f0ce-4f53-89ab-092862148b25</uuid> + <memory unit='KiB'>262144</memory> + <currentMemory unit='KiB'>262144</currentMemory> + <vcpu placement='static'>4</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x0'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x1'/> + </source> + <address type='unassigned'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x2'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x3'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> + </hostdev> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 55bbd924fb..d0a91df2ad 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -491,6 +491,7 @@ mymain(void) DO_TEST("hostdev-usb-address", NONE); DO_TEST("hostdev-pci-address", NONE); + DO_TEST("hostdev-pci-address-unassigned", QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("hostdev-pci-multifunction", QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("hostdev-vfio", QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("hostdev-vfio-zpci", -- 2.23.0

On Tue, 17 Dec 2019 16:06:44 -0300 Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
Today, to use a PCI hostdev "A" in a domain, all PCI devices that belongs to the same IOMMU group must also be declared in the domain XML, meaning that all IOMMU devices are detached from the host and all of them are visible to the guest.
This is not accurate. All endpoint devices in the IOMMU group need to be "viable" (ie. bound to vfio-pci, pci-stub, unbound) for any device within the group to be used through vfio. There is absolutely no requirement that they all be declared in the XML or assigned to the guest. Also please clarify "IOMMU devices". Is that all devices within the IOMMU group?
The result is that the guest will have access to all devices, but this is not necessarily what the administrator wanted. If only the hostdev "A" was intended for guest usage, but hostdevs "B" and "C" happened to be in the same IOMMU group of "A", the guest will gain access to all 3 devices. This makes the administrator rely on alternative solutions, such as use all hostdevs with un-managed mode and detached all the IOMMU before the guest starts. If use un-managed mode is not an option, the only alternative left is an ACS patch to deny guest access to "B" and "C".
Also not accurate. The libvirt hooks interface can be used to manage the additional devices ad-hoc, the additional devices might not need management if they're not endpoints, they might be statically bound to vfio-pci at boot time, etc. I don't think the scenario is nearly as dire as presented here. "detached all the IOMMU" doesn't make sense.
This patch introduces a new address type called "unassigned" to handle this situation where a hostdev will be owned by a domain, but not visible to the guest OS. This allows the administrator to declare all the IOMMU while also choosing which hostdevs will be
"declare all the IOMMU" doesn't make sense.
usable by the guest. This new mechanic applies to all PCI hostdevs, regardless of whether they are a PCI multifunction hostdev or not. Using <address type='unassigned'/> in any case other than a PCI hostdev will result in error.
Regardless of my comments above, I think this support is worthwhile, but let's not pretend there aren't solutions, this just makes it easier to accomplish in a supported and dynamic way. Thanks, Alex
Next patch will use this new address type in the QEMU driver to avoid adding unassigned devices to the QEMU launch command line.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- docs/schemas/domaincommon.rng | 5 ++ src/conf/device_conf.c | 2 + src/conf/device_conf.h | 1 + src/conf/domain_conf.c | 7 ++- src/qemu/qemu_command.c | 1 + src/qemu/qemu_domain.c | 1 + src/qemu/qemu_domain_address.c | 5 ++ .../hostdev-pci-address-unassigned.xml | 42 ++++++++++++++ .../hostdev-pci-address-unassigned.xml | 58 +++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 10 files changed, 122 insertions(+), 1 deletion(-) create mode 100644 tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml create mode 100644 tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index e964773f5e..5f1d4a34a4 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -5502,6 +5502,11 @@ </attribute> <ref name="dimmaddress"/> </group> + <group> + <attribute name="type"> + <value>unassigned</value> + </attribute> + </group> </choice> </element> </define> diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c index 4c57f0995f..4dbd5c1ac9 100644 --- a/src/conf/device_conf.c +++ b/src/conf/device_conf.c @@ -45,6 +45,7 @@ VIR_ENUM_IMPL(virDomainDeviceAddress, "virtio-mmio", "isa", "dimm", + "unassigned", );
static int @@ -120,6 +121,7 @@ virDomainDeviceInfoAddressIsEqual(const virDomainDeviceInfo *a, /* address types below don't have any specific data */ case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: break;
case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI: diff --git a/src/conf/device_conf.h b/src/conf/device_conf.h index d98fae775c..e091d7cfe2 100644 --- a/src/conf/device_conf.h +++ b/src/conf/device_conf.h @@ -45,6 +45,7 @@ typedef enum { VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ISA, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM, + VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED,
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST } virDomainDeviceAddressType; diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e46f423b19..afa072e17d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -6352,9 +6352,11 @@ virDomainHostdevDefValidate(const virDomainHostdevDef *hostdev) switch ((virDomainHostdevSubsysType) hostdev->source.subsys.type) { case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: if (hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED && hostdev->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("PCI host devices must use 'pci' address type")); + _("PCI host devices must use 'pci' or " + "'unassigned' address type")); return -1; } break; @@ -7371,6 +7373,7 @@ virDomainDeviceInfoFormat(virBufferPtr buf,
case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; } @@ -7571,6 +7574,7 @@ virDomainDeviceAddressParseXML(xmlNodePtr address, break;
case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; } @@ -21951,6 +21955,7 @@ virDomainDeviceInfoCheckABIStability(virDomainDeviceInfoPtr src, case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; } diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 836057a4ff..864967f375 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -459,6 +459,7 @@ qemuBuildVirtioDevStr(virBufferPtr buf, return -1;
case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: default: virReportEnumRangeError(virDomainDeviceAddressType, info->type); diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 20a9629760..cce99f831f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8242,6 +8242,7 @@ qemuDomainDeviceDefValidateAddress(const virDomainDeviceDef *dev, case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ISA: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED: /* No validation for these address types yet */ break;
diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 605984f80f..b077240899 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -2313,6 +2313,11 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, continue; }
+ /* do not reserve address for info->type='unassigned' */ + if (def->hostdevs[i]->info->type == + VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED) + continue; + if (qemuDomainPCIAddressReserveNextAddr(addrs, def->hostdevs[i]->info) < 0) goto error; diff --git a/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml b/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml new file mode 100644 index 0000000000..9a2685ca0e --- /dev/null +++ b/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.xml @@ -0,0 +1,42 @@ +<domain type='kvm'> + <name>delete</name> + <uuid>583a8e8e-f0ce-4f53-89ab-092862148b25</uuid> + <memory unit='KiB'>262144</memory> + <vcpu placement='static'>4</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + </os> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x0'/> + </source> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x1'/> + </source> + <address type='unassigned'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x2'/> + </source> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x3'/> + </source> + </hostdev> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml b/tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml new file mode 100644 index 0000000000..2341e8432b --- /dev/null +++ b/tests/qemuxml2xmloutdata/hostdev-pci-address-unassigned.xml @@ -0,0 +1,58 @@ +<domain type='kvm'> + <name>delete</name> + <uuid>583a8e8e-f0ce-4f53-89ab-092862148b25</uuid> + <memory unit='KiB'>262144</memory> + <currentMemory unit='KiB'>262144</currentMemory> + <vcpu placement='static'>4</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x0'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x1'/> + </source> + <address type='unassigned'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x2'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0005' bus='0x90' slot='0x01' function='0x3'/> + </source> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> + </hostdev> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> + </memballoon> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 55bbd924fb..d0a91df2ad 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -491,6 +491,7 @@ mymain(void)
DO_TEST("hostdev-usb-address", NONE); DO_TEST("hostdev-pci-address", NONE); + DO_TEST("hostdev-pci-address-unassigned", QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("hostdev-pci-multifunction", QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("hostdev-vfio", QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST("hostdev-vfio-zpci",

Previous patch made it possible for the QEMU driver to check if a given PCI hostdev is unassigned, by checking if dev->info->type is VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED, meaning that this device shouldn't be part of the actual guest launch. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/qemu/qemu_command.c | 4 +++ .../hostdev-pci-address-unassigned.args | 31 +++++++++++++++++++ tests/qemuxml2argvtest.c | 4 +++ 3 files changed, 39 insertions(+) create mode 100644 tests/qemuxml2argvdata/hostdev-pci-address-unassigned.args diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 864967f375..ed8b878566 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5336,6 +5336,10 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd, *bootHostdevNet = 0; } + /* Ignore unassigned devices */ + if (hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED) + continue; + if (qemuCommandAddExtDevice(cmd, hostdev->info) < 0) return -1; diff --git a/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.args b/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.args new file mode 100644 index 0000000000..42fae17444 --- /dev/null +++ b/tests/qemuxml2argvdata/hostdev-pci-address-unassigned.args @@ -0,0 +1,31 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/tmp/lib/domain--1-delete \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/tmp/lib/domain--1-delete/.local/share \ +XDG_CACHE_HOME=/tmp/lib/domain--1-delete/.cache \ +XDG_CONFIG_HOME=/tmp/lib/domain--1-delete/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-x86_64 \ +-name delete \ +-S \ +-machine pc,accel=kvm,usb=off,dump-guest-core=off \ +-m 256 \ +-realtime mlock=off \ +-smp 4,sockets=4,cores=1,threads=1 \ +-uuid 583a8e8e-f0ce-4f53-89ab-092862148b25 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-delete/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-usb \ +-device vfio-pci,host=0005:90:01.0,id=hostdev0,bus=pci.0,addr=0x3 \ +-device vfio-pci,host=0005:90:01.2,id=hostdev2,bus=pci.0,addr=0x4 \ +-device vfio-pci,host=0005:90:01.3,id=hostdev3,bus=pci.0,addr=0x5 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 118a460633..bfbed5c31d 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1332,6 +1332,10 @@ mymain(void) QEMU_CAPS_KVM, QEMU_CAPS_DEVICE_VFIO_PCI); + DO_TEST("hostdev-pci-address-unassigned", + QEMU_CAPS_KVM, + QEMU_CAPS_DEVICE_VFIO_PCI); + DO_TEST("serial-file-log", QEMU_CAPS_CHARDEV_FILE_APPEND, QEMU_CAPS_DEVICE_ISA_SERIAL, -- 2.23.0

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- docs/formatdomain.html.in | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index e06cf2061b..7a5ebdd67e 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -4203,6 +4203,19 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>unassigned</code></dt> + <dd>For PCI hostdevs, <code><address type='unassigned'/></code> + allows the admin to include a PCI hostdev in the domain XML definition, + without making it available for the guest. This allows for configurations + in which Libvirt manages the device as a regular PCI hostdev, + regardless of whether the guest will have access to it. This is + an alternative to scenarios in which the admin might be compelled to use + an ACS patch to remove the device from the guest while Libvirt + retains control of the PCI device. + <code><address type='unassigned'/></code> is an invalid address + type for all other device types. + <span class="since">Since 6.0.0</span> + </dd> </dl> <h4><a id="elementsVirtio">Virtio-related options</a></h4> -- 2.23.0

On Tue, 17 Dec 2019 16:06:46 -0300 Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- docs/formatdomain.html.in | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index e06cf2061b..7a5ebdd67e 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -4203,6 +4203,19 @@ attributes: <code>iobase</code> and <code>irq</code>. <span class="since">Since 1.2.1</span> </dd> + <dt><code>unassigned</code></dt> + <dd>For PCI hostdevs, <code><address type='unassigned'/></code> + allows the admin to include a PCI hostdev in the domain XML definition, + without making it available for the guest. This allows for configurations + in which Libvirt manages the device as a regular PCI hostdev, + regardless of whether the guest will have access to it. This is + an alternative to scenarios in which the admin might be compelled to use + an ACS patch to remove the device from the guest while Libvirt + retains control of the PCI device.
The ACS patch is really orthogonal to the goal here, so I don't think it should be included in the discussion. A user can just as easily pre-bind other devices to vfio-pci to make the configuration viable rather than patch their kernel to change the viability constraints, which this series doesn't accomplish either. Thanks, Alex
+ <code><address type='unassigned'/></code> is an invalid address + type for all other device types. + <span class="since">Since 6.0.0</span> + </dd> </dl>
<h4><a id="elementsVirtio">Virtio-related options</a></h4>

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- docs/news.xml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/docs/news.xml b/docs/news.xml index 2a25b6ca49..febda970f6 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -54,6 +54,20 @@ written in the RST as an alternative to HTML. </description> </change> + <change> + <summary> + new PCI hostdev address type: unassigned + </summary> + <description> + A new PCI hostdev address type 'unassigned' is introduced, + giving users the option to choose which PCI hostdevs + within the same IOMMU group will not be assigned to the + guest. PCI hostdevs that shouldn't be used by the guest + can be classified as <code>address type='unassigned'</code>. + Libvirt will still be able to manage the device as a + regular PCI hostdev. + </description> + </change> </section> <section title="Improvements"> <change> -- 2.23.0

On Tue, 17 Dec 2019 16:06:47 -0300 Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- docs/news.xml | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/docs/news.xml b/docs/news.xml index 2a25b6ca49..febda970f6 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -54,6 +54,20 @@ written in the RST as an alternative to HTML. </description> </change> + <change> + <summary> + new PCI hostdev address type: unassigned + </summary> + <description> + A new PCI hostdev address type 'unassigned' is introduced, + giving users the option to choose which PCI hostdevs + within the same IOMMU group will not be assigned to the + guest. PCI hostdevs that shouldn't be used by the guest + can be classified as <code>address type='unassigned'</code>. + Libvirt will still be able to manage the device as a + regular PCI hostdev. + </description>
This is rather convoluted. Users have always had the choice of which devices NOT to assign. I would present this as a mechanism for libvirt to manage the driver binding of devices, as done for managed PCI hostdev devices, without actually attaching the devices to the VM, thereby facilitating device assignment when only a subset of the devices within an IOMMU group are intended to be used by the guest. BTW, this should also work with the hostdev <driver='kvm'/> option if the user wanted to bind the device to pci-stub rather than vfio-pci to provide even further isolation from the user being able to access the device. Thanks, Alex
+ </change> </section> <section title="Improvements"> <change>

On 12/17/19 4:44 PM, Alex Williamson wrote:
On Tue, 17 Dec 2019 16:06:47 -0300 Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- docs/news.xml | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/docs/news.xml b/docs/news.xml index 2a25b6ca49..febda970f6 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -54,6 +54,20 @@ written in the RST as an alternative to HTML. </description> </change> + <change> + <summary> + new PCI hostdev address type: unassigned + </summary> + <description> + A new PCI hostdev address type 'unassigned' is introduced, + giving users the option to choose which PCI hostdevs + within the same IOMMU group will not be assigned to the + guest. PCI hostdevs that shouldn't be used by the guest + can be classified as <code>address type='unassigned'</code>. + Libvirt will still be able to manage the device as a + regular PCI hostdev. + </description>
This is rather convoluted. Users have always had the choice of which devices NOT to assign. I would present this as a mechanism for libvirt to manage the driver binding of devices, as done for managed PCI hostdev devices, without actually attaching the devices to the VM, thereby facilitating device assignment when only a subset of the devices within an IOMMU group are intended to be used by the guest.
Thanks. I'll resend the series with a better tone in the docs and the commit message of patch 1.
BTW, this should also work with the hostdev <driver='kvm'/> option if the user wanted to bind the device to pci-stub rather than vfio-pci to provide even further isolation from the user being able to access the device. Thanks,
After dropping the heavy validation logic I was doing in the previous versions, it should work with driver='kvm' as well. What the code is doing now is removing any device marked as unassigned from QEMU launch command. If QEMU is fine with the pci-stub setup the user is setting up, this code will not stand in the way. Thanks, DHB
Alex
+ </change> </section> <section title="Improvements"> <change>
participants (2)
-
Alex Williamson
-
Daniel Henrique Barboza