On 7/16/19 3:00 PM, Peter Krempa wrote:
On Thu, Jul 11, 2019 at 17:53:58 +0200, Michal Privoznik wrote:
> To simplify implementation, some restrictions are added. For
> instance, an NVMe disk can't go to any bus but virtio and has to
> be type of 'disk' and can't have startupPolicy set.
>
> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
> ---
> src/conf/domain_conf.c | 129 +++++++++++++++++++++++++
> src/libvirt_private.syms | 1 +
> src/qemu/qemu_block.c | 1 +
> src/qemu/qemu_command.c | 1 +
> src/qemu/qemu_driver.c | 4 +
> src/qemu/qemu_migration.c | 1 +
> src/util/virstoragefile.c | 59 +++++++++++
> src/util/virstoragefile.h | 15 +++
> src/xenconfig/xen_xl.c | 1 +
> tests/qemuxml2argvdata/disk-nvme.xml | 12 ++-
> tests/qemuxml2xmloutdata/disk-nvme.xml | 1 +
> tests/qemuxml2xmltest.c | 1 +
> 12 files changed, 224 insertions(+), 2 deletions(-)
> create mode 120000 tests/qemuxml2xmloutdata/disk-nvme.xml
>
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 3323c9a5b1..73f5e1fa0f 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
[...]
> @@ -5938,6 +5943,38 @@ virDomainDiskDefValidate(const virDomainDiskDef *disk)
> return -1;
> }
>
> + if (disk->src->type == VIR_STORAGE_TYPE_NVME) {
> + /* These might not be valid for all hypervisors, but be
> + * strict now and possibly refine in the future. */
> + if (disk->device != VIR_DOMAIN_DISK_DEVICE_DISK) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Unsupported disk type '%s' for NVMe
disk"),
> + virDomainDiskDeviceTypeToString(disk->device));
> + return -1;
> + }
> +
> + if (disk->bus != VIR_DOMAIN_DISK_BUS_VIRTIO) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Unsupported bus '%s' for NVMe
disk"),
> + virDomainDiskBusTypeToString(disk->bus));
> + return -1;
> + }
> +
> + if (disk->startupPolicy != VIR_DOMAIN_STARTUP_POLICY_DEFAULT &&
> + disk->startupPolicy != VIR_DOMAIN_STARTUP_POLICY_MANDATORY) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Unsupported startup policy '%s' for NVMe
disk"),
> +
virDomainStartupPolicyTypeToString(disk->startupPolicy));
> + return -1;
> + }
> +
> + if (disk->src->shared) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("Unsupported <shareable/> for NVMe
disk"));
> + return -1;
> + }
> + }
> +
> return 0;
> }
As noted in the other thread, this really should be extracted, placed in
the validation callback rather than post parse and must iterate the
backing chain if you want this to keep working with -blockdev.
I'm not sure I understand what you mean. This function is called
virDomainDiskDefValidate() and therefore it is validation callback
rather than post parse callback. Where do you want me to put these checks?
>
> @@ -9184,6 +9221,76 @@ virDomainDiskSourceNetworkParse(xmlNodePtr node,
> }
>
>
> +static int
> +virDomainDiskSourceNVMeParse(xmlNodePtr node,
> + xmlXPathContextPtr ctxt,
> + virStorageSourcePtr src)
> +{
> + VIR_AUTOPTR(virStorageSourceNVMeDef) nvme = NULL;
> + VIR_AUTOFREE(char *) type = NULL;
> + VIR_AUTOFREE(char *) namespace = NULL;
> + VIR_AUTOFREE(char *) managed = NULL;
> + xmlNodePtr address;
> +
> + if (VIR_ALLOC(nvme) < 0)
> + return -1;
> +
> + if (!(type = virXMLPropString(node, "type"))) {
> + virReportError(VIR_ERR_XML_ERROR, "%s",
> + _("missing 'type' attribute to disk
source"));
> + return -1;
> + }
> +
> + if (STRNEQ(type, "pci")) {
> + virReportError(VIR_ERR_XML_ERROR,
> + _("unsupported source type '%s'"),
> + type);
> + return -1;
> + }
> +
> + if (!(namespace = virXMLPropString(node, "namespace"))) {
> + virReportError(VIR_ERR_XML_ERROR, "%s",
> + _("missing 'namespace' attribute to disk
source"));
> + return -1;
> + }
> +
> + if (virStrToLong_ul(namespace, NULL, 10, &nvme->namespace) < 0) {
> + virReportError(VIR_ERR_XML_ERROR,
> + _("malformed namespace '%s'"),
> + namespace);
> + return -1;
> + }
> +
> + /* NVMe namespaces start from 1 */
> + if (nvme->namespace == 0) {
> + virReportError(VIR_ERR_XML_ERROR, "%s",
> + _("NVMe namespace can't be zero"));
> + return -1;
> + }
> +
> + if ((managed = virXMLPropString(node, "managed"))) {
> + if ((nvme->managed = virTristateBoolTypeFromString(managed)) <= 0) {
> + virReportError(VIR_ERR_XML_ERROR,
> + _("malformed managed value '%s'"),
> + managed);
> + return -1;
> + }
> + }
> +
> + if (!(address = virXPathNode("./address", ctxt))) {
> + virReportError(VIR_ERR_XML_ERROR, "%s",
> + _("NVMe disk source is missing address"));
> + return -1;
> + }
I'm displeased that this is yet another function adding validation in
the parser. You don't make the status quo any worse though so this is
not a request to change it.
What validation do you mean? namespace != 0 check? Well, that stems
straight from NVMe specs, so it is completely independent of any driver.
Or do you mean this !address check? Well, it's needed later in the parsing.
> +
> + if (virPCIDeviceAddressParseXML(address, &nvme->pciAddr) < 0)
> + return -1;
> +
> + VIR_STEAL_PTR(src->nvme, nvme);
> + return 0;
> +}
> +
> +
> static int
> virDomainDiskSourcePRParse(xmlNodePtr node,
> xmlXPathContextPtr ctxt,
[...]
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> index 2436f5051b..87adccab3d 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
Missing change to 'qemuMigrationSrcIsSafe' to reject VMs containing NVMe
disks not having shared source.
A NVMe disk can't be shared. It's even explicitly denied in the
validation callback. The reason is that there is no way to share a
single PCI device with multiple domains.
But this is a good point. I'll probably put it into a different commt
though. Even though I'm changing some parts of qemu driver here it's
only because of the way we handle switch() with enums.
Plus that function should probably check the full backing chain rather
than the top element only.
Pre-existing, but I can try to fix it. Okay.
> diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c
> index 269d0050fd..18aa33fe05 100644
> --- a/src/util/virstoragefile.c
> +++ b/src/util/virstoragefile.c
[...]
> @@ -2114,6 +2115,48 @@ virStoragePRDefCopy(virStoragePRDefPtr src)
> }
>
>
> +static virStorageSourceNVMeDefPtr
> +virStorageSourceNVMeDefCopy(const virStorageSourceNVMeDef *src)
> +{
> + VIR_AUTOPTR(virStorageSourceNVMeDef) ret = NULL;
> +
> + if (VIR_ALLOC(ret) < 0)
> + return NULL;
> +
> + *ret = *src;
You opted to use memcpy for the pci address few patches ago.
Darn! You're right.
Honestly, I wanted to use coccinelle to adapt our code to
virPCIDeviceAddressCopy(); I've written the spatch to do that but then
failed to install coccinelle on my rawhide box. And then I've simply
forgot about it. Ehm.
> + VIR_RETURN_PTR(ret);
> +}
[...]
> @@ -2463,6 +2514,7 @@ virStorageSourceIsLocalStorage(const virStorageSource *src)
>
> case VIR_STORAGE_TYPE_NETWORK:
> case VIR_STORAGE_TYPE_VOLUME:
> + case VIR_STORAGE_TYPE_NVME:
Welp. While I agree that virStorageSourceIsLocalStorage should return
false you should really add a documentation comment explaining why NVMe
is different. (e.g. regular code accessing src->path would not work).
I think I'm mentioning this somewhere later in the series, but it makes
sense to add it here. Okay.
> case VIR_STORAGE_TYPE_LAST:
> case VIR_STORAGE_TYPE_NONE:
> return false;
> @@ -2493,6 +2545,10 @@ virStorageSourceIsEmpty(virStorageSourcePtr src)
> src->protocol == VIR_STORAGE_NET_PROTOCOL_NONE)
> return true;
>
> + if (src->type == VIR_STORAGE_TYPE_NVME &&
> + !src->nvme)
> + return true;
Formating a disk type='nvme' without any data would not be parseable in
our parser, so this will never happen.
> +
> return false;
> }
>
[...]
> diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h
> index 38ba901858..a1294ea608 100644
> --- a/src/util/virstoragefile.h
> +++ b/src/util/virstoragefile.h
[...]
> @@ -231,6 +233,14 @@ struct _virStorageSourceInitiatorDef {
> char *iqn; /* Initiator IQN */
> };
>
Add a comment noting that the copy function needs to be fixed if this is
ever being updated
> +typedef struct _virStorageSourceNVMeDef virStorageSourceNVMeDef;
> +typedef virStorageSourceNVMeDef *virStorageSourceNVMeDefPtr;
> +struct _virStorageSourceNVMeDef {
> + unsigned long namespace;
'long' is either 32 or 64 bit depending on the architecture, please use
unsigned int or unsigned long long.
> + int managed; /* enum virTristateBool */
> + virPCIDeviceAddress pciAddr;
> +};
> +
> typedef struct _virStorageDriverData virStorageDriverData;
> typedef virStorageDriverData *virStorageDriverDataPtr;
>
[...]
> @@ -416,6 +428,9 @@ bool virStoragePRDefIsManaged(virStoragePRDefPtr prd);
> bool
> virStorageSourceChainHasManagedPR(virStorageSourcePtr src);
>
> +void virStorageSourceNVMeDefFree(virStorageSourceNVMeDefPtr def);
> +VIR_DEFINE_AUTOPTR_FUNC(virStorageSourceNVMeDef, virStorageSourceNVMeDefFree);
Do these need to be exposed?
Yes. In fact, you can see it used right in this patch in
virDomainDiskSourceNVMeParse() which lives in src/conf/domain_conf.c.
> +
> virSecurityDeviceLabelDefPtr
> virStorageSourceGetSecurityLabelDef(virStorageSourcePtr src,
> const char *model);
[...]
> diff --git a/tests/qemuxml2argvdata/disk-nvme.xml
b/tests/qemuxml2argvdata/disk-nvme.xml
> index 0b3dbad4eb..fe956d5ab6 100644
> --- a/tests/qemuxml2argvdata/disk-nvme.xml
> +++ b/tests/qemuxml2argvdata/disk-nvme.xml
> @@ -20,6 +20,7 @@
> <address domain='0x0000' bus='0x01' slot='0x00'
function='0x0'/>
> </source>
> <target dev='vda' bus='virtio'/>
> + <address type='pci' domain='0x0000' bus='0x00'
slot='0x04' function='0x0'/>
> </disk>
> <disk type='nvme' device='disk'>
> <driver name='qemu' type='raw'/>
> @@ -27,6 +28,7 @@
> <address domain='0x0000' bus='0x01' slot='0x00'
function='0x0'/>
> </source>
> <target dev='vdb' bus='virtio'/>
> + <address type='pci' domain='0x0000' bus='0x00'
slot='0x05' function='0x0'/>
> </disk>
> <disk type='nvme' device='disk'>
> <driver name='qemu' type='raw'/>
> @@ -34,6 +36,7 @@
> <address domain='0x0000' bus='0x02' slot='0x00'
function='0x0'/>
> </source>
> <target dev='vdc' bus='virtio'/>
> + <address type='pci' domain='0x0000' bus='0x00'
slot='0x06' function='0x0'/>
> </disk>
> <disk type='nvme' device='disk'>
> <driver name='qemu' type='raw'/>
> @@ -44,10 +47,15 @@
> </encryption>
> </source>
> <target dev='vdd' bus='virtio'/>
> + <address type='pci' domain='0x0000' bus='0x00'
slot='0x07' function='0x0'/>
> </disk>
> - <controller type='usb' index='0'/>
> + <controller type='usb' index='0'>
> + <address type='pci' domain='0x0000' bus='0x00'
slot='0x01' function='0x2'/>
> + </controller>
> <controller type='pci' index='0'
model='pci-root'/>
> - <controller type='scsi' index='0'
model='virtio-scsi'/>
> + <controller type='scsi' index='0'
model='virtio-scsi'>
> + <address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
> + </controller>
> <input type='mouse' bus='ps2'/>
> <input type='keyboard' bus='ps2'/>
> <memballoon model='none'/>
All of these belong to the previous patch adding the test file in the
first place.
Okay.
Michal