[libvirt] [PATCH 0/8] disk device="lun" related fixes

Fix a few invalid error messages, remove code that would make certain configs vanish and reject block copy to a regular file. Peter Krempa (8): util: Replace virDomainDiskSourceIsBlockType with a new helper lxc: Fix wrong error message on disk hotplug qemu: Support <disk device='lun'> for iSCSI direct mapped volumes qemu: command: Use more appropriate checking function for block devices conf: Kill now unused virDomainDiskSourceIsBlockType qemu: command: Remove unnecessary label in qemuCheckDiskConfig qemu: Reject invalid block copy targets for <disk device='lun'> Revert "conf: Validate disk lun using correct types" src/conf/domain_conf.c | 74 ----------------------------------------------- src/conf/domain_conf.h | 3 -- src/libvirt_private.syms | 2 +- src/lxc/lxc_cgroup.c | 3 +- src/lxc/lxc_driver.c | 9 ++++-- src/qemu/qemu_command.c | 29 +++++++------------ src/qemu/qemu_conf.c | 10 +++++-- src/qemu/qemu_domain.c | 31 ++++++++++++++++++++ src/qemu/qemu_domain.h | 3 ++ src/qemu/qemu_driver.c | 4 +++ src/util/virstoragefile.c | 16 +++++++++- src/util/virstoragefile.h | 3 +- tests/qemuxml2argvtest.c | 3 +- 13 files changed, 82 insertions(+), 108 deletions(-) -- 2.8.1

For disks sources described by a libvirt volume we don't need to do a complicated check since virStorageTranslateDiskSourcePool already correctly determines the actual disk type. Replace the checks using a new accessor that does not open-code the whole logic. --- src/libvirt_private.syms | 1 + src/lxc/lxc_cgroup.c | 3 ++- src/qemu/qemu_conf.c | 10 +++++++--- src/util/virstoragefile.c | 16 +++++++++++++++- src/util/virstoragefile.h | 3 ++- 5 files changed, 27 insertions(+), 6 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 5030ec3..73c9fdb 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2277,6 +2277,7 @@ virStorageSourceFree; virStorageSourceGetActualType; virStorageSourceGetSecurityLabelDef; virStorageSourceInitChainElement; +virStorageSourceIsBlockLocal; virStorageSourceIsEmpty; virStorageSourceIsLocalStorage; virStorageSourceNewFromBacking; diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 4afe7e2..ea86d36 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -393,7 +393,8 @@ static int virLXCCgroupSetupDeviceACL(virDomainDefPtr def, VIR_DEBUG("Allowing any disk block devs"); for (i = 0; i < def->ndisks; i++) { - if (!virDomainDiskSourceIsBlockType(def->disks[i]->src, false)) + if (virStorageSourceIsEmpty(def->disks[i]->src) || + !virStorageSourceIsBlockLocal(def->disks[i]->src)) continue; if (virCgroupAllowDevicePath(cgroup, diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 5ed5776..e00ddca 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -1243,7 +1243,9 @@ qemuAddSharedDisk(virQEMUDriverPtr driver, char *key = NULL; int ret = -1; - if (!disk->src->shared || !virDomainDiskSourceIsBlockType(disk->src, false)) + if (virStorageSourceIsEmpty(disk->src) || + !disk->src->shared || + !virStorageSourceIsBlockLocal(disk->src)) return 0; qemuDriverLock(driver); @@ -1388,7 +1390,9 @@ qemuRemoveSharedDisk(virQEMUDriverPtr driver, char *key = NULL; int ret = -1; - if (!disk->src->shared || !virDomainDiskSourceIsBlockType(disk->src, false)) + if (virStorageSourceIsEmpty(disk->src) || + !disk->src->shared || + !virStorageSourceIsBlockLocal(disk->src)) return 0; qemuDriverLock(driver); @@ -1476,7 +1480,7 @@ qemuSetUnprivSGIO(virDomainDeviceDefPtr dev) disk = dev->data.disk; if (disk->device != VIR_DOMAIN_DISK_DEVICE_LUN || - !virDomainDiskSourceIsBlockType(disk->src, false)) + !virStorageSourceIsBlockLocal(disk->src)) return 0; path = virDomainDiskGetSource(disk); diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 05ac254..4f44e05 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -1955,7 +1955,7 @@ virStorageSourcePoolDefFree(virStorageSourcePoolDefPtr def) int -virStorageSourceGetActualType(virStorageSourcePtr def) +virStorageSourceGetActualType(const virStorageSource *def) { if (def->type == VIR_STORAGE_TYPE_VOLUME && def->srcpool) return def->srcpool->actualtype; @@ -2012,6 +2012,20 @@ virStorageSourceIsEmpty(virStorageSourcePtr src) /** + * virStorageSourceIsBlockLocal: + * @src: disk source definition + * + * Returns true if @src describes a locally accessible block storage source. + * This includes block devices and host-mapped iSCSI volumes. + */ +bool +virStorageSourceIsBlockLocal(const virStorageSource *src) +{ + return virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_BLOCK; +} + + +/** * virStorageSourceBackingStoreClear: * * @src: disk source to clear diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index b98fe25..17e1277 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -357,9 +357,10 @@ int virStorageSourceInitChainElement(virStorageSourcePtr newelem, bool force); void virStorageSourcePoolDefFree(virStorageSourcePoolDefPtr def); void virStorageSourceClear(virStorageSourcePtr def); -int virStorageSourceGetActualType(virStorageSourcePtr def); +int virStorageSourceGetActualType(const virStorageSource *def); bool virStorageSourceIsLocalStorage(virStorageSourcePtr src); bool virStorageSourceIsEmpty(virStorageSourcePtr src); +bool virStorageSourceIsBlockLocal(const virStorageSource *src); void virStorageSourceFree(virStorageSourcePtr def); void virStorageSourceBackingStoreClear(virStorageSourcePtr def); int virStorageSourceUpdateBlockPhysicalSize(virStorageSourcePtr src, -- 2.8.1

On 05/02/2016 10:32 AM, Peter Krempa wrote:
For disks sources described by a libvirt volume we don't need to do a complicated check since virStorageTranslateDiskSourcePool already correctly determines the actual disk type.
Replace the checks using a new accessor that does not open-code the whole logic. --- src/libvirt_private.syms | 1 + src/lxc/lxc_cgroup.c | 3 ++- src/qemu/qemu_conf.c | 10 +++++++--- src/util/virstoragefile.c | 16 +++++++++++++++- src/util/virstoragefile.h | 3 ++- 5 files changed, 27 insertions(+), 6 deletions(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 5030ec3..73c9fdb 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2277,6 +2277,7 @@ virStorageSourceFree; virStorageSourceGetActualType; virStorageSourceGetSecurityLabelDef; virStorageSourceInitChainElement; +virStorageSourceIsBlockLocal; virStorageSourceIsEmpty; virStorageSourceIsLocalStorage; virStorageSourceNewFromBacking; diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 4afe7e2..ea86d36 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -393,7 +393,8 @@ static int virLXCCgroupSetupDeviceACL(virDomainDefPtr def,
VIR_DEBUG("Allowing any disk block devs"); for (i = 0; i < def->ndisks; i++) { - if (!virDomainDiskSourceIsBlockType(def->disks[i]->src, false)) + if (virStorageSourceIsEmpty(def->disks[i]->src) || + !virStorageSourceIsBlockLocal(def->disks[i]->src))
Could an LXC domain use a storage pool for a volume? If so, then somewhere in the LXC code wouldn't we need to translate the source pool in order to get "actualtype" and the volume check...
continue;
if (virCgroupAllowDevicePath(cgroup, diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 5ed5776..e00ddca 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -1243,7 +1243,9 @@ qemuAddSharedDisk(virQEMUDriverPtr driver, char *key = NULL; int ret = -1;
- if (!disk->src->shared || !virDomainDiskSourceIsBlockType(disk->src, false)) + if (virStorageSourceIsEmpty(disk->src) || + !disk->src->shared || + !virStorageSourceIsBlockLocal(disk->src)) return 0;
qemuDriverLock(driver); @@ -1388,7 +1390,9 @@ qemuRemoveSharedDisk(virQEMUDriverPtr driver, char *key = NULL; int ret = -1;
- if (!disk->src->shared || !virDomainDiskSourceIsBlockType(disk->src, false)) + if (virStorageSourceIsEmpty(disk->src) || + !disk->src->shared || + !virStorageSourceIsBlockLocal(disk->src)) return 0;
qemuDriverLock(driver); @@ -1476,7 +1480,7 @@ qemuSetUnprivSGIO(virDomainDeviceDefPtr dev) disk = dev->data.disk;
if (disk->device != VIR_DOMAIN_DISK_DEVICE_LUN || - !virDomainDiskSourceIsBlockType(disk->src, false)) + !virStorageSourceIsBlockLocal(disk->src)) return 0;
path = virDomainDiskGetSource(disk); diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 05ac254..4f44e05 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -1955,7 +1955,7 @@ virStorageSourcePoolDefFree(virStorageSourcePoolDefPtr def)
int -virStorageSourceGetActualType(virStorageSourcePtr def) +virStorageSourceGetActualType(const virStorageSource *def) { if (def->type == VIR_STORAGE_TYPE_VOLUME && def->srcpool) return def->srcpool->actualtype; @@ -2012,6 +2012,20 @@ virStorageSourceIsEmpty(virStorageSourcePtr src)
/** + * virStorageSourceIsBlockLocal: + * @src: disk source definition + * + * Returns true if @src describes a locally accessible block storage source. + * This includes block devices and host-mapped iSCSI volumes. + */ +bool +virStorageSourceIsBlockLocal(const virStorageSource *src) +{ + return virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_BLOCK;
This assumes that virStorageTranslateDiskSourcePool has been run, which is true is for the non-LXC paths changed in this patch... So as long as you're comfortable with the LXC adjustment, then ACK - John
+} + + +/** * virStorageSourceBackingStoreClear: * * @src: disk source to clear diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index b98fe25..17e1277 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -357,9 +357,10 @@ int virStorageSourceInitChainElement(virStorageSourcePtr newelem, bool force); void virStorageSourcePoolDefFree(virStorageSourcePoolDefPtr def); void virStorageSourceClear(virStorageSourcePtr def); -int virStorageSourceGetActualType(virStorageSourcePtr def); +int virStorageSourceGetActualType(const virStorageSource *def); bool virStorageSourceIsLocalStorage(virStorageSourcePtr src); bool virStorageSourceIsEmpty(virStorageSourcePtr src); +bool virStorageSourceIsBlockLocal(const virStorageSource *src); void virStorageSourceFree(virStorageSourcePtr def); void virStorageSourceBackingStoreClear(virStorageSourcePtr def); int virStorageSourceUpdateBlockPhysicalSize(virStorageSourcePtr src,

On Fri, May 06, 2016 at 10:33:49 -0400, John Ferlan wrote:
On 05/02/2016 10:32 AM, Peter Krempa wrote:
For disks sources described by a libvirt volume we don't need to do a complicated check since virStorageTranslateDiskSourcePool already correctly determines the actual disk type.
Replace the checks using a new accessor that does not open-code the whole logic. --- src/libvirt_private.syms | 1 + src/lxc/lxc_cgroup.c | 3 ++- src/qemu/qemu_conf.c | 10 +++++++--- src/util/virstoragefile.c | 16 +++++++++++++++- src/util/virstoragefile.h | 3 ++- 5 files changed, 27 insertions(+), 6 deletions(-)
[...]
diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c index 4afe7e2..ea86d36 100644 --- a/src/lxc/lxc_cgroup.c +++ b/src/lxc/lxc_cgroup.c @@ -393,7 +393,8 @@ static int virLXCCgroupSetupDeviceACL(virDomainDefPtr def,
VIR_DEBUG("Allowing any disk block devs"); for (i = 0; i < def->ndisks; i++) { - if (!virDomainDiskSourceIsBlockType(def->disks[i]->src, false)) + if (virStorageSourceIsEmpty(def->disks[i]->src) || + !virStorageSourceIsBlockLocal(def->disks[i]->src))
Could an LXC domain use a storage pool for a volume? If so, then somewhere in the LXC code wouldn't we need to translate the source pool in order to get "actualtype" and the volume check...
LXC won't work with 'volume' disks as as you've said it's not translated. virDomainDiskSourceIsBlockType would return false at this point since def->disks[i]->src->path would not be set at this point since it was not translated and the very first check of that function was to check the source path. As of that this change doesn't change the behavior in LXC in any way just uses a more sane function to do the check.
continue;
if (virCgroupAllowDevicePath(cgroup, /** + * virStorageSourceIsBlockLocal: + * @src: disk source definition + * + * Returns true if @src describes a locally accessible block storage source. + * This includes block devices and host-mapped iSCSI volumes. + */ +bool +virStorageSourceIsBlockLocal(const virStorageSource *src) +{ + return virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_BLOCK;
This assumes that virStorageTranslateDiskSourcePool has been run, which is true is for the non-LXC paths changed in this patch...
So as long as you're comfortable with the LXC adjustment, then
Allowing to use volumes is definitely something for a separate patch. Peter

Commit 36025c552 tried to improve error reporting for <disk type="lun"> but reused the code in LXC which doesn't care about the actual disk type. The error messages would then contain a bogous hint that the config for the 'lun' device is invalid which might not be the case. Re-do the relevant portion of the commit with the original message. --- src/lxc/lxc_driver.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 1dfbde3..7b76daf 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -4106,9 +4106,6 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, goto cleanup; } - if (!virDomainDiskSourceIsBlockType(def->src, true)) - goto cleanup; - src = virDomainDiskGetSource(def); if (src == NULL) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", @@ -4116,6 +4113,12 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, goto cleanup; } + if (virStorageSourceIsBlockLocal(def->src)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Can't setup disk for non-block device")); + goto cleanup; + } + if (virDomainDiskIndexByName(vm->def, def->dst, true) >= 0) { virReportError(VIR_ERR_OPERATION_FAILED, _("target %s already exists"), def->dst); -- 2.8.1

On 05/02/2016 10:32 AM, Peter Krempa wrote:
Commit 36025c552 tried to improve error reporting for <disk type="lun"> but reused the code in LXC which doesn't care about the actual disk type. The error messages would then contain a bogous hint that the config for the 'lun' device is invalid which might not be the case.
Re-do the relevant portion of the commit with the original message. --- src/lxc/lxc_driver.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 1dfbde3..7b76daf 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -4106,9 +4106,6 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, goto cleanup; }
- if (!virDomainDiskSourceIsBlockType(def->src, true)) - goto cleanup; - src = virDomainDiskGetSource(def); if (src == NULL) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", @@ -4116,6 +4113,12 @@ lxcDomainAttachDeviceDiskLive(virLXCDriverPtr driver, goto cleanup; }
+ if (virStorageSourceIsBlockLocal(def->src)) {
Shouldn't this be "if (!virStorageSourceIsBlockLocal(def->src)" ? ACK w/ the adjustment John
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Can't setup disk for non-block device")); + goto cleanup; + } +
FWIW: Prior to my commit referenced above the check was: if (!virDomainDiskSourceIsBlockType(def->src)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Can't setup disk for non-block device")); And yes, of course since virDomainDiskSourceIsBlockType checks "src->path" first (and provides a different message), moving this after that check seems right; otherwise, the error message is "source path not found for device='lun' using type='%d'" if (!src->path). All that commit did was try to have the lower layer generate the error message. Of course in this case erroneously swapping "Can't setup disk for non-block device" for "disk device='lun' is only valid for block type disk source". Still going back to 'a7785ccf' we see the call to "if (!virDomainDiskSourceIsBlockType(def)"...
if (virDomainDiskIndexByName(vm->def, def->dst, true) >= 0) { virReportError(VIR_ERR_OPERATION_FAILED, _("target %s already exists"), def->dst);

Commit c820fbff9fbfe1f2549a5b60967496587f8d8bfc added support for iSCSI disk as backing for <disk device='lun'>. We would not use it for a disk type="volume" with direct access mode which basically maps to direct iSCSI usage. Fix it by adding the storage source type accessor that resolves the volume type. --- src/qemu/qemu_command.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index c65ab16..e6f0181 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -907,7 +907,8 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk) virDomainDiskQEMUBusTypeToString(disk->bus)); goto error; } - if (disk->src->type == VIR_STORAGE_TYPE_NETWORK) { + + if (virStorageSourceGetActualType(disk->src) == VIR_STORAGE_TYPE_NETWORK) { if (disk->src->protocol != VIR_STORAGE_NET_PROTOCOL_ISCSI) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("disk device='lun' is not supported " -- 2.8.1

On 05/02/2016 10:32 AM, Peter Krempa wrote:
Commit c820fbff9fbfe1f2549a5b60967496587f8d8bfc added support for iSCSI disk as backing for <disk device='lun'>. We would not use it for a disk type="volume" with direct access mode which basically maps to direct iSCSI usage. Fix it by adding the storage source type accessor that resolves the volume type. --- src/qemu/qemu_command.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
ACK John

In qemuCheckDiskConfig would now use virDomainDiskSourceIsBlockType just as a glorified version of virStorageSourceIsBlockLocal that reports error messages. Replace it with the latter including the message for clarity. --- src/qemu/qemu_command.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index e6f0181..110ee19 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -916,9 +916,13 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk) virStorageNetProtocolTypeToString(disk->src->protocol)); goto error; } - } else if (!virDomainDiskSourceIsBlockType(disk->src, true)) { + } else if (!virStorageSourceIsBlockLocal(disk->src)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("disk device='lun' is only valid for block " + "type disk source")); goto error; } + if (disk->wwn) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Setting wwn is not supported for lun device")); -- 2.8.1

On 05/02/2016 10:32 AM, Peter Krempa wrote:
In qemuCheckDiskConfig would now use virDomainDiskSourceIsBlockType just as a glorified version of virStorageSourceIsBlockLocal that reports error messages. Replace it with the latter including the message for clarity. --- src/qemu/qemu_command.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
Hmmm... I wonder if the previous commit should remove or at least reference the check in virDomainDiskSourceIsBlockType that neglected to consider the translation and actualtype... ACK for this - your call on whether to adjust the previous commit... John

On Fri, May 06, 2016 at 11:09:38 -0400, John Ferlan wrote:
On 05/02/2016 10:32 AM, Peter Krempa wrote:
In qemuCheckDiskConfig would now use virDomainDiskSourceIsBlockType just as a glorified version of virStorageSourceIsBlockLocal that reports error messages. Replace it with the latter including the message for clarity. --- src/qemu/qemu_command.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
Hmmm... I wonder if the previous commit should remove or at least reference the check in virDomainDiskSourceIsBlockType that neglected to
I don't see a point in doing this since that function is removed in the next patch.

--- src/conf/domain_conf.c | 52 ------------------------------------------------ src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 1 - 3 files changed, 56 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 23ff887..48a220f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -24004,58 +24004,6 @@ virDomainDefFindDevice(virDomainDefPtr def, return 0; } -/** - * virDomainDiskSourceIsBlockType: - * - * Check if the disk *source* is of block type. This just tries - * to check from the type of disk def, not to probe the underlying - * storage. - * - * Return true if its source is block type, or false otherwise. - */ -bool -virDomainDiskSourceIsBlockType(virStorageSourcePtr src, - bool report) -{ - if (!src->path) { - if (report) - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("source path not found for device='lun' " - "using type='%d'"), src->type); - return false; - } - - if (src->type == VIR_STORAGE_TYPE_BLOCK) - return true; - - /* For volume types, check the srcpool. - * If it's a block type source pool, then it's possible - */ - if (src->type == VIR_STORAGE_TYPE_VOLUME && - src->srcpool && - src->srcpool->voltype == VIR_STORAGE_VOL_BLOCK) { - /* We don't think the volume accessed by remote URI is - * block type source, since we can't/shouldn't manage it - * (e.g. set sgio=filtered|unfiltered for it) in libvirt. - */ - if (src->srcpool->pooltype == VIR_STORAGE_POOL_ISCSI && - src->srcpool->mode == VIR_STORAGE_SOURCE_POOL_MODE_DIRECT) { - if (report) - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("disk device='lun' for iSCSI is not " - "supported with mode='direct'.")); - return false; - } - - return true; - } - if (report) - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("disk device='lun' is only valid for block " - "type disk source")); - return false; -} - char * virDomainObjGetMetadata(virDomainObjPtr vm, diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index b9fe109..bd97153 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -3115,9 +3115,6 @@ int virDomainDefFindDevice(virDomainDefPtr def, virDomainDeviceDefPtr dev, bool reportError); -bool virDomainDiskSourceIsBlockType(virStorageSourcePtr src, bool report) - ATTRIBUTE_NONNULL(1); - void virDomainChrSourceDefClear(virDomainChrSourceDefPtr def); char *virDomainObjGetMetadata(virDomainObjPtr vm, diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 73c9fdb..8e4fdf2 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -289,7 +289,6 @@ virDomainDiskSetDriver; virDomainDiskSetFormat; virDomainDiskSetSource; virDomainDiskSetType; -virDomainDiskSourceIsBlockType; virDomainFSDefFree; virDomainFSIndexByName; virDomainFSInsert; -- 2.8.1

On 05/02/2016 10:32 AM, Peter Krempa wrote:
--- src/conf/domain_conf.c | 52 ------------------------------------------------ src/conf/domain_conf.h | 3 --- src/libvirt_private.syms | 1 - 3 files changed, 56 deletions(-)
ACK John

--- src/qemu/qemu_command.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 110ee19..0f0427f 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -879,7 +879,7 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk) if (virDiskNameToIndex(disk->dst) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("unsupported disk type '%s'"), disk->dst); - goto error; + return -1; } if (disk->wwn) { @@ -887,7 +887,7 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk) (disk->bus != VIR_DOMAIN_DISK_BUS_SCSI)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Only ide and scsi disk support wwn")); - goto error; + return -1; } } @@ -895,7 +895,7 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk) disk->bus != VIR_DOMAIN_DISK_BUS_SCSI) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Only scsi disk supports vendor and product")); - goto error; + return -1; } if (disk->device == VIR_DOMAIN_DISK_DEVICE_LUN) { @@ -905,7 +905,7 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk) virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("disk device='lun' is not supported for bus='%s'"), virDomainDiskQEMUBusTypeToString(disk->bus)); - goto error; + return -1; } if (virStorageSourceGetActualType(disk->src) == VIR_STORAGE_TYPE_NETWORK) { @@ -914,30 +914,28 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk) _("disk device='lun' is not supported " "for protocol='%s'"), virStorageNetProtocolTypeToString(disk->src->protocol)); - goto error; + return -1; } } else if (!virStorageSourceIsBlockLocal(disk->src)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("disk device='lun' is only valid for block " "type disk source")); - goto error; + return -1; } if (disk->wwn) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Setting wwn is not supported for lun device")); - goto error; + return -1; } if (disk->vendor || disk->product) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Setting vendor or product is not supported " "for lun device")); - goto error; + return -1; } } return 0; - error: - return -1; } -- 2.8.1

Extract the relevant parts of the existing checker and reuse them for blockcopy since copying to a non-block device creates an invalid configuration. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1209802 --- src/qemu/qemu_command.c | 14 +------------- src/qemu/qemu_domain.c | 31 +++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 3 +++ src/qemu/qemu_driver.c | 4 ++++ 4 files changed, 39 insertions(+), 13 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 0f0427f..16a044e 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -908,20 +908,8 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk) return -1; } - if (virStorageSourceGetActualType(disk->src) == VIR_STORAGE_TYPE_NETWORK) { - if (disk->src->protocol != VIR_STORAGE_NET_PROTOCOL_ISCSI) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("disk device='lun' is not supported " - "for protocol='%s'"), - virStorageNetProtocolTypeToString(disk->src->protocol)); - return -1; - } - } else if (!virStorageSourceIsBlockLocal(disk->src)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("disk device='lun' is only valid for block " - "type disk source")); + if (qemuDomainDefValidateDiskLunSource(disk->src) < 0) return -1; - } if (disk->wwn) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d065edb..472b569 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -5267,3 +5267,34 @@ qemuDomainDiskByName(virDomainDefPtr def, return ret; } + + +/** + * qemuDomainDefValidateDiskLunSource: + * @src: disk source struct + * + * Validate whether the disk source is valid for disk device='lun'. + * + * Returns 0 if the configuration is valid -1 and a libvirt error if the soure + * is invalid. + */ +int +qemuDomainDefValidateDiskLunSource(const virStorageSource *src) +{ + if (virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_NETWORK) { + if (src->protocol != VIR_STORAGE_NET_PROTOCOL_ISCSI) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("disk device='lun' is not supported " + "for protocol='%s'"), + virStorageNetProtocolTypeToString(src->protocol)); + return -1; + } + } else if (!virStorageSourceIsBlockLocal(src)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("disk device='lun' is only valid for block " + "type disk source")); + return -1; + } + + return 0; +} diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 63f98ba..7b71ae6 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -647,4 +647,7 @@ void qemuDomainSecretDestroy(virDomainObjPtr vm) int qemuDomainSecretPrepare(virConnectPtr conn, virDomainObjPtr vm) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); +int qemuDomainDefValidateDiskLunSource(const virStorageSource *src) + ATTRIBUTE_NONNULL(1); + #endif /* __QEMU_DOMAIN_H__ */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 3d0c7c8..cc50043 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -16759,6 +16759,10 @@ qemuDomainBlockCopyCommon(virDomainObjPtr vm, if (qemuDomainDiskBlockJobIsActive(disk)) goto endjob; + if (disk->device == VIR_DOMAIN_DISK_DEVICE_LUN && + qemuDomainDefValidateDiskLunSource(mirror) < 0) + goto endjob; + if (!(virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_MIRROR) && virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_ASYNC))) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", -- 2.8.1

On 05/02/2016 10:32 AM, Peter Krempa wrote:
Extract the relevant parts of the existing checker and reuse them for blockcopy since copying to a non-block device creates an invalid configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1209802 --- src/qemu/qemu_command.c | 14 +------------- src/qemu/qemu_domain.c | 31 +++++++++++++++++++++++++++++++ src/qemu/qemu_domain.h | 3 +++ src/qemu/qemu_driver.c | 4 ++++ 4 files changed, 39 insertions(+), 13 deletions(-)
ACK John

This reverts commit c79ebf53b5fe0a33bf407b3bcb49e3a27ec97eb4. We can't just add checks to the XML parser once we've accepted such configuration in the past. --- src/conf/domain_conf.c | 22 ---------------------- tests/qemuxml2argvtest.c | 3 +-- 2 files changed, 1 insertion(+), 24 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 48a220f..c4cbbff 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4181,28 +4181,6 @@ virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, } } - /* Validate LUN configuration - * NOTE: virStorageTranslateDiskSourcePool is not run yet, so for - * disk "volume"'s, the closest we can get at config time is - * to ensure mode isn't direct since host/default will allow - * lun/block usage. At run time if it's determined the wrong - * voltype and pooltype values are set, then failure occurs - */ - if (disk->device == VIR_DOMAIN_DISK_DEVICE_LUN && - !(disk->src->type == VIR_STORAGE_TYPE_BLOCK || - (disk->src->type == VIR_STORAGE_TYPE_NETWORK && - disk->src->protocol == VIR_STORAGE_NET_PROTOCOL_ISCSI) || - (disk->src->type == VIR_STORAGE_TYPE_VOLUME && - disk->src->srcpool && - disk->src->srcpool->mode != - VIR_STORAGE_SOURCE_POOL_MODE_DIRECT))) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("disk '%s' improperly configured for a " - "device='lun'"), - disk->dst); - return -1; - } - if (disk->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && virDomainDiskDefAssignAddress(xmlopt, disk, def) < 0) return -1; diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 8842b2f..4211e82 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -780,8 +780,7 @@ mymain(void) DO_TEST_FAILURE("disk-drive-network-rbd-no-colon", NONE); DO_TEST("disk-drive-no-boot", QEMU_CAPS_BOOTINDEX); - DO_TEST_PARSE_ERROR("disk-device-lun-type-invalid", - QEMU_CAPS_VIRTIO_SCSI); + DO_TEST_FAILURE("disk-device-lun-type-invalid", QEMU_CAPS_VIRTIO_SCSI); DO_TEST_FAILURE("disk-usb-nosupport", NONE); DO_TEST("disk-usb-device", QEMU_CAPS_DEVICE_USB_STORAGE, -- 2.8.1

On 05/02/2016 10:32 AM, Peter Krempa wrote:
This reverts commit c79ebf53b5fe0a33bf407b3bcb49e3a27ec97eb4.
We can't just add checks to the XML parser once we've accepted such configuration in the past. --- src/conf/domain_conf.c | 22 ---------------------- tests/qemuxml2argvtest.c | 3 +-- 2 files changed, 1 insertion(+), 24 deletions(-)
There was a bz associated with that commit - that'll need to be addressed in some manner... While I understand your point here, the configuration didn't work - that is it couldn't be started anyway so there could not be a domain running with that configuration and thus it wouldn't disappear on a subsequent reload, hence why checking the config and rejecting "earlier" seemed proper even though we hadn't rejected such a config when the "mode='host'" was first implemented. Commit 'c79ebf53b' is a followup of sorts to commit '33188c9f'... For me it's a NACK, but someone else may feel differently John
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 48a220f..c4cbbff 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4181,28 +4181,6 @@ virDomainDeviceDefPostParseInternal(virDomainDeviceDefPtr dev, } }
- /* Validate LUN configuration - * NOTE: virStorageTranslateDiskSourcePool is not run yet, so for - * disk "volume"'s, the closest we can get at config time is - * to ensure mode isn't direct since host/default will allow - * lun/block usage. At run time if it's determined the wrong - * voltype and pooltype values are set, then failure occurs - */ - if (disk->device == VIR_DOMAIN_DISK_DEVICE_LUN && - !(disk->src->type == VIR_STORAGE_TYPE_BLOCK || - (disk->src->type == VIR_STORAGE_TYPE_NETWORK && - disk->src->protocol == VIR_STORAGE_NET_PROTOCOL_ISCSI) || - (disk->src->type == VIR_STORAGE_TYPE_VOLUME && - disk->src->srcpool && - disk->src->srcpool->mode != - VIR_STORAGE_SOURCE_POOL_MODE_DIRECT))) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("disk '%s' improperly configured for a " - "device='lun'"), - disk->dst); - return -1; - } - if (disk->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && virDomainDiskDefAssignAddress(xmlopt, disk, def) < 0) return -1; diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 8842b2f..4211e82 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -780,8 +780,7 @@ mymain(void) DO_TEST_FAILURE("disk-drive-network-rbd-no-colon", NONE); DO_TEST("disk-drive-no-boot", QEMU_CAPS_BOOTINDEX); - DO_TEST_PARSE_ERROR("disk-device-lun-type-invalid", - QEMU_CAPS_VIRTIO_SCSI); + DO_TEST_FAILURE("disk-device-lun-type-invalid", QEMU_CAPS_VIRTIO_SCSI); DO_TEST_FAILURE("disk-usb-nosupport", NONE); DO_TEST("disk-usb-device", QEMU_CAPS_DEVICE_USB_STORAGE,

On Fri, May 06, 2016 at 11:19:05 -0400, John Ferlan wrote:
On 05/02/2016 10:32 AM, Peter Krempa wrote:
This reverts commit c79ebf53b5fe0a33bf407b3bcb49e3a27ec97eb4.
We can't just add checks to the XML parser once we've accepted such configuration in the past. --- src/conf/domain_conf.c | 22 ---------------------- tests/qemuxml2argvtest.c | 3 +-- 2 files changed, 1 insertion(+), 24 deletions(-)
There was a bz associated with that commit - that'll need to be addressed in some manner...
Well, the initial assesment of that BZ was wrong. This should have been fixed in virt manager at that point.
While I understand your point here, the configuration didn't work - that is it couldn't be started anyway so there could not be a domain running with that configuration and thus it wouldn't disappear on a subsequent reload, hence why checking the config and rejecting "earlier" seemed
It does not kill any running domain, that's right. Defined domains still vanish after that commit if they were defined before. That is still unwanted.
proper even though we hadn't rejected such a config when the "mode='host'" was first implemented.
Only when introducing a feature you are allowed to do a check that rejects parsing XML, afterwards, no such thins should be added.
Commit 'c79ebf53b' is a followup of sorts to commit '33188c9f'...
'33188c9f' is okay. Startup time checks can be added.
For me it's a NACK, but someone else may feel differently
I still insist on reverting the code. This should be fixed in virt manager as in libvirt it creates a vanishing VM problem. Peter

On 05/09/2016 04:59 AM, Peter Krempa wrote:
On Fri, May 06, 2016 at 11:19:05 -0400, John Ferlan wrote:
On 05/02/2016 10:32 AM, Peter Krempa wrote:
This reverts commit c79ebf53b5fe0a33bf407b3bcb49e3a27ec97eb4.
We can't just add checks to the XML parser once we've accepted such configuration in the past. --- src/conf/domain_conf.c | 22 ---------------------- tests/qemuxml2argvtest.c | 3 +-- 2 files changed, 1 insertion(+), 24 deletions(-)
There was a bz associated with that commit - that'll need to be addressed in some manner...
Well, the initial assesment of that BZ was wrong. This should have been fixed in virt manager at that point.
That bz: https://bugzilla.redhat.com/show_bug.cgi?id=1201143 Is loaded with GSS and other 'priority' tags. Maybe the original assessment of the bug is wrong. But just reverting the commit with little mention of that until John dug it up is not helpful IMO
While I understand your point here, the configuration didn't work - that is it couldn't be started anyway so there could not be a domain running with that configuration and thus it wouldn't disappear on a subsequent reload, hence why checking the config and rejecting "earlier" seemed
It does not kill any running domain, that's right. Defined domains still vanish after that commit if they were defined before. That is still unwanted.
Yes, this is problematic, but then again has this issue bitten us in the year since this was committed? We should still fix it, but it's not time critical... Maybe we come up with a better solution.
proper even though we hadn't rejected such a config when the "mode='host'" was first implemented.
Only when introducing a feature you are allowed to do a check that rejects parsing XML, afterwards, no such thins should be added.
Right, these rules make technical sense, but are extremely difficult to audit, and have proven hard to enforce. People wandering into the code may follow the conventions of the code around them, and have no idea that adding a new validation check is 'bad' when the pre-existing similar checks are 'good' strictly based one when they were added. I think we need a better framework for this. I'll probably send a larger mail at some point, but basically I think we should: - rename VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS to something generic like VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATION. The original flag is used at libvirtd startup time to avoid the 'disappearing domain' problem for when a qemu is uninstalled for example, but we still get that validation check for normal runtime XML define - Make an explicit virDomainDefValidate function and move the OSTYPE validation there. - Only run virDomainDefValidate if VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATION is _not_ set. So skip validation at libvirtd startup - Then for example we can move this block of code to that function Doing all that, at least to cover this particular check, shouldn't be much work, won't potentially revive the bug, _and_ it sets us up some nice infrastructure to avoid this type of problem in the future. Thanks, Cole

On Mon, May 09, 2016 at 11:57:20AM -0400, Cole Robinson wrote:
On 05/09/2016 04:59 AM, Peter Krempa wrote:
On Fri, May 06, 2016 at 11:19:05 -0400, John Ferlan wrote:
On 05/02/2016 10:32 AM, Peter Krempa wrote:
This reverts commit c79ebf53b5fe0a33bf407b3bcb49e3a27ec97eb4.
We can't just add checks to the XML parser once we've accepted such configuration in the past.
But we can just revert the check once we've accepted it? :)
--- src/conf/domain_conf.c | 22 ---------------------- tests/qemuxml2argvtest.c | 3 +-- 2 files changed, 1 insertion(+), 24 deletions(-)
There was a bz associated with that commit - that'll need to be addressed in some manner...
Well, the initial assesment of that BZ was wrong. This should have been fixed in virt manager at that point.
That bz: https://bugzilla.redhat.com/show_bug.cgi?id=1201143
Is loaded with GSS and other 'priority' tags. Maybe the original assessment of the bug is wrong. But just reverting the commit with little mention of that until John dug it up is not helpful IMO
Neither is discussing a private BZ on a public list: "You are not authorized to access bug #1201143. To see this bug, you must first log in to an account with the appropriate permissions."
While I understand your point here, the configuration didn't work - that is it couldn't be started anyway so there could not be a domain running with that configuration and thus it wouldn't disappear on a subsequent reload, hence why checking the config and rejecting "earlier" seemed
It does not kill any running domain, that's right. Defined domains still vanish after that commit if they were defined before. That is still unwanted.
Yes, this is problematic, but then again has this issue bitten us in the year since this was committed? We should still fix it, but it's not time critical... Maybe we come up with a better solution.
proper even though we hadn't rejected such a config when the "mode='host'" was first implemented.
Only when introducing a feature you are allowed to do a check that rejects parsing XML, afterwards, no such thins should be added.
Right, these rules make technical sense, but are extremely difficult to audit, and have proven hard to enforce. People wandering into the code may follow the conventions of the code around them, and have no idea that adding a new validation check is 'bad' when the pre-existing similar checks are 'good' strictly based one when they were added.
I think we need a better framework for this. I'll probably send a larger mail at some point, but basically I think we should:
- rename VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS to something generic like VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATION. The original flag is used at libvirtd startup time to avoid the 'disappearing domain' problem for when a qemu is uninstalled for example, but we still get that validation check for normal runtime XML define
The problem with skipping validation is that we will end up with invalid domains being defined, breaking assumptions in other parts of libvirt. That way we would have to duplicate the checks on domain startup too (which would not be such a problem if they were all in the same function). For example: commit 21b316f4d351859d9ccbf8a20199f7e8707fd51d qemu: error out on missing machine type in configs which I added after someone tried to put the machine XML directly in /etc/libvirt. There's no point in allowing the user to (try to) start such a domain, but currently we treat such unvalidated XML the same as XML from a fresh define. I think I recall an attempt to introudce an 'invalid' state, where you could not start the domain, but it was editable by libvirt APIs. Unfortunately I cannot find it in the archives. Does anyone remember what happened to it? Jan

On 05/10/2016 03:30 AM, Ján Tomko wrote:
On Mon, May 09, 2016 at 11:57:20AM -0400, Cole Robinson wrote:
On 05/09/2016 04:59 AM, Peter Krempa wrote:
On Fri, May 06, 2016 at 11:19:05 -0400, John Ferlan wrote:
On 05/02/2016 10:32 AM, Peter Krempa wrote:
This reverts commit c79ebf53b5fe0a33bf407b3bcb49e3a27ec97eb4.
We can't just add checks to the XML parser once we've accepted such configuration in the past.
But we can just revert the check once we've accepted it? :)
--- src/conf/domain_conf.c | 22 ---------------------- tests/qemuxml2argvtest.c | 3 +-- 2 files changed, 1 insertion(+), 24 deletions(-)
There was a bz associated with that commit - that'll need to be addressed in some manner...
Well, the initial assesment of that BZ was wrong. This should have been fixed in virt manager at that point.
That bz: https://bugzilla.redhat.com/show_bug.cgi?id=1201143
Is loaded with GSS and other 'priority' tags. Maybe the original assessment of the bug is wrong. But just reverting the commit with little mention of that until John dug it up is not helpful IMO
Neither is discussing a private BZ on a public list: "You are not authorized to access bug #1201143. To see this bug, you must first log in to an account with the appropriate permissions."
Personally not sure why it's private, seeing as that "decision point" doesn't seem to have any consistency. For some bugs, it's easy, but for this one, it's not clear. In any case, there's an external bug id attached to it as well - although again - no idea how to point someone at that. There's also errata associated with it.. There's another private bug that was used to generate the "original" adjustments that were only "run-time" related that resulted in commit id '33188c9fc' - the bz associated with that uses the same external customer id. Personally what I find interesting about the bug report is that the consumer was indicating that the configuration should have been rejected at the libvirt level. I cannot recall the details or any discussion that happened last June over this (hence why I tend to be more verbose either in code commits/comments, list responses, or bzs). My point in NACK'ing wasn't that this isn't the right thing to do, it was more towards what is going to be done to address the bz that was associated with this. I'm not sure it'd be "good form" to just revert something causing the original problem to surface again months later without also dealing with the cause. Although perhaps by this time that type of configuration is avoided.
While I understand your point here, the configuration didn't work - that is it couldn't be started anyway so there could not be a domain running with that configuration and thus it wouldn't disappear on a subsequent reload, hence why checking the config and rejecting "earlier" seemed
It does not kill any running domain, that's right. Defined domains still vanish after that commit if they were defined before. That is still unwanted.
Yes, this is problematic, but then again has this issue bitten us in the year since this was committed? We should still fix it, but it's not time critical... Maybe we come up with a better solution.
Recalling and following the bz text/comments - this whole domain disappearing was brought up prior to posting a patch upstream, but alas the memory cannot recall the details of why I pushed forward anyway. I probably leaned more heavily towards the side of what the 'consumer' was requesting in my decision.
proper even though we hadn't rejected such a config when the "mode='host'" was first implemented.
Only when introducing a feature you are allowed to do a check that rejects parsing XML, afterwards, no such thins should be added.
Right, these rules make technical sense, but are extremely difficult to audit, and have proven hard to enforce. People wandering into the code may follow the conventions of the code around them, and have no idea that adding a new validation check is 'bad' when the pre-existing similar checks are 'good' strictly based one when they were added.
I think we need a better framework for this. I'll probably send a larger mail at some point, but basically I think we should:
- rename VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS to something generic like VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATION. The original flag is used at libvirtd startup time to avoid the 'disappearing domain' problem for when a qemu is uninstalled for example, but we still get that validation check for normal runtime XML define
The problem with skipping validation is that we will end up with invalid domains being defined, breaking assumptions in other parts of libvirt. That way we would have to duplicate the checks on domain startup too (which would not be such a problem if they were all in the same function).
For example: commit 21b316f4d351859d9ccbf8a20199f7e8707fd51d qemu: error out on missing machine type in configs which I added after someone tried to put the machine XML directly in /etc/libvirt.
There's no point in allowing the user to (try to) start such a domain, but currently we treat such unvalidated XML the same as XML from a fresh define.
Maybe some new API virIsDomainRunable (or some better name) that could be generated by carefully extracting checks from the qemu_command line build processing in order to make all those run/build time checks. Then command line building just fails if there's some sort of problem with the actual building rather than the config. Some of those checks are very specific configuration checks for supportable cross device or architecture validation. We don't provide an easy way for our consumers to validate something is good or bad until run time when it's rejected. For some consumers, that's a bad time to make that decision.
I think I recall an attempt to introudce an 'invalid' state, where you could not start the domain, but it was editable by libvirt APIs. Unfortunately I cannot find it in the archives. Does anyone remember what happened to it?
This one I believe is what you're looking for: http://www.redhat.com/archives/libvir-list/2015-December/msg00027.html John

On 05/10/2016 07:15 AM, John Ferlan wrote:
On 05/10/2016 03:30 AM, Ján Tomko wrote:
On Mon, May 09, 2016 at 11:57:20AM -0400, Cole Robinson wrote:
On 05/09/2016 04:59 AM, Peter Krempa wrote:
On Fri, May 06, 2016 at 11:19:05 -0400, John Ferlan wrote:
On 05/02/2016 10:32 AM, Peter Krempa wrote: proper even though we hadn't rejected such a config when the "mode='host'" was first implemented.
Only when introducing a feature you are allowed to do a check that rejects parsing XML, afterwards, no such thins should be added.
Right, these rules make technical sense, but are extremely difficult to audit, and have proven hard to enforce. People wandering into the code may follow the conventions of the code around them, and have no idea that adding a new validation check is 'bad' when the pre-existing similar checks are 'good' strictly based one when they were added.
I think we need a better framework for this. I'll probably send a larger mail at some point, but basically I think we should:
- rename VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS to something generic like VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATION. The original flag is used at libvirtd startup time to avoid the 'disappearing domain' problem for when a qemu is uninstalled for example, but we still get that validation check for normal runtime XML define
The problem with skipping validation is that we will end up with invalid domains being defined, breaking assumptions in other parts of libvirt. That way we would have to duplicate the checks on domain startup too (which would not be such a problem if they were all in the same function).
For example: commit 21b316f4d351859d9ccbf8a20199f7e8707fd51d qemu: error out on missing machine type in configs which I added after someone tried to put the machine XML directly in /etc/libvirt.
There's no point in allowing the user to (try to) start such a domain, but currently we treat such unvalidated XML the same as XML from a fresh define.
Maybe some new API virIsDomainRunable (or some better name) that could be generated by carefully extracting checks from the qemu_command line build processing in order to make all those run/build time checks. Then command line building just fails if there's some sort of problem with the actual building rather than the config. Some of those checks are very specific configuration checks for supportable cross device or architecture validation. We don't provide an easy way for our consumers to validate something is good or bad until run time when it's rejected. For some consumers, that's a bad time to make that decision.
We already have qemuBuildCommandLineValidate which has a lot of those checks already. Moving validation out of the cli builder into that function has a bunch of benefits: - Like you suggest, we can re-use that function for validation at different points if we want - Simplifies the command line building code, which is the more interesting stuff - Will make eventual code coverage analysis easier. We don't need to worry too much about covering every validation check, but we _definitely_ want to cover every bit of the cli building. It'll be easier to spot the uncovered lines if we don't have validation mixed in
I think I recall an attempt to introudce an 'invalid' state, where you could not start the domain, but it was editable by libvirt APIs. Unfortunately I cannot find it in the archives. Does anyone remember what happened to it?
This one I believe is what you're looking for:
http://www.redhat.com/archives/libvir-list/2015-December/msg00027.html
Ooops, I just mailed the same link :) Should have read the whole thread first :) - Cole

On 05/10/2016 03:30 AM, Ján Tomko wrote:
On Mon, May 09, 2016 at 11:57:20AM -0400, Cole Robinson wrote:
On 05/09/2016 04:59 AM, Peter Krempa wrote:
On Fri, May 06, 2016 at 11:19:05 -0400, John Ferlan wrote:
--- src/conf/domain_conf.c | 22 ---------------------- tests/qemuxml2argvtest.c | 3 +-- 2 files changed, 1 insertion(+), 24 deletions(-)
There was a bz associated with that commit - that'll need to be addressed in some manner...
Well, the initial assesment of that BZ was wrong. This should have been fixed in virt manager at that point.
That bz: https://bugzilla.redhat.com/show_bug.cgi?id=1201143
Is loaded with GSS and other 'priority' tags. Maybe the original assessment of the bug is wrong. But just reverting the commit with little mention of that until John dug it up is not helpful IMO
Neither is discussing a private BZ on a public list: "You are not authorized to access bug #1201143. To see this bug, you must first log in to an account with the appropriate permissions."
Apologies, I didn't notice it was private.
While I understand your point here, the configuration didn't work - that is it couldn't be started anyway so there could not be a domain running with that configuration and thus it wouldn't disappear on a subsequent reload, hence why checking the config and rejecting "earlier" seemed
It does not kill any running domain, that's right. Defined domains still vanish after that commit if they were defined before. That is still unwanted.
Yes, this is problematic, but then again has this issue bitten us in the year since this was committed? We should still fix it, but it's not time critical... Maybe we come up with a better solution.
proper even though we hadn't rejected such a config when the "mode='host'" was first implemented.
Only when introducing a feature you are allowed to do a check that rejects parsing XML, afterwards, no such thins should be added.
Right, these rules make technical sense, but are extremely difficult to audit, and have proven hard to enforce. People wandering into the code may follow the conventions of the code around them, and have no idea that adding a new validation check is 'bad' when the pre-existing similar checks are 'good' strictly based one when they were added.
I think we need a better framework for this. I'll probably send a larger mail at some point, but basically I think we should:
- rename VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS to something generic like VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATION. The original flag is used at libvirtd startup time to avoid the 'disappearing domain' problem for when a qemu is uninstalled for example, but we still get that validation check for normal runtime XML define
The problem with skipping validation is that we will end up with invalid domains being defined, breaking assumptions in other parts of libvirt. That way we would have to duplicate the checks on domain startup too (which would not be such a problem if they were all in the same function).
For example: commit 21b316f4d351859d9ccbf8a20199f7e8707fd51d qemu: error out on missing machine type in configs which I added after someone tried to put the machine XML directly in /etc/libvirt.
I think those types of errors are actually going to be limited. I can't think of many cases where we set defaults in the generic parser that would have to be skipped on libvirtd load. The arch+ostype+virt+machine stuff is pretty special in that the default setting is heavily intertwined with the validation. But yes if someone went with my approach they would need to do so carefully. Part of that would be moving all the setting of defaults out of the XML parser as well and into an explicit PostParse type function, so it's at least easier to audit.
There's no point in allowing the user to (try to) start such a domain, but currently we treat such unvalidated XML the same as XML from a fresh define.
I think I recall an attempt to introudce an 'invalid' state, where you could not start the domain, but it was editable by libvirt APIs. Unfortunately I cannot find it in the archives. Does anyone remember what happened to it?
Last patches I found are: https://www.redhat.com/archives/libvir-list/2015-December/msg00027.html But I didn't read the discussion to see why it stalled Thanks, Cole

On Fri, May 06, 2016 at 11:19:05 -0400, John Ferlan wrote:
On 05/02/2016 10:32 AM, Peter Krempa wrote:
This reverts commit c79ebf53b5fe0a33bf407b3bcb49e3a27ec97eb4.
We can't just add checks to the XML parser once we've accepted such configuration in the past. --- src/conf/domain_conf.c | 22 ---------------------- tests/qemuxml2argvtest.c | 3 +-- 2 files changed, 1 insertion(+), 24 deletions(-)
There was a bz associated with that commit - that'll need to be addressed in some manner...
I now have an even better justification for this that I didn't notice earlier. As of commits c820fbff9fbfe1f2549a5b60967496587f8d8bfc and 82ba41108acd0f3fc99f62effc7422a1e1e7c944 respectively we allow actually to use the native iSCSI transport for disk type=lun so basically this commit allows to actually use this with the volumes. I'll post it with fixed commit message. This basically should have been part of the latter commit. Peter
participants (4)
-
Cole Robinson
-
John Ferlan
-
Ján Tomko
-
Peter Krempa