[libvirt] [PATCH 0/3] Patches for some device mapper multipath issues

Patch 1 just makes sure that prior to calling refreshPool we make sure to clear the objects; otherwise, the refresh code will duplicate. Text within patch 3 seems to indicate that entries were showing up more than once - although those may be bug related. In any case, although perhaps not necessary here - better safe than sorry. Patch 2 is a rework of changes that went into 1.3.2 related to how libvirt_parthelper generates the volume target path name that didn't quite get the algorithm right when the device source path ended with a non numeric value. As it turns out it seems the naming algorithm is much "simpler"; however, rather than just remove the attribute, I figured it should be kept "just in case" something *needed* to add that "p" it would be possible. It seems from output described here: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/htm... That it would be possible. Patch 3 was probably seen because Patch 2 testing occurs, so the same "sequence" used in other disk backend based testing was attempted. As it turns out, I don't believe the deletion for a multipath device ever really worked. The original patch could have been reverted except that it intermingled changes to support linking device-mapper with parthelper. John Ferlan (3): storage: Need to clear pool prior to calling the refreshPool storage: Fix algorithm generating path names for devmapper storage: Fix virStorageBackendDiskDeleteVol for device mapper configure.ac | 15 ++---- docs/formatstorage.html.in | 18 ++----- src/storage/parthelper.c | 13 +++-- src/storage/storage_backend_disk.c | 108 ++++++++++++++++++++++++------------- src/storage/storage_driver.c | 3 ++ 5 files changed, 89 insertions(+), 68 deletions(-) -- 2.5.5

Prior to calling the 'refreshPool' during CreatePool or UploadPool operations, we need to clear the pool; otherwise, the pool will have duplicated entries. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_driver.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c index 1d42f24..bf05b19 100644 --- a/src/storage/storage_driver.c +++ b/src/storage/storage_driver.c @@ -727,6 +727,7 @@ storagePoolCreateXML(virConnectPtr conn, stateFile = virFileBuildPath(driver->stateDir, pool->def->name, ".xml"); + virStoragePoolObjClearVols(pool); if (!stateFile || virStoragePoolSaveState(stateFile, pool->def) < 0 || backend->refreshPool(conn, pool) < 0) { if (stateFile) @@ -918,6 +919,7 @@ storagePoolCreate(virStoragePoolPtr obj, stateFile = virFileBuildPath(driver->stateDir, pool->def->name, ".xml"); + virStoragePoolObjClearVols(pool); if (!stateFile || virStoragePoolSaveState(stateFile, pool->def) < 0 || backend->refreshPool(obj->conn, pool) < 0) { if (stateFile) @@ -2356,6 +2358,7 @@ storageVolUpload(virStorageVolPtr obj, * interaction and we can just lookup the backend in the callback * routine in order to call the refresh API. */ + virStoragePoolObjClearVols(pool); if (backend->refreshPool) { if (VIR_ALLOC(cbdata) < 0 || VIR_STRDUP(cbdata->pool_name, pool->def->name) < 0) -- 2.5.5

On 10.05.2016 15:26, John Ferlan wrote:
Prior to calling the 'refreshPool' during CreatePool or UploadPool operations, we need to clear the pool; otherwise, the pool will have duplicated entries.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_driver.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c index 1d42f24..bf05b19 100644 --- a/src/storage/storage_driver.c +++ b/src/storage/storage_driver.c @@ -727,6 +727,7 @@ storagePoolCreateXML(virConnectPtr conn, stateFile = virFileBuildPath(driver->stateDir, pool->def->name, ".xml");
+ virStoragePoolObjClearVols(pool); if (!stateFile || virStoragePoolSaveState(stateFile, pool->def) < 0 || backend->refreshPool(conn, pool) < 0) { if (stateFile) @@ -918,6 +919,7 @@ storagePoolCreate(virStoragePoolPtr obj, stateFile = virFileBuildPath(driver->stateDir, pool->def->name, ".xml");
+ virStoragePoolObjClearVols(pool); if (!stateFile || virStoragePoolSaveState(stateFile, pool->def) < 0 || backend->refreshPool(obj->conn, pool) < 0) { if (stateFile) @@ -2356,6 +2358,7 @@ storageVolUpload(virStorageVolPtr obj, * interaction and we can just lookup the backend in the callback * routine in order to call the refresh API. */ + virStoragePoolObjClearVols(pool); if (backend->refreshPool) { if (VIR_ALLOC(cbdata) < 0 || VIR_STRDUP(cbdata->pool_name, pool->def->name) < 0)
While there's nothing wrong with this patch, after it we have this pair of ClearVols() + refreshPool() occurring all over the place. Should we create a wrapper function over it so we not miss anything similar in the future? Michal

[...]
While there's nothing wrong with this patch, after it we have this pair of ClearVols() + refreshPool() occurring all over the place. Should we create a wrapper function over it so we not miss anything similar in the future?
I've thrashed that idea around a few times as well... Never quite sure if some sort of 'virStoragePoolRefreshPool()' should be generated or to make the virStoragePoolObjClearVols() call in each refreshPool callback. To a degree, I see value in forcing each backend to make the decision whether to "update"/refresh it's list or just take the large hammer method and generate the list all over since it's a lot easier than knowing what your current list is, deciding if the next thing you find is new or not, and then deciding what was on the list before you started the refresh and no longer is found, thus must be removed. Currently each backend just takes the large hammer approach. John

https://bugzilla.redhat.com/show_bug.cgi?id=1265694 Commit id '020135dc' didn't quite get the algorithm correct when a device mapper source ended with a non numeric value (e.g. ends with an alphabet value). This patch modifies the 'part_separator' logic to add the "p" separator to the attempted target path name only when specified as part_separator='yes'. For a source name that already ends with a number, the logic doesn't change as the part separator would need to be there. For a source name that ends with something other than a number, this allows the possibility that a "p" separator can be added. The default for one of these source devices is to not add the separator. The key for device mapper and the need for a partition separator "p" is the presence of a number in the last character of the device name link in /dev/mapper. A name such as "/dev/mapper/mpatha1" would generate a "/dev/mapper/mpatha1p1" partition, while "/dev/mapper/mpatha" would generate partition "/dev/mapper/mpatha1". Similarly for a device mapper entry not using friendly names or an alias, a device such as "/dev/mapper/3600a0b80005b10ca00005ad656fd8d93" would generate a paritition "/dev/mapper/3600a0b80005b10ca00005ad656fd8d93p1", while a device such as "/dev/mapper/3600a0b80005b10ca00005e115729093f" would generate a partition "/dev/mapper/3600a0b80005b10ca00005e115729093f1". The long number is the WWID of the device. It's also possible to assign an alias for a device mapper entry, that alias follows the same rules with respect to ending with a number or not when adding a "p" to create the target device path. Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatstorage.html.in | 18 +++++------------- src/storage/parthelper.c | 11 ++++++----- src/storage/storage_backend_disk.c | 9 +++++---- 3 files changed, 16 insertions(+), 22 deletions(-) diff --git a/docs/formatstorage.html.in b/docs/formatstorage.html.in index 0356182..94277a1 100644 --- a/docs/formatstorage.html.in +++ b/docs/formatstorage.html.in @@ -134,19 +134,11 @@ <code>path</code> may be supplied. Valid values for the attribute may be either "yes" or "no". This attribute is to be used for a <code>disk</code> pool type using a <code>path</code> to a - device mapper multipath device configured to utilize either - 'user_friendly_names' or a custom 'alias' name in the - /etc/multipath.conf. The attribute directs libvirt to not - generate device volume names with the partition character "p". - By default, when libvirt generates the partition names for - device mapper multipath devices it will add a "p" path separator - to the device name before adding the partition number. For example, - a <code>device path</code> of '/dev/mapper/mpatha' libvirt would - generate partition names of '/dev/mapper/mpathap1', - '/dev/mapper/mpathap2', etc. for each partition found. With - this attribute set to "no", libvirt will not append the "p" to - the name unless it ends with a number thus generating names - of '/dev/mapper/mpatha1', '/dev/mapper/mpatha2', etc. + device mapper multipath device. Setting the attribute to "yes" + causes libvirt to attempt to generate and find target volume path's + using a "p" separator. The default algorithm used by device mapper + is to add the "p" separator only when the source device path ends + with a number. <span class="since">Since 1.3.1</span></p></dd> <dt><code>dir</code></dt> <dd>Provides the source for pools backed by directories (pool diff --git a/src/storage/parthelper.c b/src/storage/parthelper.c index 6695f23..d81b177 100644 --- a/src/storage/parthelper.c +++ b/src/storage/parthelper.c @@ -69,7 +69,7 @@ int main(int argc, char **argv) const char *path; char *canonical_path; const char *partsep; - bool devmap_nopartsep = false; + bool devmap_partsep = false; if (virGettextInitialize() < 0) exit(EXIT_FAILURE); @@ -77,7 +77,7 @@ int main(int argc, char **argv) if (argc == 3 && STREQ(argv[2], "-g")) { cmd = DISK_GEOMETRY; } else if (argc == 3 && STREQ(argv[2], "-p")) { - devmap_nopartsep = true; + devmap_partsep = true; } else if (argc != 2) { fprintf(stderr, _("syntax: %s DEVICE [-g]|[-p]\n"), argv[0]); return 1; @@ -85,10 +85,11 @@ int main(int argc, char **argv) path = argv[1]; if (virIsDevMapperDevice(path)) { - /* The 'devmap_nopartsep' option will not append the partsep of "p" - * onto the name unless the 'path' ends in a number + /* If the path ends with a number or we explicitly request it for + * path, then append the "p" partition separator. Otherwise, if + * the path ends with a letter already, then no need for a separator. */ - if (c_isdigit(path[strlen(path)-1]) || !devmap_nopartsep) + if (c_isdigit(path[strlen(path)-1]) || devmap_partsep) partsep = "p"; else partsep = ""; diff --git a/src/storage/storage_backend_disk.c b/src/storage/storage_backend_disk.c index 3e0395d..eadf8a3 100644 --- a/src/storage/storage_backend_disk.c +++ b/src/storage/storage_backend_disk.c @@ -325,13 +325,14 @@ virStorageBackendDiskReadPartitions(virStoragePoolObjPtr pool, pool->def->source.devices[0].path, NULL); - /* Check for the presence of the part_separator='no'. Pass this + /* Check for the presence of the part_separator='yes'. Pass this * along to the libvirt_parthelper as option '-p'. This will cause - * libvirt_parthelper to not append the "p" partition separator to - * the generated device name, unless the name ends with a number. + * libvirt_parthelper to append the "p" partition separator to + * the generated device name for a source device which ends with + * a non-numeric value (e.g. mpatha would generate mpathap#). */ if (pool->def->source.devices[0].part_separator == - VIR_TRISTATE_BOOL_NO) + VIR_TRISTATE_BOOL_YES) virCommandAddArg(cmd, "-p"); /* If a volume is passed, virStorageBackendDiskMakeVol only updates the -- 2.5.5

Commit id 'df1011ca8' modified virStorageBackendDiskDeleteVol to use "dmsetup remove --force" to remove the volume, but left things in an inconsistent state since the partition still existed on the disk and only the device mapper device (/dev/dm-#) was removed. Prior to commit '1895b421' (or '1ffd82bb' and '471e1c4e'), this could go unnoticed since virStorageBackendDiskRefreshPool wasn't called. However, the pool would be unusable since the /dev/dm-# device would be removed even though the partition was not removed unless a multipathd restart reset the link. That would of course make the volume appear again in the pool after a refresh or pool start after libvirt reload. This patch removes the 'dmsetup' logic and re-implements the partition deletion logic for device mapper devices. The removal of the partition via 'parted rm --script #' will cause udev device change logic to allow multipathd to handle removing the dm-* device associated with the partition. Signed-off-by: John Ferlan <jferlan@redhat.com> --- configure.ac | 15 ++---- src/storage/parthelper.c | 2 + src/storage/storage_backend_disk.c | 99 +++++++++++++++++++++++++------------- 3 files changed, 70 insertions(+), 46 deletions(-) diff --git a/configure.ac b/configure.ac index d19c1a9..c3f0add 100644 --- a/configure.ac +++ b/configure.ac @@ -1957,19 +1957,12 @@ LIBPARTED_LIBS= if test "$with_storage_disk" = "yes" || test "$with_storage_disk" = "check"; then AC_PATH_PROG([PARTED], [parted], [], [$PATH:/sbin:/usr/sbin]) - AC_PATH_PROG([DMSETUP], [dmsetup], [], [$PATH:/sbin:/usr/sbin]) if test -z "$PARTED" ; then PARTED_FOUND=no else PARTED_FOUND=yes fi - if test -z "$DMSETUP" ; then - DMSETUP_FOUND=no - else - DMSETUP_FOUND=yes - fi - if test "$PARTED_FOUND" = "yes" && test "x$PKG_CONFIG" != "x" ; then PKG_CHECK_MODULES([LIBPARTED], [libparted >= $PARTED_REQUIRED], [], [PARTED_FOUND=no]) @@ -1988,12 +1981,12 @@ if test "$with_storage_disk" = "yes" || fi if test "$with_storage_disk" = "yes" && - test "$PARTED_FOUND:$DMSETUP_FOUND" != "yes:yes"; then - AC_MSG_ERROR([Need both parted and dmsetup for disk storage driver]) + test "$PARTED_FOUND" != "yes"; then + AC_MSG_ERROR([Need parted for disk storage driver]) fi if test "$with_storage_disk" = "check"; then - if test "$PARTED_FOUND:$DMSETUP_FOUND" != "yes:yes"; then + if test "$PARTED_FOUND" != "yes"; then with_storage_disk=no else with_storage_disk=yes @@ -2005,8 +1998,6 @@ if test "$with_storage_disk" = "yes" || [whether Disk backend for storage driver is enabled]) AC_DEFINE_UNQUOTED([PARTED],["$PARTED"], [Location or name of the parted program]) - AC_DEFINE_UNQUOTED([DMSETUP],["$DMSETUP"], - [Location or name of the dmsetup program]) fi fi AM_CONDITIONAL([WITH_STORAGE_DISK], [test "$with_storage_disk" = "yes"]) diff --git a/src/storage/parthelper.c b/src/storage/parthelper.c index d81b177..afdaa1c 100644 --- a/src/storage/parthelper.c +++ b/src/storage/parthelper.c @@ -83,6 +83,8 @@ int main(int argc, char **argv) return 1; } + /* NB: Changes to the following algorithm will need corresponding + * changes to virStorageBackendDiskDeleteVol */ path = argv[1]; if (virIsDevMapperDevice(path)) { /* If the path ends with a number or we explicitly request it for diff --git a/src/storage/storage_backend_disk.c b/src/storage/storage_backend_disk.c index eadf8a3..666ad03 100644 --- a/src/storage/storage_backend_disk.c +++ b/src/storage/storage_backend_disk.c @@ -799,6 +799,31 @@ virStorageBackendDiskPartBoundaries(virStoragePoolObjPtr pool, } +/* virStorageBackendDiskDeleteVol + * @conn: Pointer to a libvirt connection + * @pool: Pointer to the storage pool + * @vol: Pointer to the volume definition + * @flags: flags (unused for now) + * + * This API will remove the disk volume partition either from direct + * API call or as an error path during creation when the partition + * name provided during create doesn't match the name read from + * virStorageBackendDiskReadPartitions. + * + * For a device mapper device, device respresentation is dependant upon + * device mapper configuration, but the general rule of thumb is that at + * creation if a device name ends with a number, then a partition separator + * "p" is added to the created name; otherwise, if the device name doesn't + * end with a number, then there is no partition separator. This name is + * what ends up in the vol->target.path. This ends up being a link to a + * /dev/mapper/dm-# device which cannot be used in the algorithm to determine + * which partition to remove, but a properly handled target.path can be. + * + * For non device mapper devices, just need to resolve the link of the + * vol->target.path in order to get the path. + * + * Returns 0 on success, -1 on failure with error message set. + */ static int virStorageBackendDiskDeleteVol(virConnectPtr conn, virStoragePoolObjPtr pool, @@ -807,7 +832,9 @@ virStorageBackendDiskDeleteVol(virConnectPtr conn, { char *part_num = NULL; char *devpath = NULL; - char *dev_name, *srcname; + char *dev_name; + char *src_path = pool->def->source.devices[0].path; + char *srcname = last_component(src_path); virCommandPtr cmd = NULL; bool isDevMapperDevice; int rc = -1; @@ -817,56 +844,60 @@ virStorageBackendDiskDeleteVol(virConnectPtr conn, if (!vol->target.path) { virReportError(VIR_ERR_INVALID_ARG, _("volume target path empty for source path '%s'"), - pool->def->source.devices[0].path); + src_path); return -1; } - if (virFileResolveLink(vol->target.path, &devpath) < 0) { - virReportSystemError(errno, - _("Couldn't read volume target path '%s'"), - vol->target.path); - goto cleanup; + /* NB: This is the corollary to the algorithm in libvirt_parthelper + * (parthelper.c) that is used to generate the target.path name + * for use by libvirt. Changes to either, need to be reflected + * in both places */ + isDevMapperDevice = virIsDevMapperDevice(vol->target.path); + if (isDevMapperDevice) { + dev_name = last_component(vol->target.path); + } else { + if (virFileResolveLink(vol->target.path, &devpath) < 0) { + virReportSystemError(errno, + _("Couldn't read volume target path '%s'"), + vol->target.path); + goto cleanup; + } + dev_name = last_component(devpath); } - dev_name = last_component(devpath); - srcname = last_component(pool->def->source.devices[0].path); VIR_DEBUG("dev_name=%s, srcname=%s", dev_name, srcname); - isDevMapperDevice = virIsDevMapperDevice(devpath); - - if (!isDevMapperDevice && !STRPREFIX(dev_name, srcname)) { + if (!STRPREFIX(dev_name, srcname)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Volume path '%s' did not start with parent " "pool source device name."), dev_name); goto cleanup; } - if (!isDevMapperDevice) { - part_num = dev_name + strlen(srcname); + part_num = dev_name + strlen(srcname); - if (*part_num == 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("cannot parse partition number from target " - "'%s'"), dev_name); - goto cleanup; - } + /* For device mapper and we have a partition character 'p' as the + * current character, let's move beyond that before checking part_num */ + if (isDevMapperDevice && *part_num == 'p') + part_num++; - /* eg parted /dev/sda rm 2 */ - cmd = virCommandNewArgList(PARTED, - pool->def->source.devices[0].path, - "rm", - "--script", - part_num, - NULL); - if (virCommandRun(cmd, NULL) < 0) - goto cleanup; - } else { - cmd = virCommandNewArgList(DMSETUP, "remove", "--force", devpath, NULL); - - if (virCommandRun(cmd, NULL) < 0) - goto cleanup; + if (*part_num == 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse partition number from target " + "'%s'"), dev_name); + goto cleanup; } + /* eg parted /dev/sda rm 2 or /dev/mapper/mpathc rm 2 */ + cmd = virCommandNewArgList(PARTED, + src_path, + "rm", + "--script", + part_num, + NULL); + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + /* Refreshing the pool is the easiest option as LOGICAL and EXTENDED * partition allocation/capacity management is handled within * virStorageBackendDiskMakeDataVol and trying to redo that logic -- 2.5.5

On 10.05.2016 15:26, John Ferlan wrote:
Patch 1 just makes sure that prior to calling refreshPool we make sure to clear the objects; otherwise, the refresh code will duplicate. Text within patch 3 seems to indicate that entries were showing up more than once - although those may be bug related. In any case, although perhaps not necessary here - better safe than sorry.
Patch 2 is a rework of changes that went into 1.3.2 related to how libvirt_parthelper generates the volume target path name that didn't quite get the algorithm right when the device source path ended with a non numeric value. As it turns out it seems the naming algorithm is much "simpler"; however, rather than just remove the attribute, I figured it should be kept "just in case" something *needed* to add that "p" it would be possible. It seems from output described here:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/htm...
That it would be possible.
Patch 3 was probably seen because Patch 2 testing occurs, so the same "sequence" used in other disk backend based testing was attempted. As it turns out, I don't believe the deletion for a multipath device ever really worked. The original patch could have been reverted except that it intermingled changes to support linking device-mapper with parthelper.
John Ferlan (3): storage: Need to clear pool prior to calling the refreshPool storage: Fix algorithm generating path names for devmapper storage: Fix virStorageBackendDiskDeleteVol for device mapper
configure.ac | 15 ++---- docs/formatstorage.html.in | 18 ++----- src/storage/parthelper.c | 13 +++-- src/storage/storage_backend_disk.c | 108 ++++++++++++++++++++++++------------- src/storage/storage_driver.c | 3 ++ 5 files changed, 89 insertions(+), 68 deletions(-)
ACK Michal
participants (2)
-
John Ferlan
-
Michal Privoznik