[libvirt] [PATCH v2 0/6] storage: handle scsi/iscsi error paths better

v1 here: http://www.redhat.com/archives/libvir-list/2015-March/msg01562.html changes: 1/6: Removed the virGetLastError() == NULL check 2/6: New - reading more closely showed an error path without a goto 3/6: New - Adjust return for virStorageBackendSCSINewLun to be able to differentiate real error vs. deciding to return because a stable path could not be determined 4/6: New - Found an unused variable in processLU 5/6: Adjust the logic to return -1 for "real" errors and -2 for errors that are "recoverable". Removed the need for goto cleanup. 6/6: Add check for non-existent stable path at startup (and fail). Add check during refresh for a found non standard stable path that returns no volumes in the pool John Ferlan (6): iscsi: Use error message from virStorageBackendSCSIFindLUs iscsi: Fix exit path for virStorageBackendISCSIFindLUs failure scsi: Adjust return value for virStorageBackendSCSINewLun scsi: Remove unused 'type_path' in processLU scsi: Adjust return values from processLU iscsi: Add checks for non standard stable target.path src/storage/storage_backend_iscsi.c | 33 ++++++++++++++++----- src/storage/storage_backend_scsi.c | 59 ++++++++++++++++++++++++------------- 2 files changed, 64 insertions(+), 28 deletions(-) -- 2.1.0

Don't supercede the error message virStorageBackendSCSIFindLUs as the message such as "error: Failed to find LUs on host 60: ..." is not overly clear as to what the real problem might be. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_iscsi.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/src/storage/storage_backend_iscsi.c b/src/storage/storage_backend_iscsi.c index 079c767..48f0957 100644 --- a/src/storage/storage_backend_iscsi.c +++ b/src/storage/storage_backend_iscsi.c @@ -146,11 +146,8 @@ virStorageBackendISCSIFindLUs(virStoragePoolObjPtr pool, retval = -1; } - if (virStorageBackendSCSIFindLUs(pool, host) < 0) { - virReportSystemError(errno, - _("Failed to find LUs on host %u"), host); + if (virStorageBackendSCSIFindLUs(pool, host) < 0) retval = -1; - } VIR_FREE(sysfs_path); -- 2.1.0

On Wed, Apr 01, 2015 at 01:29:06PM -0400, John Ferlan wrote:
Don't supercede the error message virStorageBackendSCSIFindLUs as the message such as "error: Failed to find LUs on host 60: ..." is not overly clear as to what the real problem might be.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_iscsi.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
ACK Jan

If the call to virStorageBackendISCSIGetHostNumber failed, we set retval = -1, but yet still called virStorageBackendSCSIFindLUs. Need to add a goto cleanup - while at it, adjust the logic to initialize retval to -1 and only changed to 0 (zero) on success. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_iscsi.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/storage/storage_backend_iscsi.c b/src/storage/storage_backend_iscsi.c index 48f0957..fba037f 100644 --- a/src/storage/storage_backend_iscsi.c +++ b/src/storage/storage_backend_iscsi.c @@ -131,23 +131,27 @@ virStorageBackendISCSIFindLUs(virStoragePoolObjPtr pool, const char *session) { char *sysfs_path; - int retval = 0; + int retval = -1; uint32_t host; if (virAsprintf(&sysfs_path, "/sys/class/iscsi_session/session%s/device", session) < 0) - return -1; + goto cleanup; if (virStorageBackendISCSIGetHostNumber(sysfs_path, &host) < 0) { virReportSystemError(errno, _("Failed to get host number for iSCSI session " "with path '%s'"), sysfs_path); - retval = -1; + goto cleanup; } if (virStorageBackendSCSIFindLUs(pool, host) < 0) - retval = -1; + goto cleanup; + + retval = 0; + + cleanup: VIR_FREE(sysfs_path); -- 2.1.0

On Wed, Apr 01, 2015 at 01:29:07PM -0400, John Ferlan wrote:
If the call to virStorageBackendISCSIGetHostNumber failed, we set retval = -1, but yet still called virStorageBackendSCSIFindLUs. Need to add a goto cleanup - while at it, adjust the logic to initialize retval to -1 and only changed to 0 (zero) on success.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_iscsi.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
ACK Jan

Add a return -2 to differentiate that the failure was a result of a non stable device path found or some other real error which would be messaged in some manner. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_scsi.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/src/storage/storage_backend_scsi.c b/src/storage/storage_backend_scsi.c index 58e7e6d..6def373 100644 --- a/src/storage/storage_backend_scsi.c +++ b/src/storage/storage_backend_scsi.c @@ -146,6 +146,16 @@ virStorageBackendSCSISerial(const char *dev) } +/* + * Attempt to create a new LUN + * + * Returns: + * + * 0 => Success + * -1 => Failure due to some sort of OOM or other fatal issue found when + * attempting to get/update information about a found volume + * -2 => Failure to find a stable path + */ static int virStorageBackendSCSINewLun(virStoragePoolObjPtr pool, uint32_t host ATTRIBUTE_UNUSED, @@ -202,7 +212,7 @@ virStorageBackendSCSINewLun(virStoragePoolObjPtr pool, VIR_DEBUG("No stable path found for '%s' in '%s'", devpath, pool->def->target.path); - retval = -1; + retval = -2; goto free_vol; } -- 2.1.0

On Wed, Apr 01, 2015 at 01:29:08PM -0400, John Ferlan wrote:
Add a return -2 to differentiate that the failure was a result of a non stable device path found or some other real error which would be messaged in some manner.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_scsi.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/storage/storage_backend_scsi.c b/src/storage/storage_backend_scsi.c index 58e7e6d..6def373 100644 --- a/src/storage/storage_backend_scsi.c +++ b/src/storage/storage_backend_scsi.c @@ -146,6 +146,16 @@ virStorageBackendSCSISerial(const char *dev) }
+/* + * Attempt to create a new LUN + * + * Returns: + * + * 0 => Success + * -1 => Failure due to some sort of OOM or other fatal issue found when + * attempting to get/update information about a found volume + * -2 => Failure to find a stable path + */ static int virStorageBackendSCSINewLun(virStoragePoolObjPtr pool, uint32_t host ATTRIBUTE_UNUSED, @@ -202,7 +212,7 @@ virStorageBackendSCSINewLun(virStoragePoolObjPtr pool, VIR_DEBUG("No stable path found for '%s' in '%s'", devpath, pool->def->target.path);
- retval = -1; + retval = -2;
This mixes two different errors: * virStorageBackendStablePath short-circuited based on pool target path that's just as fatal as OOM and the attempt to find other volumes will also fail * virStorageBackendStablePath fell through to ret_strdup - the directory exists, but a symlink to the device did not show up This means the device could appear later, so it makes sense to retry later. And it doesn't handle the opendir failure in virStorageBackendStablePath, which could also mean that the device will appear later. Given that this check here has to negate the !STREQ "/dev" checks in virStorageBackendStablePath, maybe virStorageBackendStablePath should be split up to two functions - one that returns an error when the device path can't be stabilized and the other that would strdup the original path? Jan

On 04/02/2015 07:31 AM, Ján Tomko wrote:
On Wed, Apr 01, 2015 at 01:29:08PM -0400, John Ferlan wrote:
Add a return -2 to differentiate that the failure was a result of a non stable device path found or some other real error which would be messaged in some manner.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_scsi.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/storage/storage_backend_scsi.c b/src/storage/storage_backend_scsi.c index 58e7e6d..6def373 100644 --- a/src/storage/storage_backend_scsi.c +++ b/src/storage/storage_backend_scsi.c @@ -146,6 +146,16 @@ virStorageBackendSCSISerial(const char *dev) }
+/* + * Attempt to create a new LUN + * + * Returns: + * + * 0 => Success + * -1 => Failure due to some sort of OOM or other fatal issue found when + * attempting to get/update information about a found volume + * -2 => Failure to find a stable path + */ static int virStorageBackendSCSINewLun(virStoragePoolObjPtr pool, uint32_t host ATTRIBUTE_UNUSED, @@ -202,7 +212,7 @@ virStorageBackendSCSINewLun(virStoragePoolObjPtr pool, VIR_DEBUG("No stable path found for '%s' in '%s'", devpath, pool->def->target.path);
- retval = -1; + retval = -2;
This mixes two different errors: * virStorageBackendStablePath short-circuited based on pool target path that's just as fatal as OOM and the attempt to find other volumes will also fail * virStorageBackendStablePath fell through to ret_strdup - the directory exists, but a symlink to the device did not show up
This means the device could appear later, so it makes sense to retry later.
And it doesn't handle the opendir failure in virStorageBackendStablePath, which could also mean that the device will appear later.
Given that this check here has to negate the !STREQ "/dev" checks in virStorageBackendStablePath, maybe virStorageBackendStablePath should be split up to two functions - one that returns an error when the device path can't be stabilized and the other that would strdup the original path?
True - the real bugger is virStorageBackendStablePath - since it's shared code, my goal here was to minimize impact to the [i]SCSI paths. In any case, I've pushed the 3 ACK'd patches and will see what I can come up with for StablePath checking. Tks - John

Seems to be a remnant that was never cleaned up from original submit... Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_scsi.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/src/storage/storage_backend_scsi.c b/src/storage/storage_backend_scsi.c index 6def373..e085da2 100644 --- a/src/storage/storage_backend_scsi.c +++ b/src/storage/storage_backend_scsi.c @@ -382,7 +382,6 @@ processLU(virStoragePoolObjPtr pool, uint32_t target, uint32_t lun) { - char *type_path = NULL; int retval = -1; int device_type; char *block_device = NULL; @@ -427,8 +426,6 @@ processLU(virStoragePoolObjPtr pool, VIR_DEBUG("Created new storage volume for %u:%u:%u:%u successfully", host, bus, target, lun); - VIR_FREE(type_path); - out: VIR_FREE(block_device); return retval; -- 2.1.0

On Wed, Apr 01, 2015 at 01:29:09PM -0400, John Ferlan wrote:
Seems to be a remnant that was never cleaned up from original submit...
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_scsi.c | 3 --- 1 file changed, 3 deletions(-)
ACK Jan

Adjust the processLU error returns to be a bit more logical. Currently, the calling code cannot determine the difference between a non disk/lun volume and a processed/found disk/lun. It can also not differentiate between perhaps real/fatal error and one that won't necessarily stop the code from finding other volumes. After this patch virStorageBackendSCSIFindLUsInternal will stop processing as soon as a "fatal" message occurs rather than continuting processing for (well) no apparent reason. It will also only set the *found value when at least one of the processLU's was successful. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_scsi.c | 44 +++++++++++++++++++++++--------------- 1 file changed, 27 insertions(+), 17 deletions(-) diff --git a/src/storage/storage_backend_scsi.c b/src/storage/storage_backend_scsi.c index e085da2..8087960 100644 --- a/src/storage/storage_backend_scsi.c +++ b/src/storage/storage_backend_scsi.c @@ -375,6 +375,16 @@ getBlockDevice(uint32_t host, } +/* + * Process a Logical Unit entry from the scsi host device directory + * + * Returns: + * + * 0 => Found a valid entry + * -1 => Some sort of fatal error + * -2 => A "non-fatal" error such as a non disk/lun entry or an entry + * without a block device found + */ static int processLU(virStoragePoolObjPtr pool, uint32_t host, @@ -393,7 +403,7 @@ processLU(virStoragePoolObjPtr pool, virReportError(VIR_ERR_INTERNAL_ERROR, _("Failed to determine if %u:%u:%u:%u is a Direct-Access LUN"), host, bus, target, lun); - goto out; + return -1; } /* We don't create volumes for devices other than disk and cdrom @@ -401,32 +411,25 @@ processLU(virStoragePoolObjPtr pool, * isn't an error, either. */ if (!(device_type == VIR_STORAGE_DEVICE_TYPE_DISK || device_type == VIR_STORAGE_DEVICE_TYPE_ROM)) - { - retval = 0; - goto out; - } + return -2; VIR_DEBUG("%u:%u:%u:%u is a Direct-Access LUN", host, bus, target, lun); if (getBlockDevice(host, bus, target, lun, &block_device) < 0) { VIR_DEBUG("Failed to find block device for this LUN"); - goto out; + return -2; } - if (virStorageBackendSCSINewLun(pool, - host, bus, target, lun, - block_device) < 0) { + retval = virStorageBackendSCSINewLun(pool, host, bus, target, lun, + block_device); + if (retval < 0) VIR_DEBUG("Failed to create new storage volume for %u:%u:%u:%u", host, bus, target, lun); - goto out; - } - retval = 0; - - VIR_DEBUG("Created new storage volume for %u:%u:%u:%u successfully", - host, bus, target, lun); + else + VIR_DEBUG("Created new storage volume for %u:%u:%u:%u successfully", + host, bus, target, lun); - out: VIR_FREE(block_device); return retval; } @@ -460,6 +463,8 @@ virStorageBackendSCSIFindLUsInternal(virStoragePoolObjPtr pool, *found = false; while ((retval = virDirRead(devicedir, &lun_dirent, device_path)) > 0) { + int rc; + if (sscanf(lun_dirent->d_name, devicepattern, &bus, &target, &lun) != 3) { continue; @@ -467,7 +472,12 @@ virStorageBackendSCSIFindLUsInternal(virStoragePoolObjPtr pool, VIR_DEBUG("Found possible LU '%s'", lun_dirent->d_name); - if (processLU(pool, scanhost, bus, target, lun) == 0) + rc = processLU(pool, scanhost, bus, target, lun); + if (rc == -1) { + retval = -1; + break; + } + if (rc == 0) *found = true; } -- 2.1.0

https://bugzilla.redhat.com/show_bug.cgi?id=1171933 If a non stable path is provided for the pool's target path, check to see if the directory exists before allowing pool startup; otherwise, later in the processLU calls to find LUN's all that happens is the volume target.path will get the strdup'd value of the pool target.path (which doesn't exist), so attempts to find the LU are unsuccessful resulting in a started pool with no devices listed even though the block devices for the iSCSI LU's do exist. Additionally if the non stable path does exist and it's determined no targets are found, then force failure in the refresh path. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_iscsi.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/src/storage/storage_backend_iscsi.c b/src/storage/storage_backend_iscsi.c index fba037f..b5a15b1 100644 --- a/src/storage/storage_backend_iscsi.c +++ b/src/storage/storage_backend_iscsi.c @@ -149,6 +149,15 @@ virStorageBackendISCSIFindLUs(virStoragePoolObjPtr pool, if (virStorageBackendSCSIFindLUs(pool, host) < 0) goto cleanup; + if (pool->volumes.count == 0 && + !STRPREFIX(pool->def->target.path, "/dev")) { + virReportError(VIR_ERR_INVALID_ARG, + _("Non stable target path '%s' for pool '%s' " + "found no target volumes"), + pool->def->target.path, pool->def->name); + return -1; + } + retval = 0; cleanup: @@ -393,6 +402,15 @@ virStorageBackendISCSIStartPool(virConnectPtr conn, return -1; } + if (!STRPREFIX(pool->def->target.path, "/dev") && + !virFileExists(pool->def->target.path)) { + virReportError(VIR_ERR_INVALID_ARG, + _("Non stable target path '%s' not found for pool '%s'"), + pool->def->target.path, pool->def->name); + return -1; + } + + if ((session = virStorageBackendISCSISession(pool, true)) == NULL) { if ((portal = virStorageBackendISCSIPortal(&pool->def->source)) == NULL) goto cleanup; -- 2.1.0

On Wed, Apr 01, 2015 at 01:29:11PM -0400, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1171933
If a non stable path is provided for the pool's target path, check to see if the directory exists before allowing pool startup; otherwise,
The only non-stable path that can possibly result in a non-empty pool is '/dev/'.
later in the processLU calls to find LUN's all that happens is the volume target.path will get the strdup'd value of the pool target.path (which doesn't exist), so attempts to find the LU are unsuccessful resulting in a started pool with no devices listed even though the block devices for the iSCSI LU's do exist.
Additionally if the non stable path does exist and it's determined no targets are found, then force failure in the refresh path.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_backend_iscsi.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/src/storage/storage_backend_iscsi.c b/src/storage/storage_backend_iscsi.c index fba037f..b5a15b1 100644 --- a/src/storage/storage_backend_iscsi.c +++ b/src/storage/storage_backend_iscsi.c @@ -149,6 +149,15 @@ virStorageBackendISCSIFindLUs(virStoragePoolObjPtr pool, if (virStorageBackendSCSIFindLUs(pool, host) < 0) goto cleanup;
+ if (pool->volumes.count == 0 &&
This is always true if the path does not start with /dev. https://www.redhat.com/archives/libvir-list/2015-March/msg01578.html
+ !STRPREFIX(pool->def->target.path, "/dev")) { + virReportError(VIR_ERR_INVALID_ARG, + _("Non stable target path '%s' for pool '%s' " + "found no target volumes"), + pool->def->target.path, pool->def->name); + return -1; + } + retval = 0;
cleanup:
@@ -393,6 +402,15 @@ virStorageBackendISCSIStartPool(virConnectPtr conn, return -1; }
+ if (!STRPREFIX(pool->def->target.path, "/dev") && + !virFileExists(pool->def->target.path)) {
If you remove the virFileExists condition, you don't need the error check in the first hunk. Jan
+ virReportError(VIR_ERR_INVALID_ARG, + _("Non stable target path '%s' not found for pool '%s'"), + pool->def->target.path, pool->def->name); + return -1; + } + + if ((session = virStorageBackendISCSISession(pool, true)) == NULL) { if ((portal = virStorageBackendISCSIPortal(&pool->def->source)) == NULL) goto cleanup; -- 2.1.0
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
participants (2)
-
John Ferlan
-
Ján Tomko