On 01/06/2014 05:19 AM, Osier Yang wrote:
The SCSI device corresponding to the vHBA might not show up in
sysfs yet when we trying to scan the LUNs. As a result, we will
end up with an empty volume set for the pool after pool-start,
even if there are LUNs.
So what causes the "delay" to get the LUN's into sysfs? Is there
something that can be done at creation time (or wherever) to sync that
sooner? Is there a way to determine that the SCSI device hasn't shown
up yet other than the readdir()? You're adding a delay/loop for some
other subsystem's inability? to sync or provide the resources.
Though the time of the device showing up is rather depended,
better than doing nothing, this patch introduces the polling
with 5 * 1 seconds in maximum (the time works fine on my
testing machine at least). Note that for the pool which doesn't
have any LUN, it will still take 5 seconds to poll, but it's
not a bad trade, 5 seconds is not much, and in most cases,
one won't use an empty pool in practice.
Since the paths that call into this only seem to be via refresh and
perhaps a subsequent refresh would resolve things. Could this be better
served by documenting that it's possible that depending on circumstance
"X" (answer to my first question) that in order to see elements in the
pool, one may have to reload again. Assuming of course that the next
reload would find them...
I guess I'm a bit cautious about adding randomly picked timeout values
based on some test because while it may work for you, perhaps it's 10
seconds for someone else. While you may consider a 5 second pause "OK"
and "reasonable" a customer may not consider that to be reasonable.
People (and testers) do strange and random things.
Furthermore, could it be possible that you "catch" things perfectly and
only say 10 of 100 devices are found... But if you waited another 5
seconds the other 90 devices would show up.. I think by adding this
code you end up down a slippery slope of handing fc_host devices
specially...
John
---
src/storage/storage_backend_scsi.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/storage/storage_backend_scsi.c b/src/storage/storage_backend_scsi.c
index 93039c1..2efcdb8 100644
--- a/src/storage/storage_backend_scsi.c
+++ b/src/storage/storage_backend_scsi.c
@@ -495,6 +495,8 @@ virStorageBackendSCSIFindLUs(virStoragePoolObjPtr pool,
DIR *devicedir = NULL;
struct dirent *lun_dirent = NULL;
char devicepattern[64];
+ bool found = false;
+ size_t i = 0;
VIR_DEBUG("Discovering LUs on host %u", scanhost);
@@ -510,6 +512,7 @@ virStorageBackendSCSIFindLUs(virStoragePoolObjPtr pool,
snprintf(devicepattern, sizeof(devicepattern), "%u:%%u:%%u:%%u\n",
scanhost);
+retry:
while ((lun_dirent = readdir(devicedir))) {
if (sscanf(lun_dirent->d_name, devicepattern,
&bus, &target, &lun) != 3) {
@@ -518,9 +521,22 @@ virStorageBackendSCSIFindLUs(virStoragePoolObjPtr pool,
VIR_DEBUG("Found LU '%s'", lun_dirent->d_name);
+ found = true;
processLU(pool, scanhost, bus, target, lun);
}
+ /* Sleep for 5 seconds in maximum if the pool's source
+ * adapter type is "fc_host", since the corresponding
+ * SCSI device might not show up in the sysfs yet.
+ */
+ if (!found && i++ < 5 &&
+ pool->def->source.adapter.type ==
+ VIR_STORAGE_POOL_SOURCE_ADAPTER_TYPE_FC_HOST) {
+ sleep(1);
+ rewinddir(devicedir);
+ goto retry;
+ }
+
closedir(devicedir);
return retval;