[libvirt] [PATCH]: Give /dev/disk/by-{id,path} a chance to exist

All, When refreshing a storage pool (say, an iSCSI pool), one of the things it does is to generate the stable path for each of the LUNs via virStorageBackendStablePath. That, in turn, has the following code: /* The pool is pointing somewhere like /dev/disk/by-path * or /dev/disk/by-id, so we need to check all symlinks in * the target directory and figure out which one points * to this device node */ if ((dh = opendir(pool->def->target.path)) == NULL) { virStorageReportError(conn, VIR_ERR_INTERNAL_ERROR, _("cannot read dir %s: %s"), pool->def->target.path, strerror(errno)); return NULL; } For a normal machine, that code is totally legit; /dev/disk/by-{id,path} almost certainly exists because of your hard drive, etc. However, this may not be the case in the stateless ovirt environment, since /dev/disk/by-{id,path} is created on-demand by udev. It's basically a race between udev creating the directory and libvirt scanning the directory. The following patch just adds a retry to the code above so that if it doesn't exist when we first get there, we give it time to come into being. If it still hasn't shown up 5 seconds after we started, we give up and throw the error. Signed-off-by: Chris Lalancette <clalance@redhat.com>

Chris Lalancette <clalance@redhat.com> wrote: ...
on-demand by udev. It's basically a race between udev creating the directory and libvirt scanning the directory. The following patch just adds a retry to the code above so that if it doesn't exist when we first get there, we give it time to come into being. If it still hasn't shown up 5 seconds after we started, we give up and throw the error. ... - if ((dh = opendir(pool->def->target.path)) == NULL) { + dh = NULL;
not needed?
+ opentries = 0; + while (opentries < 50) { + if ((dh = opendir(pool->def->target.path)) != NULL) + break;
Hi Chris, Nice one. Must have been er,.. fun to track down. The patch looks fine, but how about making it so that it retries only for a missing final (ENOENT) or intermediate (ENOTDIR) component. We'd rather not have it retry due to OOM or lack of permission, e.g., dh = opendir(pool->def->target.path); if (dh != NULL || (errno != ENOENT && errno != ENOTDIR)) break; but that has so many 'not's that you might prefer to write it this way: if (! (dh == NULL && (errno == ENOENT || errno == ENOTDIR))) break; You can save a two lines by using a for loop. It's nice to put a name on the constant, e.g., enum { WAIT_FOR_UDEV_SECS = 5 }; (and enum is better than #define, because it's usable in a debugger) for (open_tries = 0; open_tries < WAIT_FOR_UDEV_SECS * 10; open_tries++) { dh = opendir(pool->def->target.path); /* Stop if opendir succeeds, or if it fails for any reason other than a missing file name component. */ if (dh != NULL || (errno != ENOENT && errno != ENOTDIR)) break; usleep(100 * 1000); } Why "open_tries"? Because I first read "opentries" as "op" "entries" ;-)

On Tue, Nov 25, 2008 at 04:15:05PM +0100, Chris Lalancette wrote:
All, When refreshing a storage pool (say, an iSCSI pool), one of the things it does is to generate the stable path for each of the LUNs via virStorageBackendStablePath. That, in turn, has the following code:
Index: src/storage_backend.c =================================================================== RCS file: /data/cvs/libvirt/src/storage_backend.c,v retrieving revision 1.29 diff -u -r1.29 storage_backend.c --- a/src/storage_backend.c 17 Nov 2008 11:19:33 -0000 1.29 +++ b/src/storage_backend.c 25 Nov 2008 15:14:20 -0000 @@ -291,6 +291,7 @@ DIR *dh; struct dirent *dent; char *stablepath; + int opentries;
/* Short circuit if pool has no target, or if its /dev */ if (pool->def->target.path == NULL || @@ -304,19 +305,33 @@ if (!STRPREFIX(pool->def->target.path, "/dev")) goto ret_strdup;
- /* The pool is pointing somewhere like /dev/disk/by-path - * or /dev/disk/by-id, so we need to check all symlinks in - * the target directory and figure out which one points - * to this device node + /* We loop here because /dev/disk/by-{id,path} may not have existed + * before we started this operation, so we have to give it some time to + * get created. */ - if ((dh = opendir(pool->def->target.path)) == NULL) { + dh = NULL; + opentries = 0; + while (opentries < 50) { + if ((dh = opendir(pool->def->target.path)) != NULL) + break; + + usleep(100 * 1000); + opentries++; + }
Just as stylist thing - in most other places in storage code where we retry operations we use a different code pattern - I'd prefer to keep things consistent in this regard, so closer to this: reopen: if ((dh = opendir(pool->def->target.path) == NULL) { if (errno == ENOENT && opentries++ < 50) usleep(100 * 1000); goto reopen; } virStorageReportError(conn, VIR_ERR_INTERNAL_ERROR, _("cannot read dir %s: %s"), pool->def->target.path, strerror(errno)); return NULL; } Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
participants (3)
-
Chris Lalancette
-
Daniel P. Berrange
-
Jim Meyering