[libvirt] [PATCH 0/3] Workaround mdev uevent race affecting node device driver

This series resolves https://bugzilla.redhat.com/show_bug.cgi?id=1463285 Erik Skultety (3): util: Report an error when virFileResolveLinkHelper's lstat fails util: Introduce virFileWaitForAccess nodedev: Work around the uevent race by hooking up virFileWaitForAccess src/libvirt_private.syms | 1 + src/node_device/node_device_udev.c | 48 +++++++++++++++++++++++++++++++++++++- src/util/virfile.c | 40 ++++++++++++++++++++++++++++++- src/util/virfile.h | 2 ++ 4 files changed, 89 insertions(+), 2 deletions(-) -- 2.13.1

During investigation of [1] I saw nothing in the logs that would help me get to the root cause. Then I found out that we don't log anything when lstat fails. Sure, doesn't happen often, but if it happens we should reflect that in the logs to prevent spurious behaviour. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1463285 Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/util/virfile.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/util/virfile.c b/src/util/virfile.c index d444b32f8..6bbcc3d15 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -1560,8 +1560,10 @@ virFileResolveLinkHelper(const char *linkpath, * directories, if linkpath is absolute and the basename is * already a non-symlink. */ if (IS_ABSOLUTE_FILE_NAME(linkpath) && !intermediatePaths) { - if (lstat(linkpath, &st) < 0) + if (lstat(linkpath, &st) < 0) { + virReportSystemError(errno, "%s", linkpath); return -1; + } if (!S_ISLNK(st.st_mode)) return VIR_STRDUP_QUIET(*resultpath, linkpath) < 0 ? -1 : 0; -- 2.13.1

On Tue, Jun 20, 2017 at 17:03:30 +0200, Erik Skultety wrote:
During investigation of [1] I saw nothing in the logs that would help me get to the root cause. Then I found out that we don't log anything when lstat fails. Sure, doesn't happen often, but if it happens we should reflect that in the logs to prevent spurious behaviour.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1463285
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/util/virfile.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/util/virfile.c b/src/util/virfile.c index d444b32f8..6bbcc3d15 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -1560,8 +1560,10 @@ virFileResolveLinkHelper(const char *linkpath, * directories, if linkpath is absolute and the basename is * already a non-symlink. */ if (IS_ABSOLUTE_FILE_NAME(linkpath) && !intermediatePaths) { - if (lstat(linkpath, &st) < 0) + if (lstat(linkpath, &st) < 0) { + virReportSystemError(errno, "%s", linkpath); return -1; + }
NACK, this function is designed not to report errors. It's even documented so. Some other callers even report their own errors. At most a VIR_DEBUG is appropriate here.

Since we have a number of places where we workaround timing issues with devices, attributes (files in general) not being available at the time of processing them by calling usleep in a loop for a fixed number of tries, we could as well have a utility function that would do that. Therefore we won't have to duplicate this ugly workaround even more. This is a prerequisite for https://bugzilla.redhat.com/show_bug.cgi?id=1463285. Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 36 ++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 39 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c1e9471c5..53878a30f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1698,6 +1698,7 @@ virFileStripSuffix; virFileTouch; virFileUnlock; virFileUpdatePerm; +virFileWaitForAccess; virFileWrapperFdClose; virFileWrapperFdFree; virFileWrapperFdNew; diff --git a/src/util/virfile.c b/src/util/virfile.c index 6bbcc3d15..0b1a91699 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4164,3 +4164,39 @@ virFileReadValueString(char **value, const char *format, ...) VIR_FREE(str); return ret; } + + +/** + * virFileWaitForAccess: + * @path: absolute path to a sysfs attribute (can be a symlink) + * @ms: how long to wait (in milliseconds) + * @tries: how many times should we try to wait for @path to become accessible + * + * Checks the existence of @path. In case the file defined by @path + * doesn't exist, we wait for it to appear in 100ms (for up to @tries times). + * + * Returns 0 on success, -1 on error (ENOENT is fine here). + */ +int +virFileWaitForAccess(const char *path, size_t ms, size_t tries) +{ + errno = 0; + + /* wait for @path to be accessible in @ms milliseconds, up to @tries */ + while (tries-- > 0 && !virFileExists(path)) { + if (errno != ENOENT) { + virReportSystemError(errno, "%s", path); + return -1; + } else if (tries == 10) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to access '%s' after %zu tries"), + path, tries); + return -1; + } else { + VIR_DEBUG("Failed to access '%s', re-try in %zu ms", path, ms); + usleep(ms * 1000); + } + } + + return 0; +} diff --git a/src/util/virfile.h b/src/util/virfile.h index 57ceb8072..42630ebb5 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -347,6 +347,8 @@ int virFileReadValueScaledInt(unsigned long long *value, const char *format, ... int virFileReadValueString(char **value, const char *format, ...) ATTRIBUTE_FMT_PRINTF(2, 3); +int virFileWaitForAccess(const char *path, size_t ms, size_t tries); + int virFileInData(int fd, int *inData, -- 2.13.1

On Tue, 2017-06-20 at 17:03 +0200, Erik Skultety wrote:
Since we have a number of places where we workaround timing issues with devices, attributes (files in general) not being available at the time of processing them by calling usleep in a loop for a fixed number of tries, we could as well have a utility function that would do that. Therefore we won't have to duplicate this ugly workaround even more.
This is a prerequisite for https://bugzilla.redhat.com/show_bug.cgi?id=1463285.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 36 ++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 39 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c1e9471c5..53878a30f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1698,6 +1698,7 @@ virFileStripSuffix; virFileTouch; virFileUnlock; virFileUpdatePerm; +virFileWaitForAccess; virFileWrapperFdClose; virFileWrapperFdFree; virFileWrapperFdNew; diff --git a/src/util/virfile.c b/src/util/virfile.c index 6bbcc3d15..0b1a91699 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4164,3 +4164,39 @@ virFileReadValueString(char **value, const char *format, ...) VIR_FREE(str); return ret; } + + +/** + * virFileWaitForAccess: + * @path: absolute path to a sysfs attribute (can be a symlink) + * @ms: how long to wait (in milliseconds) + * @tries: how many times should we try to wait for @path to become accessible + * + * Checks the existence of @path. In case the file defined by @path + * doesn't exist, we wait for it to appear in 100ms (for up to @tries times). + * + * Returns 0 on success, -1 on error (ENOENT is fine here). + */ +int +virFileWaitForAccess(const char *path, size_t ms, size_t tries) +{ + errno = 0; + + /* wait for @path to be accessible in @ms milliseconds, up to @tries */ + while (tries-- > 0 && !virFileExists(path)) { + if (errno != ENOENT) { + virReportSystemError(errno, "%s", path); + return -1; + } else if (tries == 10) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to access '%s' after %zu tries"), + path, tries); + return -1; + } else { + VIR_DEBUG("Failed to access '%s', re-try in %zu ms", path, ms); + usleep(ms * 1000); + } + } + + return 0; +}
Just FYI, there's another way to address it by calling udevadm settle before and after "touching" a block device, libguestfs is using this approach and it works very well: https://github.com/libguestfs/libguestfs/search?utf8=%E2%9C%93&q=udev_s ettle&type=
diff --git a/src/util/virfile.h b/src/util/virfile.h index 57ceb8072..42630ebb5 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -347,6 +347,8 @@ int virFileReadValueScaledInt(unsigned long long *value, const char *format, ... int virFileReadValueString(char **value, const char *format, ...) ATTRIBUTE_FMT_PRINTF(2, 3);
+int virFileWaitForAccess(const char *path, size_t ms, size_t tries); +
int virFileInData(int fd, int *inData,

On 06/20/2017 05:21 PM, Dawid Zamirski wrote:
On Tue, 2017-06-20 at 17:03 +0200, Erik Skultety wrote:
Since we have a number of places where we workaround timing issues with devices, attributes (files in general) not being available at the time of processing them by calling usleep in a loop for a fixed number of tries, we could as well have a utility function that would do that. Therefore we won't have to duplicate this ugly workaround even more.
This is a prerequisite for https://bugzilla.redhat.com/show_bug.cgi?id=1463285.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 36 ++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 39 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c1e9471c5..53878a30f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1698,6 +1698,7 @@ virFileStripSuffix; virFileTouch; virFileUnlock; virFileUpdatePerm; +virFileWaitForAccess; virFileWrapperFdClose; virFileWrapperFdFree; virFileWrapperFdNew; diff --git a/src/util/virfile.c b/src/util/virfile.c index 6bbcc3d15..0b1a91699 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4164,3 +4164,39 @@ virFileReadValueString(char **value, const char *format, ...) VIR_FREE(str); return ret; } + + +/** + * virFileWaitForAccess: + * @path: absolute path to a sysfs attribute (can be a symlink) + * @ms: how long to wait (in milliseconds) + * @tries: how many times should we try to wait for @path to become accessible + * + * Checks the existence of @path. In case the file defined by @path + * doesn't exist, we wait for it to appear in 100ms (for up to @tries times). + * + * Returns 0 on success, -1 on error (ENOENT is fine here). + */ +int +virFileWaitForAccess(const char *path, size_t ms, size_t tries) +{ + errno = 0; + + /* wait for @path to be accessible in @ms milliseconds, up to @tries */ + while (tries-- > 0 && !virFileExists(path)) { + if (errno != ENOENT) { + virReportSystemError(errno, "%s", path); + return -1; + } else if (tries == 10) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to access '%s' after %zu tries"), + path, tries); + return -1; + } else { + VIR_DEBUG("Failed to access '%s', re-try in %zu ms", path, ms); + usleep(ms * 1000); + } + } + + return 0; +}
Just FYI, there's another way to address it by calling udevadm settle before and after "touching" a block device, libguestfs is using this approach and it works very well:
https://github.com/libguestfs/libguestfs/search?utf8=%E2%9C%93&q=udev_s ettle&type=
Does it? udevadm settle waits for all the events to be processed, not just the one that we want. The wait time would be unpredictable IMO. Michal

On Tue, 2017-06-20 at 17:45 +0200, Michal Privoznik wrote:
On 06/20/2017 05:21 PM, Dawid Zamirski wrote:
On Tue, 2017-06-20 at 17:03 +0200, Erik Skultety wrote:
Since we have a number of places where we workaround timing issues with devices, attributes (files in general) not being available at the time of processing them by calling usleep in a loop for a fixed number of tries, we could as well have a utility function that would do that. Therefore we won't have to duplicate this ugly workaround even more.
This is a prerequisite for https://bugzilla.redhat.com/show_bug.cgi?id=1463285.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 36 ++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 39 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c1e9471c5..53878a30f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1698,6 +1698,7 @@ virFileStripSuffix; virFileTouch; virFileUnlock; virFileUpdatePerm; +virFileWaitForAccess; virFileWrapperFdClose; virFileWrapperFdFree; virFileWrapperFdNew; diff --git a/src/util/virfile.c b/src/util/virfile.c index 6bbcc3d15..0b1a91699 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4164,3 +4164,39 @@ virFileReadValueString(char **value, const char *format, ...) VIR_FREE(str); return ret; } + + +/** + * virFileWaitForAccess: + * @path: absolute path to a sysfs attribute (can be a symlink) + * @ms: how long to wait (in milliseconds) + * @tries: how many times should we try to wait for @path to become accessible + * + * Checks the existence of @path. In case the file defined by @path + * doesn't exist, we wait for it to appear in 100ms (for up to @tries times). + * + * Returns 0 on success, -1 on error (ENOENT is fine here). + */ +int +virFileWaitForAccess(const char *path, size_t ms, size_t tries) +{ + errno = 0; + + /* wait for @path to be accessible in @ms milliseconds, up to @tries */ + while (tries-- > 0 && !virFileExists(path)) { + if (errno != ENOENT) { + virReportSystemError(errno, "%s", path); + return -1; + } else if (tries == 10) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to access '%s' after %zu tries"), + path, tries); + return -1; + } else { + VIR_DEBUG("Failed to access '%s', re-try in %zu ms", path, ms); + usleep(ms * 1000); + } + } + + return 0; +}
Just FYI, there's another way to address it by calling udevadm settle before and after "touching" a block device, libguestfs is using this approach and it works very well:
https://github.com/libguestfs/libguestfs/search?utf8=%E2%9C%93&q=ud ev_s ettle&type=
Does it? udevadm settle waits for all the events to be processed, not just the one that we want. The wait time would be unpredictable IMO.
Michal
Well it's a kind of brute-force approach but at least guarantees the device will be accessible once it exits. Why would setting a custom arbitrary timeout be better which may be too short? Is the predictable wait time here more important than being able to open the device at all?

On Tue, Jun 20, 2017 at 12:21:48PM -0400, Dawid Zamirski wrote:
On Tue, 2017-06-20 at 17:45 +0200, Michal Privoznik wrote:
On 06/20/2017 05:21 PM, Dawid Zamirski wrote:
On Tue, 2017-06-20 at 17:03 +0200, Erik Skultety wrote:
Since we have a number of places where we workaround timing issues with devices, attributes (files in general) not being available at the time of processing them by calling usleep in a loop for a fixed number of tries, we could as well have a utility function that would do that. Therefore we won't have to duplicate this ugly workaround even more.
This is a prerequisite for https://bugzilla.redhat.com/show_bug.cgi?id=1463285.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 36 ++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 39 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c1e9471c5..53878a30f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1698,6 +1698,7 @@ virFileStripSuffix; virFileTouch; virFileUnlock; virFileUpdatePerm; +virFileWaitForAccess; virFileWrapperFdClose; virFileWrapperFdFree; virFileWrapperFdNew; diff --git a/src/util/virfile.c b/src/util/virfile.c index 6bbcc3d15..0b1a91699 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4164,3 +4164,39 @@ virFileReadValueString(char **value, const char *format, ...) VIR_FREE(str); return ret; } + + +/** + * virFileWaitForAccess: + * @path: absolute path to a sysfs attribute (can be a symlink) + * @ms: how long to wait (in milliseconds) + * @tries: how many times should we try to wait for @path to become accessible + * + * Checks the existence of @path. In case the file defined by @path + * doesn't exist, we wait for it to appear in 100ms (for up to @tries times). + * + * Returns 0 on success, -1 on error (ENOENT is fine here). + */ +int +virFileWaitForAccess(const char *path, size_t ms, size_t tries) +{ + errno = 0; + + /* wait for @path to be accessible in @ms milliseconds, up to @tries */ + while (tries-- > 0 && !virFileExists(path)) { + if (errno != ENOENT) { + virReportSystemError(errno, "%s", path); + return -1; + } else if (tries == 10) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to access '%s' after %zu tries"), + path, tries); + return -1; + } else { + VIR_DEBUG("Failed to access '%s', re-try in %zu ms", path, ms); + usleep(ms * 1000); + } + } + + return 0; +}
Just FYI, there's another way to address it by calling udevadm settle before and after "touching" a block device, libguestfs is using this approach and it works very well:
https://github.com/libguestfs/libguestfs/search?utf8=%E2%9C%93&q=ud ev_s ettle&type=
Does it? udevadm settle waits for all the events to be processed, not just the one that we want. The wait time would be unpredictable IMO.
Michal
Well it's a kind of brute-force approach but at least guarantees the device will be accessible once it exits. Why would setting a custom arbitrary timeout be better which may be too short? Is the predictable wait time here more important than being able to open the device at all?
So, there was this thing about udevadm settle that wasn't clear to me, because it says it waits for udevd to create all the device nodes - so since udev only manages "/dev", I assumed it wouldn't work for sysfs attributes, but gave it a try since I wasn't sure and, no, we can't use udevadm here, because udevd really doesn't care about sysfs (mdevs only live in sysfs - well, strictly speaking, a corresponding /dev/vfio/ device IS created for an mdev, but for that we get a separate event and because for node device purposes we don't report vfio devices, we ignore such event) hence the explicit, unpredictable, best-effort timeout here. Erik

On Tue, Jun 20, 2017 at 17:03:31 +0200, Erik Skultety wrote:
Since we have a number of places where we workaround timing issues with devices, attributes (files in general) not being available at the time of processing them by calling usleep in a loop for a fixed number of tries, we could as well have a utility function that would do that. Therefore we won't have to duplicate this ugly workaround even more.
This is a prerequisite for https://bugzilla.redhat.com/show_bug.cgi?id=1463285.
This statement is useless. The helper function can be reused elsewhere.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 36 ++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 39 insertions(+)
[...]
diff --git a/src/util/virfile.c b/src/util/virfile.c index 6bbcc3d15..0b1a91699 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4164,3 +4164,39 @@ virFileReadValueString(char **value, const char *format, ...) VIR_FREE(str); return ret; } + + +/** + * virFileWaitForAccess: + * @path: absolute path to a sysfs attribute (can be a symlink) + * @ms: how long to wait (in milliseconds) + * @tries: how many times should we try to wait for @path to become accessible + * + * Checks the existence of @path. In case the file defined by @path + * doesn't exist, we wait for it to appear in 100ms (for up to @tries times). + * + * Returns 0 on success, -1 on error (ENOENT is fine here).
This description does not make sense. You don't state that this reports errors. Also the mention of ENOENT does not make sense. This function in fact sometimes returns enoent if the file does not appear until timeout.
+ */ +int +virFileWaitForAccess(const char *path, size_t ms, size_t tries) +{ + errno = 0; + + /* wait for @path to be accessible in @ms milliseconds, up to @tries */ + while (tries-- > 0 && !virFileExists(path)) { + if (errno != ENOENT) { + virReportSystemError(errno, "%s", path);
This does not really explain stuff to users. You might want to add a more comprehensive error message or leave error reporting to users.
+ return -1; + } else if (tries == 10) {
This does not make any sense. The while loop counts down and checks if tries is more than 10. And this checks that tries is equal to 10. So if somebody passes 11 as @tries he/she gets this error? If tries is set to < 10 you get success even on timeout? Did you modify the code without testing it?
+ virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to access '%s' after %zu tries"), + path, tries); + return -1; + } else { + VIR_DEBUG("Failed to access '%s', re-try in %zu ms", path, ms);
This will spam logs too much. I think a debug prior to the while loop and one after it succeeds should be enough.
+ usleep(ms * 1000);

On Tue, Jun 20, 2017 at 05:22:52PM +0200, Peter Krempa wrote:
On Tue, Jun 20, 2017 at 17:03:31 +0200, Erik Skultety wrote:
Since we have a number of places where we workaround timing issues with devices, attributes (files in general) not being available at the time of processing them by calling usleep in a loop for a fixed number of tries, we could as well have a utility function that would do that. Therefore we won't have to duplicate this ugly workaround even more.
This is a prerequisite for https://bugzilla.redhat.com/show_bug.cgi?id=1463285.
This statement is useless. The helper function can be reused elsewhere.
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 36 ++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 2 ++ 3 files changed, 39 insertions(+)
[...]
diff --git a/src/util/virfile.c b/src/util/virfile.c index 6bbcc3d15..0b1a91699 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4164,3 +4164,39 @@ virFileReadValueString(char **value, const char *format, ...) VIR_FREE(str); return ret; } + + +/** + * virFileWaitForAccess: + * @path: absolute path to a sysfs attribute (can be a symlink) + * @ms: how long to wait (in milliseconds) + * @tries: how many times should we try to wait for @path to become accessible + * + * Checks the existence of @path. In case the file defined by @path + * doesn't exist, we wait for it to appear in 100ms (for up to @tries times). + * + * Returns 0 on success, -1 on error (ENOENT is fine here).
This description does not make sense. You don't state that this reports errors. Also the mention of ENOENT does not make sense.
This function in fact sometimes returns enoent if the file does not appear until timeout.
+ */ +int +virFileWaitForAccess(const char *path, size_t ms, size_t tries) +{ + errno = 0; + + /* wait for @path to be accessible in @ms milliseconds, up to @tries */ + while (tries-- > 0 && !virFileExists(path)) { + if (errno != ENOENT) { + virReportSystemError(errno, "%s", path);
This does not really explain stuff to users. You might want to add a more comprehensive error message or leave error reporting to users.
+ return -1; + } else if (tries == 10) {
This does not make any sense. The while loop counts down and checks if tries is more than 10. And this checks that tries is equal to 10.
So if somebody passes 11 as @tries he/she gets this error? If tries is set to < 10 you get success even on timeout?
Did you modify the code without testing it?
Damn :/, I actually tested it (after those 3+ modifications I did to it in an iterative manner) and it worked, but surely didn't hit the issue you describe. Nevertheless, yep, it's wrong and I need to rework it.

If we find ourselves in the situation that the 'add' uevent has been fired earlier than the sysfs tree for a device, we'd better wait for the attributes to be ready rather than discarding the device from our device list forever. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285 Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/node_device/node_device_udev.c | 48 +++++++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 174124a1b..4f9ca0c67 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -60,6 +60,43 @@ struct _udevPrivate { }; +/** + * udevWaitForAttrs: + * @sys_path: node device base path + * @...: attributes to wait for, last attribute must be NULL + * + * Takes a list of attributes to wait for, waits until all of them are + * available, unless the max number of tries (10) has been reached. + * + * Returns 0 if all attributes became available, -1 on error. + */ +static int +udevWaitForAttrs(const char *sys_path, ...) +{ + int ret = -1; + const char *attr = NULL; + char *attr_path = NULL; + va_list args; + + va_start(args, sys_path); + while ((attr = va_arg(args, char *))) { + if (virAsprintf(&attr_path, "%s/%s", sys_path, attr) < 0) + goto cleanup; + + if (virFileWaitForAccess(attr_path, 100, 10) < 0) + goto cleanup; + + VIR_FREE(attr_path); + } + + ret = 0; + cleanup: + va_end(args); + VIR_FREE(attr_path); + return ret; +} + + static bool udevHasDeviceProperty(struct udev_device *dev, const char *key) @@ -1113,12 +1150,21 @@ udevProcessMediatedDevice(struct udev_device *dev, { int ret = -1; const char *uuidstr = NULL; + const char *devpath = udev_device_get_syspath(dev); int iommugrp = -1; char *linkpath = NULL; char *canonicalpath = NULL; virNodeDevCapMdevPtr data = &def->caps->data.mdev; - if (virAsprintf(&linkpath, "%s/mdev_type", udev_device_get_syspath(dev)) < 0) + /* Because of a kernel uevent race, we might get the 'add' event prior to + * the sysfs tree being ready, so any attempt to access any sysfs attribute + * would result in ENOENT and us dropping the device, so let's work around + * it by waiting for the attributes to become available. + */ + if (udevWaitForAttrs(devpath, "mdev_type", "iommu_group", NULL) < 0) + goto cleanup; + + if (virAsprintf(&linkpath, "%s/mdev_type", devpath) < 0) goto cleanup; if (virFileResolveLink(linkpath, &canonicalpath) < 0) -- 2.13.1

On Tue, Jun 20, 2017 at 17:03:32 +0200, Erik Skultety wrote:
If we find ourselves in the situation that the 'add' uevent has been fired earlier than the sysfs tree for a device, we'd better wait for the attributes to be ready rather than discarding the device from our device list forever.
You'd have to wait for an infinite amount of time to guarantee this. This patch just makes it slightly more probable that libvirt will find the device.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285
Signed-off-by: Erik Skultety <eskultet@redhat.com> --- src/node_device/node_device_udev.c | 48 +++++++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-)
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 174124a1b..4f9ca0c67 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -60,6 +60,43 @@ struct _udevPrivate { };
+/** + * udevWaitForAttrs: + * @sys_path: node device base path + * @...: attributes to wait for, last attribute must be NULL + * + * Takes a list of attributes to wait for, waits until all of them are + * available, unless the max number of tries (10) has been reached. + * + * Returns 0 if all attributes became available, -1 on error.
Since this function waits in order to see whether the individual components are available I don't really see a point in having the varargs here. A simple helper that only waits for 1 file (basically formats the path and waits) would be way better.
+ */ +static int +udevWaitForAttrs(const char *sys_path, ...) +{ + int ret = -1; + const char *attr = NULL; + char *attr_path = NULL; + va_list args; + + va_start(args, sys_path); + while ((attr = va_arg(args, char *))) { + if (virAsprintf(&attr_path, "%s/%s", sys_path, attr) < 0) + goto cleanup; + + if (virFileWaitForAccess(attr_path, 100, 10) < 0)
So this waits up to 1 second per file in rather long increments (100 ms) which I don't think is really desired. The only prior art here which I think is somewhat relevant is the waiting code for netdevs, where a 1 ms timeout with 100 retries is used. Also note that this will delay the event loop since the function is called by udevEventHandleCallback which is registered in the event loop. This is definitely unaceptable. NACK to this approach
+ goto cleanup; + + VIR_FREE(attr_path); + } + + ret = 0; + cleanup: + va_end(args); + VIR_FREE(attr_path); + return ret; +} + + static bool udevHasDeviceProperty(struct udev_device *dev, const char *key) @@ -1113,12 +1150,21 @@ udevProcessMediatedDevice(struct udev_device *dev, { int ret = -1; const char *uuidstr = NULL; + const char *devpath = udev_device_get_syspath(dev); int iommugrp = -1; char *linkpath = NULL; char *canonicalpath = NULL; virNodeDevCapMdevPtr data = &def->caps->data.mdev;
- if (virAsprintf(&linkpath, "%s/mdev_type", udev_device_get_syspath(dev)) < 0) + /* Because of a kernel uevent race, we might get the 'add' event prior to + * the sysfs tree being ready, so any attempt to access any sysfs attribute + * would result in ENOENT and us dropping the device, so let's work around + * it by waiting for the attributes to become available. + */ + if (udevWaitForAttrs(devpath, "mdev_type", "iommu_group", NULL) < 0) + goto cleanup;
This call formats the same string ...
+ + if (virAsprintf(&linkpath, "%s/mdev_type", devpath) < 0) goto cleanup;
.. as this and then throws it away, just to be reformatted in the same way.

+ */ +static int +udevWaitForAttrs(const char *sys_path, ...) +{ + int ret = -1; + const char *attr = NULL; + char *attr_path = NULL; + va_list args; + + va_start(args, sys_path); + while ((attr = va_arg(args, char *))) { + if (virAsprintf(&attr_path, "%s/%s", sys_path, attr) < 0) + goto cleanup; + + if (virFileWaitForAccess(attr_path, 100, 10) < 0)
So this waits up to 1 second per file in rather long increments (100 ms) which I don't think is really desired.
The only prior art here which I think is somewhat relevant is the waiting code for netdevs, where a 1 ms timeout with 100 retries is used.
Also note that this will delay the event loop since the function is called by udevEventHandleCallback which is registered in the event loop. This is definitely unaceptable. NACK to this approach
Oh, I was in a rush writing this and missed that one completely, true, no blocking in the eventloop, naturally. I'll try a different approach and respin. Erik

On 06/20/2017 11:03 AM, Erik Skultety wrote:
This series resolves https://bugzilla.redhat.com/show_bug.cgi?id=1463285
Erik Skultety (3): util: Report an error when virFileResolveLinkHelper's lstat fails util: Introduce virFileWaitForAccess nodedev: Work around the uevent race by hooking up virFileWaitForAccess
src/libvirt_private.syms | 1 + src/node_device/node_device_udev.c | 48 +++++++++++++++++++++++++++++++++++++- src/util/virfile.c | 40 ++++++++++++++++++++++++++++++- src/util/virfile.h | 2 ++ 4 files changed, 89 insertions(+), 2 deletions(-)
-- 2.13.1
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
FWIW: This seemed a bit familiar to something for NPIV as well. Although for NPIV the files exist, it's just that they have bogus data. See: https://www.redhat.com/archives/libvir-list/2016-June/msg02213.html The referenced bz: https://bugzilla.redhat.com/show_bug.cgi?id=1319544 The settle code is used in a number of place in libvirt, search on virWaitForDevices John

On Tue, Jun 20, 2017 at 01:59:59PM -0400, John Ferlan wrote:
On 06/20/2017 11:03 AM, Erik Skultety wrote:
This series resolves https://bugzilla.redhat.com/show_bug.cgi?id=1463285
Erik Skultety (3): util: Report an error when virFileResolveLinkHelper's lstat fails util: Introduce virFileWaitForAccess nodedev: Work around the uevent race by hooking up virFileWaitForAccess
src/libvirt_private.syms | 1 + src/node_device/node_device_udev.c | 48 +++++++++++++++++++++++++++++++++++++- src/util/virfile.c | 40 ++++++++++++++++++++++++++++++- src/util/virfile.h | 2 ++ 4 files changed, 89 insertions(+), 2 deletions(-)
-- 2.13.1
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
FWIW: This seemed a bit familiar to something for NPIV as well. Although for NPIV the files exist, it's just that they have bogus data. See:
https://www.redhat.com/archives/libvir-list/2016-June/msg02213.html
So I read the reply as well and though the argument about leaving kernel bugs to kernel to fix is right, this may take and indefinite time to actually get the fix and having an open kernel BZ is about it in terms what we can do about it. So, in order to make things work in the meantime, we have to work around things - discouraging? yes, ugly? absolutely, but unfortunately necessary.
The referenced bz:
https://bugzilla.redhat.com/show_bug.cgi?id=1319544
The settle code is used in a number of place in libvirt, search on virWaitForDevices
Thanks for the hint, I was looking for exactly something like this, but that function would not work in this case, because it would indeed wait for /dev/vfio/XY to get created, but for nodedev purposes we don't list those devices and mdevs are only matter of sysfs which is out of scope of udev. Erik
participants (5)
-
Dawid Zamirski
-
Erik Skultety
-
John Ferlan
-
Michal Privoznik
-
Peter Krempa