
On 4/20/20 3:09 PM, Mark Asselstine wrote:
On Monday, April 20, 2020 6:11:25 A.M. EDT Michal Privoznik wrote:
On 4/16/20 5:57 PM, Mark Asselstine wrote:
The udev monitor thread "udevEventHandleThread()" will lag the actual/real view of devices in sysfs as it serially processes udev monitor events. So for instance if you were to run the following cmd to create a new veth pair and rename one of the veth endpoints
you might see the following monitor events and real world that looks like
time | | create v0 sysfs entry
wake udevEventHandleThread | create v1 sysfs entry udev_monitor_receive_device(v1-add) | move v0 sysfs to v2 udevHandleOneDevice(v1) | udev_monitor_receive_device(v0-add) | udevHandleOneDevice(v0) | <--- error msgs in virNetDevGetLinkInfo() udev_monitor_receive_device(v2-move) | as v0 no longer exists udevHandleOneDevice(v2) |
\/
As you can see the changes in sysfs can take place well before we get to act on the events in the udevEventHandleThread(), so by the time we get around to processing the v0 add event, the sysfs entry has been moved to v2.
To work around this we check if the sysfs entry is valid before attempting to read it and don't bother trying to read link info if not. This is safe since we will never read sysfs entries earlier than it existing, ie. if the entry is not there it has either been removed in the time since we enumerated the device or something bigger is busted, in either case, no sysfs entry, no link info. In the case described above we will eventually get the link info as we work through the queue of monitor events and get to the 'move' event.
https://bugzilla.redhat.com/show_bug.cgi?id=1557902
Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com> ---
src/util/virnetdev.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index b465bdac2e..bf256cbe35 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -2438,6 +2438,17 @@ virNetDevGetLinkInfo(const char *ifname,
if (virNetDevSysfsFile(&path, ifname, "operstate") < 0)
goto cleanup;
+ /* The device may have been removed or moved by the time we got here. + * Obviously attempting to get LinkInfo on a no longer existing device + * is useless, so stop processing. If we got here via the udev monitor + * a remove or move event will follow and we will be able to get valid + * LinkInfo at that time */ + if (!virFileExists(path)) { + VIR_WARN("The interface '%s' was removed before we could query it.", + ifname);
This is not aligned.
Should I send a V2 or can this be correct when merged?
No need, I've fixed it (also changed goto cleanup to return, since I've merged a patch that made 'cleanup' label go away this morning) and pushed.
+ goto cleanup; + }
But I have another idea to consider. We could return a different error value here (say -2) and then have the caller decide what to do with it (e.g. print the warning).
I had thought about propagating the error so the caller could possibly wrap the warning message in more context but in the end I decided to call it out at the issue site avoiding having each caller essentially reproducing essentially the same code.
Yeah, what convinced me was that non-Linux version of this function simply prints a debug message and returns 0. I haven't realized that earlier, sorry. Michal