
Osier Yang wrote:
On 2012-04-24 03:47, Guido Günther wrote:
Hi,
On Sun, Apr 22, 2012 at 02:41:54PM -0400, Jim Paris wrote:
Hi,
http://bugs.debian.org/663931 is a bug I'm hitting, where virt-manager times out on the initial connection to libvirt.
I reassigned the bug back to libvirt. I still wonder, though, why this is triggered for some users but not for others.
Cheers,
 -- Guido
The basic problem is that, while checking storage volumes, virt-manager causes libvirt to call "udevadm settle". There's an interaction where libvirt's earlier use of network namespaces (to probe LXC features) had caused some uevents to be sent that get filtered out before they reach udev. This confuses "udevadm settle" a bit, and so it sits there waiting for a 2-3 minute built-in timeout before returning. Eventually libvirtd prints:

  2012-04-22 18:22:18.678+0000: 30503: warning : virKeepAliveTimer:182 : No response from client 0x7feec4003630 after 5 keepalive messages in 30 seconds

and virt-manager prints:

  2012-04-22 18:22:18.931+0000: 30647: warning : virKeepAliveSend:128 : Failed to send keepalive response to client 0x25004e0

and the connection gets dropped.
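
To make the timing concrete, here is a minimal C sketch of the arithmetic. It assumes the peer is dropped after roughly interval * (count + 1) seconds of silence, and that the probe interval is 5 seconds (which matches the "5 keepalive messages in 30 seconds" in the log above). The function names are made up for illustration and are not libvirt API.

```c
#include <assert.h>
#include <stdbool.h>

/* Approximate seconds of silence before a peer is dropped, assuming one
 * probe every interval_secs and a drop after count unanswered probes. */
unsigned keepalive_deadline_secs(unsigned interval_secs, unsigned count)
{
    assert(interval_secs > 0);
    return interval_secs * (count + 1);
}

/* True if a call that blocks for block_secs would outlast that deadline. */
bool blocking_call_drops_connection(unsigned block_secs,
                                    unsigned interval_secs, unsigned count)
{
    return block_secs >= keepalive_deadline_secs(interval_secs, count);
}
```

With these assumed values, a settle that blocks for 120 seconds exceeds the 30 second deadline, while one that blocks for 10 seconds does not.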
One workaround could be to specify a shorter timeout when doing the settle. The patch appended below allows virt-manager to work, although the connection still has to wait for the 10 second timeout before it succeeds. I don't know what a better solution would be, though. It seems the udevadm behavior might not be considered a bug from the udev/kernel point of view: https://lkml.org/lkml/2012/4/22/60
I'm using Linux 3.2.14 with libvirt 0.9.11. You can trigger the udevadm issue using a program I posted at the Debian bug report link above.
-jim
From 17e5b9ebab76acb0d711e8bc308023372fbc4180 Mon Sep 17 00:00:00 2001
From: Jim Paris <jim@jtan.com>
Date: Sun, 22 Apr 2012 14:35:47 -0400
Subject: [PATCH] shorten udevadmin settle timeout

Otherwise, udevadmin settle can take so long that connections from
e.g. virt-manager will get closed.
---
 src/util/util.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/util/util.c b/src/util/util.c
index 6e041d6..dfe458e 100644
--- a/src/util/util.c
+++ b/src/util/util.c
@@ -2593,9 +2593,9 @@ virFileFindMountPoint(const char *type ATTRIBUTE_UNUSED)
 void virFileWaitForDevices(void)
 {
 # ifdef UDEVADM
-    const char *const settleprog[] = { UDEVADM, "settle", NULL };
+    const char *const settleprog[] = { UDEVADM, "settle", "--timeout", "10", NULL };
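
The patch hardcodes the timeout as the string "10". If the value had to vary per caller, the argument vector could be built at run time instead. The following is only a sketch of that idea: build_settle_argv is a hypothetical helper, not a function in libvirt, and it passes the timeout as a separate argument after "--timeout", the same way the patch does.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not libvirt API): fill argv, which needs at least
 * 5 slots, with: <udevadm> settle --timeout <secs> NULL.
 * buf holds the decimal timeout string and must outlive argv.
 * Returns 0 on success, -1 if buf is too small. */
int build_settle_argv(const char *udevadm, unsigned timeout_secs,
                      char *buf, size_t buflen, const char *argv[5])
{
    int n;

    assert(udevadm != NULL && buf != NULL && argv != NULL);

    n = snprintf(buf, buflen, "%u", timeout_secs);
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    argv[0] = udevadm;
    argv[1] = "settle";
    argv[2] = "--timeout";
    argv[3] = buf;
    argv[4] = NULL;
    return 0;
}
```

The resulting array has the same shape as settleprog in the patch, so it could be handed to whatever exec-style call runs the command.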
I don't have a good idea for a fix either, but I suspect this change could cause "lvremove" to fail again because of the udev race.
See BZs:
https://bugzilla.redhat.com/show_bug.cgi?id=702260
https://bugzilla.redhat.com/show_bug.cgi?id=570359
It seems that those bugs were caused by something like

  1. open(lv, O_RDWR)
  2. close(lv)
  3. system("lvremove ...")

where udev would fire off a command between 2 and 3 that caused 3 to fail. Adding "udevadm settle" as step 2.5 is a good way to wait for that command to finish, but:

- it doesn't necessarily fix the issue; something could easily re-open the device between 2.5 and 3 and cause the same failure.
- the race condition sounds like it was a short window, and sometimes the original sequence would still work even without the settle. That would suggest to me that a timeout of 10s is still plenty long.

A few thoughts:

- For lvremove: can we try a short timeout (3 seconds), then if the lvremove still fails, try again with the default udevadm timeout (120 seconds)?
- Even in that case, we need to fix libvirtd to not kill the connection after 30 seconds when it's libvirtd's fault that the connection is blocked for so long anyway.
- When connecting with virt-manager, is the udevadm settle really necessary? We're not calling lvremove.

Thanks,
-jim
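
The short-then-long retry suggested above for lvremove could look roughly like this in C. It is only a sketch under stated assumptions: the settle_fn and remove_fn callbacks are hypothetical stand-ins for "run udevadm settle with this timeout" and "run lvremove", and none of these names exist in libvirt. The 3 and 120 second values are the ones proposed in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks, not libvirt API. settle waits for udev for at
 * most timeout_secs; remove_lv returns 0 when the removal succeeded. */
typedef void (*settle_fn)(unsigned timeout_secs, void *opaque);
typedef int (*remove_fn)(void *opaque);

/* Settle timeout for the given 0-based attempt, or 0 when none remain. */
unsigned settle_timeout_for_attempt(size_t attempt)
{
    static const unsigned timeouts[] = { 3, 120 };
    const size_t n = sizeof(timeouts) / sizeof(timeouts[0]);

    return attempt < n ? timeouts[attempt] : 0;
}

/* Settle briefly and try the removal; if it fails, settle with the long
 * timeout and try once more. Returns 0 on success, -1 if every attempt
 * failed. */
int remove_with_escalating_settle(settle_fn settle, remove_fn remove_lv,
                                  void *opaque)
{
    size_t attempt;
    unsigned timeout;

    assert(settle != NULL && remove_lv != NULL);

    for (attempt = 0;
         (timeout = settle_timeout_for_attempt(attempt)) != 0;
         attempt++) {
        settle(timeout, opaque);
        if (remove_lv(opaque) == 0)
            return 0;
    }
    return -1;
}
```

In the common case this costs at most the short settle; the long wait is only paid when the first removal actually fails. It does not close the window in which something re-opens the device between the settle and the removal.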