On Thu, Apr 02, 2015 at 06:21:01PM +0200, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1200149
Even though we have a mutex mechanism so that two clients don't spawn
two daemons, it's not strong enough. It can happen that while one
client is spawning the daemon, the other one fails to connect.
Basically two possible errors can happen:
error: Failed to connect socket to
'/home/mprivozn/.cache/libvirt/libvirt-sock': Connection refused
or:
error: Failed to connect socket to
'/home/mprivozn/.cache/libvirt/libvirt-sock': No such file or directory
The problem in both cases is, the daemon is only starting up, while we
are trying to connect (and fail). We should postpone the connecting
phase until the daemon is started (by the other thread that is
spawning it). In order to do that, create a file lock 'libvirt-lock'
in the directory where session daemon would create its socket. So even
when called from multiple processes, spawning a daemon will serialize
on the file lock. So only the first to come will spawn the daemon.
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
diff to v1:
-hopefully this one is race free
src/rpc/virnetsocket.c | 148 ++++++++++++-------------------------------------
1 file changed, 36 insertions(+), 112 deletions(-)
ACK, with a few changes to the lock file path
diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
index 6b019cc..2b891ae 100644
--- a/src/rpc/virnetsocket.c
+++ b/src/rpc/virnetsocket.c
@@ -543,10 +539,10 @@ int virNetSocketNewConnectUNIX(const char *path,
const char *binary,
virNetSocketPtr *retsock)
{
- char *binname = NULL;
- char *pidpath = NULL;
- int fd, passfd = -1;
- int pidfd = -1;
+ char *lockpath = NULL;
+ int lockfd = -1;
+ int fd = -1;
+ int retries = 100;
virSocketAddr localAddr;
virSocketAddr remoteAddr;
@@ -561,6 +557,22 @@ int virNetSocketNewConnectUNIX(const char *path,
return -1;
}
+ if (spawnDaemon) {
+ if (virPidFileConstructPath(false, NULL, "libvirt-lock",
&lockpath) < 0)
+ goto error;
Here the lock gets named: .cache/libvirt/libvirt-lock.pid, even though it does not
contain a pid and we could be using this function to spawn virlockd too, not just
libvirtd.
Can you move the last_component call here and name it "%s.lock",
binname?
Jan