On Mon, Dec 30, 2013 at 07:23:00PM +0200, Pavel Fux wrote:
Hi,
I am talking about *Bug
987088*<https://bugzilla.redhat.com/show_bug.cgi?id=987088>,
I have added a patch and a description of the fix, this is a copy of my
comment on the bug:
the default code behavior is wait for 3 seconds and if the socket is not
opened yet, print this error and terminate.
the code is in src/qemu/qemu_monitor.c in function qemuMonitorOpenUnix.
In 2009 there was a patch that added the original 3 seconds retry, the
patch can be found here:
http://www.redhat.com/archives/libvir-list/2009-July/msg00335.html
I have added a patch with this solution:
the default behavior stays the same, but a user can add a configuration
variable to qemu.conf and change the timeout value.
every system needs a different value according to their system
configuration but anyway 3 seconds is not suitable for all cases.
It would be nice to have brief info about this issue in the commit
message of the patch...
I am attaching my patch.
... and make it in a formal patch by git-format-patch(1).
Pavel Fux.
From 89063e242b46a781b7416a5d395ece9afea635a5 Mon Sep 17 00:00:00
2001
From: Pavel Fux <pavel(a)stratoscale.com>
Date: Thu, 19 Dec 2013 15:20:57 +0200
Subject: [PATCH] Add monitor timeout configuration
---
src/qemu/qemu.conf | 7 +++++++
src/qemu/qemu_conf.c | 2 ++
src/qemu/qemu_conf.h | 2 ++
src/qemu/qemu_monitor.c | 13 +++++++++++++
4 files changed, 24 insertions(+)
diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
index 0f0a24c..6ba6207 100644
--- a/src/qemu/qemu.conf
+++ b/src/qemu/qemu.conf
@@ -409,3 +409,10 @@
# Defaults to -1.
#
#seccomp_sandbox = 1
+
+
+#If you sometimes get the message "failed to connect to monitor socket"
According to the bug description, the message is different and it
would be nice of us to match the error message in the description of
this option in order not to confuse users.
+#that could be because qemu did not wait enough time, you can try
increasing
+#this timeout, the default is 3 seconds
+#
+#monitor_socket_open_timeout = 30
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 7c3f317..4f35801 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -520,6 +520,8 @@ int virQEMUDriverConfigLoadFile(virQEMUDriverConfigPtr cfg,
GET_VALUE_LONG("seccomp_sandbox", cfg->seccompSandbox);
+ GET_VALUE_LONG("monitor_socket_open_timeout",
cfg->monitorSocketOpenTimeout);
+
ret = 0;
cleanup:
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index 77d3d2f..625e69d 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -148,6 +148,8 @@ struct _virQEMUDriverConfig {
unsigned int keepAliveCount;
int seccompSandbox;
+
+ int monitorSocketOpenTimeout;
};
/* Main driver state */
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 1b1d4a1..b5deb6b 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -29,6 +29,7 @@
#include <fcntl.h>
#include "qemu_monitor.h"
+#include "qemu_conf.h"
#include "qemu_monitor_text.h"
#include "qemu_monitor_json.h"
#include "virerror.h"
@@ -47,6 +48,8 @@
#define DEBUG_IO 0
#define DEBUG_RAW_IO 0
+extern virQEMUDriverPtr qemu_driver;
+
We prefer passing the driver as an argument instead.
struct _qemuMonitor {
virObjectLockable parent;
@@ -252,6 +255,16 @@ qemuMonitorOpenUnix(const char *monitor, pid_t cpid)
int monfd;
int timeout = 3; /* In seconds */
int ret, i = 0;
+ virQEMUDriverConfigPtr cfg = NULL;
+
+ if (qemu_driver != NULL){
+ cfg = virQEMUDriverGetConfig(qemu_driver);
+ if (cfg->monitorSocketOpenTimeout > 3){
Last, but not least, I don't see a point in allowing the value to be
only larger. Making it an unsigned int with '0' meaning "the default"
would make it more understandable, IMHO.
+ timeout = cfg->monitorSocketOpenTimeout;
+ }
+ virObjectUnref(cfg);
+ cfg = NULL;
+ }
if ((monfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
virReportSystemError(errno,
--
1.8.3.1
Other than that, I think it looks ok, I'm just unsure whether this
approach wasn't discussed and dismissed earlier.
Thanks,
Martin