On 02/15/2017 03:43 AM, Blair Bethwaite wrote:
On 15 February 2017 at 00:57, Daniel P. Berrange
<berrange(a)redhat.com> wrote:
> What is the actual error you're getting during startup?
# virsh -d0 start instance-0000037c
start: domain(optdata): instance-0000037c
start: found option <domain>: instance-0000037c
start: <domain> trying as domain NAME
error: Failed to start domain instance-0000037c
error: monitor socket did not show up: No such file or directory
Full libvirtd debug log at
https://gist.github.com/bmb/08fbb6b6136c758d027e90ff139d5701
On 15 February 2017 at 00:47, Michal Privoznik <mprivozn(a)redhat.com> wrote:
> I don't think I understand this. Who is running the other job? I mean,
> I'd expect qemu to fail to create the socket, thus hitting the 30s
> timeout in qemuMonitorOpenUnix().
Yes, you're right. I just blindly started looking for 30s constants in
the code, and that one seemed the most obvious, but I had not yet traced
it all the way back to the domain start job or checked the debug logs,
sorry. Looking a bit more carefully, I see the real issue is in
src/qemu/qemu_monitor.c:
    static int
    qemuMonitorOpenUnix(const char *monitor, pid_t cpid)
    {
        struct sockaddr_un addr;
        int monfd;
        int timeout = 30; /* In seconds */
Is this safe to increase? Is there any reason to keep it at 30s, given
that (from what I'm seeing on a fast 2-socket Haswell system)
hugepage-backed guests with more than ~160GB of memory cannot start in
that time?
I recall a similar discussion taking place in the past, but I cannot
find it now. I think the problem was that the kernel zeroes the pages
on huge page allocation. Anyway, this timeout used to be 3 seconds, and
only in fe89b687a0 was it changed to 30 seconds.
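
To put rough numbers on this, here is a back-of-the-envelope sketch in C.
It assumes startup time scales linearly with guest memory and takes the
one data point reported in this thread (~160GB just fitting into 30s) as
the rate. Both the linear model and the helper name are assumptions made
for illustration, not measurements or libvirt code.

```c
/* Data point reported in this thread: ~160GB barely fits in the 30s window. */
#define OBSERVED_MEM_GB   160.0
#define OBSERVED_SECONDS   30.0

/*
 * Hypothetical helper: estimate how long a guest with mem_gb of
 * hugepage-backed memory needs before its monitor socket appears,
 * ASSUMING the time is dominated by page zeroing at a constant rate.
 */
double estimate_start_seconds(double mem_gb)
{
    double rate_gb_per_s = OBSERVED_MEM_GB / OBSERVED_SECONDS; /* ~5.3 GB/s */

    return mem_gb / rate_gb_per_s;
}
```

Under that assumption a 320GB guest would need about 60 seconds and a
1024GB guest about 192 seconds, so any fixed constant only moves the
ceiling.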
We can increase the limit, but that would solve just this case until
somebody tries to assign even more RAM to their domain. What if we made
this configurable instead? We could add yet another variable to
qemu.conf, defaulting to 30, that specifies how long libvirt should wait
for the qemu monitor to show up.
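
A minimal sketch of what a caller-supplied timeout could look like, in
plain C. This is not the libvirt implementation: wait_for_monitor(), its
timeout_s parameter, and the 200ms retry interval are hypothetical names
and values chosen for illustration, and reading the value from qemu.conf
is left out entirely. It only shows the retry loop taking its limit from
an argument instead of a hard-coded constant.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <time.h>
#include <unistd.h>

#define RETRY_INTERVAL_MS 200   /* assumed polling interval */

/*
 * Hypothetical helper: connect to the UNIX socket at `path`, retrying
 * while the socket does not exist yet or refuses connections, for up to
 * `timeout_s` seconds. Returns a connected fd, or -1 with errno set
 * (ETIMEDOUT if the socket never showed up).
 */
int wait_for_monitor(const char *path, unsigned int timeout_s)
{
    struct sockaddr_un addr;
    struct timespec delay = { 0, RETRY_INTERVAL_MS * 1000000L };
    unsigned long waited_ms = 0;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    for (;;) {
        int saved_errno;
        /* Use a fresh socket per attempt so a failed connect() cannot
         * leave the descriptor in an unspecified state. */
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        if (fd < 0)
            return -1;
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return fd;

        saved_errno = errno;
        close(fd);

        /* Only "not there yet" errors are worth retrying. */
        if (saved_errno != ENOENT && saved_errno != ECONNREFUSED) {
            errno = saved_errno;
            return -1;
        }
        if (waited_ms >= (unsigned long)timeout_s * 1000UL) {
            errno = ETIMEDOUT;
            return -1;
        }
        nanosleep(&delay, NULL);
        waited_ms += RETRY_INTERVAL_MS;
    }
}
```

A caller would pass the configured value, e.g. wait_for_monitor(path, 30)
to keep today's behaviour, or a larger number on hosts that start very
large hugepage-backed guests.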
Frankly, I both like and dislike this approach. We already have too
many variables in qemu.conf because that's our answer to problems like
these: we don't know the right value, so we offload the setting to the
sysadmin.
Michal