Re: [libvirt] Revisiting qemu monitor timeout on VM start

Thursday, 9 March 2017

On 03/08/2017 10:19 PM, Jim Fehlig wrote:
...
 Hi All,

 Encountering a qemu monitor timeout when starting a VM has been discussed here
 before, e.g.

 https://www.redhat.com/archives/libvir-list/2014-January/msg00060.html
 https://www.redhat.com/archives/libvir-list/2014-January/msg00408.html

 Recently I've received reports of the same when starting large memory VMs backed
 by 1G huge pages. In one of the reports, Matt timed how long it takes to
 allocate 402GB worth of hugetlbfs pages (these are 1G pages, but the time is
 similar for 2M):

 real 105.47
 user 0.05
 sys 105.42

 The time is spent entirely in the kernel zeroing pages and, as you can see, it
 exceeds the 30 second monitor timeout in libvirt. Do folks have any suggestions
 on how to avoid the timeout?

 Obviously one solution is to introduce a knob in qemu.conf to control the
 timeout, as was proposed in the above threads. Another solution that came to
 mind is changing qemu to open the monitor earlier, making it available while the
 kernel is off scrubbing pages. I'm not familiar enough with qemu code to know if
 such a change is possible, but given the amount of initialization done in main()
 prior to calling mon_init_func(), my confidence in this idea is low. Perhaps
 someone more familiar with qemu initialization can comment on that. Thanks in
 advance for comments on these ideas or alternate proposals!

As suggested in one of the threads, the ideal solution would be for
libvirt to create the unix socket and then just pass it to qemu
during exec(). This way there would be no need for a timeout. On the other
hand, this approach obviously requires some work on the qemu side too, and
I'm not sure a) how much, or b) whether anybody is working on it.
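
To illustrate why a pre-created socket removes the need for a timeout, here
is a minimal sketch in Python. It is not libvirt or qemu code: the child
process is only a stand-in for qemu, and the names
start_with_preopened_monitor, CHILD_CODE and child_delay are made up for this
example. The parent creates and listens on the unix socket before spawning
the child, then hands the descriptor over across exec(). Because the socket
is already listening, the parent's connect() succeeds right away, however
long the child spends initialising before it calls accept().

```python
import os
import socket
import subprocess
import sys
import tempfile

# Stand-in for qemu (hypothetical): wraps the inherited listening fd,
# simulates slow start-up, then accepts the pending connection.
CHILD_CODE = """
import socket, sys, time
listener = socket.socket(fileno=int(sys.argv[1]))
time.sleep(float(sys.argv[2]))  # e.g. the kernel zeroing huge pages
conn, _ = listener.accept()
conn.sendall(b"monitor ready")
conn.close()
"""


def start_with_preopened_monitor(child_delay=0.5):
    """Create the monitor socket in the parent and pass it to the child."""
    tmpdir = tempfile.mkdtemp()
    sock_path = os.path.join(tmpdir, "monitor.sock")

    # The parent owns socket creation, so the path exists and is
    # listening before the child is even started.
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(sock_path)
    listener.listen(1)
    fd = listener.fileno()

    # pass_fds keeps this descriptor open, under the same number,
    # across the exec() of the child.
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, str(fd), str(child_delay)],
        pass_fds=[fd],
    )
    listener.close()  # the child holds its own copy now

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(sock_path)  # queued in the backlog; no polling for the path
    greeting = client.recv(64)  # blocks until the child gets around to accept()
    client.close()

    proc.wait()
    os.unlink(sock_path)
    os.rmdir(tmpdir)
    return greeting


if __name__ == "__main__":
    print(start_with_preopened_monitor())
```

The parent never has to guess how long to wait for the socket to appear; it
only blocks on the first read, and a dead child shows up as end-of-file
rather than as an expired timer. Whether qemu can accept a monitor socket
this way is exactly the open question above.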

If we introduced a timeout setting now (say in qemu.conf), we would
be unable to honour it once the approach described above gets implemented,
since there would no longer be a timeout to configure.

Another workaround might be to raise the 30 second limit we currently
have hard coded in our sources. However, I'm not sure whether this is
upstream material or downstream material (e.g. if a distro aims to
support such large guests, it can carry a downstream-only patch that
increases the timeout to, say, 2 minutes; this might be undesirable for
upstream).

Michal
