Hi All,
Encountering a qemu monitor timeout when starting a VM has been discussed here
before, e.g.
https://www.redhat.com/archives/libvir-list/2014-January/msg00060.html
https://www.redhat.com/archives/libvir-list/2014-January/msg00408.html
Recently I've received reports of the same timeout when starting large-memory
VMs backed by 1G huge pages. In one of the reports, Matt timed how long it takes
to allocate 402GB worth of hugetlbfs pages (these are 1G pages, but the time is
similar for 2M pages):
real 105.47
user 0.05
sys 105.42
The time is spent entirely in the kernel zeroing pages and, as you can see, it
exceeds the 30-second monitor timeout in libvirt. Do folks have any suggestions
on how to avoid the timeout?
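
For a rough sense of scale, the figures above can be turned into a zeroing
rate and an estimate of how much memory could be allocated before the monitor
timeout expires. This is only back-of-the-envelope arithmetic: it assumes the
allocation time scales linearly with size, which the report does not confirm,
and the function names are mine, purely for illustration.

```python
# Back-of-the-envelope estimate from the numbers reported in this mail.
# Assumption: zeroing time scales linearly with the amount of memory.
ALLOC_GB = 402            # hugetlbfs memory allocated in Matt's test
ALLOC_SECONDS = 105.47    # "real" time from the timing output
MONITOR_TIMEOUT = 30      # libvirt monitor timeout, in seconds


def zeroing_rate(gb, seconds):
    """Average rate at which the kernel zeroed pages, in GB/s."""
    return gb / seconds


def max_gb_within(timeout, rate):
    """Largest allocation that would finish within the timeout, in GB."""
    return timeout * rate


rate = zeroing_rate(ALLOC_GB, ALLOC_SECONDS)
limit = max_gb_within(MONITOR_TIMEOUT, rate)
print(f"zeroing rate: {rate:.2f} GB/s")
print(f"fits within {MONITOR_TIMEOUT}s timeout: about {limit:.0f} GB")
```

On these numbers the rate works out to roughly 3.8 GB/s, so on comparable
hardware a guest with much more than about 114GB of huge page backing would be
expected to hit the timeout.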
Obviously one solution is to introduce a knob in qemu.conf to control the
timeout, as was proposed in the above threads. Another solution that came to
mind is changing qemu to open the monitor earlier, making it available while the
kernel is off scrubbing pages. I'm not familiar enough with qemu code to know if
such a change is possible, but given the amount of initialization done in main()
prior to calling mon_init_func(), my confidence in this idea is low. Perhaps
someone more familiar with qemu initialization can comment on that. Thanks in
advance for comments on these ideas or alternative proposals!
Regards,
Jim