On 08/27/2013 11:06 AM, Richard W.M. Jones wrote:
> On Thu, Aug 22, 2013 at 03:09:06PM -0500, Anthony Liguori wrote:
>> Paolo Bonzini <pbonzini(a)redhat.com> writes:
>>> Also, a virtio watchdog device makes little sense, IMHO. PV makes sense
>>> if emulation has insufficient performance, excessive CPU usage, or
>>> excessive complexity. We already have both an ISA and a PCI watchdog,
>>> and they serve their purpose wonderfully.
>> Neither of which actually work with modern versions of Windows FWIW.
> Correct, although someone could write a driver!
>> Plus emulated watchdogs do not take into account steal time or
>> overcommit in general. I've seen multiple cases where a naive watchdog
>> has a problem in the field when the system is under heavy load.
> The watchdog devices in qemu run on guest time. However the watchdog
> *daemon* inside the guest probably does behave badly as you describe.
> Changing the device model isn't going to help this, but it would
> definitely make sense to fix the daemon (although I don't know how --
> is steal time exposed to guests?)
> I don't necessarily think a virtio-watchdog is a bad idea. For one
> thing it'd mean we would have a watchdog device that works on ARM.
> Rich.
I believe that a watchdog is not the way to go. You need host-side decision making.
Say the guest did not receive CPU, disk, or network resources for a lengthy period
of time, but the host knows that this is due to host resource availability. In such
cases you certainly do not want to reboot all the guests, especially since rebooting
50 Windows VMs could be a nightmare.
BTW, Windows guests disable some of their watchdogs when they detect the presence
of Hyper-V; we use that to overcome BSODs!
So the right solution is to send a heartbeat to a management application (using qemu-ga
or whatever), and let that application decide how to handle a missed one.
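
A rough sketch of the host-side decision I have in mind, in Python. Everything
named here (HeartbeatMonitor, the thresholds, the "wait"/"reset" actions) is made
up for illustration and is not an existing API; how the heartbeat actually reaches
the management application (qemu-ga or otherwise) is left out.

```python
import time
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical host-side policy: act on a silent guest only if the host is healthy."""

    timeout: float = 30.0         # assumed: seconds of silence before a guest is suspect
    host_load_limit: float = 0.9  # assumed: host utilisation above which silence is blamed on the host
    last_seen: dict = field(default_factory=dict)

    def beat(self, guest, now=None):
        """Record a heartbeat from a guest."""
        self.last_seen[guest] = time.monotonic() if now is None else now

    def decide(self, guest, host_load, now=None):
        """Return 'ok', 'wait', 'reset', or 'unknown' for one guest."""
        now = time.monotonic() if now is None else now
        seen = self.last_seen.get(guest)
        if seen is None:
            return "unknown"      # never heard from this guest
        if now - seen <= self.timeout:
            return "ok"           # heartbeat is recent enough
        if host_load >= self.host_load_limit:
            return "wait"         # host is starved, so do not punish the guest
        return "reset"            # host is fine, so the guest itself is likely stuck


monitor = HeartbeatMonitor()
monitor.beat("vm1", now=0.0)
print(monitor.decide("vm1", host_load=0.95, now=100.0))
print(monitor.decide("vm1", host_load=0.20, now=100.0))
```

The point is only that the timeout alone never triggers a reset; the host's own
state is consulted first, which an in-guest watchdog cannot do.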
Ronen.