On 22.08.2011 20:31, Daniel P. Berrange wrote:
On Mon, Aug 22, 2011 at 05:33:12PM +0100, Daniel P. Berrange wrote:
> On Mon, Aug 22, 2011 at 09:29:56AM -0600, Eric Blake wrote:
>> On 08/22/2011 09:21 AM, Daniel P. Berrange wrote:
>>> If we had a separate API for sending 'quit' on the monitor, then the
>>> mgmt app can decide how long to wait for the graceful shutdown of QEMU
>>> before resorting to the hard virDomainDestroy command. If the app knows
>>> that there is high I/O load, then it might want to wait for 'quit' to
>>> complete longer than normal to allow enough time for I/O flush.
>>
>> Indeed - that is exactly what I was envisioning with a
>> virDomainShutdownFlags() call with a flag to request to use the quit
>> monitor command instead of the default ACPI injection. The
>> virDomainShutdownFlags() would have no timeout (it blocks until
>> successful, or returns failure with no 'quit' command attempted),
>> and the caller can inject their own unconditional virDomainDestroy()
>> at whatever timeout they think is appropriate.
>
> The virDomainShutdown API is really about guest initiated graceful
> shutdown. Sending the 'quit' command to QEMU is still *ungraceful*
> as far as the guest OS is concerned, so I think it is best not to
> leverage the Shutdown API for 'quit'.
>
> I think this probably calls for a virDomainQuit API.
Actually this entire thread is on the wrong path.
Both the monitor 'quit' command and 'SIGTERM' in QEMU do exactly the
same thing: an immediate stop of the guest, but with a graceful shutdown
of the QEMU process[1].
In theory there is a difference that sending a signal is asynchronous
and 'quit' is a synchronous command, but in practice this is not
relevant, since while execution of the 'quit' command is synchronous,
this command only makes the *request* to exit. QEMU won't actually
exit until the event loop processes the request later.
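
To make the point above concrete, here is a minimal C sketch of the
"request now, exit later" pattern. It is not QEMU's code: every name in it
(request_shutdown, monitor_quit, main_loop_should_exit, and so on) is a
hypothetical stand-in, and the only thing taken from this thread is the
idea that the signal handler and the 'quit' handler both just record a
request that the event loop acts on later.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for the process-wide "please exit" flag. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Shared by both paths: records the request, does not exit. */
static void request_shutdown(void)
{
    shutdown_requested = 1;
}

/* Asynchronous path: SIGTERM handler. */
static void sigterm_handler(int signum)
{
    (void)signum;
    request_shutdown();
}

/* Synchronous path: hypothetical 'quit' command handler. It returns
 * immediately, but the process is still running when it returns. */
static int monitor_quit(void)
{
    request_shutdown();
    return 0;
}

/* Checked once per event loop iteration; the real exit happens here. */
static int main_loop_should_exit(void)
{
    return shutdown_requested != 0;
}

/* Installs the SIGTERM handler; returns 0 on success. */
static int install_sigterm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigterm_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Drives both paths and checks they leave the same state behind. */
static void run_demo(void)
{
    int rc = install_sigterm_handler();
    assert(rc == 0);

    rc = monitor_quit();                 /* 'quit' path */
    assert(rc == 0);
    assert(main_loop_should_exit());

    shutdown_requested = 0;
    rc = raise(SIGTERM);                 /* signal path */
    assert(rc == 0);
    assert(main_loop_should_exit());
    (void)rc;
}
```

In this sketch the caller of monitor_quit() gets a reply before anything
has exited, which is why the synchronous/asynchronous distinction buys
nothing in practice.
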
There is thus no point in us even bothering with sending 'quit' to
the QEMU monitor.
The virDomainDestroy method calls qemuProcessKill which sends SIGTERM,
waits a short while, then sends SIGKILL. It then calls qemuProcessStop
to reap the process. This also happens to call qemuProcessKill again
with SIGTERM, then SIGKILL.
We need to make this more controllable by apps, by making it possible
to send just the SIGTERM and not the SIGKILL. Then we can add a new
flag to virDomainDestroy to request this SIGTERM only behaviour. If
the guest does not actually die, the mgmt app can then just reinvoke
virDomainDestroy without the flag, to get the full SIGTERM+SIGKILL
behaviour we have today.
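
A rough sketch of the proposed split, in plain POSIX C rather than libvirt
code. The helper terminate_child() and its 'force' parameter are
hypothetical names standing in for qemuProcessKill and the proposed flag;
the 10ms poll interval is likewise an arbitrary assumption. It only works
on a direct child of the caller, since it reaps with waitpid().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not libvirt's qemuProcessKill.
 *
 * Sends SIGTERM to child 'pid' and polls for up to 'term_wait_ms'.
 * If the child is still alive and 'force' is true, sends SIGKILL and
 * blocks until the child is reaped.
 *
 * Returns 0 if the child was reaped, 1 if it is still running (only
 * possible when 'force' is false), -1 on error. */
static int terminate_child(pid_t pid, bool force, int term_wait_ms)
{
    pid_t r;

    assert(pid > 0);

    if (kill(pid, SIGTERM) < 0 && errno != ESRCH)
        return -1;

    for (int waited = 0; waited <= term_wait_ms; waited += 10) {
        struct timespec ts = { 0, 10 * 1000 * 1000 }; /* 10ms */

        r = waitpid(pid, NULL, WNOHANG);
        if (r == pid)
            return 0;           /* exited after SIGTERM */
        if (r < 0)
            return -1;
        nanosleep(&ts, NULL);
    }

    if (!force)
        return 1;               /* SIGTERM only: caller decides what next */

    if (kill(pid, SIGKILL) < 0 && errno != ESRCH)
        return -1;

    do {
        r = waitpid(pid, NULL, 0);
    } while (r < 0 && errno == EINTR);

    return r == pid ? 0 : -1;
}
```

A mgmt app would call terminate_child(pid, false, ...) first, wait as long
as it sees fit for I/O to flush, and only then call it again with force set
to true.
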
Sending a signal to the qemu process is just one part of destroying a
domain. What about the cleanup code (emitting the event, audit log,
removing a transient domain, ...)? Can I rely on the monitor EOF handling
code for that? What should the return value be when only SIGTERM is sent?
So there's no need for the QEMU monitor to be involved anywhere in
this.
Regards,
Daniel
[1] The SIGTERM handler and 'quit' command handler both end up just
calling qemu_system_shutdown_request().