On 08/18/2011 11:42 PM, Osier Yang wrote:
> Remember, that 'migrate' is a long-running async job command, and can
> be interrupted. That is, 'service libvirtd restart' is a legal action
> to take during step 3, and it is not as severe as a libvirtd crash,
> and we have already recently added patches to remember async job
> status across libvirtd restarts with the intention of making it legal
> to restart libvirtd in the middle of an async job (whether the async
> job should still succeed, or should remove the save file, is a
> slightly different question; but removing the save file would require
> that we save in the XML the name of the file to remove if libvirtd is
> restarted).
Hmm, what about restarting libvirtd while a managed save is in progress?
The domain will be restored from the corrupt save image automatically.
Do we simply report an error like "image is corrupt" and quit starting
the domain?
That might not be good, as one would see a running domain fail to start
after libvirtd restarts.
Or do we want the managed save to still succeed? If so, we might need to:
1) continue the managed save job (since we already support remembering
the async job status across libvirtd restarts)
2) restore from the saved image finished in 1).
I think the easiest approach is:
if we restart libvirtd, and see that an async job for save-to-file was
in progress, then we abort the job (leaving the file marked unfinished,
whether it was managed save or user save), and log the error.
On managed restore (virDomainCreate or autostart), if the save file
exists but is incomplete, then log the fact that the file is unusable,
then unlink() the file and proceed to do a normal boot (nothing we can
do to recover the lost autosave, but we can at least clean up on the
user's behalf).
On user restore (virDomainRestore), if the save file exists but is
incomplete, report the error to the user. No unlink(), and no rebooting
the guest; it's up to the user to decide how to handle the failed save.
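To make the two restore policies above concrete, here is a rough sketch in Python. It is not libvirt code: the 8-byte header markers, the Outcome names, and the restore() helper are all hypothetical, and it assumes an aborted save simply leaves the file carrying the "partial" marker. It also assumes that a missing file means a normal boot for a managed restore and an error for a user restore.

```python
import enum
import logging
import os

log = logging.getLogger("save-sketch")

# Hypothetical 8-byte header markers; NOT libvirt's actual on-disk format.
PARTIAL = b"PARTIAL!"    # written when a save starts
COMPLETE = b"COMPLETE"   # rewritten over PARTIAL once the save finishes


class Outcome(enum.Enum):
    RESTORED = "restored from save file"
    NORMAL_BOOT = "normal boot"
    ERROR = "error reported to caller"


def is_complete(path):
    """Return True only if the file starts with the COMPLETE marker."""
    with open(path, "rb") as f:
        return f.read(len(COMPLETE)) == COMPLETE


def restore(path, managed):
    """Decide what to do with a save file at restore time.

    managed=True models virDomainCreate/autostart (managed save);
    managed=False models an explicit virDomainRestore by the user.
    """
    if not os.path.exists(path):
        # Assumption: no managed save file just means a fresh boot.
        return Outcome.NORMAL_BOOT if managed else Outcome.ERROR
    if is_complete(path):
        return Outcome.RESTORED
    if managed:
        # The autosave is lost; clean up on the user's behalf and boot.
        log.warning("managed save %s is incomplete; removing it", path)
        os.unlink(path)
        return Outcome.NORMAL_BOOT
    # User-requested restore: report, but leave the file alone.
    log.error("save file %s is incomplete", path)
    return Outcome.ERROR
```

The point of the split is that only the managed case touches the file; in the user case the caller gets the error and decides what to do with the failed save.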
But if we can figure out how to do better, by making a libvirtd restart
able to complete the save process rather than ditch it, then that would
be nicer. It's just that I don't know how easy that would be, and we
have to start this patch somewhere.
--
Eric Blake eblake(a)redhat.com +1-801-349-2682
Libvirt virtualization library
http://libvirt.org