On Fri, Sep 25, 2009 at 06:47:07PM -0500, Charles Duffy wrote:
Howdy, all.
I maintain a test infrastructure which makes heavy use of virDomainSave
and virDomainRestore, and have been seeing occasional cases where my
saved images are for some reason not restored correctly -- and, indeed,
the incoming migration streams are not even read in their entirety.
While this generally appears to be caused by issues outside of libvirt's
purview, one unfortunate issue is that libvirt can report success
performing a restore even when the operation is effectively an abject
failure.
Consider the following snippet, taken from one of my
/var/log/libvirt/qemu/<domain>.log files:
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin USER=root LOGNAME=root
/usr/bin/qemu-kvm -S -M pc-0.11 -m 512 -smp 1 <...lots of arguments
here...> -incoming exec:cat
cat: write error: Broken pipe
This leaves a running qemu hosting a catatonic guest -- but the libvirt
client (connecting through the Python bindings) received a status of
success for the operation given here.
Urgh, more QEMU badness. QEMU spawned 'cat', so if 'cat' exits with a
non-zero exit status, QEMU should see that it failed and thus exit
itself, rather than pretending everything was OK with migration.
The flaw in QEMU is depressingly obvious
static int stdio_pclose(void *opaque)
{
    QEMUFileStdio *s = opaque;
    pclose(s->stdio_file);
    qemu_free(s);
    return 0;
}
Notice how it completely discards the exit status returned by
pclose() and just pretends everything always worked :-(
If this was handling errors correctly, you'd at least see QEMU
exiting rather than hanging around broken.
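
For illustration only, here is a rough sketch of what propagating the
child's exit status could look like. This is not the actual QEMU fix:
PipeFile, pipe_file_close() and run_and_close() are hypothetical names
made up for this example, and plain free() stands in for qemu_free().
It only shows that the value returned by pclose() can be decoded with
WIFEXITED()/WEXITSTATUS() and turned into an error return.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical stand-in for QEMUFileStdio; not the real QEMU struct. */
typedef struct {
    FILE *stdio_file;
} PipeFile;

/*
 * Close the pipe and report how the child ended.
 * Returns 0 only if the child exited with status 0; returns -1 if
 * pclose() itself failed, the child exited non-zero, or the child
 * was killed by a signal.
 */
static int pipe_file_close(void *opaque)
{
    PipeFile *s = opaque;
    int status = pclose(s->stdio_file);

    free(s);
    if (status == -1)
        return -1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}

/*
 * Hypothetical helper for trying the above: run a shell command via
 * popen(), drain its output, and return the result of the close.
 */
static int run_and_close(const char *cmd)
{
    char buf[256];
    PipeFile *s = malloc(sizeof(*s));

    if (s == NULL)
        return -1;
    s->stdio_file = popen(cmd, "r");
    if (s->stdio_file == NULL) {
        free(s);
        return -1;
    }
    /* Read until EOF so the child can run to completion. */
    while (fread(buf, 1, sizeof(buf), s->stdio_file) > 0)
        ;
    return pipe_file_close(s);
}
```

With something along these lines, a command that exits non-zero makes
the close return an error instead of 0, so the caller at least has the
chance to treat the incoming migration as failed.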
libvirt's mechanism for validating a successful restore consists of
running a "cont" command on the guest, and then checking
virGetLastError(); AIUI, it is expected that the "cont" will not be able
to run until the restore is completed, as the monitor should not be
responsive until that time. Browsing through qemudMonitorSendCont (and
qemudMonitorCommandWithHandler, which it calls), I don't see anything
which looks at the log file with the stderr output from qemu to
determine whether an error actually occurred. (As an aside, "info
history" on the guest's monitor socket indicates that it was indeed
issued this "cont").
Hmm, this does look problematic - we need the monitor to be responsive
in order to do things like CPU pinning. We need the monitor to be
non-responsive to ensure 'cont' doesn't run until migration has finished.
We can't have it both ways, and the former wins since we need that to be
done before ever letting QEMU start allocating guest RAM pages. So relying
on 'cont' to block is not good. Is the 'cont' even necessary? I remember
seeing somewhere that QEMU unconditionally starts its CPUs after an
incoming migration finishes.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|