On 08/20/2013 04:46 PM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=822052
When doing a live migration, if the destination fails for any
reason after the point in which files should be labeled, then
the cleanup of the destination would restore the labels to their
defaults, even though the source is still trying to continue
running with the image open. Bug 822052 mentioned one source
of live migration failure - a mismatch in SELinux virt_use_nfs
settings (on for source, off for destination); but I found other
situations that would also trigger it (for example, having a
graphics device tied to port 5999 on the source, and a different
domain on the destination already using that port, so that the
destination cannot reuse the port).
In short, just as cleanup of the source on a successful migration
must not relabel files (because the destination would be crippled
by the relabel), cleanup of the destination on a failed migraion
must not relabel files (because the source would be crippled).
Note that while debugging this, I ran across a bigger design problem.
This patch works as long as source and destination are configured
equally, or where the difference (such as having selinux virt_use_nfs
settings distinct) can be detected immediately upon the attempt to start
qemu on the destination. A much more insidious case occurs if you have
a shared disk image on NFS, but where you have a configuration mismatch
between source and destination. In my case, the mismatch was caused by
running Fedora's pre-packaged libvirtd (which is built using qemu:qemu
as the default user/group for qemu processes) on the source, but a
self-built libvirtd (built with ./autogen.sh --system) on the
destination, and with no edits to /etc/libvirt/qemu.conf on either side.
Note that the default user is a configure option, and that if
/etc/libvirt/qemu.conf does NOT call out a user/group setting, then
libvirtd uses the configure defaults; and since './autogen.sh --system'
currently does not change the configure defaults, it ends up using
root:root to run qemu guests. This is probably a shortcoming of
'./autogen.sh --system' for not matching what the rpm does, but fixing
that is a separate patch.
The problem at hand is that when migrating between different
configurations, the destination first attempts to label files before
even starting qemu - and does a chown() to root:root. The failure case
described above for matched configurations occurred when the destination
did a chown() to root:root as part of the cleanup after failure to
migrate, but with my mismatched configuration, the chown() happened much
earlier. One of the (numerous) annoyances of NFS is that a chown() can
kill access of existing open fds (on a POSIX file system, once you have
an fd via open() or other means, you cannot lose rights to the inode
that the fd is visiting, no matter whether chown(), unlink(), or any
other number of things occur on the file that you originally opened in
the meantime; not so for NFS). But since the forced loss of access
isn't instantaneous, you now have a race - if live migration is fast
enough, the source can still get the guest migrated to the destination;
but if there is lots of I/O and memory churn in the guest, so that the
live migration takes long enough for the forced loss of access to kick
in, then the source will start seeing I/O errors, possibly causing
severe corruption to the guest.
We should really consider whether we need more framework for diagnosing
incompatible configurations between source and destination, and outright
rejecting migration where two machines do not share the same user/group
setup.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org