On Wed, Mar 20, 2024 at 09:10:48AM -0700, Andrea Bolognani wrote:
> On Wed, Mar 20, 2024 at 10:18:39AM -0400, Stefan Berger wrote:
> > On 3/20/24 08:23, Peter Krempa wrote:
> > > Did you consider the case when the migration fails and the VM will be
> > > restored to run on the source host again? In such case doing the
> > > relabelling might break the source host.
> >
> > Right. I seem to remember testing such scenarios. I had to put an exit() (or
> > something like it) into swtpm on the destination side to trigger the
> > fallback to the source side. The swtpm on the source side had closed file
> > access and wants to open them (lockfile) again and so the files needed to be
> > labeled correctly if the storage on the source side is
> > on the disk and exported via NFS from there (iirc). If the storage is
> > NFS-exported from a 3rd host it probably would not require the labels.
>
> I didn't really consider the failure scenario, so thank you for
> bringing that up.
>
> I think it would be still fine. If the source has NFS storage, then
> access will keep working regardless of what relabeling the
> destination has been up to in the meantime. And if the source has
> local storage, then the relabeling on the destination (via NFS) will
> not actually have touched the SELinux labels.
>
> The only concern I have is that, when going from local to NFS, labels
> might have been restored on the source side. But I assume that
> restoring only happens once the migration has been confirmed as
> successful? I'll check.
>
> Once again, as far as I can tell (please let me know if I'm wrong!)
> there is no special casing when it comes to disks and other types of
> persistent storage, so if this approach was problematic I would have
> expected many issues to have been reported by now.

I've tested this to confirm. My trick for simulating a migration
failure was to add

  <qemu:commandline>
    <qemu:arg value='-machine'/>
    <qemu:arg value='pc-i440fx-6.0'/>
  </qemu:commandline>

to the migration XML, where the VM uses a different machine type in
its configuration. This results in something like

  process exited while connecting to monitor: qemu-system-x86_64:
  -device
  {"driver":"pcie-root-port","port":8,"chassis":1,"id":"pci.1",\
  "bus":"pcie.0","multifunction":true,"addr":"0x1"}:
  Bus 'pcie.0' not found

which should be a decent enough proxy for the kind of "we went as far
as attempting to start the QEMU process on the destination host, then
things went sideways" failure that we'd experience as a consequence
of a permission error. Suggestions on how to improve the test
methodology are very much appreciated :)
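
In case anyone wants to reproduce this, here is a rough sketch of how
the override could be injected into a domain XML programmatically. It
is not what I actually ran (I edited the file by hand). The function
name add_machine_override and the trimmed sample XML are made up for
illustration, and the sketch assumes the input has no existing
<qemu:commandline> element, since it always appends a new one. It also
assumes the XML was obtained with something like
"virsh dumpxml --migratable" and is then passed to "virsh migrate --xml".

```python
import xml.etree.ElementTree as ET

# Namespace libvirt uses for QEMU command line passthrough elements.
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"
ET.register_namespace("qemu", QEMU_NS)


def add_machine_override(domain_xml: str, machine: str) -> str:
    """Return domain_xml with a <qemu:commandline> forcing -machine.

    Hypothetical helper: it always appends a new element and does not
    merge with a <qemu:commandline> that may already be present.
    """
    root = ET.fromstring(domain_xml)
    cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")
    for value in ("-machine", machine):
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": value})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Heavily trimmed, made-up domain XML, for illustration only.
    sample = "<domain type='kvm'><name>guest</name></domain>"
    print(add_machine_override(sample, "pc-i440fx-6.0"))
```

The idea is only to make the destination QEMU fail at startup, so any
machine type that conflicts with the guest's configured one should do.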

Anyway, based on what I've seen I think I can confirm my initial
intuition as reported above. Things work the way you'd expect, in that
upon migration failure the VM keeps happily chugging along on the
source host.

--
Andrea Bolognani / Red Hat / Virtualization