Re: [PATCH 09/10] qemu: Always set labels for TPM state

18 Apr 2024


On 4/17/24 11:20, Andrea Bolognani wrote:
...
On Wed, Mar 20, 2024 at 09:10:48AM -0700, Andrea Bolognani wrote:
...
On Wed, Mar 20, 2024 at 10:18:39AM -0400, Stefan Berger wrote:
...
On 3/20/24 08:23, Peter Krempa wrote:
...
Did you consider the case when the migration fails and the VM will be
restored to run on the source host again? In such a case doing the
relabelling might break the source host.
Right. I seem to remember testing such scenarios. I had to put an exit() (or
something like it) into swtpm on the destination side to trigger the
fallback to the source side. The swtpm on the source side had closed its
files and wanted to open them (lockfile) again, so the files needed to be
labeled correctly if the storage on the source side is on a local disk and
exported via NFS from there (iirc). If the storage is NFS-exported from a
3rd host it probably would not require the labels.
I didn't really consider the failure scenario, so thank you for
bringing that up.
I think it would still be fine. If the source has NFS storage, then
access will keep working regardless of what relabeling the
destination has been up to in the meantime. And if the source has
local storage, then the relabeling on the destination (via NFS) will
not actually have touched the SELinux labels.
The only concern I have is that, when going from local to NFS, labels
might have been restored on the source side. But I assume that
restoring only happens once the migration has been confirmed as
successful? I'll check.
Once again, as far as I can tell (please let me know if I'm wrong!)
there is no special casing when it comes to disks and other types of
persistent storage, so if this approach was problematic I would have
expected many issues to have been reported by now.
I've tested this to confirm. My trick for simulating a migration
failure was to add
  <qemu:commandline>
    <qemu:arg value='-machine'/>
    <qemu:arg value='pc-i440fx-6.0'/>
  </qemu:commandline>
to the migration XML, where the VM uses a different machine type in
its configuration. This results in something like
process exited while connecting to monitor: qemu-system-x86_64:
   -device {"driver":"pcie-root-port","port":8,"chassis":1,"id":"pci.1",\
   "bus":"pcie.0","multifunction":true,"addr":"0x1"}: Bus 'pcie.0' not found
which should be a decent enough proxy for the kind of "we went as far
as attempting to start the QEMU process on the destination host, then
things went sideways" failure that we'd experience as a consequence
of a permission error. Suggestions on how to improve the test
methodology are very much appreciated :)
Anyway, based on what I've seen I think I can confirm my initial
intuition as reported above. Things work the way you expect, in that
upon migration failure the VM keeps happily chugging along on the
source host.
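
For anyone wanting to reproduce the test, the XML tweak described above
could be scripted roughly as follows. This is only a sketch: the helper
name add_machine_override is made up for illustration, and it assumes
the input is a libvirt domain XML string. The only libvirt-specific
detail it relies on is the QEMU command line passthrough namespace. How
the resulting XML is handed to the migration is left out.

```python
import xml.etree.ElementTree as ET

# Namespace libvirt uses for <qemu:commandline> passthrough elements.
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"


def add_machine_override(domain_xml: str, machine: str) -> str:
    """Return domain_xml with '-machine <machine>' appended as qemu:arg
    elements (hypothetical helper, not part of libvirt)."""
    # Make the serializer emit the conventional 'qemu:' prefix.
    ET.register_namespace("qemu", QEMU_NS)
    root = ET.fromstring(domain_xml)

    # Reuse an existing <qemu:commandline> if the domain already has one.
    cmdline = root.find(f"{{{QEMU_NS}}}commandline")
    if cmdline is None:
        cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")

    for value in ("-machine", machine):
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": value})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_machine_override(
        "<domain type='kvm'><name>demo</name></domain>", "pc-i440fx-6.0"))
```

Picking a machine type that conflicts with the rest of the configuration
is what makes QEMU exit on the destination, as in the error shown above.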
If a file is labeled on the NFS server side, then the current code
doesn't relabel the state file on the server side on outgoing
migration, because of the possible fallback in case of error. Since
NFS, at least, does not support SELinux labels (and security xattrs),
we are fine on the server side if the fallback happens, whatever was
done with SELinux labeling on the client side (which gets EOPNOTSUPP),
because the label will not be changed on the server side. If there
were a shared filesystem that supported SELinux labels and propagated
them, we'd be in trouble, at least for the fallback scenario, right? I
don't know whether any shared filesystem would ever share security
xattrs across the network, though, such that this could become a
problem.
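
To make the EOPNOTSUPP behaviour concrete, here is a minimal sketch of
what "the client tries to label, the filesystem refuses, nothing changes
on the server" looks like from userspace. It is not libvirt's security
driver code: the function name try_set_label and the injectable setxattr
parameter are made up for illustration. Only os.setxattr (Linux-only)
and the "security.selinux" xattr name are real.

```python
import errno
import os

SELINUX_XATTR = "security.selinux"


def try_set_label(path, label, setxattr=None):
    """Try to set an SELinux label on path.

    Returns True if the xattr was written, False if the filesystem does
    not support security xattrs (EOPNOTSUPP / ENOTSUP), in which case
    the label stored on the server is left untouched. Any other error
    is propagated to the caller.
    """
    if setxattr is None:
        setxattr = os.setxattr  # Linux-only; looked up lazily
    try:
        setxattr(path, SELINUX_XATTR, label.encode())
    except OSError as exc:
        if exc.errno in (errno.ENOTSUP, errno.EOPNOTSUPP):
            return False
        raise
    return True
```

A caller that gets False back knows the relabel was a no-op, which is
the case where the fallback to the source host is safe. A shared
filesystem that did propagate labels would return True here, and that is
exactly the situation in which the fallback scenario would need a closer
look.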