On Wed, Mar 20, 2024 at 08:43:24AM -0700, Andrea Bolognani wrote:
On Wed, Mar 20, 2024 at 12:37:37PM +0100, Peter Krempa wrote:
> On Wed, Mar 20, 2024 at 10:19:11 +0100, Andrea Bolognani wrote:
> > +# libvirt will normally prevent migration if the storage backing the VM is not
> > +# on a shared filesystems. Sometimes, however, the storage *is* shared despite
> > +# not being detected as such: for example, this is the case when one of the
> > +# hosts involved in the migration is exporting its local storage to the other
> > +# one via NFS.
> > +#
> > +# Any directory listed here will be assumed to live on a shared filesystem,
> > +# making migration possible in scenarios such as the one described above.
> > +#
> > +# If you need this feature, you probably want to set remember_owner=0 too.
>
> Could you please elaborate why you'd want to disable owner remembering?
> With remote filesystems this works so I expect that if this makes
> certain paths behave as shared filesystems, they should behave such
> without any additional tweaks

To be quite honest I don't remember exactly why I've added that, but
I can confirm that if remember_owner=0 is not used on the destination
host then migration will stall for a bit and then fail with

  error: unable to lock /var/lib/libvirt/swtpm/.../tpm2/.lock for
  metadata change: Resource temporarily unavailable

Things work fine if swtpm is not involved. I'm going to dig deeper,
but my guess is that, just like the situation addressed by the last
patch, having an additional process complicates things compared to
when we need to worry about QEMU only.

I've managed to track this down, and I wasn't far off.

The issue is that, when remember_owner is enabled, we perform a bunch
of additional locking on files around the actual relabel operation;
specifically, we call virSecurityManagerTransactionCommit() with the
last argument set to true.

This doesn't seem to cause any issues in general, *except* when it
comes to swtpm's storage lock. The swtpm process holds this lock
while it's running, and only releases it once migration is triggered.
So, when we're about to start the target swtpm process and want to
prepare the environment by setting up labels, we try to acquire the
storage lock, and can't proceed because the source swtpm process is
still holding on to it.

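To illustrate the contention described above, here is a minimal standalone sketch, not libvirt or swtpm code. It assumes both sides use POSIX record locks via fcntl(), which I have not verified for swtpm; the helper names try_write_lock() and demo_lock_contention() and the temporary file are made up for the demo. One process holds a write lock, and a second process attempting a non-blocking lock on the same file fails, typically with EAGAIN, which is the "Resource temporarily unavailable" seen in the error.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to take an exclusive, non-blocking POSIX record lock on the whole
 * file. Returns 0 on success, or the errno value on failure. */
static int
try_write_lock(int fd)
{
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0, /* 0 means "until end of file" */
    };

    if (fcntl(fd, F_SETLK, &fl) < 0)
        return errno;
    return 0;
}

/* The parent stands in for the process that already holds the lock; the
 * child stands in for the one trying to lock the same file afterwards.
 * Returns 1 if the child was refused with EAGAIN/EACCES, 0 if it got the
 * lock, 2 on any other child error, -1 on setup failure. */
static int
demo_lock_contention(void)
{
    char path[] = "/tmp/lock-demo-XXXXXX"; /* hypothetical demo file */
    int fd = mkstemp(path);
    int status = 0;
    int ret = -1;
    pid_t pid;

    if (fd < 0)
        return -1;

    if (try_write_lock(fd) != 0)
        goto cleanup;

    pid = fork();
    if (pid < 0)
        goto cleanup;

    if (pid == 0) {
        /* Record locks are not inherited across fork(), so the child
         * has to compete for the lock like any unrelated process. */
        int cfd = open(path, O_RDWR);
        int err = cfd < 0 ? -1 : try_write_lock(cfd);

        if (err == 0)
            _exit(0);
        _exit(err == EAGAIN || err == EACCES ? 1 : 2);
    }

    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        ret = WEXITSTATUS(status);

 cleanup:
    close(fd);
    unlink(path);
    return ret;
}
```

Calling demo_lock_contention() from a main() should return 1 on a local filesystem. Retrying for a while before giving up would not help as long as the holder keeps the lock, which would be consistent with migration stalling for a bit before failing.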
The hacky patch below makes migration work even when remember_owner
is enabled. Obviously I'd rewrite it so that we'd only skip locking
for incoming migration, but I wonder if there could be nasty side
effects to this...

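As a rough sketch of what the narrower check could look like, assuming the caller knows whether it is handling an incoming migration: the helper names has_suffix() and skip_metadata_lock() and the incoming_migration flag are hypothetical and don't exist in libvirt, and plain C string functions stand in for g_str_has_suffix() to keep the example self-contained.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if str ends with suffix. */
static bool
has_suffix(const char *str, const char *suffix)
{
    size_t len = strlen(str);
    size_t slen = strlen(suffix);

    return len >= slen && strcmp(str + len - slen, suffix) == 0;
}

/* Hypothetical predicate: skip metadata locking only for files named
 * ".lock", and only while preparing for an incoming migration. In every
 * other case the existing locking behavior is kept. */
static bool
skip_metadata_lock(const char *path, bool incoming_migration)
{
    return incoming_migration && has_suffix(path, "/.lock");
}
```

With this, skip_metadata_lock("/srv/demo/tpm2/.lock", true) is true, while the same path outside of incoming migration, or any path not ending in "/.lock", still goes through the normal locking. Matching on the file name alone remains a blunt instrument; tying the exception to the swtpm storage path specifically would probably be safer.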
Other ideas? Can we perhaps change things so that swtpm releases the
lock earlier upon our request or something like that?

diff --git a/src/security/security_manager.c b/src/security/security_manager.c
index fc2747c45e..2aa06eaec2 100644
--- a/src/security/security_manager.c
+++ b/src/security/security_manager.c
@@ -1356,6 +1356,9 @@ virSecurityManagerMetadataLock(virSecurityManager *mgr G_GNUC_UNUSED,
             continue;
         }

+        if (g_str_has_suffix(p, "/.lock"))
+            continue;
+
         if ((fd = open(p, O_RDWR)) < 0) {
             if (errno == EROFS) {
                 /* There is nothing we can do for RO filesystem. */

--
Andrea Bolognani / Red Hat / Virtualization