On 3/26/24 12:38, Andrea Bolognani wrote:
On Tue, Mar 26, 2024 at 12:04:21PM -0400, Stefan Berger wrote:
>
>
> On 3/26/24 11:54, Andrea Bolognani wrote:
>> On Wed, Mar 20, 2024 at 08:43:24AM -0700, Andrea Bolognani wrote:
>>> On Wed, Mar 20, 2024 at 12:37:37PM +0100, Peter Krempa wrote:
>>>> On Wed, Mar 20, 2024 at 10:19:11 +0100, Andrea Bolognani wrote:
>>>>> +# libvirt will normally prevent migration if the storage backing the
VM is not
>>>>> +# on a shared filesystems. Sometimes, however, the storage *is*
shared despite
>>>>> +# not being detected as such: for example, this is the case when one
of the
>>>>> +# hosts involved in the migration is exporting its local storage to
the other
>>>>> +# one via NFS.
>>>>> +#
>>>>> +# Any directory listed here will be assumed to live on a shared
filesystem,
>>>>> +# making migration possible in scenarios such as the one described
above.
>>>>> +#
>>>>> +# If you need this feature, you probably want to set
remember_owner=0 too.
>>>>
>>>> Could you please elaborate why you'd want to disable owner
remembering?
>>>> With remote filesystems this works so I expect that if this makes
>>>> certain paths behave as shared filesystems, they should behave such
>>>> without any additional tweaks
>>>
>>> To be quite honest I don't remember exactly why I've added that, but
>>> I can confirm that if remember_owner=0 is not used on the destination
>>> host then migration will stall for a bit and then fail with
>>>
>>> error: unable to lock /var/lib/libvirt/swtpm/.../tpm2/.lock for
>>> metadata change: Resource temporarily unavailable
>>>
>>> Things work fine if swtpm is not involved. I'm going to dig deeper,
>>> but my guess is that, just like the situation addressed by the last
>>> patch, having an additional process complicates things compared to
>>> when we need to worry about QEMU only.
>>
>> I've managed to track this down, and I wasn't far off.
>>
>> The issue is that, when remember_owner is enabled, we perform a bunch
>> of additional locking on files around the actual relabel operation;
>> specifically, we call virSecurityManagerTransactionCommit() with the
>> last argument set to true.
>>
>> This doesn't seem to cause any issues in general, *except* when it
>> comes to swtpm's storage lock. The swtpm process holds this lock
>> while it's running, and only releases it once migration is triggered.
>> So, when we're about to start the target swtpm process and want to
>> prepare the environment by setting up labels, we try to acquire the
>> storage lock, and can't proceed because the source swtpm process is
>> still holding on to it.
>
> Who is 'we try to acquire the storage lock'? Is libvirt trying to acquire
> swtpm's storage lock? I would assume that only an instance of swtpm would
> acquire the lock.
Yes, it's libvirt doing that. The lock is only held for a brief
period while labels are being applied, and released immediately
afterwards. swtpm is only launched once that's done, so generally
there's no conflict.
Yes, I saw the code now. It kind of prevents lock files from being used.
In the migration case, things have worked so far because labeling in
general has been skipped for shared filesystem, which means that
locking was not performed either. This series changes things so that
labeling is necessary.
Thanks for the re-cap.
Does libvirt actually get involved in case of a migration failure and
fallback to the source host so that it could again relabel all files
before QEMU and swtpm (and possibly other virtual devices) again open
their files?
>> The hacky patch below makes migration work even when remember_owner
>> is enabled. Obviously I'd rewrite it so that we'd only skip locking
>> for incoming migration, but I wonder if there could be nasty side
>> effects to this...
>>
>> Other ideas? Can we perhaps change things so that swtpm releases the
>> lock earlier upon our request or something like that?
>
> Maybe below functiokn needs to be passed an array of exceptions, one being
> swtpm's lock file. I mean the lock file serves the purpose of locking via
> filesystem, so I don't think a third party should start interfering here and
> influencing the protocol ...
I guess the idea behind the locking is to prevent unrelated processes
(and possibly other libvirt threads?) from stepping on each other's
toes. Some exceptions are carved out already, so it's not like
there's no precedent for what we're suggesting. I'm just concerned
about accidentally opening the door for fun race conditions or CVEs.
Hm, maybe exceptions of filenames not to lock could be configured in the
config file instead of hard coded but with concrete names already given
in the sample config file so that users don't need to find out.
Some ideas about exceptions for files not to lock:
- there is a list of exception of files
- check whether a file is locked by a process with a given name in an
exception list (e.g., swtpm in /proc/<flock->l_pid>/comm ).