Stefan,
as you saw, I'm trying to implement support for migration with TPM state
on a shared volume. It works when the shared volume is an NFS mount point,
because NFS does not really propagate SELinux labels; instead there is the
'virt_use_nfs' sebool which effectively allows all svirt_t processes
(including swtpm) to access NFS. But things get trickier when a distributed
FS that properly supports SELinux labels (e.g. ceph) is used instead.
What I am currently struggling with is finding the sweet spot where the
source swtpm has let go of the state and the destination has not yet
accessed it (because if it did, it would get EPERM).
Bottom line: the SELinux label is generated dynamically on each guest
startup (to ensure its uniqueness on the system). Therefore, the label
on the destination is different from the one on the source.
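For illustration (the MCS categories below are made up), the TPM state on
the shared volume might carry a label like
system_u:object_r:svirt_image_t:s0:c57,c256 matching the source QEMU/swtpm,
while the destination generates e.g.
system_u:object_r:svirt_image_t:s0:c123,c642 - so any access from the
destination before relabeling is denied.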
The behavior I'm seeing now is:
1) the source starts migration:
{"execute":"migrate","arguments":{"detach":true,"resume":false,"uri":"fd:migrate"},"id":"libvirt-428"}
2) the destination does not touch the swtpm state right away, but it is
accessed as soon as the TPM state arrives in the migration stream. Partial
logs from the destination:
-> {"execute":"migrate-incoming","arguments":{"uri":"tcp:[::]:49152"},"id":"libvirt-415"}
<- {"timestamp": {"seconds": 1671449778, "microseconds": 500164}, "event": "MIGRATION", "data": {"status": "setup"}}
<- {"return": {}, "id": "libvirt-415"}
<- {"timestamp": {"seconds": 1671449778, "microseconds": 732358}, "event": "MIGRATION", "data": {"status": "active"}}
Now, before QEMU sends MIGRATION status:completed, I can see QEMU
accessing the TPM state:
Thread 1 "qemu-kvm" hit Breakpoint 1, tpm_emulator_set_state_blob
(tpm_emu=0x5572af389cb0, type=1, tsb=0x5572af389db0, flags=0) at
../backends/tpm/tpm_emulator.c:796
796 {
(gdb) bt
#0 tpm_emulator_set_state_blob (tpm_emu=0x5572af389cb0, type=1,
tsb=0x5572af389db0, flags=0) at ../backends/tpm/tpm_emulator.c:796
#1 0x00005572acf21fe5 in tpm_emulator_post_load (opaque=0x5572af389cb0,
version_id=<optimized out>) at ../backends/tpm/tpm_emulator.c:868
#2 0x00005572acf25497 in vmstate_load_state (f=0x5572af512a10,
vmsd=0x5572ad7743b8 <vmstate_tpm_emulator>, opaque=0x5572af389cb0,
version_id=1) at ../migration/vmstate.c:162
#3 0x00005572acf45753 in qemu_loadvm_state_main (f=0x5572af512a10,
mis=0x5572af39b4e0) at ../migration/savevm.c:876
#4 0x00005572acf47591 in qemu_loadvm_state (f=0x5572af512a10) at
../migration/savevm.c:2712
#5 0x00005572acf301f6 in process_incoming_migration_co
(opaque=<optimized out>) at ../migration/migration.c:591
#6 0x00005572ad400976 in coroutine_trampoline (i0=<optimized out>,
i1=<optimized out>) at ../util/coroutine-ucontext.c:177
#7 0x00007f64ca22a360 in ?? () from target:/lib64/libc.so.6
#8 0x00007f64c948cbc0 in ?? ()
#9 0x0000000000000000 in ?? ()
This in turn means that swtpm on the destination is going to be CMD_INIT-ed
(and thus access the TPM state) while the source swtpm is still using it.
I wonder what we can do about this. Perhaps postpone the init until the
vCPUs on the destination are resumed? That way libvirtd on the source could
restore the labels (effectively cutting the source swtpm process off from
the TPM state), then libvirtd on the destination could set the label and
issue 'cont'. If 'cont' fails for whatever reason, the source libvirtd would
just set the label on the TPM state again and everything would be back to
normal.
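To illustrate the idea, here is a rough sketch (not a tested patch; the
tpm_emulator_deferred_init() helper and the notion of stashing the blobs
in post_load are made up, only qemu_add_vm_change_state_handler() is the
real QEMU API, assuming the current VMChangeStateHandler signature):

#include "sysemu/runstate.h"

/* Hypothetical helper: would push the state blobs stashed away by
 * tpm_emulator_post_load() to swtpm (CMD_SET_STATEBLOB) and only then
 * CMD_INIT it. Does not exist in the tree. */
static void tpm_emulator_deferred_init(TPMEmulator *tpm_emu)
{
}

static void tpm_emulator_vm_state_change(void *opaque, bool running,
                                         RunState state)
{
    TPMEmulator *tpm_emu = opaque;

    if (!running) {
        return;
    }

    /*
     * Touch the TPM state only now: by the time the destination is
     * resumed, libvirtd on the source has restored the labels and
     * libvirtd on the destination has set its own, so the destination
     * swtpm no longer races with the source over the state files.
     */
    tpm_emulator_deferred_init(tpm_emu);
}

static void tpm_emulator_register_deferred_init(TPMEmulator *tpm_emu)
{
    qemu_add_vm_change_state_handler(tpm_emulator_vm_state_change, tpm_emu);
}

The obvious downside is that a failure to init swtpm would only surface at
'cont' time, which is why the fallback above (source libvirtd setting the
label back) would still be needed.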
Corresponding BZ link:
https://bugzilla.redhat.com/show_bug.cgi?id=2130192
Michal