[PATCH] qemu: fix switchover-ack regression for old qemu

When enabling switchover-ack on qemu from libvirt, the .party value was set to both source and target; however, qemuMigrationParamsCheck() only takes that into account to validate that the remote side of the migration supports the flag if it is marked optional or auto/always on. In the case of switchover-ack, when enabled on only the dst and not the src, the migration will fail if the src qemu does not support switchover-ack, as the dst qemu will issue a switchover-ack msg: qemu/migration/savevm.c -> loadvm_process_command -> migrate_send_rp_switchover_ack(mis) -> migrate_send_rp_message(mis, MIG_RP_MSG_SWITCHOVER_ACK, 0, NULL) Since the src qemu doesn't understand messages with header_type == MIG_RP_MSG_SWITCHOVER_ACK, qemu will kill the migration with error: qemu-kvm: RP: Received invalid message 0x0007 length 0x0000 qemu-kvm: Unable to write to socket: Bad file descriptor Looking at the original commit [1] for optional migration capabilities, it seems that the spirit of optional handling was to enhance a given existing capability where possible. Given that switchover-ack exclusively depends on return-path, adding it as optional to that cap feels right. [1] 61e34b08568 ("qemu: Add support for optional migration capabilities") Fixes: 1cc7737f69e ("qemu: add support for qemu switchover-ack") Signed-off-by: Jon Kohler <jon@nutanix.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Avihai Horon <avihaih@nvidia.com> Cc: Jiri Denemark <jdenemar@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: YangHang Liu <yanghliu@redhat.com> --- src/qemu/qemu_migration_params.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/src/qemu/qemu_migration_params.c b/src/qemu/qemu_migration_params.c index 9593b6ba65..98822012cc 100644 --- a/src/qemu/qemu_migration_params.c +++ b/src/qemu/qemu_migration_params.c @@ -210,17 +210,13 @@ static const qemuMigrationParamsFlagMapItem qemuMigrationParamsFlagMap[] = { {.match = QEMU_MIGRATION_FLAG_FORBIDDEN, .flag = VIR_MIGRATE_TUNNELLED, .cap = QEMU_MIGRATION_CAP_RETURN_PATH, + .optional = QEMU_MIGRATION_CAP_SWITCHOVER_ACK, .party = QEMU_MIGRATION_SOURCE | QEMU_MIGRATION_DESTINATION}, {.match = QEMU_MIGRATION_FLAG_REQUIRED, .flag = VIR_MIGRATE_ZEROCOPY, .cap = QEMU_MIGRATION_CAP_ZERO_COPY_SEND, .party = QEMU_MIGRATION_SOURCE}, - - {.match = QEMU_MIGRATION_FLAG_FORBIDDEN, - .flag = VIR_MIGRATE_TUNNELLED, - .cap = QEMU_MIGRATION_CAP_SWITCHOVER_ACK, - .party = QEMU_MIGRATION_SOURCE | QEMU_MIGRATION_DESTINATION}, }; /* Translation from VIR_MIGRATE_PARAM_* typed parameters to -- 2.43.0

On Thu, Jun 27, 2024 at 11:11:56 -0700, Jon Kohler wrote:
When enabling switchover-ack on qemu from libvirt, the .party value was set to both source and target; however, qemuMigrationParamsCheck() only takes that into account to validate that the remote side of the migration supports the flag if it is marked optional or auto/always on.
In the case of switchover-ack, when enabled on only the dst and not the src, the migration will fail if the src qemu does not support switchover-ack, as the dst qemu will issue a switchover-ack msg: qemu/migration/savevm.c -> loadvm_process_command -> migrate_send_rp_switchover_ack(mis) -> migrate_send_rp_message(mis, MIG_RP_MSG_SWITCHOVER_ACK, 0, NULL)
Since the src qemu doesn't understand messages with header_type == MIG_RP_MSG_SWITCHOVER_ACK, qemu will kill the migration with error: qemu-kvm: RP: Received invalid message 0x0007 length 0x0000 qemu-kvm: Unable to write to socket: Bad file descriptor
Looking at the original commit [1] for optional migration capabilities, it seems that the spirit of optional handling was to enhance a given existing capability where possible. Given that switchover-ack exclusively depends on return-path, adding it as optional to that cap feels right.
Oops, sorry for not spotting this in my review and thanks for fixing it. I guess we should add a possibility to check QEMU support for capabilities in case they don't depend on another capability and they are supposed to be enabled automatically. But that's a separate thing which may possibly become useful in the future. Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
participants (2)
-
Jiri Denemark
-
Jon Kohler