Hi,
I'm wondering about the best next steps to debug a migration issue.
With libvirt v6.9.0, a migration hangs when invoked like this:
$ virsh migrate --unsafe --live --p2p --tunnelled h-migr-test \
qemu+ssh://testkvm-hirsute-to/system
Plain "--live --p2p" works fine, as do several other migration option
combinations with the same build/setup; only --p2p combined with
--tunnelled fails. The migration also works if either source or target is
not on 6.9 (the previous version I used is v6.6).
I also tried alternating qemu versions (5.0 / 5.1), but that had no impact.
The issue triggers only when libvirt is on the new version on both source
and target.
Unfortunately there is no crash or error to debug into; it just gets stuck,
with "virsh migrate" hanging on the source and the domain never leaving the
"paused" state on the target.
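A minimal sketch of how that stuck state can be inspected from the source
host while "virsh migrate" hangs. It assumes the domain name and destination
URI from the command above; the run helper and the RUN switch are made up
for illustration, and by default the commands are only printed, not executed:

```shell
#!/bin/sh
# Sketch: snapshot the state of a migration that appears stuck.
# DOMAIN and DST_URI are taken from the virsh migrate call above.
DOMAIN="${DOMAIN:-h-migr-test}"
DST_URI="${DST_URI:-qemu+ssh://testkvm-hirsute-to/system}"
# Hypothetical switch: set RUN=1 to really execute, otherwise only print.
RUN="${RUN:-0}"

# Print each command; execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "$RUN" = "1" ]; then
        "$@" || echo "  (command failed with status $?)"
    fi
}

migr_snapshot() {
    # Source: is a migration job still active, and do its counters move?
    run virsh domjobinfo "$DOMAIN"
    # Source: current domain state and the reason for it.
    run virsh domstate --reason "$DOMAIN"
    # Target: the incoming domain, which stays "paused" in the bad case.
    run virsh -c "$DST_URI" domstate --reason "$DOMAIN"
}

migr_snapshot
```

Calling it a few times in a row (with RUN=1) should show whether the job
counters on the source still change or whether the transfer is fully stalled.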
I have compared setups with the least amount of "change":
good: Qemu 5.1 + Libvirt 6.9 -> Qemu 5.1 + Libvirt 6.6
bad:  Qemu 5.1 + Libvirt 6.9 -> Qemu 5.1 + Libvirt 6.9
[1] has the debug logs of both cases. Each begins with a libvirtd restart,
which can likely be skipped, followed by the migration, which hangs in the
bad case. I failed to spot an obvious reason in the logs.
In git/news, the only changes I found that sounded relevant are:
f51cbe92c0 qemu: Allow migration over UNIX socket
c69915ccaf peer2peer migration: allow connecting to local sockets
But I'm not using unix: and in the logs the only unix: mentions are for the
qemu monitor and qemu-guest-agent.
I wanted to ask:
- if something related was recently changed that comes to mind?
- if someone else sees anything in the linked logs that I missed?
- if someone else has seen/reproduced the same?
- if there is a best practice for debugging a hanging migration?
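On the last point, one generic approach would be to capture backtraces of
all libvirtd threads on both hosts while the migration is stuck. A rough
sketch, assuming gdb and libvirt debug symbols are installed and that the
daemon runs as "libvirtd"; the DAEMON and OUT variables are made up for
illustration, and this is not something the logs in [1] were produced with:

```shell
#!/bin/sh
# Sketch: dump backtraces of all threads of a running daemon with gdb.
# DAEMON and OUT are hypothetical names chosen for this example.
DAEMON="${DAEMON:-libvirtd}"
OUT="${OUT:-/tmp/${DAEMON}-threads.txt}"

dump_threads() {
    # Pick the first PID reported for the daemon, if any.
    pid="$(pidof "$DAEMON" 2>/dev/null | awk '{print $1}')"
    if [ -z "$pid" ]; then
        echo "no running $DAEMON found"
        return 0
    fi
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not installed"
        return 0
    fi
    # Attach, print every thread's backtrace, then detach again.
    gdb -p "$pid" -batch -ex 'thread apply all bt' > "$OUT" 2>&1
    echo "backtraces written to $OUT"
}

dump_threads
```

Attaching usually needs root, and the daemon is briefly stopped while gdb
collects the backtraces.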
Thanks in advance!
[1]:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1904584/+attachment/5...
--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd