unable to migrate when TLS is used

With libvirt 6.9.0, qemu 5.1.0, and the following configurations:

libvirt:

    key_file = "/etc/ssl/libvirt/server.lan.key"
    cert_file = "/etc/ssl/libvirt/server.lan.crt"
    ca_file = "/etc/ssl/libvirt/ca.crt"
    log_filters="3:remote 4:event 3:util.json 3:rpc 1:*"
    log_outputs="1:file:/var/log/libvirt/libvirtd.log"

qemu:

    default_tls_x509_cert_dir = "/etc/ssl/qemu"
    default_tls_x509_verify = 1

migration with tls:

    virsh # migrate vm1 qemu+tls://server2.lan/system --persistent --undefinesource --copy-storage-all --verbose --tls

never succeeds. Progress typically stops at a high percentage (95%-98%), and network traffic drops drastically as well (from 1 Gbps+ to nothing). domjobinfo progress also stops. Without --tls, migrations succeed without issues and without any other changes to hosts or configurations.

Logs of failed migration:
Source: https://drive.google.com/file/d/1d0dJumicW0TUdG1osNxNnWWiYpfAaIb_/view?usp=s...
Destination: https://drive.google.com/file/d/1d0dJumicW0TUdG1osNxNnWWiYpfAaIb_/view?usp=s...

Same exact hosts, successful migration logs (without --tls):
Source: https://drive.google.com/file/d/1d0dJumicW0TUdG1osNxNnWWiYpfAaIb_/view?usp=s...
Destination: https://drive.google.com/file/d/1EWkCkSBhj76T05k86QjdL-6icJyruK5-/view?usp=s...

Should I report this as a bug or is there an issue with my configuration?
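
For anyone trying to reproduce this, the stall described above (domjobinfo progress no longer advancing) could be detected automatically rather than by watching the terminal. The following is only a minimal sketch in Python. It assumes that `virsh domjobinfo <domain>` prints a "Data remaining:" line with a number and a binary unit such as MiB or GiB. The helper names (`parse_remaining`, `is_stalled`, `sample`) and the idea of comparing a fixed window of samples are hypothetical and not part of libvirt.

```python
import re
import subprocess

# Multipliers for the human-readable units (assumed output format).
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_remaining(domjobinfo_text):
    """Return the 'Data remaining' value in bytes, or None if it is absent."""
    m = re.search(r"^Data remaining:\s+([\d.]+)\s+(\w+)", domjobinfo_text, re.M)
    if m is None:
        return None
    value, unit = m.groups()
    factor = _UNITS.get(unit)
    if factor is None:
        return None  # unit not covered by this sketch
    return int(float(value) * factor)


def is_stalled(samples, window):
    """True if the last `window` samples are identical and non-zero."""
    recent = samples[-window:]
    if len(recent) < window:
        return False  # not enough data yet
    return recent[0] > 0 and len(set(recent)) == 1


def sample(domain):
    """Run `virsh domjobinfo` once and return the remaining bytes (or None)."""
    out = subprocess.run(
        ["virsh", "domjobinfo", domain],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_remaining(out)
```

Calling `sample("vm1")` every few seconds on the source host and feeding the collected values to `is_stalled` would flag the point where the transfer stops making progress. The polling interval and window size are left to the reader.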

On Thu, Nov 19, 2020 at 00:04:28 -0800, Vjaceslavs Klimovs wrote:
> [...]
> Should I report this as a bug or is there an issue with my configuration?

According to the logs it seems to be stuck in the disk copy phase. I'll have a look.

Hi Peter,

I tested this with 6.10.0 and it reproduces. Were you able to take a look by any chance? Happy to test out a patch or provide any additional information.

On Thu, Nov 19, 2020 at 12:14 AM Peter Krempa <pkrempa@redhat.com> wrote:
> According to the logs it seems to be stuck in the disk copy phase. I'll have a look.

On Sun, Dec 06, 2020 at 17:07:31 -0800, Vjaceslavs Klimovs wrote:
> Hi Peter, I tested this with 6.10.0 and it reproduces. Were you able to take a look by any chance?
> Happy to test out a patch or provide any additional information.

I didn't forget about it, but I haven't had time to look into it either. Given that the holiday season is approaching rather quickly, please file this as an issue in the upstream tracker so that it's tracked for now. I'll look into it as soon as I have time. Thanks for re-testing.
Please attach the logs as an archive to the issue.

participants (2)
- Peter Krempa
- Vjaceslavs Klimovs