On Thu, Jan 22, 2015 at 22:40:59 +0400, Andrey Korolyov wrote:
On Thu, Jan 22, 2015 at 9:11 PM, Xu (Simon) Chen
<xchenum(a)gmail.com> wrote:
> Hey folks,
>
> I am running libvirt 1.2.4 and qemu 2.1 on a 3.14.27 kernel. I've found that
> live migrating a relatively large VM (16 cores and 64G ram) is taking
> forever - close to 15 hours now, and still not done...
>
> With "lsof -i", I can see a connection is established from my source
> hypervisor to a target hypervisor, likely for the purpose of copying data.
> nettop shows that this connection is constantly sending 50-60MBps traffic.
> The VM being migrated has a disk on ceph by using librbd.
>
> I wonder if anyone has seen similar issues, and how I could troubleshoot
> further. (I tried but failed to get qemu monitor to work on the VM...)
>
> Thanks.
> -Simon
>
Hi, under certain conditions, such as memory-intensive workloads inside
the guest, live migration may effectively never finish, and this is
expected behavior. You may want to use the --timeout parameter to set
an interval after which the migration falls back to non-live mode, for
example. If your VM is sitting idle, the observed behavior is most
probably a bug.
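
To illustrate why a busy guest may never finish migrating, here is a
rough Python sketch of iterative pre-copy. It is a simplified model,
not QEMU's actual algorithm: the function name, the constant dirty
rate, and the 0.3 s downtime budget are all assumptions made for
illustration. Only the 64G of RAM and the roughly 55 MB/s transfer
rate come from the report above.

```python
def precopy_rounds(ram_mb, bandwidth_mb_s, dirty_mb_s,
                   max_downtime_s=0.3, max_rounds=1000):
    """Count the copy passes needed before the final switch-over.

    Returns None if the model does not converge within max_rounds.
    Assumes a constant dirty rate and constant bandwidth (hypothetical).
    """
    # Amount of memory that can still be sent inside the downtime budget.
    threshold_mb = bandwidth_mb_s * max_downtime_s
    remaining_mb = float(ram_mb)
    for completed in range(max_rounds + 1):
        if remaining_mb <= threshold_mb:
            return completed
        # While this pass is being copied, the guest dirties more memory.
        copy_time_s = remaining_mb / bandwidth_mb_s
        remaining_mb = min(float(ram_mb), dirty_mb_s * copy_time_s)
    return None


RAM_MB = 64 * 1024   # 64G guest, as in the report
LINK_MB_S = 55       # roughly the observed 50-60 MB/s

# Guest dirtying memory faster than the link can carry it: no end.
print(precopy_rounds(RAM_MB, LINK_MB_S, dirty_mb_s=60))
# Guest dirtying memory well below the link speed: finishes.
print(precopy_rounds(RAM_MB, LINK_MB_S, dirty_mb_s=10))
```

In this model the migration only ends when the guest dirties memory
more slowly than the link can transfer it, which is why an idle VM
that still fails to converge looks more like a bug.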
Or you can try the --auto-converge and --compressed options of
virsh migrate. The --auto-converge option in particular was designed
to help in your situation. If used, QEMU will automatically slow down
the guest CPUs so that the guest cannot change too much memory during
the migration. It may be better than non-live migration in case you
need the guest to stay at least partially responsive.
Jirka
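
For reference, the options mentioned in this thread can be combined on
one virsh migrate command line. The small Python helper below only
assembles and prints such a command; it does not run it. The helper
name, the domain name "bigvm", the destination URI, and the 600 second
timeout are hypothetical placeholders, not values from the thread.

```python
import shlex


def build_migrate_command(domain, dest_uri, timeout_s=None,
                          auto_converge=False, compressed=False):
    """Assemble an argv list for a live migration with virsh."""
    argv = ["virsh", "migrate", "--live"]
    if timeout_s is not None:
        # Suspend the guest if live migration takes longer than this.
        argv += ["--timeout", str(timeout_s)]
    if auto_converge:
        # Let QEMU throttle guest CPUs so the migration can converge.
        argv.append("--auto-converge")
    if compressed:
        # Compress repeatedly transferred memory pages.
        argv.append("--compressed")
    argv += [domain, dest_uri]
    return argv


# Hypothetical domain and destination, for illustration only.
cmd = build_migrate_command("bigvm", "qemu+ssh://target.example/system",
                            timeout_s=600, auto_converge=True)
print(shlex.join(cmd))
```

Returning a list rather than a string keeps the arguments safe to pass
to subprocess.run without a shell, should you decide to execute it.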