Re: [libvirt-users] VM I/O performance drops dramatically during storage migration with drive-mirror

Cc the QEMU Block Layer mailing list (qemu-block@nongnu.org), who might have more insights here; and wrap long lines.

On Mon, May 28, 2018 at 06:07:51PM +0800, Chunguang Li wrote:
Hi, everyone.
Recently I have been doing some tests on VM storage+memory migration with KVM/QEMU/libvirt. I use the following migrate command through virsh: "virsh migrate --live --copy-storage-all --verbose vm1 qemu+ssh://192.168.1.91/system tcp://192.168.1.91". I have checked the libvirt debug output and confirmed that the drive-mirror + NBD migration method is used.
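For reference, one way to watch the block-copy job that libvirt starts on the source host for this kind of migration is with virsh while the migration runs; the disk target name "vda" below is only an assumed example, not taken from the original setup:

    # Per-disk block-copy job backing the storage migration
    virsh blockjob vm1 vda --info

    # Overall migration job progress, including memory and disk totals
    virsh domjobinfo vm1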
Inside the VM, I use an I/O benchmark (Iometer) to generate an OLTP workload, and I record the I/O performance (IOPS) before, during, and after migration. When the migration begins, the IOPS drops by 30%-40%. This is reasonable, because the migration I/O competes with the workload I/O. However, during roughly the last phase of the migration (about 66 s in my case), the IOPS drops dramatically, from about 170 to less than 10. I also include a figure from this experiment as an attachment to this email.
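The IOPS numbers above come from Iometer inside the guest; as a rough cross-check, per-disk I/O counters can also be sampled from the host side, again assuming the disk target is "vda":

    # Sample cumulative read/write request counters once per second;
    # IOPS is the delta between consecutive samples.
    watch -n 1 virsh domblkstat vm1 vda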
[The attachment should arrive on the 'libvirt-users' list archives; but it's not there yet -- https://www.redhat.com/archives/libvirt-users/2018-May/thread.html]
I want to figure out what causes this period of very low IOPS. First, I added some printf()s to the QEMU code and found that this period occurs just before the memory migration phase. (BTW, the memory migration itself is very fast, only about 5 s.) So I think this period must be the last phase of QEMU's "drive-mirror" process. I then tried to read the "drive-mirror" code in QEMU, but could not understand it very well.
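A hedged alternative to instrumenting QEMU with printf()s: the mirror job can also be polled from outside through QMP while the migration runs, for example with something like the command below; "vm1" is just the domain name from the earlier command, and the exact output fields may vary by QEMU version:

    # Poll the block job state; "offset" approaching "len" together with
    # "ready": true indicates the mirror has reached its synchronized phase.
    virsh qemu-monitor-command vm1 --pretty '{"execute": "query-block-jobs"}'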
Does anybody know what might cause this period of very low IOPS? Thank you very much.
Some details of the experiment: the VM disk image file is 30 GB (format=raw, cache=none, aio=native), and Iometer operates on a 10 GB file inside the VM. The OLTP workload consists of 33% writes and 67% reads (8 KB request size, all random). The VM memory size is 4 GB, most of which should be zero pages, so the memory migration is very fast.
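For anyone who wants to approximate this guest workload with a Linux tool instead of Iometer, a rough fio equivalent of the profile described above might look like the following; the file path, queue depth, and run time are assumptions, not values from the original test:

    # ~OLTP-like profile: 8 KiB random I/O, 67% reads / 33% writes,
    # direct I/O against a 10 GB test file inside the guest
    fio --name=oltp --filename=/mnt/test/oltp.dat --size=10G \
        --rw=randrw --rwmixread=67 --bs=8k --direct=1 \
        --ioengine=libaio --iodepth=16 --time_based --runtime=600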
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
-- /kashyap

On Mon, May 28, 2018 at 02:05:05PM +0200, Kashyap Chamarthy wrote:
Cc the QEMU Block Layer mailing list (qemu-block@nongnu.org),
[Sigh; now add the QEMU Block Layer e-mail list to Cc, without typos.]
who might have more insights here; and wrap long lines.
...
-- /kashyap

-----Original Messages-----
From: "Kashyap Chamarthy" <kchamart@redhat.com>
Sent Time: 2018-05-28 21:19:14 (Monday)
To: "Chunguang Li" <lichunguang@hust.edu.cn>
Cc: libvirt-users@redhat.com, qemu-block@nongnu.org, dgilbert@redhat.com
Subject: Re: [libvirt-users] VM I/O performance drops dramatically during storage migration with drive-mirror
On Mon, May 28, 2018 at 02:05:05PM +0200, Kashyap Chamarthy wrote:
Cc the QEMU Block Layer mailing list (qemu-block@nongnu.org),
[Sigh; now add the QEMU Block Layer e-mail list to Cc, without typos.]
Yes, thank you very much.
...
[The attachment should arrive on the 'libvirt-users' list archives; but it's not there yet -- https://www.redhat.com/archives/libvirt-users/2018-May/thread.html]
The figure of the experiment is also available at: https://pan.baidu.com/s/1pByKQtJ7VdFCDbX-ZMyOwQ
...

On 05/28/2018 07:05 AM, Kashyap Chamarthy wrote:
Cc the QEMU Block Layer mailing list (qemu-block@nongnu.org), who might have more insights here; and wrap long lines.
...
170 to less than 10. I also show the figure of this experiment in the attachment of this email.
[The attachment should arrive on the 'libvirt-users' list archives; but it's not there yet -- https://www.redhat.com/archives/libvirt-users/2018-May/thread.html]
Actually, the attachment was probably rejected by list moderation for being oversized.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization: qemu.org | libvirt.org

-----Original Messages-----
From: "Kashyap Chamarthy" <kchamart@redhat.com>
Sent Time: 2018-05-28 20:05:05 (Monday)
To: "Chunguang Li" <lichunguang@hust.edu.cn>
Cc: libvirt-users@redhat.com, qemu-block@redhat.com, dgilbert@redhat.com
Subject: Re: [libvirt-users] VM I/O performance drops dramatically during storage migration with drive-mirror
Cc the QEMU Block Layer mailing list (qemu-block@nongnu.org), who might have more insights here; and wrap long lines.
Hi Kashyap, thank you very much.
...
[The attachment should arrive on the 'libvirt-users' list archives; but it's not there yet -- https://www.redhat.com/archives/libvirt-users/2018-May/thread.html]
I don't know whether the attachment will arrive on the list archives, so I have made it available through this link: https://pan.baidu.com/s/1pByKQtJ7VdFCDbX-ZMyOwQ

Chunguang
...

participants (3): Chunguang Li, Eric Blake, Kashyap Chamarthy