On Thu, Nov 21, 2013 at 17:47:24 -0800, Chegu Vinod wrote:
Busy enterprise workloads hosted on large VMs tend to dirty memory
faster than the transfer rate achieved via live guest migration.
Despite some good recent improvements (& using dedicated 10Gig NICs
between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to
choose if they wish to force convergence of their migration via a
new migration capability: "auto-converge". This feature allows qemu
to auto-detect lack of convergence and trigger a throttle-down of the
VCPUs.
This RFC patch includes the libvirt support needed to trigger this
feature. (Testing is still in progress)
Signed-off-by: Chegu Vinod <chegu_vinod(a)hp.com>
---
include/libvirt/libvirt.h.in | 1 +
src/qemu/qemu_migration.c | 44 ++++++++++++++++++++++++++++++++++++++++++
src/qemu/qemu_migration.h | 1 +
src/qemu/qemu_monitor.c | 2 +-
src/qemu/qemu_monitor.h | 1 +
tools/virsh-domain.c | 7 ++++++
6 files changed, 55 insertions(+), 1 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 146a59b..13b0bfc 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -1192,6 +1192,7 @@ typedef enum {
VIR_MIGRATE_OFFLINE = (1 << 10), /* offline migrate */
VIR_MIGRATE_COMPRESSED = (1 << 11), /* compress data during migration */
VIR_MIGRATE_ABORT_ON_ERROR = (1 << 12), /* abort migration on I/O errors happened during migration */
+ VIR_MIGRATE_AUTO_CONVERGE = (1 << 13), /* force auto-convergence during migration */
} virDomainMigrateFlags;
I feel like there must be a better name we could use for this flag but
I'm not able to come up with one... :-)
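For context, whatever the flag ends up being called, a caller would simply OR it into the flags word it passes to the migration API. A minimal sketch of that bit arithmetic follows. The enum is a local mirror so the snippet builds without libvirt headers: the (1 << 13) value is taken from the hunk above, the (1 << 0) value for the live flag is my recollection of libvirt.h.in, and the sketch_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of two virDomainMigrateFlags bits (assumption: this stands in
 * for the real header so the sketch is self-contained). */
enum sketch_migrate_flags {
    SKETCH_MIGRATE_LIVE          = (1 << 0),  /* live migration */
    SKETCH_MIGRATE_AUTO_CONVERGE = (1 << 13), /* bit proposed by this patch */
};

/* Hypothetical helper: build the flags word a client would pass along. */
static unsigned long
sketch_build_flags(bool live, bool auto_converge)
{
    unsigned long flags = 0;

    if (live)
        flags |= SKETCH_MIGRATE_LIVE;
    if (auto_converge)
        flags |= SKETCH_MIGRATE_AUTO_CONVERGE;

    return flags;
}

/* Quick self-check of the bit arithmetic. */
static void
sketch_self_check(void)
{
    assert(sketch_build_flags(true, true) == 0x2001UL);
    assert(sketch_build_flags(true, false) == 0x0001UL);
    assert(sketch_build_flags(false, false) == 0UL);
}
```

In real code the resulting value would go into the flags argument of the migration call, and the driver tests it with a plain "flags & VIR_MIGRATE_AUTO_CONVERGE", as the qemu_migration.c hunks below do.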
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index e87ea85..8cc0c56 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1565,6 +1565,40 @@ cleanup:
}
static int
+qemuMigrationSetAutoConverge(virQEMUDriverPtr driver,
+ virDomainObjPtr vm,
+ enum qemuDomainAsyncJob job)
Nicely copied and pasted from qemuMigrationSetCompression, but you forgot
to fix the indentation :-)
+{
+ qemuDomainObjPrivatePtr priv = vm->privateData;
+ int ret;
+
+ if (qemuDomainObjEnterMonitorAsync(driver, vm, job) < 0)
+ return -1;
+
+ ret = qemuMonitorGetMigrationCapability(
+ priv->mon,
+ QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE);
+
+ if (ret < 0) {
+ goto cleanup;
+ } else if (ret == 0) {
+ virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
+ _("Auto-Converge migration is not supported by "
+ "QEMU binary"));
Reduce indentation by one level. Also I think "migration" is not really
needed in the error message, I'd just change it to
"Auto-converge is not supported by QEMU binary".
+ ret = -1;
+ goto cleanup;
+ }
+
+ ret = qemuMonitorSetMigrationCapability(
+ priv->mon,
+ QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE);
+
+cleanup:
+ qemuDomainObjExitMonitor(driver, vm);
+ return ret;
+}
+
+static int
qemuMigrationWaitForSpice(virQEMUDriverPtr driver,
virDomainObjPtr vm)
{
@@ -2389,6 +2423,11 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
QEMU_ASYNC_JOB_MIGRATION_IN) < 0)
goto stop;
+ if (flags & VIR_MIGRATE_AUTO_CONVERGE &&
+ qemuMigrationSetAutoConverge(driver, vm,
+ QEMU_ASYNC_JOB_MIGRATION_IN) < 0)
+ goto stop;
+
Hmm, I know you are just following what I did with
VIR_MIGRATE_COMPRESSED, but setting auto-converge on destination doesn't
make any sense. And it doesn't even make a lot of sense to set
compression on destination (other than checking the destination supports
compression) so I'm wondering why I did so.
if (mig->lockState) {
VIR_DEBUG("Received lockstate %s", mig->lockState);
VIR_FREE(priv->lockState);
@@ -3181,6 +3220,11 @@ qemuMigrationRun(virQEMUDriverPtr driver,
QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
goto cleanup;
+ if (flags & VIR_MIGRATE_AUTO_CONVERGE &&
+ qemuMigrationSetAutoConverge(driver, vm,
+ QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
Indentation is off by one space.
+ goto cleanup;
+
if (qemuDomainObjEnterMonitorAsync(driver, vm,
QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
goto cleanup;
...
So except for the small issues, the patch looks good to me. However, do
I remember correctly that this feature can be turned on dynamically for
an already running migration? If so, I think we want a second patch
adding a new API for setting this auto-converge feature.
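
To make the small issues concrete, here is roughly how I would expect the helper's control flow to read once the indentation and the error message are fixed. This is only a standalone sketch: struct sketch_monitor and the sketch_* functions are hypothetical stand-ins for the monitor calls used in the patch, and the job/monitor enter and exit handling is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the QEMU monitor; not a libvirt type. */
struct sketch_monitor {
    bool reachable;          /* false simulates a monitor failure */
    bool supports_auto_conv; /* what the QEMU binary advertises */
    bool auto_conv_enabled;  /* flipped by the "set" call below */
};

/* Returns 1 if the capability is supported, 0 if not, -1 on monitor error. */
static int
sketch_get_capability(const struct sketch_monitor *mon)
{
    assert(mon != NULL);
    if (!mon->reachable)
        return -1;
    return mon->supports_auto_conv ? 1 : 0;
}

/* Returns 0 on success, -1 on monitor error. */
static int
sketch_set_capability(struct sketch_monitor *mon)
{
    assert(mon != NULL);
    if (!mon->reachable)
        return -1;
    mon->auto_conv_enabled = true;
    return 0;
}

/* Query first, then enable; same shape as the helper in the patch. */
static int
sketch_set_auto_converge(struct sketch_monitor *mon)
{
    int ret = sketch_get_capability(mon);

    if (ret < 0) {
        goto cleanup;
    } else if (ret == 0) {
        fprintf(stderr, "%s\n",
                "Auto-converge is not supported by QEMU binary");
        ret = -1;
        goto cleanup;
    }

    ret = sketch_set_capability(mon);

cleanup:
    /* The real helper exits the monitor here before returning. */
    return ret;
}
```

The point of querying before setting is that an unsupported capability yields a clear "unsupported" error rather than whatever the set command would report.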
Jirka