[libvirt] [RFC PATCH] libvirt support to force convergence of live guest migration

Busy enterprise workloads hosted on large sized VM's tend to dirty memory faster than the transfer rate achieved via live guest migration. Despite some good recent improvements (& using dedicated 10Gig NICs between hosts) the live migration may NOT converge.

Recently support was added in qemu (version 1.6) to allow a user to choose if they wish to force convergence of their migration via a new migration capability : "auto-converge". This feature allows for qemu to auto-detect lack of convergence and trigger a throttle-down of the VCPUs.

This RFC patch includes the libvirt support needed to trigger this feature. (Testing is still in progress)

Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
---
 include/libvirt/libvirt.h.in |    1 +
 src/qemu/qemu_migration.c    |   44 ++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_migration.h    |    1 +
 src/qemu/qemu_monitor.c      |    2 +-
 src/qemu/qemu_monitor.h      |    1 +
 tools/virsh-domain.c         |    7 ++++++
 6 files changed, 55 insertions(+), 1 deletions(-)

diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 146a59b..13b0bfc 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -1192,6 +1192,7 @@ typedef enum {
     VIR_MIGRATE_OFFLINE = (1 << 10), /* offline migrate */
     VIR_MIGRATE_COMPRESSED = (1 << 11), /* compress data during migration */
     VIR_MIGRATE_ABORT_ON_ERROR = (1 << 12), /* abort migration on I/O errors happened during migration */
+    VIR_MIGRATE_AUTO_CONVERGE = (1 << 13), /* force auto-convergence during during migration */
 } virDomainMigrateFlags;

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index e87ea85..8cc0c56 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1565,6 +1565,40 @@ cleanup:
 }

 static int
+qemuMigrationSetAutoConverge(virQEMUDriverPtr driver,
+                             virDomainObjPtr vm,
+                             enum qemuDomainAsyncJob job)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    int ret;
+
+    if (qemuDomainObjEnterMonitorAsync(driver, vm, job) < 0)
+        return -1;
+
+    ret = qemuMonitorGetMigrationCapability(
+                priv->mon,
+                QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE);
+
+    if (ret < 0) {
+        goto cleanup;
+    } else if (ret == 0) {
+        virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
+                       _("Auto-Converge migration is not supported by "
+                         "QEMU binary"));
+        ret = -1;
+        goto cleanup;
+    }
+
+    ret = qemuMonitorSetMigrationCapability(
+                priv->mon,
+                QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE);
+
+cleanup:
+    qemuDomainObjExitMonitor(driver, vm);
+    return ret;
+}
+
+static int
 qemuMigrationWaitForSpice(virQEMUDriverPtr driver,
                           virDomainObjPtr vm)
 {
@@ -2389,6 +2423,11 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
                                     QEMU_ASYNC_JOB_MIGRATION_IN) < 0)
         goto stop;

+    if (flags & VIR_MIGRATE_AUTO_CONVERGE &&
+        qemuMigrationSetAutoConverge(driver, vm,
+                                     QEMU_ASYNC_JOB_MIGRATION_IN) < 0)
+        goto stop;
+
     if (mig->lockState) {
         VIR_DEBUG("Received lockstate %s", mig->lockState);
         VIR_FREE(priv->lockState);
@@ -3181,6 +3220,11 @@ qemuMigrationRun(virQEMUDriverPtr driver,
                                     QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
         goto cleanup;

+    if (flags & VIR_MIGRATE_AUTO_CONVERGE &&
+        qemuMigrationSetAutoConverge(driver, vm,
+                                     QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
+        goto cleanup;
+
     if (qemuDomainObjEnterMonitorAsync(driver, vm,
                                        QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
         goto cleanup;
diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h
index cafa2a2..c4258a1 100644
--- a/src/qemu/qemu_migration.h
+++ b/src/qemu/qemu_migration.h
@@ -39,6 +39,7 @@
      VIR_MIGRATE_UNSAFE | \
      VIR_MIGRATE_OFFLINE | \
      VIR_MIGRATE_COMPRESSED | \
+     VIR_MIGRATE_AUTO_CONVERGE | \
      VIR_MIGRATE_ABORT_ON_ERROR)

 /* All supported migration parameters and their types. */
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 1514715..780a29a 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -118,7 +118,7 @@ VIR_ENUM_IMPL(qemuMonitorMigrationStatus,

 VIR_ENUM_IMPL(qemuMonitorMigrationCaps,
               QEMU_MONITOR_MIGRATION_CAPS_LAST,
-              "xbzrle")
+              "xbzrle", "auto-converge")

 VIR_ENUM_IMPL(qemuMonitorVMStatus,
               QEMU_MONITOR_VM_STATUS_LAST,
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index eabf000..95e70ab 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -440,6 +440,7 @@ int qemuMonitorGetSpiceMigrationStatus(qemuMonitorPtr mon,

 typedef enum {
     QEMU_MONITOR_MIGRATION_CAPS_XBZRLE,
+    QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE,

     QEMU_MONITOR_MIGRATION_CAPS_LAST
 } qemuMonitorMigrationCaps;
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index 1fe138c..d94a81b 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -8532,6 +8532,10 @@ static const vshCmdOptDef opts_migrate[] = {
      .type = VSH_OT_BOOL,
      .help = N_("compress repeated pages during live migration")
     },
+    {.name = "auto-converge",
+     .type = VSH_OT_BOOL,
+     .help = N_("force auto convergence during live migration")
+    },
     {.name = "abort-on-error",
      .type = VSH_OT_BOOL,
      .help = N_("abort on soft errors during migration")
@@ -8676,6 +8680,9 @@ doMigrate(void *opaque)
     if (vshCommandOptBool(cmd, "compressed"))
         flags |= VIR_MIGRATE_COMPRESSED;

+    if (vshCommandOptBool(cmd, "auto-converge"))
+        flags |= VIR_MIGRATE_AUTO_CONVERGE;
+
     if (vshCommandOptBool(cmd, "offline")) {
         flags |= VIR_MIGRATE_OFFLINE;
     }
--
1.7.1
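
The patch extends both the qemuMonitorMigrationCaps enum (qemu_monitor.h) and the string list passed to VIR_ENUM_IMPL (qemu_monitor.c), and the two have to stay in the same order. A minimal standalone sketch of that general pattern follows. It is not libvirt's actual macro; the names migration_caps, migration_caps_names, migration_caps_to_string and migration_caps_from_string are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for qemuMonitorMigrationCaps from the patch. */
typedef enum {
    MIGRATION_CAPS_XBZRLE,
    MIGRATION_CAPS_AUTO_CONVERGE,

    MIGRATION_CAPS_LAST /* sentinel: number of real entries */
} migration_caps;

/* Capability names as QEMU spells them, in the same order as the enum. */
static const char *const migration_caps_names[] = {
    "xbzrle",
    "auto-converge",
};

/* Compile-time check that one list was not extended without the other. */
_Static_assert(sizeof(migration_caps_names) / sizeof(migration_caps_names[0])
               == MIGRATION_CAPS_LAST,
               "capability name table out of sync with enum");

/* Enum value -> capability name, or NULL if the value is out of range. */
static const char *migration_caps_to_string(int cap)
{
    if (cap < 0 || cap >= MIGRATION_CAPS_LAST)
        return NULL;
    return migration_caps_names[cap];
}

/* Capability name -> enum value, or -1 if the name is unknown. */
static int migration_caps_from_string(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < MIGRATION_CAPS_LAST; i++) {
        if (strcmp(migration_caps_names[i], name) == 0)
            return i;
    }
    return -1;
}
```

With a table like this, adding a capability means touching exactly two places, which matches the shape of the qemu_monitor.h and qemu_monitor.c hunks.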

On Thu, Nov 21, 2013 at 17:47:24 -0800, Chegu Vinod wrote:
Busy enterprise workloads hosted on large sized VM's tend to dirty memory faster than the transfer rate achieved via live guest migration. Despite some good recent improvements (& using dedicated 10Gig NICs between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to choose if they wish to force convergence of their migration via a new migration capability : "auto-converge". This feature allows for qemu to auto-detect lack of convergence and trigger a throttle-down of the VCPUs.
This RFC patch includes the libvirt support needed to trigger this feature. (Testing is still in progress)
Signed-off-by: Chegu Vinod <chegu_vinod@hp.com> --- include/libvirt/libvirt.h.in | 1 + src/qemu/qemu_migration.c | 44 ++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_migration.h | 1 + src/qemu/qemu_monitor.c | 2 +- src/qemu/qemu_monitor.h | 1 + tools/virsh-domain.c | 7 ++++++ 6 files changed, 55 insertions(+), 1 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 146a59b..13b0bfc 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1192,6 +1192,7 @@ typedef enum { VIR_MIGRATE_OFFLINE = (1 << 10), /* offline migrate */ VIR_MIGRATE_COMPRESSED = (1 << 11), /* compress data during migration */ VIR_MIGRATE_ABORT_ON_ERROR = (1 << 12), /* abort migration on I/O errors happened during migration */ + VIR_MIGRATE_AUTO_CONVERGE = (1 << 13), /* force auto-convergence during during migration */ } virDomainMigrateFlags;
I feel like there must be a better name we could use for this flag but I'm not able to come up with one... :-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index e87ea85..8cc0c56 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1565,6 +1565,40 @@ cleanup: }
static int +qemuMigrationSetAutoConverge(virQEMUDriverPtr driver, + virDomainObjPtr vm, + enum qemuDomainAsyncJob job)
Nicely copied&pasted from qemuMigrationSetCompression but you forgot to fix indentation :-)
+{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int ret; + + if (qemuDomainObjEnterMonitorAsync(driver, vm, job) < 0) + return -1; + + ret = qemuMonitorGetMigrationCapability( + priv->mon, + QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE); + + if (ret < 0) { + goto cleanup; + } else if (ret == 0) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("Auto-Converge migration is not supported by " + "QEMU binary"));
Reduce indentation by one level. Also I think "migration" is not really needed in the error message, I'd just change it to "Auto-converge is not supported by QEMU binary".
+ ret = -1; + goto cleanup; + } + + ret = qemuMonitorSetMigrationCapability( + priv->mon, + QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE); + +cleanup: + qemuDomainObjExitMonitor(driver, vm); + return ret; +} + +static int qemuMigrationWaitForSpice(virQEMUDriverPtr driver, virDomainObjPtr vm) { @@ -2389,6 +2423,11 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, QEMU_ASYNC_JOB_MIGRATION_IN) < 0) goto stop;
+ if (flags & VIR_MIGRATE_AUTO_CONVERGE && + qemuMigrationSetAutoConverge(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_IN) < 0) + goto stop; +
Hmm, I know you are just following what I did with VIR_MIGRATE_COMPRESSED, but setting auto-converge on destination doesn't make any sense. And it doesn't even make a lot of sense to set compression on destination (other than checking the destination supports compression) so I'm wondering why I did so.
if (mig->lockState) { VIR_DEBUG("Received lockstate %s", mig->lockState); VIR_FREE(priv->lockState); @@ -3181,6 +3220,11 @@ qemuMigrationRun(virQEMUDriverPtr driver, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) goto cleanup;
+ if (flags & VIR_MIGRATE_AUTO_CONVERGE && + qemuMigrationSetAutoConverge(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
Indentation is off by one space.
+ goto cleanup; + if (qemuDomainObjEnterMonitorAsync(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) goto cleanup; ...
So except for the small issues, the patch looks good to me. However, do I remember correctly that this feature can be turned on dynamically for an already running migration? If so, I think we want a second patch adding a new API for setting this auto-converge feature.

Jirka
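
The query-then-set flow of qemuMigrationSetAutoConverge discussed in this review can be tried out in isolation. The sketch below is only an approximation under stated assumptions: struct fake_monitor and the monitor_* helpers are hypothetical stubs, not libvirt types or functions, and the error text uses the shorter wording suggested in the review. What it keeps from the patch is the three-way return convention (negative for error, 0 for "not supported", positive for "supported") and the single cleanup path that always leaves the monitor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the QEMU monitor; not a libvirt type. */
struct fake_monitor {
    bool reachable;   /* false simulates a monitor I/O error */
    bool knows_cap;   /* QEMU binary lists auto-converge */
    bool cap_enabled; /* set once the capability has been turned on */
    int enter_count;  /* balance of enter/exit calls, 0 when balanced */
};

static int monitor_enter(struct fake_monitor *mon)
{
    mon->enter_count++;
    return 0;
}

static void monitor_exit(struct fake_monitor *mon)
{
    mon->enter_count--;
}

/* <0: error, 0: capability unknown to this QEMU, 1: capability available. */
static int monitor_get_cap(const struct fake_monitor *mon)
{
    if (!mon->reachable)
        return -1;
    return mon->knows_cap ? 1 : 0;
}

/* 0 on success, -1 on error. */
static int monitor_set_cap(struct fake_monitor *mon)
{
    if (!mon->reachable)
        return -1;
    mon->cap_enabled = true;
    return 0;
}

/* Same control flow as the patch: query, report if unsupported, then set. */
static int set_auto_converge(struct fake_monitor *mon)
{
    int ret;

    assert(mon != NULL);

    if (monitor_enter(mon) < 0)
        return -1;

    ret = monitor_get_cap(mon);
    if (ret < 0) {
        goto cleanup;
    } else if (ret == 0) {
        fprintf(stderr, "Auto-converge is not supported by QEMU binary\n");
        ret = -1;
        goto cleanup;
    }

    ret = monitor_set_cap(mon);

cleanup:
    monitor_exit(mon); /* every path after a successful enter exits here */
    return ret;
}
```

Per the review, a caller would invoke this only on the source side of the migration.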

On 26/11/2013 13:24, Jiri Denemark wrote:
Hmm, I know you are just following what I did with VIR_MIGRATE_COMPRESSED, but setting auto-converge on destination doesn't make any sense. And it doesn't even make a lot of sense to set compression on destination (other than checking the destination supports compression) so I'm wondering why I did so.
FWIW, when new capabilities pop up, I'm trying to enforce that they only need to be set in the source and that (if they affect the destination at all) the destination should automatically discover them through the migration stream. "query-migrate-capabilities" can be used to find out if the destination knows about the capability.

Paolo
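
For reference, the two QMP commands involved are query-migrate-capabilities (to list what a QEMU binary knows about) and migrate-set-capabilities (to switch a capability on or off). The sketch below only formats the JSON text of those commands; it does not talk to QEMU. The helper names qmp_query_caps and qmp_set_cap are hypothetical, and the code assumes the capability name needs no JSON escaping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* QMP command that lists the migration capabilities a QEMU binary knows. */
static const char *qmp_query_caps(void)
{
    return "{\"execute\":\"query-migrate-capabilities\"}";
}

/*
 * Format a migrate-set-capabilities command for a single capability.
 * Returns 0 on success, -1 if the buffer is too small or the name
 * contains characters this sketch does not escape.
 */
static int qmp_set_cap(char *buf, size_t len, const char *cap, bool state)
{
    int n;

    assert(buf != NULL && cap != NULL);

    /* No JSON escaping is done here, so refuse names that would need it. */
    if (strchr(cap, '"') != NULL || strchr(cap, '\\') != NULL)
        return -1;

    n = snprintf(buf, len,
                 "{\"execute\":\"migrate-set-capabilities\","
                 "\"arguments\":{\"capabilities\":["
                 "{\"capability\":\"%s\",\"state\":%s}]}}",
                 cap, state ? "true" : "false");

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Following the approach described above, the set command would be sent to the source QEMU only, while the query command could be sent to the destination to check that it recognises the capability.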

On 11/26/2013 4:24 AM, Jiri Denemark wrote:
On Thu, Nov 21, 2013 at 17:47:24 -0800, Chegu Vinod wrote:
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index e87ea85..8cc0c56 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1565,6 +1565,40 @@ cleanup: }
static int +qemuMigrationSetAutoConverge(virQEMUDriverPtr driver, + virDomainObjPtr vm, + enum qemuDomainAsyncJob job)
Nicely copied&pasted from qemuMigrationSetCompression but you forgot to fix indentation :-)
Yes, being very new to libvirt... I did "leverage" from the previous changes :) I will fix the indentation.
+{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int ret; + + if (qemuDomainObjEnterMonitorAsync(driver, vm, job) < 0) + return -1; + + ret = qemuMonitorGetMigrationCapability( + priv->mon, + QEMU_MONITOR_MIGRATION_CAPS_AUTO_CONVERGE); + + if (ret < 0) { + goto cleanup; + } else if (ret == 0) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("Auto-Converge migration is not supported by " + "QEMU binary"));
Reduce indentation by one level. Also I think "migration" is not really needed in the error message, I'd just change it to "Auto-converge is not supported by QEMU binary".
Ok got it.
+ if (flags & VIR_MIGRATE_AUTO_CONVERGE && + qemuMigrationSetAutoConverge(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
Indentation is off by one space.
Ok
+ goto cleanup; + if (qemuDomainObjEnterMonitorAsync(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) goto cleanup; ...
So except for the small issues, the patch looks good to me. However, do I remember correctly that this feature can be turned on dynamically for an already running migration? If so, I think we want a second patch adding a new API for setting this auto-converge feature.
Yes. Is there a sample reference/example that I could look up ?

Thanks for your feedback.
Vinod

On Tue, Nov 26, 2013 at 12:31:12 -0800, Chegu Vinod wrote:
On 11/26/2013 4:24 AM, Jiri Denemark wrote:
On Thu, Nov 21, 2013 at 17:47:24 -0800, Chegu Vinod wrote:
Busy enterprise workloads hosted on large sized VM's tend to dirty memory faster than the transfer rate achieved via live guest migration. Despite some good recent improvements (& using dedicated 10Gig NICs between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to choose if they wish to force convergence of their migration via a new migration capability : "auto-converge". This feature allows for qemu to auto-detect lack of convergence and trigger a throttle-down of the VCPUs.
This RFC patch includes the libvirt support needed to trigger this feature. (Testing is still in progress)
...
So except for the small issues, the patch looks good to me. However, do I remember correctly that this feature can be turned on dynamically for an already running migration? If so, I think we want a second patch adding a new API for setting this auto-converge feature.
Yes.
Is there a sample reference/example that I could look up ?
You can have a look at virConnectGetCPUModelNames, which is the most recent simple API added to libvirt (commits f90857b32aa81f9a5e878e7bceb2df30e2b1b4f8, fd69544965ddf0e49e40c99d24708da4ba1a7648, cbcecd7ab14de22467c405023580afcb9e8eca54, ea45b23cfc2b75053aea96f29a2a91cfc8e26013), but in qemu_driver.c you will need to use QEMU_JOB_MIGRATION_OP as the job type (the third parameter to qemuDomainObjBeginJob) to make this API work during migration.

Jirka

Chegu Vinod wrote:
Busy enterprise workloads hosted on large sized VM's tend to dirty memory faster than the transfer rate achieved via live guest migration. Despite some good recent improvements (& using dedicated 10Gig NICs between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to choose if they wish to force convergence of their migration via a new migration capability : "auto-converge". This feature allows for qemu to auto-detect lack of convergence and trigger a throttle-down of the VCPUs.
This RFC patch includes the libvirt support needed to trigger this feature. (Testing is still in progress)
Vinod,

What is the status of this patch? I see there were a few comments, but don't see (or perhaps missed) a V2.
virsh.pod needs to be updated too.

Do you have time to continue working on this patch? If not, I could find some cycles to help out. Thanks!

Regards,
Jim

On 1/29/2014 8:27 AM, Jim Fehlig wrote:
Chegu Vinod wrote:
Busy enterprise workloads hosted on large sized VM's tend to dirty memory faster than the transfer rate achieved via live guest migration. Despite some good recent improvements (& using dedicated 10Gig NICs between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to choose if they wish to force convergence of their migration via a new migration capability : "auto-converge". This feature allows for qemu to auto-detect lack of convergence and trigger a throttle-down of the VCPUs.
This RFC patch includes the libvirt support needed to trigger this feature. (Testing is still in progress)
Vinod,
What is the status of this patch? I see there were a few comments, but don't see (or perhaps missed) a V2.
Thanks for your follow up query.
virsh.pod needs updated too.
Do you have time to continue working on this patch? If not, I could find some cycles to help out. Thanks!
Sorry for not responding...I have been busy with other things.

There were some minor comments on this specific patch. I can work with you and address those soon.

But there is also another comment about having a new virsh interface to allow for this capability to be changed during migration. I did look into that but it requires a lot more changes in both libvirt and qemu (qemu currently doesn't seem to support changing migration capabilities once the migration has started... so we may have to defer that for now).

Thanks
Vinod

On Wed, Jan 29, 2014 at 08:34:58 -0800, Chegu Vinod wrote:
On 1/29/2014 8:27 AM, Jim Fehlig wrote:
Chegu Vinod wrote:
Busy enterprise workloads hosted on large sized VM's tend to dirty memory faster than the transfer rate achieved via live guest migration. Despite some good recent improvements (& using dedicated 10Gig NICs between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to choose if they wish to force convergence of their migration via a new migration capability : "auto-converge". This feature allows for qemu to auto-detect lack of convergence and trigger a throttle-down of the VCPUs.
This RFC patch includes the libvirt support needed to trigger this feature. (Testing is still in progress)
Vinod,
What is the status of this patch? I see there were a few comments, but don't see (or perhaps missed) a V2.
Thanks for your follow up query.
Do you have time to continue working on this patch? If not, I could find some cycles to help out. Thanks!
Sorry for not responding...I have been busy with other things.
There were some minor comments on this specific patch. I can work with you and address those soon.
But there is also another comment about having a new virsh interface to allow for this capability to be changed during migration. I did look into that but it requires a lot more changes in both libvirt and qemu (qemu currently doesn't seem to support changing migration capabilities once the migration has started... so we may have to defer that for now).
Oh, if qemu does not support that, then naturally we don't need the new APIs. It was not a hard requirement; it's just that if qemu supported it, we would want to support it too.

Jirka
participants (4)
- Chegu Vinod
- Jim Fehlig
- Jiri Denemark
- Paolo Bonzini