On 2014-09-24 17:19, Jiri Denemark wrote:
On Tue, Sep 23, 2014 at 16:10:01 +0200, Cristian Klein wrote:
> Added a new parameter to the `virsh migrate` command to switch to
> post-copy migration after a given timeout.
>
> Signed-off-by: Cristian Klein <cristian.klein(a)cs.umu.se>
> ---
> tools/virsh-domain.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> tools/virsh.pod | 5 ++++
> 2 files changed, 75 insertions(+), 2 deletions(-)
>
> diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
> index a6ced5f..1bba710 100644
> --- a/tools/virsh-domain.c
> +++ b/tools/virsh-domain.c
> @@ -9250,6 +9250,10 @@ static const vshCmdOptDef opts_migrate[] = {
> .type = VSH_OT_INT,
> .help = N_("force guest to suspend if live migration exceeds timeout (in
seconds)")
> },
> + {.name = "postcopy-after",
> + .type = VSH_OT_INT,
> + .help = N_("switch to post-copy migration if live migration exceeds
timeout (in seconds)")
> + },
I think we want both postcopy and postcopy-after options. And we should
also add a dedicated migrate-startpostcopy command.
Just to make sure I understood correctly, the "postcopy" option would
enable post-copy without starting it, then the user would need to type
"migrate-startpostcopy" to trigger post-copy. If so, then I'm a bit
reluctant to such a usage, as the user would need to type
"migrate-startpostcopy" from a second instance of virsh, the first one
being busy displaying migration progress. Do you think users would find
favour/need such a usage?
> {.name = "xml",
> .type = VSH_OT_STRING,
> .help = N_("filename containing updated XML for the target")
> @@ -9331,6 +9335,8 @@ doMigrate(void *opaque)
> VIR_FREE(xml);
> }
>
> + if (vshCommandOptBool(cmd, "postcopy-after")) /* actually an int */
> + flags |= VIR_MIGRATE_POSTCOPY;
> if (vshCommandOptBool(cmd, "live"))
> flags |= VIR_MIGRATE_LIVE;
> if (vshCommandOptBool(cmd, "p2p"))
> @@ -9422,6 +9428,50 @@ vshMigrationTimeout(vshControl *ctl,
> virDomainSuspend(dom);
> }
>
> +static void
> +vshMigrationPostCopyAfter(vshControl *ctl,
> + virDomainPtr dom,
> + void *opaque ATTRIBUTE_UNUSED)
> +{
> + vshDebug(ctl, VSH_ERR_DEBUG, "starting post-copy\n");
> + int rv = virDomainMigrateStartPostCopy(dom);
> + if (rv < 0) {
> + vshError(ctl, "%s", _("start post-copy command
failed"));
> + } else {
> + vshDebug(ctl, VSH_ERR_INFO, "switched to post-copy\n");
> + }
> +}
> +
> +/* Parse the --postcopy-after parameter in seconds, but store its
> + * value in milliseconds. Return -1 on error, 0 if
> + * no timeout was requested, and 1 if timeout was set.
> + * Copy-paste-adapted from vshCommandOptTimeoutToMs.
> + */
> +static int
> +vshCommandOptPostCopyAfterToMs(vshControl *ctl, const vshCmd *cmd, int
*postCopyAfter)
> +{
> + int rv = vshCommandOptInt(cmd, "postcopy-after", postCopyAfter);
> +
> + if (rv < 0 || (rv > 0 && *postCopyAfter < 0)) {
> + vshError(ctl, "%s", _("invalid postcopy-after
parameter"));
> + return -1;
> + }
> + if (rv > 0) {
> + /* Ensure that we can multiply by 1000 without overflowing. */
> + if (*postCopyAfter > INT_MAX / 1000) {
> + vshError(ctl, "%s", _("post-copy after parameter is too
big"));
> + return -1;
> + }
> + *postCopyAfter *= 1000;
> + /* 0 is a special value inside virsh, which means no timeout, so
> + * use 1ms instead for "start post-copy immediately"
> + */
> + if (*postCopyAfter == 0)
> + *postCopyAfter = 1;
> + }
> + return rv;
> +}
> +
> static bool
> cmdMigrate(vshControl *ctl, const vshCmd *cmd)
> {
> @@ -9431,6 +9481,7 @@ cmdMigrate(vshControl *ctl, const vshCmd *cmd)
> bool verbose = false;
> bool functionReturn = false;
> int timeout = 0;
> + int postCopyAfter = 0;
> bool live_flag = false;
> vshCtrlData data = { .dconn = NULL };
>
> @@ -9450,6 +9501,18 @@ cmdMigrate(vshControl *ctl, const vshCmd *cmd)
> goto cleanup;
> }
>
> + if (vshCommandOptPostCopyAfterToMs(ctl, cmd, &postCopyAfter) < 0) {
> + goto cleanup;
> + } else if (postCopyAfter > 0 && !live_flag) {
> + vshError(ctl, "%s",
> + _("migrate: Unexpected postcopy-after for offline
migration"));
> + goto cleanup;
> + } else if (postCopyAfter > 0 && timeout > 0) {
> + vshError(ctl, "%s",
> + _("migrate: --postcopy-after is incompatible with
--timeout"));
> + goto cleanup;
> + }
> +
Is post-copy only allowed for live migrations? I know it doesn't make
much sense otherwise but I'm just checking... Anyway, this check belongs
to qemu driver rather than virsh so that any API user gets the error
message.
Yes, post-copy only makes sense for live migrations. I'll add a check in
the qemu driver too, but suggest keeping the above code for coherence
with the "--timeout" option.
--
Cristian Klein, PhD
Post-doc @ UmeƄ Universitet
http://www8.cs.umu.se/~cklein