[libvirt] [PATCH v2 00/11] Rework storage migration

This patch set re-implements migration with storage for new enough qemu.
Currently, you can migrate a domain to a host without the need for shared
storage. This is done by setting the 'blk' or 'inc' attribute (representing
the VIR_MIGRATE_NON_SHARED_DISK and VIR_MIGRATE_NON_SHARED_INC flags
respectively) of the 'migrate' monitor command. However, the qemu
implementation is buggy, and applications are advised to switch to the new
implementation which, moreover, offers some nice features, like migrating
only explicitly specified disks.

The new functionality is controlled via the 'nbd-server-*' and
'drive-mirror' commands. The flow is meant to look like this:

1) User invokes libvirt's migrate functionality.

2) libvirt checks that no block jobs are active on the source.

3) libvirt starts the destination QEMU and sets up the NBD server using the
   nbd-server-start and nbd-server-add commands.

4) libvirt starts drive-mirror with a destination pointing to the remote NBD
   server, for example nbd:host:port:exportname=diskname (where diskname is
   the -drive id specified on the destination).

5) Once all mirroring jobs reach steady state, libvirt invokes the migrate
   command.

6) Once migration has completed, libvirt invokes the nbd-server-stop command
   on the destination QEMU.

If we just skip the 2nd step and there is an active block job, qemu will
fail in step 4. No big deal.

Since we try NOT to break migration and to keep things compatible, this
feature is enabled iff both sides support it. Since there's an obvious need
for some data transfer between src and dst, I've put it into
qemuCookieMigration:

1) src -> dst: (QEMU_MIGRATION_PHASE_BEGIN3 -> QEMU_MIGRATION_PHASE_PREPARE)
   <nbd>
     <disk size='17179869184'/>
   </nbd>

   Hey destination, I know how to use this cool new feature. Moreover,
   these are the disks I'll send you. Each one of them is X bytes big.
   It's one of the prerequisites: the file (disk->src) on dst exists and has
   at least the same size as on src.

2) dst -> src: (QEMU_MIGRATION_PHASE_PREPARE -> QEMU_MIGRATION_PHASE_PERFORM3)
   <nbd port='X'/>

   Okay, I (destination) support this feature as well. I've created all
   files as you (src) told me to and you can start rolling data. I am
   listening on port X.

3) src -> dst: (QEMU_MIGRATION_PHASE_PERFORM3 -> QEMU_MIGRATION_PHASE_FINISH3)
   <nbd port='-1'/>

   Migration completed, destination, you may shut the NBD server down.

If either src or dst doesn't support NBD, it is not used and the whole
process falls back to the old implementation.

diff to v1:
- Eric's and Daniel's suggestions worked in. To point out the bigger ones:
  don't do NBD style when TUNNELLED is requested, added 'b:writable' to
  'nbd-server-add'
- drop the '/qemu-migration/nbd/disk/@src' attribute from the migration
  cookie. As pointed out by Jirka, disk->src can be changed during migration
  (e.g. by a migration hook or by passed xml). So I've tried (as suggested on
  the list) passing the disk alias. However, since qemu hasn't been started
  on the destination yet, the aliases haven't been generated yet. So we have
  to rely on ordering completely.

Patches 1, 3 and 5 have been ACKed already.
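For illustration only, the drive-mirror target from step 4 could be assembled as in the sketch below. This is not code from the series: the helper name nbd_format_target and any host, port or disk alias passed to it are made up, and the only thing taken from the description above is the nbd:host:port:exportname=diskname layout.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of libvirt: write the drive-mirror
 * target "nbd:host:port:exportname=diskname" into buf.
 * Returns 0 on success, -1 if an argument is missing or buf is too small. */
static int
nbd_format_target(char *buf, size_t buflen,
                  const char *host, unsigned int port,
                  const char *diskname)
{
    int n;

    if (!buf || !host || !diskname)
        return -1;

    n = snprintf(buf, buflen, "nbd:%s:%u:exportname=%s",
                 host, port, diskname);

    /* snprintf reports the length it wanted; treat truncation as failure */
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    return 0;
}
```

The source would call something like this once per disk, using the port announced by the destination in the second cookie.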
Michal Privoznik (11):
  qemu: Introduce NBD_SERVER capability
  Introduce NBD migration cookie
  qemu: Introduce nbd-server-start command
  qemu: Introduce nbd-server-add command
  qemu: Introduce nbd-server-stop command
  qemu_migration: Introduce qemuMigrationStartNBDServer
  qemu_migration: Move port allocation to a separate func
  qemu_migration: Implement qemuMigrationStartNBDServer()
  qemu_migration: Implement qemuMigrationDriveMirror
  qemu_migration: Check size prerequisites
  qemu_migration: Stop NBD server at Finish phase

 src/qemu/qemu_capabilities.c |    3 +
 src/qemu/qemu_capabilities.h |    1 +
 src/qemu/qemu_driver.c       |    8 +-
 src/qemu/qemu_migration.c    |  620 +++++++++++++++++++++++++++++++++++++++---
 src/qemu/qemu_migration.h    |    6 +-
 src/qemu/qemu_monitor.c      |   63 +++++
 src/qemu/qemu_monitor.h      |    7 +
 src/qemu/qemu_monitor_json.c |   95 +++++++
 src/qemu/qemu_monitor_json.h |    7 +
 9 files changed, 772 insertions(+), 38 deletions(-)

-- 
1.7.8.6

This just keeps track whether qemu knows nbd-server-* commands so we
can use it during migration or not.
---
 src/qemu/qemu_capabilities.c |    3 +++
 src/qemu/qemu_capabilities.h |    1 +
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 01a1b98..0c7785a 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -194,6 +194,7 @@ VIR_ENUM_IMPL(qemuCaps, QEMU_CAPS_LAST,
               "usb-redir.bootindex",
               "usb-host.bootindex",
               "blockdev-snapshot-sync",
+              "nbd-server-start",
     );

 struct _qemuCaps {
@@ -1951,6 +1952,8 @@ qemuCapsProbeQMPCommands(qemuCapsPtr caps,
             qemuCapsSet(caps, QEMU_CAPS_DRIVE_MIRROR);
         else if (STREQ(name, "blockdev-snapshot-sync"))
             qemuCapsSet(caps, QEMU_CAPS_DISK_SNAPSHOT);
+        else if (STREQ(name, "nbd-server-start"))
+            qemuCapsSet(caps, QEMU_CAPS_NBD_SERVER);
         VIR_FREE(name);
     }
     VIR_FREE(commands);
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 3da8672..ff215ae 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -156,6 +156,7 @@ enum qemuCapsFlags {
     QEMU_CAPS_USB_REDIR_BOOTINDEX = 116, /* usb-redir.bootindex */
     QEMU_CAPS_USB_HOST_BOOTINDEX = 117, /* usb-host.bootindex */
     QEMU_CAPS_DISK_SNAPSHOT = 118, /* blockdev-snapshot-sync command */
+    QEMU_CAPS_NBD_SERVER = 119, /* nbd-server-start QMP command */

     QEMU_CAPS_LAST, /* this must always be the last item */
 };
-- 
1.7.8.6
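As a rough, standalone sketch of what the probing above amounts to: walk the command names reported by qemu and set one bit per recognised command. Everything named sketch_* or SKETCH_* below is a stand-in invented for this example (the real code uses qemuCapsSet() on a qemuCapsPtr), and the mapping table lists only a few commands.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in capability flags; the real enum is qemuCapsFlags. */
enum sketch_caps {
    SKETCH_CAPS_DRIVE_MIRROR,
    SKETCH_CAPS_DISK_SNAPSHOT,
    SKETCH_CAPS_NBD_SERVER,
    SKETCH_CAPS_LAST
};

/* QMP command whose presence implies each capability. */
static const char *const sketch_caps_cmd[SKETCH_CAPS_LAST] = {
    [SKETCH_CAPS_DRIVE_MIRROR]  = "drive-mirror",
    [SKETCH_CAPS_DISK_SNAPSHOT] = "blockdev-snapshot-sync",
    [SKETCH_CAPS_NBD_SERVER]    = "nbd-server-start",
};

/* Return a bitmask with one bit set per capability whose command
 * appears in the list reported by qemu. */
static unsigned int
sketch_caps_probe(const char *const *commands, size_t ncommands)
{
    unsigned int caps = 0;
    size_t i, j;

    for (i = 0; i < ncommands; i++) {
        for (j = 0; j < SKETCH_CAPS_LAST; j++) {
            if (strcmp(commands[i], sketch_caps_cmd[j]) == 0)
                caps |= 1U << j;
        }
    }
    return caps;
}

/* Test a single capability bit. */
static bool
sketch_caps_get(unsigned int caps, enum sketch_caps flag)
{
    return (caps & (1U << flag)) != 0;
}
```

Only nbd-server-start is probed; the series assumes that a qemu offering it also offers nbd-server-add and nbd-server-stop.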

On 12/10/2012 12:27 PM, Michal Privoznik wrote:
This just keeps track whether qemu knows nbd-server-* commands so we can use it during migration or not. --- src/qemu/qemu_capabilities.c | 3 +++ src/qemu/qemu_capabilities.h | 1 + 2 files changed, 4 insertions(+), 0 deletions(-)
ACK. Probably some trivial conflicts to deal with as you rebase to latest.

-- 
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

This migration cookie is meant for two purposes. The first is to be sent
in begin phase from source to destination to let it know we support new
implementation of VIR_MIGRATE_NON_SHARED_{DISK,INC} so destination can
start NBD server. Then, the second purpose is, destination can let us
know, on which port is NBD server running.
---
 src/qemu/qemu_migration.c |  117 ++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 106 insertions(+), 11 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 86060dc..8c1e873 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -72,6 +72,7 @@ enum qemuMigrationCookieFlags {
     QEMU_MIGRATION_COOKIE_FLAG_LOCKSTATE,
     QEMU_MIGRATION_COOKIE_FLAG_PERSISTENT,
     QEMU_MIGRATION_COOKIE_FLAG_NETWORK,
+    QEMU_MIGRATION_COOKIE_FLAG_NBD,

     QEMU_MIGRATION_COOKIE_FLAG_LAST
 };
@@ -79,13 +80,18 @@ enum qemuMigrationCookieFlags {
 VIR_ENUM_DECL(qemuMigrationCookieFlag);
 VIR_ENUM_IMPL(qemuMigrationCookieFlag,
               QEMU_MIGRATION_COOKIE_FLAG_LAST,
-              "graphics", "lockstate", "persistent", "network");
+              "graphics",
+              "lockstate",
+              "persistent",
+              "network",
+              "nbd");

 enum qemuMigrationCookieFeatures {
     QEMU_MIGRATION_COOKIE_GRAPHICS = (1 << QEMU_MIGRATION_COOKIE_FLAG_GRAPHICS),
     QEMU_MIGRATION_COOKIE_LOCKSTATE = (1 << QEMU_MIGRATION_COOKIE_FLAG_LOCKSTATE),
     QEMU_MIGRATION_COOKIE_PERSISTENT = (1 << QEMU_MIGRATION_COOKIE_FLAG_PERSISTENT),
     QEMU_MIGRATION_COOKIE_NETWORK = (1 << QEMU_MIGRATION_COOKIE_FLAG_NETWORK),
+    QEMU_MIGRATION_COOKIE_NBD = (1 << QEMU_MIGRATION_COOKIE_FLAG_NBD),
 };

 typedef struct _qemuMigrationCookieGraphics qemuMigrationCookieGraphics;
@@ -119,6 +125,17 @@ struct _qemuMigrationCookieNetwork {
     qemuMigrationCookieNetDataPtr net;
 };

+typedef struct _qemuMigrationCookieNBD qemuMigrationCookieNBD;
+typedef qemuMigrationCookieNBD *qemuMigrationCookieNBDPtr;
+struct _qemuMigrationCookieNBD {
+    int port; /* on which port does NBD server listen for incoming data.
+                 Zero value has special meaning - it is there just to let
+                 destination know we (the source) do support NBD.
+                 Negative one is meant to be sent when translating from
+                 perform to finish phase to let destination know it's
+                 safe to stop NBD server.*/
+};
+
 typedef struct _qemuMigrationCookie qemuMigrationCookie;
 typedef qemuMigrationCookie *qemuMigrationCookiePtr;
 struct _qemuMigrationCookie {
@@ -147,6 +164,9 @@ struct _qemuMigrationCookie {

     /* If (flags & QEMU_MIGRATION_COOKIE_NETWORK) */
     qemuMigrationCookieNetworkPtr network;
+
+    /* If (flags & QEMU_MIGRATION_COOKIE_NBD) */
+    qemuMigrationCookieNBDPtr nbd;
 };

 static void qemuMigrationCookieGraphicsFree(qemuMigrationCookieGraphicsPtr grap)
@@ -192,6 +212,7 @@ static void qemuMigrationCookieFree(qemuMigrationCookiePtr mig)
     VIR_FREE(mig->name);
     VIR_FREE(mig->lockState);
     VIR_FREE(mig->lockDriver);
+    VIR_FREE(mig->nbd);
     VIR_FREE(mig);
 }

@@ -492,6 +513,24 @@ qemuMigrationCookieAddNetwork(qemuMigrationCookiePtr mig,
 }


+static int
+qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig,
+                          int nbdPort)
+{
+    /* It is not a bug if there already is a NBD data */
+    if (!mig->nbd &&
+        VIR_ALLOC(mig->nbd) < 0) {
+        virReportOOMError();
+        return -1;
+    }
+
+    mig->nbd->port = nbdPort;
+    mig->flags |= QEMU_MIGRATION_COOKIE_NBD;
+
+    return 0;
+}
+
+
 static void qemuMigrationCookieGraphicsXMLFormat(virBufferPtr buf,
                                                  qemuMigrationCookieGraphicsPtr grap)
 {
@@ -594,6 +633,13 @@ qemuMigrationCookieXMLFormat(virQEMUDriverPtr driver,
     if ((mig->flags & QEMU_MIGRATION_COOKIE_NETWORK) && mig->network)
         qemuMigrationCookieNetworkXMLFormat(buf, mig->network);

+    if ((mig->flags & QEMU_MIGRATION_COOKIE_NBD) && mig->nbd) {
+        virBufferAddLit(buf, " <nbd");
+        if (mig->nbd->port)
+            virBufferAsprintf(buf, " port='%d'", mig->nbd->port);
+        virBufferAddLit(buf, "/>\n");
+    }
+
     virBufferAddLit(buf, "</qemu-migration>\n");
     return 0;
 }
@@ -821,6 +867,12 @@ qemuMigrationCookieXMLParse(qemuMigrationCookiePtr mig,
             goto error;
         }

+        /* nbd is optional */
+        if (val == QEMU_MIGRATION_COOKIE_FLAG_NBD) {
+            VIR_FREE(str);
+            continue;
+        }
+
         if ((flags & (1 << val)) == 0) {
             virReportError(VIR_ERR_INTERNAL_ERROR,
                            _("Unsupported migration cookie feature %s"),
@@ -873,6 +925,25 @@ qemuMigrationCookieXMLParse(qemuMigrationCookiePtr mig,
         (!(mig->network = qemuMigrationCookieNetworkXMLParse(ctxt))))
         goto error;

+    if (flags & QEMU_MIGRATION_COOKIE_NBD &&
+        virXPathBoolean("count(./nbd) > 0", ctxt)) {
+        char *port;
+
+        if (VIR_ALLOC(mig->nbd) < 0) {
+            virReportOOMError();
+            goto error;
+        }
+
+        port = virXPathString("string(./nbd/@port)", ctxt);
+        if (port &&
+            virStrToLong_i(port, NULL, 10, &mig->nbd->port) < 0) {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("Malformed nbd port '%s'"),
+                           port);
+            goto error;
+        }
+    }
+
     return 0;

 error:
@@ -911,6 +982,7 @@ static int
 qemuMigrationBakeCookie(qemuMigrationCookiePtr mig,
                         virQEMUDriverPtr driver,
                         virDomainObjPtr dom,
+                        int nbdPort,
                         char **cookieout,
                         int *cookieoutlen,
                         unsigned int flags)
@@ -937,6 +1009,10 @@ qemuMigrationBakeCookie(qemuMigrationCookiePtr mig,
         return -1;
     }

+    if (flags & QEMU_MIGRATION_COOKIE_NBD &&
+        qemuMigrationCookieAddNBD(mig, nbdPort) < 0)
+        return -1;
+
     if (!(*cookieout = qemuMigrationCookieXMLFormatStr(driver, mig)))
         return -1;

@@ -1415,6 +1491,7 @@ char *qemuMigrationBegin(virQEMUDriverPtr driver,
     qemuMigrationCookiePtr mig = NULL;
     virDomainDefPtr def = NULL;
     qemuDomainObjPrivatePtr priv = vm->privateData;
+    unsigned int cookie_flags = QEMU_MIGRATION_COOKIE_LOCKSTATE;

     VIR_DEBUG("driver=%p, vm=%p, xmlin=%s, dname=%s,"
               " cookieout=%p, cookieoutlen=%p, flags=%lx",
@@ -1434,12 +1511,20 @@ char *qemuMigrationBegin(virQEMUDriverPtr driver,
     if (!(flags & VIR_MIGRATE_UNSAFE) && !qemuMigrationIsSafe(vm->def))
         goto cleanup;

+    if (qemuCapsGet(priv->caps, QEMU_CAPS_NBD_SERVER)) {
+        /* TODO support NBD for TUNNELLED migration */
+        if (flags & VIR_MIGRATE_TUNNELLED)
+            VIR_DEBUG("NBD in tunnelled migration is currently not supported");
+        else
+            cookie_flags |= QEMU_MIGRATION_COOKIE_NBD;
+    }
+
     if (!(mig = qemuMigrationEatCookie(driver, vm, NULL, 0, 0)))
         goto cleanup;

-    if (qemuMigrationBakeCookie(mig, driver, vm,
+    if (qemuMigrationBakeCookie(mig, driver, vm, 0,
                                 cookieout, cookieoutlen,
-                                QEMU_MIGRATION_COOKIE_LOCKSTATE) < 0)
+                                cookie_flags) < 0)
         goto cleanup;

     if (xmlin) {
@@ -1512,6 +1597,8 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
     bool tunnel = !!st;
     char *origname = NULL;
     char *xmlout = NULL;
+    int nbdPort = 0;
+    unsigned int cookie_flags = QEMU_MIGRATION_COOKIE_GRAPHICS;

     if (virTimeMillisNow(&now) < 0)
         return -1;
@@ -1589,7 +1676,8 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
     origname = NULL;

     if (!(mig = qemuMigrationEatCookie(driver, vm, cookiein, cookieinlen,
-                                       QEMU_MIGRATION_COOKIE_LOCKSTATE)))
+                                       QEMU_MIGRATION_COOKIE_LOCKSTATE |
+                                       QEMU_MIGRATION_COOKIE_NBD)))
         goto cleanup;

     if (qemuMigrationJobStart(driver, vm, QEMU_ASYNC_JOB_MIGRATION_IN) < 0)
@@ -1640,8 +1728,12 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
         VIR_DEBUG("Received no lockstate");
     }

-    if (qemuMigrationBakeCookie(mig, driver, vm, cookieout, cookieoutlen,
-                                QEMU_MIGRATION_COOKIE_GRAPHICS) < 0) {
+    /* dummy place holder for real work */
+    nbdPort = 0;
+    cookie_flags |= QEMU_MIGRATION_COOKIE_NBD;
+
+    if (qemuMigrationBakeCookie(mig, driver, vm, nbdPort,
+                                cookieout, cookieoutlen, cookie_flags) < 0) {
         /* We could tear down the whole guest here, but
          * cookie data is (so far) non-critical, so that
          * seems a little harsh. We'll just warn for now.
@@ -2150,7 +2242,8 @@ qemuMigrationRun(virQEMUDriverPtr driver,
     }

     if (!(mig = qemuMigrationEatCookie(driver, vm, cookiein, cookieinlen,
-                                       QEMU_MIGRATION_COOKIE_GRAPHICS)))
+                                       QEMU_MIGRATION_COOKIE_GRAPHICS |
+                                       QEMU_MIGRATION_COOKIE_NBD)))
         goto cleanup;

     if (qemuDomainMigrateGraphicsRelocate(driver, vm, mig) < 0)
@@ -2296,9 +2389,10 @@ cleanup:
     }

     if (ret == 0 &&
-        qemuMigrationBakeCookie(mig, driver, vm, cookieout, cookieoutlen,
+        qemuMigrationBakeCookie(mig, driver, vm, -1, cookieout, cookieoutlen,
                                 QEMU_MIGRATION_COOKIE_PERSISTENT |
-                                QEMU_MIGRATION_COOKIE_NETWORK) < 0) {
+                                QEMU_MIGRATION_COOKIE_NETWORK |
+                                QEMU_MIGRATION_COOKIE_NBD) < 0) {
         VIR_WARN("Unable to encode migration cookie");
     }

@@ -3233,7 +3327,7 @@ qemuMigrationFinish(virQEMUDriverPtr driver,

     qemuDomainCleanupRemove(vm, qemuMigrationPrepareCleanup);

-    cookie_flags = QEMU_MIGRATION_COOKIE_NETWORK;
+    cookie_flags = QEMU_MIGRATION_COOKIE_NETWORK | QEMU_MIGRATION_COOKIE_NBD;
     if (flags & VIR_MIGRATE_PERSIST_DEST)
         cookie_flags |= QEMU_MIGRATION_COOKIE_PERSISTENT;

@@ -3377,7 +3471,8 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
                                          VIR_DOMAIN_EVENT_STOPPED_FAILED);
     }

-    if (qemuMigrationBakeCookie(mig, driver, vm, cookieout, cookieoutlen, 0) < 0)
+    if (qemuMigrationBakeCookie(mig, driver, vm, 0,
+                                cookieout, cookieoutlen, 0) < 0)
         VIR_WARN("Unable to encode migration cookie");

 endjob:
-- 
1.7.8.6
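To make the three meanings of the port value easier to see, here is a small standalone sketch. It is not code from this patch: the sketch_* names are invented, it treats every negative value as "stop" although the patch only ever sends -1, and it formats into a plain char buffer instead of a virBuffer.

```c
#include <stdio.h>
#include <stddef.h>

/* What a given port value in the <nbd/> cookie element means,
 * following the comment in struct _qemuMigrationCookieNBD. */
enum sketch_nbd_msg {
    SKETCH_NBD_SUPPORTED,  /* port == 0: sender supports NBD */
    SKETCH_NBD_LISTENING,  /* port > 0: NBD server listens on this port */
    SKETCH_NBD_STOP        /* port < 0: safe to stop the NBD server */
};

static enum sketch_nbd_msg
sketch_nbd_classify(int port)
{
    if (port == 0)
        return SKETCH_NBD_SUPPORTED;
    return port > 0 ? SKETCH_NBD_LISTENING : SKETCH_NBD_STOP;
}

/* Format the element the way the XML formatter in the patch does:
 * the port attribute is left out when the port is zero.
 * Returns 0 on success, -1 if buf is too small. */
static int
sketch_nbd_format(char *buf, size_t buflen, int port)
{
    int n;

    if (port)
        n = snprintf(buf, buflen, "<nbd port='%d'/>", port);
    else
        n = snprintf(buf, buflen, "<nbd/>");

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A bare `<nbd/>` therefore acts purely as a capability announcement, which is why the parser treats the port attribute as optional.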

On Mon, 2012-12-10 at 20:27 +0100, Michal Privoznik wrote:
This migration cookie is meant for two purposes. The first is to be sent in begin phase from source to destination to let it know we support new implementation of VIR_MIGRATE_NON_SHARED_{DISK,INC} so destination can start NBD server. Then, the second purpose is, destination can let us know, on which port is NBD server running. --- src/qemu/qemu_migration.c | 117 ++++++++++++++++++++++++++++++++++++++++---- 1 files changed, 106 insertions(+), 11 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 86060dc..8c1e873 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -72,6 +72,7 @@ enum qemuMigrationCookieFlags { QEMU_MIGRATION_COOKIE_FLAG_LOCKSTATE, QEMU_MIGRATION_COOKIE_FLAG_PERSISTENT, QEMU_MIGRATION_COOKIE_FLAG_NETWORK, + QEMU_MIGRATION_COOKIE_FLAG_NBD,
QEMU_MIGRATION_COOKIE_FLAG_LAST }; @@ -79,13 +80,18 @@ enum qemuMigrationCookieFlags { VIR_ENUM_DECL(qemuMigrationCookieFlag); VIR_ENUM_IMPL(qemuMigrationCookieFlag, QEMU_MIGRATION_COOKIE_FLAG_LAST, - "graphics", "lockstate", "persistent", "network"); + "graphics", + "lockstate", + "persistent", + "network", + "nbd");
enum qemuMigrationCookieFeatures { QEMU_MIGRATION_COOKIE_GRAPHICS = (1 << QEMU_MIGRATION_COOKIE_FLAG_GRAPHICS), QEMU_MIGRATION_COOKIE_LOCKSTATE = (1 << QEMU_MIGRATION_COOKIE_FLAG_LOCKSTATE), QEMU_MIGRATION_COOKIE_PERSISTENT = (1 << QEMU_MIGRATION_COOKIE_FLAG_PERSISTENT), QEMU_MIGRATION_COOKIE_NETWORK = (1 << QEMU_MIGRATION_COOKIE_FLAG_NETWORK), + QEMU_MIGRATION_COOKIE_NBD = (1 << QEMU_MIGRATION_COOKIE_FLAG_NBD), };
typedef struct _qemuMigrationCookieGraphics qemuMigrationCookieGraphics; @@ -119,6 +125,17 @@ struct _qemuMigrationCookieNetwork { qemuMigrationCookieNetDataPtr net; };
+typedef struct _qemuMigrationCookieNBD qemuMigrationCookieNBD; +typedef qemuMigrationCookieNBD *qemuMigrationCookieNBDPtr; +struct _qemuMigrationCookieNBD { + int port; /* on which port does NBD server listen for incoming data. + Zero value has special meaning - it is there just to let + destination know we (the source) do support NBD. + Negative one is meant to be sent when translating from + perform to finish phase to let destination know it's + safe to stop NBD server.*/ +}; + typedef struct _qemuMigrationCookie qemuMigrationCookie; typedef qemuMigrationCookie *qemuMigrationCookiePtr; struct _qemuMigrationCookie { @@ -147,6 +164,9 @@ struct _qemuMigrationCookie {
/* If (flags & QEMU_MIGRATION_COOKIE_NETWORK) */ qemuMigrationCookieNetworkPtr network; + + /* If (flags & QEMU_MIGRATION_COOKIE_NBD) */ + qemuMigrationCookieNBDPtr nbd; };
static void qemuMigrationCookieGraphicsFree(qemuMigrationCookieGraphicsPtr grap) @@ -192,6 +212,7 @@ static void qemuMigrationCookieFree(qemuMigrationCookiePtr mig) VIR_FREE(mig->name); VIR_FREE(mig->lockState); VIR_FREE(mig->lockDriver); + VIR_FREE(mig->nbd); VIR_FREE(mig); }
@@ -492,6 +513,24 @@ qemuMigrationCookieAddNetwork(qemuMigrationCookiePtr mig, }
+static int +qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig, + int nbdPort) +{ + /* It is not a bug if there already is a NBD data */ + if (!mig->nbd && + VIR_ALLOC(mig->nbd) < 0) { + virReportOOMError(); + return -1; + } + + mig->nbd->port = nbdPort; + mig->flags |= QEMU_MIGRATION_COOKIE_NBD; + + return 0; +} + + static void qemuMigrationCookieGraphicsXMLFormat(virBufferPtr buf, qemuMigrationCookieGraphicsPtr grap) { @@ -594,6 +633,13 @@ qemuMigrationCookieXMLFormat(virQEMUDriverPtr driver, if ((mig->flags & QEMU_MIGRATION_COOKIE_NETWORK) && mig->network) qemuMigrationCookieNetworkXMLFormat(buf, mig->network);
+ if ((mig->flags & QEMU_MIGRATION_COOKIE_NBD) && mig->nbd) { + virBufferAddLit(buf, " <nbd"); + if (mig->nbd->port) + virBufferAsprintf(buf, " port='%d'", mig->nbd->port); + virBufferAddLit(buf, "/>\n"); + } + virBufferAddLit(buf, "</qemu-migration>\n"); return 0; } @@ -821,6 +867,12 @@ qemuMigrationCookieXMLParse(qemuMigrationCookiePtr mig, goto error; }
+ /* nbd is optional */ + if (val == QEMU_MIGRATION_COOKIE_FLAG_NBD) { + VIR_FREE(str); + continue; + } + if ((flags & (1 << val)) == 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Unsupported migration cookie feature %s"), @@ -873,6 +925,25 @@ qemuMigrationCookieXMLParse(qemuMigrationCookiePtr mig, (!(mig->network = qemuMigrationCookieNetworkXMLParse(ctxt)))) goto error;
+ if (flags & QEMU_MIGRATION_COOKIE_NBD && + virXPathBoolean("count(./nbd) > 0", ctxt)) { + char *port; + + if (VIR_ALLOC(mig->nbd) < 0) { + virReportOOMError(); + goto error; + } + + port = virXPathString("string(./nbd/@port)", ctxt); + if (port && + virStrToLong_i(port, NULL, 10, &mig->nbd->port) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Malformed nbd port '%s'"), + port); + goto error; + } + } + return 0;
error: @@ -911,6 +982,7 @@ static int qemuMigrationBakeCookie(qemuMigrationCookiePtr mig, virQEMUDriverPtr driver, virDomainObjPtr dom, + int nbdPort, char **cookieout, int *cookieoutlen, unsigned int flags) @@ -937,6 +1009,10 @@ qemuMigrationBakeCookie(qemuMigrationCookiePtr mig, return -1; }
+ if (flags & QEMU_MIGRATION_COOKIE_NBD && + qemuMigrationCookieAddNBD(mig, nbdPort) < 0) + return -1; + if (!(*cookieout = qemuMigrationCookieXMLFormatStr(driver, mig))) return -1;
@@ -1415,6 +1491,7 @@ char *qemuMigrationBegin(virQEMUDriverPtr driver, qemuMigrationCookiePtr mig = NULL; virDomainDefPtr def = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; + unsigned int cookie_flags = QEMU_MIGRATION_COOKIE_LOCKSTATE;
VIR_DEBUG("driver=%p, vm=%p, xmlin=%s, dname=%s," " cookieout=%p, cookieoutlen=%p, flags=%lx", @@ -1434,12 +1511,20 @@ char *qemuMigrationBegin(virQEMUDriverPtr driver, if (!(flags & VIR_MIGRATE_UNSAFE) && !qemuMigrationIsSafe(vm->def)) goto cleanup;
+ if (qemuCapsGet(priv->caps, QEMU_CAPS_NBD_SERVER)) { + /* TODO support NBD for TUNNELLED migration */ + if (flags & VIR_MIGRATE_TUNNELLED) + VIR_DEBUG("NBD in tunnelled migration is currently not supported"); + else + cookie_flags |= QEMU_MIGRATION_COOKIE_NBD; + } +
A check like 'if (flags & VIR_MIGRATE_NON_SHARED_INC || flags & VIR_MIGRATE_NON_SHARED_DISK)' is also required here, so that NBD is only advertised when storage migration was requested.
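A minimal sketch of the combined condition, for illustration only. The SKETCH_* bits are placeholders with arbitrary values (the real ones come from virDomainMigrateFlags), and sketch_want_nbd_cookie is a made-up name; the point is just the order of the three tests.

```c
#include <stdbool.h>

/* Placeholder flag bits for this sketch only; they do not match the
 * real VIR_MIGRATE_* values. */
#define SKETCH_MIGRATE_TUNNELLED       (1U << 0)
#define SKETCH_MIGRATE_NON_SHARED_DISK (1U << 1)
#define SKETCH_MIGRATE_NON_SHARED_INC  (1U << 2)

/* Decide whether the Begin phase should add the NBD element to the
 * cookie: qemu must know nbd-server-*, the migration must not be
 * tunnelled, and the user must have asked for a storage copy. */
static bool
sketch_want_nbd_cookie(bool have_nbd_server_cap, unsigned int flags)
{
    if (!have_nbd_server_cap)
        return false;

    if (flags & SKETCH_MIGRATE_TUNNELLED)
        return false;

    return (flags & (SKETCH_MIGRATE_NON_SHARED_DISK |
                     SKETCH_MIGRATE_NON_SHARED_INC)) != 0;
}
```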
if (!(mig = qemuMigrationEatCookie(driver, vm, NULL, 0, 0))) goto cleanup;
- if (qemuMigrationBakeCookie(mig, driver, vm, + if (qemuMigrationBakeCookie(mig, driver, vm, 0, cookieout, cookieoutlen, - QEMU_MIGRATION_COOKIE_LOCKSTATE) < 0) + cookie_flags) < 0) goto cleanup;
if (xmlin) { @@ -1512,6 +1597,8 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, bool tunnel = !!st; char *origname = NULL; char *xmlout = NULL; + int nbdPort = 0; + unsigned int cookie_flags = QEMU_MIGRATION_COOKIE_GRAPHICS;
if (virTimeMillisNow(&now) < 0) return -1; @@ -1589,7 +1676,8 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, origname = NULL;
if (!(mig = qemuMigrationEatCookie(driver, vm, cookiein, cookieinlen, - QEMU_MIGRATION_COOKIE_LOCKSTATE))) + QEMU_MIGRATION_COOKIE_LOCKSTATE | + QEMU_MIGRATION_COOKIE_NBD))) goto cleanup;
if (qemuMigrationJobStart(driver, vm, QEMU_ASYNC_JOB_MIGRATION_IN) < 0) @@ -1640,8 +1728,12 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, VIR_DEBUG("Received no lockstate"); }
- if (qemuMigrationBakeCookie(mig, driver, vm, cookieout, cookieoutlen, - QEMU_MIGRATION_COOKIE_GRAPHICS) < 0) { + /* dummy place holder for real work */ + nbdPort = 0; + cookie_flags |= QEMU_MIGRATION_COOKIE_NBD; + + if (qemuMigrationBakeCookie(mig, driver, vm, nbdPort, + cookieout, cookieoutlen, cookie_flags) < 0) { /* We could tear down the whole guest here, but * cookie data is (so far) non-critical, so that * seems a little harsh. We'll just warn for now. @@ -2150,7 +2242,8 @@ qemuMigrationRun(virQEMUDriverPtr driver, }
if (!(mig = qemuMigrationEatCookie(driver, vm, cookiein, cookieinlen, - QEMU_MIGRATION_COOKIE_GRAPHICS))) + QEMU_MIGRATION_COOKIE_GRAPHICS | + QEMU_MIGRATION_COOKIE_NBD))) goto cleanup;
if (qemuDomainMigrateGraphicsRelocate(driver, vm, mig) < 0) @@ -2296,9 +2389,10 @@ cleanup: }
if (ret == 0 && - qemuMigrationBakeCookie(mig, driver, vm, cookieout, cookieoutlen, + qemuMigrationBakeCookie(mig, driver, vm, -1, cookieout, cookieoutlen, QEMU_MIGRATION_COOKIE_PERSISTENT | - QEMU_MIGRATION_COOKIE_NETWORK) < 0) { + QEMU_MIGRATION_COOKIE_NETWORK | + QEMU_MIGRATION_COOKIE_NBD) < 0) { VIR_WARN("Unable to encode migration cookie"); }
@@ -3233,7 +3327,7 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
qemuDomainCleanupRemove(vm, qemuMigrationPrepareCleanup);
- cookie_flags = QEMU_MIGRATION_COOKIE_NETWORK; + cookie_flags = QEMU_MIGRATION_COOKIE_NETWORK | QEMU_MIGRATION_COOKIE_NBD; if (flags & VIR_MIGRATE_PERSIST_DEST) cookie_flags |= QEMU_MIGRATION_COOKIE_PERSISTENT;
@@ -3377,7 +3471,8 @@ qemuMigrationFinish(virQEMUDriverPtr driver, VIR_DOMAIN_EVENT_STOPPED_FAILED); }
- if (qemuMigrationBakeCookie(mig, driver, vm, cookieout, cookieoutlen, 0) < 0) + if (qemuMigrationBakeCookie(mig, driver, vm, 0, + cookieout, cookieoutlen, 0) < 0) VIR_WARN("Unable to encode migration cookie");
endjob:
-- regards! li guang

This will be used with new migration scheme.
This patch creates basically just monitor stub
functions. Wiring them into something useful
is done in later patches.
---
 src/qemu/qemu_monitor.c      |   22 ++++++++++++++++++
 src/qemu/qemu_monitor.h      |    3 ++
 src/qemu/qemu_monitor_json.c |   49 ++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_monitor_json.h |    3 ++
 4 files changed, 77 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 5ad6c15..50cf34a 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -3320,3 +3320,25 @@ char *qemuMonitorGetTargetArch(qemuMonitorPtr mon)

     return qemuMonitorJSONGetTargetArch(mon);
 }
+
+int qemuMonitorNBDServerStart(qemuMonitorPtr mon,
+                              const char *host,
+                              unsigned int port)
+{
+    VIR_DEBUG("mon=%p host=%s port=%u",
+              mon, host, port);
+
+    if (!mon) {
+        virReportError(VIR_ERR_INVALID_ARG, "%s",
+                       _("monitor must not be NULL"));
+        return -1;
+    }
+
+    if (!mon->json) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("JSON monitor is required"));
+        return -1;
+    }
+
+    return qemuMonitorJSONNBDServerStart(mon, host, port);
+}
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index dbfab88..eb069ed 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -631,6 +631,9 @@ int qemuMonitorGetObjectProps(qemuMonitorPtr mon,
                               char ***props);
 char *qemuMonitorGetTargetArch(qemuMonitorPtr mon);

+int qemuMonitorNBDServerStart(qemuMonitorPtr mon,
+                              const char *host,
+                              unsigned int port);
 /**
  * When running two dd process and using <> redirection, we need a
  * shell that will not truncate files. These two strings serve that
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index 0cd66b6..545285d 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -4273,3 +4273,52 @@ cleanup:
     virJSONValueFree(reply);
     return ret;
 }
+
+int
+qemuMonitorJSONNBDServerStart(qemuMonitorPtr mon,
+                              const char *host,
+                              unsigned int port)
+{
+    int ret = -1;
+    virJSONValuePtr cmd = NULL;
+    virJSONValuePtr reply = NULL;
+    virJSONValuePtr data = NULL;
+    virJSONValuePtr addr = NULL;
+    char *port_str = NULL;
+
+    if (!(data = virJSONValueNewObject()) ||
+        !(addr = virJSONValueNewObject()) ||
+        virAsprintf(&port_str, "%d", port) < 0) {
+        virReportOOMError();
+        goto cleanup;
+    }
+
+    if (virJSONValueObjectAppendString(data, "host", host) < 0 ||
+        virJSONValueObjectAppendString(data, "port", port_str) < 0 ||
+        virJSONValueObjectAppendString(addr, "type", "inet") < 0 ||
+        virJSONValueObjectAppend(addr, "data", data) < 0) {
+        virReportOOMError();
+        goto cleanup;
+    }
+
+    if (!(cmd = qemuMonitorJSONMakeCommand("nbd-server-start",
+                                           "a:addr", addr,
+                                           NULL)))
+        goto cleanup;
+
+    if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0)
+        goto cleanup;
+
+    if (qemuMonitorJSONCheckError(cmd, reply) < 0)
+        goto cleanup;
+
+    ret = 0;
+
+cleanup:
+    VIR_FREE(port_str);
+    virJSONValueFree(reply);
+    virJSONValueFree(cmd);
+    virJSONValueFree(addr);
+    virJSONValueFree(data);
+    return ret;
+}
diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h
index acca4ec..0e43e39 100644
--- a/src/qemu/qemu_monitor_json.h
+++ b/src/qemu/qemu_monitor_json.h
@@ -324,4 +324,7 @@ int qemuMonitorJSONGetObjectProps(qemuMonitorPtr mon,
     ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3);
 char *qemuMonitorJSONGetTargetArch(qemuMonitorPtr mon);

+int qemuMonitorJSONNBDServerStart(qemuMonitorPtr mon,
+                                  const char *host,
+                                  unsigned int port);
 #endif /* QEMU_MONITOR_JSON_H */
-- 
1.7.8.6
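For readers who want to see what ends up on the wire, the sketch below prints the QMP message that the JSON objects built above correspond to: an "inet" address whose data holds host and port, with the port sent as a string. It is a simplification under stated assumptions: sketch_nbd_server_start_json is an invented name, the real code builds virJSONValue objects rather than formatting text, the key order is only illustrative, and the host is not JSON-escaped here.

```c
#include <stdio.h>
#include <stddef.h>

/* Illustrative only: write the nbd-server-start QMP command into buf.
 * Assumes host contains no characters that need JSON escaping.
 * Returns 0 on success, -1 if buf is too small. */
static int
sketch_nbd_server_start_json(char *buf, size_t buflen,
                             const char *host, unsigned int port)
{
    int n = snprintf(buf, buflen,
                     "{\"execute\":\"nbd-server-start\","
                     "\"arguments\":{\"addr\":{\"type\":\"inet\","
                     "\"data\":{\"host\":\"%s\",\"port\":\"%u\"}}}}",
                     host, port);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```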

This will be used with new migration scheme.
This patch creates basically just monitor stub
functions. Wiring them into something useful
is done in later patches.
---
 src/qemu/qemu_monitor.c      |   22 ++++++++++++++++++++++
 src/qemu/qemu_monitor.h      |    3 +++
 src/qemu/qemu_monitor_json.c |   25 +++++++++++++++++++++++++
 src/qemu/qemu_monitor_json.h |    3 +++
 4 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 50cf34a..525cc0e 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -3342,3 +3342,25 @@ int qemuMonitorNBDServerStart(qemuMonitorPtr mon,

     return qemuMonitorJSONNBDServerStart(mon, host, port);
 }
+
+int qemuMonitorNBDServerAdd(qemuMonitorPtr mon,
+                            const char *deviceID,
+                            bool writable)
+{
+    VIR_DEBUG("mon=%p deviceID=%s",
+              mon, deviceID);
+
+    if (!mon) {
+        virReportError(VIR_ERR_INVALID_ARG, "%s",
+                       _("monitor must not be NULL"));
+        return -1;
+    }
+
+    if (!mon->json) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("JSON monitor is required"));
+        return -1;
+    }
+
+    return qemuMonitorJSONNBDServerAdd(mon, deviceID, writable);
+}
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index eb069ed..33c772f 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -634,6 +634,9 @@ char *qemuMonitorGetTargetArch(qemuMonitorPtr mon);
 int qemuMonitorNBDServerStart(qemuMonitorPtr mon,
                               const char *host,
                               unsigned int port);
+int qemuMonitorNBDServerAdd(qemuMonitorPtr mon,
+                            const char *deviceID,
+                            bool writable);
 /**
  * When running two dd process and using <> redirection, we need a
  * shell that will not truncate files. These two strings serve that
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index 545285d..1f2951f 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -4322,3 +4322,28 @@ cleanup:
     virJSONValueFree(data);
     return ret;
 }
+
+int
+qemuMonitorJSONNBDServerAdd(qemuMonitorPtr mon,
+                            const char *deviceID,
+                            bool writable)
+{
+    int ret = -1;
+    virJSONValuePtr cmd;
+    virJSONValuePtr reply = NULL;
+
+    if (!(cmd = qemuMonitorJSONMakeCommand("nbd-server-add",
+                                           "s:device", deviceID,
+                                           "b:writable", writable,
+                                           NULL)))
+        return ret;
+
+    ret = qemuMonitorJSONCommand(mon, cmd, &reply);
+
+    if (ret == 0)
+        ret = qemuMonitorJSONCheckError(cmd, reply);
+
+    virJSONValueFree(cmd);
+    virJSONValueFree(reply);
+    return ret;
+}
diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h
index 0e43e39..14bd59f 100644
--- a/src/qemu/qemu_monitor_json.h
+++ b/src/qemu/qemu_monitor_json.h
@@ -327,4 +327,7 @@ char *qemuMonitorJSONGetTargetArch(qemuMonitorPtr mon);
 int qemuMonitorJSONNBDServerStart(qemuMonitorPtr mon,
                                   const char *host,
                                   unsigned int port);
+int qemuMonitorJSONNBDServerAdd(qemuMonitorPtr mon,
+                                const char *deviceID,
+                                bool writable);
 #endif /* QEMU_MONITOR_JSON_H */
-- 
1.7.8.6
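The "s:device" and "b:writable" arguments above become a string and a boolean member of the command's arguments. As a hedged illustration (invented function name, plain text formatting instead of virJSONValue, no JSON escaping of the device id), the resulting message would look like what this sketch produces:

```c
#include <stdbool.h>
#include <stdio.h>
#include <stddef.h>

/* Illustrative only: write the nbd-server-add QMP command into buf.
 * Assumes device_id needs no JSON escaping.
 * Returns 0 on success, -1 if buf is too small. */
static int
sketch_nbd_server_add_json(char *buf, size_t buflen,
                           const char *device_id, bool writable)
{
    int n = snprintf(buf, buflen,
                     "{\"execute\":\"nbd-server-add\","
                     "\"arguments\":{\"device\":\"%s\",\"writable\":%s}}",
                     device_id, writable ? "true" : "false");

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For incoming storage migration the export has to be writable, since the source mirrors its disk contents into it.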

This will be used after all migration work is done
to stop NBD server running on destination. It doesn't
take any arguments, just issues a command.
---
 src/qemu/qemu_monitor.c      |   19 +++++++++++++++++++
 src/qemu/qemu_monitor.h      |    1 +
 src/qemu/qemu_monitor_json.c |   21 +++++++++++++++++++++
 src/qemu/qemu_monitor_json.h |    1 +
 4 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 525cc0e..c7a07c2 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -3364,3 +3364,22 @@ int qemuMonitorNBDServerAdd(qemuMonitorPtr mon,

     return qemuMonitorJSONNBDServerAdd(mon, deviceID, writable);
 }
+
+int qemuMonitorNBDServerStop(qemuMonitorPtr mon)
+{
+    VIR_DEBUG("mon=%p", mon);
+
+    if (!mon) {
+        virReportError(VIR_ERR_INVALID_ARG, "%s",
+                       _("monitor must not be NULL"));
+        return -1;
+    }
+
+    if (!mon->json) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("JSON monitor is required"));
+        return -1;
+    }
+
+    return qemuMonitorJSONNBDServerStop(mon);
+}
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index 33c772f..a18691d 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -637,6 +637,7 @@ int qemuMonitorNBDServerStart(qemuMonitorPtr mon,
 int qemuMonitorNBDServerAdd(qemuMonitorPtr mon,
                             const char *deviceID,
                             bool writable);
+int qemuMonitorNBDServerStop(qemuMonitorPtr);
 /**
  * When running two dd process and using <> redirection, we need a
  * shell that will not truncate files. These two strings serve that
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index 1f2951f..d9504ca 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -4347,3 +4347,24 @@ qemuMonitorJSONNBDServerAdd(qemuMonitorPtr mon,
     virJSONValueFree(reply);
     return ret;
 }
+
+int
+qemuMonitorJSONNBDServerStop(qemuMonitorPtr mon)
+{
+    int ret = -1;
+    virJSONValuePtr cmd;
+    virJSONValuePtr reply = NULL;
+
+    if (!(cmd = qemuMonitorJSONMakeCommand("nbd-server-stop",
+                                           NULL)))
+        return ret;
+
+    ret = qemuMonitorJSONCommand(mon, cmd, &reply);
+
+    if (ret == 0)
+        ret = qemuMonitorJSONCheckError(cmd, reply);
+
+    virJSONValueFree(cmd);
+    virJSONValueFree(reply);
+    return ret;
+}
diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h
index 14bd59f..7193998 100644
--- a/src/qemu/qemu_monitor_json.h
+++ b/src/qemu/qemu_monitor_json.h
@@ -330,4 +330,5 @@ int qemuMonitorJSONNBDServerStart(qemuMonitorPtr mon,
 int qemuMonitorJSONNBDServerAdd(qemuMonitorPtr mon,
                                 const char *deviceID,
                                 bool writable);
+int qemuMonitorJSONNBDServerStop(qemuMonitorPtr mon);
 #endif /* QEMU_MONITOR_JSON_H */
-- 
1.7.8.6

This is a stub internal API just for now. Its purpose in life is to start NBD server and feed it with all domain disks. When adding a disk to NBD server, it is addressed via its alias (id= param on qemu command line). --- src/qemu/qemu_driver.c | 8 +++--- src/qemu/qemu_migration.c | 59 +++++++++++++++++++++++++++++++++++--------- src/qemu/qemu_migration.h | 6 +++- 3 files changed, 55 insertions(+), 18 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e099c5c..dfb6f9f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -9698,7 +9698,7 @@ qemuDomainMigratePrepareTunnel(virConnectPtr dconn, ret = qemuMigrationPrepareTunnel(driver, dconn, NULL, 0, NULL, NULL, /* No cookies in v2 */ - st, dname, dom_xml); + st, dname, dom_xml, flags); cleanup: qemuDriverUnlock(driver); @@ -9758,7 +9758,7 @@ qemuDomainMigratePrepare2(virConnectPtr dconn, ret = qemuMigrationPrepareDirect(driver, dconn, NULL, 0, NULL, NULL, /* No cookies */ uri_in, uri_out, - dname, dom_xml); + dname, dom_xml, flags); cleanup: qemuDriverUnlock(driver); @@ -9995,7 +9995,7 @@ qemuDomainMigratePrepare3(virConnectPtr dconn, cookiein, cookieinlen, cookieout, cookieoutlen, uri_in, uri_out, - dname, dom_xml); + dname, dom_xml, flags); cleanup: qemuDriverUnlock(driver); @@ -10040,7 +10040,7 @@ qemuDomainMigratePrepareTunnel3(virConnectPtr dconn, ret = qemuMigrationPrepareTunnel(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, - st, dname, dom_xml); + st, dname, dom_xml, flags); qemuDriverUnlock(driver); cleanup: diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 8c1e873..d785e75 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1078,6 +1078,29 @@ error: return NULL; } +/** + * qemuMigrationStartNBDServer: + * @driver: qemu driver + * @vm: domain + * @nbdPort: which port is NBD server listening to + * + * Starts NBD server. 
This is a newer method to copy + * storage during migration than using 'blk' and 'inc' + * arguments in 'migrate' monitor command. + * Error is reported here. + * + * Returns 0 on success, -1 otherwise. + */ +static int +qemuMigrationStartNBDServer(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr vm ATTRIBUTE_UNUSED, + int *nbdPort ATTRIBUTE_UNUSED) +{ + /* do nothing for now */ + return 0; +} + + /* Validate whether the domain is safe to migrate. If vm is NULL, * then this is being run in the v2 Prepare stage on the destination * (where we only have the target xml); if vm is provided, then this @@ -1584,7 +1607,8 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, const char *dname, const char *dom_xml, const char *migrateFrom, - virStreamPtr st) + virStreamPtr st, + unsigned long flags) { virDomainDefPtr def = NULL; virDomainObjPtr vm = NULL; @@ -1728,9 +1752,17 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, VIR_DEBUG("Received no lockstate"); } - /* dummy place holder for real work */ - nbdPort = 0; - cookie_flags |= QEMU_MIGRATION_COOKIE_NBD; + if ((flags & VIR_MIGRATE_NON_SHARED_INC || + flags & VIR_MIGRATE_NON_SHARED_DISK) && + mig->nbd && qemuCapsGet(priv->caps, QEMU_CAPS_NBD_SERVER)) { + /* both source and destination qemus support nbd-server-* + * commands and user requested disk copy. 
Use the new ones */ + if (qemuMigrationStartNBDServer(driver, vm, &nbdPort) < 0) { + /* error already reported */ + goto endjob; + } + cookie_flags |= QEMU_MIGRATION_COOKIE_NBD; + } if (qemuMigrationBakeCookie(mig, driver, vm, nbdPort, cookieout, cookieoutlen, cookie_flags) < 0) { @@ -1800,21 +1832,23 @@ qemuMigrationPrepareTunnel(virQEMUDriverPtr driver, int *cookieoutlen, virStreamPtr st, const char *dname, - const char *dom_xml) + const char *dom_xml, + unsigned long flags) { int ret; VIR_DEBUG("driver=%p, dconn=%p, cookiein=%s, cookieinlen=%d, " - "cookieout=%p, cookieoutlen=%p, st=%p, dname=%s, dom_xml=%s", + "cookieout=%p, cookieoutlen=%p, st=%p, dname=%s, dom_xml=%s " + "flags=%lx", driver, dconn, NULLSTR(cookiein), cookieinlen, - cookieout, cookieoutlen, st, NULLSTR(dname), dom_xml); + cookieout, cookieoutlen, st, NULLSTR(dname), dom_xml, flags); /* QEMU will be started with -incoming stdio (which qemu_command might * convert to exec:cat or fd:n) */ ret = qemuMigrationPrepareAny(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, dname, dom_xml, - "stdio", st); + "stdio", st, flags); return ret; } @@ -1829,7 +1863,8 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver, const char *uri_in, char **uri_out, const char *dname, - const char *dom_xml) + const char *dom_xml, + unsigned long flags) { static int port = 0; int this_port; @@ -1840,10 +1875,10 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver, VIR_DEBUG("driver=%p, dconn=%p, cookiein=%s, cookieinlen=%d, " "cookieout=%p, cookieoutlen=%p, uri_in=%s, uri_out=%p, " - "dname=%s, dom_xml=%s", + "dname=%s, dom_xml=%s flags=%lx", driver, dconn, NULLSTR(cookiein), cookieinlen, cookieout, cookieoutlen, NULLSTR(uri_in), uri_out, - NULLSTR(dname), dom_xml); + NULLSTR(dname), dom_xml, flags); /* The URI passed in may be NULL or a string "tcp://somehostname:port". 
* @@ -1925,7 +1960,7 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver, ret = qemuMigrationPrepareAny(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, dname, dom_xml, - migrateFrom, NULL); + migrateFrom, NULL, flags); cleanup: VIR_FREE(hostname); if (ret != 0) diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index 62e39a0..c961866 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -97,7 +97,8 @@ int qemuMigrationPrepareTunnel(virQEMUDriverPtr driver, int *cookieoutlen, virStreamPtr st, const char *dname, - const char *dom_xml); + const char *dom_xml, + unsigned long flags); int qemuMigrationPrepareDirect(virQEMUDriverPtr driver, virConnectPtr dconn, @@ -108,7 +109,8 @@ int qemuMigrationPrepareDirect(virQEMUDriverPtr driver, const char *uri_in, char **uri_out, const char *dname, - const char *dom_xml); + const char *dom_xml, + unsigned long flags); int qemuMigrationPerform(virQEMUDriverPtr driver, virConnectPtr conn, -- 1.7.8.6

On Mon, 2012-12-10 at 20:27 +0100, Michal Privoznik wrote:
This is a stub internal API just for now. Its purpose in life is to start NBD server and feed it with all domain disks. When adding a disk to NBD server, it is addressed via its alias (id= param on qemu command line). --- src/qemu/qemu_driver.c | 8 +++--- src/qemu/qemu_migration.c | 59 +++++++++++++++++++++++++++++++++++--------- src/qemu/qemu_migration.h | 6 +++- 3 files changed, 55 insertions(+), 18 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e099c5c..dfb6f9f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -9698,7 +9698,7 @@ qemuDomainMigratePrepareTunnel(virConnectPtr dconn,
ret = qemuMigrationPrepareTunnel(driver, dconn, NULL, 0, NULL, NULL, /* No cookies in v2 */ - st, dname, dom_xml); + st, dname, dom_xml, flags);
cleanup: qemuDriverUnlock(driver); @@ -9758,7 +9758,7 @@ qemuDomainMigratePrepare2(virConnectPtr dconn, ret = qemuMigrationPrepareDirect(driver, dconn, NULL, 0, NULL, NULL, /* No cookies */ uri_in, uri_out, - dname, dom_xml); + dname, dom_xml, flags);
cleanup: qemuDriverUnlock(driver); @@ -9995,7 +9995,7 @@ qemuDomainMigratePrepare3(virConnectPtr dconn, cookiein, cookieinlen, cookieout, cookieoutlen, uri_in, uri_out, - dname, dom_xml); + dname, dom_xml, flags);
cleanup: qemuDriverUnlock(driver); @@ -10040,7 +10040,7 @@ qemuDomainMigratePrepareTunnel3(virConnectPtr dconn, ret = qemuMigrationPrepareTunnel(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, - st, dname, dom_xml); + st, dname, dom_xml, flags);
The offline migration support patch (commit 347a712ab19b33867efc1e58bcd2f3219271cd3f) has already made these changes that add the flags argument.
qemuDriverUnlock(driver);
cleanup: diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 8c1e873..d785e75 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1078,6 +1078,29 @@ error: return NULL; }
+/** + * qemuMigrationStartNBDServer: + * @driver: qemu driver + * @vm: domain + * @nbdPort: which port is NBD server listening to + * + * Starts NBD server. This is a newer method to copy + * storage during migration than using 'blk' and 'inc' + * arguments in 'migrate' monitor command. + * Error is reported here. + * + * Returns 0 on success, -1 otherwise. + */ +static int +qemuMigrationStartNBDServer(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr vm ATTRIBUTE_UNUSED, + int *nbdPort ATTRIBUTE_UNUSED) +{ + /* do nothing for now */ + return 0; +} + + /* Validate whether the domain is safe to migrate. If vm is NULL, * then this is being run in the v2 Prepare stage on the destination * (where we only have the target xml); if vm is provided, then this @@ -1584,7 +1607,8 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, const char *dname, const char *dom_xml, const char *migrateFrom, - virStreamPtr st) + virStreamPtr st, + unsigned long flags) { virDomainDefPtr def = NULL; virDomainObjPtr vm = NULL; @@ -1728,9 +1752,17 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, VIR_DEBUG("Received no lockstate"); }
- /* dummy place holder for real work */ - nbdPort = 0; - cookie_flags |= QEMU_MIGRATION_COOKIE_NBD; + if ((flags & VIR_MIGRATE_NON_SHARED_INC || + flags & VIR_MIGRATE_NON_SHARED_DISK) && + mig->nbd && qemuCapsGet(priv->caps, QEMU_CAPS_NBD_SERVER)) { + /* both source and destination qemus support nbd-server-* + * commands and user requested disk copy. Use the new ones */ + if (qemuMigrationStartNBDServer(driver, vm, &nbdPort) < 0) { + /* error already reported */ + goto endjob; + } + cookie_flags |= QEMU_MIGRATION_COOKIE_NBD; + }
if (qemuMigrationBakeCookie(mig, driver, vm, nbdPort, cookieout, cookieoutlen, cookie_flags) < 0) { @@ -1800,21 +1832,23 @@ qemuMigrationPrepareTunnel(virQEMUDriverPtr driver, int *cookieoutlen, virStreamPtr st, const char *dname, - const char *dom_xml) + const char *dom_xml, + unsigned long flags) { int ret;
VIR_DEBUG("driver=%p, dconn=%p, cookiein=%s, cookieinlen=%d, " - "cookieout=%p, cookieoutlen=%p, st=%p, dname=%s, dom_xml=%s", + "cookieout=%p, cookieoutlen=%p, st=%p, dname=%s, dom_xml=%s " + "flags=%lx", driver, dconn, NULLSTR(cookiein), cookieinlen, - cookieout, cookieoutlen, st, NULLSTR(dname), dom_xml); + cookieout, cookieoutlen, st, NULLSTR(dname), dom_xml, flags);
/* QEMU will be started with -incoming stdio (which qemu_command might * convert to exec:cat or fd:n) */ ret = qemuMigrationPrepareAny(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, dname, dom_xml, - "stdio", st); + "stdio", st, flags); return ret; }
@@ -1829,7 +1863,8 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver, const char *uri_in, char **uri_out, const char *dname, - const char *dom_xml) + const char *dom_xml, + unsigned long flags) { static int port = 0; int this_port; @@ -1840,10 +1875,10 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver,
VIR_DEBUG("driver=%p, dconn=%p, cookiein=%s, cookieinlen=%d, " "cookieout=%p, cookieoutlen=%p, uri_in=%s, uri_out=%p, " - "dname=%s, dom_xml=%s", + "dname=%s, dom_xml=%s flags=%lx", driver, dconn, NULLSTR(cookiein), cookieinlen, cookieout, cookieoutlen, NULLSTR(uri_in), uri_out, - NULLSTR(dname), dom_xml); + NULLSTR(dname), dom_xml, flags);
/* The URI passed in may be NULL or a string "tcp://somehostname:port". * @@ -1925,7 +1960,7 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver,
ret = qemuMigrationPrepareAny(driver, dconn, cookiein, cookieinlen, cookieout, cookieoutlen, dname, dom_xml, - migrateFrom, NULL); + migrateFrom, NULL, flags); cleanup: VIR_FREE(hostname); if (ret != 0) diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index 62e39a0..c961866 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -97,7 +97,8 @@ int qemuMigrationPrepareTunnel(virQEMUDriverPtr driver, int *cookieoutlen, virStreamPtr st, const char *dname, - const char *dom_xml); + const char *dom_xml, + unsigned long flags);
int qemuMigrationPrepareDirect(virQEMUDriverPtr driver, virConnectPtr dconn, @@ -108,7 +109,8 @@ int qemuMigrationPrepareDirect(virQEMUDriverPtr driver, const char *uri_in, char **uri_out, const char *dname, - const char *dom_xml); + const char *dom_xml, + unsigned long flags);
int qemuMigrationPerform(virQEMUDriverPtr driver, virConnectPtr conn,
-- regards! li guang

There's a code snippet which allocates a port for incoming migration.
We will need this for allocating a port for incoming disks as well.
Hence, move this snippet into a separate function called
qemuMigrationNextPort().
---
 src/qemu/qemu_migration.c | 19 +++++++++++++------
 1 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index d785e75..9ad3ee7 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1078,6 +1078,17 @@ error:
     return NULL;
 }

+static int
+qemuMigrationNextPort(void) {
+    static int port = 0;
+    int ret = QEMUD_MIGRATION_FIRST_PORT + port++;
+
+    if (port == QEMUD_MIGRATION_NUM_PORTS)
+        port = 0;
+
+    return ret;
+}
+
 /**
  * qemuMigrationStartNBDServer:
  * @driver: qemu driver
@@ -1866,7 +1877,6 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver,
                            const char *dom_xml,
                            unsigned long flags)
 {
-    static int port = 0;
     int this_port;
     char *hostname = NULL;
     char migrateFrom [64];
@@ -1891,8 +1901,7 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver,
      * to be a correct hostname which refers to the target machine).
      */
     if (uri_in == NULL) {
-        this_port = QEMUD_MIGRATION_FIRST_PORT + port++;
-        if (port == QEMUD_MIGRATION_NUM_PORTS) port = 0;
+        this_port = qemuMigrationNextPort();

         /* Get hostname */
         if ((hostname = virGetHostname(NULL)) == NULL)
@@ -1931,9 +1940,7 @@ qemuMigrationPrepareDirect(virQEMUDriverPtr driver,
         p = strrchr(uri_in, ':');
         if (p == strchr(uri_in, ':')) {
             /* Generate a port */
-            this_port = QEMUD_MIGRATION_FIRST_PORT + port++;
-            if (port == QEMUD_MIGRATION_NUM_PORTS)
-                port = 0;
+            this_port = qemuMigrationNextPort();

             /* Caller frees */
             if (virAsprintf(uri_out, "%s:%d", uri_in, this_port) < 0) {
--
1.7.8.6
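The round-robin allocation that this patch factors out can be tried in isolation. The following is only a minimal sketch: FIRST_PORT and NUM_PORTS are hypothetical stand-ins for QEMUD_MIGRATION_FIRST_PORT and QEMUD_MIGRATION_NUM_PORTS (whose definitions are not part of this patch), and next_port() is an illustrative name rather than the libvirt function.

```c
/* Hypothetical values chosen for illustration only; the real constants
 * QEMUD_MIGRATION_FIRST_PORT and QEMUD_MIGRATION_NUM_PORTS are defined
 * elsewhere in libvirt and are not shown in the patch. */
#define FIRST_PORT 49152
#define NUM_PORTS  64

/* Hand out ports FIRST_PORT .. FIRST_PORT + NUM_PORTS - 1 in turn and
 * wrap around after the last one.  The counter is a plain static, so
 * this sketch assumes callers are serialized by some outer lock. */
static int
next_port(void)
{
    static int port = 0;
    int ret = FIRST_PORT + port++;

    if (port == NUM_PORTS)
        port = 0;

    return ret;
}
```

Note that an allocator of this shape only cycles through the range; it does not check whether a port handed out earlier is still in use.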

We need to start the NBD server and feed it with all non-<shared/> disks.
However, after qemuDomainObjEnterMonitorAsync the domain object is
unlocked so we cannot touch its disk definitions. Therefore, we must
prepare the list of disk IDs prior to entering the monitor.
---
 src/qemu/qemu_migration.c | 59 +++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 9ad3ee7..9177777 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -33,6 +33,7 @@
 #include "qemu_domain.h"
 #include "qemu_process.h"
 #include "qemu_capabilities.h"
+#include "qemu_command.h"
 #include "qemu_cgroup.h"

 #include "domain_audit.h"
@@ -1107,8 +1108,62 @@ qemuMigrationStartNBDServer(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
                             virDomainObjPtr vm ATTRIBUTE_UNUSED,
                             int *nbdPort ATTRIBUTE_UNUSED)
 {
-    /* do nothing for now */
-    return 0;
+    int ret = -1;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    int port = qemuMigrationNextPort();
+    const char *listen = "0.0.0.0";
+    char **disks = NULL;
+    size_t i, ndisks = 0;
+
+    for (i = 0; i < vm->def->ndisks; i++) {
+        virDomainDiskDefPtr disk = vm->def->disks[i];
+
+        /* skip shared disks */
+        if (disk->shared)
+            continue;
+
+        if (VIR_REALLOC_N(disks, ndisks + 1) < 0) {
+            virReportOOMError();
+            goto cleanup;
+        }
+
+        if (virAsprintf(&disks[ndisks++], "%s%s",
+                        QEMU_DRIVE_HOST_PREFIX, disk->info.alias) < 0) {
+            virReportOOMError();
+            goto cleanup;
+        }
+    }
+
+    if (!ndisks) {
+        /* Hooray! Nothing to care about */
+        ret = 0;
+        goto cleanup;
+    }
+
+    if (qemuDomainObjEnterMonitorAsync(driver, vm,
+                                       QEMU_ASYNC_JOB_MIGRATION_IN) < 0)
+        goto cleanup;
+
+    if (qemuMonitorNBDServerStart(priv->mon, listen, port) < 0)
+        goto endjob;
+
+    for (i = 0; i < ndisks; i++) {
+        if (qemuMonitorNBDServerAdd(priv->mon, disks[i], true) < 0) {
+            VIR_WARN("Unable to add '%s' to NDB server", disks[i]);
+            goto endjob;
+        }
+    }
+
+    *nbdPort = port;
+    ret = 0;
+
+endjob:
+    qemuDomainObjExitMonitorWithDriver(driver, vm);
+cleanup:
+    for (i = 0; i < ndisks; i++)
+        VIR_FREE(disks[i]);
+    VIR_FREE(disks);
+    return ret;
 }


--
1.7.8.6
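The reason for collecting the disk IDs up front is that they must be copied out while the domain object is still locked. The idea can be sketched without any libvirt types as below. Everything here is an assumption made for illustration: struct disk stands in for virDomainDiskDef, DRIVE_PREFIX for QEMU_DRIVE_HOST_PREFIX (assumed to be "drive-"), and collect_disk_ids()/free_ids() are hypothetical helpers using plain malloc/realloc instead of VIR_REALLOC_N and virAsprintf.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed value of QEMU_DRIVE_HOST_PREFIX; not shown in the patch. */
#define DRIVE_PREFIX "drive-"

/* Hypothetical stand-in for the two virDomainDiskDef fields used here. */
struct disk {
    const char *alias;
    bool shared;
};

/* Free a list produced by collect_disk_ids(). */
static void
free_ids(char **ids, size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(ids[i]);
    free(ids);
}

/* Copy "<prefix><alias>" for every non-shared disk into a freshly
 * allocated array, so the caller can keep using the strings after the
 * disk definitions are no longer safe to touch.
 * Returns 0 on success, -1 on allocation failure. */
static int
collect_disk_ids(const struct disk *disks, size_t ndisks,
                 char ***ids_out, size_t *nids_out)
{
    char **ids = NULL;
    size_t nids = 0;

    for (size_t i = 0; i < ndisks; i++) {
        char **tmp;
        size_t len;

        if (disks[i].shared)
            continue;               /* shared disks are not copied */

        tmp = realloc(ids, (nids + 1) * sizeof(*ids));
        if (!tmp)
            goto error;
        ids = tmp;

        len = strlen(DRIVE_PREFIX) + strlen(disks[i].alias) + 1;
        if (!(ids[nids] = malloc(len)))
            goto error;
        snprintf(ids[nids], len, "%s%s", DRIVE_PREFIX, disks[i].alias);
        nids++;                     /* count only fully built entries */
    }

    *ids_out = ids;
    *nids_out = nids;
    return 0;

error:
    free_ids(ids, nids);
    return -1;
}
```

Unlike the patch, this sketch bumps the counter only after an entry has been built, which keeps the cleanup path from touching a slot that was never filled.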

This function does the source part of NBD magic. It invokes drive-mirror on each non-shared disk and waits until the mirroring process completes. When it does, we can proceed with migration. Currently, active waiting is done: every 50ms libvirt asks qemu whether the block job has finished. However, once the job finishes, qemu no longer reports its progress, so we can only assume whether the job finished successfully. A better solution would be to listen for the event which is sent as soon as the job finishes. The event does contain the result of the job. --- src/qemu/qemu_migration.c | 190 +++++++++++++++++++++++++++++++++++++++++++-- 1 files changed, 184 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 9177777..6ee8f0e 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1166,6 +1166,177 @@ cleanup: return ret; } +/** + * qemuMigrationDiskMirror: + * @driver: qemu driver + * @vm: domain + * @mig: migration cookie + * @migrate_flags: migrate monitor command flags + * + * Run drive-mirror to feed NBD server running on dst and + * wait till the process completes. On success, update + * @migrate_flags so we don't tell 'migrate' command to + * do the very same operation. + * + * Returns 0 on success (@migrate_flags updated), + * -1 otherwise. + */ +static int +qemuMigrationDriveMirror(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuMigrationCookiePtr mig, + const char *host, + unsigned long speed, + unsigned int *migrate_flags) +{ + int ret = -1; + int mon_ret; + qemuDomainObjPrivatePtr priv = vm->privateData; + size_t ndisks = 0, i; + char **disks = NULL; + unsigned int mirror_flags = VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + char *nbd_dest = NULL; + + if (*migrate_flags & QEMU_MONITOR_MIGRATE_NON_SHARED_DISK) { + /* dummy */ + } else if (*migrate_flags & QEMU_MONITOR_MIGRATE_NON_SHARED_INC) { + mirror_flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + } else { + /* Nothing to be done here.
Claim success */ + return 0; + } + + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + + /* skip shared disks */ + if (disk->shared) + continue; + + if (VIR_REALLOC_N(disks, ndisks + 1) < 0) { + virReportOOMError(); + goto cleanup; + } + + if (virAsprintf(&disks[ndisks++], "%s%s", + QEMU_DRIVE_HOST_PREFIX, disk->info.alias) < 0) { + virReportOOMError(); + goto cleanup; + } + } + + if (!ndisks) { + /* Hooray! Nothing to care about */ + ret = 0; + goto cleanup; + } + + if (!mig->nbd) { + /* Destination doesn't support NBD server. + * Fall back to previous implementation. + * XXX Or should we report an error here? */ + VIR_DEBUG("Destination doesn't support NBD server " + "Falling back to previous implementation."); + ret = 0; + goto cleanup; + } + + for (i = 0; i < ndisks; i++) { + virDomainBlockJobInfo info; + VIR_FREE(nbd_dest); + if (virAsprintf(&nbd_dest, "nbd:%s:%u:exportname=%s", + host, mig->nbd->port, disks[i]) < 0) { + virReportOOMError(); + goto error; + } + + if (qemuDomainObjEnterMonitorAsync(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) + goto error; + mon_ret = qemuMonitorDriveMirror(priv->mon, disks[i], nbd_dest, + NULL, speed, mirror_flags); + qemuDomainObjExitMonitorWithDriver(driver, vm); + + if (mon_ret < 0) + goto error; + + /* wait for completion */ + while (true) { + /* Poll every 50ms for progress & to allow cancellation */ + struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; + if (qemuDomainObjEnterMonitorAsync(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) < 0) + goto error; + if (priv->job.asyncAbort) { + /* explicitly do this *after* we entered the monitor, + * as this is a critical section so we are guaranteed + * priv->job.asyncAbort will not change */ + qemuDomainObjExitMonitorWithDriver(driver, vm); + virReportError(VIR_ERR_OPERATION_ABORTED, _("%s: %s"), + qemuDomainAsyncJobTypeToString(priv->job.asyncJob), + _("canceled by client")); + goto cleanup; + } + mon_ret = 
qemuMonitorBlockJob(priv->mon, disks[i], NULL, 0, + &info, BLOCK_JOB_INFO, true); + qemuDomainObjExitMonitorWithDriver(driver, vm); + + if (mon_ret < 0) { + /* qemu doesn't report finished jobs */ + VIR_WARN("Unable to query drive-mirror job status. " + "Stop polling on '%s' cur:%llu end:%llu", + disks[i], info.cur, info.end); + break; + } + + if (info.cur == info.end) { + VIR_DEBUG("Drive mirroring of '%s' completed", disks[i]); + break; + } + + /* XXX Frankly speaking, we should listen to the events, + * instead of doing this. But this works for now and we + * are doing something similar in migration itself anyway */ + + virDomainObjUnlock(vm); + qemuDriverUnlock(driver); + + nanosleep(&ts, NULL); + + qemuDriverLock(driver); + virDomainObjLock(vm); + + } + } + + /* okay, copied. modify migrate_flags */ + *migrate_flags &= ~(QEMU_MONITOR_MIGRATE_NON_SHARED_DISK | + QEMU_MONITOR_MIGRATE_NON_SHARED_INC); + ret = 0; + +cleanup: + for (i = 0; i < ndisks; i++) + VIR_FREE(disks[i]); + VIR_FREE(disks); + VIR_FREE(nbd_dest); + return ret; + +error: + /* cancel any outstanding jobs */ + if (qemuDomainObjEnterMonitorAsync(driver, vm, + QEMU_ASYNC_JOB_MIGRATION_OUT) == 0) { + while (i) { + if (qemuMonitorBlockJob(priv->mon, disks[i], NULL, 0, + NULL, BLOCK_JOB_ABORT, true) < 0) + VIR_WARN("Unable to cancel block-job on '%s'", disks[i]); + i--; + } + qemuDomainObjExitMonitorWithDriver(driver, vm); + } else { + VIR_WARN("Unable to enter monitor. No block job cancelled"); + } + goto cleanup; +} /* Validate whether the domain is safe to migrate. 
If vm is NULL, * then this is being run in the v2 Prepare stage on the destination @@ -2329,6 +2500,12 @@ qemuMigrationRun(virQEMUDriverPtr driver, cookieout, cookieoutlen, flags, resource, spec, spec->destType, spec->fwdType); + if (flags & VIR_MIGRATE_NON_SHARED_DISK) + migrate_flags |= QEMU_MONITOR_MIGRATE_NON_SHARED_DISK; + + if (flags & VIR_MIGRATE_NON_SHARED_INC) + migrate_flags |= QEMU_MONITOR_MIGRATE_NON_SHARED_INC; + if (virLockManagerPluginUsesState(driver->lockManager) && !cookieout) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -2346,6 +2523,13 @@ qemuMigrationRun(virQEMUDriverPtr driver, if (qemuDomainMigrateGraphicsRelocate(driver, vm, mig) < 0) VIR_WARN("unable to provide data for graphics client relocation"); + /* this will update migrate_flags on success */ + if (qemuMigrationDriveMirror(driver, vm, mig, spec->dest.host.name, + migrate_speed, &migrate_flags) < 0) { + /* error reported by helper func */ + goto cleanup; + } + /* Before EnterMonitor, since qemuMigrationSetOffline already does that */ if (!(flags & VIR_MIGRATE_LIVE) && virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { @@ -2373,12 +2557,6 @@ qemuMigrationRun(virQEMUDriverPtr driver, goto cleanup; } - if (flags & VIR_MIGRATE_NON_SHARED_DISK) - migrate_flags |= QEMU_MONITOR_MIGRATE_NON_SHARED_DISK; - - if (flags & VIR_MIGRATE_NON_SHARED_INC) - migrate_flags |= QEMU_MONITOR_MIGRATE_NON_SHARED_INC; - /* connect to the destination qemu if needed */ if (spec->destType == MIGRATION_DEST_CONNECT_HOST && qemuMigrationConnect(driver, vm, spec) < 0) { -- 1.7.8.6
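For illustration, the destination string that the drive-mirror patch above hands to qemu ("nbd:host:port:exportname=disk") can be assembled as in the stand-alone sketch below. build_nbd_dest() is a hypothetical helper written with snprintf() into a caller-supplied buffer; the patch itself uses virAsprintf(), and the host name and port in any caller are made-up example values.

```c
#include <stddef.h>
#include <stdio.h>

/* Format "nbd:HOST:PORT:exportname=DISK" into buf.
 * Returns 0 on success, -1 on a NULL argument, a formatting error,
 * or when buf is too small to hold the whole string. */
static int
build_nbd_dest(char *buf, size_t buflen,
               const char *host, unsigned int port, const char *disk)
{
    int n;

    if (!buf || !host || !disk)
        return -1;

    n = snprintf(buf, buflen, "nbd:%s:%u:exportname=%s", host, port, disk);

    /* snprintf returns the length it wanted to write; anything at or
     * beyond buflen means the output was truncated. */
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    return 0;
}
```

Treating truncation as an error matters here, because a cut-off export name would silently point the mirror job at the wrong (or a nonexistent) export.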

With the new NBD storage migration approach there are several requirements that need to be met for successful use of the feature. One of them is that the file representing a disk needs to have at least the same size as on the source. Hence, we must transfer a list of pairs [disk source, size] and check on the destination that this requirement is met and/or take actions to meet it. --- src/qemu/qemu_migration.c | 164 ++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 161 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 6ee8f0e..0100a6f 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -27,6 +27,9 @@ #include <gnutls/x509.h> #include <fcntl.h> #include <poll.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/stat.h> #include "qemu_migration.h" #include "qemu_monitor.h" @@ -135,6 +138,18 @@ struct _qemuMigrationCookieNBD { Negative one is meant to be sent when translating from perform to finish phase to let destination know it's safe to stop NBD server.*/ + + /* The list of pairs [disk-size] (in Bytes). This is needed + * because the same disk size is one of prerequisites for NBD + * storage migration. Unfortunately, we can't rely on + * anything but disk order, since 'src' can be overwritten by + * migration hook script, device aliases are not assigned on + * dst yet (as the source files need to be created before + * qemuProcessStart).
*/ + size_t ndisks; + struct { + size_t bytes; + } *disk; }; typedef struct _qemuMigrationCookie qemuMigrationCookie; @@ -197,6 +212,17 @@ qemuMigrationCookieNetworkFree(qemuMigrationCookieNetworkPtr network) } +static void +qemuMigrationCookieNBDFree(qemuMigrationCookieNBDPtr nbd) +{ + if (!nbd) + return; + + VIR_FREE(nbd->disk); + VIR_FREE(nbd); +} + + static void qemuMigrationCookieFree(qemuMigrationCookiePtr mig) { if (!mig) @@ -208,12 +234,13 @@ static void qemuMigrationCookieFree(qemuMigrationCookiePtr mig) if (mig->flags & QEMU_MIGRATION_COOKIE_NETWORK) qemuMigrationCookieNetworkFree(mig->network); + qemuMigrationCookieNBDFree(mig->nbd); + VIR_FREE(mig->localHostname); VIR_FREE(mig->remoteHostname); VIR_FREE(mig->name); VIR_FREE(mig->lockState); VIR_FREE(mig->lockDriver); - VIR_FREE(mig->nbd); VIR_FREE(mig); } @@ -516,8 +543,12 @@ qemuMigrationCookieAddNetwork(qemuMigrationCookiePtr mig, static int qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig, + virDomainObjPtr vm, int nbdPort) { + qemuDomainObjPrivatePtr priv = vm->privateData; + size_t i; + /* It is not a bug if there already is a NBD data */ if (!mig->nbd && VIR_ALLOC(mig->nbd) < 0) { @@ -525,6 +556,33 @@ qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig, return -1; } + /* in Begin phase add info about disks */ + if (priv->job.phase == QEMU_MIGRATION_PHASE_BEGIN3 && + vm->def->ndisks) { + if (VIR_ALLOC_N(mig->nbd->disk, vm->def->ndisks) < 0) { + virReportOOMError(); + return -1; + } + + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + struct stat sb; + + /* Add only non-shared disks with source */ + if (!disk->src || disk->shared) + continue; + + if (stat(disk->src, &sb) < 0) { + virReportSystemError(errno, + _("Unable to stat '%s'"), + disk->src); + return -1; + } + + mig->nbd->disk[mig->nbd->ndisks++].bytes = sb.st_size; + } + } + mig->nbd->port = nbdPort; mig->flags |= QEMU_MIGRATION_COOKIE_NBD; @@ -638,7 +696,15 @@ 
qemuMigrationCookieXMLFormat(virQEMUDriverPtr driver, virBufferAddLit(buf, " <nbd"); if (mig->nbd->port) virBufferAsprintf(buf, " port='%d'", mig->nbd->port); - virBufferAddLit(buf, "/>\n"); + if (mig->nbd->ndisks) { + virBufferAddLit(buf, ">\n"); + for (i = 0; i < mig->nbd->ndisks; i++) + virBufferAsprintf(buf, " <disk size='%zu'/>\n", + mig->nbd->disk[i].bytes); + virBufferAddLit(buf, " </nbd>\n"); + } else { + virBufferAddLit(buf, "/>\n"); + } } virBufferAddLit(buf, "</qemu-migration>\n"); @@ -943,6 +1009,32 @@ qemuMigrationCookieXMLParse(qemuMigrationCookiePtr mig, port); goto error; } + + if ((n = virXPathNodeSet("./nbd/disk", ctxt, &nodes)) > 0) { + xmlNodePtr oldNode = ctxt->node; + if (VIR_ALLOC_N(mig->nbd->disk, n) < 0) { + virReportOOMError(); + goto error; + } + mig->nbd->ndisks = n; + + for (i = 0; i < n; i++) { + ctxt->node = nodes[i]; + + tmp = virXPathString("string(./@size)", ctxt); + if (virStrToLong_ull(tmp, NULL, 10, (unsigned long long *) + &mig->nbd->disk[i].bytes) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Malformed size attribute '%s'"), + tmp); + VIR_FREE(tmp); + goto error; + } + VIR_FREE(tmp); + } + VIR_FREE(nodes); + ctxt->node = oldNode; + } } return 0; @@ -1011,7 +1103,7 @@ qemuMigrationBakeCookie(qemuMigrationCookiePtr mig, } if (flags & QEMU_MIGRATION_COOKIE_NBD && - qemuMigrationCookieAddNBD(mig, nbdPort) < 0) + qemuMigrationCookieAddNBD(mig, dom, nbdPort) < 0) return -1; if (!(*cookieout = qemuMigrationCookieXMLFormatStr(driver, mig))) @@ -1338,6 +1430,68 @@ error: goto cleanup; } +static int +qemuMigrationPreCreateStorage(virDomainObjPtr vm, + qemuMigrationCookiePtr mig) +{ + int ret = -1; + size_t i, mig_i = 0; + struct stat sb; + int fd = -1; + + if (!mig->nbd || !mig->nbd->ndisks) { + /* nothing to do here */ + return 0; + } + + for (i = 0; i < vm->def->ndisks; i++) { + virDomainDiskDefPtr disk = vm->def->disks[i]; + size_t bytes = mig->nbd->disk[mig_i].bytes; + + /* skip shared and source-free disks */ + if 
(!disk->src || disk->shared) + continue; + + mig_i++; + if (mig_i > mig->nbd->ndisks) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("disk count doesn't match")); + goto cleanup; + } + + VIR_DEBUG("Checking '%s' for its size (requested %zuB)", disk->src, bytes); + + if ((fd = virFileOpenAs(disk->src, O_RDWR | O_CREAT, 0660, + -1, -1, VIR_FILE_OPEN_NOFORK)) < 0) { + virReportSystemError(errno, _("Unable to create '%s'"), disk->src); + goto cleanup; + } + + if (fstat(fd, &sb) < 0) { + virReportSystemError(errno, _("Unable to stat '%s'"), disk->src); + goto cleanup; + } + + VIR_DEBUG("File '%s' is %zuB big", disk->src, sb.st_size); + if (sb.st_size < bytes && + ftruncate(fd, bytes) < 0) { + virReportSystemError(errno, _("Unable to ftruncate '%s'"), disk->src); + goto cleanup; + } + + VIR_FORCE_CLOSE(fd); + } + + ret = 0; +cleanup: + VIR_FORCE_CLOSE(fd); + /* free from migration data to prevent + * infinite sending from src to dst and back */ + VIR_FREE(mig->nbd->disk); + mig->nbd->ndisks = 0; + return ret; +} + /* Validate whether the domain is safe to migrate. If vm is NULL, * then this is being run in the v2 Prepare stage on the destination * (where we only have the target xml); if vm is provided, then this @@ -1941,6 +2095,10 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver, QEMU_MIGRATION_COOKIE_NBD))) goto cleanup; + /* pre-create all storage */ + if (qemuMigrationPreCreateStorage(vm, mig) < 0) + goto cleanup; + if (qemuMigrationJobStart(driver, vm, QEMU_ASYNC_JOB_MIGRATION_IN) < 0) goto cleanup; qemuMigrationJobSetPhase(driver, vm, QEMU_MIGRATION_PHASE_PREPARE); -- 1.7.8.6
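The core of the pre-creation step above is "create the file if it is missing, and grow it if it is smaller than what the source reported". A minimal POSIX sketch of just that step follows. ensure_min_size() is a hypothetical helper; it uses plain open() where the patch uses virFileOpenAs(), and it ignores ownership, labelling, and image formats entirely.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Make sure `path` exists and is at least `bytes` long.
 * An existing file that is already large enough is left untouched,
 * and a file is never shrunk.
 * Returns 0 on success, -1 on error with errno set. */
static int
ensure_min_size(const char *path, off_t bytes)
{
    struct stat sb;
    int ret = -1;
    int saved_errno;
    int fd = open(path, O_RDWR | O_CREAT, 0660);

    if (fd < 0)
        return -1;

    if (fstat(fd, &sb) < 0)
        goto cleanup;

    /* Extend only; ftruncate() on a larger file would discard data. */
    if (sb.st_size < bytes && ftruncate(fd, bytes) < 0)
        goto cleanup;

    ret = 0;

cleanup:
    saved_errno = errno;    /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return ret;
}
```

Extending with ftruncate() typically yields a sparse file, so this reserves the apparent size without necessarily allocating the blocks.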

At the end of migration, it is important to stop NBD server and
thus release all allocated resources.
---
 src/qemu/qemu_migration.c | 28 ++++++++++++++++++++++++++++
 1 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 0100a6f..f53fcf6 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1492,6 +1492,32 @@ cleanup:
     return ret;
 }

+static void
+qemuMigrationStopNBDServer(virQEMUDriverPtr driver,
+                           virDomainObjPtr vm,
+                           qemuMigrationCookiePtr mig)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+
+    if (!mig->nbd)
+        return;
+
+    if (mig->nbd->port != -1)
+        VIR_WARN("This is strange. NBD port was not -1 "
+                 "when shutting NDB server down");
+
+    if (qemuDomainObjEnterMonitorAsync(driver, vm,
+                                       QEMU_ASYNC_JOB_MIGRATION_IN) < 0) {
+        VIR_WARN("Unable to enter monitor");
+        return;
+    }
+
+    if (qemuMonitorNBDServerStop(priv->mon) < 0)
+        VIR_WARN("Unable to stop NBD server");
+
+    qemuDomainObjExitMonitorWithDriver(driver, vm);
+}
+
 /* Validate whether the domain is safe to migrate. If vm is NULL,
  * then this is being run in the v2 Prepare stage on the destination
  * (where we only have the target xml); if vm is provided, then this
@@ -3792,6 +3818,8 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
         if (qemuDomainMigrateOPDRelocate(driver, vm, mig) < 0)
             VIR_WARN("unable to provide network data for relocation");

+        qemuMigrationStopNBDServer(driver, vm, mig);
+
         if (flags & VIR_MIGRATE_PERSIST_DEST) {
             virDomainDefPtr vmdef;
             if (vm->persistent)
--
1.7.8.6

On 10.12.2012 20:27, Michal Privoznik wrote:
This patch set re-implements migration with storage for new enough qemu. Currently, you can migrate a domain to a host without the need for shared storage. This is done by setting the 'blk' or 'inc' attribute (representing the VIR_MIGRATE_NON_SHARED_DISK and VIR_MIGRATE_NON_SHARED_INC flags respectively) of the 'migrate' monitor command. However, the qemu implementation is buggy and applications are advised to switch to the new implementation which, moreover, offers some nice features, like migrating only explicitly specified disks.
The new functionality is controlled via 'nbd-server-*' and 'drive-mirror' commands. The flow is meant to look like this:
1) User invokes libvirt's migrate functionality.
2) libvirt checks that no block jobs are active on the source.
3) libvirt starts the destination QEMU and sets up the NBD server using the nbd-server-start and nbd-server-add commands.
4) libvirt starts drive-mirror with a destination pointing to the remote NBD server, for example nbd:host:port:exportname=diskname (where diskname is the -drive id specified on the destination).
5) Once all mirroring jobs reach steady state, libvirt invokes the migrate command.
6) Once migration has completed, libvirt invokes the nbd-server-stop command on the destination QEMU.
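Step 4 gives the drive-mirror target in the form nbd:host:port:exportname=diskname. A minimal sketch of composing that string follows. The helper name build_nbd_target, the port range check, and the error convention are assumptions made for illustration; none of them is taken from libvirt.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a libvirt function: formats the mirror target
 * from step 4 as nbd:host:port:exportname=diskname.
 * Returns 0 on success, -1 on invalid input or if buf is too small. */
static int
build_nbd_target(char *buf, size_t buflen,
                 const char *host, int port, const char *diskname)
{
    int n;

    /* NULL pointers are programmer errors in this sketch. */
    assert(buf != NULL && host != NULL && diskname != NULL);

    if (strlen(host) == 0 || strlen(diskname) == 0)
        return -1;
    if (port <= 0 || port > 65535)
        return -1;

    n = snprintf(buf, buflen, "nbd:%s:%d:exportname=%s", host, port, diskname);
    if (n < 0 || (size_t)n >= buflen)
        return -1; /* encoding error or truncated output */

    return 0;
}
```

Here diskname stands for the -drive id on the destination, as described in step 4. The host and port would come from the migration setup and from the cookie sent back by the destination.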
If we just skip the 2nd step and there is an active block job, qemu will fail in step 4. No big deal.
Since we try NOT to break migration and to keep things compatible, this feature is enabled iff both sides support it. Since there's an obvious need for some data transfer between src and dst, I've put it into the migration cookie (qemuMigrationCookie):
1) src -> dst: (QEMU_MIGRATION_PHASE_BEGIN3 -> QEMU_MIGRATION_PHASE_PREPARE)
   <nbd>
     <disk size='17179869184'/>
   </nbd>
Hey destination, I know how to use this cool new feature. Moreover, these are the disks I'll send you. Each one of them is X bytes big. This is one of the prerequisites: the file (disk->src) on dst must exist and be at least as large as on src.
2) dst -> src: (QEMU_MIGRATION_PHASE_PREPARE -> QEMU_MIGRATION_PHASE_PERFORM3)
   <nbd port='X'/>
Okay, I (destination) support this feature as well. I've created all files as you (src) told me to and you can start rolling data. I am listening on port X.
3) src -> dst: (QEMU_MIGRATION_PHASE_PERFORM3 -> QEMU_MIGRATION_PHASE_FINISH3)
   <nbd port='-1'/>
Migration completed, destination, you may shut the NBD server down.
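The three cookie messages above differ only in what the <nbd> element carries: disk sizes, a listening port, or port -1. The following is a rough sketch of how a receiver could tell them apart. The struct layout, the enum names, and the convention that port 0 means "no port given" are all assumptions for illustration and do not reflect the actual qemuMigrationCookie structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of the <nbd> cookie element. */
typedef struct {
    int port;      /* assumed: 0 = absent, > 0 = listening port, -1 = shut down */
    size_t ndisks; /* number of <disk size='...'/> children */
} nbd_cookie;

typedef enum {
    NBD_COOKIE_OFFER,     /* 1) src -> dst: disk sizes, no port yet */
    NBD_COOKIE_LISTENING, /* 2) dst -> src: NBD server listens on port */
    NBD_COOKIE_SHUTDOWN,  /* 3) src -> dst: port is -1, stop the server */
    NBD_COOKIE_INVALID    /* anything else */
} nbd_cookie_kind;

/* Classify a cookie according to the three messages described above. */
static nbd_cookie_kind
nbd_cookie_classify(const nbd_cookie *c)
{
    assert(c != NULL);

    if (c->port == -1)
        return NBD_COOKIE_SHUTDOWN;
    if (c->port > 0 && c->port <= 65535)
        return NBD_COOKIE_LISTENING;
    if (c->port == 0 && c->ndisks > 0)
        return NBD_COOKIE_OFFER;
    return NBD_COOKIE_INVALID;
}
```

The shutdown case corresponds to the check in the posted patch, which warns when the port is not -1 at the time the NBD server is stopped.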
If either src or dst doesn't support NBD, it is not used and the whole process falls back to the old implementation.
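The "enabled iff both sides support it" rule can be written as a single predicate. In this sketch the enum and function names are hypothetical. It also assumes that a tunnelled migration (for which the v1 diff notes say NBD is not used) takes the same fallback path as a missing capability.

```c
#include <stdbool.h>

/* Hypothetical names, chosen for illustration only. */
typedef enum {
    STORAGE_MIG_LEGACY, /* old 'blk' / 'inc' path of the migrate command */
    STORAGE_MIG_NBD     /* new nbd-server-* plus drive-mirror path */
} storage_mig_mode;

/* Use NBD only if both sides advertise support and the migration is
 * not tunnelled; otherwise fall back to the old implementation. */
static storage_mig_mode
choose_storage_mig_mode(bool src_has_nbd, bool dst_has_nbd, bool tunnelled)
{
    if (src_has_nbd && dst_has_nbd && !tunnelled)
        return STORAGE_MIG_NBD;
    return STORAGE_MIG_LEGACY;
}
```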
diff to v1:
- Eric's and Daniel's suggestions worked in. To point out the bigger ones: don't do NBD style when TUNNELLED is requested, added 'b:writable' to 'nbd-server-add'.
- Dropped the '/qemu-migration/nbd/disk/@src' attribute from the migration cookie. As pointed out by Jirka, disk->src can be changed during migration (e.g. by a migration hook or by passed xml). So I've tried (as suggested on the list) passing the disk alias. However, since qemu hasn't been started on the destination yet, the aliases haven't been generated yet. So we have to rely on ordering completely.
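Relying on ordering means that the i-th size in the cookie is compared against the i-th disk on the destination. A minimal sketch of that positional check, combined with the size prerequisite (the destination file must be at least as large as on the source), could look like the following. The function name, the out-parameter, and treating a differing disk count as a failure are assumptions, not behaviour confirmed by the patches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: disks are matched purely by position.
 * Returns true if every destination disk is at least as large as the
 * source disk at the same index. On failure, *bad_index (if not NULL)
 * is set to the first offending position. */
static bool
disk_sizes_ok(const unsigned long long *src_sizes, size_t nsrc,
              const unsigned long long *dst_sizes, size_t ndst,
              size_t *bad_index)
{
    size_t i;

    /* Assumption: a differing number of disks cannot be matched by order. */
    if (nsrc != ndst) {
        if (bad_index)
            *bad_index = nsrc < ndst ? nsrc : ndst;
        return false;
    }

    for (i = 0; i < nsrc; i++) {
        if (dst_sizes[i] < src_sizes[i]) {
            if (bad_index)
                *bad_index = i;
            return false;
        }
    }

    return true;
}
```

Because no name or alias travels in the cookie, a check like this cannot detect two disks that were swapped but happen to satisfy the size constraint. That is the price of relying on ordering.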
Patches 1, 3 and 5 have been ACKed already.
Michal Privoznik (11):
  qemu: Introduce NBD_SERVER capability
  Introduce NBD migration cookie
  qemu: Introduce nbd-server-start command
  qemu: Introduce nbd-server-add command
  qemu: Introduce nbd-server-stop command
  qemu_migration: Introduce qemuMigrationStartNBDServer
  qemu_migration: Move port allocation to a separate func
  qemu_migration: Implement qemuMigrationStartNBDServer()
  qemu_migration: Implement qemuMigrationDriveMirror
  qemu_migration: Check size prerequisites
  qemu_migration: Stop NBD server at Finish phase

 src/qemu/qemu_capabilities.c |    3 +
 src/qemu/qemu_capabilities.h |    1 +
 src/qemu/qemu_driver.c       |    8 +-
 src/qemu/qemu_migration.c    |  620 +++++++++++++++++++++++++++++++++++++++---
 src/qemu/qemu_migration.h    |    6 +-
 src/qemu/qemu_monitor.c      |   63 +++++
 src/qemu/qemu_monitor.h      |    7 +
 src/qemu/qemu_monitor_json.c |   95 +++++++
 src/qemu/qemu_monitor_json.h |    7 +
 9 files changed, 772 insertions(+), 38 deletions(-)
Now that we are post-release, it would be nice if somebody had a look at this. Thanks.

Michal
participants (3)
- Eric Blake
- li guang
- Michal Privoznik