[libvirt] [RFC: PATCH 00/13] live block migration via virDomainBlockCopy

This is in response to my discussion with Paolo here:
https://www.redhat.com/archives/libvir-list/2012-March/msg01171.html

It also picks up an old patch from Adam here (now split into two):
https://www.redhat.com/archives/libvir-list/2012-January/msg00562.html

Patches 1-12 are in pretty good shape, while patch 13 is still a work
in progress. I can test a lot of the pieces, but still have several
patches to go before I can fully test the new API. I'm posting this
now to at least start some review on the earlier pieces, and to get
consensus that the API and XML changes are correct.

None of these patches should be applied until after 0.9.11 is
released, and even then, we still want confirmation of whether
upstream qemu 1.1 will include complete support for block mirroring.

For anyone trying to backport this feature to the 0.9.10 .so API, I've
made sure that you can just omit patches 5 and 6, and that everything
else will backport cleanly to give you the new feature through just
the public virDomainBlockRebase() call.

Adam Litke (2):
  blockjob: add API for async virDomainBlockJobAbort
  blockjob: wire up qemu async virDomainBlockJobAbort

Eric Blake (11):
  blockjob: add new API flags
  blockjob: add 'blockcopy' to virsh
  blockjob: add virDomainBlockCopy
  blockjob: enhance virsh 'blockcopy'
  blockjob: wire up RPC for block copy
  blockjob: enhance xml to track mirrors across libvirtd restart
  blockjob: react to active block copy
  blockjob: expose qemu commands for mirrored storage migration
  blockjob: query backing file of a disk
  blockjob: return appropriate event and info
  WIP: blockjob: implement block copy for qemu

 docs/apibuild.py              |    1 +
 docs/formatdomain.html.in     |   11 ++
 docs/schemas/domaincommon.rng |   19 +++-
 include/libvirt/libvirt.h.in  |   50 +++++++++-
 include/libvirt/virterror.h   |    1 +
 src/conf/domain_conf.c        |   77 +++++++++++++++
 src/conf/domain_conf.h        |   14 +++
 src/driver.h                  |    6 +
 src/libvirt.c                 |  216 +++++++++++++++++++++++++++++++++++++++--
 src/libvirt_private.syms      |    1 +
 src/libvirt_public.syms       |    5 +
 src/qemu/qemu_capabilities.c  |    3 +
 src/qemu/qemu_capabilities.h  |    2 +
 src/qemu/qemu_conf.h          |    1 +
 src/qemu/qemu_driver.c        |  211 ++++++++++++++++++++++++++++++++++++++--
 src/qemu/qemu_hotplug.c       |    7 ++
 src/qemu/qemu_monitor.c       |   60 +++++++++++-
 src/qemu/qemu_monitor.h       |   23 +++++
 src/qemu/qemu_monitor_json.c  |  128 ++++++++++++++++++++++--
 src/qemu/qemu_monitor_json.h  |   21 ++++-
 src/qemu/qemu_process.c       |   19 ++++
 src/remote/remote_driver.c    |    1 +
 src/remote/remote_protocol.x  |   13 +++-
 src/remote_protocol-structs   |   10 ++
 src/rpc/gendispatch.pl        |    1 +
 src/util/virterror.c          |    6 +
 tools/virsh.c                 |  135 ++++++++++++++++++++-----
 tools/virsh.pod               |   33 ++++++-
 28 files changed, 1007 insertions(+), 68 deletions(-)

--
1.7.7.6

From: Adam Litke <agl@us.ibm.com>

Qemu has changed the semantics of the "block_job_cancel" API. The
original qed implementation (pretty much only backported to RHEL 6.2
qemu) was synchronous (i.e. upon command completion, the operation was
guaranteed to be completely stopped). With the new semantics going
into qemu 1.1 for qcow2, a "block_job_cancel" merely requests that the
operation be cancelled, and an event is triggered once the
cancellation request has been honored.

To adopt the new semantics while preserving compatibility, the
following updates are made to the virDomainBlockJob API:

A new block job event type VIR_DOMAIN_BLOCK_JOB_CANCELED is recognized
by libvirt. Regardless of the flags used with virDomainBlockJobAbort,
this event will be raised whenever it is received from qemu. This
event indicates that a block job has been successfully cancelled. For
now, libvirt does not try to synthesize this event if using an older
qemu that did not generate it.

A new extension flag VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC is added to the
virDomainBlockJobAbort API. When enabled, this function will operate
asynchronously (i.e. it can return before the job has actually been
cancelled). When the API is used in this mode, it is the
responsibility of the caller to wait for a
VIR_DOMAIN_BLOCK_JOB_CANCELED event or poll via the
virDomainGetBlockJobInfo API to check the cancellation status; this
flag is an error if it is not known whether the hypervisor supports
asynchronous cancel.

This patch also exposes the new flag through virsh.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/libvirt/libvirt.h.in |   10 +++++++++
 src/libvirt.c                |   10 ++++++++-
 src/qemu/qemu_monitor_json.c |   42 +++++++++++++++++++++++++++++++------
 tools/virsh.c                |   47 ++++++++++++++++++++++++++---------------
 tools/virsh.pod              |    9 +++++--
 5 files changed, 90 insertions(+), 28 deletions(-)

diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 499dcd4..97ad99d 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -1946,6 +1946,15 @@ typedef enum {
 #endif
 } virDomainBlockJobType;
 
+/**
+ * virDomainBlockJobAbortFlags:
+ *
+ * VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC: Request only, do not wait for completion
+ */
+typedef enum {
+    VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC = 1 << 0,
+} virDomainBlockJobAbortFlags;
+
 /* An iterator for monitoring block job operations */
 typedef unsigned long long virDomainBlockJobCursor;
 
@@ -3617,6 +3626,7 @@ typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn,
 typedef enum {
     VIR_DOMAIN_BLOCK_JOB_COMPLETED = 0,
     VIR_DOMAIN_BLOCK_JOB_FAILED = 1,
+    VIR_DOMAIN_BLOCK_JOB_CANCELED = 2,
 
 #ifdef VIR_ENUM_SENTINELS
     VIR_DOMAIN_BLOCK_JOB_LAST
diff --git a/src/libvirt.c b/src/libvirt.c
index 16d1fd5..af22232 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -17902,7 +17902,7 @@ error:
  * virDomainBlockJobAbort:
  * @dom: pointer to domain object
  * @disk: path to the block device, or device shorthand
- * @flags: extra flags; not used yet, so callers should always pass 0
+ * @flags: bitwise-OR of virDomainBlockJobAbortFlags
  *
  * Cancel the active block job on the given disk.
  *
@@ -17913,6 +17913,14 @@ error:
  * can be found by calling virDomainGetXMLDesc() and inspecting
  * elements within //domain/devices/disk.
  *
+ * By default, this function performs a synchronous operation and the caller
+ * may assume that the operation has completed when 0 is returned. However,
+ * BlockJob operations may take a long time to complete, and during this time
+ * further domain interactions may be unresponsive. To avoid this problem,
+ * pass VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC in the @flags argument to enable
+ * asynchronous behavior. Either way, when the job has been cancelled, a
+ * BlockJob event will be emitted, with status VIR_DOMAIN_BLOCK_JOB_CANCELLED.
+ *
  * Returns -1 in case of failure, 0 when successful.
  */
 int virDomainBlockJobAbort(virDomainPtr dom, const char *disk,
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index 7093e1d..4ec7832 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -58,13 +58,14 @@ static void qemuMonitorJSONHandleIOError(qemuMonitorPtr mon, virJSONValuePtr dat
 static void qemuMonitorJSONHandleVNCConnect(qemuMonitorPtr mon, virJSONValuePtr data);
 static void qemuMonitorJSONHandleVNCInitialize(qemuMonitorPtr mon, virJSONValuePtr data);
 static void qemuMonitorJSONHandleVNCDisconnect(qemuMonitorPtr mon, virJSONValuePtr data);
-static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data);
 static void qemuMonitorJSONHandleSPICEConnect(qemuMonitorPtr mon, virJSONValuePtr data);
 static void qemuMonitorJSONHandleSPICEInitialize(qemuMonitorPtr mon, virJSONValuePtr data);
 static void qemuMonitorJSONHandleSPICEDisconnect(qemuMonitorPtr mon, virJSONValuePtr data);
 static void qemuMonitorJSONHandleTrayChange(qemuMonitorPtr mon, virJSONValuePtr data);
 static void qemuMonitorJSONHandlePMWakeup(qemuMonitorPtr mon, virJSONValuePtr data);
 static void qemuMonitorJSONHandlePMSuspend(qemuMonitorPtr mon, virJSONValuePtr data);
+static void qemuMonitorJSONHandleBlockJobCompleted(qemuMonitorPtr mon, virJSONValuePtr data);
+static void qemuMonitorJSONHandleBlockJobCanceled(qemuMonitorPtr mon, virJSONValuePtr data);
 
 static struct {
     const char *type;
@@ -80,13 +81,14 @@ static struct {
     { "VNC_CONNECTED", qemuMonitorJSONHandleVNCConnect, },
     { "VNC_INITIALIZED", qemuMonitorJSONHandleVNCInitialize, },
     { "VNC_DISCONNECTED", qemuMonitorJSONHandleVNCDisconnect, },
-    { "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJob, },
     { "SPICE_CONNECTED", qemuMonitorJSONHandleSPICEConnect, },
     { "SPICE_INITIALIZED", qemuMonitorJSONHandleSPICEInitialize, },
     { "SPICE_DISCONNECTED", qemuMonitorJSONHandleSPICEDisconnect, },
     { "DEVICE_TRAY_MOVED", qemuMonitorJSONHandleTrayChange, },
     { "WAKEUP", qemuMonitorJSONHandlePMWakeup, },
     { "SUSPEND", qemuMonitorJSONHandlePMSuspend, },
+    { "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJobCompleted, },
+    { "BLOCK_JOB_CANCELLED", qemuMonitorJSONHandleBlockJobCanceled, },
 };
 
 
@@ -754,13 +756,15 @@ static void qemuMonitorJSONHandleSPICEDisconnect(qemuMonitorPtr mon, virJSONValu
     qemuMonitorJSONHandleGraphics(mon, data, VIR_DOMAIN_EVENT_GRAPHICS_DISCONNECT);
 }
 
-static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data)
+static void
+qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon,
+                                  virJSONValuePtr data,
+                                  int event)
 {
     const char *device;
     const char *type_str;
     int type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN;
     unsigned long long offset, len;
-    int status = VIR_DOMAIN_BLOCK_JOB_FAILED;
 
     if ((device = virJSONValueObjectGetString(data, "device")) == NULL) {
         VIR_WARN("missing device in block job event");
@@ -785,11 +789,19 @@ static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr da
     if (STREQ(type_str, "stream"))
         type = VIR_DOMAIN_BLOCK_JOB_TYPE_PULL;
 
-    if (offset != 0 && offset == len)
-        status = VIR_DOMAIN_BLOCK_JOB_COMPLETED;
+    switch (event) {
+        case VIR_DOMAIN_BLOCK_JOB_COMPLETED:
+            /* Make sure the whole device has been processed */
+            if (offset != len)
+                event = VIR_DOMAIN_BLOCK_JOB_FAILED;
+            break;
+        case VIR_DOMAIN_BLOCK_JOB_FAILED:
+        case VIR_DOMAIN_BLOCK_JOB_CANCELED:
+            break;
+    }
 
 out:
-    qemuMonitorEmitBlockJob(mon, device, type, status);
+    qemuMonitorEmitBlockJob(mon, device, type, event);
 }
 
 static void
@@ -832,6 +844,22 @@ qemuMonitorJSONHandlePMSuspend(qemuMonitorPtr mon,
     qemuMonitorEmitPMSuspend(mon);
 }
 
+static void
+qemuMonitorJSONHandleBlockJobCompleted(qemuMonitorPtr mon,
+                                       virJSONValuePtr data)
+{
+    qemuMonitorJSONHandleBlockJobImpl(mon, data,
+                                      VIR_DOMAIN_BLOCK_JOB_COMPLETED);
+}
+
+static void
+qemuMonitorJSONHandleBlockJobCanceled(qemuMonitorPtr mon,
+                                      virJSONValuePtr data)
+{
+    qemuMonitorJSONHandleBlockJobImpl(mon, data,
+                                      VIR_DOMAIN_BLOCK_JOB_CANCELED);
+}
+
 int
 qemuMonitorJSONHumanCommandWithFd(qemuMonitorPtr mon,
                                   const char *cmd_str,
diff --git a/tools/virsh.c b/tools/virsh.c
index 1ed2dda..7135b15 100644
--- a/tools/virsh.c
+++ b/tools/virsh.c
@@ -7525,6 +7525,7 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd,
     const char *name, *path;
     unsigned long bandwidth = 0;
     int ret = -1;
+    unsigned int flags = 0;
 
     if (!vshConnectionUsability(ctl, ctl->conn))
         goto cleanup;
@@ -7541,7 +7542,9 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd,
     }
 
     if (mode == VSH_CMD_BLOCK_JOB_ABORT) {
-        ret = virDomainBlockJobAbort(dom, path, 0);
+        if (vshCommandOptBool(cmd, "async"))
+            flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC;
+        ret = virDomainBlockJobAbort(dom, path, flags);
     } else if (mode == VSH_CMD_BLOCK_JOB_INFO) {
         ret = virDomainGetBlockJobInfo(dom, path, info, 0);
     } else if (mode == VSH_CMD_BLOCK_JOB_SPEED) {
@@ -7589,20 +7592,25 @@ cmdBlockPull(vshControl *ctl, const vshCmd *cmd)
 }
 
 /*
- * "blockjobinfo" command
+ * "blockjob" command
  */
 static const vshCmdInfo info_block_job[] = {
-    {"help", N_("Manage active block operations.")},
-    {"desc", N_("Manage active block operations.")},
+    {"help", N_("Manage active block operations")},
+    {"desc", N_("Query, adjust speed, or cancel active block operations.")},
     {NULL, NULL}
 };
 
 static const vshCmdOptDef opts_block_job[] = {
     {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")},
     {"path", VSH_OT_DATA, VSH_OFLAG_REQ, N_("Fully-qualified path of disk")},
-    {"abort", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Abort the active job on the specified disk")},
-    {"info", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Get active job information for the specified disk")},
-    {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, N_("Set the Bandwidth limit in MB/s")},
+    {"abort", VSH_OT_BOOL, VSH_OFLAG_NONE,
+     N_("Abort the active job on the specified disk")},
+    {"async", VSH_OT_BOOL, VSH_OFLAG_NONE,
+     N_("don't wait for --abort to complete")},
+    {"info", VSH_OT_BOOL, VSH_OFLAG_NONE,
+     N_("Get active job information for the specified disk")},
+    {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE,
+     N_("Set the Bandwidth limit in MB/s")},
     {NULL, 0, 0, NULL}
 };
 
@@ -7613,19 +7621,24 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd)
     virDomainBlockJobInfo info;
     const char *type;
     int ret;
+    bool abortMode = vshCommandOptBool(cmd, "abort");
+    bool infoMode = vshCommandOptBool(cmd, "info");
+    bool bandwidth = vshCommandOptBool(cmd, "bandwidth");
 
-    if (vshCommandOptBool (cmd, "abort")) {
-        mode = VSH_CMD_BLOCK_JOB_ABORT;
-    } else if (vshCommandOptBool (cmd, "info")) {
-        mode = VSH_CMD_BLOCK_JOB_INFO;
-    } else if (vshCommandOptBool (cmd, "bandwidth")) {
-        mode = VSH_CMD_BLOCK_JOB_SPEED;
-    } else {
+    if (abortMode + infoMode + bandwidth > 1) {
         vshError(ctl, "%s",
                  _("One of --abort, --info, or --bandwidth is required"));
         return false;
     }
 
+    if (abortMode)
+        mode = VSH_CMD_BLOCK_JOB_ABORT;
+    else if (bandwidth)
+        mode = VSH_CMD_BLOCK_JOB_SPEED;
+    else
+        mode = VSH_CMD_BLOCK_JOB_INFO;
+
+
     ret = blockJobImpl(ctl, cmd, &info, mode);
     if (ret < 0)
         return false;
@@ -7634,13 +7647,13 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd)
         return true;
 
     if (info.type == VIR_DOMAIN_BLOCK_JOB_TYPE_PULL)
-        type = "Block Pull";
+        type = _("Block Pull");
     else
-        type = "Unknown job";
+        type = _("Unknown job");
 
     print_job_progress(type, info.end - info.cur, info.end);
     if (info.bandwidth != 0)
-        vshPrint(ctl, " Bandwidth limit: %lu MB/s\n", info.bandwidth);
+        vshPrint(ctl, _(" Bandwidth limit: %lu MB/s\n"), info.bandwidth);
     return true;
 }
 
diff --git a/tools/virsh.pod b/tools/virsh.pod
index d4971a3..cc7de24 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -685,13 +685,16 @@ Both I<--live> and I<--current> flags may be given, but I<--current> is
 exclusive. If no flag is specified, behavior is different depending
 on hypervisor.
 
-=item B<blockjob> I<domain> I<path> [I<--abort>] [I<--info>] [I<bandwidth>]
+=item B<blockjob> I<domain> I<path> { I<--abort> [I<--async>] |
+[I<--info>] | I<bandwidth> }
 
-Manage active block operations.
+Manage active block operations. If no mode is chosen, I<--info> is assumed.
 
 I<path> specifies fully-qualified path of the disk.
+
 If I<--abort> is specified, the active job on the specified disk will
-be aborted.
+be aborted. If I<--async> is also specified, this command will return
+immediately, rather than waiting for the cancelation to complete.
 If I<--info> is specified, the active job information on the specified
 disk will be printed.
 I<bandwidth> can be used to set bandwidth limit for the active job.
--
1.7.7.6
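
For anyone reviewing the monitor changes, the mapping from qemu event
to libvirt status in this patch can be summarized in a few lines. This
is only a standalone sketch, not code from the series: the sketch_*
names are hypothetical, and the enum values are local stand-ins for the
VIR_DOMAIN_BLOCK_JOB_* constants that real code would take from
libvirt.h.

```c
#include <string.h>

/* Local stand-ins (assumption) for VIR_DOMAIN_BLOCK_JOB_COMPLETED,
 * _FAILED and _CANCELED as numbered in the diff above. */
enum sketch_block_job_status {
    SKETCH_JOB_COMPLETED = 0,
    SKETCH_JOB_FAILED = 1,
    SKETCH_JOB_CANCELED = 2,
};

/* Hypothetical helper: given the qemu event name and the offset/len
 * it reported, return the status that would be emitted.  Returns -1
 * for event names this sketch does not handle. */
static int
sketch_block_job_status(const char *qemu_event,
                        unsigned long long offset,
                        unsigned long long len)
{
    if (strcmp(qemu_event, "BLOCK_JOB_COMPLETED") == 0) {
        /* A completion only counts if the whole device was processed */
        return offset == len ? SKETCH_JOB_COMPLETED : SKETCH_JOB_FAILED;
    }
    /* qemu spells the event name with two Ls */
    if (strcmp(qemu_event, "BLOCK_JOB_CANCELLED") == 0)
        return SKETCH_JOB_CANCELED;
    return -1;
}
```

Note the spelling difference that the patch has to bridge: the qemu
event name uses "CANCELLED" while the libvirt enum value uses
"CANCELED".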

From: Adam Litke <agl@us.ibm.com>

Without the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag, libvirt will
internally poll using qemu's "query-block-jobs" API and will not
return until the operation has been completed. API users are advised
that this operation is unbounded and further interaction with the
domain during this period may block. Future patches may refactor
things to allow other queries in parallel with this polling.
Unfortunately, there's no good way to tell if qemu will emit the new
event, so this implementation always polls to deal with older qemu.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 src/qemu/qemu_driver.c |   55 +++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index dd79973..f5b3406 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11601,7 +11601,7 @@ cleanup:
 static int
 qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base,
                        unsigned long bandwidth, virDomainBlockJobInfoPtr info,
-                       int mode)
+                       int mode, unsigned int flags)
 {
     struct qemud_driver *driver = dom->conn->privateData;
     virDomainObjPtr vm = NULL;
@@ -11643,6 +11643,45 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base,
     ret = qemuMonitorBlockJob(priv->mon, device, base, bandwidth, info, mode);
     qemuDomainObjExitMonitorWithDriver(driver, vm);
 
+    /* Qemu provides asynchronous block job cancellation, but without
+     * the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag libvirt guarantees a
+     * synchronous operation. Provide this behavior by waiting here,
+     * so we don't get confused by newly scheduled block jobs.
+     */
+    if (ret == 0 && mode == BLOCK_JOB_ABORT &&
+        !(flags & VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC)) {
+        ret = 1;
+        while (1) {
+            /* Poll every 50ms */
+            struct timespec ts = { .tv_sec = 0,
+                                   .tv_nsec = 50 * 1000 * 1000ull };
+            virDomainBlockJobInfo dummy;
+
+            qemuDomainObjEnterMonitorWithDriver(driver, vm);
+            ret = qemuMonitorBlockJob(priv->mon, device, NULL, 0, &dummy,
+                                      BLOCK_JOB_INFO);
+            qemuDomainObjExitMonitorWithDriver(driver, vm);
+
+            if (ret <= 0)
+                break;
+
+            virDomainObjUnlock(vm);
+            qemuDriverUnlock(driver);
+
+            nanosleep(&ts, NULL);
+
+            qemuDriverLock(driver);
+            virDomainObjLock(vm);
+
+            if (!virDomainObjIsActive(vm)) {
+                qemuReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                                _("domain is not running"));
+                ret = -1;
+                break;
+            }
+        }
+    }
+
 endjob:
     if (qemuDomainObjEndJob(driver, vm) == 0) {
         vm = NULL;
@@ -11660,8 +11699,9 @@ cleanup:
 static int
 qemuDomainBlockJobAbort(virDomainPtr dom, const char *path, unsigned int flags)
 {
-    virCheckFlags(0, -1);
-    return qemuDomainBlockJobImpl(dom, path, NULL, 0, NULL, BLOCK_JOB_ABORT);
+    virCheckFlags(VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC, -1);
+    return qemuDomainBlockJobImpl(dom, path, NULL, 0, NULL, BLOCK_JOB_ABORT,
+                                  flags);
 }
 
 static int
@@ -11669,7 +11709,8 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path,
                            virDomainBlockJobInfoPtr info, unsigned int flags)
 {
     virCheckFlags(0, -1);
-    return qemuDomainBlockJobImpl(dom, path, NULL, 0, info, BLOCK_JOB_INFO);
+    return qemuDomainBlockJobImpl(dom, path, NULL, 0, info, BLOCK_JOB_INFO,
+                                  flags);
 }
 
 static int
@@ -11678,7 +11719,7 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path,
 {
     virCheckFlags(0, -1);
     return qemuDomainBlockJobImpl(dom, path, NULL, bandwidth, NULL,
-                                  BLOCK_JOB_SPEED);
+                                  BLOCK_JOB_SPEED, flags);
 }
 
 static int
@@ -11689,10 +11730,10 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base,
 
     virCheckFlags(0, -1);
     ret = qemuDomainBlockJobImpl(dom, path, base, bandwidth, NULL,
-                                 BLOCK_JOB_PULL);
+                                 BLOCK_JOB_PULL, flags);
     if (ret == 0 && bandwidth != 0)
         ret = qemuDomainBlockJobImpl(dom, path, NULL, bandwidth, NULL,
-                                     BLOCK_JOB_SPEED);
+                                     BLOCK_JOB_SPEED, flags);
     return ret;
 }
 
--
1.7.7.6
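
Stripped of the locking and monitor entry/exit, the synchronous wait in
this patch is a plain poll-and-sleep loop. The sketch below shows that
shape in isolation. It is an illustration under assumptions, not the
driver code: sketch_query_fn, sketch_wait_for_cancel and
sketch_fake_query are hypothetical names, and the callback stands in
for the "query-block-jobs" round trip, returning 1 while a job is still
present, 0 once it is gone, and -1 on error.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */
#include <assert.h>
#include <time.h>

/* Hypothetical callback: 1 = job still present, 0 = gone, -1 = error */
typedef int (*sketch_query_fn)(void *opaque);

/* Poll until the job disappears.  Returns 0 once the job is gone or
 * -1 if the query fails.  As in the patch, the wait is unbounded. */
static int
sketch_wait_for_cancel(sketch_query_fn query, void *opaque)
{
    /* Poll every 50ms, the interval used in the patch */
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };
    int ret;

    assert(query != NULL);
    while ((ret = query(opaque)) > 0)
        nanosleep(&ts, NULL);
    return ret;
}

/* Toy query for demonstration: reports "still present" for the number
 * of polls stored in *opaque, then reports the job as gone. */
static int
sketch_fake_query(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        return 1;
    }
    return 0;
}
```

A caller would pass its own query function in place of
sketch_fake_query. The real loop additionally drops the driver and
domain locks around the sleep and bails out if the domain stops
running, which this sketch leaves out.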

This patch introduces a new block job, useful for live storage
migration using pre-copy streaming.

If a live VM is using the following backing chain:

  base <- snap1 <- snap2

then

  virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
                       VIR_DOMAIN_BLOCK_REBASE_COPY)

will create /path/to/copy with no backing file, and

  virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
                       VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW)

will create /path/to/copy with snap1 as a backing file.

An event will be issued when the copy is finally in sync with the
source, at which point the job remains alive until the user is ready
to break the mirroring and either abort to the source or pivot to the
destination with the new:

  virDomainBlockJobAbort(dom, disk, VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)

Normally, a shallow copy is created with a backing file being the
absolute path matching the backing file of the source, but a
management application can pre-create the copy with a relative backing
file name, and use the VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT flag to have
qemu reuse the metadata; if the management application also copies the
backing files to a new location, this can be used to perform live
storage migration of an entire backing chain.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_TYPE_COPY): New
block job type.
(VIR_DOMAIN_BLOCK_JOB_MIRRORING): New block job event type.
(virDomainBlockJobAbortFlags, virDomainBlockRebaseFlags): New enums.
* src/libvirt.c (virDomainBlockRebase): Document the new flags, and
implement general restrictions on flag combinations.
(virDomainBlockJobAbort): Document the new flag.
(virDomainSaveFlags, virDomainSnapshotCreateXML)
(virDomainRevertToSnapshot, virDomainDetachDeviceFlags): Document
restrictions.
* include/libvirt/virterror.h (VIR_ERR_BLOCK_COPY_ACTIVE): New
error.
* src/util/virterror.c (virErrorMsg): Define it.
---
 include/libvirt/libvirt.h.in |   24 ++++++++++-
 include/libvirt/virterror.h  |    1 +
 src/libvirt.c                |   90 ++++++++++++++++++++++++++++++++++++++----
 src/util/virterror.c         |    6 +++
 4 files changed, 111 insertions(+), 10 deletions(-)

diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 97ad99d..9901a82 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -1934,12 +1934,15 @@ int virDomainUpdateDeviceFlags(virDomainPtr domain,
 /**
  * virDomainBlockJobType:
  *
- * VIR_DOMAIN_BLOCK_JOB_TYPE_PULL: Block Pull (virDomainBlockPull or
- * virDomainBlockRebase)
+ * VIR_DOMAIN_BLOCK_JOB_TYPE_PULL: Block Pull (virDomainBlockPull, or
+ * virDomainBlockRebase without flags), job ends on completion
+ * VIR_DOMAIN_BLOCK_JOB_TYPE_COPY: Block Copy (virDomainBlockRebase with
+ * flags), job exists as long as mirroring is active
  */
 typedef enum {
     VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN = 0,
     VIR_DOMAIN_BLOCK_JOB_TYPE_PULL = 1,
+    VIR_DOMAIN_BLOCK_JOB_TYPE_COPY = 2,
 
 #ifdef VIR_ENUM_SENTINELS
     VIR_DOMAIN_BLOCK_JOB_TYPE_LAST
@@ -1950,9 +1953,11 @@ typedef enum {
  * virDomainBlockJobAbortFlags:
  *
  * VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC: Request only, do not wait for completion
+ * VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT: Pivot to mirror when ending a copy job
  */
 typedef enum {
     VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC = 1 << 0,
+    VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT = 1 << 1,
 } virDomainBlockJobAbortFlags;
 
 /* An iterator for monitoring block job operations */
@@ -1983,6 +1988,20 @@ int virDomainBlockJobSetSpeed(virDomainPtr dom, const char *disk,
 
 int virDomainBlockPull(virDomainPtr dom, const char *disk,
                        unsigned long bandwidth, unsigned int flags);
+
+/**
+ * virDomainBlockRebaseFlags:
+ *
+ * Flags available for virDomainBlockRebase().
+ */
+typedef enum {
+    VIR_DOMAIN_BLOCK_REBASE_SHALLOW = 1 << 0, /* Limit copy to top of source
+                                                 backing chain */
+    VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT = 1 << 1, /* Reuse existing external
+                                                   file for a copy */
+    VIR_DOMAIN_BLOCK_REBASE_COPY = 1 << 2, /* Start a copy job */
+} virDomainBlockRebaseFlags;
+
 int virDomainBlockRebase(virDomainPtr dom, const char *disk,
                          const char *base, unsigned long bandwidth,
                          unsigned int flags);
@@ -3627,6 +3646,7 @@ typedef enum {
     VIR_DOMAIN_BLOCK_JOB_COMPLETED = 0,
     VIR_DOMAIN_BLOCK_JOB_FAILED = 1,
     VIR_DOMAIN_BLOCK_JOB_CANCELED = 2,
+    VIR_DOMAIN_BLOCK_JOB_MIRRORING = 3,
 
 #ifdef VIR_ENUM_SENTINELS
     VIR_DOMAIN_BLOCK_JOB_LAST
diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h
index e04d29e..070fdb5 100644
--- a/include/libvirt/virterror.h
+++ b/include/libvirt/virterror.h
@@ -249,6 +249,7 @@ typedef enum {
     VIR_ERR_NO_DOMAIN_METADATA = 80, /* The metadata is not present */
     VIR_ERR_MIGRATE_UNSAFE = 81, /* Migration is not safe */
     VIR_ERR_OVERFLOW = 82, /* integer overflow */
+    VIR_ERR_BLOCK_COPY_ACTIVE = 83, /* action prevented by block copy job */
 } virErrorNumber;
 
 /**
diff --git a/src/libvirt.c b/src/libvirt.c
index af22232..9212c08 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -2696,6 +2696,10 @@ error:
  * A save file can be inspected or modified slightly with
  * virDomainSaveImageGetXMLDesc() and virDomainSaveImageDefineXML().
  *
+ * Some hypervisors may prevent this operation if there is a current
+ * block copy operation; in that case, use virDomainBlockJobAbort()
+ * to stop the block copy first.
+ *
  * Returns 0 in case of success and -1 in case of failure.
  */
 int
@@ -9424,6 +9428,10 @@ error:
  * return failure if LIVE is specified but it only supports removing the
  * persisted device allocation.
  *
+ * Some hypervisors may prevent this operation if there is a current
+ * block copy operation on the device being detached; in that case,
+ * use virDomainBlockJobAbort() to stop the block copy first.
+ *
  * Returns 0 in case of success, -1 in case of failure.
  */
 int
@@ -17124,6 +17132,10 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot)
  * that it is still possible to fail after disks have changed, but only
  * in the much rarer cases of running out of memory or disk space).
  *
+ * Some hypervisors may prevent this operation if there is a current
+ * block copy operation; in that case, use virDomainBlockJobAbort()
+ * to stop the block copy first.
+ *
  * Returns an (opaque) virDomainSnapshotPtr on success, NULL on failure.
  */
 virDomainSnapshotPtr
@@ -17913,13 +17925,22 @@ error:
  * can be found by calling virDomainGetXMLDesc() and inspecting
  * elements within //domain/devices/disk.
  *
- * By default, this function performs a synchronous operation and the caller
+ * If the current block job for @disk is VIR_DOMAIN_BLOCK_JOB_TYPE_PULL, then
+ * by default, this function performs a synchronous operation and the caller
  * may assume that the operation has completed when 0 is returned. However,
  * BlockJob operations may take a long time to complete, and during this time
  * further domain interactions may be unresponsive. To avoid this problem,
  * pass VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC in the @flags argument to enable
  * asynchronous behavior. Either way, when the job has been cancelled, a
  * BlockJob event will be emitted, with status VIR_DOMAIN_BLOCK_JOB_CANCELLED.
+ * In this usage, @flags must not contain VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT.
+ *
+ * If the current block job for @disk is VIR_DOMAIN_BLOCK_JOB_TYPE_COPY, then
+ * the default is to abort the mirroring and revert to the source disk;
+ * adding @flags of VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT causes this call to
+ * fail with VIR_ERR_BLOCK_COPY_ACTIVE if the copy is not fully populated,
+ * otherwise it will swap the disk over to the copy to end the mirroring. In
+ * this usage, @flags must not contain VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC.
  *
  * Returns -1 in case of failure, 0 when successful.
  */
@@ -17949,6 +17970,12 @@ int virDomainBlockJobAbort(virDomainPtr dom, const char *disk,
                           _("disk is NULL"));
         goto error;
     }
+    if ((flags & VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC) &&
+        (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) {
+        virLibDomainError(VIR_ERR_INVALID_ARG,
+                          _("async and pivot flags are mutually exclusive"));
+        goto error;
+    }
 
     if (conn->driver->domainBlockJobAbort) {
         int ret;
@@ -18169,19 +18196,49 @@ error:
  * @disk: path to the block device, or device shorthand
  * @base: path to backing file to keep, or NULL for no backing file
  * @bandwidth: (optional) specify copy bandwidth limit in Mbps
- * @flags: extra flags; not used yet, so callers should always pass 0
+ * @flags: bitwise-OR of virDomainBlockRebaseFlags
  *
  * Populate a disk image with data from its backing image chain, and
- * setting the backing image to @base. @base must be the absolute
+ * setting the backing image to @base, or alternatively copy an entire
+ * backing chain to a new file @base.
+ *
+ * When @flags is 0, this starts a pull, where @base must be the absolute
  * path of one of the backing images further up the chain, or NULL to
  * convert the disk image so that it has no backing image. Once all
  * data from its backing image chain has been pulled, the disk no
  * longer depends on those intermediate backing images. This function
  * pulls data for the entire device in the background. Progress of
- * the operation can be checked with virDomainGetBlockJobInfo() and
- * the operation can be aborted with virDomainBlockJobAbort(). When
- * finished, an asynchronous event is raised to indicate the final
- * status.
+ * the operation can be checked with virDomainGetBlockJobInfo() with a
+ * job type of VIR_DOMAIN_BLOCK_JOB_TYPE_PULL, and the operation can be
+ * aborted with virDomainBlockJobAbort(). When finished, an asynchronous
+ * event is raised to indicate the final status, and the job no longer
+ * exists.
+ *
+ * When @flags includes VIR_DOMAIN_BLOCK_REBASE_COPY, this starts a copy,
+ * where @base must be the name of a new file to copy the chain to. The
+ * destination file will have the same file format as the top of the source
+ * chain. By default, if @base exists as a non-empty regular file, the
+ * copy is rejected to avoid losing content of that file. However, if
+ * @flags additionally includes VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT, then
+ * the destination file must already exist and contain content identical
+ * to the source file (this allows a management app to pre-create files
+ * with relative backing file names, rather than the default of creating
+ * with absolute backing file names). By default, the copy will pull
+ * the entire source chain into the destination file, but if @flags also
+ * contains VIR_DOMAIN_BLOCK_REBASE_SHALLOW, then only the top of the
+ * source chain will be copied. A copy job has two parts; in the first
+ * phase, the @bandwidth parameter affects how fast the source is pulled
+ * into the destination, and the job can only be canceled by reverting
+ * to the source file; progress in this phase can be tracked via the
+ * virDomainBlockJobInfo() command, with a job type of
+ * VIR_DOMAIN_BLOCK_JOB_TYPE_COPY. An asynchronous event is sent when
+ * this phase ends, at which point the job remains alive to indicate
+ * that both source and destination files contain mirrored contents, and
+ * the user must call virDomainBlockJobAbort() to end the mirroring while
+ * choosing whether to revert to source or pivot to the destination.
+ * Some hypervisors will restrict certain actions, such as virDomainSave()
+ * or virDomainDetachDevice(), while a copy job is active; they may
+ * also restrict a copy job to transient domains.
  *
  * The @disk parameter is either an unambiguous source name of the
  * block device (the <source file='...'/> sub-element, such as
@@ -18195,7 +18252,8 @@ error:
  * suitable default. Some hypervisors do not support this feature and will
  * return an error if bandwidth is not 0.
  *
- * When @base is NULL, this is identical to virDomainBlockPull().
+ * When @base is NULL and @flags is 0, this is identical to
+ * virDomainBlockPull().
  *
  * Returns 0 if the operation has started, -1 on failure.
  */
@@ -18228,6 +18286,22 @@ int virDomainBlockRebase(virDomainPtr dom, const char *disk,
         goto error;
     }
 
+    if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) {
+        if (!base) {
+            virLibDomainError(VIR_ERR_INVALID_ARG,
+                              _("base is required when starting a copy"));
+            goto error;
+        }
+    } else if (flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) {
+        virLibDomainError(VIR_ERR_INVALID_ARG,
+                          _("shallow only permitted when requesting a copy"));
+        goto error;
+    } else if (flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT) {
+        virLibDomainError(VIR_ERR_INVALID_ARG,
+                          _("reuse_ext only permitted when requesting a copy"));
+        goto error;
+    }
+
     if (conn->driver->domainBlockRebase) {
         int ret;
         ret = conn->driver->domainBlockRebase(dom, disk, base, bandwidth,
diff --git a/src/util/virterror.c b/src/util/virterror.c
index ff9a36f..845081e 100644
--- a/src/util/virterror.c
+++ b/src/util/virterror.c
@@ -1250,6 +1250,12 @@ virErrorMsg(virErrorNumber error, const char *info)
             else
                 errmsg = _("numerical overflow: %s");
             break;
+        case VIR_ERR_BLOCK_COPY_ACTIVE:
+            if (!info)
+                errmsg = _("block copy still active");
+            else
+                errmsg = _("block copy still active: %s");
+            break;
     }
     return errmsg;
 }
--
1.7.7.6
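
The generic flag restrictions this patch adds to src/libvirt.c boil
down to two small predicates. The following is a standalone sketch of
those rules only, so it can be compiled without libvirt: the SK_* enum
values are local stand-ins that mirror the bit positions in the diff,
and sketch_check_rebase/sketch_check_abort are hypothetical names, not
functions in the series.

```c
/* Local stand-ins (assumption) mirroring virDomainBlockRebaseFlags */
enum {
    SK_REBASE_SHALLOW   = 1 << 0,
    SK_REBASE_REUSE_EXT = 1 << 1,
    SK_REBASE_COPY      = 1 << 2,
};

/* Local stand-ins (assumption) mirroring virDomainBlockJobAbortFlags */
enum {
    SK_ABORT_ASYNC = 1 << 0,
    SK_ABORT_PIVOT = 1 << 1,
};

/* Returns 0 if (base, flags) passes the generic rebase checks, else -1 */
static int
sketch_check_rebase(const char *base, unsigned int flags)
{
    if (flags & SK_REBASE_COPY)
        return base ? 0 : -1;   /* a copy needs a destination */
    if (flags & (SK_REBASE_SHALLOW | SK_REBASE_REUSE_EXT))
        return -1;              /* modifiers are only valid with COPY */
    return 0;                   /* plain pull; base may be NULL */
}

/* Returns 0 if the abort flags are a valid combination, else -1 */
static int
sketch_check_abort(unsigned int flags)
{
    if ((flags & SK_ABORT_ASYNC) && (flags & SK_ABORT_PIVOT))
        return -1;              /* async and pivot are mutually exclusive */
    return 0;
}
```

These only model the checks done before the call reaches a driver; a
hypervisor driver may reject further combinations on its own.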

Rather than further overloading 'blockpull', I decided to create a new virsh command to expose the new flags of virDomainBlockRebase. Someday, I'd also like to make blockpull and blockcopy have a synchronous mode, which blocks until the event happens or Ctrl-C is pressed, as well as a --verbose flag to print status updates before the event. * tools/virsh.c (VSH_CMD_BLOCK_JOB_COPY): New mode. (blockJobImpl): Support new flags. (cmdBlockCopy): New command. (cmdBlockJob): Support new job info, new abort flag. * tools/virsh.pod (blockcopy, blockjob): Document the new command and flags. --- tools/virsh.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++------- tools/virsh.pod | 26 ++++++++++++++++++- 2 files changed, 87 insertions(+), 12 deletions(-) diff --git a/tools/virsh.c b/tools/virsh.c index 7135b15..d01d31d 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -7515,16 +7515,18 @@ typedef enum { VSH_CMD_BLOCK_JOB_INFO = 1, VSH_CMD_BLOCK_JOB_SPEED = 2, VSH_CMD_BLOCK_JOB_PULL = 3, -} VSH_CMD_BLOCK_JOB_MODE; + VSH_CMD_BLOCK_JOB_COPY = 4, +} vshCmdBlockJobMode; static int blockJobImpl(vshControl *ctl, const vshCmd *cmd, - virDomainBlockJobInfoPtr info, int mode) + virDomainBlockJobInfoPtr info, int mode) { virDomainPtr dom = NULL; const char *name, *path; unsigned long bandwidth = 0; int ret = -1; + const char *base = NULL; unsigned int flags = 0; if (!vshConnectionUsability(ctl, ctl->conn)) @@ -7541,22 +7543,37 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, goto cleanup; } - if (mode == VSH_CMD_BLOCK_JOB_ABORT) { + switch ((vshCmdBlockJobMode) mode) { + case VSH_CMD_BLOCK_JOB_ABORT: if (vshCommandOptBool(cmd, "async")) flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC; + if (vshCommandOptBool(cmd, "pivot")) + flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT; ret = virDomainBlockJobAbort(dom, path, flags); - } else if (mode == VSH_CMD_BLOCK_JOB_INFO) { + break; + case VSH_CMD_BLOCK_JOB_INFO: ret = virDomainGetBlockJobInfo(dom, path, info, 0); - } else if (mode == 
VSH_CMD_BLOCK_JOB_SPEED) { + break; + case VSH_CMD_BLOCK_JOB_SPEED: ret = virDomainBlockJobSetSpeed(dom, path, bandwidth, 0); - } else if (mode == VSH_CMD_BLOCK_JOB_PULL) { - const char *base = NULL; + break; + case VSH_CMD_BLOCK_JOB_PULL: if (vshCommandOptString(cmd, "base", &base) < 0) goto cleanup; if (base) ret = virDomainBlockRebase(dom, path, base, bandwidth, 0); else ret = virDomainBlockPull(dom, path, bandwidth, 0); + break; + case VSH_CMD_BLOCK_JOB_COPY: + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; + if (vshCommandOptBool(cmd, "shallow")) + flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + if (vshCommandOptBool(cmd, "reuse-external")) + flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + if (vshCommandOptString(cmd, "dest", &base) < 0) + goto cleanup; + ret = virDomainBlockRebase(dom, path, base, bandwidth, flags); } cleanup: @@ -7566,6 +7583,33 @@ cleanup: } /* + * "blockcopy" command + */ +static const vshCmdInfo info_block_copy[] = { + {"help", N_("Start a block copy operation.")}, + {"desc", N_("Populate a disk from its backing image.")}, + {NULL, NULL} +}; + +static const vshCmdOptDef opts_block_copy[] = { + {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")}, + {"path", VSH_OT_DATA, VSH_OFLAG_REQ, N_("Fully-qualified path of disk")}, + {"dest", VSH_OT_DATA, VSH_OFLAG_REQ, N_("path of the copy to create")}, + {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, N_("Bandwidth limit in MB/s")}, + {"shallow", VSH_OT_BOOL, 0, N_("make the copy share a backing chain")}, + {"reuse-external", VSH_OT_BOOL, 0, N_("reuse existing destination")}, + {NULL, 0, 0, NULL} +}; + +static bool +cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) +{ + if (blockJobImpl(ctl, cmd, NULL, VSH_CMD_BLOCK_JOB_COPY) != 0) + return false; + return true; +} + +/* * "blockpull" command */ static const vshCmdInfo info_block_pull[] = { @@ -7607,6 +7651,7 @@ static const vshCmdOptDef opts_block_job[] = { N_("Abort the active job on the specified disk")}, {"async", VSH_OT_BOOL, VSH_OFLAG_NONE, 
N_("don't wait for --abort to complete")}, + {"pivot", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("with abort, pivot a copy job")}, {"info", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Get active job information for the specified disk")}, {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, @@ -7646,10 +7691,17 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) if (ret == 0 || mode != VSH_CMD_BLOCK_JOB_INFO) return true; - if (info.type == VIR_DOMAIN_BLOCK_JOB_TYPE_PULL) + switch (info.type) { + case VIR_DOMAIN_BLOCK_JOB_TYPE_PULL: type = _("Block Pull"); - else + break; + case VIR_DOMAIN_BLOCK_JOB_TYPE_COPY: + type = _("Block Copy"); + break; + default: type = _("Unknown job"); + break; + } print_job_progress(type, info.end - info.cur, info.end); if (info.bandwidth != 0) @@ -17128,8 +17180,9 @@ static const vshCmdDef domManagementCmds[] = { {"autostart", cmdAutostart, opts_autostart, info_autostart, 0}, {"blkdeviotune", cmdBlkdeviotune, opts_blkdeviotune, info_blkdeviotune, 0}, {"blkiotune", cmdBlkiotune, opts_blkiotune, info_blkiotune, 0}, - {"blockpull", cmdBlockPull, opts_block_pull, info_block_pull, 0}, + {"blockcopy", cmdBlockCopy, opts_block_copy, info_block_copy, 0}, {"blockjob", cmdBlockJob, opts_block_job, info_block_job, 0}, + {"blockpull", cmdBlockPull, opts_block_pull, info_block_pull, 0}, {"blockresize", cmdBlockResize, opts_block_resize, info_block_resize, 0}, {"change-media", cmdChangeMedia, opts_change_media, info_change_media, 0}, #ifndef WIN32 diff --git a/tools/virsh.pod b/tools/virsh.pod index cc7de24..112be22 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -637,6 +637,26 @@ currently in use by a running domain. Other contexts that require a MAC address of virtual interface (such as I<detach-interface> or I<domif-setlink>) will accept the MAC address printed by this command. +=item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<--shallow>] +[I<--reuse-external>] + +Copy a disk backing image chain to I<dest>. 
By default, this command +flattens the entire chain; but if I<--shallow> is specified, the copy +shares the backing chain. + +If I<--reuse-external> is specified, then I<dest> must exist and have +contents identical to I<disk> (typically used to set up a relative +backing file name). + +The copy runs in the background; initially, the job must copy all data +from the source, and during this phase, the job can only be canceled to +revert back to the source disk. After this phase completes, both the +source and the destination remain mirrored until a call to B<blockjob> +with the I<--abort> and I<--pivot> flags pivots over to the copy. + +I<path> specifies fully-qualified path of the disk. +I<bandwidth> specifies copying bandwidth limit in Mbps. + =item B<blockpull> I<domain> I<path> [I<bandwidth>] [I<base>] Populate a disk from its backing image chain. By default, this command @@ -685,7 +705,7 @@ Both I<--live> and I<--current> flags may be given, but I<--current> is exclusive. If no flag is specified, behavior is different depending on hypervisor. -=item B<blockjob> I<domain> I<path> { I<--abort> [I<--async>] | +=item B<blockjob> I<domain> I<path> { I<--abort> [I<--async>] [I<--pivot>] | [I<--info>] | I<bandwidth> } Manage active block operations. If no mode is chosen, I<--info> is assumed. @@ -694,7 +714,9 @@ I<path> specifies fully-qualified path of the disk. If I<--abort> is specified, the active job on the specified disk will be aborted. If I<--async> is also specified, this command will return -immediately, rather than waiting for the cancelation to complete. +immediately, rather than waiting for the cancelation to complete. If +I<--pivot> is specified, this requests that an active copy job +be pivoted over to the new copy. If I<--info> is specified, the active job information on the specified disk will be printed. I<bandwidth> can be used to set bandwidth limit for the active job. -- 1.7.7.6
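The two-phase behaviour documented for blockcopy and blockjob --abort [--pivot] can be summarised as a small decision table. The sketch below is a hypothetical model for discussion; none of these type or function names exist in libvirt or virsh. It also assumes that a pivot requested before the first phase completes is refused, which the text implies ("can only be canceled to revert back to the source disk") but does not state as an error.

```c
/* Hypothetical model of the copy job phases described in virsh.pod. */
typedef enum {
    COPY_PHASE_PULLING,    /* data is still being copied into dest */
    COPY_PHASE_MIRRORING,  /* source and dest hold identical contents */
} copy_phase;

typedef enum {
    COPY_RESULT_SOURCE,    /* domain keeps using the original disk */
    COPY_RESULT_DEST,      /* domain switches over to the copy */
    COPY_RESULT_REJECTED,  /* request is not valid in this phase */
} copy_result;

/* Outcome of "blockjob --abort" with or without "--pivot". */
copy_result
copy_job_abort(copy_phase phase, int pivot)
{
    if (!pivot)
        return COPY_RESULT_SOURCE;   /* a plain abort always reverts */
    if (phase == COPY_PHASE_MIRRORING)
        return COPY_RESULT_DEST;     /* pivot is possible once mirrored */
    return COPY_RESULT_REJECTED;     /* assumption: too early to pivot */
}
```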

This new API provides additional flexibility over what can be crammed on top of virDomainBlockRebase, at the expense that it cannot be backported without bumping the .so version. * include/libvirt/libvirt.h.in (virDomainBlockCopy): New API. * src/libvirt.c (virDomainBlockCopy): Implement it. * src/libvirt_public.syms (LIBVIRT_0.9.12): Export it. * src/driver.h (virDrvDomainBlockCopy): New driver callback. * docs/apibuild.py (CParser.parseSignature): Add exception. --- docs/apibuild.py | 1 + include/libvirt/libvirt.h.in | 18 ++++++- src/driver.h | 6 ++ src/libvirt.c | 120 +++++++++++++++++++++++++++++++++++++++++- src/libvirt_public.syms | 5 ++ 5 files changed, 147 insertions(+), 3 deletions(-) diff --git a/docs/apibuild.py b/docs/apibuild.py index 1ac0281..bf06f3b 100755 --- a/docs/apibuild.py +++ b/docs/apibuild.py @@ -1650,6 +1650,7 @@ class CParser: "virDomainBlockJobSetSpeed" : (False, ("bandwidth")), "virDomainBlockPull" : (False, ("bandwidth")), "virDomainBlockRebase" : (False, ("bandwidth")), + "virDomainBlockCopy" : (False, ("bandwidth")), "virDomainMigrateGetMaxSpeed" : (False, ("bandwidth")) } def checkLongLegacyFunction(self, name, return_type, signature): diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 9901a82..9e55307 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1937,7 +1937,7 @@ int virDomainUpdateDeviceFlags(virDomainPtr domain, * VIR_DOMAIN_BLOCK_JOB_TYPE_PULL: Block Pull (virDomainBlockPull, or * virDomainBlockRebase without flags), job ends on completion * VIR_DOMAIN_BLOCK_JOB_TYPE_COPY: Block Copy (virDomainBlockRebase with - * flags), job exists as long as mirroring is active + * flags, or virDomainBlockCopy), job exists as long as mirroring is active */ typedef enum { VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN = 0, @@ -2006,6 +2006,22 @@ int virDomainBlockRebase(virDomainPtr dom, const char *disk, const char *base, unsigned long bandwidth, unsigned int flags); +/** + * 
virDomainBlockCopyFlags: + * + * Flags available for virDomainBlockCopy(). + */ +typedef enum { + VIR_DOMAIN_BLOCK_COPY_SHALLOW = 1 << 0, /* Limit copy to top of source + backing chain */ + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT = 1 << 1, /* Reuse existing external + file for a copy */ +} virDomainBlockCopyFlags; + +int virDomainBlockCopy(virDomainPtr dom, const char *disk, const char *base, + const char *dest, const char *format, + unsigned long bandwidth, unsigned int flags); + /* Block I/O throttling support */ diff --git a/src/driver.h b/src/driver.h index 03d249b..37eecbc 100644 --- a/src/driver.h +++ b/src/driver.h @@ -788,6 +788,11 @@ typedef int (*virDrvDomainBlockRebase)(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags); +typedef int + (*virDrvDomainBlockCopy)(virDomainPtr dom, const char *path, + const char *base, const char *dest, + const char *format, unsigned long bandwidth, + unsigned int flags); typedef int (*virDrvSetKeepAlive)(virConnectPtr conn, @@ -1005,6 +1010,7 @@ struct _virDriver { virDrvDomainBlockJobSetSpeed domainBlockJobSetSpeed; virDrvDomainBlockPull domainBlockPull; virDrvDomainBlockRebase domainBlockRebase; + virDrvDomainBlockCopy domainBlockCopy; virDrvSetKeepAlive setKeepAlive; virDrvConnectIsAlive isAlive; virDrvNodeSuspendForDuration nodeSuspendForDuration; diff --git a/src/libvirt.c b/src/libvirt.c index 9212c08..a433c30 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -18253,7 +18253,10 @@ error: * return an error if bandwidth is not 0. * * When @base is NULL and @flags is 0, this is identical to - * virDomainBlockPull(). + * virDomainBlockPull(). Conversely, when @flags includes + * VIR_DOMAIN_BLOCK_REBASE_COPY, this is shorthand for + * virDomainBlockCopy(dom, disk, NULL, base, NULL, bandwidth, + * flags & ~VIR_DOMAIN_BLOCK_REBASE_COPY). * * Returns 0 if the operation has started, -1 on failure. 
*/ @@ -18263,7 +18266,7 @@ int virDomainBlockRebase(virDomainPtr dom, const char *disk, { virConnectPtr conn; - VIR_DOMAIN_DEBUG(dom, "disk=%s, base=%s bandwidth=%lu, flags=%x", + VIR_DOMAIN_DEBUG(dom, "disk=%s, base=%s, bandwidth=%lu, flags=%x", disk, NULLSTR(base), bandwidth, flags); virResetLastError(); @@ -18320,6 +18323,119 @@ error: /** + * virDomainBlockCopy: + * @dom: pointer to domain object + * @disk: path to the block device, or device shorthand + * @base: path to backing file to keep, or NULL for no backing file + * @dest: path to the copy destination + * @format: format of the destination + * @bandwidth: (optional) specify copy bandwidth limit in Mbps + * @flags: bitwise-OR of virDomainBlockCopyFlags + * + * Copy a portion of a backing chain to a new file @dest, where the copy + * will have @base as its backing file; or, if @base is NULL, then the + * copy will default to being flat, but if @flags includes + * VIR_DOMAIN_BLOCK_COPY_SHALLOW, then the copy will have the same backing + * file as the source. The destination file will have the format given by + * @format; if this is NULL, then the format will be the same as the + * top level of the source chain. By default, if @dest exists as a + * non-empty regular file, the copy is rejected to avoid losing content + * of that file. However, if @flags additionally includes + * VIR_DOMAIN_BLOCK_COPY_REUSE_EXT, then the destination file must + * already exist and contain content identical to the source file (this + * allows a management app to pre-create files with relative backing + * file names, rather than the default of creating with absolute backing + * file names). 
A copy job has two parts; in the first phase, the + * @bandwidth parameter affects how fast the source is pulled into the + * destination, and the job can only be canceled by reverting to the + * source file; progress in this phase can be tracked via the + * virDomainBlockJobInfo() command, with a job type of + * VIR_DOMAIN_BLOCK_JOB_TYPE_COPY. An asynchronous event is sent when + * this phase ends, at which point the job remains alive to indicate + * that both source and destination files contain mirrored contents, and + * the user must call virDomainBlockJobAbort() to end the mirroring while + * choosing whether to revert to source or pivot to the destination. + * Some hypervisors will restrict certain actions, such as virDomainSave() + * or virDomainDetachDevice(), while a copy job is active; they may + * also restrict a copy job to transient domains. + * + * The @disk parameter is either an unambiguous source name of the + * block device (the <source file='...'/> sub-element, such as + * "/path/to/image"), or the device target shorthand (the + * <target dev='...'/> sub-element, such as "xvda"). Valid names + * can be found by calling virDomainGetXMLDesc() and inspecting + * elements within //domain/devices/disk. + * + * The maximum bandwidth (in Mbps) that will be used to do the copy can be + * specified with the bandwidth parameter. If set to 0, libvirt will choose a + * suitable default. Some hypervisors do not support this feature and will + * return an error if bandwidth is not 0. + * + * When @base and @format are NULL, this is equivalent to calling + * virDomainBlockRebase() with the VIR_DOMAIN_BLOCK_REBASE_COPY flag. + * + * Returns 0 if the operation has started, -1 on failure. 
+ */ +int virDomainBlockCopy(virDomainPtr dom, const char *disk, + const char *base, const char *dest, + const char *format, unsigned long bandwidth, + unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(dom, "disk=%s, base=%s, dest=%s, format=%s, " + "bandwidth=%lu, flags=%x", disk, NULLSTR(base), + dest, NULLSTR(format), bandwidth, flags); + + virResetLastError(); + + if (!VIR_IS_CONNECTED_DOMAIN (dom)) { + virLibDomainError(VIR_ERR_INVALID_DOMAIN, __FUNCTION__); + virDispatchError(NULL); + return -1; + } + conn = dom->conn; + + if (dom->conn->flags & VIR_CONNECT_RO) { + virLibDomainError(VIR_ERR_OPERATION_DENIED, __FUNCTION__); + goto error; + } + + if (!disk) { + virLibDomainError(VIR_ERR_INVALID_ARG, + _("disk is NULL")); + goto error; + } + if (!dest) { + virLibDomainError(VIR_ERR_INVALID_ARG, + _("dest is NULL")); + goto error; + } + + if ((flags & VIR_DOMAIN_BLOCK_COPY_SHALLOW) && base) { + virLibDomainError(VIR_ERR_INVALID_ARG, + _("base not permitted when doing shallow copy")); + goto error; + } + + if (conn->driver->domainBlockCopy) { + int ret; + ret = conn->driver->domainBlockCopy(dom, disk, base, dest, format, + bandwidth, flags); + if (ret < 0) + goto error; + return ret; + } + + virLibDomainError(VIR_ERR_NO_SUPPORT, __FUNCTION__); + +error: + virDispatchError(dom->conn); + return -1; +} + + +/** * virDomainOpenGraphics: * @dom: pointer to domain object * @idx: index of graphics config to open diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 46c13fb..d152ab9 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -534,4 +534,9 @@ LIBVIRT_0.9.11 { virDomainPMWakeup; } LIBVIRT_0.9.10; +LIBVIRT_0.9.12 { + global: + virDomainBlockCopy; +} LIBVIRT_0.9.11; + # .... define new API here using predicted next version number .... -- 1.7.7.6
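The documented shorthand between virDomainBlockRebase() in copy mode and virDomainBlockCopy() amounts to an argument shuffle, which may be easier to review as code. This is a sketch under assumptions: the COPY_* values are taken from the virDomainBlockCopyFlags enum in this patch, but the REBASE_COPY bit value is a guess, and struct copy_args, rebase_as_copy() and copy_args_valid() are hypothetical names used only here.

```c
#include <stddef.h>

/* Flag values as declared for virDomainBlockCopyFlags in this patch. */
enum {
    COPY_SHALLOW   = 1 << 0,
    COPY_REUSE_EXT = 1 << 1,
};

/* Assumption: the rebase "copy" bit sits above the two shared bits. */
enum { REBASE_COPY = 1 << 2 };

/* Hypothetical bundle of the virDomainBlockCopy() parameters. */
struct copy_args {
    const char *disk;
    const char *base;
    const char *dest;
    const char *format;
    unsigned long bandwidth;
    unsigned int flags;
};

/* Translate a copy-mode rebase request into block copy arguments. */
struct copy_args
rebase_as_copy(const char *disk, const char *base,
               unsigned long bandwidth, unsigned int flags)
{
    struct copy_args args = {
        .disk = disk,
        .base = NULL,      /* rebase cannot name a backing file to keep */
        .dest = base,      /* rebase's @base is the copy destination */
        .format = NULL,    /* same format as the top of the source chain */
        .bandwidth = bandwidth,
        .flags = flags & ~(unsigned int) REBASE_COPY,
    };
    return args;
}

/* The argument checks virDomainBlockCopy() performs before dispatch. */
int
copy_args_valid(const struct copy_args *args)
{
    if (!args->disk || !args->dest)
        return 0;
    if ((args->flags & COPY_SHALLOW) && args->base)
        return 0;          /* shallow already implies the backing file */
    return 1;
}
```

The shorthand only works if the shallow and reuse-external bits have the same values in both flag sets, which is worth confirming during review.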

Expose the full abilities of virDomainBlockCopy. * tools/virsh.c (blockJobImpl): Support more options for block copy. * tools/virsh.pod (blockcopy): Document these. --- tools/virsh.c | 23 +++++++++++++++++++---- tools/virsh.pod | 10 ++++++---- 2 files changed, 25 insertions(+), 8 deletions(-) diff --git a/tools/virsh.c b/tools/virsh.c index d01d31d..8f8b3b7 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -7527,6 +7527,8 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, unsigned long bandwidth = 0; int ret = -1; const char *base = NULL; + const char *dest = NULL; + const char *format = NULL; unsigned int flags = 0; if (!vshConnectionUsability(ctl, ctl->conn)) @@ -7568,12 +7570,22 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, case VSH_CMD_BLOCK_JOB_COPY: flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; if (vshCommandOptBool(cmd, "shallow")) - flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + flags |= VIR_DOMAIN_BLOCK_COPY_SHALLOW; if (vshCommandOptBool(cmd, "reuse-external")) - flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; - if (vshCommandOptString(cmd, "dest", &base) < 0) + flags |= VIR_DOMAIN_BLOCK_COPY_REUSE_EXT; + if (vshCommandOptString(cmd, "base", &base) < 0) + goto cleanup; + if (vshCommandOptString(cmd, "dest", &dest) < 0) + goto cleanup; + if (vshCommandOptString(cmd, "format", &format) < 0) goto cleanup; - ret = virDomainBlockRebase(dom, path, base, bandwidth, flags); + if (!base && !format) { + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; + ret = virDomainBlockRebase(dom, path, dest, bandwidth, flags); + } else { + ret = virDomainBlockCopy(dom, path, base, dest, format, + bandwidth, flags); + } } cleanup: @@ -7596,6 +7608,9 @@ static const vshCmdOptDef opts_block_copy[] = { {"path", VSH_OT_DATA, VSH_OFLAG_REQ, N_("Fully-qualified path of disk")}, {"dest", VSH_OT_DATA, VSH_OFLAG_REQ, N_("path of the copy to create")}, {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, N_("Bandwidth limit in MB/s")}, + {"format", VSH_OT_DATA, VSH_OFLAG_NONE, N_("file format of dest")}, + 
{"base", VSH_OT_DATA, VSH_OFLAG_NONE, + N_("path of backing file in chain for a partial pull")}, {"shallow", VSH_OT_BOOL, 0, N_("make the copy share a backing chain")}, {"reuse-external", VSH_OT_BOOL, 0, N_("reuse existing destination")}, {NULL, 0, 0, NULL} diff --git a/tools/virsh.pod b/tools/virsh.pod index 112be22..d3be071 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -637,16 +637,18 @@ currently in use by a running domain. Other contexts that require a MAC address of virtual interface (such as I<detach-interface> or I<domif-setlink>) will accept the MAC address printed by this command. -=item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<--shallow>] -[I<--reuse-external>] +=item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<format>] +{ [I<--shallow>] | [I<base>] } [I<--reuse-external>] Copy a disk backing image chain to I<dest>. By default, this command flattens the entire chain; but if I<--shallow> is specified, the copy -shares the backing chain. +shares the backing chain, or I<base> can be used to determine the new +base file of the copy for a partial pull. If I<--reuse-external> is specified, then I<dest> must exist and have contents identical to I<disk> (typically used to set up a relative -backing file name). +backing file name). I<format> can be used to specify a different +file format on the copy as compared to the source. The copy runs in the background; initially, the job must copy all data from the source, and during this phase, the job can only be canceled to -- 1.7.7.6

Almost trivial; the trick was dealing with the fact that we're stuck with 'unsigned long bandwidth' due to earlier design decisions. * src/remote/remote_protocol.x (remote_domain_block_copy_args): New struct. * src/remote/remote_driver.c (remote_driver): Use it. * src/rpc/gendispatch.pl (name_to_ProcName): Cater to legacy bandwidth type. * src/remote_protocol-structs: Regenerate. --- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 13 ++++++++++++- src/remote_protocol-structs | 10 ++++++++++ src/rpc/gendispatch.pl | 1 + 4 files changed, 24 insertions(+), 1 deletions(-) diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index af46384..b431779 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -5095,6 +5095,7 @@ static virDriver remote_driver = { .domainBlockJobSetSpeed = remoteDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = remoteDomainBlockPull, /* 0.9.4 */ .domainBlockRebase = remoteDomainBlockRebase, /* 0.9.10 */ + .domainBlockCopy = remoteDomainBlockCopy, /* 0.9.12 */ .setKeepAlive = remoteSetKeepAlive, /* 0.9.8 */ .isAlive = remoteIsAlive, /* 0.9.8 */ .nodeSuspendForDuration = remoteNodeSuspendForDuration, /* 0.9.8 */ diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 2d57247..de73051 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -1188,6 +1188,15 @@ struct remote_domain_block_rebase_args { unsigned hyper bandwidth; unsigned int flags; }; +struct remote_domain_block_copy_args { + remote_nonnull_domain dom; + remote_nonnull_string path; + remote_string base; + remote_nonnull_string dest; + remote_string format; + unsigned hyper bandwidth; + unsigned int flags; +}; struct remote_domain_set_block_io_tune_args { remote_nonnull_domain dom; @@ -2782,7 +2791,9 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_PM_WAKEUP = 267, /* autogen autogen */ REMOTE_PROC_DOMAIN_EVENT_TRAY_CHANGE = 268, /* autogen autogen */ REMOTE_PROC_DOMAIN_EVENT_PMWAKEUP = 
269, /* autogen autogen */ - REMOTE_PROC_DOMAIN_EVENT_PMSUSPEND = 270 /* autogen autogen */ + REMOTE_PROC_DOMAIN_EVENT_PMSUSPEND = 270, /* autogen autogen */ + + REMOTE_PROC_DOMAIN_BLOCK_COPY = 271 /* autogen autogen */ /* * Notice how the entries are grouped in sets of 10 ? diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 9b2414f..c26cf2e 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -845,6 +845,15 @@ struct remote_domain_block_rebase_args { uint64_t bandwidth; u_int flags; }; +struct remote_domain_block_copy_args { + remote_nonnull_domain dom; + remote_nonnull_string path; + remote_string base; + remote_nonnull_string dest; + remote_string format; + uint64_t bandwidth; + u_int flags; +}; struct remote_domain_set_block_io_tune_args { remote_nonnull_domain dom; remote_nonnull_string disk; @@ -2192,4 +2201,5 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_EVENT_TRAY_CHANGE = 268, REMOTE_PROC_DOMAIN_EVENT_PMWAKEUP = 269, REMOTE_PROC_DOMAIN_EVENT_PMSUSPEND = 270, + REMOTE_PROC_DOMAIN_BLOCK_COPY = 271, }; diff --git a/src/rpc/gendispatch.pl b/src/rpc/gendispatch.pl index f161ee0..e5e28e0 100755 --- a/src/rpc/gendispatch.pl +++ b/src/rpc/gendispatch.pl @@ -232,6 +232,7 @@ my $long_legacy = { NodeGetInfo => { ret => { memory => 1 } }, DomainBlockPull => { arg => { bandwidth => 1 } }, DomainBlockRebase => { arg => { bandwidth => 1 } }, + DomainBlockCopy => { arg => { bandwidth => 1 } }, DomainBlockJobSetSpeed => { arg => { bandwidth => 1 } }, DomainMigrateGetMaxSpeed => { ret => { bandwidth => 1 } }, }; -- 1.7.7.6
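The "legacy bandwidth type" problem is that the public API takes unsigned long, which may be 32 bits wide, while the wire format carries an unsigned hyper (64 bits). A conversion from the wire value back to the API type therefore needs a range check. The helper below is a minimal sketch of such a check; hyper_to_ulong() is a hypothetical name, and this is not a claim about the code that gendispatch.pl actually generates.

```c
#include <limits.h>
#include <stdint.h>

/* Narrow a 64-bit wire value to unsigned long.
 * Returns 0 on success, -1 if the value does not fit (only possible
 * on platforms where unsigned long is narrower than 64 bits). */
int
hyper_to_ulong(uint64_t wire, unsigned long *out)
{
    if (wire > ULONG_MAX)
        return -1;
    *out = (unsigned long) wire;
    return 0;
}
```

Widening in the other direction, from unsigned long to the 64-bit wire field, is always lossless and needs no check.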

QUESTION: should we parse and ignore <mirror> on input, rather than rejecting it? By rejecting it, I can't add a unit test, since the unit test framework currently doesn't expose a way to trigger internal parsing. In order to track a block copy job across libvirtd restarts, we need to save internal XML that tracks the name of the file holding the mirror. Displaying this name in dumpxml might also be useful to the user, even if we don't yet have a way to (re-) start a domain with mirroring enabled up front. This is done with a new <mirror> sub-element to <disk>, as in: <disk type='file' device='disk'> <driver name='qemu' type='raw'/> <source file='/var/lib/libvirt/images/original.img'/> <mirror file='/var/lib/libvirt/images/copy.img' format='qcow2'/> ... </disk> Internally, an additional attribute is used to track the state of the job; this attribute does not need to be part of the RNG since it is not exposed to the user. * docs/schemas/domaincommon.rng (diskspec): Add diskMirror. * docs/formatdomain.html.in (elementsDisks): Document it. * src/conf/domain_conf.h (_virDomainDiskDef): New members. * src/conf/domain_conf.c (virDomainDiskDefFree): Clean them. (virDomainDiskDefParseXML): Parse them, but only internally. (virDomainDiskDefFormat): Output them, partially internally. 
--- docs/formatdomain.html.in | 11 +++++++ docs/schemas/domaincommon.rng | 19 ++++++++++-- src/conf/domain_conf.c | 65 +++++++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 13 ++++++++ 4 files changed, 105 insertions(+), 3 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index a382d30..534c44b 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -1296,6 +1296,17 @@ </table> <span class="since">Since 0.9.7</span> </dd> + <dt><code>mirror</code></dt> + <dd> + This element is present if the hypervisor has started a block + copy operation (via the <code>virDomainBlockCopy</code> API), + where the mirror location in attribute <code>file</code> will + eventually have the same contents as the source, and with the + file format in attribute <code>format</code> (which might + differ from the format of the source). For now, this element + is only valid in output; it is rejected on + input. <span class="since">Since 0.9.12</span> + </dd> <dt><code>target</code></dt> <dd>The <code>target</code> element controls the bus / device under which the disk is exposed to the guest diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 0cc04af..66c91a2 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -772,6 +772,9 @@ <ref name="driver"/> </optional> <optional> + <ref name='diskMirror'/> + </optional> + <optional> <ref name="diskAuth"/> </optional> <ref name="target"/> @@ -1013,9 +1016,7 @@ </element> </define> <!-- - Disk may use a special driver for access. Currently this is - only defined for Xen for tap/aio and file, but will certainly be - extended in the future, and libvirt doesn't look for specific values. + Disk may use a special driver for access. 
--> <define name="driver"> <element name="driver"> @@ -3024,6 +3025,18 @@ <empty/> </element> </define> + <define name='diskMirror'> + <element name='mirror'> + <attribute name='file'> + <ref name='absFilePath'/> + </attribute> + <optional> + <attribute name='format'> + <ref name="genericName"/> + </attribute> + </optional> + </element> + </define> <define name="diskAuth"> <element name="auth"> <attribute name="username"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index cca757d..83b9655 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -640,6 +640,11 @@ VIR_ENUM_IMPL(virDomainDiskTray, VIR_DOMAIN_DISK_TRAY_LAST, "closed", "open"); +VIR_ENUM_IMPL(virDomainDiskMirrorStage, VIR_DOMAIN_DISK_MIRROR_STAGE_LAST, + "error", + "pulling", + "mirroring"); + #define virDomainReportError(code, ...) \ virReportErrorHelper(VIR_FROM_DOMAIN, code, __FILE__, \ __FUNCTION__, __LINE__, __VA_ARGS__) @@ -933,6 +938,8 @@ void virDomainDiskDefFree(virDomainDiskDefPtr def) VIR_FREE(def->dst); VIR_FREE(def->driverName); VIR_FREE(def->driverType); + VIR_FREE(def->mirror); + VIR_FREE(def->mirrorFormat); VIR_FREE(def->auth.username); if (def->auth.secretType == VIR_DOMAIN_DISK_SECRET_TYPE_USAGE) VIR_FREE(def->auth.secret.usage); @@ -3318,6 +3325,9 @@ virDomainDiskDefParseXML(virCapsPtr caps, char *ioeventfd = NULL; char *event_idx = NULL; char *copy_on_read = NULL; + char *mirror = NULL; + char *mirrorFormat = NULL; + char *mirrorStage = NULL; char *devaddr = NULL; virStorageEncryptionPtr encryption = NULL; char *serial = NULL; @@ -3453,6 +3463,23 @@ virDomainDiskDefParseXML(virCapsPtr caps, ioeventfd = virXMLPropString(cur, "ioeventfd"); event_idx = virXMLPropString(cur, "event_idx"); copy_on_read = virXMLPropString(cur, "copy_on_read"); + } else if ((mirror == NULL) && + (xmlStrEqual(cur->name, BAD_CAST "mirror"))) { + if (flags & VIR_DOMAIN_XML_INTERNAL_STATUS) { + mirror = virXMLPropString(cur, "file"); + if (!mirror) { + 
virDomainReportError(VIR_ERR_XML_ERROR, "%s", + _("mirror requires file name")); + goto error; + } + mirrorFormat = virXMLPropString(cur, "format"); + mirrorStage = virXMLPropString(cur, "stage"); + } else { + virDomainReportError(VIR_ERR_XML_ERROR, "%s", + _("Cannot handle disk mirror on " + "input yet")); + goto error; + } } else if (xmlStrEqual(cur->name, BAD_CAST "auth")) { authUsername = virXMLPropString(cur, "username"); if (authUsername == NULL) { @@ -3867,6 +3894,19 @@ virDomainDiskDefParseXML(virCapsPtr caps, driverName = NULL; def->driverType = driverType; driverType = NULL; + def->mirror = mirror; + mirror = NULL; + def->mirrorFormat = mirrorFormat; + mirrorFormat = NULL; + if (mirrorStage) { + int stage = virDomainDiskMirrorStageTypeFromString(mirrorStage); + if (stage < 0) { + virDomainReportError(VIR_ERR_INTERNAL_ERROR, + _("Unknown mirror stage '%s'"), mirrorStage); + goto cleanup; + } + def->mirrorStage = stage; + } def->encryption = encryption; encryption = NULL; def->serial = serial; @@ -3882,6 +3922,12 @@ virDomainDiskDefParseXML(virCapsPtr caps, !(def->driverName = strdup(caps->defaultDiskDriverName))) goto no_memory; + + if (def->mirror && !def->mirrorFormat && + caps->defaultDiskDriverType && + !(def->mirrorFormat = strdup(caps->defaultDiskDriverType))) + goto no_memory; + if (def->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && virDomainDiskDefAssignAddress(caps, def) < 0) goto error; @@ -3906,6 +3952,9 @@ cleanup: VIR_FREE(authUsage); VIR_FREE(driverType); VIR_FREE(driverName); + VIR_FREE(mirror); + VIR_FREE(mirrorFormat); + VIR_FREE(mirrorStage); VIR_FREE(cachetag); VIR_FREE(error_policy); VIR_FREE(rerror_policy); @@ -10828,6 +10877,22 @@ virDomainDiskDefFormat(virBufferPtr buf, } } + /* For now, mirroring is currently output-only: we always output + * it, but refuse to parse it on input except for internal parse + * on libvirtd restart. Mirror stage is internal use only. 
*/ + if (def->mirror) { + virBufferEscapeString(buf, " <mirror file='%s'", def->mirror); + if (def->mirrorFormat) + virBufferAsprintf(buf, " format='%s'", def->mirrorFormat); + if (flags & VIR_DOMAIN_XML_INTERNAL_STATUS) { + const char *stage; + + stage = virDomainDiskMirrorStageTypeToString(def->mirrorStage); + virBufferEscapeString(buf, " stage='%s'", stage); + } + virBufferAddLit(buf, "/>\n"); + } + virBufferAsprintf(buf, " <target dev='%s' bus='%s'", def->dst, bus); if ((def->device == VIR_DOMAIN_DISK_DEVICE_FLOPPY || diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0eed60e..d4b0338 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -540,6 +540,14 @@ struct _virDomainBlockIoTuneInfo { }; typedef virDomainBlockIoTuneInfo *virDomainBlockIoTuneInfoPtr; +typedef enum { + VIR_DOMAIN_DISK_MIRROR_STAGE_ERROR, + VIR_DOMAIN_DISK_MIRROR_STAGE_PULLING, + VIR_DOMAIN_DISK_MIRROR_STAGE_MIRRORING, + + VIR_DOMAIN_DISK_MIRROR_STAGE_LAST, +} virDomainDiskMirrorStage; + /* Stores the virtual disk configuration */ struct _virDomainDiskDef { int type; @@ -563,6 +571,10 @@ struct _virDomainDiskDef { char *driverName; char *driverType; + char *mirror; + char *mirrorFormat; + int mirrorStage; /* enum virDomainDiskMirrorStage */ + virDomainBlockIoTuneInfo blkdeviotune; char *serial; @@ -2125,6 +2137,7 @@ VIR_ENUM_DECL(virDomainDiskIo) VIR_ENUM_DECL(virDomainDiskSecretType) VIR_ENUM_DECL(virDomainDiskSnapshot) VIR_ENUM_DECL(virDomainDiskTray) +VIR_ENUM_DECL(virDomainDiskMirrorStage) VIR_ENUM_DECL(virDomainIoEventFd) VIR_ENUM_DECL(virDomainVirtioEventIdx) VIR_ENUM_DECL(virDomainDiskCopyOnRead) -- 1.7.7.6
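To make the intended output of the <mirror> formatting concrete, here is a standalone sketch that produces the self-closing element shown in the commit message, with the stage attribute emitted only for the internal status XML. It uses plain snprintf() instead of virBuffer, does no XML escaping (an assumption that the inputs need none), and format_mirror() together with the STAGE_* names are hypothetical stand-ins for the libvirt symbols.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for virDomainDiskMirrorStage and its string table. */
enum mirror_stage { STAGE_ERROR, STAGE_PULLING, STAGE_MIRRORING, STAGE_LAST };

static const char *const stage_names[STAGE_LAST] = {
    "error", "pulling", "mirroring",
};

/* Format a <mirror> element into buf.
 * Returns 0 on success, -1 if a buffer is too small or stage is invalid. */
int
format_mirror(char *buf, size_t len, const char *file, const char *format,
              int internal_status, int stage)
{
    char fmt_attr[64] = "";
    char stage_attr[32] = "";
    int n;

    if (stage < 0 || stage >= STAGE_LAST)
        return -1;

    /* The format attribute is optional. */
    if (format) {
        n = snprintf(fmt_attr, sizeof(fmt_attr), " format='%s'", format);
        if (n < 0 || (size_t) n >= sizeof(fmt_attr))
            return -1;
    }

    /* The stage is internal bookkeeping, hidden from user-visible XML. */
    if (internal_status)
        snprintf(stage_attr, sizeof(stage_attr), " stage='%s'",
                 stage_names[stage]);

    n = snprintf(buf, len, "<mirror file='%s'%s%s/>",
                 file, fmt_attr, stage_attr);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```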

For now, disk migration via block copy job is not implemented. But when we do implement it, we have to deal with the fact that qemu does not provide an easy way to re-start a qemu process with mirroring still intact (it _might_ be possible by using qemu -S then an initial 'drive-mirror' with disk reuse before starting the domain, but that gets hairy). Even something like 'virDomainSave' becomes hairy, if you realize the implications that 'virDomainRestore' would be stuck with recreating the same mirror layout.

But if we step back and look at the bigger picture, we realize that the initial client of live storage migration via disk mirroring is oVirt, which always uses transient domains, and that if a transient domain is destroyed while a mirror exists, oVirt can easily restart the storage migration by creating a new domain that visits just the source storage, with no loss in data.

We can make life a lot easier by being cowards, and forbidding certain operations on a domain. This patch guarantees that we never get in a state where we would have to restart a domain with a mirroring block copy, by preventing saves, snapshots, and hot unplug of a disk in use.

* src/conf/domain_conf.h (virDomainHasDiskMirror): New prototype.
* src/conf/domain_conf.c (virDomainHasDiskMirror): New function.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainBlockJobImpl): Prevent dangerous actions while block copy is already in action.
* src/qemu/qemu_hotplug.c (qemuDomainDetachDiskDevice): Likewise.
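The guard this patch describes can be sketched in isolation. The block below is a minimal illustration, not libvirt code: the `sketch_*` structs and functions are hypothetical stand-ins for virDomainDiskDef, virDomainDef and the checks added to the qemu driver, and they keep only the fields needed to show the idea.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for virDomainDiskDef / virDomainDef. */
struct sketch_disk {
    const char *dst;     /* target name, e.g. "vda" */
    const char *mirror;  /* destination of an active block copy, or NULL */
};

struct sketch_domain {
    struct sketch_disk *disks;
    size_t ndisks;
};

/* Same idea as virDomainHasDiskMirror(): true if any disk has a mirror. */
static bool
sketch_has_disk_mirror(const struct sketch_domain *dom)
{
    size_t i;

    for (i = 0; i < dom->ndisks; i++) {
        if (dom->disks[i].mirror)
            return true;
    }
    return false;
}

/* Whole-domain operations (save, snapshot, revert) are refused while any
 * disk is being mirrored.  Returns 0 if allowed, -1 if refused. */
static int
sketch_check_domain_op(const struct sketch_domain *dom)
{
    return sketch_has_disk_mirror(dom) ? -1 : 0;
}

/* Per-disk operations (detach, a second block job) only look at the one
 * disk involved.  Returns 0 if allowed, -1 if refused or out of range. */
static int
sketch_check_disk_op(const struct sketch_domain *dom, size_t idx)
{
    if (idx >= dom->ndisks)
        return -1;
    return dom->disks[idx].mirror ? -1 : 0;
}
```

The split between a domain-wide check and a per-disk check mirrors the patch: save and snapshot paths call virDomainHasDiskMirror(), while detach and block pull inspect only the affected disk's mirror member.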
--- src/conf/domain_conf.c | 12 ++++++++++++ src/conf/domain_conf.h | 1 + src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 25 ++++++++++++++++++++++++- src/qemu/qemu_hotplug.c | 7 +++++++ 5 files changed, 45 insertions(+), 1 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 83b9655..df1c27a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -7199,6 +7199,18 @@ virDomainDiskRemoveByName(virDomainDefPtr def, const char *name) return virDomainDiskRemove(def, i); } +/* Return true if VM has at least one disk involved in a current block + * copy job (that is, with a <mirror> element in the disk xml). */ +bool +virDomainHasDiskMirror(virDomainObjPtr vm) +{ + int i; + for (i = 0; i < vm->def->ndisks; i++) + if (vm->def->disks[i]->mirror) + return true; + return false; +} + int virDomainNetInsert(virDomainDefPtr def, virDomainNetDefPtr net) { if (VIR_REALLOC_N(def->nets, def->nnets + 1) < 0) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index d4b0338..0c03e1c 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1962,6 +1962,7 @@ virDomainDiskDefPtr virDomainDiskRemove(virDomainDefPtr def, size_t i); virDomainDiskDefPtr virDomainDiskRemoveByName(virDomainDefPtr def, const char *name); +bool virDomainHasDiskMirror(virDomainObjPtr vm); int virDomainNetIndexByMac(virDomainDefPtr def, const unsigned char *mac); int virDomainNetInsert(virDomainDefPtr def, virDomainNetDefPtr net); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a90f8a0..570940d 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -354,6 +354,7 @@ virDomainGraphicsSpiceZlibCompressionTypeFromString; virDomainGraphicsSpiceZlibCompressionTypeToString; virDomainGraphicsTypeFromString; virDomainGraphicsTypeToString; +virDomainHasDiskMirror; virDomainHostdevDefAlloc; virDomainHostdevDefClear; virDomainHostdevDefFree; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 
f5b3406..53189b5 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2558,6 +2558,11 @@ qemuDomainSaveInternal(struct qemud_driver *driver, virDomainPtr dom, "%s", _("domain is marked for auto destroy")); goto cleanup; } + if (virDomainHasDiskMirror(vm)) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", + _("domain has active block copy job")); + goto cleanup; + } memset(&header, 0, sizeof(header)); memcpy(header.magic, QEMUD_SAVE_PARTIAL, sizeof(header.magic)); @@ -10264,6 +10269,12 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, "%s", _("domain is marked for auto destroy")); goto cleanup; } + if (virDomainHasDiskMirror(vm)) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", + _("domain has active block copy job")); + goto cleanup; + } + if (!vm->persistent && (flags & VIR_DOMAIN_SNAPSHOT_CREATE_HALT)) { qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cannot halt after transient domain snapshot")); @@ -10871,6 +10882,11 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, _("no domain with matching uuid '%s'"), uuidstr); goto cleanup; } + if (virDomainHasDiskMirror(vm)) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", + _("domain has active block copy job")); + goto cleanup; + } snap = virDomainSnapshotFindByName(&vm->snapshots, snapshot->name); if (!snap) { @@ -11609,6 +11625,7 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, char uuidstr[VIR_UUID_STRING_BUFLEN]; char *device = NULL; int ret = -1; + int idx; qemuDriverLock(driver); virUUIDFormat(dom->uuid, uuidstr); @@ -11619,10 +11636,16 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, goto cleanup; } - device = qemuDiskPathToAlias(vm, path, NULL); + device = qemuDiskPathToAlias(vm, path, &idx); if (!device) { goto cleanup; } + if (mode == BLOCK_JOB_PULL && vm->def->disks[idx]->mirror) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' already in active block copy job"), + 
vm->def->disks[idx]->dst); + goto cleanup; + } if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0) goto cleanup; diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 857b980..98fa8f8 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1721,6 +1721,13 @@ int qemuDomainDetachDiskDevice(struct qemud_driver *driver, detach = vm->def->disks[i]; + if (detach->mirror) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' is in an active block copy job"), + detach->dst); + goto cleanup; + } + if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { qemuReportError(VIR_ERR_INTERNAL_ERROR, -- 1.7.7.6

The new block copy storage migration sequence requires both the 'drive-mirror' action in 'transaction' (present if the 'drive-mirror' standalone monitor command also exists) and the 'drive-reopen' monitor command (it would be nice if that were also part of a 'transaction', but the initial qemu implementation has it standalone only).

As of this[1] qemu email, both commands have been proposed but not yet incorporated into the tree, so there is a risk that qemu 1.1 will not have these commands, or will have something subtly different.

[1] https://lists.gnu.org/archive/html/qemu-devel/2012-03/msg01524.html

* src/qemu/qemu_capabilities.h (QEMU_CAPS_DRIVE_MIRROR)
(QEMU_CAPS_DRIVE_REOPEN): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set them.
(qemuMonitorJSONDriveMirror, qemuMonitorJSONDriveReopen): New functions.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror)
(qemuMonitorJSONDriveReopen): Declare them.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror)
(qemuMonitorDriveReopen): New passthroughs.
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror)
(qemuMonitorDriveReopen): Declare them.
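As a rough illustration of the capability probing described here, the sketch below maps monitor command names onto capability bits and then checks that both new commands are present. Everything prefixed `sketch_` or `SKETCH_` is hypothetical; the real bits are the QEMU_CAPS_* values in enum qemuCapsFlags, and the real probing happens in qemuMonitorJSONCheckCommands.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit values for illustration only. */
enum sketch_caps {
    SKETCH_CAPS_TRANSACTION  = 1 << 0,
    SKETCH_CAPS_DRIVE_MIRROR = 1 << 1,
    SKETCH_CAPS_DRIVE_REOPEN = 1 << 2
};

/* Map the command names reported by the monitor onto capability bits.
 * A name matches at most one command, hence the else-if chain. */
static unsigned int
sketch_probe_caps(const char *const *names, size_t n)
{
    unsigned int caps = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(names[i], "transaction") == 0)
            caps |= SKETCH_CAPS_TRANSACTION;
        else if (strcmp(names[i], "drive-mirror") == 0)
            caps |= SKETCH_CAPS_DRIVE_MIRROR;
        else if (strcmp(names[i], "drive-reopen") == 0)
            caps |= SKETCH_CAPS_DRIVE_REOPEN;
    }
    return caps;
}

/* Block copy needs both new commands; either one alone is not enough. */
static int
sketch_block_copy_supported(unsigned int caps)
{
    const unsigned int needed =
        SKETCH_CAPS_DRIVE_MIRROR | SKETCH_CAPS_DRIVE_REOPEN;

    return (caps & needed) == needed;
}
```

Requiring both bits matches the commit message: a qemu that ships only one of the two commands, or neither, cannot complete the copy sequence.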
--- src/qemu/qemu_capabilities.c | 3 ++ src/qemu/qemu_capabilities.h | 2 + src/qemu/qemu_monitor.c | 50 ++++++++++++++++++++++++++++ src/qemu/qemu_monitor.h | 23 +++++++++++++ src/qemu/qemu_monitor_json.c | 74 +++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_monitor_json.h | 21 +++++++++++- 6 files changed, 167 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 0e09d6d..1938ae4 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -156,6 +156,9 @@ VIR_ENUM_IMPL(qemuCaps, QEMU_CAPS_LAST, "scsi-disk.channel", "scsi-block", "transaction", + + "drive-mirror", /* 90 */ + "drive-reopen", ); struct qemu_feature_flags { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 78cdbe0..405bf2a 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -124,6 +124,8 @@ enum qemuCapsFlags { QEMU_CAPS_SCSI_DISK_CHANNEL = 87, /* Is scsi-disk.channel available? */ QEMU_CAPS_SCSI_BLOCK = 88, /* -device scsi-block */ QEMU_CAPS_TRANSACTION = 89, /* transaction monitor command */ + QEMU_CAPS_DRIVE_MIRROR = 90, /* drive-mirror monitor command */ + QEMU_CAPS_DRIVE_REOPEN = 91, /* drive-reopen monitor command */ QEMU_CAPS_LAST, /* this must always be the last item */ }; diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index e1a8d4c..f33bed8 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -2685,6 +2685,32 @@ qemuMonitorDiskSnapshot(qemuMonitorPtr mon, virJSONValuePtr actions, return ret; } +/* Add the drive-mirror action to a transaction. 
*/ +int +qemuMonitorDriveMirror(qemuMonitorPtr mon, virJSONValuePtr actions, + const char *device, const char *file, + const char *format, int mode) +{ + int ret = -1; + + VIR_DEBUG("mon=%p, actions=%p, device=%s, file=%s, format=%s, mode=%o", + mon, actions, device, file, format, mode); + + if (!mon) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("monitor must not be NULL")); + return -1; + } + + if (mon->json) + ret = qemuMonitorJSONDriveMirror(mon, actions, device, file, format, + mode); + else + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("drive-mirror requires JSON monitor")); + return ret; +} + /* Use the transaction QMP command to run atomic snapshot commands. */ int qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) @@ -2701,6 +2727,30 @@ qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) return ret; } +/* Use the drive-reopen monitor command. */ +int +qemuMonitorDriveReopen(qemuMonitorPtr mon, const char *device, + const char *file, const char *format) +{ + int ret = -1; + + VIR_DEBUG("mon=%p, device=%s, file=%s, format=%s", + mon, device, file, format); + + if (!mon) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("monitor must not be NULL")); + return -1; + } + + if (mon->json) + ret = qemuMonitorJSONDriveReopen(mon, device, file, format); + else + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("drive-reopen requires JSON monitor")); + return ret; +} + int qemuMonitorArbitraryCommand(qemuMonitorPtr mon, const char *cmd, char **reply, diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index b480966..9fb2b49 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -508,8 +508,31 @@ int qemuMonitorDiskSnapshot(qemuMonitorPtr mon, const char *file, const char *format, bool reuse); + +typedef enum { + QEMU_MONITOR_DRIVE_MIRROR_ABSOLUTE, + QEMU_MONITOR_DRIVE_MIRROR_EXISTING, + QEMU_MONITOR_DRIVE_MIRROR_NO_BACKING, + + QEMU_MONITOR_DRIVE_MIRROR_LAST +} qemuMonitorDriveMirrorMode; + +int 
qemuMonitorDriveMirror(qemuMonitorPtr mon, + virJSONValuePtr actions, + const char *device, + const char *file, + const char *format, + int mode) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); int qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); +int qemuMonitorDriveReopen(qemuMonitorPtr mon, + const char *device, + const char *file, + const char *format) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4); int qemuMonitorArbitraryCommand(qemuMonitorPtr mon, const char *cmd, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 4ec7832..3ddeaa3 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -967,12 +967,14 @@ qemuMonitorJSONCheckCommands(qemuMonitorPtr mon, if (STREQ(name, "human-monitor-command")) *json_hmp = 1; - - if (STREQ(name, "system_wakeup")) + else if (STREQ(name, "system_wakeup")) qemuCapsSet(qemuCaps, QEMU_CAPS_WAKEUP); - - if (STREQ(name, "transaction")) + else if (STREQ(name, "transaction")) qemuCapsSet(qemuCaps, QEMU_CAPS_TRANSACTION); + else if (STREQ(name, "drive-mirror")) + qemuCapsSet(qemuCaps, QEMU_CAPS_DRIVE_MIRROR); + else if (STREQ(name, "drive-reopen")) + qemuCapsSet(qemuCaps, QEMU_CAPS_DRIVE_REOPEN); } ret = 0; @@ -3185,6 +3187,43 @@ cleanup: return ret; } +VIR_ENUM_DECL(qemuMonitorDriveMirror) +VIR_ENUM_IMPL(qemuMonitorDriveMirror, QEMU_MONITOR_DRIVE_MIRROR_LAST, + "absolute-paths", "existing", "no-backing-file"); + +int +qemuMonitorJSONDriveMirror(qemuMonitorPtr mon ATTRIBUTE_UNUSED, + virJSONValuePtr actions, + const char *device, const char *file, + const char *format, int mode) +{ + int ret = -1; + virJSONValuePtr cmd; + + cmd = qemuMonitorJSONMakeCommandRaw(true, + "drive-mirror", + "s:device", device, + "s:target", file, + "s:format", format, + "s:mode", + qemuMonitorDriveMirrorTypeToString(mode), + NULL); + if (!cmd) 
+ return -1; + + if (virJSONValueArrayAppend(actions, cmd) < 0) { + virReportOOMError(); + goto cleanup; + } + + cmd = NULL; + ret = 0; + +cleanup: + virJSONValueFree(cmd); + return ret; +} + /* Note that this call frees actions regardless of whether the call * succeeds. */ int @@ -3213,6 +3252,33 @@ cleanup: return ret; } +int +qemuMonitorJSONDriveReopen(qemuMonitorPtr mon, const char *device, + const char *file, const char *format) +{ + int ret; + virJSONValuePtr cmd; + virJSONValuePtr reply = NULL; + + cmd = qemuMonitorJSONMakeCommand("drive-reopen", + "s:device", device, + "s:new-image-file", file, + "s:format", format, + NULL); + if (!cmd) + return -1; + + if ((ret = qemuMonitorJSONCommand(mon, cmd, &reply)) < 0) + goto cleanup; + + ret = qemuMonitorJSONCheckError(cmd, reply); + +cleanup: + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; +} + int qemuMonitorJSONArbitraryCommand(qemuMonitorPtr mon, const char *cmd_str, char **reply_str, diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index a0f67aa..ce23bc1 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -230,8 +230,25 @@ int qemuMonitorJSONDiskSnapshot(qemuMonitorPtr mon, const char *device, const char *file, const char *format, - bool reuse); -int qemuMonitorJSONTransaction(qemuMonitorPtr mon, virJSONValuePtr actions); + bool reuse) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); +int qemuMonitorJSONDriveMirror(qemuMonitorPtr mon, + virJSONValuePtr actions, + const char *device, + const char *file, + const char *format, + int mode) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); +int qemuMonitorJSONTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); +int qemuMonitorJSONDriveReopen(qemuMonitorPtr mon, + const char *device, + const char *file, + const char *format) + ATTRIBUTE_NONNULL(1) 
ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4); int qemuMonitorJSONArbitraryCommand(qemuMonitorPtr mon, const char *cmd_str, -- 1.7.7.6

Implementing the VIR_DOMAIN_BLOCK_REBASE_SHALLOW flag will require knowing what the immediate backing file is. As long as we don't track the entire backing chain in <domain> XML, the next best thing we can do is to ask qemu.

* src/qemu/qemu_conf.h (qemuDomainDiskInfo): Add member.
* src/qemu/qemu_monitor.c (qemuMonitorGetBlockInfo): Free it, via...
(qemuMonitorBlockInfoFree): ...new helper.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBlockInfo): Populate it.
--- src/qemu/qemu_conf.h | 1 + src/qemu/qemu_monitor.c | 10 +++++++++- src/qemu/qemu_monitor_json.c | 12 ++++++++++++ 3 files changed, 22 insertions(+), 1 deletions(-) diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index 482e6d3..a8aefaf 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -182,6 +182,7 @@ struct qemuDomainDiskInfo { bool locked; bool tray_open; int io_status; + char *backing; }; typedef virDomainObjPtr (*qemuDriverCloseCallback)(struct qemud_driver *driver, diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index f33bed8..2f5be2e 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -1358,6 +1358,14 @@ qemuMonitorBlockIOStatusToError(const char *status) return -1; } +static void +qemuMonitorBlockInfoFree(void *payload, const void *name ATTRIBUTE_UNUSED) +{ + struct qemuDomainDiskInfo *info = payload; + VIR_FREE(info->backing); + VIR_FREE(info); +} + virHashTablePtr qemuMonitorGetBlockInfo(qemuMonitorPtr mon) { @@ -1372,7 +1380,7 @@ qemuMonitorGetBlockInfo(qemuMonitorPtr mon) return NULL; } - if (!(table = virHashCreate(32, (virHashDataFree) free))) + if (!(table = virHashCreate(32, qemuMonitorBlockInfoFree))) return NULL; if (mon->json) diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 3ddeaa3..d24d7e7 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -1532,6 +1532,8 @@ int qemuMonitorJSONGetBlockInfo(qemuMonitorPtr mon, struct qemuDomainDiskInfo *info; const char *thisdev; const 
char *status; + virJSONValuePtr inserted; + const char *backing = NULL; if (!dev || dev->type != VIR_JSON_TYPE_OBJECT) { qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -1584,6 +1586,16 @@ int qemuMonitorJSONGetBlockInfo(qemuMonitorPtr mon, if (info->io_status < 0) goto cleanup; } + + /* Missing inserted, or inserted with missing backing_file, + * indicates no backing file. */ + inserted = virJSONValueObjectGet(dev, "inserted"); + if (inserted) + backing = virJSONValueObjectGetString(inserted, "backing_file"); + if (backing && !(info->backing = strdup(backing))) { + virReportOOMError(); + goto cleanup; + } } ret = 0; -- 1.7.7.6

During a block copy, we want to tweak the events that get output:

- a successful block pull means we have transitioned to mirroring
- a failed block pull affects what the next abort will do
- a canceled block pull must not generate an event yet (that event gets delayed until after the drive-reopen)

We also want to output a job info that indicates whether the job has transitioned to mirroring, even though qemu's 'query-block-jobs' quits giving information at that point.

Of course, this patch does nothing until a later patch actually allows the creation of a block copy job.

* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): tweak event to reflect transition to mirroring
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): provide info after event
--- src/qemu/qemu_driver.c | 25 ++++++++++++++++++++++++- src/qemu/qemu_process.c | 19 +++++++++++++++++++ 2 files changed, 43 insertions(+), 1 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 53189b5..1664e14 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11666,12 +11666,35 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, ret = qemuMonitorBlockJob(priv->mon, device, base, bandwidth, info, mode); qemuDomainObjExitMonitorWithDriver(driver, vm); + if (ret < 0) + goto cleanup; + + /* A block copy operation must provide info back to the user, even + * when it has transitioned to the mirroring stage. 
*/ + if (mode == BLOCK_JOB_INFO && vm->def->disks[idx]->mirror) { + if (!vm->def->disks[idx]->mirrorStage) { + qemuReportError(VIR_ERR_OPERATION_FAILED, _("copy to '%s' failed"), + vm->def->disks[idx]->mirror); + ret = -1; + goto cleanup; + } + if (ret == 0) { + vm->def->disks[idx]->mirrorStage = + VIR_DOMAIN_DISK_MIRROR_STAGE_MIRRORING; + info->bandwidth = 0; + info->cur = 1; + info->end = 1; + } + info->type = VIR_DOMAIN_BLOCK_JOB_TYPE_COPY; + ret = 1; + } + /* Qemu provides asynchronous block job cancellation, but without * the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag libvirt guarantees a * synchronous operation. Provide this behavior by waiting here, * so we don't get confused by newly scheduled block jobs. */ - if (ret == 0 && mode == BLOCK_JOB_ABORT && + if (mode == BLOCK_JOB_ABORT && !(flags & VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC)) { ret = 1; while (1) { diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 9eed160..bd020b6 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -903,9 +903,28 @@ qemuProcessHandleBlockJob(qemuMonitorPtr mon ATTRIBUTE_UNUSED, if (disk) { path = disk->src; + if (disk->mirror && type == VIR_DOMAIN_BLOCK_JOB_TYPE_PULL) { + type = VIR_DOMAIN_BLOCK_JOB_TYPE_COPY; + switch ((virConnectDomainEventBlockJobStatus) status) { + case VIR_DOMAIN_BLOCK_JOB_COMPLETED: + status = VIR_DOMAIN_BLOCK_JOB_MIRRORING; + disk->mirrorStage = VIR_DOMAIN_DISK_MIRROR_STAGE_MIRRORING; + break; + case VIR_DOMAIN_BLOCK_JOB_FAILED: + disk->mirrorStage = VIR_DOMAIN_DISK_MIRROR_STAGE_ERROR; + break; + case VIR_DOMAIN_BLOCK_JOB_CANCELED: + goto cleanup; + case VIR_DOMAIN_BLOCK_JOB_MIRRORING: + case VIR_DOMAIN_BLOCK_JOB_LAST: + VIR_DEBUG("should not reach here"); + goto cleanup; + } + } event = virDomainEventBlockJobNewFromObj(vm, path, type, status); } +cleanup: virDomainObjUnlock(vm); if (event) { -- 1.7.7.6
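The event handling in this patch amounts to a small translation step applied to pull events on a disk that has a mirror. The sketch below restates that logic on its own, assuming hypothetical `sketch_*` enums in place of virConnectDomainEventBlockJobStatus and virDomainDiskMirrorStage; it is an illustration of the rules listed in the commit message, not the libvirt implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the libvirt status and stage enums. */
enum sketch_status {
    SKETCH_COMPLETED,
    SKETCH_FAILED,
    SKETCH_CANCELED,
    SKETCH_MIRRORING
};

enum sketch_stage {
    SKETCH_STAGE_ERROR,
    SKETCH_STAGE_PULLING,
    SKETCH_STAGE_MIRRORING
};

/* Translate a block pull event for a disk that is part of a block copy.
 * Returns true if an event should be delivered to the client, in which
 * case *status holds the status to report.  *stage is updated to record
 * what a later abort or job-info query should see. */
static bool
sketch_translate_event(enum sketch_status *status, enum sketch_stage *stage)
{
    switch (*status) {
    case SKETCH_COMPLETED:
        /* The pull finished: the copy is now in its mirroring phase. */
        *status = SKETCH_MIRRORING;
        *stage = SKETCH_STAGE_MIRRORING;
        return true;
    case SKETCH_FAILED:
        /* Report the failure and remember it for the next abort. */
        *stage = SKETCH_STAGE_ERROR;
        return true;
    case SKETCH_CANCELED:
        /* Suppressed for now; delivered only after drive-reopen. */
        return false;
    case SKETCH_MIRRORING:
        /* qemu never sends this status itself. */
        break;
    }
    return false;
}
```

Keeping the stage alongside the disk is what lets the job-info path report a mirroring job even after qemu's own job query has stopped listing it.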

Wire up all the pieces in the previous patches to actually enable a block copy job. This compiles, but isn't correct, and there are still more patches to go:

prereq patch: fix qemuDomainBlockJobImpl to do BLOCK_JOB_PULL and BLOCK_JOB_SPEED in one monitor job, rather than two

this patch: fix things to reuse the same block job for all three monitor calls (drive-mirror, block_stream, block_job_set_speed); handle errors better, set SELinux labels and lock manager usage properly, add a disk audit message

future patch 1: add block job abort processing, which calls 'drive-reopen' at the proper times, and also includes SELinux and lock manager activity

future patch 2: wire up the _SHALLOW flag to use 'query-block' to determine the appropriate 'base' argument for shallow copy

future patch 3: wire up the virDomainBlockCopy command to this function

future patch 4: RHEL-only: cater to the __com.redhat_* naming prefix of an early backport

what else? probably several cleanup patches

But the series is progressing nicely, and even though this particular patch isn't ready, the prereqs can be reviewed.

* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
(qemuDomainBlockRebase): Call it when appropriate.
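The "call it when appropriate" part of qemuDomainBlockRebase boils down to a flag check followed by a dispatch. A minimal sketch of that decision, assuming made-up flag values (the real VIR_DOMAIN_BLOCK_REBASE_* constants are defined in libvirt.h.in) and a hypothetical `sketch_rebase_route` helper:

```c
/* Hypothetical flag values for illustration only. */
enum {
    SKETCH_REBASE_COPY      = 1 << 0,
    SKETCH_REBASE_REUSE_EXT = 1 << 1
};

enum sketch_route {
    SKETCH_ROUTE_REJECT,  /* unknown flag bits were passed */
    SKETCH_ROUTE_PULL,    /* ordinary block rebase / pull */
    SKETCH_ROUTE_COPY     /* hand off to the block copy path */
};

/* Decide which code path a rebase request takes.  Unknown bits are
 * rejected first, in the spirit of virCheckFlags(); after that the COPY
 * bit alone selects the block copy path. */
static enum sketch_route
sketch_rebase_route(unsigned int flags)
{
    const unsigned int supported =
        SKETCH_REBASE_COPY | SKETCH_REBASE_REUSE_EXT;

    if (flags & ~supported)
        return SKETCH_ROUTE_REJECT;
    if (flags & SKETCH_REBASE_COPY)
        return SKETCH_ROUTE_COPY;
    return SKETCH_ROUTE_PULL;
}
```

In the patch itself the COPY bit is masked off before the remaining flags are passed to qemuDomainBlockCopy, so the copy path only ever sees the flags it knows how to check.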
--- src/qemu/qemu_driver.c | 108 +++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 107 insertions(+), 1 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1664e14..42af95e 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11769,12 +11769,117 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, } static int +qemuDomainBlockCopy(virDomainPtr dom, const char *path, const char *base, + const char *dest, const char *format, + unsigned long bandwidth, unsigned int flags) +{ + struct qemud_driver *driver = dom->conn->privateData; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + char uuidstr[VIR_UUID_STRING_BUFLEN]; + char *device = NULL; + virDomainDiskDefPtr disk; + int ret = -1; + int idx; + virJSONValuePtr actions = NULL; + + /* Step 0: get the disk, check for caps */ + virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT, -1); + + qemuDriverLock(driver); + virUUIDFormat(dom->uuid, uuidstr); + vm = virDomainFindByUUID(&driver->domains, dom->uuid); + if (!vm) { + qemuReportError(VIR_ERR_NO_DOMAIN, + _("no domain with matching uuid '%s'"), uuidstr); + goto cleanup; + } + + device = qemuDiskPathToAlias(vm, path, &idx); + if (!device) { + goto cleanup; + } + disk = vm->def->disks[idx]; + if (disk->mirror) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' already in active block copy job"), + disk->dst); + goto cleanup; + } + + priv = vm->privateData; + if (!(qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_MIRROR) && + qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_REOPEN))) { + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("block copy is not supported with this QEMU binary")); + goto cleanup; + } + if (vm->persistent) { + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not transient")); + goto cleanup; + } + + if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + 
qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not running")); + goto endjob; + } + + /* Step 1: call 'drive-mirror' to start the mirroring */ + actions = virJSONValueNewArray(); + if (!actions) { + virReportOOMError(); + goto endjob; + } + qemuDomainObjEnterMonitorWithDriver(driver, vm); + ret = qemuMonitorDriveMirror(priv->mon, actions, device, dest, + format ? format : disk->driverType, + (flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT ? + QEMU_MONITOR_DRIVE_MIRROR_EXISTING : + QEMU_MONITOR_DRIVE_MIRROR_ABSOLUTE)); + + if (ret == 0) + ret = qemuMonitorTransaction(priv->mon, actions); + qemuDomainObjExitMonitorWithDriver(driver, vm); + + /* Step 2: call 'block_stream' to pull into the mirror */ + ret = qemuDomainBlockJobImpl(dom, path, base, bandwidth, NULL, + BLOCK_JOB_PULL, flags); + if (ret == 0 && bandwidth != 0) + ret = qemuDomainBlockJobImpl(dom, path, NULL, bandwidth, NULL, + BLOCK_JOB_SPEED, flags); + +endjob: + if (qemuDomainObjEndJob(driver, vm) == 0) { + vm = NULL; + goto cleanup; + } + +cleanup: + VIR_FREE(device); + if (vm) + virDomainObjUnlock(vm); + qemuDriverUnlock(driver); + return ret; +} + +static int qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { int ret; - virCheckFlags(0, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_COPY | + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT, -1); + + if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) + return qemuDomainBlockCopy(dom, path, NULL, base, NULL, bandwidth, + flags & ~VIR_DOMAIN_BLOCK_REBASE_COPY); + ret = qemuDomainBlockJobImpl(dom, path, base, bandwidth, NULL, BLOCK_JOB_PULL, flags); if (ret == 0 && bandwidth != 0) @@ -11787,6 +11892,7 @@ static int qemuDomainBlockPull(virDomainPtr dom, const char *path, unsigned long bandwidth, unsigned int flags) { + virCheckFlags(0, -1); return qemuDomainBlockRebase(dom, path, NULL, bandwidth, flags); } -- 1.7.7.6
participants (1)
-
Eric Blake