[libvirt] [PATCHv2 00/15] live block migration via virDomainBlockCopy

v1 was here: https://www.redhat.com/archives/libvir-list/2012-April/msg00068.html changes from v1: Paolo has updated the qemu side of things, and built a scratch image for RHEL that I was able to test with for the new semantics. I was actually able to successfully run two different block copy jobs, one canceled and one pivoted, with SELinux disabled and this patch series. Patch 12/15 is for reference only when working with Paolo's build; it will not go upstream. Patches 13-15 are optional; the extra flexibility might be nice, but I haven't yet played with those three enough to know if Paolo's build behaves like I was expecting. I obviously have more patches to write, such as supporting the REUSE_EXT flag to reuse files instead of creating a new one, and fixing several XXX comments related to SELinux, auditing, and disk locking. But they can be extra patches on top of this starting series. Adam Litke (2): blockjob: add API for async virDomainBlockJobAbort blockjob: wire up qemu async virDomainBlockJobAbort Eric Blake (13): blockjob: allow for fast-finishing job blockjob: add new API flags blockjob: add 'blockcopy' to virsh blockjob: enhance xml to track mirrors across libvirtd restart blockjob: react to active block copy blockjob: expose qemu commands for mirrored storage migration blockjob: return appropriate event and info blockjob: support pivot operation on cancel blockjob: implement block copy for qemu blockjob: accommodate RHEL backport names blockjob: add virDomainBlockCopy blockjob: enhance virsh 'blockcopy' blockjob: wire up qemu and RPC for block copy docs/apibuild.py | 1 + docs/formatdomain.html.in | 11 ++ docs/schemas/domaincommon.rng | 19 ++- include/libvirt/libvirt.h.in | 50 ++++++- include/libvirt/virterror.h | 1 + src/conf/domain_conf.c | 54 +++++++ src/conf/domain_conf.h | 6 + src/driver.h | 5 + src/libvirt.c | 220 +++++++++++++++++++++++++- src/libvirt_private.syms | 1 + src/libvirt_public.syms | 5 + src/qemu/qemu_capabilities.c | 3 + src/qemu/qemu_capabilities.h | 2 + src/qemu/qemu_driver.c | 340 +++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_hotplug.c | 7 + src/qemu/qemu_monitor.c | 50 ++++++ src/qemu/qemu_monitor.h | 28 +++- src/qemu/qemu_monitor_json.c | 154 ++++++++++++++++--- src/qemu/qemu_monitor_json.h | 21 +++- src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 12 ++- src/remote_protocol-structs | 9 + src/rpc/gendispatch.pl | 1 + src/util/virterror.c | 6 + tools/virsh.c | 140 ++++++++++++++---- tools/virsh.pod | 40 +++++- 26 files changed, 1099 insertions(+), 88 deletions(-) -- 1.7.7.6

From: Adam Litke <agl@us.ibm.com> Qemu has changed the semantics of the "block_job_cancel" API. The original qed implementation (pretty much only backported to RHEL 6.2 qemu) was synchronous (ie. upon command completion, the operation was guaranteed to be completely stopped). With the new semantics going into qemu 1.1 for qcow2, a "block_job_cancel" merely requests that the operation be cancelled and an event is triggered once the cancellation request has been honored. To adopt the new semantics while preserving compatibility the following updates are made to the virDomainBlockJob API: A new block job event type VIR_DOMAIN_BLOCK_JOB_CANCELLED is recognized by libvirt. Regardless of the flags used with virDomainBlockJobAbort, this event will be raised whenever it is received from qemu. This event indicates that a block job has been successfully cancelled. For now, libvirt does not try to synthesize this event if using an older qemu that did not generate it. A new extension flag VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC is added to the virDomainBlockJobAbort API. When enabled, this function will operate asynchronously (ie, it can return before the job has actually been cancelled). When the API is used in this mode, it is the responsibility of the caller to wait for a VIR_DOMAIN_BLOCK_JOB_CANCELLED event or poll via the virDomainGetBlockJobInfo API to check the cancellation status; this flag is an error if it is not known if the hypervisor supports asynchronous cancel. This patch also exposes the new flag through virsh. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt.h.in | 10 +++++++++ src/libvirt.c | 10 ++++++++- src/qemu/qemu_monitor_json.c | 42 +++++++++++++++++++++++++++++++------ tools/virsh.c | 46 ++++++++++++++++++++++++++--------------- tools/virsh.pod | 9 +++++-- 5 files changed, 89 insertions(+), 28 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 499dcd4..97ad99d 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1946,6 +1946,15 @@ typedef enum { #endif } virDomainBlockJobType; +/** + * virDomainBlockJobAbortFlags: + * + * VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC: Request only, do not wait for completion + */ +typedef enum { + VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC = 1 << 0, +} virDomainBlockJobAbortFlags; + /* An iterator for monitoring block job operations */ typedef unsigned long long virDomainBlockJobCursor; @@ -3617,6 +3626,7 @@ typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn, typedef enum { VIR_DOMAIN_BLOCK_JOB_COMPLETED = 0, VIR_DOMAIN_BLOCK_JOB_FAILED = 1, + VIR_DOMAIN_BLOCK_JOB_CANCELED = 2, #ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_BLOCK_JOB_LAST diff --git a/src/libvirt.c b/src/libvirt.c index 16d1fd5..af22232 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -17902,7 +17902,7 @@ error: * virDomainBlockJobAbort: * @dom: pointer to domain object * @disk: path to the block device, or device shorthand - * @flags: extra flags; not used yet, so callers should always pass 0 + * @flags: bitwise-OR of virDomainBlockJobAbortFlags * * Cancel the active block job on the given disk. * @@ -17913,6 +17913,14 @@ error: * can be found by calling virDomainGetXMLDesc() and inspecting * elements within //domain/devices/disk. * + * By default, this function performs a synchronous operation and the caller + * may assume that the operation has completed when 0 is returned. However, + * BlockJob operations may take a long time to complete, and during this time + * further domain interactions may be unresponsive. To avoid this problem, + * pass VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC in the @flags argument to enable + * asynchronous behavior. Either way, when the job has been cancelled, a + * BlockJob event will be emitted, with status VIR_DOMAIN_BLOCK_JOB_CANCELLED. + * * Returns -1 in case of failure, 0 when successful. */ int virDomainBlockJobAbort(virDomainPtr dom, const char *disk, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 8eeb307..9a8da3a 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -58,13 +58,14 @@ static void qemuMonitorJSONHandleIOError(qemuMonitorPtr mon, virJSONValuePtr dat static void qemuMonitorJSONHandleVNCConnect(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleVNCInitialize(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleVNCDisconnect(qemuMonitorPtr mon, virJSONValuePtr data); -static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSPICEConnect(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSPICEInitialize(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSPICEDisconnect(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleTrayChange(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandlePMWakeup(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandlePMSuspend(qemuMonitorPtr mon, virJSONValuePtr data); +static void qemuMonitorJSONHandleBlockJobCompleted(qemuMonitorPtr mon, virJSONValuePtr data); +static void qemuMonitorJSONHandleBlockJobCanceled(qemuMonitorPtr mon, virJSONValuePtr data); static struct { const char *type; @@ -80,13 +81,14 @@ static struct { { "VNC_CONNECTED", qemuMonitorJSONHandleVNCConnect, }, { "VNC_INITIALIZED", qemuMonitorJSONHandleVNCInitialize, }, { "VNC_DISCONNECTED", qemuMonitorJSONHandleVNCDisconnect, }, - { "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJob, }, { "SPICE_CONNECTED", qemuMonitorJSONHandleSPICEConnect, }, { "SPICE_INITIALIZED", qemuMonitorJSONHandleSPICEInitialize, }, { "SPICE_DISCONNECTED", qemuMonitorJSONHandleSPICEDisconnect, }, { "DEVICE_TRAY_MOVED", qemuMonitorJSONHandleTrayChange, }, { "WAKEUP", qemuMonitorJSONHandlePMWakeup, }, { "SUSPEND", qemuMonitorJSONHandlePMSuspend, }, + { "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJobCompleted, }, + { "BLOCK_JOB_CANCELLED", qemuMonitorJSONHandleBlockJobCanceled, }, }; @@ -754,13 +756,15 @@ static void qemuMonitorJSONHandleSPICEDisconnect(qemuMonitorPtr mon, virJSONValu qemuMonitorJSONHandleGraphics(mon, data, VIR_DOMAIN_EVENT_GRAPHICS_DISCONNECT); } -static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data) +static void +qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon, + virJSONValuePtr data, + int event) { const char *device; const char *type_str; int type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; unsigned long long offset, len; - int status = VIR_DOMAIN_BLOCK_JOB_FAILED; if ((device = virJSONValueObjectGetString(data, "device")) == NULL) { VIR_WARN("missing device in block job event"); @@ -785,11 +789,19 @@ static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr da if (STREQ(type_str, "stream")) type = VIR_DOMAIN_BLOCK_JOB_TYPE_PULL; - if (offset != 0 && offset == len) - status = VIR_DOMAIN_BLOCK_JOB_COMPLETED; + switch (event) { + case VIR_DOMAIN_BLOCK_JOB_COMPLETED: + /* Make sure the whole device has been processed */ + if (offset != len) + event = VIR_DOMAIN_BLOCK_JOB_FAILED; + break; + case VIR_DOMAIN_BLOCK_JOB_FAILED: + case VIR_DOMAIN_BLOCK_JOB_CANCELED: + break; + } out: - qemuMonitorEmitBlockJob(mon, device, type, status); + qemuMonitorEmitBlockJob(mon, device, type, event); } static void @@ -832,6 +844,22 @@ qemuMonitorJSONHandlePMSuspend(qemuMonitorPtr mon, qemuMonitorEmitPMSuspend(mon); } +static void +qemuMonitorJSONHandleBlockJobCompleted(qemuMonitorPtr mon, + virJSONValuePtr data) +{ + qemuMonitorJSONHandleBlockJobImpl(mon, data, + VIR_DOMAIN_BLOCK_JOB_COMPLETED); +} + +static void +qemuMonitorJSONHandleBlockJobCanceled(qemuMonitorPtr mon, + virJSONValuePtr data) +{ + qemuMonitorJSONHandleBlockJobImpl(mon, data, + VIR_DOMAIN_BLOCK_JOB_CANCELED); +} + int qemuMonitorJSONHumanCommandWithFd(qemuMonitorPtr mon, const char *cmd_str, diff --git a/tools/virsh.c b/tools/virsh.c index cfdd040..7180b83 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -7525,6 +7525,7 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, const char *name, *path; unsigned long bandwidth = 0; int ret = -1; + unsigned int flags = 0; if (!vshConnectionUsability(ctl, ctl->conn)) goto cleanup; @@ -7541,7 +7542,9 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, } if (mode == VSH_CMD_BLOCK_JOB_ABORT) { - ret = virDomainBlockJobAbort(dom, path, 0); + if (vshCommandOptBool(cmd, "async")) + flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC; + ret = virDomainBlockJobAbort(dom, path, flags); } else if (mode == VSH_CMD_BLOCK_JOB_INFO) { ret = virDomainGetBlockJobInfo(dom, path, info, 0); } else if (mode == VSH_CMD_BLOCK_JOB_SPEED) { @@ -7589,20 +7592,25 @@ cmdBlockPull(vshControl *ctl, const vshCmd *cmd) } /* - * "blockjobinfo" command + * "blockjob" command */ static const vshCmdInfo info_block_job[] = { - {"help", N_("Manage active block operations.")}, - {"desc", N_("Manage active block operations.")}, + {"help", N_("Manage active block operations")}, + {"desc", N_("Query, adjust speed, or cancel active block operations.")}, {NULL, NULL} }; static const vshCmdOptDef opts_block_job[] = { {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")}, {"path", VSH_OT_DATA, VSH_OFLAG_REQ, N_("Fully-qualified path of disk")}, - {"abort", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Abort the active job on the specified disk")}, - {"info", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Get active job information for the specified disk")}, - {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, N_("Set the Bandwidth limit in MB/s")}, + {"abort", VSH_OT_BOOL, VSH_OFLAG_NONE, + N_("Abort the active job on the specified disk")}, + {"async", VSH_OT_BOOL, VSH_OFLAG_NONE, + N_("don't wait for --abort to complete")}, + {"info", VSH_OT_BOOL, VSH_OFLAG_NONE, + N_("Get active job information for the specified disk")}, + {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, + N_("Set the Bandwidth limit in MB/s")}, {NULL, 0, 0, NULL} }; @@ -7613,19 +7621,23 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) virDomainBlockJobInfo info; const char *type; int ret; + bool abortMode = vshCommandOptBool(cmd, "abort"); + bool infoMode = vshCommandOptBool(cmd, "info"); + bool bandwidth = vshCommandOptBool(cmd, "bandwidth"); - if (vshCommandOptBool (cmd, "abort")) { - mode = VSH_CMD_BLOCK_JOB_ABORT; - } else if (vshCommandOptBool (cmd, "info")) { - mode = VSH_CMD_BLOCK_JOB_INFO; - } else if (vshCommandOptBool (cmd, "bandwidth")) { - mode = VSH_CMD_BLOCK_JOB_SPEED; - } else { + if (abortMode + infoMode + bandwidth > 1) { vshError(ctl, "%s", _("One of --abort, --info, or --bandwidth is required")); return false; } + if (abortMode) + mode = VSH_CMD_BLOCK_JOB_ABORT; + else if (bandwidth) + mode = VSH_CMD_BLOCK_JOB_SPEED; + else + mode = VSH_CMD_BLOCK_JOB_INFO; + ret = blockJobImpl(ctl, cmd, &info, mode); if (ret < 0) return false; @@ -7634,13 +7646,13 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) return true; if (info.type == VIR_DOMAIN_BLOCK_JOB_TYPE_PULL) - type = "Block Pull"; + type = _("Block Pull"); else - type = "Unknown job"; + type = _("Unknown job"); print_job_progress(type, info.end - info.cur, info.end); if (info.bandwidth != 0) - vshPrint(ctl, " Bandwidth limit: %lu MB/s\n", info.bandwidth); + vshPrint(ctl, _(" Bandwidth limit: %lu MB/s\n"), info.bandwidth); return true; } diff --git a/tools/virsh.pod b/tools/virsh.pod index a60e667..aac72d1 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -686,13 +686,16 @@ Both I<--live> and I<--current> flags may be given, but I<--current> is exclusive. If no flag is specified, behavior is different depending on hypervisor. -=item B<blockjob> I<domain> I<path> [I<--abort>] [I<--info>] [I<bandwidth>] +=item B<blockjob> I<domain> I<path> { I<--abort> [I<--async>] | +[I<--info>] | I<bandwidth> } -Manage active block operations. +Manage active block operations. If no mode is chosen, I<--info> is assumed. I<path> specifies fully-qualified path of the disk. + If I<--abort> is specified, the active job on the specified disk will -be aborted. +be aborted. If I<--async> is also specified, this command will return +immediately, rather than waiting for the cancelation to complete. If I<--info> is specified, the active job information on the specified disk will be printed. I<bandwidth> can be used to set bandwidth limit for the active job. -- 1.7.7.6

On Thu, Apr 05, 2012 at 10:36:47PM -0600, Eric Blake wrote:
From: Adam Litke <agl@us.ibm.com>
Qemu has changed the semantics of the "block_job_cancel" API. The original qed implementation (pretty much only backported to RHEL 6.2 qemu) was synchronous (ie. upon command completion, the operation was guaranteed to be completely stopped). With the new semantics going into qemu 1.1 for qcow2, a "block_job_cancel" merely requests that the operation be cancelled and an event is triggered once the cancellation request has been honored.
To adopt the new semantics while preserving compatibility the following updates are made to the virDomainBlockJob API:
A new block job event type VIR_DOMAIN_BLOCK_JOB_CANCELLED is recognized by libvirt. Regardless of the flags used with virDomainBlockJobAbort, this event will be raised whenever it is received from qemu. This event indicates that a block job has been successfully cancelled. For now, libvirt does not try to synthesize this event if using an older qemu that did not generate it.
A new extension flag VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC is added to the virDomainBlockJobAbort API. When enabled, this function will operate asynchronously (ie, it can return before the job has actually been cancelled). When the API is used in this mode, it is the responsibility of the caller to wait for a VIR_DOMAIN_BLOCK_JOB_CANCELLED event or poll via the virDomainGetBlockJobInfo API to check the cancellation status; this flag is an error if it is not known if the hypervisor supports asynchronous cancel.
This patch also exposes the new flag through virsh.
Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Tested-by: Adam Litke <agl@us.ibm.com>
--- include/libvirt/libvirt.h.in | 10 +++++++++ src/libvirt.c | 10 ++++++++- src/qemu/qemu_monitor_json.c | 42 +++++++++++++++++++++++++++++++------ tools/virsh.c | 46 ++++++++++++++++++++++++++--------------- tools/virsh.pod | 9 +++++-- 5 files changed, 89 insertions(+), 28 deletions(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 499dcd4..97ad99d 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1946,6 +1946,15 @@ typedef enum { #endif } virDomainBlockJobType;
+/** + * virDomainBlockJobAbortFlags: + * + * VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC: Request only, do not wait for completion + */ +typedef enum { + VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC = 1 << 0, +} virDomainBlockJobAbortFlags; + /* An iterator for monitoring block job operations */ typedef unsigned long long virDomainBlockJobCursor;
@@ -3617,6 +3626,7 @@ typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn, typedef enum { VIR_DOMAIN_BLOCK_JOB_COMPLETED = 0, VIR_DOMAIN_BLOCK_JOB_FAILED = 1, + VIR_DOMAIN_BLOCK_JOB_CANCELED = 2,
#ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_BLOCK_JOB_LAST diff --git a/src/libvirt.c b/src/libvirt.c index 16d1fd5..af22232 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -17902,7 +17902,7 @@ error: * virDomainBlockJobAbort: * @dom: pointer to domain object * @disk: path to the block device, or device shorthand - * @flags: extra flags; not used yet, so callers should always pass 0 + * @flags: bitwise-OR of virDomainBlockJobAbortFlags * * Cancel the active block job on the given disk. * @@ -17913,6 +17913,14 @@ error: * can be found by calling virDomainGetXMLDesc() and inspecting * elements within //domain/devices/disk. * + * By default, this function performs a synchronous operation and the caller + * may assume that the operation has completed when 0 is returned. However, + * BlockJob operations may take a long time to complete, and during this time + * further domain interactions may be unresponsive. To avoid this problem, + * pass VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC in the @flags argument to enable + * asynchronous behavior. Either way, when the job has been cancelled, a + * BlockJob event will be emitted, with status VIR_DOMAIN_BLOCK_JOB_CANCELLED. + * * Returns -1 in case of failure, 0 when successful. */ int virDomainBlockJobAbort(virDomainPtr dom, const char *disk, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 8eeb307..9a8da3a 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -58,13 +58,14 @@ static void qemuMonitorJSONHandleIOError(qemuMonitorPtr mon, virJSONValuePtr dat static void qemuMonitorJSONHandleVNCConnect(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleVNCInitialize(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleVNCDisconnect(qemuMonitorPtr mon, virJSONValuePtr data); -static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSPICEConnect(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSPICEInitialize(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleSPICEDisconnect(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleTrayChange(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandlePMWakeup(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandlePMSuspend(qemuMonitorPtr mon, virJSONValuePtr data); +static void qemuMonitorJSONHandleBlockJobCompleted(qemuMonitorPtr mon, virJSONValuePtr data); +static void qemuMonitorJSONHandleBlockJobCanceled(qemuMonitorPtr mon, virJSONValuePtr data);
static struct { const char *type; @@ -80,13 +81,14 @@ static struct { { "VNC_CONNECTED", qemuMonitorJSONHandleVNCConnect, }, { "VNC_INITIALIZED", qemuMonitorJSONHandleVNCInitialize, }, { "VNC_DISCONNECTED", qemuMonitorJSONHandleVNCDisconnect, }, - { "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJob, }, { "SPICE_CONNECTED", qemuMonitorJSONHandleSPICEConnect, }, { "SPICE_INITIALIZED", qemuMonitorJSONHandleSPICEInitialize, }, { "SPICE_DISCONNECTED", qemuMonitorJSONHandleSPICEDisconnect, }, { "DEVICE_TRAY_MOVED", qemuMonitorJSONHandleTrayChange, }, { "WAKEUP", qemuMonitorJSONHandlePMWakeup, }, { "SUSPEND", qemuMonitorJSONHandlePMSuspend, }, + { "BLOCK_JOB_COMPLETED", qemuMonitorJSONHandleBlockJobCompleted, }, + { "BLOCK_JOB_CANCELLED", qemuMonitorJSONHandleBlockJobCanceled, }, };
@@ -754,13 +756,15 @@ static void qemuMonitorJSONHandleSPICEDisconnect(qemuMonitorPtr mon, virJSONValu qemuMonitorJSONHandleGraphics(mon, data, VIR_DOMAIN_EVENT_GRAPHICS_DISCONNECT); }
-static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr data) +static void +qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon, + virJSONValuePtr data, + int event) { const char *device; const char *type_str; int type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; unsigned long long offset, len; - int status = VIR_DOMAIN_BLOCK_JOB_FAILED;
if ((device = virJSONValueObjectGetString(data, "device")) == NULL) { VIR_WARN("missing device in block job event"); @@ -785,11 +789,19 @@ static void qemuMonitorJSONHandleBlockJob(qemuMonitorPtr mon, virJSONValuePtr da if (STREQ(type_str, "stream")) type = VIR_DOMAIN_BLOCK_JOB_TYPE_PULL;
- if (offset != 0 && offset == len) - status = VIR_DOMAIN_BLOCK_JOB_COMPLETED; + switch (event) { + case VIR_DOMAIN_BLOCK_JOB_COMPLETED: + /* Make sure the whole device has been processed */ + if (offset != len) + event = VIR_DOMAIN_BLOCK_JOB_FAILED; + break; + case VIR_DOMAIN_BLOCK_JOB_FAILED: + case VIR_DOMAIN_BLOCK_JOB_CANCELED: + break; + }
out: - qemuMonitorEmitBlockJob(mon, device, type, status); + qemuMonitorEmitBlockJob(mon, device, type, event); }
static void @@ -832,6 +844,22 @@ qemuMonitorJSONHandlePMSuspend(qemuMonitorPtr mon, qemuMonitorEmitPMSuspend(mon); }
+static void +qemuMonitorJSONHandleBlockJobCompleted(qemuMonitorPtr mon, + virJSONValuePtr data) +{ + qemuMonitorJSONHandleBlockJobImpl(mon, data, + VIR_DOMAIN_BLOCK_JOB_COMPLETED); +} + +static void +qemuMonitorJSONHandleBlockJobCanceled(qemuMonitorPtr mon, + virJSONValuePtr data) +{ + qemuMonitorJSONHandleBlockJobImpl(mon, data, + VIR_DOMAIN_BLOCK_JOB_CANCELED); +} + int qemuMonitorJSONHumanCommandWithFd(qemuMonitorPtr mon, const char *cmd_str, diff --git a/tools/virsh.c b/tools/virsh.c index cfdd040..7180b83 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -7525,6 +7525,7 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, const char *name, *path; unsigned long bandwidth = 0; int ret = -1; + unsigned int flags = 0;
if (!vshConnectionUsability(ctl, ctl->conn)) goto cleanup; @@ -7541,7 +7542,9 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, }
if (mode == VSH_CMD_BLOCK_JOB_ABORT) { - ret = virDomainBlockJobAbort(dom, path, 0); + if (vshCommandOptBool(cmd, "async")) + flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC; + ret = virDomainBlockJobAbort(dom, path, flags); } else if (mode == VSH_CMD_BLOCK_JOB_INFO) { ret = virDomainGetBlockJobInfo(dom, path, info, 0); } else if (mode == VSH_CMD_BLOCK_JOB_SPEED) { @@ -7589,20 +7592,25 @@ cmdBlockPull(vshControl *ctl, const vshCmd *cmd) }
/* - * "blockjobinfo" command + * "blockjob" command */ static const vshCmdInfo info_block_job[] = { - {"help", N_("Manage active block operations.")}, - {"desc", N_("Manage active block operations.")}, + {"help", N_("Manage active block operations")}, + {"desc", N_("Query, adjust speed, or cancel active block operations.")}, {NULL, NULL} };
static const vshCmdOptDef opts_block_job[] = { {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")}, {"path", VSH_OT_DATA, VSH_OFLAG_REQ, N_("Fully-qualified path of disk")}, - {"abort", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Abort the active job on the specified disk")}, - {"info", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Get active job information for the specified disk")}, - {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, N_("Set the Bandwidth limit in MB/s")}, + {"abort", VSH_OT_BOOL, VSH_OFLAG_NONE, + N_("Abort the active job on the specified disk")}, + {"async", VSH_OT_BOOL, VSH_OFLAG_NONE, + N_("don't wait for --abort to complete")}, + {"info", VSH_OT_BOOL, VSH_OFLAG_NONE, + N_("Get active job information for the specified disk")}, + {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, + N_("Set the Bandwidth limit in MB/s")}, {NULL, 0, 0, NULL} };
@@ -7613,19 +7621,23 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) virDomainBlockJobInfo info; const char *type; int ret; + bool abortMode = vshCommandOptBool(cmd, "abort"); + bool infoMode = vshCommandOptBool(cmd, "info"); + bool bandwidth = vshCommandOptBool(cmd, "bandwidth");
- if (vshCommandOptBool (cmd, "abort")) { - mode = VSH_CMD_BLOCK_JOB_ABORT; - } else if (vshCommandOptBool (cmd, "info")) { - mode = VSH_CMD_BLOCK_JOB_INFO; - } else if (vshCommandOptBool (cmd, "bandwidth")) { - mode = VSH_CMD_BLOCK_JOB_SPEED; - } else { + if (abortMode + infoMode + bandwidth > 1) { vshError(ctl, "%s", _("One of --abort, --info, or --bandwidth is required")); return false; }
+ if (abortMode) + mode = VSH_CMD_BLOCK_JOB_ABORT; + else if (bandwidth) + mode = VSH_CMD_BLOCK_JOB_SPEED; + else + mode = VSH_CMD_BLOCK_JOB_INFO; + ret = blockJobImpl(ctl, cmd, &info, mode); if (ret < 0) return false; @@ -7634,13 +7646,13 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) return true;
if (info.type == VIR_DOMAIN_BLOCK_JOB_TYPE_PULL) - type = "Block Pull"; + type = _("Block Pull"); else - type = "Unknown job"; + type = _("Unknown job");
print_job_progress(type, info.end - info.cur, info.end); if (info.bandwidth != 0) - vshPrint(ctl, " Bandwidth limit: %lu MB/s\n", info.bandwidth); + vshPrint(ctl, _(" Bandwidth limit: %lu MB/s\n"), info.bandwidth); return true; }
diff --git a/tools/virsh.pod b/tools/virsh.pod index a60e667..aac72d1 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -686,13 +686,16 @@ Both I<--live> and I<--current> flags may be given, but I<--current> is exclusive. If no flag is specified, behavior is different depending on hypervisor.
-=item B<blockjob> I<domain> I<path> [I<--abort>] [I<--info>] [I<bandwidth>] +=item B<blockjob> I<domain> I<path> { I<--abort> [I<--async>] | +[I<--info>] | I<bandwidth> }
-Manage active block operations. +Manage active block operations. If no mode is chosen, I<--info> is assumed.
I<path> specifies fully-qualified path of the disk. + If I<--abort> is specified, the active job on the specified disk will -be aborted. +be aborted. If I<--async> is also specified, this command will return +immediately, rather than waiting for the cancelation to complete. If I<--info> is specified, the active job information on the specified disk will be printed. I<bandwidth> can be used to set bandwidth limit for the active job. -- 1.7.7.6
-- Adam Litke <agl@us.ibm.com> IBM Linux Technology Center

From: Adam Litke <agl@us.ibm.com> Without the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag, libvirt will internally poll using qemu's "query-block-jobs" API and will not return until the operation has been completed. API users are advised that this operation is unbounded and further interaction with the domain during this period may block. Future patches may refactor things to allow other queries in parallel with this polling. Unfortunately, there's no good way to tell if qemu will emit the new event, so this implementation always polls to deal with older qemu. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 55 +++++++++++++++++++++++++++++++++++++++++------ 1 files changed, 48 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0456b34..455fa30 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11602,7 +11602,7 @@ cleanup: static int qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, virDomainBlockJobInfoPtr info, - int mode) + int mode, unsigned int flags) { struct qemud_driver *driver = dom->conn->privateData; virDomainObjPtr vm = NULL; @@ -11644,6 +11644,45 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, ret = qemuMonitorBlockJob(priv->mon, device, base, bandwidth, info, mode); qemuDomainObjExitMonitorWithDriver(driver, vm); + /* Qemu provides asynchronous block job cancellation, but without + * the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag libvirt guarantees a + * synchronous operation. Provide this behavior by waiting here, + * so we don't get confused by newly scheduled block jobs. + */ + if (ret == 0 && mode == BLOCK_JOB_ABORT && + !(flags & VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC)) { + ret = 1; + while (1) { + /* Poll every 50ms */ + struct timespec ts = { .tv_sec = 0, + .tv_nsec = 50 * 1000 * 1000ull }; + virDomainBlockJobInfo dummy; + + qemuDomainObjEnterMonitorWithDriver(driver, vm); + ret = qemuMonitorBlockJob(priv->mon, device, NULL, 0, &dummy, + BLOCK_JOB_INFO); + qemuDomainObjExitMonitorWithDriver(driver, vm); + + if (ret <= 0) + break; + + virDomainObjUnlock(vm); + qemuDriverUnlock(driver); + + nanosleep(&ts, NULL); + + qemuDriverLock(driver); + virDomainObjLock(vm); + + if (!virDomainObjIsActive(vm)) { + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not running")); + ret = -1; + break; + } + } + } + endjob: if (qemuDomainObjEndJob(driver, vm) == 0) { vm = NULL; @@ -11661,8 +11700,9 @@ cleanup: static int qemuDomainBlockJobAbort(virDomainPtr dom, const char *path, unsigned int flags) { - virCheckFlags(0, -1); - return qemuDomainBlockJobImpl(dom, path, NULL, 0, NULL, BLOCK_JOB_ABORT); + virCheckFlags(VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC, -1); + return qemuDomainBlockJobImpl(dom, path, NULL, 0, NULL, BLOCK_JOB_ABORT, + flags); } static int @@ -11670,7 +11710,8 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, virDomainBlockJobInfoPtr info, unsigned int flags) { virCheckFlags(0, -1); - return qemuDomainBlockJobImpl(dom, path, NULL, 0, info, BLOCK_JOB_INFO); + return qemuDomainBlockJobImpl(dom, path, NULL, 0, info, BLOCK_JOB_INFO, + flags); } static int @@ -11679,7 +11720,7 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, { virCheckFlags(0, -1); return qemuDomainBlockJobImpl(dom, path, NULL, bandwidth, NULL, - BLOCK_JOB_SPEED); + BLOCK_JOB_SPEED, flags); } static int @@ -11690,10 +11731,10 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, virCheckFlags(0, -1); ret = qemuDomainBlockJobImpl(dom, path, base, bandwidth, NULL, - BLOCK_JOB_PULL); + BLOCK_JOB_PULL, flags); if (ret == 0 && bandwidth != 0) ret = qemuDomainBlockJobImpl(dom, path, NULL, bandwidth, NULL, - BLOCK_JOB_SPEED); + BLOCK_JOB_SPEED, flags); return ret; } -- 1.7.7.6

On Thu, Apr 05, 2012 at 10:36:48PM -0600, Eric Blake wrote:
From: Adam Litke <agl@us.ibm.com>
Without the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag, libvirt will internally poll using qemu's "query-block-jobs" API and will not return until the operation has been completed. API users are advised that this operation is unbounded and further interaction with the domain during this period may block. Future patches may refactor things to allow other queries in parallel with this polling. Unfortunately, there's no good way to tell if qemu will emit the new event, so this implementation always polls to deal with older qemu.
Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Tested-by: Adam Litke <agl@us.ibm.com>
--- src/qemu/qemu_driver.c | 55 +++++++++++++++++++++++++++++++++++++++++------ 1 files changed, 48 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0456b34..455fa30 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11602,7 +11602,7 @@ cleanup: static int qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, virDomainBlockJobInfoPtr info, - int mode) + int mode, unsigned int flags) { struct qemud_driver *driver = dom->conn->privateData; virDomainObjPtr vm = NULL; @@ -11644,6 +11644,45 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, ret = qemuMonitorBlockJob(priv->mon, device, base, bandwidth, info, mode); qemuDomainObjExitMonitorWithDriver(driver, vm);
+ /* Qemu provides asynchronous block job cancellation, but without + * the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag libvirt guarantees a + * synchronous operation. Provide this behavior by waiting here, + * so we don't get confused by newly scheduled block jobs. + */ + if (ret == 0 && mode == BLOCK_JOB_ABORT && + !(flags & VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC)) { + ret = 1; + while (1) { + /* Poll every 50ms */ + struct timespec ts = { .tv_sec = 0, + .tv_nsec = 50 * 1000 * 1000ull }; + virDomainBlockJobInfo dummy; + + qemuDomainObjEnterMonitorWithDriver(driver, vm); + ret = qemuMonitorBlockJob(priv->mon, device, NULL, 0, &dummy, + BLOCK_JOB_INFO); + qemuDomainObjExitMonitorWithDriver(driver, vm); + + if (ret <= 0) + break; + + virDomainObjUnlock(vm); + qemuDriverUnlock(driver); + + nanosleep(&ts, NULL); + + qemuDriverLock(driver); + virDomainObjLock(vm); + + if (!virDomainObjIsActive(vm)) { + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not running")); + ret = -1; + break; + } + } + } + endjob: if (qemuDomainObjEndJob(driver, vm) == 0) { vm = NULL; @@ -11661,8 +11700,9 @@ cleanup: static int qemuDomainBlockJobAbort(virDomainPtr dom, const char *path, unsigned int flags) { - virCheckFlags(0, -1); - return qemuDomainBlockJobImpl(dom, path, NULL, 0, NULL, BLOCK_JOB_ABORT); + virCheckFlags(VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC, -1); + return qemuDomainBlockJobImpl(dom, path, NULL, 0, NULL, BLOCK_JOB_ABORT, + flags); }
static int @@ -11670,7 +11710,8 @@ qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path, virDomainBlockJobInfoPtr info, unsigned int flags) { virCheckFlags(0, -1); - return qemuDomainBlockJobImpl(dom, path, NULL, 0, info, BLOCK_JOB_INFO); + return qemuDomainBlockJobImpl(dom, path, NULL, 0, info, BLOCK_JOB_INFO, + flags); }
static int @@ -11679,7 +11720,7 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, { virCheckFlags(0, -1); return qemuDomainBlockJobImpl(dom, path, NULL, bandwidth, NULL, - BLOCK_JOB_SPEED); + BLOCK_JOB_SPEED, flags); }
static int @@ -11690,10 +11731,10 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base,
virCheckFlags(0, -1); ret = qemuDomainBlockJobImpl(dom, path, base, bandwidth, NULL, - BLOCK_JOB_PULL); + BLOCK_JOB_PULL, flags); if (ret == 0 && bandwidth != 0) ret = qemuDomainBlockJobImpl(dom, path, NULL, bandwidth, NULL, - BLOCK_JOB_SPEED); + BLOCK_JOB_SPEED, flags); return ret; }
-- 1.7.7.6
-- Adam Litke <agl@us.ibm.com> IBM Linux Technology Center

In my testing, I was able to provoke an odd block pull failure: $ virsh blockpull dom vda --bandwidth 10000 error: Requested operation is not valid: No active operation on device: drive-virtio-disk0 merely by using gdb to artifically wait to do the block job set speed until after the pull had already finished. But in reality, that should be a success, since the pull finished before we had a chance to set speed. Furthermore, using a double job lock is annoying; we can alter the speed as part of the same job that started the block pull for less likelihood of hitting the speed failure in the first place. In fixing this, I discovered commit 10ec36e2 was wrong for marking the info parameter of qemuMonitorBlockJob as non-null. * src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Fix prototype. (BLOCK_JOB_CMD): Add new mode. * src/qemu/qemu_driver.c (qemuDomainBlockRebase): Move secondary job call... (qemuDomainBlockJobImpl): ...here, for fewer locks. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Change return value on new internal mode. --- src/qemu/qemu_driver.c | 18 +++++++++--------- src/qemu/qemu_monitor.h | 5 +++-- src/qemu/qemu_monitor_json.c | 32 ++++++++++++++++++++------------ 3 files changed, 32 insertions(+), 23 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 455fa30..b219cb6 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11642,14 +11642,20 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, * itself, and validating that base is on the chain, rather than * relying on qemu to do this. */ ret = qemuMonitorBlockJob(priv->mon, device, base, bandwidth, info, mode); + if (ret == 0 && mode == BLOCK_JOB_PULL && bandwidth) + ret = qemuMonitorBlockJob(priv->mon, device, NULL, bandwidth, NULL, + BLOCK_JOB_SPEED_INTERNAL); qemuDomainObjExitMonitorWithDriver(driver, vm); + if (ret < 0) + goto endjob; + /* Qemu provides asynchronous block job cancellation, but without * the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag libvirt guarantees a * synchronous operation. Provide this behavior by waiting here, * so we don't get confused by newly scheduled block jobs. */ - if (ret == 0 && mode == BLOCK_JOB_ABORT && + if (mode == BLOCK_JOB_ABORT && !(flags & VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC)) { ret = 1; while (1) { @@ -11727,15 +11733,9 @@ static int qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { - int ret; - virCheckFlags(0, -1); - ret = qemuDomainBlockJobImpl(dom, path, base, bandwidth, NULL, - BLOCK_JOB_PULL, flags); - if (ret == 0 && bandwidth != 0) - ret = qemuDomainBlockJobImpl(dom, path, NULL, bandwidth, NULL, - BLOCK_JOB_SPEED, flags); - return ret; + return qemuDomainBlockJobImpl(dom, path, base, bandwidth, NULL, + BLOCK_JOB_PULL, flags); } static int diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index b480966..2e6ac79 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -530,7 +530,8 @@ typedef enum { BLOCK_JOB_ABORT = 0, BLOCK_JOB_INFO = 1, BLOCK_JOB_SPEED = 2, - BLOCK_JOB_PULL = 3, + BLOCK_JOB_SPEED_INTERNAL = 3, + BLOCK_JOB_PULL = 4, } BLOCK_JOB_CMD; int qemuMonitorBlockJob(qemuMonitorPtr mon, @@ -539,7 +540,7 @@ int qemuMonitorBlockJob(qemuMonitorPtr mon, unsigned long bandwidth, virDomainBlockJobInfoPtr info, int mode) - ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(5); + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); int qemuMonitorOpenGraphics(qemuMonitorPtr mon, const char *protocol, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 9a8da3a..d4aca5c 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -3436,7 +3436,7 @@ qemuMonitorJSONBlockJob(qemuMonitorPtr mon, } else if (mode == BLOCK_JOB_INFO) { cmd_name = "query-block-jobs"; cmd = qemuMonitorJSONMakeCommand(cmd_name, NULL); - } else if (mode == BLOCK_JOB_SPEED) { + } else if (mode == BLOCK_JOB_SPEED || mode == BLOCK_JOB_SPEED_INTERNAL) { cmd_name = "block_job_set_speed"; cmd = qemuMonitorJSONMakeCommand(cmd_name, "s:device", device, "U:value", @@ -3458,22 +3458,30 @@ qemuMonitorJSONBlockJob(qemuMonitorPtr mon, ret = qemuMonitorJSONCommand(mon, cmd, &reply); if (ret == 0 && virJSONValueObjectHasKey(reply, "error")) { - if (qemuMonitorJSONHasError(reply, "DeviceNotActive")) - qemuReportError(VIR_ERR_OPERATION_INVALID, - _("No active operation on device: %s"), device); - else if (qemuMonitorJSONHasError(reply, "DeviceInUse")) + ret = -1; + if (qemuMonitorJSONHasError(reply, "DeviceNotActive")) { + /* If a job completes before we get a chance to set the + * speed, we don't want to fail the original command. */ + if (mode == BLOCK_JOB_SPEED_INTERNAL) + ret = 0; + else + qemuReportError(VIR_ERR_OPERATION_INVALID, + _("No active operation on device: %s"), + device); + } else if (qemuMonitorJSONHasError(reply, "DeviceInUse")){ qemuReportError(VIR_ERR_OPERATION_FAILED, - _("Device %s in use"), device); - else if (qemuMonitorJSONHasError(reply, "NotSupported")) + _("Device %s in use"), device); + } else if (qemuMonitorJSONHasError(reply, "NotSupported")) { qemuReportError(VIR_ERR_OPERATION_INVALID, - _("Operation is not supported for device: %s"), device); - else if (qemuMonitorJSONHasError(reply, "CommandNotFound")) + _("Operation is not supported for device: %s"), + device); + } else if (qemuMonitorJSONHasError(reply, "CommandNotFound")) { qemuReportError(VIR_ERR_OPERATION_INVALID, - _("Command '%s' is not found"), cmd_name); - else + _("Command '%s' is not found"), cmd_name); + } else { qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Unexpected error")); - ret = -1; + } } if (ret == 0 && mode == BLOCK_JOB_INFO) -- 1.7.7.6

This patch introduces a new block job, useful for live storage migration using pre-copy streaming. Using a live VM with the backing chain: base <- snap1 <- snap2 as the starting point, we have: - virDomainBlockRebase(dom, disk, "/path/to/copy", 0, VIR_DOMAIN_BLOCK_REBASE_COPY) creates /path/to/copy with the same format as snap2, with no backing file, so entire chain is copied and flattened - virDomainBlockRebase(dom, disk, "/path/to/copy", 0, VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) creates /path/to/copy as a raw file, so entire chain is copied and flattened - virDomainBlockRebase(dom, disk, "/path/to/copy", 0, VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW) creates /path/to/copy with the same format as snap2, but with snap1 as a backing file, so only snap2 is copied. - virDomainBlockRebase(dom, disk, "/path/to/copy", 0, VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT) reuse existing /path/to/copy (must have empty contents, and format is probed from the metadata), and copy the full chain - virDomainBlockRebase(dom, disk, "/path/to/copy", 0, VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT| VIR_DOMAIN_BLOCK_REBASE_SHALLOW) reuse existing /path/to/copy (contents must be identical to snap1, and format is probed from the metadata), and copy only the contents of snap2 - virDomainBlockRebase(dom, disk, "/path/to/copy", 0, VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT| VIR_DOMAIN_BLOCK_REBASE_SHALLOW|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) reuse existing /path/to/copy (must be raw volume with contents identical to snap1), and copy only the contents of snap2 Less useful combinations: - virDomainBlockRebase(dom, disk, "/path/to/copy", 0, VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW| VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) fail if source is not raw, otherwise create /path/to/copy as raw and the single file is copied (no chain involved) - virDomainBlockRebase(dom, disk, "/path/to/copy", 0, VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT| VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) makes little sense: the destination must be raw but have no contents, meaning that it is an empty file, so there is nothing to reuse The other three flags are rejected without VIR_DOMAIN_BLOCK_COPY. It would be nice if we could issue an event when pivoting from phase 1 to phase 2, but qemu hasn't implemented that, so we would have to poll in order to synthesize it ourselves. Meanwhile, qemu will give us a distinct job info and completion event when we either cancel or pivot to end the job. Pivoting is accomplished via the new: virDomainBlockJobAbort(dom, disk, VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT) Management applications can pre-create the copy with a relative backing file name, and use the VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT flag to have qemu reuse the metadata; if the management application also copies the backing files to a new location, this can be used to perform live storage migration of an entire backing chain. * include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_TYPE_COPY): New block job type. (virDomainBlockJobAbortFlags, virDomainBlockRebaseFlags): New enums. * src/libvirt.c (virDomainBlockRebase): Document the new flags, and implement general restrictions on flag combinations. (virDomainBlockJobAbort): Document the new flag. (virDomainSaveFlags, virDomainSnapshotCreateXML) (virDomainRevertToSnapshot, virDomainDetachDeviceFlags): Document restrictions. * include/libvirt/virterror.h (VIR_ERR_BLOCK_COPY_ACTIVE): New error. * src/util/virterror.c (virErrorMsg): Define it. --- include/libvirt/libvirt.h.in | 24 ++++++++++- include/libvirt/virterror.h | 1 + src/libvirt.c | 95 ++++++++++++++++++++++++++++++++++++++---- src/util/virterror.c | 6 +++ 4 files changed, 116 insertions(+), 10 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 97ad99d..ac5df95 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1934,12 +1934,15 @@ int virDomainUpdateDeviceFlags(virDomainPtr domain, /** * virDomainBlockJobType: * - * VIR_DOMAIN_BLOCK_JOB_TYPE_PULL: Block Pull (virDomainBlockPull or - * virDomainBlockRebase) + * VIR_DOMAIN_BLOCK_JOB_TYPE_PULL: Block Pull (virDomainBlockPull, or + * virDomainBlockRebase without flags), job ends on completion + * VIR_DOMAIN_BLOCK_JOB_TYPE_COPY: Block Copy (virDomainBlockRebase with + * flags), job exists as long as mirroring is active */ typedef enum { VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN = 0, VIR_DOMAIN_BLOCK_JOB_TYPE_PULL = 1, + VIR_DOMAIN_BLOCK_JOB_TYPE_COPY = 2, #ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_BLOCK_JOB_TYPE_LAST @@ -1950,9 +1953,11 @@ typedef enum { * virDomainBlockJobAbortFlags: * * VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC: Request only, do not wait for completion + * VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT: Pivot to mirror when ending a copy job */ typedef enum { VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC = 1 << 0, + VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT = 1 << 1, } virDomainBlockJobAbortFlags; /* An iterator for monitoring block job operations */ @@ -1983,6 +1988,21 @@ int virDomainBlockJobSetSpeed(virDomainPtr dom, const char *disk, int virDomainBlockPull(virDomainPtr dom, const char *disk, unsigned long bandwidth, unsigned int flags); + +/** + * virDomainBlockRebaseFlags: + * + * Flags available for virDomainBlockRebase(). + */ +typedef enum { + VIR_DOMAIN_BLOCK_REBASE_SHALLOW = 1 << 0, /* Limit copy to top of source + backing chain */ + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT = 1 << 1, /* Reuse existing external + file for a copy */ + VIR_DOMAIN_BLOCK_REBASE_COPY_RAW = 1 << 2, /* Make destination file raw */ + VIR_DOMAIN_BLOCK_REBASE_COPY = 1 << 3, /* Start a copy job */ +} virDomainBlockRebaseFlags; + int virDomainBlockRebase(virDomainPtr dom, const char *disk, const char *base, unsigned long bandwidth, unsigned int flags); diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index e04d29e..070fdb5 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -249,6 +249,7 @@ typedef enum { VIR_ERR_NO_DOMAIN_METADATA = 80, /* The metadata is not present */ VIR_ERR_MIGRATE_UNSAFE = 81, /* Migration is not safe */ VIR_ERR_OVERFLOW = 82, /* integer overflow */ + VIR_ERR_BLOCK_COPY_ACTIVE = 83, /* action prevented by block copy job */ } virErrorNumber; /** diff --git a/src/libvirt.c b/src/libvirt.c index af22232..d4b648c 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -2696,6 +2696,10 @@ error: * A save file can be inspected or modified slightly with * virDomainSaveImageGetXMLDesc() and virDomainSaveImageDefineXML(). * + * Some hypervisors may prevent this operation if there is a current + * block copy operation; in that case, use virDomainBlockJobAbort() + * to stop the block copy first. + * * Returns 0 in case of success and -1 in case of failure. */ int @@ -7891,6 +7895,11 @@ error: * virDomainUndefine(). A previous definition for this domain would be * overriden if it already exists. * + * Some hypervisors may prevent this operation if there is a current + * block copy operation on a transient domain with the same id as the + * domain being defined; in that case, use virDomainBlockJobAbort() to + * stop the block copy first. + * * Returns NULL in case of error, a pointer to the domain otherwise */ virDomainPtr @@ -9424,6 +9433,10 @@ error: * return failure if LIVE is specified but it only supports removing the * persisted device allocation. * + * Some hypervisors may prevent this operation if there is a current + * block copy operation on the device being detached; in that case, + * use virDomainBlockJobAbort() to stop the block copy first. + * * Returns 0 in case of success, -1 in case of failure. */ int @@ -17124,6 +17137,10 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot) * that it is still possible to fail after disks have changed, but only * in the much rarer cases of running out of memory or disk space). * + * Some hypervisors may prevent this operation if there is a current + * block copy operation; in that case, use virDomainBlockJobAbort() + * to stop the block copy first. + * * Returns an (opaque) virDomainSnapshotPtr on success, NULL on failure. */ virDomainSnapshotPtr @@ -17913,13 +17930,24 @@ error: * can be found by calling virDomainGetXMLDesc() and inspecting * elements within //domain/devices/disk. * - * By default, this function performs a synchronous operation and the caller + * If the current block job for @disk is VIR_DOMAIN_BLOCK_JOB_TYPE_PULL, then + * by default, this function performs a synchronous operation and the caller * may assume that the operation has completed when 0 is returned. However, * BlockJob operations may take a long time to complete, and during this time * further domain interactions may be unresponsive. To avoid this problem, * pass VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC in the @flags argument to enable * asynchronous behavior. Either way, when the job has been cancelled, a * BlockJob event will be emitted, with status VIR_DOMAIN_BLOCK_JOB_CANCELLED. + * In this usage, @flags must not contain VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT. + * + * If the current block job for @disk is VIR_DOMAIN_BLOCK_JOB_TYPE_COPY, then + * the default is to abort the mirroring and revert to the source disk; + * adding @flags of VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT causes this call to + * fail with VIR_ERR_BLOCK_COPY_ACTIVE if the copy is not fully populated, + * otherwise it will swap the disk over to the copy to end the mirroring. An + * event will be issued when the job is ended, and it is possible to use + * VIR_DOMAIN_BLOCK_JOB_ABORT_SYNC to control whether this command waits + * for the completion of the job. * * Returns -1 in case of failure, 0 when successful. */ @@ -18169,19 +18197,55 @@ error: * @disk: path to the block device, or device shorthand * @base: path to backing file to keep, or NULL for no backing file * @bandwidth: (optional) specify copy bandwidth limit in Mbps - * @flags: extra flags; not used yet, so callers should always pass 0 + * @flags: bitwise-OR of virDomainBlockRebaseFlags * * Populate a disk image with data from its backing image chain, and - * setting the backing image to @base. @base must be the absolute + * setting the backing image to @base, or alternatively copy an entire + * backing chain to a new file @base. + * + * When @flags is 0, this starts a pull, where @base must be the absolute * path of one of the backing images further up the chain, or NULL to * convert the disk image so that it has no backing image. Once all * data from its backing image chain has been pulled, the disk no * longer depends on those intermediate backing images. This function * pulls data for the entire device in the background. Progress of - * the operation can be checked with virDomainGetBlockJobInfo() and - * the operation can be aborted with virDomainBlockJobAbort(). When - * finished, an asynchronous event is raised to indicate the final - * status. + * the operation can be checked with virDomainGetBlockJobInfo() with a + * job type of VIR_DOMAIN_BLOCK_JOB_TYPE_PULL, and the operation can be + * aborted with virDomainBlockJobAbort(). When finished, an asynchronous + * event is raised to indicate the final status, and the job no longer + * exists. + * + * When @flags includes VIR_DOMAIN_BLOCK_REBASE_COPY, this starts a copy, + * where @base must be the name of a new file to copy the chain to. By + * default, the copy will pull the entire source chain into the destination + * file, but if @flags also contains VIR_DOMAIN_BLOCK_REBASE_SHALLOW, then + * only the top of the source chain will be copied (the source and + * destination have a common backing file). By default, @base will be + * created with the same file format as the source, but this can be altered + * by adding VIR_DOMAIN_BLOCK_REBASE_COPY_RAW to force the copy to be raw + * (does not make sense with the shallow flag unless the source is also raw), + * or by using VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT to reuse an existing file + * with initial contents identical to the backing file of the source (this + * allows a management app to pre-create files with relative backing file + * names, rather than the default of absolute backing file names; it is + * generally used with the shallow flag, since otherwise the destination + * file must start with empty contents). + * + * A copy job has two parts; in the first phase, the @bandwidth parameter + * affects how fast the source is pulled into the destination, and the job + * can only be canceled by reverting to the source file; progress in this + * phase can be tracked via the virDomainBlockJobInfo() command, with a + * job type of VIR_DOMAIN_BLOCK_JOB_TYPE_COPY. The job transitions to the + * second phase when the job info states cur == end, and remains alive to + * mirror all further changes to both source and destination. The user + * must call virDomainBlockJobAbort() to end the mirroring while choosing + * whether to revert to source or pivot to the destination. An event is + * issued when the job ends, and in the future, an event may be added when + * the job transitions from pulling to mirroring. + * + * Some hypervisors will restrict certain actions, such as virDomainSave() + * or virDomainDetachDevice(), while a copy job is active; they may + * also restrict a copy job to transient domains. * * The @disk parameter is either an unambiguous source name of the * block device (the <source file='...'/> sub-element, such as @@ -18195,7 +18259,8 @@ error: * suitable default. Some hypervisors do not support this feature and will * return an error if bandwidth is not 0. * - * When @base is NULL, this is identical to virDomainBlockPull(). + * When @base is NULL and @flags is 0, this is identical to + * virDomainBlockPull(). * * Returns 0 if the operation has started, -1 on failure. */ @@ -18228,6 +18293,20 @@ int virDomainBlockRebase(virDomainPtr dom, const char *disk, goto error; } + if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) { + if (!base) { + virLibDomainError(VIR_ERR_INVALID_ARG, + _("base is required when starting a copy")); + goto error; + } + } else if (flags & (VIR_DOMAIN_BLOCK_REBASE_SHALLOW | + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | + VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)) { + virLibDomainError(VIR_ERR_INVALID_ARG, + _("use of flags requires a copy job")); + goto error; + } + if (conn->driver->domainBlockRebase) { int ret; ret = conn->driver->domainBlockRebase(dom, disk, base, bandwidth, diff --git a/src/util/virterror.c b/src/util/virterror.c index ff9a36f..845081e 100644 --- a/src/util/virterror.c +++ b/src/util/virterror.c @@ -1250,6 +1250,12 @@ virErrorMsg(virErrorNumber error, const char *info) else errmsg = _("numerical overflow: %s"); break; + case VIR_ERR_BLOCK_COPY_ACTIVE: + if (!info) + errmsg = _("block copy still active"); + else + errmsg = _("block copy still active: %s"); + break; } return errmsg; } -- 1.7.7.6

Rather than further overloading 'blockpull', I decided to create a new virsh command to expose the new flags of virDomainBlockRebase. Someday, I'd also like to make blockpull and blockcopy have a synchronous mode, which blocks until the event happens or Ctrl-C is pressed, as well as a --verbose flag to print status updates before the job finishes - but not today. * tools/virsh.c (VSH_CMD_BLOCK_JOB_COPY): New mode. (blockJobImpl): Support new flags. (cmdBlockCopy): New command. (cmdBlockJob): Support new job info, new abort flag. * tools/virsh.pod (blockcopy, blockjob): Document the new command and flags. --- tools/virsh.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++------- tools/virsh.pod | 37 +++++++++++++++++++++++-- 2 files changed, 104 insertions(+), 14 deletions(-) diff --git a/tools/virsh.c b/tools/virsh.c index 7180b83..99f3d22 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -7515,16 +7515,18 @@ typedef enum { VSH_CMD_BLOCK_JOB_INFO = 1, VSH_CMD_BLOCK_JOB_SPEED = 2, VSH_CMD_BLOCK_JOB_PULL = 3, -} VSH_CMD_BLOCK_JOB_MODE; + VSH_CMD_BLOCK_JOB_COPY = 4, +} vshCmdBlockJobMode; static int blockJobImpl(vshControl *ctl, const vshCmd *cmd, - virDomainBlockJobInfoPtr info, int mode) + virDomainBlockJobInfoPtr info, int mode) { virDomainPtr dom = NULL; const char *name, *path; unsigned long bandwidth = 0; int ret = -1; + const char *base = NULL; unsigned int flags = 0; if (!vshConnectionUsability(ctl, ctl->conn)) @@ -7541,22 +7543,39 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, goto cleanup; } - if (mode == VSH_CMD_BLOCK_JOB_ABORT) { + switch ((vshCmdBlockJobMode) mode) { + case VSH_CMD_BLOCK_JOB_ABORT: if (vshCommandOptBool(cmd, "async")) flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC; + if (vshCommandOptBool(cmd, "pivot")) + flags |= VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT; ret = virDomainBlockJobAbort(dom, path, flags); - } else if (mode == VSH_CMD_BLOCK_JOB_INFO) { + break; + case VSH_CMD_BLOCK_JOB_INFO: ret = virDomainGetBlockJobInfo(dom, path, info, 0); - } else if (mode == VSH_CMD_BLOCK_JOB_SPEED) { + break; + case VSH_CMD_BLOCK_JOB_SPEED: ret = virDomainBlockJobSetSpeed(dom, path, bandwidth, 0); - } else if (mode == VSH_CMD_BLOCK_JOB_PULL) { - const char *base = NULL; + break; + case VSH_CMD_BLOCK_JOB_PULL: if (vshCommandOptString(cmd, "base", &base) < 0) goto cleanup; if (base) ret = virDomainBlockRebase(dom, path, base, bandwidth, 0); else ret = virDomainBlockPull(dom, path, bandwidth, 0); + break; + case VSH_CMD_BLOCK_JOB_COPY: + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; + if (vshCommandOptBool(cmd, "shallow")) + flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + if (vshCommandOptBool(cmd, "reuse-external")) + flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; + if (vshCommandOptBool(cmd, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + if (vshCommandOptString(cmd, "dest", &base) < 0) + goto cleanup; + ret = virDomainBlockRebase(dom, path, base, bandwidth, flags); } cleanup: @@ -7566,6 +7585,34 @@ cleanup: } /* + * "blockcopy" command + */ +static const vshCmdInfo info_block_copy[] = { + {"help", N_("Start a block copy operation.")}, + {"desc", N_("Populate a disk from its backing image.")}, + {NULL, NULL} +}; + +static const vshCmdOptDef opts_block_copy[] = { + {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")}, + {"path", VSH_OT_DATA, VSH_OFLAG_REQ, N_("Fully-qualified path of disk")}, + {"dest", VSH_OT_DATA, VSH_OFLAG_REQ, N_("path of the copy to create")}, + {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, N_("Bandwidth limit in MB/s")}, + {"shallow", VSH_OT_BOOL, 0, N_("make the copy share a backing chain")}, + {"reuse-external", VSH_OT_BOOL, 0, N_("reuse existing destination")}, + {"raw", VSH_OT_BOOL, 0, N_("use raw destination file")}, + {NULL, 0, 0, NULL} +}; + +static bool +cmdBlockCopy(vshControl *ctl, const vshCmd *cmd) +{ + if (blockJobImpl(ctl, cmd, NULL, VSH_CMD_BLOCK_JOB_COPY) != 0) + return false; + return true; +} + +/* * "blockpull" command */ static const vshCmdInfo info_block_pull[] = { @@ -7607,6 +7654,8 @@ static const vshCmdOptDef opts_block_job[] = { N_("Abort the active job on the specified disk")}, {"async", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("don't wait for --abort to complete")}, + {"pivot", VSH_OT_BOOL, VSH_OFLAG_NONE, + N_("conclude and pivot a copy job")}, {"info", VSH_OT_BOOL, VSH_OFLAG_NONE, N_("Get active job information for the specified disk")}, {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, @@ -7621,7 +7670,9 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) virDomainBlockJobInfo info; const char *type; int ret; - bool abortMode = vshCommandOptBool(cmd, "abort"); + bool abortMode = (vshCommandOptBool(cmd, "abort") || + vshCommandOptBool(cmd, "async") || + vshCommandOptBool(cmd, "pivot")); bool infoMode = vshCommandOptBool(cmd, "info"); bool bandwidth = vshCommandOptBool(cmd, "bandwidth"); @@ -7645,10 +7696,17 @@ cmdBlockJob(vshControl *ctl, const vshCmd *cmd) if (ret == 0 || mode != VSH_CMD_BLOCK_JOB_INFO) return true; - if (info.type == VIR_DOMAIN_BLOCK_JOB_TYPE_PULL) + switch (info.type) { + case VIR_DOMAIN_BLOCK_JOB_TYPE_PULL: type = _("Block Pull"); - else + break; + case VIR_DOMAIN_BLOCK_JOB_TYPE_COPY: + type = _("Block Copy"); + break; + default: type = _("Unknown job"); + break; + } print_job_progress(type, info.end - info.cur, info.end); if (info.bandwidth != 0) @@ -17127,8 +17185,9 @@ static const vshCmdDef domManagementCmds[] = { {"autostart", cmdAutostart, opts_autostart, info_autostart, 0}, {"blkdeviotune", cmdBlkdeviotune, opts_blkdeviotune, info_blkdeviotune, 0}, {"blkiotune", cmdBlkiotune, opts_blkiotune, info_blkiotune, 0}, - {"blockpull", cmdBlockPull, opts_block_pull, info_block_pull, 0}, + {"blockcopy", cmdBlockCopy, opts_block_copy, info_block_copy, 0}, {"blockjob", cmdBlockJob, opts_block_job, info_block_job, 0}, + {"blockpull", cmdBlockPull, opts_block_pull, info_block_pull, 0}, {"blockresize", cmdBlockResize, opts_block_resize, info_block_resize, 0}, {"change-media", cmdChangeMedia, opts_change_media, info_change_media, 0}, #ifndef WIN32 diff --git a/tools/virsh.pod b/tools/virsh.pod index aac72d1..50a495d 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -638,6 +638,34 @@ currently in use by a running domain. Other contexts that require a MAC address of virtual interface (such as I<detach-interface> or I<domif-setlink>) will accept the MAC address printed by this command. +=item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<--shallow>] +[I<--reuse-external>] [I<--raw>] + +Copy a disk backing image chain to I<dest>. By default, this command +flattens the entire chain; but if I<--shallow> is specified, the copy +shares the backing chain. + +If I<--reuse-external> is specified, then I<dest> must exist and have +contents identical to the resulting backing file (that is, it must +start with contents matching the backing file I<disk> if I<--shallow> +is used, otherwise it must start empty); this option is typically used +to set up a relative backing file name in the destination. + +The format of the destination is determined by the first match in the +following list: if I<--raw> is specified, it will be raw; if +I<--reuse-external> is specified, the existing destination is probed +for a format; and in all other cases, the destination format will +match the source format. + +The copy runs in the background; initially, the job must copy all data +from the source, and during this phase, the job can only be canceled to +revert back to the source disk. After this phase completes, both the +source and the destination remain mirrored until a call to B<blockjob> +with the I<--abort> and I<--pivot> flags pivots over to the copy. + +I<path> specifies fully-qualified path of the disk. +I<bandwidth> specifies copying bandwidth limit in Mbps. + =item B<blockpull> I<domain> I<path> [I<bandwidth>] [I<base>] Populate a disk from its backing image chain. By default, this command @@ -686,16 +714,19 @@ Both I<--live> and I<--current> flags may be given, but I<--current> is exclusive. If no flag is specified, behavior is different depending on hypervisor. -=item B<blockjob> I<domain> I<path> { I<--abort> [I<--async>] | +=item B<blockjob> I<domain> I<path> { [I<--abort>] [I<--async>] [I<--pivot] | [I<--info>] | I<bandwidth> } -Manage active block operations. If no mode is chosen, I<--info> is assumed. +Manage active block operations. If no mode is chosen, I<--info> is assumed, +except that I<--async> and I<--pivot> imply I<--abort>. I<path> specifies fully-qualified path of the disk. If I<--abort> is specified, the active job on the specified disk will be aborted. If I<--async> is also specified, this command will return -immediately, rather than waiting for the cancelation to complete. +immediately, rather than waiting for the cancelation to complete. If +I<--pivot> is specified, this requests that an active copy job +be pivoted over to the new copy. If I<--info> is specified, the active job information on the specified disk will be printed. I<bandwidth> can be used to set bandwidth limit for the active job. -- 1.7.7.6

QUESTION: should we parse and ignore <mirror> on input, rather than rejecting it? By rejecting it, I can't add a unit test, since the unit test framework currently doesn't expose a way to trigger internal parsing. In order to track a block copy job across libvirtd restarts, we need to save internal XML that tracks the name of the file holding the mirror. Displaying this name in dumpxml might also be useful to the user, even if we don't yet have a way to (re-) start a domain with mirroring enabled up front. This is done with a new <mirror> sub-element to <disk>, as in: <disk type='file' device='disk'> <driver name='qemu' type='raw'/> <source file='/var/lib/libvirt/images/original.img'/> <mirror file='/var/lib/libvirt/images/copy.img' format='qcow2'/> ... </disk> * docs/schemas/domaincommon.rng (diskspec): Add diskMirror. * docs/formatdomain.html.in (elementsDisks): Document it. * src/conf/domain_conf.h (_virDomainDiskDef): New members. * src/conf/domain_conf.c (virDomainDiskDefFree): Clean them. (virDomainDiskDefParseXML): Parse them, but only internally. (virDomainDiskDefFormat): Output them. --- docs/formatdomain.html.in | 11 ++++++++++ docs/schemas/domaincommon.rng | 19 +++++++++++++++-- src/conf/domain_conf.c | 42 +++++++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 5 ++++ 4 files changed, 74 insertions(+), 3 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index a382d30..534c44b 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -1296,6 +1296,17 @@ </table> <span class="since">Since 0.9.7</span> </dd> + <dt><code>mirror</code></dt> + <dd> + This element is present if the hypervisor has started a block + copy operation (via the <code>virDomainBlockCopy</code> API), + where the mirror location in attribute <code>file</code> will + eventually have the same contents as the source, and with the + file format in attribute <code>format</code> (which might + differ from the format of the source). For now, this element + only valid in output; it is rejected on + input. <span class="since">Since 0.9.12</span> + </dd> <dt><code>target</code></dt> <dd>The <code>target</code> element controls the bus / device under which the disk is exposed to the guest diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 0cc04af..66c91a2 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -772,6 +772,9 @@ <ref name="driver"/> </optional> <optional> + <ref name='diskMirror'/> + </optional> + <optional> <ref name="diskAuth"/> </optional> <ref name="target"/> @@ -1013,9 +1016,7 @@ </element> </define> <!-- - Disk may use a special driver for access. Currently this is - only defined for Xen for tap/aio and file, but will certainly be - extended in the future, and libvirt doesn't look for specific values. + Disk may use a special driver for access. --> <define name="driver"> <element name="driver"> @@ -3024,6 +3025,18 @@ <empty/> </element> </define> + <define name='diskMirror'> + <element name='mirror'> + <attribute name='file'> + <ref name='absFilePath'/> + </attribute> + <optional> + <attribute name='format'> + <ref name="genericName"/> + </attribute> + </optional> + </element> + </define> <define name="diskAuth"> <element name="auth"> <attribute name="username"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c6b97e1..38c461a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -933,6 +933,8 @@ void virDomainDiskDefFree(virDomainDiskDefPtr def) VIR_FREE(def->dst); VIR_FREE(def->driverName); VIR_FREE(def->driverType); + VIR_FREE(def->mirror); + VIR_FREE(def->mirrorFormat); VIR_FREE(def->auth.username); if (def->auth.secretType == VIR_DOMAIN_DISK_SECRET_TYPE_USAGE) VIR_FREE(def->auth.secret.usage); @@ -3318,6 +3320,8 @@ virDomainDiskDefParseXML(virCapsPtr caps, char *ioeventfd = NULL; char *event_idx = NULL; char *copy_on_read = NULL; + char *mirror = NULL; + char *mirrorFormat = NULL; char *devaddr = NULL; virStorageEncryptionPtr encryption = NULL; char *serial = NULL; @@ -3453,6 +3457,22 @@ virDomainDiskDefParseXML(virCapsPtr caps, ioeventfd = virXMLPropString(cur, "ioeventfd"); event_idx = virXMLPropString(cur, "event_idx"); copy_on_read = virXMLPropString(cur, "copy_on_read"); + } else if ((mirror == NULL) && + (xmlStrEqual(cur->name, BAD_CAST "mirror"))) { + if (flags & VIR_DOMAIN_XML_INTERNAL_STATUS) { + mirror = virXMLPropString(cur, "file"); + if (!mirror) { + virDomainReportError(VIR_ERR_XML_ERROR, "%s", + _("mirror requires file name")); + goto error; + } + mirrorFormat = virXMLPropString(cur, "format"); + } else { + virDomainReportError(VIR_ERR_XML_ERROR, "%s", + _("Cannot handle disk mirror on " + "input yet")); + goto error; + } } else if (xmlStrEqual(cur->name, BAD_CAST "auth")) { authUsername = virXMLPropString(cur, "username"); if (authUsername == NULL) { @@ -3867,6 +3887,10 @@ virDomainDiskDefParseXML(virCapsPtr caps, driverName = NULL; def->driverType = driverType; driverType = NULL; + def->mirror = mirror; + mirror = NULL; + def->mirrorFormat = mirrorFormat; + mirrorFormat = NULL; def->encryption = encryption; encryption = NULL; def->serial = serial; @@ -3882,6 +3906,12 @@ virDomainDiskDefParseXML(virCapsPtr caps, !(def->driverName = strdup(caps->defaultDiskDriverName))) goto no_memory; + + if (def->mirror && !def->mirrorFormat && + caps->defaultDiskDriverType && + !(def->mirrorFormat = strdup(caps->defaultDiskDriverType))) + goto no_memory; + if (def->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && virDomainDiskDefAssignAddress(caps, def) < 0) goto error; @@ -3907,6 +3937,8 @@ cleanup: VIR_FREE(authUsage); VIR_FREE(driverType); VIR_FREE(driverName); + VIR_FREE(mirror); + VIR_FREE(mirrorFormat); VIR_FREE(cachetag); VIR_FREE(error_policy); VIR_FREE(rerror_policy); @@ -10829,6 +10861,16 @@ virDomainDiskDefFormat(virBufferPtr buf, } } + /* For now, mirroring is currently output-only: we always output + * it, but refuse to parse it on input except for internal parse + * on libvirtd restart. */ + if (def->mirror) { + virBufferEscapeString(buf, " <mirror file='%s'", def->mirror); + if (def->mirrorFormat) + virBufferAsprintf(buf, " format='%s'", def->mirrorFormat); + virBufferAddLit(buf, ">\n"); + } + virBufferAsprintf(buf, " <target dev='%s' bus='%s'", def->dst, bus); if ((def->device == VIR_DOMAIN_DISK_DEVICE_FLOPPY || diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0eed60e..abc953d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -563,6 +563,10 @@ struct _virDomainDiskDef { char *driverName; char *driverType; + char *mirror; + char *mirrorFormat; + bool mirroring; + virDomainBlockIoTuneInfo blkdeviotune; char *serial; @@ -2125,6 +2129,7 @@ VIR_ENUM_DECL(virDomainDiskIo) VIR_ENUM_DECL(virDomainDiskSecretType) VIR_ENUM_DECL(virDomainDiskSnapshot) VIR_ENUM_DECL(virDomainDiskTray) +VIR_ENUM_DECL(virDomainDiskMirrorStage) VIR_ENUM_DECL(virDomainIoEventFd) VIR_ENUM_DECL(virDomainVirtioEventIdx) VIR_ENUM_DECL(virDomainDiskCopyOnRead) -- 1.7.7.6

For now, disk migration via block copy job is not implemented. But when we do implement it, we have to deal with the fact that qemu does not provide an easy way to re-start a qemu process with mirroring still intact (it _might_ be possible by using qemu -S then an initial 'drive-mirror' with disk reuse before starting the domain, but that gets hairy). Even something like 'virDomainSave' becomes hairy, if you realize the implications that 'virDomainRestore' would be stuck with recreating the same mirror layout. But if we step back and look at the bigger picture, we realize that the initial client of live storage migration via disk mirroring is oVirt, which always uses transient domains, and that if a transient domain is destroyed while a mirror exists, oVirt can easily restart the storage migration by creating a new domain that visits just the source storage, with no loss in data. We can make life a lot easier by being cowards, and forbidding certain operations on a domain. This patch guarantees that we never get in a state where we would have to restart a domain with a mirroring block copy, by preventing saves, snapshots, hot unplug of a disk in use, and conversion to a persistent domain. * src/conf/domain_conf.h (virDomainHasDiskMirror): New prototype. * src/conf/domain_conf.c (virDomainHasDiskMirror): New function. * src/libvirt_private.syms (domain_conf.h): Export it. * src/qemu/qemu_driver.c (qemuDomainSaveInternal) (qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot) (qemuDomainBlockJobImpl, qemudDomainDefine): Prevent dangerous actions while block copy is already in action. * src/qemu/qemu_hotplug.c (qemuDomainDetachDiskDevice): Likewise. --- src/conf/domain_conf.c | 12 ++++++++++++ src/conf/domain_conf.h | 1 + src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 31 ++++++++++++++++++++++++++++++- src/qemu/qemu_hotplug.c | 7 +++++++ 5 files changed, 51 insertions(+), 1 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 38c461a..e67be42 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -7183,6 +7183,18 @@ virDomainDiskRemoveByName(virDomainDefPtr def, const char *name) return virDomainDiskRemove(def, i); } +/* Return true if VM has at least one disk involved in a current block + * copy job (that is, with a <mirror> element in the disk xml). */ +bool +virDomainHasDiskMirror(virDomainObjPtr vm) +{ + int i; + for (i = 0; i < vm->def->ndisks; i++) + if (vm->def->disks[i]->mirror) + return true; + return false; +} + int virDomainNetInsert(virDomainDefPtr def, virDomainNetDefPtr net) { if (VIR_REALLOC_N(def->nets, def->nnets + 1) < 0) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index abc953d..77c501c 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1954,6 +1954,7 @@ virDomainDiskDefPtr virDomainDiskRemove(virDomainDefPtr def, size_t i); virDomainDiskDefPtr virDomainDiskRemoveByName(virDomainDefPtr def, const char *name); +bool virDomainHasDiskMirror(virDomainObjPtr vm); int virDomainNetIndexByMac(virDomainDefPtr def, const unsigned char *mac); int virDomainNetInsert(virDomainDefPtr def, virDomainNetDefPtr net); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a90f8a0..570940d 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -354,6 +354,7 @@ virDomainGraphicsSpiceZlibCompressionTypeFromString; virDomainGraphicsSpiceZlibCompressionTypeToString; virDomainGraphicsTypeFromString; virDomainGraphicsTypeToString; +virDomainHasDiskMirror; virDomainHostdevDefAlloc; virDomainHostdevDefClear; virDomainHostdevDefFree; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index b219cb6..88dadca 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2558,6 +2558,11 @@ qemuDomainSaveInternal(struct qemud_driver *driver, virDomainPtr dom, "%s", _("domain is marked for auto destroy")); goto cleanup; } + if (virDomainHasDiskMirror(vm)) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", + _("domain has active block copy job")); + goto cleanup; + } memset(&header, 0, sizeof(header)); memcpy(header.magic, QEMUD_SAVE_PARTIAL, sizeof(header.magic)); @@ -4947,6 +4952,12 @@ static virDomainPtr qemudDomainDefine(virConnectPtr conn, const char *xml) { goto cleanup; } def = NULL; + if (virDomainHasDiskMirror(vm)) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", + _("domain has active block copy job")); + virDomainObjAssignDef(vm, NULL, false); + goto cleanup; + } vm->persistent = 1; if (virDomainSaveConfig(driver->configDir, @@ -10265,6 +10276,12 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, "%s", _("domain is marked for auto destroy")); goto cleanup; } + if (virDomainHasDiskMirror(vm)) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", + _("domain has active block copy job")); + goto cleanup; + } + if (!vm->persistent && (flags & VIR_DOMAIN_SNAPSHOT_CREATE_HALT)) { qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", _("cannot halt after transient domain snapshot")); @@ -10872,6 +10889,11 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, _("no domain with matching uuid '%s'"), uuidstr); goto cleanup; } + if (virDomainHasDiskMirror(vm)) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s", + _("domain has active block copy job")); + goto cleanup; + } snap = virDomainSnapshotFindByName(&vm->snapshots, snapshot->name); if (!snap) { @@ -11610,6 +11632,7 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, char uuidstr[VIR_UUID_STRING_BUFLEN]; char *device = NULL; int ret = -1; + int idx; qemuDriverLock(driver); virUUIDFormat(dom->uuid, uuidstr); @@ -11620,10 +11643,16 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, goto cleanup; } - device = qemuDiskPathToAlias(vm, path, NULL); + device = qemuDiskPathToAlias(vm, path, &idx); if (!device) { goto cleanup; } + if (mode == BLOCK_JOB_PULL && vm->def->disks[idx]->mirror) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' already in active block copy job"), + vm->def->disks[idx]->dst); + goto cleanup; + } if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0) goto cleanup; diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 857b980..98fa8f8 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1721,6 +1721,13 @@ int qemuDomainDetachDiskDevice(struct qemud_driver *driver, detach = vm->def->disks[i]; + if (detach->mirror) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' is in an active block copy job"), + detach->dst); + goto cleanup; + } + if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) { if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0) != 0) { qemuReportError(VIR_ERR_INTERNAL_ERROR, -- 1.7.7.6

The new block copy storage migration sequence requires both the 'drive-mirror' action in 'transaction' (present if the 'drive-mirror' standalone monitor command also exists) and the 'drive-reopen' monitor command (it would be nice if that were also part of a 'transaction', but the initial qemu implementation has it standalone only). As of this[1] qemu email, both commands have been proposed but not yet incorporated into the tree, so there is a risk that qemu 1.1 will not have these commands, or will have something subtly different. [1]https://lists.gnu.org/archive/html/qemu-devel/2012-03/msg01524.html * src/qemu/qemu_capabilities.h (QEMU_CAPS_DRIVE_MIRROR) (QEMU_CAPS_DRIVE_REOPEN): New bits. * src/qemu/qemu_capabilities.c (qemuCaps): Name them. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set them. (qemuMonitorJSONDriveMirror, qemuMonitorDriveReopen): New functions. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror) (qemuMonitorDriveReopen): Declare them. * src/qemu/qemu_monitor.c (qemuMonitorDriveMirror) (qemuMonitorDriveReopen): New passthroughs. * src/qemu/qemu_monitor.h (qemuMonitorDriveMirror) (qemuMonitorDriveReopen): Declare them. --- src/qemu/qemu_capabilities.c | 3 ++ src/qemu/qemu_capabilities.h | 2 + src/qemu/qemu_monitor.c | 50 ++++++++++++++++++++++++++++ src/qemu/qemu_monitor.h | 23 +++++++++++++ src/qemu/qemu_monitor_json.c | 74 +++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_monitor_json.h | 21 +++++++++++- 6 files changed, 167 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 0e09d6d..1938ae4 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -156,6 +156,9 @@ VIR_ENUM_IMPL(qemuCaps, QEMU_CAPS_LAST, "scsi-disk.channel", "scsi-block", "transaction", + + "drive-mirror", /* 90 */ + "drive-reopen", ); struct qemu_feature_flags { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 78cdbe0..405bf2a 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -124,6 +124,8 @@ enum qemuCapsFlags { QEMU_CAPS_SCSI_DISK_CHANNEL = 87, /* Is scsi-disk.channel available? */ QEMU_CAPS_SCSI_BLOCK = 88, /* -device scsi-block */ QEMU_CAPS_TRANSACTION = 89, /* transaction monitor command */ + QEMU_CAPS_DRIVE_MIRROR = 90, /* drive-mirror monitor command */ + QEMU_CAPS_DRIVE_REOPEN = 91, /* drive-reopen monitor command */ QEMU_CAPS_LAST, /* this must always be the last item */ }; diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index e1a8d4c..f33bed8 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -2685,6 +2685,32 @@ qemuMonitorDiskSnapshot(qemuMonitorPtr mon, virJSONValuePtr actions, return ret; } +/* Add the drive-mirror action to a transaction. */ +int +qemuMonitorDriveMirror(qemuMonitorPtr mon, virJSONValuePtr actions, + const char *device, const char *file, + const char *format, int mode) +{ + int ret; + + VIR_DEBUG("mon=%p, actions=%p, device=%s, file=%s, format=%s, mode=%o", + mon, actions, device, file, format, mode); + + if (!mon) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("monitor must not be NULL")); + return -1; + } + + if (mon->json) + ret = qemuMonitorJSONDriveMirror(mon, actions, device, file, format, + mode); + else + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("drive-mirror requires JSON monitor")); + return ret; +} + /* Use the transaction QMP command to run atomic snapshot commands. */ int qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) @@ -2701,6 +2727,30 @@ qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) return ret; } +/* Use the drive-reopen monitor command. */ +int +qemuMonitorDriveReopen(qemuMonitorPtr mon, const char *device, + const char *file, const char *format) +{ + int ret; + + VIR_DEBUG("mon=%p, device=%s, file=%s, format=%s", + mon, device, file, format); + + if (!mon) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("monitor must not be NULL")); + return -1; + } + + if (mon->json) + ret = qemuMonitorJSONDriveReopen(mon, device, file, format); + else + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("drive-reopen requires JSON monitor")); + return ret; +} + int qemuMonitorArbitraryCommand(qemuMonitorPtr mon, const char *cmd, char **reply, diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 2e6ac79..bbeb330 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -508,8 +508,31 @@ int qemuMonitorDiskSnapshot(qemuMonitorPtr mon, const char *file, const char *format, bool reuse); + +typedef enum { + QEMU_MONITOR_DRIVE_MIRROR_ABSOLUTE, + QEMU_MONITOR_DRIVE_MIRROR_EXISTING, + QEMU_MONITOR_DRIVE_MIRROR_NO_BACKING, + + QEMU_MONITOR_DRIVE_MIRROR_LAST +} qemuMonitorDriveMirrorMode; + +int qemuMonitorDriveMirror(qemuMonitorPtr mon, + virJSONValuePtr actions, + const char *device, + const char *file, + const char *format, + int mode) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); int qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); +int qemuMonitorDriveReopen(qemuMonitorPtr mon, + const char *device, + const char *file, + const char *format) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4); int qemuMonitorArbitraryCommand(qemuMonitorPtr mon, const char *cmd, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index d4aca5c..93ebfbf 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -967,12 +967,14 @@ qemuMonitorJSONCheckCommands(qemuMonitorPtr mon, if (STREQ(name, "human-monitor-command")) *json_hmp = 1; - - if (STREQ(name, "system_wakeup")) + else if (STREQ(name, "system_wakeup")) qemuCapsSet(qemuCaps, QEMU_CAPS_WAKEUP); - - if (STREQ(name, "transaction")) + else if (STREQ(name, "transaction")) qemuCapsSet(qemuCaps, QEMU_CAPS_TRANSACTION); + else if (STREQ(name, "drive-mirror")) + qemuCapsSet(qemuCaps, QEMU_CAPS_DRIVE_MIRROR); + else if (STREQ(name, "drive-reopen")) + qemuCapsSet(qemuCaps, QEMU_CAPS_DRIVE_REOPEN); } ret = 0; @@ -3185,6 +3187,43 @@ cleanup: return ret; } +VIR_ENUM_DECL(qemuMonitorDriveMirror) +VIR_ENUM_IMPL(qemuMonitorDriveMirror, QEMU_MONITOR_DRIVE_MIRROR_LAST, + "absolute-paths", "exsisting", "no-backing-file"); + +int +qemuMonitorJSONDriveMirror(qemuMonitorPtr mon ATTRIBUTE_UNUSED, + virJSONValuePtr actions, + const char *device, const char *file, + const char *format, int mode) +{ + int ret = -1; + virJSONValuePtr cmd; + + cmd = qemuMonitorJSONMakeCommandRaw(true, + "drive-mirror", + "s:device", device, + "s:target", file, + "s:format", format, + "s:mode", + qemuMonitorDriveMirrorTypeToString(mode), + NULL); + if (!cmd) + return -1; + + if (virJSONValueArrayAppend(actions, cmd) < 0) { + virReportOOMError(); + goto cleanup; + } + + cmd = NULL; + ret = 0; + +cleanup: + virJSONValueFree(cmd); + return ret; +} + int qemuMonitorJSONTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) { @@ -3214,6 +3253,33 @@ cleanup: return ret; } +int +qemuMonitorJSONDriveReopen(qemuMonitorPtr mon, const char *device, + const char *file, const char *format) +{ + int ret; + virJSONValuePtr cmd; + virJSONValuePtr reply = NULL; + + cmd = qemuMonitorJSONMakeCommand("drive-reopen", + "s:device", device, + "s:new-image-file", file, + "s:format", format, + NULL); + if (!cmd) + return -1; + + if ((ret = qemuMonitorJSONCommand(mon, cmd, &reply)) < 0) + goto cleanup; + + ret = qemuMonitorJSONCheckError(cmd, reply); + +cleanup: + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; +} + int qemuMonitorJSONArbitraryCommand(qemuMonitorPtr mon, const char *cmd_str, char **reply_str, diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index a0f67aa..ce23bc1 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -230,8 +230,25 @@ int qemuMonitorJSONDiskSnapshot(qemuMonitorPtr mon, const char *device, const char *file, const char *format, - bool reuse); -int qemuMonitorJSONTransaction(qemuMonitorPtr mon, virJSONValuePtr actions); + bool reuse) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); +int qemuMonitorJSONDriveMirror(qemuMonitorPtr mon, + virJSONValuePtr actions, + const char *device, + const char *file, + const char *format, + int mode) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); +int qemuMonitorJSONTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); +int qemuMonitorJSONDriveReopen(qemuMonitorPtr mon, + const char *device, + const char *file, + const char *format) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_NONNULL(4); int qemuMonitorJSONArbitraryCommand(qemuMonitorPtr mon, const char *cmd_str, -- 1.7.7.6

Il 06/04/2012 06:36, Eric Blake ha scritto:
+int +qemuMonitorJSONDriveMirror(qemuMonitorPtr mon ATTRIBUTE_UNUSED, + virJSONValuePtr actions, + const char *device, const char *file, + const char *format, int mode) +{ + int ret = -1; + virJSONValuePtr cmd; + + cmd = qemuMonitorJSONMakeCommandRaw(true, + "drive-mirror", + "s:device", device, + "s:target", file, + "s:format", format, + "s:mode", + qemuMonitorDriveMirrorTypeToString(mode), + NULL); + if (!cmd) + return -1; + + if (virJSONValueArrayAppend(actions, cmd) < 0) { + virReportOOMError(); + goto cleanup; + }
Here it would be nice to invoke the command directly if actions is NULL. Right now there is no certainty that drive-mirror will be transactionable in upstream QEMU, so it is safer to invoke it outside a transaction in patch 11. Paolo

On 04/06/2012 01:13 AM, Paolo Bonzini wrote:
Il 06/04/2012 06:36, Eric Blake ha scritto:
+int +qemuMonitorJSONDriveMirror(qemuMonitorPtr mon ATTRIBUTE_UNUSED, + virJSONValuePtr actions, + const char *device, const char *file, + const char *format, int mode) +{ + int ret = -1; + virJSONValuePtr cmd; + + cmd = qemuMonitorJSONMakeCommandRaw(true, + "drive-mirror", + "s:device", device, + "s:target", file, + "s:format", format, + "s:mode", + qemuMonitorDriveMirrorTypeToString(mode), + NULL); + if (!cmd) + return -1; + + if (virJSONValueArrayAppend(actions, cmd) < 0) { + virReportOOMError(); + goto cleanup; + }
Here it would be nice to invoke the command directly if actions is NULL. Right now there is no certainty that drive-mirror will be transactionable in upstream QEMU, so it is safer to invoke it outside a transaction in patch 11.
Good idea; borrowing from qemuMonitorJSONDiskSnapshot, I'm going to squash this in: diff --git i/src/qemu/qemu_monitor_json.c w/src/qemu/qemu_monitor_json.c index 3bbc664..ad9b8b8 100644 --- i/src/qemu/qemu_monitor_json.c +++ w/src/qemu/qemu_monitor_json.c @@ -3144,7 +3144,7 @@ qemuMonitorJSONDiskSnapshot(qemuMonitorPtr mon, virJSONValuePtr actions, const char *device, const char *file, const char *format, bool reuse) { - int ret; + int ret = -1; virJSONValuePtr cmd; virJSONValuePtr reply = NULL; @@ -3162,14 +3162,13 @@ qemuMonitorJSONDiskSnapshot(qemuMonitorPtr mon, virJSONValuePtr actions, if (actions) { if (virJSONValueArrayAppend(actions, cmd) < 0) { virReportOOMError(); - ret = -1; } else { ret = 0; cmd = NULL; } } else { if ((ret = qemuMonitorJSONCommand(mon, cmd, &reply)) < 0) - goto cleanup; + goto cleanup; if (qemuMonitorJSONHasError(reply, "CommandNotFound") && qemuMonitorCheckHMP(mon, "snapshot_blkdev")) { @@ -3199,8 +3198,9 @@ qemuMonitorJSONDriveMirror(qemuMonitorPtr mon ATTRIBUTE_UNUSED, { int ret = -1; virJSONValuePtr cmd; + virJSONValuePtr reply = NULL; - cmd = qemuMonitorJSONMakeCommandRaw(true, + cmd = qemuMonitorJSONMakeCommandRaw(actions != NULL, "drive-mirror", "s:device", device, "s:target", file, @@ -3211,16 +3211,22 @@ qemuMonitorJSONDriveMirror(qemuMonitorPtr mon ATTRIBUTE_UNUSED, if (!cmd) return -1; - if (virJSONValueArrayAppend(actions, cmd) < 0) { - virReportOOMError(); - goto cleanup; + if (actions) { + if (virJSONValueArrayAppend(actions, cmd) < 0) { + virReportOOMError(); + } else { + cmd = NULL; + ret = 0; + } + } else { + if ((ret = qemuMonitorJSONCommand(mon, cmd, &reply)) < 0) + goto cleanup; + ret = qemuMonitorJSONCheckError(cmd, reply); } - cmd = NULL; - ret = 0; - cleanup: virJSONValueFree(cmd); + virJSONValueFree(reply); return ret; } -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 04/05/2012 10:36 PM, Eric Blake wrote:
The new block copy storage migration sequence requires both the 'drive-mirror' action in 'transaction' (present if the 'drive-mirror' standalone monitor command also exists) and the 'drive-reopen' monitor command (it would be nice if that were also part of a 'transaction', but the initial qemu implementation has it standalone only).
As of this[1] qemu email, both commands have been proposed but not yet incorporated into the tree, so there is a risk that qemu 1.1 will not have these commands, or will have something subtly different. [1]https://lists.gnu.org/archive/html/qemu-devel/2012-03/msg01524.html
Another tweak that I will need for implementing REUSE_EXT - for an existing file, it should be okay to let qemu probe the existing file for its type, rather than requiring libvirt to probe the type and always pass a string to qemu. I specifically added the COPY_RAW flag as a way to force libvirt to pass "raw" and prevent a format probe. Also, when there is no existing destination, the assumption is that the destination should be the same format as the source, but we may have let qemu probe the source file, in which case we have no format string to pass along. Normally, probing is suspicious - any raw file can pretend to be any other file type, but since there is a mechanism to specifically recognize raw files, I think probing is okay in other situations. So I'm squashing this in: diff --git i/src/qemu/qemu_monitor.h w/src/qemu/qemu_monitor.h index 0aa74be..a432b6f 100644 --- i/src/qemu/qemu_monitor.h +++ w/src/qemu/qemu_monitor.h @@ -523,16 +523,14 @@ int qemuMonitorDriveMirror(qemuMonitorPtr mon, const char *file, const char *format, int mode) - ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4) - ATTRIBUTE_NONNULL(5); + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4); int qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); int qemuMonitorDriveReopen(qemuMonitorPtr mon, const char *device, const char *file, const char *format) - ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) - ATTRIBUTE_NONNULL(4); + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int qemuMonitorArbitraryCommand(qemuMonitorPtr mon, const char *cmd, diff --git i/src/qemu/qemu_monitor_json.c w/src/qemu/qemu_monitor_json.c index d61c91b..bb080e5 100644 --- i/src/qemu/qemu_monitor_json.c +++ w/src/qemu/qemu_monitor_json.c @@ -3206,9 +3206,9 @@ qemuMonitorJSONDriveMirror(qemuMonitorPtr mon ATTRIBUTE_UNUSED, "drive-mirror", "s:device", device, "s:target", file, - "s:format", format, "s:mode", qemuMonitorDriveMirrorTypeToString(mode), + format ? "s:format" : NULL, format, NULL); if (!cmd) return -1; @@ -3271,7 +3271,7 @@ qemuMonitorJSONDriveReopen(qemuMonitorPtr mon, const char *device, cmd = qemuMonitorJSONMakeCommand("drive-reopen", "s:device", device, "s:new-image-file", file, - "s:format", format, + format ? "s:format" : NULL, format, NULL); if (!cmd) return -1; diff --git i/src/qemu/qemu_monitor_json.h w/src/qemu/qemu_monitor_json.h index ffd8e64..136c0f3 100644 --- i/src/qemu/qemu_monitor_json.h +++ w/src/qemu/qemu_monitor_json.h @@ -239,16 +239,14 @@ int qemuMonitorJSONDriveMirror(qemuMonitorPtr mon, const char *file, const char *format, int mode) - ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) - ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5); + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4); int qemuMonitorJSONTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); int qemuMonitorJSONDriveReopen(qemuMonitorPtr mon, const char *device, const char *file, const char *format) - ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) - ATTRIBUTE_NONNULL(4); + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3); int qemuMonitorJSONArbitraryCommand(qemuMonitorPtr mon, const char *cmd_str, -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

----- Original Message -----
From: "Eric Blake" <eblake@redhat.com> To: libvir-list@redhat.com Cc: pbonzini@redhat.com, fsimonce@redhat.com Sent: Friday, April 6, 2012 6:36:54 AM Subject: [PATCHv2 08/15] blockjob: expose qemu commands for mirrored storage migration
The new block copy storage migration sequence requires both the 'drive-mirror' action in 'transaction' (present if the 'drive-mirror' standalone monitor command also exists) and the 'drive-reopen' monitor command (it would be nice if that were also part of a 'transaction', but the initial qemu implementation has it standalone only).
As of this[1] qemu email, both commands have been proposed but not yet incorporated into the tree, so there is a risk that qemu 1.1 will not have these commands, or will have something subtly different. [1]https://lists.gnu.org/archive/html/qemu-devel/2012-03/msg01524.html
* src/qemu/qemu_capabilities.h (QEMU_CAPS_DRIVE_MIRROR) (QEMU_CAPS_DRIVE_REOPEN): New bits. * src/qemu/qemu_capabilities.c (qemuCaps): Name them. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set them. (qemuMonitorJSONDriveMirror, qemuMonitorDriveReopen): New functions. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror) (qemuMonitorDriveReopen): Declare them. * src/qemu/qemu_monitor.c (qemuMonitorDriveMirror) (qemuMonitorDriveReopen): New passthroughs. * src/qemu/qemu_monitor.h (qemuMonitorDriveMirror) (qemuMonitorDriveReopen): Declare them. --- src/qemu/qemu_capabilities.c | 3 ++ src/qemu/qemu_capabilities.h | 2 + src/qemu/qemu_monitor.c | 50 ++++++++++++++++++++++++++++ src/qemu/qemu_monitor.h | 23 +++++++++++++ src/qemu/qemu_monitor_json.c | 74 +++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_monitor_json.h | 21 +++++++++++- 6 files changed, 167 insertions(+), 6 deletions(-)
[...]
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index e1a8d4c..f33bed8 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -2685,6 +2685,32 @@ qemuMonitorDiskSnapshot(qemuMonitorPtr mon, virJSONValuePtr actions, return ret; }
+/* Add the drive-mirror action to a transaction. */ +int +qemuMonitorDriveMirror(qemuMonitorPtr mon, virJSONValuePtr actions, + const char *device, const char *file, + const char *format, int mode) +{ + int ret; + + VIR_DEBUG("mon=%p, actions=%p, device=%s, file=%s, format=%s, mode=%o", + mon, actions, device, file, format, mode); + + if (!mon) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("monitor must not be NULL")); + return -1; + } + + if (mon->json) + ret = qemuMonitorJSONDriveMirror(mon, actions, device, file, format, + mode); + else + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("drive-mirror requires JSON monitor"));
You should set ret to -1 here (or return -1).
+ return ret; +} + /* Use the transaction QMP command to run atomic snapshot commands. */ int qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) @@ -2701,6 +2727,30 @@ qemuMonitorTransaction(qemuMonitorPtr mon, virJSONValuePtr actions) return ret; }
+/* Use the drive-reopen monitor command. */ +int +qemuMonitorDriveReopen(qemuMonitorPtr mon, const char *device, + const char *file, const char *format) +{ + int ret; + + VIR_DEBUG("mon=%p, device=%s, file=%s, format=%s", + mon, device, file, format); + + if (!mon) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("monitor must not be NULL")); + return -1; + } + + if (mon->json) + ret = qemuMonitorJSONDriveReopen(mon, device, file, format); + else + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("drive-reopen requires JSON monitor"));
You should set ret to -1 here (or return -1).
+ return ret; +} + int qemuMonitorArbitraryCommand(qemuMonitorPtr mon, const char *cmd, char **reply,
-- Federico

On 04/06/2012 10:04 AM, Federico Simoncelli wrote:
----- Original Message -----
From: "Eric Blake" <eblake@redhat.com> To: libvir-list@redhat.com Cc: pbonzini@redhat.com, fsimonce@redhat.com Sent: Friday, April 6, 2012 6:36:54 AM Subject: [PATCHv2 08/15] blockjob: expose qemu commands for mirrored storage migration
The new block copy storage migration sequence requires both the 'drive-mirror' action in 'transaction' (present if the 'drive-mirror' standalone monitor command also exists) and the 'drive-reopen' monitor command (it would be nice if that were also part of a 'transaction', but the initial qemu implementation has it standalone only).
+int +qemuMonitorDriveMirror(qemuMonitorPtr mon, virJSONValuePtr actions, + const char *device, const char *file, + const char *format, int mode) +{ + int ret; + + VIR_DEBUG("mon=%p, actions=%p, device=%s, file=%s, format=%s, mode=%o", + mon, actions, device, file, format, mode); + + if (!mon) { + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("monitor must not be NULL")); + return -1; + } + + if (mon->json) + ret = qemuMonitorJSONDriveMirror(mon, actions, device, file, format, + mode); + else + qemuReportError(VIR_ERR_INVALID_ARG, "%s", + _("drive-mirror requires JSON monitor"));
You should set ret to -1 here (or return -1).
Yep. Fixed in my tree for v3. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Handle the new type of block copy event and info. Of course, this patch does nothing until a later patch actually allows the creation/abort of a block copy job. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONHandleBlockJobImpl) (qemuMonitorJSONGetBlockJobInfoOne): Translate new job type. * src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Snoop a successful info query to save effort on a pivot request. --- src/qemu/qemu_driver.c | 6 ++++++ src/qemu/qemu_monitor_json.c | 4 ++++ 2 files changed, 10 insertions(+), 0 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 88dadca..c6033f8 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11679,6 +11679,12 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, if (ret < 0) goto endjob; + /* Snoop block copy operations, so future cancel operations can + * avoid checking if pivot is safe. */ + if (mode == BLOCK_JOB_INFO && ret == 1 && vm->def->disks[idx]->mirror && + info->cur == info->end && info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COPY) + vm->def->disks[idx]->mirroring = true; + /* Qemu provides asynchronous block job cancellation, but without * the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag libvirt guarantees a * synchronous operation. Provide this behavior by waiting here, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 93ebfbf..6cf0779 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -788,6 +788,8 @@ qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon, if (STREQ(type_str, "stream")) type = VIR_DOMAIN_BLOCK_JOB_TYPE_PULL; + else if (STREQ(type_str, "mirror")) + type = VIR_DOMAIN_BLOCK_JOB_TYPE_COPY; switch (event) { case VIR_DOMAIN_BLOCK_JOB_COMPLETED: @@ -3406,6 +3408,8 @@ static int qemuMonitorJSONGetBlockJobInfoOne(virJSONValuePtr entry, } if (STREQ(type, "stream")) info->type = VIR_DOMAIN_BLOCK_JOB_TYPE_PULL; + else if (STREQ(type, "mirror")) + info->type = VIR_DOMAIN_BLOCK_JOB_TYPE_COPY; else info->type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN; -- 1.7.7.6

This is the bare minimum to end a copy job (of course, until a later patch adds the ability to start a copy job, this patch doesn't do much in isolation; I've just split the patches to ease the review). This patch intentionally avoids SELinux, lock manager, and audit actions, saving that for a later patch that affects the overall lifecycle of a disk copy. In particular, I'm still fuzzy on the exact qemu error semantics, and whether I need to make more of an effort after a 'drive-reopen' fails. When a mirror job is started, cancelling the job safely reverts back to the source disk, regardless of whether the destination is in phase 1 (streaming, in which case the destination is worthless) or phase 2 (mirroring, in which case the destination is sync'd up to the source at the time of the cancel). Our existing code does just fine in either phase, other than some bookkeeping cleanup. Pivoting the job requires the use of the new 'drive-reopen' command. Here, failure of the command is potentially catastrophic to the domain, since it rips out the old disk before attempting to open the new one; if our recovery path of retrying the reopen on the original source disk also fails, the domain is hosed. If only qemu could get 'drive-reopen' inside 'transaction'... Interesting side note: while snapshot-create --disk-only creates a copy of the disk at a point in time by moving the domain on to a new file (the copy is the file now in the just-extended backing chain), blockjob --abort of a copy job creates a copy of the disk while keeping the domain on the original file. There may be potential improvements to the snapshot code to exploit block copy over multiple disks all at one point in time. And, if 'block_job_cancel' were made part of 'transaction', you could copy multiple disks at the same point in time without pausing the domain. This also implies we may want to add a --quiesce flag to the pivot operation, so that when breaking a mirror, the side of the mirror that we are abandoning is at least in a stable state with regards to guest I/O. * src/qemu/qemu_driver.c (qemuDomainBlockJobAbort): Accept new flag. (qemuDomainBlockPivot): New helper function. (qemuDomainBlockJobImpl): Implement it. --- src/qemu/qemu_driver.c | 118 +++++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 112 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index c6033f8..22829ba 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11621,6 +11621,86 @@ cleanup: return ret; } +/* Called while holding the VM job lock, to implement a block job + * abort with pivot; this updates the VM definition as appropriate, on + * either success or failure (although there are some forms of + * catastrophic failure that will leave the VM unusable). */ +static int +qemuDomainBlockPivot(struct qemud_driver *driver, virDomainObjPtr vm, + const char *device, virDomainDiskDefPtr disk) +{ + int ret = -1; + qemuDomainObjPrivatePtr priv = vm->privateData; + virDomainBlockJobInfo info; + + /* Probe the status, if needed. */ + if (!disk->mirroring) { + qemuDomainObjEnterMonitorWithDriver(driver, vm); + ret = qemuMonitorBlockJob(priv->mon, device, NULL, 0, &info, + BLOCK_JOB_INFO); + qemuDomainObjExitMonitorWithDriver(driver, vm); + if (ret < 0) + goto cleanup; + if (!virDomainObjIsActive(vm)) { + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not running")); + goto cleanup; + } + if (ret == 1 && info.cur == info.end && + info.type == VIR_DOMAIN_BLOCK_JOB_TYPE_COPY) + disk->mirroring = true; + } + + if (!disk->mirroring) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' not ready for pivot yet"), + disk->dst); + goto cleanup; + } + + /* Attempt the pivot. */ + qemuDomainObjEnterMonitorWithDriver(driver, vm); + ret = qemuMonitorDriveReopen(priv->mon, device, disk->mirror, + disk->mirrorFormat); + qemuDomainObjExitMonitorWithDriver(driver, vm); + + /* XXX Until qemu adds support for 'drive-reopen' inside + * 'transaction', we have the remote risk of a catastrophic + * failure, where the drive-reopen fails but can't recover by + * reopening the source. Not much we can do about it. */ + + if (ret == 0) { + /* XXX We want to revoke security labels and disk lease, as + * well as audit that revocation, before dropping the original + * source. But it gets tricky if both source and mirror share + * common backing files (we want to only revoke the non-shared + * portion of the chain, and is made more difficult by the + * fact that we aren't tracking the full chain ourselves; so + * for now, we leak the access to the original. */ + VIR_FREE(disk->src); + VIR_FREE(disk->driverType); + disk->src = disk->mirror; + disk->driverType = disk->mirrorFormat; + disk->mirror = NULL; + disk->mirrorFormat = NULL; + disk->mirroring = false; + } else { + /* On failure, qemu abandons the mirror, and attempts to + * revert back to the source disk. Hopefully it was able to + * reopen things. */ + /* XXX should we be parsing the exact qemu error, or calling + * 'query-block', to see what state we really got left in + * before killing the mirroring job? And just as on the + * success case, there's security labeling to worry about. */ + VIR_FREE(disk->mirror); + VIR_FREE(disk->mirrorFormat); + disk->mirroring = false; + } + +cleanup: + return ret; +} + static int qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, virDomainBlockJobInfoPtr info, @@ -11633,6 +11713,7 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, char *device = NULL; int ret = -1; int idx; + virDomainDiskDefPtr disk = NULL; qemuDriverLock(driver); virUUIDFormat(dom->uuid, uuidstr); @@ -11647,10 +11728,20 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, if (!device) { goto cleanup; } - if (mode == BLOCK_JOB_PULL && vm->def->disks[idx]->mirror) { + disk = vm->def->disks[idx]; + priv = vm->privateData; + + if (disk->mirror && mode == BLOCK_JOB_PULL) { qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, _("disk '%s' already in active block copy job"), - vm->def->disks[idx]->dst); + disk->dst); + goto cleanup; + } + if (!disk->mirror && mode == BLOCK_JOB_ABORT && + (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) { + qemuReportError(VIR_ERR_OPERATION_INVALID, + _("pivot of disk '%s' requires an active copy job"), + disk->dst); goto cleanup; } @@ -11663,8 +11754,13 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, goto endjob; } + if (disk->mirror && mode == BLOCK_JOB_ABORT && + (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) { + ret = qemuDomainBlockPivot(driver, vm, device, disk); + goto endjob; + } + qemuDomainObjEnterMonitorWithDriver(driver, vm); - priv = vm->privateData; /* XXX - add a qemu capability check, since only qemu 1.1 or newer * supports the base argument. * XXX - libvirt should really be tracking the backing file chain @@ -11681,9 +11777,18 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base, /* Snoop block copy operations, so future cancel operations can * avoid checking if pivot is safe. */ - if (mode == BLOCK_JOB_INFO && ret == 1 && vm->def->disks[idx]->mirror && + if (mode == BLOCK_JOB_INFO && ret == 1 && disk->mirror && info->cur == info->end && info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COPY) - vm->def->disks[idx]->mirroring = true; + disk->mirroring = true; + + /* A successful block job cancelation stops any mirroring. */ + if (mode == BLOCK_JOB_ABORT && disk->mirror) { + /* XXX We should also revoke security labels and disk lease on + * the mirror, and audit that fact, before dropping things. */ + VIR_FREE(disk->mirror); + VIR_FREE(disk->mirrorFormat); + disk->mirroring = false; + } /* Qemu provides asynchronous block job cancellation, but without * the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag libvirt guarantees a @@ -11741,7 +11846,8 @@ cleanup: static int qemuDomainBlockJobAbort(virDomainPtr dom, const char *path, unsigned int flags) { - virCheckFlags(VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC | + VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT, -1); return qemuDomainBlockJobImpl(dom, path, NULL, 0, NULL, BLOCK_JOB_ABORT, flags); } -- 1.7.7.6

Il 06/04/2012 06:36, Eric Blake ha scritto:
If only qemu could get 'drive-reopen' inside 'transaction'...
Just a quick answer to this: if qemu could get 'drive-reopen' inside 'transaction', the standalone command could be made safe just as easily. In fact, in QEMU 1.1 the blockdev-snapshot-sync command got a similar treatment: it became transactionable, _and_ the standalone command became a simple wrapper around a 1-command transaction. Paolo

On 04/06/2012 01:08 AM, Paolo Bonzini wrote:
Il 06/04/2012 06:36, Eric Blake ha scritto:
If only qemu could get 'drive-reopen' inside 'transaction'...
Just a quick answer to this: if qemu could get 'drive-reopen' inside 'transaction', the standalone command could be made safe just as easily. In fact, in QEMU 1.1 the blockdev-snapshot-sync command got a similar treatment: it became transactionable, _and_ the standalone command became a simple wrapper around a 1-command transaction.
Good point. So unless we have a situation where we want to do 'drive-reopen' on multiple disks at once, we don't actually have to use a transaction command to reopen a single disk. Meanwhile, for similarity with blockdev-snapshot-sync, it looks like I will someday be adding a VIR_DOMAIN_BLOCK_JOB_ABORT_ATOMIC flag, which fails if qemu lacks support for reopen within a transaction, and assuming that qemu exposes a way to see which commands are transactionable. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Il 06/04/2012 06:36, Eric Blake ha scritto:
if 'block_job_cancel' were made part of 'transaction', you could copy multiple disks at the same point in time without pausing the domain.
This is doable. The transactioned command would do a qemu_aio_flush() in the prepare phase, and a normal block_job_cancel in the commit phase.
This also implies we may want to add a --quiesce flag to the pivot operation, so that when breaking a mirror, the side of the mirror that we are abandoning is at least in a stable state with regards to guest I/O.
I don't think this is needed. Either it makes a difference so it should be the default, or it doesn't and then it is not a useful knob. Paolo

On 04/06/2012 09:19 AM, Paolo Bonzini wrote:
Il 06/04/2012 06:36, Eric Blake ha scritto:
if 'block_job_cancel' were made part of 'transaction', you could copy multiple disks at the same point in time without pausing the domain.
This is doable.
The transactioned command would do a qemu_aio_flush() in the prepare phase, and a normal block_job_cancel in the commit phase.
This also implies we may want to add a --quiesce flag to the pivot operation, so that when breaking a mirror, the side of the mirror that we are abandoning is at least in a stable state with regards to guest I/O.
I don't think this is needed. Either it makes a difference so it should be the default, or it doesn't and then it is not a useful knob.
I'm talking about the guest agent. It may make a difference, but cannot be the default, because you cannot trust the guest agent to be present. I'm thinking this will be like the --quiesce operation of snapshot-create, as another situation where the guest agent quiesce can usefully surround a 'transaction' command to improve the results, but must not be default. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Il 06/04/2012 17:28, Eric Blake ha scritto:
I'm talking about the guest agent. It may make a difference, but cannot be the default, because you cannot trust the guest agent to be present. I'm thinking this will be like the --quiesce operation of snapshot-create, as another situation where the guest agent quiesce can usefully surround a 'transaction' command to improve the results, but must not be default.
Ah, got it now. Paolo

Minimal patch to wire up all the pieces in the previous patches to actually enable a block copy job. By minimal, I mean that qemu creates the file, SELinux must be disabled, a lock manager is not informed, and the audit logs aren't updated. But those will be added as improvements in future patches. * src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function. (qemuDomainBlockRebase): Call it when appropriate. --- src/qemu/qemu_driver.c | 129 +++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 128 insertions(+), 1 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 22829ba..41b45ec 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11871,10 +11871,136 @@ qemuDomainBlockJobSetSpeed(virDomainPtr dom, const char *path, } static int +qemuDomainBlockCopy(virDomainPtr dom, const char *path, + const char *dest, const char *format, + unsigned long bandwidth, unsigned int flags) +{ + struct qemud_driver *driver = dom->conn->privateData; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + char uuidstr[VIR_UUID_STRING_BUFLEN]; + char *device = NULL; + virDomainDiskDefPtr disk; + int ret = -1; + int idx; + virJSONValuePtr actions = NULL; + int mode; + + /* Preliminaries: find the disk we are editing, sanity checks */ + virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW, -1); + + qemuDriverLock(driver); + virUUIDFormat(dom->uuid, uuidstr); + vm = virDomainFindByUUID(&driver->domains, dom->uuid); + if (!vm) { + qemuReportError(VIR_ERR_NO_DOMAIN, + _("no domain with matching uuid '%s'"), uuidstr); + goto cleanup; + } + + device = qemuDiskPathToAlias(vm, path, &idx); + if (!device) { + goto cleanup; + } + disk = vm->def->disks[idx]; + if (disk->mirror) { + qemuReportError(VIR_ERR_BLOCK_COPY_ACTIVE, + _("disk '%s' already in active block copy job"), + disk->dst); + goto cleanup; + } + + priv = vm->privateData; + if (!(qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_MIRROR) && + qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_REOPEN))) { + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("block copy is not supported with this QEMU binary")); + goto cleanup; + } + if (vm->persistent) { + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not transient")); + goto cleanup; + } + + if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + qemuReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not running")); + goto endjob; + } + + /* Prepare the destination file. */ + /* XXX We also need to add security labeling, lock manager lease, + * and auditing of those events, as well as to support reuse of + * existing images, including probing the existing format of an + * existing image. */ + if (!format) + format = disk->driverType; + if ((format && !(disk->mirrorFormat = strdup(format))) || + !(disk->mirror = strdup(dest))) { + virReportOOMError(); + goto endjob; + } + if (flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) + mode = QEMU_MONITOR_DRIVE_MIRROR_ABSOLUTE; + else + mode = QEMU_MONITOR_DRIVE_MIRROR_NO_BACKING; + + /* Actually start the mirroring */ + actions = virJSONValueNewArray(); + if (!actions) { + virReportOOMError(); + goto endjob; + } + qemuDomainObjEnterMonitorWithDriver(driver, vm); + ret = qemuMonitorDriveMirror(priv->mon, actions, device, dest, + format, mode); + if (ret == 0) + ret = qemuMonitorTransaction(priv->mon, actions); + if (ret == 0 && bandwidth != 0) + ret = qemuMonitorBlockJob(priv->mon, device, NULL, bandwidth, NULL, + BLOCK_JOB_SPEED_INTERNAL); + qemuDomainObjExitMonitorWithDriver(driver, vm); + virJSONValueFree(actions); + +endjob: + if (ret < 0) { + VIR_FREE(disk->mirror); + VIR_FREE(disk->mirrorFormat); + } + if (qemuDomainObjEndJob(driver, vm) == 0) { + vm = NULL; + goto cleanup; + } + +cleanup: + VIR_FREE(device); + if (vm) + virDomainObjUnlock(vm); + qemuDriverUnlock(driver); + return ret; +} + +static int qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { - virCheckFlags(0, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | + VIR_DOMAIN_BLOCK_REBASE_COPY | + VIR_DOMAIN_BLOCK_REBASE_COPY_RAW, -1); + + if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY) { + const char *format = NULL; + if (flags & VIR_DOMAIN_BLOCK_REBASE_COPY_RAW) + format = "raw"; + flags &= ~(VIR_DOMAIN_BLOCK_REBASE_COPY | + VIR_DOMAIN_BLOCK_REBASE_COPY_RAW); + return qemuDomainBlockCopy(dom, path, base, format, bandwidth, flags); + } + return qemuDomainBlockJobImpl(dom, path, base, bandwidth, NULL, BLOCK_JOB_PULL, flags); } @@ -11883,6 +12009,7 @@ static int qemuDomainBlockPull(virDomainPtr dom, const char *path, unsigned long bandwidth, unsigned int flags) { + virCheckFlags(0, -1); return qemuDomainBlockRebase(dom, path, NULL, bandwidth, flags); } -- 1.7.7.6

On 04/05/2012 10:36 PM, Eric Blake wrote:
Minimal patch to wire up all the pieces in the previous patches to actually enable a block copy job. By minimal, I mean that qemu creates the file, SELinux must be disabled, a lock manager is not informed, and the audit logs aren't updated. But those will be added as improvements in future patches.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function. (qemuDomainBlockRebase): Call it when appropriate. --- src/qemu/qemu_driver.c | 129 +++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 128 insertions(+), 1 deletions(-)
Squash this in to simplify things based on Paolo's suggestion for 8/15. diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index f8f2210..1b1c921 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -11880,7 +11880,6 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, virDomainDiskDefPtr disk; int ret = -1; int idx; - virJSONValuePtr actions = NULL; int mode; /* Preliminaries: find the disk we are editing, sanity checks */ @@ -11947,21 +11946,12 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, mode = QEMU_MONITOR_DRIVE_MIRROR_NO_BACKING; /* Actually start the mirroring */ - actions = virJSONValueNewArray(); - if (!actions) { - virReportOOMError(); - goto endjob; - } qemuDomainObjEnterMonitorWithDriver(driver, vm); - ret = qemuMonitorDriveMirror(priv->mon, actions, device, dest, - format, mode); - if (ret == 0) - ret = qemuMonitorTransaction(priv->mon, actions); + ret = qemuMonitorDriveMirror(priv->mon, NULL, device, dest, format, mode); if (ret == 0 && bandwidth != 0) ret = qemuMonitorBlockJob(priv->mon, device, NULL, bandwidth, NULL, BLOCK_JOB_SPEED_INTERNAL); qemuDomainObjExitMonitorWithDriver(driver, vm); - virJSONValueFree(actions); endjob: if (ret < 0) { -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 04/06/2012 09:36 AM, Eric Blake wrote:
On 04/05/2012 10:36 PM, Eric Blake wrote:
Minimal patch to wire up all the pieces in the previous patches to actually enable a block copy job. By minimal, I mean that qemu creates the file, SELinux must be disabled, a lock manager is not informed, and the audit logs aren't updated. But those will be added as improvements in future patches.
I'm still testing this, but this is probably what I will add as 11.5/15 in my v3 series for allowing the REUSE_EXT flag. From 804c1b654af7f4dddbd3dd768cce7c041436b761 Mon Sep 17 00:00:00 2001 From: Eric Blake <eblake@redhat.com> Date: Fri, 6 Apr 2012 10:32:07 -0600 Subject: [PATCH] blockjob: allow for existing files This copies some of the checks from snapshots regarding testing when a file already exists. * src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow REUSE_EXT flag. (qemuDomainBlockCopy): Wire up flag, and add some sanity checks. --- src/qemu/qemu_driver.c | 51 ++++++++++++++++++++++++++++++++++++++++++----- 1 files changed, 45 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1b1c921..f7b2a01 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11881,9 +11881,11 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, int ret = -1; int idx; int mode; + struct stat st; /* Preliminaries: find the disk we are editing, sanity checks */ - virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW, -1); + virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT, -1); qemuDriverLock(driver); virUUIDFormat(dom->uuid, uuidstr); @@ -11928,19 +11930,55 @@ qemuDomainBlockCopy(virDomainPtr dom, const char *path, goto endjob; } + if ((flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) && + STREQ_NULLABLE(format, "raw") && + STRNEQ_NULLABLE(disk->driverType, "raw")) { + qemuReportError(VIR_ERR_INVALID_ARG, + _("raw shallow copy of non-raw disk '%s' not possible"), + disk->dst); + goto endjob; + } + /* XXX this is pessimistic; we could use 'query-block' or even + * keep track of the backing chain ourselves, rather than assuming + * that all non-raw source files have a current backing image */ + if ((flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT) && + !(flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) && + STRNEQ_NULLABLE(disk->driverType, "raw")) { + qemuReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, + _("non-shallow copy of non-raw disk '%s' to existing " + "destination not supported"), + disk->dst); + goto endjob; + } + /* Prepare the destination file. */ + if (stat(dest, &st) < 0) { + if (errno != ENOENT) { + virReportSystemError(errno, _("unable to stat for disk %s: %s"), + disk->dst, dest); + goto endjob; + } + } else if (!(S_ISBLK(st.st_mode) || !st.st_size || + (flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT))) { + qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("external destination file for disk %s already " + "exists and is not a block device: %s"), + disk->dst, dest); + goto endjob; + } + /* XXX We also need to add security labeling, lock manager lease, - * and auditing of those events, as well as to support reuse of - * existing images, including probing the existing format of an - * existing image. */ - if (!format) + * and auditing of those events. */ + if (!format && !(flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)) format = disk->driverType; if ((format && !(disk->mirrorFormat = strdup(format))) || !(disk->mirror = strdup(dest))) { virReportOOMError(); goto endjob; } - if (flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) + if (flags & VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT) + mode = QEMU_MONITOR_DRIVE_MIRROR_EXISTING; + else if (flags & VIR_DOMAIN_BLOCK_REBASE_SHALLOW) mode = QEMU_MONITOR_DRIVE_MIRROR_ABSOLUTE; else mode = QEMU_MONITOR_DRIVE_MIRROR_NO_BACKING; @@ -11976,6 +12014,7 @@ qemuDomainBlockRebase(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags) { virCheckFlags(VIR_DOMAIN_BLOCK_REBASE_SHALLOW | + VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT | VIR_DOMAIN_BLOCK_REBASE_COPY | VIR_DOMAIN_BLOCK_REBASE_COPY_RAW, -1); -- 1.7.7.6 -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

RHEL-only drive-mirror and drive-reopen are still under upstream qemu discussion; as a result, RHEL decided to backport things under a downstream name. Accommodate this alternate spelling. I don't think it's worth trying to support both spellings at once. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Check for alternate spelling. (int qemuMonitorJSONDriveMirror, qemuMonitorJSONDriveReopen): Use that spelling. --- src/qemu/qemu_monitor_json.c | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 6cf0779..bb819b0 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -973,9 +973,9 @@ qemuMonitorJSONCheckCommands(qemuMonitorPtr mon, qemuCapsSet(qemuCaps, QEMU_CAPS_WAKEUP); else if (STREQ(name, "transaction")) qemuCapsSet(qemuCaps, QEMU_CAPS_TRANSACTION); - else if (STREQ(name, "drive-mirror")) + else if (STREQ(name, "__com.redhat_drive-mirror")) qemuCapsSet(qemuCaps, QEMU_CAPS_DRIVE_MIRROR); - else if (STREQ(name, "drive-reopen")) + else if (STREQ(name, "__com.redhat_drive-reopen")) qemuCapsSet(qemuCaps, QEMU_CAPS_DRIVE_REOPEN); } @@ -3203,7 +3203,7 @@ qemuMonitorJSONDriveMirror(qemuMonitorPtr mon ATTRIBUTE_UNUSED, virJSONValuePtr cmd; cmd = qemuMonitorJSONMakeCommandRaw(true, - "drive-mirror", + "__com.redhat_drive-mirror", "s:device", device, "s:target", file, "s:format", format, @@ -3263,7 +3263,7 @@ qemuMonitorJSONDriveReopen(qemuMonitorPtr mon, const char *device, virJSONValuePtr cmd; virJSONValuePtr reply = NULL; - cmd = qemuMonitorJSONMakeCommand("drive-reopen", + cmd = qemuMonitorJSONMakeCommand("__com.redhat_drive-reopen", "s:device", device, "s:new-image-file", file, "s:format", format, -- 1.7.7.6

This new API provides additional flexibility over what can be crammed on top of virDomainBlockRebase (namely, the ability to specify an arbitrary destination format, for things like copying qcow2 into qed without having to pre-create the destination), at the expense that it cannot be backported without bumping the .so version. * include/libvirt/libvirt.h.in (virDomainBlockCopy): New API. * src/libvirt.c (virDomainBlockCopy): Implement it. * src/libvirt_public.syms (LIBVIRT_0.9.12): Export it. * src/driver.h (virDrvDomainBlockCopy): New driver callback. * docs/apibuild.py (CParser.parseSignature): Add exception. --- docs/apibuild.py | 1 + include/libvirt/libvirt.h.in | 18 ++++++- src/driver.h | 5 ++ src/libvirt.c | 119 +++++++++++++++++++++++++++++++++++++++++- src/libvirt_public.syms | 5 ++ 5 files changed, 145 insertions(+), 3 deletions(-) diff --git a/docs/apibuild.py b/docs/apibuild.py index 1ac0281..bf06f3b 100755 --- a/docs/apibuild.py +++ b/docs/apibuild.py @@ -1650,6 +1650,7 @@ class CParser: "virDomainBlockJobSetSpeed" : (False, ("bandwidth")), "virDomainBlockPull" : (False, ("bandwidth")), "virDomainBlockRebase" : (False, ("bandwidth")), + "virDomainBlockCopy" : (False, ("bandwidth")), "virDomainMigrateGetMaxSpeed" : (False, ("bandwidth")) } def checkLongLegacyFunction(self, name, return_type, signature): diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index ac5df95..7cd5bbe 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1937,7 +1937,7 @@ int virDomainUpdateDeviceFlags(virDomainPtr domain, * VIR_DOMAIN_BLOCK_JOB_TYPE_PULL: Block Pull (virDomainBlockPull, or * virDomainBlockRebase without flags), job ends on completion * VIR_DOMAIN_BLOCK_JOB_TYPE_COPY: Block Copy (virDomainBlockRebase with - * flags), job exists as long as mirroring is active + * flags, or virDomainBlockCopy), job exists as long as mirroring is active */ typedef enum { VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN = 0, @@ -2007,6 +2007,22 @@ int virDomainBlockRebase(virDomainPtr dom, const char *disk, const char *base, unsigned long bandwidth, unsigned int flags); +/** + * virDomainBlockCopyFlags: + * + * Flags available for virDomainBlockCopy(). + */ +typedef enum { + VIR_DOMAIN_BLOCK_COPY_SHALLOW = 1 << 0, /* Limit copy to top of source + backing chain */ + VIR_DOMAIN_BLOCK_COPY_REUSE_EXT = 1 << 1, /* Reuse existing external + file for a copy */ +} virDomainBlockCopyFlags; + +int virDomainBlockCopy(virDomainPtr dom, const char *disk, const char *dest, + const char *format, unsigned long bandwidth, + unsigned int flags); + /* Block I/O throttling support */ diff --git a/src/driver.h b/src/driver.h index 03d249b..e10ba71 100644 --- a/src/driver.h +++ b/src/driver.h @@ -788,6 +788,10 @@ typedef int (*virDrvDomainBlockRebase)(virDomainPtr dom, const char *path, const char *base, unsigned long bandwidth, unsigned int flags); +typedef int + (*virDrvDomainBlockCopy)(virDomainPtr dom, const char *path, + const char *dest, const char *format, + unsigned long bandwidth, unsigned int flags); typedef int (*virDrvSetKeepAlive)(virConnectPtr conn, @@ -1005,6 +1009,7 @@ struct _virDriver { virDrvDomainBlockJobSetSpeed domainBlockJobSetSpeed; virDrvDomainBlockPull domainBlockPull; virDrvDomainBlockRebase domainBlockRebase; + virDrvDomainBlockCopy domainBlockCopy; virDrvSetKeepAlive setKeepAlive; virDrvConnectIsAlive isAlive; virDrvNodeSuspendForDuration nodeSuspendForDuration; diff --git a/src/libvirt.c b/src/libvirt.c index d4b648c..7a8f7c0 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -18260,7 +18260,11 @@ error: * return an error if bandwidth is not 0. * * When @base is NULL and @flags is 0, this is identical to - * virDomainBlockPull(). + * virDomainBlockPull(). Conversely, when @flags includes + * VIR_DOMAIN_BLOCK_REBASE_COPY, this is shorthand for + * virDomainBlockCopy(dom, disk, base, + * flags & VIR_DOMAIN_BLOCK_COPY_RAW ? "raw" : NULL, bandwidth, + * flags & (VIR_DOMAIN_BLOCK_REBASE_SHALLOW|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)). * * Returns 0 if the operation has started, -1 on failure. */ @@ -18270,7 +18274,7 @@ int virDomainBlockRebase(virDomainPtr dom, const char *disk, { virConnectPtr conn; - VIR_DOMAIN_DEBUG(dom, "disk=%s, base=%s bandwidth=%lu, flags=%x", + VIR_DOMAIN_DEBUG(dom, "disk=%s, base=%s, bandwidth=%lu, flags=%x", disk, NULLSTR(base), bandwidth, flags); virResetLastError(); @@ -18325,6 +18329,117 @@ error: /** + * virDomainBlockCopy: + * @dom: pointer to domain object + * @disk: path to the block device, or device shorthand + * @dest: path to the copy destination + * @format: format of the destination + * @bandwidth: (optional) specify copy bandwidth limit in Mbps + * @flags: bitwise-OR of virDomainBlockCopyFlags + * + * Start a block copy job, where @dest is the name of a new file to copy + * the chain to. By default, the copy will pull the entire source chain + * into the destination file, but if @flags also contains + * VIR_DOMAIN_BLOCK_COPY_SHALLOW, then only the top of the source chain + * will be copied (the source and destination have a common backing file). + * By default, @dest will be created with the same file format as the + * source, but this can be altered by passing a non-NULL @format argument, + * or by using VIR_DOMAIN_BLOCK_COPY_REUSE_EXT to reuse an existing file + * with initial contents identical to the backing file of the source (this + * allows a management app to pre-create files with relative backing file + * names, rather than the default of absolute backing file names; it is + * generally used with the shallow flag, since otherwise the destination + * file must start with empty contents). + * + * A copy job has two parts; in the first phase, the @bandwidth parameter + * affects how fast the source is pulled into the destination, and the job + * can only be canceled by reverting to the source file; progress in this + * phase can be tracked via the virDomainBlockJobInfo() command, with a + * job type of VIR_DOMAIN_BLOCK_JOB_TYPE_COPY. The job transitions to the + * second phase when the job info states cur == end, and remains alive to + * mirror all further changes to both source and destination. The user + * must call virDomainBlockJobAbort() to end the mirroring while choosing + * whether to revert to source or pivot to the destination. An event is + * issued when the job ends, and in the future, an event may be added when + * the job transitions from pulling to mirroring. + * + * Some hypervisors will restrict certain actions, such as virDomainSave() + * or virDomainDetachDevice(), while a copy job is active; they may + * also restrict a copy job to transient domains. + * + * The @disk parameter is either an unambiguous source name of the + * block device (the <source file='...'/> sub-element, such as + * "/path/to/image"), or the device target shorthand (the + * <target dev='...'/> sub-element, such as "xvda"). Valid names + * can be found by calling virDomainGetXMLDesc() and inspecting + * elements within //domain/devices/disk. + * + * The maximum bandwidth (in Mbps) that will be used to do the copy can be + * specified with the bandwidth parameter. If set to 0, libvirt will choose a + * suitable default. Some hypervisors do not support this feature and will + * return an error if bandwidth is not 0. + * + * When @format is NULL, this is equivalent to calling + * virDomainBlockRebase() with the VIR_DOMAIN_BLOCK_REBASE_COPY flag added + * to @flags. Additionally, if @format is "raw", this is equivalent to + * calling virDomainBlockRebase() with the VIR_DOMAIN_BLOCK_REBASE_COPY + * and VIR_DOMAIN_BLOCK_REBASE_COPY_RAW flags added to @flags. + * + * Returns 0 if the operation has started, -1 on failure. + */ +int virDomainBlockCopy(virDomainPtr dom, const char *disk, + const char *dest, const char *format, + unsigned long bandwidth, unsigned int flags) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(dom, + "disk=%s, dest=%s, format=%s, bandwidth=%lu, flags=%x", + disk, dest, NULLSTR(format), bandwidth, flags); + + virResetLastError(); + + if (!VIR_IS_CONNECTED_DOMAIN (dom)) { + virLibDomainError(VIR_ERR_INVALID_DOMAIN, __FUNCTION__); + virDispatchError(NULL); + return -1; + } + conn = dom->conn; + + if (dom->conn->flags & VIR_CONNECT_RO) { + virLibDomainError(VIR_ERR_OPERATION_DENIED, __FUNCTION__); + goto error; + } + + if (!disk) { + virLibDomainError(VIR_ERR_INVALID_ARG, + _("disk is NULL")); + goto error; + } + if (!dest) { + virLibDomainError(VIR_ERR_INVALID_ARG, + _("dest is NULL")); + goto error; + } + + if (conn->driver->domainBlockCopy) { + int ret; + ret = conn->driver->domainBlockCopy(dom, disk, dest, format, bandwidth, + flags); + if (ret < 0) + goto error; + return ret; + } + + virLibDomainError(VIR_ERR_NO_SUPPORT, __FUNCTION__); + +error: + virDispatchError(dom->conn); + return -1; +} + + +/** * virDomainOpenGraphics: * @dom: pointer to domain object * @idx: index of graphics config to open diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 46c13fb..d152ab9 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -534,4 +534,9 @@ LIBVIRT_0.9.11 { virDomainPMWakeup; } LIBVIRT_0.9.10; +LIBVIRT_0.9.12 { + global: + virDomainBlockCopy; +} LIBVIRT_0.9.11; + # .... define new API here using predicted next version number .... -- 1.7.7.6

Expose the full abilities of virDomainBlockCopy. * tools/virsh.c (blockJobImpl): Add --format option for block copy. * tools/virsh.pod (blockcopy): Document this. --- tools/virsh.c | 25 ++++++++++++++++++++----- tools/virsh.pod | 10 +++++----- 2 files changed, 25 insertions(+), 10 deletions(-) diff --git a/tools/virsh.c b/tools/virsh.c index 99f3d22..e855951 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -7527,6 +7527,7 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, unsigned long bandwidth = 0; int ret = -1; const char *base = NULL; + const char *format = NULL; unsigned int flags = 0; if (!vshConnectionUsability(ctl, ctl->conn)) @@ -7568,14 +7569,27 @@ blockJobImpl(vshControl *ctl, const vshCmd *cmd, case VSH_CMD_BLOCK_JOB_COPY: flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; if (vshCommandOptBool(cmd, "shallow")) - flags |= VIR_DOMAIN_BLOCK_REBASE_SHALLOW; + flags |= VIR_DOMAIN_BLOCK_COPY_SHALLOW; if (vshCommandOptBool(cmd, "reuse-external")) - flags |= VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT; - if (vshCommandOptBool(cmd, "raw")) - flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + flags |= VIR_DOMAIN_BLOCK_COPY_REUSE_EXT; if (vshCommandOptString(cmd, "dest", &base) < 0) goto cleanup; - ret = virDomainBlockRebase(dom, path, base, bandwidth, flags); + if (vshCommandOptString(cmd, "format", &format) < 0) + goto cleanup; + if (!format) { + if (vshCommandOptBool(cmd, "raw")) + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY_RAW; + flags |= VIR_DOMAIN_BLOCK_REBASE_COPY; + ret = virDomainBlockRebase(dom, path, base, bandwidth, flags); + } else { + if (vshCommandOptBool(cmd, "raw")) { + vshError(ctl, "%s", + _("--raw and --format are mutually exclusive")); + goto cleanup; + } + ret = virDomainBlockCopy(dom, path, base, format, + bandwidth, flags); + } } cleanup: @@ -7598,6 +7612,7 @@ static const vshCmdOptDef opts_block_copy[] = { {"path", VSH_OT_DATA, VSH_OFLAG_REQ, N_("Fully-qualified path of disk")}, {"dest", VSH_OT_DATA, VSH_OFLAG_REQ, N_("path of the copy to create")}, {"bandwidth", VSH_OT_DATA, VSH_OFLAG_NONE, N_("Bandwidth limit in MB/s")}, + {"format", VSH_OT_DATA, VSH_OFLAG_NONE, N_("file format of dest")}, {"shallow", VSH_OT_BOOL, 0, N_("make the copy share a backing chain")}, {"reuse-external", VSH_OT_BOOL, 0, N_("reuse existing destination")}, {"raw", VSH_OT_BOOL, 0, N_("use raw destination file")}, diff --git a/tools/virsh.pod b/tools/virsh.pod index 50a495d..1e5ce12 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -639,7 +639,7 @@ address of virtual interface (such as I<detach-interface> or I<domif-setlink>) will accept the MAC address printed by this command. =item B<blockcopy> I<domain> I<path> I<dest> [I<bandwidth>] [I<--shallow>] -[I<--reuse-external>] [I<--raw>] +[I<--reuse-external>] { [I<--raw>] | I<format> } Copy a disk backing image chain to I<dest>. By default, this command flattens the entire chain; but if I<--shallow> is specified, the copy @@ -652,10 +652,10 @@ is used, otherwise it must start empty); this option is typically used to set up a relative backing file name in the destination. The format of the destination is determined by the first match in the -following list: if I<--raw> is specified, it will be raw; if -I<--reuse-external> is specified, the existing destination is probed -for a format; and in all other cases, the destination format will -match the source format. +following list: if I<--raw> is specified, it will be raw; if I<format> +is specified, it will have that format; if I<--reuse-external> is +specified, the existing destination is probed for a format; and in +all other cases, the destination format will match the source format. The copy runs in the background; initially, the job must copy all data from the source, and during this phase, the job can only be canceled to -- 1.7.7.6

Almost trivial; the trick was dealing with the fact that we're stuck with 'unsigned long bandwidth' due to earlier design decisions. * src/remote/remote_protocol.x (remote_domain_block_copy_args): New struct. * src/remote/remote_driver.c (remote_driver): Use it. * src/rpc/gendispatch.pl (name_to_ProcName): Cater to legacy bandwidth type. * src/remote_protocol-structs: Regenerate. * src/qemu/qemu_driver.c (qemuDriver): Wire up driver callback. --- src/qemu/qemu_driver.c | 1 + src/remote/remote_driver.c | 1 + src/remote/remote_protocol.x | 12 +++++++++++- src/remote_protocol-structs | 9 +++++++++ src/rpc/gendispatch.pl | 1 + 5 files changed, 23 insertions(+), 1 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 41b45ec..cc43631 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13107,6 +13107,7 @@ static virDriver qemuDriver = { .domainBlockJobSetSpeed = qemuDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = qemuDomainBlockPull, /* 0.9.4 */ .domainBlockRebase = qemuDomainBlockRebase, /* 0.9.10 */ + .domainBlockCopy = qemuDomainBlockCopy, /* 0.9.12 */ .isAlive = qemuIsAlive, /* 0.9.8 */ .nodeSuspendForDuration = nodeSuspendForDuration, /* 0.9.8 */ .domainSetBlockIoTune = qemuDomainSetBlockIoTune, /* 0.9.8 */ diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index af46384..b431779 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -5095,6 +5095,7 @@ static virDriver remote_driver = { .domainBlockJobSetSpeed = remoteDomainBlockJobSetSpeed, /* 0.9.4 */ .domainBlockPull = remoteDomainBlockPull, /* 0.9.4 */ .domainBlockRebase = remoteDomainBlockRebase, /* 0.9.10 */ + .domainBlockCopy = remoteDomainBlockCopy, /* 0.9.12 */ .setKeepAlive = remoteSetKeepAlive, /* 0.9.8 */ .isAlive = remoteIsAlive, /* 0.9.8 */ .nodeSuspendForDuration = remoteNodeSuspendForDuration, /* 0.9.8 */ diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 2d57247..65b4a1d 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -1188,6 +1188,14 @@ struct remote_domain_block_rebase_args { unsigned hyper bandwidth; unsigned int flags; }; +struct remote_domain_block_copy_args { + remote_nonnull_domain dom; + remote_nonnull_string path; + remote_nonnull_string dest; + remote_string format; + unsigned hyper bandwidth; + unsigned int flags; +}; struct remote_domain_set_block_io_tune_args { remote_nonnull_domain dom; @@ -2782,7 +2790,9 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_PM_WAKEUP = 267, /* autogen autogen */ REMOTE_PROC_DOMAIN_EVENT_TRAY_CHANGE = 268, /* autogen autogen */ REMOTE_PROC_DOMAIN_EVENT_PMWAKEUP = 269, /* autogen autogen */ - REMOTE_PROC_DOMAIN_EVENT_PMSUSPEND = 270 /* autogen autogen */ + REMOTE_PROC_DOMAIN_EVENT_PMSUSPEND = 270, /* autogen autogen */ + + REMOTE_PROC_DOMAIN_BLOCK_COPY = 271 /* autogen autogen */ /* * Notice how the entries are grouped in sets of 10 ? diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 9b2414f..dd2aa72 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -845,6 +845,14 @@ struct remote_domain_block_rebase_args { uint64_t bandwidth; u_int flags; }; +struct remote_domain_block_copy_args { + remote_nonnull_domain dom; + remote_nonnull_string path; + remote_nonnull_string dest; + remote_string format; + uint64_t bandwidth; + u_int flags; +}; struct remote_domain_set_block_io_tune_args { remote_nonnull_domain dom; remote_nonnull_string disk; @@ -2192,4 +2200,5 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_EVENT_TRAY_CHANGE = 268, REMOTE_PROC_DOMAIN_EVENT_PMWAKEUP = 269, REMOTE_PROC_DOMAIN_EVENT_PMSUSPEND = 270, + REMOTE_PROC_DOMAIN_BLOCK_COPY = 271, }; diff --git a/src/rpc/gendispatch.pl b/src/rpc/gendispatch.pl index f161ee0..e5e28e0 100755 --- a/src/rpc/gendispatch.pl +++ b/src/rpc/gendispatch.pl @@ -232,6 +232,7 @@ my $long_legacy = { NodeGetInfo => { ret => { memory => 1 } }, DomainBlockPull => { arg => { bandwidth => 1 } }, DomainBlockRebase => { arg => { bandwidth => 1 } }, + DomainBlockCopy => { arg => { bandwidth => 1 } }, DomainBlockJobSetSpeed => { arg => { bandwidth => 1 } }, DomainMigrateGetMaxSpeed => { ret => { bandwidth => 1 } }, }; -- 1.7.7.6

On 04/05/2012 10:36 PM, Eric Blake wrote:
v1 was here: https://www.redhat.com/archives/libvir-list/2012-April/msg00068.html
changes from v1: Paolo has updated the qemu side of things, and built a scratch image for RHEL that I was able to test with for the new semantics. I was actually able to successfully run two different block copy jobs, one canceled and one pivoted, with SELinux disabled and this patch series.
Patch 12/15 is for reference only when working with Paolo's build; it will not go upstream.
Patches 13-15 are optional; the extra flexibility might be nice, but I haven't yet played with those three enough to know if Paolo's build behaves like I was expecting.
I obviously have more patches to write, such as supporting the REUSE_EXT flag to reuse files instead of creating a new one, and fixing several XXX comments related to SELinux, auditing, and disk locking. But they can be extra patches on top of this starting series.
For reference, I have now pushed this v2 series to my staging git repository (I may rewind the branch as I update things to rebase my snapshot+mirror proposal on top of this): http://repo.or.cz/w/libvirt/ericb.git/shortlog/refs/heads/snapshot git fetch git://repo.or.cz/libvirt/ericb.git snapshot -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 04/06/2012 08:24 AM, Eric Blake wrote:
On 04/05/2012 10:36 PM, Eric Blake wrote:
v1 was here: https://www.redhat.com/archives/libvir-list/2012-April/msg00068.html
changes from v1: Paolo has updated the qemu side of things, and built a scratch image for RHEL that I was able to test with for the new semantics. I was actually able to successfully run two different block copy jobs, one canceled and one pivoted, with SELinux disabled and this patch series.
Patch 12/15 is for reference only when working with Paolo's build; it will not go upstream.
Patches 13-15 are optional; the extra flexibility might be nice, but I haven't yet played with those three enough to know if Paolo's build behaves like I was expecting.
I obviously have more patches to write, such as supporting the REUSE_EXT flag to reuse files instead of creating a new one, and fixing several XXX comments related to SELinux, auditing, and disk locking. But they can be extra patches on top of this starting series.
For reference, I have now pushed this v2 series to my staging git repository (I may rewind the branch as I update things to rebase my snapshot+mirror proposal on top of this):
http://repo.or.cz/w/libvirt/ericb.git/shortlog/refs/heads/snapshot
git fetch git://repo.or.cz/libvirt/ericb.git snapshot
I rebased this to point to my v3 proposal. https://www.redhat.com/archives/libvir-list/2012-April/msg00288.html -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (4)
-
Adam Litke
-
Eric Blake
-
Federico Simoncelli
-
Paolo Bonzini