
On 04/13/2012 08:46 AM, Jiri Denemark wrote:
On Mon, Apr 09, 2012 at 21:52:21 -0600, Eric Blake wrote:
Minimal patch to wire up all the pieces in the previous patches to actually enable a block copy job. By minimal, I mean that qemu creates the file (that is, no REUSE_EXT flag support yet), SELinux must be disabled, a lock manager is not informed, and the audit logs aren't updated. But those will be added as improvements in future patches.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
(qemuDomainBlockRebase): Call it when appropriate.
---
 src/qemu/qemu_driver.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 113 insertions(+), 1 deletions(-)
+
+    priv = vm->privateData;
+    if (!(qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_MIRROR) &&
+          qemuCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_REOPEN))) {
+        qemuReportError(VIR_ERR_OPERATION_INVALID, "%s",
We usually use VIR_ERR_CONFIG_UNSUPPORTED in such cases.
Sure, easy to fix. Actually, looking at 'git grep -B3 -i "qemu binary" src/qemu', I counted an INTERNAL_ERROR, an OPERATION_FAILED, and a couple of OPERATION_INVALID that should all be cleaned up, since the majority of existing uses do indeed favor CONFIG_UNSUPPORTED. I'll split that cleanup into a separate patch.
+                        _("block copy is not supported with this QEMU binary"));
+        goto cleanup;
+    }
+    if (vm->persistent) {
+        qemuReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                        _("domain is not transient"));
+        goto cleanup;
+    }
I guess I wasn't paying enough attention somewhere but why do we forbid block copy for persistent domains? I understand why we want to forbid certain operations when block copy is active but I don't see a reason for penalizing persistent domains.
It was in patch 8/18 where I added the restrictions, and hopefully the commit message there documented why limiting things to transient domains is a good first step:

1. the first client of live migration is oVirt, which uses transient domains

2. qemu does not (yet) provide a way to resume a mirror when restarting a domain, so anything that would require restoring a domain from saved state is broken: incoming migration, virsh restore, and virsh start after a managed save. But users would be upset if they saved a domain only to find out that they cannot then restore it, so I squelched things one step earlier in the process, by preventing any save of a domain with an active mirror so that we never have a broken save image in the first place.

My worry now comes from the fact that managedsave is on the list of forbidden operations. If a domain is transient, managedsave is already forbidden (it is assumed that you either don't care about the domain if the host dies, or that you are running a higher-level app like oVirt that knows how to rebuild the guest on a different host). But if a guest is persistent, and you use the libvirt-guests init script, then you have a right to assume that rebooting your host will resume your guests in the same state they were in before the host went down, because libvirt-guests uses managedsave. If we allowed a mirror job on a persistent domain, we would violate this assumption (libvirt-guests would fail to save the guest).

Therefore, I forbid starting a mirror job on a persistent domain, just as I forbid using 'virsh define' to turn a transient domain into a persistent one while a mirror job is active. If, at a later date, qemu comes up with a way to resume mirroring when restarting a domain, we can relax these restrictions.
+
+    /* Actually start the mirroring */
+    qemuDomainObjEnterMonitorWithDriver(driver, vm);
+    ret = qemuMonitorDriveMirror(priv->mon, NULL, device, dest, format, flags);
+    if (ret == 0 && bandwidth != 0)
+        ret = qemuMonitorBlockJob(priv->mon, device, NULL, bandwidth, NULL,
+                                  BLOCK_JOB_SPEED_INTERNAL);
+    qemuDomainObjExitMonitorWithDriver(driver, vm);
Can BLOCK_JOB_SPEED_INTERNAL fail while block copy remains running? If so, we should try to abort the job in case we fail to set the speed, otherwise we will report an error and forget about the job while qemu will keep copying data.
Good catch, and virDomainBlockPull() suffers from the same problem. I think both code paths need the same fix.
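A minimal sketch of the error-handling shape being discussed, with hypothetical stand-in functions (the real calls are qemuMonitorDriveMirror/qemuMonitorBlockJob and take a qemuMonitorPtr, not these toy signatures): if setting the bandwidth fails after the copy job has already started, abort the job before returning the error, so qemu does not keep copying data behind libvirt's back.

```c
#include <assert.h>

/* Hypothetical stand-ins for the monitor calls; a flag models
 * whether qemu still has a job running. */
static int start_mirror(int *job_running) { *job_running = 1; return 0; }
static int set_speed(int should_fail) { return should_fail ? -1 : 0; }
static int abort_job(int *job_running) { *job_running = 0; return 0; }

/* Start the mirror, then apply the bandwidth limit.  On a speed-setting
 * failure, tear down the job we just started instead of leaking it. */
int start_mirror_with_bandwidth(int speed_fails, int *job_running)
{
    if (start_mirror(job_running) < 0)
        return -1;
    if (set_speed(speed_fails) < 0) {
        /* Best effort: abort the now-unwanted job before reporting error */
        abort_job(job_running);
        return -1;
    }
    return 0;
}
```

The key point is only the ordering: any failure after the job has been created must unwind the job, not merely report the error.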
@@ -11889,6 +12000,7 @@ static int
 qemuDomainBlockPull(virDomainPtr dom, const char *path,
                     unsigned long bandwidth, unsigned int flags)
 {
+    virCheckFlags(0, -1);
     return qemuDomainBlockRebase(dom, path, NULL, bandwidth, flags);
 }
Hmm, personally, I would make qemuDomainBlockPull call qemuDomainBlockJobImpl directly instead of going through qemuDomainBlockRebase while adding the flags check...
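That suggested shape can be sketched roughly as follows, using hypothetical stand-in names (the real qemuDomainBlockJobImpl has a larger signature taking the domain, info struct, and job mode): validate the flags up front, then dispatch straight to the shared implementation rather than bouncing through the BlockRebase entry point.

```c
#include <assert.h>

#define BLOCK_JOB_PULL 1  /* stand-in for the internal job-mode enum */

/* Hypothetical stand-in for the shared job implementation. */
static int block_job_impl(const char *path, unsigned long bandwidth, int mode)
{
    (void)path; (void)bandwidth; (void)mode;
    return 0; /* pretend the job started successfully */
}

/* BlockPull entry point: reject unknown flags first (the role played by
 * virCheckFlags(0, -1) in libvirt), then call the implementation directly. */
static int block_pull(const char *path, unsigned long bandwidth,
                      unsigned int flags)
{
    if (flags != 0)
        return -1;
    return block_job_impl(path, bandwidth, BLOCK_JOB_PULL);
}
```

The design point is that each public entry point owns its own flags validation, so BlockPull need not inherit BlockRebase's flag semantics.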
Easy enough to change.

-- 
Eric Blake   eblake@redhat.com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org