On 05/07/2012 07:00 PM, Eric Blake wrote:
It turns out that when cgroups are enabled, the use of a block device
for a snapshot target was failing with EPERM due to libvirt failing
to add the block device to the cgroup whitelist. See also
https://bugzilla.redhat.com/show_bug.cgi?id=810200
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Account for cgroup.
(qemuDomainSnapshotCreateDiskActive): Update caller.
---
I still need to properly test this, but based on how disk hotplug
works, I think this should solve the problem at hand.
My recent patch series for virDomainBlockRebase needs the same
treatment; I'll do that as a separate patch.
This still does not address the fact that block pull should be
revoking rights to a backing file that is no longer in use; that
will require more infrastructure for libvirt properly tracking an
entire backing chain.
src/qemu/qemu_driver.c | 26 ++++++++++++++++++++++++--
1 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2bec617..92535b9 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -9895,6 +9895,7 @@ cleanup:
static int
qemuDomainSnapshotCreateSingleDiskActive(struct qemud_driver *driver,
virDomainObjPtr vm,
+ virCgroupPtr cgroup,
virDomainSnapshotDiskDefPtr snap,
virDomainDiskDefPtr disk,
virDomainDiskDefPtr persistDisk,
@@ -9948,8 +9949,15 @@ qemuDomainSnapshotCreateSingleDiskActive(struct qemud_driver *driver,
if (virDomainLockDiskAttach(driver->lockManager, vm, disk) < 0)
goto cleanup;
+ if (cgroup && qemuSetupDiskCgroup(driver, vm, cgroup, disk) < 0) {
As far as I can see, you can't get to here if cgroup is NULL.
Technically there's nothing wrong with checking it, but isn't it
unnecessary?
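The pattern I mean can be sketched standalone (hypothetical names, not the
real qemuSetupDiskCgroup signature): when the only call site already
guarantees a non-NULL pointer, the callee's NULL guard is dead code, and an
assert documents the contract instead of silently tolerating NULL.

```c
/* Standalone sketch (hypothetical names, not libvirt code). */
#include <assert.h>
#include <stddef.h>

struct cgroup { int id; };

/* Callee: relies on the caller's guarantee rather than re-checking. */
static int
setup_disk_cgroup(struct cgroup *cg)
{
    assert(cg != NULL);   /* contract: never called with NULL */
    return cg->id;
}

/* Single caller: the one place the pointer is actually validated. */
static int
create_snapshot(struct cgroup *cg)
{
    if (cg == NULL)
        return -1;
    return setup_disk_cgroup(cg);
}
```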
+ if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0)
+ VIR_WARN("Unable to release lock on %s", source);
+ goto cleanup;
+ }
if (virSecurityManagerSetImageLabel(driver->securityManager, vm->def,
disk) < 0) {
+ if (cgroup && qemuTeardownDiskCgroup(driver, vm, cgroup, disk) < 0)
Same here.
+ VIR_WARN("Failed to teardown cgroup for disk path %s", source);
if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0)
VIR_WARN("Unable to release lock on %s", source);
goto cleanup;
@@ -10009,6 +10017,7 @@ cleanup:
static void
qemuDomainSnapshotUndoSingleDiskActive(struct qemud_driver *driver,
virDomainObjPtr vm,
+ virCgroupPtr cgroup,
virDomainDiskDefPtr origdisk,
virDomainDiskDefPtr disk,
virDomainDiskDefPtr persistDisk,
@@ -10033,6 +10042,8 @@ qemuDomainSnapshotUndoSingleDiskActive(struct qemud_driver *driver,
if (virSecurityManagerRestoreImageLabel(driver->securityManager,
vm->def, disk) < 0)
VIR_WARN("Unable to restore security label on %s", disk->src);
+ if (cgroup && qemuTeardownDiskCgroup(driver, vm, cgroup, disk) < 0)
And also here.
+ VIR_WARN("Failed to teardown cgroup for disk path %s", disk->src);
if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0)
VIR_WARN("Unable to release lock on %s", disk->src);
if (need_unlink && stat(disk->src, &st) == 0 &&
@@ -10084,6 +10095,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */
bool atomic = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC) != 0;
bool reuse = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0;
+ virCgroupPtr cgroup = NULL;
if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0)
return -1;
@@ -10094,6 +10106,14 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
goto endjob;
}
+ if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) &&
+ virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) {
+ qemuReportError(VIR_ERR_INTERNAL_ERROR,
+ _("Unable to find cgroup for %s"),
+ vm->def->name);
+ goto endjob;
+ }
+
By the time you get here, you're guaranteed the cgroup is non-NULL, right?
Also, looking beyond the context provided by git - virCgroupFree() is
called under the cleanup label, which is higher up in the function than
endjob, but there are 2 occurrences of goto endjob past the point where
you obtain the cgroup - if you encountered an error and took either of
those gotos, you would end up leaking the cgroup.
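The leak I'm describing can be shown with a minimal standalone sketch
(hypothetical names, not the actual libvirt function): a resource acquired
mid-function must be released at a label that every error path taken after
the acquisition actually reaches.

```c
/* Standalone sketch (hypothetical names): an error taken after the
 * resource is acquired must jump to a label that still frees it.  Here
 * the free lives at the final label, so both the early and the late
 * error paths are covered. */
#include <assert.h>
#include <stdlib.h>

static int freed;   /* observable side effect, for testing only */

static int
do_job(int fail_late)
{
    int ret = -1;
    int *res = NULL;

    freed = 0;
    if (!(res = malloc(sizeof(*res))))
        goto endjob;

    if (fail_late)      /* error after acquisition... */
        goto endjob;    /* ...still passes through the free below */

    ret = 0;

endjob:
    free(res);          /* every exit path releases the resource */
    freed = 1;
    return ret;
}
```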
/* If quiesce was requested, then issue a freeze command, and a
* counterpart thaw command, no matter what. The command will
* fail if the guest is paused or the guest agent is not
@@ -10156,7 +10176,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
}
}
- ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm,
+ ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm, cgroup,
So back to the other topic - this is the only place this function is
called, and you're guaranteed that cgroup will be non-NULL, so I don't
think qemuDomainSnapshotCreateSingleDiskActive() actually needs to check
for NULL cgroup.
&snap->def->disks[i],
vm->def->disks[i],
persistDisk, actions,
@@ -10184,7 +10204,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
persistDisk = vm->newDef->disks[indx];
}
- qemuDomainSnapshotUndoSingleDiskActive(driver, vm,
+ qemuDomainSnapshotUndoSingleDiskActive(driver, vm, cgroup,
snap->def->dom->disks[i],
vm->def->disks[i],
persistDisk,
Likewise here.
@@ -10214,6 +10234,8 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
}
cleanup:
+ if (cgroup)
+ virCgroupFree(&cgroup);
I think this either needs to be moved down to endjob, or the two goto
endjob's I mentioned above need to goto cleanup instead (I think I prefer
the former).
if (resume && virDomainObjIsActive(vm)) {
if (qemuProcessStartCPUs(driver, vm, conn,
VIR_DOMAIN_RUNNING_UNPAUSED,