[libvirt] [PATCH] snapshot: allow block devices past cgroup

It turns out that when cgroups are enabled, the use of a block device for a snapshot target was failing with EPERM due to libvirt failing to add the block device to the cgroup whitelist. See also https://bugzilla.redhat.com/show_bug.cgi?id=810200

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Account for cgroup.
(qemuDomainSnapshotCreateDiskActive): Update caller.
---

I still need to properly test this, but based on how disk hotplug works, I think this should solve the problem at hand.

My recent patch series for virDomainBlockRebase needs the same treatment; I'll do that as a separate patch.

This still does not address the fact that block pull should be revoking rights to a backing file that is no longer in use; that will require more infrastructure for libvirt properly tracking an entire backing chain.

 src/qemu/qemu_driver.c | 26 ++++++++++++++++++++++++--
 1 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2bec617..92535b9 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -9895,6 +9895,7 @@ cleanup:
 static int
 qemuDomainSnapshotCreateSingleDiskActive(struct qemud_driver *driver,
                                          virDomainObjPtr vm,
+                                         virCgroupPtr cgroup,
                                          virDomainSnapshotDiskDefPtr snap,
                                          virDomainDiskDefPtr disk,
                                          virDomainDiskDefPtr persistDisk,
@@ -9948,8 +9949,15 @@ qemuDomainSnapshotCreateSingleDiskActive(struct qemud_driver *driver,

     if (virDomainLockDiskAttach(driver->lockManager, vm, disk) < 0)
         goto cleanup;
+    if (cgroup && qemuSetupDiskCgroup(driver, vm, cgroup, disk) < 0) {
+        if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0)
+            VIR_WARN("Unable to release lock on %s", source);
+        goto cleanup;
+    }
     if (virSecurityManagerSetImageLabel(driver->securityManager, vm->def,
                                         disk) < 0) {
+        if (cgroup && qemuTeardownDiskCgroup(driver, vm, cgroup, disk) < 0)
+            VIR_WARN("Failed to teardown cgroup for disk path %s", source);
         if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0)
             VIR_WARN("Unable to release lock on %s", source);
         goto cleanup;
@@ -10009,6 +10017,7 @@ cleanup:
 static void
 qemuDomainSnapshotUndoSingleDiskActive(struct qemud_driver *driver,
                                        virDomainObjPtr vm,
+                                       virCgroupPtr cgroup,
                                        virDomainDiskDefPtr origdisk,
                                        virDomainDiskDefPtr disk,
                                        virDomainDiskDefPtr persistDisk,
@@ -10033,6 +10042,8 @@ qemuDomainSnapshotUndoSingleDiskActive(struct qemud_driver *driver,
     if (virSecurityManagerRestoreImageLabel(driver->securityManager,
                                             vm->def, disk) < 0)
         VIR_WARN("Unable to restore security label on %s", disk->src);
+    if (cgroup && qemuTeardownDiskCgroup(driver, vm, cgroup, disk) < 0)
+        VIR_WARN("Failed to teardown cgroup for disk path %s", disk->src);
     if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0)
         VIR_WARN("Unable to release lock on %s", disk->src);
     if (need_unlink && stat(disk->src, &st) == 0 &&
@@ -10084,6 +10095,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
     int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */
     bool atomic = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC) != 0;
     bool reuse = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0;
+    virCgroupPtr cgroup = NULL;

     if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0)
         return -1;
@@ -10094,6 +10106,14 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
         goto endjob;
     }

+    if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) &&
+        virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) {
+        qemuReportError(VIR_ERR_INTERNAL_ERROR,
+                        _("Unable to find cgroup for %s"),
+                        vm->def->name);
+        goto endjob;
+    }
+
     /* If quiesce was requested, then issue a freeze command, and a
      * counterpart thaw command, no matter what.  The command will
      * fail if the guest is paused or the guest agent is not
@@ -10156,7 +10176,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
             }
         }

-        ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm,
+        ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm, cgroup,
                                                        &snap->def->disks[i],
                                                        vm->def->disks[i],
                                                        persistDisk, actions,
@@ -10184,7 +10204,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
                 persistDisk = vm->newDef->disks[indx];
         }

-        qemuDomainSnapshotUndoSingleDiskActive(driver, vm,
+        qemuDomainSnapshotUndoSingleDiskActive(driver, vm, cgroup,
                                                snap->def->dom->disks[i],
                                                vm->def->disks[i],
                                                persistDisk,
@@ -10214,6 +10234,8 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
     }

 cleanup:
+    if (cgroup)
+        virCgroupFree(&cgroup);
     if (resume && virDomainObjIsActive(vm)) {
         if (qemuProcessStartCPUs(driver, vm, conn,
                                  VIR_DOMAIN_RUNNING_UNPAUSED,

--
1.7.7.6
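For context on the failure mode: under the cgroup v1 "devices" controller, the qemu process can only open device nodes that are on its group's whitelist, so pointing a snapshot at a block device that was never whitelisted fails with EPERM. The standalone sketch below shows what whitelisting amounts to at the kernel interface; the group path and device node are invented for illustration, and libvirt does the equivalent through its own cgroup helpers (the qemuSetupDiskCgroup call in the patch) rather than writing the file directly like this.

/* Minimal standalone sketch (not libvirt code): whitelist one block
 * device for a cgroup v1 "devices" controller group.  The group path
 * and device node below are hypothetical. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>      /* major(), minor() */

static int
allow_block_device(const char *devices_allow, const char *devpath)
{
    struct stat st;
    FILE *fp;

    if (stat(devpath, &st) < 0 || !S_ISBLK(st.st_mode))
        return -1;              /* not a block device */

    if (!(fp = fopen(devices_allow, "w")))
        return -1;

    /* The v1 devices controller accepts entries of the form
     * "<type> <major>:<minor> <access>", e.g. "b 8:17 rwm". */
    fprintf(fp, "b %u:%u rwm\n", major(st.st_rdev), minor(st.st_rdev));
    return fclose(fp) == 0 ? 0 : -1;
}

int
main(void)
{
    /* Hypothetical paths, for illustration only. */
    return allow_block_device("/sys/fs/cgroup/devices/libvirt/qemu/guest/devices.allow",
                              "/dev/sdb1") == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}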

On 05/07/2012 07:00 PM, Eric Blake wrote:
It turns out that when cgroups are enabled, the use of a block device for a snapshot target was failing with EPERM due to libvirt failing to add the block device to the cgroup whitelist. See also https://bugzilla.redhat.com/show_bug.cgi?id=810200
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateSingleDiskActive) (qemuDomainSnapshotUndoSingleDiskActive): Account for cgroup. (qemuDomainSnapshotCreateDiskActive): Update caller. ---
I still need to properly test this, but based on how disk hotplug works, I think this should solve the problem at hand.
My recent patch series for virDomainBlockRebase needs the same treatment; I'll do that as a separate patch.
This still does not address the fact that block pull should be revoking rights to a backing file that is no longer in use; that will require more infrastructure for libvirt properly tracking an entire backing chain.
 src/qemu/qemu_driver.c | 26 ++++++++++++++++++++++++--
 1 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2bec617..92535b9 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -9895,6 +9895,7 @@ cleanup:
 static int
 qemuDomainSnapshotCreateSingleDiskActive(struct qemud_driver *driver,
                                          virDomainObjPtr vm,
+                                         virCgroupPtr cgroup,
                                          virDomainSnapshotDiskDefPtr snap,
                                          virDomainDiskDefPtr disk,
                                          virDomainDiskDefPtr persistDisk,
@@ -9948,8 +9949,15 @@ qemuDomainSnapshotCreateSingleDiskActive(struct qemud_driver *driver,

     if (virDomainLockDiskAttach(driver->lockManager, vm, disk) < 0)
         goto cleanup;
+    if (cgroup && qemuSetupDiskCgroup(driver, vm, cgroup, disk) < 0) {
As far as I can see, you can't get to here if cgroup is NULL. Technically there's nothing wrong with checking it, but isn't it unnecessary?
+        if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0)
+            VIR_WARN("Unable to release lock on %s", source);
+        goto cleanup;
+    }
     if (virSecurityManagerSetImageLabel(driver->securityManager, vm->def,
                                         disk) < 0) {
+        if (cgroup && qemuTeardownDiskCgroup(driver, vm, cgroup, disk) < 0)
Same here.
+ VIR_WARN("Failed to teardown cgroup for disk path %s", source); if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0) VIR_WARN("Unable to release lock on %s", source); goto cleanup; @@ -10009,6 +10017,7 @@ cleanup: static void qemuDomainSnapshotUndoSingleDiskActive(struct qemud_driver *driver, virDomainObjPtr vm, + virCgroupPtr cgroup, virDomainDiskDefPtr origdisk, virDomainDiskDefPtr disk, virDomainDiskDefPtr persistDisk, @@ -10033,6 +10042,8 @@ qemuDomainSnapshotUndoSingleDiskActive(struct qemud_driver *driver, if (virSecurityManagerRestoreImageLabel(driver->securityManager, vm->def, disk) < 0) VIR_WARN("Unable to restore security label on %s", disk->src); + if (cgroup && qemuTeardownDiskCgroup(driver, vm, cgroup, disk) < 0)
And also here.
+ VIR_WARN("Failed to teardown cgroup for disk path %s", disk->src); if (virDomainLockDiskDetach(driver->lockManager, vm, disk) < 0) VIR_WARN("Unable to release lock on %s", disk->src); if (need_unlink && stat(disk->src, &st) == 0 && @@ -10084,6 +10095,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */ bool atomic = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC) != 0; bool reuse = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0; + virCgroupPtr cgroup = NULL;
if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0) return -1; @@ -10094,6 +10106,14 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, goto endjob; }
+ if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) && + virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) { + qemuReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to find cgroup for %s"), + vm->def->name); + goto endjob; + } +
By the time you get here, you're guaranteed the cgroup is non-NULL, right? Also, looking beyond the context provided by git - virCgroupFree() is called under the cleanup label, which is higher up in the function than endjob, but there are two occurrences of goto endjob past the point where you obtain the cgroup; if you hit an error and took either of those gotos, you would leak the cgroup.
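As a standalone model of that control flow (invented names, not the libvirt function - just the shape of the problem): the free sits under a label that the later gotos can jump past.

#include <stdlib.h>

static int
do_job(int fail_late)
{
    int ret = -1;
    char *resource = malloc(16);    /* stands in for virCgroupForDomain() */

    if (!resource)
        goto endjob;
    if (fail_late)
        goto endjob;                /* jumps past cleanup: resource leaks */

    ret = 0;

cleanup:                            /* only reached by falling through here;
                                     * the real function also has goto cleanup paths */
    free(resource);
endjob:
    return ret;
}

int
main(void)
{
    return do_job(1) == -1 ? 0 : 1; /* the failing path leaks the allocation */
}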
     /* If quiesce was requested, then issue a freeze command, and a
      * counterpart thaw command, no matter what.  The command will
      * fail if the guest is paused or the guest agent is not
@@ -10156,7 +10176,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
             }
         }

-        ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm,
+        ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm, cgroup,
So back to the other topic - this is the only place this function is called, and you're guaranteed that cgroup will be non-NULL, so I don't think qemuDomainSnapshotCreateSingleDiskActive() actually needs to check for NULL cgroup.
                                                        &snap->def->disks[i],
                                                        vm->def->disks[i],
                                                        persistDisk, actions,
@@ -10184,7 +10204,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
                 persistDisk = vm->newDef->disks[indx];
         }

-        qemuDomainSnapshotUndoSingleDiskActive(driver, vm,
+        qemuDomainSnapshotUndoSingleDiskActive(driver, vm, cgroup,
                                                snap->def->dom->disks[i],
                                                vm->def->disks[i],
                                                persistDisk,
Likewise here.
@@ -10214,6 +10234,8 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
     }

 cleanup:
+    if (cgroup)
+        virCgroupFree(&cgroup);
I think this either needs to be moved down to endjob, or the two goto endjob's I mentioned above need to be changed to goto cleanup (I think I prefer the former).
     if (resume && virDomainObjIsActive(vm)) {
         if (qemuProcessStartCPUs(driver, vm, conn,
                                  VIR_DOMAIN_RUNNING_UNPAUSED,

On 05/08/2012 12:42 PM, Laine Stump wrote:
On 05/07/2012 07:00 PM, Eric Blake wrote:
It turns out that when cgroups are enabled, the use of a block device for a snapshot target was failing with EPERM due to libvirt failing to add the block device to the cgroup whitelist. See also https://bugzilla.redhat.com/show_bug.cgi?id=810200
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Account for cgroup.
(qemuDomainSnapshotCreateDiskActive): Update caller.
---
@@ -9948,8 +9949,15 @@ qemuDomainSnapshotCreateSingleDiskActive(struct qemud_driver *driver,

     if (virDomainLockDiskAttach(driver->lockManager, vm, disk) < 0)
         goto cleanup;
+    if (cgroup && qemuSetupDiskCgroup(driver, vm, cgroup, disk) < 0) {
As far as I can see, you can't get to here if cgroup is NULL.
cgroup can be NULL; see below...
Technically there's nothing wrong with checking it, but isn't it unnecessary?
No, it's necessary (unless we fix qemuSetupDiskCgroup to be a no-op when cgroup is NULL).
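A sketch of that alternative, using invented stand-ins rather than the real libvirt signatures (the actual qemuSetupDiskCgroup takes driver, vm, cgroup and disk arguments, as the diff shows):

#include <stddef.h>

struct cgroup;                          /* stands in for virCgroupPtr */

/* Stand-in for the real whitelist work. */
static int
whitelist_disk(struct cgroup *cgroup, const char *path)
{
    (void)cgroup;
    (void)path;
    /* ... write the devices.allow entry here ... */
    return 0;
}

/* If the helper itself treats a NULL cgroup as "nothing to do", callers
 * can drop their own "cgroup &&" guards. */
static int
setup_disk_cgroup(struct cgroup *cgroup, const char *path)
{
    if (!cgroup)
        return 0;                       /* devices controller not in use */
    return whitelist_disk(cgroup, path);
}

int
main(void)
{
    return setup_disk_cgroup(NULL, "/dev/sdb1");    /* succeeds as a no-op */
}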
@@ -10084,6 +10095,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
     int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */
     bool atomic = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC) != 0;
     bool reuse = (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT) != 0;
+    virCgroupPtr cgroup = NULL;

     if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_MODIFY) < 0)
         return -1;
@@ -10094,6 +10106,14 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
         goto endjob;
     }

+    if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES) &&
+        virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0)) {
+        qemuReportError(VIR_ERR_INTERNAL_ERROR,
+                        _("Unable to find cgroup for %s"),
+                        vm->def->name);
+        goto endjob;
+    }
+
By the time you get here, you're guaranteed the cgroup is non-NULL, right?
No. If qemuCgroupControllerActive() returns false (possible if your qemu.conf left "devices" out of cgroup_controllers, or if you don't have cgroups mounted in the kernel at all), then we never call virCgroupForDomain to populate cgroup with a non-NULL value.
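For reference, the knob in question is the cgroup_controllers list in /etc/libvirt/qemu.conf; here is an illustrative excerpt (not the shipped default) with "devices" omitted, which is one way to reach this code with cgroup still NULL:

# /etc/libvirt/qemu.conf (illustrative excerpt)
# With "devices" left out of this list, qemuCgroupControllerActive()
# reports the devices controller as inactive, virCgroupForDomain() is
# never called, and the snapshot code runs with cgroup == NULL.
cgroup_controllers = [ "cpu", "memory", "blkio" ]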
Also, looking beyond the context provided by git - virCgroupFree() is called under the cleanup label, which is higher up in the function than endjob, but there are two occurrences of goto endjob past the point where you obtain the cgroup; if you hit an error and took either of those gotos, you would leak the cgroup.
Eek; I'll fix that and provide a v2.
@@ -10156,7 +10176,7 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
             }
         }

-        ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm,
+        ret = qemuDomainSnapshotCreateSingleDiskActive(driver, vm, cgroup,
So back to the other topic - this is the only place this function is called, and you're guaranteed that cgroup will be non-NULL, so I don't think qemuDomainSnapshotCreateSingleDiskActive() actually needs to check for NULL cgroup.
Calling with NULL cgroup is permissible.
@@ -10214,6 +10234,8 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn,
     }

 cleanup:
+    if (cgroup)
+        virCgroupFree(&cgroup);
I think this either needs to be moved down to endjob, or the two goto endjob's I mentioned above need to be changed to goto cleanup (I think I prefer the former).
cleanup falls through to endjob - we have to go through cleanup if we paused the cpus or modified state that needs saving. But since we do have a 'goto endjob' that can occur after cgroup is set, we do indeed need to move the virCgroupFree() into the endjob label.

--
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
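In terms of the small standalone model shown earlier in the thread, the v2 shape Eric describes would look roughly like this (again with invented names; it only illustrates the label ordering, not the actual patch):

#include <stdlib.h>

static int
do_job(int fail_early, int fail_late)
{
    int ret = -1;
    char *resource = malloc(16);    /* stands in for the virCgroupPtr */

    if (!resource || fail_early)
        goto endjob;                /* early failure: skips cleanup */

    /* ... work that pauses CPUs or modifies state ... */
    if (fail_late)
        goto cleanup;               /* later failure: needs the cleanup work */

    ret = 0;

cleanup:
    /* resume CPUs, save status, ... then fall through */
endjob:
    free(resource);                 /* now released on every path */
    return ret;
}

int
main(void)
{
    return (do_job(1, 0) == -1 && do_job(0, 1) == -1 && do_job(0, 0) == 0)
               ? EXIT_SUCCESS : EXIT_FAILURE;
}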