On 06/19/2014 07:46 AM, Peter Krempa wrote:
Add functions that will allow to set all the required cgroup stuff on
s/to set/setting/
individual images taking a virStorageSourcePtr. Also convert
functions
s/taking/by taking/
designed to setup whole backing chain to take advantage of the
chagne.
s/chagne/change/
---
src/qemu/qemu_cgroup.c | 92 +++++++++++++++++++++++++++++++-------------------
src/qemu/qemu_cgroup.h | 5 +++
2 files changed, 62 insertions(+), 35 deletions(-)
+int
+qemuSetupImageCgroup(virDomainObjPtr vm,
+ virStorageSourcePtr src,
+ bool readonly)
{
+ VIR_DEBUG("Process path %s for disk", src->path);
Might be worth mentioning the value of readonly in this debug statement
(imagine reading a log to see which images got 'rw' vs. 'ro'
permissions).
@@ -81,38 +91,51 @@ int
qemuSetupDiskCgroup(virDomainObjPtr vm,
virDomainDiskDefPtr disk)
{
- qemuDomainObjPrivatePtr priv = vm->privateData;
+ virStorageSourcePtr next;
- if (!virCgroupHasController(priv->cgroup,
- VIR_CGROUP_CONTROLLER_DEVICES))
- return 0;
+ for (next = disk->src; next; next = next->backingStore) {
+ if (qemuSetupImageCgroup(vm, next, disk->readonly) < 0)
+ return -1;
- return virDomainDiskDefForeachPath(disk, true, qemuSetupDiskPathAllow, vm);
Not quite right (although not necessarily your fault - the existing code
has the same bug, because it ignores the depth argument of
qemuSetupDiskPathAllow). We want the active image to be 'rw' or 'ro'
based on disk->readonly; but all of its backing files should be 'ro'
without any decision needed. That is, permissions on the active layer
should not determine what we do with lower layers, because lower layers
should never change (well, there is a temporary exception during
block-commit operations, but that's why we are writing this series - to
make it easier to describe those temporary changes in the block-commit
code). So the loop should be more like:
for (next = disk->src; next; next = next->backingStore) {
bool ro = next == disk->src ? disk->readonly : true;
if (qemuSetupImageCgroup(vm, next, ro) < 0)
return -1;
}
or:
bool readonly = disk->readonly;
for (next = disk->src; next; next = next->backingStore) {
if (qemuSetupImageCgroup(vm, next, readonly) < 0)
return -1;
readonly = true;
}
+int
+qemuTeardownImageCgroup(virDomainObjPtr vm,
+ virStorageSourcePtr src)
{
Not your fault that the existing code has two very similar call paths
(one for grant 'rw'/'r', one for deny 'rwm'; and we never grant
'm') -
but I think it might be a little cleaner if we have just ONE call path
with a tri-state setting (change to one of read/write (grant 'rw' deny
'm'), readonly (grant 'r' deny 'wm'), or unaccessible (deny
'rwm'). That
way, the intro code for returning early when cgroups is not mounted or
when dealing with a network disk is not copied between two paths, and
the code is a bit closer to our intended usage (particularly during
block-commit - we are temporarily changing a backing file from readonly
to read/write for the duration of the commit, then back to readonly at
the end, which is different than making it unaccessible).
I'd have to experiment with cgroup ACLs to see if granting a disk 'r'
permission wipes out an earlier grant of 'rw', or if going from 'rw' to
'r' instead requires a deny of 'w'. I don't know if that makes any
difference to the code (it may mean that we have to track the current
mode, in order to make an efficient decision of how to move to readonly
mode, and especially without going through a temporarily inaccessible
state if the guest is live).
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org