On Wed, Jan 16, 2013 at 10:10:50AM +0000, Daniel P. Berrange wrote:
On Wed, Jan 16, 2013 at 10:53:08AM +0800, Hu Tao wrote:
> diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
> index 7cb99b1..92e3292 100644
> --- a/daemon/libvirtd.c
> +++ b/daemon/libvirtd.c
> @@ -1442,6 +1442,9 @@ int main(int argc, char **argv) {
> VIR_FORCE_CLOSE(statuswrite);
> }
>
> + if (virCgroupInit() < 0)
> + goto cleanup;
> +
I don't like this addition. Our aim has been to *remove* the need
to global initializers like this, not add new ones. AFAICT the
reason you needed to add this is because you removed code from
the individual drivers which would initialize the cgroups they
required.
I think I can eliminate this init/uninit thing.
> diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
> index 2ac338c..23ff2c9 100644
> --- a/src/conf/domain_conf.h
> +++ b/src/conf/domain_conf.h
> @@ -48,6 +48,7 @@
> # include "device_conf.h"
> # include "virbitmap.h"
> # include "virstoragefile.h"
> +# include "vircgroup.h"
>
> /* forward declarations of all device types, required by
> * virDomainDeviceDef
> @@ -1843,6 +1844,10 @@ struct _virDomainDef {
>
> /* Application-specific custom metadata */
> xmlNodePtr metadata;
> +
> + virCgroupItemPtr cgroup[VIR_CGROUP_CONTROLLER_LAST];
> + virCgroupItemPtr (*vcpuCgroups)[VIR_CGROUP_CONTROLLER_LAST];
> + virCgroupItemPtr emulatorCgroup[VIR_CGROUP_CONTROLLER_LAST];
> };
Two things here. First, this is driver state and so should *not* be
in the virDomainDef struct - it should be in the driver specific
private data structs.
Agreed.
Second, the new cgroups API you've got here is really bad. It was
an explicit design decision in the original API to *not* expose
the concept of individual cgroup controllers to the driver APIs.
The only time the drivers should can about individual controllers
is when they first create the cgroup and decide which controllers
they want to use. From then onwards the virCgroupPtr APIs should
just 'do the right thing'.
The explanation is helpful. Fortunately I think the new virCgroup
can leave the original API unchanged(let me try in v2).
What are important in the new virCgroup are:
- the lazy creation of cgroup directories, despite of what level
they are.
- cgroup directories are removed if no one is using the corresponding
virCgroup.
Do you think it's worth doing it? If yes, can you review patch 5 about
the new implementation? (forget about the API change)
> }
>
> + cgroup = vm->def->cgroup[VIR_CGROUP_CONTROLLER_DEVICES];
> +
> if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_DEVICES)) {
> qemuCgroupData data = { vm, cgroup };
> rc = virCgroupDenyAllDevices(cgroup);
> @@ -300,6 +301,8 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
> }
> }
>
> + cgroup = vm->def->cgroup[VIR_CGROUP_CONTROLLER_BLKIO];
> +
> if (vm->def->blkio.weight != 0) {
> if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) {
> rc = virCgroupSetBlkioWeight(cgroup, vm->def->blkio.weight);
> @@ -339,6 +342,8 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
> }
> }
>
> + cgroup = vm->def->cgroup[VIR_CGROUP_CONTROLLER_MEMORY];
> +
> if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_MEMORY)) {
> unsigned long long hard_limit = vm->def->mem.hard_limit;
>
> @@ -392,6 +397,8 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
> VIR_WARN("Could not autoset a RSS limit for domain %s",
vm->def->name);
> }
>
> + cgroup = vm->def->cgroup[VIR_CGROUP_CONTROLLER_CPU];
> +
> if (vm->def->cputune.shares != 0) {
> if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) {
> rc = virCgroupSetCpuShares(cgroup, vm->def->cputune.shares);
> @@ -407,6 +414,8 @@ int qemuSetupCgroup(virQEMUDriverPtr driver,
> }
> }
>
> + cgroup = vm->def->cgroup[VIR_CGROUP_CONTROLLER_CPUSET];
> +
> if ((vm->def->numatune.memory.nodemask ||
> (vm->def->numatune.memory.placement_mode ==
> VIR_DOMAIN_NUMATUNE_MEM_PLACEMENT_MODE_AUTO)) &&
These 4 additions are a perfect example of what this new API design is
awful. The drivers now have to remember which controller they need to
use for which tunable.
If the original API is kept, these will be the form of:
cgroup = vm->cgroup;
...
doSomethingWithCgroup(cgroup)
...
Is this acceptable?