Am 11.02.2014 13:05, schrieb Daniel P. Berrange:
On Sat, Feb 08, 2014 at 06:37:43PM +0100, Richard Weinberger wrote:
> Add a new helper function to change the permissions
> of a control group.
>
> Signed-off-by: Richard Weinberger <richard(a)nod.at>
> ---
> src/lxc/lxc_controller.c | 7 +++++++
> src/util/vircgroup.c | 43 +++++++++++++++++++++++++++++++++++++++++++
> src/util/vircgroup.h | 2 ++
> 3 files changed, 52 insertions(+)
>
> diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
> index f7b614b..6e348b3 100644
> --- a/src/lxc/lxc_controller.c
> +++ b/src/lxc/lxc_controller.c
> @@ -2223,6 +2223,13 @@ virLXCControllerRun(virLXCControllerPtr ctrl)
> goto cleanup;
> }
>
> + /* setup control group permissions for user namespace */
> + if (ctrl->def->idmap.uidmap) {
> + if (virCgroupSetOwner(ctrl->cgroup,
ctrl->def->idmap.uidmap[0].target,
> + ctrl->def->idmap.gidmap[0].target))
> + goto cleanup;
> + }
> +
I wonder is it possible to just do the ownership change in the
virLXCCgroupCreate method ? When the container then does the
bind mount, it should preserve the ownership correctly. This
would avoid any need for the extra handshake synchronization.
Bah, I've overlooked that. Of course it works.
Much simplified patch is on its way.
Thanks for pointing this out!
> diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
> index a6d60c5..b66ffed 100644
> --- a/src/util/vircgroup.c
> +++ b/src/util/vircgroup.c
> @@ -3252,6 +3252,49 @@ cleanup:
> return ret;
> }
>
> +int virCgroupSetOwner(virCgroupPtr cgroup, uid_t uid, gid_t gid) {
> + size_t i;
> +
> + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) {
I think we only want to set ownership for VIR_CGROUP_CONTROLLER_SYSTEMD.
We don't want to give the container permission to change resource
controllers, this that would allow it to potentially escape its resource
limits. You might think the hierarchy would lock things down, but I
believe the kernel developers still say it is unsafe to delegate cgroup
resource controllers to unprivileged users.
That's right. But then we should generally only delegate the systemd cgroup
to a container.
...patch sent.
Thanks,
//richard