On 06.06.2013 09:56, Daniel P. Berrange wrote:
On Thu, Jun 06, 2013 at 08:57:21AM +0200, Richard Weinberger wrote:
> Hi!
>
> I'm facing the issue that "virsh lxc-enter-namespace ..." does not work
> for me.
> setns() always fails with EINVAL.
>
> Reading the code confused me a bit, maybe you can help me. :D
>
> virsh itself calls:
> cmdLxcEnterNamespace()
> virDomainLxcOpenNamespace()
> conn->driver->domainLxcOpenNamespace()
>
> Here comes the first thing that is not clear to me.
> conn->driver seems to be the remote driver and therefore
> ->domainLxcOpenNamespace is remoteDomainLxcOpenNamespace()
> Why is lxc:/// a remote connection?
>
> remoteDomainLxcOpenNamespace() does a rpc call to libvirtd.
>
> On the remote side libvirtd does:
>
> lxcDispatchDomainOpenNamespace(), which opens the namespace fds,
> and sends them back as result.
> How can this work? Does it do some magic file descriptor passing
> over AF_UNIX somewhere?
Yes, we use SCM_RIGHTS to pass FDs.
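
For readers unfamiliar with the mechanism, here is a minimal sketch of SCM_RIGHTS passing over an AF_UNIX socket. It is written in Python for brevity and is not libvirt's RPC code; the helper names send_fds(), recv_fds() and demo() are made up for illustration. The point it shows is that the receiver does not get "pure numbers": the kernel installs a new descriptor in the receiving process that refers to the same open file.

```python
import array
import os
import socket


def send_fds(sock, fds):
    """Send open file descriptors over an AF_UNIX socket via SCM_RIGHTS."""
    payload = array.array("i", fds)
    # One byte of ordinary data is sent so the ancillary message has
    # something to travel with on a stream socket.
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)])


def recv_fds(sock, max_fds):
    """Receive up to max_fds descriptors; each arrives as a new local fd."""
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(max_fds * fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing partial integer before decoding.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    return list(fds)


def demo():
    """Pass the read end of a pipe through a socketpair and read from it."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()
    received = None
    try:
        send_fds(left, [r])
        (received,) = recv_fds(right, 1)
        os.write(w, b"hello")
        # The received fd has a different number but refers to the same pipe.
        return received != r, os.read(received, 5)
    finally:
        for fd in (r, w, received):
            if fd is not None:
                os.close(fd)
        left.close()
        right.close()
```

In the libvirt case the sender and receiver are separate processes (libvirtd and virsh); the socketpair in demo() only stands in for that connection.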
> virsh then receives the fds (pure numbers) and setns() fails badly.
>
> Wouldn't it make much more sense to do the open(/proc/XXX/ns/{mnt, user, ...})
> and setns() calls directly on the local side? IOW directly in virsh?
> driver->domainLxcOpenNamespace() should only report the process id of the
> container's init process.
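
The local-side approach suggested above could look roughly like the following sketch. It is Python with ctypes rather than virsh's C, and the names ns_path(), enter_namespaces() and CLONE_FLAGS are hypothetical, not libvirt API. It assumes Linux, a caller with CAP_SYS_ADMIN, and that the init pid of the container is already known.

```python
import ctypes
import os

# nstype values as defined in <sched.h>. Passing 0 instead would let the
# kernel accept a namespace fd of any type.
CLONE_FLAGS = {
    "mnt": 0x00020000,   # CLONE_NEWNS
    "uts": 0x04000000,   # CLONE_NEWUTS
    "ipc": 0x08000000,   # CLONE_NEWIPC
    "user": 0x10000000,  # CLONE_NEWUSER
    "pid": 0x20000000,   # CLONE_NEWPID
    "net": 0x40000000,   # CLONE_NEWNET
}


def ns_path(pid, name):
    """Return the /proc path of one namespace file of a process."""
    return f"/proc/{pid}/ns/{name}"


def enter_namespaces(pid, names):
    """Open the namespace files of `pid` and join each one with setns()."""
    libc = ctypes.CDLL(None, use_errno=True)
    fds = []
    try:
        # Open everything first: once the mount namespace has been joined,
        # /proc may no longer show the same processes.
        for name in names:
            fds.append((name, os.open(ns_path(pid, name), os.O_RDONLY)))
        for name, fd in fds:
            if libc.setns(fd, CLONE_FLAGS[name]) != 0:
                err = ctypes.get_errno()
                raise OSError(err, f"setns({name}): {os.strerror(err)}")
    finally:
        for _, fd in fds:
            os.close(fd)
```

A call such as enter_namespaces(init_pid, ["ipc", "uts", "net", "mnt"]) would then be followed by exec'ing the requested command; the raised OSError carries the errno, so EINVAL and EPERM can be told apart.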
The reason for doing it server side is to get privilege separation.
E.g. libvirtd runs privileged to open the fds, and virsh can run
unprivileged with setns(). Unfortunately it seems the kernel
doesn't allow for the thing calling setns() to be unprivileged
at this time, but the design allows for this enhancement in the
future.
setns() needs CAP_SYS_ADMIN and the manpage also says:
ERRORS:
...
EINVAL fd refers to a namespace whose type does not match that specified in nstype, or
there is problem with reassociating the thread with the specified namespace.
I'm sure in my case setns() fails because the calling thread did not open() the ns
files itself.
What is the plan to make lxc-enter-namespace work?
Privilege separation is nice but as of now the kernel interface (setns()) seems not to
allow this.
Are you forcing the kernel guys to change the interface?
In the meantime I'll use util-linux's nsenter which works fine.
Thanks,
//richard