On Thu, Sep 03, 2015 at 11:51:16AM +0200, Cédric Bosdonnat wrote:
> We already have a fuse mount to reflect the cgroup memory restrictions
> in the container. This commit adds the same for the number of available
> CPUs. Only the CPUs listed by virProcessGetAffinity are shown in the
> container's cpuinfo.

So this (re-)raises some interesting / difficult questions that I'm
not sure we have a good answer to.

The main concern is that this is not really a problem specific to
containers; rather, it is related to cgroup resource confinement,
ie the cgroup has confined a process (or processes) to a set of CPUs and
the process is using /proc/cpuinfo to count CPUs, so it gets the wrong
answer. Cgroups are being increasingly widely used in Linux, particularly
since systemd, so pretty much any process has to expect that it can be
confined to a subset of CPUs.

IOW, any application using /proc/cpuinfo to determine "available"
resources is already broken, even when run on bare metal. The same also
applies to the use of /proc/meminfo, which we previously faked via
fuse.

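To make the mismatch concrete, here is a rough sketch (not taken from the
patch) comparing what a naive application would count from /proc/cpuinfo
with the CPUs the process is actually allowed to run on. The function
names are made up for illustration, and the cpuinfo parsing assumes the
x86-style "processor : N" layout.

```python
import os


def count_cpuinfo_processors(cpuinfo_text):
    """Count 'processor' entries the way a naive app might (x86-style layout)."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def usable_cpu_count():
    """CPUs this process may run on, honouring its affinity mask."""
    # os.sched_getaffinity is only available on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            listed = count_cpuinfo_processors(f.read())
    except OSError:
        listed = None  # no procfs on this system
    print("listed in /proc/cpuinfo:", listed)
    print("usable by this process: ", usable_cpu_count())
```

Running this under something like "taskset -c 0 python3 script.py" should
show the two numbers diverging on a multi-CPU host, with no container
involved. Note the affinity mask only covers pinning; it says nothing
about CPU time quotas.
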
So the question is whether we should invest time trying to fake
/proc/cpuinfo in containers, when any apps we'd be fixing are already
broken on bare metal. Apps might have avoided /proc/cpuinfo and instead
be reading /sys/devices/system/cpu/, which your patch isn't trying to
fake. This is just as broken, because sysfs doesn't reflect cgroup
confinement either.

I think what is ultimately needed for applications is some kind of
libresource.so library that they can use to query what resources
are available in their compute environment, which can intelligently
query cgroups directly, and ignore the legacy /proc & /sys interfaces
for counting memory / cpu availability. I don't think that's something
that libvirt should solve - if anything it could be systemd, or a
standalone project.
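As a very rough illustration of what such a library might do, the sketch
below reads limits straight from cgroup files. Everything here is an
assumption for the sake of example: it presumes the cgroup v2 unified
hierarchy (cpuset.cpus.effective, cpu.max, memory.max), that the
process's own cgroup is visible at /sys/fs/cgroup, and the function
names are invented. A real library would also have to cope with the v1
layout and locate the cgroup via /proc/self/cgroup.

```python
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,8' into a sorted list of ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def parse_cpu_max(text):
    """Parse '<quota> <period>' into CPUs' worth of time; None if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_memory_max(text):
    """Parse a byte limit; None if the value is 'max' (unlimited)."""
    value = text.strip()
    return None if value == "max" else int(value)


def read_text(path):
    """Return the file contents, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None


def query_resources(cgroup_dir="/sys/fs/cgroup"):
    """Collect whatever limits are readable; missing files yield None."""
    cpus = read_text(os.path.join(cgroup_dir, "cpuset.cpus.effective"))
    quota = read_text(os.path.join(cgroup_dir, "cpu.max"))
    memory = read_text(os.path.join(cgroup_dir, "memory.max"))
    return {
        "cpus": parse_cpu_list(cpus) if cpus is not None else None,
        "cpu_quota": parse_cpu_max(quota) if quota is not None else None,
        "memory_bytes": parse_memory_max(memory) if memory is not None else None,
    }


if __name__ == "__main__":
    print(query_resources())
```

The point is only that the information is available from cgroups without
going through /proc/cpuinfo or /proc/meminfo at all.
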
So I'm increasingly convinced that LXC should not try to fake out
any /proc & /sys file content, and instead document the limitations.
I'm also thinking that we should kill off our existing fake meminfo
fuse mount at some point.

The more minor concern I have is around the implementation. AFAIR, the
/proc/cpuinfo file contents are not standardized across architectures,
so I'm concerned whether your parsing code is robust on non-x86 arches.

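To show the kind of breakage I mean, here is a small sketch (not the
parsing code from the patch) of a parser that assumes the x86 layout.
The two samples are abbreviated and written from memory; the second is
modelled on what older s390 kernels printed, so treat the exact text as
an assumption.

```python
import re

# Abbreviated x86-style excerpt: one "processor : N" stanza per CPU.
X86_SAMPLE = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "\n"
    "processor\t: 1\n"
    "vendor_id\t: GenuineIntel\n"
)

# Abbreviated excerpt in the style of older s390 kernels (assumed layout).
S390_SAMPLE = (
    "vendor_id       : IBM/S390\n"
    "# processors    : 2\n"
    "processor 0: version = FF,  identification = 000000,  machine = 2817\n"
    "processor 1: version = FF,  identification = 000000,  machine = 2817\n"
)

# Matches only lines of the form "processor : <number>".
NAIVE_PROCESSOR_LINE = re.compile(r"^processor\s*:\s*\d+\s*$", re.MULTILINE)


def naive_count(cpuinfo_text):
    """Count CPUs assuming every architecture uses the x86 layout."""
    return len(NAIVE_PROCESSOR_LINE.findall(cpuinfo_text))


if __name__ == "__main__":
    print("x86 sample: ", naive_count(X86_SAMPLE))
    print("s390 sample:", naive_count(S390_SAMPLE))
```

The x86-style sample yields 2, while the s390-style one yields 0 even
though it also describes two CPUs, so any filtering built on that
pattern would silently do the wrong thing there.
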
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|