
This patch adds the start container support.  A couple of new source files
are added - lxc_container.h and lxc_container.c.  These contain the setup
code that runs within the container namespace prior to exec'ing the user
specified init.

This is a rough outline of the functions involved in starting a container
and the namespace and process under which they run:

lxcVmStart() - runs under caller's process
    lxcSetupTtyTunnel() - opens a tty and socket pair, tty stored in vmDef
    double fork to separate from parent process
        grandchild calls lxcStartContainer(), see below
    parent continues
    wait for child process(es)
    if child process was successful, change vm state to running
    return

lxcStartContainer() - runs in parent namespace, child process from lxcVmStart
    allocate stack for container
    clone() - child process will start in lxcChild(), see below
    exit() - once lxcTtyForward returns, the container has exited

lxcChild() - runs within container, child process from clone()
    mount user filesystems
    mount container /proc
    lxcExecWithTty() - see below, will not return

lxcExecWithTty() - runs within container
    lxcSetupContainerTty() - opens tty for container
    set up SIGCHLD handler
    fork()
        child calls lxcExecContainerInit(), see below
    parent continues
    lxcTtyForward() - shuttles data between file descriptors until flag is
        set; in this case between the master end of the container tty and
        the master end of the parent tty
    exit() - when lxcTtyForward returns, container init has exited

lxcExecContainerInit() - runs within container, child process from
                         lxcExecWithTty
    exec container's init
    if exec fails, exit()

There are (at least) a couple of issues I don't have good solutions for -

1) In this setup with a tty console, we end up with at least 2 processes
per container.  One process is running the user init.  The CMD listed under
ps will be the init as specified in the XML (unless the init changes it to
something else).  The other process is forwarding console traffic between
the parent and container pts.
The CMD listed in ps for the forwarding process will depend on the mgmt app
used to start the container.  Using virsh, it's something like this outside
the container:

root     10141     1 93 22:05 pts/6    00:27:50 /home/dlesko/src/dev/libvirt-ss/libvirt/src/.libs/lt-virsh -c lxc:///

and this inside the container:

root         1     0 93 22:05 pts/6    00:29:19 /home/dlesko/src/dev/libvirt-ss/libvirt/src/.libs/lt-virsh -c lxc:///

This can be a bit confusing.  I'm not sure how important it is, but it would
be nice to change this to something a little more meaningful, as is done by
ssh.

2) The container can stall when nothing is connected to the parent side pty
and console output fills up the buffer.  To avoid this, we set the parent
side pty to be non-blocking.  The result of this is that we will discard any
console output once the buffer has filled.  When a user does connect to the
console, they may get a flood of (potentially very) old data.  It would be
nice to be able to provide some more recent output once someone connects to
the console.

-- 
Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization