On Wed, Jul 16, 2008 at 11:30:53AM -0700, Dan Smith wrote:
> DB> +static int lxcControllerMoveInterfaces(int nveths,
> DB> +                                       char **veths,
> DB> +                                       pid_t container)
> DB> +{
> DB> +    int i;
> DB> +    for (i = 0 ; i < nveths ; i++)
> DB> +        if (moveInterfaceToNetNs(veths[i], container) < 0) {
> DB> +            lxcError(NULL, NULL, VIR_ERR_INTERNAL_ERROR,
> DB> +                     _("failed to move interface %s to ns %d"),
> DB> +                     veths[i], container);
> DB> +            return -1;
> DB> +        }
> DB> +
> DB> +    return 0;
> DB> +}
> I'm not sure why, but the call to this function causes a failure on my
> system. I think that it's related to luck somehow, so I don't think
> it's something that was really introduced by this set, but it prevents
> me from starting a container with a network interface.
>
> I've tracked it down to the actual virRun() of the 'ip' command. If I
> comment out the line that virRun()'s ip, everything works fine.
> However, if it actually gets run, the container never receives the
> continue message.
Yep, as I mentioned on IRC I've not been able to test the network
stuff myself yet, since 2.6.26 hangs on my laptop. I'll hopefully
be able to get it running on one of my development servers when
I'm back in the office next.
> I added a loop in lxcWaitForContinue() that reads the socket
> character-by-character instead of bailing out if the first character
> isn't 'c'. The result was a bunch of control characters followed by
> 'c'.
> Based on these two facts, I would tend to guess that somehow the exec
> of ip is causing some terminal stuff (yes, that's my technical term
> for it) to be written to the socket ahead of the continue message. I
> changed 'ip' to 'true' in moveInterfaceToNetNs() and it behaves the
> same (which absolves ip itself). I removed the virRun() and replaced
> it with a fork()..exec() and it behaves the same. I set FD_CLOEXEC on
> the socket pair, and it behaves the same.
Ok, that's useful info - I'll do some poking around on this.
> DB> -    if (0 != (rc = lxcSetupInterfaces(conn, vm))) {
>
> You changed the condition from "nonzero" here...
Oops.
> <snip>
>
> DB> +    if (lxcSetupInterfaces(conn, vm->def, &nveths, &veths) < 0)
>
> ...to "negative" here, which makes the start process not notice some
> of the failure paths in that function, and thus erroneously return
> success.
> In general, I think this is an excellent approach, but I don't really
> like how many (dozens of) failure points there are in the controller
> and container setup procedure that result in a silent failure. Almost
> all of them leave libvirtd thinking that the container/controller is
> running, but in fact it _exit(1)'d long ago.
Yes, those are bugs. The series of patches I sent missed out my change
to have the libvirtd daemon register the socket connection to the
controller with the event loop. Basically libvirtd needs to watch for
hangup (POLLHUP) on its socket to the controller so it detects death.
The controller also doesn't seem to always exit when the container
dies, so that's probably another condition I broke in the I/O
forwarding loop.
The one difference, I guess, is that the failures are not directly fed
back as errors from the virDomainCreate() call. Ultimately I hope this
isn't too much of a problem - the really interesting errors are going
to be in the container process, and they were never fed back - the
errors in the controller process are mostly my coding bugs which
should be fixed :-)
Btw, anything fprintf'd in the controller (e.g. via lxcError()) will
end up in /var/log/libvirt/lxc/NAME.log
Daniel
--
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org        -o-        http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|