Daniel P. Berrange wrote:
On Fri, May 30, 2008 at 11:40:00AM -0700, Dave Leskovec wrote:
> Daniel P. Berrange wrote:
>> On Fri, May 30, 2008 at 02:28:46AM -0400, Daniel Veillard wrote:
>>> On Thu, May 29, 2008 at 03:20:15PM -0700, Dave Leskovec wrote:
>>>> + if (ESRCH == errno) {
>>>> + rc = 0;
>>>> + DEBUG("pid %d no longer exists", def->id);
>>>> + goto done;
>>>> + }
>>>> +
>>>> + lxcError(NULL, NULL, VIR_ERR_INTERNAL_ERROR,
>>>> + _("error checking container process: %d
%s"),
>>>> + def->id, strerror(errno));
>>>> + goto done;
>>>> + }
>>> The problem though is that by doing just a passive test for the PID
>>> it feels like there is a possible race if the process counter rolled over
and
>>> another process with the same PID got create in the meantime.
>>> i have the feeling that a test based on the state of the file descriptors
>>> used to communicate with the container would be more reliable. Basically if
the
>>> container disapear, then the pipe should get in a half-closed state,
>>> detecting the change at that level sounds like it would be more reliable,
>>> don't you think so ?
>> Yes, after checking the PID still exists, it needs to validate /proc/$PID/exe
>> to verify it points to the binary we expect it to.
> Hmmm.... Worked with this a bit and I don't think we can reliably know what to
> expect /proc/$PID/exe to point to. For scripts, /proc/$PID/exe seems to point
> the shell. Also, if the container does an exec, /proc/$PID/exe points to
> whatever it exec'd rather than the init program.
Well the path won't change once launched, so why not store the original
value of /proc/$PID/exe in the /var/lib/libivrt/lxc/NAME.pid file too,
so you can read it back out later & validate.
AFAIK there's nothing restricting /proc/$PID/exe from changing for valid reasons
between the time it was stored and the time it's checked. Say we store it right
after launching the container. How do we know that we didn't happen to store it
right before an exec? Take a simple container with an init that performs some
setup maybe of the network that takes an indeterminate amount of time. It then
exec's sshd or apache. If the value was stored right after the container was
started, it would no longer be valid after the exec.
--
Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization