[libvirt] [PATCH 3/4] lxc: validate container process during load config

This patch adds a check that validates that the container process pid still exists. This should catch cases where the container exits while libvirtd is down. -- Best Regards, Dave Leskovec IBM Linux Technology Center Open Virtualization

On Thu, May 29, 2008 at 03:20:15PM -0700, Dave Leskovec wrote:
This patch adds a check that validates that the container process pid still exists. This should catch cases where the container exits while libvirtd is down.
sounds fine,
+/** + * lxcCheckContainerProcess: + * @def: Ptr to VM definition + * + * Checks if the container process (stored at def->id is running + * + * Returns on success or -1 in case of error + * 0 - no process with id vm->def->id + * 1 - container process exists + * -1 - error + */ +int lxcCheckContainerProcess(lxc_vm_def_t *def) +{ + int rc = -1; + + if (1 < def->id) { + if (-1 == kill(def->id, 0)) {
hum i didn't know of that way to check for a process, cool
+ if (ESRCH == errno) { + rc = 0; + DEBUG("pid %d no longer exists", def->id); + goto done; + } + + lxcError(NULL, NULL, VIR_ERR_INTERNAL_ERROR, + _("error checking container process: %d %s"), + def->id, strerror(errno)); + goto done; + }
The problem though is that by doing just a passive test for the PID it feels like there is a possible race if the process counter rolled over and another process with the same PID got create in the meantime. i have the feeling that a test based on the state of the file descriptors used to communicate with the container would be more reliable. Basically if the container disapear, then the pipe should get in a half-closed state, detecting the change at that level sounds like it would be more reliable, don't you think so ? But as is the patch is still a good improvement, +1 Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Fri, May 30, 2008 at 02:28:46AM -0400, Daniel Veillard wrote:
On Thu, May 29, 2008 at 03:20:15PM -0700, Dave Leskovec wrote:
+ if (ESRCH == errno) { + rc = 0; + DEBUG("pid %d no longer exists", def->id); + goto done; + } + + lxcError(NULL, NULL, VIR_ERR_INTERNAL_ERROR, + _("error checking container process: %d %s"), + def->id, strerror(errno)); + goto done; + }
The problem though is that by doing just a passive test for the PID it feels like there is a possible race if the process counter rolled over and another process with the same PID got create in the meantime. i have the feeling that a test based on the state of the file descriptors used to communicate with the container would be more reliable. Basically if the container disapear, then the pipe should get in a half-closed state, detecting the change at that level sounds like it would be more reliable, don't you think so ?
Yes, after checking the PID still exists, it needs to validate /proc/$PID/exe to verify it points to the binary we expect it to. Regards, Dan. -- |: Red Hat, Engineering, Boston -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Daniel P. Berrange wrote:
On Fri, May 30, 2008 at 02:28:46AM -0400, Daniel Veillard wrote:
On Thu, May 29, 2008 at 03:20:15PM -0700, Dave Leskovec wrote:
+ if (ESRCH == errno) { + rc = 0; + DEBUG("pid %d no longer exists", def->id); + goto done; + } + + lxcError(NULL, NULL, VIR_ERR_INTERNAL_ERROR, + _("error checking container process: %d %s"), + def->id, strerror(errno)); + goto done; + } The problem though is that by doing just a passive test for the PID it feels like there is a possible race if the process counter rolled over and another process with the same PID got create in the meantime. i have the feeling that a test based on the state of the file descriptors used to communicate with the container would be more reliable. Basically if the container disapear, then the pipe should get in a half-closed state, detecting the change at that level sounds like it would be more reliable, don't you think so ?
Yes, after checking the PID still exists, it needs to validate /proc/$PID/exe to verify it points to the binary we expect it to.
Hmmm.... Worked with this a bit and I don't think we can reliably know what to expect /proc/$PID/exe to point to. For scripts, /proc/$PID/exe seems to point the shell. Also, if the container does an exec, /proc/$PID/exe points to whatever it exec'd rather than the init program. There currently isn't pipe into the container although one may be added as part of the network support for other reasons. My concern there is that if something has happened that we're not getting SIGCHLD from exiting containers, wouldn't we have lost the pipe as well? -- Best Regards, Dave Leskovec IBM Linux Technology Center Open Virtualization

On Fri, May 30, 2008 at 11:40:00AM -0700, Dave Leskovec wrote:
Daniel P. Berrange wrote:
On Fri, May 30, 2008 at 02:28:46AM -0400, Daniel Veillard wrote:
On Thu, May 29, 2008 at 03:20:15PM -0700, Dave Leskovec wrote:
+ if (ESRCH == errno) { + rc = 0; + DEBUG("pid %d no longer exists", def->id); + goto done; + } + + lxcError(NULL, NULL, VIR_ERR_INTERNAL_ERROR, + _("error checking container process: %d %s"), + def->id, strerror(errno)); + goto done; + } The problem though is that by doing just a passive test for the PID it feels like there is a possible race if the process counter rolled over and another process with the same PID got create in the meantime. i have the feeling that a test based on the state of the file descriptors used to communicate with the container would be more reliable. Basically if the container disapear, then the pipe should get in a half-closed state, detecting the change at that level sounds like it would be more reliable, don't you think so ?
Yes, after checking the PID still exists, it needs to validate /proc/$PID/exe to verify it points to the binary we expect it to.
Hmmm.... Worked with this a bit and I don't think we can reliably know what to expect /proc/$PID/exe to point to. For scripts, /proc/$PID/exe seems to point the shell. Also, if the container does an exec, /proc/$PID/exe points to whatever it exec'd rather than the init program.
Well the path won't change once launched, so why not store the original value of /proc/$PID/exe in the /var/lib/libivrt/lxc/NAME.pid file too, so you can read it back out later & validate. Dan. -- |: Red Hat, Engineering, Boston -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Daniel P. Berrange wrote:
On Fri, May 30, 2008 at 11:40:00AM -0700, Dave Leskovec wrote:
On Fri, May 30, 2008 at 02:28:46AM -0400, Daniel Veillard wrote:
On Thu, May 29, 2008 at 03:20:15PM -0700, Dave Leskovec wrote:
+ if (ESRCH == errno) { + rc = 0; + DEBUG("pid %d no longer exists", def->id); + goto done; + } + + lxcError(NULL, NULL, VIR_ERR_INTERNAL_ERROR, + _("error checking container process: %d %s"), + def->id, strerror(errno)); + goto done; + } The problem though is that by doing just a passive test for the PID it feels like there is a possible race if the process counter rolled over and another process with the same PID got create in the meantime. i have the feeling that a test based on the state of the file descriptors used to communicate with the container would be more reliable. Basically if the container disapear, then the pipe should get in a half-closed state, detecting the change at that level sounds like it would be more reliable, don't you think so ? Yes, after checking the PID still exists, it needs to validate /proc/$PID/exe to verify it points to the binary we expect it to. Hmmm.... Worked with this a bit and I don't think we can reliably know what to expect /proc/$PID/exe to point to. For scripts, /proc/$PID/exe seems to point
Daniel P. Berrange wrote: the shell. Also, if the container does an exec, /proc/$PID/exe points to whatever it exec'd rather than the init program.
Well the path won't change once launched, so why not store the original value of /proc/$PID/exe in the /var/lib/libivrt/lxc/NAME.pid file too, so you can read it back out later & validate.
AFAIK there's nothing restricting /proc/$PID/exe from changing for valid reasons between the time it was stored and the time it's checked. Say we store it right after launching the container. How do we know that we didn't happen to store it right before an exec? Take a simple container with an init that performs some setup maybe of the network that takes an indeterminate amount of time. It then exec's sshd or apache. If the value was stored right after the container was started, it would no longer be valid after the exec. -- Best Regards, Dave Leskovec IBM Linux Technology Center Open Virtualization
participants (3)
-
Daniel P. Berrange
-
Daniel Veillard
-
Dave Leskovec