[libvirt] [PATCH 0/2] qemu: kill junk process

First patch tries to kill qemu possibly left running by previous daemon and second patch is just a reproducer for the issue; if you start daemon with that patch applied, it will kill itself in the appropriate time so you don't have to quickly start/kill the daemon over and over again. Martin Kletzander (2): qemu: make sure capability probing process can start DO NOT APPLY: Reproducer for patch 1/2 src/qemu/qemu_capabilities.c | 45 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) -- 2.1.2

When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d. Reported-by: Wang Yufei <james.wangyufei@huawei.com> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false; + /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) { + virProcessKillPainfully(pid, true); + break; + } + + pos += strlen(cmdline + pos) + 1; + } + + VIR_FREE(cmdline); + } + VIR_DEBUG("Try to get caps via QMP qemuCaps=%p", qemuCaps); /* -- 2.1.2

On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it. The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed. Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out. So I guss we don't have much choice but to manually kill.
Reported-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false;
+ /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) {
Heh fun hack. Did you consider simply trying to connect to the monitor socket to see if the QEMU was still there ? That would be slightly more portable as it wouldn't rely on Linux /proc
+ virProcessKillPainfully(pid, true); + break; + } + + pos += strlen(cmdline + pos) + 1; + } + + VIR_FREE(cmdline); + } + VIR_DEBUG("Try to get caps via QMP qemuCaps=%p", qemuCaps);
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Oct 09, 2014 at 10:14:48 +0100, Daniel Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
It would be cool if we could tell QEMU to die when the monitor connection gets closed. Configuring some predefined actions to be taken when monitor is closed would be useful in general... we could use that to automatically cancel migration if QEMU loses connection with libvirt, for example. I'm not sure how this idea would be taken by QEMU community, though. I'll try to get opinions on it during KVM Forum. But even if this is something that could be done, we'd still need Martin's solution. Jirka

On Thu, Oct 09, 2014 at 11:38:46AM +0200, Jiri Denemark wrote:
On Thu, Oct 09, 2014 at 10:14:48 +0100, Daniel Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
It would be cool if we could tell QEMU to die when the monitor connection gets closed. Configuring some predefined actions to be taken when monitor is closed would be useful in general... we could use that to automatically cancel migration if QEMU loses connection with libvirt, for example. I'm not sure how this idea would be taken by QEMU community, though. I'll try to get opinions on it during KVM Forum. But even if this is something that could be done, we'd still need Martin's solution.
The real problem with this capabilities probing is that we really need to deal with whatever functionality was availble when '-M none' was first introduced. Otherwise we'll have to have 3 different ways of launching QEMU to probe for capabilities, which sucks even more than having 2 different ways Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Oct 09, 2014 at 10:45:50 +0100, Daniel Berrange wrote:
On Thu, Oct 09, 2014 at 11:38:46AM +0200, Jiri Denemark wrote:
On Thu, Oct 09, 2014 at 10:14:48 +0100, Daniel Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
It would be cool if we could tell QEMU to die when the monitor connection gets closed. Configuring some predefined actions to be taken when monitor is closed would be useful in general... we could use that to automatically cancel migration if QEMU loses connection with libvirt, for example. I'm not sure how this idea would be taken by QEMU community, though. I'll try to get opinions on it during KVM Forum. But even if this is something that could be done, we'd still need Martin's solution.
The real problem with this capabilities probing is that we really need to deal with whatever functionality was availble when '-M none' was first introduced. Otherwise we'll have to have 3 different ways of launching QEMU to probe for capabilities, which sucks even more than having 2 different ways
Hmm, that makes sense. However, the feature may still be useful :-) Jirka

On Thu, Oct 09, 2014 at 10:14:48AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
Reported-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false;
+ /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) {
Heh fun hack. Did you consider simply trying to connect to the monitor socket to see if the QEMU was still there ? That would be slightly more portable as it wouldn't rely on Linux /proc
I wanted to cooperate with qemu somehow, for example killing it if we are handling a signal (that wouldn't help if the daemon is SIGKILL'd), I thought we can add some timeout option to qemu (or it's parent process that would kill it if there was no communication for a while), but that doesn't make much sense and so on. Michal then told me that we can just kill it if it still exists. And that's something that would still need to be there if the monitor is not responding, for example. Since all this machinery happens only after libvirtd has its pidfile flock()'d, we can be sure we're not killing anyone else's qemu. What would be the platforms this wouldn't run on? I can only think of BSD where /proc doesn't have to be mounted. Can we somehow require that? Martin
+ virProcessKillPainfully(pid, true); + break; + } + + pos += strlen(cmdline + pos) + 1; + } + + VIR_FREE(cmdline); + } + VIR_DEBUG("Try to get caps via QMP qemuCaps=%p", qemuCaps);
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Oct 09, 2014 at 11:49:34 +0200, Martin Kletzander wrote:
On Thu, Oct 09, 2014 at 10:14:48AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
Reported-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false;
+ /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) {
Heh fun hack. Did you consider simply trying to connect to the monitor socket to see if the QEMU was still there ? That would be slightly more portable as it wouldn't rely on Linux /proc
I wanted to cooperate with qemu somehow, for example killing it if we are handling a signal (that wouldn't help if the daemon is SIGKILL'd), I thought we can add some timeout option to qemu (or it's parent process that would kill it if there was no communication for a while), but that doesn't make much sense and so on. Michal then told me that we can just kill it if it still exists. And that's something that would still need to be there if the monitor is not responding, for example.
Since all this machinery happens only after libvirtd has its pidfile flock()'d, we can be sure we're not killing anyone else's qemu.
And do we actually have to check anything. We already know there is a process running with its PID stored in /path/to/capabilities.pidfile and the pidfile is flocked. I think it should be good enough to assume the stored PID belongs to the process that flocked the pidfile. Why should we care about other processes flocking the file? We would be killing a random process rather than the one which locked the pidfile but that's life :-) Jirka

On Thu, Oct 09, 2014 at 12:37:42PM +0200, Jiri Denemark wrote:
On Thu, Oct 09, 2014 at 11:49:34 +0200, Martin Kletzander wrote:
On Thu, Oct 09, 2014 at 10:14:48AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
Reported-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false;
+ /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) {
Heh fun hack. Did you consider simply trying to connect to the monitor socket to see if the QEMU was still there ? That would be slightly more portable as it wouldn't rely on Linux /proc
I wanted to cooperate with qemu somehow, for example killing it if we are handling a signal (that wouldn't help if the daemon is SIGKILL'd), I thought we can add some timeout option to qemu (or it's parent process that would kill it if there was no communication for a while), but that doesn't make much sense and so on. Michal then told me that we can just kill it if it still exists. And that's something that would still need to be there if the monitor is not responding, for example.
Since all this machinery happens only after libvirtd has its pidfile flock()'d, we can be sure we're not killing anyone else's qemu.
And do we actually have to check anything. We already know there is a process running with its PID stored in /path/to/capabilities.pidfile and the pidfile is flocked. I think it should be good enough to assume the stored PID belongs to the process that flocked the pidfile. Why should we care about other processes flocking the file? We would be killing a random process rather than the one which locked the pidfile but that's life :-)
Sure we could try to acquire the flock on the pidfile and if that fails we should be ok to assume the PID stored in the pidfile is still valid. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Oct 09, 2014 at 11:52:49AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 12:37:42PM +0200, Jiri Denemark wrote:
On Thu, Oct 09, 2014 at 11:49:34 +0200, Martin Kletzander wrote:
On Thu, Oct 09, 2014 at 10:14:48AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
Reported-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false;
+ /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) {
Heh fun hack. Did you consider simply trying to connect to the monitor socket to see if the QEMU was still there ? That would be slightly more portable as it wouldn't rely on Linux /proc
I wanted to cooperate with qemu somehow, for example killing it if we are handling a signal (that wouldn't help if the daemon is SIGKILL'd), I thought we can add some timeout option to qemu (or it's parent process that would kill it if there was no communication for a while), but that doesn't make much sense and so on. Michal then told me that we can just kill it if it still exists. And that's something that would still need to be there if the monitor is not responding, for example.
Since all this machinery happens only after libvirtd has its pidfile flock()'d, we can be sure we're not killing anyone else's qemu.
And do we actually have to check anything. We already know there is a process running with its PID stored in /path/to/capabilities.pidfile and the pidfile is flocked. I think it should be good enough to assume the stored PID belongs to the process that flocked the pidfile. Why should we care about other processes flocking the file? We would be killing a random process rather than the one which locked the pidfile but that's life :-)
Sure we could try to acquire the flock on the pidfile and if that fails we should be ok to assume the PID stored in the pidfile is still valid.
Either we have to check the cmdline (which is in *BSD's procfs implementation, but we can't rely on that) *or* we can try flock()'ing the file and killing the pid that's written inside (which is probably better). But we can't just kill the pid without checking anything, we could kill something else and that's certainly not what we want :) I'll send a v2, Martin

On 2014/10/9 19:08, Martin Kletzander wrote:
On Thu, Oct 09, 2014 at 11:52:49AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 12:37:42PM +0200, Jiri Denemark wrote:
On Thu, Oct 09, 2014 at 11:49:34 +0200, Martin Kletzander wrote:
On Thu, Oct 09, 2014 at 10:14:48AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
Reported-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false;
+ /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) {
Heh fun hack. Did you consider simply trying to connect to the monitor socket to see if the QEMU was still there ? That would be slightly more portable as it wouldn't rely on Linux /proc
I wanted to cooperate with qemu somehow, for example killing it if we are handling a signal (that wouldn't help if the daemon is SIGKILL'd), I thought we can add some timeout option to qemu (or it's parent process that would kill it if there was no communication for a while), but that doesn't make much sense and so on. Michal then told me that we can just kill it if it still exists. And that's something that would still need to be there if the monitor is not responding, for example.
Since all this machinery happens only after libvirtd has its pidfile flock()'d, we can be sure we're not killing anyone else's qemu.
And do we actually have to check anything. We already know there is a process running with its PID stored in /path/to/capabilities.pidfile and the pidfile is flocked. I think it should be good enough to assume the stored PID belongs to the process that flocked the pidfile. Why should we care about other processes flocking the file? We would be killing a random process rather than the one which locked the pidfile but that's life :-)
Sure we could try to acquire the flock on the pidfile and if that fails we should be ok to assume the PID stored in the pidfile is still valid.
Either we have to check the cmdline (which is in *BSD's procfs implementation, but we can't rely on that) *or* we can try flock()'ing the file and killing the pid that's written inside (which is probably better). But we can't just kill the pid without checking anything, we could kill something else and that's certainly not what we want :)
I'll send a v2, Martin
Thanks. I just wanted to reply your former email and then saw your patch. I think you have understood the problem reported. My idea is the same as yours. Check that the pid is qemu and kill it.

On Thu, Oct 09, 2014 at 11:49:34AM +0200, Martin Kletzander wrote:
On Thu, Oct 09, 2014 at 10:14:48AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
The comments say we have to use daemonize to synchronize with the monitor socket creation and I recall we've tried other approaches to that before which failed.
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
So I guss we don't have much choice but to manually kill.
Reported-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 6fcb5c7..919780e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3243,6 +3243,47 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, config.data.nix.path = monpath; config.data.nix.listen = false;
+ /* + * Check if there wasn't some older qemu left running, e.g. if + * someone killed libvirtd during the probing phase. + */ + if (virFileExists(pidfile) && + virPidFileReadPath(pidfile, &pid) == 0) { + char *cmdpath = NULL; + char *cmdline = NULL; + int len, pos = 0; + + VIR_DEBUG("Checking if there's an older qemu process left running that " + "was used for capability probing"); + + if (virAsprintf(&cmdpath, "/proc/%u/cmdline", pid) < 0) + goto cleanup; + + len = virFileReadAll(cmdpath, 1024, &cmdline); + VIR_FREE(cmdpath); + + /* + * This cycle gets trivially skipped if there was an error in + * virFileReadAll(). + */ + while (len > pos) { + /* + * Find the '-pidfile /path/to/capabilities.pidfile' to be + * sure we won't kill anyone else. + */ + if (STREQ_NULLABLE(cmdline + pos, "-pidfile") && + (pos += strlen(cmdline + pos) + 1) < len && + STREQ_NULLABLE(cmdline + pos, pidfile)) {
Heh fun hack. Did you consider simply trying to connect to the monitor socket to see if the QEMU was still there ? That would be slightly more portable as it wouldn't rely on Linux /proc
I wanted to cooperate with qemu somehow, for example killing it if we are handling a signal (that wouldn't help if the daemon is SIGKILL'd), I thought we can add some timeout option to qemu (or it's parent process that would kill it if there was no communication for a while), but that doesn't make much sense and so on. Michal then told me that we can just kill it if it still exists. And that's something that would still need to be there if the monitor is not responding, for example.
Since all this machinery happens only after libvirtd has its pidfile flock()'d, we can be sure we're not killing anyone else's qemu.
What would be the platforms this wouldn't run on? I can only think of BSD where /proc doesn't have to be mounted. Can we somehow require that?
Does BSD's /proc actally include the 'cmdline' file ? I know a number of OS include /proc but I always thought they have completely different contents from each other, which is why i was concerned in this case. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Oct 09, 2014 at 10:14:48AM +0100, Daniel P. Berrange wrote:
On Thu, Oct 09, 2014 at 09:58:30AM +0200, Martin Kletzander wrote:
When daemon is killed right in the middle of probing a qemu binary for its capabilities, the VM is left running. Next time the daemon is starting, it cannot start qemu process because the one that's already running does have the pidfile flock()'d.
I was wondering if there's anything we can easily change in the way we launch the QEMU binary so that it automatically dies when libvirtd exits, rather than us needing to manually kill it.
FWIW libguestfs (in non-libvirt mode) spawns off a recovery process that is just there to kill qemu if the parent process goes away without closing qemu properly. This has Just Worked all the time we have been using libguestfs. Code: https://github.com/libguestfs/libguestfs/blob/master/src/launch-direct.c#L74...
Another idea would be to play with adding '-serial stdio' and then when libvirt died stdio would get a broken pipe but I don't think it is safe to use -serial when we have -M none so that's out.
libguestfs uses -serial stdio but qemu doesn't die if the pipe is broken. Hence need for above. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top

Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_capabilities.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 919780e..9d553b4 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3324,6 +3324,10 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps, goto cleanup; } + VIR_ERROR("Kamikazeee"); + while (virProcessKillPainfully(getpid(), true) < 0) + sleep(1); + if (!(xmlopt = virDomainXMLOptionNew(NULL, NULL, NULL)) || !(vm = virDomainObjNew(xmlopt))) goto cleanup; -- 2.1.2
participants (5)
-
Daniel P. Berrange
-
Jiri Denemark
-
Martin Kletzander
-
Richard W.M. Jones
-
Wang Rui