[Libvir] Restarting of libvirt_qemud daemon

Thinking about later RPM upgrades, I think we need to consider whether it will be possible to restart libvirt_qemud while guests & networks are running. There are a couple of issues:
- We do waitpid() to clean up qemu & dnsmasq processes when we stop domains & networks, or to detect when they crash. For the former, we could make them daemons to avoid the waitpid() cleanup, but we'd still need waitpid() to be able to detect shutdowns. There is also the issue of enumerating running instances.
- We always try to re-create a bridge device at startup, even if it already exists. Likewise we always try to add the iptables rules & start dnsmasq. We can easily detect if the bridge already exists. I think we can probably double-check the iptables rules too. The tricky one is figuring out whether a dnsmasq instance is still running.
Dealing with these not only helps planned restarts, but will also make it possible to start up the daemon again after a crash without having to kill off all guests & networks manually. So I think it is worth investigating what we can do to enable restarts. It might be worth waiting until we sort out whether we'll merge libvirt_qemud with the generic libvirtd remote daemon, though, so we don't have to do the work twice over.
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
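A restarted daemon is no longer the parent of the qemu and dnsmasq processes it launched earlier, so waitpid() on them fails with ECHILD. The following is only a minimal sketch (in Python for brevity, not libvirt_qemud code) of the common fallback of probing a recorded PID with signal 0. The helper names `pid_alive` and `can_waitpid` are hypothetical.

```python
import os


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists.

    Signal 0 runs the existence and permission checks without
    delivering anything, so it also works for non-child processes.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:   # ESRCH: no such process
        return False
    except PermissionError:      # EPERM: exists, but owned by another user
        return True
    return True


def can_waitpid(pid: int) -> bool:
    """Return True if waitpid() is usable on pid, i.e. it is our child.

    Note: with WNOHANG this reaps the child if it has already exited.
    """
    try:
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:    # ECHILD: not a child of this process
        return False
    return True
```

A recorded PID can be recycled by an unrelated process, so a real check would probably also want to confirm that the process is actually qemu or dnsmasq.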

On Fri, Mar 09, 2007 at 03:25:44AM +0000, Daniel P. Berrange wrote:
Thinking about later RPM upgrades, I think we need to consider whether it will be possible to restart libvirt_qemud while guests & networks are running. There are a couple of issues:
- We do waitpid() to clean up qemu & dnsmasq processes when we stop domains & networks, or to detect when they crash. For the former, we could make them daemons to avoid the waitpid() cleanup, but we'd still need waitpid() to be able to detect shutdowns. There is also the issue of enumerating running instances.
- We always try to re-create a bridge device at startup, even if it already exists. Likewise we always try to add the iptables rules & start dnsmasq. We can easily detect if the bridge already exists. I think we can probably double-check the iptables rules too. The tricky one is figuring out whether a dnsmasq instance is still running.
Dealing with these not only helps planned restarts, but will also make it possible to start up the daemon again after a crash without having to kill off all guests & networks manually. So I think it is worth investigating what we can do to enable restarts. It might be worth waiting until we sort out whether we'll merge libvirt_qemud with the generic libvirtd remote daemon, though, so we don't have to do the work twice over.
In general I really prefer restartable daemons, especially if the client can auto-restart them when they have gone missing. It makes users' and sysadmins' lives so much easier (and avoids the need to start the daemon at bootup, which is yet another pain), though I understand this may be hard to achieve because we have too much state.
With respect to the unification of the various daemons, this also sounds like a really nice thing to have, but I must admit I'm a bit lost. I don't really have a clear picture of all the requirements (and probably won't until we finalize at least a first version of the networking support).
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Fri, 2007-03-09 at 04:44 -0500, Daniel Veillard wrote:
In general I really prefer restartable daemons, especially if the client can auto-restart them when they have gone missing. It makes users' and sysadmins' lives so much easier (and avoids the need to start the daemon at bootup, which is yet another pain), though I understand this may be hard to achieve because we have too much state.
Well, the problem here is that you want the daemon to start guests and domains at boot time. That's the main reason for the initscript.
With respect to the unification of the various daemons, this also sounds like a really nice thing to have, but I must admit I'm a bit lost. I don't really have a clear picture of all the requirements (and probably won't until we finalize at least a first version of the networking support).
Stop confusing me! :-) s/networking/remote/
Cheers,
Mark.

Daniel P. Berrange wrote:
Thinking about later RPM upgrades, I think we need to consider whether it will be possible to restart libvirt_qemud while guests & networks are running. There are a couple of issues:
- We do waitpid() to clean up qemu & dnsmasq processes when we stop domains & networks, or to detect when they crash. For the former, we could make them daemons to avoid the waitpid() cleanup, but we'd still need waitpid() to be able to detect shutdowns. There is also the issue of enumerating running instances.
Maybe I'm missing something big here, but how would libvirt_qemud regain connections to the running qemu monitor ptys?
Rich.
--
Emerging Technologies, Red Hat http://et.redhat.com/~rjones/
64 Baker Street, London, W1U 7DF Mobile: +44 7866 314 421
"[Negative numbers] darken the very whole doctrines of the equations and make dark of the things which are in their nature excessively obvious and simple" (Francis Maseres FRS, mathematician, 1759)

On Fri, Mar 09, 2007 at 10:47:18AM +0000, Richard W.M. Jones wrote:
Maybe I'm missing something big here, but how would libvirt_qemud regain connections to the running qemu monitor ptys?
That's one of the challenges to be addressed :-) Fortunately the monitor is set to be exposed via /dev/pts/XXX, so if the restarted daemon can find out the path to the PTY, then it can re-open it. Maybe we just need to record a state file somewhere containing a PID & PTY path.
Dan.
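The state file idea could look roughly like the sketch below. This is an illustration under stated assumptions, not what libvirt_qemud does: the JSON layout and the function names `save_state` and `load_state` are made up here, and the file is written via a temporary file plus rename so that a crash mid-write does not leave a truncated record.

```python
import json
import os
import tempfile


def save_state(path: str, pid: int, pty: str) -> None:
    """Atomically record the qemu PID and monitor PTY path."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"pid": pid, "pty": pty}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # rename within one filesystem is atomic
    except BaseException:
        os.unlink(tmp)         # do not leave the temporary file behind
        raise


def load_state(path: str):
    """Return (pid, pty) from the state file, or None if it is missing."""
    try:
        with open(path) as f:
            data = json.load(f)
    except FileNotFoundError:
        return None
    return int(data["pid"]), str(data["pty"])
```

On restart the daemon would call `load_state()` for each known guest and try to re-open the recorded PTY path.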

Daniel P. Berrange wrote:
That's one of the challenges to be addressed :-) Fortunately the monitor is set to be exposed via /dev/pts/XXX, so if the restarted daemon can find out the path to the PTY, then it can re-open it. Maybe we just need to record a state file somewhere containing a PID & PTY path.
Is it possible to start qemu with something like:
qemu -monitor pipe:/var/some/known/place/pipe.UUID
and then just look in /var/some/known/place/ in order to find the running instances?
Rich.
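Enumerating instances from a known directory might be sketched as follows. The `pipe.` prefix mirrors the naming in the message above, the function name `list_instances` is hypothetical, and entries whose suffix does not parse as a UUID are simply skipped.

```python
import os
import uuid


def list_instances(directory: str, prefix: str = "pipe."):
    """Map guest UUID -> pipe path for every well-formed entry."""
    found = {}
    for name in os.listdir(directory):
        if not name.startswith(prefix):
            continue
        try:
            guest = uuid.UUID(name[len(prefix):])
        except ValueError:
            continue  # suffix is not a UUID; ignore stray files
        found[guest] = os.path.join(directory, name)
    return found
```

A pipe that is listed here is only a candidate: the qemu process behind it may already have exited, so each entry still needs a liveness check.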

Richard W.M. Jones wrote:
Is it possible to start qemu with something like:
qemu -monitor pipe:/var/some/known/place/pipe.UUID
and then just look in /var/some/known/place/ in order to find the running instances?
I should add a note that you can tell if the qemu at the other end of the pipe has died by opening the pipe and writing something, for example a NO-OP command. If you get EPIPE (or SIGPIPE if you weren't careful to disable the signal) you can delete the pipe device.
Rich.
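A rough sketch of that probe against a plain named FIFO follows. Assumptions: `reader_alive` is a hypothetical helper, the default `probe` bytes are a placeholder rather than a real monitor no-op command, and Python's default of ignoring SIGPIPE is relied on so that the failure shows up as EPIPE. A non-blocking open is used because opening a FIFO for writing with no reader would otherwise block; in that case it fails with ENXIO instead.

```python
import errno
import os


def reader_alive(fifo_path: str, probe: bytes = b"\n") -> bool:
    """Return True if some process still has the FIFO open for reading."""
    try:
        # With O_NONBLOCK, opening for write fails with ENXIO when the
        # FIFO has no reader at all, rather than blocking forever.
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno == errno.ENXIO:
            return False
        raise
    try:
        os.write(fd, probe)
    except BrokenPipeError:   # EPIPE: the reader went away after our open
        return False
    except BlockingIOError:   # pipe full: a reader exists but is not draining
        return True
    finally:
        os.close(fd)
    return True
```

When this returns False the caller can unlink the stale FIFO, as suggested above.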

On Fri, Mar 09, 2007 at 02:02:55PM +0000, Richard W.M. Jones wrote:
I should add a note that you can tell if the qemu at the other end of the pipe has died by opening the pipe and writing something, for example a NO-OP command. If you get EPIPE (or SIGPIPE if you weren't careful to disable the signal) you can delete the pipe device.
Yes, that could be a very useful way to detect guest shutdown without needing to maintain the PPID<->PID relationship & waitpid(). Definitely something to experiment with in the near future.
Dan.

On Fri, 2007-03-09 at 03:25 +0000, Daniel P. Berrange wrote:
Thinking about later RPM upgrades, I think we need to consider whether it will be possible to restart libvirt_qemud while guests & networks are running.
If I had time, I'd give some serious thought as to whether we need to allow this. Are there any other examples of a daemon that manages something long-lived that can't be restarted without shutting down what it's managing?
There are a couple of issues:
- We do waitpid() to clean up qemu & dnsmasq processes when we stop domains & networks, or to detect when they crash. For the former, we could make them daemons to avoid the waitpid() cleanup, but we'd still need waitpid() to be able to detect shutdowns. There is also the issue of enumerating running instances.
- We always try to re-create a bridge device at startup, even if it already exists. Likewise we always try to add the iptables rules & start dnsmasq. We can easily detect if the bridge already exists. I think we can probably double-check the iptables rules too. The tricky one is figuring out whether a dnsmasq instance is still running.
Dealing with these not only helps planned restarts, but will also make it possible to start up the daemon again after a crash without having to kill off all guests & networks manually. So I think it is worth investigating what we can do to enable restarts. It might be worth waiting until we sort out whether we'll merge libvirt_qemud with the generic libvirtd remote daemon, though, so we don't have to do the work twice over.
I guess the way I'd look at it is, a running qemud contains various state - how do you recover that state on restart? e.g.
- the list of running VMs, the PID of the qemu processes, the stdout/stderr/monitor pipes, the domain ID, and the domain UUID if we generated it
- the list of running networks, the bridge associated with each network and the PID of the dnsmasq processes.
I could perhaps imagine using named pipes, caching this state in /var and re-loading it on startup but ... non-trivial to say the least.
Cheers,
Mark.
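For the network half of that state, a reload step might compare the cached record against what is actually present on the host. The sketch below is an assumption-laden illustration: `NetworkState`, `startup_actions` and the action strings are invented for this example, and the bridge test relies on Linux exposing a `bridge` subdirectory under `/sys/class/net/<name>` for bridge devices.

```python
import os
from dataclasses import dataclass


@dataclass
class NetworkState:
    name: str
    bridge: str
    dnsmasq_pid: int


def bridge_exists(name: str, sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if `name` is an existing Linux bridge device."""
    return os.path.isdir(os.path.join(sysfs_root, name, "bridge"))


def process_exists(pid: int) -> bool:
    """Probe a PID with signal 0; nothing is delivered to the process."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def startup_actions(net: NetworkState, sysfs_root: str = "/sys/class/net"):
    """List what still has to be (re)created for one cached network."""
    actions = []
    if not bridge_exists(net.bridge, sysfs_root):
        actions.append("create-bridge")
    if not process_exists(net.dnsmasq_pid):
        actions.append("start-dnsmasq")
    return actions
```

An empty list means the bridge and dnsmasq from before the restart can be adopted as they are; checking the iptables rules is left out of this sketch.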
participants (4)
- Daniel P. Berrange
- Daniel Veillard
- Mark McLoughlin
- Richard W.M. Jones