[Libvir] Adding support for daemon restarts with stateful drivers

The libvirt daemon has the ability to reload itself by sending it SIGHUP. For the QEMU & network drivers this makes it reload the config files for VMs and re-init the iptables rules. It would be desirable though to allow the daemon to perform a full restart. Principally this is for RPM upgrades where you want to ensure the daemon is running the new code.

The tricky thing is figuring out how to handle driver state. Looking at the QEMU, network, storage and LXC drivers, there is not actually all that much state to deal with. It basically comes down to:

 - PID of child processes (eg QEMU, dnsmasq, container)
 - FDs for STDIN/OUT/ERR of the child processes
 - A possible logfile FD
 - Flag to indicate whether some objects are active or not

That is more or less it. Anything else is kept in the config files and can be reloaded at will.

So I was thinking about whether we could provide a simple protocol to allow each stateful driver to save its state into some location, the daemon could just 'exec()' itself again, and upon startup the drivers reload their active state. Since the daemon just exec()'s itself it would still own the child processes & still have all the necessary FDs open.

Dan.
-- 
|: Red Hat, Engineering, Boston -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
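
The per-object state listed in the message above is small enough to be written out as one text line per child before the exec() and read back on startup. The following is only a minimal sketch of that idea: the struct, its field names, the one-line text format and both function names are assumptions made for illustration, not anything taken from the libvirt source.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical record for one stateful object. The fields mirror the
 * list in the message; the names are illustrative only. */
struct child_state {
    pid_t pid;      /* PID of the child process (QEMU, dnsmasq, container) */
    int stdin_fd;   /* FDs for the child's STDIN/OUT/ERR */
    int stdout_fd;
    int stderr_fd;
    int log_fd;     /* logfile FD, or -1 if there is none */
    int active;     /* non-zero if the object is active */
};

/* Write one record as a single text line.
 * Returns 0 on success, -1 on a write error. */
int child_state_save(FILE *fp, const struct child_state *st)
{
    assert(fp != NULL && st != NULL);
    if (fprintf(fp, "%ld %d %d %d %d %d\n", (long)st->pid, st->stdin_fd,
                st->stdout_fd, st->stderr_fd, st->log_fd, st->active) < 0)
        return -1;
    return 0;
}

/* Read one record written by child_state_save().
 * Returns 0 on success, -1 on EOF or a malformed line. */
int child_state_load(FILE *fp, struct child_state *st)
{
    long pid;

    assert(fp != NULL && st != NULL);
    if (fscanf(fp, "%ld %d %d %d %d %d", &pid, &st->stdin_fd,
               &st->stdout_fd, &st->stderr_fd, &st->log_fd,
               &st->active) != 6)
        return -1;
    st->pid = (pid_t)pid;
    return 0;
}
```

The FD numbers stored this way are only meaningful because the same process image keeps them open across exec(); after a real restart of the process they would refer to nothing.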

On Fri, Mar 21, 2008 at 05:35:12PM +0000, Daniel P. Berrange wrote:
> The libvirt daemon has the ability to reload itself by sending it SIGHUP. For the QEMU & network drivers this makes it reload the config files for VMs and re-init the iptables rules. It would be desirable though to allow the daemon to perform a full restart. Principally this is for RPM upgrades where you want to ensure the daemon is running the new code.
>
> The tricky thing is figuring out how to handle driver state. Looking at the QEMU, network, storage and LXC drivers, there is not actually all that much state to deal with. It basically comes down to:
>
>  - PID of child processes (eg QEMU, dnsmasq, container)
>  - FDs for STDIN/OUT/ERR of the child processes
>  - A possible logfile FD
>  - Flag to indicate whether some objects are active or not
>
> That is more or less it. Anything else is kept in the config files and can be reloaded at will.
From the POV of a libvirt client connected to the driver, we would still see either a disconnection or a potential loss of state, depending on how it is connected, right? If we are sure we can restart transparently, fine, but I'm not sure that is always the case, for say an ssh connection without an agent. Still, being able to re-exec on the new code is important. I would still try to avoid it if we can detect that the code itself didn't change (for example, if the timestamp on /usr/sbin/libvirtd didn't change, it's likely to be a simple reload (-HUP) command).
> So I was thinking about whether we could provide a simple protocol to allow each stateful driver to save its state into some location, the daemon could just 'exec()' itself again, and upon startup the drivers reload their active state. Since the daemon just exec()'s itself it would still own the child processes & still have all the necessary FDs open.
Yes, but how much state is kept in buffers and in the code of the protocols?

Daniel
-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
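
The timestamp check suggested in the reply above could look roughly like the sketch below. It assumes the daemon recorded the mtime of its own binary at startup and compares against it when asked to restart; the function name and that startup bookkeeping are hypothetical, and mtime is only a heuristic for "the code changed".

```c
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Compare the binary's current mtime with the one recorded at startup.
 * Returns 1 if it differs (a re-exec is worthwhile), 0 if it is
 * unchanged (a plain SIGHUP-style reload is enough), and -1 if the
 * path cannot be stat()ed. */
int binary_changed(const char *path, time_t started_mtime)
{
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) < 0)
        return -1;
    return sb.st_mtime != started_mtime;
}
```

A caller would pass the path mentioned in the message, /usr/sbin/libvirtd, together with the value saved at startup, and fall back to the existing reload path when the result is 0.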

On Fri, Mar 21, 2008 at 04:50:45PM -0400, Daniel Veillard wrote:
> On Fri, Mar 21, 2008 at 05:35:12PM +0000, Daniel P. Berrange wrote:
> > The libvirt daemon has the ability to reload itself by sending it SIGHUP. For the QEMU & network drivers this makes it reload the config files for VMs and re-init the iptables rules. It would be desirable though to allow the daemon to perform a full restart. Principally this is for RPM upgrades where you want to ensure the daemon is running the new code.
> >
> > The tricky thing is figuring out how to handle driver state. Looking at the QEMU, network, storage and LXC drivers, there is not actually all that much state to deal with. It basically comes down to:
> >
> >  - PID of child processes (eg QEMU, dnsmasq, container)
> >  - FDs for STDIN/OUT/ERR of the child processes
> >  - A possible logfile FD
> >  - Flag to indicate whether some objects are active or not
> >
> > That is more or less it. Anything else is kept in the config files and can be reloaded at will.
>
> From the POV of a libvirt client connected to the driver, we would still see either a disconnection or a potential loss of state, depending on how it is connected, right? If we are sure we can restart transparently, fine, but I'm not sure that is always the case, for say an ssh connection without an agent. Still, being able to re-exec on the new code is important. I would still try to avoid it if we can detect that the code itself didn't change (for example, if the timestamp on /usr/sbin/libvirtd didn't change, it's likely to be a simple reload (-HUP) command).
Yes, it is an open question whether it would be necessary to keep clients open / functional. I'd probably argue that it should just kick off all clients when re-exec()ing. Clients can trivially re-connect & the libvirt API itself is stateless, so dropping & reconnecting is not a particularly hard thing to deal with from that POV.
> > So I was thinking about whether we could provide a simple protocol to allow each stateful driver to save its state into some location, the daemon could just 'exec()' itself again, and upon startup the drivers reload their active state. Since the daemon just exec()'s itself it would still own the child processes & still have all the necessary FDs open.
>
> Yes, but how much state is kept in buffers and in the code of the protocols?
The SSL / Kerberos protocols definitely have arbitrary internal state that would be impossible to preserve. So if we tried this approach we'd have to kill off active clients & let them reconnect.

Dan.
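
One way to get both behaviours discussed in this thread (child FDs survive the exec(), client connections are dropped) is the close-on-exec flag. The sketch below is an illustration under assumptions: both function names are hypothetical, and it assumes argv[0] holds a path that execv() can resolve. It is not how libvirtd is known to do it.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Mark an FD as inherited across exec (keep != 0) or closed on exec
 * (keep == 0). Child-process FDs would be kept; client sockets would
 * be closed, which disconnects the clients so they can reconnect.
 * Returns 0 on success, -1 on error. */
int fd_set_inherit(int fd, int keep)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    if (keep)
        flags &= ~FD_CLOEXEC;
    else
        flags |= FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) < 0 ? -1 : 0;
}

/* Replace the running image with the (possibly upgraded) binary.
 * The PID does not change, so existing children keep the same parent.
 * Only returns, with -1, if execv() fails. */
int daemon_reexec(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    execv(argv[0], argv);
    return -1;
}
```

The daemon would call fd_set_inherit() on every FD it tracks, save the driver state, and then call daemon_reexec() with its original argument vector.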
participants (2)
 - Daniel P. Berrange
 - Daniel Veillard