[libvirt] KVM processes -- should we be able to attach them to the libvirtd process?

Not too long ago we took a patch that allowed QEMU VMs to keep running even if libvirtd died or was restarted.
I was talking to Matt Farrellee (cc'd) this afternoon about manageability, and he feels fairly strongly that this behavior should be optional -- in other words, it should be possible to guarantee that if libvirtd dies, it will take all the VMs with the "die-with-libvirtd" flag set down with it.
I'm not sure this API is portable to Xen, but it would work on any hypervisor that represents the VM as a normal process.
Does this strike anyone else as useful behavior?
Thanks,
--Hugh

On Tuesday, 05 May 2009 at 22:13:38, Hugh O. Brock wrote:
> Not too long ago we took a patch that allowed QEMU VMs to keep running even if libvirtd died or was restarted.
> I was talking to Matt Farrellee (cc'd) this afternoon about manageability, and he feels fairly strongly that this behavior should be optional -- in other words, it should be possible to guarantee that if libvirtd dies, it will take all the VMs with the "die-with-libvirtd" flag set down with it.
> I'm not sure this API is portable to Xen, but it would work on any hypervisor that represents the VM as a normal process.
> Does this strike anyone else as useful behavior?
Not only useful but a must-have for me. Do you really want all your virtual machines to die when libvirtd dies? Not if you are doing anything serious. It also means that you can upgrade libvirt without stopping everything.
Łukasz Mierzwa

On Tue, May 05, 2009 at 04:13:38PM -0400, Hugh O. Brock wrote:
> Not too long ago we took a patch that allowed QEMU VMs to keep running even if libvirtd died or was restarted.
> I was talking to Matt Farrellee (cc'd) this afternoon about manageability, and he feels fairly strongly that this behavior should be optional -- in other words, it should be possible to guarantee that if libvirtd dies, it will take all the VMs with the "die-with-libvirtd" flag set down with it.
> I'm not sure this API is portable to Xen, but it would work on any hypervisor that represents the VM as a normal process.
> Does this strike anyone else as useful behavior?
This isn't really a model we want in the architecture. That the QEMU instances used to die when libvirtd died was an unfortunate artifact of the fact that QEMU was the parent process leader. These days all VMs are fully daemonized, so there is no parent/child relationship. In fact QEMU was really the odd-ball in this respect, because with Xen/OpenVZ/LXC and VirtualBox, VMs have always happily continued when libvirtd stopped or died, as do storage pools and virtual networks.
This is important because it ensures we can automatically restart the libvirtd daemon during RPM upgrades, and provides robustness should a bug cause the daemon to crash - the daemon can be trivially restarted and continue with no interruption to services being managed.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
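The two models argued over in this exchange can be illustrated with a minimal Python sketch. It is not libvirt code, and the thread does not say how a "die-with-libvirtd" flag would be implemented; the names spawn_vm and die_with_parent are hypothetical. The sketch assumes Linux, where prctl(PR_SET_PDEATHSIG) asks the kernel to signal a child when the parent that forked it exits. The default branch instead puts the child in its own session, which approximates the "fully daemonized" behaviour described above.

```python
import ctypes
import signal
import subprocess

PR_SET_PDEATHSIG = 1  # value from <linux/prctl.h>; Linux-only


def spawn_vm(argv, die_with_parent=False):
    """Start a process that stands in for a VM (hypothetical helper).

    die_with_parent=False: the child gets its own session, so it is not
    tied to the manager's process group or terminal and keeps running
    if the manager exits.
    die_with_parent=True: the kernel sends SIGKILL to the child when the
    manager process that forked it exits.
    """
    if not die_with_parent:
        return subprocess.Popen(argv, start_new_session=True)

    # Load libc in the parent so the hook below does as little as possible
    # between fork and exec.
    libc = ctypes.CDLL(None, use_errno=True)

    def arm_parent_death_signal():
        # Runs in the child just before exec.
        sig = ctypes.c_ulong(int(signal.SIGKILL))
        if libc.prctl(PR_SET_PDEATHSIG, sig) != 0:
            raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")

    return subprocess.Popen(argv, preexec_fn=arm_parent_death_signal)
```

Note that the parent-death signal only covers the case where the manager process actually exits; a manager that hangs leaves the child running, which is in line with the concern that relying on management failure to stop VMs is not fully reliable.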

Daniel P. Berrange wrote:
> On Tue, May 05, 2009 at 04:13:38PM -0400, Hugh O. Brock wrote:
>> Not too long ago we took a patch that allowed QEMU VMs to keep running even if libvirtd died or was restarted.
>> I was talking to Matt Farrellee (cc'd) this afternoon about manageability, and he feels fairly strongly that this behavior should be optional -- in other words, it should be possible to guarantee that if libvirtd dies, it will take all the VMs with the "die-with-libvirtd" flag set down with it.
>> I'm not sure this API is portable to Xen, but it would work on any hypervisor that represents the VM as a normal process.
>> Does this strike anyone else as useful behavior?
> This isn't really a model we want in the architecture. That the QEMU instances used to die when libvirtd died was an unfortunate artifact of the fact that QEMU was the parent process leader. These days all VMs are fully daemonized, so there is no parent/child relationship. In fact QEMU was really the odd-ball in this respect, because with Xen/OpenVZ/LXC and VirtualBox, VMs have always happily continued when libvirtd stopped or died, as do storage pools and virtual networks.
> This is important because it ensures we can automatically restart the libvirtd daemon during RPM upgrades, and provides robustness should a bug cause the daemon to crash - the daemon can be trivially restarted and continue with no interruption to services being managed.
> Daniel
It doesn't appear to be the case that the libvirtd daemon can trivially restart and continue with no interruptions. Right now it loses track of VMs.
In a scenario where VMs are not deployed and locked to specific physical nodes, it can be highly valuable to have ways to ensure a VM is no longer running when a layer of its management stops functioning.
Best,
matt
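For a restarted daemon to "continue" rather than lose track of VMs, it has to rediscover processes it no longer parents. A minimal Python sketch of one way to do that follows. It assumes, purely for illustration, that each VM leaves a "<name>.pid" file in a state directory; the directory layout and the names pid_alive and reconnect are hypothetical, and this is not a description of how libvirtd does it.

```python
import os
from pathlib import Path


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one PID
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permissions
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def reconnect(state_dir):
    """Rebuild a name -> PID table from '<name>.pid' files, skipping stale ones."""
    running = {}
    for pid_file in sorted(Path(state_dir).glob("*.pid")):
        try:
            pid = int(pid_file.read_text().strip())
        except ValueError:
            continue  # unreadable PID file: ignore rather than guess
        if pid_alive(pid):
            running[pid_file.stem] = pid
    return running
```

A PID on its own is a weak identity because the kernel can reuse it, so a real manager would need an additional check (for example, talking to the VM's control socket) before treating the process as the VM it expects.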

On Tue, May 05, 2009 at 11:38:13PM -0500, Matthew Farrellee wrote:
> Daniel P. Berrange wrote:
>> On Tue, May 05, 2009 at 04:13:38PM -0400, Hugh O. Brock wrote:
>>> Not too long ago we took a patch that allowed QEMU VMs to keep running even if libvirtd died or was restarted.
>>> I was talking to Matt Farrellee (cc'd) this afternoon about manageability, and he feels fairly strongly that this behavior should be optional -- in other words, it should be possible to guarantee that if libvirtd dies, it will take all the VMs with the "die-with-libvirtd" flag set down with it.
>>> I'm not sure this API is portable to Xen, but it would work on any hypervisor that represents the VM as a normal process.
>>> Does this strike anyone else as useful behavior?
>> This isn't really a model we want in the architecture. That the QEMU instances used to die when libvirtd died was an unfortunate artifact of the fact that QEMU was the parent process leader. These days all VMs are fully daemonized, so there is no parent/child relationship. In fact QEMU was really the odd-ball in this respect, because with Xen/OpenVZ/LXC and VirtualBox, VMs have always happily continued when libvirtd stopped or died, as do storage pools and virtual networks.
>> This is important because it ensures we can automatically restart the libvirtd daemon during RPM upgrades, and provides robustness should a bug cause the daemon to crash - the daemon can be trivially restarted and continue with no interruption to services being managed.
> It doesn't appear to be the case that the libvirtd daemon can trivially restart and continue with no interruptions. Right now it loses track of VMs.
That is a bug then. If you can reproduce it, please file a BZ ticket so we can track it down & fix it.
> In a scenario where VMs are not deployed and locked to specific physical nodes, it can be highly valuable to have ways to ensure a VM is no longer running when a layer of its management stops functioning.
IMHO this is a problem to be solved by clustering software. If the clustering software detects a failure with the management service, then it should power fence the entire node. Relying on management service failure to kill the VMs will never be reliable enough.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Daniel P. Berrange wrote:
> On Tue, May 05, 2009 at 11:38:13PM -0500, Matthew Farrellee wrote:
>> Daniel P. Berrange wrote:
>>> On Tue, May 05, 2009 at 04:13:38PM -0400, Hugh O. Brock wrote:
>>>> Not too long ago we took a patch that allowed QEMU VMs to keep running even if libvirtd died or was restarted.
>>>> I was talking to Matt Farrellee (cc'd) this afternoon about manageability, and he feels fairly strongly that this behavior should be optional -- in other words, it should be possible to guarantee that if libvirtd dies, it will take all the VMs with the "die-with-libvirtd" flag set down with it.
>>>> I'm not sure this API is portable to Xen, but it would work on any hypervisor that represents the VM as a normal process.
>>>> Does this strike anyone else as useful behavior?
>>> This isn't really a model we want in the architecture. That the QEMU instances used to die when libvirtd died was an unfortunate artifact of the fact that QEMU was the parent process leader. These days all VMs are fully daemonized, so there is no parent/child relationship. In fact QEMU was really the odd-ball in this respect, because with Xen/OpenVZ/LXC and VirtualBox, VMs have always happily continued when libvirtd stopped or died, as do storage pools and virtual networks.
>>> This is important because it ensures we can automatically restart the libvirtd daemon during RPM upgrades, and provides robustness should a bug cause the daemon to crash - the daemon can be trivially restarted and continue with no interruption to services being managed.
>> It doesn't appear to be the case that the libvirtd daemon can trivially restart and continue with no interruptions. Right now it loses track of VMs.
> That is a bug then. If you can reproduce it, please file a BZ ticket so we can track it down & fix it.
>> In a scenario where VMs are not deployed and locked to specific physical nodes, it can be highly valuable to have ways to ensure a VM is no longer running when a layer of its management stops functioning.
> IMHO this is a problem to be solved by clustering software. If the clustering software detects a failure with the management service, then it should power fence the entire node. Relying on management service failure to kill the VMs will never be reliable enough.
> Daniel
Assuming clustering software were the answer, it is often too specialized and does not scale nearly well enough. There are other alternatives to layers and layers of management software, but for many years layers have been what we get to work with.
Best,
matt

Hugh O. Brock wrote:
> Not too long ago we took a patch that allowed QEMU VMs to keep running even if libvirtd died or was restarted.
> I was talking to Matt Farrellee (cc'd) this afternoon about manageability, and he feels fairly strongly that this behavior should be optional -- in other words, it should be possible to guarantee that if libvirtd dies, it will take all the VMs with the "die-with-libvirtd" flag set down with it.
> I'm not sure this API is portable to Xen, but it would work on any hypervisor that represents the VM as a normal process.
> Does this strike anyone else as useful behavior?
> Thanks,
> --Hugh
Hello,
From my point of view, the KVM processes should under no circumstances die unless that is intended, i.e. one must shut down or destroy a VM on purpose. A normal restart of libvirt should do nothing but restart libvirt. The behaviour you mention would also make it impossible to update/upgrade libvirt without restarting all VMs.
Kind regards,
Gerrit Slomma
participants (5)
- Daniel P. Berrange
- Gerrit Slomma
- Hugh O. Brock
- Matthew Farrellee
- Łukasz Mierzwa