
On Thu, Dec 21, 2017 at 3:51 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
On Thu, Dec 21, 2017 at 11:22:48AM +0100, Sergio Lopez wrote:
I'm willing to work on a patch implementing this feature myself, but first I'd like to know if this sounds good to you. Also, if it does, do you think the behavior of coredump-[destroy|restart] should be changed to use gdb if available, falling back to qemuDomainCoreDump if it isn't, or just implement another action for on_crash?
I really don't like the idea of running GDB automatically in response to anything that can be triggered by the guest. GDB is not robust wrt parsing cores, so a suitably corrupted memory image of QEMU could be used to abuse GDB. Further, few production deployments of libvirt & QEMU will ever have GDB installed. Many organizations outright forbid installing such tools on production machines.
I'm thinking of this not as something to be enabled by default for the whole environment, but as a feature that would be temporarily enabled for debugging a certain misbehaving VM. Would be a great companion for softlockup_panic.
A much simpler way is to just have an "abort" action, which would cause libvirt to send SIGABRT to QEMU. This should cause QEMU's default SIGABRT handling to kick in, which will make the kernel trigger a coredump. If really wanted, you can then use abrt to catch this coredump.
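For illustration only, the mechanism can be sketched outside libvirt in a few lines of Python. This is not libvirt code: the helper names `abort_process` and `demo` are made up, and a sleeping child process stands in for QEMU. It only shows that a process left with the default SIGABRT disposition is terminated by the signal; whether the kernel actually writes a core file depends on the host's RLIMIT_CORE and core_pattern settings.

```python
import os
import signal
import subprocess
import sys


def abort_process(pid: int) -> None:
    """Send SIGABRT to the given PID (hypothetical helper, not a libvirt API)."""
    os.kill(pid, signal.SIGABRT)


def demo() -> int:
    """Spawn a stand-in process, abort it, and return its exit status."""
    # A sleeping child plays the role of the QEMU process here.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    abort_process(child.pid)
    # On POSIX, Popen.wait() returns -N when the child was killed by signal N.
    return child.wait()


if __name__ == "__main__":
    status = demo()
    print("child terminated by SIGABRT:", status == -signal.SIGABRT)
```

A tool such as abrt would then pick up the resulting core, if the system is configured to produce one.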
GDB would allow us to dump all memory mappings even for VMs running with dump-guest-core=off, but as this is something that already requires modifying the domain's definition, adding dumpCore=on too shouldn't be a problem.

I like the on_crash=abort approach. I'm going to work on it.

Thanks Daniel.

--
Sergio