On Thu, Dec 21, 2017 at 3:51 PM, Daniel P. Berrange <berrange(a)redhat.com> wrote:
On Thu, Dec 21, 2017 at 11:22:48AM +0100, Sergio Lopez wrote:
> I'm willing to work on a patch implementing this feature myself, but
> first I'd like to know if this sounds good to you. Also, if it does,
> do you think the behavior of coredump-[destroy|restart] should be
> changed to use gdb if available, falling back to qemuDomainCoreDump if
> it isn't, or just implement another action for on_crash?
I really don't like the idea of running GDB automatically in response to
anything that can be triggered by the guest. GDB is not robust wrt
parsing cores, so a suitably corrupted memory image of QEMU could be
used to abuse GDB. Further, few production deployments of libvirt & QEMU
will ever have GDB installed. Many organizations outright forbid
installing such tools on production machines.
I'm thinking of this not as something to be enabled by default for the
whole environment, but as a feature that would be temporarily enabled
for debugging a certain misbehaving VM. It would be a great companion
for softlockup_panic.
A much simpler way is to just have an "abort" action, which would cause
libvirt to send SIGABRT to QEMU. This should cause QEMU's default
SIGABRT handler to kick in, which will make the kernel trigger a coredump.
If really wanted, you can then use abrt to catch this coredump.
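
For illustration only, the effect of such an "abort" action can be sketched outside libvirt. The snippet below is not libvirt code: it starts a sleeping Python child as a stand-in for a QEMU process, sends it SIGABRT, and checks that it was terminated by that signal. The helper names (abort_process, spawn_stand_in, abort_and_wait) are made up for this sketch. Whether a core file is actually written depends on the process's RLIMIT_CORE and the kernel's core_pattern, which the sketch does not inspect.

```python
import os
import signal
import subprocess
import sys


def abort_process(pid: int) -> None:
    """Send SIGABRT to pid, the signal an "abort" action would deliver."""
    os.kill(pid, signal.SIGABRT)


def spawn_stand_in() -> subprocess.Popen:
    """Start a sleeping child that stands in for a QEMU process (assumption)."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


def abort_and_wait(proc: subprocess.Popen) -> int:
    """Abort the child and return its exit status as reported by subprocess."""
    abort_process(proc.pid)
    # On POSIX, a negative return code -N means "terminated by signal N".
    return proc.wait(timeout=10)


if __name__ == "__main__":
    child = spawn_stand_in()
    status = abort_and_wait(child)
    print("terminated by SIGABRT:", status == -signal.SIGABRT)
```

Since the child installs no handler for SIGABRT, the default disposition applies and the kernel terminates it, dumping core where the limits allow.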
GDB would allow us to dump all memory mappings even for VMs running
with dump-guest-core=off, but as this is something that already
requires modifying the domain's definition, adding dumpCore=on too
shouldn't be a problem.
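
As a rough sketch of what that change to the domain definition could look like, the snippet below edits a trimmed-down domain XML with Python's standard library. The sample XML and the helper name enable_crash_debugging are assumptions made for illustration, and "abort" is the on_crash value proposed in this thread, not one libvirt is known to accept at this point.

```python
import xml.etree.ElementTree as ET

# Trimmed-down, hypothetical domain definition used only for this sketch.
DOMAIN_XML = """
<domain type='kvm'>
  <name>demo</name>
  <memory unit='KiB'>1048576</memory>
  <on_crash>destroy</on_crash>
</domain>
"""


def enable_crash_debugging(domain_xml: str, crash_action: str) -> str:
    """Set dumpCore='on' on <memory> and replace the <on_crash> action."""
    root = ET.fromstring(domain_xml)

    memory = root.find("memory")
    if memory is None:
        raise ValueError("domain XML has no <memory> element")
    # Ask for guest memory to be included in QEMU core dumps.
    memory.set("dumpCore", "on")

    on_crash = root.find("on_crash")
    if on_crash is None:
        on_crash = ET.SubElement(root, "on_crash")
    # "abort" is the action proposed in this thread (an assumption here).
    on_crash.text = crash_action

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(enable_crash_debugging(DOMAIN_XML, "abort"))
```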
I like the on_crash=abort approach. I'm going to work on it. Thanks Daniel.
--
Sergio