Enabling core dumps is a reasonably straightforward task, but is not
documented clearly. This page provides an easy link to point users
to when they need to debug QEMU.
Signed-off-by: Daniel P. Berrangé <berrange(a)redhat.com>
---
docs/kbase/index.rst | 4 ++
docs/kbase/meson.build | 1 +
docs/kbase/qemu-core-dump.rst | 132 ++++++++++++++++++++++++++++++++++
3 files changed, 137 insertions(+)
create mode 100644 docs/kbase/qemu-core-dump.rst
diff --git a/docs/kbase/index.rst b/docs/kbase/index.rst
index 91083ee49d..372042886d 100644
--- a/docs/kbase/index.rst
+++ b/docs/kbase/index.rst
@@ -67,3 +67,7 @@ Internals / Debugging
`VM migration internals <migrationinternals.html>`__
VM migration implementation details, complementing the info in
`migration <../migration.html>`__
+
+`Capturing core dumps for QEMU <qemu-core-dump.html>`__
+ How to configure libvirt to enable capture of core dumps from
+ QEMU virtual machines
diff --git a/docs/kbase/meson.build b/docs/kbase/meson.build
index 7631b47018..6d17a83d1d 100644
--- a/docs/kbase/meson.build
+++ b/docs/kbase/meson.build
@@ -12,6 +12,7 @@ docs_kbase_files = [
'locking-sanlock',
'merging_disk_image_chains',
'migrationinternals',
+ 'qemu-core-dump',
'qemu-passthrough-security',
'rpm-deployment',
's390_protected_virt',
diff --git a/docs/kbase/qemu-core-dump.rst b/docs/kbase/qemu-core-dump.rst
new file mode 100644
index 0000000000..d27f81c4d6
--- /dev/null
+++ b/docs/kbase/qemu-core-dump.rst
@@ -0,0 +1,132 @@
+=============================
+Capturing core dumps for QEMU
+=============================
+
+The default behaviour for a QEMU virtual machine launched by libvirt is to
+have core dumps disabled. There can be times, however, when it is beneficial
+to collect a core dump to enable debugging.
+
+QEMU driver configuration
+=========================
+
+There is a global setting in the QEMU driver configuration file that controls
+whether core dumps are permitted, and their maximum size. Enabling core dumps
+is simply a matter of setting the maximum size to a non-zero value by editing
+the ``/etc/libvirt/qemu.conf`` file:
+
+::
+
+ max_core = "unlimited"
+
+For an ad hoc debugging session, setting the core dump size to "unlimited" is
+viable, on the assumption that the core dumps will be disabled again once the
+requisite information is collected. If the intention is to leave core dumps
+permanently enabled, more careful consideration of limits is required.
+
+Note that by default, a core dump will **NOT** include the guest RAM
+region, and so will only include memory regions used by QEMU for emulation
+backend purposes. This is expected to be sufficient for the vast majority
+of debugging needs.
+
+When there is a need to examine guest RAM, though, a further setting is
+available:
+
+::
+
+ dump_guest_core = 1
+
+This will of course result in core dumps that are as large as the biggest
+virtual machine on the host, potentially tens or even hundreds of GB in size.
+To allow more fine-grained control, it is possible to toggle this on a
+per-VM basis in the XML configuration.
+
+After changing either of the settings in ``/etc/libvirt/qemu.conf`` the daemon
+hosting the QEMU driver must be restarted. For deployments using the monolithic
+daemons, this means ``libvirtd``, while for those using modular daemons this
+means ``virtqemud``:
+
+::
+
+ systemctl restart libvirtd (for a monolithic deployment)
+ systemctl restart virtqemud (for a modular deployment)
+
+While libvirt attempts to make it possible to restart the daemons without
+negatively impacting running guests, there are some management operations
+that may get interrupted. In particular long running jobs like live
+migration or block device copy jobs may abort. It is thus wise to check
+that the host is mostly idle before restarting the daemons.
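+
+One illustrative way to check for active jobs before restarting is to query
+each running guest with ``virsh domjobinfo`` (this sketch assumes the
+default connection URI for the daemon in use)
+
+::
+
+    $ for dom in $(virsh list --name); do virsh domjobinfo "$dom"; done
+    Job type:         None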
+
+Guest core dump configuration
+=============================
+
+The ``dump_guest_core`` setting mentioned above will allow guest RAM to be
+included in core dumps for all virtual machines on the host. This may not
+be desirable, so it is also possible to control this on a per-virtual
+machine basis in the XML configuration:
+
+::
+
+ <memory dumpCore="on">...</memory>
+
+Note, it is still necessary to at least set ``max_core`` to a non-zero
+value in the global configuration file.
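+
+The per-VM setting can be applied by editing the persistent XML
+configuration with ``virsh`` (the domain name ``f30`` here is just an
+example)
+
+::
+
+    $ virsh edit f30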
+
+Some management applications may not offer the ability to customize the
+XML configuration for a guest. In such situations, using the global
+``dump_guest_core`` setting is the only option.
+
+Host OS core dump storage
+=========================
+
+The Linux kernel default behaviour is to write core dumps to a file in the
+current working directory of the process. This will not work with QEMU
+processes launched by libvirt, because their working directory is ``/``,
+which is not writable by the QEMU user.
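+
+Which handler is active can be checked via the kernel's ``core_pattern``
+tunable; on a systemd based host it will typically point at
+``systemd-coredump``, along the lines of
+
+::
+
+    $ cat /proc/sys/kernel/core_pattern
+    |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h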
+
+Most modern OS distros, however, now include systemd, which configures a
+custom core dump handler out of the box. When this is in effect, core dumps
+from QEMU can be seen using the ``coredumpctl`` commands:
+
+::
+
+    $ coredumpctl list -r
+    TIME                          PID UID GID SIG     COREFILE EXE                         SIZE
+    Tue 2021-07-20 12:12:52 BST 2649303 107 107 SIGABRT present /usr/bin/qemu-system-x86_64 1.8M
+    ...snip...
+
+    $ coredumpctl info 2649303
+               PID: 2649303 (qemu-system-x86)
+               UID: 107 (qemu)
+               GID: 107 (qemu)
+            Signal: 6 (ABRT)
+         Timestamp: Tue 2021-07-20 12:12:52 BST (48min ago)
+      Command Line: /usr/bin/qemu-system-x86_64 -name guest=f30,debug-threads=on ..snip... -msg timestamp=on
+        Executable: /usr/bin/qemu-system-x86_64
+     Control Group: /machine.slice/machine-qemu\x2d1\x2df30.scope/libvirt/emulator
+              Unit: machine-qemu\x2d1\x2df30.scope
+             Slice: machine.slice
+           Boot ID: 6b9015d0c05f4e7fbfe4197a2c7824a2
+        Machine ID: c78c8286d6d74b22ac0dd275975f9ced
+          Hostname: localhost.localdomain
+           Storage: /var/lib/systemd/coredump/core.qemu-system-x86.107.6b9015d0c05f4e7fbfe4197a2c7824a2.2649303.1626779572000000.zst (present)
+         Disk Size: 1.8M
+           Message: Process 2649303 (qemu-system-x86) of user 107 dumped core.
+
+                    Stack trace of thread 2649303:
+                    #0  0x00007ff3c32436be n/a (libc.so.6 + 0xf56be)
+                    #1  0x000055a949c0ed05 qemu_poll_ns (qemu-system-x86_64 + 0x7b0d05)
+                    #2  0x000055a949c0e476 main_loop_wait (qemu-system-x86_64 + 0x7b0476)
+                    #3  0x000055a949a36d27 qemu_main_loop (qemu-system-x86_64 + 0x5d8d27)
+                    #4  0x000055a94979e4d2 main (qemu-system-x86_64 + 0x3404d2)
+                    #5  0x00007ff3c3175b75 n/a (libc.so.6 + 0x27b75)
+                    #6  0x000055a9497a1f5e _start (qemu-system-x86_64 + 0x343f5e)
+
+                    Stack trace of thread 2649368:
+                    #0  0x00007ff3c32435bf n/a (libc.so.6 + 0xf55bf)
+                    #1  0x00007ff3c3af547c g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xa947c)
+                    #2  0x00007ff3c3aa0a93 g_main_loop_run (libglib-2.0.so.0 + 0x54a93)
+                    #3  0x00007ff3c17a727a red_worker_main.lto_priv.0 (libspice-server.so.1 + 0x5227a)
+                    #4  0x00007ff3c3326299 start_thread (libpthread.so.0 + 0x9299)
+                    #5  0x00007ff3c324e353 n/a (libc.so.6 + 0x100353)
+
+    ...snip...
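+
+Once the crashed process has been identified, the core dump can be
+extracted to a file, or opened directly under a debugger, using further
+``coredumpctl`` sub-commands
+
+::
+
+    $ coredumpctl dump 2649303 -o qemu.core
+    $ coredumpctl debug 2649303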
--
2.31.1