[libvirt] [PATCH 0/6] Allow debug dump in case of crashes

This patch set main goal is to allow dumping of full debug informations for operations occuring in libvirt daemon before a crash. The principle is to save all logs to a round-robbin memory buffer (which we already do, except we never use that buffer), save the general daemon logs to a libvirtd.log file for which we keep an open descriptor, and upon reception of a fatal signal, save the memory buffer directly to the file descriptor and fsync it. There is quite a few remarks about this, we already have that buffer but we don't log everything in, only what's output, that's what patch 1 changes, then we need to provide a function to dump this, so there is a new internal API for it. We also need to be cautious about what system call we operate from the signal handler, but basically we should limit ourselves here to write/fsync and sigaction which are safe there dixit POSIX. Something like killall -USR2 libvirtd allows to see the kind of output one get, an idle libvirtd is quiet, but handle/timer/fdpolls tend to be very verbose, maybe we need an intermediate debug level, but that would also impact the API. Maybe now that this part is well debugged some of those could be suppressed or commented off. Also the buffer is a statically allocated 64KB, maybe this could be made more flexible but IMHO that's not fundamental. Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On Thu, Mar 03, 2011 at 06:26:19PM +0800, Daniel Veillard wrote:
This patch set main goal is to allow dumping of full debug informations for operations occuring in libvirt daemon before a crash. The principle is to save all logs to a round-robbin memory buffer (which we already do, except we never use that buffer), save the general daemon logs to a libvirtd.log file for which we keep an open descriptor, and upon reception of a fatal signal, save the memory buffer directly to the file descriptor and fsync it.
There is quite a few remarks about this, we already have that buffer but we don't log everything in, only what's output, that's what patch 1 changes, then we need to provide a function to dump this, so there is a new internal API for it. We also need to be cautious about what system call we operate from the signal handler, but basically we should limit ourselves here to write/fsync and sigaction which are safe there dixit POSIX.
Something like killall -USR2 libvirtd allows to see the kind of output one get, an idle libvirtd is quiet, but handle/timer/fdpolls tend to be very verbose, maybe we need an intermediate debug level, but that would also impact the API. Maybe now that this part is well debugged some of those could be suppressed or commented off.
Also the buffer is a statically allocated 64KB, maybe this could be made more flexible but IMHO that's not fundamental.
64 KB is unlikely to be sufficiently large if libvirt is running under RHEV with moderate load. With a fairly strict log filter of '1:libvirt 1:util 1:qemu' RHEV generates > 300 MB of logs in ~10 minutes. This works out at approx 512 KB per second so a 64 KB buffer will only capture 125 milliseconds at that log filter level, and far far less if collecting full debug logs Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Mar 03, 2011 at 11:48:51AM +0000, Daniel P. Berrange wrote:
On Thu, Mar 03, 2011 at 06:26:19PM +0800, Daniel Veillard wrote:
Something like killall -USR2 libvirtd allows to see the kind of output one get, an idle libvirtd is quiet, but handle/timer/fdpolls tend to be very verbose, maybe we need an intermediate debug level, but that would also impact the API. Maybe now that this part is well debugged some of those could be suppressed or commented off.
Also the buffer is a statically allocated 64KB, maybe this could be made more flexible but IMHO that's not fundamental.
64 KB is unlikely to be sufficiently large if libvirt is running under RHEV with moderate load. With a fairly strict log filter of '1:libvirt 1:util 1:qemu' RHEV generates > 300 MB of logs in ~10 minutes. This works out at approx 512 KB per second so a 64 KB buffer will only capture 125 milliseconds at that log filter level, and far far less if collecting full debug logs
yeah, I was afraid of this kind of scenarios, and went for the minimal approach since the static buffer is already in the current code. Problem is that I don't think we should really rely on an user provided value, and getting a dynamic algorithm (keep say at least one second worth of log) sounds hard to implement right. I was somehow hoping that some of the background noises could be reduced by further patches. Another option would be to apply a specific filter (which could be empty by default) for the debug buffer, and that is rather trivial to implement. Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On Thu, Mar 03, 2011 at 10:00:14PM +0800, Daniel Veillard wrote:
On Thu, Mar 03, 2011 at 11:48:51AM +0000, Daniel P. Berrange wrote:
On Thu, Mar 03, 2011 at 06:26:19PM +0800, Daniel Veillard wrote:
Something like killall -USR2 libvirtd allows to see the kind of output one get, an idle libvirtd is quiet, but handle/timer/fdpolls tend to be very verbose, maybe we need an intermediate debug level, but that would also impact the API. Maybe now that this part is well debugged some of those could be suppressed or commented off.
Also the buffer is a statically allocated 64KB, maybe this could be made more flexible but IMHO that's not fundamental.
64 KB is unlikely to be sufficiently large if libvirt is running under RHEV with moderate load. With a fairly strict log filter of '1:libvirt 1:util 1:qemu' RHEV generates > 300 MB of logs in ~10 minutes. This works out at approx 512 KB per second so a 64 KB buffer will only capture 125 milliseconds at that log filter level, and far far less if collecting full debug logs
yeah, I was afraid of this kind of scenarios, and went for the minimal approach since the static buffer is already in the current code. Problem is that I don't think we should really rely on an user provided value, and getting a dynamic algorithm (keep say at least one second worth of log) sounds hard to implement right. I was somehow hoping that some of the background noises could be reduced by further patches. Another option would be to apply a specific filter (which could be empty by default) for the debug buffer, and that is rather trivial to implement.
We could make a configurable in libvirtd.conf for buffer size. eg to set a 1 MB ringbuffer, users could do: log_ringbuffer = 1048576 Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
participants (2)
-
Daniel P. Berrange
-
Daniel Veillard