We have had sporadic reports of the following failure:
# virsh capabilities
error: failed to get capabilities
error: server closed connection:
This normally means that libvirtd has crashed, closing the connection,
but in these cases libvirtd remained running. It turns out that the
capabilities XML was too large for the remote RPC message size, so XDR
serialization failed and libvirtd closed the client connection
immediately. The XML was so large because we were not handling an edge
case in libnuma, where it returns a CPU mask of all-1s to indicate a
non-existent node.
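
As a rough illustration of that edge case, the sketch below treats a
CPU mask with every bit set as a non-existent node and reports no CPUs
for it. The helper names (node_mask_is_all_ones, node_cpu_count) and
the plain unsigned long array are assumptions made for this example;
they are not libvirt or libnuma functions, and the real fix may look
different.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of libvirt or libnuma: returns true when
 * every bit of every word in the mask is set, which is how the edge case
 * described above flags a non-existent node. */
static bool node_mask_is_all_ones(const unsigned long *mask, size_t nwords)
{
    assert(mask != NULL);
    for (size_t i = 0; i < nwords; i++) {
        if (mask[i] != ULONG_MAX)
            return false;
    }
    /* An empty mask is not considered all-1s. */
    return nwords > 0;
}

/* Hypothetical helper: number of CPUs to report for a node. A node whose
 * mask is all-1s is treated as absent and contributes no CPUs, so it adds
 * nothing to the capabilities document. */
static size_t node_cpu_count(const unsigned long *mask, size_t nwords)
{
    if (node_mask_is_all_ones(mask, nwords))
        return 0;

    size_t count = 0;
    for (size_t i = 0; i < nwords; i++) {
        /* Clear the lowest set bit on each pass, counting as we go. */
        for (unsigned long w = mask[i]; w != 0; w &= w - 1)
            count++;
    }
    return count;
}
```

One caveat with this approach: a real node whose CPUs happen to fill
the entire mask would also be skipped, so the check assumes the mask
is wider than the number of CPUs actually present.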
This series makes four changes:
- Add explicit warnings in places where XDR serialization fails,
  so we see an indication of the problem in /var/log/messages
- Try to send a real remote_error back to the client, instead of
  closing its connection
- Add logging of the capabilities XML in libvirt.c so we can identify
  the oversized document in libvirtd
- Add a fix to cope with the all-1s node mask
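
The first two items can be pictured with the sketch below: when the
payload does not fit in the message, log a warning and build a small
error reply instead of dropping the connection. This is only a toy
model. MSG_MAX, struct reply, encode_payload and build_reply are all
made up for the example, and the bounded copy merely stands in for
real XDR encoding; none of it is the actual libvirt code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical limit, far smaller than any real RPC message size. */
#define MSG_MAX 64

enum reply_kind { REPLY_OK, REPLY_ERROR };

struct reply {
    enum reply_kind kind;
    size_t len;
    char body[MSG_MAX];
};

/* Stand-in for XDR encoding: returns -1 if the payload does not fit in
 * the fixed-size message body, 0 on success. */
static int encode_payload(struct reply *out, const char *payload)
{
    size_t len = strlen(payload);
    if (len >= sizeof(out->body))
        return -1;
    memcpy(out->body, payload, len + 1);
    out->len = len;
    out->kind = REPLY_OK;
    return 0;
}

/* Build the reply for a client. On encoding failure, warn and fall back
 * to a short error reply rather than giving up on the connection. */
static void build_reply(struct reply *out, const char *payload)
{
    assert(out != NULL && payload != NULL);
    if (encode_payload(out, payload) == 0)
        return;

    fprintf(stderr, "warning: failed to serialize reply of %zu bytes (limit %d)\n",
            strlen(payload), MSG_MAX - 1);

    /* The error text is short enough to always fit in the body. */
    int n = snprintf(out->body, sizeof(out->body), "reply too large to serialize");
    out->len = (size_t)n;
    out->kind = REPLY_ERROR;
}
```

With this shape the client receives an explicit error it can report,
and the warning gives the administrator something to find in the log.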