On 10/16/20 9:37 AM, Lentes, Bernd wrote:
Hi,
I have some questions concerning the destroying of domains, and I hope this is the
right place to ask. If not, sorry for the disturbance.
Laine already answered your questions about code navigation. I'll provide a few
random comments.
I'm running a two node HA cluster with pacemaker and KVM domains
as resources.
From time to time, when I try to stop a domain with the cluster manager, the stop
does not work, so the domain is destroyed.
That's ok.
But occasionally, and at irregular intervals, destroy does not work either, so the node
this domain is running on is fenced.
Have you investigated why the associated qemu process is not responding to
SIGKILL? Is it blocked on uninterruptible I/O? E.g. are there any hints in
/proc/<qemu-pid>/stack or /proc/<qemu-pid>/wchan?
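
As a rough sketch of that check (Python, assuming a Linux host; the function
names and the proc_root parameter are my own illustration, not part of libvirt
or qemu), something like the following collects the process state, wchan and
kernel stack for a given pid. A state starting with "D" means uninterruptible
sleep, where SIGKILL is not acted on until the blocking call returns. Reading
/proc/<pid>/stack normally requires root, so unreadable files are reported as
None rather than raising.

```python
import os
import sys


def read_proc_file(pid, name, proc_root="/proc"):
    """Return the stripped contents of <proc_root>/<pid>/<name>, or None if unreadable."""
    path = os.path.join(proc_root, str(pid), name)
    try:
        with open(path, "r") as f:
            return f.read().strip()
    except OSError:
        # Missing file, vanished process, or insufficient privileges.
        return None


def process_state(pid, proc_root="/proc"):
    """Return the 'State:' field of <pid>/status, e.g. 'D (disk sleep)', or None."""
    status = read_proc_file(pid, "status", proc_root)
    if status is None:
        return None
    for line in status.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].strip()
    return None


def inspect_process(pid, proc_root="/proc"):
    """Collect the hints that show whether a process is stuck in the kernel."""
    state = process_state(pid, proc_root)
    return {
        "state": state,
        # 'D' is uninterruptible sleep; signals are not delivered in this state.
        "uninterruptible": bool(state) and state.startswith("D"),
        "wchan": read_proc_file(pid, "wchan", proc_root),
        "stack": read_proc_file(pid, "stack", proc_root),
    }


if __name__ == "__main__":
    # Usage: python3 inspect_qemu.py <qemu-pid>   (script name is hypothetical)
    target = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    for key, value in inspect_process(target).items():
        print("%s: %s" % (key, value))
```

Running this as root against the qemu pid while the destroy is hanging should
show whether the process is in D state and, if so, which kernel function it is
waiting in.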
That's ugly. Fencing is the worst thing that can happen to a cluster,
and I try to avoid it.
Maybe destroy does not work because of heavy load; I'm currently examining that.
I installed the libvirt-4.0.0 source package; I am running SLES 12 SP4.
Laine mentioned trying a newer libvirt. While that is possible, it can be
difficult in practice due to missing or insufficient build dependencies. E.g.
SLES12 SP4 does not have meson. Creating a SLE12 SP5 based build container with
all dependencies to build upstream libvirt is on my todo list. ATM I was only
considering doing this for the latest SLE12 service pack.
Regards,
Jim