Hi Peter,
thanks for your help.
----- On Jun 5, 2019, at 9:27 AM, Peter Krempa pkrempa(a)redhat.com wrote:
> =============================================================
> ...
> 2019-05-31 20:31:34.481+0000: 4170: error : qemuMonitorIO:719 : internal error:
> End of file from qemu monitor
> 2019-06-01 01:05:32.233+0000: 4170: error : qemuMonitorIO:719 : internal error:
> End of file from qemu monitor
This message is printed if qemu crashes for some reason and then closes
the monitor socket unexpectedly.
> 2019-06-01 01:05:43.804+0000: 22605: warning : qemuGetProcessInfo:1461 : cannot
> parse process status data
> 2019-06-01 01:05:43.848+0000: 22596: warning : qemuGetProcessInfo:1461 : cannot
> parse process status data
> 2019-06-01 01:06:11.438+0000: 26112: warning :
> qemuDomainObjBeginJobInternal:4865 : Cannot start job (destroy, none) for doma
> in severin; current job is (modify, none) owned by (5372
> remoteDispatchDomainBlockJobAbort, 0 <null>) for (39s, 0s)
> 2019-06-01 01:06:11.438+0000: 26112: error : qemuDomainObjBeginJobInternal:4877
> : Timed out during operation: cannot acquire
> state change lock (held by remoteDispatchDomainBlockJobAbort)
So this means that the virDomainBlockJobAbort API which is also used for
--pivot got stuck for some time.
This is kind of strange if the VM crashed, there might also be a bug in
the synchronous block job handling, but it's hard to tell from this log.
The VM didn't crash. It kept running.
See "last":
root pts/49 ha-idg-2.scidom. Tue Jun 4 14:02 - 13:18 (23:16)
root pts/47 pc60337.scidom.d Mon Jun 3 15:13 still logged in
reboot system boot 2.6.4-52-smp Wed May 15 20:19 (20+17:02)
reboot system boot 2.6.4-52-smp Fri Mar 15 17:38 (81+18:44)
reboot system boot 2.6.4-52-smp Wed Feb 27 20:29 (15+21:04)
> The syslog from the domain itself didn't reveal anything, it
just continues to
> run.
> The libvirt log from the domains just says:
> qemu-system-x86_64: block/mirror.c:864: mirror_run: Assertion
> `((&bs->tracked_requests)->lh_first == ((void *)0))' failed.
So that's interresting. Usually assertion failure in qemu leads to
calling abort() and thus the vm would have crashed. Didn't you HA
solution restart it?
No. As said the VM didn't crash. It kept running.
At any rate it would be really beneficial if you could collect debug
logs for libvirtd which also contain the monitor interactions with qemu:
https://wiki.libvirt.org/page/DebugLogs
The qemu assertion failure above should ideally be reported to qemu, but
if you are able to reproduce the problem with libvirtd debug logs
enabled I can extract more useful info from there which the qemu project
would ask you anyways.
I can't reproduce it. It seems to happen accidentally. But i can collect the logs. Do
they get very large ?
I can contact you the next time it happen. Is that ok for you ?
Bernd
Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Stellv. Aufsichtsratsvorsitzender: MinDirig. Dr. Manfred Wolter
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin
Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671