On Tue, Jun 04, 2019 at 14:44:29 +0200, Lentes, Bernd wrote:
Hi,
I have several domains running on a 2-node HA cluster.
Each night I create snapshots of the domains; after copying the consistent raw file to a
CIFS server I blockcommit the changes back into the raw files.
That has been running quite well.
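In outline the nightly procedure looks roughly like this (the snapshot name, the raw
image path and the CIFS mount point below are just placeholders, the real script
differs in details):
===============================================================
# external disk-only snapshot; new writes go to the .sn overlay
virsh snapshot-create-as severin nightly --disk-only \
      --diskspec vdb,file=/mnt/snap/severin.sn --atomic
# the raw base image is quiescent now and gets copied to the CIFS share
cp /var/lib/libvirt/images/severin.raw /mnt/cifs/backup/
# afterwards the overlay is merged back and the domain pivots to the raw file again
virsh blockcommit severin /mnt/snap/severin.sn --verbose --active --pivot
===============================================================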
But recently the blockcommit didn't work for one domain.
I create a logfile of the whole procedure:
===============================================================
...
Sat Jun 1 03:05:24 CEST 2019
Target     Source
------------------------------------------------
vdb        /mnt/snap/severin.sn
hdc        -
/usr/bin/virsh blockcommit severin /mnt/snap/severin.sn --verbose --active --pivot
Block commit: [ 0 %]Block commit: [ 15 %]Block commit: [ 28 %]Block commit: [ 35 %]
Block commit: [ 43 %]Block commit: [ 53 %]Block commit: [ 63 %]Block commit: [ 73 %]
Block commit: [ 82 %]Block commit: [ 89 %]Block commit: [ 98 %]Block commit: [100 %]
Target     Source
------------------------------------------------
vdb        /mnt/snap/severin.sn
...
==============================================================
The libvirtd log says (timestamps are in UTC, IIRC):
=============================================================
...
2019-05-31 20:31:34.481+0000: 4170: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor
2019-06-01 01:05:32.233+0000: 4170: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor
This message is printed if qemu crashes for some reason and then closes
the monitor socket unexpectedly.
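If you want to verify whether the qemu process really went away at that point,
something along these lines should show it (the per-domain log path below assumes
the default location used by the qemu driver):
===============================================================
virsh domstate severin --reason
# per-domain qemu log, default location
tail -n 50 /var/log/libvirt/qemu/severin.log
# journal around the time of the EOF message
journalctl -u libvirtd --since "2019-06-01 01:00" --until "2019-06-01 01:10"
===============================================================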
2019-06-01 01:05:43.804+0000: 22605: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:05:43.848+0000: 22596: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:06:11.438+0000: 26112: warning : qemuDomainObjBeginJobInternal:4865 : Cannot start job (destroy, none) for domain severin; current job is (modify, none) owned by (5372 remoteDispatchDomainBlockJobAbort, 0 <null>) for (39s, 0s)
2019-06-01 01:06:11.438+0000: 26112: error : qemuDomainObjBeginJobInternal:4877 : Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainBlockJobAbort)
So this means that the virDomainBlockJobAbort API, which is also used for
--pivot, got stuck for some time.
This is kind of strange if the VM crashed; there might also be a bug in
the synchronous block job handling, but it's hard to tell from this log.
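For what it's worth, should this happen again you can inspect the job from a second
shell before pivoting; roughly (the disk name is taken from your domblklist output
above):
===============================================================
# show whether the commit job is still running and how far it got
virsh blockjob severin vdb --info
# once the job has converged, this is the same pivot that --pivot performs
virsh blockjob severin vdb --pivot
===============================================================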
2019-06-01 01:06:13.976+0000: 5369: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:06:14.028+0000: 22596: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:06:44.165+0000: 5371: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:06:44.218+0000: 22605: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:07:14.343+0000: 5369: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:07:14.387+0000: 22598: warning : qemuGetProcessInfo:1461 : cannot parse process status data
2019-06-01 01:07:44.495+0000: 22605: warning : qemuGetProcessInfo:1461 : cannot parse process status data
...
===========================================================
and "cannot parse process status data" continuously until the end of the
logfile.
The syslog from the domain itself didn't reveal anything; it just continues to run.
The libvirt log of the domain just says:
qemu-system-x86_64: block/mirror.c:864: mirror_run: Assertion `((&bs->tracked_requests)->lh_first == ((void *)0))' failed.
So that's interesting. Usually an assertion failure in qemu leads to a
call to abort() and thus the VM would have crashed. Didn't your HA
solution restart it?
At any rate, it would be really beneficial if you could collect debug
logs for libvirtd which also contain the monitor interactions with qemu:
https://wiki.libvirt.org/page/DebugLogs
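For reference, the settings from that page boil down to something like the following
in /etc/libvirt/libvirtd.conf (the exact filter list depends on the libvirt version,
so please take the values from the page):
===============================================================
# /etc/libvirt/libvirtd.conf
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
# restart libvirtd afterwards so the settings take effect
===============================================================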
The qemu assertion failure above should ideally be reported to qemu, but
if you are able to reproduce the problem with libvirtd debug logs
enabled, I can extract more useful info from them, which the qemu project
would ask you for anyway.