When gathering stats for block devices in the bulk stats API, libvirt
used the fallback code that accesses the files directly both when the
VM was offline and when qemu didn't return the stats at all.

If qemu is not cooperating because it is stuck on an inaccessible NFS
share, libvirt would then attempt to read the files itself and get
stuck too, with the VM object locked. All other APIs would eventually
get stuck waiting on the VM lock.

Avoid this problem by skipping the block stats if the VM is online but
the monitor did not provide any stats.
Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1337073
---
src/qemu/qemu_driver.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
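
For illustration only, the decision this patch introduces can be modelled
as a small standalone C function. The enum, its values and the function
name are hypothetical and exist only in this sketch; they are not libvirt
API. The boolean arguments stand in for virDomainObjIsActive(dom), a
non-NULL stats hash, and a successful virHashLookup(stats, alias).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names used only in this sketch; not part of libvirt. */
typedef enum {
    BLOCK_STATS_FALLBACK, /* VM offline: read the files directly */
    BLOCK_STATS_SKIP,     /* VM online, no monitor data: report nothing */
    BLOCK_STATS_MONITOR   /* VM online, monitor data is available */
} blockStatsSource;

/* Mirrors the order of the checks in the patched function: the offline
 * check comes first, so the file-based fallback is never reached for a
 * running VM whose monitor returned no data. */
static blockStatsSource
chooseBlockStatsSource(bool vmActive,
                       bool haveStats,
                       const char *alias,
                       bool haveEntry)
{
    if (!vmActive)
        return BLOCK_STATS_FALLBACK;

    if (!haveStats || !alias || !haveEntry)
        return BLOCK_STATS_SKIP;

    return BLOCK_STATS_MONITOR;
}
```

The point of the ordering is that a missing alias or a missing hash entry
on a running VM now yields no block stats rather than a potentially
blocking read of the underlying storage.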
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index ee50577..d244fd3 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -19387,13 +19387,22 @@ qemuDomainGetStatsOneBlock(virQEMUDriverPtr driver,
QEMU_ADD_BLOCK_PARAM_UI(record, maxparams, block_idx, "backingIndex",
backing_idx);
- /* use fallback path if data is not available */
- if (!stats || !alias || !(entry = virHashLookup(stats, alias))) {
+ /* the VM is offline so we have to go and load the stats from the disk by
+ * ourselves */
+ if (!virDomainObjIsActive(dom)) {
ret = qemuDomainGetStatsOneBlockFallback(driver, cfg, dom, record,
maxparams, src, block_idx);
goto cleanup;
}
+ /* If qemu didn't provide the stats we stop here rather than
+ * trying to refresh the stats from the disk. Inability to provide stats is
+ * usually caused by blocked storage so this would make libvirtd hang */
+ if (!stats || !alias || !(entry = virHashLookup(stats, alias))) {
+ ret = 0;
+ goto cleanup;
+ }
+
QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx,
"rd.reqs", entry->rd_req);
QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx,
--
2.8.2