[libvirt] [PATCH 00/12] more progress towards backing chain allocation stats

This is in response to my earlier v2 patch attempt at exposing qcow2 allocation watermarks during a block commit (although that series attempted to expand virDomainGetXMLDesc, while this one expands virDomainListGetStats instead): https://www.redhat.com/archives/libvir-list/2014-November/msg00589.html It is also a followup to my preliminiaries (where I had review comments to split patch 2/2, now 3 separate patches in this series): https://www.redhat.com/archives/libvir-list/2014-November/msg00932.html There's enough new content here that I didn't know whether to call this a v3 of the original approach, but as the original approach was rejected on design without much review, it feels like this is a new series, even if a couple of patches from the v2 series ended up in this series in a slightly different form. See patch 12 for a sample of the planned output; I still have a few more patches to polish to properly get qemu JSON output wired into the command for correct values, but this should at least be sufficient to write up client code that interacts with the interface. The patches can also be grabbed from: git fetch git://repo.or.cz/libvirt/ericb.git getstats or browsed at: http://repo.or.cz/w/libvirt/ericb.git/tree/refs/heads/getstats Eric Blake (12): getstats: avoid memory leak on OOM getstats: improve documentation qemu: refactor blockinfo job handling qemu: let blockinfo reuse virStorageSource qemu: refactor blockinfo data gathering getstats: start giving offline block stats getstats: report block sizes for offline domains getstats: add block.n.path stat getstats: prepare for dynamic block.count stat getstats: add new flag for block backing chain getstats: split block stats reporting for easier recursion getstats: start crawling backing chain for qemu include/libvirt/libvirt-domain.h | 1 + src/libvirt-domain.c | 24 ++- src/qemu/qemu_driver.c | 415 ++++++++++++++++++++++++++------------- src/util/virstoragefile.h | 3 +- tools/virsh-domain-monitor.c | 7 + tools/virsh.pod | 10 +- 6 files changed, 315 insertions(+), 145 deletions(-) -- 1.9.3

qemuDomainGetStatsBlock() could leak a stats hash table if it encountered OOM while populating the virTypedParameters. Oddly, the fix doesn't even touch qemuDomainGetStatsBlock :) * src/qemu/qemu_driver.c (QEMU_ADD_COUNT_PARAM) (QEMU_ADD_NAME_PARAM): Don't return early. (qemuDomainGetStatsInterface): Adjust caller. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 9152cf5..a7b208f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -18371,7 +18371,7 @@ do { \ maxparams, \ param_name, \ count) < 0) \ - return -1; \ + goto cleanup; \ } while (0) #define QEMU_ADD_NAME_PARAM(record, maxparams, type, num, name) \ @@ -18384,7 +18384,7 @@ do { \ maxparams, \ param_name, \ name) < 0) \ - return -1; \ + goto cleanup; \ } while (0) #define QEMU_ADD_NET_PARAM(record, maxparams, num, name, value) \ @@ -18448,6 +18448,7 @@ qemuDomainGetStatsInterface(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, "tx.drop", tmp.tx_drop); } + cleanup: return 0; } -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
qemuDomainGetStatsBlock() could leak a stats hash table if it encountered OOM while populating the virTypedParameters. Oddly, the fix doesn't even touch qemuDomainGetStatsBlock :)
* src/qemu/qemu_driver.c (QEMU_ADD_COUNT_PARAM) (QEMU_ADD_NAME_PARAM): Don't return early. (qemuDomainGetStatsInterface): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 9152cf5..a7b208f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -18371,7 +18371,7 @@ do { \ maxparams, \ param_name, \ count) < 0) \ - return -1; \ + goto cleanup; \ } while (0)
#define QEMU_ADD_NAME_PARAM(record, maxparams, type, num, name) \ @@ -18384,7 +18384,7 @@ do { \ maxparams, \ param_name, \ name) < 0) \ - return -1; \ + goto cleanup; \ } while (0)
#define QEMU_ADD_NET_PARAM(record, maxparams, num, name, value) \ @@ -18448,6 +18448,7 @@ qemuDomainGetStatsInterface(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, "tx.drop", tmp.tx_drop); }
+ cleanup:
With this change qemuDomainGetStatsInterface will return success even in case when the OOM occurred. You need to add a 'ret' variable set to -1 and set it to 0 just before cleanup, so that the error gets propagated.
return 0; }
ACK if you keep reporting the error in OOM case in qemuDomainGetStatsInterface() Peter

On 12/08/2014 03:50 AM, Peter Krempa wrote:
On 12/06/14 09:14, Eric Blake wrote:
qemuDomainGetStatsBlock() could leak a stats hash table if it encountered OOM while populating the virTypedParameters. Oddly, the fix doesn't even touch qemuDomainGetStatsBlock :)
+ cleanup:
With this change qemuDomainGetStatsInterface will return success even in case when the OOM occurred. You need to add a 'ret' variable set to -1 and set it to 0 just before cleanup, so that the error gets propagated.
Oh right :)
ACK if you keep reporting the error in OOM case in qemuDomainGetStatsInterface()
Done and pushed: diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index a7b208f..ed8e140 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -18409,6 +18409,7 @@ qemuDomainGetStatsInterface(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, { size_t i; struct _virDomainInterfaceStats tmp; + int ret = -1; if (!virDomainObjIsActive(dom)) return 0; @@ -18448,8 +18449,9 @@ qemuDomainGetStatsInterface(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, "tx.drop", tmp.tx_drop); } + ret = 0; cleanup: - return 0; + return ret; } #undef QEMU_ADD_NET_PARAM -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

At least with 'virsh domstats --block' on an offline domain, we currently output no stats even though we recognize the stat category. Although a later patch will improve this situation, it is better to document that this is expected behavior. Also, while the current implementation rejects filtering flags for virDomainListGetStats, this limitation may be lifted in the future and we do not enforce it at the API level. * src/libvirt-domain.c (virConnectGetAllDomainStats): Document that recognized stats might not be reported. (virDomainListGetStats): Likewise, and tweak filtering documentation. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/libvirt-domain.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 8333183..87123d1 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -10940,7 +10940,11 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * * Specifying VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS as @flags makes * the function return error in case some of the stat types in @stats were - * not recognized by the daemon. + * not recognized by the daemon. However, even with this flag, a hypervisor + * may omit individual fields within a known group if the information is not + * available; as an extreme example, a supported group may produce zero + * fields for offline domains if the statistics are meaningful only for a + * running domain. * * Similarly to virConnectListAllDomains, @flags can contain various flags to * filter the list of domains to provide stats for. @@ -11020,9 +11024,13 @@ virConnectGetAllDomainStats(virConnectPtr conn, * * Specifying VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS as @flags makes * the function return error in case some of the stat types in @stats were - * not recognized by the daemon. + * not recognized by the daemon. However, even with this flag, a hypervisor + * may omit individual fields within a known group if the information is not + * available; as an extreme example, a supported group may produce zero + * fields for offline domains if the statistics are meaningful only for a + * running domain. * - * Note that any of the domain list filtering flags in @flags will be rejected + * Note that any of the domain list filtering flags in @flags may be rejected * by this function. * * Returns the count of returned statistics structures on success, -1 on error. -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
At least with 'virsh domstats --block' on an offline domain, we currently output no stats even though we recognize the stat category. Although a later patch will improve this situation, it is better to document that this is expected behavior.
Also, while the current implementation rejects filtering flags for virDomainListGetStats, this limitation may be lifted in the future and we do not enforce it at the API level.
* src/libvirt-domain.c (virConnectGetAllDomainStats): Document that recognized stats might not be reported. (virDomainListGetStats): Likewise, and tweak filtering documentation.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/libvirt-domain.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
ACK, Peter

In order for a future patch to virDomainListGetStats to reuse some code for determining disk usage of offline domains, we need to make it easier to pull out part of the guts of grabbing blockinfo. The current implementation grabs a job fairly late in the game, while getstats will already own a job; reordering things so that the job is always grabbed up front in both functions will make it easier to pull out the common code. This patch results in grabbing a job in cases where one was not previously needed, but as it is a query job, it should not be noticeably slower. This patch touches the same code as the fix for CVE-2014-6458 (commit b799259); in that patch, we avoided hotplug changing a disk reference during the time of obtaining a monitor lock by copying all data we needed and no longer referencing disk; this patch goes the other way and ensures that by holding the job, the disk cannot be changed so we no longer need to worry about the disk being invalidated across the monitor lock. * src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Rearrange job control to be outside of disk information. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 62 +++++++++++++++++++++++--------------------------- 1 file changed, 29 insertions(+), 33 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a7b208f..ae4485a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10999,7 +10999,6 @@ qemuDomainGetBlockInfo(virDomainPtr dom, int format; bool activeFail = false; virQEMUDriverConfigPtr cfg = NULL; - char *alias = NULL; char *buf = NULL; ssize_t len; @@ -11018,11 +11017,19 @@ qemuDomainGetBlockInfo(virDomainPtr dom, goto cleanup; } + /* Technically, we only need a job if we are going to query the + * monitor, which is only for active domains that are using + * non-raw block devices. But it is easier to share code if we + * always grab a job; furthermore, grabbing the job ensures that + * hot-plug won't change disk behind our backs. */ + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) + goto cleanup; + /* Check the path belongs to this domain. */ if ((idx = virDomainDiskIndexByName(vm->def, path, false)) < 0) { virReportError(VIR_ERR_INVALID_ARG, _("invalid path %s not assigned to domain"), path); - goto cleanup; + goto endjob; } disk = vm->def->disks[idx]; @@ -11032,36 +11039,36 @@ qemuDomainGetBlockInfo(virDomainPtr dom, virReportError(VIR_ERR_INVALID_ARG, _("disk '%s' does not currently have a source assigned"), path); - goto cleanup; + goto endjob; } if ((fd = qemuOpenFile(driver, vm, disk->src->path, O_RDONLY, NULL, NULL)) == -1) - goto cleanup; + goto endjob; if (fstat(fd, &sb) < 0) { virReportSystemError(errno, _("cannot stat file '%s'"), disk->src->path); - goto cleanup; + goto endjob; } if ((len = virFileReadHeaderFD(fd, VIR_STORAGE_MAX_HEADER, &buf)) < 0) { virReportSystemError(errno, _("cannot read header '%s'"), disk->src->path); - goto cleanup; + goto endjob; } } else { if (virStorageFileInitAs(disk->src, cfg->user, cfg->group) < 0) - goto cleanup; + goto endjob; if ((len = virStorageFileReadHeader(disk->src, VIR_STORAGE_MAX_HEADER, &buf)) < 0) - goto cleanup; + goto endjob; if (virStorageFileStat(disk->src, &sb) < 0) { virReportSystemError(errno, _("failed to stat remote file '%s'"), NULLSTR(disk->src->path)); - goto cleanup; + goto endjob; } } @@ -11073,17 +11080,17 @@ qemuDomainGetBlockInfo(virDomainPtr dom, virReportError(VIR_ERR_INTERNAL_ERROR, _("no disk format for %s and probing is disabled"), path); - goto cleanup; + goto endjob; } if ((format = virStorageFileProbeFormatFromBuf(disk->src->path, buf, len)) < 0) - goto cleanup; + goto endjob; } if (!(meta = virStorageFileGetMetadataFromBuf(disk->src->path, buf, len, format, NULL))) - goto cleanup; + goto endjob; /* Get info for normal formats */ if (S_ISREG(sb.st_mode) || fd == -1) { @@ -11105,7 +11112,7 @@ qemuDomainGetBlockInfo(virDomainPtr dom, if (end == (off_t)-1) { virReportSystemError(errno, _("failed to seek to end of %s"), path); - goto cleanup; + goto endjob; } info->physical = end; info->capacity = end; @@ -11133,35 +11140,24 @@ qemuDomainGetBlockInfo(virDomainPtr dom, if (!virDomainObjIsActive(vm)) { activeFail = true; ret = 0; - goto cleanup; + goto endjob; } - if (VIR_STRDUP(alias, disk->info.alias) < 0) - goto cleanup; + qemuDomainObjEnterMonitor(driver, vm); + ret = qemuMonitorGetBlockExtent(priv->mon, + disk->info.alias, + &info->allocation); + qemuDomainObjExitMonitor(driver, vm); - if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) - goto cleanup; - - if (virDomainObjIsActive(vm)) { - qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorGetBlockExtent(priv->mon, - alias, - &info->allocation); - qemuDomainObjExitMonitor(driver, vm); - } else { - activeFail = true; - ret = 0; - } - - if (!qemuDomainObjEndJob(driver, vm)) - vm = NULL; } else { ret = 0; } + endjob: + if (!qemuDomainObjEndJob(driver, vm)) + vm = NULL; cleanup: VIR_FREE(buf); - VIR_FREE(alias); virStorageSourceFree(meta); VIR_FORCE_CLOSE(fd); if (disk) -- 1.9.3

On 12/06/2014 03:14 AM, Eric Blake wrote:
In order for a future patch to virDomainListGetStats to reuse some code for determining disk usage of offline domains, we need to make it easier to pull out part of the guts of grabbing blockinfo. The current implementation grabs a job fairly late in the game, while getstats will already own a job; reordering things so that the job is always grabbed up front in both functions will make it easier to pull out the common code. This patch results in grabbing a job in cases where one was not previously needed, but as it is a query job, it should not be noticeably slower.
This patch touches the same code as the fix for CVE-2014-6458 (commit b799259); in that patch, we avoided hotplug changing a disk reference during the time of obtaining a monitor lock by copying all data we needed and no longer referencing disk; this patch goes the other way and ensures that by holding the job, the disk cannot be changed so we no longer need to worry about the disk being invalidated across the monitor lock.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Rearrange job control to be outside of disk information.
Ran thru my Coverity checker...
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 62 +++++++++++++++++++++++--------------------------- 1 file changed, 29 insertions(+), 33 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a7b208f..ae4485a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10999,7 +10999,6 @@ qemuDomainGetBlockInfo(virDomainPtr dom, int format; bool activeFail = false; virQEMUDriverConfigPtr cfg = NULL; - char *alias = NULL; char *buf = NULL; ssize_t len;
@@ -11018,11 +11017,19 @@ qemuDomainGetBlockInfo(virDomainPtr dom, goto cleanup; }
+ /* Technically, we only need a job if we are going to query the + * monitor, which is only for active domains that are using + * non-raw block devices. But it is easier to share code if we + * always grab a job; furthermore, grabbing the job ensures that + * hot-plug won't change disk behind our backs. */ + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) + goto cleanup; + /* Check the path belongs to this domain. */ if ((idx = virDomainDiskIndexByName(vm->def, path, false)) < 0) { virReportError(VIR_ERR_INVALID_ARG, _("invalid path %s not assigned to domain"), path); - goto cleanup; + goto endjob; }
disk = vm->def->disks[idx]; @@ -11032,36 +11039,36 @@ qemuDomainGetBlockInfo(virDomainPtr dom, virReportError(VIR_ERR_INVALID_ARG, _("disk '%s' does not currently have a source assigned"), path); - goto cleanup; + goto endjob; }
if ((fd = qemuOpenFile(driver, vm, disk->src->path, O_RDONLY, NULL, NULL)) == -1) - goto cleanup; + goto endjob;
if (fstat(fd, &sb) < 0) { virReportSystemError(errno, _("cannot stat file '%s'"), disk->src->path); - goto cleanup; + goto endjob; }
if ((len = virFileReadHeaderFD(fd, VIR_STORAGE_MAX_HEADER, &buf)) < 0) { virReportSystemError(errno, _("cannot read header '%s'"), disk->src->path); - goto cleanup; + goto endjob; } } else { if (virStorageFileInitAs(disk->src, cfg->user, cfg->group) < 0) - goto cleanup; + goto endjob;
if ((len = virStorageFileReadHeader(disk->src, VIR_STORAGE_MAX_HEADER, &buf)) < 0) - goto cleanup; + goto endjob;
if (virStorageFileStat(disk->src, &sb) < 0) { virReportSystemError(errno, _("failed to stat remote file '%s'"), NULLSTR(disk->src->path)); - goto cleanup; + goto endjob; } }
@@ -11073,17 +11080,17 @@ qemuDomainGetBlockInfo(virDomainPtr dom, virReportError(VIR_ERR_INTERNAL_ERROR, _("no disk format for %s and probing is disabled"), path); - goto cleanup; + goto endjob; }
if ((format = virStorageFileProbeFormatFromBuf(disk->src->path, buf, len)) < 0) - goto cleanup; + goto endjob; }
if (!(meta = virStorageFileGetMetadataFromBuf(disk->src->path, buf, len, format, NULL))) - goto cleanup; + goto endjob;
/* Get info for normal formats */ if (S_ISREG(sb.st_mode) || fd == -1) { @@ -11105,7 +11112,7 @@ qemuDomainGetBlockInfo(virDomainPtr dom, if (end == (off_t)-1) { virReportSystemError(errno, _("failed to seek to end of %s"), path); - goto cleanup; + goto endjob; } info->physical = end; info->capacity = end; @@ -11133,35 +11140,24 @@ qemuDomainGetBlockInfo(virDomainPtr dom, if (!virDomainObjIsActive(vm)) { activeFail = true; ret = 0; - goto cleanup; + goto endjob; }
- if (VIR_STRDUP(alias, disk->info.alias) < 0) - goto cleanup; + qemuDomainObjEnterMonitor(driver, vm); + ret = qemuMonitorGetBlockExtent(priv->mon, + disk->info.alias, + &info->allocation); + qemuDomainObjExitMonitor(driver, vm);
- if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) - goto cleanup; - - if (virDomainObjIsActive(vm)) { - qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorGetBlockExtent(priv->mon, - alias, - &info->allocation); - qemuDomainObjExitMonitor(driver, vm); - } else { - activeFail = true; - ret = 0; - } - - if (!qemuDomainObjEndJob(driver, vm)) - vm = NULL; } else { ret = 0; }
+ endjob: + if (!qemuDomainObjEndJob(driver, vm)) + vm = NULL;
This makes a virObjectUnlock(vm); in the following cleanup section unhappy... 11215 if (!qemuDomainObjEndJob(driver, vm)) (18) Event assign_zero: Assigning: "vm" = "NULL". Also see events: [var_deref_model] 11216 vm = NULL;
cleanup: VIR_FREE(buf); - VIR_FREE(alias); virStorageSourceFree(meta); VIR_FORCE_CLOSE(fd); if (disk)
11225 } (21) Event var_deref_model: Passing null pointer "vm" to "virObjectUnlock", which dereferences it. (The dereference is assumed on the basis of the 'nonnull' parameter attribute.) Also see events: [assign_zero] 11226 virObjectUnlock(vm);

Right now, grabbing blockinfo always calls stat on the disk, then opens the image to determine the capacity, using a throw-away virStorageSourcePtr. This has a couple of drawbacks: 1. We are calling stat and opening a file on every invocation of the API. However, there are cases where the stats should NOT be changing between successive calls (if a domain is running, no one should be changing the physical size of a block device or raw image behind our backs; capacity of read-only files should not be changing; and we are the gateway to the block-resize command to know when the capacity of read-write files should be changing). True, we still have to use stat in some cases (a sparse raw file changes allocation if it is read-write and the amount of holes is changing, and a read-write qcow2 image stored in a file changes physical size if it was not fully pre-allocated). But for read-only images, even this should be something we can remember from the previous time, rather than repeating every call. 2. We want to enhance the power of virDomainListGetStats, by sharing code. But we already have a virStorageSourcePtr for each disk, and it would be easier to reuse the common structure than to have to worry about the one-off virDomainBlockInfoPtr. While this patch does not optimize reuse of information in point 1, it does get us closer to being able to do so; by updating a structure that survives between consecutive calls. * src/util/virstoragefile.h (_virStorageSource): Add physical, to mirror virDomainBlockInfo. * src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Store into storage source, then copy to block info. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 42 ++++++++++++++++++++++++++++++++++-------- src/util/virstoragefile.h | 3 ++- 2 files changed, 36 insertions(+), 9 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ae4485a..e873362 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11034,6 +11034,26 @@ qemuDomainGetBlockInfo(virDomainPtr dom, disk = vm->def->disks[idx]; + /* FIXME: For an offline domain, we always want to check current + * on-disk statistics (as users have been known to change offline + * images behind our backs). For a running domain, however, it + * would be nice to avoid opening a file (particularly since + * reading a file while qemu is writing it risks the reader seeing + * bogus data), or even avoid a stat, if the information + * remembered from the prevoius run is still viable. + * + * For read-only disks, nothing should be changing unless the user + * has requested a block-commit action. For read-write disks, we + * know some special cases: capacity should not change without a + * block-resize (where capacity is the only stat that requires + * opening a file, and even then, only for non-raw files); and + * physical size of a raw image or of a block device should + * likewise not be changing without block-resize. On the other + * hand, allocation of a raw file can change (if the file is + * sparse, but the amount of sparseness changes due to writes or + * punching holes), and physical size of a non-raw file can + * change. + */ if (virStorageSourceIsLocalStorage(disk->src)) { if (!disk->src->path) { virReportError(VIR_ERR_INVALID_ARG, @@ -11095,15 +11115,15 @@ qemuDomainGetBlockInfo(virDomainPtr dom, /* Get info for normal formats */ if (S_ISREG(sb.st_mode) || fd == -1) { #ifndef WIN32 - info->physical = (unsigned long long)sb.st_blocks * + disk->src->physical = (unsigned long long)sb.st_blocks * (unsigned long long)DEV_BSIZE; #else - info->physical = sb.st_size; + disk->src->physical = sb.st_size; #endif /* Regular files may be sparse, so logical size (capacity) is not same * as actual physical above */ - info->capacity = sb.st_size; + disk->src->capacity = sb.st_size; } else { /* NB. Because we configure with AC_SYS_LARGEFILE, off_t should * be 64 bits on all platforms. @@ -11114,17 +11134,17 @@ qemuDomainGetBlockInfo(virDomainPtr dom, _("failed to seek to end of %s"), path); goto endjob; } - info->physical = end; - info->capacity = end; + disk->src->physical = end; + disk->src->capacity = end; } /* If the file we probed has a capacity set, then override * what we calculated from file/block extents */ if (meta->capacity) - info->capacity = meta->capacity; + disk->src->capacity = meta->capacity; /* Set default value .. */ - info->allocation = info->physical; + disk->src->allocation = disk->src->physical; /* ..but if guest is not using raw disk format and on a block device, * then query highest allocated extent from QEMU @@ -11146,13 +11166,19 @@ qemuDomainGetBlockInfo(virDomainPtr dom, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorGetBlockExtent(priv->mon, disk->info.alias, - &info->allocation); + &disk->src->allocation); qemuDomainObjExitMonitor(driver, vm); } else { ret = 0; } + if (ret == 0) { + info->capacity = disk->src->capacity; + info->allocation = disk->src->allocation; + info->physical = disk->src->physical; + } + endjob: if (!qemuDomainObjEndJob(driver, vm)) vm = NULL; diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index e05b843..b4c3808 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -257,8 +257,9 @@ struct _virStorageSource { virStoragePermsPtr perms; virStorageTimestampsPtr timestamps; - unsigned long long allocation; /* in bytes, 0 if unknown */ unsigned long long capacity; /* in bytes, 0 if unknown */ + unsigned long long allocation; /* in bytes, 0 if unknown */ + unsigned long long physical; /* in bytes, 0 if unknown */ size_t nseclabels; virSecurityDeviceLabelDefPtr *seclabels; -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
Right now, grabbing blockinfo always calls stat on the disk, then opens the image to determine the capacity, using a throw-away virStorageSourcePtr. This has a couple of drawbacks:
1. We are calling stat and opening a file on every invocation of the API. However, there are cases where the stats should NOT be changing between successive calls (if a domain is running, no one should be changing the physical size of a block device or raw image behind our backs; capacity of read-only files should not be changing; and we are the gateway to the block-resize command to know when the capacity of read-write files should be changing). True, we still have to use stat in some cases (a sparse raw file changes allocation if it is read-write and the amount of holes is changing, and a read-write qcow2 image stored in a file changes physical size if it was not fully pre-allocated). But for read-only images, even this should be something we can remember from the previous time, rather than repeating every call.
2. We want to enhance the power of virDomainListGetStats, by sharing code. But we already have a virStorageSourcePtr for each disk, and it would be easier to reuse the common structure than to have to worry about the one-off virDomainBlockInfoPtr.
While this patch does not optimize reuse of information in point 1, it does get us closer to being able to do so; by updating a structure that survives between consecutive calls.
* src/util/virstoragefile.h (_virStorageSource): Add physical, to mirror virDomainBlockInfo. * src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Store into storage source, then copy to block info.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 42 ++++++++++++++++++++++++++++++++++-------- src/util/virstoragefile.h | 3 ++- 2 files changed, 36 insertions(+), 9 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ae4485a..e873362 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11034,6 +11034,26 @@ qemuDomainGetBlockInfo(virDomainPtr dom,
disk = vm->def->disks[idx];
+ /* FIXME: For an offline domain, we always want to check current + * on-disk statistics (as users have been known to change offline + * images behind our backs). For a running domain, however, it + * would be nice to avoid opening a file (particularly since + * reading a file while qemu is writing it risks the reader seeing + * bogus data), or even avoid a stat, if the information + * remembered from the prevoius run is still viable. + * + * For read-only disks, nothing should be changing unless the user + * has requested a block-commit action. For read-write disks, we + * know some special cases: capacity should not change without a + * block-resize (where capacity is the only stat that requires + * opening a file, and even then, only for non-raw files); and + * physical size of a raw image or of a block device should + * likewise not be changing without block-resize. On the other + * hand, allocation of a raw file can change (if the file is + * sparse, but the amount of sparseness changes due to writes or + * punching holes), and physical size of a non-raw file can + * change.
For a live VM we should grab all of the above directly from the monitor and not ever touch the files on the disk. We do that already for the bulk stats and for getting the right size when doing storage migration. This function unfortunately is legacy code compared to the stuff I've pointed out
+ */ if (virStorageSourceIsLocalStorage(disk->src)) { if (!disk->src->path) { virReportError(VIR_ERR_INVALID_ARG, @@ -11095,15 +11115,15 @@ qemuDomainGetBlockInfo(virDomainPtr dom, /* Get info for normal formats */ if (S_ISREG(sb.st_mode) || fd == -1) { #ifndef WIN32 - info->physical = (unsigned long long)sb.st_blocks * + disk->src->physical = (unsigned long long)sb.st_blocks * (unsigned long long)DEV_BSIZE; #else - info->physical = sb.st_size; + disk->src->physical = sb.st_size; #endif /* Regular files may be sparse, so logical size (capacity) is not same * as actual physical above */ - info->capacity = sb.st_size; + disk->src->capacity = sb.st_size; } else { /* NB. Because we configure with AC_SYS_LARGEFILE, off_t should * be 64 bits on all platforms. @@ -11114,17 +11134,17 @@ qemuDomainGetBlockInfo(virDomainPtr dom, _("failed to seek to end of %s"), path); goto endjob; } - info->physical = end; - info->capacity = end; + disk->src->physical = end; + disk->src->capacity = end; }
/* If the file we probed has a capacity set, then override * what we calculated from file/block extents */ if (meta->capacity) - info->capacity = meta->capacity; + disk->src->capacity = meta->capacity;
/* Set default value .. */ - info->allocation = info->physical; + disk->src->allocation = disk->src->physical;
/* ..but if guest is not using raw disk format and on a block device, * then query highest allocated extent from QEMU @@ -11146,13 +11166,19 @@ qemuDomainGetBlockInfo(virDomainPtr dom, qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorGetBlockExtent(priv->mon, disk->info.alias, - &info->allocation); + &disk->src->allocation); qemuDomainObjExitMonitor(driver, vm);
} else { ret = 0; }
+ if (ret == 0) { + info->capacity = disk->src->capacity; + info->allocation = disk->src->allocation; + info->physical = disk->src->physical; + } + endjob: if (!qemuDomainObjEndJob(driver, vm)) vm = NULL;
diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index e05b843..b4c3808 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -257,8 +257,9 @@ struct _virStorageSource {
virStoragePermsPtr perms; virStorageTimestampsPtr timestamps; - unsigned long long allocation; /* in bytes, 0 if unknown */
Spurious move?
unsigned long long capacity; /* in bytes, 0 if unknown */ + unsigned long long allocation; /* in bytes, 0 if unknown */ + unsigned long long physical; /* in bytes, 0 if unknown */ size_t nseclabels; virSecurityDeviceLabelDefPtr *seclabels;
Also an addition to virStorageSourceCopy is missing. ACK with the tweak to virStorageSourceCopy Peter

On 12/08/2014 06:03 AM, Peter Krempa wrote:
On 12/06/14 09:14, Eric Blake wrote:
Right now, grabbing blockinfo always calls stat on the disk, then opens the image to determine the capacity, using a throw-away virStorageSourcePtr. This has a couple of drawbacks:
+ * For read-only disks, nothing should be changing unless the user + * has requested a block-commit action. For read-write disks, we + * know some special cases: capacity should not change without a + * block-resize (where capacity is the only stat that requires + * opening a file, and even then, only for non-raw files); and + * physical size of a raw image or of a block device should + * likewise not be changing without block-resize. On the other + * hand, allocation of a raw file can change (if the file is + * sparse, but the amount of sparseness changes due to writes or + * punching holes), and physical size of a non-raw file can + * change.
For a live VM we should grab all of the above directly from the monitor and not ever touch the files on the disk. We do that already for the bulk stats and for getting the right size when doing storage migration.
Agreed. On my next spin, I'll see about getting block info for live vms done right. (But first, I'm still in the middle of building a scratch image with this series, if only so that we have an easier binary to play with for integration and testing purposes).
This function unfortunately is legacy code compared to the stuff I've pointed out
Yes, blockinfo is quite old, so here's hoping I don't break any semantics in touching it.
+++ b/src/util/virstoragefile.h @@ -257,8 +257,9 @@ struct _virStorageSource {
virStoragePermsPtr perms; virStorageTimestampsPtr timestamps; - unsigned long long allocation; /* in bytes, 0 if unknown */
Spurious move?
unsigned long long capacity; /* in bytes, 0 if unknown */ + unsigned long long allocation; /* in bytes, 0 if unknown */ + unsigned long long physical; /* in bytes, 0 if unknown */
You know, Adam asked the same question last time :) https://www.redhat.com/archives/libvir-list/2014-November/msg00599.html It's intentional, to match the order in the public header for block info. But I'll update the commit message to make it obvious.
size_t nseclabels; virSecurityDeviceLabelDefPtr *seclabels;
Also an addition to virStorageSourceCopy is missing.
ACK with the tweak to virStorageSourceCopy
I'll probably wait on this patch until I see how the refactoring goes for splitting offline vs. online stat gathering. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Create a helper function that can be reused for gathering block info from virDomainListGetStats. * src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Split guts... (qemuStorageLimitsRefresh): ...into new helper function. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 196 ++++++++++++++++++++++++++----------------------- 1 file changed, 105 insertions(+), 91 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e873362..1e254bc 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10981,59 +10981,25 @@ qemuDomainMemoryPeek(virDomainPtr dom, } +/* Refresh the capacity and allocation limits of a given storage + * source. Assumes that the caller has already obtained a domain job. + * Set *activeFail to true if data cannot be obtained because a + * transient guest is no longer active. */ static int -qemuDomainGetBlockInfo(virDomainPtr dom, - const char *path, - virDomainBlockInfoPtr info, - unsigned int flags) +qemuStorageLimitsRefresh(virQEMUDriverPtr driver, virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, virDomainDiskDefPtr disk, + virStorageSourcePtr src, const char *path, + bool *activeFail) { - virQEMUDriverPtr driver = dom->conn->privateData; - virDomainObjPtr vm; int ret = -1; int fd = -1; off_t end; virStorageSourcePtr meta = NULL; - virDomainDiskDefPtr disk = NULL; struct stat sb; - int idx; - int format; - bool activeFail = false; - virQEMUDriverConfigPtr cfg = NULL; + int format = src->format; char *buf = NULL; ssize_t len; - virCheckFlags(0, -1); - - if (!(vm = qemuDomObjFromDomain(dom))) - return -1; - - cfg = virQEMUDriverGetConfig(driver); - - if (virDomainGetBlockInfoEnsureACL(dom->conn, vm->def) < 0) - goto cleanup; - - if (!path || path[0] == '\0') { - virReportError(VIR_ERR_INVALID_ARG, "%s", _("NULL or empty path")); - goto cleanup; - } - - /* Technically, we only need a job if we are going to query the - * monitor, which is only for active domains that are using - * non-raw block devices. But it is easier to share code if we - * always grab a job; furthermore, grabbing the job ensures that - * hot-plug won't change disk behind our backs. */ - if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) - goto cleanup; - - /* Check the path belongs to this domain. */ - if ((idx = virDomainDiskIndexByName(vm->def, path, false)) < 0) { - virReportError(VIR_ERR_INVALID_ARG, - _("invalid path %s not assigned to domain"), path); - goto endjob; - } - - disk = vm->def->disks[idx]; - /* FIXME: For an offline domain, we always want to check current * on-disk statistics (as users have been known to change offline * images behind our backs). For a running domain, however, it @@ -11054,76 +11020,74 @@ qemuDomainGetBlockInfo(virDomainPtr dom, * punching holes), and physical size of a non-raw file can * change. */ - if (virStorageSourceIsLocalStorage(disk->src)) { - if (!disk->src->path) { + if (virStorageSourceIsLocalStorage(src)) { + if (!src->path) { virReportError(VIR_ERR_INVALID_ARG, _("disk '%s' does not currently have a source assigned"), path); - goto endjob; + goto cleanup; } - if ((fd = qemuOpenFile(driver, vm, disk->src->path, O_RDONLY, + if ((fd = qemuOpenFile(driver, vm, src->path, O_RDONLY, NULL, NULL)) == -1) - goto endjob; + goto cleanup; if (fstat(fd, &sb) < 0) { virReportSystemError(errno, - _("cannot stat file '%s'"), disk->src->path); - goto endjob; + _("cannot stat file '%s'"), src->path); + goto cleanup; } if ((len = virFileReadHeaderFD(fd, VIR_STORAGE_MAX_HEADER, &buf)) < 0) { virReportSystemError(errno, _("cannot read header '%s'"), - disk->src->path); - goto endjob; + src->path); + goto cleanup; } } else { - if (virStorageFileInitAs(disk->src, cfg->user, cfg->group) < 0) - goto endjob; + if (virStorageFileInitAs(src, cfg->user, cfg->group) < 0) + goto cleanup; - if ((len = virStorageFileReadHeader(disk->src, VIR_STORAGE_MAX_HEADER, + if ((len = virStorageFileReadHeader(src, VIR_STORAGE_MAX_HEADER, &buf)) < 0) - goto endjob; + goto cleanup; - if (virStorageFileStat(disk->src, &sb) < 0) { + if (virStorageFileStat(src, &sb) < 0) { virReportSystemError(errno, _("failed to stat remote file '%s'"), - NULLSTR(disk->src->path)); - goto endjob; + NULLSTR(src->path)); + goto cleanup; } } /* Probe for magic formats */ - if (virDomainDiskGetFormat(disk)) { - format = virDomainDiskGetFormat(disk); - } else { + if (!format) { if (!cfg->allowDiskFormatProbing) { virReportError(VIR_ERR_INTERNAL_ERROR, _("no disk format for %s and probing is disabled"), path); - goto endjob; + goto cleanup; } - if ((format = virStorageFileProbeFormatFromBuf(disk->src->path, + if ((format = virStorageFileProbeFormatFromBuf(src->path, buf, len)) < 0) - goto endjob; + goto cleanup; } - if (!(meta = virStorageFileGetMetadataFromBuf(disk->src->path, buf, len, + if (!(meta = virStorageFileGetMetadataFromBuf(src->path, buf, len, format, NULL))) - goto endjob; + goto cleanup; /* Get info for normal formats */ if (S_ISREG(sb.st_mode) || fd == -1) { #ifndef WIN32 - disk->src->physical = (unsigned long long)sb.st_blocks * + src->physical = (unsigned long long)sb.st_blocks * (unsigned long long)DEV_BSIZE; #else - disk->src->physical = sb.st_size; + src->physical = sb.st_size; #endif /* Regular files may be sparse, so logical size (capacity) is not same * as actual physical above */ - disk->src->capacity = sb.st_size; + src->capacity = sb.st_size; } else { /* NB. Because we configure with AC_SYS_LARGEFILE, off_t should * be 64 bits on all platforms. @@ -11132,24 +11096,24 @@ qemuDomainGetBlockInfo(virDomainPtr dom, if (end == (off_t)-1) { virReportSystemError(errno, _("failed to seek to end of %s"), path); - goto endjob; + goto cleanup; } - disk->src->physical = end; - disk->src->capacity = end; + src->physical = end; + src->capacity = end; } /* If the file we probed has a capacity set, then override * what we calculated from file/block extents */ if (meta->capacity) - disk->src->capacity = meta->capacity; + src->capacity = meta->capacity; /* Set default value .. */ - disk->src->allocation = disk->src->physical; + src->allocation = src->physical; /* ..but if guest is not using raw disk format and on a block device, * then query highest allocated extent from QEMU */ - if (virStorageSourceGetActualType(disk->src) == VIR_STORAGE_TYPE_BLOCK && + if (virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_BLOCK && format != VIR_STORAGE_FILE_RAW && S_ISBLK(sb.st_mode)) { qemuDomainObjPrivatePtr priv = vm->privateData; @@ -11158,37 +11122,87 @@ qemuDomainGetBlockInfo(virDomainPtr dom, * depends on whether domain is persistent */ if (!virDomainObjIsActive(vm)) { - activeFail = true; + *activeFail = true; ret = 0; - goto endjob; + goto cleanup; } qemuDomainObjEnterMonitor(driver, vm); ret = qemuMonitorGetBlockExtent(priv->mon, disk->info.alias, - &disk->src->allocation); + &src->allocation); qemuDomainObjExitMonitor(driver, vm); } else { ret = 0; } - - if (ret == 0) { - info->capacity = disk->src->capacity; - info->allocation = disk->src->allocation; - info->physical = disk->src->physical; - } - - endjob: - if (!qemuDomainObjEndJob(driver, vm)) - vm = NULL; cleanup: VIR_FREE(buf); virStorageSourceFree(meta); VIR_FORCE_CLOSE(fd); - if (disk) - virStorageFileDeinit(disk->src); + virStorageFileDeinit(src); + return ret; +} + +static int +qemuDomainGetBlockInfo(virDomainPtr dom, + const char *path, + virDomainBlockInfoPtr info, + unsigned int flags) +{ + virQEMUDriverPtr driver = dom->conn->privateData; + virDomainObjPtr vm; + int ret = -1; + virDomainDiskDefPtr disk = NULL; + int idx; + bool activeFail = false; + virQEMUDriverConfigPtr cfg = NULL; + + virCheckFlags(0, -1); + + if (!(vm = qemuDomObjFromDomain(dom))) + return -1; + + cfg = virQEMUDriverGetConfig(driver); + + if (virDomainGetBlockInfoEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if (!path || path[0] == '\0') { + virReportError(VIR_ERR_INVALID_ARG, "%s", _("NULL or empty path")); + goto cleanup; + } + + /* Technically, we only need a job if we are going to query the + * monitor, which is only for active domains that are using + * non-raw block devices. But it is easier to share code if we + * always grab a job; furthermore, grabbing the job ensures that + * hot-plug won't change disk behind our backs. */ + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0) + goto cleanup; + + /* Check the path belongs to this domain. */ + if ((idx = virDomainDiskIndexByName(vm->def, path, false)) < 0) { + virReportError(VIR_ERR_INVALID_ARG, + _("invalid path %s not assigned to domain"), path); + goto endjob; + } + + disk = vm->def->disks[idx]; + + if ((ret = qemuStorageLimitsRefresh(driver, cfg, vm, disk, disk->src, path, + &activeFail)) < 0) + goto endjob; + + info->capacity = disk->src->capacity; + info->allocation = disk->src->allocation; + info->physical = disk->src->physical; + + endjob: + if (!qemuDomainObjEndJob(driver, vm)) + vm = NULL; + cleanup: /* If we failed to get data from a domain because it's inactive and * it's not a persistent domain, then force failure. */ -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
Create a helper function that can be reused for gathering block info from virDomainListGetStats.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Split guts... (qemuStorageLimitsRefresh): ...into new helper function.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 196 ++++++++++++++++++++++++++----------------------- 1 file changed, 105 insertions(+), 91 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e873362..1e254bc 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10981,59 +10981,25 @@ qemuDomainMemoryPeek(virDomainPtr dom, }
+/* Refresh the capacity and allocation limits of a given storage + * source. Assumes that the caller has already obtained a domain job. + * Set *activeFail to true if data cannot be obtained because a + * transient guest is no longer active. */ static int -qemuDomainGetBlockInfo(virDomainPtr dom, - const char *path, - virDomainBlockInfoPtr info, - unsigned int flags) +qemuStorageLimitsRefresh(virQEMUDriverPtr driver, virQEMUDriverConfigPtr cfg,
One argument per line looks better.
+ virDomainObjPtr vm, virDomainDiskDefPtr disk, + virStorageSourcePtr src, const char *path,
disk, src and path together? That doesn't seem to make much sense. While 'path' is used for error messages only, and I don't think it's entirely necessary to use the user-provided string in error messages we will be good of with just using src->path. The 'disk' argument is used to gather the allocation stat in case of block devices for an active domain. That is really strange. If you want to make this function usable for any disk source, the disk alias won't be the right way to get that. I think a better approach will be to split the code into two helpers, one being called on offline domains that gathers data from images on disk and doesn't ever touch the monitor (basically the existing code in qemuDomainGetBlockInfo minus the monitor call) and a second helper that will gather the data from monitor calls and never touch files used for online VMs. This is partially available in the existing helper to gather domain block stats. Peter

I noticed that for an offline domain, 'virsh domstats --block $dom' was producing just the domain name, with no stats. But the older 'virsh domblkinfo' works just fine on offline domains. This patch starts to get us closer, by at least reporting the disk names for an offline domain. With this patch, I now see the following for an offline domain with one qcow2 disk and an empty cdrom drive: $ virsh domstats --block foo Domain: 'foo' block.count=2 block.0.name=hda block.1.name=hdc * src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Don't short-circuit output of block name. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1e254bc..404decd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -18529,19 +18529,20 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, int rc; virHashTablePtr stats = NULL; qemuDomainObjPrivatePtr priv = dom->privateData; + bool abbreviated = false; - if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) - return 0; /* it's ok, just go ahead silently */ + if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) { + abbreviated = true; /* it's ok, just go ahead silently */ + } else { + qemuDomainObjEnterMonitor(driver, dom); + rc = qemuMonitorGetAllBlockStatsInfo(priv->mon, &stats); + ignore_value(qemuMonitorBlockStatsUpdateCapacity(priv->mon, stats)); + qemuDomainObjExitMonitor(driver, dom); - qemuDomainObjEnterMonitor(driver, dom); - rc = qemuMonitorGetAllBlockStatsInfo(priv->mon, &stats); - ignore_value(qemuMonitorBlockStatsUpdateCapacity(priv->mon, stats)); - qemuDomainObjExitMonitor(driver, dom); - - if (rc < 0) { - virResetLastError(); - ret = 0; /* still ok, again go ahead silently */ - goto cleanup; + if (rc < 0) { + virResetLastError(); + abbreviated = true; /* still ok, again go ahead silently */ + } } QEMU_ADD_COUNT_PARAM(record, maxparams, "block", dom->def->ndisks); @@ -18552,9 +18553,12 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, QEMU_ADD_NAME_PARAM(record, maxparams, "block", i, disk->dst); - if (!disk->info.alias || - !(entry = virHashLookup(stats, disk->info.alias))) + if (abbreviated || !disk->info.alias || + !(entry = virHashLookup(stats, disk->info.alias))) { + /* FIXME: we could still look up sizing by sharing code + * with qemuDomainGetBlockInfo */ continue; + } QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, "rd.reqs", entry->rd_req); -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
I noticed that for an offline domain, 'virsh domstats --block $dom' was producing just the domain name, with no stats. But the older 'virsh domblkinfo' works just fine on offline domains. This patch starts to get us closer, by at least reporting the disk names for an offline domain.
With this patch, I now see the following for an offline domain with one qcow2 disk and an empty cdrom drive: $ virsh domstats --block foo Domain: 'foo' block.count=2 block.0.name=hda block.1.name=hdc
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Don't short-circuit output of block name.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-)
ACK, Peter

The prior refactoring can now be put to use. With the same domain as the previous commit (one qcow2 disk and an empty cdrom drive): $ virsh domstats --block foo Domain: 'foo' block.count=2 block.0.name=hda block.0.allocation=200704 block.0.capacity=42949672960 block.0.physical=200704 block.1.name=hdc * src/qemu/qemu_driver.c (qemuStorageLimitsRefresh): Tweak semantics of helper function. (qemuDomainGetStatsBlock): Use it to report offline statistics. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 60 +++++++++++++++++++++++++++++++++++--------------- 1 file changed, 42 insertions(+), 18 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 404decd..723391b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10983,7 +10983,9 @@ qemuDomainMemoryPeek(virDomainPtr dom, /* Refresh the capacity and allocation limits of a given storage * source. Assumes that the caller has already obtained a domain job. - * Set *activeFail to true if data cannot be obtained because a + * Pass NULL for path to skip any errors about an empty drive. + * Pass NULL for activeFail to skip any monitor attempt, otherwise, + * set *activeFail to true if data cannot be obtained because a * transient guest is no longer active. */ static int qemuStorageLimitsRefresh(virQEMUDriverPtr driver, virQEMUDriverConfigPtr cfg, @@ -11000,6 +11002,12 @@ qemuStorageLimitsRefresh(virQEMUDriverPtr driver, virQEMUDriverConfigPtr cfg, char *buf = NULL; ssize_t len; + if (!path) { + if (virStorageSourceIsEmpty(src)) + return 0; + path = src->path; + } + /* FIXME: For an offline domain, we always want to check current * on-disk statistics (as users have been known to change offline * images behind our backs). For a running domain, however, it @@ -11116,23 +11124,27 @@ qemuStorageLimitsRefresh(virQEMUDriverPtr driver, virQEMUDriverConfigPtr cfg, if (virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_BLOCK && format != VIR_STORAGE_FILE_RAW && S_ISBLK(sb.st_mode)) { - qemuDomainObjPrivatePtr priv = vm->privateData; - - /* If the guest is not running, then success/failure return - * depends on whether domain is persistent - */ - if (!virDomainObjIsActive(vm)) { - *activeFail = true; + if (!activeFail) { + disk->src->allocation = 0; ret = 0; - goto cleanup; + } else { + qemuDomainObjPrivatePtr priv = vm->privateData; + + /* If the guest is not running, then success/failure return + * depends on whether domain is persistent + */ + if (!virDomainObjIsActive(vm)) { + *activeFail = true; + ret = 0; + goto cleanup; + } + + qemuDomainObjEnterMonitor(driver, vm); + ret = qemuMonitorGetBlockExtent(priv->mon, + disk->info.alias, + &src->allocation); + qemuDomainObjExitMonitor(driver, vm); } - - qemuDomainObjEnterMonitor(driver, vm); - ret = qemuMonitorGetBlockExtent(priv->mon, - disk->info.alias, - &src->allocation); - qemuDomainObjExitMonitor(driver, vm); - } else { ret = 0; } @@ -18530,6 +18542,7 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, virHashTablePtr stats = NULL; qemuDomainObjPrivatePtr priv = dom->privateData; bool abbreviated = false; + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) { abbreviated = true; /* it's ok, just go ahead silently */ @@ -18555,8 +18568,18 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, if (abbreviated || !disk->info.alias || !(entry = virHashLookup(stats, disk->info.alias))) { - /* FIXME: we could still look up sizing by sharing code - * with qemuDomainGetBlockInfo */ + if (qemuStorageLimitsRefresh(driver, cfg, dom, + disk, disk->src, NULL, NULL) < 0) + goto cleanup; + if (disk->src->allocation) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, + "allocation", disk->src->allocation); + if (disk->src->capacity) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, + "capacity", disk->src->capacity); + if (disk->src->physical) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, + "physical", disk->src->physical); continue; } @@ -18593,6 +18616,7 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, cleanup: virHashFree(stats); + virObjectUnref(cfg); return ret; } -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
The prior refactoring can now be put to use. With the same domain as the previous commit (one qcow2 disk and an empty cdrom drive): $ virsh domstats --block foo Domain: 'foo' block.count=2 block.0.name=hda block.0.allocation=200704 block.0.capacity=42949672960 block.0.physical=200704 block.1.name=hdc
* src/qemu/qemu_driver.c (qemuStorageLimitsRefresh): Tweak semantics of helper function. (qemuDomainGetStatsBlock): Use it to report offline statistics.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 60 +++++++++++++++++++++++++++++++++++--------------- 1 file changed, 42 insertions(+), 18 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 404decd..723391b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10983,7 +10983,9 @@ qemuDomainMemoryPeek(virDomainPtr dom,
/* Refresh the capacity and allocation limits of a given storage * source. Assumes that the caller has already obtained a domain job. - * Set *activeFail to true if data cannot be obtained because a + * Pass NULL for path to skip any errors about an empty drive. + * Pass NULL for activeFail to skip any monitor attempt, otherwise,
Having a separate offline-only helper in this case would avoid us having this argument as it would never touch the monitor.
+ * set *activeFail to true if data cannot be obtained because a * transient guest is no longer active. */ static int qemuStorageLimitsRefresh(virQEMUDriverPtr driver, virQEMUDriverConfigPtr cfg, @@ -11000,6 +11002,12 @@ qemuStorageLimitsRefresh(virQEMUDriverPtr driver, virQEMUDriverConfigPtr cfg, char *buf = NULL; ssize_t len;
+ if (!path) { + if (virStorageSourceIsEmpty(src)) + return 0; + path = src->path; + }
Again, no need to pass path explicitly.
+ /* FIXME: For an offline domain, we always want to check current * on-disk statistics (as users have been known to change offline * images behind our backs). For a running domain, however, it
...
@@ -18530,6 +18542,7 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, virHashTablePtr stats = NULL; qemuDomainObjPrivatePtr priv = dom->privateData; bool abbreviated = false; + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) { abbreviated = true; /* it's ok, just go ahead silently */ @@ -18555,8 +18568,18 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver,
if (abbreviated || !disk->info.alias || !(entry = virHashLookup(stats, disk->info.alias))) { - /* FIXME: we could still look up sizing by sharing code - * with qemuDomainGetBlockInfo */ + if (qemuStorageLimitsRefresh(driver, cfg, dom, + disk, disk->src, NULL, NULL) < 0) + goto cleanup; + if (disk->src->allocation) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, + "allocation", disk->src->allocation); + if (disk->src->capacity) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, + "capacity", disk->src->capacity); + if (disk->src->physical) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, + "physical", disk->src->physical); continue; }
This part looks good. Having stats for offline VMs might be useful.
@@ -18593,6 +18616,7 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver,
cleanup: virHashFree(stats); + virObjectUnref(cfg); return ret; }
Peter

I'm about to make block stats optionally more complex to cover backing chains, where block.count will no longer equal the number of <disks> for a domain. For these reasons, it is nicer if the statistics output includes the source path (for local files). This patch doesn't add anything for network disks, although we may decide to add that later. With this patch, I now see the following for the same domain as in earlier patches (one qcow2 file, and an empty cdrom drive): $ virsh domstats --block foo Domain: 'foo' block.count=2 block.0.name=hda block.0.path=/var/lib/libvirt/images/foo.qcow2 block.0.allocation=200704 block.0.capacity=42949672960 block.0.physical=200704 block.1.name=hdc * src/libvirt-domain.c (virConnectGetAllDomainStats): Document new field. * tools/virsh.pod (domstats): Document new field. * src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Return the new stat for local files/block devices. (QEMU_ADD_NAME_PARAM): Add parameter. (qemuDomainGetStatsInterface): Update caller. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/libvirt-domain.c | 3 +++ src/qemu/qemu_driver.c | 11 +++++++---- tools/virsh.pod | 2 ++ 3 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 87123d1..cb76d8c 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -10910,6 +10910,9 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * "block.<num>.name" - name of the block device <num> as string. * matches the target name (vda/sda/hda) of the * block device. + * "block.<num>.path" - string describing the source of block device <num>, + * if it is a file or block device (omitted for network + * sources and drives with no media inserted). * "block.<num>.rd.reqs" - number of read requests as unsigned long long. * "block.<num>.rd.bytes" - number of read bytes as unsigned long long. * "block.<num>.rd.times" - total time (ns) spent on reads as diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 723391b..5334f8d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -18422,11 +18422,11 @@ do { \ goto cleanup; \ } while (0) -#define QEMU_ADD_NAME_PARAM(record, maxparams, type, num, name) \ +#define QEMU_ADD_NAME_PARAM(record, maxparams, type, subtype, num, name) \ do { \ char param_name[VIR_TYPED_PARAM_FIELD_LENGTH]; \ snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, \ - "%s.%zu.name", type, num); \ + "%s.%zu.%s", type, num, subtype); \ if (virTypedParamsAddString(&(record)->params, \ &(record)->nparams, \ maxparams, \ @@ -18471,7 +18471,7 @@ qemuDomainGetStatsInterface(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, memset(&tmp, 0, sizeof(tmp)); QEMU_ADD_NAME_PARAM(record, maxparams, - "net", i, dom->def->nets[i]->ifname); + "net", "name", i, dom->def->nets[i]->ifname); if (virNetInterfaceStats(dom->def->nets[i]->ifname, &tmp) < 0) { virResetLastError(); @@ -18564,7 +18564,10 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, qemuBlockStats *entry; virDomainDiskDefPtr disk = dom->def->disks[i]; - QEMU_ADD_NAME_PARAM(record, maxparams, "block", i, disk->dst); + QEMU_ADD_NAME_PARAM(record, maxparams, "block", "name", i, disk->dst); + if (virStorageSourceIsLocalStorage(disk->src) && disk->src->path) + QEMU_ADD_NAME_PARAM(record, maxparams, "block", "path", + i, disk->src->path); if (abbreviated || !disk->info.alias || !(entry = virHashLookup(stats, disk->info.alias))) { diff --git a/tools/virsh.pod b/tools/virsh.pod index c070261..cbd82275 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -883,6 +883,8 @@ I<--interface> returns: I<--block> returns: "block.count" - number of block devices on this domain, "block.<num>.name" - name of the target of the block device <num>, +"block.<num>.path" - file source of block device <num>, if it is a +local file or block device, "block.<num>.rd.reqs" - number of read requests, "block.<num>.rd.bytes" - number of read bytes, "block.<num>.rd.times" - total time (ns) spent on reads, -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
I'm about to make block stats optionally more complex to cover backing chains, where block.count will no longer equal the number of <disks> for a domain. For these reasons, it is nicer if the statistics output includes the source path (for local files). This patch doesn't add anything for network disks, although we may decide to add that later.
With this patch, I now see the following for the same domain as in earlier patches (one qcow2 file, and an empty cdrom drive): $ virsh domstats --block foo Domain: 'foo' block.count=2 block.0.name=hda block.0.path=/var/lib/libvirt/images/foo.qcow2 block.0.allocation=200704 block.0.capacity=42949672960 block.0.physical=200704 block.1.name=hdc
* src/libvirt-domain.c (virConnectGetAllDomainStats): Document new field. * tools/virsh.pod (domstats): Document new field. * src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Return the new stat for local files/block devices. (QEMU_ADD_NAME_PARAM): Add parameter. (qemuDomainGetStatsInterface): Update caller.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/libvirt-domain.c | 3 +++ src/qemu/qemu_driver.c | 11 +++++++---- tools/virsh.pod | 2 ++ 3 files changed, 12 insertions(+), 4 deletions(-)
ACK, Peter

On 12/08/2014 07:17 AM, Peter Krempa wrote:
On 12/06/14 09:14, Eric Blake wrote:
I'm about to make block stats optionally more complex to cover backing chains, where block.count will no longer equal the number of <disks> for a domain. For these reasons, it is nicer if the statistics output includes the source path (for local files). This patch doesn't add anything for network disks, although we may decide to add that later.
With this patch, I now see the following for the same domain as in earlier patches (one qcow2 file, and an empty cdrom drive): $ virsh domstats --block foo Domain: 'foo' block.count=2 block.0.name=hda block.0.path=/var/lib/libvirt/images/foo.qcow2 block.0.allocation=200704 block.0.capacity=42949672960 block.0.physical=200704 block.1.name=hdc
ACK,
2, 6, and 8 now pushed. Of course, the commit message for 8 had to be tweaked a bit (since the output is waiting on the offline stats of patch 7, which in turn are waiting on the split of offline vs. online in the helper to blockinfo). -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

A coming patch will make it optionally possible to list backing chain block stats; in this mode of operation, block.counts is no longer the number of <disks> in the domain, but the number of blocks in the array being reported. We still want block.count listed first, but rather than iterate the tree twice (once to count, and once to list stats), it's easier to just touch things up after the fact. * src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Compute count after the fact. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 5334f8d..cb80f49 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -18543,6 +18543,7 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, qemuDomainObjPrivatePtr priv = dom->privateData; bool abbreviated = false; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + int count_index = -1; if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) { abbreviated = true; /* it's ok, just go ahead silently */ @@ -18558,7 +18559,11 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, } } - QEMU_ADD_COUNT_PARAM(record, maxparams, "block", dom->def->ndisks); + /* When listing backing chains, it's easier to fix up the count + * after the iteration than it is to iterate twice; but we still + * want count listed first. */ + count_index = record->nparams; + QEMU_ADD_COUNT_PARAM(record, maxparams, "block", 0); for (i = 0; i < dom->def->ndisks; i++) { qemuBlockStats *entry; @@ -18618,6 +18623,8 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, ret = 0; cleanup: + if (count_index >= 0) + record->params[count_index].value.ui = i; virHashFree(stats); virObjectUnref(cfg); return ret; -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
A coming patch will make it optionally possible to list backing chain block stats; in this mode of operation, block.counts is no longer the number of <disks> in the domain, but the number of blocks in the array being reported. We still want block.count listed first, but rather than iterate the tree twice (once to count, and once to list stats), it's easier to just touch things up after the fact.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Compute count after the fact.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
ACK, Peter

On 12/08/14 15:19, Peter Krempa wrote:
On 12/06/14 09:14, Eric Blake wrote:
A coming patch will make it optionally possible to list backing chain block stats; in this mode of operation, block.counts is no longer the number of <disks> in the domain, but the number of blocks in the array being reported. We still want block.count listed first, but rather than iterate the tree twice (once to count, and once to list stats), it's easier to just touch things up after the fact.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Compute count after the fact.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
Compiler complains: qemu/qemu_driver.c: In function 'qemuDomainGetStatsBlock': qemu/qemu_driver.c:18630:46: error: 'i' may be used uninitialized in this function [-Werror=maybe-uninitialized] record->params[count_index].value.ui = i; ^ CC lxc/libvirt_driver_lxc_impl_la-lxc_native.lo CC lxc/libvirt_driver_lxc_impl_la-lxc_driver.lo CC uml/libvirt_driver_uml_impl_la-uml_conf.lo CC uml/libvirt_driver_uml_impl_la-uml_driver.lo CC network/libvirt_driver_network_impl_la-bridge_driver.lo CC network/libvirt_driver_network_impl_la-bridge_driver_platform.lo
ACK,
if you fix the issue above.
Peter

On 12/08/2014 07:26 AM, Peter Krempa wrote:
On 12/08/14 15:19, Peter Krempa wrote:
On 12/06/14 09:14, Eric Blake wrote:
A coming patch will make it optionally possible to list backing chain block stats; in this mode of operation, block.counts is no longer the number of <disks> in the domain, but the number of blocks in the array being reported. We still want block.count listed first, but rather than iterate the tree twice (once to count, and once to list stats), it's easier to just touch things up after the fact.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Compute count after the fact.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
Compiler complains:
qemu/qemu_driver.c: In function 'qemuDomainGetStatsBlock': qemu/qemu_driver.c:18630:46: error: 'i' may be used uninitialized in this function [-Werror=maybe-uninitialized] record->params[count_index].value.ui = i; ^
You must be compiling at -O2 while I was not :) Found the culprit: the QEMU_ADD_COUNT_PARAM macro has a hidden goto, that can bypass the initialization of i=0 in the for loop; the solution is trivial.
ACK,
if you fix the issue above.
I'll be pushing the acked patches that can be reshuffled without dependencies, and reposting the rest of the series with fixes incorporated... diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index bb516af..78d1bdd 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -18539,7 +18539,7 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, int *maxparams, unsigned int privflags) { - size_t i; + size_t i = 0; int ret = -1; int rc; virHashTablePtr stats = NULL; @@ -18568,7 +18568,7 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, count_index = record->nparams; QEMU_ADD_COUNT_PARAM(record, maxparams, "block", 0); - for (i = 0; i < dom->def->ndisks; i++) { + for (i; i < dom->def->ndisks; i++) { qemuBlockStats *entry; virDomainDiskDefPtr disk = dom->def->disks[i]; -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

This patch introduces access to allocation information about a backing chain of a live domain. While querying storage volumes for read-only disks could provide some of the details, there is one case where we have to rely on qemu: when doing a block commit into a backing file, where that file is stored in qcow2 format on a host block device, we want to know the current highest write offset into that image, in order to know if the disk must be resized larger. qemu-img does not (currently) show this information, and none of the earlier block APIs were extensible enough to expose it. But virDomainListGetStats is perfect for the job! We don't need a new group of statistics, as the existing block group is sufficient. On the other hand, as existing libvirt releases already report 1:1 mapping of block.count to <disk> devices, changing the array size could confuse older clients; and even with newer clients, the time and memory taken to report additional statistics is not always necessary (backing files are generally read-only except for block-commit, so while read statistics may change, sizing statistics will not). So the choice here is to add a new flag that only newer callers will pass, when they are prepared for the additional information. This patch introduces the new API, but it will take more patches to get it implemented for qemu. * include/libvirt/libvirt-domain.h (VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING): New flag. * src/libvirt-domain.c (virConnectGetAllDomainStats): Document it, and add a new field when it is in use. * tools/virsh-domain-monitor.c (cmdDomstats): Use new flag. * tools/virsh.pod (domstats): Document it. Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain.h | 1 + src/libvirt-domain.c | 7 ++++++- tools/virsh-domain-monitor.c | 7 +++++++ tools/virsh.pod | 8 ++++++-- 4 files changed, 20 insertions(+), 3 deletions(-) diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index ae2c49c..e6185b9 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1698,6 +1698,7 @@ typedef enum { VIR_CONNECT_GET_ALL_DOMAINS_STATS_SHUTOFF = VIR_CONNECT_LIST_DOMAINS_SHUTOFF, VIR_CONNECT_GET_ALL_DOMAINS_STATS_OTHER = VIR_CONNECT_LIST_DOMAINS_OTHER, + VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING = 1 << 30, /* include backing chain for block stats */ VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS = 1 << 31, /* enforce requested stats */ } virConnectGetAllDomainStatsFlags; diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index cb76d8c..8c4ad7b 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -10903,13 +10903,18 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * "net.<num>.tx.errs" - transmission errors as unsigned long long. * "net.<num>.tx.drop" - transmit packets dropped as unsigned long long. * - * VIR_DOMAIN_STATS_BLOCK: Return block devices statistics. + * VIR_DOMAIN_STATS_BLOCK: Return block devices statistics. By default, + * this information is limited to the active layer of each <disk> of the + * domain, but adding VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING to @flags + * will expand the array to cover backing chains. * The typed parameter keys are in this format: * "block.count" - number of block devices on this domain * as unsigned int. * "block.<num>.name" - name of the block device <num> as string. * matches the target name (vda/sda/hda) of the * block device. + * "block.<num>.backingIndex" - unsigned int giving the <backingStore> index, + * when backing images are listed. * "block.<num>.path" - string describing the source of block device <num>, * if it is a file or block device (omitted for network * sources and drives with no media inserted). diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c index 259400f..214c0b2 100644 --- a/tools/virsh-domain-monitor.c +++ b/tools/virsh-domain-monitor.c @@ -2037,6 +2037,10 @@ static const vshCmdOptDef opts_domstats[] = { .type = VSH_OT_BOOL, .help = N_("enforce requested stats parameters"), }, + {.name = "backing", + .type = VSH_OT_BOOL, + .help = N_("add backing chain information to block stats"), + }, {.name = "domain", .type = VSH_OT_ARGV, .flags = VSH_OFLAG_NONE, @@ -2130,6 +2134,9 @@ cmdDomstats(vshControl *ctl, const vshCmd *cmd) if (vshCommandOptBool(cmd, "enforce")) flags |= VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS; + if (vshCommandOptBool(cmd, "backing")) + flags |= VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING; + if (vshCommandOptBool(cmd, "domain")) { if (VIR_ALLOC_N(domlist, 1) < 0) goto cleanup; diff --git a/tools/virsh.pod b/tools/virsh.pod index cbd82275..378f1c0 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -825,7 +825,7 @@ that require a block device name (such as I<domblkinfo> or I<snapshot-create> for disk snapshots) will accept either target or unique source names printed by this command. -=item B<domstats> [I<--raw>] [I<--enforce>] [I<--state>] +=item B<domstats> [I<--raw>] [I<--enforce>] [I<--backing>] [I<--state>] [I<--cpu-total>] [I<--balloon>] [I<--vcpu>] [I<--interface>] [I<--block>] [[I<--list-active>] [I<--list-inactive>] [I<--list-persistent>] [I<--list-transient>] [I<--list-running>] [I<--list-paused>] @@ -880,7 +880,11 @@ I<--interface> returns: "net.<num>.tx.errs" - number of transmission errors, "net.<num>.tx.drop" - number of transmit packets dropped -I<--block> returns: +I<--block> returns information about disks associated with each +domain. Using the I<--backing> flag extends this information to +cover all resources in the backing chain, rather than the default +of limiting information to the active layer for each guest disk. +Information listed includes: "block.count" - number of block devices on this domain, "block.<num>.name" - name of the target of the block device <num>, "block.<num>.path" - file source of block device <num>, if it is a -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
This patch introduces access to allocation information about a backing chain of a live domain. While querying storage volumes for read-only disks could provide some of the details, there is one case where we have to rely on qemu: when doing a block commit into a backing file, where that file is stored in qcow2 format on a host block device, we want to know the current highest write offset into that image, in order to know if the disk must be resized larger. qemu-img does not (currently) show this information, and none of the earlier block APIs were extensible enough to expose it. But virDomainListGetStats is perfect for the job!
We don't need a new group of statistics, as the existing block group is sufficient. On the other hand, as existing libvirt releases already report 1:1 mapping of block.count to <disk> devices, changing the array size could confuse older clients; and even with newer clients, the time and memory taken to report additional statistics is not always necessary (backing files are generally read-only except for block-commit, so while read statistics may change, sizing statistics will not). So the choice here is to add a new flag that only newer callers will pass, when they are prepared for the additional information.
This patch introduces the new API, but it will take more patches to get it implemented for qemu.
* include/libvirt/libvirt-domain.h (VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING): New flag. * src/libvirt-domain.c (virConnectGetAllDomainStats): Document it, and add a new field when it is in use. * tools/virsh-domain-monitor.c (cmdDomstats): Use new flag. * tools/virsh.pod (domstats): Document it.
Signed-off-by: Eric Blake <eblake@redhat.com> --- include/libvirt/libvirt-domain.h | 1 + src/libvirt-domain.c | 7 ++++++- tools/virsh-domain-monitor.c | 7 +++++++ tools/virsh.pod | 8 ++++++-- 4 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index cb76d8c..8c4ad7b 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -10903,13 +10903,18 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * "net.<num>.tx.errs" - transmission errors as unsigned long long. * "net.<num>.tx.drop" - transmit packets dropped as unsigned long long. * - * VIR_DOMAIN_STATS_BLOCK: Return block devices statistics. + * VIR_DOMAIN_STATS_BLOCK: Return block devices statistics. By default, + * this information is limited to the active layer of each <disk> of the + * domain, but adding VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING to @flags + * will expand the array to cover backing chains. * The typed parameter keys are in this format: * "block.count" - number of block devices on this domain * as unsigned int. * "block.<num>.name" - name of the block device <num> as string. * matches the target name (vda/sda/hda) of the * block device. + * "block.<num>.backingIndex" - unsigned int giving the <backingStore> index, + * when backing images are listed.
So name will always be "vda/sda/hda" and this will change according to the position in the backing chain? Okay. That makes sense although it's not entirely obvious from the description.
* "block.<num>.path" - string describing the source of block device <num>, * if it is a file or block device (omitted for network * sources and drives with no media inserted).
...
diff --git a/tools/virsh.pod b/tools/virsh.pod index cbd82275..378f1c0 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -825,7 +825,7 @@ that require a block device name (such as I<domblkinfo> or I<snapshot-create> for disk snapshots) will accept either target or unique source names printed by this command.
-=item B<domstats> [I<--raw>] [I<--enforce>] [I<--state>] +=item B<domstats> [I<--raw>] [I<--enforce>] [I<--backing>] [I<--state>] [I<--cpu-total>] [I<--balloon>] [I<--vcpu>] [I<--interface>] [I<--block>] [[I<--list-active>] [I<--list-inactive>] [I<--list-persistent>] [I<--list-transient>] [I<--list-running>] [I<--list-paused>] @@ -880,7 +880,11 @@ I<--interface> returns: "net.<num>.tx.errs" - number of transmission errors, "net.<num>.tx.drop" - number of transmit packets dropped
-I<--block> returns: +I<--block> returns information about disks associated with each +domain. Using the I<--backing> flag extends this information to +cover all resources in the backing chain, rather than the default +of limiting information to the active layer for each guest disk. +Information listed includes: "block.count" - number of block devices on this domain, "block.<num>.name" - name of the target of the block device <num>, "block.<num>.path" - file source of block device <num>, if it is a
The man page is missing entry for "block.<num>.backingIndex"
ACK if you add the man page entry. Peter

In order to report stats on backing chains, we need to separate the output of stats for one block from how we traverse blocks. * src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Split... (qemuDomainGetStatsOneBlock): ...into new helper. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 127 ++++++++++++++++++++++++++++++------------------- 1 file changed, 77 insertions(+), 50 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index cb80f49..feaa4a2 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -18529,6 +18529,79 @@ do { \ goto cleanup; \ } while (0) + +static int +qemuDomainGetStatsOneBlock(virQEMUDriverPtr driver, + virQEMUDriverConfigPtr cfg, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + virDomainDiskDefPtr disk, + virStorageSourcePtr src, + size_t block_idx, + bool abbreviated, + virHashTablePtr stats) +{ + qemuBlockStats *entry; + int ret = -1; + + QEMU_ADD_NAME_PARAM(record, maxparams, "block", "name", block_idx, + disk->dst); + if (virStorageSourceIsLocalStorage(src) && src->path) + QEMU_ADD_NAME_PARAM(record, maxparams, "block", "path", + block_idx, src->path); + + if (abbreviated || !disk->info.alias || + !(entry = virHashLookup(stats, disk->info.alias))) { + if (qemuStorageLimitsRefresh(driver, cfg, dom, + disk, src, NULL, NULL) < 0) + goto cleanup; + if (src->allocation) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "allocation", src->allocation); + if (src->capacity) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "capacity", src->capacity); + if (src->physical) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "physical", src->physical); + ret = 0; + goto cleanup; + } + + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "rd.reqs", entry->rd_req); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "rd.bytes", entry->rd_bytes); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "rd.times", entry->rd_total_times); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "wr.reqs", entry->wr_req); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "wr.bytes", entry->wr_bytes); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "wr.times", entry->wr_total_times); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "fl.reqs", entry->flush_req); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "fl.times", entry->flush_total_times); + + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "allocation", entry->wr_highest_offset); + + if (entry->capacity) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "capacity", entry->capacity); + if (entry->physical) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "physical", entry->physical); + + ret = 0; + cleanup: + return ret; +} + + static int qemuDomainGetStatsBlock(virQEMUDriverPtr driver, virDomainObjPtr dom, @@ -18566,58 +18639,12 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, QEMU_ADD_COUNT_PARAM(record, maxparams, "block", 0); for (i = 0; i < dom->def->ndisks; i++) { - qemuBlockStats *entry; virDomainDiskDefPtr disk = dom->def->disks[i]; - QEMU_ADD_NAME_PARAM(record, maxparams, "block", "name", i, disk->dst); - if (virStorageSourceIsLocalStorage(disk->src) && disk->src->path) - QEMU_ADD_NAME_PARAM(record, maxparams, "block", "path", - i, disk->src->path); - - if (abbreviated || !disk->info.alias || - !(entry = virHashLookup(stats, disk->info.alias))) { - if (qemuStorageLimitsRefresh(driver, cfg, dom, - disk, disk->src, NULL, NULL) < 0) - goto cleanup; - if (disk->src->allocation) - QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, - "allocation", disk->src->allocation); - if (disk->src->capacity) - QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, - "capacity", disk->src->capacity); - if (disk->src->physical) - QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, - "physical", disk->src->physical); - continue; - } - - QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, - "rd.reqs", entry->rd_req); - QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, - "rd.bytes", entry->rd_bytes); - QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, - "rd.times", entry->rd_total_times); - QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, - "wr.reqs", entry->wr_req); - QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, - "wr.bytes", entry->wr_bytes); - QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, - "wr.times", entry->wr_total_times); - QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, - "fl.reqs", entry->flush_req); - QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, i, - "fl.times", entry->flush_total_times); - - QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, - "allocation", entry->wr_highest_offset); - - if (entry->capacity) - QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, - "capacity", entry->capacity); - if (entry->physical) - QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, i, - "physical", entry->physical); - + if (qemuDomainGetStatsOneBlock(driver, cfg, dom, record, maxparams, + disk, disk->src, i, abbreviated, + stats) < 0) + goto cleanup; } ret = 0; -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
In order to report stats on backing chains, we need to separate the output of stats for one block from how we traverse blocks.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Split... (qemuDomainGetStatsOneBlock): ...into new helper.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 127 ++++++++++++++++++++++++++++++------------------- 1 file changed, 77 insertions(+), 50 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index cb80f49..feaa4a2 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -18529,6 +18529,79 @@ do { \ goto cleanup; \ } while (0)
+ +static int +qemuDomainGetStatsOneBlock(virQEMUDriverPtr driver, + virQEMUDriverConfigPtr cfg, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + virDomainDiskDefPtr disk, + virStorageSourcePtr src,
This is again a bit strange. I think that for other than top-level disks also should be gathered via monitor. In that case we'll need a way to 'alias' them from qemu's point of view in an uniform way - perhaps - node names?
+ size_t block_idx, + bool abbreviated, + virHashTablePtr stats) +{ + qemuBlockStats *entry; + int ret = -1; + + QEMU_ADD_NAME_PARAM(record, maxparams, "block", "name", block_idx, + disk->dst); + if (virStorageSourceIsLocalStorage(src) && src->path) + QEMU_ADD_NAME_PARAM(record, maxparams, "block", "path", + block_idx, src->path); + + if (abbreviated || !disk->info.alias || + !(entry = virHashLookup(stats, disk->info.alias))) { + if (qemuStorageLimitsRefresh(driver, cfg, dom, + disk, src, NULL, NULL) < 0)
...
+ goto cleanup; + if (src->allocation) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "allocation", src->allocation); + if (src->capacity) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "capacity", src->capacity); + if (src->physical) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "physical", src->physical); + ret = 0; + goto cleanup; + } + + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "rd.reqs", entry->rd_req); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "rd.bytes", entry->rd_bytes); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "rd.times", entry->rd_total_times); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "wr.reqs", entry->wr_req); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "wr.bytes", entry->wr_bytes); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "wr.times", entry->wr_total_times); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "fl.reqs", entry->flush_req); + QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, block_idx, + "fl.times", entry->flush_total_times); + + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "allocation", entry->wr_highest_offset); + + if (entry->capacity) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "capacity", entry->capacity); + if (entry->physical) + QEMU_ADD_BLOCK_PARAM_ULL(record, maxparams, block_idx, + "physical", entry->physical); + + ret = 0; + cleanup: + return ret; +} + + static int qemuDomainGetStatsBlock(virQEMUDriverPtr driver, virDomainObjPtr dom,
The recursive approach will be necessary, but I don't really like this step where we gather the data differently for non-toplevel images. Peter

Wire up backing chain recursion. Note that for now, we just use the same allocation numbers for read-only backing files as what offline domains would report. It is not the correct allocation number for qcow2 over block devices during block-commit, and it misses out on the fact that qemu also reports read statistics on backing files that are worth knowing about (seriously - for a thin-provisioned setup, it would be nice to easily get at a count of how many reads were serviced from the backing file in relation to reads serviced by the active layer). But it is at least sufficient to prove that the algorithm is working, and to let other people start coding to the interface while waiting for later patches that get the correct information. For a running domain, where one of the two images has a backing file, I see the traditional output: $ virsh domstats --block testvm2 Domain: 'testvm2' block.count=2 block.0.name=vda block.0.path=/tmp/wrapper.qcow2 block.0.rd.reqs=1 block.0.rd.bytes=512 block.0.rd.times=28858 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.capacity=1310720000 block.0.physical=200704 block.1.name=vdb block.1.path=/dev/sda7 block.1.rd.reqs=0 block.1.rd.bytes=0 block.1.rd.times=0 block.1.wr.reqs=0 block.1.wr.bytes=0 block.1.wr.times=0 block.1.fl.reqs=0 block.1.fl.times=0 block.1.allocation=0 block.1.capacity=1310720000 vs. the new output: $ virsh domstats --block --backing testvm2 Domain: 'testvm2' block.count=3 block.0.name=vda block.0.path=/tmp/wrapper.qcow2 block.0.rd.reqs=1 block.0.rd.bytes=512 block.0.rd.times=28858 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.capacity=1310720000 block.0.physical=200704 block.1.name=vda block.1.path=/dev/sda6 block.1.backingIndex=1 block.1.allocation=1073741824 block.1.capacity=1310720000 block.1.physical=1073741824 block.2.name=vdb block.2.path=/dev/sda7 block.2.rd.reqs=0 block.2.rd.bytes=0 block.2.rd.times=0 block.2.wr.reqs=0 block.2.wr.bytes=0 block.2.wr.times=0 block.2.fl.reqs=0 block.2.fl.times=0 block.2.allocation=0 block.2.capacity=1310720000 * src/qemu/qemu_driver.c (QEMU_DOMAIN_STATS_BACKING): New internal enum bit. (qemuConnectGetAllDomainStats): Recognize new user flag, and pass details to... (qemuDomainGetStatsBlock): ...here, where we can do longer recursion. (qemuDomainGetStatsOneBlock): Output new field. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 55 +++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 46 insertions(+), 9 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index feaa4a2..b57beeb 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -18256,8 +18256,10 @@ qemuDomainGetStatsState(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, typedef enum { - QEMU_DOMAIN_STATS_HAVE_JOB = (1 << 0), /* job is entered, monitor can be - accessed */ + QEMU_DOMAIN_STATS_HAVE_JOB = 1 << 0, /* job is entered, monitor can be + accessed */ + QEMU_DOMAIN_STATS_BACKING = 1 << 1, /* include backing chain in + block stats */ } qemuDomainStatsFlags; @@ -18502,6 +18504,19 @@ qemuDomainGetStatsInterface(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, #undef QEMU_ADD_NET_PARAM +#define QEMU_ADD_BLOCK_PARAM_UI(record, maxparams, num, name, value) \ + do { \ + char param_name[VIR_TYPED_PARAM_FIELD_LENGTH]; \ + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, \ + "block.%zu.%s", num, name); \ + if (virTypedParamsAddUInt(&(record)->params, \ + &(record)->nparams, \ + maxparams, \ + param_name, \ + value) < 0) \ + goto cleanup; \ + } while (0) + /* expects a LL, but typed parameter must be ULL */ #define QEMU_ADD_BLOCK_PARAM_LL(record, maxparams, num, name, value) \ do { \ @@ -18539,6 +18554,7 @@ qemuDomainGetStatsOneBlock(virQEMUDriverPtr driver, virDomainDiskDefPtr disk, virStorageSourcePtr src, size_t block_idx, + unsigned int backing_idx, bool abbreviated, virHashTablePtr stats) { @@ -18550,8 +18566,16 @@ qemuDomainGetStatsOneBlock(virQEMUDriverPtr driver, if (virStorageSourceIsLocalStorage(src) && src->path) QEMU_ADD_NAME_PARAM(record, maxparams, "block", "path", block_idx, src->path); + if (backing_idx) + QEMU_ADD_BLOCK_PARAM_UI(record, maxparams, block_idx, "backingIndex", + backing_idx); - if (abbreviated || !disk->info.alias || + /* FIXME: qemu gives information on backing files, but we aren't + * currently storing it into the stats table - we need a common + * key in qemu_monitor_json.c:qemuMonitorGetAllBlockStatsInfo and + * here for getting at that information, probably something like + * asprintf("%s.%d", alias, backing_idx). */ + if (abbreviated || backing_idx || !disk->info.alias || !(entry = virHashLookup(stats, disk->info.alias))) { if (qemuStorageLimitsRefresh(driver, cfg, dom, disk, src, NULL, NULL) < 0) @@ -18617,6 +18641,7 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, bool abbreviated = false; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); int count_index = -1; + size_t visited = 0; if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) { abbreviated = true; /* it's ok, just go ahead silently */ @@ -18640,18 +18665,26 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, for (i = 0; i < dom->def->ndisks; i++) { virDomainDiskDefPtr disk = dom->def->disks[i]; + virStorageSourcePtr src = disk->src; + unsigned int backing_idx = 0; - if (qemuDomainGetStatsOneBlock(driver, cfg, dom, record, maxparams, - disk, disk->src, i, abbreviated, - stats) < 0) - goto cleanup; + while (src && (!backing_idx || + (privflags & QEMU_DOMAIN_STATS_BACKING))) { + if (qemuDomainGetStatsOneBlock(driver, cfg, dom, record, maxparams, + disk, src, visited, backing_idx, + abbreviated, stats) < 0) + goto cleanup; + visited++; + backing_idx++; + src = src->backingStore; + } } ret = 0; cleanup: if (count_index >= 0) - record->params[count_index].value.ui = i; + record->params[count_index].value.ui = visited; virHashFree(stats); virObjectUnref(cfg); return ret; @@ -18792,11 +18825,13 @@ qemuConnectGetAllDomainStats(virConnectPtr conn, unsigned int domflags = 0; if (ndoms) - virCheckFlags(VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS, -1); + virCheckFlags(VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING | + VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS, -1); else virCheckFlags(VIR_CONNECT_LIST_DOMAINS_FILTERS_ACTIVE | VIR_CONNECT_LIST_DOMAINS_FILTERS_PERSISTENT | VIR_CONNECT_LIST_DOMAINS_FILTERS_STATE | + VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING | VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS, -1); if (virConnectGetAllDomainStatsEnsureACL(conn) < 0) @@ -18826,6 +18861,8 @@ qemuConnectGetAllDomainStats(virConnectPtr conn, if (qemuDomainGetStatsNeedMonitor(stats)) privflags |= QEMU_DOMAIN_STATS_HAVE_JOB; + if (flags & VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING) + privflags |= QEMU_DOMAIN_STATS_BACKING; for (i = 0; i < ndoms; i++) { domflags = privflags; -- 1.9.3

On 12/06/14 09:14, Eric Blake wrote:
Wire up backing chain recursion. Note that for now, we just use
...
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 55 +++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 46 insertions(+), 9 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index feaa4a2..b57beeb 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c
...
@@ -18550,8 +18566,16 @@ qemuDomainGetStatsOneBlock(virQEMUDriverPtr driver, if (virStorageSourceIsLocalStorage(src) && src->path) QEMU_ADD_NAME_PARAM(record, maxparams, "block", "path", block_idx, src->path); + if (backing_idx) + QEMU_ADD_BLOCK_PARAM_UI(record, maxparams, block_idx, "backingIndex", + backing_idx);
- if (abbreviated || !disk->info.alias || + /* FIXME: qemu gives information on backing files, but we aren't + * currently storing it into the stats table - we need a common + * key in qemu_monitor_json.c:qemuMonitorGetAllBlockStatsInfo and + * here for getting at that information, probably something like + * asprintf("%s.%d", alias, backing_idx). */
Breaks syntax-check: src/qemu/qemu_driver.c:18577: * asprintf("%s.%d", alias, backing_idx). */ maint.mk: use virAsprintf, not asprintf
+ if (abbreviated || backing_idx || !disk->info.alias || !(entry = virHashLookup(stats, disk->info.alias))) { if (qemuStorageLimitsRefresh(driver, cfg, dom, disk, src, NULL, NULL) < 0)
Rest of review will follow in-order. Peter

On 12/06/14 09:14, Eric Blake wrote:
Wire up backing chain recursion. Note that for now, we just use the same allocation numbers for read-only backing files as what offline domains would report. It is not the correct allocation number for qcow2 over block devices during block-commit, and it misses out on the fact that qemu also reports read statistics on backing files that are worth knowing about (seriously - for a thin-provisioned setup, it would be nice to easily get at a count of how many reads were serviced from the backing file in relation to reads serviced by the active layer). But it is at least sufficient to prove that the algorithm is working, and to let other people start coding to the interface while waiting for later patches that get the correct information.
Is that a proof-of-concept then? I'd like to avoid adding stats fields that will change, especially this close to the release as we're running into the risk of baking it in.
For a running domain, where one of the two images has a backing file, I see the traditional output:
$ virsh domstats --block testvm2 Domain: 'testvm2' block.count=2 block.0.name=vda block.0.path=/tmp/wrapper.qcow2 block.0.rd.reqs=1 block.0.rd.bytes=512 block.0.rd.times=28858 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.capacity=1310720000 block.0.physical=200704 block.1.name=vdb block.1.path=/dev/sda7 block.1.rd.reqs=0 block.1.rd.bytes=0 block.1.rd.times=0 block.1.wr.reqs=0 block.1.wr.bytes=0 block.1.wr.times=0 block.1.fl.reqs=0 block.1.fl.times=0 block.1.allocation=0 block.1.capacity=1310720000
vs. the new output:
$ virsh domstats --block --backing testvm2 Domain: 'testvm2' block.count=3 block.0.name=vda block.0.path=/tmp/wrapper.qcow2 block.0.rd.reqs=1 block.0.rd.bytes=512 block.0.rd.times=28858 block.0.wr.reqs=0 block.0.wr.bytes=0 block.0.wr.times=0 block.0.fl.reqs=0 block.0.fl.times=0 block.0.allocation=0 block.0.capacity=1310720000 block.0.physical=200704 block.1.name=vda block.1.path=/dev/sda6 block.1.backingIndex=1 block.1.allocation=1073741824 block.1.capacity=1310720000 block.1.physical=1073741824 block.2.name=vdb block.2.path=/dev/sda7 block.2.rd.reqs=0 block.2.rd.bytes=0 block.2.rd.times=0 block.2.wr.reqs=0 block.2.wr.bytes=0 block.2.wr.times=0 block.2.fl.reqs=0 block.2.fl.times=0 block.2.allocation=0 block.2.capacity=1310720000
ACK to the idea of the stat output idea ...
* src/qemu/qemu_driver.c (QEMU_DOMAIN_STATS_BACKING): New internal enum bit. (qemuConnectGetAllDomainStats): Recognize new user flag, and pass details to... (qemuDomainGetStatsBlock): ...here, where we can do longer recursion. (qemuDomainGetStatsOneBlock): Output new field.
Signed-off-by: Eric Blake <eblake@redhat.com> --- src/qemu/qemu_driver.c | 55 +++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 46 insertions(+), 9 deletions(-)
.. but we shouldn't add code that reports wrong data. Peter
participants (3)
-
Eric Blake
-
John Ferlan
-
Peter Krempa