[libvirt] Improvements to the RBD storage backend

Hi, After review by John Ferlan I'm sending a revised version of my earlier patch which adds the fast-diff feature for RBD volumes. Before the patch there is one small improvement which is later used by the fast-diff feature. The other is a bugfix where the #ifdef in the code wasn't requiring the right version of librbd. Wido

As more and more features are added to RBD volumes we will need to call this method more often. By moving it into a internal function we can re-use code inside the storage backend. Signed-off-by: Wido den Hollander <wido@widodh.nl> --- src/storage/storage_backend_rbd.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c index 5d73370..5f2469f 100644 --- a/src/storage/storage_backend_rbd.c +++ b/src/storage/storage_backend_rbd.c @@ -280,6 +280,24 @@ virStorageBackendRBDCloseRADOSConn(virStorageBackendRBDStatePtr ptr) } static int +volStorageBackendRBDGetFeatures(rbd_image_t image, + const char *volname, + uint64_t *features) +{ + int r, ret = -1; + + if ((r = rbd_get_features(image, features)) < 0) { + virReportSystemError(-r, _("failed to get the features of RBD image " + "%s"), volname); + goto cleanup; + } + ret = 0; + + cleanup: + return ret; +} + +static int volStorageBackendRBDRefreshVolInfo(virStorageVolDefPtr vol, virStoragePoolObjPtr pool, virStorageBackendRBDStatePtr ptr) @@ -685,11 +703,8 @@ virStorageBackendRBDImageInfo(rbd_image_t image, goto cleanup; } - if ((r = rbd_get_features(image, features)) < 0) { - virReportSystemError(-r, _("failed to get the features of RBD image %s"), - volname); + if (volStorageBackendRBDGetFeatures(image, volname, features) < 0) goto cleanup; - } if ((r = rbd_get_stripe_unit(image, stripe_unit)) < 0) { virReportSystemError(-r, _("failed to get the stripe unit of RBD image %s"), -- 1.9.1

In commit 0b15f920 there is a #ifdef which requires LIBRBD_VERSION_CODE 266 or newer for rbd_diff_iterate2() rbd_diff_iterate2() is available since 266, so this if-statement should require anything newer than 265. Signed-off-by: Wido den Hollander <wido@widodh.nl> --- src/storage/storage_backend_rbd.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c index 5f2469f..887db0b 100644 --- a/src/storage/storage_backend_rbd.c +++ b/src/storage/storage_backend_rbd.c @@ -792,8 +792,10 @@ virStorageBackendRBDSnapshotFindNoDiff(rbd_image_t image, * rbd_diff_iterate2() is available in versions above Ceph 0.94 (Hammer) * It uses a object map inside Ceph which is faster than rbd_diff_iterate() * which iterates all objects. + * LIBRBD_VERSION_CODE for Ceph 0.94 is 265. In 266 and upwards diff_iterate2 + * is available */ -#if LIBRBD_VERSION_CODE > 266 +#if LIBRBD_VERSION_CODE > 265 r = rbd_diff_iterate2(image, snaps[i].name, 0, info.size, 0, 1, virStorageBackendRBDIterateCb, (void *)&diff); #else -- 1.9.1

Since Ceph version Infernalis (9.2.0) the new fast-diff mechanism of RBD allows for querying actual volume usage. Prior to this version there was no easy and fast way to query how much allocation a RBD volume had inside a Ceph cluster. To use the fast-diff feature it needs to be enabled per RBD image and is only supported by Ceph cluster running version Infernalis (9.2.0) or newer. Without the fast-diff feature enabled libvirt will report an allocation identical to the image capacity. This is how libvirt behaves currently. 'virsh vol-info rbd/image2' might output for example: Name: image2 Type: network Capacity: 1,00 GiB Allocation: 124,00 MiB Newly created volumes will have the fast-diff feature enabled if the backing Ceph cluster supports it. Signed-off-by: Wido den Hollander <wido@widodh.nl> --- src/storage/storage_backend_rbd.c | 84 +++++++++++++++++++++++++++++++++++++-- 1 file changed, 81 insertions(+), 3 deletions(-) diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c index 887db0b..c00f6b2 100644 --- a/src/storage/storage_backend_rbd.c +++ b/src/storage/storage_backend_rbd.c @@ -297,6 +297,68 @@ volStorageBackendRBDGetFeatures(rbd_image_t image, return ret; } +#if LIBRBD_VERSION_CODE > 265 +static bool +volStorageBackendRBDUseFastDiff(uint64_t features) +{ + return features & RBD_FEATURE_FAST_DIFF; +} + +static int +virStorageBackendRBDRefreshVolInfoCb(uint64_t offset ATTRIBUTE_UNUSED, + size_t len, + int exists, + void *arg) +{ + uint64_t *used_size = (uint64_t *)(arg); + if (exists) + (*used_size) += len; + + return 0; +} + +static int +virStorageBackendRBDSetAllocation(virStorageVolDefPtr vol, + rbd_image_t *image, + rbd_image_info_t *info) +{ + int r, ret = -1; + uint64_t allocation = 0; + + if ((r = rbd_diff_iterate2(image, NULL, 0, info->size, 0, 1, + &virStorageBackendRBDRefreshVolInfoCb, + &allocation)) < 0) { + virReportSystemError(-r, _("failed to iterate RBD image '%s'"), + vol->name); + goto cleanup; + } + + VIR_DEBUG("Found %zu bytes allocated for RBD image %s", + allocation, vol->name); + + vol->target.allocation = allocation; + ret = 0; + + cleanup: + return ret; +} + +#else +static int +volStorageBackendRBDUseFastDiff(uint64_t features ATTRIBUTE_UNUSED) +{ + return false; +} + +static int +virStorageBackendRBDSetAllocation(virStorageVolDefPtr vol ATTRIBUTE_UNUSED, + rbd_image_t *image ATTRIBUTE_UNUSED, + rbd_image_info_t *info ATTRIBUTE_UNUSED) +{ + return false; +} +#endif + static int volStorageBackendRBDRefreshVolInfo(virStorageVolDefPtr vol, virStoragePoolObjPtr pool, @@ -306,6 +368,7 @@ volStorageBackendRBDRefreshVolInfo(virStorageVolDefPtr vol, int r = 0; rbd_image_t image = NULL; rbd_image_info_t info; + uint64_t features; if ((r = rbd_open_read_only(ptr->ioctx, vol->name, &image, NULL)) < 0) { ret = -r; @@ -321,15 +384,30 @@ volStorageBackendRBDRefreshVolInfo(virStorageVolDefPtr vol, goto cleanup; } - VIR_DEBUG("Refreshed RBD image %s/%s (size: %zu obj_size: %zu num_objs: %zu)", - pool->def->source.name, vol->name, info.size, info.obj_size, - info.num_objs); + if (volStorageBackendRBDGetFeatures(image, vol->name, &features) < 0) + goto cleanup; vol->target.capacity = info.size; vol->target.allocation = info.obj_size * info.num_objs; vol->type = VIR_STORAGE_VOL_NETWORK; vol->target.format = VIR_STORAGE_FILE_RAW; + if (volStorageBackendRBDUseFastDiff(features)) { + VIR_DEBUG("RBD image %s/%s has fast-diff feature enabled. " + "Querying for actual allocation", + pool->def->source.name, vol->name); + + if (virStorageBackendRBDSetAllocation(vol, image, &info) < 0) + goto cleanup; + } else { + vol->target.allocation = info.obj_size * info.num_objs; + } + + VIR_DEBUG("Refreshed RBD image %s/%s (capacity: %llu allocation: %llu " + "obj_size: %zu num_objs: %zu)", + pool->def->source.name, vol->name, vol->target.capacity, + vol->target.allocation, info.obj_size, info.num_objs); + VIR_FREE(vol->target.path); if (virAsprintf(&vol->target.path, "%s/%s", pool->def->source.name, -- 1.9.1

On 02/11/2016 11:04 AM, Wido den Hollander wrote:
Hi,
After review by John Ferlan I'm sending a revised version of my earlier patch which adds the fast-diff feature for RBD volumes.
Before the patch there is one small improvement which is later used by the fast-diff feature.
The other is a bugfix where the #ifdef in the code wasn't requiring the right version of librbd.
Wido
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Made one small adjustment to patch 3 to not set allocation twice (once before the if and once inside the else). It's only set in the else now. ACK series and pushed. John
participants (2)
-
John Ferlan
-
Wido den Hollander