[libvirt] [PATCH] rbd: Use rbd_create3 to create RBD format 2 images by default

This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit of the new RBD format.

Older versions of libvirt can work with this new RBD format as long as librbd supports format 2, something that all recent versions of librbd do.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
---
 src/storage/storage_backend_rbd.c |   23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c
index 4b6f18c..f3dd7a0 100644
--- a/src/storage/storage_backend_rbd.c
+++ b/src/storage/storage_backend_rbd.c
@@ -435,6 +435,26 @@ cleanup:
     return ret;
 }
 
+static int virStorageBackendRBDCreateImage(rados_ioctx_t io,
+                                           char *name, long capacity)
+{
+    int order = 0;
+    #if LIBRBD_VERSION_CODE > 260
+    uint64_t features = 3;
+    uint64_t stripe_count = 1;
+    uint64_t stripe_unit = 4194304;
+
+    if (rbd_create3(io, name, capacity, features, &order,
+                    stripe_count, stripe_unit) < 0) {
+    #else
+    if (rbd_create(io, name, capacity, &order) < 0) {
+    #endif
+        return -1;
+    }
+
+    return 0;
+}
+
 static int virStorageBackendRBDCreateVol(virConnectPtr conn,
                                          virStoragePoolObjPtr pool,
                                          virStorageVolDefPtr vol)
@@ -442,7 +462,6 @@ static int virStorageBackendRBDCreateVol(virConnectPtr conn,
     virStorageBackendRBDState ptr;
     ptr.cluster = NULL;
     ptr.ioctx = NULL;
-    int order = 0;
     int ret = -1;
 
     VIR_DEBUG("Creating RBD image %s/%s with size %llu",
@@ -467,7 +486,7 @@ static int virStorageBackendRBDCreateVol(virConnectPtr conn,
         goto cleanup;
     }
 
-    if (rbd_create(ptr.ioctx, vol->name, vol->capacity, &order) < 0) {
+    if (virStorageBackendRBDCreateImage(ptr.ioctx, vol->name, vol->capacity) < 0) {
         virReportError(VIR_ERR_INTERNAL_ERROR,
                        _("failed to create volume '%s/%s'"),
                        pool->def->source.name,
-- 
1.7.9.5

On 12/11/2013 03:47 PM, Wido den Hollander wrote:
> This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit of the new RBD format.
>
> Older versions of libvirt can work with this new RBD format as long as librbd supports format 2, something that all recent versions of librbd do.

Did somebody get a look at this patch yet? It's a fairly simple one, but would be great for the Ceph project!

Wido

On Thu, Jan 2, 2014 at 10:01 AM, Wido den Hollander <wido@widodh.nl> wrote:
> On 12/11/2013 03:47 PM, Wido den Hollander wrote:
> > This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit of the new RBD format.
> >
> > Older versions of libvirt can work with this new RBD format as long as librbd supports format 2, something that all recent versions of librbd do.
>
> Did somebody get a look at this patch yet? It's a fairly simple one, but would be great for the Ceph project!

So I'm thinking we might want to do something similar to the compat attribute added for qcow2 for the versioning. That way things are explicitly stated rather than implicitly switching versions on us.

-- 
Doug Goldstein

On 01/02/2014 08:42 PM, Doug Goldstein wrote:
> On Thu, Jan 2, 2014 at 10:01 AM, Wido den Hollander <wido@widodh.nl> wrote:
> > On 12/11/2013 03:47 PM, Wido den Hollander wrote:
> > > This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit of the new RBD format.
> > >
> > > Older versions of libvirt can work with this new RBD format as long as librbd supports format 2, something that all recent versions of librbd do.
> >
> > Did somebody get a look at this patch yet? It's a fairly simple one, but would be great for the Ceph project!
>
> So I'm thinking we might want to do something similar with the compat attribute added for qcow2 for the versioning that way things are explicitly stated rather than implicitly switching versions on us.

The problem is that we are linking against a library at compile time here, not executing a binary to create the image.

The Ceph project is still very new, and almost nobody is actually using format 1 images anymore. I wouldn't recommend that anybody create format 1 images now. They still work just fine, but we don't want to encourage people to keep using them.

Wido
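
One way to picture the compile-time point: because the patch selects the creation call with `#if`, the image format is fixed when libvirt is built against a given librbd rather than chosen per volume at run time. The following is only a sketch under stated assumptions. Every `DEMO_`/`demo_` name, including the default version code of 261, is a hypothetical stand-in so the snippet builds without librbd; none of it is librbd API.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for LIBRBD_VERSION_CODE. A real build would get
 * this value from the librbd header it is compiled against. */
#ifndef DEMO_LIBRBD_VERSION_CODE
# define DEMO_LIBRBD_VERSION_CODE 261
#endif

/* The format is decided by the preprocessor, once, at build time. */
#if DEMO_LIBRBD_VERSION_CODE > 260
# define DEMO_IMAGE_FORMAT 2
#else
# define DEMO_IMAGE_FORMAT 1
#endif

/* Pretend to create an image and return the format that this build uses.
 * Only one of the two branches exists in the compiled binary. */
int demo_create_image(const char *name, uint64_t capacity)
{
#if DEMO_LIBRBD_VERSION_CODE > 260
    /* Newer library: a feature-aware call such as rbd_create3() goes here. */
    printf("creating '%s' (%llu bytes) as format 2\n",
           name, (unsigned long long)capacity);
#else
    /* Older library: only the plain rbd_create() call is available. */
    printf("creating '%s' (%llu bytes) as format 1\n",
           name, (unsigned long long)capacity);
#endif
    return DEMO_IMAGE_FORMAT;
}
```

Rebuilding the same source with `-DDEMO_LIBRBD_VERSION_CODE=260` flips the branch, which is the implicit switch Doug's question is about: nothing in the volume definition says which format will be produced.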

On 12/11/2013 03:47 PM, Wido den Hollander wrote:
> This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit of the new RBD format.
>
> Older versions of libvirt can work with this new RBD format as long as librbd supports format 2, something that all recent versions of librbd do.

Could somebody take a look at this patch? We would really like to see this in libvirt so it can make its way into the Linux distributions.

Wido

On 12/11/2013 03:47 PM, Wido den Hollander wrote:
> This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit of the new RBD format.
>
> Older versions of libvirt can work with this new RBD format as long as librbd supports format 2, something that all recent versions of librbd do.

How recent? It might be nicer to mention the version number.

Also, the patch no longer applies.

> Signed-off-by: Wido den Hollander <wido@widodh.nl>
> ---
>  src/storage/storage_backend_rbd.c |   23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c
> index 4b6f18c..f3dd7a0 100644
> --- a/src/storage/storage_backend_rbd.c
> +++ b/src/storage/storage_backend_rbd.c
> @@ -435,6 +435,26 @@ cleanup:
>      return ret;
>  }
>
> +static int virStorageBackendRBDCreateImage(rados_ioctx_t io,
> +                                           char *name, long capacity)
> +{
> +    int order = 0;
> +    #if LIBRBD_VERSION_CODE > 260

This will fail 'make syntax-check' as it's not indented properly, see:
http://libvirt.org/hacking.html#preprocessor

It would also be easier to read if compared against LIBRBD_VERSION(0, 1, x), instead of 260.

> +    uint64_t features = 3;
> +    uint64_t stripe_count = 1;
> +    uint64_t stripe_unit = 4194304;
> +

Can these numbers be represented by more descriptive constants from librbd header files?

Jan
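
To make the bare 260 easier to read, here is a minimal sketch of how a packed version code can be built and taken apart. It assumes librbd's `LIBRBD_VERSION(maj, min, extra)` packs the version as `(maj << 16) + (min << 8) + extra`; the `DEMO_` macros and `demo_print_version` are local stand-ins written for this illustration, not librbd definitions.

```c
#include <stdio.h>

/* Local stand-in for LIBRBD_VERSION(maj, min, extra).
 * Assumption: one byte each for minor and extra, major above them. */
#define DEMO_LIBRBD_VERSION(maj, min, extra) \
    (((maj) << 16) + ((min) << 8) + (extra))

/* Inverse helpers: pull the three components back out of a packed code. */
#define DEMO_VERSION_MAJOR(code) (((code) >> 16) & 0xff)
#define DEMO_VERSION_MINOR(code) (((code) >> 8) & 0xff)
#define DEMO_VERSION_EXTRA(code) ((code) & 0xff)

/* The literal used in the patch's #if. */
#define DEMO_PATCH_THRESHOLD 260

/* Print a packed version code in dotted form. */
void demo_print_version(int code)
{
    printf("%d == %d.%d.%d\n", code,
           DEMO_VERSION_MAJOR(code),
           DEMO_VERSION_MINOR(code),
           DEMO_VERSION_EXTRA(code));
}
```

If that packing holds, 260 decodes to 0.1.4, so the condition in the patch could be spelled `#if LIBRBD_VERSION_CODE > LIBRBD_VERSION(0, 1, 4)`, which states the requirement directly.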

This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit from the new RBD format.

Older versions of libvirt can work with this new RBD format as long as librbd supports format 2. RBD format is supported by librbd since version 0.56 (Ceph Bobtail).

Signed-off-by: Wido den Hollander <wido@widodh.nl>
---
 src/storage/storage_backend_rbd.c |   22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c
index c5f0bc5..91c07ac 100644
--- a/src/storage/storage_backend_rbd.c
+++ b/src/storage/storage_backend_rbd.c
@@ -458,6 +458,25 @@ virStorageBackendRBDCreateVol(virConnectPtr conn ATTRIBUTE_UNUSED,
     return 0;
 }
 
+static int virStorageBackendRBDCreateImage(rados_ioctx_t io,
+                                           char *name, long capacity)
+{
+    int order = 0;
+    #if LIBRBD_VERSION_CODE > 260
+    uint64_t features = 3;
+    uint64_t stripe_count = 1;
+    uint64_t stripe_unit = 4194304;
+
+    if (rbd_create3(io, name, capacity, features, &order,
+                    stripe_count, stripe_unit) < 0) {
+    #else
+    if (rbd_create(io, name, capacity, &order) < 0) {
+    #endif
+        return -1;
+    }
+
+    return 0;
+}
 
 static int
 virStorageBackendRBDBuildVol(virConnectPtr conn,
@@ -468,7 +487,6 @@ virStorageBackendRBDBuildVol(virConnectPtr conn,
     virStorageBackendRBDState ptr;
     ptr.cluster = NULL;
     ptr.ioctx = NULL;
-    int order = 0;
     int ret = -1;
 
     VIR_DEBUG("Creating RBD image %s/%s with size %llu",
@@ -494,7 +512,7 @@ virStorageBackendRBDBuildVol(virConnectPtr conn,
         goto cleanup;
     }
 
-    if (rbd_create(ptr.ioctx, vol->name, vol->capacity, &order) < 0) {
+    if (virStorageBackendRBDCreateImage(ptr.ioctx, vol->name, vol->capacity) < 0) {
         virReportError(VIR_ERR_INTERNAL_ERROR,
                        _("failed to create volume '%s/%s'"),
                        pool->def->source.name,
-- 
1.7.9.5

On Thu, Jan 30, 2014 at 03:19:11PM +0100, Wido den Hollander wrote:
> This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit from the new RBD format.
>
> Older versions of libvirt can work with this new RBD format as long as librbd supports format 2. RBD format is supported by librbd since version 0.56 (Ceph Bobtail).
>
> Signed-off-by: Wido den Hollander <wido@widodh.nl>
> ---
>  src/storage/storage_backend_rbd.c |   22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c
> index c5f0bc5..91c07ac 100644
> --- a/src/storage/storage_backend_rbd.c
> +++ b/src/storage/storage_backend_rbd.c
> @@ -458,6 +458,25 @@ virStorageBackendRBDCreateVol(virConnectPtr conn ATTRIBUTE_UNUSED,
>      return 0;
>  }
>
> +static int virStorageBackendRBDCreateImage(rados_ioctx_t io,
> +                                           char *name, long capacity)
> +{
> +    int order = 0;
> +    #if LIBRBD_VERSION_CODE > 260
> +    uint64_t features = 3;
> +    uint64_t stripe_count = 1;
> +    uint64_t stripe_unit = 4194304;
> +
> +    if (rbd_create3(io, name, capacity, features, &order,
> +                    stripe_count, stripe_unit) < 0) {
> +    #else
> +    if (rbd_create(io, name, capacity, &order) < 0) {
> +    #endif

The '#if' indentation violates style rules - please remember to run 'make syntax-check' before submitting.

ACK and I've pushed with the indentation fix

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 01/30/2014 02:02 PM, Ján Tomko wrote:
> On 12/11/2013 03:47 PM, Wido den Hollander wrote:
> > This new RBD format supports snapshotting and cloning. By having libvirt create images in format 2 end-users of the created images can benefit of the new RBD format.
> >
> > Older versions of libvirt can work with this new RBD format as long as librbd supports format 2, something that all recent versions of librbd do.
>
> How recent? It might be nicer to mention the version number.
>
> Also, the patch no longer applies.

I sent a revised version of the patch to the list last week. The commit message now shows the librbd versions required and it also applies to master again.

Could you take a look at it again? It would really help the Ceph project.

Thanks a lot!

Wido

> > Signed-off-by: Wido den Hollander <wido@widodh.nl>
> > ---
> >  src/storage/storage_backend_rbd.c |   23 +++++++++++++++++++++--
> >  1 file changed, 21 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c
> > index 4b6f18c..f3dd7a0 100644
> > --- a/src/storage/storage_backend_rbd.c
> > +++ b/src/storage/storage_backend_rbd.c
> > @@ -435,6 +435,26 @@ cleanup:
> >      return ret;
> >  }
> >
> > +static int virStorageBackendRBDCreateImage(rados_ioctx_t io,
> > +                                           char *name, long capacity)
> > +{
> > +    int order = 0;
> > +    #if LIBRBD_VERSION_CODE > 260
>
> This will fail 'make syntax-check' as it's not indented properly, see:
> http://libvirt.org/hacking.html#preprocessor
>
> It would also be easier to read if compared against LIBRBD_VERSION(0, 1, x), instead of 260.
>
> > +    uint64_t features = 3;
> > +    uint64_t stripe_count = 1;
> > +    uint64_t stripe_unit = 4194304;
> > +
>
> Can these numbers be represented by more descriptive constants from librbd header files?
>
> Jan
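
As a sketch of what descriptive constants for those numbers could look like: the value 3 is two feature bits OR-ed together, and 4194304 is 4 MiB. The `DEMO_` names below are local stand-ins defined here so the snippet compiles on its own. They are assumed to mirror the layering (bit 0) and striping v2 (bit 1) feature bits from Ceph's RBD headers, which should be checked against the installed headers before relying on them.

```c
#include <stdint.h>

/* Assumed feature bits, written as local stand-ins rather than taken
 * from librbd: layering is what enables cloning, striping v2 enables
 * custom stripe unit/count. */
#define DEMO_RBD_FEATURE_LAYERING   (UINT64_C(1) << 0)
#define DEMO_RBD_FEATURE_STRIPINGV2 (UINT64_C(1) << 1)

/* The patch's "features = 3" expressed as named bits. */
#define DEMO_DEFAULT_FEATURES \
    (DEMO_RBD_FEATURE_LAYERING | DEMO_RBD_FEATURE_STRIPINGV2)

/* The patch's "stripe_unit = 4194304" expressed as 4 MiB. */
#define DEMO_STRIPE_UNIT_BYTES (UINT64_C(4) * 1024 * 1024)

/* The patch's "stripe_count = 1": every object holds a single stripe. */
#define DEMO_STRIPE_COUNT UINT64_C(1)
```

With names like these, the helper's locals would read `uint64_t features = DEMO_DEFAULT_FEATURES;` and `uint64_t stripe_unit = DEMO_STRIPE_UNIT_BYTES;`, which makes the intent visible without changing the values passed to `rbd_create3`.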
participants (4): Daniel P. Berrange, Doug Goldstein, Ján Tomko, Wido den Hollander