[libvirt] [PATCH 0/2] [V3] Libvirt : Storage fixes

Here is V3 of storage fixes, which addresses review comments of v2.

Summary:

Prerna Saxena (2):
  Storage: Introduce shadow vol for refresh while the main vol builds.
  Storage : Fix cloning of raw, sparse volumes

 src/storage/storage_backend.c | 23 ++++++++++-------------
 src/storage/storage_driver.c  | 19 ++++++++++++-------
 2 files changed, 22 insertions(+), 20 deletions(-)

V2 : http://www.redhat.com/archives/libvir-list/2015-June/msg01052.html

Changelog: Modified patches 1,2 per review comments.

-- 
1.8.3.1

-- 
Prerna Saxena
Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India

Libvirt periodically refreshes all volumes in a storage pool, including
the volumes being cloned.
While cloning a storage volume from parent, we drop pool locks. Subsequent
volume refresh sometimes changes allocation for an ongoing copy, and leads
to corrupt images.
Fix: Introduce a shadow volume that isolates the volume object under refresh
from the base which has a copy ongoing.

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
---
 src/storage/storage_driver.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 57060ab..41de8df 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1898,7 +1898,7 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj,
 {
     virStoragePoolObjPtr pool, origpool = NULL;
     virStorageBackendPtr backend;
-    virStorageVolDefPtr origvol = NULL, newvol = NULL;
+    virStorageVolDefPtr origvol = NULL, newvol = NULL, shadowvol = NULL;
     virStorageVolPtr ret = NULL, volobj = NULL;
     unsigned long long allocation;
     int buildret;
@@ -2010,6 +2010,15 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj,
     if (backend->createVol(obj->conn, pool, newvol) < 0)
         goto cleanup;
 
+    /* Make a shallow copy of the 'defined' volume definition, since the
+     * original allocation value will change as the user polls 'info',
+     * but we only need the initial requested values
+     */
+    if (VIR_ALLOC(shadowvol) < 0)
+        goto cleanup;
+
+    memcpy(shadowvol, newvol, sizeof(*newvol));
+
     pool->volumes.objs[pool->volumes.count++] = newvol;
     volobj = virGetStorageVol(obj->conn, pool->def->name, newvol->name,
                               newvol->key, NULL, NULL);
@@ -2029,7 +2038,7 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj,
         virStoragePoolObjUnlock(origpool);
     }
 
-    buildret = backend->buildVolFrom(obj->conn, pool, newvol, origvol, flags);
+    buildret = backend->buildVolFrom(obj->conn, pool, shadowvol, origvol, flags);
 
     storageDriverLock();
     virStoragePoolObjLock(pool);
@@ -2071,6 +2080,7 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj,
  cleanup:
     virObjectUnref(volobj);
     virStorageVolDefFree(newvol);
+    VIR_FREE(shadowvol);
     if (pool)
         virStoragePoolObjUnlock(pool);
     if (origpool)
-- 
1.8.3.1

-- 
Prerna Saxena
Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
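The shadow-volume idea in the patch above can be illustrated outside libvirt with a small, self-contained sketch. Everything in it is hypothetical: `struct vol_def` is a simplified stand-in for the real volume definition, and `vol_refresh()` only models a pool refresh overwriting the allocation. It is meant to show why the build step can keep seeing the initially requested allocation, and why a shallow copy is released with a plain free rather than a deep free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a volume definition. */
struct vol_def {
    char *name;                    /* owned by the original definition only */
    unsigned long long capacity;
    unsigned long long allocation;
};

/* Shallow copy: scalar fields are duplicated, pointer members are shared. */
static struct vol_def *vol_shadow_new(const struct vol_def *orig)
{
    struct vol_def *shadow = malloc(sizeof(*shadow));

    if (!shadow)
        return NULL;
    memcpy(shadow, orig, sizeof(*shadow));
    return shadow;
}

/* Models a pool refresh that overwrites allocation with on-disk usage. */
static void vol_refresh(struct vol_def *vol, unsigned long long on_disk)
{
    vol->allocation = on_disk;
}

/* Free only the struct itself; 'name' still belongs to the original. */
static void vol_shadow_free(struct vol_def *shadow)
{
    free(shadow);
}

/* Returns the allocation a build step would see if it used the shadow. */
unsigned long long demo_allocation_seen_by_build(void)
{
    char name[] = "clone.img";
    struct vol_def newvol = { name, 10ULL << 30, 1ULL << 30 };
    struct vol_def *shadow = vol_shadow_new(&newvol);
    unsigned long long seen;

    if (!shadow)
        return 0;

    vol_refresh(&newvol, 4096);          /* refresh arrives mid-copy */
    assert(shadow->name == newvol.name); /* pointer is shared, not copied */

    seen = shadow->allocation;           /* still the requested value */
    vol_shadow_free(shadow);
    return seen;
}
```

Calling `demo_allocation_seen_by_build()` returns the originally requested 1 GiB even though the refreshed definition now reports 4096 bytes. Freeing the shadow with a deep free would release `name` a second time, which is presumably why the patch uses `VIR_FREE` instead of `virStorageVolDefFree` for `shadowvol`.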

On Fri, Jun 26, 2015 at 05:05:11PM +0530, Prerna Saxena wrote:
> Libvirt periodically refreshes all volumes in a storage pool, including
> the volumes being cloned.
> While cloning a storage volume from parent, we drop pool locks. Subsequent
> volume refresh sometimes changes allocation for an ongoing copy, and leads
> to corrupt images.
> Fix: Introduce a shadow volume that isolates the volume object under refresh
> from the base which has a copy ongoing.
>
> Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
> ---
>  src/storage/storage_driver.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)

ACK

> diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
> index 57060ab..41de8df 100644
> --- a/src/storage/storage_driver.c
> +++ b/src/storage/storage_driver.c
> @@ -1898,7 +1898,7 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj,
>  {
>      virStoragePoolObjPtr pool, origpool = NULL;
>      virStorageBackendPtr backend;
> -    virStorageVolDefPtr origvol = NULL, newvol = NULL;
> +    virStorageVolDefPtr origvol = NULL, newvol = NULL, shadowvol = NULL;
>      virStorageVolPtr ret = NULL, volobj = NULL;
>      unsigned long long allocation;
>      int buildret;
> @@ -2010,6 +2010,15 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj,
>      if (backend->createVol(obj->conn, pool, newvol) < 0)
>          goto cleanup;
>
> +    /* Make a shallow copy of the 'defined' volume definition, since the
> +     * original allocation value will change as the user polls 'info',
> +     * but we only need the initial requested values
> +     */
> +    if (VIR_ALLOC(shadowvol) < 0)
> +        goto cleanup;
> +
> +    memcpy(shadowvol, newvol, sizeof(*newvol));
> +
>      pool->volumes.objs[pool->volumes.count++] = newvol;
>      volobj = virGetStorageVol(obj->conn, pool->def->name, newvol->name,
>                                newvol->key, NULL, NULL);
> @@ -2029,7 +2038,7 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj,
>          virStoragePoolObjUnlock(origpool);
>      }
>
> -    buildret = backend->buildVolFrom(obj->conn, pool, newvol, origvol, flags);
> +    buildret = backend->buildVolFrom(obj->conn, pool, shadowvol, origvol, flags);

newvol->target.allocation should also use the value from shadowvol.
I have made the adjustment and pushed the patch.

Jan

On Tuesday 30 June 2015 06:20 PM, Ján Tomko wrote:
> newvol->target.allocation should also use the value from shadowvol.
> I have made the adjustment and pushed the patch.
>
> Jan

Hi Jan,
Thanks. Could you also take a look at the second patch in this series?

Regards,
-- 
Prerna Saxena
Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India

When virsh vol-clone is attempted on a raw file where capacity > allocation,
the resulting cloned volume has a size that matches the virtual-size of
the parent; in place of matching its actual, disk size.
This patch fixes the cloned disk to have same _allocated_size_ as
the parent file from which it was cloned.

Ref: http://www.redhat.com/archives/libvir-list/2015-May/msg00050.html

Also fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1130739

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
---
 src/storage/storage_backend.c | 23 ++++++++++-------------
 src/storage/storage_driver.c  |  5 -----
 2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c
index ce59f63..492b344 100644
--- a/src/storage/storage_backend.c
+++ b/src/storage/storage_backend.c
@@ -342,7 +342,7 @@ virStorageBackendCreateBlockFrom(virConnectPtr conn ATTRIBUTE_UNUSED,
         goto cleanup;
     }
 
-    remain = vol->target.allocation;
+    remain = vol->target.capacity;
 
     if (inputvol) {
         int res = virStorageBackendCopyToFD(vol, inputvol,
@@ -397,7 +397,8 @@ createRawFile(int fd, virStorageVolDefPtr vol,
               virStorageVolDefPtr inputvol,
               bool reflink_copy)
 {
-    bool need_alloc = true;
+    bool want_sparse = (inputvol &&
+                       (inputvol->target.capacity > vol->target.allocation));
     int ret = 0;
     unsigned long long remain;
 
@@ -420,10 +421,9 @@ createRawFile(int fd, virStorageVolDefPtr vol,
      * to writing zeroes block by block in case fallocate isn't
      * available, and since we're going to copy data from another
      * file it doesn't make sense to write the file twice. */
-    if (vol->target.allocation) {
-        if (fallocate(fd, 0, 0, vol->target.allocation) == 0) {
-            need_alloc = false;
-        } else if (errno != ENOSYS && errno != EOPNOTSUPP) {
+    if (vol->target.allocation && !want_sparse) {
+        if ((fallocate(fd, 0, 0, vol->target.allocation) != 0) &&
+            (errno != ENOSYS && errno != EOPNOTSUPP)) {
             ret = -errno;
             virReportSystemError(errno,
                                  _("cannot allocate %llu bytes in file '%s'"),
@@ -433,22 +433,19 @@ createRawFile(int fd, virStorageVolDefPtr vol,
     }
 #endif
 
-    remain = vol->target.allocation;
+    remain = vol->target.capacity;
 
     if (inputvol) {
         /* allow zero blocks to be skipped if we've requested sparse
-         * allocation (allocation < capacity) or we have already
-         * been able to allocate the required space. */
-        bool want_sparse = !need_alloc ||
-            (vol->target.allocation < inputvol->target.capacity);
-
+         * allocation (allocation < capacity)
+         */
         ret = virStorageBackendCopyToFD(vol, inputvol, fd, &remain,
                                         want_sparse, reflink_copy);
         if (ret < 0)
             goto cleanup;
     }
 
-    if (remain && need_alloc) {
+    if (remain && !want_sparse) {
         if (safezero(fd, vol->target.allocation - remain, remain) < 0) {
             ret = -errno;
             virReportSystemError(errno, _("cannot fill file '%s'"),
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 41de8df..b8ed006 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1976,11 +1976,6 @@ storageVolCreateXMLFrom(virStoragePoolPtr obj,
     if (newvol->target.capacity < origvol->target.capacity)
         newvol->target.capacity = origvol->target.capacity;
 
-    /* Make sure allocation is at least as large as the destination cap,
-     * to make absolutely sure we copy all possible contents */
-    if (newvol->target.allocation < origvol->target.capacity)
-        newvol->target.allocation = origvol->target.capacity;
-
     if (!backend->buildVolFrom) {
         virReportError(VIR_ERR_NO_SUPPORT,
                        "%s", _("storage pool does not support"
-- 
1.8.3.1

-- 
Prerna Saxena
Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
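For readers less familiar with the capacity/allocation distinction this patch relies on, the following is a minimal POSIX sketch, not libvirt code. It creates a sparse raw file and reports its logical size next to the space actually allocated on disk. The helper name `sparse_raw_usage()`, the `struct raw_usage` type and the temporary path are hypothetical, and the sketch assumes `st_blocks` is counted in 512-byte units, as it is on Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type: the two sizes discussed in the patch. */
struct raw_usage {
    long long capacity;    /* logical size, from st_size */
    long long allocation;  /* on-disk usage, from st_blocks (assumed 512-byte units) */
};

/* Create a sparse raw file of 'capacity' bytes with only 4 KiB of data
 * written, then report both sizes. Returns 0 on success, -1 on error. */
int sparse_raw_usage(long long capacity, struct raw_usage *out)
{
    char path[] = "/tmp/sparse-demo-XXXXXX";
    const char data[4096] = { 1 };
    struct stat st;
    int ret = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);  /* the file disappears once fd is closed */

    /* Write one block of data at the start of the file. */
    if (write(fd, data, sizeof(data)) != (ssize_t)sizeof(data))
        goto cleanup;

    /* Extend the logical size without writing; the tail becomes a hole
     * on filesystems that support sparse files. */
    if (ftruncate(fd, (off_t)capacity) < 0)
        goto cleanup;

    if (fstat(fd, &st) < 0)
        goto cleanup;

    out->capacity = (long long)st.st_size;
    out->allocation = (long long)st.st_blocks * 512;
    ret = 0;

 cleanup:
    close(fd);
    return ret;
}
```

On a filesystem with sparse-file support, `capacity` comes back as the requested size while `allocation` is typically far smaller. A clone that preserves sparseness is expected to keep that gap rather than grow to the full capacity.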

On Fri, Jun 26, 2015 at 05:13:26PM +0530, Prerna Saxena wrote:
> When virsh vol-clone is attempted on a raw file where capacity > allocation,
> the resulting cloned volume has a size that matches the virtual-size of
> the parent; in place of matching its actual, disk size.
> This patch fixes the cloned disk to have same _allocated_size_ as
> the parent file from which it was cloned.
>
> Ref: http://www.redhat.com/archives/libvir-list/2015-May/msg00050.html
>
> Also fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1130739
>
> Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
> ---
>  src/storage/storage_backend.c | 23 ++++++++++-------------
>  src/storage/storage_driver.c  |  5 -----
>  2 files changed, 10 insertions(+), 18 deletions(-)
>
> diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c
> index ce59f63..492b344 100644
> --- a/src/storage/storage_backend.c
> +++ b/src/storage/storage_backend.c
> @@ -342,7 +342,7 @@ virStorageBackendCreateBlockFrom(virConnectPtr conn ATTRIBUTE_UNUSED,
>          goto cleanup;
>      }
>
> -    remain = vol->target.allocation;
> +    remain = vol->target.capacity;
>
>      if (inputvol) {
>          int res = virStorageBackendCopyToFD(vol, inputvol,
> @@ -397,7 +397,8 @@ createRawFile(int fd, virStorageVolDefPtr vol,
>                virStorageVolDefPtr inputvol,
>                bool reflink_copy)
>  {
> -    bool need_alloc = true;
> +    bool want_sparse = (inputvol &&
> +                       (inputvol->target.capacity > vol->target.allocation));
>      int ret = 0;
>      unsigned long long remain;
>
> @@ -420,10 +421,9 @@ createRawFile(int fd, virStorageVolDefPtr vol,
>       * to writing zeroes block by block in case fallocate isn't
>       * available, and since we're going to copy data from another
>       * file it doesn't make sense to write the file twice. */
> -    if (vol->target.allocation) {
> -        if (fallocate(fd, 0, 0, vol->target.allocation) == 0) {
> -            need_alloc = false;
> -        } else if (errno != ENOSYS && errno != EOPNOTSUPP) {
> +    if (vol->target.allocation && !want_sparse) {
> +        if ((fallocate(fd, 0, 0, vol->target.allocation) != 0) &&

If fallocate succeeds here, there's no need to copy zero bytes from the source.

> +            (errno != ENOSYS && errno != EOPNOTSUPP)) {
>              ret = -errno;
>              virReportSystemError(errno,
>                                   _("cannot allocate %llu bytes in file '%s'"),
> @@ -433,22 +433,19 @@ createRawFile(int fd, virStorageVolDefPtr vol,
>      }
>  #endif
>
> -    remain = vol->target.allocation;
> +    remain = vol->target.capacity;

This breaks sparse file creation on systems without fallocate.

>
>      if (inputvol) {
>          /* allow zero blocks to be skipped if we've requested sparse
> -         * allocation (allocation < capacity) or we have already
> -         * been able to allocate the required space. */
> -        bool want_sparse = !need_alloc ||
> -            (vol->target.allocation < inputvol->target.capacity);
> -
> +         * allocation (allocation < capacity)
> +         */
>          ret = virStorageBackendCopyToFD(vol, inputvol, fd, &remain,
>                                          want_sparse, reflink_copy);
I'd rather use !need_alloc instead of want_sparse here, because sparseness only makes sense for cloning volumes, but the bool affects creating new volumes as well.
>          if (ret < 0)
>              goto cleanup;
>      }
>
> -    if (remain && need_alloc) {
> +    if (remain && !want_sparse) {
>          if (safezero(fd, vol->target.allocation - remain, remain) < 0) {

This calculation assumes that 'remain' contains the bytes remaining to the
target allocation, but after this patch it contains the bytes remaining to
the target capacity.

>              ret = -errno;
>              virReportSystemError(errno, _("cannot fill file '%s'"),

I'll send a v4 shortly.

Jan
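The review comment about `vol->target.allocation - remain` can be checked with plain arithmetic. The sketch below is not libvirt code. `zero_fill_offset()` and `remaining_after_copy()` are hypothetical helpers that mirror the offset expression passed to `safezero()` and the way `remain` counts down during the copy, and the sizes used afterwards are made-up illustrative values.

```c
#include <assert.h>

/* Mirrors the offset expression handed to safezero() in the patch:
 * it is only meaningful if 'remain' was counted down from 'allocation'. */
unsigned long long zero_fill_offset(unsigned long long allocation,
                                    unsigned long long remain)
{
    return allocation - remain;  /* unsigned: wraps if remain > allocation */
}

/* What 'remain' holds after 'copied' bytes, when it started at 'start'. */
unsigned long long remaining_after_copy(unsigned long long start,
                                        unsigned long long copied)
{
    assert(copied <= start);
    return start - copied;
}
```

Take an allocation of 1 GiB, a capacity of 10 GiB and 512 MiB already copied. When `remain` starts from the allocation, the offset comes out as 512 MiB, exactly where the copy stopped. When `remain` starts from the capacity, as after this patch, `remain` is 9.5 GiB and the unsigned subtraction wraps to an offset far beyond the end of the file.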
participants (2): Ján Tomko, Prerna Saxena