[libvirt] [PATCH 0/8] Resolve some storage and disk backend issues

The following patches resolve a few issues either explicitly generated via a bz or seen while investigating/testing the results from bzs.

Patches 1-2 were discovered while working through patch 3. I found that deleting volumes from a pool was always going through the fork processing. I neglected to recall that when the volume was placed into the pool, the mode, uid, & gid values can be updated; thus on some delete paths we may not be called with -1 values. Because we were going through the fork processing, I found that the error message displayed when a volume target no longer exists was wrong - that is, the caller was expecting a return of -1 with errno set, in this case looking for ENOENT and continuing on with removal of the volume from the pool.

Patch 3 stands by itself - it's certainly one of those that could be deemed "don't do that", but since we do check whether the volume was removed without libvirt doing the removal, I guess we should check whether the volume exists before trying to create it, and using the refresh pool is the simplest mechanism for all pools.

Patches 4-7 are intertwined - the disk backend code made the assumption that the volume/device being used for the pool was "good". The normal mode of operation is to define, build, start, but there's never been a "guarantee" or check that what is "on" the volume is valid for the type of pool.

Patch 8 was disk backend related - so while I'm in the code, I may as well make the change as well.
John Ferlan (8):
  virfile: Add extra check for direct delete in virFileRemove
  virfile: Fix error path for forked virFileRemove
  storage: Prior to creating a volume, refresh the pool
  storage: Refactor disk label checking
  storage: Add param to check whether we can write a disk label
  storage: Add additional errors/checks for disk label
  storage: Introduce virStorageBackendDiskStartPool
  storage: Adjust calculation of alloc/capacity for disk

 src/libvirt_private.syms           |   1 +
 src/storage/storage_backend_disk.c | 162 ++++++++++++++++++++++++++++++-------
 src/storage/storage_driver.c       |   9 +++
 src/util/virfile.c                 |  11 ++-
 4 files changed, 151 insertions(+), 32 deletions(-)

-- 
2.1.0

Unlike create options, if the file to be removed is already in the pool, then the uid/gid will come from the pool. If it's the same as the currently running process, then just do the unlink/rmdir directly rather than going through the fork processing unnecessarily.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/util/virfile.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 2332589..3d7efdc 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -2316,10 +2316,11 @@ virFileRemove(const char *path,
     int ngroups;
 
     /* If not running as root or if a non explicit uid/gid was being used for
-     * the file/volume, then use unlink directly
+     * the file/volume or the explicit uid/gid matches, then use unlink directly
      */
     if ((geteuid() != 0) ||
-        ((uid == (uid_t) -1) && (gid == (gid_t) -1))) {
+        ((uid == (uid_t) -1) && (gid == (gid_t) -1)) ||
+        (uid == geteuid() && gid == getegid())) {
         if (virFileIsDir(path))
             return rmdir(path);
         else
-- 
2.1.0
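The decision this patch changes can be condensed into a small standalone predicate. This is an illustrative sketch, not the libvirt source: the third clause is the case the patch adds, mirroring virFileRemove's uid/gid parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>
#include <sys/types.h>

/* Sketch of virFileRemove's "can we unlink directly?" test after the
 * patch.  The third clause is the new check: if the explicit uid/gid
 * already match the running process, forking to change identity
 * (e.g. for NFS root-squash) buys nothing. */
static bool
remove_directly(uid_t uid, gid_t gid)
{
    if (geteuid() != 0)
        return true;   /* unprivileged: can't change identity in a child anyway */
    if (uid == (uid_t) -1 && gid == (gid_t) -1)
        return true;   /* no explicit ownership requested */
    if (uid == geteuid() && gid == getegid())
        return true;   /* new case: explicit ids match our own */
    return false;      /* must fork and impersonate uid:gid */
}
```

Before the patch, only the first two cases avoided the fork, which is why deletes of pool volumes with recorded uid/gid always took the fork path.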

As it turns out, the caller in this case expects a return < 0 for failure and to get/use "errno" rather than using the negative of the returned status. Again, different than the create path.

If someone "deleted" a file from the pool without using virsh vol-delete, then the unlink/rmdir would return an error (-1) and set errno to ENOENT. The caller checks errno for ENOENT when determining whether to throw an error message indicating the failure. Without the change, the error message is:

error: Failed to delete vol $vol
error: cannot unlink file '/$pathto/$vol': Success

This patch thus allows the fork path to follow the non-fork path, where unlink/rmdir return -1 and set errno.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/util/virfile.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 3d7efdc..a81f04c 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -2364,8 +2364,10 @@ virFileRemove(const char *path,
         status = EACCES;
     }
 
-    if (status)
-        ret = -status;
+    if (status) {
+        errno = status;
+        ret = -1;
+    }
 
     return ret;
 }
-- 
2.1.0
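The before/after behaviour can be reduced to two toy helpers (illustrative only, not the libvirt functions): the old style returned the negated status and left errno alone, so a caller that reads errno sees a stale value - often 0, which strerror() renders as "Success".

```c
#include <assert.h>
#include <errno.h>

/* Toy model of the pre-patch behaviour: return the negated status and
 * leave errno untouched.  A caller checking errno after failure sees
 * whatever happened to be there (the "Success" message above). */
static int
remove_old_style(int status)
{
    if (status)
        return -status;     /* errno not set */
    return 0;
}

/* Post-patch behaviour: return -1 and communicate the cause via errno,
 * matching what unlink()/rmdir() do on the non-fork path. */
static int
remove_new_style(int status)
{
    if (status) {
        errno = status;
        return -1;
    }
    return 0;
}
```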

On 02.10.2015 15:41, John Ferlan wrote:
As it turns out the caller in this case expects a return < 0 for failure and to get/use "errno" rather than using the negative of returned status. Again different than the create path.
If someone "deleted" a file from the pool without using virsh vol-delete, then the unlink/rmdir would return an error (-1) and set errno to ENOENT. The caller checks errno for ENOENT when determining whether to throw an error message indicating the failure. Without the change, the error message is:
error: Failed to delete vol $vol
error: cannot unlink file '/$pathto/$vol': Success
This patch thus allows the fork path to follow the non-fork path where unlink/rmdir return -1 and errno.
Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/util/virfile.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 3d7efdc..a81f04c 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -2364,8 +2364,10 @@ virFileRemove(const char *path,
         status = EACCES;
     }
 
-    if (status)
-        ret = -status;
+    if (status) {
+        errno = status;
+        ret = -1;
+    }
 
     return ret;
 }
Yep, this matches not only the rest of the function (e.g. when unlinking without forking) but the win32 impl too. Checked all the callers (ehm, so far the only one), and this behaviour is expected there too.

ACK

Michal

https://bugzilla.redhat.com/show_bug.cgi?id=1233003

Although perhaps bordering on a "don't do that" type scenario, if someone creates a volume in a pool outside of libvirt, then uses that same name to create a volume in the pool via libvirt, then the creation will fail and in some cases cause the same name volume to be deleted.

This patch will refresh the pool just prior to checking whether the named volume exists prior to creating the volume in the pool. While it's still possible to have a timing window to create a file after the check - at least we tried. At that point, someone is being malicious.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/storage/storage_driver.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 7aaa060..ddf4405 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1796,6 +1796,15 @@ storageVolCreateXML(virStoragePoolPtr obj,
     if (virStorageVolCreateXMLEnsureACL(obj->conn, pool->def, voldef) < 0)
         goto cleanup;
 
+    /* While not perfect, refresh the list of volumes in the pool and
+     * then check that the incoming name isn't already in the pool.
+     */
+    if (backend->refreshPool) {
+        virStoragePoolObjClearVols(pool);
+        if (backend->refreshPool(obj->conn, pool) < 0)
+            goto cleanup;
+    }
+
     if (virStorageVolDefFindByName(pool, voldef->name)) {
         virReportError(VIR_ERR_STORAGE_VOL_EXIST,
                        _("'%s'"), voldef->name);
-- 
2.1.0

On 02.10.2015 15:41, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1233003
Although perhaps bordering on a don't do that type scenario, if someone creates a volume in a pool outside of libvirt, then uses that same name to create a volume in the pool via libvirt, then the creation will fail and in some cases cause the same name volume to be deleted.
This patch will refresh the pool just prior to checking whether the named volume exists prior to creating the volume in the pool. While it's still possible to have a timing window to create a file after the check - at least we tried. At that point, someone is being malicious.
Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/storage/storage_driver.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 7aaa060..ddf4405 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1796,6 +1796,15 @@ storageVolCreateXML(virStoragePoolPtr obj,
     if (virStorageVolCreateXMLEnsureACL(obj->conn, pool->def, voldef) < 0)
         goto cleanup;
 
+    /* While not perfect, refresh the list of volumes in the pool and
+     * then check that the incoming name isn't already in the pool.
+     */
+    if (backend->refreshPool) {
+        virStoragePoolObjClearVols(pool);
+        if (backend->refreshPool(obj->conn, pool) < 0)
+            goto cleanup;
+    }
+
     if (virStorageVolDefFindByName(pool, voldef->name)) {
         virReportError(VIR_ERR_STORAGE_VOL_EXIST,
                        _("'%s'"), voldef->name);
Okay, this makes sense. Not only from the POV you are presenting, but I'd expect my pool to be refreshed after I create a new volume in it.

And I think we have a way to eliminate even the little window - just track if we successfully built the volume or not. Something along these lines:

diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index ddf4405..dd28f3f 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1857,7 +1857,7 @@ storageVolCreateXML(virStoragePoolPtr obj,
     voldef->building = true;
     virStoragePoolObjUnlock(pool);
 
-    buildret = backend->buildVol(obj->conn, pool, buildvoldef, flags);
+    buildret = backend->buildVol(obj->conn, pool, buildvoldef, &created, flags);
 
     storageDriverLock();
     virStoragePoolObjLock(pool);
@@ -1866,7 +1866,7 @@ storageVolCreateXML(virStoragePoolPtr obj,
     voldef->building = false;
     pool->asyncjobs--;
 
-    if (buildret < 0) {
+    if (buildret < 0 && created) {
         VIR_FREE(buildvoldef);
         storageVolDeleteInternal(volobj, backend, pool, voldef, 0, false);

At any rate, ACK to this patch as is.

Michal

On 10/05/2015 07:28 AM, Michal Privoznik wrote:
On 02.10.2015 15:41, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1233003
Although perhaps bordering on a don't do that type scenario, if someone creates a volume in a pool outside of libvirt, then uses that same name to create a volume in the pool via libvirt, then the creation will fail and in some cases cause the same name volume to be deleted.
This patch will refresh the pool just prior to checking whether the named volume exists prior to creating the volume in the pool. While it's still possible to have a timing window to create a file after the check - at least we tried. At that point, someone is being malicious.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/storage/storage_driver.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c index 7aaa060..ddf4405 100644 --- a/src/storage/storage_driver.c +++ b/src/storage/storage_driver.c @@ -1796,6 +1796,15 @@ storageVolCreateXML(virStoragePoolPtr obj, if (virStorageVolCreateXMLEnsureACL(obj->conn, pool->def, voldef) < 0) goto cleanup;
+ /* While not perfect, refresh the list of volumes in the pool and + * then check that the incoming name isn't already in the pool. + */ + if (backend->refreshPool) { + virStoragePoolObjClearVols(pool); + if (backend->refreshPool(obj->conn, pool) < 0) + goto cleanup; + } + if (virStorageVolDefFindByName(pool, voldef->name)) { virReportError(VIR_ERR_STORAGE_VOL_EXIST, _("'%s'"), voldef->name);
Okay, this makes sense. Not only from the POV you are presenting, but I'd expect my pool to be refreshed after I create new volume in it.
And I think we have a way to eliminate even the little window - just track if we successfully built the volume or not. Something among these lines:
Did a bit of digging on this today - suffice to say it'll be tricky...

* For some backends, the createVol actually does the creation (disk, logical, zfs), while for others the createVol is just checking existence and generating a key (fs, rbd, sheepdog) and buildVol does the magic creation deed.

* For those that support buildVol, there's an "intertwined relationship" with buildVolFrom. There's even one (disk) that has the buildVolFrom, but no buildVol. One would 'assume' (hah!) that buildVolFrom would have an already created volume into which the from volume gets moved, but I know what happens when one assumes...

* There's some fragility related to NFS root squash and the virFileOpenAs API (including the qemuOpenFileAs).

In any case, I've created some patches, but need a bit more digging time to feel a bit more comfortable about them...

John

Initially in my mind the timing window was only "externally" based...
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index ddf4405..dd28f3f 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1857,7 +1857,7 @@ storageVolCreateXML(virStoragePoolPtr obj,
     voldef->building = true;
     virStoragePoolObjUnlock(pool);
 
-    buildret = backend->buildVol(obj->conn, pool, buildvoldef, flags);
+    buildret = backend->buildVol(obj->conn, pool, buildvoldef, &created, flags);
 
     storageDriverLock();
     virStoragePoolObjLock(pool);
@@ -1866,7 +1866,7 @@ storageVolCreateXML(virStoragePoolPtr obj,
     voldef->building = false;
     pool->asyncjobs--;
 
-    if (buildret < 0) {
+    if (buildret < 0 && created) {
         VIR_FREE(buildvoldef);
         storageVolDeleteInternal(volobj, backend, pool, voldef, 0, false);
At any rate, ACK to this patch as is.
Michal

On Mon, Oct 05, 2015 at 07:27:34PM -0400, John Ferlan wrote:
On 10/05/2015 07:28 AM, Michal Privoznik wrote:
On 02.10.2015 15:41, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1233003
Although perhaps bordering on a don't do that type scenario, if someone creates a volume in a pool outside of libvirt, then uses that same name to create a volume in the pool via libvirt, then the creation will fail and in some cases cause the same name volume to be deleted.
This patch will refresh the pool just prior to checking whether the named volume exists prior to creating the volume in the pool. While it's still possible to have a timing window to create a file after the check - at least we tried. At that point, someone is being malicious.
Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/storage/storage_driver.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 7aaa060..ddf4405 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1796,6 +1796,15 @@ storageVolCreateXML(virStoragePoolPtr obj,
     if (virStorageVolCreateXMLEnsureACL(obj->conn, pool->def, voldef) < 0)
         goto cleanup;
 
+    /* While not perfect, refresh the list of volumes in the pool and
+     * then check that the incoming name isn't already in the pool.
+     */
+    if (backend->refreshPool) {
+        virStoragePoolObjClearVols(pool);
+        if (backend->refreshPool(obj->conn, pool) < 0)
+            goto cleanup;
+    }
+
     if (virStorageVolDefFindByName(pool, voldef->name)) {
         virReportError(VIR_ERR_STORAGE_VOL_EXIST,
                        _("'%s'"), voldef->name);
This is a workaround for the linked bug, not a fix. Refreshing the pool seems excessive to me.
Okay, this makes sense. Not only from the POV you are presenting, but I'd expect my pool to be refreshed after I create new volume in it.
Other volume APIs don't refresh the pool, I think we should be consistent.
And I think we have a way to eliminate even the little window - just track if we successfully built the volume or not. Something among these lines:
After properly propagating the error value if the volume exists, we don't need this patch refreshing the pool to work around the original bug. Jan

On 10/07/2015 10:27 AM, Ján Tomko wrote:
On Mon, Oct 05, 2015 at 07:27:34PM -0400, John Ferlan wrote:
On 10/05/2015 07:28 AM, Michal Privoznik wrote:
On 02.10.2015 15:41, John Ferlan wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1233003
Although perhaps bordering on a don't do that type scenario, if someone creates a volume in a pool outside of libvirt, then uses that same name to create a volume in the pool via libvirt, then the creation will fail and in some cases cause the same name volume to be deleted.
This patch will refresh the pool just prior to checking whether the named volume exists prior to creating the volume in the pool. While it's still possible to have a timing window to create a file after the check - at least we tried. At that point, someone is being malicious.
Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/storage/storage_driver.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 7aaa060..ddf4405 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1796,6 +1796,15 @@ storageVolCreateXML(virStoragePoolPtr obj,
     if (virStorageVolCreateXMLEnsureACL(obj->conn, pool->def, voldef) < 0)
         goto cleanup;
 
+    /* While not perfect, refresh the list of volumes in the pool and
+     * then check that the incoming name isn't already in the pool.
+     */
+    if (backend->refreshPool) {
+        virStoragePoolObjClearVols(pool);
+        if (backend->refreshPool(obj->conn, pool) < 0)
+            goto cleanup;
+    }
+
     if (virStorageVolDefFindByName(pool, voldef->name)) {
         virReportError(VIR_ERR_STORAGE_VOL_EXIST,
                        _("'%s'"), voldef->name);
This is a workaround for the linked bug, not a fix. Refreshing the pool seems excessive to me.
Initially I felt the same way, but it's not beyond the realm of possibility that someone created a volume/file of the same name outside of libvirt, and the only way we'd know is when/if create fails later on. The other advantage is refreshing will get the "sizes" as well, and now I'm wondering if this should be done perhaps earlier.

What would happen if you had a 10G pool with 5G available, and someone outside of libvirt creates a 5G file (or other size)... Then we try to create another file in the pool of, say, 1G. After creation, wouldn't our pool technically have negative capacity, since there's a refreshVol done which adjusts the pool alloc/capacity? Eventually a refreshPool may discover that - I haven't tested that theory yet though.
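The bookkeeping concern above can be shown with a toy model (not libvirt code, all numbers invented): the driver only adds to its tracked allocation for volumes it creates itself, so a file created outside libvirt is invisible until a full pool refresh re-reads the storage.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1024ULL * 1024 * 1024)

/* Toy pool accounting: 'allocation' is what the driver believes is
 * used, which can drift from reality when files appear out-of-band. */
typedef struct {
    uint64_t capacity;    /* total bytes in the pool            */
    uint64_t allocation;  /* bytes the driver believes are used */
} pool_t;

static uint64_t
tracked_available(const pool_t *p)
{
    return p->capacity - p->allocation;
}
```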
Okay, this makes sense. Not only from the POV you are presenting, but I'd expect my pool to be refreshed after I create new volume in it.
Other volume APIs don't refresh the pool, I think we should be consistent.
The APIs are createVol, buildVol, buildVolFrom, deleteVol, resizeVol, uploadVol, downloadVol, and wipeVol. Other than perhaps resizeVol, it doesn't seem upload, download, or wipe would need to update the list of volumes in the pool. Although, sure, one could make the argument that I could create volume "foo" outside libvirt and then use one of those APIs, and they'd fail until someone refreshed.

I think create & build are a bit "special" - I'd rather see us check whether they exist in the pool before trying to add them, but I also understand the logic that says: just try the create and handle the failure.
And I think we have a way to eliminate even the little window - just track if we successfully built the volume or not. Something among these lines:
After properly propagating the error value if the volume exists, we don't need this patch refreshing the pool to work around the original bug.
We may not, unless there's a need due to the pool capacity calculation. John

Create a new function virStorageBackendDiskValidLabel to handle checking whether there is a label on the device and whether it's valid or not. While initially for the purpose of determining whether the label can be overwritten during DiskBuild, a future use during DiskStart could determine whether the pool should be started using the label found.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/storage/storage_backend_disk.c | 47 +++++++++++++++++++++++++-------------
 1 file changed, 31 insertions(+), 16 deletions(-)

diff --git a/src/storage/storage_backend_disk.c b/src/storage/storage_backend_disk.c
index ba70a5a..da2a4d4 100644
--- a/src/storage/storage_backend_disk.c
+++ b/src/storage/storage_backend_disk.c
@@ -432,6 +432,33 @@ virStorageBackendDiskFindLabel(const char* device)
 
 
 /**
+ * Determine whether the label on the disk is valid or in a known format
+ * for the purpose of rewriting the label during build
+ *
+ * Return: True if it's OK
+ *         False if something's wrong
+ */
+static bool
+virStorageBackendDiskValidLabel(const char *device)
+{
+    bool valid = false;
+    int check;
+
+    check = virStorageBackendDiskFindLabel(device);
+    if (check > 0) {
+        valid = true;
+    } else if (check < 0) {
+        virReportError(VIR_ERR_OPERATION_FAILED, "%s",
+                       _("Error checking for disk label"));
+    } else {
+        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                       _("Disk label already present"));
+    }
+    return valid;
+}
+
+
+/**
  * Write a new partition table header
  */
 static int
@@ -450,23 +477,11 @@ virStorageBackendDiskBuildPool(virConnectPtr conn ATTRIBUTE_UNUSED,
                           VIR_STORAGE_POOL_BUILD_NO_OVERWRITE,
                           error);
 
-    if (flags & VIR_STORAGE_POOL_BUILD_OVERWRITE) {
+    if (flags & VIR_STORAGE_POOL_BUILD_OVERWRITE)
         ok_to_mklabel = true;
-    } else {
-        int check;
-
-        check = virStorageBackendDiskFindLabel(
-            pool->def->source.devices[0].path);
-        if (check > 0) {
-            ok_to_mklabel = true;
-        } else if (check < 0) {
-            virReportError(VIR_ERR_OPERATION_FAILED, "%s",
-                           _("Error checking for disk label"));
-        } else {
-            virReportError(VIR_ERR_OPERATION_INVALID, "%s",
-                           _("Disk label already present"));
-        }
-    }
+    else
+        ok_to_mklabel = virStorageBackendDiskValidLabel(
+            pool->def->source.devices[0].path);
 
     if (ok_to_mklabel) {
         /* eg parted /dev/sda mklabel --script msdos */
-- 
2.1.0

Modify virStorageBackendDiskValidLabel to add a 'writelabel' parameter. While initially for the purpose of determining whether the label should be written during DiskBuild, a future use during DiskStart could determine whether the pool should be started using the label found.

Augment the error messages also to give a hint as to what someone may need to do or why the command failed.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/storage/storage_backend_disk.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/src/storage/storage_backend_disk.c b/src/storage/storage_backend_disk.c
index da2a4d4..6f9fab1 100644
--- a/src/storage/storage_backend_disk.c
+++ b/src/storage/storage_backend_disk.c
@@ -435,24 +435,40 @@ virStorageBackendDiskFindLabel(const char* device)
  * Determine whether the label on the disk is valid or in a known format
  * for the purpose of rewriting the label during build
  *
+ * When 'writelabel' is true, if we find a valid disk label on the device,
+ * then we shouldn't be attempting to write as the volume may contain
+ * data. Force the usage of the overwrite flag to the build command in
+ * order to be certain. When the disk label is unrecognized, then it
+ * should be safe to write.
+ *
  * Return: True if it's OK
  *         False if something's wrong
  */
 static bool
-virStorageBackendDiskValidLabel(const char *device)
+virStorageBackendDiskValidLabel(const char *device,
+                                bool writelabel)
 {
     bool valid = false;
     int check;
 
     check = virStorageBackendDiskFindLabel(device);
     if (check > 0) {
-        valid = true;
+        if (writelabel)
+            valid = true;
+        else
+            virReportError(VIR_ERR_OPERATION_FAILED, "%s",
+                           _("Unrecognized disk label found, requires build"));
     } else if (check < 0) {
         virReportError(VIR_ERR_OPERATION_FAILED, "%s",
-                       _("Error checking for disk label"));
+                       _("Error checking for disk label, failed to get "
+                         "disk partition information"));
     } else {
-        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
-                       _("Disk label already present"));
+        if (writelabel)
+            virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                           _("Valid disk label already present, "
+                             "requires --overwrite"));
+        else
+            valid = true;
     }
     return valid;
 }
@@ -481,7 +497,8 @@ virStorageBackendDiskBuildPool(virConnectPtr conn ATTRIBUTE_UNUSED,
         ok_to_mklabel = true;
     else
         ok_to_mklabel = virStorageBackendDiskValidLabel(
-            pool->def->source.devices[0].path);
+            pool->def->source.devices[0].path,
+            true);
 
     if (ok_to_mklabel) {
         /* eg parted /dev/sda mklabel --script msdos */
-- 
2.1.0

Let's check to ensure we can find the Partition Table in the label and that libvirt actually recognizes that type; otherwise, when we go to read the partitions during a refresh operation we may not be reading what we expect.

This will expand upon the types of errors or reasons that a build would fail, so we can create more direct error messages.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/libvirt_private.syms           |  1 +
 src/storage/storage_backend_disk.c | 41 +++++++++++++++++++++++++++++++++++---
 2 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index c87efa1..415ed19 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -834,6 +834,7 @@ virStoragePoolDefParseFile;
 virStoragePoolDefParseNode;
 virStoragePoolDefParseSourceString;
 virStoragePoolDefParseString;
+virStoragePoolFormatDiskTypeFromString;
 virStoragePoolFormatDiskTypeToString;
 virStoragePoolFormatFileSystemNetTypeToString;
 virStoragePoolFormatFileSystemTypeToString;
diff --git a/src/storage/storage_backend_disk.c b/src/storage/storage_backend_disk.c
index 6f9fab1..b66a4a1 100644
--- a/src/storage/storage_backend_disk.c
+++ b/src/storage/storage_backend_disk.c
@@ -396,7 +396,9 @@ virStorageBackendDiskRefreshPool(virConnectPtr conn ATTRIBUTE_UNUSED,
  * Check for a valid disk label (partition table) on device
  *
  * return: 0 - valid disk label found
- *        >0 - no or unrecognized disk label
+ *         1 - no or unrecognized disk label
+ *         2 - did not find the Partition Table type
+ *         3 - Partition Table type unknown
  *        <0 - error finding the disk label
  */
 static int
@@ -408,6 +410,7 @@ virStorageBackendDiskFindLabel(const char* device)
     virCommandPtr cmd = virCommandNew(PARTED);
     char *output = NULL;
     char *error = NULL;
+    char *start, *end;
     int ret = -1;
 
     virCommandAddArgSet(cmd, args);
@@ -422,15 +425,40 @@ virStorageBackendDiskFindLabel(const char* device)
             (error && strstr(error, "unrecognised disk label"))) {
             ret = 1;
         }
+        goto cleanup;
+    }
+
+    /* Search for "Partition Table:" in the output. If not present,
+     * then we cannot validate the partition table type.
+     */
+    if (!(start = strstr(output, "Partition Table: ")) ||
+        !(end = strstr(start, "\n"))) {
+        VIR_DEBUG("Unable to find tag in output: %s", output);
+        ret = 2;
+        goto cleanup;
+    }
+    start += strlen("Partition Table: ");
+    *end = '\0';
+
+    /* on disk it's "msdos", but we document/use "dos" so deal with it here */
+    if (STREQ(start, "msdos"))
+        start += 2;
+
+    /* Make sure we know about this type */
+    if (virStoragePoolFormatDiskTypeFromString(start) < 0) {
+        ret = 3;
+        goto cleanup;
     }
 
+    ret = 0;
+
+ cleanup:
     virCommandFree(cmd);
     VIR_FREE(output);
     VIR_FREE(error);
     return ret;
 }
 
-
 /**
  * Determine whether the label on the disk is valid or in a known format
  * for the purpose of rewriting the label during build
@@ -452,12 +480,19 @@ virStorageBackendDiskValidLabel(const char *device,
     int check;
 
     check = virStorageBackendDiskFindLabel(device);
-    if (check > 0) {
+    if (check == 1) {
         if (writelabel)
             valid = true;
         else
             virReportError(VIR_ERR_OPERATION_FAILED, "%s",
                            _("Unrecognized disk label found, requires build"));
+    } else if (check == 2) {
+        virReportError(VIR_ERR_OPERATION_FAILED, "%s",
+                       _("Unable to determine Partition Type, "
+                         "requires build --overwrite"));
+    } else if (check == 3) {
+        virReportError(VIR_ERR_OPERATION_FAILED, "%s",
+                       _("Unknown Partition Type, requires build --overwrite"));
     } else if (check < 0) {
         virReportError(VIR_ERR_OPERATION_FAILED, "%s",
                        _("Error checking for disk label, failed to get "
-- 
2.1.0
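The parsing this patch adds can be sketched as a standalone helper. This mirrors the strstr logic in the diff, but against a plain string rather than parted's captured output (and using strchr for the newline, which is equivalent here); the sample outputs below are invented.

```c
#include <assert.h>
#include <string.h>

/* Find the "Partition Table: " tag in parted 'print' output, isolate
 * the value up to the end of its line, and map parted's "msdos" to the
 * "dos" name libvirt documents.  Returns NULL when the tag is missing
 * (the ret = 2 case in the patch).  Modifies 'output' in place. */
static const char *
find_partition_table(char *output)
{
    char *start, *end;

    if (!(start = strstr(output, "Partition Table: ")) ||
        !(end = strchr(start, '\n')))
        return NULL;                    /* tag not found */

    start += strlen("Partition Table: ");
    *end = '\0';

    /* on disk it's "msdos", but libvirt documents/uses "dos" */
    if (strcmp(start, "msdos") == 0)
        start += 2;

    return start;
}
```

In the patch proper, the returned name is then fed to virStoragePoolFormatDiskTypeFromString to catch types libvirt doesn't know about (the ret = 3 case).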

https://bugzilla.redhat.com/show_bug.cgi?id=1251461

When 'starting' up a disk pool, we need to make sure the label on the device is valid; otherwise, the followup refreshPool will assume the disk has been properly formatted for use. If we don't find the valid label, then refuse the start and give a proper reason.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/storage/storage_backend_disk.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/src/storage/storage_backend_disk.c b/src/storage/storage_backend_disk.c
index b66a4a1..c6317a2 100644
--- a/src/storage/storage_backend_disk.c
+++ b/src/storage/storage_backend_disk.c
@@ -461,7 +461,8 @@ virStorageBackendDiskFindLabel(const char* device)
 /**
  * Determine whether the label on the disk is valid or in a known format
- * for the purpose of rewriting the label during build
+ * for the purpose of rewriting the label during build or being able to
+ * start a pool on a device.
  *
  * When 'writelabel' is true, if we find a valid disk label on the device,
  * then we shouldn't be attempting to write as the volume may contain
@@ -469,6 +470,10 @@ virStorageBackendDiskFindLabel(const char* device)
  * order to be certain. When the disk label is unrecognized, then it
  * should be safe to write.
  *
+ * When 'writelabel' is false, only if we find a valid disk label on the
+ * device should we allow the start since for this path we won't be
+ * rewriting the label.
+ *
  * Return: True if it's OK
  *         False if something's wrong
  */
@@ -509,6 +514,27 @@ virStorageBackendDiskValidLabel(const char *device,
 }
 
 
+static int
+virStorageBackendDiskStartPool(virConnectPtr conn ATTRIBUTE_UNUSED,
+                               virStoragePoolObjPtr pool)
+{
+    virFileWaitForDevices();
+
+    if (!virFileExists(pool->def->source.devices[0].path)) {
+        virReportError(VIR_ERR_INVALID_ARG,
+                       _("device path '%s' doesn't exist"),
+                       pool->def->source.devices[0].path);
+        return -1;
+    }
+
+    if (!virStorageBackendDiskValidLabel(pool->def->source.devices[0].path,
+                                         false))
+        return -1;
+
+    return 0;
+}
+
+
 /**
  * Write a new partition table header
  */
@@ -940,6 +966,7 @@ virStorageBackendDiskVolWipe(virConnectPtr conn,
 
 virStorageBackend virStorageBackendDisk = {
     .type = VIR_STORAGE_POOL_DISK,
 
+    .startPool = virStorageBackendDiskStartPool,
     .buildPool = virStorageBackendDiskBuildPool,
     .refreshPool = virStorageBackendDiskRefreshPool,
-- 
2.1.0

https://bugzilla.redhat.com/show_bug.cgi?id=1247987

Calculation of the extended and logical partition values for the disk pool is complex. As the bz points out, an extended partition should have its allocation initialized to 0 (zero) and keep the capacity as the size dictated by the extents read. Then for each logical partition found, adjust the allocation of the extended partition.

Finally, previous logic tried to avoid recalculating things if a logical partition was deleted; however, since we now have special logic to handle the allocation of the extended partition, just make life easier by reading the partition table again - rather than doing the reverse adjustment.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/storage/storage_backend_disk.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/src/storage/storage_backend_disk.c b/src/storage/storage_backend_disk.c
index c6317a2..7baecc1 100644
--- a/src/storage/storage_backend_disk.c
+++ b/src/storage/storage_backend_disk.c
@@ -156,7 +156,8 @@ virStorageBackendDiskMakeDataVol(virStoragePoolObjPtr pool,
                                        VIR_STORAGE_VOL_OPEN_DEFAULT |
                                        VIR_STORAGE_VOL_OPEN_NOERROR) == -1)
             return -1;
-        vol->target.allocation = vol->target.capacity =
+        vol->target.allocation = 0;
+        vol->target.capacity =
             (vol->source.extents[0].end - vol->source.extents[0].start);
     } else {
         if (virStorageBackendUpdateVolInfo(vol, false,
@@ -164,6 +165,20 @@ virStorageBackendDiskMakeDataVol(virStoragePoolObjPtr pool,
             return -1;
     }
 
+    /* Find the extended partition and increase the allocation value */
+    if (vol->source.partType == VIR_STORAGE_VOL_DISK_TYPE_LOGICAL) {
+        size_t i;
+
+        for (i = 0; i < pool->volumes.count; i++) {
+            if (pool->volumes.objs[i]->source.partType ==
+                VIR_STORAGE_VOL_DISK_TYPE_EXTENDED) {
+                pool->volumes.objs[i]->target.allocation +=
+                    vol->target.allocation;
+                break;
+            }
+        }
+    }
+
     if (STRNEQ(groups[2], "metadata"))
         pool->def->allocation += vol->target.allocation;
     if (vol->source.extents[0].end > pool->def->capacity)
@@ -841,17 +856,14 @@ virStorageBackendDiskDeleteVol(virConnectPtr conn,
         goto cleanup;
     }
 
-    /* If this is not a logical partition, then either we've removed an
-     * extended partition or a primary partion - refresh the pool which
-     * includes resetting the [n]freeExtents data so a subsequent allocation
-     * might be able to use what was deleted. A logical partition is part
-     * of an extended partition and handled differently
+    /* Refreshing the pool is the easiest option as LOGICAL and EXTENDED
+     * partition allocation/capacity management is handled within
+     * virStorageBackendDiskMakeDataVol and trying to redo that logic
+     * here is pointless
      */
-    if (vol->source.partType != VIR_STORAGE_VOL_DISK_TYPE_LOGICAL) {
-        virStoragePoolObjClearVols(pool);
-        if (virStorageBackendDiskRefreshPool(conn, pool) < 0)
-            goto cleanup;
-    }
+    virStoragePoolObjClearVols(pool);
+    if (virStorageBackendDiskRefreshPool(conn, pool) < 0)
+        goto cleanup;
 
     rc = 0;
 
 cleanup:
-- 
2.1.0
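The accounting rule in this patch can be illustrated with a toy model (not libvirt code; the types, names, and sizes are invented): an extended partition's allocation starts at 0 while its capacity stays the extent size, and each logical partition found inside it adds its own allocation to the extended partition's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum part_type { PART_PRIMARY, PART_EXTENDED, PART_LOGICAL };

struct part {
    enum part_type type;
    uint64_t capacity;
    uint64_t allocation;
};

/* Mirrors the loop added to virStorageBackendDiskMakeDataVol: an
 * extended partition's allocation is zeroed (the container itself
 * consumes nothing), and each logical partition bumps the extended
 * partition's allocation. */
static void
account_partition(struct part *parts, size_t nparts, struct part *vol)
{
    size_t i;

    if (vol->type == PART_EXTENDED)
        vol->allocation = 0;

    if (vol->type == PART_LOGICAL) {
        for (i = 0; i < nparts; i++) {
            if (parts[i].type == PART_EXTENDED) {
                parts[i].allocation += vol->allocation;
                break;
            }
        }
    }
}
```

Because the extended partition's allocation is now derived from its logical children, deleting a logical partition and reversing the sum by hand would duplicate this logic, which is why the patch simply re-reads the partition table on delete.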

On 02.10.2015 15:41, John Ferlan wrote:
The following patches resolve a few issues either explicitly generated via a bz or seen while investigating/testing the results from bzs.
Patches 1-2 were discovered while working through patch 3. I found that deleting volumes from a pool was always going through the fork processing. I neglected to recall that when the volume was placed into the pool the mode, uid, & gid values can be updated. Thus on some delete paths we may not be called with -1 values. Because we were going through the fork processing I found that the error message displayed when a volume target no longer exists was wrong - that is the caller was expecting a return of -1 and errno being set. In this case looking for ENOENT and continuing on with removal of the volume from the pool.
Patch 3 stands by itself - it's certainly one of those that could be deemed don't do that, but since we do check whether the volume was removed without libvirt doing the removal, I guess we should check whether the volume exists before trying to create it and using the refresh pool is the simplest mechanism for all pools.
Patches 4-7 are intertwined - the disk backend code made the assumption that the volume/device being used for the pool was "good". The normal mode of operation is to define, build, start, but there's never been a "guarantee" or check that what is "on" the volume is valid for the type of pool
Patch 8 was disk backend related - so while I'm in the code, I may as well make the change as well.
John Ferlan (8):
  virfile: Add extra check for direct delete in virFileRemove
  virfile: Fix error path for forked virFileRemove
  storage: Prior to creating a volume, refresh the pool
  storage: Refactor disk label checking
  storage: Add param to check whether we can write a disk label
  storage: Add additional errors/checks for disk label
  storage: Introduce virStorageBackendDiskStartPool
  storage: Adjust calculation of alloc/capacity for disk
 src/libvirt_private.syms           |   1 +
 src/storage/storage_backend_disk.c | 162 ++++++++++++++++++++++++++++++-------
 src/storage/storage_driver.c       |   9 +++
 src/util/virfile.c                 |  11 ++-
 4 files changed, 151 insertions(+), 32 deletions(-)
ACK series

Michal
participants (3): John Ferlan, Ján Tomko, Michal Privoznik