On 11/18/2016 07:56 AM, Ján Tomko wrote:
On Tue, Nov 15, 2016 at 06:48:03PM -0500, John Ferlan wrote:
>
https://bugzilla.redhat.com/show_bug.cgi?id=1363586
>
> There's actually a couple of bugs here...
>
> only the "input" format type, so any other "types" of format
would be
I could a verb.
Hmm... strange... my fingers must've started drinking early while
merging changes... I inserted patch1 to make patch2 just a little bit
more readable. Looks like I got distracted whilst trying to create the
commit message here.
> filtered during the blkid_do_probe resulting in allowing a new
format
> type to overwrite a previous format type since when it's determined a
> format type cannot be determined that we'd return 0 (or prior to the
> previous patch a NOT_FOUND value). So instead of passing just one type
> to filter on, pass the entire list of virStoragePoolFormatFileSystem
> types. According to recent docs, this call may be unnecessary unless
> we care about superblock probing only. Ironically, if the on disk
> format type was the same as the requested format, the code would then
> fail on the else condition (fixed in #2).
>
> was returning -1 (or prior to previous patch FOUND) with an error
> which caused the caller to just fail.
Why is that a bug? IIUC we want to error out if a filesystem was found.
I guess my theory was if you have a xfs partition, writing an xfs
partition shouldn't necessarily fail. Sure it rewrites xfs on xfs. I
guess that's why there was a tristate return.
> So even though it was found
> it did nothing. Change that to compare the on disk type with the
> passed format type and return 0 or -1 as necessary.
>
> Signed-off-by: John Ferlan <jferlan(a)redhat.com>
> ---
> src/storage/storage_backend_fs.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/src/storage/storage_backend_fs.c
> b/src/storage/storage_backend_fs.c
> index 2413e82..74b278d 100644
> --- a/src/storage/storage_backend_fs.c
> +++ b/src/storage/storage_backend_fs.c
> @@ -626,7 +626,8 @@ virStorageBackendFileSystemProbe(const char *device,
> int ret = -1;
> blkid_probe probe = NULL;
> const char *fstype = NULL;
> - char *names[2], *libblkid_format = NULL;
> + size_t i;
> + const char *names[VIR_STORAGE_POOL_FS_LAST] = {0};
>
> VIR_DEBUG("Probing for existing filesystem of type %s on device %s",
> format, device);
> @@ -648,25 +649,26 @@ virStorageBackendFileSystemProbe(const char
> *device,
> goto error;
> }
>
> - if (VIR_STRDUP(libblkid_format, format) < 0)
> - goto error;
> -
> - names[0] = libblkid_format;
> - names[1] = NULL;
> + for (i = 1; i < VIR_STORAGE_POOL_FS_LAST; i++)
> + names[i - 1] = virStoragePoolFormatFileSystemTypeToString(i);
>
> blkid_probe_filter_superblocks_type(probe,
> BLKID_FLTR_ONLYIN,
> - names);
> + (char **)names);
>
If we want to extend the probing to other filesystem types, then
removing the filter completely would be a better option.
But,
it seems the code intended to only check for the same fs type, not
others. From virsh pool-build --help:
--no-overwrite do not overwrite an existing pool of this type
--overwrite overwrite any existing data
Commit id 'ddcd5674a' written with the logic in mind. Whether the logic
is valid or not is I suppose a matter of interpretation.
The API constants are described more ambiguously:
VIR_STORAGE_POOL_BUILD_NO_OVERWRITE = 4
Do not overwrite existing pool
VIR_STORAGE_POOL_BUILD_OVERWRITE = 8
Overwrite data
and the comments to virStorageBackendFileSystemBuild from commit id
'27758859c':
* If no flag is set, it only makes the directory; If
* VIR_STORAGE_POOL_BUILD_NO_OVERWRITE set, it probes to determine if
* filesystem already exists on the target device, renurning an error
* if exists, or using mkfs to format the target device if not; If
* VIR_STORAGE_POOL_BUILD_OVERWRITE is set, mkfs is always executed,
* any existed data on the target device is overwritten unconditionally.
Still the logic/reasoning doesn't seem to be, well logical... So off to
patch history...
v4 of a series...
http://www.redhat.com/archives/libvir-list/2011-August/msg01495.html
v3 of a series...
http://www.redhat.com/archives/libvir-list/2010-June/msg00040.html
http://www.redhat.com/archives/libvir-list/2010-June/msg00042.html
v1/v2 of the series (intermixed)
http://www.redhat.com/archives/libvir-list/2010-May/msg00403.html
All of which imply the logic for no-overwrite (or some sort of probe
failure) was meant to be as is, even though conceptually I'm not sure it
makes proper sense. It seems as though what happens is that
probe-failure turned into how no-overwrite was implemented.
> if (blkid_do_probe(probe) != 0) {
> VIR_INFO("No filesystem of type '%s' found on device
'%s'",
> format, device);
> ret = 0;
> } else if (blkid_probe_lookup_value(probe, "TYPE", &fstype, NULL)
> == 0) {
> - virReportError(VIR_ERR_STORAGE_POOL_BUILT,
> - _("Existing filesystem of type '%s' found on
"
> - "device '%s'"),
> - fstype, device);
> + if (STRNEQ(fstype, format)) {
> + virReportError(VIR_ERR_STORAGE_POOL_BUILT,
> + _("Existing filesystem of type '%s' found
> on "
> + "device '%s'"),
When I tried to create an ext4 pool over an xfs filesystem, the
resulting error message was confusing:
error: Storage pool already built: Existing filesystem of type 'xfs'
found on device '/dev/sda5'
Sure prepending a "cannot create '%s' filesystem," would conceptually
help, but I guess the error message was descriptive enough for me - I
tried to build a build on a device with a pool already on it.
> + fstype, device);
> + } else {
> + ret = 0;
This is definitely wrong. If the existing filesystem matched the one
that had been requested by the user,
we would call mkfs again despite the NO_OVERWRITE flag.
hmmm... I guess writing xfs to an xfs didn't seem to be a problem, but I
do understand your point... thus perhaps why the original logic had 3
return values.
So it all boils down to the interpretation of the NO_OVERWRITE flag:
[0] Do not overwrite the requested filesystem
[A] Do not overwrite a filesystem libvirt knows about
[B] Do not overwrite any filesystem
[C] Do not overwrite any useful-looking data.
This patch did [A] (with the exception of the on-disk filesystem
matching the requested filesystem).
To match the 'virsh help' interpretation 'do not overwrite an existing
pool of this type', I guess the current situation [0] already satisfies
that (but the man page does not agree - and for disk pools it mentions
that libvirt uses 'parted' to determine this, which is oddly specific).
If we treat 'of this type' literally, e.g. anything that might be a
type='fs' pool, then we should remove the filter and let blkid check
for all filesystems [B].
This I could go with... Means patch 1 is dropped and we only return
FILESYSTEM_PROBE_NOT_FOUND if blkid_do_probe fails to find some sort of
file system on the disk. Dropping the filter is something I thought
about while working through this.
Is this what we "should" do though? Maybe both patches are dropped and
we close the bug as working as expected. Although that doesn't seem like
the best choice either since we overwrote xfs with ext4 *and* we
supplied the --no-overwrite flag.
[C] might be more user-friendly, but I'm not sure if we can change the
meaning of the NO_OVERWRITE constant like that. On the bright side,
if blkid can also identify partition tables, we could unify the probing
code with the disk backend and stop parsing parted's output (also, catch
those cases when someone puts the filesystem on the block device instead
of its partition).
Extracting blkid out and using for disk instread of parsing parted only
works if the target platform supports blkid API's I think. Still
something to think about for sure.
(Also, calling pool-create with --build but neither --overwrite nor
--no-overwrite
the pool does not get built at all:
error: internal error: Child process (/bin/mount -t ext4 /dev/sda5
/tmp/fspool)
unexpected exit status 32: mount: wrong fs type, bad option, bad
superblock on /dev/sda5,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
That is also confusing to me but at least libvirtd did not destroy any
data.)
True - thanks for allowing me to now be confused too. And I thought I
had done what was "logical" or "expected" if I'm not supposed to
overwrite something (OK other than the rewrite of the same fs type).
If we do proceed, then altering the docs is something I'd have to add as
well.
Maybe someone with more history on this will pipe in (Dan, Eric... Dave
and Osier aren't active anymore).
John