[libvirt] [PATCH] Drop virStorageBackendLogicalMatchPoolSource

Regression introduced by commit 71b803a for [1] that prevents starting up
a logical pool created with <source><device path=''></source>
after it has been moved to a different physical volume.

For logical pools <source><name> contains the name of the volume group
and uniquely identifies the VG on the host.

This also speeds up startup for pools that do not have any <device>s
specified.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1025230
---
 src/storage/storage_backend_logical.c | 104 +---------------------------------
 1 file changed, 2 insertions(+), 102 deletions(-)

diff --git a/src/storage/storage_backend_logical.c b/src/storage/storage_backend_logical.c
index ca05fe1..057e0b9 100644
--- a/src/storage/storage_backend_logical.c
+++ b/src/storage/storage_backend_logical.c
@@ -558,107 +558,11 @@ virStorageBackendLogicalFindPoolSources(virConnectPtr conn ATTRIBUTE_UNUSED,
 }
 
-/*
- * virStorageBackendLogicalMatchPoolSource
- * @pool: Pointer to the source pool object
- *
- * Search the output generated by a 'pvs --noheadings -o pv_name,vg_name'
- * to match the 'vg_name' with the pool def->source.name and for the list
- * of pool def->source.devices[].
- *
- * Returns true if the volume group name matches the pool's source.name
- * and at least one of the pool's def->source.devices[] matches the
- * list of physical device names listed for the pool. Return false if
- * we cannot find a matching volume group name and if we cannot match
- * the any device list members.
- */
-static bool
-virStorageBackendLogicalMatchPoolSource(virStoragePoolObjPtr pool)
-{
-    virStoragePoolSourceList sourceList;
-    virStoragePoolSource *thisSource = NULL;
-    size_t i, j;
-    int matchcount = 0;
-    bool ret = false;
-
-    memset(&sourceList, 0, sizeof(sourceList));
-    sourceList.type = VIR_STORAGE_POOL_LOGICAL;
-
-    if (virStorageBackendLogicalGetPoolSources(&sourceList) < 0)
-        goto cleanup;
-
-    /* Search the pvs output for this pool's source.name */
-    for (i = 0; i < sourceList.nsources; i++) {
-        thisSource = &sourceList.sources[i];
-        if (STREQ(thisSource->name, pool->def->source.name))
-            break;
-    }
-
-    if (i == sourceList.nsources) {
-        virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
-                       _("cannot find logical volume group name '%s'"),
-                       pool->def->source.name);
-        goto cleanup;
-    }
-
-    /* If the pool has defined source device(s), then let's make sure
-     * they match as well; otherwise, matching can only occur on the
-     * pool's name.
-     */
-    if (!pool->def->source.ndevice) {
-        ret = true;
-        goto cleanup;
-    }
-
-    /* Let's make sure the pool's device(s) match what the pvs output has
-     * for volume group devices.
-     */
-    for (i = 0; i < pool->def->source.ndevice; i++) {
-        for (j = 0; j < thisSource->ndevice; j++) {
-            if (STREQ(pool->def->source.devices[i].path,
-                      thisSource->devices[j].path))
-                matchcount++;
-        }
-    }
-
-    /* If we didn't find any matches, then this pool has listed (a) source
-     * device path(s) that don't/doesn't match what was created for the pool
-     */
-    if (matchcount == 0) {
-        virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
-                       _("cannot find any matching source devices for logical "
-                         "volume group '%s'"), pool->def->source.name);
-        goto cleanup;
-    }
-
-    /* Either there's more devices in the pool source device list or there's
-     * more devices in the pvs output. Could easily happen if someone decides
-     * to 'add' to or 'remove' from the volume group outside of libvirt's
-     * knowledge. Rather than fail on that, provide a warning and move on.
-     */
-    if (matchcount != pool->def->source.ndevice)
-        VIR_WARN("pool device list count doesn't match pvs device list count");
-
-    ret = true;
-
- cleanup:
-    for (i = 0; i < sourceList.nsources; i++)
-        virStoragePoolSourceClear(&sourceList.sources[i]);
-    VIR_FREE(sourceList.sources);
-
-    return ret;
-}
-
-
 static int
 virStorageBackendLogicalCheckPool(virStoragePoolObjPtr pool,
                                   bool *isActive)
 {
-    /* If we can find the target.path as well as ensure that the
-     * pool's def source
-     */
-    *isActive = virFileExists(pool->def->target.path) &&
-                virStorageBackendLogicalMatchPoolSource(pool);
+    *isActive = virFileExists(pool->def->target.path);
     return 0;
 }
 
@@ -666,11 +570,7 @@ static int
 virStorageBackendLogicalStartPool(virConnectPtr conn ATTRIBUTE_UNUSED,
                                   virStoragePoolObjPtr pool)
 {
-    /* Let's make sure that the pool's name matches the pvs output and
-     * that the pool's source devices match the pvs output.
-     */
-    if (!virStorageBackendLogicalMatchPoolSource(pool) ||
-        virStorageBackendLogicalSetActive(pool, 1) < 0)
+    if (virStorageBackendLogicalSetActive(pool, 1) < 0)
        return -1;
 
     return 0;
-- 
2.7.3

On 06/15/2016 01:19 PM, Ján Tomko wrote:
Regression introduced by commit 71b803a for [1] that prevents starting up a logical pool created with <source><device path=''></source> after it has been moved to a different physical volume.
Is there a bug for this? XML examples?

Is an empty source device path string a valid value? Reading http://libvirt.org/formatstorage.html doesn't give me that impression.

"device
Provides the source for pools backed by physical devices (pool types fs, logical, disk, iscsi, zfs). May be repeated multiple times depending on backend driver. Contains a required attribute path which is either the fully qualified path to the block device node or for iscsi the iSCSI Qualified Name (IQN). Since 0.4.1"
For logical pools <source><name> contains the name of the volume group and uniquely identifies the VG on the host.
This also speeds up startup for pools that do not have any <device>s specified.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1025230
---
 src/storage/storage_backend_logical.c | 104 +---------------------------------
 1 file changed, 2 insertions(+), 102 deletions(-)
This essentially reverts a patch that was used to resolve a bz without a patch to resolve the issue in the bug. What's the proposal/patch to resolve the issue from the bug?

It would seem to me that if the "decision" was to accept a source device path of '', then it's "simple enough" to add a check for that in virStorageBackendLogicalMatchPoolSource before the memset()... E.g.

    if (pool->def->source.name[0] == '\0')
        return true;

I think it's an oddball case, but I suppose valid.

John

On Wed, Jun 15, 2016 at 06:45:59PM -0400, John Ferlan wrote:
On 06/15/2016 01:19 PM, Ján Tomko wrote:
Regression introduced by commit 71b803a for [1] that prevents starting up a logical pool created with <source><device path=''></source> after it has been moved to a different physical volume.
Is there a bug for this? XML examples?
Is an empty source device path string a valid value? Reading http://libvirt.org/formatstorage.html doesn't give me that impression.
I meant any pool with a <device> specified, not an empty path. The whole point of LVM is abstraction from the lower layers so we shouldn't ever be checking this.
This essentially reverts a patch that was used to resolve a bz without a patch to resolve the issue in the bug. What's the proposal/patch to resolve the issue from the bug?
The issue in the bug is just cosmetic and should not block fixing this regression.

Jan

On 06/16/2016 06:03 AM, Ján Tomko wrote:
On Wed, Jun 15, 2016 at 06:45:59PM -0400, John Ferlan wrote:
On 06/15/2016 01:19 PM, Ján Tomko wrote:
Regression introduced by commit 71b803a for [1] that prevents starting up a logical pool created with <source><device path=''></source> after it has been moved to a different physical volume.
Is there a bug for this? XML examples?
Is an empty source device path string a valid value? Reading http://libvirt.org/formatstorage.html doesn't give me that impression.
I meant any pool with a <device> specified, not an empty path.
But the patch is specifically targeted at removing the _logical pool check. So the question is, is it "legal" for a logical pool to "start" with an empty string as a source device path? Secondarily I'm curious what's the use case of that?
The whole point of LVM is abstraction from the lower layers so we shouldn't ever be checking this.
So it's OK to activate/start a pool where the source device path is incorrect?
This essentially reverts a patch that was used to resolve a bz without a patch to resolve the issue in the bug. What's the proposal/patch to resolve the issue from the bug?
The issue in the bug is just cosmetic and should not block fixing this regression.
How is it just cosmetic? Let's say we allow starting a pool with an incorrect or invalid source device path. Any attempt to "use" or "list" a volume in the pool would fail.

I will agree that it would seem unlikely in a "real world" situation that an admin would create/configure a logical pool using a vgname that is not associated with the provided source device path; however, to me it seems just as unlikely that the source device path is either not provided or is set to ''.

Curious what others think. If the patch gets reverted, then so be it. We should then at least document that libvirt does not check or care whether the source device path for a logical pool is valid (and update the bug thusly).

John

On Thu, Jun 16, 2016 at 07:44:05AM -0400, John Ferlan wrote:
On 06/16/2016 06:03 AM, Ján Tomko wrote:
On Wed, Jun 15, 2016 at 06:45:59PM -0400, John Ferlan wrote:
On 06/15/2016 01:19 PM, Ján Tomko wrote:
Regression introduced by commit 71b803a for [1] that prevents starting up a logical pool created with <source><device path=''></source> after it has been moved to a different physical volume.
Is there a bug for this? XML examples?
Is an empty source device path string a valid value? Reading http://libvirt.org/formatstorage.html doesn't give me that impression.
I meant any pool with a <device> specified, not an empty path.
But the patch is specifically targeted at removing the _logical pool check. So the question is, is it "legal" for a logical pool to "start" with an empty string as a source device path? Secondarily I'm curious what's the use case of that?
I do not know what happens with an empty string. <source><device path=''></source> was not an exact snippet, just a hint that there is a device path specified.

If you create a pool like this:

<pool type='logical'>
  <name>vgname</name>
  <source>
    <device path='/dev/sda4'/>
    <name>vgname</name>
  </source>
  <target>
    <path>/dev/vgname</path>
  </target>
</pool>

Then pvmove /dev/sda4 to /dev/sda5, libvirt will refuse that pool even though <name>vgname</name> is enough to uniquely identify a storage pool.
The whole point of LVM is abstraction from the lower layers so we shouldn't ever be checking this.
So it's OK to activate/start a pool where the source device path is incorrect?
For LVM, the important path is in <source><name>.
This essentially reverts a patch that was used to resolve a bz without a patch to resolve the issue in the bug. What's the proposal/patch to resolve the issue from the bug?
The issue in the bug is just cosmetic and should not block fixing this regression.
How is it just cosmetic? Let's say we allow starting a pool with an incorrect or invalid source device path. Any attempt to "use" or "list" a volume in the pool would fail.
The pool works fine, the logical volumes are under the same path in /dev/, the only thing that has changed is the underlying PV device.
I will agree that it would seem unlikely in a "real world" situation that an admin would create/configure a logical pool using a vgname that is not associated with the provided source device path; however, to me it seems just as unlikely that the source device path is either not provided or is set to ''.
Not providing the source device path at all is perfectly fine for existing logical pools; device paths are only needed for building new pools.

Jan

On 06/17/2016 11:16 AM, Ján Tomko wrote:
On Thu, Jun 16, 2016 at 07:44:05AM -0400, John Ferlan wrote:
On 06/16/2016 06:03 AM, Ján Tomko wrote:
On Wed, Jun 15, 2016 at 06:45:59PM -0400, John Ferlan wrote:
On 06/15/2016 01:19 PM, Ján Tomko wrote:
Regression introduced by commit 71b803a for [1] that prevents starting up a logical pool created with <source><device path=''></source> after it has been moved to a different physical volume.
Is there a bug for this? XML examples?
Is an empty source device path string a valid value? Reading http://libvirt.org/formatstorage.html doesn't give me that impression.
I meant any pool with a <device> specified, not an empty path.
But the patch is specifically targeted at removing the _logical pool check. So the question is, is it "legal" for a logical pool to "start" with an empty string as a source device path? Secondarily I'm curious what's the use case of that?
I do not know what happens with an empty string. <source><device path=''></source> was not an exact snippet, just a hint that there is a device path specified.
If you create a pool like this:
<pool type='logical'>
  <name>vgname</name>
  <source>
    <device path='/dev/sda4'/>
    <name>vgname</name>
As I see things, this more or less lets libvirt "know" that "/dev/sda4" and "vgname" are associated.
  </source>
  <target>
    <path>/dev/vgname</path>
  </target>
</pool>
Then pvmove /dev/sda4 to /dev/sda5, libvirt will refuse that pool even though <name>vgname</name> is enough to uniquely identify a storage pool.
As an admin you take this step to pvmove your data from /dev/sda4 to /dev/sda5. But then you expect libvirt to know that? Do you expect libvirt to automagically update the XML to /dev/sda5 during 'build' or 'start'?
The whole point of LVM is abstraction from the lower layers so we shouldn't ever be checking this.
So it's OK to activate/start a pool where the source device path is incorrect?
For LVM, the important path is in <source><name>.
But isn't there a 1-to-1 relationship between the VG and the PV? A PV cannot be in more than one VG, so if you move your PV, then you have to update your XML to let libvirt know.

John

On Wed, Jun 22, 2016 at 07:03:46AM -0400, John Ferlan wrote:
On 06/17/2016 11:16 AM, Ján Tomko wrote:
On Thu, Jun 16, 2016 at 07:44:05AM -0400, John Ferlan wrote:
On 06/16/2016 06:03 AM, Ján Tomko wrote:
On Wed, Jun 15, 2016 at 06:45:59PM -0400, John Ferlan wrote:
On 06/15/2016 01:19 PM, Ján Tomko wrote:
Regression introduced by commit 71b803a for [1] that prevents starting up a logical pool created with <source><device path=''></source> after it has been moved to a different physical volume.
Is there a bug for this? XML examples?
Is an empty source device path string a valid value? Reading http://libvirt.org/formatstorage.html doesn't give me that impression.
I meant any pool with a <device> specified, not an empty path.
But the patch is specifically targeted at removing the _logical pool check. So the question is, is it "legal" for a logical pool to "start" with an empty string as a source device path? Secondarily I'm curious what's the use case of that?
I do not know what happens with an empty string. <source><device path=''></source> was not an exact snippet, just a hint that there is a device path specified.
If you create a pool like this:
<pool type='logical'>
  <name>vgname</name>
  <source>
    <device path='/dev/sda4'/>
    <name>vgname</name>
As I see things, this more or less lets libvirt "know" that "/dev/sda4" and "vgname" are associated.
  </source>
  <target>
    <path>/dev/vgname</path>
  </target>
</pool>
Then pvmove /dev/sda4 to /dev/sda5, libvirt will refuse that pool even though <name>vgname</name> is enough to uniquely identify a storage pool.
As an admin you take this step to pvmove your data from /dev/sda4 to /dev/sda5.
But then you expect libvirt to know that? Do you expect libvirt to automagically update the XML to /dev/sda5 during 'build' or 'start'?
The whole point of LVM is abstraction from the lower layers so we shouldn't ever be checking this.
So it's OK to activate/start a pool where the source device path is incorrect?
For LVM, the important path is in <source><name>.
But isn't there a 1-to-1 relationship between the VG and the PV? A PV cannot be in more than one VG, so if you move your PV, then you have to update your XML to let libvirt know.
Why would it need to know? Apart from pool building, libvirt does not need the value for anything.

Jan

On Fri, Jun 24, 2016 at 18:09:01 +0200, Ján Tomko wrote:
On Wed, Jun 22, 2016 at 07:03:46AM -0400, John Ferlan wrote:
On 06/17/2016 11:16 AM, Ján Tomko wrote:
On Thu, Jun 16, 2016 at 07:44:05AM -0400, John Ferlan wrote:
On 06/16/2016 06:03 AM, Ján Tomko wrote:
On Wed, Jun 15, 2016 at 06:45:59PM -0400, John Ferlan wrote:
On 06/15/2016 01:19 PM, Ján Tomko wrote:
Regression introduced by commit 71b803a for [1] that prevents starting up a logical pool created with <source><device path=''></source> after it has been moved to a different physical volume.
[...]
<pool type='logical'>
  <name>vgname</name>
  <source>
    <device path='/dev/sda4'/>
    <name>vgname</name>
As I see things, this more or less lets libvirt "know" that "/dev/sda4" and "vgname" are associated.
  </source>
  <target>
    <path>/dev/vgname</path>
  </target>
</pool>
Then pvmove /dev/sda4 to /dev/sda5, libvirt will refuse that pool even though <name>vgname</name> is enough to uniquely identify a storage pool.
As an admin you take this step to pvmove your data from /dev/sda4 to /dev/sda5.
Note that this messes with the configuration without notifying libvirt since we don't support such operation via the APIs.
But then you expect libvirt to know that? Do you expect libvirt to automagically update the XML to /dev/sda5 during 'build' or 'start'?
That is one possibility ...
The whole point of LVM is abstraction from the lower layers so we shouldn't ever be checking this.
So it's OK to activate/start a pool where the source device path is incorrect?
For LVM, the important path is in <source><name>.
But isn't there a 1-to-1 relationship between the VG and the PV? A PV cannot be in more than one VG, so if you move your PV, then you have to update your XML to let libvirt know.
Why would it need to know? Apart from pool building, libvirt does not need the value for anything.
It is needed when deleting the pool, where libvirt attempts to pvremove the devices. Fortunately, even in the current broken state it's not possible to remove PVs that are members of a different VG, which this would allow.

The broken part of the check is that it doesn't enforce that ALL PVs listed in the XML are members of the given VG and no other is. If just one PV matches, the code happily accepts it.

In the current state the check is pointless, since it doesn't properly enforce the configuration and still allows other wrong configurations.

So the options are:

1) Remove it:
   - If the admin messes with the state we will happily ignore the difference. Deletion will not work properly unless updated.
   Pros: the storage pool will work if the volume group is found
   Cons: deletion may break

2) Enforce it properly:
   - Fix the check so that the data provided in the XML must match the state in the system.
   Pros: the state will always be right
   Cons: existing setups may stop working

3) Update the state in the XML:
   - Re-detect the PV paths on pool startup.
   Pros: state will be in sync, existing configs will work
   Cons: less robust in some cases, weird semantics with persistent config

4) 2 + 3. Update it via a flag when starting.

Peter

On Mon, Jun 27, 2016 at 01:56:58PM +0200, Peter Krempa wrote:
On Fri, Jun 24, 2016 at 18:09:01 +0200, Ján Tomko wrote:
On Wed, Jun 22, 2016 at 07:03:46AM -0400, John Ferlan wrote:
On 06/17/2016 11:16 AM, Ján Tomko wrote:
On Thu, Jun 16, 2016 at 07:44:05AM -0400, John Ferlan wrote:
On 06/16/2016 06:03 AM, Ján Tomko wrote:
The whole point of LVM is abstraction from the lower layers so we shouldn't ever be checking this.
So it's OK to activate/start a pool where the source device path is incorrect?
For LVM, the important path is in <source><name>.
But isn't there a 1-to-1 relationship between the VG and the PV? A PV cannot be in more than one VG, so if you move your PV, then you have to update your XML to let libvirt know.
Why would it need to know? Apart from pool building, libvirt does not need the value for anything.
It is needed when deleting the pool, where libvirt attempts to pvremove the devices. Fortunately, even in the current broken state it's not possible to remove PVs that are members of a different VG, which this would allow.
The problem here is:

[A] on pool deletion, we call pvremove on all the devices listed in the persistent XML config (since deletion only works on stopped pools)

With the persistent config out of sync, currently we either:
* call pvremove on PVs that are no longer in the VG. This errors out if there is something useful on the PV (e.g. a new filesystem or it's used by a VG)
* fail to call it on a PV that was added to the VG behind our back. This leaves PV metadata on the device, so it shows up in pvs/pvdisplay output and some mkfs tools will not want to create a filesystem on it without --force. In our fs backend, we only run mkfs -t xfs with --force. On the other hand, pvcreate happily runs on such a volume.
The broken part of the check is that it doesn't enforce that ALL PVs as listed in the XML are member of the given VG and no other is. If just one PV matches then the code happily accepts it.
[B] The PVs listed in the XML do not match the on-disk config.

[C] The pool fails to start after vgextend/vgreduce.
In the current state the check is pointless since it doesn't properly enforce the configuration and still allows other wrong configurations.
It also slows down startup for pools which do not have any PVs in the persistent config, but that can be easily solved by skipping the whole check. It also unnecessarily calls vgscan if we have an active VG.
So the options are:
The current state does nothing to address [A], fixes a subset of [B] while breaking [C].
1) Remove it:
   - If the admin messes with the state we will happily ignore the difference. Deletion will not work properly unless updated.
   Pros: the storage pool will work if the volume group is found
   Cons: deletion may break
Partially breaks [B], fixes [C].
2) Enforce it properly:
   - Fix the check so that the data provided in the XML must match the state in the system.
   Pros: the state will always be right
   Cons: existing setups may stop working
Fixes [B], breaks [C]. Possibly fixes [A] if we enforce it on delete.
3) Update the state in the XML:
   - Re-detect the PV paths on pool startup.
   Pros: state will be in sync, existing configs will work
   Cons: less robust in some cases, weird semantics with persistent config
3a) Do it only for the live XML: fixes [B] and [C], does nothing for [A]
3b) Also update the persistent config: fixes [A], [B], [C], but with weird semantics.
4) 2 + 3. Update it via a flag when starting.
This fixes all three, giving the user an option to update the persistent config.

I think [C] is more important than [B], which is why I proposed reverting the check. It would be nice to get it into the release even without addressing [B] at all. This does not make [A] any worse. As for [B], it would be nice to update the live XML, but (as I stated in an earlier mail) it should not block fixing [C]. Rejecting such pools is just wrong.

For [A], we can either:
1) Refuse to delete the pool if the PVs do not match the config, requiring manual intervention, or an option to re-detect them in libvirt.
2) Automatically update the persistent config. This feels strange.
3) Make the issue not matter by not doing pvremove. This will not make them reusable for some tools, e.g. mkfs -t ext4, but since we already use --force for mkfs -t xfs, maybe we should do it here too.
4) Do nothing. Rely on pvremove to do the right thing.

Considering the user is capable of adjusting the VGs manually, I think running PoolDelete on such a pool is even more unlikely than creating the pool via libvirt and adjusting it behind its back, so the chosen option won't matter much.

My order of preference is [4] [1] (requiring manual intervention), then [3] [2].

Jan

On Wed, Jun 15, 2016 at 07:19:54PM +0200, Ján Tomko wrote:
ping

Jan
participants (3)
- John Ferlan
- Ján Tomko
- Peter Krempa