On Mon, Jun 27, 2016 at 01:56:58PM +0200, Peter Krempa wrote:
On Fri, Jun 24, 2016 at 18:09:01 +0200, Ján Tomko wrote:
> On Wed, Jun 22, 2016 at 07:03:46AM -0400, John Ferlan wrote:
> >On 06/17/2016 11:16 AM, Ján Tomko wrote:
> >> On Thu, Jun 16, 2016 at 07:44:05AM -0400, John Ferlan wrote:
> >>> On 06/16/2016 06:03 AM, Ján Tomko wrote:
> >>>> The whole point of LVM is abstraction from the lower layers so we
> >>>> shouldn't ever be checking this.
> >>>>
> >>>
> >>> So it's OK to activate/start a pool where the source device path is
> >>> incorrect?
> >>
> >> For LVM, the important path is in <source><name>.
> >>
> >
> >But isn't there a 1-to-1 relationship between the VG and the PV? A PV
> >cannot be in more than one VG, so if you move your PV, then you have to
> >update your XML to let libvirt know.
> >
>
> Why would it need to know? Apart from pool building, libvirt does not
> need the value for anything.
It is needed when deleting the pool, where libvirt attempts to pvremove
the devices. Fortunately, even in the current broken state it is not
possible to remove PVs that are members of a different VG, which this
would otherwise allow.
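To make this concrete, here is a minimal logical pool definition (the VG
name and device paths are invented for illustration); the <device>
entries under <source> are what pool deletion runs pvremove on, and
deletion only happens while the pool is stopped:

  <pool type='logical'>
    <name>vg_example</name>
    <source>
      <device path='/dev/sdb1'/>
      <device path='/dev/sdc1'/>
      <name>vg_example</name>
      <format type='lvm2'/>
    </source>
    <target>
      <path>/dev/vg_example</path>
    </target>
  </pool>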
The problem here is:
[A] on pool deletion, we call pvremove on all the devices listed in the
persistent XML config (since deletion only works on stopped pools)
With the persistent config out of sync, currently we either:
* call pvremove on PVs that are no longer in the VG
This errors out if there is something useful on the PV (e.g. a new
filesystem, or it is used by a VG).
* we fail to call it on a PV that was added to the VG behind our back
This leaves PV metadata on the device, so it shows up in pvs/pvdisplay
output and some mkfs tools will not want to create a filesystem on it
without --force. In our fs backend, we only run mkfs -t xfs with
--force. On the other hand, pvcreate happily runs on such a volume.
(A sketch of this scenario follows [C] below.)
The broken part of the check is that it doesn't enforce that ALL PVs
listed in the XML are members of the given VG and that no others are.
If just one PV matches, the code happily accepts it.
[B] The PVs listed in the XML do not match the on-disk config.
[C] The pool fails to start after vgextend/vgreduce.
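As a rough sketch of how the bullets above and [C] play out in practice
(every name and path here is invented; assume the pool XML from the
example above, listing /dev/sdb1 and /dev/sdc1):

  # the VG is migrated to a new disk outside of libvirt
  vgextend vg_example /dev/sdd1
  pvmove /dev/sdb1
  pvmove /dev/sdc1
  vgreduce vg_example /dev/sdb1 /dev/sdc1

  # [C]: none of the devices listed in the XML is part of the VG any
  # more, so the current check refuses to start the (stopped) pool
  virsh pool-destroy vg_example
  virsh pool-start vg_example

  # [A]: deleting the stopped pool only pvremoves the XML-listed
  # devices; /dev/sdd1 keeps its PV label afterwards
  virsh pool-delete vg_example
  pvs                         # /dev/sdd1 still shows up
  mkfs -t ext4 /dev/sdd1      # may balk without being forced
  pvcreate /dev/sdd1          # runs happily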
In the current state the check is pointless since it doesn't properly
enforce the configuration and still allows other wrong configurations.
It also slows down startup for pools which do not have any PVs
in the persistent config, but that can be easily solved by skipping the
whole check.
It also unnecessarily calls vgscan if we have an active VG.
The current state does nothing to address [A], fixes a subset of [B]
while breaking [C]. So the options are:
1) Remove it:
- If the admin messes with the state, we will happily ignore the
difference. Deletion will not work properly unless the config is
updated.
Pros: the storage pool will work if the volume group is found
Cons: deletion may break
Partially breaks [B], fixes [C].
2) Enforce it properly:
- Fix the check so that the data provided in the XML must match the
state in the system.
Pros: the state will always be right
Cons: existing setups may stop working
Fixes [B], breaks [C]. Possibly fixes [A] if we enforce it on delete.
3) Update the state in the XML
- Re-detect the PV paths on pool startup (see the sketch after the
option list).
Pros: state will be in sync, existing configs will work
Cons: less robust in some cases, weird semantics with persistent
config
3a) Do it only for the live XML:
Fixes [B] and [C], does nothing for [A]
3b) Also update persistent config
Fixes [A], [B], [C], but with weird semantics.
4) 2 + 3. Update it via a flag when starting.
This fixes all three, giving the user an option to update the persistent
config.
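The re-detection that options 3 and 4 rely on is cheap; roughly along
the lines of the following (the VG name is invented and the exact
reporting flags may vary between LVM versions), with the result used to
refresh the <device path='...'/> entries:

  # list the PV paths currently backing the VG
  pvs --noheadings -o pv_name --select vg_name=vg_example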
I think [C] is more important than [B], which is why I proposed reverting
the check. It would be nice to get it into the release even without
addressing [B] at all. This does not make [A] any worse.
As for [B], it would be nice to update the live XML, but (as I stated in
an earlier mail) it should not block fixing [C]. Rejecting such pools is
just wrong.
For [A], we can either:
1) Refuse to delete the pool if the PVs do not match the config.
This requires manual intervention, or an option to re-detect them in
libvirt.
2) Automatically update the persistent config.
This feels strange.
3) Make the issue not matter by not doing pvremove.
This will leave them unusable by some tools (e.g. mkfs -t ext4), but
since we already use --force for mkfs -t xfs, maybe we should do it
here too.
4) Do nothing.
Rely on pvremove to do the right thing.
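For reference, the behaviour [4] leans on is the one mentioned earlier:
pvremove itself refuses to touch a PV that is still part of some VG
(device name invented):

  pvremove /dev/sdc1    # fails if /dev/sdc1 now belongs to another VG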
Considering the user is capable of adjusting the VGs manually, I think
running PoolDelete on such a pool is even more unlikely than creating the
pool via libvirt and adjusting it behind its back, so the chosen option
won't matter much.
My order of preference is [4] [1] (requiring manual intervention),
then [3] [2].
Jan