On Thu, Feb 11, 2016 at 02:50:55PM +0100, Peter Krempa wrote:
> > Whoah. Data corruption across network? I'm not quite sure
> > whether I'd use this to cover up a problem with the storage
> > technology or network rather than just fix the root cause. If
> > you have 3 copies, and manage to have a sector where all 3
> > differ then the quorum driver won't help. And it will make it
> > even harder to find any possible problems.
>
> But in that case you detect that it went wrong and you get an I/O
> error. The problem with silent data corruption is that it can be
> hard to detect.

Yes, and that's why it should be fixed at the network storage
technology layer rather than anywhere else.

I've had the chance to discuss this a bit with a cloud provider that
is using Quorum.
In their tests they have had problems with Gluster or Ceph,
particularly when sharing the same images among several clients. They
have experienced major delays and crashes when one of the nodes fails,
and in general they don't consider them stable enough
for their needs. On the other hand NFS is easy to use and manipulate,
robust, and allows the use of hardware appliances.

> If there's a bit-flip across the network Quorum can detect it,
> report it and correct the faulty version without needing to
> rebuild everything.

I still think that you do want to rebuild the whole volume in such
a case if you care about your data in the slightest. Otherwise you
don't have to do stuff like this.

In general, yes. But that's right, I agree that having an API to deal
with these scenarios is a good idea and I can work on it.
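
To make the voting behaviour discussed above concrete, here is a
minimal sketch in Python. It is not QEMU's implementation: the names
quorum_read, rewrite_outvoted and QuorumError are hypothetical, and it
assumes each replica returns one sector as a non-empty list of bytes
objects. With three copies and a threshold of 2, a single flipped copy
is outvoted and can be rewritten on its own, while three differing
copies end in an I/O error.

```python
from collections import Counter


class QuorumError(IOError):
    """Raised when no version of a sector reaches the vote threshold."""


def quorum_read(copies, threshold):
    """Vote on one sector as read from several replicas.

    Returns (winner, outvoted), where `outvoted` lists the indexes of
    the replicas whose data differs from the winning version.
    Hypothetical helper, for illustration only.
    """
    # Pick the most common version of the sector and count its votes.
    winner, votes = Counter(copies).most_common(1)[0]
    if votes < threshold:
        # E.g. all three copies differ: corruption is detected,
        # but it cannot be corrected by voting.
        raise QuorumError("no version reached the vote threshold")
    outvoted = [i for i, data in enumerate(copies) if data != winner]
    return winner, outvoted


def rewrite_outvoted(copies, winner, outvoted):
    """Repair only the outvoted replicas; the others are left untouched."""
    repaired = list(copies)
    for i in outvoted:
        repaired[i] = winner
    return repaired
```

Whether repairing only the outvoted copy is enough, or the whole
volume should be rebuilt, is the policy question raised above; the
sketch only shows what the vote itself can and cannot detect.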
> > * since we don't use node-names yet, it's not really
> > possible to do block jobs on quorum disks, thus they are
> > forbidden
>
> I'm not sure what's the status of node names in libvirt, I could
> also try to help to make it happen.

They are basically non-existent. To be honest I think that node name
support, a better approach to constructing block devices and their
backing chains, and better handling of block jobs should be done
prior to quorum.

This series tries to partially do things that are already part of a
plan for how to approach disk handling. One of them is that the backing
chain of a disk is persisted in the XML and then fully constructed.
Adding this code will make the refactor even more painful than it
would currently be.

I'm actually planning to do this in the near future, but
unfortunately this is not a weekend project.
>
> > * since block jobs are forbidden and rewrite-corrupted can't
> >   be enabled, no way to do the rebuild
>
> 'rewrite-corrupted' can be easily added to the series so I don't
> think that's a problem. The block jobs thing I would need to
> see first. Would you really need to have node names in order to
> rebuild a Quorum?

Most probably yes. Without them, it will be just an ugly hack.

For the common usage I think you can use the device name just fine,
but if you have a scenario where a Quorum is part of a backing chain
then it wouldn't work without node names.

Berto