On 12/2/14 22:29, Franky Van Liedekerke wrote:
On Wed, 12 Feb 2014 21:51:53 +0100
urgrue <urgrue(a)bulbous.org> wrote:
> I'm trying to set up SAN-based shared storage in KVM, key word being
> "shared" across multiple KVM servers for a) live migration and b)
> clustering purposes. But it's surprisingly sparsely documented. For
> starters, what type of pool should I be using?
It's indeed not documented at all.
After much trial and error, this is the result of my experience:
- set up a basic cluster using cman and pacemaker (when using Red Hat or
CentOS). If you're unsure about the multicast performance of your
switches, use unicast (I needed this in some cases).
- don't use a shared FS for your virtual machines. GFS2 works OK, but
the I/O performance of your virtual machines drops a lot.
- because of the cluster, you can use clvmd. Even if not using
clustered logical volumes, you can still decide to stop the volume on
one server and start it on another via the pacemaker/heartbeat agents.
- use pacemaker to manage virtual machines (and, if not using clvmd, to
stop/start your LVs using tags). For the XML files describing your
VMs you'll unfortunately either need a small GFS2 partition or have to
rsync them between the 2 servers. But use the VirtualDomain resource
agent from git, it contains a lot of fixes (even some from me :-) ).
Also compile libvirtd from source (1.2.1 is very stable with a small
extra patch to talk to older qemu versions); the reason is that you
can then have more than 20 (or is it 25?) virtual machines running
on one KVM host without issues (plus lots of memory leak fixes, and it
provides an add-on: virtlockd). And since you don't touch the qemu from
the release you're using, it's not that big a deal.
- as an extra layer of protection, you can use virtlockd (to be sure
your VM doesn't run on 2 nodes at the same time). The disadvantage is
that you need a small GFS2 shared partition, but that's OK if you
don't want to use rsync for your XML files anyway.
I'm open to any questions and/or bashing :-)
Franky
Hi Franky,
Thanks for sharing your experience. I considered using clvm but I was
hoping for something simpler, for these reasons:
- clvm refuses to do its initial start if all cluster nodes aren't up,
which in some scenarios is a little problematic.
- I wouldn't like to have fencing enabled (imagine only one VM is using
clvm and the rest are on local disk), but Red Hat support requires fencing.
- KISS...
- oVirt/RHEV uses plain old non-clustered LVM.
I like this idea because it's super simple. Indeed, there is no
protection against something using that disk simultaneously on the host,
but that's why I'd use HA-LVM (or clvm) inside the VMs themselves.
My only real concern with non-clustered LVM is that the libvirt
"logical" pool type doesn't handle it at all. I'm not sure what it does
differently compared to the standard LVM commands, but it will only
start the pool on the first KVM node, and errors out on the next.
I'm still investigating (I also have a case open with Red Hat), so if I
find out anything interesting I'll post it here.