On Thu, Nov 18, 2010 at 5:13 PM, Sage Weil <sage@newdream.net> wrote:
On Thu, 18 Nov 2010, Daniel P. Berrange wrote:
> On Wed, Nov 17, 2010 at 04:33:07PM -0800, Josh Durgin wrote:
> > Hi Daniel,
> >
> > On 11/08/2010 05:16 AM, Daniel P. Berrange wrote:
> > >>>>In any case, before someone goes off and implements something, does
> > >>>>this look like the right general approach to adding rbd support to
> > >>>>libvirt?
> > >>>
> > >>>I think this looks reasonable. I'd be inclined to get the storage pool
> > >>>stuff working with the kernel RBD driver & UDEV rules for stable path
> > >>>names, since that avoids needing to make any changes to guest XML
> > >>>format. Support for QEMU with the native librados CEPH driver could
> > >>>be added as a second patch.
> > >>
> > >>Okay, that sounds reasonable. Supporting the QEMU librados driver is
> > >>definitely something we want to target, though, and seems to be the
> > >>route that more users are interested in. Is defining the XML syntax
> > >>for a guest VM something we can discuss now as well?
> > >>
> > >>(BTW this is biting NBD users too. Presumably the guest VM XML should
> > >>look similar?)
> > >
> > >And also Sheepdog storage volumes. To define a syntax for all these we need
> > >to determine what configuration metadata is required at a per-VM level for
> > >each of them. Then try and decide how to represent that in the guest XML.
> > >It looks like at a VM level we'd need a hostname, port number and a
> > >volume name (or path).
> >
> > It looks like that's what Sheepdog needs from the patch that was
> > submitted earlier today. For RBD, we would want to allow multiple hosts,
> > and specify the pool and image name when the QEMU librados driver is
> > used, e.g.:
> >
> > <disk type="rbd" device="disk">
> > <driver name="qemu" type="raw" />
> > <source vdi="image_name" pool="pool_name">
> > <host name="mon1.example.org" port="6000">
> > <host name="mon2.example.org" port="6000">
> > <host name="mon3.example.org" port="6000">
> > </source>
> > <target dev="vda" bus="virtio" />
> > </disk>
> >
> > Does this seem like a reasonable format for the VM XML? Any suggestions?
>
> I'm basically wondering whether we should be going for separate types for
> each of NBD, RBD & Sheepdog, as per your proposal & the sheepdog one earlier
> today. Or try to merge them into one type 'network' which covers any kind of
> network block device, and list a protocol on the source element, eg
>
> <disk type="network" device="disk">
> <driver name="qemu" type="raw" />
> <source protocol='rbd|sheepdog|nbd' name="...some image
identifier...">
> <host name="mon1.example.org" port="6000">
> <host name="mon2.example.org" port="6000">
> <host name="mon3.example.org" port="6000">
> </source>
> <target dev="vda" bus="virtio" />
> </disk>
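
(For reference, each of these protocols already has a qemu -drive spelling
along roughly these lines; the host, port, pool and image names below are
just the placeholders from the XML above, and the exact syntax may vary by
qemu version:

 $ qemu -drive file=nbd:mon1.example.org:6000,format=raw,if=virtio
 $ qemu -drive file=rbd:pool_name/image_name,format=raw,if=virtio
 $ qemu -drive file=sheepdog:mon1.example.org:6000:image_name,format=raw,if=virtio

rbd usually picks up its monitor addresses from /etc/ceph/ceph.conf rather
than from the drive string.)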
That would work...
One thing that I think should be considered, though, is that both RBD and
NBD can be used for non-qemu instances by mapping a regular block device
via the host's kernel. And in that case, there's some sysfs-fu (at least
in the rbd case; I'm not familiar with how the nbd client works) required
to set up/tear down the block device.
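For anyone following along, my (possibly out of date) understanding of the
rbd side is that the mapping is driven entirely through sysfs, roughly like
this, where the monitor address, the "admin" user and the pool/image names
are made-up examples:

 $ echo "10.0.0.1:6789 name=admin rbd myimage" > /sys/bus/rbd/add
 $ ls /dev/rbd*                    # a /dev/rbd<N> device should show up
 $ echo 0 > /sys/bus/rbd/remove    # tear down by writing the device id back
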
An nbd block device is attached using the nbd-client(1) userspace tool:
$ nbd-client my-server 1234 /dev/nbd0 # <host> <port> <nbd-device>
That program will open the socket, grab /dev/nbd0, and poke it with a
few ioctls so the kernel has the socket and can take it from there.
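Tearing it down again is just:

$ nbd-client -d /dev/nbd0      # disconnect <nbd-device>

which asks the kernel to drop the socket and release the device.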
Stefan