On Fri, Nov 19, 2010 at 9:50 AM, Daniel P. Berrange <berrange(a)redhat.com> wrote:
On Fri, Nov 19, 2010 at 09:27:40AM +0000, Stefan Hajnoczi wrote:
> On Thu, Nov 18, 2010 at 5:13 PM, Sage Weil <sage(a)newdream.net> wrote:
> > On Thu, 18 Nov 2010, Daniel P. Berrange wrote:
> >> On Wed, Nov 17, 2010 at 04:33:07PM -0800, Josh Durgin wrote:
> >> > Hi Daniel,
> >> >
> >> > On 11/08/2010 05:16 AM, Daniel P. Berrange wrote:
> >> > >>>>In any case, before someone goes off and implements something, does this
> >> > >>>>look like the right general approach to adding rbd support to libvirt?
> >> > >>>
> >> > >>>I think this looks reasonable. I'd be inclined to get the storage pool
> >> > >>>stuff working with the kernel RBD driver & UDEV rules for stable path
> >> > >>>names, since that avoids needing to make any changes to guest XML
> >> > >>>format. Support for QEMU with the native librados CEPH driver could
> >> > >>>be added as a second patch.
> >> > >>
> >> > >>Okay, that sounds reasonable. Supporting the QEMU librados driver is
> >> > >>definitely something we want to target, though, and it seems to be the
> >> > >>route that more users are interested in. Is defining the XML syntax for
> >> > >>a guest VM something we can discuss now as well?
> >> > >>
> >> > >>(BTW this is biting NBD users too. Presumably the guest VM XML should
> >> > >>look similar?)
> >> > >
> >> > >And also Sheepdog storage volumes. To define a syntax for all these we need
> >> > >to determine what configuration metadata is required at a per-VM level for
> >> > >each of them. Then try and decide how to represent that in the guest XML.
> >> > >It looks like at a VM level we'd need a hostname, port number and a volume
> >> > >name (or path).
> >> >
> >> > It looks like that's what Sheepdog needs from the patch that was
> >> > submitted earlier today. For RBD, we would want to allow multiple hosts,
> >> > and specify the pool and image name when the QEMU librados driver is
> >> > used, e.g.:
> >> >
> >> >   <disk type="rbd" device="disk">
> >> >     <driver name="qemu" type="raw" />
> >> >     <source vdi="image_name" pool="pool_name">
> >> >       <host name="mon1.example.org" port="6000" />
> >> >       <host name="mon2.example.org" port="6000" />
> >> >       <host name="mon3.example.org" port="6000" />
> >> >     </source>
> >> >     <target dev="vda" bus="virtio" />
> >> >   </disk>
> >> >
> >> > Does this seem like a reasonable format for the VM XML? Any suggestions?
> >>
> >> I'm basically wondering whether we should be going for separate types for
> >> each of NBD, RBD & Sheepdog, as per your proposal & the sheepdog one earlier
> >> today. Or whether to merge them into one type 'network' which covers any
> >> kind of network block device, and list a protocol on the source element, e.g.
> >>
> >>   <disk type="network" device="disk">
> >>     <driver name="qemu" type="raw" />
> >>     <source protocol='rbd|sheepdog|nbd' name="...some image identifier...">
> >>       <host name="mon1.example.org" port="6000" />
> >>       <host name="mon2.example.org" port="6000" />
> >>       <host name="mon3.example.org" port="6000" />
> >>     </source>
> >>     <target dev="vda" bus="virtio" />
> >>   </disk>
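
For context, each of those protocols already has its own filename syntax on
the QEMU command line, so a protocol= attribute would largely just select
which drive string libvirt generates. Roughly (the exact forms are
illustrative and vary between QEMU versions):

  $ qemu -drive file=rbd:pool_name/image_name,if=virtio
  $ qemu -drive file=sheepdog:image_name,if=virtio
  $ qemu -drive file=nbd:mon1.example.org:6000,if=virtio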
> >
> > That would work...
> >
> > One thing that I think should be considered, though, is that both RBD and
> > NBD can be used for non-qemu instances by mapping a regular block device
> > via the host's kernel. And in that case, there's some sysfs-fu (at least
> > in the rbd case; I'm not familiar with how the nbd client works) required
> > to set up/tear down the block device.
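
For reference, mapping an rbd image through the kernel client goes via
sysfs. This is a rough sketch from memory, and the monitor address, options
and field order shown here are placeholders that may differ between kernel
versions:

  $ echo "mon1.example.org:6000 name=admin pool_name image_name" > /sys/bus/rbd/add
  $ echo 0 > /sys/bus/rbd/remove   # tear down the device with id 0

So libvirt would need to drive an interface like that itself if it wanted
to expose kernel-mapped rbd devices.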
>
> An nbd block device is attached using the nbd-client(1) userspace tool:
> $ nbd-client my-server 1234 /dev/nbd0 # <host> <port> <nbd-device>
>
> That program will open the socket, grab /dev/nbd0, and poke it with a
> few ioctls so the kernel has the socket and can take it from there.
We don't need to worry about this for libvirt/QEMU. Since QEMU has native
NBD client support, there's no need to do anything with the nbd client tools
to set up the device for use with a VM.
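To illustrate: with the built-in client the disk can point straight at the
NBD server from the QEMU command line, with no /dev/nbd0 involved. Reusing
the host/port from the nbd-client example above, it would look something
like:

  $ qemu -drive file=nbd:my-server:1234,if=virtio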
I agree it's easier to use the built-in NBD support. Just wanted to
provide the background on how the NBD client works when using the kernel
implementation.
Stefan