Hi all,
At the moment, RBD storage pools in libvirt must be supplied with a list
of Ceph monitor addresses, using <host> elements in the pool's source
definition. Ceph itself has a configuration file, and this is used by
default by all Ceph command-line utilities. This file can contain the
monitor addresses for the cluster, as well as a bunch of other useful
options (e.g. for tuning and debugging).
I think it would be nice if libvirt were able to load in this file when
starting RBD storage pools. Before I send some patches through, however, I
thought I'd better check to see whether my approach is sound.
First, I am not keen on having libvirt get librados to load the
configuration file automatically. librados actually uses a search path to
find the configuration file, and that path includes silly things like the
current working directory. Since it can be told to load a single file, I
think it would be better if it were made explicit in the storage pool XML,
i.e.:
<pool type="rbd">
  <name>rbd</name>
  <source>
    <name>rbd</name>
    <config file="/etc/ceph/ceph.conf"/>
    <auth username="user" type="ceph">
      <secret usage="mycephcluster"/>
    </auth>
  </source>
</pool>
<config> could be used in addition to, or as an alternative to, a list of
<host> elements. Would something along these lines be suitable? Would it
be better to use the <config> element's text content as the filename,
rather than an attribute? I'm not sure what style guidelines there are
for something like this.
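
To make that second question concrete, the text-content variant would look roughly like the sketch below. Both spellings of <config> are only proposals at this point, and the path is just the usual example location:

```xml
<pool type="rbd">
  <name>rbd</name>
  <source>
    <name>rbd</name>
    <!-- proposed alternative: filename as element text instead of a "file" attribute -->
    <config>/etc/ceph/ceph.conf</config>
    <auth username="user" type="ceph">
      <secret usage="mycephcluster"/>
    </auth>
  </source>
</pool>
```
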
The second part is of course to make a similar change to RBD-based domain
disk definitions, i.e.:
...
<disk type="network">
  <driver name="qemu" type="raw"/>
  <source protocol="rbd" name="pool/volume">
    <config file="/etc/ceph/ceph.conf"/>
  </source>
  <target dev="vda" bus="virtio"/>
  <auth username="user">
    <secret type="ceph" usage="mycephcluster"/>
  </auth>
</disk>
...
Again, <config> could be used instead of or alongside some <host>
elements.
This is where it gets a little tricky. At the moment, <host> in a disk's
source definition is entirely optional. Furthermore, QEMU _always_ loads a
Ceph configuration file -- either one supplied as a "conf" argument for
the block device, or one found through the search path mentioned earlier.
The only way to suppress this is to pass conf=/dev/null... but for
backwards-compatibility (users may be relying on QEMU's use of the search
path), I don't think we can do this now.
There's one final gotcha in all of this: if QEMU is given both a "conf"
argument and a "mon_addr" argument, only the latter will take effect. This
means if both <config> and <host> are supplied, then the <host> elements
will override any monitor addresses from the configuration file.
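
As a sketch of that combined case, a disk carrying both elements might look like the following. The <config> element is the proposed addition, and the monitor hostname mon1.example.org is a placeholder I made up for illustration. Given the QEMU behaviour described above, the <host> address would be used in place of any monitor addresses in the file, the intent being that the file's other options still apply:

```xml
<disk type="network">
  <driver name="qemu" type="raw"/>
  <source protocol="rbd" name="pool/volume">
    <!-- proposed element: tuning/debugging options would come from this file -->
    <config file="/etc/ceph/ceph.conf"/>
    <!-- placeholder monitor; would override monitor addresses from the file -->
    <host name="mon1.example.org" port="6789"/>
  </source>
  <target dev="vda" bus="virtio"/>
  <auth username="user">
    <secret type="ceph" usage="mycephcluster"/>
  </auth>
</disk>
```
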
For consistency, I intend to make an RBD storage pool have the same
behaviour. However, would it perhaps be better if the user could only
choose _either_ <config> or a list of <host> elements? Personally, I don't
think it's a big deal if the behaviour is clearly documented -- being able
to load options from a config file while still defining hosts in the
libvirt XML could be useful.
Anyway, before I send my patches through I'm interested in hearing
people's thoughts on this. All sound sane? Too intrusive? A waste of time?
:-)
- Michael