[Libvir] PATCH: Avahi advertisement for libvirtd

18 Sep 2007

This patch is a serious attempt at supporting Avahi advertisement of the
libvirtd service.

 - configure by default will probe to see if avahi is available and if
   found will enable appropriate code.

       --with-avahi    will force error if not found
       --without-avahi   will disable it with no checking

 - HAVE_AVAHI is defined in config.h if avahi is found & used to conditionally
   enable some code in qemud/qemud.c

 - HAVE_AVAHI is also a Makefile conditional to enable compilation of the
   mdns.c and mdns.h files. A little makefile rearrangement was needed to
   make sure variables like EXTRA_DIST were defined before we appended to
   them with +=

 - The code in mdns.c contains all the support for dealing with the Avahi
   APIs. 

 - The primary Avahi API is horribly complicated for day-to-day
   use in application code, exposing far too much of the event loop and
   state machine. So we expose a simplified API to libvirt in mdns.h

 - We fully support the Avahi state machine, so you can start libvirt even
   if Avahi is not running on the machine. If you later start Avahi, then
   libvirt will automatically detect & register with it. Likewise if you
   stop Avahi we handle that gracefully by shutting down our internal mdns
   support & waiting for avahi to restart. 

   NB this does require that the DBus system bus daemon is running. This
   is a limitation of the current Avahi client library & not our use of it.
   It is expected this will be fixed in future avahi releases.

 - The Avahi client library is basically a shim which talks to the Avahi
   daemon using the DBus system daemon. The DBus stuff doesn't leak out of
   the Avahi APIs - it is loosely coupled - all we need do is provide Avahi
   with an event loop implementation, which was surprisingly easy. The
   libvirtd daemon does now indirectly link with DBus, but I don't see any
   problem with this. If you don't want it, use --without-avahi

 - We advertise a service name of _libvirt._tcp. The IETF draft recommends
   use of the name from /etc/services associated with your app. There is a
   way to register official Avahi service names. We don't have an
   /etc/services name registered either though.

 - If TLS is *not* enabled, we advertise _libvirt._tcp with a port number
   of 0.  The use of a port number 0 is explicitly allowed by the IETF
   draft docs - see [1] "8. Flagship Naming".  If clients see the
   _libvirt._tcp service with a port of 0, they should figure out some way
   to tunnel to the UNIX domain socket on the advertised machine. The
   tunnel mechanism is undefined, though SSH is the obvious choice.

   If TLS is enabled, then we use getsockname() to fetch the actual port
   number we're listening on for the TLS enabled socket. This plays nice
   with the admin's ability to override the port number. Clients can still
   of course choose to tunnel instead of using TLS.

   NB. we explicitly refuse to advertise the non-TLS port.

 - The default service name we advertise is 'Virtualization Host %h' where
   '%h' is the short hostname (ie without domain name). This has to be less
   than 63 characters & stripping domain name is usual practice, since this
   is implicit from the mdns domain you are browsing.

   This can be overridden with mdns_name="Blah"  in the libvirtd.conf
   configuration file. Service names *must* be unique in the LAN. If a name
   clashes, then avahi appends junk like #1, #2, #3... until it is unique.

 - Even when compiled in, use of Avahi mdns can be disabled by setting the
   mdns_adv=0 config file parameter in /etc/libvirt/libvirtd.conf

 - This patch does not advertise any per-VM  VNC server instances, but I have
   prepared the APIs in mdns.h to be ready to support that with minimal effort.
   A pre-requisite for this is an extension to the driver API to get async
   signals when VMs start & stop, since making the daemon poll hypervisors 
   will suck big time.

   When implemented each VM will be its own mdns 'group' and the VNC server
   associated with it will be the 'service' advertised in that group.
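
The port and service-name rules in the list above can be sketched as
follows. This is an illustration in Python, not the C code in mdns.c: the
helper names default_service_name and advertised_port are hypothetical, and
enforcing the length limit by truncating to 63 bytes is an assumption.

```python
MAX_NAME_BYTES = 63  # assumed cap on the advertised service name


def default_service_name(hostname):
    """Build 'Virtualization Host %h', where %h is the short hostname."""
    short = hostname.split(".", 1)[0]  # strip the domain part
    name = "Virtualization Host " + short
    # Assumption: enforce the limit by truncating the UTF-8 encoding.
    return name.encode("utf-8")[:MAX_NAME_BYTES].decode("utf-8", errors="ignore")


def advertised_port(tls_sock):
    """Return the port to advertise: the real TLS port, or 0 if TLS is off."""
    if tls_sock is None:
        return 0  # no TLS listener, so clients have to tunnel
    # getsockname() reports the port actually bound, so an
    # admin-overridden port number is picked up automatically.
    return tls_sock.getsockname()[1]
```

Passing None for the socket models the "TLS not enabled" case; any object
with a getsockname() method returning an (address, port) pair works for the
TLS case.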

Having applied this patch & started the daemon, if /etc/init.d/avahi-daemon
is running, you should see the service advertised on the LAN. As mentioned
earlier, if you start the Avahi daemon after libvirtd, libvirtd should
detect this too.

You can verify by running the following on another box on the same LAN
segment.

  $ avahi-browse --resolve _libvirt._tcp 

When you start libvirtd on the first box, avahi-browse should print out
a record. It should look like this:

+ eth0 IPv4 Virtualization Host avocado          _libvirt._tcp        local
= eth0 IPv4 Virtualization Host avocado          _libvirt._tcp        local
   hostname = [avocado.local]
   address = [10.13.7.48]
   port = [16514]
   txt = ["org.freedesktop.Avahi.cookie=3025088350"]

If you've not configured TLS, then port will be '0' instead of 16514.
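
A client could turn the resolved record above into a connection decision
roughly like this. The sketch parses the human-readable 'key = [value]'
lines exactly as printed above. The function names and the "tls" / "tunnel"
labels are made up for illustration, and a real client would more likely
talk to Avahi directly than scrape avahi-browse output.

```python
import re

# Matches lines such as '   port = [16514]' in the resolved record.
FIELD_RE = re.compile(r"^\s*(hostname|address|port|txt)\s*=\s*\[(.*)\]\s*$")


def parse_resolved_record(text):
    """Collect the 'key = [value]' lines of one resolved record into a dict."""
    record = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            record[match.group(1)] = match.group(2)
    if "port" in record:
        record["port"] = int(record["port"])
    return record


def choose_transport(record):
    """Port 0 means no TLS port is advertised, so the client must tunnel."""
    return "tunnel" if record["port"] == 0 else "tls"
```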

So that brings me nicely onto the main outstanding issue.

Notice the hostname is 'avocado.local' - this is the standard mdns practice
for zero-conf DNS hostnames. If your /etc/nsswitch.conf is set up correctly
the name avocado.local will actually resolve to the IP address you see there
of '10.13.7.48' (well s/avocado/your hostname/ of course).

Well, x509 certificates include a FQDN in them & the client is expected to
validate the certificate hostname against the hostname it connected to. Now
my FQDN is avocado.virt.boston.redhat.com which is obviously not going to
validate against avocado.local.

There are a few ways around this issue I can come up with so far:

  - Recommend that the client try to reverse-DNS the 'address' field from
    the mdns advertisement. If reverse-DNS is working, 10.13.7.48 will be
    transformed into 'avocado.virt.boston.redhat.com' which the client can
    then connect to. 

  - Add TXT record containing the hostname associated with the certificate.
    If the client were to use this record, then obviously any validation of
    it against the resulting certificate is sort-of pointless, since both
    came from the server. The first approach rather had this drawback too.

  - Advertise the records in the real domain 'virt.boston.redhat.com' instead
    of '.local'. I have not been able to make this work.

  - Simply don't bother validating the remote hostname against the server
    certificate CN. ie, take the position that if using zero-conf then
    you already have some implicit level of trust in your LAN and validating
    the cert against the CA cert is sufficient & hostname matching can be
    ignored.
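
The first option, reverse-resolving the advertised address, might look like
this on the client side. It is only a sketch: the function name is
hypothetical, and the resolver argument exists purely so the lookup can be
swapped out. It defaults to the standard socket.gethostbyaddr().

```python
import socket


def certificate_hostname(address, mdns_hostname, resolver=socket.gethostbyaddr):
    """Prefer the reverse-DNS name of 'address'; fall back to the mdns name."""
    try:
        fqdn, _aliases, _addresses = resolver(address)
        return fqdn
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses;
        # without working reverse DNS we only have the .local name.
        return mdns_hostname
```

With working reverse DNS, certificate_hostname("10.13.7.48",
"avocado.local") would hand back the FQDN to validate the certificate
against. Otherwise the caller is left with avocado.local and the mismatch
described above.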

All of these options mildly abuse mdns / zeroconf, but then x509
certificates don't really fit into the model of 'zero conf' in the first
place. If people want true zero-conf then the (SSH) tunnel is better (and
always available), but if they've set up certificates they should still be
allowed to use zero-conf to at least locate hosts. So mildly abusing the
rules is reasonable IMHO.

Personally I'm tending towards the last approach.
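
For reference, the last option (validate the certificate chain against the
CA but skip hostname matching) is expressible in most TLS libraries. The
sketch below uses Python's ssl module purely as an illustration. It is not
libvirt's TLS code, and the function name and its cafile parameter are
hypothetical.

```python
import ssl


def lan_trust_context(cafile=None):
    """TLS client context: verify the chain against the CA, skip name matching."""
    # cafile=None falls back to the system's default CA certificates.
    ctx = ssl.create_default_context(cafile=cafile)
    # Do not compare the certificate name with the mdns hostname.
    ctx.check_hostname = False
    # Still insist on a certificate that chains to a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```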

Regards,
Dan.

[1] http://files.dns-sd.org/draft-cheshire-dnsext-dns-sd.txt
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|

Daniel P. Berrange