On Wed, Feb 24, 2010 at 02:34:53PM +0000, Simon Kelley wrote:
As the principal maintainer of dnsmasq, I'm seeing increasing reports of
problems on systems which run both dnsmasq and libvirt. I'm fairly sure
I understand what's going on in these cases, and I have a few proposals
for changes in libvirt and dnsmasq that should fix things.
Thanks for starting this topic - it would certainly be nice if we can come
up with a solution that has better inter-operability & fewer surprises for
administrators.
The problem is that libvirt runs a private instance of dnsmasq: on
machines which are also running a "system" dnsmasq daemon, this can
cause problems.
Some background: dnsmasq can run in two modes.
Default mode: dnsmasq binds the wildcard address and does network magic
to determine which interface request packets actually come from, so that
the results can be sent back with the correct source address. This has
the advantage that network interfaces can come and go and change IP
address and dnsmasq will keep working. It's possible to restrict dnsmasq
to only reply to requests on some interfaces; requests from other
interfaces will be read by dnsmasq and then silently dropped. Telling
dnsmasq to use an interface which doesn't exist but might in the future
will result in a logged warning, but dnsmasq will still start and when
the interface comes up it will work.
Bind-interfaces mode: This is the traditional way to do UDP servers. At
startup dnsmasq enumerates all the extant interfaces and then opens a
socket for each one, listening on the interface's IP address.
Interfaces may be skipped if excluded by the --interface and
--except-interface flags, and any interface specified in --interface
which doesn't exist at start-up will generate a fatal error.
Yep, I remember we hit that fatal error in libvirt: when we created our
bridge device & then launched dnsmasq, sometimes dnsmasq would exit
with an error because the bridge device wasn't visible in userspace
yet. Thus we use bind-interfaces mode, but instead of using the flag
--interface=virbr0, we switched to --listen-address=IP-of-VIRBR0.
I imagine you've already seen this, but as an example of the ARGV that
libvirt generates for its dnsmasq instances:
/usr/sbin/dnsmasq \
--strict-order \
--bind-interfaces \
--pid-file=/var/run/libvirt/network/default.pid \
--conf-file= \
--listen-address 192.168.122.1 \
--except-interface lo \
--dhcp-range 192.168.122.2,192.168.122.254 \
--dhcp-lease-max=253
In almost all cases, default mode is better: --bind-interfaces is only
there to cope with old platforms which don't support enough socket
options to do default mode.
The only time when --bind-interfaces works better is when it's desirable
to run more than one instance of dnsmasq or have dnsmasq co-exist with
another DNS server. This is not possible in default mode, but it does
work in bind-interfaces mode, provided that _all_ instances of dnsmasq
are in bind-interfaces mode, and that they listen on a disjoint set of
interfaces.
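
The socket behaviour behind this can be reproduced with plain UDP
sockets. The following is only a minimal sketch, assuming a Linux host
where 127.0.0.1 and 127.0.0.2 are both usable loopback addresses and no
socket options such as SO_REUSEADDR are set; dnsmasq's real socket setup
may differ, and the function names are made up for illustration.

```python
import socket


def try_bind(address, port):
    """Return a UDP socket bound to (address, port), or None if refused."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
        return sock
    except OSError:
        sock.close()
        return None


def close_all(*socks):
    for sock in socks:
        if sock is not None:
            sock.close()


def wildcard_blocks_specific():
    """Default-mode analogue: a wildcard bind claims the port everywhere."""
    wildcard = try_bind("0.0.0.0", 0)  # port 0: let the kernel pick a free port
    port = wildcard.getsockname()[1]
    specific = try_bind("127.0.0.1", port)  # second "daemon", same port
    blocked = specific is None
    close_all(wildcard, specific)
    return blocked


def disjoint_addresses_coexist():
    """bind-interfaces analogue: one socket per address, same port."""
    first = try_bind("127.0.0.1", 0)
    port = first.getsockname()[1]
    second = try_bind("127.0.0.2", port)  # assumed to be a local address
    coexist = second is not None
    close_all(first, second)
    return coexist
```

On such a host the first function should report that the second bind is
refused, and the second should report that both binds succeed, which is
why every instance has to avoid the wildcard address for co-existence
to work.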
Yes, for want of any alternative, we currently recommend users with a
system instance of dnsmasq to use bind-interfaces, and either --interface
or --listen-address
http://wiki.libvirt.org/page/Libvirtd_and_dnsmasq
Therefore, to allow multiple dnsmasq instances libvirt's private dnsmasq
instance is started in bind-interfaces mode: that forces one of the
dnsmasq instances to do bind-interfaces. Many of the Linux distribution
dnsmasq packages have now implemented an /etc/dnsmasq.d directory where
configuration fragments can be dropped. Their libvirt packages are
putting a file there which contains a bind-interfaces command, so that
the "system" dnsmasq is automatically forced into the same mode, and the
two can co-exist.
This works, sort-of, but there are some disadvantages. Installing libvirt
drops the configuration change for the system dnsmasq, but the packages
frequently don't restart the system daemon, so that things transiently
fail until everything has rebooted. Much worse, the system dnsmasq is
forced into bind-interfaces mode and then service to transient
interfaces (usb, ad-hoc wifi) no longer works, or, because those
interfaces are mentioned in the dnsmasq configuration, dnsmasq now fails
at start-up when the interfaces don't exist.
Yes, this is rather a pain. Aside from the scheme you propose later,
there is one other (hacky) way to deal with this - use a udev script
to trigger update + reload of the system dnsmasq's configuration when
a USB NIC device hotplug/unplug occurs. That is clearly just a crude
patch over the already serious problem.
My proposal is to get rid of the necessity for two dnsmasq instances.
Libvirt should check for the existence of a "system" dnsmasq and, if the
system daemon exists, libvirt should drop the required configuration
into /etc/dnsmasq.d and then restart it. If the system daemon is not
installed or enabled, libvirt can start a private instance as now.
I'm wondering if there's any way we can arrange things so that we will
always be able to use a system dnsmasq instance, regardless of whether
the host already has it running.
My other concern with writing libvirt configs into /etc/dnsmasq.d is that
users will then get the impression that this is something that they can
freely edit / modify at will. They'll be unhappy when libvirt overwrites
their changes whenever it starts. This could perhaps be addressed by
allowing us to put the configs into /var/lib/dnsmasq/ instead of
/etc/dnsmasq.d, which is a more common location for non-user editable
configs generated at runtime.
Your general plan of having a single dnsmasq instance though does sound
desirable, given the way the sockets() APIs work wrt binding to addresses.
The difficulty with this scheme is that libvirt needs to create some
configuration which enables the services it needs on the virtual network
without disturbing, or being disturbed by, whatever configuration exists
for the system daemon. That's not currently possible, but it can be made
possible. I'm assuming that libvirt needs to provide a set of IP
address / MAC address mappings, and range of IP addresses on a virtual
network. It needs DHCP and DNS service on the virtual network.
The total set of dnsmasq args that we currently use is:
--strict-order
--bind-interfaces
--domain DOMAIN-NAME (optional)
--pid-file=/var/run/libvirt/network/$NETWORK.pid
--conf-file=
--listen-address=IPADDR-OF-BRIDGE
--except-interface=lo
--dhcp-range=IPRANGE (optional, multiple times)
--dhcp-lease-max=RANGE-SIZE (optional)
--dhcp-host=STATIC-HOST-MAPPING (optional)
--enable-tftp (optional)
--tftp-root=/some/path (optional)
--dhcp-boot=PXE-BOOT-SERVER (optional)
NB, we explicitly give a NULL conf-file in order to prevent any of the
user's settings from the system instance from conflicting with libvirt's
settings. We don't really want users to be able to specify arbitrary
other configuration settings for the libvirt dnsmasq instances, other
than those we enable via the libvirt XML configuration.
I've used 'optional' to denote flags we only pass when explicitly
configured via libvirt's XML format. The others we pass all the
time.
The 'lease-max' arg we calculate to exactly match the number of
addresses in the configured dhcp-range args. This is because some of
our users had configured dhcp ranges larger than 150 addresses in length.
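
To make the flag list and the lease-max calculation concrete, here is a
rough sketch of how such an argument vector could be assembled. It is
not libvirt's actual code: the function names and parameters are
hypothetical, only a subset of the optional flags is handled, and only
the flags themselves are taken from the list above.

```python
import ipaddress


def range_size(start, end):
    """Number of addresses in an inclusive start..end DHCP range."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(end))
    if last < first:
        raise ValueError("range end precedes range start")
    return last - first + 1


def build_dnsmasq_argv(network, bridge_ip, dhcp_ranges=(), domain=None):
    """Assemble the always-present flags plus the optional ones."""
    argv = ["/usr/sbin/dnsmasq", "--strict-order", "--bind-interfaces"]
    if domain:
        argv.append("--domain=" + domain)
    argv += [
        "--pid-file=/var/run/libvirt/network/%s.pid" % network,
        "--conf-file=",  # empty on purpose: ignore the system config
        "--listen-address=" + bridge_ip,
        "--except-interface=lo",
    ]
    for start, end in dhcp_ranges:
        argv.append("--dhcp-range=%s,%s" % (start, end))
    if dhcp_ranges:
        # lease-max is the total number of addresses across all ranges
        total = sum(range_size(start, end) for start, end in dhcp_ranges)
        argv.append("--dhcp-lease-max=%d" % total)
    return argv


argv = build_dnsmasq_argv(
    "default", "192.168.122.1", [("192.168.122.2", "192.168.122.254")]
)
```

For the default network's range of 192.168.122.2 to 192.168.122.254
this gives 254 - 2 + 1 = 253 leases, which matches the
--dhcp-lease-max=253 in the example command line earlier in this mail.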
The dhcp-host IP/MAC mappings are a non-problem: they will be ignored
for any other subnet where the IP addresses don't fit, and any other
dhcp-hosts in the system configuration will be similarly ignored for
DHCP on the virtual network subnet.
The dhcp-range is more of a problem. Service to particular networks in
dnsmasq is controlled by interface=<interface name> lines in the
configuration. If there are none of these, service is provided to all
interfaces. If they exist, service is limited to the interfaces
specified. The existence of any dhcp-range line in dnsmasq's
configuration enables the DHCP server for any subnet unless explicitly
limited to particular interfaces. So a default dnsmasq installation,
(with no interface=<interface>) which provides DNS everywhere but DHCP
nowhere would be turned into one which provided DHCP on every interface
by libvirt adding a dhcp-range. Since there wouldn't be a suitable DHCP
range for most subnets, this would only result in logged errors, but it
is still not good.
Worse, there's no good answer to the question 'should libvirt include
"interface=virt0" in the configuration it supplies?' If it does, then the
"enable DHCP on all interfaces" problem is solved, but a default system
configuration with no interface declaration is transformed from one
which provides DNS everywhere to one which provides DNS only to the
virtual interface. If libvirt doesn't provide "interface=virt0" and the
system configuration includes interface declarations, then there will be
no DNS or DHCP service to the virtual network.
Historically we did try using 'interface=virbr0' at one time, but we
suffered from race conditions with creation of our bridge, so we switched
to 'listen-address' instead & assume each host interface has separately
configured IP addresses. What would happen was that we'd create the bridge
device via an ioctl(), then spawn dnsmasq & it'd exit saying the interface
didn't exist. Adding a sleep(1) after the ioctl() would make it work, so
it was clearly some kernel<->userspace race rather than dnsmasq's problem.
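
For illustration only, a bounded poll is one alternative to a fixed
sleep(1) for this kind of race. This is not what libvirt did (it
switched to listen-address, as described above); the helper name and
timeouts are invented, and it is an assumption that the kernel's
name-to-index lookup becomes valid at the same moment the interface
shows up in dnsmasq's own enumeration.

```python
import socket
import time


def wait_for_interface(name, timeout=2.0, poll_interval=0.05):
    """Poll until the named interface is known to the kernel.

    Returns True as soon as the lookup succeeds, or False once the
    timeout has expired without the interface appearing.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            socket.if_nametoindex(name)  # raises OSError if unknown
            return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
```

A caller would create the bridge, call wait_for_interface("virbr0"),
and only spawn dnsmasq with --interface=virbr0 if it returned True.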
To solve this, I propose to add an optional interface name to the
dhcp-range declaration. The semantics of this would be rather odd, but
solve the problem perfectly.
1) for DHCP, if any other dhcp-range exists _without_ an interface name,
then the interface name is ignored and things behave as before,
otherwise DHCP is only provided to interfaces mentioned in dhcp-range
declarations.
2) for DNS, if there are no interface declarations, things work as
before. If there are interface declarations, the interfaces mentioned in
dhcp-ranges are added to the set which get DNS service.
With these rules, it should be possible for libvirt to drop eg
dhcp-range=interface:virt0,192.168.0.1,192.168.0.240
into the configuration of the system dnsmasq and get DHCP and DNS
service for virt0, irrespective of any other configuration in the system
dnsmasq, and doing so shouldn't affect the services supplied elsewhere.
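
The two rules can be written down as a small model to check how they
combine. This is only a sketch of the semantics as proposed in this
mail, not dnsmasq code: the function, its arguments and the ALL marker
are hypothetical, and "as before" is taken to mean that interface=
lines (or their absence) decide where service is offered.

```python
ALL = "all"  # marker meaning "every interface"


def service_scope(interface_decls, dhcp_range_ifaces):
    """Return (dhcp_scope, dns_scope) under the proposed rules.

    interface_decls:   names taken from interface= lines
    dhcp_range_ifaces: one entry per dhcp-range line; the interface
                       name it carries, or None if it carries none
    Each scope is either ALL or a set of interface names.
    """
    # "As before": no interface= lines means service everywhere.
    base = set(interface_decls) if interface_decls else ALL
    tagged = {name for name in dhcp_range_ifaces if name is not None}

    # Rule 1 (DHCP)
    if not dhcp_range_ifaces:
        dhcp = set()  # no dhcp-range at all: DHCP server is off
    elif any(name is None for name in dhcp_range_ifaces):
        dhcp = base   # an untagged range exists: names are ignored
    else:
        dhcp = tagged  # only interfaces named in dhcp-range lines

    # Rule 2 (DNS)
    dns = ALL if base is ALL else base | tagged
    return dhcp, dns
```

Under this model, a default system configuration plus libvirt's
fragment, service_scope([], ["virt0"]), yields DHCP on virt0 only and
DNS everywhere, while service_scope(["eth0"], ["virt0"]) yields DHCP on
virt0 only and DNS on eth0 and virt0. An untagged dhcp-range in the
system configuration, as in service_scope([], [None]), still yields
DHCP everywhere, including on a libvirt bridge.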
Would this scheme allow libvirt to guarantee that no DHCP is present
on its interface ? We currently support running in DNS-only mode, or
DNS+DHCP. It is desirable to keep that regardless of how the host's
system dnsmasq is currently configured for other interfaces.
If I am understanding your suggestion, this allows libvirt to easily
enable DNS+DHCP mode on its own interface, without us accidentally
enabling DHCP on other host interfaces.
If libvirt doesn't use any --dhcp-range flags, there is still a chance
that DHCP could be enabled on libvirt's interface if the system dnsmasq
had any dhcp-range args. Though assuming the IP ranges don't overlap
this should be effectively a no-op ?
The code in libvirt to make this work looks like this:
echo "dhcp-range=interface:virt0,<ip range>" >>/etc/dnsmasq.d/libvirt
if <system dnsmasq is not installed or not enabled>; then
    dnsmasq --interface=virt0 \
            --bind-interfaces --conf-file=/etc/dnsmasq.d/libvirt
else
    /etc/init.d/dnsmasq restart
fi
(The --bind-interfaces in the private-dnsmasq instance keeps dnsmasq
from clashing with other nameservers eg BIND which may be running.)
The system dnsmasq package has to ensure that /etc/dnsmasq.d is read for
configuration fragments, and the dnsmasq package and the libvirt package
will have to co-operate to manage transitions between private and system
dnsmasq mode caused by package installation or removal.
Does that make sense? It's a long and involved explanation to come to
cold. I fear I may have over-simplified what libvirt is doing with
dnsmasq, in which case please enlighten me and I'll modify my scheme to
take that into account. If this looks good I can easily have the
necessary dnsmasq changes in the next release.
I think you've got the general picture of what we're doing with dnsmasq.
At a very high level our original goals were
- Support multiple independently configured networks (virbr0, virbr1, etc)
- Isolation between libvirt network interface config & host interface config
- Only support configuration of options via libvirt network XML format
Overall, libvirt aims to provide a standard representation of configuration
of services regardless of underlying implementation. Thus the ideal would be
that end users would not need to know or care that libvirt was using dnsmasq
as its implementation. Obviously we're failing here due to the inevitable
conflict with the system dnsmasq that operates in wildcard addressing mode.
Your proposal certainly helps us deal with that conflict in a better way.
My main concern is that it has the potential to significantly reduce the
isolation of configuration between interfaces. eg, does libvirt's use of
the --enable-tftp arg suffer from the same problem as --dhcp-range, where
libvirt setting it for one interface inadvertently enables it for all others?
This would seem to imply that many other dnsmasq arguments would need to gain
an extra 'interface' parameter to restrict their scope, which sounds like
quite a burden for your code to support ? In essence we're trying to have
1 single dnsmasq process, but at the same time ensure that everything in the
extra /etc/dnsmasq.d/libvirt-virbr0 file is scoped to a single interface.
I could almost see that file containing 'scope=virbr0' as a short-cut for
saying that every config flag listed there applies only to that one interface.
Nb, I've not said it explicitly, but although the default libvirt config
starts with a single dnsmasq instance attached to virbr0 interface, we
have the ability to start many dnsmasq instances each on a different bridge
device.
On a completely unrelated topic, do you have any plans to support IPv6 in
dnsmasq in the future ? eg things like DHCPv6, listen on IPv6 for DNS
requests, and serving of AAAA records.
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|