I am sort-of cross-posting this to the libvir-list because the
bind-dynamic fix may have introduced an undesirable new "feature".
I will be troubleshooting this. One thing I want to try is to build a
dnsmasq with a fake version so that bind-interfaces is forced into use.
Since this is occurring with a SLAAC address, I do not need DHCPv6 for
this testing. Wait, since Laine put a lot of effort into parsing
dnsmasq --help, I will just remove the bind-dynamic rather than changing
the version.
Gene
-------- Original Message --------
Subject: Re: [Dnsmasq-discuss] RA support in dnsmasq
Date: Fri, 30 Nov 2012 12:20:36 -0500
From: Gene Czarcinski <gene(a)czarc.net>
To: dnsmasq-discuss(a)thekelleys.org.uk
On 11/30/2012 11:32 AM, Simon Kelley wrote:
On 30/11/12 15:54, Gene Czarcinski wrote:
> On 11/29/2012 04:18 PM, Simon Kelley wrote:
>> On 29/11/12 20:31, Gene Czarcinski wrote:
>>
>>> I spoke too quickly.
>>>
>>> The cause of the problem is libvirt related but I am not sure what just
>>> yet.
>>>
>>> I was running a libvirt that had a lot of "stuff" on it but seemed to
>>> work OK. Then, earlier today I updated to a point that appears to be
>>> somewhat beyond the leading edge and, although I was not getting any
>>> RTR-ADVERT messages, it turned out that there were/are big-time problems
>>> running qemu-kvm. So, back off/downgrade to the previous version.
>>> Qemu-kvm now works but the RTR-ADVERT messages are back.
>>>
>>> This may be a bit time-consuming to debug!
>>>
>> Are you seeing the new log message in netlink.c?
>>
>>
> The good news is that libvirt is working again (I must have done a
> git-pull in the middle of an update). Thus, I am not seeing the large
> numbers of RTR-ADVERT.
>
> Yes, I am seeing the new log message and I have a question about that.
> Every time a new virtual network interface is started, something must be
> doing some type of broadcast because all of the dnsmasq instances (the
> new one and all the "old" ones) suddenly wake up and issue a flurry of
> RA packets and related syslog messages. To kick the flurry off, there
> is one of the new "unsolicited" syslog messages from each dnsmasq instance.
>
> Is this something you would expect? Is this "normal"? The libvirt
> folks say they are not doing it.
I'd expect it. The code you instrumented gets run whenever a "new
address" event happens, which is whenever an address is added to an
interface. "Every time a new virtual network interface is started" is a
good proxy for that.
The dnsmasq code isn't very discriminating: it updates its idea of
which interfaces have which addresses, and then does a minute of fast
advertisements on all of them. It might be possible to only do the fast
advertisements on new interfaces, but implementing that isn't totally
trivial.
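
A toy model of the behaviour described above may help. This is only a sketch in Python, not dnsmasq's actual code: the function name, the only_new switch, and the virbr8/virbr12 interfaces and their addresses are made up for illustration. It shows how one "new address" event restarts fast advertisements on every known interface, and what restricting that to newly appeared interfaces would look like.

```python
def on_new_address_event(known, current, only_new=False):
    """Toy model of a "new address" event.

    known   -- interface -> set of addresses before the event
    current -- interface -> set of addresses after the event
    Returns the refreshed map and the interfaces that would start
    a burst of fast router advertisements.
    """
    if only_new:
        # Hypothetical refinement: advertise only on interfaces
        # that were not known before the event.
        targets = sorted(name for name in current if name not in known)
    else:
        # Behaviour as described in the thread: every interface
        # gets the fast advertisements again.
        targets = sorted(current)
    return dict(current), targets


# virbr11 comes from the config in this thread; the other names and
# all addresses here are illustrative assumptions.
known = {
    "virbr8": {"fd00:beef:10:8::1"},
    "virbr11": {"fd00:beef:10:6::1"},
}
current = dict(known)
current["virbr12"] = {"fd00:beef:10:12::1"}

_, everyone = on_new_address_event(known, current)
_, newcomers = on_new_address_event(known, current, only_new=True)
print("all interfaces:", everyone)
print("new interfaces only:", newcomers)
```

With several dnsmasq instances each reacting this way, a single new virtual network would plausibly produce the flurry of RA packets and syslog messages reported earlier in the thread.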
Yes, I doubt very much if it would be trivial. However, I do not
believe that this is the basic problem.
When the problem occurs, one of the networks "suddenly" attempts to work
with the real NIC rather than the virtual one defined in its config
file. I slightly changed the IPv4 and IPv6 addresses defined for this
network and the problem went away. I have also "just" seen the problem
happen on another system which also had that virtual address defined.
BTW, these configurations all use interface= and bind-dynamic rather
than the "old" bind-interfaces with listen-address= specified for each
IPv4 and IPv6 address. I had not noticed the problem
previously. Why it occurs at all with just this specific address is
puzzling.
The configuration which causes problems is:
------------------------------------------
# dnsmasq conf file created by libvirt
strict-order
domain-needed
domain=net6
expand-hosts
local=/net6/
pid-file=/var/run/libvirt/network/net6.pid
bind-dynamic
interface=virbr11
dhcp-range=192.168.6.128,192.168.6.254
dhcp-no-override
dhcp-leasefile=/var/lib/libvirt/dnsmasq/net6.leases
dhcp-lease-max=127
dhcp-hostsfile=/var/lib/libvirt/dnsmasq/net6.hostsfile
addn-hosts=/var/lib/libvirt/dnsmasq/net6.addnhosts
dhcp-range=fd00:beef:10:6::1,ra-only
-------------------------------------------------
When I changed all the "6" to "160", the problem disappeared. And
there is another network defined almost the same with "8" instead of
"6", and I have had no problems with it.
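
For testing the "old" style against the bind-dynamic configuration above, the conf file could be rewritten mechanically. The following is a rough Python sketch under stated assumptions: the helper name is hypothetical, the listen addresses 192.168.6.1 and fd00:beef:10:6::1 are guesses at the bridge addresses (only the second appears in the config, as the ra-only range), and dropping interface= mirrors the old style described earlier rather than anything libvirt is known to emit.

```python
def to_bind_interfaces(conf_text, listen_addresses):
    """Rewrite a bind-dynamic dnsmasq conf into the older
    bind-interfaces + listen-address= form (illustrative only)."""
    out = []
    for line in conf_text.splitlines():
        option = line.strip()
        if option == "bind-dynamic":
            # Swap the binding mode and list explicit addresses.
            out.append("bind-interfaces")
            out.extend(f"listen-address={addr}" for addr in listen_addresses)
        elif option.startswith("interface="):
            # The old style relied on listen-address= instead.
            continue
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# A shortened copy of the configuration from this message.
sample = """\
strict-order
bind-dynamic
interface=virbr11
dhcp-range=192.168.6.128,192.168.6.254
"""

# Assumed bridge addresses; adjust to the real network definition.
converted = to_bind_interfaces(sample, ["192.168.6.1", "fd00:beef:10:6::1"])
print(converted, end="")
```

Whether router advertisements still behave as intended without interface= is not something this sketch checks; it only produces a file to experiment with.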
The real NIC is configured as a DHCP client for both IPv4 and IPv6.
It is assigned "nailed" addresses of 192.168.17.2/24 and
fd00:dead:beef:17::2.
And I just discovered why crazy stuff is happening (but I do not know
what causes it) ... the p33p1 NIC has:
inet6 fd00:beef:10:6:3285:a9ff:fe8f:e982/64 scope global dynamic
And the reason may be that NetworkManager goes through a
very long dance to bring up an interface. With dnsmasq autostarted for
net6, it may have gotten there first ... but it should not have
responded to p33p1 in any case. I am getting a little suspicious of
dnsmasq with bind-dynamic!
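
The address arithmetic behind that observation can be checked with Python's standard ipaddress module. This is just a sketch using the values quoted in this message; the helper eui64_to_mac is a made-up name, and it assumes the interface identifier is a modified EUI-64, which the ff:fe bytes in the middle suggest. It confirms that the dynamic address on p33p1 sits in the same /64 as the ra-only range of net6, while the "nailed" address does not.

```python
import ipaddress

# Values copied from this message.
nic_dynamic = ipaddress.ip_interface("fd00:beef:10:6:3285:a9ff:fe8f:e982/64")
net6_ra_addr = ipaddress.ip_address("fd00:beef:10:6::1")   # from dhcp-range=...,ra-only
nic_nailed = ipaddress.ip_address("fd00:dead:beef:17::2")  # address assigned to the real NIC


def eui64_to_mac(addr):
    """Recover a MAC address from a modified EUI-64 interface
    identifier, or return None if the ff:fe marker is absent."""
    iid = addr.packed[8:]            # low 64 bits of the address
    if iid[3:5] != b"\xff\xfe":
        return None
    # Undo the universal/local bit flip and drop the ff:fe filler.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{b:02x}" for b in mac)


same_prefix = net6_ra_addr in nic_dynamic.network
nailed_in_prefix = nic_nailed in nic_dynamic.network
mac = eui64_to_mac(nic_dynamic.ip)

print("dynamic address shares the net6 prefix:", same_prefix)
print("nailed address shares the net6 prefix:", nailed_in_prefix)
print("MAC implied by the interface identifier:", mac)
```

If the implied MAC matches the hardware address of p33p1, that would be consistent with the NIC having autoconfigured itself from a router advertisement carrying the net6 prefix.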
Gene