[libvirt] fwd: RA support in dnsmasq

I am sort-of cross-posting this to the libvir-list because the bind-dynamic fix may have introduced an undesirable new "feature".

I will be troubleshooting this. One thing I want to try is to build a dnsmasq with a fake version so that bind-interfaces is forced into use. Since this is occurring with a SLAAC address, I do not need DHCPv6 for this testing. Wait, since Laine put a lot of effort into parsing dnsmasq --help, I will just remove the bind-dynamic rather than changing the version.

Gene

-------- Original Message --------
Subject: Re: [Dnsmasq-discuss] RA support in dnsmasq
Date: Fri, 30 Nov 2012 12:20:36 -0500
From: Gene Czarcinski <gene@czarc.net>
To: dnsmasq-discuss@thekelleys.org.uk

On 11/30/2012 11:32 AM, Simon Kelley wrote:
On 30/11/12 15:54, Gene Czarcinski wrote:
On 11/29/2012 04:18 PM, Simon Kelley wrote:
On 29/11/12 20:31, Gene Czarcinski wrote:
I spoke too quickly.
The cause of the problem is libvirt related but I am not sure what just yet.
I was running a libvirt that had a lot of "stuff" on it but seemed to work OK. Then, earlier today I updated to a point that appears to be somewhat beyond the leading edge and, although I was not getting any RTR-ADVERT messages, it turned out that there were/are big-time problems running qemu-kvm. So, I backed off/downgraded to the previous version. Qemu-kvm now works but the RTR-ADVERT messages are back.
This may be a bit time-consuming to debug!
Are you seeing the new log message in netlink.c?
The good news is that libvirt is working again (I must have done a git-pull in the middle of an update). Thus, I am not seeing the large numbers of RTR-ADVERT.
Yes, I am seeing the new log message and I have a question about that. Every time a new virtual network interface is started, something must be doing some type of broadcast because all of the dnsmasq instances (the new one and all the "old" ones) suddenly wake up and issue a flurry of RA packets and related syslog messages. To kick the flurry off, there is one of the new "unsolicited" syslog messages from each dnsmasq instance.
Is this something you would expect? Is this "normal"? The libvirt folks say they are not doing it.
I'd expect it. The code you instrumented gets run whenever a "new address" event happens, which is whenever an address is added to an interface. "Every time a new virtual network interface is started" is a good proxy for that.
The dnsmasq code isn't very discriminating: it updates its idea of which interfaces have which addresses, and then does a minute of fast advertisements on all of them. It might be possible to only do the fast advertisements on new interfaces, but implementing that isn't totally trivial.
Yes, I doubt very much if it would be trivial. However, I do not believe that this is the basic problem.
When the problem occurs, one of the networks "suddenly" attempts to work with the real NIC rather than the virtual one defined in its config file. I slightly changed the IPv4 and IPv6 addresses defined for this network and the problem went away. I have also "just" seen the problem happen on another system which also had that virtual address defined.
BTW, these configurations all use interface= and bind-dynamic rather than the "old" bind-interfaces with listen-address= specified for each IPv4 and IPv6 address. I had not noticed the problem previously. Why it occurs at all with just this specific address is puzzling.
The configuration which causes problems is:
------------------------------------------
# dnsmasq conf file created by libvirt
strict-order
domain-needed
domain=net6
expand-hosts
local=/net6/
pid-file=/var/run/libvirt/network/net6.pid
bind-dynamic
interface=virbr11
dhcp-range=192.168.6.128,192.168.6.254
dhcp-no-override
dhcp-leasefile=/var/lib/libvirt/dnsmasq/net6.leases
dhcp-lease-max=127
dhcp-hostsfile=/var/lib/libvirt/dnsmasq/net6.hostsfile
addn-hosts=/var/lib/libvirt/dnsmasq/net6.addnhosts
dhcp-range=fd00:beef:10:6::1,ra-only
-------------------------------------------------
When I changed all the "6" to "160", the problem disappeared. And there is another network defined almost the same with "8" instead of "6" and I have had no problems with it.
The real NIC is configured as a DHCP client for both IPv4 and IPv6. It is assigned "nailed" addresses of 192.168.17.2/24 and fd00:dead:beef:17::2.
And I just discovered why crazy stuff is happening (but I do not know what causes it) ... the p33p1 NIC has: inet6 fd00:beef:10:6:3285:a9ff:fe8f:e982/64 scope global dynamic
And the reason why may be that NetworkManager goes through a very long dance to bring up an interface. With dnsmasq autostarted for net6, it may have gotten there first ... but it should not have responded to p33p1 in any case. I am getting a little suspicious of dnsmasq with bind-dynamic!
Gene
_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss

On 11/30/2012 12:31 PM, Gene Czarcinski wrote:
I am sort-of cross posting this to the libvir-list because the bind-dynamic fix may have introduced an undesirable new "feature".
I will be troubleshooting this. One thing I want to try is to build a dnsmasq with a fake version so that bind-interfaces is forced into use. Since this is occurring with a SLAAC address, I do not need DHCPv6 for this testing. Wait, since Laine put a lot of effort into parsing dnsmasq --help, I will just remove the bind-dynamic rather than changing the version.
Cancel that. I did some grepping of syslogs and this "problem" started around 8 Nov (or at least that is the earliest one I found). That is well before the bind-dynamic stuff was integrated in.
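A minimal sketch of that kind of syslog search, for anyone who wants to repeat it. It assumes the log is in chronological order and that the relevant messages contain the string RTR-ADVERT mentioned earlier in the thread; the function name and the idea of passing an open file are illustrative only, and no particular log path is implied.

```python
def first_match(lines, needle="RTR-ADVERT"):
    """Return the first line containing `needle`, or None if there is none.

    `lines` is any iterable of strings, for example an open log file.
    In a chronologically ordered log, the first match is the earliest one.
    """
    for line in lines:
        if needle in line:
            return line.rstrip("\n")
    return None
```

Called as `first_match(open(path))` on each rotated log in turn, oldest first, this gives the earliest occurrence that is still on disk.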
-------- Original Message -------- Subject: Re: [Dnsmasq-discuss] RA support in dnsmasq Date: Fri, 30 Nov 2012 12:20:36 -0500 From: Gene Czarcinski <gene@czarc.net> To: dnsmasq-discuss@thekelleys.org.uk
On 11/30/2012 11:32 AM, Simon Kelley wrote:
On 30/11/12 15:54, Gene Czarcinski wrote:
On 11/29/2012 04:18 PM, Simon Kelley wrote:
On 29/11/12 20:31, Gene Czarcinski wrote:
I spoke too quickly.
The cause of the problem is libvirt related but I am not sure what just yet.
I was running a libvirt that had a lot of "stuff" on it but seemed to work OK. Then, earlier today I updated to a point that appears to be somewhat beyond the leading edge and, although I was not getting any RTR-ADVERT messages, it turned out that there were/are big-time problems running qemu-kvm. So, I backed off/downgraded to the previous version. Qemu-kvm now works but the RTR-ADVERT messages are back.
This may be a bit time-consuming to debug!
Are you seeing the new log message in netlink.c?
The good news is that libvirt is working again (I must have done a git-pull in the middle of an update). Thus, I am not seeing the large numbers of RTR-ADVERT.
Yes, I am seeing the new log message and I have a question about that. Every time a new virtual network interface is started, something must be doing some type of broadcast because all of the dnsmasq instances (the new one and all the "old" ones) suddenly wake up and issue a flurry of RA packets and related syslog messages. To kick the flurry off, there is one of the new "unsolicited" syslog messages from each dnsmasq instance.
Is this something you would expect? Is this "normal"? The libvirt folks say they are not doing it.
I'd expect it. The code you instrumented gets run whenever a "new address" event happens, which is whenever an address is added to an interface. "Every time a new virtual network interface is started" is a good proxy for that.
The dnsmasq code isn't very discriminating: it updates its idea of which interfaces have which addresses, and then does a minute of fast advertisements on all of them. It might be possible to only do the fast advertisements on new interfaces, but implementing that isn't totally trivial.
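The difference between the two policies described above can be modelled in a few lines. This is only a toy illustration in Python, not dnsmasq's actual code (which is C); the function name, the `only_changed` flag, and the dictionary representation of interfaces are all hypothetical.

```python
def interfaces_to_advertise(old, new, only_changed=False):
    """Pick the interfaces that should get a burst of fast advertisements.

    `old` and `new` map interface name -> set of addresses, before and
    after a "new address" event. With only_changed=False this models the
    behaviour described in the thread: every known interface is selected.
    With only_changed=True it models the possible refinement: only
    interfaces whose address set is new or different are selected.
    """
    if not only_changed:
        return sorted(new)
    return sorted(name for name, addrs in new.items() if old.get(name) != addrs)
```

With two existing bridges and one newly started bridge, the first policy selects all three interfaces, while the second selects only the new one.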
Yes, I doubt very much if it would be trivial. However, I do not believe that this is the basic problem.
When the problem occurs, one of the networks "suddenly" attempts to work with the real NIC rather than the virtual one defined in its config file. I slightly changed the IPv4 and IPv6 addresses defined for this network and the problem went away. I have also "just" seen the problem happen on another system which also had that virtual address defined.
BTW, these configurations all use interface= and bind-dynamic rather than the "old" bind-interfaces with listen-address= specified for each IPv4 and IPv6 address. I had not noticed the problem previously. Why it occurs at all with just this specific address is puzzling.
The configuration which causes problems is:
------------------------------------------
# dnsmasq conf file created by libvirt
strict-order
domain-needed
domain=net6
expand-hosts
local=/net6/
pid-file=/var/run/libvirt/network/net6.pid
bind-dynamic
interface=virbr11
dhcp-range=192.168.6.128,192.168.6.254
dhcp-no-override
dhcp-leasefile=/var/lib/libvirt/dnsmasq/net6.leases
dhcp-lease-max=127
dhcp-hostsfile=/var/lib/libvirt/dnsmasq/net6.hostsfile
addn-hosts=/var/lib/libvirt/dnsmasq/net6.addnhosts
dhcp-range=fd00:beef:10:6::1,ra-only
-------------------------------------------------
When I changed all the "6" to "160", the problem disappeared. And there is another network defined almost the same with "8" instead of "6" and I have had no problems with it.
The real NIC is configured as a DHCP client for both IPv4 and IPv6. It is assigned "nailed" addresses of 192.168.17.2/24 and fd00:dead:beef:17::2.
And I just discovered why crazy stuff is happening (but I do not know what causes it) ... the p33p1 NIC has: inet6 fd00:beef:10:6:3285:a9ff:fe8f:e982/64 scope global dynamic
And the reason why may be that NetworkManager goes through a very long dance to bring up an interface. With dnsmasq autostarted for net6, it may have gotten there first ... but it should not have responded to p33p1 in any case. I am getting a little suspicious of dnsmasq with bind-dynamic!
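The observation above can be checked mechanically. The following sketch uses Python's standard ipaddress module to confirm that the address seen on p33p1 lies inside the prefix of the net6 ra-only range, and to recover the MAC address embedded in its interface identifier. It assumes the advertised prefix is fd00:beef:10:6::/64 (the /64 is taken from the inet6 line, not from the dnsmasq config) and that the address was formed by SLAAC with a modified EUI-64 identifier; the helper names are illustrative.

```python
import ipaddress


def in_prefix(addr, prefix):
    """True if the IPv6 address `addr` lies inside the network `prefix`."""
    return ipaddress.IPv6Address(addr) in ipaddress.IPv6Network(prefix)


def mac_from_slaac(addr):
    """Recover the MAC address from a modified EUI-64 interface identifier.

    Returns None if the low 64 bits do not carry the ff:fe marker, i.e.
    the address was not built from a MAC in this way.
    """
    iid = ipaddress.IPv6Address(addr).packed[8:]
    if iid[3:5] != b"\xff\xfe":
        return None
    # Undo the universal/local bit flip and drop the ff:fe filler.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{b:02x}" for b in mac)


NET6_PREFIX = "fd00:beef:10:6::/64"  # assumed /64 for the net6 ra-only range
P33P1_ADDR = "fd00:beef:10:6:3285:a9ff:fe8f:e982"

print(in_prefix(P33P1_ADDR, NET6_PREFIX))
print(mac_from_slaac(P33P1_ADDR))
```

If the first check is true and the recovered MAC matches the hardware address of p33p1, then the real NIC autoconfigured itself from a router advertisement carrying the net6 prefix, which is consistent with the suspicion that an RA meant for virbr11 reached it.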
Gene
_______________________________________________ Dnsmasq-discuss mailing list Dnsmasq-discuss@lists.thekelleys.org.uk http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list