On 05/17/2018 02:14 AM, Olaf Hering wrote:
On Wed, 16 May 2018 18:44:32 -0400,
John Ferlan <jferlan(a)redhat.com> wrote:
> On 05/15/2018 04:20 AM, Olaf Hering wrote:
>> Currently virNetSocketNewListenTCP bails out early under the following
>> conditions:
>> - the hostname resolves to at least one IPv4 and at least one IPv6
>> address
> which produces what error?
>> IPv6. Binding the IPv6 address will obviously fail. But this terminates
>> the entire loop, even if binding to IPv4 succeeded.
What happens if a hostname resolves to more than one address, but none of the
interfaces has some of the addresses assigned? bind() will fail.
Just try it yourself, remove one address with 'ip addr del ...' (whatever the
required syntax is) and do a migration to that host.
For me it happens with BOOTPROTO='dhcp4' instead of 'dhcp' in ifcfg-br0.
Sorry, I guess it just wasn't clear enough to me what the issue was and
the proposed resolution. I wasn't in the throes of debugging...
>> @@ -382,11 +382,8 @@ int virNetSocketNewListenTCP(const char *nodename,
>> #endif
>>
>> if (bind(fd, runp->ai_addr, runp->ai_addrlen) < 0) {
>> - if (errno != EADDRINUSE) {
>> - virReportSystemError(errno, "%s", _("Unable to bind to port"));
>> - goto error;
>> - }
>> - addrInUse = true;
>> + if (errno == EADDRINUSE)
>> + addrInUse = true;
>
> So this to me reads as if we're just ignoring all errors, but we do care
> if the address was in use already and we'll save that bit of information
> for later.
Why would any error matter here, besides the two that this function actually
handles in some way? It seems to me the intent of that function is to bind
to at least one of the addresses for the hostname passed to this function.
There is no point in failing the entire operation just because there are
more addresses in the resolved list. A client that actually tries to connect
to that hostname will also cycle through all addresses until one succeeds.
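To make that intent concrete, here is a minimal sketch of a "bind to at least one address" loop. It is not the libvirt code: the helper name listen_on_any, the last_errno out-parameter and the backlog of 30 are assumptions made for illustration. A failed bind() only skips that one address, and EADDRINUSE is still remembered for later reporting.

```c
#define _POSIX_C_SOURCE 200112L  /* for getaddrinfo() under strict -std= modes */

#include <errno.h>
#include <netdb.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not libvirt code: try to listen on every address that
 * `node` resolves to. A failure on one address skips that address only.
 * Returns the number of listening sockets stored in fds[]; 0 means that no
 * address could be used at all. */
static int listen_on_any(const char *node, const char *service,
                         int *fds, size_t maxfds,
                         bool *addr_in_use, int *last_errno)
{
    struct addrinfo hints, *res = NULL, *runp;
    size_t nsocks = 0;

    *addr_in_use = false;
    *last_errno = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;

    if (getaddrinfo(node, service, &hints, &res) != 0)
        return 0;

    for (runp = res; runp != NULL && nsocks < maxfds; runp = runp->ai_next) {
        int fd = socket(runp->ai_family, runp->ai_socktype, runp->ai_protocol);
        if (fd < 0) {
            *last_errno = errno;
            continue;
        }

        if (bind(fd, runp->ai_addr, runp->ai_addrlen) < 0) {
            /* Remember why it failed, but keep trying the other addresses. */
            *last_errno = errno;
            if (errno == EADDRINUSE)
                *addr_in_use = true;
            close(fd);
            continue;
        }

        if (listen(fd, 30) < 0) {
            *last_errno = errno;
            close(fd);
            continue;
        }

        fds[nsocks++] = fd;
    }

    freeaddrinfo(res);
    return (int)nsocks;
}
```

The caller decides what to report when the return value is 0; any return value above 0 means the host is reachable on at least one of its addresses.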
If there was some other kind of configuration issue we theoretically
could make a few bind calls that are failing in exactly the same way and
perhaps that error is more fatal than others. There are lots of reasons
for bind to fail and lots of different settings for errno.
I figured you had been debugging and would know there were a few that
were being seen and those could be filtered like EADDRINUSE currently
is. I wasn't in the middle of debugging this so I couldn't picture why
we'd want to go from filtering one error to filtering all errors.
Maybe we should VIR_DEBUG the errno for debugging purposes since you'll
possibly change it later to EDESTADDRREQ.
> bind says this is returned when "The address argument is a null
> pointer." - so essentially we're making up a failure?
Yes. There was no success trying to bind() to any of the resolved addresses.
There are a few places that can lead to nsocks==0, not just failures of bind().
Fair enough - almost makes me wonder whether we should craft a different
error message when nsocks == 0 and neither familyNotSupported nor addrInUse
is set.
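Roughly what I have in mind for the nsocks == 0 case, as a sketch only. The function name listen_failure_errno and the idea of carrying the last bind() errno along are assumptions, not something the patch does; EDESTADDRREQ is only the final fallback when nothing more specific is known.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: pick the errno to report once the bind loop is done.
 * Returns 0 when at least one socket is listening. Otherwise it prefers the
 * most specific cause that was recorded. The priority order is an assumption. */
static int listen_failure_errno(size_t nsocks,
                                bool family_not_supported,
                                bool addr_in_use,
                                int last_errno)
{
    if (nsocks > 0)
        return 0;                 /* success, nothing to report */
    if (family_not_supported)
        return EAFNOSUPPORT;
    if (addr_in_use)
        return EADDRINUSE;
    if (last_errno != 0)
        return last_errno;        /* e.g. EADDRNOTAVAIL from a failed bind() */
    return EDESTADDRREQ;          /* made-up fallback: no usable address */
}
```

With something like this, a failure such as EADDRNOTAVAIL would reach the user instead of being replaced by a generic error.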
John