On 04/26/2017 05:57 PM, Laine Stump wrote:
> On 04/26/2017 02:39 PM, John Ferlan wrote:
>>
>>
>> On 04/25/2017 12:34 PM, Laine Stump wrote:
>>> If the network isn't active during networkNotifyActualDevice(), we
>>> would log an error message stating that the bridge device didn't
>>> exist. This patch adds a check to see if the network is active, making
>>> the logs more useful in the case that it isn't.
>>>
>>> Partially resolves:
https://bugzilla.redhat.com/1442700
>>> ---
>>> src/network/bridge_driver.c | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
>>> index d2d8557..e06f81b 100644
>>> --- a/src/network/bridge_driver.c
>>> +++ b/src/network/bridge_driver.c
>>> @@ -4676,6 +4676,13 @@ networkNotifyActualDevice(virDomainDefPtr dom,
>>> }
>>> netdef = network->def;
>>>
>>> + if (!virNetworkObjIsActive(network)) {
>>> + virReportError(VIR_ERR_OPERATION_INVALID,
>>> + _("network '%s' is not active"),
>>> + netdef->name);
>>> + goto error;
>>> + }
>>> +
>>
>> /me wonders whether this should just a goto cleanup - IOW: if the
>> network isn't active, so what, why error. Once someone attempts to start
>> it, they'll get errors I assume...
>>
>> Not that goto error or cleanup matters since commit id '4fee4e0' changed
>> the goto cleanup to goto error and added:
>>
>> +
>> +error:
>> + goto cleanup;
>
> That's missing the point of that commit. "cleanup:" is there only as a
> place for error: to goto the cleanup code that is common to both the
> "success:" exit and the "error:" exit (and the three labels are
all
> there so that that this function is structured similarly to
> networkAllocateActualDevice() - I wanted them to be as close to the same
> as possible). In the body of the function you either declare the
> allocate/notification a success (with "goto success") in which case the
> connect count for the network goes up, or you declare it a failure (with
> "goto error") in which case the connect count for the network remains
> unchanged, but you never directly goto the noncommittal "cleanup:".
>
OK understood - it's just one of those odd looking constructs to have a
goto error and goto cleanup immediately following it, but your reasoning
for having the logic this way is understandable.
>>
>> I guess I don't have the answer readily available in my head as to how
>> much of the subsequent code would be called at network start time?
>
> None of this is called when the network is started - we have no way for
> the networkStart() function to learn which interfaces in which domains
> are supposed to be connected to a particular network. That needs more
> infrastructure change than I wanted to put in right now (we need some
> sort of event or callback that will allow the network driver to notify
> all active hypervisor drivers whenever a network is started (or
> destroyed?), giving the hypervisor driver a chance to call
> networkNotifyActualDevice() for any affected domain interface).
>
> So have I explained myself well enough?
>
Sure, but for my context...
I guess I was putting myself in the mindset of a libvirtd reconnect
where "who knows" what was done with each network prior to reconnection
and this is the discovery of that. For some paths it's a hard death
issue - the physical connection had problems or went away. For other
paths it's more of a soft or state issue - someone stopped/destroyed,
but didn't undefine a network. I was thinking more that in this latter
case, does an error make sense?
I think it makes sense to log an error, but it doesn't make sense to
kill the guest. I think it would only make sense to kill the guest if an
error condition led to 1) a guest that was completely non-functioning
and/or unreachable anyway, so it would eventually need to be killed
anyway no matter what you tried to do, or 2) the guest being left
exposed/vulnerable in some serious way. Neither of those is the case here.