On 08/23/2018 03:44 AM, Daniel P. Berrangé wrote:
On Wed, Aug 22, 2018 at 05:43:21PM -0400, John Ferlan wrote:
>
https://bugzilla.redhat.com/show_bug.cgi?id=1607202
>
> It's stated that if the admin wants to shoot themselves in
> the foot by removing the nwfilter binding while the domain
> is running we will certainly allow that. However, in doing
> so we also run the risk that a libvirtd restart will cause
> the domain to be shutdown, which isn't a good thing.
>
> So add another boolean to virDomainConfNWFilterInstantiate
> which allows us to recover somewhat gracefully in the event
> the virNWFilterBindingCreateXML fails when we come from
> qemuProcessReconnect and we determine that the filter has
> been deleted. It was there at some point (it had to be), but
> if it's missing, then we don't want to cause the guest to
> stop running, so issue a warning and continue on.
>
> Signed-off-by: John Ferlan <jferlan(a)redhat.com>
> ---
> src/conf/domain_nwfilter.c | 33 ++++++++++++++++++++++++++++-----
> src/conf/domain_nwfilter.h | 3 ++-
> src/lxc/lxc_process.c | 3 ++-
> src/qemu/qemu_hotplug.c | 7 ++++---
> src/qemu/qemu_interface.c | 6 ++++--
> src/qemu/qemu_process.c | 10 +++++++---
> src/uml/uml_conf.c | 3 ++-
> 7 files changed, 49 insertions(+), 16 deletions(-)
>
> diff --git a/src/conf/domain_nwfilter.c b/src/conf/domain_nwfilter.c
> index f39c8a1f9b..3e6e462def 100644
> --- a/src/conf/domain_nwfilter.c
> +++ b/src/conf/domain_nwfilter.c
> @@ -85,16 +85,19 @@ int
> virDomainConfNWFilterInstantiate(const char *vmname,
> const unsigned char *vmuuid,
> virDomainNetDefPtr net,
> - bool ignoreExists)
> + bool ignoreExists,
> + bool ignoreDeleted)
> {
> virConnectPtr conn = virGetConnectNWFilter();
> virNWFilterBindingDefPtr def = NULL;
> virNWFilterBindingPtr binding = NULL;
> + virNWFilterPtr nwfilter = NULL;
> char *xml = NULL;
> int ret = -1;
>
> - VIR_DEBUG("vmname=%s portdev=%s filter=%s ignoreExists=%d",
> - vmname, NULLSTR(net->ifname), NULLSTR(net->filter),
ignoreExists);
> + VIR_DEBUG("vmname=%s portdev=%s filter=%s ignoreExists=%d
ignoreDeleted=%d",
> + vmname, NULLSTR(net->ifname), NULLSTR(net->filter),
> + ignoreExists, ignoreDeleted);
>
> if (!conn)
> goto cleanup;
> @@ -113,14 +116,34 @@ virDomainConfNWFilterInstantiate(const char *vmname,
> if (!(xml = virNWFilterBindingDefFormat(def)))
> goto cleanup;
>
> - if (!(binding = virNWFilterBindingCreateXML(conn, xml, 0)))
> - goto cleanup;
> + if (!(binding = virNWFilterBindingCreateXML(conn, xml, 0))) {
> + virErrorPtr orig_err;
> +
> + if (!ignoreDeleted)
> + goto cleanup;
> +
> + /* Let's determine if the error was because the filter was deleted.
> + * Save the orig_err just in case it's not a failure to find the
> + * filter by name. */
> + orig_err = virSaveLastError();
> + nwfilter = virNWFilterLookupByName(conn, def->filter);
> + virSetError(orig_err);
> + virFreeError(orig_err);
> + if (nwfilter)
> + goto cleanup;
> +
> + VIR_WARN("filter '%s' for binding '%s' has been deleted
while the "
> + "guest was running, ignoring for restart processing",
> + def->filter, def->portdevname);
> + virResetLastError();
> + }
I don't get what this code is trying to do. This method is about creating
nwfilter bindings. If virNWFilterBindingCreateXML() fails, that means the
filter binding already exists. But the stated problem was that the admin
had deleted the filter binding. The code is also checking a filter, not
a filter binding.
virNWFilterBindingCreateXML eventually calls nwfilterBindingCreateXML,
which will virNWFilterBindingObjListAdd the binding, get the @ret
binding and attempt to virNWFilterInstantiateFilter the binding @def.
Instantiate will call virNWFilterInstantiateFilterInternal which calls
virNWFilterInstantiateFilterUpdate (eventually) which calls
virNWFilterObjListFindInstantiateFilter because someone deleted the
filter after deleting the portdev before hand.
As the bz shows, one cannot delete the filter because it's active in the
portdev; however, deleting the portdev allows one to delete the filter.
Once the port is deleted, the 'ebtables -t nat -L' will show it's gone.
Attempting to create the portdev again when libvirtd restarts (because
virDomainConfNWFilterInstantiate is called during that processing) will
cause a failure and the guest will be stopped.
Prior to the changes, adding a debug print in nwfilterBindingCreateXML
of the XML I see the following in a debug libvirtd session restart:
Detaching after fork from child process 13163.
2018-08-22 17:08:58.878+0000: 13162: warning :
nwfilterBindingCreateXML:748 : xml=<filterbinding>
<owner>
<name>f23</name>
<uuid>43dc0cf8-809b-4adb-9bea-a9abb5f3d90e</uuid>
</owner>
<portdev name='vnet0'/>
<mac address='52:54:00:f7:d4:f7'/>
<filterref filter='clean-traffic'/>
</filterbinding>
2018-08-22 17:08:58.878+0000: 13162: error :
virNWFilterObjListFindInstantiateFilter:201 : internal error: referenced
filter 'clean-traffic' is missing
...
One finds the guest shutdown, the subsequent start would also fail:
[Thread 0x7fffa7fff700 (LWP 13083) exited]
2018-08-22 17:16:33.035+0000: 13068: warning :
nwfilterBindingCreateXML:748 : xml=<filterbinding>
<owner>
<name>f23</name>
<uuid>43dc0cf8-809b-4adb-9bea-a9abb5f3d90e</uuid>
</owner>
<portdev name='vnet0'/>
<mac address='52:54:00:f7:d4:f7'/>
<filterref filter='clean-traffic'/>
</filterbinding>
2018-08-22 17:16:33.036+0000: 13068: error :
virNWFilterObjListFindInstantiateFilter:201 : internal error: referenced
filter 'clean-traffic' is missing
2018-08-22 17:16:33.065+0000: 13157: error : virFileReadAll:1435 :
Failed to open file '/sys/class/net/vnet0/operstate': No such file or
directory
2018-08-22 17:16:33.065+0000: 13157: error : virNetDevGetLinkInfo:2559 :
unable to read: /sys/class/net/vnet0/operstate: No such file or directory
....
With the patch in place, I'll now get:
Detaching after fork from child process 6919.
2018-08-22 21:29:45.201+0000: 6918: error :
virNWFilterObjListFindInstantiateFilter:201 : internal error: referenced
filter 'clean-traffic' is missing
2018-08-22 21:29:45.201+0000: 6918: warning :
virDomainConfNWFilterInstantiate:135 : filter 'clean-traffic' for
binding 'vnet0' has been deleted while the guest was running, ignoring
for restart processing
[Thread 0x7fff9dffb700 (LWP 6918) exited]
....
After restarting libvirtd, I can define the filter again and use the
*CreateXML API in order to reinstate the filter for the guest as shown
in the ebtables output.
And yes, I understand/agree that if some admin ends up using the
virNWFilterBindingDelete API they are on their own, but I think the
shutdown would be unexpected. Another "option" I considered was to alter
the live XML to remove the net->filter so that a subsequent libvirtd
restart wouldn't fail. That may be accomplish-able via some @flag usage
for the *Delete API to know the call didn't come from the hypervisor
induced call, but I wasn't sure if it was "fail safe" and altering the
live XML for the guest has it's own drawbacks/failures that would need
to be handled.
John