nwfilter issue with new ebtables

Hi,

Last week I discussed a breakage in nwfilter usage on IRC:

    <filterref filter='clean-traffic'>
      <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
    </filterref>

    virsh start <guest>
    error: Failed to start domain <guest>
    error: internal error: applyDHCPOnlyRules failed - spoofing not protect

With debug logging enabled, Daniel confirmed (thanks!) that the command sequence libvirt issued looked kind of "normal".

I wanted to let you know that further debugging identified a part of the sequence libvirt issues as being broken in recent ebtables versions:

    # ebtables --concurrent -t nat -N testrule3
    # ebtables --concurrent -t nat -E testrule3 testrule3-renamed
    ebtables v1.8.6 (nf_tables): Chain 'testrule3' doesn't exists

This led to upstream ebtables bug [1]. For now this is just FYI in case you want or need to subscribe for your own tracking.

[1]: https://bugzilla.netfilter.org/show_bug.cgi?id=1481

--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd
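For anyone who wants to check their own host, the create-then-rename sequence from the report could be scripted roughly as follows. This is only a sketch: the helper names `rename_sequence` and `rename_works` are made up for illustration, the injectable `run` parameter exists purely so the logic can be exercised without root, and the `-X` cleanup step is an addition that is not part of the reported sequence.

```python
import subprocess

# Common prefix taken from the commands quoted in the report.
EBTABLES = ["ebtables", "--concurrent", "-t", "nat"]


def rename_sequence(chain="testrule3"):
    """Return the two commands from the report: create a chain, then rename it."""
    return [
        EBTABLES + ["-N", chain],
        EBTABLES + ["-E", chain, chain + "-renamed"],
    ]


def rename_works(chain="testrule3", run=subprocess.run):
    """Run create-then-rename and return True if the rename succeeds.

    `run` defaults to subprocess.run; a stand-in can be passed for dry runs.
    Assumes neither the chain nor its "-renamed" variant exists beforehand.
    """
    create, rename = rename_sequence(chain)
    created = run(create, capture_output=True, text=True)
    if created.returncode != 0:
        raise RuntimeError("could not create chain: " + created.stderr)

    renamed = run(rename, capture_output=True, text=True)
    ok = renamed.returncode == 0

    # Best-effort cleanup (an addition, not part of the report): delete
    # whichever chain name should be left behind after the rename attempt.
    leftover = chain + "-renamed" if ok else chain
    run(EBTABLES + ["-X", leftover], capture_output=True, text=True)
    return ok
```

Calling `rename_works()` as root would be expected to return False on an affected ebtables build and True otherwise, going by the behaviour described in this thread.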

On 11/16/20 2:01 AM, Christian Ehrhardt wrote:
> Hi,
>
> Last week I discussed a breakage in nwfilter usage on IRC:
>
>     <filterref filter='clean-traffic'>
>       <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
>     </filterref>
>
>     virsh start <guest>
>     error: Failed to start domain <guest>
>     error: internal error: applyDHCPOnlyRules failed - spoofing not protect
>
> With debug logging enabled, Daniel confirmed (thanks!) that the command sequence libvirt issued looked kind of "normal".
>
> I wanted to let you know that further debugging identified a part of the sequence libvirt issues as being broken in recent ebtables versions:
>
>     # ebtables --concurrent -t nat -N testrule3
>     # ebtables --concurrent -t nat -E testrule3 testrule3-renamed
>     ebtables v1.8.6 (nf_tables): Chain 'testrule3' doesn't exists

So you're saying you can just run those two commands together and always get the error? (Assuming that "testrule3" and "testrule3-renamed" don't exist beforehand.)

From your description it sounds like maybe the error doesn't occur when there is a pause between the two commands. Is that right, or am I assuming too much?

I tried the above commands (well, I put the two commands together on a single line separated by ";") on a Fedora 33 system and a RHEL 8.3.0 system, and both of them completed successfully.

This is the Fedora ebtables -V:

    ebtables v2.0.11 (legacy) (December 2011)

And this is the ebtables -V on RHEL 8.3.0:

    ebtables 1.8.4 (nf_tables)

(I don't have any idea how the versions relate to each other for legacy ebtables vs. the nf_tables version.)

> This led to upstream ebtables bug [1]. For now this is just FYI in case you want or need to subscribe for your own tracking.

On Mon, Nov 16, 2020 at 4:24 PM Laine Stump <laine@redhat.com> wrote:
> On 11/16/20 2:01 AM, Christian Ehrhardt wrote:
>> Hi,
>>
>> Last week I discussed a breakage in nwfilter usage on IRC:
>>
>>     <filterref filter='clean-traffic'>
>>       <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
>>     </filterref>
>>
>>     virsh start <guest>
>>     error: Failed to start domain <guest>
>>     error: internal error: applyDHCPOnlyRules failed - spoofing not protect
>>
>> With debug logging enabled, Daniel confirmed (thanks!) that the command sequence libvirt issued looked kind of "normal".
>>
>> I wanted to let you know that further debugging identified a part of the sequence libvirt issues as being broken in recent ebtables versions:
>>
>>     # ebtables --concurrent -t nat -N testrule3
>>     # ebtables --concurrent -t nat -E testrule3 testrule3-renamed
>>     ebtables v1.8.6 (nf_tables): Chain 'testrule3' doesn't exists
>
> So you're saying you can just run those two commands together and always get the error? (Assuming that "testrule3" and "testrule3-renamed" don't exist beforehand.)

Yes.

> From your description it sounds like maybe the error doesn't occur when there is a pause between the two commands. Is that right, or am I assuming too much?

Assuming too much. It happens when libvirt issues them at "computer speed" as well as when I run them manually at "human speed". I have not tried waiting an extra long time in between, though ...

> I tried the above commands (well, I put the two commands together on a single line separated by ";") on a Fedora 33 system and a RHEL 8.3.0 system, and both of them completed successfully.
>
> This is the Fedora ebtables -V:
>
>     ebtables v2.0.11 (legacy) (December 2011)

Those worked on Ubuntu as well in older releases.

> And this is the ebtables -V on RHEL 8.3.0:
>
>     ebtables 1.8.4 (nf_tables)

That variant is what is broken for us at the moment, since 1.8.5. Thanks for cross-checking, Laine!

> (I don't have any idea how the versions relate to each other for legacy ebtables vs. the nf_tables version.)
>
>> This led to upstream ebtables bug [1]. For now this is just FYI in case you want or need to subscribe for your own tracking.

--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd

On Mon, Nov 16, 2020 at 10:23:32AM -0500, Laine Stump wrote:
> On 11/16/20 2:01 AM, Christian Ehrhardt wrote:
>> Hi,
>>
>> Last week I discussed a breakage in nwfilter usage on IRC:
>>
>>     <filterref filter='clean-traffic'>
>>       <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
>>     </filterref>
>>
>>     virsh start <guest>
>>     error: Failed to start domain <guest>
>>     error: internal error: applyDHCPOnlyRules failed - spoofing not protect
>>
>> With debug logging enabled, Daniel confirmed (thanks!) that the command sequence libvirt issued looked kind of "normal".
>>
>> I wanted to let you know that further debugging identified a part of the sequence libvirt issues as being broken in recent ebtables versions:
>>
>>     # ebtables --concurrent -t nat -N testrule3
>>     # ebtables --concurrent -t nat -E testrule3 testrule3-renamed
>>     ebtables v1.8.6 (nf_tables): Chain 'testrule3' doesn't exists
>
> So you're saying you can just run those two commands together and always get the error? (Assuming that "testrule3" and "testrule3-renamed" don't exist beforehand.)
>
> From your description it sounds like maybe the error doesn't occur when there is a pause between the two commands. Is that right, or am I assuming too much?
>
> I tried the above commands (well, I put the two commands together on a single line separated by ";") on a Fedora 33 system and a RHEL 8.3.0 system, and both of them completed successfully.

I tried it on Fedora 33 and it failed :-)

It looks like the issue is with the iptables-nft implementation.

> This is the Fedora ebtables -V:
>
>     ebtables v2.0.11 (legacy) (December 2011)
>
> And this is the ebtables -V on RHEL 8.3.0:
>
>     ebtables 1.8.4 (nf_tables)

I guess it means 1.8.5 iptables-nft is broken. I filed a Fedora bug too, which should get more direct attention from the person who's likely to fix this.

https://bugzilla.redhat.com/show_bug.cgi?id=1898130

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
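The version observations so far (legacy 2.0.11 and nf_tables 1.8.4 work, nf_tables 1.8.6 fails, and 1.8.5 is suspected as the first broken release) could be turned into a quick check on an `ebtables -V` banner. This is a sketch only: the function names are hypothetical, the 1.8.5 lower bound is taken from this thread rather than from any upstream statement, and no upper bound is applied because no fixed release is named here.

```python
import re

# Matches banners such as "ebtables 1.8.4 (nf_tables)" and
# "ebtables v2.0.11 (legacy) (December 2011)".
_BANNER_RE = re.compile(r"ebtables\s+v?(\d+(?:\.\d+)*)\s+\((legacy|nf_tables)\)")


def parse_ebtables_version(banner):
    """Split an ebtables banner into (version tuple, backend name)."""
    match = _BANNER_RE.search(banner)
    if match is None:
        raise ValueError("unrecognised ebtables banner: %r" % banner)
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, match.group(2)


def possibly_affected(banner):
    """Guess whether the chain-rename failure applies to this build.

    Assumption from this thread: only the nf_tables variant is affected,
    starting with 1.8.5. Legacy ebtables uses a separate version series,
    so its number is not compared at all.
    """
    version, backend = parse_ebtables_version(banner)
    return backend == "nf_tables" and version >= (1, 8, 5)
```

Checking the backend before the number matters because the legacy 2.0.x series would otherwise compare as newer than 1.8.5 and be flagged by mistake.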

On 11/16/20 10:36 AM, Daniel P. Berrangé wrote:
> On Mon, Nov 16, 2020 at 10:23:32AM -0500, Laine Stump wrote:
>> On 11/16/20 2:01 AM, Christian Ehrhardt wrote:
>>> Hi,
>>>
>>> Last week I discussed a breakage in nwfilter usage on IRC:
>>>
>>>     <filterref filter='clean-traffic'>
>>>       <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
>>>     </filterref>
>>>
>>>     virsh start <guest>
>>>     error: Failed to start domain <guest>
>>>     error: internal error: applyDHCPOnlyRules failed - spoofing not protect
>>>
>>> With debug logging enabled, Daniel confirmed (thanks!) that the command sequence libvirt issued looked kind of "normal".
>>>
>>> I wanted to let you know that further debugging identified a part of the sequence libvirt issues as being broken in recent ebtables versions:
>>>
>>>     # ebtables --concurrent -t nat -N testrule3
>>>     # ebtables --concurrent -t nat -E testrule3 testrule3-renamed
>>>     ebtables v1.8.6 (nf_tables): Chain 'testrule3' doesn't exists
>>
>> So you're saying you can just run those two commands together and always get the error? (Assuming that "testrule3" and "testrule3-renamed" don't exist beforehand.)
>>
>> From your description it sounds like maybe the error doesn't occur when there is a pause between the two commands. Is that right, or am I assuming too much?
>>
>> I tried the above commands (well, I put the two commands together on a single line separated by ";") on a Fedora 33 system and a RHEL 8.3.0 system, and both of them completed successfully.
>
> I tried it on Fedora 33 and it failed :-)

Strange. Both of my Fedora 33 systems are using iptables-1.8.5 and ebtables-legacy-2.0.11. Is this because they were upgraded rather than fresh installs? That seems kind of... bad. :-/ Whatever the case, I should really remedy that.

> It looks like the issue is with the iptables-nft implementation.
>
>> This is the Fedora ebtables -V:
>>
>>     ebtables v2.0.11 (legacy) (December 2011)
>>
>> And this is the ebtables -V on RHEL 8.3.0:
>>
>>     ebtables 1.8.4 (nf_tables)
>
> I guess it means 1.8.5 iptables-nft is broken. I filed a Fedora bug too, which should get more direct attention from the person who's likely to fix this.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1898130
>
> Regards,
> Daniel