On a Tuesday in 2021, Michal Privoznik wrote:
Let me take you ~8 years back. Back then, virNetworkUpdate() API
was introduced and the public implementation is nothing special -
it calls the networkUpdate() callback of the network driver.
Except, a small "typo" slipped through - while the public API
takes @command argument first followed by @section argument,
these are passed to the callback in swapped order. This wasn't
visible, until split daemons and sub-driver URIs became a thing.
Simply because the client was talking to the network driver via
our remote driver.
On client side, when the public API was called the RPC layer
swapped the order (because it was called with swapped arguments
already).
So it did not swap it, then? :)
Then, on the server side, after deserialization the
public API was called again (still with swapped arguments) and it
subsequently called network driver callback (where the arguments
were in the right order again).
Since both arguments are of the same type (unsigned int) XDR was
happy too.
The problem arose with split daemons and sub-driver URIs. Imagine
a user running split daemons. When they connect to
network:///system, they talk to virnetworkd "directly" (as in no
proxy daemon sits in between). Both sides still use remote driver
though, so the order is fixed by RPC layer, just like in libvirtd
case. Connecting to qemu:///system is where things get
interesting because virtqemud serves as a proxy to virtnetworkd
(the former talks to the later on users behalf and forwards API
calls).
What happens is, virtqemud sees incoming RPC packet (with swapped
arguments), it decodes it and calls remoteDispatchNetworkUpdate()
(valued are still swapped in remote_network_update_args struct).
Subsequently, virNetworkUpdate() is called and since virtqemud
has no network driver, it calls remote driver again. But this
time, the API callback gets the arguments in correct order (just
like network driver would in case of libvirtd). But this means,
that virnetworkd will see them in wrong order, because it swaps
them again.
After fixing the call of the callback, the API works again in
both cases, if both sides run with the fix.
And to make things work for newer clients talking to older
servers, we have to swap the order on RPC layer too. If both
sides run with the change, they both encode and decode packet
properly. But if newer client talks to older server, it will
encode packet just how the older server expects it.
Fixes: 574b9bc66b6b10cc4cf50f299c3f0ff55f2cbefb
Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1870552
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
src/libvirt-network.c | 2 +-
src/remote/remote_protocol.x | 7 ++++++-
src/remote_protocol-structs | 2 +-
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/src/libvirt-network.c b/src/libvirt-network.c
index b84389f762..310da44a4e 100644
--- a/src/libvirt-network.c
+++ b/src/libvirt-network.c
@@ -543,7 +543,7 @@ virNetworkUpdate(virNetworkPtr network,
if (conn->networkDriver && conn->networkDriver->networkUpdate) {
int ret;
- ret = conn->networkDriver->networkUpdate(network, section, command,
+ ret = conn->networkDriver->networkUpdate(network, command, section,
parentIndex, xml, flags);
Especially in the migration code, we use VIR_DRV_SUPPORTS_FEATURE to
figure out whether the remote side has some functionality.
With something like VIR_DRV_FEATURE_NETWORK_UPDATE_HAS_CORRECT_ORDER,
we could pass the correct order if it's safe to do so and fix the
virtqemud use case for newer daemons. (Since it never worked, I think
it's nicer to not fix it for older virtqemud than to break client
compatibility with older libvirt)
if (ret < 0)
goto error;
diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
index d3724bc305..464c6b4af1 100644
--- a/src/remote/remote_protocol.x
+++ b/src/remote/remote_protocol.x
@@ -1542,10 +1542,15 @@ struct remote_network_undefine_args {
remote_nonnull_network net;
};
+/* The @section and @command members are intentionally inverted compared to the
+ * virNetworkUpdate() API. The reason is that since it's introduction until the
+ * 7.2.0 release the driver callback was given arguments in inverted order.
+ * After it was fixed, the XDR has to be swapped to keep compatibility with
+ * older daemons. */
This does not actually affect RPC at all, merely the names used in the
autogenerated stubs. If it did, I'm afraid it would just move the bug
from the driver to the RPC layer.
Jano
struct remote_network_update_args {
remote_nonnull_network net;
- unsigned int command;
unsigned int section;
+ unsigned int command;
int parentIndex;
remote_nonnull_string xml;
unsigned int flags;
diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs
index c0c034ac6a..800678c92e 100644
--- a/src/remote_protocol-structs
+++ b/src/remote_protocol-structs
@@ -1109,8 +1109,8 @@ struct remote_network_undefine_args {
};
struct remote_network_update_args {
remote_nonnull_network net;
- u_int command;
u_int section;
+ u_int command;
int parentIndex;
remote_nonnull_string xml;
u_int flags;
--
2.26.2