non-root bridge set-up on Fedora 39 aarch64

Hello-

I'm somewhat new to the libvirt world, and I've encountered a problem that needs better troubleshooting skills than I have. I've searched Google/Ecosia and Stack Overflow without finding a solution.

I set up libvirt on an x86_64 system without a problem, but on my new aarch64 / Fedora 39 system, virsh doesn't seem to want to start virbr0 when run from my own user account:

cel@boudin:~/kdevops$ virsh net-start default
error: Failed to start network default
error: error creating bridge interface virbr0: Operation not permitted

cel@boudin:~/kdevops$ cat /etc/qemu/bridge.conf
allow virbr0

Where can I look next?

-- Chuck Lever

On 2/19/24 10:21 AM, Chuck Lever wrote:
Hello-

I'm somewhat new to the libvirt world, and I've encountered a problem that needs better troubleshooting skills than I have. I've searched Google/Ecosia and Stack Overflow without finding a solution.

I set up libvirt on an x86_64 system without a problem, but on my new aarch64 / Fedora 39 system, virsh doesn't seem to want to start virbr0 when run from my own user account:

cel@boudin:~/kdevops$ virsh net-start default
error: Failed to start network default
error: error creating bridge interface virbr0: Operation not permitted
If you run virsh as a normal user, it will auto-create an unprivileged ("session mode") libvirt instance and connect to that, rather than to the single privileged (i.e. run as root) libvirt instance that is managed by systemd. Because this libvirt is running as a normal user with no elevated privileges, it is unable to create a virtual network.

What you probably wanted was to connect to the system-wide privileged libvirt. You can do this either by running virsh as root (or with sudo), or by using

$ virsh -c qemu:///system

rather than plain "virsh". Whichever method you choose, you'll want to do that for all of your virsh commands, both for creating/managing networks and for guests.
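To make the distinction concrete, here is a small sketch of the equivalent approaches (the LIBVIRT_DEFAULT_URI variable is standard libvirt behavior; the "default" network name is the stock one):

```shell
# Three ways to reach the privileged (system) libvirt instance.
# Options 1 and 2 are shown as comments, since running them for real
# requires a libvirt installation and/or root.

# 1. Explicit URI on every invocation:
#      virsh -c qemu:///system net-start default
# 2. Run as root, where virsh defaults to qemu:///system:
#      sudo virsh net-start default
# 3. Set the default URI once for the whole shell session:
export LIBVIRT_DEFAULT_URI=qemu:///system
echo "$LIBVIRT_DEFAULT_URI"
```

With option 3 in place, a plain `virsh net-start default` talks to the system daemon without any extra flags.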
cel@boudin:~/kdevops$ cat /etc/qemu/bridge.conf
allow virbr0
/etc/qemu/bridge.conf is used by the QEMU package's qemu-bridge-helper binary (an SUID-root program that creates a tap device attached to an existing bridge, and can be executed by an unprivileged qemu or libvirt that doesn't have permission to create a tap device or attach a tap to a bridge).

The only place where bridge.conf matters is with a session-mode libvirt: there, a guest can use <interface type='bridge'> ... <source bridge='virbr0'/> to "make an end run" around libvirt's own network management and connect the guest's tap device to (in this example) virbr0, assuming the bridge already exists, for example because you've started the default virtual network in the system/privileged libvirt.
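For reference, the "end run" described above looks roughly like this in the guest's domain XML (a sketch: it assumes virbr0 already exists and that bridge.conf allows it, as in Chuck's setup):

```xml
<!-- Sketch: attach the guest straight to an existing bridge via
     qemu-bridge-helper, bypassing libvirt's network management.
     Requires "allow virbr0" in /etc/qemu/bridge.conf. -->
<interface type='bridge'>
  <source bridge='virbr0'/>
  <model type='virtio'/>
</interface>
```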

On Mon, Feb 19, 2024 at 07:18:06PM -0500, Laine Stump wrote:
If you run virsh as a normal user, it will auto-create an unprivileged ("session mode") libvirt instance and connect to that, rather than to the single privileged (i.e. run as root) libvirt instance that is managed by systemd. Because this libvirt is running as a normal user with no elevated privileges, it is unable to create a virtual network.

What you probably wanted was to connect to the system-wide privileged libvirt. You can do this either by running virsh as root (or with sudo), or by using

$ virsh -c qemu:///system

rather than plain "virsh". Whichever method you choose, you'll want to do that for all of your virsh commands, both for creating/managing networks and for guests.
These are wrapped up in scripts and ansible playbooks, so I'll have to dig through that to figure out which connection is being used. Strange that this all works on my x86_64 system, but not on aarch64.

Thanks for the pointer... the first two or three times I found this advice on the Internet, I guess it didn't sink in ;-)

-- Chuck Lever

On Tue, Feb 20, 2024 at 10:17:43AM -0500, Chuck Lever wrote:
These are wrapped up in scripts and ansible playbooks, so I'll have to dig through that to figure out which connection is being used. Strange that this all works on my x86_64 system, but not on aarch64.
This makes me very suspicious. There are a few things that differ between x86_64 and aarch64, but this shouldn't be one of them.

Are you 100% sure that the two environments are identical, modulo the architecture? Honestly, what seems a lot more likely is that either the Ansible playbooks execute some tasks conditionally based on the architecture, or some changes were made to the x86_64 machine outside of the scope of the playbooks.

-- Andrea Bolognani / Red Hat / Virtualization

On Tue, Feb 20, 2024 at 10:58:46AM -0800, Andrea Bolognani wrote:
This makes me very suspicious. There are a few things that differ between x86_64 and aarch64, but this shouldn't be one of them.
Are you 100% sure that the two environments are identical, modulo the architecture? Honestly, what seems a lot more likely is that either the Ansible playbooks execute some tasks conditionally based on the architecture, or some changes were made to the x86_64 machine outside of the scope of the playbooks.
It's impossible to say that the two environments are identical. The two possibilities you mention are the first things I plan to investigate.

-- Chuck Lever

On Tue, Feb 20, 2024 at 02:04:11PM -0500, Chuck Lever wrote:
It's impossible to say that the two environments are identical. The two possibilities you mention are the first things I plan to investigate.
Possible leads:

* contents of ~/.config/libvirt;
* libvirt-related variables in the user's environment;
* groups the user is part of.

If you have the ability to provision a fresh x86_64 environment to use for a more direct comparison, that would be ideal of course.

-- Andrea Bolognani / Red Hat / Virtualization

On Tue, Feb 20, 2024 at 11:10:22AM -0800, Andrea Bolognani wrote:
One major difference that escaped me before is that the x86_64 system is using vagrant, but the aarch64 system is using libguestfs. The libguestfs stuff is new and there are likely some untested bits there.
Possible leads:
* contents of ~/.config/libvirt;
On x86_64 / vagrant, .config/libvirt has a channel/ directory, but no networks/ directory. On aarch64 / libguestfs, .config/libvirt has no channel/ directory, but the networks/ directory contains the definition of the "default" network.
* libvirt-related variables in the user's environment;
I don't see any remarkable differences there.
* groups the user is part of.
x86_64:

[cel@renoir target]$ id
uid=1046(cel) gid=100(users) groups=100(users),10(wheel),36(kvm),107(qemu),986(libvirt)

I see that, though the SELinux policy is "enforcing", the kernel is booted with "selinux=0".

aarch64:

cel@boudin:~/.config/libvirt/qemu$ id
uid=1046(cel) gid=100(users) groups=100(users),10(wheel),36(kvm),107(qemu),981(libvirt) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

-- Chuck Lever
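Group membership can also be compared mechanically. A small sketch that scans a captured `id` line for the groups that matter (the sample line is the aarch64 output above; "libvirt" grants access to the system daemon's socket, "kvm" to /dev/kvm):

```shell
# Sketch: check a recorded "id" output line for libvirt-relevant groups.
# The sample line below is the aarch64 host's output from this thread.
id_line='uid=1046(cel) gid=100(users) groups=100(users),10(wheel),36(kvm),107(qemu),981(libvirt)'
for g in libvirt kvm; do
  case "$id_line" in
    *"($g)"*) echo "in group: $g" ;;
    *)        echo "MISSING group: $g" ;;
  esac
done
```

Running the same check against the output of `id` on both hosts makes any group difference stand out immediately.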

On Tue, Feb 20, 2024 at 03:40:57PM -0500, Chuck Lever wrote:
I found the answer; posting here for the archive.

There was a bug in the Ansible playbook responsible for setting up libvirt to "run as a regular user". It was enabling libvirtd, but was failing to enable virtnetworkd. On Fedora systems, both of these steps are necessary.

Once that was corrected, virtual networking works without error.

-- Chuck Lever
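For the record, the fix boils down to one extra service task next to the existing one. A hypothetical sketch in Ansible (task names and module choice are illustrative, not the actual kdevops code):

```yaml
# Sketch only: make sure virtnetworkd is enabled alongside libvirtd.
- name: Enable and start libvirtd
  become: true
  ansible.builtin.systemd:
    name: libvirtd
    enabled: true
    state: started

- name: Enable and start virtnetworkd    # the task that was missing
  become: true
  ansible.builtin.systemd:
    name: virtnetworkd
    enabled: true
    state: started
```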

On Mon, Feb 26, 2024 at 03:12:51PM -0500, Chuck Lever wrote:
I found the answer; posting here for the archive.
There was a bug in the Ansible playbook responsible for setting up libvirt to "run as a regular user". It was enabling libvirtd, but was failing to enable virtnetworkd. On Fedora systems, both of these steps are necessary.
Once that was corrected, virtual networking works without error.
Glad to hear you managed to figure it out. As suspected, it wasn't an aarch64-related issue after all :)

Note that you shouldn't enable both the monolithic daemon (libvirtd) and the modular daemons (virtnetworkd, virtqemud) at the same time. If your version of libvirt is recent enough (>= 9.9.0) the situation should be handled cleanly, but in general it's not a supported configuration.

Moreover, Fedora has defaulted to modular daemons for a long time now, so really you shouldn't need to do anything special to ensure that they are enabled. Just install the package, then either start the various services/sockets manually or simply reboot. That should do the trick.

-- Andrea Bolognani / Red Hat / Virtualization
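In the modular layout there is one socket-activated daemon per driver. The sketch below only prints the enabling commands (the driver list is the usual Fedora set, an assumption; enabling them for real requires root):

```shell
# Sketch: enable the per-driver sockets instead of the monolithic libvirtd.
# Commands are printed rather than executed, so this is side-effect free.
drivers="virtqemud virtnetworkd virtstoraged virtnodedevd"
cmds=$(for d in $drivers; do
  printf 'systemctl enable --now %s.socket\n' "$d"
done)
printf '%s\n' "$cmds"
```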

On Tue, Feb 27, 2024 at 01:20:46AM -0800, Andrea Bolognani wrote:
Glad to hear you managed to figure it out. As suspected, it wasn't an aarch64-related issue after all :)
Note that you shouldn't enable both the monolithic daemon (libvirtd) and the modular daemons (virtnetworkd, virtqemud) at the same time. If your version of libvirt is recent enough (>= 9.9.0) the situation should be handled cleanly, but in general it's not a supported configuration.
This Ansible code dates from before 2020, so it's legacy, I suppose.

Perhaps, if it can figure out which version of libvirt is available, Ansible needn't start libvirtd at all? Having Ansible bring up a supported libvirt configuration would be a nicer fix, and one I could subsequently contribute to kdevops.
Moreover, Fedora has defaulted to modular daemons for a long time now, so really you shouldn't need to do anything special to ensure that they are enabled. Just install the package, then either start the various services/sockets manually or simply reboot. That should do the trick.
I too expected that simply installing libvirt on my new Fedora 39 system would have created a working environment, so there's clearly something I missed during set-up.

-- Chuck Lever

On Tue, Feb 27, 2024 at 09:49:23AM -0500, Chuck Lever wrote:
On Tue, Feb 27, 2024 at 01:20:46AM -0800, Andrea Bolognani wrote:
Note that you shouldn't enable both the monolithic daemon (libvirtd) and the modular daemons (virtnetworkd, virtqemud) at the same time. If your version of libvirt is recent enough (>= 9.9.0) the situation should be handled cleanly, but in general it's not a supported configuration.
This Ansible code dates from before 2020, so it's legacy, I suppose.
The change in default from monolithic daemon to modular daemons feels like forever ago to me, but in reality it only happened[1] with Fedora 35 in late 2021. So it's understandable that there would be code out there that is not prepared to cope with this scenario.
Perhaps, if it can figure out which version of libvirt is available, Ansible needn't start libvirtd at all? It would be a nicer fix, that I can subsequently contribute to kdevops, if Ansible would start a supported libvirt configuration.
Looking at the Fedora-specific part of enabling libvirt in kdevops[2], I'm pretty sure that what it attempts to do is not right.

Specifically, it starts libvirtd, then starts virtnetworkd. As I mentioned earlier, mixing the monolithic daemon with the modular ones is very much an unsupported configuration. Fedora 39 has libvirt 9.7.0, which doesn't contain the systemd cleanups I talked about above, so the consequences of doing this are likely going to be even more nasty.

I don't understand why starting virtnetworkd would be needed in the first place. The only difference between a monolithic deployment and a modular one should be in which process each of the drivers is running. If a running virtnetworkd allows you to do what you need, networking-wise, so should running libvirtd.

I will admit that I have never tried the "split" setup that you seem to be aiming for, e.g. libvirtd/virtqemud running as an unprivileged user but getting access to the host's networking via a privileged virtnetworkd instance or other setuid trickery.

Looking at the libvirt-specific configuration knobs in kdevops[3], it seems that qemu:///session is used by default on Fedora, and on Fedora only. That honestly feels like a questionable choice to me... Everywhere else, qemu:///system is used instead, so I'm not surprised that issues would show up when you're exercising the odd path out.
Moreover, Fedora has defaulted to modular daemons for a long time now, so really you shouldn't need to do anything special to ensure that they are enabled. Just install the package, then either start the various services/sockets manually or simply reboot. That should do the trick.
I too expected that simply installing libvirt on my new Fedora 39 system would have created a working environment, so there's clearly something I missed during set-up.
One thing that people often miss, because it's admittedly not so obvious, is that it's not enough to install the libvirt package to start using libvirt: you also need to start the corresponding services, or at least their sockets.

This is not something that's specific to libvirt, but rather the consequence of a more general policy adopted by RHEL-derived distros, where services are not automatically started after installation. Debian-derived distros have the opposite policy, so you get a smoother out-of-the-box experience there.

Unfortunately, this being a distro-wide policy, there's not much we can do about it.

[1] https://pagure.io/fesco/issue/2627
[2] https://github.com/mcgrof/kdevops/blob/master/playbooks/roles/libvirt_user/t...
[3] https://github.com/mcgrof/kdevops/blob/master/kconfigs/Kconfig.libvirt

-- Andrea Bolognani / Red Hat / Virtualization
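The install-time policy difference described above is implemented via systemd presets. A sketch of what a preset fragment can look like (the file name and the exact entries are illustrative assumptions, not Fedora's actual shipped preset):

```ini
# Sketch: /usr/lib/systemd/system-preset/90-example.preset
# On RHEL-derived distros the catch-all policy is "disable *", so newly
# installed units stay inactive unless a preset explicitly enables them.
enable virtqemud.socket
enable virtnetworkd.socket
disable *
```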

On Tue, Feb 27, 2024 at 08:09:17AM -0800, Andrea Bolognani wrote:
Looking at the Fedora-specific part of enabling libvirt in kdevops[2], I'm pretty sure that what it attempts to do is not right.
Specifically, it starts libvirtd, then starts virtnetworkd. As I mentioned earlier, mixing the monolithic daemon with the modular ones is very much an unsupported configuration. Fedora 39 has libvirt 9.7.0, which doesn't contain the systemd cleanups I talked about above, so the consequences of doing this are likely going to be even more nasty.
I don't understand why starting virtnetworkd would be needed in the first place. The only difference between a monolithic deployment and a modular one should be in which process each of the drivers is running. If a running virtnetworkd allows you to do what you need, networking wise, so should running libvirtd.
I will admit that I have never tried the "split" setup that you seem to be aiming for, e.g. libvirtd/virtqemud running as an unprivileged user but getting access to the host's networking via a privileged virtnetworkd instance or other setuid trickery.
Looking at the libvirt-specific configuration knobs in kdevops[2], it seems that qemu:///session is used by default on Fedora, and on Fedora only. That honestly feels like a questionable choice to me... Everywhere else, qemu:///system is used instead, so I'm not surprised that issues would show up when you're exercising the odd path out.
I confess I don't understand the libvirt_user role well enough to effect any changes here except to add an action to enable virtnetworkd.
One thing that people often miss, because it's admittedly not so obvious, is that it's not enough to install the libvirt package to start using libvirt: you also need to start the corresponding services, or at least their sockets.
This is not something that's specific to libvirt, but rather the consequence of a more general policy adopted by RHEL-derived distros, where services are not automatically started after installation. Debian-derived distros have the opposite policy, so you get a smoother out of the box experience there.
Unfortunately, this being a distro-wide policy, there's not much we can do about it.
That must be it: enabling libvirtd.service appears to add in the socket services too:

root@boudin:~# systemctl enable libvirtd
Created symlink /etc/systemd/system/multi-user.target.wants/libvirtd.service → /usr/lib/systemd/system/libvirtd.service.
Created symlink /etc/systemd/system/sockets.target.wants/virtlockd.socket → /usr/lib/systemd/system/virtlockd.socket.
Created symlink /etc/systemd/system/sockets.target.wants/virtlogd.socket → /usr/lib/systemd/system/virtlogd.socket.
Created symlink /etc/systemd/system/sockets.target.wants/libvirtd.socket → /usr/lib/systemd/system/libvirtd.socket.
Created symlink /etc/systemd/system/sockets.target.wants/libvirtd-ro.socket → /usr/lib/systemd/system/libvirtd-ro.socket.

-- Chuck Lever

On Tue, Feb 20, 2024 at 11:10:22AM -0800, Andrea Bolognani wrote:
On Tue, Feb 20, 2024 at 02:04:11PM -0500, Chuck Lever wrote:
On Tue, Feb 20, 2024 at 10:58:46AM -0800, Andrea Bolognani wrote:
On Tue, Feb 20, 2024 at 10:17:43AM -0500, Chuck Lever wrote:
On Mon, Feb 19, 2024 at 07:18:06PM -0500, Laine Stump wrote:
On 2/19/24 10:21 AM, Chuck Lever wrote:
Hello-
I'm somewhat new to the libvirt world, and I've encountered a problem that needs better troubleshooting skills than I have. I've searched Google/Ecosia and stackoverflow without finding a solution.
I set up libvirt on an x86_64 system without a problem, but on my new aarch64 / Fedora 39 system, virsh doesn't seem to want to start virbr0 when run from my own user account:
cel@boudin:~/kdevops$ virsh net-start default
error: Failed to start network default
error: error creating bridge interface virbr0: Operation not permitted
If you run virsh as a normal user, it will auto-create an unprivileged ("session mode") libvirt instance, and connect to that rather than the single privileged (ie. run as root) libvirt instance that is managed by systemd. Because this libvirt is running as a normal user with no elevated privileges, it is unable to create a virtual network.
What you probably wanted was to connect to the system-wide privileged libvirt. You can do this either by running virsh as root (or with sudo), or by using
# virsh -c qemu:///system
rather than straight "virsh". Whichever method you choose, you'll want to do that for all of your virsh commands, both for creating/managing networks and guests.
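An alternative to passing -c on every invocation: libvirt honors the LIBVIRT_DEFAULT_URI environment variable, so the system connection can be made the default for a user's shell. A minimal sketch:

```shell
# Make all virsh invocations in this shell talk to the privileged
# system daemon instead of the per-user session daemon.
export LIBVIRT_DEFAULT_URI=qemu:///system

# Confirm which connection virsh now uses; it should print qemu:///system.
virsh uri
```

Putting the export in ~/.bashrc (or the tool's own environment, for scripted setups like kdevops) avoids having to remember the flag.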
These are wrapped up in scripts and ansible playbooks, so I'll have to dig through that to figure out which connection is being used. Strange that this all works on my x86_64 system, but not on aarch64.
This makes me very suspicious. There are a few things that differ between x86_64 and aarch64, but this shouldn't be one of them.
Are you 100% sure that the two environments are identical, modulo the architecture? Honestly, what seems a lot more likely is that either the Ansible playbooks execute some tasks conditionally based on the architecture, or some changes were made to the x86_64 machine outside of the scope of the playbooks.
It's impossible to say that the two environments are identical. The two possibilities you mention are the first things I plan to investigate.
Possible leads:
* contents of ~/.config/libvirt;
* libvirt-related variables in the user's environment;
* groups the user is part of.
If you have the ability to provision a fresh x86_64 environment to use for a more direct comparison, that would be ideal of course.
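The leads above can be checked quickly from the affected account; these are ordinary diagnostics, not specific to kdevops:

```shell
# 1. Per-user libvirt configuration that could override the default URI.
ls -la ~/.config/libvirt/ 2>/dev/null
cat ~/.config/libvirt/libvirt.conf 2>/dev/null

# 2. libvirt-related environment variables (LIBVIRT_DEFAULT_URI, etc.).
env | grep -i libvirt

# 3. Group membership; on Fedora, members of the "libvirt" group get
#    access to the system daemon's socket via polkit.
id
```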
At this point I'm not sure the comparison is useful. Something is misconfigured on the aarch64 system. After a fresh boot:

cel@boudin:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enP1p1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:1b:21:e7:ae:56 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.64/24 brd 192.168.1.255 scope global noprefixroute enP1p1s0
       valid_lft forever preferred_lft forever
    inet6 fe80::21b:21ff:fee7:ae56/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enP4p3s0u1u3c2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether 02:21:28:57:47:17 brd ff:ff:ff:ff:ff:ff

cel@boudin:~$ virsh -c qemu:///system net-start default
error: Failed to start network default
error: Requested operation is not valid: network is already active

cel@boudin:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enP1p1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:1b:21:e7:ae:56 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.64/24 brd 192.168.1.255 scope global noprefixroute enP1p1s0
       valid_lft forever preferred_lft forever
    inet6 fe80::21b:21ff:fee7:ae56/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enP4p3s0u1u3c2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether 02:21:28:57:47:17 brd ff:ff:ff:ff:ff:ff
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:7d:89:ef brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever

cel@boudin:~$ sudo virsh net-list
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes

cel@boudin:~$ virsh net-list
 Name   State   Autostart   Persistent
----------------------------------------

cel@boudin:~$

Starting the default network as "root" throws an error too, though the end result is that the bridge is created. The local user still doesn't see a network.

-- 
Chuck Lever
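The net-list discrepancy at the end is expected behavior rather than misconfiguration: run without -c, virsh talks to the per-user session daemon, which has no networks of its own. Comparing the two connections side by side from the same account makes this visible (a sketch; exact output depends on local configuration):

```shell
# The system daemon owns virbr0 and the "default" network...
virsh -c qemu:///system net-list --all

# ...while the unprivileged session daemon defines none of its own,
# so its list comes back empty.
virsh -c qemu:///session net-list --all
```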
participants (3):
- Andrea Bolognani
- Chuck Lever
- Laine Stump