
On Tue, Jun 30, 2020 at 04:02:05PM +0100, Daniel P. Berrangé wrote:
On Tue, Jun 30, 2020 at 12:59:03PM +0200, Miguel Duarte de Mora Barroso wrote:
On Mon, Apr 6, 2020 at 4:03 PM Laine Stump <lstump@redhat.com> wrote:
On 4/6/20 9:54 AM, Daniel P. Berrangé wrote:
On Mon, Apr 06, 2020 at 03:47:01PM +0200, Miguel Duarte de Mora Barroso wrote:
Hi all,
I'm aware that it is possible to plug pre-created macvtap devices to libvirt guests - tracked in RFE [0].
My interpretation of the wording in [1] and [2] is that it is also possible to plug pre-created tap devices into libvirt guests - that would be a requirement to allow kubevirt to run with fewer capabilities in the pods that encapsulate the VMs.
I took a look at the libvirt code ([3] & [4]), and, from my limited understanding, I got the impression that plugging existing interfaces via `managed='no'` is only possible for macvtap interfaces.
No, it works for standard tap devices as well.
The reason the BZs and commit logs talk mostly about macvtap rather than tap is that 1) that's what kubevirt people had asked for and 2) it already *mostly* worked for tap devices, so most of the work was related to macvtap. (My memory is already fuzzy, but I think there were a couple of privileged operations we still tried to do for standard tap devices even if they were precreated. Standard disclaimer: I often misremember, so this memory could be wrong! But precreated tap devices definitely do work.)
It's been a while since I started this thread, but lately I've understood better how tap devices work, and that new insight makes me wonder about a couple of things.
Our ultimate goal in kubevirt is to consume a pre-created tap device by a kubernetes pod that doesn't have the NET_ADMIN capability.
After looking at the current libvirt code, I don't think that is currently supported, since we'll *always* enter the `virNetDevTapCreate` function in [1] (I'm interested in the *tap* scenario).
The tap device is effectively created in that function - [2] - by opening the clone device (/dev/net/tun) and calling `ioctl(fd, TUNSETIFF, ...)` on it. AFAIK, both of those operations *require* the NET_ADMIN capability. If I'm correct, this means that the current libvirt implementation makes our goals impossible to achieve.
AFAIK, that is not correct - CAP_NET_ADMIN isn't required to open or create a tap device - only to add the tap device to a bridge.
So if you create the tap device & attach it to a bridge ahead of time, libvirt should then be able to open it and give it to QEMU.
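As a rough illustration of the "open it" step, here is a minimal C sketch of attaching to a tap device by name through the clone device. This is not libvirt's code: the helper names `tap_fill_ifreq` and `tap_open` are hypothetical, and the flag choice (`IFF_TAP | IFF_NO_PI`) is an assumption for the example. If the named device does not exist yet, the same ioctl would try to create it, which is a different (privileged) case from the pre-created one discussed here.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: prepare an ifreq that names a tap device.
 * Returns 0 on success, -1 if the name is empty or too long. */
int tap_fill_ifreq(struct ifreq *ifr, const char *ifname)
{
    size_t len = strlen(ifname);

    if (len == 0 || len >= IFNAMSIZ)
        return -1;

    memset(ifr, 0, sizeof(*ifr));
    memcpy(ifr->ifr_name, ifname, len);     /* stays NUL-terminated */
    ifr->ifr_flags = IFF_TAP | IFF_NO_PI;   /* assumed flags for this sketch */
    return 0;
}

/* Hypothetical helper: open the clone device and attach to ifname.
 * Returns a file descriptor on success, -1 on failure (errno is set
 * by open/ioctl when they are the failing step). */
int tap_open(const char *ifname)
{
    struct ifreq ifr;
    int fd;

    if (tap_fill_ifreq(&ifr, ifname) < 0)
        return -1;

    fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0)
        return -1;

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

For this to work unprivileged, the device would have to be created ahead of time by something that does hold the capability, for example with `ip tuntap add dev tap0 mode tap user <uid> group <gid>` (the name `tap0` is only an example), and then `tap_open("tap0")` would be called from the unprivileged process.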
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv...

    ((uid_valid(tun->owner) && !uid_eq(cred->euid, tun->owner)) ||
     (gid_valid(tun->group) && !in_egroup_p(tun->group))) &&
    !ns_capable(net->user_ns, CAP_NET_ADMIN);

This is called by the TUNSETIFF code. AFAICT, that means if you fchown(tapfd, uid, gid), to the uid+gid of libvirtd, it should not require CAP_NET_ADMIN.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
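
To make the kernel condition quoted in this message easier to reason about, here is a small user-space model of it in C. It is only a sketch of that one boolean expression: the struct `tap_check` and the function `tap_attach_denied` are hypothetical names, each field stands in for the kernel call noted in its comment, and any other checks on the TUNSETIFF path are not modelled.

```c
#include <stdbool.h>

/* Hypothetical model of the inputs to the quoted expression. */
struct tap_check {
    bool owner_set;      /* uid_valid(tun->owner) */
    bool owner_matches;  /* uid_eq(cred->euid, tun->owner) */
    bool group_set;      /* gid_valid(tun->group) */
    bool in_group;       /* in_egroup_p(tun->group) */
    bool has_net_admin;  /* ns_capable(net->user_ns, CAP_NET_ADMIN) */
};

/* Mirrors the quoted expression: attaching is refused only when an
 * owner or group is set, the caller does not match it, and the caller
 * also lacks CAP_NET_ADMIN. */
bool tap_attach_denied(const struct tap_check *c)
{
    return ((c->owner_set && !c->owner_matches) ||
            (c->group_set && !c->in_group)) &&
           !c->has_net_admin;
}
```

Read this way, a caller whose effective uid matches the device owner (and who is in the device group, if one is set) is not refused by this expression even without CAP_NET_ADMIN, which matches the conclusion drawn above.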