Re: [libvirt] How to make udev not touch my device?

7 Nov 2016

On Mon, Nov 07, 2016 at 01:11:14PM +0100, Michal Privoznik wrote:
...
On 07.11.2016 10:17, Daniel P. Berrange wrote:
...
On Fri, Nov 04, 2016 at 08:47:34AM +0100, Michal Privoznik wrote:
...
Hey udev developers,
I'm a libvirt developer and I've been facing an interesting issue
recently. Libvirt is a library for managing virtual machines and as such
allows basically any device to be exposed to a virtual machine. For
instance, a virtual machine can use /dev/sdX as its own disk. For
security reasons we allow users to configure their VMs to run under a
different UID/GID and also SELinux context. That means that whenever a
VM is being started up, libvirtd (our daemon) relabels all the
necessary paths that the QEMU process (representing the VM) can touch.
However, I'm facing an issue that I don't know how to fix. In some cases
QEMU can close & reopen a block device. However, closing a block device
triggers a udev event, and hence if there is a rule that sets a security
label on the device, the QEMU process is unable to reopen the device.
My question is, what can we do to prevent udev from meddling with the
security labels that we've set on the devices?
One of the ideas our lead developer had was for libvirt to set some kind
of udev label on devices managed by libvirt (when setting up security
labels) and then whenever udev sees such a labelled device it won't touch
it at all (this could be achieved by a rule perhaps?). Later, when the
domain is shutting down, libvirt would remove that label. But I don't think
setting an arbitrary label on devices is supported, is it?
Having thought about this over the weekend, I'm strongly inclined to
just take udev out of the equation by starting a new mount namespace
for each QEMU we launch and setting up a custom /dev containing just
the devices we need. This will be both a security improvement and
avoid the udev races, with no complex code required in libvirt, and
will work for libvirt all the way back to RHEL6.
How would this work with device hotplug? Say I start a domain with some
set of devices, then bring up an iSCSI target (which appears under
/dev). How does one 'transfer' the device into the new namespace?
BTW: can you elaborate more on udev-namespace relations? Doesn't udev
run in the namespaces too?
A single process can only ever be in a single namespace at any point in
time and udev only ever runs in the initial namespaces. When running
containers you never have udev inside them, and udev certainly doesn't
interact with arbitrary namespaces created by other applications for
their own purposes.

So if libvirt creates a private mount namespace for each QEMU and mounts
a custom /dev there, this is invisible to udev, and thus udev won't/can't
mess with permissions we set in our private /dev.
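
To make the idea concrete, here is a rough Python sketch (not libvirt
code) of what a child process could do just before exec'ing QEMU: unshare
the mount namespace, mount a fresh tmpfs on /dev, and recreate only the
device nodes the guest needs. The helper names node_spec() and
enter_private_dev() are made up for illustration, the libc calls go
through ctypes, and it assumes Linux with CAP_SYS_ADMIN and CAP_MKNOD. A
real setup would also have to provide /dev/pts, /dev/shm and friends, and
apply the per-VM ownership and SELinux label.

```python
import ctypes
import os
import stat

# Linux constants from <sched.h> and <sys/mount.h>.
CLONE_NEWNS = 0x00020000
MS_NOSUID = 0x2
MS_REC = 0x4000
MS_PRIVATE = 0x40000


def _check(ret, what):
    """Raise OSError with errno if a libc call returned non-zero."""
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, f"{what}: {os.strerror(err)}")


def node_spec(host_path):
    """Return (mode, rdev) needed to recreate a host device node."""
    st = os.stat(host_path)
    if not (stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode)):
        raise ValueError(f"{host_path} is not a device node")
    return st.st_mode, st.st_rdev


def enter_private_dev(host_paths):
    """Hypothetical helper: call in the child, before exec'ing QEMU."""
    # Look at the host nodes first; they are hidden once /dev is replaced.
    specs = {p: node_spec(p) for p in host_paths}

    libc = ctypes.CDLL(None, use_errno=True)
    _check(libc.unshare(CLONE_NEWNS), "unshare")
    # Stop our mounts from propagating back into the host namespace.
    _check(libc.mount(b"none", b"/", None,
                      ctypes.c_ulong(MS_REC | MS_PRIVATE), None),
           "make / private")
    _check(libc.mount(b"tmpfs", b"/dev", b"tmpfs",
                      ctypes.c_ulong(MS_NOSUID), b"mode=755"),
           "mount tmpfs on /dev")

    for path, (mode, rdev) in specs.items():
        os.makedirs(os.path.dirname(path), exist_ok=True)
        os.mknod(path, mode, rdev)
        # mknod() is subject to the umask, so set the permissions again.
        os.chmod(path, stat.S_IMODE(mode))
```

Since udev stays in the initial namespace, it never sees the nodes created
on this tmpfs, so whatever ownership or label is applied there is left
alone.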

For hotplug, the libvirt QEMU driver would do the same as the libvirt LXC driver
currently does. It would fork and setns() into the QEMU mount namespace
and run mknod()+chmod() there, before doing the rest of its normal hotplug
logic. See lxcDomainAttachDeviceMknodHelper() for what LXC does.
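
As a rough illustration of that fork + setns() + mknod() + chmod()
sequence, here is a Python sketch. It is not the libvirt helper; the names
mnt_ns_path() and attach_device_node() are invented for this example, and
it assumes Linux, sufficient privileges, and that the QEMU PID is known.
The fork matters because setns() into a mount namespace is refused for a
multithreaded caller, and it keeps the daemon itself in its own namespace.

```python
import ctypes
import os
import stat

CLONE_NEWNS = 0x00020000  # from <sched.h>


def mnt_ns_path(pid):
    """Path of the mount-namespace handle for a given process."""
    return f"/proc/{pid}/ns/mnt"


def attach_device_node(qemu_pid, host_path, perms=0o660):
    """Hypothetical helper: create host_path inside QEMU's private /dev."""
    # Read the node type and major:minor while still in the host namespace.
    st = os.stat(host_path)
    if not (stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode)):
        raise ValueError(f"{host_path} is not a device node")

    child = os.fork()
    if child == 0:
        status = 1
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            fd = os.open(mnt_ns_path(qemu_pid), os.O_RDONLY)
            if libc.setns(fd, CLONE_NEWNS) != 0:
                raise OSError(ctypes.get_errno(), "setns failed")
            # Now inside QEMU's mount namespace: create the node there.
            os.mknod(host_path, stat.S_IFMT(st.st_mode) | perms, st.st_rdev)
            os.chmod(host_path, perms)
            status = 0
        except OSError:
            pass
        finally:
            os._exit(status)  # never return into the parent's code path

    _, wait_status = os.waitpid(child, 0)
    return os.waitstatus_to_exit_code(wait_status) == 0
```

The parent only waits for the child and checks its exit status, then
carries on with the usual hotplug steps.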

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|