[libvirt] RFC: Enable unprivileged SG_IO

Hi,
http://lwn.net/Articles/524720/ introduces a new sysfs knob (unpriv_sgio) for SCSI devices to allow unprivileged SG_IO.
I don't have a solid idea yet of what the libvirt interface should look like. It shouldn't be an XML entry of the disk device: the device can be shared by multiple guests, and the configuration should be kept the same for all of them, so having an XML entry for it would make things a mess.
What Paolo suggested is to add an entry in qemu.conf, just like "cgroup_device_acl":
sgio_device_acl = [ "/dev/sda" ]
When libvirtd starts, set the sysfs knob "unpriv_sgio" of the listed devices to 1, and set it back to 0 when libvirtd exits.
I don't quite agree with this approach. Entries in qemu.conf should generally be configuration for the whole qemu driver, whereas the SG_IO setting belongs to the device layer, or at least no higher than the guest layer.
What I'm thinking about is a public API to tune the knob independently of the domain/driver. That means it's up to management apps to manage the knob's value, setting it to 1 before the domain(s) start, and back to 0 when no domain is using it.
Any thoughts?
Regards,
Osier
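For illustration only, toggling the knob from a script might look roughly like the sketch below. The path layout (/sys/block/<name>/queue/unpriv_sgio) is an assumption based on the patch series described in the LWN article, and the helper names are hypothetical; none of this is an existing libvirt API. Writing the knob requires root and a kernel that carries the patch.

```python
from pathlib import Path


def unpriv_sgio_path(device: str, sysfs_root: str = "/sys") -> Path:
    """Return the assumed sysfs location of the unpriv_sgio knob.

    "/dev/sda" maps to <sysfs_root>/block/sda/queue/unpriv_sgio.
    The layout is an assumption, not something confirmed in this thread.
    """
    name = Path(device).name  # "/dev/sda" -> "sda"
    return Path(sysfs_root) / "block" / name / "queue" / "unpriv_sgio"


def set_unpriv_sgio(device: str, enabled: bool, sysfs_root: str = "/sys") -> None:
    """Write 1 (allow unprivileged SG_IO) or 0 (filter it) to the knob."""
    unpriv_sgio_path(device, sysfs_root).write_text("1" if enabled else "0")


def get_unpriv_sgio(device: str, sysfs_root: str = "/sys") -> bool:
    """Read the knob back; True means unprivileged SG_IO is allowed."""
    return unpriv_sgio_path(device, sysfs_root).read_text().strip() == "1"
```

The sysfs_root parameter exists only so the helpers can be exercised against a scratch directory instead of the real /sys.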

On Thu, Nov 22, 2012 at 10:11:01PM +0800, Osier Yang wrote:
> Hi,
> http://lwn.net/Articles/524720/ introduces a new sysfs knob (unpriv_sgio) for SCSI devices to allow unprivileged SG_IO.
> I don't have a solid idea yet of what the libvirt interface should look like. It shouldn't be an XML entry of the disk device: the device can be shared by multiple guests, and the configuration should be kept the same for all of them, so having an XML entry for it would make things a mess.
IMHO it should be an XML entry of the disk device. If the same device is then given to multiple guests, libvirt has to validate that they all have the same setting in this respect.
> What Paolo suggested is to add an entry in qemu.conf, just like "cgroup_device_acl":
> sgio_device_acl = [ "/dev/sda" ]
> When libvirtd starts, set the sysfs knob "unpriv_sgio" of the listed devices to 1, and set it back to 0 when libvirtd exits.
> I don't quite agree with this approach. Entries in qemu.conf should generally be configuration for the whole qemu driver, whereas the SG_IO setting belongs to the device layer, or at least no higher than the guest layer.
This is fundamentally guest configuration IMHO, not system configuration, so qemu.conf is the wrong place for it.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
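The validation Daniel describes (every guest that is given the same device must carry the same setting) could be sketched as follows. The data shape, a mapping from domain name to per-device boolean, and the function name are hypothetical and chosen only for illustration; this is not how libvirt represents disk configuration.

```python
from collections import defaultdict


def find_sgio_conflicts(domains):
    """Return devices whose SG_IO setting differs between domains.

    `domains` maps a domain name to {device path: bool}, where the bool
    says whether that domain asks for unprivileged SG_IO on the device.
    The result maps each conflicting device to {domain name: setting}.
    """
    per_device = defaultdict(dict)
    for domain, disks in domains.items():
        for device, setting in disks.items():
            per_device[device][domain] = setting
    # A device conflicts when its users do not all agree on one value.
    return {
        device: settings
        for device, settings in per_device.items()
        if len(set(settings.values())) > 1
    }
```

A caller would refuse to start (or define) a domain whenever the returned mapping is non-empty for one of its devices.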

On 22/11/2012 15:19, Daniel P. Berrange wrote:
>> What Paolo suggested is to add an entry in qemu.conf, just like "cgroup_device_acl":
>> sgio_device_acl = [ "/dev/sda" ]
>> When libvirtd starts, set the sysfs knob "unpriv_sgio" of the listed devices to 1, and set it back to 0 when libvirtd exits.
>> I don't quite agree with this approach. Entries in qemu.conf should generally be configuration for the whole qemu driver, whereas the SG_IO setting belongs to the device layer, or at least no higher than the guest layer.
> This is fundamentally guest configuration IMHO, not system configuration, so qemu.conf is the wrong place for it.
We can make it 100% guest configuration. Let's add the same whitelist as the kernel to QEMU's scsi-block/scsi-generic as well. This way, libvirt will be able to start domains with different settings as long as QEMU supports the new property (let's call it scsi-block.privileged). I can add it to 1.4.
Paolo

On 22/11/2012 15:11, Osier Yang wrote:
> Hi,
> http://lwn.net/Articles/524720/ introduces a new sysfs knob (unpriv_sgio) for SCSI devices to allow unprivileged SG_IO.
> I don't have a solid idea yet of what the libvirt interface should look like. It shouldn't be an XML entry of the disk device: the device can be shared by multiple guests, and the configuration should be kept the same for all of them, so having an XML entry for it would make things a mess.
> What Paolo suggested is to add an entry in qemu.conf, just like "cgroup_device_acl":
> sgio_device_acl = [ "/dev/sda" ]
> When libvirtd starts, set the sysfs knob "unpriv_sgio" of the listed devices to 1, and set it back to 0 when libvirtd exits.
> I don't quite agree with this approach. Entries in qemu.conf should generally be configuration for the whole qemu driver, whereas the SG_IO setting belongs to the device layer, or at least no higher than the guest layer.
I don't like it either, but I don't see any alternative...
> What I'm thinking about is a public API to tune the knob independently of the domain/driver. That means it's up to management apps to manage the knob's value, setting it to 1 before the domain(s) start, and back to 0 when no domain is using it.
At this point, it's simpler to just let the admin do this in /etc/rc.d/rc.local or in udev rules (which was my initial idea).
Paolo
> Any thoughts?
> Regards, Osier

On 2012-11-22 22:20, Paolo Bonzini wrote:
> On 22/11/2012 15:11, Osier Yang wrote:
>> Hi,
>> http://lwn.net/Articles/524720/ introduces a new sysfs knob (unpriv_sgio) for SCSI devices to allow unprivileged SG_IO.
>> I don't have a solid idea yet of what the libvirt interface should look like. It shouldn't be an XML entry of the disk device: the device can be shared by multiple guests, and the configuration should be kept the same for all of them, so having an XML entry for it would make things a mess.
>> What Paolo suggested is to add an entry in qemu.conf, just like "cgroup_device_acl":
>> sgio_device_acl = [ "/dev/sda" ]
>> When libvirtd starts, set the sysfs knob "unpriv_sgio" of the listed devices to 1, and set it back to 0 when libvirtd exits.
>> I don't quite agree with this approach. Entries in qemu.conf should generally be configuration for the whole qemu driver, whereas the SG_IO setting belongs to the device layer, or at least no higher than the guest layer.
> I don't like it either, but I don't see any alternative...
>> What I'm thinking about is a public API to tune the knob independently of the domain/driver. That means it's up to management apps to manage the knob's value, setting it to 1 before the domain(s) start, and back to 0 when no domain is using it.
> At this point, it's simpler to just let the admin do this in /etc/rc.d/rc.local or in udev rules (which was my initial idea).
Isn't an API helpful in this case? The apps will want to manage it anyway.
Regards,
Osier
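Whichever side ends up owning the knob, the bookkeeping proposed in this thread (set it to 1 before the first domain using a device starts, and back to 0 once no domain uses it) amounts to per-device reference counting. A minimal sketch follows; the class name and the injected set_knob callback are hypothetical and do not correspond to any existing libvirt API.

```python
class SgioRefCounter:
    """Track which domains use each device and toggle the knob at the edges."""

    def __init__(self, set_knob):
        # set_knob(device, enabled) performs the actual sysfs write;
        # it is injected so the policy can be tested without root.
        self._set_knob = set_knob
        self._users = {}  # device path -> set of domain names

    def acquire(self, device, domain):
        """Call before `domain` starts using `device`."""
        users = self._users.setdefault(device, set())
        if not users:
            # First user of this device: enable unprivileged SG_IO.
            self._set_knob(device, True)
        users.add(domain)

    def release(self, device, domain):
        """Call after `domain` stops using `device`."""
        users = self._users.get(device)
        if users is None or domain not in users:
            return  # unknown device or domain: nothing to undo
        users.discard(domain)
        if not users:
            # Last user is gone: restore the filtered default.
            self._set_knob(device, False)
            del self._users[device]
```

Note that this state lives in memory, so a real implementation would also have to decide what happens to the knob if the managing process restarts while domains are still running.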
participants (3)
- Daniel P. Berrange
- Osier Yang
- Paolo Bonzini