Re: [libvirt] RFC [3/3]: Lock manager usage scenarios

----- "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Tue, Sep 14, 2010 at 05:03:21PM -0400, Ayal Baron wrote:
----- "Daniel P. Berrange" <berrange@redhat.com> wrote:
That is probably possible with the current security driver implementations, but more generally I think it will still hit some trouble. Specifically, one of the items on our todo list is a new security driver that makes use of Linux container namespace functionality to isolate the VMs, so they can't even see other resources / processes on the host. This may well prevent the sync manager wrapper talking to a central sync manager process.
The general rule we aim for is that once libvirtd has spawned a VM, they are completely isolated with the exception of any disks marked with <shareable/>. In other words, any communications channels must be initiated/established by the mgmt layer to the VM process, with nothing to be established in the reverse direction.
Correct me if I'm wrong, but the security limitations (SELinux context) would only take effect after the "exec", no? So the process could still communicate with the daemon, open an FD and then exec. After exec, the VM would be locked down but the daemon could still wait on the FD to see whether the VM has died.
It depends on which exec you are talking about here. If the comms to the daemon are done straight from the libvirtd plugin, then it would still be unrestricted. If the comms were done from the supervisor process, it would be restricted.
Daniel, I'm talking about the supervisor. You said you spoke to Dan Walsh and that the supervisor and qemu processes could get different contexts. Now you're saying the supervisor would be restricted nonetheless. What am I missing?
-- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
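The "open an FD, then exec, and let the daemon wait on the FD" idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain pipe stands in for whatever channel a real sync manager daemon would use, and the helper names `spawn_with_liveness_fd` and `wait_for_death` are hypothetical, not part of libvirt or sync_manager.

```python
import os
import select


def spawn_with_liveness_fd(argv):
    """Fork and exec argv; return (pid, read_fd).

    The child keeps the write end of a pipe open across exec, so the
    parent sees EOF on read_fd once the child (and anything that
    inherited the write end) has exited.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: drop the read end and let the write end survive exec.
            os.close(read_fd)
            os.set_inheritable(write_fd, True)
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed.
            os._exit(127)
    # Parent: close our copy of the write end so EOF can be observed.
    os.close(write_fd)
    return pid, read_fd


def wait_for_death(read_fd, timeout):
    """Return True if the pipe reached EOF within timeout seconds."""
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        return False
    return os.read(read_fd, 1) == b""
```

The FD is established before exec, so any restriction applied at exec time does not prevent the parent from noticing the exit. Whether this remains workable under a container-based security driver is exactly the open question in this thread.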

On Thu, Sep 16, 2010 at 08:31:45AM -0400, Ayal Baron wrote:
Daniel, I'm talking about the supervisor. You said you spoke to Dan Walsh and that the supervisor and qemu processes could get different contexts. Now you're saying the supervisor would be restricted nonetheless. What am I missing?
The distinction is between what is possible and what is recommended. Even with the supervisor & QEMU having separate SELinux contexts, it is still desirable to lock down the supervisor so that it can only access the VM lease file & its own QEMU pid. So while we could write policy such that a supervisor can talk to a central lock daemon, it is preferable for the lock supervisor to be self-contained.
The other point I make is that SELinux is the main security driver today, but others will come along in the future. A container-based security driver will almost certainly completely isolate the spawned processes, with no option to talk to a central lock daemon. There would be a separate filesystem namespace, PID namespace and network namespace per VM - in essence each process would see its own isolated OS with only QEMU & the optional lock supervisor running in it.
So to get a maximally flexible & future-proof sync manager plugin, it is best to avoid any reliance on a central daemon, even if that is technically possible today.
Regards, Daniel
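As a rough illustration of a self-contained supervisor that touches nothing but its own lease file, the sketch below takes an exclusive advisory lock on a local file. This is an assumption made for illustration only: the thread does not describe the sync manager's actual lease protocol, and a real lease on shared storage would need more than a local `flock`. The function names are hypothetical.

```python
import fcntl
import os


def acquire_lease(path):
    """Try to take an exclusive advisory lock on the lease file.

    Returns an open file descriptor holding the lock, or None if
    another holder already has it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking: fail immediately instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release_lease(fd):
    """Drop the lock and close the lease file."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Nothing here needs a socket or a connection back to a central process, which is the property being argued for above.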

On Thu, Sep 16, 2010 at 01:50:46PM +0100, Daniel P. Berrange wrote:
The distinction is between what is possible and what is recommended. Even with the supervisor & QEMU having separate SELinux contexts, it is still desirable to lock down the supervisor so that it can only access the VM lease file & its own QEMU pid. So while we could write policy such that a supervisor can talk to a central lock daemon, it is preferable for the lock supervisor to be self-contained.
The other point I make is that SELinux is the main security driver today, but others will come along in the future. A container-based security driver will almost certainly completely isolate the spawned processes, with no option to talk to a central lock daemon. There would be a separate filesystem namespace, PID namespace and network namespace per VM - in essence each process would see its own isolated OS with only QEMU & the optional lock supervisor running in it.
Could containers make isolation exceptions for
- shared storage devices?
- shared /var/run/sync_manager/watchdog/ so that the system watchdog could monitor all sync_manager instances?
Dave

On Thu, Sep 16, 2010 at 11:31:38AM -0400, David Teigland wrote:
Could containers make isolation exceptions for
- shared storage devices?
- shared /var/run/sync_manager/watchdog/ so that the system watchdog could monitor all sync_manager instances?
Yes, resources (files) from the primary OS can be exposed in the container on a case-by-case basis & potentially be visible inside many containers. If we did a full virtual chroot setup, then the container would only be able to see designated paths. It is also possible to hide the container's chroot hierarchy from the host completely. In any case, we can share paths between containers and the host as needed.
A process inside the container would not be able to see any processes outside the container. Processes outside can, however, see processes inside the container, but their view of the PIDs will be different, e.g. PID 1 inside the container may be PID 2345 outside.
The point I was trying to make is that if the supervisor process wants to connect back to a central lock daemon directly, this might run into trouble. If the supervisor process only needs to access file resources on disk, it should be fine.
Regards, Daniel
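The differing PID views can be inspected on newer kernels through the `NSpid` line of `/proc/<pid>/status`, which lists a process's PID in each nested PID namespace, outermost first. Note that this field was added in Linux 4.1, well after this thread, so the sketch below only illustrates the concept; the helper names are hypothetical.

```python
def parse_nspid(status_text):
    """Return the PIDs from the NSpid line, outermost namespace first.

    Returns an empty list if the kernel does not report NSpid.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def read_nspid(pid="self"):
    """Read and parse NSpid for a process visible in this procfs."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_nspid(handle.read())
```

For the example in the message above, a status file containing `NSpid:	2345	1` would parse to `[2345, 1]`: PID 2345 as seen from the host and PID 1 as seen from inside the container.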

From: libvir-list-bounces@redhat.com [mailto:libvir-list-bounces@redhat.com] On Behalf Of Daniel P. Berrange ...
The point I was trying to make is that if the supervisor process wants to connect back to a central lock daemon directly, this might run into trouble. If the supervisor process only needs to access file resources on disk, it should be fine.
[IH] How would libvirt know to give a security context to the leases area of the VM? It would be a different implementation per lock manager (say, I'd like to lock a row in a central remote DB for this)?

On Mon, Sep 20, 2010 at 03:47:11AM -0400, Itamar Heim wrote:
The point I was trying to make is that if the supervisor process wants to connect back to a central lock daemon directly, this might run into trouble. If the supervisor process only needs to access file resources on disk, it should be fine.
[IH] How would libvirt know to give a security context to the leases area of the VM? It would be a different implementation per lock manager (say, I'd like to lock a row in a central remote DB for this)?
That's easy enough to handle. If it is a shared lease file between all VMs, then presumably that needs to be created ahead of time. SELinux policy can define a suitable default label, or the label can be set as part of the creation process. If there is a per-VM lease file that needs the per-VM security context, then this can be specified as a config parameter in the VM XML. If it's a remote DB, then we don't need to care about it.
Daniel
participants (4)
- Ayal Baron
- Daniel P. Berrange
- David Teigland
- Itamar Heim