[PATCH 0/3] qemu_passt: Fixes for passt lifecycle handling

This series implements fixes in the handling of passt's lifecycle. Stefano Brivio (3): qemu_passt: Don't make passt transition to svirt_t/virt_domain on start qemu_passt: Set UID and GID to configured values for qemu driver, if any qemu_passt: Remove passt socket file on exit src/qemu/qemu_passt.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) -- 2.39.1

qemuSecurityCommandRun() causes an explicit domain transition of the new process, but passt ships with its own SELinux policy, with external interfaces for libvirtd, so we simply need to transition from virtd_t to passt_t as passt is executed. The qemu type enforcement rules have little to do with it. Usage of those new interfaces is implemented by this change in selinux-policy: https://github.com/fedora-selinux/selinux-policy/pull/1613 Replace qemuSecurityCommandRun() with virCommandRun(), which simply does that: it runs the command. Fixes: a56f0168d576 ("qemu: hook up passt config to qemu domains") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> --- src/qemu/qemu_passt.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 1217a6a087..1a67cf44de 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -158,8 +158,6 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; - int exitstatus = 0; - int cmdret = 0; cmd = virCommandNew(PASST); @@ -271,10 +269,7 @@ qemuPasstStart(virDomainObj *vm, if (qemuExtDeviceLogCommand(driver, vm, cmd, "passt") < 0) return -1; - if (qemuSecurityCommandRun(driver, vm, cmd, -1, -1, &exitstatus, &cmdret) < 0) - goto error; - - if (cmdret < 0 || exitstatus != 0) { + if (virCommandRun(cmd, NULL)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Could not start 'passt': %s"), NULLSTR(errbuf)); goto error; -- 2.39.1

On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote:
qemuSecurityCommandRun() causes an explicit domain transition of the new process, but passt ships with its own SELinux policy, with external interfaces for libvirtd, so we simply need to transition from virtd_t to passt_t as passt is executed. The qemu type enforcement rules have little to do with it.
Can you clarify the difference here ? Runing passt under 'svirt_t' is not desirable as that allows many actions that are only relevant to QEMU. Running passt under the MCS label that is associated with the VM is highly desirable though. Two passt instances belonging to separate VMs are isolated from each other if they each use the VM specific MCS label, than if they use the global default MCS label. To use the VM specific MCS label would require libvirt to explicitly set the desired selinux label on exec, it can't happen automatically via an SELinux transition rule. We do stil want to use the passt_t type though. IOW, if we have a VM running svirt_t:s0:c710,c716 IMHO we would its corresponding passt instance to be running passt_t:s0:c710,c716 not passt_t:s0:c0.c1023
Usage of those new interfaces is implemented by this change in selinux-policy: https://github.com/fedora-selinux/selinux-policy/pull/1613
Replace qemuSecurityCommandRun() with virCommandRun(), which simply does that: it runs the command.
Fixes: a56f0168d576 ("qemu: hook up passt config to qemu domains") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> --- src/qemu/qemu_passt.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 1217a6a087..1a67cf44de 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -158,8 +158,6 @@ qemuPasstStart(virDomainObj *vm, g_autofree char *errbuf = NULL; char macaddr[VIR_MAC_STRING_BUFLEN]; size_t i; - int exitstatus = 0; - int cmdret = 0;
cmd = virCommandNew(PASST);
@@ -271,10 +269,7 @@ qemuPasstStart(virDomainObj *vm, if (qemuExtDeviceLogCommand(driver, vm, cmd, "passt") < 0) return -1;
- if (qemuSecurityCommandRun(driver, vm, cmd, -1, -1, &exitstatus, &cmdret) < 0) - goto error; - - if (cmdret < 0 || exitstatus != 0) { + if (virCommandRun(cmd, NULL)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Could not start 'passt': %s"), NULLSTR(errbuf)); goto error; -- 2.39.1
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Tue, 21 Feb 2023 19:43:33 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote:
qemuSecurityCommandRun() causes an explicit domain transition of the new process, but passt ships with its own SELinux policy, with external interfaces for libvirtd, so we simply need to transition from virtd_t to passt_t as passt is executed. The qemu type enforcement rules have little to do with it.
Can you clarify the difference here ?
Between...? I mean, virCommandRun() will just keep things running under virtd_t, so that passt later can transition to passt_t. With qemuSecurityCommandRun(), there would be a transition from virtd_t to svirt_t (it's the function that's called to start qemu, but shouldn't be called here), and not to passt_t. But I'm not really sure that's what you were asking for.
Runing passt under 'svirt_t' is not desirable as that allows many actions that are only relevant to QEMU.
Right, that's what this patch avoids. There are also actions, such as starting passt or killing it, that we don't want to allow QEMU to do.
Running passt under the MCS label that is associated with the VM is highly desirable though. Two passt instances belonging to separate VMs are isolated from each other if they each use the VM specific MCS label, than if they use the global default MCS label.
To use the VM specific MCS label would require libvirt to explicitly set the desired selinux label on exec, it can't happen automatically via an SELinux transition rule.
We do stil want to use the passt_t type though.
IOW, if we have a VM running
svirt_t:s0:c710,c716
IMHO we would its corresponding passt instance to be running
passt_t:s0:c710,c716
not
passt_t:s0:c0.c1023
Practically speaking, it doesn't make a huge difference for passt because it unshares any relevant namespace right after it starts -- that's *in theory* a strictly stronger isolation compared to what SELinux provides (at least once we reach the main loop). But it makes sense, and I guess we should relabel to a specific MCS with still 'virtd_t' as a type, then later the domain would transition to passt_t. This could probably be done as an extension of qemuSecurityCommandRun(), I haven't looked into it yet. I will. Anyway, right now, I think this provides better security than 'setenforce 0', which is the only way to run passt from libvirt at the moment on some distributions. I'm not sure how you handle these cases on libvirt, but generally speaking, this patch is a vast improvement on the current situation, and I can follow up with an extension or a different version of qemuSecurityCommandRun() later. -- Stefano

On Tue, Feb 21, 2023 at 10:49:46PM +0100, Stefano Brivio wrote:
On Tue, 21 Feb 2023 19:43:33 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote:
qemuSecurityCommandRun() causes an explicit domain transition of the new process, but passt ships with its own SELinux policy, with external interfaces for libvirtd, so we simply need to transition from virtd_t to passt_t as passt is executed. The qemu type enforcement rules have little to do with it.
Can you clarify the difference here ?
Between...?
I mean, virCommandRun() will just keep things running under virtd_t, so that passt later can transition to passt_t.
With qemuSecurityCommandRun(), there would be a transition from virtd_t to svirt_t (it's the function that's called to start qemu, but shouldn't be called here), and not to passt_t.
But I'm not really sure that's what you were asking for.
Yes, it is.
Runing passt under 'svirt_t' is not desirable as that allows many actions that are only relevant to QEMU.
Right, that's what this patch avoids. There are also actions, such as starting passt or killing it, that we don't want to allow QEMU to do.
Running passt under the MCS label that is associated with the VM is highly desirable though. Two passt instances belonging to separate VMs are isolated from each other if they each use the VM specific MCS label, than if they use the global default MCS label.
To use the VM specific MCS label would require libvirt to explicitly set the desired selinux label on exec, it can't happen automatically via an SELinux transition rule.
We do stil want to use the passt_t type though.
IOW, if we have a VM running
svirt_t:s0:c710,c716
IMHO we would its corresponding passt instance to be running
passt_t:s0:c710,c716
not
passt_t:s0:c0.c1023
Practically speaking, it doesn't make a huge difference for passt because it unshares any relevant namespace right after it starts -- that's *in theory* a strictly stronger isolation compared to what SELinux provides (at least once we reach the main loop).
Even docker/podman will apply SELinux isolation per container, rather than only relying on namespaces.
But it makes sense, and I guess we should relabel to a specific MCS with still 'virtd_t' as a type, then later the domain would transition to passt_t. This could probably be done as an extension of qemuSecurityCommandRun(), I haven't looked into it yet. I will.
Anyway, right now, I think this provides better security than 'setenforce 0', which is the only way to run passt from libvirt at the moment on some distributions.
If running with SELinux permissive, this patch has no effect, as it is unconfined regardless. That's not a situation we care about. If people want to run without protection they get to keep the pieces when it all goes wrong.
I'm not sure how you handle these cases on libvirt, but generally speaking, this patch is a vast improvement on the current situation, and I can follow up with an extension or a different version of qemuSecurityCommandRun() later.
No, I don't think it is a vast improvement. The goal of sVirt is to guarantee isolation between VMs. Running passt under svirt_t:MCS is not ideal because the svirt_t policy allows things that passt should not get. It does still guarantee isolation between VMs, because the MCS label is present. Switching to running passt_t:c0.c1023 will be more correct in terms of what permissions passt should be allowed, but it has disabled isolation of passt between VMs. This is a clear degradation of capabilities from the POV of sVirt's goals. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, 22 Feb 2023 09:46:42 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 10:49:46PM +0100, Stefano Brivio wrote:
On Tue, 21 Feb 2023 19:43:33 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote:
qemuSecurityCommandRun() causes an explicit domain transition of the new process, but passt ships with its own SELinux policy, with external interfaces for libvirtd, so we simply need to transition from virtd_t to passt_t as passt is executed. The qemu type enforcement rules have little to do with it.
Can you clarify the difference here ?
Between...?
I mean, virCommandRun() will just keep things running under virtd_t, so that passt later can transition to passt_t.
With qemuSecurityCommandRun(), there would be a transition from virtd_t to svirt_t (it's the function that's called to start qemu, but shouldn't be called here), and not to passt_t.
But I'm not really sure that's what you were asking for.
Yes, it is.
Runing passt under 'svirt_t' is not desirable as that allows many actions that are only relevant to QEMU.
Right, that's what this patch avoids. There are also actions, such as starting passt or killing it, that we don't want to allow QEMU to do.
Running passt under the MCS label that is associated with the VM is highly desirable though. Two passt instances belonging to separate VMs are isolated from each other if they each use the VM specific MCS label, than if they use the global default MCS label.
To use the VM specific MCS label would require libvirt to explicitly set the desired selinux label on exec, it can't happen automatically via an SELinux transition rule.
We do stil want to use the passt_t type though.
IOW, if we have a VM running
svirt_t:s0:c710,c716
IMHO we would its corresponding passt instance to be running
passt_t:s0:c710,c716
not
passt_t:s0:c0.c1023
Practically speaking, it doesn't make a huge difference for passt because it unshares any relevant namespace right after it starts -- that's *in theory* a strictly stronger isolation compared to what SELinux provides (at least once we reach the main loop).
Even docker/podman will apply SELinux isolation per container, rather than only relying on namespaces.
Sure, I'm not saying it's not desirable -- but still, many (most?) host-facing services they rely on are not isolated in this sense. Same for the current implementation of libvirt with dnsmasq, for example.
But it makes sense, and I guess we should relabel to a specific MCS with still 'virtd_t' as a type, then later the domain would transition to passt_t. This could probably be done as an extension of qemuSecurityCommandRun(), I haven't looked into it yet. I will.
Anyway, right now, I think this provides better security than 'setenforce 0', which is the only way to run passt from libvirt at the moment on some distributions.
If running with SELinux permissive, this patch has no effect, as it is unconfined regardless. That's not a situation we care about. If people want to run without protection they get to keep the pieces when it all goes wrong.
The current implementation simply does not and cannot work with SELinux in enforcing mode, so users have no other choice.
I'm not sure how you handle these cases on libvirt, but generally speaking, this patch is a vast improvement on the current situation, and I can follow up with an extension or a different version of qemuSecurityCommandRun() later.
No, I don't think it is a vast improvement.
The goal of sVirt is to guarantee isolation between VMs.
Running passt under svirt_t:MCS is not ideal because the svirt_t policy allows things that passt should not get. It does still guarantee isolation between VMs, because the MCS label is present.
It's a bit more than that: it's not ideal because libvirt simply won't start passt. There's no isolation and no VMs.
Switching to running passt_t:c0.c1023 will be more correct in terms of what permissions passt should be allowed, but it has disabled isolation of passt between VMs. This is a clear degradation of capabilities from the POV of sVirt's goals.
It's not a degradation because VMs can't start passt with SELinux in enforcing mode, without this patch. No service, no degradation. I looked into options to rework virSecurityManagerSetChildProcessLabel() and friends with a more flexible modeling/building of labels -- just doing some trick all the way down from qemuSecurityCommandRun() implies a number of layering violations that I would like to avoid. It all looks doable, but implementing the type of functionality/API that's currently missing there isn't a small rework -- some refactoring of interfaces is definitely needed. I started, but it's not quick. So for the moment being I would suggest that, if passt can't be relabeled as passt_t (i.e. if this patch or equivalent can't be applied), the whole passt back-end should be dropped. I'm not sure what is the accepted way to deprecate functionality, so I would need some pointers about that. -- Stefano

On 2/22/23 11:05, Stefano Brivio wrote:
On Wed, 22 Feb 2023 09:46:42 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 10:49:46PM +0100, Stefano Brivio wrote:
On Tue, 21 Feb 2023 19:43:33 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote:
qemuSecurityCommandRun() causes an explicit domain transition of the new process, but passt ships with its own SELinux policy, with external interfaces for libvirtd, so we simply need to transition from virtd_t to passt_t as passt is executed. The qemu type enforcement rules have little to do with it.
Can you clarify the difference here ?
Between...?
I mean, virCommandRun() will just keep things running under virtd_t, so that passt later can transition to passt_t.
With qemuSecurityCommandRun(), there would be a transition from virtd_t to svirt_t (it's the function that's called to start qemu, but shouldn't be called here), and not to passt_t.
But I'm not really sure that's what you were asking for.
Yes, it is.
Runing passt under 'svirt_t' is not desirable as that allows many actions that are only relevant to QEMU.
Right, that's what this patch avoids. There are also actions, such as starting passt or killing it, that we don't want to allow QEMU to do.
Running passt under the MCS label that is associated with the VM is highly desirable though. Two passt instances belonging to separate VMs are isolated from each other if they each use the VM specific MCS label, than if they use the global default MCS label.
To use the VM specific MCS label would require libvirt to explicitly set the desired selinux label on exec, it can't happen automatically via an SELinux transition rule.
We do stil want to use the passt_t type though.
IOW, if we have a VM running
svirt_t:s0:c710,c716
IMHO we would its corresponding passt instance to be running
passt_t:s0:c710,c716
not
passt_t:s0:c0.c1023
Practically speaking, it doesn't make a huge difference for passt because it unshares any relevant namespace right after it starts -- that's *in theory* a strictly stronger isolation compared to what SELinux provides (at least once we reach the main loop).
Even docker/podman will apply SELinux isolation per container, rather than only relying on namespaces.
Sure, I'm not saying it's not desirable -- but still, many (most?) host-facing services they rely on are not isolated in this sense. Same for the current implementation of libvirt with dnsmasq, for example.
But it makes sense, and I guess we should relabel to a specific MCS with still 'virtd_t' as a type, then later the domain would transition to passt_t. This could probably be done as an extension of qemuSecurityCommandRun(), I haven't looked into it yet. I will.
Anyway, right now, I think this provides better security than 'setenforce 0', which is the only way to run passt from libvirt at the moment on some distributions.
If running with SELinux permissive, this patch has no effect, as it is unconfined regardless. That's not a situation we care about. If people want to run without protection they get to keep the pieces when it all goes wrong.
The current implementation simply does not and cannot work with SELinux in enforcing mode, so users have no other choice.
I'm not sure how you handle these cases on libvirt, but generally speaking, this patch is a vast improvement on the current situation, and I can follow up with an extension or a different version of qemuSecurityCommandRun() later.
No, I don't think it is a vast improvement.
The goal of sVirt is to guarantee isolation between VMs.
Running passt under svirt_t:MCS is not ideal because the svirt_t policy allows things that passt should not get. It does still guarantee isolation between VMs, because the MCS label is present.
It's a bit more than that: it's not ideal because libvirt simply won't start passt. There's no isolation and no VMs.
Switching to running passt_t:c0.c1023 will be more correct in terms of what permissions passt should be allowed, but it has disabled isolation of passt between VMs. This is a clear degradation of capabilities from the POV of sVirt's goals.
It's not a degradation because VMs can't start passt with SELinux in enforcing mode, without this patch. No service, no degradation.
I looked into options to rework virSecurityManagerSetChildProcessLabel() and friends with a more flexible modeling/building of labels -- just doing some trick all the way down from qemuSecurityCommandRun() implies a number of layering violations that I would like to avoid.
It all looks doable, but implementing the type of functionality/API that's currently missing there isn't a small rework -- some refactoring of interfaces is definitely needed. I started, but it's not quick.
So for the moment being I would suggest that, if passt can't be relabeled as passt_t (i.e. if this patch or equivalent can't be applied), the whole passt back-end should be dropped.
I don't think we need such drastic measure. I think you can use: qemuPasstStart() { seclabel = virDomainDefGetSecurityLabelDef(vm->def, "selinux"); s = context_new(seclabel->label); context_type_set(s, "virt_t); newLabel = context_str(s); virCommandSetSELinuxLabel(cmd, newLabel); virCommandRun(); } Michal

On Wed, 22 Feb 2023 12:21:09 +0100 Michal Prívozník <mprivozn@redhat.com> wrote:
On 2/22/23 11:05, Stefano Brivio wrote:
On Wed, 22 Feb 2023 09:46:42 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 10:49:46PM +0100, Stefano Brivio wrote:
On Tue, 21 Feb 2023 19:43:33 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote:
qemuSecurityCommandRun() causes an explicit domain transition of the new process, but passt ships with its own SELinux policy, with external interfaces for libvirtd, so we simply need to transition from virtd_t to passt_t as passt is executed. The qemu type enforcement rules have little to do with it.
Can you clarify the difference here ?
Between...?
I mean, virCommandRun() will just keep things running under virtd_t, so that passt later can transition to passt_t.
With qemuSecurityCommandRun(), there would be a transition from virtd_t to svirt_t (it's the function that's called to start qemu, but shouldn't be called here), and not to passt_t.
But I'm not really sure that's what you were asking for.
Yes, it is.
Runing passt under 'svirt_t' is not desirable as that allows many actions that are only relevant to QEMU.
Right, that's what this patch avoids. There are also actions, such as starting passt or killing it, that we don't want to allow QEMU to do.
Running passt under the MCS label that is associated with the VM is highly desirable though. Two passt instances belonging to separate VMs are isolated from each other if they each use the VM specific MCS label, than if they use the global default MCS label.
To use the VM specific MCS label would require libvirt to explicitly set the desired selinux label on exec, it can't happen automatically via an SELinux transition rule.
We do stil want to use the passt_t type though.
IOW, if we have a VM running
svirt_t:s0:c710,c716
IMHO we would its corresponding passt instance to be running
passt_t:s0:c710,c716
not
passt_t:s0:c0.c1023
Practically speaking, it doesn't make a huge difference for passt because it unshares any relevant namespace right after it starts -- that's *in theory* a strictly stronger isolation compared to what SELinux provides (at least once we reach the main loop).
Even docker/podman will apply SELinux isolation per container, rather than only relying on namespaces.
Sure, I'm not saying it's not desirable -- but still, many (most?) host-facing services they rely on are not isolated in this sense. Same for the current implementation of libvirt with dnsmasq, for example.
But it makes sense, and I guess we should relabel to a specific MCS with still 'virtd_t' as a type, then later the domain would transition to passt_t. This could probably be done as an extension of qemuSecurityCommandRun(), I haven't looked into it yet. I will.
Anyway, right now, I think this provides better security than 'setenforce 0', which is the only way to run passt from libvirt at the moment on some distributions.
If running with SELinux permissive, this patch has no effect, as it is unconfined regardless. That's not a situation we care about. If people want to run without protection they get to keep the pieces when it all goes wrong.
The current implementation simply does not and cannot work with SELinux in enforcing mode, so users have no other choice.
I'm not sure how you handle these cases on libvirt, but generally speaking, this patch is a vast improvement on the current situation, and I can follow up with an extension or a different version of qemuSecurityCommandRun() later.
No, I don't think it is a vast improvement.
The goal of sVirt is to guarantee isolation between VMs.
Running passt under svirt_t:MCS is not ideal because the svirt_t policy allows things that passt should not get. It does still guarantee isolation between VMs, because the MCS label is present.
It's a bit more than that: it's not ideal because libvirt simply won't start passt. There's no isolation and no VMs.
Switching to running passt_t:c0.c1023 will be more correct in terms of what permissions passt should be allowed, but it has disabled isolation of passt between VMs. This is a clear degradation of capabilities from the POV of sVirt's goals.
It's not a degradation because VMs can't start passt with SELinux in enforcing mode, without this patch. No service, no degradation.
I looked into options to rework virSecurityManagerSetChildProcessLabel() and friends with a more flexible modeling/building of labels -- just doing some trick all the way down from qemuSecurityCommandRun() implies a number of layering violations that I would like to avoid.
It all looks doable, but implementing the type of functionality/API that's currently missing there isn't a small rework -- some refactoring of interfaces is definitely needed. I started, but it's not quick.
So for the moment being I would suggest that, if passt can't be relabeled as passt_t (i.e. if this patch or equivalent can't be applied), the whole passt back-end should be dropped.
I don't think we need such drastic measure. I think you can use:
qemuPasstStart() {
seclabel = virDomainDefGetSecurityLabelDef(vm->def, "selinux"); s = context_new(seclabel->label); context_type_set(s, "virt_t); newLabel = context_str(s);
virCommandSetSELinuxLabel(cmd, newLabel);
virCommandRun(); }
Yes, I actually tried something like this and it seemed to work, but I didn't propose it as it looks (is) gross. On the other hand, if you think it's acceptable as a temporary measure, let me test it (in a bit). Thanks for the snippet. -- Stefano

On 2/22/23 12:30, Stefano Brivio wrote:
I don't think we need such drastic measure. I think you can use:
qemuPasstStart() {
seclabel = virDomainDefGetSecurityLabelDef(vm->def, "selinux"); s = context_new(seclabel->label); context_type_set(s, "virt_t); newLabel = context_str(s);
virCommandSetSELinuxLabel(cmd, newLabel);
virCommandRun(); }
Yes, I actually tried something like this and it seemed to work, but I didn't propose it as it looks (is) gross.
Agreed, it's not something I'd show to my kids, but it works.
On the other hand, if you think it's acceptable as a temporary measure, let me test it (in a bit). Thanks for the snippet.
Forgot to mention, it should be wrapped in #ifdef WITH_SELINUX as we offer users to compile without SELinux support (e.g. FreeBSD which does support QEMU but doesn't have SELinux, what a surprise, right?). Michal

On Wed, Feb 22, 2023 at 12:21:09PM +0100, Michal Prívozník wrote:
On 2/22/23 11:05, Stefano Brivio wrote:
On Wed, 22 Feb 2023 09:46:42 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 10:49:46PM +0100, Stefano Brivio wrote:
On Tue, 21 Feb 2023 19:43:33 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote:
qemuSecurityCommandRun() causes an explicit domain transition of the new process, but passt ships with its own SELinux policy, with external interfaces for libvirtd, so we simply need to transition from virtd_t to passt_t as passt is executed. The qemu type enforcement rules have little to do with it.
Can you clarify the difference here ?
Between...?
I mean, virCommandRun() will just keep things running under virtd_t, so that passt later can transition to passt_t.
With qemuSecurityCommandRun(), there would be a transition from virtd_t to svirt_t (it's the function that's called to start qemu, but shouldn't be called here), and not to passt_t.
But I'm not really sure that's what you were asking for.
Yes, it is.
Runing passt under 'svirt_t' is not desirable as that allows many actions that are only relevant to QEMU.
Right, that's what this patch avoids. There are also actions, such as starting passt or killing it, that we don't want to allow QEMU to do.
Running passt under the MCS label that is associated with the VM is highly desirable though. Two passt instances belonging to separate VMs are isolated from each other if they each use the VM specific MCS label, than if they use the global default MCS label.
To use the VM specific MCS label would require libvirt to explicitly set the desired selinux label on exec, it can't happen automatically via an SELinux transition rule.
We do stil want to use the passt_t type though.
IOW, if we have a VM running
svirt_t:s0:c710,c716
IMHO we would its corresponding passt instance to be running
passt_t:s0:c710,c716
not
passt_t:s0:c0.c1023
Practically speaking, it doesn't make a huge difference for passt because it unshares any relevant namespace right after it starts -- that's *in theory* a strictly stronger isolation compared to what SELinux provides (at least once we reach the main loop).
Even docker/podman will apply SELinux isolation per container, rather than only relying on namespaces.
Sure, I'm not saying it's not desirable -- but still, many (most?) host-facing services they rely on are not isolated in this sense. Same for the current implementation of libvirt with dnsmasq, for example.
But it makes sense, and I guess we should relabel to a specific MCS with still 'virtd_t' as a type, then later the domain would transition to passt_t. This could probably be done as an extension of qemuSecurityCommandRun(), I haven't looked into it yet. I will.
Anyway, right now, I think this provides better security than 'setenforce 0', which is the only way to run passt from libvirt at the moment on some distributions.
If running with SELinux permissive, this patch has no effect, as it is unconfined regardless. That's not a situation we care about. If people want to run without protection they get to keep the pieces when it all goes wrong.
The current implementation simply does not and cannot work with SELinux in enforcing mode, so users have no other choice.
I'm not sure how you handle these cases on libvirt, but generally speaking, this patch is a vast improvement on the current situation, and I can follow up with an extension or a different version of qemuSecurityCommandRun() later.
No, I don't think it is a vast improvement.
The goal of sVirt is to guarantee isolation between VMs.
Running passt under svirt_t:MCS is not ideal because the svirt_t policy allows things that passt should not get. It does still guarantee isolation between VMs, because the MCS label is present.
It's a bit more than that: it's not ideal because libvirt simply won't start passt. There's no isolation and no VMs.
Switching to running passt_t:c0.c1023 will be more correct in terms of what permissions passt should be allowed, but it has disabled isolation of passt between VMs. This is a clear degradation of capabilities from the POV of sVirt's goals.
It's not a degradation because VMs can't start passt with SELinux in enforcing mode, without this patch. No service, no degradation.
I looked into options to rework virSecurityManagerSetChildProcessLabel() and friends with a more flexible modeling/building of labels -- just doing some trick all the way down from qemuSecurityCommandRun() implies a number of layering violations that I would like to avoid.
It all looks doable, but implementing the type of functionality/API that's currently missing there isn't a small rework -- some refactoring of interfaces is definitely needed. I started, but it's not quick.
So for the moment being I would suggest that, if passt can't be relabeled as passt_t (i.e. if this patch or equivalent can't be applied), the whole passt back-end should be dropped.
I don't think we need such drastic measure. I think you can use:
qemuPasstStart() {
seclabel = virDomainDefGetSecurityLabelDef(vm->def, "selinux"); s = context_new(seclabel->label); context_type_set(s, "virt_t);
I presume you meant "passt_t" here.
newLabel = context_str(s);
Probably don't want code directly exposed to SELinux APIs though. We also likely shouldn't hardcode 'passt_t' either, as that's not an inherant property of passt, but rather of the policy. This problem is not unique to passt either - we'll want to deal with this for almost any helper binary we spawn off per-VM. We can avoid hardcoding labels by calling getfilecon() on the binary we're going to run, to query its assigned type label that is on disk. Then concatenate the MCS label. So I'm inclined to say we need a virSecurityManagerGetHelperLabel(virSecurityManager *mgr, virDomainDef *def, const char *helper_path); And we would call virSecurityManagerGetHelperLabel(mgr, vm->def, "/usr/bin/passt")
virCommandSetSELinuxLabel(cmd, newLabel);
virCommandRun(); }
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, 22 Feb 2023 11:35:16 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Wed, Feb 22, 2023 at 12:21:09PM +0100, Michal Prívozník wrote:
On 2/22/23 11:05, Stefano Brivio wrote:
On Wed, 22 Feb 2023 09:46:42 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 10:49:46PM +0100, Stefano Brivio wrote:
On Tue, 21 Feb 2023 19:43:33 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote: > qemuSecurityCommandRun() causes an explicit domain transition of the > new process, but passt ships with its own SELinux policy, with > external interfaces for libvirtd, so we simply need to transition > from virtd_t to passt_t as passt is executed. The qemu type > enforcement rules have little to do with it.
Can you clarify the difference here ?
Between...?
I mean, virCommandRun() will just keep things running under virtd_t, so that passt later can transition to passt_t.
With qemuSecurityCommandRun(), there would be a transition from virtd_t to svirt_t (it's the function that's called to start qemu, but shouldn't be called here), and not to passt_t.
But I'm not really sure that's what you were asking for.
Yes, it is.
Runing passt under 'svirt_t' is not desirable as that allows many actions that are only relevant to QEMU.
Right, that's what this patch avoids. There are also actions, such as starting passt or killing it, that we don't want to allow QEMU to do.
Running passt under the MCS label that is associated with the VM is highly desirable though. Two passt instances belonging to separate VMs are isolated from each other if they each use the VM specific MCS label, than if they use the global default MCS label.
To use the VM specific MCS label would require libvirt to explicitly set the desired selinux label on exec, it can't happen automatically via an SELinux transition rule.
We do stil want to use the passt_t type though.
IOW, if we have a VM running
svirt_t:s0:c710,c716
IMHO we would its corresponding passt instance to be running
passt_t:s0:c710,c716
not
passt_t:s0:c0.c1023
Practically speaking, it doesn't make a huge difference for passt because it unshares any relevant namespace right after it starts -- that's *in theory* a strictly stronger isolation compared to what SELinux provides (at least once we reach the main loop).
Even docker/podman will apply SELinux isolation per container, rather than only relying on namespaces.
Sure, I'm not saying it's not desirable -- but still, many (most?) host-facing services they rely on are not isolated in this sense. Same for the current implementation of libvirt with dnsmasq, for example.
But it makes sense, and I guess we should relabel to a specific MCS with still 'virtd_t' as a type, then later the domain would transition to passt_t. This could probably be done as an extension of qemuSecurityCommandRun(), I haven't looked into it yet. I will.
Anyway, right now, I think this provides better security than 'setenforce 0', which is the only way to run passt from libvirt at the moment on some distributions.
If running with SELinux permissive, this patch has no effect, as it is unconfined regardless. That's not a situation we care about. If people want to run without protection they get to keep the pieces when it all goes wrong.
The current implementation simply does not and cannot work with SELinux in enforcing mode, so users have no other choice.
I'm not sure how you handle these cases on libvirt, but generally speaking, this patch is a vast improvement on the current situation, and I can follow up with an extension or a different version of qemuSecurityCommandRun() later.
No, I don't think it is a vast improvement.
The goal of sVirt is to guarantee isolation between VMs.
Running passt under svirt_t:MCS is not ideal because the svirt_t policy allows things that passt should not get. It does still guarantee isolation between VMs, because the MCS label is present.
It's a bit more than that: it's not ideal because libvirt simply won't start passt. There's no isolation and no VMs.
Switching to running passt_t:c0.c1023 will be more correct in terms of what permissions passt should be allowed, but it has disabled isolation of passt between VMs. This is a clear degradation of capabilities from the POV of sVirt's goals.
It's not a degradation because VMs can't start passt with SELinux in enforcing mode, without this patch. No service, no degradation.
I looked into options to rework virSecurityManagerSetChildProcessLabel() and friends with a more flexible modeling/building of labels -- just doing some trick all the way down from qemuSecurityCommandRun() implies a number of layering violations that I would like to avoid.
It all looks doable, but implementing the type of functionality/API that's currently missing there isn't a small rework -- some refactoring of interfaces is definitely needed. I started, but it's not quick.
So for the moment being I would suggest that, if passt can't be relabeled as passt_t (i.e. if this patch or equivalent can't be applied), the whole passt back-end should be dropped.
I don't think we need such drastic measure. I think you can use:
qemuPasstStart() {
seclabel = virDomainDefGetSecurityLabelDef(vm->def, "selinux"); s = context_new(seclabel->label); context_type_set(s, "virt_t);
I presume you meant "passt_t" here.
No, that needs to be "virtd_t" for the right domain transition to happen. -- Stefano

On Wed, Feb 22, 2023 at 12:38:23PM +0100, Stefano Brivio wrote:
On Wed, 22 Feb 2023 11:35:16 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Wed, Feb 22, 2023 at 12:21:09PM +0100, Michal Prívozník wrote:
On 2/22/23 11:05, Stefano Brivio wrote:
On Wed, 22 Feb 2023 09:46:42 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Feb 21, 2023 at 10:49:46PM +0100, Stefano Brivio wrote:
On Tue, 21 Feb 2023 19:43:33 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Tue, Feb 21, 2023 at 08:19:05PM +0100, Stefano Brivio wrote: >> qemuSecurityCommandRun() causes an explicit domain transition of the >> new process, but passt ships with its own SELinux policy, with >> external interfaces for libvirtd, so we simply need to transition >> from virtd_t to passt_t as passt is executed. The qemu type >> enforcement rules have little to do with it. > > Can you clarify the difference here ?
Between...?
I mean, virCommandRun() will just keep things running under virtd_t, so that passt later can transition to passt_t.
With qemuSecurityCommandRun(), there would be a transition from virtd_t to svirt_t (it's the function that's called to start qemu, but shouldn't be called here), and not to passt_t.
But I'm not really sure that's what you were asking for.
Yes, it is.
> Runing passt under 'svirt_t' is not desirable as that allows > many actions that are only relevant to QEMU.
Right, that's what this patch avoids. There are also actions, such as starting passt or killing it, that we don't want to allow QEMU to do.
> Running passt under the MCS label that is associated with the > VM is highly desirable though. Two passt instances belonging > to separate VMs are isolated from each other if they each use > the VM specific MCS label, than if they use the global default > MCS label. > > To use the VM specific MCS label would require libvirt to > explicitly set the desired selinux label on exec, it can't > happen automatically via an SELinux transition rule. > > We do stil want to use the passt_t type though. > > IOW, if we have a VM running > > svirt_t:s0:c710,c716 > > IMHO we would its corresponding passt instance to be > running > > passt_t:s0:c710,c716 > > > not > > passt_t:s0:c0.c1023
Practically speaking, it doesn't make a huge difference for passt because it unshares any relevant namespace right after it starts -- that's *in theory* a strictly stronger isolation compared to what SELinux provides (at least once we reach the main loop).
Even docker/podman will apply SELinux isolation per container, rather than only relying on namespaces.
Sure, I'm not saying it's not desirable -- but still, many (most?) host-facing services they rely on are not isolated in this sense. Same for the current implementation of libvirt with dnsmasq, for example.
But it makes sense, and I guess we should relabel to a specific MCS with still 'virtd_t' as a type, then later the domain would transition to passt_t. This could probably be done as an extension of qemuSecurityCommandRun(), I haven't looked into it yet. I will.
Anyway, right now, I think this provides better security than 'setenforce 0', which is the only way to run passt from libvirt at the moment on some distributions.
If running with SELinux permissive, this patch has no effect, as it is unconfined regardless. That's not a situation we care about. If people want to run without protection they get to keep the pieces when it all goes wrong.
The current implementation simply does not and cannot work with SELinux in enforcing mode, so users have no other choice.
I'm not sure how you handle these cases on libvirt, but generally speaking, this patch is a vast improvement on the current situation, and I can follow up with an extension or a different version of qemuSecurityCommandRun() later.
No, I don't think it is a vast improvement.
The goal of sVirt is to guarantee isolation between VMs.
Running passt under svirt_t:MCS is not ideal because the svirt_t policy allows things that passt should not get. It does still guarantee isolation between VMs, because the MCS label is present.
It's a bit more than that: it's not ideal because libvirt simply won't start passt. There's no isolation and no VMs.
Switching to running passt_t:c0.c1023 will be more correct in terms of what permissions passt should be allowed, but it has disabled isolation of passt between VMs. This is a clear degradation of capabilities from the POV of sVirt's goals.
It's not a degradation because VMs can't start passt with SELinux in enforcing mode, without this patch. No service, no degradation.
I looked into options to rework virSecurityManagerSetChildProcessLabel() and friends with a more flexible modeling/building of labels -- just doing some trick all the way down from qemuSecurityCommandRun() implies a number of layering violations that I would like to avoid.
It all looks doable, but implementing the type of functionality/API that's currently missing there isn't a small rework -- some refactoring of interfaces is definitely needed. I started, but it's not quick.
So for the moment being I would suggest that, if passt can't be relabeled as passt_t (i.e. if this patch or equivalent can't be applied), the whole passt back-end should be dropped.
I don't think we need such drastic measure. I think you can use:
qemuPasstStart() {
seclabel = virDomainDefGetSecurityLabelDef(vm->def, "selinux"); s = context_new(seclabel->label); context_type_set(s, "virt_t);
I presume you meant "passt_t" here.
No, that needs to be "virtd_t" for the right domain transition to happen.
The transitions only happen on execve(). We're running in libvirtd (as virtd_t:c0.c1023) and will execve() passt (needs to be passt_t:c243,c2353). If we use virtd_t, then when passt is exec'd it'll get virtd_t:c243,c2353. Without a second execve it'll never transition onto passt_t:c243,c2353. We have no need to rely on automatic transitions when we know the target label we wish to use upfront. So we should be using passt_t:c243,c2353 with virCommand With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

qemuSecurityCommandRun() would have dealt with this (if UID and GID had been passed). With virCommandRun() we need separate, explicit calls. Fixes: a56f0168d576 ("qemu: hook up passt config to qemu domains") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> --- src/qemu/qemu_passt.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 1a67cf44de..c7012e349a 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -152,6 +152,7 @@ qemuPasstStart(virDomainObj *vm, { qemuDomainObjPrivate *priv = vm->privateData; virQEMUDriver *driver = priv->driver; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); g_autoptr(virCommand) cmd = NULL; g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); @@ -164,6 +165,11 @@ qemuPasstStart(virDomainObj *vm, virCommandClearCaps(cmd); virCommandSetErrorBuffer(cmd, &errbuf); + if (cfg->user != (uid_t) -1) + virCommandSetUID(cmd, cfg->user); + if (cfg->group != (gid_t) -1) + virCommandSetGID(cmd, cfg->group); + virCommandAddArgList(cmd, "--one-off", "--socket", passtSocketName, -- 2.39.1

Just like it can't remove its own PID files, passt can't unlink its own socket upon exit (unless the initialisation fails), because it has no access to the filesystem at runtime. Remove the socket file in qemuPasstKill(). Fixes: a56f0168d576 ("qemu: hook up passt config to qemu domains") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> --- src/qemu/qemu_passt.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index c7012e349a..0e028ca752 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -103,7 +103,7 @@ qemuPasstAddNetProps(virDomainObj *vm, static void -qemuPasstKill(const char *pidfile) +qemuPasstKill(const char *pidfile, const char *passtSocketName) { virErrorPtr orig_err; pid_t pid = 0; @@ -115,6 +115,8 @@ qemuPasstKill(const char *pidfile) virProcessKillPainfully(pid, true); unlink(pidfile); + unlink(passtSocketName); + virErrorRestore(&orig_err); } @@ -124,8 +126,9 @@ qemuPasstStop(virDomainObj *vm, virDomainNetDef *net) { g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net); - qemuPasstKill(pidfile); + qemuPasstKill(pidfile, passtSocketName); } @@ -284,6 +287,6 @@ qemuPasstStart(virDomainObj *vm, return 0; error: - qemuPasstKill(pidfile); + qemuPasstKill(pidfile, passtSocketName); return -1; } -- 2.39.1

On 2/21/23 2:19 PM, Stefano Brivio wrote:
Just like it can't remove its own PID files, passt can't unlink its own socket upon exit (unless the initialisation fails), because it has no access to the filesystem at runtime.
Remove the socket file in qemuPasstKill().
Fixes: a56f0168d576 ("qemu: hook up passt config to qemu domains") Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com> (it's independent of the rest of the series, so I pushed it)
--- src/qemu/qemu_passt.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index c7012e349a..0e028ca752 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -103,7 +103,7 @@ qemuPasstAddNetProps(virDomainObj *vm,
static void -qemuPasstKill(const char *pidfile) +qemuPasstKill(const char *pidfile, const char *passtSocketName) { virErrorPtr orig_err; pid_t pid = 0; @@ -115,6 +115,8 @@ qemuPasstKill(const char *pidfile) virProcessKillPainfully(pid, true); unlink(pidfile);
+ unlink(passtSocketName); + virErrorRestore(&orig_err); }
@@ -124,8 +126,9 @@ qemuPasstStop(virDomainObj *vm, virDomainNetDef *net) { g_autofree char *pidfile = qemuPasstCreatePidFilename(vm, net); + g_autofree char *passtSocketName = qemuPasstCreateSocketPath(vm, net);
- qemuPasstKill(pidfile); + qemuPasstKill(pidfile, passtSocketName); }
@@ -284,6 +287,6 @@ qemuPasstStart(virDomainObj *vm, return 0;
error: - qemuPasstKill(pidfile); + qemuPasstKill(pidfile, passtSocketName); return -1; }
participants (4)
-
Daniel P. Berrangé
-
Laine Stump
-
Michal Prívozník
-
Stefano Brivio