Resolving issue 135 (Qemu under AppArmor can't access volume-type disks)
Hi, Several months ago, I ran into issue #135 which says that Qemu under AppArmor can't access LVM volume disks. I have been studying the code and the invocation of virt-aa-helper. I'm using 11.3.0 and 10.0.0 -- I'm working to compile and run a development version, but have my progress to share in the mean time. So far, I'm finding that if I create a volume-based disk XML entry in my domain definition: <disk type='volume' device='disk'> <driver name='qemu' type='qcow2'/> <source pool='default' volume='cirros.img'/> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> with this, there is *no* XML (at all) on standard input in the virt-aa-helper command, whereas when using effectively the same definition, resolving the file manually, like this: <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/var/lib/libvirt/images/cirros.img'/> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> I get full XML on standard input for virt-aa-helper, with this being the snippet for the disk definition: <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/var/lib/libvirt/images/cirros.img' index='1'/> <backingStore/> <target dev='sda' bus='scsi'/> <alias name='scsi0-0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> I'm not asking for a fix, but would like to know if anybody has any "Ah hah!" moments about this - and if not, if there are some hints on how to test this, hopefully without needing to restart a VM over-and-over. Thanks in advance for any hints you may be able to provide! -- Bye, Erik. http://efficito.com -- Hosted accounting and ERP. Robust and Flexible. No vendor lock-in.
On 1/14/26 00:35, Erik Huelsmann wrote:
Hi,
Several months ago, I ran into issue #135 which says that Qemu under AppArmor can't access LVM volume disks.
The problem here is, that AppArmor driver in libvirt generates the profile in qemuSecurityGenLabel() phase. Long story short, starting up a libvirt domain is broken down into several steps. One of them is qemuProcessPrepareDomain() where domain XML is filled with (parts of) live data. In this specific case, disk source is translated (from storage pool/vol to a filepath). But it's not just that, various other paths are generated (e.g. socket paths, render nodes for graphics, etc.). Then, the next step is qemuProcessPrepareHost() where those files from previous step are created, their seclabels are (possibly) set. This has a caveat though. Initially, libvirt cared about SELinux only (leave traditional uid/gid perms (we call them DAC) aside for a brief moment). Here. an unique seclabel is generated in the domain prepare phase, so that any helper process that's used to create files in host prepare phase can run under that label (and thus be restricted). AppArmor does not work this way. So when an aforementioned helper process wants to run (say /usr/bin/swtpm_setup) it already needs profile to be defined. As "obvious" workaround, AppArmor generates the profile in seclabel generation phase. But by that time path are not populated! IMO, quick and dirty fix would be to generate seclabels at the end of qemuProcessPrepareDomain() instead of beginning (see below). The only downside of this is that seclabels won't be available upfront, e.g. if we want to copy them into a device specific seclabel (we do NOT do that though). diff --git i/src/qemu/qemu_process.c w/src/qemu/qemu_process.c index a53bb40783..00b9a732f3 100644 --- i/src/qemu/qemu_process.c +++ w/src/qemu/qemu_process.c @@ -7024,15 +7024,6 @@ qemuProcessPrepareDomain(virQEMUDriver *driver, return -1; if (!(flags & VIR_QEMU_PROCESS_START_PRETEND)) { - /* If you are using a SecurityDriver with dynamic labelling, - then generate a security label for isolation */ - VIR_DEBUG("Generating domain security label (if required)"); - if (qemuSecurityGenLabel(driver->securityManager, vm->def) < 0) { - virDomainAuditSecurityLabel(vm, false); - return -1; - } - virDomainAuditSecurityLabel(vm, true); - if (qemuProcessPrepareDomainNUMAPlacement(vm) < 0) return -1; } @@ -7154,6 +7145,17 @@ qemuProcessPrepareDomain(virQEMUDriver *driver, } } + /* Keep this as the very last step because AppArmor already generates + * profile at this point. IOW, we need all the paths filled in. */ + if (!(flags & VIR_QEMU_PROCESS_START_PRETEND)) { + VIR_DEBUG("Generating domain security label (if required)"); + if (qemuSecurityGenLabel(driver->securityManager, vm->def) < 0) { + virDomainAuditSecurityLabel(vm, false); + return -1; + } + virDomainAuditSecurityLabel(vm, true); + } + return 0; } Another alternative is to move profile generation into its separate step. So then we'd have: qemuProcessPrepareDomain(): 1) gen seclabel /* here only SELinux/DAC drivers would do something useful. For AppArmor it'd be a NOP. */ 2) generate all the paths 3) write profile /* only AppArmor would act upon, for SELinux and AppArmor it'd be nop. */ Let me see if that'd would fix the issue. Michal
Hi Michal, Thank you for your super quick response - including patches, no less!
Several months ago, I ran into issue #135 which says that Qemu under AppArmor can't access LVM volume disks.
The problem here is, that AppArmor driver in libvirt generates the profile in qemuSecurityGenLabel() phase. Long story short, starting up a libvirt domain is broken down into several steps. One of them is qemuProcessPrepareDomain() where domain XML is filled with (parts of) live data. In this specific case, disk source is translated (from storage pool/vol to a filepath). But it's not just that, various other paths are generated (e.g. socket paths, render nodes for graphics, etc.).
Thanks for this context. Yes, I was unaware of that situation and completely focussed at the volume paths. Taking the extra paths into account complicates the matter.
Then, the next step is qemuProcessPrepareHost() where those files from previous step are created, their seclabels are (possibly) set. This has a caveat though.
Initially, libvirt cared about SELinux only (leave traditional uid/gid perms (we call them DAC) aside for a brief moment). Here. an unique seclabel is generated in the domain prepare phase, so that any helper process that's used to create files in host prepare phase can run under that label (and thus be restricted).
AppArmor does not work this way. So when an aforementioned helper process wants to run (say /usr/bin/swtpm_setup) it already needs profile to be defined. As "obvious" workaround, AppArmor generates the profile in seclabel generation phase. But by that time path are not populated!
I saw your patches. From the patches I concluded that the security policy is strictly applicable to Qemu (apparently). That greatly simplifies the solution space, but with your proposed patches, I see that the learning curve would have been too steep for me to be able to come up with a solution this simple.
IMO, quick and dirty fix would be to generate seclabels at the end of qemuProcessPrepareDomain() instead of beginning (see below). The only downside of this is that seclabels won't be available upfront, e.g. if we want to copy them into a device specific seclabel (we do NOT do that though). diff --git i/src/qemu/qemu_process.c w/src/qemu/qemu_process.c index a53bb40783..00b9a732f3 100644 --- i/src/qemu/qemu_process.c +++ w/src/qemu/qemu_process.c @@ -7024,15 +7024,6 @@ qemuProcessPrepareDomain(virQEMUDriver *driver, return -1;
if (!(flags & VIR_QEMU_PROCESS_START_PRETEND)) { - /* If you are using a SecurityDriver with dynamic labelling, - then generate a security label for isolation */ - VIR_DEBUG("Generating domain security label (if required)"); - if (qemuSecurityGenLabel(driver->securityManager, vm->def) < 0) { - virDomainAuditSecurityLabel(vm, false); - return -1; - } - virDomainAuditSecurityLabel(vm, true); - if (qemuProcessPrepareDomainNUMAPlacement(vm) < 0) return -1; } @@ -7154,6 +7145,17 @@ qemuProcessPrepareDomain(virQEMUDriver *driver, } }
+ /* Keep this as the very last step because AppArmor already generates + * profile at this point. IOW, we need all the paths filled in. */ + if (!(flags & VIR_QEMU_PROCESS_START_PRETEND)) { + VIR_DEBUG("Generating domain security label (if required)"); + if (qemuSecurityGenLabel(driver->securityManager, vm->def) < 0) { + virDomainAuditSecurityLabel(vm, false); + return -1; + } + virDomainAuditSecurityLabel(vm, true); + } + return 0; }
Another alternative is to move profile generation into its separate step. So then we'd have:
qemuProcessPrepareDomain(): 1) gen seclabel /* here only SELinux/DAC drivers would do something useful. For AppArmor it'd be a NOP. */ 2) generate all the paths 3) write profile /* only AppArmor would act upon, for SELinux and AppArmor it'd be nop. */
Let me see if that'd would fix the issue.
From the fact that you submitted the patches, I think I understand the change above didn't work. Is there anything I can do to help this patch set? -- Bye, Erik. http://efficito.com -- Hosted accounting and ERP. Robust and Flexible. No vendor lock-in.
On 1/14/26 20:39, Erik Huelsmann wrote:
Hi Michal,
Thank you for your super quick response - including patches, no less!
Several months ago, I ran into issue #135 which says that Qemu under AppArmor can't access LVM volume disks.
The problem here is, that AppArmor driver in libvirt generates the profile in qemuSecurityGenLabel() phase. Long story short, starting up a libvirt domain is broken down into several steps. One of them is qemuProcessPrepareDomain() where domain XML is filled with (parts of) live data. In this specific case, disk source is translated (from storage pool/vol to a filepath). But it's not just that, various other paths are generated (e.g. socket paths, render nodes for graphics, etc.).
Thanks for this context. Yes, I was unaware of that situation and completely focussed at the volume paths. Taking the extra paths into account complicates the matter.
Then, the next step is qemuProcessPrepareHost() where those files from previous step are created, their seclabels are (possibly) set. This has a caveat though.
Initially, libvirt cared about SELinux only (leave traditional uid/gid perms (we call them DAC) aside for a brief moment). Here. an unique seclabel is generated in the domain prepare phase, so that any helper process that's used to create files in host prepare phase can run under that label (and thus be restricted).
AppArmor does not work this way. So when an aforementioned helper process wants to run (say /usr/bin/swtpm_setup) it already needs profile to be defined. As "obvious" workaround, AppArmor generates the profile in seclabel generation phase. But by that time path are not populated!
I saw your patches. From the patches I concluded that the security policy is strictly applicable to Qemu (apparently). That greatly simplifies the solution space, but with your proposed patches, I see that the learning curve would have been too steep for me to be able to come up with a solution this simple.
Yeah, learning curve is a bit steep, libvirt's a large project after all.
IMO, quick and dirty fix would be to generate seclabels at the end of qemuProcessPrepareDomain() instead of beginning (see below). The only downside of this is that seclabels won't be available upfront, e.g. if we want to copy them into a device specific seclabel (we do NOT do that though). diff --git i/src/qemu/qemu_process.c w/src/qemu/qemu_process.c index a53bb40783..00b9a732f3 100644 --- i/src/qemu/qemu_process.c +++ w/src/qemu/qemu_process.c @@ -7024,15 +7024,6 @@ qemuProcessPrepareDomain(virQEMUDriver *driver, return -1;
if (!(flags & VIR_QEMU_PROCESS_START_PRETEND)) { - /* If you are using a SecurityDriver with dynamic labelling, - then generate a security label for isolation */ - VIR_DEBUG("Generating domain security label (if required)"); - if (qemuSecurityGenLabel(driver->securityManager, vm->def) < 0) { - virDomainAuditSecurityLabel(vm, false); - return -1; - } - virDomainAuditSecurityLabel(vm, true); - if (qemuProcessPrepareDomainNUMAPlacement(vm) < 0) return -1; } @@ -7154,6 +7145,17 @@ qemuProcessPrepareDomain(virQEMUDriver *driver, } }
+ /* Keep this as the very last step because AppArmor already generates + * profile at this point. IOW, we need all the paths filled in. */ + if (!(flags & VIR_QEMU_PROCESS_START_PRETEND)) { + VIR_DEBUG("Generating domain security label (if required)"); + if (qemuSecurityGenLabel(driver->securityManager, vm->def) < 0) { + virDomainAuditSecurityLabel(vm, false); + return -1; + } + virDomainAuditSecurityLabel(vm, true); + } + return 0; }
Another alternative is to move profile generation into its separate step. So then we'd have:
qemuProcessPrepareDomain(): 1) gen seclabel /* here only SELinux/DAC drivers would do something useful. For AppArmor it'd be a NOP. */ 2) generate all the paths 3) write profile /* only AppArmor would act upon, for SELinux and AppArmor it'd be nop. */
Let me see if that'd would fix the issue.
From the fact that you submitted the patches, I think I understand the change above didn't work.
It did, but I'm not sure about all the implications of it. I mean, if there's something, hidden under a dozen of function calls from qemuProcessPrepareDomain() that expects seclabel to be generated, then moving the generation at the end is no good.
Is there anything I can do to help this patch set?
You can test the patch set and if it works you can reply to the cover letter with your Tested-by tag. Michal
On Thu, Jan 15, 2026 at 08:38:36 +0100, Michal Prívozník via Devel wrote:
On 1/14/26 20:39, Erik Huelsmann wrote:
Hi Michal,
[...]
Another alternative is to move profile generation into its separate step. So then we'd have:
qemuProcessPrepareDomain(): 1) gen seclabel /* here only SELinux/DAC drivers would do something useful. For AppArmor it'd be a NOP. */ 2) generate all the paths 3) write profile /* only AppArmor would act upon, for SELinux and AppArmor it'd be nop. */
Let me see if that'd would fix the issue.
From the fact that you submitted the patches, I think I understand the change above didn't work.
It did, but I'm not sure about all the implications of it. I mean, if there's something, hidden under a dozen of function calls from qemuProcessPrepareDomain() that expects seclabel to be generated, then moving the generation at the end is no good.
I'm not exactly sure how the apparmour profile is used in this case, because in qemuProcessPrepareDomain() the qemu process certainly is not around. What *can* be around are some various helper processes that are launched (e.g. 'numad' is invoked, some vhost-user backend binaries are invoked (at least for capability detection but I remember one case where some "useful" setup was done there too, at least historically)). So at the very least, with this move libvirt must be able to do the things above and hopefully I didn't overlook anything. That's the reason why I didn't put an R-b on those patches yet, because at least code-wise they look good to me.
Is there anything I can do to help this patch set?
You can test the patch set and if it works you can reply to the cover letter with your Tested-by tag.
Michal
On 1/15/26 08:49, Peter Krempa wrote:
On Thu, Jan 15, 2026 at 08:38:36 +0100, Michal Prívozník via Devel wrote:
On 1/14/26 20:39, Erik Huelsmann wrote:
Hi Michal,
[...]
Another alternative is to move profile generation into its separate step. So then we'd have:
qemuProcessPrepareDomain(): 1) gen seclabel /* here only SELinux/DAC drivers would do something useful. For AppArmor it'd be a NOP. */ 2) generate all the paths 3) write profile /* only AppArmor would act upon, for SELinux and AppArmor it'd be nop. */
Let me see if that'd would fix the issue.
From the fact that you submitted the patches, I think I understand the change above didn't work.
It did, but I'm not sure about all the implications of it. I mean, if there's something, hidden under a dozen of function calls from qemuProcessPrepareDomain() that expects seclabel to be generated, then moving the generation at the end is no good.
I'm not exactly sure how the apparmour profile is used in this case, because in qemuProcessPrepareDomain() the qemu process certainly is not around. What *can* be around are some various helper processes that are launched (e.g. 'numad' is invoked, some vhost-user backend binaries are invoked (at least for capability detection but I remember one case where some "useful" setup was done there too, at least historically)).
Arguably, that setup might be moved to host prep phase. BTW: I've checked and although we do run binaries, plain virCommandRun() is used, i.e. not qemuSecurityCommandRun() which would execute those binaries with already generated seclabel.
So at the very least, with this move libvirt must be able to do the things above and hopefully I didn't overlook anything.
Agreed.
That's the reason why I didn't put an R-b on those patches yet, because at least code-wise they look good to me.
Another idea I had to fix this issue is: with AppArmor libvirt generates basically two files (both under /etc/apparmor.d/libvirt/): 1) libvirt-$UUID 2) libvirt-$UUID.files where the first one basically just includes abstractions/libvirt-qemu AND libvirt-$UUID.files. The libvirt-$UUID.files then contains all those domain specific paths, like: /run/libvirt/qemu/swtpm/1-guest-swtpm.sock /var/lib/libvirt/qemu/domain-1-guest/fs0-fs.sock and so on (intentionally picked interesting paths to illustrate my point). Thing is, this list of paths is constructed in two ways: a) in qemuProcessPrepareDomain() during qemuSecurityGenLabel() from domain XML (we know this is problematic), b) in qemuProcessPrepareHost() where qemuSecurityDomainSetPathLabel() ends up executing virt-aa-helper in such fashion, that the domain XML on its input is ignored, but desired path is appended into the libvirt-$UUID.files. In the end, the libvirt-$UUID.files is a list of paths with mixed sources (some come from domain XML, others from runtime decisions). With that, it's impossible to keep the profile true. So what if we split the file? We'd have: 1) libvirt-$UUID 2) libvirt-$UUID.files 3) libvirt-$UUID.internal Now, libvirt-$UUID would just include the other two files, libvirt-$UUID.files would contain all the paths collected from domain XML and thus can be regenerated whenever domain XML changes, and finally, libvirt-$UUID.internal would then contain those specific, internal paths like /var/lib/libvirt/qemu/domain-1-guest/master-key.aes for instance. But this is much much bigger task. So let's just stick with my patches for now and if we ever run into problem, we can revisit this idea. Michal
On Thu, Jan 15, 2026 at 11:14:04 +0100, Michal Prívozník wrote:
On 1/15/26 08:49, Peter Krempa wrote:
On Thu, Jan 15, 2026 at 08:38:36 +0100, Michal Prívozník via Devel wrote:
On 1/14/26 20:39, Erik Huelsmann wrote:
Hi Michal,
[...]
Another idea I had to fix this issue is: with AppArmor libvirt generates basically two files (both under /etc/apparmor.d/libvirt/):
1) libvirt-$UUID 2) libvirt-$UUID.files
where the first one basically just includes abstractions/libvirt-qemu AND libvirt-$UUID.files. The libvirt-$UUID.files then contains all those domain specific paths, like:
/run/libvirt/qemu/swtpm/1-guest-swtpm.sock /var/lib/libvirt/qemu/domain-1-guest/fs0-fs.sock
and so on (intentionally picked interesting paths to illustrate my point). Thing is, this list of paths is constructed in two ways:
a) in qemuProcessPrepareDomain() during qemuSecurityGenLabel() from domain XML (we know this is problematic), b) in qemuProcessPrepareHost() where qemuSecurityDomainSetPathLabel() ends up executing virt-aa-helper in such fashion, that the domain XML on its input is ignored, but desired path is appended into the libvirt-$UUID.files. In the end, the libvirt-$UUID.files is a list of paths with mixed sources (some come from domain XML, others from runtime decisions). With that, it's impossible to keep the profile true. So what if we split the file? We'd have:
1) libvirt-$UUID 2) libvirt-$UUID.files 3) libvirt-$UUID.internal
Now, libvirt-$UUID would just include the other two files, libvirt-$UUID.files would contain all the paths collected from domain XML and thus can be regenerated whenever domain XML changes, and finally, libvirt-$UUID.internal would then contain those specific, internal paths like /var/lib/libvirt/qemu/domain-1-guest/master-key.aes for instance.
I've just reviewed some patches which tried to add some data to "status XML" for labelling with apparmor. One of the patches tried to store paths to the TAP devices we create. Now it was broken because it stored paths containing PID of virtqemud, which makes no sense because we close the FDs not long after and also changes if virtqemud restarts. Now while I think that qemu ought to have access to any FD we've passed (Thus just add everything in /proc/PID/fd/?) if more granular approach is neede we possibly could have an append-only file catching these volatile entries. Arguably it'd be less secure than with e.g. selinux where we label the FD itself and if it's closed the label vanishes, with an append-only file the FD would be always allowed. Anyways I agree that the above is not related to the case you're trying to solve here, since in this case we do store the resolved paths (from 'volume' -> path translation) internally.
On 1/15/26 11:24, Peter Krempa wrote:
On Thu, Jan 15, 2026 at 11:14:04 +0100, Michal Prívozník wrote:
On 1/15/26 08:49, Peter Krempa wrote:
On Thu, Jan 15, 2026 at 08:38:36 +0100, Michal Prívozník via Devel wrote:
On 1/14/26 20:39, Erik Huelsmann wrote:
Hi Michal,
[...]
Another idea I had to fix this issue is: with AppArmor libvirt generates basically two files (both under /etc/apparmor.d/libvirt/):
1) libvirt-$UUID 2) libvirt-$UUID.files
where the first one basically just includes abstractions/libvirt-qemu AND libvirt-$UUID.files. The libvirt-$UUID.files then contains all those domain specific paths, like:
/run/libvirt/qemu/swtpm/1-guest-swtpm.sock /var/lib/libvirt/qemu/domain-1-guest/fs0-fs.sock
and so on (intentionally picked interesting paths to illustrate my point). Thing is, this list of paths is constructed in two ways:
a) in qemuProcessPrepareDomain() during qemuSecurityGenLabel() from domain XML (we know this is problematic), b) in qemuProcessPrepareHost() where qemuSecurityDomainSetPathLabel() ends up executing virt-aa-helper in such fashion, that the domain XML on its input is ignored, but desired path is appended into the libvirt-$UUID.files. In the end, the libvirt-$UUID.files is a list of paths with mixed sources (some come from domain XML, others from runtime decisions). With that, it's impossible to keep the profile true. So what if we split the file? We'd have:
1) libvirt-$UUID 2) libvirt-$UUID.files 3) libvirt-$UUID.internal
Now, libvirt-$UUID would just include the other two files, libvirt-$UUID.files would contain all the paths collected from domain XML and thus can be regenerated whenever domain XML changes, and finally, libvirt-$UUID.internal would then contain those specific, internal paths like /var/lib/libvirt/qemu/domain-1-guest/master-key.aes for instance.
I've just reviewed some patches which tried to add some data to "status XML" for labelling with apparmor.
One of the patches tried to store paths to the TAP devices we create. Now it was broken because it stored paths containing PID of virtqemud, which makes no sense because we close the FDs not long after and also changes if virtqemud restarts.
I've seen those patches and I plan onto reviewing them, but mind you - virt-aa-helper is fed live XML, NOT status XML: static int load_profile(virSecurityManager *mgr G_GNUC_UNUSED, const char *profile, virDomainDef *def, const char *fn, bool append) { ... if (virDomainDefFormatInternal(def, NULL, &buf, VIR_DOMAIN_DEF_FORMAT_SECURE | VIR_DOMAIN_DEF_FORMAT_VOLUME_TRANSLATED) < 0) return -1; xml = virBufferContentAndReset(&buf); ... cmd = virCommandNewArgList(VIRT_AA_HELPER, create ? "-c" : "-r", "-u", profile, NULL); ... virCommandSetInputBuffer(cmd, xml); return virCommandRun(cmd, NULL); } Anyway, let's have this discussion there. Michal
participants (3)
-
Erik Huelsmann -
Michal Prívozník -
Peter Krempa