Hi Erik,

Just to let you know that the error I reported in one of my replies was
being caused by one change I forgot to undo. This error here:


error : virQEMUCapsNewForBinaryInternal:4687 : internal error: Failed to
probe QEMU binary with QMP: libvirt:  error : prctl failed to enable
'dac_override' in the AMBIENT set: Operation not permitted


was happening because I had commented out this line inside
qemu_capabilities.c:

--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -4519,7 +4519,7 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
                                     "-daemonize",
                                     NULL);
     virCommandAddEnvPassCommon(cmd->cmd);
-    virCommandClearCaps(cmd->cmd);
+   // virCommandClearCaps(cmd->cmd);
 
 #if WITH_CAPNG
     /* QEMU might run into permission issues, e.g. /dev/sev (0600), override


Thus there is no need to move the PR_CAP_AMBIENT code around to prevent
the error message. Sorry for any false alarm I might have raised there.


I'm still experiencing the issue with IPC_LOCK inside the guest though. I'll update
here when I have concrete findings about it.


Thanks,

DHB

On 2/4/19 4:26 PM, Daniel Henrique Barboza wrote:
Hey Erik,


On 2/4/19 8:11 AM, Erik Skultety wrote:
On Fri, Feb 01, 2019 at 07:40:36PM -0200, Daniel Henrique Barboza wrote:
Update: I've figured it out.

The bug here was that, even running as root, I was getting errors like:

error : virQEMUCapsNewForBinaryInternal:4687 : internal error: Failed to
probe QEMU binary with QMP: libvirt:  error : prctl failed to enable
'dac_override' in the AMBIENT set: Operation not permitted

As the one responsible for the latest capabilities changes, I find this error
very strange, because the prctl man page says the following about the EPERM errno:

"option is PR_CAP_AMBIENT and arg2 is PR_CAP_AMBIENT_RAISE, but either the
capability specified in arg3 is not present in the process's permitted and
inheritable capability sets, or the PR_CAP_AMBIENT_LOWER securebit has been
set."

So I'm wondering how that can be, since that prctl call happens after we have
applied the capabilities we want with capng_apply. Just out of curiosity, what
happens if you move the whole PR_CAP_AMBIENT block to the very end of the
virSetUIDGIDWithCaps function? Does it change anything?
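
The EPERM condition quoted above can be checked directly before calling
prctl. The following is only a minimal sketch, not libvirt code: the helper
names cap_is_permitted_and_inheritable() and try_raise_ambient() are
hypothetical, and the sets are read with the raw capget syscall rather than
through libcap-ng.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/capability.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `cap` is present in both the PERMITTED
 * and INHERITABLE sets of the calling thread, 0 if it is missing from
 * either set, and -1 on an invalid argument or if capget fails. */
int cap_is_permitted_and_inheritable(int cap)
{
    struct __user_cap_header_struct hdr = { _LINUX_CAPABILITY_VERSION_3, 0 };
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];
    unsigned int idx, bit;

    if (cap < 0 || cap >= 32 * _LINUX_CAPABILITY_U32S_3)
        return -1;

    idx = (unsigned int)cap / 32;
    bit = 1U << ((unsigned int)cap % 32);

    if (syscall(SYS_capget, &hdr, data) < 0)
        return -1;

    return (data[idx].permitted & bit) && (data[idx].inheritable & bit);
}

/* Hypothetical helper: tries to raise `cap` in the AMBIENT set of the
 * calling thread. Returns 0 on success, or the errno value on failure. */
int try_raise_ambient(int cap)
{
    if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0) < 0)
        return errno;
    return 0;
}
```

Calling cap_is_permitted_and_inheritable(CAP_DAC_OVERRIDE) right before the
failing prctl would show whether the capability is really in both sets at
that point, which is the precondition the man page describes.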

Moving the code as you suggested got rid of the internal error:


--- a/src/util/virutil.c
+++ b/src/util/virutil.c
@@ -1587,27 +1587,6 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
         goto cleanup;
     }

-# ifdef PR_CAP_AMBIENT
-    /* we couldn't do this in the loop earlier above, because the capabilities
-     * were not applied yet, since in order to add a capability into the AMBIENT
-     * set, it has to be present in both the PERMITTED and INHERITABLE sets
-     * (capabilities(7))
-     */
-    for (i = 0; i <= CAP_LAST_CAP; i++) {
-        capstr = capng_capability_to_name(i);
-
-        if (capBits & (1ULL << i)) {
-            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
-                virReportSystemError(errno,
-                                     _("prctl failed to enable '%s' in the "
-                                       "AMBIENT set"),
-                                     capstr);
-                goto cleanup;
-            }
-        }
-    }
-# endif
-
     /* Set bounding set while we have CAP_SETPCAP.  Unfortunately we cannot
      * do this if we failed to get the capability above, so ignore the
      * return value.
@@ -1630,6 +1609,27 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
         goto cleanup;
     }

+# ifdef PR_CAP_AMBIENT
+    /* we couldn't do this in the loop earlier above, because the capabilities
+     * were not applied yet, since in order to add a capability into the AMBIENT
+     * set, it has to be present in both the PERMITTED and INHERITABLE sets
+     * (capabilities(7))
+     */
+    for (i = 0; i <= CAP_LAST_CAP; i++) {
+        capstr = capng_capability_to_name(i);
+
+        if (capBits & (1ULL << i)) {
+            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
+                virReportSystemError(errno,
+                                     _("prctl failed to enable '%s' in the "
+                                       "AMBIENT set"),
+                                     capstr);
+                goto cleanup;
+            }
+        }
+    }
+# endif
+




However, this code still doesn't add IPC_LOCK as a capability:


index 0d58f1ee57..f4b46abc08 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -4525,6 +4525,9 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
     /* QEMU might run into permission issues, e.g. /dev/sev (0600), override
      * them just for the purpose of probing */
     virCommandAllowCap(cmd->cmd, CAP_DAC_OVERRIDE);
+    virCommandAllowCap(cmd->cmd, CAP_IPC_LOCK);
+    virCommandAllowCap(cmd->cmd, CAP_IPC_OWNER);
+
 #endif



So I am not sure whether my modification above is wrong, or whether moving
the PR_CAP_AMBIENT code as you suggested made the error go away without
actually setting the capabilities at all. I'll investigate it more.



DHB



Thanks,
Erik

The reason is that the host has libcap-ng installed. ./configure uses it if
available, setting WITH_CAPNG in the code. I am unsure if this has something
to do with the libcap-ng configuration in this system I'm using or if there
is something missing in the Libvirt code, but the spawned QEMU process isn't
inheriting the capabilities it should have.

Disabling support for this lib by passing "--with-capng=no" to autogen.sh and
rebuilding Libvirt fixed the problem. I was even able to see more NUMA
nodes than I could before with the system libvirt (which is the original
bug I am/was investigating).


Thanks!





On 2/1/19 4:04 PM, Daniel Henrique Barboza wrote:
Hi,

I'm facing a strange behavior when running Libvirt from source code,
latest upstream, on an Ubuntu 18.04.1 LTS Power 9 server. My QEMU
guest - which is using VFIO and GPU passthrough - breaks on boot when
trying to allocate a DMA window inside KVM.

Debugging the code, I've found out that the problem is related to the process
not having CAP_IPC_LOCK - at least from the host kernel perspective.

This is strange because:

- the same VM running directly from QEMU command line works
- the same VM running in the system Libvirt (v4.0.0, Ubuntu version)
also works

What am I missing? My understanding of Linux processes is that a process
running as root should inherit the same capabilities as the user, which
includes CAP_IPC_LOCK. Running Libvirt from source code should grant
ipc_lock to it ... right?
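
One way to see which capabilities a running process actually holds, for
instance the spawned QEMU process, is to read the CapEff line of
/proc/<pid>/status. The following is only a rough sketch; the helper names
read_effective_caps() and mask_has_cap() are hypothetical and are not part
of libvirt or QEMU.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/capability.h>

/* Hypothetical helper: reads the effective capability mask (CapEff) of
 * `pid` from /proc/<pid>/status. Returns 0 on success, -1 on failure. */
int read_effective_caps(pid_t pid, unsigned long long *mask)
{
    char path[64];
    char line[256];
    FILE *fp;
    int found = -1;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    fp = fopen(path, "r");
    if (!fp)
        return -1;

    while (fgets(line, sizeof(line), fp)) {
        /* The line looks like "CapEff:<TAB><hex mask>" */
        if (sscanf(line, "CapEff: %llx", mask) == 1) {
            found = 0;
            break;
        }
    }

    fclose(fp);
    return found;
}

/* Hypothetical helper: returns 1 if bit `cap` is set in `mask`, else 0. */
int mask_has_cap(unsigned long long mask, int cap)
{
    return (int)((mask >> cap) & 1ULL);
}
```

With the PID of the QEMU process, read_effective_caps(pid, &mask) followed
by mask_has_cap(mask, CAP_IPC_LOCK) would tell whether the kernel sees
CAP_IPC_LOCK in the effective set of that process.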



Any help is appreciated. I can provide more details (VM XML for example)
if necessary.


Thanks!
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list