Hey Erik,
On 2/4/19 8:11 AM, Erik Skultety wrote:
On Fri, Feb 01, 2019 at 07:40:36PM -0200, Daniel Henrique Barboza wrote:
> Update: I've figured it out.
>
> The bug here was that, even running as root, I was getting errors like:
>
> error : virQEMUCapsNewForBinaryInternal:4687 : internal error: Failed to probe QEMU binary with QMP: libvirt: error : prctl failed to enable 'dac_override' in the AMBIENT set: Operation not permitted
Since I'm responsible for the latest changes wrt capabilities: this error is
very strange, because the prctl man page says the following about the EPERM errno:
"option is PR_CAP_AMBIENT and arg2 is PR_CAP_AMBIENT_RAISE, but either the
capability specified in arg3 is not present in the process's permitted and
inheritable capability sets, or the PR_CAP_AMBIENT_LOWER securebit has been
set."
So I'm wondering how that can be, since that prctl call happens after we have
applied the capabilities we want with capng_apply. Just out of curiosity, what
happens if you move the whole PR_CAP_AMBIENT block to the very end of the
virSetUIDGIDWithCaps function? Does it change anything?
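To make the man page condition concrete, here is a minimal sketch of the rule as a pure predicate over capability bitmasks. The helper names `ambient_raise_allowed` and `first_blocked_cap` are hypothetical (they are not libvirt or kernel APIs), the masks are assumed to use one bit per capability number in the same way as `capBits` in virSetUIDGIDWithCaps, and the securebit condition is deliberately not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models the EPERM rule quoted from prctl(2) for
 * PR_CAP_AMBIENT_RAISE. A capability can only be raised into the ambient
 * set if it is already present in BOTH the permitted and the inheritable
 * set. The securebit case is not modelled here. */
bool
ambient_raise_allowed(uint64_t permitted, uint64_t inheritable,
                      unsigned int cap)
{
    uint64_t bit;

    if (cap >= 64)
        return false;

    bit = 1ULL << cap;
    return (permitted & bit) && (inheritable & bit);
}

/* Hypothetical helper: returns the lowest capability number requested in
 * 'wanted' that could not be raised into the ambient set, or -1 if every
 * requested capability satisfies the rule above. */
int
first_blocked_cap(uint64_t wanted, uint64_t permitted, uint64_t inheritable)
{
    unsigned int i;

    for (i = 0; i < 64; i++) {
        if ((wanted & (1ULL << i)) &&
            !ambient_raise_allowed(permitted, inheritable, i))
            return (int)i;
    }
    return -1;
}
```

For instance, with dac_override (capability 1) permitted but missing from the inheritable set, `first_blocked_cap(1ULL << 1, 1ULL << 1, 0)` returns 1, which is the situation in which the kernel would answer PR_CAP_AMBIENT_RAISE with EPERM.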
Moving the code as you suggested got rid of the internal error:
--- a/src/util/virutil.c
+++ b/src/util/virutil.c
@@ -1587,27 +1587,6 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
         goto cleanup;
     }
-# ifdef PR_CAP_AMBIENT
-    /* we couldn't do this in the loop earlier above, because the capabilities
-     * were not applied yet, since in order to add a capability into the AMBIENT
-     * set, it has to be present in both the PERMITTED and INHERITABLE sets
-     * (capabilities(7))
-     */
-    for (i = 0; i <= CAP_LAST_CAP; i++) {
-        capstr = capng_capability_to_name(i);
-
-        if (capBits & (1ULL << i)) {
-            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
-                virReportSystemError(errno,
-                                     _("prctl failed to enable '%s' in the "
-                                       "AMBIENT set"),
-                                     capstr);
-                goto cleanup;
-            }
-        }
-    }
-# endif
-
     /* Set bounding set while we have CAP_SETPCAP. Unfortunately we cannot
      * do this if we failed to get the capability above, so ignore the
      * return value.
@@ -1630,6 +1609,27 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
         goto cleanup;
     }
+# ifdef PR_CAP_AMBIENT
+    /* we couldn't do this in the loop earlier above, because the capabilities
+     * were not applied yet, since in order to add a capability into the AMBIENT
+     * set, it has to be present in both the PERMITTED and INHERITABLE sets
+     * (capabilities(7))
+     */
+    for (i = 0; i <= CAP_LAST_CAP; i++) {
+        capstr = capng_capability_to_name(i);
+
+        if (capBits & (1ULL << i)) {
+            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
+                virReportSystemError(errno,
+                                     _("prctl failed to enable '%s' in the "
+                                       "AMBIENT set"),
+                                     capstr);
+                goto cleanup;
+            }
+        }
+    }
+# endif
+
However, the following change still doesn't add IPC_LOCK as a capability:
index 0d58f1ee57..f4b46abc08 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -4525,6 +4525,9 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
     /* QEMU might run into permission issues, e.g. /dev/sev (0600), override
      * them just for the purpose of probing */
     virCommandAllowCap(cmd->cmd, CAP_DAC_OVERRIDE);
+    virCommandAllowCap(cmd->cmd, CAP_IPC_LOCK);
+    virCommandAllowCap(cmd->cmd, CAP_IPC_OWNER);
+
 #endif
So I am not sure whether my mod above is wrong, or whether moving the
PR_CAP_AMBIENT code as you suggested made the error go away without actually
setting the capabilities at all. I'll investigate it more.
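One way to check whether the capabilities actually reach the spawned process is to look at the Cap* masks in /proc/<pid>/status. Below is a minimal sketch of that check; `parse_cap_line`, `read_cap_mask` and `mask_has_cap` are hypothetical helper names, and the value 14 for CAP_IPC_LOCK is taken from <linux/capability.h>.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CAP_IPC_LOCK_NR 14 /* CAP_IPC_LOCK as defined in <linux/capability.h> */

/* Hypothetical helper: parse one line of /proc/<pid>/status such as
 * "CapEff:\t0000003fffffffff". Returns true and stores the hex mask if
 * the line starts with 'key' followed by a colon. */
bool
parse_cap_line(const char *line, const char *key, uint64_t *mask)
{
    size_t len = strlen(key);

    if (strncmp(line, key, len) != 0 || line[len] != ':')
        return false;

    /* sscanf skips the tab that separates the key from the value */
    return sscanf(line + len + 1, "%" SCNx64, mask) == 1;
}

/* Hypothetical helper: read the mask named by 'key' (e.g. "CapEff")
 * from /proc/<pid>/status. Returns 0 on success, -1 on failure. */
int
read_cap_mask(long pid, const char *key, uint64_t *mask)
{
    char path[64];
    char line[256];
    FILE *fp;
    int ret = -1;

    snprintf(path, sizeof(path), "/proc/%ld/status", pid);
    if (!(fp = fopen(path, "r")))
        return -1;

    while (fgets(line, sizeof(line), fp)) {
        if (parse_cap_line(line, key, mask)) {
            ret = 0;
            break;
        }
    }

    fclose(fp);
    return ret;
}

/* Hypothetical helper: test a single capability bit in a mask. */
bool
mask_has_cap(uint64_t mask, unsigned int cap)
{
    return cap < 64 && (mask & (1ULL << cap)) != 0;
}
```

Calling `read_cap_mask(pid, "CapEff", &mask)` with the QEMU pid and then `mask_has_cap(mask, CAP_IPC_LOCK_NR)` would show whether ipc_lock is in the effective set; the same can be repeated for CapPrm, CapInh and CapAmb.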
DHB
Thanks,
Erik
> The reason is that the host has libcap-ng installed. ./configure uses it if
> available, setting WITH_CAPNG in the code. I am unsure if this has something
> to do with the libcap-ng configuration in this system I'm using or if there
> is something missing in the Libvirt code, but the spawned QEMU process isn't
> inheriting the capabilities it should have.
>
> Disabling support of this lib with "--with-capng=no" in autogen.sh and
> rebuilding Libvirt fixed the problem. I was even able to see more NUMA
> nodes than I was before using the system libvirt (which is the original
> bug I am/was investigating).
>
> Thanks!
>
> On 2/1/19 4:04 PM, Daniel Henrique Barboza wrote:
>> Hi,
>>
>> I'm facing a strange behavior when running Libvirt from source code,
>> latest upstream, on an Ubuntu 18.04.1 LTS Power 9 server. My QEMU
>> guest - which is using VFIO and GPU passthrough - breaks on boot when
>> trying to allocate a DMA window inside KVM.
>>
>> Debugging the code, I've found out that the problem is related to the
>> process not having CAP_IPC_LOCK - at least from the host kernel perspective.
>>
>> This is strange because:
>>
>> - the same VM running directly from QEMU command line works
>> - the same VM running in the system Libvirt (v4.0.0, Ubuntu version)
>> also works
>>
>> What am I missing? My understanding of Linux processes is that a process
>> running as root should inherit the same capabilities as the user, which
>> includes CAP_IPC_LOCK. Running Libvirt from source code should grant
>> ipc_lock to it ... right?
>>
>>
>>
>> Any help is appreciated. I can provide more details (VM XML for example)
>> if necessary.
>>
>>
>> Thanks!