[PATCH 0/7] Polkit tty agent fixes

Apart from fixing bz 1945501 [0] there are some small changes/fixes to some of the polkit code. [0] https://bugzilla.redhat.com/show_bug.cgi?id=1945501 Martin Kletzander (7): virsh: Remove needless variable util: Tiny reword fix in comment util: Add virPolkitAgentAvailable virsh: Do not try connecting first time without polkit agent util: Report errors in all code paths in virPolkitAgentCreate util: Check for pkttyagent availability properly util: Make client-side polkit work even with polkit disabled src/libvirt_private.syms | 1 + src/util/virpolkit.c | 202 ++++++++++++++++++++++----------------- src/util/virpolkit.h | 1 + tools/virsh.c | 14 ++- 4 files changed, 127 insertions(+), 91 deletions(-) -- 2.34.0

It only redundantly reflects whether pkagent != NULL. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tools/virsh.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/tools/virsh.c b/tools/virsh.c index b9f3f851d3ec..da35c5c2b9c1 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -119,7 +119,6 @@ virshConnect(vshControl *ctl, const char *uri, bool readonly) bool keepalive_forced = false; virPolkitAgent *pkagent = NULL; int authfail = 0; - bool agentCreated = false; if (ctl->keepalive_interval >= 0) { interval = ctl->keepalive_interval; @@ -141,12 +140,11 @@ virshConnect(vshControl *ctl, const char *uri, bool readonly) goto cleanup; err = virGetLastError(); - if (!agentCreated && + if (!pkagent && err && err->domain == VIR_FROM_POLKIT && err->code == VIR_ERR_AUTH_UNAVAILABLE) { - if (!pkagent && !(pkagent = virPolkitAgentCreate())) + if (!(pkagent = virPolkitAgentCreate())) goto cleanup; - agentCreated = true; } else if (err && err->domain == VIR_FROM_POLKIT && err->code == VIR_ERR_AUTH_FAILED) { authfail++; -- 2.34.0

On a Sunday in 2021, Martin Kletzander wrote:
It only redundantly reflects whether pkagent != NULL.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tools/virsh.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Automatic "Ptr " -> " *" also wreaked havoc in comments. Fix it and while at it reword the sentence so it is clear that the object is newly allocated. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index e90b3b871d15..86255a96760f 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -168,7 +168,7 @@ virPolkitAgentDestroy(virPolkitAgent *agent) * * Allocate and setup a polkit agent * - * Returns a virCommand *on success and NULL on failure + * Returns newly allocated virPolkitAgent * on success and NULL on failure */ virPolkitAgent * virPolkitAgentCreate(void) -- 2.34.0

On a Sunday in 2021, Martin Kletzander wrote:
Automatic "Ptr " -> " *" also wreaked havoc in comments. Fix it and while at it reword the sentence so it is clear that the object is newly allocated.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

With this function we can decide whether to try running the polkit text agent only if it is available, removing a potential needless error saying that the agent binary does not exist, which is useful especially when running the agent before knowing whether it is going to be needed. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virpolkit.c | 44 ++++++++++++++++++++++++++++++++++++++++ src/util/virpolkit.h | 1 + 3 files changed, 46 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a7bc50a4d16d..c11be4eafa19 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -3078,6 +3078,7 @@ virPidFileWritePath; # util/virpolkit.h +virPolkitAgentAvailable; virPolkitAgentCreate; virPolkitAgentDestroy; virPolkitCheckAuth; diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 86255a96760f..3b333547d70b 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -20,6 +20,7 @@ */ #include <config.h> +#include <fcntl.h> #include <unistd.h> #include "virpolkit.h" @@ -217,6 +218,42 @@ virPolkitAgentCreate(void) } +/* + * virPolkitAgentAvailable + * + * This function does some preliminary checking that the pkttyagent does not + * fail starting so that it can be started without waiting for first failed + * connection with VIR_ERR_AUTH_UNAVAILABLE. + */ +bool +virPolkitAgentAvailable(void) +{ + const char *termid = ctermid(NULL); + VIR_AUTOCLOSE fd = -1; + + if (!virFileExists(PKTTYAGENT)) + return false; + + if (!termid) + return false; + + /* + *The pkttyagent needs to open the controlling terminal. + * + * Just in case we are running without a ctty make sure this open() does not + * change that. + * + * We could check if our session has a controlling terminal available + * instead, but it would require parsing `/proc/self/stat` on Linux, which + * is not portable and moreover requires way more work than just open(). + */ + fd = open(termid, O_RDWR | O_NOCTTY); + if (fd < 0) + return false; + + return true; +} + #else /* ! WITH_POLKIT */ int virPolkitCheckAuth(const char *actionid G_GNUC_UNUSED, @@ -247,4 +284,11 @@ virPolkitAgentCreate(void) _("polkit text authentication agent unavailable")); return NULL; } + +bool +virPolkitAgentAvailable(void) +{ + return false; +} + #endif /* WITH_POLKIT */ diff --git a/src/util/virpolkit.h b/src/util/virpolkit.h index a577d59452ba..7bcd040e5e06 100644 --- a/src/util/virpolkit.h +++ b/src/util/virpolkit.h @@ -37,3 +37,4 @@ typedef struct _virPolkitAgent virPolkitAgent; void virPolkitAgentDestroy(virPolkitAgent *cmd); virPolkitAgent *virPolkitAgentCreate(void); +bool virPolkitAgentAvailable(void); -- 2.34.0

On a Sunday in 2021, Martin Kletzander wrote:
With this function we can decide whether to try running the polkit text agent only if it is available, removing a potential needless error saying that the agent binary does not exist, which is useful especially when running the agent before knowing whether it is going to be needed.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virpolkit.c | 44 ++++++++++++++++++++++++++++++++++++++++ src/util/virpolkit.h | 1 + 3 files changed, 46 insertions(+)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On Mon, Nov 22, 2021 at 04:38:42PM +0100, Ján Tomko wrote:
On a Sunday in 2021, Martin Kletzander wrote:
With this function we can decide whether to try running the polkit text agent only if it is available, removing a potential needless error saying that the agent binary does not exist, which is useful especially when running the agent before knowing whether it is going to be needed.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virpolkit.c | 44 ++++++++++++++++++++++++++++++++++++++++ src/util/virpolkit.h | 1 + 3 files changed, 46 insertions(+)
Reviewed-by: Ján Tomko <jtomko@redhat.com>
I changed the virFileExists to virFileIsExecutable as mentioned in the thread for patch 7/7, so I'll amend it here. Hope that's fine, if not then take it as a trivial change ;)
Jano

Trying to connect once without a polkit agent will generate an error on the server side which seems too rough given it only serves the purpose of the client (virsh in this case) to figure out that an agent is needed. Thankfully we can just try running the agent. It does not break anything as we are running it with `--fallback`, which makes sure it does not replace an existing agent in case there is one already registered. The second piece of code trying to start the polkit text agent is kept in order to _really_ try out starting the agent (and error out when failing to do so) just in case the agent was not available the first time it was ran. Even though it should not happen it avoids a very rare race condition and really does not add much complexity. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1945501 Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tools/virsh.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/virsh.c b/tools/virsh.c index da35c5c2b9c1..5234a3decb22 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -129,6 +129,10 @@ virshConnect(vshControl *ctl, const char *uri, bool readonly) keepalive_forced = true; } + if (virPolkitAgentAvailable() && + !(pkagent = virPolkitAgentCreate())) + virResetLastError(); + do { virErrorPtr err; @@ -140,6 +144,10 @@ virshConnect(vshControl *ctl, const char *uri, bool readonly) goto cleanup; err = virGetLastError(); + /* + * If polkit agent failed starting the first time, then retry once more + * now when we know it really is needed. + */ if (!pkagent && err && err->domain == VIR_FROM_POLKIT && err->code == VIR_ERR_AUTH_UNAVAILABLE) { -- 2.34.0

On a Sunday in 2021, Martin Kletzander wrote:
Trying to connect once without a polkit agent will generate an error on the server side which seems too rough given it only serves the purpose of the client (virsh in this case) to figure out that an agent is needed. Thankfully we can just try running the agent. It does not break anything as we are running it with `--fallback`, which makes sure it does not replace an existing agent in case there is one already registered.
The second piece of code trying to start the polkit text agent is kept in order to _really_ try out starting the agent (and error out when failing to do so) just in case the agent was not available the first time it was ran. Even though it should not happen it avoids a very rare race condition and really does not add much complexity.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1945501
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tools/virsh.c | 8 ++++++++ 1 file changed, 8 insertions(+)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 3b333547d70b..63bb8133a8aa 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -180,8 +180,11 @@ virPolkitAgentCreate(void) int outfd = STDOUT_FILENO; int errfd = STDERR_FILENO; - if (!isatty(STDIN_FILENO)) + if (!isatty(STDIN_FILENO)) { + virReportError(VIR_ERR_SYSTEM_ERROR, "%s", + _("Cannot start polkit text agent without a tty")); goto error; + } if (virPipe(pipe_fd) < 0) goto error; @@ -205,8 +208,11 @@ virPolkitAgentCreate(void) pollfd.fd = pipe_fd[0]; pollfd.events = POLLHUP; - if (poll(&pollfd, 1, -1) < 0) + if (poll(&pollfd, 1, -1) < 0) { + virReportSystemError(errno, "%s", + _("error in poll call")); goto error; + } return agent; -- 2.34.0

On a Sunday in 2021, Martin Kletzander wrote:
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

It does not need a tty to work, it opens its controlling terminal for user interaction and with this patch even crazy things like this work: echo 'list --name' | virsh -q >/dev/null Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 63bb8133a8aa..7156adc10c0a 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -180,9 +180,9 @@ virPolkitAgentCreate(void) int outfd = STDOUT_FILENO; int errfd = STDERR_FILENO; - if (!isatty(STDIN_FILENO)) { + if (!virPolkitAgentAvailable()) { virReportError(VIR_ERR_SYSTEM_ERROR, "%s", - _("Cannot start polkit text agent without a tty")); + _("polkit text authentication agent unavailable")); goto error; } -- 2.34.0

On a Sunday in 2021, Martin Kletzander wrote:
It does not need a tty to work, it opens its controlling terminal for user interaction and with this patch even crazy things like this work:
echo 'list --name' | virsh -q >/dev/null
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Hi Martin! I recently received a bug report (sorry, not public) about simple operations like 'virsh list' hanging when invoked with an internal test tool. I found this commit to be the culprit. On 11/20/21 16:10, Martin Kletzander wrote:
It does not need a tty to work, it opens its controlling terminal for user interaction and with this patch even crazy things like this work:
echo 'list --name' | virsh -q >/dev/null
FYI, your crazy thing worked for me without this commit :-).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 63bb8133a8aa..7156adc10c0a 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -180,9 +180,9 @@ virPolkitAgentCreate(void) int outfd = STDOUT_FILENO; int errfd = STDERR_FILENO;
- if (!isatty(STDIN_FILENO)) {
With the test tool invoking virsh, isatty fails
+ if (!virPolkitAgentAvailable()) {
but virPolkitAgentAvailable succeeds. pkttyagent is then needlessly spawned with no one to talk to. I haven't been able to cook up a simple reproducer. Not sure if it helps, but here's a pstree view of the internal test tool sshd(15736)---bash(15739)---perl(17717) \ ---runtest(17722)---test.sh(17727) \ ---virsh(17728)-+-pkttyagent(17730)-+-{gdbus}(17732) | |-{gmain}(17731) | `-{pkttyagent}(17733) `-{vshEventLoop}(17729) I'm not familiar with the test tool but have cc'd Julie, who might be able to answer any questions about it. Regards, Jim
virReportError(VIR_ERR_SYSTEM_ERROR, "%s", - _("Cannot start polkit text agent without a tty")); + _("polkit text authentication agent unavailable")); goto error; }

On Fri, Dec 10, 2021 at 05:48:03PM -0700, Jim Fehlig wrote:
Hi Martin!
I recently received a bug report (sorry, not public) about simple operations like 'virsh list' hanging when invoked with an internal test tool. I found this commit to be the culprit.
On 11/20/21 16:10, Martin Kletzander wrote:
It does not need a tty to work, it opens its controlling terminal for user interaction and with this patch even crazy things like this work:
echo 'list --name' | virsh -q >/dev/null
FYI, your crazy thing worked for me without this commit :-).
With a polkit access driver, without a polkit agent already running, and without a polkit rule allowing access without authentication?
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 63bb8133a8aa..7156adc10c0a 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -180,9 +180,9 @@ virPolkitAgentCreate(void) int outfd = STDOUT_FILENO; int errfd = STDERR_FILENO;
- if (!isatty(STDIN_FILENO)) {
With the test tool invoking virsh, isatty fails
+ if (!virPolkitAgentAvailable()) {
but virPolkitAgentAvailable succeeds. pkttyagent is then needlessly spawned with no one to talk to.
Well, pkttyagent does not use stdin at all. It checks the controlling terminal and then opens that to be used for both stdout and stdin. That's why I tried to make it work.
I haven't been able to cook up a simple reproducer. Not sure if it helps, but here's a pstree view of the internal test tool
sshd(15736)---bash(15739)---perl(17717) \ ---runtest(17722)---test.sh(17727) \ ---virsh(17728)-+-pkttyagent(17730)-+-{gdbus}(17732) | |-{gmain}(17731) | `-{pkttyagent}(17733) `-{vshEventLoop}(17729)
I'm not familiar with the test tool but have cc'd Julie, who might be able to answer any questions about it.
The thing is that I got our test suite failing because we had the opposite problem. Meson is running tests in a new session without a controlling terminal, however with stdin normally accessible. Virsh tries to run it and it fails every time. I guess there is no harm in also checking if stdin is a tty, I'll send a patch for that. Thanks for letting me know!
Regards, Jim
virReportError(VIR_ERR_SYSTEM_ERROR, "%s", - _("Cannot start polkit text agent without a tty")); + _("polkit text authentication agent unavailable")); goto error; }

On Sat, Dec 11, 2021 at 11:16:13AM +0100, Martin Kletzander wrote:
On Fri, Dec 10, 2021 at 05:48:03PM -0700, Jim Fehlig wrote:
Hi Martin!
I recently received a bug report (sorry, not public) about simple operations like 'virsh list' hanging when invoked with an internal test tool. I found this commit to be the culprit.
OK, one more thing though, the fact that pkttyagent is spawned cannot cause virsh to hang. If the authentications is not required, then it will just wait there for a while and then be killed. If authentication *is* required, then either you already have an agent running and that one should be used since we're starting pkttyagent with `--fallback` or you do not have any agent running in which case virsh list would fail to connect. Where does the virsh hang, what's the backtrace? Anyway, if just adding: if (!isatty(STDIN_FILENO)) return false; to top of virPolkitAgentAvailable() solves your problem I do not have a particular issue with it. In any case I would like to better understand the issue as just the fact that we're running pkttyagent should not cause any issues. Have a nice day, Martin

My idea was that running pkttyagent unconditionally, modulo checks that pkttyagent itself does to make sure it does not fail, is not going to be an issue turned out to be wrong. Adding back the original check for stdin being a tty helps in some testing scenarios as reported by Jim Fehlig and does not really cause any issues. I originally wanted it in because it also made pkttyagent auth work with redirected input into virsh (with a connection that requires polkit authentication and without a session-wide polkit tty agent, basically making pkttyagent necessary to succeed). But anyone running virsh like that is asking for problems already anyway =) Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 8970de192fe1..bb68b5a59614 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -237,6 +237,9 @@ virPolkitAgentAvailable(void) const char *termid = ctermid(NULL); VIR_AUTOCLOSE fd = -1; + if (!isatty(STDIN_FILENO)) + return false; + if (!virFileIsExecutable(PKTTYAGENT)) return false; -- 2.34.1

On Sat, Dec 11, 2021 at 02:23:11PM +0100, Martin Kletzander wrote:
My idea was that running pkttyagent unconditionally, modulo checks that pkttyagent itself does to make sure it does not fail, is not going to be an issue turned out to be wrong. Adding back the original check for stdin being a tty helps in some testing scenarios as reported by Jim Fehlig and does not really cause any issues. I originally wanted it in because it also made pkttyagent auth work with redirected input into virsh (with a connection that requires polkit authentication and without a session-wide polkit tty agent, basically making pkttyagent necessary to succeed). But anyone running virsh like that is asking for problems already anyway =)
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> ---
Of course I added a note that this would be a patch that can go in if we figure out the real root cause, then sent it to a wrong address, then formatted it again properly, sent it to the right address ad forgot to add the note again 🤦
src/util/virpolkit.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 8970de192fe1..bb68b5a59614 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -237,6 +237,9 @@ virPolkitAgentAvailable(void) const char *termid = ctermid(NULL); VIR_AUTOCLOSE fd = -1;
+ if (!isatty(STDIN_FILENO)) + return false; + if (!virFileIsExecutable(PKTTYAGENT)) return false;
-- 2.34.1

On 12/11/21 14:23, Martin Kletzander wrote:
My idea was that running pkttyagent unconditionally, modulo checks that pkttyagent itself does to make sure it does not fail, is not going to be an issue turned out to be wrong. Adding back the original check for stdin being a tty helps in some testing scenarios as reported by Jim Fehlig and does not really cause any issues. I originally wanted it in because it also made pkttyagent auth work with redirected input into virsh (with a connection that requires polkit authentication and without a session-wide polkit tty agent, basically making pkttyagent necessary to succeed). But anyone running virsh like that is asking for problems already anyway =)
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 3 +++ 1 file changed, 3 insertions(+)
In case you want to push this: Reviewed-by: Michal Privoznik <mprivozn@redhat.com> But I'd really like to get to the bottom of this. Why does sleep() suspends for that long and why isn't polkit-agent killed? Michal

On 12/17/21 7:01 AM, Michal Prívozník wrote:
On 12/11/21 14:23, Martin Kletzander wrote:
My idea was that running pkttyagent unconditionally, modulo checks that pkttyagent itself does to make sure it does not fail, is not going to be an issue turned out to be wrong. Adding back the original check for stdin being a tty helps in some testing scenarios as reported by Jim Fehlig and does not really cause any issues. I originally wanted it in because it also made pkttyagent auth work with redirected input into virsh (with a connection that requires polkit authentication and without a session-wide polkit tty agent, basically making pkttyagent necessary to succeed). But anyone running virsh like that is asking for problems already anyway =)
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 3 +++ 1 file changed, 3 insertions(+)
In case you want to push this:
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
And Tested-by: Jim Fehlig <jfehlig@suse.com>
But I'd really like to get to the bottom of this. Why does sleep() suspends for that long and why isn't polkit-agent killed?
I got sidetracked before I could dig deeper, however I did notice the test process and all children were in a stopped state per /proc/<pid>/status. I sent SIGCONT to the processes and the test successfully completed. I'm baffled why these processes become stopped when pkttyagent is in the picture and work fine otherwise. Jim

On 12/17/21 9:25 AM, Jim Fehlig wrote:
On 12/17/21 7:01 AM, Michal Prívozník wrote:
On 12/11/21 14:23, Martin Kletzander wrote:
My idea was that running pkttyagent unconditionally, modulo checks that pkttyagent itself does to make sure it does not fail, is not going to be an issue turned out to be wrong. Adding back the original check for stdin being a tty helps in some testing scenarios as reported by Jim Fehlig and does not really cause any issues. I originally wanted it in because it also made pkttyagent auth work with redirected input into virsh (with a connection that requires polkit authentication and without a session-wide polkit tty agent, basically making pkttyagent necessary to succeed). But anyone running virsh like that is asking for problems already anyway =)
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 3 +++ 1 file changed, 3 insertions(+)
In case you want to push this:
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
And
Tested-by: Jim Fehlig <jfehlig@suse.com>
But I'd really like to get to the bottom of this. Why does sleep() suspends for that long and why isn't polkit-agent killed?
I got sidetracked before I could dig deeper, however I did notice the test process and all children were in a stopped state per /proc/<pid>/status. I sent SIGCONT to the processes and the test successfully completed. I'm baffled why these processes become stopped when pkttyagent is in the picture and work fine otherwise.
I forgot to mention, it's likely a problem in the test framework code, which is apparently unmaintained by still used by SUSE QA. "I got sidetracked" is a nice way to say I didn't have a lot of motivation to read through a gob of old, unfamiliar perl code :-). Jim

On Fri, Dec 17, 2021 at 09:33:53AM -0700, Jim Fehlig wrote:
On 12/17/21 9:25 AM, Jim Fehlig wrote:
On 12/17/21 7:01 AM, Michal Prívozník wrote:
On 12/11/21 14:23, Martin Kletzander wrote:
My idea was that running pkttyagent unconditionally, modulo checks that pkttyagent itself does to make sure it does not fail, is not going to be an issue turned out to be wrong. Adding back the original check for stdin being a tty helps in some testing scenarios as reported by Jim Fehlig and does not really cause any issues. I originally wanted it in because it also made pkttyagent auth work with redirected input into virsh (with a connection that requires polkit authentication and without a session-wide polkit tty agent, basically making pkttyagent necessary to succeed). But anyone running virsh like that is asking for problems already anyway =)
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 3 +++ 1 file changed, 3 insertions(+)
In case you want to push this:
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
And
Tested-by: Jim Fehlig <jfehlig@suse.com>
But I'd really like to get to the bottom of this. Why does sleep() suspends for that long and why isn't polkit-agent killed?
I got sidetracked before I could dig deeper, however I did notice the test process and all children were in a stopped state per /proc/<pid>/status. I sent SIGCONT to the processes and the test successfully completed. I'm baffled why these processes become stopped when pkttyagent is in the picture and work fine otherwise.
That's interesting. So let's assume it is stopped because of a signal. By default it can be one of: SIGSTOP P1990 Stop Stop process SIGTSTP P1990 Stop Stop typed at terminal SIGTTIN P1990 Stop Terminal input for background process SIGTTOU P1990 Stop Terminal output for background process as those are the only ones that have default action set to Stop. Looking at pkttyagent sources it does not look like it is sending its parent any signals (looked for getppid and kill calls) and I presume your test suite will not do that either, why would it. So let's say we can eliminate the first two. What we're left with is SIGTTIN and SIGTTOU which should be sent to us if we are trying to do I/O on the terminal. If pkttyagent was to ask for anything then it would be done on the controlling terminal even if standard input, output, and error output are redirected. But that (the agent doing I/O on the terminal) should only happen if it is really needed (not if there's an error for example, that goes to stderr). And if that happened then it should not be sent to the whole process group, should it? I would suggest you install a handler in virsh and get some debug output for some of the signals (sigaction with SA_SIGINFO as siginfo_t hlready has enough information to further debug the issue, mainly the caller. Unfortunately that would not help for SIGSTOP as that cannot be used for sigaction(2), but maybe we could at least get somewhere? There's still hope!
I forgot to mention, it's likely a problem in the test framework code, which is apparently unmaintained by still used by SUSE QA. "I got sidetracked" is a nice way to say I didn't have a lot of motivation to read through a gob of old, unfamiliar perl code :-).
But libvirt-tck is still maintained O:-) =D Anyway, looking at kill() and SIGSTOP references in that code might at least give a clue to whether it is even possible.
Jim

On 12/11/21 03:28, Martin Kletzander wrote:
On Sat, Dec 11, 2021 at 11:16:13AM +0100, Martin Kletzander wrote:
On Fri, Dec 10, 2021 at 05:48:03PM -0700, Jim Fehlig wrote:
Hi Martin!
I recently received a bug report (sorry, not public) about simple operations like 'virsh list' hanging when invoked with an internal test tool. I found this commit to be the culprit.
OK, one more thing though, the fact that pkttyagent is spawned cannot cause virsh to hang. If the authentications is not required, then it will just wait there for a while and then be killed. If authentication *is* required, then either you already have an agent running and that one should be used since we're starting pkttyagent with `--fallback` or you do not have any agent running in which case virsh list would fail to connect. Where does the virsh hang, what's the backtrace?
The last scenario you describe appears to be the case. virsh fails to connect then gets stuck trying to kill off pkttyagent #0 0x00007f9f07530241 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6 #1 0x00007f9f07535ad3 in nanosleep () from /lib64/libc.so.6 #2 0x00007f9f07f478af in g_usleep () from /usr/lib64/libglib-2.0.so.0 #3 0x00007f9f086694fa in virProcessAbort (pid=367) at ../src/util/virprocess.c:187 #4 0x00007f9f0861ed9b in virCommandAbort (cmd=cmd@entry=0x55a798660c50) at ../src/util/vircommand.c:2774 #5 0x00007f9f08621478 in virCommandFree (cmd=0x55a798660c50) at ../src/util/vircommand.c:3061 #6 0x00007f9f08668581 in virPolkitAgentDestroy (agent=0x55a7986426e0) at ../src/util/virpolkit.c:164 #7 0x000055a797836d93 in virshConnect (ctl=ctl@entry=0x7ffc551dd980, uri=0x0, readonly=readonly@entry=false) at ../tools/virsh.c:187 #8 0x000055a797837007 in virshReconnect (ctl=ctl@entry=0x7ffc551dd980, name=name@entry=0x0, readonly=<optimized out>, readonly@entry=false, force=force@entry=false) at ../tools/virsh.c:223 #9 0x000055a7978371e0 in virshConnectionHandler (ctl=0x7ffc551dd980) at ../tools/virsh.c:325 #10 0x000055a797880172 in vshCommandRun (ctl=ctl@entry=0x7ffc551dd980, cmd=0x55a79865f580) at ../tools/vsh.c:1308 #11 0x000055a7978367b7 in main (argc=2, argv=<optimized out>) at ../tools/virsh.c:907 Odd thing is, I attached gdb to this virsh process several minutes after invoking the test tool that calls 'virsh list'. I can't explain why the process is still blocked in g_usleep, which should only have slept for 10 milliseconds. Even odder, detaching from the process appears to awaken g_usleep and allows process shutdown to continue. The oddness can also be seen in the debug output 2021-12-12 16:35:38.783+0000: 5912: debug : virCommandRunAsync:2629 : About to run /usr/bin/pkttyagent --process 5912 --notify-fd 4 --fallback 2021-12-12 16:35:38.787+0000: 5912: debug : virCommandRunAsync:2632 : Command result 0, with PID 5914 ... 2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:177 : aborting child process 5914 2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:185 : trying SIGTERM to child process 5914 Attach gdb to the process, observe above backtrace, quit gdb. 2021-12-12 16:44:18.059+0000: 5912: debug : virProcessAbort:195 : trying SIGKILL to child process 5914 2021-12-12 16:44:18.061+0000: 5912: debug : virProcessAbort:201 : process has ended: fatal signal 9
Anyway, if just adding:
if (!isatty(STDIN_FILENO)) return false;
This indeed fixes the regression in the test tool.
to top of virPolkitAgentAvailable() solves your problem I do not have a particular issue with it. In any case I would like to better understand the issue as just the fact that we're running pkttyagent should not cause any issues.
Given the above observations, I'm having a difficult time articulating the root cause :-). Regards, Jim

On Sun, Dec 12, 2021 at 10:40:46AM -0700, Jim Fehlig wrote:
On 12/11/21 03:28, Martin Kletzander wrote:
On Sat, Dec 11, 2021 at 11:16:13AM +0100, Martin Kletzander wrote:
On Fri, Dec 10, 2021 at 05:48:03PM -0700, Jim Fehlig wrote:
Hi Martin!
I recently received a bug report (sorry, not public) about simple operations like 'virsh list' hanging when invoked with an internal test tool. I found this commit to be the culprit.
OK, one more thing though, the fact that pkttyagent is spawned cannot cause virsh to hang. If the authentications is not required, then it will just wait there for a while and then be killed. If authentication *is* required, then either you already have an agent running and that one should be used since we're starting pkttyagent with `--fallback` or you do not have any agent running in which case virsh list would fail to connect. Where does the virsh hang, what's the backtrace?
The last scenario you describe appears to be the case. virsh fails to connect then gets stuck trying to kill off pkttyagent
#0 0x00007f9f07530241 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6 #1 0x00007f9f07535ad3 in nanosleep () from /lib64/libc.so.6 #2 0x00007f9f07f478af in g_usleep () from /usr/lib64/libglib-2.0.so.0 #3 0x00007f9f086694fa in virProcessAbort (pid=367) at ../src/util/virprocess.c:187 #4 0x00007f9f0861ed9b in virCommandAbort (cmd=cmd@entry=0x55a798660c50) at ../src/util/vircommand.c:2774 #5 0x00007f9f08621478 in virCommandFree (cmd=0x55a798660c50) at ../src/util/vircommand.c:3061 #6 0x00007f9f08668581 in virPolkitAgentDestroy (agent=0x55a7986426e0) at ../src/util/virpolkit.c:164 #7 0x000055a797836d93 in virshConnect (ctl=ctl@entry=0x7ffc551dd980, uri=0x0, readonly=readonly@entry=false) at ../tools/virsh.c:187 #8 0x000055a797837007 in virshReconnect (ctl=ctl@entry=0x7ffc551dd980, name=name@entry=0x0, readonly=<optimized out>, readonly@entry=false, force=force@entry=false) at ../tools/virsh.c:223 #9 0x000055a7978371e0 in virshConnectionHandler (ctl=0x7ffc551dd980) at ../tools/virsh.c:325 #10 0x000055a797880172 in vshCommandRun (ctl=ctl@entry=0x7ffc551dd980, cmd=0x55a79865f580) at ../tools/vsh.c:1308 #11 0x000055a7978367b7 in main (argc=2, argv=<optimized out>) at ../tools/virsh.c:907
Odd thing is, I attached gdb to this virsh process several minutes after invoking the test tool that calls 'virsh list'. I can't explain why the process is still blocked in g_usleep, which should only have slept for 10 milliseconds. Even odder, detaching from the process appears to awaken g_usleep and allows process shutdown to continue. The oddness can also be seen in the debug output
2021-12-12 16:35:38.783+0000: 5912: debug : virCommandRunAsync:2629 : About to run /usr/bin/pkttyagent --process 5912 --notify-fd 4 --fallback
2021-12-12 16:35:38.787+0000: 5912: debug : virCommandRunAsync:2632 : Command result 0, with PID 5914
... 2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:177 : aborting child process 5914
2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:185 : trying SIGTERM to child process 5914
Attach gdb to the process, observe above backtrace, quit gdb.
2021-12-12 16:44:18.059+0000: 5912: debug : virProcessAbort:195 : trying SIGKILL to child process 5914
2021-12-12 16:44:18.061+0000: 5912: debug : virProcessAbort:201 : process has ended: fatal signal 9
The stuck g_usleep() is weird. Isn't there a tremendous load on the machine? I can't think of much else. Also I am looking at the pkttyagent code and it looks like it blocks the first SIGTERM and sending two of them should help in that case, but if we want to wait for few ms between them, than we;ll be in the same pickle.
Anyway, if just adding:
if (!isatty(STDIN_FILENO)) return false;
This indeed fixes the regression in the test tool.
That just means that it won't start the agent. Let's do this, but I would really, *really* like to figure out what the heck is happening there, because there has to be something wrong and it might just be waiting around the corner for us and bite us in the back in a year or so. Although I understand how improbable that is.
to top of virPolkitAgentAvailable() solves your problem I do not have a particular issue with it. In any case I would like to better understand the issue as just the fact that we're running pkttyagent should not cause any issues.
Given the above observations, I'm having a difficult time articulating the root cause :-).
Regards, Jim

On 12/12/21 14:15, Martin Kletzander wrote:
On Sun, Dec 12, 2021 at 10:40:46AM -0700, Jim Fehlig wrote:
On 12/11/21 03:28, Martin Kletzander wrote:
On Sat, Dec 11, 2021 at 11:16:13AM +0100, Martin Kletzander wrote:
On Fri, Dec 10, 2021 at 05:48:03PM -0700, Jim Fehlig wrote:
Hi Martin!
I recently received a bug report (sorry, not public) about simple operations like 'virsh list' hanging when invoked with an internal test tool. I found this commit to be the culprit.
OK, one more thing though, the fact that pkttyagent is spawned cannot cause virsh to hang. If the authentications is not required, then it will just wait there for a while and then be killed. If authentication *is* required, then either you already have an agent running and that one should be used since we're starting pkttyagent with `--fallback` or you do not have any agent running in which case virsh list would fail to connect. Where does the virsh hang, what's the backtrace?
The last scenario you describe appears to be the case. virsh fails to connect then gets stuck trying to kill off pkttyagent
#0 0x00007f9f07530241 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6 #1 0x00007f9f07535ad3 in nanosleep () from /lib64/libc.so.6 #2 0x00007f9f07f478af in g_usleep () from /usr/lib64/libglib-2.0.so.0 #3 0x00007f9f086694fa in virProcessAbort (pid=367) at ../src/util/virprocess.c:187 #4 0x00007f9f0861ed9b in virCommandAbort (cmd=cmd@entry=0x55a798660c50) at ../src/util/vircommand.c:2774 #5 0x00007f9f08621478 in virCommandFree (cmd=0x55a798660c50) at ../src/util/vircommand.c:3061 #6 0x00007f9f08668581 in virPolkitAgentDestroy (agent=0x55a7986426e0) at ../src/util/virpolkit.c:164 #7 0x000055a797836d93 in virshConnect (ctl=ctl@entry=0x7ffc551dd980, uri=0x0, readonly=readonly@entry=false) at ../tools/virsh.c:187 #8 0x000055a797837007 in virshReconnect (ctl=ctl@entry=0x7ffc551dd980, name=name@entry=0x0, readonly=<optimized out>, readonly@entry=false, force=force@entry=false) at ../tools/virsh.c:223 #9 0x000055a7978371e0 in virshConnectionHandler (ctl=0x7ffc551dd980) at ../tools/virsh.c:325 #10 0x000055a797880172 in vshCommandRun (ctl=ctl@entry=0x7ffc551dd980, cmd=0x55a79865f580) at ../tools/vsh.c:1308 #11 0x000055a7978367b7 in main (argc=2, argv=<optimized out>) at ../tools/virsh.c:907
Odd thing is, I attached gdb to this virsh process several minutes after invoking the test tool that calls 'virsh list'. I can't explain why the process is still blocked in g_usleep, which should only have slept for 10 milliseconds. Even odder, detaching from the process appears to awaken g_usleep and allows process shutdown to continue. The oddness can also be seen in the debug output
2021-12-12 16:35:38.783+0000: 5912: debug : virCommandRunAsync:2629 : About to run /usr/bin/pkttyagent --process 5912 --notify-fd 4 --fallback
2021-12-12 16:35:38.787+0000: 5912: debug : virCommandRunAsync:2632 : Command result 0, with PID 5914
... 2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:177 : aborting child process 5914
2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:185 : trying SIGTERM to child process 5914
Attach gdb to the process, observe above backtrace, quit gdb.
2021-12-12 16:44:18.059+0000: 5912: debug : virProcessAbort:195 : trying SIGKILL to child process 5914
2021-12-12 16:44:18.061+0000: 5912: debug : virProcessAbort:201 : process has ended: fatal signal 9
The stuck g_usleep() is weird. Isn't there a tremendous load on the machine? I can't think of much else.
The machine is mostly idle.
Also I am looking at the pkttyagent code and it looks like it blocks the first SIGTERM and sending two of them should help in that case, but if we want to wait for few ms between them, than we;ll be in the same pickle.
Anyway, if just adding:
if (!isatty(STDIN_FILENO)) return false;
This indeed fixes the regression in the test tool.
That just means that it won't start the agent. Let's do this, but I would really, *really* like to figure out what the heck is happening there, because there has to be something wrong and it might just be waiting around the corner for us and bite us in the back in a year or so. Although I understand how improbable that is.
Do you have additional suggestions that may help us gain a better understanding of the problem? Regards, Jim

On Mon, Dec 13, 2021 at 09:42:04AM -0700, Jim Fehlig wrote:
On 12/12/21 14:15, Martin Kletzander wrote:
On Sun, Dec 12, 2021 at 10:40:46AM -0700, Jim Fehlig wrote:
On 12/11/21 03:28, Martin Kletzander wrote:
On Sat, Dec 11, 2021 at 11:16:13AM +0100, Martin Kletzander wrote:
On Fri, Dec 10, 2021 at 05:48:03PM -0700, Jim Fehlig wrote:
Hi Martin!
I recently received a bug report (sorry, not public) about simple operations like 'virsh list' hanging when invoked with an internal test tool. I found this commit to be the culprit.
OK, one more thing though, the fact that pkttyagent is spawned cannot cause virsh to hang. If the authentications is not required, then it will just wait there for a while and then be killed. If authentication *is* required, then either you already have an agent running and that one should be used since we're starting pkttyagent with `--fallback` or you do not have any agent running in which case virsh list would fail to connect. Where does the virsh hang, what's the backtrace?
The last scenario you describe appears to be the case. virsh fails to connect then gets stuck trying to kill off pkttyagent
#0 0x00007f9f07530241 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6 #1 0x00007f9f07535ad3 in nanosleep () from /lib64/libc.so.6 #2 0x00007f9f07f478af in g_usleep () from /usr/lib64/libglib-2.0.so.0 #3 0x00007f9f086694fa in virProcessAbort (pid=367) at ../src/util/virprocess.c:187 #4 0x00007f9f0861ed9b in virCommandAbort (cmd=cmd@entry=0x55a798660c50) at ../src/util/vircommand.c:2774 #5 0x00007f9f08621478 in virCommandFree (cmd=0x55a798660c50) at ../src/util/vircommand.c:3061 #6 0x00007f9f08668581 in virPolkitAgentDestroy (agent=0x55a7986426e0) at ../src/util/virpolkit.c:164 #7 0x000055a797836d93 in virshConnect (ctl=ctl@entry=0x7ffc551dd980, uri=0x0, readonly=readonly@entry=false) at ../tools/virsh.c:187 #8 0x000055a797837007 in virshReconnect (ctl=ctl@entry=0x7ffc551dd980, name=name@entry=0x0, readonly=<optimized out>, readonly@entry=false, force=force@entry=false) at ../tools/virsh.c:223 #9 0x000055a7978371e0 in virshConnectionHandler (ctl=0x7ffc551dd980) at ../tools/virsh.c:325 #10 0x000055a797880172 in vshCommandRun (ctl=ctl@entry=0x7ffc551dd980, cmd=0x55a79865f580) at ../tools/vsh.c:1308 #11 0x000055a7978367b7 in main (argc=2, argv=<optimized out>) at ../tools/virsh.c:907
Odd thing is, I attached gdb to this virsh process several minutes after invoking the test tool that calls 'virsh list'. I can't explain why the process is still blocked in g_usleep, which should only have slept for 10 milliseconds. Even odder, detaching from the process appears to awaken g_usleep and allows process shutdown to continue. The oddness can also be seen in the debug output
2021-12-12 16:35:38.783+0000: 5912: debug : virCommandRunAsync:2629 : About to run /usr/bin/pkttyagent --process 5912 --notify-fd 4 --fallback
2021-12-12 16:35:38.787+0000: 5912: debug : virCommandRunAsync:2632 : Command result 0, with PID 5914
... 2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:177 : aborting child process 5914
2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:185 : trying SIGTERM to child process 5914
Attach gdb to the process, observe above backtrace, quit gdb.
2021-12-12 16:44:18.059+0000: 5912: debug : virProcessAbort:195 : trying SIGKILL to child process 5914
2021-12-12 16:44:18.061+0000: 5912: debug : virProcessAbort:201 : process has ended: fatal signal 9
The stuck g_usleep() is weird. Isn't there a tremendous load on the machine? I can't think of much else.
The machine is mostly idle.
Also I am looking at the pkttyagent code and it looks like it blocks the first SIGTERM and sending two of them should help in that case, but if we want to wait for few ms between them, than we;ll be in the same pickle.
Anyway, if just adding:
if (!isatty(STDIN_FILENO)) return false;
This indeed fixes the regression in the test tool.
That just means that it won't start the agent. Let's do this, but I would really, *really* like to figure out what the heck is happening there, because there has to be something wrong and it might just be waiting around the corner for us and bite us in the back in a year or so. Although I understand how improbable that is.
Do you have additional suggestions that may help us gain a better understanding of the problem?
Unfortunately I cannot think of anything else than trying to figure out whether the kernel is scheduling that task at all and if not then why. How to do that however is unfortunately out of my territory :(
Regards, Jim

The reason for this is twofold: - the polkit build option is documented for UNIX socket access checks - there is no server-side change or dbus call done when enabling this as it only starts a polkit agent on the client-side (actually only in virsh) and does not need any requirements (starting is skipped if pkttyagent is not installed) Also move the conditional implementation to the bottom of the file so that it does not look like the whole file is build conditionally and the common functions are at the top. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 240 ++++++++++++++++++++----------------------- 1 file changed, 109 insertions(+), 131 deletions(-) diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 7156adc10c0a..b51104100796 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -21,6 +21,7 @@ #include <config.h> #include <fcntl.h> +#include <poll.h> #include <unistd.h> #include "virpolkit.h" @@ -37,119 +38,10 @@ VIR_LOG_INIT("util.polkit"); -#if WITH_POLKIT -# include <poll.h> - struct _virPolkitAgent { virCommand *cmd; }; -/* - * virPolkitCheckAuth: - * @actionid: permission to check - * @pid: client process ID - * @startTime: process start time, or 0 - * @uid: client process user ID - * @details: NULL terminated (key, value) pair list - * @allowInteraction: true if auth prompts are allowed - * - * Check if a client is authenticated with polkit - * - * Returns 0 on success, -1 on failure, -2 on auth denied - */ -int virPolkitCheckAuth(const char *actionid, - pid_t pid, - unsigned long long startTime, - uid_t uid, - const char **details, - bool allowInteraction) -{ - GDBusConnection *sysbus; - GVariantBuilder builder; - GVariant *gprocess = NULL; - GVariant *gdetails = NULL; - g_autoptr(GVariant) message = NULL; - g_autoptr(GVariant) reply = NULL; - g_autoptr(GVariantIter) iter = NULL; - char *retkey; - char *retval; - gboolean is_authorized; - gboolean is_challenge; - bool is_dismissed = false; - const char **next; - - if (!(sysbus = virGDBusGetSystemBus())) - return -1; - - VIR_INFO("Checking PID %lld running as %d", - (long long) pid, uid); - - g_variant_builder_init(&builder, G_VARIANT_TYPE("a{sv}")); - g_variant_builder_add(&builder, "{sv}", "pid", g_variant_new_uint32(pid)); - g_variant_builder_add(&builder, "{sv}", "start-time", g_variant_new_uint64(startTime)); - g_variant_builder_add(&builder, "{sv}", "uid", g_variant_new_int32(uid)); - gprocess = g_variant_builder_end(&builder); - - g_variant_builder_init(&builder, G_VARIANT_TYPE("a{ss}")); - - if (details) { - for (next = details; *next; next++) { - const char *detail1 = *(next++); - const char *detail2 = *next; - g_variant_builder_add(&builder, "{ss}", detail1, detail2); - } - } - - gdetails = g_variant_builder_end(&builder); - - message = g_variant_new("((s@a{sv})s@a{ss}us)", - "unix-process", - gprocess, - actionid, - gdetails, - allowInteraction, - "" /* cancellation ID */); - - if (virGDBusCallMethod(sysbus, - &reply, - G_VARIANT_TYPE("((bba{ss}))"), - NULL, - "org.freedesktop.PolicyKit1", - "/org/freedesktop/PolicyKit1/Authority", - "org.freedesktop.PolicyKit1.Authority", - "CheckAuthorization", - message) < 0) - return -1; - - g_variant_get(reply, "((bba{ss}))", &is_authorized, &is_challenge, &iter); - - while (g_variant_iter_loop(iter, "{ss}", &retkey, &retval)) { - if (STREQ(retkey, "polkit.dismissed") && STREQ(retval, "true")) - is_dismissed = true; - } - - VIR_DEBUG("is auth %d is challenge %d", - is_authorized, is_challenge); - - if (is_authorized) - return 0; - - if (is_dismissed) { - virReportError(VIR_ERR_AUTH_CANCELLED, "%s", - _("user cancelled authentication process")); - } else if (is_challenge) { - virReportError(VIR_ERR_AUTH_UNAVAILABLE, - _("no polkit agent available to authenticate action '%s'"), - actionid); - } else { - virReportError(VIR_ERR_AUTH_FAILED, "%s", - _("access denied by policy")); - } - - return -2; -} - - /* virPolkitAgentDestroy: * @cmd: Pointer to the virCommand * created during virPolkitAgentCreate * @@ -260,6 +152,114 @@ virPolkitAgentAvailable(void) return true; } + +#if WITH_POLKIT + +/* + * virPolkitCheckAuth: + * @actionid: permission to check + * @pid: client process ID + * @startTime: process start time, or 0 + * @uid: client process user ID + * @details: NULL terminated (key, value) pair list + * @allowInteraction: true if auth prompts are allowed + * + * Check if a client is authenticated with polkit + * + * Returns 0 on success, -1 on failure, -2 on auth denied + */ +int virPolkitCheckAuth(const char *actionid, + pid_t pid, + unsigned long long startTime, + uid_t uid, + const char **details, + bool allowInteraction) +{ + GDBusConnection *sysbus; + GVariantBuilder builder; + GVariant *gprocess = NULL; + GVariant *gdetails = NULL; + g_autoptr(GVariant) message = NULL; + g_autoptr(GVariant) reply = NULL; + g_autoptr(GVariantIter) iter = NULL; + char *retkey; + char *retval; + gboolean is_authorized; + gboolean is_challenge; + bool is_dismissed = false; + const char **next; + + if (!(sysbus = virGDBusGetSystemBus())) + return -1; + + VIR_INFO("Checking PID %lld running as %d", + (long long) pid, uid); + + g_variant_builder_init(&builder, G_VARIANT_TYPE("a{sv}")); + g_variant_builder_add(&builder, "{sv}", "pid", g_variant_new_uint32(pid)); + g_variant_builder_add(&builder, "{sv}", "start-time", g_variant_new_uint64(startTime)); + g_variant_builder_add(&builder, "{sv}", "uid", g_variant_new_int32(uid)); + gprocess = g_variant_builder_end(&builder); + + g_variant_builder_init(&builder, G_VARIANT_TYPE("a{ss}")); + + if (details) { + for (next = details; *next; next++) { + const char *detail1 = *(next++); + const char *detail2 = *next; + g_variant_builder_add(&builder, "{ss}", detail1, detail2); + } + } + + gdetails = g_variant_builder_end(&builder); + + message = g_variant_new("((s@a{sv})s@a{ss}us)", + "unix-process", + gprocess, + actionid, + gdetails, + allowInteraction, + "" /* cancellation ID */); + + if (virGDBusCallMethod(sysbus, + &reply, + G_VARIANT_TYPE("((bba{ss}))"), + NULL, + "org.freedesktop.PolicyKit1", + "/org/freedesktop/PolicyKit1/Authority", + "org.freedesktop.PolicyKit1.Authority", + "CheckAuthorization", + message) < 0) + return -1; + + g_variant_get(reply, "((bba{ss}))", &is_authorized, &is_challenge, &iter); + + while (g_variant_iter_loop(iter, "{ss}", &retkey, &retval)) { + if (STREQ(retkey, "polkit.dismissed") && STREQ(retval, "true")) + is_dismissed = true; + } + + VIR_DEBUG("is auth %d is challenge %d", + is_authorized, is_challenge); + + if (is_authorized) + return 0; + + if (is_dismissed) { + virReportError(VIR_ERR_AUTH_CANCELLED, "%s", + _("user cancelled authentication process")); + } else if (is_challenge) { + virReportError(VIR_ERR_AUTH_UNAVAILABLE, + _("no polkit agent available to authenticate action '%s'"), + actionid); + } else { + virReportError(VIR_ERR_AUTH_FAILED, "%s", + _("access denied by policy")); + } + + return -2; +} + #else /* ! WITH_POLKIT */ int virPolkitCheckAuth(const char *actionid G_GNUC_UNUSED, @@ -275,26 +275,4 @@ int virPolkitCheckAuth(const char *actionid G_GNUC_UNUSED, return -1; } - -void -virPolkitAgentDestroy(virPolkitAgent *agent G_GNUC_UNUSED) -{ - return; /* do nothing */ -} - - -virPolkitAgent * -virPolkitAgentCreate(void) -{ - virReportError(VIR_ERR_AUTH_FAILED, "%s", - _("polkit text authentication agent unavailable")); - return NULL; -} - -bool -virPolkitAgentAvailable(void) -{ - return false; -} - #endif /* WITH_POLKIT */ -- 2.34.0

On Sun, Nov 21, 2021 at 12:10:08AM +0100, Martin Kletzander wrote:
The reason for this is twofold:
- the polkit build option is documented for UNIX socket access checks
- there is no server-side change or dbus call done when enabling this as it only starts a polkit agent on the client-side (actually only in virsh) and does not need any requirements (starting is skipped if pkttyagent is not installed)
Also move the conditional implementation to the bottom of the file so that it does not look like the whole file is build conditionally and the common functions are at the top.
Does this still work correctly on Windows if we try by default ? Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, Nov 22, 2021 at 09:23:01AM +0000, Daniel P. Berrangé wrote:
On Sun, Nov 21, 2021 at 12:10:08AM +0100, Martin Kletzander wrote:
The reason for this is twofold:
- the polkit build option is documented for UNIX socket access checks
- there is no server-side change or dbus call done when enabling this as it only starts a polkit agent on the client-side (actually only in virsh) and does not need any requirements (starting is skipped if pkttyagent is not installed)
Also move the conditional implementation to the bottom of the file so that it does not look like the whole file is build conditionally and the common functions are at the top.
Does this still work correctly on Windows if we try by default ?
Any call to virPolkitAgentAvailable() should return false on Windows unless PKTTYAGENT (/usr/bin/pkttyagent) exists. While thinking about it now I will change that virFileExists() call to virFileIsExecutable() to catch even more possible issues. Anyway since that should return false on Windows (and I hope my presumptions are correct) then virPolkitAgentCreate() should report an error, just like it would without this patch if connecting to polkit-guarded libvirtd socket (e.g. through ssh). It should actually return a better error with this patch applied. And ctermid() is a POSIX function, not sure what that returns on windows, but it should not even get there as the first check is done against the existence/executability of PKTTYAGENT.
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, Nov 22, 2021 at 11:03:53AM +0100, Martin Kletzander wrote:
On Mon, Nov 22, 2021 at 09:23:01AM +0000, Daniel P. Berrangé wrote:
On Sun, Nov 21, 2021 at 12:10:08AM +0100, Martin Kletzander wrote:
The reason for this is twofold:
- the polkit build option is documented for UNIX socket access checks
- there is no server-side change or dbus call done when enabling this as it only starts a polkit agent on the client-side (actually only in virsh) and does not need any requirements (starting is skipped if pkttyagent is not installed)
Also move the conditional implementation to the bottom of the file so that it does not look like the whole file is build conditionally and the common functions are at the top.
Does this still work correctly on Windows if we try by default ?
Any call to virPolkitAgentAvailable() should return false on Windows unless PKTTYAGENT (/usr/bin/pkttyagent) exists. While thinking about it now I will change that virFileExists() call to virFileIsExecutable() to catch even more possible issues. Anyway since that should return false on Windows (and I hope my presumptions are correct) then virPolkitAgentCreate() should report an error, just like it would without this patch if connecting to polkit-guarded libvirtd socket (e.g. through ssh). It should actually return a better error with this patch applied.
And ctermid() is a POSIX function, not sure what that returns on windows, but it should not even get there as the first check is done against the existence/executability of PKTTYAGENT.
ctermid() doesn't exist in Windows, so it shouldn't even compile if we try to build with it ! A conditional willbe needed I expect. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, Nov 22, 2021 at 10:07:39AM +0000, Daniel P. Berrangé wrote:
On Mon, Nov 22, 2021 at 11:03:53AM +0100, Martin Kletzander wrote:
On Mon, Nov 22, 2021 at 09:23:01AM +0000, Daniel P. Berrangé wrote:
On Sun, Nov 21, 2021 at 12:10:08AM +0100, Martin Kletzander wrote:
The reason for this is twofold:
- the polkit build option is documented for UNIX socket access checks
- there is no server-side change or dbus call done when enabling this as it only starts a polkit agent on the client-side (actually only in virsh) and does not need any requirements (starting is skipped if pkttyagent is not installed)
Also move the conditional implementation to the bottom of the file so that it does not look like the whole file is build conditionally and the common functions are at the top.
Does this still work correctly on Windows if we try by default ?
Any call to virPolkitAgentAvailable() should return false on Windows unless PKTTYAGENT (/usr/bin/pkttyagent) exists. While thinking about it now I will change that virFileExists() call to virFileIsExecutable() to catch even more possible issues. Anyway since that should return false on Windows (and I hope my presumptions are correct) then virPolkitAgentCreate() should report an error, just like it would without this patch if connecting to polkit-guarded libvirtd socket (e.g. through ssh). It should actually return a better error with this patch applied.
And ctermid() is a POSIX function, not sure what that returns on windows, but it should not even get there as the first check is done against the existence/executability of PKTTYAGENT.
ctermid() doesn't exist in Windows, so it shouldn't even compile if we try to build with it ! A conditional willbe needed I expect.
Yeah, with my limited (and mostly forgotten Windows experience) I was too much reliant on our builds which all have allow_failure: true for mingw builds. Sorry for that, I'll fix that up.
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Tue, Nov 23, 2021 at 12:55:00PM +0100, Martin Kletzander wrote:
On Mon, Nov 22, 2021 at 10:07:39AM +0000, Daniel P. Berrangé wrote:
On Mon, Nov 22, 2021 at 11:03:53AM +0100, Martin Kletzander wrote:
On Mon, Nov 22, 2021 at 09:23:01AM +0000, Daniel P. Berrangé wrote:
On Sun, Nov 21, 2021 at 12:10:08AM +0100, Martin Kletzander wrote:
The reason for this is twofold:
- the polkit build option is documented for UNIX socket access checks
- there is no server-side change or dbus call done when enabling this as it only starts a polkit agent on the client-side (actually only in virsh) and does not need any requirements (starting is skipped if pkttyagent is not installed)
Also move the conditional implementation to the bottom of the file so that it does not look like the whole file is build conditionally and the common functions are at the top.
Does this still work correctly on Windows if we try by default ?
Any call to virPolkitAgentAvailable() should return false on Windows unless PKTTYAGENT (/usr/bin/pkttyagent) exists. While thinking about it now I will change that virFileExists() call to virFileIsExecutable() to catch even more possible issues. Anyway since that should return false on Windows (and I hope my presumptions are correct) then virPolkitAgentCreate() should report an error, just like it would without this patch if connecting to polkit-guarded libvirtd socket (e.g. through ssh). It should actually return a better error with this patch applied.
And ctermid() is a POSIX function, not sure what that returns on windows, but it should not even get there as the first check is done against the existence/executability of PKTTYAGENT.
ctermid() doesn't exist in Windows, so it shouldn't even compile if we try to build with it ! A conditional willbe needed I expect.
Yeah, with my limited (and mostly forgotten Windows experience) I was too much reliant on our builds which all have allow_failure: true for mingw builds. Sorry for that, I'll fix that up.
We need to move at least one of our mingw builds off rawhide onto the stable Fedora, so we can remove the allow_failure flag. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Tue, Nov 23, 2021 at 11:56:40AM +0000, Daniel P. Berrangé wrote:
On Tue, Nov 23, 2021 at 12:55:00PM +0100, Martin Kletzander wrote:
On Mon, Nov 22, 2021 at 10:07:39AM +0000, Daniel P. Berrangé wrote:
On Mon, Nov 22, 2021 at 11:03:53AM +0100, Martin Kletzander wrote:
On Mon, Nov 22, 2021 at 09:23:01AM +0000, Daniel P. Berrangé wrote:
On Sun, Nov 21, 2021 at 12:10:08AM +0100, Martin Kletzander wrote:
The reason for this is twofold:
- the polkit build option is documented for UNIX socket access checks
- there is no server-side change or dbus call done when enabling this as it only starts a polkit agent on the client-side (actually only in virsh) and does not need any requirements (starting is skipped if pkttyagent is not installed)
Also move the conditional implementation to the bottom of the file so that it does not look like the whole file is build conditionally and the common functions are at the top.
Does this still work correctly on Windows if we try by default ?
Any call to virPolkitAgentAvailable() should return false on Windows unless PKTTYAGENT (/usr/bin/pkttyagent) exists. While thinking about it now I will change that virFileExists() call to virFileIsExecutable() to catch even more possible issues. Anyway since that should return false on Windows (and I hope my presumptions are correct) then virPolkitAgentCreate() should report an error, just like it would without this patch if connecting to polkit-guarded libvirtd socket (e.g. through ssh). It should actually return a better error with this patch applied.
And ctermid() is a POSIX function, not sure what that returns on windows, but it should not even get there as the first check is done against the existence/executability of PKTTYAGENT.
ctermid() doesn't exist in Windows, so it shouldn't even compile if we try to build with it ! A conditional willbe needed I expect.
Yeah, with my limited (and mostly forgotten Windows experience) I was too much reliant on our builds which all have allow_failure: true for mingw builds. Sorry for that, I'll fix that up.
We need to move at least one of our mingw builds off rawhide onto the stable Fedora, so we can remove the allow_failure flag.
And thinking about it I will just remove this patch, there is no point in that any more.
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On a Sunday in 2021, Martin Kletzander wrote:
The reason for this is twofold:
- the polkit build option is documented for UNIX socket access checks
- there is no server-side change or dbus call done when enabling this as it only starts a polkit agent on the client-side (actually only in virsh) and does not need any requirements (starting is skipped if pkttyagent is not installed)
Also move the conditional implementation to the bottom of the file so that it does not look like the whole file is build conditionally and the common functions are at the top.
Please, separate the code movement from the functional changes, to make it obvious what this commit does.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/util/virpolkit.c | 240 ++++++++++++++++++++----------------------- 1 file changed, 109 insertions(+), 131 deletions(-)
diff --git a/src/util/virpolkit.c b/src/util/virpolkit.c index 7156adc10c0a..b51104100796 100644 --- a/src/util/virpolkit.c +++ b/src/util/virpolkit.c @@ -21,6 +21,7 @@
#include <config.h> #include <fcntl.h> +#include <poll.h>
This does not compile on mingw: ../src/util/virpolkit.c:24:10: fatal error: poll.h: No such file or directory 24 | #include <poll.h> | ^~~~~~~~ Jano
#include <unistd.h>
#include "virpolkit.h"
participants (5)
-
Daniel P. Berrangé
-
Jim Fehlig
-
Ján Tomko
-
Martin Kletzander
-
Michal Prívozník