On 12/11/21 03:28, Martin Kletzander wrote:
> On Sat, Dec 11, 2021 at 11:16:13AM +0100, Martin Kletzander wrote:
>> On Fri, Dec 10, 2021 at 05:48:03PM -0700, Jim Fehlig wrote:
>>> Hi Martin!
>>>
>>> I recently received a bug report (sorry, not public) about simple operations
>>> like 'virsh list' hanging when invoked with an internal test tool. I found this
>>> commit to be the culprit.
>>>
>
> OK, one more thing though, the fact that pkttyagent is spawned cannot
> cause virsh to hang. If authentication is not required, then it
> will just wait there for a while and then be killed. If authentication
> *is* required, then either you already have an agent running and that
> one should be used since we're starting pkttyagent with `--fallback` or
> you do not have any agent running in which case virsh list would fail
> to connect. Where does the virsh hang, what's the backtrace?

The last scenario you describe appears to be the case. virsh fails to connect,
then gets stuck trying to kill off pkttyagent:

#0 0x00007f9f07530241 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6
#1 0x00007f9f07535ad3 in nanosleep () from /lib64/libc.so.6
#2 0x00007f9f07f478af in g_usleep () from /usr/lib64/libglib-2.0.so.0
#3 0x00007f9f086694fa in virProcessAbort (pid=367) at ../src/util/virprocess.c:187
#4 0x00007f9f0861ed9b in virCommandAbort (cmd=cmd@entry=0x55a798660c50) at ../src/util/vircommand.c:2774
#5 0x00007f9f08621478 in virCommandFree (cmd=0x55a798660c50) at ../src/util/vircommand.c:3061
#6 0x00007f9f08668581 in virPolkitAgentDestroy (agent=0x55a7986426e0) at ../src/util/virpolkit.c:164
#7 0x000055a797836d93 in virshConnect (ctl=ctl@entry=0x7ffc551dd980, uri=0x0, readonly=readonly@entry=false) at ../tools/virsh.c:187
#8 0x000055a797837007 in virshReconnect (ctl=ctl@entry=0x7ffc551dd980, name=name@entry=0x0, readonly=<optimized out>, readonly@entry=false, force=force@entry=false) at ../tools/virsh.c:223
#9 0x000055a7978371e0 in virshConnectionHandler (ctl=0x7ffc551dd980) at ../tools/virsh.c:325
#10 0x000055a797880172 in vshCommandRun (ctl=ctl@entry=0x7ffc551dd980, cmd=0x55a79865f580) at ../tools/vsh.c:1308
#11 0x000055a7978367b7 in main (argc=2, argv=<optimized out>) at ../tools/virsh.c:907

Odd thing is, I attached gdb to this virsh process several minutes after
invoking the test tool that calls 'virsh list'. I can't explain why the process
is still blocked in g_usleep, which should only have slept for 10 milliseconds.
Even odder, detaching from the process appears to awaken g_usleep and allows
process shutdown to continue. The oddness can also be seen in the debug output:

2021-12-12 16:35:38.783+0000: 5912: debug : virCommandRunAsync:2629 : About to run /usr/bin/pkttyagent --process 5912 --notify-fd 4 --fallback
2021-12-12 16:35:38.787+0000: 5912: debug : virCommandRunAsync:2632 : Command result 0, with PID 5914
...
2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:177 : aborting child process 5914
2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:185 : trying SIGTERM to child process 5914

Attach gdb to the process, observe above backtrace, quit gdb.

2021-12-12 16:44:18.059+0000: 5912: debug : virProcessAbort:195 : trying SIGKILL to child process 5914
2021-12-12 16:44:18.061+0000: 5912: debug : virProcessAbort:201 : process has ended: fatal signal 9

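For readers following the log, the sequence it shows (SIGTERM, a short sleep,
then SIGKILL) can be sketched roughly as below. This is only an illustration
under assumptions: abort_child() is a hypothetical helper, not libvirt's
virProcessAbort(), and the 10 ms grace period is taken from the g_usleep
duration mentioned above.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Hypothetical helper, NOT libvirt's virProcessAbort(): stop a child by
 * asking politely first and forcing it afterwards.
 * Returns 0 once the child has been reaped (exit status in *status),
 * -1 on error. */
int abort_child(pid_t pid, int *status)
{
    /* Assumed grace period: 10 ms, matching the sleep discussed above. */
    const struct timespec grace = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
    pid_t ret;

    /* If the child already exited, just reap it. */
    ret = waitpid(pid, status, WNOHANG);
    if (ret == pid)
        return 0;
    if (ret < 0)
        return -1;

    /* Step 1: SIGTERM, then give the child a moment to exit. */
    if (kill(pid, SIGTERM) < 0)
        return -1;
    nanosleep(&grace, NULL);

    ret = waitpid(pid, status, WNOHANG);
    if (ret == pid)
        return 0;
    if (ret < 0)
        return -1;

    /* Step 2: SIGKILL cannot be caught or ignored, so a blocking wait
     * is expected to return; retry if interrupted by a signal. */
    if (kill(pid, SIGKILL) < 0)
        return -1;
    do {
        ret = waitpid(pid, status, 0);
    } while (ret < 0 && errno == EINTR);

    return ret == pid ? 0 : -1;
}
```

The sketch only mirrors the order of events in the log; it does not explain
why the sleep between the two signals failed to return for several minutes.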
> Anyway, if just adding:
>
>     if (!isatty(STDIN_FILENO))
>         return false;

This indeed fixes the regression in the test tool.

> to top of virPolkitAgentAvailable() solves your problem I do not have a
> particular issue with it. In any case I would like to better understand
> the issue as just the fact that we're running pkttyagent should not
> cause any issues.

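As a standalone illustration of the proposed guard, here is a minimal sketch.
The function name agent_worth_starting() and its fd parameter are hypothetical;
the rest of virPolkitAgentAvailable() is not shown, and the underlying
assumption is that a textual agent has nothing to prompt on when stdin is not
a terminal.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for the check proposed above; it is not the
 * body of virPolkitAgentAvailable(). */
bool agent_worth_starting(int fd)
{
    /* No terminal on this descriptor (pipe, file, /dev/null, ...):
     * skip starting the textual agent. */
    if (!isatty(fd))
        return false;

    /* Any further availability checks would follow here. */
    return true;
}
```

Calling agent_worth_starting(STDIN_FILENO) from a tool that runs virsh with
stdin redirected would return false, which corresponds to the test tool
scenario described in this thread.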
Given the above observations, I'm having a difficult time articulating the root
cause :-).
Regards,
Jim