On Thu, Dec 19, 2019 at 11:15:50AM +0000, Daniel P. Berrangé wrote:
On Tue, Dec 17, 2019 at 12:41:27PM -0500, Cole Robinson wrote:
> The second issue: testing with virt-manager, everything locks up with
> OpenGraphicsFD:
>
> #0 0x00007ffff7a7f07a in pthread_cond_timedwait@@GLIBC_2.3.2 () at
> /lib64/libpthread.so.0
> #1 0x00007fffe94be113 in virCondWaitUntil (c=c@entry=0x7fffc8071f98,
> m=m@entry=0x7fffc8071ed0, whenms=whenms@entry=1576602286073) at
> /home/crobinso/src/libvirt/src/util/virthread.c:159
> #2 0x00007fffe44fc549 in qemuDomainObjBeginJobInternal
> (driver=driver@entry=0x7fffc8004f30, obj=0x7fffc8071ec0,
> job=job@entry=QEMU_JOB_MODIFY,
> agentJob=agentJob@entry=QEMU_AGENT_JOB_NONE,
> asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, nowait=nowait@entry=false)
> at /home/crobinso/src/libvirt/src/qemu/qemu_domain.c:9357
> #3 0x00007fffe4500aa1 in qemuDomainObjBeginJob
> (driver=driver@entry=0x7fffc8004f30, obj=<optimized out>,
> job=job@entry=QEMU_JOB_MODIFY) at
> /home/crobinso/src/libvirt/src/qemu/qemu_domain.c:9521
> #4 0x00007fffe4582572 in qemuDomainOpenGraphicsFD (dom=<optimized out>,
> idx=<optimized out>, flags=0) at
> /home/crobinso/src/libvirt/src/qemu/qemu_driver.c:18990
> #5 0x00007fffe968699c in virDomainOpenGraphicsFD (dom=0x7fffd0005830,
> idx=0, flags=0) at /home/crobinso/src/libvirt/src/libvirt-domain.c:10664
> #6 0x00007fffe98cc6b1 in libvirt_virDomainOpenGraphicsFD () at
> /usr/lib64/python3.7/site-packages/libvirtmod.cpython-37m-x86_64-linux-gnu.so
>
>
> I didn't dig into it any more than that. Otherwise in some mild testing
> I'm surprised how much things seemed to 'just work' :)
Hard to tell but my guess is that there's probably some event loop
interaction causing us problems. Obviously this stack trace is waiting
for the job lock. I'm guessing that the job lock is held by another
API call made from the event loop thread. Will investigate it some more.
In libvirtd no libvirt API calls are ever made from the event loop
thread; they can only be made from worker threads. We'll need to make
it clear that apps using the embedded driver must likewise ensure that
libvirt API calls are *NEVER* made from the event loop thread. This
is good practice even when using traditional libvirt connections,
because the RPC calls to libvirtd can take an arbitrary amount of
time to complete and so make event loop responsiveness very poor.
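Roughly the pattern apps need is below. This is only a sketch of the
threading structure, not real virt-manager code: the blocking libvirt
call is simulated with time.sleep(), and the polling loop stands in
for a real GLib main loop iteration.

```python
# Keep blocking libvirt API calls off the event loop thread by running
# them in a worker thread and handing the result back via a queue.
import queue
import threading
import time

def blocking_libvirt_call():
    # Stand-in for something like dom.openGraphicsFD(0), which can
    # block for an arbitrary amount of time waiting on the daemon/QMP.
    time.sleep(0.1)
    return "fd"

results = queue.Queue()

def worker():
    # Worker thread: the only safe place to make blocking API calls.
    results.put(blocking_libvirt_call())

t = threading.Thread(target=worker)
t.start()

# "Event loop" thread: never blocks inside the API call itself, so it
# stays free to service monitor I/O and other events in the meantime.
while True:
    try:
        fd = results.get(timeout=0.05)
        break
    except queue.Empty:
        pass  # a real loop would dispatch other pending events here

t.join()
print(fd)
```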
I reproduced this issue and can confirm that it is indeed
caused by calling virDomainOpenGraphicsFD from the main
event loop. This call sends a QMP monitor command, and I/O
for the monitor is performed by the event loop. So we're
waiting for a QMP send that will never happen, since
virt-manager is blocking the event loop.
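The deadlock mechanism can be boiled down to this sketch (a
threading.Event stands in for the QMP reply; the names are
illustrative, not libvirt APIs):

```python
# A thread waits for a reply that only the event loop can deliver,
# while the event loop itself is blocked inside that very wait.
import threading

qmp_reply = threading.Event()

def event_loop_iteration():
    # Would read the monitor socket and deliver the reply ... but it
    # never runs, because the loop thread is stuck in the wait below.
    qmp_reply.set()

# The API call made *from* the event loop thread: it blocks waiting
# for the reply, so event_loop_iteration() never gets to run on this
# thread. With no timeout this would hang forever, as in the
# virDomainOpenGraphicsFD backtrace above.
got_reply = qmp_reply.wait(timeout=0.2)
print(got_reply)
```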
This is quite tedious to fix with the way virt-manager is
written using the libvirt python bindings.
Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org -o-    https://www.instagram.com/dberrange :|