On Tue, May 2, 2017 at 4:07 PM, Peter Krempa <pkrempa(a)redhat.com> wrote:
> On Tue, May 02, 2017 at 16:01:40 +0530, Prerna wrote:
> [please don't top-post on technical lists]
> > Thanks for the quick response, Peter!
> > This ratifies the basic approach I had in mind.
> > It needs some (not-so-small) cleanup of the qemu driver code, and I have
> > already started cleaning up some of it. I am planning to have a constant
> > number of event handler threads to start with. I'll try adding this as a
> > configurable parameter in qemu.conf once basic functionality is
> > completed.
> That is wrong, since you can't guarantee that it will not lock up. Since
> the workers handling monitor events tend to call monitor commands
> themselves, it's possible that they will get stuck due to an unresponsive
> qemu. Without the worst-case provision of a thread per VM, you can't
> guarantee that the pool won't be depleted.
Once a worker thread "picks" an event, it will contend on that VM's lock.
Consequently, the handling of that event will be delayed until any
in-flight RPC call for that VM completes.
> If you want to fix this properly, you'll need a dynamic pool.
To improve the efficiency of the thread pool, a worker could try to
acquire a VM's lock for a bounded time, say N seconds, and then give up
the attempt. The same pool thread can then move on to process the events
of the next VM.
Note that this needs all VMs to be hashed onto a constant number of
threads in the pool, say 5. This ensures that each worker thread has a
unique, non-overlapping set of VMs to work with.
As an example, VM IDs {1, 6, 11, 16, 21, ...} are all handled by the same
worker thread. If this worker thread cannot acquire a given VM's lock, it
moves on to the event list of the next VM, and so on. The use of
pthread_mutex_trylock() ensures that the worker thread is never stuck
forever.