On Tue, May 2, 2017 at 4:07 PM, Peter Krempa <pkrempa@redhat.com> wrote:
On Tue, May 02, 2017 at 16:01:40 +0530, Prerna wrote:

[please don't top-post on technical lists]

> Thanks for the quick response Peter !
> This ratifies the basic approach I had in mind.
> It needs some (not-so-small) cleanup of the qemu driver code, and I have
> already started cleaning up some of it. I am planning to have a constant
> number of event handler threads to start with. I'll try adding this as a
> configurable parameter in qemu.conf once basic functionality is completed.

That is wrong, since you can't guarantee that it will not lock up. Since
the workers handling monitor events tend to call monitor commands
themselves it's possible that it will get stuck due to unresponsive
qemu. Without having a worst-case-scenario of a thread per VM you can't
guarantee that the pool won't be depleted.

Once a worker thread "picks" an event, it will contend on the per-VM lock for that VM. Consequently, the handling for that event will be delayed until an existing RPC call for that VM completes.
 

If you want to fix this properly, you'll need a dynamic pool.

To improve the efficiency of the thread pool, a worker can try to acquire a VM's lock for a bounded time, say N seconds, and give up if the lock is still held. The same thread in the pool can then move on to process events for the next VM.
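
A rough sketch of the bounded wait, just to make the idea concrete. It assumes the per-VM lock is a plain pthread mutex and uses POSIX pthread_mutex_timedlock(); the helper name vm_lock_timed() is made up for illustration and is not an existing libvirt function.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: wait at most timeout_sec seconds for a VM's lock.
 * Returns 0 if the lock was taken, ETIMEDOUT if it is still held after
 * the timeout, or another error number on failure. */
static int vm_lock_timed(pthread_mutex_t *vm_lock, int timeout_sec)
{
    struct timespec deadline;

    assert(timeout_sec >= 0);

    /* pthread_mutex_timedlock() expects an absolute CLOCK_REALTIME deadline. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;
    deadline.tv_sec += timeout_sec;

    return pthread_mutex_timedlock(vm_lock, &deadline);
}
```

On ETIMEDOUT the worker would leave that VM's events queued and move on to the next VM in its set.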

Note that this needs all VMs to be hashed to a constant number of threads in the pool, say 5. This ensures that each worker thread has a unique, non-overlapping set of VMs to work with.

As an example, {VM_ID: 1, 6, 11, 16, 21, ...} are handled by the same worker thread. If this worker thread cannot acquire the lock of the VM it is looking at, it will move on to the event list for the next VM, and so on. The use of pthread_mutex_trylock() ensures that the worker thread will never block indefinitely waiting for a VM lock.
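
To illustrate, here is a minimal sketch of one worker pass. It assumes a simple modulo hash on a numeric VM ID and a pthread mutex per VM; struct vm, worker_for_vm() and worker_pass() are hypothetical names for this example, not the qemu driver's actual data structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N_WORKERS 5   /* assumed fixed pool size, as in the example above */

/* Hypothetical per-VM state for this sketch. */
struct vm {
    int id;
    pthread_mutex_t lock;
    int pending_events;
};

/* Map a VM to the worker that owns it: IDs 1, 6, 11, ... share a worker. */
static int worker_for_vm(int vm_id, int n_workers)
{
    assert(n_workers > 0 && vm_id >= 0);
    return vm_id % n_workers;
}

/* One pass of a worker over its VMs. Returns how many VMs had their
 * pending events handled; VMs whose lock is busy are skipped. */
static size_t worker_pass(int worker, struct vm *vms, size_t n_vms)
{
    size_t handled = 0;

    for (size_t i = 0; i < n_vms; i++) {
        struct vm *vm = &vms[i];

        if (worker_for_vm(vm->id, N_WORKERS) != worker)
            continue;               /* VM belongs to another worker */

        if (pthread_mutex_trylock(&vm->lock) != 0)
            continue;               /* lock busy: retry on a later pass */

        if (vm->pending_events > 0) {
            vm->pending_events = 0; /* stand-in for real event handling */
            handled++;
        }
        pthread_mutex_unlock(&vm->lock);
    }
    return handled;
}
```

Each worker would call worker_pass() in a loop. A VM whose lock is held by an RPC call keeps its events queued until a later pass finds the lock free.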