On Tue, Oct 24, 2017 at 10:34:53 -0700, Prerna Saxena wrote:
As noted in
https://www.redhat.com/archives/libvir-list/2017-May/msg00016.html
the libvirt QEMU driver handles all async events from the main loop.
Handling each event needs the per-VM lock to make forward progress. In
the case where an async event is received for a VM which already has an
RPC running, the main loop is held up contending for the same lock.
This impacts scalability and should be addressed as a priority.
Note that libvirt does have a 2-step deferred handling for a few event
categories, but (1) that is insufficient, since blocking happens before
the handler can disambiguate which events need to be posted to this
other queue, and (2) event handling needs to be homogeneous.
The current series builds a framework for recording and handling VM
events.
It initializes a per-VM event queue, and a global event queue pointing to
events from all the VMs. Event handling is staggered in 2 stages:
- When an event is received, it is enqueued in the per-VM queue as well
as the global queue.
- The global queue is drained by a threadpool built into the QEMU
driver (currently with a single thread).
- Enqueuing of a new event triggers the global event worker thread, which
then attempts to take a lock for this event's VM.
- If the lock is available, the event worker runs the function handling
this event type. Once done, it dequeues this event from the global
as well as per-VM queues.
- If the lock is unavailable (i.e. taken by an RPC thread), the event
worker thread leaves the event as-is and picks up the next event.
(A minimal sketch of this two-stage flow is below.)
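To make this concrete, here is a minimal compilable sketch of the two stages.
It uses pthreads and invented names (demoVM, demoEvent, demoEnqueue,
demoEventWorker); it only illustrates the flow described above, it is not
code from the series, and it elides the dequeue and worker wake-up details:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct demoVM demoVM;
typedef struct demoEvent demoEvent;

struct demoEvent {
    int type;                   /* QMP event type */
    demoVM *vm;                 /* VM this event belongs to */
    demoEvent *nextPerVM;       /* link in the per-VM queue */
    demoEvent *nextGlobal;      /* link in the global queue */
};

struct demoVM {
    pthread_mutex_t lock;       /* per-VM lock shared with RPC threads */
    demoEvent *events;          /* head of the per-VM event queue */
};

static demoEvent *globalQueue;  /* events from all VMs */
static pthread_mutex_t globalLock = PTHREAD_MUTEX_INITIALIZER;

/* Stage 1: called from the main loop; only enqueues, never blocks on vm->lock */
static void demoEnqueue(demoVM *vm, int type)
{
    demoEvent *ev = calloc(1, sizeof(*ev));

    ev->type = type;
    ev->vm = vm;
    pthread_mutex_lock(&globalLock);
    ev->nextGlobal = globalQueue;
    globalQueue = ev;
    ev->nextPerVM = vm->events;
    vm->events = ev;
    pthread_mutex_unlock(&globalLock);
    /* ...wake up the event worker thread here... */
}

/* Stage 2: the event worker; skips VMs whose lock an RPC thread holds */
static void demoEventWorker(void)
{
    pthread_mutex_lock(&globalLock);
    for (demoEvent *ev = globalQueue; ev; ev = ev->nextGlobal) {
        if (pthread_mutex_trylock(&ev->vm->lock) != 0)
            continue;           /* VM busy with an RPC; leave the event queued */
        printf("handling event %d\n", ev->type);
        /* ...dequeue ev from both the global and the per-VM queue here... */
        pthread_mutex_unlock(&ev->vm->lock);
    }
    pthread_mutex_unlock(&globalLock);
}

int main(void)
{
    demoVM vm = { .lock = PTHREAD_MUTEX_INITIALIZER, .events = NULL };

    demoEnqueue(&vm, 1);        /* stage 1: the main loop queues an event */
    demoEventWorker();          /* stage 2: the VM lock is free, so handle it */
    return 0;
}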
If I get it right, the event is either processed immediately when its VM
object is unlocked or it has to wait until the current job running on
the VM object finishes even though the lock may be released before that.
Correct? If so, this needs to be addressed.
- Once the RPC thread completes, it looks for events pertaining to the
VM in the per-VM event queue. It then processes the events serially
(holding the VM lock) until there are no more events remaining for
this VM. At this point, the per-VM lock is relinquished. (A sketch of
this drain step is below.)
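A matching sketch of this drain step, again with invented demo types rather
than the structures from the series:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct demoEvent {
    int type;
    struct demoEvent *nextPerVM;   /* link in the per-VM queue */
} demoEvent;

typedef struct demoVM {
    pthread_mutex_t lock;          /* per-VM lock held by the RPC thread */
    demoEvent *events;             /* events deferred while the RPC ran */
} demoVM;

/* Called by the RPC thread just before it would release the per-VM lock */
static void demoRPCFinish(demoVM *vm)
{
    while (vm->events) {                     /* process deferred events serially */
        demoEvent *ev = vm->events;

        vm->events = ev->nextPerVM;          /* dequeue from the per-VM queue */
        printf("handling deferred event %d\n", ev->type);
        /* ...the series would also unlink ev from the global queue... */
        free(ev);
    }
    pthread_mutex_unlock(&vm->lock);         /* only now is the VM lock released */
}

int main(void)
{
    demoVM vm = { .lock = PTHREAD_MUTEX_INITIALIZER };
    demoEvent *ev = calloc(1, sizeof(*ev));

    pthread_mutex_lock(&vm.lock);  /* the RPC thread holds the VM lock */
    ev->type = 7;
    vm.events = ev;                /* an event arrived while the RPC was running */
    demoRPCFinish(&vm);            /* drain the queue, then unlock */
    return 0;
}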
Patch Series status:
Strictly RFC only. No compilation issues. I have not had a chance to
(stress) test it after rebasing to the latest master.
Note that documentation and test coverage are TBD, since a few open
points remain.
Known issues/caveats:
- RPC handling time will become non-deterministic.
- An event will only be "notified" to a client once the RPC for the
same VM completes.
- Needs careful consideration in all cases where a QMP event is used to
"signal" an RPC thread, or it will deadlock.
This last issue is actually a show stopper here. We need to make sure
QMP events are processed while a job is still active on the same domain.
Otherwise things like block jobs and migration, which are long-running
jobs driven by events, will break, as the sketch below illustrates.
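For illustration, here is a simplified, self-contained version of one such
deadlock, written so that the RPC thread keeps the per-VM lock while it waits;
all names are invented for the example:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t vmLock = PTHREAD_MUTEX_INITIALIZER;  /* the per-VM lock */
static bool jobCompleted;       /* would be set by the QMP event handler */

/* RPC thread: drives a block job or migration and waits for the completion
 * event while the job is still active on the domain */
static void *demoRPC(void *opaque)
{
    (void)opaque;
    pthread_mutex_lock(&vmLock);
    /* ...issue the QMP command that starts the long-running job... */
    while (!jobCompleted)       /* never becomes true, see the worker below */
        usleep(10 * 1000);
    pthread_mutex_unlock(&vmLock);
    return NULL;
}

/* Event worker as proposed: it may only handle the completion event once it
 * takes the per-VM lock, so it keeps skipping the event while the RPC above
 * holds the lock, and the RPC never finishes without the event */
static void demoEventWorker(void)
{
    if (pthread_mutex_trylock(&vmLock) == 0) {
        jobCompleted = true;    /* never reached in this scenario */
        pthread_mutex_unlock(&vmLock);
    }
}

int main(void)
{
    pthread_t rpc;

    pthread_create(&rpc, NULL, demoRPC, NULL);
    sleep(1);                   /* let the RPC thread grab the VM lock */
    demoEventWorker();          /* trylock fails; the event stays queued */
    printf("event deferred, RPC still waiting: deadlock\n");
    /* pthread_join(rpc, NULL) would block forever here */
    return 0;
}

In real code the RPC thread would wait on a domain condition instead of
spinning, but the effect is the same as long as the event can only be handled
after the RPC finishes.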
Jirka