Re: [libvirt] [PATCH v2 7/9] rpc: Alter virNetDaemonQuit processing

Monday, 16 July 2018

On Mon, Jul 16, 2018 at 10:18:59 +0100, Daniel Berrange wrote:
...
 On Thu, Jul 12, 2018 at 03:34:00PM -0400, John Ferlan wrote:
 > On 07/09/2018 03:11 AM, Peter Krempa wrote:
 > > On Sat, Jul 07, 2018 at 08:11:05 -0400, John Ferlan wrote:

 I think the key thing to bear in mind is what is the benefit of what we
 are trying todo. ie why do we need todo a clean shutdown instead of just
 immediately exiting. The kernel will clean up all resources associated
 with the process when it exits. So we only need care about a clean shutdown
 if there is something needing cleanup that the kernel won't handle, or if
 we are trying to debug with valgrind & similar tools.

 I tend to think we put way too much time into worrying about getting perfect
 clean shutdown, for little real world benefit.

 IMHO, we should make a moderate effort to shutdown cleanly, but if there's
 something stuck after 15 seconds, we should just uncoditionally call exit(). 
I'm not really worried about the uncleannes of the shutdown but rather
desynchronisation of qemu-libvirt state. If we exit during a modify job
prior to writing the status XML to disk we might end up in an
intermediate state which will not be recognized by libvirt afterwards.

Just exit()-ing is not a systematic solution here because of the above.

In my opinion a better solution is to close the monitor connections
after some time and only in cases where we know it's safe. E.g. in QUERY
type jobs. Some modify-type jobs may get stuck as well, but only a very
few commands create this risk so annotating them and not allowing to
break the monitor there should be way better.

Currently I'm mostly worried about the 'transaction' command which is
used for snapshots, as losing track of the new files is not recoverable.

Other block jobs which modify the backing graph would be a problem
normally but since we don't track the bakcing chain fully yet it's okay
except for finishing step of the active block commit.

For the blockjobs it will be possible to work this around by the new
APIs in qemu which allow the jobs to linger until we retrieve the state.

That means that the only problematic job probably will be
'transaction'/snapshot creation.

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [libvirt] [PATCH v2 7/9] rpc: Alter virNetDaemonQuit processing