Re: [libvirt] [RFC 0/2] Fix detection of slow guest shutdown

Monday, 6 August 2018

On Mon, Aug 6, 2018 at 10:47 AM Daniel P. Berrangé <berrange(a)redhat.com>
wrote:

...
 On Mon, Aug 06, 2018 at 07:20:10AM +0200, Christian Ehrhardt wrote:
 > In that case I wonder what the libvirt community thinks of the proposed
 > general "Pid is gone means we can assume it is dead" approach?

 The key thing with the shutdown process is that we use the disappearance of
 the PID as the flag to indicate that it is safe to release any resources
 that the PID was using, e.g. the hostdevs are now available for another
 guest to use.

 I'd be concerned that if we look at /proc/$PID going away as the flag, then
 we would be releasing the hostdevs for reuse before the kernel has cleaned
 them up. In the best case this would result in a 2nd guest failing to start
 because the device was still in use; in the worst case we could crash
 the entire host (though I'd be hopeful vfio prevents that).

Yeah, I agree that resources being in use could lead to bad and rather hard
to debug problems.

...
 > An alternative would be to understand on the kernel side why the PID is
 > gone "too early" and fix that so it stays until fully cleaned up.
 > But even then on the libvirt side we would need the extended timeout
 > values.

 Yeah, looks like extended timeouts are unavoidable. The only real
 optimization would be to pass an explicit timeout to the kill method,
 increasing it by 2 seconds for each hostdev that is assigned. That way we'll
 scale the timeout up as we need, so we don't have to predict the worst case
 number of assigned devices.

I'd do both:
- extend the KILL path timeout (if force is set) in general, to give bad
  systems a chance
- extend the maximum by 2s per hostdev
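
To make the two points above concrete, here is a minimal sketch in Python
(not the libvirt C code, and not the patch itself). The 2 seconds per hostdev
comes from this thread; the base timeout of 15 seconds, the poll interval and
all function names are assumptions picked purely for illustration.

```python
import os
import time

# Assumed illustrative base wait after SIGKILL; not taken from libvirt.
BASE_KILL_TIMEOUT_S = 15.0
# From the thread: allow 2 extra seconds for each assigned hostdev.
PER_HOSTDEV_EXTRA_S = 2.0


def kill_timeout(num_hostdevs, base=BASE_KILL_TIMEOUT_S):
    """Return the maximum time (seconds) to wait for the PID to go away."""
    if num_hostdevs < 0:
        raise ValueError("num_hostdevs must not be negative")
    return base + PER_HOSTDEV_EXTRA_S * num_hostdevs


def pid_exists(pid):
    """Probe the PID with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False   # ESRCH: no such process
    except PermissionError:
        return True    # EPERM: the process exists but belongs to someone else
    return True


def wait_for_pid_exit(pid, timeout, poll_interval=0.2):
    """Poll until the PID disappears or the timeout expires.

    Returns True if the PID is gone, False if it is still present.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not pid_exists(pid):
            return True
        time.sleep(poll_interval)
    return not pid_exists(pid)
```

With these assumed numbers a guest with 4 hostdevs would get
kill_timeout(4) == 23.0 seconds before the shutdown is reported as failed,
while a guest without hostdevs keeps the base value. Note that a signal-0
probe still reports an unreaped zombie as existing.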

I'll submit that in a few minutes as a reply.

...
 Regards,
 Daniel
 --
 |: https://berrange.com       -o-  https://www.flickr.com/photos/dberrange :|
 |: https://libvirt.org        -o-  https://fstop138.berrange.com :|
 |: https://entangle-photo.org -o-  https://www.instagram.com/dberrange :|

-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd
