On Mon, Aug 19, 2019 at 11:16:24AM +0200, Ján Tomko wrote:
On Sun, Aug 18, 2019 at 06:45:29PM -0300, Daniel Henrique Barboza wrote:
> For some architectures and setups, device removal can take
> longer than the default 5 seconds. This causes commands
> such as 'virsh setvcpus' to report timeout errors even if
> the operation actually completed in the guest, which is
> confusing for the user.
>
The commit that introduced this error message:
commit e3229f6e4461cd1721dc68a32e16ab1718ae716e
qemu: hotplug: Add support for VCPU unplug
specifically says that we treat this differently than regular device
detach:
As the new code is using device_del all the implications of using it
are present. Contrary to the device deletion code, the vcpu deletion
code fails if the unplug request is not executed in time.
Technically, we already did execute the unplug request so we lie to the
user saying "operation failed".
Maybe we can revisit the decision? [cc-ing pkrempa who added this]
> This patch adds a new qemu.conf parameter called 'unplug_timeout'
> to handle these cases. If left unset, the current default
> timeout is used. To avoid user 'experimentation' with small
> timeouts, the current timeout is also the minimal value
> allowed.
>
The reason for this timeout is that we originally promised something
that we cannot deliver - a synchronous device detach API, while the
operation itself is asynchronous. I'm not a fan of exposing it and
making it configurable.
I'm especially *not* a fan because the commit message says this is
a problem on certain architectures. Since we know what those arches
are, we should use a larger timeout for those arches out of the box.
Requiring the admin to set a config param to fix those architectures
is a super unpleasant out-of-the-box experience.
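
As a rough sketch of what "a larger timeout for those arches out of the
box" could look like: the lookup below falls back to the current 5 second
default unless the guest architecture has its own entry. This is only an
illustration in Python, not libvirt code. The function name, the table,
the "example-slow-arch" key and its 10 second value are all made-up
placeholders, since the thread does not name the affected architectures
or the timeout they would need.

```python
# Current default mentioned in the thread: 5 seconds.
DEFAULT_UNPLUG_TIMEOUT_SECS = 5

# Hypothetical per-architecture overrides. The key and value here are
# placeholders for illustration only; the thread does not say which
# architectures are slow or how long they need.
ARCH_UNPLUG_TIMEOUT_SECS = {
    "example-slow-arch": 10,
}


def default_unplug_timeout(arch: str) -> int:
    """Return the unplug timeout (seconds) to use for a guest architecture.

    Architectures without an override keep the existing default, so
    nothing changes for them and the admin never has to touch a config
    file.
    """
    return ARCH_UNPLUG_TIMEOUT_SECS.get(arch, DEFAULT_UNPLUG_TIMEOUT_SECS)
```

With something like this, default_unplug_timeout("x86_64") keeps today's
behaviour, and only the listed architectures wait longer.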
Regards,
Daniel
--
|: https://berrange.com         -o-  https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org          -o-  https://fstop138.berrange.com :|
|: https://entangle-photo.org   -o-  https://www.instagram.com/dberrange :|