
On 8/30/19 5:24 AM, Daniel P. Berrangé wrote:
On Mon, Aug 19, 2019 at 11:16:24AM +0200, Ján Tomko wrote:
On Sun, Aug 18, 2019 at 06:45:29PM -0300, Daniel Henrique Barboza wrote:
For some architectures and setups, device removal can take longer than the default 5 seconds. This causes commands such as 'virsh setvcpus' to report a timeout error even when the operation actually succeeded in the guest, which confuses the user.
The commit that introduced this error message:

  commit e3229f6e4461cd1721dc68a32e16ab1718ae716e
  qemu: hotplug: Add support for VCPU unplug

specifically says that we treat this differently than regular device detach:

  As the new code is using device_del all the implications of using it are present. Contrary to the device deletion code, the vcpu deletion code fails if the unplug request is not executed in time.
Technically, we did already execute the unplug request, so we are lying to the user when we say "operation failed".
Maybe we can revisit the decision? [cc-ing pkrempa who added this]
This patch adds a new qemu.conf parameter called 'unplug_timeout' to handle these cases. If left unset, the current default timeout is used. To avoid user 'experimentation' with small timeouts, the current timeout is also the minimum value allowed.
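The rule described for 'unplug_timeout' (use the default when unset, reject anything smaller than the default) could be sketched roughly as below. This is an illustrative Python sketch only, not the libvirt implementation: the function and constant names are hypothetical, and only the 5-second default and the parameter name come from this thread.

```python
# Hypothetical sketch of the 'unplug_timeout' rule from the patch description.
# Only the 5-second default is taken from the thread; all names are made up.
DEFAULT_UNPLUG_TIMEOUT_S = 5


def resolve_unplug_timeout(configured=None):
    """Return the effective unplug timeout in seconds.

    configured -- value read from the config file, or None if unset.
    """
    if configured is None:
        # Parameter left unset: keep the current default.
        return DEFAULT_UNPLUG_TIMEOUT_S
    if configured < DEFAULT_UNPLUG_TIMEOUT_S:
        # The default doubles as the minimum, so smaller values are rejected.
        raise ValueError(
            "unplug_timeout must be at least %d seconds" % DEFAULT_UNPLUG_TIMEOUT_S
        )
    return configured
```

With this sketch, an unset value resolves to 5, a value of 30 is accepted as-is, and a value of 2 is rejected with an error.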
The reason for this timeout is that we originally promised something that we cannot deliver - a synchronous device detach API, while the operation itself is asynchronous. I'm not a fan of exposing it and making it configurable.

I'm especially *not* a fan because the commit message says this is a problem on certain architectures. Since we know what those arches are, we should use a larger timeout for those arches out of the box. Requiring the admin to set a config param to fix those architectures is a super unpleasant out-of-the-box experience.
Good point. I'll re-send the series changing the timeout for PowerPC guests only. There's no need to impact all users for a problem that so far only impacts PPC. Thanks, DHB
Regards, Daniel
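
The alternative agreed on in this thread, a larger built-in timeout for the affected architecture instead of a config knob, could look roughly like the sketch below. It is a hedged Python illustration rather than libvirt code: the function name, the architecture prefix check, and the 10-second PowerPC value are all assumptions, since the thread does not state what the larger timeout should be.

```python
# Hypothetical sketch: choose the unplug timeout from the guest architecture.
# The 5-second default is from the thread; the PPC value is an assumption.
DEFAULT_UNPLUG_TIMEOUT_S = 5
PPC_UNPLUG_TIMEOUT_S = 10  # assumed value, for illustration only


def default_unplug_timeout(guest_arch):
    """Return the built-in unplug timeout (seconds) for a guest architecture."""
    # Assumes PowerPC guests are identified by names such as "ppc64" or "ppc64le".
    if guest_arch.startswith("ppc"):
        return PPC_UNPLUG_TIMEOUT_S
    # Every other architecture keeps the existing default.
    return DEFAULT_UNPLUG_TIMEOUT_S
```

The point of this design is that no admin action is needed: guests on other architectures keep the existing timeout, and only PowerPC guests get the longer wait.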