On 10/9/19 5:15 PM, Cole Robinson wrote:
On 9/11/19 5:05 PM, Daniel Henrique Barboza wrote:
> For some architectures and setups, device removal can take
> longer than the default 5 seconds. This results in commands
> such as 'virsh setvcpus' to fire timeout messages even if
> the operation were successful in the guest, confusing the
> user.
>
> This patch sets a new 10 seconds unplug timeout for PPC64
> guests. All other archs will keep the default 5 seconds
> timeout.
>
> Instead of putting 'if PPC64' conditionals inside qemu_hotplug.c
> to set the new timeout value, a new QEMU driver attribute
> 'unplugTimeout' was added. The timeout value is set during
> qemuStateInitialize only once. All qemu_hotplug.c functions
> that uses the timeout have easy access to a qemu_driver object,
> thus the change to use unplugTimeout is straightforward.
>
> The now unused 'qemuDomainRemoveDeviceWaitTime' global can't
> be simply erased from qemu_hotplug.c though. Next patch will
> remove it properly.
>
Sorry for the wrong review delay. I see this implements danpb's
suggestion from the previous thread. The implementation seems a little
odd to me though because it is differentiating on host arch, but this
is about guest arch right? And probably an arbitrary number of
options, like I imagine TCG would want a longer timeout too (though
that's not anything you need to deal with)
I think it's sensible to say that a TCG guest would always required a
greater unplug timeout than a hardware accelerated one. However, never
thought about considering TCG guests in this patch series though. A shame.
So, considering that TCG guest exists, we can parametrize this unplug
timeout
calculation by considering guest (not host) architecture and if it's a
TCG or
a KVM guest. Speaking about the problem I'm trying to fix with ppc64 and
what we already have, we can roll out this unplug timeout logic like:
- pseries guest on KVM: 10 seconds
- everyone else on KVM: 5 seconds (untouched, like it is today)
For TCG guests, perhaps double the KVM timeout? This would need
experimentation, but double the timeout seems ok at first glance. Then we
can do:
- TCG pseries guest: 10 * 2 = 20 seconds
- TCG with every other arch guest = 5 * 2 = 10 seconds
This TCG calculation can be left alone for now as well - I can create
the API
considering that TCG guest exists, but do not infer the timeout for it.
Either
way works for me.
So I think this should be a function that lives in qemu_hotplug.c and
acts on a DomainDef at least. The test suite will have to mock that in
a qemuhotplugmock.c file, tests/qemucpumock.c is a good example to
follow.
Just to check if we're on the same page, by 'I think this should be a
function'
you mean the calculation of qemuDomainRemoveDeviceWaitTime that I just
mentioned above, right?
If you do that as an upfront patch, it can go in first. Then add the
ppc changes in an add on patch.
Works for me.
Thanks,
DHB
You can CC me on the next version and I will review it
Thanks,
Cole