[RFC] Reintroducing customisable hotplug timeouts
I'd like to resurface an earlier proposal [1] to make the unplug timeout user-configurable. With production guests running sustained, non-trivial workloads, we find that hot(un)plug operations regularly take longer than the current default of 5s. However, the inability to easily configure this timeout means that in the worst case a recompilation with appropriate values set is required for each deployment, which is cumbersome to maintain and patch.

The previous proposal [1] also initially made the timeout configurable, but specifically for the ppc64 architecture. Imposing that change for an entire architecture made sense, since it was a universal requirement for that platform. On x86, where deployment environments are far more diverse and varied, a single, non-configurable value makes less sense. Hard-coding a larger value is not ideal for x86 because the desired timeout depends very heavily on the deployment environment in use - if the operation is driven interactively by a human, anything more than 10s is an undesirable user experience [2], as pointed out in the original discussion. However, more relaxed timeout values (e.g. 15-20s) can be acceptable if the operations run in the background and are not directly user-facing. It is in these cases that the ability to change the timeout easily becomes desirable.

To prevent misconfiguration, while maintaining the same out-of-box experience for existing users, I'd also like to propose the following:

1. As mentioned in the original proposal [1], the timeout cannot be set to a value lower than the default of 5s; in that case a warning will be emitted and the timeout will be reset to the default value.

2. Including an upper cap would also be desirable, in my opinion - no API consumer should have to wait endlessly for an async operation to finish.

3. This configuration option will only be described in qemu.conf, not imposed - the out-of-box experience will remain unchanged at the current default (5s) for everyone who is unaware of this value or does not wish to tweak it. An administrator who knowingly changes these values is therefore not blindsided by any perceived "hangs" resulting from the use of an increased timeout window.

Since this would be a clearly documented and explicitly opt-in behaviour, the benefits in the form of easy modification based on evolving operational needs would be quite significant. Given the varying nature of production environments and specific operational needs at scale, it would be beneficial to reconsider the merits of the original proposal.

[1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/TXYVZ...
[2] https://www.nngroup.com/articles/response-times-3-important-limits/
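A purely illustrative sketch of how such an opt-in knob might look in qemu.conf; the option name 'unplug_timeout', its placement, and the bounds are hypothetical - no such setting exists in libvirt today:

    # Hypothetical example only - not an existing libvirt option.
    # Seconds to wait for the guest to confirm a hot(un)plug request.
    # Values below the built-in default (5) would be reset to the default
    # with a warning, and an upper cap would bound the wait.
    #unplug_timeout = 15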
On Tue, Jan 13, 2026 at 09:47:48 +0530, Akash Kulhalli via Devel wrote:
I'd like to resurface an earlier proposal [1] to make the unplug timeout user-configurable. With production guests running sustained, non-trivial workloads, we find that hot(un)plug operations regularly take longer than the current default of 5s. However, the inability to easily configure this timeout
Since hot-unplug is an asynchronous operation and we already have an asynchronous API (virDomainDetachDeviceAlias), I think that is the proper solution. No amount of timeouts will fix the inherent problems with it being async and guest-OS controlled.
means that in the worst case a recompilation with appropriate values set is required for each deployment, which is cumbersome to maintain and patch.
You really shouldn't do this. If your application does care about the unplug, you need to wait for the event and use the async API.
Completely agree with your views — no amount of time is truly sufficient in these scenarios. However, when debugging these issues, it’s crucial that we provide a larger window to ensure we can capture the right guest context. To that end, we’re exploring the possibility of injecting an NMI, controlled via a configuration, in case a CPU hot-plug timeout issue arises (using qemuMonitorInjectNMI). The current 5-second timeout could indeed be too aggressive in some cases. To support better debugging within the guest, would it be possible to introduce a combination of a configurable timeout followed by the NMI to capture the guest core state? This would give us more flexibility and a better chance of diagnosing the issue effectively. Looking forward to hearing your thoughts!
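As a side note: a client-side approximation of the debugging half of this idea is already possible with the existing public API - once the application's own deadline for the unplug passes, it can inject the NMI itself. A minimal sketch, assuming 'dom' is a virDomainPtr for the affected guest and the usual <stdio.h> and <libvirt/libvirt.h> headers are included; this is not the daemon-internal qemuMonitorInjectNMI path discussed above:

    /* Best-effort NMI to make the guest dump state for later diagnosis,
     * issued by the management application after its own deadline expired. */
    if (virDomainInjectNMI(dom, 0) < 0)
        fprintf(stderr, "failed to inject NMI for guest diagnostics\n");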
On Wed, Jan 14, 2026 at 11:17:21 -0000, partha.satapathy--- via Devel wrote:
Completely agree with your views — no amount of time is truly sufficient in these scenarios. However, when debugging these issues, it’s crucial that we provide a larger window to ensure we can capture the right guest context.
To that end, we’re exploring the possibility of injecting an NMI, controlled via a configuration, in case a CPU hot-plug timeout issue arises (using qemuMonitorInjectNMI). The current 5-second timeout could indeed be too aggressive in some cases. To support better debugging within the guest, would it be possible to introduce a combination of a configurable timeout followed by the NMI to capture the guest core state? This would give us more flexibility and a better chance of diagnosing the issue effectively.
Once again, the asynchronous API I've mentioned in the reply you've trimmed has no timeout. It just submits the device removal request to the VM. Success of that API only means that the device unplug request was sent to the VM. You need to act based on whether you've received the VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event or not, in any timeframe your application desires, thus giving you full control of any policy you might want to apply.

https://www.libvirt.org/html/libvirt-libvirt-domain.html#VIR_DOMAIN_EVENT_ID...

The existence of the APIs which do apply a timeout is only a quirk of their implementation, as historically hotunplug was considered synchronous. The APIs needed to be retrofitted to behave as if the operation were synchronous, but it is not, and that creates the weird situations. That's why the async API exists, which is very upfront about being async and about requiring the user to watch for the event.
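A minimal sketch of the event-driven flow described above (not a drop-in implementation; 'conn' and 'dom' are assumed to be an already-open connection and a running domain, and virEventRegisterDefaultImpl() is assumed to have been called before the connection was opened so that virEventRunDefaultImpl() can drive event delivery):

    #include <stdio.h>
    #include <libvirt/libvirt.h>

    static int removed;

    static void
    device_removed_cb(virConnectPtr conn, virDomainPtr dom,
                      const char *devAlias, void *opaque)
    {
        fprintf(stderr, "device '%s' is gone\n", devAlias);
        removed = 1;
    }

    static int
    detach_and_wait(virConnectPtr conn, virDomainPtr dom, const char *alias)
    {
        int cbid = virConnectDomainEventRegisterAny(conn, dom,
                        VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED,
                        VIR_DOMAIN_EVENT_CALLBACK(device_removed_cb),
                        NULL, NULL);
        if (cbid < 0)
            return -1;

        /* Success here only means the unplug request was delivered to the guest. */
        if (virDomainDetachDeviceAlias(dom, alias, VIR_DOMAIN_AFFECT_LIVE) < 0)
            goto cleanup;

        /* The application owns the policy: wait forever, or give up after any
         * deadline it likes - libvirt applies no timeout on this path. */
        while (!removed)
            virEventRunDefaultImpl();

     cleanup:
        virConnectDomainEventDeregisterAny(conn, cbid);
        return removed ? 0 : -1;
    }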
Understood, thanks for the info.

How would this then work for cpu hotunplug operations? Since aliases cannot be assigned to individual vcpus, virDomainDetachDeviceAlias cannot be used in that case. Is the caller still expected to wait for the same device_deleted event in case the virDomainSetVcpusFlags call times out, and any additional timeout can be adjusted in the client application as required? This would somewhat mirror the semantics of the async API you've mentioned earlier.

If I were to take the case of qemu: qemuDomainHotplugDelVcpu clubs both failures and timeouts in the same error code (-1). So there does not appear to be a way to actually identify if the request just timed out, or there was an error during the operation, due to which its caller (qemuDomainSetVcpusLive) only sees a return of (-1) and bubbles the same error code up the stack. This is a problem if the application is expected to wait for the event, akin to the async API. Any insight into this would be extremely helpful.

I did a quick test with `virsh event` (using libvirt 9.x and qemu 7.2.0), and cpu hotunplug events do not seem to make it to virsh. I can confirm it's definitely registered for the event, because I can see the same event for other alias-based detachment operations that I try (e.g. disk/iface) in the same invocation. The libvirtd log, however, includes the successful event that it received. Not *very* relevant to the discussion at the moment, just thought I'd mention it to see if I was doing something wrong.
On Tue, Jan 20, 2026 at 12:21:17 -0000, akash.kulhalli--- via Devel wrote:

Please avoid deleting the context. It makes it hard to see the context of your reply.
Understood, thanks for the info.
How would this then work for cpu hotunplug operations? Since aliases cannot be assigned to individual vcpus, virDomainDetachDeviceAlias cannot be used
So we have a special API, 'virDomainSetVcpu', for individual CPU hot(un)plug.
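For reference, a minimal call sketch; 'dom' and the vCPU id "3" are only placeholders:

    /* Request that vCPU 3 of the running guest be taken offline (unplugged);
     * passing 1 instead of 0 would bring it back online. */
    int rc = virDomainSetVcpu(dom, "3", 0, VIR_DOMAIN_AFFECT_LIVE);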
in that case. Is the caller still expected to wait for the same device_deleted event in case the virDomainSetVcpusFlags call times out, and any additional timeout
'virDomainSetVcpusFlags' is not the correct API here because it can attempt to unplug multiple cpus at once, which has uncertain outcomes.
can be adjusted in the client application as required? This would somewhat mirror the semantics of the async API you've mentioned earlier.
So while 'virDomainSetVcpu' does have the semi-synchronous behaviour we had before with the old-style APIs, we could add a new flag for the API that just skips the calls to qemuDomainWaitForDeviceRemoval and qemuDomainRemoveVcpu and returns success, to avoid the possible timeout if that's a problem for your application.
If I were to take the case of qemu: qemuDomainHotplugDelVcpu clubs both failures and timeouts in the same error code (-1). So there does not appear
Everything will return -1 as failure because that's the RPC's implementation limitation. We do have error codes in the error object though ...
to be a way to actually identify if the request just timed out, or there was an error during the operation, due to which its caller (qemuDomainSetVcpusLive) only sees a return of (-1) and bubbles the same error code up the stack.
... so while you see a -1 here, the error code will be VIR_ERR_OPERATION_TIMEOUT from this API only in the case when the guest OS didn't cooperate.
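A sketch of how a client could separate the two cases after such a call; the helper name is made up, and virGetLastErrorCode() comes from <libvirt/virterror.h>:

    /* Returns 0 if the vCPU was released within the driver's window, 1 if the
     * request was delivered but the guest hasn't released the vCPU yet (keep
     * watching for the removal), and -1 on a real error issuing the request. */
    static int
    try_unplug_vcpu(virDomainPtr dom, const char *vcpu)
    {
        if (virDomainSetVcpu(dom, vcpu, 0, VIR_DOMAIN_AFFECT_LIVE) == 0)
            return 0;

        if (virGetLastErrorCode() == VIR_ERR_OPERATION_TIMEOUT)
            return 1;

        return -1;
    }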
This is a problem if the application is expected to wait for the event, akin to the async API. Any insight into this would be extremely helpful.
I did a quick test with `virsh event` (using libvirt 9.x and qemu 7.2.0), and cpu hotunplug events do not seem to make it to virsh. I can confirm it's definitely registered for the event because I can see the same event for other alias-based detachment operations that I try (e.g. disk/iface) in the same invocation. The libvirtd log however includes the successful event that it received. Not *very* relevant to the discussion at the moment, just thought I'd mention it to see if I was doing something wrong.
Ah, no, you are doing things correctly here. Unfortunately we indeed don't send out the device-deleted event on cpu unplug. Specifically 'processDeviceDeletedEvent' calls 'qemuDomainRemoveVcpuAlias', which doesn't actually fire off an event. I guess we could either define a schema for vCPU aliases so the normal event can be used, or introduce another event (quite unpleasant work though).