
On 27.09.2018 14:37, Wang Yechao wrote:
qemuAgentClose and qemuAgentIO have race condition, as follows:
main thread: second thread: virEventPollRunOnce processSerialChangedEvent virEventPollDispatchHandles virMutexUnlock(&eventLoop.lock) qemuAgentClose virObjectLock(mon) virEventRemoveHandle VIR_FORCE_CLOSE(mon->fd) virObjectUnlock(mon) priv->agentError = false qemuAgentIO virObjectLock(mon) mon->fd != fd --> error = true qemuProcessHandleAgentError priv->agentError = true virObjectUnlock(mon) virMutexLock(&eventLoop.lock)
qemuAgentClose set the mon->fd to '-1', and then qemuAgentIO check the mon->fd not equals to fd that registered before. qemuProcessHandleAgentError will be called to set priv->agentError to 'true', then the priv->agentError is always 'true' except restart libvirtd or restart qemu-guest-agent process in guest. We can't send any qemu-agent-command anymore even if qemuConnectAgent return success later.
This is accidently occurs when hot-add-vcpu in windows2012. virsh setvcpus ... virsh qemu-agent-command $vm '{"execute":"guest-get-vcpus"}'
Reset the priv->agentError to 'false' when qemuConnectAgent sucess to fix this problem.
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> --- src/qemu/qemu_process.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 29b0ba1..4fbb955 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -269,6 +269,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) virResetLastError(); }
+ priv->agentError = false; return 0; }
There was similar problem with qemu monitor: commit 89563efc0209b854d2b2e554423423d7602acdbd Author: Daniel P. Berrange <berrange@redhat.com> Date: Fri Sep 28 15:27:39 2012 +0100 Avoid bogus I/O event errors when closing the QEMU monitor After calling qemuMonitorClose(), it is still possible for the QEMU monitor I/O event callback to get invoked. This will trigger an error message because mon->fd has been set to -1 at this point. Silently ignore the case where mon->fd is -1, likewise for mon->watch being zero. Signed-off-by: Daniel P. Berrange <berrange@redhat.com This approach has advantage it does not set agentError at all. So clients see agent state as disconnected (which is true) instead of error. Nikolay