[libvirt] [PATCH] qemu: agent: Reset agentError when qemuConnectAgent success

qemuAgentClose and qemuAgentIO have race condition, as follows: main thread: second thread: virEventPollRunOnce processSerialChangedEvent virEventPollDispatchHandles virMutexUnlock(&eventLoop.lock) qemuAgentClose virObjectLock(mon) virEventRemoveHandle VIR_FORCE_CLOSE(mon->fd) virObjectUnlock(mon) priv->agentError = false qemuAgentIO virObjectLock(mon) mon->fd != fd --> error = true qemuProcessHandleAgentError priv->agentError = true virObjectUnlock(mon) virMutexLock(&eventLoop.lock) qemuAgentClose set the mon->fd to '-1', and then qemuAgentIO check the mon->fd not equals to fd that registered before. qemuProcessHandleAgentError will be called to set priv->agentError to 'true', then the priv->agentError is always 'true' except restart libvirtd or restart qemu-guest-agent process in guest. We can't send any qemu-agent-command anymore even if qemuConnectAgent return success later. This is accidently occurs when hot-add-vcpu in windows2012. virsh setvcpus ... virsh qemu-agent-command $vm '{"execute":"guest-get-vcpus"}' Reset the priv->agentError to 'false' when qemuConnectAgent sucess to fix this problem. Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> --- src/qemu/qemu_process.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 29b0ba1..4fbb955 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -269,6 +269,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) virResetLastError(); } + priv->agentError = false; return 0; } -- 1.8.3.1

On 27.09.2018 14:37, Wang Yechao wrote:
qemuAgentClose and qemuAgentIO have race condition, as follows:
main thread: second thread: virEventPollRunOnce processSerialChangedEvent virEventPollDispatchHandles virMutexUnlock(&eventLoop.lock) qemuAgentClose virObjectLock(mon) virEventRemoveHandle VIR_FORCE_CLOSE(mon->fd) virObjectUnlock(mon) priv->agentError = false qemuAgentIO virObjectLock(mon) mon->fd != fd --> error = true qemuProcessHandleAgentError priv->agentError = true virObjectUnlock(mon) virMutexLock(&eventLoop.lock)
qemuAgentClose set the mon->fd to '-1', and then qemuAgentIO check the mon->fd not equals to fd that registered before. qemuProcessHandleAgentError will be called to set priv->agentError to 'true', then the priv->agentError is always 'true' except restart libvirtd or restart qemu-guest-agent process in guest. We can't send any qemu-agent-command anymore even if qemuConnectAgent return success later.
This is accidently occurs when hot-add-vcpu in windows2012. virsh setvcpus ... virsh qemu-agent-command $vm '{"execute":"guest-get-vcpus"}'
Reset the priv->agentError to 'false' when qemuConnectAgent sucess to fix this problem.
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> --- src/qemu/qemu_process.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 29b0ba1..4fbb955 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -269,6 +269,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) virResetLastError(); }
+ priv->agentError = false; return 0; }
There was similar problem with qemu monitor: commit 89563efc0209b854d2b2e554423423d7602acdbd Author: Daniel P. Berrange <berrange@redhat.com> Date: Fri Sep 28 15:27:39 2012 +0100 Avoid bogus I/O event errors when closing the QEMU monitor After calling qemuMonitorClose(), it is still possible for the QEMU monitor I/O event callback to get invoked. This will trigger an error message because mon->fd has been set to -1 at this point. Silently ignore the case where mon->fd is -1, likewise for mon->watch being zero. Signed-off-by: Daniel P. Berrange <berrange@redhat.com This approach has advantage it does not set agentError at all. So clients see agent state as disconnected (which is true) instead of error. Nikolay

On 27.09.2018 14:37, Wang Yechao wrote:
qemuAgentClose and qemuAgentIO have race condition, as follows:
main thread: second thread: virEventPollRunOnce processSerialChangedEvent virEventPollDispatchHandles virMutexUnlock(&eventLoop.lock) qemuAgentClose virObjectLock(mon) virEventRemoveHandle VIR_FORCE_CLOSE(mon->fd) virObjectUnlock(mon) priv->agentError = false qemuAgentIO virObjectLock(mon) mon->fd != fd --> error = true qemuProcessHandleAgentError priv->agentError = true virObjectUnlock(mon) virMutexLock(&eventLoop.lock)
qemuAgentClose set the mon->fd to '-1', and then qemuAgentIO check the mon->fd not equals to fd that registered before. qemuProcessHandleAgentError will be called to set priv->agentError to 'true', then the priv->agentError is always 'true' except restart libvirtd or restart qemu-guest-agent process in guest. We can't send any qemu-agent-command anymore even if qemuConnectAgent return success later.
This is accidently occurs when hot-add-vcpu in windows2012. virsh setvcpus ... virsh qemu-agent-command $vm '{"execute":"guest-get-vcpus"}'
Reset the priv->agentError to 'false' when qemuConnectAgent sucess to fix this problem.
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> --- src/qemu/qemu_process.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 29b0ba1..4fbb955 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -269,6 +269,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) virResetLastError(); }
+ priv->agentError = false; return 0; }
There was similar problem with qemu monitor:
commit 89563efc0209b854d2b2e554423423d7602acdbd Author: Daniel P. Berrange <berrange@redhat.com> Date: Fri Sep 28 15:27:39 2012 +0100
Avoid bogus I/O event errors when closing the QEMU monitor
After calling qemuMonitorClose(), it is still possible for the QEMU monitor I/O event callback to get invoked. This will trigger an error message because mon->fd has been set to -1 at this point. Silently ignore the case where mon->fd is -1, likewise for mon->watch being zero.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com
This approach has advantage it does not set agentError at all. So clients see agent state as disconnected (which is true) instead of error.
Nikolay
Thanks Nikolay, this solution is better. I'll give another patch. --- Best wishes Wang Yechao
participants (3)
-
Nikolay Shirokovskiy
-
Wang Yechao
-
wang.yechao255@zte.com.cn