On 11.03.2020 12:38, Michal Privoznik wrote:
On 3/5/20 3:47 PM, Nikolay Shirokovskiy wrote:
> needReply added in [1] looks redundant. Indeed it is set to false only
> when mon->await_event is set too (the only exception qemuAgentFSTrim
> which is mistaken).
>
> However it fixes the issue when qemuAgentCommand exits on error path and
> mon->await_event is not reset. Let's instead reset mon->await_event
properly.
>
> Also remove "Woken up by event" debug message as it can be misleading.
> We can get it also if monitor is closed due to serial changed event
> currently. Anyway both qemuAgentClose and qemuAgentNotifyEvent log
> itself.
>
> [1] qemu: make sure agent returns error when required data are missing
>
> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy(a)virtuozzo.com>
> ---
> src/qemu/qemu_agent.c | 47 ++++++++++++++++++++-----------------------
> 1 file changed, 22 insertions(+), 25 deletions(-)
>
> diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c
> index 2de53efb7a..605c9f563e 100644
> --- a/src/qemu/qemu_agent.c
> +++ b/src/qemu/qemu_agent.c
> @@ -1112,7 +1112,6 @@ static int
> qemuAgentCommand(qemuAgentPtr mon,
> virJSONValuePtr cmd,
> virJSONValuePtr *reply,
> - bool needReply,
> int seconds)
> {
> int ret = -1;
> @@ -1121,17 +1120,16 @@ qemuAgentCommand(qemuAgentPtr mon,
> int await_event = mon->await_event;
> *reply = NULL;
> + memset(&msg, 0, sizeof(msg));
> if (!mon->running) {
> virReportError(VIR_ERR_AGENT_UNRESPONSIVE, "%s",
> _("Guest agent disappeared while executing
command"));
> - return -1;
> + goto cleanup;
> }
> if (qemuAgentGuestSync(mon) < 0)
> - return -1;
> -
> - memset(&msg, 0, sizeof(msg));
> + goto cleanup;
> if (!(cmdstr = virJSONValueToString(cmd, false)))
> goto cleanup;
> @@ -1149,9 +1147,7 @@ qemuAgentCommand(qemuAgentPtr mon,
> /* If we haven't obtained any reply but we wait for an
> * event, then don't report this as error */
> if (!msg.rxObject) {
> - if (await_event && !needReply) {
> - VIR_DEBUG("Woken up by event %d", await_event);
> - } else {
Please keep this debug printing. It doesn't hurt anything and has potential of
helping us to debug.
This debug line is not correct. I reflected it in the commit message.
I actually get this line during debug session when agent monitor is closed
during VM shutdown on serial event and not shutdown event. Serial event
just comes first.
> + if (!await_event) {
> if (mon->running)
> virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> _("Missing monitor reply object"));
> @@ -1169,6 +1165,7 @@ qemuAgentCommand(qemuAgentPtr mon,
> cleanup:
> VIR_FREE(cmdstr);
> VIR_FREE(msg.txBuffer);
> + mon->await_event = QEMU_AGENT_EVENT_NONE;
This is the part I don't like about this patch. The rest looks fine. Why is this
needed? Either the mon->await_event is not set by the caller, or if set the the event
handler will clear it out by calling qemuAgentNotifyEvent(). Or am I missing something?
The whole point of the patch is in this line)
Yeah, you are talking of success paths. But on error paths mon->await_event is not
cleared currently so next command which need reply
can be finish on await event without needReply which lead to SIGSGEV. This is what patch
[1] fixes by introducing needReply which don't need to be cleared
as it is on stack.
I guess scenario is next given the bug [2] mentioned in [1]:
- vm shutdown command is sent to GA
- it is finished with timeout (60 s), mon->await_event is not cleared
- guest ping is sent
- guest shutdowned and ping command get awakend by this event as mon->await_event is
set (bug)
So needReply argument is helpful but it is cleaner just to reset mon->await_event
properly.
[1] qemu: make sure agent returns error when required data are missing
[2]
https://bugzilla.redhat.com/show_bug.cgi?id=1058149
Nikolay
> return ret;
> }
> @@ -1269,7 +1266,7 @@ int qemuAgentShutdown(qemuAgentPtr mon,
> mon->await_event = QEMU_AGENT_EVENT_RESET;
> else
> mon->await_event = QEMU_AGENT_EVENT_SHUTDOWN;
> - ret = qemuAgentCommand(mon, cmd, &reply, false,
> + ret = qemuAgentCommand(mon, cmd, &reply,
> VIR_DOMAIN_QEMU_AGENT_COMMAND_SHUTDOWN);
> virJSONValueFree(cmd);
> @@ -1312,7 +1309,7 @@ int qemuAgentFSFreeze(qemuAgentPtr mon, const char
**mountpoints,
> if (!cmd)
> goto cleanup;
> - if (qemuAgentCommand(mon, cmd, &reply, true, mon->timeout) < 0)
> + if (qemuAgentCommand(mon, cmd, &reply, mon->timeout) < 0)
> goto cleanup;
> if (virJSONValueObjectGetNumberInt(reply, "return", &ret) <
0) {
> @@ -1349,7 +1346,7 @@ int qemuAgentFSThaw(qemuAgentPtr mon)
> if (!cmd)
> return -1;
> - if (qemuAgentCommand(mon, cmd, &reply, true, mon->timeout) < 0)
> + if (qemuAgentCommand(mon, cmd, &reply, mon->timeout) < 0)
> goto cleanup;
> if (virJSONValueObjectGetNumberInt(reply, "return", &ret) <
0) {
> @@ -1386,7 +1383,7 @@ qemuAgentSuspend(qemuAgentPtr mon,
> return -1;
> mon->await_event = QEMU_AGENT_EVENT_SUSPEND;
> - ret = qemuAgentCommand(mon, cmd, &reply, false, mon->timeout);
> + ret = qemuAgentCommand(mon, cmd, &reply, mon->timeout);
> virJSONValueFree(cmd);
> virJSONValueFree(reply);
> @@ -1415,7 +1412,7 @@ qemuAgentArbitraryCommand(qemuAgentPtr mon,
> if (!(cmd = virJSONValueFromString(cmd_str)))
> goto cleanup;
> - if ((ret = qemuAgentCommand(mon, cmd, &reply, true, timeout)) < 0)
> + if ((ret = qemuAgentCommand(mon, cmd, &reply, timeout)) < 0)
> goto cleanup;
> if (!(*result = virJSONValueToString(reply, false)))
> @@ -1442,7 +1439,7 @@ qemuAgentFSTrim(qemuAgentPtr mon,
> if (!cmd)
> return ret;
> - ret = qemuAgentCommand(mon, cmd, &reply, false, mon->timeout);
> + ret = qemuAgentCommand(mon, cmd, &reply, mon->timeout);
Oh this is fun. This 'false' you are removing should have been true as the
command does have a reply that we need to wait for.
> virJSONValueFree(cmd);
> virJSONValueFree(reply);
Michal