Hi,
For this bug, by checking the object's type is valid or not will always
be true.
We should check for the object's value type instead.
I will send the V2 patch.
Thanks,
BRs
Xiubo
On 07/07/2016 09:12, Xiubo Li wrote:
Hi All,
This patch I have test for more than 3 days using 5 different scripts at
the same time, and the test scripts and libvirtd are all running happily
till now.
Thanks,
BRs
Xiubo
On 06/07/2016 15:21, Xiubo Li wrote:
> These days, we experienced one qemu guest agent is not available
> error. Then the guest agent couldn't be used forever before rebooting
> the libvirtd daemon or reboot the qemu-guest-agent in the VM(you can
> also reboot the VM) to clear the priv->agentError.
>
> This is one bug for a long time in many verisons of libvirtd:
>
https://bugzilla.redhat.com/show_bug.cgi?id=1090551
>
> And there is one python script to reproduce this bug from the bugzilla:
>
https://github.com/aspirer/study/blob/master/qemu-guest-agent/test2.py
> Just set the timeout to 0 at Line26, you can reproduce it very quickly:
> rsp = libvirt_qemu.qemuAgentCommand(dom, cmd, 1, 0) // 1 -> 0
>
> The reason why this could happen is that:
>
> In case we received something like: {"return": {}} for example, it is
> likely that somebody:
>
> 1) Started GA which sent one "guest-ping" command with seconds equals
> to VIR_DOMAIN_QEMU_AGENT_COMMAND_NOWAIT(0), which makes this function
> return "immediately" without waiting, and sleeps in virCondWait().
>
> 2) While in AgentIO thread the "guest-ping" command is sent successfully,
> and then waiting for reply.
>
> 3) Then the GA thread is woken up and the mon->msg is set to NULL and
> exit.
>
> 4) Now the AgentIO has received the reply for "guest-ping" command with
> the mon->msg == NULL.
>
> Before we just check the guest-sync case, and should also check for all
> the other GA commands.
>
> I have test many of the GA commands, which all can reporduce this bug.
> This patch will check if this is the case and don't report an error but
> return silently.
>
> Signed-off-by: Xiubo Li <lixiubo(a)cmss.chinamobile.com>
> Signed-off-by: Zhuoyu Zhang <zhangzhuoyu(a)cmss.chinamobile.com>
> Signed-off-by: Wei Tang <tangwei(a)cmss.chinamobile.com>
> Signed-off-by: Yaowei Bai <baiyaowei(a)cmss.chinamobile.com>
> Signed-off-by: Qide Chen <chenqide(a)cmss.chinamobile.com>
> ---
> src/qemu/qemu_agent.c | 30 +++++++++++++++++++++---------
> src/util/virjson.h | 7 +++++++
> 2 files changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c
> index eeede6b..5f08ba0 100644
> --- a/src/qemu/qemu_agent.c
> +++ b/src/qemu/qemu_agent.c
> @@ -308,7 +308,6 @@ qemuAgentIOProcessLine(qemuAgentPtr mon,
> {
> virJSONValuePtr obj = NULL;
> int ret = -1;
> - unsigned long long id;
>
> VIR_DEBUG("Line [%s]", line);
>
> @@ -333,16 +332,29 @@ qemuAgentIOProcessLine(qemuAgentPtr mon,
> obj = NULL;
> ret = 0;
> } else {
> - /* If we've received something like:
> - * {"return": 1234}
> - * it is likely that somebody started GA
> - * which is now processing our previous
> - * guest-sync commands. Check if this is
> - * the case and don't report an error but
> + /* In case we received something like:
> + * {"return": {}} for example,
> + * it is likely that somebody:
> + *
> + * 1) Started GA which sent one "guest-ping" command with
> + * seconds equals to VIR_DOMAIN_QEMU_AGENT_COMMAND_NOWAIT(0)
> + * makes this function return immediately without waiting,
> + * and sleeps in virCondWait().
> + *
> + * 2) While in AgentIO thread the "guest-ping" command is
> sent
> + * successfully, and the waiting for reply.
> + *
> + * 3) Then the GA thread is woken up and the mon->msg is set
> + * to NULL and exit.
> + *
> + * 4) Now the AgentIO has received the reply for
> "guest-ping"
> + * command with the mon->msg == NULL.
> + *
> + * Check if this is the case and don't report an error but
> * return silently.
> */
> - if (virJSONValueObjectGetNumberUlong(obj, "return", &id)
> == 0) {
> - VIR_DEBUG("Ignoring delayed reply to guest-sync:
> %llu", id);
> + if (virIsValidJSONType(obj)) {
> + VIR_DEBUG("Ignoring delayed command reply");
> ret = 0;
> goto cleanup;
> }
> diff --git a/src/util/virjson.h b/src/util/virjson.h
> index 66ed48a..d36ce8e 100644
> --- a/src/util/virjson.h
> +++ b/src/util/virjson.h
> @@ -80,6 +80,13 @@ struct _virJSONValue {
> } data;
> };
>
> +static inline bool
> +virIsValidJSONType(virJSONValuePtr object)
> +{
> + return ((object->type >= VIR_JSON_TYPE_OBJECT)
> + && (object->type <= VIR_JSON_TYPE_BOOLEAN));
> +}
> +
> void virJSONValueFree(virJSONValuePtr value);
>
> int virJSONValueObjectCreate(virJSONValuePtr *obj, ...)
>