On 10/22/18 3:36 AM, Nikolay Shirokovskiy wrote:
>
>
> On 18.10.2018 18:28, John Ferlan wrote:
>> When qemuProcessReconnectHelper was introduced (commit d38897a5d)
>> reconnection failure used VIR_DOMAIN_SHUTOFF_FAILED; however, that
>> was changed in commit bda2f17d to either VIR_DOMAIN_SHUTOFF_CRASHED
>> or VIR_DOMAIN_SHUTOFF_UNKNOWN.
>>
>> When QEMU_CAPS_NO_SHUTDOWN checking was removed in commit fe35b1ad6
>> the conditional state was just left at VIR_DOMAIN_SHUTOFF_CRASHED.
>>
>> Restore the logic adjustment using the same conditions as command
>> line building and alter the comment to describe the reasoning.
>>
>> Signed-off-by: John Ferlan <jferlan(a)redhat.com>
>> ---
>>
>> This was "discovered" while reviewing Nikolay's patch related
>> to adding "DAEMON" as the shutdown reason:
>>
>>
https://www.redhat.com/archives/libvir-list/2018-October/msg00493.html
>>
>> While working through the iterations of that - figured I'd at
>> least post this.
>>
>>
>> src/qemu/qemu_process.c | 15 ++++++++++-----
>> 1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index 3955eda17c..4a39111d51 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -7985,11 +7985,16 @@ qemuProcessReconnect(void *opaque)
>> if (virDomainObjIsActive(obj)) {
>> /* We can't get the monitor back, so must kill the VM
>> * to remove danger of it ending up running twice if
>> - * user tries to start it again later
>> - * If we couldn't get the monitor since QEMU supports
>> - * no-shutdown, we can safely say that the domain
>> - * crashed ... */
>> - state = VIR_DOMAIN_SHUTOFF_CRASHED;
>> + * user tries to start it again later.
>> + *
>> + * If we cannot get to the monitor when the QEMU command
>> + * line used -no-shutdown, then we can safely say that the
>> + * domain crashed; otherwise we don't really know. */
>> + if (priv->monJSON && priv->allowReboot ==
VIR_TRISTATE_BOOL_YES)
>> + state = VIR_DOMAIN_SHUTOFF_CRASHED;
>> + else
>> + state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
>> +
>> /* If BeginJob failed, we jumped here without a job, let's hope
another
>> * thread didn't have a chance to start playing with the domain yet
>> * (it's all we can do anyway).
>>
>
> This is reasonable but not complete. This is only applied to case when we do
connect(2)
I understand your point; however, the goal of this patch wasn't to fix
the various other failure scenarios/reasons. It was to merely restore
the lost UNKNOWN reason in the event that either priv->monJSON == false
(unlikely) or priv->allowReboot != VIR_TRISTATE_BOOL_YES (possible).
Rather than copy-pasta, I'll create qemuDomainIsUsingNoShutdown(priv)
which will perform the condition checks for both command line generation
and reconnection failure paths.
I'm not against this change. I hoped we can elaborate more presice solution
as discussed in
If that's not desirable, then that's fine. I won't spend more cycles on
this.
John
> to qemu and it is failed with ECONNREFUSED which means qemu process does not exists
> (in which case it is either crashed if was configured with -no-shutdown or
terminated
> for unknown reason) or just closed monitor connection (in this case we are going
> to terminate it by ourselves).
>
> But there are other cases. For example we can jump to error path even before
connect(2)
> or after successfull connect. See discussion of such cases in [1].
>
> [1]
https://www.redhat.com/archives/libvir-list/2018-October/msg01031.html
>
> Nikolay
>