When qemuProcessReconnectHelper was introduced (commit d38897a5d)
reconnection failure used VIR_DOMAIN_SHUTOFF_FAILED; however, that
was changed in commit bda2f17d to either VIR_DOMAIN_SHUTOFF_CRASHED
or VIR_DOMAIN_SHUTOFF_UNKNOWN.
When QEMU_CAPS_NO_SHUTDOWN checking was removed in commit fe35b1ad6
the conditional state was just left at VIR_DOMAIN_SHUTOFF_CRASHED.
Restore the logic adjustment using the same conditions as command
line building and alter the comment to describe the reasoning.
Signed-off-by: John Ferlan <jferlan(a)redhat.com>
---
This was "discovered" while reviewing Nikolay's patch related
to adding "DAEMON" as the shutdown reason:
https://www.redhat.com/archives/libvir-list/2018-October/msg00493.html
While working through the iterations of that - figured I'd at
least post this.
src/qemu/qemu_process.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 3955eda17c..4a39111d51 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7985,11 +7985,16 @@ qemuProcessReconnect(void *opaque)
if (virDomainObjIsActive(obj)) {
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
- * user tries to start it again later
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
+ * user tries to start it again later.
+ *
+ * If we cannot get to the monitor when the QEMU command
+ * line used -no-shutdown, then we can safely say that the
+ * domain crashed; otherwise we don't really know. */
+ if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
+ state = VIR_DOMAIN_SHUTOFF_CRASHED;
+ else
+ state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
+
/* If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
This is reasonable but not complete. This is only applied to case when we do connect(2)
to qemu and it is failed with ECONNREFUSED which means qemu process does not exists
(in which case it is either crashed if was configured with -no-shutdown or terminated
for unknown reason) or just closed monitor connection (in this case we are going
to terminate it by ourselves).
But there are other cases. For example we can jump to error path even before connect(2)
or after successfull connect. See discussion of such cases in [1].
[1]