On 21/08/2013 15:30, Michael S. Tsirkin wrote:
On Wed, Aug 21, 2013 at 02:01:17PM +0200, Paolo Bonzini wrote:
> After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
> The reason for this is that events are edge-triggered, and can be lost if
> management dies at the wrong time. Stopping a panicked VM lets management
> know of a panic even if it has crashed; management can learn about the
> panic when it restarts and queries running QEMU processes. The downside
> is of course that the VM will be paused while management is not running,
> but that is acceptable if it only happens with explicit "-device pvpanic".
>
> Upon learning of a panic, management (if configured to do so) can pick a
> variety of behaviors: leave the VM paused, reset it, destroy it. In
> addition to all of these behaviors, it is possible to dump the VM core
> from the host.
>
> However, right now, the panicked state is irreversible, and can only be
> exited by resetting the machine. This means that any policy decision
> is entirely in the hands of the host. In particular there is no way to
> use the "reboot on panic" option together with pvpanic.
>
> This patch makes the panicked state reversible (and removes various
> workarounds that were there because of the state being irreversible).
> With this change, management has a wider set of possible policies: it
> can just log the crash and leave policy to the guest, or it can leave the
> VM paused. In particular, the "log the crash and continue" policy is implemented
> simply by sending a "cont" as soon as management learns about the panic.
> Management could also implement the "irreversible paused state" itself.
> And again, all such actions can be coupled with dumping the VM core.
>
> Unfortunately we cannot change the behavior of 1.6.0. Thus, even if
> it uses "-device pvpanic", management should check for "cont" failures.
> If "cont" fails, management can then log that the VM remained paused
> and urge the administrator to update QEMU.
>
> I suggest that this patch be included in a 1.6.1 release as soon as
> possible, and perhaps in the 1.5 branch too.
>
> Cc: qemu-stable(a)nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
OK this does sound reasonable, but it looks like current behaviour
was intentional, so I wonder why was it put in place.
Any idea?

Yes, it was intentional and it also sounded reasonable at the time. The
gdbstub.c hack definitely should have raised a warning sign, though.

I suspect it was done this way simply because Xen has a "crashed" state
that behaves exactly like that, with policy determined exclusively by
the host. And it has the same problems, in fact, even though I never
heard anyone complain about it for some weird reason...

With this patch, the same thing can be implemented at the libvirt level,
and the GUEST_PANICKED runstate now matches the semantics of watchdogs
too, so this solution is not only easier to use, but also more
consistent with the rest of QEMU.

Paolo
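
To make the effect of the change concrete, here is a minimal standalone
sketch (not the actual vl.c code) that models only the transitions out of
the panicked state before and after the patch. The RS_* names, the two
tables and the helpers transition_allowed() and cont_allowed() are
hypothetical and only mirror the identifiers in the diff; it also assumes
that "cont" is refused whenever the current state needs a reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, reduced set of run states; names only mirror the patch. */
typedef enum {
    RS_RUNNING,
    RS_PAUSED,
    RS_DEBUG,
    RS_FINISH_MIGRATE,
    RS_GUEST_PANICKED,
    RS_MAX
} RunState;

typedef struct {
    RunState from;
    RunState to;
} Transition;

/* Transitions out of the panicked state before the patch. */
static const Transition panicked_before[] = {
    { RS_GUEST_PANICKED, RS_PAUSED },
    { RS_GUEST_PANICKED, RS_FINISH_MIGRATE },
    { RS_GUEST_PANICKED, RS_DEBUG },
};

/* Transitions out of the panicked state after the patch. */
static const Transition panicked_after[] = {
    { RS_GUEST_PANICKED, RS_RUNNING },
    { RS_GUEST_PANICKED, RS_FINISH_MIGRATE },
};

/* Linear lookup of a (from, to) pair in a transition table. */
static bool transition_allowed(const Transition *table, size_t n,
                               RunState from, RunState to)
{
    assert(from < RS_MAX && to < RS_MAX);
    for (size_t i = 0; i < n; i++) {
        if (table[i].from == from && table[i].to == to) {
            return true;
        }
    }
    return false;
}

/*
 * Simplified model of "cont" issued while panicked (an assumption of this
 * sketch): it is refused if the state needs a reset, and otherwise it
 * requires a valid GUEST_PANICKED -> RUNNING transition.
 */
static bool cont_allowed(const Transition *table, size_t n,
                         bool panicked_needs_reset)
{
    if (panicked_needs_reset) {
        return false;
    }
    return transition_allowed(table, n, RS_GUEST_PANICKED, RS_RUNNING);
}
```

In this model, cont_allowed(panicked_before, 3, true) is false, which
corresponds to the "cont" failure that management should expect from 1.6.0,
while cont_allowed(panicked_after, 2, false) is true.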
> ---
> gdbstub.c | 3 ---
> vl.c | 6 ++----
> 2 files changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/gdbstub.c b/gdbstub.c
> index 35ca7c2..747e67d 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s)
> #ifdef CONFIG_USER_ONLY
> s->running_state = 1;
> #else
> - if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
> - runstate_set(RUN_STATE_DEBUG);
> - }
> if (!runstate_needs_reset()) {
> vm_start();
> }
> diff --git a/vl.c b/vl.c
> index 25b8f2f..818d99e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
> { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
> { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>
> - { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
> + { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
> { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
> - { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },
>
> { RUN_STATE_MAX, RUN_STATE_MAX },
> };
> @@ -685,8 +684,7 @@ int runstate_is_running(void)
> bool runstate_needs_reset(void)
> {
> return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
> - runstate_check(RUN_STATE_SHUTDOWN) ||
> - runstate_check(RUN_STATE_GUEST_PANICKED);
> + runstate_check(RUN_STATE_SHUTDOWN);
> }
>
> StatusInfo *qmp_query_status(Error **errp)
> --
> 1.8.3.1