On 10/30/19 4:09 AM, Nikolay Shirokovskiy wrote:
If we use fake reboot, the domain goes through
running->shutdown->running
state changes, with the shutdown state lasting only a short time. At
the least, this is an implementation detail leaking into the API. There is
also one real case where this is inconvenient. I'm doing a backup with the
help of a temporary block snapshot (using the qemu API that the newly
created libvirt backup API relies on). If the guest is shut down,
I want to continue the backup, so I don't kill the process and the domain
stays in the shutdown state. Later, when the backup is finished, I want to
destroy the qemu process. So I check whether it is in the shutdown state and
destroy it if it is. Now, if the domain got a fake reboot instead of a
shutdown, I can end up destroying the process in the middle of the fake reboot.
After the shutdown event we also get a stop event and, since the domain state
is now running, it is transitioned to the paused state and back to running
later. Though this is not critical for the described case, I guess it is
better not to leak these details to the user either. So let's leave the domain
in the running state on a stop event if a fake reboot is in progress.
As we don't know when the stop event really arrives, let's move resetting the
flag to after starting the CPUs - at this point the stop event should have been handled.
Sorry this didn't receive a timely response. Unsurprisingly it's
conflicting with master now.
Conceptually what you say makes sense, that the 'shutdown' is an
implementation detail of our 'reboot' implementation and details
shouldn't leak to apps.
Here are the events I see with: sudo virsh reboot --mode=acpi f30
$ sudo virsh event --domain f30 --all --loop
event 'agent-lifecycle' for domain f30: state: 'disconnected' reason: 'channel event'
event 'lifecycle' for domain f30: Shutdown Finished after guest request
event 'reboot' for domain f30
event 'lifecycle' for domain f30: Resumed Unpaused
event 'reboot' for domain f30
event 'agent-lifecycle' for domain f30: state: 'connected' reason: 'channel event'
Using --mode=agent gives:
event 'agent-lifecycle' for domain f30: state: 'disconnected' reason: 'channel event'
event 'reboot' for domain f30
event 'reboot' for domain f30
event 'agent-lifecycle' for domain f30: state: 'connected' reason: 'channel event'
So this change is moving the first more towards the latter. (not sure
what those two reboot events are about)
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy(a)virtuozzo.com>
---
src/qemu/qemu_process.c | 57 ++++++++++++++++++++++-------------------
1 file changed, 31 insertions(+), 26 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index ed8666e9d1..2d37f92724 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -500,6 +500,8 @@ qemuProcessFakeReboot(void *opaque)
goto endjob;
}
+ qemuDomainSetFakeReboot(driver, vm, false);
+
I think this should also be in the cleanup path of this function;
otherwise an earlier error looks like it would leave priv->fakeReboot
still set. But it makes sense that it needs to be here too, so that it's
saved in the domain status.
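To illustrate the cleanup-path point, here is a minimal self-contained sketch. It is not the libvirt code: toy_priv, toy_fake_reboot and the reset_ok/start_ok parameters are hypothetical stand-ins that only model the flag handling, assuming the function has a single cleanup label that every early error jumps to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the private domain data; only the flag matters. */
struct toy_priv {
    bool fake_reboot;
};

/* Hypothetical model of the fake-reboot worker. 'reset_ok' and 'start_ok'
 * simulate whether the system reset and the CPU start succeeded. */
int toy_fake_reboot(struct toy_priv *priv, bool reset_ok, bool start_ok)
{
    int ret = -1;

    assert(priv != NULL);

    if (!reset_ok)
        goto cleanup;
    if (!start_ok)
        goto cleanup;

    /* Success path: clear the flag before the status would be saved,
     * which is where the patch places the call. */
    priv->fake_reboot = false;
    ret = 0;

 cleanup:
    /* Failure path: clear the flag here as well, so an early error
     * cannot leave it set. */
    if (ret < 0)
        priv->fake_reboot = false;
    return ret;
}
```

In this model the flag ends up cleared whichever step fails, while the success path still clears it before the point where the status would be written out.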
if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm,
driver->caps) < 0) {
VIR_WARN("Unable to save status on vm %s after state change",
vm->def->name);
@@ -525,7 +527,6 @@ qemuProcessShutdownOrReboot(virQEMUDriverPtr driver,
qemuDomainObjPrivatePtr priv = vm->privateData;
if (priv->fakeReboot) {
- qemuDomainSetFakeReboot(driver, vm, false);
virObjectRef(vm);
virThread th;
if (virThreadCreate(&th,
@@ -534,6 +535,7 @@ qemuProcessShutdownOrReboot(virQEMUDriverPtr driver,
vm) < 0) {
VIR_ERROR(_("Failed to create reboot thread, killing domain"));
ignore_value(qemuProcessKill(vm, VIR_QEMU_PROCESS_KILL_NOWAIT));
+ qemuDomainSetFakeReboot(driver, vm, false);
virObjectUnref(vm);
}
} else {
@@ -595,35 +597,37 @@ qemuProcessHandleShutdown(qemuMonitorPtr mon G_GNUC_UNUSED,
goto unlock;
}
- VIR_DEBUG("Transitioned guest %s to shutdown state",
- vm->def->name);
- virDomainObjSetState(vm,
- VIR_DOMAIN_SHUTDOWN,
- VIR_DOMAIN_SHUTDOWN_UNKNOWN);
+ if (!priv->fakeReboot) {
Here it would be helpful to have a comment like
/* If the qemu process was stopped as part of fakeReboot reset, we skip
sending a shutdown event */
+ VIR_DEBUG("Transitioned guest %s to shutdown state",
+ vm->def->name);
+ virDomainObjSetState(vm,
+ VIR_DOMAIN_SHUTDOWN,
+ VIR_DOMAIN_SHUTDOWN_UNKNOWN);
- switch (guest_initiated) {
- case VIR_TRISTATE_BOOL_YES:
- detail = VIR_DOMAIN_EVENT_SHUTDOWN_GUEST;
- break;
+ switch (guest_initiated) {
+ case VIR_TRISTATE_BOOL_YES:
+ detail = VIR_DOMAIN_EVENT_SHUTDOWN_GUEST;
+ break;
- case VIR_TRISTATE_BOOL_NO:
- detail = VIR_DOMAIN_EVENT_SHUTDOWN_HOST;
- break;
+ case VIR_TRISTATE_BOOL_NO:
+ detail = VIR_DOMAIN_EVENT_SHUTDOWN_HOST;
+ break;
- case VIR_TRISTATE_BOOL_ABSENT:
- case VIR_TRISTATE_BOOL_LAST:
- default:
- detail = VIR_DOMAIN_EVENT_SHUTDOWN_FINISHED;
- break;
- }
+ case VIR_TRISTATE_BOOL_ABSENT:
+ case VIR_TRISTATE_BOOL_LAST:
+ default:
+ detail = VIR_DOMAIN_EVENT_SHUTDOWN_FINISHED;
+ break;
+ }
- event = virDomainEventLifecycleNewFromObj(vm,
- VIR_DOMAIN_EVENT_SHUTDOWN,
- detail);
+ event = virDomainEventLifecycleNewFromObj(vm,
+ VIR_DOMAIN_EVENT_SHUTDOWN,
+ detail);
- if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm, driver->caps) < 0) {
- VIR_WARN("Unable to save status on vm %s after state change",
- vm->def->name);
+ if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm, driver->caps) < 0) {
+ VIR_WARN("Unable to save status on vm %s after state change",
+ vm->def->name);
+ }
}
if (priv->agent)
@@ -657,7 +661,8 @@ qemuProcessHandleStop(qemuMonitorPtr mon G_GNUC_UNUSED,
reason = priv->pausedReason;
priv->pausedReason = VIR_DOMAIN_PAUSED_UNKNOWN;
- if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) {
+ if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING &&
+ !priv->fakeReboot) {
What case does this trigger in? A comment would help here too.
There are a couple of things I'm curious about. One is whether the VM is
left in an inaccurate state if the fake reboot fails. But
qemuProcessFakeReboot will call qemuProcessKill in that case, which I think
covers it.
The other is whether we should fully ignore the internal state change,
or just not send the event. By ignoring the state change we are
willfully claiming the VM is still running when we know it isn't. I
guess that should be a small window of time, since reset + StartCPUs should
be quick I think, but it's not a pattern I've seen in the code before.
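To make the two options concrete, here is a small toy model. Nothing in it is libvirt API: toy_vm, toy_state and both handlers are hypothetical names, and events_sent is just a counter standing in for lifecycle events.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_state { TOY_RUNNING, TOY_PAUSED, TOY_SHUTDOWN };

/* Hypothetical model of a domain object. */
struct toy_vm {
    enum toy_state state;   /* what a state query would report */
    bool fake_reboot;       /* set while a fake reboot is in progress */
    int events_sent;        /* number of lifecycle events emitted */
};

/* Option A: during a fake reboot, skip both the state change and the
 * event. This mirrors the approach taken in the patch. */
void toy_handle_shutdown_ignore(struct toy_vm *vm)
{
    assert(vm != NULL);
    if (vm->fake_reboot)
        return;
    vm->state = TOY_SHUTDOWN;
    vm->events_sent++;
}

/* Option B: always record the state change, but suppress the event
 * while a fake reboot is in progress. */
void toy_handle_shutdown_quiet(struct toy_vm *vm)
{
    assert(vm != NULL);
    vm->state = TOY_SHUTDOWN;
    if (!vm->fake_reboot)
        vm->events_sent++;
}
```

With option B the internal state stays accurate, but a caller that polls the state rather than listening for events would still observe the transient shutdown, which is the backup scenario from the commit message. Option A hides it from both at the cost of briefly reporting running.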
- Cole