Steps:
1 virsh reboot guest1 --mode=acpi
2 virsh shutdown guest1 --mode=agent
Expected result:
As the SHUTDOWN job is after REBOOT, we expected the guest to be *shutoff*. (Do you
think so?)
Exacted result:
After the 2 steps above, the guest got *rebooted*.
The reason to this problem:
1 in qemuDomainReboot(mode acpi), it sets priv->fakeReboot to 1.
2 after shutdown/reboot, qemu monitor IO trigged qemuProcessHandleShutdown(), which
finds that priv->fakeReboot is 1, and reboot the guest.
Root Cause of the problem:
After further look into the problem, We found that the design of acpi/agent
shutdown/reboot seems a little chaotic.
-----------------------------------
sheet1 who sets fakeReboot
-----------------------------------
shutdown reboot
acpi Y(0) Y(1)
agent N N
It's apparently, *acpi-mode* jobs set fakeReboot.
-----------------------------------
sheet2 who needs to check fakeReboot(qemuProcessHandleShutdown())
-----------------------------------
shutdown reboot
acpi Y Y
agent *Y* *N*
Things become a little odd here. only agent-mode reboot doesn't check fakeReboot.
We can tell from the above 2 sheets, that they're not consistent.
*Agent-mode shutdown needs to check fakeReboot(trigger by SHUTDOWN monitorIO event),
while it didn't set it before.*
The chaos is not caused by libvirtd, it's a systematic problem, including guest os
and qemu. (guest os writes ACPI, triggers qemu, qemu then triggers libvirtd)
My Solution:
A simple solution is to make the 1st sheet consistent to the 2nd sheet, which is:
-----------------------------------
sheet3 who should set fakeReboot:
-----------------------------------
shutdown reboot
acpi Y(0) Y(1)
agent *Y(0)* N
-----------------------------------
we let agent-mode shutdown set fakeReboot to 0.
--------------------------------------------------------------------------------
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7eb5a7d..c751dcf 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -1959,6 +1959,8 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int
flags)
goto endjob;
}
+ qemuDomainSetFakeReboot(driver, vm, isReboot);
+
if (useAgent) {
qemuDomainObjEnterAgent(vm);
ret = qemuAgentShutdown(priv->agent, agentFlag);
@@ -1970,7 +1972,6 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int
flags)
*/
if (!useAgent ||
(ret < 0 && (acpiRequested || !flags))) {
- qemuDomainSetFakeReboot(driver, vm, isReboot);
/* Even if agent failed, we have to check if guest went away
* by itself while our locks were down. */
----------------------------------------------------------------------------------
Discussion:
Although the solution above seems to have solved the problem, but it seems a little
awkward to have sheets like this.
Maybe we should let sheet2 consistent with sheet1, rather than letting sheet1
consistent to sheet2. But It's too difficult to realize that, seems impossible
So, is there a better way to solve this problem? or, shall I commit this patch?