In doPeer2PeerMigrate3(), the "finish" step checks whether
domainMigrateFinish3() returns NULL.
If the result (ddomain) is NULL, it just restarts the guest on the source.
Please consider the scenario where ddomain is already running on the
destination, but the destination fails to report this to the source, so
ddomain ends up NULL. If we then restart the guest on the source, the same
guest will be running on both sides, and a split-brain occurs.
It seems much better to stop them both than to leave them both running. At
the very least, when we find that ddomain is NULL, we should probably check
whether the problem was caused by a keepalive failure and, if so, kill the
guest on the source rather than restarting it.
What do you think about that?
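
To make the suggestion concrete, here is a rough sketch of the decision I
have in mind. It is not libvirt code: SourceAction, choose_source_action()
and the keepalive_failed flag are made-up names for illustration, and how
the source would actually detect a keepalive failure is left open.

```c
#include <stdbool.h>

/* Hypothetical model of what the source does after the "finish" step. */
typedef enum {
    SRC_CONFIRM_SUCCESS, /* ddomain != NULL: migration finished normally */
    SRC_RESUME_GUEST,    /* ddomain == NULL: restart the guest on the source */
    SRC_KILL_GUEST       /* ddomain == NULL, but the destination may be running it */
} SourceAction;

/*
 * have_ddomain:     the finish call returned a non-NULL domain.
 * keepalive_failed: assumed flag saying the connection to the destination
 *                   was lost, so the real outcome there is unknown.
 */
SourceAction
choose_source_action(bool have_ddomain, bool keepalive_failed)
{
    if (have_ddomain)
        return SRC_CONFIRM_SUCCESS;

    /* Outcome unknown: restarting here could leave the guest running twice. */
    if (keepalive_failed)
        return SRC_KILL_GUEST;

    /* The destination reported the failure itself, so restarting is safe. */
    return SRC_RESUME_GUEST;
}
```

With this, a NULL ddomain only leads to a restart on the source when we know
the connection to the destination was still alive.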
BTW, the comment that came with commit
2593f9692df0f128b14cde811e18aa49c1cf3e06 says: "The lock manager plugins
should take care of safety in this scenario". I don't quite understand that:
1) If we migrate the guest with the flag VIR_MIGRATE_NON_SHARED_DISK, the
NBD server may take care of data consistency. But the NBD server is already
stopped before the CPUs are started on the destination, so at that moment
nothing takes care of this problem.
2) If we migrate the guest with a shared disk, does that mean NFS or other
disk-sharing schemes have to prevent split-brain by themselves?