On 10/14/19 5:39 PM, Jim Fehlig wrote:
The ordering of lock manager locks in the libxl driver has a flaw that was
uncovered by a migration error path. In the perform phase of migration, the
source host calls virDomainLockProcessPause to release the lock before
sending the VM to the destination host. If the send fails, an attempt is made
to reacquire the lock with virDomainLockProcessResume, but that too can fail
if the destination host has not finished cleaning up the failed VM and
releasing the lock it acquired when starting to receive the VM.
This change delays calling virDomainLockProcessResume in libxlDomainStart
until the VM is successfully created, but before it is unpaused. A similar
approach is used by the qemu driver, avoiding the need to release the lock
if VM creation fails. In the migration perform phase, releasing the lock
with virDomainLockProcessPause is delayed until the VM is successfully
sent to the destination, which avoids reacquiring the lock if the send
fails.
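
The ordering change in the perform phase can be illustrated with a small
stand-alone model. This is only a sketch, not the libxl driver code: the
single-owner lock and the names lock_acquire, lock_release, perform_old and
perform_new are hypothetical stand-ins for the lock manager and for the
virDomainLockProcessPause/virDomainLockProcessResume calls, and the
dest_cleaned_up flag is an assumed way to model whether the destination has
already released its lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock manager: one lock, at most one owner. */
enum owner { OWNER_NONE, OWNER_SOURCE, OWNER_DEST };

/* Take the lock for `who`; fails if another host still holds it. */
static bool lock_acquire(enum owner *lock, enum owner who)
{
    if (*lock != OWNER_NONE && *lock != who)
        return false;
    *lock = who;
    return true;
}

/* Drop the lock; only the current owner may release it. */
static void lock_release(enum owner *lock, enum owner who)
{
    assert(*lock == who);
    *lock = OWNER_NONE;
}

/* Old ordering: release before sending, try to reacquire if the send fails.
 * Returns the lock owner once the perform phase is over. */
static enum owner perform_old(bool send_ok, bool dest_cleaned_up)
{
    enum owner lock = OWNER_SOURCE;

    lock_release(&lock, OWNER_SOURCE);   /* "pause": source gives up the lock */
    lock_acquire(&lock, OWNER_DEST);     /* destination starts receiving */
    if (send_ok)
        return lock;                     /* destination keeps the lock */
    if (dest_cleaned_up)
        lock_release(&lock, OWNER_DEST);
    lock_acquire(&lock, OWNER_SOURCE);   /* "resume": fails if dest still holds it */
    return lock;
}

/* New ordering: send first, release only after the send has succeeded. */
static enum owner perform_new(bool send_ok)
{
    enum owner lock = OWNER_SOURCE;

    if (!send_ok)
        return lock;                     /* lock never left the source */
    lock_release(&lock, OWNER_SOURCE);
    lock_acquire(&lock, OWNER_DEST);     /* destination takes it before unpausing */
    return lock;
}
```

In this model, perform_old(false, false) leaves the lock with the destination
even though the VM is still on the source, which is the failure described
above. perform_new(false) leaves it with the source regardless of what the
destination has cleaned up.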
Signed-off-by: Jim Fehlig <jfehlig(a)suse.com>
---
src/libxl/libxl_domain.c | 14 +++++++-------
src/libxl/libxl_migration.c | 14 +++++---------
2 files changed, 12 insertions(+), 16 deletions(-)
Reviewed-by: Cole Robinson <crobinso(a)redhat.com>
- Cole