
On 11/12/18 4:26 AM, Daniel P. Berrangé wrote:
On Fri, Nov 02, 2018 at 04:34:02PM -0600, Jim Fehlig wrote:
A dry run can be used as a best-effort check that a migration command will succeed. The destination host will be checked to see if it can accommodate the resources required by the domain. DRY_RUN will fail if the destination host is not capable of running the domain. Although a subsequent migration will likely succeed, the success of DRY_RUN does not ensure a future migration will succeed. Resources on the destination host could become unavailable between a DRY_RUN and actual migration.
I'm not really convinced this is a particularly useful concept, as it is only going to catch a very small number of the reasons why migration can fail. So you still have to expect the real migration invocation to have a strong chance of failing.
I agree it is difficult to reliably check that a migration will succeed. TBH, I was expecting opposition due to libvirt already providing info for applications to do the check themselves. E.g. as nova has done with check_can_live_migrate_{source,destination} APIs. Do you think libvirt provides enough information for an app to determine if a VM can be migrated between two hosts? Or maybe better asked: What info is currently missing for an app to reliably check if a VM can be migrated between two hosts?
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
---
If it is agreed this is useful, my thought was to use the begin and prepare phases of migration to implement it. qemuMigrationDstPrepareAny() already does a lot of the heavy lifting wrt checking the host can accommodate the domain. Some of it, and the remaining migration phases, can be short-circuited in the case of dry run.
One interesting wrinkle I've observed is the check for cpu compatibility. AFAICT qemu is actually invoked on the dst, "filtered-features" of the cpu are requested via qmp, and results are checked against cpu in domain config. If cpu on dst is insufficient, migration fails in the prepare phase with something like "guest CPU doesn't match specification: missing features: z y z". I was hoping to avoid launching qemu in the case of dry run, but that may be unavoidable if we'd like a dependable dry run result.
Even launching QEMU isn't good enough - it has to actually process the migration data stream for devices to get a good indication of success, at which point you're basically doing a real migration.
Bummer. I guess that answers my question above: no. It also implies apps cannot reliably check if a migration will succeed and should instead put effort into handling errors from an actual migration :-).

Regards,
Jim
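
To illustrate the conclusion above, here is a minimal sketch of an app that skips the pre-check and handles failures from the real migration instead. Everything in it is an assumption made for illustration: `MigrationError`, `migrate_with_fallback`, and the `migrate` callable are hypothetical names, not libvirt APIs. In a real app the callable would wrap the actual migration call and translate its error (for example `libvirt.libvirtError` from the Python bindings) into `MigrationError`.

```python
from typing import Callable, Iterable, List, Optional, Tuple


class MigrationError(Exception):
    """Hypothetical stand-in for the error raised by a real migration call."""


def migrate_with_fallback(
    migrate: Callable[[str], None],
    candidate_hosts: Iterable[str],
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Attempt a real migration to each candidate host in order.

    Returns the host that accepted the domain (or None if every attempt
    failed) together with the (host, error message) pairs collected
    along the way, so the caller can report why each host was rejected.
    """
    failures: List[Tuple[str, str]] = []
    for host in candidate_hosts:
        try:
            # No pre-check: the attempt itself is the only reliable test.
            migrate(host)
        except MigrationError as err:
            failures.append((host, str(err)))
            continue
        return host, failures
    return None, failures
```

The app treats a failed attempt as a normal outcome: it records the reason, moves on to the next candidate, and leaves the domain running on the source if no host accepts it.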