On 11/12/18 4:26 AM, Daniel P. Berrangé wrote:
On Fri, Nov 02, 2018 at 04:34:02PM -0600, Jim Fehlig wrote:
> A dry run can be used as a best-effort check that a migration command
> will succeed. The destination host will be checked to see if it can
> accommodate the resources required by the domain. DRY_RUN will fail if
> the destination host is not capable of running the domain. Although a
> subsequent migration will likely succeed, the success of DRY_RUN does not
> ensure a future migration will succeed. Resources on the destination host
> could become unavailable between a DRY_RUN and actual migration.
I'm not really convinced this is a particularly useful concept,
as it is only going to catch a very small number of the reasons
why migration can fail. So you still have to expect the real
migration invocation to have a strong chance of failing.
I agree it is difficult to reliably check that a migration will succeed. TBH, I
was expecting opposition due to libvirt already providing info for applications
to do the check themselves. E.g. as nova has done with
check_can_live_migrate_{source,destination} APIs.
Do you think libvirt provides enough information for an app to determine if a VM
can be migrated between two hosts? Or maybe better asked: What info is currently
missing for an app to reliably check if a VM can be migrated between two hosts?
>
> Signed-off-by: Jim Fehlig <jfehlig(a)suse.com>
> ---
>
> If it is agreed this is useful, my thought was to use the begin and
> prepare phases of migration to implement it. qemuMigrationDstPrepareAny()
> already does a lot of the heavy lifting wrt checking the host can
> accommodate the domain. Some of it, and the remaining migration phases,
> can be short-circuited in the case of dry run.
>
> One interesting wrinkle I've observed is the check for cpu compatibility.
> AFAICT qemu is actually invoked on the dst, "filtered-features" of the cpu
> are requested via qmp, and results are checked against cpu in domain config.
> If cpu on dst is insufficient, migration fails in the prepare phase with
> something like "guest CPU doesn't match specification: missing features: z y
> z".
> I was hoping to avoid launching qemu in the case of dry run, but that may
> be unavoidable if we'd like a dependable dry run result.
Even launching QEMU isn't good enough - it has to actually process the
migration data stream for devices to get a good indication of success,
at which point you're basically doing a real migration.
Bummer. I guess that answers my question above: no. It also implies apps cannot
reliably check if a migration will succeed and should instead put effort into
handling errors from an actual migration :-).
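
A rough sketch of what that error handling could look like on the app side: try the real migration against each candidate destination and fall back to the next one on failure. Everything named here (MigrationError, migrate_with_fallback, the migrate callable) is hypothetical and not libvirt API. With libvirt-python the callable would presumably wrap one of the domain migrate calls, and the exception caught would be libvirt.libvirtError.

```python
class MigrationError(Exception):
    """Hypothetical stand-in for the error raised by a real migration call."""


def migrate_with_fallback(migrate, candidates):
    """Attempt a real migration to each candidate host until one succeeds.

    `migrate` is an assumed callable that takes a destination URI and raises
    MigrationError on failure. Returns the destination that worked together
    with the errors collected from the destinations that did not.
    """
    failures = {}
    for dest in candidates:
        try:
            migrate(dest)
        except MigrationError as err:
            # Record why this host was rejected and move on to the next one.
            failures[dest] = str(err)
            continue
        return dest, failures
    # No candidate accepted the domain; surface every collected reason.
    raise MigrationError(f"all destinations failed: {failures}")
```

The point of the design is that no pre-check is trusted: the only signal used is the outcome of the actual migration attempt, and the per-host failure reasons are kept so the app can report them.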
Regards,
Jim