On Mon, Mar 10, 2014 at 15:36:06 +0100, Olaf Hering wrote:
During live migration of VMs from one host to another the VM is
suspended for an unpredictable amount of time. The actual downtime
depends on how many pages are newly dirtied and on the bandwidth to the
destination host. Since VM memory size grows faster than transfer rates,
the currently available tunables will cause trouble for workloads
within the VM which cannot handle large time jumps.
I have already written code to tweak the inner loop doing the actual
migration work in libxc. But the patchset exposes the details of the
loop to the cmdline, so it is neither portable nor a friendly UI
for the host admin.
Here is my proposal for a new option for virsh and 2 new options for xl:
[xl | virsh --live] --max-suspend-time N --timeout N VM host
--max-suspend-time N: as the name suggests, the VM downtime must not be
longer than specified. The code doing the migration has to estimate the
transfer speed. If the VM is about to be suspended, it has to check if
the remaining dirty pages can be transferred within the required
timeframe. If not, the migration is aborted, the VM continues to run on
the src host, the new VM on the dst host is destroyed and an error is
returned.
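The check described above could be sketched roughly as follows. This is
only an illustration of the arithmetic, not the libxc code: the function
name and all parameters are hypothetical, and it assumes the measured
transfer rate stays constant during the final copy.

```python
def can_suspend_now(dirty_pages, page_size, bytes_per_sec, max_suspend_sec):
    """Return True if the remaining dirty pages are estimated to fit
    into the allowed suspend window.

    All names here are hypothetical; bytes_per_sec is assumed to be a
    transfer rate measured during the earlier copy rounds.
    """
    if bytes_per_sec <= 0:
        # No usable rate estimate, so the downtime cannot be bounded.
        return False
    remaining_bytes = dirty_pages * page_size
    estimated_downtime = remaining_bytes / bytes_per_sec
    return estimated_downtime <= max_suspend_sec
```

For example, 50000 dirty pages of 4096 bytes at 125000000 bytes/s need
about 1.64 seconds, so a 1 second limit would abort the migration while
a 2 second limit would let the suspend proceed.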
Libvirt already has the virDomainMigrateSetMaxDowntime API with these
semantics. However, with virsh, one sets it using the virsh
migrate-setmaxdowntime command while the migration is in progress. Not
sure if exposing it as yet another parameter of the already quite
complicated migrate command would buy us much.
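For reference, a minimal sketch of calling that API through the Python
bindings, where virDomainMigrateSetMaxDowntime is exposed as the
migrateSetMaxDowntime method of a domain object and takes the downtime
in milliseconds. The wrapper name is hypothetical, and the sketch
assumes dom is a libvirt.virDomain that currently has a migration job
running.

```python
def cap_migration_downtime(dom, max_downtime_ms):
    """Set the maximum tolerable downtime for a running migration.

    dom is assumed to be a libvirt.virDomain being migrated; the
    wrapper itself is hypothetical and only validates the argument.
    """
    if max_downtime_ms <= 0:
        raise ValueError("max_downtime_ms must be positive")
    # Second argument is the flags field, which must currently be 0.
    return dom.migrateSetMaxDowntime(max_downtime_ms, 0)
```

The call has to be made from a second connection or thread, since the
migrate call itself blocks until the migration finishes.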
--timeout N: if a VM is busy and its workload keeps producing many new
dirty pages, the migrate command would take forever. This option is
supposed to stop the migration attempt if the number of new dirty pages
is too high.
It would change the semantics of "virsh migrate --timeout n", which
currently forces a suspend (according to the help text).
This is not acceptable. If you want an option to automatically cancel
migration after a given timeout, you would need to introduce a new
option instead of changing semantics of an existing option.
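Such a cancel-on-timeout behaviour can already be approximated on the
client side with virDomainAbortJob, which the Python bindings expose as
abortJob. The following is only a sketch under that assumption: the
helper name is hypothetical, and dom is assumed to be a libvirt.virDomain
whose migration is started right after the timer.

```python
import threading


def start_cancel_timer(dom, timeout_sec):
    """Abort the domain's active job if it is still running after
    timeout_sec seconds.

    Hypothetical helper: dom is assumed to be a libvirt.virDomain.
    The caller must call .cancel() on the returned timer once the
    migration has finished, so a later job is not aborted by mistake.
    """
    timer = threading.Timer(timeout_sec, dom.abortJob)
    timer.daemon = True  # do not keep the process alive for the timer
    timer.start()
    return timer
```

With this the VM keeps running on the source host when the timeout
expires, which is the opposite of what the existing --timeout does.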
Jirka