On 27.08.2015 13:34, Daniel P. Berrange wrote:
> On Tue, Aug 25, 2015 at 12:04:14PM +0300, nshirokovskiy(a)virtuozzo.com wrote:
> > From: Nikolay Shirokovskiy <nshirokovskiy(a)virtuozzo.com>
> >
> > This patch makes basic vz migration possible. For example by virsh:
> > virsh -c vz:///system migrate --direct $NAME $STUB vz+ssh://$DST/system
> >
> > $STUB could be anything as it is a required virsh argument, but it is not
> > used in direct migration.
> >
> > Vz migration is implemented as direct migration. The reason is that the
> > vz SDK does all the job. The prepare phase function is used to pass the
> > session uuid from destination to source so we don't introduce a new rpc call.
>
> Looking more closely at migration again, the scenario you have is pretty
> much identical to the Xen scenario, in that the hypervisor actually
> manages the migration, but you still need a connection to the dest libvirtd
> to fetch some initialization data.
>
> You have claimed you are implementing what we describe as "direct, unmanaged"
> migration on this page:
>
>   http://libvirt.org/migration.html
>
> But based on the fact that you need to talk to the dest libvirtd, you should
> in fact implement 'direct, managed' migration - this name is slightly
> misleading as the VZ SDK is still actually managing it.
>
> Since you don't need to have the begin/confirm phases, you also don't
> need to implement the V3 migration protocol - it is sufficient to just
> use V1.
>
> This doesn't need many changes in your patch fortunately.

I've been looking at the common migration code for quite a long time and I
think that using the direct managed scheme for vz migration could lead to
problems. Let me share my concerns.

1. The version 1 migration protocol differs from version 3 not only in the
number of stages. Version 3 supports extended parameters like
VIR_MIGRATE_PARAM_GRAPHICS_URI, which are meaningful for vz migration too.
Thus in the future we could want to move to version 3 as well.
2. The direct managed stages don't just mean "do something on the source,
then on the destination, and so on". They are interconnected, and this
interconnection is encoded in the migration algorithm. For version 3
(virDomainMigrateVersion3Full) it is more noticeable: if the finish3 phase
fails, we cancel the migration in the confirm3 phase. See, we treat these
phases specifically - in perform3 we think we move the data, in finish3 we
think we start the domain on the destination, in confirm3 we think we stop
the domain on the source. That is how qemu migration works and that is how
we think of the phases when we implement the direct managed algorithm. So
the phases have contracts. If we implement vz migration through this scheme
we cannot keep these contracts, as the perform3 phase not only moves the
data but also kills the source domain and starts the destination one. The
worst things the user could get are erroneous warnings in the logs, and
reports of overall migration failure on actual migration success in case of
side-effect failures like rpc errors or OOM. What makes it worse is that
the contracts of the phase implementations are vague and you have to keep
them in mind.
As the version 1 scheme is quite simple and its phase contracts are looser
than version 3's, we could go this way, but I see potential problems (at
least for developers). Thus I suggest keeping the contracts of the phases of
all versions of direct managed migration clear and hiding all the
differences by implementing the p2p or direct scheme.
The question arises how these two differ. The documentation states that p2p
is when the libvirt daemon manages the migration and direct is when all the
management is done by the hypervisor. As vz migration needs some help from
the destination daemon, it looks like a candidate for p2p. But as this help
is as little as helping to authenticate, I suggest thinking of it as direct.
From the implementation point of view there is no difference; from the user
point of view the difference is only in flags. Another argument: if we take
qemu, we see that p2p goes through the same steps as direct managed - most
of the difference is that the management moves from the client to the
daemon. That is, p2p and direct managed are coupled: if there is p2p, then
direct managed should be possible too, and that is not the case for vz
migration.