Re: [libvirt] [PATCH v3 2/5] vz: add migration backbone code

1 Sep 2015

On Mon, Aug 31, 2015 at 11:40:55AM +0300, Nikolay Shirokovskiy wrote:
...
On 28.08.2015 19:37, Daniel P. Berrange wrote:
...
On Fri, Aug 28, 2015 at 12:18:30PM +0300, Nikolay Shirokovskiy wrote:
...
On 27.08.2015 13:34, Daniel P. Berrange wrote:
...
On Tue, Aug 25, 2015 at 12:04:14PM +0300, nshirokovskiy@virtuozzo.com wrote:
...
From: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
This patch makes basic vz migration possible. For example by virsh:
  virsh -c vz:///system migrate --direct $NAME $STUB vz+ssh://$DST/system
$STUB can be anything: it is a required virsh argument, but it is not
used in direct migration.
Vz migration is implemented as direct migration. The reason is that the vz sdk
does all the work. The prepare phase function is used to pass the session uuid
from destination to source, so we don't introduce a new rpc call.
Looking more closely at migration again, the scenario you have is pretty
much identical to the Xen scenario, in that the hypervisor actually
manages the migration, but you still need a connection to dest libvirtd
to fetch some initialization data.
You have claimed you are implementing what we describe as "direct, unmanaged"
migration on this page:
http://libvirt.org/migration.html
But based on the fact that you need to talk to dest libvirtd, you should
in fact implement 'direct, managed' migration - this name is slightly
misleading as the VZ SDK is still actually managing it.
Since you don't need to have the begin/confirm phases, you also don't
need to implement the V3 migration protocol - it is sufficient to just
use V1.
This doesn't need many changes in your patch fortunately.
I've been looking at the common migration code for a rather long time and think
that using the direct managed scheme for vz migration could lead to problems.
Let me share my concerns.
1. The version1 migration protocol differs from version3 not only in the number
of stages. Version3 supports extended parameters like
VIR_MIGRATE_PARAM_GRAPHICS_URI which are meaningful for vz migration too. Thus
in the future we could move to implementing version3 as well.
Ah, that is indeed true. From that POV it certainly makes sense to want
to start with V3 straight away.
...
2. The direct managed stages don't just mean "do anything on source, then on
destination, and so on". They are interconnected, and this interconnection is
defined by the migration algorithm. For version3 (virDomainMigrateVersion3Full)
it is more noticeable: if the finish3 phase fails then we cancel migration in
the confirm3 phase. See, we treat these phases specifically: in perform3 we
think we move data, in finish we think we start the domain on the destination,
in confirm we think we stop the domain on the source. That is how qemu
migration works and that is how we think of the phases when we implement the
direct managed algorithm. So the phases have some contracts. If we implement vz
migration through this scheme we could not keep these contracts, as the
perform3 phase not only moves data but also kills the source domain and starts
the destination. The worst the user could get are erroneous warnings in the
logs and overall migration failure reports on actual migration success, in case
of side-effect failures like rpc or OOM. Worse still, you have to keep in mind
that the phases' implementation contracts are vague.
It isn't the end of the world if the Perform3 stage kills the source domain.
That is the same behaviour as with Xen. Particularly since the VZ SDK itself
does the switchover, there's no functional downside to letting Perform3
kill the source.
I cannot quite agree with you. Yes, luckily, vz migration could be
implemented via the existing 5-phase interface and the existing managing
virDomainMigrateVersion3Full algorithm, but this is fragile. I mean,
as the phases have different meanings for qemu and vz, if
virDomainMigrateVersion3Full is somehow changed in the future this could
lead to improper functioning of vz migration, as the change will be made
with the qemu meaning of the phases in mind.
On the contrary, the migration stages are explicitly intended to
allow arbitrary hypervisor specific operations to be performed.

In V3, we essentially have 2 initialization steps (begin on src,
prepare on dst), and 2 cleanup steps (finish on dst, confirm on src),
and 1 action step. The only really required semantics are that the
perform step starts the migration. What a hypervisor does in the
2 initialization and 2 cleanup steps is entirely arbitrary.

You can rely on the fact that in V3, we will always call the 5
steps in the same sequence as they are defined now. We will never
change the way they are called in the future. If there was ever
a need to make some incompatible change to suit something the
QEMU driver needs, we would have to introduce a V4 protocol, so
we would avoid any risk of breaking VZ in that manner.
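
As a rough illustration of the V3 ordering described above, here is a minimal
sketch in Python. It is not libvirt code: the function name, the phase labels
and the "cancelled" strings are made up for illustration, and the handling of a
failed finish step simply follows the description earlier in this thread (a
finish failure leads to confirm being told the migration was cancelled).

```python
from typing import List


def run_v3_sequence(perform_ok: bool = True, finish_ok: bool = True) -> List[str]:
    """Return the ordered phase calls for one hypothetical V3 migration.

    Purely a model of the call order; nothing here talks to a hypervisor.
    """
    calls = []
    calls.append("begin@src")      # initialization step 1 (source)
    calls.append("prepare@dst")    # initialization step 2 (destination)
    calls.append("perform@src")    # the single action step: starts migration
    # Cleanup step 1 (destination): told whether perform failed.
    calls.append(f"finish@dst(cancelled={not perform_ok})")
    # Cleanup step 2 (source): cancelled if perform or finish failed.
    cancelled = not (perform_ok and finish_ok)
    calls.append(f"confirm@src(cancelled={cancelled})")
    return calls


if __name__ == "__main__":
    for step in run_v3_sequence():
        print(step)
```

The order of the five calls never depends on the outcome of a step; only the
cancelled flag passed to the cleanup steps changes.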
...
...
...
So, as the version1 scheme is quite simple and its phase contracts are looser
than for version3, we could go this way, but I see potential problems (at least
for the developer). Thus I suggest keeping the contracts of the phases of all
versions of direct managed migration clear and hiding all differences by
implementing the p2p or direct scheme.
The question arises how these two differ. The documentation states that p2p is
when the libvirt daemon manages the migration and direct is when all managing
is done by the hypervisor. As vz migration needs some help from the destination
daemon it looks like a candidate for p2p. But as this help is as little as help
with authentication, I suggest thinking of it as direct. From the
implementation point of view there is no difference; from the user point of
view the difference is only in flags. Another argument is that if we take qemu
we see that p2p just goes through the same steps as direct managed; most of the
difference is that the managing moves from client to daemon. That is, p2p and
direct managed are in some way coupled: if there is p2p then direct managed
should be possible too, and this is not the case for vz migration.
The p2p migration mode is only different from the default mode,
in that instead of the client app talking to the dest libvirtd,
it is the source libvirtd talking.
With the VZ driver though, the driver runs directly in the client
app, not libvirtd. As such, there is no benefit to implementing
p2p mode in VZ - it will just end up working in the exact same
way as the default mode in terms of what part communicates with
the dest.
Again, I cannot quite agree. Vz could run on libvirtd too, and someone
could want the whole managing to be done on libvirtd in this case. Thus the
user expects that either direct or p2p migration exists.
Yes, it would be valid to implement the P2P migration protocol in VZ,
as well as the default protocol (managed direct). Implementing the
unmanaged direct protocol (aka --direct virsh flag) is not appropriate
though, as that's for the case where dest libvirtd is not involved in
any way, which is not the case here.
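
To make the distinction between the three modes concrete, here is a small
sketch in Python. The enum, the table and the helper are hypothetical names
used only for illustration; they encode nothing more than what is said in this
thread about which party talks to the dest libvirtd in each mode.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    MANAGED_DIRECT = "default"    # the default protocol
    P2P = "p2p"                   # peer-to-peer
    UNMANAGED_DIRECT = "direct"   # what virsh --direct selects


# Which party contacts the dest libvirtd in each mode (None = nobody does).
DEST_CONTACT = {
    Mode.MANAGED_DIRECT: "client application",
    Mode.P2P: "source libvirtd",
    Mode.UNMANAGED_DIRECT: None,
}


def is_appropriate(mode: Mode, needs_dest_libvirtd: bool) -> bool:
    """A mode is ruled out if the driver needs the dest libvirtd but the
    mode never involves it."""
    contact: Optional[str] = DEST_CONTACT[mode]
    return not (needs_dest_libvirtd and contact is None)


if __name__ == "__main__":
    # VZ fetches initialization data from the dest libvirtd.
    for mode in Mode:
        print(mode.name, is_appropriate(mode, needs_dest_libvirtd=True))
```

Under that assumption, only the unmanaged direct mode is ruled out for a
driver that has to contact the dest libvirtd.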
...
Another reason is that it would be simpler to support vz
migration in openstack nova. It uses toURI2 to migrate and
I would rather run the vz driver on libvirtd and use p2p or direct
migration than introduce a branch for vz to use the
MigrateN API with a client side driver.
If we implement the P2P protocol in VZ too, then Openstack
should not need any changes at all in how it invokes migration
...
...
As you do need to talk to dest libvirtd, IMHO, this rules out
use of the direct mode, as that is intended for the case where
you don't ever use libvirtd on the target. This is why you ended
up having the weird situation with passing a dummy URI to virsh,
and then passing a libvirt URI as the second parameter. This leads
to a rather confusing setup for apps IMHO.
Actually the dummy URI is not caused by some kind of improper use
on my side. If someone wants to use the existing direct migration,
they end up passing URIs in this manner. The cause is that on one side
'dconnuri' is a required parameter for virsh and on the other side
it is ignored in direct migrations.
Well we have 2 URIs in the migration APIs, one URI is intended to
be a libvirt URI, and one is intended to be a hypervisor URI. The
way you implemented this initial patch requires a libvirt URI
to be provided where we have documented that a hypervisor URI should
be provided. This is what I consider to be wrong about the current
impl.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|