On 2015/4/10 15:54, Jiri Denemark wrote:
On Wed, Apr 08, 2015 at 15:40:36 +0800, zhang bo wrote:
> We recently encountered a problem:
> 1) migrate a domain
> 2) the client (say, the virsh command) unexpectedly *crashed*
> 3) *libvirtd kept migrating the domain*
> 4) after the client was restarted, it didn't know the guest was still migrating.
>
> The problem is that libvirtd and the client have different views of the task state.
> After the migration, the client may wrongly conclude that something went wrong
> because the domain was migrated unexpectedly.
>
> In my opinion, libvirtd should just *execute* tasks, like the hands of a human,
> while clients should be the brain that *schedules and remembers* tasks.
>
> So, to avoid this problem, we should let the client record all tasks somewhere
> and reload their states after it restarts. The client may then cancel or continue
> a task as it wishes.
> Libvirtd should not record the task status.
Not really. It's libvirtd, the daemon, which has to remember everything.
It manages the state of all domains running on a host and synchronizes
all clients that want to change the state of the domains. Remember, even if
a client is not restarted, domains may unexpectedly migrate somewhere
else because another client might have asked for it.
That said, if you're implementing a higher management layer which
manages domains using libvirt and you know it is going to be the only
client talking directly to libvirt, you can remember the state there if
you want. However, it's not something libvirt itself should or could do.
But you will most likely need to synchronize the state with libvirtd in
case the client is restarted. Even libvirtd has to synchronize its
internal state with all running QEMU processes when it is restarted
because the state might have changed.
Jirka
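
The synchronization described above (a client that keeps its own record and
re-checks it against the daemon after a restart) could be sketched roughly as
follows. This is only a minimal sketch under assumptions: TaskJournal,
reconcile() and the "migrating"/"running" state strings are hypothetical names
made up for illustration, not libvirt APIs. In a real client the daemon's view
would come from libvirt calls such as virDomainGetJobInfo(); here it is a plain
dict so the sketch runs on its own.

```python
import json
import os
import tempfile


class TaskJournal:
    """Hypothetical client-side record of tasks the client has started."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # An absent file simply means no task was ever recorded.
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def record(self, domain, task):
        tasks = self.load()
        tasks[domain] = task
        self._write(tasks)

    def forget(self, domain):
        tasks = self.load()
        tasks.pop(domain, None)
        self._write(tasks)

    def _write(self, tasks):
        # Write to a temporary file, then rename, so a client crash
        # cannot leave a half-written journal behind.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self.path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(tasks, f)
        os.replace(tmp, self.path)


def reconcile(journal, daemon_state):
    """Compare journaled migrations with the daemon's view after a restart.

    daemon_state (assumed encoding) maps a domain name to "migrating" or
    "running"; a domain missing from it no longer exists on this host.
    """
    decisions = {}
    for domain, task in journal.load().items():
        if task != "migrate":
            continue
        state = daemon_state.get(domain)
        if state == "migrating":
            decisions[domain] = "in_progress"        # wait for it or cancel it
        elif state is None:
            decisions[domain] = "check_destination"  # probably migrated away
        else:
            decisions[domain] = "not_migrated"       # still on the source
    return decisions
```

The journal only tells the client what it asked for. What actually happened
still has to be read back from the daemon, which is why reconcile() never
trusts the journal alone.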
Thank you, Jirka.
Let's go a step further. Suppose the client doesn't crash at step 2)
but merely loses its connection to libvirtd on the source side:
1) The client (nova) calls virDomainMigrateToURI2() to migrate a guest.
2) libvirtd on the source side connects to libvirtd on the destination side.
3) Unfortunately, the client (nova) somehow gets disconnected from libvirtd while
the guest is migrating.
4) virDomainMigrateToURI2() returns an error in the client (nova).
5) libvirtd, however, is not aware that the connection to the client is broken and
keeps migrating the guest to the destination.
6) The guest is eventually migrated to the destination.
7) Because nova on the source side thinks the migration failed in step 4), nova on
the destination treats the migrated-in guest as an unexpected running guest and
shuts it down.
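
The numbered steps above can be replayed as a small toy model. It is only an
illustration of the sequence, with assumed names (simulate(), the "guest"
label) and no real libvirt or nova calls. The single input is whether the
client stays connected until the migration ends.

```python
def simulate(client_stays_connected):
    """Replay the seven steps and return (source guests, destination guests)."""
    src_guests = {"guest"}
    dst_guests = set()

    # Steps 1-2: the client asks the source daemon to migrate the guest and
    # the source daemon connects to the destination daemon.
    # Steps 3-4: if the connection drops, the client's call returns an error,
    # so the management layer records the migration as failed.
    client_saw_success = client_stays_connected

    # Steps 5-6: the source daemon keeps migrating either way, and the guest
    # ends up running on the destination.
    src_guests.discard("guest")
    dst_guests.add("guest")

    # Step 7: the destination side only expects guests whose migration was
    # reported as successful and shuts down anything else it finds running.
    expected_on_dst = {"guest"} if client_saw_success else set()
    for unexpected in dst_guests - expected_on_dst:
        dst_guests = dst_guests - {unexpected}

    return src_guests, dst_guests
```

With simulate(True) the guest survives on the destination. With
simulate(False) it is left on neither host, which is the failure described
here.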
In the end the guest disappears because of the earlier disconnection between the
libvirtd client and server.
Even though libvirtd remembers everything, the client on the destination side still
wrongly killed the guest after the migration.
So how can this problem be solved? Should libvirtd watch its clients' connections
and cancel any running jobs that belong to a client as soon as that client
disconnects?
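
To make the question concrete, the behaviour being asked about could look
roughly like the sketch below. It is a toy model and says nothing about how
libvirtd actually behaves: ToyDaemon, start_migration() and
on_client_disconnect() are hypothetical names, and how the daemon would notice
the disconnect in the first place is left out.

```python
class ToyDaemon:
    """Toy stand-in for a daemon that ties each job to the client that started it."""

    def __init__(self):
        # domain name -> {"owner": client id, "status": job status}
        self.jobs = {}

    def start_migration(self, client_id, domain):
        self.jobs[domain] = {"owner": client_id, "status": "running"}

    def on_client_disconnect(self, client_id):
        """Cancel every running job owned by the client that just went away."""
        cancelled = []
        for domain, job in self.jobs.items():
            if job["owner"] == client_id and job["status"] == "running":
                job["status"] = "cancelled"
                cancelled.append(domain)
        return cancelled
```

Under this policy the error returned to the client in step 4) and the final
state of the guest would agree: the migration is cancelled, so the guest stays
on the source side and nothing unexpected shows up on the destination.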