On Wed, Feb 26, 2020 at 10:50:20AM +0530, Shaju Abraham wrote:
> On Tue, Feb 25, 2020 at 6:29 PM Daniel P. Berrangé <berrange(a)redhat.com> wrote:
> > On Wed, Feb 19, 2020 at 02:39:22AM +0000, Shaju Abraham wrote:
> > >
> > > On 2/11/20, 7:06 PM, "Daniel P. Berrangé" <berrange(a)redhat.com> wrote:
> > >
> > > > On Tue, Feb 11, 2020 at 10:05:53AM +0100, Martin Kletzander wrote:
> > > > > On Wed, Feb 05, 2020 at 05:32:50PM +0000, Daniel P. Berrangé wrote:
> > > > > > On Mon, Feb 03, 2020 at 12:43:32PM +0000, Daniel P. Berrangé wrote:
> > > > > > > From: Shaju Abraham <shaju.abraham(a)nutanix.com>
> > > > > > >
> > > > > > > There are various config paths that a VM uses. The monitor
> > > > > > > paths and other lib paths are examples. These paths are tied
> > > > > > > to the VM name or UUID. The local migration breaks the
> > > > > > > assumption that there will be only one VM with a unique UUID
> > > > > > > and name. During local migrations there can be multiple VMs
> > > > > > > with the same name and UUID on the same host. Append the
> > > > > > > domain-id field to the path so that there is no duplication
> > > > > > > of path names.
> > > > > >
> > > > > > This is the really critical problem with localhost migration.
> > > > > >
> > > > > > Appending the domain-id looks "simple" but this is a significant
> > > > > > behavioural / functional change for applications, and I don't
> > > > > > think it can fully solve the problem either.
> > > > > >
> > > > > > This is changing the paths used in various places where libvirt
> > > > > > internally generates unique paths (eg the QMP socket, huge page
> > > > > > or file based memory paths, and the defaults used for
> > > > > > auto-filling device paths such as <channel> if not specified).
> > > > > >
> > > > > > Some of these paths are functionally important to external
> > > > > > applications and cannot be changed in this way. eg stuff
> > > > > > integrating with QEMU can be expecting certain memory backing
> > > > > > file paths, or certain <channel> paths, & is liable to break
> > > > > > if we change the naming convention.
> > > > > >
> > > > > > For the sake of argument, let's assume we can change the naming
> > > > > > convention without breaking anything...
> > > > > >
> > > > > This was already done in (I would say) most places as they use
> > > > > virDomainDefGetShortName() to get a short, unique name of a
> > > > > directory -- it uses the domain ID as a prefix.
> > > > >
> > > > > > This only applies to paths libvirt generates at VM startup.
> > > > > >
> > > > > > There are plenty of configuration elements in the guest XML
> > > > > > that are end user / mgmt app defined, and reference paths in
> > > > > > the host OS.
> > > > > >
> > > > > > For example <graphics>, <serial>, <console>, <channel>
> > > > > > all support UNIX domain sockets and TCP sockets. A UNIX
> > > > > > domain socket cannot be listened on by multiple VMs
> > > > > > at once. If the UNIX socket is in client mode, we cannot
> > > > > > assume the thing QEMU is connecting to allows multiple
> > > > > > concurrent connections. eg 2 QEMUs could have their
> > > > > > <serial> connected together over a UNIX socket pair.
> > > > > > Similarly if automatic TCP port assignment is not used
> > > > > > we cannot have multiple QEMUs listening on the same
> > > > > > host.
> > > > > >
> > > > > > One answer is to say that localhost migration is just
> > > > > > not supported for such VMs, but I don't find that very
> > > > > > convincing because the UNIX domain socket configs
> > > > > > affected are in common use.
> > > > > >
> > > > > I would be okay with saying that these either need to be changed
> > > > > in a provided destination XML or the migration will probably
> > > > > break. I do not think it is unreasonable to say that if users are
> > > > > trying to shoot themselves, we should not spend a ridiculous
> > > > > amount of time just so we can prevent that. Otherwise we will get
> > > > > to the same point as we are now -- there might be a case where a
> > > > > local migration would work, but users cannot execute it even if
> > > > > they were very cautious and went through all the things that
> > > > > could prevent it from the technical point of view; libvirt will
> > > > > still disallow that.
> > > >
> > > > If there are clashing resources, we can't rely on QEMU reporting an
> > > > error. For example with a UNIX domain socket, the first thing QEMU
> > > > does is unlink(/socket/path), which will blow away the UNIX domain
> > > > socket belonging to the original QEMU. As a result, if migration
> > > > fails and we roll back, the original QEMU will be broken.
> > >
> > > By appending the id field to the path, we are effectively nullifying
> > > this particular concern. Each QEMU instance will get its own unique
> > > path and monitor. If a migration fails, we can roll back.
> >
> > No, you've not nullified the problem. This only helps the cases where
> > libvirt generates the path. This is only a subset of affected cases.
> > Just one example:
> >
> >   <graphics type='vnc' socket='/some/path/QEMUGuest1-vnc.sock'>
> >
> > there are many other parts of the domain XML that accept UNIX socket
> > paths where the mgmt app picks the socket path. Nothing is validating
> > this to prevent clashes between the src+dst QEMU on the same host,
> > meaning on migration rollback, src QEMU is broken.
>
> It is true that I have not covered all the use cases. I would like to
> know if the approach using the symlink is acceptable. In that case we
> can have the same design for externally generated paths as well. Citing
> your example, if the management application provides a path like
>
>   <graphics type='vnc' socket='/some/path/QEMUGuest1-vnc.sock'>
>
> we can always treat the path '/some/path/QEMUGuest1-vnc.sock' as a
> symlink to the internally generated path, formed by appending the ID
> field:
>
>   /some/path/QEMUGuest1-vnc.sock -> /some/path-ID/QEMUGuest1-vnc.sock
>
> The management application will always refer to the path it has
> provided, and that path will remain valid even during migration. At the
> end of migration the symlink will be changed to point to the
> destination QEMU.

This sounds like it could be the least worst solution to the problem
of clashing UNIX sockets.

We've still got a log file problem. This could potentially be done the
same way with symlinks, but that will cause significant complications
with cumulative logfile size limiting and rollover processing. We
already pass in pre-opened FDs to QEMU from virtlogd, so potentially
we can ask virtlogd for the pre-existing FDs it gave to the src
QEMU and also give them to the dest QEMU for reuse.

Clashing IP addresses are also a problem. For VNC/SPICE at least we
have the auto-port allocation which is a good solution. We might need
to consider expanding that to other scenarios.

The only other gross hack I thought of would be to live migrate twice:
once to a new QEMU that uses arbitrary paths, and then again to a new
QEMU that uses the user-specified paths. That has bad recovery scenarios
though if the 2nd migration fails, which makes it impractical I think.

A further option would be to have a way for QEMU to pass its existing
open FDs for chardevs & sockets, etc. to the new QEMU using SCM_RIGHTS
passing. This is conceptually my favourite, but I think it is not
practical in reality given the way that QEMU backends are implemented.
All these FDs are needed during early QEMU initialization, before the
migration data stream is established.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|