On Tue, 30 Jul 2019 14:00:48 +0100
Daniel P. Berrangé <berrange(a)redhat.com> wrote:
On Tue, Jul 30, 2019 at 02:49:02PM +0200, Stephan von Krawczynski wrote:
> On Tue, 30 Jul 2019 11:35:45 +0100
> Daniel P. Berrangé <berrange(a)redhat.com> wrote:
>
> > On Mon, Jul 29, 2019 at 02:04:14PM +0200, Stephan von Krawczynski wrote:
> >
> > > Hello,
> > >
> > > is there some immanent code in libvirt that forces UID/GID of the
> > > libvirt standard user to be the same on two boxes migrating qemu vms
> > > against each other?
> > > The migration itself uses root obviously (password is requested). But
> > > if a vm xml does not contain any definition regarding UID/GID what
> > > else could prevent this from working?
> > >
> > > I believe I ran into such a problem trying to migrate and ending up in
> > > an error, a vm still working on original host but its fs (netfs pool
> > > (nfs/raw)) being switched to read-only...
> >
> > When migrating a VM whose image is hosted on NFS, you have 2 QEMU
> > processes which both need to be able to open the same image file at the
> > same time. QEMU runs as an unprivileged user normally, and so the disk
> > images get chowned to this unprivileged user by libvirt when QEMU is
> > started. If the QEMU on the target host is given a UID/GID that's
> > different from the QEMU on the source host, then the target QEMU will
> > likely have problems opening the image.
> >
> > Basically when using shared FS storage, the rule is to have all your
> > hosts configured in the same way from libvirt's POV.
> >
> > Regards,
> > Daniel
>
> Hello Daniel,
>
> thank you for the short explanation. The key words are "both need to be
> able to open the same image file at the same time".
> I would not have expected that. I thought qemu 1 would close and exit, and
> then qemu 2 would open the image, which means it can change the uid/gid
> right before
This is supposed to be a safety thing. If anything goes badly wrong on the
target host you need to be able to roll back & continue running on the source
host. So the source VM doesn't want to close the disk images until the target
VM has confirmed it is successfully running. This implies there is a period
of time when both have to have the disk image open. Crucially though, even
when 2 QEMUs have the disks open, only *1* QEMU has the guest CPU running
and permitting disk writes.
> - just as in normal operation.
> Is this the reason why my failed attempt leaves me with a read-only fs on the
> guest? Which I would see as a bug, no?
> Turning it read-only is possibly the only way to not corrupt the fs image
> if two qemus have it open simultaneously.
The guest sets its FS read-only when it gets an I/O error reported by
the virtual disk driver. QEMU reports an I/O error to the guest when
it in turn gets an I/O error on the host storage. This happens because
QEMU is losing privileges to access the disks.
Regards,
Daniel
Ok, the source VM does not want to close the image until it is confirmed that
the destination VM is up and running. But if it is not up and running and the
source VM still cannot go on - because its fs is read-only - then what's the
use of keeping it open in the first place?
The simple fact is: a reproducibly failing migration leaves you with a
non-working guest. I cannot think of any argument supporting the idea that
this is not a bug. I mean, the whole point of migration is to keep the guest
up...
Otherwise you could just as well copy the config with the guest offline.
And: the source guest VM has no I/O error. The destination guest VM cannot
touch the fs image for permission reasons, so the fs is still safe in the state
the frozen source guest left it.
--
Regards,
Stephan
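
The UID/GID requirement discussed in this thread can be checked before
attempting a migration. Below is a minimal sketch, not anything libvirt
itself ships: it compares the numeric owner of a disk image on the shared
storage with the IDs that a given account name resolves to on the local
host. The account names ("qemu" for both user and group) and the image path
are assumptions for illustration; the real names depend on the distribution
and on the user/group settings in /etc/libvirt/qemu.conf. Run it on both
hosts while the guest is running on the source; if the target host reports
a mismatch, the target QEMU will likely have trouble opening the image.

```python
import grp
import os
import pwd
import sys


def lookup_ids(user, group):
    """Resolve account names to the numeric UID/GID used on this host."""
    return pwd.getpwnam(user).pw_uid, grp.getgrnam(group).gr_gid


def ownership_mismatch(path, expected_uid, expected_gid):
    """Return a list describing how the file's owner differs from the expected IDs.

    An empty list means the numeric UID and GID both match.
    """
    st = os.stat(path)
    problems = []
    if st.st_uid != expected_uid:
        problems.append(f"uid: file has {st.st_uid}, expected {expected_uid}")
    if st.st_gid != expected_gid:
        problems.append(f"gid: file has {st.st_gid}, expected {expected_gid}")
    return problems


if __name__ == "__main__":
    # Hypothetical usage: check_owner.py <image-path> [user] [group]
    if len(sys.argv) < 2:
        sys.exit("usage: check_owner.py <image-path> [user] [group]")
    image = sys.argv[1]
    user = sys.argv[2] if len(sys.argv) > 2 else "qemu"    # assumed account name
    group = sys.argv[3] if len(sys.argv) > 3 else "qemu"   # assumed group name
    uid, gid = lookup_ids(user, group)
    issues = ownership_mismatch(image, uid, gid)
    if issues:
        print(f"{image}: ownership differs from {user}:{group} on this host")
        for issue in issues:
            print("  " + issue)
        sys.exit(1)
    print(f"{image}: owned by {uid}:{gid}, matches {user}:{group} on this host")
```

The comparison is on numeric IDs rather than names because NFS (in its
common configuration) passes numeric UIDs/GIDs, so two hosts can both have
a "qemu" account and still disagree about who owns the file.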