On Tue, Jul 2, 2019 at 10:29 AM Daniel P. Berrangé <berrange(a)redhat.com> wrote:
On Mon, Jul 01, 2019 at 08:20:44PM +0200, Christian Ehrhardt wrote:
> On Mon, Jul 1, 2019 at 7:56 PM Daniel P. Berrangé <berrange(a)redhat.com> wrote:
> >
> > On Mon, Jul 01, 2019 at 05:26:45PM +0200, Christian Ehrhardt wrote:
> > > Hi,
> > > today I was debugging an issue that I found with qemu 4.0 and ended up
> > > puzzled about file ownership. I'm almost EOD now, but wanted to reach
> > > out here for a sanity check before I debug further. The case
> > > summarized is this:
> > >
> > > 1. start guest
> > > 1.1 image files are changed to libvirt-qemu:kvm (which matches
> > > Ubuntu's user/group config)
> > > 2. live migrate the guest to a different node
> > > 2.1 image files go back to root:root which they initially had (ok)
> > > 3. migrate guest back to the original node
> > > x. image files stay root:root and are not changed back to
> > > libvirt-qemu:kvm
> > >
> > > That is odd/unexpected, but it seems the same applies to older
> > > versions and there it was never a problem so far.
> >
> > This sounds odd - I see no reason why the first migration should
> > behave differently from the second migration, as libvirt makes
> > no distinction & has no knowledge of previous migrations.
>
> I might have been unclear, let me clarify:
> 1. migration without copy (changes the ownership)
> lxc exec testkvm-eoan-from -- virsh migrate --live --unsafe
> kvmguest-eoan-normal qemu+ssh://10.222.144.19/system
> (and the same type backwards)
> 2. migration away from here with --copy-storage* fails due to file ownership
>
> > Could the two hosts be configured differently in some way? For
> > example, is the disk image on ext4 on one host, but NFS on the
> > other host? With shared filesystems, we'd generally expect
> > the disk to be exposed on a shared filesystem on both hosts.
>
> The two hosts are in fact two LXD containers sharing the same
> filesystem (for the first migration).
> They are still two containers, but without a shared FS, for the second
> copy-storage-* migration, which then eventually fails due to the bad
> file ownership.
>
> But since they in fact use shared storage that might still be the
> reason - maybe the ordering is important here.
> Since in the setup for migration #1 they really use shared storage it
> might be that we have something like
>
> 1. source: migrates off guest
> 2. target: receives guest
> 3. target: guest is complete and changes file ownership to
> libvirt-qemu:kvm (correct)
> 4. source: shuts down the stub that is left after migration and
> changes file ownership to root:root
>
> Due to really using the same storage (not like the same FCP or scsi
> device but the same FS) that could be the reason the ownership after a
> migration cycle is reset.
> Thanks for the hint, worth a check with some slightly altered setups ...
> ... and confirmed - if I use copied images, but not on a shared FS the
> ownership is handled correctly.
>
> So my way of sharing the FS might be odd. Libvirt does not detect
> it as shared, and the ordering above then triggers the issue: the
> ownership is set to root:root by the migration source after the
> migration is complete.

Yes, this is going to be a problem. Libvirt will look at the filesystem
and see that it is a local-only filesystem, and so assume it can safely
reset ownership when the source VM shuts down. It cannot tell that this is
going to negatively impact the target VM using the same filesystem.

This is an example of one of the problems that makes us say that
localhost migration is not supported.

If you're going to use containers, you need to make sure that each
container either has a separate filesystem mount, or that the container
sees a filesystem like NFS so that libvirt knows it is shared.
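The ordering problem described above can be replayed with a toy model. This is only a sketch and not libvirt code: the function name `migrate` and the flag `source_sees_shared_fs` are made up for illustration, and the assumption that the source skips the ownership reset when it detects a shared filesystem is a reading of the explanation above.

```python
def migrate(owner, source_sees_shared_fs):
    """Replay the event order from this thread on one image file that
    both hosts actually share. Returns (final_owner, event_log).

    Hypothetical model: names and behaviour are illustrative only.
    """
    events = [("before migration", owner)]

    # Target: the guest has arrived, so the image is handed to QEMU.
    owner = "libvirt-qemu:kvm"
    events.append(("target takes over", owner))

    # Source: cleans up the leftover stub. It resets ownership only
    # when it believes the filesystem is local-only (assumption).
    if not source_sees_shared_fs:
        owner = "root:root"
    events.append(("source cleanup", owner))

    return owner, events


if __name__ == "__main__":
    for seen_as_shared in (False, True):
        final, _ = migrate("root:root", seen_as_shared)
        print(f"detected as shared={seen_as_shared}: final owner {final}")
```

With `source_sees_shared_fs=False` the last writer is the source cleanup, which matches the root:root result reported in this thread.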
> Quoting the virsh man page, my setup is similar to "disk images are
> stored on coherent clustered filesystem, such as GFS2 or GPFS" but my
> libvirt doesn't know that and therefore changes the ownership.
> Maybe it would have the same bug on GFS2/GPFS?
> I don't have such a setup at hand; how is the issue avoided there?
> Is there a way to make libvirt realize that it really is on "the same"
> FS to avoid these operations?

Libvirt uses statfs to find the filesystem magic.
virFileIsSharedFSType checks for NFS, GFS2, OCFS, AFS, SMB,
CIFS, CEPH & GPFS.

Thanks for the pointer, Daniel!
According to stat -f [1], the filesystem type appears to be the same
(ext2/ext3) as on the host.
I started a discussion with our container people about whether I can
detect/differentiate that somehow.
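For reference, the kind of check described above can be approximated from a script. This is a rough sketch, not libvirt's implementation: the magic numbers are the values as I recall them from the Linux headers and should be verified against linux/magic.h and the libvirt source, and the helper names `shared_fs_name` and `fs_magic` are made up. It assumes GNU coreutils `stat`, whose `%t` format prints the filesystem type in hex.

```python
import subprocess

# Filesystem magic numbers (assumption: verify against linux/magic.h
# and the libvirt source before relying on them).
SHARED_FS_MAGIC = {
    0x6969: "NFS",
    0x01161970: "GFS2",
    0x7461636F: "OCFS2",
    0x6B414653: "AFS",
    0x517B: "SMB",
    0xFF534D42: "CIFS",
    0x00C36400: "CEPH",
    0x47504653: "GPFS",
}
EXT_MAGIC = 0xEF53  # ext2, ext3 and ext4 all report this value


def shared_fs_name(magic):
    """Return the shared filesystem name for a magic number, or None."""
    return SHARED_FS_MAGIC.get(magic)


def fs_magic(path):
    """Return the filesystem magic of `path` using `stat -f -c %t`."""
    out = subprocess.run(
        ["stat", "-f", "-c", "%t", path],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return int(out, 16)
```

A bind-mounted ext filesystem inside a container reports `EXT_MAGIC`, so `shared_fs_name(fs_magic(path))` returns `None` there, which is consistent with what stat -f shows in [1].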
Until then, now that I'm aware of it, I can just let automation clean up
ownership after those actions.
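Such a cleanup step could look roughly like the following. It is a minimal sketch under assumptions: the defaults libvirt-qemu and kvm are taken from the Ubuntu setup mentioned earlier in the thread, the helper names are hypothetical, and the script needs enough privileges to change ownership.

```python
import grp
import os
import pwd
import shutil


def needs_chown(path, uid, gid):
    """True if `path` is not owned by the given uid and gid."""
    st = os.stat(path)
    return st.st_uid != uid or st.st_gid != gid


def fix_ownership(paths, user="libvirt-qemu", group="kvm"):
    """Hand image files back to the QEMU user after a migration.

    Returns the list of paths that were actually changed.
    """
    uid = pwd.getpwnam(user).pw_uid
    gid = grp.getgrnam(group).gr_gid
    changed = []
    for path in paths:
        if needs_chown(path, uid, gid):
            shutil.chown(path, user=user, group=group)
            changed.append(path)
    return changed
```

Checking first and changing only on a mismatch keeps the step idempotent, so it can run after every migration without touching files that are already correct.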
[1]: http://paste.ubuntu.com/p/jffSrtKg7t/
--
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd