On Wed, Nov 15, 2017 at 02:24:48PM +0000, Branimir Pejakovic wrote:
Dear colleagues,
I am facing a problem that has been troubling me for the last week and a
half. Please help or offer some guidance if you are able to.
I have a non-prod POC environment with 2 CentOS7 fully updated hypervisors
and an NFS filer that serves as a VM image storage. The overall environment
works exceptionally well. However, starting a few weeks ago I have been
trying to implement virtlock in order to prevent a VM running on 2
hypervisors at the same time.
[snip]
h2 # virsh start test09
error: Failed to start domain test09
error: resource busy: Lockspace resource
'/storage_nfs/images_001/test09.qcow2' is locked
[snip]
Now, I am pretty sure that I am missing something simple here since this is
a standard feature and should work out of the box if set correctly but so
far I cannot see what I am missing.
So I think you are hitting a little surprise in the way our locking
works. Specifically, right now the locking only protects the image
file contents from concurrent writes. We don't have locking around
the file attributes (permissions, user/group ownership, SELinux label,
etc).
Unfortunately with the current libvirt design, the security drivers run
before locking takes effect. So what happens is that you have your first
VM running normally. It has been granted the ability to write to the image
in terms of SELinux label & permissions/ownership. The lock manager is
holding locks protecting the image contents.
Now you try to start the second guest, and libvirt will apply the SELinux
label & permissions/ownership needed for that second guest, despite the
image being in use by the first guest. Only then do we acquire the locks
for the disk image, and fail because the first guest holds the lock. We now
reset the permissions/ownership we just granted for the second guest,
which unfortunately blocks the first guest from using the image,
causing the I/O errors you mention.
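
To make the ordering concrete, here is a toy Python model of the sequence
described above. It is only a sketch: Image, start_guest and can_write are
made-up names for illustration, not libvirt code or API, and a single
"label" field stands in for SELinux label plus ownership/permissions.

```python
class Image:
    """Toy stand-in for a disk image (hypothetical, not a libvirt type)."""

    def __init__(self):
        self.label = None       # guest the file attributes currently allow
        self.lock_owner = None  # guest holding the content lock


def start_guest(image, guest):
    """Model of the current ordering: relabel first, lock second."""
    # Step 1: the security driver relabels the image for this guest,
    # even if another guest is already using it.
    image.label = guest
    # Step 2: only now is the content lock requested.
    if image.lock_owner is not None:
        # Step 3: the lock is busy, so the label just applied is reset.
        # This also removes the access the lock holder depended on.
        image.label = None
        return False
    image.lock_owner = guest
    return True


def can_write(image, guest):
    """A guest can write only while the attributes are set up for it."""
    return image.label == guest


image = Image()
first_started = start_guest(image, "guest1")
second_started = start_guest(image, "guest2")
first_can_still_write = can_write(image, "guest1")
```

In this model the second start is refused, guest1 still owns the lock, and
yet guest1 can no longer write, which matches the symptoms you describe.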
We *have* successfully prevented 2 guests from writing to the same
image at once, so your data is still safe. Unfortunately though the
first guest cannot write any further data, so that previously
running guest is now fubar :-(
I appreciate this is rather surprising & unhelpful in general. Just
console yourself with the fact that at least your disk image is not
corrupted.
Note, this should only happen with SELinux enforcing though - if it is
permissive, then I'd expect the first guest to carry on working.
We would like to improve our locking so that we can apply locks before
we even try to change ownership/permissions/selinux, which would make
it far more useful. We've never successfully completed that work though.
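
As a rough illustration of that lock-first ordering, here is a toy Python
model. It is purely hypothetical: this is not implemented in libvirt, and
Image, start_guest_lock_first and can_write are invented names, with one
"label" field standing in for SELinux label plus ownership/permissions.

```python
class Image:
    """Toy stand-in for a disk image (hypothetical, not a libvirt type)."""

    def __init__(self):
        self.label = None       # guest the file attributes currently allow
        self.lock_owner = None  # guest holding the content lock


def start_guest_lock_first(image, guest):
    """Model of the desired ordering: lock first, relabel second."""
    # Step 1: try to take the lock before touching any file attribute.
    if image.lock_owner is not None:
        # Refused early: the label of the running guest is left alone.
        return False
    image.lock_owner = guest
    # Step 2: only the lock holder gets to change the attributes.
    image.label = guest
    return True


def can_write(image, guest):
    """A guest can write only while the attributes are set up for it."""
    return image.label == guest


image = Image()
first_started = start_guest_lock_first(image, "guest1")
second_started = start_guest_lock_first(image, "guest2")
first_can_still_write = can_write(image, "guest1")
```

Here the second start still fails, but the first guest keeps its access
because nothing was relabelled on its behalf.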
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|