[libvirt] [PATCH] qemu: turn on virtlockd by default

The virtlockd daemon has existed for years now, but we have never
turned it on by default, requiring explicit user opt-in. This leaves
users unprotected against accidents out of the box.

By turning it on by default, users will at least be protected from
mistakes involving local files, and files on shared filesystems that
support fcntl() (eg NFS).

In turning it on, the various service files are updated to have the
same dependencies for virtlockd as we have for virtlogd now, since
turning the latter on exposed some gaps.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 daemon/libvirtd.service.in       | 1 +
 src/locking/virtlockd.service.in | 1 +
 src/locking/virtlockd.socket.in  | 1 +
 src/qemu/qemu.conf               | 2 +-
 src/qemu/qemu_conf.c             | 3 +++
 5 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/daemon/libvirtd.service.in b/daemon/libvirtd.service.in
index bbf27da..c72dde5 100644
--- a/daemon/libvirtd.service.in
+++ b/daemon/libvirtd.service.in
@@ -6,6 +6,7 @@
 [Unit]
 Description=Virtualization daemon
 Requires=virtlogd.socket
+Requires=virtlockd.socket
 Before=libvirt-guests.service
 After=network.target
 After=dbus.service

diff --git a/src/locking/virtlockd.service.in b/src/locking/virtlockd.service.in
index 57089b0..69b568f 100644
--- a/src/locking/virtlockd.service.in
+++ b/src/locking/virtlockd.service.in
@@ -1,6 +1,7 @@
 [Unit]
 Description=Virtual machine lock manager
 Requires=virtlockd.socket
+Before=libvirtd.service
 Documentation=man:virtlockd(8)
 Documentation=http://libvirt.org

diff --git a/src/locking/virtlockd.socket.in b/src/locking/virtlockd.socket.in
index 9808bbb..45e0f20 100644
--- a/src/locking/virtlockd.socket.in
+++ b/src/locking/virtlockd.socket.in
@@ -1,5 +1,6 @@
 [Unit]
 Description=Virtual machine lock manager socket
+Before=libvirtd.service

 [Socket]
 ListenStream=@localstatedir@/run/libvirt/virtlockd-sock

diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
index a8cd369..3239f7b 100644
--- a/src/qemu/qemu.conf
+++ b/src/qemu/qemu.conf
@@ -535,7 +535,7 @@
 # share one writable disk, libvirt offers two approaches for
 # locking files. The first one is sanlock, the other one,
 # virtlockd, is then our own implementation. Accepted values
-# are "sanlock" and "lockd".
+# are "sanlock", "lockd", "nop". The default is "lockd".
 #
 #lock_manager = "lockd"

diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 6613d59..d4c6cdc 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -314,6 +314,9 @@ virQEMUDriverConfigPtr virQEMUDriverConfigNew(bool privileged)
     cfg->glusterDebugLevel = 4;
     cfg->stdioLogD = true;

+    if (VIR_STRDUP(cfg->lockManagerName, "lockd") < 0)
+        goto error;
+
     if (!(cfg->namespaces = virBitmapNew(QEMU_DOMAIN_NS_LAST)))
         goto error;
--
2.9.3
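[Editorial note, not part of the original mail: the protection discussed here rests on POSIX fcntl() advisory record locks, which also work over NFS. A minimal sketch of the underlying mechanism, not virtlockd's actual code (the image path is a throwaway temp file, and the fork is needed because POSIX record locks never conflict within a single process):

```python
import fcntl
import os
import tempfile

# Hypothetical stand-in for a disk image; any regular file works.
img = tempfile.NamedTemporaryFile(delete=False)
img.write(b"fake disk image")
img.flush()

# First "QEMU": take an exclusive (write) lock on the whole file.
fd = os.open(img.name, os.O_RDWR)
fcntl.lockf(fd, fcntl.LOCK_EX)

# Second "QEMU": a forked child trying to take a conflicting lock
# is refused, which is the accident protection being enabled here.
pid = os.fork()
if pid == 0:
    fd2 = os.open(img.name, os.O_RDWR)
    try:
        fcntl.lockf(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
        os._exit(0)   # lock unexpectedly granted
    except OSError:
        os._exit(1)   # lock denied, as expected
_, status = os.waitpid(pid, 0)
print("second locker denied:", os.WEXITSTATUS(status) == 1)
os.unlink(img.name)
```

This is POSIX-only (fork, fcntl) and illustrates only the conflict semantics; virtlockd itself holds the locks in a separate daemon precisely so they survive libvirtd restarts.]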

On Wed, Feb 01, 2017 at 16:54:01 +0000, Daniel Berrange wrote:
The virtlockd daemon has existed for years now, but we have never turned it on by default, requiring explicit user opt-in. This leaves users unprotected against accidents out of the box.
By turning it on by default, users will at least be protected from mistakes involving local files, and files on shared filesystems that support fcntl() (eg NFS).
In turning it on, the various service files are updated to have the same dependencies for virtlockd as we have for virtlogd now, since turning the latter on exposed some gaps.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> ---
Unfortunately ... NACK. While I have fixed quite a few bugs with locking and block jobs, some of them are still present, mostly in non-active-layer block commit, where we will leak the locked file once we finish the job, as we are not tracking the job progress (nor can we properly do so, since qemu jobs vanish right after finishing). I discussed it today with Kevin Wolf and John Snow and they plan to add a way for the jobs to linger. Until then I don't think we should enable this, especially not auto disk leases.

Peter

On Wed, Feb 01, 2017 at 07:04:54PM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 16:54:01 +0000, Daniel Berrange wrote:
The virtlockd daemon has existed for years now, but we have never turned it on by default, requiring explicit user opt-in. This leaves users unprotected against accidents out of the box.
By turning it on by default, users will at least be protected from mistakes involving local files, and files on shared filesystems that support fcntl() (eg NFS).
In turning it on, the various service files are updated to have the same dependencies for virtlockd as we have for virtlogd now, since turning the latter on exposed some gaps.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> ---
Unfortunately ... NACK, while I fixed quite some bugs with locking and block jobs few of them are still present. Mostly non-active layer block commit where we will leak the locked file once we finish the job as we are not tracking the job progress ( and nor can we properly do so since qemu jobs vanish right after finishing). I discussed it today with Kevin Wolf and John Snow and they plan to add a way for the jobs to linger.
Can't we detect completion / error via the block job events, even if the actual job object itself vanishes? The events look like they contain enough info to correlate back to libvirt's record of the job. This would avoid reliance on a new QEMU release.

Regards,
Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Wed, Feb 01, 2017 at 18:11:28 +0000, Daniel Berrange wrote:
On Wed, Feb 01, 2017 at 07:04:54PM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 16:54:01 +0000, Daniel Berrange wrote:
[...]
Unfortunately ... NACK, while I fixed quite some bugs with locking and block jobs few of them are still present. Mostly non-active layer block commit where we will leak the locked file once we finish the job as we are not tracking the job progress ( and nor can we properly do so since qemu jobs vanish right after finishing). I discussed it today with Kevin Wolf and John Snow and they plan to add a way for the jobs to linger.
Can't we detect completion / error via the block job events, even if the actual job object itself vanishes? The events look like they
We can and we do; the problem arises only if libvirtd is not running at that point, when it's hard to re-detect what happened. We also don't track which job we've started for non-active commit and block pull, and thus can't infer from the backing chain which part was dropped.
contain enough info to correlate back to libvirt's record of the job.
Well, it won't be that hard to fix for the active case. We need to track the backing chain fully anyway. The problem only remains in the case where we need to infer whether the job was or wasn't successful because we missed the event.

I've filed https://bugzilla.redhat.com/show_bug.cgi?id=1405537 a while ago so that I don't forget about this issue. I'm currently working on support for blockdev-add and friends, so this work will be somewhat relevant to that. I'll let you know once all known locking issues are resolved.

Peter
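[Editorial note, not part of the original mail: the correlation Daniel suggests can be sketched as follows. QEMU's QMP BLOCK_JOB_COMPLETED event carries a "device" field identifying the job, and an "error" field only on failure; an incoming event can therefore be matched back to a job record keyed on that device name. The jobs table and its contents below are hypothetical illustration, not libvirt's actual data structures:

```python
import json

# Hypothetical in-memory record of block jobs we started, keyed by the
# QMP 'device' string passed when launching the job.
jobs = {
    "drive-virtio-disk0": {"type": "commit", "top": "/imgs/snap1.qcow2"},
}

def handle_event(event_json):
    """Correlate a QMP event back to our job record.

    Returns (record, succeeded) for BLOCK_JOB_COMPLETED, else None.
    A job failed iff its event data carries an 'error' key.
    """
    event = json.loads(event_json)
    if event.get("event") != "BLOCK_JOB_COMPLETED":
        return None
    data = event["data"]
    job = jobs.pop(data["device"], None)  # drop record; release lock here
    return job, "error" not in data

evt = json.dumps({
    "event": "BLOCK_JOB_COMPLETED",
    "data": {"type": "commit", "device": "drive-virtio-disk0",
             "len": 10485760, "offset": 10485760, "speed": 0},
})
job, ok = handle_event(evt)
print(job["type"], ok)
```

The gap the thread identifies remains visible in this sketch: if the process holding `jobs` is down when the event fires, the event is lost and the record (and its lock) leaks.]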

On Thu, Feb 02, 2017 at 10:58:50AM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 18:11:28 +0000, Daniel Berrange wrote:
On Wed, Feb 01, 2017 at 07:04:54PM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 16:54:01 +0000, Daniel Berrange wrote:
[...]
Unfortunately ... NACK, while I fixed quite some bugs with locking and block jobs few of them are still present. Mostly non-active layer block commit where we will leak the locked file once we finish the job as we are not tracking the job progress ( and nor can we properly do so since qemu jobs vanish right after finishing). I discussed it today with Kevin Wolf and John Snow and they plan to add a way for the jobs to linger.
Can't we detect completion / error via the block job events, even if the actual job object itself vanishes? The events look like they
We can and we do, the problem is only if libvirt would not run at that point, when it's hard to re-detect what's happening. We also don't track which job we've started for non-active commit and block pull and thus can't infer from the backing chain which was dropped.
contain enough info to correlate back to libvirt's record of the job.
Well it won't be that hard to fix for the active case. We need to track the backing chain fully anyways. The problem only remains in case when we need to infer whether it was or wasn't successful if we missed the event.
I've filed https://bugzilla.redhat.com/show_bug.cgi?id=1405537 a while ago so that I don't forget about this issue.
I'm currently working on the support for blockdev-add and friends so this work will be somewhat relevant to that. I'll let you know once all known locking issues are resolved.
So based on the above, we only have a show-stopper problem if libvirtd is restarted while a job is running. IMHO if we can make job handling reliable while libvirtd is running, that is good enough to let us enable virtlockd by default, without needing to wait for QEMU improvements.

Regards,
Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Thu, Feb 02, 2017 at 10:02:54 +0000, Daniel Berrange wrote:
On Thu, Feb 02, 2017 at 10:58:50AM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 18:11:28 +0000, Daniel Berrange wrote:
On Wed, Feb 01, 2017 at 07:04:54PM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 16:54:01 +0000, Daniel Berrange wrote:
[..]
We can and we do, the problem is only if libvirt would not run at that point, when it's hard to re-detect what's happening. We also don't track which job we've started for non-active commit and block pull and thus can't infer from the backing chain which was dropped.
contain enough info to correlate back to libvirt's record of the job.
Well it won't be that hard to fix for the active case. We need to track the backing chain fully anyways. The problem only remains in case when we need to infer whether it was or wasn't successful if we missed the event.
I've filed https://bugzilla.redhat.com/show_bug.cgi?id=1405537 a while ago so that I don't forget about this issue.
I'm currently working on the support for blockdev-add and friends so this work will be somewhat relevant to that. I'll let you know once all known locking issues are resolved.
So based on the above we only have a show stopper problem if libvirtd is restarted while a job is running. IMHO if we can make job handling
Yes, if the block job finishes/fails while libvirtd is not running, the file will remain locked forever. I still consider that a serious problem, since you can't recover from it and the image stays locked. The tracking of the block job is still required, though, and we don't do that currently.
reliable while libvirtd is running that is good enough to let us enable
Off topic: I'd rather not make "good enough" a sufficient measure for adding stuff to libvirt ...
virtlockd by default, without needing to wait for QEMU improvements.
Peter

On Thu, Feb 02, 2017 at 01:05:17PM +0100, Peter Krempa wrote:
On Thu, Feb 02, 2017 at 10:02:54 +0000, Daniel Berrange wrote:
On Thu, Feb 02, 2017 at 10:58:50AM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 18:11:28 +0000, Daniel Berrange wrote:
On Wed, Feb 01, 2017 at 07:04:54PM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 16:54:01 +0000, Daniel Berrange wrote:
[..]
We can and we do, the problem is only if libvirt would not run at that point, when it's hard to re-detect what's happening. We also don't track which job we've started for non-active commit and block pull and thus can't infer from the backing chain which was dropped.
contain enough info to correlate back to libvirt's record of the job.
Well it won't be that hard to fix for the active case. We need to track the backing chain fully anyways. The problem only remains in case when we need to infer whether it was or wasn't successful if we missed the event.
I've filed https://bugzilla.redhat.com/show_bug.cgi?id=1405537 a while ago so that I don't forget about this issue.
I'm currently working on the support for blockdev-add and friends so this work will be somewhat relevant to that. I'll let you know once all known locking issues are resolved.
So based on the above we only have a show stopper problem if libvirtd is restarted while a job is running. IMHO if we can make job handling
Yes, if the block job finishes/fails while libvirtd is not running the file will remain locked forever. I still consider that a serious problem since you can't recover from that and the image stays locked.
The tracking of the block job is still required though and we don't do that currently.
reliable while libvirtd is running that is good enough to let us enable
Off topic: I'd rather not make "good enough" a sufficient measure for adding stuff to libvirt ...
In general, if libvirtd is not running, then all bets are off wrt management of VMs. We've made a reasonable effort to clean up / fix things, but we've never said everything will work correctly if libvirtd is stopped while key events occur.

Regards,
Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Thu, Feb 02, 2017 at 12:09:07 +0000, Daniel Berrange wrote:
On Thu, Feb 02, 2017 at 01:05:17PM +0100, Peter Krempa wrote:
On Thu, Feb 02, 2017 at 10:02:54 +0000, Daniel Berrange wrote:
On Thu, Feb 02, 2017 at 10:58:50AM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 18:11:28 +0000, Daniel Berrange wrote:
On Wed, Feb 01, 2017 at 07:04:54PM +0100, Peter Krempa wrote:
On Wed, Feb 01, 2017 at 16:54:01 +0000, Daniel Berrange wrote:
[...]
Yes, if the block job finishes/fails while libvirtd is not running the file will remain locked forever. I still consider that a serious problem since you can't recover from that and the image stays locked.
The tracking of the block job is still required though and we don't do that currently.
reliable while libvirtd is running that is good enough to let us enable
Off topic: I'd rather not make "good enough" a sufficient measure for adding stuff to libvirt ...
In general if libvirtd is not running, then all bets are off wrt to management of VMs. We've made reasonable effort to clean up / fix things but we've never said everything will work correctly if libvirtd is stopped while key events occur.
Also, once we have the job tracking done, we can always unlock the image file once the block job vanishes after a libvirtd restart. That way we won't ever leak the lock, and qemu should not touch the file afterwards anyway. But we still need to unlock it regardless of how the job finished.

On Wed, Feb 01, 2017 at 04:54:01PM +0000, Daniel P. Berrange wrote:
The virtlockd daemon has existed for years now, but we have never turned it on by default, requiring explicit user opt-in. This leaves users unprotected against accidents out of the box.
By turning it on by default, users will at least be protected for mistakes involving local files, and files on shared filesystems that support fcntl() (eg NFS).
What are the implications of this for passively reading live disks? (a la tools such as virt-df)

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

On Thu, Feb 02, 2017 at 12:47:30PM +0000, Richard W.M. Jones wrote:
On Wed, Feb 01, 2017 at 04:54:01PM +0000, Daniel P. Berrange wrote:
The virtlockd daemon has existed for years now, but we have never turned it on by default, requiring explicit user opt-in. This leaves users unprotected against accidents out of the box.
By turning it on by default, users will at least be protected for mistakes involving local files, and files on shared filesystems that support fcntl() (eg NFS).
What are the implications of this for passively reading live disks? (a la tools such as virt-df)
None - at this time, a disk marked <readonly/> won't acquire any locks. We map:

  <readonly/>  -> no lock
  <shareable/> -> fcntl read lock
  default      -> fcntl write lock

In future we'll likely copy the trick qemu has done to use a pair of fcntl locks on separate bytes, in order to map all combinations to locks. As long as you always mark disks <readonly/>, though, you should be fine no matter what we do.

Regards,
Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|
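[Editorial note, not part of the original mail: the "pair of fcntl locks on separate bytes" idea can be sketched roughly as below. This is a simplified illustration of the concept, not qemu's actual byte layout: a process takes shared locks on designated bytes to advertise "I am writing this image" and "others must not write it", and any other process can probe a byte with a non-blocking exclusive lock to see whether someone holds it. The probe must run in a separate process, since POSIX record locks never conflict within one process.

```python
import fcntl
import os
import tempfile

BYTE_USING_WRITE = 0   # shared-locked by any process writing the image
BYTE_BLOCK_WRITE = 1   # shared-locked by any process denying writes to others

def lock_byte(fd, offset):
    # Shared lock on a single byte; many holders can coexist.
    fcntl.lockf(fd, fcntl.LOCK_SH, 1, offset)

def byte_is_locked(path, offset):
    # Probe from a forked child: try a non-blocking exclusive lock
    # on that one byte; failure means some process holds it shared.
    pid = os.fork()
    if pid == 0:
        fd = os.open(path, os.O_RDWR)
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, offset)
            os._exit(0)   # byte was free
        except OSError:
            os._exit(1)   # byte is held
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1

img = tempfile.NamedTemporaryFile(delete=False)
img.write(b"\0\0")
img.flush()

# An exclusive writer announces write use AND blocks others from writing.
fd = os.open(img.name, os.O_RDWR)
lock_byte(fd, BYTE_USING_WRITE)
lock_byte(fd, BYTE_BLOCK_WRITE)

writer_present = byte_is_locked(img.name, BYTE_USING_WRITE)
write_blocked = byte_is_locked(img.name, BYTE_BLOCK_WRITE)
print(writer_present, write_blocked)
os.unlink(img.name)
```

Because the "using" and "blocking" bytes are independent, this scheme can express all the combinations the reply mentions, e.g. a <shareable/> disk would lock only its "using" byte and leave the "block" byte free.]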
participants (3):
- Daniel P. Berrange
- Peter Krempa
- Richard W.M. Jones