[libvirt-users] Locking without virtlockd (nor sanlock)?

Hi list, I would like to ask for a clarification about how locking works. My test system is CentOS 7.7 with libvirt-4.5.0-23.el7_7.1.x86_64.

I was under the impression that, by default, libvirt does not use any locks. From here [1]: "The out of the box configuration, however, currently uses the nop lock manager plugin". As "lock_manager" is commented out in my qemu.conf file, I was expecting that no locks were used to protect my virtual disk from a guest double-start or misassignment to other VMs.

However, "cat /proc/locks" shows the following (17532905 being the vdisk inode):

    [root@localhost tmp]# cat /proc/locks | grep 17532905
    42: OFDLCK ADVISORY READ -1 fd:00:17532905 201 201
    43: OFDLCK ADVISORY READ -1 fd:00:17532905 100 101

Indeed, trying to associate the disk with another machine and booting it gives me an error (stating that the disk is already in use).

After enabling the "lockd" plugin and starting the same machine, "cat /proc/locks" looks different:

    [root@localhost tmp]# cat /proc/locks | grep 17532905
    31: POSIX ADVISORY WRITE 19266 fd:00:17532905 0 0
    32: OFDLCK ADVISORY READ -1 fd:00:17532905 201 201
    33: OFDLCK ADVISORY READ -1 fd:00:17532905 100 101

As you can see, an *additional* write lock was granted. Again, assigning the disk to another VM and booting it up ends with the same error.

So, may I ask:
- why does libvirtd request READ locks even with the "lock_manager" option commented out?
- does it mean that I can avoid modifying anything, relying on libvirtd to correctly lock image files?
- if so, for which use cases should I use virtlockd?

Thanks.

[1] https://libvirt.org/locking-lockd.html

--
Danti Gionatan
Technical Support
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
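A minimal sketch of the check performed above, assuming a hypothetical image path (substitute your own disk):

    # find the inode of the disk image backing the running guest
    stat -c '%i %n' /var/lib/libvirt/images/guest.qcow2
    # prints e.g.: 17532905 /var/lib/libvirt/images/guest.qcow2

    # list any kernel-level advisory locks held on that inode
    grep 17532905 /proc/locks

OFDLCK entries are open file description locks (fcntl() with F_OFD_SETLK); since they belong to a file description rather than a process, /proc/locks reports their PID as -1, which matches the output above. POSIX entries are classic per-process fcntl() record locks.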

On 28-12-2019 01:39, Gionatan Danti wrote:
[...]
Ok, maybe I found some answers: from what I read here [1] and here [2], QEMU started to automatically lock disk image files to prevent corruption from processes outside libvirt's scope (i.e., manually issued "qemu-img" commands).

Do you suggest relying on QEMU's own locks or using virtlockd (in addition to the QEMU locks)? Whatever the answer is, can you explain why?

Thanks.

[1] https://qemu.weilnetz.de/doc/2.12/qemu-doc.html#disk_005fimage_005flocking
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1378241
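As a concrete illustration of the QEMU-side locking referenced above, here is roughly what happens when qemu-img touches an image a running guest holds open (the path is hypothetical and the output abridged; the -U/--force-share flag exists since QEMU 2.10):

    # while the guest is running, qemu-img refuses to open the image
    $ qemu-img info /var/lib/libvirt/images/guest.qcow2
    qemu-img: Could not open '/var/lib/libvirt/images/guest.qcow2':
    Failed to get shared "write" lock
    Is another process using the image?

    # read-only inspection is still possible by declining to take the lock
    $ qemu-img info -U /var/lib/libvirt/images/guest.qcow2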

On 28-12-2019 14:36, Gionatan Danti wrote:
[...]
Hi all, any suggestion on the matter? Thanks.

On Sat, Dec 28, 2019 at 02:36:27PM +0100, Gionatan Danti wrote:
[...]
Ok, maybe I found some answers: from what I read here [1] and here [2], QEMU started to automatically lock disk image files to prevent corruption from processes outside libvirt's scope (i.e., manually issued "qemu-img" commands).
Yes, this is correct: the OFDLCK entries you are seeing are held by QEMU itself and cannot be turned off.
Do you suggest relying on QEMU's own locks or using virtlockd (in addition to the QEMU locks)? Whatever the answer is, can you explain why?
The QEMU locks use fcntl() as their implementation, and as such they only apply to the local machine's filesystem, except when using NFS, which is cross-node.

virtlockd also uses fcntl(); however, it doesn't have to acquire locks on the file/block device directly. It can use a look-aside file for locking, for example a path under /var/lib/libvirt/lock. This means that locks on the block device /dev/sda1 would be held as /var/lib/libvirt/lock/$HASH(/dev/sda1).

If you mount /var/lib/libvirt/lock on NFS, these locks now apply across all machines which use the same block devices. This is useful when your block device storage is network based (iSCSI, RBD, etc.).

There are some issues with libvirt's locking, though, where we haven't always released/re-acquired locks at the correct time when dealing with block jobs. As long as you're not using the snapshot, block rebase, or block mirror APIs, it'll be OK though.

Regards,
Daniel
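A minimal sketch of the look-aside setup Daniel describes. The exact directory is an assumption (distributions commonly default to /var/lib/libvirt/lockd); check the comments in /etc/libvirt/qemu-lockd.conf on your system:

    # /etc/libvirt/qemu.conf -- enable the lockd plugin
    lock_manager = "lockd"

    # /etc/libvirt/qemu-lockd.conf -- take locks on hashed look-aside
    # files instead of the image itself; mount this directory on NFS to
    # make the locks effective across hosts sharing the same storage
    auto_disk_leases = 1
    file_lockspace_dir = "/var/lib/libvirt/lockd/files"

    # then restart the daemons, e.g.:
    # systemctl restart virtlockd libvirtd

Note that with file_lockspace_dir left unset, virtlockd locks the image file directly, which would explain the POSIX WRITE lock on the image inode observed earlier in this thread.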

On 03-01-2020 11:26, Daniel P. Berrangé wrote:
virtlockd also uses fcntl(); however, it doesn't have to acquire locks on the file/block device directly. It can use a look-aside file for locking, for example a path under /var/lib/libvirt/lock. This means that locks on the block device /dev/sda1 would be held as /var/lib/libvirt/lock/$HASH(/dev/sda1).
If you mount /var/lib/libvirt/lock on NFS, these locks now apply across all machines which use the same block devices. This is useful when your block device storage is network based (iSCSI, RBD, etc.).
Hi Daniel, if I understand the docs correctly, this locking scheme is really useful for raw block devices, right? Now that QEMU automatically locks file-based vdisks, what is the main advantage of virtlockd locking?
There are some issues with libvirt's locking, though, where we haven't always released/re-acquired locks at the correct time when dealing with block jobs. As long as you're not using the snapshot, block rebase, or block mirror APIs, it'll be OK though.
While I am not a heavy user of external snapshots and other block-related operations, I occasionally use them (and, in those cases, I found them very useful). Does that mean I should avoid relying on virtlockd for locking? Should I rely on QEMU locks only? Thanks.

On Fri, Jan 03, 2020 at 02:56:50PM +0100, Gionatan Danti wrote:
On 03-01-2020 11:26, Daniel P. Berrangé wrote:
[...]
Hi Daniel, if I understand the docs correctly, this locking scheme is really useful for raw block devices, right?
Now that QEMU automatically locks file-based vdisks, what is the main advantage of virtlockd locking?
QEMU's locking should be good enough for file-based images. There isn't a clear benefit to virtlockd in this case.
There are some issues with libvirt's locking, though, where we haven't always released/re-acquired locks at the correct time when dealing with block jobs. As long as you're not using the snapshot, block rebase, or block mirror APIs, it'll be OK though.
While I am not a heavy user of external snapshots and other block-related operations, I occasionally use them (and, in those cases, I found them very useful).
Does that mean I should avoid relying on virtlockd for locking? Should I rely on QEMU locks only?
As above, QEMU's locking is good enough to rely on for file-based images.

The flaws I mention with libvirt might actually finally be something we have fixed in 5.10.0 with QEMU 4.2.0, since we can finally use "blockdev" syntax for configuring disks. Copying Peter to confirm/deny this...

Regards,
Daniel

On Fri, Jan 03, 2020 at 14:08:03 +0000, Daniel Berrange wrote:
On Fri, Jan 03, 2020 at 02:56:50PM +0100, Gionatan Danti wrote:
On 03-01-2020 11:26, Daniel P. Berrangé wrote:
[...]
There are some issues with libvirt's locking, though, where we haven't always released/re-acquired locks at the correct time when dealing with block jobs. As long as you're not using the snapshot, block rebase, or block mirror APIs, it'll be OK though.
While I am not a heavy user of external snapshots and other block-related operations, I occasionally use them (and, in those cases, I found them very useful).
Does that mean I should avoid relying on virtlockd for locking? Should I rely on QEMU locks only?
As above, QEMU's locking is good enough to rely on for file-based images.
The flaws I mention with libvirt might actually finally be something we have fixed in 5.10.0 with QEMU 4.2.0, since we can finally use "blockdev" syntax for configuring disks. Copying Peter to confirm/deny this...
The main issue was that we were leaking locks on the backing chain. This should now be fixed with -blockdev, as we call the appropriate APIs to lock/unlock the images, but I didn't try it with virtlockd.

Certainly, if there's still a problem, we now have well-defined places where we know what's happening to the images, so it should be easy to fix.
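For readers unfamiliar with the backing chains mentioned here, a short sketch of how an external snapshot creates one (the domain and image names are hypothetical):

    # create an external, disk-only snapshot; the guest keeps running
    virsh snapshot-create-as guest snap1 --disk-only --atomic

    # the original image is now a read-only backing file of the new overlay
    qemu-img info --backing-chain -U /var/lib/libvirt/images/guest.snap1

Each image in that chain is something libvirt's lock manager has to lock and unlock as block jobs rewrite the chain, which is where the lock leaks could occur.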

On 06-01-2020 10:06, Peter Krempa wrote:
On Fri, Jan 03, 2020 at 14:08:03 +0000, Daniel Berrange wrote:
As above, QEMU's locking is good enough to rely on for file-based images.
Hi Daniel, thank you for the direct confirmation.
The flaws I mention with libvirt might actually finally be something we have fixed in 5.10.0 with QEMU 4.2.0, since we can finally use "blockdev" syntax for configuring disks. Copying Peter to confirm/deny this...
The main issue was that we were leaking locks on the backing chain. This should now be fixed with -blockdev, as we call the appropriate APIs to lock/unlock the images, but I didn't try it with virtlockd.
Certainly, if there's still a problem, we now have well-defined places where we know what's happening to the images, so it should be easy to fix.
Hi Peter, may I ask what you mean by "fixed with -blockdev"? Thanks.

On Mon, Jan 06, 2020 at 18:44:31 +0100, Gionatan Danti wrote:
On 06-01-2020 10:06, Peter Krempa wrote:
[...]
Hi Peter, may I ask what you mean by "fixed with -blockdev"?
blockdev is the new way to specify disks on the QEMU command line. It required quite a lot of internal changes, some of which probably fixed the block-job cooperation with virtlockd (the leaking of image locks). Blockdev is used starting from libvirt-5.10 and qemu-4.2.
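A quick way to check whether a given host is new enough for the -blockdev code path; binary names and paths vary by distribution, so treat these as examples:

    # need libvirt >= 5.10.0 and QEMU >= 4.2.0
    libvirtd --version
    /usr/libexec/qemu-kvm --version    # or: qemu-system-x86_64 --version

    # a guest started via -blockdev shows the option on its command line
    ps -o args= -C qemu-kvm | tr ' ' '\n' | grep -c -- -blockdev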

On 07-01-2020 08:31, Peter Krempa wrote:
blockdev is the new way to specify disks on the QEMU command line. It required quite a lot of internal changes, some of which probably fixed the block-job cooperation with virtlockd (the leaking of image locks).
Blockdev is used starting from libvirt-5.10 and qemu-4.2.
Understood. Thank you so much.
participants (3):
- Daniel P. Berrangé
- Gionatan Danti
- Peter Krempa