On Thu, Jul 09, 2020 at 08:30:14AM -0600, Chris Murphy wrote:
It's generally recommended by upstream Btrfs development to set
'nodatacow' using 'chattr +C' on the containing directory for VM
images. By setting it on the containing directory it means new VM
images inherit the attribute, including images copied to this
location.
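
A minimal sketch of that directory-level setup in Python, assuming the chattr and lsattr tools are installed and the directory is on Btrfs. The helper names and the path in the usage note are illustrative only; nothing here is part of libvirt.

```python
import subprocess


def nocow_command(directory):
    """Build the argv that sets the No_COW attribute on a directory."""
    return ["chattr", "+C", directory]


def has_nocow(lsattr_line):
    """Return True if an `lsattr -d` output line shows the 'C' attribute.

    The line has the form '<flags> <path>'; only the flags field is checked,
    so a 'C' in the path itself is ignored.
    """
    flags = lsattr_line.split(None, 1)[0] if lsattr_line.strip() else ""
    return "C" in flags


def set_nocow(directory):
    """Set No_COW on the directory, then confirm it with lsattr.

    Assumption: the directory lives on Btrfs and the caller may change
    attributes; on other filesystems chattr is expected to fail.
    """
    subprocess.run(nocow_command(directory), check=True)
    out = subprocess.run(["lsattr", "-d", directory], check=True,
                         capture_output=True, text=True).stdout
    return has_nocow(out)
```

Calling set_nocow("/var/lib/libvirt/images") would then cover images created in or copied into that directory afterwards. Files that already contain data are not converted by setting the attribute.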
But 'nodatacow' also implies no compression and no data checksums.
Could there be use cases in which it's preferred to do compression and
checksumming on VM images? In that case the fragmentation problem can
be mitigated by periodic defragmentation.
Setting nodatacow makes sense particularly when using qcow2, since
qcow2 is already potentially performing copy-on-write.
Skipping compression and data checksums is reasonable in many cases,
as the OS inside the VM is often going to be able to do this itself,
so we don't want double checksums or double compression as it just
wastes performance.
Is this something libvirt can and should do? Possibly by default
with a way for users to opt out?
The default /var/lib/libvirt/images directory is created by RPM
at install time. Not sure if there's a way to get RPM to set
attributes on the dir at this time?
Libvirt's storage pool APIs have support for setting "nodatacow"
on a per-file basis when creating the images. This was added for
btrfs's benefit, but it is opt-in and I'm not sure any mgmt tools
actually use it right now. So in practice it probably doesn't
have any out-of-the-box benefit.
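
For illustration, a sketch of what opting in might look like from a management tool: storage volume XML with a <nocow/> element inside <target>, built here in Python. The function name, volume name and capacity are made up for the example.

```python
import xml.etree.ElementTree as ET


def volume_xml(name, capacity_gib, nocow=True):
    """Return storage volume XML, optionally with <nocow/> in <target>."""
    vol = ET.Element("volume", {"type": "file"})
    ET.SubElement(vol, "name").text = name
    cap = ET.SubElement(vol, "capacity", {"unit": "GiB"})
    cap.text = str(capacity_gib)
    target = ET.SubElement(vol, "target")
    ET.SubElement(target, "format", {"type": "qcow2"})
    if nocow:
        # Per-file opt-in: ask for the new image to be created without COW.
        ET.SubElement(target, "nocow")
    return ET.tostring(vol, encoding="unicode")
```

A tool using the libvirt Python bindings would presumably hand the result of volume_xml("guest.qcow2", 20) to the pool's createXML() call; that step is left out so the sketch runs without a libvirt connection.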
The storage pool APIs don't have any feature to set nodatacow
for the directory as a whole, but probably we should add this.
Another option is to have the installer set 'chattr +C' in the
short term. But this doesn't help GNOME Boxes, since the user home
isn't created at installation time.
Three advantages of libvirt awareness of Btrfs:
(a) GNOME Boxes, Cockpit, and other users of libvirt can then use this
same mechanism, and apply to their VM image locations.
(b) Create the containing directory as a subvolume instead of a directory
(1) btrfs snapshots are not recursive, therefore making this a
subvolume would prevent it from being snapshotted, and thus from
(temporarily) resuming datacow.
(2) in heavy workloads there can be lock contention on file
btrees; a subvolume is a dedicated file btree; so this would reduce
tendency for lock contention in heavy workloads (this is not likely a
desktop/laptop use case)
Being able to create subvolumes sounds like a reasonable idea. We already
have a ZFS specific storage driver that can do the ZFS equivalent.
Again though we'll also need mgmt tools modified to take advantage of
this. Not sure how we would make this all work out of the box, with
the way we let RPM pre-create /var/lib/libvirt/images, as we'd need
different behaviour depending on what filesystem you install the RPM
onto.
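
To make the "different behaviour depending on filesystem" point concrete, here is a rough Python sketch, not an existing libvirt or RPM mechanism: it looks up the filesystem type of the mount containing a path and picks either 'btrfs subvolume create' or a plain mkdir. Both function names are hypothetical, and mount points with escaped characters (such as \040 for spaces) are not handled.

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` uses the /proc/self/mounts layout, one mount per line:
    'device mountpoint fstype options dump pass'.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount, fstype = fields[1], fields[2]
        prefix = mount.rstrip("/") + "/"
        if path == mount or path.startswith(prefix):
            # The longest matching mount point wins; on a tie the later
            # entry wins, since it was mounted on top of the earlier one.
            if len(mount) >= len(best_mount):
                best_mount, best_type = mount, fstype
    return best_type


def create_command(path, fstype):
    """Pick the command that would create the image directory."""
    if fstype == "btrfs":
        return ["btrfs", "subvolume", "create", path]
    return ["mkdir", "-p", path]
```

In real use mounts_text would be read from /proc/self/mounts when the directory is first created, which is exactly the step that pre-creating the directory in the RPM skips.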
(c) virtiofs might be able to take advantage of btrfs subvolumes.
Libvirt doesn't currently do anything much wrt virtiofs except
configure QEMU. The creation of the directory containing the
share and populating its contents is left as an exercise for the
user/admin/mgmt tool.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|