On Thu, Sep 3, 2020 at 12:49 PM Richard Laager <rlaager(a)wiktel.com> wrote:
On 9/3/20 5:18 AM, Christian Ehrhardt wrote:
> Even if my fix lands, we are back to square one and would need
> virt-manager to submit a different XML.
> Remember: my target here would be to come back to prealloc=metadata as
> it was before for image creations from virt-manager.
Why is that your goal?
If this is simply because OpenZFS doesn't support fallocate(mode=0),
Yeah it was because behavior changed for users on upgrades and became
slow (if on ZFS).
And since the symptom was restricted to just that and also just a
slowdown the prio was always low anyway.
Glad to hear that - and I agree that fallocate on a COW/compressing FS is
not really applicable.
So it seems "my need" for this is completely gone once we get the change above.
Thanks Richard for the FYI on it!
ZFS will "fake" the fallocate() request. It'll check to make sure
there's enough free space at the moment, which is about all it can do.
It can't actually reserve the space, mostly because it is a
copy-on-write filesystem. Even if the application writes zeros, ZFS will
just throw them away (assuming you are using compression, which
everyone should be).
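
As a userspace analogy only (this is not how ZFS implements it internally), a "check free space now, reserve nothing" test could look roughly like the sketch below. The function name enough_free_space is made up for illustration.

```python
import os

def enough_free_space(path: str, size: int) -> bool:
    """Return True if the filesystem holding `path` currently reports at
    least `size` free bytes. Nothing is reserved, so the answer can be
    stale as soon as it is returned."""
    st = os.statvfs(path)
    # f_bavail counts blocks available to unprivileged users,
    # f_frsize is the size of one such block in bytes.
    return st.f_bavail * st.f_frsize >= size
```

A later write can still hit ENOSPC, which is exactly the limitation described above.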
> On the libvirt side allocation>capacity sounds like being wrong anyway.
> And if that is so we have these possible conditions:
> - capacity==allocation now and before my change falloc
> - capacity>allocation now and before my change metadata
> - capacity<allocation before my change falloc, afterwards metadata
> (but this one seems invalid anyway)
>
> So I wonder are we really back at me asking Cole to let virt-manager
> request things differently which is how this started about a year ago?
Setting aside cases of semi-allocation (capacity > allocation != 0) and
overprovisioning (allocation > capacity), I assume the common cases are
thin provisioning (allocation == 0) and thick provisioning (capacity ==
allocation).
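
To make the four cases concrete, here is a minimal sketch that classifies a request from its capacity and allocation values. The function name and the returned labels are illustrative only and are not anything libvirt or virt-manager defines.

```python
def provisioning_mode(capacity: int, allocation: int) -> str:
    """Classify a volume request by its capacity/allocation pair (bytes)."""
    if capacity < 0 or allocation < 0:
        raise ValueError("sizes must be non-negative")
    if allocation == 0:
        return "thin"             # nothing allocated up front
    if allocation == capacity:
        return "thick"            # everything allocated up front
    if allocation < capacity:
        return "semi-allocated"   # capacity > allocation != 0
    return "overprovisioned"      # allocation > capacity
```

For example, provisioning_mode(20 * 2**30, 0) returns "thin" and provisioning_mode(20 * 2**30, 20 * 2**30) returns "thick".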
virt-manager (at least in the way I use it) asks explicitly for the
allocation and capacity. If virt-manager is properly conveying (and I'd
assume it is) the user's capacity and allocation choices from the GUI to
libvirt, then virt-manager is working correctly in my view and should be
left alone.
I believe the main goal for thick provisioning is to reserve the space
as best as possible, because ENOSPC underneath a virtual machine is bad.
Secondary goals would be allocating the space relatively contiguously
for performance and accounting for the space immediately to help the
administrator keep track of usage.
If the filesystem supports fallocate(), using it accomplishes all of
these goals in a very performant way. If the filesystem does not support
fallocate(), then the application can either write zeros or do nothing.
Writing zeros is slow, but achieves the goals to the extent possible.
Not writing zeros is fast, but does not reserve/account for the space;
though, depending on the filesystem, that might not be possible anyway.
I think the question fundamentally comes down to: how strongly do you take
a "thick provisioning" request? Do you do everything in your power to
achieve it (which would mean writing zeros*) or do you treat it as a
hint that you'll only follow if it is fast to do so?
If it's a demand, then try fallocate() but fall back to writing zeros.
(glibc's posix_fallocate() does exactly this.) If it's a hint, then
only ever call fallocate().
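
The demand-versus-hint distinction can be sketched as follows. This is an illustration under assumptions, not libvirt's code: preallocate, write_zeros, CHUNK and the allocate parameter are hypothetical names. Python has no direct wrapper for Linux fallocate(2), so os.posix_fallocate stands in as the default; on glibc that call may already emulate the allocation itself, which is why the allocator is passed in as a hook.

```python
import errno
import os

CHUNK = 1024 * 1024  # write zeros in 1 MiB pieces

def write_zeros(fd: int, size: int) -> None:
    """Fill the first `size` bytes of `fd` with zeros."""
    zeros = bytes(CHUNK)
    offset = 0
    while offset < size:
        n = min(CHUNK, size - offset)
        # pwrite may write fewer bytes than asked; advance by what it did
        offset += os.pwrite(fd, zeros[:n], offset)

def preallocate(fd: int, size: int, demand: bool,
                allocate=os.posix_fallocate) -> str:
    """Try the fast allocator; if it is unsupported, either write zeros
    (demand) or do nothing (hint). Returns which path was taken."""
    try:
        allocate(fd, 0, size)
        return "fallocate"
    except OSError as exc:
        # Only "not supported" style errors trigger the fallback;
        # real failures such as ENOSPC are passed to the caller.
        if exc.errno not in (errno.EOPNOTSUPP, errno.ENOSYS, errno.EINVAL):
            raise
        if not demand:
            return "skipped"
        write_zeros(fd, size)
        os.fsync(fd)
        return "zeros"
```

With demand=True the caller pays for the slow path when the fast one is unavailable; with demand=False the request is silently downgraded to thin provisioning.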
I think it is reasonable to treat it as a demand and write zeros if
fallocate() fails. If it is too slow, the admin will notice and can make
the decision to (in the future) stop requesting thick provisioning and
just request thin provisioning.
In the ZFS case, why is the admin requesting thick provisioning anyway?
* One could go further and defeat compression by writing random data.
But that seems extreme, so I'm going to ignore that.
--
Richard
--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd