<...snip...>
> Why all this happens I'm not sure. Bug in posix_fallocate()?
Bug in
> configuration? I have to assume that when this code was first added NFS
> probably was still using smaller block sizes.
The code was introduced in 2013. Maybe it wasn't tested on NFS at all?
It looks like a posix_fallocate bug to me.
Technically it was before that - the safezero function and it's
supporting cast were moved into virfile.c in 2013; however, they were in
[vir]util.c before that. It was first introduced in 2009 in commit id
'c29d0929' and beyond some minor changes/reword has stayed relatively
unchanged.
The one downside I see with it is that it's a build related either/or
setup. That is the "HAVE_POSIX_FALLOCATE" dictates the usage and
provides no fallback in the event that perhaps the result isn't expected
or if for some reason posix_fallocate fails.
"Theoretically" the safewrite option should/could be the fallback in the
event it's determined that the posix_fallocate() didn't generate the
right allocation size.
The other method we have (syscall(SYS_fallocate)) gives me EOPNOTSUPP
-
Operation not supported on transport endpoint.
Right - I saw that too.... The resize code followed the safezero() code
at least with respect to the either/or with the HAVE_POSIX_FALLOCATE
build conditional. It was introduced in commit id 'aa2a4cff'. It
assumes that the "SYS_fallocate" syscall is present "and" will work.
It
is missing an option to safewrite() from current allocation to the new
end of the file (eg, a call to safezero passing the the current offset).
> Whether anyone has
> noticed or not beyond the virt-test which discovered the issue - I'm not
> sure. In any case, does anyone have feedback/thoughts for next steps?
> I can put together something that avoids posix_fallocate() for the
> create-as and resize paths.
>
I think reporting an error when preallocate is requested for a NFS pool makes
sense, but that might be pretty annoying if something supplies the preallocate
flag by default.
My comment was more geared towards my findings of the posix_fallocate()
introduction in 2009.
While I agree there's something flaky with posix_fallocate, I don't
think an error is necessarily the best path. Maybe better "fall back
code" - we could check the posix_fallocate() results (using stat)
perhaps VIR_WARN that something is wrong before falling back to the
mmap/safewrite options.
John