On 08/09/2014 12:28 AM, John Ferlan wrote:
> Testing seems to indicate that posix_fallocate() either doesn't work as
> expected on the target or using the target.path is incorrect...
>
> Before posix_fallocate
> stat st_blocks=0 st_blksize=1048576 st_size=10485760
> lseek end=10485760
>
> posix_fallocate of 10485760 bytes on /home/nfs_pool/target/test-vol1
>
> After posix_fallocate
> stat st_blocks=88 st_blksize=1048576 st_size=10485760
> lseek end=10485760
>
>
> Hmm... would going at the target be correct in this instance? Same test
> but use the source path:
Using the source path would only work if the NFS export is on the localhost :)
>
> Before posix_fallocate
> stat st_blocks=0 st_blksize=4096 st_size=10485760
> lseek end=10485760
>
> posix_fallocate of 10485760 bytes on /home/nfs_pool/nfs-export/test-vol1
>
> After posix_fallocate
> stat st_blocks=20480 st_blksize=4096 st_size=10485760
> lseek end=10485760
>
> ...
>
> hmm.... 20480 * 512 = 10485760
>
> # df
> ...
> localhost:/home/nfs_pool/nfs-export 140979200 35521536 98273280 27%
> /home/nfs_pool/target
> #
>
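For anyone who wants to repeat the before/after measurement quoted above, here is a rough Python sketch. It is not the test program used in this thread, and the helper names report() and measure() are made up for illustration. It creates a sparse file, calls posix_fallocate() through os.posix_fallocate(), and prints the same stat and lseek fields:

```python
import os
import tempfile


def report(fd, label):
    """Print the stat/lseek fields shown in the thread and return the stat result."""
    st = os.fstat(fd)
    end = os.lseek(fd, 0, os.SEEK_END)
    print(f"{label} posix_fallocate")
    print(f"stat st_blocks={st.st_blocks} st_blksize={st.st_blksize} "
          f"st_size={st.st_size}")
    print(f"lseek end={end}")
    return st


def measure(path, length):
    """Create a sparse file of `length` bytes, preallocate it, report before/after."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, length)           # sparse: size set, no blocks yet
        before = report(fd, "Before")
        os.posix_fallocate(fd, 0, length)  # libc posix_fallocate()
        after = report(fd, "After")
    finally:
        os.close(fd)
    return before, after


if __name__ == "__main__":
    # A temporary directory is used here only so the sketch runs anywhere;
    # point the path at a file under the NFS mount to compare results.
    with tempfile.TemporaryDirectory() as tmpdir:
        measure(os.path.join(tmpdir, "test-vol"), 10485760)
```

If the allocation worked, st_blocks * 512 after the call should be at least the requested length.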
Well it's a tangled web that's being woven... The blksize of the
target volume comes from the 'wsize' value in the mount:
localhost:/home/nfs_pool/nfs-export on /home/nfs_pool/target type nfs4
(rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,...)
Further testing shows if I change the wsize to 4096, then I get what I
expect; however, starting at 8192 I'll get decreasingly smaller
allocations. So this is a "math problem".
So going back to writing via posix_fallocate() to the
"/home/nfs_pool/target/test-vol" the "issue" is the blksize of the
"source" (nfs-export) is 4096 while the blksize of the "target" (target)
is 1048576 (as a result of the nfs mount settings). What "seems" to
happen is the posix_fallocate() makes 11 "writes" - I assume because
blksize*10 < desired_size (10485760) (instead of <=...).
Thus 11 * 4096 = 45056 (bytes) / 1024 = 44 KiB which was displayed; that
also matches st_blocks=88 above, since 88 * 512 = 45056.
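To make the arithmetic above easy to check, here is a small Python model of the hypothesis as stated. It is only a model and not glibc's code: it assumes one byte is touched at every multiple of the client-side blksize up to and including the requested length, and that each touch allocates one 4096-byte block on the backing filesystem. The function name modeled_allocation() is made up.

```python
SECTOR = 512  # st_blocks is reported in 512-byte units


def modeled_allocation(length, stride, backing_blksize=4096):
    """Model of the hypothesis in this thread (not the real implementation).

    Assumes one touch at offsets 0, stride, 2*stride, ... <= length, with
    each touch allocating one backing block; the result is capped at the
    file size.
    """
    touches = len(range(0, length + 1, stride))
    allocated = min(touches * backing_blksize, length)
    return touches, allocated, allocated // SECTOR


if __name__ == "__main__":
    for wsize in (4096, 8192, 1048576):
        touches, allocated, st_blocks = modeled_allocation(10485760, wsize)
        print(f"wsize={wsize}: touches={touches} "
              f"allocated={allocated} st_blocks={st_blocks}")
```

Under those assumptions the model gives 11 touches and st_blocks=88 for wsize=1048576, the full 20480 blocks for wsize=4096, and roughly half the requested size at 8192, which is in line with the shrinking allocations described above.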
Why all this happens I'm not sure. Bug in posix_fallocate()? Bug in
configuration? I have to assume that when this code was first added NFS
probably was still using smaller block sizes.
The code was introduced in 2013. Maybe it wasn't tested on NFS at all?
It looks like a posix_fallocate bug to me.
The other method we have (syscall(SYS_fallocate)) gives me EOPNOTSUPP -
Operation not supported on transport endpoint.
Whether anyone beyond the virt-test that discovered the issue has
noticed, I'm not sure. In any case, does anyone have feedback/thoughts
for next steps?
I can put together something that avoids posix_fallocate() for the
create-as and resize paths.
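One possible shape for such a fallback, sketched in Python rather than libvirt's C and with a hypothetical helper name zero_fill(): write real zeros across the range so every block gets allocated regardless of the blksize the NFS client reports. This is an illustration of the idea only, not a proposed patch. For a resize, only the newly added range would be passed in, since the writes overwrite whatever is there.

```python
import os


def zero_fill(fd, offset, length, chunk_size=1024 * 1024):
    """Write `length` zero bytes starting at `offset`, then flush them.

    Unlike a one-byte-per-block probe, this writes every byte of the range,
    so the result does not depend on st_blksize. It destroys existing data
    in the range, so it is only suitable for new or newly extended space.
    """
    zeros = memoryview(bytes(chunk_size))
    pos = offset
    remaining = length
    while remaining > 0:
        # pwrite may write fewer bytes than requested; advance by what it wrote.
        written = os.pwrite(fd, zeros[:min(chunk_size, remaining)], pos)
        pos += written
        remaining -= written
    os.fsync(fd)
```

The obvious cost is that the full size goes over the wire, which is slow for large volumes.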
I think reporting an error when preallocate is requested for an NFS pool makes
sense, but that might be pretty annoying if something supplies the preallocate
flag by default.
Jan