On Fri, Aug 22, 2014 at 10:56:47AM -0400, John Ferlan wrote:
On 08/22/2014 10:46 AM, Daniel P. Berrange wrote:
> On Mon, Aug 11, 2014 at 04:30:19PM -0400, John Ferlan wrote:
>> Currently the safezero() function uses build conditionals to choose either
>> the posix_fallocate() or mmap() with a fallback to safewrite() in order to
>> preallocate a file.
>>
>> This patch will modify the logic in order to allow fallbacks in the
>> event that posix_fallocate() or the ftruncate() and mmap() doesn't work
>> properly. The fallback will be to use the slow safewrite of zero filled
>> buffers to the file.
>
> Have you actually encountered posix_fallocate() failing in the
> real world? It is supposed to automatically fall back to the
> equivalent of writing zeros if the filesystem / kernel does not
> support it, so we should not have to do runtime fallback ourselves.
> The existence of fallback is the main distinction between the
> posix_fallocate() and fallocate() system calls.
>
It wasn't so much a "failure" as "unexpected results" - the key being
that the resulting created (or resized) file was not sized as expected.
For an NFS target the results are not what was expected. I've left some
history in the prior set of patches with the following probably having
the most details:
http://www.redhat.com/archives/libvir-list/2014-August/msg00367.html
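
For illustration, the "not sized as expected" symptom can be checked for
directly after the allocation call. This is only a minimal sketch: the
helper name prealloc_and_verify() is hypothetical and is not libvirt
code, and it assumes the caller wants the range [0, len) allocated.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not libvirt code): preallocate [0, len) and then
 * confirm via fstat() that the file is at least len bytes long.
 * Returns 0 on success, a positive errno value if a call fails, and -1
 * if posix_fallocate() reported success but the size is too small. */
static int prealloc_and_verify(int fd, off_t len)
{
    struct stat sb;
    int rc;

    assert(len > 0);

    /* posix_fallocate() returns the error number directly; it does
     * not set errno. */
    rc = posix_fallocate(fd, 0, len);
    if (rc != 0)
        return rc;

    if (fstat(fd, &sb) < 0)
        return errno;

    return sb.st_size >= len ? 0 : -1;
}
```

A caller could treat the -1 case as "allocation claimed success but the
file is short" and decide separately whether to fall back or report it.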
So, IIUC, the bug happens when the rsize mount option to NFS is not 4k.
strace'ing libvirtd on an NFS volume in this case shows:
open("/var/lib/libvirt/images/lettuce/foo", O_RDWR|O_CREAT|O_EXCL, 0600) = 24
fstat(24, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
ftruncate(24, 1073741824) = 0
fallocate(24, 0, 0, 1073741824) = -1 EOPNOTSUPP (Operation not supported)
fallocate(24, 0, 0, 1073741824) = -1 EOPNOTSUPP (Operation not supported)
fstat(24, {st_mode=S_IFREG|0600, st_size=1073741824, ...}) = 0
fstatfs(24, {f_type="NFS_SUPER_MAGIC", f_bsize=1048576, f_blocks=118342,
f_bfree=71002, f_bavail=65632, f_files=7678560, f_ffree=5495931, f_fsid={0, 0},
f_namelen=255, f_frsize=1048576}) = 0
pread(24, "\0", 1, 1048575) = 1
pwrite(24, "\0", 1, 1048575) = 1
pread(24, "\0", 1, 2097151) = 1
pwrite(24, "\0", 1, 2097151) = 1
pread(24, "\0", 1, 3145727) = 1
So we can see glibc here trying fallocate() and then falling back to
writing zeros. Since the volume does not come out at the right size
this seems to show a bug in glibc.
So I think we really ought to report that bug to glibc to be fixed
there rather than working around it in libvirt, as there are many
more applications besides libvirt that will be impacted by this
bug.
Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|