
Daniel P. Berrange wrote:
On Wed, Jul 15, 2009 at 11:51:41PM +0200, Jim Meyering wrote:
If it matters, we can come up with a more efficient (yet still portable) way to compare the last two bytes of each file to "\n\n". I went ahead and wrote a nearly-minimal script to do that. Rather than reading/processing all 27MB of sources, this reads just the last 2 bytes of each of the 1048 files, comparing those bytes to "\n\n" and printing the name when there's a match:
I just strace'd the 'tail' program, and it does the right thing too.
It had better ;-)
  # strace tail -c 2 somefile
  ....
  open("somefile", O_RDONLY|O_LARGEFILE) = 3
  fstat64(3, {st_mode=S_IFREG|0664, st_size=10600, ...}) = 0
  _llseek(3, 0, [0], SEEK_CUR) = 0
  _llseek(3, 0, [10600], SEEK_END) = 0
  _llseek(3, -2, [10598], SEEK_END) = 0
  read(3, "l\n"..., 2) = 2
git ls-files -z \
  | xargs -0 perl -le '
    foreach my $f (@ARGV) {
      open F,"<",$f or (warn "failed to open $f: $!\n"), next;
      my $p = sysseek(F, -2, 2);
      # seek failure probably means file has < 2 bytes; ignore
      my $two;
      defined $p and $p = sysread F,$two,2;
      close F;
      # ignore read failure
      $p && $two eq "\n\n" and (print $f),$fail=1;
    }
    END {exit defined $fail ? 1 : 0}'
So using 'tail' instead of this perl script would be more readable and efficient.
tail -c2 might be my choice, too, for a small number of files. But in our case, using it would incur the cost of 1000+ fork+execs.
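
For comparison, here is a rough sketch of what a tail-based check might look like. It is only an illustration, not something proposed in this thread: the helper name ends_with_blank_line is hypothetical, and it assumes a POSIX shell with tail, od and tr available. Because command substitution strips trailing newlines, the two bytes are converted to hex before being compared.

```shell
# Hypothetical helper (not from the thread): succeed when the last two
# bytes of the file named by $1 are both newlines.
ends_with_blank_line() {
  # tail -c 2 emits at most the last two bytes; od -An -tx1 prints them
  # as hex without an offset column; tr removes spaces and the newline
  # that od appends, leaving "0a0a" for a file ending in "\n\n".
  [ "$(tail -c 2 -- "$1" | od -An -tx1 | tr -d ' \n')" = "0a0a" ]
}

# Possible usage, one tail+od+tr pipeline per file:
#   for f in *.c; do ends_with_blank_line "$f" && printf '%s\n' "$f"; done
```

Each call spawns several processes, which is exactly the per-file cost that matters once the file count reaches the thousands; the perl version handles every file name passed to it within a single process.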