cp: work around linux kernel bug: short-read != EOF on /proc

Remove the optimization that avoided up to 50% of cp's read syscalls.
Do not assume that a short read on a regular file indicates EOF.
When reading from a file in /proc on Linux (at least 2.6.9 through
2.6.29) into a buffer of 4096 bytes or more, a short read does not
always indicate EOF.  For example, "cp /proc/slabinfo /tmp" copies
only 4068 of the total 7493 bytes.  This optimization (25719a3315,
Improve performance a bit by optimizing away; 2005-11-24) appears to
have been worth less than a 2% speed-up (and usually much less), so
the impact of removing it is negligible.

* src/copy.c (copy_reg): Don't exit the loop early.
* tests/cp/proc-short-read: New test, lightly based on a suggestion
from Mike Frysinger, to exercise this fix.
* tests/Makefile.am (TESTS): Add cp/proc-short-read.
* NEWS (Improve robustness): Mention this change.
Jim Meyering 2009-04-17 18:44:18 +02:00
parent 2ad7da7594
commit c74fbaefeb
4 changed files with 62 additions and 3 deletions

NEWS

@@ -16,6 +16,18 @@ GNU coreutils NEWS -*- outline -*-
default should proceed at the speed of the disk. Previously /dev/urandom
was used if available, which is relatively slow on GNU/Linux systems.
** Improved robustness
cp would exit successfully after copying less than the full contents
of a file larger than ~4000 bytes from a linux-/proc file system to a
destination file system with a fundamental block size of 4KiB or greater.
Reading into a 4KiB-or-larger buffer, cp's "read" syscall would return
a value smaller than 4096, and cp would interpret that as EOF (POSIX
allows this). This optimization, now removed, saved 50% of cp's read
syscalls when copying small files. Affected linux kernels: at least
2.6.9 through 2.6.29.
[the optimization was introduced in coreutils-6.0]
** Portability
`id -G $USER` now works correctly even on Darwin and NetBSD. Previously it
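
The "50% of cp's read syscalls" figure in the NEWS entry above follows from simple counting: for a file smaller than the buffer, reading until EOF takes two read calls (one returning the data, one returning 0), while stopping at the first short read takes one. A minimal sketch of that count, assuming a hypothetical helper named count_reads that is not part of coreutils:

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper for illustration only (not coreutils code):
   drain FD with a BUF_SIZE-byte buffer and return how many read calls
   were issued, or -1 on error.  With STOP_ON_SHORT_READ set, mimic the
   removed optimization by treating a short read as EOF.  */
static int
count_reads (int fd, size_t buf_size, bool stop_on_short_read)
{
  char buf[8192];
  int n_calls = 0;

  if (buf_size == 0 || buf_size > sizeof buf)
    return -1;

  for (;;)
    {
      ssize_t n_read = read (fd, buf, buf_size);
      n_calls++;
      if (n_read < 0)
        return -1;
      if (n_read == 0)
        break;                  /* genuine EOF */
      if (stop_on_short_read && (size_t) n_read != buf_size)
        break;                  /* the shortcut: skips the final read */
    }
  return n_calls;
}
```

On a 100-byte regular file with a 4096-byte buffer, the shortcut variant issues one read and the read-until-EOF variant issues two. On the affected /proc files the shortcut variant also stops after one read, but with most of the file still unread.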

src/copy.c

@@ -700,9 +700,10 @@ copy_reg (char const *src_name, char const *dst_name,
 }
 last_write_made_hole = false;
-/* A short read on a regular file means EOF. */
-if (n_read != buf_size && S_ISREG (src_open_sb.st_mode))
-break;
+/* It is tempting to return early here upon a short read from a
+regular file. That would save the final read syscall for each
+file. Unfortunately that doesn't work for certain files in
+/proc with linux kernels from at least 2.6.9 .. 2.6.29. */
 }
 }
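
For readers without the full copy_reg source at hand, the behavior after this change can be sketched as a loop that stops only when read returns 0. This is a simplified illustration, not the coreutils implementation: the names write_all and copy_until_eof are hypothetical, the buffer size is arbitrary, and hole detection and the other copy_reg details are omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not coreutils code): write all LEN bytes of BUF
   to FD, retrying on partial writes and EINTR.  Return 0 on success,
   -1 on error.  */
static int
write_all (int fd, char const *buf, size_t len)
{
  while (len > 0)
    {
      ssize_t n = write (fd, buf, len);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      buf += n;
      len -= (size_t) n;
    }
  return 0;
}

/* Hypothetical helper (not coreutils code): copy SRC_FD to DST_FD.
   Return the number of bytes copied, or -1 on error.  */
static long long
copy_until_eof (int src_fd, int dst_fd)
{
  char buf[8192];
  long long total = 0;

  for (;;)
    {
      ssize_t n_read = read (src_fd, buf, sizeof buf);
      if (n_read < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      if (n_read == 0)
        break;                  /* only a zero-length read means EOF */
      if (write_all (dst_fd, buf, (size_t) n_read) != 0)
        return -1;
      total += n_read;
      /* Deliberately no "n_read != sizeof buf" shortcut here: a short
         read is followed by another read rather than treated as EOF.  */
    }
  return total;
}
```

The cost of this form is one extra read call per file, which is the trade-off the commit message describes as negligible.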

tests/Makefile.am

@@ -281,6 +281,7 @@ TESTS = \
cp/parent-perm-race \
cp/perm \
cp/preserve-2 \
cp/proc-short-read \
cp/proc-zero-len \
cp/r-vs-symlink \
cp/same-file \

tests/cp/proc-short-read Executable file

@@ -0,0 +1,45 @@
#!/bin/sh
# exercise cp's short-read failure when operating on >4KB files in /proc
# Copyright (C) 2009 Free Software Foundation, Inc.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
if test "$VERBOSE" = yes; then
set -x
cp --version
fi
. $srcdir/test-lib.sh
fail=0
kall=/proc/kallsyms
test -r $kall || skip_test_ "your system lacks $kall"
# Before coreutils-7.3, cp would copy less than 4KiB of this 1MB+ file.
cp $kall 1 || fail=1
cat $kall > 2 || fail=1
compare 1 2 || fail=1
# Also check md5sum, just for good measure.
md5sum $kall > 3 || fail=1
md5sum 2 > 4 || fail=1
# Remove each file name before comparing checksums.
sed 's/ .*//' 3 > sum.proc || fail=1
sed 's/ .*//' 4 > sum.2 || fail=1
compare sum.proc sum.2 || fail=1
Exit $fail