This fixes Bug#14371, reported by Killer Bassist.
* NEWS: Document this.
* src/mkdir.c (struct mkdir_options): Remove member ancestor_mode.
New member umask_value. All uses changed.
* src/mkdir.c (make_ancestor): Fix umask assumption.
* src/mkdir.c, src/mkfifo.c, src/mknod.c (main):
Leave umask alone. This requires invoking lchmod after creating
the file, which introduces a race condition, but this can't be
avoided on hosts with "POSIX" default ACLs, and there's no easy
way with network file systems to tell what kind of host the
directory is on.
* tests/local.mk (all_tests): Add tests/mkdir/p-acl.sh.
* tests/mkdir/p-acl.sh: New file.
Use a sentinel value that's checked implicitly, rather than
a bit array, to determine if an item should be output.
Benchmark results for this change are:
$ yes abcdfeg | head -n1MB > big-file
$ for c in orig sentinel; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 big-file > /dev/null
done
== orig ==
real 0m0.049s
user 0m0.044s
sys 0m0.005s
== sentinel ==
real 0m0.035s
user 0m0.032s
sys 0m0.002s
## Again with --output-delimiter ##
$ for c in orig sentinel; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 --output-delimiter=: big-file > /dev/null
done
== orig ==
real 0m0.106s
user 0m0.103s
sys 0m0.002s
== sentinel ==
real 0m0.055s
user 0m0.052s
sys 0m0.003s
eol_range_start: Removed. 'n-' is no longer treated specially,
and instead SIZE_MAX is set for the 'hi' limit, and tested implicitly.
complement_rp: Used to complement 'rp' when '--complement' is specified.
ADD_RANGE_PAIR: Macro renamed to 'add_range_pair' function.
* tests/misc/cut-huge-range.sh: Adjust to the SENTINEL value.
Also remove the overlapping range test as this is no longer
dependent on large ranges and also is already handled with
the EOL-subsumed-3 test in cut.pl.
This issue was introduced in commit v8.21-43-g3e466ad
* src/cut.c (set_fields): Process all range pairs when merging.
* tests/misc/cut-huge-range.sh: Add a test for this edge case.
Also fix an issue where we could miss reported errors due
to truncation of the 'err' file.
print_kth() is the central function of cut used to
determine if an item is to be output or not,
so simplify it by moving some logic outside.
Benchmark results for this change are:
$ yes abcdfeg | head -n1MB > big-file
$ for c in orig split; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 big-file > /dev/null
done
== orig ==
real 0m0.111s
user 0m0.108s
sys 0m0.002s
== split ==
real 0m0.088s
user 0m0.081s
sys 0m0.007s
* src/cut.c (print_kth): Refactor a branch to outside the function.
Related to http://bugs.gnu.org/13127
The current implementation of cut, uses a bit array,
an array of `struct range_pair's, and (when --output-delimiter
is specified) a hash_table. The new implementation will use
only an array of `struct range_pair's.
The old implementation is memory inefficient because:
1. When -b with a big num is specified, it allocates a lot of
memory for `printable_field'.
2. When --output-delimiter is specified, it will allocate 31 buckets.
Even if only a few ranges are specified.
Note CPU overhead is increased to determine if an item is to be printed,
as shown by:
$ yes abcdfeg | head -n1MB > big-file
$ for c in with-bitarray without-bitarray; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 big-file > /dev/null
done
== with-bitarray ==
real 0m0.084s
user 0m0.078s
sys 0m0.006s
== without-bitarray ==
real 0m0.111s
user 0m0.108s
sys 0m0.002s
Subsequent patches will reduce this overhead.
* src/cut.c (set_fields): Set and initialize RP
instead of printable_field.
* src/cut.c (is_range_start_index): Use CURRENT_RP rather than a hash.
* tests/misc/cut.pl: Check if `eol_range_start' is set correctly.
* tests/misc/cut-huge-range.sh: Rename from cut-huge-to-eol-range.sh,
and add a test to verify large amounts of mem aren't allocated.
Fixes http://bugs.gnu.org/13127
* init.cfg (require_ulimit_v_): Renamed from require_ulimit_
as this only checks for ulimit -v support. Other uses of
ulimit -t and ulimit -n in tests shouldn't cause false failures
if not supported.
* cfg.mk (sc_prohibit_test_ulimit_without_require_): A new syntax check
to ensure that require_ulimit_v_() is used iff required.
* tests/misc/head-c.sh: Add missing call to require_ulimit_v_.
* tests/rm/many-dir-entries-vs-OOM.sh: Likewise.
* tests/split/r-chunk.sh: Remove non mandatory require_ulimit_ call.
* tests/misc/sort-merge-fdlimit.sh: Likewise.
* tests/cp/link-heap.sh: Adjust to renamed require_ulimit_v_.
* tests/dd/no-allocate.sh: Likewise.
* tests/misc/csplit-heap.sh: Likewise.
* tests/misc/cut-huge-to-eol-range.sh: Likewise.
* tests/misc/printf-surprise.sh: Likewise.
As a side effect of the previous commit which fixes 'tail -f --retry'
to wait for a file to appear, tail would not exit when the last file
appears untailable and gives up on this file.
This can happen, for example, when the argument file name appears
as directory. Tail sets the 'ignore' flag of this file to true,
but instead of exiting the program, tail would continue the loop.
* src/tail.c (any_live_files): Change the function to return true
if any of the files is still tailable or if tail should continue to
try to check again.
(tail_forever): Change the condition to break the loop in the
"no files remaining" case, because now any_live_files() will care
about it, as mentioned above.
(parse_options): When --retry is used without any follow mode,
then reset reopen_inaccessible_files to false.
* tests/tail-2/retry.sh: Add test case.
The --retry option is indeed useful for both following modes
by name and by file descriptor. The difference is that in the
latter case, it is effective only during the initial open.
As a regression of the implementation of the inotify support,
tail -f --retry would immediately exit if the given file is
inaccessible.
* src/tail.c (usage): Change the description of the --retry option:
remove the note that this option would mainly be useful when
following by name.
(main): Change diagnosing dubios uses of --retry option:
when the --retry option is used without following, then issue
a warning that this option is ignored; when it is used together
with --follow=descriptor, then issue a warning that it is only
effective for the initial open.
Disable inotify also in the case when the initial open in tail_file()
failed (which is the actual bug fix).
* init.cfg (retry_delay_): Pass excess arguments to the test function.
* tests/tail-2/retry.sh: Add new tests.
* tests/local.mk (all_tests): Mention it.
* doc/coreutils.texi (tail invocation): Enhance the documentation
of the --retry option. Clarify the difference in tail's behavior
regarding the --retry option when combined with the following modes
name versus descriptor.
* NEWS (Bug fixes): Mention the fix.
Reported by Noel Morrison in:
http://lists.gnu.org/archive/html/coreutils/2013-04/msg00003.html
On OS X it was seen that the group ID used for new files,
are set to a that of the directory rather than the current user.
It's not currently understood when this happens, but it was confirmed
that ACLs, extended attributes and setgid bits are _not_ involved.
* init.cfg (skip_if_nondefault_group_): A new function to detect
and avoid this situation. Document with links to the discussions
for hopefully future clarification.
* tests/install/install-C-root.sh: Use the new function.
* tests/install/install-C-selinux.sh: Likewise.
* tests/install/install-C.sh: Likewise.
* src/head.c (elide_tail_bytes_pipe): Don't use calloc as that
bypasses memory overcommit due to the zeroing requirement.
Also realloc rather than malloc the pointer array to avoid
dependence on overcommit entirely.
* tests/misc/head-c.sh: Add a test case.
Fixes http://bugs.gnu.org/13530
* src/dd.c: Add new static global variable ibuf.
(alloc_ibuf, alloc_obuf): New functions factored from dd_copy().
(dd_copy): Call the new functions to allocate memory for
ibuf and obuf when necessary.
(skip): Likewise.
* tests/dd/no-allocate.sh: New test.
* tests/local.mk: Reference the test.
With --suppress-matched, the lines that match the pattern will not be
printed in the output files. I.E. the first line from the second
and subsequent splits will be suppressed.
* src/csplit.c: process_regexp(),process_line_count(): Don't output the
matched lines. Since csplit includes "up to but not including" matched
lines in each split, the first line (in the next group) is the matched
line - so just skip it.
main(): Handle new option.
usage(): Mention new option.
* doc/coreutils.texi (csplit invocation): Mention new option, examples.
* tests/misc/csplit-suppress-matched.pl: New test script.
* tests/local.mk: Reference the new test.
* NEWS: Mention new feature.
* init.cfg (require_gcc_shared_): A new function to check
that we can build shared libraries in the particular manner
we use in our tests.
* tests/cp/nfs-removal-race.sh: Use require_gcc_shared_.
Then fail rather than skip, if the actual shared lib build fails.
* tests/df/no-mtab-status.sh: Likewise.
* tests/df/skip-duplicates.sh: Likewise.
* tests/ls/getxattr-speedup.sh: Likewise.
Reported in http://bugs.gnu.org/14024
* src/tail.c (main): If -n0 or -c0 were specified without -f,
then no data would ever be output, so exit without reading input.
* tests/tail-2/tail-n0f.sh: Augment the related test with this case.
* src/shuf.c (main): If -n0 specified then no data would ever be output,
so exit without reading input.
* tests/misc/shuf.sh: Augment the related test with this case.
* doc/coreutils.texi (ln invocation): Describe how symlinks are
resolved with --relative, and give an example showing the greater
control available through realpath(1).
* tests/ln/relative.sh: Add a test to demonstrate full symlink
resolution, in a case where it might not be wanted.
Don't dereference an existing symlink being replaced.
I.E. generate the symlink relative to the symlink's containing dir,
rather than to some arbitrary place it points to.
* src/ln.c (convert_abs_rel): Don't consider the final component
of the symlink name when canonicalizing, as we want to avoid
dereferencing the final component.
* tests/ln/relative.sh: Add a test case.
* NEWS: Mention the fix.
Resolves http://bugs.gnu.org/14116
Reservoir sampling optimizes selecting K random lines from large or
unknown-sized input: http://en.wikipedia.org/wiki/Reservoir_sampling
Note this also avoids reading any input when -n0 is specified.
* src/shuf.c (main): Use reservoir-sampling when the number of output
lines is known, and the input size is large or unknown.
(input_size): A new function to get the input size for regular files.
(read_input_reservoir_sampling): New function to read lines from input,
keeping only K lines in memory, replacing lines with decreasing prob.
(write_permuted_output_reservoir): New function to output reservoir.
* tests/misc/shuf-reservoir.sh: An expensive_ test using valgrind to
exercise the reservoir-sampling code.
* tests/local.mk: Reference new test.
* NEWS: Mention the improvement.
* src/uniq.c (usage): Summarize the new option,
and adjust the --all-repeated option to be more consistent.
(check_file): Merge the --group functionality into
the core loop for the default uniq operation since
it's very similar and can output lines immediately upon reading.
(main): Handle the new --group option and make it
mutually exclusive with other selection options.
* tests/misc/uniq.pl: Add tests.
* NEWS: Mention the new feature.
* doc/coreutils.texi (uniq invocation): Describe --group.
* src/system.h (emit_ancillary_info): Link to the bug report email
addresses and general help URLs online rather than specifying directly.
This give us greater scope to present better info like describing
the difference between bug-coreutils@gnu.org and coreutils@gnu.org etc.
* tests/misc/help-version.sh: Remove the check for bug-coreutils@gnu.org
* tests/local.mk: Remove the no longer needed PACKAGE_BUGREPORT.
* NEWS: Mention join's new option: --zero-terminated (-z).
* src/join.c: Add new option, --zero-terminated (-z), to make
join use the NUL byte as separator/delimiter rather than newline.
(get_line): Use readlinebuffer_delim in place of readlinebuffer.
(main): Handle the new option.
(usage): Describe new option the same way sort does.
* doc/coreutils.texi (join invocation): Describe the new option.
* tests/misc/join.pl: add tests for -z option.
* src/install.c (strip): Indicate failure with a return code instead
of terminating the program.
(install_file_in_file): Handle strip's return code and unlink the
created file if necessary.
* tests/install/strip-program.sh: Add a test to cover the changes.
* NEWS (Bug fixes): Mention the fix.
Reported by John Reiser in http://bugzilla.redhat.com/632444.
* tests/du/long-from-unreadable.sh: This test requires a NAME_MAX
of at least 200, so skip the test otherwise.
* tests/rm/deep-2.sh: Likewise.
Reported by C de-Avillez with ecryptfs where NAME_MAX = 143.
* tests/du/threshold.sh: use `cut` rather than
sed to avoid using the non portable \t which
fails on sed on openbsd 5 at least.
Also remove a redundant call to `tr` and avoid
explicit setting of LANG=C which is done globally.
Both factor and numfmt recently introduced debug messages
for developers, enabled by --verbose and ---devdebug respectively.
There were a few issues though:
1. They used different mechanisms to enable these messages.
2. factor used --verbose which might be needed for something else
3. They used different methods to output the messages,
and numfmt used error() which added an unwanted newline
4. numfmt marked all these messages for translation and factor
marked a couple. We really don't need these translated.
So we fix the above issues here while renaming the enabling
option for both commands to ---debug (still undocumented).
* src/factor.c (verbose): Rename to dev_debug and change from int to
bool as it's just a toggle flag.
(long_options): Rename --verbose to ---debug.
* src/system.h (devmsg): A new inline function to output a message
if enabled by a global dev_debug variable in the compilation unit.
* src/numfmt.c: Use devmsg() rather than error().
Also remove the translation tags from these messages.
Also change debug flag to bool from int.
* tests/misc/numfmt.pl: Adjust for the ---devdebug to ---debug change.
* cfg.mk (sc_marked_devdiagnostics): Add a syntax check to ensure
translations are not added to devmsg calls.
Reported by Göran Uddeborg in http://bugs.gnu.org/13665
* tests/tail-2/inotify-rotate.sh: Avoid a subshell with bash,
which in turn causes the `kill` to be ineffective to the tail
processes (as the SIGTERM is sent to the subshell which doesn't
propagate the signal on to its children). On NFS the test
cleanup will then fail as there will be .nfs files maintained
in the directory for the files still opened by the tail processes.
Reported by Bernhard Voelker.
* tests/misc/numfmt.pl: When the system locale grouping doesn't
match our expected format for grouping 1234 in the fr_FR locale,
reset the locale to 'C' so as to skip all locale tests.
* tests/cp/fail-perm.sh: Adjust expected diagnostic to match
just-changed cp diagnostic.
* tests/ln/hard-to-sym.sh: Likewise.
* .mailmap: Also map my new address.
Originally requested in Red Hat bugzilla #445213.
* src/stty.c (mode_info): Add support for DTR/DSR hardware flow control,
if available.
* doc/coreutils.texi: Document it.
* tests/misc/stty.sh: Add it to the list of serial options to avoid.
* NEWS: Mention the improvement.
* AUTHORS: Add my name.
* NEWS: Mention the new program.
* README: Reference the new program.
* src/numfmt.c: New file.
* src/.gitignore: Ignore the new binary.
* build-aux/gen-lists-of-programs.sh: Update.
* scripts/git-hooks/commit-msg: Allow numfmt: commit prefix.
* po/POTFILES.in: Add new c file.
* tests/misc/numfmt.pl: A new test file giving >93% coverage.
* tests/local.mk: Reference the new test.
* man/.gitignore: Ignore the new man page.
* man/local.mk: Reference the new man page.
* man/numfmt.x: A new template.
* doc/coreutils.texi: Document the new command.
Fixes the issue introduced in unreleased commit v8.20-60-gec48bea.
* src/cut.c (set_fields): Don't access the bit array if
we've an open ended range that's outside any finite range.
* tests/misc/cut.pl: Add tests for this case.
Reported by Marcel Böhme in http://bugs.gnu.org/13627
Like any other pseudo file system, df should show rootfs only
when the -a option is specified, i.e. specifying -trootfs alone
is not sufficient. As the rootfs entry is now elided by the
general deduplication in filter_mount_list (commit v8.20-103-gbb116d3),
all other references to rootfs can be removed again.
* src/df.c (show_rootfs): Remove global variable.
(ROOTFS): Remove constant.
(filter_mount_list): Remove case to handle rootfs specially.
(main): In the case for handling the -t option, remove setting
of the show_rootfs variable.
* tests/df/skip-rootfs.sh: Adapt the test case "df -t rootfs":
the rootfs file system must not be printed (because no -a).
* doc/coreutils.texi (df invocation): Correct the documentation
about eliding mount entries: it is not the first occurrence of
the the device which wins, but now rather the entry with the
shortest mount point name. Also adapt the description about
eliding pseudo file system types like rootfs.
* NEWS (Changes in behavior): Adapt entry.
* src/df.c (struct devlist): Add a new element for storing
pointers to mount_entry structures.
(devlist_head, dev_examined): Remove.
(filter_mount_list): Add new function to filter out the rootfs
entry (unless -trootfs is specified), and duplicities. The
function favors entries with a '/' character in me_devname
or those with the shortest me_mountdir string, if multiple
entries fulfill the first condition.
Use struct devlist to build up a list of entries already known,
and finally rebuild the global mount_list.
(get_all_entries): Call the above new function unless the -a
option is specified.
(get_dev): Remove the code for skipping rootfs and duplicities.
* tests/df/skip-duplicates.sh: Add test cases.
Co-authored-by: Bernhard Voelker <mail@bernhard-voelker.de>
* src/timeout.c (unblock_signal): A new function to unblock a
specified signal, or warn if not possible.
(set_timeout): Ensure SIGALRM is unblocked before we setup the timer.
* tests/misc/timeout-blocked.pl: A new test for the issue.
* tests/local.mk: Reference the new test.
* NEWS: Mention the fix.
Fixes: http://bugs.gnu.org/13535
* src/seq.c (main): With 3 positive integer args we were
checking the end value was == "1", rather than the step value.
* tests/misc/seq.pl: Add tests for this case.
Reported by Marcel Böhme in http://bugs.gnu.org/13525
* src/seq.c (get_default_format): Also account for the case where '.'
is auto added to the start value, which is significant when the
number sequence narrows.
* tests/misc/seq.pl: Add two new tests for the failing cases.
* NEWS: Mention the fix.
Fixes http://bugs.gnu.org/13394
* src/cut.c (cut_fields): Handle the edge case where '\n' is
the delimiter, which could be used for example to suppress
the last line if it doesn't contain a '\n'.
* test/misc/cut.pl: Add tests for this edge case.
Previously line N+1 was inspected before line N was fully output,
which causes output ordering issues at the terminal or delays
from intermittent sources like tail -f.
* src/cut.c (cut_fields): Adjust so that we record the
previous output character so we can use that info to
determine wether to output a '\n' or not.
* tests/misc/cut.pl: Add tests to ensure existing
functionality isn't broken.
* NEWS: Mention the fix.
Fixes bug http://bugs.gnu.org/13498
* src/du.c (opt_threshold): Add variable to hold the value of
the --threshold option specified by the user.
(long_options): Add a required_argument entry for the new
--threshold option.
(usage): Add --threshold option.
(process_file): Elide printing the entry if its size does not
meet the value specified by the --threshold option.
(main): In the argument parsing loop, add a case for the new
-t option. Convert the given argument by permitting the
well-known suffixes for megabyte, gigabytes, etc.
Handle the special case "-0": give an error as this value is
not permitted.
* doc/coreutils.texi (du invocation): Add documentation for the
above new option.
* tests/du/threshold.sh: Add new test to exercise the new option.
* tests/local.mk (all_tests): Mention the above test.
Co-authored-by: Bernhard Voelker <mail@bernhard-voelker.de>
This test tried to ensure that not all symlinks (across all
file system types) have Zero size and refers to a change
in system.h from 2002-08-31 (commit SH-UTILS-2_0_15-55-g62808a7).
The test used to do this by working on symlinks to long file
names. This assumption is dependant on the underlying file
system, and in some environments like XEN does not even work
on file systems known to work otherwise.
The test for dereferencing and no-dereferencing symlinks is
already covered by other tests (du/deref.sh, du/deref-args.sh,
and du/no-deref.sh). Therefore, remove it.
* tests/du/slink.sh: Remove file.
* tests/local.mk (all_tests): Remove the above test.
Discussed in:
http://lists.gnu.org/archive/html/coreutils/2013-01/msg00053.html
Run "make update-copyright", but then also run this,
perl -pi -e 's/2\d\d\d-//' tests/sample-test
to make that one script use the single most recent year number.
This regression was introduced in commit v8.19-132-g3786fb6.
* src/seq.c (seq_fast): Don't use puts() to output the first number,
and instead insert it into the buffer as for other numbers.
Also output the terminator unconditionally.
* tests/misc/seq.pl: Add some basic tests for the -s option.
* NEWS: Mention the fix.
* THANKS.in: Reported by Philipp Gortan.
The -z option has been introduced in commit v8.15-60-ga3eb71a,
i.e. in coreutils-8.16. Time to add some tests for it.
* tests/misc/basename.pl: Add tests exercising the -z option.
In the foreach loop to append a newline to the end of each
expected 'OUT' string, skip the -z tests.
* tests/misc/timeout-group.sh: The kernel might possibly delay
signal propagation to timeout.cmd long enough, that it exits
normally without running the signal handler (as sleep will
be in the same process group and so get the signal too).
So avoid this by explicitly checking that the signal handler
is called, which should always happen under normal circumstances.
Reported by Stefano Lattarini on linux-2.6.30-2-686 and bash-4.2.36.
This allows efficient processing of multiple files,
while also increasing compatibility with BSD's readlink(1).
We also add the -z, --zero option to delimit output items
with the NUL character which disambiguates output in the
presence of '\n' characters.
* src/readlink.c (usage): Add the --zero description,
and also adjust the description of --no-newline accordingly.
(main): Handle the -z option and iterate over multiple arguments.
Also as in commit v8.15-24-g9d46b25 we use fputs() and putchar()
rather than printf() for performance reasons.
* doc/coreutils.texi (readlink invocation): Document the
new --zero option, adjust the --no-newline description, and
tweak the general info to indicate multiple files are supported.
* tests/readlink/multi.sh: A new test for the new functionality.
* tests/local.mk: Reference the new test.
* man/readlink.x: Adjust the summary and also reference realpath.
* NEWS: Mention the improvement.
* THANKS.in: Suggested by Aaron Davies.
* tests/misc/cut-huge-to-eol-range.sh: New test, showing that
the change in v8.20-51-g7d03466 is a bug fix after all.
* tests/local.mk (all_tests): Add it.
* NEWS (Bug fixes): Mention it.