expr: fix bug with unmatched \(...\)

Problem reported by Qiuhao Li.
* NEWS: Mention this.
* doc/coreutils.texi (String expressions):
Document the correct behavior, which POSIX requires.
* src/expr.c (docolon): Treat unmatched \(...\) as empty.
* tests/misc/expr.pl: New test.
Paul Eggert 2021-01-26 09:23:54 -08:00
parent bb21daa125
commit 735083ba24
4 changed files with 20 additions and 7 deletions

NEWS

@@ -17,6 +17,9 @@ GNU coreutils NEWS -*- outline -*-
 heavily changed during the run.
 [bug introduced in coreutils-8.25]
 
+expr no longer mishandles unmatched \(...\) in regular expressions.
+[bug introduced in coreutils-6.0]
+
 ls no longer crashes when printing the SELinux context for unstatable files.
 [bug introduced in coreutils-6.9.91]

doc/coreutils.texi

@@ -13559,12 +13559,14 @@ second is considered to be a (basic, a la GNU @code{grep}) regular
 expression, with a @code{^} implicitly prepended. The first argument is
 then matched against this regular expression.
 
-If the match succeeds and @var{regex} uses @samp{\(} and @samp{\)}, the
-@code{:} expression returns the part of @var{string} that matched the
-subexpression; otherwise, it returns the number of characters matched.
-
-If the match fails, the @code{:} operator returns the null string if
-@samp{\(} and @samp{\)} are used in @var{regex}, otherwise 0.
+If @var{regex} does not use @samp{\(} and @samp{\)}, the @code{:}
+expression returns the number of characters matched, or 0 if the match
+fails.
+
+If @var{regex} uses @samp{\(} and @samp{\)}, the @code{:} expression
+returns the part of @var{string} that matched the subexpression, or
+the null string if the match failed or the subexpression did not
+contribute to the match.
 
 @kindex \( @r{regexp operator}
 Only the first @samp{\( @dots{} \)} pair is relevant to the return

src/expr.c

@@ -614,8 +614,13 @@ docolon (VALUE *sv, VALUE *pv)
 /* Were \(...\) used? */
 if (re_buffer.re_nsub > 0)
   {
-    sv->u.s[re_regs.end[1]] = '\0';
-    v = str_value (sv->u.s + re_regs.start[1]);
+    if (re_regs.end[1] < 0)
+      v = str_value ("");
+    else
+      {
+        sv->u.s[re_regs.end[1]] = '\0';
+        v = str_value (sv->u.s + re_regs.start[1]);
+      }
   }
 else
   {

tests/misc/expr.pl

@@ -84,6 +84,9 @@ my @Tests =
 # In 5.94 and earlier, anchors incorrectly matched newlines.
 ['anchor', "'a\nb' : 'a\$'", {OUT => '0'}, {EXIT => 1}],
 
+# In 8.32, \( ... \) that did not match caused memory errors.
+['emptysub', '"a" : "\\(b\\)*"', {OUT => ''}, {EXIT => 1}],
+
 # These tests are taken from grep/tests/bre.tests.
 ['bre1', '"abc" : "a\\(b\\)c"', {OUT => 'b'}],
 ['bre2', '"a(" : "a("', {OUT => '2'}],
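
Note for readers: the 'emptysub' case can be reproduced outside expr with a rough sketch like the one below. It is not the code from src/expr.c. It assumes the POSIX regcomp/regexec interface rather than the re_registers structure that docolon uses, and first_group is a hypothetical helper name chosen for illustration. The point it shows is the same one the patch relies on: when the overall match succeeds but the first \(...\) group takes no part in it, the group's offsets are negative and must not be used as string indexes.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of coreutils.  Returns a malloc'd copy
   of the text captured by the first \(...\) group of the basic regular
   expression BRE when matched at the start of STRING.  Returns "" if
   the match fails or the group did not contribute to the match, and
   NULL on allocation or compilation failure.  The caller frees the
   result.  */
static char *
first_group (const char *string, const char *bre)
{
  /* Prepend '^' to imitate the implicit anchoring of expr's ':'.  */
  size_t len = strlen (bre);
  char *anchored = malloc (len + 2);
  if (!anchored)
    return NULL;
  anchored[0] = '^';
  memcpy (anchored + 1, bre, len + 1);

  regex_t re;
  int rc = regcomp (&re, anchored, 0);  /* no flags: POSIX basic RE */
  free (anchored);
  if (rc != 0)
    return NULL;

  regmatch_t m[2];
  size_t n = 0;
  const char *start = "";

  /* An unmatched group is reported with rm_so == -1, so check before
     using the offsets; this mirrors the re_regs.end[1] < 0 test.  */
  if (regexec (&re, string, 2, m, 0) == 0 && m[1].rm_so >= 0)
    {
      start = string + m[1].rm_so;
      n = (size_t) (m[1].rm_eo - m[1].rm_so);
    }
  regfree (&re);

  char *result = malloc (n + 1);
  if (!result)
    return NULL;
  memcpy (result, start, n);
  result[n] = '\0';
  return result;
}
```

Under these assumptions, first_group ("a", "\\(b\\)*") yields the empty string, matching the new 'emptysub' test, while first_group ("abc", "a\\(b\\)c") yields "b" as in 'bre1'.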