Commit Graph

21 Commits

Author SHA1 Message Date
Wilco Dijkstra
5e0a7ecb66 Improve performance of strstr
This patch significantly improves performance of strstr using a novel
modified Horspool algorithm.  Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.

Small needles up to size 3 use a dedicated linear search.  Very long needles
use the Two-Way algorithm.

The performance gain using the improved bench-strstr on Cortex-A72 is 5.8
times basic_strstr and 3.7 times twoway_strstr.

Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

	* string/str-two-way.h (two_way_short_needle): Add inline to avoid
	warning.
	(two_way_long_needle): Block inlining.
	* string/strstr.c (strstr2): Add new function.
	(strstr3): Likewise.
	(STRSTR): Completely rewrite strstr to improve performance.
2019-06-12 11:38:52 +01:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Wilco Dijkstra
3ae725dfb6 Improve strstr performance
Improve strstr performance.  Strstr tends to be slow because it uses
many calls to memchr and a slow byte loop to scan for the next match.
Performance is significantly improved by using strnlen on larger blocks
and using strchr to search for the next matching character.  strcasestr
can also use strnlen to scan ahead, and memmem can use memchr to check
for the next match.

On the GLIBC bench tests the performance gains on Cortex-A72 are:
strstr: +25%
strcasestr: +4.3%
memmem: +18%

On a 256KB dataset strstr performance improves by 67%, strcasestr by 47%.

    Reviewd-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2018-07-16 17:51:52 +01:00
Joseph Myers
688903eb3e Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2018-01-01 00:32:25 +00:00
Joseph Myers
bfff8b1bec Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Joseph Myers
f7a9f785e5 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
Joseph Myers
b168057aaa Update copyright dates with scripts/update-copyrights. 2015-01-02 16:29:47 +00:00
Allan McRae
d4697bc93d Update copyright notices with scripts/update-copyrights 2014-01-01 22:00:23 +10:00
Tom de Vries
a175b684e2 Fix typo, improve comment, remove superfluous #undefs, add missing #undef. 2013-02-12 00:00:49 +01:00
Joseph Myers
568035b787 Update copyright notices with scripts/update-copyrights. 2013-01-02 19:05:09 +00:00
Maxim Kuvyrkov
e9f3725206 Fix BZ #14716: memmem crash 2012-10-15 17:22:41 -07:00
Maxim Kuvyrkov
57e605ba50 Fix BZ #14602: strstr and strcasestr return wrong result. 2012-10-08 20:52:53 -07:00
Maxim Kuvyrkov
bcca089526 Micro-optimize critical path of strstr, strcase and memmem. 2012-08-21 18:07:47 -07:00
Maxim Kuvyrkov
99677e5755 Use pointers for traversing arrays in strstr, strcasestr and memmem. 2012-08-21 18:07:47 -07:00
Maxim Kuvyrkov
400726deef Detect EOL on-the-fly in strstr, strcasestr and memmem. 2012-08-21 18:07:47 -07:00
Maxim Kuvyrkov
20a71f2c8a Optimize first-character loop of strstr, strcasestr and memmem. 2012-08-21 18:07:47 -07:00
Roland McGrath
be75d75807 Remove local redefinition of MAX macro. 2012-08-15 11:40:41 -07:00
Paul Eggert
59ba27a63a Replace FSF snail mail address with URLs. 2012-02-09 23:18:22 +00:00
Eric Blake
5fb308bca2 Fix strstr and memmem algorithm. 2010-10-06 13:48:07 -04:00
Ulrich Drepper
dbc676d4ff Add performance tests for strstr and strcasestr. 2010-07-23 22:27:53 -07:00
Ulrich Drepper
0caca71ac9 * string/Makefile (distribute): Add str-two-way.h.
2008-03-29  Eric Blake	<ebb9@byu.net>

	Rewrite string searches to O(n) rather than O(n^2).
	* string/str-two-way.h: New file.  For linear fixed-allocation
	string searching.
	* string/memmem.c: New implementation.
	* string/strstr.c: New implementation.
	* string/strcasestr.c: New implementation.

	* sysdeps/posix/getaddrinfo.c (getaddrinfo): Call _res_hconf_init
2008-05-15 04:42:20 +00:00