glibc/string
Wilco Dijkstra 612fba2fe9 Improve performance of memmem
This patch significantly improves performance of memmem using a novel
modified Horspool algorithm.  Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Longer needles within this range use a self-adapting filtering step to avoid
comparing the whole needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear in the haystack
length, since each alignment compares at most 256 bytes.
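
As a rough illustration of the shift table described above, the sketch
below implements a plain Horspool search keyed on hashed byte pairs with
8-bit table entries.  It is not the code from this patch: the names
sketch_memmem and pair_hash, the hash formula, and the plain-scan fallback
for needles outside the 2..256 range are all assumptions made for the
example, and the filtering step is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 256

/* Hypothetical hash of the byte pair (p[-1], p[0]).  Any mapping into
   the table works; collisions only make shifts more conservative.  */
static size_t
pair_hash (const unsigned char *p)
{
  return ((size_t) p[-1] * 31 + p[0]) % TABLE_SIZE;
}

/* Illustrative only: not the glibc implementation.  */
static void *
sketch_memmem (const void *haystack, size_t hs_len,
               const void *needle, size_t ne_len)
{
  const unsigned char *hs = haystack;
  const unsigned char *ne = needle;

  if (ne_len == 0)
    return (void *) hs;
  if (hs_len < ne_len)
    return NULL;

  size_t last = hs_len - ne_len;   /* Last valid match position.  */

  /* Outside the table's range: plain scan.  The patch uses other
     strategies here, which this sketch does not reproduce.  */
  if (ne_len < 2 || ne_len > 256)
    {
      for (size_t pos = 0; pos <= last; pos++)
        if (memcmp (hs + pos, ne, ne_len) == 0)
          return (void *) (hs + pos);
      return NULL;
    }

  /* shift[h] is how far the window may move when the pair at its end
     hashes to h.  The largest value is ne_len - 1 <= 255, so one byte
     per entry is enough.  */
  uint8_t shift[TABLE_SIZE];
  size_t m1 = ne_len - 1;
  memset (shift, (int) m1, sizeof shift);
  for (size_t i = 1; i < m1; i++)
    shift[pair_hash (ne + i)] = (uint8_t) (m1 - i);

  for (size_t pos = 0; pos <= last; )
    {
      if (memcmp (hs + pos, ne, ne_len) == 0)
        return (void *) (hs + pos);
      /* Every entry is at least 1, so the search always advances.  */
      pos += shift[pair_hash (hs + pos + m1)];
    }
  return NULL;
}
```

Because a pair absent from the needle yields the default shift of
ne_len - 1, most alignments in ordinary text are skipped after a single
table lookup.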

Small needles up to size 2 use a dedicated linear search.  Very long needles
use the Two-Way algorithm; inlining of that path is disabled to avoid
increasing stack size or slowing down the common case.
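
A dedicated linear search for a 2-byte needle can be sketched as a rolling
16-bit comparison, as below.  This is a minimal example under assumptions,
not the patch's code: the function name find_pair and its exact shape are
invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: find the first occurrence of the 2-byte needle
   in the haystack by sliding a 16-bit window one byte at a time.  */
static void *
find_pair (const void *haystack, size_t hs_len, const void *needle)
{
  const unsigned char *hs = haystack;
  const unsigned char *ne = needle;

  if (hs_len < 2)
    return NULL;

  uint32_t want = (uint32_t) ne[0] << 8 | ne[1];
  uint32_t have = hs[0];

  for (size_t i = 1; i < hs_len; i++)
    {
      /* Drop the oldest byte and append the next one.  */
      have = ((have << 8) | hs[i]) & 0xffff;
      if (have == want)
        return (void *) (hs + i - 1);
    }
  return NULL;
}
```

Each haystack byte is loaded once and compared with a single integer test,
which avoids the table setup cost that would dominate for such a short
needle.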

The performance gain is 6.6 times on English text on AArch64 using random
needles with average size 8.

Tested against the glibc testsuite and randomized tests.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

	* string/memmem.c (__memmem): Rewrite to improve performance.

(cherry picked from commit 680942b016)
2019-09-13 16:41:34 +01:00
bits Remove bits/string.h. 2017-06-20 08:21:24 -04:00
_strerror.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-addsep.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-append.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-count.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-create.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-ctsep.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-delete.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-extract.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-insert.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-next.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-replace.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz-stringify.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
argz.h Remove __need macros from errno.h (__need_Emath, __need_error_t). 2017-06-14 08:14:34 -04:00
basename.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
bcopy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
bug-envz1.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
bug-strcoll1.c Update. 2001-04-26 20:45:18 +00:00
bug-strcoll2.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
bug-strncat1.c Fix string/bug-strncat1.c build with GCC 8. 2018-10-22 14:16:02 +02:00
bug-strpbrk1.c * malloc/memusagestat.c (main): Use return instead of exit to 2000-12-31 10:52:32 +00:00
bug-strspn1.c * malloc/memusagestat.c (main): Use return instead of exit to 2000-12-31 10:52:32 +00:00
bug-strtok1.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
byteswap.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
bzero.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Depend Update. 2001-03-19 21:40:15 +00:00
endian.h Make endian-conversion macros always return correct types (bug 16458). 2017-01-11 15:28:08 +00:00
envz.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
envz.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
explicit_bzero.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
ffs.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
ffsll.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
inl-tester.c Update. 1997-09-11 12:09:10 +00:00
Makefile Remove bits/string.h. 2017-06-20 08:21:24 -04:00
memccpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memchr.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memcmp.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memcpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memfrob.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memmem.c Improve performance of memmem 2019-09-13 16:41:34 +01:00
memmove.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memory.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
mempcpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memrchr.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
memset.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
noinl-tester.c Update. 1997-09-16 00:42:43 +00:00
rawmemchr.c Fix rawmemchr build with GCC 8. 2017-05-10 00:25:59 +00:00
stpcpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
stpncpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
str-two-way.h Improve performance of strstr 2019-09-13 16:41:26 +01:00
stratcliff.c string/stratcliff.c: Replace int with size_t [BZ #21982] 2017-09-11 08:48:28 -07:00
strcasecmp_l.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcasecmp.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strcasestr.c Fix strstr bug with huge needles (bug 23637) 2019-09-13 16:40:59 +01:00
strcat.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strchr.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strchrnul.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcmp.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcoll_l.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strcoll.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strcspn.c Narrowing the visibility of libc-internal.h even further. 2017-03-01 20:33:46 -05:00
strdup.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strerror_l.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strerror.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strfry.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
string-inlines.c Remove bits/string.h. 2017-06-20 08:21:24 -04:00
string.h Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strings.h Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strlen.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strncase_l.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strncase.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strncat.c Remove bits/string.h. 2017-06-20 08:21:24 -04:00
strncmp.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strncpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strndup.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strnlen.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strpbrk.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strrchr.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strsep.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strsignal.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strspn.c Narrowing the visibility of libc-internal.h even further. 2017-03-01 20:33:46 -05:00
strstr.c Improve performance of strstr 2019-09-13 16:41:26 +01:00
strtok_r.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strtok.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strverscmp.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
strxfrm_l.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strxfrm.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
swab.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-bcopy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-bzero.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-endian-types.c Make endian-conversion macros always return correct types (bug 16458). 2017-01-11 15:28:08 +00:00
test-explicit_bzero.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-ffs.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-memccpy.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-memchr.c Add memchr tests for n == 0 2017-05-25 11:38:01 -07:00
test-memcmp.c x86-64: memcmp-avx2-movbe.S needs saturating subtraction [BZ #21662] 2017-06-23 17:24:40 +02:00
test-memcpy.c Add a test case for [BZ #23196] 2018-05-24 15:47:22 +02:00
test-memmem.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-memmove.c Fix i386 memmove issue (bug 22644). 2018-05-17 13:58:22 +02:00
test-mempcpy.c Don't write beyond destination in __mempcpy_avx512_no_vzeroupper (bug 23196) 2018-05-24 15:47:12 +02:00
test-memrchr.c Add more tests for memchr 2017-06-08 09:56:01 -07:00
test-memset.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-rawmemchr.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-stpcpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-stpncpy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-strcasecmp.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strcasestr.c Improve strstr performance 2019-09-13 16:39:12 +01:00
test-strcat.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strchr.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strchrnul.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-strcmp.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strcpy.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strcspn.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
test-string.h Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
test-strlen.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strncasecmp.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strncat.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strncmp.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strncpy.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strnlen.c Add page tests to string/test-strnlen. 2017-04-05 10:28:41 -03:00
test-strpbrk.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strrchr.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strspn.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
test-strstr.c Fix strstr bug with huge needles (bug 23637) 2019-09-13 16:40:59 +01:00
testcopy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
tester.c Ignore -Wrestrict for one strncat test. 2018-10-22 14:15:29 +02:00
tst-bswap.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-cmp.c Increase some test timeouts. 2017-07-06 17:01:03 +00:00
tst-endian.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-inlcall.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-strcoll-overflow.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-strfry.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-strlen.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-strtok_r.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-strtok.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-strxfrm2.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-strxfrm.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-svc2.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-svc.c Update string tests to use the support test driver. 2017-03-23 11:32:29 -03:00
tst-svc.expect * string/strverscmp.c (__strverscmp): Fix last cleanups. 2009-04-07 06:51:59 +00:00
tst-svc.input * string/strverscmp.c (__strverscmp): Fix last cleanups. 2009-04-07 06:51:59 +00:00
tst-xbzero-opt.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
Versions New string function explicit_bzero (from OpenBSD). 2016-12-16 16:21:54 -05:00
wordcopy.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
xpg-strerror.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00