glibc/sysdeps/x86
Noah Goldstein 9a421348cd elf: Optimize _dl_new_hash in dl-new-hash.h
Unroll slightly and enforce good instruction scheduling. This improves
performance on out-of-order machines. The unrolling allows for
pipelined multiplies.

In addition, as an optional sysdep, reorder the operations and prevent
reassociation for better scheduling and higher ILP. This commit
only adds the barrier for x86, although it should be either no
change or a win on any architecture.
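
A minimal sketch of the idea described above, not the code from
dl-new-hash.h itself. It assumes the usual GNU-style string hash
(h = h * 33 + c, seeded with 5381). The names hash_simple,
hash_unrolled, check_same_hash and REASSOC_BARRIER are hypothetical
and exist only for this illustration. The empty-asm barrier is one
common way to stop a compiler from reassociating an expression; whether
it matches the sysdep barrier added by this commit is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a reassociation barrier: an empty asm
   statement that makes the compiler treat X as an opaque register value.
   On other compilers it degrades to a no-op.  */
#if defined __GNUC__ || defined __clang__
# define REASSOC_BARRIER(x) __asm__ ("" : "+r" (x))
#else
# define REASSOC_BARRIER(x) ((void) 0)
#endif

/* Reference form: one dependent multiply-add per input byte.  */
static uint32_t
hash_simple (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  uint32_t h = 5381;
  while (*p != '\0')
    h = h * 33 + *p++;
  return h;
}

/* Unrolled by two.  (h * 33 + c0) * 33 + c1 is rewritten as
   h * (33 * 33) + (c0 * 33 + c1), so the term built from the two bytes
   does not depend on h and its multiply can overlap with the multiply
   on h.  */
static uint32_t
hash_unrolled (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  uint32_t h = 5381;
  for (;;)
    {
      uint32_t c0 = p[0];
      if (c0 == 0)
        return h;
      uint32_t c1 = p[1];
      if (c1 == 0)
        return h * 33 + c0;
      uint32_t t = c0 * 33 + c1;
      /* Keep T as a separate value so the compiler does not fold the
         expression back into the serial form.  */
      REASSOC_BARRIER (t);
      h = h * (33 * 33) + t;
      p += 2;
    }
}

/* Both forms compute the same value modulo 2^32.  */
static void
check_same_hash (const char *s)
{
  assert (hash_unrolled (s) == hash_simple (s));
}
```

Because the arithmetic is unsigned and wraps modulo 2^32, the unrolled
form returns exactly the same hash as the byte-at-a-time loop; only the
dependency chain through h gets shorter.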

Unrolling further started to induce slowdowns for sizes [0, 4],
but it can help the loop, so if larger sizes are the target, further
unrolling can be beneficial.

Results for _dl_new_hash
Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz

Time as geometric mean of N=30 runs
Geometric mean of all benchmarks, New / Old: 0.674
  type, length, New Time, Old Time, New Time / Old Time
 fixed,      0,    2.865,     2.72,               1.053
 fixed,      1,    3.567,    2.489,               1.433
 fixed,      2,    2.577,    3.649,               0.706
 fixed,      3,    3.644,    5.983,               0.609
 fixed,      4,    4.211,    6.833,               0.616
 fixed,      5,    4.741,    9.372,               0.506
 fixed,      6,    5.415,    9.561,               0.566
 fixed,      7,    6.649,   10.789,               0.616
 fixed,      8,    8.081,   11.808,               0.684
 fixed,      9,    8.427,   12.935,               0.651
 fixed,     10,    8.673,   14.134,               0.614
 fixed,     11,    10.69,   15.408,               0.694
 fixed,     12,   10.789,   16.982,               0.635
 fixed,     13,   12.169,   18.411,               0.661
 fixed,     14,   12.659,   19.914,               0.636
 fixed,     15,   13.526,   21.541,               0.628
 fixed,     16,   14.211,   23.088,               0.616
 fixed,     32,   29.412,   52.722,               0.558
 fixed,     64,    65.41,  142.351,               0.459
 fixed,    128,  138.505,  295.625,               0.469
 fixed,    256,  291.707,  601.983,               0.485
random,      2,   12.698,   12.849,               0.988
random,      4,   16.065,   15.857,               1.013
random,      8,   19.564,   21.105,               0.927
random,     16,   23.919,   26.823,               0.892
random,     32,   31.987,   39.591,               0.808
random,     64,   49.282,   71.487,               0.689
random,    128,    82.23,  145.364,               0.566
random,    256,  152.209,  298.434,                0.51

Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-05-23 10:38:40 -05:00
bits Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
fpu Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
include Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
nptl Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
sys/platform Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
__longjmp_cancel.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
abi-note.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
atomic-machine.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
cacheinfo.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
cacheinfo.h x86: use default cache size if it cannot be determined [BZ #28784] 2022-01-17 19:42:46 +01:00
cet-control.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
check-cet.awk Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
configure elf: Replace PI_STATIC_AND_HIDDEN with opposite HIDDEN_VAR_NEEDS_DYNAMIC_RELOC 2022-04-26 09:26:22 -07:00
configure.ac elf: Replace PI_STATIC_AND_HIDDEN with opposite HIDDEN_VAR_NEEDS_DYNAMIC_RELOC 2022-04-26 09:26:22 -07:00
cpu-features-offsets.sym x86: Cleanup cpu-features-offsets.sym 2018-08-03 06:42:09 -07:00
cpu-features.c x86: Black list more Intel CPUs for TSX [BZ #27398] 2022-01-18 14:20:09 -08:00
cpu-tunables.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-cacheinfo.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-cet.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-diagnostics-cpu.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-get-cpu-features.c x86: Add x86-64-vN check to early startup 2022-01-14 20:17:49 +01:00
dl-hwcap.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-isa-level.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-lookupcfg.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-minsigstacksize.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-new-hash.h elf: Optimize _dl_new_hash in dl-new-hash.h 2022-05-23 10:38:40 -05:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-procinfo.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-procruntime.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-prop.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
dl-tunables.list Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
elf-initfini.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
elide.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
float128-abi.h Move __isnanf128 to libc.so 2021-03-30 14:58:19 +05:30
fpu_control.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
get-cpuid-feature-leaf.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
get-isa-level.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
hp-timing.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
init-arch.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
isa-level.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
jmp_buf-ssp.sym x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
ldbl2mpn.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ldsodefs.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
libc-start.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
libc-start.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
link_map.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
linkmap.h Rename bits/linkmap.h to linkmap.h (bug 14912). 2015-09-04 19:44:27 +00:00
longjmp.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
Makeconfig Add _Float64x function aliases. 2017-11-27 14:16:47 +00:00
Makefile x86: Test wcscmp RTM in the wcsncmp overflow case [BZ #28896] 2022-02-18 16:35:18 -06:00
string_private.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
sysdep.h x86: Improve L to support L(XXX_SYMBOL (YYY, ZZZ)) 2022-02-05 16:42:17 -08:00
tininess.h Use sysdeps/x86/tininess.h for i386 and x86_64 2012-10-30 20:38:31 -07:00
tst-cet-legacy-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-1a.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-2.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-2a.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-3.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-4.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-4a.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-4b.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-4c.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-5.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-5a.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-5b.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-6.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-6a.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-6b.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-7.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-8.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-9-static.c x86: Properly set usable CET feature bits [BZ #26625] 2021-01-29 03:58:11 -08:00
tst-cet-legacy-9.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-10-static.c x86: Properly set usable CET feature bits [BZ #26625] 2021-01-29 03:58:11 -08:00
tst-cet-legacy-10.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-mod-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-mod-2.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-mod-4.c x86/CET: Add tests with legacy non-CET shared objects 2018-07-25 04:47:05 -07:00
tst-cet-legacy-mod-5.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-mod-5a.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-mod-5b.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-mod-5c.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-mod-6.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-mod-6a.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-mod-6b.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cet-legacy-mod-6c.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-cet-legacy-mod-6d.c Call _dl_open_check after relocation [BZ #24259] 2019-07-01 12:23:22 -07:00
tst-cpu-features-cpuinfo-static.c x86: Add PTWRITE feature detection [BZ #27346] 2021-02-07 08:01:14 -08:00
tst-cpu-features-cpuinfo.c x86: Don't check PTWRITE in tst-cpu-features-cpuinfo.c 2022-02-14 05:53:03 -08:00
tst-cpu-features-supports-static.c x86: Add PTWRITE feature detection [BZ #27346] 2021-02-07 08:01:14 -08:00
tst-cpu-features-supports.c x86: Use CHECK_FEATURE_PRESENT on PCONFIG 2022-02-14 05:53:03 -08:00
tst-get-cpu-features-static.c Add _dl_x86_cpu_features to rtld_global 2015-08-13 03:41:22 -07:00
tst-get-cpu-features.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-ifunc-isa-1-static.c x86: Check ifunc resolver with CPU_FEATURE_USABLE [BZ #27072] 2021-01-21 10:22:26 -08:00
tst-ifunc-isa-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-ifunc-isa-2-static.c x86: Check ifunc resolver with CPU_FEATURE_USABLE [BZ #27072] 2021-01-21 10:22:26 -08:00
tst-ifunc-isa-2.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-ifunc-isa.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-isa-level-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-isa-level-mod-1-baseline.c x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717] 2021-01-07 13:10:13 -08:00
tst-isa-level-mod-1-v2.c x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717] 2021-01-07 13:10:13 -08:00
tst-isa-level-mod-1-v3.c x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717] 2021-01-07 13:10:13 -08:00
tst-isa-level-mod-1-v4.c x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717] 2021-01-07 13:10:13 -08:00
tst-isa-level-mod-1.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-ldbl-nonnormal-printf.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-memchr-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-memcmp-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-memmove-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-memrchr-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-memset-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-setjmp-cet.c x86: Set header.feature_1 in TCB for always-on CET [BZ #27177] 2021-01-13 05:03:34 -08:00
tst-stack-align.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-strchr-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-strcpy-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-string-rtm.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-strlen-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-strncmp-rtm.c x86: Fix fallback for wcsncmp_avx2 in strcmp-avx2.S [BZ #28896] 2022-03-25 11:46:13 -05:00
tst-strrchr-rtm.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-sysconf-cache-linesize-static.c x86: Handle _SC_LEVEL1_ICACHE_LINESIZE [BZ #27444] 2021-03-15 05:43:26 -07:00
tst-sysconf-cache-linesize.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
tst-wcsncmp-rtm.c x86: Test wcscmp RTM in the wcsncmp overflow case [BZ #28896] 2022-02-18 16:35:18 -06:00
Versions <sys/platform/x86.h>: Remove the C preprocessor magic 2021-01-21 05:58:17 -08:00