Noah Goldstein d4386d345e x86: Increase non_temporal_threshold to roughly sizeof_L3 / 4
The current `non_temporal_threshold` is set to roughly
`3/4 * sizeof_L3 / ncores_per_socket`. This patch updates that value
to roughly `sizeof_L3 / 4`.

The original value (specifically the division by `ncores_per_socket`)
was chosen to limit the amount of other threads' data a
`memcpy`/`memset` could evict.

Dividing by `ncores_per_socket`, however, leads to exceedingly low
non-temporal thresholds, so non-temporal stores get used in cases
where REP MOVSB is multiple times faster.

Furthermore, non-temporal stores are written directly to main memory,
so using them at a size much smaller than L3 can place soon-to-be
accessed data much further away than it otherwise could be. In
addition, modern machines are able to detect streaming patterns
(especially if REP MOVSB is used) and provide LRU hints to the memory
subsystem. This in effect caps the total amount of eviction at
1/cache_associativity, far below meaningfully thrashing the entire
cache.

As best I can tell, the benchmarks that led to this small threshold
were done comparing non-temporal stores against standard cacheable
stores. A better comparison (linked below) is against REP MOVSB,
which, on the measured systems, is nearly 2x faster than non-temporal
stores at the low end of the previous threshold, and within 10% for
copies over 100MB (well past even the current threshold). In cases
with a low number of threads competing for bandwidth, REP MOVSB is
~2x faster up to `sizeof_L3`.

The divisor of `4` is a somewhat arbitrary value. From benchmarks it
seems Skylake and Icelake both prefer a divisor of `2`, but older CPUs
such as Broadwell prefer something closer to `8`. This patch is meant
to be followed up by another one to make the divisor cpu-specific, but
in the meantime (and for easier backporting), this patch settles on
`4` as a middle-ground.
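
As a rough illustration of the two formulas discussed above, here is a
minimal C sketch. The function names `old_threshold` and
`new_threshold` are hypothetical and are not taken from the glibc
sources; the actual computation in the x86 sysdeps code may round or
clamp the result differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the previous heuristic described above,
   roughly 3/4 * sizeof_L3 / ncores_per_socket.  */
size_t
old_threshold (size_t l3_bytes, unsigned int ncores_per_socket)
{
  assert (ncores_per_socket > 0);
  return l3_bytes * 3 / 4 / ncores_per_socket;
}

/* Hypothetical helper: the new heuristic, roughly sizeof_L3 / divisor.
   This patch uses a divisor of 4; the message suggests 2 or 8 may suit
   specific CPU generations better.  */
size_t
new_threshold (size_t l3_bytes, unsigned int divisor)
{
  assert (divisor > 0);
  return l3_bytes / divisor;
}
```

For an assumed 32 MiB L3 shared by 16 cores,
`old_threshold ((size_t) 32 << 20, 16)` yields 1572864 bytes (1.5 MiB),
while `new_threshold ((size_t) 32 << 20, 4)` yields 8388608 bytes
(8 MiB), which shows how quickly the per-core division shrinks the
threshold on high-core-count parts.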

Benchmarks comparing non-temporal stores, REP MOVSB, and cacheable
stores were done using:
https://github.com/goldsteinn/memcpy-nt-benchmarks

Sheets results (also available in pdf on the github):
https://docs.google.com/spreadsheets/d/e/2PACX-1vS183r0rW_jRX6tG_E90m9qVuFiMbRIJvi5VAE8yYOvEOIEEc3aSNuEsrFbuXw5c3nGboxMmrupZD7K/pubhtml
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

(cherry picked from commit af992e7abd)
2023-09-11 22:48:28 -05:00
argp Fix a few typos in comments 2019-01-12 13:44:51 +00:00
assert Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
benchtests Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
bits Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
catgets Update copyright dates not handled by scripts/update-copyrights. 2019-01-01 00:15:13 +00:00
ChangeLog.old Add missing reference to bug 21654 2017-10-07 13:14:36 +02:00
conform Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
crypt Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
csu Update copyright dates not handled by scripts/update-copyrights. 2019-01-01 00:15:13 +00:00
ctype Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
debug debug: Mark libSegFault.so as NODELETE 2023-07-21 16:41:11 +02:00
dirent Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
dlfcn dlfcn: Guard __dlerror_main_freeres with __libc_once_get (once) [BZ#24476] 2019-05-16 14:54:23 +02:00
elf Fix SXID_ERASE behavior in setuid programs (BZ #27471) 2021-04-14 11:08:02 +05:30
gmon Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
gnulib Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
grp Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
gshadow Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
hesiod Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
htl Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
hurd hurd: Fix initial sigaltstack state 2019-01-24 19:27:00 +01:00
iconv Update copyright dates not handled by scripts/update-copyrights. 2019-01-01 00:15:13 +00:00
iconvdata gconv: Fix assertion failure in ISO-2022-JP-3 module (bug 27256) 2021-01-27 15:22:18 +01:00
include elf: Refuse to dlopen PIE objects [BZ #24323] 2019-10-31 19:29:35 -04:00
inet Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
intl intl: Do not return NULL on asprintf failure in gettext [BZ #24018] 2019-01-02 16:26:58 +01:00
io io: Remove copy_file_range emulation [BZ #24744] 2019-07-09 10:01:21 +02:00
libio libio: Disable vtable validation for pre-2.1 interposed handles [BZ #25203] 2019-11-28 14:17:27 +01:00
locale Update copyright dates not handled by scripts/update-copyrights. 2019-01-01 00:15:13 +00:00
localedata ja_JP locale: Add entry for the new Japanese era [BZ #22964] 2019-04-03 19:42:20 +02:00
login Update copyright dates not handled by scripts/update-copyrights. 2019-01-01 00:15:13 +00:00
mach Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
malloc Base max_fast on alignment, not width, of bins (Bug 24903) 2019-11-18 16:10:51 +01:00
manual Add glibc.malloc.mxfast tunable 2019-11-18 16:08:44 +01:00
math Add XFAIL_ROUNDING_IBM128_LIBGCC to more fma() tests 2019-01-15 16:35:10 -02:00
mathvec Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
misc Fix a few typos in comments 2019-01-12 13:44:51 +00:00
nis Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
nptl Fix alignment of TLS variables for tls variant TLS_TCB_AT_TP [BZ #23403] 2019-11-05 14:36:16 -05:00
nptl_db Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
nscd nscd: Fix double free in netgroupcache [BZ #27462] 2021-03-08 15:42:15 +05:30
nss nss_compat: internal_end*ent may clobber errno, hiding ERANGE [BZ #25976] 2020-05-19 16:33:04 +02:00
po Update translations 2019-01-25 22:05:42 +05:30
posix Fix use-after-free in glob when expanding ~user (bug 25414) 2020-03-17 21:51:11 -04:00
pwd Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
resolv CVE-2016-10739: getaddrinfo: Fully parse IPv4 address strings [BZ #20018] 2019-01-21 21:26:03 +01:00
resource Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
rt Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
scripts Use a proper C tokenizer to implement the obsolete typedefs test. 2019-06-05 14:15:01 +02:00
setjmp Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
shadow Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
signal Disable lazy binding on tests for minimal signal handler 2019-01-18 08:56:51 -08:00
socket Fix a few typos in comments 2019-01-12 13:44:51 +00:00
soft-fp soft-fp: Properly check _FP_W_TYPE_SIZE [BZ #24066] 2019-01-07 09:04:39 -08:00
stdio-common Add test for bug 29530 2022-08-30 11:07:43 +02:00
stdlib support: Add capability to fork an sgid child 2021-04-14 11:07:45 +05:30
streams Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
string x86: Fix wcsnlen-avx2 page cross length comparison [BZ #29591] 2022-11-24 18:05:33 -08:00
sunrpc Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
support support: Add xsetlocale function 2022-08-30 11:07:43 +02:00
sysdeps x86: Increase non_temporal_threshold to roughly sizeof_L3 / 4 2023-09-11 22:48:28 -05:00
sysvipc Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
termios Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
time strftime: Pass the additional flags from "%EY" to "%Ey" [BZ #24096] 2019-01-24 23:04:12 +09:00
timezone Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
wcsmbs Use C99-compliant scanf under _GNU_SOURCE with modern compilers. 2019-01-03 11:12:39 -05:00
wctype Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
.gitattributes Assume __NR_openat is always defined 2016-03-23 23:35:08 +01:00
.gitignore Add *.pyc to .gitignore 2015-05-18 15:26:26 +05:30
abi-tags Remove the bulk of the NaCl port. 2017-05-20 08:09:10 -04:00
aclocal.m4 LIBC_SLIBDIR_RTLDDIR: substitute arguments in single quotes 2018-01-25 17:20:28 +01:00
ChangeLog riscv: Do not use __has_include__ 2020-01-21 13:42:47 +01:00
config.h.in Add C-SKY port 2018-12-21 09:48:04 +08:00
config.make.in Fix ifunc support with DT_TEXTREL segments (BZ#20480) 2018-09-25 16:27:50 -03:00
configure x86: Assume --enable-cet if GCC defaults to CET [BZ #25225] 2019-12-03 21:08:49 +01:00
configure.ac x86: Assume --enable-cet if GCC defaults to CET [BZ #25225] 2019-12-03 21:08:49 +01:00
COPYING Update to latest versions of GPL-2.0 and LGPL-2.1 2013-09-09 12:52:48 +10:00
COPYING.LIB Update to latest versions of GPL-2.0 and LGPL-2.1 2013-09-09 12:52:48 +10:00
extra-lib.mk Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
gen-locales.mk Improve gen-locales.mk and gen-locale.sh to make test files with @ options work 2018-02-27 17:01:57 +01:00
INSTALL Prepare for 2.29 release 2019-01-31 22:01:21 +05:30
libc-abis libc-abis: Define ABSOLUTE ABI [BZ #19818][BZ #23307] 2018-07-05 18:06:43 +01:00
libof-iterator.mk Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
LICENSES stdio-common/tst-printf.c: Remove part under a non-free license [BZ #23363] 2018-07-03 18:29:16 +02:00
MAINTAINERS Add MAINTAINERS 2017-05-11 13:38:30 -04:00
Makeconfig Only build libm with -fno-math-errno (bug 24024) 2019-01-07 14:59:07 +01:00
Makefile test-container: Install with $(all-subdirs) [BZ #24794] 2022-01-27 07:28:12 -08:00
Makefile.in New make target to only build benchmark binaries 2016-04-20 10:23:28 +05:30
Makerules Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
NEWS Remove most vfprintf width/precision-dependent allocations (bug 14231, bug 26211). 2022-08-30 11:07:36 +02:00
o-iterator.mk
README Add C-SKY port 2018-12-21 09:48:04 +08:00
Rules Use a proper C tokenizer to implement the obsolete typedefs test. 2019-06-05 14:15:01 +02:00
shlib-versions Extend NSS test suite 2017-07-17 15:52:44 -04:00
test-skeleton.c Update copyright dates with scripts/update-copyrights. 2019-01-01 00:11:28 +00:00
version.h Tag 2.29 release 2019-01-31 22:15:36 +05:30

This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.

The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system.  It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.

In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.

The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu.

When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.

Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.

The GNU C Library supports these configurations for using Linux kernels:

	aarch64*-*-linux-gnu
	alpha*-*-linux-gnu
	arm-*-linux-gnueabi
	csky-*-linux-gnuabiv2
	hppa-*-linux-gnu
	i[4567]86-*-linux-gnu
	x86_64-*-linux-gnu	Can build either x86_64 or x32
	ia64-*-linux-gnu
	m68k-*-linux-gnu
	microblaze*-*-linux-gnu
	mips-*-linux-gnu
	mips64-*-linux-gnu
	powerpc-*-linux-gnu	Hardware or software floating point, BE only.
	powerpc64*-*-linux-gnu	Big-endian and little-endian.
	s390-*-linux-gnu
	s390x-*-linux-gnu
	riscv64-*-linux-gnu
	sh[34]-*-linux-gnu
	sparc*-*-linux-gnu
	sparc64*-*-linux-gnu

If you are interested in doing a port, please contact the glibc
maintainers; see http://www.gnu.org/software/libc/ for more
information.

See the file INSTALL to find out how to configure, build, and install
the GNU C Library.  You might also consider reading the WWW pages for
the C library at http://www.gnu.org/software/libc/.

The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory.  The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like.  For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below.  Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.

Please see http://www.gnu.org/software/libc/bugs.html for bug reporting
information.  We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.

The GNU C Library is free software.  See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed.  License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.