glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-27 11:43:34 +08:00

Author	SHA1	Message	Date
Stafford Horne	b57adfa49b	or1k: Add hard float libm-test-ulps This patch adds the ulps test file to prepare for the upcoming hard float patch. This is separated out to make the hard float patch smaller. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-05-03 18:28:18 +01:00
Gabi Falk	5a2cf833f5	i686: Fix multiple definitions of __memmove_chk and __memset_chk Commit `c73c96a4a1` updated memcpy.S and mempcpy.S, but omitted memmove.S and memset.S. As a result, the static library built as PIC, whether with or without multiarch support, contains two definitions for each of the __memmove_chk and __memset_chk symbols. /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../i686-pc-linux-gnu/bin/ld: /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset-ia32.o): in function `__memset_chk': /var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/string/../sysdeps/i386/i686/memset.S:32: multiple definition of `__memset_chk'; /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset_chk.o):/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/debug/../sysdeps/i386/i686/multiarch/memset_chk.c:24: first defined here After this change, regardless of PIC options, the static library, built for i686 with multiarch contains implementations of these functions respectively from debug/memmove_chk.c and debug/memset_chk.c, and without multiarch contains implementations of these functions respectively from sysdeps/i386/memmove_chk.S and sysdeps/i386/memset_chk.S. This ensures that memmove and memset won't pull in __chk_fail and the routines it calls. Reported-by: Sam James <sam@gentoo.org> Tested-by: Sam James <sam@gentoo.org> Fixes: `c73c96a4a1` ("i686: Fix build with --disable-multiarch") Signed-off-by: Gabi Falk <gabifalk@gmx.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>	2024-05-02 11:51:10 +01:00
Gabi Falk	0fdf4ba48c	i586: Fix multiple definitions of __memcpy_chk and __mempcpy_chk /home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy_chk.o): in function `__memcpy_chk': /home/bmg/src/glibc/debug/../sysdeps/i386/memcpy_chk.S:29: multiple definition of `__memcpy_chk';/home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here /home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy_chk.o): in function `__mempcpy_chk': /home/bmg/src/glibc/debug/../sysdeps/i386/mempcpy_chk.S:28: multiple definition of `__mempcpy_chk'; /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here After this change, the static library built for i586, regardless of PIC options, contains implementations of these functions respectively from sysdeps/i386/memcpy_chk.S and sysdeps/i386/mempcpy_chk.S. This ensures that memcpy and mempcpy won't pull in __chk_fail and the routines it calls. Reported-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Gabi Falk <gabifalk@gmx.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>	2024-05-02 11:50:21 +01:00
Carlos O'Donell	91695ee459	time: Allow later version licensing. The FSF's Licensing and Compliance Lab noted a discrepancy in the licensing of several files in the glibc package. When timespec_get.c was implemented the license did not include the standard ", or (at your option) any later version." text. Change the license in timespec_get.c and all copied files to match the expected license. This change was previously approved in principle by the FSF in RT ticket #1316403. And a similar instance was fixed in commit `46703efa02`.	2024-05-01 09:03:26 -04:00
Wilco Dijkstra	6dae61567f	AArch64: Remove unused defines of CPU names Remove unused defines of CPU names in cpu-features.h. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-04-30 13:32:29 +01:00
Florian Weimer	b62928f907	x86: In ld.so, diagnose missing APX support in APX-only builds At this point, this is mainly a tool for testing the early ld.so CPU compatibility diagnostics: GCC uses the new instructions in most functions, so it's easy to spot if some of the early code is not built correctly. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-25 17:20:28 +02:00
Florian Weimer	3a3a449742	i386: ulp update for SSE2 --disable-multi-arch configurations	2024-04-25 12:56:48 +02:00
H.J. Lu	46c9997413	x86: Define MINIMUM_X86_ISA_LEVEL in config.h [BZ #31676 ] Define MINIMUM_X86_ISA_LEVEL at configure time to avoid /usr/bin/ld: …/build/elf/librtld.os: in function `init_cpu_features': …/git/elf/../sysdeps/x86/cpu-features.c:1202: undefined reference to `_dl_runtime_resolve_fxsave' /usr/bin/ld: …/build/elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object /usr/bin/ld: final link failed: bad value collect2: error: ld returned 1 exit status when glibc is built with -march=x86-64-v3 and configured with --with-rtld-early-cflags=-march=x86-64, which is used to allow ld.so to print an error message on unsupported CPUs: Fatal glibc error: CPU does not support x86-64-v3 This fixes BZ #31676. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-04-24 04:50:56 -07:00
caiyinyu	095067efdf	LoongArch: Add glibc.cpu.hwcap support. The current IFUNC selection is always using the most recent features which are available via AT_HWCAP. But in some scenarios it is useful to adjust this selection. The environment variable: GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,zzz,.... can be used to enable HWCAP feature yyy, disable HWCAP feature xxx, where the feature name is case-sensitive and has to match the ones used in sysdeps/loongarch/cpu-tunables.c. Signed-off-by: caiyinyu <caiyinyu@loongson.cn>	2024-04-24 18:22:38 +08:00
Florian Weimer	f4724843ad	nptl: Fix tst-cancel30 on kernels without ppoll_time64 support Fall back to ppoll if ppoll_time64 fails with ENOSYS. Fixes commit `370da8a121` ("nptl: Fix tst-cancel30 on sparc64"). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-04-23 21:16:32 +02:00
Florian Weimer	5361ad3910	login: Use unsigned 32-bit types for seconds-since-epoch These fields store timestamps when the system was running. No Linux systems existed before 1970, so these values are unused. Switching to unsigned types allows continued use of the existing struct layouts beyond the year 2038. The intent is to give distributions more time to switch to improved interfaces that also avoid locking/data corruption issues. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-04-19 14:38:17 +02:00
Florian Weimer	9abdae94c7	login: structs utmp, utmpx, lastlog _TIME_BITS independence (bug 30701) These structs describe file formats under /var/log, and should not depend on the definition of _TIME_BITS. This is achieved by defining __WORDSIZE_TIME64_COMPAT32 to 1 on 32-bit ports that support 32-bit time_t values (where __time_t is 32 bits). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-04-19 14:38:17 +02:00
Florian Weimer	4d4da5aab9	login: Check default sizes of structs utmp, utmpx, lastlog The default <utmp-size.h> is for ports with a 64-bit time_t. Ports with a 32-bit time_t or with __WORDSIZE_TIME64_COMPAT32=1 need to override it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-04-19 14:38:17 +02:00
Florian Weimer	14e56bd4ce	powerpc: Fix ld.so address determination for PCREL mode (bug 31640) This seems to have stopped working with some GCC 14 versions, which clobber r2. With other compilers, the kernel-provided r2 value is still available at this point. Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-04-14 08:24:51 +02:00
Florian Weimer	aea52e3d2b	Revert "x86_64: Suppress false positive valgrind error" This reverts commit `a1735e0aa8`. The test failure is a real valgrind bug that needs to be fixed before valgrind is usable with a glibc that has been built with CC="gcc -march=x86-64-v3". The proposed valgrind patch teaches valgrind to replace ld.so strcmp with an unoptimized scalar implementation, thus avoiding any AVX2-related problems. Valgrind bug: <https://bugs.kde.org/show_bug.cgi?id=485487> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-13 17:42:13 +02:00
Adhemerval Zanella	686d542025	posix: Sync tempname with gnulib The gnulib version contains an important change (9ce573cde), which fixes some problems with multithreading, entropy loss, and ASLR leak info. It also fixes an issue where getrandom is not being used on some new files generation (only for __GT_NOCREATE on first try). The 044bf893ac removed __path_search, which is now moved to other gnulib shared files (stdio-common/tmpdir.{c,h}). This patch also fixes direxists to use __stat64_time64 instead of __xstat64, and moves the include of pathmax.h for !_LIBC (since it is not used by glibc). The license is also changed from GPL 3.0 to 2.1, with permission from the authors (Bruno Haible and Paul Eggert). The sync also removed the clock fallback, since clock_gettime with CLOCK_REALTIME is expected to always succeed. It syncs with gnulib commit 323834962817af7b115187e8c9a833437f8d20ec. Checked on x86_64-linux-gnu. Co-authored-by: Bruno Haible <bruno@clisp.org> Co-authored-by: Paul Eggert <eggert@cs.ucla.edu> Reviewed-by: Bruno Haible <bruno@clisp.org>	2024-04-10 14:53:39 -03:00
Florian Weimer	f8d8b1b1e6	aarch64: Enhanced CPU diagnostics for ld.so This prints some information from struct cpu_features, and the midr_el1 and dczid_el0 system register contents on every CPU. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-08 16:48:55 +02:00
Florian Weimer	7a430f40c4	x86: Add generic CPUID data dumper to ld.so --list-diagnostics This is surprisingly difficult to implement if the goal is to produce reasonably sized output. With the current approaches to output compression (suppressing zeros and repeated results between CPUs, folding ranges of identical subleaves, dealing with the %ecx reflection issue), the output is less than 600 KiB even for systems with 256 logical CPUs. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-08 16:48:55 +02:00
Florian Weimer	5653ccd847	elf: Add CPU iteration support for future use in ld.so diagnostics Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-08 16:48:55 +02:00
H.J. Lu	9e1f4aef86	x86-64: Exclude FMA4 IFUNC functions for -mapxf When -mapxf is used to build glibc, the resulting glibc will never run on FMA4 machines. Exclude FMA4 IFUNC functions when -mapxf is used. This requires GCC which defines __APX_F__ for -mapxf with commit: 1df56719bd8 x86: Define __APX_F__ for -mapxf Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-04-06 05:03:55 -07:00
Adhemerval Zanella	c27f8763cf	Reinstate generic features-time64.h The `a4ed0471d7` removed the generic version which is included by features.h and used by Hurd. Checked by building i686-gnu and x86_64-gnu with build-many-glibc.py.	2024-04-05 09:02:36 -03:00
Adhemerval Zanella	460d9e2dfe	Cleanup __tls_get_addr on alpha/microblaze localplt.data They are not required. Checked with a make check for both ABIs.	2024-04-04 17:20:33 -03:00
Adhemerval Zanella	95700e7998	arm: Remove ld.so __tls_get_addr plt usage Use the hidden alias instead. Checked on arm-linux-gnueabihf.	2024-04-04 17:03:32 -03:00
Adhemerval Zanella	50c2be2390	aarch64: Remove ld.so __tls_get_addr plt usage Use the hidden alias instead. Checked on aarch64-linux-gnu.	2024-04-04 17:02:32 -03:00
Adhemerval Zanella	44ccc2465c	math: x86 trunc traps when FE_INEXACT is enabled (BZ 31603) The implementations of trunc functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Adhemerval Zanella	932544efa4	math: x86 floor traps when FE_INEXACT is enabled (BZ 31601) The implementations of floor functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Adhemerval Zanella	637bfc392f	math: x86 ceill traps when FE_INEXACT is enabled (BZ 31600) The implementations of ceil functions using x87 floating point (i386 and x86_64 long double only) traps when FE_INEXACT is enabled. Although this is a GNU extension outside the scope of the C standard, other architectures that also support traps do not show this behavior. The fix moves the implementation to a common one that holds any exceptions with a 'fnclex' (libc_feholdexcept_setround_387). Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-04-04 14:29:28 -03:00
Joe Ramsay	87cb1dfcd6	aarch64/fpu: Add vector variants of erfc Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:24 +01:00
Joe Ramsay	3d3a4fb8e4	aarch64/fpu: Add vector variants of tanh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:20 +01:00
Joe Ramsay	eedbbca0bf	aarch64/fpu: Add vector variants of sinh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:16 +01:00
Joe Ramsay	8b67920528	aarch64/fpu: Add vector variants of atanh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:12 +01:00
Joe Ramsay	81406ea3c5	aarch64/fpu: Add vector variants of asinh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:02 +01:00
Joe Ramsay	b09fee1d21	aarch64/fpu: Add vector variants of acosh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:32:58 +01:00
Joe Ramsay	bdb5705b7b	aarch64/fpu: Add vector variants of cosh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:32:52 +01:00
Joe Ramsay	cb5d84f1f8	aarch64/fpu: Add vector variants of erf Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:32:48 +01:00
Stafford Horne	3db9d208dd	misc: Add support for Linux uio.h RWF_NOAPPEND flag In Linux 6.9 a new flag is added to allow for Per-io operations to disable append mode even if a file was opened with the flag O_APPEND. This is done with the new RWF_NOAPPEND flag. This caused two test failures as these tests expected the flag 0x00000020 to be unused. Adding the flag definition now fixes these tests on Linux 6.9 (v6.9-rc1). FAIL: misc/tst-preadvwritev2 FAIL: misc/tst-preadvwritev64v2 This patch adds the flag, adjusts the test and adds details to documentation. Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/ Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-04-04 09:41:27 +01:00
Adhemerval Zanella	4dcd674b66	powerpc: Add missing arch flags on rounding ifunc variants The ifunc variants now use the powerpc implementation which in turn uses the compiler builtin. Without the proper -mcpu switch the builtin does not generate the expected optimization. Checked on powerpc-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-04-02 15:49:31 -03:00
Adhemerval Zanella	a4ed0471d7	Always define __USE_TIME_BITS64 when 64 bit time_t is used It was raised on libc-help [1] that some Linux kernel interfaces expect the libc to define __USE_TIME_BITS64 to indicate the time_t size for the kABI. Different than defined by the initial y2038 design document [2], the __USE_TIME_BITS64 is only defined for ABIs that support more than one time_t size (by defining the _TIME_BITS for each module). The 64 bit time_t redirects are now enabled using a different internal define (__USE_TIME64_REDIRECTS). There is no expected change in semantic or code generation. Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and arm-linux-gnueabi [1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html [2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign Reviewed-by: DJ Delorie <dj@redhat.com>	2024-04-02 15:28:36 -03:00
Adhemerval Zanella	721314c980	x86_64: Remove avx512 strstr implementation As indicated in a recent thread, this is a simple brute-force algorithm that checks the whole needle at a matching character pair (and does so 1 byte at a time after the first 64 bytes of a needle). Also it never skips ahead and thus can match at every haystack position after trying to match all of the needle, which the generic implementation avoids. As indicated by Wilco, a 4x larger needle and 16x larger haystack gives a clear 65x slowdown for both basic_strstr and __strstr_avx512: "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512", "__strstr_sse2_unaligned", "__strstr_generic"], { "len_haystack": 65536, "len_needle": 1024, "align_haystack": 0, "align_needle": 0, "fail": 1, "desc": "Difficult bruteforce needle", "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2] }, { "len_haystack": 1048576, "len_needle": 4096, "align_haystack": 0, "align_needle": 0, "fail": 1, "desc": "Difficult bruteforce needle", "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9] } PS: I don't have an AVX512 capable machine to verify this issue, but skimming through the code it does seem to follow what Wilco has described. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-03-27 13:48:16 -03:00
Adhemerval Zanella	2e53eb9234	signal: Avoid system signal disposition to interfere with tests Both tst-sigset2 and tst-signal1 expect that SIGINT disposition is set to SIG_DFL.	2024-03-27 13:47:09 -03:00
Palmer Dabbelt	96d1b9ac23	RISC-V: Fix the static-PIE non-relocated object check The value of l_scope is only valid post relocation, so this original check was triggering undefined behavior. Instead just directly check to see if the object has been relocated, at which point using l_scope is safe. Reported-by: Andreas Schwab <schwab@suse.de> Closes: BZ #31317 Fixes: `e0590f41fe` ("RISC-V: Enable static-pie.") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-25 15:17:13 +01:00
Sergey Bugaev	dc1a77269c	htl: Implement some support for TLS_DTV_AT_TP Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-19-bugaevc@gmail.com>	2024-03-23 23:00:30 +01:00
Sergey Bugaev	a4273efa21	htl: Respect GL(dl_stack_flags) when allocating stacks Previously, HTL would always allocate non-executable stacks. This has never been noticed, since GNU Mach on x86 ignores VM_PROT_EXECUTE and makes all pages implicitly executable. Since GNU Mach on AArch64 supports non-executable pages, HTL forgetting to pass VM_PROT_EXECUTE immediately breaks any code that (unfortunately, still) relies on executable stacks. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-7-bugaevc@gmail.com>	2024-03-23 22:48:44 +01:00
Sergey Bugaev	b467cfcaee	hurd: Use the RETURN_ADDRESS macro This gives us PAC stripping on AArch64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-6-bugaevc@gmail.com>	2024-03-23 22:48:01 +01:00
Sergey Bugaev	6afeac1289	hurd: Disable Prefer_MAP_32BIT_EXEC on non-x86_64 for now While we could support it on any architecture, the tunable is currently only defined on x86_64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-5-bugaevc@gmail.com>	2024-03-23 22:47:46 +01:00
Sergey Bugaev	7f02511e5b	hurd: Move internal functions to internal header Move _hurd_self_sigstate (), _hurd_critical_section_lock (), and _hurd_critical_section_unlock () inline implementations (that were already guarded by #if defined _LIBC) to the internal version of the header. While at it, add <tls.h> to the includes, and use __LIBC_NO_TLS () unconditionally. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240323173301.151066-2-bugaevc@gmail.com>	2024-03-23 22:43:07 +01:00
Stafford Horne	ad05a42370	or1k: Add prctl wrapper to unwrap variadic args On OpenRISC variadic functions and regular functions have different calling conventions so this wrapper is needed to translate. This wrapper is copied from x86_64/x32. I don't know the build system enough to find a cleaner way to share the code between x86_64/x32 and or1k (maybe Implies?), so I went with the straight copy. This fixes test failures: misc/tst-prctl nptl/tst-setgetname	2024-03-22 15:43:34 +00:00
Stafford Horne	df7e29e2a4	or1k: Only define fpu rounding and exceptions with hard-float This test failure: math/test-fenv If rounding mode and exception macros are defined then the fenv tests run and always fail. This patch adds an ifdef using the __or1k_hard_float__ macro provided by gcc to avoid defining these fenv macros when they cannot be used. This is similar to what is done in csky. Note, I will post the or1k hard-float support soon. So, I prefer to leave the hard-float bits here for now.	2024-03-22 15:43:34 +00:00
Stafford Horne	2e982a3937	or1k: Update libm test ulps To fix test failures: FAIL: math/test-float-hypot FAIL: math/test-float32-hypot	2024-03-22 15:43:34 +00:00
Wilco Dijkstra	2e94e2f5d2	AArch64: Check kernel version for SVE ifuncs Old Linux kernels disable SVE after every system call. Calling the SVE-optimized memcpy afterwards will then cause a trap to reenable SVE. As a result, applications with a high use of syscalls may run slower with the SVE memcpy. This is true for kernels between 4.15.0 and before 6.2.0, except for 5.14.0 which was patched. Avoid this by checking the kernel version and selecting the SVE ifunc on modern kernels. Parse the kernel version reported by uname() into a 24-bit kernel.major.minor value without calling any library functions. If uname() is not supported or if the version format is not recognized, assume the kernel is modern. Tested-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-03-21 16:50:51 +00:00
Amrita H S	1ea0511456	powerpc: Placeholder and infrastructure/build support to add Power11 related changes. The following three changes have been added to provide initial Power11 support. 1. Add the directories to hold Power11 files. 2. Add support to select Power11 libraries based on AT_PLATFORM. 3. Let submachine=power11 be set automatically. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-03-19 21:11:34 -05:00
Manjunath Matti	3ab9b88e2a	powerpc: Add HWCAP3/HWCAP4 data to TCB for Power Architecture. This patch adds a new feature for powerpc. In order to get faster access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for implementing __builtin_cpu_supports() in GCC) without the overhead of reading them from the auxiliary vector, we now reserve space for them in the TCB. This is an ABI change for GLIBC 2.39. Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-03-19 17:19:27 -05:00
Adhemerval Zanella	3d53d18fc7	elf: Enable TLS descriptor tests on aarch64 The aarch64 uses 'trad' for traditional tls and 'desc' for tls descriptors, but unlike other targets it defaults to 'desc'. The gnutls2 configure check does not set aarch64 as an ABI that uses TLS descriptors, which then disables some tests. Also rename the internal machinery from gnu2 to tls descriptors. Checked on aarch64-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-03-19 14:53:30 -03:00
Adhemerval Zanella	64c7e34428	arm: Update _dl_tlsdesc_dynamic to preserve caller-saved registers (BZ 31372) ARM _dl_tlsdesc_dynamic slow path has two issues: * The ip/r12 is defined by AAPCS as a scratch register, and gcc uses it to save the stack pointer before some function calls. So it should be saved/restored as well. It fixes the tst-gnu2-tls2. * None of the possible VFP registers are saved/restored. ARM has the additional complexity to have different VFP bank sizes (depending on VFP support by the chip). The tst-gnu2-tls2 test is extended to check for VFP registers, although only for hardfp builds. Different than setcontext, _dl_tlsdesc_dynamic does not have HWCAP_ARM_IWMMXT (I don't have a way to properly test it and it is almost a decade since newer hardware was released). With this patch there is no need to mark tst-gnu2-tls2 as XFAIL. Checked on arm-linux-gnueabihf. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-03-19 14:53:30 -03:00
Andreas Schwab	fd7ee2e6c5	Add tst-gnu2-tls2mod1 to test-internal-extras That allows sysdeps/x86_64/tst-gnu2-tls2mod1.S to use internal headers. Fixes: `717ebfa85c` ("x86-64: Allocate state buffer space for RDI, RSI and RBX")	2024-03-19 14:28:28 +01:00
H.J. Lu	717ebfa85c	x86-64: Allocate state buffer space for RDI, RSI and RBX _dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack. After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11. Define TLSDESC_CALL_REGISTER_SAVE_AREA to allocate space for RDI, RSI and RBX to avoid clobbering saved RDI, RSI and RBX values on stack by xsave to STATE_SAVE_OFFSET(%rsp). +==================+<- stack frame start aligned at 8 or 16 bytes \| \|<- RDI saved in the red zone \| \|<- RSI saved in the red zone \| \|<- RBX saved in the red zone \| \|<- paddings for stack realignment of 64 bytes \|------------------\|<- xsave buffer end aligned at 64 bytes \| \|<- \| \|<- \| \|<- \|------------------\|<- xsave buffer start at STATE_SAVE_OFFSET(%rsp) \| \|<- 8-byte padding for 64-byte alignment \| \|<- 8-byte padding for 64-byte alignment \| \|<- R11 \| \|<- R10 \| \|<- R9 \| \|<- R8 \| \|<- RDX \| \|<- RCX +==================+<- RSP aligned at 64 bytes Define TLSDESC_CALL_REGISTER_SAVE_AREA, the total register save area size for all integer registers by adding 24 to STATE_SAVE_OFFSET since RDI, RSI and RBX are saved onto stack without adjusting stack pointer first, using the red-zone. This fixes BZ #31501. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-03-18 19:45:13 -07:00
Darius Rad	f44f3aed31	riscv: Update nofpu libm test ulps Fix two test failures. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-03-18 11:28:50 +01:00
Florian Weimer	7a76f21867	linux: Use rseq area unconditionally in sched_getcpu (bug 31479) Originally, nptl/descr.h included <sys/rseq.h>, but we removed that in commit `2c6b4b272e` ("nptl: Unconditionally use a 32-byte rseq area"). After that, it was not ensured that the RSEQ_SIG macro was defined during sched_getcpu.c compilation that provided a definition. This commit always checks the rseq area for CPU number information before using the other approaches. This adds an unnecessary (but well-predictable) branch on architectures which do not define RSEQ_SIG, but its cost is small compared to the system call. Most architectures that have vDSO acceleration for getcpu also have rseq support. Fixes: `2c6b4b272e` Fixes: `1d350aa060` Reviewed-by: Arjun Shankar <arjun@redhat.com>	2024-03-15 19:08:24 +01:00
Szabolcs Nagy	73c26018ed	aarch64: fix check for SVE support in assembler Due to GCC bug 110901 -mcpu can override -march setting when compiling asm code and thus a compiler targeting a specific cpu can fail the configure check even when binutils gas supports SVE. The workaround is that explicit .arch directive overrides both -mcpu and -march, and since that's what the actual SVE memcpy uses the configure check should use that too even if the GCC issue is fixed independently. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-03-14 14:27:56 +00:00
Joseph Myers	2367bf468c	Update kernel version to 6.8 in header constant tests This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py and tst-pidfd-consts.py to 6.8. (There are no new constants covered by these tests in 6.8 that need any other header changes.) Tested with build-many-glibcs.py.	2024-03-13 19:46:21 +00:00
Joseph Myers	3de2f8755c	Update syscall lists for Linux 6.8 Linux 6.8 adds five new syscalls. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py.	2024-03-13 13:57:56 +00:00
Adhemerval Zanella	4a76fb1da8	powerpc: Remove power8 strcasestr optimization Similar to strstr (`1e9a550ba4`), power8 strcasestr does not show much improvement compared to the generic implementation. The geomean on bench-strcasestr shows: __strcasestr_power8 __strcasestr_ppc power10 1159 1120 power9 1640 1469 power8 1787 1904 The strcasestr uses the same 'trick' as power7 strstr to detect potential quadratic behavior, which only adds overheads for input that trigger quadratic behavior and it is really a hack. Checked on powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-03-12 17:11:01 -03:00
Adhemerval Zanella	2149da3683	riscv: Fix alignment-ignorant memcpy implementation The memcpy optimization (commit `587a1290a1`) has a series of mistakes: - The implementation is wrong: the chunk size calculation is wrong leading to invalid memory access. - It adds ifunc support by default, so --disable-multi-arch does not work as expected for riscv. - It mixes Linux files (memcpy ifunc selection which requires the vDSO/syscall mechanism) with generic support (the memcpy optimization itself). - There is no __libc_ifunc_impl_list, which makes testing only check the selected implementation instead of all supported by the system. This patch also simplifies the required bits to enable ifunc: there is no need for memcopy.h, nor to add Linux-specific files. The __memcpy_noalignment tail handling now uses a branchless strategy similar to aarch64 (overlapping 32-bit copies for sizes 4..7 and byte copies for sizes 1..3). Checked on riscv64 and riscv32 by explicitly enabling the function on __libc_ifunc_impl_list on qemu-system. Changes from v1: * Implement the memcpy in assembly to correctly handle RISCV strict-alignment. Reviewed-by: Evan Green <evan@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-12 14:38:08 -03:00
Andreas Schwab	2173173d57	linux/sigsetops: fix type confusion (bug 31468) Each mask in the sigset array is an unsigned long, so fix __sigisemptyset to use that instead of int. The __sigword function returns a simple array index, so it can return int instead of unsigned long.	2024-03-12 10:00:22 +01:00
caiyinyu	aeee41f1cf	LoongArch: Correct {__ieee754, _}_scalb -> {__ieee754, _}_scalbf	2024-03-12 14:07:27 +08:00
Sunil K Pandey	b6e3898194	x86-64: Simplify minimum ISA check ifdef conditional with if Replace minimum ISA check ifdef conditional with if. Since MINIMUM_X86_ISA_LEVEL and AVX_X86_ISA_LEVEL are compile time constants, compiler will perform constant folding optimization, getting same results. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-03-03 15:47:53 -08:00
Evan Green	587a1290a1	riscv: Add and use alignment-ignorant memcpy For CPU implementations that can perform unaligned accesses with little or no performance penalty, create a memcpy implementation that does not bother aligning buffers. It will use a block of integer registers, a single integer register, and fall back to bytewise copy for the remainder. Signed-off-by: Evan Green <evan@rivosinc.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-01 07:15:01 -08:00
Evan Green	a2b47f7d46	riscv: Add ifunc helper method to hwprobe.h Add a little helper method so it's easier to fetch a single value from the hwprobe function when used within an ifunc selector. Signed-off-by: Evan Green <evan@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-01 07:15:00 -08:00
Evan Green	a29bb320a1	riscv: Enable multi-arg ifunc resolvers RISC-V is apparently the first architecture to pass more than one argument to ifunc resolvers. The helper macros in libc-symbols.h, __ifunc_resolver(), __ifunc(), and __ifunc_hidden(), are incompatible with this. These macros have an "arg" (non-final) parameter that represents the parameter signature of the ifunc resolver. The result is an inability to pass the required comma through in a single preprocessor argument. Rearrange the __ifunc_resolver() macro to be variadic, and pass the types as those variable parameters. Move the guts of __ifunc() and __ifunc_hidden() into new macros, __ifunc_args(), and __ifunc_args_hidden(), that pass the variable arguments down through to __ifunc_resolver(). Then redefine __ifunc() and __ifunc_hidden(), which are used in a bunch of places, to simply shuffle the arguments down into __ifunc_args[_hidden]. Finally, define a riscv-ifunc.h header, which provides convenience macros to those looking to write ifunc selectors that use both arguments. Signed-off-by: Evan Green <evan@rivosinc.com> Reviewed-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-01 07:14:59 -08:00
Evan Green	78308ce77a	riscv: Add __riscv_hwprobe pointer to ifunc calls The new __riscv_hwprobe() function is designed to be used by ifunc selector functions. This presents a challenge for applications and libraries, as ifunc selectors are invoked before all relocations have been performed, so an external call to __riscv_hwprobe() from an ifunc selector won't work. To address this, pass a pointer to the __riscv_hwprobe() function into ifunc selectors as the second argument (alongside dl_hwcap, which was already being passed). Include a typedef as well for convenience, so that ifunc users don't have to go through contortions to call this routine. Users will need to remember to check the second argument for NULL, to account for older glibcs that don't pass the function. Signed-off-by: Evan Green <evan@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-01 07:14:58 -08:00
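The NULL check that the commit above asks ifunc users to remember might look roughly like the following sketch. Everything in it is a hypothetical stand-in: the `demo_` types, the key and value constants, and the simplified two-argument probe signature are assumptions made for illustration and do not reproduce the real `__riscv_hwprobe()` interface or glibc's typedef.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key/value pair and probe signature (simplified). */
struct demo_probe_pair { int64_t key; uint64_t value; };
typedef int (*demo_hwprobe_fn)(struct demo_probe_pair *pairs, size_t count);
typedef void *(*demo_memcpy_fn)(void *, const void *, size_t);

#define DEMO_KEY_MISALIGNED  1   /* hypothetical key */
#define DEMO_MISALIGNED_FAST 1   /* hypothetical value */

/* Two byte-wise copies standing in for distinct implementations. */
void *demo_memcpy_generic(void *d, const void *s, size_t n)
{
    unsigned char *dp = d; const unsigned char *sp = s;
    while (n--) *dp++ = *sp++;
    return d;
}

void *demo_memcpy_fast(void *d, const void *s, size_t n)
{
    unsigned char *dp = d; const unsigned char *sp = s;
    for (size_t i = 0; i < n; i++) dp[i] = sp[i];
    return d;
}

/* Fake probes so the selector can be exercised on any host. */
int demo_probe_fast(struct demo_probe_pair *p, size_t count)
{
    for (size_t i = 0; i < count; i++) p[i].value = DEMO_MISALIGNED_FAST;
    return 0;
}

int demo_probe_fail(struct demo_probe_pair *p, size_t count)
{
    (void) p; (void) count;
    return -1;
}

/* Selector: the second argument may be NULL when an older C library
   does not pass the probe function, so it is checked before use.  */
demo_memcpy_fn demo_select_memcpy(uint64_t hwcap, demo_hwprobe_fn probe)
{
    struct demo_probe_pair pair = { DEMO_KEY_MISALIGNED, 0 };
    (void) hwcap;
    if (probe == NULL || probe(&pair, 1) != 0)
        return demo_memcpy_generic;
    return pair.value == DEMO_MISALIGNED_FAST ? demo_memcpy_fast
                                              : demo_memcpy_generic;
}

void demo_select_check(void)
{
    assert(demo_select_memcpy(0, NULL) == demo_memcpy_generic);
}
```

Falling back to the generic routine on both a NULL pointer and a failing probe keeps the selector safe whichever library version invokes it.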
Evan Green	e7919e0db2	riscv: Add hwprobe vdso call support The new riscv_hwprobe syscall also comes with a vDSO for faster answers to your most common questions. Call in today to speak with a kernel representative near you! Signed-off-by: Evan Green <evan@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-01 07:14:57 -08:00
Evan Green	c6c33339b4	linux: Introduce INTERNAL_VSYSCALL Add an INTERNAL_VSYSCALL() macro that makes a vDSO call, falling back to a regular syscall, but without setting errno. Instead, the return value is plumbed straight out of the macro. Signed-off-by: Evan Green <evan@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-01 07:14:56 -08:00
Evan Green	426d0e1aa8	riscv: Add Linux hwprobe syscall support Add awareness and a thin wrapper function around a new Linux system call that allows callers to get architecture and microarchitecture information about the CPUs from the kernel. This can be used to do things like dynamically choose a memcpy implementation. Signed-off-by: Evan Green <evan@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-03-01 07:14:55 -08:00
H.J. Lu	9b7091415a	x86-64: Update _dl_tlsdesc_dynamic to preserve AMX registers _dl_tlsdesc_dynamic should also preserve AMX registers which are caller-saved. Add X86_XSTATE_TILECFG_ID and X86_XSTATE_TILEDATA_ID to x86-64 TLSDESC_CALL_STATE_SAVE_MASK. Compute the AMX state size and save it in xsave_state_full_size which is only used by _dl_tlsdesc_dynamic_xsave and _dl_tlsdesc_dynamic_xsavec. This fixes the AMX part of BZ #31372. Tested on AMX processor. AMX test is enabled only for compilers with the fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098 GCC 14 and GCC 11/12/13 branches have the bug fix. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-02-29 04:30:01 -08:00
H.J. Lu	a1735e0aa8	x86_64: Suppress false positive valgrind error When strcmp-avx2.S is used as the default, elf/tst-valgrind-smoke fails with ==1272761== Conditional jump or move depends on uninitialised value(s) ==1272761== at 0x4022C98: strcmp (strcmp-avx2.S:462) ==1272761== by 0x400B05B: _dl_name_match_p (dl-misc.c:75) ==1272761== by 0x40085F3: _dl_map_object (dl-load.c:1966) ==1272761== by 0x401AEA4: map_doit (rtld.c:644) ==1272761== by 0x4001488: _dl_catch_exception (dl-catch.c:237) ==1272761== by 0x40015AE: _dl_catch_error (dl-catch.c:256) ==1272761== by 0x401B38F: do_preload (rtld.c:816) ==1272761== by 0x401C116: handle_preload_list (rtld.c:892) ==1272761== by 0x401EDF5: dl_main (rtld.c:1842) ==1272761== by 0x401A79E: _dl_sysdep_start (dl-sysdep.c:140) ==1272761== by 0x401BEEE: _dl_start_final (rtld.c:494) ==1272761== by 0x401BEEE: _dl_start (rtld.c:581) ==1272761== by 0x401AD87: ??? (in /elf/ld.so) The assembly codes are: 0x0000000004022c80 <+144>: vmovdqu 0x20(%rdi),%ymm0 0x0000000004022c85 <+149>: vpcmpeqb 0x20(%rsi),%ymm0,%ymm1 0x0000000004022c8a <+154>: vpcmpeqb %ymm0,%ymm15,%ymm2 0x0000000004022c8e <+158>: vpandn %ymm1,%ymm2,%ymm1 0x0000000004022c92 <+162>: vpmovmskb %ymm1,%ecx 0x0000000004022c96 <+166>: inc %ecx => 0x0000000004022c98 <+168>: jne 0x4022c32 <strcmp+66> strcmp-avx2.S has 32-byte vector loads of strings which are shorter than 32 bytes: (gdb) p (char *) ($rdi + 0x20) $6 = 0x1ffeffea20 "memcheck-amd64-linux.so" (gdb) p (char *) ($rsi + 0x20) $7 = 0x4832640 "core-amd64-linux.so" (gdb) call (int) strlen ((char *) ($rsi + 0x20)) $8 = 19 (gdb) call (int) strlen ((char *) ($rdi + 0x20)) $9 = 23 (gdb) It triggers the valgrind error. The above code is safe since the loads don't cross the page boundary. Update tst-valgrind-smoke.sh to accept an optional suppression file and pass a suppression file to valgrind when strcmp-avx2.S is the default implementation of strcmp. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2024-02-28 13:40:55 -08:00
H.J. Lu	8c7c188d62	x86: Don't check XFD against /proc/cpuinfo Since /proc/cpuinfo doesn't report XFD, don't check it against /proc/cpuinfo.	2024-02-28 11:50:38 -08:00
H.J. Lu	befe2d3c4d	x86-64: Don't use SSE resolvers for ISA level 3 or above When glibc is built with ISA level 3 or above enabled, SSE resolvers aren't available and glibc fails to build: ld: .../elf/librtld.os: in function `init_cpu_features': .../elf/../sysdeps/x86/cpu-features.c:1200:(.text+0x1445f): undefined reference to `_dl_runtime_resolve_fxsave' ld: .../elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object /usr/local/bin/ld: final link failed: bad value For ISA level 3 or above, don't use _dl_runtime_resolve_fxsave nor _dl_tlsdesc_dynamic_fxsave. This fixes BZ #31429. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-02-28 11:49:30 -08:00
H.J. Lu	0aac205a81	x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers Compiler generates the following instruction sequence for GNU2 dynamic TLS access: leaq tls_var@TLSDESC(%rip), %rax call tls_var@TLSCALL(%rax) or leal tls_var@TLSDESC(%ebx), %eax call tls_var@TLSCALL(%eax) CALL instruction is transparent to compiler which assumes all registers, except for EFLAGS and RAX/EAX, are unchanged after CALL. When _dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the slow path. __tls_get_addr is a normal function which doesn't preserve any caller-saved registers. _dl_tlsdesc_dynamic saved and restored integer caller-saved registers, but didn't preserve any other caller-saved registers. Add _dl_tlsdesc_dynamic IFUNC functions for FNSAVE, FXSAVE, XSAVE and XSAVEC to save and restore all caller-saved registers. This fixes BZ #31372. Add GLRO(dl_x86_64_runtime_resolve) with GLRO(dl_x86_tlsdesc_dynamic) to optimize elf_machine_runtime_setup. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-02-28 09:02:56 -08:00
H.J. Lu	e6350be7e9	sysdeps/unix/sysv/linux/x86_64/Makefile: Add the end marker Add the end marker to tests, tests-container and modules-names.	2024-02-28 05:48:27 -08:00
Adhemerval Zanella	b53e73ea80	s390: Improve static-pie configure tests Instead of tying based on the linker name and version, check for the required support: * whether it does not generate dynamic TLS relocations in PIE (binutils PR ld/22263); * if it accepts --no-dynamic-linker (by using -static-pie); * and if it adds a DT_JMPREL pointing to .rela.iplt with static pie. The patch also trims the comments, for binutils one of the tests should already cover it. The kernel ones are not clear which version should have the backport, nor it is something that glibc can do much about it. Finally, the glibc is somewhat confusing, since it refers to commits not related to s390x. Checked with a build for s390x-linux-gnu. Reviewed-by: Stefan Liebler <stli@linux.ibm.com>	2024-02-28 10:09:53 -03:00
H.J. Lu	24c8db87c9	x86: Change ENQCMD test to CHECK_FEATURE_PRESENT Since ENQCMD is mainly used in kernel, change the ENQCMD test to CHECK_FEATURE_PRESENT. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-02-27 11:50:52 -08:00
Joe Ramsay	e302e10213	aarch64/fpu: Sync libmvec routines from 2.39 and before with AOR This includes a fix for big-endian in AdvSIMD log, some cosmetic changes, and numerous small optimisations mainly around inlining and using indexed variants of MLA intrinsics. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-02-26 09:45:50 -03:00
Stefan Liebler	02782fd128	S390: Do not clobber r7 in clone [BZ #31402 ] Starting with commit `e57d8fc97b` "S390: Always use svc 0" clone clobbers the call-saved register r7 in error case: function or stack is NULL. This patch restores the saved registers also in the error case. Furthermore the existing test misc/tst-clone is extended to check all error cases and that clone does not clobber registers in this error case.	2024-02-26 13:37:46 +01:00
Sunil K Pandey	9f78a7c1d0	x86_64: Exclude SSE, AVX and FMA4 variants in libm multiarch When glibc is built with ISA level 3 or higher by default, the resulting glibc binaries won't run on SSE or FMA4 processors. Exclude SSE, AVX and FMA4 variants in libm multiarch when ISA level 3 or higher is enabled by default. When glibc is built with ISA level 2 enabled by default, only keep SSE4.1 variant. Fixes BZ 31335. NB: elf/tst-valgrind-smoke test fails with ISA level 4, because valgrind doesn't support AVX512 instructions: https://bugs.kde.org/show_bug.cgi?id=383010 Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-02-25 13:20:51 -08:00
H.J. Lu	dfb05f8e70	x86-64: Save APX registers in ld.so trampoline Add APX registers to STATE_SAVE_MASK so that APX registers are saved in ld.so trampoline. This fixes BZ #31371. Also update STATE_SAVE_OFFSET and STATE_SAVE_MASK for i386 which will be used by i386 _dl_tlsdesc_dynamic. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-02-25 09:22:15 -08:00
Simon Chopin	59e0441d4a	tests: gracefully handle AppArmor userns containment Recent AppArmor containment allows restricting unprivileged user namespaces, which is enabled by default on recent Ubuntu systems. When this happens, as is common with Linux Security Modules, the syscall will fail with -EACCES. When that happens, the affected tests will now be considered unsupported rather than simply failing. Further information: * https://gitlab.com/apparmor/apparmor/-/wikis/unprivileged_userns_restriction * https://ubuntu.com/blog/ubuntu-23-10-restricted-unprivileged-user-namespaces * https://manpages.ubuntu.com/manpages/jammy/man5/apparmor.d.5.html (for the return code) V2: * Fix duplicated line in check_unshare_hints * Also handle similar failure in tst-pidfd_getpid V3: * Comment formatting * Added some more documentation on syscall return value Signed-off-by: Simon Chopin <simon.chopin@canonical.com>	2024-02-23 08:50:00 -03:00
Adhemerval Zanella	1e9a550ba4	powerpc: Remove power7 strstr optimization The optimization is not faster than the generic algorithm, using the bench-strstr the geometric mean running on a POWER10 machine using gcc 13.1.1 is 482.47 while the default __strstr_ppc is 340.97 (which uses the generic implementation). Also, there is no need to redirect the internal str/mem call to optimized version, internal ifunc is supported and enabled for internal calls (meaning that the generic implementation will use any asm optimization if available). Checked on powerpc64le-linux-gnu. Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-02-23 08:50:00 -03:00
Adhemerval Zanella	f4c142bb9f	arm: Use _dl_find_object on __gnu_Unwind_Find_exidx (BZ 31405) Instead of __dl_iterate_phdr. On ARM dlfo_eh_frame/dlfo_eh_count maps to PT_ARM_EXIDX vaddr start / length. On a Neoverse N1 machine with 160 cores, the following program: $ cat test.c #include <stdlib.h> #include <pthread.h> #include <assert.h> enum { niter = 1024, ntimes = 128, }; static void * tf (void *arg) { int a = (int) arg; for (int i = 0; i < niter; i++) { void *p[ntimes]; for (int j = 0; j < ntimes; j++) p[j] = malloc (a * 128); for (int j = 0; j < ntimes; j++) free (p[j]); } return NULL; } int main (int argc, char *argv[]) { enum { nthreads = 16 }; pthread_t t[nthreads]; for (int i = 0; i < nthreads; i ++) assert (pthread_create (&t[i], NULL, tf, (void *) i) == 0); for (int i = 0; i < nthreads; i++) { void *r; assert (pthread_join (t[i], &r) == 0); assert (r == NULL); } return 0; } $ arm-linux-gnueabihf-gcc -fsanitize=address test.c -o test Improves from ~15s to 0.5s. Checked on arm-linux-gnueabihf.	2024-02-23 08:50:00 -03:00
Xi Ruoyao	e2a65ecc4b	math: Update mips64 ulps Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-02-22 21:28:25 +01:00
Daniel Cederman	aa4106db1d	sparc: Treat the version field in the FPU control word as reserved The FSR version field is read-only and might be non-zero. This allows math/test-fpucw* to correctly pass when the version is non-zero. Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-02-19 10:55:50 -03:00
Flavio Cruz	88b771ab5e	Implement setcontext/getcontext/makecontext/swapcontext for Hurd x86_64 Tested with the tests provided by glibc plus some other toy examples. Message-ID: <20240217202535.1860803-1-flaviocruz@gmail.com>	2024-02-17 21:45:35 +01:00
Flavio Cruz	e3da8f9bad	Use proc_getchildren_rusage when available in getrusage and times. Message-ID: <20240217164846.1837223-1-flaviocruz@gmail.com>	2024-02-17 21:14:39 +01:00
Florian Weimer	6a04404521	Linux: Switch back to assembly syscall wrapper for prctl (bug 29770) Commit `ff026950e2` ("Add a C wrapper for prctl [BZ #25896]") replaced the assembler wrapper with a C function. However, on powerpc64le-linux-gnu, the C variadic function implementation requires extra work in the caller to set up the parameter save area. Calling a function that needs a parameter save area without one (because the prototype used indicates the function is not variadic) corrupts the caller's stack. The Linux manual pages project documents prctl as a non-variadic function. This has resulted in various projects over the years using non-variadic prototypes, including the sanitizer libraries in LLVM and GCC (GCC PR 113728). This commit switches back to the assembler implementation on most targets and only keeps the C implementation for x86-64 x32. Also add the __prctl_time64 alias from commit `b39ffab860` ("Linux: Add time64 alias for prctl") to sysdeps/unix/sysv/linux/syscalls.list; it was not yet present in commit `ff026950e2`. This restores the old ABI on powerpc64le-linux-gnu, thus fixing bug 29770. Reviewed-By: Simon Chopin <simon.chopin@canonical.com>	2024-02-17 09:17:04 +01:00
Florian Weimer	0d9166c224	i386: Use generic memrchr in libc (bug 31316) Before this change, we incorrectly used the SSE2 variant in the implementation, without checking that the system actually supports SSE2. Tested-by: Sam James <sam@gentoo.org>	2024-02-16 07:41:04 +01:00
H.J. Lu	ef7f4b1fef	Apply the Makefile sorting fix Apply the Makefile sorting fix generated by sort-makefile-lines.py.	2024-02-15 11:19:56 -08:00
H.J. Lu	71d133c500	sysdeps/x86_64/Makefile (tests): Add the end marker	2024-02-15 11:12:13 -08:00
Junxian Zhu	545480506f	mips: Use builtins for ffs and ffsll __builtin_ffs{,ll} is basically based on __builtin_ctz{,ll} in the MIPS GCC compiler. The hardware ctz instructions were available after MIPS{32,64} Release1. Using the builtin ctz can also reduce the code size of ffs/ffsll. Checked on mips o32 and mips64. Signed-off-by: Junxian Zhu <zhujunxian@oss.cipunited.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>	2024-02-14 12:20:49 -03:00
Adhemerval Zanella	491e55beab	x86: Expand the comment on when REP STOSB is used on memset Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-02-13 08:49:43 -08:00
Adhemerval Zanella	272708884c	x86: Do not prefer ERMS for memset on Zen3+ For AMD Zen3+ architecture, the performance of the vectorized loop is slightly better than ERMS. Checked on x86_64-linux-gnu on Zen3. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-02-13 08:49:13 -08:00
Adhemerval Zanella	0c0d39fe4a	x86: Fix Zen3/Zen4 ERMS selection (BZ 30994) The REP MOVSB usage on memcpy/memmove does not show much performance improvement on Zen3/Zen4 cores compared to the vectorized loops. Also, as from BZ 30994, if the source is aligned and the destination is not the performance can be 20x slower. The performance difference is noticeable with small buffer sizes, closer to the lower bounds limits when memcpy/memmove starts to use ERMS. The performance of REP MOVSB is similar to vectorized instruction on the size limit (the L2 cache). Also, there is no drawback to multiple cores sharing the cache. Checked on x86_64-linux-gnu on Zen3. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-02-13 08:49:12 -08:00
Michael Jeanson	155bb9d036	x86/cet: fix shadow stack test scripts Some shadow stack test scripts use the '==' operator with the 'test' command to validate exit codes resulting in the following error: sysdeps/x86_64/tst-shstk-legacy-1e.sh: 31: test: 139: unexpected operator The '==' operator is invalid for the 'test' command, use '-eq' like the previous call to 'test'. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-02-12 06:49:57 -08:00
Joseph Myers	1bc61cf8e0	Add SOL_VSOCK from Linux 6.7 to bits/socket.h Linux 6.7 adds a constant SOL_VSOCK (recall that various constants in include/linux/socket.h are in fact part of the kernel-userspace API despite that not being a uapi header). Add it to glibc's bits/socket.h. Tested for x86_64.	2024-02-08 12:57:24 +00:00
Joseph Myers	284b928321	Add new AArch64 HWCAP2 definitions from Linux 6.7 to bits/hwcap.h Linux 6.7 adds three new HWCAP2_* values for AArch64; add them to bits/hwcap.h in glibc.	2024-02-08 01:39:09 +00:00
Adhemerval Zanella	1e25112dc0	arm: Remove wrong ldr from _dl_start_user (BZ 31339) The commit `49d877a80b` (arm: Remove _dl_skip_args usage) removed the _SKIP_ARGS literal, which was previously loaded to r4 on loader _start. However, the cleanup did not remove the following 'ldr r4, [sl, r4]' on _dl_start_user, used to check to skip the arguments after ld self-relocations. In my testing, the kernel initially set r4 to 0, which makes the ldr instruction just read the _GLOBAL_OFFSET_TABLE_. However, since r4 is a callee-saved register, a different runtime might not zero initialize it and thus trigger an invalid memory access. Checked on arm-linux-gnu. Reported-by: Adrian Ratiu <adrian.ratiu@collabora.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-02-05 15:29:23 -03:00
Xi Ruoyao	2e80f13937	LoongArch: Use builtins for ffs and ffsll On LoongArch GCC compiles __builtin_ffs{,ll} to basically `(x ? __builtin_ctz (x) : -1) + 1`. Since a hardware ctz instruction is available, this is much better than the table-driven generic implementation. Tested on loongarch64. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-02-05 15:19:41 -03:00
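The expression quoted in the LoongArch commit, `(x ? __builtin_ctz (x) : -1) + 1`, can be tried out directly. The sketch below assumes a GCC- or Clang-compatible compiler (for the `__builtin_ctz` family) and uses hypothetical `demo_` names; the loop version is only a portable reference for comparison, not the table-driven generic implementation the commit mentions.

```c
#include <assert.h>

/* ffs expressed through count-trailing-zeros: 0 for x == 0, otherwise
   the 1-based index of the least significant set bit.  */
int demo_ffs(int x)
{
    return (x ? __builtin_ctz((unsigned int) x) : -1) + 1;
}

int demo_ffsll(long long x)
{
    return (x ? __builtin_ctzll((unsigned long long) x) : -1) + 1;
}

/* Portable reference that shifts until the lowest set bit is found. */
int demo_ffs_loop(int x)
{
    unsigned int u = (unsigned int) x;
    int pos = 1;
    if (u == 0)
        return 0;
    while ((u & 1u) == 0) {
        u >>= 1;
        pos++;
    }
    return pos;
}

void demo_ffs_check(void)
{
    assert(demo_ffs(0) == 0);
    assert(demo_ffs(8) == 4);              /* bit 3 is the lowest set bit */
    assert(demo_ffs(12) == demo_ffs_loop(12));
}
```

The explicit test for zero matters because `__builtin_ctz` is undefined for a zero argument.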
Xi Ruoyao	814ed22eab	Remove sysdeps/ia64/math-use-builtins-ffs.h IA64 is gone. Signed-off-by: Xi Ruoyao <xry111@xry111.site>	2024-02-05 15:19:41 -03:00
Adhemerval Zanella	bbd248ac0d	mips: Fix clone3 implementation (BZ 31325) For o32 we need to setup a minimal stack frame to allow cprestore on __thread_start_clone3 (which instructs the linker to save the gp for PIC). Also, there is no guarantee by kABI that $8 will be preserved after syscall execution, so we need to save it on the provided stack. Checked on mipsel-linux-gnu. Reported-by: Khem Raj <raj.khem@gmail.com> Tested-by: Khem Raj <raj.khem@gmail.com>	2024-02-02 10:28:16 -03:00
Joseph Myers	83d8d289b2	Rename c2x / gnu2x tests to c23 / gnu23 Complete the internal renaming from "C2X" and related names in GCC by renaming -c2x and -gnu2x tests to -c23 and -gnu23. Tested for x86_64, and with build-many-glibcs.py for powerpc64le.	2024-02-01 17:55:57 +00:00
Adhemerval Zanella Netto	ae4b8d6a0e	string: Use builtins for ffs and ffsll It allows to remove a lot of arch-specific implementations. Checked on x86_64, aarch64, powerpc64. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-02-01 09:31:33 -03:00
Adhemerval Zanella	26d01172f5	misc: tst-poll: Proper synchronize with child before sending the signal When running the testsuite in parallel, for instance running make -j $(nproc) check, occasionally tst-epoll fails with a timeout. It happens because it sometimes takes a bit more than 10ms for the process to get cloned and blocked by the syscall. In that case the signal is sent too early, and the test fails with a timeout. Checked on x86_64-linux-gnu.	2024-02-01 09:31:33 -03:00
Joseph Myers	42cc619dfb	Refer to C23 in place of C2X in glibc WG14 decided to use the name C23 as the informal name of the next revision of the C standard (notwithstanding the publication date in 2024). Update references to C2X in glibc to use the C23 name. This is intended to update everything except where it involves renaming files (the changes involving renaming tests are intended to be done separately). In the case of the _ISOC2X_SOURCE feature test macro - the only user-visible interface involved - support for that macro is kept for backwards compatibility, while adding _ISOC23_SOURCE. Tested for x86_64.	2024-02-01 11:02:01 +00:00
Stefan Liebler	cc1b91eabd	S390: Fix building with --disable-multi-arch [BZ #31196] Starting with commits - `7ea510127e` string: Add libc_hidden_proto for strchrnul - `22999b2f0f` string: Add libc_hidden_proto for memrchr building glibc on s390x with --disable-multi-arch fails if only the C-variant of strchrnul / memrchr is used. This is the case if gcc uses -march < z13. The build fails with: ../sysdeps/s390/strchrnul-c.c:28:49: error: ‘__strchrnul_c’ undeclared here (not in a function); did you mean ‘__strchrnul’? 28 \| __hidden_ver1 (__strchrnul_c, __GI___strchrnul, __strchrnul_c); With --disable-multi-arch, __strchrnul_c is not available as string/strchrnul.c is just included without defining STRCHRNUL and thus we also don't have to create the internal hidden symbol. Tested-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-01-30 22:28:51 +01:00
Andreas Schwab	6edaa12b41	riscv: add support for static PIE In order to support static PIE the startup code must avoid relocations before __libc_start_main is called.	2024-01-22 14:58:23 +01:00
Adhemerval Zanella	bcf2abd43b	sh: Fix static build with --enable-fortify For static the internal symbols should not be prepended with the internal __GI_. Checked with a make check for sh4-linux-gnu.	2024-01-22 10:04:53 -03:00
Adhemerval Zanella	926a4bdbb5	sparc: Fix sparc64 memmove length comparison (BZ 31266) The small counts copy bytes comparison should be unsigned (as the memmove size argument). It fixes string/tst-memmove-overflow on sparcv9, where the input size triggers an invalid code path. Checked on sparc64-linux-gnu and sparcv9-linux-gnu.	2024-01-22 09:34:50 -03:00
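Why a signed length comparison can select the wrong path is easy to show in C, even though the actual fix is in SPARC assembly. The sketch below is only an analogy under stated assumptions: the threshold `DEMO_SMALL_LIMIT` and both function names are hypothetical, and the signed variant relies on the implementation-defined (on GCC and Clang, two's-complement wrapping) conversion from `size_t` to `intptr_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-off below which a byte-wise "small copy" path runs. */
#define DEMO_SMALL_LIMIT 15

/* Signed comparison: a huge size reinterpreted as signed is negative,
   so it is wrongly classified as a small count.  */
bool demo_is_small_signed(size_t n)
{
    return (intptr_t) n <= DEMO_SMALL_LIMIT;
}

/* Unsigned comparison: matches the type of the size argument. */
bool demo_is_small_unsigned(size_t n)
{
    return n <= (size_t) DEMO_SMALL_LIMIT;
}

void demo_small_check(void)
{
    assert(demo_is_small_unsigned(8));
    assert(!demo_is_small_unsigned(SIZE_MAX));  /* huge stays huge */
}
```

On typical targets `demo_is_small_signed(SIZE_MAX)` returns true, which mirrors how an overflow-sized input could reach the small-count path.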
Adhemerval Zanella	369efd8177	sparc64: Remove unwind information from signal return stubs [BZ#31244] Similar to sparc32 fix, remove the unwind information on the signal return stubs. This fixes the regressions: FAIL: nptl/tst-cancel24-static FAIL: nptl/tst-cond8-static FAIL: nptl/tst-mutex8-static FAIL: nptl/tst-mutexpi8-static FAIL: nptl/tst-mutexpi9 On sparc64-linux-gnu.	2024-01-22 09:34:50 -03:00
Adhemerval Zanella	dd57f5e7b6	sparc: Remove 64 bit check on sparc32 wordsize (BZ 27574) The sparc32 is always 32 bits. Checked on sparcv9-linux-gnu.	2024-01-22 09:34:50 -03:00
Daniel Cederman	87d921e270	sparc: Do not test preservation of NaN payloads for LEON The FPU used by LEON does not preserve NaN payload. This change allows the math/test-*-canonicalize tests to pass on LEON. Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-18 08:27:44 -03:00
Daniel Cederman	45f7ea26c1	sparc: Force calculation that raises exception Use the math_force_eval() macro to force the calculation to complete and raise the exception. With this change the math/test-fenv test passes. Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-18 08:27:44 -03:00
Daniel Cederman	a8f7c77970	sparc: Fix llrint and llround missing exceptions on SPARC V8 Conversions from a float to a long long on SPARC v8 use a libgcc function that may not raise the correct exceptions on overflow. It also may raise spurious "inexact" exceptions on non overflow cases. This patch fixes the problem in the same way as for RV32. Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-18 08:27:44 -03:00
Daniel Cederman	7bd06985c0	sparc: Remove unwind information from signal return stubs [BZ #31244 ] The functions were previously written in C, but were not compiled with unwind information. The ENTRY/END macros includes .cfi_startproc and .cfi_endproc which adds unwind information. This caused the tests cleanup-8 and cleanup-10 in the GCC testsuite to fail. This patch adds a version of the ENTRY/END macros without the CFI instructions that can be used instead. sigaction registers a restorer address that is located two instructions before the stub function. This patch adds a two instruction padding to avoid that the unwinder accesses the unwind information from the function that the linker has placed right before it in memory. This fixes an issue with pthread_cancel that caused tst-mutex8-static (and other tests) to fail. Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-18 08:27:44 -03:00
Daniel Cederman	82a35070ec	sparc: Prevent stfsr from directly following floating-point instruction On LEON, if the stfsr instruction is immediately following a floating-point operation instruction in a running program, with no other instruction in between the two, the stfsr might behave as if the order was reversed between the two instructions and the stfsr occurred before the floating-point operation. Add a nop instruction before the stfsr to prevent this from happening. Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-18 08:27:44 -03:00
Daniel Cederman	3bb1350c36	sparc: Use existing macros to avoid code duplication Macros for using inline assembly to access the fp state register exist in both fenv_private.h and in fpu_control.h. Let fenv_private.h use the macros from fpu_control.h. Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-18 08:27:43 -03:00
Joseph Myers	6511b579a5	Update kernel version to 6.7 in header constant tests This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py and tst-pidfd-consts.py to 6.7. (There are no new constants covered by these tests in 6.7 that need any other header changes.) Tested with build-many-glibcs.py.	2024-01-17 21:15:37 +00:00
Joseph Myers	df11c05be9	Update syscall lists for Linux 6.7 Linux 6.7 adds the futex_requeue, futex_wait and futex_wake syscalls, and enables map_shadow_stack for architectures previously missing it. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py.	2024-01-17 15:38:54 +00:00
H.J. Lu	457bd9cf2e	x86-64: Check if mprotect works before rewriting PLT Systemd execution environment configuration may prohibit changing a memory mapping to become executable: MemoryDenyWriteExecute= Takes a boolean argument. If set, attempts to create memory mappings that are writable and executable at the same time, or to change existing memory mappings to become executable, or mapping shared memory segments as executable, are prohibited. When it is set, systemd service stops working if PLT rewrite is enabled. Check if mprotect works before rewriting PLT. This fixes BZ #31230. This also works with SELinux when deny_execmem is on. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-01-15 06:59:23 -08:00
Sunil K Pandey	9d94997b5f	x86_64: Optimize ffsll function code size. Ffsll function randomly regress by ~20%, depending on how code gets aligned in memory. Ffsll function code size is 17 bytes. Since default function alignment is 16 bytes, it can load on 16, 32, 48 or 64 bytes aligned memory. When ffsll function load at 16, 32 or 64 bytes aligned memory, entire code fits in single 64 bytes cache line. When ffsll function load at 48 bytes aligned memory, it splits in two cache line, hence random regression. Ffsll function size reduction from 17 bytes to 12 bytes ensures that it will always fit in single 64 bytes cache line. This patch fixes ffsll function random performance regression. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-01-13 12:20:08 -08:00
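The cache-line arithmetic in the ffsll commit can be checked with a few lines of C. This is a sketch under the figures given in the message (64-byte cache lines, 16-byte function alignment, code sizes of 17 and 12 bytes); the helper name `demo_spans_two_lines` and the constant are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64  /* cache line size taken from the commit message */

/* True when a block of `size` bytes starting at byte offset `start`
   touches more than one cache line.  */
bool demo_spans_two_lines(size_t start, size_t size)
{
    if (size == 0)
        return false;
    return start / DEMO_CACHE_LINE != (start + size - 1) / DEMO_CACHE_LINE;
}

void demo_cache_line_check(void)
{
    /* 17 bytes at offset 48 cover bytes 48..64 and cross into the next line. */
    assert(demo_spans_two_lines(48, 17));
    /* 12 bytes at offset 48 cover bytes 48..59 and stay within one line. */
    assert(!demo_spans_two_lines(48, 12));
}
```

For the 16-byte-aligned offsets 0, 16 and 32 neither size crosses a line, so only the placement at offset 48 distinguishes the two code sizes.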
Yanzhang Wang	e0590f41fe	RISC-V: Enable static-pie. This patch references commit `374cef3` to add static-pie support. Because the dummy link map is used when relocating ourselves, there is no need to set __global_pointer$ at this time. It also checks whether the toolchain supports building static-pie. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-12 15:11:45 -03:00
Adhemerval Zanella	061eaf0244	linux: Fix fstat64 on alpha and sparc64 The `551101e824` change is incorrect for alpha and sparc, since __NR_stat is defined by both kABI. Use __NR_newfstat to check whether to fallback to __NR_fstat64 (similar to what fstatat64 does). Checked on sparc64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-01-12 15:11:11 -03:00
Wilco Dijkstra	08ddd26814	math: remove exp10 wrappers Remove the error handling wrapper from exp10. This is very similar to the changes done to exp and exp2, except that we also need to handle pow10 and pow10l. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-12 16:02:12 +00:00
Xi Ruoyao	5a85786a90	Make __getrandom_nocancel set errno and add a _nostatus version The __getrandom_nocancel function returns errors as negative values instead of errno. This is inconsistent with other _nocancel functions and it breaks "TEMP_FAILURE_RETRY (__getrandom_nocancel (p, n, 0))" in __arc4random_buf. Use INLINE_SYSCALL_CALL instead of INTERNAL_SYSCALL_CALL to fix this issue. But __getrandom_nocancel has been avoiding from touching errno for a reason, see BZ 29624. So add a __getrandom_nocancel_nostatus function and use it in tcache_key_initialize. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-01-12 14:23:11 +01:00
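The interaction between the two error-reporting conventions and a retry loop can be sketched as follows. The retry helper below only imitates the documented behaviour of `TEMP_FAILURE_RETRY` (repeat while the result is -1 and `errno` is `EINTR`); it is not the glibc macro, and every `demo_` name, including the fake call that is interrupted twice, is hypothetical.

```c
#include <assert.h>
#include <errno.h>

typedef long (*demo_call_fn)(int *attempts);

/* Raw convention: errors come back as negative values, errno untouched.
   The fake call is "interrupted" on its first two attempts.  */
long demo_call_raw(int *attempts)
{
    if ((*attempts)++ < 2)
        return -EINTR;
    return 16;
}

/* errno convention: -1 on failure with the error stored in errno. */
long demo_call_errno(int *attempts)
{
    long r = demo_call_raw(attempts);
    if (r < 0) {
        errno = (int) -r;
        return -1;
    }
    return r;
}

/* Retry while the call reports -1 with errno == EINTR. */
long demo_retry(demo_call_fn fn, int *attempts)
{
    long r;
    do
        r = fn(attempts);
    while (r == -1L && errno == EINTR);
    return r;
}

void demo_retry_check(void)
{
    int attempts = 0;
    assert(demo_retry(demo_call_errno, &attempts) == 16);
    assert(attempts == 3);  /* two interrupted attempts, then success */
}
```

Passing `demo_call_raw` to the same helper returns `-EINTR` after a single attempt, because the loop condition never sees -1; that is the mismatch the commit describes for `__arc4random_buf`.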
H.J. Lu	f2b65a4471	x86-64/cet: Make CET feature check specific to Linux/x86 CET feature bits in TCB, which are Linux specific, are used to check if CET features are active. Move CET feature check to Linux/x86 directory. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-01-11 20:35:24 -08:00
H.J. Lu	874214db62	i386: Remove CET support bits 1. Remove _dl_runtime_resolve_shstk and _dl_runtime_profile_shstk. 2. Move CET offsets from x86 cpu-features-offsets.sym to x86-64 features-offsets.sym. 3. Rename x86 cet-control.h to x86-64 feature-control.h since it is only for x86-64 and also used for PLT rewrite. 4. Add x86-64 ldsodefs.h to include feature-control.h. 5. Change TUNABLE_CALLBACK (set_plt_rewrite) to x86-64 only. 6. Move x86 dl-procruntime.c to x86-64. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-10 05:20:20 -08:00
H.J. Lu	7d544dd049	x86-64/cet: Move check-cet.awk to x86_64 Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-10 05:20:16 -08:00
H.J. Lu	a1bbee9fd1	x86-64/cet: Move dl-cet.[ch] to x86_64 directories Since CET is only enabled for x86-64, move dl-cet.[ch] to x86_64 directories. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-10 05:19:32 -08:00
H.J. Lu	b45115a666	x86: Move x86-64 shadow stack startup codes Move sysdeps/x86/libc-start.h to sysdeps/x86_64/libc-start.h and use sysdeps/generic/libc-start.h for i386. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-10 05:19:32 -08:00
Adhemerval Zanella	a0cfc48e8a	i386: Fail if configured with --enable-cet Since it is only supported for x86_64. Checked on i686-linux-gnu.	2024-01-09 13:55:51 -03:00
Adhemerval Zanella	25f1e16ef0	i386: Remove CET support CET is only supported for x86_64, this patch reverts: - `faaee1f07e` x86: Support shadow stack pointer in setjmp/longjmp. - `be9ccd27c0` i386: Add _CET_ENDBR to indirect jump targets in add_n.S/sub_n.S - `c02695d776` x86/CET: Update vfork to prevent child return - `5d844e1b72` i386: Enable CET support in ucontext functions - `124bcde683` x86: Add _CET_ENDBR to functions in crti.S - `562837c002` x86: Add _CET_ENDBR to functions in dl-tlsdesc.S - `f753fa7dea` x86: Support IBT and SHSTK in Intel CET [BZ #21598] - `825b58f3fb` i386-mcount.S: Add _CET_ENDBR to _mcount and __fentry__ - `7e119cd582` i386: Use _CET_NOTRACK in i686/memcmp.S - `177824e232` i386: Use _CET_NOTRACK in memcmp-sse4.S - `0a899af097` i386: Use _CET_NOTRACK in memcpy-ssse3-rep.S - `7fb613361c` i386: Use _CET_NOTRACK in memcpy-ssse3.S - `77a8ae0948` i386: Use _CET_NOTRACK in memset-sse2-rep.S - `00e7b76a8f` i386: Use _CET_NOTRACK in memset-sse2.S - `90d15dc577` i386: Use _CET_NOTRACK in strcat-sse2.S - `f1574581c7` i386: Use _CET_NOTRACK in strcpy-sse2.S - `4031d7484a` i386/sub_n.S: Add a missing _CET_ENDBR to indirect jump target Checked on i686-linux-gnu.	2024-01-09 13:55:51 -03:00
Adhemerval Zanella	b7fc4a07f2	x86: Move CET infrastructure to x86_64 The CET is only supported for x86_64 and there is no plan to add kernel support for i386. Move the Makefile rules and files from the generic x86 folder to x86_64 one. Checked on x86_64-linux-gnu and i686-linux-gnu.	2024-01-09 13:55:51 -03:00
Adhemerval Zanella	460860f457	Remove ia64-linux-gnu Linux 6.7 removed ia64 from the official tree [1], following the general principle that a glibc port needs upstream support for the architecture in all the components it depends on (binutils, GCC, and the Linux kernel). Apart from the removal of sysdeps/ia64 and sysdeps/unix/sysv/linux/ia64, there are updates to various comments referencing ia64 for which removal of those references seemed appropriate. The configuration is removed from README and build-many-glibcs.py. The CONTRIBUTED-BY, elf/elf.h, manual/contrib.texi (the porting mention), *.po files, config.guess, and longlong.h are not changed. For Linux it allows cleaning up some clone2 support in multiple files. The following bugs can be closed as WONTFIX: BZ 22634 [2], BZ 14250 [3], BZ 21634 [4], BZ 10163 [5], BZ 16401 [6], and BZ 11585 [7]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43ff221426d33db909f7159fdf620c3b052e2d1c [2] https://sourceware.org/bugzilla/show_bug.cgi?id=22634 [3] https://sourceware.org/bugzilla/show_bug.cgi?id=14250 [4] https://sourceware.org/bugzilla/show_bug.cgi?id=21634 [5] https://sourceware.org/bugzilla/show_bug.cgi?id=10163 [6] https://sourceware.org/bugzilla/show_bug.cgi?id=16401 [7] https://sourceware.org/bugzilla/show_bug.cgi?id=11585 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-01-08 17:09:36 -03:00
H.J. Lu	0f9afc265a	x32: Handle displacement overflow in PLT rewrite [BZ #31218 ] PLT rewrite calculated displacement with ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE; On x32, displacement from 0xf7fbe060 to 0x401030 was calculated as unsigned int disp = 0x401030 - 0xf7fbe060 - 5; with disp == 0x8442fcb and caused displacement overflow. The PLT entry was changed to: 0xf7fbe060 <+0>: e9 cb 2f 44 08 jmp 0x401030 0xf7fbe065 <+5>: cc int3 0xf7fbe066 <+6>: cc int3 0xf7fbe067 <+7>: cc int3 0xf7fbe068 <+8>: cc int3 0xf7fbe069 <+9>: cc int3 0xf7fbe06a <+10>: cc int3 0xf7fbe06b <+11>: cc int3 0xf7fbe06c <+12>: cc int3 0xf7fbe06d <+13>: cc int3 0xf7fbe06e <+14>: cc int3 0xf7fbe06f <+15>: cc int3 x32 has 32-bit address range, but it doesn't wrap address around at 4GB, JMP target was changed to 0x100401030 (0xf7fbe060LL + 0x8442fcbLL + 5), which is above 4GB. Always use uint64_t to calculate displacement. This fixes BZ #31218. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-01-06 14:25:49 -08:00
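The displacement arithmetic from the PLT rewrite entry above can be reproduced in C. This is a sketch under stated assumptions rather than the glibc code: the function names are hypothetical, `JMP32_INSN_SIZE` is taken as 5 from the quoted calculation, and the signed 32-bit range check is one plausible way to detect the overflow once the computation is done in 64 bits (the commit message only says that uint64_t is always used).

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of "jmp rel32", as used in the quoted calculation.  */
#define JMP32_INSN_SIZE 5

/* Hypothetical: displacement computed in 32-bit arithmetic, as
   happens when the address type is only 32 bits wide on x32.  */
uint32_t
disp_wrapped (uint32_t value, uint32_t branch_start)
{
  return value - branch_start - JMP32_INSN_SIZE;
}

/* Hypothetical: displacement computed in 64-bit arithmetic, then
   checked against the signed 32-bit range of a rel32 operand.
   The range check is an assumption made for illustration.  */
bool
disp_fits_rel32 (uint64_t value, uint64_t branch_start, int32_t *disp)
{
  int64_t d = (int64_t) (value - branch_start - JMP32_INSN_SIZE);
  if (d < INT32_MIN || d > INT32_MAX)
    return false;
  *disp = (int32_t) d;
  return true;
}

/* Address the processor jumps to: end of the instruction plus the
   sign-extended displacement, without wrapping at 4GB.  */
uint64_t
jmp_target (uint64_t branch_start, int32_t disp)
{
  return branch_start + JMP32_INSN_SIZE + (uint64_t) (int64_t) disp;
}
```

For the addresses in the commit message, `disp_wrapped (0x401030, 0xf7fbe060)` gives 0x8442fcb and `jmp_target (0xf7fbe060, 0x8442fcb)` gives 0x100401030, while the 64-bit computation shows that the true displacement does not fit in a rel32 operand.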
Noah Goldstein	b96a2eba2f	x86: Fixup some nits in longjmp asm implementation Replace a stray `nop` with a `.p2align` directive.	2024-01-05 18:00:38 -08:00
H.J. Lu	848746e88e	elf: Add ELF_DYNAMIC_AFTER_RELOC to rewrite PLT Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after relocation. For x86-64, add #define DT_X86_64_PLT (DT_LOPROC + 0) #define DT_X86_64_PLTSZ (DT_LOPROC + 1) #define DT_X86_64_PLTENT (DT_LOPROC + 3) 1. DT_X86_64_PLT: The address of the procedure linkage table. 2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage table. 3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table entry. With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the memory offset of the indirect branch instruction. Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section with direct branch after relocation when the lazy binding is disabled. PLT rewrite is disabled by default since SELinux may disallow modifying code pages and ld.so can't detect it in all cases. Use $ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 to enable PLT rewrite with 32-bit direct jump at run-time or $ export GLIBC_TUNABLES=glibc.cpu.plt_rewrite=2 to enable PLT rewrite with 32-bit direct jump and on APX processors with 64-bit absolute jump at run-time. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-01-05 05:49:49 -08:00
Sergey Bugaev	520b1df08d	aarch64: Make cpu-features definitions not Linux-specific These describe generic AArch64 CPU features, and are not tied to a kernel-specific way of determining them. We can share them between the Linux and Hurd AArch64 ports. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240103171502.1358371-13-bugaevc@gmail.com>	2024-01-04 23:48:54 +01:00
Sergey Bugaev	fbfe0b20ab	hurd: Initialize _dl_pagesize early in static builds We fetch __vm_page_size as the very first RPC that we do, inside __mach_init (). Propagate that to _dl_pagesize ASAP after that, before any other initialization. In dynamic builds, this is already done immediately after __mach_init (), inside _dl_sysdep_start (). Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240103171502.1358371-12-bugaevc@gmail.com>	2024-01-04 23:48:36 +01:00
Sergey Bugaev	4145de65f6	hurd: Only init early static TLS if it's used to store stack or pointer guards This is the case on both x86 architectures, but not on AArch64. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240103171502.1358371-11-bugaevc@gmail.com>	2024-01-04 23:48:23 +01:00
Sergey Bugaev	9eaa0e1799	hurd: Make init-first.c no longer x86-specific This will make it usable in other ports. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240103171502.1358371-10-bugaevc@gmail.com>	2024-01-04 23:48:07 +01:00
Sergey Bugaev	b44ad8944b	hurd: Drop x86-specific assembly from init-first.c We already have the RETURN_TO macro for this exact use case, and it's already used in the non-static code path. Use it here too. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240103171502.1358371-9-bugaevc@gmail.com>	2024-01-04 23:47:23 +01:00
Sergey Bugaev	24b707c166	hurd: Pass the data pointer to _hurd_stack_setup explicitly Instead of relying on the stack frame layout to figure out where the stack pointer was prior to the _hurd_stack_setup () call, just pass the pointer as an argument explicitly. This is less brittle and much more portable. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240103171502.1358371-8-bugaevc@gmail.com>	2024-01-04 23:47:03 +01:00
H.J. Lu	35694d3416	x86-64/cet: Check the restore token in longjmp setcontext and swapcontext put a restore token on the old shadow stack which is used to restore the target shadow stack when switching user contexts. When longjmp from a user context, the target shadow stack can be different from the current shadow stack and INCSSP can't be used to restore the shadow stack pointer to the target shadow stack. Update longjmp to search for a restore token. If found, use the token to restore the shadow stack pointer before using INCSSP to pop the shadow stack. Stop the token search and use INCSSP if the shadow stack entry value is the same as the current shadow stack pointer. It is a user error if there is a shadow stack switch without leaving a restore token on the old shadow stack. The only difference between __longjmp.S and __longjmp_chk.S is that __longjmp_chk.S has a check for invalid longjmp usages. Merge __longjmp.S and __longjmp_chk.S by adding the CHECK_INVALID_LONGJMP macro. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-01-04 13:38:26 -08:00
H.J. Lu	bbfb54930c	i386: Ignore --enable-cet Since shadow stack is only supported for x86-64, ignore --enable-cet for i386. Always setting $(enable-cet) for i386 to "no" to support ifneq ($(enable-cet),no) in x86 Makefiles. We can't use ifeq ($(enable-cet),yes) since $(enable-cet) can be "yes", "no" or "permissive". Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-04 06:08:55 -08:00
Sergey Bugaev	0d4a2f3576	mach: Drop SNARF_ARGS macro We're obtaining arguments from the stack differently, see init-first.c. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>	2024-01-03 21:59:55 +01:00
Sergey Bugaev	114de961e0	mach: Drop some unnecessary vm_param.h includes Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>	2024-01-03 21:59:54 +01:00
Sergey Bugaev	dac7c64065	hurd: Add some missing includes Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>	2024-01-03 21:59:54 +01:00
Joseph Myers	b34b46b880	Implement C23 <stdbit.h> C23 adds a header <stdbit.h> with various functions and type-generic macros for bit-manipulation of unsigned integers (plus macro defines related to endianness). Implement this header for glibc. The functions have both inline definitions in the header (referenced by macros defined in the header) and copies with external linkage in the library (which are implemented in terms of those macros to avoid duplication). They are documented in the glibc manual. Tests, as well as verifying results for various inputs (of both the macros and the out-of-line functions), verify the types of those results (which showed up a bug in an earlier version with the type-generic macro stdc_has_single_bit wrongly returning a promoted type), that the macros can be used at top level in a source file (so don't use ({})), that they evaluate their arguments exactly once, and that the macros for the type-specific functions have the expected implicit conversions to the relevant argument type. Jakub previously referred to -Wconversion warnings in type-generic macros, so I've included a test with -Wconversion (but the only warnings I saw and fixed from that test were actually in inline functions in the <stdbit.h> header - not anything coming from use of the type-generic macros themselves). This implementation of the type-generic macros does not handle unsigned __int128, or unsigned _BitInt types with a width other than that of a standard integer type (and C23 doesn't require the header to handle such types either). Support for those types, using the new type-generic built-in functions Jakub's added for GCC 14, can reasonably be added in a followup (along of course with associated tests). 
This implementation doesn't do anything special to handle C++, or have any tests of functionality in C++ beyond the existing tests that all headers can be compiled in C++ code; it's not clear exactly what form this header should take in C++, but probably not one using macros. DIS ballot comment AT-107 asks for the word "count" to be added to the names of the stdc_leading_zeros, stdc_leading_ones, stdc_trailing_zeros and stdc_trailing_ones functions and macros. I don't think it's likely to be accepted (accepting any technical comments would mean having an FDIS ballot), but if it is accepted at the WG14 meeting (22-26 January in Strasbourg, starting with DIS ballot comment handling) then there would still be time to update glibc for the renaming before the 2.39 release. The new functions and header are placed in the stdlib/ directory in glibc, rather than creating a new toplevel stdbit/ or putting them in string/ alongside ffs. Tested for x86_64 and x86.	2024-01-03 12:07:14 +00:00
Szabolcs Nagy	0c12c8c0cb	aarch64: Add longjmp test for SME Includes test for setcontext too. The test directly checks after longjmp if ZA got disabled and the ZA contents got saved following the lazy saving scheme. It does not use ACLE code to verify that gcc can interoperate with glibc. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-02 16:54:21 +00:00
Szabolcs Nagy	9d30e5cf96	aarch64: Add setcontext support for SME For the ZA lazy saving scheme to work, setcontext has to call __libc_arm_za_disable. Also fixes swapcontext which uses setcontext internally. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-02 15:43:30 +00:00
Szabolcs Nagy	a7373e457f	aarch64: Add longjmp support for SME For the ZA lazy saving scheme to work, longjmp has to call __libc_arm_za_disable. In ld.so we assume ZA is not used so longjmp does not need special support there. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-02 15:43:30 +00:00
Szabolcs Nagy	d3c32ae207	aarch64: Add SME runtime support The runtime support routines for the call ABI of the Scalable Matrix Extension (SME) are mostly in libgcc. Since libc.so cannot depend on libgcc_s.so have an implementation of __arm_za_disable in libc for libc internal use in longjmp and similar APIs. __libc_arm_za_disable follows the same PCS rules as __arm_za_disable, but it's a hidden symbol so it does not need variant PCS marking. Using __libc_fatal instead of abort because it can print a message and works in ld.so too. But for now we don't need SME routines in ld.so. To check the SME HWCAP in asm, we need the _dl_hwcap2 member offset in _rtld_global_ro in the shared libc.so, while in libc.a the _dl_hwcap2 object is accessed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-01-02 15:43:30 +00:00
H.J. Lu	b5dcccfb12	x86/cet: Add -fcf-protection=none before -fcf-protection=branch When shadow stack is enabled, some CET tests failed when compiled with GCC 14: FAIL: elf/tst-cet-legacy-4 FAIL: elf/tst-cet-legacy-5a FAIL: elf/tst-cet-legacy-6a which are caused by https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039 These tests use -fcf-protection -fcf-protection=branch and assume that -fcf-protection=branch will override -fcf-protection. But this GCC 14 commit: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=1c6231c05bdcca changed the -fcf-protection behavior such that -fcf-protection -fcf-protection=branch is treated the same as -fcf-protection Use -fcf-protection -fcf-protection=none -fcf-protection=branch as the workaround. This fixes BZ #31187. Tested with GCC 13 and GCC 14 on Intel Tiger Lake. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-01-01 15:53:52 -08:00
Paul Eggert	dff8da6b3e	Update copyright dates with scripts/update-copyrights	2024-01-01 10:53:40 -08:00
H.J. Lu	cf9481724b	x86/cet: Run some CET tests with shadow stack When CET is disabled by default, run some CET tests with shadow stack enabled using $ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK	2024-01-01 05:22:48 -08:00
H.J. Lu	55d63e7312	x86/cet: Don't set CET active by default Not all CET enabled applications and libraries have been properly tested in CET enabled environments. Some CET enabled applications or libraries will crash or misbehave when CET is enabled. Don't set CET active by default so that all applications and libraries will run normally regardless of whether CET is active or not. Shadow stack can be enabled by $ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK at run-time if shadow stack can be enabled by kernel. NB: This commit can be reverted if it is OK to enable CET by default for all applications and libraries.	2024-01-01 05:22:48 -08:00
H.J. Lu	d360dcc001	x86/cet: Check feature_1 in TCB for active IBT and SHSTK Initially, IBT and SHSTK are marked as active when the CPU supports them and CET is enabled in glibc. They can be disabled early by tunables before relocation. Since after relocation, GLRO(dl_x86_cpu_features) becomes read-only, we can't update GLRO(dl_x86_cpu_features) to mark IBT and SHSTK as inactive. Instead, check the feature_1 field in TCB to decide if IBT and SHSTK are active.	2024-01-01 05:22:48 -08:00
H.J. Lu	541641a3de	x86/cet: Enable shadow stack during startup Previously, CET was enabled by kernel before passing control to user space and the startup code must disable CET if applications or shared libraries aren't CET enabled. Since the current kernel only supports shadow stack and won't enable shadow stack before passing control to user space, we need to enable shadow stack during startup if the application and all shared libraries are shadow stack enabled. There is no need to disable shadow stack at startup. Shadow stack can only be enabled in a function which will never return. Otherwise, shadow stack will underflow at the function return. 1. GL(dl_x86_feature_1) is set to the CET features which are supported by the processor and are not disabled by the tunable. Only non-zero features in GL(dl_x86_feature_1) should be enabled. After enabling shadow stack with ARCH_SHSTK_ENABLE, ARCH_SHSTK_STATUS is used to check if shadow stack is really enabled. 2. Use ARCH_SHSTK_ENABLE in RTLD_START in dynamic executable. It is safe since RTLD_START never returns. 3. Call arch_prctl (ARCH_SHSTK_ENABLE) from ARCH_SETUP_TLS in static executable. Since the start function using ARCH_SETUP_TLS never returns, it is safe to enable shadow stack in ARCH_SETUP_TLS.	2024-01-01 05:22:48 -08:00
H.J. Lu	8d9f9c4460	elf: Always provide _dl_get_dl_main_map in libc.a Always provide _dl_get_dl_main_map in libc.a. It will be used by x86 to process PT_GNU_PROPERTY segment.	2024-01-01 05:22:48 -08:00
H.J. Lu	edb5e0c8f9	x86/cet: Sync with Linux kernel 6.6 shadow stack interface Sync with Linux kernel 6.6 shadow stack interface. Since only x86-64 is supported, i386 shadow stack codes are unchanged and CET shouldn't be enabled for i386. 1. When the shadow stack base in TCB is unset, the default shadow stack is in use. Use the current shadow stack pointer as the marker for the default shadow stack. It is used to identify if the current shadow stack is the same as the target shadow stack when switching ucontexts. If yes, INCSSP will be used to unwind shadow stack. Otherwise, shadow stack restore token will be used. 2. Allocate shadow stack with the map_shadow_stack syscall. Since there is no function to explicitly release ucontext, there is no place to release shadow stack allocated by map_shadow_stack in ucontext functions. Such shadow stacks will be leaked. 3. Rename arch_prctl CET commands to ARCH_SHSTK_XXX. 4. Rewrite the CET control functions with the current kernel shadow stack interface. Since CET is no longer enabled by kernel, a separate patch will enable shadow stack during startup.	2024-01-01 05:22:48 -08:00
Aurelien Jarno	6b32696116	RISC-V: Add support for dl_runtime_profile (BZ #31151) Code is mostly inspired by the LoongArch one, which has a similar ABI, with minor changes to support riscv32 and register differences. This fixes elf/tst-sprof-basic. This also fixes elf/tst-audit1, elf/tst-audit2 and elf/tst-audit8 with recent binutils snapshots when --enable-bind-now is used. Resolves: BZ #31151 Acked-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-12-30 11:00:10 +01:00
H.J. Lu	81be2a61da	x86-64: Fix the tcb field load for x32 [BZ #31185] _dl_tlsdesc_undefweak and _dl_tlsdesc_dynamic access the thread pointer via the tcb field in TCB: _dl_tlsdesc_undefweak: _CET_ENDBR movq 8(%rax), %rax subq %fs:0, %rax ret _dl_tlsdesc_dynamic: ... subq %fs:0, %rax movq -8(%rsp), %rdi ret Since the tcb field in TCB is a pointer, %fs:0 is a 32-bit location, not 64-bit. It should use "sub %fs:0, %RAX_LP" instead. Since _dl_tlsdesc_undefweak returns ptrdiff_t and _dl_make_tlsdesc_dynamic returns void *, RAX_LP is appropriate here for x32 and x86-64. This fixes BZ #31185.	2023-12-22 05:37:17 -08:00
H.J. Lu	3502440397	x86-64: Fix the dtv field load for x32 [BZ #31184] On x32, I got FAIL: elf/tst-tlsgap $ gdb elf/tst-tlsgap ... open tst-tlsgap-mod1.so Thread 2 "tst-tlsgap" received signal SIGSEGV, Segmentation fault. [Switching to LWP 2268754] _dl_tlsdesc_dynamic () at ../sysdeps/x86_64/dl-tlsdesc.S:108 108 movq (%rsi), %rax (gdb) p/x $rsi $4 = 0xf7dbf9005655fb18 (gdb) This is caused by _dl_tlsdesc_dynamic: _CET_ENDBR /* Preserve call-clobbered registers that we modify. We need two scratch regs anyway. */ movq %rsi, -16(%rsp) movq %fs:DTV_OFFSET, %rsi Since the dtv field in TCB is a pointer, %fs:DTV_OFFSET is a 32-bit location, not 64-bit. Load the dtv field to RSI_LP instead of rsi. This fixes BZ #31184.	2023-12-22 05:37:00 -08:00
H.J. Lu	41560a9312	x86/cet: Don't disable CET if not single threaded In permissive mode, don't disable IBT nor SHSTK when dlopening a legacy shared library if not single threaded since IBT and SHSTK may be still enabled in other threads. Other threads with IBT or SHSTK enabled will crash when calling functions in the legacy shared library. Instead, an error will be issued.	2023-12-20 05:03:37 -08:00
H.J. Lu	c04035809a	x86: Modularize sysdeps/x86/dl-cet.c Improve readability and make maintenance easier for dl-feature.c by modularizing sysdeps/x86/dl-cet.c: 1. Support processors with: a. Only IBT. Or b. Only SHSTK. Or c. Both IBT and SHSTK. 2. Lock CET features only if IBT or SHSTK are enabled and are not enabled permissively.	2023-12-20 04:57:55 -08:00
H.J. Lu	1a23b39f9d	x86/cet: Update tst-cet-vfork-1 Change tst-cet-vfork-1.c to verify that vfork child return triggers SIGSEGV due to shadow stack mismatch.	2023-12-20 04:57:21 -08:00
Joe Ramsay	667f277c78	aarch64: Add SIMD attributes to math functions with vector versions Added annotations for autovec by GCC and GFortran - this enables GCC >= 9 to autovectorise math calls at -Ofast. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-12-20 08:41:25 +00:00
Joe Ramsay	cc0d77ba94	aarch64: Add half-width versions of AdvSIMD f32 libmvec routines Compilers may emit calls to 'half-width' routines (two-lane single-precision variants). These have been added in the form of wrappers around the full-width versions, where the low half of the vector is simply duplicated. This will perform poorly when one lane triggers the special-case handler, as there will be a redundant call to the scalar version; however, this is expected to be rare at -Ofast. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-12-20 08:41:25 +00:00
H.J. Lu	50bef9bd63	Fix elf: Do not duplicate the GLIBC_TUNABLES string commit `2a969b53c0` Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Wed Dec 6 10:24:01 2023 -0300 elf: Do not duplicate the GLIBC_TUNABLES string has @@ -38,7 +39,7 @@ which isn't available. */ #define CHECK_GLIBC_IFUNC_PREFERRED_OFF(f, cpu_features, name, len) \ _Static_assert (sizeof (#name) - 1 == len, #name " != " #len); \ - if (memcmp (f, #name, len) == 0) \ + if (tunable_str_comma_strcmp_cte (&f, #name) == 0) \ { \ cpu_features->preferred[index_arch_##name] \ &= ~bit_arch_##name; \ @@ -46,12 +47,11 @@ Fix it by removing "== 0" after tunable_str_comma_strcmp_cte.	2023-12-19 16:01:33 -08:00
H.J. Lu	cad5703e4f	Fix elf: Do not duplicate the GLIBC_TUNABLES string Fix issues in sysdeps/x86/tst-hwcap-tunables.c added by Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Wed Dec 6 10:24:01 2023 -0300 elf: Do not duplicate the GLIBC_TUNABLES string 1. -AVX,-AVX2,-AVX512F should be used to disable AVX, AVX2 and AVX512. 2. AVX512 IFUNC functions check AVX512VL. -AVX512VL should be added to disable these functions. This fixed: FAIL: elf/tst-hwcap-tunables ... [0] Spawned test for -Prefer_ERMS,-Prefer_FSRM,-AVX,-AVX2,-AVX_Usable,-AVX2_Usable,-AVX512F_Usable,-SSE4_1,-SSE4_2,-SSSE3,-Fast_Unaligned_Load,-ERMS,-AVX_Fast_Unaligned_Load error: subprocess failed: tst-tunables error: unexpected output from subprocess ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false [1] Spawned test for ,-,-Prefer_ERMS,-Prefer_FSRM,-AVX,-AVX2,-AVX_Usable,-AVX2_Usable,-AVX512F_Usable,-SSE4_1,-SSE4_2,,-SSSE3,-Fast_Unaligned_Load,,-,-ERMS,-AVX_Fast_Unaligned_Load,-, error: subprocess failed: tst-tunables error: unexpected output from subprocess ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 
(0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false ../sysdeps/x86/tst-hwcap-tunables.c:91: numeric comparison failure left: 1 (0x1); from: impls[i].usable right: 0 (0x0); from: false error: 2 test failures on Intel Tiger Lake.	2023-12-19 13:34:14 -08:00
Bruno Haible	d082930272	hppa: Fix undefined behaviour in feclearexcept (BZ 30983) The expression (excepts & FE_ALL_EXCEPT) << 27 produces a signed integer overflow when 'excepts' is specified as FE_INVALID (= 0x10), because - excepts is of type 'int', - FE_ALL_EXCEPT is of type 'int', - thus (excepts & FE_ALL_EXCEPT) is (int) 0x10, - 'int' is 32 bits wide. The patched code produces the same instruction sequence as previously. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:38 -03:00
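The overflow described in the hppa entry above can be illustrated in portable C. This is a sketch, not the patched glibc code: the `DEMO_` macros and function names are hypothetical, the value 0x10 comes from the commit message, the five-bit mask 0x1f is an assumption, and `int` is assumed to be 32 bits wide as stated in the message.

```c
#include <limits.h>
#include <stdint.h>

#define DEMO_FE_INVALID    0x10  /* value quoted in the commit message */
#define DEMO_FE_ALL_EXCEPT 0x1f  /* assumption: five exception flag bits */

/* Return nonzero if evaluating (excepts & mask) << 27 in type 'int'
   would overflow.  The check itself performs no shift of the value,
   so it does not trigger the undefined behaviour it detects.  */
int
signed_shift_overflows (int excepts)
{
  int masked = excepts & DEMO_FE_ALL_EXCEPT;
  return masked > (INT_MAX >> 27);
}

/* Well-defined variant: convert to an unsigned type before shifting,
   so the top bit can be set without signed overflow.  */
uint32_t
shifted_flags (int excepts)
{
  return (uint32_t) (excepts & DEMO_FE_ALL_EXCEPT) << 27;
}
```

Only the 0x10 bit is affected: shifting it left by 27 sets bit 31, which is not representable as a positive 32-bit `int`, whereas the lower four flag bits stay below the sign bit.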
Bruno Haible	80a40a9e14	alpha: Fix fesetexceptflag (BZ 30998) It clears some exception flags that are outside the EXCEPTS argument. It fixes math/test-fexcept on qemu-user. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:38 -03:00
Adhemerval Zanella	802aef27b2	riscv: Fix feupdateenv with FE_DFL_ENV (BZ 31022) libc_feupdateenv_riscv should check for FE_DFL_ENV, similar to libc_fesetenv_riscv. Also extend the test-fenv.c to test feupdateenv. Checked on riscv under qemu-system. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:38 -03:00
Bruno Haible	787282dede	x86: Do not raise floating-point exception traps on fesetexceptflag (BZ 30990) According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). The flags can be set in the 387 unit or in the SSE unit. When we need to clear a flag, we need to do so in both units, due to the way fetestexcept is implemented. When we need to set a flag, it is sufficient to do it in the SSE unit, because that is guaranteed to not trap. However, on i386 CPUs that have only a 387 unit, set the flags in the 387, as long as this cannot trap. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:38 -03:00
Adhemerval Zanella	47a9eeb9ba	i686: Do not raise exception traps on fesetexcept (BZ 30989) According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). The flags can be set in the 387 unit or in the SSE unit. To set a flag, it is sufficient to do it in the SSE unit, because that is guaranteed to not trap. However, on i386 CPUs that have only a 387 unit, set the flags in the 387, as long as this cannot trap. Checked on i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:38 -03:00
Adhemerval Zanella	ecb1e7220d	powerpc: Do not raise exception traps for fesetexcept/fesetexceptflag (BZ 30988) According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). This is a side-effect of how we implement the GNU extension feenableexcept, where feenableexcept/fesetenv/fesetmode/feupdateenv might issue prctl (PR_SET_FPEXC, PR_FP_EXC_PRECISE) depending on the argument. And on PR_FP_EXC_PRECISE, setting a floating-point exception flag triggers a trap. To make both functions follow C23, fesetexcept and fesetexceptflag now fail if the argument may trigger a trap. The math tests now check for a value different than 0, instead of bailing out as unsupported for EXCEPTION_SET_FORCES_TRAP. Checked on powerpc64le-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2023-12-19 15:12:34 -03:00
Adhemerval Zanella	2a969b53c0	elf: Do not duplicate the GLIBC_TUNABLES string The tunable parsing duplicates the tunable environment variable so it null-terminates each one since it simplifies the later parsing. It has the drawback of adding another point of failure (__minimal_malloc failing), and the memory copy requires tuning the compiler to avoid mem operations calls. The parsing now tracks the tunable start and its size. The dl-tunable-parse.h adds helper functions to help parsing, like a strcmp that also checks for size and an iterator for suboptions that are comma-separated (used on hwcap parsing by x86, powerpc, and s390x). Since the environment variable is allocated on the stack by the kernel, it is safe to keep the references to the suboptions for later parsing of string tunables (as done by set_hwcaps by multiple architectures). Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-12-19 13:25:45 -03:00
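The idea of tracking each suboption by start pointer and size, instead of copying and null-terminating the string, can be sketched as follows. The names `struct subopt`, `next_subopt` and `subopt_equals` are hypothetical and are not the helpers in dl-tunable-parse.h; the sketch only shows a comma-separated iterator and a size-aware comparison over an unmodified buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one suboption inside the original string.  */
struct subopt
{
  const char *start;
  size_t len;
};

/* Hypothetical iterator: store the next comma-separated item of
   [*cursor, end) in OUT and advance *cursor.  Returns false once
   all items have been consumed.  The input is never modified.  */
bool
next_subopt (const char **cursor, const char *end, struct subopt *out)
{
  const char *p = *cursor;
  if (p == NULL)
    return false;
  const char *comma = memchr (p, ',', (size_t) (end - p));
  out->start = p;
  out->len = (size_t) ((comma != NULL ? comma : end) - p);
  /* A NULL cursor marks the end of the iteration.  */
  *cursor = comma != NULL ? comma + 1 : NULL;
  return true;
}

/* Size-aware comparison: true only when the item has exactly the
   length of NAME and the same bytes, so "-AVX" does not match
   "-AVX2".  */
bool
subopt_equals (const struct subopt *s, const char *name)
{
  size_t n = strlen (name);
  return s->len == n && memcmp (s->start, name, n) == 0;
}
```

Because each item is only a pointer and a length into the caller's buffer, no allocation is needed, and empty items (as in "a,,b") show up as zero-length entries that the caller can skip.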
Joseph Myers	5275fc784c	Do not build sparc32 libgcc functions into static libc Since GCC commit f31a019d1161ec78846473da743aedf49cca8c27 "Emit funcall external declarations only if actually used.", the glibc testsuite has failed to build for 32-bit SPARC with GCC mainline. /scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/../../../../sparc64-glibc-linux-gnu/bin/ld: /scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/32/libgcc.a(_divsi3.o): in function `.div': /scratch/jmyers/glibc-bot/src/gcc/libgcc/config/sparc/lb1spc.S:138: multiple definition of `.div'; /scratch/jmyers/glibc-bot/build/glibcs/sparcv9-linux-gnu/glibc/libc.a(sdiv.o):/scratch/jmyers/glibc-bot/src/glibc/gnulib/../sysdeps/sparc/sparc32/sparcv9/sdiv.S:13: first defined here /scratch/jmyers/glibc-bot/install/compilers/sparc64-linux-gnu/lib/gcc/sparc64-glibc-linux-gnu/14.0.0/../../../../sparc64-glibc-linux-gnu/bin/ld: disabling relaxation; it will not work with multiple definitions collect2: error: ld returned 1 exit status make[3]: *** [../Rules:298: /scratch/jmyers/glibc-bot/build/glibcs/sparcv9-linux-gnu/glibc/nptl/tst-cancel24-static] Error 1 https://sourceware.org/pipermail/libc-testresults/2023q4/012154.html I'm not sure of the exact sequence of undefined references that cause first the glibc object file defining .div and then the libgcc object file defining both .div and .udiv to be pulled in (which must have been perturbed by that GCC change in a way that introduced the build failure), but I think the failure illustrates that it's inherently fragile for glibc to define symbols in separate object files that libgcc defines in the same object file - and indeed for glibc to redefine libgcc symbols at all, since the division into object files shouldn't really be part of the interface between libgcc and libc. 
These symbols appear to be in libc only for compatibility, maybe one of the cases where they were accidentally exported from shared libc in glibc 2.0 before the introduction of symbol versioning and so programs started expecting shared libc to provide them. Thus, there is no need to have them in static libc. Add this set of libgcc functions to shared-only-routines so they are no longer provided in static libc. (No change is made regarding .mul - dotmul source file - since unlike the other symbols in this grouping, it doesn't actually appear to be a libgcc symbol, at least in current GCC.) Tested with build-many-glibcs.py for sparcv9-linux-gnu with GCC mainline.	2023-12-19 16:00:11 +00:00
H.J. Lu	4d8a01d2b0	x86/cet: Check CPU_FEATURE_ACTIVE in permissive mode Verify that CPU_FEATURE_ACTIVE works properly in permissive mode.	2023-12-19 06:58:05 -08:00
H.J. Lu	28bd6f832d	x86/cet: Check legacy shadow stack code in .init_array section Verify that legacy shadow stack code in .init_array section in application and shared library, which are marked as shadow stack enabled, will trigger segfault.	2023-12-19 06:57:47 -08:00
H.J. Lu	9424ce80c2	x86/cet: Add tests for GLIBC_TUNABLES=glibc.cpu.hwcaps=-SHSTK Verify that GLIBC_TUNABLES=glibc.cpu.hwcaps=-SHSTK turns off shadow stack properly.	2023-12-19 06:57:39 -08:00
H.J. Lu	71c0cc3357	x86/cet: Check CPU_FEATURE_ACTIVE when CET is disabled Verify that CPU_FEATURE_ACTIVE (SHSTK) works properly when CET is disabled.	2023-12-19 06:57:32 -08:00
H.J. Lu	f418fe6f97	x86/cet: Check legacy shadow stack applications Add tests to verify that legacy shadow stack applications run properly when shadow stack is enabled in Linux kernel.	2023-12-19 06:57:27 -08:00
Stefan Liebler	664f565f9c	s390: Set psw addr field in getcontext and friends. So far if the ucontext structure was obtained by getcontext and co, the return address was stored in general purpose register 14 as it is defined as return address in the ABI. In contrast, the context passed to a signal handler contains the address in psw.addr field. If somebody e.g. wants to dump the address of the context, the origin needs to be known. Now this patch adjusts getcontext and friends and stores the return address also in psw.addr field. Note that setcontext isn't adjusted and it is not supported to pass a ucontext structure from signal-handler to setcontext. We are not able to restore all registers and branching to psw.addr without clobbering one register.	2023-12-19 11:00:19 +01:00
Matthew Sterrett	e957308723	x86: Unifies 'strlen-evex' and 'strlen-evex512' implementations. This commit uses a common implementation 'strlen-evex-base.S' for both 'strlen-evex' and 'strlen-evex512' The motivation is to reduce the number of implementations to maintain. This incidentally gives a small performance improvement. All tests pass on x86. Benchmarks were taken on SKX. https://www.intel.com/content/www/us/en/products/sku/123613/intel-core-i97900x-xseries-processor-13-75m-cache-up-to-4-30-ghz/specifications.html Geometric mean for strlen-evex512 over all benchmarks (N=10) was (new/old) 0.939 Geometric mean for wcslen-evex512 over all benchmarks (N=10) was (new/old) 0.965 Code Size Changes: strlen-evex512.S : +24 bytes wcslen-evex512.S : +54 bytes Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2023-12-18 12:38:01 -06:00
H.J. Lu	442983319b	x86/cet: Don't assume that SHSTK implies IBT Since shadow stack (SHSTK) is enabled in the Linux kernel without enabling indirect branch tracking (IBT), don't assume that SHSTK implies IBT. Use "CPU_FEATURE_ACTIVE (IBT)" to check if IBT is active and "CPU_FEATURE_ACTIVE (SHSTK)" to check if SHSTK is active.	2023-12-18 07:04:18 -08:00
H.J. Lu	0b850186fd	x86/cet: Check user_shstk in /proc/cpuinfo Linux kernel reports CPU shadow stack feature in /proc/cpuinfo as user_shstk, instead of shstk.	2023-12-17 10:42:06 -08:00
Manjunath Matti	93a739d4a1	powerpc: Add space for HWCAP3/HWCAP4 in the TCB for future Power. This patch reserves space for HWCAP3/HWCAP4 in the TCB of powerpc. These hardware capabilities bits will be used by future Power architectures. Versioned symbol '__parse_hwcap_3_4_and_convert_at_platform' advertises the availability of the new HWCAP3/HWCAP4 data in the TCB. This is an ABI change for GLIBC 2.39. Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2023-12-15 20:20:14 -06:00
Amrita H S	90bcc8721e	powerpc: Fix performance issues of strcmp power10 Current implementation of strcmp for power10 has performance regression for multiple small sizes and alignment combination. Most of these performance issues are fixed by this patch. The compare loop is unrolled and page crosses of unrolled loop is handled. Thanks to Paul E. Murphy for helping in fixing the performance issues. Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com> Co-Authored-By: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>	2023-12-15 16:42:40 -06:00
MAHESH BODAPATI	b9182c793c	powerpc : Add optimized memchr for POWER10 Optimized memchr for POWER10 based on existing rawmemchr and strlen. Reordering instructions and loop unrolling helped in getting better performance. Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>	2023-12-14 14:40:14 -06:00
H.J. Lu	4753e92868	x86: Check PT_GNU_PROPERTY early The PT_GNU_PROPERTY segment is scanned before PT_NOTE. For binaries with the PT_GNU_PROPERTY segment, we can check it to avoid scan of the PT_NOTE segment. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2023-12-11 11:05:46 -08:00
H.J. Lu	7e03e0de7e	sysdeps/x86/Makefile: Split and sort tests Put each test on a separate line and sort tests.	2023-12-11 08:49:57 -08:00
Amrita H S	3367d8e180	powerpc: Optimized strcmp for power10 This patch is based on __strcmp_power9 and __strlen_power10. Improvements from __strcmp_power9: 1. Uses new POWER10 instructions - This code uses lxvp to decrease contention on load by loading 32 bytes per instruction. 2. Performance implication - This version has around 30% better performance on average. - Performance regression is seen for a specific combination of sizes and alignments. Some of them is observed without changes also, while rest may be induced by the patch. Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com> Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2023-12-07 11:10:40 -06:00
Adhemerval Zanella	61d848b554	elf: Ignore LD_BIND_NOW and LD_BIND_NOT for setuid binaries To avoid any environment variable to change setuid binaries semantics. Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-12-05 13:21:36 -03:00
Adhemerval Zanella	876a12e513	elf: Ignore loader debug env vars for setuid Loader already ignores LD_DEBUG, LD_DEBUG_OUTPUT, and LD_TRACE_LOADED_OBJECTS. Both LD_WARN and LD_VERBOSE are similar to LD_DEBUG, in the sense they enable additional checks and debug information, so it makes sense to disable them. Also add both LD_VERBOSE and LD_WARN on filtered environment variables for setuid binaries. Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-12-05 13:21:36 -03:00
Andreas Schwab	3f79842788	aarch64: correct CFI in rawmemchr (bug 31113) The .cfi_return_column directive changes the return column for the whole FDE range. But the actual intent is to tell the unwinder that the value in x30 (lr) now resides in x15 after the move, and that is expressed by the .cfi_register directive.	2023-12-05 12:49:37 +01:00
Joe Ramsay	63d0a35d5f	math: Add new exp10 implementation New implementation is based on the existing exp/exp2, with different reduction constants and polynomial. Worst-case error in round-to-nearest is 0.513 ULP. The exp/exp2 shared table is reused for exp10 - .rodata size of e_exp_data increases by 64 bytes. As for exp/exp2, targets with single-instruction rounding/conversion intrinsics can use them by toggling TOINT_INTRINSICS=1 and adding the necessary code to their math_private.h. Improvements on Neoverse V1 compared to current GLIBC master: exp10 thruput: 3.3x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8] exp10 latency: 1.8x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8] Tested on: aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-12-04 15:52:11 +00:00
Szabolcs Nagy	8e755f5bc8	aarch64: fix tested ifunc variants Don't test a64fx string functions when BTI is enabled since they are not BTI compatible.	2023-12-04 14:41:26 +00:00
Samuel Thibault	2fb85a3787	hurd: [!__USE_MISC] Do not #undef BSD macros in ioctls When e.g. including termios.h first and then sys/ioctl.h, without e.g. _BSD_SOURCE, the latter would #undef e.g. ECHO, without defining it.	2023-12-02 21:26:50 +01:00
Adhemerval Zanella	4e16d89866	linux: Make fdopendir fail with O_PATH (BZ 30373) It is not strictly required by the POSIX, since O_PATH is a Linux extension, but it is QoI to fail early instead of at readdir. Also the check is free, since fdopendir already checks if the file descriptor is opened for read. Checked on x86_64-linux-gnu.	2023-11-30 13:37:04 -03:00
Stefan Liebler	807849965b	Avoid padding in _init and _fini. [BZ #31042 ] The linker just concatenates the .init and .fini sections which results in the complete _init and _fini functions. If needed the linker adds padding bytes due to an alignment. GNU ld is adding NOPs, which is fine. But e.g. mold is adding traps which results in broken _init and _fini functions. Thus this patch removes the alignment in .init and .fini sections in crtn.S files. We keep the 4 byte function alignment in crti.S files. As the assembler now also outputs the start of _init and _fini functions as multiples of 4 byte, it perhaps has to fill it. Although GNU as is using NOPs here, to be sure, we just keep the alignment with 0x07 (=NOPs) at the end of crti.S. In order to avoid an obvious NOP slide in _fini, this patch also uses an lg instead of lgr instruction. Then the emitted instructions needs a multiple of 4 bytes.	2023-11-30 13:31:23 +01:00
Joe Ramsay	7b12776584	aarch64: Improve special-case handling in AdvSIMD double-precision libmvec routines Avoids emitting many saves/restores of vector registers, reduces the amount of code generated around the scalar fallback.	2023-11-29 15:03:36 +00:00
Noah Goldstein	9469261cf1	x86: Only align destination to 1x VEC_SIZE in memset 4x loop Current code aligns to 2x VEC_SIZE. Aligning to 2x has no effect on performance other than potentially resulting in an additional iteration of the loop. 1x maintains aligned stores (the only reason to align in this case) and doesn't incur any unnecessary loop iterations. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>	2023-11-28 12:06:19 -06:00
Tobias Klauser	06bbe63e36	Add TCP_MD5SIG_FLAG_IFINDEX from Linux 5.6 to netinet/tcp.h. This patch adds the TCP_MD5SIG_FLAG_IFINDEX constant from Linux 5.6 to sysdeps/gnu/netinet/tcp.h and updates struct tcp_md5sig accordingly to contain the device index. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2023-11-28 13:44:47 +01:00
Joseph Myers	2e0c0ff95c	Remove __access_noerrno A recent commit, apparently commit `6c6fce572f` "elf: Remove /etc/suid-debug support", resulted in localplt failures for i686-gnu and x86_64-gnu: Missing required PLT reference: ld.so: __access_noerrno After that commit, __access_noerrno is actually no longer used at all. So rather than just removing the localplt expectation for that symbol for Hurd, completely remove all definitions of and references to that symbol. Tested for x86_64, and with build-many-glibcs.py for i686-gnu and x86_64-gnu.	2023-11-23 19:01:32 +00:00
Adhemerval Zanella	472894d2cf	malloc: Use __get_nprocs on arena_get2 (BZ 30945) This restores the 2.33 semantics for arena_get2. It was changed by `11a02b035b` to avoid arena_get2 calling malloc (back when __get_nproc was refactored to use a scratch_buffer - `903bc7dcc2`). The __get_nproc was refactored since then and now it also avoids calling malloc. The `11a02b035b` did not take into consideration any performance implication, which should have been discussed properly. The __get_nprocs_sched is still used as a fallback mechanism if procfs and sysfs are not accessible. Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2023-11-22 09:39:29 -03:00
Joe Ramsay	bd70d3bacf	aarch64: Fix libmvec benchmarks These were broken by the new atan2 functions, as they were only set up for univariate functions. Arity is now detected from the input file - this revealed a mistake that the double-precision inputs were being used for both single- and double-precision routines, which is now remedied.	2023-11-22 09:10:43 +00:00
Adhemerval Zanella	55f41ef8de	elf: Remove LD_PROFILE for static binaries The _dl_non_dynamic_init does not parse LD_PROFILE, which does not enable profile for dlopen objects. Since dlopen is deprecated for static objects, it is better to remove the support. It also allows to trim down libc.a of profile support. Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-11-21 16:15:42 -03:00
Adhemerval Zanella	1c87f71a36	s390: Use dl-symbol-redir-ifunc.h on cpu-tunables Using the memcmp symbol directly allows the compiler to inline the memcmp calls (especially because _dl_tunable_set_hwcaps uses constant values), generating better code. Checked with tst-tunables on s390x-linux-gnu (qemu system). Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-11-21 16:15:42 -03:00
Adhemerval Zanella	4862d546c0	x86: Use dl-symbol-redir-ifunc.h on cpu-tunables The dl-symbol-redir-ifunc.h redirects compiler-generated libcalls to arch-specific memory implementations to avoid ifunc calls where it is not yet possible. The memcmp-isa-default-impl.h aims to fix the same issue by calling the specific memcmp implementation directly. Using the memcmp symbol directly allows the compiler to inline the memcmp calls (especially because _dl_tunable_set_hwcaps uses constant values), generating better code. Checked on x86_64-linux-gnu. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-11-21 16:15:42 -03:00
Adhemerval Zanella	434eca873f	elf: Fix _dl_debug_vdprintf to work before self-relocation The strlen might trigger an invalid GOT entry if it is used before the process is self-relocated (for instance on dl-tunables if any error occurs). For i386, _dl_writev with PIE requires the use of the old 'int $0x80' syscall mode because the TLS register (gs) is not yet initialized. Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-11-21 16:15:42 -03:00
Adhemerval Zanella	11f7e3dd8f	elf: Add all malloc tunable to unsecvars Some environment variables allow alteration of allocator behavior across setuid boundaries, where a setuid program may ignore the tunable, but its non-setuid child can read it and adjust the memory allocator behavior accordingly. Most library behavior tuning is limited to the current process and does not bleed in scope; so it is unclear how practical this misfeature is. If behavior change across privilege boundaries is desirable, it would be better done with a wrapper program around the non-setuid child that sets these envvars, instead of using the setuid process as the messenger. The patch also fixes tst-env-setuid, where it fails if any unsecvars is set. It also adds a dynamic test, although it requires --enable-hardcoded-path-in-tests so the kernel correctly sets the setuid bit (using the loader command directly would require setting the setuid bit on the loader itself, which is not a usual deployment). Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2023-11-21 16:15:42 -03:00
Adhemerval Zanella	9c96c87d60	elf: Ignore GLIBC_TUNABLES for setuid/setgid binaries The tunable privilege levels were a retrofit to try and keep the malloc tunable environment variables' behavior unchanged across security boundaries. However, CVE-2023-4911 shows how tricky tunable parsing can be in a security-sensitive environment. Not only parsing, but the malloc tunable essentially changes some semantics on setuid/setgid processes. Although it is not a direct security issue, allowing users to change setuid/setgid semantics is not a good security practice, and requires extra code and analysis to check if each tunable is safe to use on all security boundaries. It also means that security opt-in features, like aarch64 MTE, would need to be explicitly enabled by an administrator with a wrapper script or with a possible future system-wide tunable setting. Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2023-11-21 16:15:42 -03:00
Adhemerval Zanella	a72a4eb10b	elf: Add GLIBC_TUNABLES to unsecvars setuid/setgid process now ignores any glibc tunables, and filters out all environment variables that might change its behavior. This patch also adds GLIBC_TUNABLES, so any process spawned by setuid/setgid processes should set tunables explicitly. Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2023-11-21 16:15:42 -03:00
Samuel Thibault	49b308a26e	hurd: Prevent the final file_exec_paths call from signals Otherwise if the exec server started thrashing the old task, we won't be able to restart the exec. This notably fixes building ghc.	2023-11-20 23:28:16 +01:00
Joe Ramsay	a8830c9285	aarch64: Add vector implementations of expm1 routines May discard sign of 0 - auto tests for -0 and -0x1p-10000 updated accordingly.	2023-11-20 17:53:14 +00:00
Adhemerval Zanella	65341f7bbe	linux: Use fchmodat2 on fchmod for flags different than 0 (BZ 26401) Linux 6.6 (09da082b07bbae1c) added support for fchmodat2, which has similar semantics as fchmodat with an extra flag argument. This allows fchmodat to implement AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH without the need for procfs. The syscall is registered on all architectures (with value of 452 except on alpha which is 562, commit 78252deb023cf087). The tst-lchmod.c requires a small fix where fchmodat checks two contradictory assertions ('(st.st_mode & 0777) == 2' and '(st.st_mode & 0777) == 3'). Checked on x86_64-linux-gnu on a 6.6 kernel. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2023-11-20 13:15:24 -03:00
Noah Goldstein	b7f8b6b64b	x86: Fix unchecked AVX512-VBMI2 usage in strrchr-evex-base.S strrchr-evex-base used `vpcompress{b|d}` in the page cross logic but was missing the CPU_FEATURE checks for VBMI2 in the ifunc/ifunc-impl-list. The fix is either to add those checks or change the logic to not use `vpcompress{b|d}`. Choosing the latter here so that the strrchr-evex implementation is usable on SKX. New implementation is a bit slower, but this is in a cold path so it's probably okay.	2023-11-15 11:09:44 -06:00
Andreas Larsson	578190b7e4	sparc: Fix broken memset for sparc32 [BZ #31068] Fixes commit `a61933fe27` ("sparc: Remove bzero optimization") that after moving code jumped to the wrong label 4. Verified by successfully running string/test-memset on sparc32. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2023-11-15 10:26:37 -03:00
Samuel Thibault	323f367cc4	hurd: Fix spawni returning allocation errors.	2023-11-14 23:55:35 +01:00
Wilco Dijkstra	2f5524cc53	AArch64: Remove Falkor memcpy The latest implementations of memcpy are actually faster than the Falkor implementations [1], so remove the falkor/phecda ifuncs for memcpy and the now unused IS_FALKOR/IS_PHECDA defines. [1] https://sourceware.org/pipermail/libc-alpha/2022-December/144227.html Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2023-11-13 16:52:50 +00:00
Wilco Dijkstra	3d7090f14b	AArch64: Add memset_zva64 Add a specialized memset for the common ZVA size of 64 to avoid the overhead of reading the ZVA size. Since the code is identical to __memset_falkor, remove the latter. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2023-11-13 16:50:44 +00:00
Wilco Dijkstra	9627ab99b5	AArch64: Cleanup emag memset Cleanup emag memset - merge the memset_base64.S file, remove the unused ZVA code (since it is disabled on emag). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2023-11-13 16:45:47 +00:00
Joe Ramsay	3548a4f087	aarch64: Add vector implementations of log1p routines May discard sign of zero.	2023-11-10 17:07:43 +00:00
Joe Ramsay	b07038c5d3	aarch64: Add vector implementations of atan2 routines	2023-11-10 17:07:43 +00:00
Joe Ramsay	d30c39f80d	aarch64: Add vector implementations of atan routines	2023-11-10 17:07:42 +00:00
Joe Ramsay	b5d23367a8	aarch64: Add vector implementations of acos routines	2023-11-10 17:07:42 +00:00
Joe Ramsay	9bed498418	aarch64: Add vector implementations of asin routines	2023-11-10 17:07:42 +00:00
Adhemerval Zanella	bf033c0072	elf: Add glibc.mem.decorate_maps tunable The PR_SET_VMA_ANON_NAME support is only enabled through a configurable kernel switch, mainly because assigning a name to an anonymous virtual memory area might prevent that area from being merged with adjacent virtual memory areas. For instance, with the following code: void *p1 = mmap (NULL, 1024 * 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); void *p2 = mmap (p1 + (1024 * 4096), 1024 * 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); The kernel will potentially merge both mappings resulting in only one segment of size 0x800000. If the segments are named with PR_SET_VMA_ANON_NAME with different names, it results in two mappings. Although this will unlikely be an issue for pthread stacks and malloc arenas (since for pthread stacks the guard page will result in a PROT_NONE segment, similar to the alignment requirement for the arena block), it still might prevent the mmap memory allocated for detail malloc. There is also another potential scalability issue, where the prctl requires taking the mmap global lock which is still not fully fixed in Linux [1] (for pthread stacks and arenas, it is mitigated by the stack cache and the arena reuse). So this patch disables anonymous mapping annotations by default and adds a new tunable, glibc.mem.decorate_maps, that can be used to enable it. [1] https://lwn.net/Articles/906852/ Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2023-11-07 10:27:57 -03:00
Adhemerval Zanella	f10ba2ab25	linux: Decorate __libc_fatal error buffer Reviewed-by: DJ Delorie <dj@redhat.com>	2023-11-07 10:27:53 -03:00
Adhemerval Zanella	78ed8bdf4f	linux: Add PR_SET_VMA_ANON_NAME support Linux 5.17 added support to naming anonymous virtual memory areas through the prctl syscall. The __set_vma_name is a wrapper to avoid optimizing the prctl call if the kernel does not support it. If the kernel does not support PR_SET_VMA_ANON_NAME, prctl returns EINVAL. And it also returns the same error for an invalid argument. Since it is an internal-only API, it assumes well-formatted input: aligned START, with (START, START+LEN) being a valid memory range, and NAME with a limit of 80 characters without an invalid one ("\\`$[]"). Reviewed-by: DJ Delorie <dj@redhat.com>	2023-11-07 10:27:20 -03:00
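The "well-formatted input" that the commit above assumes for NAME can be illustrated with a small validator. This is only a sketch under stated assumptions: `vma_name_is_valid` and `VMA_NAME_MAX_LEN` are hypothetical names, the 80-character limit is taken here to include the terminating NUL, and rejecting non-printable characters is an assumption about the kernel's check rather than something stated in the commit message.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the limit of 80 counts the terminating NUL, so at most
   79 visible characters are accepted.  */
#define VMA_NAME_MAX_LEN 80

/* Hypothetical check for a PR_SET_VMA_ANON_NAME name: short enough and
   free of the characters listed in the commit message.  */
static bool
vma_name_is_valid (const char *name)
{
  size_t len = strnlen (name, VMA_NAME_MAX_LEN);
  if (len >= VMA_NAME_MAX_LEN)
    return false;
  for (size_t i = 0; i < len; i++)
    {
      unsigned char ch = (unsigned char) name[i];
      /* Reject backslash, backtick, '$', '[' and ']'; also reject
         non-printable bytes (an assumption, see the lead-in).  */
      if (!isprint (ch) || strchr ("\\`$[]", ch) != NULL)
        return false;
    }
  return true;
}
```

Since an unsupported kernel and an invalid argument both produce EINVAL, a check like this would have to run before the prctl call to tell the two cases apart.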
Samuel Thibault	091ee2190d	hurd: statfsconv: Add missing f_ffree conversion	2023-11-07 12:51:25 +01:00
Flavio Cruz	5dd3bda59c	Update BAD_TYPECHECK to work on x86_64 Message-ID: <ZUhn7LOcgLOJjKZr@jupiter.tail36e24.ts.net>	2023-11-06 23:24:48 +01:00
Sergio Durigan Junior	f957f47df7	sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist [BZ #30789] When invoking sem_open with O_CREAT as one of its flags, we'll end up in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)", which means that we don't expect the semaphore file to exist. In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC" and there's an attempt to open(2) the file, which will likely fail because it won't exist. After that first (expected) failure, some cleanup is done and we go back to the label "try_again", which lives in the first part of the aforementioned "if". The problem is that, in that part of the code, we expect the semaphore file to exist, and as such O_CREAT (this time the flag we pass to open(2)) needs to be cleaned from open_flags, otherwise we'll see another failure (this time unexpected) when trying to open the file, which will lead the call to sem_open to fail as well. This can cause very strange bugs, especially with OpenMPI, which makes extensive use of semaphores. Fix the bug by simplifying the logic when choosing open(2) flags and making sure O_CREAT is not set when the semaphore file is expected to exist. A regression test for this issue would require complex and CPU-time-consuming logic, since triggering the wrong code path is not straightforward due to the race condition. There is a somewhat reliable reproducer in the bug, but it requires using OpenMPI. This resolves BZ #30789. See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912 Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net> Co-Authored-By: Simon Chopin <simon.chopin@canonical.com> Co-Authored-By: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Fixes: `533deafbdf` ("Use O_CLOEXEC in more places (BZ #15722)")	2023-11-03 15:19:38 -03:00
Joseph Myers	ac79930498	Add SEGV_CPERR from Linux 6.6 to bits/siginfo-consts.h Linux 6.6 adds the constant SEGV_CPERR. Add it to glibc's bits/siginfo-consts.h. Tested for x86_64.	2023-11-03 16:36:35 +00:00
Adhemerval Zanella	9b3cb0277e	linux: Add HWCAP2_HBC from Linux 6.6 to AArch64 bits/hwcap.h	2023-11-03 10:01:46 -03:00
Adhemerval Zanella	10b4c8b96f	linux: Add FSCONFIG_CMD_CREATE_EXCL from Linux 6.6 to sys/mount.h The tst-mount-consts.py does not need to be updated because kernel exports it as an enum (compare_macro_consts can not parse it).	2023-11-03 10:01:46 -03:00
Adhemerval Zanella	cb8c78b2ff	linux: Add MMAP_ABOVE4G from Linux 6.6 to sys/mman.h x86 added the flag (29f890d1050fc099f) for CET enabled. Also update tst-mman-consts.py test.	2023-11-03 10:01:46 -03:00
Adhemerval Zanella	f680063f30	Update kernel version to 6.6 in header constant tests There are no new constants covered, the tst-mman-consts.py is updated separately along with a header constant addition.	2023-11-03 10:01:46 -03:00
Adhemerval Zanella	582383b37d	Update syscall lists for Linux 6.6 Linux 6.6 has one new syscall for all architectures, fchmodat2, and the map_shadow_stack on x86_64.	2023-11-03 10:01:46 -03:00
Wilco Dijkstra	9fd3409842	AArch64: Cleanup ifuncs Cleanup ifuncs. Remove uses of libc_hidden_builtin_def, use ENTRY rather than ENTRY_ALIGN, remove unnecessary defines and conditional compilation. Rename strlen_mte to strlen_generic. Remove rtld-memset. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-11-01 13:41:59 +00:00
Arjun Shankar	9db31d7456	Use correct subdir when building tst-rfc3484* for mach and arm Commit `7f602256ab` moved the tst-rfc3484* tests from posix/ to nss/, but didn't correct references to point to their new subdir when building for mach and arm. This commit fixes that. Tested with build-many-glibcs.sh for i686-gnu.	2023-11-01 11:53:03 +01:00
Adhemerval Zanella	fccf38c517	string: Add internal memswap implementation The prototype is: void __memswap (void *restrict p1, void *restrict p2, size_t n) The function swaps the content of two memory blocks P1 and P2 of len N. Memory overlap is NOT handled. It will be used on qsort optimization. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2023-10-31 14:17:33 -03:00
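The behavior of the memswap routine described above (swap N bytes between two non-overlapping blocks) can be shown with a plain byte-wise loop. This is a minimal sketch named `memswap_sketch` to make clear that it is not the glibc-internal `__memswap`, which may well use wider accesses.

```c
#include <stddef.h>

/* Swap N bytes between the blocks at P1 and P2.  As with the routine
   in the commit, overlapping blocks are NOT handled; the restrict
   qualifiers state that precondition to the compiler.  */
static void
memswap_sketch (void *restrict p1, void *restrict p2, size_t n)
{
  unsigned char *a = p1;
  unsigned char *b = p2;
  while (n-- > 0)
    {
      unsigned char tmp = *a;
      *a++ = *b;
      *b++ = tmp;
    }
}
```

A sort routine can use such a helper to exchange two elements of arbitrary size without a temporary buffer as large as the element.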

16363 Commits