Commit Graph

12416 Commits

Author SHA1 Message Date
Alice Xu
3cee1ddad2 x86-64: Fix an unknown vector operation in memchr-evex.S
An unknown vector operation occurred in commit 2a76821c30. Fixed it
by using "ymm{k1}{z}" but not "ymm {k1} {z}".

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 6ea916adfa)
2022-01-27 16:25:58 -08:00
Noah Goldstein
cba2fbe17f x86: Optimize memchr-evex.S
No bug. This commit optimizes memchr-evex.S. The optimizations include
replacing some branches with cmovcc, avoiding some branches entirely
in the less_4x_vec case, making the page cross logic less strict,
saving some ALU in the alignment process, and most importantly
increasing ILP in the 4x loop. test-memchr, test-rawmemchr, and
test-wmemchr are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 2a76821c30)
2022-01-27 16:25:53 -08:00
Noah Goldstein
0a3b2efccc x86: Optimize strlen-avx2.S
No bug. This commit optimizes strlen-avx2.S. The optimizations are
mostly small things but they add up to roughly 10-30% performance
improvement for strlen. The results for strnlen are bit more
ambiguous. test-strlen, test-strnlen, test-wcslen, and test-wcsnlen
are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit aaa23c3507)
2022-01-27 16:25:48 -08:00
Noah Goldstein
c51e9501c2 x86: Fix overflow bug with wmemchr-sse2 and wmemchr-avx2 [BZ #27974]
This commit fixes the bug mentioned in the previous commit.

The previous implementations of wmemchr in these files relied
on n * sizeof(wchar_t) which was not guranteed by the standard.

The new overflow tests added in the previous commit now
pass (As well as all the other tests).

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 645a158978)
2022-01-27 16:25:42 -08:00
Noah Goldstein
5c38877df5 x86: Optimize memchr-avx2.S
No bug. This commit optimizes memchr-avx2.S. The optimizations include
replacing some branches with cmovcc, avoiding some branches entirely
in the less_4x_vec case, making the page cross logic less strict,
asaving a few instructions the in loop return loop. test-memchr,
test-rawmemchr, and test-wmemchr are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit acfd088a19)
2022-01-27 16:25:37 -08:00
H.J. Lu
3fc179e2b2 x86-64: Require BMI2 for __strlen_evex and __strnlen_evex
Since __strlen_evex and __strnlen_evex added by

commit 1fd8c163a8
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 5 06:24:52 2021 -0800

    x86-64: Add ifunc-avx2.h functions with 256-bit EVEX

use sarx:

c4 e2 6a f7 c0       	sarx   %edx,%eax,%eax

require BMI2 for __strlen_evex and __strnlen_evex in ifunc-impl-list.c.
ifunc-avx2.h already requires BMI2 for EVEX implementation.

(cherry picked from commit 55bf411b45)
2022-01-27 16:25:20 -08:00
Sunil K Pandey
aff5eec9ce x86-64: Fix ifdef indentation in strlen-evex.S
Fix some indentations of ifdef in file strlen-evex.S which are off by 1
and confusing to read.

(cherry picked from commit 595c22ecd8)
2022-01-27 12:07:11 -08:00
H.J. Lu
7056607e6a x86-64: Use ZMM16-ZMM31 in AVX512 memmove family functions
Update ifunc-memmove.h to select the function optimized with AVX512
instructions using ZMM16-ZMM31 registers to avoid RTM abort with usable
AVX512VL since VZEROUPPER isn't needed at function exit.

(cherry picked from commit e4fda46310)
2022-01-27 12:07:11 -08:00
H.J. Lu
a54fae8df4 x86-64: Use ZMM16-ZMM31 in AVX512 memset family functions
Update ifunc-memset.h/ifunc-wmemset.h to select the function optimized
with AVX512 instructions using ZMM16-ZMM31 registers to avoid RTM abort
with usable AVX512VL and AVX512BW since VZEROUPPER isn't needed at
function exit.

(cherry picked from commit 4e2d8f3527)
2022-01-27 12:07:11 -08:00
H.J. Lu
674137c9b0 x86: Add string/memory function tests in RTM region
At function exit, AVX optimized string/memory functions have VZEROUPPER
which triggers RTM abort.   When such functions are called inside a
transactionally executing RTM region, RTM abort causes severe performance
degradation.  Add tests to verify that string/memory functions won't
cause RTM abort in RTM region.

(cherry picked from commit 4bd660be40)
2022-01-27 12:07:11 -08:00
H.J. Lu
5be8a84721 x86-64: Add AVX optimized string/memory functions for RTM
Since VZEROUPPER triggers RTM abort while VZEROALL won't, select AVX
optimized string/memory functions with

	xtest
	jz	1f
	vzeroall
	ret
1:
	vzeroupper
	ret

at function exit on processors with usable RTM, but without 256-bit EVEX
instructions to avoid VZEROUPPER inside a transactionally executing RTM
region.

(cherry picked from commit 7ebba91361)
2022-01-27 12:07:11 -08:00
H.J. Lu
757d90ff37 x86-64: Add memcmp family functions with 256-bit EVEX
Update ifunc-memcmp.h to select the function optimized with 256-bit EVEX
instructions using YMM16-YMM31 registers to avoid RTM abort with usable
AVX512VL, AVX512BW and MOVBE since VZEROUPPER isn't needed at function
exit.

(cherry picked from commit 91264fe357)
2022-01-27 12:07:11 -08:00
H.J. Lu
9650d04fb6 x86-64: Add memset family functions with 256-bit EVEX
Update ifunc-memset.h/ifunc-wmemset.h to select the function optimized
with 256-bit EVEX instructions using YMM16-YMM31 registers to avoid RTM
abort with usable AVX512VL and AVX512BW since VZEROUPPER isn't needed at
function exit.

(cherry picked from commit 1b968b6b9b)
2022-01-27 12:07:11 -08:00
H.J. Lu
2f4a98ab17 x86-64: Add memmove family functions with 256-bit EVEX
Update ifunc-memmove.h to select the function optimized with 256-bit EVEX
instructions using YMM16-YMM31 registers to avoid RTM abort with usable
AVX512VL since VZEROUPPER isn't needed at function exit.

(cherry picked from commit 63ad43566f)
2022-01-27 12:07:11 -08:00
H.J. Lu
c90c70b8f9 x86-64: Add strcpy family functions with 256-bit EVEX
Update ifunc-strcpy.h to select the function optimized with 256-bit EVEX
instructions using YMM16-YMM31 registers to avoid RTM abort with usable
AVX512VL and AVX512BW since VZEROUPPER isn't needed at function exit.

(cherry picked from commit 525bc2a32c)
2022-01-27 12:07:11 -08:00
H.J. Lu
c671b231cf x86-64: Add ifunc-avx2.h functions with 256-bit EVEX
Update ifunc-avx2.h, strchr.c, strcmp.c, strncmp.c and wcsnlen.c to
select the function optimized with 256-bit EVEX instructions using
YMM16-YMM31 registers to avoid RTM abort with usable AVX512VL, AVX512BW
and BMI2 since VZEROUPPER isn't needed at function exit.

For strcmp/strncmp, prefer AVX2 strcmp/strncmp if Prefer_AVX2_STRCMP
is set.

(cherry picked from commit 1fd8c163a8)
2022-01-27 12:07:11 -08:00
H.J. Lu
be07b3e059 x86: Set Prefer_No_VZEROUPPER and add Prefer_AVX2_STRCMP
1. Set Prefer_No_VZEROUPPER if RTM is usable to avoid RTM abort triggered
by VZEROUPPER inside a transactionally executing RTM region.
2. Since to compare 2 32-byte strings, 256-bit EVEX strcmp requires 2
loads, 3 VPCMPs and 2 KORDs while AVX2 strcmp requires 1 load, 2 VPCMPEQs,
1 VPMINU and 1 VPMOVMSKB, AVX2 strcmp is faster than EVEX strcmp.  Add
Prefer_AVX2_STRCMP to prefer AVX2 strcmp family functions.

(cherry picked from commit 1da50d4bda)
2022-01-27 12:07:11 -08:00
Noah Goldstein
be6fb78a1f x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755]
Fixes [BZ# 28755] for wcsncmp by redirecting length >= 2^56 to
__wcscmp_avx2. For x86_64 this covers the entire address range so any
length larger could not possibly be used to bound `s1` or `s2`.

test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit ddf0992cf5)
2022-01-27 07:28:12 -08:00
H.J. Lu
1864775abc x86: Check IFUNC definition in unrelocated executable [BZ #20019]
Calling an IFUNC function defined in unrelocated executable also leads to
segfault.  Issue a fatal error message when calling IFUNC function defined
in the unrelocated executable from a shared library.

On x86, ifuncmain6pie failed with:

[hjl@gnu-cfl-2 build-i686-linux]$ ./elf/ifuncmain6pie --direct
./elf/ifuncmain6pie: IFUNC symbol 'foo' referenced in '/export/build/gnu/tools-build/glibc-32bit/build-i686-linux/elf/ifuncmod6.so' is defined in the executable and creates an unsatisfiable circular dependency.
[hjl@gnu-cfl-2 build-i686-linux]$ readelf -rW elf/ifuncmod6.so | grep foo
00003ff4  00000706 R_386_GLOB_DAT         0000400c   foo_ptr
00003ff8  00000406 R_386_GLOB_DAT         00000000   foo
0000400c  00000401 R_386_32               00000000   foo
[hjl@gnu-cfl-2 build-i686-linux]$

Remove non-JUMP_SLOT relocations against foo in ifuncmod6.so, which
trigger the circular IFUNC dependency, and build ifuncmain6pie with
-Wl,-z,lazy.

(cherry picked from commits 6ea5b57afa
 and 7137d682eb)
2021-01-13 14:30:42 -08:00
H.J. Lu
420ade1f64 x86: Set header.feature_1 in TCB for always-on CET [BZ #27177]
Update dl_cet_check() to set header.feature_1 in TCB when both IBT and
SHSTK are always on.

(cherry picked from commit 2ef23b5205)
2021-01-13 09:24:35 -08:00
H.J. Lu
8493ba72b1 x86-64: Avoid rep movsb with short distance [BZ #27130]
When copying with "rep movsb", if the distance between source and
destination is N*4GB + [1..63] with N >= 0, performance may be very
slow.  This patch updates memmove-vec-unaligned-erms.S for AVX and
AVX512 versions with the distance in RCX:

	cmpl	$63, %ecx
	// Don't use "rep movsb" if ECX <= 63
	jbe	L(Don't use rep movsb")
	Use "rep movsb"

Benchtests data with bench-memcpy, bench-memcpy-large, bench-memcpy-random
and bench-memcpy-walk on Skylake, Ice Lake and Tiger Lake show that its
performance impact is within noise range as "rep movsb" is only used for
data size >= 4KB.

(cherry picked from commit 3ec5d83d2a)
2021-01-12 07:01:34 -08:00
Szabolcs Nagy
a3c78954ee aarch64: Fix DT_AARCH64_VARIANT_PCS handling [BZ #26798]
The variant PCS support was ineffective because in the common case
linkmap->l_mach.plt == 0 but then the symbol table flags were ignored
and normal lazy binding was used instead of resolving the relocs early.
(This was a misunderstanding about how GOT[1] is setup by the linker.)

In practice this mainly affects SVE calls when the vector length is
more than 128 bits, then the top bits of the argument registers get
clobbered during lazy binding.

Fixes bug 26798.

(cherry picked from commit 558251bd87)
2020-11-04 12:26:02 +00:00
Wilco Dijkstra
28ff0f650c AArch64: Use __memcpy_simd on Neoverse N2/V1
Add CPU detection of Neoverse N2 and Neoverse V1, and select __memcpy_simd as
the memcpy/memmove ifunc.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
(cherry picked from commit e11ed9d2b4)
2020-10-14 14:31:24 +01:00
Wilco Dijkstra
be3eaffd5a [AArch64] Improve integer memcpy
Further optimize integer memcpy.  Small cases now include copies up
to 32 bytes.  64-128 byte copies are split into two cases to improve
performance of 64-96 byte copies.  Comments have been rewritten.

(cherry picked from commit 7000651327)
2020-10-12 18:29:42 +01:00
Krzysztof Koch
c969e84e0c aarch64: Increase small and medium cases for __memcpy_generic
Increase the upper bound on medium cases from 96 to 128 bytes.
Now, up to 128 bytes are copied unrolled.

Increase the upper bound on small cases from 16 to 32 bytes so that
copies of 17-32 bytes are not impacted by the larger medium case.

Benchmarking:
The attached figures show relative timing difference with respect
to 'memcpy_generic', which is the existing implementation.
'memcpy_med_128' denotes the the version of memcpy_generic with
only the medium case enlarged. The 'memcpy_med_128_small_32' numbers
are for the version of memcpy_generic submitted in this patch, which
has both medium and small cases enlarged. The figures were generated
using the script from:
https://www.sourceware.org/ml/libc-alpha/2019-10/msg00563.html

Depending on the platform, the performance improvement in the
bench-memcpy-random.c benchmark ranges from 6% to 20% between
the original and final version of memcpy.S

Tested against GLIBC testsuite and randomized tests.

(cherry picked from commit b9f145df85)
2020-10-12 18:29:42 +01:00
Wilco Dijkstra
53d501d6e9 AArch64: Rename IS_ARES to IS_NEOVERSE_N1
Rename IS_ARES to IS_NEOVERSE_N1 since that is a bit clearer.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 0f6278a879)
2020-10-12 18:29:38 +01:00
Wilco Dijkstra
64458aabeb AArch64: Improve backwards memmove performance
On some microarchitectures performance of the backwards memmove improves if
the stores use STR with decreasing addresses.  So change the memmove loop
in memcpy_advsimd.S to use 2x STR rather than STP.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit bd394d131c)
2020-10-12 18:28:42 +01:00
Wilco Dijkstra
58c6a7ae53 AArch64: Add optimized Q-register memcpy
Add a new memcpy using 128-bit Q registers - this is faster on modern
cores and reduces codesize.  Similar to the generic memcpy, small cases
include copies up to 32 bytes.  64-128 byte copies are split into two
cases to improve performance of 64-96 byte copies.  Large copies align
the source rather than the destination.

bench-memcpy-random is ~9% faster than memcpy_falkor on Neoverse N1,
so make this memcpy the default on N1 (on Centriq it is 15% faster than
memcpy_falkor).

Passes GLIBC regression tests.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 4a733bf375)
2020-10-12 18:28:34 +01:00
Wilco Dijkstra
2fb2098c24 AArch64: Align ENTRY to a cacheline
Given almost all uses of ENTRY are for string/memory functions,
align ENTRY to a cacheline to simplify things.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 34f0d01d5e)
2020-10-12 16:56:18 +01:00
Sunil K Pandey
0a4d3eac67 Fix avx2 strncmp offset compare condition check [BZ #25933]
strcmp-avx2.S: In avx2 strncmp function, strings are compared in
chunks of 4 vector size(i.e. 32x4=128 byte for avx2). After first 4
vector size comparison, code must check whether it already passed
the given offset. This patch implement avx2 offset check condition
for strncmp function, if both string compare same for first 4 vector
size.

(cherry picked from commit 75870237ff)
2020-07-04 09:53:28 -07:00
Andreas Schwab
a318448f7a Fix array overflow in backtrace on PowerPC (bug 25423)
When unwinding through a signal frame the backtrace function on PowerPC
didn't check array bounds when storing the frame address.  Fixes commit
d400dcac5e ("PowerPC: fix backtrace to handle signal trampolines").

(cherry picked from commit d937694059)
2020-03-18 13:41:55 -04:00
Florian Weimer
8e5d591b10 math/test-sinl-pseudo: Use stack protector only if available
This fixes commit 9333498794 ("Avoid ldbl-96 stack
corruption from range reduction of pseudo-zero (bug 25487).").

(cherry picked from commit c10acd4026)
2020-03-15 17:33:57 -04:00
Joseph Myers
0474cd5de6 Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487).
Bug 25487 reports stack corruption in ldbl-96 sinl on a pseudo-zero
argument (an representation where all the significand bits, including
the explicit high bit, are zero, but the exponent is not zero, which
is not a valid representation for the long double type).

Although this is not a valid long double representation, existing
practice in this area (see bug 4586, originally marked invalid but
subsequently fixed) is that we still seek to avoid invalid memory
accesses as a result, in case of programs that treat arbitrary binary
data as long double representations, although the invalid
representations of the ldbl-96 format do not need to be consistently
handled the same as any particular valid representation.

This patch makes the range reduction detect pseudo-zero and unnormal
representations that would otherwise go to __kernel_rem_pio2, and
returns a NaN for them instead of continuing with the range reduction
process.  (Pseudo-zero and unnormal representations whose unbiased
exponent is less than -1 have already been safely returned from the
function before this point without going through the rest of range
reduction.)  Pseudo-zero representations would previously result in
the value passed to __kernel_rem_pio2 being all-zero, which is
definitely unsafe; unnormal representations would previously result in
a value passed whose high bit is zero, which might well be unsafe
since that is not a form of input expected by __kernel_rem_pio2.

Tested for x86_64.

(cherry picked from commit 9333498794)
2020-03-15 17:33:26 -04:00
Florian Weimer
e0a0770bb4 riscv: Do not use __has_include__
The user-visible preprocessor construct is called __has_include.

(cherry picked from commit 28dd393922)
2020-01-21 13:42:47 +01:00
Florian Weimer
ea6f2c3174 misc/test-errno-linux: Handle EINVAL from quotactl
In commit 3dd4d40b420846dd35869ccc8f8627feef2cff32 ("xfs: Sanity check
flags of Q_XQUOTARM call"), Linux 5.4 added checking for the flags
argument, causing the test to fail due to too restrictive test
expectations.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 1f7525d924)
2019-12-05 17:30:40 +01:00
Marcin Kościelnicki
2626b15e88 rtld: Check __libc_enable_secure before honoring LD_PREFER_MAP_32BIT_EXEC (CVE-2019-19126) [BZ #25204]
The problem was introduced in glibc 2.23, in commit
b9eb92ab05
("Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT").

(cherry picked from commit d5dfad4326)
Change-Id: Ib782573b4623ee3edfa9f98ad62f69b9d8edcb27
2019-11-22 13:11:36 +01:00
Florian Weimer
845278f2c6 Linux: Use in-tree copy of SO_ constants for !__USE_MISC [BZ #24532]
The kernel changes for a 64-bit time_t on 32-bit architectures
resulted in <asm/socket.h> indirectly including <linux/posix_types.h>.
The latter is not namespace-clean for the POSIX version of
<sys/socket.h>.

This issue has persisted across several Linux releases, so this commit
creates our own copy of the SO_* definitions for !__USE_MISC mode.

The new test socket/tst-socket-consts ensures that the copy is
consistent with the kernel definitions (which vary across
architectures).  The test is tricky to get right because CPPFLAGS
includes include/libc-symbols.h, which in turn defines _GNU_SOURCE
unconditionally.

Tested with build-many-glibcs.py.  I verified that a discrepancy in
the definitions actually results in a failure of the
socket/tst-socket-consts test.

(cherry picked from commit 7854ebf8ed)
2019-11-18 18:16:03 +01:00
Dragan Mladjenovic
a7646651f8 mips: Force RWX stack for hard-float builds that can run on pre-4.8 kernels
Linux/Mips kernels prior to 4.8 could potentially crash the user
process when doing FPU emulation while running on non-executable
user stack.

Currently, gcc doesn't emit .note.GNU-stack for mips, but that will
change in the future. To ensure that glibc can be used with such
future gcc, without silently resulting in binaries that might crash
in runtime, this patch forces RWX stack for all built objects if
configured to run against minimum kernel version less than 4.8.

	* sysdeps/unix/sysv/linux/mips/Makefile
	(test-xfail-check-execstack):
	Move under mips-has-gnustack != yes.
	(CFLAGS-.o*, ASFLAGS-.o*): New rules.
	Apply -Wa,-execstack if mips-force-execstack == yes.
	* sysdeps/unix/sysv/linux/mips/configure: Regenerated.
	* sysdeps/unix/sysv/linux/mips/configure.ac
	(mips-force-execstack): New var.
	Set to yes for hard-float builds with minimum_kernel < 4.8.0
	or minimum_kernel not set at all.
	(mips-has-gnustack): New var.
	Use value of libc_cv_as_noexecstack
	if mips-force-execstack != yes, otherwise set to no.

(cherry picked from commit 33bc9efd91)
2019-11-05 14:49:11 -03:00
H.J. Lu
5e1548a6d9 Call _dl_open_check after relocation [BZ #24259]
This is a workaround for [BZ #20839] which doesn't remove the NODELETE
object when _dl_open_check throws an exception.  Move it after relocation
in dl_open_worker to avoid leaving the NODELETE object mapped without
relocation.

	[BZ #24259]
	* elf/dl-open.c (dl_open_worker): Call _dl_open_check after
	relocation.
	* sysdeps/x86/Makefile (tests): Add tst-cet-legacy-5a,
	tst-cet-legacy-5b, tst-cet-legacy-6a and tst-cet-legacy-6b.
	(modules-names): Add tst-cet-legacy-mod-5a, tst-cet-legacy-mod-5b,
	tst-cet-legacy-mod-5c, tst-cet-legacy-mod-6a, tst-cet-legacy-mod-6b
	and tst-cet-legacy-mod-6c.
	(CFLAGS-tst-cet-legacy-5a.c): New.
	(CFLAGS-tst-cet-legacy-5b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5a.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5c.c): Likewise.
	(CFLAGS-tst-cet-legacy-6a.c): Likewise.
	(CFLAGS-tst-cet-legacy-6b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6a.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6c.c): Likewise.
	($(objpfx)tst-cet-legacy-5a): Likewise.
	($(objpfx)tst-cet-legacy-5a.out): Likewise.
	($(objpfx)tst-cet-legacy-mod-5a.so): Likewise.
	($(objpfx)tst-cet-legacy-mod-5b.so): Likewise.
	($(objpfx)tst-cet-legacy-5b): Likewise.
	($(objpfx)tst-cet-legacy-5b.out): Likewise.
	(tst-cet-legacy-5b-ENV): Likewise.
	($(objpfx)tst-cet-legacy-6a): Likewise.
	($(objpfx)tst-cet-legacy-6a.out): Likewise.
	($(objpfx)tst-cet-legacy-mod-6a.so): Likewise.
	($(objpfx)tst-cet-legacy-mod-6b.so): Likewise.
	($(objpfx)tst-cet-legacy-6b): Likewise.
	($(objpfx)tst-cet-legacy-6b.out): Likewise.
	(tst-cet-legacy-6b-ENV): Likewise.
	* sysdeps/x86/tst-cet-legacy-5.c: New file.
	* sysdeps/x86/tst-cet-legacy-5a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-5b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5c.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6c.c: Likewise.

(cherry picked from commit d0093c5cef)
2019-10-31 16:53:28 -04:00
Joseph Myers
afbf970cae Fix RISC-V vfork build with Linux 5.3 kernel headers.
Building glibc for RISC-V with Linux 5.3 kernel headers fails because
<linux/sched.h>, included in vfork.S for CLONE_* constants, contains a
structure definition not safe for inclusion in assembly code.

All other architectures already avoid use of that header in vfork.S,
either defining the CLONE_* constants locally or embedding the
required values directly in the relevant instruction, where they
implement vfork using the clone syscall (see the implementations for
aarch64, ia64, mips and nios2).  This patch makes the RISC-V version
define the constants locally like the other architectures.

Tested build for all three RISC-V configurations in
build-many-glibcs.py with Linux 5.3 headers.

	* sysdeps/unix/sysv/linux/riscv/vfork.S: Do not include
	<linux/sched.h>.
	(CLONE_VM): New macro.
	(CLONE_VFORK): Likewise.

(cherry picked from commit 8cacbcf4a9)
2019-09-20 21:30:04 +02:00
Aurelien Jarno
a132a2c305 alpha: force old OSF1 syscalls for getegid, geteuid and getppid [BZ #24986]
On alpha, Linux kernel 5.1 added the standard getegid, geteuid and
getppid syscalls (commit ecf7e0a4ad15287). Up to now alpha was using
the corresponding OSF1 syscalls through:
 - sysdeps/unix/alpha/getegid.S
 - sysdeps/unix/alpha/geteuid.S
 - sysdeps/unix/alpha/getppid.S

When building against kernel headers >= 5.1, the glibc now use the new
syscalls through sysdeps/unix/sysv/linux/syscalls.list. When it is then
used with an older kernel, the corresponding 3 functions fail.

A quick fix is to move the OSF1 wrappers under the
sysdeps/unix/sysv/linux/alpha directory so they override the standard
linux ones. A better fix would be to try the new syscalls and fallback
to the old OSF1 in case the new ones fail. This can be implemented in
a later commit.

Changelog:
	[BZ #24986]
        * sysdeps/unix/alpha/getegid.S: Move to ...
	* sysdeps/unix/sysv/linux/alpha/getegid.S: ... here.
        * sysdeps/unix/alpha/geteuid.S: Move to ...
	* sysdeps/unix/sysv/linux/alpha/geteuid.S: ... here.
        * sysdeps/unix/alpha/getppid.S: Move to ...
	* sysdeps/unix/sysv/linux/alpha/getppid.S: ... here
2019-09-14 20:12:11 +02:00
Aurelien Jarno
ef98313dd3 Update Alpha libm-test-ulps
Changelog:

	* sysdeps/alpha/fpu/libm-test-ulps: Regenerated using GCC 9.2.

(cherry picked from commit b5367a08ae)
2019-09-03 21:43:22 +02:00
Adhemerval Zanella
6d8eaf4a25 hppa: Update libm-tests-ulps
The make regen-ulps was done on a PA8900 with 8.3.0.

	* sysdeps/hppa/fpu/libm-test-ulps: Update.

(cherry picked from commit 3175dcc1e6)
2019-08-18 11:38:26 +02:00
Richard Henderson
23ef51a50a alpha: Do not redefine __NR_shmat or __NR_osf_shmat
Fixes build using v5.1-rc1 headers.

The kernel has cleaned up how these are defined.  Previous behavior
was to define __NR_osf_shmat as 209 and not define __NR_shmat.
Current behavior is to define __NR_shmat as 209 and then define
__NR_osf_shmat as __NR_shmat.

	* sysdeps/unix/sysv/linux/alpha/kernel-features.h (__NR_shmat):
	Do not redefine.
	* sysdeps/unix/sysv/linux/alpha/sysdep.h (__NR_osf_shmat):
	Do not redefine.

(cherry picked from commit d5ecee822e)
2019-08-15 19:52:22 +02:00
Adhemerval Zanella
2d3fefd7ce posix: Fix large mmap64 offset for mips64n32 (BZ#24699)
The fix for BZ#21270 (commit 158d5fa0e1) added a mask to avoid offset larger
than 1^44 to be used along __NR_mmap2.  However mips64n32 users __NR_mmap,
as mips64n64, but still defines off_t as old non-LFS type (other ILP32, such
x32, defines off_t being equal to off64_t).  This leads to use the same
mask meant only for __NR_mmap2 call for __NR_mmap, thus limiting the maximum
offset it can use with mmap64.

This patch fixes by setting the high mask only for __NR_mmap2 usage. The
posix/tst-mmap-offset.c already tests it and also fails for mips64n32. The
patch also change the test to check for an arch-specific header that defines
the maximum supported offset.

Checked on x86_64-linux-gnu, i686-linux-gnu, and I also tests tst-mmap-offset
on qemu simulated mips64 with kernel 3.2.0 kernel for both mips-linux-gnu and
mips64-n32-linux-gnu.

	[BZ #24699]
	* posix/tst-mmap-offset.c: Mention BZ #24699.
	(do_test_bz21270): Rename to do_test_large_offset and use
	mmap64_maximum_offset to check for maximum expected offset value.
	* sysdeps/generic/mmap_info.h: New file.
	* sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise.
	* sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff
	__NR_mmap2 is used.

(cherry picked from commit a008c76b56)
2019-07-15 09:26:39 -03:00
Szabolcs Nagy
4163c382f0 aarch64: handle STO_AARCH64_VARIANT_PCS
Backport of commit 82bc69c012
and commit 30ba037546
without using DT_AARCH64_VARIANT_PCS for optimizing the symbol table check.
This is needed so the internal abi between ld.so and libc.so is unchanged.

Avoid lazy binding of symbols that may follow a variant PCS with different
register usage convention from the base PCS.

Currently the lazy binding entry code does not preserve all the registers
required for AdvSIMD and SVE vector calls.  Saving and restoring all
registers unconditionally may break existing binaries, even if they never
use vector calls, because of the larger stack requirement for lazy
resolution, which can be significant on an SVE system.

The solution is to mark all symbols in the symbol table that may follow
a variant PCS so the dynamic linker can handle them specially.  In this
patch such symbols are always resolved at load time, not lazily.

So currently LD_AUDIT for variant PCS symbols are not supported, for that
the _dl_runtime_profile entry needs to be changed e.g. to unconditionally
save/restore all registers (but pass down arg and retval registers to
pltentry/exit callbacks according to the base PCS).

This patch also removes a __builtin_expect from the modified code because
the branch prediction hint did not seem useful.

	* sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Check
	STO_AARCH64_VARIANT_PCS and bind such symbols at load time.
2019-07-12 10:14:12 +01:00
Florian Weimer
da347f4aa3 io: Remove copy_file_range emulation [BZ #24744]
The kernel is evolving this interface (e.g., removal of the
restriction on cross-device copies), and keeping up with that
is difficult.  Applications which need the function should
run kernels which support the system call instead of relying on
the imperfect glibc emulation.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 5a659ccc0e)
2019-07-09 10:01:21 +02:00
Stefan Liebler
6eb48fe80c S390: Mark vx and vxe as important hwcap.
This patch adds vx and vxe as important hwcaps
which allows one to provide shared libraries
tuned for platforms with non-vx/-vxe, vx or vxe.

ChangeLog:

	* sysdeps/s390/dl-procinfo.h (HWCAP_IMPORTANT):
	Add HWCAP_S390_VX and HWCAP_S390_VXE.

(cherry picked from commit 61f5e9470f)

Conflicts:
	ChangeLog
2019-03-21 09:27:37 +01:00
Stefan Liebler
bc6f839fb4 Fix output of LD_SHOW_AUXV=1.
Starting with commit 1616d034b6
the output was corrupted on some platforms as _dl_procinfo
was called for every auxv entry and on some architectures like s390
all entries were represented as "AT_HWCAP".

This patch is removing the condition and let _dl_procinfo decide if
an entry is printed in a platform specific or generic way.
This patch also adjusts all _dl_procinfo implementations which assumed
that they are only called for AT_HWCAP or AT_HWCAP2. They are now just
returning a non-zero-value for entries which are not handled platform
specifc.

ChangeLog:

	* elf/dl-sysdep.c (_dl_show_auxv): Remove condition and always
	call _dl_procinfo.
	* sysdeps/unix/sysv/linux/s390/dl-procinfo.h (_dl_procinfo):
	Ignore types other than AT_HWCAP.
	* sysdeps/sparc/dl-procinfo.h (_dl_procinfo): Likewise.
	* sysdeps/unix/sysv/linux/i386/dl-procinfo.h (_dl_procinfo):
	Likewise.
	* sysdeps/powerpc/dl-procinfo.h (_dl_procinfo): Adjust comment
	in the case of falling back to generic output mechanism.
	* sysdeps/unix/sysv/linux/arm/dl-procinfo.h (_dl_procinfo):
	Likewise.

(cherry picked from commit 7c6513082b)

Conflicts:
	ChangeLog
2019-03-13 10:51:23 +01:00
Florian Weimer
c096b008d2 nptl: Avoid fork handler lock for async-signal-safe fork [BZ #24161]
Commit 27761a1042 ("Refactor atfork
handlers") introduced a lock, atfork_lock, around fork handler list
accesses.  It turns out that this lock occasionally results in
self-deadlocks in malloc/tst-mallocfork2:

(gdb) bt
#0  __lll_lock_wait_private ()
    at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:63
#1  0x00007f160c6f927a in __run_fork_handlers (who=(unknown: 209394016),
    who@entry=atfork_run_prepare) at register-atfork.c:116
#2  0x00007f160c6b7897 in __libc_fork () at ../sysdeps/nptl/fork.c:58
#3  0x00000000004027d6 in sigusr1_handler (signo=<optimized out>)
    at tst-mallocfork2.c:80
#4  sigusr1_handler (signo=<optimized out>) at tst-mallocfork2.c:64
#5  <signal handler called>
#6  0x00007f160c6f92e4 in __run_fork_handlers (who=who@entry=atfork_run_parent)
    at register-atfork.c:136
#7  0x00007f160c6b79a2 in __libc_fork () at ../sysdeps/nptl/fork.c:152
#8  0x0000000000402567 in do_test () at tst-mallocfork2.c:156
#9  0x0000000000402dd2 in support_test_main (argc=1, argv=0x7ffc81ef1ab0,
    config=config@entry=0x7ffc81ef1970) at support_test_main.c:350
#10 0x0000000000402362 in main (argc=<optimized out>, argv=<optimized out>)
    at ../support/test-driver.c:168

If no locking happens in the single-threaded case (where fork is
expected to be async-signal-safe), this deadlock is avoided.
(pthread_atfork is not required to be async-signal-safe, so a fork
call from a signal handler interrupting pthread_atfork is not
a problem.)

(cherry picked from commit 669ff911e2)
2019-02-08 12:54:41 +01:00