glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-23 09:43:32 +08:00

Author	SHA1	Message	Date
Alex Butler	5daf0c9815	aarch64: MTE compatible strncmp Add support for MTE to strncmp. Regression tested with xcheck and benchmarked with glibc's benchtests on the Cortex-A53, Cortex-A72, and Neoverse N1. The existing implementation assumes that any access to the pages in which the string resides is safe. This assumption is not true when MTE is enabled. This patch updates the algorithm to ensure that accesses remain within the bounds of an MTE tag (16-byte chunks) and improves overall performance. Co-authored-by: Branislav Rankov <branislav.rankov@arm.com> Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com> (cherry picked from commit `03e1378f94`)	2024-11-04 17:20:31 +00:00
Szabolcs Nagy	268df5a910	aarch64: Fix DT_AARCH64_VARIANT_PCS handling [BZ #26798 ] The variant PCS support was ineffective because in the common case linkmap->l_mach.plt == 0 but then the symbol table flags were ignored and normal lazy binding was used instead of resolving the relocs early. (This was a misunderstanding about how GOT[1] is setup by the linker.) In practice this mainly affects SVE calls when the vector length is more than 128 bits, then the top bits of the argument registers get clobbered during lazy binding. Fixes bug 26798. (cherry picked from commit `558251bd87`)	2020-11-04 12:30:42 +00:00
Szabolcs Nagy	2b802d55b7	aarch64: add HWCAP_ATOMICS to HWCAP_IMPORTANT This enables searching shared libraries in atomics/ when the hardware supports LSE atomics of armv8.1 so one can provide optimized variants of libraries in a portable way. LSE atomics does not affect library abi, the new instructions can interoperate with old ones. I considered the earlier comments on the patch https://sourceware.org/ml/libc-alpha/2018-04/msg00400.html https://sourceware.org/ml/libc-alpha/2018-04/msg00625.html It turns out that the way glibc dynamic linker decides on the search path is not very flexible: it wants to use hwcap bits and associated strings. So some targets reuse hwcap bits for glibc internal purposes to affect the search logic. But hwcap is an interface with the kernel, glibc should not allocate bits in it for its internal logic as that limits future hwcap extensions and confusing to users who expect to see hwcap bits in ifunc resolvers. Instead of rewriting the dynamic linker path logic (which affects all targets) this patch just uses the existing mechanism, however this means that the path name has to be the hwcap name "atomics" and cannot be changed to something more meaningful to users. It is hard to tell how much performance benefit this can give, in principle armv8.1 atomics can be better optimized in the hardware, so it can make a difference for synchronization heavy code. On some systems such multilib setup may be the only viable way to get optimized libraries used. * sysdeps/unix/sysv/linux/aarch64/dl-procinfo.h (HWCAP_IMPORTANT): Add HWCAP_ATOMICS. (cherry picked from commit `397c54c1af`)	2020-03-25 18:00:22 +00:00
Szabolcs Nagy	c6ed5b563c	aarch64: Remove HWCAP_CPUID from HWCAP_IMPORTANT This partially reverts commit `f82e9672ad` Author: Siddhesh Poyarekar <siddhesh@sourceware.org> aarch64: Allow overriding HWCAP_CPUID feature check using HWCAP_MASK The idea was to make it possible to disable cpuid based ifunc resolution in glibc by changing the hwcap mask which the user could already control. However the hwcap mask has an orthogonal role: it specifies additional library search paths for the dynamic linker. So "cpuid" got added to the search paths when it was set in the default mask (HWCAP_IMPORTANT), which is not useful behaviour, the hwcap masking should not be reused in the cpu features code. Meanwhile there is a tunable to set the cpu explicitly so it is possible to disable the cpuid based dispatch without using a hwcap mask: GLIBC_TUNABLES=glibc.tune.cpu=generic * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (init_cpu_features): Use dl_hwcap without masking. * sysdeps/unix/sysv/linux/aarch64/dl-procinfo.h (HWCAP_IMPORTANT): Remove HWCAP_CPUID. (cherry picked from commit `d0cd798071`)	2020-03-25 17:59:54 +00:00
Andreas Schwab	263e617599	Fix use-after-free in glob when expanding ~user (bug 25414) The value of `end_name' points into the value of `dirname', thus don't deallocate the latter before the last use of the former. (cherry picked from commit `ddc650e9b3` with changes from commit `d711a00f93`)	2020-03-20 17:49:04 -03:00
Andreas Schwab	37db4539dd	Fix array overflow in backtrace on PowerPC (bug 25423) When unwinding through a signal frame the backtrace function on PowerPC didn't check array bounds when storing the frame address. Fixes commit `d400dcac5e` ("PowerPC: fix backtrace to handle signal trampolines"). (cherry picked from commit `d937694059`)	2020-03-20 15:32:44 -03:00
Florian Weimer	2dc2d678e9	libio: Disable vtable validation for pre-2.1 interposed handles [BZ #25203 ] Commit `c402355dfa` ("libio: Disable vtable validation in case of interposition [BZ #23313]") only covered the interposable glibc 2.1 handles, in libio/stdfiles.c. The parallel code in libio/oldstdfiles.c needs similar detection logic. Fixes (again) commit `db3476aff1` ("libio: Implement vtable verification [BZ #20191]"). Change-Id: Ief6f9f17e91d1f7263421c56a7dc018f4f595c21 (cherry picked from commit `cb61630ed7`)	2019-11-28 14:44:48 +01:00
Marcin Kościelnicki	bc42e3bd44	rtld: Check __libc_enable_secure before honoring LD_PREFER_MAP_32BIT_EXEC (CVE-2019-19126) [BZ #25204 ] The problem was introduced in glibc 2.23, in commit `b9eb92ab05` ("Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT"). (cherry picked from commit `d5dfad4326`)	2019-11-22 13:49:18 +01:00
Dragan Mladjenovic	aaf2f25b61	mips: Force RWX stack for hard-float builds that can run on pre-4.8 kernels Linux/Mips kernels prior to 4.8 could potentially crash the user process when doing FPU emulation while running on non-executable user stack. Currently, gcc doesn't emit .note.GNU-stack for mips, but that will change in the future. To ensure that glibc can be used with such future gcc, without silently resulting in binaries that might crash in runtime, this patch forces RWX stack for all built objects if configured to run against minimum kernel version less than 4.8. * sysdeps/unix/sysv/linux/mips/Makefile (test-xfail-check-execstack): Move under mips-has-gnustack != yes. (CFLAGS-.o, ASFLAGS-.o): New rules. Apply -Wa,-execstack if mips-force-execstack == yes. * sysdeps/unix/sysv/linux/mips/configure: Regenerated. * sysdeps/unix/sysv/linux/mips/configure.ac (mips-force-execstack): New var. Set to yes for hard-float builds with minimum_kernel < 4.8.0 or minimum_kernel not set at all. (mips-has-gnustack): New var. Use value of libc_cv_as_noexecstack if mips-force-execstack != yes, otherwise set to no. (cherry picked from commit `33bc9efd91`)	2019-11-05 13:56:52 -03:00
Wilco Dijkstra	612fba2fe9	Improve performance of memmem This patch significantly improves performance of memmem using a novel modified Horspool algorithm. Needles up to size 256 use a bad-character table indexed by hashed pairs of characters to quickly skip past mismatches. Long needles use a self-adapting filtering step to avoid comparing the whole needle repeatedly. By limiting the needle length to 256, the shift table only requires 8 bits per entry, lowering preprocessing overhead and minimizing cache effects. This limit also implies worst-case performance is linear. Small needles up to size 2 use a dedicated linear search. Very long needles use the Two-Way algorithm (to avoid increasing stack size or slowing down the common case, inlining is disabled). The performance gain is 6.6 times on English text on AArch64 using random needles with average size 8. Tested against GLIBC testsuite and randomized tests. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> * string/memmem.c (__memmem): Rewrite to improve performance. (cherry picked from commit `680942b016`)	2019-09-13 16:41:34 +01:00
Wilco Dijkstra	796c5ee030	Improve performance of strstr This patch significantly improves performance of strstr using a novel modified Horspool algorithm. Needles up to size 256 use a bad-character table indexed by hashed pairs of characters to quickly skip past mismatches. Long needles use a self-adapting filtering step to avoid comparing the whole needle repeatedly. By limiting the needle length to 256, the shift table only requires 8 bits per entry, lowering preprocessing overhead and minimizing cache effects. This limit also implies worst-case performance is linear. Small needles up to size 3 use a dedicated linear search. Very long needles use the Two-Way algorithm. The performance gain using the improved bench-strstr on Cortex-A72 is 5.8 times basic_strstr and 3.7 times twoway_strstr. Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test (https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> * string/str-two-way.h (two_way_short_needle): Add inline to avoid warning. (two_way_long_needle): Block inlining. * string/strstr.c (strstr2): Add new function. (strstr3): Likewise. (STRSTR): Completely rewrite strstr to improve performance. (cherry picked from commit `5e0a7ecb66`)	2019-09-13 16:41:26 +01:00
Wilco Dijkstra	cd3487afa2	Fix strstr bug with huge needles (bug 23637) The generic strstr in GLIBC 2.28 fails to match huge needles. The optimized AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of repeatedly checking for the end of the string. However if the needle length is larger than this, two_way_long_needle may confuse this as meaning the end of the string and return NULL. This is fixed by adding the needle length to the amount to read ahead. [BZ #23637] * string/test-strstr.c (pr23637): New function. (test_main): Add tests with longer needles. * string/strcasestr.c (AVAILABLE): Fix readahead distance. * string/strstr.c (AVAILABLE): Likewise. (cherry picked from commit `83a552b0bb`)	2019-09-13 16:40:59 +01:00
Rajalakshmi Srinivasaraghavan	ceeba1d73c	Speedup first memmem match As done in commit `284f42bc77`, memcmp can be used after memchr to avoid the initialization overhead of the two-way algorithm for the first match. This has shown improvement >40% for first match. (cherry picked from commit `c8dd67e7c9`)	2019-09-13 16:40:20 +01:00
Wilco Dijkstra	c60bf879b2	Simplify and speedup strstr/strcasestr first match Looking at the benchtests, both strstr and strcasestr spend a lot of time in a slow initialization loop handling one character per iteration. This can be simplified and use the much faster strlen/strnlen/strchr/memcmp. Read ahead a few cachelines to reduce the number of strnlen calls, which improves performance by ~3-4%. This patch improves the time taken for the full strstr benchtest by >40%. * string/strcasestr.c (STRCASESTR): Simplify and speedup first match. * string/strstr.c (AVAILABLE): Likewise. (cherry picked from commit `284f42bc77`)	2019-09-13 16:39:42 +01:00
Wilco Dijkstra	55a280689e	Improve strstr performance Improve strstr performance. Strstr tends to be slow because it uses many calls to memchr and a slow byte loop to scan for the next match. Performance is significantly improved by using strnlen on larger blocks and using strchr to search for the next matching character. strcasestr can also use strnlen to scan ahead, and memmem can use memchr to check for the next match. On the GLIBC bench tests the performance gains on Cortex-A72 are: strstr: +25% strcasestr: +4.3% memmem: +18% On a 256KB dataset strstr performance improves by 67%, strcasestr by 47%. Reviewd-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `3ae725dfb6`)	2019-09-13 16:39:12 +01:00
Wilco Dijkstra	d6613ad24f	[AArch64] Add ifunc support for Ares Add Ares to the midr_el0 list and support ifunc dispatch. Since Ares supports 2 128-bit loads/stores, use Neon registers for memcpy by selecting __memcpy_falkor by default (we should rename this to __memcpy_simd or similar). * manual/tunables.texi (glibc.cpu.name): Add ares tunable. * sysdeps/aarch64/multiarch/memcpy.c (__libc_memcpy): Use __memcpy_falkor for ares. * sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_ARES): Add new define. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list): Add ares cpu. (cherry picked from commit `02f440c1ef`)	2019-09-06 18:58:34 +01:00
Siddhesh Poyarekar	ad64510e5c	aarch64,falkor: Use vector registers for memcpy Vector registers perform better than scalar register pairs for copying data so prefer them instead. This results in a time reduction of over 50% (i.e. 2x speed improvemnet) for some smaller sizes for memcpy-walk. Larger sizes show improvements of around 1% to 2%. memcpy-random shows a very small improvement, in the range of 1-2%. * sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor): Use vector registers. (cherry picked from commit `0aec4c1d18`)	2019-09-06 18:42:14 +01:00
Siddhesh Poyarekar	d3c05bfffa	aarch64,falkor: Ignore prefetcher tagging for smaller copies For smaller and medium sized copies, the effect of hardware prefetching are not as dominant as instruction level parallelism. Hence it makes more sense to load data into multiple registers than to try and route them to the same prefetch unit. This is also the case for the loop exit where we are unable to latch on to the same prefetch unit anyway so it makes more sense to have data loaded in parallel. The performance results are a bit mixed with memcpy-random, with numbers jumping between -1% and +3%, i.e. the numbers don't seem repeatable. memcpy-walk sees a 70% improvement (i.e. > 2x) for 128 bytes and that improvement reduces down as the impact of the tail copy decreases in comparison to the loop. * sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor): Use multiple registers to copy data in loop tail. (cherry picked from commit `db725a458e`)	2019-09-06 18:41:04 +01:00
Siddhesh Poyarekar	e3c35100d3	aarch64/strncmp: Use lsr instead of mov+lsr A lsr can do what the mov and lsr did. (cherry picked from commit `b47c3e7637`)	2019-09-06 17:15:27 +01:00
Siddhesh Poyarekar	00fd3acde1	aarch64/strncmp: Unbreak builds with old binutils Binutils 2.26.* and older do not support moves with shifted registers, so use a separate shift instruction instead. (cherry picked from commit `d46f84de74`)	2019-09-06 17:14:47 +01:00
Siddhesh Poyarekar	af9381b734	aarch64: Improve strncmp for mutually misaligned inputs The mutually misaligned inputs on aarch64 are compared with a simple byte copy, which is not very efficient. Enhance the comparison similar to strcmp by loading a double-word at a time. The peak performance improvement (i.e. 4k maxlen comparisons) due to this on the strncmp microbenchmark is as follows: falkor: 3.5x (up to 72% time reduction) cortex-a73: 3.5x (up to 71% time reduction) cortex-a53: 3.5x (up to 71% time reduction) All mutually misaligned inputs from 16 bytes maxlen onwards show upwards of 15% improvement and there is no measurable effect on the performance of aligned/mutually aligned inputs. * sysdeps/aarch64/strncmp.S (count): New macro. (strncmp): Store misaligned length in SRC1 in COUNT. (mutual_align): Adjust. (misaligned8): Load dword at a time when it is safe. (cherry picked from commit `7108f1f944`)	2019-09-06 17:14:05 +01:00
Siddhesh Poyarekar	01de24dbca	aarch64/strcmp: fix misaligned loop jump target I accidentally set the loop jump back label as misaligned8 instead of do_misaligned. The typo is harmless but it's always nice to not have to unnecessarily execute those two instructions. * sysdeps/aarch64/strcmp.S (do_misaligned): Jump back to do_misaligned, not misaligned8. (cherry picked from commit `6ca24c4348`)	2019-09-06 17:13:34 +01:00
Siddhesh Poyarekar	4e75091d6c	aarch64: Improve strcmp unaligned performance Replace the simple byte-wise compare in the misaligned case with a dword compare with page boundary checks in place. For simplicity I've chosen a 4K page boundary so that we don't have to query the actual page size on the system. This results in up to 3x improvement in performance in the unaligned case on falkor and about 2.5x improvement on mustang as measured using bench-strcmp. * sysdeps/aarch64/strcmp.S (misaligned8): Compare dword at a time whenever possible. (cherry picked from commit `2bce01ebba`)	2019-09-06 17:13:02 +01:00
Siddhesh Poyarekar	8569357e11	aarch64: Fix branch target to loop16 I goofed up when changing the loop8 name to loop16 and missed on out the branch instance. Fixed and actually build tested this time. * sysdeps/aarch64/memcmp.S (more16): Fix branch target loop16. (cherry picked from commit `4e54d91863`)	2019-09-06 16:33:12 +01:00
Siddhesh Poyarekar	ec4512194f	aarch64: Optimized memcmp for medium to large sizes This improved memcmp provides a fast path for compares up to 16 bytes and then compares 16 bytes at a time, thus optimizing loads from both sources. The glibc memcmp microbenchmark retains performance (with an error of ~1ns) for smaller compare sizes and reduces up to 31% of execution time for compares up to 4K on the APM Mustang. On Qualcomm Falkor this improves to almost 48%, i.e. it is almost 2x improvement for sizes of 2K and above. * sysdeps/aarch64/memcmp.S: Widen comparison to 16 bytes at a time. (cherry picked from commit `30a81dae5b`)	2019-09-06 16:33:04 +01:00
Siddhesh Poyarekar	600e4e866c	aarch64: Use the L() macro for labels in memcmp The L() macro makes the assembly a bit more readable. * sysdeps/aarch64/memcmp.S: Use L() macro for labels. (cherry picked from commit `84c94d2fd9`)	2019-09-06 16:31:36 +01:00
Wilco Dijkstra	1896de3d92	[AArch64] Optimized memcmp. This is an optimized memcmp for AArch64. This is a complete rewrite using a different algorithm. The previous version split into cases where both inputs were aligned, the inputs were mutually aligned and unaligned using a byte loop. The new version combines all these cases, while small inputs of less than 8 bytes are handled separately. This allows the main code to be sped up using unaligned loads since there are now at least 8 bytes to be compared. After the first 8 bytes, align the first input. This ensures each iteration does at most one unaligned access and mutually aligned inputs behave as aligned. After the main loop, process the last 8 bytes using unaligned accesses. This improves performance of (mutually) aligned cases by 25% and unaligned by >500% (yes >6 times faster) on large inputs. * sysdeps/aarch64/memcmp.S (memcmp): Rewrite of optimized memcmp. (cherry picked from commit `922369032c`)	2019-09-06 16:30:10 +01:00
Adhemerval Zanella	54194d8b4d	posix: Fix large mmap64 offset for mips64n32 (BZ#24699) The fix for BZ#21270 (commit `158d5fa0e1`) added a mask to avoid offset larger than 1^44 to be used along __NR_mmap2. However mips64n32 users __NR_mmap, as mips64n64, but still defines off_t as old non-LFS type (other ILP32, such x32, defines off_t being equal to off64_t). This leads to use the same mask meant only for __NR_mmap2 call for __NR_mmap, thus limiting the maximum offset it can use with mmap64. This patch fixes by setting the high mask only for __NR_mmap2 usage. The posix/tst-mmap-offset.c already tests it and also fails for mips64n32. The patch also change the test to check for an arch-specific header that defines the maximum supported offset. Checked on x86_64-linux-gnu, i686-linux-gnu, and I also tests tst-mmap-offset on qemu simulated mips64 with kernel 3.2.0 kernel for both mips-linux-gnu and mips64-n32-linux-gnu. [BZ #24699] * posix/tst-mmap-offset.c: Mention BZ #24699. (do_test_bz21270): Rename to do_test_large_offset and use mmap64_maximum_offset to check for maximum expected offset value. * sysdeps/generic/mmap_info.h: New file. * sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise. * sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff __NR_mmap2 is used. (cherry picked from commit `a008c76b56`)	2019-07-12 19:02:04 +00:00
Szabolcs Nagy	f2f501ff39	aarch64: handle STO_AARCH64_VARIANT_PCS Backport of commit `82bc69c012` and commit `30ba037546` without using DT_AARCH64_VARIANT_PCS for optimizing the symbol table check. This is needed so the internal abi between ld.so and libc.so is unchanged. Avoid lazy binding of symbols that may follow a variant PCS with different register usage convention from the base PCS. Currently the lazy binding entry code does not preserve all the registers required for AdvSIMD and SVE vector calls. Saving and restoring all registers unconditionally may break existing binaries, even if they never use vector calls, because of the larger stack requirement for lazy resolution, which can be significant on an SVE system. The solution is to mark all symbols in the symbol table that may follow a variant PCS so the dynamic linker can handle them specially. In this patch such symbols are always resolved at load time, not lazily. So currently LD_AUDIT for variant PCS symbols are not supported, for that the _dl_runtime_profile entry needs to be changed e.g. to unconditionally save/restore all registers (but pass down arg and retval registers to pltentry/exit callbacks according to the base PCS). This patch also removes a __builtin_expect from the modified code because the branch prediction hint did not seem useful. * sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Check STO_AARCH64_VARIANT_PCS and bind such symbols at load time.	2019-07-12 10:50:17 +01:00
Szabolcs Nagy	71c2578a9b	aarch64: add STO_AARCH64_VARIANT_PCS and DT_AARCH64_VARIANT_PCS STO_AARCH64_VARIANT_PCS is a non-visibility st_other flag for marking symbols that reference functions that may follow a variant PCS with different register usage convention from the base PCS. DT_AARCH64_VARIANT_PCS is a dynamic tag that marks ELF modules that have R__JUMP_SLOT relocations for symbols marked with STO_AARCH64_VARIANT_PCS (i.e. have variant PCS calls via a PLT). elf/elf.h (STO_AARCH64_VARIANT_PCS): Define. (DT_AARCH64_VARIANT_PCS): Define.	2019-07-12 10:50:12 +01:00
Wilco Dijkstra	ac92c66821	Fix tcache count maximum (BZ #24531 ) The tcache counts[] array is a char, which has a very small range and thus may overflow. When setting tcache_count tunable, there is no overflow check. However the tunable must not be larger than the maximum value of the tcache counts[] array, otherwise it can overflow when filling the tcache. [BZ #24531] * malloc/malloc.c (MAX_TCACHE_COUNT): New define. (do_set_tcache_count): Only update if count is small enough. * manual/tunables.texi (glibc.malloc.tcache_count): Document max value. (cherry picked from commit `5ad533e8e6`)	2019-05-22 15:41:24 +01:00
Andreas Schwab	4385ec1d8a	Fix crash in _IO_wfile_sync (bug 20568) When computing the length of the converted part of the stdio buffer, use the number of consumed wide characters, not the (negative) distance to the end of the wide buffer. (cherry picked from commit `32ff397533`)	2019-05-16 10:10:04 +02:00
Stefan Liebler	c165427d55	Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180 ] While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and Heiko Carstens found a bug in pthread_mutex_trylock due to misordered instructions: 140: a5 1b 00 01 oill %r1,1 144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL); 14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI vs (with compiler barriers): 140: a5 1b 00 01 oill %r1,1 144: e3 10 a0 e0 00 24 stg %r1,224(%r10) 14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0 Please have a look at the discussion: "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede" (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/) This patch is introducing the same compiler barriers and comments for pthread_mutex_trylock as introduced for pthread_mutex_lock and pthread_mutex_timedlock by commit `8f9450a0b7` "Add compiler barriers around modifications of the robust mutex list." ChangeLog: [BZ #24180] * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock): Add compiler barriers and comments. (cherry picked from commit `823624bdc4`)	2019-02-07 15:49:36 +01:00
H.J. Lu	04e767b59b	x86-64 memcmp: Use unsigned Jcc instructions on size [BZ #24155 ] Since the size argument is unsigned. we should use unsigned Jcc instructions, instead of signed, to check size. Tested on x86-64 and x32, with and without --disable-multi-arch. [BZ #24155] CVE-2019-7309 * NEWS: Updated for CVE-2019-7309. * sysdeps/x86_64/memcmp.S: Use RDX_LP for size. Clear the upper 32 bits of RDX register for x32. Use unsigned Jcc instructions, instead of signed. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp-2. * sysdeps/x86_64/x32/tst-size_t-memcmp-2.c: New test. (cherry picked from commit `3f635fb433`)	2019-02-04 10:27:37 -08:00
H.J. Lu	dc968f5573	x86-64 strnlen/wcsnlen: Properly handle the length parameter [BZ #24097 ] On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes strnlen/wcsnlen for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ #24097] CVE-2019-6488 * sysdeps/x86_64/multiarch/strlen-avx2.S: Use RSI_LP for length. Clear the upper 32 bits of RSI register. * sysdeps/x86_64/strlen.S: Use RSI_LP for length. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strnlen and tst-size_t-wcsnlen. * sysdeps/x86_64/x32/tst-size_t-strnlen.c: New file. * sysdeps/x86_64/x32/tst-size_t-wcsnlen.c: Likewise. (cherry picked from commit `5165de69c0`)	2019-02-01 15:35:22 -08:00
H.J. Lu	40575878cd	x86-64 strncpy: Properly handle the length parameter [BZ #24097 ] On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes strncpy for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ #24097] CVE-2019-6488 * sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: Use RDX_LP for length. * sysdeps/x86_64/multiarch/strcpy-ssse3.S: Likewise. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncpy. * sysdeps/x86_64/x32/tst-size_t-strncpy.c: New file. (cherry picked from commit `c7c54f65b0`)	2019-02-01 15:35:00 -08:00
H.J. Lu	15ce2f62f6	x86-64 strncmp family: Properly handle the length parameter [BZ #24097 ] On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes the strncmp family for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ #24097] CVE-2019-6488 * sysdeps/x86_64/multiarch/strcmp-sse42.S: Use RDX_LP for length. * sysdeps/x86_64/strcmp.S: Likewise. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-strncasecmp, tst-size_t-strncmp and tst-size_t-wcsncmp. * sysdeps/x86_64/x32/tst-size_t-strncasecmp.c: New file. * sysdeps/x86_64/x32/tst-size_t-strncmp.c: Likewise. * sysdeps/x86_64/x32/tst-size_t-wcsncmp.c: Likewise. (cherry picked from commit `ee915088a0`)	2019-02-01 15:34:41 -08:00
H.J. Lu	885e4af2ac	x86-64 memset/wmemset: Properly handle the length parameter [BZ #24097 ] On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes memset/wmemset for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ #24097] CVE-2019-6488 * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: Use RDX_LP for length. Clear the upper 32 bits of RDX register. * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Likewise. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-wmemset. * sysdeps/x86_64/x32/tst-size_t-memset.c: New file. * sysdeps/x86_64/x32/tst-size_t-wmemset.c: Likewise. (cherry picked from commit `82d0b4a4d7`)	2019-02-01 15:34:29 -08:00
H.J. Lu	c9ea2e82d4	x86-64 memrchr: Properly handle the length parameter [BZ #24097 ] On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes memrchr for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ #24097] CVE-2019-6488 * sysdeps/x86_64/memrchr.S: Use RDX_LP for length. * sysdeps/x86_64/multiarch/memrchr-avx2.S: Likewise. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memrchr. * sysdeps/x86_64/x32/tst-size_t-memrchr.c: New file. (cherry picked from commit `ecd8b842cf`)	2019-02-01 15:34:17 -08:00
H.J. Lu	94b88894b1	x86-64 memcpy: Properly handle the length parameter [BZ #24097 ] On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes memcpy for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ #24097] CVE-2019-6488 * sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: Use RDX_LP for length. Clear the upper 32 bits of RDX register. * sysdeps/x86_64/multiarch/memcpy-ssse3.S: Likewise. * sysdeps/x86_64/multiarch/memmove-avx512-no-vzeroupper.S: Likewise. * sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Likewise. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcpy. tst-size_t-wmemchr. * sysdeps/x86_64/x32/tst-size_t-memcpy.c: New file. (cherry picked from commit `231c56760c`)	2019-02-01 15:34:04 -08:00
H.J. Lu	232a7628f0	x86-64 memcmp/wmemcmp: Properly handle the length parameter [BZ #24097 ] On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes memcmp/wmemcmp for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ #24097] CVE-2019-6488 * sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S: Use RDX_LP for length. Clear the upper 32 bits of RDX register. * sysdeps/x86_64/multiarch/memcmp-sse4.S: Likewise. * sysdeps/x86_64/multiarch/memcmp-ssse3.S: Likewise. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memcmp and tst-size_t-wmemcmp. * sysdeps/x86_64/x32/tst-size_t-memcmp.c: New file. * sysdeps/x86_64/x32/tst-size_t-wmemcmp.c: Likewise. (cherry picked from commit `b304fc201d`)	2019-02-01 15:33:50 -08:00
H.J. Lu	bff8346b01	x86-64 memchr/wmemchr: Properly handle the length parameter [BZ #24097 ] On x32, the size_t parameter may be passed in the lower 32 bits of a 64-bit register with the non-zero upper 32 bits. The string/memory functions written in assembly can only use the lower 32 bits of a 64-bit register as length or must clear the upper 32 bits before using the full 64-bit register for length. This pach fixes memchr/wmemchr for x32. Tested on x86-64 and x32. On x86-64, libc.so is the same with and withou the fix. [BZ #24097] CVE-2019-6488 * sysdeps/x86_64/memchr.S: Use RDX_LP for length. Clear the upper 32 bits of RDX register. * sysdeps/x86_64/multiarch/memchr-avx2.S: Likewise. * sysdeps/x86_64/x32/Makefile (tests): Add tst-size_t-memchr and tst-size_t-wmemchr. * sysdeps/x86_64/x32/test-size_t.h: New file. * sysdeps/x86_64/x32/tst-size_t-memchr.c: Likewise. * sysdeps/x86_64/x32/tst-size_t-wmemchr.c: Likewise. (cherry picked from commit `97700a34f3`)	2019-02-01 15:32:53 -08:00
Gabriel F. T. Gomes	7ab39c6a3c	powerpc: Regenerate ULPs On POWER9, cbrtf128 fails by 1 ULP. * sysdeps/powerpc/fpu/libm-test-ulps: Regenerate. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> (cherry picked from commit `428fc49eaa`)	2019-01-11 14:14:10 -02:00
Tulio Magno Quites Machado Filho	d851e9199e	[BZ #21745 ] powerpc: build some IFUNC math functions for libc and libm Some math functions have to be distributed in libc because they're required by printf. libc and libm require their own builds of these functions, e.g. libc functions have to call __stack_chk_fail_local in order to bypass the PLT, while libm functions have to call __stack_chk_fail. While math/Makefile treat the generic cases, i.e. s_isinff, the multiarch Makefile has to treat its own files, i.e. s_isinff-ppc64. [BZ #21745] * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile: [$(subdir) = math] (sysdep_calls): New variable. Has the previous contents of sysdep_routines, but re-sorted.. [$(subdir) = math] (sysdep_routines): Re-use the contents from sysdep_calls. [$(subdir) = math] (libm-sysdep_routines): Remove the functions defined in sysdep_calls and replace by the respective m_* names. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: (compat_symbol): Undefine to avoid duplicated compat symbols in libc. (cherry picked from commit `61c45f2505`)	2019-01-11 14:14:10 -02:00
Florian Weimer	a5bd0ba192	intl: Do not return NULL on asprintf failure in gettext [BZ #24018 ] Fixes commit `9695dd0c93` ("DCIGETTEXT: Use getcwd, asprintf to construct absolute pathname"). (cherry picked from commit `8c1aafc1f3`)	2019-01-02 20:01:05 +01:00
Florian Weimer	94417f6c26	malloc: Always call memcpy in _int_realloc [BZ #24027 ] This commit removes the custom memcpy implementation from _int_realloc for small chunk sizes. The ncopies variable has the wrong type, and an integer wraparound could cause the existing code to copy too few elements (leaving the new memory region mostly uninitialized). Therefore, removing this code fixes bug 24027. (cherry picked from commit `b50dd3bc8c`)	2019-01-01 11:07:26 +01:00
Florian Weimer	a0bc5dd3be	CVE-2018-19591: if_nametoindex: Fix descriptor for overlong name [BZ #23927 ] (cherry picked from commit `d527c860f5`)	2018-11-27 21:35:31 +01:00
Alexandra Hájková	27e46f5836	Add an additional test to resolv/tst-resolv-network.c Test for the infinite loop in getnetbyname, bug #17630. (cherry picked from commit `ac8060265b`)	2018-11-09 14:43:45 +01:00
Andreas Schwab	fcd86c6253	libanl: properly cleanup if first helper thread creation failed (bug 22927) (cherry picked from commit `bd3b0fbae3`)	2018-11-06 17:12:07 +01:00
Adhemerval Zanella	dc40423dba	x86: Fix Haswell CPU string flags (BZ#23709) Th commit 'Disable TSX on some Haswell processors.' (`2702856bf4`) changed the default flags for Haswell models. Previously, new models were handled by the default switch path, which assumed a Core i3/i5/i7 if AVX is available. After the patch, Haswell models (0x3f, 0x3c, 0x45, 0x46) do not set the flags Fast_Rep_String, Fast_Unaligned_Load, Fast_Unaligned_Copy, and Prefer_PMINUB_for_stringop (only the TSX one). This patch fixes it by disentangle the TSX flag handling from the memory optimization ones. The strstr case cited on patch now selects the __strstr_sse2_unaligned as expected for the Haswell cpu. Checked on x86_64-linux-gnu. [BZ #23709] * sysdeps/x86/cpu-features.c (init_cpu_features): Set TSX bits independently of other flags. (cherry picked from commit `c3d8dc45c9`)	2018-11-02 11:14:05 +01:00

1 2 3 4 5 ...

32150 Commits