glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-26 19:23:34 +08:00

Author	SHA1	Message	Date
Alejandro Colomar	53fcdf5f74	Silence most -Wzero-as-null-pointer-constant diagnostics Replace 0 by NULL and {0} by {}. Omit a few cases that aren't so trivial to fix. Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117059> Link: <https://software.codidact.com/posts/292718/292759#answer-292759> Signed-off-by: Alejandro Colomar <alx@kernel.org>	2024-11-25 16:45:59 -03:00
Yannick Le Pennec	83d4b42ded	sysdeps: linux: Fix output of LD_SHOW_AUXV=1 for AT_RSEQ_* The constants themselves were added to elf.h back in `8754a4133e` but the array in _dl_show_auxv wasn't modified accordingly, resulting in the following output when running LD_SHOW_AUXV=1 /bin/true on recent Linux: AT_??? (0x1b): 0x1c AT_??? (0x1c): 0x20 With this patch: AT_RSEQ_FEATURE_SIZE: 28 AT_RSEQ_ALIGN: 32 Tested on Linux 6.11 x86_64 Signed-off-by: Yannick Le Pennec <yannick.lepennec@live.fr> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-25 16:45:59 -03:00
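The two auxiliary-vector entries named in the commit above can also be read from a program. The following is a minimal sketch: `show_rseq_auxv` is a hypothetical helper name, and the fallback values 27 and 28 are assumed from the `0x1b` and `0x1c` tags in the commit's before/after output, for headers that predate the constants.

```c
#include <stdio.h>
#include <sys/auxv.h>

/* Fallbacks for older headers.  27 (0x1b) and 28 (0x1c) are the tags
   that were printed as AT_??? before the fix; treat them as assumptions
   if your <elf.h> does not define the names.  */
#ifndef AT_RSEQ_FEATURE_SIZE
# define AT_RSEQ_FEATURE_SIZE 27
#endif
#ifndef AT_RSEQ_ALIGN
# define AT_RSEQ_ALIGN 28
#endif

/* Hypothetical helper: print both entries and return how many of them
   the kernel actually provided (getauxval returns 0 for a missing tag).  */
static int
show_rseq_auxv (void)
{
  unsigned long size = getauxval (AT_RSEQ_FEATURE_SIZE);
  unsigned long align = getauxval (AT_RSEQ_ALIGN);

  printf ("AT_RSEQ_FEATURE_SIZE: %lu\n", size);
  printf ("AT_RSEQ_ALIGN:        %lu\n", align);
  return (size != 0) + (align != 0);
}
```

On kernels that do not export these entries both values print as 0 and the helper returns 0.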
Michael Jeanson	d9f40387d3	nptl: initialize cpu_id_start prior to rseq registration When adding explicit initialization of rseq fields prior to registration, I glossed over the fact that 'cpu_id_start' is also documented as initialized by user-space. While current kernels don't validate the content of this field on registration, future ones could. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>	2024-11-25 19:42:14 +01:00
Joseph Myers	99671e72bb	Add multithreaded test of sem_getvalue Test coverage of sem_getvalue is fairly limited. Add a test that runs it on threads on each CPU. For this purpose I adapted tst-skeleton-thread-affinity.c; it didn't seem very suitable to use as-is or include directly in a different test doing things per-CPU, but did seem a suitable starting point (thus sharing tst-skeleton-affinity.c) for such testing. Tested for x86_64.	2024-11-22 16:58:51 +00:00
Yury Khrustalev	f4d00dd60d	AArch64: Add support for memory protection keys This patch adds support for memory protection keys on AArch64 systems with enabled Stage 1 permission overlays feature introduced in Armv8.9 / 9.4 (FEAT_S1POE) [1]. 1. Internal functions "pkey_read" and "pkey_write" to access data associated with memory protection keys. 2. Implementation of API functions "pkey_get" and "pkey_set" for the AArch64 target. 3. AArch64-specific PKEY flags for READ and EXECUTE (see below). 4. New target-specific test that checks behaviour of pkeys on AArch64 targets. 5. This patch also extends existing generic test for pkeys. 6. HWCAP constant for Permission Overlay Extension feature. To support more accurate mapping of underlying permissions to the PKEY flags, we introduce additional AArch64-specific flags. The full list of flags is: - PKEY_UNRESTRICTED: 0x0 (for completeness) - PKEY_DISABLE_ACCESS: 0x1 (existing flag) - PKEY_DISABLE_WRITE: 0x2 (existing flag) - PKEY_DISABLE_EXECUTE: 0x4 (new flag, AArch64 specific) - PKEY_DISABLE_READ: 0x8 (new flag, AArch64 specific) The problem here is that PKEY_DISABLE_ACCESS has unusual semantics as it overlaps with existing PKEY_DISABLE_WRITE and new PKEY_DISABLE_READ. For this reason mapping between permission bits RWX and "restrictions" bits awxr (a for disable access, etc) becomes complicated: - PKEY_DISABLE_ACCESS disables both R and W - PKEY_DISABLE_{WRITE,READ} disables W and R respectively - PKEY_DISABLE_EXECUTE disables X Combinations like the one below are accepted although they are redundant: - PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ | PKEY_DISABLE_WRITE Reverse mapping tries to retain backward compatibility and ORs PKEY_DISABLE_ACCESS whenever both flags PKEY_DISABLE_READ and PKEY_DISABLE_WRITE would be present. This will break code that compares pkey_get output with == instead of using bitwise operations. The latter is more correct since PKEY_* constants are essentially bit flags. It should be noted that PKEY_DISABLE_ACCESS does not prevent execution. [1] https://developer.arm.com/documentation/ddi0487/ka/ section D8.4.1.4 Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-20 11:30:58 +00:00
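The mapping between RWX permissions and restriction flags described above can be modelled in a few lines. This is only an illustration of the rules stated in the commit message, not the glibc implementation: the `EX_` constants and both function names are hypothetical, with the flag values copied from the list above.

```c
#include <stdbool.h>

/* Flag values as listed in the commit message; EX_ names are local
   stand-ins so this compiles on any target.  */
enum
{
  EX_PKEY_DISABLE_ACCESS = 0x1,
  EX_PKEY_DISABLE_WRITE = 0x2,
  EX_PKEY_DISABLE_EXECUTE = 0x4,
  EX_PKEY_DISABLE_READ = 0x8
};

/* Hypothetical permission bits for the model.  */
enum { EX_PERM_R = 4, EX_PERM_W = 2, EX_PERM_X = 1 };

/* Restrictions -> permissions: ACCESS removes R and W, the other flags
   remove one permission each.  ACCESS does not affect X.  */
static unsigned int
restrictions_to_perms (unsigned int restrictions)
{
  unsigned int perms = EX_PERM_R | EX_PERM_W | EX_PERM_X;
  if (restrictions & EX_PKEY_DISABLE_ACCESS)
    perms &= ~(unsigned int) (EX_PERM_R | EX_PERM_W);
  if (restrictions & EX_PKEY_DISABLE_WRITE)
    perms &= ~(unsigned int) EX_PERM_W;
  if (restrictions & EX_PKEY_DISABLE_READ)
    perms &= ~(unsigned int) EX_PERM_R;
  if (restrictions & EX_PKEY_DISABLE_EXECUTE)
    perms &= ~(unsigned int) EX_PERM_X;
  return perms;
}

/* Permissions -> restrictions: ACCESS is ORed in whenever both READ and
   WRITE end up disabled, as the reverse mapping above describes.  */
static unsigned int
perms_to_restrictions (unsigned int perms)
{
  unsigned int r = 0;
  if (!(perms & EX_PERM_R))
    r |= EX_PKEY_DISABLE_READ;
  if (!(perms & EX_PERM_W))
    r |= EX_PKEY_DISABLE_WRITE;
  if (!(perms & EX_PERM_X))
    r |= EX_PKEY_DISABLE_EXECUTE;
  if ((r & EX_PKEY_DISABLE_READ) && (r & EX_PKEY_DISABLE_WRITE))
    r |= EX_PKEY_DISABLE_ACCESS;
  return r;
}

/* Bitwise test, which keeps working when extra flags are set.  */
static bool
access_disabled (unsigned int restrictions)
{
  return (restrictions & EX_PKEY_DISABLE_ACCESS) != 0;
}
```

In this model an execute-only key maps to `0xb`, so a check written as `== EX_PKEY_DISABLE_ACCESS` fails while `access_disabled` still reports the restriction.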
Peter Bergner	229265cc2c	powerpc: Improve the inline asm for syscall wrappers Update the inline asm syscall wrappers to match the newer register constraint usage in INTERNAL_VSYSCALL_CALL_TYPE. Use the faster mfocrf instruction when available, rather than the slower mfcr microcoded instruction.	2024-11-19 12:43:57 -05:00
Adhemerval Zanella	461cab1de7	linux: Add support for getrandom vDSO Linux 6.11 has getrandom() in vDSO. It operates on a thread-local opaque state allocated with mmap using flags specified by the vDSO. Multiple states are allocated at once, as many as fit into a page, and these are held in an array of available states to be doled out to each thread upon first use, and recycled when a thread terminates. As these states run low, more are allocated. To make this procedure async-signal-safe, a simple guard is used in the LSB of the opaque state address, falling back to the syscall if there's reentrancy contention. Also, _Fork() is handled by blocking signals on opaque state allocation (so _Fork() always sees a consistent state even if it interrupts a getrandom() call) and by iterating over the thread stack cache on reclaim_stack. Each opaque state will be in the free states list (grnd_alloc.states) or allocated to a running thread. The cancellation is handled by always using GRND_NONBLOCK flags while calling the vDSO, and falling back to the cancellable syscall if the kernel returns EAGAIN (would block). Since getrandom is not defined by POSIX and cancellation is supported as an extension, the cancellation is handled as 'may occur' instead of 'shall occur' [1], meaning that if vDSO does not block (the expected behavior) getrandom will not act as a cancellation entrypoint. It avoids a pthread_testcancel call on the fast path (different than 'shall occur' functions, like sem_wait()). It is currently enabled for x86_64, which is available in Linux 6.11, and aarch64, powerpc32, powerpc64, loongarch64, and s390x, which are available in Linux 6.12. Link: https://pubs.opengroup.org/onlinepubs/9799919799/nframe.html [1] Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> # x86_64 Tested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> # x86_64, aarch64 Tested-by: Xi Ruoyao <xry111@xry111.site> # x86_64, aarch64, loongarch64 Tested-by: Stefan Liebler <stli@linux.ibm.com> # s390x	2024-11-12 14:42:12 -03:00
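The "try non-blocking first, fall back to the blocking call on EAGAIN" pattern that the commit uses internally can be sketched at user level with the public `getrandom` function. This is an analogy only: `fill_random` is a hypothetical helper, and whether a given call is served by the vDSO or the syscall is decided inside glibc and is not visible here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical helper: ask for random bytes without blocking, and only
   if the kernel reports EAGAIN (entropy pool not ready) retry with a
   call that is allowed to block.  Returns the number of bytes written
   or -1 with errno set.  */
static ssize_t
fill_random (void *buf, size_t len)
{
  ssize_t n = getrandom (buf, len, GRND_NONBLOCK);
  if (n < 0 && errno == EAGAIN)
    n = getrandom (buf, len, 0);
  return n;
}
```

Callers that need more than a few hundred bytes should loop on the return value, since a single call may return fewer bytes than requested.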
Michael Jeanson	97f60abd25	nptl: initialize rseq area prior to registration Per the rseq syscall documentation, 3 fields are required to be initialized by userspace prior to registration, they are 'cpu_id', 'rseq_cs' and 'flags'. Since we have no guarantee that 'struct pthread' is cleared on all architectures, explicitly set those 3 fields prior to registration. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-11-07 22:23:49 +01:00
Yury Khrustalev	ff254cabd6	misc: Align argument name for pkey_*() functions with the manual Change name of the access_rights argument to access_restrictions of the following functions: - pkey_alloc() - pkey_set() as this argument refers to access restrictions rather than access rights and previous name might have been misleading. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-11-06 13:11:33 +00:00
Aurelien Jarno	273694cd78	Add Arm HWCAP2_* constants from Linux 3.15 and 6.2 to <bits/hwcap.h> Linux 3.15 and 6.2 added HWCAP2_* values for Arm. These bits have already been added to dl-procinfo.{c,h} in commits `9aea0cb842` and `8ebe9c0b38`. Also add them to <bits/hwcap.h> so that they can be used in user code. For example, for checking bits in the value returned by getauxval(AT_HWCAP2). Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>	2024-11-05 21:03:37 +01:00
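Checking bits in the value returned by `getauxval(AT_HWCAP2)`, as the commit above suggests, might look like the sketch below. `hwcap2_has` is a hypothetical helper; the mask would be one of the `HWCAP2_*` constants from `<bits/hwcap.h>` on the target in question, and no specific constant name is assumed here.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/* Hypothetical helper: true if every bit in MASK is set in the
   AT_HWCAP2 auxiliary-vector entry.  getauxval returns 0 when the
   entry is absent, so any non-zero mask then yields false.  */
static bool
hwcap2_has (unsigned long mask)
{
  return (getauxval (AT_HWCAP2) & mask) == mask;
}
```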
caiyinyu	93ced0e1b8	LoongArch: Add RSEQ_SIG in rseq.h. Signed-off-by: caiyinyu <caiyinyu@loongson.cn>	2024-11-01 10:41:20 +08:00
Sachin Monga	383e4f53cb	powerpc64: Obviate the need for ROP protection in clone/clone3 Save lr in a non-volatile register before scv in clone/clone3. For clone, the non-volatile register was unused and already saved/restored. Remove the dead code from clone. Signed-off-by: Sachin Monga <smonga@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>	2024-10-30 16:50:04 -04:00
Florian Weimer	4f5f8343c3	Linux: Match kernel text for SCHED_ macros This avoids -Werror build issues in strace, which bundles UAPI headers, but does not include them as system headers. Fixes commit `c444cc1d83` ("Linux: Add missing scheduler constants to <sched.h>"). Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-10-25 16:46:30 +02:00
DJ Delorie	81439a116c	configure: default to --prefix=/usr on GNU/Linux I'm getting tired of always typing --prefix=/usr so making it the default. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-22 18:11:49 -04:00
Adhemerval Zanella	ab564362d0	linux: Fix tst-syscall-restart.c on old gcc (BZ 32283) To avoid a parameter name omitted error.	2024-10-18 08:48:22 -03:00
Adhemerval Zanella	2c1903cbba	sparc: Fix restartable syscalls (BZ 32173) The commit 'sparc: Use Linux kABI for syscall return' (`86c5d2cf0c`) did not take into account a subtle sparc syscall kABI constraint. For syscalls that might block indefinitely, on an interrupt (like SIGCONT) the kernel will set the instruction pointer to just before the syscall: arch/sparc/kernel/signal_64.c static void do_signal(struct pt_regs *regs, unsigned long orig_i0) { [...] if (restart_syscall) { switch (regs->u_regs[UREG_I0]) { case ERESTARTNOHAND: case ERESTARTSYS: case ERESTARTNOINTR: /* replay the system call when we are done */ regs->u_regs[UREG_I0] = orig_i0; regs->tpc -= 4; regs->tnpc -= 4; pt_regs_clear_syscall(regs); fallthrough; case ERESTART_RESTARTBLOCK: regs->u_regs[UREG_G1] = __NR_restart_syscall; regs->tpc -= 4; regs->tnpc -= 4; pt_regs_clear_syscall(regs); } However, on a SIGCONT it seems that the 'g1' register is being clobbered after the syscall returns. Before `86c5d2cf0c`, the 'g1' was always placed just before the 'ta' instruction which then reloads the syscall number and restarts the syscall. On master, where 'g1' might be placed before 'ta': $ cat test.c #include <unistd.h> int main () { pause (); } $ gcc test.c -o test $ strace -f ./t [...] ppoll(NULL, 0, NULL, NULL, 0 On another terminal $ kill -STOP 2262828 $ strace -f ./t [...] --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=2521813, si_uid=8289} --- --- stopped by SIGSTOP --- And then $ kill -CONT 2262828 Results in: --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=2521813, si_uid=8289} --- restart_syscall(<... resuming interrupted ppoll ...>) = -1 EINTR (Interrupted system call) Where the expected behaviour would be: $ strace -f ./t [...] ppoll(NULL, 0, NULL, NULL, 0) = ? ERESTARTNOHAND (To be restarted if no handler) --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=2521813, si_uid=8289} --- --- stopped by SIGSTOP --- --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=2521813, si_uid=8289} --- ppoll(NULL, 0, NULL, NULL, 0 Just moving the 'g1' setting near the syscall asm does not suffice, since the compiler might optimize it away (as I saw on cancellation.c by trying this fix). Instead, I have changed the inline asm to put the 'g1' setup in the asm block. This requires changing the asm constraint for INTERNAL_SYSCALL_NCS, since the syscall number is not constant. Checked on sparc64-linux-gnu. Reported-by: René Rebe <rene@exactcode.de> Tested-by: Sam James <sam@gentoo.org> Reviewed-by: Sam James <sam@gentoo.org>	2024-10-16 14:54:24 -03:00
caiyinyu	2fffaffde8	LoongArch: Regenerate loongarch/arch-syscall.h by build-many-glibcs.py update-syscalls.	2024-10-12 15:50:11 +08:00
Adhemerval Zanella	5ffc903216	misc: Add support for Linux uio.h RWF_ATOMIC flag Linux 6.11 adds the new flag for pwritev2 (commit c34fc6f26ab86d03a2d47446f42b6cd492dfdc56). Checked on x86_64-linux-gnu on 6.11 kernel. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-10 10:28:01 -03:00
Adhemerval Zanella	934d0bf426	Update kernel version to 6.11 in header constant tests This patch updates the kernel version in the tests tst-mount-consts.py, and tst-sched-consts.py to 6.11. There are no new constants covered by these tests in 6.11. Tested with build-many-glibcs.py. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-10 10:27:55 -03:00
Adhemerval Zanella	f6e849fd7c	linux: Add MAP_DROPPABLE from Linux 6.11 This flag requests that the pages never be written out to swap. They are zeroed under memory pressure (so the kernel can just drop them), the mapping is inherited by fork, it is not counted against the mlock budget, and if there is not enough memory to service a page fault there is no fatal error (so no signal is sent). Tested with build-many-glibcs.py. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-10 10:27:53 -03:00
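A caller that wants a droppable cache region but must also run on older kernels could probe for the flag and fall back, roughly as sketched below. `alloc_cache` is a hypothetical helper, and the fallback value `0x08` is assumed from the Linux 6.11 UAPI headers for the case where `<sys/mman.h>` does not define the constant yet.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Assumed fallback for headers older than the commit above.  */
#ifndef MAP_DROPPABLE
# define MAP_DROPPABLE 0x08
#endif

/* Hypothetical helper: map LEN bytes of anonymous memory, droppable if
   the kernel supports it.  *DROPPABLE is set to 1 only when the
   MAP_DROPPABLE mapping succeeded; otherwise an ordinary private
   mapping is tried.  Returns MAP_FAILED if both attempts fail.  */
static void *
alloc_cache (size_t len, int *droppable)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_ANONYMOUS | MAP_DROPPABLE, -1, 0);
  *droppable = (p != MAP_FAILED);
  if (p == MAP_FAILED)
    p = mmap (NULL, len, PROT_READ | PROT_WRITE,
              MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
  return p;
}
```

Because droppable pages may read back as zero after memory pressure, data kept there has to be something the program can regenerate.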
Adhemerval Zanella	86f06282cc	Update PIDFD_* constants for Linux 6.11 Linux 6.11 adds some more PIDFD_* constants for 'pidfs: allow retrieval of namespace file descriptors' (5b08bd408534bfb3a7cf5778da5b27d4e4fffe12). Tested with build-many-glibcs.py. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-10 10:27:51 -03:00
Adhemerval Zanella	02de16df48	Update syscall lists for Linux 6.11 The syscall changes in Linux 6.11 are: * fstat/newfstatat for loongarch (it should be safe to add them since `255dc1e4ed` undefines them). * clone3 for nios2, which only adds the entry point but defines __ARCH_BROKEN_SYS_CLONE3 (the syscall will always return ENOSYS). * uretprobe for x86_64 and x32. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-10-10 10:27:49 -03:00
Adhemerval Zanella	d40ac01cbb	stdlib: Make abort/_Exit AS-safe (BZ 26275) The recursive lock used on abort does not synchronize with new process creation (either by fork-like interfaces or posix_spawn ones), nor is it reinitialized after fork(). Also, the SIGABRT unblock before raise() shows another race condition, where a fork or posix_spawn() call by another thread, just after the recursive lock release and before the SIGABRT signal, might create programs with an unexpected signal mask. With the default option (without POSIX_SPAWN_SETSIGDEF), the process can see SIG_DFL for SIGABRT, where it should be SIG_IGN. To make abort AS-safe, raise() does not change the process signal mask, and an AS-safe lock is used if a SIGABRT handler is installed or the signal is blocked or ignored. With the signal mask change removed, there is no need to use a recursive lock. The lock is also taken on both _Fork() and posix_spawn(), to prevent the spawned process from seeing the abort handler as SIG_DFL. A read-write lock is used to avoid serializing _Fork and posix_spawn execution. Both sigaction (SIGABRT) and abort() need to take the lock as writer (since both change the disposition). The fallback is also simplified: there is no need to use a loop of ABORT_INSTRUCTION after _exit() (if the syscall does not terminate the process, the system is broken). The proposed fix changes how setjmp works on a SIGABRT handler, where glibc does not save the signal mask. So usage like the below will now always abort. static volatile int chk_fail_ok; static jmp_buf chk_fail_buf; static void handler (int sig) { if (chk_fail_ok) { chk_fail_ok = 0; longjmp (chk_fail_buf, 1); } else _exit (127); } [...] signal (SIGABRT, handler); [....] chk_fail_ok = 1; if (! setjmp (chk_fail_buf)) { /* Something that can call abort, like a failed fortify function. */ chk_fail_ok = 0; printf ("FAIL\n"); } Such cases will need to use sigsetjmp instead. The _dl_start_profile calls sigaction through _profil, and to avoid pulling abort() on loader the call is replaced with __libc_sigaction. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-10-08 14:40:12 -03:00
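The `sigsetjmp` variant that the commit recommends might look like the following sketch. All names (`call_guarded`, `on_abort`, `always_aborts`, `does_nothing`) are hypothetical; the point is only that `sigsetjmp (env, 1)` saves the signal mask, so `siglongjmp` out of the handler leaves SIGABRT unblocked for the next attempt.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>

static sigjmp_buf abort_env;
static volatile sig_atomic_t abort_expected;

/* SIGABRT handler: jump back only while a guarded call is running.  */
static void
on_abort (int sig)
{
  (void) sig;
  if (abort_expected)
    {
      abort_expected = 0;
      siglongjmp (abort_env, 1);   /* Restores the saved signal mask.  */
    }
  _Exit (127);
}

/* Hypothetical helper: run FN and return 1 if it called abort (),
   0 if it returned normally.  */
static int
call_guarded (void (*fn) (void))
{
  struct sigaction sa, old;
  int recovered;

  sa.sa_handler = on_abort;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  sigaction (SIGABRT, &sa, &old);

  if (sigsetjmp (abort_env, 1) == 0)   /* 1 = save the signal mask.  */
    {
      abort_expected = 1;
      fn ();
      abort_expected = 0;
      recovered = 0;
    }
  else
    recovered = 1;

  sigaction (SIGABRT, &old, NULL);
  return recovered;
}

static void always_aborts (void) { abort (); }
static void does_nothing (void) { }
```

Calling `call_guarded (always_aborts)` twice in a row exercises the case the commit warns about: with plain `setjmp` the second `abort` would find SIGABRT still blocked and terminate the process.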
Adhemerval Zanella	55d33108c7	linux: Use GLRO(dl_vdso_time) on time The BZ#24967 fix (`1bdda52fe9`) missed the time for architectures that define USE_IFUNC_TIME. Although it is not an issue, since there is no pointer mangling, there is also no need to call dl_vdso_vsym since the vDSO setup was already done by the loader. Checked on x86_64-linux-gnu and i686-linux-gnu.	2024-10-08 13:28:21 -03:00
Adhemerval Zanella	02b195d30f	linux: Use GLRO(dl_vdso_gettimeofday) on gettimeofday The BZ#24967 fix (`1bdda52fe9`) missed the gettimeofday for architectures that define USE_IFUNC_GETTIMEOFDAY. Although it is not an issue, since there is no pointer mangling, there is also no need to call dl_vdso_vsym since the vDSO setup was already done by the loader. Checked on x86_64-linux-gnu and i686-linux-gnu.	2024-10-08 13:28:21 -03:00
Adhemerval Zanella	5e8cfc5d62	linux: sparc: Fix clone for LEON/sparcv8 (BZ 31394) The sparc clone mitigation (`faeaa3bc9f`) added the use of flushw, which is not supported by LEON/sparcv8. As discussed on libc-alpha, 'ta 3' is a working alternative [1]. [1] https://sourceware.org/pipermail/libc-alpha/2024-August/158905.html Checked with a build for sparcv8-linux-gnu targeting leon. Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-10-01 10:37:21 -03:00
Adhemerval Zanella	49c3682ce1	linux: sparc: Fix syscall_cancel for LEON LEON2/LEON3 are both sparcv8, which does not support branch hints (bne,pn) nor the return instruction. Checked with a build for sparcv8-linux-gnu targeting leon. I also checked some cancellation tests with qemu-system (targeting LEON3). Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-10-01 10:37:21 -03:00
Pavel Kozlov	cc84cd389c	arc: Cleanup arcbe Remove the mention of the arcbe ABI to avoid confusion. The ARC big endian ABI is no longer supported. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-09-25 15:54:07 +01:00
Florian Weimer	4ff55d08df	arc: Remove HAVE_ARC_BE macro and disable big-endian port It is no longer needed, now that ARC is always little endian.	2024-09-25 11:25:22 +02:00
caiyinyu	255dc1e4ed	LoongArch: Undef __NR_fstat and __NR_newfstatat. In Linux 6.11, fstat and newfstatat are added back. To avoid the messy usage of the fstat, newfstatat, and statx system calls, we will continue using statx only in glibc, maintaining consistency with previous versions of the LoongArch-specific glibc implementation. Signed-off-by: caiyinyu <caiyinyu@loongson.cn> Reviewed-by: Xi Ruoyao <xry111@xry111.site> Suggested-by: Florian Weimer <fweimer@redhat.com>	2024-09-25 10:00:42 +08:00
Florian Weimer	7e21a65c58	misc: Enable internal use of memory protection keys This adds the necessary hidden prototypes.	2024-09-24 13:23:10 +02:00
Florian Weimer	6f3f6c506c	Linux: readdir64_r should not skip d_ino == 0 entries (bug 32126) This is the same bug as bug 12165, but for readdir_r. The regression test covers both bug 12165 and bug 32126. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-21 19:32:34 +02:00
Florian Weimer	e92718552e	Linux: Use readdir64_r for compat __old_readdir64_r (bug 32128) It is not necessary to do the conversion at the getdents64 layer for readdir64_r. Doing it piecewise for readdir64 is slightly simpler and allows deleting __old_getdents64. This fixes bug 32128 because readdir64_r handles the length check correctly. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-21 19:32:34 +02:00
Joe Ramsay	751a5502be	AArch64: Add vector logp1 alias for log1p This enables vectorisation of C23 logp1, which is an alias for log1p. There are no new tests or ulp entries because the new symbols are simply aliases. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-19 17:53:34 +01:00
Florian Weimer	c444cc1d83	Linux: Add missing scheduler constants to <sched.h> And add a test, misc/tst-sched-consts, that checks consistency with <sched.h>. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-09-11 10:05:08 +02:00
Florian Weimer	21571ca0d7	Linux: Add the sched_setattr and sched_getattr functions And struct sched_attr. In sysdeps/unix/sysv/linux/bits/sched.h, the hack that defines sched_param around the inclusion of <linux/sched/types.h> is quite ugly, but the definition of struct sched_param has already been dropped by the kernel, so there is nothing else we can do and maintain compatibility of <sched.h> with a wide range of kernel header versions. (An alternative would involve introducing a separate header for this functionality, but this seems unnecessary.) The existing sched_* functions that change scheduler parameters are already incompatible with PTHREAD_PRIO_PROTECT mutexes, so there is no harm in adding more functionality in this area. The documentation mostly defers to the Linux manual pages. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-09-11 10:05:08 +02:00
Florian Weimer	61f2c2e1d1	Linux: readdir_r needs to report getdents failures (bug 32124) Upon error, return the errno value set by the __getdents call in __readdir_unlocked. Previously, kernel-reported errors were ignored. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-05 12:05:32 +02:00
Adhemerval Zanella	1927f718fc	linux: mips: Fix syscall_cancell build for __mips_isa_rev >= 6 Use beqzc instead of bnel. Checked with a mipsisa64r6el-n64-linux-gnu build and some nptl cancellation tests on qemu.	2024-09-02 12:30:45 -03:00
Adhemerval Zanella	89b53077d2	nptl: Fix Race conditions in pthread cancellation [BZ#12683] The current racy approach is to enable asynchronous cancellation before making the syscall and restore the previous cancellation type once the syscall returns, and check if cancellation has happened during the cancellation entrypoint. As described in BZ#12683, this approach shows 2 problems: 1. Cancellation can act after the syscall has returned from the kernel, but before userspace saves the return value. It might result in a resource leak if the syscall allocated a resource or a side effect (partial read/write), and there is no way for the program to handle it with cancellation handlers. 2. If a signal is handled while the thread is blocked at a cancellable syscall, the entire signal handler runs with asynchronous cancellation enabled. This can lead to issues if the signal handler calls functions which are async-signal-safe but not async-cancel-safe. For the cancellation to work correctly, there are 5 points at which the cancellation signal could arrive: [ ... )[ ... )[ syscall ]( ... 1 2 3 4 5 1. Before initial testcancel, e.g. [... testcancel) 2. Between testcancel and syscall start, e.g. [testcancel...syscall start) 3. While syscall is blocked and no side effects have yet taken place, e.g. [ syscall ] 4. Same as 3 but with side-effects having occurred (e.g. a partial read or write). 5. After syscall end e.g. (syscall end...] And libc wants to act on cancellation in cases 1, 2, and 3 but not in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually happen in the next cancellable entrypoint without any further external event. The proposed solution for each case is: 1. Do a conditional branch based on whether the thread has received a cancellation request; 2. It can be caught by the signal handler determining that the saved program counter (from the ucontext_t) is in some address range beginning just before the "testcancel" and ending with the syscall instruction. 3. 
SIGCANCEL can be caught by the signal handler and determine that the saved program counter (from the ucontext_t) is in the address range beginning just before "testcancel" and ending with the first uninterruptable (via a signal) syscall instruction that enters the kernel. 4. In this case, except for certain syscalls that ALWAYS fail with EINTR even for non-interrupting signals, the kernel will reset the program counter to point at the syscall instruction during signal handling, so that the syscall is restarted when the signal handler returns. So, from the signal handler's standpoint, this looks the same as case 2, and thus it's taken care of. 5. For syscalls with side-effects, the kernel cannot restart the syscall; when it's interrupted by a signal, the kernel must cause the syscall to return with whatever partial result is obtained (e.g. partial read or write). 6. The saved program counter points just after the syscall instruction, so the signal handler won't act on cancellation. This is similar to 4. since the program counter is past the syscall instruction. So the proposed fixes are: 1. Remove the enable_asynccancel/disable_asynccancel function usage in cancellable syscall definition and instead make them call a common symbol that will check if cancellation is enabled (__syscall_cancel at nptl/cancellation.c), call the arch-specific cancellable entry-point (__syscall_cancel_arch), and cancel the thread when required. 2. Provide an arch-specific generic system call wrapper function that contains global markers. These markers will be used in the SIGCANCEL signal handler to check if the interruption has been called in a valid syscall and if the syscall has side-effects. A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c is provided. However, the markers may not be set at the expected places depending on how INTERNAL_SYSCALL_NCS is implemented by the architecture. It is expected that all architectures add an arch-specific implementation. 3. 
Rewrite the SIGCANCEL asynchronous handler to check for both the cancellation type and whether the current IP from the signal handler falls between the global markers, and act accordingly. 4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to use the appropriate cancelable syscalls. 5. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide cancelable futex calls. Some architectures require specific support on syscall handling: * On i386 the syscall cancel bridge needs to use the old int80 instruction because with the optimized vDSO symbol the resulting PC value for an interrupted syscall points to an address outside the expected markers in __syscall_cancel_arch. It has been discussed on LKML [1] how the kernel could help userland to accomplish it, but afaik the discussion has stalled. Also, sysenter should not be used directly by libc since its calling convention is set by the kernel depending on the underlying x86 chip (check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60). * mips o32 is the only kABI that requires 7 argument syscall, and to avoid adding a requirement on all architectures to support it, mips support is added with extra internal defines. Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu, powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu. [1] https://lkml.org/lkml/2016/3/8/1105 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-08-23 14:27:43 -03:00
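The marker check described in the commit above reduces to a range test on the interrupted program counter. The sketch below is a toy model with made-up types and names (`struct cancel_window`, `should_cancel_now`), not the glibc handler; it assumes `start` is the address just before the cancellation test and `end` is the address just past the syscall instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two global markers around the syscall
   wrapper: [start, end) covers the cancellation test up to and
   including the syscall instruction itself.  */
struct cancel_window
{
  uintptr_t start;
  uintptr_t end;
};

/* Act on cancellation only if a request is pending and the saved PC
   lies inside the window.  A PC at END or later means the syscall has
   already returned (possibly with side effects), so cancellation is
   deferred to the next cancellable entry point.  */
static bool
should_cancel_now (const struct cancel_window *w, uintptr_t pc,
                   bool cancel_pending)
{
  return cancel_pending && pc >= w->start && pc < w->end;
}
```

A syscall that the kernel restarts has its PC reset to the syscall instruction, which falls inside the window and is therefore treated like case 2 in the list above.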
Maciej W. Rozycki	9fb237a1c8	nptl: Fix extraneous testing run by tst-rseq-nptl in the test driver Fix an issue with commit `8f4632deb3` ("Linux: rseq registration tests") and prevent testing from being run in the process of the test driver itself rather than just the test child where one has been forked. The problem here is the unguarded use of a destructor to call a part of the testing. The destructor function, 'do_rseq_destructor_test' is called implicitly at program completion, however because it is associated with the executable itself rather than an individual process, it is called both in the test child and in the test driver itself. Prevent this from happening by providing a guard variable that only enables test invocation from 'do_rseq_destructor_test' in the process that has first run 'do_test'. Consequently extra testing is invoked from 'do_rseq_destructor_test' only once and in the correct process, regardless of the use or the lack of the '--direct' option. Where called in the controlling test driver process that has never called 'do_test' the destructor function silently returns right away without taking any further actions, letting the test driver fail gracefully where applicable. This arrangement prevents 'tst-rseq-nptl' from ever causing testing to hang forever and never complete, such as currently happening with the 'mips-linux-gnu' (o32 ABI) target. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-08-16 14:38:33 +01:00
Carlos O'Donell	b22923abb0	Report error if setaffinity wrapper fails (Bug 32040) Previously if the setaffinity wrapper failed the rest of the subtest would not execute and the current subtest would be reported as passing. Now if the setaffinity wrapper fails the subtest is correctly reported as failing. Tested manually by changing the conditions of the affinity call including setting size to zero, or checking the wrong condition. No regressions on x86_64. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-08-15 15:28:48 -04:00
H.J. Lu	ff0320bec2	Add mremap tests Add tests for MREMAP_MAYMOVE and MREMAP_FIXED. On Linux, also test MREMAP_DONTUNMAP. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-08-01 05:06:12 -07:00
H.J. Lu	6c40cb0e9f	linux: Update the mremap C implementation [BZ #31968 ] Update the mremap C implementation to support the optional argument for MREMAP_DONTUNMAP added in Linux 5.7 since it may not always be correct to implement a variadic function as a non-variadic function on all Linux targets. Return MAP_FAILED and set errno to EINVAL for unknown flag bits. This fixes BZ #31968. Note: A test must be added when a new flag bit is introduced. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-08-01 05:06:12 -07:00
Adhemerval Zanella	28f8cee64a	Add F_DUPFD_QUERY from Linux 6.10 to bits/fcntl-linux.h It was added by commit c62b758bae6af16 as a way for userspace to check if two file descriptors refer to the same struct file. Checked on aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-07-30 08:52:52 -03:00
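Using the new command to ask whether two descriptors share one open file could look like this sketch. `same_open_file` is a hypothetical wrapper, and the fallback value 1027 (`F_LINUX_SPECIFIC_BASE` + 3) is assumed from the Linux 6.10 UAPI headers for the case where `<fcntl.h>` does not define `F_DUPFD_QUERY` yet.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Assumed fallback for headers older than the commit above.  */
#ifndef F_DUPFD_QUERY
# define F_DUPFD_QUERY 1027
#endif

/* Hypothetical wrapper: returns 1 if FD1 and FD2 refer to the same
   struct file, 0 if they do not, and -1 with errno set if the kernel
   cannot answer (for example EINVAL on kernels before 6.10).  */
static int
same_open_file (int fd1, int fd2)
{
  return fcntl (fd1, F_DUPFD_QUERY, fd2);
}
```

A descriptor and its `dup` copy should compare as the same file on a supporting kernel, while the two ends of a pipe should not.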
Adhemerval Zanella	e433cdec9b	Update kernel version to 6.10 in header constant tests This patch updates the kernel version in the tests tst-mman-consts.py, tst-mount-consts.py, and tst-pidfd-consts.py to 6.9. There are no new constants covered by these tests in 6.10. Tested with build-many-glibcs.py. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-07-30 08:48:51 -03:00
Adhemerval Zanella	eb0776d4e1	Update syscall lists for Linux 6.10 Linux 6.10 changes for syscall are: * mseal for all architectures. * map_shadow_stack for x32. * Replace sync_file_range with sync_file_range2 for csky (which fixes a broken sync_file_range usage). Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-07-30 08:48:51 -03:00
Michael Karcher	faeaa3bc9f	Mitigation for "clone on sparc might fail with -EFAULT for no valid reason" (bz 31394) It seems the kernel can not deal with uncommitted stack space in the area intended for the register window when executing the clone() system call. So create a nested frame (proxy for the kernel frame) and flush it from the processor to memory to force committing pages to the stack before invoking the system call. Bug: https://www.mail-archive.com/debian-glibc@lists.debian.org/msg62592.html Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31394 See-also: https://lore.kernel.org/sparclinux/62f9be9d-a086-4134-9a9f-5df8822708af@mkarcher.dialup.fu-berlin.de/ Signed-off-by: Michael Karcher <sourceware-bugzilla@mkarcher.dialup.fu-berlin.de> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-07-29 23:00:39 +02:00
H.J. Lu	8344c1f551	x32/cet: Support shadow stack during startup for Linux 6.10 Use RXX_LP in RTLD_START_ENABLE_X86_FEATURES. Support shadow stack during startup for Linux 6.10: commit 2883f01ec37dd8668e7222dfdb5980c86fdfe277 Author: H.J. Lu <hjl.tools@gmail.com> Date: Fri Mar 15 07:04:33 2024 -0700 x86/shstk: Enable shadow stacks for x32 1. Add shadow stack support to x32 signal. 2. Use the 64-bit map_shadow_stack syscall for x32. 3. Set up shadow stack for x32. Add the map_shadow_stack system call to <fixup-asm-unistd.h> and regenerate arch-syscall.h. Tested on Intel Tiger Lake with CET enabled x32. There are no regressions with CET enabled x86-64. There are no changes in CET enabled x86-64 _dl_start_user. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-07-25 00:17:21 -07:00
Andreas K. Hüttel	ab5748118f	linux: Trivial test output fix in tst-pkey Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>	2024-07-19 22:57:23 +02:00
Adhemerval Zanella	6b7e2e1d61	linux: Also check pkey_get for ENOSYS on tst-pkey (BZ 31996) The powerpc pkey_get/pkey_set support was only added for 64-bit [1], and tst-pkey only checks if the support was present with pkey_alloc (which does not fail on powerpc32, at least running a 64-bit kernel). Checked on powerpc-linux-gnu. [1] https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a803367bab167f5ec4fde1f0d0ec447707c29520 Reviewed-By: Andreas K. Huettel <dilfridge@gentoo.org>	2024-07-19 22:39:44 +02:00


6972 Commits