Changed the makecontext logic: previously the first setcontext jumped
straight to the user callback function and the return address is set
to __startcontext. This does not work when GCS is enabled as the
integrity of the return address is protected, so instead the context
is setup such that setcontext jumps to __startcontext which calls the
user callback (passed in x20).
The map_shadow_stack syscall is used to allocate a suitably sized GCS
(which includes some reserved area to account for altstack signal
handlers and otherwise supports maximum number of 16 byte aligned
stack frames on the given stack) however the GCS is never freed as
the lifetime of ucontext and related stack is user managed.
Userspace ucontext needs to store GCSPR, it does not have to be
compatible with the kernel ucontext. For now we use the linux
struct gcs_context layout but only use the gcspr field from it.
Similar implementation to the longjmp code, supports switching GCS
if the target GCS is capped, and unwinding a continous GCS to a
previous state.
This implementations ensures that longjmp across different stacks
works: it scans for GCS cap token and switches GCS if necessary
then the target GCSPR is restored with a GCSPOPM loop once the
current GCSPR is on the same GCS.
This makes longjmp linear time in the number of jumped over stack
frames when GCS is enabled.
The target specific internal __longjmp is called with a __jmp_buf
argument which has its size exposed in the ABI. On aarch64 this has
no space left, so GCSPR cannot be restored in longjmp in the usual
way, which is needed for the Guarded Control Stack (GCS) extension.
setjmp is implemented via __sigsetjmp which has a jmp_buf argument
however it is also called with __pthread_unwind_buf_t argument cast
to jmp_buf (in cancellation cleanup code built with -fno-exception).
The two types, jmp_buf and __pthread_unwind_buf_t, have common bits
beyond the __jmp_buf field and there is unused space there which we
can use for saving GCSPR.
For this to work some bits of those two generic types have to be
reserved for target specific use and the generic code in glibc has
to ensure that __longjmp is always called with a __jmp_buf that is
embedded into one of those two types. Morally __longjmp should be
changed to take jmp_buf as argument, but that is an intrusive change
across targets.
Note: longjmp is never called with __pthread_unwind_buf_t from user
code, only the internal __libc_longjmp is called with that type and
thus the two types could have separate longjmp implementations on a
target. We don't rely on this now (but migh in the future given that
cancellation unwind does not need to restore GCSPR).
Given the above this patch finds an unused slot for GCSPR. This
placement is not exposed in the ABI so it may change in the future.
This is also very target ABI specific so the generic types cannot
be easily changed to clearly mark the reserved fields.
The Guarded Control Stack instructions can be present even if the
hardware does not support the extension (runtime checked feature),
so the asm code should be backward compatible with old assemblers.
GCC mainline produces a -Wheader-guard error building for x86_64-gnu.
Fix what seems to be incorrect macro naming in the #ifndef
conditional.
Tested with build-many-glibc.py for x86_64-gnu (GCC mainline).
Message-ID: <fd800046-5ecb-ebd5-4df1-29d4eb3d5433@redhat.com>
Test that clock_nanosleep rejects out of range time values.
Test that clock_nanosleep actually sleeps for at least the
requested time relative to the requested clock.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The recursive lock used on abort does not synchronize with a new process
creation (either by fork-like interfaces or posix_spawn ones), nor it
is reinitialized after fork().
Also, the SIGABRT unblock before raise() shows another race condition,
where a fork or posix_spawn() call by another thread, just after the
recursive lock release and before the SIGABRT signal, might create
programs with a non-expected signal mask. With the default option
(without POSIX_SPAWN_SETSIGDEF), the process can see SIG_DFL for
SIGABRT, where it should be SIG_IGN.
To fix the AS-safe, raise() does not change the process signal mask,
and an AS-safe lock is used if a SIGABRT is installed or the process
is blocked or ignored. With the signal mask change removal,
there is no need to use a recursive loc. The lock is also taken on
both _Fork() and posix_spawn(), to avoid the spawn process to see the
abort handler as SIG_DFL.
A read-write lock is used to avoid serialize _Fork and posix_spawn
execution. Both sigaction (SIGABRT) and abort() requires to lock
as writer (since both change the disposition).
The fallback is also simplified: there is no need to use a loop of
ABORT_INSTRUCTION after _exit() (if the syscall does not terminate the
process, the system is broken).
The proposed fix changes how setjmp works on a SIGABRT handler, where
glibc does not save the signal mask. So usage like the below will now
always abort.
static volatile int chk_fail_ok;
static jmp_buf chk_fail_buf;
static void
handler (int sig)
{
if (chk_fail_ok)
{
chk_fail_ok = 0;
longjmp (chk_fail_buf, 1);
}
else
_exit (127);
}
[...]
signal (SIGABRT, handler);
[....]
chk_fail_ok = 1;
if (! setjmp (chk_fail_buf))
{
// Something that can calls abort, like a failed fortify function.
chk_fail_ok = 0;
printf ("FAIL\n");
}
Such cases will need to use sigsetjmp instead.
The _dl_start_profile calls sigaction through _profil, and to avoid
pulling abort() on loader the call is replaced with __libc_sigaction.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
The BZ#24967 fix (1bdda52fe9) missed the time for
architectures that define USE_IFUNC_TIME. Although it is not
an issue, since there is no pointer mangling, there is also no need
to call dl_vdso_vsym since the vDSO setup was already done by the
loader.
Checked on x86_64-linux-gnu and i686-linux-gnu.
The BZ#24967 fix (1bdda52fe9) missed the gettimeofday for
architectures that define USE_IFUNC_GETTIMEOFDAY. Although it is not
an issue, since there is no pointer mangling, there is also no need
to call dl_vdso_vsym since the vDSO setup was already done by the
loader.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Building the s390 specific iconv modules - utf16-utf32-z9.c, utf8-utf32-z9.c
and utf8-utf16-z9.c - with -fno-omit-frame-pointer leads to a build error
"error: %r11 cannot be used in 'asm' here" as r11 is needed as frame-pointer.
The cuXY-instructions need two even-odd register pairs. Therefore the register
pinning is used. This patch just uses a different register pair.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Several copies of the licenses in files contained whitespace related
problems. Two cases are addressed here, the first is two spaces
after a period which appears between "PURPOSE." and "See". The other
is a space after the last forward slash in the URL. Both issues are
corrected and the licenses now match the official textual description
of the license (and the other license in the sources).
Since these whitespaces changes do not alter the paragraph structure of
the license, nor create new sentences, they do not change the license.
Add tests of freopen adding or removing "c" (non-cancelling I/O) from
the mode string (so completing my planned tests of freopen with
different features used in the mode strings). Note that it's in the
nature of the uncertain time at which cancellation might act (possibly
during freopen, possibly during subsequent reads) that these can leak
memory or file descriptors, so these do not include leak tests.
Tested for x86_64.
The sparc clone mitigation (faeaa3bc9f) added the use of
flushw, which is not support by LEON/sparcv8. As discussed on
the libc-alpha, 'ta 3' is a working alternative [1].
[1] https://sourceware.org/pipermail/libc-alpha/2024-August/158905.html
Checked with a build for sparcv8-linux-gnu targetting leon.
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
LEON2/LEON3 are both sparcv8, which does not support branch hints
(bne,pn) nor the return instruction.
Checked with a build for sparcv8-linux-gnu targetting leon. I also
checked some cancellation tests with qemu-system (targeting LEON3).
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
GCC aligns global data to 16 bytes if their size is >= 16 bytes. This patch
changes the exp2f_data struct slightly so that the fields are better aligned.
As a result on targets that support them, load-pair instructions accessing
poly_scaled and invln2_scaled are now 16-byte aligned.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Even though building glibc with 64 bit time_t flags is not supported,
and the usual way is to patch the build system to avoid it; some
systems do enable it by default, and it increases the requirements
to build glibc in such cases (it also does not help newcomers when
trying to build glibc).
The conform namespace and linknamespace tests also do not expect
that flag to be set by default, so disable it as well.
Checked with a build/check for major ABI and some (i386, arm,
mipsel, hppa) with a toolchain that has LFS flags by default.
Reviewed-by: DJ Delorie <dj@redhat.com>
Even though building glibc with LFS flags is not supported, and the
the usual way is to patch the build system to avoid it [1]; some system
do enable it by default, and it increases the requirements to build
glibc in such cases (it also does not help newcomers when trying
to build glibc).
The conform namespace and linknamespace tests also do not expect
that flag to be set by default, so disable it as well.
Checked with a build/check for major ABI and some (i386, arm,
mipsel, hppa) with a toolchain that has LFS flags by default.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=31624
Reviewed-by: DJ Delorie <dj@redhat.com>
The -Wp does not work properly if the compiler is configured to enable
fortify by default, since it bypasses the compiler driver (which defines
the fortify flags in this case).
This patch is similar to the one used on Ubuntu [1].
I checked with a build for x86_64-linux-gnu, i686-linux-gnu,
aarch64-linux-gnu, s390x-linux-gnu, and riscv64-linux-gnu with
gcc-13 that enables the fortify by default.
Co-authored-by: Matthias Klose <matthias.klose@canonical.com>
[1] https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/glibc/tree/debian/patches/ubuntu/fix-fortify-source.patch
Reviewed-by: DJ Delorie <dj@redhat.com>
Since _IO_vtable_offset is used to detect the old binaries, set it
in _IO_old_file_init_internal before calling _IO_link_in which checks
_IO_vtable_offset. Add a glibc 2.0 test with copy relocation on
_IO_stderr_@GLIBC_2.0 to verify that fopen won't cause memory corruption.
This fixes BZ #32148.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Exercises fwrite's internal buffer when doing a file operation.
The new test, exercises 2 overflow behaviors:
1. Call fwrite multiple times making usage of fwrite's internal buffer.
The total number of bytes written is larger than fwrite's internal
buffer, forcing an automatic flush.
2. Call fwrite a single time with an amount of data that is larger than
fwrite's internal buffer.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The loop should be aligned to 32-bytes so that it can ideally run out
the DSB. This is particularly important on Skylake-Server where
deficiencies in it's DSB implementation make it prone to not being
able to run loops out of the DSB.
For example running strcmp-evex on 200Mb string:
32-byte aligned loop:
- 43,399,578,766 idq.dsb_uops
not 32-byte aligned loop:
- 6,060,139,704 idq.dsb_uops
This results in a 25% performance degradation for the non-aligned
version.
The fix is to just ensure the code layout is such that the loop is
aligned. (Which was previously the case but was accidentally dropped
in 84e7c46df).
NB: The fix was actually 64-byte alignment. This is because 64-byte
alignment generally produces more stable performance than 32-byte
aligned code (cache line crosses can affect perf), so if we are going
past 16-byte alignmnent, might as well go to 64. 64-byte alignment
also matches most other functions we over-align, so it creates a
common point of optimization.
Times are reported as ratio of Time_With_Patch /
Time_Without_Patch. Lower is better.
The values being reported is the geometric mean of the ratio across
all tests in bench-strcmp and bench-strncmp.
Note this patch is only attempting to improve the Skylake-Server
strcmp for long strings. The rest of the numbers are only to test for
regressions.
Tigerlake Results Strings <= 512:
strcmp : 1.026
strncmp: 0.949
Tigerlake Results Strings > 512:
strcmp : 0.994
strncmp: 0.998
Skylake-Server Results Strings <= 512:
strcmp : 0.945
strncmp: 0.943
Skylake-Server Results Strings > 512:
strcmp : 0.778
strncmp: 1.000
The 2.6% regression on TGL-strcmp is due to slowdowns caused by
changes in alignment of code handling small sizes (most on the
page-cross logic). These should be safe to ignore because 1) We
previously only 16-byte aligned the function so this behavior is not
new and was essentially up to chance before this patch and 2) this
type of alignment related regression on small sizes really only comes
up in tight micro-benchmark loops and is unlikely to have any affect
on realworld performance.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This hides the inconsistent TCB state (missing robust mutex list) from
signal handlers.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Unicode 16.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 16.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).
Changes in CHARMAP and WIDTH:
Total added characters in newly generated CHARMAP: 5185
Total removed characters in newly generated WIDTH: 1
Total added characters in newly generated WIDTH: 170
The removed character from WIDTH is U+1171E AHOM CONSONANT SIGN MEDIAL RA.
It changed like this:
UnicodeData.txt 15.1.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;
UnicodeData.txt 16.0.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;
EastAsianWidth.txt 15.1.0: 1171D..1171F ; N # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
EastAsianWidth.txt 16.0.0: 1171E ; N # Mc AHOM CONSONANT SIGN MEDIAL RA
I.e it changed from Mn (Mark Nonspacing) to Mc (Mark Spacing
combining). So it should now have width 1 instead of 0, therefore it
is OK that it was removed from WIDTH, characters not in WIDTH get
width 1 by default.
Nothing suspicious when browsing the list of the 170 added characters.
Changes in ctype:
alpha: Added 4452 characters in new ctype which were not in old ctype
combining: Added 51 characters in new ctype which were not in old ctype
combining_level3: Added 43 characters in new ctype which were not in old ctype
graph: Added 5185 characters in new ctype which were not in old ctype
lower: Added 25 characters in new ctype which were not in old ctype
print: Added 5185 characters in new ctype which were not in old ctype
punct: Missing 33 characters of old ctype in new ctype
punct: Added 766 characters in new ctype which were not in old ctype
tolower: Added 27 characters in new ctype which were not in old ctype
totitle: Added 27 characters in new ctype which were not in old ctype
toupper: Added 27 characters in new ctype which were not in old ctype
upper: Added 27 characters in new ctype which were not in old ctype
Nothing suspicous in the additions.
About the 33 characters removed from `punct`:
U+0363 - U+036F are identical in UnicodeData.txt. Difference in DerivedCoreProperties.txt:
DerivedCoreProperties.txt 15.1.0: not there.
DerivedCoreProperties.txt 16.0.0: 0363..036F ; Alphabetic # Mn [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X
So that’s the reason why they are added to `alpha` and removed from `punct`.
Same for U+1DD3 - U+1DE6, they are identical in UnicodeData.txt but there is a difference in DerivedCoreProperties.txt:
DerivedCoreProperties.txt 15.1.0: 1DE7..1DF4 ; Alphabetic # Mn [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
DerivedCoreProperties.txt 16.0.0: 1DD3..1DF4 ; Alphabetic # Mn [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
So they became `Alphabetic` and were thus added to `alpha` and removed from `punct`.
Resolves: BZ #32168
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This is not completely clear from the C standard (although there
is footnote number 289 in C11), but I assume that our implementation
works this way.
Reviewed-by: DJ Delorie <dj@redhat.com>
This was discussed on the hallway track at GNU Tools Cauldron
2024. There are concerns about stability of the big-endian
GCC backend, and Linux removed support for the only big-endian
ARC platform in commit dd7c7ab01a04d645b7e7baa8530bfd81e31a2202
("ARC: [plat-eznps]: Drop support for EZChip NPS platform").
In Linux 6.11, fstat and newfstatat are added back. To avoid the messy
usage of the fstat, newfstatat, and statx system calls, we will continue
using statx only in glibc, maintaining consistency with previous versions of
the LoongArch-specific glibc implementation.
Signed-off-by: caiyinyu <caiyinyu@loongson.cn>
Reviewed-by: Xi Ruoyao <xry111@xry111.site>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Use the setresuid32 system call if it is available, prefering
it over setresuid. If both system calls exist, setresuid
is the 16-bit variant. This fixes a build failure on
sparcv9-linux-gnu.
Calling an extern function in a different translation unit before
self-relocation is brittle. The compiler may load the address
at an earlier point in _dl_start, before self-relocation. In
_dl_start_final, the call is behind a compiler barrier, so this
cannot happen.
This case is detected early in the elf/dl-version.c consistency
checks. (These checks could be disabled in the future to allow
the removal of symbol versioning from objects.)
Commit f0b2132b35 ("ld.so: Support moving versioned symbols between
sonames [BZ #24741]) removed another call to _dl_name_match_p. The
_dl_check_caller function no longer exists, and the remaining calls
to _dl_name_match_p happen under the loader lock. This means that
atomic accesses are no longer required for the l_libname list. This
supersedes commit 395be7c218 ("elf: Fix data race in _dl_name_match_p
[BZ #21349]").
With --enable-hardcoded-path-in-tests, $(test-program-prefix)
does not redirect to the built glibc, but we need to run
iconv (the program) against the built glibc even with
--enable-hardcoded-path-in-tests, as it is using the ABI
path for the dynamic linker (as an installed program).
Use $(run-program-prefix) instead.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This operation can be simplified to use simpler multiply-round-convert
sequence, which uses fewer instructions and constants.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>