From d813260d17cad44ef33400863d218a3e6b1fe7a9 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy
Date: Fri, 25 Nov 2022 18:16:07 +0000
Subject: [PATCH] aarch64: Define jmp_buf offset for GCS

The target specific internal __longjmp is called with a __jmp_buf
argument which has its size exposed in the ABI.  On aarch64 this has
no space left, so GCSPR cannot be restored in longjmp in the usual
way, which is needed for the Guarded Control Stack (GCS) extension.

setjmp is implemented via __sigsetjmp, which has a jmp_buf argument;
however, it is also called with a __pthread_unwind_buf_t argument cast
to jmp_buf (in cancellation cleanup code built with -fno-exceptions).

The two types, jmp_buf and __pthread_unwind_buf_t, have common bits
beyond the __jmp_buf field and there is unused space there which we
can use for saving GCSPR.

For this to work, some bits of those two generic types have to be
reserved for target specific use, and the generic code in glibc has to
ensure that __longjmp is always called with a __jmp_buf that is
embedded into one of those two types.  Morally __longjmp should be
changed to take jmp_buf as argument, but that is an intrusive change
across targets.

Note: longjmp is never called with __pthread_unwind_buf_t from user
code, only the internal __libc_longjmp is called with that type, and
thus the two types could have separate longjmp implementations on a
target.  We don't rely on this now (but might in the future, given
that cancellation unwind does not need to restore GCSPR).

Given the above, this patch finds an unused slot for GCSPR.  This
placement is not exposed in the ABI so it may change in the future.
This is also very target ABI specific, so the generic types cannot be
easily changed to clearly mark the reserved fields.
---
 sysdeps/aarch64/jmpbuf-offsets.h | 63 ++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/sysdeps/aarch64/jmpbuf-offsets.h b/sysdeps/aarch64/jmpbuf-offsets.h
index 632328c7e2..ec047cf6b1 100644
--- a/sysdeps/aarch64/jmpbuf-offsets.h
+++ b/sysdeps/aarch64/jmpbuf-offsets.h
@@ -39,6 +39,69 @@
 #define JB_D14 20
 #define JB_D15 21
 
+/* The target specific part of jmp_buf has no space for expansion but
+   the public jmp_buf ABI type has.  Unfortunately there is another type
+   that is used with setjmp APIs and exposed by thread cancellation (in
+   binaries built with -fno-exceptions) which complicates the situation.
+
+  // Internal layout of the public jmp_buf type on AArch64.
+  // This is passed to setjmp, longjmp, sigsetjmp, siglongjmp.
+  struct
+  {
+    uint64_t jmpbuf[22];     // Target specific part.
+    uint32_t mask_was_saved; // savemask bool used by sigsetjmp/siglongjmp.
+    uint32_t pad;
+    uint64_t saved_mask;     // sigset_t bits used on linux.
+    uint64_t unused[15];     // sigset_t bits not used on linux.
+  };
+
+  // Internal layout of the public __pthread_unwind_buf_t type.
+  // This is passed to sigsetjmp with !savemask and to the internal
+  // __libc_longjmp (currently alias of longjmp on AArch64).
+  struct
+  {
+    uint64_t jmpbuf[22];     // Must match jmp_buf.
+    uint32_t mask_was_saved; // Must match jmp_buf, always 0.
+    uint32_t pad;
+    void *prev;              // List for unwinding.
+    void *cleanup;           // Cleanup handlers.
+    uint32_t canceltype;     // 1 bit cancellation type.
+    uint32_t pad2;
+    void *pad3;
+  };
+
+  Ideally only the target specific part of jmp_buf (A) is accessed by
+  __setjmp and __longjmp.  But that is always embedded into one of the
+  two types above so the bits that are unused in those types (B) may be
+  reused for target specific purposes.  Setjmp can't distinguish between
+  jmp_buf and __pthread_unwind_buf_t, but longjmp can: only an internal
+  longjmp call uses the latter, so state that is not needed for cancel
+  cleanups can go to fields (C).  If generic code is refactored then the
+  usage of additional fields can be optimized (D).  And some fields are
+  only accessible in the savedmask case (E).  Reusability of jmp_buf
+  fields on AArch64 for target purposes:
+
+  struct
+  {
+    uint64_t A[22];  // 0   .. 176
+    uint32_t D;      // 176 .. 180
+    uint32_t B;      // 180 .. 184
+    uint64_t D;      // 184 .. 192
+    uint64_t C;      // 192 .. 200
+    uint32_t C;      // 200 .. 204
+    uint32_t B;      // 204 .. 208
+    uint64_t B;      // 208 .. 216
+    uint64_t E[12];  // 216 .. 312
+  }
+
+  The B fields can be used with minimal glibc code changes.  We need a
+  64 bit field for the Guarded Control Stack pointer (GCSPR_EL0) which
+  can use a C field too as cancellation cleanup does not execute RET
+  for a previous BL of the cancelled thread, but that would require a
+  custom __libc_longjmp.  This layout can change in the future.
+*/
+#define JB_GCSPR 208
+
 #ifndef __ASSEMBLER__
 #include
 #include