A recent change only initializes the regs.how[] during Dwarf unwinding
which resulted in an uninitialized offset used in return address signing
and random failures during unwinding. The fix is to encode the return
address signing state in REG_UNSAVED and a new state REG_UNSAVED_ARCHEXT.
libgcc/
PR target/107678
* unwind-dw2.h (REG_UNSAVED_ARCHEXT): Add new enum.
* unwind-dw2.c (uw_update_context_1): Add REG_UNSAVED_ARCHEXT case.
* unwind-dw2-execute_cfa.h: Use REG_UNSAVED_ARCHEXT/REG_UNSAVED to
encode the return address signing state.
* config/aarch64/aarch64-unwind.h (aarch64_demangle_return_addr)
Check current return address signing state.
(aarch64_frob_update_contex): Remove.
On many architectures, there is a padding gap after the how array
member, and cfa_how can be moved there. This reduces the size of the
struct and the amount of memory that uw_frame_state_for has to clear.
There is no measurable performance benefit from this on x86-64 (even
though the memset goes from 120 to 112 bytes), but it seems to be a
good idea to do anyway.
libgcc/
* unwind-dw2.h (struct frame_state_reg_info): Move cfa_how member
and reduce its size.
The following patch implements something that has Florian found as
low hanging fruit in our unwinder and has been discussed in the
https://gcc.gnu.org/wiki/cauldron2022#cauldron2022talks.inprocess_unwinding_bof
talk.
_Unwind_FrameState type seems to be (unlike the pre-GCC 3 frame_state
which has been part of ABI) private to unwind-dw2.c + unwind.inc it
includes, it is always defined on the stack of some entrypoints, initialized
by static uw_frame_state_for and the address of it is also passed to other
static functions or the static inlines handling machine dependent unwinding,
but it isn't fortunately passed to any callbacks or public functions, so I
think we can safely change it any time we want.
Florian mentioned that the structure is large even on x86_64, 384 bytes
there, starts with 328 bytes long element with frame_state_reg_info type
which then starts with an array with __LIBGCC_DWARF_FRAME_REGISTERS__ + 1
elements, each of them is 16 bytes long, on x86_64
__LIBGCC_DWARF_FRAME_REGISTERS__ is just 17 but even that is big, on say
riscv __LIBGCC_DWARF_FRAME_REGISTERS__ is I think 128, on powerpc 111,
on sh 153 etc. And, we memset to zero the whole fs variable with the
_Unwind_FrameState type at the start of the unwinding.
The reason why each element is 16 byte (on 64-bit arches) is that it
contains some pointer or pointer sized integer and then an enum (with just
7 different enumerators) + padding.
The following patch decreases it by moving the enum into a separate
array and using just one byte for each register in that second array.
We could compress it even more, say 4 bits per register, but I don't
want to uglify the code for it too much and make the accesses slower.
Furthermore, the clearing of the object can clear only thos how array
and members after it, because REG_UNSAVED enumerator (0) doesn't actually
need any pointer or pointer sized integer, it is just the other kinds
that need to have there something.
By doing this, on x86_64 the above numbers change to _Unwind_FrameState
type being now 264 bytes long, frame_state_reg_info 208 bytes and we
don't clear the first 144 bytes of the object, so the memset is 120 bytes,
so ~ 31% of the old clearing size. On riscv 64-bit assuming it has same
structure layout rules for the few types used there that would be
~ 2160 bytes of _Unwind_FrameState type before and ~ 1264 bytes after,
with the memset previously ~ 2160 bytes and after ~ 232 bytes after.
We've also talked about possibly adding a number of initially initialized
regs and initializing the rest lazily, but at least for x86_64 with
18 elements in the array that doesn't seem to be worth it anymore,
especially because return address column is 16 there and that is usually the
first thing to be touched. It might theory help with lots of registers if
they are usually untouched, but would uglify and complicate any stores to
how by having to check there for the not initialized yet cases and lazy
initialization, and similarly for all reads of how to do there if below
last initialized one, use how, otherwise imply REG_UNSAVED.
The disadvantage of the patch is that touching reg[x].loc and how[x]
now means 2 cachelines rather than one as before, and I admit beyond
bootstrap/regtest I haven't benchmarked it in any way.
2022-10-06 Jakub Jelinek <jakub@redhat.com>
* unwind-dw2.h (REG_UNSAVED, REG_SAVED_OFFSET, REG_SAVED_REG,
REG_SAVED_EXP, REG_SAVED_VAL_OFFSET, REG_SAVED_VAL_EXP,
REG_UNDEFINED): New anonymous enum, moved from inside of
struct frame_state_reg_info.
(struct frame_state_reg_info): Remove reg[].how element and the
anonymous enum there. Add how element.
* unwind-dw2.c: Include stddef.h.
(uw_frame_state_for): Don't clear first
offsetof (_Unwind_FrameState, regs.how[0]) bytes of *fs.
(execute_cfa_program, __frame_state_for, uw_update_context_1,
uw_update_context): Use fs->regs.how[X] instead of fs->regs.reg[X].how
or fs.regs.how[X] instead of fs.regs.reg[X].how.
* config/sh/linux-unwind.h (sh_fallback_frame_state): Likewise.
* config/bfin/linux-unwind.h (bfin_fallback_frame_state): Likewise.
* config/pa/linux-unwind.h (pa32_fallback_frame_state): Likewise.
* config/pa/hpux-unwind.h (UPDATE_FS_FOR_SAR, UPDATE_FS_FOR_GR,
UPDATE_FS_FOR_FR, UPDATE_FS_FOR_PC, pa_fallback_frame_state):
Likewise.
* config/alpha/vms-unwind.h (alpha_vms_fallback_frame_state):
Likewise.
* config/alpha/linux-unwind.h (alpha_fallback_frame_state): Likewise.
* config/arc/linux-unwind.h (arc_fallback_frame_state,
arc_frob_update_context): Likewise.
* config/riscv/linux-unwind.h (riscv_fallback_frame_state): Likewise.
* config/nios2/linux-unwind.h (NIOS2_REG): Likewise.
* config/nds32/linux-unwind.h (NDS32_PUT_FS_REG): Likewise.
* config/s390/tpf-unwind.h (s390_fallback_frame_state): Likewise.
* config/s390/linux-unwind.h (s390_fallback_frame_state): Likewise.
* config/sparc/sol2-unwind.h (sparc64_frob_update_context,
MD_FALLBACK_FRAME_STATE_FOR): Likewise.
* config/sparc/linux-unwind.h (sparc64_fallback_frame_state,
sparc64_frob_update_context, sparc_fallback_frame_state): Likewise.
* config/i386/sol2-unwind.h (x86_64_fallback_frame_state,
x86_fallback_frame_state): Likewise.
* config/i386/w32-unwind.h (i386_w32_fallback_frame_state): Likewise.
* config/i386/linux-unwind.h (x86_64_fallback_frame_state,
x86_fallback_frame_state): Likewise.
* config/i386/freebsd-unwind.h (x86_64_freebsd_fallback_frame_state):
Likewise.
* config/i386/dragonfly-unwind.h
(x86_64_dragonfly_fallback_frame_state): Likewise.
* config/i386/gnu-unwind.h (x86_gnu_fallback_frame_state): Likewise.
* config/csky/linux-unwind.h (csky_fallback_frame_state): Likewise.
* config/aarch64/linux-unwind.h (aarch64_fallback_frame_state):
Likewise.
* config/aarch64/freebsd-unwind.h
(aarch64_freebsd_fallback_frame_state): Likewise.
* config/aarch64/aarch64-unwind.h (aarch64_frob_update_context):
Likewise.
* config/or1k/linux-unwind.h (or1k_fallback_frame_state): Likewise.
* config/mips/linux-unwind.h (mips_fallback_frame_state): Likewise.
* config/loongarch/linux-unwind.h (loongarch_fallback_frame_state):
Likewise.
* config/m68k/linux-unwind.h (m68k_fallback_frame_state): Likewise.
* config/xtensa/linux-unwind.h (xtensa_fallback_frame_state):
Likewise.
* config/rs6000/darwin-fallback.c (set_offset): Likewise.
* config/rs6000/aix-unwind.h (MD_FROB_UPDATE_CONTEXT): Likewise.
* config/rs6000/linux-unwind.h (ppc_fallback_frame_state): Likewise.
* config/rs6000/freebsd-unwind.h (frob_update_context): Likewise.
gcc/c-family:
* c-cppbuiltin.c (c_cpp_builtins): Also define
__LIBGCC_EH_TABLES_CAN_BE_READ_ONLY__,
__LIBGCC_EH_FRAME_SECTION_NAME__, __LIBGCC_JCR_SECTION_NAME__,
__LIBGCC_CTORS_SECTION_ASM_OP__, __LIBGCC_DTORS_SECTION_ASM_OP__,
__LIBGCC_TEXT_SECTION_ASM_OP__, __LIBGCC_INIT_SECTION_ASM_OP__,
__LIBGCC_INIT_ARRAY_SECTION_ASM_OP__,
__LIBGCC_STACK_GROWS_DOWNWARD__,
__LIBGCC_DONT_USE_BUILTIN_SETJMP__,
__LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__,
__LIBGCC_DWARF_FRAME_REGISTERS__,
__LIBGCC_EH_RETURN_STACKADJ_RTX__, __LIBGCC_JMP_BUF_SIZE__,
__LIBGCC_STACK_POINTER_REGNUM__ and
__LIBGCC_VTABLE_USES_DESCRIPTORS__ for -fbuilding-libgcc.
(builtin_define_with_value): Handle backslash-escaping in string
macro values.
libgcc:
* Makefile.in (CRTSTUFF_CFLAGS): Add -fbuilding-libgcc.
* config/aarch64/linux-unwind.h (STACK_POINTER_REGNUM): Change all
uses to __LIBGCC_STACK_POINTER_REGNUM__.
(DWARF_ALT_FRAME_RETURN_COLUMN): Change all uses to
__LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__.
* config/alpha/vms-unwind.h (DWARF_ALT_FRAME_RETURN_COLUMN):
Change use to __LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__.
* config/cr16/unwind-cr16.c (STACK_GROWS_DOWNWARD): Change all
uses to __LIBGCC_STACK_GROWS_DOWNWARD__.
(DWARF_FRAME_REGISTERS): Change all uses to
__LIBGCC_DWARF_FRAME_REGISTERS__.
(EH_RETURN_STACKADJ_RTX): Change all uses to
__LIBGCC_EH_RETURN_STACKADJ_RTX__.
* config/cr16/unwind-dw2.h (DWARF_FRAME_REGISTERS): Change use to
__LIBGCC_DWARF_FRAME_REGISTERS__. Remove conditional definition.
* config/i386/cygming-crtbegin.c (EH_FRAME_SECTION_NAME): Change
use to __LIBGCC_EH_FRAME_SECTION_NAME__.
(JCR_SECTION_NAME): Change use to __LIBGCC_JCR_SECTION_NAME__.
* config/i386/cygming-crtend.c (EH_FRAME_SECTION_NAME): Change use
to __LIBGCC_EH_FRAME_SECTION_NAME__.
(JCR_SECTION_NAME): Change use to __LIBGCC_JCR_SECTION_NAME__
* config/mips/linux-unwind.h (STACK_POINTER_REGNUM): Change use to
__LIBGCC_STACK_POINTER_REGNUM__.
(DWARF_ALT_FRAME_RETURN_COLUMN): Change all uses to
__LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__.
* config/nios2/linux-unwind.h (STACK_POINTER_REGNUM): Change use
to __LIBGCC_STACK_POINTER_REGNUM__.
* config/pa/hpux-unwind.h (DWARF_ALT_FRAME_RETURN_COLUMN): Change
all uses to __LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__.
* config/pa/linux-unwind.h (DWARF_ALT_FRAME_RETURN_COLUMN): Change
all uses to __LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__.
* config/rs6000/aix-unwind.h (DWARF_ALT_FRAME_RETURN_COLUMN):
Change all uses to __LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__.
(STACK_POINTER_REGNUM): Change all uses to
__LIBGCC_STACK_POINTER_REGNUM__.
* config/rs6000/darwin-fallback.c (STACK_POINTER_REGNUM): Change
use to __LIBGCC_STACK_POINTER_REGNUM__.
* config/rs6000/linux-unwind.h (STACK_POINTER_REGNUM): Change all
uses to __LIBGCC_STACK_POINTER_REGNUM__.
* config/sparc/linux-unwind.h (DWARF_FRAME_REGISTERS): Change use
to __LIBGCC_DWARF_FRAME_REGISTERS__.
* config/sparc/sol2-unwind.h (DWARF_FRAME_REGISTERS): Change use
to __LIBGCC_DWARF_FRAME_REGISTERS__.
* config/tilepro/linux-unwind.h (STACK_POINTER_REGNUM): Change use
to __LIBGCC_STACK_POINTER_REGNUM__.
* config/xtensa/unwind-dw2-xtensa.h (DWARF_FRAME_REGISTERS):
Remove conditional definition.
* crtstuff.c (TEXT_SECTION_ASM_OP): Change all uses to
__LIBGCC_TEXT_SECTION_ASM_OP__.
(EH_FRAME_SECTION_NAME): Change all uses to
__LIBGCC_EH_FRAME_SECTION_NAME__.
(EH_TABLES_CAN_BE_READ_ONLY): Change all uses to
__LIBGCC_EH_TABLES_CAN_BE_READ_ONLY__.
(CTORS_SECTION_ASM_OP): Change all uses to
__LIBGCC_CTORS_SECTION_ASM_OP__.
(DTORS_SECTION_ASM_OP): Change all uses to
__LIBGCC_DTORS_SECTION_ASM_OP__.
(JCR_SECTION_NAME): Change all uses to
__LIBGCC_JCR_SECTION_NAME__.
(INIT_SECTION_ASM_OP): Change all uses to
__LIBGCC_INIT_SECTION_ASM_OP__.
(INIT_ARRAY_SECTION_ASM_OP): Change all uses to
__LIBGCC_INIT_ARRAY_SECTION_ASM_OP__.
* generic-morestack.c (STACK_GROWS_DOWNWARD): Change all uses to
__LIBGCC_STACK_GROWS_DOWNWARD__.
* libgcc2.c (INIT_SECTION_ASM_OP): Change all uses to
__LIBGCC_INIT_SECTION_ASM_OP__.
(INIT_ARRAY_SECTION_ASM_OP): Change all uses to
__LIBGCC_INIT_ARRAY_SECTION_ASM_OP__.
(EH_FRAME_SECTION_NAME): Change all uses to
__LIBGCC_EH_FRAME_SECTION_NAME__.
* libgcov-profiler.c (VTABLE_USES_DESCRIPTORS): Remove conditional
definitions. Change all uses to
__LIBGCC_VTABLE_USES_DESCRIPTORS__.
* unwind-dw2.c (STACK_GROWS_DOWNWARD): Change all uses to
__LIBGCC_STACK_GROWS_DOWNWARD__.
(DWARF_FRAME_REGISTERS): Change all uses to
__LIBGCC_DWARF_FRAME_REGISTERS__.
(EH_RETURN_STACKADJ_RTX): Change all uses to
__LIBGCC_EH_RETURN_STACKADJ_RTX__.
* unwind-dw2.h (DWARF_FRAME_REGISTERS): Remove conditional
definition. Change use to __LIBGCC_DWARF_FRAME_REGISTERS__.
* unwind-sjlj.c (DONT_USE_BUILTIN_SETJMP): Change all uses to
__LIBGCC_DONT_USE_BUILTIN_SETJMP__.
(JMP_BUF_SIZE): Change use to __LIBGCC_JMP_BUF_SIZE__.
From-SVN: r214954