This patch adds the analysis part of PLT call optimization, enabling
the code added with the previous patch that actually performs the
optimization.
Gold support is not available yet.
bfd/
* elf64-ppc.c (struct _ppc64_elf_section_data): Add has_pltcall field.
(struct ppc_link_hash_table): Add can_convert_all_inline_plt.
(ppc64_elf_check_relocs): Set has_pltcall.
(ppc64_elf_adjust_dynamic_symbol): Discard some PLT entries.
(ppc64_elf_inline_plt): New function.
(ppc64_elf_size_dynamic_sections): Discard some PLT entries for locals.
* elf64-ppc.h (ppc64_elf_inline_plt): Declare.
* elf32-ppc.c (has_pltcall): Define.
(struct ppc_elf_link_hash_table): Add can_convert_all_inline_plt.
(ppc_elf_check_relocs): Set has_pltcall.
(ppc_elf_inline_plt): New function.
(ppc_elf_adjust_dynamic_symbol): Discard some PLT entries.
(ppc_elf_size_dynamic_sections): Likewise.
* elf32-ppc.h (ppc_elf_inline_plt): Declare.
ld/
* emultempl/ppc64elf.em (no_inline_plt): New var.
(ppc_before_allocation): Call ppc64_elf_inline_plt.
(enum ppc64_opt): Add OPTION_NO_INLINE_OPT.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS,
PARSE_AND_LIST_ARGS_CASES): Handle --no-inline-optimize.
* emultemps/ppc32elf.em (no_inline_opt): New var.
(prelim_size_sections): New function, extracted from..
(ppc_before_allocation): ..here. Call ppc_elf_inline_plt.
(enum ppc32_opt): Add OPTION_NO_INLINE_OPT.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS,
PARSE_AND_LIST_ARGS_CASES): Handle --no-inline-optimize.
The current scheme where we output PLT relocs for global symbols in
finish_dynamic_symbol, and PLT relocs for local symbols when
outputting stubs does not work if PLT entries are to be used for
inline PLT sequences against non-dynamic globals or local symbols.
bfd/
* elf64-ppc.c (ppc_build_one_stub): Move output of PLT relocs
for local symbols to..
(write_plt_relocs_for_local_syms): ..here. New function.
(ppc64_elf_finish_dynamic_symbol): Move output of PLT relocs for
global symbols to..
(build_global_entry_stubs_and_plt): ..here. Rename from
build_global_entry_stubs.
(ppc64_elf_build_stubs): Always call build_global_entry_stubs_and_plt.
Call write_plt_relocs_for_local_syms.
* elf32-ppc.c (get_sym_h): New function.
(ppc_elf_relax_section): Use get_sym_h.
(ppc_elf_relocate_section): Move output of PLT relocs and glink
stubs for local symbols to..
(ppc_finish_symbols): ..here. New function.
(ppc_elf_finish_dynamic_symbol): Move output of PLT relocs for
global syms to..
(write_global_sym_plt): ..here. New function.
* elf32-ppc.h (ppc_elf_modify_segment_map): Delete attribute.
(ppc_finish_symbols): Declare.
ld/
* ppc32elf.em (ppc_finish): Call ppc_finish_symbols.
This reverts most of commit 1be5d8d3bb.
Left in place are addition of --no-plt-align to some ppc32 ld tests
and the ld.texinfo --no-plt-thread-safe fix.
This is in preparation for the next patch adding Spectre variant 2
mitigation for PowerPC and PowerPC64. Besides tidying code involved
in stub output (to reduce the number of places where bctr is output),
the patch adds some user visible features:
1) PowerPC64 ELFv2 global entry stubs now are aligned under the
control of --plt-align, with a default alignment of 32 bytes.
2) PowerPC64 __glink_PLTresolve is no longer padded out with nops.
3) PowerPC32 PLT stubs are aligned under the control of --plt-align,
with the default alignment being 16 bytes as before.
4) The PowerPC32 branch/nop table emitted before __glink_PLTresolve
is now smaller in many cases. It was sized incorrectly when the
__tls_get_addr_opt stub was used, and unnecessarily included space
for local ifuncs.
bfd/
* elf32-ppc.c (GLINK_ENTRY_SIZE): Add parameters, handle
__tls_get_addr_opt, and alignment sizing.
(TLS_GET_ADDR_GLINK_SIZE): Delete.
(is_nonpic_glink_stub): Don't use GLINK_ENTRY_SIZE.
(ppc_elf_get_synthetic_symtab): Recognize stubs spaced at 4, 6,
or 8 insns.
(ppc_elf_link_hash_table_create): Init new ppc_elf_params field.
(allocate_dynrelocs): Use new GLINK_ENTRY_SIZE.
(ppc_elf_size_dynamic_sections): Likewise. Size branch table
by PLT reloc count.
(write_glink_stub): Handle __tls_get_addr_opt stub.
Pad out to size given by GLINK_ENTRY_SIZE.
(ppc_elf_relocate_section): Adjust write_glink_stub call.
(ppc_elf_finish_dynamic_symbol): Likewise.
(ppc_elf_finish_dynamic_sections): Write PLTresolve without using
insn array since so many need rewriting.
* elf32-ppc.h (struct ppc_elf_params): Add plt_stub_align.
* elf64-ppc.c (GLINK_PLTRESOLVE_SIZE): Rename from
GLINK_CALL_STUB_SIZE. Add htab param and evaluate to size without
nops. Adjust all uses.
(ppc64_elf_get_synthetic_symtab): Don't use GLINK_CALL_STUB_SIZE
in glink_vma calculation.
(struct ppc_link_hash_table): Add global_entry section pointer.
(create_linkage_sections): Create separate section for global
entry stubs.
(PPC_LO, PPC_HI, PPC_HA): Move earlier.
(size_global_entry_stubs): Handle sizing for aligned stubs.
(ppc64_elf_size_dynamic_sections): Handle global_entry alloc,
and don't stash end of glink branch table in rawsize.
(ppc_build_one_stub): Rewrite stub size calculations.
(build_global_entry_stubs): Use new section.
(ppc64_elf_build_stubs): Don't pad __glink_PLTresolve with nops.
Build lazy link stubs out to end of section. Build global entry
stubs in new section.
gold/
* options.h (plt_align): Support for PowerPC32 too.
* powerpc.cc (Stub_table::stub_align): Heed --plt-align for 32-bit.
(Stub_table::plt_call_size, branch_stub_size): Tidy.
(Stub_table::plt_call_align): Implement using stub_align.
(Output_data_glink::global_entry_align): New function.
(Output_data_glink::global_entry_off): New function.
(Output_data_glink::global_entry_address): Use global_entry_off.
(Output_data_glink::pltresolve_size): New function, replacing
pltresolve_size_ constant. Update all uses.
(Output_data_glink::add_global_entry): Align offset.
(Output_data_glink::set_final_data_size): Use global_entry_align.
(Stub_table::do_write): Don't pad __glink_PLTrelsolve with nops.
Tidy stub output. Use global_entry_off.
ld/
* emultempl/ppc32elf.em (params): Init new field.
(enum ppc32_opt): New enum to define OPTION_* values. Add
OPTION_PLT_ALIGN and OPTION_NO_PLT_ALIGN.
(PARSE_AND_LIST_LONGOPTS): Handle new options.
(PARSE_AND_LIST_ARGS_CASES): Likewise.
(PARSE_AND_LIST_OPTIONS): Likewise. Break up help output.
* emultempl/ppc64elf.em (ppc_add_stub_section): Init alignment
correctly for negative --plt-stub-align.
* testsuite/ld-powerpc/elfv2exe.d,
* testsuite/ld-powerpc/elfv2so.d,
* testsuite/ld-powerpc/relbrlt.d,
* testsuite/ld-powerpc/relbrlt.s,
* testsuite/ld-powerpc/tlsexe.d,
* testsuite/ld-powerpc/tlsexe.r,
* testsuite/ld-powerpc/tlsexe32.d,
* testsuite/ld-powerpc/tlsexe32.g,
* testsuite/ld-powerpc/tlsexe32.r,
* testsuite/ld-powerpc/tlsexetoc.d,
* testsuite/ld-powerpc/tlsexetoc.r,
* testsuite/ld-powerpc/tlsopt5_32.d,
* testsuite/ld-powerpc/tlsso.d,
* testsuite/ld-powerpc/tlstocso.d: Update for changed stub order.
This is a linker-only solution to the incompatibility between shared
library protected visibility variables and using .dynbss and copy
relocs for non-PIC access to shared library variables.
bfd/
* elf32-ppc.c (struct ppc_elf_link_hash_entry): Add has_addr16_ha
and has_addr16_lo. Make has_sda_refs a bitfield.
(ppc_elf_check_relocs): Set new flags.
(ppc_elf_link_hash_table_create): Update default_params.
(ppc_elf_adjust_dynamic_symbol): Clear protected_def in cases
where we won't be making .dynbss entries or editing code. Set
params->pic_fixup when we'll edit code for protected var access.
(allocate_dynrelocs): Allocate got entry for edited code and
discard dyn_relocs.
(struct ppc_elf_relax_info): Add picfixup_size.
(ppc_elf_relax_section): Rename struct one_fixup to struct
one_branch_fixup. Rename fixups to branch_fixups. Size space for
pic fixups.
(ppc_elf_relocate_section): Edit non-PIC accessing protected
visibility variables to PIC. Don't emit dyn_relocs for code
we've edited.
* elf32-ppc.h (struct ppc_elf_params): Add pic_fixup.
ld/
* emultempl/ppc32elf.em: Handle --no-pic-fixup.
(params): Init new field.
(ppc_before_allocation): Enable relaxation for pic_fixup.
1) _SDA_BASE_ and _SDA2_BASE_ and defined automatically, in a similar
manner to the way _GLOBAL_OFFSET_TABLE_ is handled. It's a little
more complicated to remove the symbols because _SDA_BASE_ needs to
be there if either .sdata or .sbss is present, and similarly for
_SDA2_BASE.
2) The linker created .sdata and .sdata2 sections used for
R_PPC_EMB_SDAI16 and R_PPC_EMB_SDA2I16 pointers are created early.
Nowadays we strip unneeded sections from the output, so it isn't
necessary to delay creating the sections.
3) The output section for targets of various SDA relocs is now checked
as per the ABI(s). We previously allowed .sdata.foo and similar,
most likely because at some stage we were checking input sections.
Also, the patch fixes a long-standing bug in size_input_sections
that affects the values of symbols defined in stripped input
sections.
PR 16952
bfd/
* elf32-ppc.c (ppc_elf_create_linker_section): Move earlier.
Remove redundant setting of htab->elf.dynobj. Don't align.
Define .sdata symbols using _bfd_elf_define_linkage_sym.
(ppc_elf_create_glink): Call ppc_elf_create_linker_section.
(create_sdata_sym): Delete.
(elf_allocate_pointer_linker_section): Rename from
elf_create_pointer_linker_section. Align section.
(ppc_elf_check_relocs): Don't call ppc_elf_creat_linker_section
directly here, or create_sdata_sym. Set ref_regular on _SDA_BASE_
and _SDA2_BASE_.
(ppc_elf_size_dynamic_sections): Remove ATTRIBUTE_UNUSED on param.
Remove unnecessary tests on _SDA_BASE_ sym.
(maybe_strip_sdasym, ppc_elf_maybe_strip_sdata_syms): New functions.
(ppc_elf_relocate_section): Tighten SDA reloc symbol section checks.
* elf32-ppc.h (ppc_elf_set_sdata_syms): Delete.
(ppc_elf_maybe_strip_sdata_syms): Declare.
ld/
* emulparams/elf32ppccommon.sh (_SDA_BASE_, _SDA2_BASE_): Delete.
* emultempl/ppc32elf.em (ppc_before_allocation): Call
ppc_elf_maybe_strip_sdata_syms.
* ldlang.c (size_input_section): Correct output_offset value
for excluded input sections.
The Linux kernel builds modules using ld -r. These might need the
ppc476 workaround, so enable it for ld -r if sections have sufficient
alignment to tell location within a page.
bfd/
* elf32-ppc.c (ppc_elf_relax_section): Enable ppc476 workaround
for ld -r, when code sections are sufficiently aligned.
* elf32-ppc.h (struct ppc_elf_params): Delete pagesize. Add
pagesize_p2.
ld/
* emultempl/ppc32elf.em (pagesize): New static var.
(ppc_after_open_output): Set params.pagesize_p2 from pagesize.
(PARSE_AND_LIST_ARGS_CASES): Adjust to use pagesize.
This implements a work-around for an icache bug on 476 that can cause
execution of stale instructions when control falls through from one
page to the next. The idea is to prevent such fall-through by
replacing the last instruction on a page with a branch to a patch
area containing the instruction, then branch to the next page.
The patch also fixes a number of bugs in the existing support for long
branch trampolines.
bfd/
* elf32-ppc.c (struct ppc_elf_link_hash_table): Add params.
Delete emit_stub_syms, no_tls_get_addr_opt. Update all uses.
(ppc_elf_link_params): New function.
(ppc_elf_create_glink): Align .glink to 64 bytes for ppc476
workaround.
(ppc_elf_select_plt_layout): Remove plt_style and emit_stub_syms
parameters. Use htab->params instead.
(ppc_elf_tls_setup): Remove no_tls_get_addr_opt parameter.
(ppc_elf_size_dynamic_sections): Align __glink_PLTresolve to
64 bytes for ppc476 workaround.
(struct ppc_elf_relax_info): New.
(ppc_elf_relax_section): Exclude linker created sections and
those too small to hold one instruction. Don't add another
branch around trampolines on later relax passes. Don't
generate trampolines for undefined symbols when !relocatable,
nor for plugin symbols. Allocate space for ppc476 workaround
patch area. Free fixups on error return path.
(ppc_elf_relocate_section): Handle ppc476 workaround patching.
* elf32-ppc.h (struct ppc_elf_params): New.
(ppc_elf_select_plt_layout, ppc_elf_tls_setup): Update prototype.
(ppc_elf_link_params): Declare.
* section.c (SEC_INFO_TYPE_TARGET): Define.
* bfd-in2.h: Regenerate.
ld/
* emultempl/ppc32elf.em (no_tls_get_addr_opt, emit_stub_syms)
plt_style): Delete. Adjust all refs to instead use..
(params): ..this. New variable.
(ppc_after_open_output): New function. Tweak params and pass to
ppc_elf_link_params.
(ppc_after_open): Adjust ppc_elf_select_plt_layout call.
(ppc_before_allocation): Adjust ppc_elf_tls_setup call. Enable
relaxation for ppc476 workaround.
(PARSE_AND_LIST_*): Add --{no-,}ppc476-workaround support.
(LDEMUL_CREATE_OUTPUT_SECTION_STATEMENTS): Define.
(BFD_RELOC_HI16_S_PCREL, BFD_RELOC_LO16_PCREL): Define.
* elf32-ppc.c (GLINK_PLTRESOLVE, GLINK_ENTRY_SIZE): Define.
(CROR_151515, CROR_313131): Delete.
(ADDIS_11_11, ADDI_11_11, SUB_11_11_30, ADD_0_11_11, ADD_11_0_11,
LWZ_0_4_30, MTCTR_0, LWZ_12_8_30, BCTR, ADDIS_11_30,
LWZU_0_X_11): Define.
(ppc_elf_howto_raw): Add R_PPC_REL16, R_PPC_REL16_LO, R_PPC_REL16_HI
and R_PPC_REL16_HA entries.
(ppc_elf_reloc_type_lookup): Convert new bfd reloc types.
(ppc_elf_addr16_ha_reloc): Also handle R_PPC_REL16_HA.
(struct ppc_elf_link_hash_table): Add glink, glink_pltresolve,
new_plt, and old_plt.
(ppc_elf_create_dynamic_sections): Create .glink section.
(ppc_elf_check_relocs): Set new_plt and old_plt.
(ppc_elf_select_plt_layout): New function.
(ppc_elf_tls_setup): Set plt output section elf type and flags.
(allocate_got): Handle differences between old and new got layout.
(allocate_dynrelocs): Likewise for plt.
(ppc_elf_size_dynamic_sections): Likewise. Allocate memory for
.glink. Don't allocate memory for old bss .plt. Emit DT_PPC_GLINK.
(ppc_elf_relax_section): Rename ppc_info to htab. Handle .glink
destination of R_PPC_PLTREL24 relocs.
(ppc_elf_relocate_section): Handle new relocs and changed destination
of R_PPC_PLTREL24.
(ppc_elf_finish_dynamic_symbol): Init new style plt and handle
differences in layout.
(ppc_elf_finish_dynamic_sections): Set DT_PPC_GLINK value. Don't
put a blrl in new got. Write glink contents.
* elf32-ppc.h (ppc_elf_select_plt_layout): Declare.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
* elf32-ppc.c (struct elf_linker_section): Remove sym_hash and
sym_offset. Add name, bss_name, sym_name, sym_val.
(struct ppc_elf_link_hash_table): Remove sdata and sdata2 pointers.
Add sdata array of elf_linker_section_t.
(ppc_elf_link_hash_table_create): Set name, sym_name, and bss_name.
(enum elf_linker_section_enum): Delete.
(ppc_elf_create_linker_section): Rewrite. Don't create syms here.
(ppc_elf_check_relocs): Delay ppc_elf_create_linker_section until
the special sections are needed. Adjust htab->sdata refs.
Ensure dynobj is set in sreloc code.
(ppc_elf_size_dynamic_sections): Strip sdata sections.
(ppc_elf_set_sdata_syms): New function.
(elf_finish_pointer_linker_section): Use 0x8000 for sym_offset.
(ppc_elf_relocate_section): Adjust references to htab->sdata. Use
sym_val instead of sym_hash.
* elf32-ppc.h (ppc_elf_set_sdata_syms): Declare.
ld/
* emultempl/ppc32elf.em (gld${EMULATION_NAME}_after_allocation): New
function.
(LDEMUL_AFTER_ALLOCATION): Define.
* elf32-ppc.c: Include elf32-ppc.h.
(NOP, CROR_151515, CROR_313131, TP_OFFSET, DTP_OFFSET): Define.
(struct ppc_elf_link_hash_entry): Rename "root" to "elf". Adjust uses.
Add "tls_mask" field.
(TLS_GD, TLS_LD, TLS_TPREL, TLS_DTPREL, TLS_TLS, TLS_TPRELGD): Define.
(struct ppc_elf_link_hash_table): Rename "root" to "elf". Adjust uses.
Add got, relgot, plt, relplt, dynbss, relbss, dynsbss, relsbss,
sdata, sdata2, tls_sec, tls_get_addr, tlsld_got fields.
Make use of htab shortcuts throughout file.
(ppc_elf_link_hash_newfunc): Init tls_mask field.
(ppc_elf_link_hash_table_create): Init new fields.
(ppc_elf_copy_indirect_symbol): Copy tls_mask.
(ppc_elf_howto_raw): Add tls relocs.
(ppc_elf_reloc_type_lookup): Handle them.
(ppc_elf_unhandled_reloc): New function.
(ppc_elf_create_got): Stash got section pointer in hash table,
return status. Make .rela.got too.
(ppc_elf_create_dynamic_sections): Stash section pointers in htab.
(ppc_elf_adjust_dynamic_symbol): Only set up copy relocs when
NON_GOT_REF set. Don't allocate space in .plt here..
(allocate_dynrelocs): ..do so here instead, properly ref-counting and
not allocating plt entries unnecessarily. Allocate got entries here.
(WILL_CALL_FINISH_DYNAMIC_SYMBOL): Define.
(ppc_elf_size_dynamic_sections): Allocate local got entries. Pass
"info" during allocate_dynrelocs hash traversal. Use htab section
shortcuts rather than searching for named sections. Get rid of
"plt" and "strip" booleans.
(update_local_sym_info, bad_shared_reloc): New functions.
(ppc_elf_check_relocs): Handle TLS relocs. Move .rela.got creation to
ppc_elf_create_got. Don't mark got or plt reloc syms dynamic, do so
in allocate_dynreloc. Use update_local_sym_info and bad_shared_reloc.
Disallow R_PPC_EMB_RELSDA, R_PPC_EMB_NADDR32, R_PPC_EMB_NADDR16,
R_PPC_EMB_NADDR16_LO, R_PPC_EMB_NADDR16_HI and R_PPC_EMB_NADDR16_HA
in shared libs. R_PPC_PLTREL32 is a plt reloc too. Refcount all
relocs that might use a plt entry. Set NON_GOT_REF too.
Enumerate all do-nothing relocs.
(ppc_elf_gc_sweep_hook): Simplify removal of dynrelocs. Handle
tls relocs and all plt relocs.
(ppc_elf_tls_setup, ppc_elf_tls_optimize): New functions.
(ppc_elf_finish_dynamic_symbol): Don't build got entries here.
(ppc_elf_finish_dynamic_sections): Rewrite tag code using htab
shortcuts.
(ppc_elf_relocate_section): Tidy. Handle TLS relocs. Use
bfd_elf_local_sym_name. Simplify unresolved reloc code. Build got
entries and got relocs here. Warn on non-zero got reloc addend.
Split out branch taken/not taken reloc code into a separate switch
and correct offset calculation. Allow BRTAKEN/BRNTAKEN dynamic relocs.
Split out HA reloc adjustments to separate switch statement. Don't
warn on reloc overflow if we've already warned about undefined.
Don't rebuild sym name when reporting errors. Report all possible
errors from _bfd_final_link_relocate.
(bfd_elf32_bfd_final_link): Don't define.