ommit 3ae729d5a4
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Wed Mar 7 04:18:45 2018 -0800
x86: Rewrite NOP generation for fill and alignment
increased MAX_MEM_FOR_RS_ALIGN_CODE to 4095 which resulted in increase
of assembler time and memory usage by 5 times for inputs with many
.p2align directives, which is typical for LTO output. This patch passes
max_bytes to TC_FRAG_INIT so that MAX_MEM_FOR_RS_ALIGN_CODE can be set
as needed and tracked by backend it so that HANDLE_ALIGN can check the
maximum alignment for each rs_align_code frag. Wall time to assemble
the same cc1plus.s:
before:
423.78user 0.89system 7:05.71elapsed 99%CPU
after:
102.35user 0.27system 1:42.89elapsed 99%CPU
PR gas/24165
* frags.c (frag_var_init): Pass max_chars to TC_FRAG_INIT as
max_bytes.
* config/tc-aarch64.h (TC_FRAG_INIT): Add and pass max_bytes to
aarch64_init_frag.
* /config/tc-arm.h (TC_FRAG_INIT): And and pass max_bytes to
arm_init_frag.
* config/tc-avr.h (TC_FRAG_INIT): And and ignore max_bytes.
* config/tc-ia64.h (TC_FRAG_INIT): Likewise.
* config/tc-mmix.h (TC_FRAG_INIT): Likewise.
* config/tc-nds32.h (TC_FRAG_INIT): Likewise.
* config/tc-ns32k.h (TC_FRAG_INIT): Likewise.
* config/tc-rl78.h (TC_FRAG_INIT): Likewise.
* config/tc-rx.h (TC_FRAG_INIT): Likewise.
* config/tc-score.h (TC_FRAG_INIT): Likewise.
* config/tc-tic54x.h (TC_FRAG_INIT): Likewise.
* config/tc-tic6x.h (TC_FRAG_INIT): Likewise.
* config/tc-xtensa.h (TC_FRAG_INIT): Likewise.
* config/tc-i386.h (MAX_MEM_FOR_RS_ALIGN_CODE): Set to
(alignment ? ((1 << alignment) - 1) : 1)
(i386_tc_frag_data): Add max_bytes.
(TC_FRAG_INIT): Add and track max_bytes.
(HANDLE_ALIGN): Replace MAX_MEM_FOR_RS_ALIGN_CODE with
fragP->tc_frag_data.max_bytes.
* doc/internals.texi: Update TC_FRAG_TYPE with max_bytes.
Various fixes to linker relaxation. In general, we need to support
relaxing every branch, even if we don't relax it in the assembler,
so we can optionally defer relaxation to the linker.
* elf32-rl78.c (rl78_offset_for_reloc): Add more relocs.
(rl78_elf_relax_section): Add bc/bz/bnc/bnz/bh/bnh. Fix reloc
choices.
* config/rl78-parse.y: Make all branches relaxable via
rl78_linkrelax_branch().
* config/tc-rl78.c (rl78_linkrelax_branch): Mark all relaxable
branches with relocs.
(options): Add OPTION_NORELAX.
(md_longopts): Add -mnorelax.
(md_parse_option): Support OPTION_NORELAX.
(op_type_T): Add bh, sk, call, and br.
(rl78_opcode_type): Likewise.
(rl78_relax_frag): Fix not-relaxing logic. Add sk.
(md_convert_frag): Fix relocation handling.
(tc_gen_reloc): Strip relax relocs when not linker relaxing.
(md_apply_fix): Defer overflow handling for anything that needs a
PLT, to the linker.
* config/tc-rl78.h (TC_FORCE_RELOCATION): Force all relocations to
the linker when linker relaxing.
* doc/c-rl78.texi (norelax): Add.
gas * config/tc-rl78.h (TC_LINKRELAX_FIXUP): Define.
(TC_FORCE_RELOCATION_SUB_SAME): Define.
(DWARF2_USE_FIXED_ADVANCE_PC): Define.
* gas/lns/lns.exp: Add RL78 to list of targets using
DW_LNS_fixed_advance_pc.
bfd * elf32-rl78.c (RL78_OP_REL): New macro.
(rl78_elf_howto_table): Use it for complex relocs.
(get_symbol_value): Handle the cases when the info or status
arguments are NULL.
(get_romstart): Cache the status returned by get_symbol_value.
(get_ramstart): Likewise.
(RL78_STACK_PUSH): Generate an error message if the stack
overflows.
(RL78_STACK_POP): Likewise for underflows.
(rl78_compute_complex_reloc): New function. Contains the basic
processing code for all RL78 complex relocs.
(rl78_special_reloc): New function. Provides special reloc
handling for complex relocs.
(rl78_elf_relocate_section): Use rl78_compute_complex_reloc.
(rl78_offset_for_reloc): Likewise.
binutils* readelf.c (target_specific_reloc_handling): Add code to handle
RL78 complex relocs.
This patch adds initial in-gas opcode relaxation for the rl78
backend. Specifically, it checks for conditional branches that
are too far and replaces them with inverted branches around longer
fixed branches.
* elf32-rl78.c (rl78_elf_howto_table): Add R_RL78_RH_RELAX.
(rl78_reloc_map): Add BFD_RELOC_RL78_RELAX.
(rl78_elf_relocate_section): Add R_RL78_RH_RELAX, R_RL78_RH_SFR,
and R_RL78_RH_SADDR.
(rl78_elf_finish_dynamic_sections): Only validate PLT section if
we didn't relax anything, as relaxing might remove a PLT reference
after we've set up the table.
(elf32_rl78_relax_delete_bytes): New.
(reloc_bubblesort): New.
(rl78_offset_for_reloc): New.
(relax_addr16): New.
(rl78_elf_relax_section): Add support for relaxing long
instructions into short ones.
[gas]
* config/rl78-defs.h (rl78_linkrelax_addr16): Add.
(rl78_linkrelax_dsp, rl78_linkrelax_imm): Remove.
* config/rl78-parse.y: Tag all addr16 and branch patterns with
relaxation markers.
* config/tc-rl78.c (rl78_linkrelax_addr16): New.
(rl78_linkrelax_branch): New.
(OPTION_RELAX): New.
(md_longopts): Add relax option.
(md_parse_option): Add OPTION_RELAX.
(rl78_frag_init): Support relaxation.
(rl78_handle_align): New.
(md_assemble): Support relaxation.
(md_apply_fix): Likewise.
(md_convert_frag): Likewise.
* config/tc-rl78.h (MAX_MEM_FOR_RS_ALIGN_CODE): New.
(HANDLE_ALIGN): New.
(rl78_handle_align): Declare.
* config/rl78-parse.y (rl78_bit_insn): New. Set it for all bit
insn patterns.
(find_bit_index): New. Strip .BIT suffix off relevent
expressions for bit insns.
(rl78_lex): Exclude bit suffixes from expression parsing.
[include/elf]
* rl78.h (R_RL78_RH_RELAX, R_RL78_RH_SFR, R_RL78_RH_SADDR): New.
(RL78_RELAXA_BRA, RL78_RELAXA_ADDR16: New.