Besides avoiding the UB, this also makes right shifts inside
expression symbols unsigned, consistent with the way gas evaluates
expressions in source.
PR 26448
* symbols.c: Include limits.h.
(resolve_symbol_value <O_left_shift, O_right_shift>): Do an
unsigned shift. Warn if shift count larger than valueT size.
gas handles local symbols specially in order to save memory, but the
implementation using two separate hash tables is inefficient,
particularly the scheme of duplicating a struct local_symbol when it
needs to be converted to a full struct symbol. Also, updating symbol
pointers with LOCAL_SYMBOL_CHECK is horrible and has led to some hard
to find bugs.
This changes the implementation to use a single hash table and avoids
another copy of the symbol name in symbol_entry_t. When converting
local symbols the struct local_symbol memory is reused. Not only
does that save memory, but there is no need to twiddle symbol pointers
with LOCAL_SYMBOL_CHECK.
Assembling gcc-10 -g -Og gold/powerpc.cc output shows the following:
old:
symbol table hash statistics:
1371192 searches
1290398 collisions
143585 elements
262139 table size
mini local symbol table hash statistics:
2966204 searches
2707489 collisions
523533 elements
1048573 table size
523533 mini local symbols created, 140453 converted
new:
symbol table hash statistics:
2828883 searches
2453138 collisions
526665 elements
1048573 table size
523533 mini local symbols created, 140453 converted
* symbols.c (struct local_symbol): Add "hash" entry. Reorder fields.
Delete union. Adjust code throughout file.
(struct symbol): Add "hash", "name" and "x" entries. Reorder fields.
Split off some to..
(struct xsymbol): ..this. New struct. Adjust code throughout file
accessing these fields.
(struct symbol_entry): Delete.
(union symbol_entry): New.
(hash_symbol_entry): Adjust for symbol_entry_t change.
(symbol_entry_find): Likewise.
(eq_symbol_entry): Compare hash values too.
(symbol_entry_alloc): Delete.
(local_symbol_converted_p, local_symbol_mark_converted): Delete.
(local_symbol_get_real_symbol, local_symbol_set_real_symbol): Delete.
(local_hash): Delete.
(abs_symbol_x, dot_symbol_x): New static var.
(symbol_init): New function.
(symbol_create): Rewrite.
(LOCAL_SYMBOL_CHECK): Delete. Replace uses throughout with simple
test of flags.local_symbol.
(local_symbol_make): Adjust for struct local_symbol changes.
(local_symbol_convert): Rewrite. Adjust all callers.
(symbol_table_insert): Simplify.
(symbol_clone): Comment on local sym cloning. Handle split symbol
struct.
(get_real_sym): Delete. Remove all uses.
(symbol_find_exact_noref): Simplify.
(resolve_local_symbol): Don't resolve non-locals.
(S_SET_SEGMENT): Don't special case reg_section.
(S_SET_NAME): Set both name and bsym->name.
(symbol_mark_resolved, symbol_resolved_p): Simplify.
(symbol_symbolS): Update comment.
(symbol_begin): Don't create local_hash. Adjust abs_symbol setup.
(dot_symbol_init): Adjust dot_symbol setup.
(symbol_print_statistics): Delete local_hash stats.
Get rid of sy_ prefix, and some unused fields.
* symbols.c (struct symbol_flags): Rename sy_volatile to volatil,
and remove sy_ from other field names. Update throughout.
(struct symbol): Remove sy_ from field names. Delete unused
TARGET_SYMBOL_FIELDS. Update throughout file. Move after..
(struct local_symbol): ..here. Remove lsy_ from field names.
Delete unused TC_LOCAL_SYMFIELD_TYPE. Update throughout file.
(local_symbol_resolved_p, local_symbol_mark_resolved): Delete.
Expand uses throughout file.
(local_symbol_get_frag, local_symbol_set_frag): Likewise.
(symbol_new): Move symbol_table_frozen test to..
(symbol_append): ..here, and..
(symbol_insert): ..here.
(resolve_symbol_value, symbol_relc_make_expr): White space fixes.
(HANDLE_XADD_OPT1, HANDLE_XADD_OPT2): Likewise.
* config/obj-coff.h (RESOLVE_SYMBOL_REDEFINITION): Update.
sy_resolving ought to not be set for a struct local_symbol, but it is
apparent from local_symbol_make that the field is not initialised.
* symbols.c (resolve_symbol_value): Invoke LOCAL_SYMBOL_CHECK
before looking at add_symbol->sy_flags.sy_resolving.
This patch fixes a bug in GAS where the assembler enters a tight loop
when attempting to resolve recursively-defined symbols, e.g. when
trying to assemble "a=a".
This is a regression introduced between binutils 2.32 and 2.33,
by commit 1903f1385b
* symbols.c (struct local_symbol): Update comment.
(resolve_symbol_value): For resolved symbols equated to other
symbols, verify that the referenced symbol is not a local_symbol
before accessing sy_value. Don't leave symbol loops during
finalize_syms resolution.
* testsuite/gas/all/assign-bad-recursive.d: New test.
* testsuite/gas/all/assign-bad-recursive.l: Error output for test.
* testsuite/gas/all/assign-bad-recursive.s: Assembly for test.
* testsuite/gas/all/gas.exp: Run it.
All symbols, sizes and relocations in this section are octets instead of
bytes. Required for DWARF debug sections as DWARF information is
organized in octets, not bytes.
bfd/
* section.c (struct bfd_section): New flag SEC_ELF_OCTETS.
* archures.c (bfd_octets_per_byte): New parameter sec.
If section is not NULL and SEC_ELF_OCTETS is set, one octet es
returned [ELF targets only].
* bfd.c (bfd_get_section_limit): Provide section parameter to
bfd_octets_per_byte.
* bfd-in2.h: regenerate.
* binary.c (binary_set_section_contents): Move call to
bfd_octets_per_byte into section loop. Provide section parameter
to bfd_octets_per_byte.
* coff-arm.c (coff_arm_reloc): Provide section parameter
to bfd_octets_per_byte.
* coff-i386.c (coff_i386_reloc): likewise.
* coff-mips.c (mips_reflo_reloc): likewise.
* coff-x86_64.c (coff_amd64_reloc): likewise.
* cofflink.c (_bfd_coff_link_input_bfd): likewise.
(_bfd_coff_reloc_link_order): likewise.
* elf.c (_bfd_elf_section_offset): likewise.
(_bfd_elf_make_section_from_shdr): likewise.
Set SEC_ELF_OCTETS for sections with names .gnu.build.attributes,
.debug*, .zdebug* and .note.gnu*.
* elf32-msp430.c (rl78_sym_diff_handler): Provide section parameter
to bfd_octets_per_byte.
* elf32-nds.c (nds32_elf_get_relocated_section_contents): likewise.
* elf32-ppc.c (ppc_elf_addr16_ha_reloc): likewise.
* elf32-pru.c (pru_elf32_do_ldi32_relocate): likewise.
* elf32-s12z.c (opru18_reloc): likewise.
* elf32-sh.c (sh_elf_reloc): likewise.
* elf32-spu.c (spu_elf_rel9): likewise.
* elf32-xtensa.c (bfd_elf_xtensa_reloc): likewise
* elf64-ppc.c (ppc64_elf_brtaken_reloc): likewise.
(ppc64_elf_addr16_ha_reloc): likewise.
(ppc64_elf_toc64_reloc): likewise.
* elflink.c (bfd_elf_final_link): likewise.
(bfd_elf_perform_complex_relocation): likewise.
(elf_fixup_link_order): likewise.
(elf_link_input_bfd): likewise.
(elf_link_sort_relocs): likewise.
(elf_reloc_link_order): likewise.
(resolve_section): likewise.
* linker.c (_bfd_generic_reloc_link_order): likewise.
(bfd_generic_define_common_symbol): likewise.
(default_data_link_order): likewise.
(default_indirect_link_order): likewise.
* srec.c (srec_set_section_contents): likewise.
(srec_write_section): likewise.
* syms.c (_bfd_stab_section_find_nearest_line): likewise.
* reloc.c (_bfd_final_link_relocate): likewise.
(bfd_generic_get_relocated_section_contents): likewise.
(bfd_install_relocation): likewise.
For section which have SEC_ELF_OCTETS set, multiply output_base
and output_offset with bfd_octets_per_byte.
(bfd_perform_relocation): likewise.
include/
* coff/ti.h (GET_SCNHDR_SIZE, PUT_SCNHDR_SIZE, GET_SCN_SCNLEN),
(PUT_SCN_SCNLEN): Adjust bfd_octets_per_byte calls.
binutils/
* objdump.c (disassemble_data): Provide section parameter to
bfd_octets_per_byte.
(dump_section): likewise
(dump_section_header): likewise. Show SEC_ELF_OCTETS flag if set.
gas/
* as.h: Define SEC_OCTETS as SEC_ELF_OCTETS if OBJ_ELF.
* dwarf2dbg.c: (dwarf2_finish): Set section flag SEC_OCTETS for
.debug_line, .debug_info, .debug_abbrev, .debug_aranges, .debug_str
and .debug_ranges sections.
* write.c (maybe_generate_build_notes): Set section flag
SEC_OCTETS for .gnu.build.attributes section.
* frags.c (frag_now_fix): Don't divide by OCTETS_PER_BYTE if
SEC_OCTETS is set.
* symbols.c (resolve_symbol_value): Likewise.
ld/
* ldexp.c (fold_name): Provide section parameter to
bfd_octets_per_byte.
* ldlang (init_opb): New argument s. Set opb_shift to 0 if
SEC_ELF_OCTETS for the current section is set.
(print_input_section): Pass current section to init_opb.
(print_data_statement,print_reloc_statement,
print_padding_statement): Likewise.
(lang_check_section_addresses): Call init_opb for each
section.
(lang_size_sections_1,lang_size_sections_1,
lang_do_assignments_1): Likewise.
(lang_process): Pass NULL to init_opb.
Since I was looking at this I decided to fix the formatting, and
used an old C switch statements trick to factor out common code.
* symbols.c (use_complex_relocs_for): Formatting. Factor out
X_add_symbol tests.
In most cases we don't want expression symbols, such as that created
for an expression like "symbol + (1f - .)", resolved down to a
constant. Instead we'd like to leave the expression as "symbol +
constant" once the "1f - ." part has been resolved, and let the
backend decide whether "symbol" can be reduced further.
However, that doesn't work when trying to resolve .loc view symbols
early. We get expression symbols left as an O_symbol expression
pointing at an absolute symbol, and marked as sy_flags.sy_resolved.
That wouldn't really be a problem, but when one of those expression
symbols is used in further .loc view expressions, its value is taken
as zero.
This patch fixes the symbol value mistake, and stops creation of
O_symbol expression symbols pointing to absolute symbols. Either of
these fixes would cure the .loc view usage.
PR 24444
* symbols.c (resolve_symbol_value): When handling symbols
marked as sy_flags.resolved, return correct value for the
case of expression symbols left as an O_symbol expression.
Merge O_symbol code handling undefined and common symbols with
code handling special cases of expression symbols. Use
seg_left to test for undefined and common symbols. Don't
leave an O_symbol expression when X_add_symbol resolves to
the absolute_section. Init final_val later.
* testsuite/gas/mmix/basep-7.d: Adjust expected output.
Up to now, all symbol values are in units of bytes, where a "byte" can
consist of one or more octets (e.g. 8 bit or 16 bit).
Allow to specfiy that the "unit" of a newly created symbol is octets
(exactly 8 bit), instead of bytes.
* symbols.h (symbol_temp_new_now_octets): Declare.
(symbol_set_value_now_octets, symbol_octets_p): Declare.
* symbols.c (struct symbol_flags): New member sy_octets.
(symbol_temp_new_now_octets): New function.
(resolve_symbol_value): Return octets instead of bytes if
sy_octets is set.
(symbol_set_value_now_octets): New function.
(symbol_octets_p): New function.
This file was never supposed to be widely used. The fact that it has
found its way into many gas files led to bugs, typically when code
expecting a symbolS* to point at a struct symbol is presented with a
struct local_symbol. Also, commit 158184ac9e changed these structs in
2012 but didn't catch all places where symbol bsym was being used to
test for a local_symbol.
* Makefile.am (HFILES): Delete struc-symbol.h.
* doc/internals.texi: Delete struc-symbol.h reference and out
of date local symbol description.
* struc-symbol.h: Delete. Move contents to..
* symbols.c: ..here.
(symbol_on_chain, symbol_symbolS): New functions.
* symbols.h (symbol_on_chain, symbol_symbolS): Declare.
* cgen.c: Don't #include struc-symbol.h.
(gas_cgen_parse_operand): Don't test for local_symbol using
bsym, instead call symbol_symbolS. Use symbol_get_bfdsym.
(weak_operand_overflow_check, make_right_shifted_expr): Use
symbol accessors.
* config/obj-coff.c: Don't #include struc-symbol.h.
(GET_FILENAME_STRING): Delete.
* config/obj-elf.c: Don't #include struc-symbol.h.
(elf_file_symbol): Use symbol accessors.
(elf_adjust_symtab): Call symbol_on_chain.
* config/obj-evax.c: Don't #include struc-symbol.h.
* config/tc-nds32.c: Likewise.
* config/tc-rl78.c: Likewise.
* config/tc-rx.c: Likewise.
* config/tc-alpha.c: Likewise.
(add_to_link_pool, s_alpha_comm): Use symbol accessors.
* config/tc-arc.c: Don't #include struc-symbol.h.
(arc_check_relocs): Use symbol accessors, testing gas symbol
section rather than bfd symbol section.
* config/tc-avr.c: Don't #include struc-symbol.h.
(avr_patch_gccisr_frag): Use symbol accessors.
* config/tc-bfin.c: Don't #include struc-symbol.h.
(bfin_loop_beginend): Use symbol accessors.
* config/tc-csky.c: Don't #include struc-symbol.h.
(v2_work_movih, v2_work_ori): Use symbol accessors. Check for
absolute symbol as well as O_constant.
* config/tc-riscv.c: Don't #include struc-symbol.h.
(riscv_pre_output_hook): Use symbol accessors.
* config/tc-s390.c: Don't #include struc-symbol.h.
(s390_literals): Use symbol accessors.
* config/tc-score.c (s3_build_la_pic, s3_build_lwst_pic): Use
symbol accessors.
(s3_relax_branch_inst16, s3_relax_cmpbranch_inst32): Don't
test symbol bsym.
* config/tc-score7.c: Don't #include struc-symbol.h.
(s7_build_la_pic, s7_build_lwst_pic): Use symbol accessors.
(s7_b32_relax_to_b16): Don't test symbol bsym.
* config/tc-sh.c: Don't #include struc-symbol.h.
(insert_loop_bounds): Use symbol accessors.
(sh_frob_section): Remove bogus symbol canonicalization.
* config/tc-tic54x.c: Don't #include struc-symbol.h.
(tic54x_bss): Use symbol accessors.
* config/tc-tilegx.c: Don't #include struc-symbol.h.
(emit_tilegx_instruction, tilegx_parse_name): Use symbol accessors.
* config/tc-tilepro.c: Don't #include struc-symbol.h.
(emit_tilepro_instruction, tilepro_parse_name): Use accessors.
* config/tc-xtensa.c: Don't #include struc-symbol.h.
(xg_assemble_vliw_tokens): Use symbol accessors.
(xg_order_trampoline_chain): Likewise.
* ehopt.c: Don't #include struc-symbol.h.
(check_eh_frame): Correct local symbol test. Use symbol accessors.
* write.c: Don't #include struc-symbol.h.
(create_note_reloc, maybe_generate_build_notes): Use symbol accessors.
* Makefile.in: Regenerate.
* po/POTFILES.in: Regenerate.
What a trip down a rabbit hole this bug has been.
First observation: You can't use deferred_expression in s_leb128.
deferred_expression implements the semantics of .eqv or '==', saving
an expression with minimal simplification for assignment to a symbol
so that the expression is evaluated at uses of the symbol. In
particular, the value of "dot" is not evaluated at the .eqv symbol
assignment, but later. When s_leb128 uses deferred_expression,
"later" is at the end of assembly, giving entirely the wrong value of
"dot". There is no way to fix this for the s_leb128 use without
breaking .equ (which incidentally was already somewhat broken, see
commit e4c2619ad1). So, don't use deferred_expression in s_leb128.
But that leads to the gas test elf/dwarf2-17 failing, because view
symbols are calculated with a chain of expression symbols. In the
dwarf2-17 .L1 case there is a "temp_sym_1 > temp_sym_2" expression,
with temp_sym_1 and temp_sym_2 on either side of a ".balign". Since
".balign" and many other directives moving "dot" are not calculated on
the first (and only) pass over source, .L1 cannot be calculated until
final addresses are assigned to frags. However, ".uleb128 .L1" *is*
calculated immediately, resulting in the wrong value.
The reason why .L1 is calculated immediately is that code in
expr.c:operand after the comment
/* If we have an absolute symbol or a reg, then we know its
value now. */
does as it says and fixes the value of .L1, because .L1 is assigned
to absolute_section in dwarf2dbg.c:set_or_check_view. So, correct
that to expr_section.
Unfortunately that fix leads to failure of the elf/dwarf2-5 test with
../gas/elf/dwarf2-5.s: Error: attempt to get value of unresolved symbol `.L5'
../gas/elf/dwarf2-5.s: Error: attempt to get value of unresolved symbol `.L11'
../gas/elf/dwarf2-5.s: Error: attempt to get value of unresolved symbol `.L12'
So why is that? Well, it turns out that .L5 is defined in terms of
.L4, and apparently .L4 is undefined. But .L4 clearly is defined,
otherwise we would hit an error when trying to use .L4 a little
earlier. There are two copies of .L4! So, symbols are cloned when
that should not happen.
Symbol cloning is a technique used by gas to support saving the value
of symbols that change between uses, but that isn't the case with
.L4. Only one value is set and used for .L4, but indeed .L4 was being
cloned by symbol_clone_if_forward_ref. This despite no forward refs
being present. Also, .L4 is a local symbol and a cursory glance at
symbol_clone_if_forward_ref "if (symbolP && !LOCAL_SYMBOL_CHECK (symbolP))"
would seem to prevent cloning of local symbols. All is not as it
seems though, a curse of using macros. LOCAL_SYMBOL_CHECK modifies
its argument if a "struct local_symbol" is converted to the larger
"struct symbol", as happens when assigning a view symbol value.
That fact results in the recursive call to symbol_clone_if_forward_ref
returning a different address for "add_symbol". This problem could
have been fixed by using symbol_same_p rather than comparing symbol
pointers, but I thought it better to use the real symbol throughout.
Note that symbol_find_exact also returns the real symbol for a
converted local symbol.
Finally, this patch does expose lack of support for forward symbol
definitions in various targets. For example:
alpha-linux +ERROR: ../ld/testsuite/ld-elf/pr11138-2.c: compilation failed
This is caused by view symbol uses. On alpha-linux-gcc (GCC) 8.1.1
20180502 they happen to occur in .byte directives so were silently
broken in cases like elf/dwarf2-17 anyway.
/tmp/ccvtsMfU.s: Assembler messages:
/tmp/ccvtsMfU.s: Fatal error: unhandled relocation type BFD_RELOC_8
/tmp/ccvtsMfU.s: Fatal error: unhandled relocation type BFD_RELOC_8
md_apply_fix on those targets needs to handle fixups that resolve down
to a constant.
PR 23040
* symbols.c (get_real_sym): New function.
(symbol_same_p): Use get_real_sym.
(symbol_clone_if_forward_ref): Save real original add_symbol and
op_symbol for comparison against that returned from lookup or
recursive calls.
* dwarf2dbg.c (set_or_check_view): Use expr_section for
expression symbols, not absolute_section.
(dwarf2_directive_loc): Check symbol_equated_p and tidy cloning
of view symbols.
* read.c (s_leb128): Don't use deferred_expression.
gas * as.c (flag_generate_build_notes): New variable.
(show_usage): Add entry for --generate-missing-build-notes.
(parse_args): Parse --generate-missing-build-notes.
* as.h: Export flag_generate_build_notes.
* symbols.c (save_symbol_name): Ensure that the name parameter is
not NULL.
* write.c (create_obj_attrs_section): Reformat.
(create_note_reloc): New function - creates a relocation for a
field in a GNU Build attribute note.
(maybe_generate_build_notes): New function - created GNU Build
attribute notes if none are present in the output file.
(write_object_file): Call maybe_generate_build_notes.
* configure.ac (--enable-generate-build-notes): New option.
* NEWS: Announce the new feature.
* doc/as.textinfo: Document the new option.
* config.in: Regenerate.
* configure: Regenerate.
binutils* readelf.c (is_32bit_abs_reloc): Support R_PARISC_DIR32 as a
32-bit absolute reloc for the HPPA target.
* testsuite/binutils-all/note-5.d: New test.
* testsuite/binutils-all/note-5.s: Source file for new test.
* testsuite/binutils-all/objcopy.exp: Run new test.
* as.c: Include write.h.
(common_emul_init): Use FAKE_LABEL_NAME.
* ecoff.c (add_file, ecoff_directive_end, ecoff_directive_loc):
Likewise.
(ecoff_build_symbols): Use FAKE_LABEL_CHAR.
* expr.c (get_symbol_name): Use FAKE_LABEL_CHAR. Accept only if
input_from_string is TRUE.
* read.c (input_from_string): New.
(read_symbol_name): Use FAKE_LABEL_CHAR. Accept only if
input_from_string is TRUE.
(temp_ilp): Set input_from_string to TRUE.
(restore_ilp): Set input_from_string to FALSE.
* read.h (input_from_string): Declare.
* symbols.c: Include write.h
(S_IS_LOCAL): Check for FAKE_LABEL_CHAR.
(symbol_relc_make_sym): Fix comment refering to default fake label
string.
* write.h (FAKE_LABEL_CHAR): New.
* config/tc-riscv.h (FAKE_LABEL_CHAR): Define.
* testsuite/gas/all/err-fakelabel.s: New.
Separate out symbol flag reasons from section reasons to force a
reloc. Yes, this adds another section test to the local symbol case
too.
* symbols.c (S_FORCE_RELOC): Separate section and symbol tests.
* symbols.c (decode_local_label_name): Make type a const char *.
* listing.c (print_source): Make type of p const char *.
(print_line): Make type of string const char *.
(buffer_line): Return const char *.
(title): Make type const char *.
(subtitle): Likewise.
(listing_listing): Make type of p const char *.
* messages.c (as_internal_value_out_of_range): Make type of prefix
const char *.
* stabs.c (s_stab_generic): make type of stab_secname, stabstr_secname
and string const char *.
* read.c (_bfd_rel): Make type of name const char *.
* app.c (out_string): Change type to const char *.
(struct app_save::out_string): Likewise.
Use size_t in a few places involved with obstacks, and don't include
obstack.h in files that don't use obstacks.
gas/
* config/bfin-parse.y: Don't include obstack.h.
* config/obj-aout.c: Likewise.
* config/obj-coff.c: Likewise.
* config/obj-som.c: Likewise.
* config/tc-bfin.c: Likewise.
* config/tc-i960.c: Likewise.
* config/tc-rl78.c: Likewise.
* config/tc-rx.c: Likewise.
* config/tc-tic4x.c: Likewise.
* expr.c: Likewise.
* listing.c: Likewise.
* config/obj-elf.c (elf_file_symbol): Make name_length a size_t.
* config/tc-aarch64.c (symbol_locate): Likewise.
* config/tc-arm.c (symbol_locate): Likewise.
* config/tc-mmix.c (mmix_handle_mmixal): Make len_0 a size_t.
* config/tc-score.c (s3_build_score_ops_hsh): Make len a size_t.
(s3_build_dependency_insn_hsh): Likewise.
* config/tc-score7.c (s7_build_score_ops_hsh): Likewise.
(s7_build_dependency_insn_hsh): Likewise.
* frags.c (frag_grow): Make parameter a size_t, and use size_t locals.
(frag_new): Make parameter a size_t.
(frag_var_init): Make max_chars and var parameters size_t.
(frag_var, frag_variant): Likewise.
(frag_room): Return a size_t.
(frag_align_pattern): Make n_fill parameter a size_t.
* frags.h: Update function prototypes.
* symbols.c (save_symbol_name): Make name_length a size_t.