commit a7664973b2
Author: Jan Beulich <jbeulich@suse.com>
Date: Mon Apr 26 10:41:35 2021 +0200
x86: correct overflow checking for 16-bit PC-relative relocs
caused linker failure when building 16-bit program in a 32-bit ELF
container. Update GNU_PROPERTY_X86_FEATURE_2_USED with
#define GNU_PROPERTY_X86_FEATURE_2_CODE16 (1U << 12)
to indicate that 16-bit mode instructions are used in the input object:
https://groups.google.com/g/x86-64-abi/c/UvvXWeHIGMA
to indicate that 16-bit mode instructions are used in the object to
allow linker to properly perform relocation overflow check for 16-bit
PC-relative relocations in 16-bit mode instructions.
1. Update x86 assembler to always generate the GNU property note with
GNU_PROPERTY_X86_FEATURE_2_CODE16 for .code16 in ELF object.
2. Update i386 and x86-64 linkers to use 16-bit PC16 relocations if
input object is marked with GNU_PROPERTY_X86_FEATURE_2_CODE16.
bfd/
PR ld/27905
* elf32-i386.c: Include "libiberty.h".
(elf_howto_table): Add 16-bit R_386_PC16 entry.
(elf_i386_rtype_to_howto): Add a BFD argument. Use 16-bit
R_386_PC16 if input has 16-bit mode instructions.
(elf_i386_info_to_howto_rel): Update elf_i386_rtype_to_howto
call.
(elf_i386_tls_transition): Likewise.
(elf_i386_relocate_section): Likewise.
* elf64-x86-64.c (x86_64_elf_howto_table): Add 16-bit
R_X86_64_PC16 entry.
(elf_x86_64_rtype_to_howto): Use 16-bit R_X86_64_PC16 if input
has 16-bit mode instructions.
* elfxx-x86.c (_bfd_x86_elf_parse_gnu_properties): Set
elf_x86_has_code16 if relocatable input is marked with
GNU_PROPERTY_X86_FEATURE_2_CODE16.
* elfxx-x86.h (elf_x86_obj_tdata): Add has_code16.
(elf_x86_has_code16): New.
binutils/
PR ld/27905
* readelf.c (decode_x86_feature_2): Support
GNU_PROPERTY_X86_FEATURE_2_CODE16.
gas/
PR ld/27905
* config/tc-i386.c (set_code_flag): Update x86_feature_2_used
with GNU_PROPERTY_X86_FEATURE_2_CODE16 for .code16 in ELF
object.
(set_16bit_gcc_code_flag): Likewise.
(x86_cleanup): Always generate the GNU property note if
x86_feature_2_used isn't 0.
* testsuite/gas/i386/code16-2.d: New file.
* testsuite/gas/i386/code16-2.s: Likewise.
* testsuite/gas/i386/x86-64-code16-2.d: Likewise.
* testsuite/gas/i386/i386.exp: Run code16-2 and x86-64-code16-2.
include/
PR ld/27905
* elf/common.h (GNU_PROPERTY_X86_FEATURE_2_CODE16): New.
ld/
PR ld/27905
* testsuite/ld-i386/code16.d: New file.
* testsuite/ld-i386/code16.t: Likewise.
* testsuite/ld-x86-64/code16.d: Likewise.
* testsuite/ld-x86-64/code16.t: Likewise.
* testsuite/ld-i386/i386.exp: Run code16.
* testsuite/ld-x86-64/x86-64.exp: Likewise.
Since the policies of GNU and llvm toolchain are different for now,
current binutils mainline cannot accept any draft extensions, including
rvv, zfh, .... The Clang/LLVM allows these draft stuff on mainline,
but the GNU ld might be used with them, so this causes the link time
problems.
The patch allows ld to link the objects with unknown prefixed extensions,
which are probably generated by LLVM or customized toolchains.
bfd/
* elfxx-riscv.h (check_unknown_prefixed_ext): New bool.
* elfxx-riscv.c (riscv_parse_prefixed_ext): Do not check the
prefixed extension name if check_unknown_prefixed_ext is false.
* elfnn-riscv.c (riscv_merge_arch_attr_info): Set
check_unknown_prefixed_ext to false for linker.
gas/
* config/tc-riscv.c (riscv_set_arch): Set
check_unknown_prefixed_ext to true for assembler.
When assembling a forward reference the symbol will be unknown and so during
do_t_adr we cannot set the thumb bit. The bit it set so early to prevent
relaxations that are invalid. i.e. relaxing a Thumb2 to Thumb1 insn when the
symbol is Thumb.
But because it's done so early we miss the case for forward references.
This patch changes it so that we additionally check the thumb bit during the
internal relocation processing.
In principle we should be able to only set the bit during reloc processing but
that would require changes to the other relocations that the instruction could
be relaxed to.
This approach still allows early relaxations (which means that we have less
iteration of internal reloc processing) while still fixing the forward reference
case.
gas/ChangeLog:
2021-05-24 Tamar Christina <tamar.christina@arm.com>
PR gas/25235
* config/tc-arm.c (md_convert_frag): Set LSB when Thumb symbol.
(relax_adr): Thumb symbols 4 bytes.
* testsuite/gas/arm/pr25235.d: New test.
* testsuite/gas/arm/pr25235.s: New test.
This patch clarify the following invalid combinations of march and mabi,
* ilp32f/lp64f abi without f extension.
* ilp32d/lp64d abi without d extension.
* ilp32q/lp64q abi without q extension.
* e extension with any abi except ilp32e
GNU assembler reports errors when finding the above invalid combinations.
But LLVM-MC reports warnings and ignores these invalid cases. It help to
set the correct ilp32/lp64/ilp32e abi according to rv32/rv64/rve. This
looks good and convenient, so perhaps we can do the same things. However,
if you don't set the mabi, GNU assembler also try to set the suitable
ABI according to march/elf-attribute. Compared to LLVM-MC, we will choose
double/quad abi if d/f extension is set.
gas/
PR 25212
* config/tc-riscv.c (riscv_set_abi_by_arch): If -mabi isn't set, we
will choose ilp32e abi for rv32e. Besides, report errors for the
invalid march and mabi combinations.
* testsuite/gas/riscv/mabi-attr-rv32e.s: New testcase. Only accept
ilp32e abi for rve extension.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64f.d: Likewise.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64f.l: Likewise.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64d.d: Likewise.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64d.l: Likewise.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64d.q: Likewise.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64d.q: Likewise.
Renamed all mabi testcases to their march-mabi settings.
* config/tc-z80.c (emit_data_val): Warn on constant overflow.
(signed_overflow): New function.
(unsigned_overflow): New function.
(is_overflow): Use new functions.
(md_apply_fix): Use signed_overflow.
* testsuite/gas/z80/ez80_adl_suf.d: Fix test.
* testsuite/gas/z80/ez80_isuf.s: Likewise.
* testsuite/gas/z80/ez80_z80_suf.d: Likewise.
gas/ChangeLog:
2021-05-19 John Buddery <jvb@cyberscience.com>
PR 25599
* config/tc-ia64.c (emit_one_bundle): Increment fixup offset
by one for PCREL60B relocation on HP-UX.
The initial problem I wanted to fix here is that GAS was rejecting MVE
instructions such as:
vmov q3[2], q3[0], r2, r2
with:
Error: General purpose registers may not be the same -- `vmov q3[2],q3[0],r2,r2'
which is incorrect; such instructions are valid. Note that for moves in
the other direction, e.g.:
vmov r2, r2, q3[2], q3[0]
GAS is correct in rejecting this as it does not make sense to move both
lanes into the same register (the Arm ARM says this is CONSTRAINED
UNPREDICTABLE).
After fixing this issue, I added assembly/disassembly tests for these
vmovs. This revealed several disassembly issues, including incorrectly
marking the moves into vector lanes as UNPREDICTABLE, and disassembling
many of the vmovs as vector loads. These are now fixed.
gas/ChangeLog:
* config/tc-arm.c (do_mve_mov): Only reject vmov if we're moving
into the same GPR twice.
* testsuite/gas/arm/mve-vmov-bad-2.l: Tweak error message.
* testsuite/gas/arm/mve-vmov-3.d: New test.
* testsuite/gas/arm/mve-vmov-3.s: New test.
opcodes/ChangeLog:
* arm-dis.c (mve_opcodes): Fix disassembly of
MVE_VMOV2_GP_TO_VEC_LANE when idx == 1.
(is_mve_encoding_conflict): MVE vector loads should not match
when P = W = 0.
(is_mve_unpredictable): It's not unpredictable to use the same
source register twice (for MVE_VMOV2_GP_TO_VEC_LANE).
PR 27823
* config/tc-z80.c (emit_ld_r_m): Report an illegal load
instruction.
* testsuite/gas/z80/ill_ops.s: New test source file.
* testsuite/gas/z80/ill_ops.d: New test driver.
* testsuite/gas/z80/ill_ops.l: New test error output.
Surely disp processing should access the disp operand, not an imm one.
This is not an active issue only because imms and disps are, at the
moment, overlapping fields of the same union.
i386_finalize_immediate() is used for both AT&T and Intel immediate
operand handling. Move an AT&T-only check to i386_immediate(), which at
the same time allows it to cover other cases as well, giving an overall
better / more consistent diagnostic.
- Drop a pointless & where just before it was checked that the
respective bits are clear already anyway.
- Avoid a not really necessary operand_type_set() and a redundant
operand_type_or() / operand_type_and() pair.
Now that lex_got() is uniform for all targets using it, permit COFF
targets to also use @secrel32 with, in particular, .long. This is more
natural than the custom .secrel32 directive, and also allows more
flexibility (the "+six" form of the two added test lines doesn't work
with a .secrel32 equivalent, in that it silently produces an unintended
relocation type).
As an extra benefit this also makes sure that data definitions in Intel
syntax mode would get treated like they do for e.g. ELF targets.
I see no reason at all for us to carry two copies of almost identical
code. The differences, apart from the table entries, are benign. And
the #ifdef-ary doesn't really get any worse.
Allow a few more expression forms when the entire expression can be
resolved at assembly time. For this, i386_validate_fix() needs to
arrange for all processing of the relocation to be deferred to
tc_gen_reloc().
So far this (counter-intuitively) produced the size as recorded in the
(section) symbol. Obtain the section's size instead for section symbols.
(I wonder whether STT_SECTION symbols couldn't properly hold the
section's size in their st_size field, which in turn would likely mean
the internal symbol would also have its size properly updated.)
Note that this is not the same as the .sizeof.() pseudo-operator: @size
yields the local file's contribution to a section, while .sizeof.() gets
resolved by the linker to produce the final full section's size.
As to the 3rd each of the expected output lines in the changed testcase:
I can't find justification for zzz to come after yyy despite them being
defined in the opposite order in source. Therefore I think it's better
to permit both possible outcomes.
PR gas/27763
While the comment in output_jump() was basically correct prior to the
introduction of 64-bit mode, both that and the not-JMP-like behavior of
XBEGIN require adjustments: Branches with 32-bit displacement do not
wrap at 4G in 64-bit mode, and XBEGIN with 16-bit operand size doesn't
wrap at 64k. Similarly %rip-relative addressing doesn't wrap at 4G.
The new testcase points out that for PE/COFF object_64bit didn't get
set so far, preventing in particular the check at the end of
md_convert_frag() to take effect.
For Mach-O the new testcase fails (bogusly), in that only the first two
of the expected errors get raised. Since for Mach-O many testcases
already fail, and since an x86_64-darwin target can't even be configured
for, I didn't think I need to bother.
Note that there are further issues in this area, in particular for
branches with operand size overrides. Such branches, which truncate
%rip / %eip, can't be correctly expressed with ordinary PC-relative
relocations. It's not really clear what to do with them - perhaps the
best we can do is to carry through all associated relocations, leaving
it to the linker (or even loader) to decide (once the final address
layout is known). Same perhaps goes for relocations associated with
32-bit addressing in 64-bit mode.
Add () to !i.prefix[ADDR_PREFIX] to silence GCC 5:
gas/config/tc-i386.c:4152:31: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses]
&& !i.prefix[ADDR_PREFIX] != (flag_code == CODE_32BIT))
^
* config/tc-i386.c (optimize_encoding): Add () to silence GCC 5.
Over the years I've seen a number of instances where people used
lea (%reg1), %reg2
or
lea symbol, %reg
despite the same thing being expressable via MOV. Since additionally
LEA often has restrictions towards the ports it can be issued to, while
MOV typically gets dealt with simply by register renaming, transform to
MOV when possible (without growing opcode size and without altering
involved relocation types).
Note that for Mach-O the new 64-bit testcases would fail (for
BFD_RELOC_X86_64_32S not having a representation), and hence get skipped
there.
Constants not known at the time an individual insn gets assembled and
going into a sign-extended field still shouldn't be silently truncated
at the time the respective fixup gets resolved.
LEA behavior without a 64-bit destination is independent of address size
- in particular LEA with 32-bit addressing and 64-bit destination is the
same as LEA with 64-bit addressing and 32-bit destination. IOW checking
merely i.prefix[ADDR_PREFIX] is insufficient. This also means wrong
relocation types (R_X86_64_32S when R_X86_64_32 is needed) were used so
far in such cases.
Note that in one case in build_modrm_byte() the 64-bit check came too
early altogether, and hence gets dropped in favor of the one included in
the new helper. This is benign to non-64-bit code from all I can tell,
but the failure to clear disp16 could have been a latent problem.
In preparation for extending the conditions here defer this check until
operands have been parsed, as certain further attributes will need to
be known for determinig applicability of this check to be correct to
LEA.
While I can't point out any specific case where things break, it looks
wrong to have the consumer of a flag before its producer. Set .disp32
first, then do the possible conversion to signed 32-bit, and finally
check whether the value fits in a signed long.
Truncating an expression's X_add_number to just "long" can result in
confusing output (e.g. an apparently in-range number claimed to be out
of range). Use the abstraction that bfd provides for this.
Take the opportunity and also insert a missing "of".
Its (documented) behavior is unhelpful in particular in 64-bit build
environments: While printing large 32-bit numbers in decimal already
isn't very meaningful to most people, this even more so goes for yet
larger 64-bit numbers. bfd_sprintf_vma() still tries to limit the number
of digits printed (without depending on a build system property), but
uniformly produces hex output.
When built on a 32-bit host without --enable-64-bit-bfd, powerpc-linux
and other 32-bit powerpc targeted binutils fail to assemble some
power10 prefixed instructions with 34-bit fields. A typical error
seen when running the testsuite is
.../gas/testsuite/gas/ppc/prefix-pcrel.s:10: Error: bignum invalid
In practice this doesn't matter for addresses: 32-bit programs don't
need or use the top 2 bits of a d34 field when calculating addresses.
However it may matter when loading or adding 64-bit constants with
paddi. A power10 processor in 32-bit mode still has 64-bit wide GPRs.
So this patch enables limited support for O_big PowerPC operands, and
corrects sign extension of 32-bit constants using X_extrabit.
* config/tc-ppc.c (insn_validate): Use uint64_t for operand values.
(md_assemble): Likewise. Handle bignum operands.
(ppc_elf_suffix): Handle O_big. Remove unnecessary input_line_pointer
check.
* expr.c: Delete unnecessary forward declarations.
(generic_bignum_to_int32): Return uint32_t.
(generic_bignum_to_int64): Return uint64_t. Compile always.
(operand): Twiddle X_extrabit for unary '~'. Set X_unsigned and
clear X_extrabit for unary '!'.
* expr.h (generic_bignum_to_int32): Declare.
(generic_bignum_to_int64): Declare.
* testsuite/gas/ppc/prefix-pcrel.s,
* testsuite/gas/ppc/prefix-pcrel.d: Add more instructions.
A summary of what this patch set fixes:
For instructions
STXR w0,x2,[x0]
STLXR w0,x2,[x0]
The warning we emit currently is misleading:
Warning: unpredictable: identical transfer and status registers --`stlxr w0,x2,[x0]'
Warning: unpredictable: identical transfer and status registers --`stxr w0,x2,[x0]'
it ought to be:
Warning: unpredictable: identical base and status registers --`stlxr w0,x2,[x0]'
Warning: unpredictable: identical base and status registers --`stxr w0,x2,[x0]'
For instructions:
ldaxp x0,x0,[x0]
ldxp x0,x0,[x0]
The warning we emit is incorrect
Warning: unpredictable: identical transfer and status registers --`ldaxp x0,x0,[x0]'
Warning: unpredictable: identical transfer and status registers --`ldxp x0,x0,[x0]'
it ought to be:
Warning: unpredictable load of register pair -- `ldaxp x0,x0,[x0]'
Warning: unpredictable load of register pair -- `ldxp x0,x0,[x0]'
For instructions
stlxp w0, x2, x2, [x0]
stxp w0, x2, x2, [x0]
We don't emit any warning when it ought to be:
Warning: unpredictable: identical base and status registers --`stlxp w0,x2,x2,[x0]'
Warning: unpredictable: identical base and status registers --`stxp w0,x2,x2,[x0]'
gas/ChangeLog:
2021-04-09 Tejas Belagod <tejas.belagod@arm.com>
* config/tc-aarch64.c (warn_unpredictable_ldst): Clean-up diagnostic messages
for LD/ST Exclusive instructions.
* testsuite/gas/aarch64/diagnostic.s: Add a diagnostic test for STLXP.
* testsuite/gas/aarch64/diagnostic.l: Fix-up test after message clean-up.
PR 27217
* config/tc-aarch64.c (my_get_expression): Rename to
aarch64_get_expression. Add a fifth argument to enable deferring
of expression resolution.
(parse_typed_reg): Update calls to my_get_expression.
(parse_vector_reg_list): Likewise.
(parse_immediate_expression): Likewise.
(parse_big_immediate): Likewise.
(parse_shift): Likewise.
(parse_shifter_operand_imm): Likewise.
(parse_operands): Likewise.
(parse_shifter_operand_reloc): Update calls to my_get_expression
and call aarch64_force_reloc to determine the value of the new
fifth argument.
(parse_address_main): Likewise.
(parse_half): Likewise.
(parse_adrp): Likewise.
(aarch64_force_reloc): New function. Contains code extracted from...
(aarch64_force_relocation): ... here.
* testsuite/gas/aarch64/pr27217.s: New test case.
* testsuite/gas/aarch64/pr27217.d: New test driver.
The former two are unused anyway. And having such constants isn't very
helpful either, when they live in a place where updating the register
table wouldn't even allow noticing the need to adjust these constants.
st(1) ... st(7) will never be looked up in the hash table, so there's no
point inserting the entries. It's also not really necessary to do a 2nd
hash lookup after parsing the register number, nor is there a real
reason for having both st and st(0) entries. Plus we can easily do away
with the need for st to be first in the table.
There's no need for the extra level of indirection and the extra storage
needed for the pointer, pointing from one piece of static data to
another. Key checking of rounding being in effect off of the type field
of the structure instead.
There's no need for the extra level of indirection and the extra storage
needed for the pointer, pointing from one piece of static data to
another. Key checking of broadcast being in effect off of the type field
of the structure instead.
There's no need for the extra level of indirection and the extra storage
needed for the pointer, pointing from one piece of static data to
another. Key checking of masking being in effect off of the register
field of the structure instead.
All callers pass unsigned values (in some cases by virtue of passing
non-negative literal numbers).
This in turn requires struct {Mask,RC,Broadcast}_Operation's "operand"
fields to become unsigned, in turn allowing to reduce the amount of
casting needed (the two new casts that are necessary cast _to_
"unsigned" instead of _from_, as that's the form that'll never case
undefined behavior).