The attribute really specifies that the sum of register and memory
operands is 4. Express it like that in most places, while using the 2nd
(apart from XOP) CPU feature flags (FMA4) in reversed operand matching
logic.
With the use in build_modrm_byte() gone, part of an assertion there
also becomes meaningless - simplify that at the same time.
With all uses of the opcode modifier field gone, also drop that.
The few XOP insns which used it wrongly didn't have VexVVVV specified.
With that added, the only further missing piece to use more generic code
elsewhere is SwapSources - see e.g. the BMI2 insns for similar operand
patterns.
With the only users gone, drop the #define as well as the special case
code.
The VPROT* forms with an immediate operand are entirely standard in the
way their ModR/M bytes are built. There's no reason to invoke special
case code. With that the handling of an immediate there can also be
dropped; it was partially bogus anyway, as in its "no memory operands"
portion it ignores the possibility of an immediate operand (which was
okay only because that case was already handled by more generic code).
This really isn't a "modifier" and rather ought to live next to the base
opcode anyway. Use the bits we presently have available to fit in the
field, renaming it to opcode_space. As an intended side effect this
helps readability at the use sites, by shortening the references quite a
bit.
In generated code arrange for human readable output, by using the
SPACE_* constants there rather than raw numbers. This may aid debugging
down the road.
Fold adjacent comparisons when, by ORing in a certain mask, the same
effect can be achieved by a single one. In load_insn_p() this extends
to further uses of an already available local variable.
Now that we have identifiers for the mnemonic strings we can avoid
opcode based comparisons, for (in many cases) being more expensive and
(in a few cases) being a little fragile and not self-documenting.
Note that the MOV optimization can be engaged by the earlier LEA one,
and hence LEA also needs checking for there.
Swapping operands for commutative insns occurs outside of
optimize_encoding() and hence needs explicit checking for a request to
avoid any optimizations.
Dropping a meaningless segment prefix occurs outside of
optimize_encoding() and hence needs explicit checking for a request to
avoid any optimizations.
Don't define this. It is defined just before elf-bfd.h is included,
but doesn't have any relevance there. Instead is for aout64.h where
the default is 4 anyway.
Provide a way for config/obj-* to clean up at end of assembly, and do
so for ELF.
* obj.h (struct format_ops): Add "end".
* config/obj-aout.c (aout_format_ops): Init new field.
* config/obj-coff.c (coff_format_ops): Likewise.
* config/obj-ecoff.c (ecoff_format_ops): Likewise.
* config/obj-elf.c (elf_format_ops): Likewise.
(elf_begin): Move later in file. Clear some more variables.
(comment_section): Make file scope.
(free_section_idx): Rewrite.
(elf_adjust_symtab): Expand str_htab_create call and use
free_section_idx as delete function.
(elf_frob_file_after_relocs): Don't clean up groups.indexes here.
(elf_end): New function.
* config/obj-elf.h (obj_end): Define.
* config/obj-multi.h (obj_end): Define.
* output-file.c (output_file_close): Call obj_end.
Along with the normal JAL alias, the C-extension one should have been
moved as well by 839189bc93 ("RISC-V: re-arrange opcode table for
consistent alias handling"), for the assembler to actually be able to
use it where/when possible.
Since neither this nor any other compressed branch insn was being tested
so far, take the opportunity and introduce a new testcase covering those.
Ideally we'd do away with this somewhat questionable adjustment (leaving
i.types[] untouched). That's non-trivial though as it looks, so only
- move the logic into process_operands(), putting it closer to related
logic and eliminating a conditional for operand-less insns,
- make it consistent (i.e. also affect %xmm0), eliminating an ugly
special case later in the function.
All (there are just two) SSE2AVX templates with %xmm0 as first operand
also specify VEX3SOURCES. Hence there's no need for an "else" to the
respective if(), and the if() itself can become an assertion.
Now that we have identifiers for the mnemonic strings we can avoid
opcode based comparisons, for (in many cases) being more expensive and
(in a few cases) being a little fragile and not self-documenting.
This tidies memory allocated for entries in macro_hash. Freeing the
macro name requires a little restructuring of the define_macro
interface due to the name being used in the error message, and exposed
the fact that the name and other fields were not initialised by the
iq2000 backend.
There is also a fix for
.macro .macro
.endm
.macro .macro
.endm
which prior to this patch reported
mac.s:1: Warning: attempt to redefine pseudo-op `.macro' ignored
mac.s:3: Error: Macro `.macro' was already defined
rather than reporting the attempt to redefine twice.
* macro.c (macro_del_f): New function.
(macro_init): Use it when creating macro_hash.
(free_macro): Free macro name too.
(define_macro): Return the macro_entry, remove idx, file, line and
namep params. Call as_where. Report errors here. Delete macro
from macro_hash on attempt to redefined pseudo-op.
(delete_macro): Don't call free_macro.
* macro.h (define_macro): Update prototype.
* read.c (s_macro): Adjust to suit.
* config/tc-iq2000.c (iq2000_add_macro): Init all fields of
macro_entry.
This patch adds support for the .secidx directive and its corresponding
relocation to aarch64-w64-mingw32. As with x86, this is a two-byte LE
integer which gets filled in with the 1-based index of the output
section that a symbol ends up in.
This is needed for PDBs, which represent addresses as a .secrel32,
.secidx pair.
The test is substantially the same as for amd64, but with changes made
for padding and alignment.
Now that we have identifiers for the mnemonic strings we can avoid
strcmp() in a number of places, comparing the offsets into the mnemonic
string table instead. While doing this also
- convert a leftover strncmp() to startswith() (apparently an oversight
when rebasing the original patch introducing the startswith() uses),
- use the new shorthand for current_templates->start also elsewhere in
md_assemble() (valid up to the point where match_template() is
called).
Using full pointers to reference the insn mnemonic strings is not very
efficient. With overall string size presently just slightly over 20k,
even a 16-bit value would suffice. Use "unsigned int" for now, as
there's no good use we could presently make of the otherwise saved 16
bits.
For 64-bit builds this reduces table size by 6.25% (prior to the recent
ISA extension additions it would have been 12.5%), with a similar effect
on cache occupation of table entries accessed. For PIE builds of gas
this also reduces the number of base relocations quite a bit (obviously
independent of bitness).
In preparation for changing the representation of the "name" field
introduce a wrapper function. This keeps the mechanical change separate
from the functional one.
We noticed that a warning message about the use of scalar fp16
instructions being UNPREDICTABLE when conditionalized in an IT
block referenced the specific A-class architecture revision
ARMv8.2-A.
Many of these instructions are now also part of ARMv8.1-M, so
the warning message had become misleading. Here we just change
the message to not specify an architecture revision at all and
update all testing accordingly. This was done with a simple
find-n-replace within the binutils sources. No tests have
regressed for the arm target.
gas/ChangeLog:
* config/tc-arm.c (do_scalar_fp16_v82_encode): Remove
ARMv8.2-A from the warning message.
(do_neon_movhf): Likewise
* testsuite/gas/arm/armv8-2-fp16-scalar-bad.l: Likewise
* testsuite/gas/arm/mve-vaddsub-it-bad.l: Likewise
* testsuite/gas/arm/mve-vcvtne-it-bad.l: Likewise
* testsuite/gas/arm/mve-vcvtne-it.d: Likewise
Previously we had experienced issues with assembling a "VCVTNE" instruction
in the presence of the MVE architecture extension, because it could be
interpreted both as:
* The base instruction VCVT + NE for IT predication when inside an IT block.
* The MVE instruction VCVTN + E in the Else of a VPT block.
Given a C reproducer of:
```
int test_function(float value)
{
int ret_val = 10;
if (value != 0.0)
{
ret_val = (int) value;
}
return ret_val;
}
```
GCC generates a VCVTNE instruction based on the `truncsisf2_vfp`
pattern, which will look like:
`vcvtne.s32.f32 s-reg, s-reg`
This still triggers an error due to being misidentified as "vcvtn+e"
Similar errors were found with other type combinations and instruction
patterns (these have all been added to the testing of this patch).
This class of errors was previously worked around by:
https://sourceware.org/pipermail/binutils/2020-August/112728.html
which addressed this by looking at the operand types, however,
that isn't adequate to cover all the extra cases that have been
found. Instead, we add some special-casing logic earlier when
the instructions are parsed that is conditional on whether we are
in a VPT block or not, when the instruction is parsed.
gas/ChangeLog:
* config/tc-arm.c (opcode_lookup): Add special vcvtn handling.
* testsuite/gas/arm/mve-vcvtne-it-bad.l: Add further testing.
* testsuite/gas/arm/mve-vcvtne-it-bad.s: Likewise.
* testsuite/gas/arm/mve-vcvtne-it.d: Likewise.
* testsuite/gas/arm/mve-vcvtne-it.s: Likewise.
We already use C99's __func__ in places, use it more generally. This
patch doesn't change uses in the testsuite. I've also left one in
gold.h that is protected by GCC_VERSION < 4003. If any of the
remaining uses bothers anyone I invite patches.
bfd/
* bfd-in.h: Replace __FUNCTION__ with __func__.
* elf32-bfin.c: Likewise.
* elfnn-aarch64.c: Likewise.
* elfxx-sparc.c: Likewise.
* bfd-in2.h: Regenerate.
gas/
* config/tc-cris.c: Replace __FUNCTION__ with __func__.
* config/tc-m68hc11.c: Likewise.
* config/tc-msp430.c: Likewise.
gold/
* dwp.h: Replace __FUNCTION__ with __func__.
* gold.h: Likewise, except for use inside GCC_VERSION < 4003.
ld/
* emultempl/pe.em: Replace __FUNCTION__ with __func__.
* emultempl/pep.em: Likewise.
* pe-dll.c: Likewise.
PR gas/29940
With the single-operand JAL entry now sitting ahead of the two-operand
one, the parsing of a two-operand insn would first try to parse an 'a'-
style operand, resulting in the insertion of bogus (and otherwise
unused) undefined symbols in the symbol table, having register names.
Since 'a' is used as 1st operand only with J and JAL, and since JAL is
the only insn _also_ allowing for a register as 1st operand (and then
there being a 2nd one), special case this parsing aspect right there.
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Restores tc_pe_dwarf2_emit_offset in tc-aarch64.c, which is needed to
make sure that DWARF offsets are encoded correctly (they're secrels in
COFF). There were remnants of this there before, but they were removed
by Jedidiah's original patch - presumably because we didn't yet have
.secrel32.
This adds the remaining pe-aarch64 relocations, and gets them working.
It also brings in the constant directives from ELF, as otherwise .word
would be 2 rather than 4 bytes, and .xword and .dword wouldn't be
defined.
Delete a few files only used for obsolete targets, and tidy config,
xfails and other pieces of support specific to those targets. And
since I was editing target triplets in test files, fix the nm
alpha-linuxecoff fails.
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
This commit makes CSR class handling for 'Smstateen' and 'Ssstateen'
extensions simpler using fall-throughs (as used in CSR_CLASS_I{,_32}).
gas/ChangeLog:
* config/tc-riscv.c (riscv_csr_address): Simplify the logic for
'Smstateen' and 'Ssstateen' extensions.
TSXLDTRK takes RTM as a prereq. Additionally introduce an umbrella "tsx"
extension option covering both RTM and HLE, paralleling the "abm" one we
already have.
SEV-ES is an extension to SVME. SNP in turn is an extension to SEV-ES,
and yet in turn RMPQUERY is a SNP extension.
Note that cpu_arch[] has no SNP entry, so CPU_ANY_SNP_FLAGS remains
unused (just like CPU_SNP_FLAGS already is).
SSE itself takes FXSR as a prereq. Like AES, PCLMUL, and SHA both GFNI
and KL take SSE2 as a prereq, for operating on packed integers. And
while correcting KL also record it as a prereq to WIDEKL.
Features with prereqs as well as features with dependents cannot re-use
CPU_*_MASK for the 3rd argument of SUBARCH() - they need to use
CPU_ANY_*_MASK in order to avoid disabling too many (when there are
prereqs) and/or too few (when there are dependents) features.
Generally any CPU_ANY_*_MASK which exist should not remain unused.
Exceptions are
- FISTTP which has no corresponding entry in cpu_arch[],
- IAMCU which is a base architecture and hence uses ARCH(), not
SUBARCH() (only extensions can be disabled, but unlike for Cpu*86 it
would be a little more clumsy to suppress generating of the #define).
Getting both forward and reverse ISA dependencies right / consistent has
been a permanent source of mistakes. Reduce what needs specifying
manually to just the direct forward dependencies. Transitive forward
dependencies as well as reverse ones are now derived and hence cannot go
out of sync anymore (at least in the vast majority of cases; there are a
few special cases to still take care of manually). In the course of this
several CPU_ANY_*_FLAGS disappear, requiring adjustment to the
assembler's cpu_arch[].
Note that to retain the correct reverse dependency of AVX512F wrt
AVX512-VP2INTERSECT, the latter has the previously missing AVX512F
prereq added.
Note further that to avoid adding the following undue prereqs:
* ATHLON, K8, and AMDFAM10 gain CMOV and FXSR,
* IAMCU gains 387,
auxiliary table entries (including a colon-separated modifier) are
introduced in addition to the ones representing from converting the old
table.
To maintain forward-only dependencies between AVX (XOP) and SSE* (SSE4a)
(i.e. "nosse" not disabling AVX), reverse dependency tracking is
artifically suppressed.
As a side effect disabling of SSE or SSE2 will now also disable AES,
PCLMUL, and SHA (respective elements were missing from
CPU_ANY_SSE2_FLAGS).
While originally indeed used for register size checking only, the
attribute has been used for memory operand size checking as well already
for quite a while, with more such uses recently having been added.
The current AS accepts invalid operands due to miss of operands length check.
For example, "e6" is an invalid operand in (vsetvli a0, a1, e6, mf8, tu, ma),
but it's still accepted by assembler. In detail, the condition check "strncmp
(array[i], *s, len) == 0" in arg_lookup function passes with "strncmp ("e64",
"e6", 2)" in the case above. So the generated encoding is same as that of
(vsetvli a0, a1, e64, mf8, tu, ma).
This patch fixes issue above by prompting an error in such case and also adds
a new testcase.
gas/ChangeLog:
* config/tc-riscv.c (arg_lookup): Add string length check for operands.
* testsuite/gas/riscv/vector-insns-fail-vsew.d: New testcase for an illegal vsew.
* testsuite/gas/riscv/vector-insns-fail-vsew.l: Likewise.
* testsuite/gas/riscv/vector-insns-fail-vsew.s: Likewise.
As Alan points out, ASAN takes issue with these constructs, for
current_templates being NULL. Wrap them in sizeof(), so the expressions
aren't actually evaluated.
PR gas/29524
Having templates with a suffix explicitly present has always been
quirky. After prior adjustment all that's left to also eliminate the
anomaly from move-with-sign-extend is to consolidate the insn templates
and to make may_need_pass2() cope (plus extend testsuite coverage).
The need for them on the operand-less string insns has gone away with
the removal of maybe_adjust_templates() and associated logic. Since
i386_index_check() needs adjustment then anyway, take the opportunity
and also simplify it, possible again as a result of said removal (plus
the opcode template adjustments done here).
Having it in match_template() is unhelpful. Neither does looking for the
next template to possibly match make any sense in that case, nor is the
resulting diagnostic making clear what the problem is.
While moving the check, also generalize it to include all SIMD and VEX-
encoded insns. This way an existing conditional can be re-used in
md_assemble(). Note though that this still leaves a lof of insns which
are also wrong to use with these relocations.
Further fold the remaining check (BFD_RELOC_386_GOT32) with the XRELEASE
related one a few lines down. This again allows re-using an existing
conditional.
In commit 1212781b35 ("ix86: allow HLE store of accumulator to
absolute address") I was wrong to exclude 64-bit code. Dropping the
check also leads to better diagnostics in 64-bit code ("MOV", after
all, isn't invalid with "XRELEASE").
While there also limit the amount of further checks done: The operand
type checks that were there were effectively redundant with other ones
anyway, plus it's quite fine to also have "xrelease mov <disp>, %eax"
look for the next MOV template (in fact again also improving
diagnostics).