i386-init.h and i386-tbl.h are generated files. There is nothing to
translate. Remove them from HFILES (POTFILES).
* Makefile.am (HFILES): Remove i386-init.h and i386-tbl.h.
* Makefile.in: Regenerated.
* po/POTFILES.in: Likewise.
With the removal of its use for FPU insns the suffix is now finally
properly misnamed. Drop its use altogether, replacing it by a separate
boolean instead.
As a comment near the top of match_template() already says: We really
only need this pseudo-suffix for far branch handling. Stop "deriving" it
for floating point insns. (Don't bother renaming the now properly
misnamed LONG_DOUBLE_MNEM_SUFFIX, to e.g. FAR_BRANCH_SUFFIX - it's going
to disappear anyway.)
At the very least a comment in process_operands() is stale. Beyond that
there are effectively two options:
1) It is possible that FADDP and FMULP were mistakenly not marked as
being in need of dealing with the compiler anomaly, and hence the
respective templates weren't removed at the time when they should
have been.
2) It is also possible that there are indeed uses known beyond compiler
generated output for these two commutative opcodes, and hence the
templates need to stay.
To be on the safe side assume 2: Update the comment and fold the
templates into their "normal" ones (utilizing D), adjusting consuming
code accordingly.
For FMULP also add a comment paralleling a similar one FADDP has.
There are just 4 templates using it, which can be easily identified by
other means, as D is set only on a very limited number of FPU templates.
Also move the respective conditional out of the code path taken by all
"reverse match" insns (it probably should have been this way already
before, to avoid the one conditional in the common case).
With this the templates which had FloatR dropped no longer differ from
their AT&T syntax + mnemonic counterparts - the only difference is now
which of the two would be recognized. For this, however, we don't need
two templates - we can simply arrange the condition for setting
Opcode_FloatR accordingly.
Commit bb996692bd ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") tried to start supporting long instructions but
it was insufficient.
On the disassembler, correct ".byte" output was limited to the first 64-bits
of an instruction. After that, zeroes are incorrectly printed.
Note that, it only happens on ".byte" output (instruction part) and not on
hexdump (data) part. For example, before this commit, hexdump and ".byte"
produces different values:
Assembly:
.insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
objdump output example (before the fix):
10: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
18: 89ab 4567 0123 3210
20: 7654 ba98 fedc
Note that, after 0xcd (after first 64-bits of the target instruction), all
".byte" values are incorrectly printed as zero while hexdump prints correct
instruction bits.
To resolve this, this commit adds "packet" argument to support dumping
instructions longer than 64-bits (to print correct instruction bits on
".byte"). This commit will be tested on the separate commit.
Assembly:
.insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
objdump output example (after the fix):
10: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
18: 89ab 4567 0123 3210
20: 7654 ba98 fedc
opcodes/ChangeLog:
* riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
using the new argument packet.
(riscv_disassemble_data): Add unused argument packet.
(print_insn_riscv): Pass packet to the disassemble function.
First of all make operand_type_register_match() apply to all sized
operands, i.e. in Intel Syntax also to respective memory ones. This
addresses gas wrongly accepting certain SIMD insns where register and
memory operand sizes should match but don't. This apparently has
affected all templates with one memory-only operand and one or more
register ones, both permitting at least two sizes, due to CheckRegSize
not taking effect.
Then also add CheckRegSize to a couple of non-SIMD templates matching
that same pattern of memory-only vs register operands. This replaces
bogus (for Intel Syntax) diagnostics referring to a wrong suffix (when
none was used at all) by "type mismatch" ones, just like already emitted
for insns where the template allows a register operand alongside a
memory one at any particular position.
This also is a prereq to limiting (ideally eliminating in the long run)
suffix "derivation" in Intel Syntax mode.
While making the code adjustment also flip order of checks to do the
cheaper one first in both cases.
To properly and predictably determine operand size encoding (operand
size or REX.W prefixes), consistent operand sizes need to be specified.
Add CheckRegSize where this was previously missing.
Both uniformly only ever take 16-bit memory operands while at the same
time requiring matching (in size) register operands, which then also
should disassemble that way. This in particular requires splitting each
of the templates for the assembler and separating decode of the
register and memory forms in the disassembler.
Mode/reg bits for these insns are 000 Dy, 001 Ay, and 111 100 for the
move immediate.
* m68k-opc.c (m68k_opcodes): Only accept 000 and 001 as mode
for move reg to macsr/mask insns.
This patch changes the address for "isa_config" auxiliary register
from 0xC2 to the correct value 0xC1. Moreover, it only exists in
arc700+ and not all ARCs.
opcodes/ChangeLog:
* arc-regs.h: Change isa_config address to 0xc1.
isa_config exists for ARC700 and ARCV2 and not ARCALL.
Use NoSuf to replace No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf
and add the explicit NoSuf to AddrPrefixOpReg in templates.
* i386-opc.tbl (NoSuf): New macro.
(AddrPrefixOpReg): Remove No_?Suf.
Replace No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf with
NoSuf in templates.
Add NoSuf to AddrPrefixOpReg in templates.
This patch adds the XTheadInt extension, which provides interrupt
stack management instructions.
The XTheadFmv extension is documented in the RISC-V toolchain
contentions:
https://github.com/riscv-non-isa/riscv-toolchain-conventions
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
This patch adds the XTheadFmv extension, which allows to access the
upper 32 bits of a double-precision floating-point register in RV32.
The XTheadFmv extension is documented in the RISC-V toolchain
contentions:
https://github.com/riscv-non-isa/riscv-toolchain-conventions
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Attributes which aren't used together in any single insn template can be
converted from individual booleans to a single enum, as was done for a few
other attributes before. This is more space efficient. Collect together
all attributes which express special operand constraints (and which fit
the criteria for folding).
On a testcase like
pla 8,foo@pcrel
disassembled with -Mpower10 results in
0: 00 00 10 06 pla r8,0 # 0
4: 00 00 00 39
0: R_PPC64_PCREL34 foo
but with -Mpower10 -Mraw
0: 00 00 10 06 .long 0x6100000
0: R_PPC64_PCREL34 foo
4: 00 00 00 39 addi r8,0,0
The instruction is unrecognised due to the hack we have in
extract_pcrel0 in order to disassemble paddi with RA0=0 and R=1 as
pla. I could have just added "&& !(dialect & PPC_OPCODE_RAW)" to the
condition in extract_pcrel0 under which *invalid is set, but went for
this larger patch that reorders the extended insn pla to the more
usual place before its underlying machine insn. (la is after addi
because we never disassemble to la.)
gas/
* testsuite/gas/ppc/raw.d,
* testsuite/gas/ppc/raw.s: Add pla.
opcodes/
* ppc-opc.c (extract_pcrel1): Rename from extract_pcrel0 and
invert *invalid logic.
(PCREL1): Rename from PCREL0.
(prefix_opcodes): Sort pla before paddi, adjusting R operand
for pla, paddi and psubi.
Although the encoding for scalar and fp registers is identical,
we should follow common pratice and use fp register names
when referencing fp registers.
The xtheadmemidx extension consists of indirect load/store instructions
which all load to or store from fp registers.
Let's use fp register names in this case and adjust the test cases
accordingly.
gas/
* testsuite/gas/riscv/x-thead-fmemidx-fail.l: Updated since rd need to
be float register.
* testsuite/gas/riscv/x-thead-fmemidx-fail.s: Likewise.
* testsuite/gas/riscv/x-thead-fmemidx.d: Likewise.
* testsuite/gas/riscv/x-thead-fmemidx.s: Likewise.
opcodes/
* riscv-opc.c (riscv_opcodes): Updated since rd need to be float register.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Hi all,
This wrong comment was introduced by previous AVX-VNNI-INT8 commit.
Committed as obvious fix.
BRs,
Haochen
opcodes/ChangeLog:
* i386-dis.c (VEX_W_0F3851): Corrected from
VEX_W_0F3851_P_0.
After this commit:
commit 6576bffe6c
Date: Thu Jul 7 13:43:45 2022 +0100
opcodes/arm: add disassembler styling for arm
Some people were seeing their builds failing with complaints about a
possible uninitialized variable usage. I previously fixed an instance
of this issue in this commit:
commit 2df82cd4b4
Date: Tue Nov 1 10:36:59 2022 +0000
opcodes/arm: silence compiler warning about uninitialized variable use
which did fix the build problems that the sourceware buildbot was
hitting, however, an additional instance of the same problem was
brought to my attention, and that is fixed in this commit.
Where commit 2df82cd4b4 fixed the uninitialized variable problem in
print_mve_unpredictable, this commit fixes the same problem in
print_mve_undefined.
As with the previous commit, I don't believe we could really ever get
an uninitialized variable usage, based on the current usage of the
function, so I have just initialized the reason variable to "??".
Prior to commit 1cb0ab18ad ("x86/Intel: restrict suffix derivation")
the Tbyte modifier on the FLDT and FSTPT templates was pointless, as
No_ldSuf would have prevented it being accepted. Due to the special
nature of LONG_DOUBLE_MNEM_SUFFIX said commit, however, has led to these
insns being accepted in Intel syntax mode even when "tbyte ptr" was
present. Restore original behavior by dropping Tbyte there. (Note that
these insns in principle should by marked AT&T syntax only, but since
they haven't been so far we probably shouldn't change that.)
The earlier commit:
commit 6576bffe6c
Date: Thu Jul 7 13:43:45 2022 +0100
opcodes/arm: add disassembler styling for arm
introduced two places where a register name was passed as the format
string to the disassembler's fprintf_styled_func callback. This will
cause a warning from some compilers, like this:
../../binutils-gdb/opcodes/arm-dis.c: In function ‘print_mve_vld_str_addr’:
../../binutils-gdb/opcodes/arm-dis.c:6005:3: error: format not a string literal and no format arguments [-Werror=format-security]
6005 | func (stream, dis_style_register, arm_regnames[gpr]);
| ^~~~
This commit fixes these by using "%s" as the format string.
The earlier commit:
commit 6576bffe6c
Date: Thu Jul 7 13:43:45 2022 +0100
opcodes/arm: add disassembler styling for arm
was causing a compiler warning about a possible uninitialized variable
usage within opcodes/arm-dis.c.
The problem is in print_mve_unpredictable, and relates to the reason
variable, which is set by a switch table.
Currently the switch table does cover every valid value, though there
is no default case. The variable switched on is passed in as an
argument to the print_mve_unpredictable function.
Looking at how print_mve_unpredictable is used, there is only one use,
the second argument is the one that is used for the switch table,
looking at how this argument is set, I don't believe it is possible
for this argument to take an invalid value.
So, I think the compiler warning is a false positive. As such, my
proposed solution is to initialize the reason variable to the string
"??", this will silence the warning, and the "??" string should never
end up being printed.
This commit adds disassembler styling for the ARM architecture.
The ARM disassembler is driven by several instruction tables,
e.g. cde_opcodes, coprocessor_opcodes, neon_opcodes, etc
The type for elements in each table can vary, but they all have one
thing in common, a 'const char *assembler' field. This field
contains a string that describes the assembler syntax of the
instruction.
Embedded within that assembler syntax are various escape characters,
prefixed with a '%'. Here's an example of a very simple instruction
from the arm_opcodes table:
"pld\t%a"
The '%a' indicates a particular type of operand, the function
print_insn_arm processes the arm_opcodes table, and includes a switch
statement that handles the '%a' operand, and takes care of printing
the correct value for that instruction operand.
It is worth noting that there are many print_* functions, each
function handles a single *_opcodes table, and includes its own switch
statement for operand handling. As a result, every *_opcodes table
uses a different mapping for the operand escape sequences. This means
that '%a' might print an address for one *_opcodes table, but in a
different *_opcodes table '%a' might print a register operand.
Notice as well that in our example above, the instruction mnemonic
'pld' is embedded within the assembler string. Some instructions also
include comments within the assembler string, for example, also from
the arm_opcodes table:
"nop\t\t\t@ (mov r0, r0)"
here, everything after the '@' is a comment that is displayed at the
end of the instruction disassembly.
The next complexity is that the meaning of some escape sequences is
not necessarily fixed. Consider these two examples from arm_opcodes:
"ldrex%c\tr%12-15d, [%16-19R]"
"setpan\t#%9-9d"
Here, the '%d' escape is used with a bitfield modifier, '%12-15d' in
the first instruction, and '%9-9d' in the second instruction, but,
both of these are the '%d' escape.
However, in the first instruction, the '%d' is used to print a
register number, notice the 'r' immediately before the '%d'. In the
second instruction the '%d' is used to print an immediate, notice the
'#' just before the '%d'.
We have two problems here, first, the '%d' needs to know if it should
use register style or immediate style, and secondly, the 'r' and '#'
characters also need to be styled appropriately.
The final thing we must consider is that some escape codes result in
more than just a single operand being printed, for example, the '%q'
operand as used in arm_opcodes ends up calling arm_decode_shift, which
can print a register name, a shift type, and a shift amount, this
could end up using register, sub-mnemonic, and immediate styles, as
well as the text style for things like ',' between the different
parts.
I propose a three layer approach to adding styling:
(1) Basic state machine:
When we start printing an instruction we should maintain the idea
of a 'base_style'. Every character from the assembler string will
be printed using the base_style.
The base_style will start as mnemonic, as each instruction starts
with an instruction mnemonic. When we encounter the first '\t'
character, the base_style will change to text. When we encounter
the first '@' the base_style will change to comment_start.
This simple state machine ensures that for simple instructions the
basic parts, except for the operands themselves, will be printed in
the correct style.
(2) Simple operand styling:
For operands that only have a single meaning, or which expand to
multiple parts, all of which have a consistent meaning, then I
will simply update the operand printing code to print the operand
with the correct style. This will cover a large number of the
operands, and is the most consistent with how styling has been
added to previous architectures.
(3) New styling syntax in assembler strings:
For cases like the '%d' that I describe above, I propose adding a
new extension to the assembler syntax. This extension will allow
me to temporarily change the base_style. Operands like '%d', will
then print using the base_style rather than using a fixed style.
Here are the two examples from above that use '%d', updated with
the new syntax extension:
"ldrex%c\t%{R:r%12-15d%}, [%16-19R]"
"setpan\t%{I:#%9-9d%}"
The syntax has the general form '%{X:....%}' where the 'X'
character changes to indicate a different style. In the first
instruction I use '%{R:...%}' to change base_style to the register
style, and in the second '%{I:...%}' changes base_style to
immediate style.
Notice that the 'r' and '#' characters are included within the new
style group, this ensures that these characters are printed with
the correct style rather than as text.
The function decode_base_style maps from character to style. I've
included a character for each style for completeness, though only
a small number of styles are currently used.
I have updated arm-dis.c to the above scheme, and checked all of the
tests in gas/testsuite/gas/arm/, and the styling looks reasonable.
There are no regressions on the ARM gas/binutils/ld tests that I can
see, so I don't believe I've changed the output layout at all. There
were two binutils tests for which I needed to force the disassembler
styling off.
I can't guarantee that I've not missed some untested corners of the
disassembler, or that I might have just missed some incorrectly styled
output when reviewing the test results, but I don't believe I've
introduced any changes that could break the disassembler - the worst
should be some aspect is not styled correctly.
Looking at the ARM disassembler output, every comment seems to start
with a ';' character, so I assumed this was the correct character to
start an assembler comment.
I then spotted a couple of places where there was no ';', but instead,
just a '@' character. I thought that this was a case of a missing
';', and proposed a patch to add the missing ';' characters.
Turns out I was wrong, '@' is actually the ARM assembler comment
character, while ';' is the statement separator. Thus this:
nop ;@ comment
is two statements, the first is the 'nop' instruction, while the
second contains no instructions, just the '@ comment' comment text.
This:
nop @ comment
is a single 'nop' instruction followed by a comment. And finally,
this:
nop ; comment
is two statements, the first contains the 'nop' instruction, while the
second contains the instruction 'comment', which obviously isn't
actually an instruction at all.
Why this matters is that, in the next commit, I would like to add
libopcodes syntax styling support for ARM.
The question then is how should the disassembler style the three cases
above?
As '@' is the actual comment start character then clearly the '@' and
anything after it can be styled as a comment. But what about ';' in
the second example? Style as text? Style as a comment?
And the third example is even harder, what about the 'comment' text?
Style as an instruction mnemonic? Style as text? Style as a comment?
I think the only sensible answer is to move the disassembler to use
'@' consistently as its comment character, and remove all the uses of
';'.
Then, in the next commit, it's obvious what to do.
There's obviously a *lot* of tests that get updated by this commit,
the only actual code changes are in opcodes/arm-dis.c.
Earlier tidying still missed an opportunity: There's no need for the
"anyimm" static variable. Instead of using it in the loop to mask
"allowed" (which is necessary to satisfy operand_type_or()'s assertions)
simply use "mask", requiring it to be calculated first. That way the
post-loop masking by "mask" ahead of the operand_type_all_zero() can be
dropped.
RISC-V Psabi pr196,
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/196
bfd/
* elfxx-riscv.c (riscv_release_subset_list): Free arch_str if needed.
(riscv_copy_subset_list): Copy arch_str as well.
* elfxx-riscv.h (riscv_subset_list_t): Store arch_str for each subset list.
gas/
* config/tc-riscv.c (riscv_reset_subsets_list_arch_str): Update the
architecture string in the subset_list.
(riscv_set_arch): Call riscv_reset_subsets_list_arch_str after parsing new
architecture string.
(s_riscv_option): Likewise.
(need_arch_map_symbol): New boolean, used to indicate if .option
directives do affect instructions.
(make_mapping_symbol): New boolean parameter reset_seg_arch_str. Need to
generate $x+arch for MAP_INSN, and then store it into tc_segment_info_data
if reset_seg_arch_str is true.
(riscv_mapping_state): Decide if we need to add $x+arch for MAP_INSN. For
now, only add $x+arch if the architecture strings in subset list and segment
are different. Besides, always add $x+arch at the start of section, and do
not add $x+arch for code alignment, since rvc for alignment can be judged
from addend of R_RISCV_ALIGN.
(riscv_remove_mapping_symbol): If current and previous mapping symbol have
same value, then remove the current $x only if the previous is $x+arch;
Otherwise, always remove previous.
(riscv_add_odd_padding_symbol): Updated.
(riscv_check_mapping_symbols): Don't need to add any $x+arch if
need_arch_map_symbol is false, so changed them to $x.
(riscv_frag_align_code): Updated since riscv_mapping_state is changed.
(riscv_init_frag): Likewise.
(s_riscv_insn): Likewise.
(riscv_elf_final_processing): Call riscv_release_subset_list to release
subset_list of riscv_rps_as, rather than only release arch_str in the
riscv_write_out_attrs.
(riscv_write_out_attrs): No need to call riscv_arch_str, just get arch_str
from subset_list of riscv_rps_as.
* config/tc-riscv.h (riscv_segment_info_type): Record current $x+arch mapping
symbol of each segment.
* testsuite/gas/riscv/mapping-0*: Merged and replaced by mapping.s.
* testsuite/gas/riscv/mapping.s: New testcase, to test most of the cases in
one file.
* testsuite/gas/riscv/mapping-symbols.d: Likewise.
* testsuite/gas/riscv/mapping-dis.d: Likewise.
* testsuite/gas/riscv/mapping-non-arch.s: New testcase for the case that
does need any $x+arch.
* testsuite/gas/riscv/mapping-non-arch.d: Likewise.
* testsuite/gas/riscv/option-arch-01a.d: Updated.
opcodes/
* riscv-dis.c (riscv_disassemble_insn): Set riscv_fpr_names back to
riscv_fpr_names_abi or riscv_fpr_names_numeric when zfinx is disabled
for some specfic code region.
(riscv_get_map_state): Recognized mapping symbols $x+arch, and then reset
the architecture string once the ISA is different.
When no AVX512-specific functionality is in use, the disassembly of
AVX512VL insns is indistinguishable from their AVX counterparts (if such
exist). Emit the {evex} pseudo-prefix in such cases.
Where applicable drop stray uses of PREFIX_OPCODE from table entries.
By putting the templates after their AVX512 counterparts, the AVX512
flavors will be picked by default. That way the need to always use {vex}
ceases to exist once respective CPU features (AVX512-VNNI or AVX512VL as
a whole) have been disabled. This way the need for the PseudoVexPrefix
attribute also disappears.