This patch adds support for followign SVE2p1 instruction, spec is available here [1].
1. PMOV (to vector)
2. PMOV (to predicate)
Both pmov (to vector) and pmov (to predicate) have destination scalable vector
register and source scalable vector register respectively as an operand with no
suffix and optional index. To handle this case we have added 8 new operands in
this patch.
AARCH64_OPND_SVE_Zn0_INDEX, /* Zn[index], bits [9:5]. */
AARCH64_OPND_SVE_Zn1_17_INDEX, /* Zn[index], bits [9:5,17]. */
AARCH64_OPND_SVE_Zn2_18_INDEX, /* Zn[index], bits [9:5,18:17]. */
AARCH64_OPND_SVE_Zn3_22_INDEX, /* Zn[index], bits [9:5,18:17,22]. */
AARCH64_OPND_SVE_Zd0_INDEX, /* Zn[index], bits [4:0]. */
AARCH64_OPND_SVE_Zd1_17_INDEX, /* Zn[index], bits [4:0,17]. */
AARCH64_OPND_SVE_Zd2_18_INDEX, /* Zn[index], bits [4:0,18:17]. */
AARCH64_OPND_SVE_Zd3_22_INDEX, /* Zn[index], bits [4:0,18:17,22]. */
Since the index of the <Zd> operand is optional, the index part is
dropped in disassembly in both the cases of "no index" or "zero index".
As per spec: PMOV <Zd>{[<imm>]}, <Pn>.D
PMOV <Pn>.D, <Zd>{[<imm>]}
Example1:
Assembly: pmov z5[0], p6.d
Disassembly: pmov z5, p6.d
Assembly: pmov z5, p6.d
Disassembly: pmov z5, p6.d
Example2:
Assembly: pmov p4.b, z5[0]
Disassembly: pmov p4.b, z5
Assembly: pmov p4.b, z5
Disassembly: pmov p4.b, z5
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions?lang=en
This patch fixes encoding and syntax for sve2p1 instructions ld[1-4]q/st[1-4]q
as mentioned below, for the issues reported here.
https://sourceware.org/pipermail/binutils/2024-February/132408.html
1) Previously all the ld[1-4]q/st[1-4]q instructions are wrongly added as
predicated instructions and this issue is fixed in this patch by replacing
"SVE2p1_INSNC" with "SVE2p1_INSN" macro.
2) Wrong first operand in all the ld[1-4]q/st[1-4]q instructions is fixed
by replacing "SVE_Zt" with "SVE_ZtxN".
3) Wrong operand qualifiers in ld1q and st1q instructions are also fixed in
this patch.
4) In ld1q/st1q the index in the second argument is optional and if index
is xzr and is skipped in the assembly, the index field is ignored by the
disassembler.
Fixing above mentioned issues helps with following:
1) ld1q and st1q first register operand accepts enclosed figure braces.
2) ld2q, ld3q, ld4q, st2q, st3q, and st4q instructions accepts wrapping
sequence of vector registers.
For the instructions ld[2-4]q/st[2-4]q, tests for wrapping sequence of vector
registers are added along with short-form of operands for non-wrapping sequence.
I have added test using following logic:
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0, #0, MUL VL] //raw insn encoding (all zeroes)
ld2q {Z31.Q, Z0.Q}, p0/Z, [x0, #0, MUL VL] // encoding of <Zt1>
ld2q {Z0.Q, Z1.Q}, p7/Z, [x0, #0, MUL VL] // encoding of <Pg>
ld2q {Z0.Q, Z1.Q}, p0/Z, [x30, #0, MUL VL] // encoding of <Xm>
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0, #-16, MUL VL] // encoding of <imm> (low value)
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0, #14, MUL VL] // encoding of <imm> (high value)
ld2q {Z31.Q, Z0.Q}, p7/Z, [x30, #-16, MUL VL] // encoding of all fields (all ones)
ld2q {Z30.Q, Z31.Q}, p1/Z, [x3, #-2, MUL VL] // random encoding.
For all the above form of instructions the hyphenated form is preferred for
disassembly if there are more than two registers in the list, and the register
numbers are monotonically increasing in increments of one.
This patch fixes the syntax of sve2p1 "extq" instruction by modifying the operands
count to 4. A new operand AARCH64_OPND_SVE_UIMM4 is defined to handle the 4th
argument an 4-bit unsigned immediate of extq instruction. The instruction encoding
is updated to use constraint C_SCAN_MOVPRFX, to enable "extq" instruction to immediately
precede in program order by a MOVPRFX instruction. Also removed the unused operand
AARCH64_OPND_SVE_Zm_imm4.
This issues was reported here:
https://sourceware.org/pipermail/binutils/2024-February/132408.html
This patch fixes the syntax of sve2p1 "dupq" instruction by modifying the way
2nd operand does the encoding and decoding using the [<imm>] value.
dupq makes use of already existing aarch64_ins_sve_index and aarch64_ext_sve_index
inserter and extractor functions. The definitions of aarch64_ins_sve_index_imm (inserter)
and aarch64_ext_sve_index_imm (extractor) is removed in this patch.
This issues was reported here:
https://sourceware.org/pipermail/binutils/2024-February/132408.html
This includes:
- FEAT_SME_F8F32 (+sme-f8f32)
- FEAT_SME_F8F16 (+sme-f8f16)
The FP16 addition/subtraction instructions originally added by
FEAT_SME_F16F16 haven't been added to Binutils yet. They are also
required to be enabled if FEAT_SME_F8F16 is present, so they are
included in this patch.
This includes all the instructions under the following features:
- FEAT_FP8FMA (+fp8fma)
- FEAT_FP8DOT4 (+fp8dot4)
- FEAT_FP8DOT2 (+fp8dot2)
- FEAT_SSVE_FP8FMA (+ssve-fp8fma)
- FEAT_SSVE_FP8DOT4 (+ssve-fp8dot4)
- FEAT_SSVE_FP8DOT2 (+ssve-fp8dot2)
Introduces instructions for the SME2 lutv2 extension for AArch64. They
are documented in the following document:
* ARM DDI0602
For both luti4 instructions, we introduced an operand called
SME_Znx2_BIT_INDEX. We use the existing function parse_vector_reg_list
for parsing but modified that function so that it can accept operands
without qualifiers and rejects instructions that have operands with
qualifiers but are not supposed to have operands with qualifiers.
For disassembly, we modified print_register_list so that it could
accept register lists without qualifiers.
For one luti4 instruction, we introduced a SME_Zdnx4_STRIDED. It is
similar to SME_Ztx4_STRIDED and we could use existing code for parsing,
encoding, and disassembly.
For movt instruction, we introduced an operand called SME_ZT0_INDEX2_12.
This is a ZT0 register with a bit index encoded in [13:12]. It is
similar to SME_ZT0_INDEX.
We also introduced an iclass named sme_size_12_b so that we can encode
size bits [13:12] correctly when only 'b' is allowed as qualifier.
Introduces instructions for the SVE2 lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en
These instructions use new SVE2 vector operands. They are called
SVE_Zm1_23_INDEX, SVE_Zm2_22_INDEX, and Zm3_12_INDEX and they have
1 bit, 2 bit, and 3 bit indices respectively.
The lsb and width of these new operands are the same as many existing
operands but the convention is to give different names to fields that
serve different purpose so we introduced new fields in aarch64-opc.c
and aarch64-opc.h.
We made a design choice for the second operand of the halfword variant of
luti4 with two register tables. We could have either defined a new operand,
like SVE_Znx2, or we could have use the existing operand SVE_ZnxN. With
the new operand, we would need to implement constraints on register
lists based on either operand or opcode flag. With existing operand, we
could just existing constraint checks using opcode flag. We chose
the second approach and went with SVE_ZnxN and added opcode flag to
enforce lengths of vector register list operands. This way, we can reuse
the existing constraint check logic.
Introduces instructions for the Advanced SIMD lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en
These instructions needed definition of some new operands. We will first
discuss operands for the third operand of the instructions and then
discuss a vector register list operand needed for the second operand.
The third operands are vectors with bit indices and without type
qualifiers. They are called Em_INDEX1_14, Em_INDEX2_13, and Em_INDEX3_12
and they have 1 bit, 2 bit, and 3 bit indices respectively. For these
new operands, we defined new parsing case branch. The lsb and width of
these operands are the same as many existing but the convention is to
give different names to fields that serve different purpose so we
introduced new fields in aarch64-opc.c and aarch64-opc.h for these new
operands.
For the second operand of these instructions, we introduced a new
operand called LVn_LUT. This represents a vector register list with
stride 1. We defined new inserter and extractor for this new operand and
it is encoded in FLD_Rn. We are enforcing the number of registers in the
reglist using opcode flag rather than operand flag as this is what other
SIMD vector register list operands are doing. The disassembly also uses
opcode flag to print the correct number of registers.
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:
1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
author I haven't committed, 'Kalray SA.', to cover gas testsuite
files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
This patch add support for FEAT_ECBHB "Exploitative control using
branch history information" adding the "clrbhb" instruction. AFAIU
the same alias was originally added as "clearbhb" before the
architecture was finalized (Mandatory v8.9-a/v9.4-a; Optional
v8.0-a+/v9.0-a+).
This patch add supports for FEAT_SPECRES2 "Enhanced speculation
restriction instructions" adding the "cosp" instruction.
This is mandatory v8.9-a/v9.4-a and optional v8.0-a+/v9.0-a+. It is
enabled by the +predres2 march flag.
This patch adds support for Guarded control stack data synchronization
instruction (GCSB DSYNC). This instruction is allocated to existing
HINT space and uses the HINT number 19 and to match this an entry is
added to the aarch64_hint_options array.
This patch adds for Guarded Control Stack Extension (GCS) extension. GCS feature is
optional from Armv9.4-A architecture and enabled by passing +gcs option to -march
(eg: -march=armv9.4-a+gcs) or using ".arch_extension gcs" directive in the assembly file.
Also this patch adds support for GCS instructions gcspushx, gcspopcx, gcspopx,
gcsss1, gcsss2, gcspushm, gcspopm, gcsstr and gcssttr.
This patch adds the RPRFM (range prefetch) instruction.
It was introduced as part of SME2, but it belongs to the
prefetch hint space and so doesn't require any specific
ISA flags.
The aarch64_rprfmop_array initialiser (deliberately) only
fills in the leading non-null elements.
This patch adds the SVE FDOT, SDOT and UDOT instructions,
which are available when FEAT_SME2 is implemented. The patch
also reorders the existing SVE_Zm3_22_INDEX to keep the
operands numerically sorted.
There are two instruction formats here:
- SQRSHR, SQRSHRU and UQRSHR, which operate on lists of two
or four registers.
- SQRSHRN, SQRSHRUN and UQRSHRN, which operate on lists of
four registers.
These are the first SME2 instructions to have immediate operands.
The patch makes sure that, when parsing SME2 instructions with
immediate operands, the new predicate-as-counter registers are
parsed as registers rather than as #-less immediates.
SMLALL, SMLSLL, UMLALL and UMLSLL have the same format.
USMLALL and SUMLALL allow the same operand types as those
instructions, except that SUMLALL does not have the multi-vector
x multi-vector forms (which would be redundant with USMLALL).
The {BF,F,S,U}MLAL and {BF,F,S,U}MLSL instructions share the same
encoding. They are the first instance of a ZA (as opposed to ZA tile)
operand having a range of offsets. As with ZA tiles, the expected
range size is encoded in the operand-specific data field.
Add support for the SME2 ADD. SUB, FADD and FSUB instructions.
SUB and FSUB have the same form as ADD and FADD, except that
ADD also has a 2-operand accumulating form.
The 64-bit ADD/SUB instructions require FEAT_SME_I16I64 and the
64-bit FADD/FSUB instructions require FEAT_SME_F64F64.
These are the first instructions to have tied register list
operands, as opposed to tied single registers.
The parse_operands change prevents unsuffixed Z registers (width==-1)
from being treated as though they had an Advanced SIMD-style suffix
(.4s etc.). It means that:
Error: expected element type rather than vector type at operand 2 -- `add za\.s\[w8,0\],{z0-z1}'
becomes:
Error: missing type suffix at operand 2 -- `add za\.s\[w8,0\],{z0-z1}'
SME2 adds lookup table instructions for quantisation. They use
a new lookup table register called ZT0.
LUTI2 takes an unsuffixed SVE vector index of the form Zn[<imm>],
which is the first time that this syntax has been used.
Implementation-wise, the main things to note here are:
- the WHILE* instructions have forms that return a pair of predicate
registers. This is the first time that we've had lists of predicate
registers, and they wrap around after register 15 rather than after
register 31.
- the predicate-as-counter WHILE* instructions have a fourth operand
that specifies the vector length. We can treat this as an enumeration,
except that immediate values aren't allowed.
- PEXT takes an unsuffixed predicate index of the form PN<n>[<imm>].
This is the first instance of a vector/predicate index having
no suffix.
SME2 adds LD1 and ST1 variants for lists of 2 and 4 registers.
The registers can be consecutive or strided. In the strided case,
2-register lists have a stride of 8, starting at register x0xxx.
4-register lists have a stride of 4, starting at register x00xx.
The instructions are predicated on a predicate-as-counter register in
the range pn8-pn15. Although we already had register fields with upper
bounds of 7 and 15, this is the first plain register operand to have a
nonzero lower bound. The patch uses the operand-specific data field
to record the minimum value, rather than having separate inserters
and extractors for each lower bound. This in turn required adding
an extra bit to the field.
SME2 defines new MOVA instructions for moving multiple registers
to and from ZA. As with SME, the instructions are also available
through MOV aliases.
One notable feature of these instructions (and many other SME2
instructions) is that some register lists must start at a multiple
of the list's size. The patch uses the general error "start register
out of range" when this constraint isn't met, rather than an error
specifically about multiples. This ensures that the error is
consistent between these simple consecutive lists and later
strided lists, for which the requirements aren't a simple multiple.
SME2 adds a new format for the existing SVE predicate registers:
predicates as counters rather than predicates as masks. In assembly
code, operands that interpret predicates as counters are written
pn<N> rather than p<N>.
This patch adds support for these registers and extends some
existing instructions to support them. Since the new forms
are just a programmer convenience, there's no need to make them
more restrictive than the earlier predicate-as-mask forms.
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
This patch adds support for FEAT_MOPS, an Armv8.8-A extension
that provides memcpy and memset acceleration instructions.
I took the perhaps controversial decision to generate the individual
instruction forms using macros rather than list them out individually.
This becomes useful with a follow-on patch to check that code follows
the correct P/M/E sequence.
[https://developer.arm.com/documentation/ddi0596/2021-09/Base-Instructions?lang=en]
include/
* opcode/aarch64.h (AARCH64_FEATURE_MOPS): New macro.
(AARCH64_ARCH_V8_8): Make armv8.8-a imply AARCH64_FEATURE_MOPS.
(AARCH64_OPND_MOPS_ADDR_Rd): New aarch64_opnd.
(AARCH64_OPND_MOPS_ADDR_Rs): Likewise.
(AARCH64_OPND_MOPS_WB_Rn): Likewise.
opcodes/
* aarch64-asm.h (ins_x0_to_x30): New inserter.
* aarch64-asm.c (aarch64_ins_x0_to_x30): New function.
* aarch64-dis.h (ext_x0_to_x30): New extractor.
* aarch64-dis.c (aarch64_ext_x0_to_x30): New function.
* aarch64-tbl.h (aarch64_feature_mops): New feature set.
(aarch64_feature_mops_memtag): Likewise.
(MOPS, MOPS_MEMTAG, MOPS_INSN, MOPS_MEMTAG_INSN)
(MOPS_CPY_OP1_OP2_PME_INSN, MOPS_CPY_OP1_OP2_INSN, MOPS_CPY_OP1_INSN)
(MOPS_CPY_INSN, MOPS_SET_OP1_OP2_PME_INSN, MOPS_SET_OP1_OP2_INSN)
(MOPS_SET_INSN): New macros.
(aarch64_opcode_table): Add MOPS instructions.
(aarch64_opcode_table): Add entries for AARCH64_OPND_MOPS_ADDR_Rd,
AARCH64_OPND_MOPS_ADDR_Rs and AARCH64_OPND_MOPS_WB_Rn.
* aarch64-opc.c (aarch64_print_operand): Handle
AARCH64_OPND_MOPS_ADDR_Rd, AARCH64_OPND_MOPS_ADDR_Rs and
AARCH64_OPND_MOPS_WB_Rn.
(verify_three_different_regs): New function.
* aarch64-asm-2.c: Regenerate.
* aarch64-dis-2.c: Likewise.
* aarch64-opc-2.c: Likewise.
gas/
* doc/c-aarch64.texi: Document +mops.
* config/tc-aarch64.c (parse_x0_to_x30): New function.
(parse_operands): Handle AARCH64_OPND_MOPS_ADDR_Rd,
AARCH64_OPND_MOPS_ADDR_Rs and AARCH64_OPND_MOPS_WB_Rn.
(aarch64_features): Add "mops".
* testsuite/gas/aarch64/mops.s, testsuite/gas/aarch64/mops.d: New test.
* testsuite/gas/aarch64/mops_invalid.s,
* testsuite/gas/aarch64/mops_invalid.d,
* testsuite/gas/aarch64/mops_invalid.l: Likewise.
This patch is adding new SVE2 instructions added to support SME extension.
The following SVE2 instructions are added by the SME architecture:
* PSEL,
* REVD, SCLAMP and UCLAMP.
gas/ChangeLog:
* config/tc-aarch64.c (parse_sme_pred_reg_with_index):
New parser.
(parse_operands): New parser.
* testsuite/gas/aarch64/sme-9-illegal.d: New test.
* testsuite/gas/aarch64/sme-9-illegal.l: New test.
* testsuite/gas/aarch64/sme-9-illegal.s: New test.
* testsuite/gas/aarch64/sme-9.d: New test.
* testsuite/gas/aarch64/sme-9.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New operand
AARCH64_OPND_SME_PnT_Wm_imm.
opcodes/ChangeLog:
* aarch64-asm.c (aarch64_ins_sme_pred_reg_with_index):
New inserter.
* aarch64-dis.c (aarch64_ext_sme_pred_reg_with_index):
New extractor.
* aarch64-opc.c (aarch64_print_operand): Printout of
OPND_SME_PnT_Wm_imm.
* aarch64-opc.h (enum aarch64_field_kind): New bitfields
FLD_SME_Rm, FLD_SME_i1, FLD_SME_tszh, FLD_SME_tszl.
* aarch64-tbl.h (OP_SVE_NN_BHSD): New qualifier.
(OP_SVE_QMQ): New qualifier.
(struct aarch64_opcode): New instructions PSEL, REVD,
SCLAMP and UCLAMP.
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
This patch is adding new SME mode selection and state access instructions:
* Add SMSTART and SMSTOP instructions.
* Add SVCR system register.
gas/ChangeLog:
* config/tc-aarch64.c (parse_sme_sm_za): New parser.
(parse_operands): New parser.
* testsuite/gas/aarch64/sme-8-illegal.d: New test.
* testsuite/gas/aarch64/sme-8-illegal.l: New test.
* testsuite/gas/aarch64/sme-8-illegal.s: New test.
* testsuite/gas/aarch64/sme-8.d: New test.
* testsuite/gas/aarch64/sme-8.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New operand
AARCH64_OPND_SME_SM_ZA.
(enum aarch64_insn_class): New instruction classes
sme_start and sme_stop.
opcodes/ChangeLog:
* aarch64-asm.c (aarch64_ins_pstatefield): New inserter.
(aarch64_ins_sme_sm_za): New inserter.
* aarch64-dis.c (aarch64_ext_imm): New extractor.
(aarch64_ext_pstatefield): New extractor.
(aarch64_ext_sme_sm_za): New extractor.
* aarch64-opc.c (operand_general_constraint_met_p):
New pstatefield value for SME instructions.
(aarch64_print_operand): Printout for OPND_SME_SM_ZA.
(SR_SME): New register SVCR.
* aarch64-opc.h (F_REG_IN_CRM): New register endcoding.
* aarch64-opc.h (F_IMM_IN_CRM): New immediate endcoding.
(PSTATE_ENCODE_CRM): Encode CRm field.
(PSTATE_DECODE_CRM): Decode CRm field.
(PSTATE_ENCODE_CRM_IMM): Encode CRm immediate field.
(PSTATE_DECODE_CRM_IMM): Decode CRm immediate field.
(PSTATE_ENCODE_CRM_AND_IMM): Encode CRm and immediate
field.
* aarch64-tbl.h (struct aarch64_opcode): New SMSTART
and SMSTOP instructions.
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
This patch is adding new loads and stores defined by SME instructions.
gas/ChangeLog:
* config/tc-aarch64.c (parse_sme_address): New parser.
(parse_sme_za_hv_tiles_operand_with_braces): New parser.
(parse_sme_za_array): New parser.
(output_operand_error_record): Print error details if
present.
(parse_operands): Support new operands.
* testsuite/gas/aarch64/sme-5-illegal.d: New test.
* testsuite/gas/aarch64/sme-5-illegal.l: New test.
* testsuite/gas/aarch64/sme-5-illegal.s: New test.
* testsuite/gas/aarch64/sme-5.d: New test.
* testsuite/gas/aarch64/sme-5.s: New test.
* testsuite/gas/aarch64/sme-6-illegal.d: New test.
* testsuite/gas/aarch64/sme-6-illegal.l: New test.
* testsuite/gas/aarch64/sme-6-illegal.s: New test.
* testsuite/gas/aarch64/sme-6.d: New test.
* testsuite/gas/aarch64/sme-6.s: New test.
* testsuite/gas/aarch64/sme-7-illegal.d: New test.
* testsuite/gas/aarch64/sme-7-illegal.l: New test.
* testsuite/gas/aarch64/sme-7-illegal.s: New test.
* testsuite/gas/aarch64/sme-7.d: New test.
* testsuite/gas/aarch64/sme-7.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New operands.
(enum aarch64_insn_class): Added sme_ldr and sme_str.
(AARCH64_OPDE_UNTIED_IMMS): New operand error kind.
opcodes/ChangeLog:
* aarch64-asm.c (aarch64_ins_sme_za_hv_tiles): New inserter.
(aarch64_ins_sme_za_list): New inserter.
(aarch64_ins_sme_za_array): New inserter.
(aarch64_ins_sme_addr_ri_u4xvl): New inserter.
* aarch64-asm.h (AARCH64_DECL_OPD_INSERTER): Added
ins_sme_za_list, ins_sme_za_array and ins_sme_addr_ri_u4xvl.
* aarch64-dis.c (aarch64_ext_sme_za_hv_tiles): New extractor.
(aarch64_ext_sme_za_list): New extractor.
(aarch64_ext_sme_za_array): New extractor.
(aarch64_ext_sme_addr_ri_u4xvl): New extractor.
* aarch64-dis.h (AARCH64_DECL_OPD_EXTRACTOR): Added
ext_sme_za_list, ext_sme_za_array and ext_sme_addr_ri_u4xvl.
* aarch64-opc.c (operand_general_constraint_met_p):
(aarch64_match_operands_constraint): Handle sme_ldr, sme_str
and sme_misc.
(aarch64_print_operand): New operands supported.
* aarch64-tbl.h (OP_SVE_QUU): New qualifier.
(OP_SVE_QZU): New qualifier.
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
This patch is adding ZERO (a list of 64-bit element ZA tiles)
instruction.
gas/ChangeLog:
* config/tc-aarch64.c (parse_sme_list_of_64bit_tiles):
New parser.
(parse_operands): Handle OPND_SME_list_of_64bit_tiles.
* testsuite/gas/aarch64/sme-4-illegal.d: New test.
* testsuite/gas/aarch64/sme-4-illegal.l: New test.
* testsuite/gas/aarch64/sme-4-illegal.s: New test.
* testsuite/gas/aarch64/sme-4.d: New test.
* testsuite/gas/aarch64/sme-4.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New operand
AARCH64_OPND_SME_list_of_64bit_tiles.
opcodes/ChangeLog:
* aarch64-opc.c (print_sme_za_list): New printing function.
(aarch64_print_operand): Handle OPND_SME_list_of_64bit_tiles.
* aarch64-opc.h (enum aarch64_field_kind): New bitfield
FLD_SME_zero_mask.
* aarch64-tbl.h (struct aarch64_opcode): New ZERO instruction.
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
This patch is adding new MOV (alias) and MOVA SME instruction.
gas/ChangeLog:
* config/tc-aarch64.c (enum sme_hv_slice): new enum.
(struct reloc_entry): Added ZAH and ZAV registers.
(parse_sme_immediate): Immediate parser.
(parse_sme_za_hv_tiles_operand): ZA tile parser.
(parse_sme_za_hv_tiles_operand_index): Index parser.
(parse_operands): Added ZA tile parser calls.
(REGNUMS): New macro. Regs with suffix.
(REGSET16S): New macro. 16 regs with suffix.
* testsuite/gas/aarch64/sme-2-illegal.d: New test.
* testsuite/gas/aarch64/sme-2-illegal.l: New test.
* testsuite/gas/aarch64/sme-2-illegal.s: New test.
* testsuite/gas/aarch64/sme-2.d: New test.
* testsuite/gas/aarch64/sme-2.s: New test.
* testsuite/gas/aarch64/sme-2a.d: New test.
* testsuite/gas/aarch64/sme-2a.s: New test.
* testsuite/gas/aarch64/sme-3-illegal.d: New test.
* testsuite/gas/aarch64/sme-3-illegal.l: New test.
* testsuite/gas/aarch64/sme-3-illegal.s: New test.
* testsuite/gas/aarch64/sme-3.d: New test.
* testsuite/gas/aarch64/sme-3.s: New test.
* testsuite/gas/aarch64/sme-3a.d: New test.
* testsuite/gas/aarch64/sme-3a.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New enums
AARCH64_OPND_SME_ZA_HV_idx_src and
AARCH64_OPND_SME_ZA_HV_idx_dest.
(struct aarch64_opnd_info): New ZA tile vector struct.
opcodes/ChangeLog:
* aarch64-asm.c (aarch64_ins_sme_za_hv_tiles):
New inserter.
* aarch64-asm.h (AARCH64_DECL_OPD_INSERTER):
New inserter ins_sme_za_hv_tiles.
* aarch64-dis.c (aarch64_ext_sme_za_hv_tiles):
New extractor.
* aarch64-dis.h (AARCH64_DECL_OPD_EXTRACTOR):
New extractor ext_sme_za_hv_tiles.
* aarch64-opc.c (aarch64_print_operand):
Handle SME_ZA_HV_idx_src and SME_ZA_HV_idx_dest.
* aarch64-opc.h (enum aarch64_field_kind): New enums
FLD_SME_size_10, FLD_SME_Q, FLD_SME_V and FLD_SME_Rv.
(struct aarch64_operand): Increase fields size to 5.
* aarch64-tbl.h (OP_SME_BHSDQ_PM_BHSDQ): New qualifiers
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
Patch is adding new SME matrix instructions. Please note additional
instructions will be added in following patches.
gas/ChangeLog:
* config/tc-aarch64.c (parse_sme_zada_operand):
New parser.
* config/tc-aarch64.c (parse_reg_with_qual):
New reg parser.
* config/tc-aarch64.c (R_ZA): New egister type.
(parse_operands): New parser.
* testsuite/gas/aarch64/sme-illegal.d: New test.
* testsuite/gas/aarch64/sme-illegal.l: New test.
* testsuite/gas/aarch64/sme-illegal.s: New test.
* testsuite/gas/aarch64/sme.d: New test.
* testsuite/gas/aarch64/sme.s: New test.
* testsuite/gas/aarch64/sme-f64.d: New test.
* testsuite/gas/aarch64/sme-f64.s: New test.
* testsuite/gas/aarch64/sme-i64.d: New test.
* testsuite/gas/aarch64/sme-i64.s: New test.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd): New operands
AARCH64_OPND_SME_ZAda_2b, AARCH64_OPND_SME_ZAda_3b and
AARCH64_OPND_SME_Pm.
(enum aarch64_insn_class): New instruction class sme_misc.
opcodes/ChangeLog:
* aarch64-opc.c (aarch64_print_operand):
Print OPND_SME_ZAda_2b and OPND_SME_ZAda_3b operands.
(verify_constraints): Handle OPND_SME_Pm.
* aarch64-opc.h (enum aarch64_field_kind):
New bit fields FLD_SME_ZAda_2b, FLD_SME_ZAda_3b and FLD_SME_Pm.
* aarch64-tbl.h (OP_SME_ZADA_PN_PM_ZN_S): New qualifier set.
(OP_SME_ZADA_PN_PM_ZN_D): New qualifier.
(OP_SME_ZADA_PN_PM_ZN_ZM): New qualifier.
(OP_SME_ZADA_S_PM_PM_S_S): New qualifier.
(OP_SME_ZADA_D_PM_PM_D_D): New qualifier.
(OP_SME_ZADA_S_PM_PM_H_H): New qualifier.
(OP_SME_ZADA_S_PM_PM_B_B): New qualifier.
(OP_SME_ZADA_D_PM_PM_H_H): New qualifier.
(SME_INSN): New instruction macro.
(SME_F64_INSN): New instruction macro.
(SME_I64_INSN): New instruction macro.
(SME_INSNC): New instruction macro.
(struct aarch64_opcode): New SME instructions.
aarch64-asm-2.c: Regenerate.
aarch64-dis-2.c: Regenerate.
aarch64-opc-2.c: Regenerate.
Atomic 64-byte load/store instructions limit Rt register number to
values matching below condition (register <Xt> number must be even
and <= 22):
if Rt<4:3> == '11' || Rt<0> == '1' then UNDEFINED;
This patch adds check if Rt fulfills above requirement.
For more details regarding atomic 64-byte load/store instruction for
Armv8.7 please refer to Arm A64 Instruction set documentation for
Armv8-A architecture profile, see document page 157 for load
instruction, and pages 414-418 for store instructions of [0].
[0]: https://developer.arm.com/docs/ddi0596/i