Commit Graph

559 Commits

Author SHA1 Message Date
Jan Beulich
1d026d6b19 x86/APX: further extend SSE2AVX coverage
Since {vex}/{vex3} are respected on legacy mnemonics when -msse2avx is
in use, {evex} should be respected, too. So far this is the case only
for insns where eGPR-s can come into play. Extend coverage to insns with
only %xmm register and possibly immediate operands.
2024-05-03 09:27:00 +02:00
Jan Beulich
24187fb9c0 x86/APX: extend SSE2AVX coverage
Legacy encoded SIMD insns are converted to AVX ones in that mode. When
eGPR-s are in use, i.e. with APX, convert to AVX10 insns (where
available; there are quite a few which can't be converted).

Note that LDDQU is represented as VMOVDQU32 (and the prior use of the
sse3 template there needs dropping, to get the order right).

Note further that in a few cases, due to the use of templates, AVX512VL
is used when AVX512F would suffice. Since AVX10 is the main reference,
this shouldn't be too much of a problem.
2024-05-03 09:26:25 +02:00
Cui, Lili
dd74a60337 Support APX NF
For the case when NDD and NF are both 0 in evex-promoted format,
we will fully support and test it in another patch.

gas/ChangeLog:

       * NEWS: Support Intel APX NF.
       * config/tc-i386.c (enum i386_error): Add unsupported_nf.
       (struct _i386_insn): Add has_nf.
       (is_apx_evex_encoding): Ditto.
       (build_apx_evex_prefix): Encode the NF bit.
       (md_assemble): Handle unsupported_nf.
       (parse_insn): Handle Prefix_NF and report bad for illegal combination.
       (can_convert_NDD_to_legacy): Replace i.tm.opcode_modifier.nf with i.has_nf.
       (match_template): Support D for APX_F insns and check NF support.
       * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Add bad test for NF bit.
       * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.
       * testsuite/gas/i386/x86-64-apx-inval.l: Ditto.
       * testsuite/gas/i386/x86-64-apx-inval.s: Ditto.
       * testsuite/gas/i386/x86-64.exp: Add apx nf tests.
       * testsuite/gas/i386/x86-64-apx-nf-intel.d: New test.
       * testsuite/gas/i386/x86-64-apx-nf.d: Ditto.
       * testsuite/gas/i386/x86-64-apx-nf.s: Ditto.

opcodes/ChangeLog:

       * i386-dis-evex.h: Add %NF to the instructions that support APX NF and
       add new instructions imul, popcnt, tzcnt and lzcnt to EVEX table.
       * i386-dis-evex-reg.h: Ditto.
       * i386-dis.c (struct instr_info): Add nf.
       (struct dis386): Add "NF" for EVEX.NF.
       (get_valid_dis386): Set ins->vex.nf and report bad-nf for illegal case.
       (print_insn): Handle ins.vex.nf.
       (putop): Handle "%NF".
       * i386-opc.h (Prefix_NF): New.
       * i386-opc.tbl: Added new entries to support full APX NF instructions.
       * i386-mnem.h: Regenerated.
       * i386-tbl.h: Regenerated.
2024-04-07 17:28:25 +08:00
H.J. Lu
cca46dea4d Revert "x86: Restore APX shift-double instructions with omitted shift count"
This reverts commit c2d698fe03.

GCC 14 has been changed to use explicit shift count in shift-double
instructions by the commit:

06a7e7514af x86: Use explicit shift count in double-precision shifts

gas/

	PR gas/31606
	* testsuite/gas/i386/x86-64-apx-ndd-wig.d: Updated.
	* testsuite/gas/i386/x86-64-apx-ndd.d: Likewise.
	* testsuite/gas/i386/x86-64-apx-ndd.s: Remove tests for APX
	shift-double instructions with omitted shift count.

opcodes/

	PR gas/31606
	* i386-opc.tbl: Remove APX shift-double instructions with
	omitted shift count.
	* i386-tbl.h: Regenerated.
2024-04-06 05:07:18 -07:00
H.J. Lu
c2d698fe03 x86: Restore APX shift-double instructions with omitted shift count
Restore APX shift-double instructions with omitted shift count since
they are generated by GCC as shown in:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114590

gas/

	PR gas/31606
	* testsuite/gas/i386/x86-64-apx-ndd-wig.d: Updated.
	* testsuite/gas/i386/x86-64-apx-ndd.d: Likewise.
	* testsuite/gas/i386/x86-64-apx-ndd.s: Add tests for APX
	shift-double instructions with omitted shift count.

opcodes/

	PR gas/31606
	* i386-opc.tbl: Restore APX shift-double instructions with
	omitted shift count.
	* i386-tbl.h: Regenerated.
2024-04-04 13:16:20 -07:00
Jan Beulich
ef9a6314d8 x86: add missing No_qSuf to non-64-bit PTWRITE
While largely benign, it still should have been put there when the
original single template was split (commit a04973848d).
2024-04-03 10:41:30 +02:00
Jan Beulich
0006623c18 x86: drop stray Size64 from WRSSQ
Like for WRUSSQ it's not needed here. The legacy insn had gained it in
the course of zapping Rex64, but that attribute wasn't needed here
either. The APX insn then simply gained it by copy-and-paste, I suppose.
2024-04-03 10:40:57 +02:00
Cui, Lili
8963a60d7b x86/APX: Remove KEYLOCKER and SHA promotions from EVEX MAP4
APX spec removed KEYLOCKER and SHA promotions from EVEX MAP4.
https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html

gas/ChangeLog:

        * NEWS: Mention the removal of KEYLOCKER and SHA promotions from
        EVEX MAP4.
        * config/tc-i386.c (process_operands): Removed special handling of
        KEYLOCKER and SHA.
        * testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: Removed KEYLOCKER
        and SHA instructions.
        * testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted.s: Ditto.

opcodes/ChangeLog:

        * i386-dis-evex-prefix.h: Removed KEYLOCKER and SHA instructions.
        * i386-dis-evex.h: Ditto.
        * i386-opc.tbl: Ditto.
        * i386-dis.c (print_vector_reg): Removed special handling of KEYLOCKER
        and SHA.
2024-04-03 09:50:00 +08:00
Jan Beulich
ffa2571063 x86: templatize shift-double insns
With the multitude of new APX templates, it finally becomes desirable to
further remove redundancy by also templatizing basic arithmetic insns.
Continue with the shift-double ones.

While there also drop the APX form with ShiftCount omitted. Other shift
and rotate insns were deliberately left without this form as well. Note
that there's also no testsuite adjustment needed for this, indicating
that the form wasn't tested either.
2024-03-28 11:49:48 +01:00
Jan Beulich
fe17c02650 x86: templatize shift/rotate insns
With the multitude of new APX templates, it finally becomes desirable to
further remove redundancy by also templatizing basic arithmetic insns.
Continue with the "ordinary" shift and rotate ones.

While there also drop the APX form of RCL/RCR with Imm1 omitted. Other
shift insns as well as ROR/ROL were deliberately left without this form
as well. Note that there's also no testsuite adjustment needed for this,
indicating that the form wasn't tested either.

Furthermore since RCL/RCR already had non-NDD APX forms, those end up
being added for the other 6 mnemonics, too.
2024-03-28 11:49:24 +01:00
Jan Beulich
42eb20eb35 x86: templatize binary ALU insns
With the multitude of new APX templates, it finally becomes desirable to
further remove redundancy by also templatizing basic arithmetic insns.
Continue with the more complex binary (two source) cases.

Note how this adds a missing CheckOperandSize to one of the APX sub
forms.

Furthermore since SBB already had a non-NDD APX form, one ends up
being added for the other 6 mnemonics, too.
2024-03-28 11:49:01 +01:00
Jan Beulich
568473a437 x86: templatize unary ALU insns
With the multitude of new APX templates, it finally becomes desirable to
further remove redundancy by also templatizing basic arithmetic insns.
Continue with a few simple unary (single source) cases.
2024-03-28 11:48:47 +01:00
Jan Beulich
cd9ca24dd2 x86: templatize INC/DEC
With the multitude of new APX templates, it finally becomes desirable to
further remove redundancy by also templatizing basic arithmetic insns.
Start with the simplest case, accompanied by a necessary adjustment to
i386-gen (such that template uses can also be at the start of a line).

While there also drop a bogus (meaningless / unreachable) "break" as
well as an unused variable (which I'm surprised compilers didn't warn
about).
2024-03-28 11:47:59 +01:00
Jan Beulich
c73a37b268 x86/APX: optimize certain XOR and SUB forms
While most logic in optimize_encoding() is already covering APX by way
of the earlier NDD->REX2 conversion, there's a remaining set of cases
which wants handling separately.
2024-03-01 09:21:40 +01:00
Jan Beulich
a40a04601f x86: also permit YMM/ZMM use in CFI directives
Next to code using %ymm<N> or %zmm<N> it is more natural to have .cfi_*
directives also reference those, not the corresponding %xmm<N>. Accept
their names as kind of aliases, i.e. resolving to the same numbers.

While extending the respective 64-bit testcase, also add %bnd<N> there
(should have happened right with 633789901c ["x86-64: Dwarf2 register
numbers for %bnd<N>"], sorry), requiring binutils/dwarf.c to be adjusted
accordingly as well.
2024-02-23 11:59:09 +01:00
Jan Beulich
2f630f60b5 x86/APX: INV{EPT,PCID,VPID} are WIG
While various other entries in version 003 of the spec aren't quite as
explicit (due to simply leaving the respective field blank), all three
have a clear IGNORED there. IOW they ought to be emitted with EVEX.W=0
by default (and respect -mevexwig=).
2024-02-23 11:58:15 +01:00
Jan Beulich
c8054e730d x86/APX: drop stray IgnoreSize
While necessary on the legacy encodings, the EVEX ones don't need it.
Even more so when they're available for 64-bit mode only, when the
legacy encodings have the attribute only for correctly handling things
in 16-bit mode.
2024-02-16 10:20:08 +01:00
Jan Beulich
9405f24b8e x86: don't use VexWIG in SSE2AVX templates
Several years ago it was decided that SSE2AVX templates should not be
sensitive to -mvexwig= (upon my suggestion to consistently make all
sensitive as long as they don't require a specific setting of VEX.W).
Adjust the four that still are, switching to use of Vex128 at the same
time.
2024-02-16 10:19:11 +01:00
Jan Beulich
ec3babb8c1 x86/APX: V{BROADCAST,EXTRACT,INSERT}{F,I}128 can also be expressed
Interestingly unlike VROUND{P,S}{S,D} and VPERM{F,I}128 they weren't
even present in the x86-64-apx-egpr-inval testcase, hence why I
overlooked that these can actually be encoded, (again) using suitable
AVX512 counterparts.

While there also "modernize" the adjacent AVX/AVX2 entries.
2024-02-09 08:39:20 +01:00
Jan Beulich
5a635f1f59 x86/APX: VROUND{P,S}{S,D} encodings require AVX512{F,VL}
In eea4357967 ("x86/APX: VROUND{P,S}{S,D} can generally be encoded") I
failed to add the AVX512* ISA dependency of the two new entries.
2024-02-09 08:38:52 +01:00
Jan Beulich
0ebcbb1bd0 x86/APX: optimize MOVBE
With identical source and destination it can be covered by the NDD-to-
legacy conversion logic as well, even if in this case the original insn
doesn't use an NDD encoding. The size savings are even better here, for
the replacement (BSWAP) not having a ModR/M byte.
2024-01-26 10:31:38 +01:00
Jan Beulich
633789901c x86-64: Dwarf2 register numbers for %bnd<N>
I don't see why we shouldn't record them when they have been allocated,
even if they're (bogusly) named as reserved in the ABI right now.
2024-01-19 10:19:15 +01:00
Jan Beulich
eea4357967 x86/APX: VROUND{P,S}{S,D} can generally be encoded
VRNDSCALE{P,S}{S,D} is the AVX512 generalization of these AVX insns. As
long as the immediate has the top 4 bits clear, they are equivalent to
the earlier VEX-encoded insns, and hence can be used to permit use of
eGPR-s in the memory operand. Since this is the normal way of using
these insns, also alter the resulting diagnostic to complain about the
immediate, not the eGPR use.
2024-01-19 10:18:32 +01:00
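The immediate condition described in commit eea4357967 can be sketched as below. This is only an illustration of the stated rule (top 4 bits of the 8-bit immediate must be clear); the function name `vround_imm_allows_vrndscale` is hypothetical and does not exist in gas.

```python
def vround_imm_allows_vrndscale(imm8):
    """Return True if a VROUND* immediate can be reused unchanged for the
    VRNDSCALE* replacement, i.e. its top 4 bits are clear.

    Hypothetical helper for illustration; not part of binutils.
    """
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("immediate must fit in 8 bits")
    return (imm8 & 0xF0) == 0
```

When the check fails, the commit message says the diagnostic should point at the immediate rather than at the eGPR use.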
Jan Beulich
5190fa3828 x86: support APX forms of U{RD,WR}MSR
This was missed in 6177c84d5e ("Support APX GPR32 with extend evex
prefix").
2024-01-19 10:16:00 +01:00
Indu Bhagat
448cf9e67d opcodes: x86: new marker for insns that implicitly update stack pointer
Some x86 instructions affect the stack pointer implicitly.  Add a new
operand constraint to reflect this.  This will be useful for SCFI
implementation to ensure its correctness.

Mark all push, pop, call, ret, enter, leave, INT, iret instructions.

opcodes/
	* i386-gen.c: Update opcode_modifiers.
	* i386-opc.h: Add a new constraint.
	* i386-opc.tbl: Update the affected instructions.
	* i386-tbl.h: Regenerated.
2024-01-15 03:31:35 -08:00
Indu Bhagat
3037cefe56 opcodes: gas: x86: define and use Rex2 as attribute not constraint
Rex2 is currently an operand constraint.  For the upcoming SCFI
implementation in GAS, we need to identify operations which implicitly
update the stack pointer.  An operand constraint enumerator for implicit
stack op seems more appropriate than an attribute.  However, two opcodes
currently necessitate both Rex2 and an implicit stack op marker; this
prompts revisiting the current representations a bit.

Make Rex2 a standalone attribute, so that later a new operand constraint
may be added for IMPLICIT_STACK_OP.

ChangeLog:
	* gas/config/tc-i386.c (is_apx_rex2_encoding): Update the check.
	* opcodes/i386-gen.c: Add a new BITFIELD for Rex2.
	* opcodes/i386-opc.h (REX2_REQUIRED): Remove.
	* opcodes/i386-opc.tbl: Remove Rex2 operand constraint.
	* opcodes/i386-tbl.h: Regenerated.
2024-01-15 03:31:35 -08:00
Alan Modra
fd67aa1129 Update year range in copyright notice of binutils files
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:

1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
   author I haven't committed, 'Kalray SA.', to cover gas testsuite
   files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
2024-01-04 22:58:12 +10:30
Cui, Lili
ac32c879b2 Support APX pushp/popp
gas/ChangeLog:

	* config/tc-i386.c (process_operands): Handle "PUSHP/POPP requires
	rex2.w == 1."
	* testsuite/gas/i386/x86-64.exp: Add new test for PUSHP/POPP.
	* testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d: New test.
	* testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-apx-pushp-popp.d: Ditto.
	* testsuite/gas/i386/x86-64-apx-pushp-popp.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (putop): print pushp and popp.
	* i386-opc.tbl: Added new insns.
	* i386-init.h : Regenerated.
	* i386-mnem.h : Regenerated.
	* i386-tbl.h: Regenerated.
2023-12-28 11:45:14 +00:00
Mo, Zewei
08a98d4c13 Support APX Push2/Pop2
PPX functionality for PUSH/POP is not implemented in this patch
and will be implemented separately.

gas/ChangeLog:

2023-12-28  Zewei Mo <zewei.mo@intel.com>
            H.J. Lu  <hongjiu.lu@intel.com>
            Lili Cui <lili.cui@intel.com>

	* config/tc-i386.c: (enum i386_error):
	New unsupported_rsp_register and invalid_src_register_set.
	(md_assemble): Add handler for unsupported_rsp_register and
	invalid_src_register_set.
	(check_APX_operands): Add invalid check for push2/pop2.
	(match_template): Handle check_APX_operands.
	* testsuite/gas/i386/i386.exp: Add apx-push2pop2 tests.
	* testsuite/gas/i386/x86-64.exp: Ditto.
	* testsuite/gas/i386/x86-64-apx-push2pop2.d: New test.
	* testsuite/gas/i386/x86-64-apx-push2pop2.s: Ditto.
	* testsuite/gas/i386/x86-64-apx-push2pop2-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-apx-push2pop2-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-apx-push2pop2-inval.s: Ditto.
	* testsuite/gas/i386/apx-push2pop2-inval.s: Ditto.
	* testsuite/gas/i386/apx-push2pop2-inval.d: Ditto.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Added bad
	testcases for POP2.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.

opcodes/ChangeLog:

	* i386-dis-evex-reg.h: Add REG_EVEX_MAP4_8F.
	* i386-dis-evex-w.h: Add EVEX_W_MAP4_8F_R_0 and EVEX_W_MAP4_FF_R_6
	* i386-dis-evex.h: Add REG_EVEX_MAP4_8F.
	* i386-dis.c (PUSH2_POP2_Fixup): Add special handling for PUSH2/POP2.
	(get_valid_dis386): Add handler for vector length and address_mode for
	APX-Push2/Pop2 insn.
	(nd): define nd as b for EVEX-promoted instructions.
	(OP_VEX): Add handler of 64-bit vvvv register for APX-Push2/Pop2 insn.
	* i386-gen.c: Add Push2Pop2 bitfield.
	* i386-opc.h: Regenerated.
	* i386-opc.tbl: Regenerated.
2023-12-28 11:41:45 +00:00
konglin1
3083f37643 Support APX NDD
opcodes/ChangeLog:

	* opcodes/i386-dis-evex-reg.h: Handle for REG_EVEX_MAP4_80,
	REG_EVEX_MAP4_81, REG_EVEX_MAP4_83,  REG_EVEX_MAP4_F6,
	REG_EVEX_MAP4_F7, REG_EVEX_MAP4_FE, REG_EVEX_MAP4_FF.
	* opcodes/i386-dis-evex.h: Add NDD insn.
	* opcodes/i386-dis.c (nd): New define.
	(VexGb): Ditto.
	(VexGv): Ditto.
	(get_valid_dis386): Change for NDD decode.
	(print_insn): Ditto.
	(putop): Ditto.
	(intel_operand_size): Ditto.
	(OP_E_memory): Ditto.
	(OP_VEX): Ditto.
	* opcodes/i386-opc.h (VexVVVV_DST): New.
	* opcodes/i386-opc.tbl: Add APX NDD instructions and adjust VexVVVV.
	* opcodes/i386-tbl.h: Regenerated.

gas/ChangeLog:

	* gas/config/tc-i386.c (operand_size_match):
	Support APX NDD that the number of operands is 3.
	(build_apx_evex_prefix): Change for ndd encode.
	(process_operands): Ditto.
	(build_modrm_byte): Ditto.
	(match_template): Support swap the first two operands for
	APX NDD.
	* testsuite/gas/i386/x86-64.exp: Add x86-64-apx-ndd.
	* testsuite/gas/i386/x86-64-apx-ndd.d: New test.
	* testsuite/gas/i386/x86-64-apx-ndd.s: Ditto.
	* testsuite/gas/i386/x86-64-pseudos.d: Add test.
	* testsuite/gas/i386/x86-64-pseudos.s: Ditto.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d : Ditto.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s : Ditto.
2023-12-28 11:37:16 +00:00
Cui, Lili
6177c84d5e Support APX GPR32 with extend evex prefix
This patch adds non-ND, non-NF forms of EVEX promotion insn.

EVEX extension of legacy instructions:
  All promoted legacy instructions are placed in EVEX map 4, which is
  currently reserved.
EVEX extension of EVEX instructions:
  All existing EVEX instructions are extended by APX using the extended
  EVEX prefix, so that they can access all 32 GPRs.
EVEX extension of VEX instructions:
  Promoting a VEX instruction into the EVEX space does not change the map
  id, the opcode, or the operand encoding of the VEX instruction.

Note: The promoted versions of MOVBE will be extended to include the “MOVBE
  reg1, reg2”.

  gas/ChangeLog:

  2023-12-28  Lingling Kong <lingling.kong@intel.com>
	      H.J. Lu  <hongjiu.lu@intel.com>
	      Lili Cui <lili.cui@intel.com>
	      Lin Hu   <lin1.hu@intel.com>

	* config/tc-i386.c (struct _i386_insn): Add has_egpr.
	(need_evex_encoding): Adjusted for apx.
	(cpu_flags_match): Ditto.
	(install_template): Handled APX combines.
	(is_apx_evex_encoding): Test apx evex encoding.
	(build_apx_evex_prefix): Enable APX evex prefix.
	(md_assemble): Handle apx with evex encoding.
	(process_suffix): Handle apx map4 prefix.
	(check_register): Assign i.vec_encoding for APX evex instructions.
	* testsuite/gas/i386/x86-64-evex.d: Adjust test cases.
	* testsuite/gas/i386/x86-64.exp: Adjust x86-64-inval-movbe.

opcodes/ChangeLog:

	* i386-dis-evex-len.h: Handle EVEX_LEN_0F38F2, EVEX_LEN_0F38F3.
	* i386-dis-evex-prefix.h: Handle PREFIX_EVEX_0F38F2_L_0,
	PREFIX_EVEX_0F38F3_L_0, PREFIX_EVEX_MAP4_D8,
	PREFIX_EVEX_MAP4_DA, PREFIX_EVEX_MAP4_DB,
	PREFIX_EVEX_MAP4_DC, PREFIX_EVEX_MAP4_DD,
	PREFIX_EVEX_MAP4_DE, PREFIX_EVEX_MAP4_DF,
	PREFIX_EVEX_MAP4_F0, PREFIX_EVEX_MAP4_F1,
	PREFIX_EVEX_MAP4_F2, PREFIX_EVEX_MAP4_F8.
	* i386-dis-evex-reg.h: Handle REG_EVEX_0F38F3_L_0_P_0.
	* i386-dis-evex.h: Add EVEX_MAP4_ for legacy insn
	promote to apx to use gpr32
	* opcodes/i386-dis-evex-x86-64.h: Handle Add X86_64_EVEX_0F90,
	X86_64_EVEX_0F92, X86_64_EVEX_0F93, X86_64_EVEX_0F38F2,
	X86_64_EVEX_0F38F3, X86_64_EVEX_0F38F5, X86_64_EVEX_0F38F6,
	X86_64_EVEX_0F38F7, X86_64_EVEX_0F3AF0, X86_64_EVEX_0F91.
	* i386-dis.c
	(struct instr_info): Deleted bool r.
	(PREFIX_NP_OR_DATA): New.
	(NO_PREFIX): New.
	(putop): Ditto.
	(X86_64_EVEX_FROM_VEX_TABLE): Ditto.
	(get_valid_dis386): Decode insn erex in extend evex prefix.
	Handle EVEX_MAP4
	(print_insn): Handle PREFIX_DATA_AND_NP_ONLY.
	(print_register): Handle apx instructions decode.
	(OP_E_memory): Ditto.
	(OP_G): Ditto.
	(OP_XMM): Ditto.
	(DistinctDest_Fixup): Ditto.
	* i386-gen.c (process_i386_opcode_modifier): Add EVEXMAP4.
	* i386-opc.h (SPACE_EVEXMAP4): Add legacy insn
	promote to evex.
	* i386-opc.tbl: Handle some legacy and vex insns don't
	support gpr32. And add some legacy insn (map2 / 3) promote
	to evex.
2023-12-28 11:31:01 +00:00
Cui, Lili
80d61d8d61 Support APX GPR32 with rex2 prefix
APX uses the REX2 prefix to support EGPR for map0 and map1 of legacy
instructions. We added the NoEgpr flag in i386-gen.c for instructions
that do not support EGPR.

gas/ChangeLog:

2023-12-28  Lingling Kong <lingling.kong@intel.com>
	    H.J. Lu  <hongjiu.lu@intel.com>
	    Lili Cui <lili.cui@intel.com>
	    Lin Hu   <lin1.hu@intel.com>

	* config/tc-i386.c
	(enum i386_error): Add unsupported_EGPR_for_addressing
	and invalid_pseudo_prefix.
	(struct _i386_insn): Add rex2 and rex2_encoding for
	gpr32.
	(cpu_arch): Add apx_f.
	(is_cpu): Ditto.
	(register_number): Handle RegRex2 for gpr32.
	(is_apx_rex2_encoding): New func. Test rex2 prefix encoding.
	(build_rex2_prefix): New func. Build legacy insn in
	opcode 0/1 use gpr32 with rex2 prefix.
	(establish_rex): Handle rex2 and rex2_encoding.
	(optimize_encoding): Handle add r16-r31 for registers.
	(md_assemble): Handle apx encoding.
	(parse_insn): Handle Prefix_REX2.
	(check_EgprOperands): New func. Check if Egprs operands
	are valid for the instruction
	(match_template):  Handle Egpr operands check.
	(set_rex_rex2):  New func. set i.rex and i.rex2.
	(build_modrm_byte): Ditto.
	(output_insn): Handle rex2 2-byte prefix output.
	(check_register): Handle check egpr illegal without
	target apx, 64-bit mode and with rex_prefix.
	* doc/c-i386.texi: Document .apx.
	* testsuite/gas/i386/ilp32/x86-64-opcode-inval-intel.d: D5 valid
	in 64-bit mode.
	* testsuite/gas/i386/ilp32/x86-64-opcode-inval.d: Ditto.
	* testsuite/gas/i386/rex-bad: Adjust rex testcase.
	* testsuite/gas/i386/x86-64-opcode-inval-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-opcode-inval.d: Ditto.
	* testsuite/gas/i386/x86-64-opcode-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-pseudos-bad.l: Add illegal rex2 test.
	* testsuite/gas/i386/x86-64-pseudos-bad.s: Ditto.
	* testsuite/gas/i386/x86-64-pseudos.d: Add rex2 test.
	* testsuite/gas/i386/x86-64-pseudos.s: Ditto.
	* testsuite/gas/i386/x86-64.exp: Run APX tests.
	* testsuite/gas/i386/x86-64-apx-egpr-inval.l: New test.
	* testsuite/gas/i386/x86-64-apx-egpr-inval.s: New test.
	* testsuite/gas/i386/x86-64-apx-rex2.d: New test.
	* testsuite/gas/i386/x86-64-apx-rex2.s: New test.

include/ChangeLog:

	* opcode/i386.h (REX2_OPCODE): New.
	(REX2_M): Ditto.

opcodes/ChangeLog:

	* i386-dis.c (struct instr_info): Add erex for gpr32.
	Add last_erex_prefix for rex2 prefix.
	(REX2_M): Extend for gpr32.
	(PREFIX_REX2): Ditto.
	(PREFIX_REX2_ILLEGAL): Ditto.
	(ckprefix): Ditto.
	(prefix_name): Ditto.
	(print_insn): Ditto.
	(print_register): Ditto.
	(OP_E_memory): Ditto.
	(OP_REG): Ditto.
	(OP_EX): Ditto.
	* i386-gen.c (rex2_disallowed): Some instructions are not allowed rex2 prefix.
	(process_i386_opcode_modifier): Set NoEgpr for VEX and some special instructions.
	(output_i386_opcode): Handle if_entry_needs_special_handle.
	* i386-init.h : Regenerated.
	* i386-mnem.h : Regenerated.
	* i386-opc.h (enum i386_cpu): Add CpuAPX_F.
	(NoEgpr): New.
	(Prefix_NoOptimize): Ditto.
	(Prefix_REX2): Ditto.
	(RegRex2): Ditto.
	* i386-opc.tbl: Add rex2 prefix.
	* i386-reg.tbl: Add egprs (r16-r31).
	* i386-tbl.h: Regenerated.
2023-12-28 11:14:41 +00:00
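As a rough illustration of the two-byte REX2 prefix that commit 80d61d8d61 adds, the sketch below packs such a prefix from register numbers. The bit layout used here (0xD5 followed by a payload byte holding M0, R4, X4, B4, W, R3, X3, B3 from bit 7 down to bit 0) is taken from the Intel APX specification rather than from the commit message and should be checked against it. `build_rex2` and its parameters are hypothetical names; this is not the gas function build_rex2_prefix().

```python
REX2_OPCODE = 0xD5  # first byte of the prefix (assumed value, per the APX spec)


def build_rex2(map1, w, reg, index, base):
    """Pack a two-byte REX2 prefix (illustrative sketch only).

    map1  -- True to select legacy map 1, False for map 0 (the M0 bit)
    w     -- True for 64-bit operand size (the W bit)
    reg, index, base -- register numbers 0..31; bits 3 and 4 of each
                        number end up in the prefix payload
    """
    for regno in (reg, index, base):
        if not 0 <= regno <= 31:
            raise ValueError("register number out of range")

    def bit(value, n):
        return (value >> n) & 1

    payload = (
        (int(bool(map1)) << 7)
        | (bit(reg, 4) << 6)      # R4
        | (bit(index, 4) << 5)    # X4
        | (bit(base, 4) << 4)     # B4
        | (int(bool(w)) << 3)     # W
        | (bit(reg, 3) << 2)      # R3
        | (bit(index, 3) << 1)    # X3
        | bit(base, 3)            # B3
    )
    return bytes([REX2_OPCODE, payload])
```

The low three bits of each register number are not part of the prefix; they go into the ModR/M and SIB bytes as with the classic REX prefix.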
Haochen Jiang
fa88a361f9 x86: Remove the restriction for size of the mask register in AVX10
Since AVX10.1/256 will also allow 64 bit mask register, we will
remove the restriction for size of the mask register in AVX10.

gas/ChangeLog:

	* config/tc-i386.c (VSZ128, VSZ256, VSZ512): New.
	(VEX_check_encoding): Remove opcode_modifier check for vsz.
	* testsuite/gas/i386/avx10-vsz.l: Remove testcases for mask
	registers since they are not needed.
	* testsuite/gas/i386/avx10-vsz.s: Ditto.

opcodes/ChangeLog:

	* i386-gen.c: Remove Vsz.
	* i386-opc.h: Ditto.
	* i386-opc.tbl: Remove kvsz.
	* i386-tbl.h: Regenerated.
2023-12-19 16:35:24 +08:00
Jan Beulich
df5a4840c4 revert "x86: allow 32-bit reg to be used with U{RD,WR}MSR"
This reverts commit 1f865bae65. The
specification is going to be updated in a way rendering this change
wrong.
2023-12-15 12:40:00 +01:00
Jan Beulich
35266cb139 x86: fold assembly dialect attributes
Now that ATTSyntax and ATTMnemonic aren't used in combination anymore,
fold them and IntelSyntax into a single, enum-like attribute. Note that
this shrinks i386_opcode_modifier back to 2 32-bit words (albeit that's
not for long, seeing in-flight additions for APX).
2023-12-15 12:05:11 +01:00
Jan Beulich
7d3182d6aa x86: Intel syntax implies Intel mnemonics
As noted in the context of d53e6b98a2 ("x86/Intel: correct disassembly
of fsub*/fdiv*") there's no such thing as Intel syntax without Intel
mnemonics. Enforce this on the assembler side, and disentangle command
line option handling on the disassembler side accordingly.

As a result in the opcode table specifying ATTMnemonic|ATTSyntax becomes
redundant with just ATTMnemonic. Drop the now meaningless ATTSyntax and
remove the then no longer accessible templates.
2023-12-15 12:04:39 +01:00
Jan Beulich
1f865bae65 x86: allow 32-bit reg to be used with U{RD,WR}MSR
... as MSR index specifier: It is unreasonable to demand that people
write less readable / understandable code, just because the present
documentation mentions only Reg64. Whether to also adjust the
disassembler is a separate question, perhaps indeed more tightly tied
to what the spec says.
2023-12-01 08:26:36 +01:00
Jan Beulich
d3b01414b9 x86: shrink opcode sets table
Have i386-gen produce merely the offsets into i386_optab[]. Besides
allowing to shrink the table even on 32-bit builds, this results in
removing a level of indirection from the frequently accessed
current_templates, in return for adding a level of indirection when
looking up mnemonics (commonly happening just once per insn). Plus for
PIE builds of gas it also reduces the number of relocations by about two
thousand. Finally a somewhat ugly static variable can also be eliminated
from i386_displacement().
2023-11-24 09:55:51 +01:00
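The idea in commit d3b01414b9 of storing offsets instead of pointers can be sketched as follows. The data and names here (`optab`, `op_sets`, `templates_for`) are made up for the example, and the use of a trailing end offset as a sentinel is an assumption, not something the commit message states.

```python
# Flat template array, standing in for i386_optab[] (contents invented).
optab = ["inc/0", "inc/1", "dec/0", "dec/1", "dec/2", "not/0"]

# One start offset per mnemonic's template set, plus a final end offset
# (assumed layout). Set i covers optab[op_sets[i]:op_sets[i + 1]].
op_sets = [0, 2, 5, 6]


def templates_for(set_index):
    """Return the templates of one set by slicing the flat array."""
    start = op_sets[set_index]
    end = op_sets[set_index + 1]
    return optab[start:end]
```

Small integer offsets need less space than pointers and, unlike pointers, require no relocations in a position-independent build, which matches the savings the commit message reports.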
Jan Beulich
3086ed9a45 x86: CPU-qualify {disp16} / {disp32}
{disp16} is invalid to use in 64-bit mode, while {disp32} is invalid to
use on pre-386 CPUs. The latter, also affecting other (real) prefixes,
further requires that like for insns we fully check the CPU flags; till
now only Cpu64/CpuNo64 were taken into consideration.
2023-11-17 11:23:20 +01:00
Jan Beulich
e7d7487987 x86: rework UWRMSR operand swapping
As indicated during review already, doing the swapping early is overall
cheaper than doing it only after operand matching.
2023-11-09 12:55:52 +01:00
Jan Beulich
706ce98422 x86: do away with is_evex_encoding()
As we have grown more uses of it, it becomes increasingly more desirable
to replace it by a simpler check. Have i386-gen do at build time what so
far was done at runtime: Deal with templates indicating EVEX-encoding by
other than the EVex attribute, and set that to "dynamic" in such cases.

This then allows simplifying a number of other conditionals as well.
2023-11-09 12:55:26 +01:00
Jan Beulich
a5e91879d1 x86: split insn templates' CPU field
Right now the opcode table has entries with ISA restrictions of the form
FEAT1|FEAT2, the meaning of which depends on context and requires
special treatment in tc-i386.c: Sometimes this means "both features
required", whereas originally it was intended to solely mean "all of
these features required". Split the field, with the original one
regaining its original meaning. The new field now truly means "any of
these". The combination of both fields is still an &&-type check, i.e.
(all of these) && (any of these). In the opcode table more involved
combinations of features then also need expressing this way: "all"
entities first, followed by "any" entities enclosed in parentheses, e.g.
x64&(AVX|AVX512F). If the "all" part is empty, parentheses may not be
added around the "any" part (unless parsing logic was further relaxed).

Note that this way AVX512VL no longer needs as much special treatment,
and hence templates previously using AVX512F|AVX512VL are switched to
just AVX512VL.

Note further that this requires FMA handling as resulting from
da0784f961 ("x86: fold FMA VEX and EVEX templates") to be slightly
re-done: FMA now becomes more similar to AVX and AVX2.
2023-11-09 12:54:58 +01:00
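The "(all of these) && (any of these)" check from commit a5e91879d1 can be sketched as below. This is a minimal illustration using Python sets; the name `cpu_match` and the set-based representation are assumptions for the example, whereas the real check is done on bitfields in cpu_flags_match() in tc-i386.c.

```python
def cpu_match(enabled, all_of, any_of):
    """Return True if a template's ISA requirements are met.

    enabled -- set of currently enabled feature names
    all_of  -- features that must all be enabled
    any_of  -- features of which at least one must be enabled;
               an empty set places no further restriction
    """
    all_ok = all_of <= enabled
    any_ok = not any_of or bool(any_of & enabled)
    return all_ok and any_ok


# The opcode table notation x64&(AVX|AVX512F) from the commit message:
template_all = frozenset({"x64"})
template_any = frozenset({"AVX", "AVX512F"})
```

With these two sets, a configuration enabling x64 and AVX512F matches, while one enabling only AVX (without x64) does not.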
Hu, Lin1
8170af78e1 Support Intel USER_MSR
This patch aims to support Intel USER_MSR. In addition to the usual
support, this patch includes encoding and decoding support for MAP7 and
immediate numbers as the last operand (ATT style).

gas/ChangeLog:

	* NEWS: Support Intel USER_MSR.
	* config/tc-i386.c (smallest_imm_type): Reject imm32 in 64bit
	mode.
	(build_vex_prefix): Add VEXMAP7.
	(md_assemble): Handling the imm32 of USER_MSR.
	(match_template): Handling the unusual immediate.
	* doc/c-i386.texi: Document .user_msr.
	* testsuite/gas/i386/i386.exp: Run USER_MSR tests.
	* testsuite/gas/i386/x86-64.exp: Ditto.
	* testsuite/gas/i386/user_msr-inval.l: New test.
	* testsuite/gas/i386/user_msr-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-user_msr-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-user_msr-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-user_msr-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-user_msr.d: Ditto.
	* testsuite/gas/i386/x86-64-user_msr.s: Ditto.

opcodes/ChangeLog:
	* i386-dis.c (struct instr_info): Add a new attribute
	has_skipped_modrm.
	(Gq): New.
	(Rq): Ditto.
	(q_mm_mode): Ditto.
	(Nq): Change mode from q_mode to q_mm_mode.
	(VEX_LEN_TABLE):
	(get_valid_dis386): Add VEX_MAP7 in VEX prefix
	and handle map7_f8 to save space.
	(OP_Skip_MODRM): Set has_skipped_modrm.
	(OP_E): Skip codep++ when has skipped modrm byte.
	(OP_R): Support q_mode and q_mm_mode.
	(REG_VEX_MAP7_F8_L_0_W_0): New.
	(PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64): Ditto.
	(X86_64_VEX_MAP7_F8_L_0_W_0_R_0): Ditto.
	(VEX_LEN_MAP7_F8): Ditto.
	(VEX_W_MAP7_F8_L_0): Ditto.
	(MOD_0F38F8): Ditto.
	(PREFIX_0F38F8_M_0): Ditto.
	(PREFIX_0F38F8_M_1_X86_64): Ditto.
	(X86_64_0F38F8_M_1): Ditto.
	(PREFIX_0F38F8): Remove.
	(prefix_table): Add PREFIX_0F38F8_M_1_X86_64.
	Remove PREFIX_0F38F8.
	(reg_table): Add REG_VEX_MAP7_F8_L_0_W_0,
	PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64.
	(x86_64_table): Add X86_64_0F38F8_PREFIX_3_M_1,
	X86_64_VEX_MAP7_F8_L_0_W_0_R_0 and X86_64_0F38F8_M_1.
	(vex_table): Add VEX_MAP7.
	(vex_len_table): Add VEX_LEN_MAP7_F8,
	VEX_W_MAP7_F8_L_0.
	(mod_table): New entry for USER_MSR and
	add MOD_0F38F8.
	* i386-gen.c (cpu_flag_init): Add CPU_USER_MSR_FLAGS and
	CPU_ANY_USER_MSR_FLAGS. Add VEXMAP7.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (SPACE_VEXMAP7): New.
	(CPU_USER_MSR_FLAGS): Ditto.
	(CPU_ANY_USER_MSR_FLAGS): Ditto.
	(i386_cpu_flags): Add cpuuser_msr.
	* i386-opc.tbl: Add USER_MSR instructions.
	* i386-tbl.h: Regenerated.
2023-10-31 16:24:41 +08:00
Jan Beulich
da0784f961 x86: fold FMA VEX and EVEX templates
Following the folding of some generic AVX/AVX2 templates with their
AVX512F counterpart ones, do this for FMA ones as well, requiring one
further adjustment to cpu_flags_match().
2023-09-27 14:16:09 +02:00
Jan Beulich
f94f390ef8 x86: fold VAES/VPCLMULQDQ VEX and EVEX templates
Following the folding of some generic AVX/AVX2 templates with their
AVX512F counterpart ones, do this for VAES and VPCLMULQDQ ones as well.
2023-09-27 14:15:44 +02:00
Jan Beulich
a6f3add002 x86: fold certain VEX and EVEX templates
In anticipation of APX introduce logic to reduce the number of templates
we have now, somewhat limiting the number of new ones we then need to
gain.

The fundamental requirements are that
- attributes be compatible, which specifically means VexW needs to be
  the same in the templates (which often isn't the case, for VEX
  encodings having far more WIG than EVEX ones),
- the EVEX form being AVX512F (with or without AVX512VL), not any of its
  extensions (the same will then be required for APX - it'll need to be
  APX_F).

Note that in check_register() there's now a redundant zmm check. Since
this logic will need revisiting for APX anyway, I'd like to keep it that
way for now. (Similarly a couple of if()-s which could be folded are
kept separate, to reduce code churn when adding APX support.)
2023-09-27 14:15:19 +02:00
Jan Beulich
da5f9eb43f x86: fold CpuLM and Cpu64
Now that CpuLM is used solely in cpu_arch_flags and cpu_arch[] while
Cpu64 is solely used in insn templates, they no longer need to be
treated differently from other "ordinary" flags; the only "unusual" one
left is CpuNo64. Fold both, leaving just Cpu64.
2023-09-15 09:57:05 +02:00
Jan Beulich
4fc85f37dc x86: support AVX10.1 vector size restrictions
Recognize "/<number>" suffixes on both -march=+avx10.1 and the
corresponding .arch directive, setting an upper bound on the vector size
that insns may use. Such a restriction can be reset by setting a new base
architecture, by using a suffix-less form, by disabling AVX10, or by
enabling any other VEX/EVEX-based vector extension.

While for most insns we can suppress their use with too wide operands
via registers becoming unavailable (or in Intel syntax memory operand
size specifiers not being recognized), mask register insns have to have
their minimum required vector size specified in a new attribute. (Of
course this new attribute could also be used on other insns.)

Note that .insn continues to be permitted to emit EVEX{512,256} (and
VEX256 ones) encodings regardless of vector size restrictions in place.
Of course these can't be expressed using zmm (or ymm) operands then,
but need using the EVEX.512.* forms (broadcast forms may be usable right
now, but this may go away so shouldn't be relied upon). This is why no
assertions should be added to build_{e,}vex_prefix().
2023-09-14 08:43:45 +02:00
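The "/<number>" suffix handling described in commit 4fc85f37dc can be sketched roughly as follows. This is not the gas parser: `parse_arch_ext` and `operand_allowed` are hypothetical names, and the accepted sizes (128, 256, 512) are an assumption made for the example.

```python
ASSUMED_SIZES = ("128", "256", "512")  # assumption, not taken from the commit


def parse_arch_ext(spec):
    """Split e.g. 'avx10.1/256' into ('avx10.1', 256).

    A suffix-less form yields None as the limit, meaning no upper bound
    on the vector size.
    """
    name, sep, size = spec.partition("/")
    if not sep:
        return name, None
    if size not in ASSUMED_SIZES:
        raise ValueError("unsupported vector size: " + size)
    return name, int(size)


def operand_allowed(limit, operand_bits):
    """Return True if an operand of the given width fits under the limit."""
    return limit is None or operand_bits <= limit
```

For example, parsing "avx10.1/256" gives a limit of 256, under which a 512-bit (zmm) operand is rejected while 128-bit and 256-bit operands remain usable.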
Jan Beulich
d5f9027c4c x86: make AES/PCLMULQDQ respectively prereqs of VAES/VPCLMULQDQ
These probably should have been put in place already anyway, but they're
very much wanted in order to then put AVX10.1 support on top. Note that
to avoid reverse dependencies towards SSE (just like we already do for
AVX and XOP), add_isa_dependencies() needs some further tweaking.

While there also address a related anomaly: Disabling AES but neither
AVX nor VAES (similarly for {,V}PCLMULQDQ) would better keep the 128-bit
VEX-encoded forms available. Note that for this the VAES insns are moved
past the AVX+AES ones, to avoid the property-11 test suddenly failing.
The test really is wrong, but let's not also make things inconsistent:
Without the movement, YMM use would be correctly recorded for the
128-bit forms simply because the first template already matches, as long
as VAES wasn't disabled.  Yet it still wouldn't be if only AVX+AES were
enabled. Nor would behavior here then be the same as for VPCLMUL* insns.
2023-09-14 08:40:58 +02:00
Jan Beulich
e746be9858 x86: drop Size64 from VMOVQ
Commit 916fae9135 ("Add Size64 to movq/vmovq with Reg64 operand") was
right in adding the attribute to MOVQ, but there was no need to add it
to VMOVQ. (See also the AVX512F form, which doesn't have the attribute
either.)
2023-09-01 12:27:20 +02:00