binutils-gdb

mirror of https://sourceware.org/git/binutils-gdb.git synced 2024-11-27 03:54:41 +08:00

Author	SHA1	Message	Date
Jan Beulich	497ee27a74	x86: VP2INTERSECT{D,Q} have mask register destination group Much like AVX512-{4FMAPS,4VNNIW} have a constraint on their register source, there's a constraint (need to be even) on the destination register here. Adjust "good" test cases accordingly, and add a new test case to check the warning.	2024-11-18 11:45:50 +01:00
Jan Beulich	a3db0f57df	x86/APX: support JMPABS also in assembler Without this APX support isn't really complete. For Intel syntax displacement form is needed, such that symbolic operands won't need prefixing by "offset". (The other form is actually not used at all in Intel syntax.) For the record: To restrict displacement form to Intel syntax is not something I actually agree with.	2024-10-30 12:12:54 +01:00
Jan Beulich	5168ed9912	x86: use <xyz> for VFPCLASSP{S,D} Just like VFPCLASSPH does. While the order of generated table entries changes this way, the individual entries don't change.	2024-10-29 08:08:50 +01:00
MayShao-oc	b2841da4f2	x86: Regenerate missing table files As soon as I committed Zhaoxin's patch, I realized that I did not include the regen file. Regenerate them and commit as obvious. opcodes/ChangeLog: * i386-tbl.h: Regenerated. * i386-mnem.h: Ditto. * i386-init.h: Ditto.	2024-10-18 15:57:22 +08:00
Liwei Xu	3bac89e65f	Support Intel AVX10.2 convert instructions In this patch, we will support AVX10.2 convert instructions. All of them are new instruction forms. Among all the instructions, vcvtbiasph2[b,h]f8[,s] needs extra care. Since Operand 2 could indicate memory size, we do not need suffix under ATTmode. However, we could not fold all three templates but only XMM/YMM since the dst operand size are the same for them. Also, a new iterator <cvt8> is added to reduce redundancy. gas/ * testsuite/gas/i386/i386.exp: Add AVX10.2 tests. * testsuite/gas/i386/x86-64.exp: Ditto. * testsuite/gas/i386/avx10_2-256-cvt-intel.d: New. * testsuite/gas/i386/avx10_2-256-cvt.d: Ditto. * testsuite/gas/i386/avx10_2-256-cvt.s: Ditto. * testsuite/gas/i386/avx10_2-512-cvt-intel.d: Ditto. * testsuite/gas/i386/avx10_2-512-cvt.d: Ditto. * testsuite/gas/i386/avx10_2-512-cvt.s: Ditto. * testsuite/gas/i386/x86-64-avx10_2-256-cvt-intel.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-256-cvt.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-256-cvt.s: Ditto. * testsuite/gas/i386/x86-64-avx10_2-512-cvt-intel.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-512-cvt.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-512-cvt.s: Ditto. opcodes/ * i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F3874, PREFIX_EVEX_MAP5_18, PREFIX_EVEX_MAP5_1B, PREFIX_EVEX_MAP5_1E and PREFIX_EVEX_MAP5_74. * i386-dis-evex.h: Add table pass for AVX10.2 instructions. * i386-dis.c (MOD_EVEX_0F38B1): New. (PREFIX_EVEX_0F3874): Ditto. (PREFIX_EVEX_MAP5_18): Ditto. (PREFIX_EVEX_MAP5_1B): Ditto. (PREFIX_EVEX_MAP5_1E): Ditto. (PREFIX_EVEX_MAP5_74): Ditto. * i386-opc.tbl: Add AVX10.2 instructions. * i386-mnem.h: Regenerated. * i386-tbl.h: Ditto. Co-authored-by: Kong Lingling <lingling.kong@intel.com> Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>	2024-10-16 10:25:35 +08:00
Haochen Jiang	873e7b6cf6	Support Intel AVX10.2 media instructions In disassembler part, for vnni instructions, we extended previous VEX part using %XE in disassembler to promote them to EVEX by reusing the original VEX table. For vmpsadbw, we will also use %XE. However, it is hard to reuse the VEX table, so we are using new ones. In assembler part, we put the vnni table entries with previous vnni instructions since they are just promotion from AVX-VNNI-INT{8,16}. Since we will prefer VEX encoding, we need to use the different table order in template <vnni>, which prefers EVEX due to earlier introduction for AVX512_VNNI than AVX_VNNI. This means a new <vnni>. For vdpphps and vmpsadbw, we put them at the end of the table, with future AVX10.2 instructions. Nit: I will remove the arch requirement for avx_vnni_int{8,16} in evex-promote testcases after AVX10.2 implies AVX-VNNI-INT{8,16}. gas/Changelog: * testsuite/gas/i386/i386.exp: Add AVX10.2 tests. * testsuite/gas/i386/x86-64.exp: Ditto. * testsuite/gas/i386/avx10_2-256-1-intel.d: New. * testsuite/gas/i386/avx10_2-256-1.d: Ditto. * testsuite/gas/i386/avx10_2-256-1.s: Ditto. * testsuite/gas/i386/avx10_2-512-1-intel.d: Ditto. * testsuite/gas/i386/avx10_2-512-1.d: Ditto. * testsuite/gas/i386/avx10_2-512-1.s: Ditto. * testsuite/gas/i386/avx10_2-promote.d: Ditto. * testsuite/gas/i386/avx10_2-promote.s: Ditto. * testsuite/gas/i386/x86-64-avx10_2-256-1-intel.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-256-1.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-256-1.s: Ditto. * testsuite/gas/i386/x86-64-avx10_2-512-1-intel.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-512-1.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-512-1.s: Ditto. * testsuite/gas/i386/x86-64-avx10_2-promote.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-promote.s: Ditto. opcodes/Changelog: * i386-dis-evex-prefix.h: Adjust PREFIX_EVEX_0F3852. Add PREFIX_EVEX_0F3A42_W_0. * i386-dis-evex-w.h: Adjust EVEX_W_0F3A42. * i386-dis-evex.h: Add table pass for AVX10.2 instructions. * i386-dis.c: Adjust PREFIX_VEX_0F3850_W_0, PREFIX_VEX_0F3851_W_0, PREFIX_VEX_0F38D2_W_0 and PREFIX_VEX_0F38D3_W_0. * i386-opc.tbl: Add AVX10.2 instructions. * i386-mnem.h: Regenerated. * i386-tbl.h: Ditto. Co-authored-by: Lili Cui <lili.cui@intel.com>	2024-10-11 10:38:27 +08:00
Jan Beulich	ca6b6f9d6e	x86: optimize {,V}INSERTPS with certain immediates They are equivalent to simple moves or xors, which are up to 3 bytes shorter to encode (and maybe/likely also cheaper to execute).	2024-09-27 11:23:12 +02:00
Jan Beulich	f079b0c4b2	x86: optimize {,V}EXTRACT{F,I}{128,32x{4,8},64x{2,4}} with immediate 0 They, too, are equivalent to simple moves, which are up to 3 bytes shorter to encode (and maybe also cheaper to execute).	2024-09-27 11:22:34 +02:00
Jan Beulich	afd5b33bc7	x86: optimize {,V}EXTRACTPS with immediate 0 They are equivalent to simple moves, which are up to 2 bytes shorter to encode (and maybe also cheaper to execute).	2024-09-27 11:21:51 +02:00
Jan Beulich	174e5e38b9	x86: templatize SIMD narrowing-move templates Once again to reduce redundancy.	2024-09-26 12:27:14 +02:00
Jan Beulich	2bb43416f9	x86: templatize SIMD sign-/zero-extension templates Yet again to reduce redundancy.	2024-09-26 12:27:01 +02:00
Jan Beulich	0c27c22320	x86: templatize SIMD FP binary-logic templates Once more to reduce redundancy.	2024-09-26 12:26:34 +02:00
Jan Beulich	5d285de425	x86: further templatize FMA templates Further reduce redundancy, in preparation of the addition of counterparts for AVX10.2.	2024-09-26 12:26:15 +02:00
Jan Beulich	fc91e3cec5	x86: templatize SIMD FP arithmetic templates Reduce redundancy, in preparation of the addition of further counterparts for AVX10.2. Provide the "ne" parameter needed there right away, even if unused for now.	2024-09-26 12:25:45 +02:00
H.J. Lu	2963d7d80d	x86/APX: Don't promote AVX/AVX2 instructions out of APX spec V{BROADCAST,EXTRACT,INSERT}{F,I}128 and VROUND{P,S}{S,D} aren't promoted to support EGPR in APX spec. Don't promote them out of APX spec. This commit effectively reverted: `ec3babb8c1` x86/APX: V{BROADCAST,EXTRACT,INSERT}{F,I}128 can also be expressed `5a635f1f59` x86/APX: VROUND{P,S}{S,D} encodings require AVX512{F,VL} `eea4357967` x86/APX: VROUND{P,S}{S,D} can generally be encoded gas/ PR gas/32171 * testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: Add V{BROADCAST,EXTRACT,INSERT}{F,I}128 tests with EGPR. * testsuite/gas/i386/x86-64-apx-evex-promoted.s: Remove V{BROADCAST,EXTRACT,INSERT}{F,I}128 and VROUND{P,S}{S,D} tests with EGPR. * testsuite/gas/i386/x86-64-apx-egpr-inval.l: Updated. * testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: Likewise. * testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Likewise. * testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Likewise. * testsuite/gas/i386/x86-64-apx-evex-promoted.d: Likewise. opcodes/ PR gas/32171 * i386-opc.tbl: Remove V{BROADCAST,EXTRACT,INSERT}{F,I}128 and VROUND{P,S}{S,D} entries with EGPR. * i386-tbl.h: Regenerated. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2024-09-18 10:11:02 +08:00
Jan Beulich	4eb59a5243	x86/APX: use D for 2-operand CFCMOVcc There's no need to have 30 redundant templates when we can easily take care of the operand swapping like we do for various other insns.	2024-09-06 08:35:42 +02:00
Jan Beulich	6b8ed67d6e	x86/APX: optimize certain reg-only CFCMOVcc forms Along the lines of `2513312930` ("x86/APX: apply NDD-to-legacy transformation to further CMOVcc forms") these can similarly be converted to the shorter legacy-encoded CMOVcc.	2024-09-06 08:35:07 +02:00
Jan Beulich	f12eb19e17	x86: templatize VNNI templates Reduce redundancy, in preparation of the addition of further counterparts for AVX10.2.	2024-09-06 08:33:47 +02:00
Haochen Jiang	85e370a3d6	Support ymm rounding control for Intel AVX10.2 In the patch, in order to support ymm rounding for AVX10.2, we derive evex attribute for all cases instead of only for rc_none to encode U bit. Also changed some bad_opcode return due to the share of U bit with APX_F. gas/ChangeLog: * config/tc-i386.c (cpu_flags_match): Handle AVX10_2. (build_evex_prefix): Handle U bit. Derive evex attribute for all cases. (check_VecOperands): Handle AVX10.2 and ymm roundings. * doc/c-i386.texi: Document .avx10.2. * testsuite/gas/i386/i386.exp: Run AVX10.2 tests. * testsuite/gas/i386/x86-64.exp: Ditto. * testsuite/gas/i386/avx10_2-rounding-intel.d: New test. * testsuite/gas/i386/avx10_2-rounding-inval.l: Ditto. * testsuite/gas/i386/avx10_2-rounding-inval.s: Ditto. * testsuite/gas/i386/avx10_2-rounding.d: Ditto. * testsuite/gas/i386/avx10_2-rounding.s: Ditto. * testsuite/gas/i386/x86-64-avx10_2-rounding-intel.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-rounding.d: Ditto. * testsuite/gas/i386/x86-64-avx10_2-rounding.s: Ditto. opcodes/ChangeLog: * i386-dis.c (struct instr_info): Add U bit. (get_valid_dis386): Handle U bit. * i386-gen.c (isa_dependencies): Add AVX10.2. (cpu_flags): Ditto. * i386-init.h: Regenerated. * i386-opc.h (CpuAVX10_2): New. (i386_cpu_flags): Add cpuavx10_2. * i386-opc.tbl: Add rounding to old entries which do not permit rounding previously. Also eliminate the redundant RegXMM for vcvtps2uqq. * i386-tbl.h: Regenerated.	2024-09-02 10:53:59 +08:00
Jan Beulich	4eb19fde73	x86: limit RegRex64 use The special property really only applies to the "extended" byte regs having legacy word/dword counterparts. While touching involved code also drop redundant byte checks from a conditional in establish_rex(): The other remaining RegRex64 uses only exist on registers which can't be used as register operands anyway. Hence RegRex64 as an attribute of a (valid) register operand implies that it's a byte reg.	2024-08-30 11:23:16 +02:00
Jan Beulich	1cd36be7c9	x86/APX: optimize certain {nf}-form insns to BMI2 ones ..., as those leave EFLAGS untouched anyway. That's a shorter encoding, available as long as no eGPR is in use anywhere.	2024-07-26 07:59:04 +02:00
Cui, Lili	b0dd832fa4	Support APX CFCMOV The CMOVcc instruction proposed by EVEX has four different forms, corresponding to the four possible combinations of EVEX.ND and EVEX.NF values. In the encoder part, when the CFCMOV template supports EVEX_NF, it means that it requires EVEX.NF to be 1. In the decoder part, CFCMOV_Fixup is used to reverse source and destination operands in the 2-operand case. gas/ChangeLog: * config/tc-i386.c (build_apx_evex_prefix): Set NF bit for cfcmov when the insn template supports EVEX_NF. * testsuite/gas/i386/x86-64-apx-inval.l: Add invalid tests for cfcmov. * testsuite/gas/i386/x86-64-apx-inval.s: Ditto. * testsuite/gas/i386/x86-64.exp: Add tests for cfcmov and cmov. * testsuite/gas/i386/x86-64-apx-cfcmov-intel.d: Ditto. * testsuite/gas/i386/x86-64-apx-cfcmov.d: Ditto. * testsuite/gas/i386/x86-64-apx-cfcmov.s: Ditto. opcodes/ChangeLog: * i386-dis-evex-prefix.h: Add cfcmov instructions. * i386-dis.c (CFCMOV_Fixup): Special handling of cfcmov. (putop): Print 'cf' for cfcmov instructions. * i386-opc.h (EVEX_NF): New. * i386-opc.tbl: Add cfcmov instructions. * i386-mnem.h: Regenerated. * i386-tbl.h: Regenerated.	2024-07-04 15:55:00 +08:00
Jan Beulich	2513312930	x86/APX: apply NDD-to-legacy transformation to further CMOVcc forms With both sources being registers, these insns are almost commutative; the only extra adjustment needed is inversion of the encoded condition.	2024-06-28 08:24:45 +02:00
Jan Beulich	7add993917	x86/APX: extend TEST-by-imm7 optimization to CTESTcc The same properties apply there.	2024-06-28 08:24:12 +02:00
Jan Beulich	82e06fa803	x86/APX: optimize {nf}-form IMUL-by-power-of-2 to SHL ..., for differing only in the resulting EFLAGS, which are left untouched anyway. That's a shorter encoding, available as long as certain constraints on operands are met; see code comments. (SHL-by-1 forms may then be subject to further optimization that was introduced earlier.) Note that kind of as a side effect this also converts multiplication by 1 to shift by 0, which is a plain move or even no-op anyway. That could be further shrunk (as could be presence of shifts/rotates by 0 in the original code as well as a fair set of other {nf}-form insns), yet the expectation (for now) is that people won't write such code in the first place.	2024-06-28 08:22:39 +02:00
Jan Beulich	27ef4876f7	x86/APX: optimize certain {nf}-form insns to LEA ..., as that leaves EFLAGS untouched anyway. That's a shorter encoding, available as long as certain constraints on operand size and registers are met; see code comments. Note that this requires deferring to derive encoding_evex from {nf} presence, as in optimize_encoding() we want to avoid touching the insns when {evex} was also used. Note further that this requires want_disp32() to now also consider the opcode: We don't want to replace i.tm.mnem_off, for diagnostics to still report the original mnemonic (or else things can get confusing). While there, correct adjacent mis-indentation.	2024-06-28 08:19:59 +02:00
Jan Beulich	c7eae03eab	x86/APX: optimize {nf}-form rotate-by-width-less-1 Unlike for the legacy forms, where there's a difference in the resulting EFLAGS.CF, for the NF variants the immediate can be got rid of in that case by switching to a 1-bit rotate in the opposite direction.	2024-06-28 08:19:32 +02:00
Jan Beulich	0868b8999b	x86/APX: optimize {nf} forms of ADD/SUB with specific immediates Unlike for the legacy forms, where there's a difference in the resulting EFLAGS, for the NF variants we can safely replace ones using 0x80 by the respectively other insn while negating the immediate, saving 3 immediate bytes (just 1 though for 16-bit operand size). Similarly we can replace ones using 1 / -1 by INC/DEC (eliminating the immediate).	2024-06-28 08:18:40 +02:00
Jan Beulich	f4a966a91d	x86: optimize {,V}PEXTR{D,Q} with immediate of 0 Such are equivalent to simple moves, which are up to 3 bytes shorter to encode (and perhaps also cheaper to execute).	2024-06-21 14:40:44 +02:00
Jan Beulich	fa2c4239f1	x86: optimize left-shift-by-1 These can be replaced by adds when acting on a register operand. While for the scalar forms there's no gain in encoding size, ADD generally has higher throughput than SHL. EFLAGS set by ADD are a superset of those set by SHL (AF in particular is undefined there). For the SIMD cases the transformation also reduced code size, by eliminating the 1-byte immediate from the resulting encoding. Note that this transformation is not applied by gcc13 (according to my observations), so would - as of now - even improve compiler generated code.	2024-06-21 14:39:52 +02:00
Cui, Lili	5445d7819b	x86: Remove the secondary encoding for ctest. There are two encodings for each opcode F6/F7 in ctest, but the second one is never used, so remove it to reduce the size of opcode_tbl.h. opcodes/ChangeLog: * i386-opc.tbl: Removed the secondary insn template for ctest. * i386-tbl.h: Regenerated.	2024-06-19 16:23:26 +08:00
Cui, Lili	d8ba1c4037	Support APX CCMP and CTEST CCMP and CTEST are two new sets of instructions for conditional CMP and TEST, SCC and OSZC flags are given as suffixes of CCMP or CTEST in the instruction mnemonic, e.g.: ccmp<cc> { dfv=sf , cf , of } %eax, %ecx also add {evex} cmp/test %eax, %ecx as an alias for ccmpt. For the encoder part, add function check_Scc_OszcOperation to parse '{ dfv=of , sf, sf, cf}', store scc in the lower 4 bits of base_opcode, and adjust base_opcode to its normal meaning in install_template. For the decoder part, add 'SC' and 'DF' macros to add scc and oszc flags suffixes. gas/ChangeLog: * config/tc-i386.c (OSZC_CF): New. (OSZC_ZF): Ditto. (OSZC_SF): Ditto. (OSZC_OF): Ditto. (set_oszc_flags): Set oszc flags and report error for using the same oszc flags twice. (check_Scc_OszcOperations): Handle SCC OSZC flags. (install_template): Add scc and oszc_flags. (build_apx_evex_prefix): Encode SCC and oszc flags bits. (parse_insn): Handle check_Scc_OszcOperations. * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Add invalid test case. * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto. * testsuite/gas/i386/x86-64.exp: Add test for ccmp and ctest. * testsuite/gas/i386/x86-64-apx-ccmp-ctest-intel.d: New test. * testsuite/gas/i386/x86-64-apx-ccmp-ctest-inval.l: Ditto. * testsuite/gas/i386/x86-64-apx-ccmp-ctest-inval.s: Ditto. * testsuite/gas/i386/x86-64-apx-ccmp-ctest.d: Ditto. * testsuite/gas/i386/x86-64-apx-ccmp-ctest.s: Ditto. opcodes/ChangeLog: * i386-dis-evex-reg.h: Add ccmp and ctest. * i386-dis-evex.h: Ditto. * i386-dis.c (struct instr_info): add scc. (struct dis386): Add new macros 'NE', 'SC' and 'DF'. (get_valid_dis386): Get scc value and move MAP4 invalid check to print_insn. (putop): Handle %NE, %SC and %DF. * i386-opc.h (SCC): New. * i386-opc.tbl: Add ccmp/ctest and evex format for cmp/test. * i386-mnem.h: Regenerated. * i386-tbl.h: Ditto.	2024-06-18 10:52:40 +08:00
Jan Beulich	d1c2dd6f4d	x86/APX: convert ZU to operand constraint Extremely rarely used attributes are inefficient when represented by a separate attribute. Convert it to an operand constraint, as already suggested during review. The collision with RegKludge is pretty simple to resolve.	2024-06-10 10:46:21 +02:00
Jan Beulich	f3f71a5ca0	x86/APX: support extended SETcc form As indicated during review, spelling/readability-wise setz %eax is easier than setzuz %al _and_ properly specifies the full register that's being modified. Permit that form to be used, even if the spec writers are unwilling to formally mention it. While there also correct the non-ZU EVEX form: That ought to also permit memory operands.	2024-06-10 10:45:16 +02:00
Jan Beulich	d967140f8c	x86/APX: add missing CPU requirement to imm+rm forms of <alu2> insns This was overlooked when the form was added by `dd74a60337` ("Support APX NF").	2024-06-10 09:05:23 +02:00
Jan Beulich	b83021de7a	x86/Intel: warn about undue mnemonic suffixes Except for very few insns mnemonic suffixes aren't permitted in Intel syntax. Warn about such for now, indicating that they will be outright refused down the road. While fiddling with testcases to address fallout, drop a few things which should never have been tested as valid Intel syntax. Also add a previously missing line to simd-suffix.d.	2024-05-29 10:03:00 +02:00
Jan Beulich	acd86c81f0	x86: correct VCVT{,U}SI2SD Properly reject inappropriate suffixes (No_lSuf / No_qSuf mistakenly omitted by `cf665fee1d` ["x86: re-work AVX512 embedded rounding / SAE"]), to avoid emitting bad or arbitrarily guessed instructions. Interestingly check_{long,qword}_suffix() don't help here, which perhaps is another indication that the way they work right now isn't quite appropriate. Sadly correcting just the templates breaks operand ambiguity detection, since so far that worked from a single template permitting more than one suffix. Here we have ambiguity though which can now be noticed only when taking all (matching) templates together. Therefore we need to determine further matching templates (see code comments for constraints), to then accumulate permitted suffixes across all of them.	2024-05-24 11:50:38 +02:00
Cui, Lili	bbe8d019ed	Support APX zero-upper This patch is to enable ZU for IMUL (opcodes 0x69 and 0x6B) and SETcc. Since the spec only recommends one form of setzu, I won't be adding set<cc>reg32/reg64 support in this patch. gas/ChangeLog: * config/tc-i386.c (build_apx_evex_prefix): Handle ZU. * testsuite/gas/i386/x86-64.exp: Added new tests for ZU. * testsuite/gas/i386/x86-64-apx-zu-intel.d: New test. * testsuite/gas/i386/x86-64-apx-zu-inval.l: Ditto. * testsuite/gas/i386/x86-64-apx-zu-inval.s: Ditto. * testsuite/gas/i386/x86-64-apx-zu.d: Ditto. * testsuite/gas/i386/x86-64-apx-zu.s: Ditto. opcodes/ChangeLog: * i386-dis-evex-prefix.h: Handle PREFIX_EVEX_MAP4_40 ~ PREFIX_EVEX_MAP4_4F. * i386-dis-evex.h: Ditto. * i386-dis.c (struct dis386): Add new macro 'ZU'. (putop): Handle %ZU. * i386-gen.c: Added ZU. * i386-opc.h: Ditto. * i386-opc.tbl: Added new templates to support ZU.	2024-05-22 16:15:47 +08:00
Cui, Lili	c8866e3ec5	x86: Drop using extension_opcode to encode vvvv register gas/ChangeLog: * config/tc-i386.c (build_modrm_byte): Dropped the use of extension_opcode to encode the vvvv register. * testsuite/gas/i386/x86-64-sse2avx.d: Added new testcases. * testsuite/gas/i386/x86-64-sse2avx.s: Ditto. opcodes/ChangeLog: * i386-opc.tbl: Added DstVVVV to some extension_opcode instructions. * i386-tbl.h: Regenerated.	2024-05-06 18:33:45 +08:00
Cui, Lili	0820c9f5fc	x86: Drop SwapSources gas/ChangeLog: * config/tc-i386.c (build_modrm_byte): Dropped the use of SWAP_SOURCES to encode the vvvv register. opcodes/ChangeLog: * i386-opc.h (SWAP_SOURCES): Dropped. (NO_DEFAULT_MASK): Adjusted the value. (ADDR_PREFIX_OP_REG): Ditto. (DISTINCT_DEST): Ditto. (IMPLICIT_STACK_OP): Ditto. (VexVVVV_SRC2): New. * i386-opc.tbl: Dropped SwapSources and replaced its VexVVVV with Src1VVVV. * i386-tbl.h: Regenerated.	2024-05-06 18:21:28 +08:00
Cui, Lili	f2a3a8814d	x86: Use vexvvvv as the switch state to encode the vvvv register Use vexvvvv as the switch state, and replace VexVVVV with Src1VVVV. Src1VVVV means using VEX.vvvv encodes the first source register operand. The old logic did not check vexvvvv first, which made the logic here very complicated. gas/ChangeLog: * config/tc-i386.c (optimize_encoding): Replaced 1 with Src1VVVV. (build_modrm_byte): Used vexvvvv to encode the vvvv register. (s_insn): Replaced 1 with Src1VVVV. opcodes/ChangeLog: * i386-opc.h (VexVVVV_DST): Adjusted the value. (Src1VVVV): New. * i386-opc.tbl: Replaced part VexVVVV with Src1VVVV. * i386-tbl.h: Regenerated.	2024-05-06 18:16:42 +08:00
Jan Beulich	1d026d6b19	x86/APX: further extend SSE2AVX coverage Since {vex}/{vex3} are respected on legacy mnemonics when -msse2avx is in use, {evex} should be respected, too. So far this is the case only for insns where eGPR-s can come into play. Extend coverage to insns with only %xmm register and possibly immediate operands.	2024-05-03 09:27:00 +02:00
Jan Beulich	24187fb9c0	x86/APX: extend SSE2AVX coverage Legacy encoded SIMD insns are converted to AVX ones in that mode. When eGPR-s are in use, i.e. with APX, convert to AVX10 insns (where available; there are quite a few which can't be converted). Note that LDDQU is represented as VMOVDQU32 (and the prior use of the sse3 template there needs dropping, to get the order right). Note further that in a few cases, due to the use of templates, AVX512VL is used when AVX512F would suffice. Since AVX10 is the main reference, this shouldn't be too much of a problem.	2024-05-03 09:26:25 +02:00
Cui, Lili	dd74a60337	Support APX NF For the case when NDD and NF are both 0 in evex-promoted format, we will fully support and test it in another patch. gas/ChangeLog: * NEWS: Support Intel APX NF. * config/tc-i386.c (enum i386_error): Add unsupported_nf. (struct _i386_insn): Add has_nf. (is_apx_evex_encoding): Ditto. (build_apx_evex_prefix): Encode the NF bit. (md_assemble): Handle unsupported_nf. (parse_insn): Handle Prefix_NF and report bad for illegal combination. (can_convert_NDD_to_legacy): Replace i.tm.opcode_modifier.nf with i.has_nf. (match_template): Support D for APX_F insns and check NF support. * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Add bad test for NF bit. * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto. * testsuite/gas/i386/x86-64-apx-inval.l: Ditto. * testsuite/gas/i386/x86-64-apx-inval.s: Ditto. * testsuite/gas/i386/x86-64.exp: Add apx nf tests. * testsuite/gas/i386/x86-64-apx-nf-intel.d: New test. * testsuite/gas/i386/x86-64-apx-nf.d: Ditto. * testsuite/gas/i386/x86-64-apx-nf.s: Ditto. opcodes/ChangeLog: * i386-dis-evex.h: Add %NF to the instructions that support APX NF and add new instruction imul, popcnt, tzcnt and lzcnt to EVEX table. * i386-dis-evex-reg.h: Ditto. * i386-dis.c (struct instr_info): Add nf. (struct dis386): Add "NF" for EVEX.NF. (get_valid_dis386): Set ins->vex.nf and report bad-nf for illegal case. (print_insn): Handle ins.vex.nf. (putop): Handle "%NF". * i386-opc.h (Prefix_NF): New. * i386-opc.tbl: Added new entries to support full APX NF instructions. * i386-mnem.h: Regenerated. * i386-tbl.h: Regenerated.	2024-04-07 17:28:25 +08:00
H.J. Lu	cca46dea4d	Revert "x86: Restore APX shift-double instructions with omitted shift count" This reverts commit `c2d698fe03`. GCC 14 has been changed to use explicit shift count in shift-double instructions by the commit: 06a7e7514af x86: Use explicit shift count in double-precision shifts gas/ PR gas/31606 * testsuite/gas/i386/x86-64-apx-ndd-wig.d: Updated. * testsuite/gas/i386/x86-64-apx-ndd.d: Likewise. * testsuite/gas/i386/x86-64-apx-ndd.s: Remove tests for APX shift-double instructions with omitted shift count. opcodes/ PR gas/31606 * i386-opc.tbl: Remove APX shift-double instructions with omitted shift count. * i386-tbl.h: Regenerated.	2024-04-06 05:07:18 -07:00
H.J. Lu	c2d698fe03	x86: Restore APX shift-double instructions with omitted shift count Restore APX shift-double instructions with omitted shift count since they are generated by GCC as shown in: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114590 gas/ PR gas/31606 * testsuite/gas/i386/x86-64-apx-ndd-wig.d: Updated. * testsuite/gas/i386/x86-64-apx-ndd.d: Likewise. * testsuite/gas/i386/x86-64-apx-ndd.s: Add tests for APX shift-double instructions with omitted shift count. opcodes/ PR gas/31606 * i386-opc.tbl: Restore APX shift-double instructions with omitted shift count. * i386-tbl.h: Regenerated.	2024-04-04 13:16:20 -07:00
Jan Beulich	ef9a6314d8	x86: add missing No_qSuf to non-64-bit PTWRITE While largely benign, it still should have been put there when the original single template was split (commit `a04973848d`).	2024-04-03 10:41:30 +02:00
Jan Beulich	0006623c18	x86: drop stray Size64 from WRSSQ Like for WRUSSQ it's not needed here. The legacy insn had gained it in the course of zapping Rex64, but that attribute wasn't needed here either. The APX insn then simply gained it by copy-and-paste, I suppose.	2024-04-03 10:40:57 +02:00
Cui, Lili	8963a60d7b	x86/APX: Remove KEYLOCKER and SHA promotions from EVEX MAP4 APX spec removed KEYLOCKER and SHA promotions from EVEX MAP4. https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html gas/ChangeLog: * NEWS: Mention the removal of KEYLOCKER and SHA promotions from EVEX MAP4. * config/tc-i386.c (process_operands): Removed special handling of KEYLOCKER and SHA. * testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: Removed KEYLOCKER and SHA instructions. * testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: Ditto. * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Ditto. * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto. * testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Ditto. * testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Ditto. * testsuite/gas/i386/x86-64-apx-evex-promoted.d: Ditto. * testsuite/gas/i386/x86-64-apx-evex-promoted.s: Ditto. opcodes/ChangeLog: * i386-dis-evex-prefix.h: Removed KEYLOCKER and SHA instructions. * i386-dis-evex.h: Ditto. * i386-opc.tbl: Ditto. * i386-dis.c (print_vector_reg): Removed special handling of KEYLOCKER and SHA.	2024-04-03 09:50:00 +08:00
Jan Beulich	ffa2571063	x86: templatize shift-double insns With the multitude of new APX templates, it finally becomes desirable to further remove redundancy by also templatizing basic arithmetic insns. Continue with the shift-double ones. While there also drop the APX form with ShiftCount omitted. Other shift and rotate insns were deliberately left without this form as well. Note that there's also no testsuite adjustment needed for this, indicating that the form wasn't tested either.	2024-03-28 11:49:48 +01:00
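
Several of the {nf}-form and shift optimizations listed above (`c7eae03eab`, `0868b8999b`, `82e06fa803`, `fa2c4239f1`) rest on simple fixed-width arithmetic identities. The following is a minimal sketch that checks those identities on the result value only; it is not taken from the binutils sources, and the helper names (`rol`, `ror`, `fits_imm8`, `check_identities`) are hypothetical. It deliberately does not model EFLAGS, which is where the replaced and replacing insns differ and why most of these rewrites are limited to {nf} forms.

```python
def rol(x, n, width):
    """Rotate the low `width` bits of x left by n."""
    mask = (1 << width) - 1
    x &= mask
    n %= width
    return ((x << n) | (x >> (width - n))) & mask


def ror(x, n, width):
    """Rotate the low `width` bits of x right by n."""
    mask = (1 << width) - 1
    x &= mask
    n %= width
    return ((x >> n) | (x << (width - n))) & mask


def fits_imm8(value):
    """True if value is representable as a sign-extended 8-bit immediate."""
    return -128 <= value <= 127


def check_identities(width, samples):
    """Check the result-value identities for every sample at this width."""
    mask = (1 << width) - 1
    neg_0x80 = (-0x80) & mask  # -0x80 sign-extended to the operand width
    for x in samples:
        x &= mask
        # Rotate by width-1 equals a 1-bit rotate in the opposite direction.
        assert rol(x, width - 1, width) == ror(x, 1, width)
        assert ror(x, width - 1, width) == rol(x, 1, width)
        # ADD 0x80 equals SUB -0x80 (and vice versa); -0x80 fits in imm8.
        assert (x + 0x80) & mask == (x - neg_0x80) & mask
        assert (x - 0x80) & mask == (x + neg_0x80) & mask
        # Multiplying by a power of 2 equals shifting left by its exponent.
        for k in range(width):
            assert (x * (1 << k)) & mask == (x << k) & mask
        # Shifting left by 1 equals adding the operand to itself.
        assert (x << 1) & mask == (x + x) & mask
    return True


if __name__ == "__main__":
    values = [0, 1, 0x7F, 0x80, 0xFF, 0x1234, 0x80000001, 0xFFFFFFFF]
    for w in (16, 32, 64):
        check_identities(w, values)
    print("all identities hold")
```

The ADD/SUB case shows where the size saving comes from: 0x80 does not fit a sign-extended 8-bit immediate while -0x80 does, so swapping the insn and negating the immediate allows the shorter immediate form.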


600 Commits