Just like their AVX counterparts they can utilize XMVexScalar /
EXdVexScalarS / EXqVexScalarS taking care of dropping the middle operand
for their memory forms.
These encodings aren't valid in real and VM86 modes, but they are very
well usable in 16-bit protected mode.
A few adjustments in the disassembler tables are needed where Ev or Gv
were wrongly used. Additionally an adjustment is needed to avoid
printing "addr32" when that's already recognizable by the use of %eiz.
Furthermore the Iq operand template was wrong for XOP:0Ah encoding
insns: They're having a uniform 32-bit immediate. Drop Iq and introduce
Id instead.
Clone a few existing test cases to exercise assembler and disassembler.
I assume this mode was needed when EVEX.W handling wasn't really correct
yet for other than 64-bit mode. It's clearly not needed anymore. Its
elimination also allows dropping the EVEX.W split of VCVT{,U}SI2SS. (For
the record, the dropped mode would have been wrong if used in any table
entry not already guaranteeing EVEX.W=1.)
The only meaningful difference from OP_I() is the handling of the
VEX.W=1 case in 64-bit mode for bytemode being v_mode. Funnel
everything else into OP_I(), and drop no longer needed local
variables.
MOVNTI was wrongly assembled with a 66h prefix. Add IgnoreSize to
address this. It and the scalar to/from integer conversion insns also
were also wrongly using Ev / Gv, leading to 16-bit register names being
printed when 32-bit ones were meant.
Clone the 32-bit SSE2 test to cover both assembler and disassembler.
Break i386-dis-evex.h into small files such that each file is included
just once.
* i386-dis-evex.h: Break into ...
* i386-dis-evex-len.h: New file.
* i386-dis-evex-mod.h: Likewise.
* i386-dis-evex-prefix.h: Likewise.
* i386-dis-evex-reg.h: Likewise.
* i386-dis-evex-w.h: Likewise.
* i386-dis.c: Include i386-dis-evex-reg.h, i386-dis-evex-prefix.h,
i386-dis-evex.h, i386-dis-evex-len.h, i386-dis-evex-w.h and
i386-dis-evex-mod.h.
If VEX.vvvv and EVEX.vvvv are reserved, they must be all 1s, which are
all 0s in inverted form. Add check for unused VEX.vvvv and EVEX.vvvv
when disassembling VEX and EVEX instructions.
gas/
PR binutils/24626
* testsuite/gas/i386/disassem.s: Add tests for reserved VEX.vvvv
and EVEX.vvvv.
* testsuite/gas/i386/x86-64-disassem.s: Likewise.
* testsuite/gas/i386/disassem.d: Updated.
* testsuite/gas/i386/x86-64-disassem.d: Likewise.
opcodes/
PR binutils/24626
* i386-dis.c (print_insn): Check for unused VEX.vvvv and
EVEX.vvvv when disassembling VEX and EVEX instructions.
(OP_VEX): Set vex.register_specifier to 0 after readding
vex.register_specifier.
(OP_Vex_2src_1): Likewise.
(OP_Vex_2src_2): Likewise.
(OP_LWP_E): Likewise.
(OP_EX_Vex): Don't check vex.register_specifier.
(OP_XMM_Vex): Likewise.
For the flavors having a GPR operand EVEX.W is ignored outside of 64-bit
mode. The mnemonic should therefore not be KMOVQ, the GPR operand should
not name a non-existing 64-bit register, just like is already the case
for the AVX counterparts, and the Disp8 scaling factor should be 4
rather than 8.
PEXTR{B,W} and PINSR{B,W}, just like for AVX512BW, are WIG, no matter
that the SDM uses a nonstandard description of that fact.
PEXTRD, even with EVEX.W set, ignores that bit outside of 64-bit mode,
just like its AVX counterpart.
AVX "VMOVQ xmm1, xmm2/m64" and "VMOVQ xmm1/m64, xmm2" can only be
encoded with VEX.128. Set Vex=1 on VEX.128 only vmovq and update
assembler tests.
gas/
PR gas/23665
* testsuite/gas/i386/avx-scalar-intel.d: Updated.
* testsuite/gas/i386/avx-scalar.d: Likewise.
* testsuite/gas/i386/x86-64-avx-scalar-intel.d: Likewise.
* testsuite/gas/i386/x86-64-avx-scalar.d: Likewise.
opcodes/
PR gas/23665
* i386-dis.c (vex_len_table): Update VEX_LEN_0F7E_P_1 and
VEX_LEN_0FD6_P_2 entries.
* i386-opc.tbl: Set Vex=1 on VEX.128 only vmovq.
* i386-tbl.h: Regenerated.
Update x86 disassembler to ignore the EVEX.W bit in EVEX vcvt[u]si2s[sd]
instructions in 32-bit mode.
gas/
PR binutils/23655
* testsuite/gas/i386/evex.d: New file.
* testsuite/gas/i386/evex.s: Likewise.
* testsuite/gas/i386/i386.exp: Run evex.
opcodes/
PR binutils/23655
* i386-dis-evex.h (evex_table): Replace Eq with Edqa for
vcvtsi2ss%LQ, vcvtsi2sd%LQ, vcvtusi2ss%LQ and vcvtusi2sd%LQ.
* i386-dis.c (Edqa): New.
(dqa_mode): Likewise.
(intel_operand_size): Handle dqa_mode as m_mode.
(OP_E_register): Handle dqa_mode as dq_mode.
(OP_E_memory): Set shift for dqa_mode based on address_mode.
In 64-bit mode, display eiz for address with the addr32 prefix and without
base nor index registers. For
mov -0xccddef(,%eiz,), %rax
disassembler now displays:
67 48 8b 04 25 11 22 33 ff mov -0xccddef(,%eiz,1),%rax
instead of
67 48 8b 04 25 11 22 33 ff addr32 mov 0xffffffffff332211,%rax
gas/
* testsuite/gas/i386/evex-no-scale-64.d: Updated.
* testsuite/gas/i386/x86-64-addr32-intel.d: Likewise.
* testsuite/gas/i386/x86-64-addr32.d: Likewise.
* testsuite/gas/i386/ilp32/x86-64-addr32-intel.d: Likewise.
* testsuite/gas/i386/ilp32/x86-64-addr32.d: Likewise.
* testsuite/gas/i386/x86-64-addr32.s: Add %eiz tests.
opcodes/
* i386-dis.c (OP_E_memory): In 64-bit mode, display eiz for
address with the addr32 prefix and without base nor index
registers.
Since only the first 32 bits of input operand are used for tpause and
umwait, the REX.W bit is skipped. Both 32-bit registers and 64-bit
registers are allowed.
gas/
* testsuite/gas/i386/x86-64-waitpkg.s: Add 32-bit registers
tests for tpause and umwait.
* testsuite/gas/i386/x86-64-waitpkg-intel.d: Updated.
* testsuite/gas/i386/x86-64-waitpkg.d: Likewise.
opcodes/
* i386-dis.c (prefix_table): Replace Em with Edq on tpause and
umwait.
* i386-opc.tbl: Allow 32-bit registers for tpause and umwait in
64-bit mode.
* i386-tbl.h: Regenerated.
"vex" has many fields to control how to decode an instruction. Clear
all fields in "vex" before decoding an instruction to avoid using values
left from the previous instruction.
gas/
PR binutils/23025
* testsuite/gas/i386/prefix.s: Add tests for vcvtpd2dq with
VEX and EVEX prefixes.
* testsuite/gas/i386/prefix.d: Updated.
opcodes/
PR binutils/23025
* i386-dis.c (get_valid_dis386): Don't set vex.prefix nor vex.w
to 0.
(print_insn): Clear vex instead of vex.evex.
In the course of folding their patterns (possible now that the pointless
and partly even bogus VecESize are no longer in the way) I've noticed
that vcvt*2usi, other than their vcvt*2si counterparts, don't allow for
any suffixes. As that is supposedly intentional, make the disassembler
consistently omit suffixes for all to-scalar-int conversion insns.
The wrong placement of the Load attribute in the templates prevented
this from working. The disassembler also didn't handle this consistently
with other similar dual-encoding insns.
fsub/fsubr/fsubp/fsubrp as well as fdiv/fdivr/fdivp/fdivrp disassembly
should match (a) the Intel SDM and (b) respective input fed to gas (both
of course with the exception of when we intentionally convert bogus
insns, accompanied by a warning).
Despite EVEX encodings not being available in real and VM86 modes,
16-bit addressing still needs to be handled properly for 16-bit
protected mode as well as 16-bit addressing in 32-bit mode. Neither
should displacements be dropped silently by the assembler, nor should
the disassembler fail to correctly scale 8-bit displacements.
Make the assembler recognize UD0, supporting only the newer form
expecting a ModR/M byte.
Make assembler and disassembler properly emit / expect a ModR/M byte for
UD1.
For the testsuite, as arch-4 already tests all UDn, avoid producing a
huge delta for other tests using UD2B by making them use UD2 instead.
While commits 9889cbb14e ("Check invalid mask registers") and
abfcb414b9 ("X86: Ignore REX_B bit for 32-bit XOP instructions") went a
bit into the right direction, this wasn't quite enough:
- VEX.vvvv has its high bit ignored
- EVEX.vvvv has its high bit ignored together with EVEX.v'
- the high bits of {,E}VEX.vvvv should not be prematurely zapped, to
allow proper checking of them when the fields has to hold al ones
- when the high bits of an immediate specify a register, bit 7 is
ignored
VEX.W may be legitimately set (and is then ignored by the CPU) for
non-64-bit code. Don't print 64-bit register names in such a case, by
utilizing that REX_W would never be set for non-64-bit code, and that
it is being set from VEX.W by generic decoding.
A test for this is going to be introduced in the next patch of this
series.
The low four bits of an immediate being set when the high bits specify a
fourth register operand is not a problem: CPUs ignore these bits rather
than raising #UD. Take care of incrementing codep in OP_EX_VexW()
instead.