The previous change
"x86: Ignore CS/DS/ES/SS segment-override prefixes in 64-bit mode"
to ignore segment override prefixes in 64-bit mode lead to dumping
branch hints as excessive prefixes:
ffffffff8109d5a0 <vmx_get_rflags>:
...
ffffffff8109d601: 3e 77 0a ds ja,pt ffffffff8109d60e <vmx_get_rflags+0x6e>
^^^^^
In this particular case, those prefixes are not excessive but are used
to provide branch hints - taken/not-taken - to the CPU.
Assign active_seg_prefix in that particular case to consume them.
gas/
2002-11-29 Borislav Petkov <bp@suse.de>
* testsuite/gas/i386/branch.d: Add new branch insns test.
* testsuite/gas/i386/branch.s: Likewise.
* testsuite/gas/i386/i386.exp: Insert the new branch test.
* testsuite/gas/i386/x86-64-branch.d: Test for branch hints insns.
* testsuite/gas/i386/x86-64-branch.s: Likewise.
* testsuite/gas/i386/ilp32/x86-64-branch.d: Likewise.
opcodes/
2020-11-28 Borislav Petkov <bp@suse.de>
* i386-dis.c (print_insn): Set active_seg_prefix for branch hint insns
to not dump branch hint prefixes 0x2E and 0x3E as unused prefixes.
In 64bit, assembler generates a warning for "sysret":
$ echo sysret | as --64 -o x.o -
{standard input}: Assembler messages:
{standard input}:1: Warning: no instruction mnemonic suffix given and no register operands; using default for `sysret'
Always display suffix for %LQ in 64bit to display "sysretl".
gas/
PR binutils/26704
* testsuite/gas/i386/noreg64-data16.d: Expect sysretl instead of
sysret.
* testsuite/gas/i386/noreg64.d: Likewise.
* testsuite/gas/i386/x86-64-intel64.d: Likewise.
* testsuite/gas/i386/x86-64-opcode.d: Likewise.
opcodes/
PR binutils/26704
* i386-dis.c (putop): Always display suffix for %LQ in 64bit.
The MODRM byte can be checked to display the instruction name only if the
MODRM byte needed. Clear modrm if the MODRM byte isn't needed so that
modrm field checks in putop like, modrm.mod == N with N != 0, can be done
without checking need_modrm.
gas/
PR binutils/26705
* testsuite/gas/i386/x86-64-suffix.s: Add "mov %rsp,%rbp" before
sysretq.
* testsuite/gas/i386/x86-64-suffix-intel.d: Updated.
* testsuite/gas/i386/x86-64-suffix.d: Likewise.
opcodes/
PR binutils/26705
* i386-dis.c (print_insn): Clear modrm if not needed.
(putop): Check need_modrm for modrm.mod != 3. Don't check
need_modrm for modrm.mod == 3.
There are 11 MOD_VEX_0F38* inserted in MOD_0F38* group,
which should be placed in MOD_VEX_0F38* group.
opcode/
PR 26654
*i386-dis.c (enum): Put MOD_VEX_0F38* together.
i386-dis.c:12207 left shift of 128 by 24 places cannot be represented in type 'long int'
i386-dis.c:12220 left shift of 128 by 24 places cannot be represented in type 'long int'
i386-dis.c:12222 left shift of 1 by 31 places cannot be represented in type 'long int'
i386-dis.c:12222 signed integer overflow: 162254319 - -2147483648 cannot be represented in type 'long int'
* i386-dis.c (OP_E_memory): Don't cast to signed type when
negating.
(get32, get32s): Use unsigned types in shift expressions.
This reverts commit 04c662e2b6.
In my underlying suggestion I neglected the fact that in those
cases (,%eiz,1) is the only visible indication that 32-bit
addressing is in effect.
Change
67 48 8b 1c 25 ef cd ab 89 mov 0x89abcdef(,%eiz,1),%rbx
to
67 48 8b 1c 25 ef cd ab 89 mov 0x89abcdef,%rbx
in AT&T syntax and
67 48 8b 1c 25 ef cd ab 89 mov rbx,QWORD PTR [eiz*1+0x89abcdef]
to
67 48 8b 1c 25 ef cd ab 89 mov rbx,QWORD PTR ds:0x89abcdef
in Intel syntax.
gas/
PR gas/26237
* testsuite/gas/i386/evex-no-scale-64.d: Updated.
* testsuite/gas/i386/addr32.d: Likewise.
* testsuite/gas/i386/x86-64-addr32-intel.d: Likewise.
* testsuite/gas/i386/x86-64-addr32.d: Likewise.
opcodes/
PR gas/26237
* i386-dis.c (OP_E_memory): Don't display eiz with no scale
without base nor index registers.
Irrespective of their encoding the resulting output should look the
same. Therefore wire the handling of PUSH/POP with GPR operands
encoded in the main opcode byte to the same logic used for other
operands. This frees up yet another macro character.
"Unambiguous" is is in particular taking as reference the assembler,
which also accepts certain insns - despite them allowing for varying
operand size, and hence in principle being ambiguous - without any
suffix. For example, from the very beginning of the life of x86-64 I had
trouble understanding why a plain and simple RET had to be printed as
RETQ. In case someone really used the 16-bit form, RETW disambiguates
the two quite fine.
Since the addr32 (0x67) prefix zero-extends the lower 32 bits address to
64 bits, change disassembler to zero-extend the lower 32 bits displacement
to 64 bits when there is no base nor index registers.
gas/
PR gas/26237
* testsuite/gas/i386/addr32.s: Add tests for 32-bit wrapped around
address.
* testsuite/gas/i386/x86-64-addr32.s: Likewise.
* testsuite/gas/i386/addr32.d: Updated.
* testsuite/gas/i386/x86-64-addr32-intel.d: Likewise.
* testsuite/gas/i386/x86-64-addr32.d: Likewise.
* testsuite/gas/i386/ilp32/x86-64-addr32-intel.d: Likewise.
* testsuite/gas/i386/ilp32/x86-64-addr32.d: Likewise.
opcodes/
PR gas/26237
* i386-dis.c (OP_E_memory): Without base nor index registers,
32-bit displacement to 64 bits.
%db<n> is an AT&T invention; the Intel documentation and MASM have only
ever specified DRn (in line with CRn and TRn). (In principle gas also
shouldn't accept the names in Intel mode, but at least for now I've kept
things as they are. Perhaps as a first step this should just be warned
about.)
Rm (and hence OP_R()) can be dropped by making 'Z' force modrm.mod to 3
(for OP_E()) instead of ignoring it. While at it move 'Z' handling to
its designated place (after 'Y'; 'W' handling will be moved by a later
change).
Moves to/from TRn are illegal in 64-bit mode and thus get converted to
honor this at the same time (also getting them in line with moves
to/from CRn/DRn ModRM.mod handling wise). This then also frees up the L
macro.
Rdq, Rd, and MaskR can be replaced by Edq, Ed / Rm, and MaskE
respectively, as OP_R() doesn't enforce ModRM.mod == 3, and hence where
MOD matters but hasn't been decoded yet it needs to be anyway. (The case
of converting to Rm is temporary until a subsequent change.)
In this case there's no need to go through prefix_table[] at all - the
.prefix_requirement == PREFIX_OPCODE machinery takes care of this case
already.
A couple of further adjustments are needed though:
- Gv / Ev and alike then can't be used (needs to be Gdq / Edq instead),
- dq_mode and friends shouldn't lead to PREFIX_DATA getting set in
used_prefixes.
The only valid (embedded or explicit) prefix being the data size one
(which is a fairly common pattern), avoid going through prefix_table[].
Instead extend the "required prefix" logic to also handle PREFIX_DATA
alone in a table entry, now used to identify this case. This requires
moving the (adjusted) ->prefix_requirement logic ahead of the printing
of stray prefixes, as the latter needs to observe the new setting of
PREFIX_DATA in used_prefixes.
Also add PREFIX_OPCODE on related entries when previously there was
mistakenly no decode step through prefix_table[].
It was quite odd for the prior operand handling to have to clear this
flag for the actual operand handling to print nothing. Have the actual
operand handling determine whether the operand is actually present.
With this {d,q}_scalar_swap_mode become unused and hence also get dropped.
These are only used when VEX.L or EVEX.L'L have already been decoded,
and hence the "normal" length dependent name determination is quite
fine. Adjust a few enumerators to make clear that vex_len_table[] has
been consulted; be consistent and do so for all *f128 and *i128 insns
in one go.
Fold redundant case blocks and move the extra adjustments logic into
the single case block that actually needs it - there's no need to go
through the extra logic for all the other cases. Also utilize there that
vex.b cannot be set at this point, due to earlier logic. Reduce the
comment there, which was partly stale anyway.
Unlike the earlier ones these also need their operands adjusted. Replace
the (mis-described: there's nothing "scalar" here) {b,w}_scalar_mode by
a single new mode, with the actual unit width controlled by EVEX.W.
The operands don't allow disambiguating the insn in 64-bit mode, and
hence suffixes need to be emitted not just in AT&T mode. Achieve this
by re-using %LQ while dropping PCMPESTR_Fixup().
MOVBE_Fixup() is entirely redundant with the S macro already used on the
mnemonics, leading to double suffixes in suffix-always mode. Drop the
function.
Just like other insns with GPR operands, CRC32 with only register
operands should not get a suffix added unless in suffix-always mode.
Do away with CRC32_Fixup() altogether, using other more generic logic
instead.
Unlike for non-zero values passed to USED_REX(), where rex_used gets
updated only when the respective bit was actually set in the encoding,
zero getting passed in is not further guarded, yet such a (potentially
"empty") REX prefix takes effect only when there are registers numbered
4 and up.
There's only a very limited set of modes that this function gets invoked
with - avoid it being more generic than it needs to be. This may, down
the road, allow actually doing away with the function altogether.
This eliminates a first improperly used "USED_REX (0)".
While some insns support both XOP.W based operand swapping and 256-bit
operation (XOP.L=1), many others don't support one or both.
For {L,S}LWPCB also fix the so far not decoded ModRM.mod == 3
restriction.
Take the opportunity and replace the custom OP_LWP_E() and OP_LWPCB_E()
routines by suitable other, non-custom operanbd specifiers.
Just like other VEX-encoded scalar insns do.
Besides a testcase for this behavior also introduce one to verify that
XOP scalar insns don't honor -mavxscalar=256, as they don't ignore
XOP.L.
There's no need for custom operand handling here, except for the VEX.W
controlled operand swapping and the printing of the remaining 4-bit
immediate. VEX.W can be handled just like 4-operand insns.
Also take the opportunity and drop the stray indirection through
vex_w_table[].
There's no need for custom operand handling here, except for the VEX.W
controlled operand swapping. The latter can be easily integrated into
OP_REG_VexI4().
The unnecessary XOP.L decoding had caught my eye, together with the not
really expected operand specifiers. Drop this decode step, and instead
make sure XOP.W and XOP.PP don't get ignored. For the latter, do this in
a form applicable to all XOP insns, rather than adding extra table
layers - there are no encodings with the field non-zero. Besides these
two, for the scalar forms XOP.L actually needs to also be zero.
Since we have these macros, there's no point having unnecessary table
depth.
VFPCLASSP{S,D} are now the first instance of using two %-prefixed
macros, which has pointed out a problem with the implementation. Instead
of using custom code in various case blocks, do the macro accumulation
centralized at the top of the main loop of putop(), and zap the
accumulated macros at the bottom of that loop once it has been
processed.