With VexVVVV only being boolean, the SSE shift-by-immediate instructions
don't need special casing anymore for SSE2AVX handling. Simplify the two
respective templates. (No change to generated tables.)
With the SDM long having dropped the NDS/NDD/DDS concept of identifying
encoding variants, we can finally do away with this concept as well. Of
the few consumers of the attribute, only an assertion was still checking
for a particular value, which we don't really need to retain.
When touching lines anyway, modernize other aspects as well. This often
improves similarity to adjacent lines.
The function has accumulated a number of special cases for no real
reason. Some were necessary because insn attributes (SwapSources in
particular) weren't suitably utilized instead. Note that the addition of
SwapSources actually increases consistency among the templates: Like
others which already have the attribute, these are all insns where the
VEX.VVVV-encoded register comes first (or last when looking at the SDM).
Note that the vexvvvv attribute now merely has boolean meaning, in line
with the SDM having long dropped the NDS/NDD/DDS concept of identifying
encoding variants. The fallout will be taken care of subsequently,
though, to not further clutter the change here.
As to the TILEZERO special case: If more instructions like this
appeared, a new attribute would likely be the way to go. But as long as
it's only a single insn, going from the mnemonic is cheaper.
The feature isn't universally available on 64-bit CPUs.
Note that in i386-gen.c:isa_dependencies[] I'm only adding it to models
where I'm certain the functionality exists. For Nocona and Core I'm
uncertain in particular.
While MOV to/from segment register as well as selector storing insns
already permit 32- and 64-bit GPR operands, selector loading insns and
ARPL do not. Split templates accordingly.
For shifts (but not ordinary rotates) and other cases where an immediate
describes e.g. a bit count or position, allowing negative operands is at
best confusing. An extreme example would be the two rotate-through-carry
insns, where a negative value would _not_ mean rotating the
corresponding number of bits in the other direction. To refuse such,
give meaning to the combination of Imm8 and Imm8S in templates (so far
these weren't used together anywhere). The issue was with
smallest_imm_type() blindly setting .imm8 for signed numbers determined
to fit in a byte.
VPROT{B,W,D,Q} is a little special: The rotate count there is a signed
quantity, so Imm8 is replaced by Imm8S. Adjust affected testcases
accordingly as well.
Another small adjustment to the testsuite is necessary: AAM and AAD were
never sensible to use with 0xffffff90 operands. This should have been an
error.
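The meaning given to the Imm8 plus Imm8S combination can be sketched in
C as follows. This is not the gas implementation: the flag names
TMPL_IMM8 and TMPL_IMM8S and the function imm_acceptable() are
hypothetical, and the accepted ranges are assumptions derived from the
description above (Imm8 alone keeps accepting byte-sized signed values,
the combination refuses negative ones, Imm8S alone is a signed byte).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical template flags, standing in for Imm8 and Imm8S.  */
enum { TMPL_IMM8 = 1u << 0, TMPL_IMM8S = 1u << 1 };

/* Return true if immediate V is acceptable for a template carrying
   the given flag combination (assumed semantics, see lead-in).  */
static bool
imm_acceptable (unsigned int tmpl, int64_t v)
{
  bool fits_signed = v >= -128 && v <= 127;
  bool fits_unsigned = v >= 0 && v <= 255;

  assert (tmpl & (TMPL_IMM8 | TMPL_IMM8S));

  /* Both flags: a bit count or position, so no negative values.  */
  if ((tmpl & TMPL_IMM8) && (tmpl & TMPL_IMM8S))
    return fits_unsigned;

  /* Imm8S alone: a genuinely signed byte (the VPROT* case).  */
  if (tmpl & TMPL_IMM8S)
    return fits_signed;

  /* Imm8 alone: signed values fitting in a byte are tolerated too.  */
  return fits_signed || fits_unsigned;
}
```

With these assumptions a shift template marked with both flags rejects
-1, while a template carrying only Imm8S still accepts it.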
Just like we suppress emitting REX.W for e.g. MOV from/to segment
register, there's also no need for it for LAR and LSL - these can only
ever return 32-bit values and hence always zero-extend their results
anyway.
While there also drop the redundant Word from the first operand of
the second template each - this is already implied by Reg16.
In 64-bit mode BT can have REX.W or a data size prefix dropped in
certain cases. Outside of 64-bit mode all 4 insns can have the data
size prefix dropped in certain cases.
The attribute really specifies that the sum of register and memory
operands is 4. Express it like that in most places, while using the 2nd
(apart from XOP) CPU feature flag (FMA4) in reversed operand matching
logic.
With the use in build_modrm_byte() gone, part of an assertion there
also becomes meaningless - simplify that at the same time.
With all uses of the opcode modifier field gone, also drop that.
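The "sum of register and memory operands is 4" condition can be
illustrated with a small C sketch. The operand classification enum and
the function reg_mem_operands() are hypothetical and only model the
counting; they are not taken from the assembler sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified operand classification.  */
enum op_kind { OP_REG, OP_MEM, OP_IMM };

/* Count register and memory operands, ignoring immediates.  */
static unsigned int
reg_mem_operands (const enum op_kind *ops, size_t n)
{
  unsigned int count = 0;

  assert (ops != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    if (ops[i] == OP_REG || ops[i] == OP_MEM)
      count++;

  return count;
}
```

An insn with three registers and one memory operand satisfies the
condition whether or not an immediate is present as well.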
The few XOP insns which used it were, wrongly, lacking VexVVVV.
With that added, the only further missing piece to use more generic code
elsewhere is SwapSources - see e.g. the BMI2 insns for similar operand
patterns.
With the only users gone, drop the #define as well as the special case
code.
The VPROT* forms with an immediate operand are entirely standard in the
way their ModR/M bytes are built. There's no reason to invoke special
case code. With that the handling of an immediate there can also be
dropped; it was partially bogus anyway, as in its "no memory operands"
portion it ignores the possibility of an immediate operand (which was
okay only because that case was already handled by more generic code).
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
While originally indeed used for register size checking only, the
attribute has been used for memory operand size checking as well already
for quite a while, with more such uses recently having been added.
Having a "None" field in the vast majority of entries is needlessly
cluttering the overall table. Instead of this being a separate field,
use a representation matching that of Intel SDM and AMD PM for the main
use of the field: Append the value after a / as the separator.
PR gas/29524
Having templates with a suffix explicitly present has always been
quirky. After prior adjustment all that's left to also eliminate the
anomaly from move-with-sign-extend is to consolidate the insn templates
and to make may_need_pass2() cope (plus extend testsuite coverage).
The need for them on the operand-less string insns has gone away with
the removal of maybe_adjust_templates() and associated logic. Since
i386_index_check() needs adjustment then anyway, take the opportunity
and also simplify it, possible again as a result of said removal (plus
the opcode template adjustments done here).
Having templates with a suffix explicitly present has always been
quirky. Introduce a 2nd matching pass in case the 1st one couldn't find
a suitable template _and_ didn't itself already need to trim off a
suffix to find a match at all. This requires error reporting adjustments
(albeit luckily fewer than I was afraid might be necessary), as errors
previously reported during matching now need deferring until after the
2nd pass (because, obviously, we must not emit any error if the 2nd pass
succeeds). While also related to PR gas/29524, it was requested that
move-with-sign-extend be left as broken as it always was.
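The two-pass idea with deferred error reporting can be sketched as
below. This is a toy model, not the gas code: the mnemonics "foo" and
"fool", the table layout, and the functions match_once() and assemble()
are all made up for illustration, and matching is reduced to comparing
operand counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy templates: "fool" is a mnemonic which happens to end in a
   suffix character, "foo" accepts suffixes.  Both are made up.  */
struct tmpl { const char *name; int n_operands; };

static const struct tmpl table[] = {
  { "fool", 0 },
  { "foo", 2 },
};

static bool
is_suffix (char c)
{
  return c != '\0' && strchr ("bwlq", c) != NULL;
}

/* Look for a template named M[0..LEN) taking N_OPS operands.  *KNOWN
   reports whether the mnemonic exists at all.  */
static bool
match_once (const char *m, size_t len, int n_ops, bool *known)
{
  *known = false;
  for (size_t i = 0; i < sizeof (table) / sizeof (table[0]); i++)
    if (strlen (table[i].name) == len
        && memcmp (table[i].name, m, len) == 0)
      {
        *known = true;
        if (table[i].n_operands == n_ops)
          return true;
      }
  return false;
}

/* Return NULL on success, else the error to report.  */
static const char *
assemble (const char *m, int n_ops)
{
  size_t len = strlen (m);
  bool has_suffix = len >= 2 && is_suffix (m[len - 1]);
  bool known;

  /* Pass 1: try the mnemonic as written.  */
  if (match_once (m, len, n_ops, &known))
    return NULL;

  if (!known)
    {
      /* Pass 1 itself has to trim the suffix: no 2nd pass then.  */
      if (!has_suffix)
        return "no such instruction";
      if (match_once (m, len - 1, n_ops, &known))
        return NULL;
      return known ? "operand mismatch" : "no such instruction";
    }

  /* Pass 2: the error from pass 1 is deferred, and dropped entirely
     if matching succeeds with the suffix trimmed.  */
  if (has_suffix && match_once (m, len - 1, n_ops, &known))
    return NULL;

  return "operand mismatch";
}
```

In this model "fool" with two operands fails the first pass, but no
diagnostic is emitted because the second pass finds "foo" plus an "l"
suffix.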
PR gas/29525
Note that with the dropped CMPSD and MOVSD Intel Syntax string insn
templates taking operands, mixed IsString/non-IsString template groups
(with memory operands) cannot occur anymore. With that
maybe_adjust_templates() becomes unnecessary (and is hence being
removed).
PR gas/29526
Note further that while the additions to the intel16 testcase aren't
really proper Intel syntax, we've been permitting all of those except
for the MOVD variant. The test therefore is to avoid re-introducing such
an inconsistency.
Since LAR and LSL only access 16 bits of the source operand, regardless
of operand size, allow 16-bit register source for LAR and LSL, and always
disassemble LAR and LSL with 16-bit source operand.
gas/
PR gas/29844
* testsuite/gas/i386/i386.s: Add tests for LAR and LSL.
* testsuite/gas/i386/x86_64.s: Likewise.
* testsuite/gas/i386/intelbad.s: Remove "lar/lsl eax, ax".
* testsuite/gas/i386/i386-intel.d: Updated.
* testsuite/gas/i386/i386.d: Likewise.
* testsuite/gas/i386/intel-intel.d: Likewise.
* testsuite/gas/i386/intel.d: Likewise.
* testsuite/gas/i386/intelbad.l: Likewise.
* testsuite/gas/i386/x86_64-intel.d: Likewise.
* testsuite/gas/i386/x86_64.d: Likewise.
opcodes/
PR gas/29844
* i386-dis.c (MOD_0F02): Removed.
(MOD_0F03): Likewise.
(dis386_twobyte): Restore larS and lslS.
(mod_table): Remove MOD_0F02 and MOD_0F03.
* i386-opc.tbl: Allow 16-bit register source for LAR and LSL.
* i386-tbl.h: Regenerated.
Leverage the C (commutative) attribute to also reduce the number of XCHG
and TEST templates we have. This way the reg <-> r/m (and reg <-> reg for
XCHG) forms can also be folded into a single template each, utilizing D.
With the removal of its use for FPU insns the suffix is now finally
properly misnamed. Drop its use altogether, replacing it by a separate
boolean instead.
As a comment near the top of match_template() already says: We really
only need this pseudo-suffix for far branch handling. Stop "deriving" it
for floating point insns. (Don't bother renaming the now properly
misnamed LONG_DOUBLE_MNEM_SUFFIX, to e.g. FAR_BRANCH_SUFFIX - it's going
to disappear anyway.)
At the very least a comment in process_operands() is stale. Beyond that
there are effectively two options:
1) It is possible that FADDP and FMULP were mistakenly not marked as
being in need of dealing with the compiler anomaly, and hence the
respective templates weren't removed at the time when they should
have been.
2) It is also possible that there are indeed uses known beyond compiler
generated output for these two commutative opcodes, and hence the
templates need to stay.
To be on the safe side assume 2: Update the comment and fold the
templates into their "normal" ones (utilizing D), adjusting consuming
code accordingly.
For FMULP also add a comment paralleling a similar one FADDP has.
There are just 4 templates using it, which can be easily identified by
other means, as D is set only on a very limited number of FPU templates.
Also move the respective conditional out of the code path taken by all
"reverse match" insns (it probably should have been this way already
before, to avoid the one conditional in the common case).
With this the templates which had FloatR dropped no longer differ from
their AT&T syntax + mnemonic counterparts - the only difference is now
which of the two would be recognized. For this, however, we don't need
two templates - we can simply arrange the condition for setting
Opcode_FloatR accordingly.
First of all make operand_type_register_match() apply to all sized
operands, i.e. in Intel Syntax also to respective memory ones. This
addresses gas wrongly accepting certain SIMD insns where register and
memory operand sizes should match but don't. This apparently has
affected all templates with one memory-only operand and one or more
register ones, both permitting at least two sizes, due to CheckRegSize
not taking effect.
Then also add CheckRegSize to a couple of non-SIMD templates matching
that same pattern of memory-only vs register operands. This replaces
bogus (for Intel Syntax) diagnostics referring to a wrong suffix (when
none was used at all) by "type mismatch" ones, just like already emitted
for insns where the template allows a register operand alongside a
memory one at any particular position.
This also is a prereq to limiting (ideally eliminating in the long run)
suffix "derivation" in Intel Syntax mode.
While making the code adjustment also flip order of checks to do the
cheaper one first in both cases.
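The effect of applying the size check to all sized operands, rather
than to registers only, can be modelled with the following C sketch.
It does not reproduce operand_type_register_match(); struct operand,
sizes_consistent() and the include_mem switch are hypothetical and only
contrast the old and new behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand model: size_bits == 0 means "no size given",
   as for an Intel Syntax memory operand without "... ptr".  */
struct operand { bool is_mem; unsigned int size_bits; };

/* Check that all sized operands agree in size.  With INCLUDE_MEM
   false only registers are compared (modelling the old behavior).  */
static bool
sizes_consistent (const struct operand *ops, size_t n, bool include_mem)
{
  unsigned int size = 0;

  for (size_t i = 0; i < n; i++)
    {
      /* Cheap checks first: skip operands that do not take part.  */
      if (ops[i].size_bits == 0)
        continue;
      if (ops[i].is_mem && !include_mem)
        continue;

      if (size == 0)
        size = ops[i].size_bits;
      else if (size != ops[i].size_bits)
        return false;
    }

  return true;
}
```

A 128-bit register paired with a memory operand sized as 256 bits
passes when memory operands are ignored, but is rejected once they take
part in the comparison.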
To properly and predictably determine operand size encoding (operand
size or REX.W prefixes), consistent operand sizes need to be specified.
Add CheckRegSize where this was previously missing.
Both uniformly only ever take 16-bit memory operands while at the same
time requiring matching (in size) register operands, which then also
should disassemble that way. This in particular requires splitting each
of the templates for the assembler and separating decode of the
register and memory forms in the disassembler.
Use NoSuf to replace No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf
and add the explicit NoSuf to AddrPrefixOpReg in templates.
* i386-opc.tbl (NoSuf): New macro.
(AddrPrefixOpReg): Remove No_?Suf.
Replace No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf with
NoSuf in templates.
Add NoSuf to AddrPrefixOpReg in templates.
Attributes which aren't used together in any single insn template can be
converted from individual booleans to a single enum, as was done for a few
other attributes before. This is more space efficient. Collect together
all attributes which express special operand constraints (and which fit
the criteria for folding).
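The space argument can be made concrete with a short C sketch. The
attribute names below (CONSTRAINT_A and so on) are placeholders rather
than the actual attributes, and bits_for_enum() is a hypothetical helper
which merely computes how many bits one enum-valued field needs,
assuming one extra value for "no constraint".

```c
#include <assert.h>

/* Before: one bit per attribute, although no two are ever set
   together in a single template (placeholder names).  */
struct modifiers_before
{
  unsigned int constraint_a:1;
  unsigned int constraint_b:1;
  unsigned int constraint_c:1;
  unsigned int constraint_d:1;
  unsigned int constraint_e:1;
  unsigned int constraint_f:1;
  unsigned int constraint_g:1;
};

/* After: a single enum-valued field, with 0 meaning "none".  */
enum operand_constraint
{
  NO_CONSTRAINT = 0,
  CONSTRAINT_A, CONSTRAINT_B, CONSTRAINT_C, CONSTRAINT_D,
  CONSTRAINT_E, CONSTRAINT_F, CONSTRAINT_G
};

struct modifiers_after
{
  unsigned int operand_constraint:3;
};

/* Bits needed to encode N_ATTRS mutually exclusive attributes plus
   the "none" value.  */
static unsigned int
bits_for_enum (unsigned int n_attrs)
{
  unsigned int bits = 0;

  while ((1u << bits) < n_attrs + 1)
    bits++;

  return bits;
}
```

Seven mutually exclusive booleans thus shrink from 7 bits to 3, and an
eighth attribute would need a 4th bit.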
Prior to commit 1cb0ab18ad ("x86/Intel: restrict suffix derivation")
the Tbyte modifier on the FLDT and FSTPT templates was pointless, as
No_ldSuf would have prevented it being accepted. Due to the special
nature of LONG_DOUBLE_MNEM_SUFFIX said commit, however, has led to these
insns being accepted in Intel syntax mode even when "tbyte ptr" was
present. Restore original behavior by dropping Tbyte there. (Note that
these insns in principle should be marked AT&T syntax only, but since
they haven't been so far we probably shouldn't change that.)
By putting the templates after their AVX512 counterparts, the AVX512
flavors will be picked by default. That way the need to always use {vex}
ceases to exist once respective CPU features (AVX512-VNNI or AVX512VL as
a whole) have been disabled. This way the need for the PseudoVexPrefix
attribute also disappears.
While in some cases deriving an AT&T-style suffix from an Intel syntax
memory operand size specifier is necessary, in many cases this is not
only pointless, but has led to the introduction of various workarounds:
Excessive use of IgnoreSize and NoRex64 as well as the ToDword and
ToQword attributes. Suppress suffix derivation when we can clearly tell
that the memory operand's size isn't going to be needed to infer the
possible need for the low byte/word opcode bit or an operand size prefix
(0x66 or REX.W).
As a result ToDword and ToQword can be dropped entirely, plus a fair
number of IgnoreSize and NoRex64 can also be got rid of. Note that
IgnoreSize needs to remain on legacy encoded SIMD insns with GPR
operand, to avoid emitting an operand size prefix in 16-bit mode. (Since
16-bit code using SIMD insns isn't well tested, clone an existing
testcase just enough to cover a few insns which are potentially
problematic but are being touched here.)
Note that while folding the VCVT{,T}S{S,D}2SI templates, VCVT{,T}SH2SI
isn't included there. This is to fulfill the request of not allowing L
and Q suffixes there, despite the inconsistency with VCVT{,T}S{S,D}2SI.
Now that we can purge templates, let's use this to improve readability a
little by shortening a few of their names, making functionally similar
ones also have identical names in their multiple incarnations.