We already had mulu2_i32 for a 32-bit host; expand this to 64-bit
hosts as well. The muls2_i32 and the 64-bit opcodes are new.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These were already present in tcg-target.c.inc,
but not in the interpreter.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When this opcode is not available in the backend, tcg middle-end
will expand this as a series of 5 opcodes. So implementing this
saves bytecode space.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This operation is critical to staying within the interpretation
loop longer, which avoids the overhead of setup and teardown for
many TBs.
The check in tcg_prologue_init is disabled because TCI does
want to use NULL to indicate exit, as opposed to branching to
a real epilogue.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This removes all of the problems with unaligned accesses
to the bytecode stream.
With an 8-bit opcode at the bottom, we have 24 bits remaining,
which are generally split into 6 4-bit slots. This fits well
with the maximum length opcodes, e.g. INDEX_op_add2_i32, which
have 6 register operands.
We have, in previous patches, rearranged things such that there
are no operations with a label which have more than one other
operand. Which leaves us with a 20-bit field in which to encode
a label, giving us a maximum TB size of 512k -- easily large.
Change the INDEX_op_tci_movi_{i32,i64} opcodes to tci_mov[il].
The former puts the immediate in the upper 20 bits of the insn,
like we do for the label displacement. The later uses a label
to reference an entry in the constant pool. Thus, in the worst
case we still have a single memory reference for any constant,
but now the constants are out-of-line of the bytecode and can
be shared between different moves saving space.
Change INDEX_op_call to use a label to reference a pair of
pointers in the constant pool. This removes the only slightly
dodgy link with the layout of struct TCGHelperInfo.
The re-encode cannot be done in pieces.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Inline it into its one caller, tci_write_reg64.
Drop the asserts that are redundant with tcg_read_r.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The encoding planned for tci does not have enough room for
brcond2, with 4 registers and a condition as input as well
as the label. Resolve the condition into TCG_REG_TMP, and
relax brcond to one register plus a label, considering the
condition to always be reg != 0.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This requires adjusting where arguments are stored.
Place them on the stack at left-aligned positions.
Adjust the stack frame to be at entirely positive offsets.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Actually print arguments as opposed to simply the opcodes
and, uselessly, the argument counts. Reuse all of the helpers
developed as part of the interpreter.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This performs the size check while reading the arguments,
which means that we don't have to arrange for it to be
done after the operation. Which tidies all of the branches.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We are currently using the "natural" size routine, which
uses 64-bits on a 64-bit host. The TCGMemOpIdx operand
has 11 bits, so we can safely reduce to 32-bits.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the correct set of asserts during code generation.
We do not require the first input to overlap the output;
the existing interpreter already supported that.
Split out tci_args_rrrbb in the translator.
Use the deposit32/64 functions rather than inline expansion.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Convert to indirect jumps, as it's less complicated.
Then we just have a pointer to the tb address at which
the chain is stored, from which we read.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not emit a uint64_t, but a tcg_target_ulong, aka uintptr_t.
This reduces the size of the constant on 32-bit hosts.
The assert for label != NULL has to be removed because that
is a valid value for exit_tb.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Begin splitting out functions that do pure argument decode,
without actually loading values from the register set.
This means that decoding need not concern itself between
input and output registers. We can assert that the register
number is in range during decode, so that it is safe to
simply dereference from regs[] later.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the next patches, we want to use tci_read_r to return
the raw register number. So rename the existing function,
which returns the register value, to tci_read_rval.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These operations are always available under different names:
INDEX_op_ext_i32_i64 and INDEX_op_extu_i32_i64, so we remove
no code with the ifdef.
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This includes bswap16 and bswap32.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This includes ext8s, ext8u, ext16s, ext16u.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This includes add, sub, mul, and, or, xor.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In all cases restricted to 64-bit hosts, tcg_read_r is
identical. We retain the 64-bit symbol for the single
case of INDEX_op_qemu_st_i64.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use explicit casts for ext32s opcodes.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use explicit casts for ext32u opcodes, and allow truncation
to happen for other users.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use explicit casts for ext16s opcodes.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use explicit casts for ext16u opcodes, and allow truncation
to happen with the store for st16 opcodes, and with the call
for bswap16 opcodes.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use explicit casts for ext8s opcodes.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use explicit casts for ext8u opcodes, and allow truncation
to happen with the store for st8 opcodes.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the provided cpu_ldst.h interfaces. This fixes the build vs
the unconverted uses of g2h(), adds missed memory trace events,
and correctly recognizes when a SIGSEGV belongs to the guest via
set_helper_retaddr().
Fixes: 3e8f1628e8
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Restrict all operands to registers. All constants will be forced
into registers by the middle-end. Removing the difference in how
immediate integers were encoded will allow more code to be shared
between 32-bit and 64-bit operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This was removed from tcg_target_reg_alloc_order and
tcg_target_call_iarg_regs on the assumption that it
was the stack. This was incorrectly copied from i386.
For tci, the stack is R15.
By adding R4 back to tcg_target_call_iarg_regs, adjust the other
entries so that 6 (or 12) entries are still present in the array,
and adjust the numbers in the interpreter.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Trivially implemented like other arithmetic.
Tested via check-tcg and the ppc64 target.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We do not simultaneously support div and div2 -- it's one
or the other. TCI is already using div, so remove div2.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Three TODO instances are never happen cases.
Other uses of tcg_abort are also indicating unreachable cases.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The existing check was incomplete:
(1) Only applied to two of the 7 stores, and not to the loads at all.
(2) Only checked the upper, but not the lower bound of the stack.
Doing this at compile time means that we don't need to do it
at runtime as well.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>