This change cleans up the string code in a number of ways:
- For memcpy(), fix bug in prefetch and increase distance to 3 lines;
optimize for unaligned data; do all loads before wh64 to make memcpy
safe for forward-overlapping calls; etc. Performance is improved.
- Use new copy_byte() function on tilegx to spread a single byte value
out into a full word using the shufflebytes instruction.
- Clean up header include ordering to be more canonical, and remove
spurious #undefs of function names.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This support was partially present in the existing code (look for
"__tilegx__" ifdefs) but with this change you can build a working
kernel using the TILE-Gx toolchain and ARCH=tilegx.
Most of these files are new, generally adding a foo_64.c file
where previously there was just a foo_32.c file.
The ARCH=tilegx directive redirects to arch/tile, not arch/tilegx,
using the existing SRCARCH mechanism in the top-level Makefile.
Changes to existing files:
- <asm/bitops.h> and <asm/bitops_32.h> changed to factor the
include of <asm-generic/bitops/non-atomic.h> in the common header.
- <asm/compat.h> and arch/tile/kernel/compat.c changed to remove
the "const" markers I had put on compat_sys_execve() when trying
to match some recent similar changes to the non-compat execve.
It turns out the compat version wasn't "upgraded" to use const.
- <asm/opcode-tile_64.h> and <asm/opcode_constants_64.h> were
previously included accidentally, with the 32-bit contents. Now
they have the proper 64-bit contents.
Finally, I had to hack the existing hacky drivers/input/input-compat.h
to add yet another "#ifdef" for INPUT_COMPAT_TEST (same as x86_64).
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> [drivers/input]