We were multiplying a byte by 0x0101010101010101ULL to create a
constant for SIMD ops, but the compiler isn't good at optimizing
this case (the fact that one operand is a byte is lost by the time
it would be possible to do the optimization). So instead we add
a helper routine that explicitly uses SIMD ops to create the constant.
Although this is not required by the definition of memcpy(),
in practice this sort of thing does happen, and it's easy to make
the code robust by doing nothing in this case. (Since structure
copy causes the compiler to emit a memcpy, in the case where the
target structure is the same as the destination, we were seeing
corruption.)
This patches fixes up the tile startup files, moving elf/start.S up a
directory level and implementing the required crti.S and crtn.S files
based on the old initfini.c compiler output (hand-optimized to bum a
couple of cycles).
Common code moved _itoa.h necessitating a change in the #include path.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>