linux/arch/arm64/crypto
Eric Biggers 91a2abb78f crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS
Add a NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
for ARM64.  This is ported from the 32-bit version.  It may be useful on
devices with 64-bit ARM CPUs that lack the Cryptography Extensions and
so cannot do AES efficiently, e.g. the Cortex-A53 processor on the
Raspberry Pi 3.

It generally works the same way as the 32-bit version, but there are
some slight differences due to the different instructions, registers,
and syntax available in ARM64 vs. in ARM32.  For example, in the 64-bit
version there are enough registers to hold the XTS tweaks for each
128-byte chunk, so they don't need to be saved on the stack.
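To illustrate the tweak handling described above, here is a minimal C
sketch of how the per-block XTS tweaks for one 128-byte chunk of
Speck128-XTS could be derived. It is not the code from this commit (the
real implementation is NEON assembly in speck-neon-core.S). The names
tweak128, gf128_mul_x and chunk_tweaks are hypothetical, and the sketch
assumes the usual XTS convention of multiplying the tweak by x in
GF(2^128), reduced by x^128 + x^7 + x^2 + x + 1, for each successive
16-byte block. Speck64-XTS has 8-byte blocks and is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE       128  /* bytes handled per iteration, per the commit message */
#define BLOCK_SIZE       16   /* Speck128 block size in bytes */
#define TWEAKS_PER_CHUNK (CHUNK_SIZE / BLOCK_SIZE)  /* 8 tweaks per chunk */

/* Hypothetical type: a 128-bit XTS tweak split into two 64-bit halves. */
struct tweak128 {
    uint64_t lo;  /* least significant 64 bits */
    uint64_t hi;  /* most significant 64 bits */
};

/*
 * Multiply a tweak by x in GF(2^128), reducing modulo
 * x^128 + x^7 + x^2 + x + 1 (assumed here: the usual XTS polynomial).
 */
static struct tweak128 gf128_mul_x(struct tweak128 t)
{
    uint64_t carry = t.hi >> 63;  /* bit shifted out of the top */
    struct tweak128 r;

    r.hi = (t.hi << 1) | (t.lo >> 63);
    r.lo = (t.lo << 1) ^ (carry ? 0x87 : 0);
    return r;
}

/*
 * Fill out[0..7] with the tweaks for one 128-byte chunk, starting from
 * 'first', and return the tweak that begins the next chunk. In the
 * 64-bit assembly these eight values can stay in vector registers; the
 * array here only stands in for them.
 */
static struct tweak128 chunk_tweaks(struct tweak128 first,
                                    struct tweak128 out[TWEAKS_PER_CHUNK])
{
    struct tweak128 t = first;

    assert(out != NULL);
    for (size_t i = 0; i < TWEAKS_PER_CHUNK; i++) {
        out[i] = t;
        t = gf128_mul_x(t);
    }
    return t;
}
```

A caller would XOR out[i] into block i before and after the block
cipher, then pass the returned tweak to the next chunk.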

Benchmarks on a Raspberry Pi 3 running a 64-bit kernel:

   Algorithm                              Encryption     Decryption
   ---------                              ----------     ----------
   Speck64/128-XTS (NEON)                 92.2 MB/s      92.2 MB/s
   Speck128/256-XTS (NEON)                75.0 MB/s      75.0 MB/s
   Speck128/256-XTS (generic)             47.4 MB/s      35.6 MB/s
   AES-128-XTS (NEON bit-sliced)          33.4 MB/s      29.6 MB/s
   AES-256-XTS (NEON bit-sliced)          24.6 MB/s      21.7 MB/s

The code also performs well on higher-end ARM64 processors, though
such processors tend to have the Crypto Extensions, which make AES the
preferred choice.  For example, here are the same benchmarks run on a
HiKey960 (with CPU affinity set for the A73 cores), with the Crypto
Extensions implementation of AES-256-XTS added:

   Algorithm                              Encryption     Decryption
   ---------                              -----------    -----------
   AES-256-XTS (Crypto Extensions)        1273.3 MB/s    1274.7 MB/s
   Speck64/128-XTS (NEON)                  359.8 MB/s     348.0 MB/s
   Speck128/256-XTS (NEON)                 292.5 MB/s     286.1 MB/s
   Speck128/256-XTS (generic)              186.3 MB/s     181.8 MB/s
   AES-128-XTS (NEON bit-sliced)           142.0 MB/s     124.3 MB/s
   AES-256-XTS (NEON bit-sliced)           104.7 MB/s      91.1 MB/s

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-03-16 23:35:41 +08:00
.gitignore crypto: arm64/sha2 - add generated .S files to .gitignore 2016-11-29 16:06:56 +08:00
aes-ce-ccm-core.S crypto: arm64/aes-ce-cipher - match round key endianness with generic code 2017-08-04 09:27:19 +08:00
aes-ce-ccm-glue.c crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback 2017-08-04 09:27:21 +08:00
aes-ce-core.S crypto: arm64/aes-ce-cipher - move assembler code to .S file 2017-11-29 17:33:30 +11:00
aes-ce-glue.c crypto: arm64/aes-ce-cipher - move assembler code to .S file 2017-11-29 17:33:30 +11:00
aes-ce-setkey.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
aes-ce.S crypto: arm64/aes-ce-cipher - match round key endianness with generic code 2017-08-04 09:27:19 +08:00
aes-cipher-core.S crypto: arm64/aes-cipher - move S-box to .rodata section 2018-01-18 23:00:30 +11:00
aes-cipher-glue.c crypto: arm64/aes - add scalar implementation 2017-01-13 00:26:49 +08:00
aes-ctr-fallback.h crypto: arm64/aes-blk - add a non-SIMD fallback for synchronous CTR 2017-08-04 09:27:21 +08:00
aes-glue.c crypto: arm64/aes - do not call crypto_unregister_skcipher twice on error 2017-11-29 17:33:34 +11:00
aes-modes.S crypto: arm64/aes - add NEON/Crypto Extensions CBCMAC/CMAC/XCBC driver 2017-02-11 17:50:45 +08:00
aes-neon.S crypto: arm64/aes-neon - move literal data to .rodata section 2018-01-18 23:00:30 +11:00
aes-neonbs-core.S crypto: arm64/aes - don't use IV buffer to return final keystream block 2017-02-03 18:16:20 +08:00
aes-neonbs-glue.c crypto: arm64/aes-bs - implement non-SIMD fallback for AES-CTR 2017-08-04 09:27:22 +08:00
chacha20-neon-core.S crypto: arm64/chacha20 - implement NEON version based on SSE3 code 2017-01-13 00:26:48 +08:00
chacha20-neon-glue.c crypto: arm64/chacha20 - take may_use_simd() into account 2017-08-04 09:27:22 +08:00
crc32-ce-core.S crypto: arm64/crc32 - move literal data to .rodata section 2018-01-18 23:00:31 +11:00
crc32-ce-glue.c crypto: hash - annotate algorithms taking optional key 2018-01-12 23:03:35 +11:00
crct10dif-ce-core.S crypto: arm64/crct10dif - move literal data to .rodata section 2018-01-18 23:00:31 +11:00
crct10dif-ce-glue.c crypto: arm64/crct10dif - add non-SIMD generic fallback 2017-08-04 09:27:16 +08:00
ghash-ce-core.S crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL 2017-08-04 09:27:25 +08:00
ghash-ce-glue.c crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL 2017-08-04 09:27:25 +08:00
Kconfig crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS 2018-03-16 23:35:41 +08:00
Makefile crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS 2018-03-16 23:35:41 +08:00
sha1-ce-core.S crypto: arm64/sha1-ce - get rid of literal pool 2018-01-18 23:00:33 +11:00
sha1-ce-glue.c crypto: arm64/sha1-ce - add non-SIMD generic fallback 2017-08-04 09:27:18 +08:00
sha2-ce-core.S crypto: arm64/sha2-ce - move the round constant table to .rodata section 2018-01-18 23:00:32 +11:00
sha2-ce-glue.c crypto: arm64/sha2-ce - add non-SIMD scalar fallback 2017-08-04 09:27:19 +08:00
sha3-ce-core.S crypto: arm64/sha3 - new v8.2 Crypto Extensions implementation 2018-01-26 01:10:35 +11:00
sha3-ce-glue.c crypto: arm64/sha3 - new v8.2 Crypto Extensions implementation 2018-01-26 01:10:35 +11:00
sha256-core.S_shipped crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512 2016-11-28 19:58:05 +08:00
sha256-glue.c crypto: arm64/sha2-ce - add non-SIMD scalar fallback 2017-08-04 09:27:19 +08:00
sha512-armv8.pl crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512 2016-11-28 19:58:05 +08:00
sha512-ce-core.S crypto: arm64/sha512 - fix/improve new v8.2 Crypto Extensions code 2018-01-26 01:10:36 +11:00
sha512-ce-glue.c crypto: arm64 - implement SHA-512 using special instructions 2018-01-18 22:52:24 +11:00
sha512-core.S_shipped crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512 2016-11-28 19:58:05 +08:00
sha512-glue.c crypto: arm64/sha512 - fix/improve new v8.2 Crypto Extensions code 2018-01-26 01:10:36 +11:00
sm3-ce-core.S crypto: arm64/sm3 - new v8.2 Crypto Extensions implementation 2018-01-26 01:10:35 +11:00
sm3-ce-glue.c crypto: arm64/sm3 - new v8.2 Crypto Extensions implementation 2018-01-26 01:10:35 +11:00
speck-neon-core.S crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS 2018-03-16 23:35:41 +08:00
speck-neon-glue.c crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS 2018-03-16 23:35:41 +08:00