https://www.openssl.org/ Is a well known cryptography library and since
freshly released version 3.2 it also supports variable digest size of
blake2b, so we can now add it among the crypto providers.
Configure with --with-crypto=openssl.
Signed-off-by: David Sterba <dsterba@suse.com>
https://botan.randombit.net/ Botan is a cryptography library with C
bindings and provides what we need (sha256 and blake2b), among many
others. Add it to the list of crypto backends if somebody wants to use
it.
Currently the version 2.19 is the latest one. Botan3 3.2.0 exists but
does not seem to be widely available in distros yet.
Configure with --with-crypto=botan.
Signed-off-by: David Sterba <dsterba@suse.com>
The accelerated crc32c needs to check for two CPU features, the crc32c
instructions is in SSE 4.2 and 'pclmulqdq' is a separate. There's still
old hardware used that does not have the PCLMUL instructions. Detect it
and make it the condition.
The pclmul is not supported on old compilers so also add a
configure-time detection and leave the SSE 4.2 only implementation as
the accelerated one if possible.
Issue: #676
Signed-off-by: David Sterba <dsterba@suse.com>
Copy faster implementation of crc32c from linux kernel as of 6.5-rc7
(x86_64, arch/x86/crypto/crc32c-pcl-intel-asm_64.S). This needs
assembler build support, so detect target architecture so
cross-compilation still works.
Add a special CPU flag so the old and new implementations can be
benchmarked and verified separately.
Sample benchmark:
CPU flags: 0x1ff
CPU features: SSE2 SSSE3 SSE41 SSE42 SHA AVX AVX2 CRC32C_PCL
Block size: 4096
Iterations: 1000000
Implementation: builtin
Units: CPU cycles
NULL-NOP: cycles: 77177218, cycles/i 77
NULL-MEMCPY: cycles: 226313072, cycles/i 226, 62133.395 MiB/s
CRC32C-ref: cycles: 24418596066, cycles/i 24418, 575.859 MiB/s
CRC32C-NI: cycles: 1188335920, cycles/i 1188, 11833.073 MiB/s
CRC32C-PCL: cycles: 463193456, cycles/i 463, 30358.037 MiB/s
XXHASH: cycles: 851606646, cycles/i 851, 16511.916 MiB/s
SHA256-ref: cycles: 74476234956, cycles/i 74476, 188.808 MiB/s
SHA256-NI: cycles: 34198637428, cycles/i 34198, 411.177 MiB/s
BLAKE2-ref: cycles: 14761411664, cycles/i 14761, 952.597 MiB/s
BLAKE2-SSE2: cycles: 18101896796, cycles/i 18101, 776.807 MiB/s
BLAKE2-SSE41: cycles: 12599091062, cycles/i 12599, 1116.087 MiB/s
BLAKE2-AVX2: cycles: 9668247506, cycles/i 9668, 1454.418 MiB/s
The new implementation is about 2.5x faster.
Note: there new version does not work on musl because of linkage
problems (relocations in .rodata), so it's still using the old
implementation.
Signed-off-by: David Sterba <dsterba@suse.com>
Change what hash-speedtest benchmarks according to the
--with-crypto=backend option. Until now it would run the same version
under different names inherited from the builting.
At configure time detect availability of all backends and define all
macros.
Signed-off-by: David Sterba <dsterba@suse.com>
The build fails with crypto backends other than builtin, the
initializers cannot be reached as they're ifdef-ed out. Move
hash_init_accel under the right condition and delete the
algorithm-specific initializers as they're used only by the hash test
and that can simply call hash_init_accel to set the implementation.
All the -m flags need to be detected at configure time and the flag used
for ifdef (HAVE_CFLAG_m*), not the actual feature defined by compiler as
the dispatcher function is not built with the -m flags.
The uname check for x86_64 must be dropped so on i386/i586 we can still
build accelerated version.
Signed-off-by: David Sterba <dsterba@suse.com>
Prepare a single location that will detect or set accelerated versions
of hash algorithms. Right now it's the crc32c, blake2 and sha256 do
an if-else switch while crc32c sets a function pointer.
Signed-off-by: David Sterba <dsterba@suse.com>
Copy sha256-x86.c from https://github.com/noloader/SHA-Intrinsics, that
uses the compiler intrinsics to implement the update step with the
native x86_64 instructions.
To avoid dependencies of the reference code and the x86 version, check
runtime support only if the compiler also supports -msha.
Signed-off-by: David Sterba <dsterba@suse.com>
Benchmark all accelerated implementations if the CPU supports them. Set
the level before each test, expecting that the implementation switches
the implementation dynamically.
Signed-off-by: David Sterba <dsterba@suse.com>
Lots of code still uses fprintf(stderr, "...") that should be the
error() helper. The kernel-shared code is left out of the conversion for
now.
Signed-off-by: David Sterba <dsterba@suse.com>
Add the GPL v2 header to files where it was missing and is not from an
external source, update to the most recent version with the address.
Signed-off-by: David Sterba <dsterba@suse.com>
Internally it's blake2b but for the user facing output or other command
line interfaces let's call it just BLAKE2.
Signed-off-by: David Sterba <dsterba@suse.com>
With explicit width the default alignment is to the right, using space
is a gnu extension. Fix the following warnings:
crypto/hash-speedtest.c: In function ‘main’:
crypto/hash-speedtest.c:152:15: warning: ' ' flag used with ‘%s’ gnu_printf format [-Wformat=]
152 | printf("% 12s: ", c->name);
| ^
crypto/hash-speedtest.c:172:21: warning: ' ' flag used with ‘%u’ gnu_printf format [-Wformat=]
172 | printf("%s: % 12llu, %s/i % 8llu",
| ^
crypto/hash-speedtest.c:172:34: warning: ' ' flag used with ‘%u’ gnu_printf format [-Wformat=]
172 | printf("%s: % 12llu, %s/i % 8llu",
| ^
Signed-off-by: David Sterba <dsterba@suse.com>
For environments that require certified implementations of cryptographic
primitives allow to select a library providing them. The requirements
are SHA256 and BLAKE2 (with the 2b variant and 256 bit digest).
For now there are two: libgrcrypt and libsodium (openssl does not
provide the BLAKE2b-256). Accellerated versions are typically provided
and automatically selected.
Signed-off-by: David Sterba <dsterba@suse.com>
A simple tool to microbenchmark performance of the hashes. Uses rdtsc
for timing, so works only on x86_64.
$ make hash-speedtest
$ ./hash-speedtest [iterations]
Block size: 4096
Iterations: 100000
NULL-NOP: cycles: 56061823, c/i 560
NULL-MEMCPY: cycles: 61296469, c/i 612
CRC32C: cycles: 179961796, c/i 1799
XXHASH: cycles: 138434590, c/i 1384
Signed-off-by: David Sterba <dsterba@suse.com>