Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - hwrng core now credits for low-quality RNG devices.

  Algorithms:
   - Optimisations for neon aes on arm/arm64.
   - Add accelerated crc32_be on arm64.
   - Add ffdheXYZ(dh) templates.
   - Disallow hmac keys < 112 bits in FIPS mode.
   - Add AVX assembly implementation for sm3 on x86.

  Drivers:
   - Add missing local_bh_disable calls for crypto_engine callback.
   - Ensure BH is disabled in crypto_engine callback path.
   - Fix zero length DMA mappings in ccree.
   - Add synchronization between mailbox accesses in octeontx2.
   - Add Xilinx SHA3 driver.
   - Add support for the TDES IP available on sama7g5 SoC in atmel"

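The ffdheXYZ(dh) templates mentioned above wrap the generic dh
implementation with the pre-defined RFC 7919 groups, so callers no longer
supply p and g themselves. A minimal sketch of requesting one through the
kpp API (a hedged illustration, not code from this series; error paths
trimmed, and the secret blob is packed with the group parameters left
empty, as the templates expect):

#include <crypto/dh.h>
#include <crypto/kpp.h>
#include <linux/slab.h>

/* Sketch: instantiate DH over the RFC 7919 ffdhe2048 group. */
static int ffdhe2048_example(void *privkey, unsigned int keylen)
{
	struct dh params = { .key = privkey, .key_size = keylen };
	struct crypto_kpp *tfm;
	unsigned int slen;
	void *secret;
	int err;

	tfm = crypto_alloc_kpp("ffdhe2048(dh)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	slen = crypto_dh_key_len(&params);
	secret = kmalloc(slen, GFP_KERNEL);
	err = secret ? crypto_dh_encode_key(secret, slen, &params) : -ENOMEM;
	if (!err)
		err = crypto_kpp_set_secret(tfm, secret, slen);

	kfree(secret);
	crypto_free_kpp(tfm);
	return err;
}
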
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits)
  crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST
  MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list
  crypto: dh - Remove the unused function dh_safe_prime_dh_alg()
  hwrng: nomadik - Change clk_disable to clk_disable_unprepare
  crypto: arm64 - cleanup comments
  crypto: qat - fix initialization of pfvf rts_map_msg structures
  crypto: qat - fix initialization of pfvf cap_msg structures
  crypto: qat - remove unneeded assignment
  crypto: qat - disable registration of algorithms
  crypto: hisilicon/qm - fix memset during queues clearing
  crypto: xilinx: prevent probing on non-xilinx hardware
  crypto: marvell/octeontx - Use swap() instead of open coding it
  crypto: ccree - Fix use after free in cc_cipher_exit()
  crypto: ccp - ccp_dmaengine_unregister release dma channels
  crypto: octeontx2 - fix missing unlock
  hwrng: cavium - fix NULL but dereferenced coccicheck error
  crypto: cavium/nitrox - don't cast parameter in bit operations
  crypto: vmx - add missing dependencies
  MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver
  crypto: xilinx - Add Xilinx SHA3 driver
  ...
Merged by Linus Torvalds, 2022-03-21 16:02:36 -07:00
commit 93e220a62d
147 changed files with 5679 additions and 1675 deletions

Documentation/ABI/testing/debugfs-hisi-hpre

@ -27,6 +27,16 @@ Description: One HPRE controller has one PF and multiple VFs, each function
has a QM. Select the QM which below qm refers to.
Only available for PF.
What: /sys/kernel/debug/hisi_hpre/<bdf>/alg_qos
Date: Jun 2021
Contact: linux-crypto@vger.kernel.org
Description: The <bdf> identifies the PF or VF function.
The HPRE driver supports configuring each function's QoS: in the
host, write the target function's <bdf> and a qos value to
alg_qos, e.g. "echo <bdf> <value> > alg_qos". The qos value
ranges from 1 to 1000, meaning 1/1000 to 1000/1000 of the total
QoS. Reading alg_qos returns the function's current QoS in both
the host and the VM, e.g. "cat alg_qos".
What: /sys/kernel/debug/hisi_hpre/<bdf>/regs
Date: Sep 2019
Contact: linux-crypto@vger.kernel.org

Documentation/ABI/testing/debugfs-hisi-sec

@ -14,6 +14,16 @@ Description: One SEC controller has one PF and multiple VFs, each function
qm refers to.
Only available for PF.
What: /sys/kernel/debug/hisi_sec2/<bdf>/alg_qos
Date: Jun 2021
Contact: linux-crypto@vger.kernel.org
Description: The <bdf> identifies the PF or VF function.
The SEC driver supports configuring each function's QoS: in the
host, write the target function's <bdf> and a qos value to
alg_qos, e.g. "echo <bdf> <value> > alg_qos". The qos value
ranges from 1 to 1000, meaning 1/1000 to 1000/1000 of the total
QoS. Reading alg_qos returns the function's current QoS in both
the host and the VM, e.g. "cat alg_qos".
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/qm_regs
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org

Documentation/ABI/testing/debugfs-hisi-zip

@ -26,6 +26,16 @@ Description: One ZIP controller has one PF and multiple VFs, each function
has a QM. Select the QM which below qm refers to.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/alg_qos
Date: Jun 2021
Contact: linux-crypto@vger.kernel.org
Description: The <bdf> identifies the PF or VF function.
The ZIP driver supports configuring each function's QoS: in the
host, write the target function's <bdf> and a qos value to
alg_qos, e.g. "echo <bdf> <value> > alg_qos". The qos value
ranges from 1 to 1000, meaning 1/1000 to 1000/1000 of the total
QoS. Reading alg_qos returns the function's current QoS in both
the host and the VM, e.g. "cat alg_qos".
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org

MAINTAINERS

@ -8644,7 +8644,7 @@ S: Maintained
F: drivers/gpio/gpio-hisi.c
HISILICON HIGH PERFORMANCE RSA ENGINE DRIVER (HPRE)
M: Zaibo Xu <xuzaibo@huawei.com>
M: Longfang Liu <liulongfang@huawei.com>
L: linux-crypto@vger.kernel.org
S: Maintained
F: Documentation/ABI/testing/debugfs-hisi-hpre
@ -8724,8 +8724,8 @@ F: Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
F: drivers/scsi/hisi_sas/
HISILICON SECURITY ENGINE V2 DRIVER (SEC2)
M: Zaibo Xu <xuzaibo@huawei.com>
M: Kai Ye <yekai13@huawei.com>
M: Longfang Liu <liulongfang@huawei.com>
L: linux-crypto@vger.kernel.org
S: Maintained
F: Documentation/ABI/testing/debugfs-hisi-sec
@ -8756,7 +8756,7 @@ F: Documentation/devicetree/bindings/mfd/hisilicon,hi6421-spmi-pmic.yaml
F: drivers/mfd/hi6421-spmi-pmic.c
HISILICON TRUE RANDOM NUMBER GENERATOR V2 SUPPORT
M: Zaibo Xu <xuzaibo@huawei.com>
M: Weili Qian <qianweili@huawei.com>
S: Maintained
F: drivers/crypto/hisilicon/trng/trng.c
@ -21302,6 +21302,11 @@ T: git https://github.com/Xilinx/linux-xlnx.git
F: Documentation/devicetree/bindings/phy/xlnx,zynqmp-psgtr.yaml
F: drivers/phy/xilinx/phy-zynqmp.c
XILINX ZYNQMP SHA3 DRIVER
M: Harsha <harsha.harsha@xilinx.com>
S: Maintained
F: drivers/crypto/xilinx/zynqmp-sha.c
XILINX EVENT MANAGEMENT DRIVER
M: Abhyuday Godhasara <abhyuday.godhasara@xilinx.com>
S: Maintained

arch/alpha/include/asm/xor.h

@ -5,24 +5,43 @@
* Optimized RAID-5 checksumming functions for alpha EV5 and EV6
*/
extern void xor_alpha_2(unsigned long, unsigned long *, unsigned long *);
extern void xor_alpha_3(unsigned long, unsigned long *, unsigned long *,
unsigned long *);
extern void xor_alpha_4(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
extern void xor_alpha_5(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *, unsigned long *);
extern void
xor_alpha_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
extern void
xor_alpha_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
extern void
xor_alpha_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
extern void
xor_alpha_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
extern void xor_alpha_prefetch_2(unsigned long, unsigned long *,
unsigned long *);
extern void xor_alpha_prefetch_3(unsigned long, unsigned long *,
unsigned long *, unsigned long *);
extern void xor_alpha_prefetch_4(unsigned long, unsigned long *,
unsigned long *, unsigned long *,
unsigned long *);
extern void xor_alpha_prefetch_5(unsigned long, unsigned long *,
unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
extern void
xor_alpha_prefetch_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
extern void
xor_alpha_prefetch_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
extern void
xor_alpha_prefetch_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
extern void
xor_alpha_prefetch_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
asm(" \n\
.text \n\

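The same signature change repeats across every architecture's xor
templates in this pull: the source operands become const and all pointers
gain __restrict, which tells the compiler the buffers cannot alias and
frees it to vectorize the XOR loops. A contrived illustration of the
pattern (xor_generic is a made-up name, not a kernel symbol):

#include <stddef.h>

/* With __restrict the compiler may load, xor and store whole vectors
 * per iteration; without it, it must assume a store through p1 could
 * modify *p2 and fall back to scalar element-by-element code. */
static void xor_generic(size_t bytes, unsigned long *__restrict p1,
			const unsigned long *__restrict p2)
{
	size_t n = bytes / sizeof(unsigned long);

	while (n--)
		*p1++ ^= *p2++;
}
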
arch/arm/crypto/aes-neonbs-core.S

@ -758,29 +758,24 @@ ENTRY(aesbs_cbc_decrypt)
ENDPROC(aesbs_cbc_decrypt)
.macro next_ctr, q
vmov.32 \q\()h[1], r10
vmov \q\()h, r9, r10
adds r10, r10, #1
vmov.32 \q\()h[0], r9
adcs r9, r9, #0
vmov.32 \q\()l[1], r8
vmov \q\()l, r7, r8
adcs r8, r8, #0
vmov.32 \q\()l[0], r7
adc r7, r7, #0
vrev32.8 \q, \q
.endm
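
The reworked next_ctr macro keeps the block counter in the
general-purpose registers r7-r10 as a 128-bit big-endian value,
propagates the carry with an adds/adcs/adc chain, and only then moves it
into the NEON register and byte-swaps it with vrev32.8. In C terms the
increment is roughly (an illustrative sketch, not kernel code):

#include <stdint.h>

/* 128-bit counter increment with carry propagation, mirroring the
 * adds/adcs/adc chain above; hi/lo are illustrative names. */
static void ctr128_inc(uint64_t *hi, uint64_t *lo)
{
	if (++*lo == 0)		/* carry out of the low 64 bits */
		++*hi;
}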
/*
* aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
* int rounds, int blocks, u8 ctr[], u8 final[])
* int rounds, int bytes, u8 ctr[])
*/
ENTRY(aesbs_ctr_encrypt)
mov ip, sp
push {r4-r10, lr}
ldm ip, {r5-r7} // load args 4-6
teq r7, #0
addne r5, r5, #1 // one extra block if final != 0
ldm ip, {r5, r6} // load args 4-5
vld1.8 {q0}, [r6] // load counter
vrev32.8 q1, q0
vmov r9, r10, d3
@ -792,20 +787,19 @@ ENTRY(aesbs_ctr_encrypt)
adc r7, r7, #0
99: vmov q1, q0
vmov q2, q0
vmov q3, q0
vmov q4, q0
vmov q5, q0
vmov q6, q0
vmov q7, q0
adr ip, 0f
sub lr, r5, #1
and lr, lr, #7
cmp r5, #8
sub ip, ip, lr, lsl #5
sub ip, ip, lr, lsl #2
movlt pc, ip // computed goto if blocks < 8
vmov q2, q0
adr ip, 0f
vmov q3, q0
and lr, lr, #112
vmov q4, q0
cmp r5, #112
vmov q5, q0
sub ip, ip, lr, lsl #1
vmov q6, q0
add ip, ip, lr, lsr #2
vmov q7, q0
movle pc, ip // computed goto if bytes < 112
next_ctr q1
next_ctr q2
@ -820,12 +814,14 @@ ENTRY(aesbs_ctr_encrypt)
bl aesbs_encrypt8
adr ip, 1f
and lr, r5, #7
cmp r5, #8
movgt r4, #0
ldrle r4, [sp, #40] // load final in the last round
sub ip, ip, lr, lsl #2
movlt pc, ip // computed goto if blocks < 8
sub lr, r5, #1
cmp r5, #128
bic lr, lr, #15
ands r4, r5, #15 // preserves C flag
teqcs r5, r5 // set Z flag if not last iteration
sub ip, ip, lr, lsr #2
rsb r4, r4, #16
movcc pc, ip // computed goto if bytes < 128
vld1.8 {q8}, [r1]!
vld1.8 {q9}, [r1]!
@ -834,46 +830,70 @@ ENTRY(aesbs_ctr_encrypt)
vld1.8 {q12}, [r1]!
vld1.8 {q13}, [r1]!
vld1.8 {q14}, [r1]!
teq r4, #0 // skip last block if 'final'
1: bne 2f
1: subne r1, r1, r4
vld1.8 {q15}, [r1]!
2: adr ip, 3f
cmp r5, #8
sub ip, ip, lr, lsl #3
movlt pc, ip // computed goto if blocks < 8
add ip, ip, #2f - 1b
veor q0, q0, q8
vst1.8 {q0}, [r0]!
veor q1, q1, q9
vst1.8 {q1}, [r0]!
veor q4, q4, q10
vst1.8 {q4}, [r0]!
veor q6, q6, q11
vst1.8 {q6}, [r0]!
veor q3, q3, q12
vst1.8 {q3}, [r0]!
veor q7, q7, q13
vst1.8 {q7}, [r0]!
veor q2, q2, q14
bne 3f
veor q5, q5, q15
movcc pc, ip // computed goto if bytes < 128
vst1.8 {q0}, [r0]!
vst1.8 {q1}, [r0]!
vst1.8 {q4}, [r0]!
vst1.8 {q6}, [r0]!
vst1.8 {q3}, [r0]!
vst1.8 {q7}, [r0]!
vst1.8 {q2}, [r0]!
teq r4, #0 // skip last block if 'final'
W(bne) 5f
3: veor q5, q5, q15
2: subne r0, r0, r4
vst1.8 {q5}, [r0]!
4: next_ctr q0
next_ctr q0
subs r5, r5, #8
subs r5, r5, #128
bgt 99b
vst1.8 {q0}, [r6]
pop {r4-r10, pc}
5: vst1.8 {q5}, [r4]
b 4b
3: adr lr, .Lpermute_table + 16
cmp r5, #16 // Z flag remains cleared
sub lr, lr, r4
vld1.8 {q8-q9}, [lr]
vtbl.8 d16, {q5}, d16
vtbl.8 d17, {q5}, d17
veor q5, q8, q15
bcc 4f // have to reload prev if R5 < 16
vtbx.8 d10, {q2}, d18
vtbx.8 d11, {q2}, d19
mov pc, ip // branch back to VST sequence
4: sub r0, r0, r4
vshr.s8 q9, q9, #7 // create mask for VBIF
vld1.8 {q8}, [r0] // reload
vbif q5, q8, q9
vst1.8 {q5}, [r0]
pop {r4-r10, pc}
ENDPROC(aesbs_ctr_encrypt)
.align 6
.Lpermute_table:
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.macro next_tweak, out, in, const, tmp
vshr.s64 \tmp, \in, #63
vand \tmp, \tmp, \const
@ -888,6 +908,7 @@ ENDPROC(aesbs_ctr_encrypt)
* aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int blocks, u8 iv[], int reorder_last_tweak)
*/
.align 6
__xts_prepare8:
vld1.8 {q14}, [r7] // load iv
vmov.i32 d30, #0x87 // compose tweak mask vector

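The next_tweak macro above is the XTS tweak schedule: it multiplies the
running tweak by x in GF(2^128), folding the carry back in through the
reduction polynomial x^128 = x^7 + x^2 + x + 1 (the 0x87 constant). A C
sketch of that doubling step (assuming the tweak is held as a
little-endian pair of 64-bit limbs; the function name is illustrative):

#include <stdint.h>

/* tweak <<= 1 in GF(2^128): shift the 128-bit value left by one bit
 * and xor 0x87 into the low limb when the top bit falls off. */
static void xts_next_tweak(uint64_t t[2])	/* t[0] low, t[1] high */
{
	uint64_t carry = (uint64_t)((int64_t)t[1] >> 63) & 0x87;

	t[1] = (t[1] << 1) | (t[0] >> 63);
	t[0] = (t[0] << 1) ^ carry;
}
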
arch/arm/crypto/aes-neonbs-glue.c

@ -37,7 +37,7 @@ asmlinkage void aesbs_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 ctr[], u8 final[]);
int rounds, int blocks, u8 ctr[]);
asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[], int);
@ -243,32 +243,25 @@ static int ctr_encrypt(struct skcipher_request *req)
err = skcipher_walk_virt(&walk, req, false);
while (walk.nbytes > 0) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL;
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
int bytes = walk.nbytes;
if (walk.nbytes < walk.total) {
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
final = NULL;
}
if (unlikely(bytes < AES_BLOCK_SIZE))
src = dst = memcpy(buf + sizeof(buf) - bytes,
src, bytes);
else if (walk.nbytes < walk.total)
bytes &= ~(8 * AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->rk, ctx->rounds, blocks, walk.iv, final);
aesbs_ctr_encrypt(dst, src, ctx->rk, ctx->rounds, bytes, walk.iv);
kernel_neon_end();
if (final) {
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
if (unlikely(bytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr,
buf + sizeof(buf) - bytes, bytes);
crypto_xor_cpy(dst, src, final,
walk.total % AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, 0);
break;
}
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, walk.nbytes - bytes);
}
return err;

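Both this routine and the arm64 glue further down use the same trick for
a tail shorter than one block: the partial data is staged at the end of a
block-sized stack buffer, processed in place, and copied back out, so the
assembly only ever operates on whole blocks. A condensed sketch, where
ctr_core is a hypothetical stand-in for the real cipher entry point:

#include <string.h>

#define AES_BLOCK_SIZE 16

/* Stage a sub-block tail at the END of buf so the cipher core can keep
 * using whole-block loads and stores (names here are illustrative). */
static void ctr_tail(void (*ctr_core)(unsigned char *dst,
				      const unsigned char *src, int bytes),
		     unsigned char *dst, const unsigned char *src, int bytes)
{
	unsigned char buf[AES_BLOCK_SIZE];
	unsigned char *p = buf + sizeof(buf) - bytes;

	memcpy(p, src, bytes);		/* stage the input */
	ctr_core(p, p, bytes);		/* in place, as src == dst above */
	memcpy(dst, p, bytes);		/* unstage the output */
}
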
arch/arm/include/asm/xor.h

@ -44,7 +44,8 @@
: "0" (dst), "r" (a1), "r" (a2), "r" (a3), "r" (a4))
static void
xor_arm4regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_arm4regs_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned int lines = bytes / sizeof(unsigned long) / 4;
register unsigned int a1 __asm__("r4");
@ -64,8 +65,9 @@ xor_arm4regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_arm4regs_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_arm4regs_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned int lines = bytes / sizeof(unsigned long) / 4;
register unsigned int a1 __asm__("r4");
@ -86,8 +88,10 @@ xor_arm4regs_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_arm4regs_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_arm4regs_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned int lines = bytes / sizeof(unsigned long) / 2;
register unsigned int a1 __asm__("r8");
@ -105,8 +109,11 @@ xor_arm4regs_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_arm4regs_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_arm4regs_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned int lines = bytes / sizeof(unsigned long) / 2;
register unsigned int a1 __asm__("r8");
@ -146,7 +153,8 @@ static struct xor_block_template xor_block_arm4regs = {
extern struct xor_block_template const xor_block_neon_inner;
static void
xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
if (in_interrupt()) {
xor_arm4regs_2(bytes, p1, p2);
@ -158,8 +166,9 @@ xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_neon_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
if (in_interrupt()) {
xor_arm4regs_3(bytes, p1, p2, p3);
@ -171,8 +180,10 @@ xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_neon_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
if (in_interrupt()) {
xor_arm4regs_4(bytes, p1, p2, p3, p4);
@ -184,8 +195,11 @@ xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_neon_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_neon_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
if (in_interrupt()) {
xor_arm4regs_5(bytes, p1, p2, p3, p4, p5);

arch/arm/lib/xor-neon.c

@ -17,17 +17,11 @@ MODULE_LICENSE("GPL");
/*
* Pull in the reference implementations while instructing GCC (through
* -ftree-vectorize) to attempt to exploit implicit parallelism and emit
* NEON instructions.
* NEON instructions. Clang does this by default at O2 so no pragma is
* needed.
*/
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
#ifdef CONFIG_CC_IS_GCC
#pragma GCC optimize "tree-vectorize"
#else
/*
* While older versions of GCC do not generate incorrect code, they fail to
* recognize the parallel nature of these functions, and emit plain ARM code,
* which is known to be slower than the optimized ARM code in asm-arm/xor.h.
*/
#warning This code requires at least version 4.6 of GCC
#endif
#pragma GCC diagnostic ignored "-Wunused-variable"

arch/arm64/crypto/Kconfig

@ -45,7 +45,7 @@ config CRYPTO_SM3_ARM64_CE
tristate "SM3 digest algorithm (ARMv8.2 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_HASH
select CRYPTO_SM3
select CRYPTO_LIB_SM3
config CRYPTO_SM4_ARM64_CE
tristate "SM4 symmetric cipher (ARMv8.2 Crypto Extensions)"

arch/arm64/crypto/aes-glue.c

@ -24,7 +24,6 @@
#ifdef USE_V8_CRYPTO_EXTENSIONS
#define MODE "ce"
#define PRIO 300
#define STRIDE 5
#define aes_expandkey ce_aes_expandkey
#define aes_ecb_encrypt ce_aes_ecb_encrypt
#define aes_ecb_decrypt ce_aes_ecb_decrypt
@ -42,7 +41,6 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#else
#define MODE "neon"
#define PRIO 200
#define STRIDE 4
#define aes_ecb_encrypt neon_aes_ecb_encrypt
#define aes_ecb_decrypt neon_aes_ecb_decrypt
#define aes_cbc_encrypt neon_aes_cbc_encrypt
@ -89,7 +87,7 @@ asmlinkage void aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 const iv[]);
asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 ctr[], u8 finalbuf[]);
int rounds, int bytes, u8 ctr[]);
asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int bytes, u32 const rk2[], u8 iv[],
@ -458,26 +456,21 @@ static int __maybe_unused ctr_encrypt(struct skcipher_request *req)
unsigned int nbytes = walk.nbytes;
u8 *dst = walk.dst.virt.addr;
u8 buf[AES_BLOCK_SIZE];
unsigned int tail;
if (unlikely(nbytes < AES_BLOCK_SIZE))
src = memcpy(buf, src, nbytes);
src = dst = memcpy(buf + sizeof(buf) - nbytes,
src, nbytes);
else if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
walk.iv, buf);
walk.iv);
kernel_neon_end();
tail = nbytes % (STRIDE * AES_BLOCK_SIZE);
if (tail > 0 && tail < AES_BLOCK_SIZE)
/*
* The final partial block could not be returned using
* an overlapping store, so it was passed via buf[]
* instead.
*/
memcpy(dst + nbytes - tail, buf, tail);
if (unlikely(nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr,
buf + sizeof(buf) - nbytes, nbytes);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
@ -983,6 +976,7 @@ module_cpu_feature_match(AES, aes_init);
module_init(aes_init);
EXPORT_SYMBOL(neon_aes_ecb_encrypt);
EXPORT_SYMBOL(neon_aes_cbc_encrypt);
EXPORT_SYMBOL(neon_aes_ctr_encrypt);
EXPORT_SYMBOL(neon_aes_xts_encrypt);
EXPORT_SYMBOL(neon_aes_xts_decrypt);
#endif

arch/arm64/crypto/aes-modes.S

@ -321,7 +321,7 @@ AES_FUNC_END(aes_cbc_cts_decrypt)
/*
* aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int bytes, u8 ctr[], u8 finalbuf[])
* int bytes, u8 ctr[])
*/
AES_FUNC_START(aes_ctr_encrypt)
@ -414,8 +414,8 @@ ST5( st1 {v4.16b}, [x0], #16 )
.Lctrtail:
/* XOR up to MAX_STRIDE * 16 - 1 bytes of in/output with v0 ... v3/v4 */
mov x16, #16
ands x13, x4, #0xf
csel x13, x13, x16, ne
ands x6, x4, #0xf
csel x13, x6, x16, ne
ST5( cmp w4, #64 - (MAX_STRIDE << 4) )
ST5( csel x14, x16, xzr, gt )
@ -424,10 +424,10 @@ ST5( csel x14, x16, xzr, gt )
cmp w4, #32 - (MAX_STRIDE << 4)
csel x16, x16, xzr, gt
cmp w4, #16 - (MAX_STRIDE << 4)
ble .Lctrtail1x
adr_l x12, .Lcts_permute_table
add x12, x12, x13
ble .Lctrtail1x
ST5( ld1 {v5.16b}, [x1], x14 )
ld1 {v6.16b}, [x1], x15
@ -462,11 +462,19 @@ ST5( st1 {v5.16b}, [x0], x14 )
b .Lctrout
.Lctrtail1x:
csel x0, x0, x6, eq // use finalbuf if less than a full block
sub x7, x6, #16
csel x6, x6, x7, eq
add x1, x1, x6
add x0, x0, x6
ld1 {v5.16b}, [x1]
ld1 {v6.16b}, [x0]
ST5( mov v3.16b, v4.16b )
encrypt_block v3, w3, x2, x8, w7
ld1 {v10.16b-v11.16b}, [x12]
tbl v3.16b, {v3.16b}, v10.16b
sshr v11.16b, v11.16b, #7
eor v5.16b, v5.16b, v3.16b
bif v5.16b, v6.16b, v11.16b
st1 {v5.16b}, [x0]
b .Lctrout
AES_FUNC_END(aes_ctr_encrypt)

arch/arm64/crypto/aes-neonbs-core.S

@ -735,119 +735,67 @@ SYM_FUNC_END(aesbs_cbc_decrypt)
* int blocks, u8 iv[])
*/
SYM_FUNC_START_LOCAL(__xts_crypt8)
mov x6, #1
lsl x6, x6, x23
subs w23, w23, #8
csel x23, x23, xzr, pl
csel x6, x6, xzr, mi
movi v18.2s, #0x1
movi v19.2s, #0x87
uzp1 v18.4s, v18.4s, v19.4s
ld1 {v0.16b-v3.16b}, [x1], #64
ld1 {v4.16b-v7.16b}, [x1], #64
next_tweak v26, v25, v18, v19
next_tweak v27, v26, v18, v19
next_tweak v28, v27, v18, v19
next_tweak v29, v28, v18, v19
next_tweak v30, v29, v18, v19
next_tweak v31, v30, v18, v19
next_tweak v16, v31, v18, v19
next_tweak v17, v16, v18, v19
ld1 {v0.16b}, [x20], #16
next_tweak v26, v25, v30, v31
eor v0.16b, v0.16b, v25.16b
tbnz x6, #1, 0f
ld1 {v1.16b}, [x20], #16
next_tweak v27, v26, v30, v31
eor v1.16b, v1.16b, v26.16b
tbnz x6, #2, 0f
ld1 {v2.16b}, [x20], #16
next_tweak v28, v27, v30, v31
eor v2.16b, v2.16b, v27.16b
tbnz x6, #3, 0f
ld1 {v3.16b}, [x20], #16
next_tweak v29, v28, v30, v31
eor v3.16b, v3.16b, v28.16b
tbnz x6, #4, 0f
ld1 {v4.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset]
eor v4.16b, v4.16b, v29.16b
next_tweak v29, v29, v30, v31
tbnz x6, #5, 0f
eor v5.16b, v5.16b, v30.16b
eor v6.16b, v6.16b, v31.16b
eor v7.16b, v7.16b, v16.16b
ld1 {v5.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset + 16]
eor v5.16b, v5.16b, v29.16b
next_tweak v29, v29, v30, v31
tbnz x6, #6, 0f
stp q16, q17, [sp, #16]
ld1 {v6.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset + 32]
eor v6.16b, v6.16b, v29.16b
next_tweak v29, v29, v30, v31
tbnz x6, #7, 0f
ld1 {v7.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset + 48]
eor v7.16b, v7.16b, v29.16b
next_tweak v29, v29, v30, v31
0: mov bskey, x21
mov rounds, x22
mov bskey, x2
mov rounds, x3
br x16
SYM_FUNC_END(__xts_crypt8)
.macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
frame_push 6, 64
stp x29, x30, [sp, #-48]!
mov x29, sp
mov x19, x0
mov x20, x1
mov x21, x2
mov x22, x3
mov x23, x4
mov x24, x5
ld1 {v25.16b}, [x5]
movi v30.2s, #0x1
movi v25.2s, #0x87
uzp1 v30.4s, v30.4s, v25.4s
ld1 {v25.16b}, [x24]
99: adr x16, \do8
0: adr x16, \do8
bl __xts_crypt8
ldp q16, q17, [sp, #.Lframe_local_offset]
ldp q18, q19, [sp, #.Lframe_local_offset + 32]
eor v16.16b, \o0\().16b, v25.16b
eor v17.16b, \o1\().16b, v26.16b
eor v18.16b, \o2\().16b, v27.16b
eor v19.16b, \o3\().16b, v28.16b
eor \o0\().16b, \o0\().16b, v25.16b
eor \o1\().16b, \o1\().16b, v26.16b
eor \o2\().16b, \o2\().16b, v27.16b
eor \o3\().16b, \o3\().16b, v28.16b
ldp q24, q25, [sp, #16]
st1 {\o0\().16b}, [x19], #16
mov v25.16b, v26.16b
tbnz x6, #1, 1f
st1 {\o1\().16b}, [x19], #16
mov v25.16b, v27.16b
tbnz x6, #2, 1f
st1 {\o2\().16b}, [x19], #16
mov v25.16b, v28.16b
tbnz x6, #3, 1f
st1 {\o3\().16b}, [x19], #16
mov v25.16b, v29.16b
tbnz x6, #4, 1f
eor v20.16b, \o4\().16b, v29.16b
eor v21.16b, \o5\().16b, v30.16b
eor v22.16b, \o6\().16b, v31.16b
eor v23.16b, \o7\().16b, v24.16b
eor \o4\().16b, \o4\().16b, v16.16b
eor \o5\().16b, \o5\().16b, v17.16b
eor \o6\().16b, \o6\().16b, v18.16b
eor \o7\().16b, \o7\().16b, v19.16b
st1 {v16.16b-v19.16b}, [x0], #64
st1 {v20.16b-v23.16b}, [x0], #64
st1 {\o4\().16b}, [x19], #16
tbnz x6, #5, 1f
st1 {\o5\().16b}, [x19], #16
tbnz x6, #6, 1f
st1 {\o6\().16b}, [x19], #16
tbnz x6, #7, 1f
st1 {\o7\().16b}, [x19], #16
subs x4, x4, #8
b.gt 0b
cbz x23, 1f
st1 {v25.16b}, [x24]
b 99b
1: st1 {v25.16b}, [x24]
frame_pop
st1 {v25.16b}, [x5]
ldp x29, x30, [sp], #48
ret
.endm
@ -869,133 +817,51 @@ SYM_FUNC_END(aesbs_xts_decrypt)
/*
* aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
* int rounds, int blocks, u8 iv[], u8 final[])
* int rounds, int blocks, u8 iv[])
*/
SYM_FUNC_START(aesbs_ctr_encrypt)
frame_push 8
stp x29, x30, [sp, #-16]!
mov x29, sp
mov x19, x0
mov x20, x1
mov x21, x2
mov x22, x3
mov x23, x4
mov x24, x5
mov x25, x6
cmp x25, #0
cset x26, ne
add x23, x23, x26 // do one extra block if final
ldp x7, x8, [x24]
ld1 {v0.16b}, [x24]
ldp x7, x8, [x5]
ld1 {v0.16b}, [x5]
CPU_LE( rev x7, x7 )
CPU_LE( rev x8, x8 )
adds x8, x8, #1
adc x7, x7, xzr
99: mov x9, #1
lsl x9, x9, x23
subs w23, w23, #8
csel x23, x23, xzr, pl
csel x9, x9, xzr, le
tbnz x9, #1, 0f
next_ctr v1
tbnz x9, #2, 0f
0: next_ctr v1
next_ctr v2
tbnz x9, #3, 0f
next_ctr v3
tbnz x9, #4, 0f
next_ctr v4
tbnz x9, #5, 0f
next_ctr v5
tbnz x9, #6, 0f
next_ctr v6
tbnz x9, #7, 0f
next_ctr v7
0: mov bskey, x21
mov rounds, x22
mov bskey, x2
mov rounds, x3
bl aesbs_encrypt8
lsr x9, x9, x26 // disregard the extra block
tbnz x9, #0, 0f
ld1 { v8.16b-v11.16b}, [x1], #64
ld1 {v12.16b-v15.16b}, [x1], #64
ld1 {v8.16b}, [x20], #16
eor v0.16b, v0.16b, v8.16b
st1 {v0.16b}, [x19], #16
tbnz x9, #1, 1f
eor v8.16b, v0.16b, v8.16b
eor v9.16b, v1.16b, v9.16b
eor v10.16b, v4.16b, v10.16b
eor v11.16b, v6.16b, v11.16b
eor v12.16b, v3.16b, v12.16b
eor v13.16b, v7.16b, v13.16b
eor v14.16b, v2.16b, v14.16b
eor v15.16b, v5.16b, v15.16b
ld1 {v9.16b}, [x20], #16
eor v1.16b, v1.16b, v9.16b
st1 {v1.16b}, [x19], #16
tbnz x9, #2, 2f
st1 { v8.16b-v11.16b}, [x0], #64
st1 {v12.16b-v15.16b}, [x0], #64
ld1 {v10.16b}, [x20], #16
eor v4.16b, v4.16b, v10.16b
st1 {v4.16b}, [x19], #16
tbnz x9, #3, 3f
next_ctr v0
subs x4, x4, #8
b.gt 0b
ld1 {v11.16b}, [x20], #16
eor v6.16b, v6.16b, v11.16b
st1 {v6.16b}, [x19], #16
tbnz x9, #4, 4f
ld1 {v12.16b}, [x20], #16
eor v3.16b, v3.16b, v12.16b
st1 {v3.16b}, [x19], #16
tbnz x9, #5, 5f
ld1 {v13.16b}, [x20], #16
eor v7.16b, v7.16b, v13.16b
st1 {v7.16b}, [x19], #16
tbnz x9, #6, 6f
ld1 {v14.16b}, [x20], #16
eor v2.16b, v2.16b, v14.16b
st1 {v2.16b}, [x19], #16
tbnz x9, #7, 7f
ld1 {v15.16b}, [x20], #16
eor v5.16b, v5.16b, v15.16b
st1 {v5.16b}, [x19], #16
8: next_ctr v0
st1 {v0.16b}, [x24]
cbz x23, .Lctr_done
b 99b
.Lctr_done:
frame_pop
st1 {v0.16b}, [x5]
ldp x29, x30, [sp], #16
ret
/*
* If we are handling the tail of the input (x6 != NULL), return the
* final keystream block back to the caller.
*/
0: cbz x25, 8b
st1 {v0.16b}, [x25]
b 8b
1: cbz x25, 8b
st1 {v1.16b}, [x25]
b 8b
2: cbz x25, 8b
st1 {v4.16b}, [x25]
b 8b
3: cbz x25, 8b
st1 {v6.16b}, [x25]
b 8b
4: cbz x25, 8b
st1 {v3.16b}, [x25]
b 8b
5: cbz x25, 8b
st1 {v7.16b}, [x25]
b 8b
6: cbz x25, 8b
st1 {v2.16b}, [x25]
b 8b
7: cbz x25, 8b
st1 {v5.16b}, [x25]
b 8b
SYM_FUNC_END(aesbs_ctr_encrypt)

arch/arm64/crypto/aes-neonbs-glue.c

@ -34,7 +34,7 @@ asmlinkage void aesbs_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[], u8 final[]);
int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
@ -46,6 +46,8 @@ asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks);
asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void neon_aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 ctr[]);
asmlinkage void neon_aes_xts_encrypt(u8 out[], u8 const in[],
u32 const rk1[], int rounds, int bytes,
u32 const rk2[], u8 iv[], int first);
@ -58,7 +60,7 @@ struct aesbs_ctx {
int rounds;
} __aligned(AES_BLOCK_SIZE);
struct aesbs_cbc_ctx {
struct aesbs_cbc_ctr_ctx {
struct aesbs_ctx key;
u32 enc[AES_MAX_KEYLENGTH_U32];
};
@ -128,10 +130,10 @@ static int ecb_decrypt(struct skcipher_request *req)
return __ecb_crypt(req, aesbs_ecb_decrypt);
}
static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
static int aesbs_cbc_ctr_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct crypto_aes_ctx rk;
int err;
@ -154,7 +156,7 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
static int cbc_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
@ -177,7 +179,7 @@ static int cbc_encrypt(struct skcipher_request *req)
static int cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
@ -205,40 +207,32 @@ static int cbc_decrypt(struct skcipher_request *req)
static int ctr_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
u8 buf[AES_BLOCK_SIZE];
int err;
err = skcipher_walk_virt(&walk, req, false);
while (walk.nbytes > 0) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL;
if (walk.nbytes < walk.total) {
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
final = NULL;
}
int blocks = (walk.nbytes / AES_BLOCK_SIZE) & ~7;
int nbytes = walk.nbytes % (8 * AES_BLOCK_SIZE);
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
kernel_neon_begin();
aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->rk, ctx->rounds, blocks, walk.iv, final);
kernel_neon_end();
if (final) {
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
crypto_xor_cpy(dst, src, final,
walk.total % AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, 0);
break;
if (blocks >= 8) {
aesbs_ctr_encrypt(dst, src, ctx->key.rk, ctx->key.rounds,
blocks, walk.iv);
dst += blocks * AES_BLOCK_SIZE;
src += blocks * AES_BLOCK_SIZE;
}
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
if (nbytes && walk.nbytes == walk.total) {
neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
nbytes, walk.iv);
nbytes = 0;
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
return err;
}
@ -308,23 +302,18 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
return err;
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
if (walk.nbytes < walk.total || walk.nbytes % AES_BLOCK_SIZE)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
int blocks = (walk.nbytes / AES_BLOCK_SIZE) & ~7;
out = walk.dst.virt.addr;
in = walk.src.virt.addr;
nbytes = walk.nbytes;
kernel_neon_begin();
if (likely(blocks > 6)) { /* plain NEON is faster otherwise */
if (first)
if (blocks >= 8) {
if (first == 1)
neon_aes_ecb_encrypt(walk.iv, walk.iv,
ctx->twkey,
ctx->key.rounds, 1);
first = 0;
first = 2;
fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
@ -333,10 +322,17 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
in += blocks * AES_BLOCK_SIZE;
nbytes -= blocks * AES_BLOCK_SIZE;
}
if (walk.nbytes == walk.total && nbytes > 0)
goto xts_tail;
if (walk.nbytes == walk.total && nbytes > 0) {
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
nbytes = first = 0;
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
@ -361,13 +357,12 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
nbytes = walk.nbytes;
kernel_neon_begin();
xts_tail:
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first ?: 2);
nbytes, ctx->twkey, walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first ?: 2);
nbytes, ctx->twkey, walk.iv, first);
kernel_neon_end();
return skcipher_walk_done(&walk, 0);
@ -402,14 +397,14 @@ static struct skcipher_alg aes_algs[] = { {
.base.cra_driver_name = "cbc-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctx),
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctr_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_cbc_setkey,
.setkey = aesbs_cbc_ctr_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
@ -417,7 +412,7 @@ static struct skcipher_alg aes_algs[] = { {
.base.cra_driver_name = "ctr-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct aesbs_ctx),
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctr_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
@ -425,7 +420,7 @@ static struct skcipher_alg aes_algs[] = { {
.chunksize = AES_BLOCK_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_setkey,
.setkey = aesbs_cbc_ctr_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
}, {

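These aes_algs entries register, among others, "ctr(aes)" with driver
name "ctr-aes-neonbs". For orientation, a hedged sketch of how a kernel
caller would exercise it through the generic skcipher API (synchronous
wait via crypto_wait_req; most error handling trimmed):

#include <crypto/skcipher.h>
#include <linux/scatterlist.h>

/* One in-place CTR encryption over a linear buffer. */
static int ctr_aes_example(u8 *buf, unsigned int len,
			   const u8 *key, unsigned int keylen, u8 *iv)
{
	DECLARE_CRYPTO_WAIT(wait);
	struct crypto_skcipher *tfm;
	struct skcipher_request *req;
	struct scatterlist sg;
	int err;

	tfm = crypto_alloc_skcipher("ctr(aes)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_skcipher_setkey(tfm, key, keylen);
	req = skcipher_request_alloc(tfm, GFP_KERNEL);
	if (!req) {
		crypto_free_skcipher(tfm);
		return -ENOMEM;
	}
	sg_init_one(&sg, buf, len);
	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP,
				      crypto_req_done, &wait);
	skcipher_request_set_crypt(req, &sg, &sg, len, iv);
	err = err ?: crypto_wait_req(crypto_skcipher_encrypt(req), &wait);

	skcipher_request_free(req);
	crypto_free_skcipher(tfm);
	return err;
}
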
arch/arm64/crypto/sha3-ce-glue.c

@ -1,4 +1,4 @@
/* SPDX-License-Identifier: GPL-2.0 */
// SPDX-License-Identifier: GPL-2.0
/*
* sha3-ce-glue.c - core SHA-3 transform using v8.2 Crypto Extensions
*

arch/arm64/crypto/sha512-armv8.pl

@ -43,7 +43,7 @@
# on Cortex-A53 (or by 4 cycles per round).
# (***) Super-impressive coefficients over gcc-generated code are
# indication of some compiler "pathology", most notably code
# generated with -mgeneral-regs-only is significanty faster
# generated with -mgeneral-regs-only is significantly faster
# and the gap is only 40-90%.
#
# October 2016.

arch/arm64/crypto/sha512-ce-glue.c

@ -1,4 +1,4 @@
/* SPDX-License-Identifier: GPL-2.0 */
// SPDX-License-Identifier: GPL-2.0
/*
* sha512-ce-glue.c - SHA-384/SHA-512 using ARMv8 Crypto Extensions
*

arch/arm64/crypto/sm3-ce-glue.c

@ -26,8 +26,10 @@ asmlinkage void sm3_ce_transform(struct sm3_state *sst, u8 const *src,
static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
if (!crypto_simd_usable())
return crypto_sm3_update(desc, data, len);
if (!crypto_simd_usable()) {
sm3_update(shash_desc_ctx(desc), data, len);
return 0;
}
kernel_neon_begin();
sm3_base_do_update(desc, data, len, sm3_ce_transform);
@ -38,8 +40,10 @@ static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
static int sm3_ce_final(struct shash_desc *desc, u8 *out)
{
if (!crypto_simd_usable())
return crypto_sm3_finup(desc, NULL, 0, out);
if (!crypto_simd_usable()) {
sm3_final(shash_desc_ctx(desc), out);
return 0;
}
kernel_neon_begin();
sm3_base_do_finalize(desc, sm3_ce_transform);
@ -51,14 +55,22 @@ static int sm3_ce_final(struct shash_desc *desc, u8 *out)
static int sm3_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!crypto_simd_usable())
return crypto_sm3_finup(desc, data, len, out);
if (!crypto_simd_usable()) {
struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, out);
return 0;
}
kernel_neon_begin();
if (len)
sm3_base_do_update(desc, data, len, sm3_ce_transform);
sm3_base_do_finalize(desc, sm3_ce_transform);
kernel_neon_end();
return sm3_ce_final(desc, out);
return sm3_base_finish(desc, out);
}
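
When NEON is not usable, the glue above now falls back to the standalone
SM3 library instead of the generic shash implementation. A one-shot
digest through that library would look roughly like this (a sketch
assuming the sm3_init/sm3_update/sm3_final helpers from
include/crypto/sm3.h introduced by this series):

#include <crypto/sm3.h>

/* Library-only SM3 digest, the same calls used in the fallback path. */
static void sm3_digest_example(const u8 *data, unsigned int len,
			       u8 out[SM3_DIGEST_SIZE])
{
	struct sm3_state sctx;

	sm3_init(&sctx);
	sm3_update(&sctx, data, len);
	sm3_final(&sctx, out);
}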
static struct shash_alg sm3_alg = {

arch/arm64/include/asm/xor.h

@ -16,7 +16,8 @@
extern struct xor_block_template const xor_block_inner_neon;
static void
xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
kernel_neon_begin();
xor_block_inner_neon.do_2(bytes, p1, p2);
@ -24,8 +25,9 @@ xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_neon_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
kernel_neon_begin();
xor_block_inner_neon.do_3(bytes, p1, p2, p3);
@ -33,8 +35,10 @@ xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_neon_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
kernel_neon_begin();
xor_block_inner_neon.do_4(bytes, p1, p2, p3, p4);
@ -42,8 +46,11 @@ xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_neon_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_neon_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
kernel_neon_begin();
xor_block_inner_neon.do_5(bytes, p1, p2, p3, p4, p5);

arch/arm64/lib/crc32.S

@ -11,7 +11,44 @@
.arch armv8-a+crc
.macro __crc32, c
.macro byteorder, reg, be
.if \be
CPU_LE( rev \reg, \reg )
.else
CPU_BE( rev \reg, \reg )
.endif
.endm
.macro byteorder16, reg, be
.if \be
CPU_LE( rev16 \reg, \reg )
.else
CPU_BE( rev16 \reg, \reg )
.endif
.endm
.macro bitorder, reg, be
.if \be
rbit \reg, \reg
.endif
.endm
.macro bitorder16, reg, be
.if \be
rbit \reg, \reg
lsr \reg, \reg, #16
.endif
.endm
.macro bitorder8, reg, be
.if \be
rbit \reg, \reg
lsr \reg, \reg, #24
.endif
.endm
.macro __crc32, c, be=0
bitorder w0, \be
cmp x2, #16
b.lt 8f // less than 16 bytes
@ -24,10 +61,14 @@
add x8, x8, x1
add x1, x1, x7
ldp x5, x6, [x8]
CPU_BE( rev x3, x3 )
CPU_BE( rev x4, x4 )
CPU_BE( rev x5, x5 )
CPU_BE( rev x6, x6 )
byteorder x3, \be
byteorder x4, \be
byteorder x5, \be
byteorder x6, \be
bitorder x3, \be
bitorder x4, \be
bitorder x5, \be
bitorder x6, \be
tst x7, #8
crc32\c\()x w8, w0, x3
@ -55,33 +96,43 @@ CPU_BE( rev x6, x6 )
32: ldp x3, x4, [x1], #32
sub x2, x2, #32
ldp x5, x6, [x1, #-16]
CPU_BE( rev x3, x3 )
CPU_BE( rev x4, x4 )
CPU_BE( rev x5, x5 )
CPU_BE( rev x6, x6 )
byteorder x3, \be
byteorder x4, \be
byteorder x5, \be
byteorder x6, \be
bitorder x3, \be
bitorder x4, \be
bitorder x5, \be
bitorder x6, \be
crc32\c\()x w0, w0, x3
crc32\c\()x w0, w0, x4
crc32\c\()x w0, w0, x5
crc32\c\()x w0, w0, x6
cbnz x2, 32b
0: ret
0: bitorder w0, \be
ret
8: tbz x2, #3, 4f
ldr x3, [x1], #8
CPU_BE( rev x3, x3 )
byteorder x3, \be
bitorder x3, \be
crc32\c\()x w0, w0, x3
4: tbz x2, #2, 2f
ldr w3, [x1], #4
CPU_BE( rev w3, w3 )
byteorder w3, \be
bitorder w3, \be
crc32\c\()w w0, w0, w3
2: tbz x2, #1, 1f
ldrh w3, [x1], #2
CPU_BE( rev16 w3, w3 )
byteorder16 w3, \be
bitorder16 w3, \be
crc32\c\()h w0, w0, w3
1: tbz x2, #0, 0f
ldrb w3, [x1]
bitorder8 w3, \be
crc32\c\()b w0, w0, w3
0: ret
0: bitorder w0, \be
ret
.endm
.align 5
@ -99,3 +150,11 @@ alternative_if_not ARM64_HAS_CRC32
alternative_else_nop_endif
__crc32 c
SYM_FUNC_END(__crc32c_le)
.align 5
SYM_FUNC_START(crc32_be)
alternative_if_not ARM64_HAS_CRC32
b crc32_be_base
alternative_else_nop_endif
__crc32 be=1
SYM_FUNC_END(crc32_be)

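The new be=1 variant computes the unreflected (big-endian) CRC-32 with
the CPU's reflected crc32 instructions: the bitorder macros rbit the
accumulator and each data word on the way in, and rbit the result on the
way out. Expressed with ACLE intrinsics (an illustrative identity,
assuming arm_acle.h's __rbit and __crc32w):

#include <arm_acle.h>
#include <stdint.h>

/* crc32_be over one little-endian-loaded word: rbit maps between the
 * reflected domain of the crc32w instruction and the normal CRC domain. */
static uint32_t crc32_be_word(uint32_t crc, uint32_t data)
{
	return __rbit(__crc32w(__rbit(crc), __rbit(data)));
}
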
arch/arm64/lib/xor-neon.c

@ -10,8 +10,8 @@
#include <linux/module.h>
#include <asm/neon-intrinsics.h>
void xor_arm64_neon_2(unsigned long bytes, unsigned long *p1,
unsigned long *p2)
void xor_arm64_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
@ -37,8 +37,9 @@ void xor_arm64_neon_2(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
void xor_arm64_neon_3(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3)
void xor_arm64_neon_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
@ -72,8 +73,10 @@ void xor_arm64_neon_3(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
void xor_arm64_neon_4(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3, unsigned long *p4)
void xor_arm64_neon_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
@ -115,9 +118,11 @@ void xor_arm64_neon_4(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
void xor_arm64_neon_5(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3,
unsigned long *p4, unsigned long *p5)
void xor_arm64_neon_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
@ -186,8 +191,10 @@ static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
return res;
}
static void xor_arm64_eor3_3(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3)
static void xor_arm64_eor3_3(unsigned long bytes,
unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
@ -219,9 +226,11 @@ static void xor_arm64_eor3_3(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
static void xor_arm64_eor3_4(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3,
unsigned long *p4)
static void xor_arm64_eor3_4(unsigned long bytes,
unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
@ -261,9 +270,12 @@ static void xor_arm64_eor3_4(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
static void xor_arm64_eor3_5(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3,
unsigned long *p4, unsigned long *p5)
static void xor_arm64_eor3_5(unsigned long bytes,
unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;

arch/ia64/include/asm/xor.h

@ -4,13 +4,20 @@
*/
extern void xor_ia64_2(unsigned long, unsigned long *, unsigned long *);
extern void xor_ia64_3(unsigned long, unsigned long *, unsigned long *,
unsigned long *);
extern void xor_ia64_4(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
extern void xor_ia64_5(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *, unsigned long *);
extern void xor_ia64_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
extern void xor_ia64_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
extern void xor_ia64_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
extern void xor_ia64_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
static struct xor_block_template xor_block_ia64 = {
.name = "ia64",

arch/powerpc/include/asm/xor_altivec.h

@ -3,17 +3,20 @@
#define _ASM_POWERPC_XOR_ALTIVEC_H
#ifdef CONFIG_ALTIVEC
void xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in);
void xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in);
void xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in);
void xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in, unsigned long *v5_in);
void xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
void xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
void xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
#endif
#endif /* _ASM_POWERPC_XOR_ALTIVEC_H */

arch/powerpc/lib/xor_vmx.c

@ -49,8 +49,9 @@ typedef vector signed char unative_t;
V1##_3 = vec_xor(V1##_3, V2##_3); \
} while (0)
void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in)
void __xor_altivec_2(unsigned long bytes,
unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in)
{
DEFINE(v1);
DEFINE(v2);
@ -67,8 +68,10 @@ void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0);
}
void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in)
void __xor_altivec_3(unsigned long bytes,
unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in)
{
DEFINE(v1);
DEFINE(v2);
@ -89,9 +92,11 @@ void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0);
}
void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in)
void __xor_altivec_4(unsigned long bytes,
unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in,
const unsigned long * __restrict v4_in)
{
DEFINE(v1);
DEFINE(v2);
@ -116,9 +121,12 @@ void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0);
}
void __xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in, unsigned long *v5_in)
void __xor_altivec_5(unsigned long bytes,
unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in,
const unsigned long * __restrict v4_in,
const unsigned long * __restrict v5_in)
{
DEFINE(v1);
DEFINE(v2);

arch/powerpc/lib/xor_vmx.h

@ -6,16 +6,17 @@
* outside of the enable/disable altivec block.
*/
void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in);
void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in);
void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in);
void __xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in, unsigned long *v5_in);
void __xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
void __xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
void __xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void __xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);

arch/powerpc/lib/xor_vmx_glue.c

@ -12,47 +12,51 @@
#include <asm/xor_altivec.h>
#include "xor_vmx.h"
void xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in)
void xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
preempt_disable();
enable_kernel_altivec();
__xor_altivec_2(bytes, v1_in, v2_in);
__xor_altivec_2(bytes, p1, p2);
disable_kernel_altivec();
preempt_enable();
}
EXPORT_SYMBOL(xor_altivec_2);
void xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in)
void xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
preempt_disable();
enable_kernel_altivec();
__xor_altivec_3(bytes, v1_in, v2_in, v3_in);
__xor_altivec_3(bytes, p1, p2, p3);
disable_kernel_altivec();
preempt_enable();
}
EXPORT_SYMBOL(xor_altivec_3);
void xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in)
void xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
preempt_disable();
enable_kernel_altivec();
__xor_altivec_4(bytes, v1_in, v2_in, v3_in, v4_in);
__xor_altivec_4(bytes, p1, p2, p3, p4);
disable_kernel_altivec();
preempt_enable();
}
EXPORT_SYMBOL(xor_altivec_4);
void xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in, unsigned long *v5_in)
void xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
preempt_disable();
enable_kernel_altivec();
__xor_altivec_5(bytes, v1_in, v2_in, v3_in, v4_in, v5_in);
__xor_altivec_5(bytes, p1, p2, p3, p4, p5);
disable_kernel_altivec();
preempt_enable();
}

arch/s390/lib/xor.c

@ -11,7 +11,8 @@
#include <linux/raid/xor.h>
#include <asm/xor.h>
static void xor_xc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
static void xor_xc_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
asm volatile(
" larl 1,2f\n"
@ -32,8 +33,9 @@ static void xor_xc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
: "0", "1", "cc", "memory");
}
static void xor_xc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
static void xor_xc_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
asm volatile(
" larl 1,2f\n"
@ -58,8 +60,10 @@ static void xor_xc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
: : "0", "1", "cc", "memory");
}
static void xor_xc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
static void xor_xc_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
asm volatile(
" larl 1,2f\n"
@ -88,8 +92,11 @@ static void xor_xc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
: : "0", "1", "cc", "memory");
}
static void xor_xc_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
static void xor_xc_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
asm volatile(
" larl 1,2f\n"

arch/sparc/include/asm/xor_32.h

@ -13,7 +13,8 @@
*/
static void
sparc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
sparc_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
int lines = bytes / (sizeof (long)) / 8;
@ -50,8 +51,9 @@ sparc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
sparc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
sparc_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
int lines = bytes / (sizeof (long)) / 8;
@ -101,8 +103,10 @@ sparc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
sparc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
sparc_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
int lines = bytes / (sizeof (long)) / 8;
@ -165,8 +169,11 @@ sparc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
sparc_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
sparc_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
int lines = bytes / (sizeof (long)) / 8;

arch/sparc/include/asm/xor_64.h

@ -12,13 +12,20 @@
#include <asm/spitfire.h>
void xor_vis_2(unsigned long, unsigned long *, unsigned long *);
void xor_vis_3(unsigned long, unsigned long *, unsigned long *,
unsigned long *);
void xor_vis_4(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
void xor_vis_5(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *, unsigned long *);
void xor_vis_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
void xor_vis_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
void xor_vis_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void xor_vis_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
/* XXX Ugh, write cheetah versions... -DaveM */
@ -30,13 +37,20 @@ static struct xor_block_template xor_block_VIS = {
.do_5 = xor_vis_5,
};
void xor_niagara_2(unsigned long, unsigned long *, unsigned long *);
void xor_niagara_3(unsigned long, unsigned long *, unsigned long *,
unsigned long *);
void xor_niagara_4(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
void xor_niagara_5(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *, unsigned long *);
void xor_niagara_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
void xor_niagara_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
void xor_niagara_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void xor_niagara_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
static struct xor_block_template xor_block_niagara = {
.name = "Niagara",

arch/x86/crypto/Makefile

@ -90,6 +90,9 @@ nhpoly1305-avx2-y := nh-avx2-x86_64.o nhpoly1305-avx2-glue.o
obj-$(CONFIG_CRYPTO_CURVE25519_X86) += curve25519-x86_64.o
obj-$(CONFIG_CRYPTO_SM3_AVX_X86_64) += sm3-avx-x86_64.o
sm3-avx-x86_64-y := sm3-avx-asm_64.o sm3_avx_glue.o
obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64) += sm4-aesni-avx-x86_64.o
sm4-aesni-avx-x86_64-y := sm4-aesni-avx-asm_64.o sm4_aesni_avx_glue.o

View File

@ -1,65 +1,22 @@
/* SPDX-License-Identifier: GPL-2.0-only OR BSD-3-Clause */
/*
* Implement AES CTR mode by8 optimization with AVX instructions. (x86_64)
*
* This is AES128/192/256 CTR mode optimization implementation. It requires
* the support of Intel(R) AESNI and AVX instructions.
*
* This work was inspired by the AES CTR mode optimization published
* in Intel Optimized IPSEC Cryptograhpic library.
* Additional information on it can be found at:
* http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=22972
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
*
* GPL LICENSE SUMMARY
* AES CTR mode by8 optimization with AVX instructions. (x86_64)
*
* Copyright(c) 2014 Intel Corporation.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* Contact Information:
* James Guilford <james.guilford@intel.com>
* Sean Gulley <sean.m.gulley@intel.com>
* Chandramouli Narayanan <mouli@linux.intel.com>
*/
/*
* This is AES128/192/256 CTR mode optimization implementation. It requires
* the support of Intel(R) AESNI and AVX instructions.
*
* BSD LICENSE
*
* Copyright(c) 2014 Intel Corporation.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* This work was inspired by the AES CTR mode optimization published
* in Intel Optimized IPSEC Cryptographic library.
* Additional information on it can be found at:
* https://github.com/intel/intel-ipsec-mb
*/
#include <linux/linkage.h>

View File

@ -32,24 +32,12 @@ static inline void blowfish_enc_blk(struct bf_ctx *ctx, u8 *dst, const u8 *src)
__blowfish_enc_blk(ctx, dst, src, false);
}
static inline void blowfish_enc_blk_xor(struct bf_ctx *ctx, u8 *dst,
const u8 *src)
{
__blowfish_enc_blk(ctx, dst, src, true);
}
static inline void blowfish_enc_blk_4way(struct bf_ctx *ctx, u8 *dst,
const u8 *src)
{
__blowfish_enc_blk_4way(ctx, dst, src, false);
}
static inline void blowfish_enc_blk_xor_4way(struct bf_ctx *ctx, u8 *dst,
const u8 *src)
{
__blowfish_enc_blk_4way(ctx, dst, src, true);
}
static void blowfish_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
blowfish_enc_blk(crypto_tfm_ctx(tfm), dst, src);

View File

@ -45,14 +45,6 @@ static inline void des3_ede_dec_blk(struct des3_ede_x86_ctx *ctx, u8 *dst,
des3_ede_x86_64_crypt_blk(dec_ctx, dst, src);
}
static inline void des3_ede_enc_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{
u32 *enc_ctx = ctx->enc.expkey;
des3_ede_x86_64_crypt_blk_3way(enc_ctx, dst, src);
}
static inline void des3_ede_dec_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{

View File

@ -0,0 +1,517 @@
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM3 AVX accelerated transform.
* specified in: https://datatracker.ietf.org/doc/html/draft-sca-cfrg-sm3-02
*
* Copyright (C) 2021 Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Copyright (C) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
/* Based on SM3 AES/BMI2 accelerated work by libgcrypt at:
* https://gnupg.org/software/libgcrypt/index.html
*/
#include <linux/linkage.h>
#include <asm/frame.h>
/* Context structure */
#define state_h0 0
#define state_h1 4
#define state_h2 8
#define state_h3 12
#define state_h4 16
#define state_h5 20
#define state_h6 24
#define state_h7 28
/* Constants */
/* Round constant macros */
#define K0 2043430169 /* 0x79cc4519 */
#define K1 -208106958 /* 0xf3988a32 */
#define K2 -416213915 /* 0xe7311465 */
#define K3 -832427829 /* 0xce6228cb */
#define K4 -1664855657 /* 0x9cc45197 */
#define K5 965255983 /* 0x3988a32f */
#define K6 1930511966 /* 0x7311465e */
#define K7 -433943364 /* 0xe6228cbc */
#define K8 -867886727 /* 0xcc451979 */
#define K9 -1735773453 /* 0x988a32f3 */
#define K10 823420391 /* 0x311465e7 */
#define K11 1646840782 /* 0x6228cbce */
#define K12 -1001285732 /* 0xc451979c */
#define K13 -2002571463 /* 0x88a32f39 */
#define K14 289824371 /* 0x11465e73 */
#define K15 579648742 /* 0x228cbce6 */
#define K16 -1651869049 /* 0x9d8a7a87 */
#define K17 991229199 /* 0x3b14f50f */
#define K18 1982458398 /* 0x7629ea1e */
#define K19 -330050500 /* 0xec53d43c */
#define K20 -660100999 /* 0xd8a7a879 */
#define K21 -1320201997 /* 0xb14f50f3 */
#define K22 1654563303 /* 0x629ea1e7 */
#define K23 -985840690 /* 0xc53d43ce */
#define K24 -1971681379 /* 0x8a7a879d */
#define K25 351604539 /* 0x14f50f3b */
#define K26 703209078 /* 0x29ea1e76 */
#define K27 1406418156 /* 0x53d43cec */
#define K28 -1482130984 /* 0xa7a879d8 */
#define K29 1330705329 /* 0x4f50f3b1 */
#define K30 -1633556638 /* 0x9ea1e762 */
#define K31 1027854021 /* 0x3d43cec5 */
#define K32 2055708042 /* 0x7a879d8a */
#define K33 -183551212 /* 0xf50f3b14 */
#define K34 -367102423 /* 0xea1e7629 */
#define K35 -734204845 /* 0xd43cec53 */
#define K36 -1468409689 /* 0xa879d8a7 */
#define K37 1358147919 /* 0x50f3b14f */
#define K38 -1578671458 /* 0xa1e7629e */
#define K39 1137624381 /* 0x43cec53d */
#define K40 -2019718534 /* 0x879d8a7a */
#define K41 255530229 /* 0x0f3b14f5 */
#define K42 511060458 /* 0x1e7629ea */
#define K43 1022120916 /* 0x3cec53d4 */
#define K44 2044241832 /* 0x79d8a7a8 */
#define K45 -206483632 /* 0xf3b14f50 */
#define K46 -412967263 /* 0xe7629ea1 */
#define K47 -825934525 /* 0xcec53d43 */
#define K48 -1651869049 /* 0x9d8a7a87 */
#define K49 991229199 /* 0x3b14f50f */
#define K50 1982458398 /* 0x7629ea1e */
#define K51 -330050500 /* 0xec53d43c */
#define K52 -660100999 /* 0xd8a7a879 */
#define K53 -1320201997 /* 0xb14f50f3 */
#define K54 1654563303 /* 0x629ea1e7 */
#define K55 -985840690 /* 0xc53d43ce */
#define K56 -1971681379 /* 0x8a7a879d */
#define K57 351604539 /* 0x14f50f3b */
#define K58 703209078 /* 0x29ea1e76 */
#define K59 1406418156 /* 0x53d43cec */
#define K60 -1482130984 /* 0xa7a879d8 */
#define K61 1330705329 /* 0x4f50f3b1 */
#define K62 -1633556638 /* 0x9ea1e762 */
#define K63 1027854021 /* 0x3d43cec5 */
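These 64 values are not arbitrary: they are the SM3 round constants T_j, i.e. 0x79cc4519 for rounds 0-15 and 0x7a879d8a for rounds 16-63, each rotated left by the round index modulo 32, which is why K48-K63 repeat K16-K31. A stand-alone C sketch that regenerates the table in this format (illustrative only):

#include <stdint.h>
#include <stdio.h>

static uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> ((32 - n) & 31));
}

int main(void)
{
	unsigned int j;

	for (j = 0; j < 64; j++) {
		uint32_t t = (j < 16) ? 0x79cc4519 : 0x7a879d8a;
		uint32_t k = rol32(t, j & 31);

		printf("#define K%-2u %11d /* 0x%08x */\n", j, (int32_t)k, k);
	}
	return 0;
}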
/* Register macros */
#define RSTATE %rdi
#define RDATA %rsi
#define RNBLKS %rdx
#define t0 %eax
#define t1 %ebx
#define t2 %ecx
#define a %r8d
#define b %r9d
#define c %r10d
#define d %r11d
#define e %r12d
#define f %r13d
#define g %r14d
#define h %r15d
#define W0 %xmm0
#define W1 %xmm1
#define W2 %xmm2
#define W3 %xmm3
#define W4 %xmm4
#define W5 %xmm5
#define XTMP0 %xmm6
#define XTMP1 %xmm7
#define XTMP2 %xmm8
#define XTMP3 %xmm9
#define XTMP4 %xmm10
#define XTMP5 %xmm11
#define XTMP6 %xmm12
#define BSWAP_REG %xmm15
/* Stack structure */
#define STACK_W_SIZE (32 * 2 * 3)
#define STACK_REG_SAVE_SIZE (64)
#define STACK_W (0)
#define STACK_REG_SAVE (STACK_W + STACK_W_SIZE)
#define STACK_SIZE (STACK_REG_SAVE + STACK_REG_SAVE_SIZE)
/* Instruction helpers. */
#define roll2(v, reg) \
roll $(v), reg;
#define roll3mov(v, src, dst) \
movl src, dst; \
roll $(v), dst;
#define roll3(v, src, dst) \
rorxl $(32-(v)), src, dst;
#define addl2(a, out) \
leal (a, out), out;
/* Round function macros. */
#define GG1(x, y, z, o, t) \
movl x, o; \
xorl y, o; \
xorl z, o;
#define FF1(x, y, z, o, t) GG1(x, y, z, o, t)
#define GG2(x, y, z, o, t) \
andnl z, x, o; \
movl y, t; \
andl x, t; \
addl2(t, o);
#define FF2(x, y, z, o, t) \
movl y, o; \
xorl x, o; \
movl y, t; \
andl x, t; \
andl z, o; \
xorl t, o;
#define R(i, a, b, c, d, e, f, g, h, round, widx, wtype) \
/* rol(a, 12) => t0 */ \
roll3mov(12, a, t0); /* rorxl here would reduce perf by 6% on zen3 */ \
/* rol (t0 + e + t), 7) => t1 */ \
leal K##round(t0, e, 1), t1; \
roll2(7, t1); \
/* h + w1 => h */ \
addl wtype##_W1_ADDR(round, widx), h; \
/* h + t1 => h */ \
addl2(t1, h); \
/* t1 ^ t0 => t0 */ \
xorl t1, t0; \
/* w1w2 + d => d */ \
addl wtype##_W1W2_ADDR(round, widx), d; \
/* FF##i(a,b,c) => t1 */ \
FF##i(a, b, c, t1, t2); \
/* d + t1 => d */ \
addl2(t1, d); \
/* GG#i(e,f,g) => t2 */ \
GG##i(e, f, g, t2, t1); \
/* h + t2 => h */ \
addl2(t2, h); \
/* rol (f, 19) => f */ \
roll2(19, f); \
/* d + t0 => d */ \
addl2(t0, d); \
/* rol (b, 9) => b */ \
roll2(9, b); \
/* P0(h) => h */ \
roll3(9, h, t2); \
roll3(17, h, t1); \
xorl t2, h; \
xorl t1, h;
#define R1(a, b, c, d, e, f, g, h, round, widx, wtype) \
R(1, a, b, c, d, e, f, g, h, round, widx, wtype)
#define R2(a, b, c, d, e, f, g, h, round, widx, wtype) \
R(2, a, b, c, d, e, f, g, h, round, widx, wtype)
/* Input expansion macros. */
/* Byte-swapped input address. */
#define IW_W_ADDR(round, widx, offs) \
(STACK_W + ((round) / 4) * 64 + (offs) + ((widx) * 4))(%rsp)
/* Expanded input address. */
#define XW_W_ADDR(round, widx, offs) \
(STACK_W + ((((round) / 3) - 4) % 2) * 64 + (offs) + ((widx) * 4))(%rsp)
/* Rounds 1-12, byte-swapped input block addresses. */
#define IW_W1_ADDR(round, widx) IW_W_ADDR(round, widx, 0)
#define IW_W1W2_ADDR(round, widx) IW_W_ADDR(round, widx, 32)
/* Rounds 1-12, expanded input block addresses. */
#define XW_W1_ADDR(round, widx) XW_W_ADDR(round, widx, 0)
#define XW_W1W2_ADDR(round, widx) XW_W_ADDR(round, widx, 32)
/* Input block loading. */
#define LOAD_W_XMM_1() \
vmovdqu 0*16(RDATA), XTMP0; /* XTMP0: w3, w2, w1, w0 */ \
vmovdqu 1*16(RDATA), XTMP1; /* XTMP1: w7, w6, w5, w4 */ \
vmovdqu 2*16(RDATA), XTMP2; /* XTMP2: w11, w10, w9, w8 */ \
vmovdqu 3*16(RDATA), XTMP3; /* XTMP3: w15, w14, w13, w12 */ \
vpshufb BSWAP_REG, XTMP0, XTMP0; \
vpshufb BSWAP_REG, XTMP1, XTMP1; \
vpshufb BSWAP_REG, XTMP2, XTMP2; \
vpshufb BSWAP_REG, XTMP3, XTMP3; \
vpxor XTMP0, XTMP1, XTMP4; \
vpxor XTMP1, XTMP2, XTMP5; \
vpxor XTMP2, XTMP3, XTMP6; \
leaq 64(RDATA), RDATA; \
vmovdqa XTMP0, IW_W1_ADDR(0, 0); \
vmovdqa XTMP4, IW_W1W2_ADDR(0, 0); \
vmovdqa XTMP1, IW_W1_ADDR(4, 0); \
vmovdqa XTMP5, IW_W1W2_ADDR(4, 0);
#define LOAD_W_XMM_2() \
vmovdqa XTMP2, IW_W1_ADDR(8, 0); \
vmovdqa XTMP6, IW_W1W2_ADDR(8, 0);
#define LOAD_W_XMM_3() \
vpshufd $0b00000000, XTMP0, W0; /* W0: xx, w0, xx, xx */ \
vpshufd $0b11111001, XTMP0, W1; /* W1: xx, w3, w2, w1 */ \
vmovdqa XTMP1, W2; /* W2: xx, w6, w5, w4 */ \
vpalignr $12, XTMP1, XTMP2, W3; /* W3: xx, w9, w8, w7 */ \
vpalignr $8, XTMP2, XTMP3, W4; /* W4: xx, w12, w11, w10 */ \
vpshufd $0b11111001, XTMP3, W5; /* W5: xx, w15, w14, w13 */
/* Message scheduling. Note: 3 words per XMM register. */
#define SCHED_W_0(round, w0, w1, w2, w3, w4, w5) \
/* Load (w[i - 16]) => XTMP0 */ \
vpshufd $0b10111111, w0, XTMP0; \
vpalignr $12, XTMP0, w1, XTMP0; /* XTMP0: xx, w2, w1, w0 */ \
/* Load (w[i - 13]) => XTMP1 */ \
vpshufd $0b10111111, w1, XTMP1; \
vpalignr $12, XTMP1, w2, XTMP1; \
/* w[i - 9] == w3 */ \
/* XMM3 ^ XTMP0 => XTMP0 */ \
vpxor w3, XTMP0, XTMP0;
#define SCHED_W_1(round, w0, w1, w2, w3, w4, w5) \
/* w[i - 3] == w5 */ \
/* rol(XMM5, 15) ^ XTMP0 => XTMP0 */ \
vpslld $15, w5, XTMP2; \
vpsrld $(32-15), w5, XTMP3; \
vpxor XTMP2, XTMP3, XTMP3; \
vpxor XTMP3, XTMP0, XTMP0; \
/* rol(XTMP1, 7) => XTMP1 */ \
vpslld $7, XTMP1, XTMP5; \
vpsrld $(32-7), XTMP1, XTMP1; \
vpxor XTMP5, XTMP1, XTMP1; \
/* XMM4 ^ XTMP1 => XTMP1 */ \
vpxor w4, XTMP1, XTMP1; \
/* w[i - 6] == XMM4 */ \
/* P1(XTMP0) ^ XTMP1 => XMM0 */ \
vpslld $15, XTMP0, XTMP5; \
vpsrld $(32-15), XTMP0, XTMP6; \
vpslld $23, XTMP0, XTMP2; \
vpsrld $(32-23), XTMP0, XTMP3; \
vpxor XTMP0, XTMP1, XTMP1; \
vpxor XTMP6, XTMP5, XTMP5; \
vpxor XTMP3, XTMP2, XTMP2; \
vpxor XTMP2, XTMP5, XTMP5; \
vpxor XTMP5, XTMP1, w0;
#define SCHED_W_2(round, w0, w1, w2, w3, w4, w5) \
/* W1 in XMM12 */ \
vpshufd $0b10111111, w4, XTMP4; \
vpalignr $12, XTMP4, w5, XTMP4; \
vmovdqa XTMP4, XW_W1_ADDR((round), 0); \
/* W1 ^ W2 => XTMP1 */ \
vpxor w0, XTMP4, XTMP1; \
vmovdqa XTMP1, XW_W1W2_ADDR((round), 0);
.section .rodata.cst16, "aM", @progbits, 16
.align 16
.Lbe32mask:
.long 0x00010203, 0x04050607, 0x08090a0b, 0x0c0d0e0f
.text
/*
* Transform nblocks*64 bytes (nblocks*16 32-bit words) at DATA.
*
* void sm3_transform_avx(struct sm3_state *state,
* const u8 *data, int nblocks);
*/
.align 16
SYM_FUNC_START(sm3_transform_avx)
/* input:
* %rdi: ctx, CTX
* %rsi: data (64*nblks bytes)
* %rdx: nblocks
*/
vzeroupper;
pushq %rbp;
movq %rsp, %rbp;
movq %rdx, RNBLKS;
subq $STACK_SIZE, %rsp;
andq $(~63), %rsp;
movq %rbx, (STACK_REG_SAVE + 0 * 8)(%rsp);
movq %r15, (STACK_REG_SAVE + 1 * 8)(%rsp);
movq %r14, (STACK_REG_SAVE + 2 * 8)(%rsp);
movq %r13, (STACK_REG_SAVE + 3 * 8)(%rsp);
movq %r12, (STACK_REG_SAVE + 4 * 8)(%rsp);
vmovdqa .Lbe32mask (%rip), BSWAP_REG;
/* Get the values of the chaining variables. */
movl state_h0(RSTATE), a;
movl state_h1(RSTATE), b;
movl state_h2(RSTATE), c;
movl state_h3(RSTATE), d;
movl state_h4(RSTATE), e;
movl state_h5(RSTATE), f;
movl state_h6(RSTATE), g;
movl state_h7(RSTATE), h;
.align 16
.Loop:
/* Load data part1. */
LOAD_W_XMM_1();
leaq -1(RNBLKS), RNBLKS;
/* Transform 0-3 + Load data part2. */
R1(a, b, c, d, e, f, g, h, 0, 0, IW); LOAD_W_XMM_2();
R1(d, a, b, c, h, e, f, g, 1, 1, IW);
R1(c, d, a, b, g, h, e, f, 2, 2, IW);
R1(b, c, d, a, f, g, h, e, 3, 3, IW); LOAD_W_XMM_3();
/* Transform 4-7 + Precalc 12-14. */
R1(a, b, c, d, e, f, g, h, 4, 0, IW);
R1(d, a, b, c, h, e, f, g, 5, 1, IW);
R1(c, d, a, b, g, h, e, f, 6, 2, IW); SCHED_W_0(12, W0, W1, W2, W3, W4, W5);
R1(b, c, d, a, f, g, h, e, 7, 3, IW); SCHED_W_1(12, W0, W1, W2, W3, W4, W5);
/* Transform 8-11 + Precalc 12-17. */
R1(a, b, c, d, e, f, g, h, 8, 0, IW); SCHED_W_2(12, W0, W1, W2, W3, W4, W5);
R1(d, a, b, c, h, e, f, g, 9, 1, IW); SCHED_W_0(15, W1, W2, W3, W4, W5, W0);
R1(c, d, a, b, g, h, e, f, 10, 2, IW); SCHED_W_1(15, W1, W2, W3, W4, W5, W0);
R1(b, c, d, a, f, g, h, e, 11, 3, IW); SCHED_W_2(15, W1, W2, W3, W4, W5, W0);
/* Transform 12-14 + Precalc 18-20 */
R1(a, b, c, d, e, f, g, h, 12, 0, XW); SCHED_W_0(18, W2, W3, W4, W5, W0, W1);
R1(d, a, b, c, h, e, f, g, 13, 1, XW); SCHED_W_1(18, W2, W3, W4, W5, W0, W1);
R1(c, d, a, b, g, h, e, f, 14, 2, XW); SCHED_W_2(18, W2, W3, W4, W5, W0, W1);
/* Transform 15-17 + Precalc 21-23 */
R1(b, c, d, a, f, g, h, e, 15, 0, XW); SCHED_W_0(21, W3, W4, W5, W0, W1, W2);
R2(a, b, c, d, e, f, g, h, 16, 1, XW); SCHED_W_1(21, W3, W4, W5, W0, W1, W2);
R2(d, a, b, c, h, e, f, g, 17, 2, XW); SCHED_W_2(21, W3, W4, W5, W0, W1, W2);
/* Transform 18-20 + Precalc 24-26 */
R2(c, d, a, b, g, h, e, f, 18, 0, XW); SCHED_W_0(24, W4, W5, W0, W1, W2, W3);
R2(b, c, d, a, f, g, h, e, 19, 1, XW); SCHED_W_1(24, W4, W5, W0, W1, W2, W3);
R2(a, b, c, d, e, f, g, h, 20, 2, XW); SCHED_W_2(24, W4, W5, W0, W1, W2, W3);
/* Transform 21-23 + Precalc 27-29 */
R2(d, a, b, c, h, e, f, g, 21, 0, XW); SCHED_W_0(27, W5, W0, W1, W2, W3, W4);
R2(c, d, a, b, g, h, e, f, 22, 1, XW); SCHED_W_1(27, W5, W0, W1, W2, W3, W4);
R2(b, c, d, a, f, g, h, e, 23, 2, XW); SCHED_W_2(27, W5, W0, W1, W2, W3, W4);
/* Transform 24-26 + Precalc 30-32 */
R2(a, b, c, d, e, f, g, h, 24, 0, XW); SCHED_W_0(30, W0, W1, W2, W3, W4, W5);
R2(d, a, b, c, h, e, f, g, 25, 1, XW); SCHED_W_1(30, W0, W1, W2, W3, W4, W5);
R2(c, d, a, b, g, h, e, f, 26, 2, XW); SCHED_W_2(30, W0, W1, W2, W3, W4, W5);
/* Transform 27-29 + Precalc 33-35 */
R2(b, c, d, a, f, g, h, e, 27, 0, XW); SCHED_W_0(33, W1, W2, W3, W4, W5, W0);
R2(a, b, c, d, e, f, g, h, 28, 1, XW); SCHED_W_1(33, W1, W2, W3, W4, W5, W0);
R2(d, a, b, c, h, e, f, g, 29, 2, XW); SCHED_W_2(33, W1, W2, W3, W4, W5, W0);
/* Transform 30-32 + Precalc 36-38 */
R2(c, d, a, b, g, h, e, f, 30, 0, XW); SCHED_W_0(36, W2, W3, W4, W5, W0, W1);
R2(b, c, d, a, f, g, h, e, 31, 1, XW); SCHED_W_1(36, W2, W3, W4, W5, W0, W1);
R2(a, b, c, d, e, f, g, h, 32, 2, XW); SCHED_W_2(36, W2, W3, W4, W5, W0, W1);
/* Transform 33-35 + Precalc 39-41 */
R2(d, a, b, c, h, e, f, g, 33, 0, XW); SCHED_W_0(39, W3, W4, W5, W0, W1, W2);
R2(c, d, a, b, g, h, e, f, 34, 1, XW); SCHED_W_1(39, W3, W4, W5, W0, W1, W2);
R2(b, c, d, a, f, g, h, e, 35, 2, XW); SCHED_W_2(39, W3, W4, W5, W0, W1, W2);
/* Transform 36-38 + Precalc 42-44 */
R2(a, b, c, d, e, f, g, h, 36, 0, XW); SCHED_W_0(42, W4, W5, W0, W1, W2, W3);
R2(d, a, b, c, h, e, f, g, 37, 1, XW); SCHED_W_1(42, W4, W5, W0, W1, W2, W3);
R2(c, d, a, b, g, h, e, f, 38, 2, XW); SCHED_W_2(42, W4, W5, W0, W1, W2, W3);
/* Transform 39-41 + Precalc 45-47 */
R2(b, c, d, a, f, g, h, e, 39, 0, XW); SCHED_W_0(45, W5, W0, W1, W2, W3, W4);
R2(a, b, c, d, e, f, g, h, 40, 1, XW); SCHED_W_1(45, W5, W0, W1, W2, W3, W4);
R2(d, a, b, c, h, e, f, g, 41, 2, XW); SCHED_W_2(45, W5, W0, W1, W2, W3, W4);
/* Transform 42-44 + Precalc 48-50 */
R2(c, d, a, b, g, h, e, f, 42, 0, XW); SCHED_W_0(48, W0, W1, W2, W3, W4, W5);
R2(b, c, d, a, f, g, h, e, 43, 1, XW); SCHED_W_1(48, W0, W1, W2, W3, W4, W5);
R2(a, b, c, d, e, f, g, h, 44, 2, XW); SCHED_W_2(48, W0, W1, W2, W3, W4, W5);
/* Transform 45-47 + Precalc 51-53 */
R2(d, a, b, c, h, e, f, g, 45, 0, XW); SCHED_W_0(51, W1, W2, W3, W4, W5, W0);
R2(c, d, a, b, g, h, e, f, 46, 1, XW); SCHED_W_1(51, W1, W2, W3, W4, W5, W0);
R2(b, c, d, a, f, g, h, e, 47, 2, XW); SCHED_W_2(51, W1, W2, W3, W4, W5, W0);
/* Transform 48-50 + Precalc 54-56 */
R2(a, b, c, d, e, f, g, h, 48, 0, XW); SCHED_W_0(54, W2, W3, W4, W5, W0, W1);
R2(d, a, b, c, h, e, f, g, 49, 1, XW); SCHED_W_1(54, W2, W3, W4, W5, W0, W1);
R2(c, d, a, b, g, h, e, f, 50, 2, XW); SCHED_W_2(54, W2, W3, W4, W5, W0, W1);
/* Transform 51-53 + Precalc 57-59 */
R2(b, c, d, a, f, g, h, e, 51, 0, XW); SCHED_W_0(57, W3, W4, W5, W0, W1, W2);
R2(a, b, c, d, e, f, g, h, 52, 1, XW); SCHED_W_1(57, W3, W4, W5, W0, W1, W2);
R2(d, a, b, c, h, e, f, g, 53, 2, XW); SCHED_W_2(57, W3, W4, W5, W0, W1, W2);
/* Transform 54-56 + Precalc 60-62 */
R2(c, d, a, b, g, h, e, f, 54, 0, XW); SCHED_W_0(60, W4, W5, W0, W1, W2, W3);
R2(b, c, d, a, f, g, h, e, 55, 1, XW); SCHED_W_1(60, W4, W5, W0, W1, W2, W3);
R2(a, b, c, d, e, f, g, h, 56, 2, XW); SCHED_W_2(60, W4, W5, W0, W1, W2, W3);
/* Transform 57-59 + Precalc 63 */
R2(d, a, b, c, h, e, f, g, 57, 0, XW); SCHED_W_0(63, W5, W0, W1, W2, W3, W4);
R2(c, d, a, b, g, h, e, f, 58, 1, XW);
R2(b, c, d, a, f, g, h, e, 59, 2, XW); SCHED_W_1(63, W5, W0, W1, W2, W3, W4);
/* Transform 60-62 + Precalc 63 */
R2(a, b, c, d, e, f, g, h, 60, 0, XW);
R2(d, a, b, c, h, e, f, g, 61, 1, XW); SCHED_W_2(63, W5, W0, W1, W2, W3, W4);
R2(c, d, a, b, g, h, e, f, 62, 2, XW);
/* Transform 63 */
R2(b, c, d, a, f, g, h, e, 63, 0, XW);
/* Update the chaining variables. */
xorl state_h0(RSTATE), a;
xorl state_h1(RSTATE), b;
xorl state_h2(RSTATE), c;
xorl state_h3(RSTATE), d;
movl a, state_h0(RSTATE);
movl b, state_h1(RSTATE);
movl c, state_h2(RSTATE);
movl d, state_h3(RSTATE);
xorl state_h4(RSTATE), e;
xorl state_h5(RSTATE), f;
xorl state_h6(RSTATE), g;
xorl state_h7(RSTATE), h;
movl e, state_h4(RSTATE);
movl f, state_h5(RSTATE);
movl g, state_h6(RSTATE);
movl h, state_h7(RSTATE);
cmpq $0, RNBLKS;
jne .Loop;
vzeroall;
movq (STACK_REG_SAVE + 0 * 8)(%rsp), %rbx;
movq (STACK_REG_SAVE + 1 * 8)(%rsp), %r15;
movq (STACK_REG_SAVE + 2 * 8)(%rsp), %r14;
movq (STACK_REG_SAVE + 3 * 8)(%rsp), %r13;
movq (STACK_REG_SAVE + 4 * 8)(%rsp), %r12;
vmovdqa %xmm0, IW_W1_ADDR(0, 0);
vmovdqa %xmm0, IW_W1W2_ADDR(0, 0);
vmovdqa %xmm0, IW_W1_ADDR(4, 0);
vmovdqa %xmm0, IW_W1W2_ADDR(4, 0);
vmovdqa %xmm0, IW_W1_ADDR(8, 0);
vmovdqa %xmm0, IW_W1W2_ADDR(8, 0);
movq %rbp, %rsp;
popq %rbp;
ret;
SYM_FUNC_END(sm3_transform_avx)
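For reference while reading the macros above, the boolean functions and permutations they implement are the standard SM3 ones. A scalar C sketch (for clarity only, not the code the kernel runs); the asserts check the two non-obvious tricks, FF2 computed as ((x ^ y) & z) ^ (x & y) and GG2 summing the bit-disjoint terms (~x & z) and (x & y) via lea:

#include <assert.h>
#include <stdint.h>

static uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> ((32 - n) & 31));
}

/* FF1 and GG1 (identical, plain XOR) serve rounds 0-15. */
static uint32_t ff1(uint32_t x, uint32_t y, uint32_t z) { return x ^ y ^ z; }
/* FF2/GG2 serve rounds 16-63. */
static uint32_t ff2(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (x & z) | (y & z);	/* majority */
}
static uint32_t gg2(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (~x & z);		/* choose */
}

/* P0 finalizes h in every round; P1 drives the message expansion. */
static uint32_t p0(uint32_t x) { return x ^ rol32(x, 9) ^ rol32(x, 17); }
static uint32_t p1(uint32_t x) { return x ^ rol32(x, 15) ^ rol32(x, 23); }

/*
 * The expansion computed three words at a time by SCHED_W_0/1/2:
 * w[i] = p1(w[i-16] ^ w[i-9] ^ rol32(w[i-3], 15))
 *        ^ rol32(w[i-13], 7) ^ w[i-6]
 */

int main(void)
{
	uint32_t x = 0x12345678, y = 0x9abcdef0, z = 0x0f0f0f0f;

	/* FF2 macro trick: ((x ^ y) & z) ^ (x & y) equals majority. */
	assert((((x ^ y) & z) ^ (x & y)) == ff2(x, y, z));
	/* GG2 macro trick: (~x & z) and (x & y) are bit-disjoint, so
	 * andn + and + lea (addition) gives the same result as or. */
	assert(((~x & z) + (x & y)) == gg2(x, y, z));
	(void)ff1; (void)p0; (void)p1;
	return 0;
}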

View File

@ -0,0 +1,134 @@
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM3 Secure Hash Algorithm, AVX assembler accelerated.
* specified in: https://datatracker.ietf.org/doc/html/draft-sca-cfrg-sm3-02
*
* Copyright (C) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/types.h>
#include <crypto/sm3.h>
#include <crypto/sm3_base.h>
#include <asm/simd.h>
asmlinkage void sm3_transform_avx(struct sm3_state *state,
const u8 *data, int nblocks);
static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sm3_state *sctx = shash_desc_ctx(desc);
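/*
 * Use the generic fallback when SIMD must not be touched, or when
 * the bytes buffered so far plus this update still do not complete
 * a full 64-byte block (the AVX path only consumes whole blocks).
 */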
if (!crypto_simd_usable() ||
(sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) {
sm3_update(sctx, data, len);
return 0;
}
/*
* Make sure struct sm3_state begins directly with the SM3
* 256-bit internal state, as this is what the asm functions expect.
*/
BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0);
kernel_fpu_begin();
sm3_base_do_update(desc, data, len, sm3_transform_avx);
kernel_fpu_end();
return 0;
}
static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!crypto_simd_usable()) {
struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, out);
return 0;
}
kernel_fpu_begin();
if (len)
sm3_base_do_update(desc, data, len, sm3_transform_avx);
sm3_base_do_finalize(desc, sm3_transform_avx);
kernel_fpu_end();
return sm3_base_finish(desc, out);
}
static int sm3_avx_final(struct shash_desc *desc, u8 *out)
{
if (!crypto_simd_usable()) {
sm3_final(shash_desc_ctx(desc), out);
return 0;
}
kernel_fpu_begin();
sm3_base_do_finalize(desc, sm3_transform_avx);
kernel_fpu_end();
return sm3_base_finish(desc, out);
}
static struct shash_alg sm3_avx_alg = {
.digestsize = SM3_DIGEST_SIZE,
.init = sm3_base_init,
.update = sm3_avx_update,
.final = sm3_avx_final,
.finup = sm3_avx_finup,
.descsize = sizeof(struct sm3_state),
.base = {
.cra_name = "sm3",
.cra_driver_name = "sm3-avx",
.cra_priority = 300,
.cra_blocksize = SM3_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init sm3_avx_mod_init(void)
{
const char *feature_name;
if (!boot_cpu_has(X86_FEATURE_AVX)) {
pr_info("AVX instruction are not detected.\n");
return -ENODEV;
}
if (!boot_cpu_has(X86_FEATURE_BMI2)) {
pr_info("BMI2 instruction are not detected.\n");
return -ENODEV;
}
if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
&feature_name)) {
pr_info("CPU feature '%s' is not supported.\n", feature_name);
return -ENODEV;
}
return crypto_register_shash(&sm3_avx_alg);
}
static void __exit sm3_avx_mod_exit(void)
{
crypto_unregister_shash(&sm3_avx_alg);
}
module_init(sm3_avx_mod_init);
module_exit(sm3_avx_mod_exit);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
MODULE_DESCRIPTION("SM3 Secure Hash Algorithm, AVX assembler accelerated");
MODULE_ALIAS_CRYPTO("sm3");
MODULE_ALIAS_CRYPTO("sm3-avx");

View File

@ -57,7 +57,8 @@
op(i + 3, 3)
static void
xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_sse_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 8;
@ -108,7 +109,8 @@ xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_sse_2_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_sse_2_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 8;
@ -142,8 +144,9 @@ xor_sse_2_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_sse_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_sse_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 8;
@ -201,8 +204,9 @@ xor_sse_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_3_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_sse_3_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 8;
@ -238,8 +242,10 @@ xor_sse_3_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_sse_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 8;
@ -304,8 +310,10 @@ xor_sse_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_4_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_sse_4_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 8;
@ -343,8 +351,11 @@ xor_sse_4_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_sse_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned long lines = bytes >> 8;
@ -416,8 +427,11 @@ xor_sse_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_5_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_sse_5_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned long lines = bytes >> 8;

View File

@ -21,7 +21,8 @@
#include <asm/fpu/api.h>
static void
xor_pII_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_pII_mmx_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 7;
@ -64,8 +65,9 @@ xor_pII_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_pII_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_pII_mmx_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 7;
@ -113,8 +115,10 @@ xor_pII_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_pII_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_pII_mmx_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 7;
@ -168,8 +172,11 @@ xor_pII_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
static void
xor_pII_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_pII_mmx_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned long lines = bytes >> 7;
@ -248,7 +255,8 @@ xor_pII_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
#undef BLOCK
static void
xor_p5_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_p5_mmx_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 6;
@ -295,8 +303,9 @@ xor_p5_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_p5_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_p5_mmx_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 6;
@ -352,8 +361,10 @@ xor_p5_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_p5_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_p5_mmx_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 6;
@ -418,8 +429,11 @@ xor_p5_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_p5_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_p5_mmx_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned long lines = bytes >> 6;

View File

@ -26,7 +26,8 @@
BLOCK4(8) \
BLOCK4(12)
static void xor_avx_2(unsigned long bytes, unsigned long *p0, unsigned long *p1)
static void xor_avx_2(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1)
{
unsigned long lines = bytes >> 9;
@ -52,8 +53,9 @@ do { \
kernel_fpu_end();
}
static void xor_avx_3(unsigned long bytes, unsigned long *p0, unsigned long *p1,
unsigned long *p2)
static void xor_avx_3(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 9;
@ -82,8 +84,10 @@ do { \
kernel_fpu_end();
}
static void xor_avx_4(unsigned long bytes, unsigned long *p0, unsigned long *p1,
unsigned long *p2, unsigned long *p3)
static void xor_avx_4(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 9;
@ -115,8 +119,11 @@ do { \
kernel_fpu_end();
}
static void xor_avx_5(unsigned long bytes, unsigned long *p0, unsigned long *p1,
unsigned long *p2, unsigned long *p3, unsigned long *p4)
static void xor_avx_5(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 9;

View File

@ -231,6 +231,13 @@ config CRYPTO_DH
help
Generic implementation of the Diffie-Hellman algorithm.
config CRYPTO_DH_RFC7919_GROUPS
bool "Support for RFC 7919 FFDHE group parameters"
depends on CRYPTO_DH
select CRYPTO_RNG_DEFAULT
help
Provide support for RFC 7919 FFDHE group parameters. If unsure, say N.
config CRYPTO_ECC
tristate
select CRYPTO_RNG_DEFAULT
@ -267,7 +274,7 @@ config CRYPTO_ECRDSA
config CRYPTO_SM2
tristate "SM2 algorithm"
select CRYPTO_SM3
select CRYPTO_LIB_SM3
select CRYPTO_AKCIPHER
select CRYPTO_MANAGER
select MPILIB
@ -425,6 +432,7 @@ config CRYPTO_LRW
select CRYPTO_SKCIPHER
select CRYPTO_MANAGER
select CRYPTO_GF128MUL
select CRYPTO_ECB
help
LRW: Liskov Rivest Wagner, a tweakable, non malleable, non movable
narrow block cipher mode for dm-crypt. Use it with cipher
@ -999,6 +1007,7 @@ config CRYPTO_SHA3
config CRYPTO_SM3
tristate "SM3 digest algorithm"
select CRYPTO_HASH
select CRYPTO_LIB_SM3
help
SM3 secure hash function as defined by OSCCA GM/T 0004-2012 (SM3).
It is part of the Chinese Commercial Cryptography suite.
@ -1007,6 +1016,19 @@ config CRYPTO_SM3
http://www.oscca.gov.cn/UpFile/20101222141857786.pdf
https://datatracker.ietf.org/doc/html/draft-shen-sm3-hash
config CRYPTO_SM3_AVX_X86_64
tristate "SM3 digest algorithm (x86_64/AVX)"
depends on X86 && 64BIT
select CRYPTO_HASH
select CRYPTO_LIB_SM3
help
SM3 secure hash function as defined by OSCCA GM/T 0004-2012 (SM3).
It is part of the Chinese Commercial Cryptography suite. This is
an SM3-optimized implementation using Advanced Vector Extensions
(AVX) when available.
If unsure, say N.
config CRYPTO_STREEBOG
tristate "Streebog Hash Function"
select CRYPTO_HASH
@ -1847,6 +1869,7 @@ config CRYPTO_JITTERENTROPY
config CRYPTO_KDF800108_CTR
tristate
select CRYPTO_HMAC
select CRYPTO_SHA256
config CRYPTO_USER_API

View File

@ -6,6 +6,7 @@
*/
#include <crypto/algapi.h>
#include <crypto/internal/simd.h>
#include <linux/err.h>
#include <linux/errno.h>
#include <linux/fips.h>
@ -21,6 +22,11 @@
static LIST_HEAD(crypto_template_list);
#ifdef CONFIG_CRYPTO_MANAGER_EXTRA_TESTS
DEFINE_PER_CPU(bool, crypto_simd_disabled_for_test);
EXPORT_PER_CPU_SYMBOL_GPL(crypto_simd_disabled_for_test);
#endif
static inline void crypto_check_module_sig(struct module *mod)
{
if (fips_enabled && mod && !module_sig_ok(mod))
@ -322,9 +328,17 @@ void crypto_alg_tested(const char *name, int err)
found:
q->cra_flags |= CRYPTO_ALG_DEAD;
alg = test->adult;
if (err || list_empty(&alg->cra_list))
if (list_empty(&alg->cra_list))
goto complete;
if (err == -ECANCELED)
alg->cra_flags |= CRYPTO_ALG_FIPS_INTERNAL;
else if (err)
goto complete;
else
alg->cra_flags &= ~CRYPTO_ALG_FIPS_INTERNAL;
alg->cra_flags |= CRYPTO_ALG_TESTED;
/* Only satisfy larval waiters if we are the best. */
@ -604,6 +618,7 @@ int crypto_register_instance(struct crypto_template *tmpl,
{
struct crypto_larval *larval;
struct crypto_spawn *spawn;
u32 fips_internal = 0;
int err;
err = crypto_check_alg(&inst->alg);
@ -626,11 +641,15 @@ int crypto_register_instance(struct crypto_template *tmpl,
spawn->inst = inst;
spawn->registered = true;
fips_internal |= spawn->alg->cra_flags;
crypto_mod_put(spawn->alg);
spawn = next;
}
inst->alg.cra_flags |= (fips_internal & CRYPTO_ALG_FIPS_INTERNAL);
larval = __crypto_register_alg(&inst->alg);
if (IS_ERR(larval))
goto unlock;
@ -683,7 +702,8 @@ int crypto_grab_spawn(struct crypto_spawn *spawn, struct crypto_instance *inst,
if (IS_ERR(name))
return PTR_ERR(name);
alg = crypto_find_alg(name, spawn->frontend, type, mask);
alg = crypto_find_alg(name, spawn->frontend,
type | CRYPTO_ALG_FIPS_INTERNAL, mask);
if (IS_ERR(alg))
return PTR_ERR(alg);
@ -1002,7 +1022,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
}
while (IS_ENABLED(CONFIG_64BIT) && len >= 8 && !(relalign & 7)) {
if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
u64 l = get_unaligned((u64 *)src1) ^
get_unaligned((u64 *)src2);
put_unaligned(l, (u64 *)dst);
} else {
*(u64 *)dst = *(u64 *)src1 ^ *(u64 *)src2;
}
dst += 8;
src1 += 8;
src2 += 8;
@ -1010,7 +1036,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
}
while (len >= 4 && !(relalign & 3)) {
if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
u32 l = get_unaligned((u32 *)src1) ^
get_unaligned((u32 *)src2);
put_unaligned(l, (u32 *)dst);
} else {
*(u32 *)dst = *(u32 *)src1 ^ *(u32 *)src2;
}
dst += 4;
src1 += 4;
src2 += 4;
@ -1018,7 +1050,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
}
while (len >= 2 && !(relalign & 1)) {
if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
u16 l = get_unaligned((u16 *)src1) ^
get_unaligned((u16 *)src2);
put_unaligned(l, (u16 *)dst);
} else {
*(u16 *)dst = *(u16 *)src1 ^ *(u16 *)src2;
}
dst += 2;
src1 += 2;
src2 += 2;

View File

@ -223,6 +223,8 @@ static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg)
else if (crypto_is_test_larval(larval) &&
!(alg->cra_flags & CRYPTO_ALG_TESTED))
alg = ERR_PTR(-EAGAIN);
else if (alg->cra_flags & CRYPTO_ALG_FIPS_INTERNAL)
alg = ERR_PTR(-EAGAIN);
else if (!crypto_mod_get(alg))
alg = ERR_PTR(-EAGAIN);
crypto_mod_put(&larval->alg);
@ -233,6 +235,7 @@ static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg)
static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type,
u32 mask)
{
const u32 fips = CRYPTO_ALG_FIPS_INTERNAL;
struct crypto_alg *alg;
u32 test = 0;
@ -240,8 +243,20 @@ static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type,
test |= CRYPTO_ALG_TESTED;
down_read(&crypto_alg_sem);
alg = __crypto_alg_lookup(name, type | test, mask | test);
if (!alg && test) {
alg = __crypto_alg_lookup(name, (type | test) & ~fips,
(mask | test) & ~fips);
if (alg) {
if (((type | mask) ^ fips) & fips)
mask |= fips;
mask &= fips;
if (!crypto_is_larval(alg) &&
((type ^ alg->cra_flags) & mask)) {
/* Algorithm is disallowed in FIPS mode. */
crypto_mod_put(alg);
alg = ERR_PTR(-ENOENT);
}
} else if (test) {
alg = __crypto_alg_lookup(name, type, mask);
if (alg && !crypto_is_larval(alg)) {
/* Test failed */

View File

@ -35,7 +35,7 @@ void public_key_signature_free(struct public_key_signature *sig)
EXPORT_SYMBOL_GPL(public_key_signature_free);
/**
* query_asymmetric_key - Get information about an aymmetric key.
* query_asymmetric_key - Get information about an asymmetric key.
* @params: Various parameters.
* @info: Where to put the information.
*/

View File

@ -22,7 +22,7 @@ struct x509_certificate {
time64_t valid_to;
const void *tbs; /* Signed data */
unsigned tbs_size; /* Size of signed data */
unsigned raw_sig_size; /* Size of sigature */
unsigned raw_sig_size; /* Size of signature */
const void *raw_sig; /* Signature data */
const void *raw_serial; /* Raw serial number in ASN.1 */
unsigned raw_serial_size;

View File

@ -170,8 +170,8 @@ dma_xor_aligned_offsets(struct dma_device *device, unsigned int offset,
*
* xor_blocks always uses the dest as a source so the
* ASYNC_TX_XOR_ZERO_DST flag must be set to not include dest data in
* the calculation. The assumption with dma eninges is that they only
* use the destination buffer as a source when it is explicity specified
* the calculation. The assumption with dma engines is that they only
* use the destination buffer as a source when it is explicitly specified
* in the source list.
*
* src_list note: if the dest is also a source it must be at index zero.
@ -261,8 +261,8 @@ EXPORT_SYMBOL_GPL(async_xor_offs);
*
* xor_blocks always uses the dest as a source so the
* ASYNC_TX_XOR_ZERO_DST flag must be set to not include dest data in
* the calculation. The assumption with dma eninges is that they only
* use the destination buffer as a source when it is explicity specified
* the calculation. The assumption with dma engines is that they only
* use the destination buffer as a source when it is explicitly specified
* in the source list.
*
* src_list note: if the dest is also a source it must be at index zero.

View File

@ -217,7 +217,7 @@ static int raid6_test(void)
err += test(12, &tests);
}
/* the 24 disk case is special for ioatdma as it is the boudary point
/* the 24 disk case is special for ioatdma as it is the boundary point
* at which it needs to switch from 8-source ops to 16-source
* ops for continuation (assumes DMA_HAS_PQ_CONTINUE is not set)
*/
@ -241,7 +241,7 @@ static void raid6_test_exit(void)
}
/* when compiled-in wait for drivers to load first (assumes dma drivers
* are also compliled-in)
* are also compiled-in)
*/
late_initcall(raid6_test);
module_exit(raid6_test_exit);

View File

@ -253,7 +253,7 @@ static int crypto_authenc_decrypt_tail(struct aead_request *req,
dst = scatterwalk_ffwd(areq_ctx->dst, req->dst, req->assoclen);
skcipher_request_set_tfm(skreq, ctx->enc);
skcipher_request_set_callback(skreq, aead_request_flags(req),
skcipher_request_set_callback(skreq, flags,
req->base.complete, req->base.data);
skcipher_request_set_crypt(skreq, src, dst,
req->cryptlen - authsize, req->iv);

View File

@ -53,6 +53,7 @@ static void crypto_finalize_request(struct crypto_engine *engine,
dev_err(engine->dev, "failed to unprepare request\n");
}
}
lockdep_assert_in_softirq();
req->complete(req, err);
kthread_queue_work(engine->kworker, &engine->pump_requests);

View File

@ -10,11 +10,11 @@
#include <crypto/internal/kpp.h>
#include <crypto/kpp.h>
#include <crypto/dh.h>
#include <crypto/rng.h>
#include <linux/mpi.h>
struct dh_ctx {
MPI p; /* Value is guaranteed to be set. */
MPI q; /* Value is optional. */
MPI g; /* Value is guaranteed to be set. */
MPI xa; /* Value is guaranteed to be set. */
};
@ -22,7 +22,6 @@ struct dh_ctx {
static void dh_clear_ctx(struct dh_ctx *ctx)
{
mpi_free(ctx->p);
mpi_free(ctx->q);
mpi_free(ctx->g);
mpi_free(ctx->xa);
memset(ctx, 0, sizeof(*ctx));
@ -62,12 +61,6 @@ static int dh_set_params(struct dh_ctx *ctx, struct dh *params)
if (!ctx->p)
return -EINVAL;
if (params->q && params->q_size) {
ctx->q = mpi_read_raw_data(params->q, params->q_size);
if (!ctx->q)
return -EINVAL;
}
ctx->g = mpi_read_raw_data(params->g, params->g_size);
if (!ctx->g)
return -EINVAL;
@ -104,11 +97,12 @@ err_clear_ctx:
/*
* SP800-56A public key verification:
*
* * If Q is provided as part of the domain paramenters, a full validation
* according to SP800-56A section 5.6.2.3.1 is performed.
* * For the safe-prime groups in FIPS mode, Q can be computed
* trivially from P and a full validation according to SP800-56A
* section 5.6.2.3.1 is performed.
*
* * If Q is not provided, a partial validation according to SP800-56A section
* 5.6.2.3.2 is performed.
* * For all other sets of group parameters, only a partial validation
* according to SP800-56A section 5.6.2.3.2 is performed.
*/
static int dh_is_pubkey_valid(struct dh_ctx *ctx, MPI y)
{
@ -119,21 +113,40 @@ static int dh_is_pubkey_valid(struct dh_ctx *ctx, MPI y)
* Step 1: Verify that 2 <= y <= p - 2.
*
* The upper limit check is actually y < p instead of y < p - 1
* as the mpi_sub_ui function is yet missing.
* in order to save one mpi_sub_ui() invocation here. Note that
* p - 1 is the non-trivial element of the subgroup of order 2 and
* thus, the check on y^q below would fail if y == p - 1.
*/
if (mpi_cmp_ui(y, 1) < 1 || mpi_cmp(y, ctx->p) >= 0)
return -EINVAL;
/* Step 2: Verify that 1 = y^q mod p */
if (ctx->q) {
MPI val = mpi_alloc(0);
/*
* Step 2: Verify that 1 = y^q mod p
*
* For the safe-prime groups q = (p - 1)/2.
*/
if (fips_enabled) {
MPI val, q;
int ret;
val = mpi_alloc(0);
if (!val)
return -ENOMEM;
ret = mpi_powm(val, y, ctx->q, ctx->p);
q = mpi_alloc(mpi_get_nlimbs(ctx->p));
if (!q) {
mpi_free(val);
return -ENOMEM;
}
/*
* ->p is odd, so no need to explicitly subtract one
* from it before shifting to the right.
*/
mpi_rshift(q, ctx->p, 1);
ret = mpi_powm(val, y, q, ctx->p);
mpi_free(q);
if (ret) {
mpi_free(val);
return ret;
@ -263,13 +276,645 @@ static struct kpp_alg dh = {
},
};
struct dh_safe_prime {
unsigned int max_strength;
unsigned int p_size;
const char *p;
};
static const char safe_prime_g[] = { 2 };
struct dh_safe_prime_instance_ctx {
struct crypto_kpp_spawn dh_spawn;
const struct dh_safe_prime *safe_prime;
};
struct dh_safe_prime_tfm_ctx {
struct crypto_kpp *dh_tfm;
};
static void dh_safe_prime_free_instance(struct kpp_instance *inst)
{
struct dh_safe_prime_instance_ctx *ctx = kpp_instance_ctx(inst);
crypto_drop_kpp(&ctx->dh_spawn);
kfree(inst);
}
static inline struct dh_safe_prime_instance_ctx *dh_safe_prime_instance_ctx(
struct crypto_kpp *tfm)
{
return kpp_instance_ctx(kpp_alg_instance(tfm));
}
static int dh_safe_prime_init_tfm(struct crypto_kpp *tfm)
{
struct dh_safe_prime_instance_ctx *inst_ctx =
dh_safe_prime_instance_ctx(tfm);
struct dh_safe_prime_tfm_ctx *tfm_ctx = kpp_tfm_ctx(tfm);
tfm_ctx->dh_tfm = crypto_spawn_kpp(&inst_ctx->dh_spawn);
if (IS_ERR(tfm_ctx->dh_tfm))
return PTR_ERR(tfm_ctx->dh_tfm);
return 0;
}
static void dh_safe_prime_exit_tfm(struct crypto_kpp *tfm)
{
struct dh_safe_prime_tfm_ctx *tfm_ctx = kpp_tfm_ctx(tfm);
crypto_free_kpp(tfm_ctx->dh_tfm);
}
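/*
 * Add val into the n-word big-endian multi-precision integer at dst,
 * starting at the least significant word dst[n - 1] and propagating
 * the carry upward; returns the carry out of dst[0].
 */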
static u64 __add_u64_to_be(__be64 *dst, unsigned int n, u64 val)
{
unsigned int i;
for (i = n; val && i > 0; --i) {
u64 tmp = be64_to_cpu(dst[i - 1]);
tmp += val;
val = tmp >= val ? 0 : 1;
dst[i - 1] = cpu_to_be64(tmp);
}
return val;
}
static void *dh_safe_prime_gen_privkey(const struct dh_safe_prime *safe_prime,
unsigned int *key_size)
{
unsigned int n, oversampling_size;
__be64 *key;
int err;
u64 h, o;
/*
* Generate a private key following NIST SP800-56Ar3,
* sec. 5.6.1.1.1 and 5.6.1.1.3, respectively.
*
* 5.6.1.1.1: choose key length N such that
* 2 * ->max_strength <= N <= log2(q) + 1 = ->p_size * 8 - 1
* with q = (p - 1) / 2 for the safe-prime groups.
* Choose the lower bound's next power of two for N in order to
* avoid excessively large private keys while still
* maintaining some extra reserve beyond the bare minimum in
* most cases. Note that for each entry in safe_prime_groups[],
* the following holds for such N:
* - N >= 256, in particular it is a multiple of 2^6 = 64
* bits and
* - N < log2(q) + 1, i.e. N respects the upper bound.
*/
n = roundup_pow_of_two(2 * safe_prime->max_strength);
WARN_ON_ONCE(n & ((1u << 6) - 1));
n >>= 6; /* Convert N into units of u64. */
/*
* Reserve one extra u64 to hold the extra random bits
* required as per 5.6.1.1.3.
*/
oversampling_size = (n + 1) * sizeof(__be64);
key = kmalloc(oversampling_size, GFP_KERNEL);
if (!key)
return ERR_PTR(-ENOMEM);
/*
* 5.6.1.1.3, step 3 (and implicitly step 4): obtain N + 64
* random bits and interpret them as a big endian integer.
*/
err = -EFAULT;
if (crypto_get_default_rng())
goto out_err;
err = crypto_rng_get_bytes(crypto_default_rng, (u8 *)key,
oversampling_size);
crypto_put_default_rng();
if (err)
goto out_err;
/*
* 5.6.1.1.3, step 5 is implicit: 2^N < q and thus,
* M = min(2^N, q) = 2^N.
*
* For step 6, calculate
* key = (key[] mod (M - 1)) + 1 = (key[] mod (2^N - 1)) + 1.
*
* In order to avoid expensive divisions, note that
* 2^N mod (2^N - 1) = 1 and thus, for any integer h,
* 2^N * h mod (2^N - 1) = h mod (2^N - 1) always holds.
* The big endian integer key[] composed of n + 1 64bit words
* may be written as key[] = h * 2^N + l, with h = key[0]
* representing the 64 most significant bits and l
* corresponding to the remaining 2^N bits. With the remark
* from above,
* h * 2^N + l mod (2^N - 1) = l + h mod (2^N - 1).
* As both, l and h are less than 2^N, their sum after
* this first reduction is guaranteed to be <= 2^(N + 1) - 2.
* Or equivalently, that their sum can again be written as
* h' * 2^N + l' with h' now either zero or one and if one,
* then l' <= 2^N - 2. Thus, all bits at positions >= N will
* be zero after a second reduction:
* h' * 2^N + l' mod (2^N - 1) = l' + h' mod (2^N - 1).
* At this point, it is still possible that
* l' + h' = 2^N - 1, i.e. that l' + h' mod (2^N - 1)
* is zero. This condition will be detected below by means of
* the final increment overflowing in this case.
*/
h = be64_to_cpu(key[0]);
h = __add_u64_to_be(key + 1, n, h);
h = __add_u64_to_be(key + 1, n, h);
WARN_ON_ONCE(h);
/* Increment to obtain the final result. */
o = __add_u64_to_be(key + 1, n, 1);
/*
* The overflow bit o from the increment is either zero or
* one. If zero, key[1:n] holds the final result in big-endian
* order. If one, key[1:n] is zero now, but needs to be set to
* one, c.f. above.
*/
if (o)
key[n] = cpu_to_be64(1);
/* n is in units of u64, convert to bytes. */
*key_size = n << 3;
/* Strip the leading extra __be64, which is (virtually) zero by now. */
memmove(key, &key[1], *key_size);
return key;
out_err:
kfree_sensitive(key);
return ERR_PTR(err);
}
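The double-fold reduction described in the comment above is easier to see with toy numbers. A self-contained sketch with N = 8, i.e. modulus 2^8 - 1 = 255 (values chosen arbitrarily; the sum == 255 corner case, which the kernel code catches via the overflow of its final increment, is not exercised here):

#include <assert.h>
#include <stdint.h>

int main(void)
{
	uint16_t key = 0xa7c3;			/* h = 0xa7, l = 0xc3 */
	uint16_t h = key >> 8, l = key & 0xff;
	uint16_t sum = h + l;			/* first fold: may carry */

	sum = (sum >> 8) + (sum & 0xff);	/* second fold: bit 8 clear */
	assert(sum == key % 255);		/* 107 on both sides */
	return 0;
}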
static int dh_safe_prime_set_secret(struct crypto_kpp *tfm, const void *buffer,
unsigned int len)
{
struct dh_safe_prime_instance_ctx *inst_ctx =
dh_safe_prime_instance_ctx(tfm);
struct dh_safe_prime_tfm_ctx *tfm_ctx = kpp_tfm_ctx(tfm);
struct dh params = {};
void *buf = NULL, *key = NULL;
unsigned int buf_size;
int err;
if (buffer) {
err = __crypto_dh_decode_key(buffer, len, &params);
if (err)
return err;
if (params.p_size || params.g_size)
return -EINVAL;
}
params.p = inst_ctx->safe_prime->p;
params.p_size = inst_ctx->safe_prime->p_size;
params.g = safe_prime_g;
params.g_size = sizeof(safe_prime_g);
if (!params.key_size) {
key = dh_safe_prime_gen_privkey(inst_ctx->safe_prime,
&params.key_size);
if (IS_ERR(key))
return PTR_ERR(key);
params.key = key;
}
buf_size = crypto_dh_key_len(&params);
buf = kmalloc(buf_size, GFP_KERNEL);
if (!buf) {
err = -ENOMEM;
goto out;
}
err = crypto_dh_encode_key(buf, buf_size, &params);
if (err)
goto out;
err = crypto_kpp_set_secret(tfm_ctx->dh_tfm, buf, buf_size);
out:
kfree_sensitive(buf);
kfree_sensitive(key);
return err;
}
static void dh_safe_prime_complete_req(struct crypto_async_request *dh_req,
int err)
{
struct kpp_request *req = dh_req->data;
kpp_request_complete(req, err);
}
static struct kpp_request *dh_safe_prime_prepare_dh_req(struct kpp_request *req)
{
struct dh_safe_prime_tfm_ctx *tfm_ctx =
kpp_tfm_ctx(crypto_kpp_reqtfm(req));
struct kpp_request *dh_req = kpp_request_ctx(req);
kpp_request_set_tfm(dh_req, tfm_ctx->dh_tfm);
kpp_request_set_callback(dh_req, req->base.flags,
dh_safe_prime_complete_req, req);
kpp_request_set_input(dh_req, req->src, req->src_len);
kpp_request_set_output(dh_req, req->dst, req->dst_len);
return dh_req;
}
static int dh_safe_prime_generate_public_key(struct kpp_request *req)
{
struct kpp_request *dh_req = dh_safe_prime_prepare_dh_req(req);
return crypto_kpp_generate_public_key(dh_req);
}
static int dh_safe_prime_compute_shared_secret(struct kpp_request *req)
{
struct kpp_request *dh_req = dh_safe_prime_prepare_dh_req(req);
return crypto_kpp_compute_shared_secret(dh_req);
}
static unsigned int dh_safe_prime_max_size(struct crypto_kpp *tfm)
{
struct dh_safe_prime_tfm_ctx *tfm_ctx = kpp_tfm_ctx(tfm);
return crypto_kpp_maxsize(tfm_ctx->dh_tfm);
}
static int __maybe_unused __dh_safe_prime_create(
struct crypto_template *tmpl, struct rtattr **tb,
const struct dh_safe_prime *safe_prime)
{
struct kpp_instance *inst;
struct dh_safe_prime_instance_ctx *ctx;
const char *dh_name;
struct kpp_alg *dh_alg;
u32 mask;
int err;
err = crypto_check_attr_type(tb, CRYPTO_ALG_TYPE_KPP, &mask);
if (err)
return err;
dh_name = crypto_attr_alg_name(tb[1]);
if (IS_ERR(dh_name))
return PTR_ERR(dh_name);
inst = kzalloc(sizeof(*inst) + sizeof(*ctx), GFP_KERNEL);
if (!inst)
return -ENOMEM;
ctx = kpp_instance_ctx(inst);
err = crypto_grab_kpp(&ctx->dh_spawn, kpp_crypto_instance(inst),
dh_name, 0, mask);
if (err)
goto err_free_inst;
err = -EINVAL;
dh_alg = crypto_spawn_kpp_alg(&ctx->dh_spawn);
if (strcmp(dh_alg->base.cra_name, "dh"))
goto err_free_inst;
ctx->safe_prime = safe_prime;
err = crypto_inst_setname(kpp_crypto_instance(inst),
tmpl->name, &dh_alg->base);
if (err)
goto err_free_inst;
inst->alg.set_secret = dh_safe_prime_set_secret;
inst->alg.generate_public_key = dh_safe_prime_generate_public_key;
inst->alg.compute_shared_secret = dh_safe_prime_compute_shared_secret;
inst->alg.max_size = dh_safe_prime_max_size;
inst->alg.init = dh_safe_prime_init_tfm;
inst->alg.exit = dh_safe_prime_exit_tfm;
inst->alg.reqsize = sizeof(struct kpp_request) + dh_alg->reqsize;
inst->alg.base.cra_priority = dh_alg->base.cra_priority;
inst->alg.base.cra_module = THIS_MODULE;
inst->alg.base.cra_ctxsize = sizeof(struct dh_safe_prime_tfm_ctx);
inst->free = dh_safe_prime_free_instance;
err = kpp_register_instance(tmpl, inst);
if (err)
goto err_free_inst;
return 0;
err_free_inst:
dh_safe_prime_free_instance(inst);
return err;
}
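With the template registered, e.g. "ffdhe2048(dh)" can be instantiated through the normal kpp API, and passing a dh key blob with an empty key triggers the private-key generation path above. A rough kernel-side sketch (illustrative only; a real caller would go on to build a kpp_request for generate_public_key, which is omitted here):

#include <crypto/dh.h>
#include <crypto/kpp.h>
#include <linux/err.h>
#include <linux/slab.h>

static int ffdhe_set_secret_example(void)
{
	struct dh params = {};	/* empty key: template generates one */
	struct crypto_kpp *tfm;
	unsigned int len;
	void *buf;
	int err;

	tfm = crypto_alloc_kpp("ffdhe2048(dh)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	len = crypto_dh_key_len(&params);
	buf = kmalloc(len, GFP_KERNEL);
	err = buf ? crypto_dh_encode_key(buf, len, &params) : -ENOMEM;
	if (!err)
		err = crypto_kpp_set_secret(tfm, buf, len);

	kfree_sensitive(buf);
	crypto_free_kpp(tfm);
	return err;
}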
#ifdef CONFIG_CRYPTO_DH_RFC7919_GROUPS
static const struct dh_safe_prime ffdhe2048_prime = {
.max_strength = 112,
.p_size = 256,
.p =
"\xff\xff\xff\xff\xff\xff\xff\xff\xad\xf8\x54\x58\xa2\xbb\x4a\x9a"
"\xaf\xdc\x56\x20\x27\x3d\x3c\xf1\xd8\xb9\xc5\x83\xce\x2d\x36\x95"
"\xa9\xe1\x36\x41\x14\x64\x33\xfb\xcc\x93\x9d\xce\x24\x9b\x3e\xf9"
"\x7d\x2f\xe3\x63\x63\x0c\x75\xd8\xf6\x81\xb2\x02\xae\xc4\x61\x7a"
"\xd3\xdf\x1e\xd5\xd5\xfd\x65\x61\x24\x33\xf5\x1f\x5f\x06\x6e\xd0"
"\x85\x63\x65\x55\x3d\xed\x1a\xf3\xb5\x57\x13\x5e\x7f\x57\xc9\x35"
"\x98\x4f\x0c\x70\xe0\xe6\x8b\x77\xe2\xa6\x89\xda\xf3\xef\xe8\x72"
"\x1d\xf1\x58\xa1\x36\xad\xe7\x35\x30\xac\xca\x4f\x48\x3a\x79\x7a"
"\xbc\x0a\xb1\x82\xb3\x24\xfb\x61\xd1\x08\xa9\x4b\xb2\xc8\xe3\xfb"
"\xb9\x6a\xda\xb7\x60\xd7\xf4\x68\x1d\x4f\x42\xa3\xde\x39\x4d\xf4"
"\xae\x56\xed\xe7\x63\x72\xbb\x19\x0b\x07\xa7\xc8\xee\x0a\x6d\x70"
"\x9e\x02\xfc\xe1\xcd\xf7\xe2\xec\xc0\x34\x04\xcd\x28\x34\x2f\x61"
"\x91\x72\xfe\x9c\xe9\x85\x83\xff\x8e\x4f\x12\x32\xee\xf2\x81\x83"
"\xc3\xfe\x3b\x1b\x4c\x6f\xad\x73\x3b\xb5\xfc\xbc\x2e\xc2\x20\x05"
"\xc5\x8e\xf1\x83\x7d\x16\x83\xb2\xc6\xf3\x4a\x26\xc1\xb2\xef\xfa"
"\x88\x6b\x42\x38\x61\x28\x5c\x97\xff\xff\xff\xff\xff\xff\xff\xff",
};
static const struct dh_safe_prime ffdhe3072_prime = {
.max_strength = 128,
.p_size = 384,
.p =
"\xff\xff\xff\xff\xff\xff\xff\xff\xad\xf8\x54\x58\xa2\xbb\x4a\x9a"
"\xaf\xdc\x56\x20\x27\x3d\x3c\xf1\xd8\xb9\xc5\x83\xce\x2d\x36\x95"
"\xa9\xe1\x36\x41\x14\x64\x33\xfb\xcc\x93\x9d\xce\x24\x9b\x3e\xf9"
"\x7d\x2f\xe3\x63\x63\x0c\x75\xd8\xf6\x81\xb2\x02\xae\xc4\x61\x7a"
"\xd3\xdf\x1e\xd5\xd5\xfd\x65\x61\x24\x33\xf5\x1f\x5f\x06\x6e\xd0"
"\x85\x63\x65\x55\x3d\xed\x1a\xf3\xb5\x57\x13\x5e\x7f\x57\xc9\x35"
"\x98\x4f\x0c\x70\xe0\xe6\x8b\x77\xe2\xa6\x89\xda\xf3\xef\xe8\x72"
"\x1d\xf1\x58\xa1\x36\xad\xe7\x35\x30\xac\xca\x4f\x48\x3a\x79\x7a"
"\xbc\x0a\xb1\x82\xb3\x24\xfb\x61\xd1\x08\xa9\x4b\xb2\xc8\xe3\xfb"
"\xb9\x6a\xda\xb7\x60\xd7\xf4\x68\x1d\x4f\x42\xa3\xde\x39\x4d\xf4"
"\xae\x56\xed\xe7\x63\x72\xbb\x19\x0b\x07\xa7\xc8\xee\x0a\x6d\x70"
"\x9e\x02\xfc\xe1\xcd\xf7\xe2\xec\xc0\x34\x04\xcd\x28\x34\x2f\x61"
"\x91\x72\xfe\x9c\xe9\x85\x83\xff\x8e\x4f\x12\x32\xee\xf2\x81\x83"
"\xc3\xfe\x3b\x1b\x4c\x6f\xad\x73\x3b\xb5\xfc\xbc\x2e\xc2\x20\x05"
"\xc5\x8e\xf1\x83\x7d\x16\x83\xb2\xc6\xf3\x4a\x26\xc1\xb2\xef\xfa"
"\x88\x6b\x42\x38\x61\x1f\xcf\xdc\xde\x35\x5b\x3b\x65\x19\x03\x5b"
"\xbc\x34\xf4\xde\xf9\x9c\x02\x38\x61\xb4\x6f\xc9\xd6\xe6\xc9\x07"
"\x7a\xd9\x1d\x26\x91\xf7\xf7\xee\x59\x8c\xb0\xfa\xc1\x86\xd9\x1c"
"\xae\xfe\x13\x09\x85\x13\x92\x70\xb4\x13\x0c\x93\xbc\x43\x79\x44"
"\xf4\xfd\x44\x52\xe2\xd7\x4d\xd3\x64\xf2\xe2\x1e\x71\xf5\x4b\xff"
"\x5c\xae\x82\xab\x9c\x9d\xf6\x9e\xe8\x6d\x2b\xc5\x22\x36\x3a\x0d"
"\xab\xc5\x21\x97\x9b\x0d\xea\xda\x1d\xbf\x9a\x42\xd5\xc4\x48\x4e"
"\x0a\xbc\xd0\x6b\xfa\x53\xdd\xef\x3c\x1b\x20\xee\x3f\xd5\x9d\x7c"
"\x25\xe4\x1d\x2b\x66\xc6\x2e\x37\xff\xff\xff\xff\xff\xff\xff\xff",
};
static const struct dh_safe_prime ffdhe4096_prime = {
.max_strength = 152,
.p_size = 512,
.p =
"\xff\xff\xff\xff\xff\xff\xff\xff\xad\xf8\x54\x58\xa2\xbb\x4a\x9a"
"\xaf\xdc\x56\x20\x27\x3d\x3c\xf1\xd8\xb9\xc5\x83\xce\x2d\x36\x95"
"\xa9\xe1\x36\x41\x14\x64\x33\xfb\xcc\x93\x9d\xce\x24\x9b\x3e\xf9"
"\x7d\x2f\xe3\x63\x63\x0c\x75\xd8\xf6\x81\xb2\x02\xae\xc4\x61\x7a"
"\xd3\xdf\x1e\xd5\xd5\xfd\x65\x61\x24\x33\xf5\x1f\x5f\x06\x6e\xd0"
"\x85\x63\x65\x55\x3d\xed\x1a\xf3\xb5\x57\x13\x5e\x7f\x57\xc9\x35"
"\x98\x4f\x0c\x70\xe0\xe6\x8b\x77\xe2\xa6\x89\xda\xf3\xef\xe8\x72"
"\x1d\xf1\x58\xa1\x36\xad\xe7\x35\x30\xac\xca\x4f\x48\x3a\x79\x7a"
"\xbc\x0a\xb1\x82\xb3\x24\xfb\x61\xd1\x08\xa9\x4b\xb2\xc8\xe3\xfb"
"\xb9\x6a\xda\xb7\x60\xd7\xf4\x68\x1d\x4f\x42\xa3\xde\x39\x4d\xf4"
"\xae\x56\xed\xe7\x63\x72\xbb\x19\x0b\x07\xa7\xc8\xee\x0a\x6d\x70"
"\x9e\x02\xfc\xe1\xcd\xf7\xe2\xec\xc0\x34\x04\xcd\x28\x34\x2f\x61"
"\x91\x72\xfe\x9c\xe9\x85\x83\xff\x8e\x4f\x12\x32\xee\xf2\x81\x83"
"\xc3\xfe\x3b\x1b\x4c\x6f\xad\x73\x3b\xb5\xfc\xbc\x2e\xc2\x20\x05"
"\xc5\x8e\xf1\x83\x7d\x16\x83\xb2\xc6\xf3\x4a\x26\xc1\xb2\xef\xfa"
"\x88\x6b\x42\x38\x61\x1f\xcf\xdc\xde\x35\x5b\x3b\x65\x19\x03\x5b"
"\xbc\x34\xf4\xde\xf9\x9c\x02\x38\x61\xb4\x6f\xc9\xd6\xe6\xc9\x07"
"\x7a\xd9\x1d\x26\x91\xf7\xf7\xee\x59\x8c\xb0\xfa\xc1\x86\xd9\x1c"
"\xae\xfe\x13\x09\x85\x13\x92\x70\xb4\x13\x0c\x93\xbc\x43\x79\x44"
"\xf4\xfd\x44\x52\xe2\xd7\x4d\xd3\x64\xf2\xe2\x1e\x71\xf5\x4b\xff"
"\x5c\xae\x82\xab\x9c\x9d\xf6\x9e\xe8\x6d\x2b\xc5\x22\x36\x3a\x0d"
"\xab\xc5\x21\x97\x9b\x0d\xea\xda\x1d\xbf\x9a\x42\xd5\xc4\x48\x4e"
"\x0a\xbc\xd0\x6b\xfa\x53\xdd\xef\x3c\x1b\x20\xee\x3f\xd5\x9d\x7c"
"\x25\xe4\x1d\x2b\x66\x9e\x1e\xf1\x6e\x6f\x52\xc3\x16\x4d\xf4\xfb"
"\x79\x30\xe9\xe4\xe5\x88\x57\xb6\xac\x7d\x5f\x42\xd6\x9f\x6d\x18"
"\x77\x63\xcf\x1d\x55\x03\x40\x04\x87\xf5\x5b\xa5\x7e\x31\xcc\x7a"
"\x71\x35\xc8\x86\xef\xb4\x31\x8a\xed\x6a\x1e\x01\x2d\x9e\x68\x32"
"\xa9\x07\x60\x0a\x91\x81\x30\xc4\x6d\xc7\x78\xf9\x71\xad\x00\x38"
"\x09\x29\x99\xa3\x33\xcb\x8b\x7a\x1a\x1d\xb9\x3d\x71\x40\x00\x3c"
"\x2a\x4e\xce\xa9\xf9\x8d\x0a\xcc\x0a\x82\x91\xcd\xce\xc9\x7d\xcf"
"\x8e\xc9\xb5\x5a\x7f\x88\xa4\x6b\x4d\xb5\xa8\x51\xf4\x41\x82\xe1"
"\xc6\x8a\x00\x7e\x5e\x65\x5f\x6a\xff\xff\xff\xff\xff\xff\xff\xff",
};
static const struct dh_safe_prime ffdhe6144_prime = {
.max_strength = 176,
.p_size = 768,
.p =
"\xff\xff\xff\xff\xff\xff\xff\xff\xad\xf8\x54\x58\xa2\xbb\x4a\x9a"
"\xaf\xdc\x56\x20\x27\x3d\x3c\xf1\xd8\xb9\xc5\x83\xce\x2d\x36\x95"
"\xa9\xe1\x36\x41\x14\x64\x33\xfb\xcc\x93\x9d\xce\x24\x9b\x3e\xf9"
"\x7d\x2f\xe3\x63\x63\x0c\x75\xd8\xf6\x81\xb2\x02\xae\xc4\x61\x7a"
"\xd3\xdf\x1e\xd5\xd5\xfd\x65\x61\x24\x33\xf5\x1f\x5f\x06\x6e\xd0"
"\x85\x63\x65\x55\x3d\xed\x1a\xf3\xb5\x57\x13\x5e\x7f\x57\xc9\x35"
"\x98\x4f\x0c\x70\xe0\xe6\x8b\x77\xe2\xa6\x89\xda\xf3\xef\xe8\x72"
"\x1d\xf1\x58\xa1\x36\xad\xe7\x35\x30\xac\xca\x4f\x48\x3a\x79\x7a"
"\xbc\x0a\xb1\x82\xb3\x24\xfb\x61\xd1\x08\xa9\x4b\xb2\xc8\xe3\xfb"
"\xb9\x6a\xda\xb7\x60\xd7\xf4\x68\x1d\x4f\x42\xa3\xde\x39\x4d\xf4"
"\xae\x56\xed\xe7\x63\x72\xbb\x19\x0b\x07\xa7\xc8\xee\x0a\x6d\x70"
"\x9e\x02\xfc\xe1\xcd\xf7\xe2\xec\xc0\x34\x04\xcd\x28\x34\x2f\x61"
"\x91\x72\xfe\x9c\xe9\x85\x83\xff\x8e\x4f\x12\x32\xee\xf2\x81\x83"
"\xc3\xfe\x3b\x1b\x4c\x6f\xad\x73\x3b\xb5\xfc\xbc\x2e\xc2\x20\x05"
"\xc5\x8e\xf1\x83\x7d\x16\x83\xb2\xc6\xf3\x4a\x26\xc1\xb2\xef\xfa"
"\x88\x6b\x42\x38\x61\x1f\xcf\xdc\xde\x35\x5b\x3b\x65\x19\x03\x5b"
"\xbc\x34\xf4\xde\xf9\x9c\x02\x38\x61\xb4\x6f\xc9\xd6\xe6\xc9\x07"
"\x7a\xd9\x1d\x26\x91\xf7\xf7\xee\x59\x8c\xb0\xfa\xc1\x86\xd9\x1c"
"\xae\xfe\x13\x09\x85\x13\x92\x70\xb4\x13\x0c\x93\xbc\x43\x79\x44"
"\xf4\xfd\x44\x52\xe2\xd7\x4d\xd3\x64\xf2\xe2\x1e\x71\xf5\x4b\xff"
"\x5c\xae\x82\xab\x9c\x9d\xf6\x9e\xe8\x6d\x2b\xc5\x22\x36\x3a\x0d"
"\xab\xc5\x21\x97\x9b\x0d\xea\xda\x1d\xbf\x9a\x42\xd5\xc4\x48\x4e"
"\x0a\xbc\xd0\x6b\xfa\x53\xdd\xef\x3c\x1b\x20\xee\x3f\xd5\x9d\x7c"
"\x25\xe4\x1d\x2b\x66\x9e\x1e\xf1\x6e\x6f\x52\xc3\x16\x4d\xf4\xfb"
"\x79\x30\xe9\xe4\xe5\x88\x57\xb6\xac\x7d\x5f\x42\xd6\x9f\x6d\x18"
"\x77\x63\xcf\x1d\x55\x03\x40\x04\x87\xf5\x5b\xa5\x7e\x31\xcc\x7a"
"\x71\x35\xc8\x86\xef\xb4\x31\x8a\xed\x6a\x1e\x01\x2d\x9e\x68\x32"
"\xa9\x07\x60\x0a\x91\x81\x30\xc4\x6d\xc7\x78\xf9\x71\xad\x00\x38"
"\x09\x29\x99\xa3\x33\xcb\x8b\x7a\x1a\x1d\xb9\x3d\x71\x40\x00\x3c"
"\x2a\x4e\xce\xa9\xf9\x8d\x0a\xcc\x0a\x82\x91\xcd\xce\xc9\x7d\xcf"
"\x8e\xc9\xb5\x5a\x7f\x88\xa4\x6b\x4d\xb5\xa8\x51\xf4\x41\x82\xe1"
"\xc6\x8a\x00\x7e\x5e\x0d\xd9\x02\x0b\xfd\x64\xb6\x45\x03\x6c\x7a"
"\x4e\x67\x7d\x2c\x38\x53\x2a\x3a\x23\xba\x44\x42\xca\xf5\x3e\xa6"
"\x3b\xb4\x54\x32\x9b\x76\x24\xc8\x91\x7b\xdd\x64\xb1\xc0\xfd\x4c"
"\xb3\x8e\x8c\x33\x4c\x70\x1c\x3a\xcd\xad\x06\x57\xfc\xcf\xec\x71"
"\x9b\x1f\x5c\x3e\x4e\x46\x04\x1f\x38\x81\x47\xfb\x4c\xfd\xb4\x77"
"\xa5\x24\x71\xf7\xa9\xa9\x69\x10\xb8\x55\x32\x2e\xdb\x63\x40\xd8"
"\xa0\x0e\xf0\x92\x35\x05\x11\xe3\x0a\xbe\xc1\xff\xf9\xe3\xa2\x6e"
"\x7f\xb2\x9f\x8c\x18\x30\x23\xc3\x58\x7e\x38\xda\x00\x77\xd9\xb4"
"\x76\x3e\x4e\x4b\x94\xb2\xbb\xc1\x94\xc6\x65\x1e\x77\xca\xf9\x92"
"\xee\xaa\xc0\x23\x2a\x28\x1b\xf6\xb3\xa7\x39\xc1\x22\x61\x16\x82"
"\x0a\xe8\xdb\x58\x47\xa6\x7c\xbe\xf9\xc9\x09\x1b\x46\x2d\x53\x8c"
"\xd7\x2b\x03\x74\x6a\xe7\x7f\x5e\x62\x29\x2c\x31\x15\x62\xa8\x46"
"\x50\x5d\xc8\x2d\xb8\x54\x33\x8a\xe4\x9f\x52\x35\xc9\x5b\x91\x17"
"\x8c\xcf\x2d\xd5\xca\xce\xf4\x03\xec\x9d\x18\x10\xc6\x27\x2b\x04"
"\x5b\x3b\x71\xf9\xdc\x6b\x80\xd6\x3f\xdd\x4a\x8e\x9a\xdb\x1e\x69"
"\x62\xa6\x95\x26\xd4\x31\x61\xc1\xa4\x1d\x57\x0d\x79\x38\xda\xd4"
"\xa4\x0e\x32\x9c\xd0\xe4\x0e\x65\xff\xff\xff\xff\xff\xff\xff\xff",
};
static const struct dh_safe_prime ffdhe8192_prime = {
.max_strength = 200,
.p_size = 1024,
.p =
"\xff\xff\xff\xff\xff\xff\xff\xff\xad\xf8\x54\x58\xa2\xbb\x4a\x9a"
"\xaf\xdc\x56\x20\x27\x3d\x3c\xf1\xd8\xb9\xc5\x83\xce\x2d\x36\x95"
"\xa9\xe1\x36\x41\x14\x64\x33\xfb\xcc\x93\x9d\xce\x24\x9b\x3e\xf9"
"\x7d\x2f\xe3\x63\x63\x0c\x75\xd8\xf6\x81\xb2\x02\xae\xc4\x61\x7a"
"\xd3\xdf\x1e\xd5\xd5\xfd\x65\x61\x24\x33\xf5\x1f\x5f\x06\x6e\xd0"
"\x85\x63\x65\x55\x3d\xed\x1a\xf3\xb5\x57\x13\x5e\x7f\x57\xc9\x35"
"\x98\x4f\x0c\x70\xe0\xe6\x8b\x77\xe2\xa6\x89\xda\xf3\xef\xe8\x72"
"\x1d\xf1\x58\xa1\x36\xad\xe7\x35\x30\xac\xca\x4f\x48\x3a\x79\x7a"
"\xbc\x0a\xb1\x82\xb3\x24\xfb\x61\xd1\x08\xa9\x4b\xb2\xc8\xe3\xfb"
"\xb9\x6a\xda\xb7\x60\xd7\xf4\x68\x1d\x4f\x42\xa3\xde\x39\x4d\xf4"
"\xae\x56\xed\xe7\x63\x72\xbb\x19\x0b\x07\xa7\xc8\xee\x0a\x6d\x70"
"\x9e\x02\xfc\xe1\xcd\xf7\xe2\xec\xc0\x34\x04\xcd\x28\x34\x2f\x61"
"\x91\x72\xfe\x9c\xe9\x85\x83\xff\x8e\x4f\x12\x32\xee\xf2\x81\x83"
"\xc3\xfe\x3b\x1b\x4c\x6f\xad\x73\x3b\xb5\xfc\xbc\x2e\xc2\x20\x05"
"\xc5\x8e\xf1\x83\x7d\x16\x83\xb2\xc6\xf3\x4a\x26\xc1\xb2\xef\xfa"
"\x88\x6b\x42\x38\x61\x1f\xcf\xdc\xde\x35\x5b\x3b\x65\x19\x03\x5b"
"\xbc\x34\xf4\xde\xf9\x9c\x02\x38\x61\xb4\x6f\xc9\xd6\xe6\xc9\x07"
"\x7a\xd9\x1d\x26\x91\xf7\xf7\xee\x59\x8c\xb0\xfa\xc1\x86\xd9\x1c"
"\xae\xfe\x13\x09\x85\x13\x92\x70\xb4\x13\x0c\x93\xbc\x43\x79\x44"
"\xf4\xfd\x44\x52\xe2\xd7\x4d\xd3\x64\xf2\xe2\x1e\x71\xf5\x4b\xff"
"\x5c\xae\x82\xab\x9c\x9d\xf6\x9e\xe8\x6d\x2b\xc5\x22\x36\x3a\x0d"
"\xab\xc5\x21\x97\x9b\x0d\xea\xda\x1d\xbf\x9a\x42\xd5\xc4\x48\x4e"
"\x0a\xbc\xd0\x6b\xfa\x53\xdd\xef\x3c\x1b\x20\xee\x3f\xd5\x9d\x7c"
"\x25\xe4\x1d\x2b\x66\x9e\x1e\xf1\x6e\x6f\x52\xc3\x16\x4d\xf4\xfb"
"\x79\x30\xe9\xe4\xe5\x88\x57\xb6\xac\x7d\x5f\x42\xd6\x9f\x6d\x18"
"\x77\x63\xcf\x1d\x55\x03\x40\x04\x87\xf5\x5b\xa5\x7e\x31\xcc\x7a"
"\x71\x35\xc8\x86\xef\xb4\x31\x8a\xed\x6a\x1e\x01\x2d\x9e\x68\x32"
"\xa9\x07\x60\x0a\x91\x81\x30\xc4\x6d\xc7\x78\xf9\x71\xad\x00\x38"
"\x09\x29\x99\xa3\x33\xcb\x8b\x7a\x1a\x1d\xb9\x3d\x71\x40\x00\x3c"
"\x2a\x4e\xce\xa9\xf9\x8d\x0a\xcc\x0a\x82\x91\xcd\xce\xc9\x7d\xcf"
"\x8e\xc9\xb5\x5a\x7f\x88\xa4\x6b\x4d\xb5\xa8\x51\xf4\x41\x82\xe1"
"\xc6\x8a\x00\x7e\x5e\x0d\xd9\x02\x0b\xfd\x64\xb6\x45\x03\x6c\x7a"
"\x4e\x67\x7d\x2c\x38\x53\x2a\x3a\x23\xba\x44\x42\xca\xf5\x3e\xa6"
"\x3b\xb4\x54\x32\x9b\x76\x24\xc8\x91\x7b\xdd\x64\xb1\xc0\xfd\x4c"
"\xb3\x8e\x8c\x33\x4c\x70\x1c\x3a\xcd\xad\x06\x57\xfc\xcf\xec\x71"
"\x9b\x1f\x5c\x3e\x4e\x46\x04\x1f\x38\x81\x47\xfb\x4c\xfd\xb4\x77"
"\xa5\x24\x71\xf7\xa9\xa9\x69\x10\xb8\x55\x32\x2e\xdb\x63\x40\xd8"
"\xa0\x0e\xf0\x92\x35\x05\x11\xe3\x0a\xbe\xc1\xff\xf9\xe3\xa2\x6e"
"\x7f\xb2\x9f\x8c\x18\x30\x23\xc3\x58\x7e\x38\xda\x00\x77\xd9\xb4"
"\x76\x3e\x4e\x4b\x94\xb2\xbb\xc1\x94\xc6\x65\x1e\x77\xca\xf9\x92"
"\xee\xaa\xc0\x23\x2a\x28\x1b\xf6\xb3\xa7\x39\xc1\x22\x61\x16\x82"
"\x0a\xe8\xdb\x58\x47\xa6\x7c\xbe\xf9\xc9\x09\x1b\x46\x2d\x53\x8c"
"\xd7\x2b\x03\x74\x6a\xe7\x7f\x5e\x62\x29\x2c\x31\x15\x62\xa8\x46"
"\x50\x5d\xc8\x2d\xb8\x54\x33\x8a\xe4\x9f\x52\x35\xc9\x5b\x91\x17"
"\x8c\xcf\x2d\xd5\xca\xce\xf4\x03\xec\x9d\x18\x10\xc6\x27\x2b\x04"
"\x5b\x3b\x71\xf9\xdc\x6b\x80\xd6\x3f\xdd\x4a\x8e\x9a\xdb\x1e\x69"
"\x62\xa6\x95\x26\xd4\x31\x61\xc1\xa4\x1d\x57\x0d\x79\x38\xda\xd4"
"\xa4\x0e\x32\x9c\xcf\xf4\x6a\xaa\x36\xad\x00\x4c\xf6\x00\xc8\x38"
"\x1e\x42\x5a\x31\xd9\x51\xae\x64\xfd\xb2\x3f\xce\xc9\x50\x9d\x43"
"\x68\x7f\xeb\x69\xed\xd1\xcc\x5e\x0b\x8c\xc3\xbd\xf6\x4b\x10\xef"
"\x86\xb6\x31\x42\xa3\xab\x88\x29\x55\x5b\x2f\x74\x7c\x93\x26\x65"
"\xcb\x2c\x0f\x1c\xc0\x1b\xd7\x02\x29\x38\x88\x39\xd2\xaf\x05\xe4"
"\x54\x50\x4a\xc7\x8b\x75\x82\x82\x28\x46\xc0\xba\x35\xc3\x5f\x5c"
"\x59\x16\x0c\xc0\x46\xfd\x82\x51\x54\x1f\xc6\x8c\x9c\x86\xb0\x22"
"\xbb\x70\x99\x87\x6a\x46\x0e\x74\x51\xa8\xa9\x31\x09\x70\x3f\xee"
"\x1c\x21\x7e\x6c\x38\x26\xe5\x2c\x51\xaa\x69\x1e\x0e\x42\x3c\xfc"
"\x99\xe9\xe3\x16\x50\xc1\x21\x7b\x62\x48\x16\xcd\xad\x9a\x95\xf9"
"\xd5\xb8\x01\x94\x88\xd9\xc0\xa0\xa1\xfe\x30\x75\xa5\x77\xe2\x31"
"\x83\xf8\x1d\x4a\x3f\x2f\xa4\x57\x1e\xfc\x8c\xe0\xba\x8a\x4f\xe8"
"\xb6\x85\x5d\xfe\x72\xb0\xa6\x6e\xde\xd2\xfb\xab\xfb\xe5\x8a\x30"
"\xfa\xfa\xbe\x1c\x5d\x71\xa8\x7e\x2f\x74\x1e\xf8\xc1\xfe\x86\xfe"
"\xa6\xbb\xfd\xe5\x30\x67\x7f\x0d\x97\xd1\x1d\x49\xf7\xa8\x44\x3d"
"\x08\x22\xe5\x06\xa9\xf4\x61\x4e\x01\x1e\x2a\x94\x83\x8f\xf8\x8c"
"\xd6\x8c\x8b\xb7\xc5\xc6\x42\x4c\xff\xff\xff\xff\xff\xff\xff\xff",
};
static int dh_ffdhe2048_create(struct crypto_template *tmpl,
struct rtattr **tb)
{
return __dh_safe_prime_create(tmpl, tb, &ffdhe2048_prime);
}
static int dh_ffdhe3072_create(struct crypto_template *tmpl,
struct rtattr **tb)
{
return __dh_safe_prime_create(tmpl, tb, &ffdhe3072_prime);
}
static int dh_ffdhe4096_create(struct crypto_template *tmpl,
struct rtattr **tb)
{
return __dh_safe_prime_create(tmpl, tb, &ffdhe4096_prime);
}
static int dh_ffdhe6144_create(struct crypto_template *tmpl,
struct rtattr **tb)
{
return __dh_safe_prime_create(tmpl, tb, &ffdhe6144_prime);
}
static int dh_ffdhe8192_create(struct crypto_template *tmpl,
struct rtattr **tb)
{
return __dh_safe_prime_create(tmpl, tb, &ffdhe8192_prime);
}
static struct crypto_template crypto_ffdhe_templates[] = {
{
.name = "ffdhe2048",
.create = dh_ffdhe2048_create,
.module = THIS_MODULE,
},
{
.name = "ffdhe3072",
.create = dh_ffdhe3072_create,
.module = THIS_MODULE,
},
{
.name = "ffdhe4096",
.create = dh_ffdhe4096_create,
.module = THIS_MODULE,
},
{
.name = "ffdhe6144",
.create = dh_ffdhe6144_create,
.module = THIS_MODULE,
},
{
.name = "ffdhe8192",
.create = dh_ffdhe8192_create,
.module = THIS_MODULE,
},
};
#else /* ! CONFIG_CRYPTO_DH_RFC7919_GROUPS */
static struct crypto_template crypto_ffdhe_templates[] = {};
#endif /* CONFIG_CRYPTO_DH_RFC7919_GROUPS */
static int dh_init(void)
{
return crypto_register_kpp(&dh);
int err;
err = crypto_register_kpp(&dh);
if (err)
return err;
err = crypto_register_templates(crypto_ffdhe_templates,
ARRAY_SIZE(crypto_ffdhe_templates));
if (err) {
crypto_unregister_kpp(&dh);
return err;
}
return 0;
}
static void dh_exit(void)
{
crypto_unregister_templates(crypto_ffdhe_templates,
ARRAY_SIZE(crypto_ffdhe_templates));
crypto_unregister_kpp(&dh);
}
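
For orientation (an editorial sketch, not part of the series): once these templates are registered, an instance is reached through the ordinary KPP allocation path by template name. The example_ function name below is hypothetical; crypto_alloc_kpp() and crypto_free_kpp() are the standard kernel interfaces.

#include <linux/err.h>
#include <crypto/dh.h>
#include <crypto/kpp.h>

static int example_alloc_ffdhe(void)
{
	struct crypto_kpp *tfm;

	/* Instantiates the template around the plain "dh" implementation. */
	tfm = crypto_alloc_kpp("ffdhe2048(dh)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	/* ... crypto_kpp_set_secret(), crypto_kpp_generate_public_key() ... */

	crypto_free_kpp(tfm);
	return 0;
}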

View File

@ -10,7 +10,7 @@
#include <crypto/dh.h>
#include <crypto/kpp.h>
#define DH_KPP_SECRET_MIN_SIZE (sizeof(struct kpp_secret) + 4 * sizeof(int))
#define DH_KPP_SECRET_MIN_SIZE (sizeof(struct kpp_secret) + 3 * sizeof(int))
static inline u8 *dh_pack_data(u8 *dst, u8 *end, const void *src, size_t size)
{
@ -28,7 +28,7 @@ static inline const u8 *dh_unpack_data(void *dst, const void *src, size_t size)
static inline unsigned int dh_data_size(const struct dh *p)
{
return p->key_size + p->p_size + p->q_size + p->g_size;
return p->key_size + p->p_size + p->g_size;
}
unsigned int crypto_dh_key_len(const struct dh *p)
@ -53,11 +53,9 @@ int crypto_dh_encode_key(char *buf, unsigned int len, const struct dh *params)
ptr = dh_pack_data(ptr, end, &params->key_size,
sizeof(params->key_size));
ptr = dh_pack_data(ptr, end, &params->p_size, sizeof(params->p_size));
ptr = dh_pack_data(ptr, end, &params->q_size, sizeof(params->q_size));
ptr = dh_pack_data(ptr, end, &params->g_size, sizeof(params->g_size));
ptr = dh_pack_data(ptr, end, params->key, params->key_size);
ptr = dh_pack_data(ptr, end, params->p, params->p_size);
ptr = dh_pack_data(ptr, end, params->q, params->q_size);
ptr = dh_pack_data(ptr, end, params->g, params->g_size);
if (ptr != end)
return -EINVAL;
@ -65,7 +63,7 @@ int crypto_dh_encode_key(char *buf, unsigned int len, const struct dh *params)
}
EXPORT_SYMBOL_GPL(crypto_dh_encode_key);
int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
int __crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
{
const u8 *ptr = buf;
struct kpp_secret secret;
@ -79,27 +77,35 @@ int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
ptr = dh_unpack_data(&params->key_size, ptr, sizeof(params->key_size));
ptr = dh_unpack_data(&params->p_size, ptr, sizeof(params->p_size));
ptr = dh_unpack_data(&params->q_size, ptr, sizeof(params->q_size));
ptr = dh_unpack_data(&params->g_size, ptr, sizeof(params->g_size));
if (secret.len != crypto_dh_key_len(params))
return -EINVAL;
/*
* Don't permit the buffer for 'key' or 'g' to be larger than 'p', since
* some drivers assume otherwise.
*/
if (params->key_size > params->p_size ||
params->g_size > params->p_size || params->q_size > params->p_size)
return -EINVAL;
/* Don't allocate memory. Set pointers to data within
* the given buffer
*/
params->key = (void *)ptr;
params->p = (void *)(ptr + params->key_size);
params->q = (void *)(ptr + params->key_size + params->p_size);
params->g = (void *)(ptr + params->key_size + params->p_size +
params->q_size);
params->g = (void *)(ptr + params->key_size + params->p_size);
return 0;
}
int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
{
int err;
err = __crypto_dh_decode_key(buf, len, params);
if (err)
return err;
/*
* Don't permit the buffer for 'key' or 'g' to be larger than 'p', since
* some drivers assume otherwise.
*/
if (params->key_size > params->p_size ||
params->g_size > params->p_size)
return -EINVAL;
/*
* Don't permit 'p' to be 0. It's not a prime number, and it's subject
@ -109,10 +115,6 @@ int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
if (memchr_inv(params->p, 0, params->p_size) == NULL)
return -EINVAL;
/* It is permissible to not provide Q. */
if (params->q_size == 0)
params->q = NULL;
return 0;
}
EXPORT_SYMBOL_GPL(crypto_dh_decode_key);
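
A minimal usage sketch of the reworked helpers (editorial; the example_ name is hypothetical): with 'q' and 'q_size' dropped, the encoded blob is now three size fields followed by key, p and g, and crypto_dh_key_len() sizes it accordingly.

#include <linux/slab.h>
#include <crypto/dh.h>

static int example_encode_dh(const struct dh *params, u8 **blob,
			     unsigned int *blob_len)
{
	unsigned int len = crypto_dh_key_len(params);
	u8 *buf = kmalloc(len, GFP_KERNEL);
	int err;

	if (!buf)
		return -ENOMEM;

	err = crypto_dh_encode_key(buf, len, params);	/* packs key | p | g */
	if (err) {
		kfree(buf);
		return err;
	}

	*blob = buf;
	*blob_len = len;
	return 0;
}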

View File

@ -15,6 +15,7 @@
#include <crypto/internal/hash.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
#include <linux/fips.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
@ -51,6 +52,9 @@ static int hmac_setkey(struct crypto_shash *parent,
SHASH_DESC_ON_STACK(shash, hash);
unsigned int i;
if (fips_enabled && (keylen < 112 / 8))
return -EINVAL;
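	/*
	 * Editorial note, not part of the patch: 112 / 8 == 14 bytes. The
	 * floor appears to track the NIST SP 800-131A minimum HMAC key
	 * length, which is what FIPS mode enforces here.
	 */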
shash->tfm = hash;
if (keylen > bs) {

View File

@ -68,9 +68,17 @@ static int crypto_kpp_init_tfm(struct crypto_tfm *tfm)
return 0;
}
static void crypto_kpp_free_instance(struct crypto_instance *inst)
{
struct kpp_instance *kpp = kpp_instance(inst);
kpp->free(kpp);
}
static const struct crypto_type crypto_kpp_type = {
.extsize = crypto_alg_extsize,
.init_tfm = crypto_kpp_init_tfm,
.free = crypto_kpp_free_instance,
#ifdef CONFIG_PROC_FS
.show = crypto_kpp_show,
#endif
@ -87,6 +95,15 @@ struct crypto_kpp *crypto_alloc_kpp(const char *alg_name, u32 type, u32 mask)
}
EXPORT_SYMBOL_GPL(crypto_alloc_kpp);
int crypto_grab_kpp(struct crypto_kpp_spawn *spawn,
struct crypto_instance *inst,
const char *name, u32 type, u32 mask)
{
spawn->base.frontend = &crypto_kpp_type;
return crypto_grab_spawn(&spawn->base, inst, name, type, mask);
}
EXPORT_SYMBOL_GPL(crypto_grab_kpp);
static void kpp_prepare_alg(struct kpp_alg *alg)
{
struct crypto_alg *base = &alg->base;
@ -111,5 +128,17 @@ void crypto_unregister_kpp(struct kpp_alg *alg)
}
EXPORT_SYMBOL_GPL(crypto_unregister_kpp);
int kpp_register_instance(struct crypto_template *tmpl,
struct kpp_instance *inst)
{
if (WARN_ON(!inst->free))
return -EINVAL;
kpp_prepare_alg(&inst->alg);
return crypto_register_instance(tmpl, kpp_crypto_instance(inst));
}
EXPORT_SYMBOL_GPL(kpp_register_instance);
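
To show how the new pieces compose (an editorial skeleton with hypothetical example_ names; the instance-context plumbing is elided in comments): a KPP template grabs its inner algorithm with crypto_grab_kpp() and must set ->free before kpp_register_instance() will accept the instance.

#include <linux/slab.h>
#include <crypto/internal/kpp.h>

static void example_kpp_free(struct kpp_instance *inst)
{
	/* A real template would drop its grabbed spawn here first. */
	kfree(inst);
}

static int example_kpp_create(struct crypto_template *tmpl, struct rtattr **tb)
{
	struct kpp_instance *inst;
	int err;

	inst = kzalloc(sizeof(*inst) /* + context size */, GFP_KERNEL);
	if (!inst)
		return -ENOMEM;

	/*
	 * A real template grabs its inner algorithm here, e.g.
	 * crypto_grab_kpp(spawn, kpp_crypto_instance(inst), "dh", 0, 0),
	 * and fills in inst->alg from the spawned algorithm.
	 */

	inst->free = example_kpp_free;	/* kpp_register_instance() WARNs without it */

	err = kpp_register_instance(tmpl, inst);
	if (err)
		example_kpp_free(inst);
	return err;
}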
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Key-agreement Protocol Primitives");

View File

@ -428,3 +428,4 @@ module_exit(lrw_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("LRW block cipher mode");
MODULE_ALIAS_CRYPTO("lrw");
MODULE_SOFTDEP("pre: ecb");

View File

@ -60,6 +60,7 @@
*/
#include <crypto/algapi.h>
#include <asm/unaligned.h>
#ifndef __HAVE_ARCH_CRYPTO_MEMNEQ
@ -71,7 +72,8 @@ __crypto_memneq_generic(const void *a, const void *b, size_t size)
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
while (size >= sizeof(unsigned long)) {
neq |= *(unsigned long *)a ^ *(unsigned long *)b;
neq |= get_unaligned((unsigned long *)a) ^
get_unaligned((unsigned long *)b);
OPTIMIZER_HIDE_VAR(neq);
a += sizeof(unsigned long);
b += sizeof(unsigned long);
@ -95,18 +97,24 @@ static inline unsigned long __crypto_memneq_16(const void *a, const void *b)
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
if (sizeof(unsigned long) == 8) {
neq |= *(unsigned long *)(a) ^ *(unsigned long *)(b);
neq |= get_unaligned((unsigned long *)a) ^
get_unaligned((unsigned long *)b);
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned long *)(a+8) ^ *(unsigned long *)(b+8);
neq |= get_unaligned((unsigned long *)(a + 8)) ^
get_unaligned((unsigned long *)(b + 8));
OPTIMIZER_HIDE_VAR(neq);
} else if (sizeof(unsigned int) == 4) {
neq |= *(unsigned int *)(a) ^ *(unsigned int *)(b);
neq |= get_unaligned((unsigned int *)a) ^
get_unaligned((unsigned int *)b);
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned int *)(a+4) ^ *(unsigned int *)(b+4);
neq |= get_unaligned((unsigned int *)(a + 4)) ^
get_unaligned((unsigned int *)(b + 4));
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned int *)(a+8) ^ *(unsigned int *)(b+8);
neq |= get_unaligned((unsigned int *)(a + 8)) ^
get_unaligned((unsigned int *)(b + 8));
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned int *)(a+12) ^ *(unsigned int *)(b+12);
neq |= get_unaligned((unsigned int *)(a + 12)) ^
get_unaligned((unsigned int *)(b + 12));
OPTIMIZER_HIDE_VAR(neq);
} else
#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
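
For context (editorial sketch; the example_ name is hypothetical): these helpers back crypto_memneq(), whose point is that the comparison time does not depend on where the first mismatch sits. The switch to get_unaligned() keeps the word-sized fast path well-defined when callers pass buffers without natural alignment.

#include <linux/types.h>
#include <crypto/algapi.h>

static bool example_mac_matches(const u8 *calc, const u8 *recv, size_t len)
{
	/* Zero means equal; runtime is independent of the byte values. */
	return crypto_memneq(calc, recv, len) == 0;
}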

View File

@ -385,15 +385,15 @@ static int pkcs1pad_sign(struct akcipher_request *req)
struct pkcs1pad_inst_ctx *ictx = akcipher_instance_ctx(inst);
const struct rsa_asn1_template *digest_info = ictx->digest_info;
int err;
unsigned int ps_end, digest_size = 0;
unsigned int ps_end, digest_info_size = 0;
if (!ctx->key_size)
return -EINVAL;
if (digest_info)
digest_size = digest_info->size;
digest_info_size = digest_info->size;
if (req->src_len + digest_size > ctx->key_size - 11)
if (req->src_len + digest_info_size > ctx->key_size - 11)
return -EOVERFLOW;
if (req->dst_len < ctx->key_size) {
@ -406,7 +406,7 @@ static int pkcs1pad_sign(struct akcipher_request *req)
if (!req_ctx->in_buf)
return -ENOMEM;
ps_end = ctx->key_size - digest_size - req->src_len - 2;
ps_end = ctx->key_size - digest_info_size - req->src_len - 2;
req_ctx->in_buf[0] = 0x01;
memset(req_ctx->in_buf + 1, 0xff, ps_end - 1);
req_ctx->in_buf[ps_end] = 0x00;
@ -441,6 +441,8 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)
struct akcipher_instance *inst = akcipher_alg_instance(tfm);
struct pkcs1pad_inst_ctx *ictx = akcipher_instance_ctx(inst);
const struct rsa_asn1_template *digest_info = ictx->digest_info;
const unsigned int sig_size = req->src_len;
const unsigned int digest_size = req->dst_len;
unsigned int dst_len;
unsigned int pos;
u8 *out_buf;
@ -476,6 +478,8 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)
pos++;
if (digest_info) {
if (digest_info->size > dst_len - pos)
goto done;
if (crypto_memneq(out_buf + pos, digest_info->data,
digest_info->size))
goto done;
@ -485,20 +489,19 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)
err = 0;
if (req->dst_len != dst_len - pos) {
if (digest_size != dst_len - pos) {
err = -EKEYREJECTED;
req->dst_len = dst_len - pos;
goto done;
}
/* Extract appended digest. */
sg_pcopy_to_buffer(req->src,
sg_nents_for_len(req->src,
req->src_len + req->dst_len),
sg_nents_for_len(req->src, sig_size + digest_size),
req_ctx->out_buf + ctx->key_size,
req->dst_len, ctx->key_size);
digest_size, sig_size);
/* Do the actual verification step. */
if (memcmp(req_ctx->out_buf + ctx->key_size, out_buf + pos,
req->dst_len) != 0)
digest_size) != 0)
err = -EKEYREJECTED;
done:
kfree_sensitive(req_ctx->out_buf);
@ -534,14 +537,15 @@ static int pkcs1pad_verify(struct akcipher_request *req)
struct crypto_akcipher *tfm = crypto_akcipher_reqtfm(req);
struct pkcs1pad_ctx *ctx = akcipher_tfm_ctx(tfm);
struct pkcs1pad_request *req_ctx = akcipher_request_ctx(req);
const unsigned int sig_size = req->src_len;
const unsigned int digest_size = req->dst_len;
int err;
if (WARN_ON(req->dst) ||
WARN_ON(!req->dst_len) ||
!ctx->key_size || req->src_len < ctx->key_size)
if (WARN_ON(req->dst) || WARN_ON(!digest_size) ||
!ctx->key_size || sig_size != ctx->key_size)
return -EINVAL;
req_ctx->out_buf = kmalloc(ctx->key_size + req->dst_len, GFP_KERNEL);
req_ctx->out_buf = kmalloc(ctx->key_size + digest_size, GFP_KERNEL);
if (!req_ctx->out_buf)
return -ENOMEM;
@ -554,8 +558,7 @@ static int pkcs1pad_verify(struct akcipher_request *req)
/* Reuse input buffer, output to a new buffer */
akcipher_request_set_crypt(&req_ctx->child_req, req->src,
req_ctx->out_sg, req->src_len,
ctx->key_size);
req_ctx->out_sg, sig_size, ctx->key_size);
err = crypto_akcipher_encrypt(&req_ctx->child_req);
if (err != -EINPROGRESS && err != -EBUSY)
@ -621,6 +624,11 @@ static int pkcs1pad_create(struct crypto_template *tmpl, struct rtattr **tb)
rsa_alg = crypto_spawn_akcipher_alg(&ctx->spawn);
if (strcmp(rsa_alg->base.cra_name, "rsa") != 0) {
err = -EINVAL;
goto err_free_inst;
}
err = -ENAMETOOLONG;
hash_name = crypto_attr_alg_name(tb[2]);
if (IS_ERR(hash_name)) {

View File

@ -1,4 +1,4 @@
/* SPDX-License-Identifier: GPL-2.0-or-later */
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* SM2 asymmetric public-key algorithm
* as specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012 SM2 and
@ -13,7 +13,7 @@
#include <crypto/internal/akcipher.h>
#include <crypto/akcipher.h>
#include <crypto/hash.h>
#include <crypto/sm3_base.h>
#include <crypto/sm3.h>
#include <crypto/rng.h>
#include <crypto/sm2.h>
#include "sm2signature.asn1.h"
@ -213,7 +213,7 @@ int sm2_get_signature_s(void *context, size_t hdrlen, unsigned char tag,
return 0;
}
static int sm2_z_digest_update(struct shash_desc *desc,
static int sm2_z_digest_update(struct sm3_state *sctx,
MPI m, unsigned int pbytes)
{
static const unsigned char zero[32];
@ -226,20 +226,20 @@ static int sm2_z_digest_update(struct shash_desc *desc,
if (inlen < pbytes) {
/* padding with zero */
crypto_sm3_update(desc, zero, pbytes - inlen);
crypto_sm3_update(desc, in, inlen);
sm3_update(sctx, zero, pbytes - inlen);
sm3_update(sctx, in, inlen);
} else if (inlen > pbytes) {
/* skip the starting zero */
crypto_sm3_update(desc, in + inlen - pbytes, pbytes);
sm3_update(sctx, in + inlen - pbytes, pbytes);
} else {
crypto_sm3_update(desc, in, inlen);
sm3_update(sctx, in, inlen);
}
kfree(in);
return 0;
}
static int sm2_z_digest_update_point(struct shash_desc *desc,
static int sm2_z_digest_update_point(struct sm3_state *sctx,
MPI_POINT point, struct mpi_ec_ctx *ec, unsigned int pbytes)
{
MPI x, y;
@ -249,8 +249,8 @@ static int sm2_z_digest_update_point(struct shash_desc *desc,
y = mpi_new(0);
if (!mpi_ec_get_affine(x, y, point, ec) &&
!sm2_z_digest_update(desc, x, pbytes) &&
!sm2_z_digest_update(desc, y, pbytes))
!sm2_z_digest_update(sctx, x, pbytes) &&
!sm2_z_digest_update(sctx, y, pbytes))
ret = 0;
mpi_free(x);
@ -265,7 +265,7 @@ int sm2_compute_z_digest(struct crypto_akcipher *tfm,
struct mpi_ec_ctx *ec = akcipher_tfm_ctx(tfm);
uint16_t bits_len;
unsigned char entl[2];
SHASH_DESC_ON_STACK(desc, NULL);
struct sm3_state sctx;
unsigned int pbytes;
if (id_len > (USHRT_MAX / 8) || !ec->Q)
@ -278,17 +278,17 @@ int sm2_compute_z_digest(struct crypto_akcipher *tfm,
pbytes = MPI_NBYTES(ec->p);
/* ZA = H256(ENTLA | IDA | a | b | xG | yG | xA | yA) */
sm3_base_init(desc);
crypto_sm3_update(desc, entl, 2);
crypto_sm3_update(desc, id, id_len);
sm3_init(&sctx);
sm3_update(&sctx, entl, 2);
sm3_update(&sctx, id, id_len);
if (sm2_z_digest_update(desc, ec->a, pbytes) ||
sm2_z_digest_update(desc, ec->b, pbytes) ||
sm2_z_digest_update_point(desc, ec->G, ec, pbytes) ||
sm2_z_digest_update_point(desc, ec->Q, ec, pbytes))
if (sm2_z_digest_update(&sctx, ec->a, pbytes) ||
sm2_z_digest_update(&sctx, ec->b, pbytes) ||
sm2_z_digest_update_point(&sctx, ec->G, ec, pbytes) ||
sm2_z_digest_update_point(&sctx, ec->Q, ec, pbytes))
return -EINVAL;
crypto_sm3_final(desc, dgst);
sm3_final(&sctx, dgst);
return 0;
}
EXPORT_SYMBOL(sm2_compute_z_digest);
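
A condensed sketch of the library interface this file now consumes (editorial; the helper name is hypothetical): <crypto/sm3.h> exposes sm3_init/sm3_update/sm3_final on a bare struct sm3_state, so no shash descriptor or tfm is needed.

#include <crypto/sm3.h>

static void example_sm3_digest(const u8 *data, unsigned int len,
			       u8 out[SM3_DIGEST_SIZE])
{
	struct sm3_state sctx;

	sm3_init(&sctx);
	sm3_update(&sctx, data, len);
	sm3_final(&sctx, out);
}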

View File

@ -5,6 +5,7 @@
*
* Copyright (C) 2017 ARM Limited or its affiliates.
* Written by Gilad Ben-Yossef <gilad@benyossef.com>
* Copyright (C) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#include <crypto/internal/hash.h>
@ -26,143 +27,29 @@ const u8 sm3_zero_message_hash[SM3_DIGEST_SIZE] = {
};
EXPORT_SYMBOL_GPL(sm3_zero_message_hash);
static inline u32 p0(u32 x)
{
return x ^ rol32(x, 9) ^ rol32(x, 17);
}
static inline u32 p1(u32 x)
{
return x ^ rol32(x, 15) ^ rol32(x, 23);
}
static inline u32 ff(unsigned int n, u32 a, u32 b, u32 c)
{
return (n < 16) ? (a ^ b ^ c) : ((a & b) | (a & c) | (b & c));
}
static inline u32 gg(unsigned int n, u32 e, u32 f, u32 g)
{
return (n < 16) ? (e ^ f ^ g) : ((e & f) | ((~e) & g));
}
static inline u32 t(unsigned int n)
{
return (n < 16) ? SM3_T1 : SM3_T2;
}
static void sm3_expand(u32 *t, u32 *w, u32 *wt)
{
int i;
unsigned int tmp;
/* load the input */
for (i = 0; i <= 15; i++)
w[i] = get_unaligned_be32((__u32 *)t + i);
for (i = 16; i <= 67; i++) {
tmp = w[i - 16] ^ w[i - 9] ^ rol32(w[i - 3], 15);
w[i] = p1(tmp) ^ (rol32(w[i - 13], 7)) ^ w[i - 6];
}
for (i = 0; i <= 63; i++)
wt[i] = w[i] ^ w[i + 4];
}
static void sm3_compress(u32 *w, u32 *wt, u32 *m)
{
u32 ss1;
u32 ss2;
u32 tt1;
u32 tt2;
u32 a, b, c, d, e, f, g, h;
int i;
a = m[0];
b = m[1];
c = m[2];
d = m[3];
e = m[4];
f = m[5];
g = m[6];
h = m[7];
for (i = 0; i <= 63; i++) {
ss1 = rol32((rol32(a, 12) + e + rol32(t(i), i & 31)), 7);
ss2 = ss1 ^ rol32(a, 12);
tt1 = ff(i, a, b, c) + d + ss2 + *wt;
wt++;
tt2 = gg(i, e, f, g) + h + ss1 + *w;
w++;
d = c;
c = rol32(b, 9);
b = a;
a = tt1;
h = g;
g = rol32(f, 19);
f = e;
e = p0(tt2);
}
m[0] = a ^ m[0];
m[1] = b ^ m[1];
m[2] = c ^ m[2];
m[3] = d ^ m[3];
m[4] = e ^ m[4];
m[5] = f ^ m[5];
m[6] = g ^ m[6];
m[7] = h ^ m[7];
a = b = c = d = e = f = g = h = ss1 = ss2 = tt1 = tt2 = 0;
}
static void sm3_transform(struct sm3_state *sst, u8 const *src)
{
unsigned int w[68];
unsigned int wt[64];
sm3_expand((u32 *)src, w, wt);
sm3_compress(w, wt, sst->state);
memzero_explicit(w, sizeof(w));
memzero_explicit(wt, sizeof(wt));
}
static void sm3_generic_block_fn(struct sm3_state *sst, u8 const *src,
int blocks)
{
while (blocks--) {
sm3_transform(sst, src);
src += SM3_BLOCK_SIZE;
}
}
int crypto_sm3_update(struct shash_desc *desc, const u8 *data,
static int crypto_sm3_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
return sm3_base_do_update(desc, data, len, sm3_generic_block_fn);
sm3_update(shash_desc_ctx(desc), data, len);
return 0;
}
EXPORT_SYMBOL(crypto_sm3_update);
int crypto_sm3_final(struct shash_desc *desc, u8 *out)
static int crypto_sm3_final(struct shash_desc *desc, u8 *out)
{
sm3_base_do_finalize(desc, sm3_generic_block_fn);
return sm3_base_finish(desc, out);
sm3_final(shash_desc_ctx(desc), out);
return 0;
}
EXPORT_SYMBOL(crypto_sm3_final);
int crypto_sm3_finup(struct shash_desc *desc, const u8 *data,
static int crypto_sm3_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *hash)
{
sm3_base_do_update(desc, data, len, sm3_generic_block_fn);
return crypto_sm3_final(desc, hash);
struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, hash);
return 0;
}
EXPORT_SYMBOL(crypto_sm3_finup);
static struct shash_alg sm3_alg = {
.digestsize = SM3_DIGEST_SIZE,
@ -174,6 +61,7 @@ static struct shash_alg sm3_alg = {
.base = {
.cra_name = "sm3",
.cra_driver_name = "sm3-generic",
.cra_priority = 100,
.cra_blocksize = SM3_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}

View File

@ -724,200 +724,6 @@ static inline int do_one_ahash_op(struct ahash_request *req, int ret)
return crypto_wait_req(ret, wait);
}
struct test_mb_ahash_data {
struct scatterlist sg[XBUFSIZE];
char result[64];
struct ahash_request *req;
struct crypto_wait wait;
char *xbuf[XBUFSIZE];
};
static inline int do_mult_ahash_op(struct test_mb_ahash_data *data, u32 num_mb,
int *rc)
{
int i, err = 0;
/* Fire up a bunch of concurrent requests */
for (i = 0; i < num_mb; i++)
rc[i] = crypto_ahash_digest(data[i].req);
/* Wait for all requests to finish */
for (i = 0; i < num_mb; i++) {
rc[i] = crypto_wait_req(rc[i], &data[i].wait);
if (rc[i]) {
pr_info("concurrent request %d error %d\n", i, rc[i]);
err = rc[i];
}
}
return err;
}
static int test_mb_ahash_jiffies(struct test_mb_ahash_data *data, int blen,
int secs, u32 num_mb)
{
unsigned long start, end;
int bcount;
int ret = 0;
int *rc;
rc = kcalloc(num_mb, sizeof(*rc), GFP_KERNEL);
if (!rc)
return -ENOMEM;
for (start = jiffies, end = start + secs * HZ, bcount = 0;
time_before(jiffies, end); bcount++) {
ret = do_mult_ahash_op(data, num_mb, rc);
if (ret)
goto out;
}
pr_cont("%d operations in %d seconds (%llu bytes)\n",
bcount * num_mb, secs, (u64)bcount * blen * num_mb);
out:
kfree(rc);
return ret;
}
static int test_mb_ahash_cycles(struct test_mb_ahash_data *data, int blen,
u32 num_mb)
{
unsigned long cycles = 0;
int ret = 0;
int i;
int *rc;
rc = kcalloc(num_mb, sizeof(*rc), GFP_KERNEL);
if (!rc)
return -ENOMEM;
/* Warm-up run. */
for (i = 0; i < 4; i++) {
ret = do_mult_ahash_op(data, num_mb, rc);
if (ret)
goto out;
}
/* The real thing. */
for (i = 0; i < 8; i++) {
cycles_t start, end;
start = get_cycles();
ret = do_mult_ahash_op(data, num_mb, rc);
end = get_cycles();
if (ret)
goto out;
cycles += end - start;
}
pr_cont("1 operation in %lu cycles (%d bytes)\n",
(cycles + 4) / (8 * num_mb), blen);
out:
kfree(rc);
return ret;
}
static void test_mb_ahash_speed(const char *algo, unsigned int secs,
struct hash_speed *speed, u32 num_mb)
{
struct test_mb_ahash_data *data;
struct crypto_ahash *tfm;
unsigned int i, j, k;
int ret;
data = kcalloc(num_mb, sizeof(*data), GFP_KERNEL);
if (!data)
return;
tfm = crypto_alloc_ahash(algo, 0, 0);
if (IS_ERR(tfm)) {
pr_err("failed to load transform for %s: %ld\n",
algo, PTR_ERR(tfm));
goto free_data;
}
for (i = 0; i < num_mb; ++i) {
if (testmgr_alloc_buf(data[i].xbuf))
goto out;
crypto_init_wait(&data[i].wait);
data[i].req = ahash_request_alloc(tfm, GFP_KERNEL);
if (!data[i].req) {
pr_err("alg: hash: Failed to allocate request for %s\n",
algo);
goto out;
}
ahash_request_set_callback(data[i].req, 0, crypto_req_done,
&data[i].wait);
sg_init_table(data[i].sg, XBUFSIZE);
for (j = 0; j < XBUFSIZE; j++) {
sg_set_buf(data[i].sg + j, data[i].xbuf[j], PAGE_SIZE);
memset(data[i].xbuf[j], 0xff, PAGE_SIZE);
}
}
pr_info("\ntesting speed of multibuffer %s (%s)\n", algo,
get_driver_name(crypto_ahash, tfm));
for (i = 0; speed[i].blen != 0; i++) {
/* For some reason this only tests digests. */
if (speed[i].blen != speed[i].plen)
continue;
if (speed[i].blen > XBUFSIZE * PAGE_SIZE) {
pr_err("template (%u) too big for tvmem (%lu)\n",
speed[i].blen, XBUFSIZE * PAGE_SIZE);
goto out;
}
if (klen)
crypto_ahash_setkey(tfm, tvmem[0], klen);
for (k = 0; k < num_mb; k++)
ahash_request_set_crypt(data[k].req, data[k].sg,
data[k].result, speed[i].blen);
pr_info("test%3u "
"(%5u byte blocks,%5u bytes per update,%4u updates): ",
i, speed[i].blen, speed[i].plen,
speed[i].blen / speed[i].plen);
if (secs) {
ret = test_mb_ahash_jiffies(data, speed[i].blen, secs,
num_mb);
cond_resched();
} else {
ret = test_mb_ahash_cycles(data, speed[i].blen, num_mb);
}
if (ret) {
pr_err("At least one hashing failed ret=%d\n", ret);
break;
}
}
out:
for (k = 0; k < num_mb; ++k)
ahash_request_free(data[k].req);
for (k = 0; k < num_mb; ++k)
testmgr_free_buf(data[k].xbuf);
crypto_free_ahash(tfm);
free_data:
kfree(data);
}
static int test_ahash_jiffies_digest(struct ahash_request *req, int blen,
char *out, int secs)
{
@ -1667,8 +1473,8 @@ static inline int tcrypt_test(const char *alg)
pr_debug("testing %s\n", alg);
ret = alg_test(alg, alg, 0, 0);
/* non-fips algs return -EINVAL in fips mode */
if (fips_enabled && ret == -EINVAL)
/* non-fips algs return -EINVAL or -ECANCELED in fips mode */
if (fips_enabled && (ret == -EINVAL || ret == -ECANCELED))
ret = 0;
return ret;
}
@ -2571,33 +2377,7 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
if (mode > 400 && mode < 500) break;
fallthrough;
case 422:
test_mb_ahash_speed("sha1", sec, generic_hash_speed_template,
num_mb);
if (mode > 400 && mode < 500) break;
fallthrough;
case 423:
test_mb_ahash_speed("sha256", sec, generic_hash_speed_template,
num_mb);
if (mode > 400 && mode < 500) break;
fallthrough;
case 424:
test_mb_ahash_speed("sha512", sec, generic_hash_speed_template,
num_mb);
if (mode > 400 && mode < 500) break;
fallthrough;
case 425:
test_mb_ahash_speed("sm3", sec, generic_hash_speed_template,
num_mb);
if (mode > 400 && mode < 500) break;
fallthrough;
case 426:
test_mb_ahash_speed("streebog256", sec,
generic_hash_speed_template, num_mb);
if (mode > 400 && mode < 500) break;
fallthrough;
case 427:
test_mb_ahash_speed("streebog512", sec,
generic_hash_speed_template, num_mb);
test_ahash_speed("sm3", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 499:

View File

@ -55,9 +55,6 @@ MODULE_PARM_DESC(noextratests, "disable expensive crypto self-tests");
static unsigned int fuzz_iterations = 100;
module_param(fuzz_iterations, uint, 0644);
MODULE_PARM_DESC(fuzz_iterations, "number of fuzz test iterations");
DEFINE_PER_CPU(bool, crypto_simd_disabled_for_test);
EXPORT_PER_CPU_SYMBOL_GPL(crypto_simd_disabled_for_test);
#endif
#ifdef CONFIG_CRYPTO_MANAGER_DISABLE_TESTS
@ -1854,6 +1851,9 @@ static int __alg_test_hash(const struct hash_testvec *vecs,
}
for (i = 0; i < num_vecs; i++) {
if (fips_enabled && vecs[i].fips_skip)
continue;
err = test_hash_vec(&vecs[i], i, req, desc, tsgl, hashstate);
if (err)
goto out;
@ -4650,7 +4650,6 @@ static const struct alg_test_desc alg_test_descs[] = {
}, {
.alg = "dh",
.test = alg_test_kpp,
.fips_allowed = 1,
.suite = {
.kpp = __VECS(dh_tv_template)
}
@ -4973,6 +4972,43 @@ static const struct alg_test_desc alg_test_descs[] = {
.cipher = __VECS(essiv_aes_cbc_tv_template)
}
}, {
#if IS_ENABLED(CONFIG_CRYPTO_DH_RFC7919_GROUPS)
.alg = "ffdhe2048(dh)",
.test = alg_test_kpp,
.fips_allowed = 1,
.suite = {
.kpp = __VECS(ffdhe2048_dh_tv_template)
}
}, {
.alg = "ffdhe3072(dh)",
.test = alg_test_kpp,
.fips_allowed = 1,
.suite = {
.kpp = __VECS(ffdhe3072_dh_tv_template)
}
}, {
.alg = "ffdhe4096(dh)",
.test = alg_test_kpp,
.fips_allowed = 1,
.suite = {
.kpp = __VECS(ffdhe4096_dh_tv_template)
}
}, {
.alg = "ffdhe6144(dh)",
.test = alg_test_kpp,
.fips_allowed = 1,
.suite = {
.kpp = __VECS(ffdhe6144_dh_tv_template)
}
}, {
.alg = "ffdhe8192(dh)",
.test = alg_test_kpp,
.fips_allowed = 1,
.suite = {
.kpp = __VECS(ffdhe8192_dh_tv_template)
}
}, {
#endif /* CONFIG_CRYPTO_DH_RFC7919_GROUPS */
.alg = "gcm(aes)",
.generic_driver = "gcm_base(ctr(aes-generic),ghash-generic)",
.test = alg_test_aead,
@ -5613,6 +5649,13 @@ static int alg_find_test(const char *alg)
return -1;
}
static int alg_fips_disabled(const char *driver, const char *alg)
{
pr_info("alg: %s (%s) is disabled due to FIPS\n", alg, driver);
return -ECANCELED;
}
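
/*
 * Editorial note: returning -ECANCELED instead of -EINVAL lets callers
 * tell "skipped because of FIPS policy" apart from a real failure;
 * tcrypt_test() earlier in this series now treats both as success when
 * fips_enabled is set.
 */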
int alg_test(const char *driver, const char *alg, u32 type, u32 mask)
{
int i;
@ -5649,9 +5692,13 @@ int alg_test(const char *driver, const char *alg, u32 type, u32 mask)
if (i < 0 && j < 0)
goto notest;
if (fips_enabled && ((i >= 0 && !alg_test_descs[i].fips_allowed) ||
(j >= 0 && !alg_test_descs[j].fips_allowed)))
if (fips_enabled) {
if (j >= 0 && !alg_test_descs[j].fips_allowed)
return -EINVAL;
if (i >= 0 && !alg_test_descs[i].fips_allowed)
goto non_fips_alg;
}
rc = 0;
if (i >= 0)
@ -5681,9 +5728,13 @@ test_done:
notest:
printk(KERN_INFO "alg: No test for %s (%s)\n", alg, driver);
if (type & CRYPTO_ALG_FIPS_INTERNAL)
return alg_fips_disabled(driver, alg);
return 0;
non_fips_alg:
return -EINVAL;
return alg_fips_disabled(driver, alg);
}
#endif /* CONFIG_CRYPTO_MANAGER_DISABLE_TESTS */

File diff suppressed because it is too large

View File

@ -466,3 +466,4 @@ MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("XTS block cipher mode");
MODULE_ALIAS_CRYPTO("xts");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);
MODULE_SOFTDEP("pre: ecb");

View File

@ -401,7 +401,7 @@ config HW_RANDOM_MESON
config HW_RANDOM_CAVIUM
tristate "Cavium ThunderX Random Number Generator support"
depends on HW_RANDOM && PCI && ARM64
depends on HW_RANDOM && PCI && ARCH_THUNDER
default HW_RANDOM
help
This driver provides kernel-side support for the Random Number

View File

@ -13,13 +13,16 @@
#include <linux/err.h>
#include <linux/clk.h>
#include <linux/io.h>
#include <linux/iopoll.h>
#include <linux/hw_random.h>
#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
#define TRNG_CR 0x00
#define TRNG_MR 0x04
#define TRNG_ISR 0x1c
#define TRNG_ISR_DATRDY BIT(0)
#define TRNG_ODATA 0x50
#define TRNG_KEY 0x524e4700 /* RNG */
@ -34,37 +37,79 @@ struct atmel_trng {
struct clk *clk;
void __iomem *base;
struct hwrng rng;
bool has_half_rate;
};
static bool atmel_trng_wait_ready(struct atmel_trng *trng, bool wait)
{
int ready;
ready = readl(trng->base + TRNG_ISR) & TRNG_ISR_DATRDY;
if (!ready && wait)
readl_poll_timeout(trng->base + TRNG_ISR, ready,
ready & TRNG_ISR_DATRDY, 1000, 20000);
return !!ready;
}
static int atmel_trng_read(struct hwrng *rng, void *buf, size_t max,
bool wait)
{
struct atmel_trng *trng = container_of(rng, struct atmel_trng, rng);
u32 *data = buf;
int ret;
ret = pm_runtime_get_sync((struct device *)trng->rng.priv);
if (ret < 0) {
pm_runtime_put_sync((struct device *)trng->rng.priv);
return ret;
}
ret = atmel_trng_wait_ready(trng, wait);
if (!ret)
goto out;
/* data ready? */
if (readl(trng->base + TRNG_ISR) & 1) {
*data = readl(trng->base + TRNG_ODATA);
/*
ensure data ready is only set again AFTER the next data
word is ready in case it got set between checking ISR
and reading ODATA, so we don't risk re-reading the
same word
* ensure data ready is only set again AFTER the next data word is ready
* in case it got set between checking ISR and reading ODATA, so we
* don't risk re-reading the same word
*/
readl(trng->base + TRNG_ISR);
return 4;
} else
ret = 4;
out:
pm_runtime_mark_last_busy((struct device *)trng->rng.priv);
pm_runtime_put_sync_autosuspend((struct device *)trng->rng.priv);
return ret;
}
static int atmel_trng_init(struct atmel_trng *trng)
{
unsigned long rate;
int ret;
ret = clk_prepare_enable(trng->clk);
if (ret)
return ret;
if (trng->has_half_rate) {
rate = clk_get_rate(trng->clk);
/* if peripheral clk is above 100MHz, set HALFR */
if (rate > 100000000)
writel(TRNG_HALFR, trng->base + TRNG_MR);
}
writel(TRNG_KEY | 1, trng->base + TRNG_CR);
return 0;
}
static void atmel_trng_enable(struct atmel_trng *trng)
{
writel(TRNG_KEY | 1, trng->base + TRNG_CR);
}
static void atmel_trng_disable(struct atmel_trng *trng)
static void atmel_trng_cleanup(struct atmel_trng *trng)
{
writel(TRNG_KEY, trng->base + TRNG_CR);
clk_disable_unprepare(trng->clk);
}
static int atmel_trng_probe(struct platform_device *pdev)
@ -88,32 +133,31 @@ static int atmel_trng_probe(struct platform_device *pdev)
if (!data)
return -ENODEV;
if (data->has_half_rate) {
unsigned long rate = clk_get_rate(trng->clk);
/* if peripheral clk is above 100MHz, set HALFR */
if (rate > 100000000)
writel(TRNG_HALFR, trng->base + TRNG_MR);
}
ret = clk_prepare_enable(trng->clk);
if (ret)
return ret;
atmel_trng_enable(trng);
trng->has_half_rate = data->has_half_rate;
trng->rng.name = pdev->name;
trng->rng.read = atmel_trng_read;
ret = devm_hwrng_register(&pdev->dev, &trng->rng);
if (ret)
goto err_register;
trng->rng.priv = (unsigned long)&pdev->dev;
platform_set_drvdata(pdev, trng);
return 0;
#ifndef CONFIG_PM
ret = atmel_trng_init(trng);
if (ret)
return ret;
#endif
pm_runtime_set_autosuspend_delay(&pdev->dev, 100);
pm_runtime_use_autosuspend(&pdev->dev);
pm_runtime_enable(&pdev->dev);
ret = devm_hwrng_register(&pdev->dev, &trng->rng);
if (ret) {
pm_runtime_disable(&pdev->dev);
pm_runtime_set_suspended(&pdev->dev);
#ifndef CONFIG_PM
atmel_trng_cleanup(trng);
#endif
}
err_register:
clk_disable_unprepare(trng->clk);
return ret;
}
@ -121,43 +165,35 @@ static int atmel_trng_remove(struct platform_device *pdev)
{
struct atmel_trng *trng = platform_get_drvdata(pdev);
atmel_trng_disable(trng);
clk_disable_unprepare(trng->clk);
atmel_trng_cleanup(trng);
pm_runtime_disable(&pdev->dev);
pm_runtime_set_suspended(&pdev->dev);
return 0;
}
#ifdef CONFIG_PM
static int atmel_trng_suspend(struct device *dev)
static int __maybe_unused atmel_trng_runtime_suspend(struct device *dev)
{
struct atmel_trng *trng = dev_get_drvdata(dev);
atmel_trng_disable(trng);
clk_disable_unprepare(trng->clk);
atmel_trng_cleanup(trng);
return 0;
}
static int atmel_trng_resume(struct device *dev)
static int __maybe_unused atmel_trng_runtime_resume(struct device *dev)
{
struct atmel_trng *trng = dev_get_drvdata(dev);
int ret;
ret = clk_prepare_enable(trng->clk);
if (ret)
return ret;
atmel_trng_enable(trng);
return 0;
return atmel_trng_init(trng);
}
static const struct dev_pm_ops atmel_trng_pm_ops = {
.suspend = atmel_trng_suspend,
.resume = atmel_trng_resume,
static const struct dev_pm_ops __maybe_unused atmel_trng_pm_ops = {
SET_RUNTIME_PM_OPS(atmel_trng_runtime_suspend,
atmel_trng_runtime_resume, NULL)
SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
pm_runtime_force_resume)
};
#endif /* CONFIG_PM */
static const struct atmel_trng_data at91sam9g45_config = {
.has_half_rate = false,
@ -185,9 +221,7 @@ static struct platform_driver atmel_trng_driver = {
.remove = atmel_trng_remove,
.driver = {
.name = "atmel-trng",
#ifdef CONFIG_PM
.pm = &atmel_trng_pm_ops,
#endif /* CONFIG_PM */
.pm = pm_ptr(&atmel_trng_pm_ops),
.of_match_table = atmel_trng_dt_ids,
},
};
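
/*
 * Editorial note: pm_ptr(&atmel_trng_pm_ops) evaluates to the ops when
 * CONFIG_PM is enabled and to NULL otherwise, which is what allowed the
 * surrounding #ifdef CONFIG_PM to be dropped; the __maybe_unused
 * annotations keep the compiler quiet in the !CONFIG_PM build.
 */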

View File

@ -179,7 +179,7 @@ static int cavium_map_pf_regs(struct cavium_rng *rng)
pdev = pci_get_device(PCI_VENDOR_ID_CAVIUM,
PCI_DEVID_CAVIUM_RNG_PF, NULL);
if (!pdev) {
dev_err(&pdev->dev, "Cannot find RNG PF device\n");
pr_err("Cannot find RNG PF device\n");
return -EIO;
}

View File

@ -32,7 +32,7 @@ static struct hwrng *current_rng;
/* the current rng has been explicitly chosen by user via sysfs */
static int cur_rng_set_by_user;
static struct task_struct *hwrng_fill;
/* list of registered rngs, sorted decending by quality */
/* list of registered rngs */
static LIST_HEAD(rng_list);
/* Protects rng_list and current_rng */
static DEFINE_MUTEX(rng_mutex);
@ -45,14 +45,14 @@ static unsigned short default_quality; /* = 0; default to "off" */
module_param(current_quality, ushort, 0644);
MODULE_PARM_DESC(current_quality,
"current hwrng entropy estimation per 1024 bits of input");
"current hwrng entropy estimation per 1024 bits of input -- obsolete, use rng_quality instead");
module_param(default_quality, ushort, 0644);
MODULE_PARM_DESC(default_quality,
"default entropy content of hwrng per 1024 bits of input");
static void drop_current_rng(void);
static int hwrng_init(struct hwrng *rng);
static void start_khwrngd(void);
static void hwrng_manage_rngd(struct hwrng *rng);
static inline int rng_get_data(struct hwrng *rng, u8 *buffer, size_t size,
int wait);
@ -65,13 +65,12 @@ static size_t rng_buffer_size(void)
static void add_early_randomness(struct hwrng *rng)
{
int bytes_read;
size_t size = min_t(size_t, 16, rng_buffer_size());
mutex_lock(&reading_mutex);
bytes_read = rng_get_data(rng, rng_buffer, size, 0);
bytes_read = rng_get_data(rng, rng_fillbuf, 32, 0);
mutex_unlock(&reading_mutex);
if (bytes_read > 0)
add_device_randomness(rng_buffer, bytes_read);
add_device_randomness(rng_fillbuf, bytes_read);
}
static inline void cleanup_rng(struct kref *kref)
@ -162,14 +161,13 @@ static int hwrng_init(struct hwrng *rng)
reinit_completion(&rng->cleanup_done);
skip_init:
current_quality = rng->quality ? : default_quality;
if (current_quality > 1024)
current_quality = 1024;
if (!rng->quality)
rng->quality = default_quality;
if (rng->quality > 1024)
rng->quality = 1024;
current_quality = rng->quality; /* obsolete */
if (current_quality == 0 && hwrng_fill)
kthread_stop(hwrng_fill);
if (current_quality > 0 && !hwrng_fill)
start_khwrngd();
hwrng_manage_rngd(rng);
return 0;
}
@ -299,23 +297,27 @@ static struct miscdevice rng_miscdev = {
static int enable_best_rng(void)
{
struct hwrng *rng, *new_rng = NULL;
int ret = -ENODEV;
BUG_ON(!mutex_is_locked(&rng_mutex));
/* rng_list is sorted by quality, use the best (=first) one */
if (!list_empty(&rng_list)) {
struct hwrng *new_rng;
/* no rng to use? */
if (list_empty(&rng_list)) {
drop_current_rng();
cur_rng_set_by_user = 0;
return 0;
}
/* use the rng which offers the best quality */
list_for_each_entry(rng, &rng_list, list) {
if (!new_rng || rng->quality > new_rng->quality)
new_rng = rng;
}
new_rng = list_entry(rng_list.next, struct hwrng, list);
ret = ((new_rng == current_rng) ? 0 : set_current_rng(new_rng));
if (!ret)
cur_rng_set_by_user = 0;
} else {
drop_current_rng();
cur_rng_set_by_user = 0;
ret = 0;
}
return ret;
}
@ -337,8 +339,9 @@ static ssize_t rng_current_store(struct device *dev,
} else {
list_for_each_entry(rng, &rng_list, list) {
if (sysfs_streq(rng->name, buf)) {
cur_rng_set_by_user = 1;
err = set_current_rng(rng);
if (!err)
cur_rng_set_by_user = 1;
break;
}
}
@ -400,14 +403,76 @@ static ssize_t rng_selected_show(struct device *dev,
return sysfs_emit(buf, "%d\n", cur_rng_set_by_user);
}
static ssize_t rng_quality_show(struct device *dev,
struct device_attribute *attr,
char *buf)
{
ssize_t ret;
struct hwrng *rng;
rng = get_current_rng();
if (IS_ERR(rng))
return PTR_ERR(rng);
if (!rng) /* no need to put_rng */
return -ENODEV;
ret = sysfs_emit(buf, "%hu\n", rng->quality);
put_rng(rng);
return ret;
}
static ssize_t rng_quality_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t len)
{
u16 quality;
int ret = -EINVAL;
if (len < 2)
return -EINVAL;
ret = mutex_lock_interruptible(&rng_mutex);
if (ret)
return -ERESTARTSYS;
ret = kstrtou16(buf, 0, &quality);
if (ret || quality > 1024) {
ret = -EINVAL;
goto out;
}
if (!current_rng) {
ret = -ENODEV;
goto out;
}
current_rng->quality = quality;
current_quality = quality; /* obsolete */
/* the best available RNG may have changed */
ret = enable_best_rng();
/* start/stop rngd if necessary */
if (current_rng)
hwrng_manage_rngd(current_rng);
out:
mutex_unlock(&rng_mutex);
return ret ? ret : len;
}
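
Editorial note: this adds a writable rng_quality attribute next to rng_current/rng_available/rng_selected, taking an estimate in the 0-1024 range (entropy per 1024 bits of input). Assuming the usual hw_random misc device path, it can be tuned at runtime, such as "echo 512 > /sys/class/misc/hw_random/rng_quality", and read back with "cat rng_quality"; the obsolete current_quality module parameter is kept in sync.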
static DEVICE_ATTR_RW(rng_current);
static DEVICE_ATTR_RO(rng_available);
static DEVICE_ATTR_RO(rng_selected);
static DEVICE_ATTR_RW(rng_quality);
static struct attribute *rng_dev_attrs[] = {
&dev_attr_rng_current.attr,
&dev_attr_rng_available.attr,
&dev_attr_rng_selected.attr,
&dev_attr_rng_quality.attr,
NULL
};
@ -425,9 +490,11 @@ static int __init register_miscdev(void)
static int hwrng_fillfn(void *unused)
{
size_t entropy, entropy_credit = 0; /* in 1/1024 of a bit */
long rc;
while (!kthread_should_stop()) {
unsigned short quality;
struct hwrng *rng;
rng = get_current_rng();
@ -436,35 +503,56 @@ static int hwrng_fillfn(void *unused)
mutex_lock(&reading_mutex);
rc = rng_get_data(rng, rng_fillbuf,
rng_buffer_size(), 1);
if (current_quality != rng->quality)
rng->quality = current_quality; /* obsolete */
quality = rng->quality;
mutex_unlock(&reading_mutex);
put_rng(rng);
if (!quality)
break;
if (rc <= 0) {
pr_warn("hwrng: no data available\n");
msleep_interruptible(10000);
continue;
}
/* If we cannot credit at least one bit of entropy,
* keep track of the remainder for the next iteration
*/
entropy = rc * quality * 8 + entropy_credit;
if ((entropy >> 10) == 0)
entropy_credit = entropy;
/* Outside lock, sure, but y'know: randomness. */
add_hwgenerator_randomness((void *)rng_fillbuf, rc,
rc * current_quality * 8 >> 10);
entropy >> 10);
}
hwrng_fill = NULL;
return 0;
}
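
A worked instance of the credit arithmetic above (editorial; the numbers are illustrative):

/*
 * Assume a 32-byte read (rc == 32) from an rng with quality 700:
 *
 *	entropy = 32 * 700 * 8 + 0 = 179200	(units of 1/1024 bit)
 *	entropy >> 10 = 175 bits credited via add_hwgenerator_randomness()
 *
 * With quality 3 the same read yields 768/1024 of a bit -- too little
 * to credit -- so the whole amount is carried in entropy_credit and
 * added to the next iteration's total.
 */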
static void start_khwrngd(void)
static void hwrng_manage_rngd(struct hwrng *rng)
{
if (WARN_ON(!mutex_is_locked(&rng_mutex)))
return;
if (rng->quality == 0 && hwrng_fill)
kthread_stop(hwrng_fill);
if (rng->quality > 0 && !hwrng_fill) {
hwrng_fill = kthread_run(hwrng_fillfn, NULL, "hwrng");
if (IS_ERR(hwrng_fill)) {
pr_err("hwrng_fill thread creation failed\n");
hwrng_fill = NULL;
}
}
}
int hwrng_register(struct hwrng *rng)
{
int err = -EINVAL;
struct hwrng *tmp;
struct list_head *rng_list_ptr;
bool is_new_current = false;
if (!rng->name || (!rng->data_read && !rng->read))
@ -478,18 +566,11 @@ int hwrng_register(struct hwrng *rng)
if (strcmp(tmp->name, rng->name) == 0)
goto out_unlock;
}
list_add_tail(&rng->list, &rng_list);
init_completion(&rng->cleanup_done);
complete(&rng->cleanup_done);
/* rng_list is sorted by decreasing quality */
list_for_each(rng_list_ptr, &rng_list) {
tmp = list_entry(rng_list_ptr, struct hwrng, list);
if (tmp->quality < rng->quality)
break;
}
list_add_tail(&rng->list, rng_list_ptr);
if (!current_rng ||
(!cur_rng_set_by_user && rng->quality > current_rng->quality)) {
/*
@ -639,7 +720,7 @@ static void __exit hwrng_modexit(void)
unregister_miscdev();
}
module_init(hwrng_modinit);
fs_initcall(hwrng_modinit); /* depends on misc_register() */
module_exit(hwrng_modexit);
MODULE_DESCRIPTION("H/W Random Number Generator (RNG) driver");

View File

@ -65,14 +65,14 @@ static int nmk_rng_probe(struct amba_device *dev, const struct amba_id *id)
out_release:
amba_release_regions(dev);
out_clk:
clk_disable(rng_clk);
clk_disable_unprepare(rng_clk);
return ret;
}
static void nmk_rng_remove(struct amba_device *dev)
{
amba_release_regions(dev);
clk_disable(rng_clk);
clk_disable_unprepare(rng_clk);
}
static const struct amba_id nmk_rng_ids[] = {

View File

@ -808,6 +808,16 @@ config CRYPTO_DEV_ZYNQMP_AES
accelerator. Select this if you want to use the ZynqMP module
for AES algorithms.
config CRYPTO_DEV_ZYNQMP_SHA3
tristate "Support for Xilinx ZynqMP SHA3 hardware accelerator"
depends on ZYNQMP_FIRMWARE || COMPILE_TEST
select CRYPTO_SHA3
help
Xilinx ZynqMP has a SHA3 engine used for secure hash calculation.
This driver interfaces with the SHA3 hardware engine.
Select this if you want to use the ZynqMP module
for SHA3 hash computation.
source "drivers/crypto/chelsio/Kconfig"
source "drivers/crypto/virtio/Kconfig"

View File

@ -47,7 +47,7 @@ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
obj-$(CONFIG_CRYPTO_DEV_BCM_SPU) += bcm/
obj-$(CONFIG_CRYPTO_DEV_SAFEXCEL) += inside-secure/
obj-$(CONFIG_CRYPTO_DEV_ARTPEC6) += axis/
obj-$(CONFIG_CRYPTO_DEV_ZYNQMP_AES) += xilinx/
obj-y += xilinx/
obj-y += hisilicon/
obj-$(CONFIG_CRYPTO_DEV_AMLOGIC_GXL) += amlogic/
obj-y += keembay/

View File

@ -11,6 +11,7 @@
* You could find a link for the datasheet in Documentation/arm/sunxi.rst
*/
#include <linux/bottom_half.h>
#include <linux/crypto.h>
#include <linux/dma-mapping.h>
#include <linux/io.h>
@ -283,7 +284,9 @@ static int sun8i_ce_cipher_run(struct crypto_engine *engine, void *areq)
flow = rctx->flow;
err = sun8i_ce_run_task(ce, flow, crypto_tfm_alg_name(breq->base.tfm));
local_bh_disable();
crypto_finalize_skcipher_request(engine, breq, err);
local_bh_enable();
return 0;
}

View File

@ -9,6 +9,7 @@
*
* You could find the datasheet in Documentation/arm/sunxi.rst
*/
#include <linux/bottom_half.h>
#include <linux/dma-mapping.h>
#include <linux/pm_runtime.h>
#include <linux/scatterlist.h>
@ -414,6 +415,8 @@ int sun8i_ce_hash_run(struct crypto_engine *engine, void *breq)
theend:
kfree(buf);
kfree(result);
local_bh_disable();
crypto_finalize_hash_request(engine, breq, err);
local_bh_enable();
return 0;
}

View File

@ -11,6 +11,7 @@
* You could find a link for the datasheet in Documentation/arm/sunxi.rst
*/
#include <linux/bottom_half.h>
#include <linux/crypto.h>
#include <linux/dma-mapping.h>
#include <linux/io.h>
@ -274,7 +275,9 @@ static int sun8i_ss_handle_cipher_request(struct crypto_engine *engine, void *ar
struct skcipher_request *breq = container_of(areq, struct skcipher_request, base);
err = sun8i_ss_cipher(breq);
local_bh_disable();
crypto_finalize_skcipher_request(engine, breq, err);
local_bh_enable();
return 0;
}

View File

@ -30,6 +30,8 @@
static const struct ss_variant ss_a80_variant = {
.alg_cipher = { SS_ALG_AES, SS_ALG_DES, SS_ALG_3DES,
},
.alg_hash = { SS_ID_NOTSUPP, SS_ID_NOTSUPP, SS_ID_NOTSUPP, SS_ID_NOTSUPP,
},
.op_mode = { SS_OP_ECB, SS_OP_CBC,
},
.ss_clks = {

View File

@ -9,6 +9,7 @@
*
* You could find the datasheet in Documentation/arm/sunxi.rst
*/
#include <linux/bottom_half.h>
#include <linux/dma-mapping.h>
#include <linux/pm_runtime.h>
#include <linux/scatterlist.h>
@ -442,6 +443,8 @@ int sun8i_ss_hash_run(struct crypto_engine *engine, void *breq)
theend:
kfree(pad);
kfree(result);
local_bh_disable();
crypto_finalize_hash_request(engine, breq, err);
local_bh_enable();
return 0;
}

View File

@ -265,7 +265,9 @@ static int meson_handle_cipher_request(struct crypto_engine *engine,
struct skcipher_request *breq = container_of(areq, struct skcipher_request, base);
err = meson_cipher(breq);
local_bh_disable();
crypto_finalize_skcipher_request(engine, breq, err);
local_bh_enable();
return 0;
}

View File

@ -2509,6 +2509,7 @@ static void atmel_aes_get_cap(struct atmel_aes_dev *dd)
/* keep only major version number */
switch (dd->hw_version & 0xff0) {
case 0x700:
case 0x500:
dd->caps.has_dualbuff = 1;
dd->caps.has_cfb64 = 1;

View File

@ -2508,6 +2508,7 @@ static void atmel_sha_get_cap(struct atmel_sha_dev *dd)
/* keep only major version number */
switch (dd->hw_version & 0xff0) {
case 0x700:
case 0x510:
dd->caps.has_dma = 1;
dd->caps.has_dualbuff = 1;

View File

@ -1130,6 +1130,7 @@ static void atmel_tdes_get_cap(struct atmel_tdes_dev *dd)
/* keep only major version number */
switch (dd->hw_version & 0xf00) {
case 0x800:
case 0x700:
dd->caps.has_dma = 1;
dd->caps.has_cfb_3keys = 1;

View File

@ -1,4 +1,5 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bitmap.h>
#include <linux/workqueue.h>
#include "nitrox_csr.h"
@ -120,6 +121,7 @@ static void pf2vf_resp_handler(struct work_struct *work)
void nitrox_pf2vf_mbox_handler(struct nitrox_device *ndev)
{
DECLARE_BITMAP(csr, BITS_PER_TYPE(u64));
struct nitrox_vfdev *vfdev;
struct pf2vf_work *pfwork;
u64 value, reg_addr;
@ -129,7 +131,8 @@ void nitrox_pf2vf_mbox_handler(struct nitrox_device *ndev)
/* loop for VF(0..63) */
reg_addr = NPS_PKT_MBOX_INT_LO;
value = nitrox_read_csr(ndev, reg_addr);
for_each_set_bit(i, (const unsigned long *)&value, BITS_PER_LONG) {
bitmap_from_u64(csr, value);
for_each_set_bit(i, csr, BITS_PER_TYPE(csr)) {
/* get the vfno from ring */
vfno = RING_TO_VFNO(i, ndev->iov.max_vf_queues);
vfdev = ndev->iov.vfdev + vfno;
@ -151,7 +154,8 @@ void nitrox_pf2vf_mbox_handler(struct nitrox_device *ndev)
/* loop for VF(64..127) */
reg_addr = NPS_PKT_MBOX_INT_HI;
value = nitrox_read_csr(ndev, reg_addr);
for_each_set_bit(i, (const unsigned long *)&value, BITS_PER_LONG) {
bitmap_from_u64(csr, value);
for_each_set_bit(i, csr, BITS_PER_TYPE(csr)) {
/* get the vfno from ring */
vfno = RING_TO_VFNO(i + 64, ndev->iov.max_vf_queues);
vfdev = ndev->iov.vfdev + vfno;
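
Why the cast had to go (editorial sketch; the helper name is hypothetical): reading a u64 through an (unsigned long *) is only bit-correct on 64-bit little-endian; on a 32-bit big-endian machine the first word holds bits 32-63, so for_each_set_bit() walks the wrong bits. bitmap_from_u64() performs the word-order-aware copy.

#include <linux/bitmap.h>
#include <linux/bitops.h>

static void example_walk_u64_bits(u64 value)
{
	DECLARE_BITMAP(csr, BITS_PER_TYPE(u64));
	unsigned int i;

	bitmap_from_u64(csr, value);
	for_each_set_bit(i, csr, BITS_PER_TYPE(u64))
		pr_info("mailbox bit %u is set\n", i);
}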

View File

@ -440,7 +440,7 @@ struct aqmq_command_s {
/**
* struct ctx_hdr - Book keeping data about the crypto context
* @pool: Pool used to allocate crypto context
* @dma: Base DMA address of the cypto context
* @dma: Base DMA address of the crypto context
* @ctx_dma: Actual usable crypto context for NITROX
*/
struct ctx_hdr {

View File

@ -55,6 +55,11 @@ static const struct pci_device_id zip_id_table[] = {
{ 0, }
};
static void zip_debugfs_init(void);
static void zip_debugfs_exit(void);
static int zip_register_compression_device(void);
static void zip_unregister_compression_device(void);
void zip_reg_write(u64 val, u64 __iomem *addr)
{
writeq(val, addr);
@ -235,6 +240,15 @@ static int zip_init_hw(struct zip_device *zip)
return 0;
}
static void zip_reset(struct zip_device *zip)
{
union zip_cmd_ctl cmd_ctl;
cmd_ctl.u_reg64 = 0x0ull;
cmd_ctl.s.reset = 1; /* Forces ZIP cores to do reset */
zip_reg_write(cmd_ctl.u_reg64, (zip->reg_base + ZIP_CMD_CTL));
}
static int zip_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
struct device *dev = &pdev->dev;
@ -282,8 +296,21 @@ static int zip_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
if (err)
goto err_release_regions;
/* Register with the Kernel Crypto Interface */
err = zip_register_compression_device();
if (err < 0) {
zip_err("ZIP: Kernel Crypto Registration failed\n");
goto err_register;
}
/* comp-decomp statistics are handled with debugfs interface */
zip_debugfs_init();
return 0;
err_register:
zip_reset(zip);
err_release_regions:
if (zip->reg_base)
iounmap(zip->reg_base);
@ -305,16 +332,17 @@ err_free_device:
static void zip_remove(struct pci_dev *pdev)
{
struct zip_device *zip = pci_get_drvdata(pdev);
union zip_cmd_ctl cmd_ctl;
int q = 0;
if (!zip)
return;
zip_debugfs_exit();
zip_unregister_compression_device();
if (zip->reg_base) {
cmd_ctl.u_reg64 = 0x0ull;
cmd_ctl.s.reset = 1; /* Forces ZIP cores to do reset */
zip_reg_write(cmd_ctl.u_reg64, (zip->reg_base + ZIP_CMD_CTL));
zip_reset(zip);
iounmap(zip->reg_base);
}
@ -585,7 +613,7 @@ DEFINE_SHOW_ATTRIBUTE(zip_regs);
/* Root directory for thunderx_zip debugfs entry */
static struct dentry *zip_debugfs_root;
static void __init zip_debugfs_init(void)
static void zip_debugfs_init(void)
{
if (!debugfs_initialized())
return;
@ -604,7 +632,7 @@ static void __init zip_debugfs_init(void)
}
static void __exit zip_debugfs_exit(void)
static void zip_debugfs_exit(void)
{
debugfs_remove_recursive(zip_debugfs_root);
}
@ -615,48 +643,7 @@ static void __exit zip_debugfs_exit(void) { }
#endif
/* debugfs - end */
static int __init zip_init_module(void)
{
int ret;
zip_msg("%s\n", DRV_NAME);
ret = pci_register_driver(&zip_driver);
if (ret < 0) {
zip_err("ZIP: pci_register_driver() failed\n");
return ret;
}
/* Register with the Kernel Crypto Interface */
ret = zip_register_compression_device();
if (ret < 0) {
zip_err("ZIP: Kernel Crypto Registration failed\n");
goto err_pci_unregister;
}
/* comp-decomp statistics are handled with debugfs interface */
zip_debugfs_init();
return ret;
err_pci_unregister:
pci_unregister_driver(&zip_driver);
return ret;
}
static void __exit zip_cleanup_module(void)
{
zip_debugfs_exit();
/* Unregister from the kernel crypto interface */
zip_unregister_compression_device();
/* Unregister this driver for pci zip devices */
pci_unregister_driver(&zip_driver);
}
module_init(zip_init_module);
module_exit(zip_cleanup_module);
module_pci_driver(zip_driver);
MODULE_AUTHOR("Cavium Inc");
MODULE_DESCRIPTION("Cavium Inc ThunderX ZIP Driver");
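With crypto registration and the debugfs setup moved into zip_probe()/zip_remove(), the hand-rolled init/exit pair did nothing beyond registering the PCI driver, so module_pci_driver() can generate it. Roughly what the macro expands to, as a sketch (the exact expansion lives in the module_driver() machinery):

/* module_pci_driver(zip_driver) expands approximately to: */
static int __init zip_driver_init(void)
{
        return pci_register_driver(&zip_driver);
}
module_init(zip_driver_init);

static void __exit zip_driver_exit(void)
{
        pci_unregister_driver(&zip_driver);
}
module_exit(zip_driver_exit);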


@ -69,7 +69,6 @@ static int ccp_aes_crypt(struct skcipher_request *req, bool encrypt)
struct ccp_aes_req_ctx *rctx = skcipher_request_ctx(req);
struct scatterlist *iv_sg = NULL;
unsigned int iv_len = 0;
int ret;
if (!ctx->u.aes.key_len)
return -EINVAL;
@ -104,9 +103,7 @@ static int ccp_aes_crypt(struct skcipher_request *req, bool encrypt)
rctx->cmd.u.aes.src_len = req->cryptlen;
rctx->cmd.u.aes.dst = req->dst;
ret = ccp_crypto_enqueue_request(&req->base, &rctx->cmd);
return ret;
return ccp_crypto_enqueue_request(&req->base, &rctx->cmd);
}
static int ccp_aes_encrypt(struct skcipher_request *req)


@ -632,6 +632,20 @@ static int ccp_terminate_all(struct dma_chan *dma_chan)
return 0;
}
static void ccp_dma_release(struct ccp_device *ccp)
{
struct ccp_dma_chan *chan;
struct dma_chan *dma_chan;
unsigned int i;
for (i = 0; i < ccp->cmd_q_count; i++) {
chan = ccp->ccp_dma_chan + i;
dma_chan = &chan->dma_chan;
tasklet_kill(&chan->cleanup_tasklet);
list_del_rcu(&dma_chan->device_node);
}
}
int ccp_dmaengine_register(struct ccp_device *ccp)
{
struct ccp_dma_chan *chan;
@ -736,6 +750,7 @@ int ccp_dmaengine_register(struct ccp_device *ccp)
return 0;
err_reg:
ccp_dma_release(ccp);
kmem_cache_destroy(ccp->dma_desc_cache);
err_cache:
@ -752,6 +767,7 @@ void ccp_dmaengine_unregister(struct ccp_device *ccp)
return;
dma_async_device_unregister(dma_dev);
ccp_dma_release(ccp);
kmem_cache_destroy(ccp->dma_desc_cache);
kmem_cache_destroy(ccp->dma_cmd_cache);


@ -413,7 +413,7 @@ static int __sev_platform_init_locked(int *error)
{
struct psp_device *psp = psp_master;
struct sev_device *sev;
int rc, psp_ret;
int rc, psp_ret = -1;
int (*init_function)(int *error);
if (!psp || !psp->sev_data)


@ -258,6 +258,13 @@ static int cc_map_sg(struct device *dev, struct scatterlist *sg,
{
int ret = 0;
if (!nbytes) {
*mapped_nents = 0;
*lbytes = 0;
*nents = 0;
return 0;
}
*nents = cc_get_sgl_nents(dev, sg, nbytes, lbytes);
if (*nents > max_sg_nents) {
*nents = 0;
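The new early return keeps zero-length requests away from the DMA mapping path entirely: mapping zero entries is outside the DMA API contract. A hedged caller-side sketch of that contract, with hypothetical names:

#include <linux/dma-mapping.h>

/* Sketch: never hand a zero-length scatterlist to the DMA API. */
static int example_map(struct device *dev, struct scatterlist *sg,
                       unsigned int nbytes, int nents)
{
        if (!nbytes)
                return 0;       /* nothing to map; dma_map_sg() with 0 entries is invalid */

        return dma_map_sg(dev, sg, nents, DMA_BIDIRECTIONAL) > 0 ? 0 : -ENOMEM;
}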


@ -257,8 +257,8 @@ static void cc_cipher_exit(struct crypto_tfm *tfm)
&ctx_p->user.key_dma_addr);
/* Free key buffer in context */
kfree_sensitive(ctx_p->user.key);
dev_dbg(dev, "Free key buffer in context. key=@%p\n", ctx_p->user.key);
kfree_sensitive(ctx_p->user.key);
}
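The reorder above is the whole fix: the old code called kfree_sensitive() on the key and then handed the freed pointer to dev_dbg(). A tiny hedged sketch of the rule, with a hypothetical context struct:

#include <linux/device.h>
#include <linux/slab.h>

struct example_ctx {
        u8 *key;
};

/* Sketch: log first, free second -- never reference a freed pointer. */
static void example_exit(struct device *dev, struct example_ctx *ctx)
{
        dev_dbg(dev, "freeing key=@%p\n", ctx->key);    /* pointer still valid */
        kfree_sensitive(ctx->key);                      /* zeroise, then free */
        ctx->key = NULL;                                /* defensive: no dangling pointer */
}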
struct tdes_keys {


@ -23,8 +23,8 @@ static bool sl3516_ce_need_fallback(struct skcipher_request *areq)
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
struct sl3516_ce_cipher_tfm_ctx *op = crypto_skcipher_ctx(tfm);
struct sl3516_ce_dev *ce = op->ce;
struct scatterlist *in_sg = areq->src;
struct scatterlist *out_sg = areq->dst;
struct scatterlist *in_sg;
struct scatterlist *out_sg;
struct scatterlist *sg;
if (areq->cryptlen == 0 || areq->cryptlen % 16) {
@ -264,7 +264,9 @@ static int sl3516_ce_handle_cipher_request(struct crypto_engine *engine, void *a
struct skcipher_request *breq = container_of(areq, struct skcipher_request, base);
err = sl3516_ce_cipher(breq);
local_bh_disable();
crypto_finalize_skcipher_request(engine, breq, err);
local_bh_enable();
return 0;
}
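crypto_engine invokes do_one_request() callbacks from kthread context, while crypto completions are traditionally delivered from softirq context with BH disabled; completion handlers stacked on top may rely on that. Bracketing the finalize call with local_bh_disable()/local_bh_enable() restores the expectation. A hedged sketch of the pattern (the handler name is hypothetical, the crypto_engine calls are the real API):

#include <crypto/engine.h>
#include <crypto/skcipher.h>
#include <linux/bottom_half.h>

/* Sketch: finalize a request from task context with BH off. */
static int example_do_one_request(struct crypto_engine *engine, void *areq)
{
        struct skcipher_request *req =
                container_of(areq, struct skcipher_request, base);
        int err = 0;    /* result of the actual cipher work would go here */

        local_bh_disable();
        crypto_finalize_skcipher_request(engine, req, err);
        local_bh_enable();
        return 0;
}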


@ -3840,7 +3840,7 @@ static void qm_clear_queues(struct hisi_qm *qm)
for (i = 0; i < qm->qp_num; i++) {
qp = &qm->qp_array[i];
if (qp->is_resetting)
if (qp->is_in_kernel && qp->is_resetting)
memset(qp->qdma.va, 0, qp->qdma.size);
}
@ -4295,7 +4295,7 @@ static void qm_vf_get_qos(struct hisi_qm *qm, u32 fun_num)
static int qm_vf_read_qos(struct hisi_qm *qm)
{
int cnt = 0;
int ret;
int ret = -EINVAL;
/* reset mailbox qos val */
qm->mb_qos = 0;


@ -42,6 +42,8 @@
#define SEC_DE_OFFSET_V3 9
#define SEC_SCENE_OFFSET_V3 5
#define SEC_CKEY_OFFSET_V3 13
#define SEC_CTR_CNT_OFFSET 25
#define SEC_CTR_CNT_ROLLOVER 2
#define SEC_SRC_SGL_OFFSET_V3 11
#define SEC_DST_SGL_OFFSET_V3 14
#define SEC_CALG_OFFSET_V3 4
@ -63,6 +65,7 @@
#define SEC_AUTH_CIPHER 0x1
#define SEC_MAX_MAC_LEN 64
#define SEC_MAX_AAD_LEN 65535
#define SEC_MAX_CCM_AAD_LEN 65279
#define SEC_TOTAL_MAC_SZ (SEC_MAX_MAC_LEN * QM_Q_DEPTH)
#define SEC_PBUF_SZ 512
@ -237,7 +240,7 @@ static void sec_req_cb(struct hisi_qp *qp, void *resp)
if (unlikely(type != type_supported)) {
atomic64_inc(&dfx->err_bd_cnt);
pr_err("err bd type [%d]\n", type);
pr_err("err bd type [%u]\n", type);
return;
}
@ -641,13 +644,15 @@ static int sec_skcipher_fbtfm_init(struct crypto_skcipher *tfm)
struct sec_cipher_ctx *c_ctx = &ctx->c_ctx;
c_ctx->fallback = false;
/* Currently, only XTS mode needs a fallback tfm when using a 192-bit key */
if (likely(strncmp(alg, "xts", SEC_XTS_NAME_SZ)))
return 0;
c_ctx->fbtfm = crypto_alloc_sync_skcipher(alg, 0,
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(c_ctx->fbtfm)) {
pr_err("failed to alloc fallback tfm!\n");
pr_err("failed to alloc xts mode fallback tfm!\n");
return PTR_ERR(c_ctx->fbtfm);
}
@ -808,7 +813,7 @@ static int sec_skcipher_setkey(struct crypto_skcipher *tfm, const u8 *key,
}
memcpy(c_ctx->c_key, key, keylen);
if (c_ctx->fallback) {
if (c_ctx->fallback && c_ctx->fbtfm) {
ret = crypto_sync_skcipher_setkey(c_ctx->fbtfm, key, keylen);
if (ret) {
dev_err(dev, "failed to set fallback skcipher key!\n");
@ -1300,6 +1305,10 @@ static int sec_skcipher_bd_fill_v3(struct sec_ctx *ctx, struct sec_req *req)
cipher = SEC_CIPHER_DEC;
sec_sqe3->c_icv_key |= cpu_to_le16(cipher);
/* Set the CTR counter mode to 128-bit rollover */
sec_sqe3->auth_mac_key = cpu_to_le32((u32)SEC_CTR_CNT_ROLLOVER <<
SEC_CTR_CNT_OFFSET);
if (req->use_pbuf) {
bd_param |= SEC_PBUF << SEC_SRC_SGL_OFFSET_V3;
bd_param |= SEC_PBUF << SEC_DST_SGL_OFFSET_V3;
@ -1614,7 +1623,7 @@ static void sec_auth_bd_fill_ex_v3(struct sec_auth_ctx *ctx, int dir,
sqe3->auth_mac_key |= cpu_to_le32((u32)SEC_AUTH_TYPE1);
sqe3->huk_iv_seq &= SEC_CIPHER_AUTH_V3;
} else {
sqe3->auth_mac_key |= cpu_to_le32((u32)SEC_AUTH_TYPE1);
sqe3->auth_mac_key |= cpu_to_le32((u32)SEC_AUTH_TYPE2);
sqe3->huk_iv_seq |= SEC_AUTH_CIPHER_V3;
}
sqe3->a_len_key = cpu_to_le32(c_req->c_len + aq->assoclen);
@ -2032,13 +2041,12 @@ static int sec_skcipher_soft_crypto(struct sec_ctx *ctx,
struct skcipher_request *sreq, bool encrypt)
{
struct sec_cipher_ctx *c_ctx = &ctx->c_ctx;
SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, c_ctx->fbtfm);
struct device *dev = ctx->dev;
int ret;
SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, c_ctx->fbtfm);
if (!c_ctx->fbtfm) {
dev_err(dev, "failed to check fallback tfm\n");
dev_err_ratelimited(dev, "the soft tfm isn't supported in the current system.\n");
return -EINVAL;
}
@ -2219,6 +2227,10 @@ static int sec_aead_spec_check(struct sec_ctx *ctx, struct sec_req *sreq)
}
if (c_mode == SEC_CMODE_CCM) {
if (unlikely(req->assoclen > SEC_MAX_CCM_AAD_LEN)) {
dev_err_ratelimited(dev, "CCM input aad parameter is too long!\n");
return -EINVAL;
}
ret = aead_iv_demension_check(req);
if (ret) {
dev_err(dev, "aead input iv param error!\n");
@ -2256,7 +2268,6 @@ static int sec_aead_param_check(struct sec_ctx *ctx, struct sec_req *sreq)
if (ctx->sec->qm.ver == QM_HW_V2) {
if (unlikely(!req->cryptlen || (!sreq->c_req.encrypt &&
req->cryptlen <= authsize))) {
dev_err(dev, "Kunpeng920 not support 0 length!\n");
ctx->a_ctx.fallback = true;
return -EINVAL;
}
@ -2284,9 +2295,10 @@ static int sec_aead_soft_crypto(struct sec_ctx *ctx,
struct aead_request *aead_req,
bool encrypt)
{
struct aead_request *subreq = aead_request_ctx(aead_req);
struct sec_auth_ctx *a_ctx = &ctx->a_ctx;
struct device *dev = ctx->dev;
struct aead_request *subreq;
int ret;
/* Kunpeng920 aead mode does not support zero-length input */
if (!a_ctx->fallback_aead_tfm) {
@ -2294,6 +2306,10 @@ static int sec_aead_soft_crypto(struct sec_ctx *ctx,
return -EINVAL;
}
subreq = aead_request_alloc(a_ctx->fallback_aead_tfm, GFP_KERNEL);
if (!subreq)
return -ENOMEM;
aead_request_set_tfm(subreq, a_ctx->fallback_aead_tfm);
aead_request_set_callback(subreq, aead_req->base.flags,
aead_req->base.complete, aead_req->base.data);
@ -2301,8 +2317,13 @@ static int sec_aead_soft_crypto(struct sec_ctx *ctx,
aead_req->cryptlen, aead_req->iv);
aead_request_set_ad(subreq, aead_req->assoclen);
return encrypt ? crypto_aead_encrypt(subreq) :
crypto_aead_decrypt(subreq);
if (encrypt)
ret = crypto_aead_encrypt(subreq);
else
ret = crypto_aead_decrypt(subreq);
aead_request_free(subreq);
return ret;
}
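SYNC_SKCIPHER_REQUEST_ON_STACK() works above because synchronous skciphers bound their request size; there is no such on-stack helper for an arbitrary AEAD fallback, whose reqsize is only known per-tfm, hence the switch to aead_request_alloc()/aead_request_free(). A hedged sketch of the allocate-use-free shape (everything except the crypto API calls is hypothetical):

#include <crypto/aead.h>

/* Sketch: forward an AEAD operation to a fallback tfm. */
static int example_fallback_crypt(struct crypto_aead *fb_tfm,
                                  struct aead_request *req, bool enc)
{
        struct aead_request *subreq;
        int ret;

        subreq = aead_request_alloc(fb_tfm, GFP_KERNEL); /* sized for fb_tfm */
        if (!subreq)
                return -ENOMEM;

        aead_request_set_callback(subreq, req->base.flags,
                                  req->base.complete, req->base.data);
        aead_request_set_crypt(subreq, req->src, req->dst,
                               req->cryptlen, req->iv);
        aead_request_set_ad(subreq, req->assoclen);

        ret = enc ? crypto_aead_encrypt(subreq) : crypto_aead_decrypt(subreq);
        aead_request_free(subreq);
        return ret;
}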
static int sec_aead_crypto(struct aead_request *a_req, bool encrypt)


@ -354,8 +354,10 @@ struct sec_sqe3 {
* akey_len: 9~14 bits
* a_alg: 15~20 bits
* key_sel: 21~24 bits
* updata_key: 25 bits
* reserved: 26~31 bits
* ctr_count_mode/sm4_xts: 25~26 bits
* sva_prefetch: 27 bits
* key_wrap_num: 28~30 bits
* update_key: 31 bits
*/
__le32 auth_mac_key;
__le32 salt;


@ -90,6 +90,10 @@
SEC_USER1_WB_DATA_SSV)
#define SEC_USER1_SMMU_SVA (SEC_USER1_SMMU_NORMAL | SEC_USER1_SVA_SET)
#define SEC_USER1_SMMU_MASK (~SEC_USER1_SVA_SET)
#define SEC_INTERFACE_USER_CTRL0_REG_V3 0x302220
#define SEC_INTERFACE_USER_CTRL1_REG_V3 0x302224
#define SEC_USER1_SMMU_NORMAL_V3 (BIT(23) | BIT(17) | BIT(11) | BIT(5))
#define SEC_USER1_SMMU_MASK_V3 0xFF79E79E
#define SEC_CORE_INT_STATUS_M_ECC BIT(2)
#define SEC_PREFETCH_CFG 0x301130
@ -335,6 +339,41 @@ static void sec_set_endian(struct hisi_qm *qm)
writel_relaxed(reg, qm->io_base + SEC_CONTROL_REG);
}
static void sec_engine_sva_config(struct hisi_qm *qm)
{
u32 reg;
if (qm->ver > QM_HW_V2) {
reg = readl_relaxed(qm->io_base +
SEC_INTERFACE_USER_CTRL0_REG_V3);
reg |= SEC_USER0_SMMU_NORMAL;
writel_relaxed(reg, qm->io_base +
SEC_INTERFACE_USER_CTRL0_REG_V3);
reg = readl_relaxed(qm->io_base +
SEC_INTERFACE_USER_CTRL1_REG_V3);
reg &= SEC_USER1_SMMU_MASK_V3;
reg |= SEC_USER1_SMMU_NORMAL_V3;
writel_relaxed(reg, qm->io_base +
SEC_INTERFACE_USER_CTRL1_REG_V3);
} else {
reg = readl_relaxed(qm->io_base +
SEC_INTERFACE_USER_CTRL0_REG);
reg |= SEC_USER0_SMMU_NORMAL;
writel_relaxed(reg, qm->io_base +
SEC_INTERFACE_USER_CTRL0_REG);
reg = readl_relaxed(qm->io_base +
SEC_INTERFACE_USER_CTRL1_REG);
reg &= SEC_USER1_SMMU_MASK;
if (qm->use_sva)
reg |= SEC_USER1_SMMU_SVA;
else
reg |= SEC_USER1_SMMU_NORMAL;
writel_relaxed(reg, qm->io_base +
SEC_INTERFACE_USER_CTRL1_REG);
}
}
static void sec_open_sva_prefetch(struct hisi_qm *qm)
{
u32 val;
@ -426,26 +465,18 @@ static int sec_engine_init(struct hisi_qm *qm)
reg |= (0x1 << SEC_TRNG_EN_SHIFT);
writel_relaxed(reg, qm->io_base + SEC_CONTROL_REG);
reg = readl_relaxed(qm->io_base + SEC_INTERFACE_USER_CTRL0_REG);
reg |= SEC_USER0_SMMU_NORMAL;
writel_relaxed(reg, qm->io_base + SEC_INTERFACE_USER_CTRL0_REG);
reg = readl_relaxed(qm->io_base + SEC_INTERFACE_USER_CTRL1_REG);
reg &= SEC_USER1_SMMU_MASK;
if (qm->use_sva && qm->ver == QM_HW_V2)
reg |= SEC_USER1_SMMU_SVA;
else
reg |= SEC_USER1_SMMU_NORMAL;
writel_relaxed(reg, qm->io_base + SEC_INTERFACE_USER_CTRL1_REG);
sec_engine_sva_config(qm);
writel(SEC_SINGLE_PORT_MAX_TRANS,
qm->io_base + AM_CFG_SINGLE_PORT_MAX_TRANS);
writel(SEC_SAA_ENABLE, qm->io_base + SEC_SAA_EN_REG);
/* Enable sm4 extra mode, as ctr/ecb */
/* On HW V2, enable sm4 extra modes such as ctr/ecb */
if (qm->ver < QM_HW_V3)
writel_relaxed(SEC_BD_ERR_CHK_EN0,
qm->io_base + SEC_BD_ERR_CHK_EN_REG0);
/* Enable sm4 xts mode multiple iv */
writel_relaxed(SEC_BD_ERR_CHK_EN1,
qm->io_base + SEC_BD_ERR_CHK_EN_REG1);


@ -47,6 +47,7 @@ config CRYPTO_DEV_OCTEONTX2_CPT
select CRYPTO_SKCIPHER
select CRYPTO_HASH
select CRYPTO_AEAD
select NET_DEVLINK
help
This driver allows you to utilize the Marvell Cryptographic
Accelerator Unit (CPT) found in the OcteonTX2 series of processors.


@ -1639,11 +1639,8 @@ static void swap_func(void *lptr, void *rptr, int size)
{
struct cpt_device_desc *ldesc = (struct cpt_device_desc *) lptr;
struct cpt_device_desc *rdesc = (struct cpt_device_desc *) rptr;
struct cpt_device_desc desc;
desc = *ldesc;
*ldesc = *rdesc;
*rdesc = desc;
swap(*ldesc, *rdesc);
}
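swap() is a typeof()-based macro from the core headers (linux/minmax.h in recent kernels, historically linux/kernel.h; the exact home is an assumption worth checking per tree). Because the temporary takes the operands' own type, whole structs swap by plain assignment:

/* swap(a, b) is approximately:
 *      do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
 */
static void example_swap_ints(int *a, int *b)
{
        swap(*a, *b);   /* typed temporary plus two assignments, no boilerplate */
}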
int otx_cpt_crypto_init(struct pci_dev *pdev, struct module *mod,


@ -204,7 +204,6 @@ static int alloc_command_queues(struct otx_cptvf *cptvf,
/* per queue initialization */
for (i = 0; i < cptvf->num_queues; i++) {
c_size = 0;
rem_q_size = q_size;
first = NULL;
last = NULL;


@ -157,5 +157,6 @@ struct otx2_cptlfs_info;
int otx2_cpt_attach_rscrs_msg(struct otx2_cptlfs_info *lfs);
int otx2_cpt_detach_rsrcs_msg(struct otx2_cptlfs_info *lfs);
int otx2_cpt_msix_offset_msg(struct otx2_cptlfs_info *lfs);
int otx2_cpt_sync_mbox_msg(struct otx2_mbox *mbox);
#endif /* __OTX2_CPT_COMMON_H */


@ -202,3 +202,17 @@ int otx2_cpt_msix_offset_msg(struct otx2_cptlfs_info *lfs)
}
return ret;
}
int otx2_cpt_sync_mbox_msg(struct otx2_mbox *mbox)
{
int err;
if (!otx2_mbox_nonempty(mbox, 0))
return 0;
otx2_mbox_msg_send(mbox, 0);
err = otx2_mbox_wait_for_rsp(mbox, 0);
if (err)
return err;
return otx2_mbox_check_rsp_msgs(mbox, 0);
}


@ -26,12 +26,22 @@
*/
#define OTX2_CPT_INST_QLEN_MSGS ((OTX2_CPT_SIZE_DIV40 - 1) * 40)
/*
* LDWB is used incorrectly when IQB_LDWB = 1 and the CPT instruction
* queue has fewer than 320 free entries. As a workaround, grow the HW
* instruction queue by 320 entries and expose 320 fewer entries to
* SW/NIX RX.
*/
#define OTX2_CPT_INST_QLEN_EXTRA_BYTES (320 * OTX2_CPT_INST_SIZE)
#define OTX2_CPT_EXTRA_SIZE_DIV40 (320/40)
/* CPT instruction queue length in bytes */
#define OTX2_CPT_INST_QLEN_BYTES (OTX2_CPT_SIZE_DIV40 * 40 * \
OTX2_CPT_INST_SIZE)
#define OTX2_CPT_INST_QLEN_BYTES \
((OTX2_CPT_SIZE_DIV40 * 40 * OTX2_CPT_INST_SIZE) + \
OTX2_CPT_INST_QLEN_EXTRA_BYTES)
/* CPT instruction group queue length in bytes */
#define OTX2_CPT_INST_GRP_QLEN_BYTES (OTX2_CPT_SIZE_DIV40 * 16)
#define OTX2_CPT_INST_GRP_QLEN_BYTES \
((OTX2_CPT_SIZE_DIV40 + OTX2_CPT_EXTRA_SIZE_DIV40) * 16)
/* CPT FC length in bytes */
#define OTX2_CPT_Q_FC_LEN 128
@ -179,7 +189,8 @@ static inline void otx2_cptlf_do_set_iqueue_size(struct otx2_cptlf_info *lf)
{
union otx2_cptx_lf_q_size lf_q_size = { .u = 0x0 };
lf_q_size.s.size_div40 = OTX2_CPT_SIZE_DIV40;
lf_q_size.s.size_div40 = OTX2_CPT_SIZE_DIV40 +
OTX2_CPT_EXTRA_SIZE_DIV40;
otx2_cpt_write64(lf->lfs->reg_base, BLKADDR_CPT0, lf->slot,
OTX2_CPT_LF_Q_SIZE, lf_q_size.u);
}
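Worked through, the bookkeeping is: the hardware counts the queue in 40-instruction chunks (the DIV40 values), so 320 instructions of LDWB headroom add 320 / 40 = 8 chunks, and the byte reservation is 320 times the instruction size. A sketch of the numbers, assuming the usual 64-byte OTX2 CPT instruction:

/* Sketch of the arithmetic; the 64-byte instruction size is an assumption. */
#define EXAMPLE_INST_SIZE       64                              /* bytes per instruction */
#define EXAMPLE_EXTRA_INSTS     320                             /* LDWB workaround headroom */
#define EXAMPLE_EXTRA_DIV40     (EXAMPLE_EXTRA_INSTS / 40)      /* = 8 chunks */
#define EXAMPLE_EXTRA_BYTES     (EXAMPLE_EXTRA_INSTS * EXAMPLE_INST_SIZE) /* = 20480 */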


@ -46,6 +46,7 @@ struct otx2_cptpf_dev {
struct workqueue_struct *flr_wq;
struct cptpf_flr_work *flr_work;
struct mutex lock; /* serialize mailbox access */
unsigned long cap_flag;
u8 pf_id; /* RVU PF number */


@ -140,10 +140,13 @@ static void cptpf_flr_wq_handler(struct work_struct *work)
vf = flr_work - pf->flr_work;
mutex_lock(&pf->lock);
req = otx2_mbox_alloc_msg_rsp(mbox, 0, sizeof(*req),
sizeof(struct msg_rsp));
if (!req)
if (!req) {
mutex_unlock(&pf->lock);
return;
}
req->sig = OTX2_MBOX_REQ_SIG;
req->id = MBOX_MSG_VF_FLR;
@ -151,6 +154,7 @@ static void cptpf_flr_wq_handler(struct work_struct *work)
req->pcifunc |= (vf + 1) & RVU_PFVF_FUNC_MASK;
otx2_cpt_send_mbox_msg(mbox, pf->pdev);
if (!otx2_cpt_sync_mbox_msg(&pf->afpf_mbox)) {
if (vf >= 64) {
reg = 1;
@ -162,6 +166,8 @@ static void cptpf_flr_wq_handler(struct work_struct *work)
otx2_cpt_write64(pf->reg_base, BLKADDR_RVUM, 0,
RVU_PF_VFFLR_INT_ENA_W1SX(reg), BIT_ULL(vf));
}
mutex_unlock(&pf->lock);
}
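The fix pairs every exit from the handler with mutex_unlock(), including the new allocation-failure path. The missed-unlock class of bug is commonly avoided with a single unlock label; a hedged sketch with hypothetical helpers:

#include <linux/mutex.h>

static bool try_allocate(void);         /* hypothetical */
static void do_work(void);              /* hypothetical */

/* Sketch: one exit path owns the unlock, so no path can miss it. */
static int example_locked_op(struct mutex *lock)
{
        int ret = 0;

        mutex_lock(lock);
        if (!try_allocate()) {
                ret = -ENOMEM;
                goto out_unlock;
        }
        do_work();
out_unlock:
        mutex_unlock(lock);
        return ret;
}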
static irqreturn_t cptpf_vf_flr_intr(int __always_unused irq, void *arg)
{
@ -468,6 +474,7 @@ static int cptpf_afpf_mbox_init(struct otx2_cptpf_dev *cptpf)
goto error;
INIT_WORK(&cptpf->afpf_mbox_work, otx2_cptpf_afpf_mbox_handler);
mutex_init(&cptpf->lock);
return 0;
error:


@ -18,9 +18,12 @@ static int forward_to_af(struct otx2_cptpf_dev *cptpf,
struct mbox_msghdr *msg;
int ret;
mutex_lock(&cptpf->lock);
msg = otx2_mbox_alloc_msg(&cptpf->afpf_mbox, 0, size);
if (msg == NULL)
if (msg == NULL) {
mutex_unlock(&cptpf->lock);
return -ENOMEM;
}
memcpy((uint8_t *)msg + sizeof(struct mbox_msghdr),
(uint8_t *)req + sizeof(struct mbox_msghdr), size);
@ -29,15 +32,19 @@ static int forward_to_af(struct otx2_cptpf_dev *cptpf,
msg->sig = req->sig;
msg->ver = req->ver;
otx2_mbox_msg_send(&cptpf->afpf_mbox, 0);
ret = otx2_mbox_wait_for_rsp(&cptpf->afpf_mbox, 0);
ret = otx2_cpt_sync_mbox_msg(&cptpf->afpf_mbox);
/* Error code -EIO indicates a communication failure with the AF.
* Any other error code means the AF processed the VF messages and
* set error codes in the response messages (if any), so simply
* forward the responses to the VF.
*/
if (ret == -EIO) {
dev_err(&cptpf->pdev->dev, "RVU MBOX timeout.\n");
dev_warn(&cptpf->pdev->dev,
"AF not responding to VF%d messages\n", vf->vf_id);
mutex_unlock(&cptpf->lock);
return ret;
} else if (ret) {
dev_err(&cptpf->pdev->dev, "RVU MBOX error: %d.\n", ret);
return -EFAULT;
}
mutex_unlock(&cptpf->lock);
return 0;
}
@ -204,6 +211,10 @@ void otx2_cptpf_vfpf_mbox_handler(struct work_struct *work)
if (err == -ENOMEM || err == -EIO)
break;
offset = msg->next_msgoff;
/* Write barrier required for VF responses which are handled by
* PF driver and not forwarded to AF.
*/
smp_wmb();
}
/* Send mbox responses to VF */
if (mdev->num_msgs)
@ -350,6 +361,8 @@ void otx2_cptpf_afpf_mbox_handler(struct work_struct *work)
process_afpf_mbox_msg(cptpf, msg);
offset = msg->next_msgoff;
/* Sync VF response ready to be sent */
smp_wmb();
mdev->msgs_acked++;
}
otx2_mbox_reset(afpf_mbox, 0);
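Both new smp_wmb() calls order "fill in the response" before "signal that responses are ready" (the msgs_acked increment on one side, the mailbox send on the other). A write barrier only helps if the consumer pairs it with a read barrier; a generic hedged sketch of that pairing, with all names hypothetical:

#include <asm/barrier.h>
#include <asm/processor.h>      /* cpu_relax() */
#include <linux/compiler.h>

struct example_chan {
        u64 payload;
        int msgs_acked;
};

/* Producer: publish the payload before the ready signal. */
static void example_produce(struct example_chan *c, u64 value)
{
        c->payload = value;                             /* 1. write the response */
        smp_wmb();                                      /* 2. order it before the signal */
        WRITE_ONCE(c->msgs_acked, c->msgs_acked + 1);   /* 3. signal readiness */
}

/* Consumer: observe the signal, then safely read the payload. */
static u64 example_consume(struct example_chan *c, int expected)
{
        while (READ_ONCE(c->msgs_acked) < expected)
                cpu_relax();
        smp_rmb();              /* pairs with the producer's smp_wmb() */
        return c->payload;
}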


@ -1076,6 +1076,39 @@ static void delete_engine_grps(struct pci_dev *pdev,
delete_engine_group(&pdev->dev, &eng_grps->grp[i]);
}
#define PCI_DEVID_CN10K_RNM 0xA098
#define RNM_ENTROPY_STATUS 0x8
static void rnm_to_cpt_errata_fixup(struct device *dev)
{
struct pci_dev *pdev;
void __iomem *base;
int timeout = 5000;
pdev = pci_get_device(PCI_VENDOR_ID_CAVIUM, PCI_DEVID_CN10K_RNM, NULL);
if (!pdev)
return;
base = pci_ioremap_bar(pdev, 0);
if (!base)
goto put_pdev;
while ((readq(base + RNM_ENTROPY_STATUS) & 0x7F) != 0x40) {
cpu_relax();
udelay(1);
timeout--;
if (!timeout) {
dev_warn(dev, "RNM is not producing entropy\n");
break;
}
}
iounmap(base);
put_pdev:
pci_dev_put(pdev);
}
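The open-coded loop polls RNM_ENTROPY_STATUS until the low field reads 0x40, giving up after roughly 5 ms. The same wait could be written with the generic iopoll helper; a hedged alternative sketch (same register and condition as above, assuming a 64-bit build where readq() is available):

#include <linux/iopoll.h>

/* Sketch: poll every 1 us, time out after 5000 us. */
static int example_wait_rnm_ready(void __iomem *base)
{
        u64 status;

        return readq_poll_timeout(base + RNM_ENTROPY_STATUS, status,
                                  (status & 0x7F) == 0x40, 1, 5000);
}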
int otx2_cpt_get_eng_grp(struct otx2_cpt_eng_grps *eng_grps, int eng_type)
{
@ -1111,6 +1144,7 @@ int otx2_cpt_create_eng_grps(struct otx2_cptpf_dev *cptpf,
struct otx2_cpt_engines engs[OTX2_CPT_MAX_ETYPES_PER_GRP] = { {0} };
struct pci_dev *pdev = cptpf->pdev;
struct fw_info_t fw_info;
u64 reg_val;
int ret = 0;
mutex_lock(&eng_grps->lock);
@ -1189,9 +1223,17 @@ int otx2_cpt_create_eng_grps(struct otx2_cptpf_dev *cptpf,
if (is_dev_otx2(pdev))
goto unlock;
/*
* Ensure RNM_ENTROPY_STATUS[NORMAL_CNT] = 0x40 before writing
* CPT_AF_CTL[RNM_REQ_EN] = 1 as a workaround for HW errata.
*/
rnm_to_cpt_errata_fixup(&pdev->dev);
/*
* Configure engine group mask to allow context prefetching
* for the groups.
* for the groups and enable random number requests, so that CPT
* can request random numbers from the RNM.
*/
otx2_cpt_write_af_reg(&cptpf->afpf_mbox, pdev, CPT_AF_CTL,
OTX2_CPT_ALL_ENG_GRPS_MASK << 3 | BIT_ULL(16),
@ -1203,6 +1245,18 @@ int otx2_cpt_create_eng_grps(struct otx2_cptpf_dev *cptpf,
*/
otx2_cpt_write_af_reg(&cptpf->afpf_mbox, pdev, CPT_AF_CTX_FLUSH_TIMER,
CTX_FLUSH_TIMER_CNT, BLKADDR_CPT0);
/*
* Set CPT_AF_DIAG[FLT_DIS] as a workaround for a HW erratum: when
* FLT_DIS = 0 and a CPT engine access to LLC/DRAM encounters a
* fault/poison, a rare case may result in unpredictable data being
* delivered to a CPT engine.
*/
otx2_cpt_read_af_reg(&cptpf->afpf_mbox, pdev, CPT_AF_DIAG, &reg_val,
BLKADDR_CPT0);
otx2_cpt_write_af_reg(&cptpf->afpf_mbox, pdev, CPT_AF_DIAG,
reg_val | BIT_ULL(24), BLKADDR_CPT0);
mutex_unlock(&eng_grps->lock);
return 0;


@ -1634,16 +1634,13 @@ static inline int cpt_register_algs(void)
{
int i, err = 0;
if (!IS_ENABLED(CONFIG_DM_CRYPT)) {
for (i = 0; i < ARRAY_SIZE(otx2_cpt_skciphers); i++)
otx2_cpt_skciphers[i].base.cra_flags &=
~CRYPTO_ALG_DEAD;
otx2_cpt_skciphers[i].base.cra_flags &= ~CRYPTO_ALG_DEAD;
err = crypto_register_skciphers(otx2_cpt_skciphers,
ARRAY_SIZE(otx2_cpt_skciphers));
if (err)
return err;
}
for (i = 0; i < ARRAY_SIZE(otx2_cpt_aeads); i++)
otx2_cpt_aeads[i].base.cra_flags &= ~CRYPTO_ALG_DEAD;

Some files were not shown because too many files have changed in this diff.