mirror of https://github.com/openssl/openssl.git (synced 2024-11-27 12:04:38 +08:00)
Many spelling fixes/typo's corrected.
Around 138 distinct errors found and fixed; thanks!

Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3459)
parent b4d0fa49d9
commit 46f4e1bec5
46
CHANGES
@@ -889,7 +889,7 @@
 *) Add support for setting the minimum and maximum supported protocol.
 It can bet set via the SSL_set_min_proto_version() and
 SSL_set_max_proto_version(), or via the SSL_CONF's MinProtocol and
-MaxProtcol. It's recommended to use the new APIs to disable
+MaxProtocol. It's recommended to use the new APIs to disable
 protocols instead of disabling individual protocols using
 SSL_set_options() or SSL_CONF's Protocol. This change also
 removes support for disabling TLS 1.2 in the OpenSSL TLS
@@ -2853,7 +2853,7 @@
 *) OpenSSL 1.0.0 sets SSL_OP_ALL to 0x80000FFFL and OpenSSL 1.0.1 and
 1.0.1a set SSL_OP_NO_TLSv1_1 to 0x00000400L which would unfortunately
 mean any application compiled against OpenSSL 1.0.0 headers setting
-SSL_OP_ALL would also set SSL_OP_NO_TLSv1_1, unintentionally disablng
+SSL_OP_ALL would also set SSL_OP_NO_TLSv1_1, unintentionally disabling
 TLS 1.1 also. Fix this by changing the value of SSL_OP_NO_TLSv1_1 to
 0x10000000L Any application which was previously compiled against
 OpenSSL 1.0.1 or 1.0.1a headers and which cares about SSL_OP_NO_TLSv1_1
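The flag collision described in the CHANGES entry above can be checked with plain bit arithmetic. The following is a minimal sketch that uses only the three constants quoted in that entry; the `DEMO_` macro names and the helper `flag_is_set` are made up for illustration and are not OpenSSL identifiers.

```c
#include <stdio.h>

/* Values quoted in the CHANGES entry; the DEMO_ names are illustrative only. */
#define DEMO_SSL_OP_ALL_1_0_0 0x80000FFFUL /* SSL_OP_ALL in 1.0.0 headers */
#define DEMO_NO_TLSV1_1_OLD   0x00000400UL /* SSL_OP_NO_TLSv1_1 in 1.0.1/1.0.1a */
#define DEMO_NO_TLSV1_1_NEW   0x10000000UL /* SSL_OP_NO_TLSv1_1 after the fix */

/* Returns 1 if every bit of `flag` is present in `options`, else 0. */
int flag_is_set(unsigned long options, unsigned long flag)
{
    return (options & flag) == flag;
}

/* Prints whether each value of the flag overlaps the 1.0.0 SSL_OP_ALL mask. */
void report_collision(void)
{
    printf("old value overlaps SSL_OP_ALL: %d\n",
           flag_is_set(DEMO_SSL_OP_ALL_1_0_0, DEMO_NO_TLSV1_1_OLD));
    printf("new value overlaps SSL_OP_ALL: %d\n",
           flag_is_set(DEMO_SSL_OP_ALL_1_0_0, DEMO_NO_TLSV1_1_NEW));
}
```

Bit 10 (0x400) lies inside the low twelve bits that the 1.0.0 mask sets, which is why the old value was switched on by accident; bit 28 lies outside the mask.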
@@ -2862,7 +2862,7 @@
 in unlike event, limit maximum offered version to TLS 1.0 [see below].
 [Steve Henson]
 
-*) In order to ensure interoperabilty SSL_OP_NO_protocolX does not
+*) In order to ensure interoperability SSL_OP_NO_protocolX does not
 disable just protocol X, but all protocols above X *if* there are
 protocols *below* X still enabled. In more practical terms it means
 that if application wants to disable TLS1.0 in favor of TLS1.1 and
@@ -3630,7 +3630,7 @@
 
 SSL_set_tlsext_opaque_prf_input(ssl, src, len) is used to set the
 opaque PRF input value to use in the handshake. This will create
-an interal copy of the length-'len' string at 'src', and will
+an internal copy of the length-'len' string at 'src', and will
 return non-zero for success.
 
 To get more control and flexibility, provide a callback function
@@ -3740,8 +3740,8 @@
 most recently disabled ciphersuites when "HIGH" is parsed).
 
 Also, change ssl_create_cipher_list() (using this new
-funcionality) such that between otherwise identical
-cihpersuites, ephemeral ECDH is preferred over ephemeral DH in
+functionality) such that between otherwise identical
+ciphersuites, ephemeral ECDH is preferred over ephemeral DH in
 the default order.
 [Bodo Moeller]
 
@@ -3920,7 +3920,7 @@
 functional reference processing.
 [Steve Henson]
 
-*) New functions EVP_Digest{Sign,Verify)*. These are enchance versions of
+*) New functions EVP_Digest{Sign,Verify)*. These are enhanced versions of
 EVP_{Sign,Verify}* which allow an application to customise the signature
 process.
 [Steve Henson]
@@ -4133,7 +4133,7 @@
 
 *) New option SSL_OP_NO_COMP to disable use of compression selectively
 in SSL structures. New SSL ctrl to set maximum send fragment size.
-Save memory by seeting the I/O buffer sizes dynamically instead of
+Save memory by setting the I/O buffer sizes dynamically instead of
 using the maximum available value.
 [Steve Henson]
 
@@ -4192,7 +4192,7 @@
 
 Changes between 0.9.8l and 0.9.8m [25 Feb 2010]
 
-*) Always check bn_wexpend() return values for failure. (CVE-2009-3245)
+*) Always check bn_wexpand() return values for failure. (CVE-2009-3245)
 [Martin Olsson, Neel Mehta]
 
 *) Fix X509_STORE locking: Every 'objs' access requires a lock (to
@@ -4325,7 +4325,7 @@
 is already buffered was missing. For every new message was memory
 allocated, allowing an attacker to perform an denial of service attack
 with sending out of seq handshake messages until there is no memory
-left. Additionally every future messege was buffered, even if the
+left. Additionally every future message was buffered, even if the
 sequence number made no sense and would be part of another handshake.
 So only messages with sequence numbers less than 10 in advance will be
 buffered. (CVE-2009-1378)
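The buffering limit described in the entry above (CVE-2009-1378) amounts to a small window check on the handshake sequence number. The sketch below is an illustration under stated assumptions, not the code from the OpenSSL tree: the function name, the parameter names and the `DEMO_MAX_SEQ_AHEAD` constant are hypothetical, and "less than 10 in advance" is read here as a strict bound, which may differ from the real implementation.

```c
#include <stdbool.h>

/* Hypothetical window size taken from the "less than 10 in advance" wording. */
#define DEMO_MAX_SEQ_AHEAD 10

/*
 * Decide whether a handshake message that arrived early should be buffered.
 * `expected_seq` is the next sequence number the receiver is waiting for.
 */
bool should_buffer_future_message(unsigned int msg_seq, unsigned int expected_seq)
{
    if (msg_seq <= expected_seq)
        return false; /* current or past message, not a "future" one */

    /* Buffer only if the message is inside the small look-ahead window. */
    return (msg_seq - expected_seq) < DEMO_MAX_SEQ_AHEAD;
}
```

Bounding the window caps how much memory an attacker can make the receiver allocate by sending messages with arbitrary sequence numbers.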
@@ -4509,7 +4509,7 @@
 Changes between 0.9.8g and 0.9.8h [28 May 2008]
 
 *) Fix flaw if 'Server Key exchange message' is omitted from a TLS
-handshake which could lead to a cilent crash as found using the
+handshake which could lead to a client crash as found using the
 Codenomicon TLS test suite (CVE-2008-1672)
 [Steve Henson, Mark Cox]
 
@@ -4943,7 +4943,7 @@
 
 *) Disable the padding bug check when compression is in use. The padding
 bug check assumes the first packet is of even length, this is not
-necessarily true if compresssion is enabled and can result in false
+necessarily true if compression is enabled and can result in false
 positives causing handshake failure. The actual bug test is ancient
 code so it is hoped that implementations will either have fixed it by
 now or any which still have the bug do not support compression.
@@ -5172,7 +5172,7 @@
 we can fix the problem directly in the 'ca' utility.)
 [Steve Henson]
 
-*) Reduced header interdepencies by declaring more opaque objects in
+*) Reduced header interdependencies by declaring more opaque objects in
 ossl_typ.h. As a consequence, including some headers (eg. engine.h) will
 give fewer recursive includes, which could break lazy source code - so
 this change is covered by the OPENSSL_NO_DEPRECATED symbol. As always,
@@ -5396,7 +5396,7 @@
 named like the index file with '.attr' appended to the name.
 [Richard Levitte]
 
-*) Generate muti valued AVAs using '+' notation in config files for
+*) Generate multi-valued AVAs using '+' notation in config files for
 req and dirName.
 [Steve Henson]
 
@@ -5937,7 +5937,7 @@
 draft-ietf-tls-56-bit-ciphersuites-0[01].txt, but do not really
 appear there.
 
-Also deactive the remaining ciphersuites from
+Also deactivate the remaining ciphersuites from
 draft-ietf-tls-56-bit-ciphersuites-01.txt. These are just as
 unofficial, and the ID has long expired.
 [Bodo Moeller]
@@ -6580,9 +6580,9 @@
 *) Add an "init" command to the ENGINE config module and auto initialize
 ENGINEs. Without any "init" command the ENGINE will be initialized
 after all ctrl commands have been executed on it. If init=1 the
-ENGINE is initailized at that point (ctrls before that point are run
+ENGINE is initialized at that point (ctrls before that point are run
 on the uninitialized ENGINE and after on the initialized one). If
-init=0 then the ENGINE will not be iniatialized at all.
+init=0 then the ENGINE will not be initialized at all.
 [Steve Henson]
 
 *) Fix the 'app_verify_callback' interface so that the user-defined
@@ -6839,7 +6839,7 @@
 *) Major restructuring to the underlying ENGINE code. This includes
 reduction of linker bloat, separation of pure "ENGINE" manipulation
 (initialisation, etc) from functionality dealing with implementations
-of specific crypto iterfaces. This change also introduces integrated
+of specific crypto interfaces. This change also introduces integrated
 support for symmetric ciphers and digest implementations - so ENGINEs
 can now accelerate these by providing EVP_CIPHER and EVP_MD
 implementations of their own. This is detailed in crypto/engine/README
@@ -7843,7 +7843,7 @@ des-cbc 3624.96k 5258.21k 5530.91k 5624.30k 5628.26k
 [Steve Henson]
 
 *) Enhance mkdef.pl to be more accepting about spacing in C preprocessor
-lines, recognice more "algorithms" that can be deselected, and make
+lines, recognize more "algorithms" that can be deselected, and make
 it complain about algorithm deselection that isn't recognised.
 [Richard Levitte]
 
@@ -8241,7 +8241,7 @@ des-cbc 3624.96k 5258.21k 5530.91k 5624.30k 5628.26k
 Changes between 0.9.6h and 0.9.6i [19 Feb 2003]
 
 *) In ssl3_get_record (ssl/s3_pkt.c), minimize information leaked
-via timing by performing a MAC computation even if incorrrect
+via timing by performing a MAC computation even if incorrect
 block cipher padding has been found. This is a countermeasure
 against active attacks where the attacker has to distinguish
 between bad padding and a MAC verification error. (CVE-2003-0078)
@@ -9879,7 +9879,7 @@ des-cbc 3624.96k 5258.21k 5530.91k 5624.30k 5628.26k
 ssl_cert_dup, which is used by SSL_new, now copies DH keys in addition
 to parameters -- in previous versions (since OpenSSL 0.9.3) the
 'default key' from SSL_CTX_set_tmp_dh would always be lost, meaning
-you effectivly got SSL_OP_SINGLE_DH_USE when using this macro.
+you effectively got SSL_OP_SINGLE_DH_USE when using this macro.
 [Bodo Moeller]
 
 *) New s_client option -ign_eof: EOF at stdin is ignored, and
@@ -10098,7 +10098,7 @@ des-cbc 3624.96k 5258.21k 5530.91k 5624.30k 5628.26k
 *) ./config recognizes MacOS X now.
 [Andy Polyakov]
 
-*) Bug fix for BN_div() when the first words of num and divsor are
+*) Bug fix for BN_div() when the first words of num and divisor are
 equal (it gave wrong results if (rem=(n1-q*d0)&BN_MASK2) < d0).
 [Ulf Möller]
 
@@ -11771,7 +11771,7 @@ des-cbc 3624.96k 5258.21k 5530.91k 5624.30k 5628.26k
 
 *) Bugfix: In test/testenc, don't test "openssl <cipher>" for
 ciphers that were excluded, e.g. by -DNO_IDEA. Also, test
-all available cipers including rc5, which was forgotten until now.
+all available ciphers including rc5, which was forgotten until now.
 In order to let the testing shell script know which algorithms
 are available, a new (up to now undocumented) command
 "openssl list-cipher-commands" is used.
|
@ -367,7 +367,7 @@ source as well. However, the files given through SOURCE are expected
|
||||
to be located in the source tree while files given through DEPEND are
|
||||
expected to be located in the build tree)
|
||||
|
||||
It's also possible to depend on static libraries explicitely:
|
||||
It's also possible to depend on static libraries explicitly:
|
||||
|
||||
DEPEND[foo]=libsomething.a
|
||||
DEPEND[libbar]=libsomethingelse.a
|
||||
|
@ -40,7 +40,7 @@
|
||||
my $extensionlessitem = extensionlesslib($item);
|
||||
if (grep { $extensionlessitem eq extensionlesslib($_) } @list) {
|
||||
if ($item ne $extensionlessitem) {
|
||||
# If this instance of the library is explicitely static, we
|
||||
# If this instance of the library is explicitly static, we
|
||||
# prefer that to any shared library name, since it must have
|
||||
# been done on purpose.
|
||||
$replace{$extensionlessitem} = $item;
|
||||
|
@ -774,7 +774,7 @@ while (@argvcopy)
|
||||
}
|
||||
unless ($_ eq $target || /^no-/ || /^disable-/)
|
||||
{
|
||||
# "no-..." follows later after implied disactivations
|
||||
# "no-..." follows later after implied deactivations
|
||||
# have been derived. (Don't take this too seriously,
|
||||
# we really only write OPTIONS to the Makefile out of
|
||||
# nostalgia.)
|
||||
@ -1767,7 +1767,7 @@ EOF
|
||||
|
||||
# Additionally, we set up sharednames for libraries that don't
|
||||
# have any, as themselves. Only for libraries that aren't
|
||||
# explicitely static.
|
||||
# explicitly static.
|
||||
foreach (grep !/\.a$/, keys %{$unified_info{libraries}}) {
|
||||
if (!defined $unified_info{sharednames}->{$_}) {
|
||||
$unified_info{sharednames}->{$_} = $_
|
||||
@ -1775,13 +1775,13 @@ EOF
|
||||
}
|
||||
|
||||
# Check that we haven't defined any library as both shared and
|
||||
# explicitely static. That is forbidden.
|
||||
# explicitly static. That is forbidden.
|
||||
my @doubles = ();
|
||||
foreach (grep /\.a$/, keys %{$unified_info{libraries}}) {
|
||||
(my $l = $_) =~ s/\.a$//;
|
||||
push @doubles, $l if defined $unified_info{sharednames}->{$l};
|
||||
}
|
||||
die "these libraries are both explicitely static and shared:\n ",
|
||||
die "these libraries are both explicitly static and shared:\n ",
|
||||
join(" ", @doubles), "\n"
|
||||
if @doubles;
|
||||
}
|
||||
|
8
NEWS
@@ -492,7 +492,7 @@
 affected functions.
 o Improved platform support for PowerPC.
 o New FIPS 180-2 algorithms (SHA-224, -256, -384 and -512).
-o New X509_VERIFY_PARAM structure to support parametrisation
+o New X509_VERIFY_PARAM structure to support parameterisation
 of X.509 path validation.
 o Major overhaul of RC4 performance on Intel P4, IA-64 and
 AMD64.
@@ -778,7 +778,7 @@
 o Automation of 'req' application
 o Fixes to make s_client, s_server work under Windows
 o Support for multiple fieldnames in SPKACs
-o New SPKAC command line utilty and associated library functions
+o New SPKAC command line utility and associated library functions
 o Options to allow passwords to be obtained from various sources
 o New public key PEM format and options to handle it
 o Many other fixes and enhancements to command line utilities
@@ -860,8 +860,8 @@
 o Added BIO proxy and filtering functionality
 o Extended Big Number (BN) library
 o Added RIPE MD160 message digest
-o Addeed support for RC2/64bit cipher
+o Added support for RC2/64bit cipher
 o Extended ASN.1 parser routines
-o Adjustations of the source tree for CVS
+o Adjustments of the source tree for CVS
 o Support for various new platforms
 
|
2
README
@@ -62,7 +62,7 @@
 - Download the latest version from the repository
 to see if the problem has already been addressed
 - Configure with no-asm
-- Remove compiler optimisation flags
+- Remove compiler optimization flags
 
 If you wish to report a bug then please include the following information
 and create an issue on GitHub:
|
@ -1106,7 +1106,7 @@ static int do_responder(OCSP_REQUEST **preq, BIO **pcbio, BIO *acbio)
|
||||
if (*q == ' ')
|
||||
break;
|
||||
if (strncmp(q, " HTTP/1.", 8) != 0) {
|
||||
BIO_printf(bio_err, "Invalid request -- bad HTTP vesion\n");
|
||||
BIO_printf(bio_err, "Invalid request -- bad HTTP version\n");
|
||||
return 1;
|
||||
}
|
||||
*q = '\0';
|
||||
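The version check in the hunk above relies on `strncmp` comparing exactly the eight characters of `" HTTP/1."` (leading space included). A minimal standalone sketch of that idea follows; `demo_split_request` is a hypothetical helper written for illustration, not a function from the OpenSSL `ocsp` app, and unlike the loop in the original it simply takes the first space as the end of the path.

```c
#include <string.h>

/*
 * Hypothetical helper: given the part of a request line that follows the
 * method (for example "/ocsp HTTP/1.1"), verify the HTTP/1.x marker and cut
 * the string at the space so that only the path remains.
 * Returns 1 on success, 0 if the marker is missing or is not HTTP/1.x.
 */
int demo_split_request(char *line)
{
    char *q = strchr(line, ' ');

    /* " HTTP/1." is 8 characters long, leading space included. */
    if (q == NULL || strncmp(q, " HTTP/1.", 8) != 0)
        return 0;

    *q = '\0'; /* terminate the path, as the original code does */
    return 1;
}
```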
|
@ -26,7 +26,7 @@ my @openssl_source =
|
||||
@{$unified_info{sources}->{$apps_openssl}};
|
||||
|
||||
foreach my $filename (@openssl_source) {
|
||||
open F, $filename or die "Coudn't open $_: $!\n";
|
||||
open F, $filename or die "Couldn't open $_: $!\n";
|
||||
foreach ( grep /$cmdre/, <F> ) {
|
||||
my @foo = /$cmdre/;
|
||||
$commands{$1} = 1;
|
||||
|
@ -514,7 +514,7 @@ static int TerminalDeviceAst (int astparm)
|
||||
strcat (TerminalDeviceBuff, "\n");
|
||||
|
||||
/*
|
||||
** Send the data read from the terminal device throught the socket pair
|
||||
** Send the data read from the terminal device through the socket pair
|
||||
*/
|
||||
send (TerminalSocketPair[0], TerminalDeviceBuff,
|
||||
TerminalDeviceIosb.iosb$w_bcnt + 1, 0);
|
||||
|
@ -55,8 +55,8 @@
|
||||
# better performance on most recent µ-archs...
|
||||
#
|
||||
# Third version adds AES_cbc_encrypt implementation, which resulted in
|
||||
# up to 40% performance imrovement of CBC benchmark results. 40% was
|
||||
# observed on P4 core, where "overall" imrovement coefficient, i.e. if
|
||||
# up to 40% performance improvement of CBC benchmark results. 40% was
|
||||
# observed on P4 core, where "overall" improvement coefficient, i.e. if
|
||||
# compared to PIC generated by GCC and in CBC mode, was observed to be
|
||||
# as large as 4x:-) CBC performance is virtually identical to ECB now
|
||||
# and on some platforms even better, e.g. 17.6 "small" cycles/byte on
|
||||
@ -159,7 +159,7 @@
|
||||
# combinations then attack becomes infeasible. This is why revised
|
||||
# AES_cbc_encrypt "dares" to switch to larger S-box when larger chunk
|
||||
# of data is to be processed in one stroke. The current size limit of
|
||||
# 512 bytes is chosen to provide same [diminishigly low] probability
|
||||
# 512 bytes is chosen to provide same [diminishingly low] probability
|
||||
# for cache-line to remain untouched in large chunk operation with
|
||||
# large S-box as for single block operation with compact S-box and
|
||||
# surely needs more careful consideration...
|
||||
@ -171,12 +171,12 @@
|
||||
# yield execution to process performing AES just before timer fires
|
||||
# off the scheduler, immediately regain control of CPU and analyze the
|
||||
# cache state. For this attack to be efficient attacker would have to
|
||||
# effectively slow down the operation by several *orders* of magnitute,
|
||||
# effectively slow down the operation by several *orders* of magnitude,
|
||||
# by ratio of time slice to duration of handful of AES rounds, which
|
||||
# unlikely to remain unnoticed. Not to mention that this also means
|
||||
# that he would spend correspondigly more time to collect enough
|
||||
# that he would spend correspondingly more time to collect enough
|
||||
# statistical data to mount the attack. It's probably appropriate to
|
||||
# say that if adeversary reckons that this attack is beneficial and
|
||||
# say that if adversary reckons that this attack is beneficial and
|
||||
# risks to be noticed, you probably have larger problems having him
|
||||
# mere opportunity. In other words suggested code design expects you
|
||||
# to preclude/mitigate this attack by overall system security design.
|
||||
@ -240,7 +240,7 @@ $small_footprint=1; # $small_footprint=1 code is ~5% slower [on
|
||||
# contention and in hope to "collect" 5% back
|
||||
# in real-life applications...
|
||||
|
||||
$vertical_spin=0; # shift "verticaly" defaults to 0, because of
|
||||
$vertical_spin=0; # shift "vertically" defaults to 0, because of
|
||||
# its proof-of-concept status...
|
||||
# Note that there is no decvert(), as well as last encryption round is
|
||||
# performed with "horizontal" shifts. This is because this "vertical"
|
||||
@ -1606,7 +1606,7 @@ sub decstep()
|
||||
# no instructions are reordered, as performance appears
|
||||
# optimal... or rather that all attempts to reorder didn't
|
||||
# result in better performance [which by the way is not a
|
||||
# bit lower than ecryption].
|
||||
# bit lower than encryption].
|
||||
if($i==3) { &mov ($key,$__key); }
|
||||
else { &mov ($out,$s[0]); }
|
||||
&and ($out,0xFF);
|
||||
|
@ -120,7 +120,7 @@ my ($i0,$i1,$i2,$i3)=($at,$t0,$t1,$t2);
|
||||
my ($t0,$t1,$t2,$t3,$t4,$t5,$t6,$t7,$t8,$t9,$t10,$t11) = map("\$$_",(12..23));
|
||||
my ($key0,$cnt)=($gp,$fp);
|
||||
|
||||
# instuction ordering is "stolen" from output from MIPSpro assembler
|
||||
# instruction ordering is "stolen" from output from MIPSpro assembler
|
||||
# invoked with -mips3 -O3 arguments...
|
||||
$code.=<<___;
|
||||
.align 5
|
||||
|
@ -1015,7 +1015,7 @@ ___
|
||||
foreach (split("\n",$code)) {
|
||||
s/\`([^\`]*)\`/eval $1/ge;
|
||||
|
||||
# translate made up instructons: _ror, _srm
|
||||
# translate made up instructions: _ror, _srm
|
||||
s/_ror(\s+)(%r[0-9]+),/shd$1$2,$2,/ or
|
||||
|
||||
s/_srm(\s+%r[0-9]+),([0-9]+),/
|
||||
|
@ -44,7 +44,7 @@
|
||||
# minimize/avoid Address Generation Interlock hazard and to favour
|
||||
# dual-issue z10 pipeline. This gave ~25% improvement on z10 and
|
||||
# almost 50% on z9. The gain is smaller on z10, because being dual-
|
||||
# issue z10 makes it improssible to eliminate the interlock condition:
|
||||
# issue z10 makes it impossible to eliminate the interlock condition:
|
||||
# critial path is not long enough. Yet it spends ~24 cycles per byte
|
||||
# processed with 128-bit key.
|
||||
#
|
||||
|
@ -2018,7 +2018,7 @@ AES_cbc_encrypt:
|
||||
lea ($key,%rax),%rax
|
||||
mov %rax,$keyend
|
||||
|
||||
# pick Te4 copy which can't "overlap" with stack frame or key scdedule
|
||||
# pick Te4 copy which can't "overlap" with stack frame or key schedule
|
||||
lea 2048($sbox),$sbox
|
||||
lea 768-8(%rsp),%rax
|
||||
sub $sbox,%rax
|
||||
|
@ -22,7 +22,7 @@
|
||||
# April 2016
|
||||
#
|
||||
# Add "teaser" CBC and CTR mode-specific subroutines. "Teaser" means
|
||||
# that parallelizeable nature of CBC decrypt and CTR is not utilized
|
||||
# that parallelizable nature of CBC decrypt and CTR is not utilized
|
||||
# yet. CBC encrypt on the other hand is as good as it can possibly
|
||||
# get processing one byte in 4.1 cycles with 128-bit key on SPARC64 X.
|
||||
# This is ~6x faster than pure software implementation...
|
||||
|
@ -1325,10 +1325,10 @@ se_handler:
|
||||
mov -48(%rax),%r15
|
||||
mov %rbx,144($context) # restore context->Rbx
|
||||
mov %rbp,160($context) # restore context->Rbp
|
||||
mov %r12,216($context) # restore cotnext->R12
|
||||
mov %r13,224($context) # restore cotnext->R13
|
||||
mov %r14,232($context) # restore cotnext->R14
|
||||
mov %r15,240($context) # restore cotnext->R15
|
||||
mov %r12,216($context) # restore context->R12
|
||||
mov %r13,224($context) # restore context->R13
|
||||
mov %r14,232($context) # restore context->R14
|
||||
mov %r15,240($context) # restore context->R15
|
||||
|
||||
lea -56-10*16(%rax),%rsi
|
||||
lea 512($context),%rdi # &context.Xmm6
|
||||
|
@ -239,7 +239,7 @@ sub aesni_generate1 # fully unrolled loop
|
||||
# can schedule aes[enc|dec] every cycle optimal interleave factor
|
||||
# equals to corresponding instructions latency. 8x is optimal for
|
||||
# * Bridge, but it's unfeasible to accommodate such implementation
|
||||
# in XMM registers addreassable in 32-bit mode and therefore maximum
|
||||
# in XMM registers addressable in 32-bit mode and therefore maximum
|
||||
# of 6x is used instead...
|
||||
|
||||
sub aesni_generate2
|
||||
|
@ -60,7 +60,7 @@
|
||||
# identical to CBC, because CBC-MAC is essentially CBC encrypt without
|
||||
# saving output. CCM CTR "stays invisible," because it's neatly
|
||||
# interleaved wih CBC-MAC. This provides ~30% improvement over
|
||||
# "straghtforward" CCM implementation with CTR and CBC-MAC performed
|
||||
# "straightforward" CCM implementation with CTR and CBC-MAC performed
|
||||
# disjointly. Parallelizable modes practically achieve the theoretical
|
||||
# limit.
|
||||
#
|
||||
@ -143,14 +143,14 @@
|
||||
# asymptotic, if it can be surpassed, isn't it? What happens there?
|
||||
# Rewind to CBC paragraph for the answer. Yes, out-of-order execution
|
||||
# magic is responsible for this. Processor overlaps not only the
|
||||
# additional instructions with AES ones, but even AES instuctions
|
||||
# additional instructions with AES ones, but even AES instructions
|
||||
# processing adjacent triplets of independent blocks. In the 6x case
|
||||
# additional instructions still claim disproportionally small amount
|
||||
# of additional cycles, but in 8x case number of instructions must be
|
||||
# a tad too high for out-of-order logic to cope with, and AES unit
|
||||
# remains underutilized... As you can see 8x interleave is hardly
|
||||
# justifiable, so there no need to feel bad that 32-bit aesni-x86.pl
|
||||
# utilizies 6x interleave because of limited register bank capacity.
|
||||
# utilizes 6x interleave because of limited register bank capacity.
|
||||
#
|
||||
# Higher interleave factors do have negative impact on Westmere
|
||||
# performance. While for ECB mode it's negligible ~1.5%, other
|
||||
@ -1550,7 +1550,7 @@ $code.=<<___;
|
||||
sub \$8,$len
|
||||
jnc .Lctr32_loop8 # loop if $len-=8 didn't borrow
|
||||
|
||||
add \$8,$len # restore real remainig $len
|
||||
add \$8,$len # restore real remaining $len
|
||||
jz .Lctr32_done # done if ($len==0)
|
||||
lea -0x80($key),$key
|
||||
|
||||
@ -1667,7 +1667,7 @@ $code.=<<___;
|
||||
movups $inout2,0x20($out) # $len was 3, stop store
|
||||
|
||||
.Lctr32_done:
|
||||
xorps %xmm0,%xmm0 # clear regiser bank
|
||||
xorps %xmm0,%xmm0 # clear register bank
|
||||
xor $key0,$key0
|
||||
pxor %xmm1,%xmm1
|
||||
pxor %xmm2,%xmm2
|
||||
@ -1856,7 +1856,7 @@ $code.=<<___;
|
||||
lea `16*6`($inp),$inp
|
||||
pxor $twmask,$inout5
|
||||
|
||||
pxor $twres,@tweak[0] # calclulate tweaks^round[last]
|
||||
pxor $twres,@tweak[0] # calculate tweaks^round[last]
|
||||
aesenc $rndkey1,$inout4
|
||||
pxor $twres,@tweak[1]
|
||||
movdqa @tweak[0],`16*0`(%rsp) # put aside tweaks^round[last]
|
||||
@ -2342,7 +2342,7 @@ $code.=<<___;
|
||||
lea `16*6`($inp),$inp
|
||||
pxor $twmask,$inout5
|
||||
|
||||
pxor $twres,@tweak[0] # calclulate tweaks^round[last]
|
||||
pxor $twres,@tweak[0] # calculate tweaks^round[last]
|
||||
aesdec $rndkey1,$inout4
|
||||
pxor $twres,@tweak[1]
|
||||
movdqa @tweak[0],`16*0`(%rsp) # put aside tweaks^last round key
|
||||
@ -4515,7 +4515,7 @@ __aesni_set_encrypt_key:
|
||||
|
||||
.align 16
|
||||
.L14rounds:
|
||||
movups 16($inp),%xmm2 # remaning half of *userKey
|
||||
movups 16($inp),%xmm2 # remaining half of *userKey
|
||||
mov \$13,$bits # 14 rounds for 256
|
||||
lea 16(%rax),%rax
|
||||
cmp \$`1<<28`,%r10d # AVX, but no XOP
|
||||
|
@ -44,7 +44,7 @@
|
||||
# instructions with those on critical path. Amazing!
|
||||
#
|
||||
# As with Intel AES-NI, question is if it's possible to improve
|
||||
# performance of parallelizeable modes by interleaving round
|
||||
# performance of parallelizable modes by interleaving round
|
||||
# instructions. Provided round instruction latency and throughput
|
||||
# optimal interleave factor is 2. But can we expect 2x performance
|
||||
# improvement? Well, as round instructions can be issued one per
|
||||
|
@ -929,7 +929,7 @@ if ($flavour =~ /64/) { ######## 64-bit code
|
||||
s/^(\s+)v/$1/o or # strip off v prefix
|
||||
s/\bbx\s+lr\b/ret/o;
|
||||
|
||||
# fix up remainig legacy suffixes
|
||||
# fix up remaining legacy suffixes
|
||||
s/\.[ui]?8//o;
|
||||
m/\],#8/o and s/\.16b/\.8b/go;
|
||||
s/\.[ui]?32//o and s/\.16b/\.4s/go;
|
||||
@ -988,7 +988,7 @@ if ($flavour =~ /64/) { ######## 64-bit code
|
||||
s/\bv([0-9])\.[12468]+[bsd]\b/q$1/go; # new->old registers
|
||||
s/\/\/\s?/@ /o; # new->old style commentary
|
||||
|
||||
# fix up remainig new-style suffixes
|
||||
# fix up remaining new-style suffixes
|
||||
s/\{q([0-9]+)\},\s*\[(.+)\],#8/sprintf "{d%d},[$2]!",2*$1/eo or
|
||||
s/\],#[0-9]+/]!/o;
|
||||
|
||||
|
@ -31,7 +31,7 @@
|
||||
# Apple A7(***) 22.7(**) 10.9/14.3 [8.45/10.0 ]
|
||||
# Mongoose(***) 26.3(**) 21.0/25.0(**) [13.3/16.8 ]
|
||||
#
|
||||
# (*) ECB denotes approximate result for parallelizeable modes
|
||||
# (*) ECB denotes approximate result for parallelizable modes
|
||||
# such as CBC decrypt, CTR, etc.;
|
||||
# (**) these results are worse than scalar compiler-generated
|
||||
# code, but it's constant-time and therefore preferred;
|
||||
@ -137,7 +137,7 @@ _vpaes_consts:
|
||||
.quad 0x07E4A34047A4E300, 0x1DFEB95A5DBEF91A
|
||||
.quad 0x5F36B5DC83EA6900, 0x2841C2ABF49D1E77
|
||||
|
||||
.asciz "Vector Permutaion AES for ARMv8, Mike Hamburg (Stanford University)"
|
||||
.asciz "Vector Permutation AES for ARMv8, Mike Hamburg (Stanford University)"
|
||||
.size _vpaes_consts,.-_vpaes_consts
|
||||
.align 6
|
||||
___
|
||||
|
@ -28,7 +28,7 @@
|
||||
# endif
|
||||
/*
|
||||
* Why doesn't gcc define __ARM_ARCH__? Instead it defines
|
||||
* bunch of below macros. See all_architectires[] table in
|
||||
* bunch of below macros. See all_architectures[] table in
|
||||
* gcc/config/arm/arm.c. On a side note it defines
|
||||
* __ARMEL__/__ARMEB__ for little-/big-endian.
|
||||
*/
|
||||
|
@ -8,7 +8,7 @@
|
||||
*/
|
||||
|
||||
/*
|
||||
* This table MUST be kept in ascening order of the NID each method
|
||||
* This table MUST be kept in ascending order of the NID each method
|
||||
* represents (corresponding to the pkey_id field) as OBJ_bsearch
|
||||
* is used to search it.
|
||||
*/
|
||||
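The comment in the hunk above requires the table to stay in ascending NID order because it is searched with a binary search. The sketch below illustrates why, using the standard library's `bsearch` as a stand-in for `OBJ_bsearch`; the `demo_method` struct, its field names and `demo_find` are hypothetical and only mimic the idea of a table keyed by an integer id.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a method table entry; only the id matters here. */
struct demo_method {
    int id;           /* plays the role of the NID / pkey_id field */
    const char *name;
};

/* Comparison callback for bsearch: key is an int, element is a demo_method. */
static int demo_cmp_id(const void *key, const void *elem)
{
    int k = *(const int *)key;
    int e = ((const struct demo_method *)elem)->id;

    return (k > e) - (k < e);
}

/*
 * Look up `id` in `table`, which MUST be sorted by ascending id.
 * Returns a pointer to the entry, or NULL if the id is not present.
 */
const struct demo_method *demo_find(const struct demo_method *table,
                                    size_t count, int id)
{
    return bsearch(&id, table, count, sizeof(table[0]), demo_cmp_id);
}
```

If an entry is inserted out of order, a binary search can skip over it and report it as missing even though it is in the table, so the ordering rule is a correctness requirement rather than a style preference.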
|
@ -43,7 +43,7 @@ $code.=<<___;
|
||||
SHRU $A,16, $Ahi ; smash $A to two halfwords
|
||||
|| EXTU $A,16,16,$Alo
|
||||
|
||||
XORMPY $Alo,$B_2,$Alox2 ; 16x8 bits muliplication
|
||||
XORMPY $Alo,$B_2,$Alox2 ; 16x8 bits multiplication
|
||||
|| XORMPY $Ahi,$B_2,$Ahix2
|
||||
|| EXTU $B,16,24,$B_1
|
||||
XORMPY $Alo,$B_0,$Alox0
|
||||
|
@ -20,7 +20,7 @@
|
||||
// disclaimed.
|
||||
// ====================================================================
|
||||
//
|
||||
// Version 2.x is Itanium2 re-tune. Few words about how Itanum2 is
|
||||
// Version 2.x is Itanium2 re-tune. Few words about how Itanium2 is
|
||||
// different from Itanium to this module viewpoint. Most notably, is it
|
||||
// "wider" than Itanium? Can you experience loop scalability as
|
||||
// discussed in commentary sections? Not really:-( Itanium2 has 6
|
||||
@ -141,7 +141,7 @@
|
||||
// User Mask I want to excuse the kernel from preserving upper
|
||||
// (f32-f128) FP register bank over process context switch, thus
|
||||
// minimizing bus bandwidth consumption during the switch (i.e.
|
||||
// after PKI opration completes and the program is off doing
|
||||
// after PKI operation completes and the program is off doing
|
||||
// something else like bulk symmetric encryption). Having said
|
||||
// this, I also want to point out that it might be good idea
|
||||
// to compile the whole toolkit (as well as majority of the
|
||||
@ -162,7 +162,7 @@
|
||||
//
|
||||
// bn_[add|sub]_words routines.
|
||||
//
|
||||
// Loops are spinning in 2*(n+5) ticks on Itanuim (provided that the
|
||||
// Loops are spinning in 2*(n+5) ticks on Itanium (provided that the
|
||||
// data reside in L1 cache, i.e. 2 ticks away). It's possible to
|
||||
// compress the epilogue and get down to 2*n+6, but at the cost of
|
||||
// scalability (the neat feature of this implementation is that it
|
||||
@ -500,7 +500,7 @@ bn_sqr_words:
|
||||
// possible to compress the epilogue (I'm getting tired to write this
|
||||
// comment over and over) and get down to 2*n+16 at the cost of
|
||||
// scalability. The decision will very likely be reconsidered after the
|
||||
// benchmark program is profiled. I.e. if perfomance gain on Itanium
|
||||
// benchmark program is profiled. I.e. if performance gain on Itanium
|
||||
// will appear larger than loss on "wider" IA-64, then the loop should
|
||||
// be explicitly split and the epilogue compressed.
|
||||
.L_bn_sqr_words_ctop:
|
||||
@ -936,7 +936,7 @@ bn_mul_comba8:
|
||||
xma.hu f118=f39,f127,f117 }
|
||||
{ .mfi; xma.lu f117=f39,f127,f117 };;//
|
||||
//-------------------------------------------------//
|
||||
// Leaving muliplier's heaven... Quite a ride, huh?
|
||||
// Leaving multiplier's heaven... Quite a ride, huh?
|
||||
|
||||
{ .mii; getf.sig r31=f47
|
||||
add r25=r25,r24
|
||||
|
@ -21,7 +21,7 @@
# optimal in respect to instruction set capabilities. Fair comparison
# with vendor compiler is problematic, because OpenSSL doesn't define
# BN_LLONG [presumably] for historical reasons, which drives compiler
# toward 4 times 16x16=32-bit multiplicatons [plus complementary
# toward 4 times 16x16=32-bit multiplications [plus complementary
# shifts and additions] instead. This means that you should observe
# several times improvement over code generated by vendor compiler
# for PA-RISC 1.1, but the "baseline" is far from optimal. The actual

@ -37,7 +37,7 @@
# and squaring procedure operating on lengths divisible by 8. Length
# is expressed in number of limbs. RSA private key operations are
# ~35-50% faster (more for longer keys) on contemporary high-end POWER
# processors in 64-bit builds, [mysterously enough] more in 32-bit
# processors in 64-bit builds, [mysteriously enough] more in 32-bit
# builds. On low-end 32-bit processors performance improvement turned
# to be marginal...


@ -35,7 +35,7 @@
# key lengths. As it's obviously inappropriate as "best all-round"
# alternative, it has to be complemented with run-time CPU family
# detection. Oh! It should also be noted that unlike other PowerPC
# implementation IALU ppc-mont.pl module performs *suboptimaly* on
# implementation IALU ppc-mont.pl module performs *suboptimally* on
# >=1024-bit key lengths on Power 6. It should also be noted that
# *everything* said so far applies to 64-bit builds! As far as 32-bit
# application executed on 64-bit CPU goes, this module is likely to
@ -1353,7 +1353,7 @@ $code.=<<___;
std $t3,-16($tp) ; tp[j-1]
std $t5,-8($tp) ; tp[j]

add $carry,$carry,$ovf ; comsume upmost overflow
add $carry,$carry,$ovf ; consume upmost overflow
add $t6,$t6,$carry ; can not overflow
srdi $carry,$t6,16
add $t7,$t7,$carry

@ -20,7 +20,7 @@
# in bn_gf2m.c. It's kind of low-hanging mechanical port from C for
# the time being... gcc 4.3 appeared to generate poor code, therefore
# the effort. And indeed, the module delivers 55%-90%(*) improvement
# on haviest ECDSA verify and ECDH benchmarks for 163- and 571-bit
# on heaviest ECDSA verify and ECDH benchmarks for 163- and 571-bit
# key lengths on z990, 30%-55%(*) - on z10, and 70%-110%(*) - on z196.
# This is for 64-bit build. In 32-bit "highgprs" case improvement is
# even higher, for example on z990 it was measured 80%-150%. ECDSA

@ -13,7 +13,7 @@
*/

/*
* This is my modest contributon to OpenSSL project (see
* This is my modest contribution to OpenSSL project (see
* http://www.openssl.org/ for more information about it) and is
* a drop-in SuperSPARC ISA replacement for crypto/bn/bn_asm.c
* module. For updates see http://fy.chalmers.se/~appro/hpe/.
@ -159,12 +159,12 @@ bn_mul_add_words:
*/
bn_mul_words:
cmp %o2,0
bg,a .L_bn_mul_words_proceeed
bg,a .L_bn_mul_words_proceed
ld [%o1],%g2
retl
clr %o0

.L_bn_mul_words_proceeed:
.L_bn_mul_words_proceed:
andcc %o2,-4,%g0
bz .L_bn_mul_words_tail
clr %o5
@ -251,12 +251,12 @@ bn_mul_words:
*/
bn_sqr_words:
cmp %o2,0
bg,a .L_bn_sqr_words_proceeed
bg,a .L_bn_sqr_words_proceed
ld [%o1],%g2
retl
clr %o0

.L_bn_sqr_words_proceeed:
.L_bn_sqr_words_proceed:
andcc %o2,-4,%g0
bz .L_bn_sqr_words_tail
clr %o5

@ -13,7 +13,7 @@
*/

/*
* This is my modest contributon to OpenSSL project (see
* This is my modest contribution to OpenSSL project (see
* http://www.openssl.org/ for more information about it) and is
* a drop-in UltraSPARC ISA replacement for crypto/bn/bn_asm.c
* module. For updates see http://fy.chalmers.se/~appro/hpe/.
@ -278,7 +278,7 @@ bn_mul_add_words:
*/
bn_mul_words:
sra %o2,%g0,%o2 ! signx %o2
brgz,a %o2,.L_bn_mul_words_proceeed
brgz,a %o2,.L_bn_mul_words_proceed
lduw [%o1],%g2
retl
clr %o0
@ -286,7 +286,7 @@ bn_mul_words:
nop
nop

.L_bn_mul_words_proceeed:
.L_bn_mul_words_proceed:
srl %o3,%g0,%o3 ! clruw %o3
andcc %o2,-4,%g0
bz,pn %icc,.L_bn_mul_words_tail
@ -366,7 +366,7 @@ bn_mul_words:
*/
bn_sqr_words:
sra %o2,%g0,%o2 ! signx %o2
brgz,a %o2,.L_bn_sqr_words_proceeed
brgz,a %o2,.L_bn_sqr_words_proceed
lduw [%o1],%g2
retl
clr %o0
@ -374,7 +374,7 @@ bn_sqr_words:
nop
nop

.L_bn_sqr_words_proceeed:
.L_bn_sqr_words_proceed:
andcc %o2,-4,%g0
nop
bz,pn %icc,.L_bn_sqr_words_tail

@ -611,7 +611,7 @@ $code.=<<___;
add $tp,8,$tp
.type $fname,#function
.size $fname,(.-$fname)
.asciz "Montgomery Multipltication for SPARCv9, CRYPTOGAMS by <appro\@openssl.org>"
.asciz "Montgomery Multiplication for SPARCv9, CRYPTOGAMS by <appro\@openssl.org>"
.align 32
___
$code =~ s/\`([^\`]*)\`/eval($1)/gem;

@ -865,7 +865,7 @@ $fname:
restore
.type $fname,#function
.size $fname,(.-$fname)
.asciz "Montgomery Multipltication for UltraSPARC, CRYPTOGAMS by <appro\@openssl.org>"
.asciz "Montgomery Multiplication for UltraSPARC, CRYPTOGAMS by <appro\@openssl.org>"
.align 32
___


@ -16,7 +16,7 @@

# October 2012.
#
# SPARCv9 VIS3 Montgomery multiplicaion procedure suitable for T3 and
# SPARCv9 VIS3 Montgomery multiplication procedure suitable for T3 and
# onward. There are three new instructions used here: umulxhi,
# addxc[cc] and initializing store. On T3 RSA private key operations
# are 1.54/1.87/2.11/2.26 times faster for 512/1024/2048/4096-bit key

@ -152,7 +152,7 @@ $R="mm0";
&xor ($a4,$a2); # a2=a4^a2^a4
&mov (&DWP(5*4,"esp"),$a1); # a1^a4
&xor ($a4,$a1); # a1^a2^a4
&sar (@i[1],31); # broardcast 30th bit
&sar (@i[1],31); # broadcast 30th bit
&and ($lo,$b);
&mov (&DWP(6*4,"esp"),$a2); # a2^a4
&and (@i[1],$b);

@ -78,7 +78,7 @@ $frame=32; # size of above frame rounded up to 16n
&lea ("ebp",&DWP(-$frame,"esp","edi",4)); # future alloca($frame+4*(num+2))
&neg ("edi");

# minimize cache contention by arraning 2K window between stack
# minimize cache contention by arranging 2K window between stack
# pointer and ap argument [np is also position sensitive vector,
# but it's assumed to be near ap, as it's allocated at ~same
# time].

@ -68,7 +68,7 @@ _mul_1x1:
sar \$63,$i0 # broadcast 62nd bit
lea (,$a1,4),$a4
and $b,$a
sar \$63,$i1 # boardcast 61st bit
sar \$63,$i1 # broadcast 61st bit
mov $a,$hi # $a is $lo
shl \$63,$lo
and $b,$i0

@ -319,7 +319,7 @@ $code.=<<___;
mov %rax,($rp,$i,8) # rp[i]=tp[i]-np[i]
mov 8($ap,$i,8),%rax # tp[i+1]
lea 1($i),$i # i++
dec $j # doesnn't affect CF!
dec $j # doesn't affect CF!
jnz .Lsub

sbb \$0,%rax # handle upmost overflow bit
@ -750,7 +750,7 @@ $code.=<<___;
mov 56($ap,$i,8),@ri[3]
sbb 40($np,$i,8),@ri[1]
lea 4($i),$i # i++
dec $j # doesnn't affect CF!
dec $j # doesn't affect CF!
jnz .Lsub4x

mov @ri[0],0($rp,$i,8) # rp[i]=tp[i]-np[i]

@ -419,7 +419,7 @@ $code.=<<___;
mov %rax,($rp,$i,8) # rp[i]=tp[i]-np[i]
mov 8($ap,$i,8),%rax # tp[i+1]
lea 1($i),$i # i++
dec $j # doesnn't affect CF!
dec $j # doesn't affect CF!
jnz .Lsub

sbb \$0,%rax # handle upmost overflow bit
@ -2421,7 +2421,7 @@ my $N=$STRIDE/4; # should match cache line size
$code.=<<___;
movdqa 0(%rax),%xmm0 # 00000001000000010000000000000000
movdqa 16(%rax),%xmm1 # 00000002000000020000000200000002
lea 88-112(%rsp,%r10),%r10 # place the mask after tp[num+1] (+ICache optimizaton)
lea 88-112(%rsp,%r10),%r10 # place the mask after tp[num+1] (+ICache optimization)
lea 128($bp),$bptr # size optimization

pshufd \$0,%xmm5,%xmm5 # broadcast index

@ -231,7 +231,7 @@ bus_loop1?:
_OPENSSL_instrument_bus2:
.asmfunc
MV A6,B0 ; reassign max
|| MV B4,A6 ; reassing sizeof(output)
|| MV B4,A6 ; reassign sizeof(output)
|| MVK 0x00004030,A3
MV A4,B4 ; reassign output
|| MVK 0,A4 ; return value

@ -17,7 +17,7 @@
# Camellia for SPARC T4.
#
# As with AES below results [for aligned data] are virtually identical
# to critical path lenths for 3-cycle instruction latency:
# to critical path lengths for 3-cycle instruction latency:
#
# 128-bit key 192/256-
# CBC encrypt 4.14/4.21(*) 5.46/5.52
@ -25,7 +25,7 @@
# misaligned data.
#
# As with Intel AES-NI, question is if it's possible to improve
# performance of parallelizeable modes by interleaving round
# performance of parallelizable modes by interleaving round
# instructions. In Camellia every instruction is dependent on
# previous, which means that there is place for 2 additional ones
# in between two dependent. Can we expect 3x performance improvement?

@ -22,7 +22,7 @@
# faster than code generated by TI compiler. Compiler also disables
# interrupts for some reason, thus making interrupt response time
# dependent on input length. This module on the other hand is free
# from such limiation.
# from such limitation.

$output=pop;
open STDOUT,">$output";

@ -15,7 +15,7 @@ require "x86asm.pl";
require "cbc.pl";
require "desboth.pl";

# base code is in microsft
# base code is in Microsoft
# op dest, source
# format.
#

@ -528,7 +528,7 @@ $4:
! parameter 3 1 for optional store to [in0]
! parameter 4 1 for load input/output address to local5/7
!
! The final permutation logic switches the halfes, meaning that
! The final permutation logic switches the halves, meaning that
! left and right ends up the the registers originally used.

define(fp_macro, {
@ -731,7 +731,7 @@ define(fp_ip_macro, {
sll $4, 3, local2
xor local4, temp2, $2

! reload since used as temporar:
! reload since used as temporary:

ld [out2+280], out4 ! loop counter

@ -753,7 +753,7 @@ define(fp_ip_macro, {
! parameter 1 address
! parameter 2 destination left
! parameter 3 destination right
! parameter 4 temporar
! parameter 4 temporary
! parameter 5 label

define(load_little_endian, {
@ -802,7 +802,7 @@ $5a:
! parameter 1 address
! parameter 2 destination left
! parameter 3 destination right
! parameter 4 temporar
! parameter 4 temporary
! parameter 4 label
!
! adds 8 to address
@ -927,7 +927,7 @@ $7.jmp.table:
! parameter 1 address
! parameter 2 source left
! parameter 3 source right
! parameter 4 temporar
! parameter 4 temporary

define(store_little_endian, {

@ -1517,7 +1517,7 @@ DES_ncbc_encrypt:
! parameter 7 1 for mov in1 to in3
! parameter 8 1 for mov in3 to in4

ip_macro(in5, out5, out5, in5, in4, 2, 0, 1) ! include decryprion ks in4
ip_macro(in5, out5, out5, in5, in4, 2, 0, 1) ! include decryption ks in4

fp_macro(out5, in5, 0, 1) ! 1 for input and output address to local5/7

@ -1563,7 +1563,7 @@ DES_ncbc_encrypt:
.size DES_ncbc_encrypt, .DES_ncbc_encrypt.end-DES_ncbc_encrypt


! void DES_ede3_cbc_encrypt(input, output, lenght, ks1, ks2, ks3, ivec, enc)
! void DES_ede3_cbc_encrypt(input, output, length, ks1, ks2, ks3, ivec, enc)
! **************************************************************************


@ -1811,7 +1811,7 @@ DES_ede3_cbc_encrypt:
.byte 240, 240, 240, 240, 244, 244, 244, 244
.byte 248, 248, 248, 248, 252, 252, 252, 252

! 5 numbers for initil/final permutation
! 5 numbers for initial/final permutation

.word 0x0f0f0f0f ! offset 256
.word 0x0000ffff ! 260

@ -37,7 +37,7 @@ void DES_cfb_encrypt(const unsigned char *in, unsigned char *out, int numbits,
unsigned int sh[4];
unsigned char *ovec = (unsigned char *)sh;

/* I kind of count that compiler optimizes away this assertioni, */
/* I kind of count that compiler optimizes away this assertion, */
assert(sizeof(sh[0]) == 4); /* as this holds true for all, */
/* but 16-bit platforms... */


@ -233,7 +233,7 @@ __ecp_nistz256_add:
@ if a+b >= modulus, subtract modulus.
@
@ But since comparison implies subtraction, we subtract
@ modulus and then add it back if subraction borrowed.
@ modulus and then add it back if subtraction borrowed.

subs $a0,$a0,#-1
sbcs $a1,$a1,#-1
@ -1222,7 +1222,7 @@ __ecp_nistz256_add_self:
@ if a+b >= modulus, subtract modulus.
@
@ But since comparison implies subtraction, we subtract
@ modulus and then add it back if subraction borrowed.
@ modulus and then add it back if subtraction borrowed.

subs $a0,$a0,#-1
sbcs $a1,$a1,#-1

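Reviewer note: the comment touched in the two hunks above describes a branch-free modular addition: subtract the modulus unconditionally, then add it back only if the subtraction borrowed. The following is a minimal single-word sketch of that idea in C, not the multi-limb assembly from the commit. The function name `mod_add32` and the 32-bit operand width are assumptions made for illustration, and inputs are assumed to be already reduced (a < m, b < m).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of OpenSSL): returns (a + b) mod m,
 * assuming a < m and b < m.
 */
static uint32_t mod_add32(uint32_t a, uint32_t b, uint32_t m)
{
    uint64_t sum  = (uint64_t)a + b;     /* up to 33 bits, cannot wrap in 64 */
    uint64_t diff = sum - m;             /* wraps around (top bit set) if sum < m */
    uint64_t mask = 0 - (diff >> 63);    /* all ones iff the subtraction borrowed */

    /* add the modulus back only when the subtraction went below zero */
    return (uint32_t)(diff + (m & mask));
}
```

The mask replaces a compare-and-branch, which is why the comment says that comparison "implies subtraction": the borrow from the subtraction already carries the result of the comparison.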
@ -137,7 +137,7 @@ ___

{
# This function receives a pointer to an array of four affine points
# (X, Y, <1>) and rearanges the data for AVX2 execution, while
# (X, Y, <1>) and rearranges the data for AVX2 execution, while
# converting it to 2^29 radix redundant form

my ($X0,$X1,$X2,$X3, $Y0,$Y1,$Y2,$Y3,
@ -289,7 +289,7 @@ ___
{
################################################################################
# This function receives a pointer to an array of four AVX2 formatted points
# (X, Y, Z) convert the data to normal representation, and rearanges the data
# (X, Y, Z) convert the data to normal representation, and rearranges the data

my ($D0,$D1,$D2,$D3, $D4,$D5,$D6,$D7, $D8)=map("%ymm$_",(0..8));
my ($T0,$T1,$T2,$T3, $T4,$T5,$T6)=map("%ymm$_",(9..15));

@ -696,7 +696,7 @@ __ecp_nistz256_add:
# if a+b >= modulus, subtract modulus
#
# But since comparison implies subtraction, we subtract
# modulus and then add it back if subraction borrowed.
# modulus and then add it back if subtraction borrowed.

subic $acc0,$acc0,-1
subfe $acc1,$poly1,$acc1

@ -413,7 +413,7 @@ __ecp_nistz256_add:
! if a+b >= modulus, subtract modulus.
!
! But since comparison implies subtraction, we subtract
! modulus and then add it back if subraction borrowed.
! modulus and then add it back if subtraction borrowed.

subcc @acc[0],-1,@acc[0]
subccc @acc[1],-1,@acc[1]
@ -1592,7 +1592,7 @@ ___
########################################################################
# Following subroutines are VIS3 counterparts of those above that
# implement ones found in ecp_nistz256.c. Key difference is that they
# use 128-bit muliplication and addition with 64-bit carry, and in order
# use 128-bit multiplication and addition with 64-bit carry, and in order
# to do that they perform conversion from uin32_t[8] to uint64_t[4] upon
# entry and vice versa on return.
#
@ -1977,7 +1977,7 @@ $code.=<<___;
srlx $acc0,32,$t1
addxccc $acc3,$t2,$acc2 ! +=acc[0]*0xFFFFFFFF00000001
sub $acc0,$t0,$t2 ! acc0*0xFFFFFFFF00000001, low part
addxc %g0,$t3,$acc3 ! cant't overflow
addxc %g0,$t3,$acc3 ! can't overflow
___
}
$code.=<<___;

@ -157,7 +157,7 @@ static int ecdsa_sign_setup(EC_KEY *eckey, BN_CTX *ctx_in,
if (EC_GROUP_get_mont_data(group) != NULL) {
/*
* We want inverse in constant time, therefore we utilize the fact
* order must be prime and use Fermats Little Theorem instead.
* order must be prime and use Fermat's Little Theorem instead.
*/
if (!BN_set_word(X, 2)) {
ECerr(EC_F_ECDSA_SIGN_SETUP, ERR_R_BN_LIB);

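Reviewer note: the comment corrected above relies on Fermat's Little Theorem, which for a prime p and a not divisible by p gives a^(p-1) ≡ 1 (mod p), so a^(p-2) is the inverse of a. A small sketch of that identity in C follows. The names `mul_mod` and `inv_mod_prime` are hypothetical, the operands are limited to 32 bits so that plain integer arithmetic suffices, and unlike the code the comment belongs to, this sketch makes no constant-time claim.

```c
#include <stdint.h>

/* Hypothetical helper: (a * b) mod p; the product of two 32-bit values fits in 64 bits. */
static uint32_t mul_mod(uint32_t a, uint32_t b, uint32_t p)
{
    return (uint32_t)(((uint64_t)a * b) % p);
}

/*
 * Hypothetical helper: inverse of a modulo a prime p, computed as a^(p-2) mod p
 * by square-and-multiply. Assumes p is prime and a is not a multiple of p.
 */
static uint32_t inv_mod_prime(uint32_t a, uint32_t p)
{
    uint32_t result = 1;
    uint32_t base = a % p;
    uint32_t e = p - 2;                 /* exponent from Fermat's Little Theorem */

    while (e != 0) {
        if (e & 1)
            result = mul_mod(result, base, p);
        base = mul_mod(base, base, p);  /* square for the next exponent bit */
        e >>= 1;
    }
    return result;
}
```

Because the exponent depends only on the (public) modulus, the sequence of squarings and multiplications does not depend on the value being inverted, which is the property that makes this route attractive when the inverse must not leak timing information.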
@ -79,7 +79,7 @@ typedef limb felem[4];
typedef widelimb widefelem[7];

/*
* Field element represented as a byte arrary. 28*8 = 224 bits is also the
* Field element represented as a byte array. 28*8 = 224 bits is also the
* group order size for the elliptic curve, and we also use this type for
* scalars for point multiplication.
*/

@ -125,7 +125,7 @@ int ec_GFp_simple_set_compressed_coordinates(const EC_GROUP *group,
EC_R_INVALID_COMPRESSION_BIT);
else
/*
* BN_mod_sqrt() should have cought this error (not a square)
* BN_mod_sqrt() should have caught this error (not a square)
*/
ECerr(EC_F_EC_GFP_SIMPLE_SET_COMPRESSED_COORDINATES,
EC_R_INVALID_COMPRESSED_POINT);

@ -161,7 +161,7 @@ actually qualitatively different depending on 'nid' (the "des_cbc" EVP_CIPHER is
not an interoperable implementation of "aes_256_cbc"), RSA_METHODs are
necessarily interoperable and don't have different flavours, only different
implementations. In other words, the ENGINE_TABLE for RSA will either be empty,
or will have a single ENGING_PILE hashed to by the 'nid' 1 and that pile
or will have a single ENGINE_PILE hashed to by the 'nid' 1 and that pile
represents ENGINEs that implement the single "type" of RSA there is.

Cleanup - the registration and unregistration may pose questions about how
@ -188,7 +188,7 @@ state will be unchanged. Thus, no cleanup is required unless registration takes
place. ENGINE_cleanup() will simply iterate across a list of registered cleanup
callbacks calling each in turn, and will then internally delete its own storage
(a STACK). When a cleanup callback is next registered (eg. if the cleanup() is
part of a gracefull restart and the application wants to cleanup all state then
part of a graceful restart and the application wants to cleanup all state then
start again), the internal STACK storage will be freshly allocated. This is much
the same as the situation in the ENGINE_TABLE instantiations ... NULL is the
initialised state, so only modification operations (not queries) will cause that
@ -204,7 +204,7 @@ exists) - the idea of providing an ENGINE_cpy() function probably wasn't a good
one and now certainly doesn't make sense in any generalised way. Some of the
RSA, DSA, DH, and RAND functions that were fiddled during the original ENGINE
changes have now, as a consequence, been reverted back. This is because the
hooking of ENGINE is now automatic (and passive, it can interally use a NULL
hooking of ENGINE is now automatic (and passive, it can internally use a NULL
ENGINE pointer to simply ignore ENGINE from then on).

Hell, that should be enough for now ... comments welcome.

@ -8,7 +8,7 @@
* https://www.openssl.org/source/license.html
*/

/* Copyright (c) 2017 National Security Resarch Institute. All rights reserved. */
/* Copyright (c) 2017 National Security Research Institute. All rights reserved. */

#ifndef HEADER_ARIA_H
# define HEADER_ARIA_H
@ -22,7 +22,7 @@
# define ARIA_ENCRYPT 1
# define ARIA_DECRYPT 0

# define ARIA_BLOCK_SIZE 16 /* Size of each encryption/decription block */
# define ARIA_BLOCK_SIZE 16 /* Size of each encryption/decryption block */
# define ARIA_MAX_KEYS 17 /* Number of keys needed in the worst case */

# ifdef __cplusplus

@ -396,7 +396,7 @@ void openssl_add_all_digests_int(void);
void evp_cleanup_int(void);
void evp_app_cleanup_int(void);

/* Pulling defines out of C soure files */
/* Pulling defines out of C source files */

#define EVP_RC4_KEY_SIZE 16
#ifndef TLS1_1_VERSION

@ -156,7 +156,7 @@ $code.=<<___;
___

######################################################################
# "528B" (well, "512B" actualy) streamed GHASH
# "528B" (well, "512B" actually) streamed GHASH
#
$Xip="in0";
$Htbl="in1";

@ -705,7 +705,7 @@ my $depd = sub {
my ($mod,$args) = @_;
my $orig = "depd$mod\t$args";

# I only have ",z" completer, it's impicitly encoded...
# I only have ",z" completer, it's implicitly encoded...
if ($args =~ /%r([0-9]+),([0-9]+),([0-9]+),%r([0-9]+)/) # format 16
{ my $opcode=(0x3c<<26)|($4<<21)|($1<<16);
my $cpos=63-$2;

@ -103,7 +103,7 @@
#
# Does it make sense to increase Naggr? To start with it's virtually
# impossible in 32-bit mode, because of limited register bank
# capacity. Otherwise improvement has to be weighed agiainst slower
# capacity. Otherwise improvement has to be weighed against slower
# setup, as well as code size and complexity increase. As even
# optimistic estimate doesn't promise 30% performance improvement,
# there are currently no plans to increase Naggr.

@ -23,7 +23,7 @@
# Relative comparison is therefore more informative. This initial
# version is ~2.1x slower than hardware-assisted AES-128-CTR, ~12x
# faster than "4-bit" integer-only compiler-generated 64-bit code.
# "Initial version" means that there is room for futher improvement.
# "Initial version" means that there is room for further improvement.

# May 2016
#

@ -206,13 +206,13 @@ $code.=<<___;
@ loaded value would have
@ to be rotated in order to
@ make it appear as in
@ alorithm specification
@ algorithm specification
subs $len,$len,#32 @ see if $len is 32 or larger
mov $inc,#16 @ $inc is used as post-
@ increment for input pointer;
@ as loop is modulo-scheduled
@ $inc is zeroed just in time
@ to preclude oversteping
@ to preclude overstepping
@ inp[len], which means that
@ last block[s] are actually
@ loaded twice, but last
@ -370,7 +370,7 @@ if ($flavour =~ /64/) { ######## 64-bit code
s/\bq([0-9]+)\b/"v".($1<8?$1:$1+8).".16b"/geo; # old->new registers
s/@\s/\/\//o; # old->new style commentary

# fix up remainig legacy suffixes
# fix up remaining legacy suffixes
s/\.[ui]?8(\s)/$1/o;
s/\.[uis]?32//o and s/\.16b/\.4s/go;
m/\.p64/o and s/\.16b/\.1q/o; # 1st pmull argument
@ -410,7 +410,7 @@ if ($flavour =~ /64/) { ######## 64-bit code
s/\bv([0-9])\.[12468]+[bsd]\b/q$1/go; # new->old registers
s/\/\/\s?/@ /o; # new->old style commentary

# fix up remainig new-style suffixes
# fix up remaining new-style suffixes
s/\],#[0-9]+/]!/o;

s/cclr\s+([^,]+),\s*([a-z]+)/mov$2 $1,#0/o or

@ -16,7 +16,7 @@ The basic syntax for adding an object is as follows:
create the C macros SN_base, LN_base, NID_base and OBJ_base.

Note that if the base name contains spaces, dashes or periods,
those will be converte to underscore.
those will be converted to underscore.

Then there are some extra commands:


@ -855,7 +855,7 @@ internet 6 : snmpv2 : SNMPv2
# Documents refer to "internet 7" as "mail". This however leads to ambiguities
# with RFC2798, Section 9.1.3, where "mail" is defined as the short name for
# rfc822Mailbox. The short name is therefore here left out for a reason.
# Subclasses of "mail", e.g. "MIME MHS" don't consitute a problem, as
# Subclasses of "mail", e.g. "MIME MHS" don't constitute a problem, as
# references are realized via long name "Mail" (with capital M).
internet 7 : : Mail

@ -1502,7 +1502,7 @@ ISO-US 10046 2 1 : dhpublicnumber : X9.42 DH

# RFC 5639 curve OIDs (see http://www.ietf.org/rfc/rfc5639.txt)
# versionOne OBJECT IDENTIFIER ::= {
# iso(1) identifified-organization(3) teletrust(36) algorithm(3)
# iso(1) identified-organization(3) teletrust(36) algorithm(3)
# signature-algorithm(3) ecSign(2) ecStdCurvesAndGeneration(8)
# ellipticCurve(1) 1 }
1 3 36 3 3 2 8 1 1 1 : brainpoolP160r1

@ -531,7 +531,7 @@ my %globals;
);

# Following constants are defined in x86_64 ABI supplement, for
# example avaiable at https://www.uclibc.org/docs/psABI-x86_64.pdf,
# example available at https://www.uclibc.org/docs/psABI-x86_64.pdf,
# see section 3.7 "Stack Unwind Algorithm".
my %DW_reg_idx = (
"%rax"=>0, "%rdx"=>1, "%rcx"=>2, "%rbx"=>3,
@ -544,7 +544,7 @@ my %globals;

# [us]leb128 format is variable-length integer representation base
# 2^128, with most significant bit of each byte being 0 denoting
# *last* most significat digit. See "Variable Length Data" in the
# *last* most significant digit. See "Variable Length Data" in the
# DWARF specification, numbered 7.6 at least in versions 3 and 4.
sub sleb128 {
use integer; # get right shift extend sign
@ -1427,6 +1427,6 @@ close STDOUT;
#
# (*) Note that we're talking about run-time, not debug-time. Lack of
# unwind information makes debugging hard on both Windows and
# Unix. "Unlike" referes to the fact that on Unix signal handler
# Unix. "Unlike" refers to the fact that on Unix signal handler
# will always be invoked, core dumped and appropriate exit code
# returned to parent (for user notification).

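Reviewer note: one of the hunks in this file touches the comment describing the [us]leb128 format. In LEB128 as defined by the DWARF specification, each byte carries seven payload bits, least significant group first, and a clear top bit marks the last byte. The following is a minimal sketch of the unsigned variant in C. The function name `uleb128_encode` is hypothetical and the code is not a translation of the Perl `sleb128` routine shown in the diff.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode value as unsigned LEB128 into out and
 * return the number of bytes written. A 64-bit value needs at most
 * 10 bytes, so out must have room for that many.
 */
static size_t uleb128_encode(uint64_t value, unsigned char *out)
{
    size_t n = 0;

    do {
        unsigned char byte = (unsigned char)(value & 0x7f); /* low seven bits */

        value >>= 7;
        if (value != 0)
            byte |= 0x80;       /* top bit set: more bytes follow */
        out[n++] = byte;
    } while (value != 0);

    return n;
}
```

The signed variant differs only in the termination test, which has to account for sign extension; that is why the Perl routine in the diff enables `use integer` to get an arithmetic right shift.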
@ -730,7 +730,7 @@ my $extra = shift;

&movdqa ($T0,$T1); # -> base 2^26 ...
&pand ($T1,$MASK);
&paddd ($D0,$T1); # ... and accumuate
&paddd ($D0,$T1); # ... and accumulate

&movdqa ($T1,$T0);
&psrlq ($T0,26);

@ -298,7 +298,7 @@ poly1305_emit:
mov %r9,%rcx
adc \$0,%r9
adc \$0,%r10
shr \$2,%r10 # did 130-bit value overfow?
shr \$2,%r10 # did 130-bit value overflow?
cmovnz %r8,%rax
cmovnz %r9,%rcx

@ -1403,7 +1403,7 @@ poly1305_emit_avx:
mov %r9,%rcx
adc \$0,%r9
adc \$0,%r10
shr \$2,%r10 # did 130-bit value overfow?
shr \$2,%r10 # did 130-bit value overflow?
cmovnz %r8,%rax
cmovnz %r9,%rcx

@ -3734,7 +3734,7 @@ poly1305_emit_base2_44:
mov %r9,%rcx
adc \$0,%r9
adc \$0,%r10
shr \$2,%r10 # did 130-bit value overfow?
shr \$2,%r10 # did 130-bit value overflow?
cmovnz %r8,%rax
cmovnz %r9,%rcx


@ -134,7 +134,7 @@ if ($alt=0) {
push (@XX,shift(@XX)) if ($i>=0);
}
} else {
# Using pinsrw here improves performane on Intel CPUs by 2-3%, but
# Using pinsrw here improves performance on Intel CPUs by 2-3%, but
# brings down AMD by 7%...
$RC4_loop_mmx = sub {
my $i=shift;

@ -88,7 +88,7 @@
# The only code path that was not modified is P4-specific one. Non-P4
# Intel code path optimization is heavily based on submission by Maxim
# Perminov, Maxim Locktyukhin and Jim Guilford of Intel. I've used
# some of the ideas even in attempt to optmize the original RC4_INT
# some of the ideas even in attempt to optimize the original RC4_INT
# code path... Current performance in cycles per processed byte (less
# is better) and improvement coefficients relative to previous
# version of this module are:

@ -790,7 +790,7 @@ static int pkey_pss_init(EVP_PKEY_CTX *ctx)
if (!rsa_pss_get_param(rsa->pss, &md, &mgf1md, &min_saltlen))
return 0;

/* See if minumum salt length exceeds maximum possible */
/* See if minimum salt length exceeds maximum possible */
max_saltlen = RSA_size(rsa) - EVP_MD_size(md);
if ((RSA_bits(rsa) & 0x7) == 1)
max_saltlen--;

@ -35,7 +35,7 @@
# P4 +85%(!) +45%
#
# As you can see Pentium came out as looser:-( Yet I reckoned that
# improvement on P4 outweights the loss and incorporate this
# improvement on P4 outweighs the loss and incorporate this
# re-tuned code to 0.9.7 and later.
# ----------------------------------------------------------------

@ -549,7 +549,7 @@ for($i=0;$i<20-4;$i+=2) {
# being implemented in SSSE3). Once 8 quadruples or 32 elements are
# collected, it switches to routine proposed by Max Locktyukhin.
#
# Calculations inevitably require temporary reqisters, and there are
# Calculations inevitably require temporary registers, and there are
# no %xmm registers left to spare. For this reason part of the ring
# buffer, X[2..4] to be specific, is offloaded to 3 quadriples ring
# buffer on the stack. Keep in mind that X[2] is alias X[-6], X[3] -

@ -444,7 +444,7 @@ for(;$i<80;$i++) { &BODY_20_39($i,@V); unshift(@V,pop(@V)); }
$code.=<<___;
movdqa (%rbx),@Xi[0] # pull counters
mov \$1,%ecx
cmp 4*0(%rbx),%ecx # examinte counters
cmp 4*0(%rbx),%ecx # examine counters
pxor $t2,$t2
cmovge $Tbl,@ptr[0] # cancel input
cmp 4*1(%rbx),%ecx
@ -1508,10 +1508,10 @@ avx2_handler:
mov -48(%rax),%r15
mov %rbx,144($context) # restore context->Rbx
mov %rbp,160($context) # restore context->Rbp
mov %r12,216($context) # restore cotnext->R12
mov %r13,224($context) # restore cotnext->R13
mov %r14,232($context) # restore cotnext->R14
mov %r15,240($context) # restore cotnext->R15
mov %r12,216($context) # restore context->R12
mov %r13,224($context) # restore context->R13
mov %r14,232($context) # restore context->R14
mov %r15,240($context) # restore context->R15

lea -56-10*16(%rax),%rsi
lea 512($context),%rdi # &context.Xmm6

@ -1984,9 +1984,9 @@ ssse3_handler:
mov -40(%rax),%r14
mov %rbx,144($context) # restore context->Rbx
mov %rbp,160($context) # restore context->Rbp
mov %r12,216($context) # restore cotnext->R12
mov %r13,224($context) # restore cotnext->R13
mov %r14,232($context) # restore cotnext->R14
mov %r12,216($context) # restore context->R12
mov %r13,224($context) # restore context->R13
mov %r14,232($context) # restore context->R14

.Lcommon_seh_tail:
mov 8(%rax),%rdi

@ -18,7 +18,7 @@
#
# Performance improvement over compiler generated code varies from
# 10% to 40% [see below]. Not very impressive on some µ-archs, but
# it's 5 times smaller and optimizies amount of writes.
# it's 5 times smaller and optimizes amount of writes.
#
# May 2012.
#

@ -1508,10 +1508,10 @@ avx2_handler:
mov -48(%rax),%r15
mov %rbx,144($context) # restore context->Rbx
mov %rbp,160($context) # restore context->Rbp
mov %r12,216($context) # restore cotnext->R12
mov %r13,224($context) # restore cotnext->R13
mov %r14,232($context) # restore cotnext->R14
mov %r15,240($context) # restore cotnext->R15
mov %r12,216($context) # restore context->R12
mov %r13,224($context) # restore context->R13
mov %r14,232($context) # restore context->R14
mov %r15,240($context) # restore context->R15

lea -56-10*16(%rax),%rsi
lea 512($context),%rdi # &context.Xmm6

@ -42,7 +42,7 @@
|
||||
# (*) whichever best applicable.
|
||||
# (**) x86_64 assembler performance is presented for reference
|
||||
# purposes, the results are for integer-only code.
|
||||
# (***) paddq is increadibly slow on Atom.
|
||||
# (***) paddq is incredibly slow on Atom.
|
||||
#
|
||||
# IALU code-path is optimized for elder Pentiums. On vanilla Pentium
|
||||
# performance improvement over compiler generated code reaches ~60%,
|
||||
|
@ -35,7 +35,7 @@
|
||||
# on Cortex-A53 (or by 4 cycles per round).
|
||||
# (***) Super-impressive coefficients over gcc-generated code are
|
||||
# indication of some compiler "pathology", most notably code
|
||||
# generated with -mgeneral-regs-only is significanty faster
|
||||
# generated with -mgeneral-regs-only is significantly faster
|
||||
# and the gap is only 40-90%.
|
||||
#
|
||||
# October 2016.
|
||||
|
@ -773,7 +773,7 @@ foreach (split("\n",$code)) {
|
||||
s/shd\s+(%r[0-9]+),(%r[0-9]+),([0-9]+)/
|
||||
$3>31 ? sprintf("shd\t%$2,%$1,%d",$3-32) # rotation for >=32
|
||||
: sprintf("shd\t%$1,%$2,%d",$3)/e or
|
||||
# translate made up instructons: _ror, _shr, _align, _shl
|
||||
# translate made up instructions: _ror, _shr, _align, _shl
|
||||
s/_ror(\s+)(%r[0-9]+),/
|
||||
($SZ==4 ? "shd" : "shrpd")."$1$2,$2,"/e or
|
||||
|
||||
|
@ -26,7 +26,7 @@
|
||||
#
|
||||
# (*) 64-bit code in 32-bit application context, which actually is
|
||||
# on TODO list. It should be noted that for safe deployment in
|
||||
# 32-bit *mutli-threaded* context asyncronous signals should be
|
||||
# 32-bit *multi-threaded* context asynchronous signals should be
|
||||
# blocked upon entry to SHA512 block routine. This is because
|
||||
# 32-bit signaling procedure invalidates upper halves of GPRs.
|
||||
# Context switch procedure preserves them, but not signaling:-(
|
||||
|
@ -25,7 +25,7 @@
|
||||
# sha1-ppc.pl and 1.6x slower than aes-128-cbc. Another interesting
|
||||
# result is degree of computational resources' utilization. POWER8 is
|
||||
# "massively multi-threaded chip" and difference between single- and
|
||||
# maximum multi-process benchmark results tells that utlization is
|
||||
# maximum multi-process benchmark results tells that utilization is
|
||||
# whooping 94%. For sha512-ppc.pl we get [not unimpressive] 84% and
|
||||
# for sha1-ppc.pl - 73%. 100% means that multi-process result equals
|
||||
# to single-process one, given that all threads end up on the same
|
||||
|
@ -1055,7 +1055,7 @@ static uint64_t BitDeinterleave(uint64_t Ai)
|
||||
* as blocksize. It is commonly (1600 - 256*n)/8, e.g. 168, 136, 104,
|
||||
* 72, but can also be (1600 - 448)/8 = 144. All this means that message
|
||||
* padding and intermediate sub-block buffering, byte- or bitwise, is
|
||||
* caller's reponsibility.
|
||||
* caller's responsibility.
|
||||
*/
|
||||
size_t SHA3_absorb(uint64_t A[5][5], const unsigned char *inp, size_t len,
|
||||
size_t r)
|
||||
|
@ -68,7 +68,7 @@ int HASH_INIT(SHA_CTX *c)
|
||||
|
||||
/*
|
||||
* As pointed out by Wei Dai, F() below can be simplified to the code in
|
||||
* F_00_19. Wei attributes these optimisations to Peter Gutmann's SHS code,
|
||||
* F_00_19. Wei attributes these optimizations to Peter Gutmann's SHS code,
|
||||
* and he attributes it to Rich Schroeppel.
|
||||
* #define F(x,y,z) (((x) & (y)) | ((~(x)) & (z)))
|
||||
* I've just become aware of another tweak to be made, again from Wei Dai,
|
||||
|
@ -110,7 +110,7 @@ for (@ARGV) { $sse2=1 if (/-DOPENSSL_IA32_SSE2/); }
|
||||
&cmp ("ebp",0);
|
||||
&jne (&label("notintel"));
|
||||
&or ("edx",1<<30); # set reserved bit#30 on Intel CPUs
|
||||
&and (&HB("eax"),15); # familiy ID
|
||||
&and (&HB("eax"),15); # family ID
|
||||
&cmp (&HB("eax"),15); # P4?
|
||||
&jne (&label("notintel"));
|
||||
&or ("edx",1<<20); # set reserved bit#20 to engage RC4_CHAR
|
||||
|
@ -5,7 +5,7 @@
|
||||
testapp = test_sect
|
||||
|
||||
[test_sect]
|
||||
# list of confuration modules
|
||||
# list of configuration modules
|
||||
|
||||
# SSL configuration module
|
||||
ssl_conf = ssl_sect
|
||||
|
@ -145,7 +145,7 @@ static int afalg_setup_async_event_notification(afalg_aio *aio)
|
||||
ALG_WARN("%s: ASYNC_get_wait_ctx error", __func__);
|
||||
return 0;
|
||||
}
|
||||
/* Get waitfd from ASYNC_WAIT_CTX if it is alreday set */
|
||||
/* Get waitfd from ASYNC_WAIT_CTX if it is already set */
|
||||
ret = ASYNC_WAIT_CTX_get_fd(waitctx, engine_afalg_id,
|
||||
&aio->efd, &custom);
|
||||
if (ret == 0) {
|
||||
|
@ -401,7 +401,7 @@ int ENGINE_register_complete(ENGINE *e);
|
||||
int ENGINE_register_all_complete(void);
|
||||
|
||||
/*
|
||||
* Send parametrised control commands to the engine. The possibilities to
|
||||
* Send parameterised control commands to the engine. The possibilities to
|
||||
* send down an integer, a pointer to data or a function pointer are
|
||||
* provided. Any of the parameters may or may not be NULL, depending on the
|
||||
* command number. In actuality, this function only requires a structural
|
||||
|
@ -181,7 +181,7 @@ int UI_get_result_length(UI *ui, int i);
|
||||
int UI_process(UI *ui);
|
||||
|
||||
/*
|
||||
* Give a user interface parametrised control commands. This can be used to
|
||||
* Give a user interface parameterised control commands. This can be used to
|
||||
* send down an integer, a data pointer or a function pointer, as well as be
|
||||
* used to get information from a UI.
|
||||
*/
|
||||
|
@ -51,7 +51,7 @@ foreach (@files) {
|
||||
# read IMAGE_FILE_HEADER
|
||||
sysread(FD,$coff,20)==20 || die "$file is too short";
|
||||
($Machine,$NumberOfSections,$TimeDateStamp,
|
||||
$PointerToSymbolTable,$NumberOfSysmbols,
|
||||
$PointerToSymbolTable,$NumberOfSymbols,
|
||||
$SizeOfOptionalHeader,$Characteristics)=unpack("SSIIISS",$coff);
|
||||
|
||||
# skip over IMAGE_OPTIONAL_HEADER
|
||||
|
@ -1494,7 +1494,7 @@ typedef struct ssl3_state_st {
|
||||
uint16_t *peer_sigalgs;
|
||||
/* Size of above array */
|
||||
size_t peer_sigalgslen;
|
||||
/* Sigalg peer actualy uses */
|
||||
/* Sigalg peer actually uses */
|
||||
const SIGALG_LOOKUP *peer_sigalg;
|
||||
/*
|
||||
* Set if corresponding CERT_PKEY can be used with current
|
||||
|
@ -127,7 +127,7 @@ explicitly run (with more debugging):
|
||||
$ VERBOSE=1 make TESTS=test_external_krb5 test
|
||||
|
||||
Test-failures suppressions
|
||||
-------------------------
|
||||
--------------------------
|
||||
|
||||
krb5 will automatically adapt its test suite to account for the configuration
|
||||
of your system. Certain tests may require more installed packages to run. No
|
||||
|
@ -18,7 +18,7 @@ IF[{- !$disabled{tests} -}]
|
||||
DEPEND[libtestutil.a]=../libcrypto
|
||||
|
||||
# Special hack for descrip.mms to include the MAIN object module
|
||||
# explicitely. This will only be done if there isn't a MAIN in the
|
||||
# explicitly. This will only be done if there isn't a MAIN in the
|
||||
# program's object modules already.
|
||||
BEGINRAW[descrip.mms]
|
||||
INCLUDE_MAIN___test_libtestutil_OLB = /INCLUDE=MAIN
|
||||
|
@ -178,11 +178,11 @@ genpc() {
|
||||
-set_serial 2 -days "${DAYS}"
|
||||
}
|
||||
|
||||
# Usage: $0 genalt keyname certname eekeyname eecertname alt1 alt2 ...
|
||||
# Usage: $0 geneealt keyname certname eekeyname eecertname alt1 alt2 ...
|
||||
#
|
||||
# Note: takes csr on stdin, so must be used with $0 req like this:
|
||||
#
|
||||
# $0 req keyname dn | $0 genalt keyname certname eekeyname eecertname alt ...
|
||||
# $0 req keyname dn | $0 geneealt keyname certname eekeyname eecertname alt ...
|
||||
geneealt() {
|
||||
local key=$1; shift
|
||||
local cert=$1; shift
|
||||
|
@ -80,7 +80,7 @@ my @testlists = (
|
||||
[ "4.4.7", "Valid Two CRLs Test7", 0 ],
|
||||
|
||||
# The test document suggests these should return certificate revoked...
|
||||
# Subsquent discussion has concluded they should not due to unhandle
|
||||
# Subsequent discussion has concluded they should not due to unhandle
|
||||
# critical CRL extensions.
|
||||
[ "4.4.8", "Invalid Unknown CRL Entry Extension Test8", 36 ],
|
||||
[ "4.4.9", "Invalid Unknown CRL Extension Test9", 36 ],
|
||||
@ -705,7 +705,7 @@ my @testlists = (
|
||||
[ "4.14.29", "Valid cRLIssuer Test29", 0 ],
|
||||
|
||||
# Although this test is valid it has a circular dependency. As a result
|
||||
# an attempt is made to reursively checks a CRL path and rejected due to
|
||||
# an attempt is made to recursively checks a CRL path and rejected due to
|
||||
# a CRL path validation error. PKITS notes suggest this test does not
|
||||
# need to be run due to this issue.
|
||||
[ "4.14.30", "Valid cRLIssuer Test30", 54 ],
|
||||
|
@ -322,13 +322,13 @@ ok(!verify("badalt7-cert", "sslserver", ["root-cert"], ["ncca1-cert"], ),
|
||||
"Name Constraints CN BMPSTRING hostname not permitted");
|
||||
|
||||
ok(!verify("badalt8-cert", "sslserver", ["root-cert"], ["ncca1-cert", "ncca3-cert"], ),
|
||||
"Name constaints nested DNS name not permitted 1");
|
||||
"Name constraints nested DNS name not permitted 1");
|
||||
|
||||
ok(!verify("badalt9-cert", "sslserver", ["root-cert"], ["ncca1-cert", "ncca3-cert"], ),
|
||||
"Name constaints nested DNS name not permitted 2");
|
||||
"Name constraints nested DNS name not permitted 2");
|
||||
|
||||
ok(!verify("badalt10-cert", "sslserver", ["root-cert"], ["ncca1-cert", "ncca3-cert"], ),
|
||||
"Name constaints nested DNS name excluded");
|
||||
"Name constraints nested DNS name excluded");
|
||||
|
||||
ok(verify("ee-pss-sha1-cert", "sslserver", ["root-cert"], ["ca-cert"], ),
|
||||
"Certificate PSS signature using SHA1");
|
||||
|
@ -168,7 +168,7 @@ $proxy->start();
|
||||
ok(TLSProxy::Message->success() && ($selectedgroupid == X25519),
|
||||
"Multiple acceptable key_shares (part 2)");
|
||||
|
||||
#Test 15: Server sends key_share that wasn't offerred should fail
|
||||
#Test 15: Server sends key_share that wasn't offered should fail
|
||||
$proxy->clear();
|
||||
$testtype = SELECT_X25519;
|
||||
$proxy->clientflags("-curves P-256");
|
||||
|
@ -221,7 +221,7 @@ $proxy->reneg(1);
|
||||
$proxy->start();
|
||||
checkhandshake($proxy, checkhandshake::RENEG_HANDSHAKE,
|
||||
checkhandshake::DEFAULT_EXTENSIONS,
|
||||
"Rengotiation handshake test");
|
||||
"Renegotiation handshake test");
|
||||
|
||||
#Test 8: Server name handshake (no client request)
|
||||
$proxy->clear();
|
||||
|
@ -180,7 +180,7 @@ SKIP: {
|
||||
$boundary_test_type = DATA_AFTER_SERVER_HELLO;
|
||||
$proxy->filter(\&not_on_record_boundary);
|
||||
$proxy->start();
|
||||
ok(TLSProxy::Message->fail(), "Record not on bounday in TLS1.3 (ServerHello)");
|
||||
ok(TLSProxy::Message->fail(), "Record not on boundary in TLS1.3 (ServerHello)");
|
||||
|
||||
#Test 17: Sending a Finished which doesn't end on a record boundary
|
||||
# should fail
|
||||
@ -188,7 +188,7 @@ SKIP: {
|
||||
$boundary_test_type = DATA_AFTER_FINISHED;
|
||||
$proxy->filter(\&not_on_record_boundary);
|
||||
$proxy->start();
|
||||
ok(TLSProxy::Message->fail(), "Record not on bounday in TLS1.3 (Finished)");
|
||||
ok(TLSProxy::Message->fail(), "Record not on boundary in TLS1.3 (Finished)");
|
||||
|
||||
#Test 18: Sending a KeyUpdate which doesn't end on a record boundary
|
||||
# should fail
|
||||
@ -196,7 +196,7 @@ SKIP: {
|
||||
$boundary_test_type = DATA_AFTER_KEY_UPDATE;
|
||||
$proxy->filter(\&not_on_record_boundary);
|
||||
$proxy->start();
|
||||
ok(TLSProxy::Message->fail(), "Record not on bounday in TLS1.3 (KeyUpdate)");
|
||||
ok(TLSProxy::Message->fail(), "Record not on boundary in TLS1.3 (KeyUpdate)");
|
||||
}
|
||||
|
||||
|
||||
|
@ -137,7 +137,7 @@ use constant {
|
||||
EMPTY_EXTENSION => 1,
|
||||
NON_DHE_KEX_MODE_ONLY => 2,
|
||||
DHE_KEX_MODE_ONLY => 3,
|
||||
UNKNONW_KEX_MODES => 4,
|
||||
UNKNOWN_KEX_MODES => 4,
|
||||
BOTH_KEX_MODES => 5
|
||||
};
|
||||
|
||||
@ -207,7 +207,7 @@ checkhandshake($proxy, checkhandshake::RESUME_HANDSHAKE,
|
||||
#Test 6: Attempt a resume with only unrecognised kex modes. Should not resume
|
||||
$proxy->clear();
|
||||
$proxy->clientflags("-sess_in ".$session);
|
||||
$testtype = UNKNONW_KEX_MODES;
|
||||
$testtype = UNKNOWN_KEX_MODES;
|
||||
$proxy->start();
|
||||
checkhandshake($proxy, checkhandshake::DEFAULT_HANDSHAKE,
|
||||
checkhandshake::DEFAULT_EXTENSIONS
|
||||
@ -246,7 +246,7 @@ checkhandshake($proxy, checkhandshake::HRR_RESUME_HANDSHAKE,
|
||||
| checkhandshake::PSK_SRV_EXTENSION,
|
||||
"Resume with both kex modes and HRR");
|
||||
|
||||
#Test 9: Attempt a resume with dhe kex mode only and an unnacceptable initial
|
||||
#Test 9: Attempt a resume with dhe kex mode only and an unacceptable initial
|
||||
# key_share. Should resume with a key_share following an HRR
|
||||
$proxy->clear();
|
||||
$proxy->clientflags("-sess_in ".$session);
|
||||
@ -310,7 +310,7 @@ sub modify_kex_modes_filter
|
||||
$ext = pack "C2",
|
||||
0x01, #List length
|
||||
0x01; #psk_dhe_ke
|
||||
} elsif ($testtype == UNKNONW_KEX_MODES) {
|
||||
} elsif ($testtype == UNKNOWN_KEX_MODES) {
|
||||
$ext = pack "C3",
|
||||
0x02, #List length
|
||||
0xfe, #unknown
|
||||
|
@ -42,7 +42,7 @@ if (eval { require Win32::API; 1; }) {
|
||||
$pass = Encode::encode("cp1253",Encode::decode("utf-8",$pass));
|
||||
}
|
||||
} else {
|
||||
# Running MinGW tests transparenly under Wine apparently requires
|
||||
# Running MinGW tests transparently under Wine apparently requires
|
||||
# UTF-8 locale...
|
||||
|
||||
foreach(`locale -a`) {
|
||||
|
@ -441,7 +441,7 @@ sub testssl {
|
||||
plan skip_all => "None of the ciphersuites to test are available in this OpenSSL build"
|
||||
if $protocolciphersuitecount + scalar(keys %ciphersuites) == 0;
|
||||
|
||||
# The count of protocols is because in addition to the ciphersuits
|
||||
# The count of protocols is because in addition to the ciphersuites
|
||||
# we got above, we're running a weak DH test for each protocol
|
||||
plan tests => scalar(@protocols) + $protocolciphersuitecount
|
||||
+ scalar(keys %ciphersuites);
|
||||
|
@ -66,7 +66,7 @@ static int test_sanity_sign(void)
|
||||
return 1;
|
||||
}
|
||||
|
||||
static int test_sanity_unsigned_convertion(void)
|
||||
static int test_sanity_unsigned_conversion(void)
|
||||
{
|
||||
/* Check that unsigned-to-signed conversions preserve bit patterns */
|
||||
if (!TEST_int_eq((int)((unsigned int)INT_MAX + 1), INT_MIN)
|
||||
@ -91,7 +91,7 @@ int setup_tests(void)
|
||||
ADD_TEST(test_sanity_enum_size);
|
||||
ADD_TEST(test_sanity_twos_complement);
|
||||
ADD_TEST(test_sanity_sign);
|
||||
ADD_TEST(test_sanity_unsigned_convertion);
|
||||
ADD_TEST(test_sanity_unsigned_conversion);
|
||||
ADD_TEST(test_sanity_range);
|
||||
return 1;
|
||||
}
|
||||
|
@ -14,7 +14,7 @@
|
||||
|
||||
/*
|
||||
* The basic I/O functions used internally by the test framework. These
|
||||
* can be overriden when needed. Note that if one is, then all must be.
|
||||
* can be overridden when needed. Note that if one is, then all must be.
|
||||
*/
|
||||
void test_open_streams(void);
|
||||
void test_close_streams(void);
|
||||
|
Some files were not shown because too many files have changed in this diff