Index of /pub/net/Crypto/libeay
Name Last modified Size Description
libsha-0.8.2b.tar.gz 1998-01-07 23:00 32K
librc4-0.8.2b.tar.gz 1998-01-07 23:00 24K
librc2-0.8.2b.tar.gz 1998-01-07 23:00 14K
libmd5-0.8.2b.tar.gz 1998-01-07 23:00 24K
libidea-0.8.2b.tar.gz 1998-01-07 23:00 9.4K
libdes.tar.gz 1998-01-07 23:00 139K
libdes-l.tar.gz 1998-01-07 23:00 92K
libdes-l-4.04b.tar.gz 1998-01-07 23:00 92K
libdes-4.04b.tar.gz 1998-01-07 23:00 139K
libcast-0.8.2b.tar.gz 1998-01-07 23:00 41K
libbf.tar.gz 1998-01-07 23:00 39K
libbf-0.8.2b.tar.gz 1998-01-07 23:00 39K
README 1998-01-07 23:00 3.9K
This directory contains various ciphers and digests pulled out of SSLeay.
There is x86 assembler for
rc4, des, blowfish, cast5, md5 and sha1.
On a pentium the md5 takes 337 cycles per block, and is faster than the
speed listed in 'Even Faster Hashing on the Pentium' (345 cycles).
The sha1 inner loop is the same speed as the figure listed there (837 cycles).
Blowfish has an inner loop that is 9 cycles per round. There is a
faster version for the pentium pro, but the default version is probably
the best option.
There are 2 variants of the CAST5 implementation. One has 13 cycles per
round, and the other has 14. The 13 cycle version unfortunately runs
30% slower on a pentium pro than the 14 cycle version.
RC4 processes 8 bytes per 70 cycles.
cbc mode of these ciphers is implemented in assembler, but the block
cipher is called as a function rather than inlined. If you want another
2-3% speedup, you could easily remove the function call overhead at the
expense of increased code size for the library.
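
To show where that per-block function call sits, here is a minimal CBC
sketch in Python. It is not the library's code: toy_encrypt_block and
toy_decrypt_block are hypothetical placeholders (an invertible xor-and-reverse,
not a real cipher), and the 8-byte block size is assumed because des, idea,
rc2, blowfish and cast5 all use 64-bit blocks.

```python
BLOCK = 8  # bytes; assumed 64-bit block, as used by des, idea, rc2, blowfish, cast5


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise xor of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for a real block cipher: xor with key, then reverse."""
    return xor_bytes(block, key)[::-1]


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block: undo the reverse, then xor with key."""
    return xor_bytes(block[::-1], key)


def cbc_encrypt(encrypt_block, key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """CBC: each plaintext block is xored with the previous ciphertext block."""
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of the block size")
    out = bytearray()
    prev = iv
    for i in range(0, len(plaintext), BLOCK):
        # One function call per block: this is the overhead inlining would remove.
        prev = encrypt_block(key, xor_bytes(plaintext[i:i + BLOCK], prev))
        out += prev
    return bytes(out)


def cbc_decrypt(decrypt_block, key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Inverse of cbc_encrypt."""
    if len(ciphertext) % BLOCK:
        raise ValueError("ciphertext must be a multiple of the block size")
    out = bytearray()
    prev = iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += xor_bytes(decrypt_block(key, block), prev)
        prev = block
    return bytes(out)


if __name__ == "__main__":
    key = b"8bytekey"
    iv = bytes(BLOCK)
    message = b"sixteen byte msg"
    ciphertext = cbc_encrypt(toy_encrypt_block, key, iv, message)
    print(cbc_decrypt(toy_decrypt_block, key, iv, ciphertext))
```

In the assembler version the loop body and the call are both hand-written;
inlining would replace the call inside the loop with the cipher rounds
themselves.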
Anyway, what does this mean in the real world? Using the SSLeay 'speed'
test program, under linux on a pentium 100,
built on Tue Nov 4 02:52:29 EST 1997
options:bn(64,32) md2(int) rc4(ptr,int) des(ptr,risc1,16,long) idea(int) blowfish(ptr2)
C flags:gcc -DL_ENDIAN -DTERMIO -DBN_ASM -O3 -fomit-frame-pointer -m486 -Wall -Wuninitialized -DMD5_ASM -DSHA1_ASM
The 'numbers' are in 1000s of bytes per second processed.
type                  8 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md5                   993.15k    5748.27k   11944.70k   16477.53k   18287.27k
sha1                  563.24k    2851.67k    5363.71k    6879.23k    7441.07k
rc4                  7876.70k   10400.85k   10825.90k   10943.49k   10745.17k
des cbc              2047.39k    2188.25k    2188.29k    2239.49k    2233.69k
des ede3              660.55k     764.01k     773.55k     779.21k     780.97k
idea cbc              653.93k     708.48k     715.43k     719.87k     720.90k
rc2 cbc               648.08k     702.23k     708.78k     711.00k     709.97k
blowfish cbc         3764.39k    4288.66k    4375.04k    4497.07k    4423.68k
cast cbc             2757.14k    2993.75k    3035.31k    3078.90k    3055.62k
blowfish cbc [*]     3258.81k    3673.47k    3767.30k    3774.12k    3719.17k
cast cbc [**]        2677.05k    3164.78k    3273.05k    3287.38k    3244.03k
[*] pentium pro specific version
[**] pentium specific version
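
The cycle counts quoted earlier can be cross-checked against this table.
The sketch below turns cycles per block into an upper bound on throughput
and compares it with the 8192-byte column. The assumptions are mine, not
the README's: a 100 MHz clock for the "pentium 100", 64-byte blocks for
md5 and sha1, 8-byte blocks and 16 rounds for blowfish and cast5, and that
[**] is the 13-cycle CAST5 variant. The function name predicted_k is just
for illustration.

```python
CLOCK_HZ = 100_000_000  # assumption: "pentium 100" means a 100 MHz clock

# name -> (cycles per block, block size in bytes), from the quoted cycle counts.
# Assumptions: 64-byte hash blocks; 8-byte cipher blocks with 16 rounds each,
# counting round cycles only (no key setup, no cbc chaining, no call overhead).
QUOTED = {
    "md5": (337, 64),
    "sha1": (837, 64),
    "rc4": (70, 8),                 # "8 bytes per 70 cycles"
    "blowfish cbc": (9 * 16, 8),    # 9 cycles per round
    "cast cbc [**]": (13 * 16, 8),  # assumed to be the 13-cycle variant
}

# Measured 8192-byte column from the table above, in 1000s of bytes per second.
MEASURED_K = {
    "md5": 18287.27,
    "sha1": 7441.07,
    "rc4": 10745.17,
    "blowfish cbc": 4423.68,
    "cast cbc [**]": 3244.03,
}


def predicted_k(cycles_per_block: int, block_bytes: int,
                clock_hz: int = CLOCK_HZ) -> float:
    """Upper bound on throughput in 1000s of bytes per second."""
    blocks_per_second = clock_hz / cycles_per_block
    return blocks_per_second * block_bytes / 1000.0


if __name__ == "__main__":
    for name, (cycles, block_bytes) in QUOTED.items():
        bound = predicted_k(cycles, block_bytes)
        measured = MEASURED_K[name]
        print(f"{name:14s} bound {bound:9.2f}k  measured {measured:9.2f}k  "
              f"ratio {measured / bound:.2f}")
```

Under these assumptions every measured figure comes out below its bound,
which is what you would expect once padding, chaining and call overhead
are added on top of the inner loop.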
For a pentium pro 200, Windows 95, SSLeay with DLLs
built on Tue Nov 4 08:57:30 EST 1997
options:bn(64,32) md2(int) rc4(idx,int) des(idx,cisc,4,long) idea(int) blowfish(ptr2)
C flags:cl /W3 /WX /G5 /Ox /O2 /Ob2 /Gs0 /GF /Gy /nologo -DWIN32 -DL_ENDIAN -DBN_ASM -DMD5_ASM -DSHA1_ASM /MD
The 'numbers' are in 1000s of bytes per second processed.
type                  8 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md5                  2251.85k   11966.63k   22944.77k   29916.58k   32729.64k
sha1                 1398.85k    6621.89k   11831.61k   14722.02k   15863.48k
rc4                 15744.38k   21239.13k   22093.89k   22419.84k   22564.58k
des cbc              4147.03k    4571.95k    4654.13k    4673.84k    4654.13k
des ede3             1564.92k    1631.00k    1642.47k    1646.21k    1641.86k
idea cbc             2582.06k    2888.64k    2936.78k    2953.74k    2949.58k
rc2 cbc              1646.37k    1782.23k    1800.59k    1805.24k    1806.96k
blowfish cbc         6052.39k    7025.63k    7123.48k    7172.20k    7147.76k
cast cbc             6000.43k    6978.88k    7123.48k    7147.76k    7123.48k
blowfish cbc [*]     6404.43k    7304.13k    7508.36k    7627.94k    7477.61k
cast cbc [**]        4404.82k    4909.89k    5000.44k    4993.28k    5013.06k
[*] pentium pro specific version
[**] pentium specific version
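
The variant rows can be compared directly. The sketch below works out how
much slower (or faster) each tuned variant is than the default build,
using the 1024-byte column of the pentium pro table. The helper name
percent_slower is hypothetical, and reading [**] as the 13-cycle CAST5
version is an inference from the text, not something the table states.

```python
# 1024-byte column of the pentium pro 200 table, in 1000s of bytes per second.
PPRO_1024 = {
    "blowfish cbc": 7172.20,
    "blowfish cbc [*]": 7627.94,   # pentium pro specific version
    "cast cbc": 7147.76,
    "cast cbc [**]": 4993.28,      # pentium specific version
}


def percent_slower(baseline_k: float, variant_k: float) -> float:
    """How much slower the variant is than the baseline, in percent.

    A negative result means the variant is faster than the baseline.
    """
    return (1.0 - variant_k / baseline_k) * 100.0


if __name__ == "__main__":
    cast = percent_slower(PPRO_1024["cast cbc"], PPRO_1024["cast cbc [**]"])
    blowfish = percent_slower(PPRO_1024["blowfish cbc"],
                              PPRO_1024["blowfish cbc [*]"])
    print(f"cast cbc [**] vs default:    {cast:6.2f}% slower")
    print(f"blowfish cbc [*] vs default: {blowfish:6.2f}% slower")
```

For cast the result lands close to the 30% figure quoted earlier; for
blowfish it comes out negative, meaning the pentium pro specific version
is the faster one on this machine.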
Work still to be done.
- Test vectors for various modes of CAST5 need to be put into casttest.c
- More work on C code variants of the CAST inner loop.
- More testing of the various assembler implementations.
- General code cleanups
eric 15-Nov-1997
08-Jan-1998
- fixes to md5 and sha1 for big-endian machines.