Eric Lee / smarc-fsl-linux-kernel

20 Dec, 2011

2 commits

7ba8babf8 crypto: serpent-sse2 - remove unneeded LRW/XTS #ifdefs ... Browse Code »

Since LRW & XTS are selected by serpent-sse2, we don't need these #ifdefs
anymore.

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-12-20 15:20:08 +0800
88715b9ad crypto: twofish-x86_64-3way - remove unneeded LRW/XTS #ifdefs ... Browse Code »

Since LRW & XTS are selected by twofish-x86_64-3way, we don't need these
#ifdefs anymore.

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-12-20 15:20:07 +0800

21 Nov, 2011

5 commits

d35643385 crypto: serpent-sse2 - clear CRYPTO_TFM_REQ_MAY_SLEEP in lrw and xts modes ... Browse Code »

LRW/XTS patches for serpent-sse2 forgot to add this. CRYPTO_TFM_REQ_MAY_SLEEP
should be cleared as sleeping between kernel_fpu_begin()/kernel_fpu_end() is
not allowed.

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-11-21 16:13:25 +0800
5962f8b66 crypto: serpent-sse2 - add xts support ... Browse Code »

Patch adds XTS support for serpent-sse2 by using xts_crypt(). Patch has been
tested with tcrypt and automated filesystem tests.

Tcrypt benchmarks results (serpent-sse2/serpent_generic speed ratios):

Intel Celeron T1600 (x86_64) (fam:6, model:15, step:13):
size xts-enc xts-dec
16B 0.98x 1.00x
64B 1.00x 1.01x
256B 2.78x 2.75x
1024B 3.30x 3.26x
8192B 3.39x 3.30x

AMD Phenom II 1055T (x86_64) (fam:16, model:10):
size xts-enc xts-dec
16B 1.05x 1.02x
64B 1.04x 1.03x
256B 2.10x 2.05x
1024B 2.34x 2.35x
8192B 2.34x 2.40x

Intel Atom N270 (i586):
size xts-enc xts-dec
16B 0.95x 0.96x
64B 1.53x 1.50x
256B 1.72x 1.75x
1024B 1.88x 1.87x
8192B 1.86x 1.83x

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-11-21 16:13:24 +0800
18482053f crypto: serpent-sse2 - add lrw support ... Browse Code »

Patch adds LRW support for serpent-sse2 by using lrw_crypt(). Patch has been
tested with tcrypt and automated filesystem tests.

Tcrypt benchmarks results (serpent-sse2/serpent_generic speed ratios):

Benchmark results with tcrypt:

Intel Celeron T1600 (x86_64) (fam:6, model:15, step:13):
size lrw-enc lrw-dec
16B 1.00x 0.96x
64B 1.01x 1.01x
256B 3.01x 2.97x
1024B 3.39x 3.33x
8192B 3.35x 3.33x

AMD Phenom II 1055T (x86_64) (fam:16, model:10):
size lrw-enc lrw-dec
16B 0.98x 1.03x
64B 1.01x 1.04x
256B 2.10x 2.14x
1024B 2.28x 2.33x
8192B 2.30x 2.33x

Intel Atom N270 (i586):
size lrw-enc lrw-dec
16B 0.97x 0.97x
64B 1.47x 1.50x
256B 1.72x 1.69x
1024B 1.88x 1.81x
8192B 1.84x 1.79x

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-11-21 16:13:24 +0800
251496dbf crypto: serpent - add 4-way parallel i586/SSE2 assembler implementation ... Browse Code »

Patch adds i586/SSE2 assembler implementation of serpent cipher. Assembler
functions crypt data in four block chunks.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmarks results (serpent-sse2/serpent_generic speed ratios):

Intel Atom N270:

size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16 0.95x 1.12x 1.02x 1.07x 0.97x 0.98x
64 1.73x 1.82x 1.08x 1.82x 1.72x 1.73x
256 2.08x 2.00x 1.04x 2.07x 1.99x 2.01x
1024 2.28x 2.18x 1.05x 2.23x 2.17x 2.20x
8192 2.28x 2.13x 1.05x 2.23x 2.18x 2.20x

Full output:
http://koti.mbnet.fi/axh/kernel/crypto/atom-n270/serpent-generic.txt
http://koti.mbnet.fi/axh/kernel/crypto/atom-n270/serpent-sse2.txt

Userspace test results:

Encryption/decryption of sse2-i586 vs generic on Intel Atom N270:
encrypt: 2.35x
decrypt: 2.54x

Encryption/decryption of sse2-i586 vs generic on AMD Phenom II:
encrypt: 1.82x
decrypt: 2.51x

Encryption/decryption of sse2-i586 vs generic on Intel Xeon E7330:
encrypt: 2.99x
decrypt: 3.48x

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-11-21 16:13:23 +0800
937c30d7f crypto: serpent - add 8-way parallel x86_64/SSE2 assembler implementation ... Browse Code »

Patch adds x86_64/SSE2 assembler implementation of serpent cipher. Assembler
functions crypt data in eigth block chunks (two 4 block chunk SSE2 operations
in parallel to improve performance on out-of-order CPUs). Glue code is based
on one from AES-NI implementation, so requests from irq context are redirected
to cryptd.

v2:
- add missing include of linux/module.h
(appearently crypto.h used to include module.h, which changed for 3.2 by
commit 7c926402a7e8c9b279968fd94efec8700ba3859e)

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmarks results (serpent-sse2/serpent_generic speed ratios):

AMD Phenom II 1055T (fam:16, model:10):

size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B 1.03x 1.01x 1.03x 1.05x 1.00x 0.99x
64B 1.00x 1.01x 1.02x 1.04x 1.02x 1.01x
256B 2.34x 2.41x 0.99x 2.43x 2.39x 2.40x
1024B 2.51x 2.57x 1.00x 2.59x 2.56x 2.56x
8192B 2.50x 2.54x 1.00x 2.55x 2.57x 2.57x

Intel Celeron T1600 (fam:6, model:15, step:13):

size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B 0.97x 0.97x 1.01x 1.01x 1.01x 1.02x
64B 1.00x 1.00x 1.00x 1.02x 1.01x 1.01x
256B 3.41x 3.35x 1.00x 3.39x 3.42x 3.44x
1024B 3.75x 3.72x 0.99x 3.74x 3.75x 3.75x
8192B 3.70x 3.68x 0.99x 3.68x 3.69x 3.69x

Full output:
http://koti.mbnet.fi/axh/kernel/crypto/phenom-ii-1055t/serpent-generic.txt
http://koti.mbnet.fi/axh/kernel/crypto/phenom-ii-1055t/serpent-sse2.txt
http://koti.mbnet.fi/axh/kernel/crypto/celeron-t1600/serpent-generic.txt
http://koti.mbnet.fi/axh/kernel/crypto/celeron-t1600/serpent-sse2.txt

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-11-21 16:13:23 +0800

09 Nov, 2011

2 commits

bae6d3038 crypto: twofish-x86_64-3way - add xts support ... Browse Code »

Patch adds XTS support for twofish-x86_64-3way by using xts_crypt(). Patch has
been tested with tcrypt and automated filesystem tests.

Tcrypt benchmarks results (twofish-3way/twofish-asm speed ratios):

Intel Celeron T1600 (fam:6, model:15, step:13):

size xts-enc xts-dec
16B 0.98x 1.00x
64B 1.14x 1.15x
256B 1.23x 1.25x
1024B 1.26x 1.29x
8192B 1.28x 1.30x

AMD Phenom II 1055T (fam:16, model:10):

size xts-enc xts-dec
16B 1.03x 1.03x
64B 1.13x 1.16x
256B 1.20x 1.20x
1024B 1.22x 1.22x
8192B 1.22x 1.21x

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-11-09 11:57:57 +0800
81559f9ad crypto: twofish-x86_64-3way - add lrw support ... Browse Code »

Patch adds LRW support for twofish-x86_64-3way by using lrw_crypt(). Patch has
been tested with tcrypt and automated filesystem tests.

Tcrypt benchmarks results (twofish-3way/twofish-asm speed ratios):

Intel Celeron T1600 (fam:6, model:15, step:13):

size lrw-enc lrw-dec
16B 0.99x 1.00x
64B 1.17x 1.17x
256B 1.26x 1.27x
1024B 1.30x 1.31x
8192B 1.31x 1.32x

AMD Phenom II 1055T (fam:16, model:10):

size lrw-enc lrw-dec
16B 1.06x 1.01x
64B 1.08x 1.14x
256B 1.19x 1.20x
1024B 1.21x 1.22x
8192B 1.23x 1.24x

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-11-09 11:53:32 +0800

07 Nov, 2011

1 commit

32aaeffbd Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux ... Browse Code »

* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include
net: sch_generic remove redundant use of
net: inet_timewait_sock doesnt need
...

Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h

Linus Torvalds
2011-11-07 11:44:47 +0800

01 Nov, 2011

1 commit

7c52d5517 x86: fix up files really needing to include module.h ... Browse Code »

These files aren't just exporting symbols -- they are also defining
a MODULE_LICENSE etc. so give them the full module.h file.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-11-01 07:30:36 +0800

21 Oct, 2011

6 commits

906b2c9f2 crypto: twofish-x86_64-3way - fix ctr blocksize to 1 ... Browse Code »

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:28:57 +0800
a516ebafd crypto: blowfish-x86_64 - fix ctr blocksize to 1 ... Browse Code »

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:28:57 +0800
8280daad4 crypto: twofish - add 3-way parallel x86_64 assembler implemention ... Browse Code »

Patch adds 3-way parallel x86_64 assembly implementation of twofish as new
module. New assembler functions crypt data in three blocks chunks, improving
cipher performance on out-of-order CPUs.

Patch has been tested with tcrypt and automated filesystem tests.

Summary of the tcrypt benchmarks:

Twofish 3-way-asm vs twofish asm (128bit 8kb block ECB)
encrypt: 1.3x speed
decrypt: 1.3x speed

Twofish 3-way-asm vs twofish asm (128bit 8kb block CBC)
encrypt: 1.07x speed
decrypt: 1.4x speed

Twofish 3-way-asm vs twofish asm (128bit 8kb block CTR)
encrypt: 1.4x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block ECB)
encrypt: 1.0x speed
decrypt: 1.0x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block CBC)
encrypt: 0.84x speed
decrypt: 1.09x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block CTR)
encrypt: 1.15x speed

Full output:
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-3way-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-aes-asm-x86_64.txt

Tests were run on:
vendor_id : AuthenticAMD
cpu family : 16
model : 10
model name : AMD Phenom(tm) II X6 1055T Processor

Also userspace test were run on:
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU E7330 @ 2.40GHz
stepping : 11

Userspace test results:

Encryption/decryption of twofish 3-way vs x86_64-asm on AMD Phenom II:
encrypt: 1.27x
decrypt: 1.25x

Encryption/decryption of twofish 3-way vs x86_64-asm on Intel Xeon E7330:
encrypt: 1.36x
decrypt: 1.36x

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:23:08 +0800
91d41f159 crypto: twofish-x86-asm - make assembler functions use twofish_ctx instead of crypto_tfm ... Browse Code »

This needed by 3-way twofish patch to be able to easily use one block
assembler functions. As glue code is shared between i586/x86_64 apply
change to i586 assembler too. Also export assembler functions for
3-way parallel twofish module.

CC: Joachim Fritschi
Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:23:08 +0800
a071d06e3 crypto: blowfish-x86_64 - add credits ... Browse Code »

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:23:07 +0800
e827bb09c crypto: blowfish-x86_64 - improve x86_64 blowfish 4-way performance ... Browse Code »

This patch adds improved F-macro for 4-way parallel functions. With new
F-macro for 4-way parallel functions, blowfish sees ~15% improvement in
speed tests on AMD Phenom II (~5% on Intel Xeon E7330).

However when used in 1-way blowfish function new macro would be ~10%
slower than original, so old F-macro is kept for 1-way functions.
Patch cleans up old F-macro as it is no longer needed in 4-way part.

Patch also does register macro renaming to reduce stack usage.

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:23:07 +0800

22 Sep, 2011

2 commits

4a4cc2b6b crypto: aes-x86 - quiet sparse noise about symbol not declared ... Browse Code »

Include to pick up the declarations for crypto_aes_encrypt_x86
and crypto_aes_decrypt_x86 to quiet the sparse noise:

warning: symbol 'crypto_aes_encrypt_x86' was not declared. Should it be static?
warning: symbol 'crypto_aes_decrypt_x86' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten
Acked-by: Mandeep Singh Baines
Signed-off-by: Herbert Xu

H Hartley Sweeten
2011-09-22 19:33:01 +0800
64b94ceae crypto: blowfish - add x86_64 assembly implementation ... Browse Code »

Patch adds x86_64 assembly implementation of blowfish. Two set of assembler
functions are provided. First set is regular 'one-block at time'
encrypt/decrypt functions. Second is 'four-block at time' functions that
gain performance increase on out-of-order CPUs. Performance of 4-way
functions should be equal to 1-way functions with in-order CPUs.

Summary of the tcrypt benchmarks:

Blowfish assembler vs blowfish C (256bit 8kb block ECB)
encrypt: 2.2x speed
decrypt: 2.3x speed

Blowfish assembler vs blowfish C (256bit 8kb block CBC)
encrypt: 1.12x speed
decrypt: 2.5x speed

Blowfish assembler vs blowfish C (256bit 8kb block CTR)
encrypt: 2.5x speed

Full output:
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt

Tests were run on:
vendor_id : AuthenticAMD
cpu family : 16
model : 10
model name : AMD Phenom(tm) II X6 1055T Processor
stepping : 0

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-09-22 19:25:26 +0800

10 Aug, 2011

1 commit

66be89515 crypto: sha1 - SSSE3 based SHA1 implementation for x86-64 ... Browse Code »

This is an assembler implementation of the SHA1 algorithm using the
Supplemental SSE3 (SSSE3) instructions or, when available, the
Advanced Vector Extensions (AVX).

Testing with the tcrypt module shows the raw hash performance is up to
2.3 times faster than the C implementation, using 8k data blocks on a
Core 2 Duo T5500. For the smalest data set (16 byte) it is still 25%
faster.

Since this implementation uses SSE/YMM registers it cannot safely be
used in every situation, e.g. while an IRQ interrupts a kernel thread.
The implementation falls back to the generic SHA1 variant, if using
the SSE/YMM registers is not possible.

With this algorithm I was able to increase the throughput of a single
IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
the SSSE3 variant -- a speedup of +34.8%.

Saving and restoring SSE/YMM state might make the actual throughput
fluctuate when there are FPU intensive userland applications running.
For example, meassuring the performance using iperf2 directly on the
machine under test gives wobbling numbers because iperf2 uses the FPU
for each packet to check if the reporting interval has expired (in the
above test I got min/max/avg: 402/484/464 MBit/s).

Using this algorithm on a IPsec gateway gives much more reasonable and
stable numbers, albeit not as high as in the directly connected case.
Here is the result from an RFC 2544 test run with a EXFO Packet Blazer
FTB-8510:

frame size sha1-generic sha1-ssse3 delta
64 byte 37.5 MBit/s 37.5 MBit/s 0.0%
128 byte 56.3 MBit/s 62.5 MBit/s +11.0%
256 byte 87.5 MBit/s 100.0 MBit/s +14.3%
512 byte 131.3 MBit/s 150.0 MBit/s +14.2%
1024 byte 162.5 MBit/s 193.8 MBit/s +19.3%
1280 byte 175.0 MBit/s 212.5 MBit/s +21.4%
1420 byte 175.0 MBit/s 218.7 MBit/s +25.0%
1518 byte 150.0 MBit/s 181.2 MBit/s +20.8%

The throughput for the largest frame size is lower than for the
previous size because the IP packets need to be fragmented in this
case to make there way through the IPsec tunnel.

Signed-off-by: Mathias Krause
Cc: Maxim Locktyukhin
Signed-off-by: Herbert Xu

Mathias Krause
2011-08-10 19:00:29 +0800

30 Jun, 2011

1 commit

c3e73e76a crypto: ghash-intel - Fix set but not used in ghash_async_setkey() ... Browse Code »

Signed-off-by: Gustavo F. Padovan
Signed-off-by: Herbert Xu

Gustavo F. Padovan
2011-06-30 07:43:42 +0800

18 May, 2011

1 commit

9bed4aca2 crypto: aesni-intel - fix aesni build on i386 ... Browse Code »

Fix build error on i386 by moving function prototypes:

arch/x86/crypto/aesni-intel_glue.c: In function 'aesni_init':
arch/x86/crypto/aesni-intel_glue.c:1263: error: implicit declaration of function 'crypto_fpu_init'
arch/x86/crypto/aesni-intel_glue.c: In function 'aesni_exit':
arch/x86/crypto/aesni-intel_glue.c:1373: error: implicit declaration of function 'crypto_fpu_exit'

Signed-off-by: Randy Dunlap
Signed-off-by: Herbert Xu

Randy Dunlap
2011-05-18 07:03:34 +0800

16 May, 2011

1 commit

b23b64516 crypto: aesni-intel - Merge with fpu.ko ... Browse Code »

Loading fpu without aesni-intel does nothing. Loading aesni-intel
without fpu causes modes like xts to fail. (Unloading
aesni-intel will restore those modes.)

One solution would be to make aesni-intel depend on fpu, but it
seems cleaner to just combine the modules.

This is probably responsible for bugs like:
https://bugzilla.redhat.com/show_bug.cgi?id=589390

Signed-off-by: Andy Lutomirski
Signed-off-by: Herbert Xu

Andy Lutomirski
2011-05-16 13:12:47 +0800

27 Mar, 2011

1 commit

60af520cf crypto: aesni-intel - fixed problem with packets that are not multiple of 64bytes ... Browse Code »

This patch fixes problem with packets that are not multiple of 64bytes.

Signed-off-by: Adrian Hoban
Signed-off-by: Aidan O'Mahony
Signed-off-by: Gabriele Paoloni
Signed-off-by: Tadeusz Struk
Signed-off-by: Herbert Xu

Tadeusz Struk
2011-03-27 10:29:39 +0800

19 Mar, 2011

1 commit

f2e1fbb5f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/… ... Browse Code »

…git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Flush TLB if PGD entry is changed in i386 PAE mode
x86, dumpstack: Correct stack dump info when frame pointer is available
x86: Clean up csum-copy_64.S a bit
x86: Fix common misspellings
x86: Fix misspelling and align params
x86: Use PentiumPro-optimized partial_csum() on VIA C7

Linus Torvalds
2011-03-19 01:45:21 +0800

18 Mar, 2011

1 commit

0d2eb44f6 x86: Fix common misspellings ... Browse Code »

They were generated by 'codespell' and then manually reviewed.

Signed-off-by: Lucas De Marchi
Cc: trivial@kernel.org
LKML-Reference:
Signed-off-by: Ingo Molnar

Lucas De Marchi
2011-03-18 17:39:30 +0800

16 Feb, 2011

1 commit

fc9044e2d crypto: aesni-intel - Fix remaining leak in r fc4106 _set_hash_key ... Browse Code »

Fix up previous patch that failed to properly fix mem leak in
rfc4106_set_hash_subkey(). This add-on patch; fixes the leak. moves
kfree() out of the error path, returns -ENOMEM rather than -EINVAL when
ablkcipher_request_alloc() fails.

Signed-off-by: Jesper Juhl
Signed-off-by: Herbert Xu

Jesper Juhl
2011-02-16 10:04:09 +0800

23 Jan, 2011

1 commit

7efd95f62 crypto: aesni-intel - Don't leak memory in r fc4106 _set_hash_subkey ... Browse Code »

There's a small memory leak in
arch/x86/crypto/aesni-intel_glue.c::rfc4106_set_hash_subkey(). If the call
to kmalloc() fails and returns NULL then the memory allocated previously
by ablkcipher_request_alloc() is not freed when we leave the function.

I could have just added a call to ablkcipher_request_free() before we
return -ENOMEM, but that started to look too much like the code we
already had at the end of the function, so I chose instead to rework the
code a bit so that there are now a few labels at the end that we goto when
various allocations fail, so we don't have to repeat the same blocks of
code (this also reduces the object code size slightly).

Signed-off-by: Jesper Juhl
Signed-off-by: Herbert Xu

Jesper Juhl
2011-01-23 15:59:17 +0800

14 Jan, 2011

1 commit

27d189c02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (46 commits)
hwrng: via_rng - Fix memory scribbling on some CPUs
crypto: padlock - Move padlock.h into include/crypto
hwrng: via_rng - Fix asm constraints
crypto: n2 - use __devexit not __exit in n2_unregister_algs
crypto: mark crypto workqueues CPU_INTENSIVE
crypto: mv_cesa - dont return PTR_ERR() of wrong pointer
crypto: ripemd - Set module author and update email address
crypto: omap-sham - backlog handling fix
crypto: gf128mul - Remove experimental tag
crypto: af_alg - fix af_alg memory_allocated data type
crypto: aesni-intel - Fixed build with binutils 2.16
crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets
net: Add missing lockdep class names for af_alg
include: Install linux/if_alg.h for user-space crypto API
crypto: omap-aes - checkpatch --file warning fixes
crypto: omap-aes - initialize aes module once per request
crypto: omap-aes - unnecessary code removed
crypto: omap-aes - error handling implementation improved
crypto: omap-aes - redundant locking is removed
crypto: omap-aes - DMA initialization fixes for OMAP off mode
...

Linus Torvalds
2011-01-14 02:25:58 +0800

15 Dec, 2010

1 commit

52f6c5ad4 crypto: ghash-intel - ghash-clmulni-intel_glue needs err.h ... Browse Code »

Add missing header file:

arch/x86/crypto/ghash-clmulni-intel_glue.c:256: error: implicit declaration of function 'IS_ERR'
arch/x86/crypto/ghash-clmulni-intel_glue.c:257: error: implicit declaration of function 'PTR_ERR'

Signed-off-by: Randy Dunlap
Signed-off-by: Herbert Xu

Randy Dunlap
2010-12-15 19:44:08 +0800

13 Dec, 2010

1 commit

3c097b800 crypto: aesni-intel - Fixed build with binutils 2.16 ... Browse Code »

This patch fixes the problem with 2.16 binutils.

Signed-off-by: Aidan O'Mahony
Signed-off-by: Adrian Hoban
Signed-off-by: Gabriele Paoloni
Signed-off-by: Tadeusz Struk
Signed-off-by: Herbert Xu

Tadeusz Struk
2010-12-13 19:51:15 +0800

29 Nov, 2010

1 commit

559ad0ff1 crypto: aesni-intel - Fixed build error on x86-32 ... Browse Code »

Exclude AES-GCM code for x86-32 due to heavy usage of 64-bit registers
not available on x86-32.

While at it, fixed unregister order in aesni_exit().

Signed-off-by: Mathias Krause
Signed-off-by: Herbert Xu

Mathias Krause
2010-11-29 08:35:39 +0800

27 Nov, 2010

1 commit

0d258efb6 crypto: aesni-intel - Ported implementation to x86-32 ... Browse Code »

The AES-NI instructions are also available in legacy mode so the 32-bit
architecture may profit from those, too.

To illustrate the performance gain here's a short summary of a dm-crypt
speed test on a Core i7 M620 running at 2.67GHz comparing both assembler
implementations:

x86: i568 aes-ni delta
ECB, 256 bit: 93.8 MB/s 123.3 MB/s +31.4%
CBC, 256 bit: 84.8 MB/s 262.3 MB/s +209.3%
LRW, 256 bit: 108.6 MB/s 222.1 MB/s +104.5%
XTS, 256 bit: 105.0 MB/s 205.5 MB/s +95.7%

Additionally, due to some minor optimizations, the 64-bit version also
got a minor performance gain as seen below:

x86-64: old impl. new impl. delta
ECB, 256 bit: 121.1 MB/s 123.0 MB/s +1.5%
CBC, 256 bit: 285.3 MB/s 290.8 MB/s +1.9%
LRW, 256 bit: 263.7 MB/s 265.3 MB/s +0.6%
XTS, 256 bit: 251.1 MB/s 255.3 MB/s +1.7%

Signed-off-by: Mathias Krause
Reviewed-by: Huang Ying
Signed-off-by: Herbert Xu

Mathias Krause
2010-11-27 16:34:46 +0800

13 Nov, 2010

1 commit

0bd82f5f6 crypto: aesni-intel - R FC4106 AES-GCM Driver Using Intel New Instructions ... Browse Code »

This patch adds an optimized RFC4106 AES-GCM implementation for 64-bit
kernels. It supports 128-bit AES key size. This leverages the crypto
AEAD interface type to facilitate a combined AES & GCM operation to
be implemented in assembly code. The assembly code leverages Intel(R)
AES New Instructions and the PCLMULQDQ instruction.

Signed-off-by: Adrian Hoban
Signed-off-by: Tadeusz Struk
Signed-off-by: Gabriele Paoloni
Signed-off-by: Aidan O'Mahony
Signed-off-by: Erdinc Ozturk
Signed-off-by: James Guilford
Signed-off-by: Wajdi Feghali
Signed-off-by: Herbert Xu

Tadeusz Struk
2010-11-13 20:47:55 +0800

03 May, 2010

1 commit

df2071bd0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Browse Code »

Herbert Xu
2010-05-03 11:28:58 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implic… ... Browse Code »

…it slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

13 Mar, 2010

1 commit

32cbd7dfc crypto: aesni-intel - Fix CTR optimization build failure with gas 2.16.1 ... Browse Code »

Andrew Morton reported that AES-NI CTR optimization failed to compile
with gas 2.16.1, the error message is as follow:

arch/x86/crypto/aesni-intel_asm.S: Assembler messages:
arch/x86/crypto/aesni-intel_asm.S:752: Error: suffix or operands invalid for `movq'
arch/x86/crypto/aesni-intel_asm.S:753: Error: suffix or operands invalid for `movq'

To fix this, a gas macro is defined to assemble movq with 64bit
general purpose registers and XMM registers. The macro will generate
the raw .byte sequence for needed instructions.

Reported-by: Andrew Morton
Signed-off-by: Huang Ying
Signed-off-by: Herbert Xu

Huang Ying
2010-03-13 16:28:42 +0800

10 Mar, 2010

1 commit

12387a46b crypto: aesni-intel - Add AES-NI accelerated CTR mode ... Browse Code »

To take advantage of the hardware pipeline implementation of AES-NI
instructions. CTR mode cryption is implemented in ASM to schedule
multiple AES-NI instructions one after another. This way, some latency
of AES-NI instruction can be eliminated.

Performance testing based on dm-crypt should 50% reduction of
ecryption/decryption time.

Signed-off-by: Huang Ying
Signed-off-by: Herbert Xu

Huang Ying
2010-03-10 18:28:55 +0800

09 Feb, 2010

1 commit

3ad2f3fbb tree-wide: Assorted spelling fixes ... Browse Code »

In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.

Signed-off-by: Daniel Mack
Cc: Joe Perches
Cc: Junio C Hamano
Signed-off-by: Jiri Kosina

Daniel Mack
2010-02-09 18:13:56 +0800

01 Dec, 2009

1 commit

838632438 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Browse Code »

Herbert Xu
2009-12-01 15:16:22 +0800