Eric Lee / smarc-fsl-linux-kernel

21 Feb, 2012

3 commits

7c51cb723 crypto: sha512 - use standard ror64()

commit f2ea0f5f04c97b48c88edccba52b0682fbe45087 upstream.

Use standard ror64() instead of hand-written.
There is no standard ror64, so create it.

The difference is shift value being "unsigned int" instead of uint64_t
(for which there is no reason). gcc starts to emit native ROR instructions
which it doesn't do for some reason currently. This should make the code
faster.
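
As a rough illustration of the helper this message describes, a 64-bit rotate-right with an `unsigned int` shift count might look like the sketch below. It is a userspace approximation, not the in-tree definition. The `& 63` masking is an assumption added here so that a shift count of 0 or 64 never turns into an undefined 64-bit shift.

```c
#include <stdint.h>

/*
 * Rotate a 64-bit word right by `shift` bits.
 * The shift count is a plain unsigned int, as the commit message describes.
 * Masking with 63 (an addition for this sketch) keeps both shift amounts
 * in the range 0..63.
 */
static inline uint64_t ror64(uint64_t word, unsigned int shift)
{
	return (word >> (shift & 63)) | (word << ((64 - shift) & 63));
}
```

Compilers commonly recognise this shift-or pattern and can emit a single rotate instruction for it, which is the effect the message is after.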

Patch survives in-tree crypto test and ping flood with hmac(sha512) on.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Alexey Dobriyan
2012-02-21 04:46:20 +0800
03b762ab8 crypto: sha512 - Avoid stack bloat on i386

commit 3a92d687c8015860a19213e3c102cad6b722f83c upstream.

Unfortunately in reducing W from 80 to 16 we ended up unrolling
the loop twice. As gcc has issues dealing with 64-bit ops on
i386 this means that we end up using even more stack space (>1K).

This patch solves the W reduction by moving LOAD_OP/BLEND_OP
into the loop itself, thus avoiding the need to duplicate it.

While the stack space still isn't great (>0.5K) it is at least
in the same ball park as the amount of stack used for our C sha1
implementation.

Note that this patch basically reverts to the original code so
the diff looks bigger than it really is.

Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Herbert Xu
2012-02-21 04:46:18 +0800
f334f7457 crypto: sha512 - Use binary and instead of modulus

commit 58d7d18b5268febb8b1391c6dffc8e2aaa751fcd upstream.

The previous patch used the modulus operator over a power of 2
unnecessarily which may produce suboptimal binary code. This
patch changes them to binary ands instead.
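
A small sketch of the equivalence being relied on, assuming a signed round counter (the function names are made up for illustration): for non-negative values, `i % 16` and `i & 15` select the same slot, but with a signed operand the compiler may have to account for negative remainders in the `%` form.

```c
/* Slot index via the remainder operator on a signed value. */
static unsigned int idx_mod(int i)
{
	return (unsigned int)(i % 16);
}

/* Slot index via a bit mask; identical to idx_mod() for i >= 0. */
static unsigned int idx_and(int i)
{
	return (unsigned int)i & 15u;
}
```

The two differ only for negative inputs, which do not occur for round numbers.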

Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Herbert Xu
2012-02-21 04:46:18 +0800

04 Feb, 2012

2 commits

64d4ed6a2 crypto: sha512 - reduce stack usage to safe number

commit 51fc6dc8f948047364f7d42a4ed89b416c6cc0a3 upstream.

For rounds 16--79, W[i] only depends on W[i - 2], W[i - 7], W[i - 15] and W[i - 16].
Consequently, keeping the whole W[80] array on the stack is unnecessary;
only 16 values are really needed.

Using W[16] instead of W[80] greatly reduces stack usage
(~750 bytes to ~340 bytes on x86_64).

Line by line explanation:
* BLEND_OP
array is "circular" now, all indexes have to be modulo 16.
Round number is positive, so remainder operation should be
without surprises.

* initial full message scheduling is trimmed to first 16 values which
come from data block, the rest is calculated before it's needed.

* original loop body is unrolled version of new SHA512_0_15 and
SHA512_16_79 macros, unrolling was done to not do explicit variable
renaming. Otherwise it's the very same code after preprocessing.
See sha1_transform() code which does the same trick.
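
The circular-buffer idea can be sketched in plain C as follows. This is an illustrative userspace version, not the kernel macros: the function names are invented here, the sigma functions follow the SHA-512 definition in FIPS 180, and the index is reduced with `& 15`, which matches "modulo 16" for the non-negative round numbers involved.

```c
#include <stdint.h>

/* Rotate right; callers only pass n in 1..63. */
static uint64_t rotr64(uint64_t x, unsigned int n)
{
	return (x >> n) | (x << (64 - n));
}

/* SHA-512 small sigma functions (FIPS 180). */
static uint64_t s0(uint64_t x) { return rotr64(x, 1) ^ rotr64(x, 8) ^ (x >> 7); }
static uint64_t s1(uint64_t x) { return rotr64(x, 19) ^ rotr64(x, 61) ^ (x >> 6); }

/* Reference form: full 80-entry schedule, w[0..15] filled in by the caller. */
static void schedule_full(uint64_t w[80])
{
	for (int t = 16; t < 80; t++)
		w[t] = s1(w[t - 2]) + w[t - 7] + s0(w[t - 15]) + w[t - 16];
}

/*
 * Circular form: slot (t & 15) still holds W[t - 16] when round t starts,
 * so adding the other three terms in place turns it into W[t].
 * Must be called for t = 16, 17, ... in order.
 */
static uint64_t schedule_step(uint64_t w[16], int t)
{
	w[t & 15] += s1(w[(t - 2) & 15]) + w[(t - 7) & 15] + s0(w[(t - 15) & 15]);
	return w[t & 15];
}
```

Calling `schedule_step()` for rounds 16 to 79 on a 16-entry array yields the same values as the 80-entry reference, while keeping only 128 bytes of schedule state live.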

Patch survives in-tree crypto test and original bugreport test
(ping flood with hmac(sha512)).

See FIPS 180-2 for SHA-512 definition
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf

Signed-off-by: Alexey Dobriyan
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Alexey Dobriyan
2012-02-04 01:21:33 +0800
1a2357930 crypto: sha512 - make it work, undo percpu message schedule

commit 84e31fdb7c797a7303e0cc295cb9bc8b73fb872d upstream.

commit f9e2bca6c22d75a289a349f869701214d63b5060
aka "crypto: sha512 - Move message schedule W[80] to static percpu area"
created global message schedule area.

If sha512_update() is ever entered twice concurrently, the hash will be
silently calculated incorrectly.
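
The failure mode can be imitated in userspace with a deliberately simplified model. Everything below is hypothetical: `expand()` and `fold()` stand in for filling and consuming the message schedule, and the "interruption" is simulated by simple call ordering rather than real concurrency.

```c
#include <stddef.h>
#include <stdint.h>

#define SCRATCH_WORDS 4

/* Stands in for the single global scratch area shared by all callers. */
static uint64_t shared_scratch[SCRATCH_WORDS];

/* Phase 1: fill the scratch area from the input (toy expansion). */
static void expand(uint64_t *scratch, uint64_t seed)
{
	for (size_t i = 0; i < SCRATCH_WORDS; i++)
		scratch[i] = seed * (i + 1);
}

/* Phase 2: consume the scratch area to produce a result. */
static uint64_t fold(const uint64_t *scratch)
{
	uint64_t acc = 0;

	for (size_t i = 0; i < SCRATCH_WORDS; i++)
		acc ^= scratch[i];
	return acc;
}

/* Caller A is interrupted by caller B between its two phases. */
static uint64_t a_interrupted_by_b(uint64_t *scratch_a, uint64_t *scratch_b,
				   uint64_t seed_a, uint64_t seed_b)
{
	expand(scratch_a, seed_a);	/* A starts */
	expand(scratch_b, seed_b);	/* B runs in between */
	return fold(scratch_a);		/* A finishes */
}
```

When both callers are handed `shared_scratch`, A folds B's data and returns a wrong value without any error. With separate on-stack buffers the result depends only on A's input.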

Probably the easiest way to notice incorrect hashes being calculated is
to run 2 ping floods over AH with hmac(sha512):

#!/usr/sbin/setkey -f
flush;
spdflush;
add IP1 IP2 ah 25 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000025;
add IP2 IP1 ah 52 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000052;
spdadd IP1 IP2 any -P out ipsec ah/transport//require;
spdadd IP2 IP1 any -P in ipsec ah/transport//require;

XfrmInStateProtoError will start ticking with -EBADMSG being returned
from ah_input(). This never happens with, say, hmac(sha1).

With patch applied (on BOTH sides), XfrmInStateProtoError does not tick
with multiple bidirectional ping flood streams like it doesn't tick
with SHA-1.

After this patch sha512_transform() will start using ~750 bytes of stack on x86_64.
This is OK for simple loads; for heavier loads, stack reduction will be done
separately.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Alexey Dobriyan
2012-02-04 01:21:33 +0800

12 Nov, 2011

1 commit

42a0ddcd4 Merge git://github.com/herbertx/crypto

* git://github.com/herbertx/crypto:
crypto: algapi - Fix build problem with NET disabled
crypto: user - Fix rwsem leak in crypto_user

Linus Torvalds
2011-11-12 09:40:02 +0800

11 Nov, 2011

1 commit

3acc84739 crypto: algapi - Fix build problem with NET disabled

The report functions use NLA_PUT so we need to ensure that NET
is enabled.

Reported-by: Luis Henriques
Signed-off-by: Herbert Xu

Herbert Xu
2011-11-11 06:57:06 +0800

07 Nov, 2011

1 commit

32aaeffbd Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux

* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include
net: sch_generic remove redundant use of
net: inet_timewait_sock doesnt need
...

Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h

Linus Torvalds
2011-11-07 11:44:47 +0800

02 Nov, 2011

2 commits

fb223c32b crypto: user - Fix rwsem leak in crypto_user

The list_empty case in crypto_alg_match() will return without calling
up_read() on crypto_alg_sem. We could do the "goto out" routine, but the
function will clearly do the right thing with that test simply removed.
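
The shape of the leak and of the fix can be sketched with a mock lock. The names below are invented for illustration, and a reader counter stands in for the real rwsem. The point is only that an early return between acquire and release leaves the lock held, and that an empty list is already handled by a loop that runs zero times.

```c
#include <stddef.h>

/* Mock read lock: just counts how many read holds are outstanding. */
struct mock_rwsem { int readers; };

static void mock_down_read(struct mock_rwsem *s) { s->readers++; }
static void mock_up_read(struct mock_rwsem *s)   { s->readers--; }

static struct mock_rwsem alg_sem;

/* Leaky shape: the empty-list shortcut returns with the lock still held. */
static int match_leaky(const int *list, size_t n, int want)
{
	int found = 0;

	mock_down_read(&alg_sem);
	if (n == 0)
		return 0;		/* BUG: skips mock_up_read() */
	for (size_t i = 0; i < n; i++)
		if (list[i] == want)
			found = 1;
	mock_up_read(&alg_sem);
	return found;
}

/* Fixed shape: no shortcut; an empty list simply never enters the loop. */
static int match_fixed(const int *list, size_t n, int want)
{
	int found = 0;

	mock_down_read(&alg_sem);
	for (size_t i = 0; i < n; i++)
		if (list[i] == want)
			found = 1;
	mock_up_read(&alg_sem);
	return found;
}
```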

Signed-off-by: Jonathan Corbet
Signed-off-by: Herbert Xu

Jonathan Corbet
2011-11-02 06:15:16 +0800
dc47d3810 Merge git://github.com/herbertx/crypto

* git://github.com/herbertx/crypto: (48 commits)
crypto: user - Depend on NET instead of selecting it
crypto: user - Add dependency on NET
crypto: talitos - handle descriptor not found in error path
crypto: user - Initialise match in crypto_alg_match
crypto: testmgr - add twofish tests
crypto: testmgr - add blowfish test-vectors
crypto: Make hifn_795x build depend on !ARCH_DMA_ADDR_T_64BIT
crypto: twofish-x86_64-3way - fix ctr blocksize to 1
crypto: blowfish-x86_64 - fix ctr blocksize to 1
crypto: whirlpool - count rounds from 0
crypto: Add userspace report for compress type algorithms
crypto: Add userspace report for cipher type algorithms
crypto: Add userspace report for rng type algorithms
crypto: Add userspace report for pcompress type algorithms
crypto: Add userspace report for nivaead type algorithms
crypto: Add userspace report for aead type algorithms
crypto: Add userspace report for givcipher type algorithms
crypto: Add userspace report for ablkcipher type algorithms
crypto: Add userspace report for blkcipher type algorithms
crypto: Add userspace report for ahash type algorithms
...

Linus Torvalds
2011-11-02 00:24:41 +0800

01 Nov, 2011

2 commits

5db017aa2 crypto: user - Depend on NET instead of selecting it

Selecting NET causes all sorts of issues, including a dependency
loop involving bluetooth. This patch makes it a dependency instead.

Signed-off-by: Herbert Xu

Herbert Xu
2011-11-01 09:12:43 +0800
4bb33cc89 crypto: add module.h to those files that are explicitly using it

Part of the include cleanups means that the implicit
inclusion of module.h via device.h is going away. So
fix things up in advance.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-11-01 07:31:11 +0800

26 Oct, 2011

1 commit

ea8bdfcff crypto: user - Add dependency on NET

Since the configuration interface relies on netlink we need to
select NET.

Signed-off-by: Herbert Xu

Herbert Xu
2011-10-26 23:15:10 +0800

21 Oct, 2011

24 commits

e6ea64ece crypto: user - Initialise match in crypto_alg_match

We need to default match to 0 as otherwise it may lead to a false
positive.

Signed-off-by: Herbert Xu

Herbert Xu
2011-10-21 20:37:10 +0800
573da6208 crypto: testmgr - add twofish tests

Add tests for parallel twofish-x86_64-3way code paths.

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:28:58 +0800
85b63e342 crypto: testmgr - add blowfish test-vectors

Add tests for parallel blowfish-x86_64 code paths.

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:28:58 +0800
ac4385d25 crypto: whirlpool - count rounds from 0

rc[0] is unused because rounds are counted from 1.
Save a u64!

Signed-off-by: Alexey Dobriyan
Signed-off-by: Herbert Xu

Alexey Dobriyan
2011-10-21 20:24:16 +0800
540b97c1d crypto: Add userspace report for compress type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:12 +0800
07a5fa4ab crypto: Add userspace report for cipher type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:07 +0800
792608e9c crypto: Add userspace report for rng type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:06 +0800
a55465dca crypto: Add userspace report for pcompress type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:06 +0800
b735d0a91 crypto: Add userspace report for nivaead type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:06 +0800
6ad414fe7 crypto: Add userspace report for aead type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:06 +0800
3e29c1095 crypto: Add userspace report for givcipher type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:05 +0800
29ffc8764 crypto: Add userspace report for ablkcipher type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:05 +0800
50496a1fa crypto: Add userspace report for blkcipher type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:05 +0800
6238cbaec crypto: Add userspace report for ahash type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:04 +0800
f4d663ce6 crypto: Add userspace report for shash type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:04 +0800
6c5a86f52 crypto: Add userspace report for larval type algorithms

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:04 +0800
b6aa63c09 crypto: Add a report function pointer to crypto_type

We add a report function pointer to struct crypto_type. This function
pointer is used from the crypto userspace configuration API to report
crypto algorithms to userspace.
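
The general pattern, a per-type callback that a generic reporting path dispatches through, might be sketched like this. All `demo_*` names and the text-buffer signature are assumptions for illustration; the kernel's actual callback signature is not given in this message.

```c
#include <stddef.h>
#include <stdio.h>

struct demo_alg;

/* Per-type operations; report() describes one algorithm of this type. */
struct demo_type {
	int (*report)(char *buf, size_t len, const struct demo_alg *alg);
};

struct demo_alg {
	const char *name;
	const struct demo_type *type;
};

/* Hypothetical report callback for a "hash" type. */
static int demo_report_hash(char *buf, size_t len, const struct demo_alg *alg)
{
	int n = snprintf(buf, len, "type=hash name=%s", alg->name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

static const struct demo_type demo_hash_type = { .report = demo_report_hash };

/* Generic path: dispatch through the type, failing if it has no callback. */
static int demo_report(char *buf, size_t len, const struct demo_alg *alg)
{
	if (!alg->type || !alg->type->report)
		return -1;
	return alg->type->report(buf, len, alg);
}
```

Keeping the callback in the type structure lets each algorithm class describe itself without the generic code knowing any type-specific fields.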

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:03 +0800
a38f7907b crypto: Add userspace configuration API

This patch adds a basic userspace configuration API for the crypto layer.
With this it is possible to instantiate, remove and to show crypto
algorithms from userspace.

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:03 +0800
22e5b20be crypto: Export crypto_remove_final

The upcoming crypto userspace configuration API needs
to remove the spawns on top of an algorithm, so export
crypto_remove_final.

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:03 +0800
89b596ba2 crypto: Export crypto_remove_spawns

The upcoming crypto userspace configuration API needs
to remove the spawns on top of an algorithm, so export
crypto_remove_spawns.

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:03 +0800
64a947b13 crypto: Add a flag to identify crypto instances

The upcoming crypto user configuration API needs to identify
crypto instances. This patch adds a flag that is set if the
algorithm is an instance that is built from templates.

Signed-off-by: Steffen Klassert
Signed-off-by: Herbert Xu

Steffen Klassert
2011-10-21 20:24:01 +0800
8280daad4 crypto: twofish - add 3-way parallel x86_64 assembler implemention

Patch adds 3-way parallel x86_64 assembly implementation of twofish as a new
module. The new assembler functions process data in three-block chunks, improving
cipher performance on out-of-order CPUs.

Patch has been tested with tcrypt and automated filesystem tests.

Summary of the tcrypt benchmarks:

Twofish 3-way-asm vs twofish asm (128bit 8kb block ECB)
encrypt: 1.3x speed
decrypt: 1.3x speed

Twofish 3-way-asm vs twofish asm (128bit 8kb block CBC)
encrypt: 1.07x speed
decrypt: 1.4x speed

Twofish 3-way-asm vs twofish asm (128bit 8kb block CTR)
encrypt: 1.4x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block ECB)
encrypt: 1.0x speed
decrypt: 1.0x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block CBC)
encrypt: 0.84x speed
decrypt: 1.09x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block CTR)
encrypt: 1.15x speed

Full output:
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-3way-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-aes-asm-x86_64.txt

Tests were run on:
vendor_id : AuthenticAMD
cpu family : 16
model : 10
model name : AMD Phenom(tm) II X6 1055T Processor

Userspace tests were also run on:
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU E7330 @ 2.40GHz
stepping : 11

Userspace test results:

Encryption/decryption of twofish 3-way vs x86_64-asm on AMD Phenom II:
encrypt: 1.27x
decrypt: 1.25x

Encryption/decryption of twofish 3-way vs x86_64-asm on Intel Xeon E7330:
encrypt: 1.36x
decrypt: 1.36x

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:23:08 +0800
ee5002a54 crypto: tcrypt - add ctr(twofish) speed test

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-10-21 20:23:08 +0800
7ed47b7d1 crypto: ghash - Avoid null pointer dereference if no key is set

The ghash_update function passes a pointer to gf128mul_4k_lle which will
be NULL if ghash_setkey is not called or if the most recent call to
ghash_setkey failed to allocate memory. This causes an oops. Fix this
up by returning an error code in the null case.

This is trivially triggered from unprivileged userspace through the
AF_ALG interface by simply writing to the socket without setting a key.

The ghash_final function has a similar issue, but triggering it requires
a memory allocation failure in ghash_setkey _after_ at least one
successful call to ghash_update.
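
A minimal sketch of the guard described above, using made-up names and a toy accumulator in place of the real GF(2^128) multiplication. The message only says "an error code"; `-EINVAL` below is a placeholder assumption, not necessarily the value the patch returns.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical context: key_table stays NULL until setkey has succeeded. */
struct demo_ghash_ctx {
	const unsigned char *key_table;
};

/*
 * Toy update step. The important part is the NULL check up front:
 * without it, the first use of key_table would dereference NULL.
 */
static int demo_ghash_update(struct demo_ghash_ctx *ctx,
			     const unsigned char *src, size_t len,
			     unsigned char *acc)
{
	if (!ctx->key_table)
		return -EINVAL;	/* placeholder error code (assumption) */

	for (size_t i = 0; i < len; i++)
		*acc ^= src[i] ^ ctx->key_table[0];
	return 0;
}
```

Checking in the update and final paths, rather than only at setkey time, covers both the "never keyed" case and the case where a later setkey failed and left the table unset.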

BUG: unable to handle kernel NULL pointer dereference at 00000670
IP: [] gf128mul_4k_lle+0x23/0x60 [gf128mul]
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ghash_generic gf128mul algif_hash af_alg nfs lockd nfs_acl sunrpc bridge ipv6 stp llc

Pid: 1502, comm: hashatron Tainted: G W 3.1.0-rc9-00085-ge9308cf #32 Bochs Bochs
EIP: 0060:[] EFLAGS: 00000202 CPU: 0
EIP is at gf128mul_4k_lle+0x23/0x60 [gf128mul]
EAX: d69db1f0 EBX: d6b8ddac ECX: 00000004 EDX: 00000000
ESI: 00000670 EDI: d6b8ddac EBP: d6b8ddc8 ESP: d6b8dda4
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process hashatron (pid: 1502, ti=d6b8c000 task=d6810000 task.ti=d6b8c000)
Stack:
00000000 d69db1f0 00000163 00000000 d6b8ddc8 c101a520 d69db1f0 d52aa000
00000ff0 d6b8dde8 d88d310f d6b8a3f8 d52aa000 00001000 d88d502c d6b8ddfc
00001000 d6b8ddf4 c11676ed d69db1e8 d6b8de24 c11679ad d52aa000 00000000
Call Trace:
[] ? kmap_atomic_prot+0x37/0xa6
[] ghash_update+0x85/0xbe [ghash_generic]
[] crypto_shash_update+0x18/0x1b
[] shash_ahash_update+0x22/0x36
[] shash_async_update+0xb/0xd
[] hash_sendpage+0xba/0xf2 [algif_hash]
[] kernel_sendpage+0x39/0x4e
[] ? 0xd88cdfff
[] sock_sendpage+0x37/0x3e
[] ? kernel_sendpage+0x4e/0x4e
[] pipe_to_sendpage+0x56/0x61
[] splice_from_pipe_feed+0x58/0xcd
[] ? splice_from_pipe_begin+0x10/0x10
[] __splice_from_pipe+0x36/0x55
[] ? splice_from_pipe_begin+0x10/0x10
[] splice_from_pipe+0x51/0x64
[] ? default_file_splice_write+0x2c/0x2c
[] generic_splice_sendpage+0x13/0x15
[] ? splice_from_pipe_begin+0x10/0x10
[] do_splice_from+0x5d/0x67
[] sys_splice+0x2bf/0x363
[] ? sysenter_exit+0xf/0x16
[] ? trace_hardirqs_on_caller+0x10e/0x13f
[] sysenter_do_call+0x12/0x32
Code: 83 c4 0c 5b 5e 5f c9 c3 55 b9 04 00 00 00 89 e5 57 8d 7d e4 56 53 8d 5d e4 83 ec 18 89 45 e0 89 55 dc 0f b6 70 0f c1 e6 04 01 d6 a5 be 0f 00 00 00 4e 89 d8 e8 48 ff ff ff 8b 45 e0 89 da 0f
EIP: [] gf128mul_4k_lle+0x23/0x60 [gf128mul] SS:ESP 0068:d6b8dda4
CR2: 0000000000000670
---[ end trace 4eaa2a86a8e2da24 ]---
note: hashatron[1502] exited with preempt_count 1
BUG: scheduling while atomic: hashatron/1502/0x10000002
INFO: lockdep is turned off.
[...]

Signed-off-by: Nick Bowler
Cc: stable@kernel.org [2.6.37+]
Signed-off-by: Herbert Xu

Nick Bowler
2011-10-21 19:18:42 +0800

22 Sep, 2011

3 commits

64b94ceae crypto: blowfish - add x86_64 assembly implementation

Patch adds x86_64 assembly implementation of blowfish. Two sets of assembler
functions are provided. The first set is regular 'one block at a time'
encrypt/decrypt functions. The second is 'four blocks at a time' functions that
gain a performance increase on out-of-order CPUs. Performance of the 4-way
functions should be equal to the 1-way functions on in-order CPUs.
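
The structure of such a multi-block path can be sketched in C with a toy round function. Nothing below is Blowfish: `toy_round()` is an arbitrary stand-in that only reproduces the serial dependency between rounds, and all names are hypothetical. The 4-way variant advances four independent blocks per round so that an out-of-order core has independent work to overlap.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ROUNDS 16

/* Toy mixing step; NOT Blowfish, just a round with a serial dependency. */
static uint32_t toy_round(uint32_t x, uint32_t k)
{
	x ^= k;
	x = (x << 5) | (x >> 27);
	return x * 2654435761u;
}

/* One block at a time: each round waits on the previous one. */
static uint32_t toy_encrypt_1way(uint32_t x)
{
	for (uint32_t r = 0; r < TOY_ROUNDS; r++)
		x = toy_round(x, r);
	return x;
}

/* Four blocks at a time: four independent dependency chains per round. */
static void toy_encrypt_4way(uint32_t b[4])
{
	uint32_t b0 = b[0], b1 = b[1], b2 = b[2], b3 = b[3];

	for (uint32_t r = 0; r < TOY_ROUNDS; r++) {
		b0 = toy_round(b0, r);
		b1 = toy_round(b1, r);
		b2 = toy_round(b2, r);
		b3 = toy_round(b3, r);
	}
	b[0] = b0; b[1] = b1; b[2] = b2; b[3] = b3;
}

/* ECB-style driver: use the 4-way path while possible, then finish 1-way. */
static void toy_encrypt_blocks(uint32_t *blocks, size_t n)
{
	size_t i = 0;

	for (; i + 4 <= n; i += 4)
		toy_encrypt_4way(blocks + i);
	for (; i < n; i++)
		blocks[i] = toy_encrypt_1way(blocks[i]);
}
```

This only works for modes where blocks are independent of each other, which is consistent with the benchmark summary: CBC encryption, where each block depends on the previous ciphertext, shows the smallest gain.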

Summary of the tcrypt benchmarks:

Blowfish assembler vs blowfish C (256bit 8kb block ECB)
encrypt: 2.2x speed
decrypt: 2.3x speed

Blowfish assembler vs blowfish C (256bit 8kb block CBC)
encrypt: 1.12x speed
decrypt: 2.5x speed

Blowfish assembler vs blowfish C (256bit 8kb block CTR)
encrypt: 2.5x speed

Full output:
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt

Tests were run on:
vendor_id : AuthenticAMD
cpu family : 16
model : 10
model name : AMD Phenom(tm) II X6 1055T Processor
stepping : 0

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-09-22 19:25:26 +0800
7d47b86cf crypto: tcrypt - add ctr(blowfish) speed test

Add ctr(blowfish) speed test to receive results for blowfish x86_64 assembly
patch.

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-09-22 19:25:26 +0800
3f2a5d2d4 crypto: blowfish - rename C-version to blowfish_generic

Rename blowfish to blowfish_generic so that assembler versions of blowfish
cipher can autoload. Module alias 'blowfish' is added.

Also fix checkpatch warnings.

Signed-off-by: Jussi Kivilinna
Signed-off-by: Herbert Xu

Jussi Kivilinna
2011-09-22 19:25:26 +0800