23 Jan, 2017
1 commit
-
A lot of asm-optimized routines in arch/x86/crypto/ keep their
constants in .data. This is wrong; they should be in .rodata.

Many of these constants are the same in different modules.
For example, the 128-bit shuffle mask 0x000102030405060708090A0B0C0D0E0F
exists in at least half a dozen places.

There is a way to let the linker merge them and use just one copy.
The rules are as follows: mergeable objects of different sizes
should not share sections. You can't put them all in one .rodata
section; they will lose "mergeability".

GCC puts its mergeable constants in ".rodata.cstSIZE" sections,
or ".rodata.cstSIZE.<object_name>" if -fdata-sections is used.
This patch does the same:

.section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16

It is important that all data in such a section consists of
16-byte elements, not larger ones, and that there is no implicit
use of one element from another.

When this is not the case, use a non-mergeable section:

.section .rodata[.VAR_NAME], "a", @progbits
This reduces .data by ~15 kbytes:

    text    data     bss      dec     hex filename
11097415 2705840 2630712 16433967  fac32f vmlinux-prev.o
11112095 2690672 2630712 16433479  fac147 vmlinux.o

Merged objects are visible in System.map:
ffffffff81a28810 r POLY
ffffffff81a28810 r POLY
ffffffff81a28820 r TWOONE
ffffffff81a28820 r TWOONE
ffffffff81a28830 r PSHUFFLE_BYTE_FLIP_MASK
CC: Herbert Xu
CC: Josh Poimboeuf
CC: Xiaodong Liu
CC: Megha Dey
CC: linux-crypto@vger.kernel.org
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu
24 Feb, 2016
1 commit
-
The crypto code has several callable non-leaf functions which don't
honor CONFIG_FRAME_POINTER, which can result in bad stack traces.

Create stack frames for them when CONFIG_FRAME_POINTER is enabled.
Signed-off-by: Josh Poimboeuf
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Arnaldo Carvalho de Melo
Cc: Bernd Petrovitsch
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Chris J Arges
Cc: David S. Miller
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Herbert Xu
Cc: Jiri Slaby
Cc: Linus Torvalds
Cc: Michal Marek
Cc: Namhyung Kim
Cc: Pedro Alves
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6c20192bcf1102ae18ae5a242cabf30ce9b29895.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar
04 Apr, 2014
1 commit
-
The internal key isn't actually in big-endian format so let's switch
to u128 which also happens to allow us to remove a sparse warning.

Based on suggestion by Ard Biesheuvel.
Signed-off-by: Herbert Xu
Acked-by: Ard Biesheuvel
01 Apr, 2014
1 commit
-
The GHASH setkey() function uses SSE registers but fails to call
kernel_fpu_begin()/kernel_fpu_end(). Instead of adding these calls, and
then having to deal with the restriction that they cannot be called from
interrupt context, move the setkey() implementation to the C domain.

Note that setkey() does not use any particular SSE features and is not
expected to become a performance bottleneck.
Signed-off-by: Ard Biesheuvel
Acked-by: H. Peter Anvin
Fixes: 0e1227d356e9b (crypto: ghash - Add PCLMULQDQ accelerated implementation)
Signed-off-by: Herbert Xu
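As an illustration of what a setkey() in plain C might look like, here is a minimal userspace sketch. It is modeled on the general shape of the GHASH key preprocessing (load the 16-byte key as two big-endian 64-bit words, shift the 128-bit value left by one bit with wrap-around, and fold in a reduction constant when the top bit was set). The names `u128_sketch`, `load_be64` and `ghash_setkey_sketch` are hypothetical, and the field order and the 0xc2 constant are assumptions that should be checked against the actual kernel source rather than taken from here.

```c
#include <stdint.h>

/* Hypothetical stand-in for a u128 key type: two native-endian 64-bit words. */
struct u128_sketch {
    uint64_t a;
    uint64_t b;
};

/* Read 8 bytes as a big-endian 64-bit integer, independent of host endianness. */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Sketch of the key preprocessing using only integer operations,
 * so no SSE registers (and no FPU context handling) are involved.
 */
static struct u128_sketch ghash_setkey_sketch(const unsigned char key[16])
{
    uint64_t a = load_be64(key);      /* first 8 key bytes */
    uint64_t b = load_be64(key + 8);  /* last 8 key bytes */
    struct u128_sketch out;

    /* Rotate the 128-bit value (a:b) left by one bit; the halves are
     * stored swapped, which is an assumption about the layout the
     * assembly side expects. */
    out.a = (b << 1) | (a >> 63);
    out.b = (a << 1) | (b >> 63);

    /* If the bit shifted out of the top was set, fold in the
     * reduction constant (assumed value). */
    if (a >> 63)
        out.b ^= (uint64_t)0xc2 << 56;

    return out;
}
```

With this sketch, a key whose only set bit is the top bit of the first byte gives a = 1 and b = 0xc200000000000000, while a key whose only set bit is the lowest bit of the last byte gives a = 2 and b = 0.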
20 Jan, 2013
1 commit
-
Signed-off-by: Jussi Kivilinna
Acked-by: David S. Miller
Signed-off-by: Herbert Xu
23 Nov, 2009
2 commits
-
Lbswap_mask, Lpoly and Ltwo_one should clearly belong to the
.data section, not .text.
Signed-off-by: Jiri Kosina
Signed-off-by: Herbert Xu
-
Old binutils do not support PCLMULQDQ-NI and PSHUFB. So that the kernel
can still be compiled with them, raw .byte code is used instead of
assembly instructions. But the readability and flexibility of raw .byte
code are poor.

So gas macros that look like the corresponding assembly instructions are
used instead.
Signed-off-by: Huang Ying
Signed-off-by: Herbert Xu
03 Nov, 2009
1 commit
-
Add PSHUFB macros instead of repeating byte sequences, suggested
by Ingo.
Signed-off-by: Herbert Xu
Acked-by: Ingo Molnar
02 Nov, 2009
1 commit
-
Old gases don't have a clue what pshufb stands for so we have
to hard-code it for now.
Reported-by: Andrew Morton
Signed-off-by: Herbert Xu
19 Oct, 2009
1 commit
-
PCLMULQDQ is used to accelerate the most time-consuming part of GHASH,
carry-less multiplication. More information about PCLMULQDQ can be
found at:

http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/

Because PCLMULQDQ changes XMM state, its usage must be enclosed with
kernel_fpu_begin/end, which can be used only in process context. The
acceleration is therefore implemented as crypto_ahash; that is, requests
in soft IRQ context will be deferred to the cryptd kernel thread.
Signed-off-by: Huang Ying
Signed-off-by: Herbert Xu
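To make "carry-less multiplication" concrete, here is a small portable C sketch of the 64x64 -> 128-bit operation that PCLMULQDQ performs in a single instruction. It is only a bit-by-bit software illustration, not the kernel's implementation; the names `clmul128` and `clmul64` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical holder for the 128-bit carry-less product. */
struct clmul128 {
    uint64_t lo;  /* bits 0..63 of the product */
    uint64_t hi;  /* bits 64..127 of the product */
};

/*
 * Carry-less multiply: like schoolbook binary multiplication, but the
 * partial products are combined with XOR instead of addition, so no
 * carries propagate between bit positions.
 */
static struct clmul128 clmul64(uint64_t a, uint64_t b)
{
    struct clmul128 r = {0, 0};

    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            r.lo ^= a << i;
            /* Bits of a that are shifted past bit 63 land in the high
             * word; skip i == 0 to avoid an undefined 64-bit shift. */
            if (i)
                r.hi ^= a >> (64 - i);
        }
    }
    return r;
}
```

For example, clmul64(3, 3) yields lo = 5 and hi = 0, whereas ordinary integer multiplication gives 9; the difference is exactly the suppressed carry.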