09 Jan, 2020

1 commit

  • The CRYPTO_TFM_RES_BAD_KEY_LEN flag was apparently meant as a way to
    make the ->setkey() functions provide more information about errors.

    However, no one actually checks for this flag, which makes it pointless.

    Also, many algorithms fail to set this flag when given a bad length key.
    Reviewing just the generic implementations, this is the case for
    aes-fixed-time, cbcmac, echainiv, nhpoly1305, pcrypt, rfc3686, rfc4309,
    rfc7539, rfc7539esp, salsa20, seqiv, and xcbc. But there are probably
    many more in arch/*/crypto/ and drivers/crypto/.

    Some algorithms can even set this flag when the key is the correct
    length. For example, authenc and authencesn set it when the key payload
    is malformed in any way (not just a bad length), the atmel-sha and ccree
    drivers can set it if a memory allocation fails, and the chelsio driver
    sets it for bad auth tag lengths, not just bad key lengths.

    So even if someone actually wanted to start checking this flag (which
    seems unlikely, since it's been unused for a long time), there would be
    a lot of work needed to get it working correctly. But it would probably
    be much better to go back to the drawing board and just define different
    return values, like -EINVAL if the key is invalid for the algorithm vs.
    -EKEYREJECTED if the key was rejected by a policy like "no weak keys".
    That would be much simpler, less error-prone, and easier to test.

    So just remove this flag.
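
    A minimal sketch of the return-value scheme proposed above (the
    key_is_weak() helper is hypothetical, purely for illustration):

        #include <crypto/skcipher.h>

        static int example_setkey(struct crypto_skcipher *tfm,
                                  const u8 *key, unsigned int keylen)
        {
                /* Key invalid for the algorithm. */
                if (keylen != 16 && keylen != 24 && keylen != 32)
                        return -EINVAL;

                /* Key rejected by a policy like "no weak keys"
                 * (key_is_weak() is a hypothetical helper). */
                if (key_is_weak(key, keylen))
                        return -EKEYREJECTED;

                /* ... expand and store the key in the tfm context ... */
                return 0;
        }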

    Signed-off-by: Eric Biggers
    Reviewed-by: Horia Geantă
    Signed-off-by: Herbert Xu

    Eric Biggers
     

11 Dec, 2019

1 commit

  • The crypto glue performed function prototype casting via macros to make
    indirect calls to assembly routines. Instead of performing casts at the
    call sites (which trips Control Flow Integrity prototype checking), switch
    each prototype to a common standard set of arguments which allows the
    removal of the existing macros. In order to keep pointer math unchanged,
    internal casting between u128 pointers and u8 pointers is added.
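
    A simplified illustration of the change (names abbreviated): the old
    macros cast each asm routine's prototype to a generic one at the call
    site, which CFI's prototype checking rejects; now every routine shares
    one standard prototype, and any u128 pointer math is handled by casts
    inside the C wrappers:

        /* Before (simplified): prototype cast at the call site trips CFI. */
        asmlinkage void twofish_ecb_enc_8way(struct twofish_ctx *ctx,
                                             u128 *dst, const u128 *src);
        #define TF_ENC_8WAY ((common_glue_func_t)twofish_ecb_enc_8way)

        /* After (simplified): one standard set of arguments, no cast
         * needed where the function is called. */
        asmlinkage void twofish_ecb_enc_8way(const void *ctx, u8 *dst,
                                             const u8 *src);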

    Co-developed-by: João Moreira
    Signed-off-by: João Moreira
    Signed-off-by: Kees Cook
    Reviewed-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Kees Cook
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details you
    should have received a copy of the gnu general public license along
    with this program if not write to the free software foundation inc
    59 temple place suite 330 boston ma 02111 1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 1334 file(s).
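
    For reference, the one-line identifier that replaces the boilerplate
    looks like this at the top of a C source file (headers use the
    /* ... */ comment form):

        // SPDX-License-Identifier: GPL-2.0-or-later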

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Richard Fontana
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

03 Mar, 2018

4 commits

  • Convert the AVX implementation of Twofish from the (deprecated)
    ablkcipher and blkcipher interfaces over to the skcipher interface.
    Note that this includes replacing the use of ablk_helper with
    crypto_simd.
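
    The rough shape of the conversion (fields abbreviated; the actual
    patch registers several modes, and the helper names here follow the
    common pattern rather than the patch verbatim):

        #include <crypto/skcipher.h>
        #include <crypto/twofish.h>

        static struct skcipher_alg twofish_algs[] = { {
                .base.cra_name          = "__ecb(twofish)",
                .base.cra_driver_name   = "__ecb-twofish-avx",
                .base.cra_flags         = CRYPTO_ALG_INTERNAL,
                .base.cra_blocksize     = TF_BLOCK_SIZE,
                .min_keysize            = TF_MIN_KEY_SIZE,
                .max_keysize            = TF_MAX_KEY_SIZE,
                .setkey                 = twofish_setkey_skcipher,
                .encrypt                = ecb_encrypt,
                .decrypt                = ecb_decrypt,
        }, /* ... cbc, ctr, lrw, xts entries ... */ };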

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
     
  • The LRW template now wraps an ECB mode algorithm rather than the block
    cipher directly. Therefore it is now redundant for crypto modules to
    wrap their ECB code with generic LRW code themselves via lrw_crypt().

    Remove the lrw-twofish-avx algorithm which did this. Users who request
    lrw(twofish) and previously would have gotten lrw-twofish-avx will now
    get lrw(ecb-twofish-avx) instead, which is just as fast.
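
    The change is transparent to crypto API users, who request the
    template by name either way; a minimal sketch:

        #include <crypto/skcipher.h>

        /* "lrw(twofish)" now instantiates the generic LRW template
         * around the AVX ECB implementation automatically. */
        struct crypto_skcipher *tfm =
                crypto_alloc_skcipher("lrw(twofish)", 0, 0);

        if (IS_ERR(tfm))
                return PTR_ERR(tfm);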

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
     
  • The XTS template now wraps an ECB mode algorithm rather than the block
    cipher directly. Therefore it is now redundant for crypto modules to
    wrap their ECB code with generic XTS code themselves via xts_crypt().

    Remove the xts-twofish-3way algorithm which did this. Users who request
    xts(twofish) and previously would have gotten xts-twofish-3way will now
    get xts(ecb-twofish-3way) instead, which is just as fast.

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
     
  • The LRW template now wraps an ECB mode algorithm rather than the block
    cipher directly. Therefore it is now redundant for crypto modules to
    wrap their ECB code with generic LRW code themselves via lrw_crypt().

    Remove the lrw-twofish-3way algorithm which did this. Users who request
    lrw(twofish) and previously would have gotten lrw-twofish-3way will now
    get lrw(ecb-twofish-3way) instead, which is just as fast.

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
     

24 Sep, 2015

1 commit

  • Hand in &feature_name to cpu_has_xfeatures(), as that function
    expects. This fixes an uninitialized-variable warning.

    Signed-off-by: Borislav Petkov
    Cc: Dave Hansen
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: brgerst@gmail.com
    Cc: dvlasenk@redhat.com
    Cc: fenghua.yu@intel.com
    Cc: luto@amacapital.net
    Cc: tim.c.chen@linux.intel.com
    Fixes: d91cab78133d ("x86/fpu: Rename XSAVE macros")
    Link: http://lkml.kernel.org/r/20150923104901.GA3538@pd.tnic
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     

14 Sep, 2015

1 commit

  • There are two concepts that have some confusing naming:

    1. Extended State Component numbers (currently called XFEATURE_BIT_*)
    2. Extended State Component masks (currently called XSTATE_*)

    The numbers are (currently) from 0-9. State component 3 is the bounds
    registers for MPX, for instance.

    But when we want to enable "state component 3", we go set a bit in
    XCR0. The bit we set is 1<<3.
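
    The relationship between the two, sketched with the MPX
    bounds-registers component as the example (definitions illustrative):

        /* State component number vs. the XCR0 mask bit it implies: */
        #define XFEATURE_BIT_BNDREGS    3
        #define XSTATE_BNDREGS          (1 << XFEATURE_BIT_BNDREGS)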
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: Fenghua Yu
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: dave@sr71.net
    Cc: linux-kernel@vger.kernel.org
    Link: http://lkml.kernel.org/r/20150902233126.38653250@viggo.jf.intel.com
    [ Ported to v4.3-rc1. ]
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

19 May, 2015

4 commits

  • Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

    This has the following advantages for the driver:

    - Decouples the driver from FPU internals: it's now only using
      <asm/fpu/api.h>.

    - Removes detection complexity from the driver: no more raw XGETBV
      instruction usage.

    - Shrinks the code a bit.

    - Standardizes feature-name error message printouts across drivers.

    There are also advantages for the x86 FPU code: once all drivers are
    decoupled from the FPU internals, we can move those internals out of
    common headers, and we'll also be able to remove xcr.h.
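
    The resulting detection code in such a driver looks roughly like this
    (using the mask names of that era; they were later renamed to
    XFEATURE_MASK_*):

        const char *feature_name;

        if (!cpu_has_xfeatures(XSTATE_SSE | XSTATE_YMM, &feature_name)) {
                pr_info("CPU feature '%s' is not supported.\n",
                        feature_name);
                return -ENODEV;
        }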

    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: Fenghua Yu
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • 'xsave' is an x86 instruction name to most people - but xsave.h is
    about a lot more than just the XSAVE instruction: it includes
    definitions and support code, both internal and external, for xstate
    and xfeatures.

    As a first step in cleaning up the various xstate uses rename this
    header to 'fpu/xstate.h' to better reflect what this header file
    is about.

    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: Fenghua Yu
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Move the xsave.h header file to the FPU directory as well.

    Reviewed-by: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Fenghua Yu
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • We already have fpu/types.h, move i387.h to fpu/api.h.

    The file name has become a misnomer anyway: it offers generic FPU APIs,
    but is not limited to i387 functionality.

    Reviewed-by: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Fenghua Yu
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

21 Jun, 2013

1 commit

  • This reverts commit cf1521a1a5e21fd1e79a458605c4282fbfbbeee2.

    The instruction (vpgatherdd) that this implementation relied on turned
    out to be a slow performer on real hardware (i5-4570). The previous
    8-way twofish/AVX implementation is therefore faster, and this
    implementation should be removed.

    Converting this implementation to use the same method for table
    look-ups as in twofish/AVX would give an additional ~3% speed-up vs.
    twofish/AVX, but would hardly be worth the added code and binary size.

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     

25 Apr, 2013

2 commits

  • This patch adds an AVX2/x86-64 implementation of the Twofish cipher,
    requiring 16 parallel blocks of input (256 bytes). Table look-ups are
    performed with the vpgatherdd instruction directly from vector
    registers and thus should be faster than in earlier implementations.
    The implementation also uses the 256-bit wide YMM registers, which
    should give an additional speed-up compared to the AVX implementation.
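
    With compiler intrinsics, the gather-based table look-up described
    here amounts to something like the following (illustrative only, not
    the patch's actual assembly):

        #include <immintrin.h>

        /* Eight 32-bit S-box look-ups at once: dst[i] = table[idx[i]].
         * This compiles to a single vpgatherdd instruction. */
        static __m256i gather8(const unsigned int *table, __m256i idx)
        {
                return _mm256_i32gather_epi32((const int *)table, idx, 4);
        }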

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     
  • Change twofish-avx to use the new XTS code, for smaller stack usage
    and a small boost to performance.

    tcrypt results, with an Intel i5-2450M:

        size     enc      dec
        16B      1.03x    1.02x
        64B      0.91x    0.91x
        256B     1.10x    1.09x
        1024B    1.12x    1.11x
        8192B    1.12x    1.11x

    Since XTS is practically always used with data blocks of 512 bytes or
    more, I chose not to make use of twofish-3way for blocks smaller than
    128 bytes. This causes a slower result in tcrypt for 64-byte blocks.

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     

24 Oct, 2012

2 commits

  • Introduce new assembler functions to avoid the use of temporary stack
    buffers in the glue code. This also allows the use of vector
    instructions for XOR-ing output in CTR and CBC modes and for
    constructing IVs in CTR mode.

    ECB mode sees a ~0.2% decrease in speed because one extra function
    call was added. CBC mode decryption and CTR mode benefit from the
    vector operations and gain ~3%.

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     
  • The 'u128' type currently used for CTR mode is, on little-endian
    systems, a 'long long'-swapped representation and would require extra
    swap operations in SSE/AVX code. Using le128 instead of u128 allows
    the IV calculations to be done more easily with vector registers.
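
    A sketch of the distinction (layout and helper shown as assumptions,
    following include/crypto/b128ops.h conventions): le128 stores the low
    64-bit word first, so a 128-bit CTR increment maps naturally onto
    little-endian vector loads:

        typedef struct { __le64 b, a; } le128;  /* low word 'b' first */

        /* 128-bit little-endian counter increment for CTR mode. */
        static inline void le128_inc(le128 *i)
        {
                u64 a = le64_to_cpu(i->a);      /* high 64 bits */
                u64 b = le64_to_cpu(i->b);      /* low 64 bits */

                b++;
                if (!b)                         /* carry into high word */
                        a++;

                i->a = cpu_to_le64(a);
                i->b = cpu_to_le64(b);
        }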

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     

12 Jun, 2012

1 commit

  • This patch adds an x86_64/AVX assembler implementation of the Twofish
    block cipher. The implementation processes eight blocks in parallel
    (two 4-block-chunk AVX operations). The table look-ups are done in
    general-purpose registers. For small input sizes the 3-way-parallel
    functions from the twofish-x86_64-3way module are called. A good
    performance increase is provided for block sizes greater than or
    equal to 128 bytes.
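
    The dispatch this describes looks roughly like the following (function
    names assumed for illustration):

        #include <crypto/twofish.h>

        static void twofish_enc_blks(struct twofish_ctx *ctx, u8 *dst,
                                     const u8 *src, unsigned int nblocks)
        {
                /* 8-way AVX path for full batches of eight blocks. */
                while (nblocks >= 8) {
                        twofish_enc_blk_8way(ctx, dst, src);
                        src += 8 * TF_BLOCK_SIZE;
                        dst += 8 * TF_BLOCK_SIZE;
                        nblocks -= 8;
                }

                /* Tail blocks fall back to the twofish-x86_64-3way
                 * routines (shown block-by-block for simplicity). */
                while (nblocks--) {
                        twofish_enc_blk(ctx, dst, src);
                        src += TF_BLOCK_SIZE;
                        dst += TF_BLOCK_SIZE;
                }
        }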

    The patch has been tested with tcrypt and automated filesystem tests.

    Tcrypt benchmark results:

    Intel Core i5-2500 CPU (fam:6, model:42, step:7)

    twofish-avx-x86_64 vs. twofish-x86_64-3way
    128bit key: (lrw:256bit) (xts:256bit)

    size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
    16B     0.96x   0.97x   1.00x   0.95x   0.97x   0.97x   0.96x   0.95x   0.95x   0.98x
    64B     0.99x   0.99x   1.00x   0.99x   0.98x   0.98x   0.99x   0.98x   0.99x   0.98x
    256B    1.20x   1.21x   1.00x   1.19x   1.15x   1.14x   1.19x   1.20x   1.18x   1.19x
    1024B   1.29x   1.30x   1.00x   1.28x   1.23x   1.24x   1.26x   1.28x   1.26x   1.27x
    8192B   1.31x   1.32x   1.00x   1.31x   1.25x   1.25x   1.28x   1.29x   1.28x   1.30x

    256bit key: (lrw:384bit) (xts:512bit)

    size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
    16B     0.96x   0.96x   1.00x   0.96x   0.97x   0.98x   0.95x   0.95x   0.95x   0.96x
    64B     1.00x   0.99x   1.00x   0.98x   0.98x   1.01x   0.98x   0.98x   0.98x   0.98x
    256B    1.20x   1.21x   1.00x   1.21x   1.15x   1.15x   1.19x   1.20x   1.18x   1.19x
    1024B   1.29x   1.30x   1.00x   1.28x   1.23x   1.23x   1.26x   1.27x   1.26x   1.27x
    8192B   1.31x   1.33x   1.00x   1.31x   1.26x   1.26x   1.29x   1.29x   1.28x   1.30x

    twofish-avx-x86_64 vs aes-asm (8kB block):

                128bit  256bit
    ecb-enc     1.19x   1.63x
    ecb-dec     1.18x   1.62x
    cbc-enc     0.75x   1.03x
    cbc-dec     1.23x   1.67x
    ctr-enc     1.24x   1.65x
    ctr-dec     1.24x   1.65x
    lrw-enc     1.15x   1.53x
    lrw-dec     1.14x   1.52x
    xts-enc     1.16x   1.56x
    xts-dec     1.16x   1.56x

    Signed-off-by: Johannes Goetzfried
    Signed-off-by: Herbert Xu

    Johannes Goetzfried