18 Apr, 2019

2 commits

  • Add the Elliptic Curve Russian Digital Signature Algorithm (GOST R
    34.10-2012, RFC 7091, ISO/IEC 14888-3), one of the Russian (and, since
    2018, CIS) cryptographic standard algorithms (collectively called GOST
    algorithms). Only signature verification is supported, with the intent
    that it be used in IMA.

    Summary of the changes:

    * crypto/Kconfig:
    - EC-RDSA is added to the Public-key cryptography section.

    * crypto/Makefile:
    - ecrdsa objects are added.

    * crypto/asymmetric_keys/x509_cert_parser.c:
    - Recognize EC-RDSA and Streebog OIDs.

    * include/linux/oid_registry.h:
    - EC-RDSA OIDs are added to the enum. Two curve OIDs that are not yet
    implemented are also added for possible later extension (so as not to
    change the numbering and grouping).

    * crypto/ecc.c:
    - Kenneth MacKay's copyright date is updated to 2014, because
    vli_mmod_slow, ecc_point_add and ecc_point_mult_shamir are based on
    his code from micro-ecc.
    - Functions needed for ecrdsa are EXPORT_SYMBOL'ed.
    - New functions:
    vli_is_negative - helper to determine the sign of a vli;
    vli_from_be64 - unpack a big-endian array into a vli (used for
    a signature);
    vli_from_le64 - unpack a little-endian array into a vli (used for
    a public key);
    vli_uadd, vli_usub - add/subtract a u64 value to/from a vli (used
    for increment/decrement);
    mul_64_64 - optimized to use __int128 where appropriate; this speeds
    up point multiplication (and, as a consequence, signature
    verification) by a factor of 1.5-2;
    vli_umult - multiply a vli by a small value (speeds up point
    multiplication by another factor of 1.5-2, depending on vli sizes);
    vli_mmod_special - modular reduction for one form of pseudo-Mersenne
    primes (used for the "A" curves);
    vli_mmod_special2 - modular reduction for another form of
    pseudo-Mersenne primes (used for the "B" curves);
    vli_mmod_barrett - modular reduction using a pre-computed value
    (used for the "C" curve);
    vli_mmod_slow - more general modular reduction, which is much slower
    (used when the modulus is the subgroup order);
    vli_mod_mult_slow - modular multiplication;
    ecc_point_add - add two points;
    ecc_point_mult_shamir - multiply two points by scalars and add the
    results in one combined pass (this gives a speedup by another factor
    of 2 compared to two separate multiplications; see the toy sketch
    after this summary);
    ecc_is_pubkey_valid_partial - an additional sanity check is added.
    - vli_mmod_fast is updated with a non-strict heuristic that calls the
    optimal modular reduction function depending on the prime value;
    - All computations for the previously defined (two NIST) curves
    should be unaffected.

    * crypto/ecc.h:
    - Newly exported functions are documented.

    * crypto/ecrdsa_defs.h:
    - Five curves are defined.

    * crypto/ecrdsa.c:
    - Signature verification is implemented.

    * crypto/ecrdsa_params.asn1, crypto/ecrdsa_pub_key.asn1:
    - Templates for BER decoder for EC-RDSA parameters and public key.
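
    For illustration, here is a toy C sketch of the combined double-and-add
    loop behind ecc_point_mult_shamir (Shamir's trick). Integers mod a
    small prime stand in for curve points so the result can be checked
    directly; all names here are illustrative, not the kernel's API.

    #include <stdint.h>
    #include <stdio.h>

    /* Toy stand-in: "points" are integers mod P and "point addition" is
     * modular addition, so u1*g + u2*q can be checked directly.  The loop
     * mirrors Shamir's trick: one shared double-and-add pass that adds g,
     * q or the precomputed g+q depending on the current bit pair. */
    #define P 1000003ULL

    static uint64_t shamir_mult(uint64_t u1, uint64_t g,
                                uint64_t u2, uint64_t q)
    {
            uint64_t sum = (g + q) % P;     /* precomputed g + q */
            uint64_t r = 0;
            int i, b1, b2;

            for (i = 63; i >= 0; i--) {
                    r = (r * 2) % P;        /* "point doubling" */
                    b1 = (u1 >> i) & 1;
                    b2 = (u2 >> i) & 1;
                    if (b1 && b2)
                            r = (r + sum) % P;
                    else if (b1)
                            r = (r + g) % P;
                    else if (b2)
                            r = (r + q) % P;
            }
            return r;
    }

    int main(void)
    {
            /* One combined pass equals two separate multiplications. */
            printf("%llu == %llu\n",
                   (unsigned long long)shamir_mult(123, 5, 456, 7),
                   (unsigned long long)((123 * 5 + 456 * 7) % P));
            return 0;
    }

    The real code does the same thing with point doubling and point
    addition, which is why one combined pass costs roughly half of two
    separate scalar multiplications.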

    Cc: linux-integrity@vger.kernel.org
    Signed-off-by: Vitaly Chikunov
    Signed-off-by: Herbert Xu

    Vitaly Chikunov
     
  • ecc.c has algorithms that can be used by both ecdh and ecrdsa, so make
    it a separate module. Add CRYPTO_ECC to Kconfig. EXPORT_SYMBOL and
    document what seems appropriate. Move the ecc_point and ecc_curve
    structs from ecc_curve_defs.h into ecc.h.

    No code changes.

    Signed-off-by: Vitaly Chikunov
    Signed-off-by: Herbert Xu

    Vitaly Chikunov
     

08 Mar, 2019

1 commit

  • To prevent any issues with persistent data, separate lzo-rle from lzo so
    that it is treated as a separate algorithm, and lzo is still available.
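
    As a minimal illustration of "treated as a separate algorithm", kernel
    code can now select it by name through the comp API (a hedged sketch;
    error handling trimmed):

    #include <linux/crypto.h>

    /* "lzo-rle" now resolves independently of "lzo", so users such as
     * zram can pick either one by name. */
    static struct crypto_comp *get_lzorle(void)
    {
            return crypto_alloc_comp("lzo-rle", 0, 0);
    }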

    Link: http://lkml.kernel.org/r/20190205155944.16007-3-dave.rodgman@arm.com
    Signed-off-by: Dave Rodgman
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: Markus F.X.J. Oberhumer
    Cc: Matt Sealey
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Richard Purdie
    Cc: Sergey Senozhatsky
    Cc: Sonny Rao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Rodgman
     

20 Nov, 2018

3 commits

  • Add support for the Adiantum encryption mode. Adiantum was designed by
    Paul Crowley and is specified by our paper:

    Adiantum: length-preserving encryption for entry-level processors
    (https://eprint.iacr.org/2018/720.pdf)

    See our paper for full details; this patch only provides an overview.

    Adiantum is a tweakable, length-preserving encryption mode designed for
    fast and secure disk encryption, especially on CPUs without dedicated
    crypto instructions. Adiantum encrypts each sector using the XChaCha12
    stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
    function, and an invocation of the AES-256 block cipher on a single
    16-byte block. On CPUs without AES instructions, Adiantum is much
    faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
    Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
    and decryption about 5 times faster.

    Adiantum is a specialization of the more general HBSH construction. Our
    earlier proposal, HPolyC, was also an HBSH specialization, but it used a
    different εA∆U hash function, one based on Poly1305 only. Adiantum's
    εA∆U hash function, which is based primarily on the "NH" hash function
    like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
    consequently, Adiantum is about 20% faster than HPolyC.

    This speed comes with no loss of security: Adiantum is provably just as
    secure as HPolyC, in fact slightly *more* secure. Like HPolyC,
    Adiantum's security is reducible to that of XChaCha12 and AES-256,
    subject to a security bound. XChaCha12 itself has a security reduction
    to ChaCha12. Therefore, one need not "trust" Adiantum; one need only
    trust ChaCha12 and AES-256. Note that the εA∆U hash function is only
    used for its proven combinatorial properties, so it cannot be "broken".

    Adiantum is also a true wide-block encryption mode, so flipping any
    plaintext bit in the sector scrambles the entire ciphertext, and vice
    versa. No other such mode is available in the kernel currently; doing
    the same with XTS scrambles only 16 bytes. Adiantum also supports
    arbitrary-length tweaks and naturally supports any length input >= 16
    bytes without needing "ciphertext stealing".

    For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in
    order to make encryption feasible on the widest range of devices.
    Although the 20-round variant is quite popular, the best known attacks
    on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial
    security margin; in fact, larger than AES-256's. 12-round Salsa20 is
    also the eSTREAM recommendation. For the block cipher, Adiantum uses
    AES-256, despite it having a lower security margin than XChaCha12 and
    needing table lookups, due to AES's extensive adoption and analysis
    making it the obvious first choice. Nevertheless, for flexibility this
    patch also permits the "adiantum" template to be instantiated with
    XChaCha20 and/or with an alternate block cipher.
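
    For reference, here is a minimal sketch of requesting the default
    instantiation described above from kernel code (assumes
    CONFIG_CRYPTO_ADIANTUM is enabled; error handling trimmed):

    #include <crypto/skcipher.h>

    static struct crypto_skcipher *get_adiantum(void)
    {
            /* The XChaCha12 + AES-256 instantiation of the template; an
             * alternate stream or block cipher may be named instead. */
            return crypto_alloc_skcipher("adiantum(xchacha12,aes)", 0, 0);
    }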

    We need Adiantum support in the kernel for use in dm-crypt and fscrypt,
    where currently the only other suitable options are block cipher modes
    such as AES-XTS. A big problem with this is that many low-end mobile
    devices (e.g. Android Go phones sold primarily in developing countries,
    as well as some smartwatches) still have CPUs that lack AES
    instructions, e.g. ARM Cortex-A7. Sadly, AES-XTS encryption is much too
    slow to be viable on these devices. We did find that some "lightweight"
    block ciphers are fast enough, but these suffer from problems such as
    not having much cryptanalysis or being too controversial.

    The ChaCha stream cipher has excellent performance but is insecure to
    use directly for disk encryption, since each sector's IV is reused each
    time it is overwritten. Even restricting the threat model to offline
    attacks only isn't enough, since modern flash storage devices don't
    guarantee that "overwrites" are really overwrites, due to wear-leveling.
    Adiantum avoids this problem by constructing a
    "tweakable super-pseudorandom permutation"; this is the strongest
    possible security model for length-preserving encryption.

    Of course, storing random nonces along with the ciphertext would be the
    ideal solution. But doing that with existing hardware and filesystems
    runs into major practical problems; in most cases it would require data
    journaling (like dm-integrity) which severely degrades performance.
    Thus, for now length-preserving encryption is still needed.

    Signed-off-by: Eric Biggers
    Reviewed-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Eric Biggers
     
  • Add a generic implementation of NHPoly1305, an ε-almost-∆-universal hash
    function used in the Adiantum encryption mode.

    CONFIG_NHPOLY1305 is not selectable by itself since there won't be any
    real reason to enable it without also enabling Adiantum support.

    Signed-off-by: Eric Biggers
    Acked-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Eric Biggers
     
  • In preparation for adding XChaCha12 support, rename/refactor
    chacha20-generic to support different numbers of rounds. The
    justification for needing XChaCha12 support is explained in more detail
    in the patch "crypto: chacha - add XChaCha12 support".

    The only difference between ChaCha{8,12,20} is the number of rounds;
    all other parts of the algorithm are the same. Therefore,
    remove the "20" from all definitions, structures, functions, files, etc.
    that will be shared by all ChaCha versions.

    Also make ->setkey() store the round count in the chacha_ctx (previously
    chacha20_ctx). The generic code then passes the round count through to
    chacha_block(). There will be a ->setkey() function for each explicitly
    allowed round count; the encrypt/decrypt functions will be the same. I
    decided not to do it the opposite way (same ->setkey() function for all
    round counts, with different encrypt/decrypt functions) because that
    would have required more boilerplate code in architecture-specific
    implementations of ChaCha and XChaCha.
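
    A hedged sketch of the resulting shape of the generic code (names
    approximate the kernel's; treat the details as illustrative):

    #include <crypto/internal/skcipher.h>
    #include <linux/errno.h>
    #include <linux/string.h>

    /* One context for every variant: the round count is fixed at setkey
     * time and passed down to the block function by the shared
     * encrypt/decrypt path. */
    struct chacha_ctx {
            u32 key[8];
            int nrounds;
    };

    static int chacha_setkey(struct crypto_skcipher *tfm, const u8 *key,
                             unsigned int keysize, int nrounds)
    {
            struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);

            if (keysize != 32)
                    return -EINVAL;
            memcpy(ctx->key, key, keysize); /* sketch: the real code stores
                                             * little-endian 32-bit words */
            ctx->nrounds = nrounds;
            return 0;
    }

    /* One thin wrapper per explicitly allowed round count. */
    static int chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
                               unsigned int keysize)
    {
            return chacha_setkey(tfm, key, keysize, 20);
    }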

    Reviewed-by: Ard Biesheuvel
    Acked-by: Martin Willi
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
     

04 Sep, 2018

2 commits

  • As it turns out, the AVX2 multibuffer SHA routines are currently
    broken [0], in a way that would have likely been noticed if this
    code were in wide use. Since the code is too complicated to be
    maintained by anyone except the original authors, and since the
    performance benefits for real-world use cases are debatable to
    begin with, it is better to drop it entirely for the moment.

    [0] https://marc.info/?l=linux-crypto-vger&m=153476243825350&w=2

    Suggested-by: Eric Biggers
    Cc: Megha Dey
    Cc: Tim Chen
    Cc: Geert Uytterhoeven
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • These are unused, undesired, and have never actually been used by
    anybody. The original authors of this code have changed their mind about
    its inclusion. While originally proposed for disk encryption on low-end
    devices, the idea was discarded [1] in favor of something else before
    that could really get going. Therefore, this patch removes Speck.

    [1] https://marc.info/?l=linux-crypto-vger&m=153359499015659

    Signed-off-by: Jason A. Donenfeld
    Acked-by: Eric Biggers
    Cc: stable@vger.kernel.org
    Acked-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Jason A. Donenfeld
     

31 May, 2018

1 commit

  • Commit 56e8e57fc3a7 ("crypto: morus - Add common SIMD glue code for
    MORUS") accidentally presented the glue code as usable by different
    architectures, but it is in fact only usable on x86.

    This patch moves it under arch/x86/crypto and adds 'depends on X86' to
    the Kconfig options and also removes the prompt to hide these internal
    options from the user.

    Reported-by: kbuild test robot
    Signed-off-by: Ondrej Mosnacek
    Signed-off-by: Herbert Xu

    Ondrej Mosnacek
     

19 May, 2018

3 commits

  • This patch adds a common glue code for optimized implementations of
    MORUS AEAD algorithms.

    Signed-off-by: Ondrej Mosnacek
    Signed-off-by: Herbert Xu

    Ondrej Mosnacek
     
  • This patch adds the generic implementation of the MORUS family of AEAD
    algorithms (MORUS-640 and MORUS-1280). The original authors of MORUS
    are Hongjun Wu and Tao Huang.

    At the time of writing, MORUS is one of the finalists in CAESAR, an
    open competition intended to select a portfolio of alternatives to
    the problematic AES-GCM:

    https://competitions.cr.yp.to/caesar-submissions.html
    https://competitions.cr.yp.to/round3/morusv2.pdf

    Signed-off-by: Ondrej Mosnacek
    Signed-off-by: Herbert Xu

    Ondrej Mosnacek
     
  • This patch adds the generic implementation of the AEGIS family of AEAD
    algorithms (AEGIS-128, AEGIS-128L, and AEGIS-256). The original
    authors of AEGIS are Hongjun Wu and Bart Preneel.

    At the time of writing, AEGIS is one of the finalists in CAESAR, an
    open competition intended to select a portfolio of alternatives to
    the problematic AES-GCM:

    https://competitions.cr.yp.to/caesar-submissions.html
    https://competitions.cr.yp.to/round3/aegisv11.pdf

    Signed-off-by: Ondrej Mosnacek
    Signed-off-by: Herbert Xu

    Ondrej Mosnacek
     

21 Apr, 2018

1 commit

  • Adds zstd support to crypto and scompress. Only supports the default
    level.

    Previously we held off on this patch, since there weren't any users.
    Now zram is ready for zstd support, but depends on CONFIG_CRYPTO_ZSTD,
    which isn't defined until this patch is in. I also see a patch adding
    zstd to pstore [0], which depends on crypto zstd.

    [0] lkml.kernel.org/r/9c9416b2dff19f05fb4c35879aaa83d11ff72c92.1521626182.git.geliangtang@gmail.com

    Signed-off-by: Nick Terrell
    Signed-off-by: Herbert Xu

    Nick Terrell
     

07 Apr, 2018

2 commits

  • Our convention is to distinguish file types by suffixes with a period
    as a separator.

    *-asn1.[ch] is a different pattern from other generated sources such
    as *.lex.c, *.tab.[ch], *.dtb.S, etc. More confusing, files with
    '-asn1.[ch]' are generated files, but '_asn1.[ch]' are checked-in
    files:
    net/netfilter/nf_conntrack_h323_asn1.c
    include/linux/netfilter/nf_conntrack_h323_asn1.h
    include/linux/sunrpc/gss_asn1.h

    Rename generated files to *.asn1.[ch] for consistency.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • Clean up these patterns from the top Makefile to omit 'clean-files'
    in each Makefile.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     

16 Mar, 2018

1 commit

  • Introduce the SM4 cipher algorithms (OSCCA GB/T 32907-2016).

    SM4 (GBT.32907-2016) is a cryptographic standard issued by the
    Organization of State Commercial Administration of China (OSCCA)
    as an authorized cryptographic algorithm for use within China.

    SMS4 was originally created for use in protecting wireless
    networks, and is mandated in the Chinese National Standard for
    Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure)
    (GB.15629.11-2003).

    Signed-off-by: Gilad Ben-Yossef
    Signed-off-by: Herbert Xu

    Gilad Ben-Yossef
     

09 Mar, 2018

1 commit

  • TPM security routines require encryption and decryption with AES in
    CFB mode, so add it to the Linux Crypto schemes. CFB is basically a
    one time pad where the pad is generated initially from the encrypted
    IV and then subsequently from the encrypted previous block of
    ciphertext. The pad is XOR'd into the plain text to get the final
    ciphertext.

    https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#CFB
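
    A minimal self-contained sketch of that construction (a hypothetical
    single-block encryptor stands in for AES; full blocks only):

    #include <stddef.h>
    #include <stdint.h>

    #define BLK 16

    /* Hypothetical single-block primitive standing in for AES-256. */
    typedef void (*blk_enc_t)(const uint8_t in[BLK], uint8_t out[BLK]);

    /* CFB as described above: the pad for block 0 is E(IV), and for
     * block i > 0 it is E(ciphertext[i-1]); the plaintext is XORed
     * with the pad. */
    static void cfb_encrypt(blk_enc_t enc, const uint8_t iv[BLK],
                            const uint8_t *pt, uint8_t *ct, size_t nblocks)
    {
            const uint8_t *prev = iv;
            uint8_t pad[BLK];
            size_t i;
            int j;

            for (i = 0; i < nblocks; i++) {
                    enc(prev, pad);                 /* pad = E(prev) */
                    for (j = 0; j < BLK; j++)
                            ct[i * BLK + j] = pt[i * BLK + j] ^ pad[j];
                    prev = &ct[i * BLK];            /* chain on ciphertext */
            }
    }

    Decryption generates the same pads from the ciphertext, which is why
    only the encryption direction of the block cipher is ever needed.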

    Signed-off-by: James Bottomley
    Signed-off-by: Herbert Xu

    James Bottomley
     

22 Feb, 2018

1 commit

  • Add a generic implementation of Speck, including the Speck128 and
    Speck64 variants. Speck is a lightweight block cipher that can be much
    faster than AES on processors that don't have AES instructions.

    We are planning to offer Speck-XTS (probably Speck128/256-XTS) as an
    option for dm-crypt and fscrypt on Android, for low-end mobile devices
    with older CPUs such as ARMv7 which don't have the Cryptography
    Extensions. Currently, such devices are unencrypted because AES is not
    fast enough, even when the NEON bit-sliced implementation of AES is
    used. Other AES alternatives such as Twofish, Threefish, Camellia,
    CAST6, and Serpent aren't fast enough either; it seems that only a
    modern ARX cipher can provide sufficient performance on these devices.

    This is a replacement for our original proposal
    (https://patchwork.kernel.org/patch/10101451/) which was to offer
    ChaCha20 for these devices. However, the use of a stream cipher for
    disk/file encryption with no space to store nonces would have been much
    more insecure than we thought initially, given that it would be used on
    top of flash storage as well as potentially on top of F2FS, neither of
    which is guaranteed to overwrite data in-place.

    Speck has been somewhat controversial due to its origin. Nevertheless,
    it has a straightforward design (it's an ARX cipher), and it appears to
    be the leading software-optimized lightweight block cipher currently,
    with the most cryptanalysis. It's also easy to implement without side
    channels, unlike AES. Moreover, we only intend Speck to be used when
    the status quo is no encryption, due to AES not being fast enough.
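
    To make "ARX" concrete, here is one Speck128 round as a sketch from
    the published specification (rotation constants 8 and 3; the key
    schedule is omitted):

    #include <stdint.h>

    static inline uint64_t ror64(uint64_t v, unsigned int n)
    {
            return (v >> n) | (v << (64 - n));
    }

    static inline uint64_t rol64(uint64_t v, unsigned int n)
    {
            return (v << n) | (v >> (64 - n));
    }

    /* Add, rotate, xor: no table lookups or data-dependent branches,
     * which is what makes constant-time implementations straightforward. */
    static void speck128_round(uint64_t *x, uint64_t *y, uint64_t k)
    {
            *x = ror64(*x, 8);
            *x += *y;
            *x ^= k;
            *y = rol64(*y, 3);
            *y ^= *x;
    }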

    We've also considered a novel length-preserving encryption mode based on
    ChaCha20 and Poly1305. While theoretically attractive, such a mode
    would be a brand new crypto construction and would be more complicated
    and difficult to implement efficiently in comparison to Speck-XTS.

    There is confusion about the byte and word orders of Speck, since the
    original paper doesn't specify them. But we have implemented it using
    the orders the authors recommended in a correspondence with them. The
    test vectors are taken from the original paper but were mapped to byte
    arrays using the recommended byte and word orders.

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
     

20 Jan, 2018

1 commit

  • My last bugfix added -Os on the command line, which unfortunately caused
    a build regression on powerpc in some configurations.

    I've done some more analysis of the original problem and found a
    slightly different workaround that avoids this regression and also
    results in better performance on gcc-7.0: -fcode-hoisting is an
    optimization step that was added in gcc-7 and that causes worse
    performance here on all gcc-7 versions.

    This disables -fcode-hoisting on all compilers that understand the
    option. For gcc-7.1 and 7.2 I found the same performance as with my
    previous patch (using -Os); on gcc-7.0 it was even better. On gcc-8 I
    could see no change in performance from this patch. In theory, code
    hoisting should not be able to make things better for the AES cipher,
    so leaving it disabled for gcc-8 only serves to simplify the Makefile
    change.
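
    The kbuild change presumably amounts to a per-object flag like this
    (a sketch; cc-option drops the flag on compilers that don't
    understand it):

    # crypto/Makefile (sketch)
    CFLAGS_aes_generic.o := $(call cc-option,-fno-code-hoisting)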

    Reported-by: kbuild test robot
    Link: https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg30418.html
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
    Fixes: 148b974deea9 ("crypto: aes-generic - build with -Os on gcc-7+")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Herbert Xu

    Arnd Bergmann
     

12 Jan, 2018

1 commit

  • While testing other changes, I discovered that gcc-7.2.1 produces badly
    optimized code for aes_encrypt/aes_decrypt. This is especially true when
    CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely
    large stack usage that in turn might cause kernel stack overflows:

    crypto/aes_generic.c: In function 'aes_encrypt':
    crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger than 2048 bytes [-Wframe-larger-than=]
    crypto/aes_generic.c: In function 'aes_decrypt':
    crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger than 2048 bytes [-Wframe-larger-than=]

    I verified that this problem exists on all architectures that are
    supported by gcc-7.2, though arm64 in particular is less affected than
    the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
    stack usage but still produce worse code than earlier versions for this
    file, apparently because of optimization passes that generally provide
    a substantial improvement in object code quality but understandably fail
    to find any shortcuts in the AES algorithm.

    Possible workarounds include

    a) disabling -ftree-pre and -ftree-sra optimizations, this was an earlier
    patch I tried, which reliably fixed the stack usage, but caused a
    serious performance regression in some versions, as later testing
    found.

    b) disabling UBSAN on this file or all ciphers, as suggested by Ard
    Biesheuvel. This would lead to massively better crypto performance in
    UBSAN-enabled kernels and avoid the stack usage, but there is a concern
    over whether we should exclude arbitrary files from UBSAN at all.

    c) Forcing the optimization level in a different way. Similar to a),
    but rather than deselecting specific optimization stages,
    this now uses "gcc -Os" for this file, regardless of the
    CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE/SIZE option. This is a reliable
    workaround for the stack consumption on all architectures, and I've
    retested the performance results now on x86, cycles/byte (lower is
    better) for cbc(aes-generic) with 256 bit keys:

                  -O2    -Os
    gcc-6.3.1    14.9   15.1
    gcc-7.0.1    14.7   15.3
    gcc-7.1.1    15.3   14.7
    gcc-7.2.1    16.8   15.9
    gcc-8.0.0    15.5   15.6

    This implements option c) by forcing -Os on all compiler versions
    starting with gcc-7.1. As a workaround for PR83356, it would
    only be needed for gcc-7.2+ with UBSAN enabled, but since it also shows
    better performance on gcc-7.1 without UBSAN, it seems appropriate to
    use the faster version here as well.

    Side note: during testing, I also played with the AES code in libressl,
    which had a similar performance regression from gcc-6 to gcc-7.2,
    but was three times slower overall. It might be interesting to
    investigate that further and possibly port the Linux implementation
    into that.

    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
    Cc: Richard Biener
    Cc: Jakub Jelinek
    Cc: Ard Biesheuvel
    Signed-off-by: Arnd Bergmann
    Acked-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Arnd Bergmann
     

15 Nov, 2017

1 commit

  • Pull crypto updates from Herbert Xu:
    "Here is the crypto update for 4.15:

    API:

    - Disambiguate EBUSY when queueing crypto request by adding ENOSPC.
    This change touches code outside the crypto API.
    - Reset settings when empty string is written to rng_current.

    Algorithms:

    - Add OSCCA SM3 secure hash.

    Drivers:

    - Remove old mv_cesa driver (replaced by marvell/cesa).
    - Enable rfc3686/ecb/cfb/ofb AES in crypto4xx.
    - Add ccm/gcm AES in crypto4xx.
    - Add support for BCM7278 in iproc-rng200.
    - Add hash support on Exynos in s5p-sss.
    - Fix fallback-induced error in vmx.
    - Fix output IV in atmel-aes.
    - Fix empty GCM hash in mediatek.

    Others:

    - Fix DoS potential in lib/mpi.
    - Fix potential out-of-order issues with padata"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (162 commits)
    lib/mpi: call cond_resched() from mpi_powm() loop
    crypto: stm32/hash - Fix return issue on update
    crypto: dh - Remove pointless checks for NULL 'p' and 'g'
    crypto: qat - Clean up error handling in qat_dh_set_secret()
    crypto: dh - Don't permit 'key' or 'g' size longer than 'p'
    crypto: dh - Don't permit 'p' to be 0
    crypto: dh - Fix double free of ctx->p
    hwrng: iproc-rng200 - Add support for BCM7278
    dt-bindings: rng: Document BCM7278 RNG200 compatible
    crypto: chcr - Replace _manual_ swap with swap macro
    crypto: marvell - Add a NULL entry at the end of mv_cesa_plat_id_table[]
    hwrng: virtio - Virtio RNG devices need to be re-registered after suspend/resume
    crypto: atmel - remove empty functions
    crypto: ecdh - remove empty exit()
    MAINTAINERS: update maintainer for qat
    crypto: caam - remove unused param of ctx_map_to_sec4_sg()
    crypto: caam - remove unneeded edesc zeroization
    crypto: atmel-aes - Reset the controller before each use
    crypto: atmel-aes - properly set IV after {en,de}crypt
    hwrng: core - Reset user selected rng by writing "" to rng_current
    ...

    Linus Torvalds
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    <5 lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

10 Jun, 2017

1 commit

  • Add support for generating ecc private keys.

    Generation of ecc private keys is helpful in a user-space to kernel
    ecdh offload because the keys are not revealed to user-space. Private
    key generation is also helpful to implement forward secrecy.

    If the user provides a NULL ecc private key, the kernel will generate it
    and further use it for ecdh.
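
    A hedged sketch of what this looks like from the caller's side
    (struct layout and helpers approximate the kernel's; curve selection
    and buffer sizing are simplified):

    #include <crypto/ecdh.h>
    #include <crypto/kpp.h>
    #include <linux/errno.h>

    /* An empty private key asks the kernel to generate one. */
    static int set_generated_key(struct crypto_kpp *tfm)
    {
            struct ecdh p = { .key = NULL, .key_size = 0 };
            char buf[128];
            unsigned int len = crypto_ecdh_key_len(&p);
            int ret;

            if (len > sizeof(buf))
                    return -EINVAL;
            ret = crypto_ecdh_encode_key(buf, len, &p);
            if (ret)
                    return ret;
            return crypto_kpp_set_secret(tfm, buf, len);
    }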

    Move ecdh's object files below drbg's. drbg must be present in the kernel
    at the time of calling.

    Signed-off-by: Tudor Ambarus
    Reviewed-by: Stephan Müller
    Signed-off-by: Herbert Xu

    Tudor-Dan Ambarus
     

11 Feb, 2017

2 commits

  • An ancient gcc bug (first reported in 2003) has apparently resurfaced
    on MIPS, where kernelci.org reports an overly large stack frame in the
    whirlpool hash algorithm:

    crypto/wp512.c:987:1: warning: the frame size of 1112 bytes is larger than 1024 bytes [-Wframe-larger-than=]

    With some testing in different configurations, I'm seeing large
    variations in stack frame size, up to 1500 bytes for what should need
    around 300 bytes at most. I also checked the reference implementation,
    which is essentially the same code but also comes with some test and
    benchmarking infrastructure.

    It seems that recent compiler versions on at least arm, arm64 and powerpc
    have a partial fix for this problem by enabling "-fsched-pressure", but
    even with that fix they still suffer from the issue to a certain
    degree. Some
    testing on arm64 shows that the time needed to hash a given amount of
    data is roughly proportional to the stack frame size here, which makes
    sense given that the wp512 implementation is doing lots of loads for
    table lookups, and the problem with the overly large stack is a result
    of doing a lot more loads and stores for spilled registers (as seen from
    inspecting the object code).

    Disabling -fschedule-insns consistently fixes the problem for wp512.
    Across my collection of cross-compilers, the results are consistently
    better or identical when comparing the stack sizes in this function,
    though some architectures (notably x86) have schedule-insns disabled
    by default.

    The four columns are:
    default: -O2
    press:   -O2 -fsched-pressure
    nopress: -O2 -fschedule-insns -fno-sched-pressure
    nosched: -O2 -fno-schedule-insns (disables sched-pressure)

                                 default  press  nopress  nosched
    alpha-linux-gcc-4.9.3           1136    848     1136      176
    am33_2.0-linux-gcc-4.9.3        2100   2076     2100     2104
    arm-linux-gnueabi-gcc-4.9.3      848    848     1048      352
    cris-linux-gcc-4.9.3             272    272      272      272
    frv-linux-gcc-4.9.3             1128   1000     1128      280
    hppa64-linux-gcc-4.9.3          1128    336     1128      184
    hppa-linux-gcc-4.9.3             644    308      644      276
    i386-linux-gcc-4.9.3             352    352      352      352
    m32r-linux-gcc-4.9.3             720    656      720      268
    microblaze-linux-gcc-4.9.3      1108    604     1108      256
    mips64-linux-gcc-4.9.3          1328    592     1328      208
    mips-linux-gcc-4.9.3            1096    624     1096      240
    powerpc64-linux-gcc-4.9.3       1088    432     1088      160
    powerpc-linux-gcc-4.9.3         1080    584     1080      224
    s390-linux-gcc-4.9.3             456    456      624      360
    sh3-linux-gcc-4.9.3              292    292      292      292
    sparc64-linux-gcc-4.9.3          992    240      992      208
    sparc-linux-gcc-4.9.3            680    592      680      312
    x86_64-linux-gcc-4.9.3           224    240      272      224
    xtensa-linux-gcc-4.9.3          1152    704     1152      304

    aarch64-linux-gcc-7.0.0          224    224     1104      208
    arm-linux-gnueabi-gcc-7.0.1      824    824     1048      352
    mips-linux-gcc-7.0.0            1120    648     1120      272
    x86_64-linux-gcc-7.0.1           240    240      304      240

    arm-linux-gnueabi-gcc-4.4.7      840      -        -      392
    arm-linux-gnueabi-gcc-4.5.4      784    728      784      320
    arm-linux-gnueabi-gcc-4.6.4      736    728      736      304
    arm-linux-gnueabi-gcc-4.7.4      944    784      944      352
    arm-linux-gnueabi-gcc-4.8.5      464    464      760      352
    arm-linux-gnueabi-gcc-4.9.3      848    848     1048      352
    arm-linux-gnueabi-gcc-5.3.1      824    824     1064      336
    arm-linux-gnueabi-gcc-6.1.1      808    808     1056      344
    arm-linux-gnueabi-gcc-7.0.1      824    824     1048      352

    Trying the same test for serpent-generic, the picture is a bit different,
    and while -fno-schedule-insns is generally better here than the default,
    -fsched-pressure wins overall, so I picked that instead.

                                 default  press  nopress  nosched
    alpha-linux-gcc-4.9.3           1392    864     1392      960
    am33_2.0-linux-gcc-4.9.3         536    524      536      528
    arm-linux-gnueabi-gcc-4.9.3      552    552      776      536
    cris-linux-gcc-4.9.3             528    528      528      528
    frv-linux-gcc-4.9.3              536    400      536      504
    hppa64-linux-gcc-4.9.3           524    208      524      480
    hppa-linux-gcc-4.9.3             768    472      768      508
    i386-linux-gcc-4.9.3             564    564      564      564
    m32r-linux-gcc-4.9.3             712    576      712      532
    microblaze-linux-gcc-4.9.3       724    392      724      512
    mips64-linux-gcc-4.9.3           720    384      720      496
    mips-linux-gcc-4.9.3             728    384      728      496
    powerpc64-linux-gcc-4.9.3        704    304      704      480
    powerpc-linux-gcc-4.9.3          704    296      704      480
    s390-linux-gcc-4.9.3             560    560      592      536
    sh3-linux-gcc-4.9.3              540    540      540      540
    sparc64-linux-gcc-4.9.3          544    352      544      496
    sparc-linux-gcc-4.9.3            544    344      544      496
    x86_64-linux-gcc-4.9.3           528    536      576      528
    xtensa-linux-gcc-4.9.3           752    544      752      544

    aarch64-linux-gcc-7.0.0          432    432      656      480
    arm-linux-gnueabi-gcc-7.0.1      616    616      808      536
    mips-linux-gcc-7.0.0             720    464      720      488
    x86_64-linux-gcc-7.0.1           536    528      600      536

    arm-linux-gnueabi-gcc-4.4.7      592      -        -      440
    arm-linux-gnueabi-gcc-4.5.4      776    448      776      544
    arm-linux-gnueabi-gcc-4.6.4      776    448      776      544
    arm-linux-gnueabi-gcc-4.7.4      768    448      768      544
    arm-linux-gnueabi-gcc-4.8.5      488    488      776      544
    arm-linux-gnueabi-gcc-4.9.3      552    552      776      536
    arm-linux-gnueabi-gcc-5.3.1      552    552      776      536
    arm-linux-gnueabi-gcc-6.1.1      560    560      776      536
    arm-linux-gnueabi-gcc-7.0.1      616    616      808      536

    I did not do any runtime tests with serpent, so it is possible that stack
    frame size does not directly correlate with runtime performance here and
    it actually makes things worse, but it's more likely to help here, and
    the reduced stack frame size is probably enough reason to apply the patch,
    especially given that the crypto code is often used in deep call chains.
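
    The Makefile change presumably boils down to per-object flags along
    these lines (a sketch; cc-option drops flags the compiler does not
    understand):

    # crypto/Makefile (sketch)
    CFLAGS_wp512.o := $(call cc-option,-fno-schedule-insns)
    CFLAGS_serpent_generic.o := $(call cc-option,-fsched-pressure)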

    Link: https://kernelci.org/build/id/58797d7559b5149efdf6c3a9/logs/
    Link: http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11488
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
    Cc: Ralf Baechle
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Herbert Xu

    Arnd Bergmann
     
  • Lookup table based AES is sensitive to timing attacks, which is due to
    the fact that such table lookups are data dependent, and the fact that
    8 KB worth of tables covers a significant number of cachelines on any
    architecture, resulting in an exploitable correlation between the key
    and the processing time for known plaintexts.

    For network facing algorithms such as CTR, CCM or GCM, this presents a
    security risk, which is why arch specific AES ports are typically time
    invariant, either through the use of special instructions, or by using
    SIMD algorithms that don't rely on table lookups.

    For generic code, this is difficult to achieve without losing too much
    performance, but we can improve the situation significantly by switching
    to an implementation that only needs 256 bytes of table data (the actual
    S-box itself), which can be prefetched at the start of each block to
    eliminate data dependent latencies.
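
    A sketch of that prefetch idea (the cacheline size and names are
    illustrative; reading through a volatile pointer keeps the loads
    from being optimized away):

    #define CACHELINE_BYTES 64

    /* Touch every cacheline of the 256-byte S-box before the first
     * data-dependent lookup, so later accesses hit the L1 cache and the
     * timing no longer correlates with key or plaintext bytes. */
    static void prefetch_sbox(const volatile unsigned char sbox[256])
    {
            unsigned int i;

            for (i = 0; i < 256; i += CACHELINE_BYTES)
                    (void)sbox[i];
    }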

    This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
    ordinary generic AES driver manages 18 cycles per byte on this
    hardware). Decryption is substantially slower.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     

28 Nov, 2016

1 commit

  • This patch adds the simd skcipher helper which is meant to be
    a replacement for ablk helper. It replaces the underlying blkcipher
    interface with skcipher, and also presents the top-level algorithm
    as an skcipher.

    Signed-off-by: Herbert Xu

    Herbert Xu
     

18 Jul, 2016

1 commit

  • This patch removes the old crypto_grab_skcipher helper and replaces
    it with crypto_grab_skcipher2.

    As this is the final entry point into givcipher this patch also
    removes all traces of the top-level givcipher interface, including
    all implicit IV generators such as chainiv.

    The bottom-level givcipher interface remains until the drivers
    using it are converted.

    Signed-off-by: Herbert Xu

    Herbert Xu
     

23 Jun, 2016

2 commits