04 Dec, 2020

10 commits

  • Geert reports that builds where CONFIG_CRYPTO_AEGIS128_SIMD is not set
    may still emit references to crypto_aegis128_update_simd(), which
    cannot be satisfied and therefore break the build. These references
    only exist in functions that can be optimized away, but apparently,
    the compiler is not always able to prove this.

    So add some explicit checks for CONFIG_CRYPTO_AEGIS128_SIMD to help the
    compiler figure this out.

    Tested-by: Geert Uytterhoeven
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
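
    A minimal userspace sketch of the pattern described above, assuming a
    compile-time constant stands in for IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD).
    All sketch_* names are hypothetical and not taken from the patch. With the
    constant as the first operand, the compiler can discard the SIMD call
    without having to reason about the runtime check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD). */
#define SKETCH_SIMD_ENABLED 0

enum sketch_path { SKETCH_PATH_GENERIC, SKETCH_PATH_SIMD };

/* Hypothetical runtime check, e.g. "may SIMD be used in this context?". */
static bool sketch_simd_usable(void)
{
	return true;
}

/*
 * In the kernel, the SIMD routine has no definition when the option is off,
 * so any surviving call breaks the link. It is defined here only so that
 * this sketch builds at every optimization level.
 */
static enum sketch_path sketch_update_simd(unsigned char state[16],
					   const unsigned char *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		state[i % 16] ^= src[i];	/* placeholder work, not AEGIS */
	return SKETCH_PATH_SIMD;
}

static enum sketch_path sketch_update_generic(unsigned char state[16],
					      const unsigned char *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		state[i % 16] ^= src[i];	/* placeholder work, not AEGIS */
	return SKETCH_PATH_GENERIC;
}

enum sketch_path sketch_update(unsigned char state[16],
			       const unsigned char *src, size_t len)
{
	/* Constant first: "0 && ..." is trivially false, so the call is dead. */
	if (SKETCH_SIMD_ENABLED && sketch_simd_usable())
		return sketch_update_simd(state, src, len);

	return sketch_update_generic(state, src, len);
}
```

    With SKETCH_SIMD_ENABLED set to 0, sketch_update() always takes the
    generic path, regardless of what sketch_simd_usable() returns.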
     
  • The macro use will already have a semicolon.

    Signed-off-by: Tom Rix
    Signed-off-by: Herbert Xu

    Tom Rix
     
  • CMP $0,%reg can't set overflow flag, so we can use shorter TEST %reg,%reg
    instruction when only zero and sign flags are checked (E,L,LE,G,GE conditions).

    Signed-off-by: Uros Bizjak
    Cc: Herbert Xu
    Cc: Borislav Petkov
    Cc: "H. Peter Anvin"
    Signed-off-by: Herbert Xu

    Uros Bizjak
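
    The reasoning above can be checked with a small flag model. This is a
    sketch under stated assumptions: it models only ZF, SF and OF for 32-bit
    operands, and the sketch_* names are hypothetical. It compares the E, L,
    LE, G and GE conditions after "CMP $0, %reg" and "TEST %reg, %reg".

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_flags {
	bool zf;	/* zero flag */
	bool sf;	/* sign flag */
	bool of;	/* overflow flag */
};

/* Flags after "CMP $0, %reg": computes reg - 0, which cannot overflow. */
static struct sketch_flags sketch_cmp_zero(int32_t reg)
{
	int64_t wide = (int64_t)reg - 0;
	struct sketch_flags f = {
		.zf = (wide == 0),
		.sf = (wide < 0),
		.of = (wide < INT32_MIN || wide > INT32_MAX),
	};
	return f;
}

/* Flags after "TEST %reg, %reg": computes reg & reg and clears OF. */
static struct sketch_flags sketch_test_self(int32_t reg)
{
	int32_t res = reg & reg;
	struct sketch_flags f = {
		.zf = (res == 0),
		.sf = (res < 0),
		.of = false,
	};
	return f;
}

/* True if the E, L, LE, G and GE conditions agree for both instructions. */
bool sketch_conditions_agree(int32_t reg)
{
	struct sketch_flags c = sketch_cmp_zero(reg);
	struct sketch_flags t = sketch_test_self(reg);

	bool e  = (c.zf == t.zf);
	bool l  = ((c.sf != c.of) == (t.sf != t.of));
	bool le = ((c.zf || c.sf != c.of) == (t.zf || t.sf != t.of));
	bool g  = ((!c.zf && c.sf == c.of) == (!t.zf && t.sf == t.of));
	bool ge = ((c.sf == c.of) == (t.sf == t.of));

	return e && l && le && g && ge;
}
```

    Calling sketch_conditions_agree() on boundary values such as 0, -1,
    INT32_MIN and INT32_MAX is a quick way to exercise the model.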
     
  • CMP $0,%reg can't set overflow flag, so we can use shorter TEST %reg,%reg
    instruction when only zero and sign flags are checked (E,L,LE,G,GE conditions).

    Signed-off-by: Uros Bizjak
    Cc: Herbert Xu
    Cc: Borislav Petkov
    Cc: "H. Peter Anvin"
    Signed-off-by: Herbert Xu

    Uros Bizjak
     
  • CMP $0,%reg can't set overflow flag, so we can use shorter TEST %reg,%reg
    instruction when only zero and sign flags are checked (E,L,LE,G,GE conditions).

    Signed-off-by: Uros Bizjak
    Cc: Herbert Xu
    Cc: Borislav Petkov
    Cc: "H. Peter Anvin"
    Signed-off-by: Herbert Xu

    Uros Bizjak
     
  • This patch fixes a few sparse warnings that were missed in the
    last round.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch adds a dependency for KEYSTONE on HAS_IOMEM and OF to
    prevent COMPILE_TEST build failures.

    Reported-by: kernel test robot
    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch fixes a missing prototype warning on blake2s_selftest.

    Reported-by: kernel test robot
    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • ARM Cortex-A57 and Cortex-A72 cores running in 32-bit mode are affected
    by silicon errata #1742098 and #1655431, respectively, where the second
    instruction of an AES instruction pair may execute twice if an interrupt
    is taken right after the first instruction consumes an input register of
    which a single 32-bit lane has been updated the last time it was modified.

    This is not such a rare occurrence as it may seem: in counter mode, only
    the least significant 32-bit word is incremented in the absence of a
    carry, which makes our counter mode implementation susceptible to these
    errata.

    So let's shuffle the counter assignments around a bit so that the most
    recent updates when the AES instruction pair executes are 128-bit wide.

    [0] ARM-EPM-049219 v23 Cortex-A57 MPCore Software Developers Errata Notice
    [1] ARM-EPM-012079 v11.0 Cortex-A72 MPCore Software Developers Errata Notice

    Cc: # v5.4+
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • ecdh_set_secret() casts a void* pointer to a const u64* in order to
    feed it into ecc_is_key_valid(). This is not generally permitted by
    the C standard, and leads to actual misalignment faults on ARMv6
    cores. In some cases, these are fixed up in software, but this still
    leads to performance hits that are entirely avoidable.

    So let's copy the key into the ctx buffer first, which we will do
    anyway in the common case, and which guarantees correct alignment.

    Cc:
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
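
    A minimal userspace sketch of the copy-first approach described above.
    The structure, the key size and the validity check are assumptions made
    for illustration; the sketch_* names are hypothetical and do not mirror
    the kernel's ecdh code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NDIGITS 4	/* hypothetical: 4 x 64 bits = 256-bit key */

struct sketch_ctx {
	uint64_t private_key[SKETCH_NDIGITS];	/* naturally aligned for u64 */
};

/* Hypothetical validity check that reads the key as 64-bit words. */
static int sketch_key_is_valid(const uint64_t *key, size_t ndigits)
{
	uint64_t acc = 0;

	for (size_t i = 0; i < ndigits; i++)
		acc |= key[i];
	return acc != 0;	/* reject the all-zero key */
}

int sketch_set_secret(struct sketch_ctx *ctx, const void *buf, size_t len)
{
	if (len != sizeof(ctx->private_key))
		return -1;

	/* memcpy() copies bytes, so buf may have any alignment. */
	memcpy(ctx->private_key, buf, len);

	/* The check now reads from the aligned copy, not from buf. */
	if (!sketch_key_is_valid(ctx->private_key, SKETCH_NDIGITS)) {
		memset(ctx->private_key, 0, sizeof(ctx->private_key));
		return -1;
	}
	return 0;
}
```

    Passing a pointer at an odd offset into a byte array, for example
    raw + 1, is safe here because the 64-bit reads only ever touch the copy.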
     

27 Nov, 2020

22 commits

  • Rework the setting of DMA cache parameters, program more appropriate
    values and explicitly set the shareability domain.

    Signed-off-by: Gilad Ben-Yossef
    Signed-off-by: Herbert Xu

    Gilad Ben-Yossef
     
  • 'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
    an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

    Signed-off-by: Christophe JAILLET
    Signed-off-by: Herbert Xu

    Christophe JAILLET
     
  • 'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
    an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

    Signed-off-by: Christophe JAILLET
    Signed-off-by: Herbert Xu

    Christophe JAILLET
     
  • 'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
    an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

    Signed-off-by: Christophe JAILLET
    Signed-off-by: Herbert Xu

    Christophe JAILLET
     
  • In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
    warnings by explicitly adding multiple break statements instead of
    letting the code fall through to the next case.

    Link: https://github.com/KSPP/linux/issues/115
    Signed-off-by: Gustavo A. R. Silva
    Acked-by: Gilad Ben-Yossef
    Signed-off-by: Herbert Xu

    Gustavo A. R. Silva
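
    As an illustration of the kind of change described above, here is a
    hypothetical switch statement (not taken from the patched driver) in
    which every case ends in an explicit break, including the one directly
    before the default label.

```c
enum sketch_mode { SKETCH_MODE_A, SKETCH_MODE_B, SKETCH_MODE_OTHER };

int sketch_key_bits(enum sketch_mode mode)
{
	int bits = 0;

	switch (mode) {
	case SKETCH_MODE_A:
		bits = 128;
		break;
	case SKETCH_MODE_B:
		bits = 256;
		break;	/* explicit: otherwise control falls into "default" */
	default:
		break;
	}
	return bits;
}
```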
     
  • WireGuard and IPsec both typically operate on input blocks that are
    ~1420 bytes in size, given the default Ethernet MTU of 1500 bytes and
    the overhead of the VPN metadata.

    Many aead and skcipher implementations are optimized for power-of-2
    block sizes, and whether they perform well when operating on 1420
    byte blocks cannot be easily extrapolated from the performance on
    power-of-2 block size. So let's add 1420 bytes explicitly, and round
    it up to the next blocksize multiple of the algo in question if it
    does not support 1420 byte blocks.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
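
    The rounding step mentioned above can be sketched as follows. The helper
    name is hypothetical, and the way tcrypt actually computes the size may
    differ; this only shows the arithmetic of rounding a length up to the
    next multiple of a block size.

```c
#include <stddef.h>

/* Round len up to the next multiple of blocksize (blocksize 0 or 1: no-op). */
size_t sketch_round_up(size_t len, size_t blocksize)
{
	if (blocksize <= 1)
		return len;

	return ((len + blocksize - 1) / blocksize) * blocksize;
}
```

    For a 16-byte block size, 1420 is rounded up to 1424, since
    1420 = 88 * 16 + 12. An algorithm with a block size of 1 keeps 1420
    unchanged.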
     
  • When working on crypto algorithms, being able to run tcrypt quickly
    without booting an entire Linux installation can be very useful. For
    instance, QEMU/kvm can be used to boot a kernel from the command line,
    and having tcrypt.ko builtin would allow tcrypt to be executed to run
    benchmarks, or to run tests for algorithms that need to be instantiated
    from templates, without the need to make it past the point where the
    rootfs is mounted.

    So let's relax the requirement that tcrypt can only be built as a module
    when CONFIG_EXPERT is enabled.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Commit c4741b2305979 ("crypto: run initcalls for generic implementations
    earlier") converted tcrypt.ko's module_init() to subsys_initcall(), but
    this was unintentional: tcrypt.ko currently cannot be built into the core
    kernel, and so the subsys_initcall() gets converted into module_init()
    under the hood. Given that tcrypt.ko does not implement a generic version
    of a crypto algorithm that has to be available early during boot, there
    is no point in running the tcrypt init code earlier than implied by
    module_init().

    However, for crypto development purposes, we will lift the restriction
    that tcrypt.ko must be built as a module, and when builtin, it makes sense
    for tcrypt.ko (which does its work inside the module init function) to run
    as late as possible. So let's switch to late_initcall() instead.

    Signed-off-by: Ard Biesheuvel
    Reviewed-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Move the HiSilicon TRNG V2 driver into 'drivers/crypto/hisilicon/trng'
    and update 'MAINTAINERS' accordingly.

    Signed-off-by: Weili Qian
    Reviewed-by: Zaibo Xu
    Signed-off-by: Herbert Xu

    Weili Qian
     
  • This patch adds support for the pseudo random number generator (PRNG)
    in the Crypto subsystem.

    Signed-off-by: Weili Qian
    Reviewed-by: Zaibo Xu
    Signed-off-by: Herbert Xu

    Weili Qian
     
  • Move existing char/hw_random/hisi-trng-v2.c to crypto/hisilicon/trng.c.

    Signed-off-by: Weili Qian
    Reviewed-by: Zaibo Xu
    Signed-off-by: Herbert Xu

    Weili Qian
     
  • The driver for the HiSilicon true random number generator (TRNG)
    is removed from 'drivers/char/hw_random'.

    Both 'Kunpeng 920' and 'Kunpeng 930' chips have a TRNG;
    however, the PRNG is only supported by 'Kunpeng 930'.
    So this driver is moved to 'drivers/crypto/hisilicon/trng/'
    in the next patch, to better support the TRNG on both chips.

    Signed-off-by: Weili Qian
    Reviewed-by: Zaibo Xu
    Signed-off-by: Herbert Xu

    Weili Qian
     
  • This patch fixes a couple of sparse endianness warnings.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch fixes a sparse endianness warning in sha256-spe.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch fixes a number of endianness warnings in the mips/octeon
    code.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Condition !A || A && B is equivalent to !A || B.

    Generated by: scripts/coccinelle/misc/excluded_middle.cocci

    Fixes: b76f0ea01312 ("coccinelle: misc: add excluded_middle.cocci script")
    CC: Denis Efremov
    Reported-by: kernel test robot
    Signed-off-by: kernel test robot
    Signed-off-by: Julia Lawall
    Signed-off-by: Giovanni Cabiddu
    Signed-off-by: Herbert Xu

    kernel test robot
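
    The equivalence stated above can be confirmed by checking all four
    combinations of A and B. The function names below are hypothetical and
    exist only for this check.

```c
#include <stdbool.h>

static bool sketch_original(bool a, bool b)
{
	return !a || (a && b);
}

static bool sketch_simplified(bool a, bool b)
{
	return !a || b;
}

/* Exhaustively compare both forms over the full truth table. */
bool sketch_forms_equivalent(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			if (sketch_original(a, b) != sketch_simplified(a, b))
				return false;
	return true;
}
```

    The simplification holds because the right-hand side of || is only
    evaluated when !A is false, at which point A is already known to be true.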
     
  • The partial hash was being copied into the final result buffer before
    the entire message block had been processed. Depending on how the end
    user processes this result buffer, errors vary from result buffer
    corruption to result buffer poisoning. Fix this issue by ensuring that
    only the final hash value is copied into the result buffer.

    Reviewed-by: Bjorn Andersson
    Signed-off-by: Thara Gopinath
    Signed-off-by: Herbert Xu

    Thara Gopinath
     
  • Add support for Qualcomm Crypto Engine accelerated encryption and
    authentication algorithms on sdm845.

    Reviewed-by: Bjorn Andersson
    Signed-off-by: Thara Gopinath
    Signed-off-by: Herbert Xu

    Thara Gopinath
     
  • Wiring the SIMD code into the generic driver has the unfortunate side
    effect that the tcrypt testing code cannot distinguish them, and will
    therefore not use the latter to fuzz test the former, as it does for
    other algorithms.

    So let's refactor the code a bit so we can register two implementations:
    aegis128-generic and aegis128-simd.

    Signed-off-by: Ard Biesheuvel
    Reviewed-by: Ondrej Mosnacek
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Instead of calculating the tag and returning it to the caller on
    decryption, use a SIMD compare and min across vector to perform
    the comparison. This is slightly more efficient, and removes the
    need on the caller's part to wipe the tag from memory if the
    decryption failed.

    While at it, switch to unsigned int when passing cryptlen and
    assoclen - we don't support input sizes where it matters anyway.

    Signed-off-by: Ard Biesheuvel
    Reviewed-by: Ondrej Mosnacek
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
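
    A scalar emulation of the "compare and min across vector" idea described
    above, offered as a sketch rather than the NEON code from the patch. It
    assumes a 16-byte tag; the sketch_* names are hypothetical. Each byte
    lane becomes 0xff on a match and 0x00 otherwise, and the minimum over
    all lanes is 0xff only if every lane matched.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TAG_SIZE 16

/* Emulates one lane of a SIMD compare-equal: 0xff if a == b, else 0x00. */
static uint8_t sketch_lane_eq(uint8_t a, uint8_t b)
{
	uint32_t diff = (uint32_t)(a ^ b);

	/* diff == 0 wraps to 0xffffffff; any other value stays below 0x100. */
	return (uint8_t)((diff - 1u) >> 8);
}

/* Returns 1 if the tags match, 0 otherwise, without an early exit. */
int sketch_tag_matches(const uint8_t *computed, const uint8_t *received)
{
	uint8_t min = 0xff;

	for (size_t i = 0; i < SKETCH_TAG_SIZE; i++) {
		uint8_t lane = sketch_lane_eq(computed[i], received[i]);

		/* Lanes are 0x00 or 0xff, so AND gives the same result as min. */
		min &= lane;
	}
	return min == 0xff;
}
```

    Only the pass/fail result leaves the function, so the caller never holds
    a copy of the computed tag that would need wiping.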
     
  • Avoid copying the tail block via a stack buffer if the total size
    exceeds a single AEGIS block. In this case, we can use overlapping
    loads and stores and NEON permutation instructions instead, which
    leads to a modest performance improvement on some cores (< 5%),
    and is slightly cleaner. Note that we still need to use a stack
    buffer if the entire input is smaller than 16 bytes, given that
    we cannot use 16 byte NEON loads and stores safely in this case.

    Signed-off-by: Ard Biesheuvel
    Reviewed-by: Ondrej Mosnacek
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The AEGIS spec mentions explicitly that the security guarantees hold
    only if the resulting plaintext and tag of a failed decryption are
    withheld. So ensure that we abide by this.

    While at it, drop the unused struct aead_request *req parameter from
    crypto_aegis128_process_crypt().

    Reviewed-by: Ondrej Mosnacek
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
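
    A minimal userspace sketch of withholding the plaintext after a failed
    decryption, as described above. The function name and signature are
    hypothetical, and the kernel has its own helpers for wiping memory; the
    volatile loop here is only an assumption-free way to keep the wipe from
    being optimized away in portable C.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Called after decryption into dst. If the tag did not verify, wipe the
 * decrypted bytes so they are never handed back to the caller.
 */
int sketch_finish_decrypt(unsigned char *dst, size_t len, int tag_ok)
{
	volatile unsigned char *p = dst;

	if (tag_ok)
		return 0;

	/* Volatile stores, so the compiler does not drop the wipe. */
	while (len--)
		*p++ = 0;

	return -EBADMSG;
}
```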
     

20 Nov, 2020

8 commits