26 Jul, 2019

40 commits

  • The generic AES code provides four sets of lookup tables, where each
    set consists of four tables containing the same 32-bit values, but
    rotated by 0, 8, 16 and 24 bits, respectively. This makes sense for
    CISC architectures such as x86 which support memory operands, but
    for other architectures, the rotates are quite cheap, and using all
    four tables needlessly thrashes the D-cache, and actually hurts rather
    than helps performance.

    Since x86 already has its own implementation of AEGIS based on AES-NI
    instructions, let's tweak the generic implementation towards other
    architectures, and avoid the prerotated tables, and perform the
    rotations inline. On ARM Cortex-A53, this results in a ~8% speedup.

    Acked-by: Ondrej Mosnacek
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
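
The trade-off described in this commit can be sketched in plain C. This is a minimal userspace illustration, not the kernel's code: the table contents are toy values rather than the real AES tables, and `build_tables()`, `mix_four_tables()` and `mix_one_table()` are hypothetical names. It only shows that one table plus inline rotates yields the same word as four prerotated tables, while touching a quarter of the table memory.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; n must be in 1..31. */
static inline uint32_t rol32(uint32_t w, unsigned int n)
{
	return (w << n) | (w >> (32 - n));
}

/*
 * Fill t[0] with toy values (NOT the real AES table) and derive the
 * three prerotated copies from it, rotated by 8, 16 and 24 bits.
 */
void build_tables(uint32_t t[4][256])
{
	uint32_t i;

	for (i = 0; i < 256; i++) {
		t[0][i] = (i + 1) * 0x9e3779b1u;
		t[1][i] = rol32(t[0][i], 8);
		t[2][i] = rol32(t[0][i], 16);
		t[3][i] = rol32(t[0][i], 24);
	}
}

/* Four-table form: each byte of x indexes its own prerotated table. */
uint32_t mix_four_tables(uint32_t t[4][256], uint32_t x)
{
	return t[0][x & 0xff] ^ t[1][(x >> 8) & 0xff] ^
	       t[2][(x >> 16) & 0xff] ^ t[3][x >> 24];
}

/* Single-table form: same lookups into t0 only, rotations done inline. */
uint32_t mix_one_table(const uint32_t t0[256], uint32_t x)
{
	return t0[x & 0xff] ^
	       rol32(t0[(x >> 8) & 0xff], 8) ^
	       rol32(t0[(x >> 16) & 0xff], 16) ^
	       rol32(t0[x >> 24], 24);
}
```

On architectures where a rotate can be folded into the surrounding arithmetic, the single-table form trades a few cheap instructions for a 1 KiB working set instead of 4 KiB.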
     
  • TFM init/exit routines are optional, so no need to provide empty ones.

    Reviewed-by: Ondrej Mosnacek
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Three variants of AEGIS were proposed for the CAESAR competition, and
    only one was selected for the final portfolio: AEGIS128.

    The other variants, AEGIS128L and AEGIS256, are not likely to ever turn
    up in networking protocols or other places where interoperability
    between Linux and other systems is a concern, nor are they likely to
    be subjected to further cryptanalysis. However, uninformed users may
    think that AEGIS128L (which is faster) is equally fit for use.

    So let's remove them now, before anyone starts using them and we are
    forced to support them forever.

    Note that there are no known flaws in the algorithms or in any of these
    implementations, but they have simply outlived their usefulness.

    Reviewed-by: Ondrej Mosnacek
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • MORUS was not selected as a winner in the CAESAR competition, which
    is not surprising since it is considered to be cryptographically
    broken [0]. (Note that this is not an implementation defect, but a
    flaw in the underlying algorithm.) Since it is unlikely to be in use
    currently, let's remove it before we're stuck with it.

    [0] https://eprint.iacr.org/2019/172.pdf

    Reviewed-by: Ondrej Mosnacek
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Add self-tests for the lzo-rle algorithm.

    Signed-off-by: Hannah Pan
    Signed-off-by: Herbert Xu

    Hannah Pan
     
  • The scalar table based AES routines are not used by other drivers, so
    let's keep it that way and unexport the symbols.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • There are a few copies of the AES S-boxes floating around, so export
    the ones from the AES library so that we can reuse them in other
    modules.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The versions of the AES lookup tables that are only used during the last
    round are never used outside of the driver, so there is no need to
    export their symbols.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Replace a couple of occurrences where the "aes-generic" cipher is
    instantiated explicitly and only used for encryption of a single block.
    Use AES library calls instead.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Use the AES library instead of the cipher interface to perform
    the single block of AES processing involved in updating the key
    of the cmac(aes) hash.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The AMCC code for GCM key derivation allocates an AES cipher to
    perform a single block encryption. So let's switch to the new
    and more lightweight AES library instead.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The bluetooth code uses a bare AES cipher for the encryption operations.
    Given that it carries out a set_key() operation right before every
    encryption operation, this is clearly not a hot path, and so the use of
    the cipher interface (which provides the best implementation available
    on the system) is not really required.

    In fact, when using a cipher like AES-NI or AES-CE, both the set_key()
    and the encrypt() operations involve en/disabling preemption as well as
    stacking and unstacking the SIMD context, and this is most certainly
    not worth it for encrypting 16 bytes of data.

    So let's switch to the new lightweight library interface instead.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • GHASH is used by the GCM mode, which is often used in contexts where
    only synchronous ciphers are permitted. So provide a synchronous version
    of GHASH based on the existing code. This requires a non-SIMD fallback
    to deal with invocations occurring from a context where SIMD instructions
    may not be used.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • AES in CTR mode is used by modes such as GCM and CCM, which are often
    used in contexts where only synchronous ciphers are permitted. So
    provide a synchronous version of ctr(aes) based on the existing code.
    This requires a non-SIMD fallback to deal with invocations occurring
    from a context where SIMD instructions may not be used. We have a
    helper for this now in the AES library, so wire that up.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • AES in CTR mode is used by modes such as GCM and CCM, which are often
    used in contexts where only synchronous ciphers are permitted. So
    provide a synchronous version of ctr(aes) based on the existing code.
    This requires a non-SIMD fallback to deal with invocations occurring
    from a context where SIMD instructions may not be used. We have a
    helper for this now in the AES library, so wire that up.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Align ARM's hw instruction based AES implementation with other versions
    that keep the key schedule in native endianness. This will allow us to
    merge the various implementations going forward.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Instead of calling into the table based scalar AES code in situations
    where the SIMD unit may not be used, use the generic AES code, which
    is more appropriate since it is less likely to be susceptible to
    timing attacks.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • In preparation for duplicating the sync ctr(aes) functionality to modules
    under arch/arm, move the helper function from an inline .h file to the
    AES library, which is already depended upon by the drivers that use this
    fallback.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Add a static inline helper modeled after crypto_cbc_encrypt_walk()
    that can be reused for SIMD algorithms that need to implement a
    non-SIMD fallback for performing CTR encryption.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
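
As a rough picture of what such a CTR helper does, here is a minimal userspace sketch. It is an assumption-laden illustration, not the kernel helper: `block_fn`, `ctr_crypt()` and `toy_block()` are hypothetical names, the buffers are flat arrays rather than a scatterlist walk, and the block function is a toy XOR, NOT AES. The point is that CTR mode only needs a single-block encrypt callback, so any non-SIMD block routine can serve as the fallback.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16

/* Hypothetical callback type: encrypt one 16-byte block from in to out. */
typedef void (*block_fn)(const void *ctx, uint8_t *out, const uint8_t *in);

/* Increment a 128-bit big-endian counter in place. */
static void ctr_inc(uint8_t *ctr)
{
	int i;

	for (i = BLK - 1; i >= 0; i--)
		if (++ctr[i] != 0)
			break;
}

/*
 * CTR en/decryption: encrypt the counter to get a keystream block, XOR it
 * into the data, bump the counter. A short final block uses only the
 * leading keystream bytes. dst may equal src.
 */
void ctr_crypt(block_fn fn, const void *ctx, uint8_t *ctr,
	       uint8_t *dst, const uint8_t *src, size_t len)
{
	uint8_t ks[BLK];
	size_t i, n;

	while (len) {
		n = len < BLK ? len : BLK;
		fn(ctx, ks, ctr);
		for (i = 0; i < n; i++)
			dst[i] = src[i] ^ ks[i];
		ctr_inc(ctr);
		dst += n;
		src += n;
		len -= n;
	}
}

/* Toy block function for demonstration only: XOR with the key. NOT AES. */
void toy_block(const void *ctx, uint8_t *out, const uint8_t *in)
{
	const uint8_t *key = ctx;
	int i;

	for (i = 0; i < BLK; i++)
		out[i] = in[i] ^ key[i];
}
```

Because encryption and decryption are the same operation in CTR mode, running `ctr_crypt()` twice with the same starting counter returns the original data.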
     
  • Drop aes-generic's version of crypto_aes_expand_key(), and switch to
    the key expansion routine provided by the AES library. AES key expansion
    is not performance critical, and it is better to have a single version
    shared by all AES implementations.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Switch to the new AES library that also provides an implementation of
    the AES key expansion routine. This removes the dependency on the
    generic AES cipher, allowing it to be omitted entirely in the future.

    While at it, remove some references to the table based arm64 version
    of AES and replace them with AES library calls as well.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Switch to the new AES library that also provides an implementation of
    the AES key expansion routine. This removes the dependency on the
    generic AES cipher, allowing it to be omitted entirely in the future.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The CCM code calls directly into the scalar table based AES cipher for
    arm64 from the fallback path, and since this implementation is known
    not to be time invariant, calling it from a time invariant SIMD cipher
    is a bit nasty.

    So let's switch to the AES library - this makes the code more robust,
    and drops the dependency on the generic AES cipher, allowing us to
    omit it entirely in the future.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Switch to the new AES library that also provides an implementation of
    the AES key expansion routine. This removes the dependency on the
    generic AES cipher, allowing it to be omitted entirely in the future.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The GHASH code uses the generic AES key expansion routines, and calls
    directly into the scalar table based AES cipher for arm64 from the
    fallback path, and since this implementation is known not to be time
    invariant, calling it from a time invariant SIMD cipher is a bit nasty.

    So let's switch to the AES library - this makes the code more robust,
    and drops the dependency on the generic AES cipher, allowing us to
    omit it entirely in the future.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Switch to the new AES library that also provides an implementation of
    the AES key expansion routine. This removes the dependency on the
    generic AES cipher, allowing it to be omitted entirely in the future.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Switch to the new AES library that also provides an implementation of
    the AES key expansion routine. This removes the dependency on the
    generic AES cipher, allowing it to be omitted entirely in the future.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Switch to the new AES library that also provides an implementation of
    the AES key expansion routine. This removes the dependency on the
    generic AES cipher, allowing it to be omitted entirely in the future.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The AES assembler code for x86 isn't actually faster than code
    generated by the compiler from aes_generic.c, and considering
    the disproportionate maintenance burden of assembler code on
    x86, it is better just to drop it entirely. Modern x86 systems
    will use AES-NI anyway, and given that the modules being removed
    have a dependency on aes_generic already, we can remove them
    without running the risk of regressions.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The AES-NI code contains fallbacks for invocations that occur from a
    context where the SIMD unit is unavailable, which really only occurs
    when running in softirq context that was entered from a hard IRQ that
    was taken while running kernel code that was already using the FPU.

    That means performance is not really a consideration, and we can just
    use the new library code for this use case, which has a smaller
    footprint and is believed to be time invariant. This will allow us to
    drop the non-SIMD asm routines in a subsequent patch.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
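
The dispatch pattern this commit relies on can be modelled in userspace as follows. Everything here is a hypothetical stand-in: `simd_usable_now` replaces the kernel's check for whether the SIMD unit may be used, and both "ciphers" are a toy XOR rather than AES. The sketch only shows the shape of the code: one key schedule shared by both paths, and a cheap library path taken when SIMD is off limits.

```c
#include <stdbool.h>
#include <stdint.h>

/* One key schedule, shared by the fast path and the fallback. */
struct toy_ctx {
	uint8_t round_key[16];
};

/* Hypothetical stand-in for the kernel's "may SIMD be used here?" check. */
bool simd_usable_now = true;

/* Counts how often the fallback was taken (for demonstration only). */
int fallback_calls;

/* Stand-in for the SIMD routine. Toy XOR, NOT AES. */
static void simd_encrypt(const struct toy_ctx *ctx, uint8_t *out,
			 const uint8_t *in)
{
	int i;

	for (i = 0; i < 16; i++)
		out[i] = in[i] ^ ctx->round_key[i];
}

/* Stand-in for the small library routine; must agree with the fast path. */
static void library_encrypt(const struct toy_ctx *ctx, uint8_t *out,
			    const uint8_t *in)
{
	int i;

	fallback_calls++;
	for (i = 15; i >= 0; i--)
		out[i] = ctx->round_key[i] ^ in[i];
}

void encrypt_block(const struct toy_ctx *ctx, uint8_t *out, const uint8_t *in)
{
	if (!simd_usable_now) {
		library_encrypt(ctx, out, in);
		return;
	}
	/* In the kernel, the SIMD region would be bracketed here. */
	simd_encrypt(ctx, out, in);
}
```

This only works if both paths accept the same key schedule layout, which is why schedule compatibility between implementations matters elsewhere in this series.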
     
  • Take the existing small footprint and mostly time invariant C code
    and turn it into an AES library that can be used for non-performance
    critical, casual use of AES, and as a fallback for, e.g., SIMD code
    that needs a secondary path that can be taken in contexts where the
    SIMD unit is off limits (e.g., in hard interrupts taken from kernel
    context).

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The fixed time AES code mangles the key schedule so that xoring the
    first round key with values at fixed offsets across the Sbox produces
    the correct value. This primes the D-cache with the entire Sbox before
    any data dependent lookups are done, making it more difficult to infer
    key bits from timing variances when the plaintext is known.

    The downside of this approach is that it renders the key schedule
    incompatible with other implementations of AES in the kernel, which
    makes it cumbersome to use this implementation as a fallback for SIMD
    based AES in contexts where this is not allowed.

    So let's tweak the fixed Sbox indexes so that they add up to zero under
    the xor operation. While at it, increase the granularity to 16 bytes so
    we cover the entire Sbox even on systems with 16 byte cachelines.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
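
The "add up to zero" property can be checked with a small userspace program. This is a sketch under stated assumptions: the S-box is generated from its textbook definition, and the index groups in `prime_idx` are illustrative choices that satisfy the two properties the commit asks for (each group XORs to zero, and together they touch every 16-byte line of the S-box). The names `aes_sbox_init()`, `prime_xor()` and `prime_lines()` are hypothetical.

```c
#include <stdint.h>

static uint8_t rotl8(uint8_t x, unsigned int s)
{
	return (uint8_t)((x << s) | (x >> (8 - s)));
}

/* Generate the AES S-box: inverse in GF(2^8), then the affine map. */
void aes_sbox_init(uint8_t sbox[256])
{
	uint8_t p = 1, q = 1;

	do {
		/* p walks the field by repeated multiplication by 3 ... */
		p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1b : 0));
		/* ... while q is divided by 3, so q stays the inverse of p. */
		q ^= (uint8_t)(q << 1);
		q ^= (uint8_t)(q << 2);
		q ^= (uint8_t)(q << 4);
		q ^= (q & 0x80) ? 0x09 : 0;
		sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^
				    rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
	} while (p != 1);
	sbox[0] = 0x63;	/* zero has no inverse and is defined separately */
}

/* Illustrative index groups: one index in each 16-byte line of the S-box. */
const uint8_t prime_idx[4][4] = {
	{  0,  64, 134, 195 },
	{ 16,  82, 158, 221 },
	{ 32,  96, 160, 234 },
	{ 48, 112, 186, 241 },
};

/* XOR of the four S-box bytes in one group; zero leaves the state as is. */
uint8_t prime_xor(const uint8_t sbox[256], const uint8_t idx[4])
{
	return sbox[idx[0]] ^ sbox[idx[1]] ^ sbox[idx[2]] ^ sbox[idx[3]];
}

/* Bitmask of the 16-byte S-box lines touched by all groups together. */
uint16_t prime_lines(void)
{
	uint16_t mask = 0;
	int g, k;

	for (g = 0; g < 4; g++)
		for (k = 0; k < 4; k++)
			mask |= (uint16_t)(1u << (prime_idx[g][k] / 16));
	return mask;
}
```

XORing one such group into each state word loads the whole S-box before any data dependent lookup, yet changes nothing, so the key schedule can stay in the standard layout.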
     
  • Rename some local AES encrypt/decrypt routines so they don't clash with
    the names we are about to introduce for the routines exposed by the
    generic AES library.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Rearrange the aes_algs[] array for legibility.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • This patch adds support for the specific corner case of performing HMAC
    on an empty string (i.e. payload length is zero). This solves the last
    failing cryptomgr extra tests for HMAC.

    Signed-off-by: Pascal van Leeuwen
    Signed-off-by: Herbert Xu

    Pascal van Leeuwen
     
  • This patch fixes an issue with hash and HMAC operations that perform
    "large" intermediate updates (i.e. combined size > 2 hash blocks) by
    actually making use of the hardware's hash continue capabilities.
    The original implementation would cache these updates in a buffer that
    was 2 hash blocks in size and fail if all update calls combined would
    overflow that buffer, which caused the cryptomgr extra tests to fail.

    Signed-off-by: Pascal van Leeuwen
    Signed-off-by: Herbert Xu

    Pascal van Leeuwen
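
The difference between caching every update and continuing the hash can be sketched in userspace. This is not the driver's code: `toy_compress()` stands in for handing one block to the hardware, the state is a toy 32-bit value rather than a real digest, and all names are hypothetical. The idea shown is that full blocks are consumed as they arrive, so only a partial block ever needs to be buffered and the total update size is unbounded.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HBLK 64	/* block size in bytes, as for SHA-1 and SHA-256 */

struct toy_hash {
	uint32_t state;		/* running state (toy, NOT a real hash) */
	uint8_t buf[HBLK];	/* holds at most one partial block */
	size_t buflen;
};

/* Fold one full block into the state; stands in for the hardware. */
static void toy_compress(struct toy_hash *h, const uint8_t *blk)
{
	size_t i;

	for (i = 0; i < HBLK; i++)
		h->state = (h->state ^ blk[i]) * 16777619u;
}

void toy_init(struct toy_hash *h)
{
	h->state = 2166136261u;
	h->buflen = 0;
}

void toy_update(struct toy_hash *h, const uint8_t *data, size_t len)
{
	/* Top up a pending partial block first. */
	if (h->buflen) {
		size_t n = HBLK - h->buflen;

		if (n > len)
			n = len;
		memcpy(h->buf + h->buflen, data, n);
		h->buflen += n;
		data += n;
		len -= n;
		if (h->buflen < HBLK)
			return;
		toy_compress(h, h->buf);
		h->buflen = 0;
	}
	/* Continue the hash over every full block without buffering it. */
	while (len >= HBLK) {
		toy_compress(h, data);
		data += HBLK;
		len -= HBLK;
	}
	/* Keep only the trailing partial block for the next call. */
	memcpy(h->buf, data, len);
	h->buflen = len;
}
```

Splitting the input across several `toy_update()` calls leaves the same state and the same pending bytes as a single call over the whole input.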
     
  • The driver was loading the initial digest for hash operations into
    the hardware explicitly, but this is not needed as the hardware can
    handle that by itself, which is more efficient and avoids any context
    record coherence issues.

    Signed-off-by: Pascal van Leeuwen
    Signed-off-by: Herbert Xu

    Pascal van Leeuwen