26 Jul, 2019

2 commits

  • Take the existing small-footprint, mostly time-invariant C code and
    turn it into an AES library that can be used for casual,
    non-performance-critical AES, and as a fallback for, e.g., SIMD code
    that needs a secondary path for contexts where the SIMD unit is off
    limits (e.g., hard interrupts taken from kernel context).

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The fixed time AES code mangles the key schedule so that xoring the
    first round key with values at fixed offsets across the Sbox produces
    the correct value. This primes the D-cache with the entire Sbox before
    any data dependent lookups are done, making it more difficult to infer
    key bits from timing variances when the plaintext is known.

    The downside of this approach is that it renders the key schedule
    incompatible with other implementations of AES in the kernel, which
    makes it cumbersome to use this implementation as a fallback for SIMD
    based AES in contexts where SIMD use is not allowed.

    So let's tweak the fixed Sbox indexes so that they add up to zero under
    the xor operation. While at it, increase the granularity to 16 bytes so
    we cover the entire Sbox even on systems with 16 byte cachelines.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
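
    The index choice described in the commit above can be sketched in C.
    This is an illustration under stated assumptions, not the kernel's
    code: it rebuilds the standard AES S-box, then searches for one index
    in each of four 16-byte chunks whose S-box bytes XOR to zero. The
    names build_sbox, find_zero_sum and prime_and_xor, and the grouping
    of chunks g, g + 4, g + 8 and g + 12, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ROTL8(x, n) ((uint8_t)(((x) << (n)) | ((x) >> (8 - (n)))))

/* Build the standard AES S-box: inverse in GF(2^8), then the affine map. */
static void build_sbox(uint8_t sbox[256])
{
    uint8_t p = 1, q = 1;

    do {
        /* p walks all nonzero field elements: multiply by 3 */
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1b : 0));
        /* q tracks the inverse of p: divide by 3 */
        q ^= (uint8_t)(q << 1);
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        q ^= (q & 0x80) ? 0x09 : 0;
        sbox[p] = (uint8_t)(q ^ ROTL8(q, 1) ^ ROTL8(q, 2) ^
                            ROTL8(q, 3) ^ ROTL8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63; /* zero has no inverse, so it is set directly */
}

/*
 * For group g (0..3), pick one index from each of the 16-byte chunks
 * g, g + 4, g + 8 and g + 12 so that the four S-box bytes XOR to zero.
 * Returns 1 and fills idx[] on success, 0 if no such combination exists.
 */
static int find_zero_sum(const uint8_t sbox[256], int g, int idx[4])
{
    int a, b, c, d;

    for (a = 0; a < 16; a++)
        for (b = 0; b < 16; b++)
            for (c = 0; c < 16; c++)
                for (d = 0; d < 16; d++) {
                    int i0 = 16 * g + a;
                    int i1 = 16 * (g + 4) + b;
                    int i2 = 16 * (g + 8) + c;
                    int i3 = 16 * (g + 12) + d;

                    if ((sbox[i0] ^ sbox[i1] ^ sbox[i2] ^ sbox[i3]) == 0) {
                        idx[0] = i0;
                        idx[1] = i1;
                        idx[2] = i2;
                        idx[3] = i3;
                        return 1;
                    }
                }
    return 0;
}

/*
 * XOR the four chosen S-box bytes into a key byte. The volatile pointer
 * keeps the compiler from dropping the loads; the value is unchanged
 * because the four bytes cancel out.
 */
static uint8_t prime_and_xor(const uint8_t sbox[256], const int idx[4],
                             uint8_t key_byte)
{
    const volatile uint8_t *t = sbox;

    return (uint8_t)(key_byte ^ t[idx[0]] ^ t[idx[1]] ^ t[idx[2]] ^ t[idx[3]]);
}
```

    Running find_zero_sum for g = 0..3 touches all 16 chunks of the
    256-byte table, and because each quartet cancels, the round key
    keeps its standard value and stays usable by other AES code.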
     

19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

09 Nov, 2018

1 commit

  • In the "aes-fixed-time" AES implementation, disable interrupts while
    accessing the S-box, in order to make cache-timing attacks more
    difficult. Previously it was possible for the CPU to be interrupted
    while the S-box was loaded into L1 cache, potentially evicting the
    cachelines and causing later table lookups to be time-variant.

    In tests I did on x86 and ARM, this doesn't affect performance
    significantly. Responsiveness is potentially a concern, but interrupts
    are only disabled for a single AES block.

    Note that even after this change, the implementation still isn't
    necessarily guaranteed to be constant-time; see
    https://cr.yp.to/antiforgery/cachetiming-20050414.pdf for a discussion
    of the many difficulties involved in writing truly constant-time AES
    software. But it's valuable to make such attacks more difficult.

    Reviewed-by: Ard Biesheuvel
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
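
    The bracket described in the commit above can be sketched in
    userspace C. The kernel offers local_irq_save() and
    local_irq_restore() for this kind of critical section, but the
    functions below (sketch_irq_save, sketch_irq_restore,
    lookup_one_block) are hypothetical stand-ins that only track a flag,
    so the ordering can be checked outside the kernel. The 16-byte
    stride is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the interrupt state; 1 means "interrupts off". */
static int sketch_irqs_off;

/* Remember the previous state, then mark interrupts as disabled. */
static unsigned long sketch_irq_save(void)
{
    unsigned long flags = (unsigned long)sketch_irqs_off;

    sketch_irqs_off = 1;
    return flags;
}

/* Put back whatever state was in effect before the matching save. */
static void sketch_irq_restore(unsigned long flags)
{
    sketch_irqs_off = (int)flags;
}

/*
 * Process "one block": warm the whole table, then do the secret-dependent
 * lookup, all inside the interrupts-off window so that nothing can evict
 * the table between the two steps.
 */
static uint8_t lookup_one_block(const uint8_t table[256], uint8_t secret)
{
    const volatile uint8_t *t = table;
    unsigned long flags = sketch_irq_save();
    uint8_t warm = 0;
    uint8_t out;
    size_t i;

    assert(sketch_irqs_off);

    /* Data-independent pass: one load per assumed 16-byte chunk. */
    for (i = 0; i < 256; i += 16)
        warm ^= t[i];

    /* Data-dependent lookup, done while the table should still be cached. */
    out = t[secret];

    sketch_irq_restore(flags);
    (void)warm;
    return out;
}
```

    Saving and restoring the previous state, rather than unconditionally
    re-enabling, keeps the sketch correct when the caller already runs
    with interrupts off.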
     

19 Jun, 2017

1 commit


11 Feb, 2017

1 commit

  • Lookup table based AES is sensitive to timing attacks because the
    table lookups are data dependent and the 8 KB worth of tables covers
    a significant number of cachelines on any architecture, resulting in
    an exploitable correlation between the key and the processing time
    for known plaintexts.

    For network facing algorithms such as CTR, CCM or GCM, this presents a
    security risk, which is why arch specific AES ports are typically time
    invariant, either through the use of special instructions, or by using
    SIMD algorithms that don't rely on table lookups.

    For generic code, this is difficult to achieve without losing too much
    performance, but we can improve the situation significantly by switching
    to an implementation that only needs 256 bytes of table data (the actual
    S-box itself), which can be prefetched at the start of each block to
    eliminate data dependent latencies.

    This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
    ordinary generic AES driver manages 18 cycles per byte on this
    hardware). Decryption is substantially slower.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
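
    The cacheline argument in the commit above can be made concrete with
    a small C sketch. Assuming 64-byte cachelines and line-aligned
    tables, 8 KB of tables spans 128 lines while a 256-byte S-box spans
    4, so touching every line before each block is cheap. The helper
    names lines_spanned and touch_every_line are hypothetical, and this
    illustrates the idea rather than the driver's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of cache lines a line-aligned table of table_bytes occupies. */
static size_t lines_spanned(size_t table_bytes, size_t line_bytes)
{
    return (table_bytes + line_bytes - 1) / line_bytes;
}

/*
 * Load one byte from every assumed cache line of the table. The volatile
 * pointer keeps the compiler from removing the loads, and the XOR of the
 * sampled bytes is returned so the result is observably used.
 */
static uint8_t touch_every_line(const uint8_t *table, size_t len,
                                size_t line_bytes)
{
    const volatile uint8_t *t = table;
    uint8_t acc = 0;
    size_t i;

    for (i = 0; i < len; i += line_bytes)
        acc ^= t[i];
    return acc;
}
```

    Calling touch_every_line(sbox, 256, 64) issues four loads, whereas
    covering 8 KB of tables the same way would take 128 loads per block.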