09 Dec, 2013

1 commit

  • Commit fe8c8a126806 introduced a possible build error for archs
    that do not have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set. :/
    Fix this up by bringing else braces outside of the ifdef.

    Reported-by: Fengguang Wu
    Fixes: fe8c8a126806 ("crypto: more robust crypto_memneq")
    Signed-off-by: Daniel Borkmann
    Acked-By: Cesar Eduardo Barros
    Signed-off-by: Herbert Xu

    Daniel Borkmann
     

05 Dec, 2013

1 commit

  • Disabling compiler optimizations can be fragile, since a new
    optimization could be added to -O0 or -Os that breaks the assumptions
    the code is making.

    Instead of disabling compiler optimizations, use a dummy inline assembly
    (based on RELOC_HIDE) to block the problematic kinds of optimization,
    while still allowing other optimizations to be applied to the code.

    The dummy inline assembly is added after every OR, and has the
    accumulator variable as its input and output. The compiler is forced to
    assume that the dummy inline assembly could both depend on the
    accumulator variable and change the accumulator variable, so it is
    forced to compute the value correctly before the inline assembly, and
    cannot assume anything about its value after the inline assembly.

    This change should be enough to make crypto_memneq work correctly (with
    data-independent timing) even if it is inlined at its call sites. That
    can be done later in a followup patch.

    Compile-tested on x86_64.

    Signed-off-by: Cesar Eduardo Barros
    Acked-by: Daniel Borkmann
    Signed-off-by: Herbert Xu

    Cesar Eduardo Barros
     

07 Oct, 2013

1 commit

  • When comparing MAC hashes, AEAD authentication tags, or other hash
    values in the context of authentication or integrity checking, it
    is important not to leak timing information to a potential attacker,
    i.e. when communication happens over a network.

    Bytewise memory comparisons (such as memcmp) are usually optimized so
    that they return a nonzero value as soon as a mismatch is found. E.g,
    on x86_64/i5 for 512 bytes this can be ~50 cyc for a full mismatch
    and up to ~850 cyc for a full match (cold). This early-return behavior
    can leak timing information as a side channel, allowing an attacker to
    iteratively guess the correct result.

    This patch adds a new method crypto_memneq ("memory not equal to each
    other") to the crypto API that compares memory areas of the same length
    in roughly "constant time" (cache misses could change the timing, but
    since they don't reveal information about the content of the strings
    being compared, they are effectively benign). Iow, best and worst case
    behaviour take the same amount of time to complete (in contrast to
    memcmp).

    Note that crypto_memneq (unlike memcmp) can only be used to test for
    equality or inequality, NOT for lexicographical order. This, however,
    is not an issue for its use-cases within the crypto API.

    We tried to locate all of the places in the crypto API where memcmp was
    being used for authentication or integrity checking, and convert them
    over to crypto_memneq.

    crypto_memneq is declared noinline, placed in its own source file,
    and compiled with optimizations that might increase code size disabled
    ("Os") because a smart compiler (or LTO) might notice that the return
    value is always compared against zero/nonzero, and might then
    reintroduce the same early-return optimization that we are trying to
    avoid.

    Using #pragma or __attribute__ optimization annotations of the code
    for disabling optimization was avoided as it seems to be considered
    broken or unmaintained for long time in GCC [1]. Therefore, we work
    around that by specifying the compile flag for memneq.o directly in
    the Makefile. We found that this seems to be most appropriate.

    As we use ("Os"), this patch also provides a loop-free "fast-path" for
    frequently used 16 byte digests. Similarly to kernel library string
    functions, leave an option for future even further optimized architecture
    specific assembler implementations.

    This was a joint work of James Yonan and Daniel Borkmann. Also thanks
    for feedback from Florian Weimer on this and earlier proposals [2].

    [1] http://gcc.gnu.org/ml/gcc/2012-07/msg00211.html
    [2] https://lkml.org/lkml/2013/2/10/131

    Signed-off-by: James Yonan
    Signed-off-by: Daniel Borkmann
    Cc: Florian Weimer
    Signed-off-by: Herbert Xu

    James Yonan