05 Mar, 2017

1 commit

  • Pull crypto fixes from Herbert Xu:

    - vmalloc stack regression in CCM

    - Build problem in CRC32 on ARM

    - Memory leak in cavium

    - Missing Kconfig dependencies in atmel and mediatek

    - XTS Regression on some platforms (s390 and ppc)

    - Memory overrun in CCM test vector

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    crypto: vmx - Use skcipher for xts fallback
    crypto: vmx - Use skcipher for cbc fallback
    crypto: testmgr - Pad aes_ccm_enc_tv_template vector
    crypto: arm/crc32 - add build time test for CRC instruction support
    crypto: arm/crc32 - fix build error with outdated binutils
    crypto: ccm - move cbcmac input off the stack
    crypto: xts - Propagate NEED_FALLBACK bit
    crypto: api - Add crypto_requires_off helper
    crypto: atmel - CRYPTO_DEV_MEDIATEK should depend on HAS_DMA
    crypto: atmel - CRYPTO_DEV_ATMEL_TDES and CRYPTO_DEV_ATMEL_SHA should depend on HAS_DMA
    crypto: cavium - fix leak on curr if curr->head fails to be allocated
    crypto: cavium - Fix couple of static checker errors

    Linus Torvalds
     

02 Mar, 2017

3 commits


01 Mar, 2017

1 commit

  • Running with KASAN and crypto tests currently gives

    BUG: KASAN: global-out-of-bounds in __test_aead+0x9d9/0x2200 at addr ffffffff8212fca0
    Read of size 16 by task cryptomgr_test/1107
    Address belongs to variable 0xffffffff8212fca0
    CPU: 0 PID: 1107 Comm: cryptomgr_test Not tainted 4.10.0+ #45
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
    Call Trace:
    dump_stack+0x63/0x8a
    kasan_report.part.1+0x4a7/0x4e0
    ? __test_aead+0x9d9/0x2200
    ? crypto_ccm_init_crypt+0x218/0x3c0 [ccm]
    kasan_report+0x20/0x30
    check_memory_region+0x13c/0x1a0
    memcpy+0x23/0x50
    __test_aead+0x9d9/0x2200
    ? kasan_unpoison_shadow+0x35/0x50
    ? alg_test_akcipher+0xf0/0xf0
    ? crypto_skcipher_init_tfm+0x2e3/0x310
    ? crypto_spawn_tfm2+0x37/0x60
    ? crypto_ccm_init_tfm+0xa9/0xd0 [ccm]
    ? crypto_aead_init_tfm+0x7b/0x90
    ? crypto_alloc_tfm+0xc4/0x190
    test_aead+0x28/0xc0
    alg_test_aead+0x54/0xd0
    alg_test+0x1eb/0x3d0
    ? alg_find_test+0x90/0x90
    ? __sched_text_start+0x8/0x8
    ? __wake_up_common+0x70/0xb0
    cryptomgr_test+0x4d/0x60
    kthread+0x173/0x1c0
    ? crypto_acomp_scomp_free_ctx+0x60/0x60
    ? kthread_create_on_node+0xa0/0xa0
    ret_from_fork+0x2c/0x40
    Memory state around the buggy address:
    ffffffff8212fb80: 00 00 00 00 01 fa fa fa fa fa fa fa 00 00 00 00
    ffffffff8212fc00: 00 01 fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
    >ffffffff8212fc80: fa fa fa fa 00 05 fa fa fa fa fa fa 00 00 00 00
    ^
    ffffffff8212fd00: 01 fa fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
    ffffffff8212fd80: fa fa fa fa 00 00 00 00 00 05 fa fa fa fa fa fa

    This always happens on the same IV which is less than 16 bytes.

    Per Ard,

    "CCM IVs are 16 bytes, but due to the way they are constructed
    internally, the final couple of bytes of input IV are don't-cares.

    Apparently, we do read all 16 bytes, which triggers the KASAN errors."

    Fix this by padding the IV with null bytes to be at least 16 bytes.

    Cc: stable@vger.kernel.org
    Fixes: 0bc5a6c5c79a ("crypto: testmgr - Disable rfc4309 test and convert test vectors")
    Acked-by: Ard Biesheuvel
    Signed-off-by: Laura Abbott
    Signed-off-by: Herbert Xu

    Laura Abbott
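
    The padding idea in the commit above can be sketched in userspace C.
    This is a minimal illustration, not the testmgr change itself: the
    commit pads the test vector data, whereas the hypothetical helper
    iv_pad() below pads at copy time. The name CCM_IV_SIZE is also an
    assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CCM_IV_SIZE 16 /* bytes the consumer always reads, per the commit text */

/*
 * Hypothetical helper: build a full-size IV from a shorter one by
 * appending zero bytes, so that reading CCM_IV_SIZE bytes stays in bounds.
 */
static void iv_pad(unsigned char out[CCM_IV_SIZE],
                   const unsigned char *iv, size_t iv_len)
{
    assert(iv_len <= CCM_IV_SIZE);
    memset(out, 0, CCM_IV_SIZE);   /* null padding for the tail */
    memcpy(out, iv, iv_len);       /* caller-supplied prefix */
}
```

    Any later 16-byte read of the padded buffer is then in bounds, which
    is the property KASAN was checking.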
     

28 Feb, 2017

1 commit

  • Commit f15f05b0a5de ("crypto: ccm - switch to separate cbcmac driver")
    refactored the CCM driver to allow separate implementations of the
    underlying MAC to be provided by a platform. However, in doing so, it
    moved some data from the linear region to the stack, which violates the
    SG constraints when the stack is virtually mapped.

    So move idata/odata back to the request ctx struct, which we can
    reasonably expect to have been allocated using kmalloc() et al.

    Reported-by: Johannes Berg
    Fixes: f15f05b0a5de ("crypto: ccm - switch to separate cbcmac driver")
    Signed-off-by: Ard Biesheuvel
    Tested-by: Johannes Berg
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     

27 Feb, 2017

1 commit

  • When we're used as a fallback algorithm, we should propagate
    the NEED_FALLBACK bit when searching for the underlying ECB mode.

    This just happens to fix a hang too because otherwise the search
    may end up loading the same module that triggered this XTS creation.

    Cc: stable@vger.kernel.org #4.10
    Fixes: f1c131b45410 ("crypto: xts - Convert to skcipher")
    Reported-by: Harald Freudenberger
    Signed-off-by: Herbert Xu

    Herbert Xu
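
    The bit propagation described in the commit above can be sketched as
    follows. This is a userspace illustration under assumptions: the flag
    value ALG_NEED_FALLBACK and the function names are chosen for this
    sketch and should be checked against the kernel headers. The idea is
    that if the caller asked for an algorithm that does not itself need a
    fallback, the same restriction is passed on to the inner lookup.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag value, for illustration only. */
#define ALG_NEED_FALLBACK 0x00000100u

/*
 * Returns the bits of `off` that the caller requires to be clear:
 * those selected by `mask` and not set in `type`.
 */
static uint32_t requires_off(uint32_t type, uint32_t mask, uint32_t off)
{
    return (type ^ off) & mask & off;
}

/*
 * Mask to use when looking up the inner algorithm: forbid
 * NEED_FALLBACK implementations if, and only if, our caller did.
 */
static uint32_t inner_lookup_mask(uint32_t type, uint32_t mask)
{
    uint32_t m = requires_off(type, mask, ALG_NEED_FALLBACK);

    assert(m == 0 || m == ALG_NEED_FALLBACK); /* only that bit can survive */
    return m;
}
```

    With the restriction propagated, the inner search cannot select the
    very implementation that requested the fallback in the first place.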
     

25 Feb, 2017

1 commit

  • Update the crypto modules using LZ4 compression as well as the test
    cases in testmgr.h to work with the new LZ4 module version.

    Link: http://lkml.kernel.org/r/1486321748-19085-4-git-send-email-4sschmid@informatik.uni-hamburg.de
    Signed-off-by: Sven Schmidt
    Cc: Bongkyu Kim
    Cc: Rui Salvaterra
    Cc: Sergey Senozhatsky
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: David S. Miller
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Cc: Kees Cook
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sven Schmidt
     

23 Feb, 2017

1 commit

  • Since commit f1c131b45410a202eb45cc55980a7a9e4e4b4f40
    ("crypto: xts - Convert to skcipher"), the XTS mode is based on ECB,
    so the mode must select ECB; otherwise it can fail to initialize.

    Signed-off-by: Milan Broz
    Signed-off-by: Herbert Xu

    Milan Broz
     

15 Feb, 2017

2 commits

  • The CCM driver forces 32-bit alignment even if the underlying ciphers
    don't care about alignment. This is because crypto_xor() used to require
    this, but since this is no longer the case, drop the hardcoded minimum
    of 32 bits.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The CCM driver was recently updated to defer the MAC part of the algorithm
    to a dedicated crypto transform, and a template for instantiating such
    transforms was added at the same time.

    However, this new cbcmac template fails to take the alignmask of the
    encapsulated cipher into account, which may result in buffer addresses
    being passed down that are not sufficiently aligned.

    So update the code to ensure that the digest buffer in the desc ctx
    appears at a sufficiently aligned offset, and tweak the code so that all
    calls to crypto_cipher_encrypt_one() operate on this buffer exclusively.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
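
    Placing a buffer at a sufficiently aligned offset, as the commit above
    describes, comes down to rounding an address up by the alignmask. A
    minimal userspace sketch, with the hypothetical helper name
    ptr_align_up() and assuming the alignmask has the usual 2^n - 1 form:

```c
#include <assert.h>
#include <stdint.h>

/* Round p up to the next multiple of (alignmask + 1). */
static void *ptr_align_up(void *p, uintptr_t alignmask)
{
    /* alignmask must be one less than a power of two, e.g. 3, 7, 15 */
    assert(((alignmask + 1) & alignmask) == 0);
    return (void *)(((uintptr_t)p + alignmask) & ~alignmask);
}
```

    The context structure has to reserve up to alignmask extra bytes so
    that the rounded-up pointer still lies inside it.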
     

11 Feb, 2017

6 commits

  • Instead of unconditionally forcing 4 byte alignment for all generic
    chaining modes that rely on crypto_xor() or crypto_inc() (which may
    result in unnecessary copying of data when the underlying hardware
    can perform unaligned accesses efficiently), make those functions
    deal with unaligned input explicitly, but only if the Kconfig symbol
    HAVE_EFFICIENT_UNALIGNED_ACCESS is set. This will allow us to drop
    the alignmasks from the CBC, CMAC, CTR, CTS, PCBC and SEQIV drivers.

    For crypto_inc(), this simply involves making the 4-byte stride
    conditional on HAVE_EFFICIENT_UNALIGNED_ACCESS being set, given that
    it typically operates on 16 byte buffers.

    For crypto_xor(), an algorithm is implemented that simply runs through
    the input using the largest strides possible if unaligned accesses are
    allowed. If they are not, an optimal sequence of memory accesses is
    emitted that takes the relative alignment of the input buffers into
    account, e.g., if the relative misalignment of dst and src is 4 bytes,
    the entire xor operation will be completed using 4 byte loads and stores
    (modulo unaligned bits at the start and end). Note that all expressions
    involving misalign are simply eliminated by the compiler when
    HAVE_EFFICIENT_UNALIGNED_ACCESS is defined.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
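
    The two helpers discussed in the commit above can be approximated in
    userspace as follows. This is a simplified sketch, not the kernel
    code: be_inc() and xor_bytes() are hypothetical names, and xor_bytes()
    only models the path where unaligned accesses are allowed, using
    memcpy() to express unaligned 8-byte loads and stores portably. The
    alignment-aware path for strict-alignment machines is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Increment a big-endian counter of `size` bytes, one byte at a time. */
static void be_inc(uint8_t *a, size_t size)
{
    assert(a != NULL || size == 0);
    while (size-- > 0) {
        if (++a[size] != 0)   /* no carry out of this byte: done */
            break;
    }
}

/* XOR src into dst using the largest stride first, then a byte tail. */
static void xor_bytes(uint8_t *dst, const uint8_t *src, size_t len)
{
    while (len >= sizeof(uint64_t)) {
        uint64_t d, s;

        memcpy(&d, dst, sizeof(d));   /* unaligned-safe load */
        memcpy(&s, src, sizeof(s));
        d ^= s;
        memcpy(dst, &d, sizeof(d));   /* unaligned-safe store */
        dst += sizeof(d);
        src += sizeof(s);
        len -= sizeof(d);
    }
    while (len-- > 0)
        *dst++ ^= *src++;
}
```

    Compilers typically turn the fixed-size memcpy() calls into plain
    loads and stores on targets where unaligned access is cheap.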
     
  • An ancient gcc bug (first reported in 2003) has apparently resurfaced
    on MIPS, where kernelci.org reports an overly large stack frame in the
    whirlpool hash algorithm:

    crypto/wp512.c:987:1: warning: the frame size of 1112 bytes is larger than 1024 bytes [-Wframe-larger-than=]

    With some testing in different configurations, I'm seeing large
    variations in stack frame size, up to 1500 bytes for what should be
    around 300 bytes at most. I also checked the reference implementation,
    which is essentially the same code but also comes with some test and
    benchmarking infrastructure.

    It seems that recent compiler versions on at least arm, arm64 and powerpc
    have a partial fix for this problem, by enabling "-fsched-pressure", but
    even with that fix they suffer from the issue to a certain degree. Some
    testing on arm64 shows that the time needed to hash a given amount of
    data is roughly proportional to the stack frame size here, which makes
    sense given that the wp512 implementation is doing lots of loads for
    table lookups, and the problem with the overly large stack is a result
    of doing a lot more loads and stores for spilled registers (as seen from
    inspecting the object code).

    Disabling -fschedule-insns consistently fixes the problem for wp512.
    In my collection of cross-compilers, the results are consistently better
    or identical when comparing the stack sizes in this function, though
    some architectures (notably x86) have schedule-insns disabled by
    default.

    The four columns are:
    default: -O2
    press: -O2 -fsched-pressure
    nopress: -O2 -fschedule-insns -fno-sched-pressure
    nosched: -O2 -fno-schedule-insns (disables sched-pressure)

                                 default   press nopress nosched
    alpha-linux-gcc-4.9.3           1136     848    1136     176
    am33_2.0-linux-gcc-4.9.3        2100    2076    2100    2104
    arm-linux-gnueabi-gcc-4.9.3      848     848    1048     352
    cris-linux-gcc-4.9.3             272     272     272     272
    frv-linux-gcc-4.9.3             1128    1000    1128     280
    hppa64-linux-gcc-4.9.3          1128     336    1128     184
    hppa-linux-gcc-4.9.3             644     308     644     276
    i386-linux-gcc-4.9.3             352     352     352     352
    m32r-linux-gcc-4.9.3             720     656     720     268
    microblaze-linux-gcc-4.9.3      1108     604    1108     256
    mips64-linux-gcc-4.9.3          1328     592    1328     208
    mips-linux-gcc-4.9.3            1096     624    1096     240
    powerpc64-linux-gcc-4.9.3       1088     432    1088     160
    powerpc-linux-gcc-4.9.3         1080     584    1080     224
    s390-linux-gcc-4.9.3             456     456     624     360
    sh3-linux-gcc-4.9.3              292     292     292     292
    sparc64-linux-gcc-4.9.3          992     240     992     208
    sparc-linux-gcc-4.9.3            680     592     680     312
    x86_64-linux-gcc-4.9.3           224     240     272     224
    xtensa-linux-gcc-4.9.3          1152     704    1152     304

    aarch64-linux-gcc-7.0.0          224     224    1104     208
    arm-linux-gnueabi-gcc-7.0.1      824     824    1048     352
    mips-linux-gcc-7.0.0            1120     648    1120     272
    x86_64-linux-gcc-7.0.1           240     240     304     240

    arm-linux-gnueabi-gcc-4.4.7      840                     392
    arm-linux-gnueabi-gcc-4.5.4      784     728     784     320
    arm-linux-gnueabi-gcc-4.6.4      736     728     736     304
    arm-linux-gnueabi-gcc-4.7.4      944     784     944     352
    arm-linux-gnueabi-gcc-4.8.5      464     464     760     352
    arm-linux-gnueabi-gcc-4.9.3      848     848    1048     352
    arm-linux-gnueabi-gcc-5.3.1      824     824    1064     336
    arm-linux-gnueabi-gcc-6.1.1      808     808    1056     344
    arm-linux-gnueabi-gcc-7.0.1      824     824    1048     352

    Trying the same test for serpent-generic, the picture is a bit different,
    and while -fno-schedule-insns is generally better here than the default,
    -fsched-pressure wins overall, so I picked that instead.

                                 default   press nopress nosched
    alpha-linux-gcc-4.9.3           1392     864    1392     960
    am33_2.0-linux-gcc-4.9.3         536     524     536     528
    arm-linux-gnueabi-gcc-4.9.3      552     552     776     536
    cris-linux-gcc-4.9.3             528     528     528     528
    frv-linux-gcc-4.9.3              536     400     536     504
    hppa64-linux-gcc-4.9.3           524     208     524     480
    hppa-linux-gcc-4.9.3             768     472     768     508
    i386-linux-gcc-4.9.3             564     564     564     564
    m32r-linux-gcc-4.9.3             712     576     712     532
    microblaze-linux-gcc-4.9.3       724     392     724     512
    mips64-linux-gcc-4.9.3           720     384     720     496
    mips-linux-gcc-4.9.3             728     384     728     496
    powerpc64-linux-gcc-4.9.3        704     304     704     480
    powerpc-linux-gcc-4.9.3          704     296     704     480
    s390-linux-gcc-4.9.3             560     560     592     536
    sh3-linux-gcc-4.9.3              540     540     540     540
    sparc64-linux-gcc-4.9.3          544     352     544     496
    sparc-linux-gcc-4.9.3            544     344     544     496
    x86_64-linux-gcc-4.9.3           528     536     576     528
    xtensa-linux-gcc-4.9.3           752     544     752     544

    aarch64-linux-gcc-7.0.0          432     432     656     480
    arm-linux-gnueabi-gcc-7.0.1      616     616     808     536
    mips-linux-gcc-7.0.0             720     464     720     488
    x86_64-linux-gcc-7.0.1           536     528     600     536

    arm-linux-gnueabi-gcc-4.4.7      592                     440
    arm-linux-gnueabi-gcc-4.5.4      776     448     776     544
    arm-linux-gnueabi-gcc-4.6.4      776     448     776     544
    arm-linux-gnueabi-gcc-4.7.4      768     448     768     544
    arm-linux-gnueabi-gcc-4.8.5      488     488     776     544
    arm-linux-gnueabi-gcc-4.9.3      552     552     776     536
    arm-linux-gnueabi-gcc-5.3.1      552     552     776     536
    arm-linux-gnueabi-gcc-6.1.1      560     560     776     536
    arm-linux-gnueabi-gcc-7.0.1      616     616     808     536

    I did not do any runtime tests with serpent, so it is possible that stack
    frame size does not directly correlate with runtime performance here and
    it actually makes things worse, but it's more likely to help here, and
    the reduced stack frame size is probably enough reason to apply the patch,
    especially given that the crypto code is often used in deep call chains.

    Link: https://kernelci.org/build/id/58797d7559b5149efdf6c3a9/logs/
    Link: http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11488
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
    Cc: Ralf Baechle
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Herbert Xu

    Arnd Bergmann
     
  • Update the generic CCM driver to defer CBC-MAC processing to a
    dedicated CBC-MAC ahash transform rather than open coding this
    transform (and much of the associated scatterwalk plumbing) in
    the CCM driver itself.

    This cleans up the code considerably, but more importantly, it allows
    the use of alternative CBC-MAC implementations that don't suffer from
    performance degradation due to significant setup time (e.g., the NEON
    based AES code needs to enable/disable the NEON, and load the S-box
    into 16 SIMD registers, which cannot be amortized over the entire input
    when using the cipher interface)

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • In preparation for splitting off the CBC-MAC transform in the CCM
    driver into a separate algorithm, define some test cases for the
    AES incarnation of cbcmac.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Lookup table based AES is sensitive to timing attacks, which is due to
    the fact that such table lookups are data dependent, and the fact that
    8 KB worth of tables covers a significant number of cachelines on any
    architecture, resulting in an exploitable correlation between the key
    and the processing time for known plaintexts.

    For network facing algorithms such as CTR, CCM or GCM, this presents a
    security risk, which is why arch specific AES ports are typically time
    invariant, either through the use of special instructions, or by using
    SIMD algorithms that don't rely on table lookups.

    For generic code, this is difficult to achieve without losing too much
    performance, but we can improve the situation significantly by switching
    to an implementation that only needs 256 bytes of table data (the actual
    S-box itself), which can be prefetched at the start of each block to
    eliminate data dependent latencies.

    This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
    ordinary generic AES driver manages 18 cycles per byte on this
    hardware). Decryption is substantially slower.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The generic AES code exposes a 32-bit align mask, which forces all
    users of the code to use temporary buffers or take other measures to
    ensure the alignment requirement is adhered to, even on architectures
    that don't care about alignment for software algorithms such as this
    one.

    So drop the align mask, and fix the code to use get_unaligned_le32()
    where appropriate, which will resolve to whatever is optimal for the
    architecture.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
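
    What an unaligned little-endian 32-bit load does can be shown with a
    portable userspace equivalent. The helper name load_le32() is an
    assumption for this sketch; it stands in for the kernel's
    get_unaligned_le32() and reads the value byte by byte, so it works at
    any address and on any host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from a possibly unaligned address. */
static uint32_t load_le32(const void *p)
{
    const uint8_t *b = p;

    assert(p != NULL);
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}
```

    On architectures with cheap unaligned access a compiler can usually
    collapse this into a single load, which matches the "whatever is
    optimal for the architecture" remark in the commit message.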
     

03 Feb, 2017

2 commits


23 Jan, 2017

2 commits

  • tcrypt is very tight-lipped when it succeeds, but a bit more feedback
    would be useful when developing or debugging crypto drivers, especially
    since even a successful run ends with the module failing to insert. Add
    a couple of debug prints, which can be enabled with dynamic debug:

    Before:

    # insmod tcrypt.ko mode=10
    insmod: can't insert 'tcrypt.ko': Resource temporarily unavailable

    After:

    # insmod tcrypt.ko mode=10 dyndbg
    tcrypt: testing ecb(aes)
    tcrypt: testing cbc(aes)
    tcrypt: testing lrw(aes)
    tcrypt: testing xts(aes)
    tcrypt: testing ctr(aes)
    tcrypt: testing rfc3686(ctr(aes))
    tcrypt: all tests passed
    insmod: can't insert 'tcrypt.ko': Resource temporarily unavailable

    Signed-off-by: Rabin Vincent
    Signed-off-by: Herbert Xu

    Rabin Vincent
     
  • Make sure the CRYPTO_ALG_DEAD bit is cleared before proceeding with
    the algorithm registration. This fixes qat-dh registration when
    the driver is restarted.

    Cc:
    Signed-off-by: Salvatore Benedetto
    Signed-off-by: Herbert Xu

    Salvatore Benedetto
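
    The effect of the fix above can be modelled with a toy registry. This
    is a sketch under assumptions, not the kernel's registration code:
    struct alg_sketch, the two functions and the flag value ALG_DEAD are
    all hypothetical. It shows why a descriptor that is reused across a
    driver restart needs the stale bit cleared on the way back in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALG_DEAD 0x00000020u /* assumed value, for illustration only */

struct alg_sketch {
    uint32_t flags;
    int registered;
};

/* Registration clears any DEAD mark left over from an earlier unregister. */
static int register_alg(struct alg_sketch *a)
{
    assert(a != NULL);
    a->flags &= ~ALG_DEAD;
    a->registered = 1;
    return 0;
}

/* Unregistration marks the descriptor as dead. */
static void unregister_alg(struct alg_sketch *a)
{
    assert(a != NULL);
    a->flags |= ALG_DEAD;
    a->registered = 0;
}
```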
     

13 Jan, 2017

4 commits

  • When working on AES in CCM mode for ARM, my code passed the internal
    tcrypt test before I had even bothered to implement the AES-192 and
    AES-256 code paths, which is strange because the tcrypt does contain
    AES-192 and AES-256 test vectors for CCM.

    As it turned out, the define AES_CCM_ENC_TEST_VECTORS was out of sync
    with the actual number of test vectors, causing only the AES-128 ones
    to be executed.

    So get rid of the defines, and wrap the test vector references in a
    macro that calculates the number of vectors automatically.

    The following test vector counts were out of sync with the respective
    defines:

    BF_CTR_ENC_TEST_VECTORS 2 -> 3
    BF_CTR_DEC_TEST_VECTORS 2 -> 3
    TF_CTR_ENC_TEST_VECTORS 2 -> 3
    TF_CTR_DEC_TEST_VECTORS 2 -> 3
    SERPENT_CTR_ENC_TEST_VECTORS 2 -> 3
    SERPENT_CTR_DEC_TEST_VECTORS 2 -> 3
    AES_CCM_ENC_TEST_VECTORS 8 -> 14
    AES_CCM_DEC_TEST_VECTORS 7 -> 17
    AES_CCM_4309_ENC_TEST_VECTORS 7 -> 23
    AES_CCM_4309_DEC_TEST_VECTORS 10 -> 23
    CAMELLIA_CTR_ENC_TEST_VECTORS 2 -> 3
    CAMELLIA_CTR_DEC_TEST_VECTORS 2 -> 3

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
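
    Deriving the count from the array itself, as the commit above
    proposes, could look like the sketch below. The struct layouts, the
    VECS() macro name and the demo data are assumptions for illustration;
    only the ARRAY_SIZE idea is taken from the commit text.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct testvec { const char *key; const char *input; };       /* hypothetical */
struct suite   { const struct testvec *vecs; size_t count; }; /* hypothetical */

/* Bind an array and its element count so the two cannot drift apart. */
#define VECS(tv) { .vecs = (tv), .count = ARRAY_SIZE(tv) }

static const struct testvec demo_tv[] = {
    { "key-128", "a" },
    { "key-192", "b" },
    { "key-256", "c" },
};

static const struct suite demo_suite = VECS(demo_tv);

static_assert(ARRAY_SIZE(demo_tv) == 3, "count follows the initializer");
```

    Adding a fourth entry to demo_tv changes the count automatically,
    which is exactly what the hand-maintained defines failed to do.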
     
  • There are some hashes (e.g. sha224) that have some internal trickery
    to make sure that only the correct number of output bytes are
    generated. If something goes wrong, they could potentially overrun
    the output buffer.

    Make the test more robust by allocating only enough space for the
    correct output size so that memory debugging will catch the error if
    the output is overrun.

    Tested by intentionally breaking sha224 to output all 256
    internally-generated bits while running on KASAN.

    Cc: Ard Biesheuvel
    Cc: Herbert Xu
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Herbert Xu

    Andrew Lutomirski
     
  • Continuing from this commit: 52f5684c8e1e
    ("kernel: use macros from compiler.h instead of __attribute__((...))")

    I submitted 4 total patches. They are part of a task I've taken up to
    increase compiler portability in the kernel. I've cleaned up the
    subsystems under /kernel /mm /block and /security; this patch targets
    /crypto.

    There is compiler-gcc.h, which provides macros for various gcc specific
    constructs, e.g. __weak for __attribute__((weak)). I've replaced all
    instances of gcc specific attributes with the right macros for the crypto
    subsystem.

    I had to make one additional change into compiler-gcc.h for the case when
    one wants to use __attribute__((aligned)) and not specify an alignment
    factor. From the gcc docs, this will result in the largest alignment for
    that data type on the target machine so I've named the macro
    __aligned_largest. Please advise if another name is more appropriate.

    Signed-off-by: Gideon Israel Dsouza
    Signed-off-by: Herbert Xu

    Gideon Israel Dsouza
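
    The macro described in the commit above can be tried out in userspace
    with GCC or Clang. This is a sketch: struct ctx_sketch is a
    hypothetical example type, and the exact alignment chosen is target
    dependent, so the check below only compares against a type that is
    known to be no more strictly aligned.

```c
#include <assert.h>
#include <stdalign.h>

/* Macro name as given in the commit message; GCC/Clang specific. */
#define __aligned_largest __attribute__((aligned))

/* Without an explicit factor, the compiler picks its largest useful alignment. */
struct ctx_sketch {
    unsigned char state[5];
} __aligned_largest;

static_assert(alignof(struct ctx_sketch) >= alignof(long long),
              "aligned without a factor is at least as strict as long long");
```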
     
  • It's recommended to use kmemdup instead of kmalloc followed by memcpy.

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
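
    The pattern behind the commit above is an allocate-and-copy helper. A
    userspace analogue, with the hypothetical name mem_dup() and malloc()
    standing in for kmalloc(), might look like this:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate len bytes and copy src into them; returns NULL on failure. */
static void *mem_dup(const void *src, size_t len)
{
    void *p;

    assert(src != NULL || len == 0);
    p = malloc(len);
    if (p && len)
        memcpy(p, src, len);
    return p;
}
```

    Folding both steps into one call removes the chance of the copy
    length and the allocation length disagreeing.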
     

30 Dec, 2016

1 commit

  • In some cases, SIMD algorithms can only perform optimally when
    allowed to operate on multiple input blocks in parallel. This is
    especially true for bit slicing algorithms, which typically take
    the same amount of time processing a single block or 8 blocks in
    parallel. However, other SIMD algorithms may benefit as well from
    bigger strides.

    So add a walksize attribute to the skcipher algorithm definition, and
    wire it up to the skcipher walk API. To avoid confusion between the
    skcipher and AEAD attributes, rename the skcipher_walk chunksize
    attribute to 'stride', and set it from the walksize (in the skcipher
    case) or from the chunksize (in the AEAD case).

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     

27 Dec, 2016

3 commits

  • With this reproducer:
    struct sockaddr_alg alg = {
    .salg_family = 0x26,
    .salg_type = "hash",
    .salg_feat = 0xf,
    .salg_mask = 0x5,
    .salg_name = "digest_null",
    };
    int sock, sock2;

    sock = socket(AF_ALG, SOCK_SEQPACKET, 0);
    bind(sock, (struct sockaddr *)&alg, sizeof(alg));
    sock2 = accept(sock, NULL, NULL);
    setsockopt(sock, SOL_ALG, ALG_SET_KEY, "\x9b\xca", 2);
    accept(sock2, NULL, NULL);

    ==== 8< ======== 8< ======== 8< ======== 8< ====

    one can immediately see a UBSAN warning:
    UBSAN: Undefined behaviour in crypto/algif_hash.c:187:7
    variable length array bound value 0 <= 0
    ...
    [] ? __ubsan_handle_vla_bound_not_positive+0x13d/0x188
    [] ? __ubsan_handle_out_of_bounds+0x1bc/0x1bc
    [] ? hash_accept+0x5bd/0x7d0 [algif_hash]
    [] ? hash_accept_nokey+0x3f/0x51 [algif_hash]
    [] ? hash_accept_parent_nokey+0x4a0/0x4a0 [algif_hash]
    [] ? SyS_accept+0x2b/0x40

    It is a correct warning, as hash state is propagated to accept as zero,
    but creating a zero-length variable array is not allowed in C.

    Fix this as proposed by Herbert -- do "?: 1" on that site. No sizeof or
    similar happens in the code there, so we just allocate one byte even
    though we do not use the array.

    Signed-off-by: Jiri Slaby
    Cc: Herbert Xu
    Cc: "David S. Miller" (maintainer:CRYPTO API)
    Reported-by: Sasha Levin
    Signed-off-by: Herbert Xu

    Jiri Slaby
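
    The "?: 1" fix above can be illustrated with a small userspace
    function. This is a sketch: export_state_sketch() is a hypothetical
    name, and the standard ternary is used in place of the GNU "?:"
    shorthand mentioned in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the size of the on-stack state buffer that was created. */
static size_t export_state_sketch(size_t statesize)
{
    assert(statesize <= 4096); /* keep the stack usage of the sketch bounded */

    /* A VLA bound must be positive; fall back to one unused byte. */
    unsigned char state[statesize ? statesize : 1];

    memset(state, 0, sizeof(state));
    return sizeof(state);
}
```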
     
  • This converts the ChaCha20 code from a blkcipher to a skcipher, which
    is now the preferred way to implement symmetric block and stream ciphers.

    This ports the generic and x86 versions at the same time because the
    latter reuses routines of the former.

    Note that the skcipher_walk() API guarantees that all presented blocks
    except the final one are a multiple of the chunk size, so we can simplify
    the encrypt() routine somewhat.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • Christopher Covington reported a crash on aarch64 on recent Fedora
    kernels:

    kernel BUG at ./include/linux/scatterlist.h:140!
    Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 2 PID: 752 Comm: cryptomgr_test Not tainted 4.9.0-11815-ge93b1cc #162
    Hardware name: linux,dummy-virt (DT)
    task: ffff80007c650080 task.stack: ffff800008910000
    PC is at sg_init_one+0xa0/0xb8
    LR is at sg_init_one+0x24/0xb8
    ...
    [] sg_init_one+0xa0/0xb8
    [] test_acomp+0x10c/0x438
    [] alg_test_comp+0xb0/0x118
    [] alg_test+0x17c/0x2f0
    [] cryptomgr_test+0x44/0x50
    [] kthread+0xf8/0x128
    [] ret_from_fork+0x10/0x50

    The test vectors used for input are part of the kernel image. These
    inputs are passed as a buffer to sg_init_one which eventually blows up
    with BUG_ON(!virt_addr_valid(buf)). On arm64, virt_addr_valid returns
    false for the kernel image since virt_to_page will not return the
    correct page. Fix this by copying the input vectors to a heap buffer
    before setting up the scatterlist.

    Reported-by: Christopher Covington
    Fixes: d7db7a882deb ("crypto: acomp - update testmgr with support for acomp")
    Signed-off-by: Laura Abbott
    Signed-off-by: Herbert Xu

    Laura Abbott
     

18 Dec, 2016

1 commit

  • Pull more documentation updates from Jonathan Corbet:
    "This converts the crypto DocBook to Sphinx"

    * tag 'docs-4.10-2' of git://git.lwn.net/linux:
    crypto: doc - optimize compilation
    crypto: doc - clarify AEAD memory structure
    crypto: doc - remove crypto_alloc_ablkcipher
    crypto: doc - add KPP documentation
    crypto: doc - fix separation of cipher / req API
    crypto: doc - fix source comments for Sphinx
    crypto: doc - remove crypto API DocBook
    crypto: doc - convert crypto API documentation to Sphinx

    Linus Torvalds
     

16 Dec, 2016

1 commit

  • Pull crypto fixes from Herbert Xu:
    "This fixes the following issues:

    - a crash regression in the new skcipher walker

    - incorrect return value in public_key_verify_signature

    - fix for in-place signing in the sign-file utility"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    crypto: skcipher - fix crash in virtual walk
    sign-file: Fix inplace signing when src and dst names are both specified
    crypto: asymmetric_keys - set error code on failure

    Linus Torvalds
     

15 Dec, 2016

1 commit

  • Pull crypto updates from Herbert Xu:
    "Here is the crypto update for 4.10:

    API:
    - add skcipher walk interface
    - add asynchronous compression (acomp) interface
    - fix algif_aead AIO handling of zero buffer

    Algorithms:
    - fix unaligned access in poly1305
    - fix DRBG output to large buffers

    Drivers:
    - add support for iMX6UL to caam
    - fix givenc descriptors (used by IPsec) in caam
    - accelerated SHA256/SHA512 for ARM64 from OpenSSL
    - add SSE CRCT10DIF and CRC32 to ARM/ARM64
    - add AEAD support to Chelsio chcr
    - add Armada 8K support to omap-rng"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (148 commits)
    crypto: testmgr - fix overlap in chunked tests again
    crypto: arm/crc32 - accelerated support based on x86 SSE implementation
    crypto: arm64/crc32 - accelerated support based on x86 SSE implementation
    crypto: arm/crct10dif - port x86 SSE implementation to ARM
    crypto: arm64/crct10dif - port x86 SSE implementation to arm64
    crypto: testmgr - add/enhance test cases for CRC-T10DIF
    crypto: testmgr - avoid overlap in chunked tests
    crypto: chcr - checking for IS_ERR() instead of NULL
    crypto: caam - check caam_emi_slow instead of re-lookup platform
    crypto: algif_aead - fix AIO handling of zero buffer
    crypto: aes-ce - Make aes_simd_algs static
    crypto: algif_skcipher - set error code when kcalloc fails
    crypto: caam - make aamalg_desc a proper module
    crypto: caam - pass key buffers with typesafe pointers
    crypto: arm64/aes-ce-ccm - Fix AEAD decryption length
    MAINTAINERS: add crypto headers to crypto entry
    crypt: doc - remove misleading mention of async API
    crypto: doc - fix header file name
    crypto: api - fix comment typo
    crypto: skcipher - Add separate walker for AEAD decryption
    ..

    Linus Torvalds
     

14 Dec, 2016

3 commits

  • The new skcipher walk API may crash in the following way. (Interestingly,
    the tcrypt boot time tests seem unaffected, while an explicit test using
    the module triggers it)

    Unable to handle kernel NULL pointer dereference at virtual address 00000000
    ...
    [] __memcpy+0x84/0x180
    [] skcipher_walk_done+0x328/0x340
    [] ctr_encrypt+0x84/0x100
    [] simd_skcipher_encrypt+0x88/0x98
    [] crypto_rfc3686_crypt+0x8c/0x98
    [] test_skcipher_speed+0x518/0x820 [tcrypt]
    [] do_test+0x1408/0x3b70 [tcrypt]
    [] tcrypt_mod_init+0x50/0x1000 [tcrypt]
    [] do_one_initcall+0x44/0x138
    [] do_init_module+0x68/0x1e0
    [] load_module+0x1fd0/0x2458
    [] SyS_finit_module+0xe0/0xf0
    [] el0_svc_naked+0x24/0x28

    This is due to the fact that skcipher_done_slow() may be entered with
    walk->buffer unset. Since skcipher_walk_done() already deals with the
    case where walk->buffer == walk->page, it appears to be the intention
    that walk->buffer point to walk->page after skcipher_next_slow(), so
    ensure that is the case.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
     
  • The function public_key_verify_signature() returns the variable ret on
    error paths. When the call to kmalloc() fails, the value of ret is 0,
    and it is not set to an errno before returning. This patch fixes the
    bug.

    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188891

    Signed-off-by: Pan Bian
    Signed-off-by: David Howells
    Signed-off-by: Herbert Xu

    Pan Bian
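
    The bug class fixed above, a shared error path that returns a stale
    success value, can be reproduced in a few lines. This is a sketch
    under assumptions: verify_sketch(), alloc_fn and failing_alloc() are
    hypothetical names, and the allocator is passed in only so that the
    failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*alloc_fn)(size_t);

/* Hypothetical allocator that always fails, to exercise the error path. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

static int verify_sketch(const unsigned char *digest, size_t len, alloc_fn alloc)
{
    int ret = 0;              /* earlier steps succeeded */
    unsigned char *out;

    assert(alloc != NULL);
    out = alloc(len);
    if (!out) {
        ret = -ENOMEM;        /* the fix: without this line, ret stays 0 */
        goto error;
    }
    memcpy(out, digest, len); /* placeholder for the real comparison work */
    free(out);
error:
    return ret;
}
```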
     
  • The previous description has been misleading and partially incorrect.

    Reported-by: Harsh Jain
    Signed-off-by: Stephan Mueller
    Signed-off-by: Jonathan Corbet

    Stephan Mueller
     

11 Dec, 2016

2 commits


08 Dec, 2016

2 commits

  • Commit 7e4c7f17cde2 ("crypto: testmgr - avoid overlap in chunked tests")
    attempted to address a problem in the crypto testmgr code where chunked
    test cases are copied to memory in a way that results in overlap.

    However, the fix recreated the exact same issue for other chunked tests,
    by putting IDX3 within 492 bytes of IDX1, which causes overlap if the
    first chunk exceeds 492 bytes, which is the case for at least one of
    the xts(aes) test cases.

    So increase IDX3 by another 1000 bytes.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu

    Ard Biesheuvel
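
    The overlap condition described in the commit above is an interval
    check. In the sketch below, chunks_overlap() is a hypothetical helper
    and the offsets used in the usage note are illustrative; only the
    492-byte gap and the 1000-byte increase come from the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if [off_a, off_a + len_a) and [off_b, off_b + len_b) share a byte. */
static bool chunks_overlap(size_t off_a, size_t len_a,
                           size_t off_b, size_t len_b)
{
    assert(len_a > 0 && len_b > 0);
    return off_a < off_b + len_b && off_b < off_a + len_a;
}
```

    With a first chunk of 512 bytes at offset 32 and a second chunk
    placed 492 bytes later, chunks_overlap(32, 512, 524, 16) is true.
    Moving the second chunk a further 1000 bytes away makes
    chunks_overlap(32, 512, 1524, 16) false.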
     
  • In case the user provided insufficient data, the code may return
    prematurely without any operation. In this case, the processed
    data indicated with outlen is zero.

    Reported-by: Stephen Rothwell
    Signed-off-by: Stephan Mueller
    Signed-off-by: Herbert Xu

    Stephan Mueller
     

07 Dec, 2016

1 commit