07 Sep, 2013

2 commits


21 Aug, 2013

1 commit


14 Aug, 2013

1 commit


25 Jul, 2013

1 commit

  • Pull crypto fixes from Herbert Xu:
    "This push fixes a memory corruption issue in caam, as well as
    reverting the new optimised crct10dif implementation as it breaks boot
    on initrd systems.

    Hopefully crct10dif will be reinstated once the supporting code is
    added so that it doesn't break boot"

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    Revert "crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform framework"
    crypto: caam - Fixed the memory out of bound overwrite issue

    Linus Torvalds
     

24 Jul, 2013

1 commit

  • This reverts commits
    67822649d7305caf3dd50ed46c27b99c94eff996
    39761214eefc6b070f29402aa1165f24d789b3f7
    0b95a7f85718adcbba36407ef88bba0a7379ed03
    31d939625a9a20b1badd2d4e6bf6fd39fa523405
    2d31e518a42828df7877bca23a958627d60408bc

    Unfortunately this change broke boot on some systems that used an
    initrd which does not include the newly created crct10dif modules.
    As these modules are required by sd_mod under certain configurations
    this is a serious problem.

    Signed-off-by: Herbert Xu

    Herbert Xu
     

06 Jul, 2013

1 commit

  • Pull crypto update from Herbert Xu:
    - Do not idle omap device between crypto operations in one session.
    - Added sha224/sha384 shims for SSSE3.
    - More optimisations for camellia-aesni-avx2.
    - Removed defunct blowfish/twofish AVX2 implementations.
    - Added unaligned buffer self-tests.
    - Added PCLMULQDQ optimisation for CRCT10DIF.
    - Added support for Freescale's DCP co-processor.
    - Misc fixes.

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (44 commits)
    crypto: testmgr - test hash implementations with unaligned buffers
    crypto: testmgr - test AEADs with unaligned buffers
    crypto: testmgr - test skciphers with unaligned buffers
    crypto: testmgr - check that entries in alg_test_descs are in correct order
    Revert "crypto: twofish - add AVX2/x86_64 assembler implementation of twofish cipher"
    Revert "crypto: blowfish - add AVX2/x86_64 implementation of blowfish cipher"
    crypto: camellia-aesni-avx2 - tune assembly code for more performance
    hwrng: bcm2835 - fix MODULE_LICENSE tag
    hwrng: nomadik - use clk_prepare_enable()
    crypto: picoxcell - replace strict_strtoul() with kstrtoul()
    crypto: dcp - Staticize local symbols
    crypto: dcp - Use NULL instead of 0
    crypto: dcp - Use devm_* APIs
    crypto: dcp - Remove redundant platform_set_drvdata()
    hwrng: use platform_{get,set}_drvdata()
    crypto: omap-aes - Don't idle/start AES device between Encrypt operations
    crypto: crct10dif - Use PTR_RET
    crypto: ux500 - Cocci spatch "resource_size.spatch"
    crypto: sha256_ssse3 - add sha224 support
    crypto: sha512_ssse3 - add sha384 support
    ...

    Linus Torvalds
     

22 Jun, 2013

1 commit


21 Jun, 2013

4 commits

  • Merge crypto to resolve conflict in crypto/Kconfig.

    Herbert Xu
     
  • This reverts commit cf1521a1a5e21fd1e79a458605c4282fbfbbeee2.

    The instruction (vpgatherdd) that this implementation relied on turned out to
    be a slow performer on real hardware (i5-4570). The previous 8-way twofish/AVX
    implementation is therefore faster and this implementation should be removed.

    Converting this implementation to use the same method as in twofish/AVX for
    table look-ups would give an additional ~3% speed-up vs twofish/AVX, but would
    hardly be worth the added code and binary size.

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     
  • This reverts commit 604880107010a1e5794552d184cd5471ea31b973.

    The instruction (vpgatherdd) that this implementation relied on turned out to
    be a slow performer on real hardware (i5-4570). The previous 4-way blowfish
    implementation is therefore faster and this implementation should be removed.

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     
  • Add implementation tuned for more performance on real hardware. Changes are
    mostly around the part mixing 128-bit extract and insert instructions and
    AES-NI instructions. Also, 'vpbroadcastb' instructions have been changed to
    'vpshufb with zero mask'.

    Tests on Intel Core i5-4570:

    tcrypt ECB results, old-AVX2 vs new-AVX2:

    size    128bit key      256bit key
            enc     dec     enc     dec
    256     1.00x   1.00x   1.00x   1.00x
    1k      1.08x   1.09x   1.05x   1.06x
    8k      1.06x   1.06x   1.06x   1.06x

    tcrypt ECB results, AVX vs new-AVX2:

    size    128bit key      256bit key
            enc     dec     enc     dec
    256     1.00x   1.00x   1.00x   1.00x
    1k      1.51x   1.50x   1.52x   1.50x
    8k      1.47x   1.48x   1.48x   1.48x

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     

13 Jun, 2013

1 commit

  • The new XTS code for aesni_intel uses input buffers directly as memory operands
    for pxor instructions, which causes a crash if those buffers are not aligned to
    16 bytes.

    This patch changes the XTS code to handle unaligned memory correctly, by
    loading memory with movdqu instead.

    Reported-by: Dave Jones
    Tested-by: Dave Jones
    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
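
The difference between the two load styles described above can be modelled in a few lines. This is only a toy sketch in Python, not the aesni_intel assembly: `AlignmentFault`, `load_aligned`, `load_unaligned`, and the two `xts_xor_*` helpers are hypothetical names, and the byte string `mem` stands in for memory addressed by offset.

```python
class AlignmentFault(Exception):
    """Stands in for the fault raised by a misaligned 16-byte memory operand."""


def load_aligned(mem: bytes, addr: int) -> bytes:
    # Models a legacy SSE memory operand (as used by pxor): addr must be 16-byte aligned.
    if addr % 16 != 0:
        raise AlignmentFault(f"address {addr} is not 16-byte aligned")
    return mem[addr:addr + 16]


def load_unaligned(mem: bytes, addr: int) -> bytes:
    # Models movdqu: any address is accepted.
    return mem[addr:addr + 16]


def xor16(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def xts_xor_old(mem: bytes, addr: int, tweak: bytes) -> bytes:
    # Old behaviour: XOR straight from memory, so alignment is required.
    return xor16(load_aligned(mem, addr), tweak)


def xts_xor_fixed(mem: bytes, addr: int, tweak: bytes) -> bytes:
    # Fixed behaviour: load into a register first with an unaligned load, then XOR.
    return xor16(load_unaligned(mem, addr), tweak)
```

With an aligned offset both helpers return the same block; with an unaligned offset only the fixed variant succeeds.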
     

31 May, 2013

2 commits

  • Pull x86 fixes from Peter Anvin:

    - Three EFI-related fixes

    - Two early memory initialization fixes

    - build fix for older binutils

    - fix for an eager FPU performance regression -- currently we don't
    allow the use of the FPU at interrupt time *at all* in eager mode,
    which is clearly wrong.

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: Allow FPU to be used at interrupt time even with eagerfpu
    x86, crc32-pclmul: Fix build with older binutils
    x86-64, init: Fix a possible wraparound bug in switchover in head_64.S
    x86, range: fix missing merge during add range
    x86, efi: initial the local variable of DataSize to zero
    efivar: fix oops in efivar_update_sysfs_entries() caused by memory reuse
    efivarfs: Never return ENOENT from firmware again

    Linus Torvalds
     
  • binutils prior to 2.18 (e.g. the ones found on SLE10) don't support
    assembling PEXTRD, so a macro based approach like the one for PCLMULQDQ
    in the same file should be used.

    This requires making the helper macros capable of recognizing 32-bit
    general purpose register operands.

    [ hpa: tagging for stable as it is a low risk build fix ]

    Signed-off-by: Jan Beulich
    Link: http://lkml.kernel.org/r/51A6142A02000078000D99D8@nat28.tlf.novell.com
    Cc: Alexander Boyko
    Cc: Herbert Xu
    Cc: Huang Ying
    Cc: v3.9
    Signed-off-by: H. Peter Anvin

    Jan Beulich
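
To illustrate what a macro-based approach has to produce when the assembler does not know the mnemonic, the sketch below builds the raw opcode bytes for the register form of PEXTRD (66 0F 3A 16 /r ib) and prints them as a `.byte` directive. It is a Python illustration only, not the kernel macro: `encode_pextrd`, `as_byte_directive`, and the `GPR32` table are hypothetical names, and only the eight legacy registers (no REX prefix) are handled.

```python
# Register numbers for the 32-bit general purpose registers (ModRM.rm field).
GPR32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_pextrd(imm8: int, xmm: int, reg32: str) -> bytes:
    """Encode 'pextrd $imm8, %xmmN, %reg32' for xmm0-xmm7 and legacy GPRs."""
    if not 0 <= xmm <= 7:
        raise ValueError("only xmm0-xmm7 are handled in this sketch")
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in one byte")
    rm = GPR32[reg32]                      # recognise the 32-bit register operand
    modrm = 0xC0 | (xmm << 3) | rm         # mod=11 (register), reg=xmm, rm=GPR
    return bytes([0x66, 0x0F, 0x3A, 0x16, modrm, imm8])


def as_byte_directive(code: bytes) -> str:
    """Render the encoding the way an assembler macro would emit it."""
    return ".byte " + ", ".join(f"0x{b:02x}" for b in code)
```

For example, `as_byte_directive(encode_pextrd(0, 0, "eax"))` yields `.byte 0x66, 0x0f, 0x3a, 0x16, 0xc0, 0x00`, which any assembler accepts regardless of its instruction tables.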
     

28 May, 2013

3 commits

  • Add sha224 implementation to sha256_ssse3 module.

    This also fixes a sha256_ssse3 module autoloading issue when 'sha224' is used
    before 'sha256'. Previously, in that case only sha256_generic was loaded and
    not sha256_ssse3 (since it did not provide sha224). If 'sha256' was then used
    after 'sha224', sha256_ssse3 would remain unloaded.

    Cc: Tim Chen
    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
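
The autoloading problem described above can be reproduced with a small model. This is a simplified Python sketch, not the kernel's crypto API: the `request` function, the `BEFORE`/`AFTER` tables, and the priority numbers are all assumptions made for illustration. The model assumes that modules are only loaded when no already-loaded module provides the requested algorithm, and that the highest-priority loaded implementation wins.

```python
# module name -> {algorithm name: priority}; priorities are illustrative only.
BEFORE = {
    "sha256_generic": {"sha224": 100, "sha256": 100},
    "sha256_ssse3": {"sha256": 150},
}
AFTER = {
    "sha256_generic": {"sha224": 100, "sha256": 100},
    "sha256_ssse3": {"sha224": 150, "sha256": 150},
}


def request(alg, loaded, available):
    """Return the module that serves `alg`, loading modules only if none is registered."""
    def best():
        cands = [(available[m][alg], m) for m in loaded if alg in available[m]]
        return max(cands)[1] if cands else None

    found = best()
    if found is None:
        # Nothing registered yet: load every module that advertises this algorithm.
        for mod, algs in available.items():
            if alg in algs:
                loaded.add(mod)
        found = best()
    return found
```

In this model, requesting 'sha224' and then 'sha256' against `BEFORE` leaves both served by sha256_generic, because the second request never triggers a module load. Against `AFTER`, the first request already pulls in sha256_ssse3.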
     
  • Add sha384 implementation to sha512_ssse3 module.

    This also fixes a sha512_ssse3 module autoloading issue when 'sha384' is used
    before 'sha512'. Previously, in that case only sha512_generic was loaded and
    not sha512_ssse3 (since it did not provide sha384). If 'sha512' was then used
    after 'sha384', sha512_ssse3 would remain unloaded. For example, this happens
    with the tcrypt testing module, since it tests 'sha384' before 'sha512'.

    Cc: Tim Chen
    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     
  • The _XFER stack element size was set too small, 8 bytes, when it needs to be
    16 bytes. As _XFER is the last stack element used by these implementations,
    the 16 byte stores with 'movdqa' corrupt the stack where the value of register
    %r12 is temporarily stored. As these implementations align the stack pointer
    to 16 bytes, this corruption did not happen every time.

    Patch corrects this issue.

    Reported-by: Julian Wollrath
    Signed-off-by: Jussi Kivilinna
    Tested-by: Julian Wollrath
    Acked-by: Tim Chen
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
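
The effect of the undersized slot can be shown with a toy stack frame. This Python sketch is a simplified model, not the actual frame layout of these implementations: `run_frame` is a hypothetical helper, the saved %r12 value is placed directly after the _XFER slot as an assumption, and the model ignores the stack alignment padding that made the real corruption intermittent.

```python
def run_frame(xfer_size: int) -> bool:
    """Return True if the saved %r12 value survives a 16-byte store to _XFER."""
    xfer_off = 0
    r12_off = xfer_off + xfer_size          # assumed: saved %r12 sits right after _XFER
    frame = bytearray(r12_off + 8)

    saved_r12 = (0x1122334455667788).to_bytes(8, "little")
    frame[r12_off:r12_off + 8] = saved_r12

    # A movdqa-style store always writes 16 bytes starting at _XFER.
    frame[xfer_off:xfer_off + 16] = b"\xAA" * 16

    return bytes(frame[r12_off:r12_off + 8]) == saved_r12
```

`run_frame(8)` reports that the saved value was overwritten, while `run_frame(16)` reports that it survived, which matches the size correction the patch makes.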
     

24 May, 2013

1 commit


20 May, 2013

1 commit

  • This is the x86_64 CRC T10 DIF transform accelerated with the PCLMULQDQ
    instructions. Details discussing the implementation can be found in the
    paper:

    "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction"
    http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf

    Signed-off-by: Tim Chen
    Signed-off-by: Herbert Xu

    Tim Chen
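
For reference, the checksum that the accelerated transform computes can be written as a plain bit-by-bit loop. The sketch below is a Python reference version using the T10 DIF polynomial 0x8BB7 with a zero initial value and no bit reflection; it is not the PCLMULQDQ folding method from the paper, and the function name `crc_t10dif` is chosen here only for illustration.

```python
T10DIF_POLY = 0x8BB7  # CRC-16 T10 DIF generator polynomial


def crc_t10dif(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-16/T10-DIF; pass a previous result as `crc` to continue a stream."""
    for byte in data:
        crc ^= byte << 8                      # feed the next byte into the top bits
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

A slow version like this is mainly useful for checking an optimised implementation: both must agree on every input, including when the data is fed in several chunks.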
     

25 Apr, 2013

15 commits


03 Apr, 2013

5 commits