10 Jul, 2013

1 commit

  • Add support for the lz4 and lz4hc compression algorithms using the lib/lz4/*
    codebase.

    [akpm@linux-foundation.org: fix warnings]
    Signed-off-by: Chanho Min
    Cc: "Darrick J. Wong"
    Cc: Bob Pearson
    Cc: Richard Weinberger
    Cc: Herbert Xu
    Cc: Yann Collet
    Cc: Kyungsik Lee
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chanho Min
     

21 Jun, 2013

2 commits

  • This reverts commit cf1521a1a5e21fd1e79a458605c4282fbfbbeee2.

    The instruction (vpgatherdd) that this implementation relied on turned out
    to be a slow performer on real hardware (i5-4570). The previous 8-way
    twofish/AVX implementation is therefore faster, and this implementation
    should be removed.

    Converting this implementation to use the same table look-up method as
    twofish/AVX would give an additional ~3% speed-up over twofish/AVX, but would
    hardly be worth the added code and binary size.

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     
  • This reverts commit 604880107010a1e5794552d184cd5471ea31b973.

    The instruction (vpgatherdd) that this implementation relied on turned out
    to be a slow performer on real hardware (i5-4570). The previous 4-way
    blowfish implementation is therefore faster, and this implementation should
    be removed.

    Signed-off-by: Jussi Kivilinna
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     

24 May, 2013

1 commit


20 May, 2013

1 commit


25 Apr, 2013

10 commits


26 Feb, 2013

2 commits

  • This bool option can never be set to anything other than y. So
    let's just kill it.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Pull crypto update from Herbert Xu:
    "Here is the crypto update for 3.9:

    - Added accelerated implementation of crc32 using pclmulqdq.

    - Added test vector for fcrypt.

    - Added support for OMAP4/AM33XX cipher and hash.

    - Fixed loose crypto_user input checks.

    - Misc fixes"

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (43 commits)
    crypto: user - ensure user supplied strings are nul-terminated
    crypto: user - fix empty string test in report API
    crypto: user - fix info leaks in report API
    crypto: caam - Added property fsl,sec-era in SEC4.0 device tree binding.
    crypto: use ERR_CAST
    crypto: atmel-aes - adjust duplicate test
    crypto: crc32-pclmul - Kill warning on x86-32
    crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels
    crypto: x86/sha1 - assembler clean-ups: use ENTRY/ENDPROC
    crypto: x86/serpent - use ENTRY/ENDPROC for assember functions and localize jump targets
    crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_*
    crypto: x86/ghash - assembler clean-up: use ENDPROC at end of assember functions
    crypto: x86/crc32c - assembler clean-up: use ENTRY/ENDPROC
    crypto: cast6-avx: use ENTRY()/ENDPROC() for assembler functions
    crypto: cast5-avx: use ENTRY()/ENDPROC() for assembler functions and localize jump targets
    crypto: camellia-x86_64/aes-ni: use ENTRY()/ENDPROC() for assembler functions and localize jump targets
    crypto: blowfish-x86_64: use ENTRY()/ENDPROC() for assembler functions and localize jump targets
    crypto: aesni-intel - add ENDPROC statements for assembler functions
    crypto: x86/aes - assembler clean-ups: use ENTRY/ENDPROC, localize jump targets
    crypto: testmgr - add test vector for fcrypt
    ...

    Linus Torvalds
     

24 Feb, 2013

1 commit

  • Pull powerpc updates from Benjamin Herrenschmidt:
    "So from the depth of frozen Minnesota, here's the powerpc pull request
    for 3.9. It has a few interesting highlights, in addition to the
    usual bunch of bug fixes, minor updates, embedded device tree updates
    and new boards:

    - Hand tuned asm implementation of SHA1 (by Paulus & Michael
    Ellerman)

    - Support for Doorbell interrupts on Power8 (kind of fast
    thread-thread IPIs) by Ian Munsie

    - Long overdue cleanup of the way we handle relocation of our open
    firmware trampoline (prom_init.c) on 64-bit by Anton Blanchard

    - Support for saving/restoring & context switching the PPR (Processor
    Priority Register) on server processors that support it. This
    allows the kernel to preserve thread priorities established by
    userspace. By Haren Myneni.

    - DAWR (new watchpoint facility) support on Power8 by Michael Neuling

    - Ability to change the DSCR (Data Stream Control Register) which
    controls cache prefetching on a running process via ptrace by
    Alexey Kardashevskiy

    - Support for context switching the TAR register on Power8 (new
    branch target register meant to be used by some new specific
    userspace perf event interrupt facility which is yet to be enabled)
    by Ian Munsie.

    - Improve preservation of the CFAR register (which captures the
    origin of a branch) on various exception conditions by Paulus.

    - Move the Bestcomm DMA driver from arch powerpc to drivers/dma where
    it belongs by Philippe De Muyter

    - Support for Transactional Memory on Power8 by Michael Neuling
    (based on original work by Matt Evans). For those curious about
    the feature, the patch contains a pretty good description."

    (See commit db8ff907027b: "powerpc: Documentation for transactional
    memory on powerpc" for the mentioned description added to the file
    Documentation/powerpc/transactional_memory.txt)

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (140 commits)
    powerpc/kexec: Disable hard IRQ before kexec
    powerpc/85xx: l2sram - Add compatible string for BSC9131 platform
    powerpc/85xx: bsc9131 - Correct typo in SDHC device node
    powerpc/e500/qemu-e500: enable coreint
    powerpc/mpic: allow coreint to be determined by MPIC version
    powerpc/fsl_pci: Store the pci ctlr device ptr in the pci ctlr struct
    powerpc/85xx: Board support for ppa8548
    powerpc/fsl: remove extraneous DIU platform functions
    arch/powerpc/platforms/85xx/p1022_ds.c: adjust duplicate test
    powerpc: Documentation for transactional memory on powerpc
    powerpc: Add transactional memory to pseries and ppc64 defconfigs
    powerpc: Add config option for transactional memory
    powerpc: Add transactional memory to POWER8 cpu features
    powerpc: Add new transactional memory state to the signal context
    powerpc: Hook in new transactional memory code
    powerpc: Routines for FP/VSX/VMX unavailable during a transaction
    powerpc: Add transactional memory unavaliable execption handler
    powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes
    powerpc: Add FP/VSX and VMX register load functions for transactional memory
    powerpc: Add helper functions for transactional memory context switching
    ...

    Linus Torvalds
     

20 Jan, 2013

1 commit

  • This patch adds crc32 algorithms to the shash crypto API. One is a wrapper
    around the generic crc32_le function. The second is a crc32 pclmulqdq
    implementation, which uses the hardware-provided PCLMULQDQ instruction to
    accelerate the CRC32 calculation. This instruction is present on Intel CPUs
    from Westmere onward and on AMD CPUs from Bulldozer onward.

    On an Intel Core i5 I got 450MB/s for the table implementation and 2100MB/s
    for the pclmulqdq implementation.

    Signed-off-by: Alexander Boyko
    Signed-off-by: Herbert Xu

    Alexander Boyko
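
The "table implementation" mentioned in this commit can be illustrated with a minimal Python sketch of a byte-at-a-time, table-driven CRC32. This is not the kernel code: the names `make_table`, `crc32_le` and `crc32` are chosen here for illustration, and the sketch assumes (as in the kernel's crc32_le) that the seed is passed in by the caller and that no final inversion is applied inside the update function.

```python
import zlib

POLY = 0xEDB88320  # reflected form of the standard CRC-32 polynomial


def make_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table()


def crc32_le(seed, data):
    """Update a little-endian CRC32 state one byte at a time via table look-up."""
    crc = seed
    for b in data:
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc


def crc32(data):
    """Conventional CRC-32: all-ones seed plus a final bit inversion."""
    return crc32_le(0xFFFFFFFF, data) ^ 0xFFFFFFFF


if __name__ == "__main__":
    # Cross-check against the zlib implementation shipped with Python.
    print(crc32(b"123456789") == zlib.crc32(b"123456789"))
```

A PCLMULQDQ version replaces the per-byte loop with carry-less multiplications that fold many bytes per step, which is where the reported throughput difference comes from.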
     

12 Jan, 2013

1 commit

  • The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
    while now and is almost always enabled by default. As agreed during the
    Linux kernel summit, remove it from any "depends on" lines in Kconfigs.

    CC: Herbert Xu
    CC: "David S. Miller"
    Signed-off-by: Kees Cook
    Acked-by: David S. Miller

    Kees Cook
     

10 Jan, 2013

1 commit

  • This patch adds a crypto driver which provides a powerpc accelerated
    implementation of SHA-1, accelerated in that it is written in asm.

    Original patch by Paul, minor fixups for upstream by moi.

    Lightly tested on 64-bit with the test program here:

    http://michael.ellerman.id.au/files/junkcode/sha1test.c

    Seems to work, and is "not slower" than the generic version.

    Needs testing on 32-bit.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Michael Ellerman
    Signed-off-by: Benjamin Herrenschmidt

    Michael Ellerman
     

06 Dec, 2012

1 commit


09 Nov, 2012

1 commit

  • This patch adds an AES-NI/AVX/x86_64 assembler implementation of the
    Camellia block cipher. The implementation processes data in sixteen-block
    chunks, which are byte-sliced, and AES SubBytes is reused for the Camellia
    s-box with the help of pre- and post-filtering.

    Patch has been tested with tcrypt and automated filesystem tests.

    tcrypt test results:

    Intel Core i5-2450M:

    camellia-aesni-avx vs camellia-asm-x86_64-2way:
    128bit key: (lrw:256bit) (xts:256bit)
    size   ecb-enc  ecb-dec  cbc-enc  cbc-dec  ctr-enc  ctr-dec  lrw-enc  lrw-dec  xts-enc  xts-dec
    16B    0.98x    0.96x    0.99x    0.96x    0.96x    0.95x    0.95x    0.94x    0.97x    0.98x
    64B    0.99x    0.98x    1.00x    0.98x    0.98x    0.99x    0.98x    0.93x    0.99x    0.98x
    256B   2.28x    2.28x    1.01x    2.29x    2.25x    2.24x    1.96x    1.97x    1.91x    1.90x
    1024B  2.57x    2.56x    1.00x    2.57x    2.51x    2.53x    2.19x    2.17x    2.19x    2.22x
    8192B  2.49x    2.49x    1.00x    2.53x    2.48x    2.49x    2.17x    2.17x    2.22x    2.22x

    256bit key: (lrw:384bit) (xts:512bit)
    size   ecb-enc  ecb-dec  cbc-enc  cbc-dec  ctr-enc  ctr-dec  lrw-enc  lrw-dec  xts-enc  xts-dec
    16B    0.97x    0.98x    0.99x    0.97x    0.97x    0.96x    0.97x    0.98x    0.98x    0.99x
    64B    1.00x    1.00x    1.01x    0.99x    0.98x    0.99x    0.99x    0.99x    0.99x    0.99x
    256B   2.37x    2.37x    1.01x    2.39x    2.35x    2.33x    2.10x    2.11x    1.99x    2.02x
    1024B  2.58x    2.60x    1.00x    2.58x    2.56x    2.56x    2.28x    2.29x    2.28x    2.29x
    8192B  2.50x    2.52x    1.00x    2.56x    2.51x    2.51x    2.24x    2.25x    2.26x    2.29x

    Signed-off-by: Jussi Kivilinna
    Acked-by: David S. Miller
    Signed-off-by: Herbert Xu

    Jussi Kivilinna
     

15 Oct, 2012

2 commits

  • This patch adds the crc_pcl function, which calculates the CRC32C checksum
    using the PCLMULQDQ instruction on processors that support it. This provides
    a speedup over using the CRC32 instruction alone.
    The use of PCLMULQDQ requires calling kernel_fpu_begin and kernel_fpu_end,
    which incurs some overhead, so the new crc_pcl function is only invoked for
    buffers of 512 bytes or more. Larger buffers see a greater speedup. This
    feature is best used together with eager_fpu, which reduces the
    kernel_fpu_begin/end overhead. For a buffer size of 1K the speedup is around
    1.6x, and for buffer sizes greater than 4K the speedup is around 3x compared
    to the original implementation in the crc32c-intel module. The test was
    performed on a Sandy Bridge based platform with the CPU frequency held
    constant.

    A white paper detailing the algorithm can be found here:
    http://download.intel.com/design/intarch/papers/323405.pdf

    Signed-off-by: Tim Chen
    Signed-off-by: Herbert Xu

    Tim Chen
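
The size-based dispatch described in this commit (take the path with a fixed setup cost only when the buffer is large enough to amortize it) can be sketched in Python as follows. This is an illustrative assumption, not the kernel code: `crc32c_bitwise`, `crc32c_update` and the `fast_path` hook are hypothetical names, and the bit-by-bit routine merely stands in for the CRC32-instruction path. Only the 512-byte threshold comes from the commit message.

```python
POLY_C = 0x82F63B78   # reflected CRC-32C (Castagnoli) polynomial
THRESHOLD = 512       # bytes; the cut-off stated in the commit message


def crc32c_bitwise(crc, data):
    """Reference CRC32C state update, one bit at a time (slow but simple)."""
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ POLY_C if crc & 1 else crc >> 1
    return crc


def crc32c_update(crc, data, fast_path=None):
    """Use fast_path only for buffers large enough to amortize its setup cost.

    fast_path is a hypothetical hook standing in for a PCLMULQDQ routine
    that would be bracketed by kernel_fpu_begin/kernel_fpu_end.
    """
    if fast_path is not None and len(data) >= THRESHOLD:
        return fast_path(crc, data)
    return crc32c_bitwise(crc, data)


def crc32c(data, fast_path=None):
    """Conventional CRC32C: all-ones seed plus a final bit inversion."""
    return crc32c_update(0xFFFFFFFF, data, fast_path) ^ 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(crc32c(b"123456789")))
```

Both paths must return the same checksum for the same input; the threshold only changes which routine does the work.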
     
  • Pull module signing support from Rusty Russell:
    "module signing is the highlight, but it's an all-over David Howells frenzy..."

    Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

    * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
    X.509: Fix indefinite length element skip error handling
    X.509: Convert some printk calls to pr_devel
    asymmetric keys: fix printk format warning
    MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
    MODSIGN: Make mrproper should remove generated files.
    MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
    MODSIGN: Use the same digest for the autogen key sig as for the module sig
    MODSIGN: Sign modules during the build process
    MODSIGN: Provide a script for generating a key ID from an X.509 cert
    MODSIGN: Implement module signature checking
    MODSIGN: Provide module signing public keys to the kernel
    MODSIGN: Automatically generate module signing keys if missing
    MODSIGN: Provide Kconfig options
    MODSIGN: Provide gitignore and make clean rules for extra files
    MODSIGN: Add FIPS policy
    module: signature checking hook
    X.509: Add a crypto key parser for binary (DER) X.509 certificates
    MPILIB: Provide a function to read raw data into an MPI
    X.509: Add an ASN.1 decoder
    X.509: Add simple ASN.1 grammar compiler
    ...

    Linus Torvalds
     

08 Oct, 2012

1 commit

  • Create a key type that can be used to represent an asymmetric key type for use
    in appropriate cryptographic operations, such as encryption, decryption,
    signature generation and signature verification.

    The key type is "asymmetric" and can provide access to a variety of
    cryptographic algorithms.

    Possibly, this would be better as "public_key" - but that has the disadvantage
    that "public key" is an overloaded term.

    Signed-off-by: David Howells
    Signed-off-by: Rusty Russell

    David Howells
     

05 Oct, 2012

1 commit

  • Pull crypto update from Herbert Xu:
    - Optimised AES/SHA1 for ARM.
    - IPsec ESN support in talitos and caam.
    - x86_64/avx implementation of cast5/cast6.
    - Add/use multi-algorithm registration helpers where possible.
    - Added IBM Power7+ in-Nest support.
    - Misc fixes.

    Fix up trivial conflicts in crypto/Kconfig due to the sparc64 crypto
    config options being added next to the new ARM ones.

    [ Side note: cut-and-paste duplicate help texts make those conflicts
    harder to read than necessary, thanks to git being smart about
    minimizing conflicts and maximizing the common parts... ]

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits)
    crypto: x86/glue_helper - fix storing of new IV in CBC encryption
    crypto: cast5/avx - fix storing of new IV in CBC encryption
    crypto: tcrypt - add missing tests for camellia and ghash
    crypto: testmgr - make test_aead also test 'dst != src' code paths
    crypto: testmgr - make test_skcipher also test 'dst != src' code paths
    crypto: testmgr - add test vectors for CTR mode IV increasement
    crypto: testmgr - add test vectors for partial ctr(cast5) and ctr(cast6)
    crypto: testmgr - allow non-multi page and multi page skcipher tests from same test template
    crypto: caam - increase TRNG clocks per sample
    crypto, tcrypt: remove local_bh_disable/enable() around local_irq_disable/enable()
    crypto: tegra-aes - fix error return code
    crypto: crypto4xx - fix error return code
    crypto: hifn_795x - fix error return code
    crypto: ux500 - fix error return code
    crypto: caam - fix error IDs for SEC v5.x RNG4
    hwrng: mxc-rnga - Access data via structure
    hwrng: mxc-rnga - Adapt clocks to new i.mx clock framework
    crypto: caam - add IPsec ESN support
    crypto: 842 - remove .cra_list initialization
    Revert "[CRYPTO] cast6: inline bloat--"
    ...

    Linus Torvalds
     

03 Oct, 2012

1 commit


07 Sep, 2012

1 commit

  • Add assembler versions of AES and SHA1 for ARM platforms. This has provided
    up to a 50% improvement in IPsec/TCP throughput for tunnels using AES128/SHA1.

    Platform  CPU Speed  Endian  Before (bps)  After (bps)  Improvement

    IXP425    533 MHz    big     11217042      15566294     ~38%
    KS8695    166 MHz    little  3828549       5795373      ~51%

    Signed-off-by: David McCullough
    Signed-off-by: Herbert Xu

    David McCullough
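
The improvement column above follows from the before/after throughput figures. A small Python check, assuming the improvement is computed as (after - before) / before; the dictionary layout and the `improvement_pct` helper are illustrative names, not part of the commit:

```python
# Throughput figures (bits per second) copied from the table in the commit.
RESULTS = {
    "IXP425": (11217042, 15566294),
    "KS8695": (3828549, 5795373),
}


def improvement_pct(before, after):
    """Relative throughput gain as a percentage of the 'before' figure."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for platform, (before, after) in RESULTS.items():
        print(f"{platform}: {improvement_pct(before, after):.1f}%")
```

Under that assumption the IXP425 figure comes out a little under 39% and the KS8695 figure a little over 51%, consistent with the rounded values quoted in the table.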
     

29 Aug, 2012

1 commit


26 Aug, 2012

1 commit


23 Aug, 2012

2 commits


21 Aug, 2012

4 commits


20 Aug, 2012

1 commit

  • …NI hardware pipelines

    Use parallel LRW and XTS encryption facilities to better utilize AES-NI
    hardware pipelines and gain extra performance.

    Tcrypt benchmark results (async), old vs new ratios:

    Intel Core i5-2450M CPU (fam: 6, model: 42, step: 7)

    aes:128bit
    lrw:256bit  xts:256bit
    size   lrw-enc  lrw-dec  xts-enc  xts-dec
    16B    0.99x    1.00x    1.22x    1.19x
    64B    1.38x    1.50x    1.58x    1.61x
    256B   2.04x    2.02x    2.27x    2.29x
    1024B  2.56x    2.54x    2.89x    2.92x
    8192B  2.85x    2.99x    3.40x    3.23x

    aes:192bit
    lrw:320bit  xts:384bit
    size   lrw-enc  lrw-dec  xts-enc  xts-dec
    16B    1.08x    1.08x    1.16x    1.17x
    64B    1.48x    1.54x    1.59x    1.65x
    256B   2.18x    2.17x    2.29x    2.28x
    1024B  2.67x    2.67x    2.87x    3.05x
    8192B  2.93x    2.84x    3.28x    3.33x

    aes:256bit
    lrw:384bit  xts:512bit
    size   lrw-enc  lrw-dec  xts-enc  xts-dec
    16B    1.07x    1.07x    1.18x    1.19x
    64B    1.56x    1.56x    1.70x    1.71x
    256B   2.22x    2.24x    2.46x    2.46x
    1024B  2.76x    2.77x    3.13x    3.05x
    8192B  2.99x    3.05x    3.40x    3.30x

    Cc: Huang Ying <ying.huang@intel.com>
    Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
    Reviewed-by: Kim Phillips <kim.phillips@freescale.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

    Jussi Kivilinna
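
The lrw and xts key sizes listed next to each AES key size follow from how the two modes build their keys: LRW appends a tweak key one cipher block long to the cipher key, and XTS uses two cipher keys of equal length. A short Python sketch of that arithmetic (the function names are illustrative):

```python
AES_BLOCK_BITS = 128  # AES block size, which is also the LRW tweak key size


def lrw_key_bits(cipher_key_bits):
    """LRW key = cipher key followed by a block-sized tweak key."""
    return cipher_key_bits + AES_BLOCK_BITS


def xts_key_bits(cipher_key_bits):
    """XTS key = two independent cipher keys of the same length."""
    return 2 * cipher_key_bits


if __name__ == "__main__":
    for bits in (128, 192, 256):
        print(f"aes:{bits}bit lrw:{lrw_key_bits(bits)}bit xts:{xts_key_bits(bits)}bit")
```

For AES-128, AES-192 and AES-256 this gives LRW keys of 256, 320 and 384 bits and XTS keys of 256, 384 and 512 bits.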
     

01 Aug, 2012

2 commits

  • This patch adds the 842 cryptographic API driver, which
    submits compression requests to the 842 hardware compression
    accelerator driver (nx-compress).

    If the hardware accelerator goes offline for any reason
    (dynamic disable, migration, etc...), this driver will use LZO
    as a software failover for all future compression requests.
    For decompression requests, the 842 hardware driver contains
    a software implementation of the 842 decompressor to support
    the decompression of data that was compressed before the accelerator
    went offline.

    Signed-off-by: Robert Jennings
    Signed-off-by: Seth Jennings
    Signed-off-by: Herbert Xu

    Seth Jennings
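
The failover behaviour described in this commit is "sticky": once the accelerator is seen to be unavailable, all later compression requests go to software. A minimal Python sketch of that control flow, under several assumptions that are not from the commit: the class and hook names are hypothetical, `zlib` stands in for LZO (Python's standard library has no LZO), an `OSError` stands in for the accelerator going offline, and the one-byte tag marking which codec produced the output is added here purely so the two cases can be told apart.

```python
import zlib


class FailoverCompressor:
    """Try the hardware hook first; after one failure, stay on software."""

    def __init__(self, hw_compress):
        self._hw_compress = hw_compress  # hypothetical accelerator entry point
        self.use_software = False        # set once, never cleared

    def compress(self, data):
        if not self.use_software:
            try:
                return b"H" + self._hw_compress(data)
            except OSError:
                # Accelerator went offline: switch for all future requests.
                self.use_software = True
        return b"S" + zlib.compress(data)  # zlib stands in for LZO here


if __name__ == "__main__":
    def offline(_data):
        raise OSError("accelerator offline")

    c = FailoverCompressor(offline)
    print(c.compress(b"hello")[:1], c.use_software)
```

Decompression is not covered by this sketch; per the commit message, data already compressed by the hardware is handled by a software 842 decompressor in the hardware driver.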
     
  • This patch adds an x86_64/avx assembler implementation of the Cast6 block
    cipher. The implementation processes eight blocks in parallel (two
    four-block-chunk AVX operations). The table look-ups are done in
    general-purpose registers. For small blocksizes the functions from the
    generic module are called. A good performance increase is provided for
    blocksizes greater than or equal to 128B.

    Patch has been tested with tcrypt and automated filesystem tests.

    Tcrypt benchmark results:

    Intel Core i5-2500 CPU (fam:6, model:42, step:7)

    cast6-avx-x86_64 vs. cast6-generic
    128bit key: (lrw:256bit) (xts:256bit)
    size   ecb-enc  ecb-dec  cbc-enc  cbc-dec  ctr-enc  ctr-dec  lrw-enc  lrw-dec  xts-enc  xts-dec
    16B    0.97x    1.00x    1.01x    1.01x    0.99x    0.97x    0.98x    1.01x    0.96x    0.98x
    64B    0.98x    0.99x    1.02x    1.01x    0.99x    1.00x    1.01x    0.99x    1.00x    0.99x
    256B   1.77x    1.84x    0.99x    1.85x    1.77x    1.77x    1.70x    1.74x    1.69x    1.72x
    1024B  1.93x    1.95x    0.99x    1.96x    1.93x    1.93x    1.84x    1.85x    1.89x    1.87x
    8192B  1.91x    1.95x    0.99x    1.97x    1.95x    1.91x    1.86x    1.87x    1.93x    1.90x

    256bit key: (lrw:384bit) (xts:512bit)
    size   ecb-enc  ecb-dec  cbc-enc  cbc-dec  ctr-enc  ctr-dec  lrw-enc  lrw-dec  xts-enc  xts-dec
    16B    0.97x    0.99x    1.02x    1.01x    0.98x    0.99x    1.00x    1.00x    0.98x    0.98x
    64B    0.98x    0.99x    1.01x    1.00x    1.00x    1.00x    1.01x    1.01x    0.97x    1.00x
    256B   1.77x    1.83x    1.00x    1.86x    1.79x    1.78x    1.70x    1.76x    1.71x    1.69x
    1024B  1.92x    1.95x    0.99x    1.96x    1.93x    1.93x    1.83x    1.86x    1.89x    1.87x
    8192B  1.94x    1.95x    0.99x    1.97x    1.95x    1.95x    1.87x    1.87x    1.93x    1.91x

    Signed-off-by: Johannes Goetzfried
    Signed-off-by: Herbert Xu

    Johannes Goetzfried
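
The 128B threshold mentioned in this commit follows from the geometry: Cast6 has a 16-byte block, and the AVX path handles eight blocks at once, so only requests of at least 8 x 16 = 128 bytes can fill a parallel batch. A small Python sketch of that split; the `split_work` helper is an illustrative name, and how the real glue code schedules the leftover blocks is an assumption here:

```python
BLOCK_BYTES = 16      # Cast6 block size (128 bits)
PARALLEL_BLOCKS = 8   # blocks handled per AVX batch, per the commit message


def split_work(nbytes):
    """Return (full 8-block batches, leftover single blocks) for a request."""
    nblocks = nbytes // BLOCK_BYTES
    return nblocks // PARALLEL_BLOCKS, nblocks % PARALLEL_BLOCKS


if __name__ == "__main__":
    for size in (16, 64, 256, 1024, 8192):
        batches, singles = split_work(size)
        print(f"{size}B: {batches} parallel batches, {singles} single blocks")
```

With these numbers the 16B and 64B rows never reach a full batch, which matches their ratios staying near 1.0x, while 256B and larger requests are made up entirely of full batches.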