08 Feb, 2006

1 commit


10 Jan, 2006

10 commits

  • Many cipher implementations use 4-byte/8-byte loads/stores which require
    alignment on some architectures. This patch explicitly sets the alignment
    requirements for them.

    Signed-off-by: Herbert Xu

    Herbert Xu
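
    A minimal sketch of what declaring such a requirement looks like via
    the cra_alignmask field; the cipher name and values are illustrative,
    not the actual patch:

        /* An alignmask of 3 means input/output must be 4-byte aligned,
         * so the implementation may safely use 32-bit loads/stores. */
        static struct crypto_alg example_alg = {
                .cra_name      = "example-cipher",  /* hypothetical */
                .cra_blocksize = 8,
                .cra_alignmask = 3,
                /* remaining fields omitted */
        };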
     
  • The cipher code path may allocate up to two blocks of data on the stack.
    Therefore we need to place limits on the maximum block size.

    Signed-off-by: Herbert Xu

    Herbert Xu
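
    A hedged sketch of the kind of registration-time check this implies;
    the exact limit is an assumption, not quoted from the patch:

        /* Reject ciphers whose block size would permit unbounded
         * stack usage in the crypt path (limit is illustrative). */
        if (alg->cra_blocksize > PAGE_SIZE / 8)
                return -EINVAL;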
     
  • After a partial update, the done pointer is off to the right by 64 bytes.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Since the temporary buffer is used as an argument to cia_decrypt, it must be
    aligned by cra_alignmask. This bug was found by linux@horizon.com.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch avoids shifting the count left and right needlessly for
    each call to sha1_update(). Instead, the shift can be done just once,
    at the end, in sha1_final(); see the sketch after this entry.

    Keeping the previous test example (sha1_update() successively called with
    len=64), a 1.3% performance increase can be observed on i386, or 0.2% on
    ARM. The generated code is also smaller on ARM.

    Signed-off-by: Nicolas Pitre
    Signed-off-by: Herbert Xu

    Nicolas Pitre
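
    A rough sketch of the idea, assuming a byte-granular ctx->count field
    (names are illustrative, not the exact kernel code):

        /* sha1_update(): only accumulate the byte count... */
        ctx->count += len;

        /* sha1_final(): ...and shift it into a bit count exactly once. */
        __be64 bits = cpu_to_be64(ctx->count << 3);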
     
  • This patch gives more descriptive names to the variables i and j.

    Signed-off-by: Nicolas Pitre
    Signed-off-by: Herbert Xu

    Nicolas Pitre
     
  • The current code unconditionally copies the first block for every call
    to sha1_update(). This can be avoided if there is no pending partial
    block. This is always the case on the first call to sha1_update() (if
    the length is >= 64, of course).

    Furthermore, temp need not be allocated if sha_transform is never
    invoked. Also consolidate the sha_transform calls into one to reduce
    code size.

    Signed-off-by: Nicolas Pitre
    Signed-off-by: Herbert Xu

    Nicolas Pitre
     
  • As the Crypto API now allows multiple implementations to be registered
    for the same algorithm, we no longer have to play tricks with Kconfig
    to select the right AES implementation.

    This patch sets the driver name and priority for all the AES
    implementations and removes the Kconfig conditions on the C implementation
    for AES.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This is the first step on the road towards asynchronous support in
    the Crypto API. It adds support for having multiple crypto_alg objects
    for the same algorithm registered in the system.

    For example, each device driver would register a crypto_alg object
    for each algorithm that it supports. While at the same time the
    user may load software implementations of those same algorithms.

    Users of the Crypto API may then select a specific implementation
    by name, or choose any implementation for a given algorithm with
    the highest priority.

    The priority field is a 32-bit signed integer. In future it will be
    possible to modify it from user-space.

    This also provides a solution to the problem of selecting amongst
    various AES implementations, that is, aes vs. aes-i586 vs. aes-padlock.

    Signed-off-by: Herbert Xu

    Herbert Xu
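
    An illustrative sketch of the resulting registration model; the
    priority values and the hardware driver name are hypothetical:

        static struct crypto_alg aes_generic_alg = {
                .cra_name        = "aes",
                .cra_driver_name = "aes-generic",
                .cra_priority    = 100,
                /* ... */
        };

        static struct crypto_alg aes_hw_alg = {
                .cra_name        = "aes",
                .cra_driver_name = "aes-example-hw",  /* hypothetical */
                .cra_priority    = 300,  /* preferred over the C version */
                /* ... */
        };

    A user asking for "aes" gets the highest-priority implementation,
    while asking for "aes-generic" by name pins the C version.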
     
  • A lot of crypto code needs to read/write 32-bit/64-bit words in a
    specific byte order. Many implementations open-code this by
    reading/writing one byte at a time. This patch converts all the
    applicable usages over to the standard byte order macros (illustrated
    below).

    This is based on a previous patch by Denis Vlasenko.

    Signed-off-by: Herbert Xu

    Herbert Xu
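
    For illustration, with in as a const u8 pointer, the conversion
    replaces open-coded loads like the first statement below with the
    standard helper (assuming the pointer is suitably aligned):

        /* Open-coded big-endian load: */
        u32 v = (in[0] << 24) | (in[1] << 16) | (in[2] << 8) | in[3];

        /* The same load with the standard byte order macro: */
        u32 w = be32_to_cpu(*(const __be32 *)in);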
     

07 Jan, 2006

5 commits

  • Sanitize some s390 Kconfig options. We have ARCH_S390, ARCH_S390X,
    ARCH_S390_31, 64BIT, S390_SUPPORT and COMPAT. Replace these six
    options with just S390, 64BIT and COMPAT.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     
  • Add new test vectors to the AES test suite for AES CBC and AES with plaintext
    larger than AES blocksize.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Glauber
     
  • Add support for the hardware accelerated AES crypto algorithm.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Glauber
     
  • Add support for the hardware accelerated sha256 crypto algorithm.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Glauber
     
  • Replace all references to z990 with s390 in the in-kernel crypto files
    in arch/s390/crypto. The code is not specific to a particular machine
    (z990) but to the s390 platform. Big diff, no functional change.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Glauber
     

30 Oct, 2005

3 commits


07 Sep, 2005

1 commit


02 Sep, 2005

2 commits

  • The crypto layer currently uses in_atomic() to determine whether it is
    allowed to sleep. This is incorrect since spin locks don't always cause
    in_atomic() to return true.

    Instead of that, this patch returns to an earlier idea of a per-tfm flag
    which determines whether sleeping is allowed. Unlike the earlier version,
    the default is to not allow sleeping. This ensures that no existing code
    can break.

    As usual, this flag may either be set through crypto_alloc_tfm(), or
    just before a specific crypto operation.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
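
    A sketch of the opt-in, assuming the per-tfm flag is the
    CRYPTO_TFM_REQ_MAY_SLEEP request flag:

        /* Default is to never sleep; a caller that knows it runs in
         * process context may opt in: */
        tfm->crt_flags |= CRYPTO_TFM_REQ_MAY_SLEEP;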
     
  • The XTEA implementation was incorrect due to a misinterpretation of
    operator precedence. Because of the widespread nature of this error,
    the erroneous implementation will be kept, albeit under the new name
    of XETA; the two groupings are contrasted below.

    Signed-off-by: Aaron Grothe
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Aaron Grothe
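
    For illustration, the two groupings differ as sketched below (y and z
    are the two halves of the block, sum the round accumulator; this is a
    sketch, not a quote of crypto/tea.c):

        /* Correct XTEA round: */
        y += ((z << 4 ^ z >> 5) + z) ^ (sum + key[sum & 3]);

        /* The erroneous grouping, preserved under the name XETA: */
        y += (z << 4 ^ z >> 5) + (z ^ sum) + key[sum & 3];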
     

28 Jul, 2005

1 commit

  • `gcc -W' likes to complain if the static keyword is not at the
    beginning of the declaration. This patch changes all remaining
    occurrences of "inline static" to "static inline" in the entire
    kernel tree (140 occurrences in 47 files); see the example below.

    While making this change I came across a few lines with trailing whitespace
    that I also fixed up, I have also added or removed a blank line or two here
    and there, but there are no functional changes in the patch.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
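
    For example (foo is a hypothetical function):

        /* `gcc -W' complains about this ordering... */
        inline static int foo(void) { return 0; }

        /* ...but not about this one: */
        static inline int foo(void) { return 0; }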
     

15 Jul, 2005

1 commit


07 Jul, 2005

11 commits

  • I've made a new implementation of DES to replace the old one in the kernel.
    It provides faster encryption on all tested processors apart from the original
    Pentium, and key setup is many times faster.

    Speed relative to the old kernel implementation:

      Processor              des_setkey  des_encrypt  des3_ede_setkey  des3_ede_encrypt
      Pentium 120MHz                6.8         0.82              7.2              0.86
      Pentium III 1.266GHz          5.6         1.19              5.8              1.34
      Pentium M 1.3GHz              5.7         1.15              6.0              1.31
      Pentium 4 2.266GHz            5.8         1.24              6.0              1.40
      Pentium 4E 3GHz               5.4         1.27              5.5              1.48
      StrongARM 1110 206MHz         4.3         1.03              4.4              1.14
      Athlon XP 2GHz                7.8         1.44              8.1              1.61
      Athlon 64 2GHz                7.8         1.34              8.3              1.49

    Signed-off-by: Dag Arne Osvik
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Dag Arne Osvik
     
  • The iv field in des_ctx/des3_ede_ctx/serpent_ctx has never been used.
    This was noticed by Dag Arne Osvik.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Implementation:
    ===============
    The encrypt/decrypt code is based on an x86 implementation I did a while
    ago which I never published. This unpublished implementation does
    include an assembler based key schedule and precomputed tables. For
    simplicity and best acceptance, however, I took Gladman's in-kernel code
    for table generation and key schedule for the kernel port of my
    assembler code and modified this code to produce the key schedule as
    required by my assembler implementation. File locations and Kconfig are
    kept similar to the i586 AES assembler implementation.
    It may seem a little strange to use 32-bit I/O and registers in the
    assembler implementation, but this gives the best code size. My
    implementation takes one more instruction per round than Gladman's
    x86 assembler, but it doesn't require any stack for local variables
    or saved registers, and it is less serialized than Gladman's code.
    Note that all comparisons to Gladman's code were done after my code
    was implemented; I used only FIPS PUB 197 for the implementation, so
    my implementation is independent work.
    If anybody has a better assembler solution for x86_64 I'll be pleased to
    have my code replaced with the better solution.

    Testing:
    ========
    The implementation passes the in-kernel crypto testing module and I'm
    running it without any problems on my laptop where it is mainly used for
    dm-crypt.

    Microbenchmark:
    ===============
    The microbenchmark was done in userspace with similar compile flags as
    used during kernel compile.
    Encrypt/decrypt is about 35% faster than the generic C implementation.
    As the generic C and my assembler implementations are both
    table-driven, I don't really expect that there is much room for
    further improvements, though I'll be glad to be corrected here.
    The key schedule is about 5% slower than the generic C implementation.
    This is due to the fact that some more work has to be done in the key
    schedule routine to fit the schedule to the assembler implementation.

    Code Size:
    ==========
    Encrypt and decrypt are together about 2.1 Kbytes smaller than the
    generic C implementation which is important with regard to L1 cache
    usage. The key schedule routine is about 100 bytes larger than the
    generic C implementation.

    Data Size:
    ==========
    There's no difference in data size requirements between the assembler
    implementation and the generic C implementation.

    License:
    ========
    Gladman's code is dual BSD/GPL whereas my assembler code is GPLv2 only
    (I'm not going to change the license for my code). So I had to change
    the module license for the x86_64 AES module from 'Dual BSD/GPL' to
    'GPL' to reflect the most restrictive license within the module.

    Signed-off-by: Andreas Steinmetz
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Andreas Steinmetz
     
  • As far as I'm aware there's a general consensus that functions
    responsible for freeing resources should be able to cope with being
    passed a NULL pointer. This makes sense, as it removes the need for
    every caller to check for NULL, eliminating the bugs that happen when
    some forget (it is safer to check centrally in the freeing function),
    and it also makes for smaller code overall due to the absence of all
    those NULL checks.
    This patch makes it safe to pass the crypto_free_tfm() function a NULL
    pointer. Once this patch is applied we can start removing the NULL checks
    from the callers.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Jesper Juhl
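
    A minimal sketch of the change, not the full function body:

        void crypto_free_tfm(struct crypto_tfm *tfm)
        {
                /* Tolerate NULL, like kfree() does, so callers no
                 * longer need a guard of their own. */
                if (tfm == NULL)
                        return;
                /* ... actual teardown ... */
        }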
     
  • Even though cit_iv is now always aligned, the user can still supply an
    unaligned iv through crypto_cipher_encrypt_iv/crypto_cipher_decrypt_iv.
    This patch will check the alignment of the user-supplied iv and copy
    it if necessary.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch ensures that cit_iv is aligned according to cra_alignmask
    by allocating it as part of the tfm structure. As a side effect the
    crypto layer will also guarantee that the tfm ctx area has enough space
    to be aligned by cra_alignmask. This allows us to remove the extra
    space reservation from the Padlock driver.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch makes a needlessly global function static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Adrian Bunk
     
  • The VIA Padlock device requires the input and output buffers to
    be aligned on 16-byte boundaries. This patch adds the alignmask
    attribute for low-level cipher implementations to indicate their
    alignment requirements.

    The mid-level crypt() function will copy the input/output buffers
    if they are not aligned correctly before they are passed to the
    low-level implementation.

    Strictly speaking, some of the software implementations require
    the buffers to be aligned on 4-byte boundaries as they do 32-bit
    loads. However, it is not clear whether it is better to copy
    the buffers or pay the penalty for unaligned loads/stores.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
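
    An illustrative helper showing the copy-if-misaligned pattern the
    mid-level crypt() function applies; the name and shape are
    assumptions, not the actual kernel code:

        static u8 *align_input(struct crypto_tfm *tfm, u8 *src,
                               u8 *aligned_buf, unsigned int bsize)
        {
                unsigned long alignmask = crypto_tfm_alg_alignmask(tfm);

                if ((unsigned long)src & alignmask) {
                        /* aligned_buf must itself satisfy the mask */
                        memcpy(aligned_buf, src, bsize);
                        return aligned_buf;
                }
                return src;
        }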
     
  • This patch adds hooks for cipher algorithms to implement multi-block
    ECB/CBC operations directly. This is expected to provide a significant
    performance boost on the VIA Padlock; a sketch follows this entry.

    It could also be used for improving software implementations such as
    AES where operating on multiple blocks at a time may enable certain
    optimisations.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
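
    A hedged sketch of what such hooks could look like in struct
    cipher_alg; the member names and signatures below are assumptions for
    illustration only:

        struct cipher_alg {
                /* ... existing single-block cia_encrypt/cia_decrypt ... */
                unsigned int (*cia_encrypt_ecb)(const struct cipher_desc *desc,
                                                u8 *dst, const u8 *src,
                                                unsigned int nbytes);
                unsigned int (*cia_decrypt_ecb)(const struct cipher_desc *desc,
                                                u8 *dst, const u8 *src,
                                                unsigned int nbytes);
                /* plus matching cia_encrypt_cbc/cia_decrypt_cbc hooks */
        };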
     
  • The VIA Padlock device is able to perform much better when multiple
    blocks are fed to it at once. As this device offers an exceptional
    throughput rate it is worthwhile to optimise the infrastructure
    specifically for it.

    We shift the existing page-sized fast path down to the CBC/ECB
    functions. We can then replace the CBC/ECB functions with functions
    provided by the underlying algorithm that perform the multi-block
    operations.

    As a side-effect this improves the performance of large cipher operations
    for all existing algorithm implementations. I've measured the gain to be
    around 5% for 3DES and 15% for AES.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Checking a pointer for NULL before calling kfree() on it is redundant.
    This patch removes such checks from crypto/, as shown below.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Jesper Juhl
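
    That is, the patch turns the first form below into the second:

        /* Redundant guard: */
        if (ptr)
                kfree(ptr);

        /* Sufficient, since kfree(NULL) is a no-op: */
        kfree(ptr);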
     

23 Jun, 2005

5 commits

  • After using this facility for a while to test my changes to the
    cipher crypt() layer, I realised that I should've listened to Dave
    and made this thing use CPU cycle counters :) As it is, it's too
    jittery for me to feel safe about relying on the results.

    So here is a patch to make it use CPU cycles by default but fall
    back to jiffies if the user specifies a non-zero sec value.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
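
    A sketch of the measurement change, assuming get_cycles() from
    <linux/timex.h> and the contemporary crypto_cipher_encrypt() call
    shape:

        cycles_t start, end;

        start = get_cycles();
        crypto_cipher_encrypt(tfm, sg, sg, blen);  /* in-place encrypt */
        end = get_cycles();
        /* fall back to jiffies-based timing only when the user asks
         * for a non-zero `sec' value */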
     
  • The existing keys used in the speed tests do not pass the 3DES quality check.
    This patch makes it use the template keys instead.

    Other algorithms can supply template keys through the same interface if needed.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • From: Reyk Floeter

    I recently had the requirement to do some benchmarking on cryptoapi, and
    I found reyk's very useful performance test patch [1].

    However, I could not find any discussion on why that extension (or
    something providing a similar feature but different implementation) was
    not merged into mainline. If there was such a discussion, can someone
    please point me to the archive[s]?

    I've now merged the old patch into 2.6.12-rc1, the result can be found
    attached to this email.

    [1] http://lists.logix.cz/pipermail/padlock/2004/000010.html

    Signed-off-by: Harald Welte
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Harald Welte
     
  • It seems that bad code tends to get copied (see test_cipher_speed). So let's
    kill this idiom before it spreads any further.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu