11 Jan, 2008

40 commits

  • Analogously to camellia7 patch, move
    "absorb kw2 to other subkeys" and "absorb kw4 to other subkeys"
    code parts into camellia_setup_tail(). This further reduces
    source and object code size at the cost of two branches
    in key setup code.

    Signed-off-by: Denys Vlasenko
    Signed-off-by: Herbert Xu

    Denys Vlasenko
     
  • Move "key XOR is end of F-function" code part into
    camellia_setup_tail(), it is sufficiently similar
    between camellia_setup128 and camellia_setup256.

    Signed-off-by: Denys Vlasenko
    Signed-off-by: Herbert Xu

    Denys Vlasenko
     
  • Unifies encrypt/decrypt routines for different key lengths.
    This reduces module size by ~25%, with tiny (less than 1%)
    speed impact.
    Also collapses encrypt/decrypt into more readable
    (visually shorter) form using macros.

    Signed-off-by: Denys Vlasenko
    Signed-off-by: Herbert Xu

    Denys Vlasenko
     
  • Remove unused macro params.
    Use (u8)(expr) instead of (expr) & 0xff, which
    helps gcc realize that it can use simpler instructions.
    Move CAMELLIA_FLS macro closer to encrypt/decrypt routines.

    Signed-off-by: Denys Vlasenko
    Signed-off-by: Herbert Xu

    Denys Vlasenko
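
The (u8) cast and the 0xff mask mentioned above select the same byte. A minimal sketch of the equivalence follows; the helper names are hypothetical and are not taken from the camellia source.

```c
#include <assert.h>
#include <stdint.h>

/* Extract byte n (0 = least significant) by masking. */
static uint32_t byte_mask(uint32_t x, unsigned n)
{
	assert(n < 4);
	return (x >> (8 * n)) & 0xff;
}

/* Same result, expressed as a narrowing cast to an 8-bit type. */
static uint32_t byte_cast(uint32_t x, unsigned n)
{
	assert(n < 4);
	return (uint8_t)(x >> (8 * n));
}
```

Both forms return the same value for every input; the cast only states the intent in a way the compiler may find easier to map onto a byte-sized load or move.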
     
  • This patch replaces the custom inc/xor in CTR with the generic functions.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch replaces the custom xor in CBC with the generic crypto_xor.

    It changes the operations for in-place encryption slightly to avoid
    calling crypto_xor with tmpbuf since it is not necessarily aligned.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • All common block ciphers have a block size that's a power of 2. In fact,
    all of our block ciphers obey this rule.

    If we require this then CBC can be optimised to avoid an expensive divide
    on in-place decryption.

    I've also changed the saving of the first IV in the in-place decryption
    case to the last IV because that lets us use walk->iv (which is already
    aligned) for the xor operation where alignment is required.

    Signed-off-by: Herbert Xu

    Herbert Xu
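
The divide can be avoided because, for a power-of-2 block size, the remainder is a simple mask. The following is a sketch of that arithmetic only, with hypothetical helper names; it is not the CBC code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: true if bsize is a non-zero power of two. */
static int is_power_of_2_sketch(size_t bsize)
{
	return bsize != 0 && (bsize & (bsize - 1)) == 0;
}

/* Bytes left over after the last whole block, computed without a divide. */
static size_t partial_block_bytes(size_t nbytes, size_t bsize)
{
	assert(is_power_of_2_sketch(bsize));
	return nbytes & (bsize - 1);	/* same result as nbytes % bsize */
}

/* Offset of the last whole block in a buffer holding at least one block. */
static size_t last_block_offset(size_t nbytes, size_t bsize)
{
	assert(is_power_of_2_sketch(bsize) && nbytes >= bsize);
	return nbytes - (nbytes & (bsize - 1)) - bsize;
}
```

In-place decryption walks the blocks from the end of the buffer, which is why the offset of the last whole block is the value of interest here.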
     
  • This patch replaces the custom xor in CBC with the generic crypto_xor.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • With the addition of more stream ciphers we need to curb the proliferation
    of ad-hoc xor functions. This patch creates a generic pair of functions,
    crypto_inc and crypto_xor, which do big-endian increment and exclusive or,
    respectively.

    For optimum performance, they both use u32 operations, so the arguments
    must be aligned as for u32 even though they are of type u8 *.

    Signed-off-by: Herbert Xu

    Herbert Xu
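
To illustrate what a big-endian increment and a buffer xor do, here is a byte-wise sketch. It is an assumption-laden simplification: the functions described above work on u32 words and need aligned buffers, while these hypothetical `_sketch` versions work one byte at a time and have no alignment requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Treat a[0..size-1] as one big-endian integer and add 1 (wraps on overflow). */
static void inc_be_sketch(uint8_t *a, size_t size)
{
	assert(a != NULL);
	while (size > 0) {
		size--;
		if (++a[size] != 0)	/* stop once a byte does not wrap to 0 */
			break;
	}
}

/* dst[i] ^= src[i] for every byte of the buffers. */
static void xor_sketch(uint8_t *dst, const uint8_t *src, size_t size)
{
	assert(dst != NULL && src != NULL);
	for (size_t i = 0; i < size; i++)
		dst[i] ^= src[i];
}
```

Incrementing the bytes 00 00 ff ff with inc_be_sketch() carries through the two trailing bytes and yields 00 01 00 00.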
     
  • Signed-off-by: Patrick McHardy
    Acked-by: Evgeniy Polyakov
    Signed-off-by: Herbert Xu

    Patrick McHardy
     
  • The current PLL initialization has a number of deficiencies:

    - uses fixed multiplier of 8, which overclocks the chip when using a
    reference clock that operates at frequencies above 33MHz. According
    to a comment in the BSD source, this is true for the external clock
    on almost every board.

    - writes to a reserved bit

    - doesn't follow the initialization procedure specified in chapter
    6.11.1 of the HIFN hardware users guide

    - doesn't allow using the PCI clock

    This patch adds a module parameter to specify the reference clock
    (pci or external) and its frequency and uses that to calculate the
    optimum multiplier to reach the maximal speed. By default it uses
    the external clock and assumes a speed of 66MHz, which effectively
    halves the frequency currently used.

    Signed-off-by: Patrick McHardy
    Acked-by: Evgeniy Polyakov
    Signed-off-by: Herbert Xu

    Patrick McHardy
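
The multiplier arithmetic described above can be sketched as follows. The 266MHz ceiling is an assumption inferred from "multiplier of 8" at 33MHz; it is not quoted from the HIFN users guide, and the names are hypothetical. Real hardware may also restrict which multiplier values are valid.

```c
#include <assert.h>

/* Assumed ceiling for the PLL output in MHz (inferred, not from the guide). */
#define PLL_MAX_MHZ_SKETCH 266u

/* Largest integer multiplier that keeps ref_mhz * m at or below the ceiling. */
static unsigned pll_multiplier_sketch(unsigned ref_mhz)
{
	assert(ref_mhz > 0 && ref_mhz <= PLL_MAX_MHZ_SKETCH);
	return PLL_MAX_MHZ_SKETCH / ref_mhz;
}
```

Under that assumption a 33MHz reference gives a multiplier of 8 and a 66MHz reference gives 4, which matches the statement that the new default halves the frequency produced by the old fixed multiplier of 8 on a 66MHz clock.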
     
  • Handle waiting for new random data within the drivers themselves. This
    allows the use of better-suited timeouts for the individual RNGs.

    Signed-off-by: Patrick McHardy
    Acked-by: Michael Buesch
    Signed-off-by: Herbert Xu

    Patrick McHardy
     
  • This patch implements the Salsa20 stream cipher using the blkcipher interface.

    The core cipher code comes from Daniel Bernstein's submission to eSTREAM:
    http://www.ecrypt.eu.org/stream/svn/viewcvs.cgi/ecrypt/trunk/submissions/salsa20/full/ref/

    The test vectors come from:
    http://www.ecrypt.eu.org/stream/svn/viewcvs.cgi/ecrypt/trunk/submissions/salsa20/full/

    It has been tested successfully with "modprobe tcrypt mode=34" on a
    UML instance.

    Signed-off-by: Tan Swee Heng
    Signed-off-by: Herbert Xu

    Tan Swee Heng
     
  • Up until now ablkcipher algorithms have been identified as
    type BLKCIPHER with the ASYNC bit set. This is suboptimal because
    ablkcipher refers to two things. On the one hand it refers to the
    top-level ablkcipher interface with requests. On the other hand it
    refers to an algorithm type underneath.

    As it is you cannot request a synchronous block cipher algorithm
    with the ablkcipher interface on top. This is a problem because
    we want to be able to eventually phase out the blkcipher top-level
    interface.

    This patch fixes this by making ABLKCIPHER its own type, just as
    we have distinct types for HASH and DIGEST. The type is associated
    with the algorithm implementation only.

    Which top-level interface is used for synchronous block ciphers is
    then determined by the mask that's used. If it's a specific mask
    then the old blkcipher interface is given, otherwise we go with the
    new ablkcipher interface.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch updates the list of transforms we support and clarifies that
    the Block Ciphers interface in fact supports all ciphers including stream
    ciphers.

    It also removes the obsolete Configuration Notes section and adds the
    linux-crypto mailing list as the primary bug reporting address.

    Finally it documents the fact that setkey should only be called from
    user context.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch converts the crypto scatterwalk code to use the generic
    scatterlist chaining rather than the version specific to crypto.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • alpha:

    drivers/crypto/hifn_795x.c: In function 'ablkcipher_walk_init':
    drivers/crypto/hifn_795x.c:1231: error: implicit declaration of function 'sg_init_table'
    drivers/crypto/hifn_795x.c:1243: error: implicit declaration of function 'sg_set_page'
    drivers/crypto/hifn_795x.c: In function 'ablkcipher_walk_exit':
    drivers/crypto/hifn_795x.c:1257: error: implicit declaration of function 'sg_page'
    drivers/crypto/hifn_795x.c:1257: warning: passing argument 1 of '__free_pages' makes pointer from integer without a cast
    drivers/crypto/hifn_795x.c: In function 'ablkcipher_add':
    drivers/crypto/hifn_795x.c:1278: warning: passing argument 1 of 'kmap_atomic' makes pointer from integer without a cast
    drivers/crypto/hifn_795x.c: In function 'ablkcipher_walk':
    drivers/crypto/hifn_795x.c:1336: warning: passing argument 1 of 'kmap_atomic' makes pointer from integer without a cast
    drivers/crypto/hifn_795x.c: In function 'hifn_setup_session':
    drivers/crypto/hifn_795x.c:1465: warning: assignment makes pointer from integer without a cast
    drivers/crypto/hifn_795x.c:1469: warning: assignment makes pointer from integer without a cast
    drivers/crypto/hifn_795x.c:1472: warning: assignment makes pointer from integer without a cast
    drivers/crypto/hifn_795x.c: In function 'ablkcipher_get':
    drivers/crypto/hifn_795x.c:1593: warning: passing argument 1 of 'kmap_atomic' makes pointer from integer without a cast
    {standard input}: Assembler messages:
    {standard input}:7: Warning: setting incorrect section attributes for .got
    drivers/crypto/hifn_795x.c: In function 'hifn_process_ready':
    drivers/crypto/hifn_795x.c:1653: warning: passing argument 1 of 'kmap_atomic' makes pointer from integer without a cast
    drivers/crypto/hifn_795x.c: In function 'hifn_probe':
    drivers/crypto/hifn_795x.c:2438: error: 'DMA_32BIT_MASK' undeclared (first use in this function)
    drivers/crypto/hifn_795x.c:2438: error: (Each undeclared identifier is reported only once
    drivers/crypto/hifn_795x.c:2438: error: for each function it appears in.)
    drivers/crypto/hifn_795x.c:2443: warning: format '%d' expects type 'int', but argument 4 has type 'long int'
    drivers/crypto/hifn_795x.c:2443: warning: format '%d' expects type 'int', but argument 4 has type 'long int'

    Signed-off-by: Andrew Morton
    Signed-off-by: Herbert Xu

    Andrew Morton
     
  • The HIFN driver is currently selectable on s390 but won't compile.
    Since it looks like HIFN needs PCI make the Kconfig dependent on PCI,
    which is not available on s390.

    Signed-off-by: Jan Glauber
    Acked-by: Evgeniy Polyakov
    Signed-off-by: Herbert Xu

    Jan Glauber
     
  • This patch forces HIFN driver to invoke crypto request callbacks from
    tasklet (softirq context) instead of hardirq context, since network
    stack expects it to be called from bottom halves.

    It is done by simply scheduling callback invocation via dedicated
    tasklet. Workqueue solution was dropped because of too slow
    rescheduling performance (7 times slower than tasklet; for more details
    one can check this link:
    http://tservice.net.ru/~s0mbre/blog/devel/other/2007_11_09.html).

    Driver passed all AES and DES tests in tcrypt.c module.

    Signed-off-by: Evgeniy Polyakov
    Signed-off-by: Herbert Xu

    Evgeniy Polyakov
     
  • Resubmitting this patch which extends sha256_generic.c to support SHA-224 as
    described in FIPS 180-2 and RFC 3874. HMAC-SHA-224 as described in RFC 4231
    is then supported through the hmac interface.

    Patch includes test vectors for SHA-224 and HMAC-SHA-224.

    SHA-224 could be chosen as a hash algorithm when 112 bits of security
    strength is required.

    Patch generated against the 2.6.24-rc1 kernel and tested against
    2.6.24-rc1-git14 which includes fix for scatter gather implementation for HMAC.

    Signed-off-by: Jonathan Lynch
    Signed-off-by: Herbert Xu

    Jonathan Lynch
     
  • The Geode AES crypto engine supports only 128 bit long key. This
    patch adds fallback for other key sizes which are required by the
    AES standard.

    Signed-off-by: Sebastian Siewior
    Acked-by: Jordan Crouse
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • The setkey() function can be shared with the generic algorithm.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • No other block mode is M by default.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • The setkey() function can be shared with the generic algorithm.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • This patch exports four tables and the set_key() routine. These resources
    can be shared by other AES implementations (aes-x86_64 for instance).
    The decryption key has been turned around (deckey[0] is the first piece
    of the key instead of deckey[keylen+20]). The encrypt/decrypt functions
    now look identical (except that they use different tables and
    keys).

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • This patch adds countersize to CTR mode.
    The template is now ctr(algo,noncesize,ivsize,countersize).

    For example, ctr(aes,4,8,4) indicates the counterblock
    will be composed of a salt/nonce that is 4 bytes, an iv
    that is 8 bytes and the counter is 4 bytes.

    When noncesize + ivsize < blocksize, CTR initializes the
    last block - ivsize - noncesize portion of the block to
    zero. Otherwise the counter block is composed of the IV
    (and nonce if necessary).

    If noncesize + ivsize == blocksize, then this indicates that
    user is passing in entire counterblock. Thus countersize
    indicates the amount of bytes in counterblock to use as
    the counter for incrementing. CTR will increment counter
    portion by 1, and begin encryption with that value.

    Note that CTR assumes the counter portion of the block that
    will be incremented is stored in big endian.

    Signed-off-by: Joy Latten
    Signed-off-by: Herbert Xu

    Joy Latten
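
The ctr(aes,4,8,4) layout described above can be sketched as follows. This is an illustration under stated assumptions, not the template's code: the block size is fixed at 16 bytes for AES, and all `_sketch` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE_SKETCH 16	/* AES block size in bytes */

/* Big-endian increment of the trailing countersize bytes of the block. */
static void ctr_inc_sketch(uint8_t *block, size_t countersize)
{
	uint8_t *p = block + BLOCK_SIZE_SKETCH;

	assert(countersize <= BLOCK_SIZE_SKETCH);
	while (countersize-- > 0) {
		p--;
		if (++*p != 0)	/* no carry into the next byte */
			break;
	}
}

/* Lay out nonce | iv | zeroed remainder, then increment the counter part. */
static void ctr_build_block_sketch(uint8_t *block,
				   const uint8_t *nonce, size_t noncesize,
				   const uint8_t *iv, size_t ivsize,
				   size_t countersize)
{
	assert(noncesize + ivsize <= BLOCK_SIZE_SKETCH);
	memset(block, 0, BLOCK_SIZE_SKETCH);
	memcpy(block, nonce, noncesize);
	memcpy(block + noncesize, iv, ivsize);
	ctr_inc_sketch(block, countersize);
}
```

With a 4-byte nonce, an 8-byte IV and a 4-byte counter, the first counter block ends in 00 00 00 01. When noncesize + ivsize equals the block size, nothing is zeroed and the increment applies to the last countersize bytes supplied by the caller.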
     
  • Move huge unrolled pieces of code (3 screenfuls) at the end of
    128/256 key setup routines into common camellia_setup_tail(),
    convert it to loop there.
    Loop is still unrolled six times, so performance hit is very small,
    code size win is big.

    Signed-off-by: Denys Vlasenko
    Acked-by: Noriaki TAKAMIYA
    Signed-off-by: Herbert Xu

    Denys Vlasenko
     
  • Optimize GETU32 to use 4-byte memcpy (modern gcc will convert
    such memcpy to single move instruction on i386).
    Original GETU32 did four byte fetches, and shifted/XORed those.

    Signed-off-by: Denys Vlasenko
    Acked-by: Noriaki TAKAMIYA
    Signed-off-by: Herbert Xu

    Denys Vlasenko
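
The two ways of reading a big-endian 32-bit word can be compared in a userspace sketch. The byte swap after the memcpy is an assumption added here so that both versions agree on little-endian hosts; be32_to_cpu_sketch() is a hypothetical stand-in, not the kernel helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Original style: four byte fetches combined with shifts and XORs. */
static uint32_t getu32_bytes(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) ^ ((uint32_t)p[1] << 16) ^
	       ((uint32_t)p[2] << 8) ^ (uint32_t)p[3];
}

/* Convert a big-endian value to host order, detecting endianness at run time. */
static uint32_t be32_to_cpu_sketch(uint32_t v)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	if (first == 0)		/* big-endian host: already in CPU order */
		return v;
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* memcpy style: one 4-byte copy that a compiler can turn into a single load. */
static uint32_t getu32_memcpy(const uint8_t *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return be32_to_cpu_sketch(v);
}
```

Because memcpy has no alignment requirement, the second version stays well defined for unaligned pointers while still giving the compiler the chance to emit one move.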
     
  • Rename some macros to shorter names: CAMELLIA_RR8 -> ROR8,
    making it easier to understand that it is just a right rotation,
    nothing camellia-specific in it.
    CAMELLIA_SUBKEY_L() -> SUBKEY_L() - just shorter.

    Move be32 cpu conversions out of en/decrypt128/256 and into
    camellia_en/decrypt - no reason to have that code duplicated twice.

    Signed-off-by: Denys Vlasenko
    Acked-by: Noriaki TAKAMIYA
    Signed-off-by: Herbert Xu

    Denys Vlasenko
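
As the message says, ROR8 is just a right rotation by 8 bits. A minimal sketch, with hypothetical names and a general rotate helper that is an assumption of this example rather than part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word right by n bits, for 0 < n < 32. */
static uint32_t ror32_sketch(uint32_t x, unsigned n)
{
	assert(n > 0 && n < 32);	/* a shift by 32 would be undefined */
	return (x >> n) | (x << (32 - n));
}

/* Right rotation by 8: the low byte moves to the top of the word. */
#define ROR8_SKETCH(x) ror32_sketch((x), 8)
```

Applied to 0x12345678, the low byte 0x78 moves to the top of the word.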
     
  • Move code blocks around so that related pieces are closer together:
    e.g. CAMELLIA_ROUNDSM macro does not need to be separated
    from the rest of the code by huge array of constants.

    Remove unused macros (COPY4WORD, SWAP4WORD, XOR4WORD[2])

    Drop SUBL(), SUBR() macros which only obscure things.
    Same for CAMELLIA_SP1110() macro and KEY_TABLE_TYPE typedef.

    Remove useless comments:
    /* encryption */ -- well it's obvious enough already!
    void camellia_encrypt128(...)

    Combine swap with copying at the beginning/end of encrypt/decrypt.

    Signed-off-by: Denys Vlasenko
    Acked-by: Noriaki TAKAMIYA
    Signed-off-by: Herbert Xu

    Denys Vlasenko
     
  • Currently twofish cipher key setup code
    has unrolled loops - approximately 70-100
    instructions are repeated 40 times.

    As a result, twofish module is the biggest module
    in crypto/*.

    Unrolling produces x2.5 more code (+18k on i386), and speeds up key
    setup by 7%:

    unrolled: twofish_setkey/sec: 41128
    loop: twofish_setkey/sec: 38148
    CALC_K256: ~100 insns each
    CALC_K192: ~90 insns
    CALC_K: ~70 insns

    Attached patch removes this unrolling.

    $ size */twofish_common.o
    text data bss dec hex filename
    37920 0 0 37920 9420 crypto.org/twofish_common.o
    13209 0 0 13209 3399 crypto/twofish_common.o

    Run tested (modprobe tcrypt reports ok). Please apply.

    Signed-off-by: Denys Vlasenko
    Signed-off-by: Herbert Xu

    Denys Vlasenko
     
  • This patch moves macros in geode-aes.c into geode-aes.h.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • The code waits in a busy loop until the hardware finishes the encryption
    or decryption process. This wants a cpu_relax() :)
    The busy loop finishes either if the encryption is done or if the counter
    is zero. If the latter is true then the hardware failed. Since this
    should not happen, leave with a BUG().

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
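
The bounded busy loop described above can be sketched in userspace as follows. Everything here is hypothetical: cpu_relax_sketch() is a no-op stand-in for the kernel's cpu_relax(), the status callback replaces the hardware register read, and the function returns -1 where the driver would hit BUG().

```c
#include <assert.h>
#include <stddef.h>

/* No-op stand-in for the kernel's cpu_relax() in this userspace sketch. */
#define cpu_relax_sketch() ((void)0)

/* Hypothetical status callback: returns non-zero once the operation is done. */
typedef int (*op_done_fn)(void *ctx);

/*
 * Poll until done() reports completion or the counter reaches zero.
 * Returns 0 on completion and -1 if the counter ran out.
 */
static int wait_for_op_sketch(op_done_fn done, void *ctx, unsigned counter)
{
	assert(done != NULL);
	while (counter > 0) {
		if (done(ctx))
			return 0;
		cpu_relax_sketch();
		counter--;
	}
	return -1;
}

/* Example callback: reports completion after the int at ctx counts down to 0. */
static int countdown_done(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}
```

Returning an error code instead of aborting keeps the sketch testable; the choice between that and a hard failure is the caller's.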
     
  • It is enough if the IV is copied before and after the while loop.
    With DM-Crypt it seems not to be required to save the IV after encryption
    because a new one is used in the request (dunno about other users).
    It is not safe to load the IV within the while loop and not save it
    afterwards, because we will end up with the wrong IV if the request
    consists of more than one page.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • These three defines are used in all AES related hardware.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • The alias isn't required because the module provides PCI ids.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • HIFN driver update to use DES weak key checks (exported in this patch).

    Signed-off-by: Evgeniy Polyakov
    Signed-off-by: Herbert Xu

    Evgeniy Polyakov
     
  • This patch creates include/crypto/des.h for common macros shared between
    DES implementations.

    Signed-off-by: Evgeniy Polyakov
    Signed-off-by: Herbert Xu

    Evgeniy Polyakov
     
  • This is a driver for HIFN 795x crypto accelerator chips.

    It passed all tests for AES, DES and DES3_EDE except the weak key test
    for DES, since the hardware cannot determine weak keys.

    Signed-off-by: Evgeniy Polyakov
    Signed-off-by: Herbert Xu

    Evgeniy Polyakov