11 Oct, 2007

22 commits

  • There are currently several SHA implementations that all define their own
    initialization vectors and size values. Since these values are identical,
    move them to a header file under include/crypto (the shared constants are
    sketched after this entry).

    Signed-off-by: Jan Glauber
    Signed-off-by: Herbert Xu

    Jan Glauber
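
    A sketch of what the consolidated header ends up holding; the SHA-1
    values below are the standard FIPS 180-2 initialization constants and
    the macro names follow include/crypto/sha.h (illustrative, not a
    verbatim quote of the patch):

        #define SHA1_DIGEST_SIZE    20
        #define SHA1_BLOCK_SIZE     64

        #define SHA1_H0     0x67452301UL
        #define SHA1_H1     0xefcdab89UL
        #define SHA1_H2     0x98badcfeUL
        #define SHA1_H3     0x10325476UL
        #define SHA1_H4     0xc3d2e1f0UL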
     
  • Loading the crypto algorithm by its alias instead of by module directly
    has the advantage that all possible implementations of this algorithm
    are loaded automatically and the crypto API can choose the best one
    depending on its priority (see the alias sketch after this entry).

    Additionally it ensures that the generic implementation as well as the
    HW driver (if available) are loaded in case the HW driver needs the
    generic version as a fallback in corner cases.

    Also remove the probe for sha1 in padlock's init code.

    Quote from Herbert:
    The probe is actually pointless since we can always probe when
    the algorithm is actually used, which does not lead to dead-locks
    like this.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
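
    A minimal sketch of the mechanism, assuming sha1 as the algorithm:
    every implementation declares the algorithm name as a module alias, so
    a request by name loads all of them and the priority system picks one.

        /* in each implementation, e.g. sha1_generic.c and padlock-sha.c */
        MODULE_ALIAS("sha1");

        /* consumer side: request by algorithm name, not module name */
        struct crypto_hash *tfm = crypto_alloc_hash("sha1", 0,
                                                    CRYPTO_ALG_ASYNC);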
     
  • Loading the crypto algorithm by the alias instead of by module directly
    has the advantage that all possible implementations of this algorithm
    are loaded automatically and the crypto API can choose the best one
    depending on its priority.

    Additionally it ensures that the generic implementation as well as the
    HW driver (if available) are loaded in case the HW driver needs the
    generic version as a fallback in corner cases.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • Loading the crypto algorithm by the alias instead of by module directly
    has the advantage that all possible implementations of this algorithm
    are loaded automatically and the crypto API can choose the best one
    depending on its priority.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • This patch adds the helper blkcipher_walk_virt_block which is similar to
    blkcipher_walk_virt but uses a supplied block size instead of the block
    size of the block cipher. This is useful for CTR, where the block size is
    1 but we still want to walk by the block size of the underlying cipher
    (a usage sketch follows this entry).

    Signed-off-by: Herbert Xu

    Herbert Xu
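
    A hedged sketch of the intended use, not actual kernel code; the ctr
    context layout and function names here are illustrative:

        /* hypothetical context: CTR wrapping an underlying block cipher */
        struct ctr_ctx {
            struct crypto_cipher *child;
        };

        static int ctr_crypt_sketch(struct blkcipher_desc *desc,
                                    struct scatterlist *dst,
                                    struct scatterlist *src,
                                    unsigned int nbytes)
        {
            struct ctr_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
            unsigned int bsize = crypto_cipher_blocksize(ctx->child);
            struct blkcipher_walk walk;
            int err;

            blkcipher_walk_init(&walk, dst, src, nbytes);
            /* walk in units of the underlying cipher's block size,
             * even though CTR's own block size is 1 */
            err = blkcipher_walk_virt_block(desc, &walk, bsize);

            while (walk.nbytes) {
                /* generate keystream from walk.iv with ctx->child and
                 * XOR walk.src.virt.addr into walk.dst.virt.addr here */
                err = blkcipher_walk_done(desc, &walk, 0);
            }

            return err;
        }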
     
  • Now that the block size is no longer a multiple of the alignment, we need
    to increase the kmalloc amount in blkcipher_next_slow to use the aligned
    block size (illustrated after this entry).

    Signed-off-by: Herbert Xu

    Herbert Xu
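
    Illustrative only (the real sizing logic in blkcipher_next_slow is
    more involved): the bounce buffer is sized from the block size rounded
    up to the alignment, plus slack so the pointer itself can be aligned.

        unsigned int aligned_bsize = ALIGN(bsize, alignmask + 1);
        u8 *buf = kmalloc(aligned_bsize + alignmask, GFP_ATOMIC);
        u8 *aligned_buf = (u8 *)ALIGN((unsigned long)buf, alignmask + 1);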
     
  • This patch adds a comment to explain why we compare the cra_driver_name of
    the algorithm being registered against the cra_name of a larval as opposed
    to the cra_driver_name of the larval.

    In fact larvals have only one name, cra_name, which is the name that was
    requested by the user. The test here is simply trying to find out whether
    the algorithm being registered can or cannot satisfy the larval.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Previously we assumed for convenience that the block size is a multiple of
    the algorithm's required alignment. With the pending addition of CTR this
    will no longer be the case as the block size will be 1 due to it being a
    stream cipher. However, the alignment requirement will be that of the
    underlying implementation, which will most likely be greater than 1.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • We do not allow spaces in algorithm names or parameters. Thanks to Joy Latten
    for pointing this out.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • As Joy Latten points out, inner algorithm parameters will miss the closing
    bracket, which will also cause the outer algorithm to terminate
    prematurely.

    This patch fixes that and also kills the WARN_ON when the number of
    parameters exceeds the maximum, as that is a user error.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • XTS is currently considered to be the successor of the LRW mode by the
    IEEE 1619 workgroup. LRW was discarded because it is not secure if the
    encryption key itself is encrypted with LRW.

    XTS does not have this problem. The implementation is pretty
    straightforward: a new function was added to gf128mul to handle GF(128)
    elements in ble format (see the doubling sketch after this entry).
    Four test vectors from the specification
    http://grouper.ieee.org/groups/1619/email/pdf00086.pdf
    were added, and they verify on my system.

    Signed-off-by: Rik Snel
    Signed-off-by: Herbert Xu

    Rik Snel
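
    A standalone sketch of the new primitive: multiplication by x (tweak
    doubling) in GF(2^128) on a value held as two little-endian 64-bit
    words, reducing by x^128 + x^7 + x^2 + x + 1 (the 0x87 feedback). The
    function name is illustrative, not the gf128mul API:

        #include <stdint.h>

        /* t[0] holds the low 64 bits of the tweak, t[1] the high 64 bits */
        static void xts_tweak_double(uint64_t t[2])
        {
            uint64_t carry = t[1] >> 63;    /* bit shifted out of x^127 */

            t[1] = (t[1] << 1) | (t[0] >> 63);
            t[0] = (t[0] << 1) ^ (carry ? 0x87 : 0);
        }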
     
  • Use max in blkcipher_get_spot() instead of open coding it.

    Signed-off-by: Ingo Oeser
    Signed-off-by: Herbert Xu

    Ingo Oeser
     
  • When scatterwalk is built as a module, digest.c was broken because it
    requires the crypto_km_types structure which is in scatterwalk. This
    patch removes the crypto_km_types structure by encoding the logic into
    crypto_kmap_type directly (sketched after this entry).

    In fact, this even saves a few bytes of code (not to mention the data
    structure itself) on i386 which is about the only place where it's
    needed.

    Signed-off-by: Herbert Xu

    Herbert Xu
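
    The encoded logic amounts to roughly the following (a reconstruction,
    not a verbatim quote): pick the base kmap slot from the execution
    context and let the out flag select the second slot of the pair.

        static inline enum km_type crypto_kmap_type(int out)
        {
            enum km_type type;

            if (in_softirq())
                type = KM_SOFTIRQ0;
            else
                type = KM_USER0;

            /* KM_SOFTIRQ1/KM_USER1 directly follow their *0 slots */
            return type + out;
        }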
     
  • This patch adds the authenc algorithm which constructs an AEAD algorithm
    from an asynchronous block cipher and a hash. The construction is done
    by concatenating the encrypted result from the cipher with the output
    from the hash, as is used by the IPsec ESP protocol.

    The authenc algorithm exists as a template with four parameters:

    authenc(auth, authsize, enc, enckeylen).

    These are the authentication algorithm, the authentication size (the
    output of the authentication algorithm is truncated to this length),
    the encryption algorithm, and the encryption key length. Both the size
    field and the key length field are in bytes. For example, AES-128 with
    SHA1-HMAC would be represented by
    represented by

    authenc(hmac(sha1), 12, cbc(aes), 16)

    The key for the authenc algorithm is the concatenation of the keys for
    the authentication algorithm with the encryption algorithm. For the
    above example, if a key of length 36 bytes is given, then hmac(sha1)
    would receive the first 20 bytes while the last 16 would be given to
    cbc(aes). (A setkey sketch follows this entry.)

    Signed-off-by: Herbert Xu

    Herbert Xu
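
    A hedged sketch of the key split under the example template
    authenc(hmac(sha1), 12, cbc(aes), 16); the function and parameter
    names are illustrative, not the actual crypto/authenc.c code:

        static int authenc_setkey_sketch(struct crypto_hash *auth,
                                         struct crypto_ablkcipher *enc,
                                         const u8 *key, unsigned int keylen,
                                         unsigned int enckeylen)
        {
            unsigned int authkeylen;

            if (keylen < enckeylen)
                return -EINVAL;
            authkeylen = keylen - enckeylen;  /* 36 - 16 = 20 here */

            /* leading bytes feed the hash, trailing bytes the cipher */
            if (crypto_hash_setkey(auth, key, authkeylen))
                return -EINVAL;
            return crypto_ablkcipher_setkey(enc, key + authkeylen,
                                            enckeylen);
        }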
     
  • This patch adds the function scatterwalk_map_and_copy which reads or
    writes a chunk of data from a scatterlist at a given offset. It will
    be used by authenc, which reads/writes the authentication data at
    the end of the cipher/plain text (a usage sketch follows this entry).

    Signed-off-by: Herbert Xu

    Herbert Xu
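
    A short usage sketch (surrounding variables assumed): with out set the
    buffer is written into the scatterlist, with out clear it is read back,
    which is exactly what authenc needs for the ICV at offset cryptlen.

        /* append the computed ICV after the ciphertext */
        scatterwalk_map_and_copy(icv, dst_sg, cryptlen, authsize, 1);

        /* on decryption, fetch the received ICV for comparison */
        scatterwalk_map_and_copy(icv_in, src_sg, cryptlen, authsize, 0);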
     
  • The scatterwalk code is only used by algorithms that can be built as
    a module. Therefore we can move it into algapi.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Since not everyone needs a queue pointer, and those who need it can
    always get it from the context anyway, the queue pointer in the
    common alg object is redundant.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch ensures that kernel.h and slab.h are included for
    the setkey_unaligned function. It also breaks a couple of
    long lines.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch adds support for having multiple parameters to
    a template, separated by a comma. It also adds support
    for integer parameters in addition to the current algorithm
    parameter type.

    This will be used by the authenc template which will have
    four parameters: the authentication algorithm, the encryption
    algorithm, the authentication size and the encryption key
    length.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch adds crypto_aead which is the interface for AEAD
    (Authenticated Encryption with Associated Data) algorithms.

    AEAD algorithms perform authentication and encryption in one
    step. Traditionally users (such as IPsec) would use two
    different crypto algorithms to perform these. With AEAD
    this comes down to one algorithm and one operation.

    Of course if traditional algorithms were used we'd still
    be doing two operations underneath. However, real AEAD
    algorithms may allow the underlying operations to be
    optimised as well. (A brief usage sketch follows this entry.)

    Signed-off-by: Herbert Xu

    Herbert Xu
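
    A hedged usage sketch; the calls shown are real kernel API, but the
    request setup (notably how associated data is attached) has changed
    across kernel versions, so treat the details as illustrative. The key,
    scatterlists and iv are assumed to exist.

        struct crypto_aead *tfm;
        struct aead_request *req;

        tfm = crypto_alloc_aead("authenc(hmac(sha1),12,cbc(aes),16)", 0, 0);
        if (IS_ERR(tfm))
            return PTR_ERR(tfm);

        /* one concatenated key; the template splits it internally */
        crypto_aead_setkey(tfm, key, keylen);

        req = aead_request_alloc(tfm, GFP_KERNEL);
        aead_request_set_crypt(req, src_sg, dst_sg, cryptlen, iv);

        /* authentication and encryption in a single operation */
        crypto_aead_encrypt(req);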
     
  • This patch adds support for the SEED cipher (RFC4269).

    This cipher has been used by a few VPN appliance vendors in Korea for
    several years, and it was verified by KISA, who developed the
    algorithm itself.

    Given its importance in the Korean banking industry, it would be
    great if Linux incorporated the support.

    Signed-off-by: Hye-Shik Chang
    Signed-off-by: Herbert Xu

    Hye-Shik Chang
     
  • Other options requiring specific block cipher algorithms already have
    the appropriate selects.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Herbert Xu

    Adrian Bunk
     

25 Sep, 2007

1 commit

  • Fix dma_wait_for_async_tx to not loop forever in the case where a
    dependency chain is longer than two entries (see the sketch after this
    entry). This condition will not happen with current in-kernel drivers,
    but fix it for future drivers.

    Found-by: Saeed Bishara
    Signed-off-by: Dan Williams

    Dan Williams
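
    A hedged reconstruction of the idea (field and helper names follow the
    dmaengine API of that era; a cookie of -EBUSY marks a descriptor that
    has not been submitted yet): walk to the oldest unsubmitted ancestor
    and wait on each descriptor in turn, instead of assuming the chain is
    at most two entries deep.

        do {
            iter = tx;
            /* find the oldest unsubmitted ancestor in the chain */
            while (iter->cookie == -EBUSY && iter->parent &&
                   iter->parent->cookie == -EBUSY)
                iter = iter->parent;

            status = dma_sync_wait(iter->chan, iter->cookie);
        } while (status == DMA_IN_PROGRESS || iter != tx);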
     

09 Sep, 2007

1 commit

  • The function blkcipher_get_spot tries to return a buffer of
    the specified length that does not straddle a page. It has
    an off-by-one bug so it may advance a page unnecessarily.

    What's worse, one of its callers doesn't provide a buffer
    that's sufficiently long for this operation.

    This patch fixes both problems (the corrected helper is sketched after
    this entry). Thanks to Bob Gilligan for diagnosing this problem and
    providing a fix.

    Signed-off-by: Herbert Xu

    Herbert Xu
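
    The corrected helper is small enough to sketch in full (a close
    reconstruction of the fix, not a verbatim quote): compute the start of
    the page containing the last byte, and only advance when the buffer
    would otherwise straddle the boundary.

        static inline u8 *blkcipher_get_spot(u8 *start, unsigned int len)
        {
            u8 *end_page = (u8 *)(((unsigned long)(start + len - 1)) &
                                  PAGE_MASK);

            /* if [start, start + len) fits in one page, keep start */
            return max(start, end_page);
        }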
     

20 Jul, 2007

1 commit

  • Andrew Morton:
    [async_memcpy] is very wrong if both ASYNC_TX_KMAP_DST and
    ASYNC_TX_KMAP_SRC can ever be set. We'll end up using the same kmap
    slot for both src and dest and we get either corrupted data or a BUG.

    Evgeniy Polyakov:
    Btw, shouldn't it always be kmap_atomic() even if the flag is not
    set? Those pages are usual ones returned by alloc_page().

    So fix the usage of kmap_atomic and kill the ASYNC_TX_KMAP_DST and
    ASYNC_TX_KMAP_SRC flags.

    Cc: Andrew Morton
    Cc: Evgeniy Polyakov
    Signed-off-by: Dan Williams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     

13 Jul, 2007

2 commits

  • The async_tx api provides methods for describing a chain of asynchronous
    bulk memory transfers/transforms with support for inter-transactional
    dependencies. It is implemented as a dmaengine client that smooths over
    the details of different hardware offload engine implementations. Code
    that is written to the api can optimize for asynchronous operation and the
    api will fit the chain of operations to the available offload resources.

    I imagine that any piece of ADMA hardware would register with the
    'async_*' subsystem, and a call to async_X would be routed as
    appropriate, or be run in-line. - Neil Brown

    async_tx exploits the capabilities of struct dma_async_tx_descriptor to
    provide an api of the following general format:

    struct dma_async_tx_descriptor *
    async_<operation>(..., struct dma_async_tx_descriptor *depend_tx,
                      dma_async_tx_callback cb_fn, void *cb_param)
    {
        struct dma_chan *chan = async_tx_find_channel(depend_tx, <operation>);
        struct dma_device *device = chan ? chan->device : NULL;
        int int_en = cb_fn ? 1 : 0;
        struct dma_async_tx_descriptor *tx = device ?
            device->device_prep_dma_<operation>(chan, len, int_en) : NULL;

        if (tx) { /* run <operation> asynchronously */
            ...
            tx->tx_set_dest(addr, tx, index);
            ...
            tx->tx_set_src(addr, tx, index);
            ...
            async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
        } else { /* run <operation> synchronously */
            ...
            <operation>
            ...
            async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
        }

        return tx;
    }

    async_tx_find_channel() returns a capable channel from its pool. The
    channel pool is organized as a per-cpu array of channel pointers. The
    async_tx_rebalance() routine is tasked with managing these arrays. In the
    uniprocessor case async_tx_rebalance() tries to spread responsibility
    evenly over channels of similar capabilities. For example if there are two
    copy+xor channels, one will handle copy operations and the other will
    handle xor. In the SMP case async_tx_rebalance() attempts to spread the
    operations evenly over the cpus, e.g. cpu0 gets copy channel0 and xor
    channel0 while cpu1 gets copy channel1 and xor channel1. When a
    dependency is specified async_tx_find_channel defaults to keeping the
    operation on the same channel. An xor->copy->xor chain will stay on one
    channel if it supports both operation types, otherwise the transaction
    will transition between a copy and an xor resource.

    Currently the raid5 implementation in the MD raid456 driver has been
    converted to the async_tx api. A driver for the offload engines on the
    Intel Xscale series of I/O processors, iop-adma, is provided in a later
    commit. With the iop-adma driver and async_tx, raid456 is able to offload
    copy, xor, and xor-zero-sum operations to hardware engines.

    On iop342 tiobench showed higher throughput for sequential writes (20 - 30%
    improvement) and sequential reads to a degraded array (40 - 55%
    improvement). For the other cases performance was roughly equal, +/- a few
    percentage points. On an x86-smp platform the performance of the async_tx
    implementation (in synchronous mode) was also +/- a few percentage points
    of the original implementation. According to 'top' on iop342 CPU
    utilization drops from ~50% to ~15% during a 'resync' while the speed
    according to /proc/mdstat doubles from ~25 MB/s to ~50 MB/s.

    The tiobench command line used for testing was: tiobench --size 2048
    --block 4096 --block 131072 --dir /mnt/raid --numruns 5
    * iop342 had 1GB of memory available

    Details:
    * if CONFIG_DMA_ENGINE=n the asynchronous path is compiled away by making
    async_tx_find_channel a static inline routine that always returns NULL
    * when a callback is specified for a given transaction an interrupt will
    fire at operation completion time and the callback will occur in a
    tasklet. If the channel does not support interrupts then a live
    polling wait will be performed
    * the api is written as a dmaengine client that requests all available
    channels
    * In support of dependencies the api implicitly schedules channel-switch
    interrupts. The interrupt triggers the cleanup tasklet which causes
    pending operations to be scheduled on the next channel
    * Xor engines treat an xor destination address differently than a software
    xor routine. To the software routine the destination address is an implied
    source, whereas engines treat it as a write-only destination. This patch
    modifies the xor_blocks routine to take an explicit destination address
    to mirror the hardware.

    Changelog:
    * fixed a leftover debug print
    * don't allow callbacks in async_interrupt_cond
    * fixed xor_block changes
    * fixed usage of ASYNC_TX_XOR_DROP_DEST
    * drop dma mapping methods, suggested by Chris Leech
    * printk warning fixups from Andrew Morton
    * don't use inline in C files, Adrian Bunk
    * select the API when MD is enabled
    * BUG_ON xor source counts
    Signed-off-by: Dan Williams
    Acked-By: NeilBrown

    Dan Williams
     
  • The async_tx api tries to use a dma engine for an operation, but will fall
    back to an optimized software routine otherwise. Xor support is
    implemented using the raid5 xor routines. For organizational purposes this
    routine is moved to a common area (its new shape is sketched after this
    entry).

    The following fixes are also made:
    * rename xor_block => xor_blocks, suggested by Adrian Bunk
    * ensure that xor.o initializes before md.o in the built-in case
    * checkpatch.pl fixes
    * mark calibrate_xor_blocks __init, Adrian Bunk

    Cc: Adrian Bunk
    Cc: NeilBrown
    Cc: Herbert Xu
    Signed-off-by: Dan Williams

    Dan Williams
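
    The reshaped routine's signature, as reconstructed here (the
    destination is now an explicit parameter rather than an implied first
    source):

        void xor_blocks(unsigned int src_count, unsigned int bytes,
                        void *dest, void **srcs);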
     

19 May, 2007

1 commit

  • The function crypto_mod_put first frees the algorithm and then drops
    the reference to its module. Unfortunately we read the module pointer
    after freeing the algorithm, and that pointer sits inside the
    object that we just freed.

    So this patch reads the module pointer out before we free the object
    (see the sketch after this entry).

    Thanks to Luca Tettamanti for reporting this.

    Signed-off-by: Herbert Xu

    Herbert Xu
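
    A sketch of the fixed function (close to the actual change, with
    comments added):

        void crypto_mod_put(struct crypto_alg *alg)
        {
            struct module *module = alg->cra_module;  /* read before free */

            crypto_alg_put(alg);    /* may free *alg */
            module_put(module);     /* uses the saved pointer, not alg */
        }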
     
