06 Aug, 2007

1 commit


20 Jul, 2007

1 commit

  • Andrew Morton:
    [async_memcpy] is very wrong if both ASYNC_TX_KMAP_DST and
    ASYNC_TX_KMAP_SRC can ever be set. We'll end up using the same kmap
    slot for both src add dest and we get either corrupted data or a BUG.

    Evgeniy Polyakov:
    Btw, shouldn't it always be kmap_atomic() even if flag is not set.
    That pages are usual one returned by alloc_page().

    So fix the usage of kmap_atomic and kill the ASYNC_TX_KMAP_DST and
    ASYNC_TX_KMAP_SRC flags.

    Cc: Andrew Morton
    Cc: Evgeniy Polyakov
    Signed-off-by: Dan Williams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     

17 Jul, 2007

1 commit


15 Jul, 2007

1 commit


13 Jul, 2007

2 commits

  • The async_tx api provides methods for describing a chain of asynchronous
    bulk memory transfers/transforms with support for inter-transactional
    dependencies. It is implemented as a dmaengine client that smooths over
    the details of different hardware offload engine implementations. Code
    that is written to the api can optimize for asynchronous operation and the
    api will fit the chain of operations to the available offload resources.

    I imagine that any piece of ADMA hardware would register with the
    'async_*' subsystem, and a call to async_X would be routed as
    appropriate, or be run in-line. - Neil Brown

    async_tx exploits the capabilities of struct dma_async_tx_descriptor to
    provide an api of the following general format:

    struct dma_async_tx_descriptor *
    async_(..., struct dma_async_tx_descriptor *depend_tx,
    dma_async_tx_callback cb_fn, void *cb_param)
    {
    struct dma_chan *chan = async_tx_find_channel(depend_tx, );
    struct dma_device *device = chan ? chan->device : NULL;
    int int_en = cb_fn ? 1 : 0;
    struct dma_async_tx_descriptor *tx = device ?
    device->device_prep_dma_(chan, len, int_en) : NULL;

    if (tx) { /* run asynchronously */
    ...
    tx->tx_set_dest(addr, tx, index);
    ...
    tx->tx_set_src(addr, tx, index);
    ...
    async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
    } else { /* run synchronously */
    ...

    ...
    async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
    }

    return tx;
    }

    async_tx_find_channel() returns a capable channel from its pool. The
    channel pool is organized as a per-cpu array of channel pointers. The
    async_tx_rebalance() routine is tasked with managing these arrays. In the
    uniprocessor case async_tx_rebalance() tries to spread responsibility
    evenly over channels of similar capabilities. For example if there are two
    copy+xor channels, one will handle copy operations and the other will
    handle xor. In the SMP case async_tx_rebalance() attempts to spread the
    operations evenly over the cpus, e.g. cpu0 gets copy channel0 and xor
    channel0 while cpu1 gets copy channel 1 and xor channel 1. When a
    dependency is specified async_tx_find_channel defaults to keeping the
    operation on the same channel. A xor->copy->xor chain will stay on one
    channel if it supports both operation types, otherwise the transaction will
    transition between a copy and a xor resource.

    Currently the raid5 implementation in the MD raid456 driver has been
    converted to the async_tx api. A driver for the offload engines on the
    Intel Xscale series of I/O processors, iop-adma, is provided in a later
    commit. With the iop-adma driver and async_tx, raid456 is able to offload
    copy, xor, and xor-zero-sum operations to hardware engines.

    On iop342 tiobench showed higher throughput for sequential writes (20 - 30%
    improvement) and sequential reads to a degraded array (40 - 55%
    improvement). For the other cases performance was roughly equal, +/- a few
    percentage points. On a x86-smp platform the performance of the async_tx
    implementation (in synchronous mode) was also +/- a few percentage points
    of the original implementation. According to 'top' on iop342 CPU
    utilization drops from ~50% to ~15% during a 'resync' while the speed
    according to /proc/mdstat doubles from ~25 MB/s to ~50 MB/s.

    The tiobench command line used for testing was: tiobench --size 2048
    --block 4096 --block 131072 --dir /mnt/raid --numruns 5
    * iop342 had 1GB of memory available

    Details:
    * if CONFIG_DMA_ENGINE=n the asynchronous path is compiled away by making
    async_tx_find_channel a static inline routine that always returns NULL
    * when a callback is specified for a given transaction an interrupt will
    fire at operation completion time and the callback will occur in a
    tasklet. if the the channel does not support interrupts then a live
    polling wait will be performed
    * the api is written as a dmaengine client that requests all available
    channels
    * In support of dependencies the api implicitly schedules channel-switch
    interrupts. The interrupt triggers the cleanup tasklet which causes
    pending operations to be scheduled on the next channel
    * Xor engines treat an xor destination address differently than a software
    xor routine. To the software routine the destination address is an implied
    source, whereas engines treat it as a write-only destination. This patch
    modifies the xor_blocks routine to take a an explicit destination address
    to mirror the hardware.

    Changelog:
    * fixed a leftover debug print
    * don't allow callbacks in async_interrupt_cond
    * fixed xor_block changes
    * fixed usage of ASYNC_TX_XOR_DROP_DEST
    * drop dma mapping methods, suggested by Chris Leech
    * printk warning fixups from Andrew Morton
    * don't use inline in C files, Adrian Bunk
    * select the API when MD is enabled
    * BUG_ON xor source counts
    Signed-off-by: Dan Williams
    Acked-By: NeilBrown

    Dan Williams
     
  • The async_tx api tries to use a dma engine for an operation, but will fall
    back to an optimized software routine otherwise. Xor support is
    implemented using the raid5 xor routines. For organizational purposes this
    routine is moved to a common area.

    The following fixes are also made:
    * rename xor_block => xor_blocks, suggested by Adrian Bunk
    * ensure that xor.o initializes before md.o in the built-in case
    * checkpatch.pl fixes
    * mark calibrate_xor_blocks __init, Adrian Bunk

    Cc: Adrian Bunk
    Cc: NeilBrown
    Cc: Herbert Xu
    Signed-off-by: Dan Williams

    Dan Williams
     

11 Jul, 2007

4 commits


31 May, 2007

1 commit


19 May, 2007

1 commit

  • The function crypto_mod_put first frees the algorithm and then drops
    the reference to its module. Unfortunately we read the module pointer
    which after freeing the algorithm and that pointer sits inside the
    object that we just freed.

    So this patch reads the module pointer out before we free the object.

    Thanks to Luca Tettamanti for reporting this.

    Signed-off-by: Herbert Xu

    Herbert Xu
     

18 May, 2007

1 commit


09 May, 2007

2 commits


02 May, 2007

8 commits

  • This patch adds the cryptd module which is a template that takes a
    synchronous software crypto algorithm and converts it to an asynchronous
    one by executing it in a kernel thread.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • As it is whenever a new algorithm with the same name is registered
    users of the old algorithm will be removed so that they can take
    advantage of the new algorithm. This presents a problem when the
    new algorithm is not equivalent to the old algorithm. In particular,
    the new algorithm might only function on top of the existing one.

    Hence we should not remove users unless they can make use of the
    new algorithm.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch allows the use of nested templates by allowing the use of
    brackets inside a template parameter.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch adds the mid-level interface for asynchronous block ciphers.
    It also includes a generic queueing mechanism that can be used by other
    asynchronous crypto operations in future.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch passes the type/mask along when constructing instances of
    templates. This is in preparation for templates that may support
    multiple types of instances depending on what is requested. For example,
    the planned software async crypto driver will use this construct.

    For the moment this allows us to check whether the instance constructed
    is of the correct type and avoid returning success if the type does not
    match.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch converts the tcrypt module to use the asynchronous block cipher
    interface. As all synchronous block ciphers can be used through the async
    interface, tcrypt is still able to test them.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch adds the frontend interface for asynchronous block ciphers.
    In addition to the usual block cipher parameters, there is a callback
    function pointer and a data pointer. The callback will be invoked only
    if the encrypt/decrypt handlers return -EINPROGRESS. In other words,
    if the return value of zero the completion handler (or the equivalent
    code) needs to be invoked by the caller.

    The request structure is allocated and freed by the caller. Its size
    is determined by calling crypto_ablkcipher_reqsize(). The helpers
    ablkcipher_request_alloc/ablkcipher_request_free can be used to manage
    the memory for a request.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • The proc functions were incorrectly marked as used rather than unused.
    They may be unused if proc is disabled.

    Signed-off-by: Herbert Xu

    Herbert Xu
     

28 Apr, 2007

1 commit

  • After 13 years of use, it looks like my email address is finally going
    to disappear. While this is likely to drop the amount of incoming spam
    greatly ;-), it may also affect more appropriate messages, so let's
    update my email address in various places. In addition, Host AP mailing
    list is subscribers-only and linux-wireless can also be used for
    discussing issues related to this driver which is now shown in
    MAINTAINERS.

    Signed-off-by: Jouni Malinen
    Signed-off-by: John W. Linville

    Jouni Malinen
     

31 Mar, 2007

2 commits


21 Mar, 2007

2 commits

  • This patch fixes loading the tcrypt module while deflate isn't available
    at all (isn't build).

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • In the loop in scatterwalk_copychunks(), if walk->offset is zero,
    then scatterwalk_pagedone rounds that up to the nearest page boundary:

    walk->offset += PAGE_SIZE - 1;
    walk->offset &= PAGE_MASK;

    which is a no-op in this case, so we don't advance to the next element
    of the scatterlist array:

    if (walk->offset >= walk->sg->offset + walk->sg->length)
    scatterwalk_start(walk, sg_next(walk->sg));

    and we end up copying the same data twice.

    It appears that other callers of scatterwalk_{page}done first advance
    walk->offset, so I believe that's the correct thing to do here.

    This caused a bug in NFS when run with krb5p security, which would
    cause some writes to fail with permissions errors--for example, writes
    of less than 8 bytes (the des blocksize) at the start of a file.

    A git-bisect shows the bug was originally introduced by
    5c64097aa0f6dc4f27718ef47ca9a12538d62860, first in 2.6.19-rc1.

    Signed-off-by: "J. Bruce Fields"
    Signed-off-by: Herbert Xu

    J. Bruce Fields
     

13 Feb, 2007

1 commit

  • Many struct file_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

09 Feb, 2007

1 commit


07 Feb, 2007

10 commits