31 Mar, 2011

1 commit


15 Jan, 2011

1 commit


18 May, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
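
The include-selection rule described in the commit above (gfp.h if only gfp is used, slab.h if slab is used) can be sketched roughly as follows. This is not the actual slabh-sweep.py: the regular expressions, the function names `needed_include` and `fix_includes`, and the list of slab entry points are illustrative assumptions, and the include-ordering logic is left out entirely.

```python
import re

# Hypothetical usage patterns; the real script's heuristics are not shown in
# the commit message, so these regexes are illustrative assumptions only.
SLAB_RE = re.compile(r"\b(?:k[mzc]alloc|kfree|kmem_cache_\w+)\s*\(")
GFP_RE = re.compile(r"\b(?:GFP_[A-Z_]+|__GFP_[A-Z_]+|gfp_t)\b")
INCLUDE_RE = re.compile(r"^\s*#\s*include\s*<linux/(slab|gfp)\.h>\s*$", re.M)


def needed_include(source):
    """Return 'linux/slab.h', 'linux/gfp.h', or None for a C source string."""
    body = INCLUDE_RE.sub("", source)  # ignore the include lines themselves
    if SLAB_RE.search(body):
        return "linux/slab.h"  # slab.h includes gfp.h, so it covers both
    if GFP_RE.search(body):
        return "linux/gfp.h"
    return None


def fix_includes(source):
    """Return (includes to add, includes to drop) for the two headers."""
    present = {"linux/%s.h" % name for name in INCLUDE_RE.findall(source)}
    want = needed_include(source)
    wanted = {want} if want else set()
    return sorted(wanted - present), sorted(present - wanted)
```

A file that calls `kmalloc()` without including either header would be reported as needing `linux/slab.h`; a file that includes `linux/slab.h` but only mentions `gfp_t` would be reported as needing `linux/gfp.h` in its place.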
     

27 Mar, 2010

2 commits

  • Simple conditional struct filler to cut out some duplicated code.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • Convert the device_is_tx_complete() operation on the
    DMA engine to a generic device_tx_status() operation which
    can return three states: DMA_TX_RUNNING, DMA_TX_COMPLETE,
    DMA_TX_PAUSED.

    [dan.j.williams@intel.com: update for timberdale]
    Signed-off-by: Linus Walleij
    Acked-by: Mark Brown
    Cc: Maciej Sosnowski
    Cc: Nicolas Ferre
    Cc: Pavel Machek
    Cc: Li Yang
    Cc: Guennadi Liakhovetski
    Cc: Paul Mundt
    Cc: Ralf Baechle
    Cc: Haavard Skinnemoen
    Cc: Magnus Damm
    Cc: Liam Girdwood
    Cc: Joe Perches
    Cc: Roland Dreier
    Signed-off-by: Dan Williams

    Linus Walleij
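
As a rough illustration of the three-state query described in the commit above, the following toy model returns one of the states named in the message. Only the three state names come from the commit; the `ToyChannel` class, its fields, and the simple cookie comparison (which ignores cookie wraparound) are assumptions made for the sketch and do not reflect any real driver.

```python
from enum import Enum


class TxStatus(Enum):
    # State names taken from the commit message.
    DMA_TX_RUNNING = "running"
    DMA_TX_COMPLETE = "complete"
    DMA_TX_PAUSED = "paused"


class ToyChannel:
    """Hypothetical channel: cookies are issued in increasing order."""

    def __init__(self):
        self.last_complete = 0  # highest cookie the hardware has finished
        self.paused = False     # set when the channel has been paused

    def tx_status(self, cookie):
        # Completed work stays complete even if the channel is paused later.
        if cookie <= self.last_complete:
            return TxStatus.DMA_TX_COMPLETE
        # Outstanding work is either paused or still running.
        if self.paused:
            return TxStatus.DMA_TX_PAUSED
        return TxStatus.DMA_TX_RUNNING
```

The point of the sketch is that a single status call lets a client distinguish a paused transfer from one that is still making progress.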
     

12 Dec, 2009

1 commit


09 Sep, 2009

4 commits


30 Aug, 2009

5 commits

  • Even though the intent is to extend dmatest with P+Q tests there is
    still value in having an always-on sanity check to prevent an
    unintentionally broken driver from registering.

    This depends on raid6_pq.ko for verification, the side effect being that
    PQ capable channels will fail to register when raid6 is disabled.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • iop33x support is not included because that engine is a bit more awkward
    to handle in that it can either be in xor mode or pq mode. The
    dmaengine/async_tx layers currently only comprehend static capabilities.

    Note iop13xx does not support hardware PQ continuation so the driver
    must handle the DMA_PREP_CONTINUE flag for operations across > 16
    sources. From the comment for dma_maxpq:

    /* When an engine does not support native continuation we need 3 extra
     * source slots to reuse P and Q with the following coefficients:
     * 1/ {00} * P : remove P from Q', but use it as a source for P'
     * 2/ {01} * Q : use Q to continue Q' calculation
     * 3/ {00} * Q : subtract Q from P' to cancel (2)
     */

    Signed-off-by: Dan Williams

    Dan Williams
     
  • lockdep correctly identifies a potential recursive locking case for
    iop_chan->lock, but in the dependency submission case we expect that the same
    class will be acquired for both the parent dependency and the child channel.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • Replace 'desc->async_tx.' with 'tx->'

    [ Impact: pure cleanup ]

    Signed-off-by: Dan Williams

    Dan Williams
     
  • [ Based on an original patch by Yuri Tikhonov ]

    This adds support for doing asynchronous GF multiplication by adding
    two additional functions to the async_tx API:

    async_gen_syndrome() does simultaneous XOR and Galois field
    multiplication of sources.

    async_syndrome_val() validates the given source buffers against known P
    and Q values.

    When a request is made to run async_pq against more than the hardware
    maximum number of supported sources we need to reuse the previously
    generated P and Q values as sources into the next operation. Care must
    be taken to remove Q from P' and P from Q'. For example, to perform a 5
    source pq op with hardware that only supports 4 sources at a time the
    following approach is taken:

    p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
    p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))

    p' = p + q + q + src4 = p + src4
    q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4

    Note: 4 is the minimum acceptable maxpq; otherwise we punt to the
    synchronous software path.

    The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
    sources (in the above manner) and fill the remaining slots up to maxpq
    with the new sources/coefficients.

    Note1: Some devices have native support for P+Q continuation and can skip
    this extra work. Devices with this capability can advertise it with
    dma_set_maxpq. It is up to each driver how to handle the
    DMA_PREP_CONTINUE flag.

    Note2: The API supports disabling the generation of P when generating Q;
    this is ignored by the synchronous path but is implemented by some dma
    devices to save unnecessary writes. In this case the continuation
    algorithm is simplified to only reuse Q as a source.

    Cc: H. Peter Anvin
    Cc: David Woodhouse
    Signed-off-by: Yuri Tikhonov
    Signed-off-by: Ilya Yanok
    Reviewed-by: Andre Noll
    Acked-by: Maciej Sosnowski
    Signed-off-by: Dan Williams

    Dan Williams
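
The 5-source continuation example in the commit above can be checked numerically. The sketch below assumes the usual RAID-6 Galois field, GF(2^8) with reduction polynomial 0x11d, which the commit message does not state explicitly; `gf_mul` and `pq` are hypothetical helper names, not kernel functions.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two bytes in GF(2^8); the 0x11d polynomial is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly  # reduce back into 8 bits
        b >>= 1
    return result


def pq(sources, coefs):
    """Return (P, Q): P is the XOR of the sources, Q the coefficient-weighted XOR."""
    length = len(sources[0])
    p = bytearray(length)
    q = bytearray(length)
    for src, coef in zip(sources, coefs):
        for i, byte in enumerate(src):
            p[i] ^= byte
            q[i] ^= gf_mul(coef, byte)
    return bytes(p), bytes(q)


def pq_continued(sources):
    """5-source P+Q using two 4-slot operations, as in the commit message."""
    p, q = pq(sources[:4], [0x01, 0x02, 0x04, 0x08])
    # Slots: {00}*P, {01}*Q, {00}*Q, then the new source with {10}.
    # P' = p ^ q ^ q ^ src4 = p ^ src4; Q' = q ^ {10}*src4.
    return pq([p, q, q, sources[4]], [0x00, 0x01, 0x00, 0x10])
```

For any five equal-length buffers, `pq_continued(srcs)` should match a single `pq(srcs, [0x01, 0x02, 0x04, 0x08, 0x10])` call, because the second Q slot cancels Q out of P' and the zero coefficient keeps P out of Q'.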
     

09 Apr, 2009

1 commit

  • 'zero_sum' does not properly describe the operation of generating parity
    and checking that it validates against an existing buffer. Change the
    name of the operation to 'val' (for 'validate'). This is in
    anticipation of the p+q case where it is a requirement to identify the
    target parity buffers separately from the source buffers, because the
    target parity buffers will not have corresponding pq coefficients.

    Reviewed-by: Andre Noll
    Acked-by: Maciej Sosnowski
    Signed-off-by: Dan Williams

    Dan Williams
     

26 Mar, 2009

1 commit


09 Mar, 2009

1 commit

  • * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
      dmatest: fix use after free in dmatest_exit
      ipu_idmac: fix spinlock type
      iop-adma, mv_xor: fix mem leak on self-test setup failure
      fsldma: fix off by one in dma_halt
      I/OAT: fail self-test if callback test reaches timeout
      I/OAT: update driver version and copyright dates
      I/OAT: list usage cleanup
      I/OAT: set tcp_dma_copybreak to 256k for I/OAT ver.3
      I/OAT: cancel watchdog before dma remove
      I/OAT: fail initialization on zero channels detection
      I/OAT: do not set DCACTRL_CMPL_WRITE_ENABLE for I/OAT ver.3
      I/OAT: add verification for proper APICID_TAG_MAP setting by BIOS
      dmaengine: update kerneldoc

    Linus Torvalds
     

05 Mar, 2009

1 commit


04 Mar, 2009

1 commit

  • `iop_adma_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o
    `mv_xor_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o
    `mv64xxx_i2c_unmap_regs' referenced in section `.devinit.text' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o
    `mv64xxx_i2c_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o
    `orion_nand_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o
    `pxafb_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o

    Acked-by: Uwe Kleine-König
    Signed-off-by: Russell King

    Russell King
     

07 Jan, 2009

5 commits


06 Jan, 2009

1 commit

  • async_tx.ko is a consumer of dma channels. A circular dependency arises
    if modules in drivers/dma rely on common code in async_tx.ko. It
    prevents either module from being unloaded.

    Move dma_wait_for_async_tx and async_tx_run_dependencies to dmaengine.o
    where they should have been from the beginning.

    Reviewed-by: Andrew Morton
    Signed-off-by: Dan Williams

    Dan Williams
     

09 Dec, 2008

1 commit

  • Mapping the destination multiple times is a misuse of the dma-api.
    Since the destination may be reused as a source, ensure that it is only
    mapped once and that it is mapped bidirectionally. This appears to add
    ugliness on the unmap side in that it always reads back the destination
    address from the descriptor, but gcc can determine that dma_unmap is a
    nop and not emit the code that calculates its arguments.

    Cc:
    Cc: Saeed Bishara
    Acked-by: Yuri Tikhonov
    Signed-off-by: Dan Williams

    Dan Williams
     

12 Nov, 2008

2 commits


07 Aug, 2008

1 commit


18 Jul, 2008

2 commits


09 Jul, 2008

3 commits

  • In some cases client code may need the dma-driver to skip the unmap of source
    and/or destination buffers. Setting these flags indicates to the driver to
    skip the unmap step. In this regard async_xor is currently broken in that it
    allows the destination buffer to be unmapped while an operation is still in
    progress, i.e. when the number of sources exceeds the hardware channel's
    maximum (fixed in a subsequent patch).

    Acked-by: Saeed Bishara
    Acked-by: Maciej Sosnowski
    Acked-by: Haavard Skinnemoen
    Signed-off-by: Dan Williams

    Dan Williams
     
  • A DMA controller capable of doing slave transfers may need to know a
    few things about the slave when preparing the channel. We don't want
    to add this information to struct dma_channel since the channel hasn't
    yet been bound to a client at this point.

    Instead, pass a reference to the client requesting the channel to the
    driver's device_alloc_chan_resources hook so that it can pick the
    necessary information from the dma_client struct by itself.

    [dan.j.williams@intel.com: fixed up fsldma and mv_xor]
    Acked-by: Maciej Sosnowski
    Signed-off-by: Haavard Skinnemoen
    Signed-off-by: Dan Williams

    Haavard Skinnemoen
     
  • Since 43cc71eed1250755986da4c0f9898f9a635cb3bf, the platform
    modalias is prefixed with "platform:". Add MODULE_ALIAS() to most
    of the hotpluggable platform drivers, to re-enable auto loading.

    Cc:
    Signed-off-by: Kay Sievers
    Signed-off-by: David Brownell
    Signed-off-by: Andrew Morton
    Signed-off-by: Dan Williams

    Kay Sievers
     

21 May, 2008

1 commit

  • 1) Remove an explicit memset(.., 0, ...) to a variable allocated with
    kzalloc (i.e. 'dest').

    2) Allocate 'src' with kmalloc instead of kzalloc as all elements of the
    'src' buffer are initialized in a 'for(...)' loop just after.

    3) Remove useless 'sizeof(u8)', which always returns 1, when computing the
    size of the memory to be allocated.

    Signed-off-by: Christophe Jaillet
    Signed-off-by: Dan Williams

    Christophe Jaillet
     

18 Apr, 2008

3 commits

  • 'ack' is currently a simple integer that flags whether or not a client is done
    touching fields in the given descriptor. It is effectively just a single bit
    of information. Converting this to a flags parameter allows the other bits to
    be put to use to control completion actions, like dma-unmap, and capture
    results, like xor-zero-sum == 0.

    Changes are one of:
    1/ convert all open-coded ->ack manipulations to use async_tx_ack
    and async_tx_test_ack.
    2/ set the ack bit at prep time where possible
    3/ make drivers store the flags at prep time
    4/ add flags to the device_prep_dma_interrupt prototype

    Acked-by: Maciej Sosnowski
    Signed-off-by: Dan Williams

    Dan Williams
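
The idea of widening 'ack' from a lone integer to one bit in a flags word can be modelled as below. The helper names `async_tx_ack` and `async_tx_test_ack` come from the commit message; the `CtrlFlags` values, the bit positions, the extra flag names, and the `Descriptor` class are hypothetical and only serve the illustration.

```python
from enum import IntFlag


class CtrlFlags(IntFlag):
    # Bit positions and the non-ack flag names are hypothetical.
    ACK = 1 << 0              # client is done touching the descriptor
    SKIP_DEST_UNMAP = 1 << 1  # example of a completion-action bit
    SKIP_SRC_UNMAP = 1 << 2   # example of a completion-action bit


class Descriptor:
    """Toy descriptor: the driver stores the flags it was given at prep time."""

    def __init__(self, flags=CtrlFlags(0)):
        self.flags = flags


def async_tx_ack(tx):
    # Set only the ack bit; other control bits are left untouched.
    tx.flags |= CtrlFlags.ACK


def async_tx_test_ack(tx):
    # True once the client has acknowledged the descriptor.
    return bool(tx.flags & CtrlFlags.ACK)
```

Routing every access through the two helpers, rather than open-coding `->ack`, is what lets the remaining bits carry unrelated control information without disturbing the acknowledge state.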
     
  • This workaround was covering the dependency submission bug in async_tx.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • DMA drivers no longer need to be notified of dependency submission
    events as async_tx_run_dependencies and async_tx_channel_switch will
    handle the scheduling and execution of dependent operations.

    [sfr@canb.auug.org.au: extend this for fsldma]
    Acked-by: Shannon Nelson
    Signed-off-by: Dan Williams

    Dan Williams