05 Oct, 2016

1 commit

  • Add missing dmaengine_unmap_put(), so we don't OOM during RAID6 sync.

    Fixes: 1786b943dad0 ("async_pq_val: convert to dmaengine_unmap_data")
    Signed-off-by: Justin Maggard
    Reviewed-by: Dan Williams
    Cc:
    Signed-off-by: Vinod Koul

    Justin Maggard
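
A minimal userspace model of the leak fixed above, assuming the unmap data is reference-counted and every get must be balanced by a put. `UnmapPool` and `validate_stripes` are hypothetical names that only loosely mirror the kernel calls; this is a sketch, not the kernel code.

```python
class UnmapPool:
    """Toy stand-in for a pool of reference-counted unmap objects."""

    def __init__(self):
        self.live = 0  # objects allocated and not yet freed

    def get(self):
        # Models dmaengine_get_unmap_data(): allocate with one reference held.
        self.live += 1
        return {"refs": 1}

    def put(self, unmap):
        # Models dmaengine_unmap_put(): drop a reference, free at zero.
        unmap["refs"] -= 1
        if unmap["refs"] == 0:
            self.live -= 1


def validate_stripes(pool, stripes, balanced):
    """Run one validation per stripe; optionally forget the final put."""
    for _ in range(stripes):
        unmap = pool.get()
        # ... the P/Q check would run here ...
        if balanced:
            pool.put(unmap)
    return pool.live


leaky = validate_stripes(UnmapPool(), 1000, balanced=False)
fixed = validate_stripes(UnmapPool(), 1000, balanced=True)
print(leaky, fixed)  # 1000 0
```

Without the put, one object stays allocated per validated stripe, which is how a long RAID6 sync can exhaust memory.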
     

18 Mar, 2016

1 commit

  • CMA allocation should be guaranteed to succeed by definition, but,
    unfortunately, it sometimes fails. The problem is hard to track down
    because it is related to page reference manipulation and we don't
    have any facility to analyze it.

    This patch adds tracepoints to track page reference manipulation.
    With them, we can find the exact reason for a failure and fix the
    problem. Following is an example of tracepoint output. (Note: this
    example is from a stale version that prints flags as a number. The
    recent version prints them as a human-readable string.)

    -9018 [004] 92.678375: page_ref_set: pfn=0x17ac9 flags=0x0 count=1 mapcount=0 mapping=(nil) mt=4 val=1
    -9018 [004] 92.678378: kernel_stack:
    => get_page_from_freelist (ffffffff81176659)
    => __alloc_pages_nodemask (ffffffff81176d22)
    => alloc_pages_vma (ffffffff811bf675)
    => handle_mm_fault (ffffffff8119e693)
    => __do_page_fault (ffffffff810631ea)
    => trace_do_page_fault (ffffffff81063543)
    => do_async_page_fault (ffffffff8105c40a)
    => async_page_fault (ffffffff817581d8)
    [snip]
    -9018 [004] 92.678379: page_ref_mod: pfn=0x17ac9 flags=0x40048 count=2 mapcount=1 mapping=0xffff880015a78dc1 mt=4 val=1
    [snip]
    ...
    ...
    -9131 [001] 93.174468: test_pages_isolated: start_pfn=0x17800 end_pfn=0x17c00 fin_pfn=0x17ac9 ret=fail
    [snip]
    -9018 [004] 93.174843: page_ref_mod_and_test: pfn=0x17ac9 flags=0x40068 count=0 mapcount=0 mapping=0xffff880015a78dc1 mt=4 val=-1 ret=1
    => release_pages (ffffffff8117c9e4)
    => free_pages_and_swap_cache (ffffffff811b0697)
    => tlb_flush_mmu_free (ffffffff81199616)
    => tlb_finish_mmu (ffffffff8119a62c)
    => exit_mmap (ffffffff811a53f7)
    => mmput (ffffffff81073f47)
    => do_exit (ffffffff810794e9)
    => do_group_exit (ffffffff81079def)
    => SyS_exit_group (ffffffff81079e74)
    => entry_SYSCALL_64_fastpath (ffffffff817560b6)

    This output shows that the problem comes from the exit path. In the
    exit path, to improve performance, pages are not freed immediately.
    They are gathered and processed in batches. During this process,
    migration is not possible and CMA allocation fails. This problem is
    hard to find without this page reference tracepoint facility.

    Enabling this feature bloats the kernel text by 30 KB in my configuration.

        text    data     bss      dec    hex filename
    12127327 2243616 1507328 15878271 f2487f vmlinux_disabled
    12157208 2258880 1507328 15923416 f2f8d8 vmlinux_enabled

    Note that, due to a header file dependency problem between mm.h and
    tracepoint.h, this feature has to open code the static key functions for
    tracepoints. This was proposed by Steven Rostedt in the following link.

    https://lkml.org/lkml/2015/12/9/699

    [arnd@arndb.de: crypto/async_pq: use __free_page() instead of put_page()]
    [iamjoonsoo.kim@lge.com: fix build failure for xtensa]
    [akpm@linux-foundation.org: tweak Kconfig text, per Vlastimil]
    Signed-off-by: Joonsoo Kim
    Acked-by: Michal Nazarewicz
    Acked-by: Vlastimil Babka
    Cc: Minchan Kim
    Cc: Mel Gorman
    Cc: "Kirill A. Shutemov"
    Cc: Sergey Senozhatsky
    Acked-by: Steven Rostedt
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
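
One way to read such a trace is to pull out the reference-count history of a single page frame. The sketch below parses two event lines copied from the commit message above; the function names and the regular expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Two event lines copied from the trace in the commit message.
TRACE = """\
-9018 [004] 92.678379: page_ref_mod: pfn=0x17ac9 flags=0x40048 count=2 mapcount=1 mapping=0xffff880015a78dc1 mt=4 val=1
-9018 [004] 93.174843: page_ref_mod_and_test: pfn=0x17ac9 flags=0x40068 count=0 mapcount=0 mapping=0xffff880015a78dc1 mt=4 val=-1 ret=1
"""

EVENT = re.compile(r"^\S+\s+\[\d+\]\s+([\d.]+):\s+(page_ref_\w+):\s+(.*)$")


def parse_page_ref_events(text):
    """Return (timestamp, event, fields) for every page_ref_* line."""
    events = []
    for line in text.splitlines():
        match = EVENT.match(line.strip())
        if not match:
            continue  # stack frames, [snip] markers, other events
        stamp, name, rest = match.groups()
        fields = dict(item.split("=", 1) for item in rest.split())
        events.append((float(stamp), name, fields))
    return events


def history(events, pfn):
    """Reference-count history of one page frame, in trace order."""
    return [(t, name, int(f["count"])) for t, name, f in events if f["pfn"] == pfn]


for row in history(parse_page_ref_events(TRACE), "0x17ac9"):
    print(row)
```

In the trace above, the failing test_pages_isolated event at 93.174468 falls between these two rows, while the count for pfn 0x17ac9 has not yet dropped to zero.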
     

07 Jan, 2016

1 commit

  • These async_XX functions are called from md/raid5 in an atomic
    section, between get_cpu() and put_cpu(), so they must not sleep.
    So use GFP_NOWAIT rather than GFP_NOIO.

    Dan Williams writes: Longer term async_tx needs to be merged into md
    directly as we can allocate this unmap data statically per-stripe
    rather than per request.

    Fixed: 7476bd79fc01 ("async_pq: convert to dmaengine_unmap_data")
    Cc: stable@vger.kernel.org (v3.13+)
    Reported-and-tested-by: Stanislav Samsonov
    Acked-by: Dan Williams
    Signed-off-by: NeilBrown
    Signed-off-by: Vinod Koul

    NeilBrown
     

22 Apr, 2015

1 commit

  • Glue it all together. The raid6 rmw path should work the same as the
    already existing raid5 logic. So emulate the prexor handling/flags
    and split functions as needed.

    1) Enable xor_syndrome() in the async layer.

    2) Split ops_run_prexor() into RAID4/5 and RAID6 logic. Xor the syndrome
    at the start of a rmw run as we did it before for the single parity.

    3) Take care of rmw run in ops_run_reconstruct6(). Again process only
    the changed pages to get syndrome back into sync.

    4) Enhance set_syndrome_sources() to fill NULL pages if we are in a rmw
    run. The lower layers will calculate start & end pages from that and
    call the xor_syndrome() correspondingly.

    5) Adapt the several places where we ignored Q handling up to now.

    Performance numbers for a single E5630 system with a mix of ten 7200k
    desktop/server disks: 300 seconds of random writes with 8 threads onto a
    3.2 TB (10*400GB) RAID6 array, 64K chunk, without spare (group_thread_cnt=4).

    bsize  rmw_level=1  rmw_level=0  rmw_level=1  rmw_level=0
           skip_copy=1  skip_copy=1  skip_copy=0  skip_copy=0
    4K        115 KB/s     141 KB/s     165 KB/s     140 KB/s
    8K        225 KB/s     275 KB/s     324 KB/s     274 KB/s
    16K       434 KB/s     536 KB/s     640 KB/s     534 KB/s
    32K       751 KB/s   1,051 KB/s   1,234 KB/s   1,045 KB/s
    64K     1,339 KB/s   1,958 KB/s   2,282 KB/s   1,962 KB/s
    128K    2,673 KB/s   3,862 KB/s   4,113 KB/s   3,898 KB/s
    256K    7,685 KB/s   7,539 KB/s   7,557 KB/s   7,638 KB/s
    512K   19,556 KB/s  19,558 KB/s  19,652 KB/s  19,688 KB/s

    Signed-off-by: Markus Stockhausen
    Signed-off-by: NeilBrown

    Markus Stockhausen
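
The idea behind a RAID6 read-modify-write can be sketched in a few lines: because P and Q are linear in the data, only the changed block needs to be folded into the old parity. The model below works on single bytes and assumes the usual RAID6 arithmetic (GF(2^8) with reduction polynomial 0x11d and generator 2); the function names are hypothetical and this is not the md/raid5 implementation.

```python
def gf_mul2(x):
    """Multiply by the generator 2 in GF(2^8), reduction polynomial 0x11d."""
    x <<= 1
    return (x ^ 0x11D) & 0xFF if x & 0x100 else x


def gf_pow2_mul(value, exponent):
    """Return value * 2**exponent in GF(2^8)."""
    for _ in range(exponent):
        value = gf_mul2(value)
    return value


def full_syndrome(data):
    """Compute P and Q from every data byte (the reconstruct-write path)."""
    p = q = 0
    for index, byte in enumerate(data):
        p ^= byte
        q ^= gf_pow2_mul(byte, index)
    return p, q


def rmw_syndrome(p, q, index, old, new):
    """Update P and Q from the one changed byte only (the rmw path)."""
    delta = old ^ new
    return p ^ delta, q ^ gf_pow2_mul(delta, index)


data = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88]  # 8 data disks
p, q = full_syndrome(data)
new_p, new_q = rmw_syndrome(p, q, 3, data[3], 0xAB)
data[3] = 0xAB
print((new_p, new_q) == full_syndrome(data))  # True
```

The rmw update touches one data block plus P and Q, whereas the full recomputation reads every data block in the stripe.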
     

22 Aug, 2014

1 commit


16 Nov, 2013

1 commit

  • Pull dmaengine changes from Dan

    1/ Bartlomiej and Dan finalized a rework of the dma address unmap
    implementation.

    2/ In the course of testing 1/ a collection of enhancements to dmatest
    fell out. Notably basic performance statistics, and fixed / enhanced
    test control through new module parameters 'run', 'wait', 'noverify',
    and 'verbose'. Thanks to Andriy and Linus for their review.

    3/ Testing the raid related corner cases of 1/ triggered bugs in the
    recently added 16-source operation support in the ioatdma driver.

    4/ Some minor fixes / cleanups to mv_xor and ioatdma.

    Conflicts:
    drivers/dma/dmatest.c

    Signed-off-by: Vinod Koul

    Vinod Koul
     

15 Nov, 2013

8 commits

  • With 24 disks and an ioatdma instance with 16 source support there is a
    corner case where the driver needs to be careful to account for the
    number of implied sources in the continuation case.

    Also bump the default case to test more than 16 sources now that it
    triggers different paths in offload drivers.

    Cc: Dave Jiang
    Acked-by: Dave Jiang
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Remove no longer needed DMA unmap flags:
    - DMA_COMPL_SKIP_SRC_UNMAP
    - DMA_COMPL_SKIP_DEST_UNMAP
    - DMA_COMPL_SRC_UNMAP_SINGLE
    - DMA_COMPL_DEST_UNMAP_SINGLE

    Cc: Vinod Koul
    Cc: Tomasz Figa
    Cc: Dave Jiang
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Acked-by: Jon Mason
    Acked-by: Mark Brown
    [djbw: clean up straggling skip unmap flags in ntb]
    Signed-off-by: Dan Williams

    Bartlomiej Zolnierkiewicz
     
  • Use the generic unmap object to unmap dma buffers.

    Cc: Vinod Koul
    Cc: Tomasz Figa
    Cc: Dave Jiang
    Reported-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Use the generic unmap object to unmap dma buffers.

    Cc: Vinod Koul
    Cc: Tomasz Figa
    Cc: Dave Jiang
    Reported-by: Bartlomiej Zolnierkiewicz
    [bzolnier: keep temporary dma_dest array in do_async_gen_syndrome()]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Use the generic unmap object to unmap dma buffers.

    Cc: Vinod Koul
    Cc: Tomasz Figa
    Cc: Dave Jiang
    Reported-by: Bartlomiej Zolnierkiewicz
    [bzolnier: keep temporary dma_dest array in async_mult()]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Use the generic unmap object to unmap dma buffers.

    Cc: Vinod Koul
    Cc: Tomasz Figa
    Cc: Dave Jiang
    Reported-by: Bartlomiej Zolnierkiewicz
    [bzolnier: minor cleanups]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Use the generic unmap object to unmap dma buffers.

    Later we can push this unmap object up to the raid layer and get rid of
    the 'scribble' parameter.

    Cc: Vinod Koul
    Cc: Tomasz Figa
    Cc: Dave Jiang
    Reported-by: Bartlomiej Zolnierkiewicz
    [bzolnier: minor cleanups]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Use the generic unmap object to unmap dma buffers.

    Cc: Vinod Koul
    Cc: Tomasz Figa
    Cc: Dave Jiang
    Reported-by: Bartlomiej Zolnierkiewicz
    [bzolnier: add missing unmap->len initialization]
    [bzolnier: fix whitespace damage]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    [djbw: add DMA_ENGINE=n support]
    Signed-off-by: Dan Williams

    Dan Williams
     

25 Oct, 2013

1 commit


04 Jul, 2013

1 commit

  • There have never been any real users of MEMSET operations since they
    were introduced in January 2007 by commit 7405f74badf4 ("dmaengine:
    refactor dmaengine around dma_async_tx_descriptor"). Therefore remove
    support for them for now; it can always be brought back when needed.

    [sebastian.hesselbarth@gmail.com: fix drivers/dma/mv_xor]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Signed-off-by: Sebastian Hesselbarth
    Cc: Vinod Koul
    Acked-by: Dan Williams
    Cc: Tomasz Figa
    Cc: Herbert Xu
    Cc: Olof Johansson
    Cc: Kevin Hilman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bartlomiej Zolnierkiewicz
     

30 Apr, 2013

1 commit


08 Jan, 2013

4 commits


20 Mar, 2012

1 commit


01 Nov, 2011

1 commit


22 Jun, 2011

1 commit

  • Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually).

    To prevent mm.h inclusion via other channels, also extract the "enum dma_data_direction"
    definition into a separate header. This tiny piece is what glues netdevice.h to mm.h
    via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h".
    Removal of mm.h from scatterlist.h was tried and found not feasible
    on most archs, so the link was cut off earlier.

    Hope people are OK with a tiny include file.

    Note that mm_types.h is still dragged in, but that is a separate story.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
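
The effect of cutting the chain can be checked with a small reachability sketch over an include graph. The `before` graph is the chain quoted in the commit message; the `after` graph is an assumption about the result (a small header, called `dma-direction.h` here, included by dmaengine.h in place of dma-mapping.h).

```python
def reaches(includes, start, target):
    """Return True if `target` is included, directly or not, from `start`."""
    seen, stack = set(), [start]
    while stack:
        header = stack.pop()
        if header == target:
            return True
        if header not in seen:
            seen.add(header)
            stack.extend(includes.get(header, ()))
    return False


# The chain quoted in the commit message.
before = {
    "netdevice.h": ["dmaengine.h"],
    "dmaengine.h": ["dma-mapping.h"],
    "dma-mapping.h": ["scatterlist.h"],
    "scatterlist.h": ["mm.h"],
}

# Assumed shape after the change: the enum lives in a small header of its
# own and dmaengine.h includes only that.
after = dict(before)
after["dmaengine.h"] = ["dma-direction.h"]

print(reaches(before, "netdevice.h", "mm.h"), reaches(after, "netdevice.h", "mm.h"))
```

Under these assumptions the first check reports that mm.h is reachable from netdevice.h and the second reports that it is not.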
     

31 Mar, 2011

1 commit


28 Oct, 2010

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (48 commits)
    DMAENGINE: move COH901318 to arch_initcall
    dma: imx-dma: fix signedness bug
    dma/timberdale: simplify conditional
    ste_dma40: remove channel_type
    ste_dma40: remove enum for endianess
    ste_dma40: remove TIM_FOR_LINK option
    ste_dma40: move mode_opt to separate config
    ste_dma40: move channel mode to a separate field
    ste_dma40: move priority to separate field
    ste_dma40: add variable to indicate valid dma_cfg
    async_tx: make async_tx channel switching opt-in
    move async raid6 test to lib/Kconfig.debug
    dmaengine: Add Freescale i.MX1/21/27 DMA driver
    intel_mid_dma: change the slave interface
    intel_mid_dma: fix the WARN_ONs
    intel_mid_dma: Add sg list support to DMA driver
    intel_mid_dma: Allow DMAC2 to share interrupt
    intel_mid_dma: Allow IRQ sharing
    intel_mid_dma: Add runtime PM support
    DMAENGINE: define a dummy filter function for ste_dma40
    ...

    Linus Torvalds
     

27 Oct, 2010

1 commit

  • Ensure kmap_atomic() usage is strictly nested

    Signed-off-by: Peter Zijlstra
    Reviewed-by: Rik van Riel
    Acked-by: Chris Metcalf
    Cc: David Howells
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Russell King
    Cc: Ralf Baechle
    Cc: David Miller
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

08 Oct, 2010

1 commit

  • The prompt for "Self test for hardware accelerated raid6 recovery" does not
    belong in the top level configuration menu. All the options in
    crypto/async_tx/Kconfig are selected and do not depend on CRYPTO.
    Kconfig.debug seems like a reasonable fit.

    Cc: Herbert Xu
    Cc: David Woodhouse
    Signed-off-by: Dan Williams

    Dan Williams
     

09 Aug, 2010

1 commit


22 May, 2010

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
    DMAENGINE: extend the control command to include an arg
    async_tx: trim dma_async_tx_descriptor in 'no channel switch' case
    DMAENGINE: DMA40 fix for allocation of logical channel 0
    DMAENGINE: DMA40 support paused channel status
    dmaengine: mpc512x: Use resource_size
    DMA ENGINE: Do not reset 'private' of channel
    ioat: Remove duplicated devm_kzalloc() calls for ioatdma_device
    ioat3: disable cacheline-unaligned transfers for raid operations
    ioat2,3: convert to producer/consumer locking
    ioat: convert to circ_buf
    DMAENGINE: Support for ST-Ericssons DMA40 block v3
    async_tx: use of kzalloc/kfree requires the include of slab.h
    dmaengine: provide helper for setting txstate
    DMAENGINE: generic channel status v2
    DMAENGINE: generic slave control v2
    dma: timb-dma: Update comment and fix compiler warning
    dma: Add timb-dma
    DMAENGINE: COH 901 318 fix bytesleft
    DMAENGINE: COH 901 318 rename confusing vars

    Linus Torvalds
     

18 May, 2010

1 commit


05 May, 2010

1 commit

  • The raid6 recovery code should immediately drop back to the optimized
    synchronous path when a p+q dma resource is not available. Otherwise we
    run the non-optimized/multi-pass async code in sync mode.

    Verified with raid6test (NDISKS=255)

    Applies to kernels >= 2.6.32.

    Cc:
    Acked-by: NeilBrown
    Reported-by: H. Peter Anvin
    Signed-off-by: Dan Williams
    Signed-off-by: Linus Torvalds

    Dan Williams
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.
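
The first two rules above can be sketched as follows. This is not the slabh-sweep.py script itself: the identifier patterns and the function names are hypothetical, and only the alphabetical ordering case is handled.

```python
import re

# Hypothetical identifier lists; the real script's heuristics are not shown here.
SLAB_USE = re.compile(r"\b(kmalloc|kzalloc|kcalloc|kfree|kmem_cache_\w+)\s*\(")
GFP_USE = re.compile(r"\b(GFP_[A-Z]+|__GFP_[A-Z]+|gfp_t)\b")


def needed_include(source):
    """Pick the one header a file needs: slab.h, gfp.h, or none."""
    if SLAB_USE.search(source):
        return "linux/slab.h"  # slab users get slab.h
    if GFP_USE.search(source):
        return "linux/gfp.h"   # gfp-only users get gfp.h
    return None


def insert_sorted(includes, header):
    """Insert into an alphabetically ordered include block, else append."""
    if header in includes:
        return includes
    if includes == sorted(includes):
        return sorted(includes + [header])
    return includes + [header]


print(needed_include("buf = kzalloc(len, GFP_KERNEL);"))
print(needed_include("gfp_t flags = GFP_NOWAIT;"))
print(insert_sorted(["linux/dma-mapping.h", "linux/raid/pq.h"], "linux/slab.h"))
```

A file that uses both facilities gets only slab.h, since slab.h already includes gfp.h as described above.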

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

18 Dec, 2009

1 commit


20 Nov, 2009

1 commit

  • ioat3.2 does not support asynchronous error notifications which makes
    the driver experience latencies when non-zero pq validate results are
    expected. Provide a mechanism for turning off async_xor_val and
    async_syndrome_val via Kconfig. This approach is generally useful for
    any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like
    to force the async_tx api to fall back to the synchronous path for
    certain operations.

    Signed-off-by: Dan Williams

    Dan Williams
     

30 Oct, 2009

1 commit


20 Oct, 2009

3 commits

  • The raid6 recovery code currently requires special handling of the
    4-disk and 5-disk recovery scenarios for the native layout. Quoting
    from commit 0a82a623:

    In these situations the default N-disk algorithm will present
    0-source or 1-source operations to dma devices. To cover for
    dma devices where the minimum source count is 2 we implement
    4-disk and 5-disk handling in the recovery code.

    The ddf layout presents disks=6 and disks=7 to the recovery code in
    these situations. Instead of looking at the number of disks, count the
    number of non-zero sources in the list and call the special case code
    when the number of non-failed sources is 0 or 1.

    [neilb@suse.de: replace 'ddf' flag with counting good sources]
    Signed-off-by: Dan Williams

    Dan Williams
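
A small sketch of the selection rule described above: count the usable sources instead of trusting the disk count. The function name, the returned labels, and the positions of the empty slots in the ddf examples are assumptions made for illustration.

```python
def pick_recovery_path(blocks, faila, failb):
    """Choose a recovery routine from the number of usable data sources.

    `blocks` holds the data blocks only (P and Q left out); None marks a
    slot that carries no data, as in the ddf layout.
    """
    good = sum(
        1
        for index, block in enumerate(blocks)
        if block is not None and index not in (faila, failb)
    )
    if good == 0:
        return "4-disk special case"
    if good == 1:
        return "5-disk special case"
    return "generic N-disk algorithm"


native_4 = ["d0", "d1"]                  # 4 disks: 2 data + P + Q
ddf_6 = ["d0", None, "d1", None]         # same data, ddf presents disks=6
ddf_7 = ["d0", None, "d1", "d2", None]   # one surviving source, disks=7
wide = ["d0", "d1", "d2", "d3", "d4", "d5"]

print(pick_recovery_path(native_4, 0, 1))
print(pick_recovery_path(ddf_6, 0, 2))
print(pick_recovery_path(ddf_7, 0, 2))
print(pick_recovery_path(wide, 0, 1))
```

Counting sources gives the native and ddf layouts the same answer even though they report different disk counts.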
     
  • The global scribble page is used as a temporary destination buffer when
    disabling the P or Q result is requested. The local scribble buffer
    contains memory for performing address conversions. Rename the global
    variable to avoid confusion.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • - update the kernel doc for async_syndrome to indicate what NULL in the
    source list means
    - whitespace fixups

    Signed-off-by: Dan Williams

    Dan Williams
     

16 Oct, 2009

1 commit

  • async_syndrome_val checks the P and Q blocks used for RAID6
    calculations.
    With DDF raid6, some of the data blocks might be NULL, so
    this needs to be handled in the same way that async_gen_syndrome
    handles it.

    As async_syndrome_val calls async_xor, also enhance async_xor
    to detect and skip NULL blocks in the list.

    Signed-off-by: NeilBrown

    NeilBrown
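
Skipping NULL blocks in an XOR is safe because an absent block behaves like a block of zeroes. A minimal userspace sketch, with a hypothetical function name and `None` standing in for NULL:

```python
def xor_blocks(blocks, length):
    """XOR equal-sized blocks together, skipping None entries."""
    result = bytearray(length)
    for block in blocks:
        if block is None:
            continue  # an absent block contributes nothing, like all zeroes
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


blocks = [b"\x0f\xf0", None, b"\xff\x00", None, b"\x01\x01"]
print(xor_blocks(blocks, 2).hex())  # f1f1
```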