06 Jan, 2016

3 commits


18 Dec, 2015

8 commits

  • Add support for providing device to filter_fn mapping so client drivers
    can switch to use the dma_request_chan() API.

    Signed-off-by: Peter Ujfalusi
    Reviewed-by: Arnd Bergmann
    Signed-off-by: Vinod Koul

    Peter Ujfalusi
     
  • Add support for providing device to filter_fn mapping so client drivers
    can switch to use the dma_request_chan() API.

    Signed-off-by: Peter Ujfalusi
    Reviewed-by: Arnd Bergmann
    Signed-off-by: Vinod Koul

    Peter Ujfalusi
     
  • The two API functions can cover most, if not all, of the current APIs used to
    request a channel. With minimal effort, dmaengine drivers, platforms and
    dmaengine user drivers can be converted to use the two functions.

    struct dma_chan *dma_request_chan_by_mask(const dma_cap_mask_t *mask);

    Requests any channel matching the requested capabilities. It can be
    used to request a channel for memcpy, memset, xor, etc., where no hardware
    synchronization is needed.

    struct dma_chan *dma_request_chan(struct device *dev, const char *name);

    Requests a slave channel. dma_request_chan() will try to find the
    channel via DT or ACPI. If the kernel booted in non-DT/ACPI mode,
    it will use a filter lookup table and retrieve the needed information from
    the dma_slave_map provided by the DMA drivers.
    This legacy mode needs changes in platform code and in dmaengine drivers;
    after that, the dmaengine user drivers can be converted:

    For each dmaengine driver, an array that maps the DMA client device and
    slave channel name to the parameter for the filter function needs to be added:

    static const struct dma_slave_map da830_edma_map[] = {
        { "davinci-mcasp.0", "rx", EDMA_FILTER_PARAM(0, 0) },
        { "davinci-mcasp.0", "tx", EDMA_FILTER_PARAM(0, 1) },
        { "davinci-mcasp.1", "rx", EDMA_FILTER_PARAM(0, 2) },
        { "davinci-mcasp.1", "tx", EDMA_FILTER_PARAM(0, 3) },
        { "davinci-mcasp.2", "rx", EDMA_FILTER_PARAM(0, 4) },
        { "davinci-mcasp.2", "tx", EDMA_FILTER_PARAM(0, 5) },
        { "spi_davinci.0", "rx", EDMA_FILTER_PARAM(0, 14) },
        { "spi_davinci.0", "tx", EDMA_FILTER_PARAM(0, 15) },
        { "da830-mmc.0", "rx", EDMA_FILTER_PARAM(0, 16) },
        { "da830-mmc.0", "tx", EDMA_FILTER_PARAM(0, 17) },
        { "spi_davinci.1", "rx", EDMA_FILTER_PARAM(0, 18) },
        { "spi_davinci.1", "tx", EDMA_FILTER_PARAM(0, 19) },
    };

    This information is going to be needed by the dmaengine driver, so
    modification to the platform_data is needed, and the driver map should be
    added to the pdata of the DMA driver:

    da8xx_edma0_pdata.slave_map = da830_edma_map;
    da8xx_edma0_pdata.slavecnt = ARRAY_SIZE(da830_edma_map);

    The DMA driver then needs to configure the needed device -> filter_fn
    mapping before it registers with dma_async_device_register():

    ecc->dma_slave.filter_map.map = info->slave_map;
    ecc->dma_slave.filter_map.mapcnt = info->slavecnt;
    ecc->dma_slave.filter_map.fn = edma_filter_fn;

    When neither DT nor ACPI lookup is available, dma_request_chan() will
    try to match the requester's device name with the filter_map's list of
    device names. When a match is found, it will use the information from the
    dma_slave_map to get the channel with the dma_get_channel() internal
    function.

    Signed-off-by: Peter Ujfalusi
    Reviewed-by: Arnd Bergmann
    Signed-off-by: Vinod Koul

    Peter Ujfalusi
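
The fallback lookup described in this commit can be pictured with a small userspace sketch. It is not the dmaengine implementation: `struct slave_map_entry`, `example_map` and `find_slave_map()` are hypothetical names, and the integer `param` merely stands in for the value built with EDMA_FILTER_PARAM(). The sketch only shows the idea of matching the requester's device name and channel name against a table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct dma_slave_map (assumption). */
struct slave_map_entry {
	const char *devname;	/* requester's device name */
	const char *slave;	/* channel name, e.g. "rx" or "tx" */
	int param;		/* opaque value handed to the filter function */
};

/* Hypothetical table, shaped like the da830_edma_map example. */
static const struct slave_map_entry example_map[] = {
	{ "davinci-mcasp.0", "rx", 0 },
	{ "davinci-mcasp.0", "tx", 1 },
	{ "spi_davinci.0", "rx", 14 },
	{ "spi_davinci.0", "tx", 15 },
};

/*
 * Return the entry matching both the device name and the channel name,
 * or NULL when the table holds no such pair.
 */
static const struct slave_map_entry *
find_slave_map(const struct slave_map_entry *map, size_t count,
	       const char *devname, const char *name)
{
	size_t i;

	assert(devname && name);
	for (i = 0; i < count; i++) {
		if (!strcmp(map[i].devname, devname) &&
		    !strcmp(map[i].slave, name))
			return &map[i];
	}
	return NULL;
}
```

In this sketch a caller would pass the matched entry's `param` to the filter function; a NULL result corresponds to the case where no channel can be found for the requester.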
     
  • Channel matching with private_candidate() is used in two paths; the error
    checking differs slightly between them, and they also duplicate code.
    Move the code under find_candidate() to provide consistent execution and
    to allow this mode of channel lookup to be reused later.

    Signed-off-by: Peter Ujfalusi
    Reviewed-by: Andy Shevchenko
    Reviewed-by: Arnd Bergmann
    Signed-off-by: Vinod Koul

    Peter Ujfalusi
     
  • If mask is NULL skip the mask matching against the DMA device capabilities.

    Signed-off-by: Peter Ujfalusi
    Reviewed-by: Andy Shevchenko
    Reviewed-by: Arnd Bergmann
    Signed-off-by: Vinod Koul

    Peter Ujfalusi
     
  • Use of the CANCEL bit in mdc_terminate_all causes an
    additional 'command done' to appear in the registers (in
    addition to an interrupt).

    In addition, there is a potential race between
    mdc_terminate_all and the irq handler if a transfer
    completes at the same time as the terminate all (presently
    this results in an inappropriate warning).

    To handle these issues, any outstanding 'command done'
    events are cleared during mdc_terminate_all and the irq
    handler takes no action when there are no new 'command done'
    events.

    Signed-off-by: Damien.Horsley
    Signed-off-by: Vinod Koul

    Damien.Horsley
     
  • Due to changes in device and platform code, drivers w/o probe will fail to
    load. This means that the devices for eDMA TPTCs are going to be without a
    driver, and omap hwmod code will turn them off after the kernel has finished
    loading:
    [ 3.015900] platform 49800000.tptc: omap_device_late_idle: enabled but no driver. Idling
    [ 3.024671] platform 49a00000.tptc: omap_device_late_idle: enabled but no driver. Idling

    This will prevent eDMA from working since the TPTCs are not enabled.

    Signed-off-by: Peter Ujfalusi
    Fixes: 34635b1accb9 ("dmaengine: edma: Add dummy driver skeleton for edma3-tptc")
    Signed-off-by: Vinod Koul

    Peter Ujfalusi
     
  • If the "dma-channels" DT property is missing, the dw_dma_parse_dt()
    function returns NULL, but not before allocating memory for a struct
    dw_dma_platform_data through devres. If the device supports parameter
    detection, the probe still succeeds and the allocated memory is not
    released until the device is removed.

    Fix this by deferring the allocation until after checking the
    "dma-channels" property.

    Signed-off-by: Mans Rullgard
    Acked-by: Viresh Kumar
    Signed-off-by: Vinod Koul

    Mans Rullgard
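
The ordering fix described in this commit (check first, allocate afterwards) can be sketched in userspace as follows. This is an illustration only: `struct pdata`, `read_channels()` and `parse()` are hypothetical names, and plain calloc() stands in for the devres allocation used by the driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the platform data structure. */
struct pdata {
	unsigned int nr_channels;
};

/* Hypothetical property lookup: returns 0 and fills *val when present. */
static int read_channels(const unsigned int *prop, unsigned int *val)
{
	assert(val);
	if (!prop)
		return -1;
	*val = *prop;
	return 0;
}

static struct pdata *parse(const unsigned int *prop)
{
	unsigned int n;
	struct pdata *p;

	/* Check the mandatory property first ... */
	if (read_channels(prop, &n))
		return NULL;	/* nothing allocated yet, nothing left behind */

	/* ... and allocate only once the parse is known to succeed. */
	p = calloc(1, sizeof(*p));
	if (!p)
		return NULL;
	p->nr_channels = n;
	return p;
}
```

With this ordering, the early-return path for a missing property never owns memory, so there is nothing to release on that path.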
     

05 Dec, 2015

8 commits

  • Calling synchronize_irq() right before devm_free_irq() is quite useless. On
    one hand the IRQ can easily fire again before devm_free_irq() is entered,
    on the other hand devm_free_irq() itself calls synchronize_irq() internally
    (in a race condition free way), before any state associated with the IRQ is
    freed.

    Patch was generated using the following semantic patch:
    //
    @@
    expression irq, dev;
    @@
    -synchronize_irq(irq);
    devm_free_irq(dev, irq, ...);
    //

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Vinod Koul

    Lars-Peter Clausen
     
  • Calling synchronize_irq() right before free_irq() is quite useless. On one
    hand the IRQ can easily fire again before free_irq() is entered, on the
    other hand free_irq() itself calls synchronize_irq() internally (in a race
    condition free way), before any state associated with the IRQ is freed.

    Patch was generated using the following semantic patch:
    //
    @@
    expression irq;
    @@
    -synchronize_irq(irq);
    free_irq(irq, ...);
    //

    Signed-off-by: Lars-Peter Clausen
    Acked-by: Ludovic Desroches
    Signed-off-by: Vinod Koul

    Lars-Peter Clausen
     
  • This adds power management suspend/resume support for the fsl-edma
    driver.

    eDMA acts as a basic service used by other drivers. To support power
    management, it needs to perform the two steps below.

    In fsl_edma_suspend_late:
    Check whether the DMA channel is idle; if it is not idle, disable the DMA
    request.

    In fsl_edma_resume_early:
    Enable the eDMA and wait for it to be used.

    Signed-off-by: Yuan Yao
    Signed-off-by: Vinod Koul

    Yuan Yao
     
  • There is no need to calculate the overall length of the descriptor each time we
    query the DMA transfer status. Instead we do this at the descriptor allocation
    stage and keep the stored length for further use.

    Signed-off-by: Andy Shevchenko
    Signed-off-by: Vinod Koul

    Andy Shevchenko
     
  • Currently the DMA controller is matched only against the lower 32 bits of
    the address, which might give a wrong result on 64-bit platforms. Check the
    upper portion as well.

    Signed-off-by: Andy Shevchenko
    Signed-off-by: Vinod Koul

    Andy Shevchenko
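
A minimal sketch of the comparison this commit describes, assuming the controller's address is stored as two 32-bit words. `struct ctrl_addr` and `ctrl_matches()` are hypothetical names; `lo32()` and `hi32()` are userspace equivalents of the kernel's lower_32_bits() and upper_32_bits() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace equivalents of lower_32_bits() / upper_32_bits(). */
static uint32_t lo32(uint64_t v) { return (uint32_t)v; }
static uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Hypothetical record: a base address stored as two 32-bit words. */
struct ctrl_addr {
	uint32_t low;
	uint32_t high;
};

/* Match only when both halves of the address agree. */
static bool ctrl_matches(const struct ctrl_addr *c, uint64_t addr)
{
	assert(c);
	return c->low == lo32(addr) && c->high == hi32(addr);
}
```

Comparing only `low` would treat two controllers whose addresses differ solely above 4 GiB as the same device, which is the mismatch the commit guards against.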
     
  • When setting the channel configuration register, the perid field is not
    set to 0 since it is useless for mem2mem transfers. Unfortunately, a
    device has 0 as its perid. This could cause spurious status flags because
    the controller could mix up events from the two channels.
    For that reason, use the highest perid value for mem2mem transfers since it
    doesn't match the perid of other devices.

    Signed-off-by: Ludovic Desroches
    Acked-by: Nicolas Ferre
    Signed-off-by: Vinod Koul

    Ludovic Desroches
     
  • Implementations of dmaengine_synchronize() are allowed to sleep, hence the
    function must not be called from atomic context. Add might_sleep() to
    dmaengine_synchronize() to make it easier to detect non-compliant callers.

    Suggested-by: Andy Shevchenko
    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Vinod Koul

    Lars-Peter Clausen
     
  • This patch fixes an issue where list_for_each_entry() in
    usb_dmac_chan_terminate_all() can cause an endless loop, because the loop
    body moves the current desc to the desc_freed list. So, this driver should
    use list_for_each_entry_safe() instead of list_for_each_entry().

    Signed-off-by: Yoshihiro Shimoda
    Signed-off-by: Vinod Koul

    Yoshihiro Shimoda
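
Why the "safe" iterator matters can be shown with a small userspace list, written as a sketch rather than with the kernel's list.h. `struct node`, `move_all()` and the other helpers are hypothetical; the loop saves the successor before moving the current node, which is the same idea list_for_each_entry_safe() is built on.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Move every node from 'from' to 'to'; returns the number of nodes moved. */
static size_t move_all(struct node *from, struct node *to)
{
	struct node *pos, *tmp;
	size_t moved = 0;

	assert(from != to);
	/*
	 * Save the successor before touching 'pos': once 'pos' has been
	 * moved, pos->next points into 'to', and a plain iterator would
	 * never see 'from' again to terminate.
	 */
	for (pos = from->next, tmp = pos->next; pos != from;
	     pos = tmp, tmp = pos->next) {
		list_del(pos);
		list_add_tail(pos, to);
		moved++;
	}
	return moved;
}

static size_t list_len(const struct node *head)
{
	const struct node *p;
	size_t n = 0;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

If the loop advanced with `pos = pos->next` after the move, it would walk the destination list instead, whose head is never equal to `from`; that is the endless loop described in the commit message.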
     

16 Nov, 2015

14 commits

  • As this driver provides a mechanism to reuse transfers, declare it in
    its probe function.

    Signed-off-by: Robert Jarzmik
    Signed-off-by: Vinod Koul

    Robert Jarzmik
     
  • In the current state, the capability of transfer reuse can neither be
    set by a slave dmaengine driver, nor used by a client driver, because
    the capability is not available to dma_get_slave_caps().

    Fix this by adding a way to declare the capability.

    Fixes: 272420214d26 ("dmaengine: Add DMA_CTRL_REUSE")
    Signed-off-by: Robert Jarzmik
    Signed-off-by: Vinod Koul

    Robert Jarzmik
     
  • This patch attempts to enhance the case of a transfer submitted multiple
    times, and where the cost of creating the descriptors chain is not
    negligible.

    This happens with big video buffers (several megabytes, i.e. several
    thousands of linked descriptors in one scatter-gather list). In these
    cases, a video driver would want to do:
    - tx = dmaengine_prep_slave_sg()
    - dmaengine_submit(tx);
    - dma_async_issue_pending()
    - wait for video completion
    - read video data (or not, skipping a frame is also possible)
    - dmaengine_submit(tx)
      => here, the descriptors chain recalculation will take time
      => the dma coherent allocation over and over might create holes in
         the dma pool, which is counter-productive.
    - dma_async_issue_pending()
    - etc ...

    In order to cope with this case, virt-dma is modified to prevent freeing
    the descriptors upon completion if DMA_CTRL_REUSE flag is set in the
    transfer.

    This patch is a respin of the former DMA_CTRL_ACK approach, which was
    reverted due to a regression in audio drivers.

    Signed-off-by: Robert Jarzmik
    Signed-off-by: Vinod Koul

    Robert Jarzmik
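
The behaviour described in this commit, keeping a descriptor alive across completions when the reuse flag is set, can be modelled in a few lines of userspace C. This is a sketch under assumptions, not virt-dma code: `CTRL_REUSE` stands in for DMA_CTRL_REUSE, and `struct desc`, `desc_alloc()`, `desc_submit()` and `desc_complete()` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define CTRL_REUSE 0x1u	/* stand-in for DMA_CTRL_REUSE (assumption) */

/* Hypothetical descriptor: only the fields this sketch needs. */
struct desc {
	unsigned int flags;
	unsigned int submissions;
};

static struct desc *desc_alloc(unsigned int flags)
{
	struct desc *d = calloc(1, sizeof(*d));

	if (d)
		d->flags = flags;
	return d;
}

static void desc_submit(struct desc *d)
{
	assert(d);
	d->submissions++;
}

/*
 * Completion handling: free the descriptor unless it is marked reusable.
 * Returns true when the descriptor was freed and must not be used again.
 */
static bool desc_complete(struct desc *d)
{
	assert(d);
	if (d->flags & CTRL_REUSE)
		return false;
	free(d);
	return true;
}
```

A reusable descriptor can be submitted again after completion without rebuilding its chain; in this model the owner frees it explicitly once it is no longer needed.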
     
  • Use the new dmaengine_synchronize() function to make sure that all complete
    callbacks have finished running before the runtime data, which is accessed
    in the completed callback, is freed.

    This fixes a long standing use-after-free race condition that has been
    observed on some systems.

    Signed-off-by: Lars-Peter Clausen
    Reviewed-by: Takashi Iwai
    Signed-off-by: Vinod Koul

    Lars-Peter Clausen
     
  • Implement the new device_synchronize() callback to allow proper
    synchronization when stopping a channel. Since the driver already makes
    sure that no new complete callbacks are scheduled after the
    device_terminate_all() callback has been called, all that is left to do in
    the device_synchronize() callback is to wait for all currently running
    complete callbacks to finish.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Vinod Koul

    Lars-Peter Clausen
     
  • Add a synchronize helper function for the virt-dma library. The function
    makes sure that any scheduled descriptor complete callbacks have finished
    running before the function returns.

    This needs to be called by drivers using virt-dma in their
    device_synchronize() callback. Depending on the driver additional
    operations might be necessary in addition to calling vchan_synchronize() to
    ensure proper synchronization.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Vinod Koul

    Lars-Peter Clausen
     
  • The DMAengine API has a long standing race condition that is inherent to
    the API itself. Calling dmaengine_terminate_all() is supposed to stop and
    abort any pending or active transfers that have previously been submitted.
    Unfortunately it is possible that this operation races against a currently
    running (or with some drivers also scheduled) completion callback.

    Since the API allows dmaengine_terminate_all() to be called from atomic
    context as well as from within a completion callback it is not possible to
    synchronize to the execution of the completion callback from within
    dmaengine_terminate_all() itself.

    This means that a user of the DMAengine API does not know when it is safe
    to free resources used in the completion callback, which can result in a
    use-after-free race condition.

    This patch addresses the issue by introducing an explicit synchronization
    primitive to the DMAengine API called dmaengine_synchronize().

    The existing dmaengine_terminate_all() is deprecated in favor of
    dmaengine_terminate_sync() and dmaengine_terminate_async(). The former
    aborts all pending and active transfers and synchronizes to the current
    context, meaning it will wait until all running completion callbacks have
    finished. This means it is only possible to call this function from
    non-atomic context. The latter function does not synchronize, but can still
    be used in atomic context or from within a complete callback. It has to be
    followed up by dmaengine_synchronize() before a client can free the
    resources used in a completion callback.

    In addition to this the semantics of the device_terminate_all() callback
    are slightly relaxed by this patch. It is now OK for a driver to only
    schedule the termination of the active transfer, without necessarily
    having to wait until the DMA controller has completely stopped. The driver
    must ensure though that the controller has stopped and no longer accesses
    any memory when the device_synchronize() callback returns.

    This was in part done since most drivers do not pay attention to this
    anyway at the moment and to emphasize that this needs to be done when the
    device_synchronize() callback is implemented. But it also helps with
    implementing support for devices where stopping the controller can require
    operations that may sleep.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Vinod Koul

    Lars-Peter Clausen
     
  • Linus Torvalds
     
  • Pull perf updates from Thomas Gleixner:
    "Mostly updates to the perf tool plus two fixes to the kernel core code:

    - Handle tracepoint filters correctly for inherited events (Peter
    Zijlstra)

    - Prevent a deadlock in perf_lock_task_context (Paul McKenney)

    - Add missing newlines to some pr_err() calls (Arnaldo Carvalho de
    Melo)

    - Print full source file paths when using 'perf annotate --print-line
    --full-paths' (Michael Petlan)

    - Fix 'perf probe -d' when just one out of uprobes and kprobes is
    enabled (Wang Nan)

    - Add compiler.h to list.h to fix 'make perf-tar-src-pkg' generated
    tarballs, i.e. out of tree building (Arnaldo Carvalho de Melo)

    - Add the llvm-src-base.c and llvm-src-kbuild.c files, generated by
    the 'perf test' LLVM entries, when running it in-tree, to
    .gitignore (Yunlong Song)

    - libbpf error reporting improvements, using a strerror interface to
    more precisely tell the user about problems with the provided
    scriptlet, be it in C or as a ready made object file (Wang Nan)

    - Do not be case sensitive when searching for matching 'perf test'
    entries (Arnaldo Carvalho de Melo)

    - Inform the user about objdump failures in 'perf annotate' (Andi
    Kleen)

    - Improve the LLVM 'perf test' entry, introduce new ones for BPF
    and kbuild tests to check the environment used by clang to compile
    .c scriptlets (Wang Nan)"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
    perf/x86/intel/rapl: Remove the unused RAPL_EVENT_DESC() macro
    tools include: Add compiler.h to list.h
    perf probe: Verify parameters in two functions
    perf session: Add missing newlines to some pr_err() calls
    perf annotate: Support full source file paths for srcline fix
    perf test: Add llvm-src-base.c and llvm-src-kbuild.c to .gitignore
    perf: Fix inherited events vs. tracepoint filters
    perf: Disable IRQs across RCU RS CS that acquires scheduler lock
    perf test: Do not be case sensitive when searching for matching tests
    perf test: Add 'perf test BPF'
    perf test: Enhance the LLVM tests: add kbuild test
    perf test: Enhance the LLVM test: update basic BPF test program
    perf bpf: Improve BPF related error messages
    perf tools: Make fetch_kernel_version() publicly available
    bpf tools: Add new API bpf_object__get_kversion()
    bpf tools: Improve libbpf error reporting
    perf probe: Cleanup find_perf_probe_point_from_map to reduce redundancy
    perf annotate: Inform the user about objdump failures in --stdio
    perf stat: Make stat options global
    perf sched latency: Fix thread pid reuse issue
    ...

    Linus Torvalds
     
  • Pull scheduler fix from Thomas Gleixner:
    "A single fix to prevent math underflow in the numa balancing code"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/numa: Fix math underflow in task_tick_numa()

    Linus Torvalds
     
  • Pull liblockdep fixes from Thomas Gleixner:
    "Three small patches to synchronize liblockdep with the latest core
    changes"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    tools/liblockdep: explicitly declare lockdep API we call from liblockdep
    tools/liblockdep: add userspace versions of WRITE_ONCE and RCU_INIT_POINTER
    tools/liblockdep: remove task argument from debug_check_no_locks_held

    Linus Torvalds
     
  • Pull x86 fixes from Thomas Gleixner:
    "A couple of fixes and updates related to x86:

    - Fix the W+X check regression on XEN

    - The real fix for the low identity map trainwreck

    - Probe legacy PIC early instead of unconditionally allocating legacy
    irqs

    - Add cpu verification to long mode entry

    - Adjust the cache topology to AMD Fam17H systems

    - Let Merrifield use the TSC across S3"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/cpu: Call verify_cpu() after having entered long mode too
    x86/setup: Fix low identity map for >= 2GB kernel range
    x86/mm: Skip the hypervisor range when walking PGD
    x86/AMD: Fix last level cache topology for AMD Fam17h systems
    x86/irq: Probe for PIC presence before allocating descs for legacy IRQs
    x86/cpu/intel: Enable X86_FEATURE_NONSTOP_TSC_S3 for Merrifield

    Linus Torvalds
     
  • ….kernel.org/pub/scm/linux/kernel/git/tip/tip

    Pull irq and timer fixes from Thomas Gleixner:

    - An irq regression fix to restore the wakeup behaviour of chained
    interrupts.

    - A timer fix for a long standing race versus timers scheduled on a
    target cpu which got exposed by recent changes in the workqueue
    implementation.

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    genirq/PM: Restore system wake up from chained interrupts

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    timers: Use proper base migration in add_timer_on()

    Linus Torvalds
     
  • Pull MIPS updates from Ralf Baechle:
    "These are the highlights of the main MIPS pull request for 4.4:

    - Add latencytop support
    - Support appended DTBs
    - VDSO support and initially use it for gettimeofday.
    - Drop the .MIPS.abiflags and ELF NOTE sections from vmlinux
    - Support for the 5KE, an internal test core.
    - Switch all MIPS platforms to libata drivers.
    - Improved support, cleanups for ralink and Lantiq platforms.
    - Support for the new xilfpga platform.
    - A number of DTB improvements for BMIPS.
    - Improved support for CM and CPS.
    - Minor JZ4740 and BCM47xx enhancements"

    * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (120 commits)
    MIPS: idle: add case for CPU_5KE
    MIPS: Octeon: Support APPENDED_DTB
    MIPS: vmlinux: create a section for appended DTB
    MIPS: Clean up compat_siginfo_t
    MIPS: Fix PAGE_MASK definition
    MIPS: BMIPS: Enable GZIP ramdisk and timed printks
    MIPS: Add xilfpga defconfig
    MIPS: xilfpga: Add mipsfpga platform code
    MIPS: xilfpga: Add xilfpga device tree files.
    dt-bindings: MIPS: Document xilfpga bindings and boot style
    MIPS: Make MIPS_CMDLINE_DTB default
    MIPS: Make the kernel arguments from dtb available
    MIPS: Use USE_OF as the guard for appended dtb
    MIPS: BCM63XX: Use pr_* instead of printk
    MIPS: Loongson: Cleanup CONFIG_LOONGSON_SUSPEND.
    MIPS: lantiq: Disable xbar fpi burst mode
    MIPS: lantiq: Force the crossbar to big endian
    MIPS: lantiq: Initialize the USB core on boot
    MIPS: lantiq: Return correct value for fpi clock on ar9
    MIPS: ralink: Add missing clock on rt305x
    ...

    Linus Torvalds
     

15 Nov, 2015

2 commits

  • Pull sound fixes from Takashi Iwai:
    "Here is a collection of small fixes that have been gathered for
    4.4-rc1. The only significant changes are those in PCI drivers
    Kconfig, to use "depends on" instead of "select" for CONFIG_ZONE_DMA.
    A reverse select is often more user-friendly, but in this case, it
    makes it hard to manage the conflict with ZONE_DEVICE, so it was changed
    in such a way for now.

    Others are all small fixes and quirks: an error check in soundcore
    register_chrdev(), HD-audio HDMI/DP phantom jack fix, Intel Broxton DP
    quirk, USB-audio DSD device quirk, some constifications, etc"

    * tag 'sound-fix-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: pci: depend on ZONE_DMA
    ALSA: hda - Simplify phantom jack handling for HDMI/DP
    ALSA: hda/hdmi - apply Skylake fix-ups to Broxton display codec
    ALSA: ctxfi: constify rsc ops structures
    ALSA: usb: Add native DSD support for Aune X1S
    ALSA: oxfw: add an comment to Kconfig for TASCAM FireOne
    sound: fix check for error condition of register_chrdev()

    Linus Torvalds
     
  • Pull ARC fixes from Vineet Gupta:
    "Found a couple of brown paper bag bugs with the prev pull request
    (including a SMP build breakage report from Guenter). Since these are
    urgent I also decided to send over a bunch of other pending fixes
    which could have otherwise waited an rc or two.

    Summary:

    - A bunch of brown paper bag bugs (MAINTAINERS list email, SMP build
    failure)
    - cpu_relax() now compiler barrier for UP as well
    - handling of userspace Bus Errors for ARCompact builds"

    * tag 'arc-4.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    ARC: Fix silly typo in MAINTAINERS file
    ARC: cpu_relax() to be compiler barrier even for UP
    ARC: use ASL assembler mnemonic
    ARC: [arcompact] Handle bus error from userspace as Interrupt not exception
    ARC: remove extraneous header include
    ARCv2: lib: memcpy: use local symbols

    Linus Torvalds
     

14 Nov, 2015

5 commits