28 Sep, 2016

2 commits

  • We don't need to hold the spinlock while zeroing the allocated memory.
    In case we handle big buffers this is a severe issue as other CPUs might
    be spinning half a second or longer.

    Signed-off-by: Bastian Hecht
    Signed-off-by: George G. Davis
    Signed-off-by: Mark Craske
    Signed-off-by: Greg Kroah-Hartman

    Bastian Hecht
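
A minimal userspace sketch of the locking change described in the commit above, not the kernel code itself. It assumes a hypothetical page pool (`zp_pool`) guarded by a C11 `atomic_flag` standing in for the kernel spinlock; all names are illustrative. The lock covers only the page bookkeeping, and the potentially slow zeroing happens after the lock is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define ZP_POOL_PAGES 8
#define ZP_PAGE_BYTES 4096

/* Hypothetical pool: one "used" flag per page plus the backing memory. */
struct zp_pool {
    atomic_flag lock;                      /* stand-in for the spinlock */
    unsigned char used[ZP_POOL_PAGES];
    unsigned char mem[ZP_POOL_PAGES * ZP_PAGE_BYTES];
};

void zp_pool_init(struct zp_pool *p)
{
    atomic_flag_clear(&p->lock);
    memset(p->used, 0, sizeof p->used);
}

/* Find and mark a run of free pages; caller must hold the lock. */
static int zp_reserve(struct zp_pool *p, int pages)
{
    for (int start = 0; start + pages <= ZP_POOL_PAGES; start++) {
        int run = 0;
        while (run < pages && !p->used[start + run])
            run++;
        if (run == pages) {
            memset(&p->used[start], 1, (size_t)pages);
            return start;
        }
    }
    return -1;
}

void *zp_alloc_zeroed(struct zp_pool *p, int pages)
{
    assert(p != NULL && pages > 0);

    /* Critical section: bookkeeping only, so it stays short. */
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;
    int start = zp_reserve(p, pages);
    atomic_flag_clear_explicit(&p->lock, memory_order_release);

    if (start < 0)
        return NULL;

    /* The pages are already reserved for this caller, so zeroing
     * them does not need the lock. */
    void *buf = &p->mem[(size_t)start * ZP_PAGE_BYTES];
    memset(buf, 0, (size_t)pages * ZP_PAGE_BYTES);
    return buf;
}
```

Because the reservation is already recorded before the lock is released, no other caller can be handed the same pages while they are being cleared.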
     
  • We fix a bug in dma_mmap_from_coherent() that appears when we map
    non-page-aligned DMA memory. It cuts off the non-aligned part (unlike
    dma_alloc_coherent(), which always rounds up to full pages). So for
    mappings of less than a page we get -ENXIO, as dma_mmap_from_coherent()
    assumes we want to map zero pages.

    Signed-off-by: George G. Davis
    Signed-off-by: Jiada Wang
    Signed-off-by: Mark Craske
    Signed-off-by: Greg Kroah-Hartman

    George G. Davis
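
The arithmetic behind the commit above can be sketched as follows. This is an illustration under assumptions, not the kernel function: the `EX_` macros, the 4 KiB page size, and the shape of the range check in `ex_mmap_check()` are all hypothetical stand-ins. It contrasts a page count obtained by truncating the size with one obtained by rounding up first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE  ((size_t)1 << EX_PAGE_SHIFT)
#define EX_PAGE_ALIGN(x) (((x) + EX_PAGE_SIZE - 1) & ~(EX_PAGE_SIZE - 1))

/* Page count that drops a trailing partial page. */
size_t ex_pages_truncated(size_t size)
{
    return size >> EX_PAGE_SHIFT;
}

/* Page count that rounds a trailing partial page up to a full page. */
size_t ex_pages_rounded_up(size_t size)
{
    return EX_PAGE_ALIGN(size) >> EX_PAGE_SHIFT;
}

/*
 * Assumed form of the range check: the request (offset and page count,
 * both in pages) must fit inside the buffer's page count.
 * `rounded` selects which page count is used.
 */
int ex_mmap_check(size_t size, size_t off, size_t user_count, int rounded)
{
    size_t count = rounded ? ex_pages_rounded_up(size)
                           : ex_pages_truncated(size);

    assert(user_count > 0);
    if (off < count && user_count <= count - off)
        return 0;
    return -ENXIO;
}
```

With a 100-byte buffer the truncated count is zero, so even a one-page request at offset zero is rejected; the rounded-up count is one and the same request passes.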
     

31 Aug, 2016

1 commit


23 Mar, 2016

2 commits

  • Use memset_io() for DMA_MEMORY_IO mappings which are mapped as I/O
    memory, and regular memset() for DMA_MEMORY_MAP mappings.

    This fixes the below alignment fault on arm64 for DMA_MEMORY_IO
    mappings, where memset() uses the DC ZVA instruction which is invalid on
    device memory.

    Unhandled fault: alignment fault (0x96000061) at 0xffffff8000380000
    Internal error: : 96000061 [#1] PREEMPT SMP
    Modules linked in: hdlcd(+) clk_scpi
    CPU: 4 PID: 1355 Comm: systemd-udevd Not tainted 4.4.0-rc1+ #5
    Hardware name: ARM Juno development board (r0) (DT)
    task: ffffffc9763eee00 ti: ffffffc9758c4000 task.ti: ffffffc9758c4000
    PC is at __efistub_memset+0x1ac/0x200
    LR is at dma_alloc_from_coherent+0xb0/0x120
    pc : [] lr : [] pstate: 400001c5
    sp : ffffffc9758c79a0
    x29: ffffffc9758c79a0 x28: ffffffc000635cd0
    x27: 0000000000000124 x26: ffffffc000119ef4
    x25: 0000000000010000 x24: 0000000000000140
    x23: ffffffc07e9ac3a8 x22: ffffffc9758c7a58
    x21: ffffffc9758c7a68 x20: 0000000000000004
    x19: ffffffc07e9ac380 x18: 0000000000000001
    x17: 0000007fae1bbba8 x16: ffffffc0001b2d1c
    x15: ffffffffffffffff x14: 0ffffffffffffffe
    x13: 0000000000000010 x12: ffffff800837ffff
    x11: ffffff800837ffff x10: 0000000040000000
    x9 : 0000000000000000 x8 : ffffff8000380000
    x7 : 0000000000000000 x6 : 000000000000003f
    x5 : 0000000000000040 x4 : 0000000000000000
    x3 : 0000000000000004 x2 : 000000000000ffc0
    x1 : 0000000000000000 x0 : ffffff8000380000

    Signed-off-by: Brian Starkey
    Reviewed-by: Catalin Marinas
    Cc: Dan Williams
    Cc: Greg Kroah-Hartman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Brian Starkey
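
A rough userspace sketch of the dispatch described in the commit above. The flag values and the `ex_` names are assumptions made for illustration, and `ex_memset_io()` is only a stand-in for the kernel's memset_io(): it writes one byte at a time through a volatile pointer, so an optimized memset() with special block-zeroing instructions is never applied to the region.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag values; treat them as assumptions. */
enum {
    EX_DMA_MEMORY_MAP = 0x01,   /* region is directly accessible memory */
    EX_DMA_MEMORY_IO  = 0x02    /* region is mapped as I/O memory */
};

/* Stand-in for memset_io(): simple byte stores, nothing clever. */
static void ex_memset_io(volatile void *dst, int c, size_t n)
{
    volatile unsigned char *p = dst;

    while (n--)
        *p++ = (unsigned char)c;
}

/* Zero a freshly allocated region using the routine its mapping allows. */
void ex_zero_region(void *vaddr, size_t size, int flags)
{
    assert(vaddr != NULL);

    if (flags & EX_DMA_MEMORY_MAP)
        memset(vaddr, 0, size);        /* normal memory: fast path */
    else
        ex_memset_io(vaddr, 0, size);  /* I/O mapping: plain stores only */
}
```

Both paths leave the region zeroed; the only difference is which store instructions are allowed to touch it.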
     
  • When the DMA_MEMORY_MAP flag is used, memory which can be accessed
    directly should be returned, so use memremap(..., MEMREMAP_WC) to
    provide a writecombine mapping.

    Signed-off-by: Brian Starkey
    Reviewed-by: Catalin Marinas
    Cc: Dan Williams
    Cc: Greg Kroah-Hartman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Brian Starkey
     

10 Feb, 2016

1 commit


14 Oct, 2014

1 commit

  • Initialization procedure of dma coherent pool has been split into two
    parts, so memory pool can now be initialized without assigning to
    particular struct device. Then initialized region can be assigned to more
    than one struct device. To protect from concurrent allocations from
    different devices, a spinlock has been added to dma_coherent_mem
    structure. The last part of this patch adds support for handling
    'shared-dma-pool' reserved-memory device tree nodes.

    [akpm@linux-foundation.org: use more appropriate printk facility levels]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Marek Szyprowski
    Cc: Arnd Bergmann
    Cc: Michal Nazarewicz
    Cc: Grant Likely
    Cc: Laura Abbott
    Cc: Josh Cartwright
    Cc: Joonsoo Kim
    Cc: Kyungmin Park
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marek Szyprowski
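
The two-step scheme in the commit above might look roughly like this in a userspace model. Every name here (`sp_pool`, `sp_device`, and the functions) is hypothetical, and a C11 `atomic_flag` stands in for whatever lock the kernel uses, on the assumption that allocations from different devices sharing a pool need to be serialised.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SP_POOL_PAGES 4

/* Hypothetical shared pool: a lock plus one "used" flag per page. */
struct sp_pool {
    atomic_flag lock;
    unsigned char used[SP_POOL_PAGES];
};

/* Hypothetical device that only holds a reference to a pool. */
struct sp_device {
    const char *name;
    struct sp_pool *pool;
};

/* Step 1: initialise the pool on its own; no device is involved. */
void sp_pool_init(struct sp_pool *pool)
{
    atomic_flag_clear(&pool->lock);
    for (int i = 0; i < SP_POOL_PAGES; i++)
        pool->used[i] = 0;
}

/* Step 2: assign the initialised pool to a device (may be repeated). */
void sp_pool_assign(struct sp_device *dev, struct sp_pool *pool)
{
    dev->pool = pool;
}

/* Allocate one page for a device; returns the page index or -1. */
int sp_alloc_page(struct sp_device *dev)
{
    struct sp_pool *pool = dev->pool;
    int page = -1;

    assert(pool != NULL);

    while (atomic_flag_test_and_set_explicit(&pool->lock, memory_order_acquire))
        ;
    for (int i = 0; i < SP_POOL_PAGES; i++) {
        if (!pool->used[i]) {
            pool->used[i] = 1;
            page = i;
            break;
        }
    }
    atomic_flag_clear_explicit(&pool->lock, memory_order_release);
    return page;
}
```

Two devices assigned to the same pool draw from one set of pages, so neither can be handed a page the other already holds.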
     

21 May, 2014

1 commit

  • dma_declare_coherent_memory() takes two addresses for a region of memory: a
    "bus_addr" and a "device_addr". I think the intent is that "bus_addr" is
    the physical address a *CPU* would use to access the region, and
    "device_addr" is the bus address the *device* would use to address the
    region.

    Rename "bus_addr" to "phys_addr" and change its type to phys_addr_t.
    Most callers already supply a phys_addr_t for this argument. The others
    supply a 32-bit integer (a constant, unsigned int, or __u32) and need no
    change.

    Use "unsigned long", not phys_addr_t, to hold PFNs.

    No functional change (this could theoretically fix a truncation in a config
    with 32-bit dma_addr_t and 64-bit phys_addr_t, but I don't think there are
    any such cases involving this code).

    Signed-off-by: Bjorn Helgaas
    Acked-by: Arnd Bergmann
    Acked-by: Greg Kroah-Hartman
    Acked-by: James Bottomley
    Acked-by: Randy Dunlap

    Bjorn Helgaas
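
The theoretical truncation mentioned in the commit above can be illustrated with fixed-width stand-ins. The typedefs below are assumptions chosen to model a configuration with a 64-bit phys_addr_t and a 32-bit dma_addr_t; they are not the kernel's definitions, and the page shift of 12 is likewise illustrative.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t ex_phys_addr_t;   /* models a 64-bit phys_addr_t */
typedef uint32_t ex_dma_addr_t;    /* models a 32-bit dma_addr_t */

#define EX_ADDR_PAGE_SHIFT 12

static_assert(sizeof(ex_phys_addr_t) > sizeof(ex_dma_addr_t),
              "sketch assumes the CPU address type is the wider one");

/* Storing a CPU physical address in the narrower type drops high bits. */
ex_dma_addr_t ex_store_as_dma(ex_phys_addr_t phys)
{
    return (ex_dma_addr_t)phys;
}

/* Nonzero if the address survives a round trip through the narrow type. */
int ex_fits_in_dma(ex_phys_addr_t phys)
{
    return (ex_phys_addr_t)ex_store_as_dma(phys) == phys;
}

/* A page frame number is an index, not an address, so it is kept in
 * an unsigned long as the commit message describes. */
unsigned long ex_phys_to_pfn(ex_phys_addr_t phys)
{
    return (unsigned long)(phys >> EX_ADDR_PAGE_SHIFT);
}
```

Keeping the CPU-side address in its own wide type means the narrowing can only happen where an address is deliberately converted for the device.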
     

23 Oct, 2012

1 commit


15 Jun, 2012

1 commit


21 May, 2012

1 commit


01 Nov, 2011

1 commit


06 Aug, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the following
    script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
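
The include-selection rule from the commit above (gfp.h if only gfp is used, slab.h if slab is used) can be sketched in a few lines. This is not the slabh-sweep.py script; the function name and the token lists it searches for are assumptions picked for illustration, and a plain substring search is far cruder than whatever the real script does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide which header a source file should include directly.
 * Returns "linux/slab.h" if slab facilities appear, "linux/gfp.h" if
 * only gfp facilities appear, or NULL if neither is used.
 * The token lists are illustrative guesses, not the script's own.
 */
const char *ex_needed_header(const char *source)
{
    static const char *const slab_tokens[] = {
        "kmalloc(", "kzalloc(", "kfree("
    };

    assert(source != NULL);

    /* Slab usage wins: slab.h already pulls in gfp.h. */
    for (size_t i = 0; i < sizeof slab_tokens / sizeof slab_tokens[0]; i++) {
        if (strstr(source, slab_tokens[i]))
            return "linux/slab.h";
    }

    if (strstr(source, "GFP_"))
        return "linux/gfp.h";

    return NULL;
}
```

A file that calls kmalloc() with GFP_KERNEL therefore needs only slab.h, while a file that merely passes GFP flags around needs gfp.h.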
     

16 Sep, 2009

1 commit