12 Nov, 2016

1 commit

  • The CMA allocation request size is represented by a size_t, which gets
    truncated when it is passed as an int to bitmap_find_next_zero_area_off().

    We observed during fuzz testing that when the CMA allocation request is too
    large, bitmap_find_next_zero_area_off() still returns success because of
    the truncation. This leads to a kernel crash, as subsequent code assumes
    that the requested memory is available.

    Fail the CMA allocation when the request exceeds the corresponding CMA
    region size; a sketch of the guard follows this entry.

    Link: http://lkml.kernel.org/r/1478189211-3467-1-git-send-email-shashim@codeaurora.org
    Signed-off-by: Shiraz Hashim
    Cc: Catalin Marinas
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shiraz Hashim
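
    An illustrative sketch of the kind of guard described above, placed in
    cma_alloc(); the helper names follow the mm/cma.c conventions of that era
    and the exact upstream hunk may differ:

        unsigned long bitmap_maxno = cma_bitmap_maxno(cma);
        unsigned long bitmap_count = cma_bitmap_pages_to_bits(cma, count);

        /* a request larger than the whole CMA region can never succeed */
        if (bitmap_count > bitmap_maxno)
                return NULL;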
     

12 Oct, 2016

1 commit

  • Some of the kmemleak_*() callbacks in memblock, bootmem, CMA convert a
    physical address to a virtual one using __va(). However, such physical
    addresses may sometimes be located in highmem and using __va() is
    incorrect, leading to inconsistent object tracking in kmemleak.

    The following functions have been added to the kmemleak API and they take
    a physical address as the object pointer. They only perform the
    corresponding action if the address has a lowmem mapping:

    kmemleak_alloc_phys
    kmemleak_free_part_phys
    kmemleak_not_leak_phys
    kmemleak_ignore_phys

    The affected call sites have been updated to use the new kmemleak API (a
    sketch of one wrapper follows this entry).

    Link: http://lkml.kernel.org/r/1471531432-16503-1-git-send-email-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Reported-by: Vignesh R
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
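
    Each wrapper is essentially a lowmem-guarded version of the existing call.
    A sketch of one of them, assuming the usual kmemleak and memory-model
    helpers (not necessarily the exact upstream body):

        void __ref kmemleak_alloc_phys(phys_addr_t phys, size_t size,
                                       int min_count, gfp_t gfp)
        {
                /* only track the object if the address has a lowmem mapping */
                if (!IS_ENABLED(CONFIG_HIGHMEM) || PHYS_PFN(phys) < max_low_pfn)
                        kmemleak_alloc(__va(phys), size, min_count, gfp);
        }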
     

28 May, 2016

1 commit

  • pageblock_order can be (at least) an unsigned int or an unsigned long
    depending on the kernel config and architecture, so use max_t(unsigned
    long, ...) when comparing it.

    Fixes these warnings:

    In file included from include/asm-generic/bug.h:13:0,
                     from arch/powerpc/include/asm/bug.h:127,
                     from include/linux/bug.h:4,
                     from include/linux/mmdebug.h:4,
                     from include/linux/mm.h:8,
                     from include/linux/memblock.h:18,
                     from mm/cma.c:28:
    mm/cma.c: In function 'cma_init_reserved_mem':
    include/linux/kernel.h:748:17: warning: comparison of distinct pointer types lacks a cast
      (void) (&_max1 == &_max2);
    mm/cma.c:186:27: note: in expansion of macro 'max'
      alignment = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
    mm/cma.c: In function 'cma_declare_contiguous':
    include/linux/kernel.h:748:17: warning: comparison of distinct pointer types lacks a cast
      (void) (&_max1 == &_max2);
    include/linux/kernel.h:747:9: note: in definition of macro 'max'
      typeof(y) _max2 = (y);
    mm/cma.c:270:29: note: in expansion of macro 'max'
      (phys_addr_t)PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order));
    include/linux/kernel.h:748:17: warning: comparison of distinct pointer types lacks a cast
      (void) (&_max1 == &_max2);
    include/linux/kernel.h:747:21: note: in definition of macro 'max'
      typeof(y) _max2 = (y);
    mm/cma.c:270:29: note: in expansion of macro 'max'
      (phys_addr_t)PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order));

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/20160526150748.5be38a4f@canb.auug.org.au
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
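
    The change itself is small; roughly, on the lines quoted in the warnings:

        /* before: operands have distinct types, so max() warns */
        alignment = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);

        /* after: force a common type with max_t() */
        alignment = PAGE_SIZE << max_t(unsigned long, MAX_ORDER - 1, pageblock_order);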
     

06 Nov, 2015

1 commit

  • mm/cma.c: In function 'cma_alloc':
    mm/cma.c:366: warning: 'pfn' may be used uninitialized in this function

    The patch actually improves the tracing a bit: if alloc_contig_range()
    fails, tracing will display the offending pfn rather than -1.

    Cc: Stefan Strogin
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Laurent Pinchart
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
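
    A sketch of the kind of change implied, not the exact hunk: pre-initialize
    pfn in cma_alloc() so the compiler can prove it is defined on every path.
    As the entry notes, a side effect is that a failed alloc_contig_range()
    leaves the offending pfn in place for the tracepoint instead of -1.

        /* in cma_alloc(): */
        unsigned long mask, offset;
        unsigned long pfn = -1;
        unsigned long start = 0;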
     

23 Oct, 2015

1 commit

  • This was found during userspace fuzz testing, when a large DMA CMA
    allocation is made by a driver (such as ion) through userspace.

    show_stack+0x10/0x1c
    dump_stack+0x74/0xc8
    kasan_report_error+0x2b0/0x408
    kasan_report+0x34/0x40
    __asan_storeN+0x15c/0x168
    memset+0x20/0x44
    __dma_alloc_coherent+0x114/0x18c

    Signed-off-by: Rohit Vaswani
    Acked-by: Greg Kroah-Hartman
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rohit Vaswani
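
    The underlying fix, sketched from the description (exact prototypes may
    differ between kernel versions), is to carry the page count as size_t all
    the way down instead of narrowing it to int:

        /* before: 'count' narrowed to int on the way down */
        struct page *dma_alloc_from_contiguous(struct device *dev, int count,
                                               unsigned int align);

        /* after: keep the full size_t range */
        struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
                                               unsigned int align);
        struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align);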
     

25 Jun, 2015

2 commits

  • Some high end Intel Xeon systems report uncorrectable memory errors as a
    recoverable machine check. Linux has included code for some time to
    process these and just signal the affected processes (or even recover
    completely if the error was in a read only page that can be replaced by
    reading from disk).

    But we have no recovery path for errors encountered during kernel code
    execution; except for some very specific cases, we are unlikely to ever be
    able to recover.

    Enter memory mirroring. Actually the 3rd generation of memory mirroring.

    Gen1: All memory is mirrored
    Pro: No s/w enabling - h/w just gets good data from other side of the
    mirror
    Con: Halves effective memory capacity available to OS/applications

    Gen2: Partial memory mirror - just mirror memory behind some memory controllers
    Pro: Keep more of the capacity
    Con: Nightmare to enable. Have to choose between allocating from
    mirrored memory for safety vs. NUMA local memory for performance

    Gen3: Address range partial memory mirror - some mirror on each memory
    controller
    Pro: Can tune the amount of mirror and keep NUMA performance
    Con: I have to write memory management code to implement

    The current plan is just to use mirrored memory for kernel allocations.
    This has been broken into two phases:

    1) This patch series - find the mirrored memory, use it for boot time
    allocations

    2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
    unused mirrored memory from mm/memblock.c and only give it out to
    select kernel allocations (this is still being scoped because
    page_alloc.c is scary).

    This patch (of 3):

    Add an extra "flags" field to memblock to allow selection of memory based
    on attribute. No functional changes; a rough sketch of the idea follows
    this entry.

    Signed-off-by: Tony Luck
    Cc: Xishi Qiu
    Cc: Hanjun Guo
    Cc: Xiexiuqi
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Yinghai Lu
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tony Luck
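
    A rough sketch of the idea (illustrative only; the real memblock
    declarations differ in detail, and the mirror flag itself is introduced
    later in the series): each memblock region carries a flags word, and
    callers that allocate or iterate can ask for regions with a given
    attribute.

        /* illustrative flag values and region layout, not the exact upstream code */
        enum {
                MEMBLOCK_NONE   = 0x0,   /* no special request */
                MEMBLOCK_MIRROR = 0x2,   /* region backed by mirrored memory */
        };

        struct memblock_region {
                phys_addr_t base;
                phys_addr_t size;
                unsigned long flags;
        };

        /* skip regions that do not satisfy the requested attribute */
        static bool region_matches(const struct memblock_region *r, unsigned long want)
        {
                return (r->flags & want) == want;
        }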
     
  • Signed-off-by: Shailendra Verma
    Acked-by: Michal Nazarewicz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shailendra Verma
     

16 Apr, 2015

1 commit

  • Add trace events for cma_alloc() and cma_release().

    The cma_alloc tracepoint is used for both successful and failed
    allocations; in case of allocation failure, pfn=-1UL is stored and printed.

    Signed-off-by: Stefan Strogin
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Joonsoo Kim
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Laurent Pinchart
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stefan Strogin
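
    The call sites look roughly like this; the argument lists are sketched
    from the description above rather than from any particular kernel version:

        #include <trace/events/cma.h>

        /* in cma_alloc(): pfn is -1UL when the allocation failed */
        trace_cma_alloc(pfn, page, count, align);

        /* in cma_release(): */
        trace_cma_release(pfn, pages, count);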
     

15 Apr, 2015

3 commits

  • Constify function parameters and use the correct signedness where needed.

    Signed-off-by: Sasha Levin
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Cc: Laurent Pinchart
    Acked-by: Gregory Fong
    Cc: Pintu Kumar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     
  • Provides a userspace interface to trigger a CMA allocation.

    Usage:

    echo [pages] > alloc

    This would provide testing/fuzzing access to the CMA allocation paths.

    Signed-off-by: Sasha Levin
    Acked-by: Joonsoo Kim
    Cc: Marek Szyprowski
    Cc: Laura Abbott
    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
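
    A minimal sketch of how such a debugfs trigger is typically wired up. The
    names are illustrative (cma_alloc_mem in particular is a hypothetical
    helper), not the exact mm/cma_debug.c code:

        #include <linux/debugfs.h>

        static int cma_alloc_write(void *data, u64 val)
        {
                struct cma *cma = data;

                /* allocate 'val' pages from this CMA area for test purposes */
                return cma_alloc_mem(cma, val);   /* hypothetical helper */
        }
        DEFINE_SIMPLE_ATTRIBUTE(cma_alloc_fops, NULL, cma_alloc_write, "%llu\n");

        /* then, for each CMA area's debugfs directory: */
        debugfs_create_file("alloc", S_IWUSR, dir, cma, &cma_alloc_fops);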
     
  • I've noticed that there are no interfaces exposed by CMA which would let me
    fuzz what's going on in there.

    This small patchset exposes some information out to userspace, plus adds
    the ability to trigger allocation and freeing from userspace.

    This patch (of 3):

    Implement a simple debugfs interface to expose information about CMA areas
    in the system.

    Useful for testing/sanity checks for CMA, since it was previously
    impossible to retrieve this information from userspace.

    Signed-off-by: Sasha Levin
    Acked-by: Joonsoo Kim
    Cc: Marek Szyprowski
    Cc: Laura Abbott
    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     

13 Mar, 2015

1 commit

  • The CMA aligned offset calculation is incorrect for non-zero order_per_bit
    values.

    For example, if cma->order_per_bit=1, cma->base_pfn= 0x2f800000 and
    align_order=12, the function returns a value of 0x17c00 instead of 0x400.

    This patch fixes the CMA aligned offset calculation.

    The previous calculation was wrong and returned too-large values for the
    offset, so that when cma_alloc() looks for free pages in the bitmap with a
    requested alignment greater than order_per_bit, it starts too far into the
    bitmap and CMA allocations fail despite there actually being plenty of free
    pages remaining; the result would also probably have the wrong alignment.
    With this change, we get the correct offset into the bitmap.

    One affected user is powerpc KVM, which has kvm_cma->order_per_bit set to
    KVM_CMA_CHUNK_ORDER - PAGE_SHIFT, or 18 - 12 = 6.

    [gregory.0xf0@gmail.com: changelog additions]
    Signed-off-by: Danesh Petigara
    Reviewed-by: Gregory Fong
    Acked-by: Michal Nazarewicz
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Danesh Petigara
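
    The corrected helper, roughly (a sketch based on the description; mm/cma.c
    may differ slightly by version):

        static unsigned long cma_bitmap_aligned_offset(struct cma *cma,
                                                       int align_order)
        {
                if (align_order <= cma->order_per_bit)
                        return 0;
                /* align the pfn first, then convert the delta to bitmap bits */
                return (ALIGN(cma->base_pfn, 1UL << align_order)
                        - cma->base_pfn) >> cma->order_per_bit;
        }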
     

12 Feb, 2015

1 commit

  • The totalcma_pages variable is not updated to account for CMA regions
    defined via device tree reserved-memory sub-nodes. Fix this omission by
    moving the calculation of totalcma_pages into cma_init_reserved_mem()
    instead of cma_declare_contiguous() such that it will include reserved
    memory used by all CMA regions.

    Signed-off-by: George G. Davis
    Cc: Marek Szyprowski
    Acked-by: Michal Nazarewicz
    Cc: Joonsoo Kim
    Cc: "Aneesh Kumar K.V"
    Cc: Laurent Pinchart
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    George G. Davis
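
    Conceptually the accounting just moves into the common path; a sketch,
    assuming the usual mm/cma.c names:

        /* inside cma_init_reserved_mem(), once the area has been set up;
         * this now counts every CMA region, including DT reserved-memory ones */
        totalcma_pages += (size / PAGE_SIZE);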
     

19 Dec, 2014

1 commit

  • When the system boots up, the dmesg log shows the memory statistics along
    with the total reserved memory, as below:

    Memory: 458840k/458840k available, 65448k reserved, 0K highmem

    When CMA is enabled, the total reserved memory still remains the same, even
    though the CMA memory is not really reserved: in /proc/meminfo the CMA
    memory shows up as part of free memory. This creates confusion. This patch
    corrects the problem by properly subtracting the CMA reserved memory from
    the total reserved memory reported in the dmesg log.

    Below is the dmesg snapshot from an arm based device with 512MB RAM and
    12MB single CMA region.

    Before this change:
    Memory: 458840k/458840k available, 65448k reserved, 0K highmem

    After this change:
    Memory: 458840k/458840k available, 53160k reserved, 12288k cma-reserved, 0K highmem

    Signed-off-by: Pintu Kumar
    Signed-off-by: Vishnu Pratap Singh
    Acked-by: Michal Nazarewicz
    Cc: Rafael Aquini
    Cc: Jerome Marchand
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pintu Kumar
     

14 Dec, 2014

2 commits

  • kmemleak will add allocations as objects to a pool. The memory allocated
    for each object in this pool is periodically searched for pointers to
    other allocated objects. This only works for memory that is mapped into
    the kernel's virtual address space, which happens not to be the case for
    most CMA regions.

    Furthermore, CMA regions are typically used to store data transferred to
    or from a device and therefore don't contain pointers to other objects.

    Without this, the kernel crashes on the first execution of scan_gray_list()
    because it tries to access highmem. Perhaps a more appropriate fix would be
    to reject any object that can't map to a kernel virtual address?

    [akpm@linux-foundation.org: add comment]
    [akpm@linux-foundation.org: fix comment, per Catalin]
    [sfr@canb.auug.org.au: include linux/io.h for phys_to_virt()]
    Signed-off-by: Thierry Reding
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Cc: "Aneesh Kumar K.V"
    Cc: Catalin Marinas
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thierry Reding
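
    The essence of the change, as a sketch; the comment paraphrases the
    akpm/Catalin notes above, and the exact placement in cma_declare_contiguous()
    may differ:

        #include <linux/io.h>          /* for phys_to_virt() */
        #include <linux/kmemleak.h>

        /*
         * kmemleak scans/reads tracked objects for pointers to other objects,
         * but this CMA region is typically not mapped (and holds no pointers
         * anyway), so tell kmemleak to ignore it.
         */
        kmemleak_ignore(phys_to_virt(addr));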
     
  • The alignment in cma_alloc() was done w.r.t. the bitmap. This is a
    problem when, for example:

    - a device requires 16M (order 12) alignment
    - the CMA region is not 16 M aligned

    In such a case, the CMA region may start at, say, 0x2f800000, but any
    allocation you make from it will be aligned relative to that base.
    Requesting an allocation of 32 M with 16 M alignment will then result in an
    allocation from 0x2f800000 to 0x31800000, which doesn't work very well if
    your strange device really requires 16 M alignment.

    Change to use bitmap_find_next_zero_area_off() to account for the
    difference in alignment at reserve-time and alloc-time.

    Signed-off-by: Gregory Fong
    Acked-by: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Cc: Kukjin Kim
    Cc: Laurent Pinchart
    Cc: Laura Abbott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gregory Fong
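
    The key difference is that the free-area search now takes the region's
    misalignment into account. A sketch of the alloc-side call, using the
    mm/cma.c names of that era (the surrounding bookkeeping is omitted):

        unsigned long mask = cma_bitmap_aligned_mask(cma, align);
        unsigned long offset = cma_bitmap_aligned_offset(cma, align);

        /* search for a zero area whose *physical* alignment matches the request */
        bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap, bitmap_maxno,
                                                   start, bitmap_count, mask,
                                                   offset);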
     

11 Dec, 2014

1 commit

  • high_memory isn't direct-mapped memory, so retrieving its physical address
    isn't strictly appropriate. But it is useful to check the physical address
    of the highmem boundary, so it's justifiable to get the physical address
    from it. On x86 there is a validation check when CONFIG_DEBUG_VIRTUAL is
    enabled, and it triggers the following boot failure reported by Ingo.

    ...
    BUG: Int 6: CR2 00f06f53
    ...
    Call Trace:
    dump_stack+0x41/0x52
    early_idt_handler+0x6b/0x6b
    cma_declare_contiguous+0x33/0x212
    dma_contiguous_reserve_area+0x31/0x4e
    dma_contiguous_reserve+0x11d/0x125
    setup_arch+0x7b5/0xb63
    start_kernel+0xb8/0x3e6
    i386_start_kernel+0x79/0x7d

    To fix the boot regression, this patch implements a workaround that avoids
    the validation check on x86 when retrieving the physical address of
    high_memory. __pa_nodebug(), used by this patch, is implemented only on
    x86, so there is no choice but to use a dirty #ifdef.

    [akpm@linux-foundation.org: tweak comment]
    Signed-off-by: Joonsoo Kim
    Reported-by: Ingo Molnar
    Tested-by: Ingo Molnar
    Cc: Marek Szyprowski
    Cc: Russell King
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
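
    The workaround in cma_declare_contiguous(), roughly as described above:

        phys_addr_t highmem_start;

        /*
         * high_memory isn't direct-mapped, so x86's CONFIG_DEBUG_VIRTUAL
         * check would trip on __pa(); __pa_nodebug() only exists on x86,
         * hence the #ifdef.
         */
        #ifdef CONFIG_X86
                highmem_start = __pa_nodebug(high_memory);
        #else
                highmem_start = __pa(high_memory);
        #endif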
     

27 Oct, 2014

4 commits

  • Casting physical addresses to unsigned long and using %lu truncates the
    values on systems where physical addresses are larger than 32 bits. Use
    %pa and get rid of the cast instead.

    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Marek Szyprowski

    Laurent Pinchart
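
    The pattern, for illustration (%pa prints a phys_addr_t passed by
    reference):

        phys_addr_t base;

        /* before: truncates on systems with >32-bit physical addresses */
        pr_info("base %lu\n", (unsigned long)base);

        /* after: no cast, no truncation */
        pr_info("base %pa\n", &base);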
     
  • Commit 95b0e655f914 ("ARM: mm: don't limit default CMA region only to
    low memory") extended CMA memory reservation to allow usage of high
    memory. It relied on commit f7426b983a6a ("mm: cma: adjust address limit
    to avoid hitting low/high memory boundary") to ensure that the reserved
    block never crossed the low/high memory boundary. While the
    implementation correctly lowered the limit, it failed to consider the
    case where the base..limit range crossed the low/high memory boundary
    with enough space on each side to reserve the requested size on either
    low or high memory.

    Rework the base and limit adjustment to fix the problem. The function
    now starts by rejecting the reservation altogether for fixed
    reservations that cross the boundary, tries to reserve from high memory
    first and then falls back to low memory.

    Signed-off-by: Laurent Pinchart
    Signed-off-by: Marek Szyprowski

    Laurent Pinchart
     
  • The fixed parameter to cma_declare_contiguous() tells the function
    whether the given base address must be honoured or should be considered
    as a hint only. The API considers a zero base address as meaning any
    base address, which must never be considered as a fixed value.

    Part of the implementation correctly checks both fixed and base != 0,
    but two locations check the fixed value only. Set fixed to false when
    base is 0 to fix that and simplify the code.

    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski

    Laurent Pinchart
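
    The simplification amounts to normalizing the flag early, as the entry
    describes; roughly:

        /* a zero base means "any address", which can never be a fixed request */
        if (!base)
                fixed = false;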
     
  • If activation of the CMA area fails its mutex won't be initialized,
    leading to an oops at allocation time when trying to lock the mutex. Fix
    this by setting the cma area count field to 0 when activation fails,
    leading to allocation returning NULL immediately.

    Cc: # v3.17
    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski

    Laurent Pinchart
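
    A sketch of the failure path described, assuming the usual
    cma_activate_area() layout (not the exact upstream hunk):

        /* on activation failure, neuter the area so cma_alloc() bails out
         * before ever touching the (never-initialized) mutex */
        cma->count = 0;
        return -EINVAL;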
     

14 Oct, 2014

2 commits

  • Add a function to create CMA region from previously reserved memory and
    add support for handling 'shared-dma-pool' reserved-memory device tree
    nodes.

    Based on previous code provided by Josh Cartwright.

    Signed-off-by: Marek Szyprowski
    Cc: Arnd Bergmann
    Cc: Michal Nazarewicz
    Cc: Grant Likely
    Cc: Laura Abbott
    Cc: Josh Cartwright
    Cc: Joonsoo Kim
    Cc: Kyungmin Park
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marek Szyprowski
     
  • The current cma bitmap aligned mask computation is incorrect. It could
    cause an unexpected alignment when using cma_alloc() if the wanted align
    order is larger than cma->order_per_bit.

    Take kvm for example (PAGE_SHIFT = 12): kvm_cma->order_per_bit is set to 6.
    When kvm_alloc_rma() tries to allocate kvm_rma_pages, it uses 15 as the
    expected align value. With the current implementation, however, we get 0 as
    the CMA bitmap aligned mask instead of 511.

    This patch fixes the cma bitmap aligned mask calculation.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Weijie Yang
    Acked-by: Michal Nazarewicz
    Cc: Joonsoo Kim
    Cc: "Aneesh Kumar K.V"
    Cc: [3.17]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
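
    The corrected mask computation, roughly (the shift in the old code should
    have been a subtraction of orders):

        static unsigned long cma_bitmap_aligned_mask(struct cma *cma, int align_order)
        {
                if (align_order <= cma->order_per_bit)
                        return 0;
                /* e.g. align_order = 15, order_per_bit = 6  ->  mask = 511 */
                return (1UL << (align_order - cma->order_per_bit)) - 1;
        }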
     

10 Oct, 2014

1 commit

  • Russell King recently noticed that limiting the default CMA region to low
    memory on the ARM architecture causes serious memory management issues on
    machines having a lot of memory (which is mainly available as high memory).
    More information can be found in the following thread:
    http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/

    These two patches remove this limit, letting the kernel put the default CMA
    region into high memory when this is possible (there is enough high memory
    available and the architecture-specific DMA limit allows it).

    This should solve strange OOM issues on systems with lots of RAM (i.e.
    >1GiB) and large (>256M) CMA area.

    This patch (of 2):

    Automatically allocated regions should not cross the low/high memory
    boundary, because such regions cannot later be initialized correctly since
    they span two memory zones. This patch adds a check for this case and
    simple code for moving the region to low memory if the automatically
    selected address would not fit completely into high memory.

    Signed-off-by: Marek Szyprowski
    Acked-by: Michal Nazarewicz
    Cc: Daniel Drake
    Cc: Minchan Kim
    Cc: Russell King
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marek Szyprowski
     

07 Aug, 2014

4 commits

  • We don't need an explicit 'CMA:' prefix, since we already define the 'cma:'
    prefix in pr_fmt. So remove it.

    Signed-off-by: Joonsoo Kim
    Acked-by: Michal Nazarewicz
    Reviewed-by: Zhang Yanfei
    Cc: "Aneesh Kumar K.V"
    Cc: Alexander Graf
    Cc: Aneesh Kumar K.V
    Cc: Gleb Natapov
    Acked-by: Marek Szyprowski
    Tested-by: Marek Szyprowski
    Cc: Minchan Kim
    Cc: Paolo Bonzini
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
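
    For context, mm/cma.c sets its log prefix via pr_fmt, so a literal prefix
    in the message would print twice; a sketch of the mechanism:

        #define pr_fmt(fmt) "cma: " fmt   /* every pr_*() line is prefixed already */

        /* an explicit "CMA: " literal would come out as "cma: CMA: reserved ...",
         * so the patch simply drops the literal from the messages */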
     
  • Conventionally, we put output parameters at the end of the parameter list
    and put 'base' ahead of 'size', but cma_declare_contiguous() doesn't follow
    that convention, so change it.

    Additionally, move the cma_areas reference code down to the position where
    it is really needed.

    Signed-off-by: Joonsoo Kim
    Acked-by: Michal Nazarewicz
    Reviewed-by: Aneesh Kumar K.V
    Cc: Alexander Graf
    Cc: Aneesh Kumar K.V
    Cc: Gleb Natapov
    Acked-by: Marek Szyprowski
    Tested-by: Marek Szyprowski
    Cc: Minchan Kim
    Cc: Paolo Bonzini
    Cc: Zhang Yanfei
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
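
    The resulting prototype, roughly as described (base before size, output
    parameter last); the exact parameter list may differ by kernel version:

        int __init cma_declare_contiguous(phys_addr_t base, phys_addr_t size,
                                          phys_addr_t limit, phys_addr_t alignment,
                                          unsigned int order_per_bit, bool fixed,
                                          struct cma **res_cma);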
     
  • We can remove one call site for clear_cma_bitmap() if we call it before
    checking the error number.

    Signed-off-by: Joonsoo Kim
    Acked-by: Minchan Kim
    Reviewed-by: Michal Nazarewicz
    Reviewed-by: Zhang Yanfei
    Reviewed-by: Aneesh Kumar K.V
    Cc: Alexander Graf
    Cc: Aneesh Kumar K.V
    Cc: Gleb Natapov
    Acked-by: Marek Szyprowski
    Tested-by: Marek Szyprowski
    Cc: Paolo Bonzini
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Currently, there are two users of the CMA functionality: the DMA subsystem
    and KVM on powerpc. Each has its own code to manage the CMA reserved area,
    even though the two look really similar. My guess is that this is caused by
    differing needs in bitmap management: the KVM side wants to maintain the
    bitmap not per page but at a coarser granularity, and eventually uses a
    bitmap where one bit represents 64 pages.

    Whenever I implement CMA-related patches, I have to change both places to
    apply my change, which is painful. I want to change this situation and
    reduce future code-management overhead through this patch.

    This change could also help developers who want to use CMA in new feature
    development, since they can use CMA easily without copying and pasting the
    reserved-area management code.

    In previous patches, we prepared some features to generalize CMA reserved
    area management, and now it's time to do it. This patch moves the core
    functions to mm/cma.c and changes the DMA APIs to use these functions.

    There is no functional change in the DMA APIs.

    Signed-off-by: Joonsoo Kim
    Acked-by: Michal Nazarewicz
    Acked-by: Zhang Yanfei
    Acked-by: Minchan Kim
    Reviewed-by: Aneesh Kumar K.V
    Cc: Alexander Graf
    Cc: Aneesh Kumar K.V
    Cc: Gleb Natapov
    Acked-by: Marek Szyprowski
    Tested-by: Marek Szyprowski
    Cc: Paolo Bonzini
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim