14 Dec, 2014

2 commits

  • kmemleak will add allocations as objects to a pool. The memory allocated
    for each object in this pool is periodically searched for pointers to
    other allocated objects. This only works for memory that is mapped into
    the kernel's virtual address space, which happens not to be the case for
    most CMA regions.

    Furthermore, CMA regions are typically used to store data transferred to
    or from a device and therefore don't contain pointers to other objects.

    Without this change, the kernel crashes on the first execution of
    scan_gray_list() because it tries to access highmem. Perhaps a more
    appropriate fix would be to reject any object that can't be mapped to a
    kernel virtual address?
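    The fix, roughly, tells kmemleak to skip the CMA region right after it is
    reserved (phys_to_virt() needs linux/io.h, per the sign-off notes). A
    sketch of the relevant hunk:

```c
	/*
	 * kmemleak scans/reads tracked objects for pointers to other
	 * objects, but this address isn't mapped and accessible
	 */
	kmemleak_ignore(phys_to_virt(addr));
```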

    [akpm@linux-foundation.org: add comment]
    [akpm@linux-foundation.org: fix comment, per Catalin]
    [sfr@canb.auug.org.au: include linux/io.h for phys_to_virt()]
    Signed-off-by: Thierry Reding
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Cc: "Aneesh Kumar K.V"
    Cc: Catalin Marinas
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

  • The alignment in cma_alloc() was done w.r.t. the bitmap. This is a
    problem when, for example:

    - a device requires 16M (order 12) alignment
    - the CMA region is not 16M aligned

    In such a case, the CMA region may start at, say, 0x2f800000, but any
    allocation made from it will be aligned relative to that base rather than
    to absolute physical addresses. Requesting a 32M allocation with 16M
    alignment will then result in an allocation from 0x2f800000 to 0x31800000,
    which doesn't work very well if your strange device really requires 16M
    alignment.

    Change to use bitmap_find_next_zero_area_off() to account for the
    difference in alignment at reserve-time and alloc-time.

    Signed-off-by: Gregory Fong
    Acked-by: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Cc: Kukjin Kim
    Cc: Laurent Pinchart
    Cc: Laura Abbott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


11 Dec, 2014

1 commit

    high_memory isn't direct-mapped memory, so retrieving its physical
    address isn't strictly appropriate. But it is useful to check the
    physical address of the highmem boundary, so it is justifiable to derive
    a physical address from it. On x86, there is a validation check when
    CONFIG_DEBUG_VIRTUAL is enabled, and it triggers the following boot
    failure reported by Ingo.

    ...
    BUG: Int 6: CR2 00f06f53
    ...
    Call Trace:
    dump_stack+0x41/0x52
    early_idt_handler+0x6b/0x6b
    cma_declare_contiguous+0x33/0x212
    dma_contiguous_reserve_area+0x31/0x4e
    dma_contiguous_reserve+0x11d/0x125
    setup_arch+0x7b5/0xb63
    start_kernel+0xb8/0x3e6
    i386_start_kernel+0x79/0x7d

    To fix the boot regression, this patch implements a workaround to avoid
    the validation check on x86 when retrieving the physical address of
    high_memory. __pa_nodebug(), used by this patch, is implemented only on
    x86, so there is no choice but to use a dirty #ifdef.
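    The workaround presumably looks something like this (a sketch; the
    variable name assumes the boundary is computed where CMA reserves
    memory):

```c
#ifdef CONFIG_X86
	/* __pa() would trip CONFIG_DEBUG_VIRTUAL for high_memory */
	highmem_start = __pa_nodebug(high_memory);
#else
	highmem_start = __pa(high_memory);
#endif
```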

    [akpm@linux-foundation.org: tweak comment]
    Signed-off-by: Joonsoo Kim
    Reported-by: Ingo Molnar
    Tested-by: Ingo Molnar
    Cc: Marek Szyprowski
    Cc: Russell King
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


27 Oct, 2014

4 commits

  • Casting physical addresses to unsigned long and using %lu truncates the
    values on systems where physical addresses are larger than 32 bits. Use
    %pa and get rid of the cast instead.
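    For example (a hypothetical call site; note that %pa takes a pointer to
    the phys_addr_t rather than the value itself):

```c
	phys_addr_t base;

	/* Truncates on systems with >32-bit physical addresses: */
	pr_info("reserved region at 0x%lx\n", (unsigned long)base);
	/* Prints the full phys_addr_t, passed by reference: */
	pr_info("reserved region at %pa\n", &base);
```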

    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Marek Szyprowski

  • Commit 95b0e655f914 ("ARM: mm: don't limit default CMA region only to
    low memory") extended CMA memory reservation to allow usage of high
    memory. It relied on commit f7426b983a6a ("mm: cma: adjust address limit
    to avoid hitting low/high memory boundary") to ensure that the reserved
    block never crossed the low/high memory boundary. While the
    implementation correctly lowered the limit, it failed to consider the
    case where the base..limit range crossed the low/high memory boundary
    with enough space on each side to reserve the requested size on either
    low or high memory.

    Rework the base and limit adjustment to fix the problem. The function
    now starts by rejecting the reservation altogether for fixed
    reservations that cross the boundary, tries to reserve from high memory
    first and then falls back to low memory.

    Signed-off-by: Laurent Pinchart
    Signed-off-by: Marek Szyprowski

  • The fixed parameter to cma_declare_contiguous() tells the function
    whether the given base address must be honoured or should be considered
    as a hint only. The API considers a zero base address as meaning any
    base address, which must never be considered as a fixed value.

    Part of the implementation correctly checks both fixed and base != 0,
    but two locations check the fixed value only. Set fixed to false when
    base is 0 to fix that and simplify the code.

    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski

  • If activation of the CMA area fails its mutex won't be initialized,
    leading to an oops at allocation time when trying to lock the mutex. Fix
    this by setting the cma area count field to 0 when activation fails,
    leading to allocation returning NULL immediately.

    Cc: # v3.17
    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski


14 Oct, 2014

2 commits

  • Add a function to create CMA region from previously reserved memory and
    add support for handling 'shared-dma-pool' reserved-memory device tree
    nodes.
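    A 'shared-dma-pool' node of the kind this adds support for looks roughly
    like this (sizes illustrative):

```dts
	reserved-memory {
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		linux,cma {
			compatible = "shared-dma-pool";
			reusable;
			size = <0x4000000>;	/* 64 MiB */
			linux,cma-default;
		};
	};
```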

    Based on previous code provided by Josh Cartwright

    Signed-off-by: Marek Szyprowski
    Cc: Arnd Bergmann
    Cc: Michal Nazarewicz
    Cc: Grant Likely
    Cc: Laura Abbott
    Cc: Josh Cartwright
    Cc: Joonsoo Kim
    Cc: Kyungmin Park
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

  • The current cma bitmap aligned mask computation is incorrect. It could
    cause an unexpected alignment when using cma_alloc() if the wanted align
    order is larger than cma->order_per_bit.

    Take KVM for example (PAGE_SHIFT = 12): kvm_cma->order_per_bit is set to
    6. When kvm_alloc_rma() tries to allocate kvm_rma_pages, it uses 15 as
    the expected align value. With the current implementation, however, we
    get 0 as the cma bitmap aligned mask rather than 511.

    This patch fixes the cma bitmap aligned mask calculation.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Weijie Yang
    Acked-by: Michal Nazarewicz
    Cc: Joonsoo Kim
    Cc: "Aneesh Kumar K.V"
    Cc: [3.17]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


10 Oct, 2014

1 commit

    Russell King recently noticed that limiting the default CMA region to
    low memory on the ARM architecture causes serious memory management
    issues on machines with a lot of memory (which is mainly available as
    high memory). More information can be found in the following thread:
    http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/

    These two patches remove this limit, letting the kernel put the default
    CMA region into high memory when possible (there is enough high memory
    available and the architecture-specific DMA limit fits).

    This should solve strange OOM issues on systems with lots of RAM (i.e.
    >1GiB) and large (>256M) CMA area.

    This patch (of 2):

    Automatically allocated regions should not cross the low/high memory
    boundary, because such regions cannot later be correctly initialized, as
    they would span two memory zones. This patch adds a check for this case
    and simple code for moving the region to low memory when the
    automatically selected address might not fit completely into high memory.

    Signed-off-by: Marek Szyprowski
    Acked-by: Michal Nazarewicz
    Cc: Daniel Drake
    Cc: Minchan Kim
    Cc: Russell King
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


07 Aug, 2014

4 commits

    We don't need an explicit 'CMA:' prefix, since we already define the
    prefix 'cma:' in pr_fmt. So remove it.

    Signed-off-by: Joonsoo Kim
    Acked-by: Michal Nazarewicz
    Reviewed-by: Zhang Yanfei
    Cc: "Aneesh Kumar K.V"
    Cc: Alexander Graf
    Cc: Aneesh Kumar K.V
    Cc: Gleb Natapov
    Acked-by: Marek Szyprowski
    Tested-by: Marek Szyprowski
    Cc: Minchan Kim
    Cc: Paolo Bonzini
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Conventionally, we put the output parameter at the end of the parameter
    list and put 'base' ahead of 'size', but cma_declare_contiguous() doesn't
    follow that convention, so change it.

    Additionally, move the cma_areas reference code down to the position
    where it is really needed.

    Signed-off-by: Joonsoo Kim
    Acked-by: Michal Nazarewicz
    Reviewed-by: Aneesh Kumar K.V
    Cc: Alexander Graf
    Cc: Aneesh Kumar K.V
    Cc: Gleb Natapov
    Acked-by: Marek Szyprowski
    Tested-by: Marek Szyprowski
    Cc: Minchan Kim
    Cc: Paolo Bonzini
    Cc: Zhang Yanfei
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    We can remove one call site for clear_cma_bitmap() if we call it before
    checking the error number.

    Signed-off-by: Joonsoo Kim
    Acked-by: Minchan Kim
    Reviewed-by: Michal Nazarewicz
    Reviewed-by: Zhang Yanfei
    Reviewed-by: Aneesh Kumar K.V
    Cc: Alexander Graf
    Cc: Aneesh Kumar K.V
    Cc: Gleb Natapov
    Acked-by: Marek Szyprowski
    Tested-by: Marek Szyprowski
    Cc: Paolo Bonzini
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Currently, there are two users of the CMA functionality: the DMA
    subsystem and KVM on powerpc. Each has its own code to manage the CMA
    reserved area, even though the two look really similar. My guess is that
    this is caused by differing needs in bitmap management: the KVM side
    wants to maintain the bitmap at a granularity of more than one page, and
    eventually uses a bitmap where one bit represents 64 pages.

    Whenever I implement CMA-related patches, I have to change both places
    to apply the change, which is painful. I want to change this situation
    and reduce future code management overhead with this patch.

    This change could also help developers who want to use CMA in new
    feature development, since they can use CMA easily without copying and
    pasting this reserved-area management code.

    In previous patches, we prepared some features to generalize CMA
    reserved area management, and now it's time to do it. This patch moves
    the core functions to mm/cma.c and changes the DMA APIs to use these
    functions.

    There is no functional change in the DMA APIs.

    Signed-off-by: Joonsoo Kim
    Acked-by: Michal Nazarewicz
    Acked-by: Zhang Yanfei
    Acked-by: Minchan Kim
    Reviewed-by: Aneesh Kumar K.V
    Cc: Alexander Graf
    Cc: Aneesh Kumar K.V
    Cc: Gleb Natapov
    Acked-by: Marek Szyprowski
    Tested-by: Marek Szyprowski
    Cc: Paolo Bonzini
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
