06 Apr, 2019

1 commit

  • [ Upstream commit 0d3bd18a5efd66097ef58622b898d3139790aa9d ]

    If cma_init_reserved_mem() fails, the memblock allocated by
    memblock_reserve() or memblock_alloc_range() needs to be freed.
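
    The error path described here can be sketched in userspace as follows.
    This is only an illustration of the pattern: every function below is a
    hypothetical stand-in, not the kernel's memblock or CMA API, and the
    errno value is chosen arbitrarily.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_EINVAL 22 /* illustrative error code */

/* Hypothetical bookkeeping standing in for memblock's reserved regions. */
static size_t reserved_bytes;

static void mock_memblock_reserve(size_t size)
{
    reserved_bytes += size;
}

static void mock_memblock_free(size_t size)
{
    assert(size <= reserved_bytes);
    reserved_bytes -= size;
}

/* Stand-in for cma_init_reserved_mem(): 0 on success, negative on error. */
static int mock_cma_init(bool should_fail)
{
    return should_fail ? -SKETCH_EINVAL : 0;
}

/* Reserve a range, try to set it up, and give it back if setup fails. */
static int declare_contiguous_sketch(size_t size, bool init_fails)
{
    int ret;

    mock_memblock_reserve(size);
    ret = mock_cma_init(init_fails);
    if (ret) {
        /* Without this free, the reservation would be leaked. */
        mock_memblock_free(size);
        return ret;
    }
    return 0;
}

static size_t reserved_now(void)
{
    return reserved_bytes;
}
```

    After a failed call, reserved_now() is back to its previous value; after
    a successful one, the range stays reserved.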

    Quote Catalin's comments:
    https://lkml.org/lkml/2019/2/26/482

    Kmemleak is supposed to work with the memblock_{alloc,free} pair and it
    ignores the memblock_reserve() as a memblock_alloc() implementation
    detail. It is, however, tolerant to memblock_free() being called on
    a sub-range or just a different range from a previous memblock_alloc().
    So the original patch looks fine to me. FWIW:

    Link: http://lkml.kernel.org/r/20190227144631.16708-1-peng.fan@nxp.com
    Signed-off-by: Peng Fan
    Reviewed-by: Catalin Marinas
    Reviewed-by: Mike Rapoport
    Cc: Laura Abbott
    Cc: Joonsoo Kim
    Cc: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Marek Szyprowski
    Cc: Andrey Konovalov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Peng Fan
     

18 Aug, 2018

1 commit

  • cma_alloc() doesn't really support gfp flags other than __GFP_NOWARN, so
    convert gfp_mask parameter to boolean no_warn parameter.

    This avoids giving the false impression that this function supports
    standard gfp flags and that callers can pass __GFP_ZERO to get a zeroed
    buffer, which has already been an issue: see commit dd65a941f6ba ("arm64:
    dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").

    Link: http://lkml.kernel.org/r/20180709122019eucas1p2340da484acfcc932537e6014f4fd2c29~-sqTPJKij2939229392eucas1p2j@eucas1p2.samsung.com
    Signed-off-by: Marek Szyprowski
    Acked-by: Michal Hocko
    Acked-by: Michał Nazarewicz
    Acked-by: Laura Abbott
    Acked-by: Vlastimil Babka
    Reviewed-by: Christoph Hellwig
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marek Szyprowski
     

25 May, 2018

1 commit

  • This reverts the following commits that change CMA design in MM.

    3d2054ad8c2d ("ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y")

    1d47a3ec09b5 ("mm/cma: remove ALLOC_CMA")

    bad8c6c0b114 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")

    Ville reported the following error on i386.

    Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
    microcode: microcode updated early to revision 0x4, date = 2013-06-28
    Initializing CPU#0
    Initializing HighMem for node 0 (000377fe:00118000)
    Initializing Movable for node 0 (00000001:00118000)
    BUG: Bad page state in process swapper pfn:377fe
    page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
    flags: 0x80000000()
    raw: 80000000 00000000 00000000 ffffff80 00000000 00000100 00000200 00000001
    page dumped because: nonzero mapcount
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0-rc5-elk+ #145
    Hardware name: Dell Inc. Latitude E5410/03VXMC, BIOS A15 07/11/2013
    Call Trace:
    dump_stack+0x60/0x96
    bad_page+0x9a/0x100
    free_pages_check_bad+0x3f/0x60
    free_pcppages_bulk+0x29d/0x5b0
    free_unref_page_commit+0x84/0xb0
    free_unref_page+0x3e/0x70
    __free_pages+0x1d/0x20
    free_highmem_page+0x19/0x40
    add_highpages_with_active_regions+0xab/0xeb
    set_highmem_pages_init+0x66/0x73
    mem_init+0x1b/0x1d7
    start_kernel+0x17a/0x363
    i386_start_kernel+0x95/0x99
    startup_32_smp+0x164/0x168

    The reason for this error is that the span of the MOVABLE zone is extended
    to the whole node span for future CMA initialization, and normal memory is
    wrongly freed here. I submitted a fix and it seems to work, but
    another problem appeared.

    It is too late to fix the latter problem, so I decided to revert the
    series.

    Reported-by: Ville Syrjälä
    Acked-by: Laura Abbott
    Acked-by: Michal Hocko
    Cc: Andrew Morton
    Signed-off-by: Joonsoo Kim
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

12 Apr, 2018

1 commit

  • Patch series "mm/cma: manage the memory of the CMA area by using the
    ZONE_MOVABLE", v2.

    0. History

    This patchset is the follow-up of the discussion about the "Introduce
    ZONE_CMA (v7)" [1]. Please reference it if more information is needed.

    1. What does this patch do?

    This patch changes how the memory of the CMA area is managed in the MM
    subsystem. Currently that memory is managed by the zone its pfns belong
    to. However, this approach has some problems, since the MM subsystem
    doesn't have enough logic to handle memory with different
    characteristics in a single zone. To solve this, the patch manages all
    the memory of the CMA area through the MOVABLE zone. From the MM
    subsystem's point of view, memory in the MOVABLE zone and memory of the
    CMA area have the same characteristics, so managing the CMA area through
    the MOVABLE zone should not cause any problem.

    2. Motivation

    There are some problems with the current approach, listed below. Although
    these problems are not inherent and could be fixed without this
    conceptual change, doing so would require adding many hooks to various
    code paths, which would be intrusive to core MM and really error-prone.
    Therefore, I try to solve them with this new approach. The problems of
    the current implementation are as follows.

    o CMA memory utilization

    First, the following is the freepage calculation logic in MM.

    - For movable allocation: freepage = total freepage
    - For unmovable allocation: freepage = total freepage - CMA freepage

    Freepages on the CMA area are used only after the normal freepages in the
    zone that the CMA memory belongs to are exhausted. At that moment the
    number of normal freepages is zero, so

    - For movable allocation: freepage = total freepage = CMA freepage
    - For unmovable allocation: freepage = 0

    If an unmovable allocation comes at this moment, the request would fail
    the watermark check and reclaim is started. After reclaim, normal
    freepages would exist again, so freepages on the CMA areas would not be
    used.
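
    The two freepage rules above amount to the following arithmetic. This is
    a minimal sketch with a hypothetical helper name; it is not the kernel's
    watermark code, only the calculation as the text states it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Free pages a request may count on, per the rules in the text:
 *   movable:   total freepage
 *   unmovable: total freepage - CMA freepage
 */
static unsigned long usable_free_pages(unsigned long total_free,
                                       unsigned long cma_free,
                                       bool movable)
{
    assert(cma_free <= total_free);
    return movable ? total_free : total_free - cma_free;
}
```

    When every remaining free page is a CMA page (total_free == cma_free),
    a movable request still sees all of them while an unmovable request sees
    zero, which is the situation the paragraph above describes.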

    FYI, there is another attempt [2] on lkml that tries to solve this
    problem. And, as far as I know, Qualcomm also has an out-of-tree solution
    for this problem.

    Useless reclaim:

    There is no logic to distinguish CMA pages in the reclaim path. Hence,
    a CMA page is reclaimed even if the system just needs a page that is
    usable for kernel allocation.

    Atomic allocation failure:

    This is also related to the fallback allocation policy for the memory of
    the CMA area. Consider the situation where the number of normal
    freepages is *zero* because a bunch of movable allocation requests
    came in. Kswapd would not be woken up due to the following freepage
    calculation logic.

    - For movable allocation: freepage = total freepage = CMA freepage

    If an atomic unmovable allocation request comes at this moment, it would
    fail due to the following logic.

    - For unmovable allocation: freepage = total freepage - CMA freepage = 0

    It was reported by Aneesh [3].

    Useless compaction:

    A usual high-order allocation request is an unmovable allocation request
    and cannot be served from the memory of the CMA area. In compaction, the
    migration scanner tries to migrate pages in the CMA area and make a
    high-order page there. As mentioned above, that page cannot be used for
    the unmovable allocation request, so it's just a waste.

    3. Current approach and new approach

    The current approach is that the memory of the CMA area is managed by the
    zone its pfns belong to. However, this memory should be
    distinguishable since it has a strong limitation. So, it is marked as
    MIGRATE_CMA in the pageblock flags and handled specially. However,
    as mentioned in section 2, the MM subsystem doesn't have enough logic to
    deal with this special pageblock, so many problems arose.

    The new approach is that the memory of the CMA area is managed by the
    MOVABLE zone. MM already has enough logic to deal with special zones
    such as the HIGHMEM and MOVABLE zones. So, managing the memory of the CMA
    area through the MOVABLE zone works naturally, because the constraint on
    the memory of the CMA area, that it must always be migratable, is the
    same as the constraint for the MOVABLE zone.

    There is one side effect on the usability of the memory of the CMA
    area. The MOVABLE zone may only be used by a request with
    GFP_HIGHMEM && GFP_MOVABLE, so the memory of the CMA area is now also
    restricted to requests with these gfp flags. Before this patchset, a
    request with GFP_MOVABLE could use it. IMO, this would not be a big issue
    since most GFP_MOVABLE requests also have the GFP_HIGHMEM flag, for
    example file cache pages and anonymous pages. However, file cache pages
    for a blockdev file are an exception: requests for them have no
    GFP_HIGHMEM flag. There are pros and cons to this exception. In my
    experience, blockdev file cache pages are one of the top reasons that
    cma_alloc() fails temporarily. So, we get a stronger guarantee of
    cma_alloc() success by discarding this case.

    Note that there is no change from the admin's POV since this patchset is
    just an internal implementation change in the MM subsystem. The one minor
    difference for the admin is that the memory stats for the CMA area will
    be printed in the MOVABLE zone. That's all.

    4. Result

    The following is the experimental result related to the utilization problem.

    8 CPUs, 1024 MB, VIRTUAL MACHINE
    make -j16

    <Before this patch>
    CMA area:        0 MB     512 MB
    Elapsed-time:    92.4     186.5
    pswpin:          82       18647
    pswpout:         160      69839

    <After this patch>
    CMA area:        0 MB     512 MB
    Elapsed-time:    93.1     93.4
    pswpin:          84       46
    pswpout:         183      92

    akpm: "kernel test robot" reported a 26% improvement in
    vm-scalability.throughput:
    http://lkml.kernel.org/r/20180330012721.GA3845@yexl-desktop

    [1]: lkml.kernel.org/r/1491880640-9944-1-git-send-email-iamjoonsoo.kim@lge.com
    [2]: https://lkml.org/lkml/2014/10/15/623
    [3]: http://www.spinics.net/lists/linux-mm/msg100562.html

    Link: http://lkml.kernel.org/r/1512114786-5085-2-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Reviewed-by: Aneesh Kumar K.V
    Tested-by: Tony Lindgren
    Acked-by: Vlastimil Babka
    Cc: Johannes Weiner
    Cc: Laura Abbott
    Cc: Marek Szyprowski
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Michal Nazarewicz
    Cc: Minchan Kim
    Cc: Rik van Riel
    Cc: Russell King
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

06 Apr, 2018

2 commits

    Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious
    reason. It looks like it's only a convenience, so remove kmemleak.h
    from slab.h and add <linux/kmemleak.h> to any users of kmemleak_* that
    don't already #include it. Also remove <linux/kmemleak.h> from source
    files that do not use it.

    This is tested on i386 allmodconfig and x86_64 allmodconfig. It would
    be good to run it through the 0day bot for other $ARCHes. I have
    neither the horsepower nor the storage space for the other $ARCHes.

    Update: This patch has been extensively build-tested by both the 0day
    bot & kisskb/ozlabs build farms. Both of them reported 2 build failures
    for which patches are included here (in v2).

    [ slab.h is the second most used header file after module.h; kernel.h is
    right there with slab.h. There could be some minor error in the
    counting due to some #includes having comments after them and I didn't
    combine all of those. ]

    [akpm@linux-foundation.org: security/keys/big_key.c needs vmalloc.h, per sfr]
    Link: http://lkml.kernel.org/r/e4309f98-3749-93e1-4bb7-d9501a39d015@infradead.org
    Link: http://kisskb.ellerman.id.au/kisskb/head/13396/
    Signed-off-by: Randy Dunlap
    Reviewed-by: Ingo Molnar
    Reported-by: Michael Ellerman [2 build failures]
    Reported-by: Fengguang Wu [2 build failures]
    Reviewed-by: Andrew Morton
    Cc: Wei Yongjun
    Cc: Luis R. Rodriguez
    Cc: Greg Kroah-Hartman
    Cc: Mimi Zohar
    Cc: John Johansen
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Link: http://lkml.kernel.org/r/1519585191-10180-4-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

16 Nov, 2017

1 commit

    It was observed that the cma_alloc failure log uses pr_info instead
    of pr_err. This leads to problems if the printk debug level is set
    below 7: the cma_alloc failure message will not be captured in
    the log, which makes debugging difficult.

    Simply replace the pr_info with pr_err to capture failure log.

    Link: http://lkml.kernel.org/r/1507650633-4430-1-git-send-email-pintu.ping@gmail.com
    Signed-off-by: Pintu Agarwal
    Cc: Laura Abbott
    Cc: Greg Kroah-Hartman
    Cc: Jaewon Kim
    Cc: Doug Berger
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pintu Agarwal
     

14 Oct, 2017

1 commit

  • cma_alloc() unconditionally prints an INFO message when the CMA
    allocation fails. Make this message conditional on the non-presence of
    __GFP_NOWARN in gfp_mask.

    This patch aims at removing INFO messages that are displayed when the
    VC4 driver tries to allocate buffer objects. From the driver
    perspective an allocation failure is acceptable, and the driver can
    possibly do something to make following allocation succeed (like
    flushing the VC4 internal cache).
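
    The conditional message can be sketched as below. This is a userspace
    illustration only: SKETCH_GFP_NOWARN is a made-up bit value standing in
    for __GFP_NOWARN, the message wording is illustrative, and the counter
    exists purely so the behaviour can be observed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag bit; the real __GFP_NOWARN value is kernel-internal. */
#define SKETCH_GFP_NOWARN 0x200u

static int failure_messages;

/* Report an allocation failure unless the caller asked for silence. */
static void report_alloc_failure(unsigned int gfp_mask,
                                 unsigned long count, int ret)
{
    if (gfp_mask & SKETCH_GFP_NOWARN)
        return;

    fprintf(stderr, "alloc failed, req-size: %lu pages, ret: %d\n",
            count, ret);
    failure_messages++;
}

static int messages_emitted(void)
{
    return failure_messages;
}
```

    A caller that can recover on its own, as the VC4 driver is said to,
    would pass the no-warn bit and get no message.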

    Link: http://lkml.kernel.org/r/20171004125447.15195-1-boris.brezillon@free-electrons.com
    Signed-off-by: Boris Brezillon
    Acked-by: Laura Abbott
    Cc: Jaewon Kim
    Cc: David Airlie
    Cc: Daniel Vetter
    Cc: Eric Anholt
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Boris Brezillon
     

11 Jul, 2017

2 commits

  • The align_offset parameter is used by bitmap_find_next_zero_area_off()
    to represent the offset of map's base from the previous alignment
    boundary; the function ensures that the returned index, plus the
    align_offset, honors the specified align_mask.

    The logic introduced by commit b5be83e308f7 ("mm: cma: align to physical
    address, not CMA region position") has the cma driver calculate the
    offset to the *next* alignment boundary. In most cases, the base
    alignment is greater than that specified when making allocations,
    resulting in a zero offset whether we align up or down. In the example
    given with the commit, the base alignment (8MB) was half the requested
    alignment (16MB) so the math also happened to work since the offset is
    8MB in both directions. However, when requesting allocations with an
    alignment greater than twice that of the base, the returned index would
    not be correctly aligned.
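
    The difference between the two offsets can be checked with a small
    userspace sketch. Assumptions: order_per_bit is 0, pages are 4 KiB, the
    base pfn 0x2f800 (an 8MB-aligned base) is borrowed from another entry in
    this log, and first_index() is a hypothetical helper that only models
    the contract stated above (returned index plus align_offset is aligned).

```c
#include <assert.h>

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
    assert(a && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Offset to the *next* alignment boundary (the logic being fixed). */
static unsigned long offset_to_next(unsigned long base_pfn, unsigned int order)
{
    return align_up(base_pfn, 1UL << order) - base_pfn;
}

/* Offset from the *previous* alignment boundary (what the search expects). */
static unsigned long offset_from_prev(unsigned long base_pfn, unsigned int order)
{
    return base_pfn & ((1UL << order) - 1);
}

/* Lowest index such that index + off is a multiple of 1 << order. */
static unsigned long first_index(unsigned long off, unsigned int order)
{
    return align_up(off, 1UL << order) - off;
}
```

    For a 16MB request (order 12) both offsets are 0x800, so the bug stays
    hidden. For a 32MB request (order 13) they are 0x800 and 0x1800, and
    only the second one makes base_pfn + first_index() land on a 32MB
    boundary.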

    Also, the align_order arguments of cma_bitmap_aligned_mask() and
    cma_bitmap_aligned_offset() should not be negative so the argument type
    was made unsigned.

    Fixes: b5be83e308f7 ("mm: cma: align to physical address, not CMA region position")
    Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com
    Signed-off-by: Angus Clark
    Signed-off-by: Doug Berger
    Acked-by: Gregory Fong
    Cc: Doug Berger
    Cc: Angus Clark
    Cc: Laura Abbott
    Cc: Vlastimil Babka
    Cc: Greg Kroah-Hartman
    Cc: Lucas Stach
    Cc: Catalin Marinas
    Cc: Shiraz Hashim
    Cc: Jaewon Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Berger
     
  • While activating a CMA area we check to make sure that all the PFNs in
    the range are inside the same zone. This is a requirement for
    alloc_contig_range() to work. Any CMA area failing the check is
    disabled for good. This happens silently right now, making the failure of
    all future cma_alloc() allocations inevitable.

    Here we add an error message stating that the CMA area could not be
    activated which makes it easier to explain any future cma_alloc()
    failures on it. While in there, change the bail out goto label from
    'err' to 'not_in_zone' which makes more sense.
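
    A rough userspace model of the check, assuming each pfn's zone is given
    as a plain integer id in an array. The function name and the message
    text are illustrative, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * zone_of[i] is the (hypothetical) zone id of the i-th pfn in the area.
 * Activation succeeds only if every pfn is in the same zone as the first.
 */
static bool activate_area_sketch(const int *zone_of, size_t count)
{
    size_t i;

    assert(count > 0);
    for (i = 1; i < count; i++) {
        if (zone_of[i] != zone_of[0])
            goto not_in_zone;
    }
    return true;

not_in_zone:
    /* Say why the area is unusable instead of failing silently. */
    fprintf(stderr, "CMA area could not be activated\n");
    return false;
}
```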

    Link: http://lkml.kernel.org/r/20170605023729.26303-1-khandual@linux.vnet.ibm.com
    Signed-off-by: Anshuman Khandual
    Cc: "Aneesh Kumar K.V"
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anshuman Khandual
     

19 Apr, 2017

2 commits


25 Feb, 2017

3 commits

    There are many possible reasons for a CMA allocation failure, such as
    EBUSY, ENOMEM, and EINTR, but so far the error reason was not reported.
    This patch prints the error value.

    Additionally, if CONFIG_CMA_DEBUG is enabled, this patch shows the bitmap
    status so that the available pages are known. CMA internally tries all
    available regions, because some regions can fail with EBUSY.
    The bitmap status is useful for understanding both ENOMEM and EBUSY in
    detail:

    ENOMEM: not tried at all because there is no available region;
            the total region could be too small, or it could be a
            fragmentation issue
    EBUSY: some regions were tried but all failed
    This is an ENOMEM example with this patch.

    [2: Binder:714_1: 744] cma: cma_alloc: alloc failed, req-size: 256 pages, ret: -12

    If CONFIG_CMA_DEBUG is enabled, available pages will also be shown in a
    concatenated size@position format. So 4@572 means that there are 4
    available pages at position 572, counting from position 0.

    [2: Binder:714_1: 744] cma: number of available pages: 4@572+7@585+7@601+8@632+38@730+166@1114+127@1921=> 357 free of 2048 total pages
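
    The size@position format can be reproduced with a short userspace
    sketch. It assumes one bool per page (true = allocated) rather than a
    packed bitmap, and format_free_runs() is a hypothetical helper, not the
    function this patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the free runs of `used` as "len@start+len@start..." into `out`
 * and return the total number of free entries.
 */
static size_t format_free_runs(const bool *used, size_t nbits,
                               char *out, size_t outlen)
{
    size_t free_total = 0, pos = 0, i = 0;

    assert(outlen > 0);
    out[0] = '\0';

    while (i < nbits) {
        size_t start, len;

        if (used[i]) {
            i++;
            continue;
        }
        start = i;
        while (i < nbits && !used[i])
            i++;
        len = i - start;
        free_total += len;

        /* Append only while there is room; output may be truncated. */
        if (pos < outlen) {
            int n = snprintf(out + pos, outlen - pos, "%s%zu@%zu",
                             pos ? "+" : "", len, start);
            if (n > 0)
                pos += (size_t)n;
        }
    }
    return free_total;
}
```

    The returned total corresponds to the "free of ... total pages" figure
    at the end of the log line.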

    Link: http://lkml.kernel.org/r/1485909785-3952-1-git-send-email-jaewon31.kim@samsung.com
    Signed-off-by: Jaewon Kim
    Acked-by: Michal Nazarewicz
    Cc: Laura Abbott
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jaewon Kim
     
  • Most users of this interface just want to use it with the default
    GFP_KERNEL flags, but for cases where DMA memory is allocated it may be
    called from a different context.

    No functional change yet, just passing through the flag to the
    underlying alloc_contig_range function.

    Link: http://lkml.kernel.org/r/20170127172328.18574-2-l.stach@pengutronix.de
    Signed-off-by: Lucas Stach
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Radim Krcmar
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Chris Zankel
    Cc: Ralf Baechle
    Cc: Paolo Bonzini
    Cc: Alexander Graf
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lucas Stach
     
  • Currently alloc_contig_range assumes that the compaction should be done
    with the default GFP_KERNEL flags. This is probably right for all
    current uses of this interface, but may change as CMA is used in more
    use-cases (including being the default DMA memory allocator on some
    platforms).

    Change the function prototype, to allow for passing through the GFP mask
    set by upper layers.

    Also respect global restrictions by applying memalloc_noio_flags to the
    passed in flags.

    Link: http://lkml.kernel.org/r/20170127172328.18574-1-l.stach@pengutronix.de
    Signed-off-by: Lucas Stach
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Radim Krcmar
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Chris Zankel
    Cc: Ralf Baechle
    Cc: Paolo Bonzini
    Cc: Alexander Graf
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lucas Stach
     

11 Jan, 2017

1 commit

  • 6b101e2a3ce4 ("mm/CMA: fix boot regression due to physical address of
    high_memory") added checks to use __pa_nodebug on x86 since
    CONFIG_DEBUG_VIRTUAL complains about high_memory not being linearly
    mapped. arm64 is now getting support for CONFIG_DEBUG_VIRTUAL as well.
    Rather than add an explosion of arches to the #ifdef, switch to an
    alternate method to calculate the physical start of highmem using
    the page before highmem starts. This avoids the need for the #ifdef and
    extra __pa_nodebug calls.

    Reviewed-by: Mark Rutland
    Tested-by: Mark Rutland
    Signed-off-by: Laura Abbott
    Signed-off-by: Will Deacon

    Laura Abbott
     

12 Nov, 2016

1 commit

    A CMA allocation request size is represented by size_t, which gets
    truncated when it is passed as int to bitmap_find_next_zero_area_off.

    We observed during fuzz testing that when a cma allocation request is too
    large, bitmap_find_next_zero_area_off still returns success due to the
    truncation. This leads to a kernel crash, as subsequent code assumes that
    the requested memory is available.

    Fail the cma allocation if the request exceeds the corresponding cma
    region size.
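
    A minimal sketch of the truncation and of the guard. Assumptions: the
    count is modelled as uint64_t and narrowed to uint32_t so the result is
    well defined in portable C (the text describes a size_t narrowed to
    int), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What a 64-bit page count looks like once narrowed to 32 bits. */
static uint32_t narrowed_count(uint64_t count)
{
    return (uint32_t)count;
}

/* Guard applied before searching: reject requests larger than the area. */
static bool request_fits(uint64_t bitmap_count, uint64_t bitmap_maxno)
{
    return bitmap_count <= bitmap_maxno;
}
```

    A request for 0x100000001 pages narrows to 1, which a bitmap search
    would happily satisfy; comparing the full-width count against the area
    size first rejects it.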

    Link: http://lkml.kernel.org/r/1478189211-3467-1-git-send-email-shashim@codeaurora.org
    Signed-off-by: Shiraz Hashim
    Cc: Catalin Marinas
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shiraz Hashim
     

12 Oct, 2016

1 commit

  • Some of the kmemleak_*() callbacks in memblock, bootmem, CMA convert a
    physical address to a virtual one using __va(). However, such physical
    addresses may sometimes be located in highmem and using __va() is
    incorrect, leading to inconsistent object tracking in kmemleak.

    The following functions have been added to the kmemleak API and they take
    a physical address as the object pointer. They only perform the
    corresponding action if the address has a lowmem mapping:

    kmemleak_alloc_phys
    kmemleak_free_part_phys
    kmemleak_not_leak_phys
    kmemleak_ignore_phys

    The affected calling places have been updated to use the new kmemleak
    API.

    Link: http://lkml.kernel.org/r/1471531432-16503-1-git-send-email-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Reported-by: Vignesh R
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

28 May, 2016

1 commit

  • pageblock_order can be (at least) an unsigned int or an unsigned long
    depending on the kernel config and architecture, so use max_t(unsigned
    long, ...) when comparing it.

    Fixes these warnings:

    In file included from include/asm-generic/bug.h:13:0,
                     from arch/powerpc/include/asm/bug.h:127,
                     from include/linux/bug.h:4,
                     from include/linux/mmdebug.h:4,
                     from include/linux/mm.h:8,
                     from include/linux/memblock.h:18,
                     from mm/cma.c:28:
    mm/cma.c: In function 'cma_init_reserved_mem':
    include/linux/kernel.h:748:17: warning: comparison of distinct pointer types lacks a cast
      (void) (&_max1 == &_max2);
                     ^
    mm/cma.c:186:27: note: in expansion of macro 'max'
      alignment = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
                               ^
    mm/cma.c: In function 'cma_declare_contiguous':
    include/linux/kernel.h:748:17: warning: comparison of distinct pointer types lacks a cast
      (void) (&_max1 == &_max2);
                     ^
    include/linux/kernel.h:747:9: note: in definition of macro 'max'
      typeof(y) _max2 = (y);
             ^
    mm/cma.c:270:29: note: in expansion of macro 'max'
      (phys_addr_t)PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order));
                                ^
    include/linux/kernel.h:748:17: warning: comparison of distinct pointer types lacks a cast
      (void) (&_max1 == &_max2);
                     ^
    include/linux/kernel.h:747:21: note: in definition of macro 'max'
      typeof(y) _max2 = (y);
                         ^
    mm/cma.c:270:29: note: in expansion of macro 'max'
      (phys_addr_t)PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order));
                                ^
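
    The idea behind max_t() is to cast both operands to one named type
    before comparing, so mismatched operand types no longer matter. The
    macro below is a simplified stand-in written for this sketch: unlike the
    kernel's max_t() it evaluates its arguments more than once, and the
    function around it is hypothetical.

```c
#include <assert.h>

/* Simplified illustration of a type-forcing max; not the kernel macro. */
#define MAX_T_SKETCH(type, a, b) \
    ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

/*
 * Pick the larger order when one operand is an int expression and the
 * other is unsigned, as can happen with MAX_ORDER - 1 and pageblock_order.
 */
static unsigned long alignment_order(int max_order_minus_1,
                                     unsigned int pageblock_order)
{
    assert(max_order_minus_1 >= 0);
    return MAX_T_SKETCH(unsigned long, max_order_minus_1, pageblock_order);
}
```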

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/20160526150748.5be38a4f@canb.auug.org.au
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     

06 Nov, 2015

1 commit

  • mm/cma.c: In function 'cma_alloc':
    mm/cma.c:366: warning: 'pfn' may be used uninitialized in this function

    The patch actually improves the tracing a bit: if alloc_contig_range()
    fails, tracing will display the offending pfn rather than -1.

    Cc: Stefan Strogin
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Laurent Pinchart
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

23 Oct, 2015

1 commit

    This was found during a userspace fuzzing test when a large dma cma
    allocation is made by a driver (like ion) through userspace.

    show_stack+0x10/0x1c
    dump_stack+0x74/0xc8
    kasan_report_error+0x2b0/0x408
    kasan_report+0x34/0x40
    __asan_storeN+0x15c/0x168
    memset+0x20/0x44
    __dma_alloc_coherent+0x114/0x18c

    Signed-off-by: Rohit Vaswani
    Acked-by: Greg Kroah-Hartman
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rohit Vaswani
     

25 Jun, 2015

2 commits

  • Some high end Intel Xeon systems report uncorrectable memory errors as a
    recoverable machine check. Linux has included code for some time to
    process these and just signal the affected processes (or even recover
    completely if the error was in a read only page that can be replaced by
    reading from disk).

    But we have no recovery path for errors encountered during kernel code
    execution. Except for some very specific cases, we are unlikely to ever
    be able to recover.

    Enter memory mirroring. Actually, the 3rd generation of memory mirroring.

    Gen1: All memory is mirrored
      Pro: No s/w enabling - h/w just gets good data from other side of the
           mirror
      Con: Halves effective memory capacity available to OS/applications

    Gen2: Partial memory mirror - just mirror memory behind some memory
          controllers
      Pro: Keep more of the capacity
      Con: Nightmare to enable. Have to choose between allocating from
           mirrored memory for safety vs. NUMA local memory for performance

    Gen3: Address range partial memory mirror - some mirror on each memory
          controller
      Pro: Can tune the amount of mirror and keep NUMA performance
      Con: I have to write memory management code to implement

    The current plan is just to use mirrored memory for kernel allocations.
    This has been broken into two phases:

    1) This patch series - find the mirrored memory, use it for boot time
    allocations

    2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
    unused mirrored memory from mm/memblock.c and only give it out to
    select kernel allocations (this is still being scoped because
    page_alloc.c is scary).

    This patch (of 3):

    Add extra "flags" to memblock to allow selection of memory based on
    attribute. No functional changes

    Signed-off-by: Tony Luck
    Cc: Xishi Qiu
    Cc: Hanjun Guo
    Cc: Xiexiuqi
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Yinghai Lu
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tony Luck
     
  • Signed-off-by: Shailendra Verma
    Acked-by: Michal Nazarewicz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shailendra Verma
     

16 Apr, 2015

1 commit

  • Add trace events for cma_alloc() and cma_release().

    The cma_alloc tracepoint is used for both successful and failed
    allocations; in case of allocation failure, pfn=-1UL is stored and printed.

    Signed-off-by: Stefan Strogin
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Joonsoo Kim
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Laurent Pinchart
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stefan Strogin
     

15 Apr, 2015

3 commits

  • Constify function parameters and use correct signedness where needed.

    Signed-off-by: Sasha Levin
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Cc: Laurent Pinchart
    Acked-by: Gregory Fong
    Cc: Pintu Kumar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     
  • Provides a userspace interface to trigger a CMA allocation.

    Usage:

    echo [pages] > alloc

    This would provide testing/fuzzing access to the CMA allocation paths.

    Signed-off-by: Sasha Levin
    Acked-by: Joonsoo Kim
    Cc: Marek Szyprowski
    Cc: Laura Abbott
    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     
  • I've noticed that there are no interfaces exposed by CMA which would let me
    fuzz what's going on in there.

    This small patchset exposes some information out to userspace, plus adds
    the ability to trigger allocation and freeing from userspace.

    This patch (of 3):

    Implement a simple debugfs interface to expose information about CMA areas
    in the system.

    Useful for testing/sanity checks for CMA since it was previously
    impossible to retrieve this information in userspace.

    Signed-off-by: Sasha Levin
    Acked-by: Joonsoo Kim
    Cc: Marek Szyprowski
    Cc: Laura Abbott
    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     

13 Mar, 2015

1 commit

  • The CMA aligned offset calculation is incorrect for non-zero order_per_bit
    values.

    For example, if cma->order_per_bit=1, cma->base_pfn= 0x2f800000 and
    align_order=12, the function returns a value of 0x17c00 instead of 0x400.

    This patch fixes the CMA aligned offset calculation.

    The previous calculation was wrong and would return too-large values for
    the offset, so that when cma_alloc looks for free pages in the bitmap with
    the requested alignment > order_per_bit, it starts too far into the bitmap
    and so CMA allocations will fail despite there actually being plenty of
    free pages remaining. It will also probably have the wrong alignment.
    With this change, we will get the correct offset into the bitmap.

    One affected user is powerpc KVM, which has kvm_cma->order_per_bit set to
    KVM_CMA_CHUNK_ORDER - PAGE_SHIFT, or 18 - 12 = 6.
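
    The numbers in the example can be reproduced in userspace. The two
    formulas below are my reading of the calculation before and after this
    fix, written as hypothetical standalone functions. Note that the quoted
    results come out when the base pfn is 0x2f800, i.e. a physical base of
    0x2f800000 with 4 KiB pages; that interpretation is an assumption.

```c
#include <assert.h>

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
    assert(a && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Assumed pre-fix formula: mixes pfn units with bitmap-bit units. */
static unsigned long offset_before_fix(unsigned long base_pfn,
                                       unsigned int order_per_bit,
                                       unsigned int align_order)
{
    unsigned long alignment;

    if (align_order <= order_per_bit)
        return 0;
    alignment = 1UL << (align_order - order_per_bit);
    return align_up(base_pfn, alignment) - (base_pfn >> order_per_bit);
}

/* Assumed post-fix formula: compute in pfns, then convert to bitmap bits. */
static unsigned long offset_after_fix(unsigned long base_pfn,
                                      unsigned int order_per_bit,
                                      unsigned int align_order)
{
    if (align_order <= order_per_bit)
        return 0;
    return (align_up(base_pfn, 1UL << align_order) - base_pfn)
            >> order_per_bit;
}
```

    With order_per_bit=1 and align_order=12, the first function yields
    0x17c00 and the second 0x400, matching the values quoted above. The
    11 Jul, 2017 entry in this log later revises which alignment boundary
    the offset is measured from.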

    [gregory.0xf0@gmail.com: changelog additions]
    Signed-off-by: Danesh Petigara
    Reviewed-by: Gregory Fong
    Acked-by: Michal Nazarewicz
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Danesh Petigara
     

12 Feb, 2015

1 commit

  • The totalcma_pages variable is not updated to account for CMA regions
    defined via device tree reserved-memory sub-nodes. Fix this omission by
    moving the calculation of totalcma_pages into cma_init_reserved_mem()
    instead of cma_declare_contiguous() such that it will include reserved
    memory used by all CMA regions.

    Signed-off-by: George G. Davis
    Cc: Marek Szyprowski
    Acked-by: Michal Nazarewicz
    Cc: Joonsoo Kim
    Cc: "Aneesh Kumar K.V"
    Cc: Laurent Pinchart
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    George G. Davis
     

19 Dec, 2014

1 commit

  • When the system boots up, the dmesg logs show the memory
    statistics along with the total reserved memory, as below:

    Memory: 458840k/458840k available, 65448k reserved, 0K highmem

    When CMA is enabled, the total reserved memory still remains the same.
    However, the CMA memory is not considered as reserved. But in
    /proc/meminfo, the CMA memory is part of free memory. This creates
    confusion. This patch corrects the problem by properly subtracting the
    CMA reserved memory from the total reserved memory in dmesg logs.

    Below is the dmesg snapshot from an arm based device with 512MB RAM and
    12MB single CMA region.

    Before this change:
    Memory: 458840k/458840k available, 65448k reserved, 0K highmem

    After this change:
    Memory: 458840k/458840k available, 53160k reserved, 12288k cma-reserved, 0K highmem

    Signed-off-by: Pintu Kumar
    Signed-off-by: Vishnu Pratap Singh
    Acked-by: Michal Nazarewicz
    Cc: Rafael Aquini
    Cc: Jerome Marchand
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pintu Kumar
     

14 Dec, 2014

2 commits

  • kmemleak will add allocations as objects to a pool. The memory allocated
    for each object in this pool is periodically searched for pointers to
    other allocated objects. This only works for memory that is mapped into
    the kernel's virtual address space, which happens not to be the case for
    most CMA regions.

    Furthermore, CMA regions are typically used to store data transferred to
    or from a device and therefore don't contain pointers to other objects.

    Without this, the kernel crashes on the first execution of the
    scan_gray_list() because it tries to access highmem. Perhaps a more
    appropriate fix would be to reject any object that can't map to a kernel
    virtual address?

    [akpm@linux-foundation.org: add comment]
    [akpm@linux-foundation.org: fix comment, per Catalin]
    [sfr@canb.auug.org.au: include linux/io.h for phys_to_virt()]
    Signed-off-by: Thierry Reding
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Cc: "Aneesh Kumar K.V"
    Cc: Catalin Marinas
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thierry Reding
     
  • The alignment in cma_alloc() was done w.r.t. the bitmap. This is a
    problem when, for example:

    - a device requires 16M (order 12) alignment
    - the CMA region is not 16 M aligned

    In such a case, the CMA region can end up starting at, say, 0x2f800000,
    and any allocation made from it is aligned relative to that start.
    Requesting an allocation of 32 M with 16 M alignment will result in an
    allocation from 0x2f800000 to 0x31800000, which doesn't work very well
    if your strange device requires 16M alignment.

    Change to use bitmap_find_next_zero_area_off() to account for the
    difference in alignment at reserve-time and alloc-time.
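
    The idea of searching with an alignment offset can be sketched in user
    space as follows. This is an illustration only, not the kernel helper:
    the function name is hypothetical, and indices are counted in pages
    from the start of the region.

```c
#include <assert.h>

/*
 * Sketch only: smallest index >= start such that (index + align_offset)
 * is a multiple of (align_mask + 1). align_mask must be 2^n - 1, and
 * align_offset is how far the region base lies past an aligned boundary.
 */
unsigned long next_aligned_index(unsigned long start,
                                 unsigned long align_mask,
                                 unsigned long align_offset)
{
    assert((align_mask & (align_mask + 1)) == 0);

    /* Round up in absolute terms, then convert back to a region index. */
    return ((start + align_offset + align_mask) & ~align_mask) - align_offset;
}
```

    For the region above, assuming 4 KiB pages, the base PFN is 0x2f800, the
    mask for 16M alignment is 0xfff and the offset is 0x800. The search then
    starts at index 0x800, i.e. PFN 0x30000 or address 0x30000000, instead
    of index 0.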

    Signed-off-by: Gregory Fong
    Acked-by: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Joonsoo Kim
    Cc: Kukjin Kim
    Cc: Laurent Pinchart
    Cc: Laura Abbott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gregory Fong
     

11 Dec, 2014

1 commit

  • high_memory isn't direct mapped memory so retrieving its physical address
    isn't appropriate. But, it would be useful to check the physical address
    of the highmem boundary so it's justifiable to get the physical address
    from it. On x86, there is a validation check if CONFIG_DEBUG_VIRTUAL is
    enabled, and it triggers the following boot failure reported by Ingo.

    ...
    BUG: Int 6: CR2 00f06f53
    ...
    Call Trace:
    dump_stack+0x41/0x52
    early_idt_handler+0x6b/0x6b
    cma_declare_contiguous+0x33/0x212
    dma_contiguous_reserve_area+0x31/0x4e
    dma_contiguous_reserve+0x11d/0x125
    setup_arch+0x7b5/0xb63
    start_kernel+0xb8/0x3e6
    i386_start_kernel+0x79/0x7d

    To fix the boot regression, this patch implements a workaround to avoid
    the validation check on x86 when retrieving the physical address of
    high_memory. __pa_nodebug(), used by this patch, is implemented only on
    x86, so there is no choice but to use a dirty #ifdef.

    [akpm@linux-foundation.org: tweak comment]
    Signed-off-by: Joonsoo Kim
    Reported-by: Ingo Molnar
    Tested-by: Ingo Molnar
    Cc: Marek Szyprowski
    Cc: Russell King
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

27 Oct, 2014

4 commits

  • Casting physical addresses to unsigned long and using %lu truncates the
    values on systems where physical addresses are larger than 32 bits. Use
    %pa and get rid of the cast instead.
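
    The truncation can be reproduced in user space. The sketch below is an
    illustration only: the function names are hypothetical, uint32_t stands
    in for a 32-bit unsigned long, and PRIx64 is used for printing because
    %pa is specific to the kernel's printk().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Sketch only: what a cast to a 32-bit type does to a 64-bit address. */
uint32_t truncate_to_32bit(uint64_t phys)
{
    return (uint32_t)phys;
}

/* Print the full and the truncated value side by side. */
void show_truncation(uint64_t phys)
{
    printf("full: 0x%" PRIx64 " truncated: 0x%" PRIx32 "\n",
           phys, truncate_to_32bit(phys));
}
```

    Any address at or above 4 GiB loses its upper bits in the cast, so the
    printed value no longer identifies the region.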

    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Marek Szyprowski

    Laurent Pinchart
     
  • Commit 95b0e655f914 ("ARM: mm: don't limit default CMA region only to
    low memory") extended CMA memory reservation to allow usage of high
    memory. It relied on commit f7426b983a6a ("mm: cma: adjust address limit
    to avoid hitting low/high memory boundary") to ensure that the reserved
    block never crossed the low/high memory boundary. While the
    implementation correctly lowered the limit, it failed to consider the
    case where the base..limit range crossed the low/high memory boundary
    with enough space on each side to reserve the requested size on either
    low or high memory.

    Rework the base and limit adjustment to fix the problem. The function
    now starts by rejecting the reservation altogether for fixed
    reservations that cross the boundary, tries to reserve from high memory
    first and then falls back to low memory.
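
    The order of decisions described above can be sketched as follows. This
    is a simplified user-space model, not the kernel implementation: both
    function names are hypothetical, the whole range is assumed to be free,
    placement is top-down, and 0 is used to signal failure (so the sketch
    assumes a non-zero lower bound).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a top-down allocator: highest aligned
 * address in [lo, hi) with room for size bytes, or 0 if none fits.
 */
uint64_t place_top_down(uint64_t size, uint64_t align,
                        uint64_t lo, uint64_t hi)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    if (hi < lo || hi - lo < size)
        return 0;

    uint64_t addr = (hi - size) & ~(align - 1);
    return addr >= lo ? addr : 0;
}

/* Sketch of the base/limit handling around the low/high boundary. */
uint64_t choose_cma_base(uint64_t base, uint64_t size, uint64_t limit,
                         uint64_t align, uint64_t highmem_start, bool fixed)
{
    uint64_t addr = 0;

    if (fixed) {
        /* A fixed reservation must not straddle the boundary. */
        if (base < highmem_start && base + size > highmem_start)
            return 0;
        return base;
    }

    /* Range spans the boundary: try the high memory part first. */
    if (base < highmem_start && limit > highmem_start) {
        addr = place_top_down(size, align, highmem_start, limit);
        limit = highmem_start;
    }

    /* Fall back to the (possibly shrunk) remaining range. */
    if (!addr)
        addr = place_top_down(size, align, base, limit);

    return addr;
}
```

    Lowering the limit to the boundary before the fallback is what keeps
    the second attempt entirely within low memory.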

    Signed-off-by: Laurent Pinchart
    Signed-off-by: Marek Szyprowski

    Laurent Pinchart
     
  • The fixed parameter to cma_declare_contiguous() tells the function
    whether the given base address must be honoured or should be considered
    as a hint only. The API considers a zero base address as meaning any
    base address, which must never be considered as a fixed value.

    Part of the implementation correctly checks both fixed and base != 0,
    but two locations check the fixed value only. Set fixed to false when
    base is 0 to fix that and simplify the code.

    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski

    Laurent Pinchart
     
  • If activation of the CMA area fails its mutex won't be initialized,
    leading to an oops at allocation time when trying to lock the mutex. Fix
    this by setting the cma area count field to 0 when activation fails,
    leading to allocation returning NULL immediately.
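
    The guard can be pictured with a small user-space sketch. The struct
    and function names are hypothetical and the lock is left out entirely;
    the point is only that a zero count makes allocation bail out before
    anything uninitialized is touched.

```c
#include <stddef.h>

/* Sketch only: a region descriptor where count == 0 means "unusable". */
struct cma_area_sketch {
    unsigned long count;    /* number of pages in the region */
};

/* On activation failure, mark the area unusable. */
int activate_sketch(struct cma_area_sketch *area, int activation_ok)
{
    if (!activation_ok) {
        area->count = 0;
        return -1;
    }
    return 0;
}

/* Allocation returns NULL immediately for a missing or empty area. */
void *alloc_sketch(struct cma_area_sketch *area, void *backing)
{
    if (!area || !area->count)
        return NULL;

    /* A real allocator would take the area's lock and search here. */
    return backing;
}
```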

    Cc: # v3.17
    Signed-off-by: Laurent Pinchart
    Acked-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski

    Laurent Pinchart
     

14 Oct, 2014

2 commits

  • Add a function to create CMA region from previously reserved memory and
    add support for handling 'shared-dma-pool' reserved-memory device tree
    nodes.

    Based on previous code provided by Josh Cartwright.

    Signed-off-by: Marek Szyprowski
    Cc: Arnd Bergmann
    Cc: Michal Nazarewicz
    Cc: Grant Likely
    Cc: Laura Abbott
    Cc: Josh Cartwright
    Cc: Joonsoo Kim
    Cc: Kyungmin Park
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marek Szyprowski
     
  • The current cma bitmap aligned mask computation is incorrect. It could
    cause an unexpected alignment when using cma_alloc() if the wanted align
    order is larger than cma->order_per_bit.

    Take kvm for example (PAGE_SHIFT = 12): kvm_cma->order_per_bit is set to
    6. When kvm_alloc_rma() tries to alloc kvm_rma_pages, it will use 15 as
    the expected align value. With the current implementation, however, we
    get 0 as the cma bitmap aligned mask instead of 511.

    This patch fixes the cma bitmap aligned mask calculation.
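
    One calculation consistent with the numbers quoted above is sketched
    below in user space. The function name and plain-integer parameters are
    assumptions made for illustration; this is not a copy of the kernel
    code.

```c
#include <assert.h>

/*
 * Sketch only: mask, in bitmap bits, for a requested alignment order,
 * where each bitmap bit covers (1 << order_per_bit) pages.
 */
unsigned long bitmap_aligned_mask(unsigned int align_order,
                                  unsigned int order_per_bit)
{
    assert(align_order < 8 * sizeof(unsigned long));

    /* Alignment no coarser than one bitmap bit needs no mask. */
    if (align_order <= order_per_bit)
        return 0;

    return (1UL << (align_order - order_per_bit)) - 1;
}
```

    For the kvm case, an align order of 15 with order_per_bit = 6 gives
    (1 << 9) - 1 = 511.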

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Weijie Yang
    Acked-by: Michal Nazarewicz
    Cc: Joonsoo Kim
    Cc: "Aneesh Kumar K.V"
    Cc: [3.17]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
     

10 Oct, 2014

1 commit

  • Russell King recently noticed that limiting the default CMA region only
    to low memory on the ARM architecture causes serious memory management
    issues on machines having a lot of memory (which is mainly available as
    high memory). More information can be found in the following thread:
    http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/

    These two patches remove this limit, letting the kernel put the default
    CMA region into high memory when this is possible (there is enough high
    memory available and the architecture-specific DMA limit fits).

    This should solve strange OOM issues on systems with lots of RAM (i.e.
    >1GiB) and a large (>256M) CMA area.

    This patch (of 2):

    Automatically allocated regions should not cross the low/high memory
    boundary, because such regions cannot later be correctly initialized due
    to spanning two memory zones. This patch adds a check for this case and
    simple code for moving the region to low memory if the automatically
    selected address might not fit completely into high memory.

    Signed-off-by: Marek Szyprowski
    Acked-by: Michal Nazarewicz
    Cc: Daniel Drake
    Cc: Minchan Kim
    Cc: Russell King
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marek Szyprowski