15 Jun, 2018

1 commit

  • mm/*.c files use symbolic and octal styles for permissions.

    Using octal rather than symbolic permissions is preferred by many as
    more readable.

    https://lkml.org/lkml/2016/8/2/1945

    Prefer the direct use of octal for permissions.
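
    A conversion of this kind typically looks like the following
    (illustrative only, not necessarily a hunk from this patch; "&ops"
    stands in for the real file_operations):

    /* before: symbolic */
    debugfs_create_file("page_owner", S_IRUSR, NULL, NULL, &ops);

    /* after: equivalent octal */
    debugfs_create_file("page_owner", 0400, NULL, NULL, &ops);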

    Done using
    $ scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace mm/*.c
    and some typing.

    Before: $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
    44
    After: $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
    86

    Miscellanea:

    o Whitespace neatening around these conversions.

    Link: http://lkml.kernel.org/r/2e032ef111eebcd4c5952bae86763b541d373469.1522102887.git.joe@perches.com
    Signed-off-by: Joe Perches
    Acked-by: David Rientjes
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

06 Apr, 2018

1 commit

  • Handlers registered with early_param() are only called during kernel
    initialization, so Linux marks them with the __init macro to save
    memory.

    early_page_owner_param() was missed, however. Mark it __init as well.
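
    A sketch of the pattern, with the handler carrying __init (close to,
    but not guaranteed to match, the file's exact code):

    static bool page_owner_disabled = true;

    static int __init early_page_owner_param(char *buf)
    {
            if (!buf)
                    return -EINVAL;

            if (strcmp(buf, "on") == 0)
                    page_owner_disabled = false;

            return 0;
    }
    early_param("page_owner", early_page_owner_param);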

    Link: http://lkml.kernel.org/r/20180117034736.26963-1-douly.fnst@cn.fujitsu.com
    Signed-off-by: Dou Liyang
    Reviewed-by: Andrew Morton
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Greg Kroah-Hartman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dou Liyang
     

29 Mar, 2018

1 commit

  • This patch fixes commit 5f48f0bd4e36 ("mm, page_owner: skip unnecessary
    stack_trace entries").

    If we skip the first two entries, the logic that treats a count of 2
    as recursion is broken, and the code recurses one level deep.

    So, while checking for recursion, we only need to look for a single
    occurrence of _RET_IP_ (__set_page_owner).
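
    A minimal sketch of the corrected check (assuming the pre-4.19
    stack_trace API; not the verbatim patch):

    static bool check_recursive_alloc(struct stack_trace *trace,
                                      unsigned long ip)
    {
            int i;

            if (!trace->nr_entries)
                    return false;

            /*
             * With the first two entries skipped, a single match of the
             * caller's return address already indicates recursion.
             */
            for (i = 0; i < trace->nr_entries; i++) {
                    if (trace->entries[i] == ip)
                            return true;
            }

            return false;
    }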

    Current Backtrace while checking for recursion:-

    (save_stack) from (__set_page_owner) // (But recursion returns true here)
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack) // recursion should return true here
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask+)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack)
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)

    Correct Backtrace with fix:

    (save_stack) from (__set_page_owner) // recursion returned true here
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask+)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack)
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)

    Link: http://lkml.kernel.org/r/1521607043-34670-1-git-send-email-maninder1.s@samsung.com
    Fixes: 5f48f0bd4e36 ("mm, page_owner: skip unnecessary stack_trace entries")
    Signed-off-by: Maninder Singh
    Signed-off-by: Vaneet Narang
    Acked-by: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Oscar Salvador
    Cc: Greg Kroah-Hartman
    Cc: Ayush Mittal
    Cc: Prakash Gupta
    Cc: Vinayak Menon
    Cc: Vasyl Gomonovych
    Cc: Amit Sahrawat
    Cc:
    Cc: Vaneet Narang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maninder Singh
     

01 Feb, 2018

2 commits

  • Remove two redundant assignments in init_pages_in_zone().

    [osalvador@techadventures.net: v3]
    Link: http://lkml.kernel.org/r/20180117124513.GA876@techadventures.net
    [akpm@linux-foundation.org: coding style tweaks]
    Link: http://lkml.kernel.org/r/20180110084355.GA22822@techadventures.net
    Signed-off-by: Oscar Salvador
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oscar Salvador
     
  • Fix ptr_ret.cocci warnings:

    mm/page_owner.c:639:1-3: WARNING: PTR_ERR_OR_ZERO can be used

    Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
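
    A minimal illustration of the transformation (hypothetical
    dentry-returning call, not the exact hunk):

    /* before */
    if (IS_ERR(dentry))
            return PTR_ERR(dentry);
    return 0;

    /* after */
    return PTR_ERR_OR_ZERO(dentry);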

    Generated by: scripts/coccinelle/api/ptr_ret.cocci

    Link: http://lkml.kernel.org/r/1511824101-9597-1-git-send-email-gomonovych@gmail.com
    Signed-off-by: Vasyl Gomonovych
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasyl Gomonovych
     

20 Jan, 2018

1 commit

  • When setting page_owner=on, the following warning can be seen in the
    boot log:

    WARNING: CPU: 0 PID: 0 at mm/page_alloc.c:2537 drain_all_pages+0x171/0x1a0
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc7-next-20180109-1-default+ #7
    Hardware name: Dell Inc. Latitude E7470/0T6HHJ, BIOS 1.11.3 11/09/2016
    RIP: 0010:drain_all_pages+0x171/0x1a0
    Call Trace:
    init_page_owner+0x4e/0x260
    start_kernel+0x3e6/0x4a6
    ? set_init_arg+0x55/0x55
    secondary_startup_64+0xa5/0xb0
    Code: c5 ed ff 89 df 48 c7 c6 20 3b 71 82 e8 f9 4b 52 00 3b 05 d7 0b f8 00 89 c3 72 d5 5b 5d 41 5

    This warning is shown because we are calling drain_all_pages() in
    init_early_allocated_pages(), but mm_percpu_wq is not up yet; it is
    set up later in kernel_init_freeable() -> init_mm_internals().

    Link: http://lkml.kernel.org/r/20180109153921.GA13070@techadventures.net
    Signed-off-by: Oscar Salvador
    Acked-by: Joonsoo Kim
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Ayush Mittal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oscar Salvador
     

16 Nov, 2017

1 commit

  • The maximum page order can be at most 10, which fits in a short data
    type (2 bytes). last_migrate_reason is defined as an enum whose values
    also fit in a short data type (2 bytes).

    The structure is currently 16 bytes; after this change it shrinks to
    12 bytes.
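
    A sketch of the resulting layout (field order illustrative; the real
    definition lives in mm/page_owner.c):

    struct page_owner {
            unsigned short order;
            short last_migrate_reason;
            gfp_t gfp_mask;
            depot_stack_handle_t handle;
    };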

    Vlastimil said:
    "Looks like it works, so why not.
    Before:
    [ 0.001000] allocated 50331648 bytes of page_ext
    After:
    [ 0.001000] allocated 41943040 bytes of page_ext"

    Link: http://lkml.kernel.org/r/1507623917-37991-1-git-send-email-ayush.m@samsung.com
    Signed-off-by: Ayush Mittal
    Acked-by: Vlastimil Babka
    Cc: Vinayak Menon
    Cc: Amit Sahrawat
    Cc: Vaneet Narang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ayush Mittal
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the
    'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
    binding shorthand which can be used instead of the full boilerplate
    text.
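
    For a C source file, the identifier is added as the very first line of
    the file, e.g.:

    // SPDX-License-Identifier: GPL-2.0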

    This patch is based on work done by Thomas Gleixner, Kate Stewart, and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset
    of the use cases:
    - the file had no licensing information in it,
    - the file was a */uapi/* one with no licensing information in it,
    - the file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to apply to a
    file was done in a spreadsheet of side-by-side results from the output
    of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files, created by Philippe Ombredanne. Philippe prepared the
    base worksheet and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    The criteria used to select files for SPDX license identifier tagging
    were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

14 Sep, 2017

1 commit

  • The page_owner stacktrace always begins as follows:

    [] save_stack+0x40/0xc8
    [] __set_page_owner+0x3c/0x6c

    These two entries do not provide any useful information and they limit
    the available stacktrace depth. The page_owner code used to skip its
    own caller frames from the stack entries, but this was lost in commit
    f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace").
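
    The fix essentially re-applies the skip when filling the trace
    (sketch, assuming the pre-4.19 stack_trace API):

    unsigned long entries[PAGE_OWNER_STACK_DEPTH];
    struct stack_trace trace = {
            .nr_entries  = 0,
            .entries     = entries,
            .max_entries = PAGE_OWNER_STACK_DEPTH,
            /* drop save_stack() and __set_page_owner() themselves */
            .skip        = 2,
    };

    save_stack_trace(&trace);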

    Example page_owner entry after the patch:

    Page allocated via order 0, mask 0x8(ffffff80085fb714)
    PFN 654411 type Movable Block 639 type CMA Flags 0x0(ffffffbe5c7f12c0)
    [] post_alloc_hook+0x70/0x80
    ...
    [] msm_comm_try_state+0x5f8/0x14f4
    [] msm_vidc_open+0x5e4/0x7d0
    [] msm_v4l2_open+0xa8/0x224

    Link: http://lkml.kernel.org/r/1504078343-28754-2-git-send-email-guptap@codeaurora.org
    Fixes: f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace")
    Signed-off-by: Prakash Gupta
    Acked-by: Vlastimil Babka
    Cc: Catalin Marinas
    Cc: Joonsoo Kim
    Cc: Michal Hocko
    Cc: Russell King
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Prakash Gupta
     

07 Sep, 2017

2 commits

  • init_pages_in_zone() is run under zone->lock, which means a long lock
    time and disabled interrupts on large machines. This is currently not
    an issue since it runs early in boot, but a later patch will change
    that.

    However, like other pfn scanners, we don't actually need zone->lock
    even when other cpus are running. The only potentially dangerous
    operation here is reading a bogus buddy page owner due to a race, and
    we already know how to handle that. The worst that can happen is that
    we skip some early allocated pages, which should not noticeably affect
    the debugging power of page_owner.

    Link: http://lkml.kernel.org/r/20170720134029.25268-4-vbabka@suse.cz
    Signed-off-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Joonsoo Kim
    Cc: Mel Gorman
    Cc: Yang Shi
    Cc: Laura Abbott
    Cc: Vinayak Menon
    Cc: zhong jiang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • In init_pages_in_zone() we currently use the generic set_page_owner()
    function to initialize page_owner info for early allocated pages. This
    means we needlessly do lookup_page_ext() twice for each page, and more
    importantly save_stack(), which has to unwind the stack and find the
    corresponding stack depot handle. Because the stack is always the same
    for the initialization, unwind it once in init_pages_in_zone() and reuse
    the handle. Also avoid the repeated lookup_page_ext().

    This can significantly reduce boot times with page_owner=on on large
    machines, especially for kernels built without frame pointer, where the
    stack unwinding is noticeably slower.
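
    The gist, per the final version of the patch (treat this as a sketch;
    the real code registers a statically built fake "early allocation"
    stack rather than unwinding a live one):

    static depot_stack_handle_t early_handle;   /* set up once at init */

    /*
     * In init_pages_in_zone(), each early allocated page then needs only
     * a page_ext lookup and a store of the shared handle, no unwinding:
     */
    struct page_ext *page_ext = lookup_page_ext(page);

    if (page_ext)
            __set_page_owner_handle(page_ext, early_handle, 0, 0);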

    [vbabka@suse.cz: don't duplicate code of __set_page_owner(), per Michal Hocko]
    [akpm@linux-foundation.org: coding-style fixes]
    [vbabka@suse.cz: create statically allocated fake stack trace for early allocated pages, per Michal]
    Link: http://lkml.kernel.org/r/45813564-2342-fc8d-d31a-f4b68a724325@suse.cz
    Link: http://lkml.kernel.org/r/20170720134029.25268-2-vbabka@suse.cz
    Signed-off-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Joonsoo Kim
    Cc: Mel Gorman
    Cc: Yang Shi
    Cc: Laura Abbott
    Cc: Vinayak Menon
    Cc: zhong jiang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     

11 Jul, 2017

1 commit

  • pagetypeinfo_showmixedcount_print() is found to take a lot of time to
    complete, and it does so while holding the zone lock with interrupts
    disabled. In some cases it takes more than a second (on a 2.4GHz, 8GB
    RAM, arm64 system).

    Avoid taking the zone lock, similar to what read_page_owner() does,
    which means results may be slightly inaccurate.
    Link: http://lkml.kernel.org/r/1498045643-12257-1-git-send-email-vinmenon@codeaurora.org
    Signed-off-by: Vinayak Menon
    Acked-by: Vlastimil Babka
    Cc: Joonsoo Kim
    Cc: zhongjiang
    Cc: Sergey Senozhatsky
    Cc: Sudip Mukherjee
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Sebastian Andrzej Siewior
    Cc: David Rientjes
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vinayak Menon
     

08 Oct, 2016

2 commits

  • There is a memory-waste problem if we define fields on struct page_ext
    by hard-coding: the entry size of struct page_ext includes those
    fields even if the feature is disabled at runtime. Now that extra
    memory can be requested at runtime, page_owner doesn't need to define
    its own fields by hard-coding.

    This patch removes the hard-coded definitions and uses the extra
    memory for storing page_owner information. Most of the changes are
    mechanical.
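
    The mechanism boils down to registering a size with page_ext and
    locating the data via an offset (sketch of the resulting shape; see
    mm/page_owner.c for the real thing):

    struct page_ext_operations page_owner_ops = {
            .size = sizeof(struct page_owner),
            .need = need_page_owner,
            .init = init_page_owner,
    };

    static inline struct page_owner *get_page_owner(struct page_ext *page_ext)
    {
            return (void *)page_ext + page_owner_ops.offset;
    }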

    Link: http://lkml.kernel.org/r/1471315879-32294-7-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Minchan Kim
    Cc: Michal Hocko
    Cc: Sergey Senozhatsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • There is no reason for this page_owner-specific function to reside in
    vmstat.c.

    Link: http://lkml.kernel.org/r/1471315879-32294-4-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Reviewed-by: Sergey Senozhatsky
    Acked-by: Vlastimil Babka
    Cc: Minchan Kim
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

27 Jul, 2016

3 commits

  • Currently, we store each page's allocation stacktrace in the
    corresponding page_ext structure, and it requires a lot of memory.
    This causes the problem that memory-tight systems don't work well when
    page_owner is enabled. Moreover, even with this large memory
    consumption, we cannot get a full stacktrace because we allocate the
    memory at boot time and maintain just 8 stacktrace slots to balance
    memory consumption. We could increase that, but it would make the
    system unusable or change system behaviour.

    To solve the problem, this patch uses stackdepot to store stacktrace.
    It obviously provides memory saving but there is a drawback that
    stackdepot could fail.

    stackdepot allocates memory at runtime, so it could fail if the system
    does not have enough memory. But most allocation stacks are generated
    very early, when there is still plenty of memory, so failure would not
    happen easily. And one failure means that we miss just one page's
    allocation stacktrace, so it would not be a big problem. In this
    patch, when a memory allocation failure happens, we store a special
    stacktrace handle for the page whose stacktrace could not be saved.
    With it, users can still gauge memory usage properly even if a failure
    happens.

    Memory saving looks as following. (4GB memory system with page_owner)
    (before the patch -> after the patch)

    static allocation:
    92274688 bytes -> 25165824 bytes

    dynamic allocation after boot + kernel build:
    0 bytes -> 327680 bytes

    total:
    92274688 bytes -> 25493504 bytes

    72% reduction in total.

    Note that the implementation looks more complex than one might expect
    because of a recursion issue: stackdepot uses the page allocator, and
    page_owner is called at page allocation, so using stackdepot in
    page_owner could re-enter the page allocator and then page_owner
    again. To detect and avoid this, whenever we obtain a stacktrace,
    recursion is checked, and if it is found, page_owner is set to dummy
    information. The dummy information means that the page was allocated
    for the page_owner feature itself (such as by stackdepot), which is
    understandable behaviour for the user.
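
    In code, the save path reduces to roughly the following (sketch; trace
    construction is omitted, and the dummy/failure handles are
    pre-allocated at init time):

    static depot_stack_handle_t save_stack(struct stack_trace *trace,
                                           gfp_t flags)
    {
            depot_stack_handle_t handle;

            if (check_recursive_alloc(trace, _RET_IP_))
                    return dummy_handle;     /* allocated for page_owner itself */

            handle = depot_save_stack(trace, flags);
            if (!handle)
                    handle = failure_handle; /* stackdepot ran out of memory */

            return handle;
    }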

    [iamjoonsoo.kim@lge.com: mm-page_owner-use-stackdepot-to-store-stacktrace-v3]
    Link: http://lkml.kernel.org/r/1464230275-25791-6-git-send-email-iamjoonsoo.kim@lge.com
    Link: http://lkml.kernel.org/r/1466150259-27727-7-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Alexander Potapenko
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • split_page() calls set_page_owner() to set up page_owner for each
    page. But this has the drawback that the head page and the tail pages
    end up with different stacktraces, because the callsite of
    set_page_owner() is slightly different. To avoid this problem, this
    patch copies the head page's page_owner to the others. It requires
    introducing a new function, split_page_owner(), but it also lets us
    remove get_page_owner_gfp(), so it seems a reasonable trade-off.
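
    A sketch of the new helper (close to, but not necessarily identical
    to, the final version):

    void split_page_owner(struct page *page, unsigned int order)
    {
            int i;
            struct page_ext *page_ext = lookup_page_ext(page);

            if (unlikely(!page_ext))
                    return;

            page_ext->order = 0;
            for (i = 1; i < (1 << order); i++)
                    __copy_page_owner(page, page + i);
    }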

    Link: http://lkml.kernel.org/r/1464230275-25791-4-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Alexander Potapenko
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Currently, copy_page_owner() doesn't copy all of the owner
    information. It skips last_migrate_reason because copy_page_owner()
    is used for migration, where that field will be set properly soon
    afterwards. However, a following patch will use copy_page_owner() in
    another context, and this skip would leave the allocated page with an
    uninitialized last_migrate_reason. To prevent that, also copy
    last_migrate_reason in copy_page_owner().
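
    The copy then covers all of the fields, roughly (sketch of the
    page_ext-based layout used at the time):

    new_ext->order = old_ext->order;
    new_ext->gfp_mask = old_ext->gfp_mask;
    new_ext->last_migrate_reason = old_ext->last_migrate_reason;
    /* ...plus the stored stack trace, as before */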

    Link: http://lkml.kernel.org/r/1464230275-25791-3-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Alexander Potapenko
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

25 Jun, 2016

1 commit

  • We dereferenced page_ext before checking it. Check it first and only
    then use it.
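
    The resulting pattern (generic sketch):

    page_ext = lookup_page_ext(page);
    if (unlikely(!page_ext))
            return;

    /* only now is it safe to dereference page_ext */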

    Fixes: f86e4271978b ("mm: check the return value of lookup_page_ext for all call sites")
    Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
    Signed-off-by: Sudip Mukherjee
    Acked-by: Vlastimil Babka
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sudip Mukherjee
     

04 Jun, 2016

1 commit

  • Per the discussion with Joonsoo Kim [1], we need to check the return
    value of lookup_page_ext() for all call sites, since it might return
    NULL in some cases, although it is unlikely, e.g. during memory
    hotplug.

    Tested with LTP with "page_owner=0".

    [1] http://lkml.kernel.org/r/20160519002809.GA10245@js1304-P5Q-DELUXE

    [akpm@linux-foundation.org: fix build-breaking typos]
    [arnd@arndb.de: fix build problems from lookup_page_ext]
    Link: http://lkml.kernel.org/r/6285269.2CksypHdYp@wuerfel
    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/1464023768-31025-1-git-send-email-yang.shi@linaro.org
    Signed-off-by: Yang Shi
    Signed-off-by: Arnd Bergmann
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Shi
     

20 May, 2016

2 commits

  • The function call overhead of get_pfnblock_flags_mask() is measurable in
    the page free paths. This patch uses an inlined version that is faster.

    Signed-off-by: Mel Gorman
    Acked-by: Vlastimil Babka
    Cc: Jesper Dangaard Brouer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • There are systems whose nodes' pfns overlap as follows:

    -----pfn-------->
    N0 N1 N2 N0 N1 N2

    Therefore, we need to handle this overlap when iterating a pfn range.

    There is one place in page_owner.c that iterates a pfn range without
    considering the overlap. Add the check there.

    Without this patch, such a system could over-count the number of early
    allocated pages before page_owner is activated.
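
    Within the pfn loop, the kind of guard this adds looks roughly like
    the following (illustrative; the exact check in the patch may differ):

    struct page *page = pfn_to_page(pfn);

    /* pfns of other nodes/zones can be interleaved in this range */
    if (page_zone(page) != zone)
            continue;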

    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Rik van Riel
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Laura Abbott
    Cc: Minchan Kim
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: "Aneesh Kumar K.V"
    Cc: "Rafael J. Wysocki"
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

18 Mar, 2016

1 commit

  • Kernel style prefers a single string over split strings when the string is
    'user-visible'.
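
    Illustration of the preference (illustrative strings, not a hunk from
    this patch):

    /* split string, harder to grep for */
    pr_info("page_owner is "
            "disabled\n");

    /* preferred: one user-visible string, even past 80 columns */
    pr_info("page_owner is disabled\n");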

    Miscellanea:

    - Add a missing newline
    - Realign arguments

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

16 Mar, 2016

5 commits

  • The page_owner mechanism is useful for dealing with memory leaks. By
    reading /sys/kernel/debug/page_owner one can determine the stack traces
    leading to allocations of all pages, and find e.g. a buggy driver.

    This information might also be useful for debugging, for example for
    the VM_BUG_ON_PAGE() calls that end up in dump_page(). So let's print
    the stored info from dump_page().

    Example output:

    page:ffffea000292f1c0 count:1 mapcount:0 mapping:ffff8800b2f6cc18 index:0x91d
    flags: 0x1fffff8001002c(referenced|uptodate|lru|mappedtodisk)
    page dumped because: VM_BUG_ON_PAGE(1)
    page->mem_cgroup:ffff8801392c5000
    page allocated via order 0, migratetype Movable, gfp_mask 0x24213ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD|__GFP_NOWARN|__GFP_NORETRY)
    [] __alloc_pages_nodemask+0x134/0x230
    [] alloc_pages_current+0x88/0x120
    [] __page_cache_alloc+0xe6/0x120
    [] __do_page_cache_readahead+0xdc/0x240
    [] ondemand_readahead+0x135/0x260
    [] page_cache_async_readahead+0x6c/0x70
    [] generic_file_read_iter+0x3f2/0x760
    [] __vfs_read+0xa7/0xd0
    page has been migrated, last migrate reason: compaction

    Signed-off-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Joonsoo Kim
    Cc: Minchan Kim
    Cc: Sasha Levin
    Cc: "Kirill A. Shutemov"
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • During migration, page_owner info is now copied with the rest of the
    page, so the stacktrace leading to the free-page allocation during
    migration is overwritten. For debugging purposes, it might however be
    useful to know that the page has been migrated since its initial
    allocation. This might happen many times during a page's lifetime for
    different reasons, and fully tracking this, especially with
    stacktraces, would incur extra memory costs. As a compromise, store
    and print the migrate_reason of the last migration that occurred to
    the page. This is enough to distinguish compaction, numa balancing,
    etc.

    Example page_owner entry after the patch:

    Page allocated via order 0, mask 0x24200ca(GFP_HIGHUSER_MOVABLE)
    PFN 628753 type Movable Block 1228 type Movable Flags 0x1fffff80040030(dirty|lru|swapbacked)
    [] __alloc_pages_nodemask+0x134/0x230
    [] alloc_pages_vma+0xb5/0x250
    [] shmem_alloc_page+0x61/0x90
    [] shmem_getpage_gfp+0x678/0x960
    [] shmem_fallocate+0x329/0x440
    [] vfs_fallocate+0x140/0x230
    [] SyS_fallocate+0x44/0x70
    [] entry_SYSCALL_64_fastpath+0x12/0x71
    Page has been migrated, last migrate reason: compaction

    Signed-off-by: Vlastimil Babka
    Cc: Joonsoo Kim
    Cc: Minchan Kim
    Cc: Sasha Levin
    Cc: "Kirill A. Shutemov"
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • The page_owner mechanism stores gfp_flags of an allocation and stack
    trace that lead to it. During page migration, the original information
    is practically replaced by the allocation of free page as the migration
    target. Arguably this is less useful and might lead to all the
    page_owner info for migratable pages gradually converging towards
    compaction or numa balancing migrations. It has also led to
    inaccuracies such as the one fixed by commit e2cfc91120fa
    ("mm/page_owner: set correct gfp_mask on page_owner").

    This patch thus introduces copying the page_owner info during migration.
    However, since the fact that the page has been migrated from its
    original place might be useful for debugging, the next patch will
    introduce a way to track that information as well.

    Signed-off-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Joonsoo Kim
    Cc: Minchan Kim
    Cc: Sasha Levin
    Cc: "Kirill A. Shutemov"
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • CONFIG_PAGE_OWNER attempts to impose negligible runtime overhead when
    enabled during compilation, but not actually enabled during runtime by
    boot param page_owner=on. This overhead can be further reduced using
    the static key mechanism, which this patch does.
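
    A minimal sketch of the pattern (the inline wrapper lives in a header;
    treat names and placement as illustrative):

    DEFINE_STATIC_KEY_FALSE(page_owner_inited);

    static inline void set_page_owner(struct page *page,
                                      unsigned int order, gfp_t gfp_mask)
    {
            if (static_branch_unlikely(&page_owner_inited))
                    __set_page_owner(page, order, gfp_mask);
    }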

    Signed-off-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Joonsoo Kim
    Cc: Minchan Kim
    Cc: Sasha Levin
    Cc: "Kirill A. Shutemov"
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • The information in /sys/kernel/debug/page_owner includes the migratetype
    of the pageblock the page belongs to. This is also checked against the
    page's migratetype (as declared by gfp_flags during its allocation), and
    the page is reported as Fallback if its migratetype differs from the
    pageblock's. This is somewhat misleading, because fallback allocation
    is in fact not the only reason why the two can differ. It also doesn't
    directly provide the page's migratetype, although it's possible to
    derive that from the gfp_flags.

    It's arguably better to print both the page's and the pageblock's
    migratetype and leave the interpretation to the consumer than to
    suggest fallback allocation as the only possible reason. While at it,
    we can print the migratetypes as strings the same way
    /proc/pagetypeinfo does, as some of the numeric values depend on the
    kernel configuration. For that, this patch moves the migratetype_names
    array from the #ifdef CONFIG_PROC_FS part of mm/vmstat.c to
    mm/page_alloc.c and exports it.

    With the new format strings for flags, we can now also provide symbolic
    page and gfp flags in the /sys/kernel/debug/page_owner file. This
    replaces the positional printing of page flags as single letters, which
    might have looked nicer, but was limited to a subset of flags, and
    required the user to remember the letters.

    Example page_owner entry after the patch:

    Page allocated via order 0, mask 0x24213ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD|__GFP_NOWARN|__GFP_NORETRY)
    PFN 520 type Movable Block 1 type Movable Flags 0xfffff8001006c(referenced|uptodate|lru|active|mappedtodisk)
    [] __alloc_pages_nodemask+0x134/0x230
    [] alloc_pages_current+0x88/0x120
    [] __page_cache_alloc+0xe6/0x120
    [] __do_page_cache_readahead+0xdc/0x240
    [] ondemand_readahead+0x135/0x260
    [] page_cache_sync_readahead+0x31/0x50
    [] generic_file_read_iter+0x453/0x760
    [] __vfs_read+0xa7/0xd0

    Signed-off-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Joonsoo Kim
    Cc: Minchan Kim
    Cc: Sasha Levin
    Cc: "Kirill A. Shutemov"
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     

18 Jul, 2015

1 commit

  • Currently, we set the wrong gfp_mask in the page_owner info for
    freepages isolated by compaction and for split pages. This causes an
    incorrect mixed-pageblock report from '/proc/pagetypeinfo'. This
    metric is really useful for measuring the fragmentation effect, so it
    should be accurate. This patch fixes it by setting the correct
    information.

    Without this patch, after a kernel build workload finishes, the number
    of mixed pageblocks is 112 out of roughly 210 movable pageblocks.

    But with this fix, the output shows just 57 mixed pageblocks.

    Signed-off-by: Joonsoo Kim
    Cc: Mel Gorman
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

17 Jun, 2015

1 commit

  • This was using module_init, but there is no way this code can
    be modular. In the non-modular case, a module_init becomes a
    device_initcall, but this really isn't a device. So we should
    choose a more appropriate initcall bucket to put it in.

    In order of execution, our close choices are:

    fs_initcall(fn)
    rootfs_initcall(fn)
    device_initcall(fn)
    late_initcall(fn)

    ..and since the initcall here goes after debugfs, we really
    should be post-rootfs, which means late_initcall makes the
    most sense here.
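
    The change itself is a one-liner (illustrative):

    /* before */
    module_init(pageowner_init)

    /* after: runs once debugfs and rootfs are available */
    late_initcall(pageowner_init)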

    Cc: Andrew Morton
    Cc: linux-mm@kvack.org
    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

12 Feb, 2015

1 commit

  • Page owner uses the page_ext structure to keep meta-information for
    every page in the system. The structure also contains a field of type
    'struct stack_trace', which page owner uses when invoking
    save_stack_trace(). It is easy to notice that keeping a copy of this
    structure for every page in the system is very inefficient in terms of
    memory.

    The patch removes this unnecessary field from page_ext and makes page
    owner use a stack_trace structure allocated on the stack.
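
    The resulting pattern in __set_page_owner() looks roughly like this
    (sketch, using the struct-initializer style mentioned below):

    struct page_ext *page_ext = lookup_page_ext(page);
    struct stack_trace trace = {
            .nr_entries  = 0,
            .max_entries = ARRAY_SIZE(page_ext->trace_entries),
            .entries     = &page_ext->trace_entries[0],
            .skip        = 3,
    };

    save_stack_trace(&trace);
    page_ext->nr_entries = trace.nr_entries;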

    [akpm@linux-foundation.org: use struct initializers]
    Signed-off-by: Sergei Rogachev
    Acked-by: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sergei Rogachev
     

14 Dec, 2014

2 commits

  • The extended memory used to store page owner information is
    initialized some time after the page allocator starts. Until that
    initialization, many pages can be allocated and they have no owner
    information, which makes debugging with page owner harder, so some
    fixup is helpful.

    This patch fixes up the situation by setting fake owner information
    immediately after page extension is initialized. The information
    doesn't identify the real owner, but at least it tells, more
    correctly, whether a page is allocated or not.

    In my testing, this patch catches 13343 early allocated pages,
    although most of them are allocated by the page extension feature
    itself. After that, no page is left that is allocated yet has no page
    owner flag set.

    Signed-off-by: Joonsoo Kim
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Cc: Dave Hansen
    Cc: Michal Nazarewicz
    Cc: Jungsoo Son
    Cc: Ingo Molnar
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • This is the page owner tracking code, introduced quite a while ago.
    It has been resident in Andrew's tree, but nobody tried to upstream
    it, so it remained as is. Our company uses this feature actively to
    debug memory leaks or to find memory hogs, so I decided to upstream
    it.

    This functionality helps us to know who allocated a page. When
    allocating a page, we store some information about the allocation in
    extra memory. Later, if we need to know the status of all pages, we
    can retrieve and analyze it from this stored information.

    In the previous version of this feature, the extra memory was
    statically defined in struct page, but in this version it is allocated
    outside of struct page. This enables us to turn the feature on/off at
    boot time without considerable memory waste.

    Although we already have tracepoints for tracing page allocation and
    free, using them to analyze page ownership is rather complex. We would
    need to enlarge the trace buffer to prevent it from being overwritten
    before the userspace program is launched, and the launched program
    would have to continually dump out the trace buffer for later
    analysis, which is more likely to change system behaviour than just
    keeping the information in memory, so it is bad for debugging.

    Moreover, we can use the page_owner feature for various further
    purposes, for example the fragmentation statistics implemented in this
    patch, and I also plan to implement a CMA failure debugging feature
    using this interface.

    I'd like to give credit to all the developers who contributed to this
    feature, but it's not easy because I don't know the exact history.
    Sorry about that. Below are the people who have "Signed-off-by" in the
    patches in Andrew's tree.

    Contributor:
    Alexander Nyberg
    Mel Gorman
    Dave Hansen
    Minchan Kim
    Michal Nazarewicz
    Andrew Morton
    Jungsoo Son

    Signed-off-by: Joonsoo Kim
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Cc: Dave Hansen
    Cc: Michal Nazarewicz
    Cc: Jungsoo Son
    Cc: Ingo Molnar
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim