13 Sep, 2016

1 commit

  • PAGE_POISONING_ZERO disables zeroing new pages on alloc; they are
    poisoned (zeroed) as they become available instead.
    In the hibernate use case, free pages appear in the system without
    being cleared, left there by the loading kernel.

    This patch makes sure free pages are cleared on resume when
    PAGE_POISONING_ZERO is enabled. We clear the pages just after resume
    because we can't do it later: going through any device resume code might
    allocate some memory and invalidate the free pages bitmap.

    Thus we don't need to disable hibernation when PAGE_POISONING_ZERO is
    enabled.
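    The clearing step can be pictured with a small userspace model (the
    arrays and helper names here are illustrative, not the kernel's API):
    walk the free-pages bitmap left by the image kernel and zero every page
    it marks free, restoring the zero-poisoning invariant before any
    allocation happens.

    ```c
    #include <stdint.h>
    #include <string.h>

    #define PAGE_SIZE 4096
    #define NR_PAGES  8

    static uint8_t mem[NR_PAGES][PAGE_SIZE]; /* toy "physical" memory */
    static uint8_t free_bitmap[NR_PAGES];    /* 1 = page is free */

    /* Model of the guarantee: every page marked free in the bitmap left
     * by the image kernel is zeroed before the system starts allocating. */
    static void clear_free_pages(void)
    {
        for (int pfn = 0; pfn < NR_PAGES; pfn++)
            if (free_bitmap[pfn])
                memset(mem[pfn], 0, PAGE_SIZE);
    }

    static int page_is_zero(int pfn)
    {
        for (int i = 0; i < PAGE_SIZE; i++)
            if (mem[pfn][i] != 0)
                return 0;
        return 1;
    }
    ```

    After filling the toy memory with junk (as the loading kernel would)
    and marking a few pages free, one call to clear_free_pages() leaves the
    free pages zeroed and the allocated ones untouched.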

    Signed-off-by: Anisse Astier
    Reviewed-by: Kees Cook
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki


18 Mar, 2016

1 commit

  • CMA allocation should be guaranteed to succeed by definition, but,
    unfortunately, it sometimes fails. The problem is hard to track down,
    because it is related to page reference manipulation and we don't have
    any facility to analyze it.

    This patch adds tracepoints to track down page reference manipulation.
    With them, we can find the exact reason for a failure and fix the
    problem. Following is an example of the tracepoint output. (Note: this
    example is from a stale version that printed flags as a number; recent
    versions print them as a human-readable string.)

    -9018 [004] 92.678375: page_ref_set: pfn=0x17ac9 flags=0x0 count=1 mapcount=0 mapping=(nil) mt=4 val=1
    -9018 [004] 92.678378: kernel_stack:
    => get_page_from_freelist (ffffffff81176659)
    => __alloc_pages_nodemask (ffffffff81176d22)
    => alloc_pages_vma (ffffffff811bf675)
    => handle_mm_fault (ffffffff8119e693)
    => __do_page_fault (ffffffff810631ea)
    => trace_do_page_fault (ffffffff81063543)
    => do_async_page_fault (ffffffff8105c40a)
    => async_page_fault (ffffffff817581d8)
    [snip]
    -9018 [004] 92.678379: page_ref_mod: pfn=0x17ac9 flags=0x40048 count=2 mapcount=1 mapping=0xffff880015a78dc1 mt=4 val=1
    [snip]
    ...
    ...
    -9131 [001] 93.174468: test_pages_isolated: start_pfn=0x17800 end_pfn=0x17c00 fin_pfn=0x17ac9 ret=fail
    [snip]
    -9018 [004] 93.174843: page_ref_mod_and_test: pfn=0x17ac9 flags=0x40068 count=0 mapcount=0 mapping=0xffff880015a78dc1 mt=4 val=-1 ret=1
    => release_pages (ffffffff8117c9e4)
    => free_pages_and_swap_cache (ffffffff811b0697)
    => tlb_flush_mmu_free (ffffffff81199616)
    => tlb_finish_mmu (ffffffff8119a62c)
    => exit_mmap (ffffffff811a53f7)
    => mmput (ffffffff81073f47)
    => do_exit (ffffffff810794e9)
    => do_group_exit (ffffffff81079def)
    => SyS_exit_group (ffffffff81079e74)
    => entry_SYSCALL_64_fastpath (ffffffff817560b6)

    This output shows that the problem comes from the exit path. In the
    exit path, to improve performance, pages are not freed immediately;
    they are gathered and processed in batches. During this process,
    migration is not possible and the CMA allocation fails. This problem
    would be hard to find without this page reference tracepoint facility.

    Enabling this feature bloats the kernel text by about 30 KB in my
    configuration.

    text data bss dec hex filename
    12127327 2243616 1507328 15878271 f2487f vmlinux_disabled
    12157208 2258880 1507328 15923416 f2f8d8 vmlinux_enabled

    Note that, due to a header file dependency problem between mm.h and
    tracepoint.h, this feature has to open-code the static key functions
    for tracepoints, as proposed by Steven Rostedt in the following link.

    https://lkml.org/lkml/2015/12/9/699
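    The idea behind the facility can be sketched in userspace (illustrative
    names, not the kernel code): every refcount manipulation funnels
    through one helper that records the call site, the delta, and the
    resulting count, the same fields the tracepoint output above shows, so
    a lingering reference can be attributed to whoever took it.

    ```c
    #define MAX_EVENTS 64

    /* One recorded refcount manipulation: which call site, the delta,
     * and the resulting count. */
    struct ref_event {
        const char *site;
        int val;
        int count_after;
    };

    static struct ref_event ref_log[MAX_EVENTS];
    static int nr_events;
    static int page_count;   /* refcount of a single modeled page */

    /* Every manipulation funnels through here, as through the tracepoint. */
    static void page_ref_mod(const char *site, int val)
    {
        page_count += val;
        if (nr_events < MAX_EVENTS) {
            ref_log[nr_events].site = site;
            ref_log[nr_events].val = val;
            ref_log[nr_events].count_after = page_count;
            nr_events++;
        }
    }
    ```

    Replaying the trace above through this model (one ref from the
    allocator, one from the fault path, one drop at exit) leaves a count of
    one, and the log says exactly which site still holds it.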

    [arnd@arndb.de: crypto/async_pq: use __free_page() instead of put_page()]
    [iamjoonsoo.kim@lge.com: fix build failure for xtensa]
    [akpm@linux-foundation.org: tweak Kconfig text, per Vlastimil]
    Signed-off-by: Joonsoo Kim
    Acked-by: Michal Nazarewicz
    Acked-by: Vlastimil Babka
    Cc: Minchan Kim
    Cc: Mel Gorman
    Cc: "Kirill A. Shutemov"
    Cc: Sergey Senozhatsky
    Acked-by: Steven Rostedt
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


16 Mar, 2016

3 commits

  • By default, page poisoning uses a poison value (0xaa) on free. If this
    is changed to 0, the page is not only sanitized but zeroing on alloc
    with __GFP_ZERO can be skipped as well. The tradeoff is that corruption
    from the poisoning is harder to detect. This feature also cannot be
    used with hibernation, since pages are not guaranteed to be zeroed
    after hibernation.
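    A userspace sketch of the tradeoff (hypothetical helpers, not the
    kernel implementation): poisoning with zero makes free-page contents
    indistinguishable from a zeroed page, which is exactly what lets the
    __GFP_ZERO memset be skipped.

    ```c
    #include <stdint.h>
    #include <string.h>

    #define PAGE_SIZE 4096

    /* 0x00 models PAGE_POISONING_ZERO; 0xaa models the default poison. */
    static uint8_t poison_byte = 0x00;

    /* Sanitize a page as it is freed. */
    static void free_page_model(uint8_t *page)
    {
        memset(page, poison_byte, PAGE_SIZE);
    }

    /* Hand out a page that must be zeroed (the __GFP_ZERO case): with a
     * zero poison value the page is already zero, so skip the memset. */
    static void alloc_page_zeroed(uint8_t *page)
    {
        if (poison_byte != 0)
            memset(page, 0, PAGE_SIZE);
    }

    static int all_zero(const uint8_t *page)
    {
        for (int i = 0; i < PAGE_SIZE; i++)
            if (page[i] != 0)
                return 0;
        return 1;
    }
    ```

    The flip side is visible here too: a use-after-free that writes zeroes
    leaves a zero-poisoned page looking intact, which is why corruption
    from the poisoning becomes harder to detect.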

    Credit to Grsecurity/PaX team for inspiring this work

    Signed-off-by: Laura Abbott
    Acked-by: Rafael J. Wysocki
    Cc: "Kirill A. Shutemov"
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Kees Cook
    Cc: Mathias Krause
    Cc: Dave Hansen
    Cc: Jianyu Zhan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

  • Page poisoning is currently set up as a feature only if architectures
    don't have architecture debug page_alloc to allow unmapping of pages.
    It has uses apart from that, though. Clearing of the pages on free
    provides an increase in security, as it helps to limit the risk of
    information leaks. Allow page poisoning to be enabled as a separate
    option independent of kernel_map pages, since the two features do
    separate work. Because of how hibernation is implemented, the checks
    on alloc cannot occur if hibernation is enabled. The runtime alloc
    checks can also be enabled with an option when !HIBERNATION.

    Credit to Grsecurity/PaX team for inspiring this work

    Signed-off-by: Laura Abbott
    Cc: Rafael J. Wysocki
    Cc: "Kirill A. Shutemov"
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Kees Cook
    Cc: Mathias Krause
    Cc: Dave Hansen
    Cc: Jianyu Zhan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

  • Since commit 031bc5743f158 ("mm/debug-pagealloc: make debug-pagealloc
    boottime configurable") CONFIG_DEBUG_PAGEALLOC is by default not adding
    any page debugging.

    This resulted in several unnoticed bugs, e.g.

    https://lkml.kernel.org/g/
    or
    https://lkml.kernel.org/g/

    as this behaviour change was not even documented in Kconfig.

    Let's provide a new Kconfig symbol that allows changing the default
    back to enabled, e.g. for debug kernels. This also makes the change
    obvious to kernel packagers.

    Let's also change the Kconfig description for CONFIG_DEBUG_PAGEALLOC, to
    indicate that there are two stages of overhead.

    Signed-off-by: Christian Borntraeger
    Cc: Joonsoo Kim
    Cc: Peter Zijlstra
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


09 Jan, 2015

1 commit

  • These are obsolete since commit e30825f1869a ("mm/debug-pagealloc:
    prepare boottime configurable") was merged. So remove them.

    [pebolle@tiscali.nl: find obsolete Kconfig options]
    Signed-off-by: Joonsoo Kim
    Cc: Paul Bolle
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Cc: Dave Hansen
    Cc: Michal Nazarewicz
    Cc: Jungsoo Son
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


14 Dec, 2014

2 commits

  • Until now, debug-pagealloc has needed extra flags in struct page, so we
    had to recompile the whole source tree when we decided to use it. This
    is really painful, because recompiling takes some time and sometimes a
    rebuild is not possible due to third-party modules depending on struct
    page. So we can't use this good feature in many cases.

    Now we have the page extension feature, which allows us to keep extra
    flags outside of struct page. This gets rid of the third-party module
    issue mentioned above, and it allows us to determine at boot time
    whether we need extra memory for the page extension. With these
    properties, we can leave debug-pagealloc disabled at boot time with low
    computational overhead in a kernel built with CONFIG_DEBUG_PAGEALLOC.
    This will help our development process greatly.

    This patch is the preparation step to achieve the above goal.
    debug-pagealloc originally uses an extra field of struct page, but
    after this patch it will use a field of struct page_ext. Because memory
    for page_ext is allocated later than the initialization of the page
    allocator with CONFIG_SPARSEMEM, we have to disable the debug-pagealloc
    feature temporarily until page_ext is initialized. This patch
    implements this.
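    The mechanism can be modeled in userspace (names are illustrative, not
    the kernel API): the debug state lives in a separate array parallel to
    the page array, found by pfn, and costs no memory at all unless the
    feature is enabled at boot.

    ```c
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdlib.h>

    #define NR_PAGES 16

    struct page { int refcount; };            /* untouched by the feature  */
    struct page_ext { unsigned long flags; }; /* debug state lives here    */

    static struct page_ext *page_ext_base;    /* allocated only if enabled */

    /* Decided at boot: with the feature off there is no memory cost. */
    static bool page_ext_init(bool enabled)
    {
        if (!enabled)
            return false;
        page_ext_base = calloc(NR_PAGES, sizeof(*page_ext_base));
        return page_ext_base != NULL;
    }

    /* The accessor: extension data is found by pfn, not via struct page. */
    static struct page_ext *lookup_page_ext(int pfn)
    {
        if (page_ext_base == NULL || pfn < 0 || pfn >= NR_PAGES)
            return NULL;
        return &page_ext_base[pfn];
    }
    ```

    Note that struct page keeps its size either way; only the parallel
    array grows or disappears, which is the whole point of the design.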

    Signed-off-by: Joonsoo Kim
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Cc: Dave Hansen
    Cc: Michal Nazarewicz
    Cc: Jungsoo Son
    Cc: Ingo Molnar
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

  • When we debug something, we'd like to attach some information to every
    page. For this purpose, we sometimes modify struct page itself, but
    this has drawbacks. First, it requires a recompile, which makes us
    hesitate to use this powerful debug feature, so the development process
    is slowed down. Second, sometimes it is impossible to rebuild the
    kernel due to third-party module dependencies. Third, system behaviour
    can be largely different after a recompile, because it changes the size
    of struct page greatly and this structure is accessed by every part of
    the kernel. Keeping struct page as it is makes it easier to reproduce
    an erroneous situation.

    This feature is intended to overcome the problems mentioned above. It
    allocates memory for extended per-page data in a separate place rather
    than in struct page itself. This memory can be accessed through the
    accessor functions provided by this code. During the boot process, it
    checks whether the allocation of this huge chunk of memory is needed;
    if not, it avoids allocating memory at all. With this advantage, we can
    include this feature in the kernel by default and can avoid rebuilds
    and the related problems.

    Until now, memcg used this technique. But memcg has now decided to
    embed its variable in struct page itself, and its code to extend
    struct page has been removed. I'd like to use this code to develop a
    debug feature, so this patch resurrects it.

    To help these things work well, this patch introduces two callbacks for
    clients. One is the need callback, which is mandatory if the user wants
    to avoid useless memory allocation at boot time. The other, the init
    callback, is optional and is used to do proper initialization after
    memory is allocated. A detailed explanation of the purpose of these
    functions is in a code comment; please refer to it.

    Everything else is the same as the previous extension code in memcg.
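    A minimal userspace model of the two callbacks (the kernel's
    registration type is struct page_ext_operations; the names below are
    simplified stand-ins):

    ```c
    #include <stdbool.h>
    #include <stddef.h>

    struct page_ext_ops {
        bool (*need)(void);  /* mandatory: is the extension memory needed? */
        void (*init)(void);  /* optional: initialize after allocation */
    };

    static bool ext_allocated;

    /* Boot-time decision: ask need() first; only allocate and run init()
     * when the client actually wants the memory. */
    static void page_ext_boot(const struct page_ext_ops *ops)
    {
        if (ops->need == NULL || !ops->need())
            return;              /* no useless boot-time allocation */
        ext_allocated = true;    /* stands in for the real allocation */
        if (ops->init != NULL)
            ops->init();
    }

    /* Example client that opts in and records that init ran. */
    static bool client_inited;
    static bool client_need(void) { return true; }
    static void client_init(void) { client_inited = true; }
    static const struct page_ext_ops client_ops = { client_need, client_init };

    /* Example client that opts out: nothing is allocated for it. */
    static bool client_off_need(void) { return false; }
    static const struct page_ext_ops client_off_ops = { client_off_need, NULL };
    ```

    Running boot with the opted-out client allocates nothing; with the
    opted-in client, the memory is "allocated" and init() runs afterwards,
    mirroring the order described above.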

    Signed-off-by: Joonsoo Kim
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Cc: Dave Hansen
    Cc: Michal Nazarewicz
    Cc: Jungsoo Son
    Cc: Ingo Molnar
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


11 Jan, 2012

1 commit

  • With CONFIG_DEBUG_PAGEALLOC configured, the CPU will generate an
    exception on access (read or write) to an unallocated page, which
    permits us to catch code that corrupts memory. However, the kernel
    tries to maximise memory usage, hence there are usually few free pages
    in the system, and buggy code usually corrupts some crucial data.

    This patch changes the buddy allocator to keep more free/protected pages
    and to interlace free/protected and allocated pages to increase the
    probability of catching corruption.

    When the kernel is compiled with CONFIG_DEBUG_PAGEALLOC,
    debug_guardpage_minorder defines the minimum order used by the page
    allocator to grant a request. The requested size will be returned with
    the remaining pages used as guard pages.

    The default value of debug_guardpage_minorder is zero: no change from
    current behaviour.
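    The resulting split can be shown with a little arithmetic (a model of
    the behaviour described above, not kernel code): for a request of
    2^order pages with debug_guardpage_minorder set, the allocator works on
    a block of at least 2^minorder pages, hands back the requested pages,
    and the surplus becomes guard pages.

    ```c
    /* Pages kept as guards for a given request: the allocator grants a
     * block of at least 2^minorder pages; the caller gets 2^order of
     * them, and the rest are free/protected guard pages. */
    static int guard_pages(int order, int minorder)
    {
        int granted = order < minorder ? minorder : order;
        return (1 << granted) - (1 << order);
    }
    ```

    With the default minorder of zero the surplus is always zero, matching
    the unchanged-behaviour note above; minorder 1 turns every order-0
    request into a 2-page block with one guard page.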

    [akpm@linux-foundation.org: tweak documentation, s/flg/flag/]
    Signed-off-by: Stanislaw Gruszka
    Cc: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: "Rafael J. Wysocki"
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


23 Mar, 2011

1 commit

  • Fix kconfig dependency warning to satisfy dependencies:

    warning: (PAGE_POISONING) selects DEBUG_PAGEALLOC which has unmet
    direct dependencies (DEBUG_KERNEL && ARCH_SUPPORTS_DEBUG_PAGEALLOC &&
    (!HIBERNATION || !PPC && !SPARC) && !KMEMCHECK)

    Signed-off-by: Akinobu Mita
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


21 Sep, 2009

1 commit


15 Jun, 2009

1 commit


03 Apr, 2009

1 commit

  • This fixes a build failure with generic debug pagealloc:

    mm/debug-pagealloc.c: In function 'set_page_poison':
    mm/debug-pagealloc.c:8: error: 'struct page' has no member named 'debug_flags'
    mm/debug-pagealloc.c: In function 'clear_page_poison':
    mm/debug-pagealloc.c:13: error: 'struct page' has no member named 'debug_flags'
    mm/debug-pagealloc.c: In function 'page_poison':
    mm/debug-pagealloc.c:18: error: 'struct page' has no member named 'debug_flags'
    mm/debug-pagealloc.c: At top level:
    mm/debug-pagealloc.c:120: error: redefinition of 'kernel_map_pages'
    include/linux/mm.h:1278: error: previous definition of 'kernel_map_pages' was here
    mm/debug-pagealloc.c: In function 'kernel_map_pages':
    mm/debug-pagealloc.c:122: error: 'debug_pagealloc_enabled' undeclared (first use in this function)

    by fixing two things:

    - ensure debug_flags is a member of struct page
    - define the DEBUG_PAGEALLOC config option for all architectures

    Signed-off-by: Akinobu Mita
    Reported-by: Alexander Beregalov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


01 Apr, 2009

1 commit

  • CONFIG_DEBUG_PAGEALLOC is now supported by x86, powerpc, sparc64, and
    s390. This patch implements it for the rest of the architectures by
    filling the pages with poison byte patterns after free_pages() and
    verifying the poison patterns before alloc_pages().

    This generic version cannot detect invalid page accesses immediately,
    but an invalid read may cause an invalid dereference through the
    poisoned memory, and an invalid write can be detected after a long
    delay.
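    The scheme reduces to two small steps, sketched here as a userspace
    model (helper names are illustrative): fill freed pages with the 0xaa
    pattern and verify it on the next allocation, so a stray write between
    free and alloc shows up later as a broken pattern at a known offset.

    ```c
    #include <stdint.h>
    #include <string.h>

    #define PAGE_SIZE   4096
    #define POISON_BYTE 0xaa   /* the free-page poison pattern */

    /* Fill a page with the poison pattern as it is freed. */
    static void poison_page(uint8_t *page)
    {
        memset(page, POISON_BYTE, PAGE_SIZE);
    }

    /* Verify the pattern on the next allocation; returns the offset of
     * the first corrupted byte, or -1 if the pattern is intact. */
    static int verify_poison(const uint8_t *page)
    {
        for (int i = 0; i < PAGE_SIZE; i++)
            if (page[i] != POISON_BYTE)
                return i;
        return -1;
    }
    ```

    This is why detection is delayed: the write is only noticed when the
    page cycles back through the allocator, not at the moment it happens.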

    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
