04 Feb, 2020

2 commits

  • The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
    seq_file.h.

    Conversion rule is:

    llseek => proc_lseek
    unlocked_ioctl => proc_ioctl

    xxx => proc_xxx

    delete ".owner = THIS_MODULE" line

    [akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
    [sfr@canb.auug.org.au: fix kernel/sched/psi.c]
    Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
    Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • If max_pfn does not fall onto a section boundary, it is possible to
    inspect PFNs up to max_pfn and PFNs above max_pfn; however, max_pfn
    itself can't be inspected. We can have a valid (and online) memmap at and
    above max_pfn if max_pfn is not aligned to a section boundary. The whole
    early section has a memmap and is marked online. Being able to inspect
    the state of these PFNs is valuable for debugging, especially because
    max_pfn can change on memory hotplug and expose these memmaps.

    Also, querying page flags via "./page-types -r -a 0x144001,"
    (tools/vm/page-types.c) inside an x86-64 guest with 4160MB under QEMU
    results in an (almost) endless loop in user space, because the end is not
    detected properly when starting after max_pfn.

    Instead, let's allow inspecting all pages in the highest section and
    return 0 directly if we try to access pages above that section.

    While at it, check the count before adjusting it, to avoid masking user
    errors.
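
    The bounds handling described above can be modelled in userspace. This
    is only a sketch: kpage_clamp_read() and round_up_pfn() are hypothetical
    names, and the section size used in the usage note (32768 pages, i.e.
    128 MiB sections with 4 KiB pages on x86-64) is an assumption, not
    something stated in this commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define KPMSIZE ((uint64_t)sizeof(uint64_t))   /* one 64-bit entry per PFN */
#define KPMMASK (KPMSIZE - 1)

/* Round pfn up to the next multiple of pages_per_section (a power of two). */
static uint64_t round_up_pfn(uint64_t pfn, uint64_t pages_per_section)
{
    assert(pages_per_section != 0 &&
           (pages_per_section & (pages_per_section - 1)) == 0);
    return (pfn + pages_per_section - 1) & ~(pages_per_section - 1);
}

/*
 * Model of a read of count bytes at byte offset src.
 * Returns -EINVAL for misaligned requests, 0 when src lies at or beyond
 * the end of the highest section, else the number of bytes to serve.
 */
static int64_t kpage_clamp_read(uint64_t src, uint64_t count,
                                uint64_t max_pfn, uint64_t pages_per_section)
{
    uint64_t end = round_up_pfn(max_pfn, pages_per_section) * KPMSIZE;

    /* Validate the caller's values before count is adjusted. */
    if ((src & KPMMASK) || (count & KPMMASK))
        return -EINVAL;
    if (src >= end)
        return 0;
    return (int64_t)(count < end - src ? count : end - src);
}
```

    With max_pfn = 0x144000 and 0x8000 pages per section, the end moves to
    PFN 0x148000, so a read starting at PFN 0x144001 is served and a read
    starting at PFN 0x148000 returns 0 immediately.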

    Link: http://lkml.kernel.org/r/20191211163201.17179-3-david@redhat.com
    Signed-off-by: David Hildenbrand
    Cc: Alexey Dobriyan
    Cc: Oscar Salvador
    Cc: Michal Hocko
    Cc: Stephen Rothwell
    Cc: Bob Picco
    Cc: Daniel Jordan
    Cc: Dan Williams
    Cc: Michal Hocko
    Cc: Naoya Horiguchi
    Cc: Pavel Tatashin
    Cc: Steven Sistare
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

19 Oct, 2019

1 commit

  • There are three places where we access uninitialized memmaps, namely:
    - /proc/kpagecount
    - /proc/kpageflags
    - /proc/kpagecgroup

    We have initialized memmaps either when the section is online or when the
    page was initialized to ZONE_DEVICE. Uninitialized memmaps contain
    garbage and in the worst case trigger kernel BUGs, especially with
    CONFIG_PAGE_POISONING.

    For example, not onlining a DIMM during boot and calling /proc/kpagecount
    with CONFIG_PAGE_POISONING:

    :/# cat /proc/kpagecount > tmp.test
    BUG: unable to handle page fault for address: fffffffffffffffe
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 114616067 P4D 114616067 PUD 114618067 PMD 0
    Oops: 0000 [#1] SMP NOPTI
    CPU: 0 PID: 469 Comm: cat Not tainted 5.4.0-rc1-next-20191004+ #11
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
    RIP: 0010:kpagecount_read+0xce/0x1e0
    Code: e8 09 83 e0 3f 48 0f a3 02 73 2d 4c 89 e7 48 c1 e7 06 48 03 3d ab 51 01 01 74 1d 48 8b 57 08 480
    RSP: 0018:ffffa14e409b7e78 EFLAGS: 00010202
    RAX: fffffffffffffffe RBX: 0000000000020000 RCX: 0000000000000000
    RDX: 0000000000000001 RSI: 00007f76b5595000 RDI: fffff35645000000
    RBP: 00007f76b5595000 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
    R13: 0000000000020000 R14: 00007f76b5595000 R15: ffffa14e409b7f08
    FS: 00007f76b577d580(0000) GS:ffff8f41bd400000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: fffffffffffffffe CR3: 0000000078960000 CR4: 00000000000006f0
    Call Trace:
    proc_reg_read+0x3c/0x60
    vfs_read+0xc5/0x180
    ksys_read+0x68/0xe0
    do_syscall_64+0x5c/0xa0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    For now, let's drop support for ZONE_DEVICE from the three pseudo files
    in order to fix this. To distinguish offline memory (with garbage
    memmap) from ZONE_DEVICE memory with properly initialized memmaps, we
    would have to check get_dev_pagemap() and pfn_zone_device_reserved()
    right now. The usage of both (especially, special casing devmem) is
    frowned upon and needs to be reworked.

    The fundamental issue we have is:

    if (pfn_to_online_page(pfn)) {
            /* memmap initialized */
    } else if (pfn_valid(pfn)) {
            /*
             * ???
             * a) offline memory. memmap garbage.
             * b) devmem: memmap initialized to ZONE_DEVICE.
             * c) devmem: reserved for driver. memmap garbage.
             * (d) devmem: memmap currently initializing - garbage)
             */
    }

    We'll leave the pfn_zone_device_reserved() check in stable_page_flags()
    in place as that function is also used from memory failure. We now no
    longer dump information about offline pages, which are not in use
    anymore.

    Link: http://lkml.kernel.org/r/20191009142435.3975-2-david@redhat.com
    Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319]
    Signed-off-by: David Hildenbrand
    Reported-by: Qian Cai
    Acked-by: Michal Hocko
    Cc: Dan Williams
    Cc: Alexey Dobriyan
    Cc: Stephen Rothwell
    Cc: Toshiki Fukasawa
    Cc: Pankaj gupta
    Cc: Mike Rapoport
    Cc: Anthony Yznaga
    Cc: "Aneesh Kumar K.V"
    Cc: [4.13+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

06 Mar, 2019

1 commit

  • PG_balloon was introduced to implement page migration/compaction for
    pages inflated in virtio-balloon. Nowadays, it is only a marker that a
    page is part of virtio-balloon and therefore logically offline.

    We also want to make use of this flag in other balloon drivers - for
    inflated pages or when onlining a section but keeping some pages offline
    (e.g. used right now by XEN and Hyper-V via set_online_page_callback()).

    We are going to expose this flag to dump tools like makedumpfile. But
    instead of exposing PG_balloon, let's generalize the concept of marking
    pages as logically offline, so it can be reused for other purposes later
    on.

    Rename PG_balloon to PG_offline. This is an indicator that the page is
    logically offline, the content stale and that it should not be touched
    (e.g. a hypervisor would have to allocate backing storage in order for
    the guest to dump an unused page). We can then e.g. exclude such pages
    from dumps.

    We replace and reuse KPF_BALLOON (23), as this shouldn't really do harm
    (and for now the semantics stay the same). In following patches, we
    will make use of this bit also in other balloon drivers. While at it,
    document PGTABLE.
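
    As a rough illustration of how a userspace tool might consume the
    renamed bit, the sketch below tests bit 23 in 64-bit /proc/kpageflags
    words. The helper names and the "skip offline pages" policy are
    hypothetical; only the bit number comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KPF_OFFLINE 23   /* bit number reused from KPF_BALLOON */

/* True if the kpageflags word marks the page as logically offline. */
static int kpf_is_offline(uint64_t flags)
{
    return (int)((flags >> KPF_OFFLINE) & 1);
}

/* Count the entries a dump tool could exclude (hypothetical policy). */
static size_t count_offline(const uint64_t *flags, size_t n)
{
    size_t hits = 0;

    assert(flags != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        hits += (size_t)kpf_is_offline(flags[i]);
    return hits;
}
```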

    [akpm@linux-foundation.org: fix comment text, per David]
    Link: http://lkml.kernel.org/r/20181119101616.8901-3-david@redhat.com
    Signed-off-by: David Hildenbrand
    Acked-by: Konstantin Khlebnikov
    Acked-by: Michael S. Tsirkin
    Acked-by: Pankaj gupta
    Cc: Jonathan Corbet
    Cc: Alexey Dobriyan
    Cc: Mike Rapoport
    Cc: Christian Hansen
    Cc: Vlastimil Babka
    Cc: "Kirill A. Shutemov"
    Cc: Stephen Rothwell
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: Pavel Tatashin
    Cc: Alexander Duyck
    Cc: Naoya Horiguchi
    Cc: Miles Chen
    Cc: David Rientjes
    Cc: Kazuhito Hagio
    Cc: Arnd Bergmann
    Cc: Baoquan He
    Cc: Borislav Petkov
    Cc: Boris Ostrovsky
    Cc: Dave Young
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Juergen Gross
    Cc: Julien Freche
    Cc: Kairui Song
    Cc: "K. Y. Srinivasan"
    Cc: Len Brown
    Cc: Lianbo Jiang
    Cc: Michal Hocko
    Cc: Nadav Amit
    Cc: Omar Sandoval
    Cc: Pavel Machek
    Cc: Rafael J. Wysocki
    Cc: "Rafael J. Wysocki"
    Cc: Stefano Stabellini
    Cc: Stephen Hemminger
    Cc: Vitaly Kuznetsov
    Cc: Xavier Deguillard
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

29 Dec, 2018

1 commit

  • Certain pages that are never mapped to userspace have a type indicated in
    the page_type field of their struct pages (e.g. PG_buddy). page_type
    overlaps with _mapcount so set the count to 0 and avoid calling
    page_mapcount() for these pages.

    [anthony.yznaga@oracle.com: incorporate feedback from Matthew Wilcox]
    Link: http://lkml.kernel.org/r/1544481313-27318-1-git-send-email-anthony.yznaga@oracle.com
    Link: http://lkml.kernel.org/r/1543963526-27917-1-git-send-email-anthony.yznaga@oracle.com
    Signed-off-by: Anthony Yznaga
    Reviewed-by: Andrew Morton
    Acked-by: Matthew Wilcox
    Reviewed-by: Naoya Horiguchi
    Cc: Vlastimil Babka
    Cc: David Rientjes
    Cc: Alexey Dobriyan
    Cc: Kirill A. Shutemov
    Cc: Mike Rapoport
    Cc: Michal Hocko
    Cc: Alexander Duyck
    Cc: Johannes Weiner
    Cc: Miles Chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anthony Yznaga
     

31 Oct, 2018

1 commit

  • Move remaining definitions and declarations from include/linux/bootmem.h
    into include/linux/memblock.h and remove the redundant header.

    The includes were replaced with the semantic patch below, followed by
    semi-automated removal of duplicated '#include <linux/memblock.h>'

    @@
    @@
    - #include <linux/bootmem.h>
    + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
    Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Stephen Rothwell
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

08 Jun, 2018

1 commit

  • Define a new PageTable bit in the page_type and use it to mark pages in
    use as page tables. This can be helpful when debugging crashdumps or
    analysing memory fragmentation. Add a KPF flag to report these pages to
    userspace and update page-types.c to interpret that flag.

    Note that only pages currently accounted as NR_PAGETABLES are tracked as
    PageTable; this does not include pgd/p4d/pud/pmd pages. Those will be the
    subject of a later patch.

    Link: http://lkml.kernel.org/r/20180518194519.3820-4-willy@infradead.org
    Signed-off-by: Matthew Wilcox
    Acked-by: Kirill A. Shutemov
    Acked-by: Vlastimil Babka
    Cc: Christoph Lameter
    Cc: Dave Hansen
    Cc: Jérôme Glisse
    Cc: Lai Jiangshan
    Cc: Martin Schwidefsky
    Cc: Pekka Enberg
    Cc: Randy Dunlap
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

08 Feb, 2017

1 commit

  • Commit 6326fec1122c ("mm: Use owner_priv bit for PageSwapCache, valid
    when PageSwapBacked") aliased PG_swapcache to PG_owner_priv_1 (and
    depending on PageSwapBacked being true).

    As a result, the KPF_SWAPCACHE bit in '/proc/kpageflags' should now be
    synthesized, instead of being shown on unrelated pages which just happen
    to have PG_owner_priv_1 set.
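
    A minimal sketch of what "synthesized" means here: the exported bit is
    computed from two internal flags instead of being copied from one. The
    internal bit positions below are hypothetical stand-ins; bit 13 for
    KPF_SWAPCACHE matches the exported flag numbering.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical internal bit positions, for the model only. */
#define PG_OWNER_PRIV_1 (UINT64_C(1) << 0)
#define PG_SWAPBACKED   (UINT64_C(1) << 1)

#define KPF_SWAPCACHE   13

static_assert(KPF_SWAPCACHE < 64, "exported bit must fit in a 64-bit word");

/* Report KPF_SWAPCACHE only when both internal flags are set. */
static uint64_t synth_swapcache(uint64_t page_flags)
{
    uint64_t u = 0;

    if ((page_flags & PG_SWAPBACKED) && (page_flags & PG_OWNER_PRIV_1))
        u |= UINT64_C(1) << KPF_SWAPCACHE;
    return u;
}
```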

    Signed-off-by: Hugh Dickins
    Cc: Andrew Morton
    Cc: Nicholas Piggin
    Cc: Wu Fengguang
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

25 Dec, 2016

1 commit


20 May, 2016

1 commit

  • Many developers already know that the reference count field of struct
    page is _count and that it is an atomic type. They may try to manipulate
    it directly, which would defeat the purpose of the page reference count
    tracepoint. To prevent direct _count modification, this patch renames it
    to _refcount and adds a warning comment in the code. After that,
    developers who need to handle the reference count will see that the
    field should not be accessed directly.

    [akpm@linux-foundation.org: fix comments, per Vlastimil]
    [akpm@linux-foundation.org: Documentation/vm/transhuge.txt too]
    [sfr@canb.auug.org.au: sync ethernet driver changes]
    Signed-off-by: Joonsoo Kim
    Signed-off-by: Stephen Rothwell
    Cc: Vlastimil Babka
    Cc: Hugh Dickins
    Cc: Johannes Berg
    Cc: "David S. Miller"
    Cc: Sunil Goutham
    Cc: Chris Metcalf
    Cc: Manish Chopra
    Cc: Yuval Mintz
    Cc: Tariq Toukan
    Cc: Saeed Mahameed
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

18 Mar, 2016

2 commits

  • Currently /proc/kpageflags returns just KPF_COMPOUND_TAIL for slab tail
    pages, which is inconvenient when trying to see how slab pages are
    distributed (userspace always has to work out the kind of each tail
    page by itself). This patch sets KPF_SLAB for such pages.

    With this patch:

    $ grep Slab /proc/meminfo ; tools/vm/page-types -b slab
    Slab:              64880 kB
                 flags  page-count       MB  symbolic-flags                              long-symbolic-flags
    0x0000000000000080       16220       63  _______S__________________________________  slab
                 total       16220       63

    16220 pages equal 64880 kB, so the returned result is consistent with
    the global counter.
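
    The consistency check is plain arithmetic, sketched below under the
    assumption of 4 KiB pages (the page size is not stated in the commit);
    pages_to_kb() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a page count to kB for a page size given in bytes. */
static uint64_t pages_to_kb(uint64_t pages, uint64_t page_size)
{
    assert(page_size >= 1024 && page_size % 1024 == 0);
    return pages * (page_size / 1024);
}
```

    pages_to_kb(16220, 4096) gives 64880, matching the Slab line above.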

    Signed-off-by: Naoya Horiguchi
    Reviewed-by: Vladimir Davydov
    Cc: Konstantin Khlebnikov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • Currently /proc/kpageflags returns nothing for "tail" buddy pages, which
    is inconvenient when grasping how free pages are distributed. This
    patch sets KPF_BUDDY for such pages.

    With this patch:

    $ grep MemFree /proc/meminfo ; tools/vm/page-types -b buddy
    MemFree:         3134992 kB
                 flags  page-count       MB  symbolic-flags                              long-symbolic-flags
    0x0000000000000400      779272     3044  __________B_______________________________  buddy
    0x0000000000000c00        4385       17  __________BM______________________________  buddy,mmap
                 total      783657     3061

    783657 pages is 3134628 kB, roughly consistent with the global counter,
    so it's OK.

    [akpm@linux-foundation.org: update comment, per Naoya]
    Signed-off-by: Naoya Horiguchi
    Reviewed-by: Vladimir Davydov
    Cc: Konstantin Khlebnikov
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

16 Jan, 2016

1 commit

  • Let's define page_mapped() to be true for compound pages if any
    sub-page of the compound page is mapped (with PMD or PTE).

    On the other hand, page_mapcount() returns the mapcount for this
    particular small page.

    This will make cases like page_get_anon_vma() behave correctly once we
    allow huge pages to be mapped with PTE.

    Most users outside core-mm should use page_mapcount() instead of
    page_mapped().

    Signed-off-by: Kirill A. Shutemov
    Tested-by: Sasha Levin
    Tested-by: Aneesh Kumar K.V
    Acked-by: Jerome Marchand
    Cc: Vlastimil Babka
    Cc: Andrea Arcangeli
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Naoya Horiguchi
    Cc: Steve Capper
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Christoph Lameter
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

11 Sep, 2015

4 commits

  • Reading/writing a /proc/kpage* file may take a long time on machines
    with a lot of RAM installed.

    Signed-off-by: Vladimir Davydov
    Suggested-by: Andres Lagar-Cavilla
    Reviewed-by: Andres Lagar-Cavilla
    Cc: Minchan Kim
    Cc: Raghavendra K T
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Greg Thelen
    Cc: Michel Lespinasse
    Cc: David Rientjes
    Cc: Pavel Emelyanov
    Cc: Cyrill Gorcunov
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • As noted by Minchan, a benefit of reading idle flag from /proc/kpageflags
    is that one can easily filter dirty and/or unevictable pages while
    estimating the size of unused memory.

    Note that idle flag read from /proc/kpageflags may be stale in case the
    page was accessed via a PTE, because it would be too costly to iterate
    over all page mappings on each /proc/kpageflags read to provide an
    up-to-date value. To make sure the flag is up-to-date one has to read
    /sys/kernel/mm/page_idle/bitmap first.

    Signed-off-by: Vladimir Davydov
    Reviewed-by: Andres Lagar-Cavilla
    Cc: Minchan Kim
    Cc: Raghavendra K T
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Greg Thelen
    Cc: Michel Lespinasse
    Cc: David Rientjes
    Cc: Pavel Emelyanov
    Cc: Cyrill Gorcunov
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • Knowing the portion of memory that is not used by a certain application or
    memory cgroup (idle memory) can be useful for partitioning the system
    efficiently, e.g. by setting memory cgroup limits appropriately.
    Currently, the only means to estimate the amount of idle memory provided
    by the kernel is /proc/PID/{clear_refs,smaps}: the user can clear the
    access bit for all pages mapped to a particular process by writing 1 to
    clear_refs, wait for some time, and then count smaps:Referenced. However,
    this method has two serious shortcomings:

    - it does not count unmapped file pages
    - it affects the reclaimer logic

    To overcome these drawbacks, this patch introduces two new page flags,
    Idle and Young, and a new sysfs file, /sys/kernel/mm/page_idle/bitmap.
    A page's Idle flag can only be set from userspace by setting the bit in
    /sys/kernel/mm/page_idle/bitmap at the offset corresponding to the page,
    and it is cleared whenever the page is accessed either through page tables
    (it is cleared in page_referenced() in this case) or using the read(2)
    system call (mark_page_accessed()). Thus by setting the Idle flag for
    pages of a particular workload, which can be found e.g. by reading
    /proc/PID/pagemap, waiting for some time to let the workload access its
    working set, and then reading the bitmap file, one can estimate the amount
    of pages that are not used by the workload.
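
    To make "the offset corresponding to the page" concrete, the sketch
    below assumes the layout described in the kernel's idle page tracking
    documentation rather than in this commit text: the bitmap is an array
    of 8-byte words in native byte order, and PFN i maps to bit i % 64 of
    word i / 64. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct idle_bit {
    uint64_t byte_offset;  /* offset of the 8-byte word in the bitmap file */
    uint64_t mask;         /* bit to set or test within that word */
};

/* Locate the bit for pfn under the assumed layout. */
static struct idle_bit idle_bit_for_pfn(uint64_t pfn)
{
    struct idle_bit b;

    b.byte_offset = (pfn / 64) * 8;
    b.mask = UINT64_C(1) << (pfn % 64);
    return b;
}

/* Test the bit for pfn in words already read from offset 0 of the file. */
static int idle_bit_is_set(const uint64_t *words, size_t nwords, uint64_t pfn)
{
    assert(words != NULL && pfn / 64 < nwords);
    return (words[pfn / 64] & idle_bit_for_pfn(pfn).mask) != 0;
}
```

    A tool would write the mask at byte_offset to mark the page idle, wait,
    then read the same word back and test the bit.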

    The Young page flag is used to avoid interference with the memory
    reclaimer. A page's Young flag is set whenever the Access bit of a page
    table entry pointing to the page is cleared by writing to the bitmap file.
    If page_referenced() is called on a Young page, it will add 1 to its
    return value, therefore concealing the fact that the Access bit was
    cleared.

    Note, since there is no room for extra page flags on 32 bit, this feature
    uses extended page flags when compiled on 32 bit.

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: kpageidle requires an MMU]
    [akpm@linux-foundation.org: decouple from page-flags rework]
    Signed-off-by: Vladimir Davydov
    Reviewed-by: Andres Lagar-Cavilla
    Cc: Minchan Kim
    Cc: Raghavendra K T
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Greg Thelen
    Cc: Michel Lespinasse
    Cc: David Rientjes
    Cc: Pavel Emelyanov
    Cc: Cyrill Gorcunov
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • /proc/kpagecgroup contains a 64-bit inode number of the memory cgroup each
    page is charged to, indexed by PFN. Having this information is useful for
    estimating a cgroup working set size.
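
    "Indexed by PFN" means the entry for a page sits at byte offset
    PFN * 8. A minimal userspace sketch follows; read_kpage_entry() is a
    hypothetical helper, and reading the real file typically requires
    root.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Read the 64-bit entry for pfn from a PFN-indexed file such as
 * /proc/kpagecgroup. Returns 0 on success, -1 on error or short read.
 */
static int read_kpage_entry(const char *path, uint64_t pfn, uint64_t *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Each entry is 8 bytes, so the entry for pfn starts at pfn * 8. */
    ssize_t n = pread(fd, out, sizeof(*out), (off_t)(pfn * sizeof(*out)));
    close(fd);
    return n == (ssize_t)sizeof(*out) ? 0 : -1;
}
```

    The same access pattern applies to /proc/kpagecount and
    /proc/kpageflags, which are also arrays of 64-bit values.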

    The file is present if CONFIG_PROC_PAGE_MONITOR && CONFIG_MEMCG.

    Signed-off-by: Vladimir Davydov
    Reviewed-by: Andres Lagar-Cavilla
    Cc: Minchan Kim
    Cc: Raghavendra K T
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Greg Thelen
    Cc: Michel Lespinasse
    Cc: David Rientjes
    Cc: Pavel Emelyanov
    Cc: Cyrill Gorcunov
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

12 Feb, 2015

1 commit

  • Add KPF_ZERO_PAGE flag for zero_page, so that userspace processes can
    detect zero_page in /proc/kpageflags, and then do memory analysis more
    accurately.

    Signed-off-by: Yalin Wang
    Acked-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wang, Yalin
     

10 Oct, 2014

1 commit

  • Always mark pages with PageBalloon even if balloon compaction is disabled
    and expose this mark in /proc/kpageflags as KPF_BALLOON.

    Also this patch adds three counters into /proc/vmstat: "balloon_inflate",
    "balloon_deflate" and "balloon_migrate". They accumulate balloon
    activity. Current size of balloon is (balloon_inflate - balloon_deflate)
    pages.
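
    The balloon size formula can be applied to the text of /proc/vmstat.
    The sketch below works on a buffer the caller has already read;
    vmstat_lookup() and balloon_pages() are hypothetical helpers, and the
    "name value" line format is assumed from how /proc/vmstat is normally
    laid out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Find a "name value" line in vmstat-style text. Returns 0 on success. */
static int vmstat_lookup(const char *text, const char *name,
                         unsigned long long *out)
{
    assert(text != NULL && name != NULL && out != NULL);

    size_t len = strlen(name);
    for (const char *line = text; line != NULL && *line != '\0'; ) {
        if (strncmp(line, name, len) == 0 && line[len] == ' ')
            return sscanf(line + len, "%llu", out) == 1 ? 0 : -1;
        line = strchr(line, '\n');   /* advance to the next line, if any */
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Current balloon size in pages: balloon_inflate - balloon_deflate. */
static int balloon_pages(const char *vmstat_text, unsigned long long *pages)
{
    unsigned long long inflate, deflate;

    if (vmstat_lookup(vmstat_text, "balloon_inflate", &inflate) ||
        vmstat_lookup(vmstat_text, "balloon_deflate", &deflate) ||
        deflate > inflate)
        return -1;
    *pages = inflate - deflate;
    return 0;
}
```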

    All generic balloon code is now gathered under the option
    CONFIG_MEMORY_BALLOON. It should be selected by any ballooning driver
    that wants to use this feature. Currently virtio-balloon is the only user.

    Signed-off-by: Konstantin Khlebnikov
    Cc: Rafael Aquini
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

04 Mar, 2014

1 commit

  • Commit bf6bddf1924e ("mm: introduce compaction and migration for
    ballooned pages") introduces page_count(page) into memory compaction
    which dereferences page->first_page if PageTail(page).

    This results in a very rare NULL pointer dereference on the
    aforementioned page_count(page). Indeed, anything that does
    compound_head(), including page_count() is susceptible to racing with
    prep_compound_page() and seeing a NULL or dangling page->first_page
    pointer.

    This patch uses Andrea's implementation of compound_trans_head() that
    deals with such a race and makes it the default compound_head()
    implementation. This includes a read memory barrier that ensures that
    if PageTail(head) is true that we return a head page that is neither
    NULL nor dangling. The patch then adds a store memory barrier to
    prep_compound_page() to ensure page->first_page is set.

    This is the safest way to ensure we see the head page that we are
    expecting, PageTail(page) is already in the unlikely() path and the
    memory barriers are unfortunately required.

    Hugetlbfs is the exception, we don't enforce a store memory barrier
    during init since no race is possible.

    Signed-off-by: David Rientjes
    Cc: Holger Kiehl
    Cc: Christoph Lameter
    Cc: Rafael Aquini
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: "Kirill A. Shutemov"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

24 Jan, 2014

2 commits

  • PROC_FS is a bool, so this code is either present or absent. It will
    never be modular, so using module_init as an alias for __initcall is
    rather misleading.

    Fix this up now, so that we can relocate module_init from init.h into
    module.h in the future. If we don't do this, we'd have to add module.h to
    obviously non-modular code, and that would be ugly at best.

    Note that direct use of __initcall is discouraged, vs. one of the
    priority categorized subgroups. As __initcall gets mapped onto
    device_initcall, our use of fs_initcall (which makes sense for fs code)
    will thus change these registrations from level 6-device to level 5-fs
    (i.e. slightly earlier). However no observable impact of that small
    difference has been observed during testing, or is expected.

    Also note that this change uncovers a missing semicolon bug in the
    registration of vmcore_init as an initcall.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Gortmaker
     
  • stable_page_flags() checks !PageHuge && PageTransCompound && PageLRU to
    determine whether a given page is a thp. But sometimes that is not
    enough, and we fail to detect a thp while it is on a pagevec. This
    happens only for a few seconds after LRU list operations, but it makes
    it difficult to control applications that depend on this flag.

    So this patch adds another check, PageAnon, to detect thps on a pagevec.
    It might not extend to a future thp pagecache, but it's OK at least for
    now.

    Signed-off-by: Naoya Horiguchi
    Cc: David Rientjes
    Cc: KOSAKI Motohiro
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

09 Oct, 2012

1 commit

  • KPF_THP can be set on non-huge compound pages (like slab pages or pages
    allocated by drivers with __GFP_COMP) because PageTransCompound only
    checks PG_head and PG_tail. Obviously this is a bug and breaks user space
    applications which look for thp via /proc/kpageflags.

    This patch rules out setting KPF_THP wrongly by additionally checking
    PageLRU on the head pages.

    Signed-off-by: Naoya Horiguchi
    Acked-by: KOSAKI Motohiro
    Acked-by: David Rientjes
    Reviewed-by: Fengguang Wu
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

22 Mar, 2012

1 commit

  • This flag shows that a given page is a subpage of a transparent hugepage.
    It helps us debug and test the kernel by showing physical address of thp.

    Signed-off-by: Naoya Horiguchi
    Reviewed-by: Wu Fengguang
    Reviewed-by: KAMEZAWA Hiroyuki
    Acked-by: KOSAKI Motohiro
    Cc: David Rientjes
    Cc: Andi Kleen
    Cc: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

14 Jan, 2011

2 commits

  • PG_buddy can be converted to _mapcount == -2. So the PG_compound_lock can
    be added to page->flags without overflowing (because of the sparse section
    bits increasing) with CONFIG_X86_PAE=y and CONFIG_X86_PAT=y. This also
    has to move the memory hotplug code from _mapcount to lru.next to avoid
    any risk of clashes. We can't use lru.next for PG_buddy removal, but
    memory hotplug can use lru.next even more easily than the mapcount
    instead.

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Add a PageSlab() check before adding the _mapcount value to /proc/kpagecount.
    page->_mapcount is in a union with the SLAB structure so for pages
    controlled by SLAB, page_mapcount() returns nonsense.

    Signed-off-by: Petr Holasek
    Cc: Wu Fengguang
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Holasek
     

10 Sep, 2010

1 commit


16 Dec, 2009

1 commit

  • Rename get_uflags() to stable_page_flags() and make it a global function
    for use in the hwpoison page flags filter, which needs to compare user
    page flags with the value provided by user space.

    Also move KPF_* to kernel-page-flags.h for use by user space tools.

    Acked-by: Matt Mackall
    Signed-off-by: Andi Kleen
    CC: Nick Piggin
    CC: Christoph Lameter
    Signed-off-by: Wu Fengguang
    Signed-off-by: Andi Kleen

    Wu Fengguang
     

08 Oct, 2009

1 commit

  • This flag indicates a hardware detected memory corruption on the page.
    Any future access of the page data may bring down the machine.

    Signed-off-by: Wu Fengguang
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     

22 Sep, 2009

1 commit

  • KSM will need to identify its kernel merged pages unambiguously, and
    /proc/kpageflags will probably like to do so too.

    Since KSM will only be substituting anonymous pages, statistics are best
    preserved by making a PageKsm page a special PageAnon page: one with no
    anon_vma.

    But KSM then needs its own page_add_ksm_rmap() - keep it in ksm.h near
    PageKsm; and do_wp_page() must COW them, unlike singly mapped PageAnons.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Chris Wright
    Signed-off-by: Izik Eidus
    Cc: Wu Fengguang
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: Wu Fengguang
    Cc: Balbir Singh
    Cc: Hugh Dickins
    Cc: KAMEZAWA Hiroyuki
    Cc: Lee Schermerhorn
    Cc: Avi Kivity
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

17 Jun, 2009

4 commits

  • Currently, nobody wants to turn UNEVICTABLE_LRU off. Thus this
    configurability is unnecessary.

    Signed-off-by: KOSAKI Motohiro
    Cc: Johannes Weiner
    Cc: Andi Kleen
    Acked-by: Minchan Kim
    Cc: David Woodhouse
    Cc: Matt Mackall
    Cc: Rik van Riel
    Cc: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Export all page flags faithfully in /proc/kpageflags.

    11. KPF_MMAP (pseudo flag) memory mapped page
    12. KPF_ANON (pseudo flag) memory mapped page (anonymous)
    13. KPF_SWAPCACHE page is in swap cache
    14. KPF_SWAPBACKED page is swap/RAM backed
    15. KPF_COMPOUND_HEAD (*)
    16. KPF_COMPOUND_TAIL (*)
    17. KPF_HUGE hugeTLB pages
    18. KPF_UNEVICTABLE page is in the unevictable LRU list
    19. KPF_HWPOISON(TBD) hardware detected corruption
    20. KPF_NOPAGE (pseudo flag) no page frame at the address
    32-39. more obscure flags for kernel developers

    (*) For compound pages, exporting _both_ head/tail info enables
    users to tell where a compound page starts/ends, and its order.

    The accompanying page-types tool will handle the details like decoupling
    overloaded flags and hiding obscure flags to normal users.
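
    As a rough illustration of how a user-space consumer might read the
    exported flags: /proc/kpageflags holds one 64-bit word per page frame,
    indexed by PFN. The helper name read_kpageflags and its parameters are
    hypothetical, and the path is passed in so the sketch can also be pointed
    at an ordinary file. This is not the page-types implementation.

```python
import struct

ENTRY_SIZE = 8  # one 64-bit flag word per page frame


def read_kpageflags(path, first_pfn, count):
    """Return up to `count` flag words starting at page frame `first_pfn`.

    `path` is normally /proc/kpageflags (reading it needs root); it is a
    parameter here so the helper also works on a regular file.
    """
    with open(path, "rb") as f:
        f.seek(first_pfn * ENTRY_SIZE)
        data = f.read(count * ENTRY_SIZE)
    # Ignore a trailing partial entry, if the read ended mid-word.
    usable = len(data) - len(data) % ENTRY_SIZE
    # "=Q": native byte order, standard 8-byte unsigned integer.
    return [word for (word,) in struct.iter_unpack("=Q", data[:usable])]
```

    A read that runs past the end of the file simply returns fewer words.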

    Thanks to KOSAKI and Andi for their valuable recommendations!

    Signed-off-by: Wu Fengguang
    Cc: KOSAKI Motohiro
    Cc: Andi Kleen
    Cc: Matt Mackall
    Cc: Alexey Dobriyan
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Move increments of pfn/out to bottom of the loop.

    Signed-off-by: Wu Fengguang
    Cc: KOSAKI Motohiro
    Cc: Andi Kleen
    Acked-by: Matt Mackall
    Cc: Alexey Dobriyan
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • A series of patches to enhance the /proc/pagemap interface and to add a
    userspace executable which can be used to present the pagemap data.

    Export 10 more flags to end users (and more for kernel developers):

    11. KPF_MMAP (pseudo flag) memory mapped page
    12. KPF_ANON (pseudo flag) memory mapped page (anonymous)
    13. KPF_SWAPCACHE page is in swap cache
    14. KPF_SWAPBACKED page is swap/RAM backed
    15. KPF_COMPOUND_HEAD (*)
    16. KPF_COMPOUND_TAIL (*)
    17. KPF_HUGE hugeTLB pages
    18. KPF_UNEVICTABLE page is in the unevictable LRU list
    19. KPF_HWPOISON hardware detected corruption
    20. KPF_NOPAGE (pseudo flag) no page frame at the address

    (*) For compound pages, exporting _both_ head/tail info enables
    users to tell where a compound page starts/ends, and its order.
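
    The note on compound pages can be sketched as follows: given the flag
    words of consecutive page frames, a head followed by its run of tails
    gives the extent of a compound page, and the run length gives its order.
    The function name compound_pages is hypothetical; the bit positions 15
    and 16 are taken from the numbering in the list above.

```python
KPF_COMPOUND_HEAD = 1 << 15
KPF_COMPOUND_TAIL = 1 << 16


def compound_pages(flags):
    """Yield (start_index, n_pages, order) for each compound page.

    `flags` is a sequence of flag words for consecutive page frames.
    A compound page is taken to be one head plus the tails that follow it.
    """
    i = 0
    while i < len(flags):
        if flags[i] & KPF_COMPOUND_HEAD:
            n = 1
            while i + n < len(flags) and flags[i + n] & KPF_COMPOUND_TAIL:
                n += 1
            # For a complete compound page n is a power of two, so
            # bit_length() - 1 equals log2(n).
            yield i, n, n.bit_length() - 1
            i += n
        else:
            i += 1
```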

    A simple demo of the page-types tool:

    # ./page-types -h
    page-types [options]
    -r|--raw Raw mode, for kernel developers
    -a|--addr addr-spec Walk a range of pages
    -b|--bits bits-spec Walk pages with specified bits
    -l|--list Show page details in ranges
    -L|--list-each Show page details one by one
    -N|--no-summary Don't show summary info
    -h|--help Show this usage message
    addr-spec:
    N one page at offset N (unit: pages)
    N+M pages range from N to N+M-1
    N,M pages range from N to M-1
    N, pages range from N to end
    ,M pages range from 0 to M-1
    bits-spec:
    bit1,bit2 (flags & (bit1|bit2)) != 0
    bit1,bit2=bit1 (flags & (bit1|bit2)) == bit1
    bit1,~bit2 (flags & (bit1|bit2)) == bit1
    =bit1,bit2 flags == (bit1|bit2)
    bit-names:
    locked error referenced uptodate
    dirty lru active slab
    writeback reclaim buddy mmap
    anonymous swapcache swapbacked compound_head
    compound_tail huge unevictable hwpoison
    nopage reserved(r) mlocked(r) mappedtodisk(r)
    private(r) private_2(r) owner_private(r) arch(r)
    uncached(r) readahead(o) slob_free(o) slub_frozen(o)
    slub_debug(o)
    (r) raw mode bits (o) overloaded bits
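
    The four bits-spec forms in the help text above can be read as
    mask/value tests. A minimal sketch of that reading, assuming bit
    positions 0 to 20 follow the order of the bit-names list (which agrees
    with the numbered flags earlier in this message); parse_bits_spec and
    mask_of are hypothetical names, raw and overloaded bits are left out,
    and this is not the parser used by page-types itself.

```python
BIT_NAMES = [
    "locked", "error", "referenced", "uptodate", "dirty", "lru", "active",
    "slab", "writeback", "reclaim", "buddy", "mmap", "anonymous",
    "swapcache", "swapbacked", "compound_head", "compound_tail", "huge",
    "unevictable", "hwpoison", "nopage",
]
BIT = {name: 1 << pos for pos, name in enumerate(BIT_NAMES)}
ALL_BITS = (1 << 64) - 1


def mask_of(names):
    """OR together the bits for an iterable of flag names."""
    mask = 0
    for name in names:
        mask |= BIT[name]
    return mask


def parse_bits_spec(spec):
    """Return a predicate (flags -> bool) for one bits-spec string."""
    if "=" in spec:
        left, right = spec.split("=", 1)
        # "=bit1,bit2" has an empty left side: compare the whole word.
        mask = mask_of(left.split(",")) if left else ALL_BITS
        value = mask_of(right.split(","))
        return lambda flags: (flags & mask) == value
    names = spec.split(",")
    if any(n.startswith("~") for n in names):
        # "~bit" must be clear, unprefixed bits must be set.
        mask = mask_of(n.lstrip("~") for n in names)
        value = mask_of(n for n in names if not n.startswith("~"))
        return lambda flags: (flags & mask) == value
    mask = mask_of(names)
    return lambda flags: (flags & mask) != 0
```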

    # ./page-types
    flags page-count MB symbolic-flags long-symbolic-flags
    0x0000000000000000 487369 1903 _________________________________
    0x0000000000000014 5 0 __R_D____________________________ referenced,dirty
    0x0000000000000020 1 0 _____l___________________________ lru
    0x0000000000000024 34 0 __R__l___________________________ referenced,lru
    0x0000000000000028 3838 14 ___U_l___________________________ uptodate,lru
    0x0001000000000028 48 0 ___U_l_______________________I___ uptodate,lru,readahead
    0x000000000000002c 6478 25 __RU_l___________________________ referenced,uptodate,lru
    0x000100000000002c 47 0 __RU_l_______________________I___ referenced,uptodate,lru,readahead
    0x0000000000000040 8344 32 ______A__________________________ active
    0x0000000000000060 1 0 _____lA__________________________ lru,active
    0x0000000000000068 348 1 ___U_lA__________________________ uptodate,lru,active
    0x0001000000000068 12 0 ___U_lA______________________I___ uptodate,lru,active,readahead
    0x000000000000006c 988 3 __RU_lA__________________________ referenced,uptodate,lru,active
    0x000100000000006c 48 0 __RU_lA______________________I___ referenced,uptodate,lru,active,readahead
    0x0000000000004078 1 0 ___UDlA_______b__________________ uptodate,dirty,lru,active,swapbacked
    0x000000000000407c 34 0 __RUDlA_______b__________________ referenced,uptodate,dirty,lru,active,swapbacked
    0x0000000000000400 503 1 __________B______________________ buddy
    0x0000000000000804 1 0 __R________M_____________________ referenced,mmap
    0x0000000000000828 1029 4 ___U_l_____M_____________________ uptodate,lru,mmap
    0x0001000000000828 43 0 ___U_l_____M_________________I___ uptodate,lru,mmap,readahead
    0x000000000000082c 382 1 __RU_l_____M_____________________ referenced,uptodate,lru,mmap
    0x000100000000082c 12 0 __RU_l_____M_________________I___ referenced,uptodate,lru,mmap,readahead
    0x0000000000000868 192 0 ___U_lA____M_____________________ uptodate,lru,active,mmap
    0x0001000000000868 12 0 ___U_lA____M_________________I___ uptodate,lru,active,mmap,readahead
    0x000000000000086c 800 3 __RU_lA____M_____________________ referenced,uptodate,lru,active,mmap
    0x000100000000086c 31 0 __RU_lA____M_________________I___ referenced,uptodate,lru,active,mmap,readahead
    0x0000000000004878 2 0 ___UDlA____M__b__________________ uptodate,dirty,lru,active,mmap,swapbacked
    0x0000000000001000 492 1 ____________a____________________ anonymous
    0x0000000000005808 4 0 ___U_______Ma_b__________________ uptodate,mmap,anonymous,swapbacked
    0x0000000000005868 2839 11 ___U_lA____Ma_b__________________ uptodate,lru,active,mmap,anonymous,swapbacked
    0x000000000000586c 30 0 __RU_lA____Ma_b__________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
    total 513968 2007
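
    To show how the flags column relates to the long-symbolic-flags column
    in the listing above, here is a small decoding sketch. The bit positions
    are assumed to follow the order of the bit-names list in the help text;
    long_symbolic_flags is a hypothetical name, and bits above 20 (such as
    readahead) are ignored by this sketch.

```python
KPF_NAMES = [
    "locked", "error", "referenced", "uptodate", "dirty", "lru", "active",
    "slab", "writeback", "reclaim", "buddy", "mmap", "anonymous",
    "swapcache", "swapbacked", "compound_head", "compound_tail", "huge",
    "unevictable", "hwpoison", "nopage",
]


def long_symbolic_flags(flags):
    """Return the comma-separated names of the bits set in `flags`."""
    return ",".join(
        name for bit, name in enumerate(KPF_NAMES) if (flags >> bit) & 1
    )
```

    For example, 0x586c decodes to
    referenced,uptodate,lru,active,mmap,anonymous,swapbacked, matching the
    last row of the table.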

    # ./page-types -r
    flags page-count MB symbolic-flags long-symbolic-flags
    0x0000000000000000 468002 1828 _________________________________
    0x0000000100000000 19102 74 _____________________r___________ reserved
    0x0000000000008000 41 0 _______________H_________________ compound_head
    0x0000000000010000 188 0 ________________T________________ compound_tail
    0x0000000000008014 1 0 __R_D__________H_________________ referenced,dirty,compound_head
    0x0000000000010014 4 0 __R_D___________T________________ referenced,dirty,compound_tail
    0x0000000000000020 1 0 _____l___________________________ lru
    0x0000000800000024 34 0 __R__l__________________P________ referenced,lru,private
    0x0000000000000028 3794 14 ___U_l___________________________ uptodate,lru
    0x0001000000000028 46 0 ___U_l_______________________I___ uptodate,lru,readahead
    0x0000000400000028 44 0 ___U_l_________________d_________ uptodate,lru,mappedtodisk
    0x0001000400000028 2 0 ___U_l_________________d_____I___ uptodate,lru,mappedtodisk,readahead
    0x000000000000002c 6434 25 __RU_l___________________________ referenced,uptodate,lru
    0x000100000000002c 47 0 __RU_l_______________________I___ referenced,uptodate,lru,readahead
    0x000000040000002c 14 0 __RU_l_________________d_________ referenced,uptodate,lru,mappedtodisk
    0x000000080000002c 30 0 __RU_l__________________P________ referenced,uptodate,lru,private
    0x0000000800000040 8124 31 ______A_________________P________ active,private
    0x0000000000000040 219 0 ______A__________________________ active
    0x0000000800000060 1 0 _____lA_________________P________ lru,active,private
    0x0000000000000068 322 1 ___U_lA__________________________ uptodate,lru,active
    0x0001000000000068 12 0 ___U_lA______________________I___ uptodate,lru,active,readahead
    0x0000000400000068 13 0 ___U_lA________________d_________ uptodate,lru,active,mappedtodisk
    0x0000000800000068 12 0 ___U_lA_________________P________ uptodate,lru,active,private
    0x000000000000006c 977 3 __RU_lA__________________________ referenced,uptodate,lru,active
    0x000100000000006c 48 0 __RU_lA______________________I___ referenced,uptodate,lru,active,readahead
    0x000000040000006c 5 0 __RU_lA________________d_________ referenced,uptodate,lru,active,mappedtodisk
    0x000000080000006c 3 0 __RU_lA_________________P________ referenced,uptodate,lru,active,private
    0x0000000c0000006c 3 0 __RU_lA________________dP________ referenced,uptodate,lru,active,mappedtodisk,private
    0x0000000c00000068 1 0 ___U_lA________________dP________ uptodate,lru,active,mappedtodisk,private
    0x0000000000004078 1 0 ___UDlA_______b__________________ uptodate,dirty,lru,active,swapbacked
    0x000000000000407c 34 0 __RUDlA_______b__________________ referenced,uptodate,dirty,lru,active,swapbacked
    0x0000000000000400 538 2 __________B______________________ buddy
    0x0000000000000804 1 0 __R________M_____________________ referenced,mmap
    0x0000000000000828 1029 4 ___U_l_____M_____________________ uptodate,lru,mmap
    0x0001000000000828 43 0 ___U_l_____M_________________I___ uptodate,lru,mmap,readahead
    0x000000000000082c 382 1 __RU_l_____M_____________________ referenced,uptodate,lru,mmap
    0x000100000000082c 12 0 __RU_l_____M_________________I___ referenced,uptodate,lru,mmap,readahead
    0x0000000000000868 192 0 ___U_lA____M_____________________ uptodate,lru,active,mmap
    0x0001000000000868 12 0 ___U_lA____M_________________I___ uptodate,lru,active,mmap,readahead
    0x000000000000086c 800 3 __RU_lA____M_____________________ referenced,uptodate,lru,active,mmap
    0x000100000000086c 31 0 __RU_lA____M_________________I___ referenced,uptodate,lru,active,mmap,readahead
    0x0000000000004878 2 0 ___UDlA____M__b__________________ uptodate,dirty,lru,active,mmap,swapbacked
    0x0000000000001000 492 1 ____________a____________________ anonymous
    0x0000000000005008 2 0 ___U________a_b__________________ uptodate,anonymous,swapbacked
    0x0000000000005808 4 0 ___U_______Ma_b__________________ uptodate,mmap,anonymous,swapbacked
    0x000000000000580c 1 0 __RU_______Ma_b__________________ referenced,uptodate,mmap,anonymous,swapbacked
    0x0000000000005868 2839 11 ___U_lA____Ma_b__________________ uptodate,lru,active,mmap,anonymous,swapbacked
    0x000000000000586c 29 0 __RU_lA____Ma_b__________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
    total 513968 2007

    # ./page-types --raw --list --no-summary --bits reserved
    offset count flags
    0 15 _____________________r___________
    31 4 _____________________r___________
    159 97 _____________________r___________
    4096 2067 _____________________r___________
    6752 2390 _____________________r___________
    9355 3 _____________________r___________
    9728 14526 _____________________r___________
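
    The offset/count/flags listing above is a run-length view of the
    per-page flag words. One way such ranges could be produced, as a sketch
    only (flag_ranges is a hypothetical name, and page-types may well do
    this differently):

```python
from itertools import groupby


def flag_ranges(first_pfn, flags):
    """Collapse per-page flag words into (offset, count, flags) runs.

    `flags` holds the flag words of consecutive page frames, the first of
    which is page frame `first_pfn`.
    """
    ranges = []
    offset = first_pfn
    for word, run in groupby(flags):
        count = sum(1 for _ in run)
        ranges.append((offset, count, word))
        offset += count
    return ranges
```

    Filtering the result on the reserved bit (0x0000000100000000 in the raw
    listing) would then leave only ranges like the ones shown.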

    This patch:

    Introduce PageHuge(), which identifies huge/gigantic pages by their
    dedicated compound destructor functions.

    Also move prep_compound_gigantic_page() to hugetlb.c and make
    __free_pages_ok() non-static.

    Signed-off-by: Wu Fengguang
    Cc: KOSAKI Motohiro
    Cc: Andi Kleen
    Cc: Matt Mackall
    Cc: Alexey Dobriyan
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     

11 Mar, 2009

1 commit

  • Fix kpf_copy_bit(src,dst) to be kpf_copy_bit(dst,src) to match the
    actual call patterns, e.g. kpf_copy_bit(kflags, KPF_LOCKED, PG_locked).

    This misplacement of src/dst only affected the reporting of PG_writeback,
    PG_reclaim and PG_buddy. For the other flags the kernel and user-space
    bit positions coincide, so they were not affected.
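
    The effect of the swapped parameters can be sketched outside the kernel.
    The two functions below have the same body and differ only in the order
    in which the position parameters are declared; the bit positions used in
    the usage note are made up for illustration and are not the real PG_* or
    KPF_* values.

```python
def kpf_copy_bit(flags, dstpos, srcpos):
    """Return the bit at `srcpos` of `flags`, moved to position `dstpos`."""
    return ((flags >> srcpos) & 1) << dstpos


def kpf_copy_bit_swapped(flags, srcpos, dstpos):
    """Same body, but declared as (src, dst) while callers pass (dst, src)."""
    return ((flags >> srcpos) & 1) << dstpos
```

    With a hypothetical source bit 13 and destination bit 8,
    kpf_copy_bit(1 << 13, 8, 13) yields 1 << 8, while the swapped variant
    reads bit 8 instead and yields 0. When both positions are equal the two
    variants agree, which is why only flags with differing positions were
    misreported.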

    Signed-off-by: Wu Fengguang
    Reviewed-by: KOSAKI Motohiro
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     

25 Feb, 2009

1 commit


23 Oct, 2008

1 commit