Documentation/vm/pagemap.txt 6.81 KB
  pagemap, from the userspace perspective
  ---------------------------------------
  
  pagemap is a new (as of 2.6.25) set of interfaces in the kernel that allow
  userspace programs to examine the page tables and related information by
  reading files in /proc.

  There are four components to pagemap:
  
   * /proc/pid/pagemap.  This file lets a userspace process find out which
     physical frame each virtual page is mapped to.  It contains one 64-bit
     value for each virtual page, containing the following data (from
     fs/proc/task_mmu.c, above pagemap_read):
      * Bits 0-54  page frame number (PFN) if present
      * Bits 0-4   swap type if swapped
      * Bits 5-54  swap offset if swapped
      * Bit  55    pte is soft-dirty (see Documentation/vm/soft-dirty.txt)
      * Bit  56    page exclusively mapped (since 4.2)
      * Bits 57-60 zero
      * Bit  61    page is file-page or shared-anon (since 3.5)
      * Bit  62    page swapped
      * Bit  63    page present

     Since Linux 4.0 only users with the CAP_SYS_ADMIN capability can get PFNs.
     In 4.0 and 4.1 opens by unprivileged users fail with -EPERM.  Starting from
     4.2 the PFN field is zeroed if the user does not have CAP_SYS_ADMIN.
     Reason: information about PFNs helps in exploiting the Rowhammer
     vulnerability.

     If the page is not present but in swap, then the PFN contains an
     encoding of the swap file number and the page's offset into the
     swap. Unmapped pages return a null PFN. This allows determining
     precisely which pages are mapped (or in swap) and comparing mapped
     pages between processes.
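
     As an illustration of the bit layout above, the following Python
     sketch decodes a single 64-bit entry.  The names PagemapEntry and
     decode_pagemap_entry are invented for this example, and the layout
     assumed is the one listed above for Linux 4.2 and later.

```python
from typing import NamedTuple, Optional

LOW_FIELD_MASK = (1 << 55) - 1  # bits 0-54: PFN, or swap type + offset


class PagemapEntry(NamedTuple):
    """Decoded view of one pagemap entry (hypothetical helper type)."""
    present: bool
    swapped: bool
    file_or_shared_anon: bool
    exclusive: bool
    soft_dirty: bool
    pfn: Optional[int]          # only meaningful if present
    swap_type: Optional[int]    # only meaningful if swapped
    swap_offset: Optional[int]  # only meaningful if swapped


def decode_pagemap_entry(entry: int) -> PagemapEntry:
    present = bool((entry >> 63) & 1)
    swapped = bool((entry >> 62) & 1)
    low = entry & LOW_FIELD_MASK
    return PagemapEntry(
        present=present,
        swapped=swapped,
        file_or_shared_anon=bool((entry >> 61) & 1),
        exclusive=bool((entry >> 56) & 1),
        soft_dirty=bool((entry >> 55) & 1),
        # Without CAP_SYS_ADMIN the kernel reports the PFN as 0 (4.2+).
        pfn=low if present else None,
        swap_type=(low & 0x1F) if swapped else None,   # bits 0-4
        swap_offset=(low >> 5) if swapped else None,   # bits 5-54
    )
```

     A present page whose pfn decodes to 0 most likely means the reader
     lacks CAP_SYS_ADMIN rather than that the page lives in frame 0.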
  
     Efficient users of this interface will use /proc/pid/maps to
     determine which areas of memory are actually mapped and llseek to
     skip over unmapped regions.
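
     A minimal sketch of that seek arithmetic, assuming the usual
     "start-end perms ..." line format of /proc/pid/maps and taking the
     page size from Python's mmap.PAGESIZE.  The helper names
     parse_maps_line and pagemap_span are illustrative only.

```python
import mmap

ENTRY_SIZE = 8  # one 64-bit value per virtual page


def parse_maps_line(line):
    """Return (start, end) virtual addresses from one /proc/pid/maps line."""
    addr_range = line.split()[0]
    start, end = (int(part, 16) for part in addr_range.split("-"))
    return start, end


def pagemap_span(start, end, page_size=mmap.PAGESIZE):
    """Return (file offset, byte length) in pagemap covering [start, end)."""
    first_page = start // page_size
    n_pages = (end - start) // page_size
    return first_page * ENTRY_SIZE, n_pages * ENTRY_SIZE
```

     A reader would seek to the returned offset and read the returned
     number of bytes for each mapping of interest, skipping everything
     in between.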
  
   * /proc/kpagecount.  This file contains a 64-bit count of the number of
     times each page is mapped, indexed by PFN.
  
   * /proc/kpageflags.  This file contains a 64-bit set of flags for each
     page, indexed by PFN.
     The flags are (from fs/proc/page.c, above kpageflags_read):
  
       0. LOCKED
       1. ERROR
       2. REFERENCED
       3. UPTODATE
       4. DIRTY
       5. LRU
       6. ACTIVE
       7. SLAB
       8. WRITEBACK
       9. RECLAIM
      10. BUDDY
      11. MMAP
      12. ANON
      13. SWAPCACHE
      14. SWAPBACKED
      15. COMPOUND_HEAD
      16. COMPOUND_TAIL
      17. HUGE
      18. UNEVICTABLE
      19. HWPOISON
      20. NOPAGE
      21. KSM
      22. THP
      23. BALLOON
      24. ZERO_PAGE
      25. IDLE

   * /proc/kpagecgroup.  This file contains a 64-bit inode number of the
     memory cgroup each page is charged to, indexed by PFN. Only available when
     CONFIG_MEMCG is set.

  Short descriptions of the page flags:

   0. LOCKED
      page is being locked for exclusive access, e.g. by undergoing read/write IO

   7. SLAB
      page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
      When a compound page is used, SLUB/SLQB will only set this flag on the head
      page; SLOB will not flag it at all.
  
  10. BUDDY
      a free memory block managed by the buddy system allocator
      The buddy system organizes free memory in blocks of various orders.
      An order N block has 2^N physically contiguous pages, with the BUDDY flag
      set for and _only_ for the first page.
  
  15. COMPOUND_HEAD
  16. COMPOUND_TAIL
      A compound page with order N consists of 2^N physically contiguous pages.
      A compound page with order 2 takes the form of "HTTT", where H denotes its
      head page and T denotes its tail page(s).  The major consumers of compound
      pages are hugeTLB pages (Documentation/vm/hugetlbpage.txt), memory
      allocators such as SLUB, and various device drivers. However in this
      interface, only huge/giga pages are made visible to end users.

  17. HUGE
      this is an integral part of a HugeTLB page

  19. HWPOISON
      hardware detected memory corruption on this page: don't touch the data!

  20. NOPAGE
      no page frame exists at the requested address

  21. KSM
      identical memory pages dynamically shared between one or more processes

  22. THP
      contiguous pages which construct transparent hugepages

  23. BALLOON
      balloon compaction page

  24. ZERO_PAGE
      zero page for pfn_zero or huge_zero page

  25. IDLE
      page has not been accessed since it was marked idle (see
      Documentation/vm/idle_page_tracking.txt). Note that this flag may be
      stale in case the page was accessed via a PTE. To make sure the flag
      is up-to-date one has to read /sys/kernel/mm/page_idle/bitmap first.

      [IO related page flags]
   1. ERROR     IO error occurred
   3. UPTODATE  page has up-to-date data
                i.e. for file backed page: (in-memory data revision >= on-disk one)
   4. DIRTY     page has been written to, hence contains new data
                i.e. for file backed page: (in-memory data revision >  on-disk one)
   8. WRITEBACK page is being synced to disk
  
      [LRU related page flags]
   5. LRU         page is in one of the LRU lists
   6. ACTIVE      page is in the active LRU list
  18. UNEVICTABLE page is in the unevictable (non-)LRU list
                  It is somehow pinned and not a candidate for LRU page reclaims,
                  e.g. ramfs pages, shmctl(SHM_LOCK) and mlock() memory segments
   2. REFERENCED  page has been referenced since last LRU list enqueue/requeue
   9. RECLAIM     page will be reclaimed soon after its pageout IO completed
  11. MMAP        a memory mapped page
  12. ANON        a memory mapped page that is not part of a file
  13. SWAPCACHE   page is mapped to swap space, i.e. has an associated swap entry
  14. SWAPBACKED  page is backed by swap/RAM

  The page-types tool in the tools/vm directory can be used to query the
  above flags.
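
  For quick experiments without page-types, a 64-bit value read from
  /proc/kpageflags can be turned into flag names with a few lines of
  Python.  This is only a sketch: the table simply mirrors the bit numbers
  listed in this document, and decode_kpageflags is a made-up helper name.

```python
# Index in this list == bit number in the kpageflags value.
KPF_NAMES = [
    "LOCKED", "ERROR", "REFERENCED", "UPTODATE", "DIRTY", "LRU", "ACTIVE",
    "SLAB", "WRITEBACK", "RECLAIM", "BUDDY", "MMAP", "ANON", "SWAPCACHE",
    "SWAPBACKED", "COMPOUND_HEAD", "COMPOUND_TAIL", "HUGE", "UNEVICTABLE",
    "HWPOISON", "NOPAGE", "KSM", "THP", "BALLOON", "ZERO_PAGE", "IDLE",
]


def decode_kpageflags(flags):
    """Return the names of all documented flags set in a kpageflags value."""
    return [name for bit, name in enumerate(KPF_NAMES) if (flags >> bit) & 1]
```

  Bits above 25 are ignored by this helper, so flags added by later
  kernels would need to be appended to the table.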
  
  Using pagemap to do something useful:
  
  The general procedure for using pagemap to find out about a process' memory
  usage goes like this:
  
   1. Read /proc/pid/maps to determine which parts of the memory space are
      mapped to what.
   2. Select the maps you are interested in -- all of them, or a particular
      library, or the stack or the heap, etc.
   3. Open /proc/pid/pagemap and seek to the pages you would like to examine.
   4. Read a u64 for each page from pagemap.
   5. Open /proc/kpagecount and/or /proc/kpageflags.  For each PFN you just
      read, seek to that entry in the file, and read the data you want.
  
  For example, to find the "unique set size" (USS), which is the amount of
  memory that a process is using that is not shared with any other process,
  you can go through every map in the process, find the PFNs, look those up
  in kpagecount, and tally up the number of pages that are only referenced
  once.
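
  The USS calculation can be sketched as follows.  To keep the example
  self-contained it takes the pagemap entries as an iterable of integers
  and the kpagecount lookup as a callable; in a real tool, mapcount_of
  would seek to PFN * 8 in /proc/kpagecount and read one 64-bit value,
  which normally requires root.  The function and parameter names are
  assumptions made for this illustration.

```python
PRESENT_BIT = 1 << 63
PFN_MASK = (1 << 55) - 1  # bits 0-54


def unique_set_size(pagemap_entries, mapcount_of, page_size=4096):
    """Return bytes of present pages whose kpagecount value is exactly 1."""
    unique_pages = 0
    for entry in pagemap_entries:
        if not entry & PRESENT_BIT:
            continue  # swapped or unmapped: not resident
        pfn = entry & PFN_MASK
        if pfn == 0:
            continue  # PFN hidden (no CAP_SYS_ADMIN on 4.2+)
        if mapcount_of(pfn) == 1:
            unique_pages += 1
    return unique_pages * page_size
```

  On Linux 4.2 and later, bit 56 of the pagemap entry ("page exclusively
  mapped") offers a way to get similar information without reading
  kpagecount at all.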
  
  Other notes:
  
  Reading from any of the files will return -EINVAL if you are not starting
  the read on an 8-byte boundary (e.g., if you sought an odd number of bytes
  into the file), or if the size of the read is not a multiple of 8 bytes.
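
  One way to stay within these rules is to express every access in
  entries rather than bytes, so that both the offset and the length are
  multiples of 8 by construction.  A minimal sketch, with read_entries as
  a hypothetical helper that works on any binary file object:

```python
import struct

ENTRY_SIZE = 8


def read_entries(f, first_index, count):
    """Read up to `count` 64-bit entries starting at entry `first_index`."""
    f.seek(first_index * ENTRY_SIZE)       # always 8-byte aligned
    data = f.read(count * ENTRY_SIZE)      # always a multiple of 8 bytes
    usable = len(data) - len(data) % ENTRY_SIZE
    # "=Q": native byte order, standard 8-byte unsigned integer.
    return list(struct.unpack("=%dQ" % (usable // ENTRY_SIZE), data[:usable]))
```

  For the real /proc files the file object should be opened in binary
  mode, for example open("/proc/self/pagemap", "rb").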
  
  Before Linux 3.11 pagemap bits 55-60 were used for "page-shift" (which is
  always 12 on most architectures). Since Linux 3.11 their meaning changes
  after the first clear of soft-dirty bits. Since Linux 4.2 they are used for
  flags unconditionally.
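
  For tools that still need to handle the pre-3.11 layout, the page-shift
  field could be extracted as in the sketch below.  The helper names are
  invented for this example, and the result is only meaningful on kernels
  where bits 55-60 still carry the page shift.

```python
def legacy_page_shift(entry):
    """Bits 55-60 of a pre-3.11 pagemap entry: log2 of the page size."""
    return (entry >> 55) & 0x3F


def legacy_page_size(entry):
    """Page size in bytes implied by the legacy page-shift field."""
    return 1 << legacy_page_shift(entry)
```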