Commit 541c237c0923f567c9c4cabb8a81635baadc713f

Authored by Pavel Emelyanov
Committed by Linus Torvalds
1 parent 0f8975ec4d

pagemap: prepare to reuse constant bits with page-shift

In order to reuse bits from pagemap entries gracefully, we leave the
entries as is, but on pagemap open we emit a warning in dmesg that bits
55-60 are about to change in a couple of releases.  Next, if a user
issues the soft-dirty clear command via the clear_refs file (it was
disabled before v3.9), we assume that they are aware of the new pagemap
format, note that fact, and report the bits in pagemap in the new manner.

The "migration strategy" looks like this then:

1. existing users are not affected -- they don't touch the soft-dirty
   feature, thus see the old bits in pagemap, but are warned and have time
   to fix themselves
2. those who use soft-dirty know about the new pagemap format
3. some time soon we get rid of any signs of page-shift in pagemap as well
   as this trick with clear-soft-dirty affecting the pagemap format.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
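
The opt-in step described in the commit message can be sketched from user space roughly as follows. This is a minimal sketch and not part of the commit: the helper name clear_soft_dirty is hypothetical, and the value "4" for the soft-dirty clear command is an assumption taken from Documentation/vm/soft-dirty.txt rather than from this diff. The path is a parameter so the helper can be pointed at any /proc/PID/clear_refs file.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue the soft-dirty clear command by writing
 * "4" to a clear_refs file (for example "/proc/self/clear_refs").
 * Assumption: "4" selects the soft-dirty clear, as described in
 * Documentation/vm/soft-dirty.txt.
 * Returns 0 on success, -1 on failure.
 */
static inline int clear_soft_dirty(const char *clear_refs_path)
{
	int fd = open(clear_refs_path, O_WRONLY);
	ssize_t written;

	if (fd < 0)
		return -1;

	written = write(fd, "4", 1);
	close(fd);

	return written == 1 ? 0 : -1;
}
```

Because soft_dirty_cleared is a single static flag in the patch, one successful clear_soft_dirty("/proc/self/clear_refs") call would switch the layout of bits 55-60 for every later pagemap read on that system, not only for the calling process.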

Showing 2 changed files with 36 additions and 2 deletions

Documentation/vm/pagemap.txt
@@ -15,7 +15,8 @@
     * Bits 0-54  page frame number (PFN) if present
     * Bits 0-4   swap type if swapped
     * Bits 5-54  swap offset if swapped
-    * Bits 55-60 page shift (page size = 1<<page shift)
+    * Bit  55    pte is soft-dirty (see Documentation/vm/soft-dirty.txt)
+    * Bits 56-60 zero
     * Bit  61    page is file-page or shared-anon
     * Bit  62    page swapped
     * Bit  63    page present
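
The two layouts documented in this hunk can be decoded in user space along these lines. This is a sketch only: the macro and function names are hypothetical user-space names, not kernel symbols, and the bit positions are the ones listed in pagemap.txt above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields shared by both layouts (bit numbers from pagemap.txt). */
#define PM_PFN_MASK	((UINT64_C(1) << 55) - 1)	/* bits 0-54 */
#define PM_FILE		(UINT64_C(1) << 61)
#define PM_SWAP		(UINT64_C(1) << 62)
#define PM_PRESENT	(UINT64_C(1) << 63)

/* Bits 55-60: page shift (old layout) or soft-dirty plus zeros (new). */
#define PM_PSHIFT_LSB	55
#define PM_PSHIFT_MASK	UINT64_C(0x3f)
#define PM_SOFT_DIRTY	(UINT64_C(1) << 55)

static inline bool pm_present(uint64_t entry)
{
	return (entry & PM_PRESENT) != 0;
}

static inline bool pm_swapped(uint64_t entry)
{
	return (entry & PM_SWAP) != 0;
}

/* Only meaningful when the page is present. */
static inline uint64_t pm_pfn(uint64_t entry)
{
	return entry & PM_PFN_MASK;
}

/* Only meaningful when the page is swapped: bits 0-4. */
static inline unsigned int pm_swap_type(uint64_t entry)
{
	return (unsigned int)(entry & 0x1f);
}

/* Only meaningful when the page is swapped: bits 5-54. */
static inline uint64_t pm_swap_offset(uint64_t entry)
{
	return (entry & PM_PFN_MASK) >> 5;
}

/* Old layout: bits 55-60 carry the page shift. */
static inline unsigned int pm_page_shift_old(uint64_t entry)
{
	return (unsigned int)((entry >> PM_PSHIFT_LSB) & PM_PSHIFT_MASK);
}

/* New layout: bit 55 is the soft-dirty flag. */
static inline bool pm_soft_dirty_new(uint64_t entry)
{
	return (entry & PM_SOFT_DIRTY) != 0;
}
```

The same 64-bit value reads differently under the two layouts: an old-layout entry with an odd page shift (for example 13) has bit 55 set, which a new-layout reader would take for soft-dirty. A reader therefore has to know which layout the kernel is reporting before interpreting bits 55-60.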
fs/proc/task_mmu.c
@@ -689,6 +689,23 @@
 	.release = seq_release_private,
 };
 
+/*
+ * We do not want to have constant page-shift bits sitting in
+ * pagemap entries and are about to reuse them some time soon.
+ *
+ * Here's the "migration strategy":
+ * 1. when the system boots these bits remain what they are,
+ *    but a warning about future change is printed in log;
+ * 2. once anyone clears soft-dirty bits via clear_refs file,
+ *    this flag is set to denote that the user is aware of the
+ *    new API and those page-shift bits change their meaning.
+ *    The respective warning is printed in dmesg;
+ * 3. In a couple of releases we will remove all the mentions
+ *    of page-shift in pagemap entries.
+ */
+
+static bool soft_dirty_cleared __read_mostly;
+
 enum clear_refs_types {
 	CLEAR_REFS_ALL = 1,
 	CLEAR_REFS_ANON,
@@ -778,6 +795,13 @@
 	type = (enum clear_refs_types)itype;
 	if (type < CLEAR_REFS_ALL || type >= CLEAR_REFS_LAST)
 		return -EINVAL;
+
+	if (type == CLEAR_REFS_SOFT_DIRTY) {
+		soft_dirty_cleared = true;
+		pr_warn_once("The pagemap bits 55-60 has changed their meaning! "
+				"See the linux/Documentation/vm/pagemap.txt for details.\n");
+	}
+
 	task = get_proc_task(file_inode(file));
 	if (!task)
 		return -ESRCH;
@@ -1091,7 +1115,7 @@
 	if (!count)
 		goto out_task;
 
-	pm.v2 = false;
+	pm.v2 = soft_dirty_cleared;
 	pm.len = PM_ENTRY_BYTES * (PAGEMAP_WALK_SIZE >> PAGE_SHIFT);
 	pm.buffer = kmalloc(pm.len, GFP_TEMPORARY);
 	ret = -ENOMEM;
 
@@ -1164,9 +1188,18 @@
 	return ret;
 }
 
+static int pagemap_open(struct inode *inode, struct file *file)
+{
+	pr_warn_once("Bits 55-60 of /proc/PID/pagemap entries are about "
+			"to stop being page-shift some time soon. See the "
+			"linux/Documentation/vm/pagemap.txt for details.\n");
+	return 0;
+}
+
 const struct file_operations proc_pagemap_operations = {
 	.llseek = mem_lseek, /* borrow this */
 	.read = pagemap_read,
+	.open = pagemap_open,
 };
 #endif /* CONFIG_PROC_PAGE_MONITOR */
 
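
For context, a user-space reader that would hit the new pagemap_open() path might fetch a single entry roughly like this. It is a sketch under stated assumptions: read_pagemap_entry is a hypothetical helper, the file is assumed to hold one 8-byte entry per virtual page in host byte order (matching PM_ENTRY_BYTES in the hunks above), and on 32-bit builds a 64-bit off_t (for example via _FILE_OFFSET_BITS=64) is assumed.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PM_ENTRY_BYTES 8	/* one 64-bit entry per virtual page */

/*
 * Hypothetical helper: read the pagemap entry that covers vaddr from an
 * already opened pagemap file descriptor.
 * Returns 0 and stores the raw 64-bit entry in *entry on success,
 * -1 on a failed or short read or an invalid page size.
 */
static inline int read_pagemap_entry(int pagemap_fd, uintptr_t vaddr,
				     long page_size, uint64_t *entry)
{
	off_t offset;
	ssize_t n;

	if (page_size <= 0)
		return -1;

	/* Entry index is the virtual page number. */
	offset = (off_t)(vaddr / (uintptr_t)page_size) * PM_ENTRY_BYTES;
	n = pread(pagemap_fd, entry, PM_ENTRY_BYTES, offset);

	return n == PM_ENTRY_BYTES ? 0 : -1;
}
```

A caller would typically open "/proc/self/pagemap" with O_RDONLY and pass sysconf(_SC_PAGESIZE) as page_size. With this patch, the first such open on a system prints the one-time warning from pagemap_open(), and how bits 55-60 of the returned entry should be read depends on whether a soft-dirty clear has been issued since boot.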