19 Dec, 2013
2 commits
-
There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.During that time, a CPU may continue writing to the page.
This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).The basic race looks like this:
CPU A CPU B CPU C
load TLB entry
make entry PTE/PMD_NUMA
fault on entry
read/write old page
start migrating page
change PTE/PMD to new page
read/write old page [*]
flush TLB
reload TLB from new entry
read/write new page
lose data[*] the old page may belong to a new user at this point!
The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.This should fix both NUMA migration and compaction.
[mgorman@suse.de: fix build]
Signed-off-by: Rik van Riel
Signed-off-by: Mel Gorman
Cc: Alex Thorlton
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
On x86, PMD entries are similar to _PAGE_PROTNONE protection and are
handled as NUMA hinting faults. The following two page table protection
bits are what defines them_PAGE_NUMA:set _PAGE_PRESENT:clear
A PMD is considered present if any of the _PAGE_PRESENT, _PAGE_PROTNONE,
_PAGE_PSE or _PAGE_NUMA bits are set. If pmdp_invalidate encounters a
pmd_numa, it clears the present bit leaving _PAGE_NUMA which will be
considered not present by the CPU but present by pmd_present. The
existing caller of pmdp_invalidate should handle it but it's an
inconsistent state for a PMD. This patch keeps the state consistent
when calling pmdp_invalidate.Signed-off-by: Mel Gorman
Reviewed-by: Rik van Riel
Cc: Alex Thorlton
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Nov, 2013
2 commits
-
Only trivial cases left. Let's convert them altogether.
Signed-off-by: Naoya Horiguchi
Signed-off-by: Kirill A. Shutemov
Tested-by: Alex Thorlton
Cc: Ingo Molnar
Cc: "Eric W . Biederman"
Cc: "Paul E . McKenney"
Cc: Al Viro
Cc: Andi Kleen
Cc: Andrea Arcangeli
Cc: Dave Hansen
Cc: Dave Jones
Cc: David Howells
Cc: Frederic Weisbecker
Cc: Johannes Weiner
Cc: Kees Cook
Cc: Mel Gorman
Cc: Michael Kerrisk
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Robin Holt
Cc: Sedat Dilek
Cc: Srikar Dronamraju
Cc: Thomas Gleixner
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently mm->pmd_huge_pte protected by page table lock. It will not
work with split lock. We have to have per-pmd pmd_huge_pte for proper
access serialization.For now, let's just introduce wrapper to access mm->pmd_huge_pte.
Signed-off-by: Kirill A. Shutemov
Tested-by: Alex Thorlton
Cc: Alex Thorlton
Cc: Ingo Molnar
Cc: Naoya Horiguchi
Cc: "Eric W . Biederman"
Cc: "Paul E . McKenney"
Cc: Al Viro
Cc: Andi Kleen
Cc: Andrea Arcangeli
Cc: Dave Hansen
Cc: Dave Jones
Cc: David Howells
Cc: Frederic Weisbecker
Cc: Johannes Weiner
Cc: Kees Cook
Cc: Mel Gorman
Cc: Michael Kerrisk
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Robin Holt
Cc: Sedat Dilek
Cc: Srikar Dronamraju
Cc: Thomas Gleixner
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Sep, 2013
1 commit
-
pgtable related functions are mostly in pgtable-generic.c.
So move remaining functions from memory.c to pgtable-generic.c.Signed-off-by: Joonsoo Kim
Cc: Johannes Weiner
Cc: Minchan Kim
Cc: Mel Gorman
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
20 Jun, 2013
1 commit
-
This will be later used by powerpc THP support. In powerpc we want to use
pgtable for storing the hash index values. So instead of adding them to
mm_context list, we would like to store them in the second half of pmdSigned-off-by: Aneesh Kumar K.V
Reviewed-by: Andrea Arcangeli
Reviewed-by: David Gibson
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Benjamin Herrenschmidt
11 Dec, 2012
2 commits
-
If ptep_clear_flush() is called to clear a page table entry that is
accessible anyway by the CPU, eg. a _PAGE_PROTNONE page table entry,
there is no need to flush the TLB on remote CPUs.Signed-off-by: Rik van Riel
Signed-off-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Link: http://lkml.kernel.org/n/tip-vm3rkzevahelwhejx5uwm8ex@git.kernel.org
Signed-off-by: Ingo Molnar -
The function ptep_set_access_flags is only ever used to upgrade
access permissions to a page. That means the only negative side
effect of not flushing remote TLBs is that other CPUs may incur
spurious page faults, if they happen to access the same address,
and still have a PTE with the old permissions cached in their
TLB.Having another CPU maybe incur a spurious page fault is faster
than always incurring the cost of a remote TLB flush, so replace
the remote TLB flush with a purely local one.This should be safe on every architecture that correctly
implements flush_tlb_fix_spurious_fault() to actually invalidate
the local TLB entry that caused a page fault, as well as on
architectures where the hardware invalidates TLB entries that
cause page faults.In the unlikely event that you are hitting what appears to be
an infinite loop of page faults, and 'git bisect' took you to
this changeset, your architecture needs to implement
flush_tlb_fix_spurious_fault to actually flush the TLB entry.Signed-off-by: Rik van Riel
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Peter Zijlstra
Cc: Michel Lespinasse
Cc: Ingo Molnar
09 Oct, 2012
2 commits
-
On s390, a valid page table entry must not be changed while it is attached
to any CPU. So instead of pmd_mknotpresent() and set_pmd_at(), an IDTE
operation would be necessary there. This patch introduces the
pmdp_invalidate() function, to allow architecture-specific
implementations.Signed-off-by: Gerald Schaefer
Cc: Andrea Arcangeli
Cc: Andi Kleen
Cc: Hugh Dickins
Cc: Hillf Danton
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The thp page table pre-allocation code currently assumes that pgtable_t is
of type "struct page *". This may not be true for all architectures, so
this patch removes that assumption by replacing the functions
prepare_pmd_huge_pte() and get_pmd_huge_pte() with two new functions that
can be defined architecture-specific.It also removes two VM_BUG_ON checks for page_count() and page_mapcount()
operating on a pgtable_t. Apart from the VM_BUG_ON removal, there will be
no functional change introduced by this patch.Signed-off-by: Gerald Schaefer
Cc: Andrea Arcangeli
Cc: Andi Kleen
Cc: Hugh Dickins
Cc: Hillf Danton
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 May, 2012
1 commit
-
The change adds some infrastructure for managing tile pmd's more generally,
using pte_pmd() and pmd_pte() methods to translate pmd values to and
from ptes, since on TILEPro a pmd is really just a nested structure
holding a pgd (aka pte). Several existing pmd methods are moved into
this framework, and a whole raft of additional pmd accessors are defined
that are used by the transparent hugepage framework.The tile PTE now has a "client2" bit. The bit is used to indicate a
transparent huge page is in the process of being split into subpages.This change also fixes a generic bug where the return value of the
generic pmdp_splitting_flush() was incorrect.Signed-off-by: Chris Metcalf
22 Mar, 2012
1 commit
-
These macros will be used in a later patch, where all usages are expected
to be optimized away without #ifdef CONFIG_TRANSPARENT_HUGEPAGE. But to
detect unexpected usages, we convert the existing BUG() to BUILD_BUG().[akpm@linux-foundation.org: fix build in mm/pgtable-generic.c]
Signed-off-by: Naoya Horiguchi
Acked-by: Hillf Danton
Reviewed-by: Andrea Arcangeli
Reviewed-by: KAMEZAWA Hiroyuki
Acked-by: David Rientjes
Cc: Daisuke Nishimura
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 Jan, 2011
1 commit
-
mips (and sparc32):
In file included from arch/mips/include/asm/tlb.h:21,
from mm/pgtable-generic.c:9:
include/asm-generic/tlb.h: In function `tlb_flush_mmu':
include/asm-generic/tlb.h:76: error: implicit declaration of function `release_pages'
include/asm-generic/tlb.h: In function `tlb_remove_page':
include/asm-generic/tlb.h:105: error: implicit declaration of function `page_cache_release'free_pages_and_swap_cache() and free_page_and_swap_cache() are macros
which call release_pages() and page_cache_release(). The obvious fix is
to include pagemap.h in swap.h, where those macros are defined. But that
breaks sparc for weird reasons.So fix it within mm/pgtable-generic.c instead.
Reported-by: Yoichi Yuasa
Cc: Geert Uytterhoeven
Acked-by: Sam Ravnborg
Cc: Sergei Shtylyov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
17 Jan, 2011
1 commit
-
pmdp_get_and_clear/pmdp_clear_flush/pmdp_splitting_flush were trapped as
BUG() and they were defined only to diminish the risk of build issues on
not-x86 archs and to be consistent with the generic pte methods previously
defined in include/asm-generic/pgtable.h.But they are causing more trouble than they were supposed to solve, so
it's simpler not to define them when THP is off.This is also correcting the export of pmdp_splitting_flush which is
currently unused (x86 isn't using the generic implementation in
mm/pgtable-generic.c and no other arch needs that [yet]).Signed-off-by: Andrea Arcangeli
Sam Ravnborg
Cc: Stephen Rothwell
Cc: "David S. Miller"
Cc: Benjamin Herrenschmidt
Cc: "Luck, Tony"
Cc: James Bottomley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Jan, 2011
1 commit
-
Some are needed to build but not actually used on archs not supporting
transparent hugepages. Others like pmdp_clear_flush are used by x86 too.Signed-off-by: Andrea Arcangeli
Acked-by: Rik van Riel
Acked-by: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds