10 Jun, 2020
1 commit
-
Convert comments that reference mmap_sem to reference mmap_lock instead.
[akpm@linux-foundation.org: fix up linux-next leftovers]
[akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
[akpm@linux-foundation.org: more linux-next fixups, per Michel]
Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Vlastimil Babka
Reviewed-by: Daniel Jordan
Cc: Davidlohr Bueso
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Laurent Dufour
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
Signed-off-by: Linus Torvalds
04 Feb, 2020
5 commits
-
As described in the comment, the correct order for freeing pages is:

1) unhook page
2) TLB invalidate page
3) free page

This order equally applies to page directories.

Currently there are two correct options:

- use tlb_remove_page(), when all page directories are full pages and
  there are no further constraints placed by things like software
  walkers (HAVE_FAST_GUP).

- use MMU_GATHER_RCU_TABLE_FREE and tlb_remove_table() when the
  architecture does not do IPI-based TLB invalidation and has
  HAVE_FAST_GUP (or software TLB fill).

This, however, leaves architectures that don't have page-based directories
but don't need RCU in a bind. For those, provide MMU_GATHER_TABLE_FREE,
which provides the independent batching for directories without the
additional RCU freeing.

Link: http://lkml.kernel.org/r/20200116064531.483522-10-aneesh.kumar@linux.ibm.com
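As a rough illustration, an architecture hook feeding a directory page
into this machinery could look like the sketch below (the
arch_pmd_free_tlb() name is hypothetical; tlb_remove_table() is the
interface named above):

	/*
	 * Hypothetical arch hook: the caller has already unhooked the
	 * pmd page (step 1). tlb_remove_table() batches the page so the
	 * actual free (step 3) happens only after the TLB invalidate
	 * (step 2) -- and, under MMU_GATHER_RCU_TABLE_FREE, only after
	 * software walkers can no longer observe it either.
	 */
	static inline void arch_pmd_free_tlb(struct mmu_gather *tlb,
					     pmd_t *pmd, unsigned long addr)
	{
		tlb_remove_table(tlb, virt_to_page(pmd));
	}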
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Aneesh Kumar K.V
Cc: Michael Ellerman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Towards a more consistent naming scheme.
Link: http://lkml.kernel.org/r/20200116064531.483522-9-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Aneesh Kumar K.V
Cc: Michael Ellerman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Towards a more consistent naming scheme.
Link: http://lkml.kernel.org/r/20200116064531.483522-8-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Aneesh Kumar K.V
Cc: Michael Ellerman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Towards a more consistent naming scheme.
[akpm@linux-foundation.org: fix sparc64 Kconfig]
Link: http://lkml.kernel.org/r/20200116064531.483522-7-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Aneesh Kumar K.V
Cc: Michael Ellerman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Architectures for which we have hardware walkers of the Linux page table
should flush the TLB on mmu_gather batch allocation failures and batch
flush.

Some architectures, like POWER, support multiple translation modes (hash
and radix), and in the case of POWER only the radix translation mode needs
the above TLBI. This is because for hash translation mode the kernel wants
to avoid this extra flush, since there are no hardware walkers of the
Linux page table. With radix translation, the hardware also walks the
Linux page table, and with that, the kernel needs to make sure to TLB
invalidate the page-walk cache before page-table pages are freed.

More details in commit d86564a2f085 ("mm/tlb, x86/mm: Support invalidating
TLB caches for RCU_TABLE_FREE")

The changes to sparc are to make sure we keep the old behavior, since we
are now removing HAVE_RCU_TABLE_NO_INVALIDATE. The default for
tlb_needs_table_invalidate is to always force an invalidate, and sparc can
avoid the table invalidate. Hence we define tlb_needs_table_invalidate to
false for the sparc architecture.

Link: http://lkml.kernel.org/r/20200116064531.483522-3-aneesh.kumar@linux.ibm.com
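The override point this adds has roughly the following shape (a sketch
based on the description above; sparc supplies its own definition
returning false):

	#ifndef tlb_needs_table_invalidate
	/* Default: assume hardware may walk the Linux page tables and
	 * force a page-walk-cache invalidate before a table is freed. */
	#define tlb_needs_table_invalidate() (true)
	#endif

	static inline void tlb_table_invalidate(struct mmu_gather *tlb)
	{
		if (tlb_needs_table_invalidate())
			tlb_flush_mmu_tlbonly(tlb);
	}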
Fixes: a46cc7a90fd8 ("powerpc/mm/radix: Improve TLB/PWC flushes")
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Michael Ellerman [powerpc]
Cc: [4.14+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
25 Sep, 2019
1 commit
-
Patch series "mm: remove quicklist page table caches".

A while ago Nicholas proposed to remove quicklist page table caches [1].

I've rebased his patch on the current upstream and switched ia64 and sh to
use generic versions of PTE allocation.

[1] https://lore.kernel.org/linux-mm/20190711030339.20892-1-npiggin@gmail.com

This patch (of 3):

Remove the page table allocator "quicklists". These have been around for
a long time, but have not gained much traction in the last decade and are
only used on the ia64 and sh architectures.

The numbers in the initial commit look interesting but probably don't
apply anymore. If anybody wants to resurrect this it's in the git
history, but it's unhelpful to have this code and divergent allocator
behaviour for minor archs.

Also, it might be better to instead make more general improvements to the
page allocator if this is still so slow.

Link: http://lkml.kernel.org/r/1565250728-21721-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Nicholas Piggin
Signed-off-by: Mike Rapoport
Cc: Tony Luck
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Jun, 2019
1 commit
-
A few new fields were added to mmu_gather to make TLB flushing smarter for
huge pages, by telling what level of the page table was changed.

__tlb_reset_range() is used to reset all this page-table state to
unchanged; it is called by the TLB flush for parallel mapping changes to
the same range under a non-exclusive lock (i.e. read mmap_sem).

Before commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in
munmap"), the syscalls (e.g. MADV_DONTNEED, MADV_FREE) which may update
PTEs in parallel didn't remove page tables. But the aforementioned commit
may do munmap() under read mmap_sem and free page tables. This can result
in a program hang on aarch64, as reported by Jan Stancek. The problem can
be reproduced with his test program, slightly modified below.
below.---8< num_iter; i++) {
map_address = mmap(distant_area, (size_t) map_size, PROT_WRITE | PROT_READ,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (map_address == MAP_FAILED) {
perror("mmap");
exit(1);
}for (j = 0; j < map_size; j++)
map_address[j] = 'b';if (munmap(map_address, map_size) == -1) {
perror("munmap");
exit(1);
}
}return NULL;
}void *dummy(void *ptr)
{
return NULL;
}int main(void)
{
pthread_t thid[2];/* hint for mmap in map_write_unmap() */
distant_area = mmap(0, DISTANT_MMAP_SIZE, PROT_WRITE | PROT_READ,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
munmap(distant_area, (size_t)DISTANT_MMAP_SIZE);
distant_area += DISTANT_MMAP_SIZE / 2;while (1) {
pthread_create(&thid[0], NULL, map_write_unmap, NULL);
pthread_create(&thid[1], NULL, dummy, NULL);pthread_join(thid[0], NULL);
pthread_join(thid[1], NULL);
}
}
---8mmap_sem);
  unmap_region()
  tlb_gather_mmu()
    inc_tlb_flush_pending(tlb->mm);
  free_pgtables()
    tlb->freed_tables = 1
    tlb->cleared_pmds = 1

                                        pthread_exit()
                                        madvise(thread_stack, 8M, MADV_DONTNEED)
                                          zap_page_range()
                                            tlb_gather_mmu()
                                              inc_tlb_flush_pending(tlb->mm);

  tlb_finish_mmu()
    if (mm_tlb_flush_nested(tlb->mm))
      __tlb_reset_range()

__tlb_reset_range() would reset the freed_tables and cleared_* bits, but
this may cause inconsistency for munmap(), which does free page tables.
Some architectures, e.g. aarch64, may then not flush the TLB completely
as expected, leaving stale TLB entries behind.

Use a fullmm flush instead, since it yields much better performance on
aarch64, and non-fullmm doesn't yield a significant difference on x86.

The original proposed fix came from Jan Stancek, who mainly debugged this
issue; I just wrapped everything up together.

Jan's testing results:
v5.2-rc2-24-gbec7550cca10
-------------------------
        mean     stddev
real   37.382    2.780
user    1.420    0.078
sys    54.658    1.855

v5.2-rc2-24-gbec7550cca10 + "mm: mmu_gather: remove __tlb_reset_range() for force flush"
----------------------------------------------------------------------------------------
        mean     stddev
real   37.119    2.105
user    1.548    0.087
sys    55.698    1.357

[akpm@linux-foundation.org: coding-style fixes]
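A rough sketch of the resulting logic in tlb_finish_mmu() (comments added
here; not the verbatim patch):

	if (mm_tlb_flush_nested(tlb->mm)) {
		/*
		 * A nested flush cannot trust the accumulated
		 * freed_tables/cleared_* state, so force a
		 * conservative fullmm flush instead of merely
		 * resetting the range state.
		 */
		tlb->fullmm = 1;
		__tlb_reset_range(tlb);
		tlb->freed_tables = 1;
	}

	tlb_flush_mmu(tlb);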
Link: http://lkml.kernel.org/r/1558322252-113575-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: Yang Shi
Signed-off-by: Jan Stancek
Reported-by: Jan Stancek
Tested-by: Jan Stancek
Suggested-by: Will Deacon
Tested-by: Will Deacon
Acked-by: Will Deacon
Cc: Peter Zijlstra
Cc: Nick Piggin
Cc: "Aneesh Kumar K.V"
Cc: Nadav Amit
Cc: Minchan Kim
Cc: Mel Gorman
Cc: [4.20+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
03 Apr, 2019
7 commits
-
There are no external users of this API (nor should there be); remove it.
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
-
As the comment notes, it is a potentially dangerous operation. Just
use tlb_flush_mmu(); that will skip the (double) TLB invalidate if
it really isn't needed anyway.

No change in behavior intended.
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
-
Since all architectures are now using it, it is redundant.
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
-
Now that all architectures are converted to the generic code, remove
the arch hooks.

No change in behavior intended.
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
-
Add the Kconfig option HAVE_MMU_GATHER_NO_GATHER to the generic
mmu_gather code. If the option is set, the mmu_gather will no longer
track individual pages for delayed page freeing. A platform that
enables the option needs to provide its own implementation of the
__tlb_remove_page_size() function to free pages.

No change in behavior intended.
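For reference, a platform override can be as simple as freeing the page
immediately, along these lines (a minimal sketch, not any architecture's
exact code):

	static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
						  struct page *page, int page_size)
	{
		/* No gathering: this platform's TLB flushing scheme
		 * allows the page to be freed right away. */
		free_page_and_swap_cache(page);
		return false;	/* batch never fills, no early flush */
	}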
Signed-off-by: Martin Schwidefsky
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Thomas Gleixner
Cc: aneesh.kumar@linux.vnet.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: linux@armlinux.org.uk
Cc: npiggin@gmail.com
Link: http://lkml.kernel.org/r/20180918125151.31744-2-schwidefsky@de.ibm.com
Signed-off-by: Ingo Molnar
-
Make issuing a TLB invalidate for page-table pages the normal case.

The reason is twofold:

- too many invalidates is safer than too few,
- most architectures use the Linux page tables natively
  and would thus require this.

Make it an opt-out, instead of an opt-in.
No change in behavior intended.
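In Kconfig terms the polarity inverts roughly as below (a sketch; the
selecting architecture shown is only an example of the opt-out):

	config HAVE_RCU_TABLE_NO_INVALIDATE
		bool

	# An architecture whose hardware never walks the Linux page
	# tables can now opt out explicitly:
	config SPARC64
		select HAVE_RCU_TABLE_NO_INVALIDATE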
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
-
Move the mmu_gather::page_size things into the generic code instead of
the PowerPC-specific bits.

No change in behavior intended.
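The generic helper this enables looks roughly like the following (a
sketch modelled on later kernels, where it is called
tlb_change_page_size(); treat the name and details as illustrative):

	#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
	static inline void tlb_change_page_size(struct mmu_gather *tlb,
						unsigned int page_size)
	{
		/* A mixed-size batch cannot be expressed, so flush
		 * before switching page sizes mid-gather. */
		if (tlb->page_size && tlb->page_size != page_size)
			tlb_flush_mmu(tlb);
		tlb->page_size = page_size;
	}
	#endif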
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Aneesh Kumar K.V
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Nick Piggin
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
28 Nov, 2018
1 commit
-
Now that call_rcu()'s callback is not invoked until after all
preempt-disable regions of code have completed (in addition to explicitly
marked RCU read-side critical sections), call_rcu() can be used in place
of call_rcu_sched(). This commit therefore makes that change.

Signed-off-by: Paul E. McKenney
07 Sep, 2018
1 commit
-
In preparation for maintaining the mmu_gather code as its own entity,
move the implementation out of memory.c and into its own file.

Cc: "Kirill A. Shutemov"
Cc: Andrew Morton
Cc: Michal Hocko
Signed-off-by: Peter Zijlstra
Signed-off-by: Will Deacon