16 Nov, 2020
1 commit
-
Although cache alias management calls set up and tear down TLB entries
and fast_second_level_miss is able to restore TLB entry should it be
evicted they absolutely cannot preempt each other because they use the
same TLBTEMP area for different purposes.
Disable preemption around all cache alias management calls to enforce
that.Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov
04 Nov, 2020
1 commit
-
free_highpages() iterates over the free memblock regions in high
memory, and marks each page as available for the memory management
system.Until commit cddb5ddf2b76 ("arm, xtensa: simplify initialization of
high memory pages") it rounded beginning of each region upwards and end of
each region downwards.However, after that commit free_highmem() rounds the beginning and end of
each region downwards, and we may end up freeing a page that is
memblock_reserve()d, resulting in memory corruption.Restore the original rounding of the region boundaries to avoid freeing
reserved pages.Fixes: cddb5ddf2b76 ("arm, xtensa: simplify initialization of high memory pages")
Link: https://lore.kernel.org/r/20201029110334.4118-1-ardb@kernel.org/
Link: https://lore.kernel.org/r/20201031094345.6984-1-rppt@kernel.org
Signed-off-by: Ard Biesheuvel
Co-developed-by: Mike Rapoport
Signed-off-by: Mike Rapoport
Acked-by: Max Filippov
16 Oct, 2020
1 commit
-
Pull dma-mapping updates from Christoph Hellwig:
- rework the non-coherent DMA allocator
- move private definitions out of
- lower CMA_ALIGNMENT (Paul Cercueil)
- remove the omap1 dma address translation in favor of the common code
- make dma-direct aware of multiple dma offset ranges (Jim Quinlan)
- support per-node DMA CMA areas (Barry Song)
- increase the default seg boundary limit (Nicolin Chen)
- misc fixes (Robin Murphy, Thomas Tai, Xu Wang)
- various cleanups
* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
ARM/ixp4xx: add a missing include of dma-map-ops.h
dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
dma-direct: factor out a dma_direct_alloc_from_pool helper
dma-direct check for highmem pages in dma_direct_alloc_pages
dma-mapping: merge into
dma-mapping: move large parts of to kernel/dma
dma-mapping: move dma-debug.h to kernel/dma/
dma-mapping: remove
dma-mapping: merge into
dma-contiguous: remove dma_contiguous_set_default
dma-contiguous: remove dev_set_cma_area
dma-contiguous: remove dma_declare_contiguous
dma-mapping: split
cma: decrease CMA_ALIGNMENT lower limit to 2
firewire-ohci: use dma_alloc_pages
dma-iommu: implement ->alloc_noncoherent
dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
dma-mapping: add a new dma_alloc_pages API
dma-mapping: remove dma_cache_sync
53c700: convert to dma_alloc_noncoherent
...
14 Oct, 2020
1 commit
-
free_highpages() in both arm and xtensa essentially open-code
for_each_free_mem_range() loop to detect high memory pages that were not
reserved and that should be initialized and passed to the buddy allocator.Replace open-coded implementation of for_each_free_mem_range() with usage
of memblock API to simplify the code.Signed-off-by: Mike Rapoport
Signed-off-by: Andrew Morton
Tested-by: Max Filippov [xtensa]
Reviewed-by: Max Filippov [xtensa]
Cc: Andy Lutomirski
Cc: Baoquan He
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Catalin Marinas
Cc: Christoph Hellwig
Cc: Daniel Axtens
Cc: Dave Hansen
Cc: Emil Renner Berthing
Cc: Hari Bathini
Cc: Ingo Molnar
Cc: Ingo Molnar
Cc: Jonathan Cameron
Cc: Marek Szyprowski
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Miguel Ojeda
Cc: Palmer Dabbelt
Cc: Paul Mackerras
Cc: Paul Walmsley
Cc: Peter Zijlstra
Cc: Russell King
Cc: Stafford Horne
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yoshinori Sato
Link: https://lkml.kernel.org/r/20200818151634.14343-4-rppt@kernel.org
Signed-off-by: Linus Torvalds
06 Oct, 2020
1 commit
-
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.Signed-off-by: Christoph Hellwig
13 Aug, 2020
2 commits
-
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.Remove the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf events because it's
now also done in handle_mm_fault().Move the PERF_COUNT_SW_PAGE_FAULTS event higher before taking mmap_sem for
the fault, then it'll match with the rest of the archs.Signed-off-by: Peter Xu
Signed-off-by: Andrew Morton
Acked-by: Max Filippov
Cc: Chris Zankel
Link: http://lkml.kernel.org/r/20200707225021.200906-24-peterx@redhat.com
Signed-off-by: Linus Torvalds -
Patch series "mm: Page fault accounting cleanups", v5.
This is v5 of the pf accounting cleanup series. It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b9827063 ("mm: allow
VM_FAULT_RETRY for multiple times"):https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/
What this series did:
- Correct page fault accounting: we do accounting for a page fault
(no matter whether it's from #PF handling, or gup, or anything else)
only with the one that completed the fault. For example, page fault
retries should not be counted in page fault counters. Same to the
perf events.- Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
event is used in an adhoc way across different archs.Case (1): for many archs it's done at the entry of a page fault
handler, so that it will also cover e.g. errornous faults.Case (2): for some other archs, it is only accounted when the page
fault is resolved successfully.Case (3): there're still quite some archs that have not enabled
this perf event.Since this series will touch merely all the archs, we unify this
perf event to always follow case (1), which is the one that makes most
sense. And since we moved the accounting into handle_mm_fault, the
other two MAJ/MIN perf events are well taken care of naturally.- Unify definition of "major faults": the definition of "major
fault" is slightly changed when used in accounting (not
VM_FAULT_MAJOR). More information in patch 1.- Always account the page fault onto the one that triggered the page
fault. This does not matter much for #PF handlings, but mostly for
gup. More information on this in patch 25.Patchset layout:
Patch 1: Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23: Enable the new accounting for arch #PF handlers one by one.
Patch 24: Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25: Cleanup GUP task_struct pointer since it's not needed any moreThis patch (of 25):
This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault(). This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events. To
do this, the pt_regs pointer is passed into handle_mm_fault().PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.Suggested-by: Linus Torvalds
Signed-off-by: Peter Xu
Signed-off-by: Andrew Morton
Cc: Albert Ou
Cc: Alexander Gordeev
Cc: Andy Lutomirski
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Brian Cain
Cc: Catalin Marinas
Cc: Christian Borntraeger
Cc: Chris Zankel
Cc: Dave Hansen
Cc: David S. Miller
Cc: Geert Uytterhoeven
Cc: Gerald Schaefer
Cc: Greentime Hu
Cc: Guo Ren
Cc: Heiko Carstens
Cc: Helge Deller
Cc: H. Peter Anvin
Cc: Ingo Molnar
Cc: Ivan Kokshaysky
Cc: James E.J. Bottomley
Cc: John Hubbard
Cc: Jonas Bonn
Cc: Ley Foon Tan
Cc: "Luck, Tony"
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Nick Hu
Cc: Palmer Dabbelt
Cc: Paul Mackerras
Cc: Paul Walmsley
Cc: Pekka Enberg
Cc: Peter Zijlstra
Cc: Richard Henderson
Cc: Rich Felker
Cc: Russell King
Cc: Stafford Horne
Cc: Stefan Kristiansson
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Cc: Vasily Gorbik
Cc: Vincent Chen
Cc: Vineet Gupta
Cc: Will Deacon
Cc: Yoshinori Sato
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds
08 Aug, 2020
1 commit
-
Patch series "mm: cleanup usage of "
Most architectures have very similar versions of pXd_alloc_one() and
pXd_free_one() for intermediate levels of page table. These patches add
generic versions of these functions in and enable
use of the generic functions where appropriate.In addition, functions declared and defined in headers are
used mostly by core mm and early mm initialization in arch and there is no
actual reason to have the included all over the place.
The first patch in this series removes unneeded includes ofIn the end it didn't work out as neatly as I hoped and moving
pXd_alloc_track() definitions to would require
unnecessary changes to arches that have custom page table allocations, so
I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
to mm/.This patch (of 8):
In most cases header is required only for allocations of
page table memory. Most of the .c files that include that header do not
use symbols declared in and do not require that header.As for the other header files that used to include , it is
possible to move that include into the .c file that actually uses symbols
from and drop the include from the header file.The process was somewhat automated using
sed -i -E '/[
Signed-off-by: Andrew Morton
Reviewed-by: Pekka Enberg
Acked-by: Geert Uytterhoeven [m68k]
Cc: Abdul Haleem
Cc: Andy Lutomirski
Cc: Arnd Bergmann
Cc: Christophe Leroy
Cc: Joerg Roedel
Cc: Max Filippov
Cc: Peter Zijlstra
Cc: Satheesh Rajendran
Cc: Stafford Horne
Cc: Stephen Rothwell
Cc: Steven Rostedt
Cc: Joerg Roedel
Cc: Matthew Wilcox
Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
Signed-off-by: Linus Torvalds
10 Jun, 2020
5 commits
-
Convert comments that reference old mmap_sem APIs to reference
corresponding new mmap locking APIs instead.Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Vlastimil Babka
Reviewed-by: Davidlohr Bueso
Reviewed-by: Daniel Jordan
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Laurent Dufour
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.com
Signed-off-by: Linus Torvalds -
This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.The change is generated using coccinelle with the following rule:
// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .
@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Daniel Jordan
Reviewed-by: Laurent Dufour
Reviewed-by: Vlastimil Babka
Cc: Davidlohr Bueso
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds -
The powerpc 32-bit implementation of pgtable has nice shortcuts for
accessing kernel PMD and PTE for a given virtual address. Make these
helpers available for all architectures.[rppt@linux.ibm.com: microblaze: fix page table traversal in setup_rt_frame()]
Link: http://lkml.kernel.org/r/20200518191511.GD1118872@kernel.org
[akpm@linux-foundation.org: s/pmd_ptr_k/pmd_off_k/ in various powerpc places]Signed-off-by: Mike Rapoport
Signed-off-by: Andrew Morton
Cc: Arnd Bergmann
Cc: Borislav Petkov
Cc: Brian Cain
Cc: Catalin Marinas
Cc: Chris Zankel
Cc: "David S. Miller"
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Ungerer
Cc: Guan Xuetao
Cc: Guo Ren
Cc: Heiko Carstens
Cc: Helge Deller
Cc: Ingo Molnar
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Matthew Wilcox
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Nick Hu
Cc: Paul Walmsley
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Russell King
Cc: Stafford Horne
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vincent Chen
Cc: Vineet Gupta
Cc: Will Deacon
Cc: Yoshinori Sato
Link: http://lkml.kernel.org/r/20200514170327.31389-9-rppt@kernel.org
Signed-off-by: Linus Torvalds -
The replacement of with made the include
of the latter in the middle of asm includes. Fix this up with the aid of
the below script and manual adjustments here and there.import sys
import reif len(sys.argv) is not 3:
print "USAGE: %s " % (sys.argv[0])
sys.exit(1)hdr_to_move="#include " % sys.argv[2]
moved = False
in_hdrs = Falsewith open(sys.argv[1], "r") as f:
lines = f.readlines()
for _line in lines:
line = _line.rstrip('
')
if line == hdr_to_move:
continue
if line.startswith("#include
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Ungerer
Cc: Guan Xuetao
Cc: Guo Ren
Cc: Heiko Carstens
Cc: Helge Deller
Cc: Ingo Molnar
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Matthew Wilcox
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Nick Hu
Cc: Paul Walmsley
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Russell King
Cc: Stafford Horne
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vincent Chen
Cc: Vineet Gupta
Cc: Will Deacon
Cc: Yoshinori Sato
Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org
Signed-off-by: Linus Torvalds -
The include/linux/pgtable.h is going to be the home of generic page table
manipulation functions.Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
make the latter include asm/pgtable.h.Signed-off-by: Mike Rapoport
Signed-off-by: Andrew Morton
Cc: Arnd Bergmann
Cc: Borislav Petkov
Cc: Brian Cain
Cc: Catalin Marinas
Cc: Chris Zankel
Cc: "David S. Miller"
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Ungerer
Cc: Guan Xuetao
Cc: Guo Ren
Cc: Heiko Carstens
Cc: Helge Deller
Cc: Ingo Molnar
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Matthew Wilcox
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Nick Hu
Cc: Paul Walmsley
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Russell King
Cc: Stafford Horne
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vincent Chen
Cc: Vineet Gupta
Cc: Will Deacon
Cc: Yoshinori Sato
Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
Signed-off-by: Linus Torvalds
05 Jun, 2020
5 commits
-
To support kmap_atomic_prot(), all architectures need to support
protections passed to their kmap_atomic_high() function. Pass protections
into kmap_atomic_high() and change the name to kmap_atomic_high_prot() to
match.Then define kmap_atomic_prot() as a core function which calls
kmap_atomic_high_prot() when needed.Finally, redefine kmap_atomic() as a wrapper of kmap_atomic_prot() with
the default kmap_prot exported by the architectures.Signed-off-by: Ira Weiny
Signed-off-by: Andrew Morton
Reviewed-by: Christoph Hellwig
Cc: Al Viro
Cc: Andy Lutomirski
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Christian König
Cc: Chris Zankel
Cc: Daniel Vetter
Cc: Dan Williams
Cc: Dave Hansen
Cc: "David S. Miller"
Cc: Helge Deller
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: "James E.J. Bottomley"
Cc: Max Filippov
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20200507150004.1423069-11-ira.weiny@intel.com
Signed-off-by: Linus Torvalds -
To support kmap_atomic_prot() on all architectures each arch must support
protections passed in to them.Change csky, mips, nds32 and xtensa to use their global constant kmap_prot
rather than a hard coded value which was equal.Signed-off-by: Ira Weiny
Signed-off-by: Andrew Morton
Reviewed-by: Christoph Hellwig
Cc: Al Viro
Cc: Andy Lutomirski
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Christian König
Cc: Chris Zankel
Cc: Daniel Vetter
Cc: Dan Williams
Cc: Dave Hansen
Cc: "David S. Miller"
Cc: Helge Deller
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: "James E.J. Bottomley"
Cc: Max Filippov
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20200507150004.1423069-10-ira.weiny@intel.com
Signed-off-by: Linus Torvalds -
Every single architecture (including !CONFIG_HIGHMEM) calls...
pagefault_enable();
preempt_enable();... before returning from __kunmap_atomic(). Lift this code into the
kunmap_atomic() macro.While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to
be consistent.[ira.weiny@intel.com: don't enable pagefault/preempt twice]
Link: http://lkml.kernel.org/r/20200518184843.3029640-1-ira.weiny@intel.com
[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: Ira Weiny
Signed-off-by: Andrew Morton
Reviewed-by: Christoph Hellwig
Cc: Al Viro
Cc: Andy Lutomirski
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Christian König
Cc: Chris Zankel
Cc: Daniel Vetter
Cc: Dan Williams
Cc: Dave Hansen
Cc: "David S. Miller"
Cc: Helge Deller
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: "James E.J. Bottomley"
Cc: Max Filippov
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Cc: Guenter Roeck
Link: http://lkml.kernel.org/r/20200507150004.1423069-8-ira.weiny@intel.com
Signed-off-by: Linus Torvalds -
Every arch has the same code to ensure atomic operations and a check for
!HIGHMEM page.Remove the duplicate code by defining a core kmap_atomic() which only
calls the arch specific kmap_atomic_high() when the page is high memory.[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: Ira Weiny
Signed-off-by: Andrew Morton
Reviewed-by: Christoph Hellwig
Cc: Al Viro
Cc: Andy Lutomirski
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Christian König
Cc: Chris Zankel
Cc: Daniel Vetter
Cc: Dan Williams
Cc: Dave Hansen
Cc: "David S. Miller"
Cc: Helge Deller
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: "James E.J. Bottomley"
Cc: Max Filippov
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20200507150004.1423069-7-ira.weiny@intel.com
Signed-off-by: Linus Torvalds -
Move the kmap() build bug to kmap_init() to facilitate patches to lift
kmap() to the core.Signed-off-by: Ira Weiny
Signed-off-by: Andrew Morton
Reviewed-by: Christoph Hellwig
Cc: Al Viro
Cc: Andy Lutomirski
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Christian König
Cc: Chris Zankel
Cc: Daniel Vetter
Cc: Dan Williams
Cc: Dave Hansen
Cc: "David S. Miller"
Cc: Helge Deller
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: "James E.J. Bottomley"
Cc: Max Filippov
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20200507150004.1423069-3-ira.weiny@intel.com
Signed-off-by: Linus Torvalds
04 Jun, 2020
1 commit
-
free_area_init() only requires the definition of maximal PFN for each of
the supported zone rater than calculation of actual zone sizes and the
sizes of the holes between the zones.After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP the free_area_init() is
available to all architectures.Using this function instead of free_area_init_node() simplifies the zone
detection.Signed-off-by: Mike Rapoport
Signed-off-by: Andrew Morton
Tested-by: Hoan Tran [arm64]
Cc: Baoquan He
Cc: Brian Cain
Cc: Catalin Marinas
Cc: "David S. Miller"
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Ungerer
Cc: Guan Xuetao
Cc: Guo Ren
Cc: Heiko Carstens
Cc: Helge Deller
Cc: "James E.J. Bottomley"
Cc: Jonathan Corbet
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Hocko
Cc: Michal Simek
Cc: Nick Hu
Cc: Paul Walmsley
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Russell King
Cc: Stafford Horne
Cc: Thomas Bogendoerfer
Cc: Tony Luck
Cc: Vineet Gupta
Cc: Yoshinori Sato
Link: http://lkml.kernel.org/r/20200412194859.12663-15-rppt@kernel.org
Signed-off-by: Linus Torvalds
03 Apr, 2020
3 commits
-
The idea comes from a discussion between Linus and Andrea [1].
Before this patch we only allow a page fault to retry once. We achieved
this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
handle_mm_fault() the second time. This was majorly used to avoid
unexpected starvation of the system by looping over forever to handle the
page fault on a single page. However that should hardly happen, and after
all for each code path to return a VM_FAULT_RETRY we'll first wait for a
condition (during which time we should possibly yield the cpu) to happen
before VM_FAULT_RETRY is really returned.This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
flag when we receive VM_FAULT_RETRY. It means that the page fault handler
now can retry the page fault for multiple times if necessary without the
need to generate another page fault event. Meanwhile we still keep the
FAULT_FLAG_TRIED flag so page fault handler can still identify whether a
page fault is the first attempt or not.Then we'll have these combinations of fault flags (only considering
ALLOW_RETRY flag and TRIED flag):- ALLOW_RETRY and !TRIED: this means the page fault allows to
retry, and this is the first try- ALLOW_RETRY and TRIED: this means the page fault allows to
retry, and this is not the first try- !ALLOW_RETRY and !TRIED: this means the page fault does not allow
to retry at all- !ALLOW_RETRY and TRIED: this is forbidden and should never be used
In existing code we have multiple places that has taken special care of
the first condition above by checking against (fault_flags &
FAULT_FLAG_ALLOW_RETRY). This patch introduces a simple helper to detect
the first retry of a page fault by checking against both (fault_flags &
FAULT_FLAG_ALLOW_RETRY) and !(fault_flag & FAULT_FLAG_TRIED) because now
even the 2nd try will have the ALLOW_RETRY set, then use that helper in
all existing special paths. One example is in __lock_page_or_retry(), now
we'll drop the mmap_sem only in the first attempt of page fault and we'll
keep it in follow up retries, so old locking behavior will be retained.This will be a nice enhancement for current code [2] at the same time a
supporting material for the future userfaultfd-writeprotect work, since in
that work there will always be an explicit userfault writeprotect retry
for protected pages, and if that cannot resolve the page fault (e.g., when
userfaultfd-writeprotect is used in conjunction with swapped pages) then
we'll possibly need a 3rd retry of the page fault. It might also benefit
other potential users who will have similar requirement like userfault
write-protection.GUP code is not touched yet and will be covered in follow up patch.
Please read the thread below for more information.
[1] https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/
[2] https://lore.kernel.org/lkml/20181230154648.GB9832@redhat.com/Suggested-by: Linus Torvalds
Suggested-by: Andrea Arcangeli
Signed-off-by: Peter Xu
Signed-off-by: Andrew Morton
Tested-by: Brian Geffon
Cc: Bobby Powers
Cc: David Hildenbrand
Cc: Denis Plotnikov
Cc: "Dr . David Alan Gilbert"
Cc: Hugh Dickins
Cc: Jerome Glisse
Cc: Johannes Weiner
Cc: "Kirill A . Shutemov"
Cc: Martin Cracauer
Cc: Marty McFadden
Cc: Matthew Wilcox
Cc: Maya Gokhale
Cc: Mel Gorman
Cc: Mike Kravetz
Cc: Mike Rapoport
Cc: Pavel Emelyanov
Link: http://lkml.kernel.org/r/20200220160246.9790-1-peterx@redhat.com
Signed-off-by: Linus Torvalds -
Although there're tons of arch-specific page fault handlers, most of them
are still sharing the same initial value of the page fault flags. Say,
merely all of the page fault handlers would allow the fault to be retried,
and they also allow the fault to respond to SIGKILL.Let's define a default value for the fault flags to replace those initial
page fault flags that were copied over. With this, it'll be far easier to
introduce new fault flag that can be used by all the architectures instead
of touching all the archs.Signed-off-by: Peter Xu
Signed-off-by: Andrew Morton
Tested-by: Brian Geffon
Reviewed-by: David Hildenbrand
Cc: Andrea Arcangeli
Cc: Bobby Powers
Cc: Denis Plotnikov
Cc: "Dr . David Alan Gilbert"
Cc: Hugh Dickins
Cc: Jerome Glisse
Cc: Johannes Weiner
Cc: "Kirill A . Shutemov"
Cc: Martin Cracauer
Cc: Marty McFadden
Cc: Matthew Wilcox
Cc: Maya Gokhale
Cc: Mel Gorman
Cc: Mike Kravetz
Cc: Mike Rapoport
Cc: Pavel Emelyanov
Link: http://lkml.kernel.org/r/20200220160238.9694-1-peterx@redhat.com
Signed-off-by: Linus Torvalds -
For most architectures, we've got a quick path to detect fatal signal
after a handle_mm_fault(). Introduce a helper for that quick path.It cleans the current codes a bit so we don't need to duplicate the same
check across archs. More importantly, this will be an unified place that
we handle the signal immediately right after an interrupted page fault, so
it'll be much easier for us if we want to change the behavior of handling
signals later on for all the archs.Note that currently only part of the archs are using this new helper,
because some archs have their own way to handle signals. In the follow up
patches, we'll try to apply this helper to all the rest of archs.Another note is that the "regs" parameter in the new helper is not used
yet. It'll be used very soon. Now we kept it in this patch only to avoid
touching all the archs again in the follow up patches.[peterx@redhat.com: fix sparse warnings]
Link: http://lkml.kernel.org/r/20200311145921.GD479302@xz-x1
Signed-off-by: Peter Xu
Signed-off-by: Andrew Morton
Tested-by: Brian Geffon
Cc: Andrea Arcangeli
Cc: Bobby Powers
Cc: David Hildenbrand
Cc: Denis Plotnikov
Cc: "Dr . David Alan Gilbert"
Cc: Hugh Dickins
Cc: Jerome Glisse
Cc: Johannes Weiner
Cc: "Kirill A . Shutemov"
Cc: Martin Cracauer
Cc: Marty McFadden
Cc: Matthew Wilcox
Cc: Maya Gokhale
Cc: Mel Gorman
Cc: Mike Kravetz
Cc: Mike Rapoport
Cc: Pavel Emelyanov
Link: http://lkml.kernel.org/r/20200220155353.8676-4-peterx@redhat.com
Signed-off-by: Linus Torvalds
27 Nov, 2019
4 commits
-
KASAN shadow map doesn't need to be accessible through the linear kernel
mapping, allocate its pages with MEMBLOCK_ALLOC_ANYWHERE so that high
memory can be used. This frees up to ~100MB of low memory on xtensa
configurations with KASAN and high memory.Cc: stable@vger.kernel.org # v5.1+
Fixes: f240ec09bb8a ("memblock: replace memblock_alloc_base(ANYWHERE) with memblock_phys_alloc")
Reviewed-by: Mike Rapoport
Signed-off-by: Max Filippov -
Virtual and translated addresses retrieved by the xtensa TLB sanity
checker must be consistent, i.e. correspond to the same state of the
checked TLB entry. KASAN shadow memory is mapped dynamically using
auto-refill TLB entries and thus may change TLB state between the
virtual and translated address retrieval, resulting in false TLB
insanity report.
Move read_xtlb_translation close to read_xtlb_virtual to make sure that
read values are consistent.Cc: stable@vger.kernel.org
Fixes: a99e07ee5e88 ("xtensa: check TLB sanity on return to userspace")
Signed-off-by: Max Filippov -
xtensa has 2-level page tables and already uses pgtable-nopmd for page
table folding.Add walks of p4d level where appropriate and drop usage of
__ARCH_USE_5LEVEL_HACK.Signed-off-by: Mike Rapoport
Message-Id:
Signed-off-by: Max Filippov [fix up
arch/xtensa/include/asm/fixmap.h and
arch/xtensa/mm/tlb.c] -
There was a definition of pmd_offset() in arch/xtensa/include/asm/pgtable.h
that shadowed the generic implementation defined in
include/asm-generic/pgtable-nopmd.h.As the result, xtensa had shortcuts in page table traversal in several
places instead of doing level unfolding.Remove local override for pmd_offset() and add page table unfolding where
necessary.Signed-off-by: Mike Rapoport
Message-Id:
Signed-off-by: Max Filippov
21 Oct, 2019
1 commit
-
Use correct symbol for the end of .rodata section when dumping virtual
memory layout. This fixes odd rodata size with XIP kernel.Signed-off-by: Max Filippov
27 Aug, 2019
1 commit
-
The xtensa free_initrd_mem() verifies that initrd is mapped and then
frees its memory using free_reserved_area().The initrd is considered mapped when its memory was successfully reserved
with mem_reserve().Resetting initrd_start to 0 in case of mem_reserve() failure allows to
switch to generic free_initrd_mem() implementation.Signed-off-by: Mike Rapoport
Message-Id:
Signed-off-by: Max Filippov
17 Jul, 2019
1 commit
-
Pull Xtensa updates from Max Filippov:
- clean up PCI support code
- add defconfig and DTS for the 'virt' board
- abstract 'entry' and 'retw' uses in xtensa assembly in preparation
for XEA3/NX pipeline support- random small cleanups
* tag 'xtensa-20190715' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: virt: add defconfig and DTS
xtensa: abstract 'entry' and 'retw' in assembly code
xtensa: One function call less in bootmem_init()
xtensa: remove arch/xtensa/include/asm/types.h
xtensa: use generic pcibios_set_master and pcibios_enable_device
xtensa: drop dead PCI support code
xtensa/PCI: Remove unused variable
09 Jul, 2019
2 commits
-
…iederm/user-namespace
Pull force_sig() argument change from Eric Biederman:
"A source of error over the years has been that force_sig has taken a
task parameter when it is only safe to use force_sig with the current
task.The force_sig function is built for delivering synchronous signals
such as SIGSEGV where the userspace application caused a synchronous
fault (such as a page fault) and the kernel responded with a signal.Because the name force_sig does not make this clear, and because the
force_sig takes a task parameter the function force_sig has been
abused for sending other kinds of signals over the years. Slowly those
have been fixed when the oopses have been tracked down.This set of changes fixes the remaining abusers of force_sig and
carefully rips out the task parameter from force_sig and friends
making this kind of error almost impossible in the future"* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
signal: Remove the signal number and task parameters from force_sig_info
signal: Factor force_sig_info_to_task out of force_sig_info
signal: Generate the siginfo in force_sig
signal: Move the computation of force into send_signal and correct it.
signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
signal: Remove the task parameter from force_sig_fault
signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
signal: Explicitly call force_sig_fault on current
signal/unicore32: Remove tsk parameter from __do_user_fault
signal/arm: Remove tsk parameter from __do_user_fault
signal/arm: Remove tsk parameter from ptrace_break
signal/nds32: Remove tsk parameter from send_sigtrap
signal/riscv: Remove tsk parameter from do_trap
signal/sh: Remove tsk parameter from force_sig_info_fault
signal/um: Remove task parameter from send_sigtrap
signal/x86: Remove task parameter from send_sigtrap
signal: Remove task parameter from force_sig_mceerr
signal: Remove task parameter from force_sig
signal: Remove task parameter from force_sigsegv
... -
Provide abi_entry, abi_entry_default, abi_ret and abi_ret_default macros
that allocate aligned stack frame in windowed and call0 ABIs.
Provide XTENSA_SPILL_STACK_RESERVE macro that specifies required stack
frame size when register spilling is involved.
Replace all uses of 'entry' and 'retw' with the above macros.
This makes most of the xtensa assembly code ready for XEA3 and call0 ABI.Signed-off-by: Max Filippov
06 Jul, 2019
1 commit
-
Avoid an extra function call by using a ternary operator instead of
a conditional statement for a setting selection.This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring
Message-Id:
Signed-off-by: Max Filippov
19 Jun, 2019
1 commit
-
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundationthis program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner
Reviewed-by: Enrico Weigelt
Reviewed-by: Kate Stewart
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman
29 May, 2019
1 commit
-
As synchronous exceptions really only make sense against the current
task (otherwise how are you synchronous) remove the task parameter
from from force_sig_fault to make it explicit that is what is going
on.The two known exceptions that deliver a synchronous exception to a
stopped ptraced task have already been changed to
force_sig_fault_to_task.The callers have been changed with the following emacs regular expression
(with obvious variations on the architectures that take more arguments)
to avoid typos:force_sig_fault[(]\([^,]+\)[,]\([^,]+\)[,]\([^,]+\)[,]\W+current[)]
->
force_sig_fault(\1,\2,\3)Signed-off-by: "Eric W. Biederman"
21 May, 2019
1 commit
-
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:GPL-2.0-only
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
15 May, 2019
1 commit
-
Patch series "provide a generic free_initmem implementation", v2.
Many architectures implement free_initmem() in exactly the same or very
similar way: they wrap the call to free_initmem_default() with sometimes
different 'poison' parameter.These patches switch those architectures to use a generic implementation
that does free_initmem_default(POISON_FREE_INITMEM).This was inspired by Christoph's patches for free_initrd_mem [1] and I
shamelessly copied changelog entries from his patches :)[1] https://lore.kernel.org/lkml/20190213174621.29297-1-hch@lst.de/
This patch (of 2):
For most architectures free_initmem just a wrapper for the same
free_initmem_default(-1) call. Provide that as a generic implementation
marked __weak.Link: http://lkml.kernel.org/r/1550515285-17446-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport
Reviewed-by: Andrew Morton
Cc: Christoph Hellwig
Cc: Palmer Dabbelt
Cc: Richard Kuo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
05 Apr, 2019
1 commit
-
Use %lu instead of %zu to fix the following warning introduced with
recent memblock refactoring:
xtensa/mm/mmu.c:36:9: warning: format '%zu' expects argument of type
'size_t', but argument 3 has type 'long unsigned intSigned-off-by: Max Filippov
13 Mar, 2019
3 commits
-
Add check for the return value of memblock_alloc*() functions and call
panic() in case of error. The panic message repeats the one used by
panicing memblock allocators with adjustment of parameters to include
only relevant ones.The replacement was mostly automated with semantic patches like the one
below with manual massaging of format strings.@@
expression ptr, size, align;
@@
ptr = memblock_alloc(size, align);
+ if (!ptr)
+ panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);[anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
[rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
[rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
[akpm@linux-foundation.org: fix xtensa printk warning]
Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport
Signed-off-by: Anders Roxell
Reviewed-by: Guo Ren [c-sky]
Acked-by: Paul Burton [MIPS]
Acked-by: Heiko Carstens [s390]
Reviewed-by: Juergen Gross [Xen]
Reviewed-by: Geert Uytterhoeven [m68k]
Acked-by: Max Filippov [xtensa]
Cc: Catalin Marinas
Cc: Christophe Leroy
Cc: Christoph Hellwig
Cc: "David S. Miller"
Cc: Dennis Zhou
Cc: Greentime Hu
Cc: Greg Kroah-Hartman
Cc: Guan Xuetao
Cc: Guo Ren
Cc: Mark Salter
Cc: Matt Turner
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Petr Mladek
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Rob Herring
Cc: Rob Herring
Cc: Russell King
Cc: Stafford Horne
Cc: Tony Luck
Cc: Vineet Gupta
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make the memblock_phys_alloc() function an inline wrapper for
memblock_phys_alloc_range() and update the memblock_phys_alloc() callers
to check the returned value and panic in case of error.Link: http://lkml.kernel.org/r/1548057848-15136-8-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport
Cc: Catalin Marinas
Cc: Christophe Leroy
Cc: Christoph Hellwig
Cc: "David S. Miller"
Cc: Dennis Zhou
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Kroah-Hartman
Cc: Guan Xuetao
Cc: Guo Ren
Cc: Guo Ren [c-sky]
Cc: Heiko Carstens
Cc: Juergen Gross [Xen]
Cc: Mark Salter
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Paul Burton
Cc: Petr Mladek
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Rob Herring
Cc: Rob Herring
Cc: Russell King
Cc: Stafford Horne
Cc: Tony Luck
Cc: Vineet Gupta
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The calls to memblock_alloc_base(size, align, MEMBLOCK_ALLOC_ANYWHERE)
and memblock_phys_alloc(size, align) are equivalent as both try to
allocate 'size' bytes with 'align' alignment anywhere in the memory and
panic if hte allocation fails.The conversion is done using the following semantic patch:
@@
expression size, align;
@@
- memblock_alloc_base(size, align, MEMBLOCK_ALLOC_ANYWHERE)
+ memblock_phys_alloc(size, align)Link: http://lkml.kernel.org/r/1548057848-15136-4-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport
Cc: Catalin Marinas
Cc: Christophe Leroy
Cc: Christoph Hellwig
Cc: "David S. Miller"
Cc: Dennis Zhou
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Kroah-Hartman
Cc: Guan Xuetao
Cc: Guo Ren
Cc: Guo Ren [c-sky]
Cc: Heiko Carstens
Cc: Juergen Gross [Xen]
Cc: Mark Salter
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Paul Burton
Cc: Petr Mladek
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Rob Herring
Cc: Rob Herring
Cc: Russell King
Cc: Stafford Horne
Cc: Tony Luck
Cc: Vineet Gupta
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds