16 Nov, 2020

1 commit

  • Although cache alias management calls set up and tear down TLB entries,
    and fast_second_level_miss is able to restore a TLB entry should it be
    evicted, they absolutely cannot preempt each other, because they use the
    same TLBTEMP area for different purposes.
    Disable preemption around all cache alias management calls to enforce
    that.
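
    For illustration, a minimal sketch of the pattern this describes (the
    alias-handling helper name is assumed for the example, not taken from the
    actual patch):

        /* cache alias management uses TLBTEMP; keep the fast TLB miss
         * handler from preempting it and reusing the same TLBTEMP slot */
        preempt_disable();
        clear_page_alias(kvaddr, phys);   /* assumed alias-clearing helper */
        preempt_enable();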

    Cc: stable@vger.kernel.org
    Signed-off-by: Max Filippov

    Max Filippov
     

04 Nov, 2020

1 commit

  • free_highpages() iterates over the free memblock regions in high
    memory, and marks each page as available for the memory management
    system.

    Until commit cddb5ddf2b76 ("arm, xtensa: simplify initialization of
    high memory pages") it rounded the beginning of each region upwards and
    the end of each region downwards.

    However, after that commit free_highpages() rounds both the beginning and
    the end of each region downwards, and we may end up freeing a page that
    is memblock_reserve()d, resulting in memory corruption.

    Restore the original rounding of the region boundaries to avoid freeing
    reserved pages.
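
    A rough sketch of the restored rounding inside the free_highpages() loop
    (memblock iterator and PFN helpers as in the generic code; not the
    verbatim patch):

        phys_addr_t range_start, range_end;
        u64 i;

        for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE,
                                &range_start, &range_end, NULL) {
                /* round inwards, so a partial page shared with a
                 * reserved neighbour is never handed to the buddy */
                unsigned long start = PFN_UP(range_start);
                unsigned long end = PFN_DOWN(range_end);

                if (start < max_low_pfn)
                        start = max_low_pfn;
                for (; start < end; start++)
                        free_highmem_page(pfn_to_page(start));
        }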

    Fixes: cddb5ddf2b76 ("arm, xtensa: simplify initialization of high memory pages")
    Link: https://lore.kernel.org/r/20201029110334.4118-1-ardb@kernel.org/
    Link: https://lore.kernel.org/r/20201031094345.6984-1-rppt@kernel.org
    Signed-off-by: Ard Biesheuvel
    Co-developed-by: Mike Rapoport
    Signed-off-by: Mike Rapoport
    Acked-by: Max Filippov

    Ard Biesheuvel
     

16 Oct, 2020

1 commit

  • Pull dma-mapping updates from Christoph Hellwig:

    - rework the non-coherent DMA allocator

    - move private definitions out of <linux/dma-mapping.h>

    - lower CMA_ALIGNMENT (Paul Cercueil)

    - remove the omap1 dma address translation in favor of the common code

    - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

    - support per-node DMA CMA areas (Barry Song)

    - increase the default seg boundary limit (Nicolin Chen)

    - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

    - various cleanups

    * tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
    ARM/ixp4xx: add a missing include of dma-map-ops.h
    dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
    dma-direct: factor out a dma_direct_alloc_from_pool helper
    dma-direct check for highmem pages in dma_direct_alloc_pages
    dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
    dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
    dma-mapping: move dma-debug.h to kernel/dma/
    dma-mapping: remove <asm/dma-contiguous.h>
    dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
    dma-contiguous: remove dma_contiguous_set_default
    dma-contiguous: remove dev_set_cma_area
    dma-contiguous: remove dma_declare_contiguous
    dma-mapping: split <linux/dma-mapping.h>
    cma: decrease CMA_ALIGNMENT lower limit to 2
    firewire-ohci: use dma_alloc_pages
    dma-iommu: implement ->alloc_noncoherent
    dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
    dma-mapping: add a new dma_alloc_pages API
    dma-mapping: remove dma_cache_sync
    53c700: convert to dma_alloc_noncoherent
    ...

    Linus Torvalds
     

14 Oct, 2020

1 commit

  • free_highpages() in both arm and xtensa essentially open-codes the
    for_each_free_mem_range() loop to detect high memory pages that were not
    reserved and that should be initialized and passed to the buddy allocator.

    Replace the open-coded implementation with the for_each_free_mem_range()
    memblock API to simplify the code.

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Tested-by: Max Filippov [xtensa]
    Reviewed-by: Max Filippov [xtensa]
    Cc: Andy Lutomirski
    Cc: Baoquan He
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Catalin Marinas
    Cc: Christoph Hellwig
    Cc: Daniel Axtens
    Cc: Dave Hansen
    Cc: Emil Renner Berthing
    Cc: Hari Bathini
    Cc: Ingo Molnar
    Cc: Ingo Molnar
    Cc: Jonathan Cameron
    Cc: Marek Szyprowski
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Miguel Ojeda
    Cc: Palmer Dabbelt
    Cc: Paul Mackerras
    Cc: Paul Walmsley
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: https://lkml.kernel.org/r/20200818151634.14343-4-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

06 Oct, 2020

1 commit


13 Aug, 2020

2 commits

  • Use the general page fault accounting by passing regs into
    handle_mm_fault(). It naturally solves the issue of multiple page fault
    accounting when a page fault retry happens.

    Remove the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf events because they
    are now also handled in handle_mm_fault().

    Move the PERF_COUNT_SW_PAGE_FAULTS event up, before taking mmap_sem for
    the fault, so that it matches the rest of the archs.
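
    Roughly, the arch-side shape after this change looks like the following
    (a sketch with typical fault-handler variable names, not the xtensa diff
    itself):

        perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);

        mmap_read_lock(mm);
        vma = find_vma(mm, address);
        ...
        /* regs is now passed down; handle_mm_fault() does the per-task
         * and PERF_COUNT_SW_PAGE_FAULTS_MAJ/MIN accounting */
        fault = handle_mm_fault(vma, address, flags, regs);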

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Acked-by: Max Filippov
    Cc: Chris Zankel
    Link: http://lkml.kernel.org/r/20200707225021.200906-24-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Patch series "mm: Page fault accounting cleanups", v5.

    This is v5 of the pf accounting cleanup series. It originates from Gerald
    Schaefer's report a week ago on incorrect page fault accounting for
    retried page faults after commit 4064b9827063 ("mm: allow
    VM_FAULT_RETRY for multiple times"):

    https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

    What this series did:

    - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault. For example, page fault
    retries should not be counted in page fault counters. The same applies
    to the perf events.

    - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g. erroneous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series touches nearly all the archs, we unify this
    perf event to always follow case (1), which is the one that makes the
    most sense. And since we moved the accounting into handle_mm_fault(),
    the other two MAJ/MIN perf events are taken care of naturally.

    - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR). More information in patch 1.

    - Always account the page fault onto the one that triggered the page
    fault. This does not matter much for #PF handlings, but mostly for
    gup. More information on this in patch 25.

    Patchset layout:

    Patch 1: Introduced the accounting in handle_mm_fault(), not enabled.
    Patch 2-23: Enable the new accounting for arch #PF handlers one by one.
    Patch 24: Enable the new accounting for the rest outliers (gup, iommu, etc.)
    Patch 25: Cleanup GUP task_struct pointer since it's not needed any more

    This patch (of 25):

    This is a preparation patch to move page fault accountings into the
    general code in handle_mm_fault(). This includes both the per task
    flt_maj/flt_min counters, and the major/minor page fault perf events. To
    do this, the pt_regs pointer is passed into handle_mm_fault().

    PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
    handlers.

    So far, all the pt_regs pointers passed into handle_mm_fault() are
    NULL, which means this patch should have no intended functional change.
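
    For reference, the resulting prototype looks like this (the regs argument
    stays NULL for the non-#PF callers that are converted later in the
    series):

        vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
                                   unsigned long address,
                                   unsigned int flags, struct pt_regs *regs);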

    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Cc: Albert Ou
    Cc: Alexander Gordeev
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Christian Borntraeger
    Cc: Chris Zankel
    Cc: Dave Hansen
    Cc: David S. Miller
    Cc: Geert Uytterhoeven
    Cc: Gerald Schaefer
    Cc: Greentime Hu
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: H. Peter Anvin
    Cc: Ingo Molnar
    Cc: Ivan Kokshaysky
    Cc: James E.J. Bottomley
    Cc: John Hubbard
    Cc: Jonas Bonn
    Cc: Ley Foon Tan
    Cc: "Luck, Tony"
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Palmer Dabbelt
    Cc: Paul Mackerras
    Cc: Paul Walmsley
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Cc: Richard Henderson
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Stefan Kristiansson
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Vasily Gorbik
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
    Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     

08 Aug, 2020

1 commit

  • Patch series "mm: cleanup usage of <asm/pgalloc.h>"

    Most architectures have very similar versions of pXd_alloc_one() and
    pXd_free_one() for intermediate levels of page table. These patches add
    generic versions of these functions in <asm-generic/pgalloc.h> and enable
    use of the generic functions where appropriate.

    In addition, functions declared and defined in <asm/pgalloc.h> are used
    mostly by core mm and early mm initialization in arch code, and there is
    no actual reason to have <asm/pgalloc.h> included all over the place.
    The first patch in this series removes unneeded includes of
    <asm/pgalloc.h>.

    In the end it didn't work out as neatly as I hoped and moving
    pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
    unnecessary changes to arches that have custom page table allocations, so
    I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
    to mm/.
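
    As an illustration, the generic helpers added by the series look roughly
    like the sketch below (in the style of <asm-generic/pgalloc.h>; not the
    exact final code):

        static inline pud_t *pud_alloc_one(struct mm_struct *mm,
                                           unsigned long addr)
        {
                gfp_t gfp = GFP_PGTABLE_USER;

                if (mm == &init_mm)
                        gfp = GFP_PGTABLE_KERNEL;
                return (pud_t *)get_zeroed_page(gfp);
        }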

    This patch (of 8):

    In most cases the <asm/pgalloc.h> header is required only for allocations
    of page table memory. Most of the .c files that include that header do
    not use symbols declared in <asm/pgalloc.h> and do not require that
    header.

    As for the other header files that used to include <asm/pgalloc.h>, it is
    possible to move that include into the .c file that actually uses symbols
    from <asm/pgalloc.h> and drop the include from the header file.

    The process was somewhat automated using

    sed -i -E '/[<"]asm\/pgalloc\.h/d' ...

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Reviewed-by: Pekka Enberg
    Acked-by: Geert Uytterhoeven [m68k]
    Cc: Abdul Haleem
    Cc: Andy Lutomirski
    Cc: Arnd Bergmann
    Cc: Christophe Leroy
    Cc: Joerg Roedel
    Cc: Max Filippov
    Cc: Peter Zijlstra
    Cc: Satheesh Rajendran
    Cc: Stafford Horne
    Cc: Stephen Rothwell
    Cc: Steven Rostedt
    Cc: Joerg Roedel
    Cc: Matthew Wilcox
    Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
    Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

10 Jun, 2020

5 commits

  • Convert comments that reference old mmap_sem APIs to reference
    corresponding new mmap locking APIs instead.

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Vlastimil Babka
    Reviewed-by: Davidlohr Bueso
    Reviewed-by: Daniel Jordan
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Laurent Dufour
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • This change converts the existing mmap_sem rwsem calls to use the new mmap
    locking API instead.

    The change is generated using coccinelle with the following rule:

    // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

    @@
    expression mm;
    @@
    (
    -init_rwsem
    +mmap_init_lock
    |
    -down_write
    +mmap_write_lock
    |
    -down_write_killable
    +mmap_write_lock_killable
    |
    -down_write_trylock
    +mmap_write_trylock
    |
    -up_write
    +mmap_write_unlock
    |
    -downgrade_write
    +mmap_write_downgrade
    |
    -down_read
    +mmap_read_lock
    |
    -down_read_killable
    +mmap_read_lock_killable
    |
    -down_read_trylock
    +mmap_read_trylock
    |
    -up_read
    +mmap_read_unlock
    )
    -(&mm->mmap_sem)
    +(mm)
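
    For example, a typical call site changes like this (a sketch of what the
    rule above produces):

        -       down_read(&mm->mmap_sem);
        +       mmap_read_lock(mm);
                ...
        -       up_read(&mm->mmap_sem);
        +       mmap_read_unlock(mm);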

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Daniel Jordan
    Reviewed-by: Laurent Dufour
    Reviewed-by: Vlastimil Babka
    Cc: Davidlohr Bueso
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • The powerpc 32-bit implementation of pgtable has nice shortcuts for
    accessing kernel PMD and PTE for a given virtual address. Make these
    helpers available for all architectures.
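
    The generalized helpers boil down to something like the sketch below
    (kernel page table walk for a given virtual address, using the names the
    series introduces):

        static inline pmd_t *pmd_off_k(unsigned long va)
        {
                return pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va),
                                             va), va);
        }

        /* e.g. fetching the kernel PTE for an address */
        pte_t *pte = pte_offset_kernel(pmd_off_k(vaddr), vaddr);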

    [rppt@linux.ibm.com: microblaze: fix page table traversal in setup_rt_frame()]
    Link: http://lkml.kernel.org/r/20200518191511.GD1118872@kernel.org
    [akpm@linux-foundation.org: s/pmd_ptr_k/pmd_off_k/ in various powerpc places]

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matthew Wilcox
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200514170327.31389-9-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The replacement of <asm/pgtable.h> with <linux/pgtable.h> left the
    include of the latter in the middle of asm includes. Fix this up with the
    aid of the below script and manual adjustments here and there.

    import sys
    import re

    if len(sys.argv) is not 3:
        print "USAGE: %s <file> <header to move>" % (sys.argv[0])
        sys.exit(1)

    hdr_to_move="#include <linux/%s>" % sys.argv[2]
    moved = False
    in_hdrs = False

    with open(sys.argv[1], "r") as f:
        lines = f.readlines()
        for _line in lines:
            line = _line.rstrip('\n')
            if line == hdr_to_move:
                continue
            if line.startswith("#include <linux/"):
                in_hdrs = True
            elif not moved and in_hdrs:
                moved = True
                print hdr_to_move
            print line

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matthew Wilcox
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The include/linux/pgtable.h is going to be the home of generic page table
    manipulation functions.

    Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
    make the latter include asm/pgtable.h.

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matthew Wilcox
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

05 Jun, 2020

5 commits

  • To support kmap_atomic_prot(), all architectures need to support
    protections passed to their kmap_atomic_high() function. Pass protections
    into kmap_atomic_high() and change the name to kmap_atomic_high_prot() to
    match.

    Then define kmap_atomic_prot() as a core function which calls
    kmap_atomic_high_prot() when needed.

    Finally, redefine kmap_atomic() as a wrapper of kmap_atomic_prot() with
    the default kmap_prot exported by the architectures.
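
    A condensed sketch of the resulting core wrapper (close to the shape
    described above; debug hooks and arch details omitted):

        static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot)
        {
                preempt_disable();
                pagefault_disable();
                if (!PageHighMem(page))
                        return page_address(page);
                return kmap_atomic_high_prot(page, prot);
        }

        #define kmap_atomic(page)  kmap_atomic_prot(page, kmap_prot)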

    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20200507150004.1423069-11-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     
  • To support kmap_atomic_prot() on all architectures, each arch must support
    protections being passed to it.

    Change csky, mips, nds32 and xtensa to use their global constant kmap_prot
    rather than a hard-coded value that was equal to it.

    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20200507150004.1423069-10-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     
  • Every single architecture (including !CONFIG_HIGHMEM) calls...

    pagefault_enable();
    preempt_enable();

    ... before returning from __kunmap_atomic(). Lift this code into the
    kunmap_atomic() macro.

    While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to
    be consistent.
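
    The lifted macro ends up looking roughly like this (a sketch, not the
    verbatim header):

        #define kunmap_atomic(addr)                                     \
        do {                                                            \
                BUILD_BUG_ON(__same_type((addr), struct page *));       \
                kunmap_atomic_high(addr);                               \
                pagefault_enable();                                     \
                preempt_enable();                                       \
        } while (0)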

    [ira.weiny@intel.com: don't enable pagefault/preempt twice]
    Link: http://lkml.kernel.org/r/20200518184843.3029640-1-ira.weiny@intel.com
    [akpm@linux-foundation.org: coding style fixes]
    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Guenter Roeck
    Link: http://lkml.kernel.org/r/20200507150004.1423069-8-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     
  • Every arch has the same code to ensure atomic operations and to check
    for a !HIGHMEM page.

    Remove the duplicate code by defining a core kmap_atomic() which only
    calls the arch specific kmap_atomic_high() when the page is high memory.

    [akpm@linux-foundation.org: coding style fixes]
    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20200507150004.1423069-7-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     
  • Move the kmap() build bug to kmap_init() to facilitate patches to lift
    kmap() to the core.

    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20200507150004.1423069-3-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     

04 Jun, 2020

1 commit

  • free_area_init() only requires the definition of the maximal PFN for each
    of the supported zones rather than the calculation of actual zone sizes
    and the sizes of the holes between the zones.

    After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP the free_area_init() is
    available to all architectures.

    Using this function instead of free_area_init_node() simplifies the zone
    detection.
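
    For a highmem-capable arch such as xtensa, the simplified zone setup looks
    roughly like this (a sketch, not the exact arch diff):

        unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };

        max_zone_pfn[ZONE_NORMAL] = max_low_pfn;
        #ifdef CONFIG_HIGHMEM
        max_zone_pfn[ZONE_HIGHMEM] = max_pfn;
        #endif
        free_area_init(max_zone_pfn);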

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Tested-by: Hoan Tran [arm64]
    Cc: Baoquan He
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: "James E.J. Bottomley"
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Hocko
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200412194859.12663-15-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

03 Apr, 2020

3 commits

  • The idea comes from a discussion between Linus and Andrea [1].

    Before this patch we only allow a page fault to be retried once. We
    achieved this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
    handle_mm_fault() the second time. This was mainly used to avoid
    unexpected starvation of the system by looping forever to handle the
    page fault on a single page. However that should hardly happen: after
    all, every code path that returns VM_FAULT_RETRY first waits for some
    condition (during which time we should possibly yield the cpu) before
    VM_FAULT_RETRY is really returned.

    This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
    flag when we receive VM_FAULT_RETRY. It means that the page fault handler
    can now retry the page fault multiple times if necessary without the
    need to generate another page fault event. Meanwhile we still keep the
    FAULT_FLAG_TRIED flag so the page fault handler can still identify
    whether a page fault is the first attempt or not.

    Then we'll have these combinations of fault flags (only considering
    ALLOW_RETRY flag and TRIED flag):

    - ALLOW_RETRY and !TRIED: this means the page fault allows to
    retry, and this is the first try

    - ALLOW_RETRY and TRIED: this means the page fault allows to
    retry, and this is not the first try

    - !ALLOW_RETRY and !TRIED: this means the page fault does not allow
    to retry at all

    - !ALLOW_RETRY and TRIED: this is forbidden and should never be used

    In existing code we have multiple places that have taken special care of
    the first condition above by checking against (fault_flags &
    FAULT_FLAG_ALLOW_RETRY). This patch introduces a simple helper to detect
    the first retry of a page fault by checking against both (fault_flags &
    FAULT_FLAG_ALLOW_RETRY) and !(fault_flags & FAULT_FLAG_TRIED), because
    now even the 2nd try will have ALLOW_RETRY set, then uses that helper in
    all existing special paths. One example is in __lock_page_or_retry():
    now we'll drop the mmap_sem only in the first attempt of a page fault
    and we'll keep it in follow-up retries, so old locking behavior will be
    retained.
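
    The helper is essentially the check described above (sketch):

        static inline bool fault_flag_allow_retry_first(unsigned int flags)
        {
                return (flags & FAULT_FLAG_ALLOW_RETRY) &&
                       (!(flags & FAULT_FLAG_TRIED));
        }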

    This will be a nice enhancement for current code [2] and at the same time
    supporting material for the future userfaultfd-writeprotect work, since in
    that work there will always be an explicit userfault writeprotect retry
    for protected pages, and if that cannot resolve the page fault (e.g., when
    userfaultfd-writeprotect is used in conjunction with swapped pages) then
    we'll possibly need a 3rd retry of the page fault. It might also benefit
    other potential users who will have similar requirement like userfault
    write-protection.

    GUP code is not touched yet and will be covered in follow up patch.

    Please read the thread below for more information.

    [1] https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/
    [2] https://lore.kernel.org/lkml/20181230154648.GB9832@redhat.com/

    Suggested-by: Linus Torvalds
    Suggested-by: Andrea Arcangeli
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160246.9790-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Although there are tons of arch-specific page fault handlers, most of
    them still share the same initial value of the page fault flags. Say,
    nearly all of the page fault handlers allow the fault to be retried, and
    they also allow the fault to respond to SIGKILL.

    Let's define a default value for the fault flags to replace those initial
    page fault flags that were copied over. With this, it'll be far easier to
    introduce a new fault flag that can be used by all the architectures
    instead of touching all the archs.
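
    In the tree this becomes a single define along these lines (a sketch; the
    flag set mirrors the retry/killable behaviour described above):

        #define FAULT_FLAG_DEFAULT  (FAULT_FLAG_ALLOW_RETRY | \
                                     FAULT_FLAG_KILLABLE | \
                                     FAULT_FLAG_INTERRUPTIBLE)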

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Reviewed-by: David Hildenbrand
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160238.9694-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • For most architectures, we've got a quick path to detect a fatal signal
    after handle_mm_fault(). Introduce a helper for that quick path.

    It cleans up the current code a bit so we don't need to duplicate the
    same check across archs. More importantly, this will be a unified place
    where we handle the signal immediately after an interrupted page fault,
    so it'll be much easier for us if we want to change the behavior of
    handling signals later on for all the archs.

    Note that currently only some of the archs are using this new helper,
    because some archs have their own way to handle signals. In the follow-up
    patches, we'll try to apply this helper to all the rest of the archs.

    Another note is that the "regs" parameter in the new helper is not used
    yet. It'll be used very soon. For now we keep it in this patch only to
    avoid touching all the archs again in the follow-up patches.
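
    Roughly, the helper amounts to the check below (a sketch; as noted, regs
    is accepted but not yet used at this point):

        static inline bool fault_signal_pending(vm_fault_t fault_flags,
                                                struct pt_regs *regs)
        {
                return unlikely((fault_flags & VM_FAULT_RETRY) &&
                                fatal_signal_pending(current));
        }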

    [peterx@redhat.com: fix sparse warnings]
    Link: http://lkml.kernel.org/r/20200311145921.GD479302@xz-x1
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220155353.8676-4-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     

27 Nov, 2019

4 commits

  • The KASAN shadow map doesn't need to be accessible through the linear
    kernel mapping; allocate its pages with MEMBLOCK_ALLOC_ANYWHERE so that
    high memory can be used. This frees up to ~100MB of low memory on xtensa
    configurations with KASAN and high memory.
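
    In memblock terms this means passing MEMBLOCK_ALLOC_ANYWHERE as the upper
    allocation limit, e.g. (a sketch with an assumed 'size' variable, not the
    exact kasan_init change):

        /* lift the default lowmem limit so highmem can be used too */
        phys_addr_t phys = memblock_phys_alloc_range(size, PAGE_SIZE, 0,
                                                     MEMBLOCK_ALLOC_ANYWHERE);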

    Cc: stable@vger.kernel.org # v5.1+
    Fixes: f240ec09bb8a ("memblock: replace memblock_alloc_base(ANYWHERE) with memblock_phys_alloc")
    Reviewed-by: Mike Rapoport
    Signed-off-by: Max Filippov

    Max Filippov
     
  • Virtual and translated addresses retrieved by the xtensa TLB sanity
    checker must be consistent, i.e. correspond to the same state of the
    checked TLB entry. KASAN shadow memory is mapped dynamically using
    auto-refill TLB entries and thus may change TLB state between the
    virtual and translated address retrieval, resulting in false TLB
    insanity report.
    Move read_xtlb_translation close to read_xtlb_virtual to make sure that
    read values are consistent.

    Cc: stable@vger.kernel.org
    Fixes: a99e07ee5e88 ("xtensa: check TLB sanity on return to userspace")
    Signed-off-by: Max Filippov

    Max Filippov
     
  • xtensa has 2-level page tables and already uses pgtable-nopmd for page
    table folding.

    Add walks of p4d level where appropriate and drop usage of
    __ARCH_USE_5LEVEL_HACK.

    Signed-off-by: Mike Rapoport
    Message-Id:
    Signed-off-by: Max Filippov [fix up
    arch/xtensa/include/asm/fixmap.h and
    arch/xtensa/mm/tlb.c]

    Mike Rapoport
     
  • There was a definition of pmd_offset() in arch/xtensa/include/asm/pgtable.h
    that shadowed the generic implementation defined in
    include/asm-generic/pgtable-nopmd.h.

    As a result, xtensa had shortcuts in page table traversal in several
    places instead of doing level unfolding.

    Remove local override for pmd_offset() and add page table unfolding where
    necessary.

    Signed-off-by: Mike Rapoport
    Message-Id:
    Signed-off-by: Max Filippov

    Mike Rapoport
     

21 Oct, 2019

1 commit


27 Aug, 2019

1 commit

  • The xtensa free_initrd_mem() verifies that initrd is mapped and then
    frees its memory using free_reserved_area().

    The initrd is considered mapped when its memory was successfully reserved
    with mem_reserve().

    Resetting initrd_start to 0 in case of mem_reserve() failure allows
    switching to the generic free_initrd_mem() implementation.

    Signed-off-by: Mike Rapoport
    Message-Id:
    Signed-off-by: Max Filippov

    Mike Rapoport
     

17 Jul, 2019

1 commit

  • Pull Xtensa updates from Max Filippov:

    - clean up PCI support code

    - add defconfig and DTS for the 'virt' board

    - abstract 'entry' and 'retw' uses in xtensa assembly in preparation
    for XEA3/NX pipeline support

    - random small cleanups

    * tag 'xtensa-20190715' of git://github.com/jcmvbkbc/linux-xtensa:
    xtensa: virt: add defconfig and DTS
    xtensa: abstract 'entry' and 'retw' in assembly code
    xtensa: One function call less in bootmem_init()
    xtensa: remove arch/xtensa/include/asm/types.h
    xtensa: use generic pcibios_set_master and pcibios_enable_device
    xtensa: drop dead PCI support code
    xtensa/PCI: Remove unused variable

    Linus Torvalds
     

09 Jul, 2019

2 commits

  • …iederm/user-namespace

    Pull force_sig() argument change from Eric Biederman:
    "A source of error over the years has been that force_sig has taken a
    task parameter when it is only safe to use force_sig with the current
    task.

    The force_sig function is built for delivering synchronous signals
    such as SIGSEGV where the userspace application caused a synchronous
    fault (such as a page fault) and the kernel responded with a signal.

    Because the name force_sig does not make this clear, and because the
    force_sig takes a task parameter the function force_sig has been
    abused for sending other kinds of signals over the years. Slowly those
    have been fixed when the oopses have been tracked down.

    This set of changes fixes the remaining abusers of force_sig and
    carefully rips out the task parameter from force_sig and friends
    making this kind of error almost impossible in the future"

    * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
    signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
    signal: Remove the signal number and task parameters from force_sig_info
    signal: Factor force_sig_info_to_task out of force_sig_info
    signal: Generate the siginfo in force_sig
    signal: Move the computation of force into send_signal and correct it.
    signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
    signal: Remove the task parameter from force_sig_fault
    signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
    signal: Explicitly call force_sig_fault on current
    signal/unicore32: Remove tsk parameter from __do_user_fault
    signal/arm: Remove tsk parameter from __do_user_fault
    signal/arm: Remove tsk parameter from ptrace_break
    signal/nds32: Remove tsk parameter from send_sigtrap
    signal/riscv: Remove tsk parameter from do_trap
    signal/sh: Remove tsk parameter from force_sig_info_fault
    signal/um: Remove task parameter from send_sigtrap
    signal/x86: Remove task parameter from send_sigtrap
    signal: Remove task parameter from force_sig_mceerr
    signal: Remove task parameter from force_sig
    signal: Remove task parameter from force_sigsegv
    ...

    Linus Torvalds
     
  • Provide abi_entry, abi_entry_default, abi_ret and abi_ret_default macros
    that allocate an aligned stack frame in the windowed and call0 ABIs.
    Provide the XTENSA_SPILL_STACK_RESERVE macro that specifies the required
    stack frame size when register spilling is involved.
    Replace all uses of 'entry' and 'retw' with the above macros.
    This makes most of the xtensa assembly code ready for XEA3 and call0 ABI.

    Signed-off-by: Max Filippov

    Max Filippov
     

06 Jul, 2019

1 commit


19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

29 May, 2019

1 commit

  • As synchronous exceptions really only make sense against the current
    task (otherwise how are you synchronous), remove the task parameter
    from force_sig_fault to make it explicit that this is what is going
    on.

    The two known exceptions that deliver a synchronous exception to a
    stopped ptraced task have already been changed to
    force_sig_fault_to_task.

    The callers have been changed with the following emacs regular expression
    (with obvious variations on the architectures that take more arguments)
    to avoid typos:

    force_sig_fault[(]\([^,]+\)[,]\([^,]+\)[,]\([^,]+\)[,]\W+current[)]
    ->
    force_sig_fault(\1,\2,\3)

    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

21 May, 2019

1 commit


15 May, 2019

1 commit

  • Patch series "provide a generic free_initmem implementation", v2.

    Many architectures implement free_initmem() in exactly the same or very
    similar way: they wrap the call to free_initmem_default() with sometimes
    different 'poison' parameter.

    These patches switch those architectures to use a generic implementation
    that does free_initmem_default(POISON_FREE_INITMEM).

    This was inspired by Christoph's patches for free_initrd_mem [1] and I
    shamelessly copied changelog entries from his patches :)

    [1] https://lore.kernel.org/lkml/20190213174621.29297-1-hch@lst.de/

    This patch (of 2):

    For most architectures free_initmem() is just a wrapper for the same
    free_initmem_default(-1) call. Provide that as a generic implementation
    marked __weak.
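
    The generic implementation is as small as it sounds (sketch):

        void __weak free_initmem(void)
        {
                free_initmem_default(POISON_FREE_INITMEM);
        }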

    Link: http://lkml.kernel.org/r/1550515285-17446-2-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Palmer Dabbelt
    Cc: Richard Kuo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

05 Apr, 2019

1 commit


13 Mar, 2019

3 commits

  • Add a check for the return value of the memblock_alloc*() functions and
    call panic() in case of error. The panic message repeats the one used by
    the panicking memblock allocators, with the parameters adjusted to
    include only the relevant ones.

    The replacement was mostly automated with semantic patches like the one
    below with manual massaging of format strings.

    @@
    expression ptr, size, align;
    @@
    ptr = memblock_alloc(size, align);
    + if (!ptr)
    + panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);

    [anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
    Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
    [rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
    Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
    [rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
    Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
    [akpm@linux-foundation.org: fix xtensa printk warning]
    Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Anders Roxell
    Reviewed-by: Guo Ren [c-sky]
    Acked-by: Paul Burton [MIPS]
    Acked-by: Heiko Carstens [s390]
    Reviewed-by: Juergen Gross [Xen]
    Reviewed-by: Geert Uytterhoeven [m68k]
    Acked-by: Max Filippov [xtensa]
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • Make the memblock_phys_alloc() function an inline wrapper for
    memblock_phys_alloc_range() and update the memblock_phys_alloc() callers
    to check the returned value and panic in case of error.
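
    The wrapper is essentially the following (a sketch, assuming the
    'accessible memory' limit as the default upper bound):

        static inline phys_addr_t memblock_phys_alloc(phys_addr_t size,
                                                      phys_addr_t align)
        {
                return memblock_phys_alloc_range(size, align, 0,
                                                 MEMBLOCK_ALLOC_ACCESSIBLE);
        }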

    Link: http://lkml.kernel.org/r/1548057848-15136-8-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Guo Ren [c-sky]
    Cc: Heiko Carstens
    Cc: Juergen Gross [Xen]
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Paul Burton
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The calls to memblock_alloc_base(size, align, MEMBLOCK_ALLOC_ANYWHERE)
    and memblock_phys_alloc(size, align) are equivalent as both try to
    allocate 'size' bytes with 'align' alignment anywhere in the memory and
    panic if the allocation fails.

    The conversion is done using the following semantic patch:

    @@
    expression size, align;
    @@
    - memblock_alloc_base(size, align, MEMBLOCK_ALLOC_ANYWHERE)
    + memblock_phys_alloc(size, align)

    Link: http://lkml.kernel.org/r/1548057848-15136-4-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Guo Ren [c-sky]
    Cc: Heiko Carstens
    Cc: Juergen Gross [Xen]
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Paul Burton
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport