18 Nov, 2020

1 commit


21 Oct, 2020

1 commit

  • Pull ARC updates from Vineet Gupta:
    "The bulk of ARC pull request is removal of EZChip NPS platform which
    was suffering from constant bitrot. In recent years EZChip has gone
    through multiple successive acquisitions and I guess things and people
    move on. I would like to take this opportunity to recognize and thank
    all those good folks (Gilad, Noam, Ofer...) for contributing major
    bits to ARC port (SMP, Big Endian).

    Summary:

    - drop support for EZChip NPS platform

    - misc other fixes"

    * tag 'arc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    arc: include/asm: fix typos of "themselves"
    ARC: SMP: fix typo and use "come up" instead of "comeup"
    ARC: [dts] fix the errors detected by dtbs_check
    arc: plat-hsdk: fix kconfig dependency warning when !RESET_CONTROLLER
    ARC: [plat-eznps]: Drop support for EZChip NPS platform

    Linus Torvalds
     

06 Oct, 2020

3 commits


02 Sep, 2020

1 commit

  • Rework of memory map initialization broke initialization of ARC systems
    with two memory banks. Before these changes, memblock was not aware of
    nodes configuration and the memory map was always allocated from the
    "lowmem" bank. After the addition of node information to memblock, the core
    mm attempts to allocate the memory map for the "highmem" bank from its
    node. The access to this memory using __va() fails because it can be only
    accessed using kmap.

    Another problem that was uncovered is that {min,max}_high_pfn are calculated
    from u64 high_mem_start variable which prevents truncation to 32-bit
    physical address and the PFN values are above the node and zone boundaries.

    Use phys_addr_t type for high_mem_start and high_mem_size to ensure
    correspondence between PFNs and highmem zone boundaries and reserve the
    entire highmem bank until mem_init() to avoid accesses to it before highmem
    is enabled.

    To test this:
    1. Enable HIGHMEM in ARC config
    2. Enable 2 memory banks in haps_hs.dts (uncomment the 2nd bank)

    Fixes: 51930df5801e ("mm: free_area_init: allow defining max_zone_pfn in descending order")
    Cc: stable@vger.kernel.org [5.8]
    Signed-off-by: Mike Rapoport
    Signed-off-by: Vineet Gupta
    [vgupta: added instructions to test highmem]

    Mike Rapoport
     

13 Aug, 2020

2 commits

  • Use the general page fault accounting by passing regs into
    handle_mm_fault(). It naturally solves the issue of multiple page fault
    accounting when page fault retry happened.

    Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
    by moving it before taking mmap_sem.

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Cc: Vineet Gupta
    Link: http://lkml.kernel.org/r/20200707225021.200906-4-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Patch series "mm: Page fault accounting cleanups", v5.

    This is v5 of the pf accounting cleanup series. It originates from Gerald
    Schaefer's report a week ago of an issue with incorrect page fault
    accounting for retried page faults after commit 4064b9827063 ("mm: allow
    VM_FAULT_RETRY for multiple times"):

    https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

    What this series did:

    - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault. For example, page fault
    retries should not be counted in page fault counters. The same applies
    to the perf events.

    - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an ad-hoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g. erroneous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there are still quite a few archs that have not enabled
    this perf event.

    Since this series touches nearly all the archs, we unify this
    perf event to always follow case (1), which is the one that makes the
    most sense. And since we moved the accounting into handle_mm_fault(),
    the other two MAJ/MIN perf events are well taken care of naturally.

    - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR). More information in patch 1.

    - Always account the page fault onto the one that triggered the page
    fault. This does not matter much for #PF handlings, but mostly for
    gup. More information on this in patch 25.

    Patchset layout:

    Patch 1: Introduced the accounting in handle_mm_fault(), not enabled.
    Patch 2-23: Enable the new accounting for arch #PF handlers one by one.
    Patch 24: Enable the new accounting for the rest outliers (gup, iommu, etc.)
    Patch 25: Cleanup GUP task_struct pointer since it's not needed any more

    This patch (of 25):

    This is a preparation patch to move page fault accountings into the
    general code in handle_mm_fault(). This includes both the per task
    flt_maj/flt_min counters, and the major/minor page fault perf events. To
    do this, the pt_regs pointer is passed into handle_mm_fault().

    PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
    handlers.

    So far, every pt_regs pointer passed into handle_mm_fault() is
    NULL, which means this patch should have no intended functional change.

    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Cc: Albert Ou
    Cc: Alexander Gordeev
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Christian Borntraeger
    Cc: Chris Zankel
    Cc: Dave Hansen
    Cc: David S. Miller
    Cc: Geert Uytterhoeven
    Cc: Gerald Schaefer
    Cc: Greentime Hu
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: H. Peter Anvin
    Cc: Ingo Molnar
    Cc: Ivan Kokshaysky
    Cc: James E.J. Bottomley
    Cc: John Hubbard
    Cc: Jonas Bonn
    Cc: Ley Foon Tan
    Cc: "Luck, Tony"
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Palmer Dabbelt
    Cc: Paul Mackerras
    Cc: Paul Walmsley
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Cc: Richard Henderson
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Stefan Kristiansson
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Vasily Gorbik
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
    Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     

08 Aug, 2020

1 commit

  • Patch series "mm: cleanup usage of <asm/pgalloc.h>"

    Most architectures have very similar versions of pXd_alloc_one() and
    pXd_free_one() for intermediate levels of page table. These patches add
    generic versions of these functions in <asm-generic/pgalloc.h> and enable
    use of the generic functions where appropriate.

    In addition, functions declared and defined in <asm/pgalloc.h> headers are
    used mostly by core mm and early mm initialization in arch and there is no
    actual reason to have <asm/pgalloc.h> included all over the place.
    The first patch in this series removes unneeded includes of <asm/pgalloc.h>.

    In the end it didn't work out as neatly as I hoped and moving
    pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
    unnecessary changes to arches that have custom page table allocations, so
    I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
    to mm/.

    This patch (of 8):

    In most cases the <asm/pgalloc.h> header is required only for allocations
    of page table memory. Most of the .c files that include that header do not
    use symbols declared in <asm/pgalloc.h> and do not require that header.

    As for the other header files that used to include <asm/pgalloc.h>, it is
    possible to move that include into the .c file that actually uses symbols
    from <asm/pgalloc.h> and drop the include from the header file.

    The process was somewhat automated using

    sed -i -E '/[<"]asm\/pgalloc\.h/d'

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Reviewed-by: Pekka Enberg
    Acked-by: Geert Uytterhoeven [m68k]
    Cc: Abdul Haleem
    Cc: Andy Lutomirski
    Cc: Arnd Bergmann
    Cc: Christophe Leroy
    Cc: Joerg Roedel
    Cc: Max Filippov
    Cc: Peter Zijlstra
    Cc: Satheesh Rajendran
    Cc: Stafford Horne
    Cc: Stephen Rothwell
    Cc: Steven Rostedt
    Cc: Joerg Roedel
    Cc: Matthew Wilcox
    Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
    Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

10 Jun, 2020

5 commits

  • Convert comments that reference mmap_sem to reference mmap_lock instead.

    [akpm@linux-foundation.org: fix up linux-next leftovers]
    [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
    [akpm@linux-foundation.org: more linux-next fixups, per Michel]

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Vlastimil Babka
    Reviewed-by: Daniel Jordan
    Cc: Davidlohr Bueso
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Laurent Dufour
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • This change converts the existing mmap_sem rwsem calls to use the new mmap
    locking API instead.

    The change is generated using coccinelle with the following rule:

    // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

    @@
    expression mm;
    @@
    (
    -init_rwsem
    +mmap_init_lock
    |
    -down_write
    +mmap_write_lock
    |
    -down_write_killable
    +mmap_write_lock_killable
    |
    -down_write_trylock
    +mmap_write_trylock
    |
    -up_write
    +mmap_write_unlock
    |
    -downgrade_write
    +mmap_write_downgrade
    |
    -down_read
    +mmap_read_lock
    |
    -down_read_killable
    +mmap_read_lock_killable
    |
    -down_read_trylock
    +mmap_read_trylock
    |
    -up_read
    +mmap_read_unlock
    )
    -(&mm->mmap_sem)
    +(mm)

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Daniel Jordan
    Reviewed-by: Laurent Dufour
    Reviewed-by: Vlastimil Babka
    Cc: Davidlohr Bueso
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • The powerpc 32-bit implementation of pgtable has nice shortcuts for
    accessing kernel PMD and PTE for a given virtual address. Make these
    helpers available for all architectures.

    [rppt@linux.ibm.com: microblaze: fix page table traversal in setup_rt_frame()]
    Link: http://lkml.kernel.org/r/20200518191511.GD1118872@kernel.org
    [akpm@linux-foundation.org: s/pmd_ptr_k/pmd_off_k/ in various powerpc places]

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matthew Wilcox
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200514170327.31389-9-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The replacement of <asm/pgtable.h> with <linux/pgtable.h> left the include
    of the latter in the middle of asm includes. Fix this up with the aid of
    the below script and manual adjustments here and there.

    import sys
    import re

    if len(sys.argv) is not 3:
        print "USAGE: %s <file> <header>" % (sys.argv[0])
        sys.exit(1)

    hdr_to_move="#include <linux/%s>" % sys.argv[2]
    moved = False
    in_hdrs = False

    with open(sys.argv[1], "r") as f:
        lines = f.readlines()
        for _line in lines:
            line = _line.rstrip('\n')
            if line == hdr_to_move:
                continue
            if line.startswith("#include <linux/"):
                in_hdrs = True
            elif not moved and in_hdrs:
                moved = True
                print hdr_to_move
            print line

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matthew Wilcox
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The include/linux/pgtable.h is going to be the home of generic page table
    manipulation functions.

    Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
    make the latter include asm/pgtable.h.

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matthew Wilcox
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

05 Jun, 2020

5 commits

  • To support kmap_atomic_prot(), all architectures need to support
    protections passed to their kmap_atomic_high() function. Pass protections
    into kmap_atomic_high() and change the name to kmap_atomic_high_prot() to
    match.

    Then define kmap_atomic_prot() as a core function which calls
    kmap_atomic_high_prot() when needed.

    Finally, redefine kmap_atomic() as a wrapper of kmap_atomic_prot() with
    the default kmap_prot exported by the architectures.

    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20200507150004.1423069-11-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     
  • Every single architecture (including !CONFIG_HIGHMEM) calls...

    pagefault_enable();
    preempt_enable();

    ... before returning from __kunmap_atomic(). Lift this code into the
    kunmap_atomic() macro.

    While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to
    be consistent.

    [ira.weiny@intel.com: don't enable pagefault/preempt twice]
    Link: http://lkml.kernel.org/r/20200518184843.3029640-1-ira.weiny@intel.com
    [akpm@linux-foundation.org: coding style fixes]
    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Guenter Roeck
    Link: http://lkml.kernel.org/r/20200507150004.1423069-8-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     
  • Every arch has the same code to ensure atomic operations and a check for
    a !HIGHMEM page.

    Remove the duplicate code by defining a core kmap_atomic() which only
    calls the arch specific kmap_atomic_high() when the page is high memory.

    [akpm@linux-foundation.org: coding style fixes]
    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20200507150004.1423069-7-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     
  • The kmap code for all the architectures is almost 100% identical.

    Lift the common code to the core. Use ARCH_HAS_KMAP_FLUSH_TLB to indicate
    if an arch defines kmap_flush_tlb() and call it if needed.

    This also has the benefit of changing kmap() on a number of architectures
    to be an inline call rather than an actual function.

    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christian König
    Cc: Chris Zankel
    Cc: Daniel Vetter
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: "David S. Miller"
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20200507150004.1423069-4-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     
  • Patch series "Remove duplicated kmap code", v3.

    The kmap infrastructure has been copied almost verbatim to every
    architecture. This series consolidates obvious duplicated code by
    defining core functions which call into the architectures only when
    needed.

    Some of the k[un]map_atomic() implementations have some similarities but
    the similarities were not sufficient to warrant further changes.

    In addition we remove a duplicate implementation of kmap() in DRM.

    This patch (of 15):

    Replace the use of BUG_ON(in_interrupt()) in the kmap() and kunmap() in
    favor of might_sleep().

    Besides the benefits of might_sleep(), this normalizes the implementations
    such that they can be made generic in subsequent patches.

    Signed-off-by: Ira Weiny
    Signed-off-by: Andrew Morton
    Reviewed-by: Dan Williams
    Reviewed-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Christian König
    Cc: Daniel Vetter
    Cc: Thomas Bogendoerfer
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "David S. Miller"
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Borislav Petkov
    Cc: "H. Peter Anvin"
    Cc: Dave Hansen
    Cc: Andy Lutomirski
    Cc: Peter Zijlstra
    Cc: Chris Zankel
    Cc: Max Filippov
    Link: http://lkml.kernel.org/r/20200507150004.1423069-1-ira.weiny@intel.com
    Link: http://lkml.kernel.org/r/20200507150004.1423069-2-ira.weiny@intel.com
    Signed-off-by: Linus Torvalds

    Ira Weiny
     

04 Jun, 2020

1 commit

  • Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
    ZONE_NORMAL. Allowing free_area_init() to parse the max_zone_pfn array
    even when it is sorted in descending order makes it possible to use
    free_area_init() on such architectures.

    Add top -> down traversal of max_zone_pfn array in free_area_init() and
    use the latter in ARC node/zone initialization.

    [rppt@kernel.org: ARC fix]
    Link: http://lkml.kernel.org/r/20200504153901.GM14260@kernel.org
    [rppt@linux.ibm.com: arc: free_area_init(): take into account PAE40 mode]
    Link: http://lkml.kernel.org/r/20200507205900.GH683243@linux.ibm.com
    [akpm@linux-foundation.org: declare arch_has_descending_max_zone_pfns()]
    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Tested-by: Hoan Tran [arm64]
    Reviewed-by: Baoquan He
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: "James E.J. Bottomley"
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Hocko
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Cc: Guenter Roeck
    Link: http://lkml.kernel.org/r/20200412194859.12663-18-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

03 Apr, 2020

3 commits

  • The idea comes from a discussion between Linus and Andrea [1].

    Before this patch we only allow a page fault to retry once. We achieved
    this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
    handle_mm_fault() the second time. This was majorly used to avoid
    unexpected starvation of the system by looping over forever to handle the
    page fault on a single page. However that should hardly happen, and after
    all for each code path to return a VM_FAULT_RETRY we'll first wait for a
    condition (during which time we should possibly yield the cpu) to happen
    before VM_FAULT_RETRY is really returned.

    This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
    flag when we receive VM_FAULT_RETRY. It means that the page fault handler
    now can retry the page fault for multiple times if necessary without the
    need to generate another page fault event. Meanwhile we still keep the
    FAULT_FLAG_TRIED flag so page fault handler can still identify whether a
    page fault is the first attempt or not.

    Then we'll have these combinations of fault flags (only considering
    ALLOW_RETRY flag and TRIED flag):

    - ALLOW_RETRY and !TRIED: this means the page fault is allowed to
    retry, and this is the first try

    - ALLOW_RETRY and TRIED: this means the page fault is allowed to
    retry, and this is not the first try

    - !ALLOW_RETRY and !TRIED: this means the page fault is not allowed
    to retry at all

    - !ALLOW_RETRY and TRIED: this is forbidden and should never be used

    In existing code we have multiple places that have taken special care of
    the first condition above by checking against (fault_flags &
    FAULT_FLAG_ALLOW_RETRY). This patch introduces a simple helper to detect
    the first retry of a page fault by checking against both (fault_flags &
    FAULT_FLAG_ALLOW_RETRY) and !(fault_flags & FAULT_FLAG_TRIED), because now
    even the 2nd try will have ALLOW_RETRY set, and then uses that helper in
    all existing special paths. One example is in __lock_page_or_retry(), now
    we'll drop the mmap_sem only in the first attempt of page fault and we'll
    keep it in follow up retries, so old locking behavior will be retained.

    This will be a nice enhancement for current code [2] at the same time a
    supporting material for the future userfaultfd-writeprotect work, since in
    that work there will always be an explicit userfault writeprotect retry
    for protected pages, and if that cannot resolve the page fault (e.g., when
    userfaultfd-writeprotect is used in conjunction with swapped pages) then
    we'll possibly need a 3rd retry of the page fault. It might also benefit
    other potential users who will have similar requirement like userfault
    write-protection.

    GUP code is not touched yet and will be covered in follow up patch.

    Please read the thread below for more information.

    [1] https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/
    [2] https://lore.kernel.org/lkml/20181230154648.GB9832@redhat.com/

    Suggested-by: Linus Torvalds
    Suggested-by: Andrea Arcangeli
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160246.9790-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Although there are tons of arch-specific page fault handlers, most of them
    still share the same initial value of the page fault flags. Say,
    nearly all of the page fault handlers would allow the fault to be retried,
    and they also allow the fault to respond to SIGKILL.

    Let's define a default value for the fault flags to replace those initial
    page fault flags that were copied over. With this, it'll be far easier to
    introduce a new fault flag that can be used by all the architectures instead
    of touching all the archs.

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Reviewed-by: David Hildenbrand
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160238.9694-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Let ARC use the new helper fault_signal_pending() by moving the signal
    check out of the retry logic so it stands alone. This should also help
    simplify the code a bit.

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220155843.9172-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     

05 Dec, 2019

1 commit

  • Pull ARC updates from Vineet Gupta

    - Jump Label support for ARC

    - kmemleak enabled

    - arc mm backend TLB Miss / flush optimizations

    - nSIM platform switching to dwuart (vs. arcuart) and ensuing defconfig
    updates and cleanups

    - axs platform pll / video-mode updates

    * tag 'arc-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    ARC: add kmemleak support
    ARC: [plat-axs10x]: remove hardcoded video mode from bootargs
    ARC: [plat-axs10x]: use pgu pll instead of fixed clock
    ARC: ARCv2: jump label: implement jump label patching
    ARC: mm: tlb flush optim: elide redundant uTLB invalidates for MMUv3
    ARC: mm: tlb flush optim: elide repeated uTLB invalidate in loop
    ARC: mm: tlb flush optim: Make TLBWriteNI fallback to TLBWrite if not available
    ARC: mm: TLB Miss optim: avoid re-reading ECR
    ARCv2: mm: TLB Miss optim: Use double world load/stores LDD/STD
    ARCv2: mm: TLB Miss optim: SMP builds can cache pgd pointer in mmu scratch reg
    ARC: nSIM_700: remove unused network options
    ARC: nSIM_700: switch to DW UART usage
    ARC: merge HAPS-HS with nSIM-HS configs
    ARC: HAPS: cleanup defconfigs from unused ETH drivers
    ARC: HAPS: add HIGHMEM memory zone to DTS
    ARC: HAPS: use same UART configuration everywhere
    ARC: HAPS: cleanup defconfigs from unused IO-related options
    ARC: regenerate nSIM and HAPS defconfigs

    Linus Torvalds
     

01 Dec, 2019

1 commit

  • Patch series "elide extraneous generated code for folded p4d/pud/pmd", v3.

    This series came out of a seemingly benign excursion into
    understanding/removing __ARCH_USE_5LEVEL_HACK from the ARC port, which
    showed some extraneous code being generated despite folded p4d/pud/pmd:

    | bloat-o-meter2 vmlinux-[AB]*
    | add/remove: 0/0 grow/shrink: 3/0 up/down: 130/0 (130)
    | function old new delta
    | free_pgd_range 548 660 +112
    | p4d_clear_bad 2 20 +18

    The patches here address that

    | bloat-o-meter2 vmlinux-[BF]*
    | add/remove: 0/2 grow/shrink: 0/1 up/down: 0/-386 (-386)
    | function old new delta
    | pud_clear_bad 20 - -20
    | p4d_clear_bad 20 - -20
    | free_pgd_range 660 314 -346

    The code savings are not a whole lot, but still worthwhile IMHO.

    This patch (of 5):

    With paging code made 5-level compliant, this is no longer needed. ARC
    has a software page walker with 2 lookup levels (pgd -> pte).

    This was expected to be a non-functional change but ended up with slight
    code bloat due to needless inclusions of p*d_free_tlb() macros, which
    will be addressed in further patches.

    | bloat-o-meter2 vmlinux-[AB]*
    | add/remove: 0/0 grow/shrink: 2/0 up/down: 128/0 (128)
    | function old new delta
    | free_pgd_range 546 656 +110
    | p4d_clear_bad 2 20 +18
    | Total: Before=4137148, After=4137276, chg 0.000000%

    Link: http://lkml.kernel.org/r/20191016162400.14796-2-vgupta@synopsys.com
    Signed-off-by: Vineet Gupta
    Acked-by: Kirill A. Shutemov
    Cc: "Aneesh Kumar K . V"
    Cc: Arnd Bergmann
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vineet Gupta
     

21 Nov, 2019

1 commit


29 Oct, 2019

6 commits


20 Sep, 2019

1 commit

  • Pull dma-mapping updates from Christoph Hellwig:

    - add dma-mapping and block layer helpers to take care of IOMMU merging
    for mmc plus subsequent fixups (Yoshihiro Shimoda)

    - rework handling of the pgprot bits for remapping (me)

    - take care of the dma direct infrastructure for swiotlb-xen (me)

    - improve the dma noncoherent remapping infrastructure (me)

    - better defaults for ->mmap, ->get_sgtable and ->get_required_mask
    (me)

    - cleanup mmaping of coherent DMA allocations (me)

    - various misc cleanups (Andy Shevchenko, me)

    * tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits)
    mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE
    mmc: queue: Fix bigger segments usage
    arm64: use asm-generic/dma-mapping.h
    swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page
    swiotlb-xen: simplify cache maintainance
    swiotlb-xen: use the same foreign page check everywhere
    swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable
    xen: remove the exports for xen_{create,destroy}_contiguous_region
    xen/arm: remove xen_dma_ops
    xen/arm: simplify dma_cache_maint
    xen/arm: use dev_is_dma_coherent
    xen/arm: consolidate page-coherent.h
    xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance
    arm: remove wrappers for the generic dma remap helpers
    dma-mapping: introduce a dma_common_find_pages helper
    dma-mapping: always use VM_DMA_COHERENT for generic DMA remap
    vmalloc: lift the arm flag for coherent mappings to common code
    dma-mapping: provide a better default ->get_required_mask
    dma-mapping: remove the dma_declare_coherent_memory export
    remoteproc: don't allow modular build
    ...

    Linus Torvalds
     

29 Aug, 2019

1 commit


05 Aug, 2019

1 commit


17 Jul, 2019

1 commit

  • Pull ARC updates from Vineet Gupta:

    - long due rewrite of do_page_fault

    - refactoring of entry/exit code to utilize the double load/store
    instructions

    - hsdk platform updates

    * tag 'arc-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    ARC: [plat-hsdk]: Enable AXI DW DMAC in defconfig
    ARC: [plat-hsdk]: enable DW SPI controller
    ARC: hide unused function unw_hdr_alloc
    ARC: [haps] Add Virtio support
    ARCv2: entry: simplify return to Delay Slot via interrupt
    ARC: entry: EV_Trap expects r10 (vs. r9) to have exception cause
    ARCv2: entry: rewrite to enable use of double load/stores LDD/STD
    ARCv2: entry: avoid a branch
    ARCv2: entry: push out the Z flag unclobber from common EXCEPTION_PROLOGUE
    ARCv2: entry: comments about hardware auto-save on taken interrupts
    ARC: mm: do_page_fault refactor #8: release mmap_sem sooner
    ARC: mm: do_page_fault refactor #7: fold the various error handling
    ARC: mm: do_page_fault refactor #6: error handlers to use same pattern
    ARC: mm: do_page_fault refactor #5: scoot no_context to end
    ARC: mm: do_page_fault refactor #4: consolidate retry related logic
    ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
    ARC: mm: do_page_fault refactor #2: remove short lived variable
    ARC: mm: do_page_fault refactor #1: remove label @good_area

    Linus Torvalds
     

13 Jul, 2019

1 commit

  • Pull dma-mapping updates from Christoph Hellwig:

    - move the USB special case that bounced DMA through a device bar into
    the USB code instead of handling it in the common DMA code (Laurentiu
    Tudor and Fredrik Noring)

    - don't dip into the global CMA pool for single page allocations
    (Nicolin Chen)

    - fix a crash when allocating memory for the atomic pool failed during
    boot (Florian Fainelli)

    - move support for MIPS-style uncached segments to the common code and
    use that for MIPS and nios2 (me)

    - make support for DMA_ATTR_NON_CONSISTENT and
    DMA_ATTR_NO_KERNEL_MAPPING generic (me)

    - convert nds32 to the generic remapping allocator (me)

    * tag 'dma-mapping-5.3' of git://git.infradead.org/users/hch/dma-mapping: (29 commits)
    dma-mapping: mark dma_alloc_need_uncached as __always_inline
    MIPS: only select ARCH_HAS_UNCACHED_SEGMENT for non-coherent platforms
    usb: host: Fix excessive alignment restriction for local memory allocations
    lib/genalloc.c: Add algorithm, align and zeroed family of DMA allocators
    nios2: use the generic uncached segment support in dma-direct
    nds32: use the generic remapping allocator for coherent DMA allocations
    arc: use the generic remapping allocator for coherent DMA allocations
    dma-direct: handle DMA_ATTR_NO_KERNEL_MAPPING in common code
    dma-direct: handle DMA_ATTR_NON_CONSISTENT in common code
    dma-mapping: add a dma_alloc_need_uncached helper
    openrisc: remove the partial DMA_ATTR_NON_CONSISTENT support
    arc: remove the partial DMA_ATTR_NON_CONSISTENT support
    arm-nommu: remove the partial DMA_ATTR_NON_CONSISTENT support
    ARM: dma-mapping: allow larger DMA mask than supported
    dma-mapping: truncate dma masks to what dma_addr_t can hold
    iommu/dma: Apply dma_{alloc,free}_contiguous functions
    dma-remap: Avoid de-referencing NULL atomic_pool
    MIPS: use the generic uncached segment support in dma-direct
    dma-direct: provide generic support for uncached kernel segments
    au1100fb: fix DMA API abuse
    ...

    Linus Torvalds
     

09 Jul, 2019

1 commit

  • …iederm/user-namespace

    Pull force_sig() argument change from Eric Biederman:
    "A source of error over the years has been that force_sig has taken a
    task parameter when it is only safe to use force_sig with the current
    task.

    The force_sig function is built for delivering synchronous signals
    such as SIGSEGV where the userspace application caused a synchronous
    fault (such as a page fault) and the kernel responded with a signal.

    Because the name force_sig does not make this clear, and because
    force_sig takes a task parameter, the function has been abused for
    sending other kinds of signals over the years. Slowly those
    have been fixed when the oopses have been tracked down.

    This set of changes fixes the remaining abusers of force_sig and
    carefully rips out the task parameter from force_sig and friends
    making this kind of error almost impossible in the future"

    * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
    signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
    signal: Remove the signal number and task parameters from force_sig_info
    signal: Factor force_sig_info_to_task out of force_sig_info
    signal: Generate the siginfo in force_sig
    signal: Move the computation of force into send_signal and correct it.
    signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
    signal: Remove the task parameter from force_sig_fault
    signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
    signal: Explicitly call force_sig_fault on current
    signal/unicore32: Remove tsk parameter from __do_user_fault
    signal/arm: Remove tsk parameter from __do_user_fault
    signal/arm: Remove tsk parameter from ptrace_break
    signal/nds32: Remove tsk parameter from send_sigtrap
    signal/riscv: Remove tsk parameter from do_trap
    signal/sh: Remove tsk parameter from force_sig_info_fault
    signal/um: Remove task parameter from send_sigtrap
    signal/x86: Remove task parameter from send_sigtrap
    signal: Remove task parameter from force_sig_mceerr
    signal: Remove task parameter from force_sig
    signal: Remove task parameter from force_sigsegv
    ...

    Linus Torvalds
     

02 Jul, 2019

2 commits

    Upon a taken interrupt/exception from User mode, HS hardware auto-sets the
    Z flag. This helps shave a few instructions from EXCEPTION_PROLOGUE by
    eliding a re-read of ERSTATUS and some bit fiddling.

    However, the TLB Miss Exception handler can clobber the CPU flags and
    still end up in EXCEPTION_PROLOGUE in the slow-path TLB handling case:

    EV_TLBMissD
    do_slow_path_pf
    EV_TLBProtV (aliased to call_do_page_fault)
    EXCEPTION_PROLOGUE

    As a result, EXCEPTION_PROLOGUE needs to "unclobber" the Z flag, which this
    patch changes: the unclobber is now pushed out to the TLB Miss Exception
    handler. The reasons being:

    - The flag restoration is only needed for slow-path TLB Miss Exception
    handling, but being in EXCEPTION_PROLOGUE it currently penalizes all
    exceptions, such as ProtV and the syscall Trap, where the Z flag is
    already as expected.

    - Pushing unclobber out to where it was clobbered is much cleaner and
    also serves to document the fact.

    - Makes EXCEPTION_PROLOGUE similar to INTERRUPT_PROLOGUE, so it is easier
    to refactor the common parts, which is what this series aims to do

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
    In the case of successful page fault handling, this patch releases
    mmap_sem before updating the perf stat events for major/minor faults.
    So even though the contention reduction is NOT super high, it is still
    an improvement.

    There's an additional code size improvement as we only have 2 up_read()
    calls now.

    Note to myself:
    --------------

    1. Given the way it is done, we are forced to move the @bad_area label
    earlier, causing the various "goto bad_area" cases to hit the perf stat code.

    - PERF_COUNT_SW_PAGE_FAULTS is NOW updated for access errors too, which
    is what arm/arm64 seem to be doing as well (with slightly different code)
    - PERF_COUNT_SW_PAGE_FAULTS_{MAJ,MIN} must NOT be updated for the error
    case; this is guarded by setting the initial value of @fault to
    VM_FAULT_ERROR, which covers both the case where handle_mm_fault()
    returns an error and the case where it is not called at all.

    2. arm/arm64 use two homebrew fault flags, VM_FAULT_BAD{MAP,MAPACCESS},
    which I was inclined to add too, but they seem not needed for ARC:

    - given that everything is in one function, we can still use goto
    - we set up si_code at the right place (arm* do that at the end)
    - we already init @fault to the error value, which guards entry into the
    perf stats event update

    Cc: Peter Zijlstra
    Signed-off-by: Vineet Gupta

    Vineet Gupta