26 Oct, 2020

1 commit

  • Use a more generic form for __section that requires quotes to avoid
    complications with clang and gcc differences.

    Remove the quote operator # from compiler_attributes.h __section macro.

    Convert all unquoted __section(foo) uses to quoted __section("foo").
    Also convert __attribute__((section("foo"))) uses to __section("foo")
    even if the __attribute__ has multiple list entry forms.

    Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

    Signed-off-by: Joe Perches
    Reviewed-by: Nick Desaulniers
    Reviewed-by: Miguel Ojeda
    Signed-off-by: Linus Torvalds

    Joe Perches
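
As a rough illustration of the quoted form described in this commit, the sketch below defines a __section macro that hands its argument straight to the compiler attribute, so the caller has to supply the quotes. It assumes GCC or clang on an ELF target; the section name ".data.demo", the variable and the accessor function are made up for this example and are not taken from the kernel tree.

```c
/* Quoted form: the argument is already a string literal, so the macro
 * needs no '#' stringification operator. */
#undef __section
#define __section(section) __attribute__((__section__(section)))

/* Hypothetical variable placed in a hypothetical section. */
static int demo_counter __section(".data.demo") = 42;

/* Returns the value stored in the custom section. */
int read_demo_counter(void)
{
    return demo_counter;
}
```

With the unquoted form, __section(.data.demo) relied on the macro stringifying its argument, which is where the compilers could differ.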
     

24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
    fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
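
A minimal sketch of how such a pseudo-keyword can be defined and used outside the kernel. It assumes a compiler that either supports the fallthrough statement attribute or silently accepts an empty statement in its place; the function steps_for_level() is hypothetical and only exists to show the macro in a switch.

```c
/* Use the statement attribute when the compiler offers it, otherwise
 * fall back to an empty statement. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Counts how many steps run for a level: level 2 runs the steps for
 * 2, 1 and 0; level 1 runs the steps for 1 and 0; level 0 runs one. */
int steps_for_level(int level)
{
    int steps = 0;

    switch (level) {
    case 2:
        steps++;
        fallthrough;
    case 1:
        steps++;
        fallthrough;
    case 0:
        steps++;
        break;
    default:
        break;
    }
    return steps;
}
```

Unlike a comment, the macro is visible to the compiler, so an accidental missing break can still be diagnosed.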
     

13 Aug, 2020

2 commits

  • Use the general page fault accounting by passing regs into
    handle_mm_fault(). This naturally solves the problem of a page fault
    being accounted multiple times when it is retried.

    Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
    other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are handled
    in handle_mm_fault().

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Cc: James E.J. Bottomley
    Cc: Helge Deller
    Link: http://lkml.kernel.org/r/20200707225021.200906-16-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Patch series "mm: Page fault accounting cleanups", v5.

    This is v5 of the pf accounting cleanup series. It originates from Gerald
    Schaefer's report a week ago about incorrect page fault accounting for
    retried page faults after commit 4064b9827063 ("mm: allow
    VM_FAULT_RETRY for multiple times"):

    https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

    What this series did:

    - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault. For example, page fault
    retries should not be counted in page fault counters. The same applies
    to the perf events.

    - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an ad hoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g. erroneous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): quite a few archs still have not enabled this perf event.

    Since this series will touch nearly all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense. And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

    - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR). More information in patch 1.

    - Always account the page fault to the task that triggered the page
    fault. This does not matter much for #PF handling, but mostly for
    gup. More information on this in patch 25.

    Patchset layout:

    Patch 1: Introduced the accounting in handle_mm_fault(), not enabled.
    Patch 2-23: Enable the new accounting for arch #PF handlers one by one.
    Patch 24: Enable the new accounting for the remaining outliers (gup, iommu, etc.)
    Patch 25: Clean up GUP task_struct pointer since it's not needed any more

    This patch (of 25):

    This is a preparation patch to move page fault accounting into the
    general code in handle_mm_fault(). This includes both the per task
    flt_maj/flt_min counters, and the major/minor page fault perf events. To
    do this, the pt_regs pointer is passed into handle_mm_fault().

    PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
    handlers.

    So far, every pt_regs pointer passed into handle_mm_fault() is NULL,
    which means this patch should have no intended functional change.

    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Cc: Albert Ou
    Cc: Alexander Gordeev
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Christian Borntraeger
    Cc: Chris Zankel
    Cc: Dave Hansen
    Cc: David S. Miller
    Cc: Geert Uytterhoeven
    Cc: Gerald Schaefer
    Cc: Greentime Hu
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: H. Peter Anvin
    Cc: Ingo Molnar
    Cc: Ivan Kokshaysky
    Cc: James E.J. Bottomley
    Cc: John Hubbard
    Cc: Jonas Bonn
    Cc: Ley Foon Tan
    Cc: "Luck, Tony"
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Palmer Dabbelt
    Cc: Paul Mackerras
    Cc: Paul Walmsley
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Cc: Richard Henderson
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Stefan Kristiansson
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Vasily Gorbik
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
    Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
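
The accounting rule described in this series can be modelled in a few lines of userspace C. This is only a toy model under stated assumptions, not the kernel's implementation: the demo_* names and bit values are invented, "major" is reduced to a single result bit (the series adjusts that definition slightly, see patch 1), and a NULL regs pointer is treated as "caller has not opted in", matching the note that the preparation patch changes nothing while all callers pass NULL.

```c
#include <stddef.h>

/* Illustrative result bits; the values are made up for this sketch. */
#define DEMO_FAULT_ERROR 0x1u
#define DEMO_FAULT_RETRY 0x2u
#define DEMO_FAULT_MAJOR 0x4u

/* Stand-in for struct pt_regs. */
struct demo_regs {
    unsigned long ip;
};

/* Stand-in for the per-task fault counters. */
struct demo_task {
    unsigned long maj_flt; /* completed major faults */
    unsigned long min_flt; /* completed minor faults */
};

/*
 * Account one fault attempt. Attempts that failed or must be retried are
 * ignored, so a fault that needs several attempts is counted exactly once,
 * by the attempt that completes it.
 */
void demo_account_fault(struct demo_task *task, const struct demo_regs *regs,
                        unsigned int result)
{
    if (regs == NULL)
        return; /* accounting not enabled for this caller */
    if (result & (DEMO_FAULT_ERROR | DEMO_FAULT_RETRY))
        return; /* not a completed fault */
    if (result & DEMO_FAULT_MAJOR)
        task->maj_flt++;
    else
        task->min_flt++;
}
```

A fault that returns "retry" once and then completes as a major fault bumps maj_flt by one and leaves min_flt untouched.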
     

08 Aug, 2020

2 commits

  • After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP we have two equivalent
    functions that call memory_present() for each region in memblock.memory:
    sparse_memory_present_with_active_regions() and memblocks_present().

    Moreover, all architectures have a call to either of these functions
    preceding the call to sparse_init() and in most cases they are called
    one after the other.

    Mark the regions from memblock.memory as present during sparse_init() by
    making sparse_init() call memblocks_present(), make memblocks_present()
    and memory_present() functions static and remove redundant
    sparse_memory_present_with_active_regions() function.

    Also remove no longer required HAVE_MEMORY_PRESENT configuration option.

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200712083130.22919-1-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • Patch series "mm: cleanup usage of <asm/pgalloc.h>"

    Most architectures have very similar versions of pXd_alloc_one() and
    pXd_free_one() for intermediate levels of page table. These patches add
    generic versions of these functions in <asm-generic/pgalloc.h> and enable
    use of the generic functions where appropriate.

    In addition, functions declared and defined in <asm/pgalloc.h> headers are
    used mostly by core mm and early mm initialization in arch and there is no
    actual reason to have the <asm/pgalloc.h> included all over the place.
    The first patch in this series removes unneeded includes of
    <asm/pgalloc.h>.

    In the end it didn't work out as neatly as I hoped and moving
    pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
    unnecessary changes to arches that have custom page table allocations, so
    I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
    to mm/.

    This patch (of 8):

    In most cases <asm/pgalloc.h> header is required only for allocations of
    page table memory. Most of the .c files that include that header do not
    use symbols declared in <asm/pgalloc.h> and do not require that header.

    As for the other header files that used to include <asm/pgalloc.h>, it is
    possible to move that include into the .c file that actually uses symbols
    from <asm/pgalloc.h> and drop the include from the header file.

    The process was somewhat automated using

    sed -i -E '/[
    Signed-off-by: Andrew Morton
    Reviewed-by: Pekka Enberg
    Acked-by: Geert Uytterhoeven [m68k]
    Cc: Abdul Haleem
    Cc: Andy Lutomirski
    Cc: Arnd Bergmann
    Cc: Christophe Leroy
    Cc: Joerg Roedel
    Cc: Max Filippov
    Cc: Peter Zijlstra
    Cc: Satheesh Rajendran
    Cc: Stafford Horne
    Cc: Stephen Rothwell
    Cc: Steven Rostedt
    Cc: Joerg Roedel
    Cc: Matthew Wilcox
    Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
    Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

21 Jul, 2020

1 commit


10 Jun, 2020

4 commits

  • Convert comments that reference old mmap_sem APIs to reference
    corresponding new mmap locking APIs instead.

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Vlastimil Babka
    Reviewed-by: Davidlohr Bueso
    Reviewed-by: Daniel Jordan
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Laurent Dufour
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • This change converts the existing mmap_sem rwsem calls to use the new mmap
    locking API instead.

    The change is generated using coccinelle with the following rule:

    // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

    @@
    expression mm;
    @@
    (
    -init_rwsem
    +mmap_init_lock
    |
    -down_write
    +mmap_write_lock
    |
    -down_write_killable
    +mmap_write_lock_killable
    |
    -down_write_trylock
    +mmap_write_trylock
    |
    -up_write
    +mmap_write_unlock
    |
    -downgrade_write
    +mmap_write_downgrade
    |
    -down_read
    +mmap_read_lock
    |
    -down_read_killable
    +mmap_read_lock_killable
    |
    -down_read_trylock
    +mmap_read_trylock
    |
    -up_read
    +mmap_read_unlock
    )
    -(&mm->mmap_sem)
    +(mm)

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Daniel Jordan
    Reviewed-by: Laurent Dufour
    Reviewed-by: Vlastimil Babka
    Cc: Davidlohr Bueso
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
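
The point of the conversion is that callers pass the mm itself and no longer name the lock field. A toy userspace model of that wrapper pattern follows. Everything here is a stand-in: toy_rwsem is a single-threaded counter, not a real lock, toy_mm is not struct mm_struct, and only the trylock and unlock wrappers are modelled since the blocking variants need a real semaphore.

```c
#include <stdbool.h>

/* Toy, single-threaded stand-in for an rwsem. It only tracks state. */
struct toy_rwsem {
    int readers; /* number of read holders */
    bool writer; /* true while write-held */
};

/* Toy stand-in for the mm; callers should not touch mmap_sem directly. */
struct toy_mm {
    struct toy_rwsem mmap_sem;
};

static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
    if (sem->writer)
        return false;
    sem->readers++;
    return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
    sem->readers--;
}

static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
    if (sem->writer || sem->readers > 0)
        return false;
    sem->writer = true;
    return true;
}

static void toy_up_write(struct toy_rwsem *sem)
{
    sem->writer = false;
}

/* Wrappers in the style of the new API: they take the mm, not the lock. */
void mmap_init_lock(struct toy_mm *mm)
{
    mm->mmap_sem.readers = 0;
    mm->mmap_sem.writer = false;
}

bool mmap_read_trylock(struct toy_mm *mm)
{
    return toy_down_read_trylock(&mm->mmap_sem);
}

void mmap_read_unlock(struct toy_mm *mm)
{
    toy_up_read(&mm->mmap_sem);
}

bool mmap_write_trylock(struct toy_mm *mm)
{
    return toy_down_write_trylock(&mm->mmap_sem);
}

void mmap_write_unlock(struct toy_mm *mm)
{
    toy_up_write(&mm->mmap_sem);
}
```

Because only the wrappers reference the mmap_sem field, the lock's name or type can later change without touching the callers again.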
     
  • The powerpc 32-bit implementation of pgtable has nice shortcuts for
    accessing kernel PMD and PTE for a given virtual address. Make these
    helpers available for all architectures.

    [rppt@linux.ibm.com: microblaze: fix page table traversal in setup_rt_frame()]
    Link: http://lkml.kernel.org/r/20200518191511.GD1118872@kernel.org
    [akpm@linux-foundation.org: s/pmd_ptr_k/pmd_off_k/ in various powerpc places]

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matthew Wilcox
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200514170327.31389-9-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • Patch series "mm: consolidate definitions of page table accessors", v2.

    The low level page table accessors (pXY_index(), pXY_offset()) are
    duplicated across all architectures and sometimes more than once. For
    instance, we have 31 definitions of pgd_offset() for 25 supported
    architectures.

    Most of these definitions are actually identical and typically it boils
    down to, e.g.

    static inline unsigned long pmd_index(unsigned long address)
    {
    return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
    }

    static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
    {
    return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
    }

    These definitions can be shared among 90% of the arches provided
    XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.

    For architectures that really need a custom version there is always the
    possibility to override the generic version with the usual ifdef magic.

    These patches introduce include/linux/pgtable.h that replaces
    include/asm-generic/pgtable.h and add the definitions of the page table
    accessors to the new header.

    This patch (of 12):

    The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
    functions involving page table manipulations, e.g. pte_alloc() and
    pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h>
    in the files that include <linux/mm.h>.

    The include statements in such cases are removed with a simple loop:

    for f in $(git grep -l "include <linux/mm.h>") ; do
    sed -i -e '/include <asm\/pgtable.h>/ d' $f
    done

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matthew Wilcox
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Mike Rapoport
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
    Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
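
To make the index arithmetic quoted in this series concrete, here is a small standalone sketch. The shift and table-size constants are assumptions chosen for illustration (4 KiB pages, 512 entries per table); real architectures define their own values, and pte_index() is added here by analogy with the pmd_index() shown in the message.

```c
/* Illustrative layout only: 4 KiB pages, 512-entry tables (9 bits per level). */
#define PAGE_SHIFT   12
#define PMD_SHIFT    21
#define PTRS_PER_PTE 512
#define PTRS_PER_PMD 512

/* Index of the PTE slot that maps this address within its PTE table. */
static inline unsigned long pte_index(unsigned long address)
{
    return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Index of the PMD slot that covers this address within its PMD table. */
static inline unsigned long pmd_index(unsigned long address)
{
    return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}
```

For an address built as (3 << PMD_SHIFT) | (5 << PAGE_SHIFT) | 0x123, pmd_index() yields 3 and pte_index() yields 5; the low 12 bits are the offset within the page and do not affect either index.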
     

04 Jun, 2020

1 commit

  • free_area_init() only requires the definition of maximal PFN for each of
    the supported zones rather than calculation of actual zone sizes and the
    sizes of the holes between the zones.

    After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP the free_area_init() is
    available to all architectures.

    Using this function instead of free_area_init_node() simplifies the zone
    detection.

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Tested-by: Hoan Tran [arm64]
    Cc: Baoquan He
    Cc: Brian Cain
    Cc: Catalin Marinas
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: "James E.J. Bottomley"
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Hocko
    Cc: Michal Simek
    Cc: Nick Hu
    Cc: Paul Walmsley
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/20200412194859.12663-12-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
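
A sketch of why per-zone maximum PFNs are enough to lay out the zones. It is a simplified userspace model, not the kernel function: the demo_* names, the three-zone enum and the min_pfn parameter are assumptions for illustration, and holes inside a zone are ignored entirely.

```c
#include <stddef.h>

enum demo_zone {
    DEMO_ZONE_DMA,
    DEMO_ZONE_NORMAL,
    DEMO_ZONE_HIGHMEM,
    DEMO_NR_ZONES
};

struct demo_zone_range {
    unsigned long start_pfn; /* first PFN in the zone */
    unsigned long end_pfn;   /* one past the last PFN in the zone */
};

/*
 * Derive each zone's PFN range from the per-zone maximum PFNs alone.
 * Each zone starts where the previous one ended; a zone whose maximum
 * is 0 (unused) or not above the previous boundary ends up empty.
 */
void demo_zone_ranges(const unsigned long max_zone_pfn[DEMO_NR_ZONES],
                      unsigned long min_pfn,
                      struct demo_zone_range out[DEMO_NR_ZONES])
{
    unsigned long start = min_pfn;

    for (size_t i = 0; i < DEMO_NR_ZONES; i++) {
        unsigned long end = max_zone_pfn[i] > start ? max_zone_pfn[i] : start;

        out[i].start_pfn = start;
        out[i].end_pfn = end;
        start = end;
    }
}
```

An architecture with a single populated zone then only has to fill in one array slot, which is the simplification the commit message refers to.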
     

29 May, 2020

1 commit

  • The Debian kernel v5.6 triggers this kernel panic:

    Kernel panic - not syncing: Bad Address (null pointer deref?)
    Bad Address (null pointer deref?): Code=26 (Data memory access rights trap) at addr 0000000000000000
    CPU: 0 PID: 0 Comm: swapper Not tainted 5.6.0-2-parisc64 #1 Debian 5.6.14-1
    IAOQ[0]: mem_init+0xb0/0x150
    IAOQ[1]: mem_init+0xb4/0x150
    RP(r2): start_kernel+0x6c8/0x1190
    Backtrace:
    [] start_kernel+0x6c8/0x1190
    [] start_parisc+0x158/0x1b8

    on a HP-PARISC rp3440 machine with this memory layout:
    Memory Ranges:
    0) Start 0x0000000000000000 End 0x000000003fffffff Size 1024 MB
    1) Start 0x0000004040000000 End 0x00000040ffdfffff Size 3070 MB

    Fix the crash by avoiding virt_to_page() and similar functions in
    mem_init() until the memory zones have been fully set up.

    Signed-off-by: Helge Deller
    Cc: stable@vger.kernel.org # v5.0+

    Helge Deller
     

03 Apr, 2020

3 commits

  • The idea comes from a discussion between Linus and Andrea [1].

    Before this patch we only allow a page fault to retry once. We achieved
    this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
    handle_mm_fault() the second time. This was mainly used to avoid
    unexpected starvation of the system by looping forever to handle the
    page fault on a single page. However that should hardly happen, and after
    all for each code path to return a VM_FAULT_RETRY we'll first wait for a
    condition (during which time we should possibly yield the cpu) to happen
    before VM_FAULT_RETRY is really returned.

    This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
    flag when we receive VM_FAULT_RETRY. It means that the page fault handler
    can now retry the page fault multiple times if necessary without the
    need to generate another page fault event. Meanwhile we still keep the
    FAULT_FLAG_TRIED flag so the page fault handler can still identify
    whether a page fault is the first attempt or not.

    Then we'll have these combinations of fault flags (only considering
    ALLOW_RETRY flag and TRIED flag):

    - ALLOW_RETRY and !TRIED: this means the page fault allows to
    retry, and this is the first try

    - ALLOW_RETRY and TRIED: this means the page fault allows to
    retry, and this is not the first try

    - !ALLOW_RETRY and !TRIED: this means the page fault does not allow
    to retry at all

    - !ALLOW_RETRY and TRIED: this is forbidden and should never be used

    In existing code we have multiple places that have taken special care of
    the first condition above by checking against (fault_flags &
    FAULT_FLAG_ALLOW_RETRY). This patch introduces a simple helper to detect
    the first try of a page fault by checking against both (fault_flags &
    FAULT_FLAG_ALLOW_RETRY) and !(fault_flags & FAULT_FLAG_TRIED) because now
    even the 2nd try will have the ALLOW_RETRY set, then use that helper in
    all existing special paths. One example is in __lock_page_or_retry(), now
    we'll drop the mmap_sem only in the first attempt of page fault and we'll
    keep it in follow up retries, so old locking behavior will be retained.

    This will be a nice enhancement for current code [2] and at the same time
    supporting material for the future userfaultfd-writeprotect work, since in
    that work there will always be an explicit userfault writeprotect retry
    for protected pages, and if that cannot resolve the page fault (e.g., when
    userfaultfd-writeprotect is used in conjunction with swapped pages) then
    we'll possibly need a 3rd retry of the page fault. It might also benefit
    other potential users who have requirements similar to userfault
    write-protection.

    GUP code is not touched yet and will be covered in follow up patch.

    Please read the thread below for more information.

    [1] https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/
    [2] https://lore.kernel.org/lkml/20181230154648.GB9832@redhat.com/

    Suggested-by: Linus Torvalds
    Suggested-by: Andrea Arcangeli
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160246.9790-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
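
The flag combinations listed in this commit can be expressed as a small predicate. The sketch below is a standalone model: the bit values are illustrative rather than the kernel's, the helper is named after the kernel's fault_flag_allow_retry_first(), and demo_flags_after_retry() is a hypothetical function showing what a handler does to the flags when it sees VM_FAULT_RETRY.

```c
#include <stdbool.h>

/* Bit values are illustrative; only the flag names come from the text. */
#define FAULT_FLAG_ALLOW_RETRY 0x1u
#define FAULT_FLAG_TRIED       0x2u

/* True only on the first attempt of a fault that is allowed to retry. */
static inline bool fault_flag_allow_retry_first(unsigned int flags)
{
    return (flags & FAULT_FLAG_ALLOW_RETRY) && !(flags & FAULT_FLAG_TRIED);
}

/*
 * Hypothetical: flags for the next attempt after a retry was requested.
 * ALLOW_RETRY is kept (so further retries remain possible) and TRIED
 * records that this is no longer the first attempt.
 */
static inline unsigned int demo_flags_after_retry(unsigned int flags)
{
    return flags | FAULT_FLAG_TRIED;
}
```

Checking ALLOW_RETRY alone would now be true on every attempt, which is why paths that must act only once need the combined test.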
     
  • Although there are tons of arch-specific page fault handlers, most of them
    still share the same initial value of the page fault flags. That is,
    nearly all of the page fault handlers allow the fault to be retried,
    and they also allow the fault to respond to SIGKILL.

    Let's define a default value for the fault flags to replace those initial
    page fault flags that were copied over. With this, it'll be far easier to
    introduce a new fault flag that can be used by all the architectures
    instead of touching all the archs.

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Reviewed-by: David Hildenbrand
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160238.9694-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
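
A sketch of the idea: one shared default replaces the initial value every handler used to spell out. The bit values and the function demo_initial_fault_flags() are invented for illustration, and the default here contains only the two properties named in the message (retry allowed, killable).

```c
/* Bit values are illustrative; they are not the kernel's. */
#define FAULT_FLAG_ALLOW_RETRY 0x1u
#define FAULT_FLAG_TRIED       0x2u
#define FAULT_FLAG_KILLABLE    0x4u
#define FAULT_FLAG_USER        0x8u
#define FAULT_FLAG_WRITE       0x10u

/* Shared starting point for every arch page fault handler. */
#define FAULT_FLAG_DEFAULT (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE)

/* Hypothetical handler prologue: start from the default, then add
 * the bits that depend on this particular fault. */
unsigned int demo_initial_fault_flags(int user_mode, int is_write)
{
    unsigned int flags = FAULT_FLAG_DEFAULT;

    if (user_mode)
        flags |= FAULT_FLAG_USER;
    if (is_write)
        flags |= FAULT_FLAG_WRITE;
    return flags;
}
```

Adding a new flag for all architectures then means editing FAULT_FLAG_DEFAULT once rather than each handler.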
     
  • For most architectures, we've got a quick path to detect fatal signal
    after a handle_mm_fault(). Introduce a helper for that quick path.

    It cleans up the current code a bit so we don't need to duplicate the same
    check across archs. More importantly, this will be a unified place where
    we handle the signal immediately right after an interrupted page fault, so
    it'll be much easier for us if we want to change the behavior of handling
    signals later on for all the archs.

    Note that currently only some of the archs are using this new helper,
    because some archs have their own way to handle signals. In the follow up
    patches, we'll try to apply this helper to all the rest of archs.

    Another note is that the "regs" parameter in the new helper is not used
    yet. It'll be used very soon. Now we kept it in this patch only to avoid
    touching all the archs again in the follow up patches.

    [peterx@redhat.com: fix sparse warnings]
    Link: http://lkml.kernel.org/r/20200311145921.GD479302@xz-x1
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220155353.8676-4-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
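
A toy version of such a quick-path helper, to show the shape of the check. It is a sketch under assumptions: the sig_task and sig_regs structs stand in for the current task and pt_regs, the VM_FAULT_RETRY value is illustrative, and the function name is hypothetical. As in the message, the regs argument is accepted but not used yet.

```c
#include <stdbool.h>

#define VM_FAULT_RETRY 0x1u /* illustrative value */

/* Stand-in for the current task's signal state. */
struct sig_task {
    bool fatal_signal; /* true if a fatal signal is pending */
};

/* Stand-in for struct pt_regs. */
struct sig_regs {
    unsigned long ip;
};

/*
 * True when the fault was interrupted (it asked for a retry) and the task
 * has a fatal signal pending, i.e. the handler should bail out instead of
 * retrying.
 */
bool demo_fault_signal_pending(unsigned int fault, const struct sig_task *task,
                               const struct sig_regs *regs)
{
    (void)regs; /* kept in the signature for later use */
    return (fault & VM_FAULT_RETRY) && task->fatal_signal;
}
```

Keeping the check in one function gives a single place to change if signal handling after an interrupted fault needs different behavior later.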
     

27 Jan, 2020

1 commit

  • The current code uses '#if PTRS_PER_PMD == 1' to distinguish 2- vs 3-level
    setups, it casts pgd to pgd to cope with page table folding and converts
    addresses of page table entries from physical to virtual and back for no
    good reason.

    Simplify the accesses to the page table entries using proper unfolding of
    the upper layers and replacing '#if PTRS_PER_PMD' with explicit
    '#if CONFIG_PGTABLE_LEVELS == 3'

    Signed-off-by: Mike Rapoport
    Signed-off-by: Helge Deller

    Mike Rapoport
     

14 Jan, 2020

1 commit

  • The commit d96885e277b5 ("parisc: use pgtable-nopXd instead of
    4level-fixup") converted PA-RISC to use folded page tables, but it missed
    the conversion of pgd_populate() to pud_populate() in the map_pages()
    function. This caused the upper page table directory to remain empty and
    the system would crash as a result.

    Using pud_populate() that actually populates the page table instead of
    dummy pgd_populate() fixes the issue.

    Fixes: d96885e277b5 ("parisc: use pgtable-nopXd instead of 4level-fixup")
    Reported-by: Meelis Roos
    Reported-by: Jeroen Roovers
    Reported-by: Mikulas Patocka
    Tested-by: Jeroen Roovers
    Tested-by: Mikulas Patocka
    Signed-off-by: Mike Rapoport
    Signed-off-by: Helge Deller

    Mike Rapoport
     

05 Dec, 2019

2 commits

  • Link: http://lkml.kernel.org/r/1572938135-31886-10-git-send-email-rppt@kernel.org
    Signed-off-by: Helge Deller
    Signed-off-by: Mike Rapoport
    Cc: Anatoly Pugachev
    Cc: Anton Ivanov
    Cc: Arnd Bergmann
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: "James E.J. Bottomley"
    Cc: Jeff Dike
    Cc: "Kirill A. Shutemov"
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Michal Simek
    Cc: Peter Rosin
    Cc: Richard Weinberger
    Cc: Rolf Eike Beer
    Cc: Russell King
    Cc: Russell King
    Cc: Sam Creasey
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Helge Deller
     
  • parisc has two or three levels of page tables and can use appropriate
    pgtable-nopXd and folding of the upper layers.

    Replace usage of include/asm-generic/4level-fixup.h and explicit
    definitions of __PAGETABLE_PxD_FOLDED in parisc with
    include/asm-generic/pgtable-nopmd.h for two-level configurations and
    with include/asm-generic/pgtable-nopud.h for three-level configurations
    and adjust page table manipulation macros and functions accordingly.

    Link: http://lkml.kernel.org/r/1572938135-31886-9-git-send-email-rppt@kernel.org
    Signed-off-by: Mike Rapoport
    Acked-by: Helge Deller
    Cc: Anatoly Pugachev
    Cc: Anton Ivanov
    Cc: Arnd Bergmann
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Ungerer
    Cc: "James E.J. Bottomley"
    Cc: Jeff Dike
    Cc: "Kirill A. Shutemov"
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Michal Simek
    Cc: Peter Rosin
    Cc: Richard Weinberger
    Cc: Rolf Eike Beer
    Cc: Russell King
    Cc: Russell King
    Cc: Sam Creasey
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

12 Nov, 2019

1 commit


15 Oct, 2019

1 commit


31 Jul, 2019

1 commit


10 Jul, 2019

1 commit

  • Pull parisc updates from Helge Deller:
    "Dynamic ftrace support by Sven Schnelle and a header guard fix by
    Denis Efremov"

    * 'parisc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: asm: psw.h: missing header guard
    parisc: add dynamic ftrace
    compiler.h: add CC_USING_PATCHABLE_FUNCTION_ENTRY
    parisc: use pr_debug() in kernel/module.c
    parisc: add WARN_ON() to clear_fixmap
    parisc: add spinlock to patch function
    parisc: add support for patching multiple words

    Linus Torvalds
     

09 Jul, 2019

1 commit

  • …iederm/user-namespace

    Pull force_sig() argument change from Eric Biederman:
    "A source of error over the years has been that force_sig has taken a
    task parameter when it is only safe to use force_sig with the current
    task.

    The force_sig function is built for delivering synchronous signals
    such as SIGSEGV where the userspace application caused a synchronous
    fault (such as a page fault) and the kernel responded with a signal.

    Because the name force_sig does not make this clear, and because the
    force_sig takes a task parameter the function force_sig has been
    abused for sending other kinds of signals over the years. Slowly those
    have been fixed when the oopses have been tracked down.

    This set of changes fixes the remaining abusers of force_sig and
    carefully rips out the task parameter from force_sig and friends
    making this kind of error almost impossible in the future"

    * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
    signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
    signal: Remove the signal number and task parameters from force_sig_info
    signal: Factor force_sig_info_to_task out of force_sig_info
    signal: Generate the siginfo in force_sig
    signal: Move the computation of force into send_signal and correct it.
    signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
    signal: Remove the task parameter from force_sig_fault
    signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
    signal: Explicitly call force_sig_fault on current
    signal/unicore32: Remove tsk parameter from __do_user_fault
    signal/arm: Remove tsk parameter from __do_user_fault
    signal/arm: Remove tsk parameter from ptrace_break
    signal/nds32: Remove tsk parameter from send_sigtrap
    signal/riscv: Remove tsk parameter from do_trap
    signal/sh: Remove tsk parameter from force_sig_info_fault
    signal/um: Remove task parameter from send_sigtrap
    signal/x86: Remove task parameter from send_sigtrap
    signal: Remove task parameter from force_sig_mceerr
    signal: Remove task parameter from force_sig
    signal: Remove task parameter from force_sigsegv
    ...

    Linus Torvalds
     

08 Jun, 2019

2 commits

  • This patch implements dynamic ftrace for PA-RISC. The required mcount
    call sequences can get pretty long, so instead of patching the
    whole call sequence out of the functions, we are using
    -fpatchable-function-entry from gcc. This puts a configurable number of
    NOPs before/at the start of the function. Taking do_sys_open() as an
    example, it looks like this when the call is patched out:

    1036b248: 08 00 02 40 nop
    1036b24c: 08 00 02 40 nop
    1036b250: 08 00 02 40 nop
    1036b254: 08 00 02 40 nop

    1036b258 <do_sys_open>:
    1036b258: 08 00 02 40 nop
    1036b25c: 08 03 02 41 copy r3,r1
    1036b260: 6b c2 3f d9 stw rp,-14(sp)
    1036b264: 08 1e 02 43 copy sp,r3
    1036b268: 6f c1 01 00 stw,ma r1,80(sp)

    When ftrace gets enabled for this function the kernel will patch these
    NOPs to:

    1036b248: 10 19 57 20
    1036b24c: 6f c1 00 80 stw,ma r1,40(sp)
    1036b250: 48 21 3f d1 ldw -18(r1),r1
    1036b254: e8 20 c0 02 bv,n r0(r1)

    1036b258 <do_sys_open>:
    1036b258: e8 3f 1f df b,l,n .-c,r1
    1036b25c: 08 03 02 41 copy r3,r1
    1036b260: 6b c2 3f d9 stw rp,-14(sp)
    1036b264: 08 1e 02 43 copy sp,r3
    1036b268: 6f c1 01 00 stw,ma r1,80(sp)

    So the first NOP in do_sys_open() will be patched to jump backwards into
    some minimal trampoline code which pushes a stackframe, saves r1 which
    holds the return address, loads the address of the real ftrace function,
    and branches to that location. On 64-bit, things get a bit more
    complicated (and longer) because we must make sure that the address of
    the ftrace location is 8-byte aligned, and the offset passed to ldd for
    fetching the address is 8-byte aligned as well.

    Note that gcc has a bug which misplaces the function label, and needs a
    patch to make dynamic ftrace work. See
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90751 for details.

    Signed-off-by: Sven Schnelle
    Signed-off-by: Helge Deller

    Sven Schnelle
     
  • Calling clear_fixmap() on an already cleared fixed mapping is
    a bad thing to do. Add a WARN_ON() to catch such issues.

    Signed-off-by: Sven Schnelle
    Signed-off-by: Helge Deller

    Sven Schnelle
     

29 May, 2019

1 commit

  • As synchronous exceptions really only make sense against the current
    task (otherwise how are you synchronous?), remove the task parameter
    from force_sig_fault to make it explicit that this is what is going
    on.

    The two known exceptions that deliver a synchronous exception to a
    stopped ptraced task have already been changed to
    force_sig_fault_to_task.

    The callers have been changed with the following emacs regular expression
    (with obvious variations on the architectures that take more arguments)
    to avoid typos:

    force_sig_fault[(]\([^,]+\)[,]\([^,]+\)[,]\([^,]+\)[,]\W+current[)]
    ->
    force_sig_fault(\1,\2,\3)
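
    As an illustration of what the replacement does to a call site, here is
    a user-space sketch. The force_sig_fault() below is a hypothetical
    three-argument stand-in that only records its arguments; the real kernel
    function takes additional arguments on some architectures, and
    report_bad_access() is an invented caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Records the most recent call so the effect can be inspected. */
struct fault_record {
    int sig;
    int code;
    void *addr;
};

static struct fault_record last_fault;

/* Hypothetical stand-in for the form without the task parameter. */
static int force_sig_fault(int sig, int code, void *addr)
{
    last_fault.sig = sig;
    last_fault.code = code;
    last_fault.addr = addr;
    return 0;
}

/*
 * Before the conversion this call site would have read:
 *     force_sig_fault(SIGSEGV, SEGV_MAPERR, addr, current);
 * After it, the trailing "current" argument is gone.
 */
static void report_bad_access(void *addr)
{
    force_sig_fault(SIGSEGV, SEGV_MAPERR, addr);
}
```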

    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

27 May, 2019

1 commit


21 May, 2019

1 commit


15 May, 2019

2 commits

  • Pull more parisc updates from Helge Deller:
    "Two small enhancements, which I didn't include in the last pull
    request because I wanted to keep them a few more days in for-next
    before sending upstream:

    - Replace the ldcw barrier instruction by a nop instruction in the
    CAS code on uniprocessor machines.

    - Map variables read-only after init (enable ro_after_init feature)"

    * 'parisc-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Use __ro_after_init in init.c
    parisc: Use __ro_after_init in unwind.c
    parisc: Use __ro_after_init in time.c
    parisc: Use __ro_after_init in processor.c
    parisc: Use __ro_after_init in process.c
    parisc: Use __ro_after_init in perf_images.h
    parisc: Use __ro_after_init in pci.c
    parisc: Use __ro_after_init in inventory.c
    parisc: Use __ro_after_init in head.S
    parisc: Use __ro_after_init in firmware.c
    parisc: Use __ro_after_init in drivers.c
    parisc: Use __ro_after_init in cache.c
    parisc: Enable the ro_after_init feature
    parisc: Drop LDCW barrier in CAS code when running UP
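
    The idea behind "read-only after init" can be approximated in user space
    as follows. This is only an analogy under stated assumptions: the kernel
    marks variables with __ro_after_init and remaps them once booting has
    finished, whereas this sketch uses mmap() and mprotect() on a single
    page. The names boot_param and init_boot_param() are invented.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical variable that is written once during initialization. */
static int *boot_param;

/*
 * Allocate one page, store the value, then drop write permission.
 * Returns 0 on success and -1 on failure.
 */
static int init_boot_param(int value)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0)
        return -1;

    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    boot_param = p;
    *boot_param = value;            /* still writable: "init" phase */

    /* From here on the page can be read but no longer written. */
    return mprotect(p, (size_t)page, PROT_READ);
}
```

    After init_boot_param() returns 0, reading *boot_param still works,
    while a write to it would fault.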

    Linus Torvalds
     
  • For most architectures free_initrd_mem just expands to the same
    free_reserved_area call. Provide that as a generic implementation marked
    __weak.
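
    A weak default of this kind can be sketched in user space with the
    compiler's weak attribute, which is what the kernel's __weak macro is
    understood to expand to. release_range() is a hypothetical stand-in for
    free_reserved_area() and only counts bytes.

```c
#include <assert.h>

/* Total number of bytes handed back by release_range(). */
static unsigned long freed_bytes;

/* Hypothetical stand-in for free_reserved_area(). */
static void release_range(unsigned long start, unsigned long end)
{
    freed_bytes += end - start;
}

/*
 * Generic default. Because it is weak, an architecture that needs
 * special handling can supply a normal (strong) definition with the
 * same name in another object file, and the linker will pick that one.
 */
__attribute__((weak))
void free_initrd_mem(unsigned long start, unsigned long end)
{
    release_range(start, end);
}
```

    With no strong definition linked in, a call to free_initrd_mem()
    resolves to the default above.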

    Link: http://lkml.kernel.org/r/20190213174621.29297-8-hch@lst.de
    Signed-off-by: Christoph Hellwig
    Acked-by: Geert Uytterhoeven [m68k]
    Acked-by: Mike Rapoport
    Cc: Catalin Marinas [arm64]
    Cc: Steven Price
    Cc: Alexander Viro
    Cc: Guan Xuetao
    Cc: Russell King
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

11 May, 2019

2 commits


04 May, 2019

4 commits

  • This patch updates the parisc huge TLB page support to use per-pagetable spinlocks.

    This patch requires Mikulas' per-pagetable spinlock patch and the revised TLB
    serialization patch from Helge and myself. With Mikulas' patch, we need to use
    the per-pagetable spinlock for page table updates. The TLB lock is only used
    to serialize TLB flushes on machines with the Merced bus.

    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller

    John David Anglin
     
  • When making the text sections writeable with set_kernel_text_rw(1),
    include all text sections including those in the __init section.
    Otherwise functions marked with __meminit will stay read-only.

    Signed-off-by: Helge Deller
    Cc: # 4.20+

    Helge Deller
     
  • The commit 1c30844d2dfe ("mm: reclaim small amounts of memory when an
    external fragmentation event occurs") breaks memory management on a
    parisc c8000 workstation with this memory layout:

    0) Start 0x0000000000000000 End 0x000000003fffffff Size 1024 MB
    1) Start 0x0000000100000000 End 0x00000001bfdfffff Size 3070 MB
    2) Start 0x0000004040000000 End 0x00000040ffffffff Size 3072 MB

    With the patch 1c30844d2dfe, the kernel will incorrectly reclaim the
    first zone when it fills up, ignoring the fact that there are two
    completely free zones. Basically, it limits cache size to 1GiB.

    The parisc kernel is currently using the DISCONTIGMEM implementation,
    but isn't NUMA. Avoid this issue or strange work-arounds by switching to
    the more commonly used SPARSEMEM implementation.
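
    The sizes in the memory layout above follow directly from the inclusive
    Start/End values, and the same arithmetic shows how large the gaps
    between the ranges are. A minimal sketch, with invented helper names:

```c
#include <assert.h>
#include <stdint.h>

/* One physical memory range with an inclusive end address. */
struct mem_range {
    uint64_t start;
    uint64_t end;
};

/* The three ranges reported for the c8000 workstation. */
static const struct mem_range ranges[] = {
    { 0x0000000000000000ULL, 0x000000003fffffffULL },
    { 0x0000000100000000ULL, 0x00000001bfdfffffULL },
    { 0x0000004040000000ULL, 0x00000040ffffffffULL },
};

/* Size of a range in MB (2^20 bytes). */
static uint64_t size_mb(const struct mem_range *r)
{
    return (r->end - r->start + 1) >> 20;
}

/* Size in MB of the unpopulated gap between two ascending ranges. */
static uint64_t hole_mb(const struct mem_range *lower,
                        const struct mem_range *upper)
{
    return (upper->start - lower->end - 1) >> 20;
}
```

    For example, size_mb(&ranges[1]) yields 3070 and
    hole_mb(&ranges[0], &ranges[1]) yields 3072, so the first gap alone is
    three times the size of the first range.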

    Reported-by: Mikulas Patocka
    Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
    Signed-off-by: Helge Deller

    Helge Deller
     
  • These functions will be used for adding code patching
    functions later.

    Signed-off-by: Sven Schnelle
    Signed-off-by: Helge Deller

    Sven Schnelle
     

22 Feb, 2019

1 commit


05 Jan, 2019

1 commit

  • The alternative coding patch for parisc in kernel 4.20 broke booting
    machines with PA8500-PA8700 CPUs. The problem is that on such machines
    the parisc kernel automatically uses huge pages to access kernel
    text, but the set_kernel_text_rw() function, which is called shortly
    before applying any alternative patches, didn't use the correct
    hugepage-aligned addresses to remap the kernel text read-write.
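
    Aligning a range to huge-page boundaries means rounding its start down
    and its end up to a multiple of the huge page size. The sketch below
    shows that rounding only. The 1 MiB size and the helper names are
    assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed huge page size for this example: 1 MiB. */
#define DEMO_HPAGE_SIZE ((uint64_t)1 << 20)
#define DEMO_HPAGE_MASK (~(DEMO_HPAGE_SIZE - 1))

/* Round an address down to the start of its huge page. */
static uint64_t hpage_align_down(uint64_t addr)
{
    return addr & DEMO_HPAGE_MASK;
}

/* Round an address up to the next huge page boundary. */
static uint64_t hpage_align_up(uint64_t addr)
{
    return (addr + DEMO_HPAGE_SIZE - 1) & DEMO_HPAGE_MASK;
}
```

    A range that starts at 0x10123456 would therefore be remapped from
    0x10100000, and an end address of 0x10123456 would be extended to
    0x10200000.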

    Fixes: 3847dab77421 ("parisc: Add alternative coding infrastructure")
    Cc: [4.20]
    Signed-off-by: Helge Deller

    Helge Deller