08 Apr, 2020

2 commits

  • It is unlikely that an inaccessible VMA without the required permission
    flags will take a page fault. Hence, let's add the unlikely() directive
    to such checks to improve performance while also standardizing them
    across the various platforms.
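
    For illustration, the shape of such a check in an arch fault handler
    (a sketch; the exact permission flags vary per architecture):

    /* Bail out early on a fault against a VMA we may never access. */
    if (unlikely(!(vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC))))
            goto bad_area;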

    Signed-off-by: Anshuman Khandual
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Cc: Guo Ren
    Cc: Geert Uytterhoeven
    Cc: Ralf Baechle
    Cc: Paul Burton
    Cc: Mike Rapoport
    Link: http://lkml.kernel.org/r/1582525304-32113-1-git-send-email-anshuman.khandual@arm.com
    Signed-off-by: Linus Torvalds

    Anshuman Khandual
     
  • Let's move the vma_is_accessible() helper to include/linux/mm.h, which
    makes it available for general use. While here, replace all remaining
    open encodings of the VMA access check with vma_is_accessible().
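
    The helper itself is tiny; roughly:

    static inline bool vma_is_accessible(struct vm_area_struct *vma)
    {
            return vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
    }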

    Signed-off-by: Anshuman Khandual
    Signed-off-by: Andrew Morton
    Acked-by: Geert Uytterhoeven
    Acked-by: Guo Ren
    Acked-by: Vlastimil Babka
    Cc: Guo Ren
    Cc: Geert Uytterhoeven
    Cc: Ralf Baechle
    Cc: Paul Burton
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: Yoshinori Sato
    Cc: Rich Felker
    Cc: Dave Hansen
    Cc: Andy Lutomirski
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Mel Gorman
    Cc: Alexander Viro
    Cc: "Aneesh Kumar K.V"
    Cc: Arnaldo Carvalho de Melo
    Cc: Arnd Bergmann
    Cc: Nick Piggin
    Cc: Paul Mackerras
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/1582520593-30704-3-git-send-email-anshuman.khandual@arm.com
    Signed-off-by: Linus Torvalds

    Anshuman Khandual
     

03 Apr, 2020

3 commits

  • The idea comes from a discussion between Linus and Andrea [1].

    Before this patch we only allowed a page fault to retry once. We achieved
    this by clearing the FAULT_FLAG_ALLOW_RETRY flag when calling
    handle_mm_fault() the second time. This was mainly used to avoid
    unexpected starvation of the system by looping forever over the page
    fault on a single page. However, that should hardly happen: every code
    path that returns VM_FAULT_RETRY first waits for a condition (during
    which time we should possibly yield the CPU) before VM_FAULT_RETRY is
    actually returned.

    This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
    flag when we receive VM_FAULT_RETRY. It means that the page fault handler
    can now retry the page fault multiple times if necessary, without the
    need to generate another page fault event. Meanwhile, we still keep the
    FAULT_FLAG_TRIED flag so the page fault handler can identify whether a
    page fault is the first attempt or not.

    Then we'll have these combinations of fault flags (only considering
    ALLOW_RETRY flag and TRIED flag):

    - ALLOW_RETRY and !TRIED: the page fault is allowed to retry, and
    this is the first try

    - ALLOW_RETRY and TRIED: the page fault is allowed to retry, and
    this is not the first try

    - !ALLOW_RETRY and !TRIED: the page fault is not allowed to retry
    at all

    - !ALLOW_RETRY and TRIED: this is forbidden and should never be used

    In existing code we have multiple places that have taken special care of
    the first condition above by checking against (fault_flags &
    FAULT_FLAG_ALLOW_RETRY). This patch introduces a simple helper to detect
    the first retry of a page fault by checking against both (fault_flags &
    FAULT_FLAG_ALLOW_RETRY) and !(fault_flags & FAULT_FLAG_TRIED), because
    now even the second try will have ALLOW_RETRY set, and then uses that
    helper in all existing special paths. One example is
    __lock_page_or_retry(): now we'll drop the mmap_sem only on the first
    attempt of the page fault and keep it in follow-up retries, so the old
    locking behavior is retained.
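
    The helper is small; a sketch of its shape:

    static inline bool fault_flag_allow_retry_first(unsigned int flags)
    {
            return (flags & FAULT_FLAG_ALLOW_RETRY) &&
                   !(flags & FAULT_FLAG_TRIED);
    }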

    This will be a nice enhancement for current code [2] and, at the same
    time, supporting material for the future userfaultfd-writeprotect work,
    since in that work there will always be an explicit userfault
    writeprotect retry for protected pages, and if that cannot resolve the
    page fault (e.g., when userfaultfd-writeprotect is used in conjunction
    with swapped pages) then we'll possibly need a third retry of the page
    fault. It might also benefit other potential users with similar
    requirements, like userfault write-protection.

    GUP code is not touched yet and will be covered in a follow-up patch.

    Please read the thread below for more information.

    [1] https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/
    [2] https://lore.kernel.org/lkml/20181230154648.GB9832@redhat.com/

    Suggested-by: Linus Torvalds
    Suggested-by: Andrea Arcangeli
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160246.9790-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Although there are tons of arch-specific page fault handlers, most of
    them still share the same initial value of the page fault flags. Nearly
    all of the page fault handlers allow the fault to be retried, and they
    also allow the fault to respond to SIGKILL.

    Let's define a default value for the fault flags to replace those
    copied-over initial page fault flags. With this, it'll be far easier to
    introduce a new fault flag usable by all the architectures, instead of
    touching all the archs.
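
    A minimal sketch of such a default, assuming only the two common flags
    described above (the exact definition may carry more flags):

    #define FAULT_FLAG_DEFAULT  (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE)

    /* in an arch fault handler: */
    unsigned int flags = FAULT_FLAG_DEFAULT;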

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Reviewed-by: David Hildenbrand
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160238.9694-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • For most architectures, we've got a quick path to detect a fatal signal
    after handle_mm_fault(). Introduce a helper for that quick path.

    It cleans up the current code a bit so we don't need to duplicate the
    same check across archs. More importantly, this will be a unified place
    where we handle the signal immediately after an interrupted page fault,
    so it'll be much easier for us if we want to change how signals are
    handled later on, for all the archs.

    Note that currently only some of the archs use this new helper, because
    some archs have their own way of handling signals. In follow-up patches,
    we'll try to apply this helper to all the remaining archs.

    Another note: the "regs" parameter in the new helper is not used yet,
    but it will be used very soon. For now we keep it in this patch only to
    avoid touching all the archs again in the follow-up patches.
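
    A sketch of the helper, matching the description above (note that "regs"
    is deliberately unused for now):

    static inline bool fault_signal_pending(vm_fault_t fault_flags,
                                            struct pt_regs *regs)
    {
            return unlikely((fault_flags & VM_FAULT_RETRY) &&
                            fatal_signal_pending(current));
    }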

    [peterx@redhat.com: fix sparse warnings]
    Link: http://lkml.kernel.org/r/20200311145921.GD479302@xz-x1
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220155353.8676-4-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     

10 Feb, 2020

6 commits

  • Also iterate the PMD tables to populate the PTE table allocator; this
    fully replaces the previous zero_pgtable hack.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Greg Ungerer
    Tested-by: Michael Schmitz
    Tested-by: Greg Ungerer
    Link: https://lore.kernel.org/r/20200131125403.938797587@infradead.org
    Signed-off-by: Geert Uytterhoeven

    Peter Zijlstra
     
  • In addition to the PGD/PMD table size (128*4) add a PTE table size
    (64*4) to the table allocator. This completely removes the pte-table
    overhead compared to the old code, even for dense tables.

    Notes:

    - the allocator gained a list_empty() check to deal with there not
    being any pages at all.

    - the free mask is extended to cover more than the 8 bits required
    for the (512 byte) PGD/PMD tables.

    - NR_PAGETABLE accounting is restored.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Greg Ungerer
    Tested-by: Michael Schmitz
    Tested-by: Greg Ungerer
    Link: https://lore.kernel.org/r/20200131125403.882175409@infradead.org
    Signed-off-by: Geert Uytterhoeven

    Peter Zijlstra
     
  • With the PTE-tables now only being 256 bytes, allocating a full page
    for them is a giant waste. Start by improving the boot-time allocator
    such that init_mm initialization will at least have optimal memory
    density.

    Many thanks to Will Deacon for help with debugging and ferreting out
    lost information on these dusty MMUs.

    Notes:

    - _TABLE_MASK is reduced to account for the shorter (256 byte)
    alignment of pte-tables; per the manual, table entries should only
    ever have state in the low 4 bits (Used, WrProt, Desc1, Desc0), so it
    is still longer than strictly required. (Thanks Will!!!)

    - Also use kernel_page_table() for the 020/030 zero_pgtable case and
    consequently remove the zero_pgtable init hack (will fix up later).

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Greg Ungerer
    Tested-by: Michael Schmitz
    Tested-by: Greg Ungerer
    Link: https://lore.kernel.org/r/20200131125403.768263973@infradead.org
    Signed-off-by: Geert Uytterhoeven

    Peter Zijlstra
     
  • The Motorola 68xxx MMUs of the 040 (and later) have a fixed 7,7,{5,6}
    page-table setup, where the last number depends on the page size
    selected (8k vs 4k resp.), and head.S selects 4K pages. For the 030
    (and earlier) we explicitly program 7,7,6 and 4K pages in %tc.

    However, the current code implements this in a mightily weird way: it
    groups 16 of those (6-bit) pte tables into one 4k page to avoid wasting
    space. The downside is that this forces pmd_t to be a 16-tuple pointing
    to consecutive pte tables.

    This breaks the generic code, which assumes READ_ONCE(*pmd) will be
    word sized.

    Therefore, implement a straightforward 7,7,6 3-level page-table setup,
    with the addition (for 020/030) of (partial) large-page support. For
    now this increases the memory footprint for pte-tables 15-fold.

    Tested with ARAnyM/68040 emulation.
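
    For reference, the 7,7,6 split covers a 32-bit virtual address with 4K
    pages; a sketch of the resulting arithmetic (illustrative names, not
    necessarily the exact macros used):

    /* 32-bit VA = 7 (pgd) + 7 (pmd) + 6 (pte) + 12 (page offset) bits */
    #define PAGE_SHIFT      12                       /* 4K pages */
    #define PMD_SHIFT       (PAGE_SHIFT + 6)         /* 18: 64-entry pte tables */
    #define PGDIR_SHIFT     (PMD_SHIFT + 7)          /* 25: 128-entry pmd tables */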

    Suggested-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Greg Ungerer
    Tested-by: Michael Schmitz
    Tested-by: Greg Ungerer
    Link: https://lore.kernel.org/r/20200131125403.711478295@infradead.org
    Signed-off-by: Geert Uytterhoeven

    Peter Zijlstra
     
  • Only the Motorola MMU makes use of this allocator, so it is a waste of
    .text to include it for Sun3/ColdFire. This also avoids build issues
    when we make it more Motorola-specific later on.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Greg Ungerer
    Tested-by: Michael Schmitz
    Tested-by: Greg Ungerer
    Link: https://lore.kernel.org/r/20200131125403.654652162@infradead.org
    Signed-off-by: Geert Uytterhoeven

    Peter Zijlstra
     
  • Seeing how there are 5 copies of this magic code, one of which is
    inexplicably different, unify and document things.

    Suggested-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Greg Ungerer
    Tested-by: Michael Schmitz
    Tested-by: Greg Ungerer
    Link: https://lore.kernel.org/r/20200131125403.597688427@infradead.org
    Signed-off-by: Geert Uytterhoeven

    Peter Zijlstra
     

05 Dec, 2019

1 commit

  • m68k has two or three levels of page tables and can use the appropriate
    pgtable-nopXd headers and folding of the upper layers.

    Replace the usage of include/asm-generic/4level-fixup.h and explicit
    definitions of __PAGETABLE_PxD_FOLDED in m68k with
    include/asm-generic/pgtable-nopmd.h for two-level configurations and
    with include/asm-generic/pgtable-nopud.h for three-level configurations,
    and adjust the page table manipulation macros and functions accordingly.
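
    In sketch form (assuming Sun3/ColdFire are the two-level configurations
    and the Motorola MMU the three-level one):

    /* two-level configurations: fold the PMD away */
    #include <asm-generic/pgtable-nopmd.h>

    /* three-level configurations: fold only the PUD away */
    #include <asm-generic/pgtable-nopud.h>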

    [akpm@linux-foundation.org: fix merge glitch]
    [geert@linux-m68k.org: more merge glitch fixes]
    [akpm@linux-foundation.org: s/bad_pgd/bad_pud/, per Mike]
    Link: http://lkml.kernel.org/r/1572938135-31886-6-git-send-email-rppt@kernel.org
    Signed-off-by: Mike Rapoport
    Acked-by: Greg Ungerer
    Cc: Anatoly Pugachev
    Cc: Anton Ivanov
    Cc: Arnd Bergmann
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Helge Deller
    Cc: "James E.J. Bottomley"
    Cc: Jeff Dike
    Cc: "Kirill A. Shutemov"
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Michal Simek
    Cc: Peter Rosin
    Cc: Richard Weinberger
    Cc: Rolf Eike Beer
    Cc: Russell King
    Cc: Russell King
    Cc: Sam Creasey
    Cc: Vincent Chen
    Cc: Vineet Gupta
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

12 Nov, 2019

1 commit

  • m68k uses __iounmap as the name for an internal helper that is only
    used for some CPU types. Mark it static, give it a better name
    and move it around a bit to avoid a forward declaration.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Geert Uytterhoeven
    Acked-by: Geert Uytterhoeven

    Christoph Hellwig
     

29 May, 2019

1 commit

  • As synchronous exceptions really only make sense against the current
    task (otherwise how are you synchronous), remove the task parameter
    from force_sig_fault to make it explicit that this is what is going
    on.

    The two known exceptions that deliver a synchronous exception to a
    stopped ptraced task have already been changed to
    force_sig_fault_to_task.

    The callers have been changed with the following emacs regular expression
    (with obvious variations on the architectures that take more arguments)
    to avoid typos:

    force_sig_fault[(]\([^,]+\)[,]\([^,]+\)[,]\([^,]+\)[,]\W+current[)]
    ->
    force_sig_fault(\1,\2,\3)

    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

15 May, 2019

1 commit

  • For most architectures free_initrd_mem just expands to the same
    free_reserved_area call. Provide that as a generic implementation marked
    __weak.
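
    A sketch of the generic version, assuming the common poison/name
    arguments the arch copies passed to free_reserved_area():

    #ifdef CONFIG_BLK_DEV_INITRD
    void __weak free_initrd_mem(unsigned long start, unsigned long end)
    {
            free_reserved_area((void *)start, (void *)end, -1, "initrd");
    }
    #endif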

    Link: http://lkml.kernel.org/r/20190213174621.29297-8-hch@lst.de
    Signed-off-by: Christoph Hellwig
    Acked-by: Geert Uytterhoeven [m68k]
    Acked-by: Mike Rapoport
    Cc: Catalin Marinas [arm64]
    Cc: Steven Price
    Cc: Alexander Viro
    Cc: Guan Xuetao
    Cc: Russell King
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

13 Mar, 2019

2 commits

  • Add checks for the return value of the memblock_alloc*() functions and
    call panic() in case of error. The panic message repeats the one used by
    the panicking memblock allocators, with the parameters adjusted to
    include only the relevant ones.

    The replacement was mostly automated with semantic patches like the one
    below, with manual massaging of the format strings.

    @@
    expression ptr, size, align;
    @@
    ptr = memblock_alloc(size, align);
    + if (!ptr)
    +         panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);

    [anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
    Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
    [rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
    Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
    [rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
    Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
    [akpm@linux-foundation.org: fix xtensa printk warning]
    Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Anders Roxell
    Reviewed-by: Guo Ren [c-sky]
    Acked-by: Paul Burton [MIPS]
    Acked-by: Heiko Carstens [s390]
    Reviewed-by: Juergen Gross [Xen]
    Reviewed-by: Geert Uytterhoeven [m68k]
    Acked-by: Max Filippov [xtensa]
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • memblock_alloc() already clears the allocated memory; there is no point
    in doing it twice.

    Link: http://lkml.kernel.org/r/1548057848-15136-14-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Geert Uytterhoeven [m68k]
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Guo Ren [c-sky]
    Cc: Heiko Carstens
    Cc: Juergen Gross [Xen]
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Paul Burton
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

06 Mar, 2019

1 commit

  • The PG_reserved flag is cleared from memory that is part of the kernel
    image (and therefore marked as PG_reserved). Avoid using PG_reserved
    directly.

    Link: http://lkml.kernel.org/r/20190114125903.24845-6-david@redhat.com
    Signed-off-by: David Hildenbrand
    Acked-by: Geert Uytterhoeven
    Cc: Michal Hocko
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

20 Dec, 2018

2 commits

  • Pull m68k fix from Geert Uytterhoeven:
    "Fix memblock-related crashes"

    * tag 'm68k-for-v4.20-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
    m68k: Fix memblock-related crashes

    Linus Torvalds
     
  • When running the kernel in Fast RAM on Atari:

    Ignoring memory chunk at 0x0:0xe00000 before the first chunk
    ...
    Unable to handle kernel NULL pointer dereference at virtual address (ptrval)
    Oops: 00000000
    Modules linked in:
    PC: [] free_all_bootmem+0x12c/0x186
    SR: 2714 SP: (ptrval) a2: 005e3314
    d0: 00000000 d1: 0000000a d2: 00000e00 d3: 00000000
    d4: 005e1fc0 d5: 0000001a a0: 01000000 a1: 00000000
    Process swapper (pid: 0, task=(ptrval))
    Frame format=7 eff addr=00000736 ssw=0505 faddr=00000736
    wb 1 stat/addr/data: 0000 00000000 00000000
    wb 2 stat/addr/data: 0000 00000000 00000000
    wb 3 stat/addr/data: 0000 00000736 00000000
    push data: 00000000 00000000 00000000 00000000
    Stack from 005e1f84:
    00000000 0000000a 027d3260 006b5006 00000000 00000000 00000000 00000000
    0004f062 0003a220 0069e272 005e1ff8 0000054c 00000000 00e00000 00000000
    00000001 00693cd8 027d3260 0004f062 0003a220 00691be6 00000000 00000000
    00000000 00000000 00000000 00000000 006b5006 00000000 00690872
    Call Trace: [] printk+0x0/0x18
    [] parse_args+0x0/0x2d4
    [] memblock_virt_alloc_try_nid+0x0/0xa4
    [] mem_init+0xa/0x5c
    [] printk+0x0/0x18
    [] parse_args+0x0/0x2d4
    [] start_kernel+0x1ca/0x462
    [] _sinittext+0x872/0x11f8
    Code: 7a1a eaae 2270 6db0 0061 ef14 2f01 2f03 0736 2203 e589 d681 e78b d6a9 0732 2f03 2f40 0034 4eb9 0069 b8d0 260e 4fef
    Disabling lock debugging due to kernel taint
    Kernel panic - not syncing: Attempted to kill the idle task!

    As the kernel must run in the memory chunk with the lowest address,
    ST-RAM is ignored, and removed from the m68k_memory[] array.
    However, it is not removed from memblock, causing a crash later.

    More investigation shows that there are 3 places where memory chunks are
    ignored, all after the calls to memblock_add() in m68k_parse_bootinfo(),
    thus causing crashes:
    1. On classic m68k CPUs with a MMU, paging_init() ignores all memory
    chunks below the first chunk, cfr. above,
    2. On Amigas equipped with a Zorro III bus, config_amiga() ignores all
    Zorro II memory,
    3. If CONFIG_SINGLE_MEMORY_CHUNK=y, m68k_parse_bootinfo() ignores all
    but the first memory chunk.

    Fix this by moving the calls to memblock_add() from
    m68k_parse_bootinfo() to paging_init(), after all ignored memory chunks
    have been removed from m68k_memory[].
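
    The shape of the fix, as described above (a sketch):

    /* in paging_init(), after all ignored chunks have been removed
     * from m68k_memory[]: */
    for (i = 0; i < m68k_num_memory; i++)
            memblock_add(m68k_memory[i].addr, m68k_memory[i].size);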

    Reported-by: Andreas Schwab
    Fixes: 1008a11590b966b4 ("m68k: switch to MEMBLOCK + NO_BOOTMEM")
    Signed-off-by: Geert Uytterhoeven

    Geert Uytterhoeven
     

31 Oct, 2018

4 commits

  • Move the remaining definitions and declarations from
    include/linux/bootmem.h into include/linux/memblock.h and remove the
    now-redundant header.

    The includes were replaced with the semantic patch below, followed by
    semi-automated removal of duplicated '#include <linux/memblock.h>'
    lines.

    @@
    @@
    - #include <linux/bootmem.h>
    + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
    Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Stephen Rothwell
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The conversion is done using

    sed -i 's@free_all_bootmem@memblock_free_all@' \
    $(git grep -l free_all_bootmem)

    Link: http://lkml.kernel.org/r/1536927045-23536-26-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The alloc_bootmem_pages() function allocates PAGE_SIZE aligned memory.
    memblock_alloc() with alignment set to PAGE_SIZE does exactly the same
    thing.

    The conversion is done using the following semantic patch:

    @@
    expression e;
    @@
    - alloc_bootmem_pages(e)
    + memblock_alloc(e, PAGE_SIZE)

    Link: http://lkml.kernel.org/r/1536927045-23536-20-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The alloc_bootmem_low_pages() function allocates PAGE_SIZE aligned regions
    from low memory. memblock_alloc_low() with alignment set to PAGE_SIZE does
    exactly the same thing.

    The conversion is done using the following semantic patch:

    @@
    expression e;
    @@
    - alloc_bootmem_low_pages(e)
    + memblock_alloc_low(e, PAGE_SIZE)

    Link: http://lkml.kernel.org/r/1536927045-23536-19-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

18 Aug, 2018

1 commit

  • Use the new return type vm_fault_t for fault handlers. For now, this
    just documents that the function returns a VM_FAULT value rather than an
    errno. Once all instances are converted, vm_fault_t will become a
    distinct type.

    Ref: commit 1c8f422059ae ("mm: change return type to vm_fault_t")

    In this patch all the callers of handle_mm_fault() are changed to use
    the vm_fault_t type.
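
    A sketch of the change in an arch fault handler (the typedef shown is
    the documentation-only stage; the distinct __bitwise type came later):

    typedef int vm_fault_t;          /* later: unsigned int __bitwise */

    /* in the fault handler, "int fault" becomes: */
    vm_fault_t fault = handle_mm_fault(vma, address, flags);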

    Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
    Signed-off-by: Souptick Joarder
    Cc: Matthew Wilcox
    Cc: Richard Henderson
    Cc: Tony Luck
    Cc: Matt Turner
    Cc: Vineet Gupta
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Richard Kuo
    Cc: Geert Uytterhoeven
    Cc: Michal Simek
    Cc: James Hogan
    Cc: Ley Foon Tan
    Cc: Jonas Bonn
    Cc: James E.J. Bottomley
    Cc: Benjamin Herrenschmidt
    Cc: Palmer Dabbelt
    Cc: Yoshinori Sato
    Cc: David S. Miller
    Cc: Richard Weinberger
    Cc: Guan Xuetao
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: "Levin, Alexander (Sasha Levin)"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
     

29 Jul, 2018

1 commit

  • In m68k the physical memory is described by [memory_start, memory_end]
    for the !MMU variant and by the m68k_memory array of memory ranges for
    the MMU version. This information is used directly to register the
    physical memory with memblock.

    The reserve_bootmem() calls are replaced with memblock_reserve() and the
    bootmap bitmap allocation is simply dropped.

    Since the MMU variant creates early mappings only for a small part of
    the memory, we force bottom-up allocations in memblock.
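
    A sketch of the distinctive parts, under the assumptions named above
    (the exact reserved ranges are illustrative):

    /* reservations move from reserve_bootmem() to memblock: */
    memblock_reserve(__pa(_stext), _end - _stext);  /* e.g. kernel image */

    /* only low memory is mapped early, so allocate bottom-up: */
    memblock_set_bottom_up(true);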

    Signed-off-by: Mike Rapoport
    Acked-by: Greg Ungerer
    Signed-off-by: Geert Uytterhoeven

    Mike Rapoport
     

06 Jun, 2018

1 commit

  • Pull m68knommu updates from Greg Ungerer:
    "These changes all relate to converting the IO access functions for the
    ColdFire (and all other non-MMU m68k) platforms to use asm-generic IO
    instead.

    This makes the IO support the same on all ColdFire (regardless of MMU
    enabled or not) and means we can now support PCI in non-MMU mode.

    As a bonus these changes remove more code than they add"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
    m68k: fix ColdFire PCI config reads and writes
    m68k: introduce iomem() macro for __iomem conversions
    m68k: allow ColdFire PCI bus on MMU and non-MMU configuration
    m68k: fix ioremapping for internal ColdFire peripherals
    m68k: fix read/write multi-byte IO for PCI on ColdFire
    m68k: don't redefine access functions if we have PCI
    m68k: remove old ColdFire IO access support code
    m68k: use io_no.h for MMU and non-MMU enabled ColdFire
    m68k: setup PCI support code in io_no.h
    m68k: group io mapping definitions and functions
    m68k: rework raw access macros for the non-MMU case
    m68k: use asm-generic/io.h for non-MMU io access functions
    m68k: put definition guards around virt_to_phys and phys_to_virt
    m68k: move *_relaxed macros into io_no.h and io_mm.h

    Linus Torvalds
     

05 Jun, 2018

1 commit

  • Pull m68k updates from Geert Uytterhoeven:

    - a few time-related fixes:
    - off-by-one calendar month on some classes of machines
    - Y2038 preparation

    - build fix for ndelay() being called with a 64-bit type

    - revive 64-bit get_user(), which is used by some Android code

    - defconfig updates

    - fix for a long-standing fatal bug in iounmap() on '020/030, which was
    actually fixed in 2.4.23, but never in 2.5.x and later

    - default DMA mask to avoid warning splats

    - minor fixes and cleanups

    * tag 'm68k-for-v4.18-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
    m68k: Set default dma mask for platform devices
    m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap()
    m68k/defconfig: Update defconfigs for v4.17-rc3
    m68k/uaccess: Revive 64-bit get_user()
    m68k: Implement ndelay() as an inline function to force type checking/casting
    zorro: Add a blank line after declarations
    m68k: Use read_persistent_clock64() consistently
    m68k: Fix off-by-one calendar month
    m68k: Fix style, spelling, and grammar in siginfo_build_tests()
    m68k/mac: Fix SWIM memory resource end address

    Linus Torvalds
     

28 May, 2018

1 commit

  • The ColdFire SoC internal peripherals are mapped into virtual address
    space using the ACR registers of the cache control unit. This means we
    are using a 1:1 physical:virtual mapping for them that does not rely on
    page table mappings. We can quickly determine if we are accessing an
    internal peripheral device, given the physical or virtual address, using
    the same range check.

    The implication of this mapping is that an ioremap should return the
    physical address as the virtual mapping __iomem cookie as well. So fix
    ioremap() to deal with this on ColdFire. Of course, the same needs to be
    taken care of in the iounmap() path as well.
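
    A sketch of the resulting logic (the range-check helper name here is
    hypothetical):

    void __iomem *ioremap(unsigned long physaddr, unsigned long size)
    {
            if (cf_internal_peripheral(physaddr))   /* 1:1 ACR mapping */
                    return (void __iomem *)physaddr;
            return __ioremap(physaddr, size, IOMAP_NOCACHE_SER);
    }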

    Reported-by: Angelo Dureghello
    Signed-off-by: Greg Ungerer
    Reviewed-by: Angelo Dureghello
    Tested-by: Angelo Dureghello

    Greg Ungerer
     

24 May, 2018

1 commit

  • If 020/030 support is enabled, get_io_area() leaves an IO_SIZE gap
    between mappings, which is added to the vm_struct representing the
    mapping. __ioremap() uses the actual requested size (after alignment),
    while __iounmap() is passed the size from the vm_struct.

    On 020/030, early termination descriptors are used to set up mappings of
    extent 'size', which are validated on unmapping. The unmapped gap of
    size IO_SIZE defeats the sanity check of the pmd tables, causing
    __iounmap() to loop forever on 030.

    On 040/060, unmapping of page table entries does not check for a valid
    mapping, so the unmapping loop always completes there.

    Adjust the size to be unmapped by the gap that had been added to the
    vm_struct earlier.

    This fixes the hang in atari_platform_init() reported a long time ago,
    and a similar one reported by Finn recently (addressed by removing the
    ioremap() use from the SWIM driver).

    Tested on my Falcon in 030 mode - untested but should work the same on
    040/060 (the extra page tables cleared there would never have been set
    up anyway).

    Signed-off-by: Michael Schmitz
    [geert: Minor commit description improvements]
    [geert: This was fixed in 2.4.23, but not in 2.5.x]
    Signed-off-by: Geert Uytterhoeven
    Cc: stable@vger.kernel.org

    Michael Schmitz
     

25 Apr, 2018

1 commit

  • Filling in struct siginfo before calling force_sig_info is a tedious
    and error-prone process, where once in a great while the wrong fields
    are filled out, and siginfo has been inconsistently cleared.

    Simplify this process by using the helper force_sig_fault, which takes
    as parameters all of the information it needs, ensures all of the fiddly
    bits of filling in struct siginfo are done properly, and then calls
    force_sig_info.

    In short, about a 5-line reduction in code every time force_sig_info is
    called, which makes the calling functions clearer.
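
    A before/after sketch ("addr" stands for the faulting address; at this
    point force_sig_fault still took the task as its last parameter):

    /* before: fiddly and easy to get wrong */
    struct siginfo si;
    clear_siginfo(&si);
    si.si_signo = SIGSEGV;
    si.si_code = SEGV_MAPERR;
    si.si_addr = (void __user *)addr;
    force_sig_info(SIGSEGV, &si, current);

    /* after: one call carries everything */
    force_sig_fault(SIGSEGV, SEGV_MAPERR, (void __user *)addr, current);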

    Cc: Geert Uytterhoeven
    Cc: linux-m68k@lists.linux-m68k.org
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

19 Mar, 2018

1 commit

  • Since commit ad67b74d2469d9b8 ("printk: hash addresses printed with
    %p"), the virtual memory layout printed during boot up contains "ptrval"
    instead of actual addresses:

    Memory: 268040K/276480K available (2979K kernel code, 310K rwdata, 784K rodata, 144K init, 172K bss, 8440K reserved, 0K cma-reserved)
    Virtual kernel memory layout:
    vector : 0x003d2e74 - 0x003d3274 ( 1 KiB)
    kmap : 0xd0000000 - 0xf0000000 ( 512 MiB)
    vmalloc : 0x11800000 - 0xd0000000 (3048 MiB)
    lowmem : 0x00000000 - 0x11000000 ( 272 MiB)
    .init : 0x(ptrval) - 0x(ptrval) ( 144 KiB)
    .text : 0x(ptrval) - 0x(ptrval) (2980 KiB)
    .data : 0x(ptrval) - 0x(ptrval) (1095 KiB)
    .bss : 0x(ptrval) - 0x(ptrval) ( 173 KiB)

    Instead of changing the printing to "%px", and leaking virtual memory
    layout information again, just remove the printing completely, cfr. e.g.
    commit 071929dbdd865f77 ("arm64: Stop printing the virtual memory
    layout").

    All interesting information (actual section sizes) is already printed by
    mem_init_print_info() just above anyway.

    Signed-off-by: Geert Uytterhoeven

    Geert Uytterhoeven
     

23 Jan, 2018

1 commit

  • The siginfo structure has all manner of holes, with the result that a
    structure initializer is not guaranteed to initialize all of the bits.
    As we have to copy the structure to userspace, don't even try to use a
    structure initializer. Instead, use clear_siginfo followed by
    initialization of selected fields. This guarantees that uninitialized
    kernel memory is not copied to userspace.

    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

06 Nov, 2017

2 commits

  • The m68k pg_data_table is a fixed-size array defined in
    arch/m68k/mm/init.c. Index numbers within it are computed based on
    memory size. But for ColdFire these don't take into account a non-zero
    physical RAM base address, and this causes us to access past the end of
    this array at system start time.

    Change the node shift calculation so that we keep the index inside its
    range.

    Reported-by: Angelo Dureghello
    Tested-by: Angelo Dureghello
    Signed-off-by: Greg Ungerer

    Greg Ungerer
     
  • The M54[78]x ColdFire parts are not the only members of the ColdFire
    family that have an MMU. But currently some of the early MMU
    initialization code is inside the startup code specific to only the
    ColdFire M54[78]x parts. Move that early ColdFire MMU init code so that
    it is also run for other ColdFire parts running with the MMU enabled.

    Specifically, this means that the MMU initialization code will now also
    be run for the ColdFire M5441x parts when running with the MMU enabled.

    The code move means that the extern definition for the
    mmu_context_init() function had to be moved as well. To make it clear
    that it is ColdFire-specific, I have renamed it with a "cf_" prefix and
    put its extern definition in mcfmmu.h (which is already included by the
    setup code).

    Reported-by: Angelo Dureghello
    Tested-by: Angelo Dureghello
    Signed-off-by: Greg Ungerer

    Greg Ungerer
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset
    of the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
