Eric Lee / smarc-fsl-linux-kernel

09 Jun, 2020

40 commits

bce2b68b8 exec: use flush_icache_user_range in read_code

read_code operates on user addresses.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Alexander Viro
Link: http://lkml.kernel.org/r/20200515143646.3857579-27-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
48304f799 exec: only build read_code when needed

Only build read_code when binary formats that use it are built into the
kernel.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Alexander Viro
Link: http://lkml.kernel.org/r/20200515143646.3857579-26-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
a1e81f965 m68k: implement flush_icache_user_range

Rename the current flush_icache_range to flush_icache_user_range: as per
commit ae92ef8a4424 ("[PATCH] flush icache in correct context"), there
seems to be an assumption that it operates on user addresses. Add a
flush_icache_range around it that for now is a no-op.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Acked-by: Geert Uytterhoeven
Cc: Geert Uytterhoeven
Link: http://lkml.kernel.org/r/20200515143646.3857579-25-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
fca7f8e6f arm: rename flush_cache_user_range to flush_icache_user_range

flush_icache_user_range will be the name for a generic primitive. Move
the arm name so that arm already has an implementation.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Russell King
Link: http://lkml.kernel.org/r/20200515143646.3857579-24-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
70cd3444c xtensa: implement flush_icache_user_range

The Xtensa implementation of flush_icache_range seems to be able to cope
with user addresses. Just define flush_icache_user_range to
flush_icache_range.

[jcmvbkbc@gmail.com: fix flush_icache_user_range in noMMU configs]
Link: http://lkml.kernel.org/r/20200525221556.4270-1-jcmvbkbc@gmail.com

Signed-off-by: Christoph Hellwig
Signed-off-by: Max Filippov
Signed-off-by: Andrew Morton
Cc: Chris Zankel
Cc: Max Filippov
Link: http://lkml.kernel.org/r/20200515143646.3857579-23-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
952ec41c4 sh: implement flush_icache_user_range

The SuperH implementation of flush_icache_range seems to be able to cope
with user addresses. Just define flush_icache_user_range to
flush_icache_range.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Yoshinori Sato
Cc: Rich Felker
Link: http://lkml.kernel.org/r/20200515143646.3857579-22-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
1268c3338 asm-generic: add a flush_icache_user_range stub

Define flush_icache_user_range to flush_icache_range unless the
architecture provides its own implementation.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Arnd Bergmann
Link: http://lkml.kernel.org/r/20200515143646.3857579-21-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
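
The fallback described in 1268c3338 is a common preprocessor pattern: use the architecture's definition if there is one, otherwise alias the generic name. The following is a minimal userspace sketch of that pattern, not the kernel header itself. The flush_icache_range body here is a hypothetical stand-in that only counts calls, and icache_flush_calls and count_after_user_flush are names made up for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the architecture's flush_icache_range();
 * it only counts calls so the fallback can be observed in userspace. */
static int icache_flush_calls;

static void flush_icache_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	icache_flush_calls++;
}

/* An architecture with its own implementation would define
 * flush_icache_user_range before this point. Otherwise the generic
 * fallback below maps it onto flush_icache_range. */
#ifndef flush_icache_user_range
#define flush_icache_user_range flush_icache_range
#endif

/* Illustration only: issue one user-range flush and report how many
 * times the underlying flush_icache_range has run so far. */
int count_after_user_flush(void)
{
	flush_icache_user_range(0x1000UL, 0x2000UL);
	return icache_flush_calls;
}
```

With no architecture override in place, every call through the user-range name ends up in flush_icache_range.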
885f7f8e3 mm: rename flush_icache_user_range to flush_icache_user_page

The function currently known as flush_icache_user_range only operates on
a single page. Rename it to flush_icache_user_page as we'll need the
name flush_icache_user_range for something else soon.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Acked-by: Geert Uytterhoeven
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Matt Turner
Cc: Tony Luck
Cc: Fenghua Yu
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Vincent Chen
Cc: Jonas Bonn
Cc: Stefan Kristiansson
Cc: Stafford Horne
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Paul Walmsley
Cc: Palmer Dabbelt
Cc: Albert Ou
Cc: Arnd Bergmann
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/20200515143646.3857579-20-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
97f52c153 arm,sparc,unicore32: remove flush_icache_user_range

flush_icache_user_range is only used by , so
remove it from the architectures that implement it, but don't use
.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Russell King
Cc: "David S. Miller"
Cc: Guan Xuetao
Link: http://lkml.kernel.org/r/20200515143646.3857579-19-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
396eb69c6 riscv: use asm-generic/cacheflush.h

RISC-V needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Also remove the pointless __KERNEL__ ifdef while we're at it.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Reviewed-by: Palmer Dabbelt
Acked-by: Palmer Dabbelt
Cc: Paul Walmsley
Cc: Palmer Dabbelt
Cc: Albert Ou
Link: http://lkml.kernel.org/r/20200515143646.3857579-18-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
5019f7601 powerpc: use asm-generic/cacheflush.h

Power needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Also remove the pointless __KERNEL__ ifdef while we're at it.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Link: http://lkml.kernel.org/r/20200515143646.3857579-17-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
e05094512 openrisc: use asm-generic/cacheflush.h

OpenRISC needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Jonas Bonn
Cc: Stefan Kristiansson
Cc: Stafford Horne
Link: http://lkml.kernel.org/r/20200515143646.3857579-16-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
9e730ffac m68knommu: use asm-generic/cacheflush.h

m68knommu needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Acked-by: Greg Ungerer
Cc: Greg Ungerer
Cc: Geert Uytterhoeven
Link: http://lkml.kernel.org/r/20200515143646.3857579-15-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
03518c82b microblaze: use asm-generic/cacheflush.h

Microblaze needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Michal Simek
Link: http://lkml.kernel.org/r/20200515143646.3857579-14-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
57b94ff59 ia64: use asm-generic/cacheflush.h

IA64 needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Tony Luck
Cc: Fenghua Yu
Link: http://lkml.kernel.org/r/20200515143646.3857579-13-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
af23eea56 hexagon: use asm-generic/cacheflush.h

Hexagon needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Acked-by: Brian Cain
Link: http://lkml.kernel.org/r/20200515143646.3857579-12-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
2d49d89c7 c6x: use asm-generic/cacheflush.h

C6x needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Acked-by: Mark Salter
Cc: Aurelien Jacquiot
Link: http://lkml.kernel.org/r/20200515143646.3857579-11-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
a7ba12121 arm64: use asm-generic/cacheflush.h

ARM64 needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Acked-by: Catalin Marinas
Cc: Will Deacon
Link: http://lkml.kernel.org/r/20200515143646.3857579-10-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
43c74ca33 alpha: use asm-generic/cacheflush.h

Alpha needs almost no cache flushing routines of its own. Rely on
asm-generic/cacheflush.h for the defaults.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Matt Turner
Link: http://lkml.kernel.org/r/20200515143646.3857579-9-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
76b3b58fa asm-generic: improve the flush_dcache_page stub

There is a magic ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE cpp symbol that
guards non-stub availability of flush_dcache_page. Use that to check
if flush_dcache_page is implemented.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Arnd Bergmann
Link: http://lkml.kernel.org/r/20200515143646.3857579-8-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
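
The guard described in 76b3b58fa can be pictured as follows. This is a rough userspace sketch under stated assumptions, not the actual asm-generic header: struct page is a hypothetical one-field stand-in, and arch_has_real_flush_dcache_page is a helper invented for illustration. The idea is that the generic header supplies an empty stub, and records that fact in the symbol, only when the architecture has not already defined ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct page {
	int flushed;
};

/* If the architecture had provided flush_dcache_page(), it would also
 * have defined ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE before this point.
 * When the symbol is missing, fall back to a no-op stub and set the
 * symbol to 0 so that users can test it. */
#ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
static inline void flush_dcache_page(struct page *page)
{
	(void)page; /* nothing to flush in the stub */
}
#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
#endif

/* Illustration only: report whether a non-stub implementation exists. */
int arch_has_real_flush_dcache_page(void)
{
	return ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE != 0;
}
```

Keying the stub on the symbol rather than on the function name means one check answers both "is there a stub?" and "does the architecture really flush?".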
e0cf615d7 asm-generic: don't include <linux/mm.h> in cacheflush.h

This seems to lead to some crazy include loops when using
asm-generic/cacheflush.h on more architectures, so leave it to the arch
header for now.

[hch@lst.de: fix warning]
Link: http://lkml.kernel.org/r/20200520173520.GA11199@lst.de

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Will Deacon
Cc: Nick Piggin
Cc: Peter Zijlstra
Cc: Jeff Dike
Cc: Richard Weinberger
Cc: Anton Ivanov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Keith Busch
Cc: Ira Weiny
Cc: Arnd Bergmann
Link: http://lkml.kernel.org/r/20200515143646.3857579-7-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
92a73bd29 asm-generic: fix the inclusion guards for cacheflush.h

cacheflush.h uses a somewhat too generic include guard name that clashes
with various arch files. Use a more specific one.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Arnd Bergmann
Link: http://lkml.kernel.org/r/20200515143646.3857579-6-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
7c95fda54 unicore32: remove flush_cache_user_range

flush_cache_user_range is an ARMism not used by any generic or unicore32
specific code.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Guan Xuetao
Link: http://lkml.kernel.org/r/20200515143646.3857579-5-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
e292e7403 powerpc: unexport flush_icache_user_range

flush_icache_user_range is only used by copy_to_user_page, which is only
used by core VM code.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Link: http://lkml.kernel.org/r/20200515143646.3857579-4-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
e7c1fa11b nds32: unexport flush_icache_page

flush_icache_page is only used by mm/memory.c.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Greentime Hu
Cc: Vincent Chen
Link: http://lkml.kernel.org/r/20200515143646.3857579-3-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
ce450ebf6 arm: fix the flush_icache_range arguments in set_fiq_handler

Patch series "sort out the flush_icache_range mess", v2.

flush_icache_range is mostly used for kernel addresses, except for the
following cases:

- the nommu brk and mmap implementations

- the read_code helper that is only used for binfmt_flat,
binfmt_elf_fdpic, and binfmt_aout including the broken
ia32 compat version

- binfmt_flat itself

none of which really are used by a typical MMU enabled kernel, as a.out
can only be built for alpha and m68k to start with.

But strangely enough commit ae92ef8a4424 ("[PATCH] flush icache in
correct context") added a "set_fs(KERNEL_DS)" around the
flush_icache_range call in the module loader, because apparently m68k
assumed user pointers.

This series first cleans up the cacheflush implementations, largely by
switching as much as possible to the asm-generic version after a few
preparations, then moves the misnamed current flush_icache_user_range to
a new name, to finally introduce a real flush_icache_user_range to be
used for the above use cases to flush the instruction cache for a
userspace address range. The last patch then drops the set_fs in the
module code and moves it into the m68k implementation.

This patch (of 29):

The arguments passed look bogus; try to fix them to something that seems
to make sense.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Arnd Bergmann
Cc: Roman Zippel
Cc: Jessica Yu
Cc: Michal Simek
Cc: Albert Ou
Cc: Alexander Shishkin
Cc: Alexander Viro
Cc: Alexei Starovoitov
Cc: Anton Ivanov
Cc: Arnaldo Carvalho de Melo
Cc: Aurelien Jacquiot
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Brian Cain
Cc: Catalin Marinas
Cc: Christoph Hellwig
Cc: Chris Zankel
Cc: Daniel Borkmann
Cc: Dan Williams
Cc: Dave Jiang
Cc: "David S. Miller"
Cc: Fenghua Yu
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Ungerer
Cc: Guan Xuetao
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Ira Weiny
Cc: Ivan Kokshaysky
Cc: Jeff Dike
Cc: Jiri Olsa
Cc: Jonas Bonn
Cc: Keith Busch
Cc: Mark Rutland
Cc: Mark Salter
Cc: Martin KaFai Lau
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Namhyung Kim
Cc: Nick Piggin
Cc: Palmer Dabbelt
Cc: Palmer Dabbelt
Cc: Paul Mackerras
Cc: Paul Walmsley
Cc: Peter Zijlstra
Cc: Richard Henderson
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Russell King
Cc: Song Liu
Cc: Stafford Horne
Cc: Stefan Kristiansson
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vincent Chen
Cc: Vishal Verma
Cc: Will Deacon
Cc: Yonghong Song
Cc: Yoshinori Sato
Link: http://lkml.kernel.org/r/20200515143646.3857579-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200515143646.3857579-2-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800
690623e1b vhost: convert get_user_pages() --> pin_user_pages()

This code was using get_user_pages*(), in approximately a "Case 5"
scenario (accessing the data within a page), using the categorization
from [1]. That means that it's time to convert the get_user_pages*() +
put_page() calls to pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small part
of fixing a long-standing disconnect between pinning pages, and file
systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/

Signed-off-by: John Hubbard
Signed-off-by: Andrew Morton
Reviewed-by: Jan Kara
Acked-by: Michael S. Tsirkin
Acked-by: Pankaj Gupta
Cc: Jason Wang
Cc: Dave Chinner
Cc: Jérôme Glisse
Cc: Jonathan Corbet
Cc: Souptick Joarder
Cc: Vlastimil Babka
Link: http://lkml.kernel.org/r/20200529234309.484480-3-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds

John Hubbard
2020-06-09 02:05:57 +0800
eaf4d22a9 docs: mm/gup: pin_user_pages.rst: add a "case 5"

Patch series "vhost, docs: convert to pin_user_pages(), new "case 5""

It recently became clear to me that there are some get_user_pages*()
callers that don't fit neatly into any of the four cases that are so far
listed in pin_user_pages.rst. vhost.c is one of those.

Add a Case 5 to the documentation, and refer to that when converting
vhost.c.

Thanks to Jan Kara for helping me (again) in understanding the
interaction between get_user_pages() and page writeback [1].

This is based on today's mmotm, which has a nearby patch to
pin_user_pages.rst that rewords cases 3 and 4.

Note that I have only compile-tested the vhost.c patch, although that
does also include cross-compiling for a few other arches. Any run-time
testing would be greatly appreciated.

[1] https://lore.kernel.org/r/20200529070343.GL14550@quack2.suse.cz

This patch (of 2):

There are four cases listed in pin_user_pages.rst. These are intended
to help developers figure out whether to use get_user_pages*(), or
pin_user_pages*(). However, the four cases do not cover all the
situations. For example, drivers/vhost/vhost.c has a "pin, write to
page, set page dirty, unpin" case.

Add a fifth case, to help explain that there is a general pattern that
requires pin_user_pages*() API calls.

[jhubbard@nvidia.com: v2]
Link: http://lkml.kernel.org/r/20200601052633.853874-2-jhubbard@nvidia.com

Signed-off-by: John Hubbard
Signed-off-by: Andrew Morton
Cc: Vlastimil Babka
Cc: Jan Kara
Cc: Jérôme Glisse
Cc: Dave Chinner
Cc: Jonathan Corbet
Cc: Souptick Joarder
Cc: "Michael S . Tsirkin"
Cc: Jason Wang
Link: http://lkml.kernel.org/r/20200529234309.484480-1-jhubbard@nvidia.com
Link: http://lkml.kernel.org/r/20200529234309.484480-2-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds

John Hubbard
2020-06-09 02:05:57 +0800
6a005645e mm/gup: documentation fix for pin_user_pages*() APIs

All of the pin_user_pages*() API calls will cause pages to be
dma-pinned. As such, they are all suitable for DMA, RDMA, and/or
Direct IO.

The documentation should say so, but it was instead saying that three of
the API calls were only suitable for Direct IO. This was discovered
when a reviewer wondered why an API call that specifically recommended
against Case 2 (DMA/RDMA) was being used in a DMA situation [1].

Fix this by simply deleting those claims. The gup.c comments already
refer to the more extensive Documentation/core-api/pin_user_pages.rst,
which does have the correct guidance. So let's just write it once,
there.

[1] https://lore.kernel.org/r/20200529074658.GM30374@kadam

Signed-off-by: John Hubbard
Signed-off-by: Andrew Morton
Reviewed-by: David Hildenbrand
Acked-by: Pankaj Gupta
Acked-by: Souptick Joarder
Cc: Dan Carpenter
Cc: Jan Kara
Cc: Vlastimil Babka
Link: http://lkml.kernel.org/r/20200529084515.46259-1-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds

John Hubbard
2020-06-09 02:05:56 +0800
55a650c35 mm/gup: frame_vector: convert get_user_pages() --> pin_user_pages()

This code was using get_user_pages*(), and all of the callers so far
were in a "Case 2" scenario (DMA/RDMA), using the categorization from [1].

That means that it's time to convert the get_user_pages*() + put_page()
calls to pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small part
of fixing a long-standing disconnect between pinning pages, and file
systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/

Signed-off-by: John Hubbard
Signed-off-by: Andrew Morton
Acked-by: David Hildenbrand
Cc: Daniel Vetter
Cc: Jérôme Glisse
Cc: Vlastimil Babka
Cc: Jan Kara
Cc: Dave Chinner
Cc: Pankaj Gupta
Cc: Souptick Joarder
Link: http://lkml.kernel.org/r/20200527223243.884385-3-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds

John Hubbard
2020-06-09 02:05:56 +0800
420c2091b mm/gup: introduce pin_user_pages_locked()

Patch series "mm/gup: introduce pin_user_pages_locked(), use it in frame_vector.c", v2.

This adds yet one more pin_user_pages*() variant, and uses that to
convert mm/frame_vector.c.

With this, along with maybe 20 or 30 other recent patches in various
trees, we are close to having the relevant gup call sites
converted--with the notable exception of the bio/block layer.

This patch (of 2):

Introduce pin_user_pages_locked(), which is nearly identical to
get_user_pages_locked() except that it sets FOLL_PIN and rejects
FOLL_GET.

As with other pairs of get_user_pages*() and pin_user_pages() API calls,
it's prudent to assert that FOLL_PIN is *not* set in the
get_user_pages*() call, so add that as part of this.

[jhubbard@nvidia.com: v2]
Link: http://lkml.kernel.org/r/20200531234131.770697-2-jhubbard@nvidia.com

Signed-off-by: John Hubbard
Signed-off-by: Andrew Morton
Reviewed-by: David Hildenbrand
Acked-by: Pankaj Gupta
Cc: Daniel Vetter
Cc: Jérôme Glisse
Cc: Vlastimil Babka
Cc: Jan Kara
Cc: Dave Chinner
Cc: Souptick Joarder
Link: http://lkml.kernel.org/r/20200531234131.770697-1-jhubbard@nvidia.com
Link: http://lkml.kernel.org/r/20200527223243.884385-1-jhubbard@nvidia.com
Link: http://lkml.kernel.org/r/20200527223243.884385-2-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds

John Hubbard
2020-06-09 02:05:56 +0800
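
The flag rules stated in 420c2091b (the pin_*() variant sets FOLL_PIN and rejects FOLL_GET, while the get_*() variant refuses a caller-supplied FOLL_PIN) can be modelled in a few lines. This is a hedged userspace sketch of the checks only. The EX_FOLL_* values are illustrative rather than taken from the kernel headers, and pin_flags_check and get_flags_check are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative flag values; the kernel defines its own FOLL_* constants. */
#define EX_FOLL_WRITE 0x01u
#define EX_FOLL_GET   0x04u
#define EX_FOLL_PIN   0x40000u

/* Models the pin_*() side: a caller must not pass FOLL_GET, and
 * FOLL_PIN is added on the caller's behalf. On success the effective
 * flags are stored in *out and 0 is returned. */
int pin_flags_check(unsigned int gup_flags, unsigned int *out)
{
	if (gup_flags & EX_FOLL_GET)
		return -EINVAL;
	*out = gup_flags | EX_FOLL_PIN;
	return 0;
}

/* Models the get_*() side: FOLL_PIN is reserved for the pin_*() calls,
 * so a caller passing it is rejected. */
int get_flags_check(unsigned int gup_flags)
{
	if (gup_flags & EX_FOLL_PIN)
		return -EINVAL;
	return 0;
}
```

Rejecting the "wrong" flag on each side keeps the two API families from being mixed by accident.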
a8f80f53f mm/gup: update pin_user_pages.rst for "case 3" (mmu notifiers)

Update case 3 so that it covers the use of mmu notifiers, for hardware
that does, or does not, have replayable page faults.

Also, elaborate case 4 slightly, as it was quite cryptic.

Signed-off-by: John Hubbard
Signed-off-by: Andrew Morton
Cc: Daniel Vetter
Cc: Jérôme Glisse
Cc: Vlastimil Babka
Cc: Jan Kara
Cc: Dave Chinner
Cc: Jonathan Corbet
Link: http://lkml.kernel.org/r/20200527194953.11130-1-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds

John Hubbard
2020-06-09 02:05:56 +0800
dadbb612f mm/gup.c: convert to use get_user_{page|pages}_fast_only()

The API __get_user_pages_fast() is renamed to get_user_pages_fast_only()
to align with pin_user_pages_fast_only().

As part of this we will get rid of the write parameter. Instead, the
caller will pass FOLL_WRITE to get_user_pages_fast_only(). This will not
change any existing functionality of the API.

All the callers are changed to pass FOLL_WRITE.

Also introduce get_user_page_fast_only(), and use it in a few places
that hard-code nr_pages to 1.

Updated the documentation of the API.

Signed-off-by: Souptick Joarder
Signed-off-by: Andrew Morton
Reviewed-by: John Hubbard
Reviewed-by: Paul Mackerras [arch/powerpc/kvm]
Cc: Matthew Wilcox
Cc: Michael Ellerman
Cc: Benjamin Herrenschmidt
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Paolo Bonzini
Cc: Stephen Rothwell
Cc: Mike Rapoport
Cc: Aneesh Kumar K.V
Cc: Michal Suchanek
Link: http://lkml.kernel.org/r/1590396812-31277-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Linus Torvalds

Souptick Joarder
2020-06-09 02:05:56 +0800
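
The shape of the change in dadbb612f (a flags argument instead of a boolean write parameter, plus a single-page helper for callers that hard-code nr_pages to 1) might look roughly like this. It is a userspace sketch under loud assumptions: get_user_pages_fast_only here is a stand-in that hands out pages from a small static array, the signatures are assumed for illustration, and struct page, EX_FOLL_WRITE, example_pages, last_gup_flags, and demo_write_lookup are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page { int id; };             /* hypothetical stand-in */
#define EX_FOLL_WRITE 0x01u          /* illustrative value */

static struct page example_pages[4]; /* pretend backing pages */
static unsigned int last_gup_flags;  /* flags seen by the last call */

/* Stand-in for the multi-page call: "pins" up to 4 pages and records
 * the flags it was given. Returns the number of pages filled in. */
static int get_user_pages_fast_only(unsigned long start, int nr_pages,
				    unsigned int gup_flags,
				    struct page **pages)
{
	int i;

	(void)start;
	last_gup_flags = gup_flags;
	for (i = 0; i < nr_pages && i < 4; i++)
		pages[i] = &example_pages[i];
	return i;
}

/* Single-page convenience wrapper: nr_pages is fixed at 1 and the
 * result is reduced to success or failure. */
static bool get_user_page_fast_only(unsigned long addr,
				    unsigned int gup_flags,
				    struct page **pagep)
{
	return get_user_pages_fast_only(addr, 1, gup_flags, pagep) == 1;
}

/* A caller that used to pass write=1 now passes the write flag. */
bool demo_write_lookup(struct page **pagep)
{
	return get_user_page_fast_only(0x1000UL, EX_FOLL_WRITE, pagep);
}
```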
e77132e75 kernel/sysctl.c: ignore out-of-range taint bits introduced via kernel.tainted

Users with SYS_ADMIN capability can add arbitrary taint flags to the
running kernel by writing to /proc/sys/kernel/tainted or issuing the
command 'sysctl -w kernel.tainted=...'. This interface, however, is
open for any integer value and this might cause an invalid set of flags
to be committed to the tainted_mask bitset.

This patch introduces a simple way for proc_taint() to ignore any
invalid bits coming from the user input before committing those
bits to the kernel tainted_mask.

Signed-off-by: Rafael Aquini
Signed-off-by: Andrew Morton
Reviewed-by: Luis Chamberlain
Cc: Kees Cook
Cc: Iurii Zaikin
Cc: "Theodore Ts'o"
Link: http://lkml.kernel.org/r/20200512223946.888020-1-aquini@redhat.com
Signed-off-by: Linus Torvalds

Rafael Aquini
2020-06-09 02:05:56 +0800
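
One simple way to ignore out-of-range bits, as e77132e75 describes, is to mask the user-supplied value with the set of defined flags before using it. The sketch below shows that idea in userspace and is not necessarily how proc_taint() implements it. EXAMPLE_TAINT_FLAGS_COUNT is an assumed value chosen for illustration, and ignore_out_of_range_taint_bits is a hypothetical helper name.

```c
#include <assert.h>

/* Assumption: the number of defined taint flags. The kernel has its own
 * count, which varies by version; 18 is used here only for illustration. */
#define EXAMPLE_TAINT_FLAGS_COUNT 18

/* Keep only the bits that correspond to a defined taint flag and drop
 * every bit at or above EXAMPLE_TAINT_FLAGS_COUNT. */
unsigned long ignore_out_of_range_taint_bits(unsigned long requested)
{
	const unsigned long valid_mask =
		(1UL << EXAMPLE_TAINT_FLAGS_COUNT) - 1;

	return requested & valid_mask;
}
```

With this mask, a write of all-ones sets only the 18 defined bits, and a write that names just an undefined bit has no effect.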
60c958d8d panic: add sysctl to dump all CPUs backtraces on oops event

Usually when the kernel reaches an oops condition, it's a point of no
return; in case not enough debug information is available in the kernel
splat, one of the last resorts would be to collect a kernel crash dump
and analyze it. The problem with this approach is that in order to
collect the dump, a panic is required (to kexec-load the crash kernel).
When in an environment of multiple virtual machines, users may prefer to
try living with the oops, at least until being able to properly shut down
their VMs / finish their important tasks.

This patch implements a way to collect a bit more debug details when an
oops event is reached, by printing all the CPUs backtraces through the
usage of NMIs (on architectures that support that). The sysctl added
(and documented) here was called "oops_all_cpu_backtrace", and when set
will (as the name suggests) dump all CPUs backtraces.

Far from ideal, this may be the last option though for users that for
some reason cannot panic on oops. Most of the time oopses are clear enough
to indicate the kernel portion that must be investigated, but in virtual
environments it's possible to observe hypervisor/KVM issues that could
lead to oopses shown in other guests CPUs (like virtual APIC crashes).
This patch hence aims to help debug such complex issues without
resorting to kdump.

Signed-off-by: Guilherme G. Piccoli
Signed-off-by: Andrew Morton
Reviewed-by: Kees Cook
Cc: Luis Chamberlain
Cc: Iurii Zaikin
Cc: Thomas Gleixner
Cc: Vlastimil Babka
Cc: Randy Dunlap
Cc: Matthew Wilcox
Link: http://lkml.kernel.org/r/20200327224116.21030-1-gpiccoli@canonical.com
Signed-off-by: Linus Torvalds

Guilherme G. Piccoli
2020-06-09 02:05:56 +0800
0ec9dc9bc kernel/hung_task.c: introduce sysctl to print all traces when a hung task is detected

Commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks before
panic") introduced a change in that we started to show all CPUs
backtraces when a hung task is detected _and_ the sysctl/kernel
parameter "hung_task_panic" is set. The idea is good, because usually
when observing deadlocks (that may lead to hung tasks), the culprit is
another task holding a lock and not necessarily the task detected as
hung.

The problem with this approach is that dumping backtraces is a slightly
expensive task, especially when printing to the console (and especially
on many-CPU machines, such as the servers commonly found nowadays). So,
users that
plan to collect a kdump to investigate the hung tasks and narrow down
the deadlock definitely don't need the CPUs backtrace on dmesg/console,
which will delay the panic and pollute the log (crash tool would easily
grab all CPUs traces with 'bt -a' command).

Also, there's the reciprocal scenario: some users may be interested in
seeing the CPUs backtraces but not have the system panic when a hung
task is detected. The current approach hence is almost like embedding a
policy in the kernel, by forcing the CPUs backtraces' dump (only) on
hung_task_panic.

This patch decouples the panic event on hung task from the CPUs
backtraces dump, by creating (and documenting) a new sysctl called
"hung_task_all_cpu_backtrace", analogous to the approach taken on
soft/hard lockups, that have both a panic and an "all_cpu_backtrace"
sysctl to allow individual control. The new mechanism for dumping the
CPUs backtraces on hung task detection respects "hung_task_warnings" by
not dumping the traces when there are no warnings left.

Signed-off-by: Guilherme G. Piccoli
Signed-off-by: Andrew Morton
Reviewed-by: Kees Cook
Cc: Tetsuo Handa
Link: http://lkml.kernel.org/r/20200327223646.20779-1-gpiccoli@canonical.com
Signed-off-by: Linus Torvalds

Guilherme G. Piccoli
2020-06-09 02:05:56 +0800
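
The decoupling described in 0ec9dc9bc boils down to two independent decisions taken when a hung task is found. The following is a simplified userspace model of that policy, not the code in kernel/hung_task.c. The struct and function names are hypothetical, and the only rules encoded are the ones stated in the commit message: the backtrace dump follows its own setting and is skipped when no warnings are left, and the panic follows hung_task_panic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the three settings involved. */
struct hung_task_policy {
	bool panic;             /* models hung_task_panic */
	bool all_cpu_backtrace; /* models hung_task_all_cpu_backtrace */
	int warnings_left;      /* models hung_task_warnings; 0 = none left */
};

/* What the detector should do for one detected hung task. */
struct hung_task_actions {
	bool dump_all_cpus;
	bool do_panic;
};

struct hung_task_actions on_hung_task(const struct hung_task_policy *p)
{
	struct hung_task_actions a = { false, false };

	/* Backtraces follow their own sysctl, and only while warnings
	 * remain. They no longer depend on the panic setting. */
	a.dump_all_cpus = p->all_cpu_backtrace && p->warnings_left != 0;

	/* The panic decision is independent of the backtrace dump. */
	a.do_panic = p->panic;

	return a;
}
```

A user who wants traces without a panic sets only all_cpu_backtrace. A user who plans to collect a kdump sets only panic and avoids the console noise.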
f117955a2 kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases

After a recent change introduced by Vlastimil's series [0], the kernel is
now able to handle sysctl parameters on the kernel command line; also, the
series introduced a simple infrastructure to convert legacy boot
parameters (that duplicate sysctls) into sysctl aliases.

This patch converts the watchdog parameters softlockup_panic and
{hard,soft}lockup_all_cpu_backtrace to use the new alias infrastructure.
It fixes the documentation too, since the alias only accepts values 0 or
1, not the full range of integers.

We also took the opportunity here to improve the documentation of the
previously converted hung_task_panic (see the patch series [0]) and put
the alias table in alphabetical order.

[0] http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz

Signed-off-by: Guilherme G. Piccoli
Signed-off-by: Andrew Morton
Acked-by: Vlastimil Babka
Cc: Kees Cook
Cc: Iurii Zaikin
Cc: Luis Chamberlain
Link: http://lkml.kernel.org/r/20200507214624.21911-1-gpiccoli@canonical.com
Signed-off-by: Linus Torvalds

Guilherme G. Piccoli
2020-06-09 02:05:56 +0800
4f2f682d8 lib/test_sysctl: support testing of sysctl. boot parameter

Testing is done by a new parameter debug.test_sysctl.boot_int which
defaults to 0 and it's expected that the tester passes a boot parameter
that sets it to 1. The test checks if it's set to 1.

To distinguish true failure from parameter not being set, the test
checks /proc/cmdline for the expected parameter, and whether test_sysctl
is built-in and not a module.

[vbabka@suse.cz: skip the new test if boot_int sysctl is not present]
Link: http://lkml.kernel.org/r/305af605-1e60-cf84-fada-6ce1ca37c102@suse.cz

Signed-off-by: Vlastimil Babka
Signed-off-by: Andrew Morton
Cc: Alexey Dobriyan
Cc: Christian Brauner
Cc: David Rientjes
Cc: "Eric W . Biederman"
Cc: Greg Kroah-Hartman
Cc: "Guilherme G . Piccoli"
Cc: Iurii Zaikin
Cc: Ivan Teterevkov
Cc: Kees Cook
Cc: Luis Chamberlain
Cc: Masami Hiramatsu
Cc: Matthew Wilcox
Cc: Michal Hocko
Cc: Michal Hocko
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20200427180433.7029-6-vbabka@suse.cz
Signed-off-by: Linus Torvalds

Vlastimil Babka
2020-06-09 02:05:56 +0800
4546cde96 tools/testing/selftests/sysctl/sysctl.sh: support CONFIG_TEST_SYSCTL=y

The testing script recommends CONFIG_TEST_SYSCTL=y, but actually only
works with CONFIG_TEST_SYSCTL=m. Testing of sysctl setting via boot
param however requires the test to be built-in, so make sure the test
script supports it.

Signed-off-by: Vlastimil Babka
Signed-off-by: Andrew Morton
Acked-by: Luis Chamberlain
Cc: Alexey Dobriyan
Cc: Christian Brauner
Cc: David Rientjes
Cc: "Eric W . Biederman"
Cc: Greg Kroah-Hartman
Cc: "Guilherme G . Piccoli"
Cc: Iurii Zaikin
Cc: Ivan Teterevkov
Cc: Kees Cook
Cc: Masami Hiramatsu
Cc: Matthew Wilcox
Cc: Michal Hocko
Cc: Michal Hocko
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20200427180433.7029-5-vbabka@suse.cz
Signed-off-by: Linus Torvalds

Vlastimil Babka
2020-06-09 02:05:56 +0800
b467f3ef3 kernel/hung_task convert hung_task_panic boot parameter to sysctl

We can now handle sysctl parameters on the kernel command line and have
infrastructure to convert legacy command line options that duplicate
a sysctl to become a sysctl alias.

This patch converts the hung_task_panic parameter. Note that the sysctl
handler is more strict and allows only 0 and 1, while the legacy
parameter allowed any non-zero value. But there is little reason anyone
would not be using 1.

Signed-off-by: Vlastimil Babka
Signed-off-by: Andrew Morton
Reviewed-by: Kees Cook
Acked-by: Michal Hocko
Cc: Alexey Dobriyan
Cc: Christian Brauner
Cc: David Rientjes
Cc: "Eric W . Biederman"
Cc: Greg Kroah-Hartman
Cc: "Guilherme G . Piccoli"
Cc: Iurii Zaikin
Cc: Ivan Teterevkov
Cc: Luis Chamberlain
Cc: Masami Hiramatsu
Cc: Matthew Wilcox
Cc: Michal Hocko
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20200427180433.7029-4-vbabka@suse.cz
Signed-off-by: Linus Torvalds

Vlastimil Babka
2020-06-09 02:05:56 +0800
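
The behavioural difference noted in b467f3ef3 (the legacy parameter accepted any non-zero value, while the sysctl handler allows only 0 and 1) can be sketched as two small validation functions. This is a userspace illustration with hypothetical function names, not the kernel's parameter or sysctl parsing code.

```c
#include <assert.h>
#include <errno.h>

/* Legacy boot parameter semantics: any non-zero integer enables the
 * panic-on-hung-task behaviour. */
int legacy_hung_task_panic(int value)
{
	return value != 0;
}

/* Sysctl-style semantics: only 0 and 1 are valid. Anything else is
 * rejected with -EINVAL and the current setting is left untouched. */
int sysctl_hung_task_panic(int value, int *setting)
{
	if (value < 0 || value > 1)
		return -EINVAL;
	*setting = value;
	return 0;
}
```

A command line that used a value such as 5 would have enabled the panic under the legacy rule but is rejected under the stricter one. As the commit message notes, there is little reason anyone would use a value other than 1.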