Eric Lee / smarc-fsl-linux-kernel

13 Jan, 2021

1 commit

2179bae04 local64.h: make <asm/local64.h> mandatory ... Browse Code »

[ Upstream commit 87dbc209ea04645fd2351981f09eff5d23f8e2e9 ]

Make mandatory in include/asm-generic/Kbuild and
remove all arch/*/include/asm/local64.h arch-specific files since they
only #include .

This fixes build errors on arch/c6x/ and arch/nios2/ for
block/blk-iocost.c.

Build-tested on 21 of 25 arch-es. (tools problems on the others)

Yes, we could even rename to
and change all #includes to use
instead.

Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap
Suggested-by: Christoph Hellwig
Reviewed-by: Masahiro Yamada
Cc: Jens Axboe
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Aurelien Jacquiot
Cc: Peter Zijlstra
Cc: Arnd Bergmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin

Randy Dunlap
2021-01-13 03:18:16 +0800

30 Dec, 2020

1 commit

bbb7c059f sparc: fix handling of page table constructor failure ... Browse Code »

[ Upstream commit 06517c9a336f4c20f2064611bf4b1e7881a95fe1 ]

The page has just been allocated, so its refcount is 1. free_unref_page()
is for use on pages which have a zero refcount. Use __free_page() like
the other implementations of pte_alloc_one().

Link: https://lkml.kernel.org/r/20201125034655.27687-1-willy@infradead.org
Fixes: 1ae9ae5f7df7 ("sparc: handle pgtable_page_ctor() fail")
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: David Hildenbrand
Reviewed-by: Mike Rapoport
Acked-by: Vlastimil Babka
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin

Matthew Wilcox (Oracle)
2020-12-30 18:53:55 +0800

09 Dec, 2020

2 commits

c6f7e1510 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull sparc64 csum fix from Al Viro:
"Fix for a brown paperbag regression in sparc64"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[regression fix] really dumb fuckup in sparc64 __csum_partial_copy() changes

Linus Torvalds
2020-12-09 07:03:39 +0800
6220e48d9 [regression fix] really dumb fuckup in sparc64 __csum_partial_copy() changes ... Browse Code »

~0U is -1, not 1

Reported-by: Anatoly Pugachev
Tested-by: Anatoly Pugachev
Fixes: fdf8bee96f9a "sparc64: propagate the calling convention changes down to __csum_partial_copy_...()"
X-brown-paperbag: yes
Signed-off-by: Al Viro

Al Viro
2020-12-09 05:37:47 +0800

24 Nov, 2020

1 commit

58c644ba5 sched/idle: Fix arch_cpu_idle() vs tracing ... Browse Code »

We call arch_cpu_idle() with RCU disabled, but then use
local_irq_{en,dis}able(), which invokes tracing, which relies on RCU.

Switch all arch_cpu_idle() implementations to use
raw_local_irq_{en,dis}able() and carefully manage the
lockdep,rcu,tracing state like we do in entry.

(XXX: we really should change arch_cpu_idle() to not return with
interrupts enabled)

Reported-by: Sven Schnelle
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Mark Rutland
Tested-by: Mark Rutland
Link: https://lkml.kernel.org/r/20201120114925.594122626@infradead.org

Peter Zijlstra
2020-11-24 23:47:35 +0800

26 Oct, 2020

1 commit

33def8498 treewide: Convert macro and uses of __section(foo) to __section("foo") ... Browse Code »

Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches
Reviewed-by: Nick Desaulniers
Reviewed-by: Miguel Ojeda
Signed-off-by: Linus Torvalds

Joe Perches
2020-10-26 05:51:49 +0800

24 Oct, 2020

1 commit

4a22709e2 Merge tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block ... Browse Code »

Pull arch task_work cleanups from Jens Axboe:
"Two cleanups that don't fit other categories:

- Finally get the task_work_add() cleanup done properly, so we don't
have random 0/1/false/true/TWA_SIGNAL confusing use cases. Updates
all callers, and also fixes up the documentation for
task_work_add().

- While working on some TIF related changes for 5.11, this
TIF_NOTIFY_RESUME cleanup fell out of that. Remove some arch
duplication for how that is handled"

* tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block:
task_work: cleanup notification modes
tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()

Linus Torvalds
2020-10-24 01:06:38 +0800

23 Oct, 2020

3 commits

746b25b1a Merge tag 'kbuild-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild ... Browse Code »

Pull Kbuild updates from Masahiro Yamada:

- Support 'make compile_commands.json' to generate the compilation
database more easily, avoiding stale entries

- Support 'make clang-analyzer' and 'make clang-tidy' for static checks
using clang-tidy

- Preprocess scripts/modules.lds.S to allow CONFIG options in the
module linker script

- Drop cc-option tests from compiler flags supported by our minimal
GCC/Clang versions

- Use always 12-digits commit hash for CONFIG_LOCALVERSION_AUTO=y

- Use sha1 build id for both BFD linker and LLD

- Improve deb-pkg for reproducible builds and rootless builds

- Remove stale, useless scripts/namespace.pl

- Turn -Wreturn-type warning into error

- Fix build error of deb-pkg when CONFIG_MODULES=n

- Replace 'hostname' command with more portable 'uname -n'

- Various Makefile cleanups

* tag 'kbuild-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
kbuild: Use uname for LINUX_COMPILE_HOST detection
kbuild: Only add -fno-var-tracking-assignments for old GCC versions
kbuild: remove leftover comment for filechk utility
treewide: remove DISABLE_LTO
kbuild: deb-pkg: clean up package name variables
kbuild: deb-pkg: do not build linux-headers package if CONFIG_MODULES=n
kbuild: enforce -Werror=return-type
scripts: remove namespace.pl
builddeb: Add support for all required debian/rules targets
builddeb: Enable rootless builds
builddeb: Pass -n to gzip for reproducible packages
kbuild: split the build log of kallsyms
kbuild: explicitly specify the build id style
scripts/setlocalversion: make git describe output more reliable
kbuild: remove cc-option test of -Werror=date-time
kbuild: remove cc-option test of -fno-stack-check
kbuild: remove cc-option test of -fno-strict-overflow
kbuild: move CFLAGS_{KASAN,UBSAN,KCSAN} exports to relevant Makefiles
kbuild: remove redundant CONFIG_KASAN check from scripts/Makefile.kasan
kbuild: do not create built-in objects for external module builds
...

Linus Torvalds
2020-10-23 04:13:57 +0800
00937f36b Merge tag 'pci-v5.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci ... Browse Code »

Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Print IRQ number used by PCIe Link Bandwidth Notification (Dongdong
Liu)
- Add schedule point in pci_read_config() to reduce max latency
(Jiang Biao)
- Add Kconfig options for MPS/MRRS strategy (Jim Quinlan)

Resource management:
- Fix pci_iounmap() memory leak when !CONFIG_GENERIC_IOMAP (Lorenzo
Pieralisi)

PCIe native device hotplug:
- Reduce noisiness on hot removal (Lukas Wunner)

Power management:
- Revert "PCI/PM: Apply D2 delay as milliseconds, not microseconds"
that was done on the basis of spec typo (Bjorn Helgaas)
- Rename pci_dev.d3_delay to d3hot_delay to remove D3hot/D3cold
ambiguity (Krzysztof Wilczyński)
- Remove unused pcibios_pm_ops (Vaibhav Gupta)

IOMMU:
- Enable Translation Blocking for external devices to harden against
DMA attacks (Rajat Jain)

Error handling:
- Add an ACPI APEI notifier chain for vendor CPER records to enable
device-specific error handling (Shiju Jose)

ASPM:
- Remove struct aspm_register_info to simplify code (Saheed O.
Bolarinwa)

Amlogic Meson PCIe controller driver:
- Build as module by default (Kevin Hilman)

Ampere Altra PCIe controller driver:
- Add MCFG quirk to work around non-standard ECAM implementation
(Tuan Phan)

Broadcom iProc PCIe controller driver:
- Set affinity mask on MSI interrupts (Mark Tomlinson)

Broadcom STB PCIe controller driver:
- Make PCIE_BRCMSTB depend on ARCH_BRCMSTB (Jim Quinlan)
- Add DT bindings for more Brcmstb chips (Jim Quinlan)
- Add bcm7278 register info (Jim Quinlan)
- Add bcm7278 PERST# support (Jim Quinlan)
- Add suspend and resume pm_ops (Jim Quinlan)
- Add control of rescal reset (Jim Quinlan)
- Set additional internal memory DMA viewport sizes (Jim Quinlan)
- Accommodate MSI for older chips (Jim Quinlan)
- Set bus max burst size by chip type (Jim Quinlan)
- Add support for bcm7211, bcm7216, bcm7445, bcm7278 (Jim Quinlan)

Freescale i.MX6 PCIe controller driver:
- Use dev_err_probe() to reduce redundant messages (Anson Huang)

Freescale Layerscape PCIe controller driver:
- Enforce 4K DMA buffer alignment in endpoint test (Hou Zhiqiang)
- Add DT compatible strings for ls1088a, ls2088a (Xiaowei Bao)
- Add endpoint support for ls1088a, ls2088a (Xiaowei Bao)
- Add endpoint test support for lS1088a (Xiaowei Bao)
- Add MSI-X support for ls1088a (Xiaowei Bao)

HiSilicon HIP PCIe controller driver:
- Handle HIP-specific errors via ACPI APEI (Yicong Yang)

HiSilicon Kirin PCIe controller driver:
- Return -EPROBE_DEFER if the GPIO isn't ready (Bean Huo)

Intel VMD host bridge driver:
- Factor out physical offset, bus offset, IRQ domain, IRQ allocation
(Jon Derrick)
- Use generic PCI PM correctly (Jon Derrick)

Marvell Aardvark PCIe controller driver:
- Fix compilation on s390 (Pali Rohár)
- Implement driver 'remove' function and allow to build it as module
(Pali Rohár)
- Move PCIe reset card code to advk_pcie_train_link() (Pali Rohár)
- Convert mvebu a3700 internal SMCC firmware return codes to errno
(Pali Rohár)
- Fix initialization with old Marvell's Arm Trusted Firmware (Pali
Rohár)

Microsoft Hyper-V host bridge driver:
- Fix hibernation in case interrupts are not re-created (Dexuan Cui)

NVIDIA Tegra PCIe controller driver:
- Stop checking return value of debugfs_create() functions (Greg
Kroah-Hartman)
- Convert to use DEFINE_SEQ_ATTRIBUTE macro (Liu Shixin)

Qualcomm PCIe controller driver:
- Reset PCIe to work around Qsdk U-Boot issue (Ansuel Smith)

Renesas R-Car PCIe controller driver:
- Add DT documentation for r8a774a1, r8a774b1, r8a774e1 endpoints
(Lad Prabhakar)
- Add RZ/G2M, RZ/G2N, RZ/G2H IDs to endpoint test (Lad Prabhakar)
- Add DT support for r8a7742 (Lad Prabhakar)

Socionext UniPhier Pro5 controller driver:
- Add DT descriptions of iATU register (host and endpoint) (Kunihiko
Hayashi)

Synopsys DesignWare PCIe controller driver:
- Add link up check in dw_child_pcie_ops.map_bus() (racy, but seems
unavoidable) (Hou Zhiqiang)
- Fix endpoint Header Type check so multi-function devices work (Hou
Zhiqiang)
- Skip PCIE_MSI_INTR0* programming if MSI is disabled (Jisheng Zhang)
- Stop leaking MSI page in suspend/resume (Jisheng Zhang)
- Add common iATU register support instead of keystone-specific code
(Kunihiko Hayashi)
- Major config space access and other cleanups in dwc core and
drivers that use it (al, exynos, histb, imx6, intel-gw, keystone,
kirin, meson, qcom, tegra) (Rob Herring)
- Add multiple PFs support for endpoint (Xiaowei Bao)
- Add MSI-X doorbell mode in endpoint mode (Xiaowei Bao)

Miscellaneous:
- Use fallthrough pseudo-keyword (Gustavo A. R. Silva)
- Fix "0 used as NULL pointer" warnings (Gustavo Pimentel)
- Fix "cast truncates bits from constant value" warnings (Gustavo
Pimentel)
- Remove redundant zeroing for sg_init_table() (Julia Lawall)
- Use scnprintf(), not snprintf(), in sysfs "show" functions
(Krzysztof Wilczyński)
- Remove unused assignments (Krzysztof Wilczyński)
- Fix "0 used as NULL pointer" warning (Krzysztof Wilczyński)
- Simplify bool comparisons (Krzysztof Wilczyński)
- Use for_each_child_of_node() and for_each_node_by_name() (Qinglang
Miao)
- Simplify return expressions (Qinglang Miao)"

* tag 'pci-v5.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (147 commits)
PCI: vmd: Update VMD PM to correctly use generic PCI PM
PCI: vmd: Create IRQ allocation helper
PCI: vmd: Create IRQ Domain configuration helper
PCI: vmd: Create bus offset configuration helper
PCI: vmd: Create physical offset helper
PCI: v3-semi: Remove unneeded break
PCI: dwc: Add link up check in dw_child_pcie_ops.map_bus()
PCI/ASPM: Remove struct pcie_link_state.l1ss
PCI/ASPM: Remove struct aspm_register_info.l1ss_cap
PCI/ASPM: Pass L1SS Capabilities value, not struct aspm_register_info
PCI/ASPM: Remove struct aspm_register_info.l1ss_ctl1
PCI/ASPM: Remove struct aspm_register_info.l1ss_ctl2 (unused)
PCI/ASPM: Remove struct aspm_register_info.l1ss_cap_ptr
PCI/ASPM: Remove struct aspm_register_info.latency_encoding
PCI/ASPM: Remove struct aspm_register_info.enabled
PCI/ASPM: Remove struct aspm_register_info.support
PCI/ASPM: Use 'parent' and 'child' for readability
PCI/ASPM: Move LTR path check to where it's used
PCI/ASPM: Move pci_clear_and_set_dword() earlier
PCI: dwc: Fix MSI page leakage in suspend/resume
...

Linus Torvalds
2020-10-23 03:41:00 +0800
f56e65dff Merge branch 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull initial set_fs() removal from Al Viro:
"Christoph's set_fs base series + fixups"

* 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Allow a NULL pos pointer to __kernel_read
fs: Allow a NULL pos pointer to __kernel_write
powerpc: remove address space overrides using set_fs()
powerpc: use non-set_fs based maccess routines
x86: remove address space overrides using set_fs()
x86: make TASK_SIZE_MAX usable from assembly code
x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h
lkdtm: remove set_fs-based tests
test_bitmap: remove user bitmap tests
uaccess: add infrastructure for kernel builds with set_fs()
fs: don't allow splice read/write without explicit ops
fs: don't allow kernel reads and writes without iter ops
sysctl: Convert to iter interfaces
proc: add a read_iter method to proc proc_ops
proc: cleanup the compat vs no compat file ops
proc: remove a level of indentation in proc_get_inode

Linus Torvalds
2020-10-23 00:59:21 +0800

20 Oct, 2020

1 commit

0f6372e52 treewide: remove DISABLE_LTO ... Browse Code »

This change removes all instances of DISABLE_LTO from
Makefiles, as they are currently unused, and the preferred
method of disabling LTO is to filter out the flags instead.

Note added by Masahiro Yamada:
DISABLE_LTO was added as preparation for GCC LTO, but GCC LTO was
not pulled into the mainline. (https://lkml.org/lkml/2014/4/8/272)

Suggested-by: Kees Cook
Signed-off-by: Sami Tolvanen
Reviewed-by: Kees Cook
Signed-off-by: Masahiro Yamada

Sami Tolvanen
2020-10-20 23:28:53 +0800

19 Oct, 2020

1 commit

ecb8ac8b1 mm/madvise: introduce process_madvise() syscall: an external memory hinting API ... Browse Code »

There is usecase that System Management Software(SMS) want to give a
memory hint like MADV_[COLD|PAGEEOUT] to other processes and in the
case of Android, it is the ActivityManagerService.

The information required to make the reclaim decision is not known to the
app. Instead, it is known to the centralized userspace
daemon(ActivityManagerService), and that daemon must be able to initiate
reclaim on its own without any app involvement.

To solve the issue, this patch introduces a new syscall
process_madvise(2). It uses pidfd of an external process to give the
hint. It also supports vector address range because Android app has
thousands of vmas due to zygote so it's totally waste of CPU and power if
we should call the syscall one by one for each vma.(With testing 2000-vma
syscall vs 1-vector syscall, it showed 15% performance improvement. I
think it would be bigger in real practice because the testing ran very
cache friendly environment).

Another potential use case for the vector range is to amortize the cost
ofTLB shootdowns for multiple ranges when using MADV_DONTNEED; this could
benefit users like TCP receive zerocopy and malloc implementations. In
future, we could find more usecases for other advises so let's make it
happens as API since we introduce a new syscall at this moment. With
that, existing madvise(2) user could replace it with process_madvise(2)
with their own pid if they want to have batch address ranges support
feature.

ince it could affect other process's address range, only privileged
process(PTRACE_MODE_ATTACH_FSCREDS) or something else(e.g., being the same
UID) gives it the right to ptrace the process could use it successfully.
The flag argument is reserved for future use if we need to extend the API.

I think supporting all hints madvise has/will supported/support to
process_madvise is rather risky. Because we are not sure all hints make
sense from external process and implementation for the hint may rely on
the caller being in the current context so it could be error-prone. Thus,
I just limited hints as MADV_[COLD|PAGEOUT] in this patch.

If someone want to add other hints, we could hear the usecase and review
it for each hint. It's safer for maintenance rather than introducing a
buggy syscall but hard to fix it later.

So finally, the API is as follows,

ssize_t process_madvise(int pidfd, const struct iovec *iovec,
unsigned long vlen, int advice, unsigned int flags);

DESCRIPTION
The process_madvise() system call is used to give advice or directions
to the kernel about the address ranges from external process as well as
local process. It provides the advice to address ranges of process
described by iovec and vlen. The goal of such advice is to improve
system or application performance.

The pidfd selects the process referred to by the PID file descriptor
specified in pidfd. (See pidofd_open(2) for further information)

The pointer iovec points to an array of iovec structures, defined in
as:

struct iovec {
void *iov_base; /* starting address */
size_t iov_len; /* number of bytes to be advised */
};

The iovec describes address ranges beginning at address(iov_base)
and with size length of bytes(iov_len).

The vlen represents the number of elements in iovec.

The advice is indicated in the advice argument, which is one of the
following at this moment if the target process specified by pidfd is
external.

MADV_COLD
MADV_PAGEOUT

Permission to provide a hint to external process is governed by a
ptrace access mode PTRACE_MODE_ATTACH_FSCREDS check; see ptrace(2).

The process_madvise supports every advice madvise(2) has if target
process is in same thread group with calling process so user could
use process_madvise(2) to extend existing madvise(2) to support
vector address ranges.

RETURN VALUE
On success, process_madvise() returns the number of bytes advised.
This return value may be less than the total number of requested
bytes, if an error occurred. The caller should check return value
to determine whether a partial advice occurred.

FAQ:

Q.1 - Why does any external entity have better knowledge?

Quote from Sandeep

"For Android, every application (including the special SystemServer)
are forked from Zygote. The reason of course is to share as many
libraries and classes between the two as possible to benefit from the
preloading during boot.

After applications start, (almost) all of the APIs end up calling into
this SystemServer process over IPC (binder) and back to the
application.

In a fully running system, the SystemServer monitors every single
process periodically to calculate their PSS / RSS and also decides
which process is "important" to the user for interactivity.

So, because of how these processes start _and_ the fact that the
SystemServer is looping to monitor each process, it does tend to *know*
which address range of the application is not used / useful.

Besides, we can never rely on applications to clean things up
themselves. We've had the "hey app1, the system is low on memory,
please trim your memory usage down" notifications for a long time[1].
They rely on applications honoring the broadcasts and very few do.

So, if we want to avoid the inevitable killing of the application and
restarting it, some way to be able to tell the OS about unimportant
memory in these applications will be useful.

- ssp

Q.2 - How to guarantee the race(i.e., object validation) between when
giving a hint from an external process and get the hint from the target
process?

process_madvise operates on the target process's address space as it
exists at the instant that process_madvise is called. If the space
target process can run between the time the process_madvise process
inspects the target process address space and the time that
process_madvise is actually called, process_madvise may operate on
memory regions that the calling process does not expect. It's the
responsibility of the process calling process_madvise to close this
race condition. For example, the calling process can suspend the
target process with ptrace, SIGSTOP, or the freezer cgroup so that it
doesn't have an opportunity to change its own address space before
process_madvise is called. Another option is to operate on memory
regions that the caller knows a priori will be unchanged in the target
process. Yet another option is to accept the race for certain
process_madvise calls after reasoning that mistargeting will do no
harm. The suggested API itself does not provide synchronization. It
also apply other APIs like move_pages, process_vm_write.

The race isn't really a problem though. Why is it so wrong to require
that callers do their own synchronization in some manner? Nobody
objects to write(2) merely because it's possible for two processes to
open the same file and clobber each other's writes --- instead, we tell
people to use flock or something. Think about mmap. It never
guarantees newly allocated address space is still valid when the user
tries to access it because other threads could unmap the memory right
before. That's where we need synchronization by using other API or
design from userside. It shouldn't be part of API itself. If someone
needs more fine-grained synchronization rather than process level,
there were two ideas suggested - cookie[2] and anon-fd[3]. Both are
applicable via using last reserved argument of the API but I don't
think it's necessary right now since we have already ways to prevent
the race so don't want to add additional complexity with more
fine-grained optimization model.

To make the API extend, it reserved an unsigned long as last argument
so we could support it in future if someone really needs it.

Q.3 - Why doesn't ptrace work?

Injecting an madvise in the target process using ptrace would not work
for us because such injected madvise would have to be executed by the
target process, which means that process would have to be runnable and
that creates the risk of the abovementioned race and hinting a wrong
VMA. Furthermore, we want to act the hint in caller's context, not the
callee's, because the callee is usually limited in cpuset/cgroups or
even freezed state so they can't act by themselves quick enough, which
causes more thrashing/kill. It doesn't work if the target process are
ptraced(e.g., strace, debugger, minidump) because a process can have at
most one ptracer.

[1] https://developer.android.com/topic/performance/memory"

[2] process_getinfo for getting the cookie which is updated whenever
vma of process address layout are changed - Daniel Colascione -
https://lore.kernel.org/lkml/20190520035254.57579-1-minchan@kernel.org/T/#m7694416fd179b2066a2c62b5b139b14e3894e224

[3] anonymous fd which is used for the object(i.e., address range)
validation - Michal Hocko -
https://lore.kernel.org/lkml/20200120112722.GY18451@dhcp22.suse.cz/

[minchan@kernel.org: fix process_madvise build break for arm64]
Link: http://lkml.kernel.org/r/20200303145756.GA219683@google.com
[minchan@kernel.org: fix build error for mips of process_madvise]
Link: http://lkml.kernel.org/r/20200508052517.GA197378@google.com
[akpm@linux-foundation.org: fix patch ordering issue]
[akpm@linux-foundation.org: fix arm64 whoops]
[minchan@kernel.org: make process_madvise() vlen arg have type size_t, per Florian]
[akpm@linux-foundation.org: fix i386 build]
[sfr@canb.auug.org.au: fix syscall numbering]
Link: https://lkml.kernel.org/r/20200905142639.49fc3f1a@canb.auug.org.au
[sfr@canb.auug.org.au: madvise.c needs compat.h]
Link: https://lkml.kernel.org/r/20200908204547.285646b4@canb.auug.org.au
[minchan@kernel.org: fix mips build]
Link: https://lkml.kernel.org/r/20200909173655.GC2435453@google.com
[yuehaibing@huawei.com: remove duplicate header which is included twice]
Link: https://lkml.kernel.org/r/20200915121550.30584-1-yuehaibing@huawei.com
[minchan@kernel.org: do not use helper functions for process_madvise]
Link: https://lkml.kernel.org/r/20200921175539.GB387368@google.com
[akpm@linux-foundation.org: pidfd_get_pid() gained an argument]
[sfr@canb.auug.org.au: fix up for "iov_iter: transparently handle compat iovecs in import_iovec"]
Link: https://lkml.kernel.org/r/20200928212542.468e1fef@canb.auug.org.au

Signed-off-by: Minchan Kim
Signed-off-by: YueHaibing
Signed-off-by: Stephen Rothwell
Signed-off-by: Andrew Morton
Reviewed-by: Suren Baghdasaryan
Reviewed-by: Vlastimil Babka
Acked-by: David Rientjes
Cc: Alexander Duyck
Cc: Brian Geffon
Cc: Christian Brauner
Cc: Daniel Colascione
Cc: Jann Horn
Cc: Jens Axboe
Cc: Joel Fernandes
Cc: Johannes Weiner
Cc: John Dias
Cc: Kirill Tkhai
Cc: Michal Hocko
Cc: Oleksandr Natalenko
Cc: Sandeep Patil
Cc: SeongJae Park
Cc: SeongJae Park
Cc: Shakeel Butt
Cc: Sonny Rao
Cc: Tim Murray
Cc: Christian Brauner
Cc: Florian Weimer
Cc:
Link: http://lkml.kernel.org/r/20200302193630.68771-3-minchan@kernel.org
Link: http://lkml.kernel.org/r/20200508183320.GA125527@google.com
Link: http://lkml.kernel.org/r/20200622192900.22757-4-minchan@kernel.org
Link: https://lkml.kernel.org/r/20200901000633.1920247-4-minchan@kernel.org
Signed-off-by: Linus Torvalds

Minchan Kim
2020-10-19 00:27:10 +0800

18 Oct, 2020

1 commit

3c532798e tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume() ... Browse Code »

All the callers currently do this, clean it up and move the clearing
into tracehook_notify_resume() instead.

Reviewed-by: Oleg Nesterov
Reviewed-by: Thomas Gleixner
Signed-off-by: Jens Axboe

Jens Axboe
2020-10-18 05:04:36 +0800

17 Oct, 2020

1 commit

96685f866 Merge tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux ... Browse Code »

Pull powerpc updates from Michael Ellerman:

- A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting
it for powerpc, as well as a related fix for sparc.

- Remove support for PowerPC 601.

- Some fixes for watchpoints & addition of a new ptrace flag for
detecting ISA v3.1 (Power10) watchpoint features.

- A fix for kernels using 4K pages and the hash MMU on bare metal
Power9 systems with > 16TB of RAM, or RAM on the 2nd node.

- A basic idle driver for shallow stop states on Power10.

- Tweaks to our sched domains code to better inform the scheduler about
the hardware topology on Power9/10, where two SMT4 cores can be
presented by firmware as an SMT8 core.

- A series doing further reworks & cleanups of our EEH code.

- Addition of a filter for RTAS (firmware) calls done via sys_rtas(),
to prevent root from overwriting kernel memory.

- Other smaller features, fixes & cleanups.

Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe
Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn
Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero,
Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad
Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca
Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas
Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro
Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang
Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha,
Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt,
Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain,
Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang
Yingliang, zhengbin.

* tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits)
Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed"
selftests/powerpc: Fix eeh-basic.sh exit codes
cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier
powerpc/time: Make get_tb() common to PPC32 and PPC64
powerpc/time: Make get_tbl() common to PPC32 and PPC64
powerpc/time: Remove get_tbu()
powerpc/time: Avoid using get_tbl() and get_tbu() internally
powerpc/time: Make mftb() common to PPC32 and PPC64
powerpc/time: Rename mftbl() to mftb()
powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S
powerpc/32s: Rename head_32.S to head_book3s_32.S
powerpc/32s: Setup the early hash table at all time.
powerpc/time: Remove ifdef in get_dec() and set_dec()
powerpc: Remove get_tb_or_rtc()
powerpc: Remove __USE_RTC()
powerpc: Tidy up a bit after removal of PowerPC 601.
powerpc: Remove support for PowerPC 601
powerpc: Remove PowerPC 601
powerpc: Drop SYNC_601() ISYNC_601() and SYNC()
powerpc: Remove CONFIG_PPC601_SYNC_FIX
...

Linus Torvalds
2020-10-17 03:21:15 +0800

16 Oct, 2020

1 commit

5a32c3413 Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping ... Browse Code »

Pull dma-mapping updates from Christoph Hellwig:

- rework the non-coherent DMA allocator

- move private definitions out of

- lower CMA_ALIGNMENT (Paul Cercueil)

- remove the omap1 dma address translation in favor of the common code

- make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

- support per-node DMA CMA areas (Barry Song)

- increase the default seg boundary limit (Nicolin Chen)

- misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

- various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
ARM/ixp4xx: add a missing include of dma-map-ops.h
dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
dma-direct: factor out a dma_direct_alloc_from_pool helper
dma-direct check for highmem pages in dma_direct_alloc_pages
dma-mapping: merge into
dma-mapping: move large parts of to kernel/dma
dma-mapping: move dma-debug.h to kernel/dma/
dma-mapping: remove
dma-mapping: merge into
dma-contiguous: remove dma_contiguous_set_default
dma-contiguous: remove dev_set_cma_area
dma-contiguous: remove dma_declare_contiguous
dma-mapping: split
cma: decrease CMA_ALIGNMENT lower limit to 2
firewire-ohci: use dma_alloc_pages
dma-iommu: implement ->alloc_noncoherent
dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
dma-mapping: add a new dma_alloc_pages API
dma-mapping: remove dma_cache_sync
53c700: convert to dma_alloc_noncoherent
...

Linus Torvalds
2020-10-16 05:43:29 +0800

15 Oct, 2020

2 commits

612e7a4c1 Merge tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux ... Browse Code »

Pull kernel_clone() updates from Christian Brauner:
"During the v5.9 merge window we reworked the process creation
codepaths across multiple architectures. After this work we were only
left with the _do_fork() helper based on the struct kernel_clone_args
calling convention. As was pointed out _do_fork() isn't valid
kernelese especially for a helper that isn't just static.

This series removes the _do_fork() helper and introduces the new
kernel_clone() helper. The process creation cleanup didn't change the
name to something more reasonable mainly because _do_fork() was used
in quite a few places. So sending this as a separate series seemed the
better strategy.

I originally intended to send this early in the v5.9 development cycle
after the merge window had closed but given that this was touching
quite a few places I decided to defer this until the v5.10 merge
window"

* tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
sched: remove _do_fork()
tracing: switch to kernel_clone()
kgdbts: switch to kernel_clone()
kprobes: switch to kernel_clone()
x86: switch to kernel_clone()
sparc: switch to kernel_clone()
nios2: switch to kernel_clone()
m68k: switch to kernel_clone()
ia64: switch to kernel_clone()
h8300: switch to kernel_clone()
fork: introduce kernel_clone()

Linus Torvalds
2020-10-15 05:32:52 +0800
d5660df4a Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge misc updates from Andrew Morton:
"181 patches.

Subsystems affected by this patch series: kbuild, scripts, ntfs,
ocfs2, vfs, mm (slab, slub, kmemleak, dax, debug, pagecache, fadvise,
gup, swap, memremap, memcg, selftests, pagemap, mincore, hmm, dma,
memory-failure, vmallo and migration)"

* emailed patches from Andrew Morton : (181 commits)
mm/migrate: remove obsolete comment about device public
mm/migrate: remove cpages-- in migrate_vma_finalize()
mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary
memblock: use separate iterators for memory and reserved regions
memblock: implement for_each_reserved_mem_region() using __next_mem_region()
memblock: remove unused memblock_mem_size()
x86/setup: simplify reserve_crashkernel()
x86/setup: simplify initrd relocation and reservation
arch, drivers: replace for_each_membock() with for_each_mem_range()
arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
memblock: reduce number of parameters in for_each_mem_range()
memblock: make memblock_debug and related functionality private
memblock: make for_each_memblock_type() iterator private
mircoblaze: drop unneeded NUMA and sparsemem initializations
riscv: drop unneeded node initialization
h8300, nds32, openrisc: simplify detection of memory extents
arm64: numa: simplify dummy_numa_init()
arm, xtensa: simplify initialization of high memory pages
dma-contiguous: simplify cma_early_percent_memory()
KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
...

Linus Torvalds
2020-10-15 00:57:24 +0800

14 Oct, 2020

2 commits

b10d6bca8 arch, drivers: replace for_each_membock() with for_each_mem_range() ... Browse Code »

There are several occurrences of the following pattern:

for_each_memblock(memory, reg) {
start = __pfn_to_phys(memblock_region_memory_base_pfn(reg);
end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));

/* do something with start and end */
}

Using for_each_mem_range() iterator is more appropriate in such cases and
allows simpler and cleaner code.

[akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
[rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com

Signed-off-by: Mike Rapoport
Signed-off-by: Andrew Morton
Cc: Andy Lutomirski
Cc: Baoquan He
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Catalin Marinas
Cc: Christoph Hellwig
Cc: Daniel Axtens
Cc: Dave Hansen
Cc: Emil Renner Berthing
Cc: Hari Bathini
Cc: Ingo Molnar
Cc: Ingo Molnar
Cc: Jonathan Cameron
Cc: Marek Szyprowski
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Miguel Ojeda
Cc: Palmer Dabbelt
Cc: Paul Mackerras
Cc: Paul Walmsley
Cc: Peter Zijlstra
Cc: Russell King
Cc: Stafford Horne
Cc: Thomas Bogendoerfer
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yoshinori Sato
Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org
Signed-off-by: Linus Torvalds

Mike Rapoport
2020-10-14 09:38:35 +0800
8b05418b2 Merge tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux ... Browse Code »

Pull seccomp updates from Kees Cook:
"The bulk of the changes are with the seccomp selftests to accommodate
some powerpc-specific behavioral characteristics. Additional cleanups,
fixes, and improvements are also included:

- heavily refactor seccomp selftests (and clone3 selftests
dependency) to fix powerpc (Kees Cook, Thadeu Lima de Souza
Cascardo)

- fix style issue in selftests (Zou Wei)

- upgrade "unknown action" from KILL_THREAD to KILL_PROCESS (Rich
Felker)

- replace task_pt_regs(current) with current_pt_regs() (Denis
Efremov)

- fix corner-case race in USER_NOTIF (Jann Horn)

- make CONFIG_SECCOMP no longer per-arch (YiFei Zhu)"

* tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
seccomp: Make duplicate listener detection non-racy
seccomp: Move config option SECCOMP to arch/Kconfig
selftests/clone3: Avoid OS-defined clone_args
selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit
selftests/seccomp: Allow syscall nr and ret value to be set separately
selftests/seccomp: Record syscall during ptrace entry
selftests/seccomp: powerpc: Fix seccomp return value testing
selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET
selftests/seccomp: Avoid redundant register flushes
selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG
selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG
selftests/seccomp: Remove syscall setting #ifdefs
selftests/seccomp: mips: Remove O32-specific macro
selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro
selftests/seccomp: arm: Define SYSCALL_NUM_SET macro
selftests/seccomp: mips: Define SYSCALL_NUM_SET macro
selftests/seccomp: Provide generic syscall setting macro
selftests/seccomp: Refactor arch register macros to avoid xtensa special case
selftests/seccomp: Use __NR_mknodat instead of __NR_mknod
selftests/seccomp: Use bitwise instead of arithmetic operator for flags
...

Linus Torvalds
2020-10-14 07:33:43 +0800

13 Oct, 2020

6 commits

22230cd2c Merge branch 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull compat mount cleanups from Al Viro:
"The last remnants of mount(2) compat buried by Christoph.

Buried into NFS, that is.

Generally I'm less enthusiastic about "let's use in_compat_syscall()
deep in call chain" kind of approach than Christoph seems to be, but
in this case it's warranted - that had been an NFS-specific wart,
hopefully not to be repeated in any other filesystems (read: any new
filesystem introducing non-text mount options will get NAKed even if
it doesn't mess the layout up).

IOW, not worth trying to grow an infrastructure that would avoid that
use of in_compat_syscall()..."

* 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: remove compat_sys_mount
fs,nfs: lift compat nfs4 mount data handling into the nfs code
nfs: simplify nfs4_parse_monolithic

Linus Torvalds
2020-10-13 07:44:57 +0800
e18afa5bf Merge branch 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull compat quotactl cleanups from Al Viro:
"More Christoph's compat cleanups: quotactl(2)"

* 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
quota: simplify the quotactl compat handling
compat: add a compat_need_64bit_alignment_fixup() helper
compat: lift compat_s64 and compat_u64 to

Linus Torvalds
2020-10-13 07:37:13 +0800
85ed13e78 Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull compat iovec cleanups from Al Viro:
"Christoph's series around import_iovec() and compat variant thereof"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
security/keys: remove compat_keyctl_instantiate_key_iov
mm: remove compat_process_vm_{readv,writev}
fs: remove compat_sys_vmsplice
fs: remove the compat readv/writev syscalls
fs: remove various compat readv/writev helpers
iov_iter: transparently handle compat iovecs in import_iovec
iov_iter: refactor rw_copy_check_uvector and import_iovec
iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
compat.h: fix a spelling error in

Linus Torvalds
2020-10-13 07:35:51 +0800
c90578360 Merge branch 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull copy_and_csum cleanups from Al Viro:
"Saner calling conventions for csum_and_copy_..._user() and friends"

[ Removing 800+ lines of code and cleaning stuff up is good - Linus ]

* 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ppc: propagate the calling conventions change down to csum_partial_copy_generic()
amd64: switch csum_partial_copy_generic() to new calling conventions
sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
mips: propagate the calling convention change down into __csum_partial_copy_..._user()
mips: __csum_partial_copy_kernel() has no users left
mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
i386: propagate the calling conventions change down to csum_partial_copy_generic()
sh: propage the calling conventions change down to csum_partial_copy_generic()
m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
arm: propagate the calling convention changes down to csum_partial_copy_from_user()
alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
saner calling conventions for csum_and_copy_..._user()
csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
csum_partial_copy_nocheck(): drop the last argument
unify generic instances of csum_partial_copy_nocheck()
icmp_push_reply(): reorder adding the checksum up
skb_copy_and_csum_bits(): don't bother with the last argument

Linus Torvalds
2020-10-13 07:24:13 +0800
1c6890707 Merge tag 'perf-kprobes-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf/kprobes updates from Ingo Molnar:
"This prepares to unify the kretprobe trampoline handler and make
kretprobe lockless (those patches are still work in progress)"

* tag 'perf-kprobes-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
kprobes: Make local functions static
kprobes: Free kretprobe_instance with RCU callback
kprobes: Remove NMI context check
sparc: kprobes: Use generic kretprobe trampoline handler
sh: kprobes: Use generic kretprobe trampoline handler
s390: kprobes: Use generic kretprobe trampoline handler
powerpc: kprobes: Use generic kretprobe trampoline handler
parisc: kprobes: Use generic kretprobe trampoline handler
mips: kprobes: Use generic kretprobe trampoline handler
ia64: kprobes: Use generic kretprobe trampoline handler
csky: kprobes: Use generic kretprobe trampoline handler
arc: kprobes: Use generic kretprobe trampoline handler
arm64: kprobes: Use generic kretprobe trampoline handler
arm: kprobes: Use generic kretprobe trampoline handler
x86/kprobes: Use generic kretprobe trampoline handler
kprobes: Add generic kretprobe trampoline handler

Linus Torvalds
2020-10-13 05:21:15 +0800
34eb62d86 Merge tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull orphan section checking from Ingo Molnar:
"Orphan link sections were a long-standing source of obscure bugs,
because the heuristics that various linkers & compilers use to handle
them (include these bits into the output image vs discarding them
silently) are both highly idiosyncratic and also version dependent.

Instead of this historically problematic mess, this tree by Kees Cook
(et al) adds build time asserts and build time warnings if there's any
orphan section in the kernel or if a section is not sized as expected.

And because we relied on so many silent assumptions in this area, fix
a metric ton of dependencies and some outright bugs related to this,
before we can finally enable the checks on the x86, ARM and ARM64
platforms"

* tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
x86/boot/compressed: Warn on orphan section placement
x86/build: Warn on orphan section placement
arm/boot: Warn on orphan section placement
arm/build: Warn on orphan section placement
arm64/build: Warn on orphan section placement
x86/boot/compressed: Add missing debugging sections to output
x86/boot/compressed: Remove, discard, or assert for unwanted sections
x86/boot/compressed: Reorganize zero-size section asserts
x86/build: Add asserts for unwanted sections
x86/build: Enforce an empty .got.plt section
x86/asm: Avoid generating unused kprobe sections
arm/boot: Handle all sections explicitly
arm/build: Assert for unwanted sections
arm/build: Add missing sections
arm/build: Explicitly keep .ARM.attributes sections
arm/build: Refactor linker script headers
arm64/build: Assert for unwanted sections
arm64/build: Add missing DWARF sections
arm64/build: Use common DISCARDS in linker script
arm64/build: Remove .eh_frame* sections due to unwind tables
...

Linus Torvalds
2020-10-13 04:39:19 +0800

09 Oct, 2020

2 commits

a96843372 kbuild: explicitly specify the build id style ... Browse Code »

ld's --build-id defaults to "sha1" style, while lld defaults to "fast".
The build IDs are very different between the two, which may confuse
programs that reference them.

Signed-off-by: Bill Wendling
Acked-by: David S. Miller
Signed-off-by: Masahiro Yamada

Bill Wendling
2020-10-09 22:57:30 +0800
282a181b1 seccomp: Move config option SECCOMP to arch/Kconfig ... Browse Code »

In order to make adding configurable features into seccomp easier,
it's better to have the options at one single location, considering
especially that the bulk of seccomp code is arch-independent. An quick
look also show that many SECCOMP descriptions are outdated; they talk
about /proc rather than prctl.

As a result of moving the config option and keeping it default on,
architectures arm, arm64, csky, riscv, sh, and xtensa did not have SECCOMP
on by default prior to this and SECCOMP will be default in this change.

Architectures microblaze, mips, powerpc, s390, sh, and sparc have an
outdated depend on PROC_FS and this dependency is removed in this change.

Suggested-by: Jann Horn
Link: https://lore.kernel.org/lkml/CAG48ez1YWz9cnp08UZgeieYRhHdqh-ch7aNwc4JRBnGyrmgfMg@mail.gmail.com/
Signed-off-by: YiFei Zhu
[kees: added HAVE_ARCH_SECCOMP help text, tweaked wording]
Signed-off-by: Kees Cook
Link: https://lore.kernel.org/r/9ede6ef35c847e58d61e476c6a39540520066613.1600951211.git.yifeifz2@illinois.edu

YiFei Zhu
2020-10-09 04:17:47 +0800

06 Oct, 2020

2 commits

9f4df96b8 dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h> ... Browse Code »

Move more nitty gritty DMA implementation details into the common
internal header.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2020-10-06 13:07:06 +0800
0a0f0d8be dma-mapping: split <linux/dma-mapping.h> ... Browse Code »

Split out all the bits that are purely for dma_map_ops implementations
and related code into a new header so that they
don't get pulled into all the drivers. That also means the architecture
specific is not pulled in by
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2020-10-06 13:07:03 +0800

05 Oct, 2020

2 commits

1d29b36ac sparc32: Move ioremap/iounmap declaration before asm-generic/io.h include ... Browse Code »

Move the ioremap/iounmap declaration before asm-generic/io.h is
included so that it is visible within it.

Link: https://lore.kernel.org/r/93e2f23cda474a92a4708d4c50c9c359426a2162.1600254147.git.lorenzo.pieralisi@arm.com
Signed-off-by: Lorenzo Pieralisi
Signed-off-by: Lorenzo Pieralisi
Acked-by: "David S. Miller"
Cc: "David S. Miller"

Lorenzo Pieralisi
2020-10-05 16:44:41 +0800
333a67839 sparc32: Remove useless io_32.h __KERNEL__ preprocessor guard ... Browse Code »

The __KERNEL_ preprocessor guard is useless in non-uapi headers.

Remove it.

Link: https://lore.kernel.org/r/084753d3064fe946ff1963eda2eb425cfd7daa7b.1600254147.git.lorenzo.pieralisi@arm.com
Signed-off-by: Lorenzo Pieralisi
Signed-off-by: Lorenzo Pieralisi
Acked-by: David S. Miller
Cc: David S. Miller

Lorenzo Pieralisi
2020-10-05 16:44:16 +0800

03 Oct, 2020

3 commits

c3973b401 mm: remove compat_process_vm_{readv,writev} ... Browse Code »

Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2020-10-03 12:02:15 +0800
598b3cec8 fs: remove compat_sys_vmsplice ... Browse Code »

Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2020-10-03 12:02:15 +0800
5f764d624 fs: remove the compat readv/writev syscalls ... Browse Code »

Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2020-10-03 12:02:14 +0800

28 Sep, 2020

1 commit

981aa1d36 PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS ... Browse Code »

The unconditional selection of PCI_MSI_ARCH_FALLBACKS has an unmet
dependency because PCI_MSI_ARCH_FALLBACKS is defined in a 'if PCI' clause.

As it is only relevant when PCI_MSI is enabled, update the affected
architecture Kconfigs to make the selection of PCI_MSI_ARCH_FALLBACKS
depend on 'if PCI_MSI'.

Fixes: 077ee78e3928 ("PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable")
Reported-by: Qian Cai
Signed-off-by: Thomas Gleixner
Cc: Vasily Gorbik
Links: https://lore.kernel.org/r/cdfd63305caa57785b0925dd24c0711ea02c8527.camel@redhat.com

Thomas Gleixner
2020-09-28 22:03:10 +0800

23 Sep, 2020

1 commit

028abd922 fs: remove compat_sys_mount ... Browse Code »

compat_sys_mount is identical to the regular sys_mount now, so remove it
and use the native version everywhere.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2020-09-23 11:45:57 +0800

18 Sep, 2020

1 commit

cc7886d25 compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h> ... Browse Code »

lift the compat_s64 and compat_u64 definitions into common code using the
COMPAT_FOR_U64_ALIGNMENT symbol for the x86 special case.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2020-09-18 01:00:46 +0800

16 Sep, 2020

2 commits

077ee78e3 PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable ... Browse Code »

The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture
requires them or not. Architectures which are fully utilizing hierarchical
irq domains should never call into that code.

It's not only architectures which depend on that by implementing one or
more of the weak functions, there is also a bunch of drivers which relies
on the weak functions which invoke msi_controller::setup_irq[s] and
msi_controller::teardown_irq.

Make the architectures and drivers which rely on them select them in Kconfig
and if not selected replace them by stub functions which emit a warning and
fail the PCI/MSI interrupt allocation.

Signed-off-by: Thomas Gleixner
Link: https://lore.kernel.org/r/20200826112333.992429909@linutronix.de

Thomas Gleixner
2020-09-16 22:52:37 +0800
bafb056ce sparc64: remove mm_cpumask clearing to fix kthread_use_mm race ... Browse Code »

The de facto (and apparently uncommented) standard for using an mm had,
thanks to this code in sparc if nothing else, been that you must have a
reference on mm_users *and that reference must have been obtained with
mmget()*, i.e., from a thread with a reference to mm_users that had used
the mm.

The introduction of mmget_not_zero() in commit d2005e3f41d4
("userfaultfd: don't pin the user memory in userfaultfd_file_create()")
allowed mm_count holders to aoperate on user mappings asynchronously
from the actual threads using the mm, but they were not to load those
mappings into their TLB (i.e., walking vmas and page tables is okay,
kthread_use_mm() is not).

io_uring 2b188cc1bb857 ("Add io_uring IO interface") added code which
does a kthread_use_mm() from a mmget_not_zero() refcount.

The problem with this is code which previously assumed mm == current->mm
and mm->mm_users == 1 implies the mm will remain single-threaded at
least until this thread creates another mm_users reference, has now
broken.

arch/sparc/kernel/smp_64.c:

if (atomic_read(&mm->mm_users) == 1) {
cpumask_copy(mm_cpumask(mm), cpumask_of(cpu));
goto local_flush_and_out;
}

vs fs/io_uring.c

if (unlikely(!(ctx->flags & IORING_SETUP_SQPOLL) ||
!mmget_not_zero(ctx->sqo_mm)))
return -EFAULT;
kthread_use_mm(ctx->sqo_mm);

mmget_not_zero() could come in right after the mm_users == 1 test, then
kthread_use_mm() which sets its CPU in the mm_cpumask. That update could
be lost if cpumask_copy() occurs afterward.

I propose we fix this by allowing mmget_not_zero() to be a first-class
reference, and not have this obscure undocumented and unchecked
restriction.

The basic fix for sparc64 is to remove its mm_cpumask clearing code. The
optimisation could be effectively restored by sending IPIs to mm_cpumask
members and having them remove themselves from mm_cpumask. This is more
tricky so I leave it as an exercise for someone with a sparc64 SMP.
powerpc has a (currently similarly broken) example.

Signed-off-by: Nicholas Piggin
Acked-by: David S. Miller
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200914045219.3736466-4-npiggin@gmail.com

Nicholas Piggin
2020-09-16 10:24:37 +0800

09 Sep, 2020

1 commit

5e6e9852d uaccess: add infrastructure for kernel builds with set_fs() ... Browse Code »

Add a CONFIG_SET_FS option that is selected by architecturess that
implement set_fs, which is all of them initially. If the option is not
set stubs for routines related to overriding the address space are
provided so that architectures can start to opt out of providing set_fs.

Signed-off-by: Christoph Hellwig
Reviewed-by: Kees Cook
Signed-off-by: Al Viro

Christoph Hellwig
2020-09-09 10:21:32 +0800