Eric Lee / smarc-fsl-linux-kernel

17 May, 2020

1 commit

870c153cf blktrace: Report pid with note messages

Currently, informational messages within block trace do not include the
PID of the process reporting the message. With BFQ it is sometimes useful
to have this information, and there is no good reason to omit it from the
trace. So just fill in the PID information when generating a note message.

Signed-off-by: Jan Kara
Reviewed-by: Chaitanya Kulkarni
Acked-by: Paolo Valente
Signed-off-by: Jens Axboe

Jan Kara
2020-05-17 04:29:39 +0800

20 Apr, 2020

4 commits

3e0dea576 Merge tag 'timers-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull time namespace fix from Thomas Gleixner:
"An update for the proc interface of time namespaces: Use symbolic
names instead of clockid numbers. The usability nuisance of numbers
was noticed by Michael when polishing the man page"

* tag 'timers-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
proc, time/namespace: Show clock symbolic names in /proc/pid/timens_offsets

Linus Torvalds
2020-04-20 02:46:21 +0800
80ade29e1 Merge tag 'irq-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"A set of fixes/updates for the interrupt subsystem:

- Remove setup_irq() and remove_irq(). All users have been converted
so remove them before new users surface.

- A set of bugfixes for various interrupt chip drivers

- Add a few missing static attributes to address sparse warnings"

* tag 'irq-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-bcm7038-l1: Make bcm7038_l1_of_init() static
irqchip/irq-mvebu-icu: Make legacy_bindings static
irqchip/meson-gpio: Fix HARDIRQ-safe -> HARDIRQ-unsafe lock order
irqchip/sifive-plic: Fix maximum priority threshold value
irqchip/ti-sci-inta: Fix processing of masked irqs
irqchip/mbigen: Free msi_desc on device teardown
irqchip/gic-v4.1: Update effective affinity of virtual SGIs
irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signaling
genirq: Remove setup_irq() and remove_irq()

Linus Torvalds
2020-04-20 02:23:33 +0800
08dd38727 Merge tag 'sched-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:
"Two fixes for the scheduler:

- Work around an uninitialized variable warning where GCC can't
figure it out.

- Allow 'isolcpus=' to skip unknown subparameters so that older
kernels work with the commandline of a newer kernel. Improve the
error output while at it"

* tag 'sched-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/vtime: Work around an unitialized variable warning
sched/isolation: Allow "isolcpus=" to skip unknown sub-parameters

Linus Torvalds
2020-04-20 02:18:20 +0800
5e7de5812 Merge tag 'core-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU fix from Thomas Gleixner:
"A single bugfix for RCU to prevent taking a lock in NMI context"

* tag 'core-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Don't acquire lock in NMI handler in rcu_nmi_enter_common()

Linus Torvalds
2020-04-20 02:16:00 +0800

19 Apr, 2020

1 commit

774acb2a0 Merge tag 'for-linus-2020-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull thread fixes from Christian Brauner:
"A few fixes and minor improvements:

- Correctly validate the cgroup file descriptor when clone3() is used
with CLONE_INTO_CGROUP.

- Check that a new enough version of struct clone_args is passed
which supports the cgroup file descriptor argument when
CLONE_INTO_CGROUP is set in the flags argument.

- Catch nonsensical struct clone_args layouts at build time.

- Catch extensions of struct clone_args without updating the uapi
visible size definitions at build time.

- Check whether the signal is valid early in kill_pid_usb_asyncio()
before doing further work.

- Replace open-coded rcu_read_lock()+kill_pid_info()+rcu_read_unlock()
sequence in kill_something_info() with kill_proc_info() which is a
dedicated helper to do just that"

* tag 'for-linus-2020-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
clone3: add build-time CLONE_ARGS_SIZE_VER* validity checks
clone3: add a check for the user struct size if CLONE_INTO_CGROUP is set
clone3: fix cgroup argument sanity check
signal: use kill_proc_info instead of kill_pid_info in kill_something_info
signal: check sig before setting info in kill_pid_usb_asyncio

Linus Torvalds
2020-04-19 02:38:51 +0800

17 Apr, 2020

1 commit

c8372665b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

1) Disable RISCV BPF JIT builds when !MMU, from Björn Töpel.

2) nf_tables leaves dangling pointer after free, fix from Eric Dumazet.

3) Out of boundary write in __xsk_rcv_memcpy(), fix from Li RongQing.

4) Adjust icmp6 message source address selection when routes have a
preferred source address set, from Tim Stallard.

5) Be sure to validate HSR protocol version when creating new links,
from Taehee Yoo.

6) CAP_NET_ADMIN should be sufficient to manage l2tp tunnels even in
non-initial namespaces, from Michael Weiß.

7) Missing release firmware call in mlx5, from Eran Ben Elisha.

8) Fix variable type in macsec_changelink(), caught by KASAN. Fix from
Taehee Yoo.

9) Fix pause frame negotiation in marvell phy driver, from Clemens
Gruber.

10) Record RX queue early enough in tun packet paths such that XDP
programs will see the correct RX queue index, from Gilberto Bertin.

11) Fix double unlock in mptcp, from Florian Westphal.

12) Fix offset overflow in ARM bpf JIT, from Luke Nelson.

13) marvell10g needs to soft reset PHY when coming out of low power
mode, from Russell King.

14) Fix MTU setting regression in stmmac for some chip types, from
Florian Fainelli.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
amd-xgbe: Use __napi_schedule() in BH context
mISDN: make dmril and dmrim static
net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes
net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode
tipc: fix incorrect increasing of link window
Documentation: Fix tcp_challenge_ack_limit default value
net: tulip: make early_486_chipsets static
dt-bindings: net: ethernet-phy: add desciption for ethernet-phy-id1234.d400
ipv6: remove redundant assignment to variable err
net/rds: Use ERR_PTR for rds_message_alloc_sgs()
net: mscc: ocelot: fix untagged packet drops when enslaving to vlan aware bridge
selftests/bpf: Check for correct program attach/detach in xdp_attach test
libbpf: Fix type of old_fd in bpf_xdp_set_link_opts
libbpf: Always specify expected_attach_type on program load if supported
xsk: Add missing check on user supplied headroom size
mac80211: fix channel switch trigger from unknown mesh peer
mac80211: fix race in ieee80211_register_hw()
net: marvell10g: soft-reset the PHY when coming out of low power
net: marvell10g: report firmware version
net/cxgb4: Check the return from t4_query_params properly
...

Linus Torvalds
2020-04-17 05:52:29 +0800

16 Apr, 2020

1 commit

94d440d61 proc, time/namespace: Show clock symbolic names in /proc/pid/timens_offsets

Michael Kerrisk suggested replacing numeric clock IDs with symbolic names.

Now the content of these files looks like this:
$ cat /proc/774/timens_offsets
monotonic 864000 0
boottime 1728000 0

For setting offsets, both representations of clocks (numeric and symbolic)
can be used.

As for compatibility, it is acceptable to change things as long as
userspace doesn't care. The format of timens_offsets files is very new and
there are no userspace tools yet which rely on this format.

But three projects (crun, util-linux and criu) rely on the interface for
setting time offsets, and this is why it's required to continue supporting
the numeric clock IDs on write.
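
The "both representations on write" rule can be sketched in user-space C as
below. This is only an illustration, not the kernel's parser: the function
name and the SKETCH_* macros are hypothetical, and the numeric values 1 and 7
are the Linux clock IDs for CLOCK_MONOTONIC and CLOCK_BOOTTIME.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Linux clock IDs for the two clocks shown in timens_offsets. */
#define SKETCH_CLOCK_MONOTONIC 1
#define SKETCH_CLOCK_BOOTTIME  7

/*
 * Hypothetical helper: map the clock token of a timens_offsets line to a
 * clock ID. Accepts a symbolic name ("monotonic", "boottime") or the
 * numeric ID; returns -EINVAL for anything else.
 */
int sketch_parse_clock(const char *token)
{
	char *end;
	long id;

	if (strcmp(token, "monotonic") == 0)
		return SKETCH_CLOCK_MONOTONIC;
	if (strcmp(token, "boottime") == 0)
		return SKETCH_CLOCK_BOOTTIME;

	/* Fall back to the numeric form that existing tools still write. */
	id = strtol(token, &end, 10);
	if (end == token || *end != '\0')
		return -EINVAL;
	if (id != SKETCH_CLOCK_MONOTONIC && id != SKETCH_CLOCK_BOOTTIME)
		return -EINVAL;
	return (int)id;
}
```

With this shape, a writer that sends "7 1728000 0" and one that sends
"boottime 1728000 0" resolve to the same clock.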

Fixes: 04a8682a71be ("fs/proc: Introduce /proc/pid/timens_offsets")
Suggested-by: Michael Kerrisk
Signed-off-by: Andrei Vagin
Signed-off-by: Thomas Gleixner
Tested-by: Michael Kerrisk
Acked-by: Michael Kerrisk
Cc: Andrew Morton
Cc: Eric W. Biederman
Cc: Dmitry Safonov
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200411154031.642557-1-avagin@gmail.com

Andrei Vagin
2020-04-16 18:10:54 +0800

15 Apr, 2020

8 commits

e0d648f9d sched/vtime: Work around an unitialized variable warning

Work around this warning:

kernel/sched/cputime.c: In function ‘kcpustat_field’:
kernel/sched/cputime.c:1007:6: warning: ‘val’ may be used uninitialized in this function [-Wmaybe-uninitialized]

because GCC can't see that val is used only when err is 0.
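
The general shape of such a warning and one common workaround can be sketched
as follows. This is a generic user-space illustration, not the actual change
to kcpustat_field(); every name in it is hypothetical.

```c
#include <stdint.h>

/* Hypothetical reader: fills *out and returns 0 on success, -1 on failure. */
int sketch_read_counter(int fail, uint64_t *out)
{
	if (fail)
		return -1;
	*out = 42;
	return 0;
}

/*
 * 'val' is only consumed once err == 0, but a compiler may not be able to
 * prove that across the retry loop and can then report that 'val' may be
 * used uninitialized. Giving 'val' a defined initial value silences the
 * warning without changing the result.
 */
uint64_t sketch_field(uint64_t initial)
{
	uint64_t val = initial;	/* workaround: always initialized */
	int attempts = 0;
	int err;

	do {
		/* Fail on the first attempt to model a retry. */
		err = sketch_read_counter(attempts++ == 0, &val);
	} while (err);

	return val;
}
```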

Acked-by: Peter Zijlstra
Signed-off-by: Borislav Petkov
Signed-off-by: Ingo Molnar
Link: https://lore.kernel.org/r/20200327214334.GF8015@zn.tnic

Borislav Petkov
2020-04-15 17:06:50 +0800
3662daf02 sched/isolation: Allow "isolcpus=" to skip unknown sub-parameters

The "isolcpus=" parameter allows sub-parameters before the cpulist is
specified, and if the parser detects an unknown sub-parameter the whole
parameter will be ignored.

This design is incompatible with itself when new sub-parameters are added.
An older kernel will not recognize the new sub-parameter and will
invalidate the whole parameter so the CPU isolation will not take
effect. It emits a warning:

isolcpus: Error, unknown flag

The better and compatible way is to allow "isolcpus=" to skip unknown
sub-parameters, so that when new sub-parameters are added an older
kernel will still behave as usual even with the new sub-parameter
specified on the command line.
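
A minimal user-space sketch of the skip-unknown behaviour is shown below. It
is not the kernel parser: the function name, the SKETCH_* macros, and the two
flag names it recognizes ("nohz", "domain") are assumptions made for
illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Flag names are illustrative assumptions, not taken from the text above. */
#define SKETCH_FLAG_NOHZ   (1u << 0)
#define SKETCH_FLAG_DOMAIN (1u << 1)

/*
 * Consume leading "flag," tokens and return a pointer to the CPU list.
 * Unknown flags are skipped instead of invalidating the whole parameter.
 * Returns NULL if a token is not a well-formed "flag," sub-parameter.
 */
const char *sketch_parse_isolcpus(const char *str, unsigned int *flags)
{
	*flags = 0;

	/* Sub-parameters are alphabetic; the CPU list starts with a digit. */
	while (isalpha((unsigned char)*str)) {
		size_t len = 0;

		if (strncmp(str, "nohz,", 5) == 0) {
			*flags |= SKETCH_FLAG_NOHZ;
			str += 5;
			continue;
		}
		if (strncmp(str, "domain,", 7) == 0) {
			*flags |= SKETCH_FLAG_DOMAIN;
			str += 7;
			continue;
		}

		/* Unknown sub-parameter: measure it, then skip past its comma. */
		while (isalpha((unsigned char)str[len]) || str[len] == '_')
			len++;
		if (str[len] != ',')
			return NULL;	/* malformed, e.g. "bad=1" */
		str += len + 1;
	}
	return str;
}
```

Given "nohz,newflag,5", the sketch sets the nohz flag, skips "newflag", and
returns the CPU list "5", which is the compatibility property described above.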

Ideally this should have been there when the first sub-parameter for
"isolcpus=" was introduced.

Suggested-by: Thomas Gleixner
Signed-off-by: Peter Xu
Signed-off-by: Thomas Gleixner
Link: https://lkml.kernel.org/r/20200403223517.406353-1-peterx@redhat.com

Peter Xu
2020-04-15 16:38:26 +0800
a966dcfe1 clone3: add build-time CLONE_ARGS_SIZE_VER* validity checks

CLONE_ARGS_SIZE_VER* macros are defined explicitly and not via
the offsets of the relevant struct clone_args fields, which makes
it rather error-prone, so it probably makes sense to add some
compile-time checks for them (including the one that breaks
on struct clone_args extension as a reminder to add a relevant
size macro and a similar check). Function copy_clone_args_from_user
seems to be a good place for such checks.
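
The idea can be sketched with C11 static assertions on a made-up struct. The
struct, the SKETCH_* macros, and the sizes below are hypothetical and only
mirror the pattern; they are not the real struct clone_args layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical versioned argument struct; it grows only by appending. */
struct sketch_args {
	uint64_t flags;
	uint64_t stack;		/* last field of version 0 */
	uint64_t set_tid;	/* last field of version 1 */
	uint64_t cgroup;	/* last field of version 2 */
};

/* Sizes spelled out by hand, which is what makes them error-prone. */
#define SKETCH_SIZE_VER0 16
#define SKETCH_SIZE_VER1 24
#define SKETCH_SIZE_VER2 32

/* Offset of the first byte after MEMBER. */
#define SKETCH_OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Each size macro must end exactly at the last field of its version. */
_Static_assert(SKETCH_OFFSETOFEND(struct sketch_args, stack) == SKETCH_SIZE_VER0,
	       "VER0 size does not match struct layout");
_Static_assert(SKETCH_OFFSETOFEND(struct sketch_args, set_tid) == SKETCH_SIZE_VER1,
	       "VER1 size does not match struct layout");
_Static_assert(SKETCH_OFFSETOFEND(struct sketch_args, cgroup) == SKETCH_SIZE_VER2,
	       "VER2 size does not match struct layout");

/* Breaks the build if the struct is extended without a new size macro. */
_Static_assert(sizeof(struct sketch_args) == SKETCH_SIZE_VER2,
	       "struct extended: add a size macro and a matching check");
```

Appending a field to struct sketch_args without touching the macros makes the
last assertion fail at compile time, which is the reminder described above.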

Signed-off-by: Eugene Syromiatnikov
Acked-by: Christian Brauner
Link: https://lore.kernel.org/r/20200412202658.GA31499@asgard.redhat.com
Signed-off-by: Christian Brauner

Eugene Syromiatnikov
2020-04-15 15:56:32 +0800
62173872c clone3: add a check for the user struct size if CLONE_INTO_CGROUP is set

Passing CLONE_INTO_CGROUP with an under-sized structure (that doesn't
properly contain cgroup field) seems like garbage input, especially
considering the fact that fd 0 is a valid descriptor.
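
A sketch of the size check follows. The function name is hypothetical, and
the two constants are my understanding of the uapi values (the flag bit and
the first struct size that includes the cgroup field); treat both as
assumptions rather than as quotations from the headers.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; check the uapi headers before relying on them. */
#define SKETCH_CLONE_INTO_CGROUP    0x200000000ULL
#define SKETCH_CLONE_ARGS_SIZE_VER2 88	/* first size with a cgroup field */

/*
 * Reject CLONE_INTO_CGROUP when the caller's struct is too small to carry
 * a cgroup field. Without this, the missing field would read as 0, and
 * fd 0 is a valid descriptor.
 */
int sketch_check_usize(uint64_t flags, size_t usize)
{
	if ((flags & SKETCH_CLONE_INTO_CGROUP) &&
	    usize < SKETCH_CLONE_ARGS_SIZE_VER2)
		return -EINVAL;
	return 0;
}
```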

Signed-off-by: Eugene Syromiatnikov
Acked-by: Christian Brauner
Link: https://lore.kernel.org/r/20200412203123.GA5869@asgard.redhat.com
Signed-off-by: Christian Brauner

Eugene Syromiatnikov
2020-04-15 15:56:25 +0800
e82a118f5 clone3: fix cgroup argument sanity check

Checking that the cgroup field value of struct clone_args is less than 0
is useless, as it is defined as an unsigned 64-bit integer. Moreover,
it doesn't catch the situations where its higher bits are lost during
the assignment to the cgroup field of the internal
struct kernel_clone_args (where it is declared as a signed 32-bit
integer), so it is still possible to pass garbage there. A check
against INT_MAX solves both these issues.
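
Both problems and the INT_MAX check can be illustrated in user-space C. The
function name is hypothetical; only the unsigned 64-bit source and signed
32-bit destination types follow the description above.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Convert a user-supplied unsigned 64-bit cgroup value to an int fd.
 * A "user_cgroup < 0" test could never be true for an unsigned type, and
 * a plain cast would silently drop the upper 32 bits, so compare against
 * INT_MAX before narrowing.
 */
int sketch_cgroup_fd(uint64_t user_cgroup, int *fd)
{
	if (user_cgroup > INT_MAX)
		return -EINVAL;
	*fd = (int)user_cgroup;
	return 0;
}
```

For example, 0x100000003 would truncate to fd 3 without the check; with it,
the value is rejected.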

Fixes: ef2c41cf38a7559b ("clone3: allow spawning processes into cgroups")
Signed-off-by: Eugene Syromiatnikov
Acked-by: Christian Brauner
Link: https://lore.kernel.org/r/20200412202533.GA29554@asgard.redhat.com
Signed-off-by: Christian Brauner

Eugene Syromiatnikov
2020-04-15 15:56:12 +0800
0bbe7f719 tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation

A traced event can trigger the 'snapshot' operation (i.e. call
snapshot_trigger() or snapshot_count_trigger()) after
register_snapshot_trigger() has completed registration but before it has
allocated the buffer for the 'snapshot' event trigger. In that rare case,
the 'snapshot' operation detects the missing buffer, so make
register_snapshot_trigger() allocate the buffer first.

trigger-snapshot.tc in kselftest reproduces the issue on a slow VM:
-----------------------------------------------------------
cat trace
...
ftracetest-3028 [002] .... 236.784290: sched_process_fork: comm=ftracetest pid=3028 child_comm=ftracetest child_pid=3036
<...>-2875 [003] .... 240.460335: tracing_snapshot_instance_cond: *** SNAPSHOT NOT ALLOCATED ***
<...>-2875 [003] .... 240.460338: tracing_snapshot_instance_cond: *** stopping trace here! ***
-----------------------------------------------------------

Link: http://lkml.kernel.org/r/20200414015145.66236-1-yangx.jy@cn.fujitsu.com

Cc: stable@vger.kernel.org
Fixes: 93e31ffbf417a ("tracing: Add 'snapshot' event trigger command")
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Xiao Yang
2020-04-15 10:02:10 +0800
89f33dcad bpf: remove unneeded conversion to bool in __mark_reg_unknown

This issue was detected by using the Coccinelle software:

kernel/bpf/verifier.c:1259:16-21: WARNING: conversion to bool not needed here

The conversion to bool is unneeded, remove it.

Reported-by: Hulk Robot
Signed-off-by: Zou Wei
Signed-off-by: Daniel Borkmann
Acked-by: Song Liu
Link: https://lore.kernel.org/bpf/1586779076-101346-1-git-send-email-zou_wei@huawei.com

Zou Wei
2020-04-15 03:40:06 +0800
1f6cb19be bpf: Prevent re-mmap()'ing BPF map as writable for initially r/o mapping

The VM_MAYWRITE flag during initial memory mapping determines if already
mmap()'ed pages can later be remapped as writable through an mprotect() call.
To prevent a user application from rewriting the contents of a BPF map that
was memory-mapped as read-only and subsequently frozen, remove the VM_MAYWRITE
flag completely on an initially read-only mapping.

Alternatively, we could treat any memory-mapping on unfrozen map as writable
and bump writecnt instead. But there is little legitimate reason to map
BPF map as read-only and then re-mmap() it as writable through mprotect(),
instead of just mmap()'ing it as read/write from the very beginning.

Also, at the suggestion of Jann Horn, drop unnecessary refcounting in mmap
operations. We can just rely on VMA holding reference to BPF map's file
properly.

Fixes: fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY")
Reported-by: Jann Horn
Signed-off-by: Andrii Nakryiko
Signed-off-by: Daniel Borkmann
Reviewed-by: Jann Horn
Link: https://lore.kernel.org/bpf/20200410202613.3679837-1-andriin@fb.com

Andrii Nakryiko
2020-04-15 03:28:57 +0800

14 Apr, 2020

2 commits

07d8350ed genirq: Remove setup_irq() and remove_irq()

Now that all the users of setup_irq() & remove_irq() have been replaced by
request_irq() & free_irq() respectively, delete them.

Signed-off-by: afzal mohammed
Signed-off-by: Thomas Gleixner
Reviewed-by: Linus Walleij
Link: https://lkml.kernel.org/r/0aa8771ada1ac8e1312f6882980c9c08bd023148.1585320721.git.afzal.mohd.ma@gmail.com

afzal mohammed
2020-04-14 16:08:50 +0800
40e7d7bdc Merge branch 'urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent

Pull RCU fix from Paul E. McKenney.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2020-04-14 14:36:41 +0800

13 Apr, 2020

6 commits

3075afdf1 signal: use kill_proc_info instead of kill_pid_info in kill_something_info

signal.c provides kill_proc_info, we can use it instead of kill_pid_info
in kill_something_info func gracefully.

Signed-off-by: Zhiqiang Liu
Acked-by: Oleg Nesterov
Acked-by: Christian Brauner
Link: https://lore.kernel.org/r/80236965-f0b5-c888-95ff-855bdec75bb3@huawei.com
Signed-off-by: Christian Brauner

Zhiqiang Liu
2020-04-13 04:46:34 +0800
eaec2b0bd signal: check sig before setting info in kill_pid_usb_asyncio

In kill_pid_usb_asyncio, if signal is not valid, we do not need to
set info struct.

Signed-off-by: Zhiqiang Liu
Acked-by: Christian Brauner
Link: https://lore.kernel.org/r/f525fd08-1cf7-fb09-d20c-4359145eb940@huawei.com
Signed-off-by: Christian Brauner

Zhiqiang Liu
2020-04-13 04:46:34 +0800
0785249f8 Merge tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull time(keeping) updates from Thomas Gleixner:

- Fix the time_for_children symlink in /proc/$PID/ so it properly
reflects that it is part of the 'time' namespace

- Add the missing userns limit for the allowed number of time
namespaces, which was half defined but the actual array member was
not added. This went unnoticed as the array has an excessive empty
member at the end but introduced a user visible regression as the
output was corrupted.

- Prevent further silent ucount corruption by adding a BUILD_BUG_ON()
to catch half updated data.

* tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ucount: Make sure ucounts in /proc/sys/user don't regress again
time/namespace: Add max_time_namespaces ucount
time/namespace: Fix time_for_children symlink

Linus Torvalds
2020-04-13 01:13:14 +0800
590680d13 Merge tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes/updates from Thomas Gleixner:

- Deduplicate the average computations in the scheduler core and the
fair class code.

- Fix a race between runtime distribution and assignment which can
cause exceeding the quota by up to 70%.

- Prevent negative results in the imbalance calculation

- Remove a stale warning in the workqueue code which can be triggered
since the call site was moved out of preempt disabled code. It's a
false positive.

- Deduplicate the print macros for procfs

- Add the uclamp values to the SCHED_DEBUG procfs output for completeness

* tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Add task uclamp values to SCHED_DEBUG procfs
sched/debug: Factor out printing formats into common macros
sched/debug: Remove redundant macro define
sched/core: Remove unused rq::last_load_update_tick
workqueue: Remove the warning in wq_worker_sleeping()
sched/fair: Fix negative imbalance in imbalance calculation
sched/fair: Fix race between runtime distribution and assignment
sched/fair: Align rq->avg_idle and rq->avg_scan_cost

Linus Torvalds
2020-04-13 01:09:19 +0800
20e2aa812 Merge tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
"Three fixes/updates for perf:

- Fix the perf event cgroup tracking which tries to track the cgroup
even for disabled events.

- Add Ice Lake server support for uncore events

- Disable pagefaults when retrieving the physical address in the
sampling code"

* tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Disable page faults when getting phys address
perf/x86/intel/uncore: Add Ice Lake server uncore support
perf/cgroup: Correct indirection in perf_less_group_idx()
perf/core: Fix event cgroup tracking

Linus Torvalds
2020-04-13 01:05:24 +0800
652fa53ca Merge tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
"Three small fixes/updates for the locking core code:

- Plug a task struct reference leak in the percpu rwsem
implementation.

- Document the refcount interaction with PID_MAX_LIMIT

- Improve the 'invalid wait context' data dump in lockdep so it
contains all information which is required to decode the problem"

* tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Improve 'invalid wait context' splat
locking/refcount: Document interaction with PID_MAX_LIMIT
locking/percpu-rwsem: Fix a task_struct refcount

Linus Torvalds
2020-04-13 00:47:10 +0800

12 Apr, 2020

1 commit

75e718839 Merge tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:

- fix an integer truncation in dma_direct_get_required_mask
(Kishon Vijay Abraham)

- fix the display of dma mapping types (Grygorii Strashko)

* tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-debug: fix displaying of dma allocation type
dma-direct: fix data truncation in dma_direct_get_required_mask()

Linus Torvalds
2020-04-12 02:34:36 +0800

11 Apr, 2020

6 commits

5b8b9d0c6 Merge branch 'akpm' (patches from Andrew)

Merge yet more updates from Andrew Morton:

- Almost all of the rest of MM (memcg, slab-generic, slab, pagealloc,
gup, hugetlb, pagemap, memremap)

- Various other things (hfs, ocfs2, kmod, misc, seqfile)

* akpm: (34 commits)
ipc/util.c: sysvipc_find_ipc() should increase position index
kernel/gcov/fs.c: gcov_seq_next() should increase position index
fs/seq_file.c: seq_read(): add info message about buggy .next functions
drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
change email address for Pali Rohár
selftests: kmod: test disabling module autoloading
selftests: kmod: fix handling test numbers above 9
docs: admin-guide: document the kernel.modprobe sysctl
fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
kmod: make request_module() return an error when autoloading is disabled
mm/memremap: set caching mode for PCI P2PDMA memory to WC
mm/memory_hotplug: add pgprot_t to mhp_params
powerpc/mm: thread pgprot_t through create_section_mapping()
x86/mm: introduce __set_memory_prot()
x86/mm: thread pgprot_t through init_memory_mapping()
mm/memory_hotplug: rename mhp_restrictions to mhp_params
mm/memory_hotplug: drop the flags field from struct mhp_restrictions
mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
mm/vma: introduce VM_ACCESS_FLAGS
mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
...

Linus Torvalds
2020-04-11 08:57:48 +0800
f4d74ef62 kernel/gcov/fs.c: gcov_seq_next() should increase position index

If a seq_file .next function does not change the position index, a read
after some lseek can generate unexpected output.
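
The contract can be modelled in plain C with a tiny start/next iterator. This
is a user-space sketch, not the seq_file API; all names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_NITEMS 3
static const int sketch_items[SKETCH_NITEMS] = { 10, 20, 30 };

/* Return the element at *pos, or NULL when *pos is out of range. */
const int *sketch_start(long *pos)
{
	if (*pos < 0 || *pos >= SKETCH_NITEMS)
		return NULL;
	return &sketch_items[*pos];
}

/*
 * Advance to the next element. The position index is incremented
 * unconditionally, including when the end is reached, so a reader that
 * later restarts from *pos does not see the last element a second time.
 */
const int *sketch_next(long *pos)
{
	++*pos;
	return sketch_start(pos);
}
```

If sketch_next() returned NULL at the end without bumping *pos, a restart at
the saved position would hand out the final element again.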

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin
Signed-off-by: Andrew Morton
Acked-by: Peter Oberparleiter
Cc: Al Viro
Cc: Davidlohr Bueso
Cc: Ingo Molnar
Cc: Manfred Spraul
Cc: NeilBrown
Cc: Steven Rostedt
Cc: Waiman Long
Link: http://lkml.kernel.org/r/f65c6ee7-bd00-f910-2f8a-37cc67e4ff88@virtuozzo.com
Signed-off-by: Linus Torvalds

Vasily Averin
2020-04-11 06:36:22 +0800
d7d27cfc5 kmod: make request_module() return an error when autoloading is disabled

Patch series "module autoloading fixes and cleanups", v5.

This series fixes a bug where request_module() was reporting success to
kernel code when module autoloading had been completely disabled via
'echo > /proc/sys/kernel/modprobe'.

It also addresses the issues raised on the original thread
(https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/#u)
by documenting the modprobe sysctl, adding a self-test for the empty path
case, and downgrading a user-reachable WARN_ONCE().

This patch (of 4):

It's long been possible to disable kernel module autoloading completely
(while still allowing manual module insertion) by setting
/proc/sys/kernel/modprobe to the empty string.

This can be preferable to setting it to a nonexistent file since it
avoids the overhead of an attempted execve(), avoids potential
deadlocks, and avoids the call to security_kernel_module_request() and
thus on SELinux-based systems eliminates the need to write SELinux rules
to dontaudit module_request.

However, when module autoloading is disabled in this way,
request_module() returns 0. This is broken because callers expect 0 to
mean that the module was successfully loaded.

Apparently this was never noticed because this method of disabling
module autoloading isn't used much, and also most callers don't use the
return value of request_module() since it's always necessary to check
whether the module registered its functionality or not anyway.

But improperly returning 0 can indeed confuse a few callers, for example
get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit:

if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
fs = __get_fs_type(name, len);
WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
}

This is easily reproduced with:

echo > /proc/sys/kernel/modprobe
mount -t NONEXISTENT none /

It causes:

request_module fs-NONEXISTENT succeeded, but still no fs?
WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
[...]

This should actually use pr_warn_once() rather than WARN_ONCE(), since
it's also user-reachable if userspace immediately unloads the module.
Regardless, request_module() should correctly return an error when it
fails. So let's make it return -ENOENT, which matches the error when
the modprobe binary doesn't exist.
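
The intended return-value behaviour can be modelled in user space as follows.
This is a sketch under stated assumptions, not the kernel function: the
variable and function names are hypothetical stand-ins, and the helper
invocation is left out.

```c
#include <errno.h>

/* Hypothetical stand-in for the modprobe path; "" disables autoloading. */
char sketch_modprobe_path[256] = "/sbin/modprobe";

/*
 * Model of the fixed behaviour: with an empty helper path, report -ENOENT
 * instead of 0, so callers do not mistake "autoloading disabled" for
 * "module loaded".
 */
int sketch_request_module(const char *name)
{
	(void)name;	/* unused in this model */

	if (sketch_modprobe_path[0] == '\0')
		return -ENOENT;

	/* A real implementation would invoke the helper here. */
	return 0;
}
```

A caller shaped like the get_fs_type() snippet above would then skip its
"succeeded, but still no fs?" branch when autoloading is disabled.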

I've also sent patches to document and test this case.

Signed-off-by: Eric Biggers
Signed-off-by: Andrew Morton
Reviewed-by: Kees Cook
Reviewed-by: Jessica Yu
Acked-by: Luis Chamberlain
Cc: Alexei Starovoitov
Cc: Greg Kroah-Hartman
Cc: Jeff Vander Stoep
Cc: Ben Hutchings
Cc: Josh Triplett
Cc:
Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org
Signed-off-by: Linus Torvalds

Eric Biggers
2020-04-11 06:36:22 +0800
ab6f762f0 printk: queue wake_up_klogd irq_work only if per-CPU areas are ready

printk_deferred(), similarly to printk_safe/printk_nmi, does not
immediately attempt to print a new message on the consoles, avoiding
calls into non-reentrant kernel paths, e.g. scheduler or timekeeping,
which potentially can deadlock the system.

Those printk() flavors, instead, rely on per-CPU flush irq_work to print
messages from safer contexts. For the same reasons (recursive scheduler or
timekeeping calls) printk() uses per-CPU irq_work in order to wake up
user space syslog/kmsg readers.

However, only printk_safe/printk_nmi do make sure that per-CPU areas
have been initialised and that it's safe to modify per-CPU irq_work.
This means that, for instance, should printk_deferred() be invoked "too
early", that is before per-CPU areas are initialised, printk_deferred()
will perform illegal per-CPU access.

Lech Perczak [0] reports that after commit 1b710b1b10ef ("char/random:
silence a lockdep splat with printk()") user-space syslog/kmsg readers
are not able to read new kernel messages.

The reason is printk_deferred() being called too early (as was pointed
out by Petr and John).

Fix printk_deferred() and do not queue per-CPU irq_work before per-CPU
areas are initialized.

Link: https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/
Reported-by: Lech Perczak
Signed-off-by: Sergey Senozhatsky
Tested-by: Jann Horn
Reviewed-by: Petr Mladek
Cc: Greg Kroah-Hartman
Cc: Theodore Ts'o
Cc: John Ogness
Signed-off-by: Linus Torvalds

Sergey Senozhatsky
2020-04-11 04:18:57 +0800
87ad46e60 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull proc fix from Eric Biederman:
"A brown paper bag slipped through my proc changes, and syzkaller
caught it when the code ended up in your tree.

I have opted to fix it the simplest cleanest way I know how, so there
is no reasonable chance for the bug to repeat"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: Use a dedicated lock in struct pid

Linus Torvalds
2020-04-11 03:59:56 +0800
bbec2a2dc Merge tag 'pm-5.7-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
"Rework compat ioctl handling in the user space hibernation interface
(Christoph Hellwig) and fix a typo in a function name in the cpuidle
haltpoll driver (Yihao Wu)"

* tag 'pm-5.7-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle-haltpoll: Fix small typo
PM / sleep: handle the compat case in snapshot_set_swap_area()
PM / sleep: move SNAPSHOT_SET_SWAP_AREA handling into a helper

Linus Torvalds
2020-04-11 00:50:00 +0800

10 Apr, 2020

3 commits

40fc7ad2c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2020-04-10

The following pull-request contains BPF updates for your *net* tree.

We've added 13 non-merge commits during the last 7 day(s) which contain
a total of 13 files changed, 137 insertions(+), 43 deletions(-).

The main changes are:

1) JIT code emission fixes for riscv and arm32, from Luke Nelson and Xi Wang.

2) Disable vmlinux BTF info if GCC_PLUGIN_RANDSTRUCT is used, from Slava Bacherikov.

3) Fix oob write in AF_XDP when meta data is used, from Li RongQing.

4) Fix bpf_get_link_xdp_id() handling on single prog when flags are specified,
from Andrey Ignatov.

5) Fix sk_assign() BPF helper for request sockets that can have sk_reuseport
field uninitialized, from Joe Stringer.

6) Fix mprotect() test case for the BPF LSM, from KP Singh.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-04-10 08:39:22 +0800
c0cc27117 Merge tag 'modules-for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:
"Only a small cleanup this time around: a trivial conversion of
zero-length arrays to flexible arrays"

* tag 'modules-for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
kernel: module: Replace zero-length array with flexible-array member

Linus Torvalds
2020-04-10 03:52:34 +0800
63f818f46 proc: Use a dedicated lock in struct pid

syzbot wrote:
> ========================================================
> WARNING: possible irq lock inversion dependency detected
> 5.6.0-syzkaller #0 Not tainted
> --------------------------------------------------------
> swapper/1/0 just changed the state of lock:
> ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigurg+0x9f/0x320 fs/fcntl.c:840
> but this lock took another, SOFTIRQ-unsafe lock in the past:
> (&pid->wait_pidfd){+.+.}-{2:2}
>
>
> and interrupts could create inverse lock ordering between them.
>
>
> other info that might help us debug this:
> Possible interrupt unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&pid->wait_pidfd);
>                                local_irq_disable();
>                                lock(tasklist_lock);
>                                lock(&pid->wait_pidfd);
>   <Interrupt>
>     lock(tasklist_lock);
>
> *** DEADLOCK ***
>
> 4 locks held by swapper/1/0:

The problem is that because wait_pidfd.lock is taken under the tasklist
lock, it must always be taken with irqs disabled, as tasklist_lock can be
taken from interrupt context, and if wait_pidfd.lock was already taken this
would create a lock order inversion.

Oleg suggested just disabling irqs where I have added extra calls to
wait_pidfd.lock. That should be safe and I think the code will eventually
do that. It was rightly pointed out by Christian that sharing the
wait_pidfd.lock was a premature optimization.

It is also true that my pre-merge window testing was insufficient. So
remove the premature optimization and give struct pid a dedicated lock of
its own for struct pid things. I have verified that lockdep sees all 3
paths where we take the new pid->lock and lockdep does not complain.

It is my current day dream that one day pid->lock can be used to guard the
task lists as well and then the tasklist_lock won't need to be held to
deliver signals. That will require taking pid->lock with irqs disabled.

Acked-by: Christian Brauner
Link: https://lore.kernel.org/lkml/00000000000011d66805a25cd73f@google.com/
Cc: Oleg Nesterov
Cc: Christian Brauner
Reported-by: syzbot+343f75cdeea091340956@syzkaller.appspotmail.com
Reported-by: syzbot+832aabf700bc3ec920b9@syzkaller.appspotmail.com
Reported-by: syzbot+f675f964019f884dbd0f@syzkaller.appspotmail.com
Reported-by: syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com
Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-04-10 01:15:35 +0800

09 Apr, 2020

2 commits

9bb50ed74 dma-debug: fix displaying of dma allocation type ...

Commit 2e05ea5cdc1a ("dma-mapping: implement dma_map_single_attrs using
dma_map_page_attrs") removed the "dma_debug_page" enum value, but did not
update the type2name string table. This causes the dma allocation type to
be displayed incorrectly.
Fix it by removing the "page" string from the type2name string table and
switching to named initializers.

Before (dma_alloc_coherent()):
k3-ringacc 4b800000.ringacc: scather-gather idx 2208 P=d1140000 N=d114 D=d1140000 L=40 DMA_BIDIRECTIONAL dma map error check not applicable
k3-ringacc 4b800000.ringacc: scather-gather idx 2216 P=d1150000 N=d115 D=d1150000 L=40 DMA_BIDIRECTIONAL dma map error check not applicable

After:
k3-ringacc 4b800000.ringacc: coherent idx 2208 P=d1140000 N=d114 D=d1140000 L=40 DMA_BIDIRECTIONAL dma map error check not applicable
k3-ringacc 4b800000.ringacc: coherent idx 2216 P=d1150000 N=d115 D=d1150000 L=40 DMA_BIDIRECTIONAL dma map error check not applicable

Fixes: 2e05ea5cdc1a ("dma-mapping: implement dma_map_single_attrs using dma_map_page_attrs")
Signed-off-by: Grygorii Strashko
Signed-off-by: Christoph Hellwig

Grygorii Strashko
2020-04-09 03:46:57 +0800
cdcda0d1f dma-direct: fix data truncation in dma_direct_get_required_mask() ...

The upper 32-bit physical address gets truncated inadvertently
when dma_direct_get_required_mask() invokes phys_to_dma_direct().
This results in dma_addressing_limited() returning an incorrect value
when used on platforms with LPAE enabled.
Fix it here by explicitly type casting 'max_pfn' to phys_addr_t
in order to prevent overflow of intermediate value while evaluating
'(max_pfn - 1) << PAGE_SHIFT'.

Signed-off-by: Kishon Vijay Abraham I
Signed-off-by: Christoph Hellwig

Kishon Vijay Abraham I
2020-04-09 02:52:24 +0800

08 Apr, 2020

4 commits

9a019db0b locking/lockdep: Improve 'invalid wait context' splat ...

The 'invalid wait context' splat doesn't print all the information
required to reconstruct / validate the error, specifically the
irq-context state is missing.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar

Peter Zijlstra
2020-04-08 18:05:07 +0800
d22cc7f67 locking/percpu-rwsem: Fix a task_struct refcount ...

The following commit:

7f26482a872c ("locking/percpu-rwsem: Remove the embedded rwsem")

introduced task_struct memory leaks due to messing up the task_struct
refcount.

percpu_rwsem_wake_function() calls get_task_struct() on entry, but if the
trylock fails the waiter remains in the waitqueue. The function then runs
again and calls get_task_struct() a second time to increase the refcount,
while put_task_struct() is only called once the trylock has succeeded, so
the extra references are never dropped.

Fix it by adjusting percpu_rwsem_wake_function() a bit, while still
guarding against the use-after-free in which percpu_rwsem_wait() observes
!private, terminates the wait and does a quick exit() while
percpu_rwsem_wake_function() then calls wake_up_process(p).

Fixes: 7f26482a872c ("locking/percpu-rwsem: Remove the embedded rwsem")
Suggested-by: Peter Zijlstra
Signed-off-by: Qian Cai
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lkml.kernel.org/r/20200330213002.2374-1-cai@lca.pw

Qian Cai
2020-04-08 18:05:06 +0800
96e74ebf8 sched/debug: Add task uclamp values to SCHED_DEBUG procfs ...

Requested and effective uclamp values can be a bit tricky to decipher when
playing with cgroup hierarchies. Add them to a task's procfs when
SCHED_DEBUG is enabled.

Reviewed-by: Qais Yousef
Signed-off-by: Valentin Schneider
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lkml.kernel.org/r/20200226124543.31986-4-valentin.schneider@arm.com

Valentin Schneider
2020-04-08 17:35:27 +0800
9e3bf9469 sched/debug: Factor out printing formats into common macros ...

The printing macros in debug.c keep redefining the same output
format. Collect each output format in a single definition, and reuse that
definition in the other macros. While at it, add a layer of parentheses and
replace printf's with the newly introduced macros.

Reviewed-by: Qais Yousef
Signed-off-by: Valentin Schneider
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lkml.kernel.org/r/20200226124543.31986-3-valentin.schneider@arm.com

Valentin Schneider
2020-04-08 17:35:26 +0800