21 Dec, 2018

22 commits

  • commit 6512276d97b160d90b53285bd06f7f201459a7e3 upstream.

    If a locker taking the qspinlock slowpath reads a lock value indicating
    that only the pending bit is set, then it will spin whilst the
    concurrent pending->locked transition takes effect.

    Unfortunately, there is no guarantee that such a transition will ever be
    observed since concurrent lockers could continuously set pending and
    hand over the lock amongst themselves, leading to starvation. Whilst
    this would probably resolve in practice, it means that it is not
    possible to prove liveness properties about the lock and means that lock
    acquisition time is unbounded.

    Rather than removing the pending->locked spinning from the slowpath
    altogether (which has been shown to heavily penalise a 2-threaded
    locking stress test on x86), this patch replaces the explicit spinning
    with a call to atomic_cond_read_relaxed and allows the architecture to
    provide a bound on the number of spins. For architectures that can
    respond to changes in cacheline state in their smp_cond_load implementation,
    it should be sufficient to use the default bound of 1.
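
    To illustrate the shape of the change, here is a small standalone C11
    sketch of a bounded pending-wait (illustrative names such as PENDING_VAL,
    PENDING_LOOPS and wait_for_pending are not the kernel's; the real slowpath
    uses atomic_cond_read_relaxed() with an architecture-provided bound):

    #include <stdatomic.h>
    #include <stdio.h>

    #define PENDING_VAL   0x100 /* pending bit set, lock/tail bits clear */
    #define PENDING_LOOPS 1     /* stand-in for the per-arch spin bound  */

    static _Atomic unsigned int lockval = PENDING_VAL;

    /* Bounded wait for a concurrent pending->locked hand-over. */
    static unsigned int wait_for_pending(void)
    {
            unsigned int val = atomic_load_explicit(&lockval, memory_order_relaxed);
            int cnt = PENDING_LOOPS;

            while (val == PENDING_VAL && cnt-- > 0)
                    val = atomic_load_explicit(&lockval, memory_order_relaxed);

            return val; /* fall through to the queueing path; no unbounded spin */
    }

    int main(void)
    {
            printf("lock word after bounded wait: %#x\n", wait_for_pending());
            return 0;
    }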

    Suggested-by: Waiman Long
    Signed-off-by: Will Deacon
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Waiman Long
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: boqun.feng@gmail.com
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1524738868-31318-4-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Sasha Levin

    Will Deacon
     
  • commit 95bcade33a8af38755c9b0636e36a36ad3789fe6 upstream.

    When a locker ends up queuing on the qspinlock locking slowpath, we
    initialise the relevant mcs node and publish it indirectly by updating
    the tail portion of the lock word using xchg_tail. If we find that there
    was a pre-existing locker in the queue, we subsequently update their
    ->next field to point at our node so that we are notified when it's our
    turn to take the lock.

    This can be roughly illustrated as follows:

    /* Initialise the fields in node and encode a pointer to node in tail */
    tail = initialise_node(node);

    /*
     * Exchange tail into the lockword using an atomic read-modify-write
     * operation with release semantics
     */
    old = xchg_tail(lock, tail);

    /* If there was a pre-existing waiter ... */
    if (old & _Q_TAIL_MASK) {
            prev = decode_tail(old);
            smp_read_barrier_depends();

            /* ... then update their ->next field to point to node. */
            WRITE_ONCE(prev->next, node);
    }

    The conditional update of prev->next therefore relies on the address
    dependency from the result of xchg_tail ensuring order against the
    prior initialisation of node. However, since the release semantics of
    the xchg_tail operation apply only to the write portion of the RmW,
    then this ordering is not guaranteed and it is possible for the CPU
    to return old before the writes to node have been published, consequently
    allowing us to point prev->next to an uninitialised node.

    This patch fixes the problem by making the update of prev->next a RELEASE
    operation, which also removes the reliance on dependency ordering.
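
    The fix itself is a one-liner in spirit: the plain store that publishes
    the node becomes a store with RELEASE semantics (smp_store_release() being
    the kernel primitive for this). A standalone C11 sketch of the
    publish-with-release pattern, using illustrative types rather than the
    kernel's qspinlock structures:

    #include <stdatomic.h>
    #include <stddef.h>
    #include <stdio.h>

    struct mcs_node {
            int locked;
            _Atomic(struct mcs_node *) next;
    };

    static void publish(struct mcs_node *prev, struct mcs_node *node)
    {
            /* Initialise the node first ... */
            node->locked = 0;
            atomic_store_explicit(&node->next, NULL, memory_order_relaxed);

            /*
             * ... then make it reachable with RELEASE semantics, so that a
             * consumer reading prev->next also observes the initialisation.
             */
            atomic_store_explicit(&prev->next, node, memory_order_release);
    }

    int main(void)
    {
            struct mcs_node head = { 1, NULL };
            struct mcs_node node;

            publish(&head, &node);
            printf("published: %p\n", (void *)atomic_load(&head.next));
            return 0;
    }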

    Signed-off-by: Will Deacon
    Acked-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1518528177-19169-2-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Sasha Levin

    Will Deacon
     
  • commit 548095dea63ffc016d39c35b32c628d033638aca upstream.

    Queued spinlocks are not used by DEC Alpha, and furthermore operations
    such as READ_ONCE() and release/relaxed RMW atomics are being changed
    to imply smp_read_barrier_depends(). This commit therefore removes the
    now-redundant smp_read_barrier_depends() from queued_spin_lock_slowpath(),
    and adjusts the comments accordingly.

    Signed-off-by: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Sasha Levin

    Paul E. McKenney
     
  • commit 25896d073d8a0403b07e6dec56f58e6c33678207 upstream.

    It is troublesome to add a diagnostic like this to the Makefile
    parse stage because the top-level Makefile could be parsed with
    a stale include/config/auto.conf.

    Once you are hit by the error about a non-retpoline compiler, the
    compilation still breaks even after disabling CONFIG_RETPOLINE.

    The easiest fix is to move this check to the "archprepare" like
    this commit did:

    829fe4aa9ac1 ("x86: Allow generating user-space headers without a compiler")

    Reported-by: Meelis Roos
    Tested-by: Meelis Roos
    Signed-off-by: Masahiro Yamada
    Acked-by: Zhenzhong Duan
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Zhenzhong Duan
    Fixes: 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support")
    Link: http://lkml.kernel.org/r/1543991239-18476-1-git-send-email-yamada.masahiro@socionext.com
    Link: https://lkml.org/lkml/2018/12/4/206
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Cc: Gi-Oh Kim
    Signed-off-by: Greg Kroah-Hartman

    Masahiro Yamada
     
  • commit d55d8be0747c96db28a1d08fc24d22ccd9b448ac upstream.

    Some new variants require different firmwares.

    Signed-off-by: Junwei Zhang
    Reviewed-by: Alex Deucher
    Signed-off-by: Alex Deucher
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Junwei Zhang
     
  • commit cf66b8a0ba142fbd1bf10ac8f3ae92d1b0cb7b8f upstream.

    Braswell is really picky about having our writes posted to memory before
    we execute or else the GPU may see stale values. A wmb() is insufficient
    as it only ensures the writes are visible to other cores; we need a full
    mb() to ensure the writes are in memory and visible to the GPU.

    The most frequent failure in flushing before execution is that we see
    stale PTE values and execute the wrong pages.

    References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")
    Signed-off-by: Chris Wilson
    Cc: Mika Kuoppala
    Cc: Tvrtko Ursulin
    Cc: Joonas Lahtinen
    Cc: stable@vger.kernel.org
    Reviewed-by: Tvrtko Ursulin
    Link: https://patchwork.freedesktop.org/patch/msgid/20181206084431.9805-3-chris@chris-wilson.co.uk
    (cherry picked from commit 490b8c65b9db45896769e1095e78725775f47b3e)
    Signed-off-by: Joonas Lahtinen
    Signed-off-by: Greg Kroah-Hartman

    Chris Wilson
     
  • commit 63238173b2faf3d6b85a416f1c69af6c7be2413f upstream.

    This reverts commit 7f3ef5dedb146e3d5063b6845781ad1bb59b92b5.

    It causes new warnings [1] on shutdown when running the Google Kevin or
    Scarlet (RK3399) boards under Chrome OS. Presumably our usage of DRM is
    different than what Marc and Heiko test.

    We're looking at a different approach (e.g., [2]) to replace this, but
    IMO the revert should be taken first, as it already propagated to
    -stable.

    [1] Report here:
    http://lkml.kernel.org/lkml/20181205030127.GA200921@google.com

    WARNING: CPU: 4 PID: 2035 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x1c4/0x294
    ...
    Call trace:
    drm_mode_config_cleanup+0x1c4/0x294
    rockchip_drm_unbind+0x4c/0x8c
    component_master_del+0x88/0xb8
    rockchip_drm_platform_remove+0x2c/0x44
    rockchip_drm_platform_shutdown+0x20/0x2c
    platform_drv_shutdown+0x2c/0x38
    device_shutdown+0x164/0x1b8
    kernel_restart_prepare+0x40/0x48
    kernel_restart+0x20/0x68
    ...
    Memory manager not clean during takedown.
    WARNING: CPU: 4 PID: 2035 at drivers/gpu/drm/drm_mm.c:950 drm_mm_takedown+0x34/0x44
    ...
    drm_mm_takedown+0x34/0x44
    rockchip_drm_unbind+0x64/0x8c
    component_master_del+0x88/0xb8
    rockchip_drm_platform_remove+0x2c/0x44
    rockchip_drm_platform_shutdown+0x20/0x2c
    platform_drv_shutdown+0x2c/0x38
    device_shutdown+0x164/0x1b8
    kernel_restart_prepare+0x40/0x48
    kernel_restart+0x20/0x68
    ...

    [2] https://patchwork.kernel.org/patch/10556151/
    https://www.spinics.net/lists/linux-rockchip/msg21342.html
    [PATCH] drm/rockchip: shutdown drm subsystem on shutdown

    Fixes: 7f3ef5dedb14 ("drm/rockchip: Allow driver to be shutdown on reboot/kexec")
    Cc: Jeffy Chen
    Cc: Robin Murphy
    Cc: Vicente Bergas
    Cc: Marc Zyngier
    Cc: Heiko Stuebner
    Cc: stable@vger.kernel.org
    Signed-off-by: Brian Norris
    Signed-off-by: Heiko Stuebner
    Link: https://patchwork.freedesktop.org/patch/msgid/20181205181657.177703-1-briannorris@chromium.org
    Signed-off-by: Greg Kroah-Hartman

    Brian Norris
     
  • commit 24199c5436f267399afed0c4f1f57663c0408f57 upstream.

    Noticed this while working on redoing the reference counting scheme in
    the DP MST helpers. Nouveau doesn't attempt to call
    drm_dp_mst_topology_mgr_destroy() at all, which leaves it leaking all of
    the resources of the drm_dp_mst_topology_mgr and its child mstbs and ports.

    Fixes: f479c0ba4a17 ("drm/nouveau/kms/nv50: initial support for DP 1.2 multi-stream")
    Signed-off-by: Lyude Paul
    Cc: # v4.10+
    Signed-off-by: Ben Skeggs
    Signed-off-by: Greg Kroah-Hartman

    Lyude Paul
     
  • commit 78e7b15e17ac175e7eed9e21c6f92d03d3b0a6fa upstream.

    The arch_teardown_msi_irqs() function assumes that controller ops
    pointers were already checked in arch_setup_msi_irqs(), but this
    assumption is wrong: arch_teardown_msi_irqs() can be called even when
    arch_setup_msi_irqs() returns an error (-ENOSYS).

    This can happen in the following scenario:
    - msi_capability_init() calls pci_msi_setup_msi_irqs()
    - pci_msi_setup_msi_irqs() returns -ENOSYS
    - msi_capability_init() notices the error and calls free_msi_irqs()
    - free_msi_irqs() calls pci_msi_teardown_msi_irqs()

    This is easier to see when CONFIG_PCI_MSI_IRQ_DOMAIN is not set and
    pci_msi_setup_msi_irqs() and pci_msi_teardown_msi_irqs() are just
    aliases to arch_setup_msi_irqs() and arch_teardown_msi_irqs().

    The call to free_msi_irqs() upon pci_msi_setup_msi_irqs() failure
    seems legit, as it does additional cleanup; e.g.
    list_del(&entry->list) and kfree(entry) inside free_msi_irqs() do
    happen (MSI descriptors are allocated before pci_msi_setup_msi_irqs()
    is called and need to be cleaned up if that fails).
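
    The defensive shape of such a fix can be sketched in standalone C
    (hypothetical names, not the powerpc code): teardown must not assume that
    setup succeeded, so it re-checks the ops pointers before calling through
    them.

    #include <errno.h>
    #include <stdio.h>

    struct controller_ops {
            int  (*setup_msi_irqs)(int nvec);
            void (*teardown_msi_irqs)(void);
    };

    static int arch_setup(struct controller_ops *ops, int nvec)
    {
            if (!ops || !ops->setup_msi_irqs)
                    return -ENOSYS;         /* the caller still runs cleanup */
            return ops->setup_msi_irqs(nvec);
    }

    static void arch_teardown(struct controller_ops *ops)
    {
            /* We can get here even though setup returned -ENOSYS. */
            if (!ops || !ops->teardown_msi_irqs)
                    return;
            ops->teardown_msi_irqs();
    }

    int main(void)
    {
            struct controller_ops none = { 0 }; /* controller without MSI ops */

            if (arch_setup(&none, 1) < 0)
                    arch_teardown(&none);       /* must be safe, not crash */
            puts("teardown tolerated a failed setup");
            return 0;
    }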

    Fixes: 6b2fd7efeb88 ("PCI/MSI/PPC: Remove arch_msi_check_device()")
    Cc: stable@vger.kernel.org # v3.18+
    Signed-off-by: Radu Rendec
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Radu Rendec
     
  • commit 2840f84f74035e5a535959d5f17269c69fa6edc5 upstream.

    The following commands will cause a memory leak:

    # cd /sys/kernel/tracing
    # mkdir instances/foo
    # echo schedule > instances/foo/set_ftrace_filter
    # rmdir instances/foo

    The reason is that the hashes that hold the filters to set_ftrace_filter and
    set_ftrace_notrace are not freed if they contain any data on the instance
    and the instance is removed.

    Found by kmemleak detector.

    Cc: stable@vger.kernel.org
    Fixes: 591dffdade9f ("ftrace: Allow for function tracing instance to filter functions")
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (VMware)
     
  • commit 3cec638b3d793b7cacdec5b8072364b41caeb0e1 upstream.

    When create_event_filter() fails in set_trigger_filter(), the filter may
    still be allocated and needs to be freed. The caller expects
    data->filter to be updated with the new filter, even if the new filter
    failed (we could add an error message by setting the set_str parameter of
    create_event_filter(), but that's another update).

    But because the error path simply returned, the filter was left dangling
    and nothing could free it.

    Found by kmemleak detector.
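
    The ownership rule the fix restores can be shown with a small standalone C
    sketch (hypothetical names, not the tracing code): the create step hands
    the possibly half-built filter back through its out parameter even on
    failure, and the caller always publishes it so the normal teardown path
    frees it.

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct filter { char *str; };
    struct trigger_data { struct filter *filter; };

    /* May return an error while still handing back an allocated filter. */
    static int create_filter(const char *s, struct filter **out)
    {
            struct filter *f = calloc(1, sizeof(*f));

            *out = f;
            if (!f)
                    return -ENOMEM;
            f->str = strdup(s);
            return -EINVAL;                 /* pretend the string failed to parse */
    }

    static void free_filter(struct filter *f)
    {
            if (!f)
                    return;
            free(f->str);
            free(f);
    }

    static int set_trigger(struct trigger_data *data, const char *s)
    {
            struct filter *f = NULL;
            int ret = create_filter(s, &f);

            /* Publish even on error, so the teardown that frees data->filter
             * reclaims it instead of leaking it. */
            free_filter(data->filter);
            data->filter = f;
            return ret;
    }

    int main(void)
    {
            struct trigger_data data = { NULL };

            set_trigger(&data, "bogus filter");
            free_filter(data.filter);       /* teardown: nothing leaks */
            return 0;
    }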

    Cc: stable@vger.kernel.org
    Fixes: bac5fb97a173a ("tracing: Add and use generic set_trigger_filter() implementation")
    Reviewed-by: Tom Zanussi
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (VMware)
     
  • commit 687cf4412a343a63928a5c9d91bdc0f522939d43 upstream.

    Otherwise dm_bitset_cursor_begin() returns -ENODATA. Other calls to
    dm_bitset_cursor_begin() have similar negative checks.

    Fixes inability to create a cache in passthrough mode (even though doing
    so makes no sense).

    Fixes: 0d963b6e65 ("dm cache metadata: fix metadata2 format's blocks_are_clean_separate_dirty")
    Cc: stable@vger.kernel.org
    Reported-by: David Teigland
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Mike Snitzer
     
  • commit f6c367585d0d851349d3a9e607c43e5bea993fa1 upstream.

    Sending a DM event before a thin-pool state change is about to happen is
    a bug. It wasn't realized until it became clear that userspace response
    to the event raced with the actual state change that the event was
    meant to notify about.

    Fix this by first updating internal thin-pool state to reflect what the
    DM event is being issued about. This fixes a long-standing racey/buggy
    userspace device-mapper-test-suite 'resize_io' test that would get an
    event but not find the state it was looking for -- so it would just go
    on to hang because no other events caused the test to reevaluate the
    thin-pool's state.

    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Mike Snitzer
     
  • commit 76f4e2c3b6a560cdd7a75b87df543e04d05a9e5f upstream.

    cpu_is_mmp2() was equivalent to cpu_is_pj4(), which wouldn't be correct
    for multiplatform kernels. Fix it by also considering mmp_chip_id, as is
    done for cpu_is_pxa168() and cpu_is_pxa910() above.

    Moreover, it is only available with CONFIG_CPU_MMP2 and thus doesn't work
    on DT-based MMP2 machines. Enable it on CONFIG_MACH_MMP2_DT too.

    Note: CONFIG_CPU_MMP2 is only used for machines that use board files
    instead of DT. It should perhaps be renamed. I'm not doing it now, because
    I don't have a better idea.

    Signed-off-by: Lubomir Rintel
    Acked-by: Arnd Bergmann
    Cc: stable@vger.kernel.org
    Signed-off-by: Olof Johansson
    Signed-off-by: Greg Kroah-Hartman

    Lubomir Rintel
     
  • commit 2e64ff154ce6ce9a8dc0f9556463916efa6ff460 upstream.

    When FUSE_OPEN returns ENOSYS, the no_open bit is set on the connection.

    Because the FUSE_RELEASE and FUSE_RELEASEDIR paths share code, this
    incorrectly caused the FUSE_RELEASEDIR request to be dropped and never sent
    to userspace.

    Pass an isdir bool to distinguish between FUSE_RELEASE and FUSE_RELEASEDIR
    inside of fuse_file_put.
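
    The structure of the fix can be sketched as follows (simplified,
    hypothetical types; not the actual fuse code): the release path picks its
    opcode from the isdir flag, and the no_open short-circuit only ever
    applies to regular files.

    #include <stdbool.h>
    #include <stdio.h>

    enum opcode { RELEASE, RELEASEDIR };

    struct conn { bool no_open; };

    static void send_release(struct conn *fc, bool isdir)
    {
            enum opcode op = isdir ? RELEASEDIR : RELEASE;

            /*
             * no_open is only about regular files: a directory release must
             * still reach userspace even when FUSE_OPEN returned ENOSYS.
             */
            if (!isdir && fc->no_open) {
                    puts("RELEASE skipped (no_open)");
                    return;
            }
            puts(op == RELEASEDIR ? "sending RELEASEDIR" : "sending RELEASE");
    }

    int main(void)
    {
            struct conn fc = { .no_open = true };

            send_release(&fc, false);       /* dropped, as before the fix   */
            send_release(&fc, true);        /* directories are always sent  */
            return 0;
    }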

    Fixes: 7678ac50615d ("fuse: support clients that don't implement 'open'")
    Cc: # v3.14
    Signed-off-by: Chad Austin
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Greg Kroah-Hartman

    Chad Austin
     
  • commit b704441e38f645dcfba1348ca3cc1ba43d1a9f31 upstream.

    We observed some premature timeouts on a virtualization platform; the logs
    look like this:

    case 1:
    [159525.255629] mmc1: Internal clock never stabilised.
    [159525.255818] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [159525.256049] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
    ...
    [159525.257205] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa03
    From the clock control register dump, we are pretty sure the clock was
    stabilized.

    case 2:
    [ 914.550127] mmc1: Reset 0x2 never completed.
    [ 914.550321] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [ 914.550608] mmc1: sdhci: Sys addr: 0x00000010 | Version: 0x00001002

    After checking the sdhci code, we found the timeout check actually has a
    little window in which the CPU can be scheduled out and, when it comes
    back, the time it recorded or the check it made is no longer valid.
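
    The robust loop shape is the usual one: after the deadline has passed,
    sample the hardware condition one final time before declaring a timeout,
    so that being scheduled out between the last poll and the deadline test
    cannot turn a healthy device into a spurious failure. A standalone sketch
    of that pattern (illustrative only, not the sdhci code):

    #include <stdbool.h>
    #include <stdio.h>
    #include <time.h>

    static bool hw_ready(void)
    {
            return true;    /* stand-in for reading the controller register */
    }

    /* Returns true if the condition became true before we gave up. */
    static bool wait_ready(long timeout_ms)
    {
            struct timespec start, now;

            clock_gettime(CLOCK_MONOTONIC, &start);
            for (;;) {
                    bool timedout;

                    clock_gettime(CLOCK_MONOTONIC, &now);
                    timedout = (now.tv_sec - start.tv_sec) * 1000L +
                               (now.tv_nsec - start.tv_nsec) / 1000000L > timeout_ms;

                    /*
                     * Check the condition *after* deciding whether the deadline
                     * passed: even a long preemption between the two cannot
                     * produce a false "never stabilised" result.
                     */
                    if (hw_ready())
                            return true;
                    if (timedout)
                            return false;
            }
    }

    int main(void)
    {
            printf("ready: %d\n", wait_ready(150));
            return 0;
    }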

    Fixes: 5a436cc0af62 ("mmc: sdhci: Optimize delay loops")
    Cc: stable@vger.kernel.org # v4.12+
    Signed-off-by: Alek Du
    Acked-by: Adrian Hunter
    Signed-off-by: Ulf Hansson
    Signed-off-by: Greg Kroah-Hartman

    Alek Du
     
  • commit e8cde625bfe8a714a856e1366bcbb259d7346095 upstream.

    Since v2.6.22 or so there have been reports [1] about OMAP MMC being
    broken on OMAP15XX based hardware (OMAP5910 and OMAP310). The breakage
    seems to have been caused by commit 46a6730e3ff9 ("mmc-omap: Fix
    omap to use MMC_POWER_ON") that changed clock enabling to be done
    on MMC_POWER_ON. This can happen multiple times in a row, and on 15XX
    the hardware doesn't seem to like it and the MMC just stops responding.
    Fix by memorizing the power mode and doing the init only when necessary.
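
    The shape of the fix, as a standalone sketch (illustrative names, not the
    mmci-omap code): remember the last power mode acted upon and run the
    power-up initialisation only when the mode actually changes.

    #include <stdio.h>

    enum power_mode { POWER_OFF, POWER_UP, POWER_ON };

    struct host {
            enum power_mode power_mode;     /* last mode we acted on */
    };

    static void set_power(struct host *host, enum power_mode mode)
    {
            if (host->power_mode == mode)
                    return;                 /* repeated MMC_POWER_ON: no-op */

            host->power_mode = mode;
            if (mode == POWER_ON)
                    puts("running clock/power init once");
    }

    int main(void)
    {
            struct host host = { POWER_OFF };

            set_power(&host, POWER_ON);
            set_power(&host, POWER_ON);     /* now harmless */
            return 0;
    }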

    Before the patch (on Palm TE):

    mmc0: new SD card at address b368
    mmcblk0: mmc0:b368 SDC 977 MiB
    mmci-omap mmci-omap.0: command timeout (CMD18)
    mmci-omap mmci-omap.0: command timeout (CMD13)
    mmci-omap mmci-omap.0: command timeout (CMD13)
    mmci-omap mmci-omap.0: command timeout (CMD12) [x 6]
    mmci-omap mmci-omap.0: command timeout (CMD13) [x 6]
    mmcblk0: error -110 requesting status
    mmci-omap mmci-omap.0: command timeout (CMD8)
    mmci-omap mmci-omap.0: command timeout (CMD18)
    mmci-omap mmci-omap.0: command timeout (CMD13)
    mmci-omap mmci-omap.0: command timeout (CMD13)
    mmci-omap mmci-omap.0: command timeout (CMD12) [x 6]
    mmci-omap mmci-omap.0: command timeout (CMD13) [x 6]
    mmcblk0: error -110 requesting status
    mmcblk0: recovery failed!
    print_req_error: I/O error, dev mmcblk0, sector 0
    Buffer I/O error on dev mmcblk0, logical block 0, async page read
    mmcblk0: unable to read partition table

    After the patch:

    mmc0: new SD card at address b368
    mmcblk0: mmc0:b368 SDC 977 MiB
    mmcblk0: p1

    The patch is based on a fix and analysis done by Ladislav Michl.

    Tested on OMAP15XX/OMAP310 (Palm TE), OMAP1710 (Nokia 770)
    and OMAP2420 (Nokia N810).

    [1] https://marc.info/?t=123175197000003&r=1&w=2

    Fixes: 46a6730e3ff9 ("mmc-omap: Fix omap to use MMC_POWER_ON")
    Reported-by: Ladislav Michl
    Reported-by: Andrzej Zaborowski
    Tested-by: Ladislav Michl
    Acked-by: Tony Lindgren
    Signed-off-by: Aaro Koskinen
    Cc: stable@vger.kernel.org
    Signed-off-by: Ulf Hansson
    Signed-off-by: Greg Kroah-Hartman

    Aaro Koskinen
     
  • commit 3238c359acee4ab57f15abb5a82b8ab38a661ee7 upstream.

    We need to invalidate the caches *before* clearing the buffer via the
    non-cacheable alias, else in the worst case __dma_flush_area() may
    write back dirty lines over the top of our nice new zeros.

    Fixes: dd65a941f6ba ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag")
    Cc: # 4.18.x-
    Acked-by: Will Deacon
    Signed-off-by: Robin Murphy
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Robin Murphy
     
  • commit 01e881f5a1fca4677e82733061868c6d6ea05ca7 upstream.

    Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd
    could trigger a harmless false positive WARN_ON. Check that the vma is
    already registered before checking VM_MAYWRITE to shut off the false
    positive warning.

    Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com
    Cc:
    Fixes: 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas")
    Signed-off-by: Andrea Arcangeli
    Reported-by: syzbot+06c7092e7d71218a2c16@syzkaller.appspotmail.com
    Acked-by: Mike Rapoport
    Acked-by: Hugh Dickins
    Acked-by: Peter Xu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andrea Arcangeli
     
  • commit a538e3ff9dabcdf6c3f477a373c629213d1c3066 upstream.

    Matthew pointed out that the ioctx_table is susceptible to spectre v1,
    because the index can be controlled by an attacker. The below patch
    should mitigate the attack for all of the aio system calls.
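
    The standard mitigation is to clamp the attacker-controllable index with a
    branchless mask before it is used for the table lookup; in the kernel this
    is what array_index_nospec() provides. A userspace sketch of the masking
    idea, modeled on the generic helper (not the aio hunk itself):

    #include <stdio.h>

    #define BITS_PER_LONG (8 * sizeof(unsigned long))

    /*
     * All-ones when index < size, all-zeroes otherwise, computed without a
     * conditional branch that could be mispredicted.
     */
    static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
    {
            return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
    }

    static unsigned long clamp_index(unsigned long index, unsigned long size)
    {
            return index & index_mask_nospec(index, size);
    }

    int main(void)
    {
            unsigned long table_size = 8;

            /*
             * An in-bounds index passes through; an out-of-bounds one collapses
             * to 0, so a speculative table[index] load cannot reach
             * attacker-chosen memory.
             */
            printf("%lu %lu\n", clamp_index(5, table_size),
                   clamp_index(1000, table_size));
            return 0;
    }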

    Cc: stable@vger.kernel.org
    Reported-by: Matthew Wilcox
    Reported-by: Dan Carpenter
    Signed-off-by: Jeff Moyer
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Jeff Moyer
     
  • commit 478b6767ad26ab86d9ecc341027dd09a87b1f997 upstream.

    Pin PH11 is used on various A83T boards to detect a change in the OTG
    port's ID pin, as in when an OTG host cable is plugged in.

    The incorrect offset meant the gpiochip/irqchip was activating the wrong
    pin for interrupts.

    Fixes: 4730f33f0d82 ("pinctrl: sunxi: add allwinner A83T PIO controller support")
    Cc:
    Signed-off-by: Chen-Yu Tsai
    Acked-by: Maxime Ripard
    Signed-off-by: Linus Walleij
    Signed-off-by: Greg Kroah-Hartman

    Chen-Yu Tsai
     
  • [ Upstream commit 8e7df2b5b7f245c9bd11064712db5cb69044a362 ]

    While it uses %pK, there are still few reasons to allow reading this file
    as non-root.

    Suggested-by: Linus Torvalds
    Acked-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin

    Ingo Molnar
     

17 Dec, 2018

18 commits

  • Greg Kroah-Hartman
     
  • commit f9bfe4e6a9d08d405fe7b081ee9a13e649c97ecf upstream.

    tcp_tso_should_defer() can return true in three different cases:

    1) We are cwnd-limited
    2) We are rwnd-limited
    3) We are application limited.

    Neal pointed out that my recent fix went too far, since
    it assumed that if we were not in the 1) case, we must be rwnd-limited.

    Fix this by properly populating the is_cwnd_limited and
    is_rwnd_limited booleans.

    After this change, we can finally move the silly check for FIN
    flag only for the application-limited case.

    The same move for the EOR bit will be handled in net-next,
    since commit 1c09f7d073b1 ("tcp: do not try to defer skbs
    with eor mark (MSG_EOR)") is scheduled for linux-4.21.

    Tested by running 200 concurrent netperf -t TCP_RR -- -r 60000,100
    and checking none of them was rwnd_limited in the chrono_stat
    output from "ss -ti" command.

    Fixes: 41727549de3e ("tcp: Do not underestimate rwnd_limited")
    Signed-off-by: Eric Dumazet
    Suggested-by: Neal Cardwell
    Reviewed-by: Neal Cardwell
    Acked-by: Soheil Hassas Yeganeh
    Reviewed-by: Yuchung Cheng
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • commit 36d842194a57f1b21fbc6a6875f2fa2f9a7f8679 upstream.

    When running with KASAN, the following trace is produced:

    [ 62.535888]

    ==================================================================
    [ 62.544930] BUG: KASAN: slab-out-of-bounds in
    gut_hw_stats+0x122/0x230 [hfi1]
    [ 62.553856] Write of size 8 at addr ffff88080e8d6330 by task
    kworker/0:1/14

    [ 62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted
    4.19.0-test-build-kasan+ #8
    [ 62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
    SE5C610.86B.01.01.0019.101220160604 10/12/2016
    [ 62.587951] Workqueue: events work_for_cpu_fn
    [ 62.594050] Call Trace:
    [ 62.598023] dump_stack+0xc6/0x14c
    [ 62.603089] ? dump_stack_print_info.cold.1+0x2f/0x2f
    [ 62.610041] ? kmsg_dump_rewind_nolock+0x59/0x59
    [ 62.616615] ? get_hw_stats+0x122/0x230 [hfi1]
    [ 62.622985] print_address_description+0x6c/0x23c
    [ 62.629744] ? get_hw_stats+0x122/0x230 [hfi1]
    [ 62.636108] kasan_report.cold.6+0x241/0x308
    [ 62.642365] get_hw_stats+0x122/0x230 [hfi1]
    [ 62.648703] ? hfi1_alloc_rn+0x40/0x40 [hfi1]
    [ 62.655088] ? __kmalloc+0x110/0x240
    [ 62.660695] ? hfi1_alloc_rn+0x40/0x40 [hfi1]
    [ 62.667142] setup_hw_stats+0xd8/0x430 [ib_core]
    [ 62.673972] ? show_hfi+0x50/0x50 [hfi1]
    [ 62.680026] ib_device_register_sysfs+0x165/0x180 [ib_core]
    [ 62.687995] ib_register_device+0x5a2/0xa10 [ib_core]
    [ 62.695340] ? show_hfi+0x50/0x50 [hfi1]
    [ 62.701421] ? ib_unregister_device+0x2e0/0x2e0 [ib_core]
    [ 62.709222] ? __vmalloc_node_range+0x2d0/0x380
    [ 62.716131] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
    [ 62.723735] ? vmalloc_node+0x5c/0x70
    [ 62.729697] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
    [ 62.737347] ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt]
    [ 62.744998] ? __rvt_alloc_mr+0x110/0x110 [rdmavt]
    [ 62.752315] ? rvt_rc_error+0x140/0x140 [rdmavt]
    [ 62.759434] ? rvt_vma_open+0x30/0x30 [rdmavt]
    [ 62.766364] ? mutex_unlock+0x1d/0x40
    [ 62.772445] ? kmem_cache_create_usercopy+0x15d/0x230
    [ 62.780115] rvt_register_device+0x1f6/0x360 [rdmavt]
    [ 62.787823] ? rvt_get_port_immutable+0x180/0x180 [rdmavt]
    [ 62.796058] ? __get_txreq+0x400/0x400 [hfi1]
    [ 62.802969] ? memcpy+0x34/0x50
    [ 62.808611] hfi1_register_ib_device+0xde6/0xeb0 [hfi1]
    [ 62.816601] ? hfi1_get_npkeys+0x10/0x10 [hfi1]
    [ 62.823760] ? hfi1_init+0x89f/0x9a0 [hfi1]
    [ 62.830469] ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1]
    [ 62.838204] ? pcie_capability_clear_and_set_word+0xcd/0xe0
    [ 62.846429] ? pcie_capability_read_word+0xd0/0xd0
    [ 62.853791] ? hfi1_pcie_init+0x187/0x4b0 [hfi1]
    [ 62.860958] init_one+0x67f/0xae0 [hfi1]
    [ 62.867301] ? hfi1_init+0x9a0/0x9a0 [hfi1]
    [ 62.873876] ? wait_woken+0x130/0x130
    [ 62.879860] ? read_word_at_a_time+0xe/0x20
    [ 62.886329] ? strscpy+0x14b/0x280
    [ 62.891998] ? hfi1_init+0x9a0/0x9a0 [hfi1]
    [ 62.898405] local_pci_probe+0x70/0xd0
    [ 62.904295] ? pci_device_shutdown+0x90/0x90
    [ 62.910833] work_for_cpu_fn+0x29/0x40
    [ 62.916750] process_one_work+0x584/0x960
    [ 62.922974] ? rcu_work_rcufn+0x40/0x40
    [ 62.928991] ? __schedule+0x396/0xdc0
    [ 62.934806] ? __sched_text_start+0x8/0x8
    [ 62.941020] ? pick_next_task_fair+0x68b/0xc60
    [ 62.947674] ? run_rebalance_domains+0x260/0x260
    [ 62.954471] ? __list_add_valid+0x29/0xa0
    [ 62.960607] ? move_linked_works+0x1c7/0x230
    [ 62.967077] ?
    trace_event_raw_event_workqueue_execute_start+0x140/0x140
    [ 62.976248] ? mutex_lock+0xa6/0x100
    [ 62.982029] ? __mutex_lock_slowpath+0x10/0x10
    [ 62.988795] ? __switch_to+0x37a/0x710
    [ 62.994731] worker_thread+0x62e/0x9d0
    [ 63.000602] ? max_active_store+0xf0/0xf0
    [ 63.006828] ? __switch_to_asm+0x40/0x70
    [ 63.012932] ? __switch_to_asm+0x34/0x70
    [ 63.019013] ? __switch_to_asm+0x40/0x70
    [ 63.025042] ? __switch_to_asm+0x34/0x70
    [ 63.031030] ? __switch_to_asm+0x40/0x70
    [ 63.037006] ? __schedule+0x396/0xdc0
    [ 63.042660] ? kmem_cache_alloc_trace+0xf3/0x1f0
    [ 63.049323] ? kthread+0x59/0x1d0
    [ 63.054594] ? ret_from_fork+0x35/0x40
    [ 63.060257] ? __sched_text_start+0x8/0x8
    [ 63.066212] ? schedule+0xcf/0x250
    [ 63.071529] ? __wake_up_common+0x110/0x350
    [ 63.077794] ? __schedule+0xdc0/0xdc0
    [ 63.083348] ? wait_woken+0x130/0x130
    [ 63.088963] ? finish_task_switch+0x1f1/0x520
    [ 63.095258] ? kasan_unpoison_shadow+0x30/0x40
    [ 63.101792] ? __init_waitqueue_head+0xa0/0xd0
    [ 63.108183] ? replenish_dl_entity.cold.60+0x18/0x18
    [ 63.115151] ? _raw_spin_lock_irqsave+0x25/0x50
    [ 63.121754] ? max_active_store+0xf0/0xf0
    [ 63.127753] kthread+0x1ae/0x1d0
    [ 63.132894] ? kthread_bind+0x30/0x30
    [ 63.138422] ret_from_fork+0x35/0x40

    [ 63.146973] Allocated by task 14:
    [ 63.152077] kasan_kmalloc+0xbf/0xe0
    [ 63.157471] __kmalloc+0x110/0x240
    [ 63.162804] init_cntrs+0x34d/0xdf0 [hfi1]
    [ 63.168883] hfi1_init_dd+0x29a3/0x2f90 [hfi1]
    [ 63.175244] init_one+0x551/0xae0 [hfi1]
    [ 63.181065] local_pci_probe+0x70/0xd0
    [ 63.186759] work_for_cpu_fn+0x29/0x40
    [ 63.192310] process_one_work+0x584/0x960
    [ 63.198163] worker_thread+0x62e/0x9d0
    [ 63.203843] kthread+0x1ae/0x1d0
    [ 63.208874] ret_from_fork+0x35/0x40

    [ 63.217203] Freed by task 1:
    [ 63.221844] __kasan_slab_free+0x12e/0x180
    [ 63.227844] kfree+0x92/0x1a0
    [ 63.232570] single_release+0x3a/0x60
    [ 63.238024] __fput+0x1d9/0x480
    [ 63.242911] task_work_run+0x139/0x190
    [ 63.248440] exit_to_usermode_loop+0x191/0x1a0
    [ 63.254814] do_syscall_64+0x301/0x330
    [ 63.260283] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    [ 63.270199] The buggy address belongs to the object at
    ffff88080e8d5500
    which belongs to the cache kmalloc-4096 of size 4096
    [ 63.287247] The buggy address is located 3632 bytes inside of
    4096-byte region [ffff88080e8d5500, ffff88080e8d6500)
    [ 63.303564] The buggy address belongs to the page:
    [ 63.310447] page:ffffea00203a3400 count:1 mapcount:0
    mapping:ffff88081380e840 index:0x0 compound_mapcount: 0
    [ 63.323102] flags: 0x2fffff80008100(slab|head)
    [ 63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001
    ffff88081380e840
    [ 63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff
    0000000000000000
    [ 63.350564] page dumped because: kasan: bad access detected

    [ 63.361974] Memory state around the buggy address:
    [ 63.369137] ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00
    [ 63.379082] ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00
    [ 63.389032] >ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc
    fc fc fc
    [ 63.398944] ^
    [ 63.406141] ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc
    fc fc fc
    [ 63.416109] ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc
    fc fc fc
    [ 63.426099]
    ==================================================================

    The trace happens because get_hw_stats() assumes there is room in the
    memory allocated in init_cntrs() to accommodate the driver counters.
    Unfortunately, that routine only allocated space for the device
    counters.

    Fix by ensuring the allocation has room for the additional driver
    counters.

    Cc: # v4.14+
    Fixes: b7481944b06e9 ("IB/hfi1: Show statistics counters under IB stats interface")
    Reviewed-by: Mike Marciniczyn
    Reviewed-by: Mike Ruhl
    Signed-off-by: Piotr Stankiewicz
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Piotr Stankiewicz
     
  • commit bde1a7459623a66c2abec4d0a841e4b06cc88d9a upstream.

    If a headphone or headset is plugged into the jack and the system is then
    rebooted, there is a chance that the headphone produces no sound. Running
    the headphone mode procedure once after boot fixes the issue.
    This also applies to ALC234, ALC274 and ALC294.

    Signed-off-by: Kailang Yang
    Cc:
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Kailang Yang
     
  • commit fa9c98e4b975bb3192ed6af09d9fa282ed3cd8a0 upstream.

    The initial commit referred to the 'SYNC_STATUS' register to get the
    clock configuration; however, according to my local reverse-engineering
    notes on packet dumps, this is wrong. It should be the 'CLOCK_CONFIG'
    register. ff400_dump_clock_config() is actually programmed correctly.

    This commit fixes the bug.

    Cc: # v4.12+
    Fixes: 76fdb3a9e13a ('ALSA: fireface: add support for Fireface 400')
    Signed-off-by: Takashi Sakamoto
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Sakamoto
     
  • commit fd29edc7232bc19f969e8f463138afc5472b3d5f upstream.

    gcc 8.1.0 generates the following warnings.

    drivers/staging/speakup/kobjects.c: In function 'punc_store':
    drivers/staging/speakup/kobjects.c:522:2: warning:
    'strncpy' output truncated before terminating nul
    copying as many bytes from a string as its length
    drivers/staging/speakup/kobjects.c:504:6: note: length computed here

    drivers/staging/speakup/kobjects.c: In function 'synth_store':
    drivers/staging/speakup/kobjects.c:391:2: warning:
    'strncpy' output truncated before terminating nul
    copying as many bytes from a string as its length
    drivers/staging/speakup/kobjects.c:388:8: note: length computed here

    Using strncpy() is indeed less than perfect since the length of data to
    be copied has already been determined with strlen(). Replace strncpy()
    with memcpy() to address the warning and optimize the code a little.
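
    The pattern behind both the warning and the fix, in a standalone sketch:
    once the length has been measured with strlen(), strncpy() adds nothing
    (with a bound equal to the source length it will not NUL-terminate), so a
    memcpy() of the measured bytes plus an explicit terminator states the
    intent directly.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            const char *src = "say hello";
            char dst[64];
            size_t len = strlen(src);       /* length already known here */

            if (len >= sizeof(dst))
                    len = sizeof(dst) - 1;  /* clamp, as the caller must anyway */

            /*
             * strncpy(dst, src, len) would copy the same bytes but leave dst
             * unterminated (len == strlen(src)), which is exactly what gcc 8
             * warns about; memcpy() plus an explicit NUL is clearer.
             */
            memcpy(dst, src, len);
            dst[len] = '\0';

            puts(dst);
            return 0;
    }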

    Signed-off-by: Guenter Roeck
    Reviewed-by: Samuel Thibault
    Signed-off-by: Greg Kroah-Hartman

    Guenter Roeck
     
  • commit 320f35b7bf8cccf1997ca3126843535e1b95e9c4 upstream.

    Since commit bb21ce0ad227 we always enforce per-mirror stateid.
    However, this makes sense only for v4+ servers.

    Signed-off-by: Tigran Mkrtchyan
    Signed-off-by: Trond Myklebust
    Signed-off-by: Greg Kroah-Hartman

    Tigran Mkrtchyan
     
  • commit 0b548e33e6cb2bff240fdaf1783783be15c29080 upstream.

    Fengguang reported soft lockups while running the rbtree and interval
    tree test modules. The logic for these tests all occur in init phase,
    and we currently are pounding with the default values for number of
    nodes and number of iterations of each test. Reduce the latter by two
    orders of magnitude. This does not influence the value of the tests in
    that one thousand times by default is enough to get the picture.

    Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805
    Signed-off-by: Davidlohr Bueso
    Reported-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Cc: Guenter Roeck
    Signed-off-by: Greg Kroah-Hartman

    Davidlohr Bueso
     
  • [ Upstream commit c14376de3a1befa70d9811ca2872d47367b48767 ]

    wake_klogd is a local variable in console_unlock(). The information
    is lost when the console_lock owner is using the busy wait added by
    commit dbdda842fe96f8932 ("printk: Add console owner and waiter
    logic to load balance console writes"). The following race is
    possible:

    CPU0                                    CPU1

    console_unlock()

      for (;;)
         /* calling console for last message */

                                            printk()
                                              log_store()
                                                log_next_seq++;

         /* see new message */
         if (seen_seq != log_next_seq) {
            wake_klogd = true;
            seen_seq = log_next_seq;
         }

         console_lock_spinning_enable();

                                              if (console_trylock_spinning())
                                                 /* spinning */

         if (console_lock_spinning_disable_and_check()) {
            printk_safe_exit_irqrestore(flags);
            return;

                                              console_unlock()
                                                if (seen_seq != log_next_seq) {
                                                /* already seen */
                                                /* nothing to do */

    Result: Nobody would wakeup klogd.

    One solution would be to make a global variable from wake_klogd.
    But then we would need to manipulate it under a lock or so.

    This patch wakes klogd also when console_lock is passed to the
    spinning waiter. It looks like the right way to go. Also userspace
    should have a chance to see and store any "flood" of messages.

    Note that the very late klogd wake up was a historic solution.
    It made sense on single CPU systems or when sys_syslog() operations
    were synchronized using the big kernel lock like in v2.1.113.
    But it is questionable these days.

    Fixes: dbdda842fe96f8932 ("printk: Add console owner and waiter logic to load balance console writes")
    Link: http://lkml.kernel.org/r/20180226155734.dzwg3aovqnwtvkoy@pathway.suse.cz
    Cc: Steven Rostedt
    Cc: linux-kernel@vger.kernel.org
    Cc: Tejun Heo
    Suggested-by: Sergey Senozhatsky
    Reviewed-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Signed-off-by: Sasha Levin

    Petr Mladek
     
  • [ Upstream commit fd5f7cde1b85d4c8e09ca46ce948e008a2377f64 ]

    This patch, basically, reverts commit 6b97a20d3a79 ("printk:
    set may_schedule for some of console_trylock() callers").
    That commit was a mistake, it introduced a big dependency
    on the scheduler, by enabling preemption under console_sem
    in printk()->console_unlock() path, which is rather too
    critical. The patch did not significantly reduce the
    possibilities of printk() lockups, but made it possible to
    stall printk(), as has been reported by Tetsuo Handa [1].

    Another issue is that preemption under console_sem also
    messes up with Steven Rostedt's hand off scheme, by making
    it possible to sleep with console_sem both in console_unlock()
    and in vprintk_emit(), after acquiring the console_sem
    ownership (anywhere between printk_safe_exit_irqrestore() in
    console_trylock_spinning() and printk_safe_enter_irqsave()
    in console_unlock()). This makes hand off less likely and,
    at the same time, may result in a significant amount of
    pending logbuf messages. Preempted console_sem owner makes
    it impossible for other CPUs to emit logbuf messages, but
    does not make it impossible for other CPUs to append new
    messages to the logbuf.

    Reinstate the old behavior and make printk() non-preemptible.
    Should any printk() lockup reports arrive they must be handled
    in a different way.

    [1] http://lkml.kernel.org/r/201603022101.CAH73907.OVOOMFHFFtQJSL%20()%20I-love%20!%20SAKURA%20!%20ne%20!%20jp
    Fixes: 6b97a20d3a79 ("printk: set may_schedule for some of console_trylock() callers")
    Link: http://lkml.kernel.org/r/20180116044716.GE6607@jagdpanzerIV
    To: Tetsuo Handa
    Cc: Sergey Senozhatsky
    Cc: Tejun Heo
    Cc: akpm@linux-foundation.org
    Cc: linux-mm@kvack.org
    Cc: Cong Wang
    Cc: Dave Hansen
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Jan Kara
    Cc: Mathieu Desnoyers
    Cc: Byungchul Park
    Cc: Pavel Machek
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Reported-by: Tetsuo Handa
    Reviewed-by: Steven Rostedt (VMware)
    Signed-off-by: Petr Mladek
    Signed-off-by: Sasha Levin

    Sergey Senozhatsky
     
  • [ Upstream commit c162d5b4338d72deed61aa65ed0f2f4ba2bbc8ab ]

    The commit ("printk: Add console owner and waiter logic to load balance
    console writes") made vprintk_emit() and console_unlock() even more
    complicated.

    This patch extracts the new code into 3 helper functions. They should
    help to keep it rather self-contained. It will be easier to use and
    maintain.

    This patch just shuffles the existing code. It does not change
    the functionality.

    Link: http://lkml.kernel.org/r/20180112160837.GD24497@linux.suse
    Cc: akpm@linux-foundation.org
    Cc: linux-mm@kvack.org
    Cc: Cong Wang
    Cc: Dave Hansen
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Jan Kara
    Cc: Mathieu Desnoyers
    Cc: Tetsuo Handa
    Cc: rostedt@home.goodmis.org
    Cc: Byungchul Park
    Cc: Tejun Heo
    Cc: Pavel Machek
    Cc: linux-kernel@vger.kernel.org
    Reviewed-by: Steven Rostedt (VMware)
    Acked-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Signed-off-by: Sasha Levin

    Petr Mladek
     
  • [ Upstream commit dbdda842fe96f8932bae554f0adf463c27c42bc7 ]

    This patch implements what I discussed in Kernel Summit. I added
    lockdep annotation (hopefully correctly), and it hasn't had any splats
    (since I fixed some bugs in the first iterations). It did catch
    problems when I had the owner covering too much. But now that the owner
    is only set when actively calling the consoles, lockdep has stayed
    quiet.

    Here's the design again:

    I added a "console_owner" which is set to a task that is actively
    writing to the consoles. It is *not* the same as the owner of the
    console_lock. It is only set when doing the calls to the console
    functions. It is protected by a console_owner_lock which is a raw spin
    lock.

    There is a console_waiter. This is set when there is an active console
    owner that is not current, and waiter is not set. This too is protected
    by console_owner_lock.

    In printk() when it tries to write to the consoles, we have:

    if (console_trylock())
            console_unlock();

    Now I added an else, which will check if there is an active owner, and
    no current waiter. If that is the case, then console_waiter is set, and
    the task goes into a spin until it is no longer set.

    When the active console owner finishes writing the current message to
    the consoles, it grabs the console_owner_lock and sees if there is a
    waiter, and clears console_owner.

    If there is a waiter, then it breaks out of the loop, clears the waiter
    flag (because that will release the waiter from its spin), and exits.
    Note, it does *not* release the console semaphore. Because it is a
    semaphore, there is no owner. Another task may release it. This means
    that the waiter is guaranteed to be the new console owner! Which it
    becomes.

    Then the waiter calls console_unlock() and continues to write to the
    consoles.

    If another task comes along and does a printk() it too can become the
    new waiter, and we wash rinse and repeat!

    By Petr Mladek about possible new deadlocks:

    The thing is that we move console_sem only to printk() call
    that normally calls console_unlock() as well. It means that
    the transferred owner should not bring new type of dependencies.
    As Steven said somewhere: "If there is a deadlock, it was
    there even before."

    We could look at it from this side. The possible deadlock would
    look like:

    CPU0                                    CPU1

    console_unlock()

      console_owner = current;

                                            spin_lockA()
                                              printk()
                                                spin = true;
                                                while (...)

      call_console_drivers()
        spin_lockA()

    This would be a deadlock. CPU0 would wait for the lock A.
    While CPU1 would own the lockA and would wait for CPU0
    to finish calling the console drivers and pass the console_sem
    owner.

    But if the above is true than the following scenario was
    already possible before:

    CPU0

    spin_lockA()
      printk()
        console_unlock()
          call_console_drivers()
            spin_lockA()

    By other words, this deadlock was there even before. Such
    deadlocks are prevented by using printk_deferred() in
    the sections guarded by the lock A.

    By Steven Rostedt:

    To demonstrate the issue, this module has been shown to lock up a
    system with 4 CPUs and a slow console (like a serial console). It is
    also able to lock up a 8 CPU system with only a fast (VGA) console, by
    passing in "loops=100". The changes in this commit prevent this module
    from locking up the system.

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/delay.h>
    #include <linux/workqueue.h>
    #include <linux/percpu.h>
    #include <linux/preempt.h>

    static bool stop_testing;
    static unsigned int loops = 1;

    static void preempt_printk_workfn(struct work_struct *work)
    {
            int i;

            while (!READ_ONCE(stop_testing)) {
                    for (i = 0; i < loops && !READ_ONCE(stop_testing); i++) {
                            preempt_disable();
                            pr_emerg("%5d%-75s\n", smp_processor_id(),
                                     " XXX NOPREEMPT");
                            preempt_enable();
                    }
                    msleep(1);
            }
    }

    static struct work_struct __percpu *works;

    static void finish(void)
    {
            int cpu;

            WRITE_ONCE(stop_testing, true);
            for_each_online_cpu(cpu)
                    flush_work(per_cpu_ptr(works, cpu));
            free_percpu(works);
    }

    static int __init test_init(void)
    {
            int cpu;

            works = alloc_percpu(struct work_struct);
            if (!works)
                    return -ENOMEM;

            /*
             * This is just a test module. This will break if you
             * do any CPU hot plugging between loading and
             * unloading the module.
             */

            for_each_online_cpu(cpu) {
                    struct work_struct *work = per_cpu_ptr(works, cpu);

                    INIT_WORK(work, &preempt_printk_workfn);
                    schedule_work_on(cpu, work);
            }

            return 0;
    }

    static void __exit test_exit(void)
    {
            finish();
    }

    module_param(loops, uint, 0);
    module_init(test_init);
    module_exit(test_exit);
    MODULE_LICENSE("GPL");

    Link: http://lkml.kernel.org/r/20180110132418.7080-2-pmladek@suse.com
    Cc: akpm@linux-foundation.org
    Cc: linux-mm@kvack.org
    Cc: Cong Wang
    Cc: Dave Hansen
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Jan Kara
    Cc: Mathieu Desnoyers
    Cc: Tetsuo Handa
    Cc: Byungchul Park
    Cc: Tejun Heo
    Cc: Pavel Machek
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Steven Rostedt (VMware)
    [pmladek@suse.com: Commit message about possible deadlocks]
    Acked-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Signed-off-by: Sasha Levin

    Steven Rostedt (VMware)
     
  • This reverts commit c9b8d580b3fb0ab65d37c372aef19a318fda3199.

    This is just a technical revert to make the printk fix apply cleanly,
    this patch will be re-picked in about 3 commits.

    Sasha Levin
     
  • [ Upstream commit 164f7e586739d07eb56af6f6d66acebb11f315c8 ]

    ocfs2_get_dentry() calls iput(inode) to drop the reference count of
    inode, and if the reference count hits 0, inode is freed. However, in
    this function, it then reads inode->i_generation, which may result in a
    use after free bug. Move the put operation later.
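
    The pattern of the fix, in a small standalone sketch with a toy refcounted
    object (not the ocfs2 code): capture what you need before dropping the
    reference, or equivalently drop the reference only after the last use.

    #include <stdio.h>
    #include <stdlib.h>

    struct inode {
            int refcount;
            unsigned int i_generation;
    };

    static void iput(struct inode *inode)
    {
            if (--inode->refcount == 0)
                    free(inode);            /* inode may be gone after this */
    }

    int main(void)
    {
            struct inode *inode = malloc(sizeof(*inode));
            unsigned int gen;

            if (!inode)
                    return 1;
            inode->refcount = 1;
            inode->i_generation = 42;

            /*
             * Buggy order: iput(inode) followed by inode->i_generation is a
             * use-after-free. Fixed order: read the field first, put last.
             */
            gen = inode->i_generation;
            iput(inode);

            printf("generation %u\n", gen);
            return 0;
    }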

    Link: http://lkml.kernel.org/r/1543109237-110227-1-git-send-email-bianpan2016@163.com
    Fixes: 781f200cb7a ("ocfs2: Remove masklog ML_EXPORT.")
    Signed-off-by: Pan Bian
    Reviewed-by: Andrew Morton
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Pan Bian
     
  • [ Upstream commit 8de456cf87ba863e028c4dd01bae44255ce3d835 ]

    CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to
    recursive calls.

    fill_pool
    kmemleak_ignore
    make_black_object
    put_object
    __call_rcu (kernel/rcu/tree.c)
    debug_rcu_head_queue
    debug_object_activate
    debug_object_init
    fill_pool
    kmemleak_ignore
    make_black_object
    ...

    So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly
    allocated debug objects at all.

    Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us
    Signed-off-by: Qian Cai
    Suggested-by: Catalin Marinas
    Acked-by: Waiman Long
    Acked-by: Catalin Marinas
    Cc: Thomas Gleixner
    Cc: Yang Shi
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Qian Cai
     
  • [ Upstream commit c7d7d620dcbd2a1c595092280ca943f2fced7bbd ]

    hfs_bmap_free() frees the node via hfs_bnode_put(node). However, it then
    reads node->this when dumping an error message on an error path, which may
    result in a use-after-free bug. This patch frees the node only when it is
    no longer used.

    Link: http://lkml.kernel.org/r/1543053441-66942-1-git-send-email-bianpan2016@163.com
    Signed-off-by: Pan Bian
    Reviewed-by: Andrew Morton
    Cc: Ernesto A. Fernandez
    Cc: Joe Perches
    Cc: Viacheslav Dubeyko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Pan Bian
     
  • [ Upstream commit ce96a407adef126870b3f4a1b73529dd8aa80f49 ]

    hfs_bmap_free() frees the node via hfs_bnode_put(node). However, it
    then reads node->this when dumping an error message on an error path, which
    may result in a use-after-free bug. This patch frees the node only when
    it is never again used.

    Link: http://lkml.kernel.org/r/1542963889-128825-1-git-send-email-bianpan2016@163.com
    Fixes: a1185ffa2fc ("HFS rewrite")
    Signed-off-by: Pan Bian
    Reviewed-by: Andrew Morton
    Cc: Joe Perches
    Cc: Ernesto A. Fernandez
    Cc: Viacheslav Dubeyko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Pan Bian
     
  • [ Upstream commit 8f416836c0d50b198cad1225132e5abebf8980dc ]

    init_currently_empty_zone() will adjust pgdat->nr_zones and set it to
    'zone_idx(zone) + 1' unconditionally. This is correct in the normal
    case, but not exact in the hot-plug situation.

    This function is used in two places:

    * free_area_init_core()
    * move_pfn_range_to_zone()

    In the first case, we are sure the zone index increases monotonically,
    while in the second one this is under the user's control.

    One way to reproduce this is:
    ----------------------------

    1. create a virtual machine with empty node1

    -m 4G,slots=32,maxmem=32G \
    -smp 4,maxcpus=8 \
    -numa node,nodeid=0,mem=4G,cpus=0-3 \
    -numa node,nodeid=1,mem=0G,cpus=4-7

    2. hot-add cpu 3-7

    cpu-add [3-7]

    3. hot-add memory to node1

    object_add memory-backend-ram,id=ram0,size=1G
    device_add pc-dimm,id=dimm0,memdev=ram0,node=1

    4. online the memory in the following order

    echo online_movable > memory47/state
    echo online > memory40/state

    After this, node1 will have its nr_zones equal to (ZONE_NORMAL + 1)
    instead of (ZONE_MOVABLE + 1).

    Michal said:
    "Having an incorrect nr_zones might result in all sorts of problems
    which would be quite hard to debug (e.g. reclaim not considering the
    movable zone). I do not expect many users would suffer from this it
    but still this is trivial and obviously right thing to do so
    backporting to the stable tree shouldn't be harmful (last famous
    words)"

    Link: http://lkml.kernel.org/r/20181117022022.9956-1-richard.weiyang@gmail.com
    Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
    Signed-off-by: Wei Yang
    Acked-by: Michal Hocko
    Reviewed-by: Oscar Salvador
    Cc: Anshuman Khandual
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Signed-off-by: Sasha Levin

    Wei Yang