Eric Lee / smarc-fsl-linux-kernel

30 Sep, 2016

8 commits

f39180efe sched/core: Remove unused @cpu argument from destroy_sched_domain*()

Small cleanup; nothing uses the @cpu argument so make it go away.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2016-09-30 16:54:05 +0800
0176beaff sched/wait: Introduce init_wait_entry()

The partial initialization of wait_queue_t in prepare_to_wait_event() looks
ugly. This was done to shrink .text, but we can simply add the new helper
which does the full initialization and shrink the compiled code a bit more.

And this way prepare_to_wait_event() can have more users. In particular we
are ready to remove the signal_pending_state() checks from wait_bit_action_f
helpers and change __wait_on_bit_lock() to use prepare_to_wait_event().

Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra (Intel)
Cc: Al Viro
Cc: Bart Van Assche
Cc: Johannes Weiner
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Neil Brown
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20160906140055.GA6167@redhat.com
Signed-off-by: Ingo Molnar

Oleg Nesterov
2016-09-30 16:54:03 +0800
eaf9ef522 sched/wait: Avoid abort_exclusive_wait() in __wait_on_bit_lock()

__wait_on_bit_lock() doesn't need abort_exclusive_wait() too. Right
now it can't use prepare_to_wait_event() (see the next change), but
it can do the additional finish_wait() if action() fails.

abort_exclusive_wait() no longer has callers, remove it.

Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra (Intel)
Cc: Al Viro
Cc: Bart Van Assche
Cc: Johannes Weiner
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Neil Brown
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20160906140053.GA6164@redhat.com
Signed-off-by: Ingo Molnar

Oleg Nesterov
2016-09-30 16:54:03 +0800
b1ea06a90 sched/wait: Avoid abort_exclusive_wait() in ___wait_event()

___wait_event() doesn't really need abort_exclusive_wait(), we can simply
change prepare_to_wait_event() to remove the waiter from q->task_list if
it was interrupted.

This simplifies the code/logic, and this way prepare_to_wait_event() can
have more users, see the next change.

Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra (Intel)
Cc: Al Viro
Cc: Bart Van Assche
Cc: Johannes Weiner
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Neil Brown
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20160908164815.GA18801@redhat.com
Signed-off-by: Ingo Molnar
--
include/linux/wait.h | 7 +------
kernel/sched/wait.c | 35 +++++++++++++++++++++++++----------
2 files changed, 26 insertions(+), 16 deletions(-)

Oleg Nesterov
2016-09-30 16:53:44 +0800
38a3e1fc1 sched/wait: Fix abort_exclusive_wait(), it should pass TASK_NORMAL to wake_up()

Otherwise this logic only works if mode is "compatible" with another
exclusive waiter.

If some wq has both TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE waiters,
abort_exclusive_wait() won't wake an uninterruptible waiter.

The main user is __wait_on_bit_lock() and currently it is fine but only
because TASK_KILLABLE includes TASK_UNINTERRUPTIBLE and we do not have
lock_page_interruptible() yet.

Just use TASK_NORMAL and remove the "mode" arg from abort_exclusive_wait().
Yes, this means that (say) wake_up_interruptible() can wake up the non-
interruptible waiter(s), but I think this is fine. And in fact I think
that abort_exclusive_wait() must die, see the next change.
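
The mode-matching problem described above can be illustrated with a small userspace sketch. It assumes the usual state bit values (TASK_INTERRUPTIBLE = 1, TASK_UNINTERRUPTIBLE = 2); `struct waiter` and `wake_one()` are hypothetical names for illustration and are not the kernel's wait-queue API.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed state bits; TASK_NORMAL covers both sleep states. */
#define TASK_INTERRUPTIBLE   0x1
#define TASK_UNINTERRUPTIBLE 0x2
#define TASK_NORMAL (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/* Hypothetical waiter: the state it sleeps in and whether it was woken. */
struct waiter {
    int state;
    int woken;
};

/*
 * Wake the first waiter whose sleep state matches the mode mask.
 * Returns the index of the woken waiter, or -1 if nobody matched.
 */
static int wake_one(struct waiter *w, size_t n, int mode)
{
    for (size_t i = 0; i < n; i++) {
        if (w[i].state & mode) {
            w[i].woken = 1;
            return (int)i;
        }
    }
    return -1;
}
```

With a queue holding only an uninterruptible waiter, passing TASK_INTERRUPTIBLE as the mode wakes nobody, while TASK_NORMAL wakes it. That is the case the commit message argues for.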

Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra (Intel)
Cc: Al Viro
Cc: Bart Van Assche
Cc: Johannes Weiner
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Neil Brown
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20160906140047.GA6157@redhat.com
Signed-off-by: Ingo Molnar

Oleg Nesterov
2016-09-30 16:53:19 +0800
ab522e33f sched/fair: Fix fixed point arithmetic width for shares and effective load

Since commit:

2159197d6677 ("sched/core: Enable increased load resolution on 64-bit kernels")

we now have two different fixed point units for load:

- 'shares' in calc_cfs_shares() has 20 bit fixed point unit on 64-bit
kernels. Therefore use scale_load() on MIN_SHARES.

- 'wl' in effective_load() has 10 bit fixed point unit. Therefore use
scale_load_down() on tg->shares which has 20 bit fixed point unit on
64-bit kernels.
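
As a rough userspace illustration of the two units, the sketch below assumes a 10-bit shift between the low-resolution and high-resolution representations on 64-bit kernels and a MIN_SHARES of 2. Both values are assumptions here, and `clamp_shares()` is a hypothetical helper, not the body of calc_cfs_shares().

```c
#include <assert.h>

/* Assumed for illustration: extra resolution bits on 64-bit kernels. */
#define SCHED_FIXEDPOINT_SHIFT 10
/* Assumed lower bound, expressed in the low-resolution (10-bit) unit. */
#define MIN_SHARES 2UL

/* Convert a low-resolution load value to the high-resolution unit. */
static unsigned long scale_load(unsigned long w)
{
    return w << SCHED_FIXEDPOINT_SHIFT;
}

/* Convert a high-resolution load value back to the low-resolution unit. */
static unsigned long scale_load_down(unsigned long w)
{
    return w >> SCHED_FIXEDPOINT_SHIFT;
}

/*
 * Hypothetical clamp: 'shares' and 'tg_shares' are both high-resolution,
 * so the lower bound has to be scaled up before comparing.
 */
static unsigned long clamp_shares(unsigned long shares, unsigned long tg_shares)
{
    if (shares < scale_load(MIN_SHARES))
        shares = scale_load(MIN_SHARES);
    if (shares > tg_shares)
        shares = tg_shares;
    return shares;
}
```

Comparing a high-resolution value against an unscaled MIN_SHARES would make the lower bound 1024 times too small, which is the kind of mismatch the two bullet points address.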

Signed-off-by: Dietmar Eggemann
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1471874441-24701-1-git-send-email-dietmar.eggemann@arm.com
Signed-off-by: Ingo Molnar

Dietmar Eggemann
2016-09-30 16:53:19 +0800
8f37961cf sched/core, x86/topology: Fix NUMA in package topology bug

Current code can call set_cpu_sibling_map() and invoke sched_set_topology()
more than once (e.g. on CPU hot plug). When this happens after
sched_init_smp() has been called, we lose the NUMA topology extension to
sched_domain_topology in sched_init_numa(). This results in incorrect
topology when the sched domain is rebuilt.

This patch fixes the bug and issues a warning if we call sched_set_topology()
after sched_init_smp().
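
The guard can be pictured with a small userspace sketch. All names below (`set_topology()`, `init_smp()`, the string-valued topology) are hypothetical stand-ins chosen for illustration; the kernel works on topology-level tables and uses its own warning machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the scheduler state. */
static const char *sched_topology = "default";
static bool smp_initialized;
static int topology_warnings;

/* Refuse to replace the topology once SMP init has extended it. */
static void set_topology(const char *levels)
{
    if (smp_initialized) {
        topology_warnings++;    /* stands in for a kernel warning */
        return;
    }
    sched_topology = levels;
}

/* SMP init extends whatever topology is installed with NUMA levels. */
static void init_smp(void)
{
    sched_topology = "numa-in-package+numa";
    smp_initialized = true;
}
```

A late second call, such as one triggered by CPU hotplug, then leaves the extended topology in place and is counted as a warning instead of silently discarding the NUMA levels.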

Signed-off-by: Tim Chen
Signed-off-by: Srinivas Pandruvada
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: bp@suse.de
Cc: jolsa@redhat.com
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/1474485552-141429-2-git-send-email-srinivas.pandruvada@linux.intel.com
Signed-off-by: Ingo Molnar

Tim Chen
2016-09-30 16:53:18 +0800
536e0e81e Merge branch 'linus' into sched/core, to pick up fixes

Signed-off-by: Ingo Molnar

Ingo Molnar
2016-09-30 16:44:27 +0800

29 Sep, 2016

7 commits

53061afee Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
"4 fixes"

* emailed patches from Andrew Morton:
mem-hotplug: use nodes that contain memory as mask in new_node_page()
scripts/recordmcount.c: account for .softirqentry.text
dma-mapping.h: preserve unmap info for CONFIG_DMA_API_DEBUG
mm,ksm: fix endless looping in allocating memory when ksm enable

Linus Torvalds
2016-09-29 07:20:24 +0800
231e97e2b mem-hotplug: use nodes that contain memory as mask in new_node_page()

9bb627be47a5 ("mem-hotplug: don't clear the only node in new_node_page()")
prevents allocating from an empty nodemask, but as David points out, it is
still wrong. As node_online_map may include memoryless nodes, only
allocating from these nodes is meaningless.

This patch uses node_states[N_MEMORY] mask to prevent the above case.
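
A minimal sketch of the mask selection, using a plain bitmask instead of the kernel's nodemask_t. The helper name `alloc_mask()` and the fallback of keeping the original node when no other node has memory are assumptions made for illustration.

```c
#include <assert.h>

/*
 * Build the set of nodes to allocate from when migrating pages off node
 * 'nid'. Start from the nodes that actually have memory, drop 'nid', and
 * fall back to 'nid' itself so the mask is never empty (assumed fallback).
 */
static unsigned long alloc_mask(unsigned long nodes_with_memory, int nid)
{
    unsigned long mask = nodes_with_memory & ~(1UL << nid);

    if (mask == 0)
        mask = 1UL << nid;
    return mask;
}
```

Starting from the set of online nodes instead could leave only memoryless nodes in the mask, which is the situation the commit message calls meaningless.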

Fixes: 9bb627be47a5 ("mem-hotplug: don't clear the only node in new_node_page()")
Fixes: 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest neighbor node when mem-offline")
Link: http://lkml.kernel.org/r/1474447117.28370.6.camel@TP420
Signed-off-by: Li Zhong
Suggested-by: David Rientjes
Acked-by: Vlastimil Babka
Cc: Michal Hocko
Cc: John Allen
Cc: Xishi Qiu
Cc: Joonsoo Kim
Cc: Naoya Horiguchi
Cc: Tetsuo Handa
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zhong
2016-09-29 07:19:02 +0800
e436fd61a scripts/recordmcount.c: account for .softirqentry.text

be7635e7287e ("arch, ftrace: for KASAN put hard/soft IRQ entries into
separate sections") added .softirqentry.text section, but it was not added
to recordmcount. So functions in the section are untraceable. Add the
section to scripts/recordmcount.c and scripts/recordmcount.pl.

Fixes: be7635e7287e ("arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections")
Link: http://lkml.kernel.org/r/1474902626-73468-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov
Acked-by: Steve Rostedt
Cc: [4.6+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dmitry Vyukov
2016-09-29 07:19:02 +0800
2481366af dma-mapping.h: preserve unmap info for CONFIG_DMA_API_DEBUG

When CONFIG_DMA_API_DEBUG is enabled we need to preserve the unmapping address
even if "unmap" is a no-op for our architecture because we need
debug_dma_unmap_page() to correctly clean up all of the debug bookkeeping.
Failing to do so results in false positive warnings about previously
mapped areas never being unmapped.

Link: http://lkml.kernel.org/r/1474387125-3713-1-git-send-email-andrew.smirnov@gmail.com
Signed-off-by: Andrey Smirnov
Reviewed-by: Robin Murphy
Cc: Joerg Roedel
Cc: Will Deacon
Cc: Zhen Lei
Cc: "Luis R. Rodriguez"
Cc: Christian Borntraeger
Cc: Geliang Tang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrey Smirnov
2016-09-29 07:19:01 +0800
5b398e416 mm,ksm: fix endless looping in allocating memory when ksm enable

I hit the following hung task when running an OOM LTP test case with 4.1
kernel.

Call trace:
[] __switch_to+0x74/0x8c
[] __schedule+0x23c/0x7bc
[] schedule+0x3c/0x94
[] rwsem_down_write_failed+0x214/0x350
[] down_write+0x64/0x80
[] __ksm_exit+0x90/0x19c
[] mmput+0x118/0x11c
[] do_exit+0x2dc/0xa74
[] do_group_exit+0x4c/0xe4
[] get_signal+0x444/0x5e0
[] do_signal+0x1d8/0x450
[] do_notify_resume+0x70/0x78

The oom victim cannot terminate because it needs to take mmap_sem for
write while the lock is held by ksmd for read which loops in the page
allocator:

    ksm_do_scan
        scan_get_next_rmap_item
            down_read
            get_next_rmap_item
                alloc_rmap_item   # ksmd will loop permanently.

There is no way forward because the oom victim cannot release any memory
in 4.1 based kernel. Since 4.6 we have the oom reaper which would solve
this problem because it would release the memory asynchronously.
Nevertheless we can relax alloc_rmap_item requirements and use
__GFP_NORETRY because the allocation failure is acceptable as ksm_do_scan
would just retry later after the lock got dropped.

Such a patch would be also easy to backport to older stable kernels which
do not have oom_reaper.

While we are at it add __GFP_NOWARN so the admin doesn't have to be alarmed
by the allocation failure.

Link: http://lkml.kernel.org/r/1474165570-44398-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang
Suggested-by: Hugh Dickins
Suggested-by: Michal Hocko
Acked-by: Michal Hocko
Acked-by: Hugh Dickins
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

zhong jiang
2016-09-29 07:19:01 +0800
ae6dd8d61 Merge tag 'for-linus-20160928' of git://git.infradead.org/linux-mtd

Pull late MTD fixes from Brian Norris:
"Another round of MTD fixes for v4.8

My apologies for sending this so late. I've been fairly absent as a
maintainer this cycle, but I did queue these up weeks ago. In the
meantime, Richard was able to handle some other fixes (thanks!) but
didn't pick these up.

On the bright side, these are very simple changes that should carry
little risk.

Summary:

- Davinci NAND: fix a long-standing bug in how we clear/prep 4-bit ECC

- OMAP NAND: an error-handling fix that made it into v4.8-rc1 caused
error-handling cases in other configurations/code-paths; this fixes
the fix"

* tag 'for-linus-20160928' of git://git.infradead.org/linux-mtd:
mtd: nand: davinci: Reinitialize the HW ECC engine in 4bit hwctl
mtd: nand: omap2: Don't call dma_release_channel() if dma_request_chan() failed

Linus Torvalds
2016-09-29 03:53:08 +0800
0a966fa89 MAINTAINERS: Update my e-mail

I will be starting employment at Versity next week and would like to update
my MAINTAINERS e-mail to reflect that change. My versity e-mail is already
activated so I shouldn't get any bounces on the new one. My ability to help
with Ocfs2 kernel maintenance won't change as a result of the new job.

Signed-off-by: Mark Fasheh
Signed-off-by: Linus Torvalds

Mark Fasheh
2016-09-29 03:52:05 +0800

28 Sep, 2016

1 commit

8ab293e3a Merge branch 'for-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
"Three late fixes for cgroup: Two cpuset ones, one trivial and the
other pretty obscure, and a cgroup core fix for a bug which impacts
cgroup v2 namespace users"

* 'for-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix invalid controller enable rejections with cgroup namespace
cpuset: fix non static symbol warning
cpuset: handle race between CPU hotplug and cpuset_hotplug_work

Linus Torvalds
2016-09-28 07:43:11 +0800

26 Sep, 2016

9 commits

08895a8b6 Linux 4.8-rc8

Linus Torvalds
2016-09-26 09:47:13 +0800
4c04b4b53 Merge tag 'trace-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracefs fixes from Steven Rostedt:
"Al Viro has been looking at the tracefs code, and has pointed out some
issues. This contains one fix by me and one by Al. I'm sure that
he'll come up with more but for now I tested these patches and they
don't appear to have any negative impact on tracing"

* tag 'trace-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
fix memory leaks in tracing_buffers_splice_read()
tracing: Move mutex to protect against resetting of seq data

Linus Torvalds
2016-09-26 09:40:13 +0800
90b75db64 fault_in_multipages_readable() throws set-but-unused error

When building XFS with -Werror, it now fails with:

include/linux/pagemap.h: In function 'fault_in_multipages_readable':
include/linux/pagemap.h:602:16: error: variable 'c' set but not used [-Werror=unused-but-set-variable]
volatile char c;
^

This is a regression caused by commit e23d4159b109 ("fix
fault_in_multipages_...() on architectures with no-op access_ok()").
Fix it by re-adding the "(void)c" trick that was previously used to make
the compiler think the variable is used.
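
The "(void)c" idiom can be shown in a small userspace sketch. `touch_pages()` and `SKETCH_PAGE_SIZE` are hypothetical names; this is not the body of fault_in_multipages_readable(), only the same pattern of reading one byte per page into a volatile variable that is otherwise unused.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096   /* hypothetical page size */

/*
 * Read one byte from each page of the buffer so the pages get touched.
 * Returns the number of reads performed.
 */
static size_t touch_pages(const char *buf, size_t size)
{
    volatile char c = 0;    /* volatile keeps the reads from being elided */
    size_t touched = 0;

    for (size_t off = 0; off < size; off += SKETCH_PAGE_SIZE) {
        c = buf[off];
        touched++;
    }
    (void)c;    /* mark 'c' as used to avoid the set-but-unused diagnostic */
    return touched;
}
```

Without the final cast, a compiler run with -Werror=unused-but-set-variable may reject the function, as in the build log quoted above.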

Signed-off-by: Dave Chinner
Cc: Al Viro
Signed-off-by: Linus Torvalds

Dave Chinner
2016-09-26 09:16:44 +0800
38e088546 mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing

The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
PMDs respectively as requiring balancing upon a subsequent page fault.
User-defined PROT_NONE memory regions which also have this flag set will
not normally invoke the NUMA balancing code as do_page_fault() will send
a segfault to the process before handle_mm_fault() is even called.

However if access_remote_vm() is invoked to access a PROT_NONE region of
memory, handle_mm_fault() is called via faultin_page() and
__get_user_pages() without any access checks being performed, meaning
the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
region.

A simple means of triggering this problem is to access PROT_NONE mmap'd
memory using /proc/self/mem which reliably results in the NUMA handling
functions being invoked when CONFIG_NUMA_BALANCING is set.

This issue was reported in bugzilla (issue 99101) which includes some
simple repro code.

There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
added at commit c0e7cad to avoid accidentally provoking strange
behaviour by attempting to apply NUMA balancing to pages that are in
fact PROT_NONE. The BUG_ON()'s are consistently triggered by the repro.

This patch moves the PROT_NONE check into mm/memory.c rather than
invoking BUG_ON() as faulting in these pages via faultin_page() is a
valid reason for reaching the NUMA check with the PROT_NONE page table
flag set and is therefore not always a bug.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101
Reported-by: Trevor Saunders
Signed-off-by: Lorenzo Stoakes
Acked-by: Rik van Riel
Cc: Andrew Morton
Cc: Mel Gorman
Signed-off-by: Linus Torvalds

Lorenzo Stoakes
2016-09-26 06:43:42 +0800
831e45d84 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
"A round of 4.8 fixes:

MIPS generic code:
- Add a missing ".set pop" in an early commit
- Fix memory regions reaching top of physical
- MAAR: Fix address alignment
- vDSO: Fix Malta EVA mapping to vDSO page structs
- uprobes: fix incorrect uprobe brk handling
- uprobes: select HAVE_REGS_AND_STACK_ACCESS_API
- Avoid a BUG warning during PR_SET_FP_MODE prctl
- SMP: Fix possibility of deadlock when bringing CPUs online
- R6: Remove compact branch policy Kconfig entries
- Fix size calc when avoiding IPIs for small icache flushes
- Fix pre-r6 emulation FPU initialisation
- Fix delay slot emulation count in debugfs

ATH79:
- Fix test for error return of clk_register_fixed_factor.

Octeon:
- Fix kernel header to work for VDSO build.
- Fix initialization of platform device probing.

paravirt:
- Fix undefined reference to smp_bootstrap"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix delay slot emulation count in debugfs
MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
MIPS: Fix pre-r6 emulation FPU initialisation
MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs
MIPS: Select HAVE_REGS_AND_STACK_ACCESS_API
MIPS: Octeon: Fix platform bus probing
MIPS: Octeon: mangle-port: fix build failure with VDSO code
MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)
MIPS: c-r4k: Fix size calc when avoiding IPIs for small icache flushes
MIPS: Add a missing ".set pop" in an early commit
MIPS: paravirt: Fix undefined reference to smp_bootstrap
MIPS: Remove compact branch policy Kconfig entries
MIPS: MAAR: Fix address alignment
MIPS: Fix memory regions reaching top of physical
MIPS: uprobes: fix incorrect uprobe brk handling
MIPS: ath79: Fix test for error return of clk_register_fixed_factor().

Linus Torvalds
2016-09-26 04:59:52 +0800
751b9a5d1 Merge tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull one more powerpc fix from Michael Ellerman:
"powernv/pci: Fix m64 checks for SR-IOV and window alignment from
Russell Currey"

* tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment

Linus Torvalds
2016-09-26 04:52:59 +0800
8d2c0d36d radix tree: fix sibling entry handling in radix_tree_descend()

The fixes to the radix tree test suite show that the multi-order case is
broken. The basic reason is that the radix tree code uses tagged
pointers with the "internal" bit in the low bits, and calculating the
pointer indices was supposed to mask off those bits. But gcc will
notice that we then use the index to re-create the pointer, and will
avoid doing the arithmetic and use the tagged pointer directly.

This cleans the code up, using the existing is_sibling_entry() helper to
validate the sibling pointer range (instead of open-coding it), and
using entry_to_node() to mask off the low tag bit from the pointer. And
once you do that, you might as well just use the now cleaned-up pointer
directly.
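
The tagged-pointer scheme can be sketched in userspace as follows. The tag value (low bit set for an internal entry) and the `struct node` layout are assumptions for illustration; only the helper name entry_to_node() is taken from the commit message, and its body here is a guess at the general technique.

```c
#include <assert.h>
#include <stdint.h>

#define INTERNAL_TAG ((uintptr_t)1)   /* assumed: low bit marks an internal entry */

/* Hypothetical node; its alignment leaves the low pointer bit free. */
struct node {
    int value;
};

/* Tag a node pointer so it can be stored as an "internal" entry. */
static void *node_to_entry(struct node *n)
{
    return (void *)((uintptr_t)n | INTERNAL_TAG);
}

/* Test whether an entry carries the internal tag. */
static int is_internal(void *entry)
{
    return ((uintptr_t)entry & INTERNAL_TAG) != 0;
}

/* Mask off the tag bit to recover the real node pointer. */
static struct node *entry_to_node(void *entry)
{
    return (struct node *)((uintptr_t)entry & ~INTERNAL_TAG);
}
```

Going through one helper that masks the tag explicitly, and then using the resulting pointer directly, avoids relying on index arithmetic that the compiler is free to fold away.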

[ Side note: the multi-order code isn't actually ever used in the kernel
right now, and the only reason I didn't just delete all that code is
that Kirill Shutemov piped up and said:

"Well, my ext4-with-huge-pages patchset[1] uses multi-order entries.
It also converts shmem-with-huge-pages and hugetlb to them.

I'm okay with converting it to other mechanism, but I need
something. (I looked into Konstantin's RFC patchset[2]. It looks
okay, but I don't feel myself qualified to review it as I don't
know much about radix-tree internals.)"

[1] http://lkml.kernel.org/r/20160915115523.29737-1-kirill.shutemov@linux.intel.com
[2] http://lkml.kernel.org/r/147230727479.9957.1087787722571077339.stgit@zurg ]

Reported-by: Matthew Wilcox
Cc: Andrew Morton
Cc: Ross Zwisler
Cc: Johannes Weiner
Cc: Kirill A. Shutemov
Cc: Konstantin Khlebnikov
Cc: Cedric Blancher
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-09-26 04:32:46 +0800
62fd5258e radix tree test suite: Test radix_tree_replace_slot() for multiorder entries

When we replace a multiorder entry, check that all indices reflect the
new value.

Also, compile the test suite with -O2, which shows other problems with
the code due to some dodgy pointer operations in the radix tree code.

Signed-off-by: Matthew Wilcox
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-09-26 02:49:16 +0800
1ae2293dd fix memory leaks in tracing_buffers_splice_read()

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro

Al Viro
2016-09-26 01:30:13 +0800

25 Sep, 2016

11 commits

1245800c0 tracing: Move mutex to protect against resetting of seq data

The iter->seq can be reset outside the protection of the mutex. So can
reading of user data. Move the mutex up to the beginning of the function.

Fixes: d7350c3f45694 ("tracing/core: make the read callbacks reentrants")
Cc: stable@vger.kernel.org # 2.6.30+
Reported-by: Al Viro
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2016-09-25 22:27:08 +0800
116e7111c MIPS: Fix delay slot emulation count in debugfs

Commit 432c6bacbd0c ("MIPS: Use per-mm page to execute branch delay slot
instructions") accidentally removed use of the MIPS_FPU_EMU_INC_STATS
macro from do_dsemulret, leading to the ds_emul file in debugfs always
returning zero even though we perform delay slot emulations.

Fix this by re-adding the use of the MIPS_FPU_EMU_INC_STATS macro.

Signed-off-by: Paul Burton
Fixes: 432c6bacbd0c ("MIPS: Use per-mm page to execute branch delay slot instructions")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14301/
Signed-off-by: Ralf Baechle

Paul Burton
2016-09-25 07:59:16 +0800
8f46cca1e MIPS: SMP: Fix possibility of deadlock when bringing CPUs online

This patch fixes the possibility of a deadlock when bringing up
secondary CPUs.
The deadlock occurs because set_cpu_online() is called before
synchronise_count_slave(). This can cause a deadlock if the boot CPU,
having scheduled another thread, attempts to send an IPI to the
secondary CPU, which it sees has been marked online. The secondary is
blocked in synchronise_count_slave() waiting for the boot CPU to enter
synchronise_count_master(), but the boot cpu is blocked in
smp_call_function_many() waiting for the secondary to respond to its
IPI request.

Fix this by marking the CPU online in cpu_callin_map and synchronising
counters before declaring the CPU online and calculating the maps for
IPIs.

Signed-off-by: Matt Redfearn
Reported-by: Justin Chen
Tested-by: Justin Chen
Cc: Florian Fainelli
Cc: stable@vger.kernel.org # v4.1+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14302/
Signed-off-by: Ralf Baechle

Matt Redfearn
2016-09-25 07:43:52 +0800
9c0e28a7b Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
"Three fixlets for perf:

- add a missing NULL pointer check in the intel BTS driver

- make BTS an exclusive PMU because BTS can only handle one event at
a time

- ensure that exclusive events are limited to one PMU so that several
exclusive events can be scheduled on different PMU instances"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Limit matching exclusive events to one PMU
perf/x86/intel/bts: Make it an exclusive PMU
perf/x86/intel/bts: Make sure debug store is valid

Linus Torvalds
2016-09-25 03:44:28 +0800
2507c8566 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
"Two smallish fixes:

- use the proper asm constraint in the Super-H atomic_fetch_ops

- a trivial typo fix in the Kconfig help text"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
locking/atomic, arch/sh: Fix ATOMIC_FETCH_OP()

Linus Torvalds
2016-09-25 03:41:19 +0800
709b8f67d Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI fixes from Thomas Gleixner:
"Two fixes for EFI/PAT:

- a 32bit overflow bug in the PAT code which was unearthed by the
large EFI mappings

- prevent a boot hang on large systems when EFI mixed mode is enabled
but not used"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Only map RAM into EFI page tables if in mixed-mode
x86/mm/pat: Prevent hang during boot when mapping pages

Linus Torvalds
2016-09-25 03:35:26 +0800
4b8b0ff60 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"Three fixes for irq core and irq chip drivers:

- Do not set the irq type if type is NONE. Fixes a boot regression
on various SoCs

- Use the proper cpu for setting up the GIC target list. Discovered
by the cpumask debugging code.

- A rather large fix for the MIPS-GIC so per cpu local interrupts
work again. This was discovered late because the code falls back
to slower timers which use normal device interrupts"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mips-gic: Fix local interrupts
irqchip/gicv3: Silence noisy DEBUG_PER_CPU_MAPS warning
genirq: Skip chained interrupt trigger setup if type is IRQ_TYPE_NONE

Linus Torvalds
2016-09-25 03:30:12 +0800
0f2657417 Merge branch 'hughd-fixes' (patches from Hugh Dickins)

Merge VM fixes from Hugh Dickins:
"I get the impression that Andrew is away or busy at the moment, so I'm
going to send you three independent uncontroversial little mm fixes
directly - though none is strictly a 4.8 regression fix.

- shmem: fix tmpfs to handle the huge= option properly from Toshi
Kani is a one-liner to fix a major embarrassment in 4.8's hugepages
on tmpfs feature: although Hillf pointed it out in June, somehow
both Kirill and I repeatedly dropped the ball on this one. You
might wonder if the feature got tested at all with that bug in:
yes, it did, but for wider testing coverage, Kirill and I had each
relied too much on an override which bypasses that condition.

- huge tmpfs: fix Committed_AS leak just a run-of-the-mill accounting
fix in the same feature.

- mm: delete unnecessary and unsafe init_tlb_ubc() is an unrelated
fix to 4.3's TLB flush batching in reclaim: the bug would be rare,
and none of us will be shamed if this one misses 4.8; but it got
such a quick ack from Mel today that I'm inclined to offer it along
with the first two"

* emailed patches from Hugh Dickins:
mm: delete unnecessary and unsafe init_tlb_ubc()
huge tmpfs: fix Committed_AS leak
shmem: fix tmpfs to handle the huge= option properly

Linus Torvalds
2016-09-25 02:31:45 +0800
b385d21f2 mm: delete unnecessary and unsafe init_tlb_ubc()

init_tlb_ubc() looked unnecessary to me: tlb_ubc is statically
initialized with zeroes in the init_task, and copied from parent to
child while it is quiescent in arch_dup_task_struct(); so I went to
delete it.

But inserted temporary debug WARN_ONs in place of init_tlb_ubc() to
check that it was always empty at that point, and found them firing:
because memcg reclaim can recurse into global reclaim (when allocating
biosets for swapout in my case), and arrive back at the init_tlb_ubc()
in shrink_node_memcg().

Resetting tlb_ubc.flush_required at that point is wrong: if the upper
level needs a deferred TLB flush, but the lower level turns out not to,
we miss a TLB flush. But fortunately, that's the only part of the
protocol that does not nest: with the initialization removed, cpumask
collects bits from upper and lower levels, and flushes TLB when needed.
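
The nesting problem can be modelled in a few lines of userspace C. `struct tlb_batch` and the helpers below are hypothetical stand-ins for the per-task batching state; the only behaviour taken from the text is that the removed initialization reset flush_required.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task deferred TLB flush state. */
struct tlb_batch {
    unsigned long cpumask;      /* CPUs that may hold stale entries */
    bool flush_required;
};

/* Record that a page was unmapped while 'cpu' may have it cached. */
static void note_unmap(struct tlb_batch *b, int cpu)
{
    b->cpumask |= 1UL << cpu;
    b->flush_required = true;
}

/*
 * Inner (nested) reclaim pass that unmaps nothing itself.
 * 'reset_first' mimics the removed initialization call.
 */
static void inner_reclaim(struct tlb_batch *b, bool reset_first)
{
    if (reset_first)
        b->flush_required = false;
}

/* Outer pass: unmap one page, recurse, then report whether a flush is pending. */
static bool outer_reclaim_needs_flush(bool reset_in_inner)
{
    struct tlb_batch b = { 0, false };

    note_unmap(&b, 0);
    inner_reclaim(&b, reset_in_inner);
    return b.flush_required;
}
```

When the inner pass resets the flag, the outer pass loses its pending flush even though the CPU bit is still set; without the reset, the state simply accumulates across both levels.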

Fixes: 72b252aed506 ("mm: send one IPI per CPU to TLB flush all entries after unmapping pages")
Signed-off-by: Hugh Dickins
Acked-by: Mel Gorman
Cc: stable@vger.kernel.org # 4.3+
Signed-off-by: Linus Torvalds

Hugh Dickins
2016-09-25 02:20:01 +0800
71664665c huge tmpfs: fix Committed_AS leak

Under swapping load on huge tmpfs, /proc/meminfo's Committed_AS grows
bigger and bigger: just a cosmetic issue for most users, but disabling
for those who run without overcommit (/proc/sys/vm/overcommit_memory 2).

shmem_uncharge() was forgetting to unaccount __vm_enough_memory's
charge, and shmem_charge() was forgetting it on the filesystem-full
error path.

Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Signed-off-by: Linus Torvalds

Hugh Dickins
2016-09-25 02:20:01 +0800
3089bf614 shmem: fix tmpfs to handle the huge= option properly

shmem_get_unmapped_area() checks SHMEM_SB(sb)->huge incorrectly, which
leads to a reversed effect of "huge=" mount option.

Fix the check in shmem_get_unmapped_area().

Note, the default value of SHMEM_SB(sb)->huge remains as
SHMEM_HUGE_NEVER. Users will need to specify the "huge=" option to enable
huge page mappings.

Reported-by: Hillf Danton
Signed-off-by: Toshi Kani
Acked-by: Kirill A. Shutemov
Reviewed-by: Aneesh Kumar K.V
Signed-off-by: Hugh Dickins
Signed-off-by: Linus Torvalds

Toshi Kani
2016-09-25 02:20:01 +0800

24 Sep, 2016

4 commits

bd5dbcb4b Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"Three driver bugfixes: fixing uninitialized memory pointers (eg20t),
pm/clock imbalance (qup), and a wrongly set cached variable (pca954x)"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
i2c: mux: pca954x: retry updating the mux selection on failure
i2c-eg20t: fix race between i2c init and interrupt enable

Linus Torvalds
2016-09-24 07:44:12 +0800
d0c1d15f5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:
"Just a fix up for the firmware handling to the Silead driver (which is
a new driver in this release)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: silead_gsl1680 - use "silead/" prefix for firmware loading
Input: silead_gsl1680 - document firmware-name, fix implementation

Linus Torvalds
2016-09-24 07:34:24 +0800
4ee698662 Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Three fixes, two regressions and one that poses a problem in blk-mq
with the new nvmef code"

* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
nvme-rdma: only clear queue flags after successful connect
blk-throttle: Extend slice if throttle group is not empty

Linus Torvalds
2016-09-24 07:24:36 +0800
9157056da cgroup: fix invalid controller enable rejections with cgroup namespace

On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it. The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2b0 ("cgroup: introduce cgroup namespaces").

When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace. This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.

Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.
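
The difference between the two checks can be sketched with a simplified linked list. `struct css_set_sketch` and both helpers are hypothetical; the kernel uses its own list and link structures, and this only contrasts "is anything linked" with "does anything linked have tasks".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified css_set: a task count and a link to the next set. */
struct css_set_sketch {
    int nr_tasks;
    struct css_set_sketch *next;
};

/* Old-style check: any linked css_set makes the cgroup look populated. */
static bool has_tasks_by_list(const struct css_set_sketch *head)
{
    return head != NULL;
}

/* New-style check: walk the links and test each css_set for tasks. */
static bool has_tasks_by_walk(const struct css_set_sketch *head)
{
    for (; head != NULL; head = head->next) {
        if (head->nr_tasks > 0)
            return true;
    }
    return false;
}
```

A css_set pinned by a namespace but holding no tasks makes the first check report a populated cgroup while the second correctly reports it as empty.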

While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.

Signed-off-by: Tejun Heo
Reported-by: Evgeny Vereshchagin
Cc: Serge E. Hallyn
Cc: Aditya Kali
Cc: Eric W. Biederman
Cc: stable@vger.kernel.org # v4.6+
Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Link: https://github.com/systemd/systemd/pull/3589#issuecomment-249089541

Tejun Heo
2016-09-24 04:55:49 +0800