Eric Lee / smarc-fsl-linux-kernel

21 Nov, 2015

2 commits

7625b3a00 kernel/panic.c: turn off locks debug before releasing console lock ... Browse Code »

Commit 08d78658f393 ("panic: release stale console lock to always get the
logbuf printed out") introduced an unwanted bad unlock balance report when
panic() is called directly and not from OOPS (e.g. from out_of_memory()).
The difference is that in case of OOPS we disable locks debug in
oops_enter() and on direct panic call nobody does that.

Fixes: 08d78658f393 ("panic: release stale console lock to always get the logbuf printed out")
Reported-by: kernel test robot
Signed-off-by: Vitaly Kuznetsov
Cc: HATAYAMA Daisuke
Cc: Masami Hiramatsu
Cc: Jiri Kosina
Cc: Baoquan He
Cc: Prarit Bhargava
Cc: Xie XiuQi
Cc: Seth Jennings
Cc: "K. Y. Srinivasan"
Cc: Jan Kara
Cc: Petr Mladek
Cc: Yasuaki Ishimatsu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vitaly Kuznetsov
2015-11-21 08:17:32 +0800
9d8a76521 kernel/signal.c: unexport sigsuspend() ... Browse Code »

sigsuspend() is nowhere used except in signal.c itself, so we can mark it
static do not pollute the global namespace.

But this patch is more than a boring cleanup patch, it fixes a real issue
on UserModeLinux. UML has a special console driver to display ttys using
xterm, or other terminal emulators, on the host side. Vegard reported
that sometimes UML is unable to spawn a xterm and he's facing the
following warning:

WARNING: CPU: 0 PID: 908 at include/linux/thread_info.h:128 sigsuspend+0xab/0xc0()

It turned out that this warning makes absolutely no sense as the UML
xterm code calls sigsuspend() on the host side, at least it tries. But
as the kernel itself offers a sigsuspend() symbol the linker choose this
one instead of the glibc wrapper. Interestingly this code used to work
since ever but always blocked signals on the wrong side. Some recent
kernel change made the WARN_ON() trigger and uncovered the bug.

It is a wonderful example of how much works by chance on computers. :-)

Fixes: 68f3f16d9ad0f1 ("new helper: sigsuspend()")
Signed-off-by: Richard Weinberger
Reported-by: Vegard Nossum
Tested-by: Vegard Nossum
Acked-by: Oleg Nesterov
Cc: [3.5+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Weinberger
2015-11-21 08:17:32 +0800

20 Nov, 2015

1 commit

a3d66b5a1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching ... Browse Code »

Pull livepatching fix from Jiri Kosina:
"A fix for module handling in case kASLR has been enabled, from Zhou
Chengming"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: x86: fix relocation computation with kASLR

Linus Torvalds
2015-11-20 04:16:12 +0800

16 Nov, 2015

3 commits

0ca9b6760 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf updates from Thomas Gleixner:
"Mostly updates to the perf tool plus two fixes to the kernel core code:

- Handle tracepoint filters correctly for inherited events (Peter
Zijlstra)

- Prevent a deadlock in perf_lock_task_context (Paul McKenney)

- Add missing newlines to some pr_err() calls (Arnaldo Carvalho de
Melo)

- Print full source file paths when using 'perf annotate --print-line
--full-paths' (Michael Petlan)

- Fix 'perf probe -d' when just one out of uprobes and kprobes is
enabled (Wang Nan)

- Add compiler.h to list.h to fix 'make perf-tar-src-pkg' generated
tarballs, i.e. out of tree building (Arnaldo Carvalho de Melo)

- Add the llvm-src-base.c and llvm-src-kbuild.c files, generated by
the 'perf test' LLVM entries, when running it in-tree, to
.gitignore (Yunlong Song)

- libbpf error reporting improvements, using a strerror interface to
more precisely tell the user about problems with the provided
scriptlet, be it in C or as a ready made object file (Wang Nan)

- Do not be case sensitive when searching for matching 'perf test'
entries (Arnaldo Carvalho de Melo)

- Inform the user about objdump failures in 'perf annotate' (Andi
Kleen)

- Improve the LLVM 'perf test' entry, introduce a new ones for BPF
and kbuild tests to check the environment used by clang to compile
.c scriptlets (Wang Nan)"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
perf/x86/intel/rapl: Remove the unused RAPL_EVENT_DESC() macro
tools include: Add compiler.h to list.h
perf probe: Verify parameters in two functions
perf session: Add missing newlines to some pr_err() calls
perf annotate: Support full source file paths for srcline fix
perf test: Add llvm-src-base.c and llvm-src-kbuild.c to .gitignore
perf: Fix inherited events vs. tracepoint filters
perf: Disable IRQs across RCU RS CS that acquires scheduler lock
perf test: Do not be case sensitive when searching for matching tests
perf test: Add 'perf test BPF'
perf test: Enhance the LLVM tests: add kbuild test
perf test: Enhance the LLVM test: update basic BPF test program
perf bpf: Improve BPF related error messages
perf tools: Make fetch_kernel_version() publicly available
bpf tools: Add new API bpf_object__get_kversion()
bpf tools: Improve libbpf error reporting
perf probe: Cleanup find_perf_probe_point_from_map to reduce redundancy
perf annotate: Inform the user about objdump failures in --stdio
perf stat: Make stat options global
perf sched latency: Fix thread pid reuse issue
...

Linus Torvalds
2015-11-16 01:36:24 +0800
051b29f27 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler fix from Thomas Gleixner:
"A single fix to prevent math underflow in the numa balancing code"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/numa: Fix math underflow in task_tick_numa()

Linus Torvalds
2015-11-16 01:35:33 +0800
511601bdb Merge branches 'irq-urgent-for-linus' and 'timers-urgent-for-linus' of git://git… ... Browse Code »

….kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq and timer fixes from Thomas Gleixner:

- An irq regression fix to restore the wakeup behaviour of chained
interrupts.

- A timer fix for a long standing race versus timers scheduled on a
target cpu which got exposed by recent changes in the workqueue
implementation.

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/PM: Restore system wake up from chained interrupts

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers: Use proper base migration in add_timer_on()

Linus Torvalds
2015-11-16 01:30:48 +0800

13 Nov, 2015

2 commits

0e9760642 Merge tag 'trace-v4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull trace cleanups from Steven Rostedt:
"This contains three more clean up patches.

One patch is needed to make tracing work without debugfs now that
tracing uses its own tracefs.

The second is removing an unused variable.

The third is fixing a warning about unused variables when MAX_TRACER
is not configured. Note, this warning shows up in gcc 6.0, but does
not show up in gcc 4.9, as it seems that gcc does not complain about
constants not being used"

* tag 'trace-v4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: #ifdef out uses of max trace when CONFIG_TRACER_MAX_TRACE is not set
tracing: Remove unused ftrace_cpu_disabled per cpu variable
tracing: Make tracing work when debugfs is not configured in

Linus Torvalds
2015-11-13 08:22:54 +0800
3370b69eb Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm ... Browse Code »

Pull second batch of kvm updates from Paolo Bonzini:
"Four changes:

- x86: work around two nasty cases where a benign exception occurs
while another is being delivered. The endless stream of exceptions
causes an infinite loop in the processor, which not even NMIs or
SMIs can interrupt; in the virt case, there is no possibility to
exit to the host either.

- x86: support for Skylake per-guest TSC rate. Long supported by
AMD, the patches mostly move things from there to common
arch/x86/kvm/ code.

- generic: remove local_irq_save/restore from the guest entry and
exit paths when context tracking is enabled. The patches are a few
months old, but we discussed them again at kernel summit. Andy
will pick up from here and, in 4.5, try to remove it from the user
entry/exit paths.

- PPC: Two bug fixes, see merge commit 370289756becc for details"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
KVM: x86: rename update_db_bp_intercept to update_bp_intercept
KVM: svm: unconditionally intercept #DB
KVM: x86: work around infinite loop in microcode when #AC is delivered
context_tracking: avoid irq_save/irq_restore on guest entry and exit
context_tracking: remove duplicate enabled check
KVM: VMX: Dump TSC multiplier in dump_vmcs()
KVM: VMX: Use a scaled host TSC for guest readings of MSR_IA32_TSC
KVM: VMX: Setup TSC scaling ratio when a vcpu is loaded
KVM: VMX: Enable and initialize VMX TSC scaling
KVM: x86: Use the correct vcpu's TSC rate to compute time scale
KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc()
KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset()
KVM: x86: Replace call-back compute_tsc_offset() with a common function
KVM: x86: Replace call-back set_tsc_khz() with a common function
KVM: x86: Add a common TSC scaling function
KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch
KVM: x86: Collect information for setting TSC scaling ratio
KVM: x86: declare a few variables as __read_mostly
KVM: x86: merge handle_mmio_page_fault and handle_mmio_page_fault_common
KVM: PPC: Book3S HV: Don't dynamically split core when already split
...

Linus Torvalds
2015-11-13 06:34:06 +0800

12 Nov, 2015

1 commit

e41b104c7 livepatch: x86: fix relocation computation with kASLR ... Browse Code »

With kASLR enabled, old_addr provided by patch module is being shifted
accrodingly so that the symbol lookups work. To have module relocations
handled properly as well, the same transformation needs to be perfomed
on relocation address information.

[jkosina@suse.cz: extended / reworded changelog a bit]
Reported-by: Cyril B.
Signed-off-by: Zhou Chengming
Acked-by: Josh Poimboeuf
Signed-off-by: Jiri Kosina

Zhou Chengming
2015-11-12 00:36:04 +0800

11 Nov, 2015

3 commits

2df4ee78d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) Fix null deref in xt_TEE netfilter module, from Eric Dumazet.

2) Several spots need to get to the original listner for SYN-ACK
packets, most spots got this ok but some were not. Whilst covering
the remaining cases, create a helper to do this. From Eric Dumazet.

3) Missiing check of return value from alloc_netdev() in CAIF SPI code,
from Rasmus Villemoes.

4) Don't sleep while != TASK_RUNNING in macvtap, from Vlad Yasevich.

5) Use after free in mvneta driver, from Justin Maggard.

6) Fix race on dst->flags access in dst_release(), from Eric Dumazet.

7) Add missing ZLIB_INFLATE dependency for new qed driver. From Arnd
Bergmann.

8) Fix multicast getsockopt deadlock, from WANG Cong.

9) Fix deadlock in btusb, from Kuba Pawlak.

10) Some ipv6_add_dev() failure paths were not cleaning up the SNMP6
counter state. From Sabrina Dubroca.

11) Fix packet_bind() race, which can cause lost notifications, from
Francesco Ruggeri.

12) Fix MAC restoration in qlcnic driver during bonding mode changes,
from Jarod Wilson.

13) Revert bridging forward delay change which broke libvirt and other
userspace things, from Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
Revert "bridge: Allow forward delay to be cfgd when STP enabled"
bpf_trace: Make dependent on PERF_EVENTS
qed: select ZLIB_INFLATE
net: fix a race in dst_release()
net: mvneta: Fix memory use after free.
net: Documentation: Fix default value tcp_limit_output_bytes
macvtap: Resolve possible __might_sleep warning in macvtap_do_read()
mvneta: add FIXED_PHY dependency
net: caif: check return value of alloc_netdev
net: hisilicon: NET_VENDOR_HISILICON should depend on HAS_DMA
drivers: net: xgene: fix RGMII 10/100Mb mode
netfilter: nft_meta: use skb_to_full_sk() helper
net_sched: em_meta: use skb_to_full_sk() helper
sched: cls_flow: use skb_to_full_sk() helper
netfilter: xt_owner: use skb_to_full_sk() helper
smack: use skb_to_full_sk() helper
net: add skb_to_full_sk() helper and use it in selinux_netlbl_skbuff_setsid()
bpf: doc: correct arch list for supported eBPF JIT
dwc_eth_qos: Delete an unnecessary check before the function call "of_node_put"
bonding: fix panic on non-ARPHRD_ETHER enslave failure
...

Linus Torvalds
2015-11-11 10:11:41 +0800
a31d82d85 bpf_trace: Make dependent on PERF_EVENTS ... Browse Code »

Arnd Bergmann reported:

In my ARM randconfig tests, I'm getting a build error for
newly added code in bpf_perf_event_read and bpf_perf_event_output
whenever CONFIG_PERF_EVENTS is disabled:

kernel/trace/bpf_trace.c: In function 'bpf_perf_event_read':
kernel/trace/bpf_trace.c:203:11: error: 'struct perf_event' has no member named 'oncpu'
if (event->oncpu != smp_processor_id() ||
^
kernel/trace/bpf_trace.c:204:11: error: 'struct perf_event' has no member named 'pmu'
event->pmu->count)

This can happen when UPROBE_EVENT is enabled but KPROBE_EVENT
is disabled. I'm not sure if that is a configuration we care
about, otherwise we could prevent this case from occuring by
adding Kconfig dependencies.

Looking at this further, it's really that UPROBE_EVENT enables PERF_EVENTS.
By just having BPF_EVENTS depend on PERF_EVENTS, then all is fine.

Link: http://lkml.kernel.org/r/4525348.Aq9YoXkChv@wuerfel
Reported-by: Arnd Bergmann
Signed-off-by: Steven Rostedt
Signed-off-by: David S. Miller

Steven Rostedt
2015-11-11 04:40:14 +0800
264015f8a Merge tag 'libnvdimm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm ... Browse Code »

Pull libnvdimm updates from Dan Williams:
"Outside of the new ACPI-NFIT hot-add support this pull request is more
notable for what it does not contain, than what it does. There were a
handful of development topics this cycle, dax get_user_pages, dax
fsync, and raw block dax, that need more more iteration and will wait
for 4.5.

The patches to make devm and the pmem driver NUMA aware have been in
-next for several weeks. The hot-add support has not, but is
contained to the NFIT driver and is passing unit tests. The coredump
support is straightforward and was looked over by Jeff. All of it has
received a 0day build success notification across 107 configs.

Summary:

- Add support for the ACPI 6.0 NFIT hot add mechanism to process
updates of the NFIT at runtime.

- Teach the coredump implementation how to filter out DAX mappings.

- Introduce NUMA hints for allocations made by the pmem driver, and
as a side effect all devm allocations now hint their NUMA node by
default"

* tag 'libnvdimm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
coredump: add DAX filtering for FDPIC ELF coredumps
coredump: add DAX filtering for ELF coredumps
acpi: nfit: Add support for hot-add
nfit: in acpi_nfit_init, break on a 0-length table
pmem, memremap: convert to numa aware allocations
devm_memremap_pages: use numa_mem_id
devm: make allocations numa aware by default
devm_memremap: convert to return ERR_PTR
devm_memunmap: use devres_release()
pmem: kill memremap_pmem()
x86, mm: quiet arch_add_memory()

Linus Torvalds
2015-11-11 04:07:22 +0800

10 Nov, 2015

8 commits

e428abbbf tracing: #ifdef out uses of max trace when CONFIG_TRACER_MAX_TRACE is not set ... Browse Code »

tracing_max_lat_fops is used only when TRACER_MAX_TRACE enabled, so also
swith the related code. The related warning with defconfig under x86_64:

CC kernel/trace/trace.o
kernel/trace/trace.c:5466:37: warning: ‘tracing_max_lat_fops’ defined but not used [-Wunused-const-variable]
static const struct file_operations tracing_max_lat_fops = {

Signed-off-by: Chen Gang
Signed-off-by: Steven Rostedt

Chen Gang
2015-11-10 23:16:05 +0800
4717f1337 genirq/PM: Restore system wake up from chained interrupts ... Browse Code »

Commit e509bd7da149 ("genirq: Allow migration of chained interrupts
by installing default action") breaks PCS wake up IRQ behaviour on
TI OMAP based platforms (dra7-evm).

TI OMAP IRQ wake up configuration:
GIC-irqchip->PCM_IRQ
|- omap_prcm_register_chain_handler
|- PRCM-irqchip -> PRCM_IO_IRQ
|- pcs_irq_chain_handler
|- pinctrl-irqchip -> PCS_uart1_wakeup_irq

This happens because IRQ PM code (irq/pm.c) is expected to ignore
chained interrupts by default:
static bool suspend_device_irq(struct irq_desc *desc)
{
if (!desc->action || desc->no_suspend_depth)
return false;
- it's expected !desc->action = true for chained interrupts;

but, after above change, all chained interrupt descriptors will
have default action handler installed - chained_action.
As result, chained interrupts will be silently disabled during system
suspend.

Hence, fix it by introducing helper function irq_desc_is_chained() and
use it in suspend_device_irq() for chained interrupts identification
and skip them, once detected.

Fixes: e509bd7da149 ("genirq: Allow migration of chained interrupts..")
Signed-off-by: Grygorii Strashko
Reviewed-by: Mika Westerberg
Cc: Tony Lindgren
Cc:
Cc:
Cc: Tony Lindgren
Link: http://lkml.kernel.org/r/1447149492-20699-1-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner

Grygorii Strashko
2015-11-10 22:11:31 +0800
d0e536d89 context_tracking: avoid irq_save/irq_restore on guest entry and exit ... Browse Code »

guest_enter and guest_exit must be called with interrupts disabled,
since they take the vtime_seqlock with write_seq{lock,unlock}.
Therefore, it is not necessary to check for exceptions, nor to
save/restore the IRQ state, when context tracking functions are
called by guest_enter and guest_exit.

Split the body of context_tracking_entry and context_tracking_exit
out to __-prefixed functions, and use them from KVM.

Rik van Riel has measured this to speed up a tight vmentry/vmexit
loop by about 2%.

Cc: Andy Lutomirski
Cc: Frederic Weisbecker
Cc: Paul McKenney
Reviewed-by: Rik van Riel
Tested-by: Rik van Riel
Signed-off-by: Paolo Bonzini

Paolo Bonzini
2015-11-10 19:06:23 +0800
f70cd6b07 context_tracking: remove duplicate enabled check ... Browse Code »

All calls to context_tracking_enter and context_tracking_exit
are already checking context_tracking_is_enabled, except the
context_tracking_user_enter and context_tracking_user_exit
functions left in for the benefit of assembly calls.

Pull the check up to those functions, by making them simple
wrappers around the user_enter and user_exit inline functions.

Cc: Frederic Weisbecker
Cc: Paul McKenney
Reviewed-by: Rik van Riel
Tested-by: Rik van Riel
Acked-by: Andy Lutomirski
Signed-off-by: Paolo Bonzini

Paolo Bonzini
2015-11-10 19:06:22 +0800
bd4f203e4 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge third patch-bomb from Andrew Morton:
"We're pretty much done over here - I'm still waiting for a nouveau
merge so I can cleanly finish up Christoph's dma-mapping rework.

- bunch of small misc stuff

- fold abs64() into abs(), remove abs64()

- new_valid_dev() cleanups

- binfmt_elf_fdpic feature work"

* emailed patches from Andrew Morton : (24 commits)
fs/binfmt_elf_fdpic.c: provide NOMMU loader for regular ELF binaries
fs/stat.c: remove unnecessary new_valid_dev() check
fs/reiserfs/namei.c: remove unnecessary new_valid_dev() check
fs/nilfs2/namei.c: remove unnecessary new_valid_dev() check
fs/ncpfs/dir.c: remove unnecessary new_valid_dev() check
fs/jfs: remove unnecessary new_valid_dev() checks
fs/hpfs/namei.c: remove unnecessary new_valid_dev() check
fs/f2fs/namei.c: remove unnecessary new_valid_dev() check
fs/ext2/namei.c: remove unnecessary new_valid_dev() check
fs/exofs/namei.c: remove unnecessary new_valid_dev() check
fs/btrfs/inode.c: remove unnecessary new_valid_dev() check
fs/9p: remove unnecessary new_valid_dev() checks
include/linux/kdev_t.h: old/new_valid_dev() can return bool
include/linux/kdev_t.h: remove unused huge_valid_dev()
kmap_atomic_to_page() has no users, remove it
drivers/scsi/cxgbi: fix build with EXTRA_CFLAGS
dma: remove external references to dma_supported
Documentation/sysctl/vm.txt: fix misleading code reference of overcommit_memory
remove abs64()
kernel.h: make abs() work with 64-bit types
...

Linus Torvalds
2015-11-10 13:05:13 +0800
50c36504f Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux ... Browse Code »

Pull module updates from Rusty Russell:
"Nothing exciting, minor tweaks and cleanups"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
scripts: [modpost] add new sections to white list
modpost: Add flag -E for making section mismatches fatal
params: don't ignore the rest of cmdline if parse_one() fails
modpost: abort if a module symbol is too long

Linus Torvalds
2015-11-10 07:53:39 +0800
79211c8ed remove abs64() ... Browse Code »

Switch everything to the new and more capable implementation of abs().
Mainly to give the new abs() a bit of a workout.

Cc: Michal Nazarewicz
Cc: John Stultz
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Peter Zijlstra
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2015-11-10 07:11:24 +0800
85ce23005 Merge branch 'for-4.4/hotplug' into libnvdimm-for-next Browse Code »

Dan Williams
2015-11-10 02:29:39 +0800

09 Nov, 2015

3 commits

25b3e5a33 sched/numa: Fix math underflow in task_tick_numa() ... Browse Code »

The NUMA balancing code implements delays in scanning by
advancing curr->node_stamp beyond curr->se.sum_exec_runtime.

With unsigned math, that creates an underflow, which results
in task_numa_work being queued all the time, even when we
don't want to.

Avoiding the math underflow makes it possible to reduce CPU
overhead in the NUMA balancing code.

Reported-and-tested-by: Jan Stancek
Signed-off-by: Rik van Riel
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: mgorman@suse.de
Link: http://lkml.kernel.org/r/1446756983-28173-2-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar

Rik van Riel
2015-11-09 23:13:27 +0800
b71b437ee perf: Fix inherited events vs. tracepoint filters ... Browse Code »

Arnaldo reported that tracepoint filters seem to misbehave (ie. not
apply) on inherited events.

The fix is obvious; filters are only set on the actual (parent)
event, use the normal pattern of using this parent event for filters.
This is safe because each child event has a reference to it.

Reported-by: Arnaldo Carvalho de Melo
Tested-by: Arnaldo Carvalho de Melo
Signed-off-by: Peter Zijlstra (Intel)
Cc: Adrian Hunter
Cc: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Frédéric Weisbecker
Cc: Jiri Olsa
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: Thomas Gleixner
Cc: Wang Nan
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20151102095051.GN17308@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar

Peter Zijlstra
2015-11-09 23:13:11 +0800
2fd590777 perf: Disable IRQs across RCU RS CS that acquires scheduler lock ... Browse Code »

The perf_lock_task_context() function disables preemption across its
RCU read-side critical section because that critical section acquires
a scheduler lock. If there was a preemption during that RCU read-side
critical section, the rcu_read_unlock() could attempt to acquire scheduler
locks, resulting in deadlock.

However, recent optimizations to expedited grace periods mean that IPI
handlers that execute during preemptible RCU read-side critical sections
can now cause the subsequent rcu_read_unlock() to acquire scheduler locks.
Disabling preemption does nothiing to prevent these IPI handlers from
executing, so these optimizations introduced a deadlock. In theory,
this deadlock could be avoided by pulling all wakeups and printk()s out
from rnp->lock critical sections, but in practice this would re-introduce
some RCU CPU stall warning bugs.

Given that acquiring scheduler locks entails disabling interrupts, these
deadlocks can be avoided by disabling interrupts (instead of disabling
preemption) across any RCU read-side critical that acquires scheduler
locks and holds them across the rcu_read_unlock(). This commit therefore
makes this change for perf_lock_task_context().

Reported-by: Dave Jones
Reported-by: Peter Zijlstra
Signed-off-by: Paul E. McKenney
Signed-off-by: Peter Zijlstra (Intel)
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Stephane Eranian
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20151104134838.GR29027@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

Paul E. McKenney
2015-11-09 23:13:11 +0800

08 Nov, 2015

2 commits

ad804a0b2 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge second patch-bomb from Andrew Morton:

- most of the rest of MM

- procfs

- lib/ updates

- printk updates

- bitops infrastructure tweaks

- checkpatch updates

- nilfs2 update

- signals

- various other misc bits: coredump, seqfile, kexec, pidns, zlib, ipc,
dma-debug, dma-mapping, ...

* emailed patches from Andrew Morton : (102 commits)
ipc,msg: drop dst nil validation in copy_msg
include/linux/zutil.h: fix usage example of zlib_adler32()
panic: release stale console lock to always get the logbuf printed out
dma-debug: check nents in dma_sync_sg*
dma-mapping: tidy up dma_parms default handling
pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode
kexec: use file name as the output message prefix
fs, seqfile: always allow oom killer
seq_file: reuse string_escape_str()
fs/seq_file: use seq_* helpers in seq_hex_dump()
coredump: change zap_threads() and zap_process() to use for_each_thread()
coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP
signal: remove jffs2_garbage_collect_thread()->allow_signal(SIGCONT)
signal: introduce kernel_signal_stop() to fix jffs2_garbage_collect_thread()
signal: turn dequeue_signal_lock() into kernel_dequeue_signal()
signals: kill block_all_signals() and unblock_all_signals()
nilfs2: fix gcc uninitialized-variable warnings in powerpc build
nilfs2: fix gcc unused-but-set-variable warnings
MAINTAINERS: nilfs2: add header file for tracing
nilfs2: add tracepoints for analyzing reading and writing metadata files
...

Linus Torvalds
2015-11-08 06:32:45 +0800
03e88ae6b tracing: Remove unused ftrace_cpu_disabled per cpu variable ... Browse Code »

Since the ring buffer is lockless, there is no need to disable ftrace on
CPU. And no one doing so: after commit 68179686ac67cb ("tracing: Remove
ftrace_disable/enable_cpu()") ftrace_cpu_disabled stays the same after
initialization, nothing changes it.
ftrace_cpu_disabled shouldn't be used by any external module since it
disables only function and graph_function tracers but not any other
tracer.

Link: http://lkml.kernel.org/r/1446836846-22239-1-git-send-email-0x7f454c46@gmail.com

Signed-off-by: Dmitry Safonov
Signed-off-by: Steven Rostedt

Dmitry Safonov
2015-11-08 02:25:14 +0800

07 Nov, 2015

10 commits

08d78658f panic: release stale console lock to always get the logbuf printed out ... Browse Code »

In some cases we may end up killing the CPU holding the console lock
while still having valuable data in logbuf. E.g. I'm observing the
following:

- A crash is happening on one CPU and console_unlock() is being called on
some other.

- console_unlock() tries to print out the buffer before releasing the lock
and on slow console it takes time.

- in the meanwhile crashing CPU does lots of printk()-s with valuable data
(which go to the logbuf) and sends IPIs to all other CPUs.

- console_unlock() finishes printing previous chunk and enables interrupts
before trying to print out the rest, the CPU catches the IPI and never
releases console lock.

This is not the only possible case: in VT/fb subsystems we have many other
console_lock()/console_unlock() users. Non-masked interrupts (or
receiving NMI in case of extreme slowness) will have the same result.
Getting the whole console buffer printed out on crash should be top
priority.

[akpm@linux-foundation.org: tweak comment text]
Signed-off-by: Vitaly Kuznetsov
Cc: HATAYAMA Daisuke
Cc: Masami Hiramatsu
Cc: Jiri Kosina
Cc: Baoquan He
Cc: Prarit Bhargava
Cc: Xie XiuQi
Cc: Seth Jennings
Cc: "K. Y. Srinivasan"
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vitaly Kuznetsov
2015-11-07 09:50:42 +0800
8639b4613 pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode ... Browse Code »

setpriority(PRIO_USER, 0, x) will change the priority of tasks outside of
the current pid namespace. This is in contrast to both the other modes of
setpriority and the example of kill(-1). Fix this. getpriority and
ioprio have the same failure mode, fix them too.

Eric said:

: After some more thinking about it this patch sounds justifiable.
:
: My goal with namespaces is not to build perfect isolation mechanisms
: as that can get into ill defined territory, but to build well defined
: mechanisms. And to handle the corner cases so you can use only
: a single namespace with well defined results.
:
: In this case you have found the two interfaces I am aware of that
: identify processes by uid instead of by pid. Which quite frankly is
: weird. Unfortunately the weird unexpected cases are hard to handle
: in the usual way.
:
: I was hoping for a little more information. Changes like this one we
: have to be careful of because someone might be depending on the current
: behavior. I don't think they are and I do think this make sense as part
: of the pid namespace.

Signed-off-by: Ben Segall
Cc: Oleg Nesterov
Cc: Al Viro
Cc: Ambrose Feinstein
Acked-by: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ben Segall
2015-11-07 09:50:42 +0800
de90a6bca kexec: use file name as the output message prefix ... Browse Code »

kexec output message misses the prefix "kexec", when Dave Young split the
kexec code. Now, we use file name as the output message prefix.

Currently, the format of output message:
[ 140.290795] SYSC_kexec_load: hello, world
[ 140.291534] kexec: sanity_check_segment_list: hello, world

Ideally, the format of output message:
[ 30.791503] kexec: SYSC_kexec_load, Hello, world
[ 79.182752] kexec_core: sanity_check_segment_list, Hello, world

Remove the custom prefix "kexec" in output message.

Signed-off-by: Minfei Huang
Acked-by: Dave Young
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Minfei Huang
2015-11-07 09:50:42 +0800
5fa534c98 coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP ... Browse Code »

task_will_free_mem() is wrong in many ways, and in particular the
SIGNAL_GROUP_COREDUMP check is not reliable: a task can participate in the
coredumping without SIGNAL_GROUP_COREDUMP bit set.

change zap_threads() paths to always set SIGNAL_GROUP_COREDUMP even if
other CLONE_VM processes can't react to SIGKILL. Fortunately, at least
oom-kill case if fine; it kills all tasks sharing the same mm, so it
should also kill the process which actually dumps the core.

The change in prepare_signal() is not strictly necessary, it just ensures
that the patch does not bring another subtle behavioural change. But it
reminds us that this SIGNAL_GROUP_EXIT/COREDUMP case needs more changes.

Signed-off-by: Oleg Nesterov
Cc: David Rientjes
Cc: Kyle Walker
Acked-by: Michal Hocko
Cc: Stanislav Kozina
Cc: Tetsuo Handa
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2015-11-07 09:50:42 +0800
2e01fabe6 signals: kill block_all_signals() and unblock_all_signals() ... Browse Code »

It is hardly possible to enumerate all problems with block_all_signals()
and unblock_all_signals(). Just for example,

1. block_all_signals(SIGSTOP/etc) simply can't help if the caller is
multithreaded. Another thread can dequeue the signal and force the
group stop.

2. Even is the caller is single-threaded, it will "stop" anyway. It
will not sleep, but it will spin in kernel space until SIGCONT or
SIGKILL.

And a lot more. In short, this interface doesn't work at all, at least
the last 10+ years.

Daniel said:

Yeah the only times I played around with the DRM_LOCK stuff was when
old drivers accidentally deadlocked - my impression is that the entire
DRM_LOCK thing was never really tested properly ;-) Hence I'm all for
purging where this leaks out of the drm subsystem.

Signed-off-by: Oleg Nesterov
Acked-by: Daniel Vetter
Acked-by: Dave Airlie
Cc: Richard Weinberger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2015-11-07 09:50:42 +0800
3824657c5 printk: prevent userland from spoofing kernel messages ... Browse Code »

The following statement of ABI/testing/dev-kmsg is not quite right:

It is not possible to inject messages from userspace with the
facility number LOG_KERN (0), to make sure that the origin of the
messages can always be reliably determined.

Userland actually can inject messages with a facility of 0 by abusing the
fact that the facility is stored in a u8 data type. By using a facility
which is a multiple of 256 the assignment of msg->facility in log_store()
implicitly truncates it to 0, i.e. LOG_KERN, allowing users of /dev/kmsg
to spoof kernel messages as shown below:

The following call...
# printf 'Kernel panic - not syncing: beer empty\n' 0 >/dev/kmsg
...leads to the following log entry (dmesg -x | tail -n 1):
user :emerg : [ 66.137758] Kernel panic - not syncing: beer empty

However, this call...
# printf 'Kernel panic - not syncing: beer empty\n' 0x800 >/dev/kmsg
...leads to the slightly different log entry (note the kernel facility):
kern :emerg : [ 74.177343] Kernel panic - not syncing: beer empty

Fix that by limiting the user provided facility to 8 bit right from the
beginning and catch the truncation early.

Fixes: 7ff9554bb578 ("printk: convert byte-buffer to variable-length...")
Signed-off-by: Mathias Krause
Cc: Greg Kroah-Hartman
Cc: Petr Mladek
Cc: Alex Elder
Cc: Joe Perches
Cc: Kay Sievers
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mathias Krause
2015-11-07 09:50:42 +0800
3d9c637f4 module: export param_free_charp() ... Browse Code »

Change the param_free_charp() function from static to exported.

It is used by zswap in the next patch ("zswap: use charp for zswap param
strings").

Signed-off-by: Dan Streetman
Acked-by: Rusty Russell
Cc: Seth Jennings
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dan Streetman
2015-11-07 09:50:42 +0800
71baba4b9 mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM ... Browse Code »

__GFP_WAIT was used to signal that the caller was in atomic context and
could not sleep. Now it is possible to distinguish between true atomic
context and callers that are not willing to sleep. The latter should
clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing
__GFP_WAIT behaves differently, there is a risk that people will clear the
wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
indicate what it does -- setting it allows all reclaim activity, clearing
them prevents it.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mel Gorman
Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
Acked-by: Johannes Weiner
Cc: Christoph Lameter
Acked-by: David Rientjes
Cc: Vitaly Wool
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2015-11-07 09:50:42 +0800
d0164adc8 mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep an… ... Browse Code »

…d avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts. They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve". __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available. Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative. High priority users continue to use
__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim. __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
__GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
into this category where kswapd will still be woken but atomic reserves
are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
helper gfpflags_allow_blocking() where possible. This is because
checking for __GFP_WAIT as was done historically now can trigger false
positives. Some exceptions like dm-crypt.c exist where the code intent
is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL. They may
now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Mel Gorman
2015-11-07 09:50:42 +0800
22402cd0a Merge tag 'trace-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull tracking updates from Steven Rostedt:
"Most of the changes are clean ups and small fixes. Some of them have
stable tags to them. I searched through my INBOX just as the merge
window opened and found lots of patches to pull. I ran them through
all my tests and they were in linux-next for a few days.

Features added this release:
----------------------------

- Module globbing. You can now filter function tracing to several
modules. # echo '*:mod:*snd*' > set_ftrace_filter (Dmitry Safonov)

- Tracer specific options are now visible even when the tracer is not
active. It was rather annoying that you can only see and modify
tracer options after enabling the tracer. Now they are in the
options/ directory even when the tracer is not active. Although
they are still only visible when the tracer is active in the
trace_options file.

- Trace options are now per instance (although some of the tracer
specific options are global)

- New tracefs file: set_event_pid. If any pid is added to this file,
then all events in the instance will filter out events that are not
part of this pid. sched_switch and sched_wakeup events handle next
and the wakee pids"

* tag 'trace-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (68 commits)
tracefs: Fix refcount imbalance in start_creating()
tracing: Put back comma for empty fields in boot string parsing
tracing: Apply tracer specific options from kernel command line.
tracing: Add some documentation about set_event_pid
ring_buffer: Remove unneeded smp_wmb() before wakeup of reader benchmark
tracing: Allow dumping traces without tracking trace started cpus
ring_buffer: Fix more races when terminating the producer in the benchmark
ring_buffer: Do no not complete benchmark reader too early
tracing: Remove redundant TP_ARGS redefining
tracing: Rename max_stack_lock to stack_trace_max_lock
tracing: Allow arch-specific stack tracer
recordmcount: arm64: Replace the ignored mcount call into nop
recordmcount: Fix endianness handling bug for nop_mcount
tracepoints: Fix documentation of RCU lockdep checks
tracing: ftrace_event_is_function() can return boolean
tracing: is_legal_op() can return boolean
ring-buffer: rb_event_is_commit() can return boolean
ring-buffer: rb_per_cpu_empty() can return boolean
ring_buffer: ring_buffer_empty{cpu}() can return boolean
ring-buffer: rb_is_reader_page() can return boolean
...

Linus Torvalds
2015-11-07 05:30:20 +0800

06 Nov, 2015

5 commits

8b1291994 tracing: Make tracing work when debugfs is not configured in ... Browse Code »

Currently tracing_init_dentry() returns -ENODEV when debugfs is not
configured in, which causes tracefs not populated with tracing files and
directories, so we will get an empty directory even after we manually
mount tracefs.

We can make tracing_init_dentry() return NULL if debugfs is not
configured in and can manually mount tracefs. But return -ENODEV
if debugfs is configured in but not initialized or failed to create
automount point as that would break backward compatibility with older
tools.

Link: http://lkml.kernel.org/r/1446797056-11683-1-git-send-email-hello.wjx@gmail.com

Signed-off-by: Jiaxing Wang
Signed-off-by: Steven Rostedt

Jiaxing Wang
2015-11-06 23:02:33 +0800
2e3078af2 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge patch-bomb from Andrew Morton:

- inotify tweaks

- some ocfs2 updates (many more are awaiting review)

- various misc bits

- kernel/watchdog.c updates

- Some of mm. I have a huge number of MM patches this time and quite a
lot of it is quite difficult and much will be held over to next time.

* emailed patches from Andrew Morton : (162 commits)
selftests: vm: add tests for lock on fault
mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage
mm: introduce VM_LOCKONFAULT
mm: mlock: add new mlock system call
mm: mlock: refactor mlock, munlock, and munlockall code
kasan: always taint kernel on report
mm, slub, kasan: enable user tracking by default with KASAN=y
kasan: use IS_ALIGNED in memory_is_poisoned_8()
kasan: Fix a type conversion error
lib: test_kasan: add some testcases
kasan: update reference to kasan prototype repo
kasan: move KASAN_SANITIZE in arch/x86/boot/Makefile
kasan: various fixes in documentation
kasan: update log messages
kasan: accurately determine the type of the bad access
kasan: update reported bug types for kernel memory accesses
kasan: update reported bug types for not user nor kernel memory accesses
mm/kasan: prevent deadlock in kasan reporting
mm/kasan: don't use kasan shadow pointer in generic functions
mm/kasan: MODULE_VADDR is not available on all archs
...

Linus Torvalds
2015-11-06 15:10:54 +0800
de60f5f10 mm: introduce VM_LOCKONFAULT ... Browse Code »

The cost of faulting in all memory to be locked can be very high when
working with large mappings. If only portions of the mapping will be used
this can incur a high penalty for locking.

For the example of a large file, this is the usage pattern for a large
statical language model (probably applies to other statical or graphical
models as well). For the security example, any application transacting in
data that cannot be swapped out (credit card data, medical records, etc).

This patch introduces the ability to request that pages are not
pre-faulted, but are placed on the unevictable LRU when they are finally
faulted in. The VM_LOCKONFAULT flag will be used together with VM_LOCKED
and has no effect when set without VM_LOCKED. Setting the VM_LOCKONFAULT
flag for a VMA will cause pages faulted into that VMA to be added to the
unevictable LRU when they are faulted or if they are already present, but
will not cause any missing pages to be faulted in.

Exposing this new lock state means that we cannot overload the meaning of
the FOLL_POPULATE flag any longer. Prior to this patch it was used to
mean that the VMA for a fault was locked. This means we need the new
FOLL_MLOCK flag to communicate the locked state of a VMA. FOLL_POPULATE
will now only control if the VMA should be populated and in the case of
VM_LOCKONFAULT, it will not be set.

Signed-off-by: Eric B Munson
Acked-by: Kirill A. Shutemov
Acked-by: Vlastimil Babka
Cc: Michal Hocko
Cc: Jonathan Corbet
Cc: Catalin Marinas
Cc: Geert Uytterhoeven
Cc: Guenter Roeck
Cc: Heiko Carstens
Cc: Michael Kerrisk
Cc: Ralf Baechle
Cc: Shuah Khan
Cc: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric B Munson
2015-11-06 11:34:48 +0800
a8ca5d0ec mm: mlock: add new mlock system call ... Browse Code »

With the refactored mlock code, introduce a new system call for mlock.
The new call will allow the user to specify what lock states are being
added. mlock2 is trivial at the moment, but a follow on patch will add a
new mlock state making it useful.

Signed-off-by: Eric B Munson
Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
Cc: Heiko Carstens
Cc: Geert Uytterhoeven
Cc: Catalin Marinas
Cc: Stephen Rothwell
Cc: Guenter Roeck
Cc: Jonathan Corbet
Cc: Kirill A. Shutemov
Cc: Michael Kerrisk
Cc: Ralf Baechle
Cc: Shuah Khan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric B Munson
2015-11-06 11:34:48 +0800
da39da3a5 mm, oom: remove task_lock protecting comm printing ... Browse Code »

The oom killer takes task_lock() in a couple of places solely to protect
printing the task's comm.

A process's comm, including current's comm, may change due to
/proc/pid/comm or PR_SET_NAME.

The comm will always be NULL-terminated, so the worst race scenario would
only be during update. We can tolerate a comm being printed that is in
the middle of an update to avoid taking the lock.

Other locations in the kernel have already dropped task_lock() when
printing comm, so this is consistent.

Signed-off-by: David Rientjes
Suggested-by: Oleg Nesterov
Cc: Michal Hocko
Cc: Vladimir Davydov
Cc: Sergey Senozhatsky
Acked-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2015-11-06 11:34:48 +0800