Eric Lee / smarc-fsl-linux-kernel

24 Aug, 2016

1 commit

b2d4c2edb locking/hung_task: Show all locks ... Browse Code »

When we get a hung task it can often be valuable to see _all_ the held
locks on the system (in case we are being blocked on trying to acquire
one), e.g. with this patch we can immediately see where the problem is
below:

INFO: task trinity-c3:14933 blocked for more than 120 seconds.
Not tainted 4.8.0-rc1+ #135
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
trinity-c3 D ffff88010c16fc88 0 14933 1 0x00080004
ffff88010c16fc88 000000003b9aca00 0000000000000000 0000000000000296
00000000776cdf88 ffff88011a520ae0 ffff88011a520b08 ffff88011a520198
ffffffff867d7f00 ffff88011942c080 ffff880116841580 ffff88010c168000
Call Trace:
[] schedule+0x77/0x230
[] __lock_sock+0x129/0x250
[] ? __sk_destruct+0x450/0x450
[] ? wake_bit_function+0x2e0/0x2e0
[] lock_sock_nested+0xeb/0x120
[] irda_setsockopt+0x65/0xb40
[] SyS_setsockopt+0x139/0x230
[] ? SyS_recv+0x20/0x20
[] ? trace_event_raw_event_sys_enter+0xb90/0xb90
[] ? __this_cpu_preempt_check+0x13/0x20
[] ? __context_tracking_exit.part.3+0x30/0x1b0
[] ? SyS_recv+0x20/0x20
[] do_syscall_64+0x1b3/0x4b0
[] entry_SYSCALL64_slow_path+0x25/0x25

Showing all locks held in the system:
2 locks held by khungtaskd/563:
#0: (rcu_read_lock){......}, at: [] watchdog+0x106/0x910
#1: (tasklist_lock){......}, at: [] debug_show_all_locks+0x74/0x360
1 lock held by trinity-c0/19280:
#0: (sk_lock-AF_IRDA){......}, at: [] irda_accept+0x176/0x10f0
1 lock held by trinity-c0/12865:
#0: (sk_lock-AF_IRDA){......}, at: [] irda_accept+0x176/0x10f0

Signed-off-by: Vegard Nossum
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Mandeep Singh Baines
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1471538460-7505-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Ingo Molnar

Vegard Nossum
2016-08-24 18:16:13 +0800

18 Aug, 2016

4 commits

70800c3c0 locking/rwsem: Scan the wait_list for readers only once ... Browse Code »

When wanting to wakeup readers, __rwsem_mark_wakeup() currently
iterates the wait_list twice while looking to wakeup the first N
queued reader-tasks. While this can be quite inefficient, it was
there such that a awoken reader would be first and foremost
acknowledged by the lock counter.

Keeping the same logic, we can further benefit from the use of
wake_qs and avoid entirely the first wait_list iteration that sets
the counter as wake_up_process() isn't going to occur right away,
and therefore we maintain the counter->list order of going about
things.

Other than saving cycles with O(n) "scanning", this change also
nicely cleans up a good chunk of __rwsem_mark_wakeup(); both
visually and less tedious to read.

For example, the following improvements where seen on some will
it scale microbenchmarks, on a 48-core Haswell:

v4.7 v4.7-rwsem-v1
Hmean signal1-processes-8 5792691.42 ( 0.00%) 5771971.04 ( -0.36%)
Hmean signal1-processes-12 6081199.96 ( 0.00%) 6072174.38 ( -0.15%)
Hmean signal1-processes-21 3071137.71 ( 0.00%) 3041336.72 ( -0.97%)
Hmean signal1-processes-48 3712039.98 ( 0.00%) 3708113.59 ( -0.11%)
Hmean signal1-processes-79 4464573.45 ( 0.00%) 4682798.66 ( 4.89%)
Hmean signal1-processes-110 4486842.01 ( 0.00%) 4633781.71 ( 3.27%)
Hmean signal1-processes-141 4611816.83 ( 0.00%) 4692725.38 ( 1.75%)
Hmean signal1-processes-172 4638157.05 ( 0.00%) 4714387.86 ( 1.64%)
Hmean signal1-processes-203 4465077.80 ( 0.00%) 4690348.07 ( 5.05%)
Hmean signal1-processes-224 4410433.74 ( 0.00%) 4687534.43 ( 6.28%)

Stddev signal1-processes-8 6360.47 ( 0.00%) 8455.31 ( 32.94%)
Stddev signal1-processes-12 4004.98 ( 0.00%) 9156.13 (128.62%)
Stddev signal1-processes-21 3273.14 ( 0.00%) 5016.80 ( 53.27%)
Stddev signal1-processes-48 28420.25 ( 0.00%) 26576.22 ( -6.49%)
Stddev signal1-processes-79 22038.34 ( 0.00%) 18992.70 (-13.82%)
Stddev signal1-processes-110 23226.93 ( 0.00%) 17245.79 (-25.75%)
Stddev signal1-processes-141 6358.98 ( 0.00%) 7636.14 ( 20.08%)
Stddev signal1-processes-172 9523.70 ( 0.00%) 4824.75 (-49.34%)
Stddev signal1-processes-203 13915.33 ( 0.00%) 9326.33 (-32.98%)
Stddev signal1-processes-224 15573.94 ( 0.00%) 10613.82 (-31.85%)

Other runs that saw improvements include context_switch and pipe; and
as expected, this is particularly highlighted on larger thread counts
as it becomes more expensive to walk the list twice.

No change in wakeup ordering or semantics.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Waiman.Long@hp.com
Cc: dave@stgolabs.net
Cc: jason.low2@hpe.com
Cc: wanpeng.li@hotmail.com
Link: http://lkml.kernel.org/r/1470384285-32163-4-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

Davidlohr Bueso
2016-08-18 21:37:11 +0800
c2867bbaf locking/rwsem: Remove a few useless comments ... Browse Code »

Our rwsem code (xadd, at least) is rather well documented, but
there are a few really annoying comments in there that serve
no purpose and we shouldn't bother with them.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Waiman.Long@hp.com
Cc: dave@stgolabs.net
Cc: jason.low2@hpe.com
Cc: wanpeng.li@hotmail.com
Link: http://lkml.kernel.org/r/1470384285-32163-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

Davidlohr Bueso
2016-08-18 21:37:07 +0800
84b23f9b5 locking/rwsem: Return void in __rwsem_mark_wake() ... Browse Code »

We currently return a rw_semaphore structure, which is the
same lock we passed to the function's argument in the first
place. While there are several functions that choose this
return value, the callers use it, for example, for things
like ERR_PTR. This is not the case for __rwsem_mark_wake(),
and in addition this function is really about the lock
waiters (which we know there are at this point), so its
somewhat odd to be returning the sem structure.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Waiman.Long@hp.com
Cc: dave@stgolabs.net
Cc: jason.low2@hpe.com
Cc: wanpeng.li@hotmail.com
Link: http://lkml.kernel.org/r/1470384285-32163-2-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

Davidlohr Bueso
2016-08-18 21:37:03 +0800
3942a9bd7 locking, rcu, cgroup: Avoid synchronize_sched() in __cgroup_procs_write() ... Browse Code »

The current percpu-rwsem read side is entirely free of serializing insns
at the cost of having a synchronize_sched() in the write path.

The latency of the synchronize_sched() is too high for cgroups. The
commit 1ed1328792ff talks about the write path being a fairly cold path
but this is not the case for Android which moves task to the foreground
cgroup and back around binder IPC calls from foreground processes to
background processes, so it is significantly hotter than human initiated
operations.

Switch cgroup_threadgroup_rwsem into the slow mode for now to avoid the
problem, hopefully it should not be that slow after another commit:

80127a39681b ("locking/percpu-rwsem: Optimize readers and reduce global impact").

We could just add rcu_sync_enter() into cgroup_init() but we do not want
another synchronize_sched() at boot time, so this patch adds the new helper
which doesn't block but currently can only be called before the first use.

Reported-by: John Stultz
Reported-by: Dmitry Shmidt
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Colin Cross
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Rom Lemarchand
Cc: Tejun Heo
Cc: Thomas Gleixner
Cc: Todd Kjos
Link: http://lkml.kernel.org/r/20160811165413.GA22807@redhat.com
Signed-off-by: Ingo Molnar

Peter Zijlstra
2016-08-18 21:36:59 +0800

12 Aug, 2016

4 commits

e8cb0fe6e locking/Documentation: Add Korean translation ... Browse Code »

This commit adds Korean version of memory-barriers.txt document. The
header is referred to HOWTO Korean version.

The translation has started from Feb, 2016 and using a public git
repository[1] to maintain the work. It's commit history says that it is
following upstream changes as well.

[1] https://github.com/sjp38/linux.doc_trans_membarrier

Signed-off-by: SeongJae Park
Signed-off-by: Paul E. McKenney
Reviewed-by: Byungchul Park
Acked-by: David Howells
Acked-by: Minchan Kim
Acked-by: Jonathan Corbet
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-arch@vger.kernel.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1470939463-31950-4-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

SeongJae Park
2016-08-12 14:24:14 +0800
8b9e77155 locking/Documentation: Fix a typo of example result ... Browse Code »

An example result for data dependent write has a typo. This commit
fixes the wrong typo.

Signed-off-by: SeongJae Park
Signed-off-by: Paul E. McKenney
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: dhowells@redhat.com
Cc: linux-arch@vger.kernel.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1470939463-31950-3-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

SeongJae Park
2016-08-12 14:24:13 +0800
d7cab36db locking/Documentation: Fix wrong section reference ... Browse Code »

Signed-off-by: SeongJae Park
Signed-off-by: Paul E. McKenney
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: dhowells@redhat.com
Cc: linux-arch@vger.kernel.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1470939463-31950-2-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

SeongJae Park
2016-08-12 14:24:13 +0800
dfeccea61 locking/Documentation: Maintain consistent blank line ... Browse Code »

Signed-off-by: SeongJae Park
Signed-off-by: Paul E. McKenney
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: dhowells@redhat.com
Cc: linux-arch@vger.kernel.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1470939463-31950-1-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

SeongJae Park
2016-08-12 14:24:13 +0800

10 Aug, 2016

14 commits

80127a396 locking/percpu-rwsem: Optimize readers and reduce global impact ... Browse Code »

Currently the percpu-rwsem switches to (global) atomic ops while a
writer is waiting; which could be quite a while and slows down
releasing the readers.

This patch cures this problem by ordering the reader-state vs
reader-count (see the comments in __percpu_down_read() and
percpu_down_write()). This changes a global atomic op into a full
memory barrier, which doesn't have the global cacheline contention.

This also enables using the percpu-rwsem with rcu_sync disabled in order
to bias the implementation differently, reducing the writer latency by
adding some cost to readers.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Oleg Nesterov
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Paul McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
[ Fixed modular build. ]
Signed-off-by: Ingo Molnar

Peter Zijlstra
2016-08-10 20:34:01 +0800
08be8f63c locking/pvstat: Separate wait_again and spurious wakeup stats ... Browse Code »

Currently there are overlap in the pvqspinlock wait_again and
spurious_wakeup stat counters. Because of lock stealing, it is
no longer possible to accurately determine if spurious wakeup has
happened in the queue head. As they track both the queue node and
queue head status, it is also hard to tell how many of those comes
from the queue head and how many from the queue node.

This patch changes the accounting rules so that spurious wakeup is
only tracked in the queue node. The wait_again count, however, is
only tracked in the queue head when the vCPU failed to acquire the
lock after a vCPU kick. This should give a much better indication of
the wait-kick dynamics in the queue node and the queue head.

Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Boqun Feng
Cc: Douglas Hatch
Cc: Linus Torvalds
Cc: Pan Xinhui
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Scott J Norton
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1464713631-1066-2-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar

Waiman Long
2016-08-10 20:16:02 +0800
64a5e3cb3 locking/qspinlock: Improve readability ... Browse Code »

Restructure pv_queued_spin_steal_lock() as I found it hard to read.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Waiman Long

Peter Zijlstra
2016-08-10 20:16:02 +0800
c2ace36b8 locking/pvqspinlock: Fix a bug in qstat_read() ... Browse Code »

It's obviously wrong to set stat to NULL. So lets remove it.
Otherwise it is always zero when we check the latency of kick/wake.

Signed-off-by: Pan Xinhui
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Waiman Long
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1468405414-3700-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

Pan Xinhui
2016-08-10 20:13:29 +0800
229ce6315 locking/pvqspinlock: Fix double hash race ... Browse Code »

When the lock holder vCPU is racing with the queue head:

CPU 0 (lock holder) CPU1 (queue head)
=================== =================
spin_lock(); spin_lock();
pv_kick_node(): pv_wait_head_or_lock():
if (!lp) {
lp = pv_hash(lock, pn);
xchg(&l->locked, _Q_SLOW_VAL);
}
WRITE_ONCE(pn->state, vcpu_halted);
cmpxchg(&pn->state,
vcpu_halted, vcpu_hashed);
WRITE_ONCE(l->locked, _Q_SLOW_VAL);
(void)pv_hash(lock, pn);

In this case, lock holder inserts the pv_node of queue head into the
hash table and set _Q_SLOW_VAL unnecessary. This patch avoids it by
restoring/setting vcpu_hashed state after failing adaptive locking
spinning.

Signed-off-by: Wanpeng Li
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Pan Xinhui
Cc: Andrew Morton
Cc: Davidlohr Bueso
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Waiman Long
Link: http://lkml.kernel.org/r/1468484156-4521-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Ingo Molnar

Wanpeng Li
2016-08-10 20:13:28 +0800
2db34e8bf locking/qrwlock: Fix write unlock bug on big endian systems ... Browse Code »

This patch aims to get rid of endianness in queued_write_unlock(). We
want to set __qrwlock->wmode to NULL, however the address is not
&lock->cnts in big endian machine. That causes queued_write_unlock()
write NULL to the wrong field of __qrwlock.

So implement __qrwlock_write_byte() which returns the correct
__qrwlock->wmode address.

Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Pan Xinhui
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Waiman.Long@hpe.com
Cc: arnd@arndb.de
Cc: boqun.feng@gmail.com
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1468835259-4486-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

pan xinhui
2016-08-10 20:13:27 +0800
a2071cd76 Merge branch 'linus' into locking/urgent, to pick up fixes ... Browse Code »

Signed-off-by: Ingo Molnar

Ingo Molnar
2016-08-10 20:11:54 +0800
a0cba2179 Revert "printk: create pr_<level> functions" ... Browse Code »

This reverts commit 874f9c7da9a4acbc1b9e12ca722579fb50e4d142.

Geert Uytterhoeven reports:
"This change seems to have an (unintendent?) side-effect.

Before, pr_*() calls without a trailing newline characters would be
printed with a newline character appended, both on the console and in
the output of the dmesg command.

After this commit, no new line character is appended, and the output
of the next pr_*() call of the same type may be appended, like in:

- Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000
- Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)
+ Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)"

Joe Perches says:
"No, that is not intentional.

The newline handling code inside vprintk_emit is a bit involved and
for now I suggest a revert until this has all the same behavior as
earlier"

Reported-by: Geert Uytterhoeven
Requested-by: Joe Perches
Cc: Andrew Morton
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-08-10 01:48:18 +0800
84bd8d33a Merge tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull tracing fix from Steven Rostedt:
"Fix tick_stop tracepoint symbols for user export.

Luiz Capitulino noticed that the tick_stop tracepoint wasn't being
parsed properly by the tracing user space tools.

This was due to the TRACE_DEFINE_ENUM() being set to a define, when it
should have been set to the enum itself. The define was of the MASK
that used the BIT to shift. The BIT was the enum and by adding that,
everything gets converted nicely. The MASK is still kept just in case
it gets converted to an enum in the future"

* tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix tick_stop tracepoint symbols for user export

Linus Torvalds
2016-08-10 01:34:09 +0800
b79f34d6a Merge tag 'gcc-plugin-infrastructure-v4.8-rc2' of git://git.kernel.org/pub/scm/l… ... Browse Code »

…inux/kernel/git/kees/linux

Pull gcc plugin improvements from Kees Cook:
"Several fixes/improvements for the gcc plugin infrastructure:

- fix a problem with gcc plugins interfering with cc-option tests.

- abort more gracefully when gcc plugin headers or compiler support
is missing.

- improve the gcc plugin rule generation to be more dynamic, pass
arguments, and build from subdirectories"

* tag 'gcc-plugin-infrastructure-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: Add support for plugin subdirectories
gcc-plugins: Automate make rule generation
gcc-plugins: Add support for passing plugin arguments
gcc-plugins: abort builds cleanly when not supported
kbuild: no gcc-plugins during cc-option tests

Linus Torvalds
2016-08-10 01:30:07 +0800
e1d009eab Merge tag 'platform-drivers-x86-v4.8-3' of git://git.infradead.org/users/dvhart/… ... Browse Code »

…linux-platform-drivers-x86

Pull x86 platform driver update from Darren Hart:
"dell-wmi: ignore battery remove/insert event"

* tag 'platform-drivers-x86-v4.8-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
dell-wmi: Ignore WMI event 0xe00e

Linus Torvalds
2016-08-10 01:26:14 +0800
cb0d93aaf Merge tag 'drm-fixes-for-4.8-rc2' of git://people.freedesktop.org/~airlied/linux ... Browse Code »

Pull drm fixes from Dave Airlie:
"This contains a bunch of amdgpu fixes, and some i915 regression fixes.

It also contains some fixes for an older regression with some EDID
changes and some 6bpc panels.

Then there are the lockdep, cirrus and rcar-du regression fixes from
this window"

* tag 'drm-fixes-for-4.8-rc2' of git://people.freedesktop.org/~airlied/linux:
drm/cirrus: Fix NULL pointer dereference when registering the fbdev
drm/edid: Set 8 bpc color depth for displays with "DFP 1.x compliant TMDS".
drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown"
drm/edid: Add 6 bpc quirk for display AEO model 0.
drm: Paper over locking inversion after registration rework
drm: rcar-du: Link HDMI encoder with bridge
drm/ttm: Wait for a BO to become idle before unbinding it from GTT
drm/i915/fbdev: Check for the framebuffer before use
drm/amdgpu: update golden setting of polaris10
drm/amdgpu: update golden setting of stoney
drm/amdgpu: update golden setting of polaris11
drm/amdgpu: update golden setting of carrizo
drm/amdgpu: update golden setting of iceland
drm/amd/amdgpu: change pptable output format from ASCII to binary
drm/amdgpu/ci: add mullins to default case for smc ucode
drm/amdgpu/gmc7: add missing mullins case
drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL

Linus Torvalds
2016-08-10 01:20:21 +0800
a3d1ddd93 ipr: Fix sync scsi scan ... Browse Code »

Commit b195d5e2bffd ("ipr: Wait to do async scan until scsi host is
initialized") fixed async scan for ipr, but broke sync scan for ipr.

This fixes sync scan back up.

Signed-off-by: Brian King
Reported-and-tested-by: Michael Ellerman
Signed-off-by: Linus Torvalds

Brian King
2016-08-10 01:17:42 +0800
c4159a75b mm: memcontrol: only mark charged pages with PageKmemcg ... Browse Code »

To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg,
which sets page->_mapcount to -512. Currently, we set/clear PageKmemcg
in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated
with __GFP_ACCOUNT, including those that aren't actually charged to any
cgroup, i.e. allocated from the root cgroup context. To avoid overhead
in case cgroups are not used, we only do that if memcg_kmem_enabled() is
true. The latter is set iff there are kmem-enabled memory cgroups
(online or offline). The root cgroup is not considered kmem-enabled.

As a result, if a page is allocated with __GFP_ACCOUNT for the root
cgroup when there are kmem-enabled memory cgroups and is freed after all
kmem-enabled memory cgroups were removed, e.g.

# no memory cgroups has been created yet, create one
mkdir /sys/fs/cgroup/memory/test
# run something allocating pages with __GFP_ACCOUNT, e.g.
# a program using pipe
dmesg | tail
# remove the memory cgroup
rmdir /sys/fs/cgroup/memory/test

we'll get bad page state bug complaining about page->_mapcount != -1:

BUG: Bad page state in process swapper/0 pfn:1fd945c
page:ffffea007f651700 count:0 mapcount:-511 mapping: (null) index:0x0
flags: 0x1000000000000000()

To avoid that, let's mark with PageKmemcg only those pages that are
actually charged to and hence pin a non-root memory cgroup.

Fixes: 4949148ad433 ("mm: charge/uncharge kmemcg from generic page allocator paths")
Reported-and-tested-by: Eric Dumazet
Signed-off-by: Vladimir Davydov
Signed-off-by: Linus Torvalds

Vladimir Davydov
2016-08-10 01:14:10 +0800

09 Aug, 2016

16 commits

c87edb361 tracing: Fix tick_stop tracepoint symbols for user export ... Browse Code »

The symbols used in the tick_stop tracepoint were not being converted
properly into integers in the trace_stop format file. Instead we had this:

print fmt: "success=%d dependency=%s", REC->success,
__print_symbolic(REC->dependency, { 0, "NONE" },
{ (1 << TICK_DEP_BIT_POSIX_TIMER), "POSIX_TIMER" },
{ (1 << TICK_DEP_BIT_PERF_EVENTS), "PERF_EVENTS" },
{ (1 << TICK_DEP_BIT_SCHED), "SCHED" },
{ (1 << TICK_DEP_BIT_CLOCK_UNSTABLE), "CLOCK_UNSTABLE" })

User space tools have no idea how to parse "TICK_DEP_BIT_SCHED" or the other
symbols used to do the bit shifting. The reason is that the conversion was
done with using the TICK_DEP_MASK_* symbols which are just macros that
convert to the BIT shift itself (with the exception of NONE, which was
converted properly, because it doesn't use bits, and is defined as zero).

The TICK_DEP_BIT_* needs to be denoted by TRACE_DEFINE_ENUM() in order to
have this properly converted for user space tools to parse this event.

Cc: stable@vger.kernel.org
Cc: Frederic Weisbecker
Fixes: e6e6cc22e067 ("nohz: Use enum code for tick stop failure tracing message")
Reported-by: Luiz Capitulino
Tested-by: Luiz Capitulino
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2016-08-09 21:51:23 +0800
36e9d08b5 drm/cirrus: Fix NULL pointer dereference when registering the fbdev ... Browse Code »

cirrus_modeset_init() is initializing/registering the emulated fbdev
and, since commit c61b93fe51b1 ("drm/atomic: Fix remaining places where
!funcs->best_encoder is valid"), DRM internals can access/test some of
the fields in mode_config->funcs as part of the fbdev registration
process.
Make sure dev->mode_config.funcs is properly set to avoid dereferencing
a NULL pointer.

Reported-by: Mike Marshall
Reported-by: Eric W. Biederman
Signed-off-by: Boris Brezillon
Fixes: c61b93fe51b1 ("drm/atomic: Fix remaining places where !funcs->best_encoder is valid")
Signed-off-by: Dave Airlie

Boris Brezillon
2016-08-09 11:01:47 +0800
caefd8c9a gcc-plugins: Add support for plugin subdirectories ... Browse Code »

This adds support for building more complex gcc plugins that live in a
subdirectory instead of just in a single source file.

Reported-by: PaX Team
Signed-off-by: Emese Revfy
[kees: clarified commit message]
Signed-off-by: Kees Cook

Emese Revfy
2016-08-09 08:53:05 +0800
7040c83bf gcc-plugins: Automate make rule generation ... Browse Code »

There's no reason to repeat the same names in the Makefile when the .so
files have already been listed. The .o list can be generated from them.

Reported-by: PaX Team
Signed-off-by: Emese Revfy
[kees: clarified commit message]
Signed-off-by: Kees Cook

Emese Revfy
2016-08-09 08:52:20 +0800
65d59ec8a gcc-plugins: Add support for passing plugin arguments ... Browse Code »

The latent_entropy plugin needs to pass arguments, so this adds the
support.

Signed-off-by: Emese Revfy
Signed-off-by: Kees Cook

Emese Revfy
2016-08-09 08:49:05 +0800
ed58c0e9e gcc-plugins: abort builds cleanly when not supported ... Browse Code »

When the compiler doesn't support gcc plugins (either due to missing
headers or too old a version), report the problem and abort the build
instead of emitting a warning and letting the build founder with arcane
compiler errors.

Signed-off-by: Kees Cook

Kees Cook
2016-08-09 08:49:05 +0800
d26e94149 kbuild: no gcc-plugins during cc-option tests ... Browse Code »

The gcc-plugins arguments should not be included when performing
cc-option tests.

Steps to reproduce:
1) make mrproper
2) make defconfig
3) enable GCC_PLUGINS, GCC_PLUGIN_CYC_COMPLEXITY
4) enable FUNCTION_TRACER (it will select other options as well)
5) make && make modules

Build errors:
MODPOST 18 modules
ERROR: "__fentry__" [net/netfilter/xt_nat.ko] undefined!
ERROR: "__fentry__" [net/netfilter/xt_mark.ko] undefined!
ERROR: "__fentry__" [net/netfilter/xt_addrtype.ko] undefined!
ERROR: "__fentry__" [net/netfilter/xt_LOG.ko] undefined!
ERROR: "__fentry__" [net/netfilter/nf_nat_sip.ko] undefined!
ERROR: "__fentry__" [net/netfilter/nf_nat_irc.ko] undefined!
ERROR: "__fentry__" [net/netfilter/nf_nat_ftp.ko] undefined!
ERROR: "__fentry__" [net/netfilter/nf_nat.ko] undefined!

Reported-by: Laura Abbott
Signed-off-by: Emese Revfy
[kees: renamed variable, clarified commit message]
Signed-off-by: Kees Cook

Emese Revfy
2016-08-09 08:49:05 +0800
210a021da drm/edid: Set 8 bpc color depth for displays with "DFP 1.x compliant TMDS". ... Browse Code »

According to E-EDID spec 1.3, table 3.9, a digital video sink with the
"DFP 1.x compliant TMDS" bit set is "signal compatible with VESA DFP 1.x
TMDS CRGB, 1 pixel / clock, up to 8 bits / color MSB aligned".

For such displays, the DFP spec 1.0, section 3.10 "EDID support" says:

"If the DFP monitor only supports EDID 1.X (1.1, 1.2, etc.)
without extensions, the host will make the following assumptions:

1. 24-bit MSB-aligned RGB TFT
2. DE polarity is active high
3. H and V syncs are active high
4. Established CRT timings will be used
5. Dithering will not be enabled on the host"

So if we don't know the bit depth of the display from additional
colorimetry info we should assume 8 bpc / 24 bpp by default.

This patch adds info->bpc = 8 assignement for that case.

Signed-off-by: Mario Kleiner
Cc: Jani Nikula
Cc: Ville Syrjälä
Cc: Daniel Vetter
Signed-off-by: Dave Airlie

Mario Kleiner
2016-08-09 06:56:04 +0800
196f954e2 drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown" ... Browse Code »

This reverts commit 013dd9e03872
("drm/i915/dp: fall back to 18 bpp when sink capability is unknown")

This commit introduced a regression into stable kernels,
as it reduces output color depth to 6 bpc for any video
sink connected to a Displayport connector if that sink
doesn't report a specific color depth via EDID, or if
our EDID parser doesn't actually recognize the proper
bpc from EDID.

Affected are active DisplayPort->VGA converters and
active DisplayPort->DVI converters. Both should be
able to handle 8 bpc, but are degraded to 6 bpc with
this patch.

The reverted commit was meant to fix
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=105331

A followup patch implements a fix for that specific bug,
which is caused by a faulty EDID of the affected DP panel
by adding a new EDID quirk for that panel.

DP 18 bpp fallback handling and other improvements to
DP sink bpc detection will be handled for future
kernels in a separate series of patches.

Please backport to stable.

Signed-off-by: Mario Kleiner
Acked-by: Jani Nikula
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä
Cc: Daniel Vetter
Signed-off-by: Dave Airlie

Mario Kleiner
2016-08-09 06:56:01 +0800
e10aec652 drm/edid: Add 6 bpc quirk for display AEO model 0. ... Browse Code »

Bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=105331
reports that the "AEO model 0" display is driven with 8 bpc
without dithering by default, which looks bad because that
panel is apparently a 6 bpc DP panel with faulty EDID.

A fix for this was made by commit 013dd9e03872
("drm/i915/dp: fall back to 18 bpp when sink capability is unknown").

That commit triggers new regressions in precision for DP->DVI and
DP->VGA displays. A patch is out to revert that commit, but it will
revert video output for the AEO model 0 panel to 8 bpc without
dithering.

The EDID 1.3 of that panel, as decoded from the xrandr output
attached to that bugzilla bug report, is somewhat faulty, and beyond
other problems also sets the "DFP 1.x compliant TMDS" bit, which
according to DFP spec means to drive the panel with 8 bpc and
no dithering in absence of other colorimetry information.

Try to make the original bug reporter happy despite the
faulty EDID by adding a quirk to mark that panel as 6 bpc,
so 6 bpc output with dithering creates a nice picture.

Tested by injecting the edid from the fdo bug into a DP connector
via drm_kms_helper.edid_firmware and verifying the 6 bpc + dithering
is selected.

This patch should be backported to stable.

Signed-off-by: Mario Kleiner
Cc: stable@vger.kernel.org
Cc: Jani Nikula
Cc: Ville Syrjälä
Cc: Daniel Vetter
Signed-off-by: Dave Airlie

Mario Kleiner
2016-08-09 06:56:00 +0800
81abf2525 Merge tag 'lkdtm-v4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux ... Browse Code »

Pull lkdtm update from Kees Cook:
"Fix rebuild problem with LKDTM's rodata test"

[ This, and the usercopy branch, both came in before the merge window
closed, but ended up in my 'need to look more' queue and thus got
merged only after rc1 was out ]

* tag 'lkdtm-v4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lkdtm: Fix targets for objcopy usage
lkdtm: fix false positive warning from -Wmaybe-uninitialized

Linus Torvalds
2016-08-09 06:39:24 +0800
1eccfa090 Merge tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux ... Browse Code »

Pull usercopy protection from Kees Cook:
"Tbhis implements HARDENED_USERCOPY verification of copy_to_user and
copy_from_user bounds checking for most architectures on SLAB and
SLUB"

* tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
mm: SLUB hardened usercopy support
mm: SLAB hardened usercopy support
s390/uaccess: Enable hardened usercopy
sparc/uaccess: Enable hardened usercopy
powerpc/uaccess: Enable hardened usercopy
ia64/uaccess: Enable hardened usercopy
arm64/uaccess: Enable hardened usercopy
ARM: uaccess: Enable hardened usercopy
x86/uaccess: Enable hardened usercopy
mm: Hardened usercopy
mm: Implement stack frame object validation
mm: Add is_migrate_cma_page

Linus Torvalds
2016-08-09 05:48:14 +0800
1bd4403d8 unsafe_[get|put]_user: change interface to use a error target label ... Browse Code »

When I initially added the unsafe_[get|put]_user() helpers in commit
5b24a7a2aa20 ("Add 'unsafe' user access functions for batched
accesses"), I made the mistake of modeling the interface on our
traditional __[get|put]_user() functions, which return zero on success,
or -EFAULT on failure.

That interface is fairly easy to use, but it's actually fairly nasty for
good code generation, since it essentially forces the caller to check
the error value for each access.

In particular, since the error handling is already internally
implemented with an exception handler, and we already use "asm goto" for
various other things, we could fairly easily make the error cases just
jump directly to an error label instead, and avoid the need for explicit
checking after each operation.

So switch the interface to pass in an error label, rather than checking
the error value in the caller. Best do it now before we start growing
more users (the signal handling code in particular would be a good place
to use the new interface).

So rather than

if (unsafe_get_user(x, ptr))
... handle error ..

the interface is now

unsafe_get_user(x, ptr, label);

where an error during the user mode fetch will now just cause a jump to
'label' in the caller.

Right now the actual _implementation_ of this all still ends up being a
"if (err) goto label", and does not take advantage of any exception
label tricks, but for "unsafe_put_user()" in particular it should be
fairly straightforward to convert to using the exception table model.

Note that "unsafe_get_user()" is much harder to convert to a clever
exception table model, because current versions of gcc do not allow the
use of "asm goto" (for the exception) with output values (for the actual
value to be fetched). But that is hopefully not a limitation in the
long term.

[ Also note that it might be a good idea to switch unsafe_get_user() to
actually _return_ the value it fetches from user space, but this
commit only changes the error handling semantics ]

Signed-off-by: Linus Torvalds

Linus Torvalds
2016-08-09 04:02:01 +0800
574673c23 printk: Remove unnecessary #ifdef CONFIG_PRINTK ... Browse Code »

In commit 874f9c7da9a4 ("printk: create pr_ functions"), new
pr_level defines were added to printk.c.

These new defines are guarded by an #ifdef CONFIG_PRINTK - however,
there is already a surrounding #ifdef CONFIG_PRINTK starting a lot
earlier in line 249 which means the newly introduced #ifdef is
unnecessary.

Let's remove it to avoid confusion.

Signed-off-by: Andreas Ziegler
Cc: Joe Perches
Cc: Andrew Morton
Signed-off-by: Linus Torvalds

Andreas Ziegler
2016-08-09 02:29:39 +0800
65a97a67a dell-wmi: Ignore WMI event 0xe00e ... Browse Code »

WMI event 0xe00e is received when battery was removed or inserted.

Signed-off-by: Pali Rohár
Signed-off-by: Darren Hart

Pali Rohár
2016-08-09 02:00:21 +0800
65ea11ec6 x86/hweight: Don't clobber %rdi ... Browse Code »

The caller expects %rdi to remain intact, push+pop it make that happen.

Fixes the following kind of explosions on my core2duo machine when
trying to reboot or shut down:

general protection fault: 0000 [#1] PREEMPT SMP
Modules linked in: i915 i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm netconsole configfs binfmt_misc iTCO_wdt psmouse pcspkr snd_hda_codec_idt e100 coretemp hwmon snd_hda_codec_generic i2c_i801 mii i2c_smbus lpc_ich mfd_core snd_hda_intel uhci_hcd snd_hda_codec snd_hwdep snd_hda_core ehci_pci 8250 ehci_hcd snd_pcm 8250_base usbcore evdev serial_core usb_common parport_pc parport snd_timer snd soundcore
CPU: 0 PID: 3070 Comm: reboot Not tainted 4.8.0-rc1-perf-dirty #69
Hardware name: /D946GZIS, BIOS TS94610J.86A.0087.2007.1107.1049 11/07/2007
task: ffff88012a0b4080 task.stack: ffff880123850000
RIP: 0010:[] [] x86_perf_event_update+0x52/0xc0
RSP: 0018:ffff880123853b60 EFLAGS: 00010087
RAX: 0000000000000001 RBX: ffff88012fc0a3c0 RCX: 000000000000001e
RDX: 0000000000000000 RSI: 0000000040000000 RDI: ffff88012b014800
RBP: ffff880123853b88 R08: ffffffffffffffff R09: 0000000000000000
R10: ffffea0004a012c0 R11: ffffea0004acedc0 R12: ffffffff80000001
R13: ffff88012b0149c0 R14: ffff88012b014800 R15: 0000000000000018
FS: 00007f8b155cd700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8b155f5000 CR3: 000000012a2d7000 CR4: 00000000000006f0
Stack:
ffff88012fc0a3c0 ffff88012b014800 0000000000000004 0000000000000001
ffff88012fc1b750 ffff880123853bb0 ffffffff81003d59 ffff88012b014800
ffff88012fc0a3c0 ffff88012b014800 ffff880123853bd8 ffffffff81003e13
Call Trace:
[] x86_pmu_stop+0x59/0xd0
[] x86_pmu_del+0x43/0x140
[] event_sched_out.isra.105+0xbd/0x260
[] __perf_remove_from_context+0x2d/0xb0
[] __perf_event_exit_context+0x4d/0x70
[] generic_exec_single+0xb6/0x140
[] ? __perf_remove_from_context+0xb0/0xb0
[] ? __perf_remove_from_context+0xb0/0xb0
[] smp_call_function_single+0xdf/0x140
[] perf_event_exit_cpu_context+0x87/0xc0
[] perf_reboot+0x13/0x40
[] notifier_call_chain+0x4a/0x70
[] __blocking_notifier_call_chain+0x47/0x60
[] blocking_notifier_call_chain+0x16/0x20
[] kernel_restart_prepare+0x1d/0x40
[] kernel_restart+0x12/0x60
[] SYSC_reboot+0xf6/0x1b0
[] ? mntput_no_expire+0x2c/0x1b0
[] ? mntput+0x24/0x40
[] ? __fput+0x16c/0x1e0
[] ? ____fput+0xe/0x10
[] ? task_work_run+0x83/0xa0
[] ? exit_to_usermode_loop+0x53/0xc0
[] ? trace_hardirqs_on_thunk+0x1a/0x1c
[] SyS_reboot+0xe/0x10
[] entry_SYSCALL_64_fastpath+0x18/0xa3
Code: 7c 4c 8d af c0 01 00 00 49 89 fe eb 10 48 09 c2 4c 89 e0 49 0f b1 55 00 4c 39 e0 74 35 4d 8b a6 c0 01 00 00 41 8b 8e 60 01 00 00 33 8b 35 6e 02 8c 00 48 c1 e2 20 85 f6 7e d2 48 89 d3 89 cf
RIP [] x86_perf_event_update+0x52/0xc0
RSP
---[ end trace 7ec95181faf211be ]---
note: reboot[3070] exited with preempt_count 2

Cc: Borislav Petkov
Cc: H. Peter Anvin
Cc: Andy Lutomirski
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Fixes: f5967101e9de ("x86/hweight: Get rid of the special calling convention")
Signed-off-by: Ville Syrjälä
Signed-off-by: Linus Torvalds

Ville Syrjälä
2016-08-09 01:58:25 +0800

08 Aug, 2016

1 commit

4872850a1 Merge branch 'drm-next-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-next ... Browse Code »

A few fixes for amdgpu and ttm for 4.8
- fix a ttm regression caused by the new pipelining code
- fixes for mullins on amdgpu
- updated golden settings for amdgpu

* 'drm-next-4.8' of git://people.freedesktop.org/~agd5f/linux:
drm/ttm: Wait for a BO to become idle before unbinding it from GTT
drm/amdgpu: update golden setting of polaris10
drm/amdgpu: update golden setting of stoney
drm/amdgpu: update golden setting of polaris11
drm/amdgpu: update golden setting of carrizo
drm/amdgpu: update golden setting of iceland
drm/amd/amdgpu: change pptable output format from ASCII to binary
drm/amdgpu/ci: add mullins to default case for smc ucode
drm/amdgpu/gmc7: add missing mullins case

Dave Airlie
2016-08-08 14:45:33 +0800