24 Aug, 2016

1 commit

  • When we get a hung task it can often be valuable to see _all_ the held
    locks on the system (in case we are being blocked on trying to acquire
    one), e.g. with this patch we can immediately see where the problem is
    below:

    INFO: task trinity-c3:14933 blocked for more than 120 seconds.
    Not tainted 4.8.0-rc1+ #135
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    trinity-c3 D ffff88010c16fc88 0 14933 1 0x00080004
    ffff88010c16fc88 000000003b9aca00 0000000000000000 0000000000000296
    00000000776cdf88 ffff88011a520ae0 ffff88011a520b08 ffff88011a520198
    ffffffff867d7f00 ffff88011942c080 ffff880116841580 ffff88010c168000
    Call Trace:
    [] schedule+0x77/0x230
    [] __lock_sock+0x129/0x250
    [] ? __sk_destruct+0x450/0x450
    [] ? wake_bit_function+0x2e0/0x2e0
    [] lock_sock_nested+0xeb/0x120
    [] irda_setsockopt+0x65/0xb40
    [] SyS_setsockopt+0x139/0x230
    [] ? SyS_recv+0x20/0x20
    [] ? trace_event_raw_event_sys_enter+0xb90/0xb90
    [] ? __this_cpu_preempt_check+0x13/0x20
    [] ? __context_tracking_exit.part.3+0x30/0x1b0
    [] ? SyS_recv+0x20/0x20
    [] do_syscall_64+0x1b3/0x4b0
    [] entry_SYSCALL64_slow_path+0x25/0x25

    Showing all locks held in the system:
    2 locks held by khungtaskd/563:
    #0: (rcu_read_lock){......}, at: [] watchdog+0x106/0x910
    #1: (tasklist_lock){......}, at: [] debug_show_all_locks+0x74/0x360
    1 lock held by trinity-c0/19280:
    #0: (sk_lock-AF_IRDA){......}, at: [] irda_accept+0x176/0x10f0
    1 lock held by trinity-c0/12865:
    #0: (sk_lock-AF_IRDA){......}, at: [] irda_accept+0x176/0x10f0

    Signed-off-by: Vegard Nossum
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Mandeep Singh Baines
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1471538460-7505-1-git-send-email-vegard.nossum@oracle.com
    Signed-off-by: Ingo Molnar

    Vegard Nossum
     

18 Aug, 2016

4 commits

  • When wanting to wake up readers, __rwsem_mark_wake() currently
    iterates the wait_list twice while looking to wake up the first N
    queued reader-tasks. While this can be quite inefficient, it was
    there such that an awoken reader would be first and foremost
    acknowledged by the lock counter.

    Keeping the same logic, we can further benefit from the use of
    wake_qs and avoid entirely the first wait_list iteration that sets
    the counter as wake_up_process() isn't going to occur right away,
    and therefore we maintain the counter->list order of going about
    things.

    Other than saving cycles with O(n) "scanning", this change also
    nicely cleans up a good chunk of __rwsem_mark_wake(), making it
    both visually cleaner and less tedious to read.

    For example, the following improvements were seen on some
    will-it-scale microbenchmarks, on a 48-core Haswell:

                                             v4.7              v4.7-rwsem-v1
    Hmean   signal1-processes-8    5792691.42 (  0.00%)  5771971.04 ( -0.36%)
    Hmean   signal1-processes-12   6081199.96 (  0.00%)  6072174.38 ( -0.15%)
    Hmean   signal1-processes-21   3071137.71 (  0.00%)  3041336.72 ( -0.97%)
    Hmean   signal1-processes-48   3712039.98 (  0.00%)  3708113.59 ( -0.11%)
    Hmean   signal1-processes-79   4464573.45 (  0.00%)  4682798.66 (  4.89%)
    Hmean   signal1-processes-110  4486842.01 (  0.00%)  4633781.71 (  3.27%)
    Hmean   signal1-processes-141  4611816.83 (  0.00%)  4692725.38 (  1.75%)
    Hmean   signal1-processes-172  4638157.05 (  0.00%)  4714387.86 (  1.64%)
    Hmean   signal1-processes-203  4465077.80 (  0.00%)  4690348.07 (  5.05%)
    Hmean   signal1-processes-224  4410433.74 (  0.00%)  4687534.43 (  6.28%)

    Stddev  signal1-processes-8       6360.47 (  0.00%)     8455.31 ( 32.94%)
    Stddev  signal1-processes-12      4004.98 (  0.00%)     9156.13 (128.62%)
    Stddev  signal1-processes-21      3273.14 (  0.00%)     5016.80 ( 53.27%)
    Stddev  signal1-processes-48     28420.25 (  0.00%)    26576.22 ( -6.49%)
    Stddev  signal1-processes-79     22038.34 (  0.00%)    18992.70 (-13.82%)
    Stddev  signal1-processes-110    23226.93 (  0.00%)    17245.79 (-25.75%)
    Stddev  signal1-processes-141     6358.98 (  0.00%)     7636.14 ( 20.08%)
    Stddev  signal1-processes-172     9523.70 (  0.00%)     4824.75 (-49.34%)
    Stddev  signal1-processes-203    13915.33 (  0.00%)     9326.33 (-32.98%)
    Stddev  signal1-processes-224    15573.94 (  0.00%)    10613.82 (-31.85%)

    Other runs that saw improvements include context_switch and pipe; and
    as expected, this is particularly highlighted on larger thread counts
    as it becomes more expensive to walk the list twice.

    No change in wakeup ordering or semantics.
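
    The single-pass idea described above can be sketched in user-space C.
    This is only an illustrative model, not the kernel code: every name
    below (demo_waiter, demo_wake_q, demo_mark_wake_readers, DEMO_READ_BIAS)
    is hypothetical, and a fixed-size array stands in for the kernel's
    wake_q. The readers at the front of the list are detached and queued
    in one walk, the counter is adjusted once, and the actual wakeups are
    assumed to be issued later by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rwsem waiters; not the kernel's types. */
enum demo_waiter_type { DEMO_READER, DEMO_WRITER };

struct demo_waiter {
    enum demo_waiter_type type;
    int task_id;
    struct demo_waiter *next;
};

/* Deferred-wakeup queue: tasks recorded here are woken later by the caller. */
#define DEMO_WAKE_MAX 16
struct demo_wake_q {
    int task_ids[DEMO_WAKE_MAX];
    size_t len;
};

#define DEMO_READ_BIAS 1L /* assumed per-reader counter increment */

/*
 * Walk the wait list once: detach every reader at the front of the list,
 * record it in the wake queue, then adjust the lock counter in one step.
 * Because nothing is actually woken inside the loop, the counter is still
 * updated before any reader can run.
 */
static size_t demo_mark_wake_readers(struct demo_waiter **head, long *count,
                                     struct demo_wake_q *wq)
{
    size_t woken = 0;

    assert(head && count && wq);

    while (*head && (*head)->type == DEMO_READER && wq->len < DEMO_WAKE_MAX) {
        struct demo_waiter *w = *head;

        *head = w->next;            /* unlink from the wait list */
        w->next = NULL;
        wq->task_ids[wq->len++] = w->task_id;
        woken++;
    }

    *count += (long)woken * DEMO_READ_BIAS;
    return woken;
}
```

    With a wait list of reader, reader, writer, reader, the sketch queues
    the first two readers and leaves the writer at the head of the list.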

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman.Long@hp.com
    Cc: dave@stgolabs.net
    Cc: jason.low2@hpe.com
    Cc: wanpeng.li@hotmail.com
    Link: http://lkml.kernel.org/r/1470384285-32163-4-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • Our rwsem code (xadd, at least) is rather well documented, but
    there are a few really annoying comments in there that serve
    no purpose and we shouldn't bother with them.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman.Long@hp.com
    Cc: dave@stgolabs.net
    Cc: jason.low2@hpe.com
    Cc: wanpeng.li@hotmail.com
    Link: http://lkml.kernel.org/r/1470384285-32163-3-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • We currently return a rw_semaphore structure, which is the
    same lock we passed to the function's argument in the first
    place. While there are several functions that choose this
    return value, the callers use it, for example, for things
    like ERR_PTR. This is not the case for __rwsem_mark_wake(),
    and in addition this function is really about the lock
    waiters (which we know there are at this point), so its
    somewhat odd to be returning the sem structure. Note "so its" above
    should read "so it's".

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman.Long@hp.com
    Cc: dave@stgolabs.net
    Cc: jason.low2@hpe.com
    Cc: wanpeng.li@hotmail.com
    Link: http://lkml.kernel.org/r/1470384285-32163-2-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • The current percpu-rwsem read side is entirely free of serializing insns
    at the cost of having a synchronize_sched() in the write path.

    The latency of the synchronize_sched() is too high for cgroups.
    Commit 1ed1328792ff talks about the write path being a fairly cold
    path, but this is not the case for Android, which moves tasks to the
    foreground cgroup and back around binder IPC calls from foreground
    processes to background processes, so it is significantly hotter than
    human-initiated operations.

    Switch cgroup_threadgroup_rwsem into the slow mode for now to avoid the
    problem; hopefully it will not be that slow after another commit:

    80127a39681b ("locking/percpu-rwsem: Optimize readers and reduce global impact").

    We could just add rcu_sync_enter() into cgroup_init() but we do not want
    another synchronize_sched() at boot time, so this patch adds the new helper
    which doesn't block but currently can only be called before the first use.

    Reported-by: John Stultz
    Reported-by: Dmitry Shmidt
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Colin Cross
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Rom Lemarchand
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Todd Kjos
    Link: http://lkml.kernel.org/r/20160811165413.GA22807@redhat.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

12 Aug, 2016

4 commits

  • This commit adds a Korean version of the memory-barriers.txt document.
    The header follows that of the Korean version of HOWTO.

    The translation started in February 2016 and uses a public git
    repository[1] to maintain the work. Its commit history shows that it is
    following upstream changes as well.

    [1] https://github.com/sjp38/linux.doc_trans_membarrier

    Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Byungchul Park
    Acked-by: David Howells
    Acked-by: Minchan Kim
    Acked-by: Jonathan Corbet
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1470939463-31950-4-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    SeongJae Park
     
  • An example result for a data-dependent write has a typo. This commit
    fixes the typo.

    Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dhowells@redhat.com
    Cc: linux-arch@vger.kernel.org
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1470939463-31950-3-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    SeongJae Park
     
  • Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dhowells@redhat.com
    Cc: linux-arch@vger.kernel.org
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1470939463-31950-2-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    SeongJae Park
     
  • Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dhowells@redhat.com
    Cc: linux-arch@vger.kernel.org
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1470939463-31950-1-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    SeongJae Park
     

10 Aug, 2016

14 commits

  • Currently the percpu-rwsem switches to (global) atomic ops while a
    writer is waiting; which could be quite a while and slows down
    releasing the readers.

    This patch cures this problem by ordering the reader-state vs
    reader-count (see the comments in __percpu_down_read() and
    percpu_down_write()). This changes a global atomic op into a full
    memory barrier, which doesn't have the global cacheline contention.

    This also enables using the percpu-rwsem with rcu_sync disabled in order
    to bias the implementation differently, reducing the writer latency by
    adding some cost to readers.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Oleg Nesterov
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Paul McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    [ Fixed modular build. ]
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Currently there is overlap in the pvqspinlock wait_again and
    spurious_wakeup stat counters. Because of lock stealing, it is
    no longer possible to accurately determine if a spurious wakeup has
    happened in the queue head. As they track both the queue node and
    queue head status, it is also hard to tell how many of those come
    from the queue head and how many from the queue node.

    This patch changes the accounting rules so that spurious wakeup is
    only tracked in the queue node. The wait_again count, however, is
    only tracked in the queue head when the vCPU failed to acquire the
    lock after a vCPU kick. This should give a much better indication of
    the wait-kick dynamics in the queue node and the queue head.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boqun Feng
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Pan Xinhui
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1464713631-1066-2-git-send-email-Waiman.Long@hpe.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Restructure pv_queued_spin_steal_lock() as I found it hard to read.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman Long

    Peter Zijlstra
     
  • It's obviously wrong to set stat to NULL. So let's remove it.
    Otherwise it is always zero when we check the latency of kick/wake.

    Signed-off-by: Pan Xinhui
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Waiman Long
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1468405414-3700-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Pan Xinhui
     
  • When the lock holder vCPU is racing with the queue head:

    CPU 0 (lock holder)                   CPU1 (queue head)
    ===================                   =================
    spin_lock();                          spin_lock();
     pv_kick_node():                       pv_wait_head_or_lock():
                                            if (!lp) {
                                             lp = pv_hash(lock, pn);
                                             xchg(&l->locked, _Q_SLOW_VAL);
                                            }
                                            WRITE_ONCE(pn->state, vcpu_halted);
      cmpxchg(&pn->state,
       vcpu_halted, vcpu_hashed);
      WRITE_ONCE(l->locked, _Q_SLOW_VAL);
      (void)pv_hash(lock, pn);

    In this case, the lock holder inserts the pv_node of the queue head
    into the hash table and sets _Q_SLOW_VAL unnecessarily. This patch
    avoids that by restoring/setting the vcpu_hashed state after adaptive
    lock spinning fails.

    Signed-off-by: Wanpeng Li
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Pan Xinhui
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1468484156-4521-1-git-send-email-wanpeng.li@hotmail.com
    Signed-off-by: Ingo Molnar

    Wanpeng Li
     
  • This patch aims to get rid of the endianness dependency in
    queued_write_unlock(). We want to set __qrwlock->wmode to 0, however
    its address is not &lock->cnts on big-endian machines. That causes
    queued_write_unlock() to write 0 to the wrong field of __qrwlock.

    So implement __qrwlock_write_byte() which returns the correct
    __qrwlock->wmode address.
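
    To illustrate why the byte address depends on endianness, here is a
    minimal user-space sketch. It is not the kernel helper: demo_qrwlock,
    demo_write_byte() and demo_write_unlock() are hypothetical names, and
    the sketch probes byte order at run time, whereas a kernel helper
    would presumably decide at build time. It assumes the writer byte is
    the least-significant byte of the 32-bit counter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the lock word; low 8 bits hold the writer byte. */
struct demo_qrwlock {
    uint32_t cnts;
};

/*
 * Return the address of the least-significant byte of cnts.
 * On little-endian hosts that byte sits at offset 0, on big-endian
 * hosts at offset 3, so using &lock->cnts directly is only right on
 * little-endian machines.
 */
static uint8_t *demo_write_byte(struct demo_qrwlock *lock)
{
    const uint32_t probe = 1;
    uint8_t first;

    assert(lock);
    memcpy(&first, &probe, 1);   /* first == 1 means little-endian */
    return (uint8_t *)&lock->cnts + (first ? 0 : 3);
}

/* Clear only the writer byte, leaving the reader count bits untouched. */
static void demo_write_unlock(struct demo_qrwlock *lock)
{
    *demo_write_byte(lock) = 0;
}
```

    On either byte order, unlocking a word of 0x123456ff leaves 0x12345600,
    that is, only the low byte is cleared.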

    Suggested-by: Peter Zijlstra (Intel)
    Signed-off-by: Pan Xinhui
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman.Long@hpe.com
    Cc: arnd@arndb.de
    Cc: boqun.feng@gmail.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1468835259-4486-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    pan xinhui
     
  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • This reverts commit 874f9c7da9a4acbc1b9e12ca722579fb50e4d142.

    Geert Uytterhoeven reports:
    "This change seems to have an (unintendent?) side-effect.

    Before, pr_*() calls without a trailing newline characters would be
    printed with a newline character appended, both on the console and in
    the output of the dmesg command.

    After this commit, no new line character is appended, and the output
    of the next pr_*() call of the same type may be appended, like in:

    - Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000
    - Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)
    + Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)"

    Joe Perches says:
    "No, that is not intentional.

    The newline handling code inside vprintk_emit is a bit involved and
    for now I suggest a revert until this has all the same behavior as
    earlier"

    Reported-by: Geert Uytterhoeven
    Requested-by: Joe Perches
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull tracing fix from Steven Rostedt:
    "Fix tick_stop tracepoint symbols for user export.

    Luiz Capitulino noticed that the tick_stop tracepoint wasn't being
    parsed properly by the tracing user space tools.

    This was due to the TRACE_DEFINE_ENUM() being set to a define, when it
    should have been set to the enum itself. The define was of the MASK
    that used the BIT to shift. The BIT was the enum and by adding that,
    everything gets converted nicely. The MASK is still kept just in case
    it gets converted to an enum in the future"

    * tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Fix tick_stop tracepoint symbols for user export

    Linus Torvalds
     
  • …inux/kernel/git/kees/linux

    Pull gcc plugin improvements from Kees Cook:
    "Several fixes/improvements for the gcc plugin infrastructure:

    - fix a problem with gcc plugins interfering with cc-option tests.

    - abort more gracefully when gcc plugin headers or compiler support
    is missing.

    - improve the gcc plugin rule generation to be more dynamic, pass
    arguments, and build from subdirectories"

    * tag 'gcc-plugin-infrastructure-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    gcc-plugins: Add support for plugin subdirectories
    gcc-plugins: Automate make rule generation
    gcc-plugins: Add support for passing plugin arguments
    gcc-plugins: abort builds cleanly when not supported
    kbuild: no gcc-plugins during cc-option tests

    Linus Torvalds
     
  • …linux-platform-drivers-x86

    Pull x86 platform driver update from Darren Hart:
    "dell-wmi: ignore battery remove/insert event"

    * tag 'platform-drivers-x86-v4.8-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
    dell-wmi: Ignore WMI event 0xe00e

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "This contains a bunch of amdgpu fixes, and some i915 regression fixes.

    It also contains some fixes for an older regression with some EDID
    changes and some 6bpc panels.

    Then there are the lockdep, cirrus and rcar-du regression fixes from
    this window"

    * tag 'drm-fixes-for-4.8-rc2' of git://people.freedesktop.org/~airlied/linux:
    drm/cirrus: Fix NULL pointer dereference when registering the fbdev
    drm/edid: Set 8 bpc color depth for displays with "DFP 1.x compliant TMDS".
    drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown"
    drm/edid: Add 6 bpc quirk for display AEO model 0.
    drm: Paper over locking inversion after registration rework
    drm: rcar-du: Link HDMI encoder with bridge
    drm/ttm: Wait for a BO to become idle before unbinding it from GTT
    drm/i915/fbdev: Check for the framebuffer before use
    drm/amdgpu: update golden setting of polaris10
    drm/amdgpu: update golden setting of stoney
    drm/amdgpu: update golden setting of polaris11
    drm/amdgpu: update golden setting of carrizo
    drm/amdgpu: update golden setting of iceland
    drm/amd/amdgpu: change pptable output format from ASCII to binary
    drm/amdgpu/ci: add mullins to default case for smc ucode
    drm/amdgpu/gmc7: add missing mullins case
    drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
    drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL

    Linus Torvalds
     
  • Commit b195d5e2bffd ("ipr: Wait to do async scan until scsi host is
    initialized") fixed async scan for ipr, but broke sync scan for ipr.

    This fixes sync scan back up.

    Signed-off-by: Brian King
    Reported-and-tested-by: Michael Ellerman
    Signed-off-by: Linus Torvalds

    Brian King
     
  • To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg,
    which sets page->_mapcount to -512. Currently, we set/clear PageKmemcg
    in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated
    with __GFP_ACCOUNT, including those that aren't actually charged to any
    cgroup, i.e. allocated from the root cgroup context. To avoid overhead
    in case cgroups are not used, we only do that if memcg_kmem_enabled() is
    true. The latter is set iff there are kmem-enabled memory cgroups
    (online or offline). The root cgroup is not considered kmem-enabled.

    As a result, if a page is allocated with __GFP_ACCOUNT for the root
    cgroup when there are kmem-enabled memory cgroups and is freed after all
    kmem-enabled memory cgroups were removed, e.g.

    # no memory cgroup has been created yet, create one
    mkdir /sys/fs/cgroup/memory/test
    # run something allocating pages with __GFP_ACCOUNT, e.g.
    # a program using pipe
    dmesg | tail
    # remove the memory cgroup
    rmdir /sys/fs/cgroup/memory/test

    we'll get a bad page state bug complaining about page->_mapcount != -1:

    BUG: Bad page state in process swapper/0 pfn:1fd945c
    page:ffffea007f651700 count:0 mapcount:-511 mapping: (null) index:0x0
    flags: 0x1000000000000000()

    To avoid that, let's mark with PageKmemcg only those pages that are
    actually charged to and hence pin a non-root memory cgroup.
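
    The failure mode and the fix can be modelled in a few lines of
    user-space C. This is a deliberately simplified sketch, not the
    kernel implementation: demo_page and the buggy_*/fixed_* helpers are
    hypothetical, and a plain bool stands in for memcg_kmem_enabled().
    The only values taken from the text above are -512 for a marked page
    and -1 for an unmarked one.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAPCOUNT_NONE   (-1)    /* page carries no special mark */
#define DEMO_MAPCOUNT_KMEMCG (-512)  /* page is marked PageKmemcg */

struct demo_page {
    int mapcount;
};

/* A freed page whose mapcount is not -1 triggers "Bad page state". */
static bool demo_page_is_bad(const struct demo_page *p)
{
    assert(p);
    return p->mapcount != DEMO_MAPCOUNT_NONE;
}

/* Old scheme: set and clear are each gated by the global switch as it
 * happens to be at that moment, so the two can disagree. */
static void buggy_alloc(struct demo_page *p, bool kmem_enabled)
{
    p->mapcount = kmem_enabled ? DEMO_MAPCOUNT_KMEMCG : DEMO_MAPCOUNT_NONE;
}

static void buggy_free(struct demo_page *p, bool kmem_enabled)
{
    if (kmem_enabled)
        p->mapcount = DEMO_MAPCOUNT_NONE;
}

/* New scheme: mark only pages really charged to a non-root cgroup, and
 * let the free path consult the page's own mark. */
static void fixed_alloc(struct demo_page *p, bool charged_to_non_root)
{
    p->mapcount = charged_to_non_root ? DEMO_MAPCOUNT_KMEMCG
                                      : DEMO_MAPCOUNT_NONE;
}

static void fixed_free(struct demo_page *p)
{
    if (p->mapcount == DEMO_MAPCOUNT_KMEMCG)
        p->mapcount = DEMO_MAPCOUNT_NONE;
}
```

    In this model, buggy_alloc() with the switch on followed by
    buggy_free() with the switch off leaves the page at -512, which is the
    situation the report above describes.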

    Fixes: 4949148ad433 ("mm: charge/uncharge kmemcg from generic page allocator paths")
    Reported-and-tested-by: Eric Dumazet
    Signed-off-by: Vladimir Davydov
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

09 Aug, 2016

16 commits

  • The symbols used in the tick_stop tracepoint were not being converted
    properly into integers in the tick_stop format file. Instead we had this:

    print fmt: "success=%d dependency=%s", REC->success,
    __print_symbolic(REC->dependency, { 0, "NONE" },
    { (1 << TICK_DEP_BIT_POSIX_TIMER), "POSIX_TIMER" },
    { (1 << TICK_DEP_BIT_PERF_EVENTS), "PERF_EVENTS" },
    { (1 << TICK_DEP_BIT_SCHED), "SCHED" },
    { (1 << TICK_DEP_BIT_CLOCK_UNSTABLE), "CLOCK_UNSTABLE" })

    User space tools have no idea how to parse "TICK_DEP_BIT_SCHED" or the other
    symbols used to do the bit shifting. The reason is that the conversion was
    done using the TICK_DEP_MASK_* symbols, which are just macros that
    expand to the BIT shift itself (with the exception of NONE, which was
    converted properly, because it doesn't use bits, and is defined as zero).

    The TICK_DEP_BIT_* symbols need to be denoted by TRACE_DEFINE_ENUM() in
    order to have this properly converted for user space tools to parse this event.
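
    Why the enum name leaks into the exported text can be shown with
    ordinary preprocessor stringification. This is only an analogy under
    the assumption that the format text is produced by stringifying the
    print arguments; the DEMO_* names are hypothetical and merely modelled
    on TICK_DEP_BIT_* and TICK_DEP_MASK_*.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical enum and mask macro, shaped like the ones discussed above. */
enum demo_dep_bit {
    DEMO_BIT_POSIX_TIMER,
    DEMO_BIT_PERF_EVENTS,
    DEMO_BIT_SCHED
};
#define DEMO_MASK_SCHED (1 << DEMO_BIT_SCHED)

#define DEMO_STR(x)  #x
#define DEMO_XSTR(x) DEMO_STR(x)   /* expand macros first, then stringify */

/* The macro is expanded, but the enum constant inside it is not a macro,
 * so its name survives into the generated text. */
static const char *demo_exported_text(void)
{
    return DEMO_XSTR(DEMO_MASK_SCHED);
}

/* A text-only consumer cannot evaluate a C enum name. */
static bool demo_has_unresolved_symbol(const char *text)
{
    assert(text);
    return strstr(text, "DEMO_BIT_") != NULL;
}
```

    The compiler knows DEMO_MASK_SCHED is 4, but a tool that only sees the
    text gets the symbol name, which is why the enum itself has to be
    declared to the tracing infrastructure.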

    Cc: stable@vger.kernel.org
    Cc: Frederic Weisbecker
    Fixes: e6e6cc22e067 ("nohz: Use enum code for tick stop failure tracing message")
    Reported-by: Luiz Capitulino
    Tested-by: Luiz Capitulino
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • cirrus_modeset_init() is initializing/registering the emulated fbdev
    and, since commit c61b93fe51b1 ("drm/atomic: Fix remaining places where
    !funcs->best_encoder is valid"), DRM internals can access/test some of
    the fields in mode_config->funcs as part of the fbdev registration
    process.
    Make sure dev->mode_config.funcs is properly set to avoid dereferencing
    a NULL pointer.

    Reported-by: Mike Marshall
    Reported-by: Eric W. Biederman
    Signed-off-by: Boris Brezillon
    Fixes: c61b93fe51b1 ("drm/atomic: Fix remaining places where !funcs->best_encoder is valid")
    Signed-off-by: Dave Airlie

    Boris Brezillon
     
  • This adds support for building more complex gcc plugins that live in a
    subdirectory instead of just in a single source file.

    Reported-by: PaX Team
    Signed-off-by: Emese Revfy
    [kees: clarified commit message]
    Signed-off-by: Kees Cook

    Emese Revfy
     
  • There's no reason to repeat the same names in the Makefile when the .so
    files have already been listed. The .o list can be generated from them.

    Reported-by: PaX Team
    Signed-off-by: Emese Revfy
    [kees: clarified commit message]
    Signed-off-by: Kees Cook

    Emese Revfy
     
  • The latent_entropy plugin needs to pass arguments, so this adds the
    support.

    Signed-off-by: Emese Revfy
    Signed-off-by: Kees Cook

    Emese Revfy
     
  • When the compiler doesn't support gcc plugins (either due to missing
    headers or too old a version), report the problem and abort the build
    instead of emitting a warning and letting the build founder with arcane
    compiler errors.

    Signed-off-by: Kees Cook

    Kees Cook
     
  • The gcc-plugins arguments should not be included when performing
    cc-option tests.

    Steps to reproduce:
    1) make mrproper
    2) make defconfig
    3) enable GCC_PLUGINS, GCC_PLUGIN_CYC_COMPLEXITY
    4) enable FUNCTION_TRACER (it will select other options as well)
    5) make && make modules

    Build errors:
    MODPOST 18 modules
    ERROR: "__fentry__" [net/netfilter/xt_nat.ko] undefined!
    ERROR: "__fentry__" [net/netfilter/xt_mark.ko] undefined!
    ERROR: "__fentry__" [net/netfilter/xt_addrtype.ko] undefined!
    ERROR: "__fentry__" [net/netfilter/xt_LOG.ko] undefined!
    ERROR: "__fentry__" [net/netfilter/nf_nat_sip.ko] undefined!
    ERROR: "__fentry__" [net/netfilter/nf_nat_irc.ko] undefined!
    ERROR: "__fentry__" [net/netfilter/nf_nat_ftp.ko] undefined!
    ERROR: "__fentry__" [net/netfilter/nf_nat.ko] undefined!

    Reported-by: Laura Abbott
    Signed-off-by: Emese Revfy
    [kees: renamed variable, clarified commit message]
    Signed-off-by: Kees Cook

    Emese Revfy
     
  • According to E-EDID spec 1.3, table 3.9, a digital video sink with the
    "DFP 1.x compliant TMDS" bit set is "signal compatible with VESA DFP 1.x
    TMDS CRGB, 1 pixel / clock, up to 8 bits / color MSB aligned".

    For such displays, the DFP spec 1.0, section 3.10 "EDID support" says:

    "If the DFP monitor only supports EDID 1.X (1.1, 1.2, etc.)
    without extensions, the host will make the following assumptions:

    1. 24-bit MSB-aligned RGB TFT
    2. DE polarity is active high
    3. H and V syncs are active high
    4. Established CRT timings will be used
    5. Dithering will not be enabled on the host"

    So if we don't know the bit depth of the display from additional
    colorimetry info we should assume 8 bpc / 24 bpp by default.

    This patch adds the info->bpc = 8 assignment for that case.
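
    The rule can be sketched as a small stand-alone function. This is an
    illustration rather than the DRM code: demo_default_bpc() and the
    DEMO_* constants are hypothetical, and the byte offsets (revision at
    byte 19, video input definition at byte 20, digital flag in bit 7, DFP
    1.x flag in bit 0) are my reading of the EDID 1.3 base block and should
    be checked against the spec. Restricting the rule to revisions below 4
    is likewise an assumption, since EDID 1.4 reuses those bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed EDID 1.3 base-block layout; verify against the E-EDID spec. */
#define DEMO_EDID_REVISION_OFFSET 19
#define DEMO_EDID_INPUT_OFFSET    20
#define DEMO_EDID_INPUT_DIGITAL   0x80  /* bit 7: digital input */
#define DEMO_EDID_DFP_1_X         0x01  /* bit 0: DFP 1.x compliant TMDS */

/*
 * Return the default bits per color component implied by the base block,
 * or 0 when the block gives no hint.
 */
static int demo_default_bpc(const uint8_t *edid)
{
    uint8_t input;

    assert(edid);
    input = edid[DEMO_EDID_INPUT_OFFSET];

    if (!(input & DEMO_EDID_INPUT_DIGITAL))
        return 0;                       /* analog sink: rule does not apply */
    if (edid[DEMO_EDID_REVISION_OFFSET] >= 4)
        return 0;                       /* EDID 1.4 encodes depth differently */
    if (input & DEMO_EDID_DFP_1_X)
        return 8;                       /* DFP 1.x: assume 24-bit RGB */
    return 0;
}
```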

    Signed-off-by: Mario Kleiner
    Cc: Jani Nikula
    Cc: Ville Syrjälä
    Cc: Daniel Vetter
    Signed-off-by: Dave Airlie

    Mario Kleiner
     
  • This reverts commit 013dd9e03872
    ("drm/i915/dp: fall back to 18 bpp when sink capability is unknown")

    This commit introduced a regression into stable kernels,
    as it reduces output color depth to 6 bpc for any video
    sink connected to a DisplayPort connector if that sink
    doesn't report a specific color depth via EDID, or if
    our EDID parser doesn't actually recognize the proper
    bpc from EDID.

    Affected are active DisplayPort->VGA converters and
    active DisplayPort->DVI converters. Both should be
    able to handle 8 bpc, but are degraded to 6 bpc with
    this patch.

    The reverted commit was meant to fix
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=105331

    A followup patch implements a fix for that specific bug,
    which is caused by a faulty EDID of the affected DP panel
    by adding a new EDID quirk for that panel.

    DP 18 bpp fallback handling and other improvements to
    DP sink bpc detection will be handled for future
    kernels in a separate series of patches.

    Please backport to stable.

    Signed-off-by: Mario Kleiner
    Acked-by: Jani Nikula
    Cc: stable@vger.kernel.org
    Cc: Ville Syrjälä
    Cc: Daniel Vetter
    Signed-off-by: Dave Airlie

    Mario Kleiner
     
  • Bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=105331
    reports that the "AEO model 0" display is driven with 8 bpc
    without dithering by default, which looks bad because that
    panel is apparently a 6 bpc DP panel with faulty EDID.

    A fix for this was made by commit 013dd9e03872
    ("drm/i915/dp: fall back to 18 bpp when sink capability is unknown").

    That commit triggers new regressions in precision for DP->DVI and
    DP->VGA displays. A patch is out to revert that commit, but it will
    revert video output for the AEO model 0 panel to 8 bpc without
    dithering.

    The EDID 1.3 of that panel, as decoded from the xrandr output
    attached to that bugzilla bug report, is somewhat faulty, and beyond
    other problems also sets the "DFP 1.x compliant TMDS" bit, which
    according to DFP spec means to drive the panel with 8 bpc and
    no dithering in absence of other colorimetry information.

    Try to make the original bug reporter happy despite the
    faulty EDID by adding a quirk to mark that panel as 6 bpc,
    so 6 bpc output with dithering creates a nice picture.

    Tested by injecting the edid from the fdo bug into a DP connector
    via drm_kms_helper.edid_firmware and verifying the 6 bpc + dithering
    is selected.

    This patch should be backported to stable.

    Signed-off-by: Mario Kleiner
    Cc: stable@vger.kernel.org
    Cc: Jani Nikula
    Cc: Ville Syrjälä
    Cc: Daniel Vetter
    Signed-off-by: Dave Airlie

    Mario Kleiner
     
  • Pull lkdtm update from Kees Cook:
    "Fix rebuild problem with LKDTM's rodata test"

    [ This, and the usercopy branch, both came in before the merge window
    closed, but ended up in my 'need to look more' queue and thus got
    merged only after rc1 was out ]

    * tag 'lkdtm-v4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    lkdtm: Fix targets for objcopy usage
    lkdtm: fix false positive warning from -Wmaybe-uninitialized

    Linus Torvalds
     
  • Pull usercopy protection from Kees Cook:
    "This implements HARDENED_USERCOPY verification of copy_to_user and
    copy_from_user bounds checking for most architectures on SLAB and
    SLUB"

    * tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    mm: SLUB hardened usercopy support
    mm: SLAB hardened usercopy support
    s390/uaccess: Enable hardened usercopy
    sparc/uaccess: Enable hardened usercopy
    powerpc/uaccess: Enable hardened usercopy
    ia64/uaccess: Enable hardened usercopy
    arm64/uaccess: Enable hardened usercopy
    ARM: uaccess: Enable hardened usercopy
    x86/uaccess: Enable hardened usercopy
    mm: Hardened usercopy
    mm: Implement stack frame object validation
    mm: Add is_migrate_cma_page

    Linus Torvalds
     
  • When I initially added the unsafe_[get|put]_user() helpers in commit
    5b24a7a2aa20 ("Add 'unsafe' user access functions for batched
    accesses"), I made the mistake of modeling the interface on our
    traditional __[get|put]_user() functions, which return zero on success,
    or -EFAULT on failure.

    That interface is fairly easy to use, but it's actually fairly nasty for
    good code generation, since it essentially forces the caller to check
    the error value for each access.

    In particular, since the error handling is already internally
    implemented with an exception handler, and we already use "asm goto" for
    various other things, we could fairly easily make the error cases just
    jump directly to an error label instead, and avoid the need for explicit
    checking after each operation.

    So switch the interface to pass in an error label, rather than checking
    the error value in the caller. Best do it now before we start growing
    more users (the signal handling code in particular would be a good place
    to use the new interface).

    So rather than

        if (unsafe_get_user(x, ptr))
                ... handle error ..

    the interface is now

    unsafe_get_user(x, ptr, label);

    where an error during the user mode fetch will now just cause a jump to
    'label' in the caller.
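
    As a rough illustration of the difference for callers, here is a
    user-space sketch, not the kernel implementation. mock_fetch(),
    old_unsafe_get_user(), new_unsafe_get_user(), sum_old() and sum_new()
    are all hypothetical names invented for this example, and a NULL
    pointer stands in for a faulting user address. The label-taking macro
    is written as a plain "if (err) goto label".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a user-space fetch: "faults" on NULL. */
static int mock_fetch(int *dst, const int *src)
{
	if (src == NULL)
		return -EFAULT;
	*dst = *src;
	return 0;
}

/* Old-style interface: evaluates to 0 on success or -EFAULT on failure. */
#define old_unsafe_get_user(x, ptr) mock_fetch(&(x), (ptr))

/* New-style interface: jumps to 'label' on failure, yields no value. */
#define new_unsafe_get_user(x, ptr, label)		\
	do {						\
		if (mock_fetch(&(x), (ptr)))		\
			goto label;			\
	} while (0)

/* Caller using the old interface: one explicit check per access. */
int sum_old(const int *a, const int *b, int *out)
{
	int x, y;

	if (old_unsafe_get_user(x, a))
		return -EFAULT;
	if (old_unsafe_get_user(y, b))
		return -EFAULT;
	*out = x + y;
	return 0;
}

/* Caller using the new interface: a single shared error label. */
int sum_new(const int *a, const int *b, int *out)
{
	int x, y;

	new_unsafe_get_user(x, a, efault);
	new_unsafe_get_user(y, b, efault);
	*out = x + y;
	return 0;

efault:
	return -EFAULT;
}
```

    Both callers behave the same; the second simply has no per-access
    error check in its main path, which is the property the new interface
    is after.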

    Right now the actual _implementation_ of this all still ends up being a
    "if (err) goto label", and does not take advantage of any exception
    label tricks, but for "unsafe_put_user()" in particular it should be
    fairly straightforward to convert to using the exception table model.

    Note that "unsafe_get_user()" is much harder to convert to a clever
    exception table model, because current versions of gcc do not allow the
    use of "asm goto" (for the exception) with output values (for the actual
    value to be fetched). But that is hopefully not a limitation in the
    long term.

    [ Also note that it might be a good idea to switch unsafe_get_user() to
    actually _return_ the value it fetches from user space, but this
    commit only changes the error handling semantics ]

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • In commit 874f9c7da9a4 ("printk: create pr_ functions"), new
    pr_level defines were added to printk.c.

    These new defines are guarded by an #ifdef CONFIG_PRINTK - however,
    there is already a surrounding #ifdef CONFIG_PRINTK starting a lot
    earlier in line 249 which means the newly introduced #ifdef is
    unnecessary.
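
    The redundancy can be pictured with a tiny stand-alone sketch. This is
    not the actual printk.c code: CONFIG_PRINTK is defined by hand here
    purely for the demonstration, and outer_guarded() and inner_guarded()
    are hypothetical names.

```c
/* Assumption for this demo only: pretend the option is enabled. */
#define CONFIG_PRINTK 1

#ifdef CONFIG_PRINTK		/* outer guard, opened much earlier */

int outer_guarded(void)
{
	return 1;
}

#ifdef CONFIG_PRINTK		/* inner guard: always true at this point */
int inner_guarded(void)
{
	return 2;
}
#endif				/* closes the inner guard */

#endif				/* closes the outer guard */
```

    Dropping the inner #ifdef/#endif pair changes nothing about what gets
    compiled, because the inner block is only ever reached when the outer
    condition already holds.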

    Let's remove it to avoid confusion.

    Signed-off-by: Andreas Ziegler
    Cc: Joe Perches
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Ziegler
     
  • WMI event 0xe00e is received when the battery is removed or inserted.

    Signed-off-by: Pali Rohár
    Signed-off-by: Darren Hart

    Pali Rohár
     
  • The caller expects %rdi to remain intact, so push+pop it to make that happen.

    Fixes the following kind of explosions on my core2duo machine when
    trying to reboot or shut down:

    general protection fault: 0000 [#1] PREEMPT SMP
    Modules linked in: i915 i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm netconsole configfs binfmt_misc iTCO_wdt psmouse pcspkr snd_hda_codec_idt e100 coretemp hwmon snd_hda_codec_generic i2c_i801 mii i2c_smbus lpc_ich mfd_core snd_hda_intel uhci_hcd snd_hda_codec snd_hwdep snd_hda_core ehci_pci 8250 ehci_hcd snd_pcm 8250_base usbcore evdev serial_core usb_common parport_pc parport snd_timer snd soundcore
    CPU: 0 PID: 3070 Comm: reboot Not tainted 4.8.0-rc1-perf-dirty #69
    Hardware name: /D946GZIS, BIOS TS94610J.86A.0087.2007.1107.1049 11/07/2007
    task: ffff88012a0b4080 task.stack: ffff880123850000
    RIP: 0010:[] [] x86_perf_event_update+0x52/0xc0
    RSP: 0018:ffff880123853b60 EFLAGS: 00010087
    RAX: 0000000000000001 RBX: ffff88012fc0a3c0 RCX: 000000000000001e
    RDX: 0000000000000000 RSI: 0000000040000000 RDI: ffff88012b014800
    RBP: ffff880123853b88 R08: ffffffffffffffff R09: 0000000000000000
    R10: ffffea0004a012c0 R11: ffffea0004acedc0 R12: ffffffff80000001
    R13: ffff88012b0149c0 R14: ffff88012b014800 R15: 0000000000000018
    FS: 00007f8b155cd700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f8b155f5000 CR3: 000000012a2d7000 CR4: 00000000000006f0
    Stack:
    ffff88012fc0a3c0 ffff88012b014800 0000000000000004 0000000000000001
    ffff88012fc1b750 ffff880123853bb0 ffffffff81003d59 ffff88012b014800
    ffff88012fc0a3c0 ffff88012b014800 ffff880123853bd8 ffffffff81003e13
    Call Trace:
    [] x86_pmu_stop+0x59/0xd0
    [] x86_pmu_del+0x43/0x140
    [] event_sched_out.isra.105+0xbd/0x260
    [] __perf_remove_from_context+0x2d/0xb0
    [] __perf_event_exit_context+0x4d/0x70
    [] generic_exec_single+0xb6/0x140
    [] ? __perf_remove_from_context+0xb0/0xb0
    [] ? __perf_remove_from_context+0xb0/0xb0
    [] smp_call_function_single+0xdf/0x140
    [] perf_event_exit_cpu_context+0x87/0xc0
    [] perf_reboot+0x13/0x40
    [] notifier_call_chain+0x4a/0x70
    [] __blocking_notifier_call_chain+0x47/0x60
    [] blocking_notifier_call_chain+0x16/0x20
    [] kernel_restart_prepare+0x1d/0x40
    [] kernel_restart+0x12/0x60
    [] SYSC_reboot+0xf6/0x1b0
    [] ? mntput_no_expire+0x2c/0x1b0
    [] ? mntput+0x24/0x40
    [] ? __fput+0x16c/0x1e0
    [] ? ____fput+0xe/0x10
    [] ? task_work_run+0x83/0xa0
    [] ? exit_to_usermode_loop+0x53/0xc0
    [] ? trace_hardirqs_on_thunk+0x1a/0x1c
    [] SyS_reboot+0xe/0x10
    [] entry_SYSCALL_64_fastpath+0x18/0xa3
    Code: 7c 4c 8d af c0 01 00 00 49 89 fe eb 10 48 09 c2 4c 89 e0 49 0f b1 55 00 4c 39 e0 74 35 4d 8b a6 c0 01 00 00 41 8b 8e 60 01 00 00 33 8b 35 6e 02 8c 00 48 c1 e2 20 85 f6 7e d2 48 89 d3 89 cf
    RIP [] x86_perf_event_update+0x52/0xc0
    RSP
    ---[ end trace 7ec95181faf211be ]---
    note: reboot[3070] exited with preempt_count 2

    Cc: Borislav Petkov
    Cc: H. Peter Anvin
    Cc: Andy Lutomirski
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Fixes: f5967101e9de ("x86/hweight: Get rid of the special calling convention")
    Signed-off-by: Ville Syrjälä
    Signed-off-by: Linus Torvalds

    Ville Syrjälä
     

08 Aug, 2016

1 commit

  • A few fixes for amdgpu and ttm for 4.8
    - fix a ttm regression caused by the new pipelining code
    - fixes for mullins on amdgpu
    - updated golden settings for amdgpu

    * 'drm-next-4.8' of git://people.freedesktop.org/~agd5f/linux:
    drm/ttm: Wait for a BO to become idle before unbinding it from GTT
    drm/amdgpu: update golden setting of polaris10
    drm/amdgpu: update golden setting of stoney
    drm/amdgpu: update golden setting of polaris11
    drm/amdgpu: update golden setting of carrizo
    drm/amdgpu: update golden setting of iceland
    drm/amd/amdgpu: change pptable output format from ASCII to binary
    drm/amdgpu/ci: add mullins to default case for smc ucode
    drm/amdgpu/gmc7: add missing mullins case

    Dave Airlie