27 Mar, 2019

1 commit

  • commit 71492580571467fb7177aade19c18ce7486267f5 upstream.

    Tetsuo Handa reported that he saw an incorrect "downgrading a read lock"
    warning right after a previous lockdep warning. It is likely that the
    previous warning turned off lock debugging, leaving lockdep in an
    inconsistent state that led to the lock downgrade warning.

    Fix that by adding a check for debug_locks at the beginning of
    __lock_downgrade().
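
    The shape of the added guard, as a sketch (the early-return value is
    an assumption, not the verbatim diff):

        /* in __lock_downgrade(): lockdep state may no longer be
         * trustworthy once a prior warning has cleared debug_locks,
         * so skip the consistency check rather than warn */
        if (unlikely(!debug_locks))
                return 0;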

    Debugged-by: Tetsuo Handa
    Reported-by: Tetsuo Handa
    Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/1547093005-26085-1-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Waiman Long
     

06 Mar, 2019

1 commit

  • [ Upstream commit e158488be27b157802753a59b336142dc0eb0380 ]

    Because wake_q_add() can imply an immediate wakeup (cmpxchg failure
    case), we must not rely on the wakeup being delayed. However, commit:

    e38513905eea ("locking/rwsem: Rework zeroing reader waiter->task")

    relies on exactly that behaviour in that the wakeup must not happen
    until after we clear waiter->task.

    [ peterz: Added changelog. ]
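
    The resulting ordering, as a sketch (reconstructed from the
    description above; the task reference is held so an immediate wakeup
    cannot race with task exit):

        get_task_struct(tsk);
        /* clear the waiter first ... */
        smp_store_release(&waiter->task, NULL);
        /* ... so that the wakeup, which wake_q_add() may issue
         * immediately on cmpxchg failure, always comes after it */
        wake_q_add(wake_q, tsk);
        put_task_struct(tsk);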

    Signed-off-by: Xie Yongji
    Signed-off-by: Zhang Yu
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: e38513905eea ("locking/rwsem: Rework zeroing reader waiter->task")
    Link: https://lkml.kernel.org/r/1543495830-2644-1-git-send-email-xieyongji@baidu.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin

    Xie Yongji
     

13 Feb, 2019

1 commit

  • commit 1a1fb985f2e2b85ec0d3dc2e519ee48389ec2434 upstream.

    commit 56222b212e8e ("futex: Drop hb->lock before enqueueing on the
    rtmutex") changed the locking rules in the futex code so that the hash
    bucket lock is no longer held while the waiter is enqueued into the
    rtmutex wait list. This made the lock and the unlock path symmetric, but
    unfortunately the possible early exit from __rt_mutex_proxy_start() due to
    a detected deadlock was not updated accordingly. That allows a concurrent
    unlocker to observe inconsistent state, which triggers the warning in the
    unlock path.

    futex_lock_pi()                  futex_unlock_pi()
      lock(hb->lock)
      queue(hb_waiter)               lock(hb->lock)
      lock(rtmutex->wait_lock)
      unlock(hb->lock)
                                     // acquired hb->lock
                                     hb_waiter = futex_top_waiter()
                                     lock(rtmutex->wait_lock)
      __rt_mutex_proxy_start()
         ---> fail
              remove(rtmutex_waiter);
         ---> returns -EDEADLOCK
      unlock(rtmutex->wait_lock)
                                     // acquired wait_lock
                                     wake_futex_pi()
                                     rt_mutex_next_owner()
                                       --> returns NULL
                                       --> WARN

      lock(hb->lock)
      unqueue(hb_waiter)

    The problem is caused by the remove(rtmutex_waiter) in the failure case of
    __rt_mutex_proxy_start() as this lets the unlocker observe a waiter in the
    hash bucket but no waiter on the rtmutex, i.e. inconsistent state.

    The original commit handles this correctly for the other early return cases
    (timeout, signal) by delaying the removal of the rtmutex waiter until the
    returning task has reacquired the hash bucket lock.

    Treat the failure case of __rt_mutex_proxy_start() in the same way and let
    the existing cleanup code handle the eventual handover of the rtmutex
    gracefully. The regular rt_mutex_proxy_start() gains the rtmutex waiter
    removal for the failure case, so that the other callsites are still
    operating correctly.

    Add proper comments to the code so all these details are fully documented.

    Thanks to Peter for helping with the analysis and writing the really
    valuable code comments.
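
    As a sketch, the resulting split (reconstructed from the description;
    function and helper names approximate the message above):

        int rt_mutex_proxy_start(struct rt_mutex *lock,
                                 struct rt_mutex_waiter *waiter,
                                 struct task_struct *task)
        {
                int ret;

                raw_spin_lock_irq(&lock->wait_lock);
                /* the __ variant no longer removes the waiter on
                 * failure, so the futex code can defer the removal
                 * until it holds hb->lock again */
                ret = __rt_mutex_proxy_start(lock, waiter, task);
                if (unlikely(ret))
                        remove_waiter(lock, waiter);
                raw_spin_unlock_irq(&lock->wait_lock);

                return ret;
        }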

    Fixes: 56222b212e8e ("futex: Drop hb->lock before enqueueing on the rtmutex")
    Reported-by: Heiko Carstens
    Co-developed-by: Peter Zijlstra
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    Tested-by: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: linux-s390@vger.kernel.org
    Cc: Stefan Liebler
    Cc: Sebastian Sewior
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1901292311410.1950@nanos.tec.linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 Dec, 2018

2 commits

  • commit 7aa54be2976550f17c11a1c3e3630002dea39303 upstream.

    On x86 we cannot do fetch_or() with a single instruction and thus end up
    using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
    with a composite operation: tas-pending + load.

    Using two instructions of course opens a window we previously did not
    have. Consider the scenario:

    CPU0                 CPU1                  CPU2

    1) lock
         trylock -> (0,0,1)

    2)                   lock
                           trylock /* fail */

    3) unlock -> (0,0,0)

    4)                                         lock
                                                 trylock -> (0,0,1)

    5)                   tas-pending -> (0,1,1)
                         load-val (0,0,1)

                         FAIL: _2_ owners

    where 5) is our new composite operation. When we consider each part of
    the qspinlock state as a separate variable (as we can when
    _Q_PENDING_BITS == 8) then the above is entirely possible, because
    tas-pending will only RmW the pending byte, so the later load is able
    to observe prior tail and lock state (but not earlier than its own
    trylock, which operates on the whole word, due to coherence).

    To avoid this we need 2 things:

    - the load must come after the tas-pending (obviously, otherwise it
    can trivially observe prior state).

    - the tas-pending must be a full-word RmW instruction (it cannot be an
    XCHGB, for example), such that we cannot observe other state prior to
    setting the pending bit.

    On x86 we can realize this by using "LOCK BTS m32, r32" for
    tas-pending followed by a regular load.

    Note that observing later state is not a problem:

    - if we fail to observe a later unlock, we'll simply spin-wait for
    that store to become visible.

    - if we observe a later xchg_tail(), there is no difference from that
    xchg_tail() having taken place before the tas-pending.
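
    As a sketch of the composite (the merged x86 version emits the BTS
    via the GEN_BINARY_RMWcc() asm macro noted below; the generic bit
    helper and the cast here are simplifications):

        u32 val = 0;

        /* tas-pending: a full-word RmW ("LOCK BTS"), which only hands
         * back the old value of the pending bit */
        if (test_and_set_bit_lock(_Q_PENDING_OFFSET,
                                  (unsigned long *)&lock->val))
                val |= _Q_PENDING_VAL;

        /* a regular load supplies the rest of the word; it may observe
         * prior tail/lock state, but nothing older than the RmW above */
        val |= atomic_read(&lock->val) & ~_Q_PENDING_MASK;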

    Suggested-by: Will Deacon
    Reported-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Will Deacon
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: andrea.parri@amarulasolutions.com
    Cc: longman@redhat.com
    Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
    Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
    Signed-off-by: Ingo Molnar
    [bigeasy: GEN_BINARY_RMWcc macro redo]
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Sasha Levin

    Peter Zijlstra
     
  • commit 53bf57fab7321fb42b703056a4c80fc9d986d170 upstream.

    Flip the branch condition after atomic_fetch_or_acquire(_Q_PENDING_VAL)
    such that we lose the indent. This also results in a more natural code
    flow, IMO.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: andrea.parri@amarulasolutions.com
    Cc: longman@redhat.com
    Link: https://lkml.kernel.org/r/20181003130257.156322446@infradead.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Sasha Levin

    Peter Zijlstra
     

14 Nov, 2018

1 commit

  • [ Upstream commit 9506a7425b094d2f1d9c877ed5a78f416669269b ]

    It was found that when debug_locks was turned off because of a problem
    found by the lockdep code, the system performance could drop quite
    significantly when the lock_stat code was also configured into the
    kernel. For instance, parallel kernel build time on a 4-socket x86-64
    server nearly doubled.

    Further analysis traced the cause of the slowdown back to the
    frequent calls to debug_locks_off() from the __lock_acquired() function,
    probably due to some inconsistent lockdep state with debug_locks
    off. The debug_locks_off() function did an unconditional atomic xchg
    to write a 0 value into debug_locks, which had already been set to 0.
    This led to severe contention on the cacheline that held
    debug_locks. As debug_locks is referenced in quite a few different
    places in the kernel, this greatly slowed down system performance.

    To prevent that thrashing of the debug_locks cacheline, lock_acquired()
    and lock_contended() now check the state of debug_locks before
    proceeding. The debug_locks_off() function is also modified to check
    debug_locks before calling __debug_locks_off().
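
    A sketch of the reworked entry point (reconstruction;
    __debug_locks_off() is the unconditional xchg(&debug_locks, 0)):

        int debug_locks_off(void)
        {
                /* read the flag first: once it is already 0, callers
                 * no longer do an atomic RmW on the shared cacheline */
                if (debug_locks && __debug_locks_off()) {
                        if (!debug_locks_silent) {
                                console_verbose();
                                return 1;
                        }
                }
                return 0;
        }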

    Signed-off-by: Waiman Long
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Waiman Long
     

03 Oct, 2018

1 commit

  • If CONFIG_WW_MUTEX_SELFTEST=y is enabled, booting an image
    in an arm64 virtual machine results in the following
    traceback if 8 CPUs are enabled:

    DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current)
    WARNING: CPU: 2 PID: 537 at kernel/locking/mutex.c:1033 __mutex_unlock_slowpath+0x1a8/0x2e0
    ...
    Call trace:
    __mutex_unlock_slowpath()
    ww_mutex_unlock()
    test_cycle_work()
    process_one_work()
    worker_thread()
    kthread()
    ret_from_fork()

    If requesting b_mutex fails with -EDEADLK, the error variable
    is reassigned to the return value from calling ww_mutex_lock
    on a_mutex again. If this call fails, a_mutex is not locked.
    It is, however, unconditionally unlocked subsequently, causing
    the reported warning. Fix the problem by using two error variables.
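
    A sketch of the fixed logic in test_cycle_work() (reconstructed; the
    cycle->a_mutex/b_mutex naming is my reading of the test structure):

        int err, erra = 0;

        err = ww_mutex_lock(cycle->b_mutex, &ctx);
        if (err == -EDEADLK) {
                ww_mutex_unlock(&cycle->a_mutex);
                ww_mutex_lock_slow(cycle->b_mutex, &ctx);
                erra = ww_mutex_lock(&cycle->a_mutex, &ctx);
        }

        if (!err)
                ww_mutex_unlock(cycle->b_mutex);
        if (!erra)
                ww_mutex_unlock(&cycle->a_mutex); /* no longer unconditional */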

    With this change, the selftest still fails as follows:

    cyclic deadlock not resolved, ret[7/8] = -35

    However, the traceback is gone.

    Signed-off-by: Guenter Roeck
    Cc: Chris Wilson
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Fixes: d1b42b800e5d0 ("locking/ww_mutex: Add kselftests for resolving ww_mutex cyclic deadlocks")
    Link: http://lkml.kernel.org/r/1538516929-9734-1-git-send-email-linux@roeck-us.net
    Signed-off-by: Ingo Molnar

    Guenter Roeck
     

10 Sep, 2018

3 commits

  • Trivial fix to spelling mistake in pr_err() error message

    Signed-off-by: Colin Ian King
    Acked-by: Will Deacon
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: kernel-janitors@vger.kernel.org
    Link: http://lkml.kernel.org/r/20180824112235.8842-1-colin.king@canonical.com
    Signed-off-by: Ingo Molnar

    Colin Ian King
     
  • Commit:

    c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")

    added the inclusion of <trace/events/preemptirq.h>.

    liblockdep doesn't have a stub version of that header so now fails to build.

    However, commit:

    bff1b208a5d1 ("tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"")

    removed the use of functions declared in that header. So delete the #include.

    Signed-off-by: Ben Hutchings
    Cc: Joel Fernandes
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Sasha Levin
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Fixes: bff1b208a5d1 ("tracing: Partial revert of "tracing: Centralize ...")
    Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints ...")
    Link: http://lkml.kernel.org/r/20180828203315.GD18030@decadent.org.uk
    Signed-off-by: Ingo Molnar

    Ben Hutchings
     
  • The following commit:

    08295b3b5bee ("Implement an algorithm choice for Wound-Wait mutexes")

    introduced a reference in the documentation to a function that was
    removed in an earlier commit.

    It also forgot to remove a call to debug_mutex_add_waiter() which is now
    unconditionally called by __mutex_add_waiter().

    Fix those bugs.

    Signed-off-by: Thomas Hellstrom
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dri-devel@lists.freedesktop.org
    Fixes: 08295b3b5bee ("Implement an algorithm choice for Wound-Wait mutexes")
    Link: http://lkml.kernel.org/r/20180903140708.2401-1-thellstrom@vmware.com
    Signed-off-by: Ingo Molnar

    Thomas Hellstrom
     

21 Aug, 2018

1 commit

  • Pull tracing updates from Steven Rostedt:

    - Restructure of lockdep and latency tracers

    This is the biggest change. Joel Fernandes restructured the hooks
    from irqs and preemption disabling and enabling. He got rid of a lot
    of the preprocessor #ifdef mess that they caused.

    He turned both lockdep and the latency tracers to use trace events
    inserted in the preempt/irqs disabling paths. But unfortunately,
    these started to cause issues in corner cases. Thus, parts of the
    code were reverted back to where lockdep and the latency tracers just
    get called directly (without using the trace events). But because the
    original change cleaned up the code very nicely we kept that, as well
    as the trace events for preempt and irqs disabling, but they are
    limited to not being called in NMIs.

    - Have trace events use SRCU for "rcu idle" calls. This was required
    for the preempt/irqs off trace events. But it also had to not allow
    them to be called in NMI context. Waiting till Paul makes an NMI safe
    SRCU API.

    - New notrace SRCU API to allow trace events to use SRCU.

    - Addition of mcount-nop option support

    - SPDX headers replacing GPL templates.

    - Various other fixes and clean ups.

    - Some fixes are marked for stable, but were not fully tested before
    the merge window opened.

    * tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
    tracing: Fix SPDX format headers to use C++ style comments
    tracing: Add SPDX License format tags to tracing files
    tracing: Add SPDX License format to bpf_trace.c
    blktrace: Add SPDX License format header
    s390/ftrace: Add -mfentry and -mnop-mcount support
    tracing: Add -mcount-nop option support
    tracing: Avoid calling cc-option -mrecord-mcount for every Makefile
    tracing: Handle CC_FLAGS_FTRACE more accurately
    Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()
    Uprobes: Simplify uprobe_register() body
    tracepoints: Free early tracepoints after RCU is initialized
    uprobes: Use synchronize_rcu() not synchronize_sched()
    tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister()
    ftrace: Remove unused pointer ftrace_swapper_pid
    tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage"
    tracing/irqsoff: Handle preempt_count for different configs
    tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"
    tracing: irqsoff: Account for additional preempt_disable
    trace: Use rcu_dereference_raw for hooks from trace-event subsystem
    tracing/kprobes: Fix within_notrace_func() to check only notrace functions
    ...

    Linus Torvalds
     

16 Aug, 2018

1 commit

  • Pull drm updates from Dave Airlie:
    "This is the main drm pull request for 4.19.

    Rob has some new hardware support for new qualcomm hw that I'll send
    along separately. This has the display part of it, the remaining pull
    is for the acceleration engine.

    This also contains a wound-wait/wait-die mutex rework; Peter has acked
    it for merging via my tree.

    Otherwise mostly the usual level of activity. Summary:

    core:
    - Wound-wait/wait-die mutex rework
    - Add writeback connector type
    - Add "content type" property for HDMI
    - Move GEM bo to drm_framebuffer
    - Initial gpu scheduler documentation
    - GPU scheduler fixes for dying processes
    - Console deferred fbcon takeover support
    - Displayport support for CEC tunneling over AUX

    panel:
    - otm8009a panel driver fixes
    - Innolux TV123WAM and G070Y2-L01 panel driver
    - Ilitek ILI9881c panel driver
    - Rocktech RK070ER9427 LCD
    - EDT ETM0700G0EDH6 and EDT ETM0700G0BDH6
    - DLC DLC0700YZG-1
    - BOE HV070WSA-100
    - newhaven, nhd-4.3-480272ef-atxl LCD
    - DataImage SCF0700C48GGU18
    - Sharp LQ035Q7DB03
    - p079zca: Refactor to support multiple panels

    tinydrm:
    - ILI9341 display panel

    New driver:
    - vkms - virtual kms driver for testing.

    i915:
    - Icelake:
        Display enablement
        DSI support
        IRQ support
        Powerwell support
    - GPU reset fixes and improvements
    - Full ppgtt support refactoring
    - PSR fixes and improvements
    - Execlist improvements
    - GuC related fixes

    amdgpu:
    - Initial amdgpu documentation
    - JPEG engine support on VCN
    - CIK uses powerplay by default
    - Move to using core PCIE functionality for gens/lanes
    - DC/Powerplay interface rework
    - Stutter mode support for RV
    - Vega12 Powerplay updates
    - GFXOFF fixes
    - GPUVM fault debugging
    - Vega12 GFXOFF
    - DC improvements
    - DC i2c/aux changes
    - UVD 7.2 fixes
    - Powerplay fixes for Polaris12, CZ/ST
    - command submission bo_list fixes

    amdkfd:
    - Raven support
    - Power management fixes

    udl:
    - Cleanups and fixes

    nouveau:
    - misc fixes and cleanups.

    msm:
    - DPU1 display controller support in sdm845
    - GPU coredump support.

    vmwgfx:
    - Atomic modesetting validation fixes
    - Support for multisample surfaces

    armada:
    - Atomic modesetting support completed.

    exynos:
    - IPPv2 fixes
    - Move g2d to component framework
    - Suspend/resume support cleanups
    - Driver cleanups

    imx:
    - CSI configuration improvements
    - Driver cleanups
    - Use atomic suspend/resume helpers
    - ipu-v3 V4L2 XRGB32/XBGR32 support

    pl111:
    - Add Nomadik LCDC variant

    v3d:
    - GPU scheduler jobs management

    sun4i:
    - R40 display engine support
    - TCON TOP driver

    mediatek:
    - MT2712 SoC support

    rockchip:
    - vop fixes

    omapdrm:
    - Workaround for DRA7 errata i932
    - Fix mm_list locking

    mali-dp:
    - Writeback implementation
    - PM improvements
    - Internal error reporting debugfs

    tilcdc:
    - Single fix for deferred probing

    hdlcd:
    - Teardown fixes

    tda998x:
    - Converted to a bridge driver.

    etnaviv:
    - Misc fixes"

    * tag 'drm-next-2018-08-15' of git://anongit.freedesktop.org/drm/drm: (1506 commits)
    drm/amdgpu/sriov: give 8s for recover vram under RUNTIME
    drm/scheduler: fix param documentation
    drm/i2c: tda998x: correct PLL divider calculation
    drm/i2c: tda998x: get rid of private fill_modes function
    drm/i2c: tda998x: move mode_valid() to bridge
    drm/i2c: tda998x: register bridge outside of component helper
    drm/i2c: tda998x: cleanup from previous changes
    drm/i2c: tda998x: allocate tda998x_priv inside tda998x_create()
    drm/i2c: tda998x: convert to bridge driver
    drm/scheduler: fix timeout worker setup for out of order job completions
    drm/amd/display: display connected to dp-1 does not light up
    drm/amd/display: update clk for various HDMI color depths
    drm/amd/display: program display clock on cache match
    drm/amd/display: Add NULL check for enabling dp ss
    drm/amd/display: add vbios table check for enabling dp ss
    drm/amd/display: Don't share clk source between DP and HDMI
    drm/amd/display: Fix DP HBR2 Eye Diagram Pattern on Carrizo
    drm/amd/display: Use calculated disp_clk_khz value for dce110
    drm/amd/display: Implement custom degamma lut on dcn
    drm/amd/display: Destroy aux_engines only once
    ...

    Linus Torvalds
     

14 Aug, 2018

1 commit

  • Pull RCU updates from Thomas Gleixner:
    "A large update to RCU:

    Preparatory work for consolidating the RCU flavors:

    - Introduce grace-period sequence numbers to the RCU-bh, RCU-preempt,
    and RCU-sched flavors, replacing the old ->gpnum and ->completed
    pair of fields.

    This change allows lockless code to obtain the complete
    grace-period state with a single READ_ONCE(), which is needed to
    maintain tolerable lock contention during the upcoming
    consolidation of the three RCU flavors.

    Note that grace-period sequence numbers are already used by
    rcu_barrier(), expedited RCU grace periods, and SRCU, and are thus
    already heavily used and well-tested. Joel Fernandes contributed a
    number of excellent fixes and improvements.

    - Clean up some grace-period-reporting loose ends, including
    improving the handling of quiescent states from offline CPUs and
    fixing some false-positive WARN_ON_ONCE() invocations.

    (Strictly speaking, the WARN_ON_ONCE() invocations were quite
    correct, but their invariants were (harmlessly) violated by the
    earlier sloppy handling of quiescent states from offline CPUs.)

    In addition, improve grace-period forward-progress guarantees so as
    to allow removal of fail-safe checks that required otherwise
    needless lock acquisitions. Finally, add more diagnostics to help
    debug the upcoming consolidation of the RCU-bh, RCU-preempt, and
    RCU-sched flavors.

    The rest:

    - SRCU updates

    - Updates to rcutorture and associated scripting.

    - The usual pile of miscellaneous fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits)
    rcutorture: Fix rcu_barrier successes counter
    rcutorture: Add support to detect if boost kthread prio is too low
    rcutorture: Use monotonic timestamp for stall detection
    rcutorture: Make boost test more robust
    rcutorture: Disable RT throttling for boost tests
    rcutorture: Emphasize testing of single reader protection type
    rcutorture: Handle extended read-side critical sections
    rcutorture: Make rcu_torture_timer() use rcu_torture_one_read()
    rcutorture: Use per-CPU random state for rcu_torture_timer()
    rcutorture: Use atomic increment for n_rcu_torture_timers
    rcutorture: Extract common code from rcu_torture_reader()
    rcuperf: Remove unused torturing_tasks() function
    rcu: Remove rcutorture test version and sequence number
    rcutorture: Change units of onoff_interval to jiffies
    rcu: Assign higher prio to RCU threads if rcutorture is built-in
    rculist: Improve documentation for list_for_each_entry_from_rcu()
    srcu: Add grace-period number to rcutorture statistics printout
    rcu: Print stall-warning NMI dyntick state in hexadecimal
    MAINTAINERS: Update RCU, SRCU, and TORTURE-TEST entries
    rcu: Make rcu_seq_diff() more exact
    ...

    Linus Torvalds
     

11 Aug, 2018

1 commit

    Joel Fernandes created a nice patch that cleaned up the duplicate hooks used
    by lockdep and the irqsoff latency tracer. It made both use tracepoints. But it
    caused lockdep to trigger several false positives. We have not figured out
    why yet, but removing lockdep from using the trace event hooks and just calling
    its helper functions directly (like it used to) makes the problem go away.

    This is a partial revert of the clean up patch c3bc8fd637a9 ("tracing:
    Centralize preemptirq tracepoints and unify their usage") that adds direct
    calls for lockdep, but also keeps most of the clean up done to get rid of
    the horrible preprocessor if statements.

    Link: http://lkml.kernel.org/r/20180806155058.5ee875f4@gandalf.local.home

    Cc: Peter Zijlstra
    Reviewed-by: Joel Fernandes (Google)
    Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

31 Jul, 2018

2 commits

    This patch detaches the preemptirq tracepoints from the tracers and
    keeps them separate.

    Advantages:
    * Lockdep and the irqsoff tracer can now run in parallel since they no
    longer have their own calls.

    * This unifies the usecase of adding hooks to an irqsoff and irqson
    event, and a preemptoff and preempton event.
    3 users of the events exist:
    - Lockdep
    - irqsoff and preemptoff tracers
    - irqs and preempt trace events

    The unification cleans up several ifdefs and makes the code in preempt
    tracer and irqsoff tracers simpler. It gets rid of all the horrific
    ifdeferry around PROVE_LOCKING and makes configuration of the different
    users of the tracepoints more easy and understandable. It also gets rid
    of the time_* function calls from the lockdep hooks used to call into
    the preemptirq tracer which is not needed anymore. The negative delta in
    lines of code in this patch is quite large too.

    In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS
    as a single point for registering probes onto the tracepoints. With
    this, the web of config options for preempt/irq toggle tracepoints and
    their users becomes:

       PREEMPT_TRACER  PREEMPTIRQ_EVENTS  IRQSOFF_TRACER  PROVE_LOCKING
             |                |     \          |              |
             \   (selects)    /      \         \  (selects)   /
              TRACE_PREEMPT_TOGGLE    ----> TRACE_IRQFLAGS
                             \                  /
                              \ (depends on)   /
                            PREEMPTIRQ_TRACEPOINTS
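
    With that in place, a user such as lockdep or a tracer attaches via
    the usual tracepoint registration calls, roughly as follows (the
    probe names here are illustrative, not the actual probe functions):

        #ifdef CONFIG_PREEMPTIRQ_TRACEPOINTS
                register_trace_irq_disable(my_irq_disable_probe, NULL);
                register_trace_irq_enable(my_irq_enable_probe, NULL);
                register_trace_preempt_disable(my_preempt_off_probe, NULL);
                register_trace_preempt_enable(my_preempt_on_probe, NULL);
        #endif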

    Other than the performance tests mentioned in the previous patch, I also
    ran the locking API test suite. I verified that all tests cases are
    passing.

    I also injected issues by not registering lockdep probes onto the
    tracepoints, and I saw failures, confirming that the probes are indeed
    working.

    This series + lockdep probes not registered (just to inject errors):
    [ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] sirq-safe-A => hirqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + irqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] soft-safe-A + irqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + irqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] soft-safe-A + irqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
    [ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok |

    With this series + lockdep probes registered, all locking tests pass:

    [ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/12: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/21: ok | ok | ok |
    [ 0.000000] hard-safe-A + irqs-on/12: ok | ok | ok |
    [ 0.000000] soft-safe-A + irqs-on/12: ok | ok | ok |
    [ 0.000000] hard-safe-A + irqs-on/21: ok | ok | ok |
    [ 0.000000] soft-safe-A + irqs-on/21: ok | ok | ok |
    [ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
    [ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok |

    Link: http://lkml.kernel.org/r/20180730222423.196630-4-joel@joelfernandes.org

    Acked-by: Peter Zijlstra (Intel)
    Reviewed-by: Namhyung Kim
    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Joel Fernandes (Google)
     
    get_cpu_var() disables preemption, which has the potential to call into the
    preemption-disable trace points, causing some complications. There's also
    no need to disable preemption in uses of get_lock_stats() anyway, since
    preemption is already disabled. So let's simplify the code.
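
    A sketch of the simplified accessor (reconstruction of the change):

        static struct lock_class_stats *get_lock_stats(struct lock_class *class)
        {
                /* the caller already runs with preemption disabled, so
                 * a plain this_cpu_ptr() replaces the get_cpu_var() /
                 * put_cpu_var() pair */
                return &this_cpu_ptr(cpu_lock_stats)[class - lock_classes];
        }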

    Link: http://lkml.kernel.org/r/20180730222423.196630-2-joel@joelfernandes.org

    Suggested-by: Peter Zijlstra
    Acked-by: Peter Zijlstra
    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Joel Fernandes (Google)
     

25 Jul, 2018

1 commit

  • Needed for annotating rt_mutex locks.
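
    The interface this adds, as I read the patch, mirrors the existing
    mutex annotation API (the i2c usage below is illustrative):

        /* new: subclass-aware lock entry point for lockdep */
        void rt_mutex_lock_nested(struct rt_mutex *lock, unsigned int subclass);

        /* caller side, e.g. a mux that nests same-class locks: */
        rt_mutex_lock_nested(&adapter->bus_lock, SINGLE_DEPTH_NESTING);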

    Tested-by: John Sperbeck
    Signed-off-by: Peter Rosin
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Davidlohr Bueso
    Cc: Deepa Dinamani
    Cc: Greg Kroah-Hartman
    Cc: Linus Torvalds
    Cc: Peter Chang
    Cc: Peter Zijlstra
    Cc: Philippe Ombredanne
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: Wolfram Sang
    Link: http://lkml.kernel.org/r/20180720083914.1950-2-peda@axentia.se
    Signed-off-by: Ingo Molnar

    Peter Rosin
     

17 Jul, 2018

1 commit

  • …k/linux-rcu into core/rcu

    Pull RCU updates from Paul E. McKenney:

    - An optimization and a fix for RCU expedited grace periods, with
    the fix being from Boqun Feng.

    - Miscellaneous fixes, including a lockdep-annotation fix from
    Boqun Feng.

    - SRCU updates.

    - Updates to rcutorture and associated scripting.

    - Introduce grace-period sequence numbers to the RCU-bh, RCU-preempt,
    and RCU-sched flavors, replacing the old ->gpnum and ->completed
    pair of fields. This change allows lockless code to obtain the
    complete grace-period state with a single READ_ONCE(), which is
    needed to maintain tolerable lock contention during the upcoming
    consolidation of the three RCU flavors. Note that grace-period
    sequence numbers are already used by rcu_barrier(), expedited
    RCU grace periods, and SRCU, and are thus already heavily used
    and well-tested. Joel Fernandes contributed a number of excellent
    fixes and improvements.

    - Clean up some grace-period-reporting loose ends, including
    improving the handling of quiescent states from offline CPUs
    and fixing some false-positive WARN_ON_ONCE() invocations.
    (Strictly speaking, the WARN_ON_ONCE() invocations were quite
    correct, but their invariants were (harmlessly) violated by the
    earlier sloppy handling of quiescent states from offline CPUs.)
    In addition, improve grace-period forward-progress guarantees so
    as to allow removal of fail-safe checks that required otherwise
    needless lock acquisitions. Finally, add more diagnostics to
    help debug the upcoming consolidation of the RCU-bh, RCU-preempt,
    and RCU-sched flavors.

    - Additional miscellaneous fixes, including those contributed by
    Byungchul Park, Mauro Carvalho Chehab, Joe Perches, Joel Fernandes,
    Steven Rostedt, Andrea Parri, and Neil Brown.

    - Additional torture-test changes, including several contributed by
    Arnd Bergmann and Joel Fernandes.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

06 Jul, 2018

1 commit

  • A patchset worked out together with Peter Zijlstra. Ingo is OK with taking
    it through the DRM tree:

    This is a small fallout from work to allow batching WW mutex locks and
    unlocks.

    Our Wound-Wait mutexes actually don't use the Wound-Wait algorithm but
    the Wait-Die algorithm. One could perhaps rename those mutexes tree-wide to
    "Wait-Die mutexes" or "Deadlock Avoidance mutexes". Another approach suggested
    here is to implement also the "Wound-Wait" algorithm as a per-WW-class
    choice, as it has advantages in some cases. See for example

    http://www.mathcs.emory.edu/~cheung/Courses/554/Syllabus/8-recv+serial/deadlock-compare.html

    Now Wound-Wait is a preemptive algorithm, and the preemption is implemented
    using a lazy scheme: If a wounded transaction is about to go to sleep on
    a contended WW mutex, we return -EDEADLK. That is sufficient for deadlock
    prevention. Since with WW mutexes we also require the aborted transaction to
    sleep waiting to lock the WW mutex it was aborted on, this choice also provides
    a suitable WW mutex to sleep on. If we were to return -EDEADLK on the first
    WW mutex lock after the transaction was wounded whether the WW mutex was
    contended or not, the transaction might frequently be restarted without a wait,
    which is far from optimal. Note also that with the lazy preemption scheme,
    contrary to Wait-Die there will be no rollbacks on lock contention of locks
    held by a transaction that has completed its locking sequence.

    The modeset locks are then changed from Wait-Die to Wound-Wait since the
    typical locking pattern of those locks very well matches the criterion for
    a substantial reduction in the number of rollbacks. For reservation objects,
    the benefit is more unclear at this point and they remain using Wait-Die.
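
    Note that the caller-visible locking pattern is unchanged by the
    choice of algorithm; a wounded (or, under Wait-Die, dying)
    transaction still backs off and retries as usual (sketch;
    unlock_all_held() stands in for whatever rollback the caller does):

        ret = ww_mutex_lock(&obj->lock, &ctx);
        if (ret == -EDEADLK) {
                /* we lost: drop the locks already held ... */
                unlock_all_held(&ctx);
                /* ... and sleep on the contended lock before
                 * restarting the whole locking sequence */
                ww_mutex_lock_slow(&obj->lock, &ctx);
        }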

    Signed-off-by: Dave Airlie
    Link: https://patchwork.freedesktop.org/patch/msgid/20180703105339.4461-1-thellstrom@vmware.com

    Dave Airlie
     

03 Jul, 2018

2 commits

    The current Wound-Wait mutex algorithm is actually not Wound-Wait but
    Wait-Die. Implement also Wound-Wait as a per-ww-class choice. Wound-Wait
    is, contrary to Wait-Die, a preemptive algorithm and is known to generate
    fewer backoffs. Testing reveals that this is true if the number of
    simultaneous contending transactions is small. As the number of
    simultaneous contending threads increases, Wound-Wait becomes inferior
    to Wait-Die in terms of elapsed time, possibly due to the larger number
    of locks held by sleeping transactions.

    Update documentation and callers.

    Timings using git://people.freedesktop.org/~thomash/ww_mutex_test
    tag patch-18-06-15

    Each thread runs 100000 batches of lock / unlock 800 ww mutexes randomly
    chosen out of 100000. Four core Intel x86_64:

    Algorithm    #threads   Rollbacks   time
    Wound-Wait   4          ~100        ~17s
    Wait-Die     4          ~150000     ~19s
    Wound-Wait   16         ~360000     ~109s
    Wait-Die     16         ~450000     ~82s
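
    The per-class choice itself is made at class definition time; as far
    as I can tell the macro pair looks like this (class names are
    illustrative):

        static DEFINE_WW_CLASS(obj_ww_class);   /* Wound-Wait */
        static DEFINE_WD_CLASS(obj_wd_class);   /* Wait-Die   */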

    Cc: Ingo Molnar
    Cc: Jonathan Corbet
    Cc: Gustavo Padovan
    Cc: Maarten Lankhorst
    Cc: Sean Paul
    Cc: David Airlie
    Cc: Davidlohr Bueso
    Cc: "Paul E. McKenney"
    Cc: Josh Triplett
    Cc: Thomas Gleixner
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Greg Kroah-Hartman
    Cc: linux-doc@vger.kernel.org
    Cc: linux-media@vger.kernel.org
    Cc: linaro-mm-sig@lists.linaro.org
    Co-authored-by: Peter Zijlstra
    Signed-off-by: Thomas Hellstrom
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Ingo Molnar

    Thomas Hellstrom
     
  • Make the WW mutex code more readable by adding comments, splitting up
    functions and pointing out that we're actually using the Wait-Die
    algorithm.

    Cc: Ingo Molnar
    Cc: Jonathan Corbet
    Cc: Gustavo Padovan
    Cc: Maarten Lankhorst
    Cc: Sean Paul
    Cc: David Airlie
    Cc: Davidlohr Bueso
    Cc: "Paul E. McKenney"
    Cc: Josh Triplett
    Cc: Thomas Gleixner
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Greg Kroah-Hartman
    Cc: linux-doc@vger.kernel.org
    Cc: linux-media@vger.kernel.org
    Cc: linaro-mm-sig@lists.linaro.org
    Co-authored-by: Thomas Hellstrom
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Hellstrom
    Acked-by: Ingo Molnar

    Peter Zijlstra
     

26 Jun, 2018

2 commits

  • This commit adds "#define pr_fmt(fmt) fmt" to the torture-test files
    in order to keep the current dmesg format. Once Joe's commits have
    hit mainline, these definitions will be changed in order to automatically
    generate the dmesg line prefix that the scripts expect. This will have
    the beneficial side-effect of allowing printk() formats to be used more
    widely and of shortening some pr_*() lines.

    Signed-off-by: Paul E. McKenney
    Cc: Joe Perches

    Paul E. McKenney
     
  • Some bugs reproduce quickly only at high CPU-hotplug rates, so the
    rcutorture TREE03 scenario now has only 200 milliseconds spacing between
    CPU-hotplug operations. At this rate, the torture-test pair of console
    messages per operation becomes a bit voluminous. This commit therefore
    converts the torture-test set of "verbose" kernel-boot arguments from
    bool to int, and prints the extra console messages only when verbose=2.
    The default is still verbose=1.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

24 Jun, 2018

1 commit

  • Pull locking fixes from Thomas Gleixner:
    "A set of fixes and updates for the locking code:

    - Prevent lockdep from updating irq state within its own code and
    thereby confusing itself.

    - Build fix for older GCCs which mistreat anonymous unions

    - Add a missing lockdep annotation in down_read_non_owner() which
    causes up_read_non_owner() to emit a lockdep splat

    - Remove the custom alpha dec_and_lock() implementation, which is
    incorrect in terms of ordering, and use the generic one.

    The remaining two commits are not strictly fixes. They provide irqsave
    variants of atomic_dec_and_lock() and refcount_dec_and_lock(). These
    are required to merge the relevant updates and cleanups into different
    maintainer trees for 4.19, so routing them into mainline without
    actual users is the sanest approach.

    They should have been in -rc1, but last weekend I took the liberty to
    just avoid computers in order to regain some mental sanity"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    locking/qspinlock: Fix build for anonymous union in older GCC compilers
    locking/lockdep: Do not record IRQ state within lockdep code
    locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMS
    locking/refcounts: Implement refcount_dec_and_lock_irqsave()
    atomic: Add irqsave variant of atomic_dec_and_lock()
    alpha: Remove custom dec_and_lock() implementation

    Linus Torvalds
     

22 Jun, 2018

1 commit

    While debugging where things were going wrong with mapping
    enabling/disabling interrupts with the lockdep state and actual real
    enabling and disabling interrupts, I had to silence the IRQ
    disabling/enabling in debug_check_no_locks_freed() because it was
    always showing up, as it was called before the splat was.

    Use raw_local_irq_save/restore() not only for debug_check_no_locks_freed()
    but for all internal lockdep functions, since the non-raw variants hide
    useful information about where interrupts were last used incorrectly.
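
    The pattern inside lockdep then becomes, as a sketch:

        unsigned long flags;

        /* raw variants do not feed back into lockdep's own irq-state
         * tracking, so the record of where interrupts were last
         * enabled/disabled is left intact */
        raw_local_irq_save(flags);
        /* ... internal lockdep work ... */
        raw_local_irq_restore(flags);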

    Signed-off-by: Steven Rostedt (VMware)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/lkml/20180404140630.3f4f4c7a@gandalf.local.home
    Signed-off-by: Ingo Molnar

    Steven Rostedt (VMware)
     

20 Jun, 2018

1 commit

  • It was found that the use of up_read_non_owner() in NFS was causing
    the following warning when DEBUG_RWSEMS was configured.

    DEBUG_LOCKS_WARN_ON(sem->owner != ((struct task_struct *)(1UL << 0)))

    Looking into the rwsem.c file, it was discovered that the corresponding
    down_read_non_owner() function was not setting the owner field properly.
    This is fixed now, and the warning should be gone.
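
    A sketch of the fix (reconstruction of the described change):

        void down_read_non_owner(struct rw_semaphore *sem)
        {
                might_sleep();
                __down_read(sem);
                /* set the reader-owned marker ((1UL << 0)) that the
                 * up_read_non_owner() debug check expects */
                rwsem_set_reader_owned(sem);
        }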

    Fixes: 5149cbac4235 ("locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches")
    Signed-off-by: Waiman Long
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Tested-by: Gavin Schenk
    Cc: Davidlohr Bueso
    Cc: Dan Williams
    Cc: Arnd Bergmann
    Cc: linux-nfs@vger.kernel.org
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/1527168398-4291-1-git-send-email-longman@redhat.com

    Waiman Long
     

13 Jun, 2018

2 commits

  • The kzalloc() function has a 2-factor argument form, kcalloc(). This
    patch replaces cases of:

    kzalloc(a * b, gfp)

    with:
    kcalloc(a, b, gfp)

    as well as handling cases of:

    kzalloc(a * b * c, gfp)

    with:

    kzalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kcalloc(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kzalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kzalloc
    + kcalloc
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kzalloc(sizeof(THING) * C2, ...)
    |
    kzalloc(sizeof(TYPE) * C2, ...)
    |
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(C1 * C2, ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     
  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:
    kmalloc_array(a, b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

05 Jun, 2018

2 commits

  • Pull locking updates from Ingo Molnar:

    - Lots of tidying up changes all across the map for Linux's formal
    memory/locking-model tooling, by Alan Stern, Akira Yokosawa, Andrea
    Parri, Paul E. McKenney and SeongJae Park.

    Notable changes beyond an overall update in the tooling itself is the
    tidying up of spin_is_locked() semantics, which spills over into the
    kernel proper as well.

    - qspinlock improvements: the locking algorithm now guarantees forward
    progress whereas the previous implementation in mainline could starve
    threads indefinitely in cmpxchg() loops. Also other related cleanups
    to the qspinlock code (Will Deacon)

    - misc smaller improvements, cleanups and fixes all across the locking
    subsystem

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
    locking/rwsem: Simplify the is-owner-spinnable checks
    tools/memory-model: Add reference for 'Simplifying ARM concurrency'
    tools/memory-model: Update ASPLOS information
    MAINTAINERS, tools/memory-model: Update e-mail address for Andrea Parri
    tools/memory-model: Fix coding style in 'lock.cat'
    tools/memory-model: Remove out-of-date comments and code from lock.cat
    tools/memory-model: Improve mixed-access checking in lock.cat
    tools/memory-model: Improve comments in lock.cat
    tools/memory-model: Remove duplicated code from lock.cat
    tools/memory-model: Flag "cumulativity" and "propagation" tests
    tools/memory-model: Add model support for spin_is_locked()
    tools/memory-model: Add scripts to test memory model
    tools/memory-model: Fix coding style in 'linux-kernel.def'
    tools/memory-model: Model 'smp_store_mb()'
    tools/memory-order: Update the cheat-sheet to show that smp_mb__after_atomic() orders later RMW operations
    tools/memory-order: Improve key for SELF and SV
    tools/memory-model: Fix cheat sheet typo
    tools/memory-model: Update required version of herdtools7
    tools/memory-model: Redefine rb in terms of rcu-fence
    tools/memory-model: Rename link and rcu-path to rcu-link and rb
    ...

    Linus Torvalds
     
  • Pull procfs updates from Al Viro:
    "Christoph's proc_create_... cleanups series"

    * 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (44 commits)
    xfs, proc: hide unused xfs procfs helpers
    isdn/gigaset: add back gigaset_procinfo assignment
    proc: update SIZEOF_PDE_INLINE_NAME for the new pde fields
    tty: replace ->proc_fops with ->proc_show
    ide: replace ->proc_fops with ->proc_show
    ide: remove ide_driver_proc_write
    isdn: replace ->proc_fops with ->proc_show
    atm: switch to proc_create_seq_private
    atm: simplify procfs code
    bluetooth: switch to proc_create_seq_data
    netfilter/x_tables: switch to proc_create_seq_private
    netfilter/xt_hashlimit: switch to proc_create_{seq,single}_data
    neigh: switch to proc_create_seq_data
    hostap: switch to proc_create_{seq,single}_data
    bonding: switch to proc_create_seq_data
    rtc/proc: switch to proc_create_single_data
    drbd: switch to proc_create_single
    resource: switch to proc_create_seq_data
    staging/rtl8192u: simplify procfs code
    jfs: simplify procfs code
    ...

    Linus Torvalds
     

25 May, 2018

2 commits


16 May, 2018

4 commits

  • The filesystem freezing code needs to transfer ownership of a rwsem
    embedded in a percpu-rwsem from the task that does the freezing to
    another one that does the thawing by calling percpu_rwsem_release()
    after freezing and percpu_rwsem_acquire() before thawing.

    However, the new rwsem debug code runs afoul of this scheme by warning
    that the task releasing the rwsem isn't the one that acquired it,
    as reported by Amir Goldstein:

    DEBUG_LOCKS_WARN_ON(sem->owner != get_current())
    WARNING: CPU: 1 PID: 1401 at /home/amir/build/src/linux/kernel/locking/rwsem.c:133 up_write+0x59/0x79

    Call Trace:
    percpu_up_write+0x1f/0x28
    thaw_super_locked+0xdf/0x120
    do_vfs_ioctl+0x270/0x5f1
    ksys_ioctl+0x52/0x71
    __x64_sys_ioctl+0x16/0x19
    do_syscall_64+0x5d/0x167
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    To work properly with the rwsem debug code, we need to annotate that the
    rwsem ownership is unknown during the transfer period until a brave soul
    comes forward to acquire the ownership. During that period, optimistic
    spinning will be disabled; a sketch of the handover closes this entry.

    Reported-by: Amir Goldstein
    Tested-by: Amir Goldstein
    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Jan Kara
    Cc: Linus Torvalds
    Cc: Matthew Wilcox
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Theodore Y. Ts'o
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-fsdevel@vger.kernel.org
    Link: http://lkml.kernel.org/r/1526420991-21213-3-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
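
    A minimal sketch of that handover, modeled on the freeze/thaw pattern in
    fs/super.c; freeze_side() and thaw_side() are hypothetical wrappers,
    while percpu_rwsem_release()/percpu_rwsem_acquire() are the annotation
    calls this scheme relies on:

    #include <linux/kernel.h>
    #include <linux/percpu-rwsem.h>

    /* Freezing task: take the write lock, then mark ownership as
     * transferred so a different task may legitimately release it. */
    static void freeze_side(struct percpu_rw_semaphore *sem)
    {
        percpu_down_write(sem);
        percpu_rwsem_release(sem, 0, _THIS_IP_);
    }

    /* Thawing task (possibly another task entirely): reclaim ownership
     * before unlocking, keeping the debug checks consistent. */
    static void thaw_side(struct percpu_rw_semaphore *sem)
    {
        percpu_rwsem_acquire(sem, 0, _THIS_IP_);
        percpu_up_write(sem);
    }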
     
  • There are use cases where a rwsem can be acquired by one task, but
    released by another task. In these cases, optimistic spinning may need
    to be disabled. One example is the filesystem freeze/thaw code
    where the task that freezes the filesystem will acquire a write lock
    on a rwsem and then un-owns it before returning to userspace. Later on,
    another task will come along, acquire the ownership, thaw the filesystem
    and release the rwsem.

    Bit 0 of the owner field was used to designate that it is a reader
    owned rwsem. It is now repurposed to mean that the owner of the rwsem
    is not known. If only bit 0 is set, the rwsem is reader owned. If bit
    0 and other bits are set, it is writer owned with an unknown owner.
    One such value for the latter case is (-1L). So we can set owner to 1 for
    reader-owned, -1 for writer-owned. The owner is unknown in both cases.

    To handle transfer of rwsem ownership, the higher level code should
    set the owner field to -1 to indicate a write-locked rwsem with unknown
    owner. Optimistic spinning will be disabled in this case.

    Once the higher level code figures out who the new owner is, it can then
    set the owner field accordingly; the encoding is summarized in the
    sketch that closes this entry.

    Tested-by: Amir Goldstein
    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Jan Kara
    Cc: Linus Torvalds
    Cc: Matthew Wilcox
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Theodore Y. Ts'o
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-fsdevel@vger.kernel.org
    Link: http://lkml.kernel.org/r/1526420991-21213-2-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
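
    The encoding above boils down to something like the following sketch;
    the constant names mirror the patch's intent, but treat the exact
    definitions as illustrative:

    #include <linux/sched.h>

    /* Bit 0 set => the real owner of the rwsem is not known. */
    #define RWSEM_ANONYMOUSLY_OWNED (1UL << 0)
    /* Only bit 0 set => reader owned. */
    #define RWSEM_READER_OWNED ((struct task_struct *)RWSEM_ANONYMOUSLY_OWNED)
    /* Bit 0 plus other bits set (e.g. -1L) => writer owned, owner unknown. */
    #define RWSEM_OWNER_UNKNOWN ((struct task_struct *)-1L)

    /* Optimistic spinning needs a known on-CPU owner to watch, so an
     * anonymously owned rwsem is never spinnable. */
    static inline bool rwsem_owner_is_spinnable(struct task_struct *owner)
    {
        return !((unsigned long)owner & RWSEM_ANONYMOUSLY_OWNED);
    }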
     
  • Variants of proc_create{,_data} that directly take a seq_file show
    callback and drastically reduce the boilerplate code in the callers
    (see the usage sketch at the end of this entry).

    All trivial callers converted over.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
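
    A hedged usage sketch with a hypothetical "frob" driver; only
    proc_create_single() itself is the interface introduced here:

    #include <linux/proc_fs.h>
    #include <linux/seq_file.h>

    /* The show callback is all a trivial user writes now; the
     * open/read/release boilerplate lives inside procfs. */
    static int frob_proc_show(struct seq_file *m, void *v)
    {
        seq_puts(m, "frob: ok\n");  /* hypothetical output */
        return 0;
    }

    static int __init frob_proc_init(void)
    {
        proc_create_single("frob", 0444, NULL, frob_proc_show);
        return 0;
    }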
     
  • Variants of proc_create{,_data} that directly take a struct seq_operations
    argument and drastically reduce the boilerplate code in the callers
    (see the usage sketch at the end of this entry).

    All trivial callers converted over.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
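
    A hedged usage sketch along the same lines; the frob_* iterator
    callbacks are hypothetical, while proc_create_seq() is the new
    interface:

    #include <linux/proc_fs.h>
    #include <linux/seq_file.h>

    /* Hypothetical iterator over a driver-private list. */
    static void *frob_seq_start(struct seq_file *m, loff_t *pos);
    static void *frob_seq_next(struct seq_file *m, void *v, loff_t *pos);
    static void frob_seq_stop(struct seq_file *m, void *v);
    static int frob_seq_show(struct seq_file *m, void *v);

    static const struct seq_operations frob_seq_ops = {
        .start = frob_seq_start,
        .next  = frob_seq_next,
        .stop  = frob_seq_stop,
        .show  = frob_seq_show,
    };

    static int __init frob_list_init(void)
    {
        /* Replaces a hand-rolled file_operations plus seq_open() wrapper. */
        proc_create_seq("frob_list", 0444, NULL, &frob_seq_ops);
        return 0;
    }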
     

14 May, 2018

2 commits

  • Calling lockdep_print_held_locks() on a running thread is considered unsafe.

    Since all callers should follow that rule and the sanity check is not heavy,
    this patch moves the sanity check inside lockdep_print_held_locks() itself
    (a simplified sketch closes this entry).

    As a side effect of this patch, the number of locks held by running threads
    will be printed as well. This is preferable when we want to know which
    threads might be relevant to a problem but cannot print any further clues
    because those threads are running.

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Dmitry Vyukov
    Cc: Linus Torvalds
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1523011279-8206-2-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
    Signed-off-by: Ingo Molnar

    Tetsuo Handa
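
    Simplified, the function with the relocated check looks like this sketch
    (print_lock() is lockdep's internal helper; details abridged):

    static void lockdep_print_held_locks(struct task_struct *p)
    {
        int i, depth = READ_ONCE(p->lockdep_depth);

        if (!depth)
            printk("no locks held by %s/%d.\n", p->comm, task_pid_nr(p));
        else
            printk("%d lock%s held by %s/%d:\n", depth,
                   depth > 1 ? "s" : "", p->comm, task_pid_nr(p));

        /* It's not reliable to print a task's held locks if it's not
         * sleeping and it's not the current task; bail out after the
         * count has been printed. */
        if (p->state == TASK_RUNNING && p != current)
            return;

        for (i = 0; i < depth; i++) {
            printk(" #%d: ", i);
            print_lock(p->held_locks + i);
        }
    }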
     
  • debug_show_all_locks() tries to grab the tasklist_lock for two seconds, but
    calling while_each_thread() without tasklist_lock held is not safe.

    See the following commit for more information:

    4449a51a7c281602 ("vm_is_stack: use for_each_thread() rather then buggy while_each_thread()")

    Change debug_show_all_locks() from "do_each_thread()/while_each_thread()
    with the possibility of missing tasklist_lock" to "for_each_process_thread()
    with RCU", and add a call to touch_all_softlockup_watchdogs() like
    show_state_filter() does (see the sketch at the end of this entry).

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1523011279-8206-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
    Signed-off-by: Ingo Molnar

    Tetsuo Handa
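
    In sketch form, the rewritten iteration (simplified from the patch):

    void debug_show_all_locks(void)
    {
        struct task_struct *g, *p;

        pr_warn("\nShowing all locks held in the system:\n");

        /* for_each_process_thread() is safe under RCU, unlike
         * while_each_thread() without tasklist_lock held. */
        rcu_read_lock();
        for_each_process_thread(g, p) {
            if (!p->lockdep_depth)
                continue;
            lockdep_print_held_locks(p);
            /* Avoid spurious soft-lockup splats while we print. */
            touch_all_softlockup_watchdogs();
        }
        rcu_read_unlock();

        pr_warn("\n=============================================\n\n");
    }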
     

04 May, 2018

1 commit

  • Use try_cmpxchg to avoid the pointless TEST instruction,
    and add the (missing) atomic_long_try_cmpxchg*() wrappery.
    A before/after C sketch follows the disassembly below.

    On x86_64 this gives:

    Before:

    0000000000000710 :
    710: 65 48 8b 14 25 00 00    mov    %gs:0x0,%rdx
    717: 00 00
    715: R_X86_64_32S current_task
    719: 31 c0                   xor    %eax,%eax
    71b: f0 48 0f b1 17          lock cmpxchg %rdx,(%rdi)
    720: 48 85 c0                test   %rax,%rax
    723: 75 02                   jne    727
    725: f3 c3                   repz retq
    727: eb d7                   jmp    700
    729: 0f 1f 80 00 00 00 00    nopl   0x0(%rax)

    After:

    0000000000000710 :
    710: 65 48 8b 14 25 00 00    mov    %gs:0x0,%rdx
    717: 00 00
    715: R_X86_64_32S current_task
    719: 31 c0                   xor    %eax,%eax
    71b: f0 48 0f b1 17          lock cmpxchg %rdx,(%rdi)
    720: 75 02                   jne    724
    722: f3 c3                   repz retq
    724: eb da                   jmp    700
    726: 66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    72d: 00 00 00

    On ARM64 this gives:

    Before:

    0000000000000638 :
    638: d5384101    mrs   x1, sp_el0
    63c: d2800002    mov   x2, #0x0
    640: f9800011    prfm  pstl1strm, [x0]
    644: c85ffc03    ldaxr x3, [x0]
    648: ca020064    eor   x4, x3, x2
    64c: b5000064    cbnz  x4, 658
    650: c8047c01    stxr  w4, x1, [x0]
    654: 35ffff84    cbnz  w4, 644
    658: b40000c3    cbz   x3, 670
    65c: a9bf7bfd    stp   x29, x30, [sp,#-16]!
    660: 910003fd    mov   x29, sp
    664: 97ffffef    bl    620
    668: a8c17bfd    ldp   x29, x30, [sp],#16
    66c: d65f03c0    ret
    670: d65f03c0    ret

    After:

    0000000000000638 :
    638: d5384101    mrs   x1, sp_el0
    63c: d2800002    mov   x2, #0x0
    640: f9800011    prfm  pstl1strm, [x0]
    644: c85ffc03    ldaxr x3, [x0]
    648: ca020064    eor   x4, x3, x2
    64c: b5000064    cbnz  x4, 658
    650: c8047c01    stxr  w4, x1, [x0]
    654: 35ffff84    cbnz  w4, 644
    658: b5000043    cbnz  x3, 660
    65c: d65f03c0    ret
    660: a9bf7bfd    stp   x29, x30, [sp,#-16]!
    664: 910003fd    mov   x29, sp
    668: 97ffffee    bl    620
    66c: a8c17bfd    ldp   x29, x30, [sp],#16
    670: d65f03c0    ret

    Reported-by: Matthew Wilcox
    Acked-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
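
    In C terms the change is roughly the following, using a mutex trylock
    fast path as the example site (reconstructed, so treat the details as
    approximate):

    static __always_inline bool __mutex_trylock_fast(struct mutex *lock)
    {
        unsigned long curr = (unsigned long)current;
        unsigned long zero = 0UL;

        /* Before:
         *   if (!atomic_long_cmpxchg_acquire(&lock->owner, 0UL, curr))
         *       return true;
         * which forces a TEST on the returned old value.
         *
         * After: try_cmpxchg lets the compiler branch on the flags the
         * CMPXCHG instruction already set, dropping the TEST. */
        if (atomic_long_try_cmpxchg_acquire(&lock->owner, &zero, curr))
            return true;

        return false;
    }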
     

27 Apr, 2018

1 commit

  • The native clear_pending() function is identical to the PV version, so the
    latter can simply be removed (the surviving native helper is sketched at
    the end of this entry).

    This fixes the build for systems with >= 16K CPUs using the PV lock
    implementation.

    Reported-by: Waiman Long
    Signed-off-by: Will Deacon
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: boqun.feng@gmail.com
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/20180427101619.GB21705@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
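
    For reference, the surviving native helper is along these lines (the
    NR_CPUS >= 16K configuration; sketched from the qspinlock code of that
    era, so treat it as illustrative):

    static __always_inline void clear_pending(struct qspinlock *lock)
    {
        /* In this configuration the pending bit lives inside the single
         * atomic 'val' word, so clear it with an atomic AND-NOT rather
         * than a plain byte store. */
        atomic_andnot(_Q_PENDING_VAL, &lock->val);
    }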