31 May, 2019

1 commit

  • Based on 3 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version [author] [kishon] [vijay] [abraham]
    [i] [kishon]@[ti] [com] this program is distributed in the hope that
    it will be useful but without any warranty without even the implied
    warranty of merchantability or fitness for a particular purpose see
    the gnu general public license for more details

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version [author] [graeme] [gregory]
    [gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i]
    [kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema]
    [hk] [hemahk]@[ti] [com] this program is distributed in the hope
    that it will be useful but without any warranty without even the
    implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details

    Extracted by the scancode license scanner, the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 1105 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Richard Fontana
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070033.202006027@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
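
    For reference, the treewide cleanup described above replaces the
    quoted boilerplate with a single machine-readable tag near the top
    of each affected file. A typical C source file ends up carrying only
    a line like the following (illustrative; the exact comment style
    varies by file type):

      // SPDX-License-Identifier: GPL-2.0-or-later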
     

10 Apr, 2019

2 commits

  • The QUEUED_LOCK_STAT option to report queued spinlock event counts
    was previously allowed only on the x86 architecture. To make the
    locking event counting code more useful, it is now renamed to the
    more generic LOCK_EVENT_COUNTS config option. This new option will be
    available to all the architectures that use qspinlock at the moment.

    Other locking code can now start to use the generic locking event
    counting code by including lock_events.h and putting the new locking
    event names into the lock_events_list.h header file (a sketch of the
    list-macro pattern follows this entry).

    My experience with lock event counting is that it gives valuable insight
    on how the locking code works and what can be done to make it better. I
    would like to extend this benefit to other locking code like mutex and
    rwsem in the near future.

    The PV qspinlock specific code will stay in qspinlock_stat.h. The
    locking event counters will now reside in the lock_event_counts
    directory under debugfs.

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Acked-by: Davidlohr Bueso
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20190404174320.22416-9-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
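
    A self-contained sketch of the "names in a list header" pattern the
    entry above describes, modelled in plain userspace C. The LOCK_EVENT()
    macro name and the event names are assumptions for illustration; only
    lock_events.h, lock_events_list.h and the lockevent_* calls are named
    in the commit message.

      #include <stdio.h>

      /* Stand-in for lock_events_list.h: one line per event name. */
      #define LOCK_EVENTS_LIST(LOCK_EVENT)  \
              LOCK_EVENT(lock_pending)      \
              LOCK_EVENT(lock_slowpath)     \
              LOCK_EVENT(rwsem_sleep)       /* a future non-qspinlock user */

      /* Expand the list twice: once for the enum, once for the names. */
      #define AS_ENUM(name)  LOCKEVENT_##name,
      #define AS_NAME(name)  #name,

      enum lock_events { LOCK_EVENTS_LIST(AS_ENUM) lockevent_num };
      static const char * const names[] = { LOCK_EVENTS_LIST(AS_NAME) };

      int main(void)
      {
              for (int i = 0; i < lockevent_num; i++)
                      printf("%s\n", names[i]);    /* one counter per name */
              return 0;
      }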
     
  • The per-CPU event counts used by the qspinlock code can be useful for
    other locking code as well. So a new set of lockevent_* counting APIs
    is introduced with the lock event names extracted out into the new
    lock_events_list.h header file for easier addition in the future.

    The existing qstat_inc() calls are replaced by either lockevent_inc() or
    lockevent_cond_inc() calls.

    The qstat_hop() call is renamed to lockevent_pv_hop(). The "reset_counters"
    debugfs file is also renamed to ".reset_counts".

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Acked-by: Davidlohr Bueso
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20190404174320.22416-8-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
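
    A minimal model of the two calls named above; the real kernel helpers
    operate on per-CPU counters, which this userspace sketch approximates
    with a flat array indexed by a fake CPU id:

      #include <stdbool.h>
      #include <stdio.h>

      #define NR_CPUS 4
      enum { LOCKEVENT_pv_hash_hops, LOCKEVENT_NUM };

      static unsigned long events[NR_CPUS][LOCKEVENT_NUM];

      /* lockevent_inc(): unconditionally bump this CPU's counter. */
      static void lockevent_inc(int cpu, int ev) { events[cpu][ev]++; }

      /* lockevent_cond_inc(): bump only when the condition holds. */
      static void lockevent_cond_inc(int cpu, int ev, bool cond)
      {
              if (cond)
                      events[cpu][ev]++;
      }

      int main(void)
      {
              lockevent_inc(0, LOCKEVENT_pv_hash_hops);
              lockevent_cond_inc(1, LOCKEVENT_pv_hash_hops, false);
              printf("%lu\n", events[0][0] + events[1][0]);  /* prints 1 */
              return 0;
      }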
     

04 Feb, 2019

1 commit

  • Track the number of slowpath locking operations that are done without
    any MCS node available, and rename the lock_index[123] counters to
    make them more descriptive.

    Using these stat counters is one way to find out if a code path is
    being exercised.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Borislav Petkov
    Cc: H. Peter Anvin
    Cc: James Morse
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: SRINIVAS
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: Zhenzhong Duan
    Link: https://lkml.kernel.org/r/1548798828-16156-3-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
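
    A self-contained model of the case being counted: each CPU has a
    small, fixed pool of MCS queue nodes (one per context level), and if
    nesting exhausts the pool the slowpath cannot queue, which is what
    the new counter records. The identifiers below are illustrative, not
    the kernel's:

      #include <stdio.h>

      #define MAX_NODES 4                  /* task, soft IRQ, hard IRQ, NMI */

      static int nodes_in_use;             /* per-CPU in the real code      */
      static unsigned long lock_no_node;   /* the new event counter         */

      static void slowpath_enter(void)
      {
              if (nodes_in_use++ >= MAX_NODES)
                      lock_no_node++;      /* no MCS node: spin unqueued    */
      }

      static void slowpath_exit(void) { nodes_in_use--; }

      int main(void)
      {
              for (int i = 0; i < 5; i++)  /* pretend 5 nested slowpaths    */
                      slowpath_enter();
              for (int i = 0; i < 5; i++)
                      slowpath_exit();
              printf("lock_no_node=%lu\n", lock_no_node);    /* prints 1 */
              return 0;
      }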
     

17 Oct, 2018

1 commit

  • Queued spinlocks support up to 4 levels of lock slowpath nesting -
    user context, soft IRQ, hard IRQ and NMI. However, we are not sure how
    often such nesting actually happens.

    So add 3 more per-CPU stat counters to track the number of instances where
    nesting index goes to 1, 2 and 3 respectively.

    On a dual-socket 64-core 128-thread Zen server, the following were the
    new stat counter values under different circumstances:

    State                              slowpath   index1  index2  index3
    -----                              --------   ------  ------  ------
    After bootup                      1,012,150       82       0       0
    After parallel build + perf-top 125,195,009       82       0       0

    So the chance of having more than 2 levels of nesting is extremely low.

    [ mingo: Minor changelog edits. ]

    Signed-off-by: Waiman Long
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/1539697507-28084-1-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
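
    To make the nesting concrete: the slowpath can only be re-entered on
    the same CPU when an interrupt arrives while a lower context is
    already queued, so the per-CPU node index climbs one level per
    context. A hedged sketch of where the added counters would be bumped
    (identifiers are illustrative):

      int idx = this_cpu_node_count++;    /* 0..3: task, sIRQ, hIRQ, NMI */
      switch (idx) {
      case 1: stat_inc(lock_index1); break;   /* slowpath nested once    */
      case 2: stat_inc(lock_index2); break;   /* nested twice            */
      case 3: stat_inc(lock_index3); break;   /* nested three times      */
      }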
     

27 Apr, 2018

1 commit

  • Currently, the qspinlock_stat code tracks only statistical counts in the
    PV qspinlock code. However, it may also be useful to track the number
    of locking operations done via the pending code vs. the MCS lock queue
    slowpath for the non-PV case.

    The qspinlock stat code is modified to do that. The stat counter
    pv_lock_slowpath is renamed to lock_slowpath so that it can be used by
    both the PV and non-PV cases.

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Waiman Long
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: boqun.feng@gmail.com
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: paulmck@linux.vnet.ibm.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1524738868-31318-14-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar

    Waiman Long
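
    A hedged sketch of what the two counters distinguish in the non-PV
    slowpath; the counter names follow the entry above, while the helper
    functions and control flow are simplified illustrations:

      /* Lightly contended case: take the lock via the pending bit,
       * without joining the MCS queue. */
      if (try_pending_path(lock)) {
              stat_inc(lock_pending);      /* acquired via pending code  */
              return;
      }

      stat_inc(lock_slowpath);             /* fell back to the MCS queue */
      queue_and_wait(lock);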
     

10 Aug, 2016

2 commits

  • Currently there is overlap between the pvqspinlock wait_again and
    spurious_wakeup stat counters. Because of lock stealing, it is
    no longer possible to accurately determine if a spurious wakeup has
    happened at the queue head. As the counters track both the queue node
    and queue head status, it is also hard to tell how many of those
    events come from the queue head and how many from the queue node.

    This patch changes the accounting rules so that spurious wakeups are
    only tracked in the queue node. The wait_again count, however, is
    only tracked in the queue head when the vCPU fails to acquire the
    lock after a vCPU kick. This should give a much better indication of
    the wait-kick dynamics in the queue node and the queue head (see the
    sketch after this entry).

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boqun Feng
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Pan Xinhui
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1464713631-1066-2-git-send-email-Waiman.Long@hpe.com
    Signed-off-by: Ingo Molnar

    Waiman Long
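
    A hedged sketch of the new accounting rules; the helper and field
    names are hypothetical, and only the counter names and the node/head
    split come from the entry above:

      /* Queue-node side: pv_wait() returned but we were not actually
       * signalled, so this wakeup was spurious. */
      if (!READ_ONCE(pn->signalled))
              stat_inc(spurious_wakeup);

      /* Queue-head side: we were kicked by the unlocker yet still failed
       * to take the lock (it was stolen), so we must wait again. */
      if (kicked && !acquired)
              stat_inc(wait_again);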
     
  • It's obviously wrong to set stat to NULL, so let's remove that
    assignment. Otherwise the stat is always zero when we check the
    latency of kick/wake.

    Signed-off-by: Pan Xinhui
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Waiman Long
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1468405414-3700-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Pan Xinhui
     

05 May, 2016

2 commits

  • Specifically around the debugfs file creation calls: I have no idea
    if they could ever possibly fail, but this is core code (debug
    aside), so let's at least check the return values and report
    anything fishy (see the sketch after this entry).

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Waiman Long
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20160420041725.GC3472@linux-uzut.site
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
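
    The kind of check being added, sketched against the 2016-era debugfs
    API (which returned a dentry pointer that could be NULL on failure);
    the directory name matches the one used elsewhere in this log, the
    error handling is an assumption:

      struct dentry *d;

      d = debugfs_create_dir("qlockstat", NULL);
      if (!d) {
              pr_warn("Could not create 'qlockstat' debugfs directory\n");
              return 0;               /* stats are best-effort, not fatal */
      }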
     
  • ... remove the redundant second iteration; this is most likely a
    copy/paste buglet.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dave@stgolabs.net
    Cc: waiman.long@hpe.com
    Link: http://lkml.kernel.org/r/1460961103-24953-2-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

19 Apr, 2016

1 commit

  • While playing with the qstat statistics (in the qlockstat directory
    under debugfs) I ran into the following splat on a VM when opening
    pv_hash_hops:

    divide error: 0000 [#1] SMP
    ...
    RIP: 0010:[] [] qstat_read+0x12e/0x1e0
    ...
    Call Trace:
    [] ? mem_cgroup_commit_charge+0x6c/0xd0
    [] ? page_add_new_anon_rmap+0x8c/0xd0
    [] ? handle_mm_fault+0x1439/0x1b40
    [] ? do_mmap+0x449/0x550
    [] ? __vfs_read+0x23/0xd0
    [] ? rw_verify_area+0x52/0xd0
    [] ? vfs_read+0x81/0x120
    [] ? SyS_read+0x42/0xa0
    [] ? entry_SYSCALL_64_fastpath+0x1e/0xa8

    Fix this by verifying that qstat_pv_kick_unlock is in fact non-zero,
    similarly to what the qstat_pv_latency_wake case does; if nothing
    else, this can happen after resetting the statistics, so having 0
    kicks is perfectly valid in this context (a sketch of the guard
    follows this entry).

    Signed-off-by: Davidlohr Bueso
    Reviewed-by: Waiman Long
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dave@stgolabs.net
    Cc: waiman.long@hpe.com
    Link: http://lkml.kernel.org/r/1460961103-24953-1-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
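
    A minimal model of the fix: the per-kick average is only computed
    when the kick count is non-zero, since the counters may have just
    been reset. The division below mirrors the one that faulted, with
    made-up values:

      #include <stdio.h>

      int main(void)
      {
              unsigned long kicks = 0;      /* qstat_pv_kick_unlock        */
              unsigned long hops  = 0;      /* total hash hops accumulated */
              double avg = 0.0;

              if (kicks)                    /* guard against divide-by-zero */
                      avg = (double)hops / kicks;
              printf("pv_hash_hops=%.2f\n", avg);
              return 0;
      }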
     

29 Feb, 2016

2 commits

  • This patch enables the tracking of the number of slowpath locking
    operations performed. This can be used to compare against the number
    of lock stealing operations to see what percentage of locks are stolen
    versus acquired via the regular slowpath.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1449778666-13593-2-git-send-email-Waiman.Long@hpe.com
    Signed-off-by: Ingo Molnar

    Waiman Long
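
    The comparison this counter enables, as simple arithmetic; the values
    are made up and the counter names are illustrative, with lock_slowpath
    counting regular slowpath acquisitions and pv_lock_stealing counting
    stolen ones:

      #include <stdio.h>

      int main(void)
      {
              unsigned long lock_slowpath    = 48000;
              unsigned long pv_lock_stealing = 12000;
              double stolen = 100.0 * pv_lock_stealing /
                              (pv_lock_stealing + lock_slowpath);

              printf("%.1f%% of contended acquisitions were stolen\n", stolen);
              return 0;
      }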
     
  • This patch moves the lock stealing count tracking code into
    pv_queued_spin_steal_lock() instead of going through a jacket
    function, simplifying the code.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1449778666-13593-3-git-send-email-Waiman.Long@hpe.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

04 Dec, 2015

3 commits

  • In an overcommitted guest where some vCPUs have to be halted to make
    forward progress in other areas, it is highly likely that a vCPU later
    in the spinlock queue will be spinning while the ones earlier in the
    queue would have been halted. The spinning in the later vCPUs is then
    just a waste of precious CPU cycles because they are not going to
    get the lock any time soon, as the earlier ones have to be woken up
    and take their turn to get the lock.

    This patch implements an adaptive spinning mechanism where the vCPU
    will call pv_wait() if the previous vCPU is not running.

    Linux kernel builds were run in a KVM guest on an 8-socket, 4
    cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
    Haswell-EX system. Both systems are configured to have 32 physical
    CPUs. The kernel build times before and after the patch were:

                        Westmere               Haswell
    Patch           32 vCPUs  48 vCPUs    32 vCPUs  48 vCPUs
    -----           --------  --------    --------  --------
    Before patch     3m02.3s   5m00.2s     1m43.7s   3m03.5s
    After patch      3m03.0s   4m37.5s     1m43.0s   2m47.2s

    For 32 vCPUs, this patch doesn't cause any noticeable change in
    performance. For 48 vCPUs (over-committed), there is about 8%
    performance improvement.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1447114167-47185-8-git-send-email-Waiman.Long@hpe.com
    Signed-off-by: Ingo Molnar

    Waiman Long
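
    A hedged sketch of the adaptive policy described above; pv_wait() is
    the paravirt halt primitive named in the entry, while the node fields
    and the vcpu_is_running() helper are illustrative:

      /* Waiting in the queue behind 'prev': keep spinning only while the
       * predecessor vCPU is actually running; otherwise halt ourselves
       * and rely on being kicked when it is our turn. */
      while (!READ_ONCE(node->locked)) {
              if (!vcpu_is_running(prev->cpu)) {
                      pv_wait(&node->state, vcpu_halted);
                      continue;
              }
              cpu_relax();
      }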
     
  • This patch allows one attempt for the lock waiter to steal the lock
    when entering the PV slowpath. To prevent lock starvation, the pending
    bit will be set by the queue head vCPU when it is in the active lock
    spinning loop to disable any lock stealing attempt. This helps to
    reduce the performance penalty caused by lock waiter preemption while
    not having much of the downsides of a real unfair lock.

    The pv_wait_head() function was renamed as pv_wait_head_or_lock()
    as it was modified to acquire the lock before returning. This is
    necessary because of possible lock stealing attempts from other tasks.

    Linux kernel builds were run in a KVM guest on an 8-socket, 4
    cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
    Haswell-EX system. Both systems are configured to have 32 physical
    CPUs. The kernel build times before and after the patch were:

                        Westmere               Haswell
    Patch           32 vCPUs  48 vCPUs    32 vCPUs  48 vCPUs
    -----           --------  --------    --------  --------
    Before patch     3m15.6s  10m56.1s     1m44.1s   5m29.1s
    After patch      3m02.3s   5m00.2s     1m43.7s   3m03.5s

    For the overcommitted case (48 vCPUs), this patch is able to reduce
    kernel build time by more than 54% for Westmere and 44% for Haswell.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1447190336-53317-1-git-send-email-Waiman.Long@hpe.com
    Signed-off-by: Ingo Molnar

    Waiman Long
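
    A hedged sketch of the stealing attempt and the starvation guard
    described above; pv_wait_head_or_lock() is the renamed function from
    the entry, while the other helpers are illustrative:

      /* On slowpath entry: one steal attempt, allowed only while the
       * queue head has not set the pending bit. */
      if (!(atomic_read(&lock->val) & _Q_PENDING_VAL) && trylock(lock)) {
              stat_inc(pv_lock_stealing);
              return;                        /* lock stolen, done        */
      }

      /* Queue head: set the pending bit while actively spinning so that
       * nobody else can steal, then pv_wait_head_or_lock() only returns
       * once the lock is actually held. */
      set_pending(lock);
      pv_wait_head_or_lock(lock, node);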
     
  • This patch enables the accumulation of kicking and waiting related
    PV qspinlock statistics when the new QUEUED_LOCK_STAT configuration
    option is selected. It also enables the collection of data that
    allows us to calculate the kicking and wakeup latencies, which have
    a heavy dependency on the CPUs being used.

    The statistical counters are per-cpu variables to minimize the
    performance overhead in their updates. These counters are exported
    via the debugfs filesystem under the qlockstat directory. When the
    corresponding debugfs files are read, summation and computing of the
    required data are then performed.

    The measured latencies for different CPUs are:

    CPU             Wakeup   Kicking
    ---             ------   -------
    Haswell-EX      63.6us     7.4us
    Westmere-EX     67.6us     9.3us

    The measured latencies varied a bit from run-to-run. The wakeup
    latency is much higher than the kicking latency.

    A sample of statistical counters after system bootup (with vCPU
    overcommit) was:

    pv_hash_hops=1.00
    pv_kick_unlock=1148
    pv_kick_wake=1146
    pv_latency_kick=11040
    pv_latency_wake=194840
    pv_spurious_wakeup=7
    pv_wait_again=4
    pv_wait_head=23
    pv_wait_node=1129

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1447114167-47185-6-git-send-email-Waiman.Long@hpe.com
    Signed-off-by: Ingo Molnar

    Waiman Long
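
    A self-contained model of the reporting scheme described above: each
    CPU keeps its own counters to keep the hot path cheap, and a read of
    the corresponding debugfs file sums them and derives an average, here
    the kick latency (time units assumed to be nanoseconds; the numbers
    are chosen to reproduce the sample values above):

      #include <stdio.h>

      #define NR_CPUS 4

      struct qstats {
              unsigned long pv_kick_unlock;   /* # of kicks issued        */
              unsigned long kick_time_ns;     /* total time spent kicking */
      };

      static struct qstats per_cpu[NR_CPUS];  /* per-CPU in the kernel    */

      /* What reading the debugfs files would do: sum, then derive. */
      static void report(void)
      {
              unsigned long kicks = 0, time_ns = 0;

              for (int cpu = 0; cpu < NR_CPUS; cpu++) {
                      kicks   += per_cpu[cpu].pv_kick_unlock;
                      time_ns += per_cpu[cpu].kick_time_ns;
              }
              printf("pv_kick_unlock=%lu\n", kicks);
              if (kicks)
                      printf("pv_latency_kick=%lu\n", time_ns / kicks);
      }

      int main(void)
      {
              per_cpu[0] = (struct qstats){ 574, 6336960 };
              per_cpu[1] = (struct qstats){ 574, 6336960 };
              report();               /* prints 1148 kicks, 11040 ns/kick */
              return 0;
      }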