18 Feb, 2015

1 commit

  • With task_blocks_on_rt_mutex() returning early -EDEADLK we never
    add the waiter to the waitqueue. Later, we try to remove it via
    remove_waiter() and go boom in rt_mutex_top_waiter() because
    rb_entry() gives a NULL pointer.

    ( Tested on v3.18-RT where rtmutex is used for regular mutex and I
    tried to get one twice in a row. )

    Not sure when this started but I guess 397335f004f4 ("rtmutex: Fix
    deadlock detector for real") or commit 3d5c9340d194 ("rtmutex:
    Handle deadlock detection smarter").

    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: # for v3.16 and later kernels
    Link: http://lkml.kernel.org/r/1424187823-19600-1-git-send-email-bigeasy@linutronix.de
    Signed-off-by: Ingo Molnar

    Sebastian Andrzej Siewior
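    The failure mode above can be sketched in userspace. Everything here (the struct names, the enqueued flag, the demo helpers) is an illustrative model of the bug, not the kernel's rt_mutex internals:

```c
#include <assert.h>
#include <stddef.h>

#define EDEADLK 35

/* Userspace model of the bug; names are illustrative only. */
struct waiter { int enqueued; };
struct lock   { struct waiter *top_waiter; };

/* Can fail with -EDEADLK *before* the waiter is ever enqueued. */
static int task_blocks_on_lock(struct lock *l, struct waiter *w, int deadlock)
{
    if (deadlock)
        return -EDEADLK;
    w->enqueued = 1;
    l->top_waiter = w;
    return 0;
}

/* Stands in for remove_waiter()/rt_mutex_top_waiter(): dereferencing
 * the top waiter of an empty waitqueue is the "boom" described above. */
static void remove_waiter(struct lock *l, struct waiter *w)
{
    assert(l->top_waiter != NULL);   /* crash site in the buggy path */
    l->top_waiter = NULL;
    w->enqueued = 0;
}

/* Fixed slowpath shape: only unwind the waiter if it was queued. */
static int slowlock(struct lock *l, struct waiter *w, int deadlock)
{
    int ret = task_blocks_on_lock(l, w, deadlock);

    if (ret) {
        if (l->top_waiter == w)      /* was it actually enqueued? */
            remove_waiter(l, w);
        return ret;
    }
    remove_waiter(l, w);             /* "acquired": dequeue ourselves */
    return 0;
}

static int demo_deadlock_path(void)
{
    struct lock l = { NULL };
    struct waiter w = { 0 };
    return slowlock(&l, &w, 1);      /* -EDEADLK, and no crash */
}

static int demo_normal_path(void)
{
    struct lock l = { NULL };
    struct waiter w = { 0 };
    return slowlock(&l, &w, 0);
}
```

    The unconditional remove_waiter() of the buggy path would trip the assertion in the deadlock case; the guard makes the unwind safe.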
     

12 Feb, 2015

1 commit

  • Pull s390 updates from Martin Schwidefsky:

    - The remaining patches for the z13 machine support: kernel build
    option for z13, the cache synonym avoidance, SMT support,
    compare-and-delay for spinloops and the CEX5S crypto adapter.

    - The ftrace support for function tracing with the gcc hotpatch option.
    This touches the common code Makefiles; Steven is OK with the changes.

    - The hypfs file system gets an extension to access diagnose 0x0c data
    in user space for performance analysis for Linux running under z/VM.

    - The iucv hvc console gets wildcard support for the user id filtering.

    - The cacheinfo code is converted to use the generic infrastructure.

    - Cleanup and bug fixes.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
    s390/process: free vx save area when releasing tasks
    s390/hypfs: Eliminate hypfs interval
    s390/hypfs: Add diagnose 0c support
    s390/cacheinfo: don't use smp_processor_id() in preemptible context
    s390/zcrypt: fixed domain scanning problem (again)
    s390/smp: increase maximum value of NR_CPUS to 512
    s390/jump label: use different nop instruction
    s390/jump label: add sanity checks
    s390/mm: correct missing space when reporting user process faults
    s390/dasd: cleanup profiling
    s390/dasd: add locking for global_profile access
    s390/ftrace: hotpatch support for function tracing
    ftrace: let notrace function attribute disable hotpatching if necessary
    ftrace: allow architectures to specify ftrace compile options
    s390: reintroduce diag 44 calls for cpu_relax()
    s390/zcrypt: Add support for new crypto express (CEX5S) adapter.
    s390/zcrypt: Number of supported ap domains is not retrievable.
    s390/spinlock: add compare-and-delay to lock wait loops
    s390/tape: remove redundant if statement
    s390/hvc_iucv: add simple wildcard matches to the iucv allow filter
    ...

    Linus Torvalds
     

11 Feb, 2015

1 commit

  • Pull networking updates from David Miller:

    1) More iov_iter conversion work from Al Viro.

    [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was
    wrong, and this pull actually adds an extra commit on top of the
    branch I'm pulling to fix that up, so that the pre-merge state is
    ok. - Linus ]

    2) Various optimizations to the ipv4 forwarding information base trie
    lookup implementation. From Alexander Duyck.

    3) Remove sock_iocb altogether, from Christoph Hellwig.

    4) Allow congestion control algorithm selection via routing metrics.
    From Daniel Borkmann.

    5) Make ipv4 uncached route list per-cpu, from Eric Dumazet.

    6) Handle rfs hash collisions more gracefully, also from Eric Dumazet.

    7) Add xmit_more support to r8169, e1000, and e1000e drivers. From
    Florian Westphal.

    8) Transparent Ethernet Bridging support for GRO, from Jesse Gross.

    9) Add BPF packet actions to packet scheduler, from Jiri Pirko.

    10) Add support for unique flow IDs to openvswitch, from Joe Stringer.

    11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman
    Kwok.

    12) More sanely handle out-of-window dupacks, which can result in
    serious ACK storms. From Neal Cardwell.

    13) Various rhashtable bug fixes and enhancements, from Herbert Xu,
    Patrick McHardy, and Thomas Graf.

    14) Support xmit_more in be2net, from Sathya Perla.

    15) Group Policy extensions for vxlan, from Thomas Graf.

    16) Remove Checksum Offload support for vxlan, from Tom Herbert.

    17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From
    Vlad Yasevich.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits)
    crypto: fix af_alg_make_sg() conversion to iov_iter
    ipv4: Namespecify TCP PMTU mechanism
    i40e: Fix for stats init function call in Rx setup
    tcp: don't include Fast Open option in SYN-ACK on pure SYN-data
    openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set
    ipv6: Make __ipv6_select_ident static
    ipv6: Fix fragment id assignment on LE arches.
    bridge: Fix inability to add non-vlan fdb entry
    net: Mellanox: Delete unnecessary checks before the function call "vunmap"
    cxgb4: Add support in cxgb4 to get expansion rom version via ethtool
    ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version
    net: dsa: Remove redundant phy_attach()
    IB/mlx4: Reset flow support for IB kernel ULPs
    IB/mlx4: Always use the correct port for mirrored multicast attachments
    net/bonding: Fix potential bad memory access during bonding events
    tipc: remove tipc_snprintf
    tipc: nl compat add noop and remove legacy nl framework
    tipc: convert legacy nl stats show to nl compat
    tipc: convert legacy nl net id get to nl compat
    tipc: convert legacy nl net id set to nl compat
    ...

    Linus Torvalds
     

10 Feb, 2015

1 commit

  • Pull scheduler updates from Ingo Molnar:
    "The main scheduler changes in this cycle were:

    - various sched/deadline fixes and enhancements

    - rescheduling latency fixes/cleanups

    - rework the rq->clock code to be more consistent and more robust.

    - minor micro-optimizations

    - ->avg.decay_count fixes

    - add a stack overflow check to might_sleep()

    - idle-poll handler fix, possibly resulting in power savings

    - misc smaller updates and fixes"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/Documentation: Remove unneeded word
    sched/wait: Introduce wait_on_bit_timeout()
    sched: Pull resched loop to __schedule() callers
    sched/deadline: Remove cpu_active_mask from cpudl_find()
    sched: Fix hrtick_start() on UP
    sched/deadline: Avoid pointless __setscheduler()
    sched/deadline: Fix stale yield state
    sched/deadline: Fix hrtick for a non-leftmost task
    sched/deadline: Modify cpudl::free_cpus to reflect rd->online
    sched/idle: Add missing checks to the exit condition of cpu_idle_poll()
    sched: Fix missing preemption opportunity
    sched/rt: Reduce rq lock contention by eliminating locking of non-feasible target
    sched/debug: Print rq->clock_task
    sched/core: Rework rq->clock update skips
    sched/core: Validate rq_clock*() serialization
    sched/core: Remove check of p->sched_class
    sched/fair: Fix sched_entity::avg::decay_count initialization
    sched/debug: Fix potential call to __ffs(0) in sched_show_task()
    sched/debug: Check for stack overflow in ___might_sleep()
    sched/fair: Fix the dealing with decay_count in __synchronize_entity_decay()

    Linus Torvalds
     

04 Feb, 2015

4 commits

  • We explicitly mark the task running after returning from
    a __rt_mutex_slowlock() call, which does the actual sleeping
    via wait-wake-trylocking. As such, this patch does two things:

    (1) refactors the code so that setting current to TASK_RUNNING
    is done by __rt_mutex_slowlock(), and not by the callers. The
    downside to this is that it becomes a bit unclear at what
    point we block. As such I've added a comment that the task
    blocks when calling __rt_mutex_slowlock() so readers can figure
    out when it is running again.

    (2) relaxes setting current's state through __set_current_state(),
    instead of its more expensive barrier alternative. There was no
    need for the implied barrier as we're obviously not planning on
    blocking.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1422857784.18096.1.camel@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
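    The distinction the patch relies on can be modeled with C11 atomics. The two helpers below are userspace stand-ins for the kernel primitives of the same names, a sketch of the shape of the change rather than the actual implementation:

```c
#include <stdatomic.h>

#define TASK_RUNNING        0
#define TASK_INTERRUPTIBLE  1

static _Atomic int task_state = TASK_RUNNING;

/* set_current_state(): store plus full barrier; required when the
 * state change must be ordered against a following wakeup test. */
static void set_current_state(int s)
{
    atomic_store_explicit(&task_state, s, memory_order_seq_cst);
}

/* __set_current_state(): plain store; sufficient when we are not
 * about to block, e.g. restoring TASK_RUNNING after the wait loop. */
static void __set_current_state(int s)
{
    atomic_store_explicit(&task_state, s, memory_order_relaxed);
}

/* Shape of the refactor: the slowlock itself restores TASK_RUNNING,
 * so callers no longer have to. */
static int slowlock(void)
{
    set_current_state(TASK_INTERRUPTIBLE);  /* may block from here on */
    /* ... wait-wake-trylock loop elided ... */
    __set_current_state(TASK_RUNNING);      /* cheap: no barrier needed */
    return atomic_load_explicit(&task_state, memory_order_relaxed);
}
```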
     
  • Call __set_task_state() instead of assigning the new state
    directly. These interfaces also aid CONFIG_DEBUG_ATOMIC_SLEEP
    environments, keeping track of who last changed the state.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: "Paul E. McKenney"
    Cc: Jason Low
    Cc: Michel Lespinasse
    Cc: Tim Chen
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1422257769-14083-2-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • By the time we wake up and get the lock after being asleep
    in the slowpath, we better be running. As good practice,
    be explicit about this and avoid any mischief.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: "Paul E. McKenney"
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1421717961.4903.11.camel@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • The second 'mutex' shouldn't be there; it can't be about the mutex,
    as the mutex can't be freed, only unlocked. The memory where the
    mutex resides, however, can be freed.

    Signed-off-by: Sharon Dvir
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1422827252-31363-1-git-send-email-sharon.dvir1@mail.huji.ac.il
    Signed-off-by: Ingo Molnar

    Sharon Dvir
     

29 Jan, 2015

1 commit

  • If the kernel is compiled with function tracer support the -pg compile option
    is passed to gcc to generate extra code into the prologue of each function.

    This patch replaces the "open-coded" -pg compile flag with a CC_FLAGS_FTRACE
    makefile variable which architectures can override if a different option
    should be used for code generation.

    Acked-by: Steven Rostedt
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

14 Jan, 2015

5 commits

  • Both mutexes and rwsems took a performance hit when we switched
    over from the original mcs code to the cancelable variant (osq).
    The reason is the use of smp_load_acquire() when polling for
    node->locked. This is not needed, as reordering is not an issue;
    as such, relax the barrier semantics. Paul describes the scenario
    nicely: https://lkml.org/lkml/2013/11/19/405

    - If we start polling before the insertion is complete, all that
    happens is that the first few polls have no chance of seeing a lock
    grant.

    - Ordering the polling against the initialization -- the above
    xchg() is already doing that for us.

    The smp_load_acquire() when unqueuing makes sense. In addition,
    we don't need to worry about leaking the critical region as
    osq is only used internally.

    This impacts both regular and large levels of concurrency;
    e.g. on a 40-core system with a disk-intensive workload
    (left column: before; right column: after the change):

    disk-1 804.83 ( 0.00%) 828.16 ( 2.90%)
    disk-61 8063.45 ( 0.00%) 18181.82 (125.48%)
    disk-121 7187.41 ( 0.00%) 20119.17 (179.92%)
    disk-181 6933.32 ( 0.00%) 20509.91 (195.82%)
    disk-241 6850.81 ( 0.00%) 20397.80 (197.74%)
    disk-301 6815.22 ( 0.00%) 20287.58 (197.68%)
    disk-361 7080.40 ( 0.00%) 20205.22 (185.37%)
    disk-421 7076.13 ( 0.00%) 19957.33 (182.04%)
    disk-481 7083.25 ( 0.00%) 19784.06 (179.31%)
    disk-541 7038.39 ( 0.00%) 19610.92 (178.63%)
    disk-601 7072.04 ( 0.00%) 19464.53 (175.23%)
    disk-661 7010.97 ( 0.00%) 19348.23 (175.97%)
    disk-721 7069.44 ( 0.00%) 19255.33 (172.37%)
    disk-781 7007.58 ( 0.00%) 19103.14 (172.61%)
    disk-841 6981.18 ( 0.00%) 18964.22 (171.65%)
    disk-901 6968.47 ( 0.00%) 18826.72 (170.17%)
    disk-961 6964.61 ( 0.00%) 18708.02 (168.62%)

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: "Paul E. McKenney"
    Cc: Thomas Gleixner
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1420573509-24774-7-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
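    The relaxed-polling idea can be sketched with C11 atomics. osq_poll() and osq_grant() are invented names for this userspace model; in the kernel the spin happens inside osq_lock() itself:

```c
#include <stdatomic.h>
#include <stdbool.h>

struct optimistic_spin_node {
    _Atomic bool locked;
};

/* Spin on node->locked with a plain relaxed load per iteration: the
 * xchg() that published this node already ordered the insertion, so
 * the worst case is a few early polls that cannot yet see the grant.
 * An acquire load here would pay a barrier on every single spin. */
static bool osq_poll(struct optimistic_spin_node *node, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(&node->locked, memory_order_relaxed))
            return true;
        /* cpu_relax() would sit here in the kernel version */
    }
    return false;
}

/* The current holder hands the lock off with a release store. */
static void osq_grant(struct optimistic_spin_node *node)
{
    atomic_store_explicit(&node->locked, true, memory_order_release);
}

static bool demo(void)
{
    struct optimistic_spin_node node = { false };
    osq_grant(&node);
    return osq_poll(&node, 100);
}
```

    The acquire semantics are kept only where they matter: on the unqueue path, as the text notes.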
     
  • We have two flavors of the MCS spinlock: standard and cancelable (OSQ).
    While each one is independent of the other, we currently mix and match
    them. This patch:

    - Moves the OSQ code out of mcs_spinlock.h (which only deals with the traditional
    version) into include/linux/osq_lock.h. No unnecessary code is added to the
    more global header file; any locks that make use of OSQ must include
    it anyway.

    - Renames mcs_spinlock.c to osq_lock.c. This file only contains osq code.

    - Introduces a CONFIG_LOCK_SPIN_ON_OWNER in order to only build osq_lock
    if there is support for it.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Thomas Gleixner
    Cc: "Paul E. McKenney"
    Cc: Jason Low
    Cc: Linus Torvalds
    Cc: Mikulas Patocka
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1420573509-24774-5-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • ... which is equivalent to the fastpath counter part.
    This mainly allows getting some WW specific code out
    of generic mutex paths.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: "Paul E. McKenney"
    Cc: Thomas Gleixner
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1420573509-24774-4-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • It serves much better if the comments are right before the osq_lock() call.
    Also delete a useless comment.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: "Paul E. McKenney"
    Cc: Thomas Gleixner
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1420573509-24774-3-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • Mark it so by renaming __mutex_lock_check_stamp().

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: "Paul E. McKenney"
    Cc: Thomas Gleixner
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1420573509-24774-2-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

09 Jan, 2015

1 commit

  • Currently if DEBUG_MUTEXES is enabled, the mutex->owner field is only
    cleared if debug_locks is active. This exposes a race to other users of
    the field where the mutex->owner may be still set to a stale value,
    potentially upsetting mutex_spin_on_owner() among others.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=87955
    Signed-off-by: Chris Wilson
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Davidlohr Bueso
    Cc: Daniel Vetter
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1420540175-30204-1-git-send-email-chris@chris-wilson.co.uk
    Signed-off-by: Ingo Molnar

    Chris Wilson
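    The race boils down to a conditional clear. A minimal userspace model (all names invented) of the buggy versus fixed shape:

```c
#include <stddef.h>
#include <stdbool.h>

struct task  { int pid; };
struct mutex { struct task *owner; };

static bool debug_locks = false;  /* lockdep can turn this off at runtime */

/* Buggy shape: owner only cleared while debug_locks is still on,
 * leaving a stale pointer for mutex_spin_on_owner() to chase. */
static void mutex_clear_owner_buggy(struct mutex *m)
{
    if (debug_locks)
        m->owner = NULL;
}

/* Fixed shape: always clear the owner on unlock. */
static void mutex_clear_owner_fixed(struct mutex *m)
{
    m->owner = NULL;
}

static bool demo_stale_owner(void)
{
    struct task t = { 1 };
    struct mutex m = { &t };
    mutex_clear_owner_buggy(&m);    /* debug_locks is off: no-op */
    return m.owner != NULL;         /* stale owner survives the unlock */
}

static bool demo_fixed(void)
{
    struct task t = { 1 };
    struct mutex m = { &t };
    mutex_clear_owner_fixed(&m);
    return m.owner == NULL;
}
```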
     

28 Oct, 2014

1 commit

  • We're going to make might_sleep() test for TASK_RUNNING, because
    blocking without TASK_RUNNING will destroy the task state by setting
    it to TASK_RUNNING.

    There are a few occasions where it's 'valid' to call blocking
    primitives (and mutex_lock in particular) and not have TASK_RUNNING,
    typically such cases are right before we set TASK_RUNNING anyhow.

    Robustify the code by not assuming this; this has the beneficial side
    effect of allowing optional code emission for fixing the above
    might_sleep() false positives.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: tglx@linutronix.de
    Cc: ilya.dryomov@inktank.com
    Cc: umgwanakikbuti@gmail.com
    Cc: Oleg Nesterov
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20140924082241.988560063@infradead.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

13 Oct, 2014

1 commit

  • Pull core locking updates from Ingo Molnar:
    "The main updates in this cycle were:

    - mutex MCS refactoring finishing touches: improve comments, refactor
    and clean up code, reduce debug data structure footprint, etc.

    - qrwlock finishing touches: remove old code, self-test updates.

    - small rwsem optimization

    - various smaller fixes/cleanups"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    locking/lockdep: Revert qrwlock recusive stuff
    locking/rwsem: Avoid double checking before try acquiring write lock
    locking/rwsem: Move EXPORT_SYMBOL() lines to follow function definition
    locking/rwlock, x86: Delete unused asm/rwlock.h and rwlock.S
    locking/rwlock, x86: Clean up asm/spinlock*.h to remove old rwlock code
    locking/semaphore: Resolve some shadow warnings
    locking/selftest: Support queued rwlock
    locking/lockdep: Restrict the use of recursive read_lock() with qrwlock
    locking/spinlocks: Always evaluate the second argument of spin_lock_nested()
    locking/Documentation: Update locking/mutex-design.txt disadvantages
    locking/Documentation: Move locking related docs into Documentation/locking/
    locking/mutexes: Use MUTEX_SPIN_ON_OWNER when appropriate
    locking/mutexes: Refactor optimistic spinning code
    locking/mcs: Remove obsolete comment
    locking/mutexes: Document quick lock release when unlocking
    locking/mutexes: Standardize arguments in lock/unlock slowpaths
    locking: Remove deprecated smp_mb__() barriers

    Linus Torvalds
     

03 Oct, 2014

2 commits

  • Commit f0bab73cb539 ("locking/lockdep: Restrict the use of recursive
    read_lock() with qrwlock") changed lockdep to try and conform to the
    qrwlock semantics which differ from the traditional rwlock semantics.

    In particular qrwlock is fair outside of interrupt context, but in
    interrupt context readers will ignore all fairness.

    The problem modeling this is that read and write side have different
    lock state (interrupts) semantics but we only have a single
    representation of these. Therefore lockdep will get confused, thinking
    the lock can cause interrupt lock inversions.

    So revert it for now; the old rwlock semantics were already imperfectly
    modeled and the qrwlock extra won't fit either.

    If we want to properly fix this, I think we need to resurrect the work
    Gautham did a few years ago that split the read and write state of
    locks:

    http://lwn.net/Articles/332801/

    FWIW the locking selftest that would've failed (and was reported by
    Borislav earlier) is something like:

    RL(X1); /* IRQ-ON */
    LOCK(A);
    UNLOCK(A);
    RU(X1);

    IRQ_ENTER();
    RL(X1); /* IN-IRQ */
    RU(X1);
    IRQ_EXIT();

    At which point it would report that because A is an IRQ-unsafe lock we
    can suffer the following inversion:

    CPU0                        CPU1

    lock(A)
                                lock(X1)
                                lock(A)

    lock(X1)

    And this is 'wrong' because X1 can recurse (assuming the above locks are
    in fact read-lock) but lockdep doesn't know about this.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Waiman Long
    Cc: ego@linux.vnet.ibm.com
    Cc: bp@alien8.de
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Link: http://lkml.kernel.org/r/20140930132600.GA7444@worktop.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Commit 9b0fc9c09f1b ("rwsem: skip initial trylock in rwsem_down_write_failed")
    checks if there are known active lockers in order to avoid write trylocking
    using expensive cmpxchg() when it likely wouldn't get the lock.

    However, a subsequent patch was added such that we directly
    check for sem->count == RWSEM_WAITING_BIAS right before trying
    that cmpxchg().

    Thus, commit 9b0fc9c09f1b now just adds overhead.

    This patch modifies it so that we only check whether
    count == RWSEM_WAITING_BIAS.

    Also, add a comment on why we do an "extra check" of count
    before the cmpxchg().

    Signed-off-by: Jason Low
    Acked-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Aswin Chandramouleeswaran
    Cc: Chegu Vinod
    Cc: Peter Hurley
    Cc: Tim Chen
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1410913017.2447.22.camel@j-VirtualBox
    Signed-off-by: Ingo Molnar

    Jason Low
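    The check-before-cmpxchg pattern can be sketched with C11 atomics. The bias values below are illustrative stand-ins, not the kernel's actual rwsem count arithmetic:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative values only; the real bias arithmetic is richer. */
#define RWSEM_WAITING_BIAS  (-1L)
#define RWSEM_WRITE_LOCKED  1L

static _Atomic long sem_count = RWSEM_WAITING_BIAS;

/* Only attempt the expensive cmpxchg when the count says "waiters
 * only, no active lockers", i.e. exactly RWSEM_WAITING_BIAS.
 * Checking first avoids dirtying the cache line on an attempt that
 * cannot succeed anyway. */
static bool rwsem_try_write_lock(void)
{
    long old = atomic_load_explicit(&sem_count, memory_order_relaxed);

    if (old != RWSEM_WAITING_BIAS)
        return false;            /* active lockers: don't even try */
    return atomic_compare_exchange_strong(&sem_count, &old,
                                          RWSEM_WRITE_LOCKED);
}
```

    With this single check in place, the older "known active lockers" heuristic becomes redundant overhead, which is what the patch removes.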
     

17 Sep, 2014

8 commits

  • The number of global variables is getting pretty ugly. Group variables
    related to the execution (ie: not parameters) in a new context structure.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • We can easily do so with our new reader lock support. Just an arbitrary
    design default: readers have higher (5x) critical region latencies than
    writers: 50 ms and 10 ms, respectively.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • Most of it is based on what we already have for writers. This allows
    readers to be very independent (and thus configurable), enabling
    future module parameters to control things such as rw distribution.
    Furthermore, readers have their own delaying function, allowing us
    to test different rw critical region latencies, and stress locking
    internals. Similarly, statistics, for now will only serve for the
    number of lock acquisitions -- as opposed to writers, readers have
    no failure detection.

    In addition, introduce a new nreaders_stress module parameter. The
    default number of readers will be the same as the number of writer threads.
    Writer threads are interleaved with readers. Documentation is updated,
    respectively.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • When performing module cleanups by calling torture_cleanup() the
    'torture_type' string is nullified. However, callers are not necessarily
    done, and might still need to reference the variable. This impacts
    both rcutorture and locktorture, causing printing things like:

    [ 94.226618] (null)-torture: Stopping lock_torture_writer task
    [ 94.226624] (null)-torture: Stopping lock_torture_stats task

    Thus delay this operation until the very end of the cleanup process.
    The consequence (which shouldn't matter for this kind of program) is,
    of course, that we delay the window between rmmod and modprobing,
    for instance in module_torture_begin().

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • The statistics structure can serve well for both reader and writer
    locks, thus simply rename some fields that mention 'write' and leave
    the declaration of lwsa.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • Regular locks are very different than locks with debugging. For instance
    for mutexes, debugging forces us to only take the slowpaths. As such, the
    locktorture module should take this into account when printing related
    information -- specifically when printing user passed parameters, it seems
    the right place for such info.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • Add a "mutex_lock" torture test. The main difference with the already
    existing spinlock tests is that the latency of the critical region
    is much larger. We randomly delay for (arbitrarily) either 500 ms or,
    otherwise, 25 ms. While this can considerably reduce the amount of
    writes compared to non blocking locks, if run long enough it can have
    the same torturous effect. Furthermore it is more representative of
    mutex hold times and can stress better things like thrashing.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
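    The biased hold-time choice can be sketched as follows. Only the 500 ms / 25 ms values come from the text above; the function name and the selection odds are made up for illustration:

```c
/* Pick a critical-section hold time for the mutex torture writer:
 * mostly short (25 ms) holds, with an occasional long (500 ms) one.
 * The 1-in-200 ratio here is arbitrary, not locktorture's. */
static long torture_mutex_hold_ms(unsigned long rnd)
{
    return (rnd % 200 == 0) ? 500 : 25;
}
```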
     
  • ... to just 'torture_runnable'. It follows other variable naming
    and is shorter.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     

04 Sep, 2014

1 commit

  • Resolve some shadow warnings resulting from using the name
    jiffies, which is a well-known global. This is not a problem
    of course, but it could be a trap for someone copying and
    pasting code, and it just makes W=2 a little cleaner.

    Signed-off-by: Mark Rustad
    Signed-off-by: Jeff Kirsher
    Acked-by: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Thomas Gleixner
    Cc: Paul E. McKenney
    Link: http://lkml.kernel.org/r/1409739444-13635-1-git-send-email-jeffrey.t.kirsher@intel.com
    Signed-off-by: Ingo Molnar

    Mark Rustad
     

13 Aug, 2014

4 commits

  • Unlike the original unfair rwlock implementation, queued rwlock
    will grant lock according to the chronological sequence of the lock
    requests except when the lock requester is in the interrupt context.
    Consequently, recursive read_lock calls will now hang the process if
    there is a write_lock call somewhere in between the read_lock calls.

    This patch updates the lockdep implementation to look for recursive
    read_lock calls. A new read state (3) is used to mark those read_lock
    calls that cannot be recursively called except in the interrupt
    context. The new read state does exhaust the 2 bits available in
    held_lock:read bit field. The addition of any new read state in the
    future may require a redesign of how all those bits are squeezed
    together in the held_lock structure.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra
    Cc: Maarten Lankhorst
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Fengguang Wu
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1407345722-61615-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
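    The hang scenario can be modeled in a few lines. struct qrwlock here is a toy with a single writer_queued flag, not the real queued rwlock, and the function names are invented:

```c
#include <stdbool.h>

/* Toy model of queued-rwlock fairness: outside interrupt context a
 * new reader queues behind any waiting writer, so the pattern
 *     read_lock(L); ...a writer queues on L...; read_lock(L);
 * now self-deadlocks, while in-interrupt readers bypass the queue. */
struct qrwlock { int readers; bool writer_queued; };

static bool qread_trylock(struct qrwlock *l, bool in_interrupt)
{
    if (l->writer_queued && !in_interrupt)
        return false;        /* fair path: wait behind the writer */
    l->readers++;
    return true;             /* unfair path reserved for interrupts */
}

static bool demo_recursive_read_hangs(void)
{
    struct qrwlock l = { 0, false };

    qread_trylock(&l, false);          /* first read_lock succeeds */
    l.writer_queued = true;            /* a writer queues in between */
    return !qread_trylock(&l, false);  /* second read_lock would block */
}

static bool demo_interrupt_reader_ok(void)
{
    struct qrwlock l = { 1, true };
    return qread_trylock(&l, true);    /* IRQ readers ignore fairness */
}
```

    This is exactly why lockdep needs the extra read state: process-context recursion that was safe on the old rwlock is now a deadlock.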
     
  • Specifically:
    Documentation/locking/lockdep-design.txt
    Documentation/locking/lockstat.txt
    Documentation/locking/mutex-design.txt
    Documentation/locking/rt-mutex-design.txt
    Documentation/locking/rt-mutex.txt
    Documentation/locking/spinlocks.txt
    Documentation/locking/ww-mutex-design.txt

    Signed-off-by: Davidlohr Bueso
    Acked-by: Randy Dunlap
    Signed-off-by: Peter Zijlstra
    Cc: jason.low2@hp.com
    Cc: aswin@hp.com
    Cc: Alexei Starovoitov
    Cc: Al Viro
    Cc: Andrew Morton
    Cc: Chris Mason
    Cc: Dan Streetman
    Cc: David Airlie
    Cc: Davidlohr Bueso
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Heiko Carstens
    Cc: Jason Low
    Cc: Josef Bacik
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Lubomir Rintel
    Cc: Masanari Iida
    Cc: Paul E. McKenney
    Cc: Randy Dunlap
    Cc: Tim Chen
    Cc: Vineet Gupta
    Cc: fengguang.wu@intel.com
    Link: http://lkml.kernel.org/r/1406752916-3341-6-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • 4badad35 ("locking/mutex: Disable optimistic spinning on some
    architectures") added a ARCH_SUPPORTS_ATOMIC_RMW flag to
    disable the mutex optimistic feature on specific archs.

    Because CONFIG_MUTEX_SPIN_ON_OWNER only depended on DEBUG and
    SMP, it was OK to have the ->owner field somewhat conditional.
    However, by adding a new variable to the mix, we can waste
    space with the unused field, ie: CONFIG_SMP &&
    (!CONFIG_MUTEX_SPIN_ON_OWNER && !CONFIG_DEBUG_MUTEX).

    Signed-off-by: Davidlohr Bueso
    Acked-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: aswin@hp.com
    Cc: Davidlohr Bueso
    Cc: Heiko Carstens
    Cc: Jason Low
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Tim Chen
    Link: http://lkml.kernel.org/r/1406752916-3341-5-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • When we fail to acquire the mutex in the fastpath, we end up calling
    __mutex_lock_common(). A *lot* goes on in this function. Move out the
    optimistic spinning code into mutex_optimistic_spin() and simplify
    the former a bit. Furthermore, this is similar to what we have in
    rwsems. No logical changes.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: aswin@hp.com
    Cc: mingo@kernel.org
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1406752916-3341-4-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso