09 Oct, 2019

1 commit

  • Since the following commit:

    b4adfe8e05f1 ("locking/lockdep: Remove unused argument in __lock_release")

    @nested is no longer used in lock_release(), so remove it from all
    lock_release() calls and friends.
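
    The resulting API change can be sketched as a before/after of the
    call signature (prototypes abridged from include/linux/lockdep.h as
    I recall them; the stub body and names below are illustrative only):

    ```c
    #include <stdio.h>

    /* Before: the @nested argument was threaded through every caller
     * even though __lock_release() had stopped using it:
     *
     *   void lock_release(struct lockdep_map *lock, int nested,
     *                     unsigned long ip);
     *
     * After: callers simply drop the middle argument:
     *
     *   void lock_release(struct lockdep_map *lock, unsigned long ip);
     */

    struct lockdep_map { const char *name; };   /* illustrative stub */

    static void lock_release(struct lockdep_map *lock, unsigned long ip)
    {
        printf("release %s from %#lx\n", lock->name, ip);
    }

    int main(void)
    {
        struct lockdep_map map = { "demo_lock" };
        lock_release(&map, 0x1234);  /* was: lock_release(&map, 1, 0x1234) */
        return 0;
    }
    ```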

    Signed-off-by: Qian Cai
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Daniel Vetter
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: airlied@linux.ie
    Cc: akpm@linux-foundation.org
    Cc: alexander.levin@microsoft.com
    Cc: daniel@iogearbox.net
    Cc: davem@davemloft.net
    Cc: dri-devel@lists.freedesktop.org
    Cc: duyuyang@gmail.com
    Cc: gregkh@linuxfoundation.org
    Cc: hannes@cmpxchg.org
    Cc: intel-gfx@lists.freedesktop.org
    Cc: jack@suse.com
    Cc: jlbec@evilplan.or
    Cc: joonas.lahtinen@linux.intel.com
    Cc: joseph.qi@linux.alibaba.com
    Cc: jslaby@suse.com
    Cc: juri.lelli@redhat.com
    Cc: maarten.lankhorst@linux.intel.com
    Cc: mark@fasheh.com
    Cc: mhocko@kernel.org
    Cc: mripard@kernel.org
    Cc: ocfs2-devel@oss.oracle.com
    Cc: rodrigo.vivi@intel.com
    Cc: sean@poorly.run
    Cc: st@kernel.org
    Cc: tj@kernel.org
    Cc: tytso@mit.edu
    Cc: vdavydov.dev@gmail.com
    Cc: vincent.guittot@linaro.org
    Cc: viro@zeniv.linux.org.uk
    Link: https://lkml.kernel.org/r/1568909380-32199-1-git-send-email-cai@lca.pw
    Signed-off-by: Ingo Molnar

    Qian Cai
     

25 Jul, 2019

1 commit

  • While reviewing rwsem down_slowpath, Will noticed ldsem had a copy of
    a bug we just found for rwsem.

    X = 0;

    CPU0                                    CPU1

    rwsem_down_read()
      for (;;) {
        set_current_state(TASK_UNINTERRUPTIBLE);

                                            X = 1;
                                            rwsem_up_write();
                                              rwsem_mark_wake()
                                                atomic_long_add(adjustment, &sem->count);
                                                smp_store_release(&waiter->task, NULL);

        if (!waiter.task)
          break;

        ...
      }

    r = X;

    Allows 'r == 0'.
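
    The fix pairs the writer's smp_store_release() with an acquire load
    on the reader side. A minimal userspace model using C11 atomics
    (the names are mine, not the kernel's) shows the corrected handoff:

    ```c
    #include <assert.h>
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    /* The writer publishes X and then clears waiter_task with release
     * semantics; the reader must observe waiter_task == 0 with ACQUIRE
     * semantics, otherwise a plain load would allow the subsequent
     * read of X to return 0. */
    static int X;                       /* data handed over             */
    static atomic_int waiter_task = 1;  /* stands in for waiter->task   */

    static void *writer(void *arg)
    {
        (void)arg;
        X = 1;                                         /* up_write() path */
        /* models smp_store_release(&waiter->task, NULL) */
        atomic_store_explicit(&waiter_task, 0, memory_order_release);
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, writer, NULL);

        /* Reader side: load with acquire semantics (the kernel fix
         * uses smp_load_acquire()), spinning until the handoff. */
        while (atomic_load_explicit(&waiter_task, memory_order_acquire))
            ;

        int r = X;   /* with acquire/release pairing, r == 1 is guaranteed */
        pthread_join(t, NULL);
        printf("r = %d\n", r);
        assert(r == 1);
        return 0;
    }
    ```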

    Reported-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Linus Torvalds
    Cc: Peter Hurley
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: 4898e640caf0 ("tty: Add timed, writer-prioritized rw semaphore")
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

05 Dec, 2018

3 commits

  • It seems that when ldsem_down_read() fails with a timeout, it
    misses the update to sem->wait_readers. Because of that, when the
    writer finally releases the write end of the semaphore,
    __ldsem_wake_readers() adjusts sem->count by the wrong value:
    sem->wait_readers * (LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS)

    I.e., if the update comes with one missed wait_readers decrement,
    sem->count will be 0x100000001, which means there is an active
    reader, and that will make any further writer fail to acquire the
    semaphore.

    It looks like this is dead code, because ldsem_down_read() is never
    called with a timeout other than MAX_SCHEDULE_TIMEOUT, so it might
    be worth deleting the timeout parameter and the error-path
    fallback.
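
    The 0x100000001 figure follows from the bias constants in
    drivers/tty/tty_ldsem.c (reproduced below as I recall them; verify
    against your tree). A small userspace sketch of the leak
    arithmetic, assuming a 64-bit long:

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Bias constants as defined in drivers/tty/tty_ldsem.c. */
    #define LDSEM_ACTIVE_MASK  0xffffffffL
    #define LDSEM_ACTIVE_BIAS  1L
    #define LDSEM_WAIT_BIAS    (-LDSEM_ACTIVE_MASK - 1)

    int main(void)
    {
        /* Per-reader adjustment applied by __ldsem_wake_readers(): */
        long adjust = LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS;

        /* One reader timed out without decrementing wait_readers: the
         * wake path still applies its adjustment, leaving a phantom
         * "active reader" in the low 32 bits forever. */
        long count = LDSEM_WAIT_BIAS;   /* one waiting reader recorded */
        count += adjust;                /* ...but that reader is gone   */

        printf("adjust = 0x%lx, leaked count = 0x%lx\n", adjust, count);
        assert(adjust == 0x100000001L);
        assert((count & LDSEM_ACTIVE_MASK) != 0);  /* looks active */
        return 0;
    }
    ```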

    Cc: Jiri Slaby
    Signed-off-by: Dmitry Safonov
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Safonov
     
  • For some reason ldsem has its own lockdep wrappers, make them go away.

    Cc: Jiri Slaby
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Dmitry Safonov
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
     
  • ldsem_down_read() will sleep if there is a pending writer in the
    queue. If the writer times out, readers in the queue should be
    woken up; otherwise they may miss a chance to acquire the semaphore
    until the last active reader does ldsem_up_read().

    There were a couple of reports where there was one active reader
    and other readers soft locked up:
    Showing all locks held in the system:
    2 locks held by khungtaskd/17:
    #0: (rcu_read_lock){......}, at: watchdog+0x124/0x6d1
    #1: (tasklist_lock){.+.+..}, at: debug_show_all_locks+0x72/0x2d3
    2 locks held by askfirst/123:
    #0: (&tty->ldisc_sem){.+.+.+}, at: ldsem_down_read+0x46/0x58
    #1: (&ldata->atomic_read_lock){+.+...}, at: n_tty_read+0x115/0xbe4

    Prevent readers from waiting for active readers to release the
    ldisc semaphore.

    Link: lkml.kernel.org/r/20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com
    Link: lkml.kernel.org/r/20180907045041.GF1110@shao2-debian
    Cc: Jiri Slaby
    Cc: Peter Zijlstra
    Cc: stable@vger.kernel.org
    Reported-by: kernel test robot
    Signed-off-by: Dmitry Safonov
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Safonov
     

28 Jun, 2018

1 commit

  • Mark found ldsem_cmpxchg() needed an (atomic_long_t *) cast to keep
    working after making the atomic_long interface type safe.

    Needing casts is bad form, which made me look at the code. There
    are no ld_semaphore::count users outside of these functions, so
    there is no reason it cannot be an atomic_long_t in the first
    place, obviating the need for the cast.

    That also ensures the loads use atomic_long_read(), which implies (at
    least) READ_ONCE() in order to guarantee single-copy-atomic loads.

    When using atomic_long_try_cmpxchg() the ldsem_cmpxchg() wrapper gets
    very thin (the only difference is not changing *old on success, which
    most callers don't seem to care about).
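
    For reference, C11's compare-exchange follows the same convention
    as atomic_long_try_cmpxchg(): the expected value is rewritten on
    failure, which is exactly what a retry loop wants. A userspace
    sketch:

    ```c
    #include <assert.h>
    #include <stdatomic.h>
    #include <stdio.h>

    int main(void)
    {
        atomic_long count = 0;
        long old = 0;

        /* Success: count matches old, the swap happens, and old is
         * left untouched. */
        _Bool ok = atomic_compare_exchange_strong(&count, &old, 42);
        printf("ok=%d old=%ld count=%ld\n", ok, old, atomic_load(&count));
        assert(ok && old == 0 && atomic_load(&count) == 42);

        /* Failure: count no longer matches; old is rewritten with the
         * current value, ready for the next retry -- the same
         * convention atomic_long_try_cmpxchg() follows. */
        ok = atomic_compare_exchange_strong(&count, &old, 7);
        printf("ok=%d old=%ld\n", ok, old);
        assert(!ok && old == 42);

        /* A second attempt with the refreshed 'old' now succeeds. */
        ok = atomic_compare_exchange_strong(&count, &old, 7);
        assert(ok && atomic_load(&count) == 7);
        return 0;
    }
    ```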

    So rework the whole thing to use atomic_long_t and its accessors
    directly.

    While there, fixup all the horrible comment styles.

    Cc: Peter Hurley
    Reported-by: Mark Rutland
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Mark Rutland
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
     

08 Nov, 2017

2 commits

  • Now that the SPDX tag is in all tty files, that identifies the license
    in a specific and legally-defined manner. So the extra GPL text wording
    can be removed as it is no longer needed at all.

    This is done on a quest to remove the 700+ different ways that files in
    the kernel describe the GPL license text. And there's unneeded stuff
    like the address (sometimes incorrect) for the FSF which is never
    needed.

    No copyright headers or other non-license-description text was removed.

    Cc: Jiri Slaby
    Cc: James Hogan
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • It's good to have SPDX identifiers in all files to make it easier to
    audit the kernel tree for correct licenses.

    Update the drivers/tty files with the correct SPDX license
    identifier based on the license text in the file itself. The SPDX
    identifier is a legally binding shorthand, which can be used
    instead of the full boilerplate text.

    This work is based on a script and data from Thomas Gleixner, Philippe
    Ombredanne, and Kate Stewart.

    Cc: Jiri Slaby
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: Chris Metcalf
    Cc: Jiri Kosina
    Cc: David Sterba
    Cc: James Hogan
    Cc: Rob Herring
    Cc: Eric Anholt
    Cc: Stefan Wahren
    Cc: Florian Fainelli
    Cc: Ray Jui
    Cc: Scott Branden
    Cc: bcm-kernel-feedback-list@broadcom.com
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Joachim Eastwood
    Cc: Matthias Brugger
    Cc: Masahiro Yamada
    Cc: Tobias Klauser
    Cc: Russell King
    Cc: Vineet Gupta
    Cc: Richard Genoud
    Cc: Alexander Shiyan
    Cc: Baruch Siach
    Cc: "Maciej W. Rozycki"
    Cc: "Uwe Kleine-König"
    Cc: Pat Gefre
    Cc: "Guilherme G. Piccoli"
    Cc: Jason Wessel
    Cc: Vladimir Zapolskiy
    Cc: Sylvain Lemieux
    Cc: Carlo Caione
    Cc: Kevin Hilman
    Cc: Liviu Dudau
    Cc: Sudeep Holla
    Cc: Lorenzo Pieralisi
    Cc: Andy Gross
    Cc: David Brown
    Cc: "Andreas Färber"
    Cc: Kevin Cernekee
    Cc: Laxman Dewangan
    Cc: Thierry Reding
    Cc: Jonathan Hunter
    Cc: Barry Song
    Cc: Patrice Chotard
    Cc: Maxime Coquelin
    Cc: Alexandre Torgue
    Cc: "David S. Miller"
    Cc: Peter Korsgaard
    Cc: Timur Tabi
    Cc: Tony Prisk
    Cc: Michal Simek
    Cc: "Sören Brinkmann"
    Cc: Thomas Gleixner
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Jiri Slaby
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

02 Mar, 2017

2 commits


14 Jan, 2017

2 commits

  • This is a nasty interface and setting the state of a foreign task must
    not be done. As of the following commit:

    be628be0956 ("bcache: Make gc wakeup sane, remove set_task_state()")

    ... everyone in the kernel calls set_task_state() with current, allowing
    the helper to be removed.

    However, as the comment indicates, it is still around for those
    archs where computing current is more expensive than using a
    pointer, at least in theory. An important arch that is affected is
    arm64; however, this has now been addressed [1] and performance is
    on par, making no difference between the two calls.

    Of all the callers, if any, it's the locking bits that would care most
    about this -- ie: we end up passing a tsk pointer to a lot of the lock
    slowpath, and setting ->state on that. The following numbers are based
    on two tests: a custom ad-hoc microbenchmark that just measures
    latencies (for ~65 million calls) between get_task_state() vs
    get_current_state().

    Secondly, for a higher-level overview, an unlink microbenchmark was
    used, which pounds on a single file with open, close, unlink combos
    with increasing thread counts (up to 4x ncpus). While the workload
    is quite unrealistic, it does contend a lot on the inode mutex, or
    now rwsem.

    [1] https://lkml.kernel.org/r/1483468021-8237-1-git-send-email-mark.rutland@arm.com

    == 1. x86-64 ==

    Avg runtime set_task_state(): 601 msecs
    Avg runtime set_current_state(): 552 msecs

    vanilla dirty
    Hmean unlink1-processes-2 36089.26 ( 0.00%) 38977.33 ( 8.00%)
    Hmean unlink1-processes-5 28555.01 ( 0.00%) 29832.55 ( 4.28%)
    Hmean unlink1-processes-8 37323.75 ( 0.00%) 44974.57 ( 20.50%)
    Hmean unlink1-processes-12 43571.88 ( 0.00%) 44283.01 ( 1.63%)
    Hmean unlink1-processes-21 34431.52 ( 0.00%) 38284.45 ( 11.19%)
    Hmean unlink1-processes-30 34813.26 ( 0.00%) 37975.17 ( 9.08%)
    Hmean unlink1-processes-48 37048.90 ( 0.00%) 39862.78 ( 7.59%)
    Hmean unlink1-processes-79 35630.01 ( 0.00%) 36855.30 ( 3.44%)
    Hmean unlink1-processes-110 36115.85 ( 0.00%) 39843.91 ( 10.32%)
    Hmean unlink1-processes-141 32546.96 ( 0.00%) 35418.52 ( 8.82%)
    Hmean unlink1-processes-172 34674.79 ( 0.00%) 36899.21 ( 6.42%)
    Hmean unlink1-processes-203 37303.11 ( 0.00%) 36393.04 ( -2.44%)
    Hmean unlink1-processes-224 35712.13 ( 0.00%) 36685.96 ( 2.73%)

    == 2. ppc64le ==

    Avg runtime set_task_state(): 938 msecs
    Avg runtime set_current_state: 940 msecs

    vanilla dirty
    Hmean unlink1-processes-2 19269.19 ( 0.00%) 30704.50 ( 59.35%)
    Hmean unlink1-processes-5 20106.15 ( 0.00%) 21804.15 ( 8.45%)
    Hmean unlink1-processes-8 17496.97 ( 0.00%) 17243.28 ( -1.45%)
    Hmean unlink1-processes-12 14224.15 ( 0.00%) 17240.21 ( 21.20%)
    Hmean unlink1-processes-21 14155.66 ( 0.00%) 15681.23 ( 10.78%)
    Hmean unlink1-processes-30 14450.70 ( 0.00%) 15995.83 ( 10.69%)
    Hmean unlink1-processes-48 16945.57 ( 0.00%) 16370.42 ( -3.39%)
    Hmean unlink1-processes-79 15788.39 ( 0.00%) 14639.27 ( -7.28%)
    Hmean unlink1-processes-110 14268.48 ( 0.00%) 14377.40 ( 0.76%)
    Hmean unlink1-processes-141 14023.65 ( 0.00%) 16271.69 ( 16.03%)
    Hmean unlink1-processes-172 13417.62 ( 0.00%) 16067.55 ( 19.75%)
    Hmean unlink1-processes-203 15293.08 ( 0.00%) 15440.40 ( 0.96%)
    Hmean unlink1-processes-234 13719.32 ( 0.00%) 16190.74 ( 18.01%)
    Hmean unlink1-processes-265 16400.97 ( 0.00%) 16115.22 ( -1.74%)
    Hmean unlink1-processes-296 14388.60 ( 0.00%) 16216.13 ( 12.70%)
    Hmean unlink1-processes-320 15771.85 ( 0.00%) 15905.96 ( 0.85%)

    x86-64 (known to be fast for get_current()/this_cpu_read_stable() caching)
    and ppc64 (with paca) show similar improvements in the unlink microbenches.
    The small delta for ppc64 (2ms) does not represent the gains on the
    unlink runs. In the case of x86, there was a decent amount of
    variation in the latency runs, but always within a 20 to 50ms
    increase; ppc was more constant.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dave@stgolabs.net
    Cc: mark.rutland@arm.com
    Link: http://lkml.kernel.org/r/1483479794-14013-5-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • This patch replaces the tsk pointer dereference (which is obviously
    == current) with a direct use of the get_current() macro. This
    makes the removal of setting foreign task states smoother and
    painfully obvious. It is a performance win on some archs such as
    x86-64 and ppc64; arm64 is no longer an issue.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Greg Kroah-Hartman
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: mark.rutland@arm.com
    Link: http://lkml.kernel.org/r/1483479794-14013-3-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

14 Dec, 2015

2 commits


11 May, 2015

1 commit


10 Feb, 2014

1 commit

  • The "int check" argument of lock_acquire() and held_lock->check are
    misleading. This is actually a boolean: 2 means "true", everything
    else is "false".

    And there is no need to pass 1 or 0 to lock_acquire() depending on
    CONFIG_PROVE_LOCKING: __lock_acquire() checks prove_locking at the
    start and clears "check" if !CONFIG_PROVE_LOCKING.

    Note: probably we can simply kill this member/arg. The only explicit
    user of check => 0 is rcu_lock_acquire(), perhaps we can change it to
    use lock_acquire(trylock =>, read => 2). __lockdep_no_validate means
    check => 0 implicitly, but we can change validate_chain() to check
    hlock->instance->key instead. Not to mention it would be nice to get
    rid of lockdep_set_novalidate_class().
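
    The "2 means true" convention can be made explicit with a trivial
    boolean mapping (an illustrative sketch, not the kernel's code):

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Old convention: check == 2 meant "full validation", anything
     * else meant "don't validate".  Converting to bool makes call
     * sites self-documenting. */
    static bool check_to_bool(int check)
    {
        return check == 2;
    }

    int main(void)
    {
        printf("check=2 -> %d, check=1 -> %d\n",
               check_to_bool(2), check_to_bool(1));
        assert(check_to_bool(2));
        assert(!check_to_bool(1) && !check_to_bool(0));
        return 0;
    }
    ```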

    Signed-off-by: Oleg Nesterov
    Cc: Dave Jones
    Cc: Greg Kroah-Hartman
    Cc: Linus Torvalds
    Cc: Paul McKenney
    Cc: Steven Rostedt
    Cc: Alan Stern
    Cc: Sasha Levin
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140120182006.GA26495@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     

17 Dec, 2013

1 commit

  • When a controlling tty is being hung up and the hang up is
    waiting for a just-signalled tty reader or writer to exit, and a new tty
    reader/writer tries to acquire an ldisc reference concurrently with the
    ldisc reference release from the signalled reader/writer, the hangup
    can hang. The new reader/writer is sleeping in ldsem_down_read() and the
    hangup is sleeping in ldsem_down_write() [1].

    The new reader/writer fails to wakeup the waiting hangup because the
    wrong lock count value is checked (the old lock count rather than the new
    lock count) to see if the lock is unowned.

    Change the helper function to return the new lock count if the
    cmpxchg was successful, and document this behavior.
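
    The contract after the fix can be modeled in userspace C (names and
    structure are mine, not the kernel's): the helper reports the new
    count, so the caller's "is the lock now unowned?" check sees
    post-update state:

    ```c
    #include <assert.h>
    #include <stdatomic.h>
    #include <stdio.h>

    /* Decrement the count with a cmpxchg loop and report the NEW
     * value -- checking the old value would wrongly conclude the lock
     * is still owned and skip waking a waiting writer. */
    static long lock_count_dec(atomic_long *count)
    {
        long old = atomic_load(count), new;
        do {
            new = old - 1;
        } while (!atomic_compare_exchange_weak(count, &old, new));
        return new;   /* post-update state, not the stale snapshot */
    }

    int main(void)
    {
        atomic_long count = 1;          /* one reader holds the lock */
        long new = lock_count_dec(&count);

        printf("new = %ld\n", new);
        assert(new == 0);               /* lock is now unowned */
        return 0;
    }
    ```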

    [1] edited dmesg log from reporter

    SysRq : Show Blocked State
    task PC stack pid father
    systemd D ffff88040c4f0000 0 1 0 0x00000000
    ffff88040c49fbe0 0000000000000046 ffff88040c4a0000 ffff88040c49ffd8
    00000000001d3980 00000000001d3980 ffff88040c4a0000 ffff88040593d840
    ffff88040c49fb40 ffffffff810a4cc0 0000000000000006 0000000000000023
    Call Trace:
    [] ? sched_clock_cpu+0x9f/0xe4
    [] ? sched_clock_cpu+0x9f/0xe4
    [] ? sched_clock_cpu+0x9f/0xe4
    [] ? sched_clock_cpu+0x9f/0xe4
    [] schedule+0x24/0x5e
    [] schedule_timeout+0x15b/0x1ec
    [] ? sched_clock_cpu+0x9f/0xe4
    [] ? _raw_spin_unlock_irq+0x24/0x26
    [] down_read_failed+0xe3/0x1b9
    [] ldsem_down_read+0x8b/0xa5
    [] ? tty_ldisc_ref_wait+0x1b/0x44
    [] tty_ldisc_ref_wait+0x1b/0x44
    [] tty_write+0x7d/0x28a
    [] redirected_tty_write+0x8d/0x98
    [] ? tty_write+0x28a/0x28a
    [] do_loop_readv_writev+0x56/0x79
    [] do_readv_writev+0x1b0/0x1ff
    [] ? do_vfs_ioctl+0x32a/0x489
    [] ? final_putname+0x1d/0x3a
    [] vfs_writev+0x2e/0x49
    [] SyS_writev+0x47/0xaa
    [] system_call_fastpath+0x16/0x1b
    bash D ffffffff81c104c0 0 5469 5302 0x00000082
    ffff8800cf817ac0 0000000000000046 ffff8804086b22a0 ffff8800cf817fd8
    00000000001d3980 00000000001d3980 ffff8804086b22a0 ffff8800cf817a48
    000000000000b9a0 ffff8800cf817a78 ffffffff81004675 ffff8800cf817a44
    Call Trace:
    [] ? dump_trace+0x165/0x29c
    [] ? sched_clock_cpu+0x9f/0xe4
    [] ? save_stack_trace+0x26/0x41
    [] schedule+0x24/0x5e
    [] schedule_timeout+0x15b/0x1ec
    [] ? sched_clock_cpu+0x9f/0xe4
    [] ? down_write_failed+0xa3/0x1c9
    [] ? _raw_spin_unlock_irq+0x24/0x26
    [] down_write_failed+0xab/0x1c9
    [] ldsem_down_write+0x79/0xb1
    [] ? tty_ldisc_lock_pair_timeout+0xa5/0xd9
    [] tty_ldisc_lock_pair_timeout+0xa5/0xd9
    [] tty_ldisc_hangup+0xc4/0x218
    [] __tty_hangup+0x2e2/0x3ed
    [] disassociate_ctty+0x63/0x226
    [] do_exit+0x79f/0xa11
    [] ? get_signal_to_deliver+0x206/0x62f
    [] ? lock_release_holdtime.part.8+0xf/0x16e
    [] do_group_exit+0x47/0xb5
    [] get_signal_to_deliver+0x241/0x62f
    [] do_signal+0x43/0x59d
    [] ? __audit_syscall_exit+0x21a/0x2a8
    [] ? lock_release_holdtime.part.8+0xf/0x16e
    [] do_notify_resume+0x54/0x6c
    [] int_signal+0x12/0x17

    Reported-by: Sami Farin
    Cc: # 3.12.x
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     

21 May, 2013

1 commit

  • The semantics of a rw semaphore are almost ideally suited
    for tty line discipline lifetime management; multiple active
    threads obtain "references" (read locks) while performing i/o
    to prevent the loss or change of the current line discipline
    (write lock).

    Unfortunately, the existing rw_semaphore is ill-suited in other
    ways:
    1) The TIOCSETD ioctl (change line discipline) expects to return an
       error if the line discipline cannot be exclusively locked within
       5 secs. Lock wait timeouts are not supported by rwsem.
    2) A tty hangup is expected to halt and scrap pending i/o, so
       exclusive locking must be prioritized. Writer priority is not
       supported by rwsem.

    Add ld_semaphore which implements these requirements in a
    semantically similar way to rw_semaphore.

    Writer priority is handled by separate wait lists for readers and
    writers. Pending write waits are prioritized before existing read
    waits and prevent further read locks.

    Wait timeouts are trivially added, but obviously change the lock
    semantics as lock attempts can fail (but only due to timeout).

    This implementation incorporates the write-lock stealing work of
    Michel Lespinasse.
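
    The two requirements, writer priority and timed lock attempts, can
    be sketched in userspace with pthreads (a simplified model under my
    own names, not the kernel implementation):

    ```c
    #include <assert.h>
    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <time.h>

    struct ldsem {
        pthread_mutex_t lock;
        pthread_cond_t  cond;
        int readers;         /* active readers                      */
        int writer;          /* 1 while a writer holds the lock     */
        int writers_waiting; /* pending writers block new readers   */
    };

    static void ldsem_init(struct ldsem *s)
    {
        pthread_mutex_init(&s->lock, NULL);
        pthread_cond_init(&s->cond, NULL);
        s->readers = s->writer = s->writers_waiting = 0;
    }

    static void ldsem_down_read(struct ldsem *s)
    {
        pthread_mutex_lock(&s->lock);
        /* Writer priority: a merely *waiting* writer already blocks
         * new readers. */
        while (s->writer || s->writers_waiting)
            pthread_cond_wait(&s->cond, &s->lock);
        s->readers++;
        pthread_mutex_unlock(&s->lock);
    }

    static void ldsem_up_read(struct ldsem *s)
    {
        pthread_mutex_lock(&s->lock);
        if (--s->readers == 0)
            pthread_cond_broadcast(&s->cond);
        pthread_mutex_unlock(&s->lock);
    }

    /* Timed write lock: 0 on success, ETIMEDOUT on timeout. */
    static int ldsem_down_write_timeout(struct ldsem *s, long ms)
    {
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        ts.tv_sec  += ms / 1000;
        ts.tv_nsec += (ms % 1000) * 1000000L;
        if (ts.tv_nsec >= 1000000000L) { ts.tv_sec++; ts.tv_nsec -= 1000000000L; }

        pthread_mutex_lock(&s->lock);
        s->writers_waiting++;
        int err = 0;
        while (s->readers || s->writer) {
            err = pthread_cond_timedwait(&s->cond, &s->lock, &ts);
            if (err == ETIMEDOUT)
                break;
        }
        s->writers_waiting--;
        if (!err)
            s->writer = 1;
        else
            pthread_cond_broadcast(&s->cond); /* let readers retry */
        pthread_mutex_unlock(&s->lock);
        return err;
    }

    int main(void)
    {
        struct ldsem s;
        ldsem_init(&s);

        ldsem_down_read(&s);                        /* a reader holds the ldisc */
        int err = ldsem_down_write_timeout(&s, 50); /* TIOCSETD-style attempt   */
        printf("timed write attempt: %s\n",
               err == ETIMEDOUT ? "timeout" : "ok");
        assert(err == ETIMEDOUT);

        ldsem_up_read(&s);
        assert(ldsem_down_write_timeout(&s, 50) == 0);
        return 0;
    }
    ```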

    Cc: Michel Lespinasse
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley