09 Oct, 2019
1 commit
-
Since the following commit:

  b4adfe8e05f1 ("locking/lockdep: Remove unused argument in __lock_release")

@nested is no longer used in lock_release(), so remove it from all
lock_release() calls and friends.

Signed-off-by: Qian Cai
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Acked-by: Daniel Vetter
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: airlied@linux.ie
Cc: akpm@linux-foundation.org
Cc: alexander.levin@microsoft.com
Cc: daniel@iogearbox.net
Cc: davem@davemloft.net
Cc: dri-devel@lists.freedesktop.org
Cc: duyuyang@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: hannes@cmpxchg.org
Cc: intel-gfx@lists.freedesktop.org
Cc: jack@suse.com
Cc: jlbec@evilplan.or
Cc: joonas.lahtinen@linux.intel.com
Cc: joseph.qi@linux.alibaba.com
Cc: jslaby@suse.com
Cc: juri.lelli@redhat.com
Cc: maarten.lankhorst@linux.intel.com
Cc: mark@fasheh.com
Cc: mhocko@kernel.org
Cc: mripard@kernel.org
Cc: ocfs2-devel@oss.oracle.com
Cc: rodrigo.vivi@intel.com
Cc: sean@poorly.run
Cc: st@kernel.org
Cc: tj@kernel.org
Cc: tytso@mit.edu
Cc: vdavydov.dev@gmail.com
Cc: vincent.guittot@linaro.org
Cc: viro@zeniv.linux.org.uk
Link: https://lkml.kernel.org/r/1568909380-32199-1-git-send-email-cai@lca.pw
Signed-off-by: Ingo Molnar
25 Jul, 2019
1 commit
-
While reviewing rwsem down_slowpath, Will noticed ldsem had a copy of
a bug we just found for rwsem.

  X = 0;

  CPU0                                      CPU1

  rwsem_down_read()
    for (;;) {
      set_current_state(TASK_UNINTERRUPTIBLE);

                                            X = 1;
                                            rwsem_up_write();
                                              rwsem_mark_wake()
                                                atomic_long_add(adjustment, &sem->count);
                                                smp_store_release(&waiter->task, NULL);

      if (!waiter.task)
        break;

      ...
    }

  r = X;

Allows 'r == 0'.

Reported-by: Will Deacon
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Linus Torvalds
Cc: Peter Hurley
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Fixes: 4898e640caf0 ("tty: Add timed, writer-prioritized rw semaphore")
Signed-off-by: Ingo Molnar
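
The litmus test in the commit above can be modelled in user space. The sketch below is not the kernel change itself: it uses C11 atomics and pthreads, and the names X, waiter_task, waker and run_litmus_once are hypothetical stand-ins. It assumes the remedy is to read waiter.task with ACQUIRE semantics, so that the load pairs with the smp_store_release() in the waker and the later read of X cannot observe the stale value.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the objects in the litmus test. */
static int X;                        /* plain data written before the wakeup */
static _Atomic(void *) waiter_task;  /* models waiter->task; NULL means "woken" */

/* CPU1: publish X, then clear waiter->task with RELEASE semantics. */
static void *waker(void *arg)
{
	(void)arg;
	X = 1;
	atomic_store_explicit(&waiter_task, NULL, memory_order_release);
	return NULL;
}

/*
 * CPU0: wait until waiter->task is NULL, then read X.
 * The ACQUIRE load pairs with the RELEASE store in waker(), so the read
 * of X is ordered after the write and must return 1. With a relaxed load
 * in the loop, r == 0 would be a permitted outcome.
 * Returns the value read from X, or -1 if the thread could not be created.
 */
int run_litmus_once(void)
{
	pthread_t t;
	int r;

	X = 0;
	atomic_store_explicit(&waiter_task, (void *)&X, memory_order_relaxed);
	if (pthread_create(&t, NULL, waker, NULL) != 0)
		return -1;

	while (atomic_load_explicit(&waiter_task, memory_order_acquire) != NULL)
		;	/* the kernel would sleep here instead of spinning */

	r = X;
	pthread_join(t, NULL);
	return r;
}
```

Calling run_litmus_once() in a loop should always return 1 under these assumptions; build with a C11 compiler and pthread support (for example with the -pthread flag).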
05 Dec, 2018
3 commits
-
It seems that when ldsem_down_read() fails with a timeout, it misses the
update of sem->wait_readers. Because of that, when the writer finally
releases the write end of the semaphore, __ldsem_wake_readers() adjusts
sem->count by the wrong value:

  sem->wait_readers * (LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS)

I.e., if the update comes with 1 missed wait_readers decrement, sem->count
will be 0x100000001, which means that there is an active reader, and it
will make any further writer fail to acquire the semaphore.

It looks like this is dead code, because ldsem_down_read() is never
called with a timeout other than MAX_SCHEDULE_TIMEOUT, so it might be
worth deleting the timeout parameter and the error path fall-back.

Cc: Jiri Slaby
Signed-off-by: Dmitry Safonov
Signed-off-by: Greg Kroah-Hartman
-
For some reason ldsem has its own lockdep wrappers, make them go away.
Cc: Jiri Slaby
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Dmitry Safonov
Signed-off-by: Greg Kroah-Hartman
-
ldsem_down_read() will sleep if there is a pending writer in the queue.
If the writer times out, readers in the queue should be woken up,
otherwise they may miss a chance to acquire the semaphore until the last
active reader does ldsem_up_read().

There were a couple of reports where there was one active reader and
other readers soft locked up:

  Showing all locks held in the system:
  2 locks held by khungtaskd/17:
   #0: (rcu_read_lock){......}, at: watchdog+0x124/0x6d1
   #1: (tasklist_lock){.+.+..}, at: debug_show_all_locks+0x72/0x2d3
  2 locks held by askfirst/123:
   #0: (&tty->ldisc_sem){.+.+.+}, at: ldsem_down_read+0x46/0x58
   #1: (&ldata->atomic_read_lock){+.+...}, at: n_tty_read+0x115/0xbe4

Prevent readers from waiting for active readers to release the ldisc
semaphore.

Link: lkml.kernel.org/r/20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com
Link: lkml.kernel.org/r/20180907045041.GF1110@shao2-debian
Cc: Jiri Slaby
Cc: Peter Zijlstra
Cc: stable@vger.kernel.org
Reported-by: kernel test robot
Signed-off-by: Dmitry Safonov
Signed-off-by: Greg Kroah-Hartman
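
The count arithmetic described in the first of the three commits above (the missed sem->wait_readers update) can be checked with a few lines of C. This is a minimal sketch, assuming the 64-bit bias layout in which the low 32 bits of the count hold the number of active holders; the macro values and the helper names wake_readers_adjust and active_holders are assumptions made for illustration, not code copied from the kernel.

```c
#include <stdint.h>

/* Assumed 64-bit layout: low 32 bits count active holders. */
#define LDSEM_ACTIVE_MASK  INT64_C(0xffffffff)
#define LDSEM_ACTIVE_BIAS  INT64_C(1)
#define LDSEM_WAIT_BIAS    (-LDSEM_ACTIVE_MASK - 1)

/* Adjustment added to sem->count when waking wait_readers readers. */
int64_t wake_readers_adjust(int64_t wait_readers)
{
	return wait_readers * (LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS);
}

/* Number of active holders encoded in the low bits of the count. */
int64_t active_holders(int64_t count)
{
	return count & LDSEM_ACTIVE_MASK;
}
```

With one stale wait_readers entry and an otherwise unlocked count of 0, wake_readers_adjust(1) yields 0x100000001, and active_holders() of that value is 1: the semaphore appears to have an active reader that does not exist, which matches the failure the commit message describes.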
28 Jun, 2018
1 commit
-
Mark found ldsem_cmpxchg() needed an (atomic_long_t *) cast to keep
working after making the atomic_long interface type safe.

Needing casts is bad form, which made me look at the code. There are no
ld_semaphore::count users outside of these functions so there is no
reason why it can not be an atomic_long_t in the first place, obviating
the need for this cast.

That also ensures the loads use atomic_long_read(), which implies (at
least) READ_ONCE() in order to guarantee single-copy-atomic loads.

When using atomic_long_try_cmpxchg() the ldsem_cmpxchg() wrapper gets
very thin (the only difference is not changing *old on success, which
most callers don't seem to care about).

So rework the whole thing to use atomic_long_t and its accessors
directly.

While there, fixup all the horrible comment styles.

Cc: Peter Hurley
Reported-by: Mark Rutland
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Mark Rutland
Signed-off-by: Greg Kroah-Hartman
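
The difference in how *old is treated, mentioned in the commit above, can be illustrated in user space with C11 atomics. This is a sketch under stated assumptions: model_ldsem_cmpxchg and model_try_cmpxchg are hypothetical names, and the first function only models the wrapper's contract as the commit messages in this log describe it (the value in *old becomes the new count on success); it is not the kernel source.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Model of the old wrapper's contract (an assumption based on the commit
 * text): on success *old is updated to the new value, on failure *old
 * receives the value actually found in the counter.
 */
bool model_ldsem_cmpxchg(atomic_long *count, long *old, long new_val)
{
	long expected = *old;

	if (atomic_compare_exchange_strong(count, &expected, new_val)) {
		*old = new_val;
		return true;
	}
	*old = expected;
	return false;
}

/*
 * Model of a try_cmpxchg-style call: identical on failure, but *old is
 * left untouched on success.
 */
bool model_try_cmpxchg(atomic_long *count, long *old, long new_val)
{
	return atomic_compare_exchange_strong(count, old, new_val);
}
```

A caller that reads *old after a successful exchange therefore sees the new count with the first helper and the previous count with the second, which is the one behavioural difference to check when converting call sites.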
08 Nov, 2017
2 commits
-
Now that the SPDX tag is in all tty files, that identifies the license
in a specific and legally-defined manner. So the extra GPL text wording
can be removed as it is no longer needed at all.

This is done on a quest to remove the 700+ different ways that files in
the kernel describe the GPL license text. And there's unneeded stuff
like the address (sometimes incorrect) for the FSF which is never
needed.

No copyright headers or other non-license-description text was removed.

Cc: Jiri Slaby
Cc: James Hogan
Signed-off-by: Greg Kroah-Hartman
-
It's good to have SPDX identifiers in all files to make it easier to
audit the kernel tree for correct licenses.

Update the drivers/tty files with the correct SPDX license
identifier based on the license text in the file itself. The SPDX
identifier is a legally binding shorthand, which can be used instead of
the full boiler plate text.

This work is based on a script and data from Thomas Gleixner, Philippe
Ombredanne, and Kate Stewart.

Cc: Jiri Slaby
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Chris Metcalf
Cc: Jiri Kosina
Cc: David Sterba
Cc: James Hogan
Cc: Rob Herring
Cc: Eric Anholt
Cc: Stefan Wahren
Cc: Florian Fainelli
Cc: Ray Jui
Cc: Scott Branden
Cc: bcm-kernel-feedback-list@broadcom.com
Cc: "James E.J. Bottomley"
Cc: Helge Deller
Cc: Joachim Eastwood
Cc: Matthias Brugger
Cc: Masahiro Yamada
Cc: Tobias Klauser
Cc: Russell King
Cc: Vineet Gupta
Cc: Richard Genoud
Cc: Alexander Shiyan
Cc: Baruch Siach
Cc: "Maciej W. Rozycki"
Cc: "Uwe Kleine-König"
Cc: Pat Gefre
Cc: "Guilherme G. Piccoli"
Cc: Jason Wessel
Cc: Vladimir Zapolskiy
Cc: Sylvain Lemieux
Cc: Carlo Caione
Cc: Kevin Hilman
Cc: Liviu Dudau
Cc: Sudeep Holla
Cc: Lorenzo Pieralisi
Cc: Andy Gross
Cc: David Brown
Cc: "Andreas Färber"
Cc: Kevin Cernekee
Cc: Laxman Dewangan
Cc: Thierry Reding
Cc: Jonathan Hunter
Cc: Barry Song
Cc: Patrice Chotard
Cc: Maxime Coquelin
Cc: Alexandre Torgue
Cc: "David S. Miller"
Cc: Peter Korsgaard
Cc: Timur Tabi
Cc: Tony Prisk
Cc: Michal Simek
Cc: "Sören Brinkmann"
Cc: Thomas Gleixner
Cc: Kate Stewart
Cc: Philippe Ombredanne
Cc: Jiri Slaby
Signed-off-by: Greg Kroah-Hartman
02 Mar, 2017
2 commits
-
…ed APIs from <linux/sched.h> to <linux/sched/task.h>
But first update usage sites with the new header dependency.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
We are going to split out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar
14 Jan, 2017
2 commits
-
This is a nasty interface and setting the state of a foreign task must
not be done. As of the following commit:

  be628be0956 ("bcache: Make gc wakeup sane, remove set_task_state()")

... everyone in the kernel calls set_task_state() with current, allowing
the helper to be removed.

However, as the comment indicates, it is still around for those archs
where computing current is more expensive than using a pointer, at least
in theory. An important arch that is affected is arm64, however this has
been addressed now [1] and performance is up to par, making no difference
with either call.

Of all the callers, if any, it's the locking bits that would care most
about this -- ie: we end up passing a tsk pointer to a lot of the lock
slowpath, and setting ->state on that. The following numbers are based
on two tests: a custom ad-hoc microbenchmark that just measures
latencies (for ~65 million calls) between set_task_state() vs
set_current_state().

Secondly, for a higher overview, an unlink microbenchmark was used,
which pounds on a single file with open, close, unlink combos with
increasing thread counts (up to 4x ncpus). While the workload is quite
unrealistic, it does contend a lot on the inode mutex or now rwsem.

[1] https://lkml.kernel.org/r/1483468021-8237-1-git-send-email-mark.rutland@arm.com

== 1. x86-64 ==

Avg runtime set_task_state():    601 msecs
Avg runtime set_current_state(): 552 msecs

                                 vanilla                 dirty
Hmean    unlink1-processes-2     36089.26 (  0.00%)    38977.33 (  8.00%)
Hmean    unlink1-processes-5     28555.01 (  0.00%)    29832.55 (  4.28%)
Hmean    unlink1-processes-8     37323.75 (  0.00%)    44974.57 ( 20.50%)
Hmean    unlink1-processes-12    43571.88 (  0.00%)    44283.01 (  1.63%)
Hmean    unlink1-processes-21    34431.52 (  0.00%)    38284.45 ( 11.19%)
Hmean    unlink1-processes-30    34813.26 (  0.00%)    37975.17 (  9.08%)
Hmean    unlink1-processes-48    37048.90 (  0.00%)    39862.78 (  7.59%)
Hmean    unlink1-processes-79    35630.01 (  0.00%)    36855.30 (  3.44%)
Hmean    unlink1-processes-110   36115.85 (  0.00%)    39843.91 ( 10.32%)
Hmean    unlink1-processes-141   32546.96 (  0.00%)    35418.52 (  8.82%)
Hmean    unlink1-processes-172   34674.79 (  0.00%)    36899.21 (  6.42%)
Hmean    unlink1-processes-203   37303.11 (  0.00%)    36393.04 ( -2.44%)
Hmean    unlink1-processes-224   35712.13 (  0.00%)    36685.96 (  2.73%)

== 2. ppc64le ==

Avg runtime set_task_state():    938 msecs
Avg runtime set_current_state(): 940 msecs

                                 vanilla                 dirty
Hmean    unlink1-processes-2     19269.19 (  0.00%)    30704.50 ( 59.35%)
Hmean    unlink1-processes-5     20106.15 (  0.00%)    21804.15 (  8.45%)
Hmean    unlink1-processes-8     17496.97 (  0.00%)    17243.28 ( -1.45%)
Hmean    unlink1-processes-12    14224.15 (  0.00%)    17240.21 ( 21.20%)
Hmean    unlink1-processes-21    14155.66 (  0.00%)    15681.23 ( 10.78%)
Hmean    unlink1-processes-30    14450.70 (  0.00%)    15995.83 ( 10.69%)
Hmean    unlink1-processes-48    16945.57 (  0.00%)    16370.42 ( -3.39%)
Hmean    unlink1-processes-79    15788.39 (  0.00%)    14639.27 ( -7.28%)
Hmean    unlink1-processes-110   14268.48 (  0.00%)    14377.40 (  0.76%)
Hmean    unlink1-processes-141   14023.65 (  0.00%)    16271.69 ( 16.03%)
Hmean    unlink1-processes-172   13417.62 (  0.00%)    16067.55 ( 19.75%)
Hmean    unlink1-processes-203   15293.08 (  0.00%)    15440.40 (  0.96%)
Hmean    unlink1-processes-234   13719.32 (  0.00%)    16190.74 ( 18.01%)
Hmean    unlink1-processes-265   16400.97 (  0.00%)    16115.22 ( -1.74%)
Hmean    unlink1-processes-296   14388.60 (  0.00%)    16216.13 ( 12.70%)
Hmean    unlink1-processes-320   15771.85 (  0.00%)    15905.96 (  0.85%)

x86-64 (known to be fast for get_current()/this_cpu_read_stable() caching)
and ppc64 (with paca) show similar improvements in the unlink microbenches.
The small delta for ppc64 (2ms) does not represent the gains on the unlink
runs. In the case of x86, there was a decent amount of variation in the
latency runs (but always within a 20 to 50ms increase); ppc was more constant.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: dave@stgolabs.net
Cc: mark.rutland@arm.com
Link: http://lkml.kernel.org/r/1483479794-14013-5-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar
-
This patch effectively replaces the tsk pointer dereference
(which is obviously == current) with direct use of the get_current()
macro. This is to make the removal of setting foreign task
states smoother and painfully obvious. Performance win on some
archs such as x86-64 and ppc64 -- arm64 is no longer an issue.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: mark.rutland@arm.com
Link: http://lkml.kernel.org/r/1483479794-14013-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar
14 Dec, 2015
2 commits
-
This function compiles to 491 bytes of machine code.
Signed-off-by: Denys Vlasenko
CC: Jiri Slaby
CC: linux-serial@vger.kernel.org
Reviewed-by: Peter Hurley
Signed-off-by: Greg Kroah-Hartman
-
This function compiles to 479 bytes of machine code.
Signed-off-by: Denys Vlasenko
CC: Jiri Slaby
CC: linux-serial@vger.kernel.org
Reviewed-by: Peter Hurley
Signed-off-by: Greg Kroah-Hartman
11 May, 2015
1 commit
-
We should not be doing assignments within an if () block
so fix up the code to not do this.

Change was created using Coccinelle.

CC: Jiri Slaby
Signed-off-by: Greg Kroah-Hartman
10 Feb, 2014
1 commit
-
The "int check" argument of lock_acquire() and held_lock->check are
misleading. This is actually a boolean: 2 means "true", everything
else is "false".

And there is no need to pass 1 or 0 to lock_acquire() depending on
CONFIG_PROVE_LOCKING, __lock_acquire() checks prove_locking at the
start and clears "check" if !CONFIG_PROVE_LOCKING.

Note: probably we can simply kill this member/arg. The only explicit
user of check => 0 is rcu_lock_acquire(), perhaps we can change it to
use lock_acquire(trylock =>, read => 2). __lockdep_no_validate means
check => 0 implicitly, but we can change validate_chain() to check
hlock->instance->key instead. Not to mention it would be nice to get
rid of lockdep_set_novalidate_class().

Signed-off-by: Oleg Nesterov
Cc: Dave Jones
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Paul McKenney
Cc: Steven Rostedt
Cc: Alan Stern
Cc: Sasha Levin
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20140120182006.GA26495@redhat.com
Signed-off-by: Ingo Molnar
17 Dec, 2013
1 commit
-
When a controlling tty is being hung up and the hang up is
waiting for a just-signalled tty reader or writer to exit, and a new tty
reader/writer tries to acquire an ldisc reference concurrently with the
ldisc reference release from the signalled reader/writer, the hangup
can hang. The new reader/writer is sleeping in ldsem_down_read() and the
hangup is sleeping in ldsem_down_write() [1].

The new reader/writer fails to wakeup the waiting hangup because the
wrong lock count value is checked (the old lock count rather than the new
lock count) to see if the lock is unowned.

Change helper function to return the new lock count if the cmpxchg was
successful; document this behavior.

[1] edited dmesg log from reporter

SysRq : Show Blocked State
task PC stack pid father
systemd D ffff88040c4f0000 0 1 0 0x00000000
ffff88040c49fbe0 0000000000000046 ffff88040c4a0000 ffff88040c49ffd8
00000000001d3980 00000000001d3980 ffff88040c4a0000 ffff88040593d840
ffff88040c49fb40 ffffffff810a4cc0 0000000000000006 0000000000000023
Call Trace:
[] ? sched_clock_cpu+0x9f/0xe4
[] ? sched_clock_cpu+0x9f/0xe4
[] ? sched_clock_cpu+0x9f/0xe4
[] ? sched_clock_cpu+0x9f/0xe4
[] schedule+0x24/0x5e
[] schedule_timeout+0x15b/0x1ec
[] ? sched_clock_cpu+0x9f/0xe4
[] ? _raw_spin_unlock_irq+0x24/0x26
[] down_read_failed+0xe3/0x1b9
[] ldsem_down_read+0x8b/0xa5
[] ? tty_ldisc_ref_wait+0x1b/0x44
[] tty_ldisc_ref_wait+0x1b/0x44
[] tty_write+0x7d/0x28a
[] redirected_tty_write+0x8d/0x98
[] ? tty_write+0x28a/0x28a
[] do_loop_readv_writev+0x56/0x79
[] do_readv_writev+0x1b0/0x1ff
[] ? do_vfs_ioctl+0x32a/0x489
[] ? final_putname+0x1d/0x3a
[] vfs_writev+0x2e/0x49
[] SyS_writev+0x47/0xaa
[] system_call_fastpath+0x16/0x1b
bash D ffffffff81c104c0 0 5469 5302 0x00000082
ffff8800cf817ac0 0000000000000046 ffff8804086b22a0 ffff8800cf817fd8
00000000001d3980 00000000001d3980 ffff8804086b22a0 ffff8800cf817a48
000000000000b9a0 ffff8800cf817a78 ffffffff81004675 ffff8800cf817a44
Call Trace:
[] ? dump_trace+0x165/0x29c
[] ? sched_clock_cpu+0x9f/0xe4
[] ? save_stack_trace+0x26/0x41
[] schedule+0x24/0x5e
[] schedule_timeout+0x15b/0x1ec
[] ? sched_clock_cpu+0x9f/0xe4
[] ? down_write_failed+0xa3/0x1c9
[] ? _raw_spin_unlock_irq+0x24/0x26
[] down_write_failed+0xab/0x1c9
[] ldsem_down_write+0x79/0xb1
[] ? tty_ldisc_lock_pair_timeout+0xa5/0xd9
[] tty_ldisc_lock_pair_timeout+0xa5/0xd9
[] tty_ldisc_hangup+0xc4/0x218
[] __tty_hangup+0x2e2/0x3ed
[] disassociate_ctty+0x63/0x226
[] do_exit+0x79f/0xa11
[] ? get_signal_to_deliver+0x206/0x62f
[] ? lock_release_holdtime.part.8+0xf/0x16e
[] do_group_exit+0x47/0xb5
[] get_signal_to_deliver+0x241/0x62f
[] do_signal+0x43/0x59d
[] ? __audit_syscall_exit+0x21a/0x2a8
[] ? lock_release_holdtime.part.8+0xf/0x16e
[] do_notify_resume+0x54/0x6c
[] int_signal+0x12/0x17

Reported-by: Sami Farin
Cc: # 3.12.x
Signed-off-by: Peter Hurley
Signed-off-by: Greg Kroah-Hartman
21 May, 2013
1 commit
-
The semantics of a rw semaphore are almost ideally suited
for tty line discipline lifetime management; multiple active
threads obtain "references" (read locks) while performing i/o
to prevent the loss or change of the current line discipline
(write lock).

Unfortunately, the existing rw_semaphore is ill-suited in other
ways:
1) TIOCSETD ioctl (change line discipline) expects to return an
   error if the line discipline cannot be exclusively locked within
   5 secs. Lock wait timeouts are not supported by rwsem.
2) A tty hangup is expected to halt and scrap pending i/o, so
   exclusive locking must be prioritized.
   Writer priority is not supported by rwsem.

Add ld_semaphore which implements these requirements in a
semantically similar way to rw_semaphore.

Writer priority is handled by separate wait lists for readers and
writers. Pending write waits are prioritized before existing read
waits and prevent further read locks.

Wait timeouts are trivially added, but obviously change the lock
semantics as lock attempts can fail (but only due to timeout).

This implementation incorporates the write-lock stealing work of
Michel Lespinasse.

Cc: Michel Lespinasse
Signed-off-by: Peter Hurley
Signed-off-by: Greg Kroah-Hartman