11 Jul, 2016

1 commit

  • Variable "now" seems to be genuinely used unintialized
    if branch

    if (CPUCLOCK_PERTHREAD(timer->it_clock)) {

    is not taken and the branch

    if (unlikely(sighand == NULL)) {

    is taken. In this case the process has been reaped and the timer is marked as
    disarmed anyway. So none of the postprocessing of the sample is
    required. Return right away.
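
    A minimal sketch of the resulting control flow, in the style of the
    snippets above (the surrounding function, locking and variable
    declarations are assumed, not quoted from the patch):

    if (CPUCLOCK_PERTHREAD(timer->it_clock)) {
            cpu_clock_sample(timer->it_clock, p, &now);
    } else {
            sighand = lock_task_sighand(p, &flags);
            if (unlikely(sighand == NULL)) {
                    /*
                     * The process has been reaped; the timer is already
                     * disarmed and "now" was never sampled, so skip the
                     * sample post-processing and return immediately.
                     */
                    return;
            }
            cpu_timer_sample_group(timer->it_clock, p, &now);
            /* ... post-process the sample, rearm if needed ... */
            unlock_task_sighand(p, &flags);
    }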

    Signed-off-by: Alexey Dobriyan
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160707223911.GA26483@p183.telecom.by
    Signed-off-by: Thomas Gleixner

    Alexey Dobriyan
     

02 Mar, 2016

1 commit

  • Instead of providing asynchronous checks for the nohz subsystem to verify
    the posix cpu timers tick dependency, migrate the latter to the new tick
    dependency mask.

    In order to keep track of the running timers and expose the tick
    dependency accordingly, we must probe the timers' queuing and dequeuing
    on the thread and process lists.

    Unfortunately this implies both task-level and signal-level dependencies.
    We should be able to further optimize this and merge it all into the
    task-level dependency, at the cost of a bit of complexity and maybe some
    overhead.
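
    As an illustration of the model (a sketch only: the tick_dep_set_task()/
    tick_dep_set_signal() helpers and TICK_DEP_BIT_POSIX_TIMER come from the
    new tick dependency mask API; the wrapper below and its call site are
    assumptions):

    /* Hypothetical wrapper, invoked when a posix CPU timer is queued. */
    static void posix_timer_set_tick_dep(struct k_itimer *timer,
                                         struct task_struct *p)
    {
            if (CPUCLOCK_PERTHREAD(timer->it_clock))
                    /* Per-thread timer: only this task needs the tick. */
                    tick_dep_set_task(p, TICK_DEP_BIT_POSIX_TIMER);
            else
                    /* Process-wide timer: the whole thread group does. */
                    tick_dep_set_signal(p->signal, TICK_DEP_BIT_POSIX_TIMER);
    }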

    Reviewed-by: Chris Metcalf
    Cc: Christoph Lameter
    Cc: Chris Metcalf
    Cc: Ingo Molnar
    Cc: Luiz Capitulino
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: Viresh Kumar
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     

15 Oct, 2015

4 commits

  • It was found while running a database workload on large systems that
    significant time was spent trying to acquire the sighand lock.

    The issue was that whenever an itimer expired, many threads ended up
    simultaneously trying to send the signal. Most of the time, nothing
    happened after acquiring the sighand lock because another thread
    had already sent the signal and updated the "next expire" time.
    fastpath_timer_check() didn't help much, since the "next expire"
    time was only updated after the threads had exited fastpath_timer_check().

    This patch addresses this by having the thread_group_cputimer structure
    maintain a boolean that signifies when a thread in the group is already
    checking for process-wide timers, and by adding extra logic in the
    fastpath to check that boolean.
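
    A rough sketch of the fastpath side of this (the checking_timer flag is
    the one added by this patch; the surrounding fastpath_timer_check()
    logic is abbreviated):

    /* Only consider taking the sighand lock for process-wide timers if
     * no other thread is already scanning them. */
    sig = tsk->signal;
    if (READ_ONCE(sig->cputimer.running) &&
        !READ_ONCE(sig->cputimer.checking_timer)) {
            /* ... sample the group cputime and compare it against the
             * "next expire" values; the slowpath sets checking_timer
             * while it scans the process-wide timers and clears it once
             * the expiry times have been updated ... */
    }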

    Signed-off-by: Jason Low
    Reviewed-by: Oleg Nesterov
    Reviewed-by: George Spelvin
    Cc: Paul E. McKenney
    Cc: Frederic Weisbecker
    Cc: Davidlohr Bueso
    Cc: Steven Rostedt
    Cc: hideaki.kimura@hpe.com
    Cc: terry.rudd@hpe.com
    Cc: scott.norton@hpe.com
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1444849677-29330-5-git-send-email-jason.low2@hp.com
    Signed-off-by: Thomas Gleixner

    Jason Low
     
  • In the next patch in this series, a new field 'checking_timer' will
    be added to 'struct thread_group_cputimer'. Both this and the
    existing 'running' integer field are just used as boolean values. To
    save space in the structure, we can make both of these fields booleans.

    This is a preparatory patch to convert the existing running integer
    field to a boolean.
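
    For context, the structure ends up looking roughly like this once both
    patches are applied (a sketch; comments and surrounding fields of the
    real definition are omitted):

    struct thread_group_cputimer {
            struct task_cputime_atomic cputime_atomic;
            bool running;         /* process-wide timers are armed */
            bool checking_timer;  /* a thread is scanning those timers */
    };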

    Suggested-by: George Spelvin
    Signed-off-by: Jason Low
    Reviewed-by: George Spelvin
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Frederic Weisbecker
    Cc: Davidlohr Bueso
    Cc: Steven Rostedt
    Cc: hideaki.kimura@hpe.com
    Cc: terry.rudd@hpe.com
    Cc: scott.norton@hpe.com
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1444849677-29330-4-git-send-email-jason.low2@hp.com
    Signed-off-by: Thomas Gleixner

    Jason Low
     
  • The fastpath_timer_check() function contains logic to check whether any
    timers are set, by checking !task_cputime_zero(). Similarly, we can do
    this check before calling check_thread_timers(). In the case where there
    are only process-wide timers, this skips all of the per-thread timer
    computations.

    As suggested by George, we can put the task_cputime_zero() check in
    check_thread_timers(), since that is more of an optimization to the
    function. Similarly, we move the existing check of cputimer->running
    into check_process_timers().
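
    A sketch of where the two checks end up (the function names are the
    ones referenced above; the elided bodies and exact signatures are
    assumptions):

    static void check_thread_timers(struct task_struct *tsk,
                                    struct list_head *firing)
    {
            /* Bail out early when no per-thread timer is armed. */
            if (task_cputime_zero(&tsk->cputime_expires))
                    return;
            /* ... per-thread timer expiry processing ... */
    }

    static void check_process_timers(struct task_struct *tsk,
                                     struct list_head *firing)
    {
            /* Nothing to do unless process-wide timers are running. */
            if (!READ_ONCE(tsk->signal->cputimer.running))
                    return;
            /* ... process-wide timer expiry processing ... */
    }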

    Signed-off-by: Jason Low
    Reviewed-by: Oleg Nesterov
    Reviewed-by: George Spelvin
    Cc: Paul E. McKenney
    Cc: Frederic Weisbecker
    Cc: Davidlohr Bueso
    Cc: Steven Rostedt
    Cc: hideaki.kimura@hpe.com
    Cc: terry.rudd@hpe.com
    Cc: scott.norton@hpe.com
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1444849677-29330-3-git-send-email-jason.low2@hp.com
    Signed-off-by: Thomas Gleixner

    Jason Low
     
  • In fastpath_timer_check(), the task_cputime() function is always
    called to compute the utime and stime values. However, this is not
    necessary if there are no per-thread timers to check for. This patch
    modifies the code such that we compute the task_cputime values only
    when there are per-thread timers set.
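
    A sketch of the resulting fastpath check (field and helper names follow
    the code referenced above; the exact layout is assumed):

    /* Only pay for the utime/stime sample when a per-thread timer is
     * actually armed. */
    if (!task_cputime_zero(&tsk->cputime_expires)) {
            struct task_cputime task_sample;

            task_cputime(tsk, &task_sample.utime, &task_sample.stime);
            task_sample.sum_exec_runtime = tsk->se.sum_exec_runtime;

            if (task_cputime_expired(&task_sample, &tsk->cputime_expires))
                    return 1;
    }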

    Signed-off-by: Jason Low
    Reviewed-by: Oleg Nesterov
    Reviewed-by: Frederic Weisbecker
    Reviewed-by: Davidlohr Bueso
    Reviewed-by: George Spelvin
    Cc: Paul E. McKenney
    Cc: Steven Rostedt
    Cc: hideaki.kimura@hpe.com
    Cc: terry.rudd@hpe.com
    Cc: scott.norton@hpe.com
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1444849677-29330-2-git-send-email-jason.low2@hp.com
    Signed-off-by: Thomas Gleixner

    Jason Low
     

08 May, 2015

3 commits

  • Recent optimizations were made to thread_group_cputimer to improve its
    scalability by keeping track of cputime stats without a lock. However,
    the values were open-coded in the structure, placing them at a
    different abstraction level from the regular task_cputime structure.
    Furthermore, any subsequent similar optimizations would not be able to
    share the new code, since it is specific to thread_group_cputimer.

    This patch adds the new task_cputime_atomic data structure (introduced in
    the previous patch in the series) to thread_group_cputimer for keeping
    track of the cputime atomically, which also helps generalize the code.
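
    A sketch of the generalized layout (the task_cputime_atomic structure is
    the one introduced earlier in this series; the exact field set is
    assumed):

    /* Atomic mirror of struct task_cputime, usable without cputimer->lock. */
    struct task_cputime_atomic {
            atomic64_t utime;
            atomic64_t stime;
            atomic64_t sum_exec_runtime;
    };

    struct thread_group_cputimer {
            struct task_cputime_atomic cputime_atomic;
            int running;
    };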

    Suggested-by: Ingo Molnar
    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Thomas Gleixner
    Acked-by: Rik van Riel
    Cc: Andrew Morton
    Cc: Aswin Chandramouleeswaran
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Mel Gorman
    Cc: Mike Galbraith
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Scott J Norton
    Cc: Steven Rostedt
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1430251224-5764-6-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • While running a database workload, we found a scalability issue with itimers.

    Much of the problem was caused by the thread_group_cputimer spinlock.
    Each time we account for group system/user time, we need to obtain the
    thread_group_cputimer's spinlock to update the timers. On larger systems
    (such as a 16-socket machine), more than 30% of total time was spent
    trying to obtain this kernel lock to update these group timer stats.

    This patch converts the timers to 64-bit atomic variables and uses
    atomic adds to update them without a lock. With this patch, the
    percentage of total time spent updating thread group cputimer timers
    was reduced from 30% down to less than 1%.

    Note: On 32-bit systems using the generic 64-bit atomics, this causes
    sample_group_cputimer() to take locks 3 times instead of just once.
    However, we tested this patch on a 32-bit ARM system using the
    generic atomics and did not find the overhead to be much of an issue.
    A likely explanation is that 32-bit systems usually have small numbers
    of CPUs, and cacheline contention from extra spinlocks taken
    periodically is not really apparent on smaller systems.
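
    The per-tick accounting update then reduces to a plain 64-bit atomic
    add, roughly as follows (a sketch; the helper name and calling context
    are assumptions):

    /* Hypothetical helper, called from the cputime accounting paths. */
    static inline void account_group_runtime_sketch(struct thread_group_cputimer *ct,
                                                    unsigned long long ns)
    {
            /* No spinlock needed: a 64-bit atomic add keeps the sum
             * consistent even with many CPUs accounting concurrently. */
            atomic64_add(ns, &ct->cputime_atomic.sum_exec_runtime);
    }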

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Thomas Gleixner
    Acked-by: Rik van Riel
    Cc: Andrew Morton
    Cc: Aswin Chandramouleeswaran
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Mel Gorman
    Cc: Mike Galbraith
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Scott J Norton
    Cc: Steven Rostedt
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1430251224-5764-4-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • ACCESS_ONCE() doesn't work reliably on non-scalar types. This patch
    removes the remaining usages of ACCESS_ONCE() in the scheduler and uses
    the new READ_ONCE() and WRITE_ONCE() APIs as appropriate.
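
    A representative conversion, using the cputimer running flag as an
    example (illustrative only; the specific call sites touched by the
    patch are not reproduced here):

    /* Before: ACCESS_ONCE() is only well-defined for scalar types. */
    if (ACCESS_ONCE(cputimer->running))
            return true;

    /* After: READ_ONCE() for loads, WRITE_ONCE() for stores, both of
     * which also handle non-scalar types correctly. */
    if (READ_ONCE(cputimer->running))
            return true;
    WRITE_ONCE(cputimer->running, 1);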

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Thomas Gleixner
    Acked-by: Rik van Riel
    Acked-by: Waiman Long
    Cc: Andrew Morton
    Cc: Aswin Chandramouleeswaran
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Mel Gorman
    Cc: Mike Galbraith
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Preeti U Murthy
    Cc: Scott J Norton
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1430251224-5764-2-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     

13 Feb, 2015

1 commit

  • If an attacker can cause a controlled kernel stack overflow, overwriting
    the restart block is a very juicy exploit target. This is because the
    restart_block is held in the same memory allocation as the kernel stack.

    Moving the restart block to struct task_struct prevents this exploit by
    making the restart_block harder to locate.

    Note that there are other fields in thread_info that are also easy
    targets, at least on some architectures.

    It's also a decent simplification, since the restart code is more or less
    identical on all architectures.
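
    Structurally, the change amounts to moving one field (a sketch;
    surrounding members are omitted):

    /* Before: restart_block shares an allocation with the kernel stack. */
    struct thread_info {
            /* ... arch-specific fields ... */
            struct restart_block restart_block;
    };

    /* After: restart_block lives in task_struct, away from the stack. */
    struct task_struct {
            /* ... */
            struct restart_block restart_block;
            /* ... */
    };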

    [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
    Signed-off-by: Andy Lutomirski
    Cc: Thomas Gleixner
    Cc: Al Viro
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: David Miller
    Acked-by: Richard Weinberger
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Vineet Gupta
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Steven Miao
    Cc: Mark Salter
    Cc: Aurelien Jacquiot
    Cc: Mikael Starvik
    Cc: Jesper Nilsson
    Cc: David Howells
    Cc: Richard Kuo
    Cc: "Luck, Tony"
    Cc: Geert Uytterhoeven
    Cc: Michal Simek
    Cc: Ralf Baechle
    Cc: Jonas Bonn
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Acked-by: Michael Ellerman (powerpc)
    Tested-by: Michael Ellerman (powerpc)
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Chen Liqin
    Cc: Lennox Wu
    Cc: Chris Metcalf
    Cc: Guan Xuetao
    Cc: Chris Zankel
    Cc: Max Filippov
    Cc: Oleg Nesterov
    Cc: Guenter Roeck
    Signed-off-by: James Hogan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     

16 Nov, 2014

1 commit

  • While looking over the cpu-timer code I found that we appear to add
    the delta for the calling task twice, through:

    cpu_timer_sample_group()
      thread_group_cputimer()
        thread_group_cputime()
          times->sum_exec_runtime += task_sched_runtime();

      *sample = cputime.sum_exec_runtime + task_delta_exec();

    Which would make the sample run ahead, making the sleep short.
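
    Since thread_group_cputime() already folds in the caller's pending
    runtime via task_sched_runtime(), adding task_delta_exec() on top counts
    that delta a second time. The fix is presumably to drop the extra term,
    along these lines (a sketch of the CPUCLOCK_SCHED case in
    cpu_timer_sample_group(); the exact code is assumed):

    case CPUCLOCK_SCHED:
            /* sum_exec_runtime already includes the caller's pending
             * delta; do not add task_delta_exec() on top of it. */
            *sample = cputime.sum_exec_runtime;
            break;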

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: KOSAKI Motohiro
    Cc: Oleg Nesterov
    Cc: Stanislaw Gruszka
    Cc: Christoph Lameter
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Rik van Riel
    Cc: Tejun Heo
    Link: http://lkml.kernel.org/r/20141112113737.GI10476@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

08 Sep, 2014

1 commit

  • Both times() and clock_gettime(CLOCK_PROCESS_CPUTIME_ID) have scalability
    issues on large systems, due to both functions being serialized with a
    lock.

    The lock protects against reporting a wrong value when a thread in the
    task group exits and its statistics are reported up to the signal struct,
    which could otherwise cause that exited task's statistics to be counted
    twice (or not at all).

    Protecting that with a lock results in times() and clock_gettime() being
    completely serialized on large systems.

    This can be fixed by using a seqlock around the events that gather and
    propagate statistics. As an additional benefit, the protection code can
    be moved into thread_group_cputime(), slightly simplifying the calling
    functions.

    In the case of posix_cpu_clock_get_task() things can be simplified a
    lot, because the calling function already ensures that the task sticks
    around, and the rest is now taken care of in thread_group_cputime().

    This way the statistics reporting code can run lockless.
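
    The read side of such a seqlock follows the usual retry pattern, roughly
    as below (a sketch; the sig->stats_lock name and the exact set of
    protected fields are assumptions):

    unsigned int seq;

    /* Reader: retry if a writer updated the stats concurrently. */
    do {
            seq = read_seqbegin(&sig->stats_lock);
            utime = sig->utime;
            stime = sig->stime;
            sum_exec = sig->sum_sched_runtime;
    } while (read_seqretry(&sig->stats_lock, seq));

    /* Writer (e.g. an exiting thread folding its times into the signal
     * struct) brackets the update with the write lock: */
    write_seqlock_irq(&sig->stats_lock);
    /* ... accumulate the exiting task's utime/stime/runtime ... */
    write_sequnlock_irq(&sig->stats_lock);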

    Signed-off-by: Rik van Riel
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alex Thorlton
    Cc: Andrew Morton
    Cc: Daeseok Youn
    Cc: David Rientjes
    Cc: Dongsheng Yang
    Cc: Geert Uytterhoeven
    Cc: Guillaume Morin
    Cc: Ionut Alexa
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Li Zefan
    Cc: Michal Hocko
    Cc: Michal Schmidt
    Cc: Oleg Nesterov
    Cc: Vladimir Davydov
    Cc: umgwanakikbuti@gmail.com
    Cc: fweisbec@gmail.com
    Cc: srao@redhat.com
    Cc: lwoodman@redhat.com
    Cc: atheurer@redhat.com
    Link: http://lkml.kernel.org/r/20140816134010.26a9b572@annuminas.surriel.com
    Signed-off-by: Ingo Molnar

    Rik van Riel
     

23 Jun, 2014

1 commit