Eric Lee / smarc-fsl-linux-kernel

03 Aug, 2018

1 commit

c06f5a018 delayacct: Use raw_spinlocks ... Browse Code »

[ Upstream commit 02acc80d19edb0d5684c997b2004ad19f9f5236e ]

try_to_wake_up() might invoke delayacct_blkio_end() while holding the
pi_lock (which is a raw_spinlock_t). delayacct_blkio_end() acquires
task_delay_info.lock which is a spinlock_t. This causes a might sleep splat
on -RT where non raw spinlocks are converted to 'sleeping' spinlocks.

task_delay_info.lock is only held for a short amount of time so it's not a
problem latency wise to make convert it to a raw spinlock.

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Thomas Gleixner
Cc: Balbir Singh
Link: https://lkml.kernel.org/r/20180423161024.6710-1-bigeasy@linutronix.de
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Sebastian Andrzej Siewior
7 years ago

24 Jan, 2018

1 commit

36ae2e6f5 delayacct: Account blkio completion on the correct task ... Browse Code »

commit c96f5471ce7d2aefd0dda560cc23f08ab00bc65d upstream.

Before commit:

e33a9bba85a8 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")

delayacct_blkio_end() was called after context-switching into the task which
completed I/O.

This resulted in double counting: the task would account a delay both waiting
for I/O and for time spent in the runqueue.

With e33a9bba85a8, delayacct_blkio_end() is called by try_to_wake_up().
In ttwu, we have not yet context-switched. This is more correct, in that
the delay accounting ends when the I/O is complete.

But delayacct_blkio_end() relies on 'get_current()', and we have not yet
context-switched into the task whose I/O completed. This results in the
wrong task having its delay accounting statistics updated.

Instead of doing that, pass the task_struct being woken to delayacct_blkio_end(),
so that it can update the statistics of the correct task.

Signed-off-by: Josh Snyder
Acked-by: Tejun Heo
Acked-by: Balbir Singh
Cc: Brendan Gregg
Cc: Jens Axboe
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-block@vger.kernel.org
Fixes: e33a9bba85a8 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
Link: http://lkml.kernel.org/r/1513613712-571-1-git-send-email-joshs@netflix.com
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

Josh Snyder
8 years ago

02 Mar, 2017

2 commits

32ef5517c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <… ... Browse Code »

…linux/sched/cputime.h>

Introduce a trivial, mostly empty <linux/sched/cputime.h> header
to prepare for the moving of cputime functionality out of sched.h.

Update all code that relies on these facilities.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
8 years ago
9164bb4a1 sched/headers: Prepare to move 'init_task' and 'init_thread_union' from <linux/s… ... Browse Code »

…ched.h> to <linux/sched/task.h>

Update all usage sites first.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
8 years ago

01 Feb, 2017

2 commits

dbf3da1c2 delaycct: Convert obsolete cputime type to nsecs ... Browse Code »

Use the new nsec based cputime accessors as part of the whole cputime
conversion from cputime_t to nsecs.

Signed-off-by: Frederic Weisbecker
Cc: Benjamin Herrenschmidt
Cc: Fenghua Yu
Cc: Heiko Carstens
Cc: Linus Torvalds
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Stanislaw Gruszka
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Wanpeng Li
Link: http://lkml.kernel.org/r/1485832191-26889-14-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar

Frederic Weisbecker
9 years ago
a1cecf2ba sched/cputime: Introduce special task_cputime_t() API to return old-typed cputime ... Browse Code »

This API returns a task's cputime in cputime_t in order to ease the
conversion of cputime internals to use nsecs units instead. Blindly
converting all cputime readers to use this API now will later let us
convert more smoothly and step by step all these places to use the
new nsec based cputime.

Signed-off-by: Frederic Weisbecker
Cc: Benjamin Herrenschmidt
Cc: Fenghua Yu
Cc: Heiko Carstens
Cc: Linus Torvalds
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Stanislaw Gruszka
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Wanpeng Li
Link: http://lkml.kernel.org/r/1485832191-26889-7-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar

Frederic Weisbecker
9 years ago

15 Jan, 2016

1 commit

5d097056c kmemcg: account certain kmem allocations to memcg ... Browse Code »

Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg. For the list, see below:

- threadinfo
- task_struct
- task_delay_info
- pid
- cred
- mm_struct
- vm_area_struct and vm_region (nommu)
- anon_vma and anon_vma_chain
- signal_struct
- sighand_struct
- fs_struct
- files_struct
- fdtable and fdtable->full_fds_bits
- dentry and external_name
- inode for all filesystems. This is the most tedious part, because
most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov
Acked-by: Johannes Weiner
Acked-by: Michal Hocko
Cc: Tejun Heo
Cc: Greg Thelen
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vladimir Davydov
10 years ago

24 Jul, 2014

2 commits

68f6783d2 delayacct: Remove braindamaged type conversions ... Browse Code »

Converting cputime to timespec and timespec to nanoseconds makes no
sense. Use cputime_to_ns() and be done with it.

Signed-off-by: Thomas Gleixner
Signed-off-by: John Stultz

Thomas Gleixner
11 years ago
9667a23db delayacct: Make accounting nanosecond based ... Browse Code »

Kill the timespec juggling and calculate with plain nanoseconds.

Signed-off-by: Thomas Gleixner
Signed-off-by: John Stultz

Thomas Gleixner
11 years ago

12 Jun, 2014

1 commit

b5d768253 delayacct: Use ktime_get_ts() ... Browse Code »

do_posix_clock_monotonic_gettime() is a leftover from the initial
posix timer implementation which maps to ktime_get_ts(). Remove the
silly wrapper while at it.

Signed-off-by: Thomas Gleixner
Cc: John Stultz
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20140611234606.931409215@linutronix.de
Signed-off-by: Thomas Gleixner

Thomas Gleixner
11 years ago

13 Nov, 2013

1 commit

324d666a5 kernel/delayacct.c: remove redundant checking in __delayacct_add_tsk() ... Browse Code »

The wrapper function delayacct_add_tsk() already checked 'tsk->delays',
and __delayacct_add_tsk() has no another direct callers, so can remove the
redundancy checking code.

And the label 'done' is also useless, so remove it, too.

Signed-off-by: Chen Gang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chen Gang
12 years ago

28 Jan, 2013

1 commit

6fac4829c cputime: Use accessors to read task cputime stats ... Browse Code »

This is in preparation for the full dynticks feature. While
remotely reading the cputime of a task running in a full
dynticks CPU, we'll need to do some extra-computation. This
way we can account the time it spent tickless in userspace
since its last cputime snapshot.

Signed-off-by: Frederic Weisbecker
Cc: Andrew Morton
Cc: Ingo Molnar
Cc: Li Zhong
Cc: Namhyung Kim
Cc: Paul E. McKenney
Cc: Paul Gortmaker
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: Thomas Gleixner

Frederic Weisbecker
13 years ago

14 Jul, 2011

1 commit

c9aaa8957 KVM: Steal time implementation ... Browse Code »

To implement steal time, we need the hypervisor to pass the guest
information about how much time was spent running other processes
outside the VM, while the vcpu had meaningful work to do - halt
time does not count.

This information is acquired through the run_delay field of
delayacct/schedstats infrastructure, that counts time spent in a
runqueue but not running.

Steal time is a per-cpu information, so the traditional MSR-based
infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the
memory area address containing information about steal time

This patch contains the hypervisor part of the steal time infrasructure,
and can be backported independently of the guest portion.

[avi, yongjie: export delayacct_on, to avoid build failures in some configs]

Signed-off-by: Glauber Costa
Tested-by: Eric B Munson
CC: Rik van Riel
CC: Jeremy Fitzhardinge
CC: Peter Zijlstra
CC: Anthony Liguori
Signed-off-by: Yongjie Ren
Signed-off-by: Avi Kivity

Glauber Costa
14 years ago

19 Sep, 2009

1 commit

6952b61de headers: taskstats_kern.h trim ... Browse Code »

Remove net/genetlink.h inclusion, now sched.c won't be recompiled
because of some networking changes.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Linus Torvalds

Alexey Dobriyan
16 years ago

18 Dec, 2008

1 commit

9c2c48020 schedstat: consolidate per-task cpu runtime stats ... Browse Code »

Impact: simplify code

When we turn on CONFIG_SCHEDSTATS, per-task cpu runtime is accumulated
twice. Once in task->se.sum_exec_runtime and once in sched_info.cpu_time.
These two stats are exactly the same.

Given that task->se.sum_exec_runtime is always accumulated by the core
scheduler, sched_info can reuse that data instead of duplicate the accounting.

Signed-off-by: Ken Chen
Acked-by: Peter Zijlstra
Signed-off-by: Ingo Molnar

Ken Chen
17 years ago

26 Jul, 2008

2 commits

016ae219b per-task-delay-accounting: update taskstats for memory reclaim delay ... Browse Code »

Add members for memory reclaim delay to taskstats, and accumulate them in
__delayacct_add_tsk() .

Signed-off-by: Keika Kobayashi
Cc: Hiroshi Shimamoto
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Keika Kobayashi
17 years ago
873b47717 per-task-delay-accounting: add memory reclaim delay ... Browse Code »

Sometimes, application responses become bad under heavy memory load.
Applications take a bit time to reclaim memory. The statistics, how long
memory reclaim takes, will be useful to measure memory usage.

This patch adds accounting memory reclaim to per-task-delay-accounting for
accounting the time of do_try_to_free_pages().

- When System is under low memory load,
memory reclaim may not occur.

$ free
total used free shared buffers cached
Mem: 8197800 1577300 6620500 0 4808 1516724
-/+ buffers/cache: 55768 8142032
Swap: 16386292 0 16386292

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 5069748 10612 3014060 0 0 0 0 3 26 0 0 100 0
0 0 0 5069748 10612 3014060 0 0 0 0 4 22 0 0 100 0
0 0 0 5069748 10612 3014060 0 0 0 0 3 18 0 0 100 0

Measure the time of tar command.

$ ls -s test.dat
1501472 test.dat

$ time tar cvf test.tar test.dat
real 0m13.388s
user 0m0.116s
sys 0m5.304s

$ ./delayget -d -p
CPU count real total virtual total delay total
428 5528345500 5477116080 62749891
IO count delay total
338 8078977189
SWAP count delay total
0 0
RECLAIM count delay total
0 0

- When system is under heavy memory load
memory reclaim may occur.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 7159032 49724 1812 3012 0 0 0 0 3 24 0 0 100 0
0 0 7159032 49724 1812 3012 0 0 0 0 4 24 0 0 100 0
0 0 7159032 49848 1812 3012 0 0 0 0 3 22 0 0 100 0

In this case, one process uses more 8G memory
by execution of malloc() and memset().

$ time tar cvf test.tar test.dat
real 1m38.563s
CPU count real total virtual total delay total
9021 7140446250 7315277975 923201824
IO count delay total
8965 90466349669
SWAP count delay total
3 21036367
RECLAIM count delay total
740 61011951153

In the later case, the value of RECLAIM is increasing.
So, taskstats can show how much memory reclaim influences TAT.

Signed-off-by: Keika Kobayashi
Acked-by: Balbir Singh
Acked-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Keika Kobayashi
17 years ago

19 Oct, 2007

1 commit

c66f08be7 Add scaled time to taskstats based process accounting ... Browse Code »

This adds items to the taststats struct to account for user and system
time based on scaling the CPU frequency and instruction issue rates.

Adds account_(user|system)_time_scaled callbacks which architectures
can use to account for time using this mechanism.

Signed-off-by: Michael Neuling
Cc: Balbir Singh
Cc: Jay Lan
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Neuling
18 years ago

15 Oct, 2007

1 commit

2d72376b3 sched: clean up schedstats, cnt -> count ... Browse Code »

rename all 'cnt' fields and variables to the less yucky 'count' name.

yuckage noticed by Andrew Morton.

no change in code, other than the /proc/sched_debug bkl_count string got
a bit larger:

text data bss dec hex filename
38236 3506 24 41766 a326 sched.o.before
38240 3506 24 41770 a32a sched.o.after

Signed-off-by: Ingo Molnar
Reviewed-by: Thomas Gleixner

Ingo Molnar
18 years ago

10 Jul, 2007

1 commit

172ba844a sched: update delay-accounting to use CFS's precise stats ... Browse Code »

update delay-accounting to use CFS's precise stats.

Signed-off-by: Ingo Molnar

Balbir Singh
18 years ago

08 May, 2007

1 commit

0a31bd5f2 KMEM_CACHE(): simplify slab cache creation ... Browse Code »

This patch provides a new macro

KMEM_CACHE(, )

to simplify slab creation. KMEM_CACHE creates a slab with the name of the
struct, with the size of the struct and with the alignment of the struct.
Additional slab flags may be specified if necessary.

Example

struct test_slab {
int a,b,c;
struct list_head;
} __cacheline_aligned_in_smp;

test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC)

will create a new slab named "test_slab" of the size sizeof(struct
test_slab) and aligned to the alignment of test slab. If it fails then we
panic.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
18 years ago

08 Dec, 2006

2 commits

e18b890bb [PATCH] slab: remove kmem_cache_t ... Browse Code »

Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#

set -e

for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done

The script was run like this

sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
19 years ago
e94b17660 [PATCH] slab: remove SLAB_KERNEL ... Browse Code »

SLAB_KERNEL is an alias of GFP_KERNEL.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
19 years ago

06 Nov, 2006

1 commit

64efade11 [PATCH] lockdep: fix delayacct locking bug ... Browse Code »

Make the delayacct lock irqsave; this avoids the possible deadlock where
an interrupt is taken while holding the delayacct lock which needs to
take the delayacct lock.

Signed-off-by: Peter Zijlstra
Acked-by: Oleg Nesterov
Cc: Balbir Singh
Cc: Shailabh Nagar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
19 years ago

02 Sep, 2006

1 commit

35df17c57 [PATCH] task delay accounting fixes ... Browse Code »

Cleanup allocation and freeing of tsk->delays used by delay accounting.
This solves two problems reported for delay accounting:

1. oops in __delayacct_blkio_ticks
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1844.html

Currently tsk->delays is getting freed too early in task exit which can
cause a NULL tsk->delays to get accessed via reading of /proc//stats.
The patch fixes this problem by freeing tsk->delays closer to when
task_struct itself is freed up. As a result, it also eliminates the use of
tsk->delays_lock which was only being used (inadequately) to safeguard
access to tsk->delays while a task was exiting.

2. Possible memory leak in kernel/delayacct.c
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1389.html

The patch cleans up tsk->delays allocations after a bad fork which was
missing earlier.

The patch has been tested to fix the problems listed above and stress
tested with rapid calls to delay accounting's taskstats command interface
(which is the other path that can access the same data, besides the /proc
interface causing the oops above).

Signed-off-by: Shailabh Nagar
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
19 years ago

01 Aug, 2006

1 commit

163ecdff0 [PATCH] delay accounting: temporarily enable by default ... Browse Code »

Enable delay accounting by default so that feature gets coverage testing
without requiring special measures.

Earlier, it was off by default and had to be enabled via a boot time param.
This patch reverses the default behaviour to improve coverage testing. It
can be removed late in the kernel development cycle if its believed users
shouldn't have to incur any cost if they don't want delay accounting. Or
it can be retained forever if the utility of the stats is deemed common
enough to warrant keeping the feature on.

Signed-off-by: Shailabh Nagar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
19 years ago

15 Jul, 2006

4 commits

258904546 [PATCH] per-task-delay-accounting: /proc export of aggregated block I/O delays ... Browse Code »

Export I/O delays seen by a task through /proc//stats for use in top
etc.

Note that delays for I/O done for swapping in pages (swapin I/O) is clubbed
together with all other I/O here (this is not the case in the netlink
interface where the swapin I/O is kept distinct)

[akpm@osdl.org: printk warning fix]
Signed-off-by: Shailabh Nagar
Signed-off-by: Balbir Singh
Cc: Jes Sorensen
Cc: Peter Chubb
Cc: Erich Focht
Cc: Levent Serinol
Cc: Jay Lan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
19 years ago
6f44993fe [PATCH] per-task-delay-accounting: delay accounting usage of taskstats interface ... Browse Code »

Usage of taskstats interface by delay accounting.

Signed-off-by: Shailabh Nagar
Signed-off-by: Balbir Singh
Cc: Jes Sorensen
Cc: Peter Chubb
Cc: Erich Focht
Cc: Levent Serinol
Cc: Jay Lan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
19 years ago
0ff922452 [PATCH] per-task-delay-accounting: sync block I/O and swapin delay collection ... Browse Code »

Unlike earlier iterations of the delay accounting patches, now delays are only
collected for the actual I/O waits rather than try and cover the delays seen
in I/O submission paths.

Account separately for block I/O delays incurred as a result of swapin page
faults whose frequency can be affected by the task/process' rss limit. Hence
swapin delays can act as feedback for rss limit changes independent of I/O
priority changes.

Signed-off-by: Shailabh Nagar
Signed-off-by: Balbir Singh
Cc: Jes Sorensen
Cc: Peter Chubb
Cc: Erich Focht
Cc: Levent Serinol
Cc: Jay Lan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
19 years ago
ca74e92b4 [PATCH] per-task-delay-accounting: setup ... Browse Code »

Initialization code related to collection of per-task "delay" statistics which
measure how long it had to wait for cpu, sync block io, swapping etc. The
collection of statistics and the interface are in other patches. This patch
sets up the data structures and allows the statistics collection to be
disabled through a kernel boot parameter.

Signed-off-by: Shailabh Nagar
Signed-off-by: Balbir Singh
Cc: Jes Sorensen
Cc: Peter Chubb
Cc: Erich Focht
Cc: Levent Serinol
Cc: Jay Lan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
19 years ago