10 Nov, 2010
1 commit
-
Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
Add rcu_read_lock/rcu_read_unlock around the call to find_task_by_vpid.

Tetsuo Handa wrote:
| Quoting from one of the posts in that thread
| http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
|
|| Usually tasklist gives enough protection, but if copy_process() fails
|| it calls free_pid() lockless and does call_rcu(delayed_put_pid().
|| This means, without rcu lock find_pid_ns() can't scan the hash table
|| safely.

Thomas Gleixner wrote:
| We can remove the tasklist_lock while at it. rcu_read_lock is enough.

Patch also replaces thread_group_leader with has_group_leader_pid
in accordance with a comment by Oleg Nesterov:

| ... thread_group_leader() check is not reliable without
| tasklist. If we race with de_thread() find_task_by_vpid() can find
| the new leader before it updates its ->group_leader.
|
| perhaps it makes sense to change posix_cpu_timer_create() to use
| has_group_leader_pid() instead, just to make this code not look racy
| and avoid adding new problems.

Signed-off-by: Sergey Senozhatsky
Cc: Peter Zijlstra
Cc: Stanislaw Gruszka
Reviewed-by: Oleg Nesterov
LKML-Reference:
Signed-off-by: Thomas Gleixner
11 Aug, 2010
1 commit
-
* 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux:
unistd: add __NR_prlimit64 syscall numbers
rlimits: implement prlimit64 syscall
rlimits: switch more rlimit syscalls to do_prlimit
rlimits: redo do_setrlimit to more generic do_prlimit
rlimits: add rlimit64 structure
rlimits: do security check under task_lock
rlimits: allow setrlimit to non-current tasks
rlimits: split sys_setrlimit
rlimits: selinux, do rlimits changes under task_lock
rlimits: make sure ->rlim_max never grows in sys_setrlimit
rlimits: add task_struct to update_rlimit_cpu
rlimits: security, add task_struct to setrlimit

Fix up various system call number conflicts. We not only added fanotify
system calls in the meantime, but asm-generic/unistd.h added a wait4
along with a range of reserved per-architecture system calls.
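For orientation, here is a minimal userspace sketch of calling the prlimit64 syscall that this merge introduces. It assumes a Linux system whose headers define SYS_prlimit64; `struct my_rlimit64` and `get_cpu_limit_of()` are hypothetical names chosen for illustration, and the two 64-bit fields mirror the rlimit64 structure named in the shortlog.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>   /* RLIMIT_CPU */
#include <sys/syscall.h>    /* SYS_prlimit64 */
#include <sys/types.h>      /* pid_t */
#include <unistd.h>         /* syscall() */

/* Illustrative mirror of the kernel's rlimit64: soft and hard limit, 64 bits each. */
struct my_rlimit64 {
	uint64_t rlim_cur;
	uint64_t rlim_max;
};

/*
 * Read the RLIMIT_CPU limits of process `pid` (0 means the caller)
 * without changing them: the "new limit" argument is NULL.
 * Returns 0 on success, -1 on error with errno set.
 */
int get_cpu_limit_of(pid_t pid, struct my_rlimit64 *old)
{
	return (int)syscall(SYS_prlimit64, pid, RLIMIT_CPU, NULL, old);
}
```

Passing a non-NULL new limit instead would set the limit of another task, which is what the "allow setrlimit to non-current tasks" patch in this series prepares for.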
16 Jul, 2010
1 commit
-
Add task_struct as a parameter to update_rlimit_cpu to be able to set
rlimit_cpu of a task other than current.

Signed-off-by: Jiri Slaby
Acked-by: James Morris
18 Jun, 2010
3 commits
-
fastpath_timer_check()->thread_group_cputimer() is racy and
unneeded.

It is racy because another thread can clear ->running before
thread_group_cputimer() takes cputimer->lock. In this case
thread_group_cputimer() will set ->running = true again and call
thread_group_cputime(). But since we do not hold tasklist or
siglock, we can race with fork/exit and copy the wrong results
into cputimer->cputime.

It is unneeded because if ->running == true we can just use
the numbers in cputimer->cputime we already have.

Change fastpath_timer_check() to copy cputimer->cputime into
the local variable under cputimer->lock. We do not re-check
->running under cputimer->lock, run_posix_cpu_timers() does
this check later.

Note: we can add more optimizations on top of this change.
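The locking pattern described above can be sketched in userspace as follows. This is a model under stated assumptions, not the kernel code: the spinlock is stood in for by a C11 atomic_flag, and `cputimer_snapshot()` and the struct layouts are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Accumulated group CPU time (illustrative fields). */
struct cputime {
	unsigned long long utime;
	unsigned long long stime;
	unsigned long long sum_exec_runtime;
};

struct cputimer {
	atomic_flag lock;       /* stand-in for the kernel spinlock */
	bool running;           /* set while some process-wide timer is armed */
	struct cputime cputime; /* totals maintained by the accounting path */
};

void cputimer_lock(struct cputimer *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

void cputimer_unlock(struct cputimer *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/*
 * Copy the totals into a local variable under the lock. The function
 * neither sets nor re-checks ->running and never walks the thread
 * group; the caller is expected to have seen ->running == true.
 */
void cputimer_snapshot(struct cputimer *t, struct cputime *out)
{
	cputimer_lock(t);
	*out = t->cputime;
	cputimer_unlock(t);
}
```

The point of the design is that the fast path only reads what is already accumulated, so it cannot restart the timer behind another thread's back.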
Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
run_posix_cpu_timers() doesn't work if current has already passed
exit_notify(). This was needed to prevent the races with do_wait().

Since ea6d290c ->signal is always valid and can't go away. We can
remove the "tsk->exit_state == 0" in fastpath_timer_check() and
convert run_posix_cpu_timers() to use lock_task_sighand().

Note: it makes sense to take group_leader's sighand instead, the
sub-thread still uses CPU after release_task(). But we need more
changes to do this.

Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
thread_group_cputime() looks as if it is rcu-safe, but in fact this
was wrong until ea6d290c which pins task->signal to task_struct.
It checks ->sighand != NULL under rcu, but this can't help if ->signal
can go away. Fortunately the caller either holds ->siglock, or it is
fastpath_timer_check() which uses current and checks exit_state == 0.

- Since ea6d290c commit tsk->signal is stable, we can read it first
and avoid the initialization from INIT_CPUTIME.

- Even if tsk->signal is always valid, we still have to check it
is safe to use next_thread() under rcu_read_lock(). Currently
the code checks ->sighand != NULL, change it to use pid_alive()
which is commonly used to ensure the task wasn't unhashed before
we take rcu_read_lock().

Add the comment to explain this check.

- Change the main loop to use the while_each_thread() helper.
Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
28 May, 2010
1 commit
-
Preparation to make task->signal immutable, no functional changes.
posix-cpu-timers.c checks task->signal != NULL to ensure this task is
alive and didn't pass __exit_signal(). This is correct but we are going
to change the lifetime rules for ->signal and never reset this pointer.

Change the code to check ->sighand instead, it doesn't matter which
pointer we check under tasklist, they both are cleared simultaneously.

As Roland pointed out, some of these changes are not strictly needed and
probably it makes sense to revert them later, when ->signal will be pinned
to task_struct. But this patch tries to ensure the subsequent changes in
fork/exit can't make any visible impact on posix cpu timers.

Signed-off-by: Oleg Nesterov
Cc: Fenghua Yu
Acked-by: Roland McGrath
Cc: Stanislaw Gruszka
Cc: Tony Luck
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
10 May, 2010
2 commits
-
We can optimize and simplify things taking into account signal->cputimer
is always running when we have configured any process wide cpu timer.

In check_process_timers(), we don't have to check if new updated value of
signal->cputime_expires is smaller, since we maintain new first expiration
time ({prof,virt,sched}_expires) in code flow and all other writes to
expiration cache are protected by sighand->siglock.

Signed-off-by: Stanislaw Gruszka
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Hidetoshi Seto
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
Reason: Further posix_cpu_timer patches depend on mainline changes
Signed-off-by: Thomas Gleixner
27 Mar, 2010
1 commit
-
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
time: Fix accumulation bug triggered by long delay.
posix-cpu-timers: Reset expire cache when no timer is running
timer stats: Fix del_timer_sync() and try_to_del_timer_sync()
clockevents: Sanitize min_delta_ns adjustment and prevent overflows
13 Mar, 2010
6 commits
-
Widen the p->sighand->siglock locking scope to make sure that
fastpath_timer_check() never iterates over all threads. Without
locking there is a small possibility that signal->cputimer will stop
running while we write values to signal->cputime_expires.

Calling thread_group_cputime() from fastpath_timer_check() is bad not
only because it is slow, but also because it races with __exit_signal(),
which can lead to invalid signal->{s,u}time values.

Signed-off-by: Stanislaw Gruszka
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Hidetoshi Seto
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
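When a user sets up a timer without an associated signal and the process
does not use any other cpu timers and does not exit, tsk->signal->cputimer
is enabled and running forever.

Avoid running the timer for no reason.

I used the program below to check that the patch does not break current
user space visible behavior.

#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

void consume_cpu(void)
{
	int i = 0;
	volatile int count = 0;	/* volatile keeps the loop from being optimized out */

	/* Assumption: the loop bound is reconstructed, any large value works. */
	for (i = 0; i < 100000000; i++)
		count++;
}

int main(void)
{
	/*
	 * Reconstructed setup (assumption): a CPU-time timer with no
	 * associated signal, armed with an arbitrary initial expiry.
	 */
	int i;
	struct sigevent evt = { };
	struct itimerspec spec = { };
	timer_t tid;

	evt.sigev_notify = SIGEV_NONE;
	assert(timer_create(CLOCK_PROCESS_CPUTIME_ID, &evt, &tid) == 0);

	spec.it_value.tv_sec = 10;
	assert(timer_settime(tid, 0, &spec, NULL) == 0);

	for (i = 0; i < 30; i++) {
		consume_cpu();
		memset(&spec, 0, sizeof(spec));
		assert(timer_gettime(tid, &spec) == 0);
		printf("%lu.%09lu\n",
			(unsigned long) spec.it_value.tv_sec,
			(unsigned long) spec.it_value.tv_nsec);
	}

	assert(timer_delete(tid) == 0);
	return 0;
}

Signed-off-by: Stanislaw Gruszka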
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Hidetoshi Seto
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
According to POSIX, we need to correctly set the old timer it_interval value
when the user requests it in timer_settime(). Tested using the program below.

#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
	struct sigaction act;
	struct sigevent evt = { };
	timer_t tid;
	struct itimerspec spec, u_spec, k_spec;

	evt.sigev_notify = SIGEV_SIGNAL;
	evt.sigev_signo = SIGPROF;
	assert(timer_create(CLOCK_PROCESS_CPUTIME_ID, &evt, &tid) == 0);

	/* First setting: remembered in u_spec as the expected "old" value. */
	spec.it_value.tv_sec = 1;
	spec.it_value.tv_nsec = 2;
	spec.it_interval.tv_sec = 3;
	spec.it_interval.tv_nsec = 4;
	u_spec = spec;
	assert(timer_settime(tid, 0, &spec, NULL) == 0);

	/* Second setting: the kernel reports the previous one in k_spec. */
	spec.it_value.tv_sec = 5;
	spec.it_value.tv_nsec = 6;
	spec.it_interval.tv_sec = 7;
	spec.it_interval.tv_nsec = 8;
	assert(timer_settime(tid, 0, &spec, &k_spec) == 0);

#define PRT(val) printf(#val ":\t%d/%d\n", (int) u_spec.val, (int) k_spec.val)
	PRT(it_value.tv_sec);
	PRT(it_value.tv_nsec);
	PRT(it_interval.tv_sec);
	PRT(it_interval.tv_nsec);

	return 0;
}

Signed-off-by: Stanislaw Gruszka
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Hidetoshi Seto
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
Signed-off-by: Stanislaw Gruszka
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Hidetoshi Seto
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
Always set the signal->cputime_expires expiration cache when setting a
new itimer, POSIX 1.b timer, or RLIMIT_CPU. Since we are
initializing the prof_exp expiration cache during fork(), this allows us to
remove the "RLIMIT_CPU != inf" check from fastpath_timer_check() and do
some other cleanups.

Checked against regressions using test cases from:
http://marc.info/?l=linux-kernel&m=123749066504641&w=4
http://marc.info/?l=linux-kernel&m=123811277916642&w=2

Signed-off-by: Stanislaw Gruszka
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Hidetoshi Seto
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
When a process deletes a cpu timer or a timer expires we do not clear
the expiration cache sig->cputimer_expires.

As a result fastpath_timer_check(), which prevents us from looping over
all threads when no timer is active, is not working and we run the
slow path needlessly on every tick.

Zero sig->cputimer_expires in stop_process_timers().
Signed-off-by: Stanislaw Gruszka
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Hidetoshi Seto
Cc: Spencer Candland
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
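To make the effect of a stale expiration cache concrete, here is a small userspace model. It is a sketch, not the kernel implementation: the struct layout, the "zero means no expiry" convention, and the names `need_slow_path()` and `stop_timers()` are assumptions made for illustration.

```c
#include <stdbool.h>

/* Illustrative sample/expiry triple; 0 in an expiry field means "not set". */
struct task_cputime {
	unsigned long long utime;
	unsigned long long stime;
	unsigned long long sum_exec_runtime;
};

bool task_cputime_zero(const struct task_cputime *c)
{
	return c->utime == 0 && c->stime == 0 && c->sum_exec_runtime == 0;
}

bool task_cputime_expired(const struct task_cputime *sample,
			  const struct task_cputime *expires)
{
	if (expires->utime && sample->utime >= expires->utime)
		return true;
	if (expires->stime && sample->utime + sample->stime >= expires->stime)
		return true;
	if (expires->sum_exec_runtime &&
	    sample->sum_exec_runtime >= expires->sum_exec_runtime)
		return true;
	return false;
}

/* Fast path: only fall into the expensive per-thread walk when needed. */
bool need_slow_path(const struct task_cputime *sample,
		    const struct task_cputime *cache)
{
	if (task_cputime_zero(cache))
		return false;
	return task_cputime_expired(sample, cache);
}

/* What the fix adds in spirit: clear the cache once no timer is left. */
void stop_timers(struct task_cputime *cache)
{
	*cache = (struct task_cputime){ 0, 0, 0 };
}
```

With a stale cache of `{ 5, 0, 0 }` and a sample of `{ 10, 0, 0 }`, `need_slow_path()` stays true on every tick; after `stop_timers()` it returns false.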
07 Mar, 2010
2 commits
-
Make sure the compiler won't do weird things with limits. E.g. fetching them
twice may return 2 different values after writable limits are implemented.

I.e. either use rlimit helpers added in commit 3e10e716abf3 ("resource:
add helpers for fetching rlimits") or ACCESS_ONCE if not applicable.

Signed-off-by: Jiri Slaby
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: john stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Fetch rlimit (both hard and soft) values only once and work on them. It
removes many accesses through sig structure and makes the code cleaner.

Mostly a preparation for writable resource limits support.
Signed-off-by: Jiri Slaby
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: john stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
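The "fetch once, then work on the copy" idea from the two rlimit commits above can be sketched like this. The macro is modeled on the kernel's ACCESS_ONCE() but is written for one fixed type to stay in standard C; `READ_ULONG_ONCE`, `struct limits` and `check_cpu_limit()` are hypothetical names.

```c
/* Force a single load through a volatile lvalue, in the spirit of ACCESS_ONCE(). */
#define READ_ULONG_ONCE(x) (*(volatile unsigned long *)&(x))

/* Illustrative pair of limits that another context may rewrite at any time. */
struct limits {
	unsigned long soft;
	unsigned long hard;
};

/*
 * Returns 0 while below the soft limit, 1 once the soft limit is
 * reached, 2 once the hard limit is reached. Each limit is read
 * exactly once, so both comparisons see one consistent snapshot
 * even if the limits are being changed concurrently.
 */
int check_cpu_limit(struct limits *lim, unsigned long used_secs)
{
	unsigned long soft = READ_ULONG_ONCE(lim->soft);
	unsigned long hard = READ_ULONG_ONCE(lim->hard);

	if (used_secs >= hard)
		return 2;
	if (used_secs >= soft)
		return 1;
	return 0;
}
```

Without the local copies, the compiler would be free to reload `lim->soft` between uses, which is the "fetching them twice" hazard the commit message warns about.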
18 Nov, 2009
1 commit
-
We already have new_timer initialized to all-zeros, hence in-function
initializations are not needed. Document the function's expectations about
the new_timer argument as well.

Signed-off-by: Stanislaw Gruszka
Cc: johnstul@us.ibm.com
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
29 Aug, 2009
2 commits
-
Add tracepoints for all itimer variants: ITIMER_REAL, ITIMER_VIRTUAL
and ITIMER_PROF.

[ tglx: Fixed comments and made the output more readable, parseable
and consistent. Replaced pid_vnr by pid_nr because the hrtimer
callback can happen in any namespace ]

Signed-off-by: Xiao Guangrong
Cc: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Mathieu Desnoyers
Cc: Anton Blanchard
Cc: Peter Zijlstra
Cc: KOSAKI Motohiro
Cc: Zhaolei
LKML-Reference:
Signed-off-by: Thomas Gleixner
-
Merge reason: timer tracepoint patches depend on both branches
Signed-off-by: Thomas Gleixner
09 Aug, 2009
1 commit
-
When the process exits we don't have to run a new cputimer nor
use the running one (as it does not account when tsk->exit_state != 0)
to get process CPU times. As there is only one thread we can
just use the CPU times fields from the task and signal structs.

Signed-off-by: Stanislaw Gruszka
Cc: Peter Zijlstra
Cc: Roland McGrath
Cc: Vitaly Mayatskikh
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar
03 Aug, 2009
4 commits
-
For powerpc with CONFIG_VIRT_CPU_ACCOUNTING
jiffies_to_cputime(1) is not a compile-time constant and run-time
calculations are quite expensive. To optimize we use a
precomputed value. For all other architectures it is a
preprocessor definition.

Signed-off-by: Stanislaw Gruszka
Acked-by: Peter Zijlstra
Acked-by: Thomas Gleixner
Cc: Oleg Nesterov
Cc: Andrew Morton
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Don't update values in expiration cache when new ones are
equal. Add expire_le() and expire_gt() helpers to simplify the
code.

Signed-off-by: Stanislaw Gruszka
Acked-by: Peter Zijlstra
Acked-by: Thomas Gleixner
Cc: Oleg Nesterov
Cc: Andrew Morton
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Measure the ITIMER_PROF and ITIMER_VIRT interval error between
real ticks and the interval requested by the user. Take it into
account when scheduling the next tick.

This patch introduces the possibility that the time between two
consecutive ticks is smaller than the requested interval. It
preserves, however, the guarantee that the nth tick is generated no
earlier than n*interval, counting from the beginning of
periodic signal generation.

Signed-off-by: Stanislaw Gruszka
Acked-by: Peter Zijlstra
Acked-by: Thomas Gleixner
Cc: Oleg Nesterov
Cc: Andrew Morton
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Both cpu itimers have the same data flow in a few places; this
patch unifies the code related to the VIRT and PROF
itimers.

Signed-off-by: Stanislaw Gruszka
Acked-by: Peter Zijlstra
Acked-by: Thomas Gleixner
Cc: Oleg Nesterov
Cc: Andrew Morton
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
LKML-Reference:
Signed-off-by: Ingo Molnar
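The interval-error accounting described in this group of commits can be illustrated with a small model. This is a sketch under assumptions, not the kernel code: the field names, the nanosecond units, and the helpers `itimer_init()` and `itimer_rearm()` are chosen here for illustration. The requested interval is rounded up to whole ticks, the rounding surplus is accumulated, and whenever the surplus reaches a full tick the next expiry is pulled in by one tick.

```c
/* Illustrative per-itimer state shared by the VIRT and PROF variants. */
struct cpu_itimer_model {
	unsigned long expires;         /* next expiry, in ticks */
	unsigned long incr;            /* interval rounded up to whole ticks */
	unsigned long long error;      /* accumulated rounding surplus, in ns */
	unsigned long long incr_error; /* surplus added per period, in ns */
};

void itimer_init(struct cpu_itimer_model *it,
		 unsigned long long interval_ns, unsigned long long tick_ns)
{
	it->incr = (unsigned long)((interval_ns + tick_ns - 1) / tick_ns);
	it->incr_error = it->incr * tick_ns - interval_ns;
	it->expires = 0;
	it->error = 0;
}

/* Advance to the next expiry and return it (in ticks). */
unsigned long itimer_rearm(struct cpu_itimer_model *it,
			   unsigned long long tick_ns)
{
	it->expires += it->incr;
	it->error += it->incr_error;
	if (it->error >= tick_ns) {
		/* A whole tick of surplus has built up: fire one tick earlier. */
		it->expires -= 1;
		it->error -= tick_ns;
	}
	return it->expires;
}
```

With a 10 ms tick and a 15 ms interval, successive calls to `itimer_rearm()` return ticks 2, 3, 5, 6: individual gaps can be shorter than the rounded interval, yet the nth expiry never comes before n times the requested interval.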
30 Apr, 2009
1 commit
-
Sparse reports the following in kernel/posix-cpu-timers.c:
warning: symbol 'firing' shadows an earlier one
Signed-off-by: H Hartley Sweeten
Cc: Subrata Modak
LKML-Reference:
Signed-off-by: Ingo Molnar
10 Apr, 2009
1 commit
-
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: do not count frozen tasks toward load
sched: refresh MAINTAINERS entry
sched: Print sched_group::__cpu_power in sched_domain_debug
cpuacct: add per-cgroup utime/stime statistics
posixtimers, sched: Fix posix clock monotonicity
sched_rt: don't allocate cpumask in fastpath
cpuacct: make cpuacct hierarchy walk in cpuacct_charge() safe when rcupreempt is used -v2
08 Apr, 2009
2 commits
-
update_rlimit_cpu() tries to optimize out set_process_cpu_timer() in case
when we already have CPUCLOCK_PROF timer which should expire first. But it
uses cputime_lt() instead of cputime_gt().

Test case:

#include <assert.h>
#include <sys/resource.h>
#include <sys/time.h>

int main(void)
{
	struct itimerval it = {
		.it_value = { .tv_sec = 1000 },
	};

	assert(!setitimer(ITIMER_PROF, &it, NULL));

	struct rlimit rl = {
		.rlim_cur = 1,
		.rlim_max = 1,
	};

	assert(!setrlimit(RLIMIT_CPU, &rl));

	for (;;)
		;

	return 0;
}

Without this patch, the task is not killed as RLIMIT_CPU demands.
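The comparison at the heart of this fix can be reduced to a few lines. The sketch below is a model, not the kernel function: the helper names are hypothetical, times are plain seconds, and treating 0 as "no profiling timer armed" is an assumption.

```c
#include <stdbool.h>

/*
 * Decide whether a new RLIMIT_CPU value must (re)arm the process
 * timer: yes if no profiling timer is pending, or if the pending
 * one would expire later than the limit.
 */
bool rlimit_needs_timer(unsigned long prof_expires, unsigned long rlim_secs)
{
	return prof_expires == 0 || prof_expires > rlim_secs;
}

/* The inverted comparison, kept only to show the failure mode. */
bool rlimit_needs_timer_buggy(unsigned long prof_expires,
			      unsigned long rlim_secs)
{
	return prof_expires == 0 || prof_expires < rlim_secs;
}
```

For the test case above (ITIMER_PROF at 1000 seconds, RLIMIT_CPU at 1 second) the correct check asks for the timer to be armed, while the inverted one skips it, so the limit is only noticed when the 1000 second timer fires.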
Signed-off-by: Oleg Nesterov
Acked-by: Peter Zijlstra
Cc: Peter Lojkin
Cc: Roland McGrath
Cc: stable@kernel.org
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Merge reason: update to latest upstream to queue up fix
Signed-off-by: Ingo Molnar
01 Apr, 2009
1 commit
-
Impact: Regression fix (against clock_gettime() backwarding bug)
This patch re-introduces a couple of functions, task_sched_runtime
and thread_group_sched_runtime, which were removed around the
time of 2.6.28-rc1.

These functions protect the sampling of the thread/process clock with
the rq lock. The rq lock is required so that rq->clock is not updated
during the sampling.

Without it, clock_gettime() may return
((accounted runtime before update) + (delta after update))
that is less than what it should be.

v2 -> v3:
- Rename static helper function __task_delta_exec()
to do_task_delta_exec() since -tip tree already has
a __task_delta_exec() of different version.

v1 -> v2:
- Revises comments of function and patch description.
- Add note about accuracy of thread group's runtime.

Signed-off-by: Hidetoshi Seto
Acked-by: Peter Zijlstra
Cc: stable@kernel.org [2.6.28.x][2.6.29.x]
LKML-Reference:
Signed-off-by: Ingo Molnar
24 Mar, 2009
1 commit
-
See http://bugzilla.kernel.org/show_bug.cgi?id=12911
copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
fastpath_timer_check() returns false unless we have other cpu timers.

This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not
optimal, we need further cleanups here. With this patch update_rlimit_cpu()
is not really needed, but I don't think it should be removed.

The proper fix (I think) is:

- set_process_cpu_timer() should just start the cputimer->running
logic (it does), no need to change cputime_expires.xxx_exp

- posix_cpu_timers_init_group() should set ->running when needed

- fastpath_timer_check() can check ->running instead of
task_cputime_zero(signal->cputime_expires)

Reported-by: Peter Lojkin
Signed-off-by: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Roland McGrath
Cc: [for 2.6.29.x]
LKML-Reference:
Signed-off-by: Ingo Molnar
13 Feb, 2009
1 commit
-
While reviewing the manpages, I noticed I'd missed some clock vs timer sites.
Make sure that all timer functions call cpu_timer_sample_group() and not
cpu_clock_sample_group(). This ensures that we enable the process wide timer
in time, and therefore pay the O(n) thread group cost from the syscall.

Not doing it here will result in the first jiffy tick after setting the timer
doing this, resulting in a very expensive tick (but only once) and a delay in
actually starting the timer.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
11 Feb, 2009
2 commits
-
The POSIX timer interface allows for absolute time expiry values through the
TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock
every time we start it.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
To decrease the chance of a missed enable, always enable the timer when we
sample it, we'll always disable it when we find that there are no active timers
in the jiffy tick.

This fixes a flood of warnings reported by Mike Galbraith.
Reported-by: Mike Galbraith
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
05 Feb, 2009
1 commit
-
Change the process wide cpu timers/clocks so that we:

1) don't mess up the kernel with too many threads,
2) don't have a per-cpu allocation for each process,
3) have no impact when not used.

In order to accomplish this we're going to split it into two parts:

- clocks; which can take all the time they want since they run
from user context -- ie. sys_clock_gettime(CLOCK_PROCESS_CPUTIME_ID)

- timers; which need constant time sampling but since they're
explicitly used, the user can pay the overhead.

The clock readout will go back to a full sum of the thread group, while the
timers will run off a global 'clock' that only runs when needed, so only
programs that make use of the facility pay the price.

Signed-off-by: Peter Zijlstra
Reviewed-by: Ingo Molnar
Signed-off-by: Ingo Molnar
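The clock/timer split described in this commit can be modeled in a few lines of userspace C. This is a sketch of the idea only: `struct group` and the three helpers are hypothetical, and syncing the accumulator to the full sum when a timer is armed is an assumption drawn from the later commits listed above.

```c
#include <stdbool.h>
#include <stddef.h>

struct thread_times {
	unsigned long long runtime;
};

/* Illustrative thread group with an on-demand shared accumulator. */
struct group {
	struct thread_times *threads;
	size_t nthreads;
	bool timer_running;             /* shared 'clock' active? */
	unsigned long long timer_total; /* only maintained while running */
};

/* Clock readout: a full O(n) sum, paid by the caller in user context. */
unsigned long long group_clock_read(const struct group *g)
{
	unsigned long long sum = 0;

	for (size_t i = 0; i < g->nthreads; i++)
		sum += g->threads[i].runtime;
	return sum;
}

/* Accounting hook: feeds the shared accumulator only while a timer is armed. */
void group_account(struct group *g, size_t idx, unsigned long long delta)
{
	g->threads[idx].runtime += delta;
	if (g->timer_running)
		g->timer_total += delta;
}

/* Arming a timer: pay O(n) once to sync with the clock, then run. */
void group_timer_start(struct group *g)
{
	if (!g->timer_running) {
		g->timer_total = group_clock_read(g);
		g->timer_running = true;
	}
}
```

Programs that never arm a timer pay nothing per tick beyond their own per-thread accounting, which is goal 3) in the list above.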
08 Jan, 2009
1 commit
-
Either we bounce one cacheline per cpu per tick, yielding n^2 bounces,
or we just bounce a single one.

Also, using per-cpu allocations for the thread-groups complicates the
per-cpu allocator in that it's currently aimed to be a fixed sized
allocator and the only possible extension to that would be vmap based,
which is seriously constrained on 32 bit archs.

So making the per-cpu memory requirement depend on the number of
processes is an issue.

Lastly, it didn't deal with cpu-hotplug, although admittedly that might
be fixable.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
25 Dec, 2008
1 commit
24 Nov, 2008
1 commit
-
Since CLOCK_PROCESS_CPUTIME_ID is in fact translated to -6, the switch
statement in cpu_clock_sample_group() must first mask off the irrelevant
bits, similar to cpu_clock_sample().

Signed-off-by: Petr Tesarik
Signed-off-by: Thomas Gleixner
--
posix-cpu-timers.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
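Why the clock id must be masked before the switch can be shown with a little arithmetic. The constants and the encoding below follow my recollection of the kernel's CPU-clock id scheme and should be read as assumptions; the function names are hypothetical, and unsigned arithmetic is used to avoid shifting a negative value.

```c
/* Which clock is sampled (low two bits of the id). */
enum { CPUCLOCK_PROF = 0, CPUCLOCK_VIRT = 1, CPUCLOCK_SCHED = 2 };

#define CPUCLOCK_CLOCK_MASK 3

/* Encode "process clock of pid" as a negative id: ~pid in the upper bits. */
int make_process_cpuclock(unsigned int pid, int clock)
{
	return (int)((~pid << 3) | (unsigned int)clock);
}

/* Strip the pid and per-thread bits, leaving only the clock type. */
int cpuclock_which(int clock_id)
{
	return clock_id & CPUCLOCK_CLOCK_MASK;
}
```

`make_process_cpuclock(0, CPUCLOCK_SCHED)` yields -6, the value the commit message gives for CLOCK_PROCESS_CPUTIME_ID. A `switch` on the raw -6 matches none of the three clock cases, while `cpuclock_which(-6)` is 2, i.e. `CPUCLOCK_SCHED`.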
17 Nov, 2008
1 commit
-
Impact: simplify the code
thread_group_cputime() is called by current when it must have the valid
->signal, or under ->siglock, or under tasklist_lock after the ->signal
check, or the caller is wait_task_zombie() which reaps the child. In any
case ->signal can't be NULL.

But the point of this patch is not optimization. If it is possible to call
thread_group_cputime() when ->signal == NULL we are doing something wrong,
and we should not mask the problem. thread_group_cputime() fills *times
and the caller will use it, if we silently use task_struct->*times* we
report the wrong values.

Signed-off-by: Oleg Nesterov
Signed-off-by: Ingo Molnar