22 Aug, 2009
1 commit
-
After talking with some application writers who want very fast, but not
fine-grained, timestamps, I decided to try to implement new clock_ids
for clock_gettime(): CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE,
which return the time at the last tick. This is very fast as we don't
have to access any hardware (which can be very painful if you're using
something like the acpi_pm clocksource), and we can even use the vdso
clock_gettime() method to avoid the syscall. The only trade-off is that you
only get low-res, tick-grained time resolution.
This isn't a new idea: I know Ingo has a patch in the -rt tree that made
the vsyscall gettimeofday() return coarse-grained time when the
vsyscall64 sysctl was set to 2. However, this affects all applications
on a system.
With this method, applications can choose the proper speed/granularity
trade-off for themselves.
Signed-off-by: John Stultz
Cc: Andi Kleen
Cc: nikolag@ca.ibm.com
Cc: Darren Hart
Cc: arjan@infradead.org
Cc: jonathan@jonmasters.org
LKML-Reference:
Signed-off-by: Thomas Gleixner
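As an illustration of the commit above, a minimal userspace sketch of reading the coarse clocks might look like the following. The helper names read_clock_ns() and clock_resolution_ns() are hypothetical and not part of the patch; the sketch assumes a Linux libc whose <time.h> exposes CLOCK_MONOTONIC_COARSE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read clock `id` and store its value in
 * nanoseconds in *out_ns. Returns 0 on success, -1 on failure
 * (errno is set by clock_gettime()). */
static int read_clock_ns(clockid_t id, int64_t *out_ns)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;
	*out_ns = (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
	return 0;
}

/* Hypothetical helper: store the advertised resolution of clock `id`
 * in nanoseconds in *out_ns. For the _COARSE clocks this is expected
 * to be on the order of one tick rather than one nanosecond. */
static int clock_resolution_ns(clockid_t id, int64_t *out_ns)
{
	struct timespec ts;

	if (clock_getres(id, &ts) != 0)
		return -1;
	*out_ns = (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
	return 0;
}
```

A caller that only needs tick-grained timestamps would pass CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE to read_clock_ns(); everything else keeps using the fine-grained ids.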
04 Aug, 2009
1 commit
-
Prevent calling do_nanosleep() with clockid
CLOCK_MONOTONIC_RAW, as it may cause an oops, such as a NULL pointer
dereference.
Signed-off-by: Hiroshi Shimamoto
Cc: Andrew Morton
Cc: Thomas Gleixner
Cc: John Stultz
Cc:
LKML-Reference:
Signed-off-by: Ingo Molnar
14 Jan, 2009
1 commit
-
Signed-off-by: Heiko Carstens
26 Dec, 2008
1 commit
-
…ohz', 'timers/ntp', 'timers/posixtimers' and 'timers/rtc' into timers/core
21 Dec, 2008
1 commit
-
Impact: Prevent kernel crash with posix timer clockid CLOCK_MONOTONIC_RAW
commit 2d42244ae71d6c7b0884b5664cf2eda30fb2ae68 (clocksource:
introduce CLOCK_MONOTONIC_RAW) introduced a new clockid, which is only
available to read out the raw, non-NTP-adjusted system time.
The above commit did not prevent a posix timer from being created
with that clockid. The timer_create() syscall succeeds and initializes
the timer to a non-existent hrtimer base. When the timer is deleted,
either by timer_delete() or by the exit() cleanup, the kernel crashes.
Prevent the creation of timers for CLOCK_MONOTONIC_RAW by setting the
posix clock function to no_timer_create, which returns an error code.
Reported-and-tested-by: Eric Sesterhenn
Signed-off-by: Thomas Gleixner
Acked-by: Oleg Nesterov
Signed-off-by: Linus Torvalds
13 Dec, 2008
2 commits
-
Impact: clean up, speed up
->it_pid (was ->it_process) has also a special meaning: if it is NULL,
the timer is under deletion or it wasn't initialized yet. We can check
->it_signal != NULL instead, this way we can
- simplify sys_timer_create() a bit
- remove yet another check from lock_timer()
- move put_pid(->it_pid) into release_posix_timer() which
runs outside of ->it_lock
Signed-off-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
Impact: restructure, clean up code
k_itimer holds the ref to the ->it_process until sys_timer_delete(). This
allows pinning up to RLIMIT_SIGPENDING dead task_structs. Change the code
to use "struct pid *" instead.
The patch doesn't kill ->it_process, it places ->it_pid into the union.
->it_process is still used by do_cpu_nanosleep() as before. It would be
trivial to change the nanosleep code as well, but since it uses it_process
in a special way I think it is better to keep this field for grep.
The patch bloats the kernel by 104 bytes and it also adds the new pointer,
->it_signal, to k_itimer. It is used by lock_timer() to verify that the
found timer was not created by another process. It is not clear why we
use the global database (and thus the global idr_lock) for posix timers.
We still need the signal_struct->posix_timers which contains all usable
timers; perhaps it is better to use some form of per-process array
instead.
Signed-off-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
22 Oct, 2008
1 commit
-
Conflicts:
kernel/time/tick-sched.c
Signed-off-by: Thomas Gleixner
20 Oct, 2008
1 commit
-
…tp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus
18 Oct, 2008
1 commit
-
Conflicts:
arch/x86/kvm/i8254.c
03 Oct, 2008
1 commit
-
Found by static checker (http://repo.or.cz/w/smatch.git).
Signed-off-by: Dan Carpenter
Acked-by: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
24 Sep, 2008
9 commits
-
Cleanup. Imho makes the code much more understandable. At least this
patch lessens both the source and compiled code.
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
lock_timer() checks that the timer found by idr_find(timer_id) has ->it_id
== timer_id. This buys nothing. This check can fail only if
sys_timer_create() unlocked idr_lock after idr_get_new(), but didn't set
->it_id = new_timer_id yet. But in that case ->it_process == NULL so
lock_timer() can't succeed anyway.
Also remove a couple of unneeded typecasts.
Note that with or without this patch we have a small problem.
sys_timer_create() doesn't ensure that the result of setting (say)
->it_sigev_notify must be visible if lock_timer() succeeds.
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
With the recent changes ->it_sigev_signo and ->it_sigev_value are only
used in sys_timer_create(), kill them.
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
Cleanup.
- sys_timer_create() is big and complicated. The code above the "out:"
label relies on the fact that "error" must be == 0. This is not very
robust, make the code more explicit. Remove the unneeded initialization
of error.
- If idr_get_new() succeeds (as it normally should), we check the returned
value twice. Move the "-EAGAIN" check under "if (error)".
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
posix_timer_event() always populates timer->sigq with the same numbers,
move this code into sys_timer_create().
Note that with this patch we can kill it_sigev_signo and it_sigev_value.
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
- Change the code to do rcu_read_lock() instead of taking tasklist_lock,
it is safe to get_task_struct(p) if p was found under RCU.
However, now we must not use process's sighand/signal, they may be NULL.
We can use current->sighand/signal instead, this "process" must belong
to the current's thread-group.
- Factor out the common code for 2 "if (timer_event_spec)" branches, the
!timer_event_spec case can use current too.
- use spin_lock_irq() instead of _irqsave(), kill "flags".
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
sys_timer_create() returns -EINVAL if the target thread has PF_EXITING.
This doesn't really make sense, the sub-thread can die right after unlock.
And in fact, this is just wrong. Without SIGEV_THREAD_ID good_sigevent()
returns ->group_leader, and it is very possible that the leader is already
dead. This is OK, we shouldn't return the error in this case.
Remove this check and the comment. Note that the "process" was found
under tasklist_lock, it must have ->sighand != NULL.
Also, remove a couple of unneeded initializations.
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
Change the code to get/put timer->it_process regardless of
SIGEV_THREAD_ID. This streamlines the create/destroy paths and allows us
to simplify the usage of exit_itimers() in de_thread().
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
-
posix_timer_event() drops SIGEV_THREAD_ID and switches to ->group_leader
if send_sigqueue() fails.
This is not very useful and doesn't work reliably. send_sigqueue() can
only fail if ->it_process is dead. But it can die before it dequeues the
SI_TIMER signal, in that case the timer stops anyway.
Remove this code. I guess it was needed long ago to ensure that the
timer is not destroyed when its creator thread dies.
Q: perhaps it makes sense to change sys_timer_settime() to return an error
if ->it_process is dead?
Signed-off-by: Oleg Nesterov
Cc: mingo@elte.hu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
06 Sep, 2008
1 commit
-
In order to be able to do range hrtimers we need to use accessor functions
to the "expire" member of the hrtimer struct.
This patch converts kernel/* to these accessors.
Signed-off-by: Arjan van de Ven
21 Aug, 2008
1 commit
-
In talking with Josip Loncaric about his work on clock synchronization (see
btime.sf.net), he mentioned that for really close synchronization, it is
useful to have access to "hardware time", that is a notion of time that is
not in any way adjusted by the clock slewing done to keep close time sync.
Part of the issue is if we are using the kernel's ntp adjusted
representation of time in order to measure how we should correct time, we
can run into what Paul McKenney aptly described as "Painting a road using
the lines we're painting as the guide".
I had been thinking of a similar problem, and was trying to come up with a
way to give users access to a purely hardware based time representation
that avoided users having to know the underlying frequency and mask values
needed to deal with the wide variety of possible underlying hardware
counters.
My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
nanosecond based time value, that increments starting at bootup and has no
frequency adjustments made to it whatsoever.
The time is accessed from userspace via the posix_clock_gettime() syscall,
passing CLOCK_MONOTONIC_RAW as the clock_id.
Signed-off-by: John Stultz
Signed-off-by: Roman Zippel
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar
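To make the description above concrete, here is a hedged userspace sketch that samples CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW back to back so the two elapsed intervals can be compared. It assumes the clock is reachable through the libc clock_gettime() wrapper; the helper names timespec_diff_ns() and sample_mono_and_raw() are hypothetical and do not come from the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: return (later - earlier) in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *later,
				const struct timespec *earlier)
{
	return (int64_t)(later->tv_sec - earlier->tv_sec) * 1000000000LL
	       + (later->tv_nsec - earlier->tv_nsec);
}

/* Hypothetical helper: sample the NTP-adjusted monotonic clock and the
 * raw one back to back. Returns 0 on success, -1 if either read fails. */
static int sample_mono_and_raw(struct timespec *mono, struct timespec *raw)
{
	if (clock_gettime(CLOCK_MONOTONIC, mono) != 0)
		return -1;
	if (clock_gettime(CLOCK_MONOTONIC_RAW, raw) != 0)
		return -1;
	return 0;
}
```

Taking two such samples some time apart and comparing timespec_diff_ns() for each clock gives a rough view of how much frequency adjustment was applied to CLOCK_MONOTONIC over that interval.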
12 Aug, 2008
1 commit
-
…el/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
posix-timers: fix posix_timer_event() vs dequeue_signal() race
posix-timers: do_schedule_next_timer: fix the setting of ->si_overrun
26 Jul, 2008
2 commits
-
release_posix_timer() can't be called with ->it_process != NULL. Once
sys_timer_create() sets ->it_process it must not call
release_posix_timer(), otherwise we can race with another thread doing
sys_timer_delete(), this timer is visible to idr_find() and unlocked.
The same is true for two other callers (actually, for any possible
caller), sys_timer_delete() and itimer_delete(). They must clear
->it_process before unlock_timer() + release_posix_timer().
Signed-off-by: Oleg Nesterov
Acked-by: Roland McGrath
Cc: john stultz
Cc: Thomas Gleixner
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
sys_timer_delete() and itimer_delete() check "timer->it_process != NULL",
this looks completely bogus. ->it_process == NULL means that this timer
is already under destruction or it is not fully initialized, this must not
happen.
sys_timer_delete: the timer is locked, and lock_timer() can't succeed
if ->it_process == NULL.
itimer_delete: it is called by exit_itimers() when there are no other
threads which can play with signal_struct->posix_timers.
Signed-off-by: Oleg Nesterov
Acked-by: Roland McGrath
Cc: john stultz
Cc: Thomas Gleixner
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
24 Jul, 2008
2 commits
-
The bug was reported and analysed by Mark McLoughlin,
the patch is based on his and Roland's suggestions.
posix_timer_event() always rewrites the pre-allocated siginfo before sending
the signal. Most of the written info is the same all the time, but memset(0)
is very wrong. If ->sigq is queued we can race with collect_signal() which
can fail to find this siginfo looking at .si_signo, or copy_siginfo() can
copy the wrong .si_code/si_tid/etc.
In short, sys_timer_settime() can in fact stop the active timer, or the user
can receive the siginfo with the wrong .si_xxx values.
Move "memset(->info, 0)" from posix_timer_event() to alloc_posix_timer(),
change send_sigqueue() to set .si_overrun = 0 when ->sigq is not queued.
It would be nice to move the whole sigq->info initialization from send to
create path, but this is not easy to do without uglifying timer_create()
further.
As Roland rightly pointed out, we need more cleanups/fixes here, see the
"FIXME" comment in the patch. Hopefully this patch makes sense anyway, and
it can mask the worst implications.
Reported-by: Mark McLoughlin
Signed-off-by: Oleg Nesterov
Cc: Mark McLoughlin
Cc: Oliver Pinter
Cc: Roland McGrath
Cc: stable@kernel.org
Cc: Andrew Morton
Signed-off-by: Thomas Gleixner
kernel/posix-timers.c | 17 +++++++++++++----
kernel/signal.c | 1 +
2 files changed, 14 insertions(+), 4 deletions(-)
-
do_schedule_next_timer() sets info->si_overrun = timr->it_overrun_last,
this discards the already accumulated overruns.
Signed-off-by: Oleg Nesterov
Cc: Mark McLoughlin
Cc: Oliver Pinter
Cc: Roland McGrath
Cc: stable@kernel.org
Cc: Andrew Morton
Signed-off-by: Thomas Gleixner
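The difference between overwriting and accumulating the overrun count can be shown with a toy model. The structures and function names below are stand-ins invented for illustration, not the kernel's own definitions, and treating the fix as "add instead of assign" is an assumption based on the description above.

```c
/* Toy stand-ins holding only the fields discussed in the commit. */
struct toy_siginfo {
	int si_overrun;		/* overruns reported to user space */
};

struct toy_timer {
	int it_overrun_last;	/* overruns seen since the last delivery */
};

/* Behaviour described as buggy: plain assignment throws away whatever
 * was already accumulated in info->si_overrun. */
static void overrun_assign(struct toy_siginfo *info,
			   const struct toy_timer *timr)
{
	info->si_overrun = timr->it_overrun_last;
}

/* Assumed shape of the fix: add to the value that is already there. */
static void overrun_accumulate(struct toy_siginfo *info,
			       const struct toy_timer *timr)
{
	info->si_overrun += timr->it_overrun_last;
}
```

With 3 overruns already recorded and 2 new ones, the assigning variant reports 2 while the accumulating variant reports 5.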
30 Apr, 2008
1 commit
-
We export send_sigqueue() and send_group_sigqueue() for the only user,
posix_timer_event(). This is a bit silly, because both are just trivial
helpers on top of do_send_sigqueue() and because we pass the unused
.si_signo parameter.
Kill them both, rename do_send_sigqueue() to send_sigqueue(), and export it.
Signed-off-by: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Apr, 2008
1 commit
-
None of these files use any of the functionality promised by
asm/semaphore.h.
Signed-off-by: Matthew Wilcox
15 Feb, 2008
1 commit
-
Various user space callers ask for relative timeouts. While we fixed
that overflow issue in hrtimer_start(), the sites which convert
relative user space values to absolute timeouts themselves were uncovered.
Instead of putting overflow checks into each place add a function
which does the sanity checking and convert all affected callers to use
it.
Thanks to Frans Pop, who reported the problem and tested the fixes.
Signed-off-by: Thomas Gleixner
Acked-by: Ingo Molnar
Tested-by: Frans Pop
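The kind of sanity check described above can be sketched as a saturating addition on nanosecond values. This is only an illustration: the helper name add_ns_saturating() is hypothetical, plain int64_t stands in for the kernel's time type, and both arguments are assumed to be non-negative.

```c
#include <stdint.h>

/* Hypothetical helper: convert a relative timeout into an absolute one
 * by adding it to the current time, clamping to INT64_MAX instead of
 * wrapping around. Assumes now_ns >= 0 and rel_ns >= 0. */
static int64_t add_ns_saturating(int64_t now_ns, int64_t rel_ns)
{
	/* now_ns + rel_ns would overflow exactly when rel_ns exceeds the
	 * headroom left above now_ns. */
	if (rel_ns > INT64_MAX - now_ns)
		return INT64_MAX;
	return now_ns + rel_ns;
}
```

Putting the check in one helper means each conversion site calls it instead of open-coding its own overflow test, which is the design choice the commit message argues for.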
10 Feb, 2008
1 commit
-
Spotted by Pavel Emelyanov and Alexey Dobriyan.
hrtimer_nanosleep() sets restart_block->arg1 = rmtp, but this rmtp points to
the local variable which lives in the caller's stack frame. This means that
if sys_restart_syscall() actually happens and it is interrupted as well, we
don't update the user-space variable, but write into the already dead stack
frame.
Introduced by commit 04c227140fed77587432667a574b14736a06dd7f
hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easier
Change the callers to pass "__user *rmtp" to hrtimer_nanosleep(), and change
hrtimer_nanosleep() to use copy_to_user() to actually update *rmtp.
Small problem remains. man 2 nanosleep states that *rmtp should be written if
nanosleep() was interrupted (it says nothing whether it is OK to update *rmtp
if nanosleep returns 0), but (with or without this patch) we can dirty *rem
even if nanosleep() returns 0.
NOTE: this patch doesn't change compat_sys_nanosleep(), because it has other
bugs. Fixed by the next patch.
Signed-off-by: Oleg Nesterov
Cc: Alexey Dobriyan
Cc: Michael Kerrisk
Cc: Pavel Emelyanov
Cc: Peter Zijlstra
Cc: Toyo Abe
Cc: Andrew Morton
Signed-off-by: Thomas Gleixner
include/linux/hrtimer.h | 2 -
kernel/hrtimer.c | 51 +++++++++++++++++++++++++-----------------------
kernel/posix-timers.c | 14 +------------
3 files changed, 30 insertions(+), 37 deletions(-)
09 Feb, 2008
1 commit
-
All the functions that need to lookup a task by pid in posix timers obtain
this pid from user space, and thus this value refers to a task in the same
namespace as the one the current task lives in.
So the proper behavior is to call find_task_by_vpid() here.
Signed-off-by: Pavel Emelyanov
Cc: "Eric W. Biederman"
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Feb, 2008
1 commit
-
This is the new timerfd API as it is implemented by the following patch:
int timerfd_create(int clockid, int flags);
int timerfd_settime(int ufd, int flags,
const struct itimerspec *utmr,
struct itimerspec *otmr);
int timerfd_gettime(int ufd, struct itimerspec *otmr);
The timerfd_create() API creates an un-programmed timerfd fd. The "clockid"
parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.
The timerfd_settime() API gives new settings to the timerfd fd, optionally
retrieving the previous expiration time (in case the "otmr" parameter is not
NULL).
The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit
is set in the "flags" parameter. Otherwise it's a relative time.
The timerfd_gettime() API returns the next expiration time of the timer, or
{0, 0} if the timerfd has not been set yet.
Like the previous timerfd API implementation, read(2) and poll(2) are
supported (with the same interface). Here's a simple test program I used to
exercise the new timerfd APIs:
http://www.xmailserver.org/timerfd-test2.c
[akpm@linux-foundation.org: coding-style cleanups]
[akpm@linux-foundation.org: fix ia64 build]
[akpm@linux-foundation.org: fix m68k build]
[akpm@linux-foundation.org: fix mips build]
[akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds]
[heiko.carstens@de.ibm.com: fix s390]
[akpm@linux-foundation.org: fix powerpc build]
[akpm@linux-foundation.org: fix sparc64 more]
Signed-off-by: Davide Libenzi
Cc: Michael Kerrisk
Cc: Thomas Gleixner
Cc: Davide Libenzi
Cc: Michael Kerrisk
Cc: Martin Schwidefsky
Signed-off-by: Heiko Carstens
Cc: Michael Kerrisk
Cc: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
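As a small usage sketch of the API described in the commit above (separate from the linked test program), the following arms a one-shot relative timer and waits for it with read(2). The helper name wait_one_shot_ms() is hypothetical, and the sketch assumes a libc that provides <sys/timerfd.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <sys/timerfd.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: arm a one-shot timer that fires delay_ms
 * milliseconds from now on CLOCK_MONOTONIC, block until it expires,
 * and return the expiration count read from the fd (-1 on error). */
static int64_t wait_one_shot_ms(long delay_ms)
{
	struct itimerspec its = { 0 };
	uint64_t expirations = 0;
	ssize_t n;
	int fd;

	/* An all-zero it_value would disarm the timer, so reject it. */
	if (delay_ms <= 0)
		return -1;

	fd = timerfd_create(CLOCK_MONOTONIC, 0);
	if (fd < 0)
		return -1;

	its.it_value.tv_sec = delay_ms / 1000;
	its.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;

	/* flags == 0: the value is relative (no TFD_TIMER_ABSTIME). */
	if (timerfd_settime(fd, 0, &its, NULL) != 0) {
		close(fd);
		return -1;
	}

	/* read(2) blocks until expiry and yields an 8-byte counter. */
	n = read(fd, &expirations, sizeof(expirations));
	close(fd);
	if (n != (ssize_t)sizeof(expirations))
		return -1;
	return (int64_t)expirations;
}
```

Because it_interval is left at zero the timer fires once, so a successful call is expected to report a single expiration.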
03 Feb, 2008
1 commit
-
Signed-off-by: Joe Perches
Signed-off-by: Adrian Bunk
20 Oct, 2007
1 commit
-
With pid namespaces this field is now dangerous to use explicitly, so hide
it behind the helpers.
Also the pid and pgrp fields of task_struct and signal_struct are to be
deprecated. Unfortunately this patch cannot be sent right now as this
leads to tons of warnings, so start isolating them, and deprecate later.
Actually the p->tgid == pid has to be changed to has_group_leader_pid(),
but Oleg pointed out that in case of posix cpu timers this is the same, and
thread_group_leader() is preferable.
Signed-off-by: Pavel Emelyanov
Acked-by: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Oct, 2007
1 commit
-
Pull the copy_to_user out of hrtimer_nanosleep and into the callers
(common_nsleep, sys_nanosleep) in preparation for converting
compat_sys_nanosleep to use hrtimers.
Signed-off-by: Anton Blanchard
Acked-by: Arnd Bergmann
Signed-off-by: Thomas Gleixner
17 Oct, 2007
1 commit
-
These aren't modular, so SLAB_PANIC is OK.
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Oct, 2007
1 commit
-
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
23 Aug, 2007
2 commits
-
sys_timer_create() sets ->it_process and unlocks ->siglock, then checks
tmr->it_sigev_notify to define if get_task_struct() is needed.
We already passed ->it_id to the caller, another thread can delete this timer
and free its memory in between.
As a minimal fix, move this code under ->siglock, sys_timer_delete() takes it
too before calling release_posix_timer(). A proper serialization would be to
take ->it_lock, we add a partly initialized timer on posix_timers_id, not
good.
Signed-off-by: Oleg Nesterov
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
timer_delete does:
    lock_timer();
    timer->it_process = NULL;
    unlock_timer();
    release_posix_timer();
timer->it_process is checked in lock_timer() to prevent access to a
timer, which is on the way to be deleted, but the check happens after
idr_lock is dropped. This allows release_posix_timer() to delete the
timer before the lock code can check the timer:
    CPU 0                           CPU 1
    lock_timer();
    timer->it_process = NULL;
    unlock_timer();
                                    lock_timer()
                                        spin_lock(idr_lock);
                                        timer = idr_find();
                                        spin_lock(timer->lock);
                                        spin_unlock(idr_lock);
    release_posix_timer();
        spin_lock(idr_lock);
        idr_remove(timer);
        spin_unlock(idr_lock);
        free_timer(timer);
                                        if (timer->......)
Change the locking to prevent this.
Signed-off-by: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds