22 Oct, 2010

1 commit

  • On UP try_to_del_timer_sync() is mapped to del_timer() which does not
    take the running timer callback into account, so it has different
    semantics.

    Remove the SMP dependency of try_to_del_timer_sync() by using
    base->running_timer in the UP case as well.

    [ tglx: Removed set_running_timer() inline and tweaked the changelog ]

    Signed-off-by: Yong Zhang
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Acked-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Yong Zhang
     

21 Oct, 2010

3 commits

  • Currently, you have to just define a delayed_work uninitialised, and then
    initialise it before first use. That's a tad clumsy. At risk of playing
    mind-games with the compiler, fooling it into doing pointer arithmetic
    with compile-time-constants, this lets clients properly initialise delayed
    work with deferrable timers statically.

    This patch was inspired by the issues which led Artem Bityutskiy to
    commit 8eab945c5616fc984 ("sunrpc: make the cache cleaner workqueue
    deferrable").

    Signed-off-by: Phil Carmody
    Acked-by: Artem Bityutskiy
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Phil Carmody
     
  • TIMER_INITIALIZER() should initialize the slack field of timer_list,
    as __init_timer() does.

    Signed-off-by: Changli Gao
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Changli Gao
     
  • Reorder struct timer_list to remove 8 bytes of alignment padding on 64
    bit builds when CONFIG_TIMER_STATS is selected.

    timer_list is widely used across the kernel so many structures will
    benefit and shrink in size.

    For example, with my config on x86_64:
    * per_cpu_dm_data shrinks from 136 to 128 bytes
    * ahci_port_priv shrinks from 1032 to 968 bytes

    Signed-off-by: Richard Kennedy
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Richard Kennedy
     

03 Aug, 2010

1 commit


07 Apr, 2010

1 commit

  • While HR timers have had the concept of timer slack for quite some time
    now, the legacy timers lacked this concept, and had to make do with
    round_jiffies() and friends.

    Timer slack is important for power management; grouping timers reduces the
    number of wakeups which in turn reduces power consumption.

    This patch introduces timer slack to the legacy timers using the following
    pieces:
    * A slack field in the timer struct
    * An API (set_timer_slack) that callers can use to set explicit timer
      slack
    * A default slack of 0.4% of the requested delay for callers that do
      not set any explicit slack
    * Rounding code, part of mod_timer(), that tries to group timers around
      jiffies values at every 'power of two' (so quick timers will group
      around every 2 jiffies, but longer timers will group around every 4,
      8, 16, 32, etc.)

    Signed-off-by: Arjan van de Ven
    Cc: johnstul@us.ibm.com
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Arjan van de Ven
     

31 Aug, 2009

1 commit


24 Jun, 2009

1 commit

  • When the kernel is configured with CONFIG_TIMER_STATS but timer
    stats are runtime disabled we still get calls to
    __timer_stats_timer_set_start_info which initializes some
    fields in the corresponding struct timer_list.

    So add some quick checks in the timer stats setup functions
    to avoid function calls to __timer_stats_timer_set_start_info
    when timer stats are disabled.

    In an artificial workload that does nothing but playing ping
    pong with a single tcp packet via loopback this decreases cpu
    consumption by 1 - 1.5%.

    This is part of a modified function trace output on SLES11:

    perl-2497 [00] 28630647177732388 [+ 125]: sk_reset_timer
    Cc: Andrew Morton
    Cc: Martin Schwidefsky
    Cc: Mustafa Mesanovic
    Cc: Arjan van de Ven
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     

13 May, 2009

1 commit

  • * Arun R Bharadwaj [2009-04-16 12:11:36]:

    This patch creates a new framework for identifying cpu-pinned timers
    and hrtimers.

    This framework is needed because pinned timers are expected to fire on
    the same CPU on which they are queued, so it is essential to identify
    them and avoid migrating them.

    For regular timers, the existing add_timer_on() can be used to queue
    pinned timers, and subsequently mod_timer_pinned() can be used to
    modify the 'expires' field.

    For hrtimers, new modes HRTIMER_ABS_PINNED and HRTIMER_REL_PINNED are
    added to queue cpu-pinned hrtimers.

    [ tglx: use .._PINNED mode argument instead of creating tons of new
    functions ]

    Signed-off-by: Arun R Bharadwaj
    Signed-off-by: Thomas Gleixner

    Arun R Bharadwaj
     

31 Mar, 2009

1 commit

  • * 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (33 commits)
    lockdep: fix deadlock in lockdep_trace_alloc
    lockdep: annotate reclaim context (__GFP_NOFS), fix SLOB
    lockdep: annotate reclaim context (__GFP_NOFS), fix
    lockdep: build fix for !PROVE_LOCKING
    lockstat: warn about disabled lock debugging
    lockdep: use stringify.h
    lockdep: simplify check_prev_add_irq()
    lockdep: get_user_chars() redo
    lockdep: simplify get_user_chars()
    lockdep: add comments to mark_lock_irq()
    lockdep: remove macro usage from mark_held_locks()
    lockdep: fully reduce mark_lock_irq()
    lockdep: merge the !_READ mark_lock_irq() helpers
    lockdep: merge the _READ mark_lock_irq() helpers
    lockdep: simplify mark_lock_irq() helpers #3
    lockdep: further simplify mark_lock_irq() helpers
    lockdep: simplify the mark_lock_irq() helpers
    lockdep: split up mark_lock_irq()
    lockdep: generate usage strings
    lockdep: generate the state bit definitions
    ...

    Linus Torvalds
     

19 Feb, 2009

1 commit

  • Impact: new timer API

    Based on an idea from Martin Josefsson with the help of
    Patrick McHardy and Stephen Hemminger:

    introduce the mod_timer_pending() API, a mod_timer() offspring that
    is a no-op on already-removed timers.

    (regular mod_timer() re-activates non-pending timers.)

    This is useful for the networking code in that it can
    allow unserialized mod_timer_pending() timer-forwarding
    calls, but a single del_timer*() will stop the timer
    from being reactivated again.

    Also while at it:

    - optimize the regular mod_timer() path some more; the
    timer-stat and a debug check were needlessly duplicated
    in __mod_timer().

    - make the exports come straight after the function, as
    most other exports in timer.c already do.

    - eliminate __mod_timer() as an external API, change the
    users to mod_timer().

    The regular mod_timer() code path is not impacted
    significantly, due to inlining optimizations and due to
    the simplifications.

    Based-on-patch-from: Stephen Hemminger
    Acked-by: Stephen Hemminger
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Cc: netdev@vger.kernel.org
    Cc: Oleg Nesterov
    Cc: Andrew Morton
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

15 Feb, 2009

1 commit


06 Nov, 2008

1 commit

  • This patch (as1158b) adds round_jiffies_up() and friends. These
    routines work like the analogous round_jiffies() functions, except
    that they will never round down.

    The new routines will be useful for timeouts where we don't care
    exactly when the timer expires, provided it doesn't expire too soon.

    Signed-off-by: Alan Stern
    Signed-off-by: Jens Axboe

    Alan Stern
     

30 Apr, 2008

1 commit

  • Add calls to the generic object debugging infrastructure and provide
    fixup functions which allow keeping the system alive when recoverable
    problems have been detected by the object debugging core code.

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Greg KH
    Cc: Randy Dunlap
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

09 Feb, 2008

2 commits


30 Jan, 2008

1 commit


17 Jul, 2007

2 commits

  • Add a flag in /proc/timer_stats to indicate deferrable timers. This
    will let developers/users differentiate between types of timers in
    /proc/timer_stats.

    Deferrable timer and normal timer will appear in /proc/timer_stats as below.
    10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    10, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)

    Also version of timer_stats changes from v0.1 to v0.2

    Signed-off-by: Venkatesh Pallipadi
    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: john stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Venki Pallipadi
     
  • Remove the obviously unnecessary includes under the
    include/linux/ directory, and fix the couple of errors that are
    introduced as a result of that.

    Signed-off-by: Robert P. J. Day
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     

30 May, 2007

1 commit

  • get_next_timer_interrupt() returns a delta of (LONG_MAX >> 1) in case
    there is no timer pending. On 64 bit machines this results in a
    multiplication overflow in tick_nohz_stop_sched_tick().

    Reported-by: Dave Miller

    Make the return value a constant and limit the return value to a 32 bit
    value.

    When the max timeout value is returned, we can safely stop the tick
    timer device. The max jiffies delta results in a 12-day timeout for
    HZ=1000.

    In the long term the get_next_timer_interrupt() code needs to be
    reworked to return ktime instead of jiffies, but we have to wait until
    the last users of the original NO_IDLE_HZ code are converted.

    Signed-off-by: Thomas Gleixner
    Acked-by: David S. Miller
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

09 May, 2007

1 commit

  • Introduce a new flag for timers - deferrable: a deferrable timer works
    normally when the system is busy, but will not cause the CPU to come
    out of idle just to service it. Instead, the timer will be serviced
    when the CPU eventually wakes up for a subsequent non-deferrable
    timer.

    The main advantage of this is to avoid unnecessary timer interrupts
    when the CPU is idle. If the routine currently called by a timer can
    wait until the next event without any issues, this new timer can be
    used to set up the timer event for that routine. This, with dynticks,
    allows CPUs to be lazy, letting them stay idle for extended periods of
    time by reducing unnecessary wakeups and thereby reducing power
    consumption.

    This patch:

    Builds this new timer on top of the existing timer infrastructure. It
    uses the last bit of the 'base' pointer in the timer_list structure to
    store the deferrable timer flag. __next_timer_interrupt() skips over
    these deferrable timers when the CPU looks for the next timer event it
    has to wake up for.

    This is exported via a new interface, init_timer_deferrable(), which
    can be called in place of the regular init_timer().

    [akpm@linux-foundation.org: Privatise a #define]
    Signed-off-by: Venkatesh Pallipadi
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Oleg Nesterov
    Cc: Dave Jones
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Venki Pallipadi
     

17 Feb, 2007

3 commits

  • Add /proc/timer_stats support: debugging feature to profile timer expiration.
    Both the starting site, process/PID and the expiration function is captured.
    This allows the quick identification of timer event sources in a system.

    Sample output:

    # echo 1 > /proc/timer_stats
    # cat /proc/timer_stats
    Timer Stats Version: v0.1
    Sample period: 4.010 s
    24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
    11, 0 swapper sk_reset_timer (tcp_delack_timer)
    6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
    2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
    2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    4, 2050 pcscd do_nanosleep (hrtimer_wakeup)
    5, 4179 sshd sk_reset_timer (tcp_write_timer)
    4, 2248 yum-updatesd schedule_timeout (process_timeout)
    18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
    3, 0 swapper sk_reset_timer (tcp_delack_timer)
    1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer)
    2, 1 swapper e1000_up (e1000_watchdog)
    1, 1 init schedule_timeout (process_timeout)
    100 total events, 25.24 events/sec

    [ cleanups and hrtimers support from Thomas Gleixner ]
    [bunk@stusta.de: nr_entries can become static]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Andi Kleen
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • - hrtimers did not use the hrtimer_restart enum and relied on the
    implicit int representation. Fix the prototypes and the functions
    using the enums.
    - Use separate namespaces for the enumerations
    - Convert the hrtimer_restart macro to an inline function
    - Add comments

    No functional changes.

    [akpm@osdl.org: fix input driver]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Dmitry Torokhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • For CONFIG_NO_HZ we need to calculate the next timer wheel event based
    on a given jiffies value. Extend the existing code to allow the extra
    'now' argument. Provide a compatibility function for the existing
    implementations to call the function with now == jiffies. (This also
    solves the raciness of the original code vs. jiffies changing during
    the iteration.)

    No functional changes to existing users of this infrastructure.

    [ remove WARN_ON() that triggered on s390, by Carsten Otte ]
    [ made new helper static, Adrian Bunk ]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

27 Jan, 2007

1 commit


11 Dec, 2006

1 commit

  • Introduce a round_jiffies() function as well as a round_jiffies_relative()
    function. These functions round a jiffies value to the next whole second.
    The primary purpose of this rounding is to cause all "we don't care exactly
    when" timers to happen at the same jiffy.

    This avoids multiple timers firing within the second for no real reason;
    with dynamic ticks these extra timers cause wakeups from deep sleep CPU
    sleep states and thus waste power.

    The exact wakeup moment is skewed by the cpu number, to avoid all cpus
    waking up at exactly the same time (and hitting the same
    locks/cachelines there).

    [akpm@osdl.org: fix variable type]
    Signed-off-by: Arjan van de Ven
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

26 Apr, 2006

1 commit


01 Apr, 2006

1 commit

  • Commit a4a6198b80cf82eb8160603c98da218d1bd5e104:
    [PATCH] tvec_bases too large for per-cpu data

    introduced "struct tvec_t_base_s boot_tvec_bases" which is visible at
    compile time. This means we can kill __init_timer_base and move
    timer_base_s's content into tvec_t_base_s.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

27 Mar, 2006

1 commit

  • The nanosleep cleanup allows removing the data field of hrtimer. The
    callback function can use container_of() to get its own data. Since
    the hrtimer structure is embedded in other structures anyway, this
    adds no overhead.

    Signed-off-by: Roman Zippel
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Zippel
     

25 Mar, 2006

1 commit


11 Jan, 2006

1 commit


31 Oct, 2005

3 commits

  • In the recent timer rework we lost the check for an add_timer() of an
    already-pending timer. That check was useful for networking, so put it back.

    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Remove timer_list.magic and associated debugging code.

    I originally added this when a spinlock was added to timer_list - this meant
    that an all-zeroes timer became illegal and init_timer() was required.

    That spinlock isn't even there any more, although timer.base must now be
    initialised.

    I'll keep this debugging code in -mm.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Every user of init_timer() also needs to initialize ->function and ->data
    fields. This patch adds a simple setup_timer() helper for that.

    schedule_timeout() is patched as an example of usage.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

10 Sep, 2005

1 commit


24 Jun, 2005

2 commits

  • This patch splits del_timer_sync() into 2 functions. The new one,
    try_to_del_timer_sync(), returns -1 when it hits an executing timer.

    It can be used in interrupt context, or when the caller holds locks
    which could prevent completion of the timer's handler.

    NOTE: currently it can't be used in interrupt context in the UP case,
    because ->running_timer is used only with CONFIG_SMP.

    Should the need arise, it is possible to kill the #ifdef CONFIG_SMP in
    set_running_timer(); it is cheap.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • This patch tries to solve the following problems:

    1. del_timer_sync() is racy. The timer can be fired again after
    del_timer_sync() has checked all cpus and before it rechecks
    timer_pending().

    2. It has scalability problems. All cpus are scanned to determine
    if the timer is running on that cpu.

    With this patch del_timer_sync is O(1) and no slower than plain
    del_timer(pending_timer), unless it has to actually wait for
    completion of the currently running timer.

    The only restriction is that the recurring timer should not use
    add_timer_on().

    3. Timers are not serialized with respect to themselves.

    If CPU_0 does mod_timer(jiffies+1) while the timer is currently
    running on CPU 1, it is quite possible that local interrupt on
    CPU_0 will start that timer before it finished on CPU_1.

    4. The timers locking is suboptimal. __mod_timer() takes 3 locks
    at once and still requires wmb() in del_timer/run_timers.

    The new implementation takes 2 locks sequentially and does not
    need memory barriers.

    Currently ->base != NULL means that the timer is pending. In that case
    ->base.lock is used to lock the timer. __mod_timer also takes timer->lock
    because ->base can be == NULL.

    This patch uses timer->entry.next != NULL as indication that the timer is
    pending. So it does __list_del(), entry->next = NULL instead of list_del()
    when the timer is deleted.

    The ->base field is used for hashed locking only, it is initialized
    in init_timer() which sets ->base = per_cpu(tvec_bases). When the
    tvec_bases.lock is locked, it means that all timers which are tied
    to this base via timer->base are locked, and the base itself is locked
    too.

    So __run_timers/migrate_timers can safely modify all timers which could
    be found on ->tvX lists (pending timers).

    When the timer's base is locked, and the timer removed from ->entry list
    (which means that _run_timers/migrate_timers can't see this timer), it is
    possible to set timer->base = NULL and drop the lock: the timer remains
    locked.

    This patch adds lock_timer_base() helper, which waits for ->base != NULL,
    locks the ->base, and checks it is still the same.

    __mod_timer() schedules the timer on the local CPU and changes its base.
    However, it does not lock both old and new bases at once. It locks the
    timer via lock_timer_base(), deletes the timer, sets ->base = NULL, and
    unlocks old base. Then __mod_timer() locks new_base, sets ->base = new_base,
    and adds this timer. This simplifies the code, because AB-BA deadlock is not
    possible. __mod_timer() also ensures that the timer's base is not changed
    while the timer's handler is running on the old base.

    __run_timers(), del_timer() do not change ->base anymore, they only clear
    pending flag.

    So del_timer_sync() can test timer->base->running_timer == timer to detect
    whether it is running or not.

    We don't need timer_list->lock anymore, this patch kills it.

    We also don't need barriers. del_timer() and __run_timers() used smp_wmb()
    before clearing timer's pending flag. It was needed because __mod_timer()
    did not lock old_base if the timer is not pending, so __mod_timer()->list_add()
    could race with del_timer()->list_del(). With this patch these functions are
    serialized through base->lock.

    One problem. TIMER_INITIALIZER can't use per_cpu(tvec_bases). So this patch
    adds global

    struct timer_base_s {
            spinlock_t lock;
            struct timer_list *running_timer;
    } __init_timer_base;

    which is used by TIMER_INITIALIZER. The corresponding fields in tvec_t_base_s
    struct are replaced by struct timer_base_s t_base.

    It is indeed ugly. But this can't have scalability problems. The global
    __init_timer_base.lock is used only when __mod_timer() is called for the first
    time AND the timer was compile time initialized. After that the timer migrates
    to the local CPU.

    Signed-off-by: Oleg Nesterov
    Acked-by: Ingo Molnar
    Signed-off-by: Renaud Lienhart
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds