22 Oct, 2010

1 commit

  • On UP try_to_del_timer_sync() is mapped to del_timer() which does not
    take the running timer callback into account, so it has different
    semantics.

    Remove the SMP dependency of try_to_del_timer_sync() by using
    base->running_timer in the UP case as well.

    [ tglx: Removed set_running_timer() inline and tweaked the changelog ]

    Signed-off-by: Yong Zhang
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Acked-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Yong Zhang
     

21 Oct, 2010

3 commits

  • Currently, you have to just define a delayed_work uninitialised, and then
    initialise it before first use. That's a tad clumsy. At risk of playing
    mind-games with the compiler, fooling it into doing pointer arithmetic
    with compile-time-constants, this lets clients properly initialise delayed
    work with deferrable timers statically.

    This patch was inspired by the issues which led Artem Bityutskiy to
    commit 8eab945c5616fc984 ("sunrpc: make the cache cleaner workqueue
    deferrable").

    Signed-off-by: Phil Carmody
    Acked-by: Artem Bityutskiy
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Phil Carmody
     
  • TIMER_INITIALIZER() should initialize the slack field of timer_list,
    as __init_timer() does.

    Signed-off-by: Changli Gao
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Changli Gao
     
  • Reorder struct timer_list to remove 8 bytes of alignment padding on 64
    bit builds when CONFIG_TIMER_STATS is selected.

    timer_list is widely used across the kernel so many structures will
    benefit and shrink in size.

    For example, with my config on x86_64:
    * per_cpu_dm_data shrinks from 136 to 128 bytes
    * ahci_port_priv shrinks from 1032 to 968 bytes

    Signed-off-by: Richard Kennedy
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Richard Kennedy
     

03 Aug, 2010

1 commit


07 Apr, 2010

1 commit

  • While HR timers have had the concept of timer slack for quite some time
    now, the legacy timers lacked this concept, and had to make do with
    round_jiffies() and friends.

    Timer slack is important for power management; grouping timers reduces the
    number of wakeups which in turn reduces power consumption.

    This patch introduces timer slack to the legacy timers using the following
    pieces:
    * A slack field in the timer struct
    * An API (set_timer_slack) that callers can use to set explicit timer
      slack
    * A default slack of 0.4% of the requested delay for callers that do
      not set any explicit slack
    * Rounding code, part of mod_timer(), that tries to group timers around
      jiffies values at every 'power of two' (so quick timers will group
      around every 2 jiffies, but longer timers will group around every 4,
      8, 16, 32, etc.)

    Signed-off-by: Arjan van de Ven
    Cc: johnstul@us.ibm.com
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Arjan van de Ven
     

31 Aug, 2009

1 commit


24 Jun, 2009

1 commit

  • When the kernel is configured with CONFIG_TIMER_STATS but timer
    stats are runtime disabled we still get calls to
    __timer_stats_timer_set_start_info which initializes some
    fields in the corresponding struct timer_list.

    So add some quick checks in the timer stats setup functions
    to avoid function calls to __timer_stats_timer_set_start_info
    when timer stats are disabled.

    In an artificial workload that does nothing but playing ping
    pong with a single tcp packet via loopback this decreases cpu
    consumption by 1 - 1.5%.

    This is part of a modified function trace output on SLES11:

    perl-2497 [00] 28630647177732388 [+ 125]: sk_reset_timer
    Cc: Andrew Morton
    Cc: Martin Schwidefsky
    Cc: Mustafa Mesanovic
    Cc: Arjan van de Ven
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     

13 May, 2009

1 commit

  • * Arun R Bharadwaj [2009-04-16 12:11:36]:

    This patch creates a new framework for identifying cpu-pinned timers
    and hrtimers.

    This framework is needed because pinned timers are expected to fire on
    the same CPU on which they are queued, so it is essential to identify
    them and avoid migrating them.

    For regular timers, the existing add_timer_on() can be used to queue
    pinned timers, and subsequently mod_timer_pinned() can be used to
    modify the 'expires' field.

    For hrtimers, new modes HRTIMER_ABS_PINNED and HRTIMER_REL_PINNED are
    added to queue cpu-pinned hrtimers.

    [ tglx: use .._PINNED mode argument instead of creating tons of new
    functions ]

    Signed-off-by: Arun R Bharadwaj
    Signed-off-by: Thomas Gleixner

    Arun R Bharadwaj
     

31 Mar, 2009

1 commit

  • * 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (33 commits)
    lockdep: fix deadlock in lockdep_trace_alloc
    lockdep: annotate reclaim context (__GFP_NOFS), fix SLOB
    lockdep: annotate reclaim context (__GFP_NOFS), fix
    lockdep: build fix for !PROVE_LOCKING
    lockstat: warn about disabled lock debugging
    lockdep: use stringify.h
    lockdep: simplify check_prev_add_irq()
    lockdep: get_user_chars() redo
    lockdep: simplify get_user_chars()
    lockdep: add comments to mark_lock_irq()
    lockdep: remove macro usage from mark_held_locks()
    lockdep: fully reduce mark_lock_irq()
    lockdep: merge the !_READ mark_lock_irq() helpers
    lockdep: merge the _READ mark_lock_irq() helpers
    lockdep: simplify mark_lock_irq() helpers #3
    lockdep: further simplify mark_lock_irq() helpers
    lockdep: simplify the mark_lock_irq() helpers
    lockdep: split up mark_lock_irq()
    lockdep: generate usage strings
    lockdep: generate the state bit definitions
    ...

    Linus Torvalds
     

19 Feb, 2009

1 commit

  • Impact: new timer API

    Based on an idea from Martin Josefsson with the help of
    Patrick McHardy and Stephen Hemminger:

    introduce the mod_timer_pending() API, a mod_timer() offspring that
    is a no-op on already-removed timers.

    (regular mod_timer() re-activates non-pending timers.)

    This is useful for the networking code in that it can
    allow unserialized mod_timer_pending() timer-forwarding
    calls, but a single del_timer*() will stop the timer
    from being reactivated again.

    Also while at it:

    - optimize the regular mod_timer() path some more; the
    timer-stat and a debug check were needlessly duplicated
    in __mod_timer().

    - make the exports come straight after the function, as
    most other exports in timer.c already do.

    - eliminate __mod_timer() as an external API, change the
    users to mod_timer().

    The regular mod_timer() code path is not impacted
    significantly, due to inlining optimizations and due to
    the simplifications.

    Based-on-patch-from: Stephen Hemminger
    Acked-by: Stephen Hemminger
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Cc: netdev@vger.kernel.org
    Cc: Oleg Nesterov
    Cc: Andrew Morton
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

15 Feb, 2009

1 commit


06 Nov, 2008

1 commit

  • This patch (as1158b) adds round_jiffies_up() and friends. These
    routines work like the analogous round_jiffies() functions, except
    that they will never round down.

    The new routines will be useful for timeouts where we don't care
    exactly when the timer expires, provided it doesn't expire too soon.

    Signed-off-by: Alan Stern
    Signed-off-by: Jens Axboe

    Alan Stern
     

30 Apr, 2008

1 commit

  • Add calls to the generic object debugging infrastructure and provide
    fixup functions which allow keeping the system alive when recoverable
    problems have been detected by the object debugging core code.

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Greg KH
    Cc: Randy Dunlap
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

09 Feb, 2008

2 commits


30 Jan, 2008

1 commit


17 Jul, 2007

2 commits

  • Add a flag in /proc/timer_stats to indicate deferrable timers. This
    will let developers/users differentiate between types of timers in
    /proc/timer_stats.

    Deferrable timer and normal timer will appear in /proc/timer_stats as below.
    10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    10, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)

    Also version of timer_stats changes from v0.1 to v0.2

    Signed-off-by: Venkatesh Pallipadi
    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: john stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Venki Pallipadi
     
  • Remove the obviously unnecessary includes under the
    include/linux/ directory, and fix the couple of errors that are
    introduced as a result of that.

    Signed-off-by: Robert P. J. Day
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     

30 May, 2007

1 commit

  • get_next_timer_interrupt() returns a delta of (LONG_MAX >> 1) in case
    there is no timer pending. On 64 bit machines this results in a
    multiplication overflow in tick_nohz_stop_sched_tick().

    Reported-by: Dave Miller

    Make the return value a constant and limit the return value to a 32 bit
    value.

    When the max timeout value is returned, we can safely stop the tick
    timer device. The max jiffies delta results in a 12-day timeout for
    HZ=1000.

    In the long term the get_next_timer_interrupt() code needs to be
    reworked to return ktime instead of jiffies, but we have to wait until
    the last users of the original NO_IDLE_HZ code are converted.

    Signed-off-by: Thomas Gleixner
    Acked-by: David S. Miller
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

09 May, 2007

1 commit

  • Introduce a new flag for timers - deferrable: a deferrable timer works
    normally when the system is busy, but will not cause the CPU to come
    out of idle just to service it. Instead, the timer will be serviced
    when the CPU eventually wakes up for a subsequent non-deferrable
    timer.

    The main advantage of this is to avoid unnecessary timer interrupts
    when the CPU is idle. If the routine currently called by a timer can
    wait until the next event without any issues, this new timer can be
    used to set up the timer event for that routine. This, with dynticks,
    allows CPUs to be lazy, letting them stay idle for extended periods of
    time by reducing unnecessary wakeups and thereby reducing power
    consumption.

    This patch:

    Builds this new timer on top of the existing timer infrastructure. It
    uses the last bit of the 'base' pointer in the timer_list structure to
    store the deferrable timer flag. __next_timer_interrupt() skips over
    these deferrable timers when the CPU looks for the next timer event it
    has to wake up for.

    This is exported via a new interface, init_timer_deferrable(), which
    can be called in place of the regular init_timer().

    [akpm@linux-foundation.org: Privatise a #define]
    Signed-off-by: Venkatesh Pallipadi
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Oleg Nesterov
    Cc: Dave Jones
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Venki Pallipadi
     

17 Feb, 2007

3 commits

  • Add /proc/timer_stats support: debugging feature to profile timer expiration.
    Both the starting site, process/PID and the expiration function is captured.
    This allows the quick identification of timer event sources in a system.

    Sample output:

    # echo 1 > /proc/timer_stats
    # cat /proc/timer_stats
    Timer Stats Version: v0.1
    Sample period: 4.010 s
    24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
    11, 0 swapper sk_reset_timer (tcp_delack_timer)
    6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
    2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
    2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    4, 2050 pcscd do_nanosleep (hrtimer_wakeup)
    5, 4179 sshd sk_reset_timer (tcp_write_timer)
    4, 2248 yum-updatesd schedule_timeout (process_timeout)
    18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
    3, 0 swapper sk_reset_timer (tcp_delack_timer)
    1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer)
    2, 1 swapper e1000_up (e1000_watchdog)
    1, 1 init schedule_timeout (process_timeout)
    100 total events, 25.24 events/sec

    [ cleanups and hrtimers support from Thomas Gleixner ]
    [bunk@stusta.de: nr_entries can become static]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Andi Kleen
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • - hrtimers did not use the hrtimer_restart enum and relied on the
    implicit int representation. Fix the prototypes and the functions
    using the enums.
    - Use separate namespaces for the enumerations
    - Convert the hrtimer_restart macro to an inline function
    - Add comments

    No functional changes.

    [akpm@osdl.org: fix input driver]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Dmitry Torokhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • For CONFIG_NO_HZ we need to calculate the next timer wheel event based
    on a given jiffies value. Extend the existing code to allow the extra
    'now' argument. Provide a compatibility function for the existing
    implementations to call the function with now == jiffies. (This also
    solves the raciness of the original code vs. jiffies changing during
    the iteration.)

    No functional changes to existing users of this infrastructure.

    [ remove WARN_ON() that triggered on s390, by Carsten Otte ]
    [ made new helper static, Adrian Bunk ]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

27 Jan, 2007

1 commit


11 Dec, 2006

1 commit

  • Introduce a round_jiffies() function as well as a round_jiffies_relative()
    function. These functions round a jiffies value to the next whole second.
    The primary purpose of this rounding is to cause all "we don't care exactly
    when" timers to happen at the same jiffy.

    This avoids multiple timers firing within the second for no real reason;
    with dynamic ticks these extra timers cause wakeups from deep sleep CPU
    sleep states and thus waste power.

    The exact wakeup moment is skewed by the cpu number, to avoid all cpus
    waking up at exactly the same time (and hitting the same
    locks/cachelines there).

    [akpm@osdl.org: fix variable type]
    Signed-off-by: Arjan van de Ven
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

26 Apr, 2006

1 commit


01 Apr, 2006

1 commit

  • Commit a4a6198b80cf82eb8160603c98da218d1bd5e104:
    [PATCH] tvec_bases too large for per-cpu data

    introduced "struct tvec_t_base_s boot_tvec_bases" which is visible at
    compile time. This means we can kill __init_timer_base and move
    timer_base_s's content into tvec_t_base_s.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

27 Mar, 2006

1 commit

  • The nanosleep cleanup allows removing the data field of hrtimer. The
    callback function can use container_of() to get its own data. Since
    the hrtimer structure is embedded in other structures anyway, this
    adds no overhead.

    Signed-off-by: Roman Zippel
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Zippel
     

25 Mar, 2006

1 commit


11 Jan, 2006

1 commit


31 Oct, 2005

3 commits

  • In the recent timer rework we lost the check for an add_timer() of an
    already-pending timer. That check was useful for networking, so put it back.

    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Remove timer_list.magic and associated debugging code.

    I originally added this when a spinlock was added to timer_list - this meant
    that an all-zeroes timer became illegal and init_timer() was required.

    That spinlock isn't even there any more, although timer.base must now be
    initialised.

    I'll keep this debugging code in -mm.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Every user of init_timer() also needs to initialize ->function and ->data
    fields. This patch adds a simple setup_timer() helper for that.

    schedule_timeout() is patched as an example of usage.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

10 Sep, 2005

1 commit


24 Jun, 2005

2 commits

  • This patch splits del_timer_sync() into 2 functions. The new one,
    try_to_del_timer_sync(), returns -1 when it hits an executing timer.

    It can be used in interrupt context, or when the caller holds locks
    which could prevent completion of the timer's handler.

    NOTE: currently it can't be used in interrupt context in the UP case,
    because ->running_timer is used only with CONFIG_SMP.

    Should the need arise, it is possible to kill the #ifdef CONFIG_SMP in
    set_running_timer(); it is cheap.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • This patch tries to solve the following problems:

    1. del_timer_sync() is racy. The timer can be fired again after
    del_timer_sync() has checked all cpus and before it rechecks
    timer_pending().

    2. It has scalability problems. All cpus are scanned to determine
    if the timer is running on that cpu.

    With this patch del_timer_sync is O(1) and no slower than plain
    del_timer(pending_timer), unless it has to actually wait for
    completion of the currently running timer.

    The only restriction is that the recurring timer should not use
    add_timer_on().

    3. Timers are not serialized with respect to themselves.

    If CPU_0 does mod_timer(jiffies+1) while the timer is currently
    running on CPU 1, it is quite possible that local interrupt on
    CPU_0 will start that timer before it finished on CPU_1.

    4. The timers locking is suboptimal. __mod_timer() takes 3 locks
    at once and still requires wmb() in del_timer/run_timers.

    The new implementation takes 2 locks sequentially and does not
    need memory barriers.

    Currently ->base != NULL means that the timer is pending. In that case
    ->base.lock is used to lock the timer. __mod_timer also takes timer->lock
    because ->base can be == NULL.

    This patch uses timer->entry.next != NULL as indication that the timer is
    pending. So it does __list_del(), entry->next = NULL instead of list_del()
    when the timer is deleted.

    The ->base field is used for hashed locking only, it is initialized
    in init_timer() which sets ->base = per_cpu(tvec_bases). When the
    tvec_bases.lock is locked, it means that all timers which are tied
    to this base via timer->base are locked, and the base itself is locked
    too.

    So __run_timers/migrate_timers can safely modify all timers which could
    be found on ->tvX lists (pending timers).

    When the timer's base is locked, and the timer removed from ->entry list
    (which means that _run_timers/migrate_timers can't see this timer), it is
    possible to set timer->base = NULL and drop the lock: the timer remains
    locked.

    This patch adds lock_timer_base() helper, which waits for ->base != NULL,
    locks the ->base, and checks it is still the same.

    __mod_timer() schedules the timer on the local CPU and changes its base.
    However, it does not lock both old and new bases at once. It locks the
    timer via lock_timer_base(), deletes the timer, sets ->base = NULL, and
    unlocks old base. Then __mod_timer() locks new_base, sets ->base = new_base,
    and adds this timer. This simplifies the code, because AB-BA deadlock is not
    possible. __mod_timer() also ensures that the timer's base is not changed
    while the timer's handler is running on the old base.

    __run_timers(), del_timer() do not change ->base anymore, they only clear
    pending flag.

    So del_timer_sync() can test timer->base->running_timer == timer to detect
    whether it is running or not.

    We don't need timer_list->lock anymore, this patch kills it.

    We also don't need barriers. del_timer() and __run_timers() used smp_wmb()
    before clearing timer's pending flag. It was needed because __mod_timer()
    did not lock old_base if the timer is not pending, so __mod_timer()->list_add()
    could race with del_timer()->list_del(). With this patch these functions are
    serialized through base->lock.

    One problem. TIMER_INITIALIZER can't use per_cpu(tvec_bases). So this patch
    adds global

    struct timer_base_s {
            spinlock_t lock;
            struct timer_list *running_timer;
    } __init_timer_base;

    which is used by TIMER_INITIALIZER. The corresponding fields in tvec_t_base_s
    struct are replaced by struct timer_base_s t_base.

    It is indeed ugly. But this can't have scalability problems. The global
    __init_timer_base.lock is used only when __mod_timer() is called for the first
    time AND the timer was compile time initialized. After that the timer migrates
    to the local CPU.

    Signed-off-by: Oleg Nesterov
    Acked-by: Ingo Molnar
    Signed-off-by: Renaud Lienhart
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds