11 Oct, 2007

1 commit


26 Jul, 2007

2 commits

  • This avoids xtime lag seen with dynticks, because while 'xtime' itself
    is still not updated often, we keep a 'xtime_cache' variable around that
    contains the approximate real-time that _is_ updated each time we do a
    'update_wall_time()', and is thus never off by more than one tick.

    IOW, this restores the original semantics for 'xtime' users, as long as
    you use the proper abstraction functions (ie 'current_kernel_time()' or
    'get_seconds()' depending on whether you want a timespec or just the
    seconds field).

    [ Updated Patch. As penance for my sins I've also yanked another #ifdef
    that was added to avoid the xtime lag w/ hrtimers. ]

    Signed-off-by: John Stultz
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    john stultz
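
The caching idea described in this entry can be sketched in userspace C. This is only a toy model under stated assumptions: the names xtime, xtime_cache, update_wall_time(), current_kernel_time() and get_seconds() come from the commit message, while every toy_* identifier and the pending_ns bookkeeping are invented here for illustration and are not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy userspace model. Every toy_* name and pending_ns are assumptions
 * made for this sketch; the kernel's timekeeping code differs in detail.
 */
#define TOY_NSEC_PER_SEC 1000000000LL

struct toy_timespec {
    int64_t tv_sec;
    int64_t tv_nsec;
};

static struct toy_timespec xtime;        /* coarse value, folded in rarely */
static struct toy_timespec xtime_cache;  /* refreshed on every update */
static int64_t pending_ns;               /* elapsed time not yet in xtime */

/* Return ts advanced by a non-negative number of nanoseconds. */
static struct toy_timespec toy_add_ns(struct toy_timespec ts, int64_t ns)
{
    assert(ns >= 0);
    ns += ts.tv_nsec;
    ts.tv_sec += ns / TOY_NSEC_PER_SEC;
    ts.tv_nsec = ns % TOY_NSEC_PER_SEC;
    return ts;
}

/*
 * Stand-in for update_wall_time(): account the elapsed time, fold it into
 * xtime only when asked to, but always refresh the cache.
 */
static void toy_update_wall_time(int64_t elapsed_ns, int fold_into_xtime)
{
    pending_ns += elapsed_ns;
    if (fold_into_xtime) {
        xtime = toy_add_ns(xtime, pending_ns);
        pending_ns = 0;
    }
    xtime_cache = toy_add_ns(xtime, pending_ns);
}

/* The accessors read the cache, never xtime itself. */
static struct toy_timespec toy_current_kernel_time(void)
{
    return xtime_cache;
}

static int64_t toy_get_seconds(void)
{
    return xtime_cache.tv_sec;
}
```

In this model a caller that reads xtime directly can lag by however much time is still pending, whereas a caller that goes through the accessors sees a value that is at most one update old.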
     
  • This avoids use of the kernel-internal "xtime" variable directly outside
    of the actual time-related functions. Instead, use the helper functions
    that we already have available to us.

    This doesn't actually change any behaviour, but this will allow us to
    fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ
    (because much of the realtime information is maintained as separate
    offsets to 'xtime'), which has caused interfaces that use xtime directly
    to get a time that is out of sync with the real-time clock by up to a
    third of a second or so.

    Signed-off-by: John Stultz
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    john stultz
     

22 Jul, 2007

2 commits


17 Jul, 2007

1 commit


10 May, 2007

1 commit

  • Since nonboot CPUs are now disabled after tasks and devices have been
    frozen and the CPU hotplug infrastructure is used for this purpose, we need
    special CPU hotplug notifications that will help the CPU-hotplug-aware
    subsystems distinguish normal CPU hotplug events from CPU hotplug events
    related to a system-wide suspend or resume operation in progress. This
    patch introduces such notifications and causes them to be used during
    suspend and resume transitions. It also changes all of the
    CPU-hotplug-aware subsystems to take these notifications into consideration
    (for now they are handled in the same way as the corresponding "normal"
    ones).

    [oleg@tv-sign.ru: cleanups]
    Signed-off-by: Rafael J. Wysocki
    Cc: Gautham R Shenoy
    Cc: Pavel Machek
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

09 May, 2007

1 commit

  • Other symbols of the hrtimers API are already exported.

    Signed-off-by: Stas Sergeev
    Acked-by: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stas Sergeev
     

28 Apr, 2007

1 commit


26 Apr, 2007

1 commit

  • Get rid of the manual clock source selection mess and use ktime. Also
    use a scalar representation, which allows pkt_sched.h to be cleaned up a
    bit more and results in fewer ktime_to_ns() calls in most cases.

    The PSCHED_US2JIFFIE/PSCHED_JIFFIE2US macros are implemented quite
    inefficiently by this patch; following patches will convert all qdiscs
    to hrtimers and get rid of them entirely.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

08 Apr, 2007

1 commit

  • Soeren Sonnenburg reported that upon resume he is getting
    this backtrace:

    [] smp_apic_timer_interrupt+0x57/0x90
    [] retrigger_next_event+0x0/0xb0
    [] apic_timer_interrupt+0x28/0x30
    [] retrigger_next_event+0x0/0xb0
    [] __kfifo_put+0x8/0x90
    [] on_each_cpu+0x35/0x60
    [] clock_was_set+0x18/0x20
    [] timekeeping_resume+0x7c/0xa0
    [] __sysdev_resume+0x11/0x80
    [] sysdev_resume+0x47/0x80
    [] device_power_up+0x5/0x10

    It turns out that on resume we mistakenly re-enable interrupts too
    early. Do the timer retrigger only on the current CPU.

    Signed-off-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Acked-by: Soeren Sonnenburg
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

29 Mar, 2007

1 commit

  • hrtimer_start() incorrectly set the 'reprogram' flag to enqueue_hrtimer(),
    which should only be 1 if the hrtimer is queued to the current CPU.

    Doing otherwise could result in a reprogramming of the current CPU's
    clockevents device, with a timer that is not queued to it - resulting in a
    bogus next expiry value.

    Signed-off-by: Ingo Molnar
    Cc: Michal Piotrowski
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

17 Mar, 2007

2 commits

  • commit f4304ab21513b834c8fe3403927c60c2b81a72d7 (HZ free NTP) moved the
    access to wall_to_monotonic in hrtimer_get_softirq_time() out of the
    xtime_lock protection.

    Move it back into the seq_lock section.

    Signed-off-by: Thomas Gleixner
    Acked-by: John Stultz
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
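
Why both values have to be read inside the same sequence-lock section can be illustrated with a small userspace sketch using C11 atomics. It is a simplified single-writer model, not the kernel's seqlock implementation; the struct, the toy_* function names and the nanosecond fields are assumptions made for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the data protected by xtime_lock. */
struct toy_timekeeper {
    atomic_uint seq;                       /* odd while an update is running */
    _Atomic int64_t xtime_ns;
    _Atomic int64_t wall_to_monotonic_ns;
};

/* Writer side; this sketch supports exactly one writer at a time. */
static void toy_write_times(struct toy_timekeeper *tk,
                            int64_t xtime_ns, int64_t w2m_ns)
{
    unsigned int s = atomic_load_explicit(&tk->seq, memory_order_relaxed);

    assert((s & 1u) == 0);                 /* no update may be in progress */
    atomic_store_explicit(&tk->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&tk->xtime_ns, xtime_ns, memory_order_relaxed);
    atomic_store_explicit(&tk->wall_to_monotonic_ns, w2m_ns,
                          memory_order_relaxed);
    atomic_store_explicit(&tk->seq, s + 2, memory_order_release);
}

/*
 * Reader side: both fields are sampled inside one retry loop, so the sum
 * is never built from values belonging to two different updates.
 */
static int64_t toy_read_monotonic_ns(struct toy_timekeeper *tk)
{
    unsigned int start;
    int64_t x, w;

    do {
        start = atomic_load_explicit(&tk->seq, memory_order_acquire);
        x = atomic_load_explicit(&tk->xtime_ns, memory_order_relaxed);
        w = atomic_load_explicit(&tk->wall_to_monotonic_ns,
                                 memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
    } while ((start & 1u) ||
             start != atomic_load_explicit(&tk->seq, memory_order_relaxed));

    return x + w;
}
```

If the second load were moved after the loop, a concurrent update could slip in between the two reads and the reader would combine an old value with a new one, which is the kind of inconsistency the fix guards against.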
     
  • hrtimer_forward() does not check for the possible overflow of
    timer->expires. This can happen on 64 bit machines with large interval
    values and currently results in an endless loop in the softirq because the
    expiry value becomes negative and therefore the timer is expired all the
    time.

    Check for this condition and set the expiry value to the max. expiry time
    in the future. The fix should be applied to stable kernel series as well.

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
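
A minimal sketch of the clamping idea, assuming expiry times are non-negative nanosecond counts held in an int64_t. The names toy_forward() and TOY_EXPIRES_MAX are hypothetical, and the real hrtimer_forward() differs in detail; the point is only that the last step is checked before it is taken, so the expiry can never wrap to a negative value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed "max. expiry time in the future" for this sketch. */
#define TOY_EXPIRES_MAX INT64_MAX

/*
 * Push *expires past 'now' in whole intervals and return the number of
 * intervals added. If the final step would overflow, clamp the expiry to
 * TOY_EXPIRES_MAX instead of letting it wrap.
 * Assumes *expires >= 0 and now >= 0.
 */
static uint64_t toy_forward(int64_t *expires, int64_t now, int64_t interval)
{
    int64_t delta, whole, next;

    assert(expires != NULL);
    assert(*expires >= 0 && now >= 0);

    delta = now - *expires;
    if (delta < 0 || interval <= 0)
        return 0;                        /* not expired yet, or bad interval */

    whole = delta / interval;            /* intervals that fit inside delta */
    next = *expires + whole * interval;  /* <= now, so this cannot overflow */

    if (interval > TOY_EXPIRES_MAX - next)
        next = TOY_EXPIRES_MAX;          /* one more step would overflow */
    else
        next += interval;

    *expires = next;
    return (uint64_t)whole + 1;
}
```

With an expiry of 100, a current time of 350 and an interval of 100, the expiry moves to 400 and three intervals are reported. With an interval close to INT64_MAX the expiry is pinned at the maximum rather than turning negative.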
     

07 Mar, 2007

1 commit

  • The TIMER_SOFTIRQ runs the hrtimers during bootup until a usable
    clocksource and clock event sources are registered. The switch to high
    resolution mode happens inside of the TIMER_SOFTIRQ, but the softirq keeps
    running afterwards. That way the tick emulation timer, which was set up in
    the switch to highres, might be executed in the softirq context, which is a
    BUG. The rbtree must not be touched by the softirq after the highres switch.

    This BUG was observed by Andres Salomon, who provided the information to
    debug it.

    Return early from the softirq when the switch was successful.

    [dilinger@debian.org: add debug warning]
    [akpm@linux-foundation.org: make debug warning compile]
    Signed-off-by: Thomas Gleixner
    Cc: Andres Salomon
    Acked-by: Ingo Molnar
    Signed-off-by: Andres Salomon
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

05 Mar, 2007

1 commit

  • Doing something like this on a two cpu system

    # echo 0 > /sys/devices/system/cpu/cpu0/online
    # echo 1 > /sys/devices/system/cpu/cpu0/online
    # echo 0 > /sys/devices/system/cpu/cpu1/online

    will give me this:

    =======================================================
    [ INFO: possible circular locking dependency detected ]
    2.6.21-rc2-g562aa1d4-dirty #7
    -------------------------------------------------------
    bash/1282 is trying to acquire lock:
    (&cpu_base->lock_key){.+..}, at: [] hrtimer_cpu_notify+0xc6/0x240

    but task is already holding lock:
    (&cpu_base->lock_key#2){.+..}, at: [] hrtimer_cpu_notify+0xbc/0x240

    which lock already depends on the new lock.

    This happens because we have the following code in kernel/hrtimer.c:

    migrate_hrtimers(int cpu)
    [...]
    old_base = &per_cpu(hrtimer_bases, cpu);
    new_base = &get_cpu_var(hrtimer_bases);
    [...]
    spin_lock(&new_base->lock);
    spin_lock(&old_base->lock);

    Which means the spinlocks are taken in an order which depends on which cpu
    gets shut down from which other cpu. Therefore lockdep complains that there
    might be an ABBA deadlock. Since migrate_hrtimers() gets only called on
    cpu hotplug it's safe to assume that it isn't executed concurrently on a

    The same problem exists in kernel/timer.c: migrate_timers().

    As pointed out by Christian Borntraeger one possible solution to avoid
    the locking order complaints would be to make sure that the locks are
    always taken in the same order. E.g. by taking the lock of the cpu with
    the lower number first.

    To achieve this we introduce two new spinlock functions double_spin_lock
    and double_spin_unlock which lock or unlock two locks in a given order.

    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Roman Zippel
    Cc: John Stultz
    Cc: Christian Borntraeger
    Cc: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
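
The ordering helper described in this entry might look roughly like the following userspace sketch. It uses a C11 atomic_flag as a stand-in for a spinlock; the toy_* names and the exact signature (two locks plus a flag saying which one goes first) are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal test-and-set spinlock standing in for spinlock_t. */
struct toy_spinlock {
    atomic_flag held;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;                                /* busy-wait until the lock is free */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Take both locks; the caller decides which one is taken first. */
static void toy_double_spin_lock(struct toy_spinlock *l1,
                                 struct toy_spinlock *l2, bool l1_first)
{
    assert(l1 != l2);                    /* same lock twice would deadlock */
    if (l1_first) {
        toy_spin_lock(l1);
        toy_spin_lock(l2);
    } else {
        toy_spin_lock(l2);
        toy_spin_lock(l1);
    }
}

/* Release both locks in the reverse order of acquisition. */
static void toy_double_spin_unlock(struct toy_spinlock *l1,
                                   struct toy_spinlock *l2, bool l1_first)
{
    if (l1_first) {
        toy_spin_unlock(l2);
        toy_spin_unlock(l1);
    } else {
        toy_spin_unlock(l1);
        toy_spin_unlock(l2);
    }
}
```

A caller that migrates timers from a dead CPU to the current one would pass something like `this_cpu < dead_cpu` as the flag, so that the lock belonging to the lower-numbered CPU is always taken first regardless of which CPU is doing the migration.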
     

17 Feb, 2007

11 commits

  • Add /proc/timer_stats support: debugging feature to profile timer expiration.
    The starting site, the process/PID and the expiration function are all
    captured. This allows the quick identification of timer event sources in a
    system.

    Sample output:

    # echo 1 > /proc/timer_stats
    # cat /proc/timer_stats
    Timer Stats Version: v0.1
    Sample period: 4.010 s
    24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
    11, 0 swapper sk_reset_timer (tcp_delack_timer)
    6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
    2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
    2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    4, 2050 pcscd do_nanosleep (hrtimer_wakeup)
    5, 4179 sshd sk_reset_timer (tcp_write_timer)
    4, 2248 yum-updatesd schedule_timeout (process_timeout)
    18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
    3, 0 swapper sk_reset_timer (tcp_delack_timer)
    1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer)
    2, 1 swapper e1000_up (e1000_watchdog)
    1, 1 init schedule_timeout (process_timeout)
    100 total events, 25.24 events/sec

    [ cleanups and hrtimers support from Thomas Gleixner ]
    [bunk@stusta.de: nr_entries can become static]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Andi Kleen
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Implement high resolution timers on top of the hrtimers infrastructure and the
    clockevents / tick-management framework. This provides accurate timers for
    all hrtimer subsystem users.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • With Ingo Molnar

    Add functions to provide dynamic ticks and high resolution timers. The code
    which keeps track of jiffies and handles the long idle periods is shared
    between tick based and high resolution timer based dynticks. The dyntick
    functionality can be disabled on the kernel commandline. Also provide the
    infrastructure to support high resolution timers.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Architectures register their clock event devices in the clock events core.
    Users of the clockevents core can get clock event devices for their use. The
    clockevents core code provides notification mechanisms for various clock
    related management events.

    This allows the clock event devices to be controlled without the
    architectures having to worry about the details of function assignment. This
    is also a prerequisite for high resolution timers and dynamic ticks, allowing
    the core code to control the clock functionality without intrusive changes to
    the architecture code.

    [Fixes-by: Ingo Molnar ]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Roman Zippel
    Cc: john stultz
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Reintroduce ktimers feature "optimized away" by the ktimers review process:
    remove the curr_timer pointer from the cpu-base and use the hrtimer state.

    No functional changes.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Roman Zippel
    Cc: john stultz
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Reintroduce ktimers feature "optimized away" by the ktimers review process:
    multiple hrtimer states to enable the running of hrtimers without holding the
    cpu-base-lock.

    (The "optimized" rbtree hack carried only 2 states worth of information and we
    need 4 for high resolution timers and dynamic ticks.)

    No functional changes.

    Build-fixes-from: Andrew Morton
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Roman Zippel
    Cc: john stultz
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Improve kernel/hrtimers.c locking: use a per-CPU base with a lock to control
    locking of all clocks belonging to a CPU. This simplifies code that needs to
    lock all clocks at once. This makes life easier for high-res timers and
    dyntick.

    No functional changes.

    [ optimization change from Andrew Morton ]

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • - hrtimers did not use the hrtimer_restart enum and relied on the implicit
    int representation. Fix the prototypes and the functions using the enums.
    - Use separate name spaces for the enumerations
    - Convert hrtimer_restart macro to inline function
    - Add comments

    No functional changes.

    [akpm@osdl.org: fix input driver]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Dmitry Torokhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
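
The difference between returning a bare int and returning the enum can be sketched as follows. The toy_* types are hypothetical userspace stand-ins; the kernel's enum uses the values HRTIMER_NORESTART and HRTIMER_RESTART, and its struct hrtimer has different fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for enum hrtimer_restart. */
enum toy_hrtimer_restart {
    TOY_HRTIMER_NORESTART,   /* one-shot: do not queue the timer again */
    TOY_HRTIMER_RESTART,     /* the callback re-armed the timer */
};

struct toy_hrtimer {
    int64_t expires_ns;
    int64_t period_ns;
    int remaining;           /* expiries left before the timer stops */
    /* The prototype names the enum instead of relying on a plain int. */
    enum toy_hrtimer_restart (*function)(struct toy_hrtimer *);
};

/* Example callback: fire 'remaining' times, one period apart. */
static enum toy_hrtimer_restart toy_periodic_cb(struct toy_hrtimer *t)
{
    if (--t->remaining <= 0)
        return TOY_HRTIMER_NORESTART;
    t->expires_ns += t->period_ns;
    return TOY_HRTIMER_RESTART;
}

/* Run one expiry and report whether the timer must be queued again. */
static int toy_run_timer(struct toy_hrtimer *t)
{
    assert(t != NULL && t->function != NULL);
    return t->function(t) == TOY_HRTIMER_RESTART;
}
```

Spelling the enum out in the prototype lets the compiler and the reader see which return values are meaningful, instead of having to know that 0 and 1 carry that meaning by convention.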
     
  • For CONFIG_NO_HZ we need to calculate the next timer wheel event based on a
    given jiffies value. Extend the existing code to allow the extra 'now'
    argument. Provide a compatibility function for the existing implementations
    to call the function with now == jiffies. (This also solves the raciness of
    the original code vs. jiffies changing during the iteration.)

    No functional changes to existing users of this infrastructure.

    [ remove WARN_ON() that triggered on s390, by Carsten Otte ]
    [ made new helper static, Adrian Bunk ]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
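
The shape of this change, a worker that takes an explicit 'now' plus a thin compatibility wrapper that passes the current tick count, can be sketched in userspace C. The flat array of expiry values, the toy_* names and the maximum-delta constant are assumptions for this example; the kernel walks its timer wheel rather than an array.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed upper bound on how far ahead the result may lie. */
#define TOY_NEXT_TIMER_MAX_DELTA ((1UL << 30) - 1)

/* Stand-in for the global tick counter. */
static unsigned long toy_jiffies;

/*
 * Return the tick at which the earliest timer is due, relative to the
 * caller-supplied 'now'. Because 'now' is sampled once by the caller, the
 * whole scan uses one consistent time even if the counter advances.
 */
static unsigned long toy_get_next_timer_interrupt(const unsigned long *expires,
                                                  size_t n, unsigned long now)
{
    unsigned long best = TOY_NEXT_TIMER_MAX_DELTA;
    size_t i;

    assert(expires != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* The signed difference tolerates wraparound of the counter. */
        long delta = (long)(expires[i] - now);
        unsigned long d = delta < 0 ? 0 : (unsigned long)delta;

        if (d < best)
            best = d;
    }
    return now + best;
}

/* Compatibility wrapper for existing callers: now == jiffies. */
static unsigned long toy_next_timer_interrupt(const unsigned long *expires,
                                              size_t n)
{
    return toy_get_next_timer_interrupt(expires, n, toy_jiffies);
}
```

A timer whose expiry already lies in the past is treated as due immediately, so the function returns 'now' in that case.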
     
  • Persistent clock support: do proper timekeeping across suspend/resume.

    [bunk@stusta.de: cleanup]
    Signed-off-by: John Stultz
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Roman Zippel
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Stultz
     
  • Disentangle the NTP update from HZ. This is necessary for dynamic tick
    enabled kernels.

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     

12 Feb, 2007

1 commit

  • A variety of (mostly) innocuous fixes to the embedded kernel-doc content in
    source files, including:

    * make multi-line initial descriptions single line
    * denote some function names, constants and structs as such
    * change erroneous opening '/*' to '/**' in a few places
    * reword some text for clarity

    Signed-off-by: Robert P. J. Day
    Cc: "Randy.Dunlap"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     

30 Sep, 2006

1 commit

  • The clock_nanosleep() function does not return the time remaining when the
    sleep is interrupted by a signal.

    This patch creates a new call out, compat_clock_nanosleep_restart(), which
    handles returning the remaining time after a sleep is interrupted. This
    patch revives clock_nanosleep_restart(). It is now accessed via the new
    call out. The compat_clock_nanosleep_restart() is used for compatibility
    access.

    Since this is implemented in compatibility mode the normal path is
    virtually unaffected - no real performance impact.

    Signed-off-by: Toyo Abe
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toyo Abe
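
From the caller's side, the behaviour this entry restores is the POSIX contract that a relative clock_nanosleep() interrupted by a signal reports the unslept time through its last argument. A minimal userspace sketch follows; toy_sleep_ns() is a hypothetical helper written for this example and is not part of the patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Sleep for 'ns' nanoseconds on CLOCK_MONOTONIC, resuming after signals.
 * Returns 0 on success or a positive error number, which is how
 * clock_nanosleep() itself reports failures.
 */
static int toy_sleep_ns(long long ns)
{
    struct timespec req, rem;
    int err;

    assert(ns >= 0);
    req.tv_sec = (time_t)(ns / 1000000000LL);
    req.tv_nsec = (long)(ns % 1000000000LL);

    /* Relative sleep: on EINTR the remaining time is written to 'rem'. */
    while ((err = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &rem)) == EINTR)
        req = rem;

    return err;
}
```

If the remaining time were not filled in, the loop above would restart with a stale value. Callers that need drift-free sleeps usually prefer an absolute deadline with TIMER_ABSTIME, which does not depend on the remaining-time argument at all.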
     

15 Aug, 2006

1 commit


01 Aug, 2006

1 commit

  • A few of the callback functions and notifier blocks that are associated with
    cpu notifications incorrectly have __devinit and __devinitdata. They should
    be __cpuinit and __cpuinitdata instead.

    It makes no functional difference but wastes text area when CONFIG_HOTPLUG is
    enabled and CONFIG_HOTPLUG_CPU is not.

    This patch fixes all those instances.

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     

04 Jul, 2006

2 commits


28 Jun, 2006

2 commits

  • This patch reverts notifier_block changes made in 2.6.17

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     
  • In 2.6.17, there was a problem with cpu_notifiers and XFS. I provided a
    band-aid solution to solve that problem. In the process, I undid all the
    changes you both were making to ensure that these notifiers were available
    only at init time (unless CONFIG_HOTPLUG_CPU is defined).

    We deferred the real fix to 2.6.18. Here is a set of patches that fixes the
    XFS problem cleanly and makes the cpu notifiers available only at init time
    (unless CONFIG_HOTPLUG_CPU is defined).

    If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
    time.

    This patch reverts the notifier_call changes made in 2.6.17

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     

26 Jun, 2006

2 commits

  • Fix kernel-doc formatting in ktime.h and hrtimer.[ch] files.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • There are several instances of per_cpu(foo, raw_smp_processor_id()), which
    is semantically equivalent to __get_cpu_var(foo) but without the warning
    that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled. For
    those architectures with optimized per-cpu implementations, namely ia64,
    powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower
    code than __get_cpu_var(), so it would be preferable to use __get_cpu_var
    on those platforms.

    This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x,
    raw_smp_processor_id()) on architectures that use the generic per-cpu
    implementation, and turns into __get_cpu_var(x) on the architectures that
    have an optimized per-cpu implementation.

    Signed-off-by: Paul Mackerras
    Acked-by: David S. Miller
    Acked-by: Ingo Molnar
    Acked-by: Martin Schwidefsky
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mackerras
     

21 Jun, 2006

1 commit

  • * git://git.infradead.org/~dwmw2/rbtree-2.6:
    [RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
    Update UML kernel/physmem.c to use rb_parent() accessor macro
    [RBTREE] Update hrtimers to use rb_parent() accessor macro.
    [RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
    [RBTREE] Merge colour and parent fields of struct rb_node.
    [RBTREE] Remove dead code in rb_erase()
    [RBTREE] Update JFFS2 to use rb_parent() accessor macro.
    [RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
    [RBTREE] Update key.c to use rb_parent() accessor macro.
    [RBTREE] Update ext3 to use rb_parent() accessor macro.
    [RBTREE] Change rbtree off-tree marking in I/O schedulers.
    [RBTREE] Add accessor macros for colour and parent fields of rb_node

    Linus Torvalds
     

01 Jun, 2006

1 commit

  • From: Stephen Hemminger

    I want to use hrtimers in the netem (Network Emulator) qdisc, but the
    necessary symbols aren't exported for module use.

    Also needed by SystemTap.

    Signed-off-by: Stephen Hemminger
    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "Stone, Joshua I"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Hemminger
     

26 Apr, 2006

1 commit