08 Sep, 2011

1 commit

  • The automatic increase of the min_delta_ns of a clockevents device
    should be done in the clockevents code as the minimum delay is an
    attribute of the clockevents device.

    In addition not all architectures want the automatic adjustment, on a
    massively virtualized system it can happen that the programming of a
    clock event fails several times in a row because the virtual cpu has
    been rescheduled quickly enough. In that case the minimum delay will
    erroneously be increased with no way back. The new config symbol
    GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic
    adjustment. The config option is selected only for x86.

    Signed-off-by: Martin Schwidefsky
    Cc: john stultz
    Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
     

16 Mar, 2011

1 commit

  • …l/git/tip/linux-2.6-tip

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
    posix-clocks: Check write permissions in posix syscalls
    hrtimer: Remove empty hrtimer_init_hres_timer()
    hrtimer: Update hrtimer->state documentation
    hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
    timers: Export CLOCK_BOOTTIME via the posix timers interface
    timers: Add CLOCK_BOOTTIME hrtimer base
    time: Extend get_xtime_and_monotonic_offset() to also return sleep
    time: Introduce get_monotonic_boottime and ktime_get_boottime
    hrtimers: extend hrtimer base code to handle more then 2 clockids
    ntp: Remove redundant and incorrect parameter check
    mn10300: Switch do_timer() to xtimer_update()
    posix clocks: Introduce dynamic clocks
    posix-timers: Cleanup namespace
    posix-timers: Add support for fd based clocks
    x86: Add clock_adjtime for x86
    posix-timers: Introduce a syscall for clock tuning.
    time: Splitout compat timex accessors
    ntp: Add ADJ_SETOFFSET mode bit
    time: Introduce timekeeping_inject_offset
    posix-timer: Update comment
    ...

    Fix up new system-call-related conflicts in
    arch/x86/ia32/ia32entry.S
    arch/x86/include/asm/unistd_32.h
    arch/x86/include/asm/unistd_64.h
    arch/x86/kernel/syscall_table_32.S
    (name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some
    due to movement of get_jiffies_64() in:
    kernel/time.c

    Linus Torvalds
     

26 Feb, 2011

1 commit

  • When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we only
    can switch into oneshot mode, when the backup broadcast device
    supports oneshot mode as well. Otherwise we would try to switch the
    broadcast device into an unsupported mode unconditionally. This went
    unnoticed so far as the current available broadcast devices support
    oneshot mode. Seth unearthed this problem while debugging and working
    around an hpet related BIOS wreckage.

    Add the necessary check to tick_is_oneshot_available().

    Reported-and-tested-by: Seth Forshee
    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Cc: stable@kernel.org # .21 ->

    Thomas Gleixner
     

01 Feb, 2011

1 commit

  • All callers of do_timer() are converted to xtime_update(). The only
    users of xtime_lock are in kernel/time/. Make both local to
    kernel/time/ and remove them from the global header files.

    [ tglx: Reuse tick-internal.h instead of creating another local header
    file. Massaged changelog ]

    Signed-off-by: Torben Hohn
    Cc: Peter Zijlstra
    Cc: johnstul@us.ibm.com
    Cc: yong.zhang0@gmail.com
    Cc: hch@infradead.org
    Signed-off-by: Thomas Gleixner

    Torben Hohn
     

17 Dec, 2010

1 commit

  • __get_cpu_var() can be replaced with this_cpu_read and will then use a
    single read instruction with implied address calculation to access the
    correct per cpu instance.

    However, the address of a per cpu variable passed to __this_cpu_read()
    cannot be determined (since it's an implied address conversion through
    segment prefixes). Therefore apply this only to uses of __get_cpu_var
    where the address of the variable is not used.

    Cc: Pekka Enberg
    Cc: Hugh Dickins
    Cc: Thomas Gleixner
    Acked-by: H. Peter Anvin
    Signed-off-by: Christoph Lameter
    Signed-off-by: Tejun Heo

    Christoph Lameter
     

15 Dec, 2009

2 commits


02 May, 2009

1 commit

  • tick_handle_periodic() can lock up hard when a one shot clock event
    device is used in combination with jiffies clocksource.

    Avoid an endless loop issue by requiring that a highres valid
    clocksource be installed before we call tick_periodic() in a loop when
    using ONESHOT mode. The result is we will only increment jiffies once
    per interrupt until a continuous hardware clocksource is available.

    Without this, we can run into a endless loop, where each cycle through
    the loop, jiffies is updated which increments time by tick_period or
    more (due to clock steering), which can cause the event programming to
    think the next event was before the newly incremented time and fail
    causing tick_periodic() to be called again and the whole process loops
    forever.

    [ Impact: prevent hard lock up ]

    Signed-off-by: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner
    Cc: stable@kernel.org

    john stultz
     

31 Jan, 2009

1 commit

  • Impact: fix CPU hotplug hang on Power6 testbox

    On architectures that support offlining all cpus (at least powerpc/pseries),
    hot-unpluging the tick_do_timer_cpu can result in a system hang.

    This comes from the fact that if the cpu going down happens to be the
    cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
    cpu is dead (via the CPU_DEAD notification), we're left without ticks,
    jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
    That's particularly the case for the cpu looping in __cpu_die() waiting
    for the dying cpu to be dead.

    This patch addresses this by having the tick_do_timer_cpu handover happen
    earlier during the CPU_DYING notification. For this, a new clockevent
    notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
    in hrtimer_cpu_notify().

    Signed-off-by: Sebastien Dugue
    Cc:
    Signed-off-by: Ingo Molnar

    Sebastien Dugue
     

01 Jan, 2009

1 commit

  • Impact: Use new APIs

    Convert kernel/time functions to use struct cpumask *.

    Note the ugly bitmap declarations in tick-broadcast.c. These should
    be cpumask_var_t, but there was no obvious initialization function to
    put the alloc_cpumask_var() calls in. This was safe.

    (Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK,
    so we use a bitmap here to show we really mean it).

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis

    Rusty Russell
     

13 Dec, 2008

2 commits

  • Impact: change calling convention of existing clock_event APIs

    struct clock_event_timer's cpumask field gets changed to take pointer,
    as does the ->broadcast function.

    Another single-patch change. For safety, we BUG_ON() in
    clockevents_register_device() if it's not set.

    Signed-off-by: Rusty Russell
    Cc: Ingo Molnar

    Rusty Russell
     
  • Impact: change existing irq_chip API

    Not much point with gentle transition here: the struct irq_chip's
    setaffinity method signature needs to change.

    Fortunately, not widely used code, but hits a few architectures.

    Note: In irq_select_affinity() I save a temporary in by mangling
    irq_desc[irq].affinity directly. Ingo, does this break anything?

    (Folded in fix from KOSAKI Motohiro)

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Reviewed-by: Grant Grundler
    Acked-by: Ingo Molnar
    Cc: ralf@linux-mips.org
    Cc: grundler@parisc-linux.org
    Cc: jeremy@xensource.com
    Cc: KOSAKI Motohiro

    Rusty Russell
     

23 Sep, 2008

2 commits

  • Impact: timer hang on CPU online observed on AMD C1E systems

    When a CPU is brought online then the broadcast machinery can
    be in the one shot state already. Check this and setup the timer
    device of the new CPU in one shot mode so the broadcast code
    can pick up the next_event value correctly.

    Another AMD C1E oddity, as we switch to broadcast immediately and
    not after the full bring up via the ACPI cpu idle code.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Impact: rare hang which can be triggered on CPU online.

    tick_do_timer_cpu keeps track of the CPU which updates jiffies
    via do_timer. The value -1 is used to signal, that currently no
    CPU is doing this. There are two cases, where the variable can
    have this state:

    boot:
    necessary for systems where the boot cpu id can be != 0

    nohz long idle sleep:
    When the CPU which did the jiffies update last goes into
    a long idle sleep it drops the update jiffies duty so
    another CPU which is not idle can pick it up and keep
    jiffies going.

    Using the same value for both situations is wrong, as the CPU online
    code can see the -1 state when the timer of the newly onlined CPU is
    setup. The setup for a newly onlined CPU goes through periodic mode
    and can pick up the do_timer duty without being aware of the nohz /
    highres mode of the already running system.

    Use two separate states and make them constants to avoid magic
    numbers confusion.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

17 Sep, 2008

1 commit

  • The device shut down does not cleanup the next_event variable of the
    clock event device. So when the device is reactivated the possible
    stale next_event value can prevent the device to be reprogrammed as it
    claims to wait on a event already.

    This is the root cause of the resurfacing suspend/resume problem,
    where systems need key press to come back to life.

    Fix this by setting next_event to KTIME_MAX when the device is shut
    down. Use a separate function for shutdown which takes care of that
    and only keep the direct set mode call in the broadcast code, where we
    can not touch the next_event value.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

05 Sep, 2008

1 commit

  • There is a ordering related problem with clockevents code, due to which
    clockevents_register_device() called after tickless/highres switch
    will not work. The new clockevent ends up with clockevents_handle_noop as
    event handler, resulting in no timer activity.

    The problematic path seems to be

    * old device already has hrtimer_interrupt as the event_handler
    * new clockevent device registers with a higher rating
    * tick_check_new_device() is called
    * clockevents_exchange_device() gets called
    * old->event_handler is set to clockevents_handle_noop
    * tick_setup_device() is called for the new device
    * which sets new->event_handler using the old->event_handler which is noop.

    Change the ordering so that new device inherits the proper handler.

    This does not have any issue in normal case as most likely all the clockevent
    devices are setup before the highres switch. But, can potentially be affecting
    some corner case where HPET force detect happens after the highres switch.
    This was a problem with HPET in MSI mode code that we have been experimenting
    with.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Shaohua Li
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Venkatesh Pallipadi
     

26 Jul, 2008

1 commit


19 Jul, 2008

1 commit


17 Apr, 2008

1 commit

  • > Generic code is not supposed to include irq.h. Replace this include
    > by linux/hardirq.h instead and add/replace an include of linux/irq.h
    > in asm header files where necessary.
    > This change should only matter for architectures that make use of
    > GENERIC_CLOCKEVENTS.
    > Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
    >
    > I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
    > This patch fixes also build breakages caused by the include replacement in
    > tick-common.h.

    I generally dislike adding optional linux/* includes in asm/* includes -
    I'm nervous about this causing include loops.

    However, there's a separate point to be discussed here.

    That is, what interfaces are expected of every architecture in the kernel.
    If generic code wants to be able to set the affinity of interrupts, then
    that needs to become part of the interfaces listed in linux/interrupt.h
    rather than linux/irq.h.

    So what I suggest is this approach instead (against Linus' tree of a
    couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
    to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
    and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
    rarely used include since not much touches the stacked parent context
    registers.)

    Build tested on ARM PXA family kernels and ARM's Realview platform
    kernels which both use genirq.

    [ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]

    Signed-off-by: Russell King
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Russell King
     

15 Oct, 2007

1 commit

  • The 64bit SMP bootup is slightly different to the 32bit one. It enables
    the boot CPU local APIC timer before all CPUs are brought up. Some AMD C1E
    systems have the C1E feature flag only set in the secondary CPU. Due to
    the early enable of the boot CPU local APIC timer the APIC timer is
    registered as a fully functional device. When we detect the wreckage during
    the bringup of the secondary CPU, we need to force the boot CPU into
    broadcast mode.

    Add a new notifier reason and implement the force broadcast in the clock
    events layer.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

13 Oct, 2007

1 commit


22 Jul, 2007

1 commit

  • We need to make sure, that the clockevent devices are resumed, before
    the tick is resumed. The current resume logic does not guarantee this.

    Add CLOCK_EVT_MODE_RESUME and call the set mode functions of the clock
    event devices before resuming the tick / oneshot functionality.

    Fixup the existing users.

    Thanks to Nigel Cunningham for tracking down a long standing thinko,
    which affected the jinxed VAIO.

    [akpm@linux-foundation.org: xen build fix]
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

09 May, 2007

1 commit

  • While the !highres/!dyntick code assigns the duty of the do_timer() call to
    one specific CPU, this was dropped in the highres/dyntick part during
    development.

    Steven Rostedt discovered the xtime lock contention on highres/dyntick due
    to several CPUs trying to update jiffies.

    Add the single CPU assignement back. In the dyntick case this needs to be
    handled carefully, as the CPU which has the do_timer() duty must drop the
    assignement and let it be grabbed by another CPU, which is active.
    Otherwise the do_timer() calls would not happen during the long sleep.

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Steven Rostedt
    Acked-by: Mark Lord
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

17 Mar, 2007

1 commit

  • I finally found a dual core box, which survives suspend/resume without
    crashing in the middle of nowhere. Sigh, I never figured out from the
    code and the bug reports what's going on.

    The observed hangs are caused by a stale state transition of the clock
    event devices, which keeps the RCU synchronization away from completion,
    when the non boot CPU is brought back up.

    The suspend/resume in oneshot mode needs the similar care as the
    periodic mode during suspend to RAM. My assumption that the state
    transitions during the different shutdown/bringups of s2disk would go
    through the periodic boot phase and then switch over to highres resp.
    nohz mode were simply wrong.

    Add the appropriate suspend / resume handling for the non periodic
    modes.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

07 Mar, 2007

1 commit

  • The programming of periodic tick devices needs to be saved/restored
    across suspend/resume - otherwise we might end up with a system coming
    up that relies on getting a PIT (or HPET) interrupt, while those devices
    default to 'no interrupts' after powerup. (To confuse things it worked
    to a certain degree on some systems because the lapic gets initialized
    as a side-effect of SMP bootup.)

    This suspend / resume thing was dropped unintentionally during the
    last-minute -mm code reshuffling.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

27 Feb, 2007

1 commit


17 Feb, 2007

4 commits

  • add /proc/timer_list, which prints all currently pending (high-res) timers,
    all clock-event sources and their parameters in a human-readable form.

    Sample output:

    Timer List Version: v0.1
    HRTIMER_MAX_CLOCK_BASES: 2
    now at 4246046273872 nsecs

    cpu: 0
    clock 0:
    .index: 0
    .resolution: 1 nsecs
    .get_time: ktime_get_real
    .offset: 1273998312645738432 nsecs
    active timers:
    clock 1:
    .index: 1
    .resolution: 1 nsecs
    .get_time: ktime_get
    .offset: 0 nsecs
    active timers:
    #0: , hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0
    # expires at 4246432689566 nsecs [in 386415694 nsecs]
    #1: , hrtimer_wakeup, do_nanosleep, pcscd/2050
    # expires at 4247018194689 nsecs [in 971920817 nsecs]
    #2: , hrtimer_wakeup, do_nanosleep, irqbalance/1909
    # expires at 4247351358392 nsecs [in 1305084520 nsecs]
    #3: , hrtimer_wakeup, do_nanosleep, crond/2157
    # expires at 4249097614968 nsecs [in 3051341096 nsecs]
    #4: , it_real_fn, do_setitimer, syslogd/1888
    # expires at 4251329900926 nsecs [in 5283627054 nsecs]
    .expires_next : 4246432689566 nsecs
    .hres_active : 1
    .check_clocks : 0
    .nr_events : 31306
    .idle_tick : 4246020791890 nsecs
    .tick_stopped : 1
    .idle_jiffies : 986504
    .idle_calls : 40700
    .idle_sleeps : 36014
    .idle_entrytime : 4246019418883 nsecs
    .idle_sleeptime : 4178181972709 nsecs

    cpu: 1
    clock 0:
    .index: 0
    .resolution: 1 nsecs
    .get_time: ktime_get_real
    .offset: 1273998312645738432 nsecs
    active timers:
    clock 1:
    .index: 1
    .resolution: 1 nsecs
    .get_time: ktime_get
    .offset: 0 nsecs
    active timers:
    #0: , hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0
    # expires at 4246050084568 nsecs [in 3810696 nsecs]
    #1: , hrtimer_wakeup, do_nanosleep, atd/2227
    # expires at 4261010635003 nsecs [in 14964361131 nsecs]
    #2: , hrtimer_wakeup, do_nanosleep, smartd/2332
    # expires at 5469485798970 nsecs [in 1223439525098 nsecs]
    .expires_next : 4246050084568 nsecs
    .hres_active : 1
    .check_clocks : 0
    .nr_events : 24043
    .idle_tick : 4246046084568 nsecs
    .tick_stopped : 0
    .idle_jiffies : 986510
    .idle_calls : 26360
    .idle_sleeps : 22551
    .idle_entrytime : 4246043874339 nsecs
    .idle_sleeptime : 4170763761184 nsecs

    tick_broadcast_mask: 00000003
    event_broadcast_mask: 00000001

    CPU#0's local event device:

    Clock Event Device: lapic
    capabilities: 0000000e
    max_delta_ns: 807385544
    min_delta_ns: 1443
    mult: 44624025
    shift: 32
    set_next_event: lapic_next_event
    set_mode: lapic_timer_setup
    event_handler: hrtimer_interrupt
    .installed: 1
    .expires: 4246432689566 nsecs

    CPU#1's local event device:

    Clock Event Device: lapic
    capabilities: 0000000e
    max_delta_ns: 807385544
    min_delta_ns: 1443
    mult: 44624025
    shift: 32
    set_next_event: lapic_next_event
    set_mode: lapic_timer_setup
    event_handler: hrtimer_interrupt
    .installed: 1
    .expires: 4246050084568 nsecs

    Clock Event Device: hpet
    capabilities: 00000007
    max_delta_ns: 2147483647
    min_delta_ns: 3352
    mult: 61496110
    shift: 32
    set_next_event: hpet_next_event
    set_mode: hpet_set_mode
    event_handler: handle_nextevt_broadcast

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • With Ingo Molnar

    Add functions to provide dynamic ticks and high resolution timers. The code
    which keeps track of jiffies and handles the long idle periods is shared
    between tick based and high resolution timer based dynticks. The dyntick
    functionality can be disabled on the kernel commandline. Provide also the
    infrastructure to support high resolution timers.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • With Ingo Molnar

    Add broadcast functionality, so per cpu clock event devices can be registered
    as dummy devices or switched from/to broadcast on demand. The broadcast
    function distributes the events via the broadcast function of the clock event
    device. This is primarily designed to replace the switch apic timer to / from
    IPI in power states, where the apic stops.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • With Ingo Molnar

    The tick-management code is the first user of the clockevents layer. It takes
    clock event devices from the clock events core and uses them to provide the
    periodic tick.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner