02 Dec, 2011

1 commit


08 Sep, 2011

1 commit

  • The automatic increase of the min_delta_ns of a clockevents device
    should be done in the clockevents code as the minimum delay is an
    attribute of the clockevents device.

    In addition, not all architectures want the automatic adjustment. On a
    massively virtualized system it can happen that the programming of a
    clock event fails several times in a row because the virtual cpu has
    been rescheduled quickly enough. In that case the minimum delay will
    erroneously be increased with no way back. The new config symbol
    GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic
    adjustment. The config option is selected only for x86.

    Signed-off-by: Martin Schwidefsky
    Cc: john stultz
    Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
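
The opt-in behaviour described in this commit can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct sim_dev`, `sim_note_failure()` and the runtime flag `min_adjust` (standing in for the compile-time symbol GENERIC_CLOCKEVENTS_MIN_ADJUST) are hypothetical names, and the threshold of three failures is borrowed from the 05 Sep, 2008 commit further down this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a clockevents device; not the kernel struct. */
struct sim_dev {
    uint64_t min_delta_ns;         /* smallest delay the device accepts */
    unsigned consecutive_failures; /* failed programming attempts so far */
    bool min_adjust;               /* stand-in for the config symbol */
};

/*
 * Record one failed programming attempt.
 * Returns true if min_delta_ns was raised as a result.
 */
static bool sim_note_failure(struct sim_dev *dev)
{
    assert(dev->min_delta_ns > 0);

    if (!dev->min_adjust)
        return false;           /* architecture opted out: never adjust */

    if (++dev->consecutive_failures < 3)
        return false;           /* tolerate a few failures first */

    dev->consecutive_failures = 0;
    dev->min_delta_ns *= 2;     /* raised here, never lowered again */
    return true;
}
```

With the flag cleared, repeated failures caused by a delayed virtual cpu leave `min_delta_ns` untouched, which is the situation the commit message wants to protect.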
     

21 May, 2011

1 commit


17 May, 2011

1 commit

  • The first cpu which switches from periodic to oneshot mode switches
    also the broadcast device into oneshot mode. The broadcast device
    serves as a backup for per cpu timers which stop in deeper
    C-states. To avoid starvation of the cpus which might be in idle and
    depend on broadcast mode it marks the other cpus as broadcast active
    and sets the broadcast expiry value of those cpus to the next tick.

    The oneshot mode broadcast bit for the other cpus is sticky and gets
    only cleared when those cpus exit idle. If a cpu was not idle when
    the bit got set, the bit consequently prevents the broadcast device
    from being armed on behalf of that cpu when it enters idle for the
    first time after it switched to oneshot mode.

    In most cases that goes unnoticed as one of the other cpus has usually
    a timer pending which keeps the broadcast device armed with a short
    timeout. Now if the only cpu which has a short timer active has the
    bit set then the broadcast device will not be armed on behalf of that
    cpu and will fire way after the expected timer expiry. In the case of
    Christian's bug report it took ~145 seconds which is about half of the
    wrap around time of HPET (the limit for that device) due to the fact
    that all other cpus had no timers armed which expired before the 145
    seconds timeframe.

    The solution is simply to clear the broadcast active bit
    unconditionally when a cpu switches to oneshot mode after the first
    cpu switched the broadcast device over. It's not idle at that point
    otherwise it would not be executing that code.

    [ I fundamentally hate that broadcast crap. Why the heck thought some
    folks that when going into deep idle it's a brilliant concept to
    switch off the last device which brings the cpu back from that
    state? ]

    Thanks to Christian for providing all the valuable debug information!

    Reported-and-tested-by: Christian Hoffmann
    Cc: John Stultz
    Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E
    Cc: stable@kernel.org
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
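
The sticky-bit problem and its fix can be modelled with a plain bitmask. The sketch below is a simplified simulation under stated assumptions: `struct sim_broadcast` and the `sim_*` functions are hypothetical, the `apply_fix` parameter exists only to compare both behaviours, and the real code operates on cpumasks under a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the oneshot broadcast state (up to 32 cpus). */
struct sim_broadcast {
    bool device_oneshot;    /* broadcast device already switched over? */
    uint32_t oneshot_mask;  /* bit n set: cpu n is marked broadcast active */
};

/* A cpu switches its own tick device from periodic to oneshot mode. */
static void sim_switch_to_oneshot(struct sim_broadcast *bc, unsigned cpu,
                                  uint32_t online_mask, bool apply_fix)
{
    uint32_t bit;

    assert(cpu < 32);
    bit = (uint32_t)1 << cpu;

    if (!bc->device_oneshot) {
        /* First cpu: switch the device and mark all *other* cpus. */
        bc->device_oneshot = true;
        bc->oneshot_mask |= online_mask & ~bit;
    } else if (apply_fix) {
        /* Later cpus are running this code, so they are not idle. */
        bc->oneshot_mask &= ~bit;
    }
}

/* Returns true if the broadcast device gets armed on behalf of the cpu. */
static bool sim_enter_idle(struct sim_broadcast *bc, unsigned cpu)
{
    uint32_t bit = (uint32_t)1 << cpu;

    if (bc->oneshot_mask & bit)
        return false;       /* stale bit: arming is skipped */
    bc->oneshot_mask |= bit;
    return true;
}

/* The bit is only cleared when the cpu leaves idle. */
static void sim_exit_idle(struct sim_broadcast *bc, unsigned cpu)
{
    bc->oneshot_mask &= ~((uint32_t)1 << cpu);
}
```

Without the fix, cpu 1 keeps the bit that cpu 0 set for it and its first idle entry does not arm the broadcast device; with the fix the bit is cleared when cpu 1 itself switches over.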
     

05 May, 2011

1 commit

  • Avoid taking broadcast_lock in the idle path for systems where the
    timer doesn't stop in C3.

    [ tglx: Removed the stale label and added comment ]

    Signed-off-by: Andi Kleen
    Cc: Dave Kleikamp
    Cc: Chris Mason
    Cc: Peter Zijlstra
    Cc: Tim Chen
    Cc: lenb@kernel.org
    Cc: paulmck@us.ibm.com
    Link: http://lkml.kernel.org/r/%3C20110504234806.GF2925%40one.firstfloor.org%3E
    Signed-off-by: Thomas Gleixner

    Andi Kleen
     

16 Mar, 2011

1 commit

  • …l/git/tip/linux-2.6-tip

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
    posix-clocks: Check write permissions in posix syscalls
    hrtimer: Remove empty hrtimer_init_hres_timer()
    hrtimer: Update hrtimer->state documentation
    hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
    timers: Export CLOCK_BOOTTIME via the posix timers interface
    timers: Add CLOCK_BOOTTIME hrtimer base
    time: Extend get_xtime_and_monotonic_offset() to also return sleep
    time: Introduce get_monotonic_boottime and ktime_get_boottime
    hrtimers: extend hrtimer base code to handle more then 2 clockids
    ntp: Remove redundant and incorrect parameter check
    mn10300: Switch do_timer() to xtimer_update()
    posix clocks: Introduce dynamic clocks
    posix-timers: Cleanup namespace
    posix-timers: Add support for fd based clocks
    x86: Add clock_adjtime for x86
    posix-timers: Introduce a syscall for clock tuning.
    time: Splitout compat timex accessors
    ntp: Add ADJ_SETOFFSET mode bit
    time: Introduce timekeeping_inject_offset
    posix-timer: Update comment
    ...

    Fix up new system-call-related conflicts in
    arch/x86/ia32/ia32entry.S
    arch/x86/include/asm/unistd_32.h
    arch/x86/include/asm/unistd_64.h
    arch/x86/kernel/syscall_table_32.S
    (name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some
    due to movement of get_jiffies_64() in:
    kernel/time.c

    Linus Torvalds
     

26 Feb, 2011

1 commit

  • When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we only
    can switch into oneshot mode, when the backup broadcast device
    supports oneshot mode as well. Otherwise we would try to switch the
    broadcast device into an unsupported mode unconditionally. This went
    unnoticed so far as the current available broadcast devices support
    oneshot mode. Seth unearthed this problem while debugging and working
    around an hpet related BIOS wreckage.

    Add the necessary check to tick_is_oneshot_available().

    Reported-and-tested-by: Seth Forshee
    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Cc: stable@kernel.org # .21 ->

    Thomas Gleixner
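
The added check can be illustrated as follows. This is a sketch of the decision logic only: the feature bits `SIM_FEAT_ONESHOT` and `SIM_FEAT_C3STOP`, `struct sim_dev` and `sim_oneshot_available()` are hypothetical stand-ins and do not reproduce the body of tick_is_oneshot_available().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature bits modelled on the flags named in the commit. */
#define SIM_FEAT_ONESHOT 0x1u
#define SIM_FEAT_C3STOP  0x2u

struct sim_dev {
    unsigned features;
};

/*
 * Decide whether a cpu may switch to oneshot mode.
 * 'bc' is the backup broadcast device and may be NULL if none exists.
 */
static bool sim_oneshot_available(const struct sim_dev *percpu,
                                  const struct sim_dev *bc)
{
    assert(percpu != NULL);

    if (!(percpu->features & SIM_FEAT_ONESHOT))
        return false;   /* the per cpu timer itself cannot do oneshot */

    if (!(percpu->features & SIM_FEAT_C3STOP))
        return true;    /* timer keeps running, no backup needed */

    /* Timer stops in deep idle: the backup must support oneshot too. */
    return bc != NULL && (bc->features & SIM_FEAT_ONESHOT) != 0;
}
```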
     

01 Feb, 2011

1 commit

  • All callers of do_timer() are converted to xtime_update(). The only
    users of xtime_lock are in kernel/time/. Make both local to
    kernel/time/ and remove them from the global header files.

    [ tglx: Reuse tick-internal.h instead of creating another local header
    file. Massaged changelog ]

    Signed-off-by: Torben Hohn
    Cc: Peter Zijlstra
    Cc: johnstul@us.ibm.com
    Cc: yong.zhang0@gmail.com
    Cc: hch@infradead.org
    Signed-off-by: Thomas Gleixner

    Torben Hohn
     

12 Jul, 2010

1 commit


15 Dec, 2009

1 commit


20 Aug, 2009

1 commit

  • Currently clockevents_notify() is called with interrupts enabled at
    some places and interrupts disabled at some other places.

    This results in a deadlock in this scenario.

    cpu A holds clockevents_lock in clockevents_notify() with irqs enabled
    cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled
    cpu C doing set_mtrr() which will try to rendezvous all the cpus.

    This results in C and A reaching the rendezvous point and waiting
    for B. B is stuck forever waiting for the spinlock and thus never
    reaches the rendezvous point.

    Fix the clockevents code so that clockevents_lock is taken with
    interrupts disabled and thus avoid the above deadlock.

    Also call lapic_timer_propagate_broadcast() on the destination cpu so
    that we avoid calling smp_call_function() in the clockevents notifier
    chain.

    This issue left us wondering if we need to change the MTRR rendezvous
    logic to use stop machine logic (instead of smp_call_function) or add
    a check in spinlock debug code to see if there are other spinlocks
    which get taken under both interrupts enabled/disabled conditions.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Venkatesh Pallipadi
    Cc: "Pallipadi Venkatesh"
    Cc: "Brown Len"
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Suresh Siddha
     

02 May, 2009

1 commit


01 Jan, 2009

2 commits


13 Dec, 2008

1 commit


18 Oct, 2008

1 commit

  • We did not restart the tick device from irq_enter() to avoid double
    reprogramming and extra events in the immediate return to idle case.

    But long lasting softirqs can lead to a situation where jiffies become
    stale:

    idle()
    tick stopped (reprogrammed to next pending timer)
    halt()
    interrupt
    jiffies updated from irq_enter()
    interrupt handler
    softirq function 1 runs 20ms
    softirq function 2 arms a 10ms timer with a stale jiffies value
    jiffies updated from irq_exit()
    timer wheel has now an already expired timer
    (the one added in function 2)
    timer fires and timer softirq runs

    This was discovered when debugging a timer problem which happened only
    when the ath5k driver is active. The debugging proved that there is a
    softirq function running for more than 20ms, which is a bug by itself.

    To solve this we restart the tick timer right from irq_enter(), but do
    not go through the other functions which are necessary to return from
    idle when need_resched() is set.

    Reported-by: Elias Oltmanns
    Signed-off-by: Thomas Gleixner
    Tested-by: Elias Oltmanns

    Thomas Gleixner
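
The timeline in this commit message can be replayed with a small simulation. It is only a sketch: one jiffy is assumed to equal one millisecond, and `struct sim_cpu`, the `sim_*` helpers and the `restart_tick` parameter are hypothetical names used to compare the old and the new behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per cpu state; 1 jiffy == 1 ms in this model. */
struct sim_cpu {
    uint64_t wall_ms;    /* real elapsed time */
    uint64_t jiffies;    /* what timer users see */
    bool tick_running;   /* false while the tick is stopped in idle */
};

/* Interrupt entry: catch up jiffies and optionally restart the tick. */
static void sim_irq_enter(struct sim_cpu *cpu, bool restart_tick)
{
    assert(cpu->wall_ms >= cpu->jiffies);
    cpu->jiffies = cpu->wall_ms;
    if (restart_tick)
        cpu->tick_running = true;
}

/* A softirq function that runs for 'ms' milliseconds. */
static void sim_run_softirq(struct sim_cpu *cpu, uint64_t ms)
{
    cpu->wall_ms += ms;
    if (cpu->tick_running)
        cpu->jiffies = cpu->wall_ms;   /* tick keeps jiffies current */
}

/* Arm a timer relative to the jiffies value currently visible. */
static uint64_t sim_arm_timer(const struct sim_cpu *cpu, uint64_t delay)
{
    return cpu->jiffies + delay;
}

/* Interrupt exit: jiffies are brought up to date again. */
static void sim_irq_exit(struct sim_cpu *cpu)
{
    cpu->jiffies = cpu->wall_ms;
}
```

With the tick left stopped, a 10 ms timer armed after a 20 ms softirq is computed from the stale value and is already expired by the time irq_exit() updates jiffies. With the tick restarted at irq_enter(), the expiry lies 10 ms in the future as intended.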
     

04 Oct, 2008

1 commit


23 Sep, 2008

2 commits

  • Impact: timer hang on CPU online observed on AMD C1E systems

    When a CPU is brought online then the broadcast machinery can
    be in the one shot state already. Check this and setup the timer
    device of the new CPU in one shot mode so the broadcast code
    can pick up the next_event value correctly.

    Another AMD C1E oddity, as we switch to broadcast immediately and
    not after the full bring up via the ACPI cpu idle code.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Impact: Possible hang on CPU online observed on AMD C1E machines.

    The broadcast setup code looks at the mode of the tick device to
    determine whether it needs to be shut down or setup. This is wrong
    when the broadcast mode is set to one shot already. This can happen
    when a CPU is brought online as it goes through the periodic setup
    first.

    The problem went unnoticed as sane systems do not call into that code
    before the switch to one shot for the clock event device happens.
    The AMD C1E idle routine switches over immediately and thereby shuts
    down the just setup device before the first interrupt happens.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

17 Sep, 2008

1 commit

  • The device shut down does not cleanup the next_event variable of the
    clock event device. So when the device is reactivated the possible
    stale next_event value can prevent the device from being reprogrammed
    as it claims to wait on an event already.

    This is the root cause of the resurfacing suspend/resume problem,
    where systems need key press to come back to life.

    Fix this by setting next_event to KTIME_MAX when the device is shut
    down. Use a separate function for shutdown which takes care of that
    and only keep the direct set mode call in the broadcast code, where we
    can not touch the next_event value.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
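
Why a stale next_event blocks reprogramming can be shown with a reduced model. The sketch assumes a caller that only reprograms the device when the new expiry is earlier than the recorded one; `struct sim_dev`, `SIM_KTIME_MAX` and the `sim_*` functions are hypothetical, and the `reset_next_event` parameter exists only to contrast the behaviour before and after the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for KTIME_MAX. */
#define SIM_KTIME_MAX INT64_MAX

struct sim_dev {
    bool active;
    int64_t next_event;  /* expiry the device claims to be waiting for */
};

/* Shut the device down, optionally forgetting the pending expiry. */
static void sim_shutdown(struct sim_dev *dev, bool reset_next_event)
{
    dev->active = false;
    if (reset_next_event)
        dev->next_event = SIM_KTIME_MAX;
}

/*
 * Reprogram only if the new expiry is earlier than the recorded one.
 * Returns true if the hardware would actually be touched.
 */
static bool sim_program_if_earlier(struct sim_dev *dev, int64_t expires)
{
    assert(expires >= 0);

    if (expires >= dev->next_event)
        return false;   /* device claims an earlier event is pending */

    dev->active = true;
    dev->next_event = expires;
    return true;
}
```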
     

06 Sep, 2008

1 commit


05 Sep, 2008

3 commits

  • The C1E/HPET bug reports on AMDX2/RS690 systems were tracked down to a
    too small value of the HPET minimum delta for programming an event.

    The clockevents code needs to enforce an interrupt event on the clock event
    device in some cases. The enforcement code was stupid and naive, as it just
    added the minimum delta to the current time and tried to reprogram the device.
    When the minimum delta is too small, then this loops forever.

    Add a sanity check. Allow reprogramming to fail 3 times, then print a warning
    and double the minimum delta value to make sure, that this does not happen again.
    Use the same function for both tick-oneshot and tick-broadcast code.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
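
The sanity check described above can be sketched as a retry loop. This is a user-space simulation, not the kernel function: `struct sim_clockevent`, `sim_program()` and `sim_force_event()` are hypothetical, and the device is modelled as one where each programming attempt costs a fixed `latency_ns`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical simulated clock event device. */
struct sim_clockevent {
    uint64_t now_ns;        /* simulated current time */
    uint64_t min_delta_ns;  /* smallest delay we try to program */
    uint64_t latency_ns;    /* time one programming attempt takes */
    unsigned warnings;      /* how often the sanity check triggered */
};

/* Fails when the expiry has already passed once the write has landed. */
static int sim_program(struct sim_clockevent *dev, uint64_t expires_ns)
{
    dev->now_ns += dev->latency_ns;
    return expires_ns > dev->now_ns ? 0 : -1;
}

/*
 * Force an event at now + min_delta. After three failures in a row,
 * warn and double min_delta so the loop cannot spin forever.
 * Returns the number of programming attempts that were needed.
 */
static unsigned sim_force_event(struct sim_clockevent *dev)
{
    unsigned attempts = 0, failures = 0;

    assert(dev->min_delta_ns > 0);

    for (;;) {
        attempts++;
        if (sim_program(dev, dev->now_ns + dev->min_delta_ns) == 0)
            return attempts;

        if (++failures >= 3) {
            dev->warnings++;
            dev->min_delta_ns *= 2;
            failures = 0;
            fprintf(stderr, "min delta increased to %llu ns\n",
                    (unsigned long long)dev->min_delta_ns);
        }
    }
}
```

Without the doubling step, a `min_delta_ns` smaller than the programming latency would make every attempt fail, which is the endless loop the commit message describes.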
     
  • While chasing the C1E/HPET bugreports I went through the clock events
    code inch by inch and found that the broadcast device can be initialized
    and shutdown multiple times. Multiple shutdowns are not critical, but a
    useless waste of time. Multiple initializations are simply broken. Another
    CPU might have the device in use already after the first initialization and
    the second init could just render it unusable again.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • The reprogramming of the periodic broadcast handler was broken,
    when the first programming returned -ETIME. The clockevents code
    stores the new expiry value in the clock events device next_event field
    only when the programming time has not elapsed yet. The loop in
    question calculates the new expiry value from the next_event value
    and the expiry therefore never increases.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
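
The loop problem can be reproduced in a few lines. The sketch below shows a corrected loop under stated assumptions: `struct sim_dev`, `sim_program()` and `sim_rearm_periodic()` are hypothetical names, and time is a plain integer. A loop that recomputed the expiry as `dev->next_event + period` on every pass would retry the same value forever, because `next_event` is only stored on success.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulated device; time units are arbitrary. */
struct sim_dev {
    int64_t now;         /* current time, fixed during the loop */
    int64_t next_event;  /* only updated when programming succeeds */
};

/* Mirrors the described behaviour: store the expiry only on success. */
static int sim_program(struct sim_dev *dev, int64_t expires)
{
    if (expires <= dev->now)
        return -1;       /* already elapsed, like -ETIME */
    dev->next_event = expires;
    return 0;
}

/*
 * Re-arm a periodic event. The expiry is advanced in a local variable,
 * so each retry really moves forward by one period.
 * Returns the number of attempts needed.
 */
static unsigned sim_rearm_periodic(struct sim_dev *dev, int64_t period)
{
    int64_t next = dev->next_event;
    unsigned tries = 0;

    assert(period > 0);

    for (;;) {
        next += period;
        tries++;
        if (sim_program(dev, next) == 0)
            return tries;
    }
}
```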
     

16 Jul, 2008

2 commits

  • Conflicts:

    arch/x86/xen/smp.c
    kernel/sched_rt.c
    net/iucv/iucv.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Conflicts:

    arch/powerpc/Kconfig
    arch/s390/kernel/time.c
    arch/x86/kernel/apic_32.c
    arch/x86/kernel/cpu/perfctr-watchdog.c
    arch/x86/kernel/i8259_64.c
    arch/x86/kernel/ldt.c
    arch/x86/kernel/nmi_64.c
    arch/x86/kernel/smpboot.c
    arch/x86/xen/smp.c
    include/asm-x86/hw_irq_32.h
    include/asm-x86/hw_irq_64.h
    include/asm-x86/mach-default/irq_vectors.h
    include/asm-x86/mach-voyager/irq_vectors.h
    include/asm-x86/smp.h
    kernel/Makefile

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

08 Jul, 2008

1 commit

  • C1E on AMD machines is like C3 but without control from the OS. Up to
    now we disabled the local apic timer for those machines as it stops
    when the CPU goes into C1E. This excludes those machines from high
    resolution timers / dynamic ticks, which hurts especially X2 based
    laptops.

    The current boot time C1E detection has another, more serious flaw
    as well: some BIOSes do not enable C1E until the ACPI processor module
    is loaded. This causes systems to stop working after that point.

    To work nicely with C1E enabled machines we use a separate idle
    function, which checks on idle entry whether C1E was enabled in the
    Interrupt Pending Message MSR. This allows us to do timer broadcasting
    for C1E and covers the late enablement of C1E as well.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

26 Jun, 2008

1 commit


24 May, 2008

1 commit


21 Apr, 2008

1 commit


17 Apr, 2008

1 commit

  • > Generic code is not supposed to include irq.h. Replace this include
    > by linux/hardirq.h instead and add/replace an include of linux/irq.h
    > in asm header files where necessary.
    > This change should only matter for architectures that make use of
    > GENERIC_CLOCKEVENTS.
    > Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
    >
    > I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
    > This patch fixes also build breakages caused by the include replacement in
    > tick-common.h.

    I generally dislike adding optional linux/* includes in asm/* includes -
    I'm nervous about this causing include loops.

    However, there's a separate point to be discussed here.

    That is, what interfaces are expected of every architecture in the kernel.
    If generic code wants to be able to set the affinity of interrupts, then
    that needs to become part of the interfaces listed in linux/interrupt.h
    rather than linux/irq.h.

    So what I suggest is this approach instead (against Linus' tree of a
    couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
    to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
    and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
    rarely used include since not much touches the stacked parent context
    registers.)

    Build tested on ARM PXA family kernels and ARM's Realview platform
    kernels which both use genirq.

    [ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]

    Signed-off-by: Russell King
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Russell King
     

30 Jan, 2008

1 commit


19 Dec, 2007

1 commit

  • Resolve the following regression of a choppy, almost unusable laptop:

    http://lkml.org/lkml/2007/12/7/299
    http://bugzilla.kernel.org/show_bug.cgi?id=9525

    A previous version of the code did the reprogramming of the broadcast
    device in the return from idle code. This was removed, but the logic in
    tick_handle_oneshot_broadcast() was kept the same.

    When a broadcast interrupt happens we signal the expiry to all CPUs
    which have an expired event. If none of the CPUs has an expired event,
    which can happen in dyntick mode, then we reprogram the broadcast
    device. We do not reprogram otherwise, but this is only correct if all
    CPUs, which are in the idle broadcast state have been woken up.

    The code ignores, that there might be pending not yet expired events on
    other CPUs, which are in the idle broadcast state. So the delivery of
    those events can be delayed for quite a time.

    Change the tick_handle_oneshot_broadcast() function to check for CPUs,
    which are in broadcast state and are not woken up by the current event,
    and enforce the rearming of the broadcast device for those CPUs.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
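
The corrected handler logic can be sketched as a single pass over the cpus in broadcast state. This is an illustration only: `struct sim_bc`, `SIM_NCPUS`, `SIM_KTIME_MAX` and `sim_handle_broadcast()` are hypothetical, and the real tick_handle_oneshot_broadcast() works on cpumasks and per cpu devices.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_NCPUS 4
#define SIM_KTIME_MAX INT64_MAX  /* hypothetical "no event pending" value */

/* Hypothetical broadcast bookkeeping. */
struct sim_bc {
    uint32_t oneshot_mask;          /* cpus in idle broadcast state */
    int64_t next_event[SIM_NCPUS];  /* pending expiry of each cpu */
};

/*
 * Handle a broadcast interrupt at time 'now'.
 * Sets *wake_mask to the cpus whose event has expired and returns the
 * earliest expiry among the cpus that stay asleep, or SIM_KTIME_MAX if
 * nobody is left. The caller re-arms the broadcast device with that value.
 */
static int64_t sim_handle_broadcast(const struct sim_bc *bc, int64_t now,
                                    uint32_t *wake_mask)
{
    int64_t next = SIM_KTIME_MAX;
    unsigned cpu;

    assert(wake_mask != 0);
    *wake_mask = 0;

    for (cpu = 0; cpu < SIM_NCPUS; cpu++) {
        uint32_t bit = (uint32_t)1 << cpu;

        if (!(bc->oneshot_mask & bit))
            continue;                    /* not in broadcast state */

        if (bc->next_event[cpu] <= now)
            *wake_mask |= bit;           /* expired: wake this cpu */
        else if (bc->next_event[cpu] < next)
            next = bc->next_event[cpu];  /* still pending: remember it */
    }
    return next;
}
```

The point of the fix is the `else if` branch: cpus that are not woken by the current event still determine when the broadcast device must fire next.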
     

06 Nov, 2007

1 commit


18 Oct, 2007

1 commit


17 Oct, 2007

1 commit


15 Oct, 2007

1 commit

  • The 64bit SMP bootup is slightly different to the 32bit one. It enables
    the boot CPU local APIC timer before all CPUs are brought up. Some AMD C1E
    systems have the C1E feature flag only set in the secondary CPU. Due to
    the early enable of the boot CPU local APIC timer the APIC timer is
    registered as a fully functional device. When we detect the wreckage during
    the bringup of the secondary CPU, we need to force the boot CPU into
    broadcast mode.

    Add a new notifier reason and implement the force broadcast in the clock
    events layer.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

13 Oct, 2007

2 commits


23 Sep, 2007

1 commit

  • In a desperate attempt to fix the suspend/resume problem on Andrew's
    VAIO I added a workaround which enforced the broadcast of the oneshot
    timer on resume. This was actually resolving the problem on the VAIO
    but was just a stupid workaround, which was not tackling the root
    cause: the assignment of lower idle C-States in the ACPI processor_idle
    code. The cpuidle patches, which utilize the dynamic tick feature and
    go faster into deeper C-states exposed the problem again. The correct
    solution is the previous patch, which prevents lower C-states across
    the suspend/resume.

    Remove the enforcement code, including the conditional broadcast timer
    arming, which helped to paper over the real problem for quite a time.
    The oneshot broadcast flag for the cpu, which runs the resume code can
    never be set at the time when this code is executed. It only gets set,
    when the CPU is entering a lower idle C-State.

    Signed-off-by: Thomas Gleixner
    Tested-by: Andrew Morton
    Cc: Len Brown
    Cc: Venkatesh Pallipadi
    Cc: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Thomas Gleixner