21 May, 2011
1 commit
-
Reason: Get upstream fixes and kfree_rcu which is necessary for a
follow up patch.Signed-off-by: Thomas Gleixner
17 May, 2011
1 commit
-
The first cpu which switches from periodic to oneshot mode switches
also the broadcast device into oneshot mode. The broadcast device
serves as a backup for per cpu timers which stop in deeper
C-states. To avoid starvation of the cpus which might be in idle and
depend on broadcast mode it marks the other cpus as broadcast active
and sets the brodcast expiry value of those cpus to the next tick.The oneshot mode broadcast bit for the other cpus is sticky and gets
only cleared when those cpus exit idle. If a cpu was not idle while
the bit got set in consequence the bit prevents that the broadcast
device is armed on behalf of that cpu when it enters idle for the
first time after it switched to oneshot mode.In most cases that goes unnoticed as one of the other cpus has usually
a timer pending which keeps the broadcast device armed with a short
timeout. Now if the only cpu which has a short timer active has the
bit set then the broadcast device will not be armed on behalf of that
cpu and will fire way after the expected timer expiry. In the case of
Christians bug report it took ~145 seconds which is about half of the
wrap around time of HPET (the limit for that device) due to the fact
that all other cpus had no timers armed which expired before the 145
seconds timeframe.The solution is simply to clear the broadcast active bit
unconditionally when a cpu switches to oneshot mode after the first
cpu switched the broadcast device over. It's not idle at that point
otherwise it would not be executing that code.[ I fundamentally hate that broadcast crap. Why the heck thought some
folks that when going into deep idle it's a brilliant concept to
switch off the last device which brings the cpu back from that
state? ]Thanks to Christian for providing all the valuable debug information!
Reported-and-tested-by: Christian Hoffmann
Cc: John Stultz
Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner
05 May, 2011
1 commit
-
Avoid taking broadcast_lock in the idle path for systems where the
timer doesn't stop in C3.[ tglx: Removed the stale label and added comment ]
Signed-off-by: Andi Kleen
Cc: Dave Kleikamp
Cc: Chris Mason
Cc: Peter Zijlstra
Cc: Tim Chen
Cc: lenb@kernel.org
Cc: paulmck@us.ibm.com
Link: http://lkml.kernel.org/r/%3C20110504234806.GF2925%40one.firstfloor.org%3E
Signed-off-by: Thomas Gleixner
16 Mar, 2011
1 commit
-
…l/git/tip/linux-2.6-tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
posix-clocks: Check write permissions in posix syscalls
hrtimer: Remove empty hrtimer_init_hres_timer()
hrtimer: Update hrtimer->state documentation
hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
timers: Export CLOCK_BOOTTIME via the posix timers interface
timers: Add CLOCK_BOOTTIME hrtimer base
time: Extend get_xtime_and_monotonic_offset() to also return sleep
time: Introduce get_monotonic_boottime and ktime_get_boottime
hrtimers: extend hrtimer base code to handle more then 2 clockids
ntp: Remove redundant and incorrect parameter check
mn10300: Switch do_timer() to xtimer_update()
posix clocks: Introduce dynamic clocks
posix-timers: Cleanup namespace
posix-timers: Add support for fd based clocks
x86: Add clock_adjtime for x86
posix-timers: Introduce a syscall for clock tuning.
time: Splitout compat timex accessors
ntp: Add ADJ_SETOFFSET mode bit
time: Introduce timekeeping_inject_offset
posix-timer: Update comment
...Fix up new system-call-related conflicts in
arch/x86/ia32/ia32entry.S
arch/x86/include/asm/unistd_32.h
arch/x86/include/asm/unistd_64.h
arch/x86/kernel/syscall_table_32.S
(name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some
due to movement of get_jiffies_64() in:
kernel/time.c
26 Feb, 2011
1 commit
-
When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we only
can switch into oneshot mode, when the backup broadcast device
supports oneshot mode as well. Otherwise we would try to switch the
broadcast device into an unsupported mode unconditionally. This went
unnoticed so far as the current available broadcast devices support
oneshot mode. Seth unearthed this problem while debugging and working
around an hpet related BIOS wreckage.Add the necessary check to tick_is_oneshot_available().
Reported-and-tested-by: Seth Forshee
Signed-off-by: Thomas Gleixner
LKML-Reference:
Cc: stable@kernel.org # .21 ->
01 Feb, 2011
1 commit
-
All callers of do_timer() are converted to xtime_update(). The only
users of xtime_lock are in kernel/time/. Make both local to
kernel/time/ and remove them from the global header files.[ tglx: Reuse tick-internal.h instead of creating another local header
file. Massaged changelog ]Signed-off-by: Torben Hohn
Cc: Peter Zijlstra
Cc: johnstul@us.ibm.com
Cc: yong.zhang0@gmail.com
Cc: hch@infradead.org
Signed-off-by: Thomas Gleixner
12 Jul, 2010
1 commit
-
Signed-off-by: Uwe Kleine-König
Signed-off-by: Jiri Kosina
15 Dec, 2009
1 commit
-
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.Signed-off-by: Thomas Gleixner
Acked-by: Peter Zijlstra
Acked-by: Ingo Molnar
20 Aug, 2009
1 commit
-
Currently clockevents_notify() is called with interrupts enabled at
some places and interrupts disabled at some other places.This results in a deadlock in this scenario.
cpu A holds clockevents_lock in clockevents_notify() with irqs enabled
cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled
cpu C doing set_mtrr() which will try to rendezvous of all the cpus.This will result in C and A come to the rendezvous point and waiting
for B. B is stuck forever waiting for the spinlock and thus not
reaching the rendezvous point.Fix the clockevents code so that clockevents_lock is taken with
interrupts disabled and thus avoid the above deadlock.Also call lapic_timer_propagate_broadcast() on the destination cpu so
that we avoid calling smp_call_function() in the clockevents notifier
chain.This issue left us wondering if we need to change the MTRR rendezvous
logic to use stop machine logic (instead of smp_call_function) or add
a check in spinlock debug code to see if there are other spinlocks
which gets taken under both interrupts enabled/disabled conditions.Signed-off-by: Suresh Siddha
Signed-off-by: Venkatesh Pallipadi
Cc: "Pallipadi Venkatesh"
Cc: "Brown Len"
LKML-Reference:
Signed-off-by: Thomas Gleixner
02 May, 2009
1 commit
-
The variable tick_broadcast_device is not used outside of the
file where it is defined, so let's make it static.Signed-off-by: Dmitri Vorobiev
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
01 Jan, 2009
2 commits
-
Impact: cleanup
Simple replacement, now the _nr is redundant.
Signed-off-by: Rusty Russell
Signed-off-by: Mike Travis
Cc: Ingo Molnar -
Impact: Use new APIs
Convert kernel/time functions to use struct cpumask *.
Note the ugly bitmap declarations in tick-broadcast.c. These should
be cpumask_var_t, but there was no obvious initialization function to
put the alloc_cpumask_var() calls in. This was safe.(Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK,
so we use a bitmap here to show we really mean it).Signed-off-by: Rusty Russell
Signed-off-by: Mike Travis
13 Dec, 2008
1 commit
-
Impact: change calling convention of existing clock_event APIs
struct clock_event_timer's cpumask field gets changed to take pointer,
as does the ->broadcast function.Another single-patch change. For safety, we BUG_ON() in
clockevents_register_device() if it's not set.Signed-off-by: Rusty Russell
Cc: Ingo Molnar
18 Oct, 2008
1 commit
-
We did not restart the tick device from irq_enter() to avoid double
reprogramming and extra events in the return immediate to idle case.But long lasting softirqs can lead to a situation where jiffies become
stale:idle()
tick stopped (reprogrammed to next pending timer)
halt()
interrupt
jiffies updated from irq_enter()
interrupt handler
softirq function 1 runs 20ms
softirq function 2 arms a 10ms timer with a stale jiffies value
jiffies updated from irq_exit()
timer wheel has now an already expired timer
(the one added in function 2)
timer fires and timer softirq runsThis was discovered when debugging a timer problem which happend only
when the ath5k driver is active. The debugging proved that there is a
softirq function running for more than 20ms, which is a bug by itself.To solve this we restart the tick timer right from irq_enter(), but do
not go through the other functions which are necessary to return from
idle when need_resched() is set.Reported-by: Elias Oltmanns
Signed-off-by: Thomas Gleixner
Tested-by: Elias Oltmanns
04 Oct, 2008
1 commit
-
Impact: jiffies increment too fast.
Hugh Dickins noted that with NOHZ=n and HIGHRES=n jiffies get
incremented too fast. The reason is a wrong check in the broadcast
enter/exit code, which keeps the local apic timer in periodic mode
when the switch happens.Signed-off-by: Thomas Gleixner
23 Sep, 2008
2 commits
-
Impact: timer hang on CPU online observed on AMD C1E systems
When a CPU is brought online then the broadcast machinery can
be in the one shot state already. Check this and setup the timer
device of the new CPU in one shot mode so the broadcast code
can pick up the next_event value correctly.Another AMD C1E oddity, as we switch to broadcast immediately and
not after the full bring up via the ACPI cpu idle code.Signed-off-by: Thomas Gleixner
-
Impact: Possible hang on CPU online observed on AMD C1E machines.
The broadcast setup code looks at the mode of the tick device to
determine whether it needs to be shut down or setup. This is wrong
when the broadcast mode is set to one shot already. This can happen
when a CPU is brought online as it goes through the periodic setup
first.The problem went unnoticed as sane systems do not call into that code
before the switch to one shot for the clock event device happens.
The AMD C1E idle routine switches over immediately and thereby shuts
down the just setup device before the first interrupt happens.Signed-off-by: Thomas Gleixner
17 Sep, 2008
1 commit
-
The device shut down does not cleanup the next_event variable of the
clock event device. So when the device is reactivated the possible
stale next_event value can prevent the device to be reprogrammed as it
claims to wait on a event already.This is the root cause of the resurfacing suspend/resume problem,
where systems need key press to come back to life.Fix this by setting next_event to KTIME_MAX when the device is shut
down. Use a separate function for shutdown which takes care of that
and only keep the direct set mode call in the broadcast code, where we
can not touch the next_event value.Signed-off-by: Thomas Gleixner
06 Sep, 2008
1 commit
-
Until the C1E patches arrived there where no users of periodic broadcast
before switching to oneshot mode. Now we need to trigger a possible
waiter for a periodic broadcast when switching to oneshot mode.
Otherwise we can starve them for ever.Signed-off-by: Thomas Gleixner
05 Sep, 2008
3 commits
-
The C1E/HPET bug reports on AMDX2/RS690 systems where tracked down to a
too small value of the HPET minumum delta for programming an event.The clockevents code needs to enforce an interrupt event on the clock event
device in some cases. The enforcement code was stupid and naive, as it just
added the minimum delta to the current time and tried to reprogram the device.
When the minimum delta is too small, then this loops forever.Add a sanity check. Allow reprogramming to fail 3 times, then print a warning
and double the minimum delta value to make sure, that this does not happen again.
Use the same function for both tick-oneshot and tick-broadcast code.Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar -
While chasing the C1E/HPET bugreports I went through the clock events
code inch by inch and found that the broadcast device can be initialized
and shutdown multiple times. Multiple shutdowns are not critical, but
useless waste of time. Multiple initializations are simply broken. Another
CPU might have the device in use already after the first initialization and
the second init could just render it unusable again.Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar -
The reprogramming of the periodic broadcast handler was broken,
when the first programming returned -ETIME. The clockevents code
stores the new expiry value in the clock events device next_event field
only when the programming time has not been elapsed yet. The loop in
question calculates the new expiry value from the next_event value
and therefor never increases.Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
16 Jul, 2008
2 commits
-
Conflicts:
arch/x86/xen/smp.c
kernel/sched_rt.c
net/iucv/iucv.cSigned-off-by: Ingo Molnar
-
Conflicts:
arch/powerpc/Kconfig
arch/s390/kernel/time.c
arch/x86/kernel/apic_32.c
arch/x86/kernel/cpu/perfctr-watchdog.c
arch/x86/kernel/i8259_64.c
arch/x86/kernel/ldt.c
arch/x86/kernel/nmi_64.c
arch/x86/kernel/smpboot.c
arch/x86/xen/smp.c
include/asm-x86/hw_irq_32.h
include/asm-x86/hw_irq_64.h
include/asm-x86/mach-default/irq_vectors.h
include/asm-x86/mach-voyager/irq_vectors.h
include/asm-x86/smp.h
kernel/MakefileSigned-off-by: Ingo Molnar
08 Jul, 2008
1 commit
-
C1E on AMD machines is like C3 but without control from the OS. Up to
now we disabled the local apic timer for those machines as it stops
when the CPU goes into C1E. This excludes those machines from high
resolution timers / dynamic ticks, which hurts especially X2 based
laptops.The current boot time C1E detection has another, more serious flaw
as well: some BIOSes do not enable C1E until the ACPI processor module
is loaded. This causes systems to stop working after that point.To work nicely with C1E enabled machines we use a separate idle
function, which checks on idle entry whether C1E was enabled in the
Interrupt Pending Message MSR. This allows us to do timer broadcasting
for C1E and covers the late enablement of C1E as well.Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
26 Jun, 2008
1 commit
-
It's never used and the comments refer to nonatomic and retry
interchangably. So get rid of it.Acked-by: Jeremy Fitzhardinge
Signed-off-by: Jens Axboe
24 May, 2008
1 commit
-
Change references from for_each_cpu_mask to for_each_cpu_mask_nr
where appropriateReviewed-by: Paul Jackson
Reviewed-by: Christoph Lameter
Signed-off-by: Mike Travis
Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner
21 Apr, 2008
1 commit
-
braodcast -> broadcast
Signed-off-by: Glauber Costa
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
17 Apr, 2008
1 commit
-
> Generic code is not supposed to include irq.h. Replace this include
> by linux/hardirq.h instead and add/replace an include of linux/irq.h
> in asm header files where necessary.
> This change should only matter for architectures that make use of
> GENERIC_CLOCKEVENTS.
> Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
>
> I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
> This patch fixes also build breakages caused by the include replacement in
> tick-common.h.I generally dislike adding optional linux/* includes in asm/* includes -
I'm nervous about this causing include loops.However, there's a separate point to be discussed here.
That is, what interfaces are expected of every architecture in the kernel.
If generic code wants to be able to set the affinity of interrupts, then
that needs to become part of the interfaces listed in linux/interrupt.h
rather than linux/irq.h.So what I suggest is this approach instead (against Linus' tree of a
couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
rarely used include since not much touches the stacked parent context
registers.)Build tested on ARM PXA family kernels and ARM's Realview platform
kernels which both use genirq.[ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]
Signed-off-by: Russell King
Signed-off-by: Thomas Gleixner
Signed-off-by: Martin Schwidefsky
Signed-off-by: Heiko Carstens
30 Jan, 2008
1 commit
-
clean up tick-broadcast.c
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
19 Dec, 2007
1 commit
-
Resolve the following regression of a choppy, almost unusable laptop:
http://lkml.org/lkml/2007/12/7/299
http://bugzilla.kernel.org/show_bug.cgi?id=9525A previous version of the code did the reprogramming of the broadcast
device in the return from idle code. This was removed, but the logic in
tick_handle_oneshot_broadcast() was kept the same.When a broadcast interrupt happens we signal the expiry to all CPUs
which have an expired event. If none of the CPUs has an expired event,
which can happen in dyntick mode, then we reprogram the broadcast
device. We do not reprogram otherwise, but this is only correct if all
CPUs, which are in the idle broadcast state have been woken up.The code ignores, that there might be pending not yet expired events on
other CPUs, which are in the idle broadcast state. So the delivery of
those events can be delayed for quite a time.Change the tick_handle_oneshot_broadcast() function to check for CPUs,
which are in broadcast state and are not woken up by the current event,
and enforce the rearming of the broadcast device for those CPUs.Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
06 Nov, 2007
1 commit
-
Signed-off-by: Li Zefan
Cc: Thomas Gleixner
Cc: john stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
18 Oct, 2007
1 commit
-
Doh, I completely missed that devices marked DUMMY are not running
the set_mode function. So we force broadcasting, but we keep the
local APIC timer running.Let the clock event layer mark the device _after_ switching it off.
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
17 Oct, 2007
1 commit
-
smp_call_function_single() now knows how to call the function on the
current cpu.Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Avi Kivity
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Oct, 2007
1 commit
-
The 64bit SMP bootup is slightly different to the 32bit one. It enables
the boot CPU local APIC timer before all CPUs are brought up. Some AMD C1E
systems have the C1E feature flag only set in the secondary CPU. Due to
the early enable of the boot CPU local APIC timer the APIC timer is
registered as a fully functional device. When we detect the wreckage during
the bringup of the secondary CPU, we need to force the boot CPU into
broadcast mode.Add a new notifier reason and implement the force broadcast in the clock
events layer.Signed-off-by: Thomas Gleixner
13 Oct, 2007
2 commits
-
Change the broadcast timer, if a timer with higher rating becomes available.
Signed-off-by: Venkatesh Pallipadi
Cc: Andi Kleen
Cc: john stultz
Cc: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar
Signed-off-by: Arjan van de Ven
Signed-off-by: Thomas Gleixner -
The next_event member of the clock event device is used to keep track
of the next periodic event. For one shot only devices it is wrong to
clear the variable, as the next event will be based on it.Pointed out by Ralf Baechle
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
Signed-off-by: Arjan van de Ven
23 Sep, 2007
1 commit
-
In a desparate attempt to fix the suspend/resume problem on Andrews
VAIO I added a workaround which enforced the broadcast of the oneshot
timer on resume. This was actually resolving the problem on the VAIO
but was just a stupid workaround, which was not tackling the root
cause: the assignement of lower idle C-States in the ACPI processor_idle
code. The cpuidle patches, which utilize the dynamic tick feature and
go faster into deeper C-states exposed the problem again. The correct
solution is the previous patch, which prevents lower C-states across
the suspend/resume.Remove the enforcement code, including the conditional broadcast timer
arming, which helped to pamper over the real problem for quite a time.
The oneshot broadcast flag for the cpu, which runs the resume code can
never be set at the time when this code is executed. It only gets set,
when the CPU is entering a lower idle C-State.Signed-off-by: Thomas Gleixner
Tested-by: Andrew Morton
Cc: Len Brown
Cc: Venkatesh Pallipadi
Cc: Rafael J. Wysocki
Signed-off-by: Linus Torvalds
16 Sep, 2007
2 commits
-
When a cpu goes offline it is removed from the broadcast masks. If the
mask becomes empty the code shuts down the broadcast device. This is
wrong, because the broadcast device needs to be ready for the online
cpu going idle (into a c-state, which stops the local apic timer).Signed-off-by: Thomas Gleixner
-
The jinxed VAIO refuses to resume without hitting keys on the keyboard
when this is not enforced. It is unclear why the cpu ends up in a lower
C State without notifying the clock events layer, but enforcing the
oneshot broadcast here is safe.Signed-off-by: Thomas Gleixner