31 Mar, 2009
1 commit
-
It appears I inadvertently introduced rq->lock recursion to the
hrtimer_start() path when I delegated running already expired
timers to softirq context.
This patch fixes it by introducing a __hrtimer_start_range_ns()
method that will not use raise_softirq_irqoff() but
__raise_softirq_irqoff() which avoids the wakeup.
It then also changes schedule() to check for pending softirqs and
do the wakeup then. I'm not quite sure I like this last bit, nor
am I convinced it's really needed.
Signed-off-by: Peter Zijlstra
Cc: Peter Zijlstra
Cc: paulus@samba.org
LKML-Reference:
Signed-off-by: Ingo Molnar
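The idea in this message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the "runqueue lock" is a plain flag, and none of this is the kernel's actual code. It shows why raising a softirq without the wakeup is safe while the lock is held, and how a later check can perform the wakeup.

```c
#include <stdbool.h>

/* All names below are hypothetical; this is a toy model, not kernel code. */
static bool toy_rq_locked;        /* models rq->lock being held */
static unsigned int toy_pending;  /* bitmask of raised softirqs */
static int toy_wakeups;           /* successful wakeups of the softirq thread */
static int toy_lock_recursions;   /* times the wakeup ran with the lock held */

/* In this model, waking the softirq thread needs the runqueue lock. */
static void toy_wake_softirqd(void)
{
    if (toy_rq_locked) {
        toy_lock_recursions++;    /* a real spinlock would recurse here */
        return;
    }
    toy_wakeups++;
}

/* Raise and wake immediately: unsafe while the lock is held. */
static void toy_raise_softirq(unsigned int nr)
{
    toy_pending |= 1u << nr;
    toy_wake_softirqd();
}

/* Raise only: mark the softirq pending and leave the wakeup for later. */
static void toy_raise_softirq_nowake(unsigned int nr)
{
    toy_pending |= 1u << nr;
}

/* Later check, done once the lock has been dropped. */
static void toy_schedule_check(void)
{
    if (toy_pending)
        toy_wake_softirqd();
}
```

In the model, calling toy_raise_softirq() with the flag set counts a recursion, while toy_raise_softirq_nowake() followed by toy_schedule_check() after unlocking counts one clean wakeup.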
31 Jan, 2009
3 commits
-
Impact: prevent false positive WARN_ON() in clockevents_program_event()
clock_was_set() changes the base->offset of CLOCK_REALTIME and
enforces the reprogramming of the clockevent device to expire timers
which are based on CLOCK_REALTIME. If the clock change is large enough
then the subtraction of the timer expiry value and base->offset can
become negative which triggers the warning in
clockevents_program_event().
Check the subtraction result and set a negative value to 0.
Signed-off-by: Thomas Gleixner
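A minimal sketch of the clamp described above, written as plain user-space C. The helper name and the nanosecond integer types are assumptions made for illustration; the kernel works on ktime_t values and this is not its code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: subtract the clock base offset from a timer's
 * expiry and clamp a negative result to 0, so the value handed on to
 * the event-programming step is never negative.
 */
static int64_t toy_expiry_minus_offset_ns(int64_t expires_ns, int64_t offset_ns)
{
    int64_t delta = expires_ns - offset_ns;

    return delta < 0 ? 0 : delta;
}
```

With a large forward clock change, the offset can exceed the expiry, and the clamp turns that case into "expire immediately" instead of a negative time.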
-
Impact: fix CPU hotplug hang on Power6 testbox
On architectures that support offlining all cpus (at least powerpc/pseries),
hot-unplugging the tick_do_timer_cpu can result in a system hang.
This comes from the fact that if the cpu going down happens to be the
cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
cpu is dead (via the CPU_DEAD notification), we're left without ticks,
jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
That's particularly the case for the cpu looping in __cpu_die() waiting
for the dying cpu to be dead.
This patch addresses this by having the tick_do_timer_cpu handover happen
earlier during the CPU_DYING notification. For this, a new clockevent
notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
in hrtimer_cpu_notify().
Signed-off-by: Sebastien Dugue
Cc:
Signed-off-by: Ingo Molnar
-
Impact: avoid timer IRQ hanging slow systems
While using the function graph tracer on a virtualized system, the
hrtimer_interrupt can hang the system in an infinite loop.
This can be caused in several situations:
- the hardware is very slow and HZ is set too high
- something intrusive is slowing the system down (tracing under emulation)
... and the next clock events to program are always before the current time.
This patch implements a reasonable compromise: if such a situation is
detected, we share the CPU's time in 1/4 to process the hrtimer interrupts.
This is enough to keep the system running without serious starvation.
It has been successfully tested under VirtualBox with 1000 HZ and 100 HZ
with function graph tracer launched. In both cases, the clock events were
increased until about 25 ms periodic ticks, which means 40 HZ.
So we change a hard to debug hang into a warning message and a system that
still manages to limp along.
Signed-off-by: Frederic Weisbecker
Signed-off-by: Ingo Molnar
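The arithmetic behind the 1/4 share can be sketched as follows. This is an illustrative model only: the helper names are hypothetical, and the assumption that the minimum period is simply four times the measured handler cost is a reading of the message above, not a copy of the kernel's logic.

```c
#include <stdint.h>

/*
 * Hypothetical helper: if the interrupt handler may use at most 1/4 of
 * the CPU, the period between events must be at least 4x its cost.
 */
static uint64_t toy_min_period_ns(uint64_t handler_cost_ns)
{
    return handler_cost_ns * 4;
}

/* Convert a period in nanoseconds to a whole-number rate in Hz. */
static uint64_t toy_period_to_hz(uint64_t period_ns)
{
    return period_ns ? 1000000000ULL / period_ns : 0;
}
```

For example, a handler cost of 6.25 ms (an assumed figure) gives a 25 ms period, and a 25 ms period corresponds to the 40 HZ mentioned in the message.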
27 Jan, 2009
1 commit
-
…el/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimers: fix inconsistent lock state on resume in hres_timers_resume
time-sched.c: tick_nohz_update_jiffies should be static
locking, hpet: annotate false positive warning
kernel/fork.c: unused variable 'ret'
itimers: remove the per-cpu-ish-ness
19 Jan, 2009
1 commit
-
Andrey Borzenkov reported this lockdep assert:
> [17854.688347] =================================
> [17854.688347] [ INFO: inconsistent lock state ]
> [17854.688347] 2.6.29-rc2-1avb #1
> [17854.688347] ---------------------------------
> [17854.688347] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> [17854.688347] pm-suspend/18240 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [17854.688347] (&cpu_base->lock){++..}, at: [] retrigger_next_event+0x5c/0xa0
> [17854.688347] {in-hardirq-W} state was registered at:
> [17854.688347] [] __lock_acquire+0x79d/0x1930
> [17854.688347] [] lock_acquire+0x5c/0x80
> [17854.688347] [] _spin_lock+0x35/0x70
> [17854.688347] [] hrtimer_run_queues+0x31/0x140
> [17854.688347] [] run_local_timers+0x8/0x20
> [17854.688347] [] update_process_times+0x23/0x60
> [17854.688347] [] tick_periodic+0x24/0x80
> [17854.688347] [] tick_handle_periodic+0x12/0x70
> [17854.688347] [] timer_interrupt+0x14/0x20
> [17854.688347] [] handle_IRQ_event+0x29/0x60
> [17854.688347] [] handle_level_irq+0x69/0xe0
> [17854.688347] [] 0xffffffff
> [17854.688347] irq event stamp: 55771
> [17854.688347] hardirqs last enabled at (55771): [] _spin_unlock_irqrestore+0x35/0x60
> [17854.688347] hardirqs last disabled at (55770): [] _spin_lock_irqsave+0x19/0x80
> [17854.688347] softirqs last enabled at (54836): [] __do_softirq+0xc4/0x110
> [17854.688347] softirqs last disabled at (54831): [] do_softirq+0x8e/0xe0
> [17854.688347]
> [17854.688347] other info that might help us debug this:
> [17854.688347] 3 locks held by pm-suspend/18240:
> [17854.688347] #0: (&buffer->mutex){--..}, at: [] sysfs_write_file+0x25/0x100
> [17854.688347] #1: (pm_mutex){--..}, at: [] enter_state+0x4f/0x140
> [17854.688347] #2: (dpm_list_mtx){--..}, at: [] device_pm_lock+0xf/0x20
> [17854.688347]
> [17854.688347] stack backtrace:
> [17854.688347] Pid: 18240, comm: pm-suspend Not tainted 2.6.29-rc2-1avb #1
> [17854.688347] Call Trace:
> [17854.688347] [] ? printk+0x18/0x20
> [17854.688347] [] print_usage_bug+0x16c/0x1d0
> [17854.688347] [] mark_lock+0x8bf/0xc90
> [17854.688347] [] ? pit_next_event+0x2f/0x40
> [17854.688347] [] __lock_acquire+0x580/0x1930
> [17854.688347] [] ? _spin_unlock+0x1d/0x20
> [17854.688347] [] ? pit_next_event+0x2f/0x40
> [17854.688347] [] ? clockevents_program_event+0x98/0x160
> [17854.688347] [] ? mark_held_locks+0x48/0x90
> [17854.688347] [] ? _spin_unlock_irqrestore+0x35/0x60
> [17854.688347] [] ? trace_hardirqs_on_caller+0x139/0x190
> [17854.688347] [] ? trace_hardirqs_on+0xb/0x10
> [17854.688347] [] lock_acquire+0x5c/0x80
> [17854.688347] [] ? retrigger_next_event+0x5c/0xa0
> [17854.688347] [] _spin_lock+0x35/0x70
> [17854.688347] [] ? retrigger_next_event+0x5c/0xa0
> [17854.688347] [] retrigger_next_event+0x5c/0xa0
> [17854.688347] [] hres_timers_resume+0xa/0x10
> [17854.688347] [] timekeeping_resume+0xee/0x150
> [17854.688347] [] __sysdev_resume+0x14/0x50
> [17854.688347] [] sysdev_resume+0x47/0x80
> [17854.688347] [] device_power_up+0xb/0x20
> [17854.688347] [] suspend_devices_and_enter+0xcf/0x150
> [17854.688347] [] ? freeze_processes+0x3f/0x90
> [17854.688347] [] enter_state+0xf4/0x140
> [17854.688347] [] state_store+0x7d/0xc0
> [17854.688347] [] ? state_store+0x0/0xc0
> [17854.688347] [] kobj_attr_store+0x24/0x30
> [17854.688347] [] sysfs_write_file+0x9c/0x100
> [17854.688347] [] vfs_write+0x9c/0x160
> [17854.688347] [] ? restore_nocheck_notrace+0x0/0xe
> [17854.688347] [] ? sysfs_write_file+0x0/0x100
> [17854.688347] [] sys_write+0x3d/0x70
> [17854.688347] [] sysenter_do_call+0x12/0x31
Andrey's analysis:
> timekeeping_resume() is called via class ->resume
> method; and according to comments in sysdev_resume() and
> device_power_up(), they are called with interrupts disabled.
>
> Looking at suspend_enter, irqs *are* disabled at this point.
>
> So it actually looks like something (may be some driver)
> unconditionally enabled irqs in resume path.
Add a debug check to test this theory. If it triggers then it
triggers because the resume code calls it with irqs enabled,
which is a no-no not just for timekeeping_resume(), but also
bad for a number of other resume handlers.
Reported-by: Andrey Borzenkov
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
14 Jan, 2009
1 commit
-
Signed-off-by: Heiko Carstens
05 Jan, 2009
6 commits
-
Impact: build fix on !CONFIG_HIGH_RES_TIMERS
Fix:
kernel/hrtimer.c:1586: error: implicit declaration of function '__hrtimer_peek_ahead_timers'
Signed-off-by: Ingo Molnar
-
Clean up the comments
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
Impact: fix rare runtime deadlock
There are a few sites that do:
  spin_lock_irq(&foo)
  hrtimer_start(&bar)
    __run_hrtimer(&bar)
      func()
        spin_lock(&foo)
which obviously deadlocks. In order to avoid this, never call __run_hrtimer()
from hrtimer_start*() context, but instead defer this to softirq context.
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
Impact: cleanup
No need for a smp function call, which is likely to run on the same
CPU anyway. We can just call hrtimers_peek_ahead() in the interrupts
disabled section of migrate_hrtimers().
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
Impact: cleanup
kernel/hrtimer.c: In function 'hrtimer_cpu_notify':
kernel/hrtimer.c:1574: warning: unused variable 'dcpu'
Introduced by commit 37810659ea7d9572c5ac284ade272f806ef8f788
("hrtimer: removing all ur callback modes, fix hotplug") from the
timers. dcpu is only used if CONFIG_HOTPLUG_CPU is set.
Reported-by: Stephen Rothwell
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
Impact: cleanup
Provide a peek ahead function that assumes irqs disabled, allows for micro
optimizations.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
01 Jan, 2009
1 commit
-
…l/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sparseirq: move __weak symbols into separate compilation unit
sparseirq: work around __weak alias bug
sparseirq: fix hang with !SPARSE_IRQ
sparseirq: set lock_class for legacy irq when sparse_irq is selected
sparseirq: work around compiler optimizing away __weak functions
sparseirq: fix desc->lock init
sparseirq: do not printk when migrating IRQ descriptors
sparseirq: remove duplicated arch_early_irq_init()
irq: simplify for_each_irq_desc() usage
proc: remove ifdef CONFIG_SPARSE_IRQ from stat.c
irq: for_each_irq_desc() move to irqnr.h
hrtimer: remove #include <linux/irq.h>
26 Dec, 2008
1 commit
-
Impact: cleanup
<linux/irq.h> can be removed and should be, because:
- hrtimer doesn't use any irq feature.
- it shouldn't be included from generic code.
Signed-off-by: KOSAKI Motohiro
Signed-off-by: Ingo Molnar
19 Dec, 2008
1 commit
-
this warning:
kernel/hrtimer.c: In function ‘hrtimer_cpu_notify’:
kernel/hrtimer.c:1574: warning: unused variable ‘dcpu’
is caused because 'dcpu' is only used in the CONFIG_HOTPLUG_CPU case.
Signed-off-by: Ingo Molnar
09 Dec, 2008
1 commit
-
> Ingo, this addition fixes the hotplug issue on my machine
And because we're all human...
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
04 Dec, 2008
1 commit
-
Impact: fix hrtimer locking (reported by lockdep) in the CPU hotplug case
This addition fixes the hotplug locking issue on my machine
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
25 Nov, 2008
1 commit
-
Impact: cleanup, move all hrtimer processing into hardirq context
This is an attempt at removing some of the hrtimer complexity by
reducing the number of callback modes to 1.
This means that all hrtimer callback functions will be run from HARD-irq
context.
I went through all the 30 odd hrtimer callback functions in the kernel
and saw only one that I'm not quite sure of, which is the one in
net/can/bcm.c - hence I'm CC-ing the folks responsible for that code.
Furthermore, the hrtimer core now calls callbacks directly with IRQs
disabled in case you try to enqueue an expired timer. If this timer is a
periodic timer (which should use hrtimer_forward() to advance its time)
then it might be possible to end up in an infinite recursive loop due to the
fact that hrtimer_forward() doesn't round up to the next timer
granularity, and therefore keeps on calling the callback - obviously
this needs a fix.
Aside from that, this seems to compile and actually boot on my dual core
test box - although I'm sure there are some bugs in, me not hitting any
makes me certain :-)
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
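The periodic-timer case above depends on pushing the expiry past the current time before the timer is re-armed. Below is a simplified user-space sketch of that forwarding step. The function name and integer nanosecond types are assumptions, and it deliberately ignores timer granularity (the very gap the message points out), so it is a model and not the kernel's hrtimer_forward().

```c
#include <stdint.h>

/*
 * Hypothetical model: advance *expires_ns by whole intervals until it
 * lies strictly after now_ns. Returns the number of intervals added
 * (the overrun count), or 0 if the timer has not expired yet.
 * interval_ns must be positive.
 */
static uint64_t toy_forward(int64_t *expires_ns, int64_t now_ns,
                            int64_t interval_ns)
{
    int64_t delta = now_ns - *expires_ns;
    uint64_t overruns;

    if (delta < 0)
        return 0;                 /* not expired, nothing to do */

    /* Skip all fully elapsed intervals, plus one to land in the future. */
    overruns = (uint64_t)(delta / interval_ns) + 1;
    *expires_ns += (int64_t)overruns * interval_ns;
    return overruns;
}
```

A callback that re-arms with an expiry that is still in the past would be invoked again at once; moving the expiry strictly beyond "now" is what breaks that loop in this model.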
12 Nov, 2008
1 commit
-
Impact: cleanup
git grep HRTIMER_CB_IRQSAFE revealed half the callback modes are actually
unused.
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
11 Nov, 2008
1 commit
-
Impact: fix incorrect locking triggered during hotplug-intense stress-tests
While migrating the CB_IRQSAFE_UNLOCKED timers during a cpu-offline,
we queue them on the cb_pending list, so that they won't go
stale.
Thus, when the callbacks of the timers run from the softirq context,
they could run into potential deadlocks, since these callbacks
assume that they're running with irq's disabled, thereby annoying
lockdep!
Fix this by emulating hardirq context while running these callbacks from
the hrtimer softirq.
=================================
[ INFO: inconsistent lock state ]
2.6.27 #2
--------------------------------
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
ksoftirqd/0/4 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&rq->lock){++..}, at: [] sched_rt_period_timer+0x9e/0x1fc
{in-hardirq-W} state was registered at:
[] __lock_acquire+0x549/0x121e
[] native_sched_clock+0x88/0x99
[] clocksource_get_next+0x39/0x3f
[] update_wall_time+0x616/0x7df
[] lock_acquire+0x5a/0x74
[] scheduler_tick+0x3a/0x18d
[] _spin_lock+0x1c/0x45
[] scheduler_tick+0x3a/0x18d
[] scheduler_tick+0x3a/0x18d
[] update_process_times+0x3a/0x44
[] tick_periodic+0x63/0x6d
[] tick_handle_periodic+0x14/0x5e
[] timer_interrupt+0x44/0x4a
[] handle_IRQ_event+0x13/0x3d
[] handle_level_irq+0x79/0xbd
[] do_IRQ+0x69/0x7d
[] common_interrupt+0x28/0x30
[] aac_probe_one+0x1a3/0x3f3
[] _spin_unlock_irqrestore+0x36/0x39
[] setup_irq+0x1be/0x1f9
[] start_kernel+0x259/0x2c5
[] 0xffffffff
irq event stamp: 50102
hardirqs last enabled at (50102): [] _spin_unlock_irq+0x20/0x23
hardirqs last disabled at (50101): [] _spin_lock_irq+0xa/0x4b
softirqs last enabled at (50088): [] do_softirq+0x37/0x4d
softirqs last disabled at (50099): [] do_softirq+0x37/0x4d
other info that might help us debug this:
no locks held by ksoftirqd/0/4.
stack backtrace:
Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.27 #2
[] print_usage_bug+0x13e/0x147
[] mark_lock+0x493/0x797
[] __lock_acquire+0x5be/0x121e
[] lock_acquire+0x5a/0x74
[] sched_rt_period_timer+0x9e/0x1fc
[] _spin_lock+0x1c/0x45
[] sched_rt_period_timer+0x9e/0x1fc
[] sched_rt_period_timer+0x9e/0x1fc
[] finish_task_switch+0x41/0xbd
[] native_sched_clock+0x88/0x99
[] sched_rt_period_timer+0x0/0x1fc
[] run_hrtimer_pending+0x54/0xe5
[] sched_rt_period_timer+0x0/0x1fc
[] __do_softirq+0x7b/0xef
[] do_softirq+0x37/0x4d
[] ksoftirqd+0x56/0xc5
[] ksoftirqd+0x0/0xc5
[] kthread+0x38/0x5d
[] kthread+0x0/0x5d
[] kernel_thread_helper+0x7/0x10
=======================
Signed-off-by: Gautham R Shenoy
Acked-by: Peter Zijlstra
Acked-by: "Paul E. McKenney"
Signed-off-by: Ingo Molnar
22 Oct, 2008
1 commit
-
Conflicts:
kernel/time/tick-sched.c
Signed-off-by: Thomas Gleixner
20 Oct, 2008
3 commits
-
Signed-off-by: Thomas Gleixner
-
hrtimer_start() and hrtimer_start_range_ns() handle relative and
absolute timers.
Signed-off-by: Thomas Gleixner
-
…tp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus
18 Oct, 2008
1 commit
-
Conflicts:
arch/x86/kvm/i8254.c
13 Oct, 2008
1 commit
-
and please hand me a brown paper bag
(thanks to Thomas for pointing out this very obvious bug)
Signed-off-by: Arjan van de Ven
12 Oct, 2008
1 commit
-
There's a small race/chance that, while hrtimers are enabled globally,
they're later not enabled when we're calling the hrtimer_interrupt() function,
which then BUG_ON()'s for that. This patch closes that race/gap.
Signed-off-by: Arjan van de Ven
29 Sep, 2008
4 commits
-
Impact: per CPU hrtimers can be migrated from a dead CPU
The hrtimer code has no knowledge about per CPU timers, but we need to
prevent the migration of such timers and warn when such a timer is
active at migration time.
Explicitly mark the timers as per CPU and use a more understandable
mode descriptor for the interrupt-safe unlocked callback mode, which
is used by hrtimer_sleeper and the scheduler code.
Signed-off-by: Thomas Gleixner
-
Impact: during migration active hrtimers can be seen as inactive
The migration code removes the hrtimers from the queues of the dead
CPU and sets the state temporarily to INACTIVE. The enqueue code sets it
to ACTIVE/PENDING again.
Prevent that the wrong state can be seen by using a separate migration
state bit.
Signed-off-by: Thomas Gleixner
-
Impact: Stale timers after a CPU went offline.
commit 37bb6cb4097e29ffee970065b74499cbf10603a3
hrtimer: unlock hrtimer_wakeup
changed the hrtimer sleeper callback mode to CB_IRQSAFE_NO_SOFTIRQ due
to locking problems. A result of this change is that when enqueue is
called for an already expired hrtimer the callback function is no
longer called directly from the enqueue code. The normal callers have
been fixed in the code, but the migration code which moves hrtimers
from a dead CPU to a live CPU was not made aware of this.
This can be fixed by checking the timer state after the call to
enqueue in the migration code.
Signed-off-by: Thomas Gleixner
-
Impact: hrtimers which are on the pending list are not migrated at cpu
offline and can be stale forever
Add the pending list migration when CONFIG_HIGH_RES_TIMERS is enabled
Signed-off-by: Thomas Gleixner
22 Sep, 2008
1 commit
-
Peter Zijlstra noticed this 8 months ago and I just noticed
it again.
hrtimer_clock_base::get_softirq_time() is currently unused
in the entire tree. In fact, looking at the logs, it appears
as if it was never used. Remove it.
Signed-off-by: Mark McLoughlin
Signed-off-by: Ingo Molnar
11 Sep, 2008
2 commits
-
As part of going idle, we already look at the time of the next timer event to determine
which C-state to select etc.
This patch adds functionality that causes the timers that are past their
soft expire time, to fire at this time, before we calculate the next wakeup
time. This functionality will thus avoid wakeups by running timers before
going idle rather than specially waking up for it.
Signed-off-by: Arjan van de Ven
-
This patch makes the nanosleep() system call use the per process
slack value; with this users are able to externally control existing
applications to reduce the wakeup rate.
Signed-off-by: Arjan van de Ven
08 Sep, 2008
1 commit
-
this patch adds a _range version of hrtimer_start() so that range timers
can be created; the hrtimer_start() function is just a wrapper around this.
In addition, hrtimer_start_expires() will now preserve existing ranges.
Signed-off-by: Arjan van de Ven
06 Sep, 2008
3 commits
-
this patch turns hrtimers into range timers; they have 2 expire points
1) the soft expire point
2) the hard expire point
the kernel will do its regular best effort attempt to get the timer run
at the hard expire point. However, if some other timer fires after the soft
expire point, the kernel now has the freedom to fire this timer at this point,
and thus grouping the events and preventing a power-expensive wakeup in the
future.
Signed-off-by: Arjan van de Ven
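The soft/hard expiry idea can be modelled in a few lines of user-space C. This is a sketch under assumptions: the struct, field and function names are hypothetical, timers live in a flat array instead of the kernel's per-base tree, and times are plain nanosecond integers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range timer: may fire from soft_ns on, should fire by hard_ns. */
struct toy_range_timer {
    int64_t soft_ns;
    int64_t hard_ns;
    bool fired;
};

/*
 * Called when the CPU is awake at time now_ns for any reason. Fires every
 * timer whose soft expiry has passed, so it rides along with this wakeup.
 * Returns how many fired and stores in *next_hard_ns the earliest hard
 * expiry still pending (INT64_MAX if none), i.e. the next wakeup needed.
 */
static size_t toy_run_expired(struct toy_range_timer *timers, size_t n,
                              int64_t now_ns, int64_t *next_hard_ns)
{
    size_t count = 0;

    *next_hard_ns = INT64_MAX;
    for (size_t i = 0; i < n; i++) {
        if (timers[i].fired)
            continue;
        if (timers[i].soft_ns <= now_ns) {
            timers[i].fired = true;
            count++;
        } else if (timers[i].hard_ns < *next_hard_ns) {
            *next_hard_ns = timers[i].hard_ns;
        }
    }
    return count;
}
```

With timers (soft 100, hard 100) and (soft 90, hard 200), a wakeup at time 100 fires both, so the separate wakeup at 200 is never needed.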
-
In order to be able to do range hrtimers we need to use accessor functions
to the "expire" member of the hrtimer struct.
This patch converts kernel/* to these accessors.
Signed-off-by: Arjan van de Ven
-
This patch adds a schedule_hrtimeout() function, to be used by select() and
poll() in a later patch. This function works similarly to schedule_timeout()
in most ways, but takes a timespec rather than jiffies.
With a lot of contributions/fixes from Thomas
Signed-off-by: Arjan van de Ven
Signed-off-by: Thomas Gleixner
21 Aug, 2008
1 commit
-
Add the comment to explain why the double lock in migrate_timers()
can't deadlock.
Change the code to use spin_lock_irq() instead of local_irq_disable()
+ spin_lock().
Signed-off-by: Oleg Nesterov
Acked-by: Steven Rostedt
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar