26 Mar, 2006

2 commits

  • This removes the support for pps. It's completely unused within the kernel
    and is basically in the way for further cleanups. It should be easier to
    re-add proper support for it after the rest has been converted to NTP4
    (where the pps mechanisms are quite different from NTP3 anyway).

    Signed-off-by: Roman Zippel
    Cc: Adrian Bunk
    Cc: john stultz
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Zippel
     
  • alarm() calls the kernel with an unsigned int timeout in seconds. The
    value is stored in the tv_sec field of a struct timeval to set up the
    itimer. The tv_sec field of struct timeval is of type long, which causes
    the tv_sec value to be negative on 32 bit machines if seconds > INT_MAX.

    Before the hrtimer merge (pre 2.6.16) such a negative value was converted
    to the maximum jiffies timeout by the timeval_to_jiffies conversion. It's
    not clear whether this was intended or just happened to be done by the
    timeval_to_jiffies code.

    hrtimers expect a timeval in canonical form and treat a negative timeout as
    already expired. This breaks the legitimate usage of alarm() with a
    timeout value > INT_MAX seconds.

    For 32 bit machines it is therefore necessary to limit the internal seconds
    value to avoid API breakage. Instead of doing this in all implementations
    of sys_alarm, the duplicated sys_alarm code is moved into a common function
    in itimer.c
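
    The sign flip can be sketched in userspace (a hedged illustration with
    made-up helper names, not the kernel code; int32_t models the tv_sec
    field of struct timeval on a 32 bit machine):

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: models tv_sec (a 32-bit long on 32-bit machines).
 * Storing an unsigned int > INT_MAX yields a negative value, which
 * hrtimers treat as already expired. */
static int32_t store_seconds(unsigned int seconds)
{
        return (int32_t)seconds;        /* wraps negative past INT_MAX */
}

/* The fix's idea: limit the seconds value before the conversion. */
static unsigned int clamp_seconds(unsigned int seconds)
{
        if (seconds > INT32_MAX)
                seconds = INT32_MAX;
        return seconds;
}
```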

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

24 Mar, 2006

2 commits

  • Make the softlockup detector purely timer-interrupt driven, removing
    softirq-context (timer) dependencies. This means that if the softlockup
    watchdog triggers, it has truly observed a longer than 10 seconds
    scheduling delay of a SCHED_FIFO prio 99 task.

    (the patch also turns off the softlockup detector during the initial bootup
    phase and does small style fixes)

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • With internal Xen-enabled kernels we see the kernel's static per-cpu data
    area exceed the limit of 32k on x86-64, and even native x86-64 kernels get
    fairly close to that limit. I generally question whether it is reasonable
    to have data structures several kb in size allocated as per-cpu data when
    the space there is rather limited.

    The biggest arch-independent consumer is tvec_bases (over 4k on 32-bit
    archs, over 8k on 64-bit ones), which now gets converted to use dynamically
    allocated memory instead.

    Signed-off-by: Jan Beulich
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     

17 Mar, 2006

1 commit

  • The pointer to the current time interpolator and the current list of time
    interpolators are typically only changed during bootup. Adding
    __read_mostly takes them away from possibly hot cachelines.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

07 Mar, 2006

2 commits

  • Add a compiler barrier so that we don't read jiffies before updating
    jiffies_64.
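
    A toy model of the issue (a sketch, not the kernel source: barrier() is
    written as a GCC-style memory clobber, and jiffies is modelled as the
    low word of jiffies_64):

```c
#include <assert.h>

/* Sketch: on 32-bit, jiffies aliases the low word of jiffies_64. Without
 * a compiler barrier, a later read of jiffies may be hoisted above the
 * 64-bit update; barrier() forbids such reordering at compile time. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static unsigned long long jiffies_64;
#define jiffies ((unsigned long)jiffies_64)

static unsigned long tick_and_read(void)
{
        jiffies_64++;
        barrier();      /* make sure jiffies is read after the update */
        return jiffies;
}
```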

    Signed-off-by: Atsushi Nemoto
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Atsushi Nemoto
     
  • Also from Thomas Gleixner

    Function next_timer_interrupt() got broken with a recent patch
    6ba1b91213e81aa92b5cf7539f7d2a94ff54947c as sys_nanosleep() was moved to
    hrtimer. This broke things, as next_timer_interrupt() did not check the
    hrtimer tree for the next event.

    Function next_timer_interrupt() is needed with dyntick (CONFIG_NO_IDLE_HZ,
    VST) implementations, as the system can be idle when the next hrtimer event
    is supposed to happen. At least ARM and S390 currently use
    next_timer_interrupt().

    Signed-off-by: Thomas Gleixner
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tony Lindgren
     

03 Mar, 2006

1 commit

  • On some platforms readq performs additional work to make sure I/O is done
    in a coherent way. This is not needed for time retrieval as done by the
    time interpolator. So we can use readq_relaxed instead which will improve
    performance.

    It affects sparc64 and ia64 only. Apparently it makes a significant
    difference on ia64.

    Signed-off-by: Christoph Lameter
    Cc: john stultz
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

18 Feb, 2006

1 commit

  • This provides an interface for arch code to find out how many
    nanoseconds are going to be added on to xtime by the next call to
    do_timer. The value returned is a fixed-point number in 52.12 format
    in nanoseconds. The reason for this format is that it gives the
    full precision that the timekeeping code is using internally.
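
    The 52.12 representation can be illustrated with a pair of helpers
    (hypothetical names; a sketch of the format, not the powerpc code):

```c
#include <assert.h>
#include <stdint.h>

/* 52.12 fixed point: the low 12 bits hold fractional nanoseconds, so one
 * unit in the fraction is 1/4096 ns. */
#define FP_SHIFT 12

static uint64_t ns_to_fp(uint64_t ns) { return ns << FP_SHIFT; }
static uint64_t fp_to_ns(uint64_t fp) { return fp >> FP_SHIFT; }
```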

    The motivation for this is to fix a problem that has arisen on 32-bit
    powerpc in that the value returned by do_gettimeofday drifts apart
    from xtime if NTP is being used. PowerPC is now using a lockless
    do_gettimeofday based on reading the timebase register and performing
    some simple arithmetic. (This method of getting the time is also
    exported to userspace via the VDSO.) However, the factor and offset
    it uses were calculated based on the nominal tick length and weren't
    being adjusted when NTP varied the tick length.

    Note that 64-bit powerpc has had the lockless do_gettimeofday for a
    long time now. It also had an extremely hairy routine that got called
    from the 32-bit compat routine for adjtimex, which adjusted the
    factor and offset according to what it thought the timekeeping code
    was going to do. Not only was this only called if a 32-bit task did
    adjtimex (i.e. not if a 64-bit task did adjtimex), it was also
    duplicating computations from kernel/timer.c and it wasn't clear that
    it was (still) correct.

    The simple solution is to ask the timekeeping code how long the
    current jiffy will be on each timer interrupt, after calling
    do_timer. If this jiffy will be a different length from the last one,
    we then need to compute new values for the factor and offset used in
    the lockless do_gettimeofday. In this way we can keep xtime and
    do_gettimeofday in sync, even when NTP is varying the tick length.

    Note that when adjtimex varies the tick length, it almost always
    introduces the variation from the next tick on. The only case I could
    see where adjtimex would vary the length of the current tick is when
    an old-style adjtime adjustment is being cancelled. (It's not clear
    to me why the adjustment has to be cancelled immediately rather than
    from the next tick on.) Thus I don't see any real need for a hook in
    adjtimex; the rare case of an old-style adjustment being cancelled can
    be fixed up at the next tick.

    Signed-off-by: Paul Mackerras
    Acked-by: john stultz
    Signed-off-by: Linus Torvalds

    Paul Mackerras
     

08 Feb, 2006

1 commit


11 Jan, 2006

2 commits


09 Jan, 2006

1 commit

  • This patch contains the following cleanups:
    - make needlessly global functions static
    - every file should include the headers containing the prototypes for
    its global functions

    Signed-off-by: Adrian Bunk
    Acked-by: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

31 Oct, 2005

5 commits

  • Define jiffies_64 in kernel/timer.c rather than having 24 duplicated
    defines in each architecture.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Remove timer_list.magic and associated debugging code.

    I originally added this when a spinlock was added to timer_list - this meant
    that an all-zeroes timer became illegal and init_timer() was required.

    That spinlock isn't even there any more, although timer.base must now be
    initialised.

    I'll keep this debugging code in -mm.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Fix bizarre 4-space coding style in the NTP code.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Create a macro shift_right() that avoids the numerous ugly conditionals in the
    NTP code that look like:

    if (a < 0)
            b = -(-a >> shift);
    else
            b = a >> shift;

    Replacing it with:

    b = shift_right(a, shift);

    This should have zero effect on the logic; however, it should probably get
    a bit of testing just to be sure.

    Also replace open-coded min/max with the macros.
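
    A minimal sketch of the helper (written here as a function for clarity;
    the kernel version is a macro):

```c
#include <assert.h>

/* Arithmetic shift that handles negative values explicitly, replacing the
 * open-coded conditionals quoted above. */
static long shift_right(long a, int shift)
{
        return a < 0 ? -(-a >> shift) : a >> shift;
}
```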

    Signed-off-by: John Stultz

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • Every user of init_timer() also needs to initialize ->function and ->data
    fields. This patch adds a simple setup_timer() helper for that.

    The schedule_timeout() is patched as an example of usage.
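
    The helper's shape can be sketched against a minimal model of the timer
    types (field names follow the 2.6-era struct timer_list; the stubbed
    init_timer() and demo_handler() are illustrative):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel types, for illustration only. */
struct timer_list {
        void (*function)(unsigned long);
        unsigned long data;
};

static void init_timer(struct timer_list *timer)
{
        /* the kernel's init_timer() sets up base/entry; a no-op here */
        (void)timer;
}

/* The new helper: one call instead of three assignments at every site. */
static void setup_timer(struct timer_list *timer,
                        void (*function)(unsigned long),
                        unsigned long data)
{
        timer->function = function;
        timer->data = data;
        init_timer(timer);
}

static void demo_handler(unsigned long data)
{
        (void)data;
}
```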

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

30 Oct, 2005

1 commit


13 Sep, 2005

1 commit


11 Sep, 2005

2 commits


08 Sep, 2005

2 commits

  • Christoph Lameter

    When using a time interpolator that is susceptible to jitter there's
    potentially contention over a cmpxchg used to prevent time from going
    backwards. This is unnecessary when the caller holds the xtime write
    seqlock as all readers will be blocked from returning until the write is
    complete. We can therefore allow writers to insert a new value and exit
    rather than fight with CPUs who only hold a reader lock.

    Signed-off-by: Alex Williamson
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Williamson
     
  • This patch adds a new kernel debug feature: CONFIG_DETECT_SOFTLOCKUP.

    When enabled then per-CPU watchdog threads are started, which try to run
    once per second. If they get delayed for more than 10 seconds then a
    callback from the timer interrupt detects this condition and prints out a
    warning message and a stack dump (once per lockup incident). The feature
    is otherwise non-intrusive: it doesn't try to unlock the box in any way, it
    only gets the debug info out, automatically, and on all CPUs affected by
    the lockup.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Nishanth Aravamudan
    Signed-Off-By: Matthias Urlichs
    Signed-off-by: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

24 Aug, 2005

1 commit

  • With CONFIG_PREEMPT && !CONFIG_SMP, it's possible for sys_getppid to
    return a bogus value if the parent's task_struct gets reallocated after
    current->group_leader->real_parent is read:

    asmlinkage long sys_getppid(void)
    {
            int pid;
            struct task_struct *me = current;
            struct task_struct *parent;

            parent = me->group_leader->real_parent;
    RACE HERE => for (;;) {
                    pid = parent->tgid;
    #ifdef CONFIG_SMP
                    {
                            struct task_struct *old = parent;

                            /*
                             * Make sure we read the pid before re-reading the
                             * parent pointer:
                             */
                            smp_rmb();
                            parent = me->group_leader->real_parent;
                            if (old != parent)
                                    continue;
                    }
    #endif
                    break;
            }
            return pid;
    }

    If the process gets preempted at the indicated point, the parent process
    can go ahead and call exit() and then get wait()'d on to reap its
    task_struct. When the preempted process gets resumed, it will not do any
    further checks of the parent pointer on !CONFIG_SMP: it will read the
    bad pid and return.

    So, the same algorithm used when SMP is enabled should be used when
    preempt is enabled, which will recheck ->real_parent in this case.

    Signed-off-by: David Meybohm
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Meybohm
     

26 Jun, 2005

1 commit


24 Jun, 2005

3 commits

  • In kernel/sched.c the return value from preempt_count() is cast to an int.
    That made sense when preempt_count was defined as different types on
    different architectures, but it no longer is, so the cast is not needed
    and should go away. The patch removes the cast.

    In kernel/timer.c the return value from preempt_count() is assigned to a
    variable of type u32 and then that unsigned value is later compared to
    preempt_count(). Since preempt_count() returns an int, an int is what
    should be used to store its return value. Storing the result in an
    unsigned 32bit integer made a tiny bit of sense back when preempt_count was
    different types on different archs, but no more - let's not play signed vs
    unsigned comparison games when we don't have to. The patch modifies the
    code to use an int to hold the value. While I was around that bit of code
    I also made two changes to a nearby (related) printk() - I modified it to
    specify the loglevel explicitly and also broke the line into a few pieces
    to avoid it being longer than 80 chars and clarified the text a bit.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
     
  • This patch splits del_timer_sync() into 2 functions. The new one,
    try_to_del_timer_sync(), returns -1 when it hits an executing timer.

    It can be used in interrupt context, or when the caller holds locks which
    can prevent completion of the timer's handler.

    NOTE. Currently it can't be used in interrupt context in UP case, because
    ->running_timer is used only with CONFIG_SMP.

    Should the need arise, it is possible to kill #ifdef CONFIG_SMP in
    set_running_timer(), it is cheap.
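
    The split can be modelled in userspace (a toy: pending/running are plain
    flags here, and the busy loop stands in for the real locking):

```c
#include <assert.h>

/* Toy model of the two functions; the kernel versions take the base lock. */
struct timer_list { int pending; int running; };

static int try_to_del_timer_sync(struct timer_list *t)
{
        if (t->running)
                return -1;      /* handler running: caller must not wait */
        int was_pending = t->pending;
        t->pending = 0;
        return was_pending;
}

static int del_timer_sync(struct timer_list *t)
{
        int ret;

        while ((ret = try_to_del_timer_sync(t)) < 0)
                ;               /* in the kernel: cpu_relax() and retry */
        return ret;
}
```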

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • This patch tries to solve following problems:

    1. del_timer_sync() is racy. The timer can be fired again after
    del_timer_sync has checked all cpus and before it will recheck
    timer_pending().

    2. It has scalability problems. All cpus are scanned to determine
    if the timer is running on that cpu.

    With this patch del_timer_sync is O(1) and no slower than plain
    del_timer(pending_timer), unless it has to actually wait for
    completion of the currently running timer.

    The only restriction is that the recurring timer should not use
    add_timer_on().

    3. The timers are not serialized wrt themselves.

    If CPU_0 does mod_timer(jiffies+1) while the timer is currently
    running on CPU_1, it is quite possible that the local interrupt on
    CPU_0 will start that timer before it has finished on CPU_1.

    4. The timers locking is suboptimal. __mod_timer() takes 3 locks
    at once and still requires wmb() in del_timer/run_timers.

    The new implementation takes 2 locks sequentially and does not
    need memory barriers.

    Currently ->base != NULL means that the timer is pending. In that case
    ->base.lock is used to lock the timer. __mod_timer also takes timer->lock
    because ->base can be == NULL.

    This patch uses timer->entry.next != NULL as indication that the timer is
    pending. So it does __list_del(), entry->next = NULL instead of list_del()
    when the timer is deleted.

    The ->base field is used for hashed locking only, it is initialized
    in init_timer() which sets ->base = per_cpu(tvec_bases). When the
    tvec_bases.lock is locked, it means that all timers which are tied
    to this base via timer->base are locked, and the base itself is locked
    too.

    So __run_timers/migrate_timers can safely modify all timers which could
    be found on ->tvX lists (pending timers).

    When the timer's base is locked, and the timer removed from ->entry list
    (which means that __run_timers/migrate_timers can't see this timer), it is
    possible to set timer->base = NULL and drop the lock: the timer remains
    locked.

    This patch adds lock_timer_base() helper, which waits for ->base != NULL,
    locks the ->base, and checks it is still the same.
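
    The helper's shape, as a userspace toy (the lock is a plain flag here;
    the kernel takes ->base->lock with spin_lock_irqsave()):

```c
#include <assert.h>
#include <stddef.h>

struct tvec_base { int locked; };
struct timer_list { struct tvec_base *base; };

/* Wait for ->base to be non-NULL, lock it, then recheck that the timer
 * was not migrated to another base in between. */
static struct tvec_base *lock_timer_base(struct timer_list *timer)
{
        for (;;) {
                struct tvec_base *base = timer->base;

                if (base != NULL) {
                        base->locked = 1;       /* spin_lock_irqsave() */
                        if (base == timer->base)
                                return base;    /* stable: timer locked */
                        base->locked = 0;       /* raced; retry */
                }
        }
}
```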

    __mod_timer() schedules the timer on the local CPU and changes its base.
    However, it does not lock both old and new bases at once. It locks the
    timer via lock_timer_base(), deletes the timer, sets ->base = NULL, and
    unlocks old base. Then __mod_timer() locks new_base, sets ->base = new_base,
    and adds this timer. This simplifies the code, because AB-BA deadlock is not
    possible. __mod_timer() also ensures that the timer's base is not changed
    while the timer's handler is running on the old base.

    __run_timers(), del_timer() do not change ->base anymore, they only clear
    pending flag.

    So del_timer_sync() can test timer->base->running_timer == timer to detect
    whether it is running or not.

    We don't need timer_list->lock anymore, this patch kills it.

    We also don't need barriers. del_timer() and __run_timers() used smp_wmb()
    before clearing timer's pending flag. It was needed because __mod_timer()
    did not lock old_base if the timer is not pending, so __mod_timer()->list_add()
    could race with del_timer()->list_del(). With this patch these functions are
    serialized through base->lock.

    One problem: TIMER_INITIALIZER can't use per_cpu(tvec_bases). So this patch
    adds a global

    struct timer_base_s {
            spinlock_t lock;
            struct timer_list *running_timer;
    } __init_timer_base;

    which is used by TIMER_INITIALIZER. The corresponding fields in tvec_t_base_s
    struct are replaced by struct timer_base_s t_base.

    It is indeed ugly. But this can't have scalability problems. The global
    __init_timer_base.lock is used only when __mod_timer() is called for the first
    time AND the timer was compile time initialized. After that the timer migrates
    to the local CPU.

    Signed-off-by: Oleg Nesterov
    Acked-by: Ingo Molnar
    Signed-off-by: Renaud Lienhart
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

01 May, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds