30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming their availability. As
    this conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following:

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    only gfp.h is included; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to place the new include so that its order conforms
    to its surroundings. It is put in the include block which contains
    core kernel includes, in the same order as the rest - alphabetical,
    Christmas tree, reverse Christmas tree - or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, while for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed,
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs, requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from the build
    tests in step 7, I'm fairly confident about the coverage of this
    conversion patch. If there is a breakage, it's likely to be something
    in one of the arch headers, which should be easily discoverable on
    most builds of the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

23 Mar, 2010

1 commit

  • The logarithmic accumulation done in the timekeeping has some overflow
    protection that limits the max shift value. That means it will take
    more than shift iterations of the loop to accumulate all of the
    cycles. This causes the shift decrement to underflow, which causes
    the loop to never exit.

    The simplest fix would be simply to do:

        if (shift)
                shift--;

    However that is not optimal. Since we know the cycle offset is larger
    than interval << shift, the above would make shift drop to zero, and
    then we would be spinning for quite a while, accumulating one
    interval-sized chunk at a time.

    Instead, this patch only decreases shift if the offset is smaller
    than cycle_interval << shift. This makes sure we accumulate using
    the largest chunks possible without overflowing tick_length, and limits
    the number of iterations through the loop.
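    The guarded decrement can be illustrated with a small standalone model of the accumulation loop. This is a sketch of the logic described above, not the kernel code; the names cycle_interval, offset and shift follow the commit text.

```c
#include <stdint.h>
#include <assert.h>

/* Model of the logarithmic accumulation loop: consume `offset` in chunks of
 * cycle_interval << shift, and only drop to a smaller chunk size once the
 * remaining offset no longer covers the current one.  An unconditional
 * `shift--` could underflow when the offset needs more than `shift`
 * iterations; a bare `if (shift) shift--` would fall to zero early and then
 * spin at single-interval granularity. */
static uint64_t accumulate(uint64_t offset, uint64_t cycle_interval,
                           unsigned int shift, unsigned int *iterations)
{
    *iterations = 0;
    while (offset >= cycle_interval) {
        uint64_t chunk = cycle_interval << shift;

        if (offset >= chunk)
            offset -= chunk;        /* accumulate one large chunk */
        if (offset < chunk && shift > 0)
            shift--;                /* halve the chunk size only when needed */
        (*iterations)++;
    }
    return offset;
}
```

    Because the chunk size only shrinks when the remaining offset demands it, the iteration count stays bounded instead of degenerating into a long spin.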

    This issue was found and reported by Sonic Zhang, who also tested the fix.
    Many thanks for the explanation and testing!

    Reported-by: Sonic Zhang
    Signed-off-by: John Stultz
    Tested-by: Sonic Zhang
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
     

13 Mar, 2010

1 commit

  • The current logic which handles clock events programming failures can
    increase min_delta_ns without limit and can even cause overflows.

    Sanitize it by:
    - prevent zero increase when min_delta_ns == 1
    - limiting min_delta_ns to a jiffie
    - bail out if the jiffie limit is hit
    - add retries stats for /proc/timer_list so we can gather data
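    A standalone sketch of the sanitized adjustment described above; the 50% growth factor and the function name are illustrative assumptions, not the exact kernel code.

```c
#include <stdint.h>
#include <assert.h>

/* Grow min_delta_ns on a programming failure, but (a) never by zero when
 * the value is 1 (an integer 50% bump of 1 truncates to 0), and (b) never
 * beyond one jiffy, at which point the caller should bail out instead of
 * retrying forever. */
static int64_t increase_min_delta(int64_t min_delta_ns, int64_t nsec_per_jiffy,
                                  int *hit_limit)
{
    *hit_limit = 0;

    if (min_delta_ns <= 1)
        min_delta_ns = 2;                    /* avoid a zero increase */
    else
        min_delta_ns += min_delta_ns >> 1;   /* grow by 50% (illustrative) */

    if (min_delta_ns >= nsec_per_jiffy) {    /* clamp to one jiffy */
        min_delta_ns = nsec_per_jiffy;
        *hit_limit = 1;                      /* caller gives up */
    }
    return min_delta_ns;
}
```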

    Reported-by: Uwe Kleine-Koenig
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

02 Mar, 2010

2 commits

  • Aaro Koskinen reported an issue in kernel.org bugzilla #15366, where
    on non-GENERIC_TIME systems, accessing
    /sys/devices/system/clocksource/clocksource0/current_clocksource
    results in an oops.

    It seems the timekeeper/clocksource rework missed initializing the
    curr_clocksource value in the !GENERIC_TIME case.

    Thanks to Aaro for reporting and diagnosing the issue as well as
    testing the fix!

    Reported-by: Aaro Koskinen
    Signed-off-by: John Stultz
    Cc: Martin Schwidefsky
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    john stultz
     
  • * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    posix-timers.c: Don't export local functions
    clocksource: start CMT at clocksource resume
    clocksource: add suspend callback
    clocksource: add argument to resume callback
    ntp: Cleanup xtime references in ntp.c
    ntp: Make time_esterror and time_maxerror static

    Linus Torvalds
     

10 Feb, 2010

1 commit


05 Feb, 2010

2 commits

  • Add a clocksource suspend callback. This callback can be used by the
    clocksource driver to shutdown and perform any kind of late suspend
    activities even though the clocksource driver itself is a non-sysdev
    driver.

    One example where this is useful is to fix the sh_cmt.c platform driver
    that today suspends using the platform bus and shuts down the clocksource
    too early.

    With this callback in place the sh_cmt driver will suspend using the
    clocksource and clockevent hooks and leave the platform device pm
    callbacks unused.
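    The shape of such an optional hook can be sketched as follows. The field and function names mirror the commit text, but this is an illustrative model, not the kernel structure.

```c
#include <assert.h>

/* A clocksource with an optional late-suspend hook.  Drivers that need no
 * special handling simply leave the callback NULL. */
struct clocksource {
    const char *name;
    void (*suspend)(void);   /* new optional callback */
    void (*resume)(void);
};

static int cmt_suspended;
static void cmt_suspend(void) { cmt_suspended = 1; }

static void clocksource_suspend(struct clocksource *cs)
{
    /* Only invoke the hook if the driver provides one. */
    if (cs->suspend)
        cs->suspend();
}
```

    A driver like sh_cmt would then fill in the suspend field and drop its platform-device pm callbacks.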

    Signed-off-by: Magnus Damm
    Cc: Paul Mundt
    Cc: john stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Magnus Damm
     
  • Pass the clocksource as an argument to the clocksource resume callback.
    Needed so we can point out which CMT channel the sh_cmt.c driver shall
    resume.

    Signed-off-by: Magnus Damm
    Cc: john stultz
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Magnus Damm
     

29 Jan, 2010

2 commits


26 Jan, 2010

1 commit

  • commit 0f8e8ef7 (clocksource: Simplify clocksource watchdog resume
    logic) introduced a potential kgdb deadlock. When the kernel is
    stopped by kgdb inside code which holds watchdog_lock, kgdb
    deadlocks in clocksource_resume_watchdog().

    clocksource_resume_watchdog() is called from kgdb via
    clocksource_touch_watchdog() to avoid that the clock source watchdog
    marks the TSC unstable after the kernel has been stopped.

    Solve this by replacing spin_lock with a spin_trylock and just return
    in case the lock is held. Not resetting the watchdog might result in
    TSC becoming marked unstable, but that's an acceptable penalty for
    using kgdb.

    The timekeeping is anyway easily screwed up by kgdb when the system
    uses either jiffies or a clock source which wraps in short intervals
    (e.g. pm_timer wraps about every 4.6s), so we really do not have to
    worry about that occasional TSC marked unstable side effect.

    The second caller of clocksource_resume_watchdog() is
    clocksource_resume(). The trylock is safe here as well because the
    system is UP at this point, interrupts are disabled and nothing else
    can hold watchdog_lock().
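    The bail-out pattern can be modeled with a simple non-blocking lock. This is an illustrative single-threaded sketch; the kernel uses spin_trylock on watchdog_lock.

```c
#include <assert.h>

static int watchdog_lock;     /* 0 = free, 1 = held */
static int watchdog_resets;

static int try_lock(int *lock)
{
    if (*lock)
        return 0;             /* already held: fail instead of spinning */
    *lock = 1;
    return 1;
}

/* Sketch of the fix: if the lock is held (e.g. the kernel was stopped by
 * kgdb inside the critical section), skip the watchdog reset and return
 * rather than deadlocking on the lock. */
static int resume_watchdog(void)
{
    if (!try_lock(&watchdog_lock))
        return 0;             /* acceptable: TSC may get marked unstable */
    watchdog_resets++;
    watchdog_lock = 0;
    return 1;
}
```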

    Reported-by: Jason Wessel
    LKML-Reference:
    Cc: kgdb-bugreport@lists.sourceforge.net
    Cc: Martin Schwidefsky
    Cc: John Stultz
    Cc: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

18 Jan, 2010

1 commit

  • Marc reported that the BUG_ON in clockevents_notify() triggers on his
    system. This happens because the kernel tries to remove an active
    clock event device (used for broadcasting) from the device list.

    The handling of devices which can be used as per cpu device and as a
    global broadcast device is suboptimal.

    The simplest solution for now (and for stable) is to check whether the
    device is used as global broadcast device, but this needs to be
    revisited.

    [ tglx: restored the cpuweight check and massaged the changelog ]

    Reported-by: Marc Dionne
    Tested-by: Marc Dionne
    Signed-off-by: Xiaotian Feng
    LKML-Reference:
    Signed-off-by: Thomas Gleixner
    Cc: stable@kernel.org

    Xiaotian Feng
     

23 Dec, 2009

1 commit

  • This reverts commit 7bc7d637452383d56ba4368d4336b0dde1bb476d, as
    requested by John Stultz. Quoting John:

    "Petr Titěra reported an issue where he saw odd atime regressions with
    2.6.33 where there were a full second's worth of nanoseconds in the
    nanoseconds field.

    He also reviewed the time code and narrowed down the problem: unhandled
    overflow of the nanosecond field caused by rounding up the
    sub-nanosecond accumulated time.

    Details:

    * At the end of update_wall_time(), we currently round up the
    sub-nanosecond portion of accumulated time when storing it into xtime.
    This was added to avoid time inconsistencies caused when the
    sub-nanosecond portion was truncated when storing into xtime.
    Unfortunately we don't handle the possible second overflow caused by
    that rounding.

    * Previously the xtime_cache code hid this overflow by normalizing the
    xtime value when storing into the xtime_cache.

    * We could try to handle the second overflow after the rounding up, but
    since this affects the timekeeping's internal state, this would further
    complicate the next accumulation cycle, causing small errors in ntp
    steering. As much as I'd like to get rid of it, the xtime_cache code is
    known to work.

    * The correct fix is really to include the sub-nanosecond portion in the
    timekeeping accessor function, so we don't need to round up during
    accumulation. This would greatly simplify the accumulation code.
    Unfortunately, we can't do this safely until the last three
    non-GENERIC_TIME arches (sparc32, arm, cris) are converted (those
    patches are in -mm) and we kill off the spots where arches set xtime
    directly. This is all 2.6.34 material, so I think reverting the
    xtime_cache change is the best approach for now.

    Many thanks to Petr for both reporting and finding the issue!"
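    The overflow John describes is easy to model: rounding the sub-nanosecond remainder up can leave a full second's worth of nanoseconds in the field unless the value is normalized afterwards. An illustrative sketch, not the kernel's timespec handling:

```c
#include <assert.h>

#define NSEC_PER_SEC 1000000000L

struct ts { long sec; long nsec; };

/* Carry any whole seconds out of the nanosecond field.  The reverted patch
 * removed the xtime_cache path that effectively performed this step, so a
 * rounded-up value could stay at >= NSEC_PER_SEC nanoseconds. */
static void normalize(struct ts *t)
{
    while (t->nsec >= NSEC_PER_SEC) {
        t->nsec -= NSEC_PER_SEC;
        t->sec++;
    }
}
```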

    Reported-by: Petr Titěra
    Requested-by: john stultz
    Cc: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

20 Dec, 2009

1 commit


17 Dec, 2009

1 commit


16 Dec, 2009

1 commit


15 Dec, 2009

4 commits

  • Convert locks which cannot be sleeping locks in preempt-rt to
    raw_spinlocks.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Acked-by: Ingo Molnar

    Thomas Gleixner
     
  • Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Acked-by: Ingo Molnar

    Thomas Gleixner
     
  • Convert locks which cannot be sleeping locks in preempt-rt to
    raw_spinlocks.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Acked-by: Ingo Molnar

    Thomas Gleixner
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
    m68k: rename global variable vmalloc_end to m68k_vmalloc_end
    percpu: add missing per_cpu_ptr_to_phys() definition for UP
    percpu: Fix kdump failure if booted with percpu_alloc=page
    percpu: make misc percpu symbols unique
    percpu: make percpu symbols in ia64 unique
    percpu: make percpu symbols in powerpc unique
    percpu: make percpu symbols in x86 unique
    percpu: make percpu symbols in xen unique
    percpu: make percpu symbols in cpufreq unique
    percpu: make percpu symbols in oprofile unique
    percpu: make percpu symbols in tracer unique
    percpu: make percpu symbols under kernel/ and mm/ unique
    percpu: remove some sparse warnings
    percpu: make alloc_percpu() handle array types
    vmalloc: fix use of non-existent percpu variable in put_cpu_var()
    this_cpu: Use this_cpu_xx in trace_functions_graph.c
    this_cpu: Use this_cpu_xx for ftrace
    this_cpu: Use this_cpu_xx in nmi handling
    this_cpu: Use this_cpu operations in RCU
    this_cpu: Use this_cpu ops for VM statistics
    ...

    Fix up trivial (famous last words) global per-cpu naming conflicts in
    arch/x86/kvm/svm.c
    mm/slab.c

    Linus Torvalds
     

11 Dec, 2009

1 commit

  • Xiaotian Feng triggered a list corruption in the clock events list on
    CPU hotplug and debugged the root cause.

    If a CPU registers more than one per cpu clock event device, then only
    the active clock event device is removed on CPU_DEAD. The unused
    devices are kept in the clock events device list.

    On CPU up the clock event devices are registered again, which means
    that we list_add an already enqueued list_head. That results in list
    corruption.

    Resolve this by removing all devices which are associated to the dead
    CPU on CPU_DEAD.

    Reported-by: Xiaotian Feng
    Signed-off-by: Thomas Gleixner
    Tested-by: Xiaotian Feng
    Cc: stable@kernel.org

    Thomas Gleixner
     

10 Dec, 2009

2 commits

  • The hrtimer_interrupt hang logic adjusts min_delta_ns based on the
    execution time of the hrtimer callbacks.

    This is error-prone for virtual machines, where a guest vcpu can be
    scheduled out during the execution of the callbacks (and the callbacks
    themselves can do operations that translate to blocking operations in
    the hypervisor), which can lead to a large min_delta_ns rendering the
    system unusable.

    Replace the current heuristics with something more reliable. Allow the
    interrupt code to try 3 times to catch up with the lost time. If that
    fails use the total time spent in the interrupt handler to defer the
    next timer interrupt so the system can catch up with other things
    which got delayed. Limit that deferment to 100ms.

    The retry events and the maximum time spent in the interrupt handler
    are recorded and exposed via /proc/timer_list.
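    The policy reads roughly like this in a standalone sketch; the constants come from the commit text, but the names are illustrative and the real hrtimer_interrupt code differs in detail.

```c
#include <stdint.h>
#include <assert.h>

#define MAX_RETRIES   3
#define MAX_DEFER_NS  100000000LL   /* 100ms cap on the deferment */

/* Decide how long to defer the next timer interrupt: retry immediately up
 * to MAX_RETRIES times to catch up with the lost time, then defer by the
 * time spent in the handler, limited to 100ms, so the rest of the system
 * can catch up with other delayed work. */
static int64_t next_event_delay(int retries, int64_t handler_time_ns)
{
    if (retries < MAX_RETRIES)
        return 0;                   /* try again right away */
    if (handler_time_ns > MAX_DEFER_NS)
        handler_time_ns = MAX_DEFER_NS;
    return handler_time_ns;
}
```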

    Inspired by a patch from Marcelo.

    Reported-by: Michael Tokarev
    Signed-off-by: Thomas Gleixner
    Tested-by: Marcelo Tosatti
    Cc: kvm@vger.kernel.org

    Thomas Gleixner
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
    tree-wide: fix misspelling of "definition" in comments
    reiserfs: fix misspelling of "journaled"
    doc: Fix a typo in slub.txt.
    inotify: remove superfluous return code check
    hdlc: spelling fix in find_pvc() comment
    doc: fix regulator docs cut-and-pasteism
    mtd: Fix comment in Kconfig
    doc: Fix IRQ chip docs
    tree-wide: fix assorted typos all over the place
    drivers/ata/libata-sff.c: comment spelling fixes
    fix typos/grammos in Documentation/edac.txt
    sysctl: add missing comments
    fs/debugfs/inode.c: fix comment typos
    sgivwfb: Make use of ARRAY_SIZE.
    sky2: fix sky2_link_down copy/paste comment error
    tree-wide: fix typos "couter" -> "counter"
    tree-wide: fix typos "offest" -> "offset"
    fix kerneldoc for set_irq_msi()
    spidev: fix double "of of" in comment
    comment typo fix: sybsystem -> subsystem
    ...

    Linus Torvalds
     

09 Dec, 2009

2 commits

  • …nel/git/tip/linux-2.6-tip

    * 'timers-for-linus-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    hrtimer: Fix /proc/timer_list regression
    itimers: Fix racy writes to cpu_itimer fields
    timekeeping: Fix clock_gettime vsyscall time warp

    Linus Torvalds
     
  • * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    timers, init: Limit the number of per cpu calibration bootup messages
    posix-cpu-timers: optimize and document timer_create callback
    clockevents: Add missing include to pacify sparse
    x86: vmiclock: Fix printk format
    x86: Fix printk format due to variable type change
    sparc: fix printk for change of variable type
    clocksource/events: Fix fallout of generic code changes
    nohz: Allow 32-bit machines to sleep for more than 2.15 seconds
    nohz: Track last do_timer() cpu
    nohz: Prevent clocksource wrapping during idle
    nohz: Type cast printk argument
    mips: Use generic mult/shift factor calculation for clocks
    clocksource: Provide a generic mult/shift factor calculation
    clockevents: Use u32 for mult and shift factors
    nohz: Introduce arch_needs_cpu
    nohz: Reuse ktime in sub-functions of tick_check_idle.
    time: Remove xtime_cache
    time: Implement logarithmic time accumulation

    Linus Torvalds
     

19 Nov, 2009

1 commit


18 Nov, 2009

1 commit

  • Include "tick-internal.h" in order to pick up the extern function
    prototype for clockevents_shutdown(). This quiets the following sparse
    build noise:

    warning: symbol 'clockevents_shutdown' was not declared. Should it be static?

    Signed-off-by: H Hartley Sweeten
    LKML-Reference:
    Reviewed-by: Yong Zhang
    Cc: johnstul@us.ibm.com
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    H Hartley Sweeten
     

17 Nov, 2009

1 commit

  • Since commit 0a544198 "timekeeping: Move NTP adjusted clock multiplier
    to struct timekeeper" the clock multiplier of vsyscall is updated with
    the unmodified clock multiplier of the clock source and not with the
    NTP adjusted multiplier of the timekeeper.

    This causes userspace-observable time warps:
    new CLOCK-warp maximum: 120 nsecs, 00000025c337c537 -> 00000025c337c4bf

    Add a new argument "mult" to update_vsyscall() and hand in the
    timekeeping internal NTP adjusted multiplier.

    Signed-off-by: Lin Ming
    Cc: "Zhang Yanmin"
    Cc: Martin Schwidefsky
    Cc: Benjamin Herrenschmidt
    Cc: Tony Luck
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Lin Ming
     

14 Nov, 2009

7 commits

  • powerpc grew a new warning due to the type change of clockevent->mult.

    The architectures which use parts of the generic time keeping
    infrastructure tripped over my wrong assumption that
    clocksource_register is only used when GENERIC_TIME=y.

    I should have looked and also I should have known better. These
    renitent Gaul villages are racking my nerves. Some serious deprecating
    is due.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • In the dynamic tick code, "max_delta_ns" (member of the
    "clock_event_device" structure) represents the maximum sleep time
    that can occur between timer events in nanoseconds.

    The variable, "max_delta_ns", is defined as an unsigned long
    which is a 32-bit integer for 32-bit machines and a 64-bit
    integer for 64-bit machines (if -m64 option is used for gcc).
    The value of max_delta_ns is set by calling the function
    "clockevent_delta2ns()" which returns a maximum value of LONG_MAX.
    For a 32-bit machine LONG_MAX is equal to 0x7fffffff and in
    nanoseconds this equates to ~2.15 seconds. Hence, the maximum
    sleep time for a 32-bit machine is ~2.15 seconds, whereas for
    a 64-bit machine it will be many years.

    This patch changes the type of max_delta_ns to be "u64" instead of
    "unsigned long" so that this variable is a 64-bit type for both 32-bit
    and 64-bit machines. It also changes the maximum value returned by
    clockevent_delta2ns() to KTIME_MAX. Hence this allows a 32-bit
    machine to sleep for longer than ~2.15 seconds. Please note that this
    patch also changes "min_delta_ns" to be "u64" too and although this is
    unnecessary, it makes the patch simpler as it avoids having to fix up
    all callers of clockevent_delta2ns().

    [ tglx: changed "unsigned long long" to u64 as we use this data type
    through out the time code ]
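    The effect of the clamp change can be sketched with a simplified model of clockevent_delta2ns; the overflow handling in the real function is more careful, and the conversion here is illustrative.

```c
#include <stdint.h>
#include <assert.h>

#define KTIME_MAX 9223372036854775807LL   /* s64 max, ~292 years in ns */

/* Convert a device-tick delta to nanoseconds: ns = (latch << shift) / mult.
 * Clamping the result to KTIME_MAX instead of LONG_MAX means a 32-bit
 * machine is no longer limited to a ~2.15s (0x7fffffff ns) maximum sleep. */
static int64_t delta2ns(uint64_t latch, uint32_t mult, uint32_t shift)
{
    uint64_t clc = latch << shift;        /* simplified: ignores shift overflow */
    clc /= mult;
    if (clc > (uint64_t)KTIME_MAX)
        clc = KTIME_MAX;
    return (int64_t)clc;
}
```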

    Signed-off-by: Jon Hunter
    Cc: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Jon Hunter
     
  • The previous patch which limits the sleep time to the maximum
    deferment time of the time keeping clocksource has some limitations on
    SMP machines: if all CPUs are idle then for all CPUs the maximum sleep
    time is limited.

    Solve this by keeping track of which cpu had the do_timer() duty
    assigned last and limiting the sleep time only for this cpu.

    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Cc: Jon Hunter
    Cc: John Stultz

    Thomas Gleixner
     
  • The dynamic tick allows the kernel to sleep for periods longer than a
    single tick, but it does not limit the sleep time currently. In the
    worst case the kernel could sleep longer than the wrap around time of
    the time keeping clock source which would result in losing track of
    time.

    Prevent this by limiting it to the safe maximum sleep time of the
    current time keeping clock source. The value is calculated when the
    clock source is registered.

    [ tglx: simplified the code a bit and massaged the commit msg ]

    Signed-off-by: Jon Hunter
    Cc: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Jon Hunter
     
  • On some archs local_softirq_pending() has a data type of unsigned long,
    on others it's unsigned int. Type cast it to (unsigned int) in the
    printk to avoid the compiler warning.

    Signed-off-by: Thomas Gleixner
    LKML-Reference:

    Thomas Gleixner
     
  • MIPS has two functions to calculate the mult/shift factors for clock
    sources and clock events at run time. ARM needs such functions as
    well.

    Implement a function which calculates the mult/shift factors based on
    the frequencies to which and from which is converted. The function
    also has a parameter to specify the minimum conversion range in
    seconds. This range is guaranteed not to produce a 64-bit overflow when
    a value is multiplied with the calculated mult factor. The larger the
    conversion range, the lower the conversion accuracy.

    Provide two inline wrappers which handle clock events and clock
    sources. For clock events the "from" frequency is nano seconds per
    second which corresponds to 1GHz and "to" is the device frequency. For
    clock sources "from" is the device frequency and "to" is nano seconds
    per second.
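    A self-contained version of such a calculation might look like this. The kernel helper introduced by this commit is clocks_calc_mult_shift; the sketch below follows the description above but is simplified.

```c
#include <stdint.h>
#include <assert.h>

/* Compute mult/shift so that (cycles * mult) >> shift converts from `from`
 * Hz to `to` Hz.  `maxsec` is the guaranteed conversion range in seconds:
 * the shift is reduced until maxsec seconds worth of input cannot overflow
 * 64 bits when multiplied by mult.  Larger ranges thus give coarser
 * accuracy, as the commit message notes. */
static void calc_mult_shift(uint32_t *mult, uint32_t *shift,
                            uint32_t from, uint32_t to, uint32_t maxsec)
{
    uint64_t tmp;
    uint32_t sft, sftacc = 32;

    /* How many bits of headroom does maxsec seconds of input leave? */
    tmp = ((uint64_t)maxsec * from) >> 32;
    while (tmp) {
        tmp >>= 1;
        sftacc--;
    }
    /* Largest shift whose mult still fits in the available headroom. */
    for (sft = 32; sft > 0; sft--) {
        tmp = (uint64_t)to << sft;
        tmp += from / 2;          /* round to nearest */
        tmp /= from;
        if ((tmp >> sftacc) == 0)
            break;
    }
    *mult = (uint32_t)tmp;
    *shift = sft;
}
```

    For a clock source, "from" is the device frequency and "to" is nanoseconds per second, exactly as the commit describes.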

    Signed-off-by: Thomas Gleixner
    Tested-by: Mikael Pettersson
    Acked-by: Ralf Baechle
    Acked-by: Linus Walleij
    Cc: John Stultz
    LKML-Reference:

    Thomas Gleixner
     
  • The mult and shift factors of clock events differ in their data type
    from those of clock sources for no reason. u32 is sufficient for
    both: shift is always <= 32.

    Tested-by: Mikael Pettersson
    Acked-by: Ralf Baechle
    Acked-by: Linus Walleij
    Cc: John Stultz
    LKML-Reference:

    Thomas Gleixner
     

12 Nov, 2009

1 commit


09 Nov, 2009

1 commit


05 Nov, 2009

2 commits

  • Allow the architecture to request a normal jiffy tick when the system
    goes idle and tick_nohz_stop_sched_tick is called. On s390 the hook is
    used to prevent the system going fully idle if there has been an
    interrupt other than a clock comparator interrupt since the last wakeup.

    On s390 the HiperSockets response time for 1 connection ping-pong goes
    down from 42 to 34 microseconds. The CPU cost decreases by 27%.

    Signed-off-by: Martin Schwidefsky
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
     
  • On a system with NOHZ=y tick_check_idle calls tick_nohz_stop_idle and
    tick_nohz_update_jiffies. Given the right conditions (ts->idle_active
    and/or ts->tick_stopped) both functions get a time stamp with ktime_get.
    The same time stamp can be reused if both functions require one.

    On s390 this change has the additional benefit that gcc inlines the
    tick_nohz_stop_idle function into tick_check_idle. The number of
    instructions to execute tick_check_idle drops from 225 to 144
    (without the ktime_get optimization it is 367 vs 215 instructions).

    before:

    0) | tick_check_idle() {
    0) | tick_nohz_stop_idle() {
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.601 us | }
    0) 1.765 us | }
    0) 3.047 us | }
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.570 us | }
    0) 1.727 us | }
    0) | tick_do_update_jiffies64() {
    0) 0.609 us | }
    0) 8.055 us | }

    after:

    0) | tick_check_idle() {
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.617 us | }
    0) 1.773 us | }
    0) | tick_do_update_jiffies64() {
    0) 0.593 us | }
    0) 4.477 us | }

    Signed-off-by: Martin Schwidefsky
    Cc: john stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
     

29 Oct, 2009

1 commit

  • This patch updates percpu related symbols under kernel/ and mm/ such
    that percpu symbols are unique and don't clash with local symbols.
    This serves two purposes of decreasing the possibility of global
    percpu symbol collision and allowing dropping per_cpu__ prefix from
    percpu symbols.

    * kernel/lockdep.c: s/lock_stats/cpu_lock_stats/

    * kernel/sched.c: s/init_rq_rt/init_rt_rq_var/ (any better idea?)
    s/sched_group_cpus/sched_groups/

    * kernel/softirq.c: s/ksoftirqd/run_ksoftirqd/

    * kernel/softlockup.c: s/(*)_timestamp/softlockup_\1_ts/
    s/watchdog_task/softlockup_watchdog/
    s/timestamp/ts/ for local variables

    * kernel/time/timer_stats: s/lookup_lock/tstats_lookup_lock/

    * mm/slab.c: s/reap_work/slab_reap_work/
    s/reap_node/slab_reap_node/

    * mm/vmstat.c: local variable changed to avoid collision with vmstat_work

    Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
    which cause name clashes" patch.

    Signed-off-by: Tejun Heo
    Acked-by: (slab/vmstat) Christoph Lameter
    Reviewed-by: Christoph Lameter
    Cc: Rusty Russell
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Nick Piggin

    Tejun Heo