Eric Lee / smarc-fsl-linux-kernel

24 Mar, 2006

40 commits

825a46af5 [PATCH] cpuset memory spread basic implementation

This patch provides the implementation and cpuset interface for an alternative
memory allocation policy that can be applied to certain kinds of memory
allocations, such as the page cache (file system buffers) and some slab caches
(such as inode caches).

The policy is called "memory spreading." If enabled, it spreads out these
kinds of memory allocations over all the nodes allowed to a task, instead of
preferring to place them on the node where the task is executing.

All other kinds of allocations, including anonymous pages for a task's stack
and data regions, are not affected by this policy choice, and continue to be
allocated preferring the node local to execution, as modified by the NUMA
mempolicy.

There are two boolean flag files per cpuset that control where the kernel
allocates pages for the file system buffers and related in-kernel data
structures. They are called 'memory_spread_page' and 'memory_spread_slab'.

If the per-cpuset boolean flag file 'memory_spread_page' is set, then the
kernel will spread the file system buffers (page cache) evenly over all the
nodes that the faulting task is allowed to use, instead of preferring to put
those pages on the node where the task is running.

If the per-cpuset boolean flag file 'memory_spread_slab' is set, then the
kernel will spread some file system related slab caches, such as those for
inodes and dentries, evenly over all the nodes that the faulting task is allowed to
use, instead of preferring to put those pages on the node where the task is
running.

The implementation is simple. Setting the cpuset flags 'memory_spread_page'
or 'memory_spread_slab' turns on the per-process flags PF_SPREAD_PAGE or
PF_SPREAD_SLAB, respectively, for each task that is in the cpuset or
subsequently joins that cpuset. In subsequent patches, the page allocation
calls for the affected page cache and slab caches are modified to perform an
inline check for these flags, and if set, a call to a new routine
cpuset_mem_spread_node() returns the node to prefer for the allocation.

The cpuset_mem_spread_node() routine is also simple. It uses the value of a
per-task rotor cpuset_mem_spread_rotor to select the next node in the current
task's mems_allowed to prefer for the allocation.
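
The rotor idea described above can be illustrated with a small user-space sketch. It is not the kernel code: the types and names (nodemask_sketch_t, struct task_sketch, mem_spread_node_sketch) are hypothetical, and a 64-bit mask is assumed in place of the kernel's nodemask type.

```c
#include <stdint.h>

/* Hypothetical stand-in for mems_allowed: bit n set means node n is allowed. */
typedef uint64_t nodemask_sketch_t;

struct task_sketch {
	nodemask_sketch_t mems_allowed;	/* nodes this task may allocate on */
	int spread_rotor;		/* last node handed out, or -1 */
};

/*
 * Return the next allowed node after the rotor, wrapping around.
 * Returns -1 if no node is allowed.
 */
static int mem_spread_node_sketch(struct task_sketch *t)
{
	int i;

	if (t->mems_allowed == 0)
		return -1;

	for (i = 1; i <= 64; i++) {
		int node = (t->spread_rotor + i) % 64;

		if (t->mems_allowed & ((nodemask_sketch_t)1 << node)) {
			t->spread_rotor = node;	/* remember where we stopped */
			return node;
		}
	}
	return -1;
}
```

With nodes 1 and 3 allowed and the rotor starting at -1, successive calls hand out nodes 1, 3, 1, 3 and so on, which is the even spreading the policy aims for.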

This policy can provide substantial improvements for jobs that need to place
thread local data on the corresponding node, but that need to access large
file system data sets that need to be spread across the several nodes in the
job's cpuset in order to fit. Without this patch, especially for jobs that
might have one thread reading in the data set, the memory allocation across
the nodes in the job's cpuset can become very uneven.

A couple of Copyright year ranges are updated as well. And a couple of email
addresses that can be found in the MAINTAINERS file are removed.

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Jackson
2006-03-24 23:33:22 +0800
8a39cc60b [PATCH] cpuset use combined atomic_inc_return calls

Replace pairs of calls to atomic_inc and atomic_read with a single call to
atomic_inc_return, saving a few bytes of source and kernel text.
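
As a rough user-space analogue (C11 <stdatomic.h>, not the kernel's atomic_t API), the two forms might look like the sketch below. The function names are hypothetical.

```c
#include <stdatomic.h>

/* Two-step form: increment, then read the counter back in a second call. */
static int bump_then_read(atomic_int *gen)
{
	atomic_fetch_add(gen, 1);
	return atomic_load(gen);
}

/*
 * Combined form: one atomic operation that increments and yields the
 * new value. atomic_fetch_add() returns the old value, hence the "+ 1".
 */
static int bump_and_return(atomic_int *gen)
{
	return atomic_fetch_add(gen, 1) + 1;
}
```

Besides being shorter, the combined form always returns the value produced by its own increment, whereas the two-step form can observe increments made by other threads in between.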

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Jackson
2006-03-24 23:33:22 +0800
7b5b9ef0e [PATCH] cpuset cleanup not not operators

Since the test_bit() bit operator is boolean (returns 0 or 1), the double not
"!!" operations needed to convert a scalar (zero or not zero) to a boolean are
not needed.
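
A minimal user-space sketch of the distinction, using hypothetical helpers rather than the kernel's test_bit():

```c
#include <limits.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Already yields exactly 0 or 1, so wrapping it in "!!" adds nothing. */
static int test_bit_sketch(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG_SKETCH] >>
		(nr % BITS_PER_LONG_SKETCH)) & 1UL;
}

/* A raw mask test yields zero or the mask value, so "!!" is needed here. */
static int mask_test_sketch(unsigned long word, unsigned long mask)
{
	return !!(word & mask);
}
```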

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Jackson
2006-03-24 23:33:22 +0800
0b1303fcf [PATCH] cpusets: only wakeup kswapd for zones in the current cpuset

If we get under some memory pressure in a cpuset (we only scan zones that
are in the cpuset for memory) then kswapd is woken up for all zones. This
patch only wakes up kswapd in zones that are part of the current cpuset.

Signed-off-by: Christoph Lameter
Acked-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-03-24 23:33:22 +0800
95c383227 [PATCH] rcutorture: tag success/failure line with module parameters

A long-running rcutorture test can overflow dmesg, so that the line
containing the module parameters is lost. Although it is usually possible
to retrieve this information from the log files, it is much better to just
tag it onto the final success/failure line so that it may be easily found.
This patch does just that.

Signed-off-by: "Paul E. McKenney"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul E. McKenney
2006-03-24 23:33:22 +0800
cdb045278 [PATCH] kill include/linux/platform.h, default_idle() cleanup

include/linux/platform.h contained nothing that was actually used except
the default_idle() prototype, and is therefore removed by this patch.

This patch does the following with the platform specific default_idle()
functions on different architectures:
- remove the unused function:
  - parisc
  - sparc64
- make the needlessly global function static:
  - arm
  - h8300
  - m68k
  - m68knommu
  - s390
  - v850
  - x86_64
- add a prototype in asm/system.h:
  - cris
  - i386
  - ia64

Signed-off-by: Adrian Bunk
Acked-by: Patrick Mochel
Acked-by: Kyle McMartin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-03-24 23:33:21 +0800
008accbba [PATCH] extract-ikconfig: don't use --long-options

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2006-03-24 23:33:21 +0800
ff45e99dc [PATCH] extract-ikconfig: be sure binoffset exists before extracting

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2006-03-24 23:33:21 +0800
66f9f59a5 [PATCH] extract-ikconfig: use mktemp(1)

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2006-03-24 23:33:21 +0800
a4a6198b8 [PATCH] tvec_bases too large for per-cpu data

With internal Xen-enabled kernels we see the kernel's static per-cpu data
area exceed the limit of 32k on x86-64, and even native x86-64 kernels get
fairly close to that limit. I generally question whether it is reasonable
to have data structures several kb in size allocated as per-cpu data when
the space there is rather limited.

The biggest arch-independent consumer is tvec_bases (over 4k on 32-bit
archs, over 8k on 64-bit ones), which now gets converted to use dynamically
allocated memory instead.

Signed-off-by: Jan Beulich
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2006-03-24 23:33:21 +0800
c98d8cfbc [PATCH] fs/coda/: proper prototypes

Introduce a file fs/coda/coda_int.h with proper prototypes for some code.

Signed-off-by: Adrian Bunk
Acked-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-03-24 23:33:21 +0800
2c2212901 [PATCH] fs/ext2/: proper ext2_get_parent() prototype

Add a proper prototype for ext2_get_parent().

Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-03-24 23:33:21 +0800
29c6e4860 [PATCH] fs/9p/: possible cleanups

- mux.c: v9fs_poll_mux() was inline but not static, resulting in needless
  object size bloat
- mux.c: remove all "inline"s: gcc should know best what to inline
- #if 0 the following unused global functions:
  - 9p.c: v9fs_v9fs_t_flush()
  - conv.c: v9fs_create_tauth()
  - mux.c: v9fs_mux_rpcnb()

Signed-off-by: Adrian Bunk
Cc: Eric Van Hensbergen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-03-24 23:33:21 +0800
caa9ee771 [PATCH] rcu_process_callbacks: don't cli() while testing ->nxtlist

__rcu_process_callbacks() disables interrupts to protect itself from
call_rcu() which adds new entries to ->nxtlist.

However, we can check "->nxtlist != NULL" with interrupts enabled. We can't
get "false positives" because call_rcu() can only change this condition
from 0 to 1.

Tested with rcutorture.ko.

Signed-off-by: Oleg Nesterov
Acked-by: Dipankar Sarma
Cc: "Paul E. McKenney"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-03-24 23:33:20 +0800
cba9f33d1 [PATCH] Range checking in do_proc_dointvec_(userhz_)jiffies_conv

When (integer) sysctl values are in either seconds or centiseconds, but
represented internally as jiffies, the allowable value range is decreased.
This patch adds range checks to the conversion routines.

For values in seconds: maximum LONG_MAX / HZ.

For values in centiseconds: maximum (LONG_MAX / HZ) * USER_HZ.
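
The seconds case can be sketched in user space as follows. This is an illustration, not the patch itself: HZ_SKETCH and secs_to_jiffies_checked() are hypothetical, the tick rate of 250 is an assumption, and the lower-bound check is an extra guard added here so the sketch never overflows on negative input.

```c
#include <errno.h>
#include <limits.h>

#define HZ_SKETCH 250L	/* assumed tick rate, for illustration only */

/*
 * Convert seconds to jiffies, rejecting values whose product with
 * HZ_SKETCH would not fit in a long.
 */
static int secs_to_jiffies_checked(long secs, long *jiffies_out)
{
	if (secs > LONG_MAX / HZ_SKETCH)
		return -EINVAL;
	/* Extra guard (assumption of this sketch): large negative values. */
	if (secs < -(LONG_MAX / HZ_SKETCH))
		return -EINVAL;

	*jiffies_out = secs * HZ_SKETCH;
	return 0;
}
```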

(BTW, does anyone else feel that an interface in seconds should not be
accepting negative values?)

Signed-off-by: Bart Samwel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Bart Samwel
2006-03-24 23:33:20 +0800
ed5b43f15 [PATCH] Represent laptop_mode as jiffies internally

Store the internal value for /proc/sys/vm/laptop_mode as jiffies instead of
seconds. Let the sysctl interface do the conversions, instead of doing
on-the-fly conversions every time the value is used.

Add a description of the fact that laptop_mode doubles as a flag and a
timeout to the comment above the laptop_mode variable.

Signed-off-by: Bart Samwel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Bart Samwel
2006-03-24 23:33:20 +0800
f6ef94381 [PATCH] Represent dirty_*_centisecs as jiffies internally

Store the internal values for:

/proc/sys/vm/dirty_writeback_centisecs
/proc/sys/vm/dirty_expire_centisecs

as jiffies instead of centiseconds. Let the sysctl interface do
the conversions with full precision using clock_t_to_jiffies, instead of
doing overflow-sensitive on-the-fly conversions every time the values are
used.

Cons: apparent precision loss if HZ is not a multiple of 100, because of
conversion back and forth. This is a common problem for all sysctl values
that use proc_dointvec_userhz_jiffies. (There is only one other in-tree
use, in net/core/neighbour.c.)
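
The apparent precision loss can be seen with a simplified round trip. The sketch below uses plain truncating integer division, which is an assumption and not necessarily the rounding the kernel's conversion helpers apply. HZ_SKETCH = 1024 is chosen only because it is not a multiple of 100.

```c
#define HZ_SKETCH	1024L	/* assumed tick rate, not a multiple of 100 */
#define USER_HZ_SKETCH	100L	/* centiseconds per second */

/* Value written by user space (centiseconds) to the stored form (jiffies). */
static long centisecs_to_jiffies_sketch(long cs)
{
	return cs * HZ_SKETCH / USER_HZ_SKETCH;
}

/* Stored form (jiffies) back to what user space reads (centiseconds). */
static long jiffies_to_centisecs_sketch(long j)
{
	return j * USER_HZ_SKETCH / HZ_SKETCH;
}
```

Under these assumptions, writing 499 stores 5109 jiffies, which reads back as 498, while 500 survives the round trip unchanged.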

Signed-off-by: Bart Samwel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Bart Samwel
2006-03-24 23:33:20 +0800
36f574135 [PATCH] free_uid() locking improvement

Reduce lock hold times in free_uid().

Cc: Ingo Molnar
Cc: "Paul E. McKenney"
Cc: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-03-24 23:33:20 +0800
3cf64b933 [PATCH] bitmap: region restructuring

Restructure the bitmap_*_region() operations, to avoid code duplication.

Also reduces binary text size by about 100 bytes (ia64 arch). The original
Bottomley bitmap_*_region patch added about 1000 bytes of compiled kernel text
(ia64). The Mundt multiword extension added another 600 bytes, and this
restructuring patch gets back about 100 bytes.

But the real motivation was the reduced amount of duplicated code.

Tested by Paul Mundt using
Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Jackson
2006-03-24 23:33:20 +0800
74373c6ac [PATCH] bitmap: region multiword spanning support

Add support to the lib/bitmap.c bitmap_*_region() routines for bitmap regions
larger than one word (nbits > BITS_PER_LONG). This removes a BUG_ON() in lib
bitmap.

I have an updated store queue API for SH that is currently using this with
relative success, and at first glance, it seems like this could be useful for
x86 (arch/i386/kernel/pci-dma.c) as well. Particularly for anything using
dma_declare_coherent_memory() on large areas and that attempts to allocate
large buffers from that space.

Paul Jackson also did some cleanup to this patch.

Signed-off-by: Paul Mundt
Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Mundt
2006-03-24 23:33:20 +0800
87e248025 [PATCH] bitmap: region cleanup

Paul Mundt says:

This patch set implements a number of patches to clean up and restructure the
bitmap region code, in addition to extending the interface to support
multiword spanning allocations.

The current implementation (before this patch set) is limited by only being
able to allocate pages BITS_PER_LONG);

As I seem to have been the first person to trigger this, the result ends up
being the following patch set with the help of Paul Jackson.

The final patch in the series eliminates quite a bit of code duplication, so
the bitmap code size ends up being smaller than the current implementation as
an added bonus.

After these are applied, it should already be possible to do multiword
allocations with dma_alloc_coherent() out of ranges established by
dma_declare_coherent_memory() on x86 without having to change any of the code,
and the SH store queue API will follow up on this as the other user that needs
support for this.

This patch:

Some code cleanup on the lib/bitmap.c bitmap_*_region() routines:

* spacing
* variable names
* comments

Has no change to code function.

Signed-off-by: Paul Mundt
Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Jackson
2006-03-24 23:33:20 +0800
f993b3bf8 [PATCH] remove ISA legacy functions: remove documentation

This patch removes the documentation of the ISA legacy functions.

Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-03-24 23:33:19 +0800
57f3ebcca [PATCH] remove ISA legacy functions: remove the helpers

unused isa_...() helpers removed.

Adrian Bunk:
The asm-sh part was rediffed due to unrelated changes.

Signed-off-by: Al Viro
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Viro
2006-03-24 23:33:19 +0800
c44fec118 [PATCH] remove ISA legacy functions: drivers/net/lance.c

switch to ioremap()

Signed-off-by: Al Viro
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Viro
2006-03-24 23:33:19 +0800
b336cea30 [PATCH] remove ISA legacy functions: drivers/net/hp-plus.c

switch to ioremap()

Adrian Bunk:
The order of the hunks in the patch was slightly rearranged due to an
unrelated change in the driver.

Signed-off-by: Al Viro
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Viro
2006-03-24 23:33:19 +0800
22bc685f4 [PATCH] remove ISA legacy functions: drivers/scsi/in2000.c

switched to ioremap(), cleaned the probing up a bit.

Signed-off-by: Al Viro
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Viro
2006-03-24 23:33:19 +0800
c818cb640 [PATCH] remove ISA legacy functions: drivers/scsi/g_NCR5380.c

switched CONFIG_SCSI_G_NCR5380_MEM code in g_NCR5380 to ioremap(); massaged
g_NCR5380.h accordingly.

Signed-off-by: Al Viro
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Viro
2006-03-24 23:33:19 +0800
ef5a4c8b0 [PATCH] remove ISA legacy functions: drivers/char/toshiba.c

switch from isa_read...() to ioremap() and read...()

Signed-off-by: Al Viro
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Viro
2006-03-24 23:33:19 +0800
e8c96f8c2 [PATCH] fs: Use ARRAY_SIZE macro

Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a
duplicate of ARRAY_SIZE. Some trailing whitespaces are also deleted.
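
For readers unfamiliar with the idiom, a stand-alone sketch follows. ARRAY_SIZE_SKETCH and the table are hypothetical; the kernel provides its own ARRAY_SIZE in its headers.

```c
#include <stddef.h>

/* Number of elements in a true array (not valid for pointers). */
#define ARRAY_SIZE_SKETCH(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical table, used only to demonstrate the macro. */
static const char *const fs_names_sketch[] = { "ext2", "coda", "9p" };

static size_t count_fs_names_sketch(void)
{
	/* Replaces the open-coded sizeof(x)/sizeof(x[0]) expression. */
	return ARRAY_SIZE_SKETCH(fs_names_sketch);
}
```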

Signed-off-by: Tobias Klauser
Cc: David Howells
Cc: Dave Kleikamp
Acked-by: Trond Myklebust
Cc: Neil Brown
Cc: Chris Mason
Cc: Jeff Mahoney
Cc: Christoph Hellwig
Cc: Nathan Scott
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tobias Klauser
2006-03-24 23:33:19 +0800
b5029622a [PATCH] dasd: "cleanup dasd_ioctl" fix

Cast the argument correctly.

Cc: Christoph Hellwig
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Bastian Blank
2006-03-24 23:33:18 +0800
88abaab4f [PATCH] s390: kzalloc() conversion in drivers/s390

Convert all kmalloc + memset sequences in drivers/s390 to kzalloc usage.
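
The shape of the conversion can be shown with a user-space analogue, where malloc() + memset() stands in for kmalloc + memset and calloc() stands in for kzalloc. The struct and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, used only for illustration. */
struct dev_info_sketch {
	int devno;
	int read_devno;
	char label[16];
};

/* Before: allocate, then clear in a separate step. */
static struct dev_info_sketch *alloc_two_step(void)
{
	struct dev_info_sketch *p = malloc(sizeof(*p));

	if (p)
		memset(p, 0, sizeof(*p));
	return p;
}

/* After: a single call that returns zeroed memory. */
static struct dev_info_sketch *alloc_zeroed(void)
{
	return calloc(1, sizeof(struct dev_info_sketch));
}
```

Both helpers return a zero-filled object or NULL. The single-call form is shorter and removes the risk of the size passed to memset drifting out of step with the size that was allocated.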

Signed-off-by: Eric Sesterhenn
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Sesterhenn
2006-03-24 23:33:18 +0800
fb630517f [PATCH] s390: kzalloc() conversion in arch/s390

Convert all kmalloc + memset sequences in arch/s390 to kzalloc usage.

Signed-off-by: Eric Sesterhenn
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Sesterhenn
2006-03-24 23:33:18 +0800
96641ee1e [PATCH] s390: CEX2A crt message length

Undetected edge case for CRT messages to CEX2A caused length to be too short,
thus truncating the message. The solution was to check a different variable
which actually determines which key type is being used.

Increment version number in z90main.c to correct level of 1.3.3, fix copyright
year and add comment about bitlength limit of CEX2A.

Signed-off-by: Eric Rossman
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Rossman
2006-03-24 23:33:18 +0800
b6cba4ee3 [PATCH] s390: 3590 tape driver

Michael Holzheu,
Martin Schwidefsky

Signed-off-by: Stefan Bader
Signed-off-by: Michael Holzheu
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stefan Bader
2006-03-24 23:33:18 +0800
5f3843388 [PATCH] s390: fix endless retry loop in tape driver

If a tape device is assigned to another host, the interrupt for the assign
operation comes back with deferred condition code 1. Under some conditions
this can lead to an endless loop of retries. Check if the current request is
still in IO in deferred condition code handling and prevent retries when the
request has already been cancelled.

Signed-off-by: Michael Holzheu
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Holzheu
2006-03-24 23:33:18 +0800
4cd190a73 [PATCH] s390: tape operation abortion leads to panic

When a request is aborted because of a signal, we currently stop the request
via csh, but we do not wait for the interrupt of csh in any case. We free the
request structure and therefore when the interrupt for the csh operation is
presented, the request object is no longer valid and an invalid callback
pointer is used.

To fix this, wait until the interrupt for csh arrives and until
wait_event_interruptible() does not return -ERESTARTSYS.

Signed-off-by: Michael Holzheu
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Holzheu
2006-03-24 23:33:18 +0800
842d3fba9 [PATCH] s390: tape retry flooding by deferred CC in interrupt

If a deferred CC happens there will be lots of messages, because the retry is
done immediately in the interrupt handler, which can be too fast. To avoid
this, requeue the request and schedule the queue to be processed.

Signed-off-by: Stefan Bader
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stefan Bader
2006-03-24 23:33:17 +0800
20c644680 [PATCH] s390: dasd extended error reporting

The DASD extended error reporting is a facility that provides detailed
information about certain problems in the DASD I/O. This information can be
used to implement fail-over applications that can recover from these problems.

Signed-off-by: Stefan Weinhuber
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stefan Weinhuber
2006-03-24 23:33:17 +0800
554a826e0 [PATCH] s390: random values in result of BIODASDINFO2

Use kzalloc to get a zeroed buffer for the structure returned to user space by
the BIODASDINFO2 ioctl. Not all fields are set up, e.g. the read_devno is
missing.

Signed-off-by: Horst Hummel
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Horst Hummel
2006-03-24 23:33:17 +0800
d0b2eaa37 [PATCH] s390: remove experimental flag from dasd diag

The dasd diag discipline has been tested on 64 bit and is no longer
experimental.

Signed-off-by: Peter Oberparleiter
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Oberparleiter
2006-03-24 23:33:17 +0800