Eric Lee / smarc-fsl-linux-kernel

16 Dec, 2009

26 commits

9b5e5d0fd hugetlb: use only nodes with memory for huge pages ... Browse Code »

Register per node hstate sysfs attributes only for nodes with memory.
Global replacement of 'all online nodes" with "all nodes with memory" in
mm/hugetlb.c. Suggested by David Rientjes.

A subsequent patch will handle adding/removing of per node hstate sysfs
attributes when nodes transition to/from memoryless state via memory
hotplug.

NOTE: this patch has not been tested with memoryless nodes.

Signed-off-by: Lee Schermerhorn
Reviewed-by: Andi Kleen
Cc: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Acked-by: David Rientjes
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2009-12-16 00:53:13 +0800
267b4c281 hugetlb: update hugetlb documentation for NUMA controls ... Browse Code »

Update the kernel huge tlb documentation to describe the numa memory
policy based huge page management. Additionaly, the patch includes a fair
amount of rework to improve consistency, eliminate duplication and set the
context for documenting the memory policy interaction.

Signed-off-by: Lee Schermerhorn
Acked-by: David Rientjes
Acked-by: Mel Gorman
Reviewed-by: Andi Kleen
Cc: KAMEZAWA Hiroyuki
Cc: Lee Schermerhorn
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2009-12-16 00:53:12 +0800
9a3052306 hugetlb: add per node hstate attributes ... Browse Code »

Add the per huge page size control/query attributes to the per node
sysdevs:

/sys/devices/system/node/node/hugepages/hugepages-/
nr_hugepages - r/w
free_huge_pages - r/o
surplus_huge_pages - r/o

The patch attempts to re-use/share as much of the existing global hstate
attribute initialization and handling, and the "nodes_allowed" constraint
processing as possible.

Calling set_max_huge_pages() with no node indicates a change to global
hstate parameters. In this case, any non-default task mempolicy will be
used to generate the nodes_allowed mask. A valid node id indicates an
update to that node's hstate parameters, and the count argument specifies
the target count for the specified node. From this info, we compute the
target global count for the hstate and construct a nodes_allowed node mask
contain only the specified node.

Setting the node specific nr_hugepages via the per node attribute
effectively ignores any task mempolicy or cpuset constraints.

With this patch:

(me):ls /sys/devices/system/node/node0/hugepages/hugepages-2048kB
./ ../ free_hugepages nr_hugepages surplus_hugepages

Starting from:
Node 0 HugePages_Total: 0
Node 0 HugePages_Free: 0
Node 0 HugePages_Surp: 0
Node 1 HugePages_Total: 0
Node 1 HugePages_Free: 0
Node 1 HugePages_Surp: 0
Node 2 HugePages_Total: 0
Node 2 HugePages_Free: 0
Node 2 HugePages_Surp: 0
Node 3 HugePages_Total: 0
Node 3 HugePages_Free: 0
Node 3 HugePages_Surp: 0
vm.nr_hugepages = 0

Allocate 16 persistent huge pages on node 2:
(me):echo 16 >/sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages

[Note that this is equivalent to:
numactl -m 2 hugeadmin --pool-pages-min 2M:+16
]

Yields:
Node 0 HugePages_Total: 0
Node 0 HugePages_Free: 0
Node 0 HugePages_Surp: 0
Node 1 HugePages_Total: 0
Node 1 HugePages_Free: 0
Node 1 HugePages_Surp: 0
Node 2 HugePages_Total: 16
Node 2 HugePages_Free: 16
Node 2 HugePages_Surp: 0
Node 3 HugePages_Total: 0
Node 3 HugePages_Free: 0
Node 3 HugePages_Surp: 0
vm.nr_hugepages = 16

Global controls work as expected--reduce pool to 8 persistent huge pages:
(me):echo 8 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

Node 0 HugePages_Total: 0
Node 0 HugePages_Free: 0
Node 0 HugePages_Surp: 0
Node 1 HugePages_Total: 0
Node 1 HugePages_Free: 0
Node 1 HugePages_Surp: 0
Node 2 HugePages_Total: 8
Node 2 HugePages_Free: 8
Node 2 HugePages_Surp: 0
Node 3 HugePages_Total: 0
Node 3 HugePages_Free: 0
Node 3 HugePages_Surp: 0

Signed-off-by: Lee Schermerhorn
Acked-by: Mel Gorman
Reviewed-by: Andi Kleen
Cc: KAMEZAWA Hiroyuki
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Cc: David Rientjes
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2009-12-16 00:53:12 +0800
4e25b2576 hugetlb: add generic definition of NUMA_NO_NODE ... Browse Code »

Move definition of NUMA_NO_NODE from ia64 and x86_64 arch specific headers
to generic header 'linux/numa.h' for use in generic code. NUMA_NO_NODE
replaces bare '-1' where it's used in this series to indicate "no node id
specified". Ultimately, it can be used to replace the -1 elsewhere where
it is used similarly.

Signed-off-by: Lee Schermerhorn
Acked-by: David Rientjes
Acked-by: Mel Gorman
Reviewed-by: Andi Kleen
Cc: KAMEZAWA Hiroyuki
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2009-12-16 00:53:12 +0800
06808b082 hugetlb: derive huge pages nodes allowed from task mempolicy ... Browse Code »

This patch derives a "nodes_allowed" node mask from the numa mempolicy of
the task modifying the number of persistent huge pages to control the
allocation, freeing and adjusting of surplus huge pages when the pool page
count is modified via the new sysctl or sysfs attribute
"nr_hugepages_mempolicy". The nodes_allowed mask is derived as follows:

* For "default" [NULL] task mempolicy, a NULL nodemask_t pointer
is produced. This will cause the hugetlb subsystem to use
node_online_map as the "nodes_allowed". This preserves the
behavior before this patch.
* For "preferred" mempolicy, including explicit local allocation,
a nodemask with the single preferred node will be produced.
"local" policy will NOT track any internode migrations of the
task adjusting nr_hugepages.
* For "bind" and "interleave" policy, the mempolicy's nodemask
will be used.
* Other than to inform the construction of the nodes_allowed node
mask, the actual mempolicy mode is ignored. That is, all modes
behave like interleave over the resulting nodes_allowed mask
with no "fallback".

See the updated documentation [next patch] for more information
about the implications of this patch.

Examples:

Starting with:

Node 0 HugePages_Total: 0
Node 1 HugePages_Total: 0
Node 2 HugePages_Total: 0
Node 3 HugePages_Total: 0

Default behavior [with or without this patch] balances persistent
hugepage allocation across nodes [with sufficient contiguous memory]:

sysctl vm.nr_hugepages[_mempolicy]=32

yields:

Node 0 HugePages_Total: 8
Node 1 HugePages_Total: 8
Node 2 HugePages_Total: 8
Node 3 HugePages_Total: 8

Of course, we only have nr_hugepages_mempolicy with the patch,
but with default mempolicy, nr_hugepages_mempolicy behaves the
same as nr_hugepages.

Applying mempolicy--e.g., with numactl [using '-m' a.k.a.
'--membind' because it allows multiple nodes to be specified
and it's easy to type]--we can allocate huge pages on
individual nodes or sets of nodes. So, starting from the
condition above, with 8 huge pages per node, add 8 more to
node 2 using:

numactl -m 2 sysctl vm.nr_hugepages_mempolicy=40

This yields:

Node 0 HugePages_Total: 8
Node 1 HugePages_Total: 8
Node 2 HugePages_Total: 16
Node 3 HugePages_Total: 8

The incremental 8 huge pages were restricted to node 2 by the
specified mempolicy.

Similarly, we can use mempolicy to free persistent huge pages
from specified nodes:

numactl -m 0,1 sysctl vm.nr_hugepages_mempolicy=32

yields:

Node 0 HugePages_Total: 4
Node 1 HugePages_Total: 4
Node 2 HugePages_Total: 16
Node 3 HugePages_Total: 8

The 8 huge pages freed were balanced over nodes 0 and 1.

[rientjes@google.com: accomodate reworked NODEMASK_ALLOC]
Signed-off-by: David Rientjes
Signed-off-by: Lee Schermerhorn
Acked-by: Mel Gorman
Reviewed-by: Andi Kleen
Cc: KAMEZAWA Hiroyuki
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2009-12-16 00:53:12 +0800
c1e6c8d07 hugetlb: factor init_nodemask_of_node() ... Browse Code »

Factor init_nodemask_of_node() out of the nodemask_of_node() macro.

This will be used to populate the huge pages "nodes_allowed" nodemask for
a single node when basing nodes_allowed on a preferred/local mempolicy or
when a persistent huge page pool page count is modified via a per node
sysfs attribute.

Signed-off-by: Lee Schermerhorn
Acked-by: Mel Gorman
Reviewed-by: Andi Kleen
Cc: KAMEZAWA Hiroyuki
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Acked-by: David Rientjes
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2009-12-16 00:53:12 +0800
6ae11b278 hugetlb: add nodemask arg to huge page alloc, free and surplus adjust functions ... Browse Code »

In preparation for constraining huge page allocation and freeing by the
controlling task's numa mempolicy, add a "nodes_allowed" nodemask pointer
to the allocate, free and surplus adjustment functions. For now, pass
NULL to indicate default behavior--i.e., use node_online_map. A
subsqeuent patch will derive a non-default mask from the controlling
task's numa mempolicy.

Note that this method of updating the global hstate nr_hugepages under the
constraint of a nodemask simplifies keeping the global state
consistent--especially the number of persistent and surplus pages relative
to reservations and overcommit limits. There are undoubtedly other ways
to do this, but this works for both interfaces: mempolicy and per node
attributes.

[rientjes@google.com: fix HIGHMEM compile error]
Signed-off-by: Lee Schermerhorn
Reviewed-by: Mel Gorman
Acked-by: David Rientjes
Reviewed-by: Andi Kleen
Cc: KAMEZAWA Hiroyuki
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Cc: Andi Kleen
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2009-12-16 00:53:12 +0800
9a76db099 hugetlb: rework hstate_next_node_* functions ... Browse Code »

Modify the hstate_next_node* functions to allow them to be called to
obtain the "start_nid". Then, whereas prior to this patch we
unconditionally called hstate_next_node_to_{alloc|free}(), whether or not
we successfully allocated/freed a huge page on the node, now we only call
these functions on failure to alloc/free to advance to next allowed node.

Factor out the next_node_allowed() function to handle wrap at end of
node_online_map. In this version, the allowed nodes include all of the
online nodes.

Signed-off-by: Lee Schermerhorn
Reviewed-by: Mel Gorman
Acked-by: David Rientjes
Reviewed-by: Andi Kleen
Cc: KAMEZAWA Hiroyuki
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Cc: Andi Kleen
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2009-12-16 00:53:12 +0800
4e7b8a6ce nodemask: make NODEMASK_ALLOC more general ... Browse Code »

This is a series of patches to provide control over the location of the
allocation and freeing of persistent huge pages on a NUMA platform.
Please consider for merging into mmotm.

This series uses two mechanisms to constrain the nodes from which
persistent huge pages are allocated: 1) the task NUMA mempolicy of the
task modifying a new sysctl "nr_hugepages_mempolicy", based on a
suggestion by Mel Gorman; and 2) a subset of the hugepages hstate sysfs
attributes have been added [in V4] to each node system device under:

/sys/devices/node/node[0-9]*/hugepages

The per node attibutes allow direct assignment of a huge page count on a
specific node, regardless of the task's mempolicy or cpuset constraints.

This patch:

NODEMASK_ALLOC(x, m) assumes x is a type of struct, which is unnecessary.
It's perfectly reasonable to use this macro to allocate a nodemask_t,
which is anonymous, either dynamically or on the stack depending on
NODES_SHIFT.

Signed-off-by: David Rientjes
Signed-off-by: Lee Schermerhorn
Acked-by: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Cc: Randy Dunlap
Cc: Nishanth Aravamudan
Cc: Andi Kleen
Cc: David Rientjes
Cc: Adam Litke
Cc: Andy Whitcroft
Cc: Eric Whitney
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2009-12-16 00:53:12 +0800
6d9c285a6 mm: move inc_zone_page_state(NR_ISOLATED) to just isolated place ... Browse Code »

Christoph pointed out inc_zone_page_state(NR_ISOLATED) should be placed
in right after isolate_page().

This patch does it.

Reviewed-by: Christoph Lameter
Signed-off-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2009-12-16 00:53:12 +0800
ee32398fd /dev/mem: remove redundant parameter from do_write_kmem() ... Browse Code »

Signed-off-by: Wu Fengguang
Cc: Andi Kleen
Cc: Avi Kivity
Cc: Greg Kroah-Hartman
Cc: Johannes Berg
Cc: Marcelo Tosatti
Cc: Mark Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wu Fengguang
2009-12-16 00:53:12 +0800
80ad89a0c /dev/mem: remove the "written" variable in write_kmem() ... Browse Code »

Also rename "len" to "sz". No behavior change.

Signed-off-by: Wu Fengguang
Cc: Andi Kleen
Cc: Avi Kivity
Cc: Greg Kroah-Hartman
Cc: Johannes Berg
Cc: Marcelo Tosatti
Cc: Mark Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wu Fengguang
2009-12-16 00:53:11 +0800
7fabaddd0 /dev/mem: make size_inside_page() logic straight ... Browse Code »

Also convert more size_inside_page() users.

Signed-off-by: Wu Fengguang
Cc: Andi Kleen
Cc: Avi Kivity
Cc: Greg Kroah-Hartman
Cc: Johannes Berg
Cc: Marcelo Tosatti
Cc: Mark Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wu Fengguang
2009-12-16 00:53:11 +0800
fa29e97bb /dev/mem: cleanup unxlate_dev_mem_ptr() calls ... Browse Code »

No behaviour change.

[akpm@linux-foundation.org: cleanuplets]
[akpm@linux-foundation.org: remove unused `ret']
Signed-off-by: Wu Fengguang
Acked-by: Andi Kleen
Cc: Marcelo Tosatti
Cc: Greg Kroah-Hartman
Cc: Mark Brown
Cc: Johannes Berg
Cc: Avi Kivity
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wu Fengguang
2009-12-16 00:53:11 +0800
f222318e9 /dev/mem: introduce size_inside_page() ... Browse Code »

Introduce size_inside_page() to replace duplicate /dev/mem code.

Also apply it to /dev/kmem, whose alignment logic was buggy.

Signed-off-by: Wu Fengguang
Acked-by: Andi Kleen
Cc: Marcelo Tosatti
Cc: Greg Kroah-Hartman
Cc: Mark Brown
Cc: Johannes Berg
Cc: Avi Kivity
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wu Fengguang
2009-12-16 00:53:11 +0800
4ea2f43f2 /dev/mem: remove redundant test on len ... Browse Code »

The len test in write_kmem() is always true, so can be reduced.

Signed-off-by: Wu Fengguang
Acked-by: Andi Kleen
Cc: Marcelo Tosatti
Cc: Greg Kroah-Hartman
Cc: Mark Brown
Cc: Johannes Berg
Cc: Avi Kivity
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wu Fengguang
2009-12-16 00:53:11 +0800
659ace584 mmap: don't return ENOMEM when mapcount is temporarily exceeded in munmap() ... Browse Code »

On ia64, the following test program exit abnormally, because glibc thread
library called abort().

========================================================
(gdb) bt
#0 0xa000000000010620 in __kernel_syscall_via_break ()
#1 0x20000000003208e0 in raise () from /lib/libc.so.6.1
#2 0x2000000000324090 in abort () from /lib/libc.so.6.1
#3 0x200000000027c3e0 in __deallocate_stack () from /lib/libpthread.so.0
#4 0x200000000027f7c0 in start_thread () from /lib/libpthread.so.0
#5 0x200000000047ef60 in __clone2 () from /lib/libc.so.6.1
========================================================

The fact is, glibc call munmap() when thread exitng time for freeing
stack, and it assume munlock() never fail. However, munmap() often make
vma splitting and it with many mapcount make -ENOMEM.

Oh well, that's crazy, because stack unmapping never increase mapcount.
The maxcount exceeding is only temporary. internal temporary exceeding
shouldn't make ENOMEM.

This patch does it.

test_max_mapcount.c
==================================================================
#include
#include
#include
#include
#include
#include

#define THREAD_NUM 30000
#define MAL_SIZE (8*1024*1024)

void *wait_thread(void *args)
{
void *addr;

addr = malloc(MAL_SIZE);
sleep(10);

return NULL;
}

void *wait_thread2(void *args)
{
sleep(60);

return NULL;
}

int main(int argc, char *argv[])
{
int i;
pthread_t thread[THREAD_NUM], th;
int ret, count = 0;
pthread_attr_t attr;

ret = pthread_attr_init(&attr);
if(ret) {
perror("pthread_attr_init");
}

ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
if(ret) {
perror("pthread_attr_setdetachstate");
}

for (i = 0; i < THREAD_NUM; i++) {
ret = pthread_create(&th, &attr, wait_thread, NULL);
if(ret) {
fprintf(stderr, "[%d] ", count);
perror("pthread_create");
} else {
printf("[%d] create OK.\n", count);
}
count++;

ret = pthread_create(&thread[i], &attr, wait_thread2, NULL);
if(ret) {
fprintf(stderr, "[%d] ", count);
perror("pthread_create");
} else {
printf("[%d] create OK.\n", count);
}
count++;
}

sleep(3600);
return 0;
}
==================================================================

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KOSAKI Motohiro
Signed-off-by: Hugh Dickins
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2009-12-16 00:53:11 +0800
bb86a7338 page-types: exit early when invoked with -d|--describe ... Browse Code »

On a system with large amount of memory (256GB), invoking page-types can
take quite a long time, which is unreasonable considering the user only
wants a description of the flags:

# time ./page-types -d 0x10
0x0000000000000010 ____D_____________________________ dirty

real 0m34.285s
user 0m1.966s
sys 0m32.313s

This is because we still walk the entire address range.

Exiting early seems like a reasonble solution:

# time ./page-types -d 0x10
0x0000000000000010 ____D_____________________________ dirty

real 0m0.007s
user 0m0.001s
sys 0m0.005s

Signed-off-by: Alex Chiang
Cc: Andi Kleen
Cc: Haicheng Li
Acked-by: Wu Fengguang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Chiang
2009-12-16 00:53:11 +0800
9fdcd886a page-types: whitespace alignment ... Browse Code »

Align the output when page-type -h is invoked.

Signed-off-by: Alex Chiang
Acked-by: Wu Fengguang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Chiang
2009-12-16 00:53:11 +0800
dcfe730c6 page-types: learn to describe flags directly from command line ... Browse Code »

Teach page-types to describe page flags directly from the command line.

Why is this useful? For instance, if you're using memory hotplug and see
this in /var/log/messages:

kernel: removing from LRU failed 3836dd0/1/1e00000000000010

It would be nice to decode those page flags without staring at the source.

Example usage and output:

# Documentation/vm/page-types -d 0x10
0x0000000000000010 ____D_____________________________ dirty

# Documentation/vm/page-types -d anon
0x0000000000001000 ____________a_____________________ anonymous

# Documentation/vm/page-types -d anon,0x10
0x0000000000001010 ____D_______a_____________________ dirty,anonymous

[achiang@hp.com: documentation]
Signed-off-by: Alex Chiang
Signed-off-by: Wu Fengguang
Cc: Andi Kleen
Cc: Haicheng Li
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Chiang
2009-12-16 00:53:11 +0800
f1327bf18 page-types: unsigned cannot be less than 0 in add_page() ... Browse Code »

If not signed, testing of the read() return value in this function
will not work.

Signed-off-by: Roel Kluin
Cc: Wu Fengguang
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Roel Kluin
2009-12-16 00:53:11 +0800
3428838d8 page-types: constify read only arrays ... Browse Code »

Signed-off-by: Tommi Rantala
Cc: Randy Dunlap
Cc: Wu Fengguang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tommi Rantala
2009-12-16 00:53:10 +0800
1b604d75b oom: dump stack and VM state when oom killer panics ... Browse Code »

The oom killer header, including information such as the allocation order
and gfp mask, current's cpuset and memory controller, call trace, and VM
state information is currently only shown when the oom killer has selected
a task to kill.

This information is omitted, however, when the oom killer panics either
because of panic_on_oom sysctl settings or when no killable task was
found. It is still relevant to know crucial pieces of information such as
the allocation order and VM state when diagnosing such issues, especially
at boot.

This patch displays the oom killer header whenever it panics so that bug
reports can include pertinent information to debug the issue, if possible.

Signed-off-by: David Rientjes
Reviewed-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2009-12-16 00:53:10 +0800
5ce45962b MAINTAINERS: new kbuild maintainer ... Browse Code »

Sam was fine with handing over kbuild maintainership to me. The git
trees are already in linux-next, a merge request will follow shortly.

Acked-by: Sam Ravnborg
Signed-off-by: Michal Marek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Marek
2009-12-16 00:53:10 +0800
ec81aecb2 hfs: fix a potential buffer overflow ... Browse Code »
45

A specially-crafted Hierarchical File System (HFS) filesystem could cause
a buffer overflow to occur in a process's kernel stack during a memcpy()
call within the hfs_bnode_read() function (at fs/hfs/bnode.c:24). The
attacker can provide the source buffer and length, and the destination
buffer is a local variable of a fixed length. This local variable (passed
as "&entry" from fs/hfs/dir.c:112 and allocated on line 60) is stored in
the stack frame of hfs_bnode_read()'s caller, which is hfs_readdir().
Because the hfs_readdir() function executes upon any attempt to read a
directory on the filesystem, it gets called whenever a user attempts to
inspect any filesystem contents.

[amwang@redhat.com: modify this patch and fix coding style problems]
Signed-off-by: WANG Cong
Cc: Eugene Teo
Cc: Roman Zippel
Cc: Al Viro
Cc: Christoph Hellwig
Cc: Alexey Dobriyan
Cc: Dave Anderson
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Amerigo Wang
2009-12-16 00:53:10 +0800
4b731d50f bsdacct: fix uid/gid misreporting ... Browse Code »

commit d8e180dcd5bbbab9cd3ff2e779efcf70692ef541 "bsdacct: switch
credentials for writing to the accounting file" introduced credential
switching during final acct data collecting. However, uid/gid pair
continued to be collected from current which became credentials of who
created acct file, not who exits.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14676

Signed-off-by: Alexey Dobriyan
Reported-by: Juho K. Juopperi
Acked-by: Serge Hallyn
Acked-by: David Howells
Reviewed-by: Michal Schmidt
Cc: James Morris
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2009-12-16 00:53:10 +0800

15 Dec, 2009

14 commits

544304075 Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging ... Browse Code »

* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c-core: i2c bus should support PM entries in struct dev_pm_ops
i2c: Get rid of I2C_CLIENT_MODULE_PARM
i2c: Drop I2C_CLIENT_INSMOD_2 to 8
i2c: Drop I2C_CLIENT_INSMOD_1
i2c: Get rid of struct i2c_client_address_data
i2c: Drop the kind parameter from detect callbacks

Linus Torvalds
2009-12-15 06:11:56 +0800
3ea6b3d0e Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 ... Browse Code »

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
udf: Avoid IO in udf_clear_inode
udf: Try harder when looking for VAT inode
udf: Fix compilation with UDFFS_DEBUG enabled

Linus Torvalds
2009-12-15 04:50:25 +0800
2c948b3f8 udf: Avoid IO in udf_clear_inode ... Browse Code »

It is not very good to do IO in udf_clear_inode. First, VFS does not really
expect inode to become dirty there and thus we have to write it ourselves,
second, memory reclaim gets blocked waiting for IO when it does not really
expect it, third, the IO pattern (e.g. on umount) resulting from writes in
udf_clear_inode is bad and it slows down writing a lot.

The reason why UDF needed to do IO in udf_clear_inode is that UDF standard
mandates extent length to exactly match inode size. But when we allocate
extents to a file or directory, we don't really know what exactly the final
file size will be and thus temporarily set it to block boundary and later
truncate it to exact length in udf_clear_inode. Now, this is changed to
truncate to final file size in udf_release_file for regular files. For
directories and symlinks, we do the truncation at the moment when learn
what the final file size will be.

Signed-off-by: Jan Kara

Jan Kara
2009-12-15 04:40:04 +0800
e971b0b9e udf: Try harder when looking for VAT inode ... Browse Code »

Some disks do not contain VAT inode in the last recorded block as required
by the standard but a few blocks earlier (or the number of recorded blocks
is wrong). So look for the VAT inode a bit before the end of the media.

Signed-off-by: Jan Kara

Jan Kara
2009-12-15 04:40:04 +0800
1fefd086d udf: Fix compilation with UDFFS_DEBUG enabled ... Browse Code »

Signed-off-by: Jan Kara

Jan Kara
2009-12-15 04:40:04 +0800
75b08038c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/… ... Browse Code »

…git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mce: Clean up thermal init by introducing intel_thermal_supported()
x86, mce: Thermal monitoring depends on APIC being enabled
x86: Gart: fix breakage due to IOMMU initialization cleanup
x86: Move swiotlb initialization before dma32_free_bootmem
x86: Fix build warning in arch/x86/mm/mmio-mod.c
x86: Remove usedac in feature-removal-schedule.txt
x86: Fix duplicated UV BAU interrupt vector
nvram: Fix write beyond end condition; prove to gcc copy is safe
mm: Adjust do_pages_stat() so gcc can see copy_from_user() is safe
x86: Limit the number of processor bootup messages
x86: Remove enabling x2apic message for every CPU
doc: Add documentation for bootloader_{type,version}
x86, msr: Add support for non-contiguous cpumasks
x86: Use find_e820() instead of hard coded trampoline address
x86, AMD: Fix stale cpuid4_info shared_map data in shared_cpu_map cpumasks

Trivial percpu-naming-introduced conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c

Linus Torvalds
2009-12-15 04:36:46 +0800
fb1beb29b Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6 ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
pcmcia: CodingStyle fixes
pcmcia: remove unused IRQ_FIRST_SHARED

Linus Torvalds
2009-12-15 04:33:02 +0800
54067ee20 i2c-core: i2c bus should support PM entries in struct dev_pm_ops ... Browse Code »

Struct dev_pm_ops is not configured in current i2c bus type. i2c drivers
only depends on suspend/resume entries in struct dev_pm_ops are not
informed of PM suspend and resume events by i2c framework.

Signed-off-by: Sonic Zhang
Signed-off-by: Jean Delvare

sonic zhang
2009-12-15 04:17:30 +0800
7f508118b i2c: Get rid of I2C_CLIENT_MODULE_PARM ... Browse Code »

There is no user left of I2C_CLIENT_MODULE_PARM, so we can finally
get rid of this ugly macro.

Signed-off-by: Jean Delvare
Tested-by: Wolfram Sang

Jean Delvare
2009-12-15 04:17:29 +0800
e5e9f44c2 i2c: Drop I2C_CLIENT_INSMOD_2 to 8 ... Browse Code »

These macros simply declare an enum, so drivers might as well declare
it themselves. This puts an end to the arbitrary limit of 8 chip types
per i2c driver.

Signed-off-by: Jean Delvare
Tested-by: Wolfram Sang

Jean Delvare
2009-12-15 04:17:27 +0800
1f86df49d i2c: Drop I2C_CLIENT_INSMOD_1 ... Browse Code »

This macro simply declares an enum, so drivers might as well declare
it themselves.

Signed-off-by: Jean Delvare
Tested-by: Wolfram Sang

Jean Delvare
2009-12-15 04:17:26 +0800
c3813d6af i2c: Get rid of struct i2c_client_address_data ... Browse Code »

Struct i2c_client_address_data only contains one field at this point,
which makes its usefulness questionable. Get rid of it and pass simple
address lists around instead.

Signed-off-by: Jean Delvare
Tested-by: Wolfram Sang

Jean Delvare
2009-12-15 04:17:25 +0800
310ec7921 i2c: Drop the kind parameter from detect callbacks ... Browse Code »

The "kind" parameter always has value -1, and nobody is using it any
longer, so we can remove it.

Signed-off-by: Jean Delvare
Tested-by: Wolfram Sang

Jean Delvare
2009-12-15 04:17:23 +0800
478e4e9d7 Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6 ... Browse Code »

* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (23 commits)
spi: fix probe/remove section markings
Add OMAP spi100k driver
spi-imx: don't access struct device directly but use dev_get_platdata
spi-imx: Add mx25 support
spi-imx: use positive logic to distinguish cpu variants
spi-imx: correct check for platform_get_irq failing
ARM: NUC900: Add spi driver support for nuc900
spi: SuperH MSIOF SPI Master driver V2
spi: fix spidev compilation failure when VERBOSE is defined
spi/au1550_spi: fix setupxfer not to override cfg with zeros
spi/mpc8xxx: don't use __exit_p to wrap plat_mpc8xxx_spi_remove
spi/i.MX: fix broken error handling for gpio_request
spi/i.mx: drain MXC SPI transfer buffer when probing device
MAINTAINERS: add SPI co-maintainer.
spi/xilinx_spi: fix incorrect casting
spi/mpc52xx-spi: minor cleanups
xilinx_spi: add a platform driver using the xilinx_spi common module.
xilinx_spi: add support for the DS570 IP.
xilinx_spi: Switch to iomem functions and support little endian.
xilinx_spi: Split into of driver and generic part.
...

Linus Torvalds
2009-12-15 02:22:11 +0800