07 Mar, 2008

5 commits

  • NUMA slab allocator cpu migration bugfix

    The NUMA slab allocator (specifically, cache_alloc_refill)
    does not refresh its local copies of which cpu and which
    numa node it is on when it drops and reacquires the irq
    block that it inherited from its caller. As a result,
    those values become stale if the process was migrated to
    another numa node while the irq block was dropped.

    The solution is to make cache_alloc_refill reload these
    variables whenever it drops and reacquires the irq block.
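
    A minimal sketch of the pattern (illustrative only, not the literal diff;
    the surrounding code is elided):

    /* in cache_alloc_refill(), once the irq block is reacquired */
    local_irq_disable();
    node = numa_node_id();          /* reload: the task may have migrated */
    ac = cpu_cache_get(cachep);     /* reload the per-cpu array cache too */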

    The error is very difficult to hit. When it does occur,
    one gets the following oops + stack traceback bits in
    check_spinlock_acquired:

    kernel BUG at mm/slab.c:2417
    cache_alloc_refill+0xe6
    kmem_cache_alloc+0xd0
    ...

    This patch was developed against 2.6.23, then ported to and
    compile-tested only against 2.6.25-rc4.

    Signed-off-by: Joe Korty
    Signed-off-by: Christoph Lameter

    Joe Korty
     
  • SLUB should pack even small objects nicely into cachelines if that is what
    has been asked for. Use the same algorithm as SLAB for this.

    The effect of this patch for a system with a cacheline size of 64
    bytes is that the 24 byte sized slab caches will now put exactly
    2 objects into a cacheline instead of 3 with some overlap into
    the next cacheline. This reduces the object density in a 4k slab
    from 170 to 128 objects (same as SLAB).
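
    The arithmetic behind those numbers, as a small standalone sketch (the
    alignment step mirrors the SLAB-style hwcache alignment described above;
    the real kernel code differs in detail):

    #include <stdio.h>

    int main(void)
    {
            unsigned int object = 24, cacheline = 64, slab = 4096;

            /* Tightly packed, some 24-byte objects straddle cachelines. */
            printf("packed : %u objects per %u-byte slab\n", slab / object, slab);

            /* SLAB-style hwcache alignment: start from the cacheline size and
             * halve it while the object still fits in half of it (64 -> 32). */
            unsigned int align = cacheline;
            while (object <= align / 2)
                    align /= 2;

            unsigned int aligned = (object + align - 1) & ~(align - 1);
            printf("aligned: %u-byte objects, %u per slab\n",
                   aligned, slab / aligned);    /* 32 bytes, 128 objects */

            return 0;
    }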

    Signed-off-by: Nick Piggin
    Signed-off-by: Christoph Lameter

    Nick Piggin
     
  • Make them all use angle brackets and the directory name.

    Acked-by: Pekka Enberg
    Signed-off-by: Joe Perches
    Signed-off-by: Christoph Lameter

    Joe Perches
     
    The NUMA fallback logic should be passing local_flags to kmem_get_pages()
    and not simply the flags passed in.

    Reviewed-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Christoph Lameter
     
  • The remote frees are in the freelist of the page and not in the
    percpu freelist.

    Reviewed-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Christoph Lameter
     

05 Mar, 2008

20 commits

  • Adam Litke noticed that currently we grow the hugepage pool independent of any
    cpuset the running process may be in, but when shrinking the pool, the cpuset
    is checked. This leads to inconsistency when shrinking the pool in a
    restricted cpuset -- an administrator may have been able to grow the pool on a
    node restricted by a containing cpuset, but they cannot shrink it there.

    There are two options: either prevent growing of the pool outside of the
    cpuset or allow shrinking outside of the cpuset. From previous discussions
    on linux-mm, /proc/sys/vm/nr_hugepages is an administrative interface that
    should not be restricted by cpusets. So allow shrinking the pool by removing
    pages from nodes outside of current's cpuset.

    Signed-off-by: Nishanth Aravamudan
    Acked-by: Adam Litke
    Cc: William Irwin
    Cc: Lee Schermerhorn
    Cc: Christoph Lameter
    Cc: Paul Jackson
    Cc: David Gibson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nishanth Aravamudan
     
  • A hugetlb reservation may be inadequately backed in the event of racing
    allocations and frees when utilizing surplus huge pages. Consider the
    following series of events in processes A and B:

    A) Allocates some surplus pages to satisfy a reservation
    B) Frees some huge pages
    A) A notices the extra free pages and drops hugetlb_lock to free some of
       its surplus pages back to the buddy allocator.
    B) Allocates some huge pages
    A) Reacquires hugetlb_lock and returns from gather_surplus_huge_pages()

    Avoid this by committing the reservation after pages have been allocated but
    before dropping the lock to free excess pages. For parity, release the
    reservation in return_unused_surplus_pages().
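
    A hedged sketch of that ordering (abbreviated, not the exact diff):

    spin_lock(&hugetlb_lock);
    /* ... surplus pages have been allocated to back the reservation ... */
    resv_huge_pages += delta;      /* commit while the lock is still held */
    spin_unlock(&hugetlb_lock);    /* only now give any excess surplus
                                    * pages back to the buddy allocator   */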

    This patch also corrects the cpuset_mems_nr() error path in
    hugetlb_acct_memory(). If the cpuset check fails, uncommit the
    reservation, but also be sure to return any surplus huge pages that may
    have been allocated to back the failed reservation.

    Thanks to Andy Whitcroft for discovering this.

    Signed-off-by: Adam Litke
    Cc: Mel Gorman
    Cc: Andy Whitcroft
    Cc: Dave Hansen
    Cc: William Lee Irwin III
    Cc: Andy Whitcroft
    Cc: Mel Gorman
    Cc: David Gibson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adam Litke
     
  • While testing force_empty, during an exit_mmap, __mem_cgroup_remove_list
    called from mem_cgroup_uncharge_page oopsed on a NULL pointer in the lru list.
    I couldn't see what racing tasks on other cpus were doing, but surmise that
    another must have been in mem_cgroup_charge_common on the same page, between
    its unlock_page_cgroup and spin_lock_irqsave near done (thanks to that kzalloc
    which I'd almost changed to a kmalloc).

    Normally such a race cannot happen, the ref_cnt prevents it, the final
    uncharge cannot race with the initial charge. But force_empty buggers the
    ref_cnt, that's what it's all about; and thereafter forced pages are
    vulnerable to races such as this (just think of a shared page also mapped
    into an mm of a mem_cgroup other than the one just emptied). And they remain
    vulnerable until they're freed, perhaps indefinitely later.

    This patch just fixes the oops by moving the unlock_page_cgroups down below
    adding to and removing from the list (only possible given the previous patch);
    and while we're at it, we might as well make it an invariant that
    page->page_cgroup is always set while pc is on lru.

    But this behaviour of force_empty seems highly unsatisfactory to me: why have
    a ref_cnt if we always have to cope with it being violated (as in the earlier
    page migration patch). We may prefer force_empty to move pages to an orphan
    mem_cgroup (could be the root, but better not), from which other cgroups could
    recover them; we might need to reverse the locking again; but no time now for
    such concerns.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • As for force_empty, though this may not be the main topic here,
    mem_cgroup_force_empty_list() can be implemented more simply. It is possible to
    make the function just call mem_cgroup_uncharge_page() instead of releasing
    page_cgroups by itself. The tip is to call get_page() before invoking
    mem_cgroup_uncharge_page(), so the page won't be released during this
    function.
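
    Roughly (an illustrative sketch of that shape, not the exact patch):

    while (!list_empty(list)) {
            pc = list_entry(list->prev, struct page_cgroup, lru);
            page = pc->page;
            get_page(page);                  /* keep the page pinned...    */
            spin_unlock_irqrestore(&mz->lru_lock, flags);
            mem_cgroup_uncharge_page(page);  /* ...while it is uncharged   */
            put_page(page);
            spin_lock_irqsave(&mz->lru_lock, flags);
    }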

    Kamezawa-san points out that by the time mem_cgroup_uncharge_page() uncharges,
    the page might have been reassigned to an lru of a different mem_cgroup, and
    now be emptied from that; but Hugh claims that's okay, the end state is the
    same as when it hasn't gone to another list.

    And once force_empty stops taking lock_page_cgroup within mz->lru_lock,
    mem_cgroup_move_lists() can be simplified to take mz->lru_lock directly while
    holding page_cgroup lock (but still has to use try_lock_page_cgroup).

    Signed-off-by: Hirokazu Takahashi
    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hirokazu Takahashi
     
  • Ever since the VM_BUG_ON(page_get_page_cgroup(page)) (now Bad page state) went
    into page freeing, I've hit it from time to time in testing on some machines,
    sometimes only after many days. Recently found a machine which could usually
    produce it within a few hours, which got me there at last.

    The culprit is mem_cgroup_move_lists, whose locking is inadequate; and the
    arrangement of structures was such that you got page_cgroups from the lru list
    neatly put on to SLUB's freelist. Kamezawa-san identified the same hole
    independently.

    The main problem was that it was missing the lock_page_cgroup it needs to
    safely page_get_page_cgroup; but it's tricky to go beyond that too, and I
    couldn't do it with SLAB_DESTROY_BY_RCU as I'd expected. See the code for
    comments on the constraints.

    This patch immediately gets replaced by a simpler one from Hirokazu-san; but
    is it just foolish pride that tells me to put this one on record, in case we
    need to come back to it later?

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Cc: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • mem_cgroup_uncharge_page does css_put on the mem_cgroup before uncharging from
    it, and before removing page_cgroup from one of its lru lists: isn't there a
    danger that struct mem_cgroup memory could be freed and reused before
    completing that, so corrupting something? Never seen it, and for all I know
    there may be other constraints which make it impossible; but let's be
    defensive and reverse the ordering there.

    mem_cgroup_force_empty_list is safe because there's an extra css_get around
    all its works; but even so, change its ordering the same way round, to help
    get in the habit of doing it like this.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Cc: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Remove clear_page_cgroup: it's an unhelpful helper, see for example how
    mem_cgroup_uncharge_page had to unlock_page_cgroup just in order to call it
    (serious races from that? I'm not sure).

    Once that's gone, you can see it's pointless for page_cgroup's ref_cnt to be
    atomic: it's always manipulated under lock_page_cgroup, except where
    force_empty unilaterally reset it to 0 (and how does uncharge's
    atomic_dec_and_test protect against that?).

    Simplify this page_cgroup locking: if you've got the lock and the pc is
    attached, then the ref_cnt must be positive: VM_BUG_ONs to check that, and to
    check that pc->page matches page (we're on the way to finding why sometimes it
    doesn't, but this patch doesn't fix that).
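
    For example (a hedged sketch of the invariant being asserted):

    lock_page_cgroup(page);
    pc = page_get_page_cgroup(page);
    if (pc) {
            VM_BUG_ON(pc->page != page);     /* pc must point back at page */
            VM_BUG_ON(pc->ref_cnt <= 0);     /* attached implies a ref     */
    }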

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Cc: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • More cleanup to memcontrol.c, this time changing some of the code generated.
    Let the compiler decide what to inline (except for page_cgroup_locked which is
    only used when CONFIG_DEBUG_VM): the __always_inline on lock_page_cgroup etc.
    was quite a waste since bit_spin_lock etc. are inlines in a header file; made
    mem_cgroup_force_empty and mem_cgroup_write_strategy static.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Cc: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Sorry, before getting down to more important changes, I'd like to do some
    cleanup in memcontrol.c. This patch doesn't change the code generated, but
    cleans up whitespace, moves up a double declaration, removes an unused enum,
    removes void returns, removes misleading comments, that kind of thing.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Nothing uses mem_cgroup_uncharge apart from mem_cgroup_uncharge_page, (a
    trivial wrapper around it) and mem_cgroup_end_migration (which does the same
    as mem_cgroup_uncharge_page). And it often ends up having to lock just to let
    its caller unlock. Remove it (but leave the silly locking until a later
    patch).

    Moved mem_cgroup_cache_charge next to mem_cgroup_charge in memcontrol.h.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • My memcgroup patch to fix hang with shmem/tmpfs added NULL page handling to
    mem_cgroup_charge_common. It seemed convenient at the time, but hard to
    justify now: there's a perfectly appropriate swappage to charge and uncharge
    instead, this is not on any hot path through shmem_getpage, and no performance
    hit was observed from the slight extra overhead.

    So revert that NULL page handling from mem_cgroup_charge_common; and make it
    clearer by bringing page_cgroup_assign_new_page_cgroup into its body - that
    was a helper I found more of a hindrance to understanding.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Replace free_hot_cold_page's VM_BUG_ON(page_get_page_cgroup(page)) by a "Bad
    page state" and clear: most users don't have CONFIG_DEBUG_VM on, and if it
    were set here, it'd likely cause corruption when the page is reused.

    Don't use page_assign_page_cgroup to clear it: that should be private to
    memcontrol.c, and always called with the lock taken; and memmap_init_zone
    doesn't need it either - like page->mapping and other pointers throughout the
    kernel, Linux assumes pointers in zeroed structures are NULL pointers.

    Instead use page_reset_bad_cgroup, added to memcontrol.h for this only.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Page migration gave me free_hot_cold_page's VM_BUG_ON page->page_cgroup.
    remove_migration_pte was calling mem_cgroup_charge on the new page whenever it
    found a swap pte, before it had determined it to be a migration entry. That
    left a surplus reference count on the page_cgroup, so it was still attached
    when the page was later freed.

    Move that mem_cgroup_charge down to where we're sure it's a migration entry.
    We were already under i_mmap_lock or anon_vma->lock, so its GFP_KERNEL was
    already inappropriate: change that to GFP_ATOMIC.

    It's essential that remove_migration_pte removes all the migration entries,
    other crashes follow if not. So proceed even when the charge fails: normally
    it cannot, but after a mem_cgroup_force_empty it might - comment in the code.
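
    A hedged sketch of the resulting order (not the literal diff):

    entry = pte_to_swp_entry(pte);
    if (!is_migration_entry(entry) || migration_entry_to_page(entry) != old)
            goto out;                        /* not our migration entry    */
    /* Now we know it is one; charge, tolerating failure as noted above.   */
    mem_cgroup_charge(new, mm, GFP_ATOMIC);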

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Cc: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Don't uncharge when do_swap_page's call to do_wp_page fails: the page which
    was charged for is there in the pagetable, and will be correctly uncharged
    when that area is unmapped - it was only its COWing which failed.

    And while we're here, remove earlier XXX comment: yes, OR in do_wp_page's
    return value (maybe VM_FAULT_WRITE) with do_swap_page's there; but if it
    fails, mask out success bits, which might confuse some arches e.g. sparc.
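
    A hedged sketch of that logic (close to, but not asserted to be, the
    final code):

    if (write_access) {
            ret |= do_wp_page(mm, vma, address, page_table, pmd, ptl, pte);
            if (ret & VM_FAULT_ERROR)
                    ret &= VM_FAULT_ERROR;   /* drop any success bits      */
            goto out;
    }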

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
    There's nothing wrong with mem_cgroup_charge failure in do_wp_page and
    do_anonymous_page using __free_page, but it does look odd when nearby code
    uses page_cache_release: use that instead (while turning a blind eye to
    ancient inconsistencies of page_cache_release versus put_page).

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Each caller of mem_cgroup_move_lists is having to use page_get_page_cgroup:
    it's more convenient if it acts upon the page itself not the page_cgroup; and
    in a later patch this becomes important to handle within memcontrol.c.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • vm_match_cgroup is a perverse name for a macro to match mm with cgroup: rename
    it mm_match_cgroup, matching mm_init_cgroup and mm_free_cgroup.

    Signed-off-by: Hugh Dickins
    Acked-by: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Rename Memory Controller to Memory Resource Controller. Reflect the same
    changes in the CONFIG definition for the Memory Resource Controller. Group
    together the config options for Resource Counters and Memory Resource
    Controller.

    Signed-off-by: Balbir Singh
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Balbir Singh
     
  • Some oprofile results obtained while using tbench on a 2x2 cpu machine were
    very surprising.

    For example, the loopback_xmit() function was using a high number of cpu
    cycles to perform the statistics updates, which are supposed to be really
    cheap since they use percpu data:

    pcpu_lstats = netdev_priv(dev);
    lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
    lb_stats->packets++; /* HERE : serious contention */
    lb_stats->bytes += skb->len;

    struct pcpu_lstats is a small structure containing two longs. It appears
    that on my 32-bit platform, alloc_percpu(8) allocates a single cache line,
    instead of giving each cpu a separate cache line.

    Using the following patch gave me an impressive boost in various benchmarks
    (6% in tbench).
    (All percpu_counters hit this bug too.)

    A long-term fix (i.e. >= 2.6.26) would be to let each CPU allocate its own
    block of memory, so that we don't need to round up sizes to L1_CACHE_BYTES,
    or to merge the SGI stuff, of course...

    Note: SLUB vs SLAB is important here to *show* the improvement, since they
    don't have the same minimum allocation sizes (8 bytes vs 32 bytes). This
    could very well explain regressions some users reported when they switched
    to SLUB.
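
    The idea of the fix, sketched (a hedged illustration, not the patch
    itself): round each per-cpu object up to a full cache line, so that two
    cpus' copies can never share one.

    size = roundup(size, cache_line_size());  /* e.g. 8 bytes -> 32 or 64  */
    /* ... then hand out 'size' bytes per cpu as before ... */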

    Signed-off-by: Eric Dumazet
    Acked-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • jiffies subtraction may cause an overflow problem. It should be using
    time_after().
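
    A small userspace model of why (time_after() here is the usual kernel
    idiom, reproduced for illustration):

    #include <stdio.h>
    #include <limits.h>

    /* Compare via a signed difference so the answer stays correct even
     * when the (unsigned) jiffies counter wraps around. */
    #define time_after(a, b)  ((long)((b) - (a)) < 0)

    int main(void)
    {
            unsigned long then = ULONG_MAX - 5;  /* just before the wrap   */
            unsigned long now  = then + 10;      /* wrapped past zero      */

            printf("naive now > then     : %d\n", now > then);             /* 0, wrong */
            printf("time_after(now, then): %d\n", time_after(now, then));  /* 1, right */
            return 0;
    }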

    [akpm@linux-foundation.org: include jiffies.h]
    Signed-off-by: KOSAKI Motohiro
    Cc: Lee Schermerhorn
    Cc: Paul Jackson
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     

04 Mar, 2008

12 commits


24 Feb, 2008

3 commits

  • Cgroup requires the subsystem to return negative error code on error in the
    create method.
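
    For instance (a hedged sketch: the create method returns a pointer, so an
    allocation failure has to be reported with ERR_PTR() rather than NULL):

    mem = kzalloc(sizeof(*mem), GFP_KERNEL);
    if (mem == NULL)
            return ERR_PTR(-ENOMEM);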

    Signed-off-by: Li Zefan
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Remove this VM_BUG_ON(), as Balbir stated:

    We used to have a for loop with !list_empty() as a termination condition
    and VM_BUG_ON(!pc) is a spill over. With the new loop, VM_BUG_ON(!pc) does
    not make sense.

    Signed-off-by: Li Zefan
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • WARNING: vmlinux.o(.meminit.text+0x649):
    Section mismatch in reference from the
    function free_area_init_core() to the function .init.text:setup_usemap()
    The function __meminit free_area_init_core() references
    a function __init setup_usemap().
    If free_area_init_core is only used by setup_usemap then
    annotate free_area_init_core with a matching annotation.

    The warning covers this stack of functions in mm/page_alloc.c:

    alloc_bootmem_node must be marked __init.
    alloc_bootmem_node is used by setup_usemap, if !SPARSEMEM.
    (usemap_size is only used by setup_usemap, if !SPARSEMEM.)
    setup_usemap is only used by free_area_init_core.
    free_area_init_core is only used by free_area_init_node.

    free_area_init_node is used by:
    arch/alpha/mm/numa.c: __init paging_init()
    arch/arm/mm/init.c: __init bootmem_init_node()
    arch/avr32/mm/init.c: __init paging_init()
    arch/cris/arch-v10/mm/init.c: __init paging_init()
    arch/cris/arch-v32/mm/init.c: __init paging_init()
    arch/m32r/mm/discontig.c: __init zone_sizes_init()
    arch/m32r/mm/init.c: __init zone_sizes_init()
    arch/m68k/mm/motorola.c: __init paging_init()
    arch/m68k/mm/sun3mmu.c: __init paging_init()
    arch/mips/sgi-ip27/ip27-memory.c: __init paging_init()
    arch/parisc/mm/init.c: __init paging_init()
    arch/sparc/mm/srmmu.c: __init srmmu_paging_init()
    arch/sparc/mm/sun4c.c: __init sun4c_paging_init()
    arch/sparc64/mm/init.c: __init paging_init()
    mm/page_alloc.c: __init free_area_init_nodes()
    mm/page_alloc.c: __init free_area_init()
    and
    mm/memory_hotplug.c: hotadd_new_pgdat()

    hotadd_new_pgdat cannot be an __init function, but:

    It is compiled for MEMORY_HOTPLUG configurations only
    MEMORY_HOTPLUG depends on SPARSEMEM || X86_64_ACPI_NUMA
    X86_64_ACPI_NUMA depends on X86_64
    ARCH_FLATMEM_ENABLE depends on X86_32
    ARCH_DISCONTIGMEM_ENABLE depends on X86_32
    So X86_64_ACPI_NUMA implies SPARSEMEM, right?

    So we can mark the stack of functions __init for !SPARSEMEM, but we must mark
    them __meminit for SPARSEMEM configurations. This is ok, because then the
    calls to alloc_bootmem_node are also avoided.
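
    One way to express that (a hedged sketch; the helper-macro name is an
    assumption here):

    /* __init when everything ends up in bootmem code (!SPARSEMEM),
     * __meminit when memory hotplug may call in later (SPARSEMEM)  */
    #ifdef CONFIG_SPARSEMEM
    #define __paginginit __meminit
    #else
    #define __paginginit __init
    #endif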

    Compile-tested on:
    silly minimal config
    defconfig x86_32
    defconfig x86_64
    defconfig x86_64 -HIBERNATION +MEMORY_HOTPLUG

    Signed-off-by: Alexander van Heukelum
    Reviewed-by: Sam Ravnborg
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander van Heukelum