27 Oct, 2010

1 commit

  • Use the new {max,min}3 macros to save some cycles and bytes on the stack.
    This patch substitutes trivial nested min()/max() invocations with their
    three-argument counterparts.
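
    A minimal sketch of the kind of substitution involved (the variable
    names are illustrative):

    /* before: nested pairwise comparisons */
    len = max(a, max(b, c));

    /* after: the three-argument helper from <linux/kernel.h> */
    len = max3(a, b, c);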

    Signed-off-by: Hagen Paul Pfeifer
    Cc: Joe Perches
    Cc: Ingo Molnar
    Cc: Hartley Sweeten
    Cc: Russell King
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Herbert Xu
    Cc: Roland Dreier
    Cc: Sean Hefty
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hagen Paul Pfeifer
     

09 Aug, 2010

1 commit

  • This patch fixes alignment of slab objects in case CONFIG_DEBUG_PAGEALLOC is
    active.
    Before this spot in kmem_cache_create, we have this situation:
    - align contains the required alignment of the object
    - cachep->obj_offset is 0 or equals align in case of CONFIG_DEBUG_SLAB
    - size equals the size of the object, or object plus trailing redzone in case
    of CONFIG_DEBUG_SLAB

    This spot tries to fill one page per object if the object is within certain
    size limits; however, setting obj_offset to PAGE_SIZE - size breaks the
    object alignment, since size may not be a multiple of the required alignment.
    This patch simply adds an ALIGN(size, align) to the equation and fixes the
    object size detection accordingly.

    This code in drivers/s390/cio/qdio_setup_init has led to incorrectly aligned
    slab objects (sizeof(struct qdio_q) equals 1792):
    qdio_q_cache = kmem_cache_create("qdio_q", sizeof(struct qdio_q),
    256, 0, NULL);
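
    A sketch of the idea behind the fix (not the verbatim diff): when padding
    the object up to one page, account for the required alignment:

    /* was: cachep->obj_offset += PAGE_SIZE - size; */
    cachep->obj_offset += PAGE_SIZE - ALIGN(size, align);
    size = PAGE_SIZE;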

    Acked-by: Christoph Lameter
    Signed-off-by: Carsten Otte
    Signed-off-by: Pekka Enberg

    Carsten Otte
     

07 Aug, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
    slub: Allow removal of slab caches during boot
    Revert "slub: Allow removal of slab caches during boot"
    slub numa: Fix rare allocation from unexpected node
    slab: use deferable timers for its periodic housekeeping
    slub: Use kmem_cache flags to detect if slab is in debugging mode.
    slub: Allow removal of slab caches during boot
    slub: Check kasprintf results in kmem_cache_init()
    SLUB: Constants need UL
    slub: Use a constant for a unspecified node.
    SLOB: Free objects to their own list
    slab: fix caller tracking on !CONFIG_DEBUG_SLAB && CONFIG_TRACING

    Linus Torvalds
     

20 Jul, 2010

1 commit

  • slab has a "once every 2 seconds" timer for its housekeeping.
    As the number of logical processors grows, it's more and more
    common that this 2-second timer becomes the primary wakeup source.

    This patch turns this housekeeping timer into a deferrable timer,
    which means that the timer does not interrupt idle, but just runs
    at the next event that wakes the cpu up.

    The impact is that the timer likely runs a bit later, but during the
    delay no code is running, so there is little reason for the housekeeping
    to behave differently because of the delay.
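
    A sketch of the mechanics using the workqueue API of that era (the
    function and variable names here are illustrative):

    static struct delayed_work reap_work;

    static void cache_reap_fn(struct work_struct *w)
    {
            /* ... do the periodic housekeeping, then re-arm ... */
            schedule_delayed_work(&reap_work, round_jiffies_relative(2 * HZ));
    }

    static void start_housekeeping(void)
    {
            /* Deferrable: does not wake an idle cpu; the work runs on
             * the next event that wakes the cpu anyway. */
            INIT_DELAYED_WORK_DEFERRABLE(&reap_work, cache_reap_fn);
            schedule_delayed_work(&reap_work, round_jiffies_relative(2 * HZ));
    }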

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Pekka Enberg

    Arjan van de Ven
     

09 Jun, 2010

1 commit

  • We have been resisting new ftrace plugins and removing existing
    ones, and kmemtrace has been superseded by kmem trace events
    and perf-kmem, so we remove it.

    Signed-off-by: Li Zefan
    Acked-by: Pekka Enberg
    Acked-by: Eduard - Gabriel Munteanu
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    [ remove kmemtrace from the makefile, handle slob too ]
    Signed-off-by: Frederic Weisbecker

    Li Zefan
     

28 May, 2010

3 commits

  • Example usage of generic "numa_mem_id()":

    The mainline slab code, since ~2.6.19, does not handle memoryless nodes
    well. Specifically, the "fast path"--____cache_alloc()--will never
    succeed, as slab doesn't cache off-node objects on the per-cpu queues, and
    for memoryless nodes, all memory will be "off node" relative to
    numa_node_id(). This adds significant overhead to all kmem cache
    allocations, incurring a significant regression relative to earlier
    kernels [from before slab.c was reorganized].

    This patch uses the generic topology function "numa_mem_id()" to return
    the "effective local memory node" for the calling context. This is the
    first node in the local node's generic fallback zonelist-- the same node
    that "local" mempolicy-based allocations would use. This lets slab cache
    these "local" allocations and avoid fallback/refill on every allocation.
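
    A sketch of the difference for a caller that picks an allocation node:

    /* on a memoryless node, this node has no pages of its own */
    int cpu_node = numa_node_id();

    /* nearest node that actually has memory: the first node in the
     * local node's generic fallback zonelist */
    int mem_node = numa_mem_id();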

    N.B.: Slab will need to handle node and memory hotplug events that could
    change the value returned by numa_mem_id() for any given node, if recent
    changes to address memory hotplug don't already handle this. E.g., flush
    all per-cpu slab queues before rebuilding the zonelists while the
    "machine" is held in the stopped state.

    Performance impact on "hackbench 400 process 200"

    2.6.34-rc3-mmotm-100405-1609             no-patch    this-patch
    ia64 no memoryless nodes [avg of 10]:      11.713        11.637    ~0.65 diff
    ia64 cpus all on memless nodes [10]:      228.259        26.484    ~8.6x speedup

    The slowdown of the patched kernel from ~12 sec to ~28 seconds when
    configured with memoryless nodes is the result of all cpus allocating from
    a single node's mm pagepool. The cache lines of the single node are
    distributed/interleaved over the memory of the real physical nodes, but
    the zone lock, list heads, ... of the single node with memory still each
    live in a single cache line that is accessed from all processors.

    x86_64 [8x6 AMD] [avg of 40]: 2.883 2.845

    Signed-off-by: Lee Schermerhorn
    Cc: Tejun Heo
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Cc: Nick Piggin
    Cc: David Rientjes
    Cc: Eric Whitney
    Cc: KAMEZAWA Hiroyuki
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: "Luck, Tony"
    Cc: Pekka Enberg
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • With the previous modification, the cpu notifier can return an encapsulated
    errno value. This converts the cpu notifiers for slab accordingly.
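
    A sketch of the conversion (the callback name follows mm/slab.c; the
    body is elided):

    static int cpuup_callback(struct notifier_block *nfb,
                              unsigned long action, void *hcpu)
    {
            int err = 0;
            /* ... */
            /* was: return err ? NOTIFY_BAD : NOTIFY_OK; */
            return notifier_from_errno(err);
    }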

    Signed-off-by: Akinobu Mita
    Cc: Christoph Lameter
    Acked-by: Pekka Enberg
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • We have observed several workloads running on multi-node systems where
    memory is assigned unevenly across the nodes in the system. There are
    numerous reasons for this but one is the round-robin rotor in
    cpuset_mem_spread_node().

    For example, a simple test that writes a multi-page file will allocate
    pages on nodes 0 2 4 6 ... Odd nodes are skipped. (Sometimes it
    allocates on odd nodes & skips even nodes).

    An example is shown below. The program "lfile" writes a file consisting
    of 10 pages. The program then mmaps the file & uses get_mempolicy(...,
    MPOL_F_NODE) to determine the nodes where the file pages were allocated.
    The output is shown below:

    # ./lfile
    allocated on nodes: 2 4 6 0 1 2 6 0 2

    There is a single rotor that is used for allocating both file pages & slab
    pages. Writing the file allocates both a data page & a slab page
    (buffer_head). This advances the RR rotor 2 nodes for each page
    allocated.

    A quick test seems to confirm this is the cause of the uneven
    allocation:

    # echo 0 >/dev/cpuset/memory_spread_slab
    # ./lfile
    allocated on nodes: 6 7 8 9 0 1 2 3 4 5

    This patch introduces a second rotor that is used for slab allocations.
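
    A sketch of the idea, assuming per-task rotor fields and a hypothetical
    next_spread_node() helper that advances a rotor through mems_allowed:

    /* one rotor for data (page cache) pages ... */
    int cpuset_mem_spread_node(void)
    {
            return next_spread_node(&current->cpuset_mem_spread_rotor);
    }

    /* ... and a separate one for slab pages, so the two kinds of
     * allocation no longer advance a shared rotor two nodes per page */
    int cpuset_slab_spread_node(void)
    {
            return next_spread_node(&current->cpuset_slab_spread_rotor);
    }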

    Signed-off-by: Jack Steiner
    Acked-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Paul Menage
    Cc: Jack Steiner
    Cc: Robin Holt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jack Steiner
     

25 May, 2010

1 commit

  • Before applying this patch, cpuset updates task->mems_allowed and
    mempolicy by setting all new bits in the nodemask first, and clearing all
    old unallowed bits later. But along the way, the allocator may find that
    there is no node from which to allocate memory.

    The reason is that when cpuset rebinds the task's mempolicy, it clears the
    nodes which the allocator can allocate pages on; for example:

    (mpol: mempolicy)
    task1                      task1's mpol    task2
    alloc page                 1
      alloc on node0? NO       1
                               1               change mems from 1 to 0
                               1               rebind task1's mpol
                               0-1               set new bits
                               0                 clear disallowed bits
      alloc on node1? NO       0
      ...
    can't alloc page
      goto oom

    This patch fixes the problem by expanding the node range first (setting
    newly allowed bits) and shrinking it lazily (clearing newly disallowed
    bits). We use a variable to tell the write-side task that a read-side task
    is reading the nodemask, and the write-side task clears newly disallowed
    nodes after the read-side task finishes its current memory allocation.
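
    A sketch of the two-step update using the real nodemask helpers
    (synchronization with the read side is elided):

    /* step 1: grow -- newly allowed nodes become visible first */
    nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);

    /* ... wait until no read-side task is inside an allocation ... */

    /* step 2: shrink -- only now drop the newly disallowed nodes */
    nodes_and(tsk->mems_allowed, tsk->mems_allowed, *newmems);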

    [akpm@linux-foundation.org: fix spello]
    Signed-off-by: Miao Xie
    Cc: David Rientjes
    Cc: Nick Piggin
    Cc: Paul Menage
    Cc: Lee Schermerhorn
    Cc: Hugh Dickins
    Cc: Ravikiran Thirumalai
    Cc: KOSAKI Motohiro
    Cc: Christoph Lameter
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
     

15 Apr, 2010

1 commit

  • Even with SLAB_RED_ZONE and SLAB_STORE_USER enabled, the kernel would NOT
    store redzone and last-user data around the allocated memory space if the
    arch cache line size is greater than sizeof(unsigned long long). As a
    result, last-user information is unexpectedly MISSING when dumping the
    slab corruption log.

    This fix makes sure that the redzone and last-user tags get stored unless
    the required alignment would break the redzone's own alignment.
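
    For reference, a rough picture of the debug layout around each object
    (a sketch, not the exact mm/slab.c diagram):

    /*
     * | RED_ZONE | object ... | RED_ZONE | last user (caller address) |
     *
     * obj_offset points past the leading red zone; this fix keeps the
     * trailing tags even when the cache line size exceeds
     * sizeof(unsigned long long), as long as the required alignment
     * still holds.
     */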

    Signed-off-by: Shiyong Li
    Signed-off-by: Pekka Enberg

    Shiyong Li
     

10 Apr, 2010

1 commit

  • As suggested by Linus, introduce a kern_ptr_validate() helper that does some
    sanity checks to make sure a pointer is a valid kernel pointer. This is a
    preparatory step for fixing SLUB's kmem_ptr_validate().
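
    A sketch of the checks such a helper performs (approximate, not the
    verbatim mm/util.c code):

    int kern_ptr_validate(const void *ptr, unsigned long size)
    {
            unsigned long addr = (unsigned long)ptr;

            if (addr < PAGE_OFFSET)                       /* not a kernel address */
                    return 0;
            if (addr > (unsigned long)high_memory - size) /* beyond lowmem */
                    return 0;
            if (addr & (sizeof(void *) - 1))              /* misaligned */
                    return 0;
            if (!kern_addr_valid(addr) || !kern_addr_valid(addr + size - 1))
                    return 0;
            return 1;
    }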

    Cc: Andrew Morton
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Ingo Molnar
    Cc: Matt Mackall
    Cc: Nick Piggin
    Signed-off-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Pekka Enberg
     

08 Apr, 2010

1 commit

  • Slab lacks any memory hotplug support for nodes that are hotplugged
    without cpus being hotplugged. This is possible at least on x86
    CONFIG_MEMORY_HOTPLUG_SPARSE kernels where SRAT entries are marked
    ACPI_SRAT_MEM_HOT_PLUGGABLE and the regions of RAM represent a separate
    node. It can also be done manually by writing the start address to
    /sys/devices/system/memory/probe for kernels that have
    CONFIG_ARCH_MEMORY_PROBE set, which is how this patch was tested, and
    then onlining the new memory region.

    When a node is hotadded, a nodelist for that node is allocated and
    initialized for each slab cache. If this isn't completed due to a lack
    of memory, the hotadd is aborted: we have a reasonable expectation that
    kmalloc_node(nid) will work for all caches if nid is online and memory is
    available.

    Since nodelists must be allocated and initialized prior to the new node's
    memory actually being online, the struct kmem_list3 is allocated off-node
    due to kmalloc_node()'s fallback.

    When an entire node would be offlined, its nodelists are subsequently
    drained. If slab objects still exist and cannot be freed, the offline is
    aborted. It is possible that objects will be allocated between this
    drain and page isolation, however, so the offline may still fail.
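
    A sketch of the shape of such a memory hotplug callback (the two helper
    names are illustrative):

    static int slab_memory_callback(struct notifier_block *self,
                                    unsigned long action, void *arg)
    {
            struct memory_notify *mnb = arg;
            int nid = mnb->status_change_nid;
            int ret = 0;

            switch (action) {
            case MEM_GOING_ONLINE:
                    /* allocate and init nodelists; failure aborts the hotadd */
                    ret = init_cache_nodelists_node(nid);   /* illustrative */
                    break;
            case MEM_GOING_OFFLINE:
                    /* drain; unfreeable objects abort the offline */
                    ret = drain_cache_nodelists_node(nid);  /* illustrative */
                    break;
            }
            return notifier_from_errno(ret);
    }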

    Acked-by: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Pekka Enberg

    David Rientjes
     

27 Feb, 2010

1 commit

  • This patch allows injecting faults only for specific slabs.
    To preserve the default behavior, the cache filter is off by
    default (all caches are faulty).

    One may define specific set of slabs like this:
    # mark skbuff_head_cache as faulty
    echo 1 > /sys/kernel/slab/skbuff_head_cache/failslab
    # Turn on cache filter (off by default)
    echo 1 > /sys/kernel/debug/failslab/cache-filter
    # Turn on fault injection
    echo 1 > /sys/kernel/debug/failslab/times
    echo 1 > /sys/kernel/debug/failslab/probability

    Acked-by: David Rientjes
    Acked-by: Akinobu Mita
    Acked-by: Christoph Lameter
    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Pekka Enberg

    Dmitry Monakhov
     

30 Jan, 2010

1 commit

  • When factoring common code into transfer_objects in commit 3ded175 ("slab: add
    transfer_objects() function"), the 'touched' logic got a bit broken. When
    refilling from the shared array (taking objects from the shared array), we are
    making use of the shared array so it should be marked as touched.

    Subsequently pulling an element from the cpu array and allocating it should
    also touch the cpu array, but that is taken care of after the alloc_done label.
    (So yes, the cpu array was getting touched = 1 twice).

    So revert this logic to how it worked in earlier kernels.

    This also affects the behaviour in __drain_alien_cache, which would previously
    'touch' the shared array and now does not. I think it is more logical not to
    touch there, because we are pushing objects into the shared array rather than
    pulling them off. So there is no good reason to postpone reaping them -- if the
    shared array is getting utilized, then it will get 'touched' in the alloc path
    (where this patch now restores the touch).
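
    A sketch of the restored refill path (shape only, not the verbatim diff):

    /* cache_alloc_refill(): objects come out of the shared array, so the
     * *shared* array is the one being used and gets marked */
    if (shared_array->avail) {
            if (transfer_objects(ac, shared_array, batchcount)) {
                    shared_array->touched = 1;  /* restored by this patch */
                    goto alloc_done;
            }
    }
    /* the cpu array's own touched = 1 happens after the alloc_done label */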

    Acked-by: Christoph Lameter
    Signed-off-by: Nick Piggin
    Signed-off-by: Pekka Enberg

    Nick Piggin
     

29 Dec, 2009

1 commit

  • Commit ce79ddc8e2376a9a93c7d42daf89bfcbb9187e62 ("SLAB: Fix lockdep annotations
    for CPU hotplug") broke init_node_lock_keys() off-slab logic which causes
    lockdep false positives.

    Fix that up by reverting the logic back to original while keeping CPU hotplug
    fixes intact.

    Reported-and-tested-by: Heiko Carstens
    Reported-and-tested-by: Andi Kleen
    Signed-off-by: Pekka Enberg

    Pekka Enberg
     

18 Dec, 2009

2 commits

  • …/rusty/linux-2.6-for-linus

    * 'cpumask-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
    cpumask: rename tsk_cpumask to tsk_cpus_allowed
    cpumask: don't recommend set_cpus_allowed hack in Documentation/cpu-hotplug.txt
    cpumask: avoid dereferencing struct cpumask
    cpumask: convert drivers/idle/i7300_idle.c to cpumask_var_t
    cpumask: use modern cpumask style in drivers/scsi/fcoe/fcoe.c
    cpumask: avoid deprecated function in mm/slab.c
    cpumask: use cpu_online in kernel/perf_event.c

    Linus Torvalds
     
  • * 'kmemleak' of git://linux-arm.org/linux-2.6:
    kmemleak: fix kconfig for crc32 build error
    kmemleak: Reduce the false positives by checking for modified objects
    kmemleak: Show the age of an unreferenced object
    kmemleak: Release the object lock before calling put_object()
    kmemleak: Scan the _ftrace_events section in modules
    kmemleak: Simplify the kmemleak_scan_area() function prototype
    kmemleak: Do not use off-slab management with SLAB_NOLEAKTRACE

    Linus Torvalds
     

15 Dec, 2009

2 commits

  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf sched: Fix build failure on sparc
    perf bench: Add "all" pseudo subsystem and "all" pseudo suite
    perf tools: Introduce perf_session class
    perf symbols: Ditch dso->find_symbol
    perf symbols: Allow lookups by symbol name too
    perf symbols: Add missing "Variables" entry to map_type__name
    perf symbols: Add support for 'variable' symtabs
    perf symbols: Introduce ELF counterparts to symbol_type__is_a
    perf symbols: Introduce symbol_type__is_a
    perf symbols: Rename kthreads to kmaps, using another abstraction for it
    perf tools: Allow building for ARM
    hw-breakpoints: Handle bad modify_user_hw_breakpoint off-case return value
    perf tools: Allow cross compiling
    tracing, slab: Fix no callsite ifndef CONFIG_KMEMTRACE
    tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACING

    Trivial conflict due to different fixes to modify_user_hw_breakpoint()
    in include/linux/hw_breakpoint.h

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
    m68k: rename global variable vmalloc_end to m68k_vmalloc_end
    percpu: add missing per_cpu_ptr_to_phys() definition for UP
    percpu: Fix kdump failure if booted with percpu_alloc=page
    percpu: make misc percpu symbols unique
    percpu: make percpu symbols in ia64 unique
    percpu: make percpu symbols in powerpc unique
    percpu: make percpu symbols in x86 unique
    percpu: make percpu symbols in xen unique
    percpu: make percpu symbols in cpufreq unique
    percpu: make percpu symbols in oprofile unique
    percpu: make percpu symbols in tracer unique
    percpu: make percpu symbols under kernel/ and mm/ unique
    percpu: remove some sparse warnings
    percpu: make alloc_percpu() handle array types
    vmalloc: fix use of non-existent percpu variable in put_cpu_var()
    this_cpu: Use this_cpu_xx in trace_functions_graph.c
    this_cpu: Use this_cpu_xx for ftrace
    this_cpu: Use this_cpu_xx in nmi handling
    this_cpu: Use this_cpu operations in RCU
    this_cpu: Use this_cpu ops for VM statistics
    ...

    Fix up trivial (famous last words) global per-cpu naming conflicts in
    arch/x86/kvm/svm.c
    mm/slab.c

    Linus Torvalds
     

11 Dec, 2009

2 commits

  • For slab, if CONFIG_KMEMTRACE and CONFIG_DEBUG_SLAB are not set,
    __do_kmalloc() will not track callers:

    # ./perf record -f -a -R -e kmem:kmalloc
    ^C
    # ./perf trace
    ...
    perf-2204 [000] 147.376774: kmalloc: call_site=c0529d2d ...
    perf-2204 [000] 147.400997: kmalloc: call_site=c0529d2d ...
    Xorg-1461 [001] 147.405413: kmalloc: call_site=0 ...
    Xorg-1461 [001] 147.405609: kmalloc: call_site=0 ...
    konsole-1776 [001] 147.405786: kmalloc: call_site=0 ...
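
    A sketch of the fix's shape: let CONFIG_TRACING, not just slab
    debugging, select the caller-tracking variant (approximate):

    void *__kmalloc(size_t size, gfp_t flags)
    {
    /* was: #ifdef CONFIG_DEBUG_SLAB */
    #if defined(CONFIG_DEBUG_SLAB) || defined(CONFIG_TRACING)
            return __do_kmalloc(size, flags, __builtin_return_address(0));
    #else
            return __do_kmalloc(size, flags, NULL);
    #endif
    }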

    Signed-off-by: Li Zefan
    Reviewed-by: Pekka Enberg
    Cc: Christoph Lameter
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: linux-mm@kvack.org
    Cc: Eduard - Gabriel Munteanu
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • Define kmem_trace_alloc_{,node}_notrace() if CONFIG_TRACING is
    enabled, otherwise perf-kmem will show wrong stats ifndef
    CONFIG_KMEMTRACE, because a kmalloc() memory allocation may
    be traced by both trace_kmalloc() and trace_kmem_cache_alloc().
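
    A sketch of the resulting header shape (approximate):

    #ifdef CONFIG_TRACING
    extern void *kmem_cache_alloc_notrace(struct kmem_cache *cachep,
                                          gfp_t flags);
    #else
    static __always_inline void *
    kmem_cache_alloc_notrace(struct kmem_cache *cachep, gfp_t flags)
    {
            return kmem_cache_alloc(cachep, flags);
    }
    #endif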

    Signed-off-by: Li Zefan
    Reviewed-by: Pekka Enberg
    Cc: Christoph Lameter
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: linux-mm@kvack.org
    Cc: Eduard - Gabriel Munteanu
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     

01 Dec, 2009

1 commit

  • As reported by Paul McKenney:

    I am seeing some lockdep complaints in rcutorture runs that include
    frequent CPU-hotplug operations. The tests are otherwise successful.
    My first thought was to send a patch that gave each array_cache
    structure's ->lock field its own struct lock_class_key, but you already
    have a init_lock_keys() that seems to be intended to deal with this.

    ------------------------------------------------------------------------

    =============================================
    [ INFO: possible recursive locking detected ]
    2.6.32-rc4-autokern1 #1
    ---------------------------------------------
    syslogd/2908 is trying to acquire lock:
    (&nc->lock){..-...}, at: [] .kmem_cache_free+0x118/0x2d4

    but task is already holding lock:
    (&nc->lock){..-...}, at: [] .kfree+0x1f0/0x324

    other info that might help us debug this:
    3 locks held by syslogd/2908:
    #0: (&u->readlock){+.+.+.}, at: [] .unix_dgram_recvmsg+0x70/0x338
    #1: (&nc->lock){..-...}, at: [] .kfree+0x1f0/0x324
    #2: (&parent->list_lock){-.-...}, at: [] .__drain_alien_cache+0x50/0xb8

    stack backtrace:
    Call Trace:
    [c0000000e8ccafc0] [c0000000000101e4] .show_stack+0x70/0x184 (unreliable)
    [c0000000e8ccb070] [c0000000000afebc] .validate_chain+0x6ec/0xf58
    [c0000000e8ccb180] [c0000000000b0ff0] .__lock_acquire+0x8c8/0x974
    [c0000000e8ccb280] [c0000000000b2290] .lock_acquire+0x140/0x18c
    [c0000000e8ccb350] [c000000000468df0] ._spin_lock+0x48/0x70
    [c0000000e8ccb3e0] [c0000000001407f4] .kmem_cache_free+0x118/0x2d4
    [c0000000e8ccb4a0] [c000000000140b90] .free_block+0x130/0x1a8
    [c0000000e8ccb540] [c000000000140f94] .__drain_alien_cache+0x80/0xb8
    [c0000000e8ccb5e0] [c0000000001411e0] .kfree+0x214/0x324
    [c0000000e8ccb6a0] [c0000000003ca860] .skb_release_data+0xe8/0x104
    [c0000000e8ccb730] [c0000000003ca2ec] .__kfree_skb+0x20/0xd4
    [c0000000e8ccb7b0] [c0000000003cf2c8] .skb_free_datagram+0x1c/0x5c
    [c0000000e8ccb830] [c00000000045597c] .unix_dgram_recvmsg+0x2f4/0x338
    [c0000000e8ccb920] [c0000000003c0f14] .sock_recvmsg+0xf4/0x13c
    [c0000000e8ccbb30] [c0000000003c28ec] .SyS_recvfrom+0xb4/0x130
    [c0000000e8ccbcb0] [c0000000003bfb78] .sys_recv+0x18/0x2c
    [c0000000e8ccbd20] [c0000000003ed388] .compat_sys_recv+0x14/0x28
    [c0000000e8ccbd90] [c0000000003ee1bc] .compat_sys_socketcall+0x178/0x220
    [c0000000e8ccbe30] [c0000000000085d4] syscall_exit+0x0/0x40

    This patch fixes the issue by setting up lockdep annotations during CPU
    hotplug.
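
    A sketch of the annotation: give the slab list and alien-cache locks
    their own lock classes so lockdep can tell nested acquisitions apart
    (the key names follow mm/slab.c of that era; the wrapper name is
    illustrative and the iteration over caches and nodes is elided):

    static struct lock_class_key on_slab_l3_key;
    static struct lock_class_key on_slab_alc_key;

    static void slab_set_lock_classes(struct kmem_list3 *l3,
                                      struct array_cache *alc)
    {
            lockdep_set_class(&l3->list_lock, &on_slab_l3_key);
            lockdep_set_class(&alc->lock, &on_slab_alc_key);
    }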

    Reported-by: Paul E. McKenney
    Tested-by: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Pekka Enberg
     

29 Oct, 2009

1 commit

  • This patch updates percpu-related symbols under kernel/ and mm/ such
    that percpu symbols are unique and don't clash with local symbols.
    This serves the two purposes of decreasing the possibility of global
    percpu symbol collisions and allowing the per_cpu__ prefix to be
    dropped from percpu symbols.

    * kernel/lockdep.c: s/lock_stats/cpu_lock_stats/

    * kernel/sched.c: s/init_rq_rt/init_rt_rq_var/ (any better idea?)
    s/sched_group_cpus/sched_groups/

    * kernel/softirq.c: s/ksoftirqd/run_ksoftirqd/

    * kernel/softlockup.c: s/(*)_timestamp/softlockup_\1_ts/
    s/watchdog_task/softlockup_watchdog/
    s/timestamp/ts/ for local variables

    * kernel/time/timer_stats: s/lookup_lock/tstats_lookup_lock/

    * mm/slab.c: s/reap_work/slab_reap_work/
    s/reap_node/slab_reap_node/

    * mm/vmstat.c: local variable changed to avoid collision with vmstat_work

    Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
    which cause name clashes" patch.
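
    For instance, the mm/slab.c entries amount to (a sketch of the rename):

    /* before: would clash once the per_cpu__ prefix is dropped */
    static DEFINE_PER_CPU(struct delayed_work, reap_work);

    /* after: unique, namespaced symbol */
    static DEFINE_PER_CPU(struct delayed_work, slab_reap_work);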

    Signed-off-by: Tejun Heo
    Acked-by: (slab/vmstat) Christoph Lameter
    Reviewed-by: Christoph Lameter
    Cc: Rusty Russell
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Nick Piggin

    Tejun Heo
     

22 Sep, 2009

1 commit

  • Sizing of memory allocations shouldn't depend on the number of physical
    pages found in a system, as that generally includes (perhaps a huge amount
    of) non-RAM pages. The amount of memory actually usable as storage
    should instead be used as the basis here.

    Some of the calculations (i.e. those not intending to use high memory)
    should likely even use (totalram_pages - totalhigh_pages).
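
    A sketch of the substitution this implies (the shift name is
    illustrative):

    /* before: num_physpages counts non-RAM pages too */
    hash_size = num_physpages >> some_shift;

    /* after: base the sizing on usable RAM */
    hash_size = totalram_pages >> some_shift;

    /* or, for consumers that don't use high memory */
    hash_size = (totalram_pages - totalhigh_pages) >> some_shift;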

    Signed-off-by: Jan Beulich
    Acked-by: Rusty Russell
    Acked-by: Ingo Molnar
    Cc: Dave Airlie
    Cc: Kyle McMartin
    Cc: Jeremy Fitzhardinge
    Cc: Pekka Enberg
    Cc: Hugh Dickins
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     

29 Jun, 2009

1 commit

  • Commit 8429db5... ("slab: setup cpu caches later on when interrupts are
    enabled") broke mm/slab.c lockdep annotations:

    [ 11.554715] =============================================
    [ 11.555249] [ INFO: possible recursive locking detected ]
    [ 11.555560] 2.6.31-rc1 #896
    [ 11.555861] ---------------------------------------------
    [ 11.556127] udevd/1899 is trying to acquire lock:
    [ 11.556436] (&nc->lock){-.-...}, at: [] kmem_cache_free+0xcd/0x25b
    [ 11.557101]
    [ 11.557102] but task is already holding lock:
    [ 11.557706] (&nc->lock){-.-...}, at: [] kfree+0x137/0x292
    [ 11.558109]
    [ 11.558109] other info that might help us debug this:
    [ 11.558720] 2 locks held by udevd/1899:
    [ 11.558983] #0: (&nc->lock){-.-...}, at: [] kfree+0x137/0x292
    [ 11.559734] #1: (&parent->list_lock){-.-...}, at: [] __drain_alien_cache+0x3b/0xbd
    [ 11.560442]
    [ 11.560443] stack backtrace:
    [ 11.561009] Pid: 1899, comm: udevd Not tainted 2.6.31-rc1 #896
    [ 11.561276] Call Trace:
    [ 11.561632] [] __lock_acquire+0x15ec/0x168f
    [ 11.561901] [] ? __lock_acquire+0x1676/0x168f
    [ 11.562171] [] ? trace_hardirqs_on_caller+0x113/0x13e
    [ 11.562490] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [ 11.562807] [] lock_acquire+0xc1/0xe5
    [ 11.563073] [] ? kmem_cache_free+0xcd/0x25b
    [ 11.563385] [] _spin_lock+0x31/0x66
    [ 11.563696] [] ? kmem_cache_free+0xcd/0x25b
    [ 11.563964] [] kmem_cache_free+0xcd/0x25b
    [ 11.564235] [] ? __free_pages+0x1b/0x24
    [ 11.564551] [] slab_destroy+0x57/0x5c
    [ 11.564860] [] free_block+0xd8/0x123
    [ 11.565126] [] __drain_alien_cache+0xa2/0xbd
    [ 11.565441] [] kfree+0x14c/0x292
    [ 11.565752] [] skb_release_data+0xc6/0xcb
    [ 11.566020] [] __kfree_skb+0x19/0x86
    [ 11.566286] [] consume_skb+0x2b/0x2d
    [ 11.566631] [] skb_free_datagram+0x14/0x3a
    [ 11.566901] [] netlink_recvmsg+0x164/0x258
    [ 11.567170] [] sock_recvmsg+0xe5/0xfe
    [ 11.567486] [] ? might_fault+0xaf/0xb1
    [ 11.567802] [] ? autoremove_wake_function+0x0/0x38
    [ 11.568073] [] ? core_sys_select+0x3d/0x2b4
    [ 11.568378] [] ? __lock_acquire+0x1676/0x168f
    [ 11.568693] [] ? sockfd_lookup_light+0x1b/0x54
    [ 11.568961] [] sys_recvfrom+0xa3/0xf8
    [ 11.569228] [] ? trace_hardirqs_on+0xd/0xf
    [ 11.569546] [] system_call_fastpath+0x16/0x1b#

    Fix that up.

    Closes-bug: http://bugzilla.kernel.org/show_bug.cgi?id=13654
    Tested-by: Venkatesh Pallipadi
    Signed-off-by: Pekka Enberg

    Pekka Enberg
     

26 Jun, 2009

1 commit