23 Nov, 2013

1 commit

  • Pull SLAB changes from Pekka Enberg:
    "The patches from Joonsoo Kim switch mm/slab.c to use 'struct page' for
    slab internals similar to mm/slub.c. This reduces memory usage and
    improves performance:

    https://lkml.org/lkml/2013/10/16/155

    Rest of the changes are bug fixes from various people"

    * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: (21 commits)
    mm, slub: fix the typo in mm/slub.c
    mm, slub: fix the typo in include/linux/slub_def.h
    slub: Handle NULL parameter in kmem_cache_flags
    slab: replace non-existing 'struct freelist *' with 'void *'
    slab: fix to calm down kmemleak warning
    slub: proper kmemleak tracking if CONFIG_SLUB_DEBUG disabled
    slab: rename slab_bufctl to slab_freelist
    slab: remove useless statement for checking pfmemalloc
    slab: use struct page for slab management
    slab: replace free and inuse in struct slab with newly introduced active
    slab: remove SLAB_LIMIT
    slab: remove kmem_bufctl_t
    slab: change the management method of free objects of the slab
    slab: use __GFP_COMP flag for allocating slab pages
    slab: use well-defined macro, virt_to_slab()
    slab: overloading the RCU head over the LRU for RCU free
    slab: remove cachep in struct slab_rcu
    slab: remove nodeid in struct slab
    slab: remove colouroff in struct slab
    slab: change return type of kmem_getpages() to struct page
    ...

    Linus Torvalds
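
    For illustration, here is a minimal userspace sketch of the central idea of the
    series pulled above: keeping the slab's freelist head and in-use count in the
    page descriptor itself instead of in a separately allocated management
    structure. All names (fake_page, init_slab_page, slab_alloc_obj) are invented
    for the example; this is not the kernel code.

        #include <stdio.h>
        #include <stdlib.h>

        /* Invented, simplified page descriptor: the slab-management fields live
         * directly in the descriptor rather than in a separate "struct slab". */
        struct fake_page {
                void *freelist;      /* first free object in this slab page */
                unsigned int active; /* objects currently handed out */
                void *mem;           /* backing memory for the objects */
        };

        /* Carve the backing memory into fixed-size objects and thread a freelist
         * through them, storing the list head in the page descriptor itself. */
        static void init_slab_page(struct fake_page *pg, size_t objsize, size_t nobj)
        {
                char *base = pg->mem;
                for (size_t i = 0; i + 1 < nobj; i++)
                        *(void **)(base + i * objsize) = base + (i + 1) * objsize;
                *(void **)(base + (nobj - 1) * objsize) = NULL;
                pg->freelist = base;
                pg->active = 0;
        }

        static void *slab_alloc_obj(struct fake_page *pg)
        {
                void *obj = pg->freelist;
                if (obj) {
                        pg->freelist = *(void **)obj; /* advance to next free object */
                        pg->active++;
                }
                return obj;
        }

        int main(void)
        {
                struct fake_page pg = { .mem = malloc(4096) };

                if (!pg.mem)
                        return 1;
                init_slab_page(&pg, 64, 4096 / 64);
                void *obj = slab_alloc_obj(&pg);
                printf("obj=%p, active=%u\n", obj, pg.active);
                free(pg.mem);
                return 0;
        }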
     

12 Nov, 2013

1 commit


05 Sep, 2013

2 commits

  • I do not see any user for this code in the tree.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • The kmalloc* functions of all slab allocators are similar now, so
    let's move them into slab.h. This requires some function naming changes
    in slob.

    As a result of this patch there is a common set of functions for
    all allocators. It also means that kmalloc_large() is now available
    in general to perform large order allocations that go directly
    via the page allocator. kmalloc_large() can be substituted if
    kmalloc() throws warnings because of too large allocations.

    kmalloc_large() has exactly the same semantics as kmalloc() but
    can only be used for allocations > PAGE_SIZE.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
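
    As a rough illustration of that split, a userspace sketch of such a front end
    follows; my_kmalloc, alloc_from_slab_cache and alloc_whole_pages are invented
    stand-ins (malloc plays both back ends here), not the kernel implementation.

        #include <stdio.h>
        #include <stdlib.h>

        #define FAKE_PAGE_SIZE 4096UL

        /* Stand-ins for the two back ends; in the kernel these would be the
         * slab caches and the page allocator respectively. */
        static void *alloc_from_slab_cache(size_t size) { return malloc(size); }

        static void *alloc_whole_pages(size_t size)
        {
                /* round up to whole "pages" */
                size_t pages = (size + FAKE_PAGE_SIZE - 1) / FAKE_PAGE_SIZE;
                return malloc(pages * FAKE_PAGE_SIZE);
        }

        /* Hypothetical front end mirroring the split described above: requests
         * larger than a page bypass the slab caches entirely. */
        static void *my_kmalloc(size_t size)
        {
                if (size > FAKE_PAGE_SIZE)
                        return alloc_whole_pages(size);   /* "kmalloc_large" path */
                return alloc_from_slab_cache(size);       /* regular slab path */
        }

        int main(void)
        {
                void *small = my_kmalloc(128);
                void *big   = my_kmalloc(3 * FAKE_PAGE_SIZE);

                printf("small=%p big=%p\n", small, big);
                free(small);
                free(big);
                return 0;
        }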
     

01 Feb, 2013

5 commits

  • Put the definitions for the kmem_cache_node structures together so that
    we have one structure. That will allow us to create more common fields in
    the future which could yield more opportunities to share code.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Extract the optimized lookup functions from slub and put them into
    slab_common.c. Then make slab use these functions as well.

    Joonsoo notes that this fixes some issues with constant folding which
    also reduces the code size for slub.

    https://lkml.org/lkml/2012/10/20/82

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Have a common definition of the kmalloc cache arrays in
    SLAB and SLUB.

    Acked-by: Glauber Costa
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Standardize the constants that describe the smallest and largest
    object kept in the kmalloc arrays for SLAB and SLUB.

    Differentiate between the maximum size for which a slab cache is used
    (KMALLOC_MAX_CACHE_SIZE) and the maximum allocatable size
    (KMALLOC_MAX_SIZE, KMALLOC_MAX_ORDER).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Extract the function to determine the index of the slab within
    the array of kmalloc caches as well as a function to determine
    maximum object size from the nr of the kmalloc slab.

    This is used here only to simplify slub bootstrap but will
    be used later also for SLAB.

    Acked-by: Glauber Costa
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
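
    A much simplified userspace sketch of the pair of helpers described in the last
    entry (size to cache index, index to largest object) is shown below.
    size_to_index and index_to_max_size are invented names, and the real kernel
    version additionally special-cases the 96- and 192-byte caches, which is
    omitted here.

        #include <stdio.h>

        #define MIN_SHIFT 3   /* assume the smallest cache holds 8-byte objects */

        /* Map a request size to an index into an array of power-of-two caches. */
        static unsigned int size_to_index(size_t size)
        {
                unsigned int shift = MIN_SHIFT;

                while (((size_t)1 << shift) < size)
                        shift++;
                return shift;
        }

        /* Map an index back to the largest object that cache can hold. */
        static size_t index_to_max_size(unsigned int index)
        {
                return (size_t)1 << index;
        }

        int main(void)
        {
                size_t sizes[] = { 1, 8, 9, 100, 1000, 4096 };

                for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
                        unsigned int idx = size_to_index(sizes[i]);
                        printf("size %4zu -> index %u (cache object size %zu)\n",
                               sizes[i], idx, index_to_max_size(idx));
                }
                return 0;
        }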
     

19 Dec, 2012

3 commits

  • SLUB allows us to tune a particular cache behavior with sysfs-based
    tunables. When creating a new memcg cache copy, we'd like to preserve any
    tunables the parent cache already had.

    This can be done by tapping into the store attribute function provided by
    the allocator. We of course don't need to mess with read-only fields.
    Since the attributes can have multiple types and are stored internally by
    sysfs, the best strategy is to issue a ->show() in the root cache, and
    then ->store() in the memcg cache.

    The drawback of that is that sysfs can allocate up to a page of buffering
    for show(), which we are likely not to need but also can't rule out. To
    avoid always allocating a page for that, we update the caches at store
    time with the maximum attribute size ever stored to the root cache. We
    will then get a buffer big enough to hold it. The corollary is that if no
    stores happened, nothing will be propagated. (A simplified sketch of this
    show()/store() propagation appears after this day's entries.)

    It can also happen that a root cache has its tunables updated during
    normal system operation. In this case, we will propagate the change to
    all caches that are already active.

    [akpm@linux-foundation.org: tweak code to avoid __maybe_unused]
    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     
  • We are able to match a cache allocation to a particular memcg. If the
    task doesn't change groups during the allocation itself (a rare event),
    this will give us a good picture of which group is the first to touch a
    cache page.

    This patch uses the now available infrastructure by calling
    memcg_kmem_get_cache() before all the cache allocations.

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     
  • For the kmem slab controller, we need to record some extra information in
    the kmem_cache structure.

    Signed-off-by: Glauber Costa
    Signed-off-by: Suleiman Souhlal
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
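
    The sketch referred to above: a plain userspace model of the show()/store()
    propagation scheme, not the sysfs machinery itself. Every name here (struct
    cache, show_cpu_partial, store_cpu_partial, propagate) is invented for
    illustration.

        #include <stdio.h>
        #include <string.h>

        struct cache {
                int cpu_partial;     /* one example tunable */
                size_t max_attr_len; /* largest attribute text ever stored */
        };

        static int show_cpu_partial(struct cache *c, char *buf, size_t len)
        {
                return snprintf(buf, len, "%d", c->cpu_partial);
        }

        static void store_cpu_partial(struct cache *c, const char *buf)
        {
                size_t n = strlen(buf);

                sscanf(buf, "%d", &c->cpu_partial);
                if (n > c->max_attr_len)
                        c->max_attr_len = n; /* remember how big a buffer must be */
        }

        /* Copy a parent cache's tunable into a child: show() from the parent
         * into a buffer, then store() the result into the child. */
        static void propagate(struct cache *parent, struct cache *child)
        {
                char buf[64]; /* in the sketch a fixed buffer is plenty */

                if (parent->max_attr_len == 0)
                        return; /* nothing was ever stored: nothing to copy */
                show_cpu_partial(parent, buf, sizeof(buf));
                store_cpu_partial(child, buf);
        }

        int main(void)
        {
                struct cache root = { 0 }, child = { 0 };

                store_cpu_partial(&root, "30"); /* admin tunes the root cache */
                propagate(&root, &child);       /* new memcg copy inherits it */
                printf("child cpu_partial = %d\n", child.cpu_partial);
                return 0;
        }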
     

14 Jun, 2012

1 commit

  • Define a struct that describes common fields used in all slab allocators.
    A slab allocator either uses the common definition (like SLOB) or is
    required to provide members of kmem_cache with the definition given.

    After that it will be possible to share code that
    only operates on those fields of kmem_cache.

    The patch basically takes the slob definition of kmem_cache and
    uses those field names for the other allocators.

    It also standardizes the names used for basic object lengths in
    allocators:

    object_size  The struct size specified at kmem_cache_create();
                 basically the payload expected to be used by the subsystem.

    size         The size of memory allocated for each object. This size
                 is larger than object_size and includes padding, alignment
                 and extra metadata for each object (e.g. for debugging
                 and rcu).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
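
    In other words, roughly the following (the alignment and debug-metadata
    numbers below are made up, and align_up is an invented helper):

        #include <stdio.h>

        #define FAKE_ALIGN     8UL
        #define FAKE_DEBUG_RED 16UL   /* e.g. red-zone / debug bytes per object */

        /* Round x up to the next multiple of a (a must be a power of two). */
        static unsigned long align_up(unsigned long x, unsigned long a)
        {
                return (x + a - 1) & ~(a - 1);
        }

        int main(void)
        {
                unsigned long object_size = 100; /* payload the subsystem uses */
                unsigned long size = align_up(object_size + FAKE_DEBUG_RED,
                                              FAKE_ALIGN);

                printf("object_size=%lu size=%lu\n", object_size, size);
                return 0;
        }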
     

01 Jun, 2012

1 commit

  • The node field is always page_to_nid(c->page), so it's rather easy to
    replace. Note that there may be slightly more overhead in various hot paths
    due to the need to shift the bits out of page->flags. However, that is mostly
    compensated for by a smaller footprint of the kmem_cache_cpu structure (this
    patch reduces it to 3 words per cache), which allows better caching.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
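
    A toy sketch of that recompute-instead-of-cache trade-off follows; the bit
    layout and the names FAKE_NODE_SHIFT and fake_page_to_nid are invented, while
    the real node id lives in page->flags with an architecture-dependent layout.

        #include <stdio.h>

        #define FAKE_NODE_SHIFT 56
        #define FAKE_NODE_MASK  0xffUL

        struct fake_page {
                unsigned long flags; /* node id packed into the top bits */
        };

        /* Recompute the node id by shifting it out of the flags word. */
        static int fake_page_to_nid(const struct fake_page *pg)
        {
                return (int)((pg->flags >> FAKE_NODE_SHIFT) & FAKE_NODE_MASK);
        }

        int main(void)
        {
                struct fake_page pg = { .flags = (2UL << FAKE_NODE_SHIFT) | 0x1 };

                printf("node = %d\n", fake_page_to_nid(&pg)); /* prints 2 */
                return 0;
        }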
     

29 Mar, 2012

1 commit

  • Pull SLAB changes from Pekka Enberg:
    "There's the new kmalloc_array() API, minor fixes and performance
    improvements, but quite honestly, nothing terribly exciting."

    * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
    mm: SLAB Out-of-memory diagnostics
    slab: introduce kmalloc_array()
    slub: per cpu partial statistics change
    slub: include include for prefetch
    slub: Do not hold slub_lock when calling sysfs_slab_add()
    slub: prefetch next freelist pointer in slab_alloc()
    slab, cleanup: remove unneeded return

    Linus Torvalds
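
    The point of kmalloc_array() mentioned above is the overflow check on the
    element count; a minimal userspace equivalent looks like the sketch below
    (my_alloc_array is an invented name and malloc stands in for kmalloc).

        #include <stdio.h>
        #include <stdlib.h>
        #include <stdint.h>

        /* Refuse the request if n * size would wrap around, instead of
         * silently allocating a too-small buffer. */
        static void *my_alloc_array(size_t n, size_t size)
        {
                if (size != 0 && n > SIZE_MAX / size)
                        return NULL; /* multiplication would overflow */
                return malloc(n * size);
        }

        int main(void)
        {
                int *ok   = my_alloc_array(1000, sizeof(int));
                void *bad = my_alloc_array(SIZE_MAX / 2, 16); /* rejected */

                printf("ok=%p bad=%p\n", (void *)ok, bad);
                free(ok);
                return 0;
        }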
     

05 Mar, 2012

1 commit

  • If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
    other BUG variant in a static inline (i.e. not in a #define), then
    that header really should be including <linux/bug.h> and not just
    expecting it to be implicitly present.

    We can make this change risk-free, since if the files using these
    headers didn't have exposure to linux/bug.h already, they would have
    been causing compile failures/warnings.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
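
    The same rule in plain C, using assert() and <assert.h> as a stand-in for
    BUG_ON() and <linux/bug.h> (mydiv.h and safe_div are invented for the example):

        /* mydiv.h -- because the static inline below uses assert(), the header
         * includes <assert.h> itself instead of hoping every user already
         * pulled it in; the kernel rule above is the same idea with
         * BUG_ON() and <linux/bug.h>. */
        #ifndef MYDIV_H
        #define MYDIV_H

        #include <assert.h>   /* explicit: this header's inline code needs it */

        static inline int safe_div(int a, int b)
        {
                assert(b != 0);
                return a / b;
        }

        #endif /* MYDIV_H */

    A file that includes mydiv.h then compiles whether or not it happens to
    include <assert.h> on its own.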
     

18 Feb, 2012

1 commit

  • This patch splits cpu_partial_free into two parts: cpu_partial_node, counting
    PCP refills from the node partial list, and cpu_partial_free (keeping its name),
    counting PCP refills in the slab_free slow path. A new statistic, cpu_partial_drain,
    is added to count PCP drains to the node partial list. This information is useful
    when doing PCP tuning.

    The slabinfo.c code is unchanged, since cpu_partial_node is not on the slow path.

    Signed-off-by: Alex Shi
    Acked-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Alex Shi
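
    A toy model of the split counters (the enum names mirror the statistics
    mentioned above; the surrounding code is invented for illustration):

        #include <stdio.h>

        enum stat_item {
                CPU_PARTIAL_NODE,  /* refilled from the node partial list */
                CPU_PARTIAL_FREE,  /* refilled from the slab_free slow path */
                CPU_PARTIAL_DRAIN, /* drained back to the node partial list */
                NR_STATS
        };

        static unsigned long stats[NR_STATS];

        static void stat_inc(enum stat_item item) { stats[item]++; }

        int main(void)
        {
                stat_inc(CPU_PARTIAL_NODE);
                stat_inc(CPU_PARTIAL_FREE);
                stat_inc(CPU_PARTIAL_FREE);
                stat_inc(CPU_PARTIAL_DRAIN);
                printf("node=%lu free=%lu drain=%lu\n",
                       stats[CPU_PARTIAL_NODE], stats[CPU_PARTIAL_FREE],
                       stats[CPU_PARTIAL_DRAIN]);
                return 0;
        }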
     

28 Sep, 2011

1 commit


20 Aug, 2011

1 commit

  • Allow filling out the rest of the kmem_cache_cpu cacheline with pointers to
    partial pages. The partial page list is used in slab_free() to avoid
    per node lock taking.

    In __slab_alloc() we can then take multiple partial pages off the per
    node partial list in one go reducing node lock pressure.

    We can also use the per cpu partial list in slab_alloc() to avoid scanning
    partial lists for pages with free objects.

    The main effect of a per cpu partial list is that the per node list_lock
    is taken for batches of partial pages instead of individual ones.

    Potential future enhancements:

    1. The pickup from the partial list could perhaps be done without disabling
    interrupts with some work. The free path already puts the page into the
    per cpu partial list without disabling interrupts.

    2. __slab_free() may have some code paths that could use optimization.

    Performance:

                                        Before        After
    ./hackbench 100 process 200000    1953.047     1564.614
    ./hackbench 100 process 20000      207.176      156.940
    ./hackbench 100 process 20000      204.468      156.940
    ./hackbench 100 process 20000      204.879      158.772
    ./hackbench 10 process 20000        20.153       15.853
    ./hackbench 10 process 20000        20.153       15.986
    ./hackbench 10 process 20000        19.363       16.111
    ./hackbench 1 process 20000          2.518        2.307
    ./hackbench 1 process 20000          2.258        2.339
    ./hackbench 1 process 20000          2.864        2.163

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
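
    A simplified userspace model of the batching idea — take the node lock once and
    move several partial pages to a per-cpu list — is sketched below. fake_page,
    fake_node and grab_partial_batch are invented, and a pthread mutex stands in
    for the node list_lock.

        #include <stdio.h>
        #include <pthread.h>

        struct fake_page {
                struct fake_page *next;
                int free_objects;
        };

        struct fake_node {
                pthread_mutex_t list_lock;
                struct fake_page *partial; /* node-wide partial list */
        };

        /* Move up to 'batch' pages from the node partial list to a per-cpu
         * list while taking the node lock only once. */
        static struct fake_page *grab_partial_batch(struct fake_node *n, int batch)
        {
                struct fake_page *head = NULL;

                pthread_mutex_lock(&n->list_lock);
                while (batch-- > 0 && n->partial) {
                        struct fake_page *pg = n->partial;

                        n->partial = pg->next;
                        pg->next = head; /* push onto the per-cpu partial list */
                        head = pg;
                }
                pthread_mutex_unlock(&n->list_lock);
                return head;
        }

        int main(void)
        {
                struct fake_page pages[4] = {
                        { &pages[1], 3 }, { &pages[2], 1 },
                        { &pages[3], 5 }, { NULL, 2 }
                };
                struct fake_node node = { PTHREAD_MUTEX_INITIALIZER, &pages[0] };
                struct fake_page *cpu_partial = grab_partial_batch(&node, 3);
                int got = 0;

                for (struct fake_page *p = cpu_partial; p; p = p->next)
                        got++;
                printf("took %d pages in one lock acquisition\n", got);
                return 0;
        }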
     

31 Jul, 2011

1 commit

  • * 'slub/lockless' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: (21 commits)
    slub: When allocating a new slab also prep the first object
    slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock
    Avoid duplicate _count variables in page_struct
    Revert "SLUB: Fix build breakage in linux/mm_types.h"
    SLUB: Fix build breakage in linux/mm_types.h
    slub: slabinfo update for cmpxchg handling
    slub: Not necessary to check for empty slab on load_freelist
    slub: fast release on full slab
    slub: Add statistics for the case that the current slab does not match the node
    slub: Get rid of the another_slab label
    slub: Avoid disabling interrupts in free slowpath
    slub: Disable interrupts in free_debug processing
    slub: Invert locking and avoid slab lock
    slub: Rework allocator fastpaths
    slub: Pass kmem_cache struct to lock and freeze slab
    slub: explicit list_lock taking
    slub: Add cmpxchg_double_slab()
    mm: Rearrange struct page
    slub: Move page->frozen handling near where the page->freelist handling occurs
    slub: Do not use frozen page flag but a bit in the page counters
    ...

    Linus Torvalds
     

08 Jul, 2011

1 commit


02 Jul, 2011

3 commits


17 Jun, 2011

1 commit

  • Every slab allocator has its own alignment definition in include/linux/sl?b_def.h.
    Extract those and define a common set in include/linux/slab.h.

    SLOB: As noted, sometimes we need double word alignment on 32 bit. This gives all
    structures allocated by SLOB an unsigned long long alignment like the others do.

    SLAB: If ARCH_SLAB_MINALIGN is not set, SLAB would set ARCH_SLAB_MINALIGN to
    zero, meaning no alignment at all. Give it the default unsigned long long alignment.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
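
    Roughly, the common default amounts to the sketch below; MY_ARCH_SLAB_MINALIGN
    is an invented stand-in for ARCH_SLAB_MINALIGN, and __alignof__ is the
    gcc/clang extension the kernel itself relies on.

        #include <stdio.h>

        /* If the "architecture" does not set a minimum slab alignment, fall back
         * to the alignment of unsigned long long so 64-bit values and doubles
         * are always safe. */
        #ifndef MY_ARCH_SLAB_MINALIGN
        #define MY_ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
        #endif

        int main(void)
        {
                printf("minimum slab alignment: %zu bytes\n",
                       (size_t)MY_ARCH_SLAB_MINALIGN);
                return 0;
        }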
     

21 May, 2011

1 commit


08 May, 2011

1 commit

  • Remove the #ifdefs. This means that the irqsafe_cpu_cmpxchg_double() is used
    everywhere.

    There may be performance implications since:

    A. We now have to manage a transaction ID for all arches

    B. The interrupt holdoff for arches not supporting CONFIG_CMPXCHG_LOCAL is reduced
    to a very short irqoff section.

    There are no multiple irqoff/irqon sequences as a result of this change. Even in the fallback
    case we only have to do one disable and enable like before.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
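
    For illustration only, here is the shape of such a fallback in userspace: a
    double-word compare-and-exchange emulated inside one short critical section.
    A mutex stands in for the brief irq-off window, and struct pair and
    cmpxchg_double_fallback are invented names, not the kernel primitives.

        #include <stdio.h>
        #include <stdbool.h>
        #include <pthread.h>

        struct pair {
                void *ptr;         /* e.g. the freelist pointer */
                unsigned long tid; /* e.g. the transaction id */
        };

        static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;

        /* Update (ptr, tid) atomically with respect to other users of the same
         * lock: one short critical section, one disable/enable, like the text
         * above describes for the real fallback. */
        static bool cmpxchg_double_fallback(struct pair *p,
                                            void *old_ptr, unsigned long old_tid,
                                            void *new_ptr, unsigned long new_tid)
        {
                bool ok = false;

                pthread_mutex_lock(&fallback_lock);
                if (p->ptr == old_ptr && p->tid == old_tid) {
                        p->ptr = new_ptr;
                        p->tid = new_tid;
                        ok = true;
                }
                pthread_mutex_unlock(&fallback_lock);
                return ok;
        }

        int main(void)
        {
                int obj;
                struct pair p = { NULL, 0 };
                bool ok = cmpxchg_double_fallback(&p, NULL, 0, &obj, 1);

                printf("update %s, tid now %lu\n", ok ? "succeeded" : "failed", p.tid);
                return 0;
        }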
     

23 Mar, 2011

1 commit


21 Mar, 2011

1 commit


12 Mar, 2011

1 commit

  • There is no "struct" for slub's slab; it shares struct page. But struct
    page is very small, and it is insufficient when we need to add some
    metadata for the slab.

    So we add a field "reserved" to struct kmem_cache: when a slab is
    allocated, kmem_cache->reserved bytes are automatically reserved at the
    end of the slab for the slab's metadata.

    Changed from v1:
    Export the reserved field via sysfs

    Acked-by: Christoph Lameter
    Signed-off-by: Lai Jiangshan
    Signed-off-by: Pekka Enberg

    Lai Jiangshan
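
    The effect on slab layout is simply the following (objects_per_slab is an
    invented helper and the numbers are illustrative):

        #include <stdio.h>

        /* With 'reserved' bytes kept at the end of each slab for metadata, the
         * number of objects that fit is computed from the remaining space. */
        static unsigned int objects_per_slab(unsigned long slab_size,
                                             unsigned long object_size,
                                             unsigned long reserved)
        {
                return (unsigned int)((slab_size - reserved) / object_size);
        }

        int main(void)
        {
                printf("no reserve : %u objects\n", objects_per_slab(4096, 256, 0));
                printf("64 reserved: %u objects\n", objects_per_slab(4096, 256, 64));
                return 0;
        }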
     

11 Mar, 2011

2 commits

  • Use the this_cpu_cmpxchg_double functionality to implement a lockless
    allocation algorithm on arches that support fast this_cpu_ops.

    Each of the per cpu pointers is paired with a transaction id that ensures
    that updates of the per cpu information can only occur in sequence on
    a certain cpu.

    A transaction id is a "long" integer composed of an event number and the
    cpu number. The event number is incremented for every change to the per
    cpu state. This lets the cmpxchg instruction verify, for an update, that
    nothing interfered, that we are updating the percpu structure for the
    processor where we picked up the information, and that we are still on
    that processor when we update the information. (A toy sketch of this tid
    scheme appears after this day's entries.)

    This results in a significant decrease of the overhead in the fastpaths. It
    also makes it easy to adopt the fast path for realtime kernels since this
    is lockless and does not require the use of the current per cpu area
    over the critical section. It is only important that the per cpu area is
    current at the beginning of the critical section and at the end.

    So there is no need even to disable preemption.

    Test results show that the fastpath cycle count is reduced by up to ~ 40%
    (alloc/free test goes from ~140 cycles down to ~80). The slowpath for kfree
    adds a few cycles.

    Sadly this does nothing for the slowpath, which is where the main
    performance issues in slub are, but the best-case performance rises
    significantly. (For that, see the more complex slub patches that require
    cmpxchg_double.)

    Kmalloc: alloc/free test

    Before:

    10000 times kmalloc(8)/kfree -> 134 cycles
    10000 times kmalloc(16)/kfree -> 152 cycles
    10000 times kmalloc(32)/kfree -> 144 cycles
    10000 times kmalloc(64)/kfree -> 142 cycles
    10000 times kmalloc(128)/kfree -> 142 cycles
    10000 times kmalloc(256)/kfree -> 132 cycles
    10000 times kmalloc(512)/kfree -> 132 cycles
    10000 times kmalloc(1024)/kfree -> 135 cycles
    10000 times kmalloc(2048)/kfree -> 135 cycles
    10000 times kmalloc(4096)/kfree -> 135 cycles
    10000 times kmalloc(8192)/kfree -> 144 cycles
    10000 times kmalloc(16384)/kfree -> 754 cycles

    After:

    10000 times kmalloc(8)/kfree -> 78 cycles
    10000 times kmalloc(16)/kfree -> 78 cycles
    10000 times kmalloc(32)/kfree -> 82 cycles
    10000 times kmalloc(64)/kfree -> 88 cycles
    10000 times kmalloc(128)/kfree -> 79 cycles
    10000 times kmalloc(256)/kfree -> 79 cycles
    10000 times kmalloc(512)/kfree -> 85 cycles
    10000 times kmalloc(1024)/kfree -> 82 cycles
    10000 times kmalloc(2048)/kfree -> 82 cycles
    10000 times kmalloc(4096)/kfree -> 85 cycles
    10000 times kmalloc(8192)/kfree -> 82 cycles
    10000 times kmalloc(16384)/kfree -> 706 cycles

    Kmalloc: Repeatedly allocate then free test

    Before:

    10000 times kmalloc(8) -> 211 cycles kfree -> 113 cycles
    10000 times kmalloc(16) -> 174 cycles kfree -> 115 cycles
    10000 times kmalloc(32) -> 235 cycles kfree -> 129 cycles
    10000 times kmalloc(64) -> 222 cycles kfree -> 120 cycles
    10000 times kmalloc(128) -> 343 cycles kfree -> 139 cycles
    10000 times kmalloc(256) -> 827 cycles kfree -> 147 cycles
    10000 times kmalloc(512) -> 1048 cycles kfree -> 272 cycles
    10000 times kmalloc(1024) -> 2043 cycles kfree -> 528 cycles
    10000 times kmalloc(2048) -> 4002 cycles kfree -> 571 cycles
    10000 times kmalloc(4096) -> 7740 cycles kfree -> 628 cycles
    10000 times kmalloc(8192) -> 8062 cycles kfree -> 850 cycles
    10000 times kmalloc(16384) -> 8895 cycles kfree -> 1249 cycles

    After:

    10000 times kmalloc(8) -> 190 cycles kfree -> 129 cycles
    10000 times kmalloc(16) -> 76 cycles kfree -> 123 cycles
    10000 times kmalloc(32) -> 126 cycles kfree -> 124 cycles
    10000 times kmalloc(64) -> 181 cycles kfree -> 128 cycles
    10000 times kmalloc(128) -> 310 cycles kfree -> 140 cycles
    10000 times kmalloc(256) -> 809 cycles kfree -> 165 cycles
    10000 times kmalloc(512) -> 1005 cycles kfree -> 269 cycles
    10000 times kmalloc(1024) -> 1999 cycles kfree -> 527 cycles
    10000 times kmalloc(2048) -> 3967 cycles kfree -> 570 cycles
    10000 times kmalloc(4096) -> 7658 cycles kfree -> 637 cycles
    10000 times kmalloc(8192) -> 8111 cycles kfree -> 859 cycles
    10000 times kmalloc(16384) -> 8791 cycles kfree -> 1173 cycles

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • It is used in unfreeze_slab() which is a performance critical
    function.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
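
    The sketch referred to above: a toy userspace model of the tid encoding and
    the commit step. Everything here (CPU_BITS, make_tid, try_commit) is invented
    for illustration; the kernel pairs the tid with the freelist pointer and uses
    this_cpu_cmpxchg_double, while this sketch does a single __atomic
    compare-exchange on the tid only, just to show the shape of the check.

        #include <stdio.h>

        #define CPU_BITS 8UL

        /* Pack an event counter and a cpu number into one long. */
        static unsigned long make_tid(unsigned long event, unsigned long cpu)
        {
                return (event << CPU_BITS) | cpu;
        }

        static unsigned long tid_cpu(unsigned long t)   { return t & ((1UL << CPU_BITS) - 1); }
        static unsigned long tid_event(unsigned long t) { return t >> CPU_BITS; }

        static unsigned long cur_tid; /* per-cpu in the kernel; one copy here */

        /* Try to commit one "allocation": succeeds only if the tid is unchanged,
         * i.e. nobody touched the state and we did not move to another cpu. */
        static int try_commit(unsigned long seen_tid, unsigned long cpu)
        {
                unsigned long next = make_tid(tid_event(seen_tid) + 1, cpu);

                return __atomic_compare_exchange_n(&cur_tid, &seen_tid, next, 0,
                                                   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
        }

        int main(void)
        {
                unsigned long seen = __atomic_load_n(&cur_tid, __ATOMIC_SEQ_CST);

                if (try_commit(seen, 0))
                        printf("fastpath committed: event=%lu cpu=%lu\n",
                               tid_event(cur_tid), tid_cpu(cur_tid));
                else
                        printf("state changed underneath us: retry the fastpath\n");
                return 0;
        }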
     

06 Nov, 2010

1 commit

  • Having the trace calls defined in the always inlined kmalloc functions
    in include/linux/slub_def.h causes a lot of code duplication as the
    trace functions get instantiated for each kmalloc call site. This can
    simply be removed by pushing the trace calls down into the functions in
    slub.c.

    On my x86_64 build this patch shrinks the code size of the kernel by
    approx 36K and also shrinks the code size of many modules -- too many to
    list here ;)

    size vmlinux (2.6.36) reports:
       text     data     bss      dec      hex     filename
    5410611   743172   828928  6982711   6a8c37   vmlinux
    5373738   744244   828928  6946910   6a005e   vmlinux + patch

    The resulting kernel has had some testing & kmalloc trace still seems to
    work.

    This patch
    - moves trace_kmalloc out of the inlined kmalloc() and pushes it down
      into kmem_cache_alloc_trace() so that it only gets instantiated once.

    - renames kmem_cache_alloc_notrace() to kmem_cache_alloc_trace() to
      indicate that it now does have tracing. (Maybe this would be better
      called something like kmalloc_kmem_cache?)

    - adds a new function kmalloc_order() to handle allocation and tracing
      of large allocations of page order.

    - removes tracing from the inlined kmalloc_large(), replacing it with a
      call to kmalloc_order().

    - moves tracing out of the inlined kmalloc_node() and pushes it down into
      kmem_cache_alloc_node_trace().

    - renames kmem_cache_alloc_node_notrace() to
      kmem_cache_alloc_node_trace().

    - removes the include of trace/events/kmem.h from slub_def.h.

    v2
    - keep kmalloc_order_trace inline when !CONFIG_TRACE

    Signed-off-by: Richard Kennedy
    Signed-off-by: Pekka Enberg

    Richard Kennedy
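
    The shape of that refactoring, in a deliberately tiny userspace form
    (my_alloc and alloc_traced are invented; an fprintf stands in for the
    tracepoint):

        #include <stdio.h>
        #include <stdlib.h>

        /* Out of line: one copy of the "trace hook" for the whole program. */
        static void *alloc_traced(size_t size)
        {
                void *p = malloc(size);

                fprintf(stderr, "trace: alloc %zu bytes -> %p\n", size, p);
                return p;
        }

        /* Inlined at every call site, but contains no tracing of its own, so
         * the hook is not duplicated per caller. */
        static inline void *my_alloc(size_t size)
        {
                return alloc_traced(size);
        }

        int main(void)
        {
                void *a = my_alloc(32);
                void *b = my_alloc(64);

                free(a);
                free(b);
                return 0;
        }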
     

06 Oct, 2010

1 commit


02 Oct, 2010

2 commits

  • Reduce the #ifdefs and simplify bootstrap by making SMP and NUMA as much alike
    as possible. This means that there will be an additional indirection to get to
    the kmem_cache_node field under SMP.

    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • kmalloc caches are statically defined and may take up a lot of space just
    because the size of the node array has to be dimensioned for the largest
    node count supported.

    This patch makes the size of the kmem_cache structure dynamic throughout by
    creating a kmem_cache slab cache for the kmem_cache objects. The bootstrap
    occurs by allocating the initial one or two kmem_cache objects from the
    page allocator.

    C2->C3
    - Fix various issues indicated by David
    - Make create kmalloc_cache return a kmem_cache * pointer.

    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
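
    The space argument can be seen with a toy comparison; the node counts and
    struct names below are invented, and a flexible array member models the
    dynamically sized descriptor.

        #include <stdio.h>

        #define MAX_SUPPORTED_NODES 1024

        /* Statically sized descriptor: always reserves slots for the largest
         * supported node count. */
        struct static_cache {
                void *node[MAX_SUPPORTED_NODES];
        };

        /* Dynamically sized descriptor: only pays for the nodes present. */
        struct dynamic_cache {
                size_t object_size;
                void *node[]; /* sized at runtime */
        };

        int main(void)
        {
                int nodes_online = 2;
                size_t dyn = sizeof(struct dynamic_cache) +
                             nodes_online * sizeof(void *);

                printf("static descriptor: %zu bytes\n", sizeof(struct static_cache));
                printf("dynamic descriptor for %d nodes: %zu bytes\n",
                       nodes_online, dyn);
                return 0;
        }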
     

23 Aug, 2010

1 commit


11 Aug, 2010

1 commit

  • Now each architecture has its own dma_get_cache_alignment implementation.

    dma_get_cache_alignment returns the minimum DMA alignment. Architectures
    define it as ARCH_KMALLOC_MINALIGN (it's used to make sure that a malloc'ed
    buffer is DMA-safe; the buffer doesn't share a cache line with others). So
    we can unify the dma_get_cache_alignment implementations.

    This patch:

    dma_get_cache_alignment() needs to know whether an architecture defines
    ARCH_KMALLOC_MINALIGN or not (i.e. whether the architecture has a DMA
    alignment restriction). However, slab.h defines ARCH_KMALLOC_MINALIGN if
    the architecture doesn't define it.

    Let's rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN.
    ARCH_KMALLOC_MINALIGN is used only in the internals of slab/slob/slub
    (except for crypto).
    (except for crypto).

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
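
    The unified helper then boils down to something like the following sketch;
    MY_ARCH_DMA_MINALIGN and my_dma_get_cache_alignment are renamed stand-ins so
    the example is self-contained, and the value 64 is made up.

        #include <stdio.h>

        /* Pretend this "architecture" needs 64-byte DMA alignment. */
        #define MY_ARCH_DMA_MINALIGN 64

        /* One generic implementation keyed off whether the architecture
         * defines a minimum DMA alignment at all. */
        static inline int my_dma_get_cache_alignment(void)
        {
        #ifdef MY_ARCH_DMA_MINALIGN
                return MY_ARCH_DMA_MINALIGN;
        #else
                return 1; /* no architecture-imposed minimum */
        #endif
        }

        int main(void)
        {
                printf("minimum DMA alignment: %d bytes\n",
                       my_dma_get_cache_alignment());
                return 0;
        }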
     

09 Aug, 2010

1 commit


10 Jun, 2010

1 commit