26 Jul, 2011

1 commit

  • RED_INACTIVE is a slab poison value, and reusing it for memblock was
    inappropriate, because memblock deals with phys_addr_t's, whose sizeof()
    depends on Kconfig.

    Create a new poison type for this use. This fixes the sparse warning:

    warning: cast truncates bits from constant value (9f911029d74e35b becomes 9d74e35b)
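
    As a standalone illustration (plain userspace C, not kernel code), the
    constant below reproduces the value from the warning, and the two
    typedefs stand in for the two possible Kconfig-selected widths of
    phys_addr_t; casting the 64-bit poison into the 32-bit type silently
    drops the upper half, which is exactly what sparse complains about:

        #include <stdint.h>
        #include <stdio.h>

        /* 64-bit slab-style poison constant (value taken from the warning). */
        #define POISON_64 0x09F911029D74E35BULL

        typedef uint32_t phys_addr_demo32_t;  /* phys_addr_t without 64-bit support */
        typedef uint64_t phys_addr_demo64_t;  /* phys_addr_t with 64-bit support */

        int main(void)
        {
                phys_addr_demo32_t narrow = (phys_addr_demo32_t)POISON_64;
                phys_addr_demo64_t wide   = (phys_addr_demo64_t)POISON_64;

                /* Prints 0x9f911029d74e35b and the truncated 0x9d74e35b. */
                printf("64-bit poison value:  %#llx\n", (unsigned long long)wide);
                printf("truncated to 32 bits: %#x\n", (unsigned)narrow);
                return 0;
        }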

    Reported-by: H Hartley Sweeten
    Tested-by: H Hartley Sweeten
    Acked-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

11 Aug, 2010

1 commit

  • This patch adds a reverse-mapping feature for hugepages by introducing a
    mapcount for shared/privately mapped hugepages and an anon_vma for
    privately mapped hugepages.

    While hugepages are not currently swappable, reverse mapping can be useful
    for the memory error handler.

    Without this patch, the memory error handler can neither identify the
    processes using a bad hugepage nor unmap it from them. That is:
    - for a shared hugepage:
    we can collect the processes using it through the page cache,
    but cannot unmap it because of the lack of a mapcount.
    - for a privately mapped hugepage:
    we can neither collect the processes nor unmap it.
    This patch solves these problems.
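
    As a rough conceptual model only (standalone C with made-up names, not
    the kernel's rmap code), the bookkeeping added here is what lets an
    error handler enumerate every mapper of a bad hugepage and drop each
    mapping while keeping a mapcount consistent:

        #include <stdio.h>

        /* Hypothetical miniature model: each hugepage remembers who maps it. */
        struct hugepage_mapping { int pid; unsigned long vaddr; int live; };

        struct hugepage {
                int mapcount;
                struct hugepage_mapping maps[4];
        };

        /* What a memory-error handler needs: walk the mappers and unmap each. */
        static void unmap_poisoned_hugepage(struct hugepage *hp)
        {
                for (int i = 0; i < 4; i++) {
                        if (!hp->maps[i].live)
                                continue;
                        printf("unmapping from pid %d at %#lx\n",
                               hp->maps[i].pid, hp->maps[i].vaddr);
                        hp->maps[i].live = 0;
                        hp->mapcount--;
                }
        }

        int main(void)
        {
                struct hugepage bad = {
                        .mapcount = 2,
                        .maps = { { 101, 0x10000000UL, 1 },
                                  { 202, 0x20000000UL, 1 } },
                };

                unmap_poisoned_hugepage(&bad);
                printf("mapcount is now %d\n", bad.mapcount);
                return 0;
        }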

    This patch includes the bug fix from commit 23be7468e8, so it reverts that
    commit.

    Dependency:
    "hugetlb: move definition of is_vm_hugetlb_page() to hugepage_inline.h"

    ChangeLog since May 24.
    - create hugetlb_inline.h and move is_vm_hugetlb_page() into it.
    - move functions setting up anon_vma for hugepage into mm/rmap.c.

    ChangeLog since May 13.
    - rebased to 2.6.34
    - fix logic error (in case that private mapping and shared mapping coexist)
    - move is_vm_hugetlb_page() into include/linux/mm.h to use this function
    from linear_page_index()
    - define and use linear_hugepage_index() instead of compound_order()
    - use page_move_anon_rmap() in hugetlb_cow()
    - copy exclusive switch of __set_page_anon_rmap() into hugepage counterpart.
    - revert commit 23be7468 completely

    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Andrew Morton
    Cc: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: Larry Woodman
    Cc: Lee Schermerhorn
    Acked-by: Fengguang Wu
    Acked-by: Mel Gorman
    Signed-off-by: Andi Kleen

    Naoya Horiguchi
     

25 Apr, 2010

1 commit

  • If a futex key happens to be located within a huge page mapped
    MAP_PRIVATE, get_futex_key() can go into an infinite loop waiting for a
    page->mapping that will never exist.

    See https://bugzilla.redhat.com/show_bug.cgi?id=552257 for more details
    about the problem.

    This patch sets the page->mapping of a hugepage mapped MAP_PRIVATE to a
    poisoned value that includes PAGE_MAPPING_ANON. This is enough for futex
    to continue, but because of PAGE_MAPPING_ANON the poisoned value is never
    dereferenced or otherwise used by futex. No other part of the VM should be
    dereferencing the page->mapping of a hugetlbfs page, as its page cache is
    not on the LRU.
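
    A small standalone sketch of the trick (illustrative C, not the kernel
    code; the anon bit and the poison value shown are assumptions chosen for
    the demo): because the poison has the anon bit set, an "is this
    anonymous?" check succeeds without the pointer ever being dereferenced:

        #include <stdint.h>
        #include <stdio.h>

        /* Assumed for illustration: anonymous pages are marked by setting
         * the low bit of page->mapping. */
        #define PAGE_MAPPING_ANON_DEMO 0x1ULL
        #define POISON_MAPPING_DEMO    (0xdead000000000400ULL | PAGE_MAPPING_ANON_DEMO)

        struct demo_page { uint64_t mapping; };

        static int demo_page_anon(const struct demo_page *p)
        {
                /* Only the tag bit is inspected; the "pointer" itself is
                 * never dereferenced, so a poison value is safe here. */
                return (p->mapping & PAGE_MAPPING_ANON_DEMO) != 0;
        }

        int main(void)
        {
                struct demo_page hugepage = { .mapping = POISON_MAPPING_DEMO };

                if (demo_page_anon(&hugepage))
                        printf("looks anonymous; futex can make progress without "
                               "dereferencing the poisoned pointer\n");
                return 0;
        }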

    This patch fixes the problem with the test case described in the bugzilla.

    [akpm@linux-foundation.org: mel cant spel]
    Signed-off-by: Mel Gorman
    Acked-by: Peter Zijlstra
    Acked-by: Darren Hart
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

12 Jan, 2010

1 commit

  • The list macros use LIST_POISON1 and LIST_POISON2 as non-dereferenceable
    pointers in order to trap erroneous use of freed list_heads. Unfortunately
    userspace can arrange for those pointers to actually be dereferenceable,
    potentially turning an oops into an exploit.

    To avoid this, allow architectures (currently x86_64 only) to override
    the default values for these pointers with truly non-dereferenceable
    values. This is easy on x86_64, as the virtual address space is large and
    contains areas that cannot be mapped.

    Other 64-bit architectures will likely find similar unmapped ranges.

    [ingo: switch to 0xdead000000000000 as the unmapped area]
    [ingo: add comments, cleanup]
    [jaswinder: eliminate sparse warnings]
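
    A minimal standalone sketch of the idea (using the classic
    0x00100100/0x00200200 offsets and the 0xdead000000000000 base from the
    note above; treat the exact numbers as illustrative):

        #include <stdio.h>

        /* Per-architecture delta shifting the poisons into a range that can
         * never be mapped (non-canonical on x86_64). */
        #define POISON_POINTER_DELTA_DEMO 0xdead000000000000ULL

        #define LIST_POISON1_CLASSIC 0x00100100ULL
        #define LIST_POISON2_CLASSIC 0x00200200ULL

        int main(void)
        {
                printf("classic LIST_POISON1: %#llx (low; userspace might map it)\n",
                       (unsigned long long)LIST_POISON1_CLASSIC);
                printf("shifted LIST_POISON1: %#llx (never mappable on x86_64)\n",
                       (unsigned long long)(LIST_POISON1_CLASSIC +
                                            POISON_POINTER_DELTA_DEMO));
                printf("classic LIST_POISON2: %#llx\n",
                       (unsigned long long)LIST_POISON2_CLASSIC);
                printf("shifted LIST_POISON2: %#llx\n",
                       (unsigned long long)(LIST_POISON2_CLASSIC +
                                            POISON_POINTER_DELTA_DEMO));
                return 0;
        }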

    Acked-by: Linus Torvalds
    Signed-off-by: Jaswinder Singh Rajput
    Signed-off-by: Ingo Molnar
    Signed-off-by: Avi Kivity
    Signed-off-by: Linus Torvalds

    Avi Kivity
     

22 Sep, 2009

1 commit

  • Newly initialized flex_arrays and flex_array_parts are now poisoned with a
    new poison value, FLEX_ARRAY_FREE. Its value is similar to POISON_FREE,
    used in the various slab allocators, but is distinct so that flex_array
    poisoned kmem can be distinguished from slab allocator poisoned kmem.

    This will allow us to identify flex_array_parts that only contain free
    elements (and free them with an addition to the flex_array API). This
    could also be extended in the future to identify `get' uses on elements
    that have not been `put'.

    If __GFP_ZERO is passed for a part's gfp mask, the poisoning is avoided.
    These elements are considered to be in-use since they have been
    initialized.
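
    A rough standalone model of the poisoning rule above (plain C, not the
    flex_array code; the 0x6c byte is an assumption meant to echo
    FLEX_ARRAY_FREE):

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        /* Assumed byte value, distinct from the slab allocators' free
         * poison so the two kinds of poisoned kmem can be told apart. */
        #define FLEX_ARRAY_FREE_DEMO 0x6c
        #define PART_SIZE_DEMO 64

        static unsigned char *alloc_part(int zeroed)
        {
                unsigned char *part = malloc(PART_SIZE_DEMO);

                if (!part)
                        return NULL;
                if (zeroed)
                        memset(part, 0, PART_SIZE_DEMO);  /* __GFP_ZERO: counts as in-use */
                else
                        memset(part, FLEX_ARRAY_FREE_DEMO, PART_SIZE_DEMO);
                return part;
        }

        /* A part whose every byte is still poison contains only free elements. */
        static int part_is_all_free(const unsigned char *part)
        {
                for (int i = 0; i < PART_SIZE_DEMO; i++)
                        if (part[i] != FLEX_ARRAY_FREE_DEMO)
                                return 0;
                return 1;
        }

        int main(void)
        {
                unsigned char *part = alloc_part(0);    /* no __GFP_ZERO: poisoned */

                if (!part)
                        return 1;
                printf("freshly poisoned part all free? %d\n", part_is_all_free(part));
                part[3] = 42;                           /* a 'put' of one element */
                printf("after one put, all free?        %d\n", part_is_all_free(part));
                free(part);
                return 0;
        }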

    Signed-off-by: David Rientjes
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

01 Apr, 2009

1 commit

  • CONFIG_DEBUG_PAGEALLOC is now supported by x86, powerpc, sparc64, and
    s390. This patch implements it for the rest of the architectures by
    filling the pages with poison byte patterns after free_pages() and
    verifying the poison patterns before alloc_pages().

    This generic implementation cannot detect an invalid page access
    immediately; an invalid read may only show up later as a bad dereference
    through the poisoned memory, and an invalid write can only be detected
    after a (possibly long) delay, when the poison pattern is next verified.
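
    A minimal standalone sketch of the scheme (userspace C modelling a
    single page; the 0xaa byte is an assumed stand-in for the kernel's page
    poison pattern):

        #include <stdio.h>
        #include <string.h>

        #define PAGE_SIZE_DEMO   4096
        #define PAGE_POISON_DEMO 0xaa   /* assumed poison byte pattern */

        static unsigned char page_buf[PAGE_SIZE_DEMO];

        /* On "free": fill the page with the poison pattern. */
        static void poison_page(void)
        {
                memset(page_buf, PAGE_POISON_DEMO, sizeof(page_buf));
        }

        /* On "alloc": verify the pattern; a stray write to the free page is
         * only caught here, possibly long after it happened. */
        static void check_page(void)
        {
                for (size_t i = 0; i < sizeof(page_buf); i++) {
                        if (page_buf[i] != PAGE_POISON_DEMO) {
                                printf("corruption at offset %zu: %#x\n",
                                       i, page_buf[i]);
                                return;
                        }
                }
                printf("page clean\n");
        }

        int main(void)
        {
                poison_page();
                check_page();
                page_buf[123] = 0x00;   /* simulated write to a free page */
                check_page();
                return 0;
        }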

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

30 Apr, 2008

1 commit

  • Add calls to the generic object debugging infrastructure and provide fixup
    functions which allow the system to be kept alive when recoverable
    problems are detected by the object debugging core code.
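
    Purely as a conceptual model (standalone C with hypothetical names, not
    the real debugobjects API): the core idea is a per-object state machine
    whose fixup hook lets the caller recover, for example by initializing an
    object that is being activated before it was ever set up:

        #include <stdio.h>

        /* Hypothetical object-lifetime tracking: states plus a fixup
         * callback invoked when a transition looks wrong. */
        enum obj_state { OBJ_NONE, OBJ_INITIALIZED, OBJ_ACTIVE };

        struct tracked_obj {
                enum obj_state state;
                /* Returns 1 if the problem was repaired. */
                int (*fixup_activate)(struct tracked_obj *obj);
        };

        static int demo_fixup_activate(struct tracked_obj *obj)
        {
                printf("fixup: activating an uninitialized object, initializing it\n");
                obj->state = OBJ_INITIALIZED;
                return 1;
        }

        static void obj_activate(struct tracked_obj *obj)
        {
                if (obj->state == OBJ_NONE &&
                    (!obj->fixup_activate || !obj->fixup_activate(obj))) {
                        printf("unrecoverable: refusing to activate\n");
                        return;
                }
                obj->state = OBJ_ACTIVE;
                printf("object active\n");
        }

        int main(void)
        {
                struct tracked_obj obj = {
                        .state = OBJ_NONE,              /* never initialized */
                        .fixup_activate = demo_fixup_activate,
                };

                obj_activate(&obj);     /* fixup keeps the "system" alive */
                return 0;
        }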

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Greg KH
    Cc: Randy Dunlap
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

09 May, 2007

1 commit

  • There are two problems with the existing redzone implementation.

    Firstly, it's causing misalignment of structures which contain a 64-bit
    integer, such as netfilter's 'struct ipt_entry' -- causing netfilter
    modules to fail to load because of the misalignment. (In particular, the
    first check in
    net/ipv4/netfilter/ip_tables.c::check_entry_size_and_hooks())

    On ppc32 and sparc32, amongst others, __alignof__(uint64_t) == 8.

    With slab debugging, we use 32-bit redzones. And allocated slab objects
    aren't sufficiently aligned to hold a structure containing a uint64_t.

    By _just_ setting ARCH_KMALLOC_MINALIGN to __alignof__(u64) we'd disable
    redzone checks on those architectures. By using 64-bit redzones we avoid that
    loss of debugging, and also fix the other problem while we're at it.

    When investigating this, I noticed that on 64-bit platforms we're using a
    32-bit value of RED_ACTIVE/RED_INACTIVE in the 64-bit memory location set
    aside for the redzone. Which means that the four bytes immediately before
    or after the allocated object are 0x00,0x00,0x00,0x00 for LE and BE
    machines, respectively. Which is probably not the most useful choice of
    poison value.

    One way to fix both of those at once is just to switch to 64-bit
    redzones in all cases.
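
    A standalone illustration of the alignment half of the problem (a
    simplification, not the allocator's real layout): with a 4-byte redzone
    in front of the payload, the object lands on a 4-byte boundary even
    though the alignment required for a uint64_t may be 8; an 8-byte redzone
    keeps it aligned:

        #include <stdint.h>
        #include <stdio.h>

        /* Simplified model of a debug slab: the buffer is 8-byte aligned and
         * the object starts immediately after the leading redzone word. */
        static _Alignas(8) unsigned char slab_buf[64];

        int main(void)
        {
                uintptr_t after_u32_rz = (uintptr_t)slab_buf + sizeof(uint32_t);
                uintptr_t after_u64_rz = (uintptr_t)slab_buf + sizeof(uint64_t);

                printf("alignment required for uint64_t: %lu\n",
                       (unsigned long)_Alignof(uint64_t));
                printf("object misalignment with 32-bit redzone: %lu bytes\n",
                       (unsigned long)(after_u32_rz % 8));
                printf("object misalignment with 64-bit redzone: %lu bytes\n",
                       (unsigned long)(after_u64_rz % 8));
                return 0;
        }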

    Signed-off-by: David Woodhouse
    Acked-by: Pekka Enberg
    Cc: Christoph Lameter
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Woodhouse
     

08 May, 2007

1 commit

  • This is a new slab allocator which was motivated by the complexity of the
    existing code in mm/slab.c. It attempts to address a variety of concerns
    with the existing implementation.

    A. Management of object queues

    A particular concern was the complex management of the numerous object
    queues in SLAB. SLUB has no such queues. Instead we dedicate a slab for
    each allocating CPU and use objects from a slab directly instead of
    queueing them up.

    B. Storage overhead of object queues

    SLAB object queues exist per node and per CPU. The alien cache queue even
    has a queue array that contains a queue for each processor on each
    node. For very large systems the number of queues, and the number of
    objects that may be caught in those queues, grows exponentially. On our
    systems with 1k nodes / processors we have several gigabytes tied up just
    for storing references to objects in those queues. This does not include
    the objects that could be on those queues. One fears that the whole
    memory of the machine could one day be consumed by those queues.

    C. SLAB meta data overhead

    SLAB has overhead at the beginning of each slab. This means that data
    cannot be naturally aligned at the beginning of a slab block. SLUB keeps
    all metadata in the corresponding struct page. Objects can be naturally
    aligned in the slab. For example, a 128 byte object will be aligned at
    128 byte boundaries and can fit tightly into a 4k page with no bytes left
    over. SLAB cannot do this.

    D. SLAB has a complex cache reaper

    SLUB does not need a cache reaper for UP systems. On SMP systems
    the per-CPU slab may be pushed back into the partial list, but that
    operation is simple and does not require an iteration over a list
    of objects. SLAB expires per-CPU, shared and alien object queues
    during cache reaping, which may cause strange holdoffs.

    E. SLAB has complex NUMA policy layer support

    SLUB pushes NUMA policy handling into the page allocator. This means that
    allocation is coarser (SLUB interleaves at the page level), but that
    situation was also present before 2.6.13. SLAB's application of
    policies to individual slab objects is certainly a performance
    concern, due to the frequent references to memory policies, and
    may lead to a sequence of objects coming from one node after
    another. SLUB will get a slab full of objects from one node and
    then will switch to the next.

    F. Reduction of the size of partial slab lists

    SLAB has per-node partial lists. This means that over time a large
    number of partial slabs may accumulate on those lists. These can
    only be reused if allocations occur on those specific nodes. SLUB has a
    global pool of partial slabs and will consume slabs from that pool to
    decrease fragmentation.

    G. Tunables

    SLAB has sophisticated tuning abilities for each slab cache. One can
    manipulate the queue sizes in detail. However, filling the queues still
    requires the use of a spin lock to check out slabs. SLUB has a global
    parameter (slub_min_order) for tuning. Increasing the minimum slab
    order can decrease the locking overhead. The bigger the slab order, the
    less movement of pages between the per-CPU and partial lists, and the
    better SLUB will scale.

    H. Slab merging

    We often have slab caches with similar parameters. SLUB detects those
    at boot and merges them into the corresponding general caches. This
    leads to more effective memory use. About 50% of all caches can
    be eliminated through slab merging. This will also decrease
    slab fragmentation because partially allocated slabs can be filled
    up again. Slab merging can be switched off by specifying
    slub_nomerge at boot.

    Note that merging can expose heretofore unknown bugs in the kernel
    because corrupted objects may now be placed differently and corrupt
    different neighboring objects. Enable sanity checks to find those.

    I. Diagnostics

    The current slab diagnostics are difficult to use and require a
    recompilation of the kernel. SLUB contains debugging code that
    is always available (but is kept out of the hot code paths).
    SLUB diagnostics can be enabled via the "slub_debug" option.
    Parameters can be specified to select a single slab cache or a
    group of slab caches for diagnostics. This means that the system
    runs with its usual performance, and it is much more likely that
    race conditions can be reproduced.

    J. Resiliency

    If basic sanity checks are on, then SLUB is capable of detecting
    common error conditions and recovering as well as possible to allow
    the system to continue.

    K. Tracing

    Tracing can be enabled via the slub_debug=T, option
    during boot. SLUB will then log all actions on that slab cache
    and dump the object contents on free.

    L. On demand DMA cache creation.

    Generally, DMA caches are not needed. If a kmalloc with __GFP_DMA is
    used, then only the single slab cache that is needed is created.
    For systems that have no ZONE_DMA requirement, the support is
    eliminated completely.

    M. Performance increase

    Some benchmarks have shown speed improvements on kernbench in the
    range of 5-10%. The locking overhead of SLUB is based on the
    underlying base allocation size. If we can reliably allocate
    larger-order pages then it is possible to increase SLUB's
    performance much further. The anti-fragmentation patches may
    enable further performance increases.

    Tested on:
    i386 UP + SMP, x86_64 UP + SMP + NUMA emulation, IA64 NUMA + Simulator

    SLUB Boot options

    slub_nomerge Disable merging of slabs
    slub_min_order=x Require a minimum order for slab caches. This
    increases the managed chunk size and therefore
    reduces meta data and locking overhead.
    slub_min_objects=x Minimum objects per slab. Default is 8.
    slub_max_order=x Avoid generating slabs larger than order specified.
    slub_debug Enable all diagnostics for all caches
    slub_debug= Enable selective options for all caches
    slub_debug=, Enable selective options for a certain set of
    caches

    Available Debug options
    F Double Free checking, sanity and resiliency
    R Red zoning
    P Object / padding poisoning
    U Track last free / alloc
    T Trace all allocs / frees (only use for individual slabs).
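
    For example, combining the options above (the cache name is purely
    hypothetical, shown only to illustrate the syntax):

        slub_debug=FR            Double-free checking plus red zoning for all caches
        slub_debug=FRU,mycache   The same plus last-alloc/free tracking, restricted
                                 to a single cache ("mycache" is a hypothetical name)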

    To use SLUB: Apply this patch and then select SLUB as the default slab
    allocator.

    [hugh@veritas.com: fix an oops-causing locking error]
    [akpm@linux-foundation.org: various stupid cleanups and small fixes]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

03 May, 2007

1 commit

  • On x86-64, kernel memory freed after init can be entirely unmapped instead
    of just getting 'poisoned' by overwriting with a debug pattern.

    On i386 and x86-64 (under CONFIG_DEBUG_RODATA), kernel text and bug table
    can also be write-protected.

    Compared to the first version, this one prevents re-creating deleted
    mappings in the kernel image range on x86-64 if those were removed
    previously. This, together with the original changes, prevents temporarily
    having inconsistent mappings when cacheability attributes are being
    changed on such pages (e.g. from AGP code). While such duplicate
    mappings don't exist on i386, the same change is made there too, both for
    consistency and because checking pte_present() before using various other
    pte_XXX functions is a requirement anyway. At the same time, the i386 code
    is adjusted to use pte_huge() instead of open-coding it.

    AK: split out cpa() changes

    Signed-off-by: Jan Beulich
    Signed-off-by: Andi Kleen

    Jan Beulich
     

28 Jun, 2006

3 commits

  • Add more poison values to include/linux/poison.h. It's not clear to me
    whether some others should be added or not, so I haven't added any of
    these:

    ./include/linux/libata.h:#define ATA_TAG_POISON 0xfafbfcfdU
    ./arch/ppc/8260_io/fcc_enet.c:1918: memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
    ./drivers/usb/mon/mon_text.c:429: memset(mem, 0xe5, sizeof(struct mon_event_text));
    ./drivers/char/ftape/lowlevel/ftape-ctl.c:738: memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);
    ./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Update two drivers to use poison.h.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Localize poison values into one header file for better documentation,
    easier and quicker debugging, and so that the same values won't be used
    for multiple purposes.

    Use these constants in core arch., mm, driver, and fs code.
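
    As a sketch of what that buys (standalone C; the 0x6b byte mirrors the
    slab free-poison pattern and should be treated as illustrative), callers
    name the pattern they mean instead of repeating a magic number:

        #include <stdio.h>
        #include <string.h>

        /* One shared, documented definition instead of ad-hoc magic bytes
         * scattered through each subsystem. */
        #define POISON_FREE_DEMO 0x6b

        int main(void)
        {
                unsigned char buf[16];

                /* Before consolidation: memset(buf, 0x6b, sizeof(buf));
                 * after: the intent is spelled out by the constant. */
                memset(buf, POISON_FREE_DEMO, sizeof(buf));
                printf("buf[0] = %#x\n", buf[0]);
                return 0;
        }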

    Signed-off-by: Randy Dunlap
    Acked-by: Matt Mackall
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: "David S. Miller"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap