18 Mar, 2016

1 commit

  • Most of the mm subsystem uses pr_<level>, so make it consistent.

    Miscellanea:

    - Realign arguments
    - Add missing newline to format
    - kmemleak-test.c has a "kmemleak: " prefix added to the
      "Kmemleak testing" logging message via pr_fmt (sketched below)

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

01 Jul, 2015

2 commits

  • mminit_verify_page_links() is an extremely paranoid check that was
    introduced when memory initialisation was being heavily reworked.
    Profiles indicated that up to 10% of parallel memory initialisation was
    spent on checking this for every page. The cost could be reduced but in
    practice this check only found problems very early during the
    initialisation rewrite and has found nothing since. This patch removes the
    expensive, unnecessary check.

    Signed-off-by: Mel Gorman
    Tested-by: Nate Zimmer
    Tested-by: Waiman Long
    Tested-by: Daniel J Blueman
    Acked-by: Pekka Enberg
    Cc: Robin Holt
    Cc: Nate Zimmer
    Cc: Dave Hansen
    Cc: Waiman Long
    Cc: Scott Norton
    Cc: "Luck, Tony"
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Only a subset of struct pages are initialised at the moment. When this
    patch is applied, kswapd initialises the remaining struct pages in
    parallel.

    This should boot faster by spreading the work to multiple CPUs and
    initialising data that is local to the CPU. The user-visible effect on
    large machines is that free memory will appear to rapidly increase early
    in the lifetime of the system until kswapd reports that all memory is
    initialised in the kernel log. Once initialised, there should be no other
    user-visible effects.
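
    A schematic sketch of the idea (function and call shape assumed from the
    series, not the exact patch):

        /* Each node's kswapd initialises that node's remaining struct
         * pages before entering its reclaim loop, so the work runs in
         * parallel across nodes and stays NUMA-local. */
        static int kswapd(void *p)
        {
                pg_data_t *pgdat = (pg_data_t *)p;

                deferred_init_memmap(pgdat->node_id);

                /* ... normal kswapd balancing loop follows ... */
        }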

    Signed-off-by: Mel Gorman
    Tested-by: Nate Zimmer
    Tested-by: Waiman Long
    Tested-by: Daniel J Blueman
    Acked-by: Pekka Enberg
    Cc: Robin Holt
    Cc: Nate Zimmer
    Cc: Dave Hansen
    Cc: Waiman Long
    Cc: Scott Norton
    Cc: "Luck, Tony"
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

13 Feb, 2015

2 commits

  • mminit_loglevel is only referenced from __init and __meminit functions, so
    we can mark it __meminitdata.
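
    In sketch form, the declaration simply gains the annotation:

        /* placed in .meminit.data; freed after boot when memory
         * hotplug is not configured */
        int __meminitdata mminit_loglevel;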

    Signed-off-by: Rasmus Villemoes
    Cc: Vlastimil Babka
    Cc: Rik van Riel
    Cc: Joonsoo Kim
    Cc: David Rientjes
    Cc: Vishnu Pratap Singh
    Cc: Pintu Kumar
    Cc: Michal Nazarewicz
    Cc: Mel Gorman
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Tim Chen
    Cc: Hugh Dickins
    Cc: Li Zefan
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • The only caller of mminit_verify_zonelist is build_all_zonelists_init,
    which is annotated with __init, so it should be safe to also mark the
    former as __init, saving ~400 bytes of .text.
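
    Sketch of the change:

        /* __init places the function in .init.text, which is freed
         * once boot completes */
        void __init mminit_verify_zonelist(void);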

    Signed-off-by: Rasmus Villemoes
    Cc: Vlastimil Babka
    Cc: Rik van Riel
    Cc: Joonsoo Kim
    Cc: David Rientjes
    Cc: Vishnu Pratap Singh
    Cc: Pintu Kumar
    Cc: Michal Nazarewicz
    Cc: Mel Gorman
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Tim Chen
    Cc: Hugh Dickins
    Cc: Li Zefan
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     

28 Jan, 2014

1 commit

  • Commit da29bd36224b ("mm/mm_init.c: make creation of the mm_kobj happen
    earlier than device_initcall") changed to pure_initcall(mm_sysfs_init).

    That's too early: mm_sysfs_init() depends on core_initcall(ksysfs_init)
    to have made the kernel_kobj directory "kernel" in which to create "mm".

    Make it postcore_initcall(mm_sysfs_init). We could use core_initcall(),
    and depend upon Makefile link order kernel/ mm/ fs/ ipc/ security/ ...
    as core_initcall(debugfs_init) and core_initcall(securityfs_init) do;
    but better not.
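
    The fix, in sketch form (postcore, level 2, runs after core_initcall
    level 1, by which point ksysfs_init() has created kernel_kobj):

        postcore_initcall(mm_sysfs_init);  /* was: pure_initcall(mm_sysfs_init) */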

    Signed-off-by: Hugh Dickins
    Acked-by: Paul Gortmaker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

24 Jan, 2014

1 commit

  • The use of __initcall is to be eventually replaced by choosing one from
    the prioritized groupings laid out in init.h header:

    pure_initcall 0
    core_initcall 1
    postcore_initcall 2
    arch_initcall 3
    subsys_initcall 4
    fs_initcall 5
    device_initcall 6
    late_initcall 7

    In the interim, all __initcall uses are mapped onto device_initcall, which,
    as can be seen above, comes quite late in the ordering.

    Currently the mm_kobj is created with __initcall in mm_sysfs_init().
    This means that any other initcalls that want to reference the mm_kobj
    have to be device_initcall (or later), otherwise we will, for example,
    trip the BUG_ON(!kobj) in sysfs's internal_create_group(). This
    unfairly restricts those users; for example something that clearly makes
    sense to be an arch_initcall will not be able to choose that.

    However, upon examination, it is only this way for historical reasons
    (i.e. simply not reprioritized yet). We see that sysfs is ready quite
    early in init/main.c via:

    vfs_caches_init
      |_ mnt_init
         |_ sysfs_init

    well ahead of the processing of the prioritized calls listed above.

    So we can recategorize mm_sysfs_init to be a pure_initcall, which in
    turn allows any mm_kobj initcall users a wider range (1 --> 7) of
    initcall priorities to choose from.
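
    A sketch of the recategorised initcall, following the shape of
    mm_sysfs_init() in mm/mm_init.c (reconstructed; details may differ):

        static int __init mm_sysfs_init(void)
        {
                mm_kobj = kobject_create_and_add("mm", kernel_kobj);
                if (!mm_kobj)
                        return -ENOMEM;
                return 0;
        }
        pure_initcall(mm_sysfs_init);   /* was: __initcall(mm_sysfs_init) */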

    Signed-off-by: Paul Gortmaker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Gortmaker
     

09 Oct, 2013

2 commits

  • Change the per-page last fault tracking to use cpu,pid instead of
    nid,pid. This will allow us to try to look up the alternate task more
    easily. Note that even though it is the cpu that is stored in the page
    flags, the mpol_misplaced decision is still based on the node.
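
    A sketch of the packing, using the mask/shift names this change adds to
    include/linux/mm.h (field widths depend on the kernel configuration):

        #define LAST__PID_SHIFT 8
        #define LAST__PID_MASK  ((1 << LAST__PID_SHIFT) - 1)
        #define LAST__CPU_SHIFT NR_CPUS_BITS
        #define LAST__CPU_MASK  ((1 << LAST__CPU_SHIFT) - 1)

        static inline int cpu_pid_to_cpupid(int cpu, int pid)
        {
                return ((cpu & LAST__CPU_MASK) << LAST__PID_SHIFT) |
                       (pid & LAST__PID_MASK);
        }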

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mel Gorman
    Reviewed-by: Rik van Riel
    Cc: Andrea Arcangeli
    Cc: Johannes Weiner
    Cc: Srikar Dronamraju
    Link: http://lkml.kernel.org/r/1381141781-10992-43-git-send-email-mgorman@suse.de
    [ Fixed build failure on 32-bit systems. ]
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Ideally it would be possible to distinguish between NUMA hinting faults that
    are private to a task and those that are shared. If treated identically
    there is a risk that shared pages bounce between nodes depending on
    the order they are referenced by tasks. Ultimately what is desirable is
    that task private pages remain local to the task while shared pages are
    interleaved between sharing tasks running on different nodes to give good
    average performance. This is further complicated by THP as even
    applications that partition their data may not be partitioning on a huge
    page boundary.

    To start with, this patch assumes that multi-threaded or multi-process
    applications partition their data and that, in general, the private
    accesses are more important for cpu->memory locality. Also,
    no new infrastructure is required to treat private pages properly but
    interleaving for shared pages requires additional infrastructure.

    To detect private accesses the pid of the last accessing task is required
    but the storage requirements are high. This patch borrows heavily from
    Ingo Molnar's patch "numa, mm, sched: Implement last-CPU+PID hash tracking"
    to encode some bits from the last accessing task in the page flags as
    well as the node information. Collisions will occur but it is better than
    just depending on the node information. Node information is then used to
    determine if a page needs to migrate. The PID information is used to detect
    private/shared accesses. The preferred NUMA node is selected based on where
    the maximum number of approximately private faults were measured. Shared
    faults are not taken into consideration for a few reasons.

    First, if there are many tasks sharing the page then they'll all move
    towards the same node. The node will be compute overloaded and then
    scheduled away later only to bounce back again. Alternatively the shared
    tasks would just bounce around nodes because the fault information is
    effectively noise. Either way accounting for shared faults the same as
    private faults can result in lower performance overall.

    The second reason is based on a hypothetical workload that has a small
    number of very important, heavily accessed private pages but a large shared
    array. The shared array would dominate the number of faults and be selected
    as a preferred node even though it's the wrong decision.

    The third reason is that multiple threads in a process will race each
    other to fault the shared page, making the fault information unreliable.
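
    A hedged sketch of the private/shared classification (helper name
    hypothetical; LAST__PID_MASK as in the cpu,pid entry above):

        /* A fault counts as task-private when the pid bits recorded in
         * the page flags match the faulting task; occasional collisions
         * are tolerated as noise. */
        static bool numa_fault_is_private(int last_pid_bits, int faulting_pid)
        {
                return last_pid_bits == (faulting_pid & LAST__PID_MASK);
        }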

    Signed-off-by: Mel Gorman
    [ Fix compilation error when !NUMA_BALANCING. ]
    Reviewed-by: Rik van Riel
    Cc: Andrea Arcangeli
    Cc: Johannes Weiner
    Cc: Srikar Dronamraju
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1381141781-10992-30-git-send-email-mgorman@suse.de
    Signed-off-by: Ingo Molnar

    Mel Gorman
     

04 Jul, 2013

1 commit

  • Currently the per cpu counter's batch size for memory accounting is
    configured as twice the number of cpus in the system. However, for
    systems with very large memory, it is more appropriate to make it
    proportional to the memory size per cpu in the system.

    For example, for an x86_64 system with 64 cpus and 128 GB of memory, the
    batch size is only 2*64 pages (0.5 MB). So any memory accounting
    changes of more than 0.5MB will overflow the per cpu counter into the
    global counter. Instead, for the new scheme, the batch size is
    configured to be 0.4% of the memory/cpu = 8MB (128 GB / 64 / 256), which is
    more in line with the memory size.
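
    A sketch of the sizing, reconstructed from the description above (close
    to the patch's mm_compute_batch(), but treat the details as approximate):

        static void __meminit mm_compute_batch(void)
        {
                u64 memsized_batch;
                s32 nr = num_present_cpus();
                s32 batch = max_t(s32, nr * 2, 32);  /* old floor: 2*ncpus */

                /* 0.4% of (total memory / #cpus), in pages: 1/256th */
                memsized_batch = min_t(u64, (totalram_pages / nr) / 256,
                                       0x7fffffff);

                vm_committed_as_batch = max_t(s32, memsized_batch, batch);
        }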

    I've done a repeated brk test of 800KB (from will-it-scale test suite)
    with 80 concurrent processes on a 4 socket Westmere machine with a total
    of 40 cores. Without the patch, about 80% of cpu is spent on spin-lock
    contention within the vm_committed_as counter. With the patch, there's
    a 73x speedup on the benchmark and the lock contention drops off almost
    entirely.

    [akpm@linux-foundation.org: fix section mismatch]
    Signed-off-by: Tim Chen
    Cc: Tejun Heo
    Cc: Eric Dumazet
    Cc: Dave Hansen
    Cc: Andi Kleen
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tim Chen
     

24 Feb, 2013

1 commit

  • Answering the question "how much space remains in the page->flags" is
    time-consuming. mminit_loglevel can help answer the question but it
    does not take last_nid information into account. This patch corrects that
    and, while there, corrects the messages related to page flag usage,
    pgshifts and node/zone ids. When applied, the relevant output looks
    something like this, though it will depend on the kernel configuration:

    mminit::pageflags_layout_widths Section 0 Node 9 Zone 2 Lastnid 9 Flags 25
    mminit::pageflags_layout_shifts Section 19 Node 9 Zone 2 Lastnid 9
    mminit::pageflags_layout_pgshifts Section 0 Node 55 Zone 53 Lastnid 44
    mminit::pageflags_layout_nodezoneid Node/Zone ID: 64 -> 53
    mminit::pageflags_layout_usage location: 64 -> 44 layout 44 -> 25 unused 25 -> 0 page-flags

    Signed-off-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

06 Aug, 2008

1 commit

  • gcc-3.2:

    mm/mm_init.c:77:1: directives may not be used inside a macro argument
    mm/mm_init.c:76:47: unterminated argument list invoking macro "mminit_dprintk"
    mm/mm_init.c: In function `mminit_verify_pageflags_layout':
    mm/mm_init.c:80: `mminit_dprintk' undeclared (first use in this function)
    mm/mm_init.c:80: (Each undeclared identifier is reported only once
    mm/mm_init.c:80: for each function it appears in.)
    mm/mm_init.c:80: syntax error before numeric constant

    Also fix a typo in a comment.
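
    The offending construct, in sketch form (illustrative): gcc-3.2 cannot
    handle preprocessor directives inside a macro's argument list, so the
    fix hoists the conditional out of the invocation:

        /* Rejected by gcc-3.2: #ifdef inside the argument list */
        mminit_dprintk(MMINIT_TRACE, "pageflags_layout",
        #ifdef NODE_NOT_IN_PAGE_FLAGS
                       "Node not in page flags\n"
        #else
                       "Node in page flags\n"
        #endif
                       );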

    Reported-by: Adrian Bunk
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

25 Jul, 2008

5 commits

  • Add a kobject to create /sys/kernel/mm when sysfs is mounted. The kobject
    will exist regardless. This will allow for the hugepage related sysfs
    directories to exist under the mm "subsystem" directory. Add an ABI file
    appropriately.

    [kosaki.motohiro@jp.fujitsu.com: fix build]
    Signed-off-by: Nishanth Aravamudan
    Cc: Nick Piggin
    Cc: Mel Gorman
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nishanth Aravamudan
     
  • Towards the goal of putting all core mm initialization in mm_init.c, I
    plan on putting the creation of an mm kobject in a function in that file.
    However, the file is currently only compiled if CONFIG_DEBUG_MEMORY_INIT
    is set. Remove this dependency, but put the code under an #ifdef on the
    same config option. This should result in no functional changes.
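
    In sketch form (the mm/Makefile change shown as a pseudo-diff in
    comments; exact lines assumed):

        /* mm/Makefile:
         *   -obj-$(CONFIG_DEBUG_MEMORY_INIT) += mm_init.o
         *   +obj-y += mm_init.o
         */

        /* mm/mm_init.c: the debug-only pieces stay guarded */
        #ifdef CONFIG_DEBUG_MEMORY_INIT
        /* mminit_loglevel, verification and tracing code */
        #endif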

    Signed-off-by: Nishanth Aravamudan
    Cc: Nick Piggin
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nishanth Aravamudan
     
  • This patch prints out the zonelists during boot for manual verification by the
    user if mminit_loglevel is MMINIT_VERIFY or higher.

    Signed-off-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: Andy Whitcroft
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Print out information on how the page flags are being used if
    mminit_loglevel is MMINIT_VERIFY or higher, and unconditionally perform
    sanity checks on the flags regardless of loglevel.

    When the page flags are updated with section, node and zone information, a
    check is made to ensure the values can be retrieved correctly. Finally we
    confirm that pfn_to_page and page_to_pfn are the correct inverse functions.
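
    The inverse-function confirmation reduces to a check of this shape
    (sketch, run for an initialised pfn):

        BUG_ON(page_to_pfn(pfn_to_page(pfn)) != pfn);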

    [akpm@linux-foundation.org: fix printk warnings]
    Signed-off-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: Andy Whitcroft
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Boot initialisation is very complex, with significant numbers of
    architecture-specific routines, hooks and code ordering. While significant
    amounts of the initialisation is architecture-independent, it trusts the data
    received from the architecture layer. This is a mistake, and has resulted in
    a number of difficult-to-diagnose bugs.

    This patchset adds some validation and tracing to memory initialisation. It
    also introduces a few basic defensive measures. The validation code can be
    explicitly disabled for embedded systems.

    This patch:

    Add additional debugging and verification code for memory initialisation.

    Once enabled, the verification checks are always run and, when required,
    additional debugging information may be output via the mminit_loglevel=
    command-line parameter.

    The verification code is placed in a new file mm/mm_init.c. Ideally other mm
    initialisation code will be moved here over time.
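
    A sketch of the logging interface this patch introduces (as in
    mm/internal.h; mminit_dprintk() is a printk-style macro gated on
    mminit_loglevel):

        enum mminit_level {
                MMINIT_WARNING,
                MMINIT_VERIFY,
                MMINIT_TRACE
        };

        /* Printed only when mminit_loglevel >= MMINIT_TRACE, e.g. after
         * booting with mminit_loglevel=2 (values mirror the enum): */
        mminit_dprintk(MMINIT_TRACE, "memmap_init",
                       "initialising map node %d\n", nid);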

    Signed-off-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: Andy Whitcroft
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman