Eric Lee / smarc-fsl-linux-kernel

27 Jul, 2011

2 commits

37e7b5f15 cpumask: alloc_cpumask_var() use NUMA_NO_NODE

NUMA_NO_NODE and numa_node_id() have different meanings. NUMA_NO_NODE is
obviously the recommended fallback.

Signed-off-by: KOSAKI Motohiro
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2011-07-27 07:49:44 +0800
95918f4a7 cpumask: convert for_each_cpumask() with for_each_cpu()

Adapt to the new API style.

Signed-off-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2011-07-27 07:49:44 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have a fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

12 Jun, 2009

1 commit

38c7fed2f x86: remove some alloc_bootmem_cpumask_var calling

Now that we set up the slab allocator earlier, we can get rid of some
alloc_bootmem_cpumask_var() calls in boot code.

Cc: Ingo Molnar
Cc: Johannes Weiner
Cc: Linus Torvalds
Signed-off-by: Yinghai Lu
Signed-off-by: Pekka Enberg

Yinghai Lu
2009-06-12 00:27:07 +0800

09 Jun, 2009

1 commit

0281b5dc0 cpumask: introduce zalloc_cpumask_var

So that callers can get a cpumask_var that is already cleared, as if by cpumask_clear().
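
As a rough userspace illustration of the idea (not the kernel implementation), a zeroing allocator saves each caller from pairing an allocation with a separate clear. The names `mask_alloc`, `mask_zalloc` and `MASK_WORDS` are hypothetical, and `malloc()`/`calloc()` stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MASK_WORDS 4	/* hypothetical mask size, in unsigned longs */

/* Allocate a mask; its contents are indeterminate until the caller clears it. */
static bool mask_alloc(unsigned long **mask)
{
	assert(mask != NULL);
	*mask = malloc(MASK_WORDS * sizeof(unsigned long));
	return *mask != NULL;
}

/* Allocate a mask with every bit already cleared. */
static bool mask_zalloc(unsigned long **mask)
{
	assert(mask != NULL);
	*mask = calloc(MASK_WORDS, sizeof(unsigned long));
	return *mask != NULL;
}
```

A caller that previously needed an allocation followed by an explicit clear can use the zeroing variant alone.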

Signed-off-by: Yinghai Lu
Signed-off-by: Rusty Russell

Yinghai Lu
2009-06-09 21:00:26 +0800

03 Apr, 2009

1 commit

4f032ac41 cpumask: fix slab corruption caused by alloc_cpumask_var_node()

Fix slab corruption caused by alloc_cpumask_var_node() overwriting the
tail end of an off-stack cpumask.

The function zeros out cpumask bits beyond the last possible cpu. The
starting point for zeroing should be the beginning of the mask offset by a
byte count derived from the number of possible cpus. The offset was
calculated in bits instead of bytes. This resulted in overwriting the end
of the cpumask.
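
The bits-versus-bytes distinction can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: `zero_mask_tail`, `max_bits` and `nr_ids` are hypothetical names, with `max_bits` playing the role of NR_CPUS and `nr_ids` the number of possible cpus. The point is that the bit count is converted to a byte offset before it is added to the start of the mask.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG		(CHAR_BIT * sizeof(unsigned long))
#define BITS_TO_LONGS(n)	(((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/*
 * Zero the whole words of 'mask' that lie beyond the words needed to
 * hold 'nr_ids' bits. 'max_bits' is the capacity of the mask in bits.
 */
static void zero_mask_tail(unsigned long *mask, size_t max_bits, size_t nr_ids)
{
	unsigned char *ptr = (unsigned char *)mask;
	/* Both quantities are in bytes, never in bits. */
	size_t total = BITS_TO_LONGS(max_bits) * sizeof(unsigned long);
	size_t used = BITS_TO_LONGS(nr_ids) * sizeof(unsigned long);

	assert(mask != NULL && nr_ids <= max_bits);
	if (used < total)
		memset(ptr + used, 0, total - used);
}
```

Using the bit count itself as the offset would start the `memset()` at the wrong address, which is the class of error the commit describes.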

Signed-off-by: Jack Steiner
Acked-by: Mike Travis
Acked-by: Ingo Molnar
Cc: Rusty Russell
Cc: Stephen Rothwell
Cc: [2.6.29.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jack Steiner
2009-04-03 10:05:11 +0800

01 Jan, 2009

2 commits

2a5300803 cpumask: zero extra bits in alloc_cpumask_var_node

Impact: extra safety checks during transition

When CONFIG_CPUMASKS_OFFSTACK is set, the new cpumask_ operators only
use bits up to nr_cpu_ids, not NR_CPUS. Using the old cpus_ operators
on these masks can mean accessing undefined bits.

After some discussion, Mike and I decided to err on the side of caution;
we zero the "undefined" bits in alloc_cpumask_var_node() until all the
old cpumask functions are removed.

Signed-off-by: Rusty Russell

Rusty Russell
2009-01-01 07:42:30 +0800
e9690a6e4 cpumask: fix bogus kernel-doc

Impact: fix kernel-doc

alloc_bootmem_cpumask_var() returns void.

Signed-off-by: Li Zefan
Signed-off-by: Rusty Russell

Li Zefan
2009-01-01 07:42:13 +0800

19 Dec, 2008

2 commits

ec26b8058 cpumask: documentation for cpumask_var_t

Impact: New kerneldoc comments

Additional documentation added to all the alloc_cpumask and free_cpumask
functions.

Signed-off-by: Mike Travis
Signed-off-by: Rusty Russell (minor additions)

Mike Travis
2008-12-19 14:26:52 +0800
7b4967c53 cpumask: Add alloc_cpumask_var_node()

Impact: New API

This will be needed in x86 code to allocate the domain and old_domain
cpumasks on the same node as where the containing irq_cfg struct is
allocated.

(Also fixes double-dump_stack on rare CONFIG_DEBUG_PER_CPU_MAPS case)

Signed-off-by: Mike Travis
Signed-off-by: Rusty Russell (re-impl alloc_cpumask_var)

Mike Travis
2008-12-19 14:26:37 +0800

10 Nov, 2008

1 commit

984f2f377 cpumask: introduce new API, without changing anything, v3

Impact: cleanup

Clean up based on feedback from Andrew Morton and others:

- change to inline functions instead of macros
- add __init to bootmem method
- add a missing debug check

Signed-off-by: Rusty Russell
Signed-off-by: Ingo Molnar

Rusty Russell
2008-11-10 04:09:54 +0800

07 Nov, 2008

1 commit

cd83e42c6 cpumask: new API, v2

- add cpumask_of()
- add free_bootmem_cpumask_var()

Signed-off-by: Rusty Russell
Signed-off-by: Ingo Molnar

Rusty Russell
2008-11-07 19:52:30 +0800

06 Nov, 2008

1 commit

2d3854a37 cpumask: introduce new API, without changing anything

Impact: introduce new APIs

We want to deprecate cpumasks on the stack, as we are headed for
gynormous numbers of CPUs. Eventually, we want to head towards an
undefined 'struct cpumask' so they can never be declared on stack.

1) New cpumask functions which take pointers instead of copies.
(cpus_* -> cpumask_*)

2) Several new helpers to reduce requirements for temporary cpumasks
(cpumask_first_and, cpumask_next_and, cpumask_any_and)

3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
(cpumask_var_t, alloc_cpumask_var and free_cpumask_var)

4) 'struct cpumask' for explicitness and to mark new-style code.

5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
not NR_CPUS for time efficiency and for smaller dynamic allocations
in future.

6) cpumask_copy() so we can allocate less than a full cpumask eventually
(for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
definition eventually.

7) work_on_cpu() helper for doing task on a CPU, rather than saving old
cpumask for current thread and manipulating it.

8) smp_call_function_many() which is smp_call_function_mask() except
taking a cpumask pointer.

Note that this patch simply introduces the new functions and leaves
the obsolescent ones in place. This is to simplify the transition
patches.
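
The on-stack versus off-stack idea in item 3 can be sketched in userspace C. This is a minimal sketch under stated assumptions, not the kernel definitions: `idmask`, `idmask_var_t`, `MASK_OFFSTACK`, `MAX_IDS` and the helper names are all hypothetical, and `malloc()` stands in for the kernel allocator. With the macro undefined the variable is a one-element array that holds its own storage; with it defined the variable is only a pointer. Callers are written the same way in both cases.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IDS		256	/* hypothetical stand-in for NR_CPUS */
#define BITS_PER_LONG	(CHAR_BIT * sizeof(unsigned long))
#define MASK_LONGS	((MAX_IDS + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct idmask {
	unsigned long bits[MASK_LONGS];
};

#ifdef MASK_OFFSTACK
/* Large configurations: the variable is a pointer, storage is on the heap. */
typedef struct idmask *idmask_var_t;

static bool alloc_idmask_var(idmask_var_t *mask)
{
	*mask = malloc(sizeof(struct idmask));
	return *mask != NULL;
}

static void free_idmask_var(idmask_var_t mask)
{
	free(mask);
}
#else
/* Small configurations: a one-element array that decays to a pointer. */
typedef struct idmask idmask_var_t[1];

static bool alloc_idmask_var(idmask_var_t *mask)
{
	(void)mask;	/* storage already lives in the variable */
	return true;
}

static void free_idmask_var(idmask_var_t mask)
{
	(void)mask;	/* nothing to release */
}
#endif

/* Operations take a pointer rather than a copy of the mask. */
static void idmask_clear_all(struct idmask *mask)
{
	memset(mask->bits, 0, sizeof(mask->bits));
}

static void idmask_set(unsigned int id, struct idmask *mask)
{
	assert(id < MAX_IDS);
	mask->bits[id / BITS_PER_LONG] |= 1UL << (id % BITS_PER_LONG);
}

static bool idmask_test(unsigned int id, const struct idmask *mask)
{
	assert(id < MAX_IDS);
	return (mask->bits[id / BITS_PER_LONG] >> (id % BITS_PER_LONG)) & 1UL;
}
```

Because the helpers only ever see a pointer, switching the storage strategy does not require touching the call sites.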

Signed-off-by: Rusty Russell
Signed-off-by: Ingo Molnar

Rusty Russell
2008-11-06 16:05:33 +0800

24 May, 2008

1 commit

41df0d61c x86: Add performance variants of cpumask operators

* Increase performance for systems with large count NR_CPUS by limiting
the range of the cpumask operators that loop over the bits in a cpumask_t
variable. This removes a large amount of wasted cpu cycles.

* Add performance variants of the cpumask operators:

int cpus_weight_nr(mask)          Same using nr_cpu_ids instead of NR_CPUS
int first_cpu_nr(mask)            Number lowest set bit, or nr_cpu_ids
int next_cpu_nr(cpu, mask)        Next cpu past 'cpu', or nr_cpu_ids
for_each_cpu_mask_nr(cpu, mask)   for-loop cpu over mask using nr_cpu_ids

* Modify following to use performance variants:

#define num_online_cpus() cpus_weight_nr(cpu_online_map)
#define num_possible_cpus() cpus_weight_nr(cpu_possible_map)
#define num_present_cpus() cpus_weight_nr(cpu_present_map)

#define for_each_possible_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)
#define for_each_online_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)
#define for_each_present_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)

* Comment added to include/linux/cpumask.h:

Note: The alternate operations with the suffix "_nr" are used
to limit the range of the loop to nr_cpu_ids instead of
NR_CPUS when NR_CPUS > 64 for performance reasons.
If NR_CPUS is
Cc: Christoph Lameter
Reviewed-by: Paul Jackson
Reviewed-by: Christoph Lameter
Signed-off-by: Mike Travis
Signed-off-by: Ingo Molnar
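
The effect of bounding the operators by a runtime limit can be sketched in userspace C. The names `next_bit`, `for_each_id` and `weight` are hypothetical and are not the kernel macros; `limit` plays the role of either NR_CPUS or nr_cpu_ids. The same loop does less work when the caller passes the smaller runtime value.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG	(CHAR_BIT * sizeof(unsigned long))

/* Return the first set bit at or after 'start', or 'limit' if there is none. */
static size_t next_bit(const unsigned long *mask, size_t limit, size_t start)
{
	for (size_t i = start; i < limit; i++)
		if (mask[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			return i;
	return limit;
}

/* Visit every set bit below 'limit'. */
#define for_each_id(id, mask, limit)				\
	for ((id) = next_bit((mask), (limit), 0);		\
	     (id) < (limit);					\
	     (id) = next_bit((mask), (limit), (id) + 1))

/* Count the set bits below 'limit'. */
static size_t weight(const unsigned long *mask, size_t limit)
{
	size_t id, n = 0;

	assert(mask != NULL);
	for_each_id(id, mask, limit)
		n++;
	return n;
}
```

With a mask sized for thousands of ids but only a handful possible, passing the small limit gives the same count while scanning far fewer bits.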

Mike Travis
2008-05-24 00:23:38 +0800

08 May, 2007

1 commit

476f35348 Safer nr_node_ids and nr_node_ids determination and initial values

The nr_cpu_ids value is currently only calculated in smp_init. However, it
may be needed before (SLUB needs it on kmem_cache_init!) and other kernel
components may also want to allocate dynamically sized per cpu array before
smp_init. So move the determination of possible cpus into sched_init()
where we already loop over all possible cpus early in boot.

Also initialize both nr_node_ids and nr_cpu_ids with the highest value they
could take. If we have accidental users before these values are determined
then the current value of 0 may cause too small per cpu and per node arrays
to be allocated. If it is set to the maximum possible then we only waste
some memory for early boot users.
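
The reasoning about initial values can be shown with a small userspace sketch. All names here (`MAX_IDS`, `nr_ids`, `setup_nr_ids`, `per_id_array_size`) are hypothetical stand-ins, not kernel symbols. A count that starts at the maximum makes early callers over-allocate, whereas a count that starts at 0 would make them allocate nothing.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 64	/* hypothetical compile-time maximum */

/* Starts at the maximum so early users over-allocate, never under-allocate. */
static size_t nr_ids = MAX_IDS;

/* Called once the possible set is known; 'highest' is the highest possible id. */
static void setup_nr_ids(size_t highest)
{
	assert(highest < MAX_IDS);
	nr_ids = highest + 1;
}

/* Size in bytes of a per-id array whose elements are 'elem_size' bytes each. */
static size_t per_id_array_size(size_t elem_size)
{
	return nr_ids * elem_size;
}
```

An array sized before `setup_nr_ids()` runs is larger than necessary but still safe to index with any valid id.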

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-05-08 03:12:51 +0800

21 Feb, 2007

1 commit

53b8a315b [PATCH] Convert highest_possible_processor_id to nr_cpu_ids

We frequently need the maximum number of possible processors in order to
allocate arrays for all processors. So far this was done using
highest_possible_processor_id(). However, we do need the number of
processors not the highest id. Moreover the number was so far dynamically
calculated on each invocation. The number of possible processors does not
change when the system is running. We can therefore calculate that number
once.
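
The difference between a highest id and a count matters when sizing arrays. The sketch below is a userspace illustration that assumes the count is derived as the highest set bit plus one; `count_ids` is a hypothetical name. The result would be computed once and cached, not recomputed on every use.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG	(CHAR_BIT * sizeof(unsigned long))

/*
 * Number of slots needed to index every set bit in 'mask':
 * the highest set bit plus one, or 0 if the mask is empty.
 */
static size_t count_ids(const unsigned long *mask, size_t nbits)
{
	assert(mask != NULL);
	for (size_t i = nbits; i > 0; i--)
		if (mask[(i - 1) / BITS_PER_LONG] &
		    (1UL << ((i - 1) % BITS_PER_LONG)))
			return i;
	return 0;
}
```

An array indexed by id needs `count_ids()` elements; allocating only "highest id" elements would be one short.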

Signed-off-by: Christoph Lameter
Cc: Frederik Deweerdt
Cc: Neil Brown
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-02-21 09:10:13 +0800

21 Oct, 2006

1 commit

6220ec784 [PATCH] highest_possible_node_id() linkage fix

Quoting Adrian:

- net/sunrpc/svc.c uses highest_possible_node_id()

- include/linux/nodemask.h says highest_possible_node_id() is
out-of-line #if MAX_NUMNODES > 1

- the out-of-line highest_possible_node_id() is in lib/cpumask.c

- lib/Makefile: lib-$(CONFIG_SMP) += cpumask.o
CONFIG_ARCH_DISCONTIGMEM_ENABLE=y, CONFIG_SMP=n, CONFIG_SUNRPC=y

-> highest_possible_node_id() is used in net/sunrpc/svc.c
CONFIG_NODES_SHIFT defined and > 0

-> include/linux/numa.h: MAX_NUMNODES > 1

-> compile error

The bug is not present on architectures where ARCH_DISCONTIGMEM_ENABLE
depends on NUMA (but m32r isn't the only affected architecture).

So move the function into page_alloc.c

Cc: Adrian Bunk
Cc: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-10-21 01:26:43 +0800

02 Oct, 2006

1 commit

0f532f386 [PATCH] cpumask: add highest_possible_node_id

cpumask: add highest_possible_node_id(), analogous to
highest_possible_processor_id().

[pj@sgi.com: fix typo]
Signed-off-by: Greg Banks
Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Greg Banks
2006-10-02 22:57:17 +0800

26 Mar, 2006

4 commits

96a9b4d31 [PATCH] cpumask: uninline any_online_cpu()

           text    data    bss     dec    hex filename
before: 3605597 1363528 363328 5332453 515de5 vmlinux
after:  3605295 1363612 363200 5332107 515c8b vmlinux

218 bytes saved.

Also, optimise any_online_cpu() out of existence on CONFIG_SMP=n.

This function seems inefficient. Can't we simply AND the two masks, then use
find_first_bit()?
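
The suggestion in the question above can be sketched in userspace C: AND the two masks a word at a time and return the first bit that survives. The name `first_and` is hypothetical, and this is an illustration of the approach, not the function the commit moved.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG	(CHAR_BIT * sizeof(unsigned long))

/* First bit set in both 'a' and 'b', or 'nbits' if they do not intersect. */
static size_t first_and(const unsigned long *a, const unsigned long *b,
			size_t nbits)
{
	size_t words = (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;

	assert(a != NULL && b != NULL);
	for (size_t w = 0; w < words; w++) {
		unsigned long both = a[w] & b[w];

		/* Shift the lowest surviving bit down to position 0. */
		for (size_t bit = 0; both != 0; bit++, both >>= 1) {
			if (both & 1UL) {
				size_t id = w * BITS_PER_LONG + bit;

				return id < nbits ? id : nbits;
			}
		}
	}
	return nbits;
}
```

Words whose intersection is zero are skipped with a single comparison, so no per-bit test is made against each mask separately.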

Cc: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-03-26 00:23:00 +0800
863028207 [PATCH] cpumask: uninline highest_possible_processor_id()

Shrinks the only caller (net/bridge/netfilter/ebtables.c) by 174 bytes.

Also, optimise highest_possible_processor_id() out of existence on
CONFIG_SMP=n.

Cc: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-03-26 00:23:00 +0800
3d18bd74a [PATCH] cpumask: uninline next_cpu()

           text    data    bss     dec    hex filename
before: 3488027 1322496 360128 5170651 4ee5db vmlinux
after:  3485112 1322480 359968 5167560 4ed9c8 vmlinux

2931 bytes saved

Cc: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-03-26 00:22:59 +0800
ccb46000f [PATCH] cpumask: uninline first_cpu()

           text    data    bss     dec    hex filename
before: 3490577 1322408 360000 5172985 4eeef9 vmlinux
after:  3488027 1322496 360128 5170651 4ee5db vmlinux

Cc: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-03-26 00:22:59 +0800