01 Jan, 2009

2 commits

  • Impact: extra safety checks during transition

    When CONFIG_CPUMASKS_OFFSTACK is set, the new cpumask_ operators only
    use bits up to nr_cpu_ids, not NR_CPUS. Using the old cpus_ operators
    on these masks can mean accessing undefined bits.

    After some discussion, Mike and I decided to err on the side of caution;
    we zero the "undefined" bits in alloc_cpumask_var_node() until all the
    old cpumask functions are removed.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: fix kernel-doc

    alloc_bootmem_cpumask_var() returns avoid.

    Signed-off-by: Li Zefan
    Signed-off-by: Rusty Russell

    Li Zefan
     

19 Dec, 2008

2 commits

  • Impact: New kerneldoc comments

    Additional documentation added to all the alloc_cpumask and free_cpumask
    functions.

    Signed-off-by: Mike Travis
    Signed-off-by: Rusty Russell (minor additions)

    Mike Travis
     
  • Impact: New API

    This will be needed in x86 code to allocate the domain and old_domain
    cpumasks on the same node as where the containing irq_cfg struct is
    allocated.

    (Also fixes double-dump_stack on rare CONFIG_DEBUG_PER_CPU_MAPS case)

    Signed-off-by: Mike Travis
    Signed-off-by: Rusty Russell (re-impl alloc_cpumask_var)

    Mike Travis
     

10 Nov, 2008

1 commit


07 Nov, 2008

1 commit


06 Nov, 2008

1 commit

  • Impact: introduce new APIs

    We want to deprecate cpumasks on the stack, as we are headed for
    gynormous numbers of CPUs. Eventually, we want to head towards an
    undefined 'struct cpumask' so they can never be declared on stack.

    1) New cpumask functions which take pointers instead of copies.
    (cpus_* -> cpumask_*)

    2) Several new helpers to reduce requirements for temporary cpumasks
    (cpumask_first_and, cpumask_next_and, cpumask_any_and)

    3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
    (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)

    4) 'struct cpumask' for explicitness and to mark new-style code.

    5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
    not NR_CPUS for time efficiency and for smaller dynamic allocations
    in future.

    6) cpumask_copy() so we can allocate less than a full cpumask eventually
    (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
    definition eventually.

    7) work_on_cpu() helper for doing task on a CPU, rather than saving old
    cpumask for current thread and manipulating it.

    8) smp_call_function_many() which is smp_call_function_mask() except
    taking a cpumask pointer.

    Note that this patch simply introduces the new functions and leaves
    the obsolescent ones in place. This is to simplify the transition
    patches.

    Signed-off-by: Rusty Russell
    Signed-off-by: Ingo Molnar

    Rusty Russell
     

24 May, 2008

1 commit

  • * Increase performance for systems with large count NR_CPUS by limiting
    the range of the cpumask operators that loop over the bits in a cpumask_t
    variable. This removes a large amount of wasted cpu cycles.

    * Add performance variants of the cpumask operators:

    int cpus_weight_nr(mask) Same using nr_cpu_ids instead of NR_CPUS
    int first_cpu_nr(mask) Number lowest set bit, or nr_cpu_ids
    int next_cpu_nr(cpu, mask) Next cpu past 'cpu', or nr_cpu_ids
    for_each_cpu_mask_nr(cpu, mask) for-loop cpu over mask using nr_cpu_ids

    * Modify following to use performance variants:

    #define num_online_cpus() cpus_weight_nr(cpu_online_map)
    #define num_possible_cpus() cpus_weight_nr(cpu_possible_map)
    #define num_present_cpus() cpus_weight_nr(cpu_present_map)

    #define for_each_possible_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)
    #define for_each_online_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)
    #define for_each_present_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)

    * Comment added to include/linux/cpumask.h:

    Note: The alternate operations with the suffix "_nr" are used
    to limit the range of the loop to nr_cpu_ids instead of
    NR_CPUS when NR_CPUS > 64 for performance reasons.
    If NR_CPUS is
    Cc: Christoph Lameter
    Reviewed-by: Paul Jackson
    Reviewed-by: Christoph Lameter
    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     

08 May, 2007

1 commit

  • The nr_cpu_ids value is currently only calculated in smp_init. However, it
    may be needed before (SLUB needs it on kmem_cache_init!) and other kernel
    components may also want to allocate dynamically sized per cpu array before
    smp_init. So move the determination of possible cpus into sched_init()
    where we already loop over all possible cpus early in boot.

    Also initialize both nr_node_ids and nr_cpu_ids with the highest value they
    could take. If we have accidental users before these values are determined
    then the current valud of 0 may cause too small per cpu and per node arrays
    to be allocated. If it is set to the maximum possible then we only waste
    some memory for early boot users.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

21 Feb, 2007

1 commit

  • We frequently need the maximum number of possible processors in order to
    allocate arrays for all processors. So far this was done using
    highest_possible_processor_id(). However, we do need the number of
    processors not the highest id. Moreover the number was so far dynamically
    calculated on each invokation. The number of possible processors does not
    change when the system is running. We can therefore calculate that number
    once.

    Signed-off-by: Christoph Lameter
    Cc: Frederik Deweerdt
    Cc: Neil Brown
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

21 Oct, 2006

1 commit

  • Qooting Adrian:

    - net/sunrpc/svc.c uses highest_possible_node_id()

    - include/linux/nodemask.h says highest_possible_node_id() is
    out-of-line #if MAX_NUMNODES > 1

    - the out-of-line highest_possible_node_id() is in lib/cpumask.c

    - lib/Makefile: lib-$(CONFIG_SMP) += cpumask.o
    CONFIG_ARCH_DISCONTIGMEM_ENABLE=y, CONFIG_SMP=n, CONFIG_SUNRPC=y

    -> highest_possible_node_id() is used in net/sunrpc/svc.c
    CONFIG_NODES_SHIFT defined and > 0

    -> include/linux/numa.h: MAX_NUMNODES > 1

    -> compile error

    The bug is not present on architectures where ARCH_DISCONTIGMEM_ENABLE
    depends on NUMA (but m32r isn't the only affected architecture).

    So move the function into page_alloc.c

    Cc: Adrian Bunk
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

02 Oct, 2006

1 commit


26 Mar, 2006

4 commits

  • text data bss dec hex filename
    before: 3605597 1363528 363328 5332453 515de5 vmlinux
    after: 3605295 1363612 363200 5332107 515c8b vmlinux

    218 bytes saved.

    Also, optimise any_online_cpu() out of existence on CONFIG_SMP=n.

    This function seems inefficient. Can't we simply AND the two masks, then use
    find_first_bit()?

    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Shrinks the only caller (net/bridge/netfilter/ebtables.c) by 174 bytes.

    Also, optimise highest_possible_processor_id() out of existence on
    CONFIG_SMP=n.

    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • text data bss dec hex filename
    before: 3488027 1322496 360128 5170651 4ee5db vmlinux
    after: 3485112 1322480 359968 5167560 4ed9c8 vmlinux

    2931 bytes saved

    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • text data bss dec hex filename
    before: 3490577 1322408 360000 5172985 4eeef9 vmlinux
    after: 3488027 1322496 360128 5170651 4ee5db vmlinux

    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton