08 Jul, 2020

1 commit

  • The current implementation of cpumask_local_spread() does not respect
    isolated CPUs, i.e., even if a CPU has been isolated for a real-time
    task, it will still be returned to the caller for pinning of its IRQ
    threads. Having these unwanted IRQ threads on an isolated CPU adds
    latency overhead.

    Restrict the CPUs that are returned for spreading IRQs only to the
    available housekeeping CPUs.
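
    A minimal sketch of the idea, assuming the scheduler isolation API
    (housekeeping_cpumask() and the HK_FLAG_* flags); the real function
    also prefers node-local CPUs first, which is omitted here:

    ```
    /* Pick the i-th CPU from the housekeeping set only, so isolated
     * CPUs are never handed out for IRQ spreading. */
    unsigned int cpumask_local_spread(unsigned int i, int node)
    {
            const struct cpumask *mask =
                    housekeeping_cpumask(HK_FLAG_DOMAIN | HK_FLAG_MANAGED_IRQ);
            unsigned int cpu;

            /* Wrap: we always want a cpu. */
            i %= cpumask_weight(mask);

            for_each_cpu(cpu, mask)
                    if (i-- == 0)
                            return cpu;
            BUG();
    }
    ```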

    Signed-off-by: Alex Belits
    Signed-off-by: Nitesh Narayan Lal
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200625223443.2684-2-nitesh@redhat.com

    Alex Belits
     

20 Mar, 2020

1 commit

  • Currently, when updating the affinity of tasks via either cpuset.cpus
    or sched_setaffinity(), tasks not currently running within the newly
    specified mask will be arbitrarily assigned to the first CPU within
    the mask.

    This (particularly in the case that we are restricting masks) can
    result in many tasks being assigned to the first CPUs of their new
    masks.

    This:
    1) Can induce scheduling delays while the load-balancer has a chance to
    spread them between their new CPUs.
    2) Can antagonize a poorly behaved load-balancer, which has a
    difficult time recognizing that a cross-socket imbalance has been
    forced by an affinity mask.

    This change adds a new cpumask interface to allow iterated calls to
    distribute within the intersection of the provided masks.

    The cases that this mainly affects are:
    - modifying cpuset.cpus
    - when tasks join a cpuset
    - when modifying a task's affinity via sched_setaffinity(2)
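
    A sketch of such an iterated-distribution helper, modeled on the
    cpumask_any_and_distribute() approach (the per-CPU state name is
    illustrative):

    ```
    static DEFINE_PER_CPU(int, distribute_cpu_mask_prev);

    /* Rotate the starting point across calls so repeated callers spread
     * over the intersection instead of piling onto its first CPU. */
    int cpumask_any_and_distribute(const struct cpumask *src1p,
                                   const struct cpumask *src2p)
    {
            int next, prev;

            prev = __this_cpu_read(distribute_cpu_mask_prev);

            next = cpumask_next_and(prev, src1p, src2p);
            if (next >= nr_cpu_ids)
                    next = cpumask_first_and(src1p, src2p);

            if (next < nr_cpu_ids)
                    __this_cpu_write(distribute_cpu_mask_prev, next);

            return next;
    }
    ```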

    Signed-off-by: Paul Turner
    Signed-off-by: Josh Don
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Qais Yousef
    Tested-by: Qais Yousef
    Link: https://lkml.kernel.org/r/20200311010113.136465-1-joshdon@google.com

    Paul Turner
     

13 Mar, 2019

1 commit

  • Add checks for the return value of memblock_alloc*() functions and
    call panic() in case of error. The panic message repeats the one used
    by the panicking memblock allocators, adjusted to include only the
    relevant parameters.

    The replacement was mostly automated with semantic patches like the one
    below with manual massaging of format strings.

    @@
    expression ptr, size, align;
    @@
    ptr = memblock_alloc(size, align);
    + if (!ptr)
    + panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);

    [anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
    Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
    [rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
    Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
    [rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
    Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
    [akpm@linux-foundation.org: fix xtensa printk warning]
    Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Anders Roxell
    Reviewed-by: Guo Ren [c-sky]
    Acked-by: Paul Burton [MIPS]
    Acked-by: Heiko Carstens [s390]
    Reviewed-by: Juergen Gross [Xen]
    Reviewed-by: Geert Uytterhoeven [m68k]
    Acked-by: Max Filippov [xtensa]
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

06 Mar, 2019

1 commit

  • Patch series "Replace all open encodings for NUMA_NO_NODE", v3.

    All these places for replacement were found by running the following
    grep patterns on the entire kernel code. Please let me know if this
    might have missed some instances. This might also have replaced some
    false positives. I will appreciate suggestions, inputs and review.

    1. git grep "nid == -1"
    2. git grep "node == -1"
    3. git grep "nid = -1"
    4. git grep "node = -1"

    This patch (of 2):

    At present there are multiple places where invalid node number is
    encoded as -1. Even though implicitly understood it is always better to
    have macros in there. Replace these open encodings for an invalid node
    number with the global macro NUMA_NO_NODE. This helps remove NUMA
    related assumptions like 'invalid node' from various places redirecting
    them to a common definition.

    Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
    Signed-off-by: Anshuman Khandual
    Reviewed-by: David Hildenbrand
    Acked-by: Jeff Kirsher [ixgbe]
    Acked-by: Jens Axboe [mtip32xx]
    Acked-by: Vinod Koul [dmaengine.c]
    Acked-by: Michael Ellerman [powerpc]
    Acked-by: Doug Ledford [drivers/infiniband]
    Cc: Joseph Qi
    Cc: Hans Verkuil
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anshuman Khandual
     

31 Oct, 2018

3 commits

When memblock allocation APIs are called with align = 0, the alignment
    is implicitly set to SMP_CACHE_BYTES.

    Implicit alignment is done deep in the memblock allocator and it can
    come as a surprise. Not that such an alignment would be wrong even
    when used incorrectly, but it is better to be explicit for the sake of
    clarity and the principle of least surprise.

    Replace all such uses of memblock APIs with the 'align' parameter
    explicitly set to SMP_CACHE_BYTES and stop implicit alignment assignment
    in the memblock internal allocation functions.

    For the case when memblock APIs are used via helper functions, e.g. like
    iommu_arena_new_node() in Alpha, the helper functions were detected with
    Coccinelle's help and then manually examined and updated where
    appropriate.

    The direct memblock APIs users were updated using the semantic patch below:

    @@
    expression size, min_addr, max_addr, nid;
    @@
    (
    |
    - memblock_alloc_try_nid_raw(size, 0, min_addr, max_addr, nid)
    + memblock_alloc_try_nid_raw(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
    |
    - memblock_alloc_try_nid_nopanic(size, 0, min_addr, max_addr, nid)
    + memblock_alloc_try_nid_nopanic(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
    |
    - memblock_alloc_try_nid(size, 0, min_addr, max_addr, nid)
    + memblock_alloc_try_nid(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
    |
    - memblock_alloc(size, 0)
    + memblock_alloc(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_raw(size, 0)
    + memblock_alloc_raw(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_from(size, 0, min_addr)
    + memblock_alloc_from(size, SMP_CACHE_BYTES, min_addr)
    |
    - memblock_alloc_nopanic(size, 0)
    + memblock_alloc_nopanic(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_low(size, 0)
    + memblock_alloc_low(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_low_nopanic(size, 0)
    + memblock_alloc_low_nopanic(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_from_nopanic(size, 0, min_addr)
    + memblock_alloc_from_nopanic(size, SMP_CACHE_BYTES, min_addr)
    |
    - memblock_alloc_node(size, 0, nid)
    + memblock_alloc_node(size, SMP_CACHE_BYTES, nid)
    )

    [mhocko@suse.com: changelog update]
    [akpm@linux-foundation.org: coding-style fixes]
    [rppt@linux.ibm.com: fix missed uses of implicit alignment]
    Link: http://lkml.kernel.org/r/20181016133656.GA10925@rapoport-lnx
    Link: http://lkml.kernel.org/r/1538687224-17535-1-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Suggested-by: Michal Hocko
    Acked-by: Paul Burton [MIPS]
    Acked-by: Michael Ellerman [powerpc]
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: Geert Uytterhoeven
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: Matt Turner
    Cc: Michal Simek
    Cc: Richard Weinberger
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • Move remaining definitions and declarations from include/linux/bootmem.h
    into include/linux/memblock.h and remove the redundant header.

    The includes were replaced with the semantic patch below, followed by
    semi-automated removal of duplicated '#include <linux/memblock.h>'.

    @@
    @@
    - #include <linux/bootmem.h>
    + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
    Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Stephen Rothwell
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The conversion is done using

    sed -i 's@memblock_virt_alloc@memblock_alloc@g' \
    $(git grep -l memblock_virt_alloc)

    Link: http://lkml.kernel.org/r/1536927045-23536-8-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Hocko
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

07 Feb, 2018

1 commit

  • We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
    It's essentially a joined iteration in search for a non-zero bit, which is
    currently implemented as a lookup join (find a nonzero bit on the lhs,
    lookup the rhs to see if it's set there).

    Implement a direct join (find a nonzero bit on the incrementally built
    join). Also add generic bitmap benchmarks in the new `test_find_bit`
    module for the new function (see `find_next_and_bit` in [2] and [3]
    below).

    For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
    faster with a geometric mean of 2.1 on 32 CPUs [1]. No impact on memory
    usage. Note that on Arm, the new pure-C implementation still outperforms
    the old one that uses a mix of C and asm (`find_next_bit`) [3].
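
    A minimal standalone sketch of the direct-join idea in plain C
    (`next_and_bit` is an illustrative name, not the kernel's
    `find_next_and_bit` implementation):

    ```
    #include <limits.h>

    #define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

    /* Find the next set bit in (a1 & a2) at or after 'start': the joined
     * word is built on the fly, one word at a time, instead of finding a
     * bit in a1 and then probing a2 (the old lookup join). */
    static unsigned long next_and_bit(const unsigned long *a1,
                                      const unsigned long *a2,
                                      unsigned long nbits,
                                      unsigned long start)
    {
            while (start < nbits) {
                    unsigned long word = a1[start / BITS_PER_LONG] &
                                         a2[start / BITS_PER_LONG];

                    word &= ~0UL << (start % BITS_PER_LONG); /* drop bits below start */
                    if (word) {
                            unsigned long bit = start - start % BITS_PER_LONG +
                                                (unsigned long)__builtin_ctzl(word);
                            return bit < nbits ? bit : nbits;
                    }
                    start = (start / BITS_PER_LONG + 1) * BITS_PER_LONG;
            }
            return nbits; /* no common set bit */
    }
    ```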

    [1] Approximate benchmark code:

    ```
    unsigned long src1p[nr_cpumask_longs] = {pattern1};
    unsigned long src2p[nr_cpumask_longs] = {pattern2};
    for (/*a bunch of repetitions*/) {
        for (int n = -1; n <= nr_cpu_ids; ++n) {
            asm volatile("" : "+rm"(src1p)); // prevent any optimization
            asm volatile("" : "+rm"(src2p));
            unsigned long result = cpumask_next_and(n, src1p, src2p);
            asm volatile("" : "+rm"(result));
        }
    }
    ```

    [geert@linux-m68k.org: …]
    Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org
    Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com
    Signed-off-by: Clement Courbet
    Signed-off-by: Geert Uytterhoeven
    Cc: Yury Norov
    Cc: Geert Uytterhoeven
    Cc: Alexey Dobriyan
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton

    Signed-off-by: Linus Torvalds

    Clement Courbet
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to apply to a
    file was done in a spreadsheet of side-by-side results from the output
    of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files, created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few thousand
    files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source.
    - File already had some variant of a license header in it (even if <5
    lines).

    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

09 Sep, 2017

1 commit

  • Every for_each_XXX_cpu() invocation calls cpumask_next(), which is an
    inline function:

    static inline unsigned int cpumask_next(int n, const struct cpumask *srcp)
    {
            /* -1 is a legal arg here. */
            if (n != -1)
                    cpumask_check(n);
            return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
    }

    However!

    find_next_bit() is a regular out-of-line function, which means the
    "nr_cpu_ids" load and the increment happen at the caller, resulting
    in a lot of bloat:

    x86_64 defconfig:
    add/remove: 3/0 grow/shrink: 8/373 up/down: 155/-5668 (-5513)
    x86_64 allyesconfig-ish:
    add/remove: 3/1 grow/shrink: 57/634 up/down: 3515/-28177 (-24662) !!!

    Some archs redefine find_next_bit(), but that is OK:

    m68k        inline, but SMP is not supported
    arm         out-of-line
    unicore32   out-of-line

    A function call will happen anyway, so move the load and increment
    into the callee.
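
    A sketch of the resulting shape: the body moves verbatim out of line
    (into lib/cpumask.c, location assumed here) and is exported, so the
    nr_cpumask_bits load and the n + 1 increment happen once, in the
    callee:

    ```
    unsigned int cpumask_next(int n, const struct cpumask *srcp)
    {
            /* -1 is a legal arg here. */
            if (n != -1)
                    cpumask_check(n);
            return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
    }
    EXPORT_SYMBOL(cpumask_next);
    ```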

    Link: http://lkml.kernel.org/r/20170824230010.GA1593@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

15 May, 2017

1 commit

  • More users of for_each_cpu_wrap() have appeared. Promote the construct
    to a generic cpumask interface.

    The implementation is slightly modified to reduce arguments.
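
    An illustrative use, assuming a cpumask pointer 'mask' and a starting
    CPU 'start' (both hypothetical locals):

    ```
    unsigned int cpu;

    /* Visit every cpu in 'mask' exactly once, beginning at 'start'
     * and wrapping past the end of the mask back to cpu 0. */
    for_each_cpu_wrap(cpu, mask, start) {
            if (cpu != smp_processor_id())
                    break;  /* e.g. pick the first remote cpu */
    }
    ```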

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Lauro Ramos Venancio
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: lwang@redhat.com
    Link: http://lkml.kernel.org/r/20170414122005.o35me2h5nowqkxbv@hirez.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

29 Feb, 2016

1 commit

  • Almost every cpumask function is exported, just not the one I need to make the
    Intel uncore driver modular.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andi Kleen
    Cc: Arnaldo Carvalho de Melo
    Cc: Borislav Petkov
    Cc: David S. Miller
    Cc: Harish Chegondi
    Cc: Jacob Pan
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Cc: Stephane Eranian
    Cc: Vince Weaver
    Cc: linux-kernel@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160222221011.878299859@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

19 Jun, 2015

1 commit

  • Revert commit 534b483a86e6 ("cpumask: don't perform while loop in
    cpumask_next_and()").

    This was a minor optimization, but it puts a `struct cpumask' on the
    stack, which consumes too much stack space.

    Sergey Senozhatsky
    Reported-by: Peter Zijlstra
    Cc: Sergey Senozhatsky
    Cc: Tejun Heo
    Cc: "David S. Miller"
    Cc: Amir Vadai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

28 May, 2015

1 commit

  • da91309e0a7e (cpumask: Utility function to set n'th cpu...) created a
    genuinely weird function. I never saw it before, it went through DaveM.
    (He only does this to make us other maintainers feel better about our own
    mistakes.)

    cpumask_set_cpu_local_first's purpose is to say "I need to spread
    things across N online cpus, choose the ones on this numa node
    first"; you call it in a loop.

    It can fail. One of the two callers ignores this, the other aborts and
    fails the device open.

    It can fail in two ways: allocating the off-stack cpumask, or through a
    convoluted codepath which AFAICT can only occur if cpu_online_mask
    changes. Which shouldn't happen, because if cpu_online_mask can change
    while you call this, it could return a now-offline cpu anyway.

    It contains a nonsensical test "!cpumask_of_node(numa_node)". This was
    drawn to my attention by Geert, who said this causes a warning on Sparc.
    It sets a single bit in a cpumask instead of returning a cpu number,
    because that's what the callers want.

    It could be made more efficient by passing the previous cpu rather than
    an index, but that would be more invasive to the callers.

    Fixes: da91309e0a7e8966d916a74cce42ed170fde06bf
    Signed-off-by: Rusty Russell (then rebased)
    Tested-by: Amir Vadai
    Acked-by: Amir Vadai
    Acked-by: David S. Miller

    Rusty Russell
     

21 Apr, 2015

1 commit

  • Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell:
    "This is the final removal (after several years!) of the obsolete
    cpus_* functions, prompted by their mis-use in staging.

    With these functions removed, all cpu functions should only iterate to
    nr_cpu_ids, so we finally only allocate that many bits when cpumasks
    are allocated offstack"

    * tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
    cpumask: remove __first_cpu / __next_cpu
    cpumask: resurrect CPU_MASK_CPU0
    linux/cpumask.h: add typechecking to cpumask_test_cpu
    cpumask: only allocate nr_cpumask_bits.
    Fix weird uses of num_online_cpus().
    cpumask: remove deprecated functions.
    mips: fix obsolete cpumask_of_cpu usage.
    x86: fix more deprecated cpu function usage.
    ia64: remove deprecated cpus_ usage.
    powerpc: fix deprecated CPU_MASK_CPU0 usage.
    CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region.
    staging/lustre/o2iblnd: Don't use cpus_weight
    staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_
    staging/lustre/ptlrpc: Do not use deprecated cpus_* functions
    blackfin: fix up obsolete cpu function usage.
    parisc: fix up obsolete cpu function usage.
    tile: fix up obsolete cpu function usage.
    arm64: fix up obsolete cpu function usage.
    mips: fix up obsolete cpu function usage.
    x86: fix up obsolete cpu function usage.
    ...

    Linus Torvalds
     

19 Apr, 2015

1 commit

  • They were for use by the deprecated first_cpu() and next_cpu() wrappers,
    but sparc used them directly.

    They're now replaced by cpumask_first / cpumask_next. And __next_cpu_nr
    is completely obsolete.

    Signed-off-by: Rusty Russell
    Acked-by: David S. Miller

    Rusty Russell
     

17 Apr, 2015

1 commit

  • cpumask_next_and() looks for cpumask_next() in src1 in a loop and
    tests whether the found cpu is also present in src2. Remove that
    loop: perform cpumask_and() of src1 and src2 first, and use that new
    mask to find cpumask_next().

    Apart from removing the while loop, ./bloat-o-meter on x86_64 shows:

    add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8)
    function              old     new   delta
    cpumask_next_and       62      54      -8
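
    A sketch of the reworked helper as described (the 19 Jun, 2015 entry
    above reverts exactly this, because of the on-stack struct cpumask):

    ```
    int cpumask_next_and(int n, const struct cpumask *src1p,
                         const struct cpumask *src2p)
    {
            struct cpumask tmp;     /* the on-stack mask behind the revert */

            if (cpumask_and(&tmp, src1p, src2p))
                    return cpumask_next(n, &tmp);
            return nr_cpu_ids;
    }
    ```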

    Signed-off-by: Sergey Senozhatsky
    Cc: Tejun Heo
    Cc: "David S. Miller"
    Cc: Amir Vadai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sergey Senozhatsky
     


12 Jun, 2014

1 commit

  • This function sets the n'th cpu, local cpus first. For example, a
    16-core server whose even-numbered cpus are local will get the
    following values:
    cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set
    cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set
    ...
    cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set
    cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set
    cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set
    ...
    cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set

    Currently this function is used by multi-queue networking devices to
    calculate the irq affinity mask, such that as many local cpus as
    possible are utilized to handle the mq device irqs.

    Signed-off-by: Amir Vadai
    Signed-off-by: David S. Miller

    Amir Vadai
     

02 Jun, 2014

2 commits

  • This reverts commit 70a640d0dae3a9b1b222ce673eb5d92c263ddd61
    ("net/mlx4_en: Use affinity hint") and commit
    c8865b64b05b2f4eeefd369373e9c8aeb069e7a1 ("cpumask: Utility function
    to set n'th cpu - local cpu first") because these changes break
    the build when SMP is disabled amongst other things.

    Reported-by: Eric Dumazet
    Signed-off-by: David S. Miller

    David S. Miller
     
  • This function sets the n'th cpu, local cpus first. For example, a
    16-core server whose even-numbered cpus are local will get the
    following values:
    cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set
    cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set
    ...
    cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set
    cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set
    cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set
    ...
    cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set

    Currently this function is used by multi-queue networking devices to
    calculate the irq affinity mask, such that as many local cpus as
    possible are utilized to handle the mq device irqs.

    Signed-off-by: Amir Vadai
    Signed-off-by: David S. Miller

    Amir Vadai
     

22 Jan, 2014

1 commit

  • Switch to memblock interfaces for the early memory allocator instead
    of the bootmem allocator. No functional change in behavior from the
    bootmem users' point of view.

    Archs already converted to NO_BOOTMEM now directly use memblock
    interfaces instead of bootmem wrappers built on top of memblock. For
    the archs which still use bootmem, these new APIs just fall back to
    the existing bootmem APIs.
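
    For instance, a bootmem call site would convert roughly like this (a
    sketch; memblock_virt_alloc() panics on failure like alloc_bootmem(),
    and align = 0 meant the default SMP_CACHE_BYTES at the time, as the
    31 Oct, 2018 entry above later makes explicit):

    ```
    /* before: bootmem */
    ptr = alloc_bootmem(size);

    /* after: memblock-backed early allocation */
    ptr = memblock_virt_alloc(size, 0);
    ```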

    Signed-off-by: Santosh Shilimkar
    Cc: "Rafael J. Wysocki"
    Cc: Arnd Bergmann
    Cc: Christoph Lameter
    Cc: Greg Kroah-Hartman
    Cc: Grygorii Strashko
    Cc: H. Peter Anvin
    Cc: Johannes Weiner
    Cc: KAMEZAWA Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Michal Hocko
    Cc: Paul Walmsley
    Cc: Pavel Machek
    Cc: Russell King
    Cc: Tejun Heo
    Cc: Tony Lindgren
    Cc: Yinghai Lu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Santosh Shilimkar
     

12 Dec, 2012

1 commit

  • It is strange that alloc_bootmem() returns a virtual address while
    free_bootmem() requires a physical address. Anyway, free_bootmem()'s
    first parameter should be a physical address.

    There are some call sites of free_bootmem() with a virtual address,
    so fix them.
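
    A typical fix at such a call site (a sketch; __pa() converts a kernel
    virtual address to its physical address):

    ```
    /* before: wrong, passes a virtual address */
    free_bootmem((unsigned long)vaddr, size);

    /* after: free_bootmem() takes a physical address */
    free_bootmem(__pa(vaddr), size);
    ```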

    [akpm@linux-foundation.org: improve free_bootmem() and free_bootmem_late() documentation]
    Signed-off-by: Joonsoo Kim
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Johannes Weiner
    Cc: FUJITA Tomonori
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

29 Mar, 2012

1 commit

  • __any_online_cpu() is not optimal and also unnecessary. So, replace
    its uses with faster cpumask_* operations.
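
    The replacement pattern looks like this (a sketch; any_online_cpu()
    was the caller-facing wrapper around __any_online_cpu()):

    ```
    /* before */
    cpu = any_online_cpu(mask);

    /* after: picks an arbitrary cpu set in both masks */
    cpu = cpumask_any_and(&mask, cpu_online_mask);
    ```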

    Signed-off-by: Srivatsa S. Bhat
    Cc: Eric Dumazet
    Cc: Venkatesh Pallipadi
    Cc: Rusty Russell
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Srivatsa S. Bhat
     



30 Mar, 2010

1 commit

  • include cleanup: Update gfp.h and slab.h includes to prepare for
    breaking implicit slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following:

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered:
    alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints
    out an error message indicating which .h file needs to be added to
    the file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as a bisection point.

    Given the fact that I had only a couple of failures from the tests in
    step 6, I'm fairly confident about the coverage of this conversion
    patch. If there is a breakage, it's likely to be something in one of
    the arch headers, which should be easily discoverable on most builds
    of the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     


03 Apr, 2009

1 commit

  • Fix slab corruption caused by alloc_cpumask_var_node() overwriting the
    tail end of an off-stack cpumask.

    The function zeros out cpumask bits beyond the last possible cpu. The
    starting point for zeroing should be the beginning of the mask offset by a
    byte count derived from the number of possible cpus. The offset was
    calculated in bits instead of bytes. This resulted in overwriting the end
    of the cpumask.
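
    A sketch of the corrected tail-zeroing, with the offset computed in
    bytes throughout (BITS_TO_LONGS() and cpumask_size() as in the kernel
    headers):

    ```
    unsigned char *ptr = (unsigned char *)cpumask_bits(mask);
    unsigned int tail;

    /* cpumask_size() is the allocation size in BYTES; computing this
     * offset in bits ran past the end of the buffer. */
    tail = BITS_TO_LONGS(NR_CPUS - nr_cpumask_bits) * sizeof(long);
    memset(ptr + cpumask_size() - tail, 0, tail);
    ```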

    Signed-off-by: Jack Steiner
    Acked-by: Mike Travis
    Acked-by: Ingo Molnar
    Cc: Rusty Russell
    Cc: Stephen Rothwell
    Cc: [2.6.29.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jack Steiner
     

01 Jan, 2009

2 commits

  • Impact: extra safety checks during transition

    When CONFIG_CPUMASKS_OFFSTACK is set, the new cpumask_ operators only
    use bits up to nr_cpu_ids, not NR_CPUS. Using the old cpus_ operators
    on these masks can mean accessing undefined bits.

    After some discussion, Mike and I decided to err on the side of caution;
    we zero the "undefined" bits in alloc_cpumask_var_node() until all the
    old cpumask functions are removed.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: fix kernel-doc

    alloc_bootmem_cpumask_var() returns void.

    Signed-off-by: Li Zefan
    Signed-off-by: Rusty Russell

    Li Zefan
     

19 Dec, 2008

2 commits

  • Impact: New kerneldoc comments

    Additional documentation added to all the alloc_cpumask and free_cpumask
    functions.

    Signed-off-by: Mike Travis
    Signed-off-by: Rusty Russell (minor additions)

    Mike Travis
     
  • Impact: New API

    This will be needed in x86 code to allocate the domain and old_domain
    cpumasks on the same node as where the containing irq_cfg struct is
    allocated.

    (Also fixes double-dump_stack on rare CONFIG_DEBUG_PER_CPU_MAPS case)
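
    Usage follows the existing alloc_cpumask_var() pattern with an extra
    node argument (illustrative sketch; with CONFIG_CPUMASK_OFFSTACK the
    allocation can fail):

    ```
    cpumask_var_t domain;

    /* Allocate the mask on the same node as the structure it serves. */
    if (!alloc_cpumask_var_node(&domain, GFP_KERNEL, node))
            return -ENOMEM;

    cpumask_clear(domain);
    /* ... use the mask ... */
    free_cpumask_var(domain);
    ```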

    Signed-off-by: Mike Travis
    Signed-off-by: Rusty Russell (re-impl alloc_cpumask_var)

    Mike Travis
     


06 Nov, 2008

1 commit

  • Impact: introduce new APIs

    We want to deprecate cpumasks on the stack, as we are headed for
    ginormous numbers of CPUs. Eventually, we want to head towards an
    undefined 'struct cpumask' so they can never be declared on stack.

    1) New cpumask functions which take pointers instead of copies.
    (cpus_* -> cpumask_*)

    2) Several new helpers to reduce requirements for temporary cpumasks
    (cpumask_first_and, cpumask_next_and, cpumask_any_and)

    3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
    (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)

    4) 'struct cpumask' for explicitness and to mark new-style code.

    5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
    not NR_CPUS for time efficiency and for smaller dynamic allocations
    in future.

    6) cpumask_copy() so we can allocate less than a full cpumask eventually
    (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
    definition eventually.

    7) work_on_cpu() helper for doing task on a CPU, rather than saving old
    cpumask for current thread and manipulating it.

    8) smp_call_function_many() which is smp_call_function_mask() except
    taking a cpumask pointer.

    Note that this patch simply introduces the new functions and leaves
    the obsolescent ones in place. This is to simplify the transition
    patches.
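
    A before/after sketch of the style shift (illustrative names; the old
    on-stack copy cannot fail, while the new allocation can once masks go
    off-stack):

    ```
    /* old style: a full cpumask_t copied onto the stack */
    cpumask_t tmp;
    cpus_and(tmp, cpu_online_map, some_mask);

    /* new style: pointer-based ops on a (possibly off-stack) mask */
    cpumask_var_t maskp;
    if (!alloc_cpumask_var(&maskp, GFP_KERNEL))
            return -ENOMEM;
    cpumask_and(maskp, cpu_online_mask, some_maskp);
    free_cpumask_var(maskp);
    ```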

    Signed-off-by: Rusty Russell
    Signed-off-by: Ingo Molnar

    Rusty Russell
     

24 May, 2008

1 commit

  • * Increase performance for systems with a large NR_CPUS count by
    limiting the range of the cpumask operators that loop over the bits
    in a cpumask_t variable. This removes a large amount of wasted cpu
    cycles.

    * Add performance variants of the cpumask operators:

    int cpus_weight_nr(mask)            Same, using nr_cpu_ids instead of NR_CPUS
    int first_cpu_nr(mask)              Number of the lowest set bit, or nr_cpu_ids
    int next_cpu_nr(cpu, mask)          Next cpu past 'cpu', or nr_cpu_ids
    for_each_cpu_mask_nr(cpu, mask)     for-loop over cpus in mask, using nr_cpu_ids

    * Modify following to use performance variants:

    #define num_online_cpus() cpus_weight_nr(cpu_online_map)
    #define num_possible_cpus() cpus_weight_nr(cpu_possible_map)
    #define num_present_cpus() cpus_weight_nr(cpu_present_map)

    #define for_each_possible_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)
    #define for_each_online_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)
    #define for_each_present_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)

    * Comment added to include/linux/cpumask.h:

    Note: The alternate operations with the suffix "_nr" are used
    to limit the range of the loop to nr_cpu_ids instead of
    NR_CPUS when NR_CPUS > 64 for performance reasons.
    If NR_CPUS is <= 64 then most assembler bitmask operators execute
    faster with a constant range, so the operator will continue to use
    NR_CPUS.
    Cc: Christoph Lameter
    Reviewed-by: Paul Jackson
    Reviewed-by: Christoph Lameter
    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis