15 Jan, 2021

1 commit

  • Code added for cpu pause feature should be conditional based on
    CONFIG_SUSPEND
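
    A minimal sketch of the usual shape of such a fix, assuming the pause
    handling is factored into a helper (the function name below is
    illustrative, not the actual ANDROID symbol):

        #ifdef CONFIG_SUSPEND
        static void pause_resume_cpus(void)
        {
                /* cpu pause/resume handling, only built when suspend
                 * support is configured */
        }
        #else
        static inline void pause_resume_cpus(void) { }
        #endif /* CONFIG_SUSPEND */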

    Fixes: 5ada76d05637 ("ANDROID: sched/pause: prevent wake up paused cpus")
    Bug: 161210528
    Reported-by: kernel test robot
    Signed-off-by: Todd Kjos
    Change-Id: I8dc31064bafb31dd570daae97b7bb547384a771f

    Todd Kjos
     

13 Jan, 2021

1 commit

  • When used for QoS or other reasons, waking up idle
    CPUs will wake CPUs en masse. CPUs that are paused
    should not be woken up like this.

    Update to use active_mask, so that paused CPUs are
    ignored for general CPU wakeup operations.
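
    A sketch of the idea, modeled on wake_up_all_idle_cpus() in kernel/smp.c
    (simplified; the exact loop in the patched kernel may differ):

        void wake_up_all_idle_cpus(void)
        {
                int cpu;

                preempt_disable();
                /* Walk the active mask rather than the online mask, so
                 * CPUs that are online but paused are not woken. */
                for_each_cpu(cpu, cpu_active_mask) {
                        if (cpu == smp_processor_id())
                                continue;

                        wake_up_if_idle(cpu);
                }
                preempt_enable();
        }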

    Bug: 161210528
    Change-Id: I10721e75497a8902f8ec998ded4e2eb094770f38
    Signed-off-by: Stephen Dickey

    Stephen Dickey
     

19 Oct, 2020

1 commit

  • Pull RCU changes from Ingo Molnar:

    - Debugging for smp_call_function()

    - RT raw/non-raw lock ordering fixes

    - Strict grace periods for KASAN

    - New smp_call_function() torture test

    - Torture-test updates

    - Documentation updates

    - Miscellaneous fixes

    [ This doesn't actually pull the tag - I've dropped the last merge from
    the RCU branch due to questions about the series. - Linus ]

    * tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
    smp: Make symbol 'csd_bug_count' static
    kernel/smp: Provide CSD lock timeout diagnostics
    smp: Add source and destination CPUs to __call_single_data
    rcu: Shrink each possible cpu krcp
    rcu/segcblist: Prevent useless GP start if no CBs to accelerate
    torture: Add gdb support
    rcutorture: Allow pointer leaks to test diagnostic code
    rcutorture: Hoist OOM registry up one level
    refperf: Avoid null pointer dereference when buf fails to allocate
    rcutorture: Properly synchronize with OOM notifier
    rcutorture: Properly set rcu_fwds for OOM handling
    torture: Add kvm.sh --help and update help message
    rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
    torture: Update initrd documentation
    rcutorture: Replace HTTP links with HTTPS ones
    locktorture: Make function torture_percpu_rwsem_init() static
    torture: document --allcpus argument added to the kvm.sh script
    rcutorture: Output number of elapsed grace periods
    rcutorture: Remove KCSAN stubs
    rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
    ...

    Linus Torvalds
     

17 Oct, 2020

1 commit

  • Fix multiple occurrences of duplicated words in kernel/.

    Fix one typo/spello on the same line as a duplicate word. Change one
    instance of "the the" to "that the". Otherwise just drop one of the
    repeated words.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/98202fa6-8919-ef63-9efe-c0fad5ca7af1@infradead.org
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

05 Sep, 2020

3 commits

  • The sparse tool complains as follows:

    kernel/smp.c:107:10: warning:
    symbol 'csd_bug_count' was not declared. Should it be static?

    Because the variable is not used outside of smp.c, this commit marks it
    static.
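
    The fix is effectively a one-line visibility change in kernel/smp.c
    (sketch):

        static atomic_t csd_bug_count = ATOMIC_INIT(0);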

    Reported-by: Hulk Robot
    Signed-off-by: Wei Yongjun
    Signed-off-by: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Sebastian Andrzej Siewior

    Wei Yongjun
     
  • This commit causes csd_lock_wait() to emit diagnostics when a CPU
    fails to respond quickly enough to one of the smp_call_function()
    family of function calls. These diagnostics are enabled by a new
    CSD_LOCK_WAIT_DEBUG Kconfig option that depends on DEBUG_KERNEL.
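
    A much-simplified sketch of the kind of diagnostic this enables; the real
    code in kernel/smp.c also records the source and destination CPUs and
    re-reports with backoff (the helper name and timeout below are
    illustrative):

        static void csd_lock_wait_with_diagnostics(unsigned int *flags)
        {
                u64 start = sched_clock();

                while (READ_ONCE(*flags) & CSD_FLAG_LOCK) {
                        if (sched_clock() - start > 5ULL * NSEC_PER_SEC) {
                                pr_alert("csd: CSD lock %p held too long, possible IPI loss\n",
                                         flags);
                                start = sched_clock();
                        }
                        cpu_relax();
                }
        }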

    This commit was inspired by an earlier patch by Josef Bacik.

    [ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
    [ paulmck: Fix KASAN use-after-free issue reported by Qian Cai. ]
    [ paulmck: Fix botched nr_cpu_ids comparison per Dan Carpenter. ]
    [ paulmck: Apply Peter Zijlstra feedback. ]
    Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
    Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Sebastian Andrzej Siewior
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit adds a destination CPU to __call_single_data, and is inspired
    by an earlier commit by Peter Zijlstra. This version adds #ifdef to
    permit use by 32-bit systems and supplies the destination CPU for all
    smp_call_function*() requests, not just smp_call_function_single().

    If need be, 32-bit systems could be accommodated by shrinking the flags
    field to 16 bits (the atomic_t variant is currently unused) and by
    providing only eight bits for CPU on such systems.

    It is not clear that the addition of the fields to __call_single_node
    is really needed.
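
    A sketch of the shape of the change (simplified; in the real tree the
    fields live in __call_single_node and the exact Kconfig guard differs):

        struct __call_single_data {
                struct llist_node llist;
                smp_call_func_t func;
                void *info;
                unsigned int flags;
        #ifdef CONFIG_64BIT
                u16 src, dst;   /* requesting and target CPU, for diagnostics */
        #endif
        };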

    [ paulmck: Apply Boqun Feng feedback on 32-bit builds. ]
    Link: https://lore.kernel.org/lkml/20200615164048.GC2531@hirez.programming.kicks-ass.net/
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Sebastian Andrzej Siewior
    Cc: Frederic Weisbecker
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

22 Jul, 2020

1 commit

  • get_option() may return 0, which means that nr_cpus is not initialized.
    In that case the stale nr_cpus value would be used to initialize
    nr_cpu_ids. So fix it.
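
    The fix boils down to only consuming nr_cpus when get_option() actually
    parsed a value (sketch of the early-param handler, close to the patched
    code):

        static int __init nrcpus(char *str)
        {
                int nr_cpus;

                if (get_option(&str, &nr_cpus) && nr_cpus > 0 &&
                    nr_cpus < nr_cpu_ids)
                        nr_cpu_ids = nr_cpus;

                return 0;
        }
        early_param("nr_cpus", nrcpus);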

    Signed-off-by: Muchun Song
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200716070457.53255-1-songmuchun@bytedance.com

    Muchun Song
     

28 Jun, 2020

1 commit

  • Instead of relying on BUG_ON() to ensure the various data structures
    line up, use a bunch of horrible unions to make it all automatic.

    Much of the union magic is to ensure irq_work and smp_call_function do
    not (yet) see the members of their respective data structures change
    name.
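
    A condensed sketch of the resulting layout (see include/linux/smp_types.h
    in current kernels for the full definitions; the atomic flags variant and
    debug fields are omitted here):

        struct __call_single_node {
                struct llist_node llist;
                unsigned int u_flags;
        };

        struct __call_single_data {
                union {
                        struct __call_single_node node;
                        struct {
                                /* legacy member names, kept so existing
                                 * smp_call_function code still compiles */
                                struct llist_node llist;
                                unsigned int flags;
                        };
                };
                smp_call_func_t func;
                void *info;
        };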

    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Frederic Weisbecker
    Link: https://lkml.kernel.org/r/20200622100825.844455025@infradead.org

    Peter Zijlstra
     

04 Jun, 2020

1 commit

  • Pull scheduler updates from Ingo Molnar:
    "The changes in this cycle are:

    - Optimize the task wakeup CPU selection logic, to improve
    scalability and reduce wakeup latency spikes

    - PELT enhancements

    - CFS bandwidth handling fixes

    - Optimize the wakeup path by removing rq->wake_list and replacing it
    with ->ttwu_pending

    - Optimize IPI cross-calls by making flush_smp_call_function_queue()
    process sync callbacks first.

    - Misc fixes and enhancements"

    * tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    irq_work: Define irq_work_single() on !CONFIG_IRQ_WORK too
    sched/headers: Split out open-coded prototypes into kernel/sched/smp.h
    sched: Replace rq::wake_list
    sched: Add rq::ttwu_pending
    irq_work, smp: Allow irq_work on call_single_queue
    smp: Optimize send_call_function_single_ipi()
    smp: Move irq_work_run() out of flush_smp_call_function_queue()
    smp: Optimize flush_smp_call_function_queue()
    sched: Fix smp_call_function_single_async() usage for ILB
    sched/core: Offload wakee task activation if it the wakee is descheduling
    sched/core: Optimize ttwu() spinning on p->on_cpu
    sched: Defend cfs and rt bandwidth quota against overflow
    sched/cpuacct: Fix charge cpuacct.usage_sys
    sched/fair: Replace zero-length array with flexible-array
    sched/pelt: Sync util/runnable_sum with PELT window when propagating
    sched/cpuacct: Use __this_cpu_add() instead of this_cpu_ptr()
    sched/fair: Optimize enqueue_task_fair()
    sched: Make scheduler_ipi inline
    sched: Clean up scheduler_ipi()
    sched/core: Simplify sched_init()
    ...

    Linus Torvalds
     

02 Jun, 2020

1 commit


28 May, 2020

6 commits

  • Move the prototypes for sched_ttwu_pending() and send_call_function_single_ipi()
    into the newly created kernel/sched/smp.h header, to make sure they are all
    the same, and to make architectures that use -Wmissing-prototypes happy.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The recent commit: 90b5363acd47 ("sched: Clean up scheduler_ipi()")
    got smp_call_function_single_async() subtly wrong. Even though it will
    return -EBUSY when trying to re-use a csd, that condition is not
    atomic and still requires external serialization.

    The change in ttwu_queue_remote() got this wrong.

    While on first reading ttwu_queue_remote() has an atomic test-and-set
    that appears to serialize the use, the matching 'release' is not in
    the right place to actually guarantee this serialization.

    The actual race is vs the sched_ttwu_pending() call in the idle loop;
    that can run the wakeup-list without consuming the CSD.

    Instead of trying to chain the lists, merge them.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lore.kernel.org/r/20200526161908.129371594@infradead.org

    Peter Zijlstra
     
  • Currently irq_work_queue_on() will issue an unconditional
    arch_send_call_function_single_ipi() and has the handler do
    irq_work_run().

    This is unfortunate in that it makes the IPI handler look at a second
    cacheline and it misses the opportunity to avoid the IPI. Instead note
    that struct irq_work and struct __call_single_data are very similar in
    layout, so use a few bits in the flags word to encode a type and stick
    the irq_work on the call_single_queue list.
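
    A sketch of the type encoding in the low flag bits (constant names as in
    kernel/smp.c; treat the exact values as illustrative):

        /* Entry types on call_single_queue, encoded in the flags word so the
         * IPI handler can dispatch without touching a second cacheline. */
        #define CSD_TYPE_ASYNC          0x00
        #define CSD_TYPE_SYNC           0x10
        #define CSD_TYPE_IRQ_WORK       0x20
        #define CSD_FLAG_TYPE_MASK      0xF0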

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lore.kernel.org/r/20200526161908.011635912@infradead.org

    Peter Zijlstra
     
  • Just like the ttwu_queue_remote() IPI, make use of _TIF_POLLING_NRFLAG
    to avoid sending IPIs to idle CPUs.
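
    A sketch of the resulting helper (close to send_call_function_single_ipi()
    in kernel/sched/core.c, simplified):

        void send_call_function_single_ipi(int cpu)
        {
                struct rq *rq = cpu_rq(cpu);

                /* If the remote idle task polls on TIF_NEED_RESCHED, setting
                 * the flag is enough: it will notice the queued work when it
                 * leaves the polling loop, so no IPI is needed. */
                if (!set_nr_if_polling(rq->idle))
                        arch_send_call_function_single_ipi(cpu);
                else
                        trace_sched_wake_idle_without_ipi(cpu);
        }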

    [ mingo: Fix UP build bug. ]

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lore.kernel.org/r/20200526161907.953304789@infradead.org

    Peter Zijlstra
     
  • This ensures flush_smp_call_function_queue() is strictly about
    call_single_queue.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lore.kernel.org/r/20200526161907.895109676@infradead.org

    Peter Zijlstra
     
  • The call_single_queue can contain (two) different callbacks,
    synchronous and asynchronous. The current interrupt handler runs them
    in-order, which means that remote CPUs that are waiting for their
    synchronous call can be delayed by running asynchronous callbacks.

    Rework the interrupt handler to first run the synchronous callbacks.
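
    A hedged sketch of the two-pass approach (illustrative only - the member
    and helper names follow the struct layout of that time, and the real
    flush_smp_call_function_queue() prunes the list in place):

        static void flush_sync_then_async(struct llist_node *entry)
        {
                struct llist_node *node, *next, *deferred = NULL;
                struct __call_single_data *csd;

                /* First pass: run synchronous callbacks immediately so the
                 * waiting senders are released as soon as possible. */
                for (node = entry; node; node = next) {
                        next = node->next;
                        csd = llist_entry(node, struct __call_single_data, llist);

                        if (csd->flags & CSD_FLAG_SYNCHRONOUS) {
                                csd->func(csd->info);
                                csd_unlock(csd);
                        } else {
                                node->next = deferred;
                                deferred = node;
                        }
                }

                /* Second pass: asynchronous callbacks; unlock before calling
                 * because the owner may reuse the csd once it is unlocked. */
                for (node = deferred; node; node = next) {
                        smp_call_func_t func;
                        void *info;

                        next = node->next;
                        csd = llist_entry(node, struct __call_single_data, llist);
                        func = csd->func;
                        info = csd->info;
                        csd_unlock(csd);
                        func(info);
                }
        }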

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lore.kernel.org/r/20200526161907.836818381@infradead.org

    Peter Zijlstra
     

19 Apr, 2020

1 commit


25 Mar, 2020

1 commit


06 Mar, 2020

1 commit

  • Previously we would raise a warning when trying to insert a csd object
    that has the LOCK flag set, and if that happened we would also wait for
    the lock to be released. However, this behavior does not match the
    function's name - the "_async" suffix hints that the function should
    not block, while we would.

    This patch changes that behavior by simply returning -EBUSY instead of
    waiting; at the same time we allow the operation to happen without
    warning the user, turning it into a feature for callers that want to
    "insert a csd object, and if it's already queued, just wait for that one".

    This is pretty safe because in flush_smp_call_function_queue(), for
    async csd objects (where csd->flags & SYNC is zero), we first do the
    unlock and then call csd->func(). So if we see that csd->flags & LOCK
    is true in smp_call_function_single_async(), it's guaranteed that
    csd->func() will be called after this smp_call_function_single_async()
    returns -EBUSY.

    Update the comment of the function too to reflect this.
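
    A sketch of the resulting entry check (close to the kernel/smp.c code of
    that time; the flags field has since moved into a union):

        int smp_call_function_single_async(int cpu, call_single_data_t *csd)
        {
                int err = 0;

                preempt_disable();

                if (csd->flags & CSD_FLAG_LOCK) {
                        /* Still in flight: don't warn and don't wait. */
                        err = -EBUSY;
                        goto out;
                }

                csd->flags = CSD_FLAG_LOCK;
                smp_wmb();

                err = generic_exec_single(cpu, csd, csd->func, csd->info);
        out:
                preempt_enable();
                return err;
        }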

    Signed-off-by: Peter Xu
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lkml.kernel.org/r/20191216213125.9536-2-peterx@redhat.com

    Peter Xu
     

28 Jan, 2020

1 commit


25 Jan, 2020

3 commits

  • The allocation mask is no longer used by on_each_cpu_cond() and
    on_each_cpu_cond_mask() and can be removed.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lore.kernel.org/r/20200117090137.1205765-4-bigeasy@linutronix.de

    Sebastian Andrzej Siewior
     
  • on_each_cpu_cond_mask() allocates a new CPU mask. The newly allocated
    mask is a subset of the provided mask based on the conditional function.

    This memory allocation can be avoided by extending smp_call_function_many()
    with the conditional function and performing the remote function call based
    on the mask and the conditional function.

    Rename smp_call_function_many() to smp_call_function_many_cond() and add
    the smp_cond_func_t argument. If smp_cond_func_t is provided then it is
    used before invoking the function. Provide smp_call_function_many() with
    cond_func set to NULL. Let on_each_cpu_cond_mask() use
    smp_call_function_many_cond().
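
    A simplified sketch of how the predicate folds into the common helper
    (the real smp_call_function_many_cond() also handles the IPI mask and
    the wait path, elided here):

        static void smp_call_function_many_cond(const struct cpumask *mask,
                                                smp_call_func_t func, void *info,
                                                bool wait, smp_cond_func_t cond_func)
        {
                struct call_function_data *cfd = this_cpu_ptr(&cfd_data);
                int cpu;

                cpumask_and(cfd->cpumask, mask, cpu_online_mask);
                __cpumask_clear_cpu(smp_processor_id(), cfd->cpumask);

                for_each_cpu(cpu, cfd->cpumask) {
                        call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);

                        /* Skip CPUs that the caller's predicate rejects,
                         * instead of pre-building a separate cpumask. */
                        if (cond_func && !cond_func(cpu, info))
                                continue;

                        csd_lock(csd);
                        csd->func = func;
                        csd->info = info;
                        llist_add(&csd->llist, &per_cpu(call_single_queue, cpu));
                }

                /* ... send IPIs to the queued CPUs and optionally wait ... */
        }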

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lore.kernel.org/r/20200117090137.1205765-3-bigeasy@linutronix.de

    Sebastian Andrzej Siewior
     
  • Use a typedef for the conditional function instead of defining it each
    time in the function prototype.
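
    The typedef boils down to one line (as introduced in include/linux/smp.h):

        typedef bool (*smp_cond_func_t)(int cpu, void *info);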

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lore.kernel.org/r/20200117090137.1205765-2-bigeasy@linutronix.de

    Sebastian Andrzej Siewior
     

20 Jul, 2019

1 commit

  • It's clearly documented that smp function calls cannot be invoked from
    softirq handling context. Unfortunately nothing enforces that or emits a
    warning.

    A single function call can be invoked from softirq context only via
    smp_call_function_single_async().

    The only legit context is task context, so add a warning to that effect.
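
    A sketch of the kind of check this adds to the synchronous call paths
    (the exact condition in kernel/smp.c may differ slightly):

        /* smp_call_function() and friends may only be used from task
         * context; softirq context must use
         * smp_call_function_single_async() instead. */
        WARN_ON_ONCE(!in_task());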

    Reported-by: luferry
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass.net

    Peter Zijlstra
     

23 Jun, 2019

2 commits

  • The return value is fixed. Remove it and amend the callers.

    [ tglx: Fixup arm/bL_switcher and powerpc/rtas ]

    Signed-off-by: Nadav Amit
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Andrew Morton
    Link: https://lkml.kernel.org/r/20190613064813.8102-2-namit@vmware.com

    Nadav Amit
     
  • cfd_data is marked as shared, but although it holds pointers to shared
    data structures, it is private per core.

    Signed-off-by: Nadav Amit
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: Rik van Riel
    Link: https://lkml.kernel.org/r/20190613064813.8102-8-namit@vmware.com

    Nadav Amit
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only
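
    For an affected .c file this amounts to one new comment line at the very
    top of the file:

        // SPDX-License-Identifier: GPL-2.0-only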

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

31 Jan, 2019

1 commit

  • With the following commit:

    73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")

    ... the hotplug code attempted to detect when SMT was disabled by BIOS,
    in which case it reported SMT as permanently disabled. However, that
    code broke a virt hotplug scenario, where the guest is booted with only
    primary CPU threads, and a sibling is brought online later.

    The problem is that there doesn't seem to be a way to reliably
    distinguish between the HW "SMT disabled by BIOS" case and the virt
    "sibling not yet brought online" case. So the above-mentioned commit
    was a bit misguided, as it permanently disabled SMT for both cases,
    preventing future virt sibling hotplugs.

    Going back and reviewing the original problems which were attempted to
    be solved by that commit, when SMT was disabled in BIOS:

    1) /sys/devices/system/cpu/smt/control showed "on" instead of
    "notsupported"; and

    2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.

    I'd propose that we instead consider #1 above to not actually be a
    problem. Because, at least in the virt case, it's possible that SMT
    wasn't disabled by BIOS and a sibling thread could be brought online
    later. So it makes sense to just always default the smt control to "on"
    to allow for that possibility (assuming cpuid indicates that the CPU
    supports SMT).

    The real problem is #2, which has a simple fix: change vmx_vm_init() to
    query the actual current SMT state -- i.e., whether any siblings are
    currently online -- instead of looking at the SMT "control" sysfs value.

    So fix it by:

    a) reverting the original "fix" and its followup fix:

    73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")
    bc2d8d262cba ("cpu/hotplug: Fix SMT supported evaluation")

    and

    b) changing vmx_vm_init() to query the actual current SMT state --
    instead of the sysfs control value -- to determine whether the L1TF
    warning is needed. This also requires the 'sched_smt_present'
    variable to be exported, instead of 'cpu_smt_control'.
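
    A sketch of the essence of change (b) in vmx_vm_init(), assuming the
    L1TF warning string macro already present in that file:

        /* Warn about L1TF only if SMT is actually active right now, not
         * based on the sysfs SMT control value. */
        if (sched_smt_active())
                pr_warn_once(L1TF_MSG_SMT);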

    Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")
    Reported-by: Igor Mammedov
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Cc: Joe Mario
    Cc: Jiri Kosina
    Cc: Peter Zijlstra
    Cc: kvm@vger.kernel.org
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com

    Josh Poimboeuf
     

09 Oct, 2018

2 commits

  • Introduce a variant of on_each_cpu_cond that iterates only over the
    CPUs in a cpumask, in order to avoid making callbacks for every single
    CPU in the system when we only need to test a subset.
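
    A hedged usage sketch; the predicate and callback names are illustrative,
    and the exact argument list has changed across kernel versions (the gfp
    argument shown here was later removed, see the 25 Jan, 2020 entries
    above), so treat the signature as illustrative too:

        static void do_flush(void *info);       /* hypothetical callback */

        static bool cpu_needs_flush(int cpu, void *info)
        {
                struct mm_struct *mm = info;

                return cpumask_test_cpu(cpu, mm_cpumask(mm));
        }

        /* Only CPUs in mm_cpumask(mm) are considered, instead of running
         * the predicate for every CPU in the system. */
        on_each_cpu_cond_mask(cpu_needs_flush, do_flush, mm, true,
                              GFP_ATOMIC, mm_cpumask(mm));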

    Cc: npiggin@gmail.com
    Cc: mingo@kernel.org
    Cc: will.deacon@arm.com
    Cc: songliubraving@fb.com
    Cc: kernel-team@fb.com
    Cc: hpa@zytor.com
    Cc: luto@kernel.org
    Signed-off-by: Rik van Riel
    Signed-off-by: Peter Zijlstra (Intel)
    Link: http://lkml.kernel.org/r/20180926035844.1420-5-riel@surriel.com

    Rik van Riel
     
  • The code in on_each_cpu_cond sets CPUs in a locally allocated bitmask,
    which should never be used by other CPUs simultaneously. There is no
    need to use locked memory accesses to set the bits in this bitmap.

    Switch to __cpumask_set_cpu.
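
    The change per bit is simply the non-atomic setter (sketch; the mask
    variable name is illustrative):

        /* The bitmap is local to this call, so the cheaper non-locked
         * variant is sufficient: */
        __cpumask_set_cpu(cpu, cpus);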

    Cc: npiggin@gmail.com
    Cc: mingo@kernel.org
    Cc: will.deacon@arm.com
    Cc: songliubraving@fb.com
    Cc: kernel-team@fb.com
    Cc: hpa@zytor.com
    Suggested-by: Peter Zijlstra
    Signed-off-by: Rik van Riel
    Reviewed-by: Andy Lutomirski
    Signed-off-by: Peter Zijlstra (Intel)
    Link: http://lkml.kernel.org/r/20180926035844.1420-4-riel@surriel.com

    Rik van Riel
     

07 Aug, 2018

1 commit

  • Josh reported that the late SMT evaluation in cpu_smt_state_init() sets
    cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied
    on the kernel command line, as it cannot differentiate between SMT disabled
    by BIOS and SMT soft disable via 'nosmt'. That wrecks the state and
    makes the sysfs interface unusable.

    Rework this so that during bringup of the non-boot CPUs the availability of
    SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a
    'primary' thread then set the local cpu_smt_available marker and evaluate
    this explicitly right after the initial SMP bringup has finished.

    SMT evaluation on x86 is a trainwreck as the firmware has all the
    information _before_ booting the kernel, but there is no interface to query
    it.

    Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")
    Reported-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

08 Nov, 2017

1 commit

  • Use lockdep to check that IRQs are enabled or disabled as expected. This
    way the sanity check only shows overhead when concurrency correctness
    debug code is enabled.
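
    A sketch of the typical replacement (the lockdep variants compile to
    nothing unless IRQ-state tracking is enabled):

        /* Before: an unconditional runtime check on every call. */
        WARN_ON_ONCE(irqs_disabled());

        /* After: only verified when lockdep tracks IRQ state. */
        lockdep_assert_irqs_enabled();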

    Signed-off-by: Frederic Weisbecker
    Acked-by: Thomas Gleixner
    Cc: David S . Miller
    Cc: Lai Jiangshan
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Link: http://lkml.kernel.org/r/1509980490-4285-7-git-send-email-frederic@kernel.org
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

09 Sep, 2017

1 commit

  • First, the number of CPUs can't be a negative number.

    Second, different signedness leads to suboptimal code in the following
    cases:

    1)
    kmalloc(nr_cpu_ids * sizeof(X));

    "int" has to be sign extended to size_t.

    2)
    while (loff_t *pos < nr_cpu_ids)

    MOVSXD is 1 byte longer than the same MOV.

    Other cases exist as well. Basically the compiler is told that nr_cpu_ids
    can't be negative, which can't be deduced if it is "int".

    Code savings on allyesconfig kernel: -3KB

    add/remove: 0/0 grow/shrink: 25/264 up/down: 261/-3631 (-3370)
    function                     old     new   delta
    coretemp_cpu_online          450     512     +62
    rcu_init_one                1234    1272     +38
    pci_device_probe             374     399     +25

    ...

    pgdat_reclaimable_pages      628     556     -72
    select_fallback_rq           446     369     -77
    task_numa_find_cpu          1923    1807    -116

    Link: http://lkml.kernel.org/r/20170819114959.GA30580@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

29 Aug, 2017

1 commit

  • struct call_single_data is used in IPIs to transfer information between
    CPUs. Its size is bigger than sizeof(unsigned long) and less than
    cache line size. Currently it is not allocated with any explicit alignment
    requirements. This makes it possible for an allocated call_single_data to
    cross two cache lines, which results in double the number of cache lines
    that need to be transferred among CPUs.

    This can be fixed by requiring call_single_data to be aligned with the
    size of call_single_data. Currently the size of call_single_data is a
    power of 2. If we add new fields to call_single_data, we may need to
    add padding to make sure the size of the new definition is a power of 2
    as well.

    Fortunately, this is enforced by GCC, which will report bad sizes.

    To set alignment requirements of call_single_data to the size of
    call_single_data, a struct definition and a typedef is used.
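
    A sketch of the resulting definitions, roughly as they appeared in
    include/linux/smp.h at the time:

        struct __call_single_data {
                struct llist_node llist;
                smp_call_func_t func;
                void *info;
                unsigned int flags;
        };

        /* Use call_single_data_t for allocations so instances never cross a
         * cache line boundary. */
        typedef struct __call_single_data call_single_data_t
                __aligned(sizeof(struct __call_single_data));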

    To test the effect of the patch, I used the vm-scalability multiple
    thread swap test case (swap-w-seq-mt). The test will create multiple
    threads and each thread will eat memory until all RAM and part of swap
    is used, so that huge number of IPIs are triggered when unmapping
    memory. In the test, the throughput of memory writing improves ~5%
    compared with misaligned call_single_data, because of faster IPIs.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Huang, Ying
    [ Add call_single_data_t and align with size of call_single_data. ]
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Aaron Lu
    Cc: Borislav Petkov
    Cc: Eric Dumazet
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Michael Ellerman
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.com
    Signed-off-by: Ingo Molnar

    Ying Huang
     

23 May, 2017

2 commits

  • The cpumasks in smp_call_function_many() are private and not subject
    to concurrency; atomic bitops are pointless and expensive.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • An Inter-Processor Interrupt (IPI) is needed when a page is unmapped and
    the process' mm_cpumask() shows the process has ever run on other CPUs.
    Page migration and page reclaim both need IPIs. The number of IPIs that
    need to be sent to different CPUs is especially large for multi-threaded
    workloads since mm_cpumask() is per process.

    For smp_call_function_many(), whenever a CPU queues a CSD to a target
    CPU, it will send an IPI to let the target CPU handle the work.
    This isn't necessary - we only need to send an IPI when queueing a CSD
    to an empty call_single_queue.

    The reason:

    flush_smp_call_function_queue() that is called upon a CPU receiving an
    IPI will empty the queue and then handle all of the CSDs there. So if
    the target CPU's call_single_queue is not empty, we know that:
    i. An IPI for the target CPU has already been sent by 'previous queuers';
    ii. flush_smp_call_function_queue() hasn't emptied that CPU's queue yet.
    Thus, it's safe for us to just queue our CSD there without sending an
    additional IPI. And for the 'previous queuers', we can limit it to the
    first queuer.
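
    In code this maps onto llist_add()'s return value, which reports whether
    the list was empty before the insertion (sketch, close to the logic in
    kernel/smp.c):

        /* Only the first queuer sends the IPI; the target CPU drains the
         * whole call_single_queue when it handles that IPI. */
        if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
                arch_send_call_function_single_ipi(cpu);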

    To demonstrate the effect of this patch, a multi-threaded workload that
    spawns 80 threads to equally consume 100G of memory is used. This is tested
    on a 2-node Broadwell-EP which has 44 cores/88 threads and 32G of memory,
    so after 32G of memory is used up, page reclaiming starts to happen a lot.

    With this patch, the number of IPIs dropped by 88% and throughput
    increased by about 15% for the above workload.

    Signed-off-by: Aaron Lu
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Dave Hansen
    Cc: Huang Ying
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Link: http://lkml.kernel.org/r/20170519075331.GE2084@aaronlu.sh.intel.com
    Signed-off-by: Ingo Molnar

    Aaron Lu
     

02 Mar, 2017

1 commit


26 Oct, 2016

2 commits

  • Currently we don't print anything before starting to bring up secondary
    CPUs. This can be confusing if it takes a long time to bring up the
    secondaries, or if the kernel crashes while doing so and produces no
    further output.

    On x86 they work around this by detecting when the first secondary CPU
    comes up and printing a message (see announce_cpu()). But doing it in
    smp_init() is simpler and works for all arches.

    Signed-off-by: Michael Ellerman
    Reviewed-by: Borislav Petkov
    Cc: akpm@osdl.org
    Cc: jgross@suse.com
    Cc: ak@linux.intel.com
    Cc: tim.c.chen@linux.intel.com
    Cc: len.brown@intel.com
    Cc: peterz@infradead.org
    Cc: richard@nod.at
    Cc: jolsa@redhat.com
    Cc: boris.ostrovsky@oracle.com
    Cc: mgorman@techsingularity.net
    Link: http://lkml.kernel.org/r/1477460275-8266-3-git-send-email-mpe@ellerman.id.au
    Signed-off-by: Thomas Gleixner

    Michael Ellerman
     
  • Currently after bringing up secondary CPUs all arches print "Brought up
    %d CPUs". On x86 they also print the number of nodes that were brought
    online.

    It would be nice to also print the number of nodes on other arches.
    Although we could override smp_announce() on the other ~10 NUMA aware
    arches, it seems simpler to just always print the number of nodes. On
    non-NUMA arches there is just always 1 node.

    Having done that, smp_announce() is no longer weak, and seems small
    enough to just pull directly into smp_init().

    Also update the printing of "%d CPUs" to be smart when an SMP kernel is
    booted on a single CPU system, or when only one CPU is available, eg:

    smp: Brought up 2 nodes, 1 CPU

    Signed-off-by: Michael Ellerman
    Reviewed-by: Borislav Petkov
    Cc: akpm@osdl.org
    Cc: jgross@suse.com
    Cc: ak@linux.intel.com
    Cc: tim.c.chen@linux.intel.com
    Cc: len.brown@intel.com
    Cc: peterz@infradead.org
    Cc: richard@nod.at
    Cc: jolsa@redhat.com
    Cc: boris.ostrovsky@oracle.com
    Cc: mgorman@techsingularity.net
    Link: http://lkml.kernel.org/r/1477460275-8266-2-git-send-email-mpe@ellerman.id.au
    Signed-off-by: Thomas Gleixner

    Michael Ellerman