13 Jan, 2021
1 commit
-
Export cpuidle_driver_state_disabled() so that CPU idle states may be
disabled at runtime for debugging CPU and cluster idle states.Bug: 175718935
Signed-off-by: Lina Iyer
Change-Id: Id9038074d64fb6c0444d9aca68420414c3223e93
05 Jan, 2021
1 commit
-
When entering cluster-wide or system-wide power mode, Exynos cpu
power management driver checks the next hrtimer events of cpu
composing the power domain to prevent unnecessary attempts to enter
the power mode. Since struct cpuidle_device has next_hrtimer, it
can be solved by passing cpuidle device as a parameter of vh.In order to improve responsiveness, it is necessary to prevent
entering the deep idle state in boosting scenario. So, vendor
driver should be able to control the idle state.Due to above requirements, the parameters required for idle enter
and exit different, so the vendor hook is separated into
cpu_idle_enter and cpu_idle_exit.Bug: 176198732
Change-Id: I2262ba1bae5e6622a8e76bc1d5d16fb27af0bb8a
Signed-off-by: Park Bumgyu
13 Dec, 2020
1 commit
-
To select domain idlestates for cpuidle-psci when OSI mode has been
enabled, the PM domains via genpd are being managed through runtime PM.
This works fine for the regular idlepath, but it doesn't during system wide
suspend. More precisely, the domain idlestates becomes temporarily
disabled, which is because the PM core disables runtime PM for devices
during system wide suspend.Later in the system suspend phase, genpd intends to deal with this from its
->suspend_noirq() callback, but this doesn't work as expected for a device
corresponding to a CPU, because the domain idlestates needs to be selected
on a per CPU basis (the PM core doesn't invoke the callbacks like that).To address this problem, let's enable the syscore flag for the
corresponding CPU device that becomes successfully attached to its PM
domain (applicable only in OSI mode). This informs the PM core to skip
invoke the system wide suspend/resume callbacks for the device, thus also
prevents genpd from screwing up its internal state of it.Moreover, to properly select a domain idlestate for the CPUs during
suspend-to-idle, let's assign a specific ->enter_s2idle() callback for the
corresponding domain idlestate (applicable only in OSI mode). From that
callback, let's invoke dev_pm_genpd_suspend|resume(), as this allows a
domain idlestate to be selected for the current CPU by genpd.Signed-off-by: Ulf Hansson
(cherry picked from commit 670c90def03429a228229420fa48a17913fdcc0d git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git)
Bug: 175076037
Change-Id: Ie70c496e0c14b18fa1e8b67231d0a56ff047414f
Signed-off-by: Lina Iyer
20 Nov, 2020
1 commit
-
…nux/kernel/git/netdev/net") into android-mainline
Steps on the way to 5.10-rc5
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I00726ee0d08f08ae6ac5edd07c8fa502b41d4800
16 Nov, 2020
1 commit
-
Annotate tegra_pm_set[clear]_cpu_in_lp2() with RCU_NONIDLE in order to
fix lockdep warning about suspicious RCU usage of a spinlock during late
idling phase.WARNING: suspicious RCU usage
...
include/trace/events/lock.h:13 suspicious rcu_dereference_check() usage!
...
(dump_stack) from (lock_acquire)
(lock_acquire) from (_raw_spin_lock)
(_raw_spin_lock) from (tegra_pm_set_cpu_in_lp2)
(tegra_pm_set_cpu_in_lp2) from (tegra_cpuidle_enter)
(tegra_cpuidle_enter) from (cpuidle_enter_state)
(cpuidle_enter_state) from (cpuidle_enter_state_coupled)
(cpuidle_enter_state_coupled) from (cpuidle_enter)
(cpuidle_enter) from (do_idle)
...Tested-by: Peter Geis
Reported-by: Peter Geis
Signed-off-by: Dmitry Osipenko
Signed-off-by: Rafael J. Wysocki
26 Oct, 2020
1 commit
-
…nux/kernel/git/mchehab/linux-media") into android-mainline
Steps on the way to 5.10-rc1
Resolves conflicts in:
fs/userfaultfd.cSigned-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie3fe3c818f1f6565cfd4fa551de72d2b72ef60af
25 Oct, 2020
1 commit
-
…g/pub/scm/linux/kernel/git/konrad/swiotlb") into android-mainline
Resolves merge issues with:
drivers/cpufreq/cpufreq.cSigned-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic2907cf71867dab7b8e1b5fcbd4c888fc01f8c22
24 Oct, 2020
1 commit
-
Commit 83788c0caed3 ("cpuidle: remove unused exports") removed
capability of registering cpuidle governors, which was unused at that
time. By exporting the symbol, let's allow platform specific modules to
register cpuidle governors and use cpuidle_governor_latency_req() to get
the QoS for the CPU.Bug: 169136276
Link: https://lore.kernel.org/linux-pm/010101746fc98add-45e77496-d2d6-4bc1-a1ce-0692599a9a7a-000000@us-west-2.amazonses.com/
Signed-off-by: Lina Iyer
Change-Id: Ifa91576af0a3ae92ce9b216cb67728f037546c5b
17 Oct, 2020
1 commit
-
Pull powerpc updates from Michael Ellerman:
- A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting
it for powerpc, as well as a related fix for sparc.- Remove support for PowerPC 601.
- Some fixes for watchpoints & addition of a new ptrace flag for
detecting ISA v3.1 (Power10) watchpoint features.- A fix for kernels using 4K pages and the hash MMU on bare metal
Power9 systems with > 16TB of RAM, or RAM on the 2nd node.- A basic idle driver for shallow stop states on Power10.
- Tweaks to our sched domains code to better inform the scheduler about
the hardware topology on Power9/10, where two SMT4 cores can be
presented by firmware as an SMT8 core.- A series doing further reworks & cleanups of our EEH code.
- Addition of a filter for RTAS (firmware) calls done via sys_rtas(),
to prevent root from overwriting kernel memory.- Other smaller features, fixes & cleanups.
Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe
Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn
Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero,
Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad
Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca
Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas
Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro
Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang
Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha,
Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt,
Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain,
Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang
Yingliang, zhengbin.* tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits)
Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed"
selftests/powerpc: Fix eeh-basic.sh exit codes
cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier
powerpc/time: Make get_tb() common to PPC32 and PPC64
powerpc/time: Make get_tbl() common to PPC32 and PPC64
powerpc/time: Remove get_tbu()
powerpc/time: Avoid using get_tbl() and get_tbu() internally
powerpc/time: Make mftb() common to PPC32 and PPC64
powerpc/time: Rename mftbl() to mftb()
powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S
powerpc/32s: Rename head_32.S to head_book3s_32.S
powerpc/32s: Setup the early hash table at all time.
powerpc/time: Remove ifdef in get_dec() and set_dec()
powerpc: Remove get_tb_or_rtc()
powerpc: Remove __USE_RTC()
powerpc: Tidy up a bit after removal of PowerPC 601.
powerpc: Remove support for PowerPC 601
powerpc: Remove PowerPC 601
powerpc: Drop SYNC_601() ISYNC_601() and SYNC()
powerpc: Remove CONFIG_PPC601_SYNC_FIX
...
28 Sep, 2020
1 commit
27 Sep, 2020
1 commit
-
…x/kernel/git/jejb/scsi") into 'android-mainline'
Fixes up a merge issue in:
net/ipv6/route.c
on the way to a 5.9-rc7 release.Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I4eb508eb3761b95ad8f39dd79f03b3352481ceaf
23 Sep, 2020
2 commits
-
CPUs may fail to enter the chosen idle state if there was a
pending interrupt, causing the cpuidle driver to return an error
value.Record that and export it via sysfs along with the other idle state
statistics.This could prove useful in understanding behavior of the governor
and the system during usecases that involve multiple CPUs.Signed-off-by: Lina Iyer
[ rjw: Changelog and documentation edits ]
Signed-off-by: Rafael J. Wysocki -
The commit 1098582a0f6c ("sched,idle,rcu: Push rcu_idle deeper into the
idle path"), moved the calls rcu_idle_enter|exit() into the cpuidle core.However, it forgot to remove a couple of comments in enter_s2idle_proper()
about why RCU_NONIDLE earlier was needed. So, let's drop them as they have
become a bit misleading.Fixes: 1098582a0f6c ("sched,idle,rcu: Push rcu_idle deeper into the idle path")
Signed-off-by: Ulf Hansson
Signed-off-by: Rafael J. Wysocki
22 Sep, 2020
2 commits
-
If the PSCI OSI mode isn't supported or fails to be enabled, the PM domain
topology with the genpd providers isn't initialized. This is perfectly fine
from cpuidle-psci point of view.However, since the PM domain topology in the DTS files is a description of
the HW, no matter of whether the PSCI OSI mode is supported or not, other
consumers besides the CPUs may rely on it.Therefore, let's always allow the initialization of the PM domain topology
to succeed, independently of whether the PSCI OSI mode is supported.
Consequentially we need to track if we succeed to enable the OSI mode, as
to know when a domain idlestate can be selected.Note that, CPU devices are still not being attached to the PM domain
topology, unless the PSCI OSI mode is supported.Acked-by: Sudeep Holla
Signed-off-by: Ulf Hansson
Signed-off-by: Rafael J. Wysocki -
The current user (cpuidle-psci) of psci_set_osi_mode() only needs to enable
the PSCI OSI mode. Although, as subsequent changes shows, there is a need
to be able to reset back into the PSCI PC mode.Therefore, let's extend psci_set_osi_mode() to take a bool as in-parameter,
to let the user indicate whether to enable OSI or to switch back to PC
mode.Reviewed-by: Sudeep Holla
Signed-off-by: Ulf Hansson
Signed-off-by: Rafael J. Wysocki
21 Sep, 2020
3 commits
-
The enter() callback of CPUIDLE drivers returns index of the entered idle
state on success or a negative value on failure. The negative value could
any negative value, i.e. it doesn't necessarily needs to be a error code.
That's because CPUIDLE core only cares about the fact of failure and not
about the reason of the enter() failure.Like every other enter() callback, the arm_cpuidle_simple_enter() returns
the entered idle-index on success. Unlike some of other drivers, it never
fails. It happened that TEGRA_C1=index=err=0 in the code of cpuidle-tegra
driver, and thus, there is no problem for the cpuidle-tegra driver created
by the typo in the code which assumes that the arm_cpuidle_simple_enter()
returns a error code.The arm_cpuidle_simple_enter() also may return a -ENODEV error if CPU_IDLE
is disabled in a kernel's config, but all CPUIDLE drivers are disabled if
CPU_IDLE is disabled, including the cpuidle-tegra driver. So we can't ever
see the error code from arm_cpuidle_simple_enter() today.Of course the code may get some changes in the future and then the
typo may transform into a real bug, so let's correct the typo! The
tegra_cpuidle_state_enter() is now changed to make it return the entered
idle-index on success and negative error code on fail, which puts it on
par with the arm_cpuidle_simple_enter(), making code consistent in regards
to the error handling.This patch fixes a minor typo in the code, it doesn't fix any bugs.
Signed-off-by: Dmitry Osipenko
Reviewed-by: Jon Hunter
Signed-off-by: Rafael J. Wysocki -
The commit eb1f00237aca ("lockdep,trace: Expose tracepoints"), started to
expose us for tracepoints. This lead to the following RCU splat on an ARM64
Qcom board.[ 5.529634] WARNING: suspicious RCU usage
[ 5.537307] sdhci-pltfm: SDHCI platform and OF driver helper
[ 5.541092] 5.9.0-rc3 #86 Not tainted
[ 5.541098] -----------------------------
[ 5.541105] ../include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!
[ 5.541110]
[ 5.541110] other info that might help us debug this:
[ 5.541110]
[ 5.541116]
[ 5.541116] rcu_scheduler_active = 2, debug_locks = 1
[ 5.541122] RCU used illegally from extended quiescent state!
[ 5.541129] no locks held by swapper/0/0.
[ 5.541134]
[ 5.541134] stack backtrace:
[ 5.541143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #86
[ 5.541149] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 5.541157] Call trace:
[ 5.568185] sdhci_msm 7864900.sdhci: Got CD GPIO
[ 5.574186] dump_backtrace+0x0/0x1c8
[ 5.574206] show_stack+0x14/0x20
[ 5.574229] dump_stack+0xe8/0x154
[ 5.574250] lockdep_rcu_suspicious+0xd4/0xf8
[ 5.574269] lock_acquire+0x3f0/0x460
[ 5.574292] _raw_spin_lock_irqsave+0x80/0xb0
[ 5.574314] __pm_runtime_suspend+0x4c/0x188
[ 5.574341] psci_enter_domain_idle_state+0x40/0xa0
[ 5.574362] cpuidle_enter_state+0xc0/0x610
[ 5.646487] cpuidle_enter+0x38/0x50
[ 5.650651] call_cpuidle+0x18/0x40
[ 5.654467] do_idle+0x228/0x278
[ 5.657678] cpu_startup_entry+0x24/0x70
[ 5.661153] rest_init+0x1a4/0x278
[ 5.665061] arch_call_rest_init+0xc/0x14
[ 5.668272] start_kernel+0x508/0x540Following the path in pm_runtime_put_sync_suspend() from
psci_enter_domain_idle_state(), it seems like we end up using the RCU.
Therefore, let's simply silence the splat by informing the RCU about it
with RCU_NONIDLE.Note that, this is a temporary solution. Instead we should strive to avoid
using RCU_NONIDLE (and similar), but rather push rcu_idle_enter|exit()
further down, closer to the arch specific code. However, as the CPU PM
notifiers are also using the RCU, additional rework is needed.Reported-by: Naresh Kamboju
Signed-off-by: Ulf Hansson
Acked-by: Paul E. McKenney
Signed-off-by: Rafael J. Wysocki -
Linux 5.9-rc6
Signed-off-by: Greg Kroah-Hartman
Change-Id: I3bccdbb773bfc2c604742e6ff5983bf0b61ba0b5
19 Sep, 2020
1 commit
-
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.9:- Opt us out of the DEBUG_VM_PGTABLE support for now as it's causing
crashes.- Fix a long standing bug in our DMA mask handling that was hidden
until recently, and which caused problems with some drivers.- Fix a boot failure on systems with large amounts of RAM, and no
hugepage support and using Radix MMU, only seen in the lab.- A few other minor fixes.
Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Gautham R. Shenoy,
Hari Bathini, Ira Weiny, Nick Desaulniers, Shirisha Ganta, Vaibhav
Jain, and Vaidyanathan Srinivasan"* tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
cpuidle: pseries: Fix CEDE latency conversion from tb to us
powerpc/dma: Fix dma_map_ops::get_required_mask
Revert "powerpc/build: vdso linker warning for orphan sections"
powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc
selftests/powerpc: Skip PROT_SAO test in guests/LPARS
powerpc/book3s64/radix: Fix boot failure with large amount of guest memory
17 Sep, 2020
1 commit
-
Some drivers have to do significant work, some of which relies on RCU
still being active. Instead of using RCU_NONIDLE in the drivers and
flipping RCU back on, allow drivers to take over RCU-idle duty.Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Ulf Hansson
Tested-by: Borislav Petkov
Signed-off-by: Rafael J. Wysocki
15 Sep, 2020
1 commit
-
This driver does not restore stop > 3 state, so it limits itself
to states which do not lose full state or TB.The POWER10 SPRs are sufficiently different from P9 that it seems
easier to split out the P10 code. The POWER10 deep sleep code
(e.g., the BHRB restore) has been taken out, but it can be re-added
when stop > 3 support is added.Signed-off-by: Nicholas Piggin
Tested-by: Pratik Rajesh Sampat
Tested-by: Vaidyanathan Srinivasan
Reviewed-by: Pratik Rajesh Sampat
Reviewed-by: Gautham R. Shenoy
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200819094700.493399-1-npiggin@gmail.com
08 Sep, 2020
1 commit
-
Commit d947fb4c965c ("cpuidle: pseries: Fixup exit latency for
CEDE(0)") sets the exit latency of CEDE(0) based on the latency values
of the Extended CEDE states advertised by the platform. The values
advertised by the platform are in timebase ticks. However the cpuidle
framework requires the latency values in microseconds.If the tb-ticks value advertised by the platform correspond to a value
smaller than 1us, during the conversion from tb-ticks to microseconds,
in the current code, the result becomes zero. This is incorrect as it
puts a CEDE state on par with the snooze state.This patch fixes this by rounding up the result obtained while
converting the latency value from tb-ticks to microseconds. It also
prints a warning in case we discover an extended-cede state with
wakeup latency to be 0. In such a case, ensure that CEDE(0) has a
non-zero wakeup latency.Fixes: d947fb4c965c ("cpuidle: pseries: Fixup exit latency for CEDE(0)")
Signed-off-by: Gautham R. Shenoy
Reviewed-by: Vaidyanathan Srinivasan
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/1599125247-28488-1-git-send-email-ego@linux.vnet.ibm.com
01 Sep, 2020
1 commit
-
Linux 5.9-rc3
Signed-off-by: Greg Kroah-Hartman
Change-Id: Ic7758bc57a7d91861657388ddd015db5c5db5480
26 Aug, 2020
3 commits
-
This allows moving the leave_mm() call into generic code before
rcu_idle_enter(). Gets rid of more trace_*_rcuidle() users.Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Reviewed-by: Thomas Gleixner
Acked-by: Rafael J. Wysocki
Tested-by: Marco Elver
Link: https://lkml.kernel.org/r/20200821085348.369441600@infradead.org -
Lots of things take locks, due to a wee bug, rcu_lockdep didn't notice
that the locking tracepoints were using RCU.Push rcu_idle_{enter,exit}() as deep as possible into the idle paths,
this also resolves a lot of _rcuidle()/RCU_NONIDLE() usage.Specifically, sched_clock_idle_wakeup_event() will use ktime which
will use seqlocks which will tickle lockdep, and
stop_critical_timings() uses lock.Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Reviewed-by: Thomas Gleixner
Acked-by: Rafael J. Wysocki
Tested-by: Marco Elver
Link: https://lkml.kernel.org/r/20200821085348.310943801@infradead.org -
Match the pattern elsewhere in this file.
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Reviewed-by: Thomas Gleixner
Acked-by: Rafael J. Wysocki
Tested-by: Marco Elver
Link: https://lkml.kernel.org/r/20200821085348.251340558@infradead.org
20 Aug, 2020
1 commit
-
An event that gather the idle state that the cpu attempted to enter and
actually entered is added. Through this, the idle statistics of the cpu
can be obtained and used for vendor specific algorithms or for system
analysis.Bug: 162980647
Change-Id: I9c2491d524722042e881864488f7b3cf7e903d1e
Signed-off-by: Park Bumgyu
08 Aug, 2020
1 commit
-
Pull powerpc updates from Michael Ellerman:
- Add support for (optionally) using queued spinlocks & rwlocks.
- Support for a new faster system call ABI using the scv instruction on
Power9 or later.- Drop support for the PROT_SAO mmap/mprotect flag as it will be
unsupported on Power10 and future processors, leaving us with no way
to implement the functionality it requests. This risks breaking
userspace, though we believe it is unused in practice.- A bug fix for, and then the removal of, our custom stack expansion
checking. We now allow stack expansion up to the rlimit, like other
architectures.- Remove the remnants of our (previously disabled) topology update
code, which tried to react to NUMA layout changes on virtualised
systems, but was prone to crashes and other problems.- Add PMU support for Power10 CPUs.
- A change to our signal trampoline so that we don't unbalance the link
stack (branch return predictor) in the signal delivery path.- Lots of other cleanups, refactorings, smaller features and so on as
usual.Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey
Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju
T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan
S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris
Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan
Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn
Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand,
Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel
Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh
Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan
Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton
Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran,
Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud,
Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh
Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar
Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza
Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov,
Wei Yongjun, Wen Xiong, YueHaibing.* tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (337 commits)
selftests/powerpc: Fix pkey syscall redefinitions
powerpc: Fix circular dependency between percpu.h and mmu.h
powerpc/powernv/sriov: Fix use of uninitialised variable
selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs
powerpc/40x: Fix assembler warning about r0
powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
cpuidle: pseries: Fixup exit latency for CEDE(0)
cpuidle: pseries: Add function to parse extended CEDE records
cpuidle: pseries: Set the latency-hint before entering CEDE
selftests/powerpc: Fix online CPU selection
powerpc/perf: Consolidate perf_callchain_user_[64|32]()
powerpc/pseries/hotplug-cpu: Remove double free in error path
powerpc/pseries/mobility: Add pr_debug() for device tree changes
powerpc/pseries/mobility: Set pr_fmt()
powerpc/cacheinfo: Warn if cache object chain becomes unordered
powerpc/cacheinfo: Improve diagnostics about malformed cache lists
powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages
powerpc/cacheinfo: Set pr_fmt()
powerpc: fix function annotations to avoid section mismatch warnings with gcc-10
...
30 Jul, 2020
9 commits
-
We are currently assuming that CEDE(0) has exit latency 10us, since
there is no way for us to query from the platform. However, if the
wakeup latency of an Extended CEDE state is smaller than 10us, then we
can be sure that the exit latency of CEDE(0) cannot be more than that.In this patch, we fix the exit latency of CEDE(0) if we discover an
Extended CEDE state with wakeup latency smaller than 10us.Benchmark results:
On POWER8, this patch does not have any impact since the advertized
latency of Extended CEDE (1) is 30us which is higher than the default
latency of CEDE (0) which is 10us.On POWER9 we see improvement the single-threaded performance of
ebizzy, and no regression in the wakeup latency or the number of
context-switches.ebizzy:
2 ebizzy threads bound to the same big-core. 25% improvement in the
avg records/s with patch.x without_patch
* with_patch
N Min Max Median Avg Stddev
x 10 2491089 5834307 5398375 4244335 1596244.9
* 10 2893813 5834474 5832448 5327281.3 1055941.4context_switch2:
There is no major regression observed with this patch as seen from the
context_switch2 benchmark.context_switch2 across CPU0 CPU1 (Both belong to same big-core, but
different small cores). We observe a minor 0.14% regression in the
number of context-switches (higher is better).x without_patch
* with_patch
N Min Max Median Avg Stddev
x 500 348872 362236 354712 354745.69 2711.827
* 500 349422 361452 353942 354215.4 2576.9258Difference at 99.0% confidence
-530.288 +/- 430.963
-0.149484% +/- 0.121485%
(Student's t, pooled s = 2645.24)context_switch2 across CPU0 CPU8 (Different big-cores). We observe a
0.37% improvement in the number of context-switches (higher is
better).x without_patch
* with_patch
N Min Max Median Avg Stddev
x 500 287956 294940 288896 288977.23 646.59295
* 500 288300 294646 289582 290064.76 1161.9992Difference at 99.0% confidence
1087.53 +/- 153.194
0.376337% +/- 0.0530125%
(Student's t, pooled s = 940.299)schbench:
No major difference could be seen until the 99.9th percentile.Without-patch:
Latency percentiles (usec)
50.0th: 29
75.0th: 39
90.0th: 49
95.0th: 59
*99.0th: 13104
99.5th: 14672
99.9th: 15824
min=0, max=17993With-patch:
Latency percentiles (usec)
50.0th: 29
75.0th: 40
90.0th: 50
95.0th: 61
*99.0th: 13648
99.5th: 14768
99.9th: 15664
min=0, max=29812Signed-off-by: Gautham R. Shenoy
[mpe: Minor formatting]
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/1596087177-30329-4-git-send-email-ego@linux.vnet.ibm.com -
Currently we use CEDE with latency-hint 0 as the only other idle state
on a dedicated LPAR apart from the polling "snooze" state.The platform might support additional extended CEDE idle states, which
can be discovered through the "ibm,get-system-parameter" rtas-call
made with CEDE_LATENCY_TOKEN.This patch adds a function to obtain information about the extended
CEDE idle states from the platform and parse the contents to populate
an array of extended CEDE states. These idle states thus discovered
will be added to the cpuidle framework in the next patch.dmesg on a POWER8 and POWER9 LPAR, demonstrating the output of parsing
the extended CEDE latency parameters are as followsPOWER8
[ 10.093279] xcede : xcede_record_size = 10
[ 10.093285] xcede : Record 0 : hint = 1, latency = 0x3c00 tb ticks, Wake-on-irq = 1
[ 10.093291] xcede : Record 1 : hint = 2, latency = 0x4e2000 tb ticks, Wake-on-irq = 0
[ 10.093297] cpuidle : Skipping the 2 Extended CEDE idle statesPOWER9
[ 5.913180] xcede : xcede_record_size = 10
[ 5.913183] xcede : Record 0 : hint = 1, latency = 0x400 tb ticks, Wake-on-irq = 1
[ 5.913188] xcede : Record 1 : hint = 2, latency = 0x3e8000 tb ticks, Wake-on-irq = 0
[ 5.913193] cpuidle : Skipping the 2 Extended CEDE idle statesSigned-off-by: Gautham R. Shenoy
[mpe: Make space for 16 records, drop memset, minor cleanup & formatting]
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/1596087177-30329-3-git-send-email-ego@linux.vnet.ibm.com -
As per the PAPR, each H_CEDE call is associated with a latency-hint to
be passed in the VPA field "cede_latency_hint". The CEDE states that
we were implicitly entering so far is CEDE with latency-hint = 0.This patch explicitly sets the latency hint corresponding to the CEDE
state that we are currently entering. While at it, we save the
previous hint, to be restored once we wakeup from CEDE. This will be
required in the future when we expose extended-cede states through the
cpuidle framework, where each of them will have a different
cede-latency hint.Signed-off-by: Gautham R. Shenoy
[mpe: Make cede_latency_hint static]
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/1596087177-30329-2-git-send-email-ego@linux.vnet.ibm.com -
Control Flow Integrity(CFI) is a security mechanism that disallows
changes to the original control flow graph of a compiled binary,
making it significantly harder to perform such attacks.init_state_node() assign same function callback to different
function pointer declarations.static int init_state_node(struct cpuidle_state *idle_state,
const struct of_device_id *matches,
struct device_node *state_node) { ...
idle_state->enter = match_id->data; ...
idle_state->enter_s2idle = match_id->data; }Function declarations:
struct cpuidle_state { ...
int (*enter) (struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index);void (*enter_s2idle) (struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index); };In this case, either enter() or enter_s2idle() would cause CFI check
failed since they use same callee.Align function prototype of enter() since it needs return value for
some use cases. The return value of enter_s2idle() is no
need currently.Signed-off-by: Neal Liu
Reviewed-by: Sami Tolvanen
Signed-off-by: Rafael J. Wysocki -
Depending on the SoC/platform, additional devices may be part of the PSCI
PM domain topology. This is the case with 'qcom,rpmh-rsc' device, for
example, even if this is not yet visible in the corresponding DTS-files.Without going into too much details, a device like the 'qcom,rpmh-rsc' may
have HW constraints that needs to be obeyed to, before a domain idlestate
can be picked.Therefore, let's implement the ->sync_state() callback to receive a
notification when all consumers of the PSCI PM domain providers have been
attached/probed to it. In this way, we can make sure all constraints from
all relevant devices, are taken into account before allowing a domain
idlestate to be picked.Acked-by: Saravana Kannan
Signed-off-by: Ulf Hansson
Reviewed-by: Lukasz Luba
Signed-off-by: Rafael J. Wysocki -
To enable support for deferred probing and to allow implementation of the
->sync_state() callback from subsequent changes, let's convert into a
platform driver.Reviewed-by: Lina Iyer
Signed-off-by: Ulf Hansson
Signed-off-by: Rafael J. Wysocki -
The current error paths for the cpuidle-psci driver, may leak memory or
possibly leave CPU devices attached to their PM domains. These are quite
harmless issues, but still deserves to be taken care of.Although, rather than fixing them by keeping track of allocations that
needs to be freed, which tends to become a bit messy, let's convert into a
platform driver. In this way, it gets easier to fix the memory leaks as we
can rely on the devm_* functions.Moreover, converting to a platform driver also enables support for deferred
probe, which subsequent changes takes benefit from.Signed-off-by: Ulf Hansson
Reviewed-by: Lukasz Luba
Signed-off-by: Rafael J. Wysocki -
Currently we allow the cpuidle driver registration to succeed, even if we
failed to enable the OSI mode when the hierarchical DT layout is used. This
means running in a degraded mode, by using the available idle states per
CPU, while also preventing the domain idle states.Moving forward, this behaviour looks quite questionable to maintain, as
complexity seems to grow around it, especially when trying to add support
for deferred probe, for example.Therefore, let's make the cpuidle driver registration to fail in this
situation, thus relying on the default architectural cpuidle backend for
WFI to be used.Reviewed-by: Lina Iyer
Signed-off-by: Ulf Hansson
Signed-off-by: Rafael J. Wysocki -
The combined build object for the PSCI cpuidle driver and the PSCI PM
domain, is a bit messy. Therefore let's split it up by adding a new Kconfig
ARM_PSCI_CPUIDLE_DOMAIN and convert into two separate objects.Reviewed-by: Lina Iyer
Reviewed-by: Sudeep Holla
Signed-off-by: Ulf Hansson
Signed-off-by: Rafael J. Wysocki
16 Jul, 2020
1 commit
-
The sparse tool complains as follows:
drivers/cpuidle/cpuidle-pseries.c:25:23: warning:
symbol 'pseries_idle_driver' was not declared. Should it be static?'pseries_idle_driver' is not used outside of this file, so marks
it static.Reported-by: Hulk Robot
Signed-off-by: Wei Yongjun
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200714142424.66648-1-weiyongjun1@huawei.com
15 Jul, 2020
1 commit
-
Commit 1961acad2f88559c2cdd2ef67c58c3627f1f6e54 removes usage of
function "validate_dt_prop_sizes". This patch removes this unused
function.Signed-off-by: Abhishek Goel
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200706053258.121475-1-huntbag@linux.vnet.ibm.com
25 Jun, 2020
1 commit
-
Implement call_cpuidle_s2idle() in analogy with call_cpuidle()
for the s2idle-specific idle state entry and invoke it from
cpuidle_idle_call() to make the s2idle-specific idle entry code
path look more similar to the "regular" idle entry one.No intentional functional impact.
Signed-off-by: Rafael J. Wysocki
Acked-by: Chen Yu