13 Feb, 2019

1 commit

  • [ Upstream commit 9456823c842f346c74265fcd98d008d87a7eb6f5 ]

    of_find_node_by_path() acquires a reference to the node
    returned by it and that reference needs to be dropped by its caller.
    bl_idle_init() doesn't do that, so fix it.
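
    The get/put pairing described above can be sketched in user space
    as follows. This is a minimal model, not the driver code: struct
    node, find_node() and node_put() are hypothetical stand-ins for
    struct device_node, of_find_node_by_path() and of_node_put().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node; only the refcount matters. */
struct node {
    int refcount;
};

struct node root_node = { .refcount = 1 };

/* Models of_find_node_by_path(): hands back the node with its refcount raised. */
struct node *find_node(void)
{
    root_node.refcount++;
    return &root_node;
}

/* Models of_node_put(): drops one reference taken earlier. */
void node_put(struct node *np)
{
    assert(np != NULL && np->refcount > 0);
    np->refcount--;
}

/* Init routine that balances the lookup with a put on every exit path. */
int init_balanced(int fail_early)
{
    struct node *np = find_node();
    int ret = 0;

    if (fail_early)
        ret = -1;   /* the error path still reaches the put below */

    node_put(np);
    return ret;
}
```

    Calling init_balanced() any number of times leaves
    root_node.refcount at its initial value, whether or not the
    simulated failure path is taken.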

    Signed-off-by: Yangtao Li
    Acked-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Sasha Levin

    Yangtao Li
     

26 Jan, 2019

1 commit

  • [ Upstream commit 2b038cbc5fcf12a7ee1cc9bfd5da1e46dacdee87 ]

    When booting a pseries kernel with PREEMPT enabled, it dumps the
    following warning:

    BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
    caller is pseries_processor_idle_init+0x5c/0x22c
    CPU: 13 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc3-00090-g12201a0128bc-dirty #828
    Call Trace:
    [c000000429437ab0] [c0000000009c8878] dump_stack+0xec/0x164 (unreliable)
    [c000000429437b00] [c0000000005f2f24] check_preemption_disabled+0x154/0x160
    [c000000429437b90] [c000000000cab8e8] pseries_processor_idle_init+0x5c/0x22c
    [c000000429437c10] [c000000000010ed4] do_one_initcall+0x64/0x300
    [c000000429437ce0] [c000000000c54500] kernel_init_freeable+0x3f0/0x500
    [c000000429437db0] [c0000000000112dc] kernel_init+0x2c/0x160
    [c000000429437e20] [c00000000000c1d0] ret_from_kernel_thread+0x5c/0x6c

    This happens because the code calls get_lppaca() which calls
    get_paca() and it checks if preemption is disabled through
    check_preemption_disabled().

    Preemption should be disabled because the per CPU variable may make no
    sense if there is a preemption (and a CPU switch) after it reads the
    per CPU data and when it is used.

    In this device driver specifically, it is not a problem, because this
    code just needs to have access to one lppaca struct, and it does not
    matter if it is the current per CPU lppaca struct or not (i.e. when
    there is a preemption and a CPU migration).

    That said, the most appropriate fix seems to be related to avoiding
    the debug_smp_processor_id() call at get_paca(), instead of calling
    preempt_disable() before get_paca().

    Signed-off-by: Breno Leitao
    Signed-off-by: Michael Ellerman
    Signed-off-by: Sasha Levin

    Breno Leitao
     

21 Nov, 2018

1 commit

  • commit 763f191af51f127cf8e69cd361f50bf6180768a5 upstream.

    There's no point to register the cpuidle driver for the current CPU, when
    the initialization of the arch specific back-end data fails by returning
    -ENXIO.

    Instead, let's re-order the sequence to its original flow, by first trying
    to initialize the back-end part and then act accordingly on the returned
    error code. Additionally, let's print the error message regardless of
    which error code was returned.
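
    A rough sketch of the ordering described above: initialize the
    back-end first, and only register when that succeeds. The names
    idle_init_cpu() and struct init_result are hypothetical, the
    registration step is modelled by a flag, and how the real driver
    maps individual error codes for its caller is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Outcome of one per-CPU init attempt in this sketch. */
struct init_result {
    int ret;            /* 0 on success, negative errno otherwise */
    bool registered;    /* whether the registration step ran */
};

/*
 * backend_ret stands in for the return value of the arch back-end
 * initialization; registration is only reached when it is zero.
 */
struct init_result idle_init_cpu(int cpu, int backend_ret)
{
    struct init_result res = { .ret = 0, .registered = false };

    if (backend_ret) {
        /* Report the failure whatever the error code is. */
        fprintf(stderr, "CPU %d: back-end init failed (%d)\n",
                cpu, backend_ret);
        res.ret = backend_ret;
        return res;
    }

    res.registered = true;  /* registration would happen here */
    return res;
}
```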

    Fixes: a0d46a3dfdc3 (ARM: cpuidle: Register per cpuidle device)
    Signed-off-by: Ulf Hansson
    Reviewed-by: Daniel Lezcano
    Cc: 4.19+ # v4.19+
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Ulf Hansson
     

25 Aug, 2018

1 commit

  • The case addressed by commit 5ef499cd571c (cpuidle: menu: Handle
    stopped tick more aggressively) in the stopped tick case is present
    when the tick has not been stopped yet too. Namely, if only two CPU
    idle states, shallow state A with target residency significantly
    below the tick boundary and deep state B with target residency
    significantly above it, are available and the predicted idle
    duration is above the tick boundary, but below the target residency
    of state B, state A will be selected and the CPU may spend an indefinite
    amount of time in it, which is not quite energy-efficient.

    However, if the tick has not been stopped yet and the governor is
    about to select a shallow idle state for the CPU even though the idle
    duration predicted by it is above the tick boundary, it should be
    fine to wake up the CPU early, so the tick can be retained then and
    the governor will have a chance to select a deeper state when it runs
    next time.

    [Note that when this really happens, it will make the idle duration
    predictor believe that the CPU might be idle longer than predicted,
    which will make it more likely to predict longer idle durations going
    forward, but that will also cause deeper idle states to be selected
    going forward, on average, which is what's needed here.]

    Fixes: 87c9fe6ee495 (cpuidle: menu: Avoid selecting shallow states with stopped tick)
    Reported-by: Leo Yan
    Cc: 4.17+ # 4.17+: 5ef499cd571c (cpuidle: menu: Handle ...)
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

22 Aug, 2018

1 commit

  • Pull more power management updates from Rafael Wysocki:
    "These fix the main idle loop and the menu cpuidle governor, clean up
    the latter, fix a mistake in the PCI bus type's support for system
    suspend and resume, fix the ondemand and conservative cpufreq
    governors, address a build issue in the system wakeup framework and
    make the ACPI C-states descriptions less confusing.

    Specifics:

    - Make the idle loop handle stopped scheduler tick correctly (Rafael
    Wysocki).

    - Prevent the menu cpuidle governor from letting CPUs spend too much
    time in shallow idle states when it is invoked with scheduler tick
    stopped and clean it up somewhat (Rafael Wysocki).

    - Avoid invoking the platform firmware to make the platform enter the
    ACPI S3 sleep state with suspended PCIe root ports which may
    confuse the firmware and cause it to crash (Rafael Wysocki).

    - Fix sysfs-related race in the ondemand and conservative cpufreq
    governors which may cause the system to crash if the governor
    module is removed during an update of CPU frequency limits (Henry
    Willard).

    - Select SRCU when building the system wakeup framework to avoid a
    build issue in it (zhangyi).

    - Make the descriptions of ACPI C-states vendor-neutral to avoid
    confusion (Prarit Bhargava)"

    * tag 'pm-4.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    cpuidle: menu: Handle stopped tick more aggressively
    sched: idle: Avoid retaining the tick when it has been stopped
    PCI / ACPI / PM: Resume all bridges on suspend-to-RAM
    cpuidle: menu: Update stale polling override comment
    cpufreq: governor: Avoid accessing invalid governor_data
    x86/ACPI/cstate: Make APCI C1 FFH MWAIT C-state description vendor-neutral
    cpuidle: menu: Fix white space
    PM / sleep: wakeup: Fix build error caused by missing SRCU support

    Linus Torvalds
     

20 Aug, 2018

1 commit

  • Commit 87c9fe6ee495 (cpuidle: menu: Avoid selecting shallow states
    with stopped tick) missed the case when the target residencies of
    deep idle states of CPUs are above the tick boundary which may cause
    the CPU to get stuck in a shallow idle state for a long time.

    Say there are two CPU idle states available: one shallow, with the
    target residency much below the tick boundary and one deep, with
    the target residency significantly above the tick boundary. In
    that case, if the tick has been stopped already and the expected
    next timer event is relatively far in the future, the governor will
    assume the idle duration to be equal to TICK_USEC and it will select
    the idle state for the CPU accordingly. However, that will cause the
    shallow state to be selected even though it would have been more
    energy-efficient to select the deep one.

    To address this issue, modify the governor to always use the time
    till the closest timer event instead of the predicted idle duration
    if the latter is less than the tick period length and the tick has
    been stopped already. Also make it extend the search for a matching
    idle state if the tick is stopped to avoid settling on a shallow
    state if deep states with target residencies above the tick period
    length are available.

    In addition, make it always indicate that the tick should be stopped
    if it has been stopped already for consistency.
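
    The selection rule described above might look roughly like this.
    It is a simplified sketch, not the menu governor code: TICK_US is
    an assumed 4 ms tick, and effective_idle_us(), pick_state() and
    struct sketch_state are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define TICK_US 4000u   /* assumed tick period in microseconds (HZ=250) */

/* Hypothetical idle state descriptor; states are ordered shallow to deep. */
struct sketch_state {
    unsigned int target_residency_us;
};

/*
 * With the tick already stopped, a prediction shorter than the tick
 * period is replaced by the time till the closest timer event.
 */
unsigned int effective_idle_us(unsigned int predicted_us,
                               unsigned int next_timer_us,
                               bool tick_stopped)
{
    if (tick_stopped && predicted_us < TICK_US)
        return next_timer_us;
    return predicted_us;
}

/* Deepest state whose target residency does not exceed idle_us. */
int pick_state(const struct sketch_state *states, int n, unsigned int idle_us)
{
    int idx = 0;

    assert(n > 0);
    for (int i = 1; i < n; i++) {
        if (states[i].target_residency_us > idle_us)
            break;
        idx = i;
    }
    return idx;
}
```

    With a shallow state (1 us) and a deep state (10000 us), a 500 us
    prediction and a timer 50000 us away, the deep state is picked
    once the tick is stopped, and the shallow one otherwise.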

    Fixes: 87c9fe6ee495 (cpuidle: menu: Avoid selecting shallow states with stopped tick)
    Reported-by: Leo Yan
    Acked-by: Peter Zijlstra (Intel)
    Cc: 4.17+ # 4.17+
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

18 Aug, 2018

1 commit

  • Pull powerpc updates from Michael Ellerman:
    "Notable changes:

    - A fix for a bug in our page table fragment allocator, where a page
    table page could be freed and reallocated for something else while
    still in use, leading to memory corruption etc. The fix reuses
    pt_mm in struct page (x86 only) for a powerpc only refcount.

    - Fixes to our pkey support. Several are user-visible changes, but
    bring us in to line with x86 behaviour and/or fix outright bugs.
    Thanks to Florian Weimer for reporting many of these.

    - A series to improve the hvc driver & related OPAL console code,
    which have been seen to cause hardlockups at times. The hvc driver
    changes in particular have been in linux-next for ~month.

    - Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y.

    - Remove Power8 DD1 and Power9 DD1 support, neither chip should be in
    use anywhere other than as a paper weight.

    - An optimised memcmp implementation using Power7-or-later VMX
    instructions

    - Support for barrier_nospec on some NXP CPUs.

    - Support for flushing the count cache on context switch on some IBM
    CPUs (controlled by firmware), as a Spectre v2 mitigation.

    - A series to enhance the information we print on unhandled signals
    to bring it into line with other arches, including showing the
    offending VMA and dumping the instructions around the fault.

    Thanks to: Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey
    Kardashevskiy, Alexey Spirkov, Alistair Popple, Andrew Donnellan,
    Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Bartosz Golaszewski,
    Benjamin Herrenschmidt, Bharat Bhushan, Bjoern Noetel, Boqun Feng,
    Breno Leitao, Bryant G. Ly, Camelia Groza, Christophe Leroy, Christoph
    Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt, Darren Stevens, Dave
    Young, David Gibson, Diana Craciun, Finn Thain, Florian Weimer,
    Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand,
    Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel
    Stanley, Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh
    Salgaonkar, Markus Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues,
    Michael Hanselmann, Michael Neuling, Michael Schmitz, Mukesh Ojha,
    Murilo Opsfelder Araujo, Nicholas Piggin, Parth Y Shah, Paul
    Mackerras, Paul Menzel, Ram Pai, Randy Dunlap, Rashmica Gupta, Reza
    Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff, Scott Wood,
    Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson, Thiago
    Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat
    Rao, zhong jiang"

    * tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (234 commits)
    powerpc/mm/book3s/radix: Add mapping statistics
    powerpc/uaccess: Enable get_user(u64, *p) on 32-bit
    powerpc/mm/hash: Remove unnecessary do { } while(0) loop
    powerpc/64s: move machine check SLB flushing to mm/slb.c
    powerpc/powernv/idle: Fix build error
    powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range
    powerpc/mm: remove warning about ‘type’ being set
    powerpc/32: Include setup.h header file to fix warnings
    powerpc: Move `path` variable inside DEBUG_PROM
    powerpc/powermac: Make some functions static
    powerpc/powermac: Remove variable x that's never read
    cxl: remove a dead branch
    powerpc/powermac: Add missing include of header pmac.h
    powerpc/kexec: Use common error handling code in setup_new_fdt()
    powerpc/xmon: Add address lookup for percpu symbols
    powerpc/mm: remove huge_pte_offset_and_shift() prototype
    powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled
    powerpc/pseries: Fix endianness while restoring of r3 in MCE handler.
    powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements
    powerpc/fadump: handle crash memory ranges array index overflow
    ...

    Linus Torvalds
     

17 Aug, 2018

1 commit


15 Aug, 2018

1 commit


31 Jul, 2018

1 commit


25 Jun, 2018

1 commit

  • It's perfectly fine to have multiple cpuidle drivers compiled into the
    build configuration. However, it's not good to throw an error on driver
    registration failure if some other driver is already initialised and
    assigned. In such cases, __cpuidle_register_driver returns -EBUSY and
    we can check for that error code before throwing the error.
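
    The check can be sketched as a small predicate. The helper name
    should_report_register_error() is hypothetical; only the -EBUSY
    return value comes from the description above.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether a driver registration result deserves an error
 * message. -EBUSY means another cpuidle driver already owns the CPU,
 * which is an expected situation rather than a failure to report.
 */
bool should_report_register_error(int ret)
{
    return ret != 0 && ret != -EBUSY;
}
```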

    Signed-off-by: Sudeep Holla
    Acked-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki

    Sudeep Holla
     

08 Jun, 2018

1 commit

  • Pull powerpc updates from Michael Ellerman:
    "Notable changes:

    - Support for split PMD page table lock on 64-bit Book3S (Power8/9).

    - Add support for HAVE_RELIABLE_STACKTRACE, so we properly support
    live patching again.

    - Add support for patching barrier_nospec in copy_from_user() and
    syscall entry.

    - A couple of fixes for our data breakpoints on Book3S.

    - A series from Nick optimising TLB/mm handling with the Radix MMU.

    - Numerous small cleanups to squash sparse/gcc warnings from Mathieu
    Malaterre.

    - Several series optimising various parts of the 32-bit code from
    Christophe Leroy.

    - Removal of support for two old machines, "SBC834xE" and "C2K"
    ("GEFanuc,C2K"), which is why the diffstat has so many deletions.

    And many other small improvements & fixes.

    There's a few out-of-area changes. Some minor ftrace changes OK'ed by
    Steve, and a fix to our powernv cpuidle driver. Then there's a series
    touching mm, x86 and fs/proc/task_mmu.c, which cleans up some details
    around pkey support. It was ack'ed/reviewed by Ingo & Dave and has
    been in next for several weeks.

    Thanks to: Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Al
    Viro, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd
    Bergmann, Balbir Singh, Cédric Le Goater, Christophe Leroy, Christophe
    Lombard, Colin Ian King, Dave Hansen, Fabio Estevam, Finn Thain,
    Frederic Barrat, Gautham R. Shenoy, Haren Myneni, Hari Bathini, Ingo
    Molnar, Jonathan Neuschäfer, Josh Poimboeuf, Kamalesh Babulal,
    Madhavan Srinivasan, Mahesh Salgaonkar, Mark Greer, Mathieu Malaterre,
    Matthew Wilcox, Michael Neuling, Michal Suchanek, Naveen N. Rao,
    Nicholas Piggin, Nicolai Stange, Olof Johansson, Paul Gortmaker, Paul
    Mackerras, Peter Rosin, Pridhiviraj Paidipeddi, Ram Pai, Rashmica
    Gupta, Ravi Bangoria, Russell Currey, Sam Bobroff, Samuel
    Mendoza-Jonas, Segher Boessenkool, Shilpasri G Bhat, Simon Guo,
    Souptick Joarder, Stewart Smith, Thiago Jung Bauermann, Torsten Duwe,
    Vaibhav Jain, Wei Yongjun, Wolfram Sang, Yisheng Xie, YueHaibing"

    * tag 'powerpc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (251 commits)
    powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap
    cpuidle: powernv: Fix promotion from snooze if next state disabled
    powerpc: fix build failure by disabling attribute-alias warning in pci_32
    ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait()
    powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted"
    powerpc: fix spelling mistake: "Usupported" -> "Unsupported"
    powerpc/pkeys: Detach execute_only key on !PROT_EXEC
    powerpc/powernv: copy/paste - Mask SO bit in CR
    powerpc: Remove core support for Marvell mv64x60 hostbridges
    powerpc/boot: Remove core support for Marvell mv64x60 hostbridges
    powerpc/boot: Remove support for Marvell mv64x60 i2c controller
    powerpc/boot: Remove support for Marvell MPSC serial controller
    powerpc/embedded6xx: Remove C2K board support
    powerpc/lib: optimise PPC32 memcmp
    powerpc/lib: optimise 32 bits __clear_user()
    powerpc/time: inline arch_vtime_task_switch()
    powerpc/Makefile: set -mcpu=860 flag for the 8xx
    powerpc: Implement csum_ipv6_magic in assembly
    powerpc/32: Optimise __csum_partial()
    powerpc/lib: Adjust .balign inside string functions for PPC32
    ...

    Linus Torvalds
     

05 Jun, 2018

1 commit

  • The commit 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of
    snooze to deeper idle state") introduced a timeout for the snooze idle
    state so that it could eventually be promoted to a deeper idle
    state. The snooze timeout value is static and set to the target
    residency of the next idle state, which would train the cpuidle
    governor to pick the next idle state eventually.

    The unfortunate side-effect of this is that if the next idle state(s)
    is disabled, the CPU will forever remain in snooze, despite the fact
    that the system is completely idle, and other deeper idle states are
    available.

    This patch fixes the issue by dynamically setting the snooze timeout
    to the target residency of the next enabled state on the device.
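
    A sketch of that lookup, under stated assumptions: struct
    snooze_state and snooze_timeout_us() are hypothetical names,
    index 0 is taken to be snooze, and returning 0 when no deeper
    state is enabled is a convention of this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-state data; states are ordered shallow to deep. */
struct snooze_state {
    bool disabled;
    uint64_t target_residency_us;
};

/*
 * Snooze timeout: target residency of the first enabled state deeper
 * than snooze (index 0). Returns 0 if every deeper state is disabled.
 */
uint64_t snooze_timeout_us(const struct snooze_state *states, int count)
{
    assert(count > 0);
    for (int i = 1; i < count; i++) {
        if (!states[i].disabled)
            return states[i].target_residency_us;
    }
    return 0;
}
```

    With the middle state disabled, the timeout follows the next
    enabled state instead, so the governor can still be steered
    towards a deeper state.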

    Before Patch:
    POWER8 : Only nap disabled.
    $ cpupower monitor sleep 30
    sleep took 30.01297 seconds and exited with status 0
    |Idle_Stats
    PKG |CORE|CPU | snoo | Nap | Fast
    0| 8| 0| 96.41| 0.00| 0.00
    0| 8| 1| 96.43| 0.00| 0.00
    0| 8| 2| 96.47| 0.00| 0.00
    0| 8| 3| 96.35| 0.00| 0.00
    0| 8| 4| 96.37| 0.00| 0.00
    0| 8| 5| 96.37| 0.00| 0.00
    0| 8| 6| 96.47| 0.00| 0.00
    0| 8| 7| 96.47| 0.00| 0.00

    POWER9: Shallow states (stop0lite, stop1lite, stop2lite, stop0, stop1,
    stop2) disabled:
    $ cpupower monitor sleep 30
    sleep took 30.05033 seconds and exited with status 0
    |Idle_Stats
    PKG |CORE|CPU | snoo | stop | stop | stop | stop | stop | stop | stop | stop
    0| 16| 0| 89.79| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
    0| 16| 1| 90.12| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
    0| 16| 2| 90.21| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
    0| 16| 3| 90.29| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00

    After Patch:
    POWER8 : Only nap disabled.
    $ cpupower monitor sleep 30
    sleep took 30.01200 seconds and exited with status 0
    |Idle_Stats
    PKG |CORE|CPU | snoo | Nap | Fast
    0| 8| 0| 16.58| 0.00| 77.21
    0| 8| 1| 18.42| 0.00| 75.38
    0| 8| 2| 4.70| 0.00| 94.09
    0| 8| 3| 17.06| 0.00| 81.73
    0| 8| 4| 3.06| 0.00| 95.73
    0| 8| 5| 7.00| 0.00| 96.80
    0| 8| 6| 1.00| 0.00| 98.79
    0| 8| 7| 5.62| 0.00| 94.17

    POWER9: Shallow states (stop0lite, stop1lite, stop2lite, stop0, stop1,
    stop2) disabled:

    $ cpupower monitor sleep 30
    sleep took 30.02110 seconds and exited with status 0
    |Idle_Stats
    PKG |CORE|CPU | snoo | stop | stop | stop | stop | stop | stop | stop | stop
    0| 0| 0| 0.69| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 9.39| 89.70
    0| 0| 1| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.05| 93.21
    0| 0| 2| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 89.93
    0| 0| 3| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 93.26

    Fixes: 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state")
    Cc: stable@vger.kernel.org # v4.2+
    Signed-off-by: Gautham R. Shenoy
    Reviewed-by: Balbir Singh
    Signed-off-by: Michael Ellerman

    Gautham R. Shenoy
     

31 May, 2018

2 commits


09 Apr, 2018

2 commits

  • If the scheduler tick has been stopped already and the governor
    selects a shallow idle state, the CPU can spend a long time in that
    state if the selection is based on an inaccurate prediction of idle
    time. That effect turns out to be relevant, so it needs to be
    mitigated.

    To that end, modify the menu governor to discard the result of the
    idle time prediction if the tick is stopped and the predicted idle
    time is less than the tick period length, unless the tick timer is
    going to expire soon.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
     
  • If the tick isn't stopped, the target residency of the state selected
    by the menu governor may be greater than the actual time to the next
    tick and that means lost energy.

    To avoid that, make tick_nohz_get_sleep_length() return the current
    time to the next event (before stopping the tick) in addition to the
    estimated one via an extra pointer argument and make menu_select()
    use that value to refine the state selection when necessary.
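
    The refinement step could be sketched like this, assuming the
    time to the next event is already available in microseconds.
    struct menu_state and refine_state() are hypothetical names, and
    disabled states are ignored for brevity.

```c
#include <assert.h>

/* Hypothetical idle state descriptor; states are ordered shallow to deep. */
struct menu_state {
    unsigned int target_residency_us;
};

/*
 * While the tick keeps running, walk back to a shallower state until
 * the target residency fits within the time to the next event.
 */
int refine_state(const struct menu_state *states, int idx,
                 unsigned int delta_next_us)
{
    assert(idx >= 0);
    while (idx > 0 && states[idx].target_residency_us > delta_next_us)
        idx--;
    return idx;
}
```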

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
     

06 Apr, 2018

1 commit

  • Add a new pointer argument to cpuidle_select() and to the ->select
    cpuidle governor callback to allow a boolean value indicating
    whether or not the tick should be stopped before entering the
    selected state to be returned from there.

    Make the ladder governor ignore that pointer (to preserve its
    current behavior) and make the menu governor return 'false' through
    it if:
    (1) the idle exit latency is constrained at 0, or
    (2) the selected state is a polling one, or
    (3) the expected idle period duration is within the tick period
    range.
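
    The three conditions can be read as one predicate. This is a
    sketch only: TICK_US is an assumed 4 ms tick and
    menu_should_stop_tick() is a hypothetical helper, not the
    governor's actual interface.

```c
#include <stdbool.h>

#define TICK_US 4000u   /* assumed tick period in microseconds (HZ=250) */

/* Returns the value the governor would pass back through the pointer. */
bool menu_should_stop_tick(unsigned int latency_req_us,
                           bool state_is_polling,
                           unsigned int expected_idle_us)
{
    if (latency_req_us == 0)        /* (1) exit latency constrained at 0 */
        return false;
    if (state_is_polling)           /* (2) selected state is a polling one */
        return false;
    if (expected_idle_us < TICK_US) /* (3) idle period within the tick range */
        return false;
    return true;
}
```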

    In addition to that, the correction factor computations in the menu
    governor need to take the possibility that the tick may not be
    stopped into account to avoid artificially small correction factor
    values. To that end, add a mechanism to record tick wakeups, as
    suggested by Peter Zijlstra, and use it to modify the menu_update()
    behavior when tick wakeup occurs. Namely, if the CPU is woken up by
    the tick and the return value of tick_nohz_get_sleep_length() is not
    within the tick boundary, the predicted idle duration is likely too
    short, so make menu_update() try to compensate for that by updating
    the governor statistics as though the CPU was idle for a long time.

    Since the value returned through the new argument pointer of
    cpuidle_select() is not used by its caller yet, this change by
    itself is not expected to alter the functionality of the code.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
     

29 Mar, 2018

4 commits

  • Rik reports that he sees an increase in CPU use in one benchmark
    due to commit 612f1a22f067 "cpuidle: poll_state: Add time limit to
    poll_idle()" that caused poll_idle() to call local_clock() in every
    iteration of the loop. Utilization increase generally means more
    non-idle time with respect to total CPU time (on the average) which
    implies reduced CPU frequency.

    Doug reports that limiting the rate of local_clock() invocations
    in there causes much less power to be drawn during a CPU-intensive
    parallel workload (with idle states 1 and 2 disabled to enforce more
    state 0 residency).

    These two reports together suggest that executing local_clock() on
    multiple CPUs in parallel at a high rate may cause chips to get hot
    and trigger thermal/power limits on them to kick in, so reduce the
    rate of local_clock() invocations in poll_idle() to avoid that issue.
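
    The effect of rate limiting can be illustrated with a simulated
    clock. Everything here is an assumption made for the sketch: each
    spin is pretended to cost 1 ns, the need_resched() check is left
    out, and poll_idle_sketch(), struct poll_result and the
    relax_count parameter are hypothetical.

```c
#include <stdint.h>

/* Counters reported by the simulated polling loop. */
struct poll_result {
    unsigned long iterations;   /* spins executed */
    unsigned long clock_reads;  /* simulated clock reads, initial one included */
};

/*
 * Spin until limit_ns has elapsed, reading the clock only once every
 * (relax_count + 1) iterations instead of on every iteration.
 */
struct poll_result poll_idle_sketch(uint64_t limit_ns, unsigned int relax_count)
{
    struct poll_result res = { 0, 0 };
    uint64_t now_ns = 0;            /* simulated clock */
    uint64_t start_ns = now_ns;
    unsigned int loop_count = 0;

    res.clock_reads++;              /* initial read that sets start_ns */
    for (;;) {
        now_ns += 1;                /* pretend each spin costs 1 ns */
        res.iterations++;
        if (loop_count++ < relax_count)
            continue;               /* skip the clock read this time */
        loop_count = 0;
        res.clock_reads++;
        if (now_ns - start_ns > limit_ns)
            break;
    }
    return res;
}
```

    With a 1000 ns limit, a relax count of 0 reads the clock on every
    spin, while a relax count of 200 (an arbitrary value for this
    example) needs only a handful of reads at the cost of overshooting
    the limit by a few spins.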

    Fixes: 612f1a22f067 "cpuidle: poll_state: Add time limit to poll_idle()"
    Reported-by: Rik van Riel
    Reported-by: Doug Smythies
    Signed-off-by: Rafael J. Wysocki
    Tested-by: Rik van Riel
    Reviewed-by: Rik van Riel

    Rafael J. Wysocki
     
  • Add a new attribute group called "s2idle" under the sysfs directory
    of each cpuidle state that supports the ->enter_s2idle callback
    and put two new attributes, "usage" and "time", into that group to
    represent the number of times the given state was requested for
    suspend-to-idle and the total time spent in suspend-to-idle after
    requesting that state, respectively.

    That will allow diagnostic information related to suspend-to-idle
    to be collected without enabling advanced debug features and
    analyzing dmesg output.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • All the needed code has been already merged to mach-exynos core in
    commit af9971144dde ("ARM: EXYNOS: add coupled cpuidle support for
    Exynos3250"), so enable support for coupled variant also for Exynos3250
    SoCs.

    Signed-off-by: Marek Szyprowski
    Acked-by: Krzysztof Kozlowski
    Acked-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Rafael J. Wysocki

    Marek Szyprowski
     
  • If poll_idle() is allowed to spin until need_resched() returns 'true',
    it may actually spin for a much longer time than expected by the idle
    governor, since set_tsk_need_resched() is not always called by the
    timer interrupt handler. If that happens, the CPU may spend much
    more time than anticipated in the "polling" state.

    To prevent that from happening, limit the time of the spinning loop
    in poll_idle().
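
    A toy model of the bounded loop, with time counted in abstract
    1 ns steps. poll_spin_ns() is a hypothetical helper, and
    resched_at_ns stands in for the moment need_resched() would start
    returning 'true'.

```c
#include <stdint.h>

/*
 * Returns how long the loop spins: it ends either when a reschedule
 * is requested or when the time limit is exceeded, whichever is first.
 */
uint64_t poll_spin_ns(uint64_t resched_at_ns, uint64_t limit_ns)
{
    uint64_t elapsed_ns = 0;

    while (elapsed_ns < resched_at_ns) {    /* models !need_resched() */
        elapsed_ns += 1;                    /* one spin, assumed 1 ns */
        if (elapsed_ns > limit_ns)
            break;                          /* time limit reached */
    }
    return elapsed_ns;
}
```

    Passing UINT64_MAX as resched_at_ns models a reschedule request
    that never arrives; the loop then stops just past the limit
    instead of spinning indefinitely.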

    Suggested-by: Peter Zijlstra
    Signed-off-by: Rafael J. Wysocki
    Tested-by: Doug Smythies

    Rafael J. Wysocki
     

05 Mar, 2018

1 commit


03 Feb, 2018

1 commit

  • Pull powerpc updates from Michael Ellerman:
    "Highlights:

    - Enable support for memory protection keys aka "pkeys" on Power7/8/9
    when using the hash table MMU.

    - Extend our interrupt soft masking to support masking PMU interrupts
    as well as "normal" interrupts, and then use that to implement
    local_t for a ~4x speedup vs the current atomics-based
    implementation.

    - A new driver "ocxl" for "Open Coherent Accelerator Processor
    Interface (OpenCAPI)" devices.

    - Support for new device tree properties on PowerVM to describe
    hotpluggable memory and devices.

    - Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 64-bit
    VDSO.

    - Freescale updates from Scott: fixes for CPM GPIO and an FSL PCI
    erratum workaround, plus a minor cleanup patch.

    As well as quite a lot of other changes all over the place, and small
    fixes and cleanups as always.

    Thanks to: Alan Modra, Alastair D'Silva, Alexey Kardashevskiy,
    Alistair Popple, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V,
    Anju T Sudhakar, Anshuman Khandual, Anton Blanchard, Arnd Bergmann,
    Balbir Singh, Benjamin Herrenschmidt, Bhaktipriya Shridhar, Bryant G.
    Ly, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Cyril Bur,
    David Gibson, Desnes A. Nunes do Rosario, Dmitry Torokhov, Frederic
    Barrat, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo A. R. Silva,
    Gustavo Romero, Ivan Mikhaylov, Joakim Tjernlund, Joe Perches, Josh
    Poimboeuf, Juan J. Alvarez, Julia Cartwright, Kamalesh Babulal,
    Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Malaterre, Michael
    Bringmann, Michael Hanselmann, Michael Neuling, Nathan Fontenot,
    Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Philippe Bergheaud,
    Ram Pai, Russell Currey, Santosh Sivaraj, Scott Wood, Seth Forshee,
    Simon Guo, Stewart Smith, Sukadev Bhattiprolu, Thiago Jung Bauermann,
    Vaibhav Jain, Vasyl Gomonovych"

    * tag 'powerpc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (199 commits)
    powerpc/mm/radix: Fix build error when RADIX_MMU=n
    macintosh/ams-input: Use true and false for boolean values
    macintosh: change some data types from int to bool
    powerpc/watchdog: Print the NIP in soft_nmi_interrupt()
    powerpc/watchdog: regs can't be null in soft_nmi_interrupt()
    powerpc/watchdog: Tweak watchdog printks
    powerpc/cell: Remove axonram driver
    rtc-opal: Fix handling of firmware error codes, prevent busy loops
    powerpc/mpc52xx_gpt: make use of raw_spinlock variants
    macintosh/adb: Properly mark continued kernel messages
    powerpc/pseries: Fix cpu hotplug crash with memoryless nodes
    powerpc/numa: Ensure nodes initialized for hotplug
    powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes
    powerpc/kernel: Block interrupts when updating TIDR
    powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn
    powerpc/mm/nohash: do not flush the entire mm when range is a single page
    powerpc/pseries: Add Initialization of VF Bars
    powerpc/pseries/pci: Associate PEs to VFs in configure SR-IOV
    powerpc/eeh: Add EEH notify resume sysfs
    powerpc/eeh: Add EEH operations to notify resume
    ...

    Linus Torvalds
     

18 Jan, 2018

3 commits


05 Jan, 2018

1 commit


17 Nov, 2017

1 commit

  • Pull powerpc updates from Michael Ellerman:
    "A bit of a small release, I suspect in part due to me travelling for
    KS. But my backlog of patches to review is smaller than usual, so I
    think in part folks just didn't send as much this cycle.

    Non-highlights:

    - Five fixes for the >128T address space handling, both to fix bugs
    in our implementation and to bring the semantics exactly into line
    with x86.

    Highlights:

    - Support for a new OPAL call on bare metal machines which gives us a
    true NMI (ie. is not masked by MSR[EE]=0) for debugging etc.

    - Support for Power9 DD2 in the CXL driver.

    - Improvements to machine check handling so that uncorrectable errors
    can be reported into the generic memory_failure() machinery.

    - Some fixes and improvements for VPHN, which is used under PowerVM
    to notify the Linux partition of topology changes.

    - Plumbing to enable TM (transactional memory) without suspend on
    some Power9 processors (PPC_FEATURE2_HTM_NO_SUSPEND).

    - Support for emulating vector loads from cache-inhibited memory, on
    some Power9 revisions.

    - Disable the fast-endian switch "syscall" by default (behind a
    CONFIG), we believe it has never had any users.

    - A major rework of the API drivers use when initiating and waiting
    for long running operations performed by OPAL firmware, and changes
    to the powernv_flash driver to use the new API.

    - Several fixes for the handling of FP/VMX/VSX while processes are
    using transactional memory.

    - Optimisations of TLB range flushes when using the radix MMU on
    Power9.

    - Improvements to the VAS facility used to access coprocessors on
    Power9, and related improvements to the way the NX crypto driver
    handles requests.

    - Implementation of PMEM_API and UACCESS_FLUSHCACHE for 64-bit.

    Thanks to: Alexey Kardashevskiy, Alistair Popple, Allen Pais, Andrew
    Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Balbir Singh, Benjamin
    Herrenschmidt, Breno Leitao, Christophe Leroy, Christophe Lombard,
    Cyril Bur, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven,
    Guilherme G. Piccoli, Gustavo Romero, Haren Myneni, Joel Stanley,
    Kamalesh Babulal, Kautuk Consul, Markus Elfring, Masami Hiramatsu,
    Michael Bringmann, Michael Neuling, Michal Suchanek, Naveen N. Rao,
    Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pedro Miraglia
    Franco de Carvalho, Philippe Bergheaud, Sandipan Das, Seth Forshee,
    Shriya, Stephen Rothwell, Stewart Smith, Sukadev Bhattiprolu, Tyrel
    Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, and William A.
    Kennington III"

    * tag 'powerpc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (151 commits)
    powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature
    powerpc/64s: Fix masking of SRR1 bits on instruction fault
    powerpc/64s: mm_context.addr_limit is only used on hash
    powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation
    powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary
    powerpc/64s/hash: Fix fork() with 512TB process address space
    powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
    powerpc/64s/hash: Fix 512T hint detection to use >= 128T
    powerpc: Fix DABR match on hash based systems
    powerpc/signal: Properly handle return value from uprobe_deny_signal()
    powerpc/fadump: use kstrtoint to handle sysfs store
    powerpc/lib: Implement UACCESS_FLUSHCACHE API
    powerpc/lib: Implement PMEM API
    powerpc/powernv/npu: Don't explicitly flush nmmu tlb
    powerpc/powernv/npu: Use flush_all_mm() instead of flush_tlb_mm()
    powerpc/powernv/idle: Round up latency and residency values
    powerpc/kprobes: refactor kprobe_lookup_name for safer string operations
    powerpc/kprobes: Blacklist emulate_update_regs() from kprobes
    powerpc/kprobes: Do not disable interrupts for optprobes and kprobes_on_ftrace
    powerpc/kprobes: Disable preemption before invoking probe handler for optprobes
    ...

    Linus Torvalds
     

13 Nov, 2017

3 commits

  • * pm-cpuidle:
    intel_idle: Graceful probe failure when MWAIT is disabled
    cpuidle: Avoid assignment in if () argument
    cpuidle: Clean up cpuidle_enable_device() error handling a bit
    cpuidle: ladder: Add per CPU PM QoS resume latency support
    ARM: cpuidle: Refactor rollback operations if init fails
    ARM: cpuidle: Correct driver unregistration if init fails
    intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT)
    cpuidle: fix broadcast control when broadcast can not be entered

    Conflicts:
    drivers/idle/intel_idle.c

    Rafael J. Wysocki
     
  • * pm-qos:
    PM / QoS: Fix device resume latency framework
    PM / QoS: Drop PM_QOS_FLAG_REMOTE_WAKEUP

    Rafael J. Wysocki
     
  • On PowerNV platforms, firmware provides the exit latency and
    target residency for each idle state in nanoseconds, while the
    cpuidle framework expects the values in microseconds. Round up
    to the nearest microsecond to avoid errors in cases where the
    values are fractional microseconds.

    The default idle state, 'snooze', has an exit latency of zero.
    If other states have a fractional microsecond exit latency,
    they would get rounded down to zero microseconds and make the
    cpuidle framework choose a deeper idle state when the snooze
    loop is the right choice.

    Reported-by: Anton Blanchard
    Signed-off-by: Vaidyanathan Srinivasan
    Reviewed-by: Gautham R. Shenoy
    Signed-off-by: Michael Ellerman

    Vaidyanathan Srinivasan
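
    As a rough illustration of the rounding described in this commit,
    the sketch below converts nanoseconds to microseconds with and
    without rounding up. The helper names are hypothetical and this is
    not the driver's code; in the kernel itself the DIV_ROUND_UP() macro
    covers this kind of arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the driver): convert a
 * firmware-provided nanosecond value to microseconds, rounding up so
 * that any non-zero fraction counts as a full microsecond.
 */
static inline uint32_t ns_to_us_round_up(uint64_t ns)
{
    /* Written as quotient plus remainder check so ns + 999 cannot overflow. */
    uint64_t us = ns / 1000 + (ns % 1000 != 0);

    assert(us <= UINT32_MAX);
    return (uint32_t)us;
}

/* Truncating conversion, shown only for comparison. */
static inline uint32_t ns_to_us_truncate(uint64_t ns)
{
    uint64_t us = ns / 1000;

    assert(us <= UINT32_MAX);
    return (uint32_t)us;
}
```

    With truncation, a 500 ns exit latency becomes 0 us and is
    indistinguishable from snooze; with rounding up it becomes 1 us.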
     

09 Nov, 2017

2 commits


08 Nov, 2017

3 commits

  • Individual CPUs may have special requirements to not enter
    deep idle states. For example, a CPU running real time
    applications would not want to enter deep idle states to
    avoid latency impacts. At the same time other CPUs that
    do not have such a requirement could allow deep idle
    states to save power.

    This was already implemented in the menu governor. Implement
    similar changes in the ladder governor, which gets selected
    when CONFIG_NO_HZ and CONFIG_NO_HZ_IDLE are not set. Refer to
    the following commits for the menu governor changes.

    Signed-off-by: Ramesh Thomas
    Signed-off-by: Rafael J. Wysocki

    Ramesh Thomas
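
    The idea of a per-CPU latency limit can be sketched as follows.
    This is only an illustration under stated assumptions, not the
    ladder governor's code: the function names are hypothetical, idle
    states are assumed to be sorted by increasing exit latency, and
    state 0 is assumed to be always permitted.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the limit that applies to one CPU is the
 * tighter of the system-wide request and that CPU's own request.
 */
static int32_t cpu_latency_limit(int32_t global_req_us, int32_t cpu_req_us)
{
    return cpu_req_us < global_req_us ? cpu_req_us : global_req_us;
}

/*
 * Hypothetical helper: return the index of the deepest idle state
 * whose exit latency does not exceed the limit. Assumes the table is
 * sorted by increasing exit latency and that state 0 is always allowed.
 */
static int deepest_allowed_state(const uint32_t *exit_latency_us,
                                 int nr_states, int32_t limit_us)
{
    int state = 0;

    assert(nr_states > 0);
    assert(limit_us >= 0);

    for (int i = 1; i < nr_states; i++) {
        /* Widen both sides to avoid a signed/unsigned comparison. */
        if ((int64_t)exit_latency_us[i] > (int64_t)limit_us)
            break;
        state = i;
    }
    return state;
}
```

    A CPU running a real-time task can then be given a small per-CPU
    request and stay in shallow states, while other CPUs keep using
    the deeper ones.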
     
  • Rafael J. Wysocki
     
  • The special value of 0 for device resume latency PM QoS means
    "no restriction", but there are two problems with that.

    First, device resume latency PM QoS requests with 0 as the
    value are always put in front of requests with positive
    values in the priority lists used internally by the PM QoS
    framework, causing 0 to be chosen as an effective constraint
    value. However, that 0 is then interpreted as "no restriction",
    effectively overriding the other requests with specific
    restrictions, which is incorrect.

    Second, the users of device resume latency PM QoS have no
    way to specify that *any* resume latency at all should be
    avoided, which is an artificial limitation in general.

    To address these issues, modify device resume latency PM QoS to
    use S32_MAX as the "no constraint" value and 0 as the "no
    latency at all" one and rework its users (the cpuidle menu
    governor, the genpd QoS governor and the runtime PM framework)
    to follow these changes.

    Also add a special "n/a" value to the corresponding user space I/F
    to allow user space to indicate that it cannot accept any resume
    latencies at all for the given device.

    Fixes: 85dc0b8a4019 (PM / QoS: Make it possible to expose PM QoS latency constraints)
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323
    Reported-by: Reinette Chatre
    Signed-off-by: Rafael J. Wysocki
    Tested-by: Reinette Chatre
    Tested-by: Geert Uytterhoeven
    Tested-by: Tero Kristo
    Reviewed-by: Ramesh Thomas

    Rafael J. Wysocki
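
    A minimal sketch of why S32_MAX works better than 0 as the
    "no constraint" value: when the effective constraint is the minimum
    of all requests, a request that imposes no limit should never win
    against a real one. The names below are hypothetical and the code is
    not the PM QoS implementation; INT32_MAX stands in for the kernel's
    S32_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the two special values described above. */
#define QOS_NO_CONSTRAINT INT32_MAX /* stands in for S32_MAX */
#define QOS_NO_LATENCY    0         /* "no latency at all" */

/*
 * Hypothetical aggregation: the effective resume latency constraint
 * is the smallest requested value. An empty list means no constraint.
 */
static int32_t effective_resume_latency(const int32_t *requests, size_t n)
{
    int32_t effective = QOS_NO_CONSTRAINT;

    for (size_t i = 0; i < n; i++) {
        assert(requests[i] >= 0);
        if (requests[i] < effective)
            effective = requests[i];
    }
    return effective;
}
```

    With this encoding a "no constraint" request mixed with a 100 us
    request yields 100 us, and a request of 0 now means that no resume
    latency is acceptable rather than silently lifting every limit.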
     

04 Nov, 2017

1 commit

  • MIPS will soon not be a part of Imagination Technologies, and as such
    many @imgtec.com email addresses will no longer be valid. This patch
    updates the addresses for those who:

    - Have 10 or more patches in mainline authored using an @imgtec.com
    email address, or any patches dated within the past year.

    - Are still with Imagination but leaving as part of the MIPS business
    unit, as determined from an internal email address list.

    - Haven't already updated their email address (i.e. JamesH) or expressed
    a desire to be excluded (i.e. Maciej).

    - Acked v2 or earlier of this patch, which leaves Deng-Cheng, Matt &
    myself.

    New addresses are of the form firstname.lastname@mips.com, and all
    verified against an internal email address list. An entry is added to
    .mailmap for each person such that get_maintainer.pl will report the new
    addresses rather than @imgtec.com addresses which will soon be dead.

    Instances of the affected addresses throughout the tree are then
    mechanically replaced with the new @mips.com address.

    Signed-off-by: Paul Burton
    Cc: Deng-Cheng Zhu
    Acked-by: Dengcheng Zhu
    Cc: Matt Redfearn
    Acked-by: Matt Redfearn
    Cc: Andrew Morton
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: trivial@kernel.org
    Signed-off-by: Linus Torvalds

    Paul Burton
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be
    applied to a file was done in a spreadsheet of side by side results
    from the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files created by Philippe Ombredanne.
    Philippe prepared the base worksheet, and did an initial spot review
    of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

14 Oct, 2017

1 commit

  • If init fails, we need to execute two levels of rollback operations:
    the first level rolls back the operations of the failed CPU, and the
    second level iterates over all CPUs that succeeded to cancel their
    registration. Currently the code uses one function to perform both
    levels of rollback.

    This commit refactors the rollback operations: it adds a new function,
    arm_idle_init_cpu(), which encapsulates driver registration for one
    specified CPU and the first-level rollback, and uses arm_idle_init()
    to iterate over all CPUs and perform the second-level rollback.

    Suggested-by: Daniel Lezcano
    Signed-off-by: Leo Yan
    Acked-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki

    Leo Yan
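
    The two-level structure described in this commit can be sketched in
    isolation as below. This is a simplified model, not the ARM cpuidle
    driver: init_one_cpu() and init_all_cpus() are hypothetical stand-ins
    for the roles the message assigns to arm_idle_init_cpu() and
    arm_idle_init(), and the boolean arrays and the fail_on_cpu hook
    exist only to make the rollback observable.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool allocated[NR_CPUS];  /* models per-CPU driver data */
static bool registered[NR_CPUS]; /* models a registered per-CPU driver */
static int fail_on_cpu = -1;     /* test hook: CPU whose registration fails */

/* First level: set up one CPU and undo its own partial work on failure. */
static int init_one_cpu(int cpu)
{
    allocated[cpu] = true;

    if (cpu == fail_on_cpu) {
        /* Registration failed: release what this CPU had acquired. */
        allocated[cpu] = false;
        return -1;
    }

    registered[cpu] = true;
    return 0;
}

/* Undo a fully initialised CPU. */
static void unregister_one_cpu(int cpu)
{
    registered[cpu] = false;
    allocated[cpu] = false;
}

/* Second level: iterate over all CPUs and unwind earlier successes. */
static int init_all_cpus(void)
{
    int cpu;
    int ret = 0;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        ret = init_one_cpu(cpu);
        if (ret)
            goto out_fail;
    }
    return 0;

out_fail:
    /* The failed CPU already cleaned up after itself; undo the rest. */
    while (--cpu >= 0)
        unregister_one_cpu(cpu);
    return ret;
}
```

    Keeping the per-CPU cleanup inside the per-CPU function means the
    outer loop only has to undo CPUs that completed successfully.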