13 Nov, 2017

1 commit

  • * pm-cpuidle:
    intel_idle: Graceful probe failure when MWAIT is disabled
    cpuidle: Avoid assignment in if () argument
    cpuidle: Clean up cpuidle_enable_device() error handling a bit
    cpuidle: ladder: Add per CPU PM QoS resume latency support
    ARM: cpuidle: Refactor rollback operations if init fails
    ARM: cpuidle: Correct driver unregistration if init fails
    intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT)
    cpuidle: fix broadcast control when broadcast can not be entered

    Conflicts:
    drivers/idle/intel_idle.c

    Rafael J. Wysocki
     

09 Nov, 2017

1 commit

  • When MWAIT is disabled, intel_idle refuses to probe.
    But it may mislead the user by blaming this on the model number:

    intel_idle: does not run on family 6 model 79

    So defer the check for MWAIT until after the model# white-list check succeeds,
    and if the MWAIT check fails, tell the user how to fix it:

    intel_idle: Please enable MWAIT in BIOS SETUP
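
    A minimal userspace sketch of the reordered probe logic described
    above. The struct, the enum, the function name and the whitelist
    contents are hypothetical stand-ins, not the driver's code; only the
    ordering of the two checks is taken from this commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the boot CPU, for illustration only. */
struct cpu_info {
	int family;
	int model;
	bool has_mwait;
};

enum probe_result { PROBE_OK, PROBE_UNKNOWN_MODEL, PROBE_MWAIT_DISABLED };

/* Assumed whitelist contents; the real driver matches against its own ID table. */
static const int supported_models[] = { 79, 85 };

static enum probe_result probe_sketch(const struct cpu_info *cpu)
{
	bool listed = false;
	size_t i;

	assert(cpu != NULL);

	for (i = 0; i < sizeof(supported_models) / sizeof(supported_models[0]); i++)
		if (cpu->family == 6 && cpu->model == supported_models[i])
			listed = true;

	/* Whitelist check first: only unknown models are blamed on the model number. */
	if (!listed)
		return PROBE_UNKNOWN_MODEL;

	/* MWAIT check second: a supported model gets the "enable MWAIT" hint instead. */
	if (!cpu->has_mwait)
		return PROBE_MWAIT_DISABLED;

	return PROBE_OK;
}
```

    With this ordering, a whitelisted CPU whose firmware has MWAIT
    turned off yields PROBE_MWAIT_DISABLED rather than the misleading
    model complaint.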

    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Len Brown
     

04 Nov, 2017

1 commit

  • This reverts commit 43858b4f25cf0adc5c2ca9cf5ce5fdf2532941e5.

    The reason I removed the leave_mm() calls in question is because the
    heuristic wasn't needed after that patch. With the original version
    of my PCID series, we never flushed a "lazy cpu" (i.e. a CPU running a
    kernel thread) due to a flush on the loaded mm.

    Unfortunately, that caused architectural issues, so now I've
    reinstated these flushes on non-PCID systems in:

    commit b956575bed91 ("x86/mm: Flush more aggressively in lazy TLB mode").

    That, in turn, gives us a power management and occasionally
    performance regression as compared to old kernels: a process that
    goes into a deep idle state on a given CPU and gets its mm flushed
    due to activity on a different CPU will wake the idle CPU.

    Reinstate the old ugly heuristic: a CPU that goes into ACPI C3 or an
    intel_idle state that is likely to cause a TLB flush gets its mm
    switched to init_mm before going idle.
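
    As a rough illustration of that heuristic, here is a userspace
    model. All names below (the flag, the structs, the function) are
    hypothetical and only mimic the shape of the idea; the real code
    calls leave_mm() from the idle path.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a per-state flag meaning "entering this state flushes the TLB". */
#define SKETCH_FLAG_TLB_FLUSHED 0x1u

/* Hypothetical identifiers for address spaces; 0 plays the role of init_mm. */
enum { SKETCH_INIT_MM = 0 };

struct idle_state_sketch {
	unsigned int flags;
};

struct cpu_sketch {
	int loaded_mm;	/* which mm this CPU currently has loaded */
};

static void enter_idle_sketch(struct cpu_sketch *cpu,
			      const struct idle_state_sketch *state)
{
	assert(cpu != NULL && state != NULL);

	/*
	 * The heuristic: if the chosen state flushes the TLB anyway, drop the
	 * user mm now so remote flushes of that mm need not wake this CPU.
	 */
	if (state->flags & SKETCH_FLAG_TLB_FLUSHED)
		cpu->loaded_mm = SKETCH_INIT_MM;

	/* ... the hardware idle entry would happen here ... */
}
```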

    FWIW, this heuristic is lousy. Whether we should change CR3 before
    idle isn't a good hint except insofar as the performance hit is a bit
    lower if the TLB is getting flushed by the idle code anyway. What we
    really want to know is whether we anticipate being idle long enough
    that the mm is likely to be flushed before we wake up. This is more a
    matter of the expected latency than the idle state that gets chosen.
    This heuristic also completely fails on systems that don't know
    whether the TLB will be flushed (e.g. AMD systems?). OTOH it may be a
    bit obsolete anyway -- PCID systems don't presently benefit from this
    heuristic at all.

    We also shouldn't do this callback from the innermost bit of the idle
    code due to the RCU nastiness it causes. All the information needed is
    available before rcu_idle_enter() needs to happen.

    Signed-off-by: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: 43858b4f25cf "x86/mm: Stop calling leave_mm() in idle code"
    Link: http://lkml.kernel.org/r/c513bbd4e653747213e05bc7062de000bf0202a5.1509793738.git.luto@kernel.org
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     

11 Oct, 2017

1 commit


06 Sep, 2017

1 commit

  • Pull power management updates from Rafael Wysocki:
    "This time (again) cpufreq gets the majority of changes which mostly
    are driver updates (including a major consolidation of intel_pstate),
    some schedutil governor modifications and core cleanups.

    There also are some changes in the system suspend area, mostly related
    to diagnostics and debug messages plus some renames of things related
    to suspend-to-idle. One major change here is that suspend-to-idle is
    now going to be preferred over S3 on systems where the ACPI tables
    indicate to do so and provide requisite support (the Low Power Idle S0
    _DSM in particular). The system sleep documentation and the tools
    related to it are updated too.

    The rest is a few cpuidle changes (nothing major), devfreq updates,
    generic power domains (genpd) framework updates and a few assorted
    modifications elsewhere.

    Specifics:

    - Drop the P-state selection algorithm based on a PID controller from
    intel_pstate and make it use the same P-state selection method
    (based on the CPU load) for all types of systems in the active mode
    (Rafael Wysocki, Srinivas Pandruvada).

    - Rework the cpufreq core and governors to make it possible to take
    cross-CPU utilization updates into account and modify the schedutil
    governor to actually do so (Viresh Kumar).

    - Clean up the handling of transition latency information in the
    cpufreq core and untangle it from the information on which drivers
    cannot do dynamic frequency switching (Viresh Kumar).

    - Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek
    cpufreq driver and update its DT bindings (Sean Wang).

    - Modify the cpufreq dt-platdev driver to automatically create
    cpufreq devices for the new (v2) Operating Performance Points (OPP)
    DT bindings and update its whitelist of supported systems (Viresh
    Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley
    Xiao).

    - Add support for Ux500 to the cpufreq-dt driver and drop the
    obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann).

    - Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem
    Nguyen).

    - Fix and clean up assorted issues in the cpufreq drivers and core
    (Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva,
    Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla).

    - Update the IO-wait boost handling in the schedutil governor to make
    it less aggressive (Joel Fernandes).

    - Rework system suspend diagnostics to make it print fewer messages
    to the kernel log by default, add a sysfs knob to allow more
    suspend-related messages to be printed and add Low Power S0 Idle
    constraints checks to the ACPI suspend-to-idle code (Rafael
    Wysocki, Srinivas Pandruvada).

    - Prefer suspend-to-idle over S3 on ACPI-based systems with the
    ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM
    interface present in the ACPI tables (Rafael Wysocki).

    - Update documentation related to system sleep and rename a number of
    items in the code to make it clearer that they are related to
    suspend-to-idle (Rafael Wysocki).

    - Export a variable allowing device drivers to check the target
    system sleep state from the core system suspend code (Florian
    Fainelli).

    - Clean up the cpuidle subsystem to handle the polling state on x86
    in a more straightforward way and to use %pOF instead of full_name
    (Rafael Wysocki, Rob Herring).

    - Update the devfreq framework to fix and clean up a few minor issues
    (Chanwoo Choi, Rob Herring).

    - Extend diagnostics in the generic power domains (genpd) framework
    and clean it up slightly (Thara Gopinath, Rob Herring).

    - Fix and clean up a couple of issues in the operating performance
    points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz).

    - Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling
    (AVS) driver (David Wu).

    - Fix the usage of notifiers in CPU power management on some
    platforms (Alex Shi).

    - Update the pm-graph system suspend/hibernation and boot profiling
    utility (Todd Brandt).

    - Make it possible to run the cpupower utility without CPU0 (Prarit
    Bhargava)"

    * tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits)
    cpuidle: Make drivers initialize polling state
    cpuidle: Move polling state initialization code to separate file
    cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol
    cpufreq: imx6q: Fix imx6sx low frequency support
    cpufreq: speedstep-lib: make several arrays static, makes code smaller
    PM: docs: Delete the obsolete states.txt document
    PM: docs: Describe high-level PM strategies and sleep states
    PM / devfreq: Fix memory leak when fail to register device
    PM / devfreq: Add dependency on PM_OPP
    PM / devfreq: Move private devfreq_update_stats() into devfreq
    PM / devfreq: Convert to using %pOF instead of full_name
    PM / AVS: rockchip-io: add io selectors and supplies for RV1108
    cpufreq: ti: Fix 'of_node_put' being called twice in error handling path
    cpufreq: dt-platdev: Drop few entries from whitelist
    cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2
    ARM: ux500: don't select CPUFREQ_DT
    cpuidle: Convert to using %pOF instead of full_name
    cpufreq: Convert to using %pOF instead of full_name
    PM / Domains: Convert to using %pOF instead of full_name
    cpufreq: Cap the default transition delay value to 10 ms
    ...

    Linus Torvalds
     

04 Sep, 2017

1 commit

  • * pm-sleep:
    ACPI / PM: Check low power idle constraints for debug only
    PM / s2idle: Rename platform operations structure
    PM / s2idle: Rename ->enter_freeze to ->enter_s2idle
    PM / s2idle: Rename freeze_state enum and related items
    PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE
    ACPI / PM: Prefer suspend-to-idle over S3 on some systems
    platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle
    PM / suspend: Define pr_fmt() in suspend.c
    PM / suspend: Use mem_sleep_labels[] strings in messages
    PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG
    PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq()
    PM / core: Add error argument to dpm_show_time()
    PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq()
    PM / s2idle: Rearrange the main suspend-to-idle loop
    PM / timekeeping: Print debug messages when requested
    PM / sleep: Mark suspend/hibernation start and finish
    PM / sleep: Do not print debug messages by default
    PM / suspend: Export pm_suspend_target_state

    Rafael J. Wysocki
     

30 Aug, 2017

1 commit

  • Make the drivers that want to include the polling state into their
    states table initialize it explicitly and drop the initialization of
    it (which in fact is conditional, but that is not obvious from the
    code) from the core.

    Signed-off-by: Rafael J. Wysocki
    Tested-by: Sudeep Holla
    Acked-by: Daniel Lezcano

    Rafael J. Wysocki
     

11 Aug, 2017

1 commit


18 Jul, 2017

1 commit


05 Jul, 2017

1 commit

  • Now that lazy TLB suppresses all flush IPIs (as opposed to all but
    the first), there's no need to leave_mm() when going idle.

    This means we can get rid of the rcuidle hack in
    switch_mm_irqs_off() and we can unexport leave_mm().

    This also removes acpi_unlazy_tlb() from the x86 and ia64 headers,
    since it has no callers any more.

    Signed-off-by: Andy Lutomirski
    Reviewed-by: Nadav Amit
    Reviewed-by: Borislav Petkov
    Reviewed-by: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: Linus Torvalds
    Cc: Mel Gorman
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/03c699cfd6021e467be650d6b73deaccfe4b4bd7.1498751203.git.luto@kernel.org
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     

30 Jun, 2017

1 commit

  • Remove #define PREFIX and add #define pr_fmt to use more common logging.

    Miscellanea:

    o Add missing newline to format
    o Convert a single printk without KERN_ to pr_info
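
    For readers unfamiliar with the convention, the following userspace
    sketch imitates how a pr_fmt() definition prefixes every message.
    pr_info_buf() and report_max_cstate() are hypothetical helpers
    written for this example (the kernel's pr_info() writes to the
    kernel log, not to a buffer), and the message text is illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Imitates the kernel convention: pr_fmt() wraps every format string. */
#define pr_fmt(fmt) "intel_idle: " fmt

/*
 * Hypothetical stand-in for pr_info(): formats into a buffer instead of the
 * kernel log. ##__VA_ARGS__ is a GNU extension supported by GCC and Clang.
 */
#define pr_info_buf(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), ##__VA_ARGS__)

/* Example caller; the prefix is added by the macro, not spelled out here. */
static int report_max_cstate(char *buf, size_t len, int max_cstate)
{
	assert(buf != NULL && len > 0);
	return pr_info_buf(buf, len, "max_cstate %d reached\n", max_cstate);
}
```

    The point of the pattern is that call sites no longer carry a
    hand-written PREFIX, so a forgotten prefix or newline is harder to
    introduce.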

    Signed-off-by: Joe Perches
    Acked-by: Jacob Pan
    Signed-off-by: Rafael J. Wysocki

    Joe Perches
     

02 May, 2017

1 commit


03 Mar, 2017

1 commit

  • Pull turbostat utility updates from Rafael Wysocki:
    "Power management turbostat utility updates.

    These update turbostat significantly and in particular:

    - default output is now verbose, --debug is no longer required to get
    all counters. As a result, some options have been added to specify
    exactly what output is wanted.

    - added --quiet to skip system configuration output

    - added --list, --show and --hide parameters

    - added --cpu parameter

    - enhanced Baytrail SoC support

    - added Gemini Lake SoC support

    - added sysfs C-state columns

    Also the symbol definitions in arch/x86/include/asm/intel-family.h and
    arch/x86/include/asm/msr-index.h are updated and the intel_idle and
    intel_pstate drivers are modified to use the updated symbols.

    Credits to Len Brown for all of these changes"

    * tag 'pm-turbostat-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (44 commits)
    tools/power turbostat: version 17.02.24
    tools/power turbostat: bugfix: --add u32 was printed as u64
    tools/power turbostat: show error on exec
    tools/power turbostat: dump p-state software config
    tools/power turbostat: show package number, even without --debug
    tools/power turbostat: support "--hide C1" etc.
    tools/power turbostat: move --Package and --processor into the --cpu option
    tools/power turbostat: turbostat.8 update
    tools/power turbostat: update --list feature
    tools/power turbostat: use wide columns to display large numbers
    tools/power turbostat: Add --list option to show available header names
    tools/power turbostat: fix zero IRQ count shown in one-shot command mode
    tools/power turbostat: add --cpu parameter
    tools/power turbostat: print sysfs C-state stats
    tools/power turbostat: extend --add option to accept /sys path
    tools/power turbostat: skip unused counters on BDX
    tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits
    tools/power turbostat: skip unused counters on SKX
    tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7
    tools/power turbostat: initial Gemini Lake SOC support
    ...

    Linus Torvalds
     

01 Mar, 2017

2 commits

  • previously known as MSR_NHM_SNB_PKG_CST_CFG_CTL

    Signed-off-by: Len Brown

    Len Brown
     
  • Cosmetic only -- no functional change in this patch.

    sysfs before:

    state4/desc:MWAIT 0x20
    state4/name:C6-HSW

    sysfs after:

    state4/desc:MWAIT 0x20
    state4/name:C6

    We remove the platform acronyms from the end of the state name
    (-HSW in this case) for three reasons:

    1. more consistency with acpi_idle, which prints C1, C2, C3 etc.

    2. users know what platform they are on already
    an acronym for the processor code name here
    seems to cause more confusion than clarity.

    3. less clutter in "cpupower monitor" output,
    which truncates the names to 4 columns.

    The precise definition of the state continues to be available in "desc".

    Reported-by: Artem Bityutskiy
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Len Brown
     

02 Dec, 2016

2 commits

  • Install the callbacks via the state machine and let the core invoke the
    callbacks on the already online CPUs.

    The two smp_call_function_single() invocations in intel_idle_cpu_init() have
    been removed because intel_idle_cpu_init() is now invoked via the hotplug
    callback which runs on the target CPU. The IRQ-off calling convention for
    auto_demotion_disable() and c1e_promotion_disable() has not been preserved
    because only those two modify the MSR during CPU initialization.

    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: Jacob Pan
    Signed-off-by: Rafael J. Wysocki

    Sebastian Andrzej Siewior
     
  • Since commit 1cf4f629d9d2 ("cpu/hotplug: Move online calls to
    hotplugged cpu") the CPU_ONLINE and CPU_DOWN_PREPARE notifiers are
    always run on the hot plugged CPU, and as of commit 3b9d6da67e11
    ("cpu/hotplug: Fix rollback during error-out in __cpu_disable()") the
    CPU_DOWN_FAILED notifier also runs on the hot plugged CPU. This patch
    converts the SMP functional calls into direct calls.

    smp_call_function_single() executes the function with interrupts
    disabled. This calling convention is not preserved, because
    tick_broadcast_enable() and tick_broadcast_disable() handle
    interrupts themselves.

    Signed-off-by: Anna-Maria Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: Jacob Pan
    Signed-off-by: Rafael J. Wysocki

    Anna-Maria Gleixner
     

01 Dec, 2016

2 commits


08 Oct, 2016

1 commit

  • When doing an nmi backtrace of many cores, most of which are idle, the
    output is a little overwhelming and very uninformative. Suppress
    messages for cpus that are idling when they are interrupted and just
    emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".

    We do this by grouping all the cpuidle code together into a new
    .cpuidle.text section, and then checking the address of the interrupted
    PC to see if it lies within that section.
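
    The address test itself is a simple half-open range check. The
    sketch below models it in userspace; the struct and function names
    are hypothetical, and in the kernel the bounds would come from
    linker-provided section symbols rather than a struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds of the idle text section; end is exclusive. */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

/* True if the interrupted PC lies inside the idle section. */
static bool pc_in_idle_text(const struct text_range *idle_text, uintptr_t pc)
{
	assert(idle_text != NULL && idle_text->start <= idle_text->end);
	return pc >= idle_text->start && pc < idle_text->end;
}
```

    A backtrace handler built on this would print the one-line
    "skipped" message when the check is true and the full trace
    otherwise.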

    This commit suitably tags x86 and tile idle routines, and only adds in
    the minimal framework for other architectures.

    Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
    Signed-off-by: Chris Metcalf
    Acked-by: Peter Zijlstra (Intel)
    Tested-by: Peter Zijlstra (Intel)
    Tested-by: Daniel Thompson [arm]
    Tested-by: Petr Mladek
    Cc: Aaron Tomlin
    Cc: Peter Zijlstra (Intel)
    Cc: "Rafael J. Wysocki"
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
     

31 Jul, 2016

1 commit

  • Pull x86 cpufeature updates from Thomas Gleixner:

    - a workaround for the MONITOR instruction erratum of Goldmont CPUs

    - small fixes and cleanups here and there

    * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUs
    x86/cpu: Rename "WESTMERE2" family to "NEHALEM_G"
    x86/amd_nb: Clean up init path
    x86/cpufeature: Add helper macro for mask check macros
    x86/cpufeature: Make sure DISABLED/REQUIRED macros are updated
    x86/cpufeature: Update cpufeaure macros

    Linus Torvalds
     

09 Jul, 2016

2 commits

  • Commit 5dcef69486 ("intel_idle: add BXT support") added an 8-element
    lookup array with just a 2-bit value used for lookups. As per the SDM
    that bit field is really 3 bits wide. While this is supposedly benign
    here, future re-use of the code for other CPUs might expose the issue.
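
    To see why the mask width matters, consider this small sketch. It
    assumes the time-unit field starts at bit 10 of the register value;
    that position, the macro names and the helper are assumptions made
    for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define IRTL_UNIT_SHIFT 10	/* assumed position of the time-unit field */
#define UNIT_TABLE_SIZE 8	/* the lookup array has 8 elements */

/* Extract the table index using the given field mask. */
static unsigned int unit_index(uint64_t irtl, unsigned int mask)
{
	unsigned int idx = (unsigned int)((irtl >> IRTL_UNIT_SHIFT) & mask);

	assert(idx < UNIT_TABLE_SIZE);
	return idx;
}
```

    With a unit value of 5 in the field, a 2-bit mask (0x3) yields
    index 1 while a 3-bit mask (0x7) yields index 5, so the narrower
    mask silently selects the wrong table entry for units above 3.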

    Signed-off-by: Jan Beulich
    Signed-off-by: Rafael J. Wysocki

    Jan Beulich
     
  • Since irtl_ns_units[] has itself zero entries, make sure the caller
    recognizes those cases along with the MSR read returning zero, as zero
    is not a valid value for exit_latency and target_residency.
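
    A hedged sketch of the resulting behavior: the conversion can return
    zero either because the register reads as zero or because the unit
    entry is zero, and the caller falls back to its table value in both
    cases. The unit values in the table, the bit layout and both
    function names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed unit table in nanoseconds; the last two entries are zero. */
static const unsigned int irtl_ns_units[8] = {
	1, 32, 1024, 32768, 1048576, 33554432, 0, 0
};

/* Convert a raw register value to microseconds; 0 means "no usable value". */
static uint64_t irtl_to_usec(uint64_t irtl)
{
	uint64_t ns;

	if (!irtl)
		return 0;

	ns = irtl_ns_units[(irtl >> 10) & 0x7];	/* may be 0 for reserved units */
	return (irtl & 0x3FF) * ns / 1000;
}

/* Caller side: keep the table value whenever the conversion yields zero. */
static unsigned int pick_latency(unsigned int table_usec, uint64_t irtl)
{
	uint64_t usec = irtl_to_usec(irtl);

	assert(table_usec > 0);	/* zero is never a valid latency */
	return usec ? (unsigned int)usec : table_usec;
}
```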

    Signed-off-by: Jan Beulich
    Signed-off-by: Rafael J. Wysocki

    Jan Beulich
     

01 Jul, 2016

1 commit

  • Len Brown noticed something was amiss in our INTEL_FAM6_*
    definitions. It seems like model 0x1F was a Nehalem part,
    marketed as "Intel Core i7 and i5 Processors" (according to the
    SDM). But, although it was a Nehalem, 0x1F had some uncore events
    which were shared with Westmere.

    Len also mentioned he thought it was called "Havendale", which
    Wikipedia says was graphics-oriented and canceled:

    https://en.wikipedia.org/wiki/Nehalem_(microarchitecture)

    So either way, it's probably not important what we call it, but
    call it Nehalem to be accurate, and add a "G" since it seems
    graphics-related. If it were canceled that would be a good reason
    why it's so sparsely and inconsistently referred to in the code.

    Signed-off-by: Dave Hansen
    Cc: Dave Hansen
    Cc: Len Brown
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20160629192737.949C41A8@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

23 Jun, 2016

2 commits

  • Denverton is an Intel Atom based micro server which shares the same
    Goldmont architecture as Broxton. The available C-states on
    Denverton are a subset of Broxton with only C1, C1e, and C6.

    Signed-off-by: Jacob Pan
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Jacob Pan
     
  • The Kconfig for this driver is currently declared with:

    config INTEL_IDLE
    bool "Cpuidle Driver for Intel Processors"

    ...meaning that it currently is not being built as a module by anyone.

    This was done in commit 6ce9cd8669fa1195fdc21643370e34523c7ac988
    ("intel_idle: disable module support") since "...the module capability
    is causing more trouble than it is worth."

    This was done over 5y ago, and Daniel adds that:

    ...the modular support has been removed from almost all the cpuidle
    drivers and the cpuidle framework is no longer assuming driver could
    be unloaded.

    Removing the modular dead code in the driver makes sense, as this
    is what has been done in the other drivers.

    So let's remove the modular code that is essentially orphaned, so that
    when reading the driver there is no doubt it is builtin-only.

    Since module_init translates to device_initcall in the non-modular
    case, the init ordering remains unchanged with this commit. At a
    later date we might want to consider whether subsys_init or another
    init category seems more appropriate than device_init.

    We replace module.h with moduleparam.h since the file does declare
    some module parameters, and leaving them as such is currently the
    easiest way to remain compatible with existing boot arg use cases.

    Note that MODULE_DEVICE_TABLE is a no-op for non-modular code.

    Also note that we can't remove intel_idle_cpuidle_devices_uninit() as
    that is still used for unwind purposes if the init fails.

    We also delete the MODULE_LICENSE tag etc. since all that information
    is already contained at the top of the file in the comments.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Paul Gortmaker
     

08 Jun, 2016

1 commit

  • Use the new INTEL_FAM6_* macros for intel_idle.c. Also fix up
    some of the macros to be consistent with how some of the
    intel_idle code refers to the model.

    There's one oddity here: model 0x1F is uniquely referred to here
    and nowhere else that I could find. 0x1E/0x1F are just spelled
    out as "Intel Core i7 and i5 Processors" in the SDM or as "Intel
    processors based on the Nehalem, Westmere microarchitectures" in
    the RDPMC section. Comments between tables 19-19 and 19-20 in
    the SDM seem to point to 0x1F being some kind of Westmere, so
    let's call it "WESTMERE2".

    Signed-off-by: Dave Hansen
    Acked-by: Rafael J. Wysocki
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Len Brown
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: jacob.jun.pan@intel.com
    Cc: linux-pm@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160603001932.EE978EB9@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

09 Apr, 2016

1 commit

  • Broxton has all the HSW C-states, except C3.
    BXT C-state timing is slightly different.

    Here we trust the IRTL MSRs as authority
    on maximum C-state latency, and override the driver's tables
    with the values found in the associated IRTL MSRs.
    Further we set the target_residency to 1x maximum latency,
    trusting the hardware demotion logic.
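
    The override policy can be sketched as follows. The struct, the
    function name and the zero guard are assumptions for illustration;
    the sketch only shows the "latency from the MSR, residency equal to
    1x that latency" rule stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down view of one C-state table entry (microseconds). */
struct cstate_sketch {
	unsigned int exit_latency;
	unsigned int target_residency;
};

/* Override the table entry with a latency already converted from the IRTL MSR. */
static void apply_irtl_override(struct cstate_sketch *state, unsigned int irtl_usec)
{
	assert(state != NULL);

	/* Assumed guard: an unusable (zero) reading leaves the table defaults alone. */
	if (irtl_usec == 0)
		return;

	state->exit_latency = irtl_usec;
	state->target_residency = irtl_usec;	/* 1x maximum latency */
}
```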

    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Len Brown
     

08 Apr, 2016

12 commits

  • KBL is similar to SKL

    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Len Brown
     
  • SKX is similar to BDX

    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Len Brown
     
  • This driver registers cpuidle devices when a CPU comes online, but it
    leaves the registrations in place when a CPU goes offline. The module
    exit code only unregisters the currently online CPUs, leaving the
    devices for offline CPUs dangling.

    This patch changes the driver to clean up all registrations on exit,
    even those from CPUs that are offline.

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • If a cpuidle registration error occurs during the hot plug notifier
    callback, we should really inform the hot plug machinery instead of
    just ignoring the error. This patch changes the callback to properly
    return on error.

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • The helper function, intel_idle_cpu_init, registers one new device
    with the cpuidle layer. If the registration should fail, that
    function immediately calls intel_idle_cpuidle_devices_uninit() to
    unregister every last CPU's device. However, it makes no sense to do
    so, when called from the hot plug notifier callback.

    This patch moves the call to intel_idle_cpuidle_devices_uninit()
    outside of the helper function to the one call site that actually
    needs to perform the de-registrations.

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • This driver sets the broadcast tick quite early on during probe and does
    not clean up again in case of failure. This patch moves the setup call
    after the registration, placing the on_each_cpu() calls within the global
    CPU lock region.

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • The helper function, intel_idle_cpuidle_devices_uninit, frees the
    globally allocated per-CPU data. However, this function is invoked
    from the hot plug notifier callback at a time when freeing that data
    is not safe.

    If the call to cpuidle_register_driver() should fail (say, due to lack
    of memory), then the driver will free its per-CPU region. On the
    *next* CPU_ONLINE event, the driver will happily use the region again
    and even free it again if the failure repeats.

    This patch fixes the issue by moving the call to free_percpu() outside
    of the helper function at the two call sites that actually need to
    free the per-CPU data.

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • In the module_init() method, if the per-CPU allocation fails, then the
    active cpuidle registration is not cleaned up. This patch fixes the
    issue by attempting the allocation before registration, and then
    cleaning it up again on registration failure.
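
    The ordering can be modeled in userspace like this. Every name here
    is hypothetical: calloc() stands in for the per-CPU allocation and
    register_driver_sketch() for the cpuidle registration, with a
    parameter to force the failure path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static bool driver_registered;	/* stand-in for "registration is active" */

/* Hypothetical registration step that can be made to fail. */
static int register_driver_sketch(bool fail)
{
	if (fail)
		return -1;
	driver_registered = true;
	return 0;
}

/* Allocate first, register second, and undo the allocation if registration fails. */
static int init_sketch(size_t ncpus, bool fail_register, void **devices_out)
{
	void *devices;

	assert(ncpus > 0 && devices_out != NULL);
	*devices_out = NULL;

	devices = calloc(ncpus, sizeof(int));
	if (!devices)
		return -1;	/* nothing registered yet, so nothing to undo */

	if (register_driver_sketch(fail_register)) {
		free(devices);	/* clean up the allocation on registration failure */
		return -1;
	}

	*devices_out = devices;
	return 0;
}
```

    Either failure now leaves no registration and no allocation behind.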

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • In the module_exit() method, this driver first frees its per-CPU
    pointer, then unregisters a callback making use of the pointer.
    Furthermore, the function, intel_idle_cpuidle_devices_uninit, is racy
    against CPU hot plugging as it calls for_each_online_cpu().

    This patch corrects the issues by unregistering first on the exit path
    while holding the hot plug lock.

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • The function, intel_idle_cpuidle_driver_init, makes calls on each CPU
    to auto_demotion_disable() and c1e_promotion_disable(). These calls
    are redundant, as intel_idle_cpu_init() does the same calls just a bit
    later on. They are also premature, as the driver registration may yet
    fail.

    This patch removes the redundant code.

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • The function, intel_idle_cpuidle_driver_init, delivers no error codes
    at all. This patch changes the function to return 'void' instead of
    returning zero.

    Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran
     
  • Signed-off-by: Richard Cochran
    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Richard Cochran