25 Jan, 2021

1 commit


18 Jan, 2021

1 commit


14 Jan, 2021

1 commit


13 Jan, 2021

1 commit

  • debugfs nodes were created in genpd_debug_init, called in late_initcall,
    preventing power domains registered through loadable modules from
    having a debugfs entry.

    Create/remove debugfs nodes when the power domain is added/removed
    to/from the internal gpd_list.

    Signed-off-by: Thierry Strudel
    Reviewed-by: Greg Kroah-Hartman
    Reviewed-by: Ulf Hansson
    Signed-off-by: Rafael J. Wysocki
    (cherry picked from commit 718072ceb211833f3c71724f49d733d636067191)
    Signed-off-by: Will McVicker
    Change-Id: Ibde0adddc5fb50a8c8e1a16d66ee0f6b58330a96

    Thierry Strudel
     

15 Dec, 2020

3 commits

  • Currently, a PM domain's idle state is determined based on whether the
    QoS requirements are met. This may not save power, if the idle state
    residency requirements are not met.

    CPU PM domains use the next timer wakeup for the CPUs in the domain to
    determine the sleep duration of the domain. This is compared with the
    idle state residencies to determine the optimal idle state. For other PM
    domains, determining the sleep length is not that straightforward. But
    if the device's next_event is available, we can use that to determine
    the sleep duration of the PM domain.

    Let's update the domain governor logic to check for idle state residency
    based on the next wakeup of devices as well as QoS constraints.

    Bug: 170654157
    Link: https://lore.kernel.org/linux-pm/CAJZ5v0g+nK+jV+Gy+BKEALRtsXDK0HnDbz07Nv3KPK5L3V3OKg@mail.gmail.com/T/#meedddf8b7c5c6b3972b71922a6caae88fd499168
    Signed-off-by: Lina Iyer
    Change-Id: Ibbb5fb28720ab87fb551ce09e478e5f6822e9004

    Lina Iyer
     
  • Some devices may have a predictable interrupt pattern while executing
    usecases. An example would be the VSYNC interrupt associated with
    display devices. A 60 Hz display could cause an interrupt every 16 ms. If
    the device were in a PM domain, the domain would need to be powered up
    for the device to resume and handle the interrupt.

    Entering a domain idle state saves power, only if the residency of the
    idle state is met. Without knowing the idle duration of the domain, the
    governor would just choose the deepest idle state that matches the QoS
    requirements. The domain might be powered off just as the device is
    expecting to wake up. If devices could inform PM frameworks of their
    next event, the parent PM domain's idle duration can be determined.

    So let's add the dev_pm_genpd_set_next_wakeup() API for the device to
    inform PM domains of the impending wakeup. This information will be used
    by the domain governor to determine the best idle state given the wakeup.

    Bug: 170654157
    Link: https://lore.kernel.org/linux-pm/CAJZ5v0g+nK+jV+Gy+BKEALRtsXDK0HnDbz07Nv3KPK5L3V3OKg@mail.gmail.com/T/#m55f3f4a218f6c91431066505841ba5339486b1ab
    Signed-off-by: Lina Iyer
    Change-Id: I34371ef21fde9c045ecf739e9b53c3128656db8e

    Lina Iyer
     
  • PM domains may support entering multiple power down states when the
    component devices and sub-domains are suspended. Also, they may specify
    the residency value for an idle state, only after which the idle state
    may provide power benefits. If the domain does not specify the residency
    for any of its idle states, the governor's choice is much simplified.

    Let's make this optional with the use of a PM domain feature flag.

    Bug: 170654157
    Link: https://lore.kernel.org/linux-pm/CAJZ5v0g+nK+jV+Gy+BKEALRtsXDK0HnDbz07Nv3KPK5L3V3OKg@mail.gmail.com/T/#meffa01877c7c78964b3ddf55bd88959969ed8ad2
    Signed-off-by: Lina Iyer
    Change-Id: Ie98bebf15f81428b53512f37935af2e885edec97

    Lina Iyer
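
The governor idea in the three entries above can be sketched in plain userspace C. This is a minimal illustration, not the kernel implementation: `struct idle_state`, `domain_next_wakeup()` and `pick_idle_state()` are hypothetical names, and using `INT64_MAX` to mean "no next wakeup known" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified idle state; not the kernel's data structure. */
struct idle_state {
    int64_t residency_ns;    /* minimum time in the state for it to save power */
    int64_t exit_latency_ns; /* time needed to power the domain back on */
};

/* Earliest next wakeup among a domain's devices; INT64_MAX means "unknown". */
static int64_t domain_next_wakeup(const int64_t *dev_next_ns, size_t n)
{
    int64_t earliest = INT64_MAX;

    for (size_t i = 0; i < n; i++)
        if (dev_next_ns[i] < earliest)
            earliest = dev_next_ns[i];
    return earliest;
}

/*
 * Pick the deepest state (highest index) that meets the QoS latency limit
 * and, when the next wakeup is known, the residency requirement.
 * Returns -1 when no state qualifies, i.e. the domain should stay on.
 */
static int pick_idle_state(const struct idle_state *states, size_t n,
                           int64_t qos_latency_ns, int64_t now_ns,
                           int64_t next_wakeup_ns)
{
    assert(states != NULL && n > 0);

    bool known = next_wakeup_ns != INT64_MAX;
    int64_t sleep_ns = known ? next_wakeup_ns - now_ns : 0;

    for (size_t i = n; i-- > 0;) {
        if (states[i].exit_latency_ns > qos_latency_ns)
            continue; /* would violate the QoS constraint */
        if (known && states[i].residency_ns > sleep_ns)
            continue; /* the domain would wake up before the state pays off */
        return (int)i;
    }
    return -1;
}
```

With a device that expects an interrupt in 16 ms, a state whose residency is 20 ms is skipped in favour of a shallower one, while the same state is chosen when no wakeup information is available.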
     

14 Dec, 2020

4 commits

  • It's possible for a platform to define multiple states without
    using a governor.

    Signed-off-by: Dong Aisheng

    Dong Aisheng
     
  • For a domain that has no working devices anymore, let's choose the
    deepest state to enter to save power, e.g. after a driver probe failure.

    Signed-off-by: Dong Aisheng

    Dong Aisheng
     
  • Currently the generic power domain will power off the domain if all
    devices in it have been stopped during system suspend.

    This is done by checking whether the domain is active in
    genpd_sync_power_off and then disabling it. However, a power domain
    supporting multiple low power states may have already entered an
    intermediate low power state through runtime PM before system suspend,
    so its status is already GPD_STATE_POWER_OFF. In that case
    genpd_sync_power_off keeps the domain in the intermediate low power
    state during system suspend instead of gating it off completely.

    Let's give the power domain a chance to switch to the deepest state in
    case it's already off but in an intermediate low power state.

    Signed-off-by: Dong Aisheng

    Dong Aisheng
     
  • Move the subdomain check into _genpd_power_off, so the caller does
    not have to check it each time. This also ensures a double check
    of &genpd->sd_count before really powering off the domain in case it's
    increased asynchronously by subdomains. This is the same behavior
    as the original genpd_power_off().

    Signed-off-by: Dong Aisheng

    Dong Aisheng
     

13 Dec, 2020

2 commits

  • The dev_pm_genpd_suspend|resume() have so far only been used during the
    syscore suspend/resume phases. However, during suspend-to-idle, where the
    syscore phases don't exist, similar operations are sometimes needed.

    An existing example is the timekeeping_suspend|resume() functions, which
    are called both through registered syscore ops during the syscore
    phases, and as regular function calls from cpuidle (via
    tick_freeze()) during suspend-to-idle.

    For similar reasons, let's enable the dev_pm_genpd_suspend|resume() APIs to
    be re-used for corresponding CPU devices that are attached to a genpd,
    during suspend-to-idle.

    Signed-off-by: Ulf Hansson
    (cherry picked from commit b9795a3e4e1cbf521bbb5ef240eb47803c303b02 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git)
    Bug: 175076037
    Change-Id: I0019bd01e19c60dc57320b366b1e762cd12977f7
    Signed-off-by: Lina Iyer

    Ulf Hansson
     
  • To better describe what the pm_genpd_syscore_poweroff|poweron() functions
    actually do, let's rename them to dev_pm_genpd_suspend|resume() and update
    the rather few callers of them accordingly (a couple of clocksource
    drivers).

    Moreover, let's take the opportunity to add some documentation of these
    exported functions, as that is currently missing.

    Cc: Daniel Lezcano
    Cc: Thomas Gleixner
    Signed-off-by: Ulf Hansson
    (cherry picked from commit fc51989062138744b56e47190915ce68484e73f3 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git)
    Bug: 175076037
    Change-Id: I8b59f0ca12e63b39f2a39528eb566232c78172c9
    Signed-off-by: Lina Iyer

    Ulf Hansson
     

28 Nov, 2020

1 commit

  • min_freq and max_freq from sysfs now go to the dev_pm_qos_request
    interface, so porting will require access to dev_pm_qos_read_value.

    Signed-off-by: Mark Salyzyn
    Fixes: 27dbc542f651ed09de910f274b32634904103774 ("PM / devfreq: Use PM QoS for sysfs min/max_freq")
    Bug: 165523817
    Change-Id: I6837f8d75a61faf8bf18d1b9a37419632e5c7134
    (cherry picked from commit dc32196ca1bf69f0040e0a8732179dd9cc3d8f30)
    Signed-off-by: Will McVicker
    (cherry picked from commit d13ee33d477989a1171194e8df94711546a565c8)
    Signed-off-by: Will McVicker

    Mark Salyzyn
     

06 Nov, 2020

1 commit


03 Nov, 2020

2 commits

  • After commit d12544fb2aa9 ("PM: runtime: Remove link state checks in
    rpm_get/put_supplier()") nothing prevents the consumer device's
    runtime PM from acquiring additional references to the supplier
    device after pm_runtime_clean_up_links() has run (or even while it
    is running), so calling this function from __device_release_driver()
    may be pointless (or even harmful).

    Moreover, it ignores stateless device links, so the runtime PM
    handling of managed and stateless device links is inconsistent
    because of it, so better get rid of it entirely.

    Fixes: d12544fb2aa9 ("PM: runtime: Remove link state checks in rpm_get/put_supplier()")
    Signed-off-by: Rafael J. Wysocki
    Cc: 5.1+ # 5.1+
    Tested-by: Xiang Chen
    Reviewed-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • While removing a device link, drop the supplier device's runtime PM
    usage counter as many times as needed to drop all of the runtime PM
    references to it from the consumer in addition to dropping the
    consumer's link count.

    Fixes: baa8809f6097 ("PM / runtime: Optimize the use of device links")
    Signed-off-by: Rafael J. Wysocki
    Cc: 5.1+ # 5.1+
    Tested-by: Xiang Chen
    Reviewed-by: Greg Kroah-Hartman

    Rafael J. Wysocki
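
The reference-dropping rule described in the second entry above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct dev_sketch`, `struct link_sketch` and both helpers are hypothetical names, and a plain `int` stands in for the kernel's reference counting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with a runtime PM usage counter. */
struct dev_sketch {
    int usage_count;
};

/* Hypothetical stand-in for a device link between a consumer and a supplier. */
struct link_sketch {
    struct dev_sketch *supplier;
    int consumer_refs; /* references taken on the supplier for the consumer */
};

/* The consumer takes one more runtime PM reference on its supplier. */
static void link_get_supplier(struct link_sketch *link)
{
    assert(link != NULL && link->supplier != NULL);
    link->supplier->usage_count++;
    link->consumer_refs++;
}

/* On link removal, give back every reference taken through the link. */
static void link_release_supplier(struct link_sketch *link)
{
    assert(link != NULL && link->supplier != NULL);
    while (link->consumer_refs > 0) {
        link->consumer_refs--;
        link->supplier->usage_count--;
    }
}
```

Dropping only one reference at removal time would leave the supplier's usage counter elevated whenever the consumer had taken more than one, which is the imbalance the entry describes.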
     

27 Oct, 2020

1 commit


26 Oct, 2020

1 commit


25 Oct, 2020

1 commit


24 Oct, 2020

2 commits

  • Pull more power management updates from Rafael Wysocki:
    "First of all, the adaptive voltage scaling (AVS) drivers go to new
    platform-specific locations as planned (this part was reported to have
    merge conflicts against the new arm-soc updates in linux-next).

    In addition to that, there are some fixes (intel_idle, intel_pstate,
    RAPL, acpi_cpufreq), the addition of on/off notifiers and idle state
    accounting support to the generic power domains (genpd) code and some
    janitorial changes all over.

    Specifics:

    - Move the AVS drivers to new platform-specific locations and get rid
    of the drivers/power/avs directory (Ulf Hansson).

    - Add on/off notifiers and idle state accounting support to the
    generic power domains (genpd) framework (Ulf Hansson, Lina Iyer).

    - Ulf will maintain the PM domain part of cpuidle-psci (Ulf Hansson).

    - Make intel_idle disregard ACPI _CST if it cannot use the data
    returned by that method (Mel Gorman).

    - Modify intel_pstate to avoid leaving useless sysfs directory
    structure behind if it cannot be registered (Chen Yu).

    - Fix domain detection in the RAPL power capping driver and prevent
    it from failing to enumerate the Psys RAPL domain (Zhang Rui).

    - Allow acpi-cpufreq to use ACPI _PSD information with Family 19 and
    later AMD chips (Wei Huang).

    - Update the driver assumptions comment in intel_idle and fix a
    kerneldoc comment in the runtime PM framework (Alexander Monakov,
    Bean Huo).

    - Avoid unnecessary resets of the cached frequency in the schedutil
    cpufreq governor to reduce overhead (Wei Wang).

    - Clean up the cpufreq core a bit (Viresh Kumar).

    - Make assorted minor janitorial changes (Daniel Lezcano, Geert
    Uytterhoeven, Hubert Jasudowicz, Tom Rix).

    - Clean up and optimize the cpupower utility somewhat (Colin Ian
    King, Martin Kaistra)"

    * tag 'pm-5.10-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
    PM: sleep: remove unreachable break
    PM: AVS: Drop the avs directory and the corresponding Kconfig
    PM: AVS: qcom-cpr: Move the driver to the qcom specific drivers
    PM: runtime: Fix typo in pm_runtime_set_active() helper comment
    PM: domains: Fix build error for genpd notifiers
    powercap: Fix typo in Kconfig "Plance" -> "Plane"
    cpufreq: schedutil: restore cached freq when next_f is not changed
    acpi-cpufreq: Honor _PSD table setting on new AMD CPUs
    PM: AVS: smartreflex Move driver to soc specific drivers
    PM: AVS: rockchip-io: Move the driver to the rockchip specific drivers
    PM: domains: enable domain idle state accounting
    PM: domains: Add curly braces to delimit comment + statement block
    PM: domains: Add support for PM domain on/off notifiers for genpd
    powercap/intel_rapl: enumerate Psys RAPL domain together with package RAPL domain
    powercap/intel_rapl: Fix domain detection
    intel_idle: Ignore _CST if control cannot be taken from the platform
    cpuidle: Remove pointless stub
    intel_idle: mention assumption that WBINVD is not needed
    MAINTAINERS: Add section for cpuidle-psci PM domain
    cpufreq: intel_pstate: Delete intel_pstate sysfs if failed to register the driver
    ...

    Linus Torvalds
     
  • * pm-core:
    PM: runtime: Fix typo in pm_runtime_set_active() helper comment

    * pm-sleep:
    PM: sleep: remove unreachable break

    * pm-tools:
    cpupower: speed up generating git version string
    cpupowerutils: fix spelling mistake "dependant" -> "dependent"

    * powercap:
    powercap: Fix typo in Kconfig "Plance" -> "Plane"
    powercap/intel_rapl: enumerate Psys RAPL domain together with package RAPL domain
    powercap/intel_rapl: Fix domain detection

    Rafael J. Wysocki
     

23 Oct, 2020

1 commit


21 Oct, 2020

1 commit

  • The __raw_notifier_call_chain() was recently removed and replaced with
    raw_notifier_call_chain_robust(). Recent changes to genpd didn't take that
    into account, which causes a build error. Let's fix this by converting to
    the raw_notifier_call_chain_robust() in genpd.

    Reported-by: kernel test robot
    Reported-by: Lina Iyer
    Signed-off-by: Ulf Hansson
    Signed-off-by: Rafael J. Wysocki

    Ulf Hansson
     

16 Oct, 2020

3 commits

  • To enable better debug of PM domains, keep track of successful
    and failing attempts to enter each domain idle state.

    These statistics are exported in debugfs when reading the
    idle_states node associated with each PM domain.

    Signed-off-by: Lina Iyer
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki

    Lina Iyer
     
  • There is no strict need to group a comment and a single statement in an
    if block, as comments are stripped by the pre-processor. However,
    adding curly braces does make the code easier to read, and may avoid
    mistakes when changing the code later.

    Signed-off-by: Geert Uytterhoeven
    Acked-by: Ulf Hansson
    Signed-off-by: Rafael J. Wysocki

    Geert Uytterhoeven
     
  • A device may have specific HW constraints that must be obeyed before
    its corresponding PM domain (genpd) can be powered off - and vice versa at
    power on. These constraints can't be managed through the regular runtime PM
    based deployment for a device, because the access pattern for it isn't
    always request based. In other words, using the runtime PM callbacks to
    deal with the constraints doesn't work for these cases.

    For these reasons, let's instead add a PM domain power on/off notification
    mechanism to genpd. To add/remove a notifier for a device, the device must
    already have been attached to the genpd, which also means that it needs to
    be a part of the PM domain topology.

    To add/remove a notifier, let's introduce two genpd specific functions:
    - dev_pm_genpd_add|remove_notifier()

    Note that, to further clarify when genpd power on/off notifiers may be
    used, one can compare with the existing CPU_CLUSTER_PM_ENTER|EXIT
    notifiers. In the long run, the genpd power on/off notifiers should be able
    to replace them, but that requires additional genpd based platform support
    for the current users.

    Signed-off-by: Ulf Hansson
    Tested-by: Lina Iyer
    Signed-off-by: Rafael J. Wysocki

    Ulf Hansson
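
The two ideas in the entries above, power on/off notifications and per-state accounting of successful and rejected attempts, can be combined in a small userspace sketch. It is not the genpd code: the event names, `struct power_domain`, and the rule that a listener can veto a pending power off are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event names for this sketch. */
enum pd_event { EV_PRE_OFF, EV_OFF, EV_ON };

/* A listener returns 0 to accept an event and nonzero to veto it. */
typedef int (*pd_listener)(enum pd_event ev, void *data);

#define MAX_LISTENERS 4

struct power_domain {
    pd_listener listeners[MAX_LISTENERS];
    void *listener_data[MAX_LISTENERS];
    size_t nr_listeners;
    unsigned long usage;    /* successful entries into the idle state */
    unsigned long rejected; /* failed attempts to enter the idle state */
    int is_on;
};

static int pd_add_listener(struct power_domain *pd, pd_listener fn, void *data)
{
    assert(pd != NULL && fn != NULL);
    if (pd->nr_listeners == MAX_LISTENERS)
        return -1;
    pd->listeners[pd->nr_listeners] = fn;
    pd->listener_data[pd->nr_listeners] = data;
    pd->nr_listeners++;
    return 0;
}

/* Try to power off: announce it, roll back on a veto, count the outcome. */
static int pd_power_off(struct power_domain *pd)
{
    assert(pd != NULL);
    for (size_t i = 0; i < pd->nr_listeners; i++) {
        if (pd->listeners[i](EV_PRE_OFF, pd->listener_data[i]) != 0) {
            /* Tell the listeners already notified that the domain stays on. */
            for (size_t j = 0; j < i; j++)
                pd->listeners[j](EV_ON, pd->listener_data[j]);
            pd->rejected++;
            return -1;
        }
    }
    pd->is_on = 0;
    pd->usage++;
    for (size_t i = 0; i < pd->nr_listeners; i++)
        pd->listeners[i](EV_OFF, pd->listener_data[i]);
    return 0;
}

/* Example listener: counts every event it sees in the int that data points to. */
static int count_events(enum pd_event ev, void *data)
{
    (void)ev;
    (*(int *)data)++;
    return 0;
}

/* Example listener: refuses any pending power off. */
static int veto_pre_off(enum pd_event ev, void *data)
{
    (void)data;
    return ev == EV_PRE_OFF;
}
```

Reading `usage` and `rejected` after a series of attempts gives the kind of per-state statistics the accounting entry exports through debugfs.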
     

15 Oct, 2020

1 commit

  • Pull driver core updates from Greg KH:
    "Here is the "big" set of driver core patches for 5.10-rc1

    They include a lot of different things, all related to the driver core
    and/or some driver logic:

    - sysfs common write functions to make it easier to audit sysfs
    attributes

    - device connection cleanups and fixes

    - devm helpers for a few functions

    - NOIO allocations for when devices are being removed

    - minor cleanups and fixes

    All have been in linux-next for a while with no reported issues"

    * tag 'driver-core-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits)
    regmap: debugfs: use semicolons rather than commas to separate statements
    platform/x86: intel_pmc_core: do not create a static struct device
    drivers core: node: Use a more typical macro definition style for ACCESS_ATTR
    drivers core: Use sysfs_emit for shared_cpu_map_show and shared_cpu_list_show
    mm: and drivers core: Convert hugetlb_report_node_meminfo to sysfs_emit
    drivers core: Miscellaneous changes for sysfs_emit
    drivers core: Reindent a couple uses around sysfs_emit
    drivers core: Remove strcat uses around sysfs_emit and neaten
    drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions
    sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output
    dyndbg: use keyword, arg varnames for query term pairs
    driver core: force NOIO allocations during unplug
    platform_device: switch to simpler IDA interface
    driver core: platform: Document return type of more functions
    Revert "driver core: Annotate dev_err_probe() with __must_check"
    Revert "test_firmware: Test platform fw loading on non-EFI systems"
    iio: adc: xilinx-xadc: use devm_krealloc()
    hwmon: pmbus: use more devres helpers
    devres: provide devm_krealloc()
    syscore: Use pm_pr_dbg() for syscore_{suspend,resume}()
    ...

    Linus Torvalds
     

13 Oct, 2020

1 commit

  • * pm-core:
    PM: runtime: Fix timer_expires data type on 32-bit arches
    PM: runtime: Remove link state checks in rpm_get/put_supplier()

    * pm-sleep:
    ACPI: EC: PM: Drop ec_no_wakeup check from acpi_ec_dispatch_gpe()
    ACPI: EC: PM: Flush EC work unconditionally after wakeup
    PM: hibernate: remove the bogus call to get_gendisk() in software_resume()
    PM: hibernate: Batch hibernate and resume IO requests

    * pm-pci:
    PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI

    * pm-domains:
    PM: domains: Allow to abort power off when no ->power_off() callback
    PM: domains: Rename power state enums for genpd

    Rafael J. Wysocki
     

05 Oct, 2020

1 commit

  • Pull operating performance points (OPP) framework fixes for 5.10-rc1
    from Viresh Kumar:

    "- Return -EPROBE_DEFER properly from dev_pm_opp_get_opp_table()
    (Stephan Gerhold).

    - Minor cleanups around required-opps (Stephan Gerhold).

    - Extends opp-supported-hw property to contain multiple versions
    (Viresh Kumar).

    - Multiple cleanups around dev_pm_opp_attach_genpd() (Viresh Kumar).

    - Multiple fixes, cleanups in the OPP core for overall better design
    (Viresh Kumar)."

    * 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
    opp: Allow opp-level to be set to 0
    opp: Prevent memory leak in dev_pm_opp_attach_genpd()
    ARM: tegra: Pass multiple versions in opp-supported-hw property
    opp: Allow opp-supported-hw to contain multiple versions
    dt-bindings: opp: Allow opp-supported-hw to contain multiple versions
    opp: Set required OPPs in reverse order when scaling down
    opp: Reduce code duplication in _set_required_opps()
    opp: Drop unnecessary check from dev_pm_opp_attach_genpd()
    opp: Handle multiple calls for same OPP table in _of_add_opp_table_v1()
    opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER
    opp: Remove _dev_pm_opp_find_and_remove_table() wrapper
    opp: Split out _opp_set_rate_zero()
    opp: Reuse the enabled flag in !target_freq path
    opp: Rename regulator_enabled and use it as status of all resources

    Rafael J. Wysocki
     

03 Oct, 2020

2 commits

  • In genpd_power_off() we may decide to abort the power off of the PM domain,
    even beyond the point when the governor would accept it. The abort is done
    if it turns out that a child domain has been requested to be powered on,
    which means it's waiting for the lock of the parent to be released.

    However, the abort is currently only considered if the genpd in question
    has a ->power_off() callback assigned. This is unnecessarily limiting,
    especially if the genpd would have a parent of its own. Let's remove the
    limitation and make the behaviour consistent.

    Signed-off-by: Ulf Hansson
    [ rjw: Subject edit ]
    Signed-off-by: Rafael J. Wysocki

    Ulf Hansson
     
  • To clarify the code a bit, let's rename GPD_STATE_ACTIVE into
    GENPD_STATE_ON and GPD_STATE_POWER_OFF to GENPD_STATE_OFF.

    Signed-off-by: Ulf Hansson
    [ rjw: Subject edit ]
    Signed-off-by: Rafael J. Wysocki

    Ulf Hansson
     

02 Oct, 2020

4 commits

  • Change additional instances that could use sysfs_emit and sysfs_emit_at
    that the coccinelle script could not convert.

    o macros creating show functions with ## concatenation
    o unbound sprintf uses with buf+len for start of output to sysfs_emit_at
    o returns with ?: tests and sprintf to sysfs_emit
    o sysfs output with struct class * not struct device * arguments

    Miscellanea:

    o remove unnecessary initializations around these changes
    o consistently use int len for return length of show functions
    o use octal permissions and not S_
    o rename a few show function names so DEVICE_ATTR_ can be used
    o use DEVICE_ATTR_ADMIN_RO where appropriate
    o consistently use const char *output for strings
    o checkpatch/style neatening

    Signed-off-by: Joe Perches
    Link: https://lore.kernel.org/r/8bc24444fe2049a9b2de6127389b57edfdfe324d.1600285923.git.joe@perches.com
    Signed-off-by: Greg Kroah-Hartman

    Joe Perches
     
  • Just a couple of whitespace realignments to open parenthesis for
    multi-line statements.

    Signed-off-by: Joe Perches
    Link: https://lore.kernel.org/r/33224191421dbb56015eded428edfddcba997d63.1600285923.git.joe@perches.com
    Signed-off-by: Greg Kroah-Hartman

    Joe Perches
     
  • strcat is no longer necessary for sysfs_emit and sysfs_emit_at uses.

    Convert the strcat uses to sysfs_emit calls and neaten other block
    uses of direct returns to use an intermediate const char *.

    Signed-off-by: Joe Perches
    Link: https://lore.kernel.org/r/5d606519698ce4c8f1203a2b35797d8254c6050a.1600285923.git.joe@perches.com
    Signed-off-by: Greg Kroah-Hartman

    Joe Perches
     
  • Convert the various sprintf family calls in sysfs device show functions
    to sysfs_emit and sysfs_emit_at for PAGE_SIZE buffer safety.

    Done with:

    $ spatch -sp-file sysfs_emit_dev.cocci --in-place --max-width=80 .

    And cocci script:

    $ cat sysfs_emit_dev.cocci
    @@
    identifier d_show;
    identifier dev, attr, buf;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {

    }

    @@
    identifier d_show;
    identifier dev, attr, buf;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {

    }

    @@
    identifier d_show;
    identifier dev, attr, buf;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {

    }

    @@
    identifier d_show;
    identifier dev, attr, buf;
    expression chr;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {

    }

    @@
    identifier d_show;
    identifier dev, attr, buf;
    identifier len;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {

    return len;
    }

    @@
    identifier d_show;
    identifier dev, attr, buf;
    identifier len;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {

    return len;
    }

    @@
    identifier d_show;
    identifier dev, attr, buf;
    identifier len;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {

    return len;
    }

    @@
    identifier d_show;
    identifier dev, attr, buf;
    identifier len;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {

    return len;
    }

    @@
    identifier d_show;
    identifier dev, attr, buf;
    expression chr;
    @@

    ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
    {
    ...
    - strcpy(buf, chr);
    - return strlen(buf);
    + return sysfs_emit(buf, chr);
    }

    Signed-off-by: Joe Perches
    Link: https://lore.kernel.org/r/3d033c33056d88bbe34d4ddb62afd05ee166ab9a.1600285923.git.joe@perches.com
    Signed-off-by: Greg Kroah-Hartman

    Joe Perches
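
The "PAGE_SIZE buffer safety" motivation behind the conversions above can be illustrated with a userspace analogue. This is a sketch only: `emit_at()` is a hypothetical helper, not the kernel's sysfs_emit_at(), and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed size of the output buffer */

/*
 * Format into buf starting at offset 'at' without ever writing past
 * SKETCH_PAGE_SIZE bytes. Returns the number of characters actually
 * stored, not counting the terminating NUL.
 */
static int emit_at(char *buf, int at, const char *fmt, ...)
{
    va_list args;
    int room, len;

    assert(buf != NULL && fmt != NULL);
    if (at < 0 || at >= SKETCH_PAGE_SIZE)
        return 0;

    room = SKETCH_PAGE_SIZE - at;
    va_start(args, fmt);
    len = vsnprintf(buf + at, (size_t)room, fmt, args);
    va_end(args);

    if (len < 0)
        return 0;
    /* vsnprintf reports the untruncated length; clamp it to what fit. */
    if (len >= room)
        len = room - 1;
    return len;
}
```

A show-style function can then accumulate output with `len += emit_at(buf, len, ...)`, where an unbounded `sprintf(buf + len, ...)` could run past the end of the page.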
     

25 Sep, 2020

1 commit

  • To support runtime PM for hisi SAS driver (the driver is in directory
    drivers/scsi/hisi_sas), we add device link between scsi_device->sdev_gendev
    (consumer device) and hisi_hba->dev(supplier device) with flags
    DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE.

    After runtime suspending the consumers and the supplier, unloading the
    driver causes a hang.

    We found that it called function device_release_driver_internal() to
    release the supplier device (hisi_hba->dev), as the device link was
    busy, it set the device link state to DL_STATE_SUPPLIER_UNBIND, and
    then it called device_release_driver_internal() to release the consumer
    device (scsi_device->sdev_gendev).

    Then it would try to call pm_runtime_get_sync() to resume the consumer
    device. Because a consumer-supplier relation existed, it would try
    to resume the supplier first, but as the link state was already
    DL_STATE_SUPPLIER_UNBIND, it skipped resuming the supplier and only
    resumed the consumer, which hung (it sends IOs to resume scsi_device
    while the SAS controller is suspended).

    Simple flow is as follows:

    device_release_driver_internal -> (supplier device)
    if device_links_busy ->
    device_links_unbind_consumers ->
    ...
    WRITE_ONCE(link->status, DL_STATE_SUPPLIER_UNBIND)
    device_release_driver_internal (consumer device)
    pm_runtime_get_sync -> (consumer device)
    ...
    __rpm_callback ->
    rpm_get_suppliers ->
    if link->status == DL_STATE_SUPPLIER_UNBIND -> skip the action of resuming the supplier
    ...
    pm_runtime_clean_up_links
    ...

    Correct suspend/resume ordering between a supplier device and its consumer
    devices (resume the supplier device before resuming consumer devices, and
    suspend consumer devices before suspending the supplier device) should be
    guaranteed by runtime PM, but the state checks in rpm_get_supplier() and
    rpm_put_supplier() break this rule, so remove them.

    Signed-off-by: Xiang Chen
    [ rjw: Subject and changelog edits ]
    Cc: All applicable
    Signed-off-by: Rafael J. Wysocki

    Xiang Chen
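
The ordering rule stated at the end of the entry above (resume the supplier before the consumer, suspend the consumer before the supplier, with no link state check in between) can be sketched in userspace C. `struct rpm_dev` and both helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device taking part in runtime PM. */
struct rpm_dev {
    struct rpm_dev *supplier; /* NULL when the device has no supplier */
    int usage;                /* active references held on this device */
    int resume_seq;           /* order of the last resume, 0 while suspended */
};

static int resume_counter;

/* Resume dev, always resuming its supplier first. */
static void rpm_resume_sketch(struct rpm_dev *dev)
{
    assert(dev != NULL);
    if (dev->supplier)
        rpm_resume_sketch(dev->supplier);
    if (dev->usage++ == 0)
        dev->resume_seq = ++resume_counter;
}

/* Suspend dev first, then drop the reference held on its supplier. */
static void rpm_suspend_sketch(struct rpm_dev *dev)
{
    assert(dev != NULL && dev->usage > 0);
    if (--dev->usage == 0)
        dev->resume_seq = 0;
    if (dev->supplier)
        rpm_suspend_sketch(dev->supplier);
}
```

In the failing scenario from the entry, the supplier step was skipped because of the link state, so the consumer was resumed against a suspended controller; the sketch has no such check.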
     

01 Sep, 2020

1 commit


25 Aug, 2020

2 commits

  • It has been reported that system-wide suspend may be aborted in the
    absence of any wakeup events due to unforeseen interactions of it with
    the runtime PM framework.

    One failing scenario is when there are multiple devices sharing an
    ACPI power resource and runtime-resume needs to be carried out for
    one of them during system-wide suspend (for example, because it needs
    to be reconfigured before the whole system goes to sleep). In that
    case, the runtime-resume of that device involves turning the ACPI
    power resource "on" which in turn causes runtime-resume requests
    to be queued up for all of the other devices sharing it. Those
    requests go to the runtime PM workqueue which is frozen during
    system-wide suspend, so they are not actually taken care of until
    the resume of the whole system, but the pm_runtime_barrier()
    call in __device_suspend() sees them and triggers system wakeup
    events for them which then cause the system-wide suspend to be
    aborted if wakeup source objects are in active use.

    Of course, the logic that leads to triggering those wakeup events is
    questionable in the first place, because clearly there are cases in
    which a pending runtime resume request for a device is not connected
    to any real wakeup events in any way (like the one above). Moreover,
    it is racy, because the device may be resuming already by the time
    the pm_runtime_barrier() runs and so if the driver doesn't take care
    of signaling the wakeup event as appropriate, it will be lost.
    However, if the driver does take care of that, the extra
    pm_wakeup_event() call in the core is redundant.

    Accordingly, drop the conditional pm_wakeup_event() call from
    __device_suspend() and make the latter call pm_runtime_barrier()
    alone. Also modify the comment next to that call to reflect the new
    code and extend it to mention the need to avoid unwanted interactions
    between runtime PM and system-wide device suspend callbacks.

    Fixes: 1e2ef05bb8cf8 ("PM: Limit race conditions between runtime PM and system sleep (v2)")
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Alan Stern
    Reported-by: Utkarsh H Patel
    Tested-by: Utkarsh H Patel
    Tested-by: Pengfei Xu
    Cc: All applicable

    Rafael J. Wysocki
     
  • The OPP core manages various resources, e.g. clocks or interconnect paths.
    These resources are looked up when the OPP table is allocated once
    dev_pm_opp_get_opp_table() is called the first time (either directly
    or indirectly through one of the many helper functions).

    At this point, the resources may not be available yet, i.e. looking them
    up will result in -EPROBE_DEFER. Unfortunately, dev_pm_opp_get_opp_table()
    is currently unable to propagate this error code since it only returns
    the allocated OPP table or NULL.

    This means that all consumers of the OPP core are required to make sure
    that all necessary resources are available. Usually this happens by
    requesting them, checking the result and releasing them immediately after.

    For example, we have added "dev_pm_opp_of_find_icc_paths(dev, NULL)" to
    several drivers now just to make sure the interconnect providers are
    ready before the OPP table is allocated. If this call is missing,
    the OPP core will only warn about this and then attempt to continue
    without interconnect. This will eventually fail horribly, e.g.:

    cpu cpu0: _allocate_opp_table: Error finding interconnect paths: -517
    ... later ...
    of: _read_bw: Mismatch between opp-peak-kBps and paths (1 0)
    cpu cpu0: _opp_add_static_v2: opp key field not found
    cpu cpu0: _of_add_opp_table_v2: Failed to add OPP, -22

    This example happens when trying to use interconnects for a CPU OPP
    table together with qcom-cpufreq-nvmem.c. qcom-cpufreq-nvmem calls
    dev_pm_opp_set_supported_hw(), which ends up allocating the OPP table
    early. To fix the problem with the current approach we would need to add
    yet another call to dev_pm_opp_of_find_icc_paths(dev, NULL).
    But actually qcom-cpufreq-nvmem.c has nothing to do with interconnects...

    This commit attempts to make this more robust by allowing
    dev_pm_opp_get_opp_table() to return an error pointer. Fixing all
    the usages is trivial because the function is usually used indirectly
    through another helper (e.g. dev_pm_opp_set_supported_hw() above).
    These other helpers already return an error pointer.

    The example above then works correctly because set_supported_hw() will
    return -EPROBE_DEFER, and qcom-cpufreq-nvmem.c already propagates that
    error. It should also be possible to remove the remaining usages of
    "dev_pm_opp_of_find_icc_paths(dev, NULL)" from other drivers as well.

    Note that this commit currently only handles -EPROBE_DEFER for the
    clock/interconnects within _allocate_opp_table(). Other errors are just
    ignored as before. Eventually those should be propagated as well.

    Signed-off-by: Stephan Gerhold
    Acked-by: Krzysztof Kozlowski
    Reviewed-by: Ulf Hansson
    [ Viresh: skip checking return value of dev_pm_opp_get_opp_table() for
    EPROBE_DEFER in domain.c, fix NULL return value and reorder
    code a bit in core.c, and update exynos-asv.c ]
    Signed-off-by: Viresh Kumar

    Stephan Gerhold
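
The "return an error pointer instead of NULL" approach in the entry above can be sketched in userspace C. The helpers below imitate the error-pointer idiom but are hypothetical local definitions, as are `struct opp_table_sketch`, `get_table()` and `probe_sketch()`. The value 517 is taken from the "-517" shown in the log excerpt above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO    4095
#define SKETCH_EPROBE_DEFER 517 /* matches the -517 in the log excerpt */

/* Encode a negative error code in a pointer value. */
static void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True when the pointer value lies in the range reserved for error codes. */
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the error code from an error pointer. */
static long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

struct opp_table_sketch {
    int ready;
};

/* Hypothetical lookup: defers when the needed resources are not ready yet. */
static struct opp_table_sketch *get_table(struct opp_table_sketch *t,
                                          int resources_ready)
{
    assert(t != NULL);
    if (!resources_ready)
        return err_ptr(-SKETCH_EPROBE_DEFER);
    t->ready = 1;
    return t;
}

/* Caller propagates the error code instead of continuing without resources. */
static int probe_sketch(struct opp_table_sketch *t, int resources_ready)
{
    struct opp_table_sketch *table = get_table(t, resources_ready);

    if (is_err(table))
        return (int)ptr_err(table);
    return 0;
}
```

A NULL return cannot distinguish "try again later" from other failures, whereas the encoded code lets a caller pass the deferral straight up to its own caller.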