07 Nov, 2015

2 commits

  • __GFP_WAIT was used to signal that the caller was in atomic context and
    could not sleep. Now it is possible to distinguish between true atomic
    context and callers that are not willing to sleep. The latter should
    clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing
    __GFP_WAIT behaves differently, there is a risk that people will clear the
    wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
    indicate what it does -- setting it allows all reclaim activity, clearing
    it prevents it.

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Mel Gorman
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Acked-by: Johannes Weiner
    Cc: Christoph Lameter
    Acked-by: David Rientjes
    Cc: Vitaly Wool
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • …d avoiding waking kswapd

    __GFP_WAIT has been used to identify atomic context in callers that hold
    spinlocks or are in interrupts. They are expected to be high priority and
    have access to one of two watermarks lower than "min" which can be referred
    to as the "atomic reserve". __GFP_HIGH users get access to the first
    lower watermark and can be called the "high priority reserve".

    Over time, callers had a requirement to not block when fallback options
    were available. Some have abused __GFP_WAIT leading to a situation where
    an optimistic allocation with a fallback option can access atomic
    reserves.

    This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
    cannot sleep and have no alternative. High priority users continue to use
    __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
    are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM identifies
    callers that want to wake kswapd for background reclaim. __GFP_WAIT is
    redefined as a caller that is willing to enter direct reclaim and wake
    kswapd for background reclaim.

    This patch then converts a number of sites:

    o __GFP_ATOMIC is used by callers that are high priority and have memory
    pools for those requests. GFP_ATOMIC uses this flag.

    o Callers that have a limited mempool to guarantee forward progress clear
    __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
    into this category where kswapd will still be woken but atomic reserves
    are not used as there is a one-entry mempool to guarantee progress.

    o Callers that are checking if they are non-blocking should use the
    helper gfpflags_allow_blocking() where possible. This is because
    checking for __GFP_WAIT as was done historically now can trigger false
    positives. Some exceptions like dm-crypt.c exist where the code intent
    is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
    flag manipulations.

    o Callers that built their own GFP flags instead of starting with GFP_KERNEL
    and friends now also need to specify __GFP_KSWAPD_RECLAIM.

    The first key hazard to watch out for is callers that removed __GFP_WAIT
    and were depending on access to atomic reserves for inconspicuous reasons.
    In some cases it may be appropriate for them to use __GFP_HIGH.

    The second key hazard is callers that assembled their own combination of
    GFP flags instead of starting with something like GFP_KERNEL. They may
    now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
    if it's missed in most cases as other activity will wake kswapd.
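
The flag split described above can be sketched in user-space C. This is a minimal model, not kernel code: the SK_* names and bit values are hypothetical stand-ins, and sk_allow_blocking() only mirrors the intent of the gfpflags_allow_blocking() helper named in the changelog. It shows why a historical "flags & __GFP_WAIT" test can now give a false positive for a caller that only wants kswapd woken.

```c
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real values differ. */
typedef unsigned int sketch_gfp_t;

#define SK_GFP_HIGH           0x01u /* may use the high priority reserve */
#define SK_GFP_ATOMIC         0x02u /* truly atomic, no fallback option */
#define SK_GFP_DIRECT_RECLAIM 0x04u /* caller may sleep in direct reclaim */
#define SK_GFP_KSWAPD_RECLAIM 0x08u /* caller wants kswapd woken */

/* The redefined "wait" mask covers both kinds of reclaim. */
#define SK_GFP_WAIT (SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM)

/* Composite masks in the spirit of GFP_ATOMIC, GFP_NOWAIT and GFP_KERNEL. */
#define SK_ATOMIC (SK_GFP_HIGH | SK_GFP_ATOMIC | SK_GFP_KSWAPD_RECLAIM)
#define SK_NOWAIT (SK_GFP_KSWAPD_RECLAIM)
#define SK_KERNEL (SK_GFP_WAIT)

/* Blocking is allowed only when direct reclaim is permitted. */
static bool sk_allow_blocking(sketch_gfp_t flags)
{
    return (flags & SK_GFP_DIRECT_RECLAIM) != 0;
}

/* Historical style of check: true if any bit of the wait mask is set. */
static bool sk_legacy_wait_check(sketch_gfp_t flags)
{
    return (flags & SK_GFP_WAIT) != 0;
}

/* Only callers marked atomic may dip into the atomic reserve. */
static bool sk_may_use_atomic_reserve(sketch_gfp_t flags)
{
    return (flags & SK_GFP_ATOMIC) != 0;
}
```

In this model SK_NOWAIT passes the legacy check although it must not block, which is the false positive the changelog warns about; sk_allow_blocking() tests the direct-reclaim bit alone and so avoids it.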

    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Vitaly Wool <vitalywool@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Mel Gorman
     

14 Oct, 2015

2 commits

  • Just fix a typo in a function name in kerneldoc comments.

    Signed-off-by: Geliang Tang
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki

    Geliang Tang
     
  • There are quite a few cases in which device drivers, bus types or
    even the PM core itself may benefit from knowing whether or not
    the platform firmware will be involved in the upcoming system power
    transition (during system suspend) or whether or not it was involved
    in it (during system resume).

    For this reason, introduce global system suspend flags that can be
    used by the platform code to expose that information for the benefit
    of the other parts of the kernel and make the ACPI core set them
    as appropriate.

    Users of the new flags will be added later.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

16 Sep, 2015

1 commit

  • Add a sysfs attribute, /sys/power/pm_wakeup_irq, reporting the IRQ
    number of the first wakeup interrupt (that is, the first interrupt
    from an IRQ line armed for system wakeup) seen by the kernel during
    the most recent system suspend/resume cycle.

    This feature will be useful for system wakeup diagnostics of
    spurious wakeup interrupts.
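
As a rough illustration of how a diagnostic tool might consume the attribute, the sketch below reads one unsigned decimal number from a file. It assumes the attribute contains a plain decimal IRQ number, and that reading fails or returns nothing when no wakeup IRQ was recorded; the helper name sk_read_wakeup_irq() is hypothetical.

```c
#include <stdio.h>

/*
 * Read an IRQ number from the given path (for example
 * /sys/power/pm_wakeup_irq). Returns 0 and stores the number in *irq
 * on success, or -1 if the file cannot be opened or does not start
 * with an unsigned decimal number.
 */
static int sk_read_wakeup_irq(const char *path, unsigned int *irq)
{
    FILE *f = fopen(path, "r");
    int parsed;

    if (!f)
        return -1;

    parsed = (fscanf(f, "%u", irq) == 1);
    fclose(f);

    return parsed ? 0 : -1;
}
```

A caller would pass "/sys/power/pm_wakeup_irq" after a resume and treat a -1 return as "no wakeup IRQ available".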

    Signed-off-by: Alexandra Yates
    [ rjw: Fixed up pm_wakeup_irq definition ]
    Signed-off-by: Rafael J. Wysocki

    Alexandra Yates
     

03 Sep, 2015

1 commit

  • Pull core block updates from Jens Axboe:
    "This first core part of the block IO changes contains:

    - Cleanup of the bio IO error signaling from Christoph. We used to
    rely on the uptodate bit and passing around of an error, now we
    store the error in the bio itself.

    - Improvement of the above from myself, by shrinking the bio size
    down again to fit in two cachelines on x86-64.

    - Revert of the max_hw_sectors cap removal from a revision again,
    from Jeff Moyer. This caused performance regressions in various
    tests. Reinstate the limit, bump it to a more reasonable size
    instead.

    - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
    Most devices have huge trim limits, which can cause nasty latencies
    when deleting files. Enable the admin to configure the size down.
    We will look into having a more sane default instead of UINT_MAX
    sectors.

    - Improvement of the SG gaps logic from Keith Busch.

    - Enable the block core to handle arbitrarily sized bios, which
    enables a nice simplification of bio_add_page() (which is an IO hot
    path). From Kent.

    - Improvements to the partition io stats accounting, making it
    faster. From Ming Lei.

    - Also from Ming Lei, a basic fixup for overflow of the sysfs pending
    file in blk-mq, as well as a fix for a blk-mq timeout race
    condition.

    - Ming Lin has been carrying Kent's above-mentioned patches forward
    for a while, and testing them. Ming also did a few fixes around
    that.

    - Sasha Levin found and fixed a use-after-free problem introduced by
    the bio->bi_error changes from Christoph.

    - Small blk cgroup cleanup from Viresh Kumar"

    * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
    blk: Fix bio_io_vec index when checking bvec gaps
    block: Replace SG_GAPS with new queue limits mask
    block: bump BLK_DEF_MAX_SECTORS to 2560
    Revert "block: remove artifical max_hw_sectors cap"
    blk-mq: fix race between timeout and freeing request
    blk-mq: fix buffer overflow when reading sysfs file of 'pending'
    Documentation: update notes in biovecs about arbitrarily sized bios
    block: remove bio_get_nr_vecs()
    fs: use helper bio_add_page() instead of open coding on bi_io_vec
    block: kill merge_bvec_fn() completely
    md/raid5: get rid of bio_fits_rdev()
    md/raid5: split bio for chunk_aligned_read
    block: remove split code in blkdev_issue_{discard,write_same}
    btrfs: remove bio splitting and merge_bvec_fn() calls
    bcache: remove driver private bio splitting code
    block: simplify bio_add_page()
    block: make generic_make_request handle arbitrarily sized bios
    blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
    block: don't access bio->bi_error after bio_put()
    block: shrink struct bio down to 2 cache lines again
    ...

    Linus Torvalds
     

01 Aug, 2015

1 commit

  • The Linux kernel suspend path has traditionally invoked sys_sync()
    before freezing user threads.

    But sys_sync() can be expensive, and some user-space OS's do not want
    the kernel to pay the cost of sys_sync() on every suspend -- preferring
    to invoke sync() from user-space if/when they want it.

    So make sys_sync on suspend build-time optional.

    The default is unchanged.

    Signed-off-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Len Brown
     

29 Jul, 2015

1 commit

  • Currently we have two different ways to signal an I/O error on a BIO:

    (1) by clearing the BIO_UPTODATE flag
    (2) by returning a Linux errno value to the bi_end_io callback

    The first one has the drawback of only communicating a single possible
    error (-EIO), and the second one has the drawback of not being persistent
    when bios are queued up, and of not being passed along from child to parent
    bio in the ever more popular chaining scenario. Having both mechanisms
    available has the additional drawback of utterly confusing driver authors
    and introducing bugs where various I/O submitters only deal with one of
    them, and the others have to add boilerplate code to deal with both kinds
    of error returns.

    So add a new bi_error field to store an errno value directly in struct
    bio and remove the existing mechanisms to clean all this up.
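
The idea of keeping the errno in the bio itself, so that it persists and can travel from child to parent in a chain, can be sketched with a stripped-down user-space model. struct sketch_bio and sketch_bio_endio() are hypothetical stand-ins and do not reproduce the kernel's struct bio or its completion path.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for struct bio. */
struct sketch_bio {
    int error;                  /* 0 on success, negative errno on failure */
    struct sketch_bio *parent;  /* parent bio in a chain, or NULL */
    void (*end_io)(struct sketch_bio *bio); /* completion callback, or NULL */
};

/*
 * Complete a bio. The error is recorded on the bio itself rather than
 * passed to the callback, and the first error seen is copied to the
 * parent so it is not lost when bios are chained.
 */
static void sketch_bio_endio(struct sketch_bio *bio, int error)
{
    if (error && !bio->error)
        bio->error = error;

    if (bio->parent && bio->error && !bio->parent->error)
        bio->parent->error = bio->error;

    if (bio->end_io)
        bio->end_io(bio);
}
```

With a single field there is one place for a submitter to look, and any errno value can be reported instead of only -EIO.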

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: NeilBrown
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

15 Jul, 2015

1 commit

  • The synchronous synchronize_rcu() in wakeup_source_remove() sometimes
    blocks a user process that writes to /sys/kernel/wake_unlock.

    For example, when Android's eventhub tries to release a wakelock, this
    blocking can occur, and eventhub can't get input events
    for a while.

    Using a work item instead of direct function call at pm_wake_unlock()
    can prevent this unnecessary delay from happening.

    Signed-off-by: SungEun Kim
    Signed-off-by: Rafael J. Wysocki

    SungEun Kim
     

02 Jul, 2015

1 commit

  • Pull power management and ACPI fixes from Rafael Wysocki:
    "These are fixes that didn't make it to the previous PM+ACPI pull
    request or are fixing issues introduced by it.

    Specifics:

    - Fix a recently added memory leak in an error path in the ACPI
    resources management code (Dan Carpenter)

    - Fix a build warning triggered by an ACPI video header function that
    should be static inline (Borislav Petkov)

    - Change names of helper functions converting struct fwnode_handle
    pointers to either struct device_node or struct acpi_device
    pointers so they don't conflict with local variable names
    (Alexander Sverdlin)

    - Make the hibernate core re-enable nonboot CPUs on failures to
    disable them as expected (Vitaly Kuznetsov)

    - Increase the default timeout of the device suspend watchdog to
    prevent it from triggering too early on some systems (Takashi Iwai)

    - Prevent the cpuidle powernv driver from registering idle states
    with CPUIDLE_FLAG_TIMER_STOP set if CONFIG_TICK_ONESHOT is unset
    which leads to boot hangs (Preeti U Murthy)"

    * tag 'pm+acpi-4.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode
    PM / sleep: Increase default DPM watchdog timeout to 60
    PM / hibernate: re-enable nonboot cpus on disable_nonboot_cpus() failure
    ACPI / OF: Rename of_node() and acpi_node() to to_of_node() and to_acpi_node()
    ACPI / video: Inline acpi_video_set_dmi_backlight_type
    ACPI / resources: free memory on error in add_region_before()

    Linus Torvalds
     

26 Jun, 2015

1 commit

  • Pull core block IO update from Jens Axboe:
    "Nothing really major in here, mostly a collection of smaller
    optimizations and cleanups, mixed with various fixes. In more detail,
    this contains:

    - Addition of policy specific data to blkcg for block cgroups. From
    Arianna Avanzini.

    - Various cleanups around command types from Christoph.

    - Cleanup of the suspend block I/O path from Christoph.

    - Plugging updates from Shaohua and Jeff Moyer, for blk-mq.

    - Eliminating atomic inc/dec of both remaining IO count and reference
    count in a bio. From me.

    - Fixes for SG gap and chunk size support for data-less (discards)
    IO, so we can merge these better. From me.

    - Small restructuring of blk-mq shared tag support, freeing drivers
    from iterating hardware queues. From Keith Busch.

    - A few cfq-iosched tweaks, from Tahsin Erdogan and me. Makes the
    IOPS mode the default for non-rotational storage"

    * 'for-4.2/core' of git://git.kernel.dk/linux-block: (35 commits)
    cfq-iosched: fix other locations where blkcg_to_cfqgd() can return NULL
    cfq-iosched: fix sysfs oops when attempting to read unconfigured weights
    cfq-iosched: move group scheduling functions under ifdef
    cfq-iosched: fix the setting of IOPS mode on SSDs
    blktrace: Add blktrace.c to BLOCK LAYER in MAINTAINERS file
    block, cgroup: implement policy-specific per-blkcg data
    block: Make CFQ default to IOPS mode on SSDs
    block: add blk_set_queue_dying() to blkdev.h
    blk-mq: Shared tag enhancements
    block: don't honor chunk sizes for data-less IO
    block: only honor SG gap prevention for merges that contain data
    block: fix returnvar.cocci warnings
    block, dm: don't copy bios for request clones
    block: remove management of bi_remaining when restoring original bi_end_io
    block: replace trylock with mutex_lock in blkdev_reread_part()
    block: export blkdev_reread_part() and __blkdev_reread_part()
    suspend: simplify block I/O handling
    block: collapse bio bit space
    block: remove unused BIO_RW_BLOCK and BIO_EOF flags
    block: remove BIO_EOPNOTSUPP
    ...

    Linus Torvalds
     

25 Jun, 2015

2 commits

  • Many hard disks (mostly WD ones) have firmware problems and take too
    long, more than 10 seconds, to resume from suspend. This often
    exceeds the default DPM watchdog timeout (12 seconds), resulting in a
    sudden kernel panic.

    Since most distros just take the default as is, we should provide a
    safer value. This patch increases the default value from 12
    seconds to one minute, which has been confirmed to be long enough for
    such problematic disks.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=91921
    Fixes: 70fea60d888d (PM / Sleep: Detect device suspend/resume lockup and log event)
    Cc: 3.13+ # 3.13+
    Signed-off-by: Takashi Iwai
    Signed-off-by: Rafael J. Wysocki

    Takashi Iwai
     
  • When disable_nonboot_cpus() fails on some cpu it doesn't bring back all
    cpus it managed to offline; a subsequent call to enable_nonboot_cpus() is
    expected. In hibernation_platform_enter() we don't call
    enable_nonboot_cpus() on error so cpus stay offlined.

    create_image() and resume_target_kernel() functions handle
    disable_nonboot_cpus() faults correctly, hibernation_platform_enter()
    is the only one which is doing it wrong.
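
The error-path pattern being fixed can be sketched in user-space C. Everything here is a hypothetical model: the sk_* functions, the four-CPU array and the failure injection variable only imitate the behaviour the changelog describes, where a failed disable leaves some CPUs offline until the enable call runs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: CPU 0 is the boot CPU, CPUs 1..3 are nonboot. */
#define SK_NR_CPUS 4

static bool sk_cpu_online[SK_NR_CPUS] = { true, true, true, true };
static int sk_fail_on_cpu = -1; /* set to a CPU number to inject a failure */

/* Offline nonboot CPUs one by one; stop at the first failure. */
static int sk_disable_nonboot_cpus(void)
{
    for (int cpu = 1; cpu < SK_NR_CPUS; cpu++) {
        if (cpu == sk_fail_on_cpu)
            return -EBUSY; /* CPUs offlined so far stay offline */
        sk_cpu_online[cpu] = false;
    }
    return 0;
}

/* Bring every nonboot CPU back online. */
static void sk_enable_nonboot_cpus(void)
{
    for (int cpu = 1; cpu < SK_NR_CPUS; cpu++)
        sk_cpu_online[cpu] = true;
}

/* Caller-side pattern: undo the partial offlining on the error path. */
static int sk_platform_enter(void)
{
    int error = sk_disable_nonboot_cpus();

    if (error) {
        sk_enable_nonboot_cpus();
        return error;
    }

    /* ... the platform transition itself would happen here ... */
    return 0;
}
```

Without the sk_enable_nonboot_cpus() call in the error branch, a failure on CPU 3 would leave CPUs 1 and 2 offline, which is the situation the changelog describes.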

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: Rafael J. Wysocki

    Vitaly Kuznetsov
     

19 May, 2015

2 commits

  • Stop abusing struct page functionality and the swap end_io handler, and
    instead add a modified version of the blk-lib.c bio_batch helpers.

    Also move the block I/O code into swap.c as they are directly tied into
    each other.

    Signed-off-by: Christoph Hellwig
    Tested-by: Pavel Machek
    Tested-by: Ming Lin
    Acked-by: Pavel Machek
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • If a wakeup source is found to be pending in the last stage of
    suspend after syscore suspend, then the machine won't suspend, but
    suspend_enter() will return 0. That is confusing, as wakeup detection
    elsewhere causes -EBUSY to be returned from suspend_enter().

    To avoid the confusion, make suspend_enter() return -EBUSY in that
    case too.

    Signed-off-by: Ruchi Kandoi
    [ rjw: Subject and changelog ]
    Signed-off-by: Rafael J. Wysocki

    Ruchi Kandoi
     

13 May, 2015

2 commits


10 Apr, 2015

1 commit


07 Apr, 2015

1 commit


18 Mar, 2015

1 commit

  • Occasionally, the system can't come back up after suspend/resume
    due to problems in the device suspending phase. This patch makes the
    PM_TRACE infrastructure cover the device suspending phase of the
    suspend/resume process, so the information in the RTC can tell
    developers which device suspending function makes the system hang.

    Signed-off-by: Zhonghui Fu
    Signed-off-by: Rafael J. Wysocki

    Zhonghui Fu
     

26 Feb, 2015

1 commit

  • When CONFIG_PM_DEBUG=y, we provide a sysfs file (/sys/power/pm_test) for
    selecting one of a few suspend test modes, where rather than entering a
    full suspend state, the kernel will perform some subset of suspend
    steps, wait 5 seconds, and then resume back to normal operation.

    This mode is useful for (among other things) observing the state of the
    system just before entering a sleep mode, for debugging or analysis
    purposes. However, a constant 5 second wait is not sufficient for some
    sorts of analysis; for example, on an SoC, one might want to use
    external tools to probe the power states of various on-chip controllers
    or clocks.

    This patch turns this 5 second delay into a configurable module
    parameter, so users can determine how long to wait in this
    pseudo-suspend state before resuming the system.

    Example (wait 30 seconds):

    # echo 30 > /sys/module/suspend/parameters/pm_test_delay
    # echo core > /sys/power/pm_test
    # time echo mem > /sys/power/state
    ...
    [ 17.583625] suspend debug: Waiting for 30 second(s).
    ...
    real 0m30.381s
    user 0m0.017s
    sys 0m0.080s

    Signed-off-by: Brian Norris
    Acked-by: Pavel Machek
    Reviewed-by: Kevin Cernekee
    Acked-by: Florian Fainelli
    Signed-off-by: Rafael J. Wysocki

    Brian Norris
     

14 Feb, 2015

1 commit

  • In preparation for adding support for quiescing timers in the final
    stage of suspend-to-idle transitions, rework the freeze_enter()
    function making the system wait on a wakeup event, the freeze_wake()
    function terminating the suspend-to-idle loop and the mechanism by
    which deep idle states are entered during suspend-to-idle.

    First of all, introduce a simple state machine for suspend-to-idle
    and make the code in question use it.

    Second, prevent freeze_enter() from losing wakeup events due to race
    conditions and ensure that the number of online CPUs won't change
    while it is being executed. In addition to that, make it force
    all of the CPUs re-enter the idle loop in case they are in idle
    states already (so they can enter deeper idle states if possible).

    Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
    checks in cpuidle_select() and cpuidle_reflect() with a single
    suspend-to-idle state check in cpuidle_idle_call().

    Finally, introduce cpuidle_enter_freeze() that will simply find the
    deepest idle state available to the given CPU and enter it using
    cpuidle_enter().
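
A simple state machine of the kind mentioned above might look like the following single-threaded sketch. It is an assumption-laden model, not the kernel's implementation: the state names and sk_* functions are hypothetical, and the real code would need locking and a wait queue. The point shown is that a wakeup recorded before the enter step is not lost.

```c
/* Hypothetical states for a suspend-to-idle model. */
enum sk_freeze_state {
    SK_FREEZE_NONE,  /* not in suspend-to-idle */
    SK_FREEZE_ENTER, /* waiting in the idle loop for a wakeup */
    SK_FREEZE_WAKE   /* a wakeup event has been recorded */
};

static enum sk_freeze_state sk_state = SK_FREEZE_NONE;

/* Reset the machine at the start of a suspend-to-idle attempt. */
static void sk_freeze_begin(void)
{
    sk_state = SK_FREEZE_NONE;
}

/*
 * Try to enter the idle loop. Returns 1 if the caller should wait for
 * a wakeup, or 0 if a wakeup was already recorded and must not be lost.
 */
static int sk_freeze_enter(void)
{
    if (sk_state == SK_FREEZE_WAKE)
        return 0;

    sk_state = SK_FREEZE_ENTER;
    return 1;
}

/* Record a wakeup event; this terminates the suspend-to-idle loop. */
static void sk_freeze_wake(void)
{
    sk_state = SK_FREEZE_WAKE;
}
```

Keeping the decision in one state variable means the enter and wake paths agree on whether a wakeup has happened, regardless of which one runs first in this model.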

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
     

12 Feb, 2015

2 commits

  • Commit 5695be142e20 ("OOM, PM: OOM killed task shouldn't escape PM
    suspend") has left a race window when OOM killer manages to
    note_oom_kill after freeze_processes checks the counter. The race
    window is quite small and really unlikely, and a partial solution was
    deemed sufficient at the time of submission.

    Tejun wasn't happy about this partial solution though and insisted on a
    full solution. That requires the full OOM and freezer's task freezing
    exclusion, though. This is done by this patch which introduces oom_sem
    RW lock and turns oom_killer_disable() into a full OOM barrier.

    oom_killer_disabled check is moved from the allocation path to the OOM
    level and we take oom_sem for reading for both the check and the whole
    OOM invocation.

    oom_killer_disable() takes oom_sem for writing so it waits for all
    currently running OOM killer invocations. Then it disables all further
    OOMs by setting oom_killer_disabled and checks for any oom victims.
    Victims are counted via mark_tsk_oom_victim resp. unmark_oom_victim. The
    last victim wakes up all waiters enqueued by oom_killer_disable().
    Therefore this function acts as the full OOM barrier.

    The page fault path is covered now as well although it was assumed to be
    safe before. As per Tejun, "We used to have freezing points deep in file
    system code which may be reacheable from page fault." so it would be
    better and more robust to not rely on freezing points here. Same applies
    to the memcg OOM killer.

    out_of_memory tells the caller whether the OOM was allowed to trigger and
    the callers are supposed to handle the situation. The page allocation
    path simply fails the allocation same as before. The page fault path will
    retry the fault (more on that later) and Sysrq OOM trigger will simply
    complain to the log.

    Normally there wouldn't be any unfrozen user tasks after
    try_to_freeze_tasks so the function will not block. But if there was an
    OOM killer racing with try_to_freeze_tasks and the OOM victim didn't
    finish yet then we have to wait for it. This should complete in a finite
    time, though, because

    - the victim cannot loop in the page fault handler (it would die
    on the way out from the exception)
    - it cannot loop in the page allocator because all the further
    allocation would fail and __GFP_NOFAIL allocations are not
    acceptable at this stage
    - it shouldn't be blocked on any locks held by frozen tasks
    (try_to_freeze expects lockless context) and kernel threads and
    work queues are not frozen yet

    Signed-off-by: Michal Hocko
    Suggested-by: Tejun Heo
    Cc: David Rientjes
    Cc: Johannes Weiner
    Cc: Oleg Nesterov
    Cc: Cong Wang
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • While touching this area let's convert printk to pr_*. This also makes
    the printing of continuation lines done properly.

    Signed-off-by: Michal Hocko
    Acked-by: Tejun Heo
    Cc: David Rientjes
    Cc: Johannes Weiner
    Cc: Oleg Nesterov
    Cc: Cong Wang
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

11 Feb, 2015

1 commit

  • Pull ACPI and power management updates from Rafael Wysocki:
    "We have a few new features this time, including a new SFI-based
    cpufreq driver, a new devfreq driver for Tegra Activity Monitor, a new
    devfreq class for providing its governors with raw utilization data
    and a new ACPI driver for AMD SoCs.

    Still, the majority of changes here are reworks of existing code to
    make it more straightforward or to prepare it for implementing new
    features on top of it. The primary example is the rework of ACPI
    resources handling from Jiang Liu, Thomas Gleixner and Lv Zheng with
    support for IOAPIC hotplug implemented on top of it, but there is
    quite a number of changes of this kind in the cpufreq core, ACPICA,
    ACPI EC driver, ACPI processor driver and the generic power domains
    core code too.

    The most active developer is Viresh Kumar with his cpufreq changes.

    Specifics:

    - Rework of the core ACPI resources parsing code to fix issues in it
    and make using resource offsets more convenient and consolidation
    of some resource-handling code in a couple of places that have grown
    analogous data structures and code to cover the same gap in the
    core (Jiang Liu, Thomas Gleixner, Lv Zheng).

    - ACPI-based IOAPIC hotplug support on top of the resources handling
    rework (Jiang Liu, Yinghai Lu).

    - ACPICA update to upstream release 20150204 including an interrupt
    handling rework that allows drivers to install raw handlers for
    ACPI GPEs which then become entirely responsible for the given GPE
    and the ACPICA core code won't touch it (Lv Zheng, David E Box,
    Octavian Purdila).

    - ACPI EC driver rework to fix several concurrency issues and other
    problems related to events handling on top of the ACPICA's new
    support for raw GPE handlers (Lv Zheng).

    - New ACPI driver for AMD SoCs analogous to the LPSS (Low-Power
    Subsystem) driver for Intel chips (Ken Xue).

    - Two minor fixes of the ACPI LPSS driver (Heikki Krogerus, Jarkko
    Nikula).

    - Two new blacklist entries for machines (Samsung 730U3E/740U3E and
    510R) where the native backlight interface doesn't work correctly
    while the ACPI one does (Hans de Goede).

    - Rework of the ACPI processor driver's handling of idle states to
    make the code more straightforward and less bloated overall (Rafael
    J Wysocki).

    - Assorted minor fixes related to ACPI and SFI (Andreas Ruprecht,
    Andy Shevchenko, Hanjun Guo, Jan Beulich, Rafael J Wysocki, Yaowei
    Bai).

    - PCI core power management modification to avoid resuming (some)
    runtime-suspended devices during system suspend if they are in the
    right states already (Rafael J Wysocki).

    - New SFI-based cpufreq driver for Intel platforms using SFI
    (Srinidhi Kasagar).

    - cpufreq core fixes, cleanups and simplifications (Viresh Kumar,
    Doug Anderson, Wolfram Sang).

    - SkyLake CPU support and other updates for the intel_pstate driver
    (Kristen Carlson Accardi, Srinivas Pandruvada).

    - cpufreq-dt driver cleanup (Markus Elfring).

    - Init fix for the ARM big.LITTLE cpuidle driver (Sudeep Holla).

    - Generic power domains core code fixes and cleanups (Ulf Hansson).

    - Operating Performance Points (OPP) core code cleanups and kernel
    documentation update (Nishanth Menon).

    - New debugfs interface to make the list of PM QoS constraints
    available to user space (Nishanth Menon).

    - New devfreq driver for Tegra Activity Monitor (Tomeu Vizoso).

    - New devfreq class (devfreq_event) to provide raw utilization data
    to devfreq governors (Chanwoo Choi).

    - Assorted minor fixes and cleanups related to power management
    (Andreas Ruprecht, Krzysztof Kozlowski, Rickard Strandqvist, Pavel
    Machek, Todd E Brandt, Wonhong Kwon).

    - turbostat updates (Len Brown) and cpupower Makefile improvement
    (Sriram Raghunathan)"

    * tag 'pm+acpi-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (151 commits)
    tools/power turbostat: relax dependency on APERF_MSR
    tools/power turbostat: relax dependency on invariant TSC
    Merge branch 'pci/host-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into acpi-resources
    tools/power turbostat: decode MSR_*_PERF_LIMIT_REASONS
    tools/power turbostat: relax dependency on root permission
    ACPI / video: Add disable_native_backlight quirk for Samsung 510R
    ACPI / PM: Remove unneeded nested #ifdef
    USB / PM: Remove unneeded #ifdef and associated dead code
    intel_pstate: provide option to only use intel_pstate with HWP
    ACPI / EC: Add GPE reference counting debugging messages
    ACPI / EC: Add query flushing support
    ACPI / EC: Refine command storm prevention support
    ACPI / EC: Add command flushing support.
    ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag
    ACPI: add AMD ACPI2Platform device support for x86 system
    ACPI / table: remove duplicate NULL check for the handler of acpi_table_parse()
    ACPI / EC: Update revision due to raw handler mode.
    ACPI / EC: Reduce ec_poll() by referencing the last register access timestamp.
    ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
    ACPICA: Events: Enable APIs to allow interrupt/polling adaptive request based GPE handling model
    ...

    Linus Torvalds
     

10 Feb, 2015

1 commit

  • * pm-sleep:
    PM / hibernate: exclude freed pages from allocated pages printout
    PM / sleep: export suspend_resume trace event
    PM / sleep: Mention async suspend in PM_TRACE documentation
    PM / hibernate: Remove unused function

    * pm-runtime:
    ACPI / PM: Remove unneeded nested #ifdef
    USB / PM: Remove unneeded #ifdef and associated dead code

    Rafael J. Wysocki
     

04 Feb, 2015

1 commit


24 Jan, 2015

2 commits

  • Remove the function get_safe_write_buffer() that is not used anywhere.

    This was partially found by using a static code analysis program called cppcheck.

    Signed-off-by: Rickard Strandqvist
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki

    Rickard Strandqvist
     
  • PM QoS requests are notoriously hard to debug and made even
    more so due to their highly dynamic nature. Having visibility
    into the internal data representation per constraint allows
    us to have much better appreciation of potential issues or
    bad usage by drivers in the system.

    So introduce for all classes of PM QoS, an entry in
    /sys/kernel/debug/pm_qos that shall show all the current
    requests as well as the snapshot of the value these requests
    boil down to. For example:

    ==> /sys/kernel/debug/pm_qos/cpu_dma_latency
    ==> /sys/kernel/debug/pm_qos/memory_bandwidth

    Signed-off-by: Dave Gerlach
    Acked-by: Kevin Hilman
    Signed-off-by: Rafael J. Wysocki

    Nishanth Menon
     

07 Jan, 2015

1 commit

  • SRCU does not need to be compiled by default in all cases. For tinification
    efforts, not compiling SRCU unless necessary is desirable.

    The current patch tries to make compiling SRCU optional by introducing a new
    Kconfig option CONFIG_SRCU which is selected when any of the components making
    use of SRCU are selected.

    If we do not select CONFIG_SRCU, srcu.o will not be compiled at all.

       text    data     bss     dec     hex filename
       2007       0       0    2007     7d7 kernel/rcu/srcu.o

    Size of arch/powerpc/boot/zImage changes from

       text    data     bss     dec     hex filename
     831552   64180   23944  919676   e087c arch/powerpc/boot/zImage : before
     829504   64180   23952  917636   e0084 arch/powerpc/boot/zImage : after

    so the savings are about 2000 bytes.

    Signed-off-by: Pranith Kumar
    CC: Paul E. McKenney
    CC: Josh Triplett
    CC: Lai Jiangshan
    Signed-off-by: Paul E. McKenney
    [ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]

    Pranith Kumar
     

20 Dec, 2014

1 commit

  • Having switched over all of the users of CONFIG_PM_RUNTIME to use
    CONFIG_PM directly, turn the latter into a user-selectable option
    and drop the former entirely from the tree.

    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Ulf Hansson
    Acked-by: Kevin Hilman

    Rafael J. Wysocki
     

09 Dec, 2014

2 commits

  • * pm-runtime: (25 commits)
    i2c-omap / PM: Drop CONFIG_PM_RUNTIME from i2c-omap.c
    dmaengine / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    drivers: sh / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME
    MMC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    MFD / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    misc / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    media / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    input / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    iio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    hsi / OMAP / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    i2c-hid / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    drm / exynos / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    gpio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    hwrandom / exynos / PM: Use CONFIG_PM in #ifdef
    block / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    USB / PM: Drop CONFIG_PM_RUNTIME from the USB core
    PM: Merge the SET*_RUNTIME_PM_OPS() macros
    PM / Kconfig: Do not select PM directly from Kconfig files
    PCI / PM: Drop CONFIG_PM_RUNTIME from the PCI core
    ...

    Rafael J. Wysocki
     
  • * pm-domains:
    ARM: shmobile: Convert to genpd flags for PM clocks for R-mobile
    ARM: shmobile: Convert to genpd flags for PM clocks for r8a7779
    PM / Domains: Initial PM clock support for genpd
    PM / Domains: Power on the PM domain right after attach completes
    PM / Domains: Move struct pm_domain_data to pm_domain.h
    PM / Domains: Extract code to power off/on a PM domain
    PM / Domains: Make genpd parameter of pm_genpd_present() const

    * pm-sleep:
    PM / hibernate: Deletion of an unnecessary check before the function call "vfree"
    PM / Hibernate: Migrate to ktime_t

    * pm-tools:
    tools: cpupower: fix return checks for sysfs_get_idlestate_count()

    Rafael J. Wysocki
     

04 Dec, 2014

1 commit

  • After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
    selected) PM_RUNTIME is always set if PM is set, so quite a few #ifdefs
    depending on CONFIG_PM_RUNTIME may now depend on CONFIG_PM or even may be
    dropped entirely in some cases.

    Replace CONFIG_PM_RUNTIME with CONFIG_PM in the PM core code.

    Reviewed-by: Ulf Hansson
    Acked-by: Kevin Hilman
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

27 Nov, 2014

1 commit


19 Nov, 2014

1 commit

  • The number of and dependencies between high-level power management
    Kconfig options make life much harder than necessary. Several
    combinations of them have to be tested and supported, even though
    some of those combinations are very rarely used in practice (if
    they are used in practice at all). Moreover, the fact that we
    have separate independent Kconfig options for runtime PM and
    system suspend is a serious obstacle for integration between
    the two frameworks.

    To overcome these difficulties, always select PM_RUNTIME if PM_SLEEP
    is set. Among other things, this will allow system suspend callbacks
    provided by bus types and device drivers to rely on the runtime PM
    framework regardless of the kernel configuration.

    Enthusiastically-acked-by: Kevin Hilman
    Tested-by: Geert Uytterhoeven
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

18 Nov, 2014

3 commits


15 Nov, 2014

1 commit

  • The IA64_HP_SIM dependency on PM_RUNTIME should be done in the arch
    Kconfig instead of in the PM core. Move it accordingly.

    NOTE: arch/ia64/Kconfig currently does a 'select PM', which since
    commit 1eb208aea317 (PM: Make CONFIG_PM depend on (CONFIG_PM_SLEEP ||
    CONFIG_PM_RUNTIME)) is effectively a noop unless PM_SLEEP or
    PM_RUNTIME are set elsewhere.

    Signed-off-by: Kevin Hilman
    Reviewed-by: Ulf Hansson
    Signed-off-by: Rafael J. Wysocki

    Kevin Hilman