21 Mar, 2020

1 commit

  • seqlock consists of a sequence counter and a spinlock_t which is used to
    serialize the writers. spinlock_t is substituted by a "sleeping" spinlock
    on PREEMPT_RT enabled kernels which breaks the usage in the timekeeping
    code as the writers are executed in hard interrupt and therefore
    non-preemptible context even on PREEMPT_RT.

    The spinlock in seqlock cannot be unconditionally replaced by a
    raw_spinlock_t as many seqlock users have nesting spinlock sections or
    other code which is not suitable to run in truly atomic context on RT.

    Instead of providing a raw_seqlock API for a single use case, open code the
    seqlock for the jiffies use case and implement it with a raw_spinlock_t and
    a sequence counter.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200321113242.120587764@linutronix.de

    Thomas Gleixner
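
    For illustration, a minimal sketch of the open-coded scheme described in
    the entry above (simplified; the helper name is hypothetical and lockdep,
    alignment and the real update logic are omitted):

        /* Writer side -- the jiffies update runs in hard interrupt context,
         * so a raw_spinlock_t (never sleeping, even on RT) plus a plain
         * sequence counter replace the former seqlock_t. */
        static DEFINE_RAW_SPINLOCK(jiffies_lock);
        static seqcount_t jiffies_seq = SEQCNT_ZERO(jiffies_seq);

        static void do_update_jiffies(u64 delta)
        {
                raw_spin_lock(&jiffies_lock);
                write_seqcount_begin(&jiffies_seq);
                jiffies_64 += delta;
                write_seqcount_end(&jiffies_seq);
                raw_spin_unlock(&jiffies_lock);
        }

        /* Reader side -- lockless retry loop against the sequence counter. */
        u64 get_jiffies_64(void)
        {
                unsigned int seq;
                u64 ret;

                do {
                        seq = read_seqcount_begin(&jiffies_seq);
                        ret = jiffies_64;
                } while (read_seqcount_retry(&jiffies_seq, seq));
                return ret;
        }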
     

16 Jan, 2020

1 commit

  • Suspend to IDLE invokes tick_unfreeze() on resume. tick_unfreeze() on the
    first resuming CPU resumes timekeeping, which also has the side effect of
    resetting the softlockup watchdog on this CPU.

    But on the secondary CPUs the watchdog is not reset in the resume /
    unfreeze() path, which can result in false softlockup warnings on those
    CPUs depending on the time spent in suspend.

    Prevent this by resetting the softlockup watchdog in the unfreeze path on
    the secondary resuming CPUs as well (see the sketch after this entry).

    [ tglx: Massaged changelog ]

    Signed-off-by: Chunyan Zhang
    Signed-off-by: Thomas Gleixner
    Link: https://lore.kernel.org/r/20200110083902.27276-1-chunyan.zhang@unisoc.com

    Chunyan Zhang
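
    A rough sketch of the resulting tick_unfreeze() flow (simplified; the real
    function also handles tracing, sched_clock and system_state, so treat the
    exact shape as an assumption):

        void tick_unfreeze(void)
        {
                raw_spin_lock(&tick_freeze_lock);

                if (tick_freeze_depth == num_online_cpus()) {
                        /* First CPU to resume: restarting timekeeping already
                         * resets the softlockup watchdog on this CPU. */
                        timekeeping_resume();
                } else {
                        /* Secondary CPUs: reset the watchdog explicitly so the
                         * time spent in suspend is not counted against them. */
                        touch_softlockup_watchdog();
                        tick_resume_local();
                }

                tick_freeze_depth--;
                raw_spin_unlock(&tick_freeze_lock);
        }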
     

07 May, 2019

1 commit

  • Pull timer updates from Ingo Molnar:
    "This cycle had the following changes:

    - Timer tracing improvements (Anna-Maria Gleixner)

    - Continued tasklet reduction work: remove the hrtimer_tasklet
    (Thomas Gleixner)

    - Fix CPU hotplug remove race in the tick-broadcast mask handling
    code (Thomas Gleixner)

    - Force an upper bound for setting CLOCK_REALTIME, to fix ABI
    inconsistencies in handling values close to the maximum supported and
    the vagueness of when uptime-related wraparound might occur. Make the
    year 2232 the consistent maximum across all relevant ABIs and APIs.
    (Thomas Gleixner)

    - various cleanups and smaller fixes"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    tick: Fix typos in comments
    tick/broadcast: Fix warning about undefined tick_broadcast_oneshot_offline()
    timekeeping: Force upper bound for setting CLOCK_REALTIME
    timer/trace: Improve timer tracing
    timer/trace: Replace deprecated vsprintf pointer extension %pf by %ps
    timer: Move trace point to get proper index
    tick/sched: Update tick_sched struct documentation
    tick: Remove outgoing CPU from broadcast masks
    timekeeping: Consistently use unsigned int for seqcount snapshot
    softirq: Remove tasklet_hrtimer
    xfrm: Replace hrtimer tasklet with softirq hrtimer
    mac80211_hwsim: Replace hrtimer tasklet with softirq hrtimer

    Linus Torvalds
     

04 May, 2019

1 commit

  • Allow the boot CPU/CPU0 to be nohz_full. Have the boot CPU take the
    do_timer duty during boot until a housekeeping CPU can take over.

    This is supported when CONFIG_PM_SLEEP_SMP is not configured, or when
    it is configured and the arch allows suspend on non-zero CPUs.

    nohz_full has been trialed at a large supercomputer site and found to
    significantly reduce jitter. In order to deploy it in production, they
    need CPU0 to be nohz_full because their job control system requires
    the application CPUs to start from 0, and the housekeeping CPUs are
    placed higher. An equivalent job-scheduling setup that uses CPU0 for
    housekeeping could be achieved by modifying their system, but it is
    preferable if nohz_full can support their environment without
    modification.

    Signed-off-by: Nicholas Piggin
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rafael J . Wysocki
    Cc: Thomas Gleixner
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lkml.kernel.org/r/20190411033448.20842-6-npiggin@gmail.com
    Signed-off-by: Ingo Molnar

    Nicholas Piggin
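
    In practical terms this makes a boot command line like the one below valid
    (CPU count and numbers purely illustrative; at least one CPU must remain
    for housekeeping):

        nohz_full=0-6

    On an 8-CPU machine CPU 7 would then do the housekeeping, while the boot
    CPU keeps the do_timer duty only until a housekeeping CPU is up.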
     

18 Apr, 2019

1 commit

  • tick_freeze() introduced by suspend-to-idle in commit 124cf9117c5f ("PM /
    sleep: Make it possible to quiesce timers during suspend-to-idle") uses
    timekeeping_suspend() instead of syscore_suspend() during
    suspend-to-idle. As a consequence generic sched_clock will keep going
    because sched_clock_suspend() and sched_clock_resume() are not invoked
    during suspend-to-idle which can result in a generic sched_clock wrap.

    On an ARM system with suspend-to-idle enabled, sched_clock is registered
    as "56 bits at 13MHz, resolution 76ns, wraps every 4398046511101ns", which
    means the real wrapping duration is 8796093022202ns.

    [ 134.551779] suspend-to-idle suspend (timekeeping_suspend())
    [ 1204.912239] suspend-to-idle resume (timekeeping_resume())
    ......
    [ 1206.912239] suspend-to-idle suspend (timekeeping_suspend())
    [ 5880.502807] suspend-to-idle resume (timekeeping_resume())
    ......
    [ 6000.403724] suspend-to-idle suspend (timekeeping_suspend())
    [ 8035.753167] suspend-to-idle resume (timekeeping_resume())
    ......
    [ 8795.786684] (2)[321:charger_thread]......
    [ 8795.788387] (2)[321:charger_thread]......
    [ 0.057226] (0)[0:swapper/0]......
    [ 0.061447] (2)[0:swapper/2]......

    sched_clock was not stopped during suspend-to-idle, and the sched_clock_poll
    hrtimer did not expire because timekeeping_suspend() was invoked during
    suspend-to-idle. This makes sched_clock wrap at kernel time 8796s.

    To prevent this, invoke sched_clock_suspend() and sched_clock_resume() in
    tick_freeze() together with timekeeping_suspend() and timekeeping_resume().

    Fixes: 124cf9117c5f (PM / sleep: Make it possible to quiesce timers during suspend-to-idle)
    Signed-off-by: Chang-An Chen
    Signed-off-by: Thomas Gleixner
    Cc: Frederic Weisbecker
    Cc: Matthias Brugger
    Cc: John Stultz
    Cc: Kees Cook
    Cc: Corey Minyard
    Cc: Stanley Chu
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/1553828349-8914-1-git-send-email-chang-an.chen@mediatek.com

    Chang-An Chen
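
    A sketch of the fix as described above (simplified; trace points, the
    system_state handling and the surrounding locking are omitted, so the
    exact placement is an assumption):

        /* tick_freeze(): last CPU going into suspend-to-idle */
        if (tick_freeze_depth == num_online_cpus()) {
                sched_clock_suspend();          /* stop generic sched_clock too */
                timekeeping_suspend();
        } else {
                tick_suspend_local();
        }

        /* tick_unfreeze(): first CPU coming out of suspend-to-idle */
        if (tick_freeze_depth == num_online_cpus()) {
                timekeeping_resume();
                sched_clock_resume();           /* re-arm the sched_clock_poll hrtimer */
        } else {
                tick_resume_local();
        }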
     

23 Mar, 2019

1 commit

  • The timekeeping code uses a random mix of "unsigned long" and "unsigned
    int" for the seqcount snapshots (ratio 14:12). Since the seqlock.h API is
    entirely based on unsigned int, use that throughout.

    Signed-off-by: Rasmus Villemoes
    Signed-off-by: Thomas Gleixner
    Cc: Frederic Weisbecker
    Cc: John Stultz
    Cc: Stephen Boyd
    Link: https://lkml.kernel.org/r/20190318195557.20773-1-linux@rasmusvillemoes.dk

    Rasmus Villemoes
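
    The pattern in question, roughly (tk_core and timekeeping_get_ns() are
    names from kernel/time/timekeeping.c; the call site is illustrative):

        unsigned int seq;       /* previously "unsigned long" in a number of places */
        u64 nsecs;

        do {
                seq = read_seqcount_begin(&tk_core.seq);   /* returns unsigned int */
                nsecs = timekeeping_get_ns(&tk_core.timekeeper.tkr_mono);
        } while (read_seqcount_retry(&tk_core.seq, seq));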
     

23 Nov, 2018

3 commits

  • "For licencing details see kernel-base/COPYING" and similar license
    references have no value over the SPDX identifier. Remove them.

    Signed-off-by: Thomas Gleixner
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Peter Anvin
    Cc: Russell King
    Cc: Richard Cochran
    Cc: "Paul E. McKenney"
    Cc: Nicolas Pitre
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Link: https://lkml.kernel.org/r/20181031182252.963632760@linutronix.de

    Thomas Gleixner
     
  • Update the time(r) core files with the correct SPDX license identifier
    based on the license text in the file itself. The SPDX identifier is a
    legally binding shorthand, which can be used instead of the full
    boilerplate text.

    This work is based on a script and data from Philippe Ombredanne, Kate
    Stewart and myself. The data has been created with two independent license
    scanners and manual inspection.

    The following files do not contain any direct license information and have
    been omitted from the big initial SPDX changes:

    timeconst.bc: The .bc files were not touched
    time.c, timer.c, timekeeping.c: Licence was deduced from EXPORT_SYMBOL_GPL

    As those files do not contain direct license references they fall under the
    project license, i.e. GPL V2 only.

    Signed-off-by: Thomas Gleixner
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Russell King
    Cc: Richard Cochran
    Cc: Nicolas Pitre
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Link: https://lkml.kernel.org/r/20181031182252.879109557@linutronix.de

    Thomas Gleixner
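
    For reference, the shorthand in question is a single comment line at the
    top of each file, e.g. for these GPL v2 only files:

        // SPDX-License-Identifier: GPL-2.0

    which replaces the multi-paragraph license boilerplate that used to sit in
    the file header.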
     
  • Remove the pointless filenames in the top level comments. They have no
    value at all and just occupy space. While at it tidy up some of the
    comments and remove a stale one.

    Signed-off-by: Thomas Gleixner
    Acked-by: Nicolas Pitre
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Peter Anvin
    Cc: Russell King
    Cc: Richard Cochran
    Cc: "Paul E. McKenney"
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Link: https://lkml.kernel.org/r/20181031182252.794898238@linutronix.de

    Thomas Gleixner
     

11 Jul, 2018

1 commit

  • This reverts commit 1332a90558013ae4242e3dd7934bdcdeafb06c0d.

    The original issue was not caused by the cpumask equality check between
    the new and the old tick device. It was incorrectly analysed due to a
    misunderstanding of the comment and a misinterpretation of the return
    value of tick_check_preferred(). The actual issue is with the clockevent
    driver, which sets its cpumask to cpu_all_mask instead of cpu_possible_mask.

    Signed-off-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Tested-by: Kevin Hilman
    Tested-by: Martin Blumenstingl
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: Marc Zyngier
    Link: https://lkml.kernel.org/r/1531151136-18297-1-git-send-email-sudeep.holla@arm.com

    Sudeep Holla
     

06 Jun, 2018

1 commit

  • Pull power management updates from Rafael Wysocki:
    "These include a significant update of the generic power domains
    (genpd) and Operating Performance Points (OPP) frameworks, mostly
    related to the introduction of power domain performance levels,
    cpufreq updates (new driver for Qualcomm Kryo processors, updates of
    the existing drivers, some core fixes, schedutil governor
    improvements), PCI power management fixes, ACPI workaround for
    EC-based wakeup events handling on resume from suspend-to-idle, and
    major updates of the turbostat and pm-graph utilities.

    Specifics:

    - Introduce power domain performance levels into the generic
    power domains (genpd) and Operating Performance Points (OPP)
    frameworks (Viresh Kumar, Rajendra Nayak, Dan Carpenter).

    - Fix two issues in the runtime PM framework related to the
    initialization and removal of devices using device links (Ulf
    Hansson).

    - Clean up the initialization of drivers for devices in PM domains
    (Ulf Hansson, Geert Uytterhoeven).

    - Fix a cpufreq core issue related to the policy sysfs interface
    causing CPU online to fail for CPUs sharing one cpufreq policy in
    some situations (Tao Wang).

    - Make it possible to use platform-specific suspend/resume hooks in
    the cpufreq-dt driver and make the Armada 37xx DVFS use that
    feature (Viresh Kumar, Miquel Raynal).

    - Optimize policy transition notifications in cpufreq (Viresh Kumar).

    - Improve the iowait boost mechanism in the schedutil cpufreq
    governor (Patrick Bellasi).

    - Improve the handling of deferred frequency updates in the schedutil
    cpufreq governor (Joel Fernandes, Dietmar Eggemann, Rafael Wysocki,
    Viresh Kumar).

    - Add a new cpufreq driver for Qualcomm Kryo (Ilia Lin).

    - Fix and clean up some cpufreq drivers (Colin Ian King, Dmitry
    Osipenko, Doug Smythies, Luc Van Oostenryck, Simon Horman, Viresh
    Kumar).

    - Fix the handling of PCI devices with the DPM_SMART_SUSPEND flag set
    and update stale comments in the PCI core PM code (Rafael Wysocki).

    - Work around an issue related to the handling of EC-based wakeup
    events in the ACPI PM core during resume from suspend-to-idle if
    the EC has been put into the low-power mode (Rafael Wysocki).

    - Improve the handling of wakeup source objects in the PM core (Doug
    Berger, Mahendran Ganesh, Rafael Wysocki).

    - Update the driver core to prevent deferred probe from breaking
    suspend/resume ordering (Feng Kan).

    - Clean up the PM core somewhat (Bjorn Helgaas, Ulf Hansson, Rafael
    Wysocki).

    - Make the core suspend/resume code and cpufreq support the RT patch
    (Sebastian Andrzej Siewior, Thomas Gleixner).

    - Consolidate the PM QoS handling in cpuidle governors (Rafael
    Wysocki).

    - Fix a possible crash in the hibernation core (Tetsuo Handa).

    - Update the rockchip-io Adaptive Voltage Scaling (AVS) driver (David
    Wu).

    - Update the turbostat utility (fixes, cleanups, new CPU IDs, new
    command line options, built-in "Low Power Idle" counters support,
    new POLL and POLL% columns) and add an entry for it to MAINTAINERS
    (Len Brown, Artem Bityutskiy, Chen Yu, Laura Abbott, Matt Turner,
    Prarit Bhargava, Srinivas Pandruvada).

    - Update the pm-graph to version 5.1 (Todd Brandt).

    - Update the intel_pstate_tracer utility (Doug Smythies)"

    * tag 'pm-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (128 commits)
    tools/power turbostat: update version number
    tools/power turbostat: Add Node in output
    tools/power turbostat: add node information into turbostat calculations
    tools/power turbostat: remove num_ from cpu_topology struct
    tools/power turbostat: rename num_cores_per_pkg to num_cores_per_node
    tools/power turbostat: track thread ID in cpu_topology
    tools/power turbostat: Calculate additional node information for a package
    tools/power turbostat: Fix node and siblings lookup data
    tools/power turbostat: set max_num_cpus equal to the cpumask length
    tools/power turbostat: if --num_iterations, print for specific number of iterations
    tools/power turbostat: Add Cannon Lake support
    tools/power turbostat: delete duplicate #defines
    x86: msr-index.h: Correct SNB_C1/C3_AUTO_UNDEMOTE defines
    tools/power turbostat: Correct SNB_C1/C3_AUTO_UNDEMOTE defines
    tools/power turbostat: add POLL and POLL% column
    tools/power turbostat: Fix --hide Pk%pc10
    tools/power turbostat: Build-in "Low Power Idle" counters support
    tools/power turbostat: Don't make man pages executable
    tools/power turbostat: remove blank lines
    tools/power turbostat: a small C-states dump readability immprovement
    ...

    Linus Torvalds
     

27 May, 2018

1 commit

  • timekeeping suspend/resume calls read_persistent_clock() which takes
    rtc_lock. That results in might sleep warnings because at that point
    we run with interrupts disabled.

    We cannot convert rtc_lock to a raw spinlock as that would trigger
    other might sleep warnings.

    As a workaround we disable the might sleep warnings by setting
    system_state to SYSTEM_SUSPEND before calling sysdev_suspend() and
    restoring it to SYSTEM_RUNNING after sysdev_resume(). There is no lock
    contention because hibernate / suspend to RAM is single-CPU at this
    point.

    In the s2idle case the system_state is set to SYSTEM_SUSPEND before
    timekeeping_suspend(), which is invoked by the last CPU to freeze. On
    resume it is set back to SYSTEM_RUNNING after timekeeping_resume(), which
    is invoked by the first CPU to unfreeze. The other CPUs will block on
    tick_freeze_lock.

    Signed-off-by: Thomas Gleixner
    [bigeasy: cover s2idle in tick_freeze() / tick_unfreeze()]
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Rafael J. Wysocki

    Thomas Gleixner
     

13 May, 2018

1 commit

  • Checking only the equality of the cpumasks of the new and the old tick
    device doesn't ensure that the new one is a CPU-local device. This causes
    an issue if a low-rating clockevent tick device is registered first,
    followed by the registration of a higher-rating clockevent tick device.

    In that case the clockevents_released list never gets emptied because both
    devices get selected as the preferred one, and we loop forever in
    clockevents_notify_released().

    Signed-off-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Frederic Weisbecker
    Link: https://lkml.kernel.org/r/1525881728-4858-1-git-send-email-sudeep.holla@arm.com

    Sudeep Holla
     

26 Apr, 2018

1 commit

  • Revert commits

    92af4dcb4e1c ("tracing: Unify the "boot" and "mono" tracing clocks")
    127bfa5f4342 ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
    7250a4047aa6 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
    d6c7270e913d ("timekeeping: Remove boot time specific code")
    f2d6fdbfd238 ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
    d6ed449afdb3 ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
    72199320d49d ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")

    As stated in the pull request for the unification of CLOCK_MONOTONIC and
    CLOCK_BOOTTIME, it was clear that we might have to revert the change.

    As reported by several folks, systemd and other applications rely on the
    documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
    changes. After resume, daemons time out and other timeout-related issues
    are observed. Rafael compiled this list:

    * systemd kills daemons on resume, after >WatchdogSec seconds
    of suspending (Genki Sky). [Verified that that's because systemd uses
    CLOCK_MONOTONIC and expects it to not include the suspend time.]

    * systemd-journald misbehaves after resume:
    systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
    corrupted or uncleanly shut down, renaming and replacing.
    (Mike Galbraith).

    * NetworkManager reports "networking disabled" and networking is broken
    after resume 50% of the time (Pavel). [May be because of systemd.]

    * MATE desktop dims the display and starts the screensaver right after
    system resume (Pavel).

    * Full system hang during resume (me). [May be due to systemd or NM or both.]

    That happens on Debian and openSUSE systems.

    It's sad that these problems were neither caught in -next nor by those
    folks who expressed interest in this change.

    Reported-by: Rafael J. Wysocki
    Reported-by: Genki Sky ,
    Reported-by: Pavel Machek
    Signed-off-by: Thomas Gleixner
    Cc: Dmitry Torokhov
    Cc: John Stultz
    Cc: Jonathan Corbet
    Cc: Kevin Easton
    Cc: Linus Torvalds
    Cc: Mark Salyzyn
    Cc: Michael Kerrisk
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: Prarit Bhargava
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt

    Thomas Gleixner
     

13 Mar, 2018

1 commit

  • The MONOTONIC clock is not fast forwarded by the time spent in suspend on
    resume. This is only done for the BOOTTIME clock. The reason why the
    MONOTONIC clock is not forwarded is historical: the original Linux
    implementation was using jiffies as a base for the MONOTONIC clock and
    jiffies have never been advanced after resume.

    At some point when timekeeping was unified in the core code, the
    MONOTONIC clock was advanced after resume, which also advanced jiffies and
    caused interesting side effects. As a consequence the MONOTONIC clock
    forwarding was disabled again and the BOOTTIME clock was introduced, which
    allows reading the time since boot.

    Back then it was not possible to completely disentangle the MONOTONIC clock and
    jiffies because there were still interfaces which exposed the MONOTONIC clock
    behaviour based on the timer wheel and therefore jiffies.

    As of today none of the MONOTONIC clock facilities depends on jiffies
    anymore, so the forwarding can be done separately. This is achieved by
    forwarding the variables which are used for the jiffies update after resume,
    before the tick is restarted.

    In timekeeping resume, the change is rather simple. Instead of updating the
    offset between the MONOTONIC clock and the REALTIME/BOOTTIME clocks, advance the
    timekeeper base for the MONOTONIC and the MONOTONIC_RAW clocks by the time
    spent in suspend.

    The MONOTONIC clock is now the same as the BOOTTIME clock and the offset between
    the REALTIME and the MONOTONIC clocks is the same as before suspend.

    There might be side effects in applications, which rely on the
    (unfortunately) well documented behaviour of the MONOTONIC clock, but the
    downsides of the existing behaviour are probably worse.

    There is one obvious issue. Up to now it was possible to retrieve the time
    spent in suspend by observing the delta between the MONOTONIC clock and the
    BOOTTIME clock. This is no longer available, but the previously introduced
    mechanism to read the active non-suspended monotonic time can mitigate that
    in a detectable fashion.

    Signed-off-by: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Dmitry Torokhov
    Cc: John Stultz
    Cc: Jonathan Corbet
    Cc: Kevin Easton
    Cc: Linus Torvalds
    Cc: Mark Salyzyn
    Cc: Michael Kerrisk
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: Prarit Bhargava
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20180301165150.062975504@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
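
    The user-visible effect can be checked from userspace with something like
    the sketch below (illustrative only): with this change CLOCK_MONOTONIC
    advances across suspend just like CLOCK_BOOTTIME, so the delta between the
    two stays constant instead of growing by the sleep time. (As the
    26 Apr, 2018 entry above shows, the change was later reverted.)

        #include <stdio.h>
        #include <time.h>

        int main(void)
        {
                struct timespec mono, boot;

                clock_gettime(CLOCK_MONOTONIC, &mono);
                clock_gettime(CLOCK_BOOTTIME, &boot);

                /* Grows by the time spent in suspend on kernels where
                 * CLOCK_MONOTONIC does not include suspend time. */
                printf("boottime - monotonic = %lld s\n",
                       (long long)(boot.tv_sec - mono.tv_sec));
                return 0;
        }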
     

26 Dec, 2016

1 commit

  • ktime_set(S,N) was required for the timespec storage type and is still
    useful for situations where a Seconds and Nanoseconds part of a time value
    needs to be converted. For anything where the Seconds argument is 0, this
    is pointless and can be replaced with a simple assignment.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra

    Thomas Gleixner
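
    A representative instance of the cleanup (purely illustrative; with
    ktime_t being a plain s64 nanosecond count, the zero-seconds case is just
    an assignment):

        /* Before */
        ktime_t kt = ktime_set(0, 10 * NSEC_PER_MSEC);

        /* After */
        ktime_t kt = 10 * NSEC_PER_MSEC;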
     

14 Sep, 2015

1 commit

  • All users are migrated to the per-state callbacks, get rid of the
    unused interface and the core support code.

    Signed-off-by: Viresh Kumar
    Signed-off-by: Thomas Gleixner
    Cc: linaro-kernel@lists.linaro.org
    Cc: John Stultz
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/fd60de14cf6d125489c031207567bb255ad946f6.1441943991.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
     

08 Jul, 2015

1 commit

  • Currently the broadcast busy check, which prevents the idle code from
    going into deep idle, works only in one shot mode.

    If NOHZ and HIGHRES are off (config or command line) there is no
    sanity check at all, so under certain conditions cpus are allowed to
    go into deep idle, where the local timer stops, and are not woken up
    again because there is no broadcast timer installed or a hrtimer based
    broadcast device is not evaluated.

    Move tick_broadcast_oneshot_control() into the common code and provide
    proper subfunctions for the various config combinations.

    The common check in tick_broadcast_oneshot_control() is for the C3STOP
    misfeature flag of the local clock event device. If it's not set, idle
    can proceed. If set, further checks are necessary.

    Provide checks for the trivial cases:

    - If broadcast is disabled in the config, then return busy

    - If oneshot mode (NOHZ/HIGHRES) is disabled in the config, return
    busy if the broadcast device is hrtimer based.

    - If oneshot mode is enabled in the config call the original
    tick_broadcast_oneshot_control() function. That function needs
    extra checks which will be implemented in separate patches.

    [ Split out from a larger combo patch ]

    Reported-and-tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
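
    The common check reads roughly as follows (a sketch of the C3STOP test
    described above; the further config-dependent checks sit behind
    __tick_broadcast_oneshot_control(), whose variants are not shown):

        int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
        {
                struct tick_device *td = this_cpu_ptr(&tick_cpu_device);

                /* If the local clock event device keeps running in deep idle
                 * states (no C3STOP misfeature), idle can proceed. */
                if (!(td->evtdev->features & CLOCK_EVT_FEAT_C3STOP))
                        return 0;

                return __tick_broadcast_oneshot_control(state);
        }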
     

24 Jun, 2015

1 commit

  • Pull power management and ACPI updates from Rafael Wysocki:
    "The rework of backlight interface selection API from Hans de Goede
    stands out from the number of commits and the number of affected
    places perspective. The cpufreq core fixes from Viresh Kumar are
    quite significant too as far as the number of commits goes and because
    they should reduce CPU online/offline overhead quite a bit in the
    majority of cases.

    From the new features point of view, the ACPICA update (to upstream
    revision 20150515) adding support for new ACPI 6 material to ACPICA is
    the one that matters the most as some new significant features will be
    based on it going forward. Also included is an update of the ACPI
    device power management core to follow ACPI 6 (which in turn reflects
    the Windows' device PM implementation), a PM core extension to support
    wakeup interrupts in a more generic way and support for the ACPI _CCA
    device configuration object.

    The rest is mostly fixes and cleanups all over and some documentation
    updates, including new DT bindings for Operating Performance Points.

    There is one fix for a regression introduced in the 4.1 cycle, but it
    adds quite a number of lines of code, it wasn't really ready before
    Thursday and you were on vacation, so I refrained from pushing it on
    the last minute for 4.1.

    Specifics:

    - ACPICA update to upstream revision 20150515 including basic support
    for ACPI 6 features: new ACPI tables introduced by ACPI 6 (STAO,
    XENV, WPBT, NFIT, IORT), changes related to the other tables (DTRM,
    FADT, LPIT, MADT), new predefined names (_BTH, _CR3, _DSD, _LPI,
    _MTL, _PRR, _RDI, _RST, _TFP, _TSN), fixes and cleanups (Bob Moore,
    Lv Zheng).

    - ACPI device power management core code update to follow ACPI 6
    which reflects the ACPI device power management implementation in
    Windows (Rafael J Wysocki).

    - rework of the backlight interface selection logic to reduce the
    number of kernel command line options and improve the handling of
    DMI quirks that may be involved in that and to make the code
    generally more straightforward (Hans de Goede).

    - fixes for the ACPI Embedded Controller (EC) driver related to the
    handling of EC transactions (Lv Zheng).

    - fix for a regression related to the ACPI resources management and
    resulting from a recent change of ACPI initialization code ordering
    (Rafael J Wysocki).

    - fix for a system initialization regression related to ACPI
    introduced during the 3.14 cycle and caused by running the code
    that switches the platform over to the ACPI mode too early in the
    initialization sequence (Rafael J Wysocki).

    - support for the ACPI _CCA device configuration object related to
    DMA cache coherence (Suravee Suthikulpanit).

    - ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov).

    - ACPI battery driver cleanups (Luis Henriques, Mathias Krause).

    - ACPI processor driver cleanups (Hanjun Guo).

    - cleanups and documentation update related to the ACPI device
    properties interface based on _DSD (Rafael J Wysocki).

    - ACPI device power management fixes (Rafael J Wysocki).

    - assorted cleanups related to ACPI (Dominik Brodowski, Fabian
    Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki).

    - fix for a long-standing issue causing General Protection Faults to
    be generated occasionally on return to user space after resume from
    ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar).

    - fix to make the suspend core code return -EBUSY consistently in all
    cases when system suspend is aborted due to wakeup detection (Ruchi
    Kandoi).

    - support for automated device wakeup IRQ handling allowing drivers
    to make their PM support more straightforward (Tony Lindgren).

    - new tracepoints for suspend-to-idle tracing and rework of the
    prepare/complete callbacks tracing in the PM core (Todd E Brandt,
    Rafael J Wysocki).

    - wakeup sources framework enhancements (Jin Qian).

    - new macro for noirq system PM callbacks (Grygorii Strashko).

    - assorted cleanups related to system suspend (Rafael J Wysocki).

    - cpuidle core cleanups to make the code more efficient (Rafael J
    Wysocki).

    - powernv/pseries cpuidle driver update (Shilpasri G Bhat).

    - cpufreq core fixes related to CPU online/offline that should reduce
    the overhead of these operations quite a bit, unless the CPU in
    question is physically going away (Viresh Kumar, Saravana Kannan).

    - serialization of cpufreq governor callbacks to avoid race
    conditions in some cases (Viresh Kumar).

    - intel_pstate driver fixes and cleanups (Doug Smythies, Prarit
    Bhargava, Joe Konno).

    - cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep
    Holla, Felipe Balbi, Tang Yuantian).

    - assorted cleanups in cpufreq drivers and core (Shailendra Verma,
    Fabian Frederick, Wang Long).

    - new Device Tree bindings for representing Operating Performance
    Points (Viresh Kumar).

    - updates for the common clock operations support code in the PM core
    (Rajendra Nayak, Geert Uytterhoeven).

    - PM domains core code update (Geert Uytterhoeven).

    - Intel Knights Landing support for the RAPL (Running Average Power
    Limit) power capping driver (Dasaratharaman Chandramouli).

    - fixes related to the floor frequency setting on Atom SoCs in the
    RAPL power capping driver (Ajay Thomas).

    - runtime PM framework documentation update (Ben Dooks).

    - cpupower tool fix (Herton R Krzesinski)"

    * tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (194 commits)
    cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
    x86: Load __USER_DS into DS/ES after resume
    PM / OPP: Add binding for 'opp-suspend'
    PM / OPP: Allow multiple OPP tables to be passed via DT
    PM / OPP: Add new bindings to address shortcomings of existing bindings
    ACPI: Constify ACPI device IDs in documentation
    ACPI / enumeration: Document the rules regarding the PRP0001 device ID
    ACPI / video: Make acpi_video_unregister_backlight() private
    acpi-video-detect: Remove old API
    toshiba-acpi: Port to new backlight interface selection API
    thinkpad-acpi: Port to new backlight interface selection API
    sony-laptop: Port to new backlight interface selection API
    samsung-laptop: Port to new backlight interface selection API
    msi-wmi: Port to new backlight interface selection API
    msi-laptop: Port to new backlight interface selection API
    intel-oaktrail: Port to new backlight interface selection API
    ideapad-laptop: Port to new backlight interface selection API
    fujitsu-laptop: Port to new backlight interface selection API
    eeepc-laptop: Port to new backlight interface selection API
    dell-wmi: Port to new backlight interface selection API
    ...

    Linus Torvalds
     


22 Apr, 2015

1 commit

  • The hrtimer softirq is a leftover from the initial implementation and
    serves only to handle the enqueueing of already expired timers in high
    resolution timer mode. We discussed whether to change the return value
    and force all start sites to handle the case of an already expired
    timer, but that would be a Herculean task and I'm not sure whether it's
    a good idea to enforce that handling on everyone.

    A simpler solution is to enforce a timer interrupt instead of raising
    and scheduling a softirq. Just use the existing infrastructure to do
    so and remove all the softirq leftovers.

    The HRTIMER softirq enum is now unused, but kept around because trace
    parsers rely on the existing numbering.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203501.840834708@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

03 Apr, 2015

4 commits

  • Some braces in tick_freeze() are not necessary, so drop them.

    Signed-off-by: Rafael J. Wysocki
    Cc: peterz@infradead.org
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1534128.H5hN3KBFB4@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Rafael J. Wysocki
     
  • A recent conflict resolution has left tick_resume() in
    tick_unfreeze() which leads to an unbalanced execution of
    tick_resume_broadcast() every time that function runs.

    Fix that by replacing the tick_resume() in tick_unfreeze()
    with tick_resume_local() as appropriate.

    Signed-off-by: Rafael J. Wysocki
    Cc: boris.ostrovsky@oracle.com
    Cc: david.vrabel@citrix.com
    Cc: konrad.wilk@oracle.com
    Cc: peterz@infradead.org
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/8099075.V0LvN3pQAV@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Rafael J. Wysocki
     
  • clockevents_notify() is a leftover from the early design of the
    clockevents facility. It's really not a notification mechanism,
    it's a multiplex call. We are way better off to have explicit
    calls instead of this monstrosity.

    Split out the cleanup function for a dead cpu and invoke it
    directly from the cpu down code. Make it conditional on
    CPU_HOTPLUG as well.

    Temporary change, will be refined in the future.

    Signed-off-by: Thomas Gleixner
    [ Rebased, added clockevents_notify() removal ]
    Signed-off-by: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1735025.raBZdQHM3m@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • clockevents_notify() is a leftover from the early design of the
    clockevents facility. It's really not a notification mechanism,
    it's a multiplex call. We are way better off to have explicit
    calls instead of this monstrosity.

    Split out the tick_handover call and invoke it explicitly from
    the hotplug code. This temporary solution will be cleaned up in later
    patches.

    Signed-off-by: Thomas Gleixner
    [ Rebase ]
    Signed-off-by: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: John Stultz
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/1658173.RkEEILFiQZ@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

01 Apr, 2015

3 commits

  • Use the new tick_suspend/resume_local() and get rid of the
    home-brewed implementation of these in the ARM bL switcher. The
    check for the cpumask is completely pointless. There is no harm
    in suspending a per-cpu tick device unconditionally. If that's a
    real issue then we fix it properly at the core level and not with
    some completely undocumented hacks in some random core code.

    Move the tick internals to the core code, now that this nuisance
    is gone.

    Signed-off-by: Thomas Gleixner
    [ rjw: Rebase, changelog ]
    Signed-off-by: Rafael J. Wysocki
    Cc: Nicolas Pitre
    Cc: Peter Zijlstra
    Cc: Russell King
    Link: http://lkml.kernel.org/r/1655112.Ws17YsMfN7@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Xen calls on every cpu into tick_resume() which is just wrong.
    tick_resume() is for the syscore global suspend/resume
    invocation. What XEN really wants is a per cpu local resume
    function.

    Provide a tick_resume_local() function and use it in XEN.

    Also provide a complementary tick_suspend_local() and modify
    tick_unfreeze() and tick_freeze(), respectively, to use the
    new local tick resume/suspend functions.

    Signed-off-by: Thomas Gleixner
    [ Combined two patches, rebased, modified subject/changelog. ]
    Signed-off-by: Rafael J. Wysocki
    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Cc: Konrad Rzeszutek Wilk
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1698741.eezk9tnXtG@vostro.rjw.lan
    [ Merged to latest timers/core. ]
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
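
    A sketch of how tick_freeze() ends up using the new local variant
    (simplified; tracing and error handling are omitted and the exact shape is
    an assumption):

        void tick_freeze(void)
        {
                raw_spin_lock(&tick_freeze_lock);

                tick_freeze_depth++;
                if (tick_freeze_depth == num_online_cpus())
                        timekeeping_suspend();  /* last CPU also freezes timekeeping */
                else
                        tick_suspend_local();   /* others only stop their own tick */

                raw_spin_unlock(&tick_freeze_lock);
        }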
     
  • clockevents_notify() is a leftover from the early design of the
    clockevents facility. It's really not a notification mechanism,
    it's a multiplex call.

    We are way better off to have explicit calls instead of this
    monstrosity. Split out the suspend/resume() calls and invoke
    them directly from the call sites.

    No locking required at this point because these calls happen
    with interrupts disabled and a single cpu online.

    Signed-off-by: Thomas Gleixner
    [ Rebased on top of 4.0-rc5. ]
    Signed-off-by: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/713674030.jVm1qaHuPf@vostro.rjw.lan
    [ Rebased on top of latest timers/core. ]
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

27 Mar, 2015

2 commits

  • 'enum clock_event_mode' is used for two purposes today:

    - to pass mode to the driver of clockevent device::set_mode().

    - for managing state of the device for clockevents core.

    For supporting new modes/states we have moved away from the
    legacy set_mode() callback to new per-mode/state callbacks. New
    modes/states shouldn't be exposed to the legacy (now OBSOLETE)
    callbacks and so we shouldn't add new states to 'enum
    clock_event_mode'.

    Let's have separate enums for the two use cases mentioned above.
    Keep using the earlier enum for legacy set_mode() callback and
    mark it OBSOLETE. And add another enum to clearly specify the
    possible states of a clockevent device.

    This also renames the newly added per-mode callbacks to reflect
    state changes.

    We haven't got rid of the 'mode' member of 'struct
    clock_event_device' as it is still used by some of the clockevent
    drivers; it will automatically die out once we migrate those
    drivers to the new interface. It ('mode') is now only updated
    for the drivers using the legacy interface.

    Suggested-by: Peter Zijlstra
    Suggested-by: Ingo Molnar
    Signed-off-by: Viresh Kumar
    Acked-by: Peter Zijlstra
    Cc: Daniel Lezcano
    Cc: Frederic Weisbecker
    Cc: Kevin Hilman
    Cc: Preeti U Murthy
    Cc: linaro-kernel@lists.linaro.org
    Cc: linaro-networking@linaro.org
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/b6b0143a8a57bd58352ad35e08c25424c879c0cb.1425037853.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
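
    Roughly, the split looks like this (member lists abbreviated and recalled
    from the clockevents code of that era, so treat them as an approximation):

        /* Legacy, now-obsolete modes handed to the old ->set_mode() callback */
        enum clock_event_mode {
                CLOCK_EVT_MODE_UNUSED,
                CLOCK_EVT_MODE_SHUTDOWN,
                CLOCK_EVT_MODE_PERIODIC,
                CLOCK_EVT_MODE_ONESHOT,
                CLOCK_EVT_MODE_RESUME,
        };

        /* New enum used by the clockevents core to track device state */
        enum clock_event_state {
                CLOCK_EVT_STATE_DETACHED,
                CLOCK_EVT_STATE_SHUTDOWN,
                CLOCK_EVT_STATE_PERIODIC,
                CLOCK_EVT_STATE_ONESHOT,
        };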
     
  • An upcoming patch will redefine the possible states of a clockevent
    device. The RESUME mode is a special case that exists only for the
    tick's clockevent devices. In the future it can be replaced by the
    ->resume() callback already available for clockevent devices.

    Let's handle it separately so that clockevents_set_mode() only
    handles states valid across all devices. This also renames
    set_mode_resume() to tick_resume() to make it more explicit.

    Signed-off-by: Viresh Kumar
    Acked-by: Peter Zijlstra
    Cc: Daniel Lezcano
    Cc: Frederic Weisbecker
    Cc: Kevin Hilman
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: linaro-kernel@lists.linaro.org
    Cc: linaro-networking@linaro.org
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/c1b0112410870f49e7bf06958e1483eac6c15e20.1425037853.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
     

16 Feb, 2015

1 commit

  • The efficiency of suspend-to-idle depends on being able to keep CPUs
    in the deepest available idle states for as much time as possible.
    Ideally, they should only be brought out of idle by system wakeup
    interrupts.

    However, timer interrupts occurring periodically prevent that from
    happening and it is not practical to chase all of the "misbehaving"
    timers in a whack-a-mole fashion. A much more effective approach is
    to suspend the local ticks for all CPUs and the entire timekeeping
    along the lines of what is done during full suspend, which also
    helps to keep suspend-to-idle and full suspend reasonably similar.

    The idea is to suspend the local tick on each CPU executing
    cpuidle_enter_freeze() and to make the last of them suspend the
    entire timekeeping. That should prevent timer interrupts from
    triggering until an IO interrupt wakes up one of the CPUs. It
    needs to be done with interrupts disabled on all of the CPUs,
    though, because otherwise the suspended clocksource might be
    accessed by an interrupt handler which might lead to fatal
    consequences.

    Unfortunately, the existing ->enter callbacks provided by cpuidle
    drivers generally cannot be used for implementing that, because some
    of them re-enable interrupts temporarily and some idle entry methods
    cause interrupts to be re-enabled automatically on exit. Also some
    of these callbacks manipulate local clock event devices of the CPUs
    which really shouldn't be done after suspending their ticks.

    To overcome that difficulty, introduce a new cpuidle state callback,
    ->enter_freeze, that will be guaranteed (1) to keep interrupts
    disabled all the time (and return with interrupts disabled) and (2)
    not to touch the CPU timer devices. Modify cpuidle_enter_freeze() to
    look for the deepest available idle state with ->enter_freeze present
    and to make the CPU execute that callback with suspended tick (and the
    last of the online CPUs to execute it with suspended timekeeping).

    Suggested-by: Thomas Gleixner
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
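
    The new callback sits next to ->enter in struct cpuidle_state; a sketch of
    the shape (other members omitted, signatures as introduced back then):

        struct cpuidle_state {
                /* ... existing members ... */
                int     (*enter)(struct cpuidle_device *dev,
                                 struct cpuidle_driver *drv, int index);
                /* New: must keep interrupts disabled throughout and must not
                 * touch the CPU's clock event device. */
                void    (*enter_freeze)(struct cpuidle_device *dev,
                                        struct cpuidle_driver *drv, int index);
        };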
     

15 Oct, 2014

1 commit

  • Pull percpu consistent-ops changes from Tejun Heo:
    "Way back, before the current percpu allocator was implemented, static
    and dynamic percpu memory areas were allocated and handled separately
    and had their own accessors. The distinction has been gone for many
    years now; however, the now duplicate two sets of accessors remained
    with the pointer based ones - this_cpu_*() - evolving various other
    operations over time. During the process, we also accumulated other
    inconsistent operations.

    This pull request contains Christoph's patches to clean up the
    duplicate accessor situation. __get_cpu_var() uses are replaced with
    this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

    Unfortunately, the former sometimes is tricky thanks to C being a bit
    messy with the distinction between lvalues and pointers, which led to
    a rather ugly solution for cpumask_var_t involving the introduction of
    this_cpu_cpumask_var_ptr().

    This converts most of the uses but not all. Christoph will follow up
    with the remaining conversions in this merge window and hopefully
    remove the obsolete accessors"

    * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
    irqchip: Properly fetch the per cpu offset
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
    ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
    Revert "powerpc: Replace __get_cpu_var uses"
    percpu: Remove __this_cpu_ptr
    clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
    sparc: Replace __get_cpu_var uses
    avr32: Replace __get_cpu_var with __this_cpu_write
    blackfin: Replace __get_cpu_var uses
    tile: Use this_cpu_ptr() for hardware counters
    tile: Replace __get_cpu_var uses
    powerpc: Replace __get_cpu_var uses
    alpha: Replace __get_cpu_var
    ia64: Replace __get_cpu_var uses
    s390: cio driver &__get_cpu_var replacements
    s390: Replace __get_cpu_var uses
    mips: Replace __get_cpu_var uses
    MIPS: Replace __get_cpu_var uses in FPU emulator.
    arm: Replace __this_cpu_ptr with raw_cpu_ptr
    ...

    Linus Torvalds
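
    A typical conversion from the series looks like this (tick_cpu_device used
    purely as an example of a per-CPU variable):

        /* Before: lvalue-style accessors */
        struct tick_device *td = &__get_cpu_var(tick_cpu_device);
        struct tick_device *rd = __this_cpu_ptr(&tick_cpu_device);

        /* After: pointer-based accessors */
        struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
        struct tick_device *rd = raw_cpu_ptr(&tick_cpu_device);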
     

14 Sep, 2014

1 commit

  • This way we unbloat main.c a bit and, more importantly, we initialize
    nohz full after init_IRQ(). This dependency will be needed in further
    patches because nohz full needs irq work to raise its own IRQ.
    Information about the support for this ability on ARM64 is obtained in
    init_IRQ(), which initializes the pointer to __smp_call_function.

    Since tick_init() is called right after init_IRQ(), this is a good place
    to call tick_nohz_init() and prepare for that dependency.

    Acked-by: Peter Zijlstra (Intel)
    Cc: Ingo Molnar
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
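
    A sketch of the resulting ordering (the start_kernel() excerpt is heavily
    abridged):

        /* init/main.c: start_kernel() */
        init_IRQ();
        tick_init();

        /* kernel/time/tick-common.c */
        void __init tick_init(void)
        {
                tick_broadcast_init();
                tick_nohz_init();       /* moved here from start_kernel() */
        }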