23 Jun, 2019

1 commit

  • With CONFIG_PROC_FS=n the following warning is emitted:

    kernel/time/timer_list.c:361:36: warning: unused variable
    'timer_list_sops' [-Wunused-const-variable]
    static const struct seq_operations timer_list_sops = {

    Add #ifdef guard around procfs specific code.
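
    A minimal user-space sketch of the guard pattern, assuming stand-in names
    (seq_operations_sketch, timer_list_sops_sketch, procfs_entries) that are
    not kernel identifiers. With the config macro undefined, the table that
    only procfs uses is never compiled, so nothing is left to warn about:

```c
#include <stddef.h>

/* Stand-in for the kernel config symbol. Uncomment to build the procfs path. */
/* #define CONFIG_PROC_FS 1 */

/* Hypothetical, simplified stand-in for struct seq_operations. */
struct seq_operations_sketch {
	int (*show)(int cpu);
};

/* Shared by the sysrq-style path and the procfs path, so it stays unguarded. */
static int timer_list_show_sketch(int cpu)
{
	return cpu;
}

#ifdef CONFIG_PROC_FS
/* Only procfs references this table, so it lives inside the guard. */
static const struct seq_operations_sketch timer_list_sops_sketch = {
	.show = timer_list_show_sketch,
};

/* Number of procfs entries this file would register. */
int procfs_entries(void)
{
	return timer_list_sops_sketch.show != NULL;
}
#else
int procfs_entries(void)
{
	return 0;
}
#endif

/* The sysrq-style path works with or without procfs. */
int sysrq_show_cpu(int cpu)
{
	return timer_list_show_sketch(cpu);
}
```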

    Signed-off-by: Nathan Huckleberry
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Nick Desaulniers
    Cc: john.stultz@linaro.org
    Cc: sboyd@kernel.org
    Cc: clang-built-linux@googlegroups.com
    Link: https://github.com/ClangBuiltLinux/linux/issues/534
    Link: https://lkml.kernel.org/r/20190614181604.112297-1-nhuck@google.com

    Nathan Huckleberry
     

23 Nov, 2018

3 commits

  • "For licencing details see kernel-base/COPYING" and similar license
    references have no value over the SPDX identifier. Remove them.

    Signed-off-by: Thomas Gleixner
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Peter Anvin
    Cc: Russell King
    Cc: Richard Cochran
    Cc: "Paul E. McKenney"
    Cc: Nicolas Pitre
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Link: https://lkml.kernel.org/r/20181031182252.963632760@linutronix.de

    Thomas Gleixner
     
  • Update the time(r) core files with the correct SPDX license
    identifier based on the license text in the file itself. The SPDX
    identifier is a legally binding shorthand, which can be used instead of the
    full boiler plate text.

    This work is based on a script and data from Philippe Ombredanne, Kate
    Stewart and myself. The data has been created with two independent license
    scanners and manual inspection.

    The following files do not contain any direct license information and have
    been omitted from the big initial SPDX changes:

    timeconst.bc: The .bc files were not touched
    time.c, timer.c, timekeeping.c: Licence was deduced from EXPORT_SYMBOL_GPL

    As those files do not contain direct license references they fall under the
    project license, i.e. GPL V2 only.

    Signed-off-by: Thomas Gleixner
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Russell King
    Cc: Richard Cochran
    Cc: Nicolas Pitre
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Link: https://lkml.kernel.org/r/20181031182252.879109557@linutronix.de

    Thomas Gleixner
     
  • Remove the pointless filenames in the top level comments. They have no
    value at all and just occupy space. While at it tidy up some of the
    comments and remove a stale one.

    Signed-off-by: Thomas Gleixner
    Acked-by: Nicolas Pitre
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Peter Anvin
    Cc: Russell King
    Cc: Richard Cochran
    Cc: "Paul E. McKenney"
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Link: https://lkml.kernel.org/r/20181031182252.794898238@linutronix.de

    Thomas Gleixner
     

05 Jun, 2018

1 commit

  • Pull timers and timekeeping updates from Thomas Gleixner:

    - Core infrastructure work for Y2038 to address the COMPAT interfaces:

    + Add a new Y2038 safe __kernel_timespec and use it in the core
    code

    + Introduce config switches which allow control of the various
    compat mechanisms

    + Use the new config switch in the posix timer code to control the
    32bit compat syscall implementation.

    - Prevent bogus selection of CPU local clocksources which causes an
    endless reselection loop

    - Remove the extra kthread in the clocksource code which has no value
    and just adds another level of indirection

    - The usual bunch of trivial updates, cleanups and fixlets all over the
    place

    - More SPDX conversions
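
    To illustrate the Y2038 item above: a signed 32-bit seconds field tops
    out at 2147483647, which is 19 January 2038 03:14:07 UTC. The sketch
    below uses hypothetical struct and function names, not the kernel's
    definitions, and only shows why a 64-bit seconds field is needed:

```c
#include <stdint.h>

/* Hypothetical mirror of a Y2038-safe timespec: 64-bit seconds everywhere. */
struct timespec64_sketch {
	int64_t tv_sec;
	long long tv_nsec;
};

/* Hypothetical 32-bit compat layout with a signed 32-bit seconds field. */
struct compat_timespec_sketch {
	int32_t tv_sec;
	int32_t tv_nsec;
};

/* Returns 1 if the seconds value can be stored in the compat layout. */
int fits_compat(const struct timespec64_sketch *ts)
{
	return ts->tv_sec >= INT32_MIN && ts->tv_sec <= INT32_MAX;
}
```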

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
    clocksource/drivers/mxs_timer: Switch to SPDX identifier
    clocksource/drivers/timer-imx-tpm: Switch to SPDX identifier
    clocksource/drivers/timer-imx-gpt: Switch to SPDX identifier
    clocksource/drivers/timer-imx-gpt: Remove outdated file path
    clocksource/drivers/arc_timer: Add comments about locking while read GFRC
    clocksource/drivers/mips-gic-timer: Add pr_fmt and reword pr_* messages
    clocksource/drivers/sprd: Fix Kconfig dependency
    clocksource: Move inline keyword to the beginning of function declarations
    timer_list: Remove unused function pointer typedef
    timers: Adjust a kernel-doc comment
    tick: Prefer a lower rating device only if it's CPU local device
    clocksource: Remove kthread
    time: Change nanosleep to safe __kernel_* types
    time: Change types to new y2038 safe __kernel_* types
    time: Fix get_timespec64() for y2038 safe compat interfaces
    time: Add new y2038 safe __kernel_timespec
    posix-timers: Make compat syscalls depend on CONFIG_COMPAT_32BIT_TIME
    time: Introduce CONFIG_COMPAT_32BIT_TIME
    time: Introduce CONFIG_64BIT_TIME in architectures
    compat: Enable compat_get/put_timespec64 always
    ...

    Linus Torvalds
     

16 May, 2018

1 commit


13 May, 2018

1 commit


13 Nov, 2017

1 commit


24 Mar, 2017

1 commit

  • On systems with a large number of CPUs, running sysrq- can cause
    watchdog timeouts. There are two slow sections of code in the sysrq-
    path in timer_list.c.

    1. print_active_timers() - This function is called by print_cpu() and
    contains a slow goto loop. On a machine with hundreds of CPUs, this
    loop took approximately 100ms for the first CPU in a NUMA node.
    (Subsequent CPUs in the same node ran much quicker.) The total time
    to print all of the CPUs is ultimately long enough to trigger the
    soft lockup watchdog.

    2. print_tickdevice() - This function outputs a large amount of textual
    information. This function also took approximately 100ms per CPU.

    Since sysrq- is not a performance critical path, there should be no
    harm in touching the nmi watchdog in both slow sections above. Touching
    it in just one location was insufficient on systems with hundreds of
    CPUs as occasional timeouts were still observed during testing.

    This issue was observed on an Oracle T7 machine with 128 CPUs, but I
    anticipate it may affect other systems with similarly large numbers of
    CPUs.

    Signed-off-by: Tom Hromatka
    Reviewed-by: Rob Gardner
    Signed-off-by: John Stultz

    Tom Hromatka
     

10 Feb, 2017

2 commits

  • hrtimer_resolution is already unsigned int, not necessary to cast
    it when printing.

    Signed-off-by: Mars Cheng
    Cc: CC Hwang
    Cc: wsd_upstream@mediatek.com
    Cc: Loda Chou
    Cc: Jades Shih
    Cc: Miles Chen
    Cc: John Stultz
    Cc: My Chuang
    Cc: Matthias Brugger
    Cc: Yingjoe Chen
    Link: http://lkml.kernel.org/r/1486626615-5879-1-git-send-email-mars.cheng@mediatek.com
    Signed-off-by: Thomas Gleixner

    Mars Cheng
     
  • Currently CONFIG_TIMER_STATS exposes process information across namespaces:

    kernel/time/timer_list.c print_timer():

    SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);

    /proc/timer_list:

    #11: , hrtimer_wakeup, S:01, do_nanosleep, cron/2570

    Given that the tracer can give the same information, this patch entirely
    removes CONFIG_TIMER_STATS.

    Suggested-by: Thomas Gleixner
    Signed-off-by: Kees Cook
    Acked-by: John Stultz
    Cc: Nicolas Pitre
    Cc: linux-doc@vger.kernel.org
    Cc: Lai Jiangshan
    Cc: Shuah Khan
    Cc: Xing Gao
    Cc: Jonathan Corbet
    Cc: Jessica Frazelle
    Cc: kernel-hardening@lists.openwall.com
    Cc: Nicolas Iooss
    Cc: "Paul E. McKenney"
    Cc: Petr Mladek
    Cc: Richard Cochran
    Cc: Tejun Heo
    Cc: Michal Marek
    Cc: Josh Poimboeuf
    Cc: Dmitry Vyukov
    Cc: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Cc: Olof Johansson
    Cc: Andrew Morton
    Cc: linux-api@vger.kernel.org
    Cc: Arjan van de Ven
    Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
    Signed-off-by: Thomas Gleixner

    Kees Cook
     

25 Dec, 2016

1 commit


17 Jan, 2016

1 commit

  • If CONFIG_TIME_LOW_RES is enabled we add a jiffy to the relative timeout to
    prevent short sleeps, but we do not account for that in interfaces which
    retrieve the remaining time.

    Helge observed that timerfd can return a remaining time larger than the
    relative timeout. That's not expected and breaks userland test programs.

    Store the information that the timer was armed relative and provide functions
    to adjust the remaining time. To avoid bloating the hrtimer struct make state
    a u8, which as a bonus results in better code on x86 at least.
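
    A minimal sketch of the adjustment, assuming a 4 ms jiffy (HZ=250) and
    hypothetical names (timer_sketch, arm_relative, remaining_adjusted). The
    timer remembers that it was armed relative, and the padding is taken back
    out when the remaining time is reported:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick length for the sketch: HZ=250 gives 4 ms per jiffy. */
#define JIFFY_NS_SKETCH 4000000LL

/* Hypothetical, trimmed-down timer; not the kernel's struct hrtimer. */
struct timer_sketch {
	int64_t expires_ns; /* absolute expiry */
	uint8_t state;      /* small integer state, as in the commit */
	uint8_t is_rel;     /* set when the timer was armed relative */
};

/* Arm relative to now; low-resolution mode pads the timeout by one jiffy. */
void arm_relative(struct timer_sketch *t, int64_t now_ns,
		  int64_t timeout_ns, bool low_res)
{
	t->expires_ns = now_ns + timeout_ns + (low_res ? JIFFY_NS_SKETCH : 0);
	t->is_rel = 1;
	t->state = 1;
}

/* Remaining time as reported to user space: remove the padding again. */
int64_t remaining_adjusted(const struct timer_sketch *t, int64_t now_ns,
			   bool low_res)
{
	int64_t rem = t->expires_ns - now_ns;

	if (low_res && t->is_rel)
		rem -= JIFFY_NS_SKETCH;
	return rem;
}
```

    With a 10 ms relative timeout the raw difference right after arming is
    14 ms, while the adjusted value is the 10 ms the caller asked for.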

    Reported-and-tested-by: Helge Deller
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: dhowells@redhat.com
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160114164159.273328486@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

14 Sep, 2015

1 commit

  • All users are migrated to the per-state callbacks, get rid of the
    unused interface and the core support code.

    Signed-off-by: Viresh Kumar
    Signed-off-by: Thomas Gleixner
    Cc: linaro-kernel@lists.linaro.org
    Cc: John Stultz
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/fd60de14cf6d125489c031207567bb255ad946f6.1441943991.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
     

18 Aug, 2015

1 commit

  • I noticed for non-monotonic timers in timer_list, some of the
    output looked a little confusing.

    For example:
    #1: , posix_timer_fn, S:01, hrtimer_start_range_ns, leap-a-day/2360
    # expires at 1434412800000000000-1434412800000000000 nsecs [in 1434410725062375469 to 1434410725062375469 nsecs]

    You'll note the relative time till the expiration "[in xxx to
    yyy nsecs]" is incorrect. This is because it's printing the delta
    between the CLOCK_MONOTONIC time and the CLOCK_REALTIME expiration.

    This patch fixes this issue by adding the clock offset to the
    "now" time which we use to calculate the delta.

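    The fix described above can be sketched as follows. The names
    (clock_base_sketch, expires_in_ns) are hypothetical, and the only
    assumption is that each clock base knows its offset from
    CLOCK_MONOTONIC:

```c
#include <stdint.h>

/* Hypothetical clock base: offset_ns is the base's offset from CLOCK_MONOTONIC. */
struct clock_base_sketch {
	int64_t offset_ns;
};

/* Delta until expiry, measured on the timer's own clock base. */
int64_t expires_in_ns(const struct clock_base_sketch *base,
		      int64_t expires_ns, int64_t now_monotonic_ns)
{
	/* Shift "now" onto the same time line as the expiry first. */
	int64_t now_on_base = now_monotonic_ns + base->offset_ns;

	return expires_ns - now_on_base;
}
```
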
    Cc: Prarit Bhargava
    Cc: Daniel Bristot de Oliveira
    Cc: Richard Cochran
    Cc: Jan Kara
    Cc: Jiri Bohac
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Shuah Khan
    Signed-off-by: John Stultz

    John Stultz
     

19 Jun, 2015

1 commit

  • Eric reported that the timer_migration sysctl is not really nice
    performance wise as it needs to check at every timer insertion whether
    the feature is enabled or not. Further the check does not live in the
    timer code, so we have an extra function call which checks an extra
    cache line to figure out that it is disabled.

    We can do better and store that information in the per cpu (hr)timer
    bases. I pondered using a static key, but that's a nightmare to
    update from the nohz code and the timer base cache line is hot anyway
    when we select a timer base.

    The old logic enabled the timer migration unconditionally if
    CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
    line.

    With this modification, we start off with migration disabled. The user
    visible sysctl is still set to enabled. If the kernel switches to NOHZ,
    migration is enabled, provided the user did not disable it via the sysctl
    prior to the switch. If nohz=off is on the kernel command line,
    migration stays disabled no matter what.
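
    A rough sketch of the resulting decision logic, using hypothetical names
    (timer_base_sketch, update_migration, set_migration_sysctl). The cached
    flag lives in the per-CPU base and is recomputed only when one of its two
    inputs changes, so timer insertion reads a single field:

```c
#include <stdbool.h>

/* Hypothetical per-CPU base state; field names are illustrative. */
struct timer_base_sketch {
	bool nohz_active;       /* set once the kernel switches to NOHZ */
	bool migration_enabled; /* cached decision, read on timer insertion */
};

/* User-visible sysctl stand-in, enabled by default as described above. */
static bool sysctl_timer_migration_sketch = true;

/* Recompute the cached flag whenever NOHZ state or the sysctl changes. */
void update_migration(struct timer_base_sketch *base)
{
	base->migration_enabled =
		sysctl_timer_migration_sketch && base->nohz_active;
}

/* Sysctl write handler stand-in: store the value, then refresh the cache. */
void set_migration_sysctl(struct timer_base_sketch *base, bool on)
{
	sysctl_timer_migration_sketch = on;
	update_migration(base);
}
```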

    Before:
    47.76% hog [.] main
    14.84% [kernel] [k] _raw_spin_lock_irqsave
    9.55% [kernel] [k] _raw_spin_unlock_irqrestore
    6.71% [kernel] [k] mod_timer
    6.24% [kernel] [k] lock_timer_base.isra.38
    3.76% [kernel] [k] detach_if_pending
    3.71% [kernel] [k] del_timer
    2.50% [kernel] [k] internal_add_timer
    1.51% [kernel] [k] get_nohz_timer_target
    1.28% [kernel] [k] __internal_add_timer
    0.78% [kernel] [k] timerfn
    0.48% [kernel] [k] wake_up_nohz_cpu

    After:
    48.10% hog [.] main
    15.25% [kernel] [k] _raw_spin_lock_irqsave
    9.76% [kernel] [k] _raw_spin_unlock_irqrestore
    6.50% [kernel] [k] mod_timer
    6.44% [kernel] [k] lock_timer_base.isra.38
    3.87% [kernel] [k] detach_if_pending
    3.80% [kernel] [k] del_timer
    2.67% [kernel] [k] internal_add_timer
    1.33% [kernel] [k] __internal_add_timer
    0.73% [kernel] [k] timerfn
    0.54% [kernel] [k] wake_up_nohz_cpu

    Reported-by: Eric Dumazet
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Paul McKenney
    Cc: Frederic Weisbecker
    Cc: Viresh Kumar
    Cc: John Stultz
    Cc: Joonwoo Park
    Cc: Wenbo Wang
    Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

19 May, 2015

1 commit

  • When no timers/hrtimers are pending, the expiry time is set to a
    special value: 'KTIME_MAX'. This normally happens with
    NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.

    When 'expiry == KTIME_MAX', we either cancel the 'tick-sched' hrtimer
    (NOHZ_MODE_HIGHRES) or skip reprogramming clockevent device
    (NOHZ_MODE_LOWRES). But, the clockevent device is already
    reprogrammed from the tick-handler for next tick.

    As the clock event device is programmed in ONESHOT mode it will at
    least fire one more time (unnecessarily). Timers on a few
    implementations (like arm_arch_timer, etc.) only support PERIODIC mode
    and their drivers emulate ONESHOT over that. This means that on these
    platforms we will get spurious interrupts periodically (at the last
    programmed interval rate, normally the tick rate).

    In order to avoid spurious interrupts, the clockevent device should be
    stopped or its interrupts should be masked.

    A simple (yet hacky) solution to get this fixed could be: update
    hrtimer_force_reprogram() to always reprogram the clockevent device and
    update clockevent drivers to STOP generating events (or delay them to
    max time) when 'expires' is set to KTIME_MAX. But the drawback here is
    that every clockevent driver has to be hacked for this particular case
    and it's very easy for new ones to miss this.

    However, Thomas suggested adding an optional state ONESHOT_STOPPED to
    solve this problem: lkml.org/lkml/2014/5/9/508.

    This patch adds support for ONESHOT_STOPPED state in clockevents
    core. It will only be available to drivers that implement the
    state-specific callbacks instead of the legacy ->set_mode() callback.
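
    A simplified sketch of the idea, with hypothetical names throughout
    (clockevent_sketch, program_next_event, stop_hw_sketch). It is not the
    clockevents core code; it only shows that the stop is attempted when the
    expiry is KTIME_MAX and the driver provides the optional callback:

```c
#include <stddef.h>
#include <stdint.h>

#define KTIME_MAX_SKETCH INT64_MAX

enum state_sketch { STATE_ONESHOT, STATE_ONESHOT_STOPPED };

/* Hypothetical device; a NULL callback models a legacy ->set_mode() driver. */
struct clockevent_sketch {
	enum state_sketch state;
	int64_t programmed_ns;
	int (*set_state_oneshot_stopped)(struct clockevent_sketch *dev);
};

/* Returns 1 if the device was stopped, 0 if it was reprogrammed or left alone. */
int program_next_event(struct clockevent_sketch *dev, int64_t expires_ns)
{
	if (expires_ns == KTIME_MAX_SKETCH) {
		if (dev->set_state_oneshot_stopped &&
		    dev->set_state_oneshot_stopped(dev) == 0) {
			dev->state = STATE_ONESHOT_STOPPED;
			return 1;
		}
		return 0; /* no callback: device keeps its last programming */
	}
	dev->state = STATE_ONESHOT;
	dev->programmed_ns = expires_ns;
	return 0;
}

/* Example driver callback for the sketch; real hardware access omitted. */
int stop_hw_sketch(struct clockevent_sketch *dev)
{
	(void)dev;
	return 0;
}
```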

    Signed-off-by: Viresh Kumar
    Reviewed-by: Preeti U. Murthy
    Cc: linaro-kernel@lists.linaro.org
    Cc: Frederic Weisbecker
    Cc: Kevin Hilman
    Cc: Daniel Lezcano
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/b8b383a03ac07b13312c16850b5106b82e4245b5.1428031396.git.viresh.kumar@linaro.org
    Signed-off-by: Thomas Gleixner

    Viresh Kumar
     

05 May, 2015

1 commit

  • Today the number of bits of the broadcast masks that is output into
    /proc/timer_list is sizeof(unsigned long). This means that on machines
    with a larger number of CPUs, the bitmasks of CPUs beyond this range do
    not appear.

    Fix this by using bitmap printing through "%*pb" instead, so as to
    output the broadcast masks for the range of nr_cpu_ids into
    /proc/timer_list.

    Signed-off-by: Preeti U Murthy
    Cc: peterz@infradead.org
    Cc: linuxppc-dev@ozlabs.org
    Cc: john.stultz@linaro.org
    Link: http://lkml.kernel.org/r/20150428084520.3314.62668.stgit@preeti.in.ibm.com
    Signed-off-by: Thomas Gleixner

    Preeti U Murthy
     

22 Apr, 2015

4 commits

  • The evaluation of the next timer in the nohz code is based on jiffies
    while all the tick internals are nanoseconds based. We also have to
    convert hrtimer nanoseconds to jiffies in the !highres case. That's
    just wrong and introduces interesting corner cases.

    Turn it around and convert the next timer wheel timer expiry and the
    rcu event to clock monotonic and base all calculations on
    nanoseconds. That identifies the case where no timer is pending
    clearly with an absolute expiry value of KTIME_MAX.

    Makes the code more readable and gets rid of the jiffies magic in the
    nohz code.
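
    A small sketch of the nanosecond-based evaluation, assuming HZ=250 and
    hypothetical helper names (wheel_expiry_ns, next_event_ns). The timer
    wheel expiry is converted to a monotonic time once, and "nothing
    pending" is simply KTIME_MAX:

```c
#include <stdint.h>

#define KTIME_MAX_SKETCH INT64_MAX
#define HZ_SKETCH 250 /* assumed tick rate for the sketch */
#define TICK_NSEC_SKETCH (1000000000LL / HZ_SKETCH)

/*
 * Convert a timer wheel expiry in jiffies to an absolute monotonic time.
 * basem_ns is the monotonic time that corresponds to the jiffies value basej.
 */
int64_t wheel_expiry_ns(int has_timer, uint64_t next_jiffies,
			uint64_t basej, int64_t basem_ns)
{
	if (!has_timer)
		return KTIME_MAX_SKETCH;
	return basem_ns + (int64_t)(next_jiffies - basej) * TICK_NSEC_SKETCH;
}

/* The next event is the earlier of the wheel and hrtimer expiries. */
int64_t next_event_ns(int64_t wheel_ns, int64_t hrtimer_ns)
{
	return wheel_ns < hrtimer_ns ? wheel_ns : hrtimer_ns;
}
```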

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Paul E. McKenney
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Cc: Josh Triplett
    Cc: Lai Jiangshan
    Cc: John Stultz
    Cc: Marcelo Tosatti
    Link: http://lkml.kernel.org/r/20150414203502.184198593@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • No point in having unsigned long for /proc/timer_list statistics. Make
    them unsigned int.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203500.959773467@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The field has no value because all clock bases have the same
    resolution. The resolution only changes when we switch to high
    resolution timer mode. We can evaluate that from a single static
    variable as well. In the !HIGHRES case it's simply a constant.

    Export the variable, so we can simplify the usage sites.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Preeti U Murthy
    Acked-by: Peter Zijlstra
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203500.645454122@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • This macro can be converted to a static function to reduce
    object size.

    (x86-64 defconfig)
    $ size kernel/time/timer_list.o*
    text data bss dec hex filename
    6583 8 0 6591 19bf kernel/time/timer_list.o.old
    4647 8 0 4655 122f kernel/time/timer_list.o.new
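
    A user-space sketch of the conversion, assuming a hypothetical name
    (seq_printf_sketch) and stdout as a stand-in for the kernel log. A macro
    expands the branch and both print calls at every call site, whereas a
    variadic function emits them once:

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Print to the given stream when one is supplied, otherwise to stdout.
 * Returns the number of characters written, as vfprintf() does.
 */
int seq_printf_sketch(FILE *m, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vfprintf(m ? m : stdout, fmt, args);
	va_end(args);
	return n;
}
```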

    Signed-off-by: Joe Perches
    Cc: John Stultz
    Link: http://lkml.kernel.org/r/1429295958.2850.104.camel@perches.com
    Signed-off-by: Thomas Gleixner

    Joe Perches
     

01 Apr, 2015

1 commit

  • No point to expose everything to the world. People just believe
    such functions can be abused for whatever purposes. Sigh.

    Signed-off-by: Thomas Gleixner
    [ Rebased on top of 4.0-rc5 ]
    Signed-off-by: Rafael J. Wysocki
    Cc: Nicolas Pitre
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/28017337.VbCUc39Gme@vostro.rjw.lan
    [ Merged to latest timers/core ]
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

27 Mar, 2015

2 commits

  • 'enum clock_event_mode' is used for two purposes today:

    - to pass mode to the driver of clockevent device::set_mode().

    - for managing state of the device for clockevents core.

    For supporting new modes/states we have moved away from the
    legacy set_mode() callback to new per-mode/state callbacks. New
    modes/states shouldn't be exposed to the legacy (now OBSOLETE)
    callbacks and so we shouldn't add new states to 'enum
    clock_event_mode'.

    Let's have separate enums for the two use cases mentioned above.
    Keep using the earlier enum for legacy set_mode() callback and
    mark it OBSOLETE. And add another enum to clearly specify the
    possible states of a clockevent device.

    This also renames the newly added per-mode callbacks to reflect
    state changes.

    We haven't got rid of 'mode' member of 'struct
    clock_event_device' as it is used by some of the clockevent
    drivers and it would automatically die down once we migrate
    those drivers to the new interface. It ('mode') is only updated
    now for the drivers using the legacy interface.

    Suggested-by: Peter Zijlstra
    Suggested-by: Ingo Molnar
    Signed-off-by: Viresh Kumar
    Acked-by: Peter Zijlstra
    Cc: Daniel Lezcano
    Cc: Frederic Weisbecker
    Cc: Kevin Hilman
    Cc: Preeti U Murthy
    Cc: linaro-kernel@lists.linaro.org
    Cc: linaro-networking@linaro.org
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/b6b0143a8a57bd58352ad35e08c25424c879c0cb.1425037853.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
     
  • An upcoming patch will redefine possible states of a clockevent
    device. The RESUME mode is a special case only for tick's
    clockevent devices. In future it can be replaced by ->resume()
    callback already available for clockevent devices.

    Let's handle it separately so that clockevents_set_mode() only
    handles states valid across all devices. This also renames
    set_mode_resume() to tick_resume() to make it more explicit.

    Signed-off-by: Viresh Kumar
    Acked-by: Peter Zijlstra
    Cc: Daniel Lezcano
    Cc: Frederic Weisbecker
    Cc: Kevin Hilman
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: linaro-kernel@lists.linaro.org
    Cc: linaro-networking@linaro.org
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/c1b0112410870f49e7bf06958e1483eac6c15e20.1425037853.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
     

18 Feb, 2015

1 commit

  • It is not possible for the clockevents core to know which modes (other than
    those with a corresponding feature flag) are supported by a particular
    implementation. And drivers are expected to handle transition to all modes
    elegantly, as ->set_mode() would be issued for them unconditionally.

    Now, adding support for a new mode complicates things a bit if we want to use
    the legacy ->set_mode() callback. We need to closely review all clockevents
    drivers to see if they would break on addition of a new mode. And after such
    reviews, it is found that we have to do non-trivial changes to most of the
    drivers [1].

    Introduce mode-specific set_mode_*() callbacks, some of which the drivers may or
    may not implement. A missing callback would clearly convey the message that the
    corresponding mode isn't supported.

    A driver may still choose to keep supporting the legacy ->set_mode() callback,
    but ->set_mode() wouldn't be supporting any new modes beyond RESUME. If a driver
    wants to benefit from using a new mode, it would be required to migrate to
    the mode specific callbacks.

    The legacy ->set_mode() callback and the newly introduced mode-specific
    callbacks are mutually exclusive. Only one of them should be supported by the
    driver.

    A sanity check is done at the time of registration to distinguish between optional
    and required callbacks and to make error recovery and handling simpler. If the
    legacy ->set_mode() callback is provided, all mode specific ones are
    ignored by the core, but a warning is thrown if they are present.
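
    The registration check could look roughly like the sketch below. All
    names are hypothetical, and which callbacks count as required (shutdown
    always, periodic and oneshot only when the matching feature flag is set)
    is an assumption made for illustration:

```c
#include <errno.h>
#include <stddef.h>

#define FEAT_PERIODIC_SKETCH 0x1u
#define FEAT_ONESHOT_SKETCH  0x2u

/* Hypothetical device with the legacy callback and mode-specific ones. */
struct ce_dev_sketch {
	unsigned int features;
	void (*set_mode)(int mode, struct ce_dev_sketch *dev); /* legacy */
	int (*set_mode_periodic)(struct ce_dev_sketch *dev);
	int (*set_mode_oneshot)(struct ce_dev_sketch *dev);
	int (*set_mode_shutdown)(struct ce_dev_sketch *dev);
};

/* Counts warnings instead of printing them, to keep the sketch testable. */
static int warnings_sketch;

int warning_count(void)
{
	return warnings_sketch;
}

int sanity_check(const struct ce_dev_sketch *dev)
{
	int has_new = dev->set_mode_periodic || dev->set_mode_oneshot ||
		      dev->set_mode_shutdown;

	/* Legacy callback wins; mode-specific ones are ignored with a warning. */
	if (dev->set_mode) {
		if (has_new)
			warnings_sketch++;
		return 0;
	}
	/* Assumed required set for drivers using the new callbacks. */
	if (!dev->set_mode_shutdown)
		return -EINVAL;
	if ((dev->features & FEAT_PERIODIC_SKETCH) && !dev->set_mode_periodic)
		return -EINVAL;
	if ((dev->features & FEAT_ONESHOT_SKETCH) && !dev->set_mode_oneshot)
		return -EINVAL;
	return 0;
}

/* Dummy callbacks for exercising the check. */
void legacy_set_mode_sketch(int mode, struct ce_dev_sketch *dev)
{
	(void)mode;
	(void)dev;
}

int noop_mode_sketch(struct ce_dev_sketch *dev)
{
	(void)dev;
	return 0;
}
```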

    Call sites calling ->set_mode() directly are also updated to use
    __clockevents_set_mode() instead, as ->set_mode() may not be available anymore
    for a few drivers.

    [1] https://lkml.org/lkml/2014/12/9/605
    [2] https://lkml.org/lkml/2015/1/23/255

    Suggested-by: Thomas Gleixner [2]
    Signed-off-by: Viresh Kumar
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Daniel Lezcano
    Cc: Frederic Weisbecker
    Cc: John Stultz
    Cc: Kevin Hilman
    Cc: Linus Torvalds
    Cc: Preeti U Murthy
    Cc: linaro-kernel@lists.linaro.org
    Cc: linaro-networking@linaro.org
    Link: http://lkml.kernel.org/r/792d59a40423f0acffc9bb0bec9de1341a06fa02.1423788565.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
     

29 Aug, 2013

1 commit

  • Correct an issue with /proc/timer_list reported by Holger.

    When reading from the proc file with a sufficiently small buffer (2k, so
    not really that small), a reader could hang while trying to read the
    file a chunk at a time.

    The timer_list_start function failed to account for the possibility that
    the offset was adjusted outside the timer_list_next.

    Signed-off-by: Nathan Zimmer
    Reported-by: Holger Hans Peter Freyther
    Cc: John Stultz
    Cc: Thomas Gleixner
    Cc: Berke Durak
    Cc: Jeff Layton
    Tested-by: Al Viro
    Cc: # 3.10.x
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nathan Zimmer
     

18 Apr, 2013

2 commits

  • When running with 4096 cores, attempting to read /proc/timer_list will fail
    with an ENOMEM condition. On a sufficiently large system the total amount
    of data is more than 4 MB, so it won't fit into a single buffer. The
    failure can also occur on smaller systems when memory fragmentation is
    high as reported by Dave Jones.

    Convert /proc/timer_list to a proper seq_file with its own iterator. This
    is a little more complex given that we have to make two passes with two
    separate headers.

    sysrq_timer_list_show also needed to be updated to reflect the fact that
    timer_list_show now only does one CPU at a time.

    Signed-off-by: Nathan Zimmer
    Reported-by: Dave Jones
    Cc: John Stultz
    Cc: Stephen Boyd
    Link: http://lkml.kernel.org/r/1364345790-14577-3-git-send-email-nzimmer@sgi.com
    Signed-off-by: Thomas Gleixner

    Nathan Zimmer
     
  • Split timer_list_show_tickdevices() into the header printout and pull
    the rest up to timer_list_show. This is a preparatory patch for
    converting timer_list to a proper seqfile with its own iterator

    Signed-off-by: Nathan Zimmer
    Reported-by: Dave Jones
    Cc: John Stultz
    Cc: Stephen Boyd
    Link: http://lkml.kernel.org/r/1364345790-14577-2-git-send-email-nzimmer@sgi.com
    Signed-off-by: Thomas Gleixner

    Nathan Zimmer
     

12 Jun, 2012

1 commit

  • Now that the idle and nohz logic are going to be independent of each other,
    ts->idle_tick becomes too biased a name to describe the field that
    saves the last scheduled tick on top of which we re-calculate the next
    tick to schedule when the timer is restarted.

    We want to reuse this even to stop the tick outside idle cases. So let's
    rename it to some more generic name: ts->last_tick.

    This slightly changes the timer list stat export, so we need to increase its
    version.

    Signed-off-by: Frederic Weisbecker
    Cc: Alessio Igor Bogani
    Cc: Andrew Morton
    Cc: Avi Kivity
    Cc: Chris Metcalf
    Cc: Christoph Lameter
    Cc: Daniel Lezcano
    Cc: Geoff Levand
    Cc: Gilad Ben Yossef
    Cc: Hakan Akkan
    Cc: Ingo Molnar
    Cc: Kevin Hilman
    Cc: Max Krasnyansky
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Stephen Hemminger
    Cc: Steven Rostedt
    Cc: Sven-Thorsten Dietrich
    Cc: Thomas Gleixner

    Frederic Weisbecker
     

12 Feb, 2011

1 commit


11 Dec, 2010

1 commit


10 May, 2010

1 commit

  • For the ondemand cpufreq governor, it is desired that the iowait
    time is microaccounted in a similar way as idle time is.

    This patch introduces the infrastructure to account and expose
    this information via the get_cpu_iowait_time_us() function.

    [akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build]
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     

13 Mar, 2010

1 commit

  • The current logic which handles clock events programming failures can
    increase min_delta_ns without limit and can even cause overflows.

    Sanitize it by:
    - prevent zero increase when min_delta_ns == 1
    - limit min_delta_ns to a jiffy
    - bail out if the jiffy limit is hit
    - add retries stats for /proc/timer_list so we can gather data
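
    The sanitized logic above can be sketched as follows. The names, the
    HZ=250 jiffy length and the 5000 ns floor are assumptions chosen for
    illustration, not values taken from this log:

```c
#include <errno.h>
#include <stdint.h>

#define HZ_SKETCH 250 /* assumed tick rate */
#define MIN_DELTA_LIMIT_SKETCH (1000000000ULL / HZ_SKETCH) /* one jiffy in ns */
#define MIN_DELTA_FLOOR_SKETCH 5000ULL /* assumed floor, in ns */

/* Hypothetical device state; retries is the statistic exposed for debugging. */
struct ce_sketch {
	uint64_t min_delta_ns;
	unsigned long retries;
};

/* Returns 0 after raising min_delta_ns, -ETIME once the jiffy limit is hit. */
int increase_min_delta(struct ce_sketch *dev)
{
	/* Bail out: the limit has already been reached. */
	if (dev->min_delta_ns >= MIN_DELTA_LIMIT_SKETCH)
		return -ETIME;

	/* x += x >> 1 adds nothing when x == 1, so apply a floor first. */
	if (dev->min_delta_ns < MIN_DELTA_FLOOR_SKETCH)
		dev->min_delta_ns = MIN_DELTA_FLOOR_SKETCH;
	else
		dev->min_delta_ns += dev->min_delta_ns >> 1;

	/* Never grow beyond one jiffy. */
	if (dev->min_delta_ns > MIN_DELTA_LIMIT_SKETCH)
		dev->min_delta_ns = MIN_DELTA_LIMIT_SKETCH;

	dev->retries++;
	return 0;
}
```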

    Reported-by: Uwe Kleine-Koenig
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

17 Dec, 2009

1 commit


15 Dec, 2009

1 commit


10 Dec, 2009

1 commit

  • The hrtimer_interrupt hang logic adjusts min_delta_ns based on the
    execution time of the hrtimer callbacks.

    This is error-prone for virtual machines, where a guest vcpu can be
    scheduled out during the execution of the callbacks (and the callbacks
    themselves can do operations that translate to blocking operations in
    the hypervisor), which in turn can lead to a large min_delta_ns, rendering the
    system unusable.

    Replace the current heuristics with something more reliable. Allow the
    interrupt code to try 3 times to catch up with the lost time. If that
    fails use the total time spent in the interrupt handler to defer the
    next timer interrupt so the system can catch up with other things
    which got delayed. Limit that deferment to 100ms.

    The retry events and the maximum time spent in the interrupt handler
    are recorded and exposed via /proc/timer_list
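
    A simplified sketch of that heuristic, with hypothetical names
    (hang_stats_sketch, next_interrupt_delay). It only models the decision
    between retrying and deferring, plus the statistics, not the interrupt
    handler itself:

```c
#include <stdint.h>

#define MAX_RETRIES_SKETCH 3
#define MAX_DEFER_NS_SKETCH 100000000LL /* 100 ms cap on the deferment */

/* Hypothetical per-CPU statistics, as would be shown in /proc/timer_list. */
struct hang_stats_sketch {
	unsigned int nr_retries;
	unsigned int nr_hangs;
	int64_t max_hang_time_ns;
};

/*
 * Called when timers are still expired after `passes` passes of the handler.
 * Returns 0 to retry immediately, otherwise the delay in ns before the next
 * timer interrupt.
 */
int64_t next_interrupt_delay(struct hang_stats_sketch *s, int passes,
			     int64_t time_in_handler_ns)
{
	if (passes < MAX_RETRIES_SKETCH) {
		s->nr_retries++;
		return 0;
	}

	/* Give up catching up: defer by the time spent, capped at 100 ms. */
	s->nr_hangs++;
	if (time_in_handler_ns > s->max_hang_time_ns)
		s->max_hang_time_ns = time_in_handler_ns;
	return time_in_handler_ns > MAX_DEFER_NS_SKETCH ?
		MAX_DEFER_NS_SKETCH : time_in_handler_ns;
}
```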

    Inspired by a patch from Marcelo.

    Reported-by: Michael Tokarev
    Signed-off-by: Thomas Gleixner
    Tested-by: Marcelo Tosatti
    Cc: kvm@vger.kernel.org

    Thomas Gleixner
     

14 Nov, 2009

2 commits

  • In the dynamic tick code, "max_delta_ns" (member of the
    "clock_event_device" structure) represents the maximum sleep time
    that can occur between timer events in nanoseconds.

    The variable, "max_delta_ns", is defined as an unsigned long
    which is a 32-bit integer for 32-bit machines and a 64-bit
    integer for 64-bit machines (if -m64 option is used for gcc).
    The value of max_delta_ns is set by calling the function
    "clockevent_delta2ns()" which returns a maximum value of LONG_MAX.
    For a 32-bit machine LONG_MAX is equal to 0x7fffffff and in
    nanoseconds this equates to ~2.15 seconds. Hence, the maximum
    sleep time for a 32-bit machine is ~2.15 seconds, whereas for
    a 64-bit machine it will be many years.
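
    The arithmetic can be checked with a few lines of C. The helper names
    are hypothetical; INT32_MAX and INT64_MAX stand in for LONG_MAX on
    32-bit and 64-bit machines respectively:

```c
#include <stdint.h>

#define NSEC_PER_MSEC_SKETCH 1000000LL
#define NSEC_PER_SEC_SKETCH  1000000000LL
#define SEC_PER_YEAR_SKETCH  (365LL * 24 * 3600)

/* 0x7fffffff ns expressed in whole milliseconds: 2147 ms, roughly 2.15 s. */
int64_t max_sleep_ms_32bit(void)
{
	return INT32_MAX / NSEC_PER_MSEC_SKETCH;
}

/* The 64-bit limit expressed in whole 365-day years. */
int64_t max_sleep_years_64bit(void)
{
	return INT64_MAX / NSEC_PER_SEC_SKETCH / SEC_PER_YEAR_SKETCH;
}
```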

    This patch changes the type of max_delta_ns to be "u64" instead of
    "unsigned long" so that this variable is a 64-bit type for both 32-bit
    and 64-bit machines. It also changes the maximum value returned by
    clockevent_delta2ns() to KTIME_MAX. Hence this allows a 32-bit
    machine to sleep for longer than ~2.15 seconds. Please note that this
    patch also changes "min_delta_ns" to be "u64" too and although this is
    unnecessary, it makes the patch simpler as it avoids having to fix up all
    callers of clockevent_delta2ns().

    [ tglx: changed "unsigned long long" to u64 as we use this data type
    throughout the time code ]

    Signed-off-by: Jon Hunter
    Cc: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Jon Hunter
     
  • The mult and shift factors of clock events differ in their data type
    from those of clock sources for no reason. u32 is sufficient for
    both. shift is always
    Tested-by: Mikael Pettersson
    Acked-by: Ralf Baechle
    Acked-by: Linus Walleij
    Cc: John Stultz
    LKML-Reference:

    Thomas Gleixner
     

02 Oct, 2009

1 commit