06 Dec, 2011

1 commit


22 Nov, 2011

2 commits


18 Nov, 2011

1 commit

  • ktime_get and ktime_get_ts were calling timekeeping_get_ns()
    but later they were not calling arch_gettimeoffset() so architectures
    using this mechanism returned 0 ns when calling these functions.

    This happened for example when running Busybox's ping which calls
    syscall(__NR_clock_gettime, CLOCK_MONOTONIC, ts) which eventually
    calls ktime_get. As a result the returned ping travel time was zero.

    CC: stable@kernel.org
    Signed-off-by: Hector Palacios
    Signed-off-by: John Stultz

    Hector Palacios
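
The failure mode in the commit above can be modeled in user space. This is only a sketch under stated assumptions, not the kernel code: `clocksource_ns()` and `arch_offset_ns()` are hypothetical stand-ins for timekeeping_get_ns() and arch_gettimeoffset(), and the 250000 ns offset is an arbitrary illustrative value.

```c
#include <stdint.h>

/* Hypothetical stand-in for timekeeping_get_ns(): on the affected
 * architectures the clocksource contributes nothing between ticks. */
static int64_t clocksource_ns(void) { return 0; }

/* Hypothetical stand-in for arch_gettimeoffset(): time since the last
 * tick as reported by the architecture code (arbitrary value here). */
static int64_t arch_offset_ns(void) { return 250000; }

/* Before the fix: the architecture offset is never added. */
int64_t elapsed_ns_before_fix(void)
{
    return clocksource_ns();
}

/* After the fix: both contributions are summed. */
int64_t elapsed_ns_after_fix(void)
{
    return clocksource_ns() + arch_offset_ns();
}
```

In this model a caller timing a short interval sees 0 ns before the fix and the architecture-provided offset after it, which matches the zero ping time described in the message.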
     

11 Nov, 2011

2 commits

  • …tz/linux into timers/core

    Conflicts:
    kernel/time/timekeeping.c

    Ingo Molnar
     
  • For some frequencies, the clocks_calc_mult_shift() function will
    unfortunately select mult values very close to 0xffffffff. This
    has the potential to overflow when NTP adjusts the clock, adding
    to the mult value.

    This patch adds a clocksource.maxadj value, which provides
    an approximation of an 11% adjustment (NTP limits adjustments to
    500ppm and the tick adjustment is limited to 10%), which could
    be made to the clocksource.mult value. This is then used to both
    check that the current mult value won't overflow/underflow, as
    well as warning us if the timekeeping_adjust() code pushes over
    that 11% boundary.

    v2: Fix max_adjustment calculation, and improve WARN_ONCE
    messages.

    v3: Don't warn before maxadj has actually been set

    CC: Yong Zhang
    CC: David Daney
    CC: Thomas Gleixner
    CC: Chen Jie
    CC: zhangfx
    CC: stable@kernel.org
    Reported-by: Chen Jie
    Reported-by: zhangfx
    Tested-by: Yong Zhang
    Signed-off-by: John Stultz

    John Stultz
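
The maxadj check described in the commit above can be sketched in user space as follows. This is a minimal model and not the patch itself: `struct clocksource_model`, `max_adjustment()` and `fit_mult_to_maxadj()` are hypothetical names, and halving mult while decrementing shift is one assumed way to recover headroom.

```c
#include <stdint.h>

struct clocksource_model {      /* hypothetical, not the kernel struct */
    uint32_t mult;
    uint32_t shift;
    uint32_t maxadj;
};

/* Roughly 11% of mult, computed in 64 bits so mult * 11 cannot wrap. */
static uint32_t max_adjustment(uint32_t mult)
{
    return (uint32_t)(((uint64_t)mult * 11) / 100);
}

/* Halve mult (and drop one from shift) until mult +/- maxadj fits in
 * 32 bits. Each halving keeps the mult / 2^shift ratio, apart from the
 * bit that is shifted out. */
void fit_mult_to_maxadj(struct clocksource_model *cs)
{
    cs->maxadj = max_adjustment(cs->mult);
    while (cs->shift > 0 &&
           ((uint32_t)(cs->mult + cs->maxadj) < cs->mult ||    /* overflow  */
            (uint32_t)(cs->mult - cs->maxadj) > cs->mult)) {   /* underflow */
        cs->mult >>= 1;
        cs->shift--;
        cs->maxadj = max_adjustment(cs->mult);
    }
}
```

With mult = 0xffffffff and shift = 32, one halving is enough: mult becomes 0x7fffffff, shift becomes 31, and mult + maxadj no longer wraps.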
     

28 Oct, 2011

1 commit

  • After getting a number of questions in private emails about the
    math around the admittedly very complex timekeeping_adjust() and
    timekeeping_big_adjust(), I figure the code needs some better
    comments.

    Hopefully the explanations are clear enough and don't muddy the
    water any worse.

    Still needs documentation for ntp_error, but I couldn't recall
    exactly the full explanation behind the code that's there
    (although I do recall once working it out when Roman first
    proposed it). Given a bit more time I can probably work it out,
    but I don't want to hold back this documentation until then.

    Signed-off-by: John Stultz
    Cc: Chen Jie
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1319764362-32367-1-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar

    John Stultz
     

21 Jul, 2011

1 commit

  • Terribly embarrassing. Don't know how I committed this, but it's
    KERN_WARNING, not KERN_WARN.

    This fixes the following compile error:
    kernel/time/timekeeping.c: In function ‘__timekeeping_inject_sleeptime’:
    kernel/time/timekeeping.c:608: error: ‘KERN_WARN’ undeclared (first use in this function)
    kernel/time/timekeeping.c:608: error: (Each undeclared identifier is reported only once
    kernel/time/timekeeping.c:608: error: for each function it appears in.)
    kernel/time/timekeeping.c:608: error: expected ‘)’ before string constant
    make[2]: *** [kernel/time/timekeeping.o] Error 1

    Reported-by: Ingo Molnar
    Signed-off-by: John Stultz

    John Stultz
     

22 Jun, 2011

2 commits

  • Because the read_persistent_clock interface is usually backed by
    only a second granular interface, each time we read from the persistent
    clock for suspend/resume, we introduce a half second (on average) of error.

    In order to avoid this error accumulating as the system is suspended
    over and over, this patch measures the time delta between the persistent
    clock and the system CLOCK_REALTIME.

    If the delta is less than 2 seconds from the last suspend, we compensate
    by using the previous time delta (keeping it close). If it is larger
    than 2 seconds, we assume the clock was set or has been changed, so we
    do no correction and update the delta.

    Note: If NTP is running, this could seem to "fight" with the NTP corrected
    time: if the system time was off by 1 second and NTP slewed the
    value in, a suspend/resume cycle could undo this correction by trying to
    restore the previous offset from the persistent clock. However, without
    this patch, since each read could cause almost a full second worth of
    error, it's possible to get almost 2 seconds of error just from the
    suspend/resume cycle alone, so this is about equal to any offset added by
    the compensation.

    Further, on systems that suspend/resume frequently, this should keep time
    closer than NTP could compensate for if the errors were allowed to
    accumulate.

    Credits to Arve Hjønnevåg for suggesting this solution.

    CC: Arve Hjønnevåg
    CC: Thomas Gleixner
    Signed-off-by: John Stultz

    John Stultz
     
  • Arve suggested making sure we catch possible negative sleep time
    intervals that could be passed into timekeeping_inject_sleeptime.

    CC: Arve Hjønnevåg
    CC: Thomas Gleixner
    Signed-off-by: John Stultz

    John Stultz
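
The two 22 Jun changes can be modeled together in user space. The sketch below is an assumption-heavy illustration, not the kernel implementation: times are plain nanosecond counts, `struct suspend_state`, `record_suspend()` and `sleep_interval_is_valid()` are hypothetical names, and the 2 second threshold is compared in nanoseconds for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

struct suspend_state {            /* hypothetical model state */
    int64_t old_delta_ns;         /* realtime - persistent at the last suspend */
    int64_t suspend_time_ns;      /* timestamp that resume will measure against */
};

/* Called at suspend with the coarse persistent-clock reading and the
 * current CLOCK_REALTIME value, both in nanoseconds. */
void record_suspend(struct suspend_state *st,
                    int64_t persistent_ns, int64_t realtime_ns)
{
    int64_t delta = realtime_ns - persistent_ns;
    int64_t delta_delta = delta - st->old_delta_ns;

    st->suspend_time_ns = persistent_ns;
    if (delta_delta >= 2 * NSEC_PER_SEC || delta_delta <= -2 * NSEC_PER_SEC) {
        /* The clock was probably set: accept the new delta, no correction. */
        st->old_delta_ns = delta;
    } else {
        /* Small drift: shift the suspend timestamp so that the previous
         * delta is preserved across the suspend/resume cycle. */
        st->suspend_time_ns += delta_delta;
    }
}

/* Model of the second change: a negative sleep interval is treated as
 * invalid and would be ignored by the caller (an assumption here). */
bool sleep_interval_is_valid(int64_t sleep_ns)
{
    return sleep_ns >= 0;
}
```

For example, with a previous delta of 10 s, a persistent reading of 100 s and a realtime reading of 110.4 s, the model moves the suspend timestamp to 100.4 s and keeps the 10 s delta.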
     

03 May, 2011

2 commits

  • Some applications must be aware of clock realtime being set
    backward. A simple example is a clock applet which arms a timer for
    the next minute display. If clock realtime is set backward then the
    applet displays a stale time for the amount of time which the clock
    was set backwards. As a result, applications poll the time because we
    don't have an interface to notify them.

    Extend the timerfd interface by adding a flag which puts the timer
    onto a different internal realtime clock. All timers on this clock are
    expired whenever the clock is set.

    The timerfd core records the monotonic offset when the timer is
    created. When the timer is armed, then the current offset is compared
    to the previous recorded offset. When it has changed, then
    timerfd_settime returns -ECANCELED. When a timer is read the offset is
    compared and if it changed -ECANCELED is returned to user space. Periodic
    timers are not rearmed in the cancellation case.

    Signed-off-by: Thomas Gleixner
    Acked-by: John Stultz
    Cc: Chris Friesen
    Tested-by: Kay Sievers
    Cc: "Kirill A. Shutemov"
    Cc: Peter Zijlstra
    Cc: Davide Libenzi
    Reviewed-by: Alexander Shishkin
    Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104271359580.3323%40ionos%3E
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Make clock_was_set() unconditional and rename hres_timers_resume to
    hrtimers_resume. This is a preparatory patch for hrtimers which are
    cancelled when clock realtime was set.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
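
The offset comparison described in the timerfd commit above can be modeled as follows. This is a user-space sketch, not the timerfd code: every name is hypothetical, the global `monotonic_offset_ns` stands in for the offset kept by the timekeeping core, and re-recording the offset after reporting a cancellation is an assumption of the model.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the realtime-to-monotonic offset; it
 * changes whenever CLOCK_REALTIME is set. */
static int64_t monotonic_offset_ns;

struct timer_model {
    int64_t recorded_offset_ns;   /* offset seen when last recorded */
};

void timer_create_model(struct timer_model *t)
{
    t->recorded_offset_ns = monotonic_offset_ns;
}

/* Shared check for arming and reading: report -ECANCELED if the clock
 * was set since the offset was recorded, then record the new offset. */
static int check_cancelled(struct timer_model *t)
{
    if (t->recorded_offset_ns != monotonic_offset_ns) {
        t->recorded_offset_ns = monotonic_offset_ns;
        return -ECANCELED;
    }
    return 0;
}

int timer_settime_model(struct timer_model *t) { return check_cancelled(t); }
int timer_read_model(struct timer_model *t)    { return check_cancelled(t); }

/* Models a call that steps CLOCK_REALTIME by step_ns. */
void clock_set_model(int64_t step_ns)
{
    monotonic_offset_ns += step_ns;
}
```

A clock applet built on this model would treat -ECANCELED from a read as a signal to re-read the wall time and re-arm its timer, instead of polling.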
     

27 Apr, 2011

1 commit

  • Some platforms cannot implement read_persistent_clock, as
    their RTC devices are only accessible when interrupts are enabled.
    This keeps them from being used by the timekeeping code on resume
    to measure the time in suspend.

    The RTC layer tries to work around this, by calling do_settimeofday
    on resume after irqs are reenabled to set the time properly. However,
    this only corrects CLOCK_REALTIME, and does not properly adjust
    the sleep time value. This causes btime in /proc/stat to be incorrect
    as well as making the new CLOCK_BOOTTIME inaccurate.

    This patch resolves the issue by introducing a new timekeeping hook
    to allow the RTC layer to inject the sleep time on resume.

    The code also checks to make sure that read_persistent_clock is
    nonfunctional before setting the sleep time, so that should the RTC's
    HCTOSYS option be configured in on a system that does support
    read_persistent_clock we will not increase the total_sleep_time twice.

    CC: Arve Hjønnevåg
    CC: Thomas Gleixner
    Acked-by: Arnd Bergmann
    Signed-off-by: John Stultz

    John Stultz
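
The double-accounting guard described in the commit above can be sketched as a small model. The names below are hypothetical and times are plain nanosecond counts; the real hook operates on the kernel's timekeeping state.

```c
#include <stdbool.h>
#include <stdint.h>

struct timekeeper_model {              /* hypothetical, nanosecond units */
    int64_t realtime_ns;               /* models CLOCK_REALTIME */
    int64_t total_sleep_ns;            /* models total_sleep_time */
    bool persistent_clock_works;       /* read_persistent_clock is functional */
};

/* Hook the RTC layer would call on resume with the measured sleep time.
 * Returns true if the time was injected. */
bool inject_sleeptime(struct timekeeper_model *tk, int64_t sleep_ns)
{
    /* If the persistent clock works, the resume path has already
     * accounted for the sleep time; injecting again would count it twice. */
    if (tk->persistent_clock_works)
        return false;

    tk->realtime_ns += sleep_ns;
    tk->total_sleep_ns += sleep_ns;
    return true;
}
```

Updating both fields together is the point of the hook: setting only the realtime value, as do_settimeofday does, leaves the sleep time total stale.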
     

24 Mar, 2011

1 commit

  • The timekeeping subsystem uses a sysdev class and a sysdev for
    executing timekeeping_suspend() after interrupts have been turned off
    on the boot CPU (during system suspend) and for executing
    timekeeping_resume() before turning on interrupts on the boot CPU
    (during system resume). However, since both of these functions
    ignore their arguments, the entire mechanism may be replaced with a
    struct syscore_ops object which is simpler.

    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Thomas Gleixner

    Rafael J. Wysocki
     

22 Feb, 2011

2 commits


02 Feb, 2011

2 commits

  • This adds a kernel-internal timekeeping interface to add or subtract
    a fixed amount from CLOCK_REALTIME. This makes it so kernel users or
    interfaces trying to do so do not have to read the time, then add an
    offset and then call settimeofday(), which adds some extra error in
    comparison to just simply adding the offset in the kernel timekeeping
    core.

    Signed-off-by: John Stultz
    Signed-off-by: Richard Cochran
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
     
  • Both settimeofday() and clock_settime() promise with a 'const'
    attribute not to alter the arguments passed in. This patch adds the
    missing 'const' attribute into the various kernel functions
    implementing these calls.

    Signed-off-by: Richard Cochran
    Acked-by: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Richard Cochran
     

31 Jan, 2011

4 commits

  • xtime_update() takes xtime_lock write locked and calls
    do_timer(). Provided to replace the do_timer() calls in the
    architecture code.

    Signed-off-by: Torben Hohn
    Cc: Peter Zijlstra
    Cc: johnstul@us.ibm.com
    Cc: yong.zhang0@gmail.com
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Torben Hohn
     
  • No users left. Remove it.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The hrtimer code accesses timekeeping variables under
    xtime_lock. Provide a sensible accessor function and use it.

    [ tglx: Removed the conditionals, unused variable, fixed codingstyle
    and massaged changelog ]

    Signed-off-by: Torben Hohn
    Cc: Peter Zijlstra
    Cc: johnstul@us.ibm.com
    Cc: yong.zhang0@gmail.com
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Torben Hohn
     
  • do_timer() is primarily timekeeping related. calc_global_load() is
    called from do_timer() as well, but that's more for historical
    reasons.

    [ tglx: Fixed up the calc_global_load() reject and massaged changelog ]

    Signed-off-by: Torben Hohn
    Cc: Peter Zijlstra
    Cc: johnstul@us.ibm.com
    Cc: yong.zhang0@gmail.com
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Torben Hohn
     

16 Jan, 2011

1 commit

  • …linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    rcu: avoid pointless blocked-task warnings
    rcu: demote SRCU_SYNCHRONIZE_DELAY from kernel-parameter status
    rtmutex: Fix comment about why new_owner can be NULL in wake_futex_pi()

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, olpc: Add missing Kconfig dependencies
    x86, mrst: Set correct APB timer IRQ affinity for secondary cpu
    x86: tsc: Fix calibration refinement conditionals to avoid divide by zero
    x86, ia64, acpi: Clean up x86-ism in drivers/acpi/numa.c

    * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    timekeeping: Make local variables static
    time: Rename misnamed minsec argument of clocks_calc_mult_shift()

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tracing: Remove syscall_exit_fields
    tracing: Only process module tracepoints once
    perf record: Add "nodelay" mode, disabled by default
    perf sched: Fix list of events, dropping unsupported ':r' modifier
    Revert "perf tools: Emit clearer message for sys_perf_event_open ENOENT return"
    perf top: Fix annotate segv
    perf evsel: Fix order of event list deletion

    Linus Torvalds
     

14 Jan, 2011

1 commit

  • MONOTONIC_RAW clock timestamps are ideally suited for frequency
    calculation and also fit well into the original NTP hardpps design. Now
    phase and frequency can be adjusted separately: the former based on
    REALTIME clock and the latter based on MONOTONIC_RAW clock.

    A new function getnstime_raw_and_real is added to timekeeping subsystem to
    capture both timestamps at the same time and atomically.

    Signed-off-by: Alexander Gordeev
    Acked-by: John Stultz
    Cc: Rodolfo Giometti
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Gordeev
     

12 Jan, 2011

1 commit


21 Oct, 2010

1 commit

  • When the clocksource frequency is not a multiple of HZ, the clock will
    be off. For acpi_pm with HZ=1000 the error is 127.111 ppm.

    The rounding of cycle_interval ends up generating a false error term in
    ntp_error accumulation since xtime_interval is not exactly 1/HZ. So, we
    subtract out the error caused by the rounding.

    This has been visible since 2.6.32-rc2
    commit a092ff0f90cae22b2ac8028ecd2c6f6c1a9e4601
    time: Implement logarithmic time accumulation
    That commit raised NTP_INTERVAL_FREQ and exposed the rounding error.

    testing tool: http://n1.taur.dk/permanent/testpmt.c
    Also tested with ntpd and a frequency counter.

    Signed-off-by: Kasper Pedersen
    Acked-by: john stultz
    Cc: John Kacur
    Cc: Clark Williams
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Kasper Pedersen
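
The 127.111 ppm figure in the commit above can be reproduced with simple arithmetic, assuming the nominal ACPI PM timer frequency of 3579545 Hz (an assumption; the message does not state it). The sketch rounds the cycles per tick to the nearest integer, which simplifies how the kernel derives cycle_interval from mult and shift. The function name is hypothetical.

```c
#include <stdint.h>

/* Relative error, in parts per billion, caused by rounding the number
 * of clocksource cycles per tick to the nearest integer. */
int64_t tick_rounding_error_ppb(int64_t clock_hz, int64_t hz)
{
    int64_t cycles_per_tick = (clock_hz + hz / 2) / hz;   /* rounded */
    int64_t counted_per_sec = cycles_per_tick * hz;       /* cycles assumed per second */

    return (counted_per_sec - clock_hz) * 1000000000LL / clock_hz;
}
```

For 3579545 Hz and HZ=1000 the tick is rounded from 3579.545 to 3580 cycles, an excess of 455 cycles per second, and 455 / 3579545 is about 127.111 ppm.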
     

14 Aug, 2010

1 commit

  • Early 4.3 versions of gcc apparently aggressively optimize the raw
    time accumulation loop, replacing it with a divide.

    On 32bit systems, this causes the following link errors:
    undefined reference to `__umoddi3'
    undefined reference to `__udivdi3'

    The gcc issue has been fixed in 4.4 and greater.

    This patch replaces the accumulation loop with a do_div, as suggested
    by Linus.

    Signed-off-by: John Stultz
    CC: Jason Wessel
    CC: Larry Finger
    CC: Ingo Molnar
    CC: Linus Torvalds
    Signed-off-by: Linus Torvalds

    John Stultz
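
The change described above swaps a subtraction loop for an explicit division. A user-space sketch of the two equivalent forms follows; the names are hypothetical, and plain `/` and `%` are used here in place of the kernel's do_div helper, since user space can link the 64-bit division routines that the kernel lacks on 32-bit systems.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

struct raw_time { uint64_t sec; uint64_t nsec; };

/* Loop form: the shape an optimizer may turn into a 64-bit divide. */
struct raw_time normalize_loop(uint64_t sec, uint64_t nsec)
{
    while (nsec >= NSEC_PER_SEC) {
        nsec -= NSEC_PER_SEC;
        sec++;
    }
    return (struct raw_time){ sec, nsec };
}

/* Explicit-division form: one divide and one remainder, no loop. */
struct raw_time normalize_div(uint64_t sec, uint64_t nsec)
{
    return (struct raw_time){ sec + nsec / NSEC_PER_SEC,
                              nsec % NSEC_PER_SEC };
}
```

Both functions return the same result for any input; the difference that matters in the kernel is which division routine ends up being referenced at link time.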
     

13 Aug, 2010

1 commit

  • The tv_nsec is a long and when added to the shifted interval it can wrap
    and become negative which later causes looping problems in the
    getrawmonotonic(). The edge case occurs when the system has slept for
    a short period of time of ~2 seconds.

    A trace printk of the values in this patch illustrates the problem:

    ftrace time stamp: log
    43.716079: logarithmic_accumulation: raw: 3d0913 tv_nsec d687faa
    43.718513: logarithmic_accumulation: raw: 3d0913 tv_nsec da588bd
    43.722161: logarithmic_accumulation: raw: 3d0913 tv_nsec de291d0
    46.349925: logarithmic_accumulation: raw: 7a122600 tv_nsec e1f9ae3
    46.349930: logarithmic_accumulation: raw: 1e848980 tv_nsec 8831c0e3

    The kernel starts looping at 46.349925 in the getrawmonotonic() due to
    the negative value from adding the raw value to tv_nsec.

    A simple solution is to accumulate into a u64, and then normalize it
    to a timespec_t.

    Signed-off-by: Jason Wessel
    [ Reworked variable names and simplified some of the code. - John ]
    Signed-off-by: John Stultz
    Cc: Thomas Gleixner
    Cc: H. Peter Anvin
    Signed-off-by: Linus Torvalds

    Jason Wessel
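
The wrap shown in the trace above can be reproduced with the two values from its fourth line (raw 7a122600, tv_nsec e1f9ae3). The sketch below models a 32-bit signed tv_nsec and the 64-bit accumulation fix; function and struct names are hypothetical, and the signed reinterpretation assumes a two's complement platform.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Models the old behaviour with a 32-bit signed tv_nsec: the sum is
 * formed modulo 2^32 and reinterpreted as signed. */
int32_t add_into_32bit_nsec(uint32_t tv_nsec, uint32_t raw)
{
    return (int32_t)(tv_nsec + raw);
}

struct raw_time { uint64_t sec; uint64_t nsec; };

/* Models the fix: accumulate in 64 bits, then normalize. */
struct raw_time add_into_u64(uint64_t sec, uint32_t tv_nsec, uint32_t raw)
{
    uint64_t nsecs = (uint64_t)tv_nsec + raw;

    while (nsecs >= NSEC_PER_SEC) {
        nsecs -= NSEC_PER_SEC;
        sec++;
    }
    return (struct raw_time){ sec, nsecs };
}
```

The 32-bit sum is 0x8831c0e3, the value in the last trace line, which is negative when read as a signed 32-bit number. The 64-bit path yields 2 whole seconds plus 284962019 ns.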
     

27 Jul, 2010

5 commits

  • This patch makes xtime and wall_to_monotonic static, as planned in
    Documentation/feature-removal-schedule.txt. This will allow for
    further cleanups to the timekeeping core.

    Signed-off-by: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
     
  • Provides an accessor function to replace hrtimer.c's
    direct access of wall_to_monotonic.

    This will allow wall_to_monotonic to be made static as
    planned in Documentation/feature-removal-schedule.txt

    Signed-off-by: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
     
  • update_vsyscall() did not provide the wall_to_monotonic offset,
    so arch specific implementations tend to reference wall_to_monotonic
    directly. This limits future cleanups in the timekeeping core, so
    this patch fixes the update_vsyscall interface to provide
    wall_to_monotonic, allowing wall_to_monotonic to be made static
    as planned in Documentation/feature-removal-schedule.txt

    Signed-off-by: John Stultz
    Cc: Martin Schwidefsky
    Cc: Anton Blanchard
    Cc: Paul Mackerras
    Cc: Tony Luck
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
     
  • Now that all arches have been converted over to use generic time via
    clocksources or arch_gettimeoffset(), we can remove the GENERIC_TIME
    config option and simplify the generic code.

    Signed-off-by: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
     
  • After accidentally misusing timespec_add_safe, I wanted to make sure
    we don't accidentally trip over that issue again, so I created a simple
    timespec_add() function which we can use to replace the instances
    of timespec_add_safe() that don't want the overflow detection.

    Signed-off-by: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
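
A plain timespec addition of the kind described in the last commit above might look like this in user space. It is a sketch with a hypothetical name (`ts_add`), assuming both inputs are already normalized, and it deliberately performs no overflow detection.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical user-space analogue of a plain timespec addition. Both
 * inputs are assumed normalized (0 <= tv_nsec < NSEC_PER_SEC), so at
 * most one carry is needed. No overflow detection is attempted. */
struct timespec ts_add(struct timespec a, struct timespec b)
{
    struct timespec sum;

    sum.tv_sec = a.tv_sec + b.tv_sec;
    sum.tv_nsec = a.tv_nsec + b.tv_nsec;
    if (sum.tv_nsec >= NSEC_PER_SEC) {
        sum.tv_nsec -= NSEC_PER_SEC;
        sum.tv_sec++;
    }
    return sum;
}
```

Callers that need saturation on overflow would still use a checked variant; this form is for the common case where the operands are known to be small.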
     

10 May, 2010

1 commit


13 Apr, 2010

1 commit

  • With the earlier logarithmic time accumulation patch, xtime will now
    always be within one "tick" of the current time, instead of possibly
    half a second off.

    This removes the need for the xtime_cache value, which always stored the
    time at the last interrupt, so this patch cleans that up, removing the
    xtime_cache related code.

    This patch also addresses an issue with an earlier version of this change,
    where xtime_cache was normalizing xtime, which could in some cases be
    not valid (i.e. tv_nsec == NSEC_PER_SEC). This is fixed by handling
    the edge case in update_wall_time().

    Signed-off-by: John Stultz
    Cc: Petr Titěra
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
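
The edge case mentioned above (tv_nsec == NSEC_PER_SEC after rounding up) can be illustrated with a small model. The names are hypothetical and the struct only stands in for xtime; the sketch assumes the accumulated value is kept as nanoseconds shifted left by `shift` bits.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

struct wall_time { int64_t sec; int64_t nsec; };   /* hypothetical model of xtime */

/* Store shifted nanoseconds into wall time, rounding the
 * sub-nanosecond remainder up, then carry into seconds if the rounding
 * pushed nsec to a full second. */
void store_rounded_up(struct wall_time *xt, uint64_t shifted_nsec,
                      unsigned shift)
{
    xt->nsec = (int64_t)(shifted_nsec >> shift) + 1;   /* round up */
    if (xt->nsec >= NSEC_PER_SEC) {                     /* edge case */
        xt->nsec -= NSEC_PER_SEC;
        xt->sec++;
    }
}
```

Without the carry, an input of 999999999 ns plus a fraction would be stored as exactly 1000000000 ns, which is not a valid nanosecond field.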
     

23 Mar, 2010

1 commit

  • The logarithmic accumulation done in the timekeeping has some overflow
    protection that limits the max shift value. That means it will take
    more than shift loops to accumulate all of the cycles. This causes
    the shift decrement to underflow, which causes the loop to never exit.

    The simplest fix would be simply to do a:
        if (shift)
            shift--;

    However that is not optimal: since we know the cycle offset is larger
    than the interval << shift, the above would make shift drop to zero,
    and then we would be spinning for quite a while, accumulating one
    interval chunk at a time.

    Instead, this patch only decreases shift if the offset is smaller
    than cycle_interval << shift. This makes sure we accumulate using
    the largest chunks possible without overflowing tick_length, and limits
    the number of iterations through the loop.

    This issue was found and reported by Sonic Zhang, who also tested the fix.
    Many thanks for your explanation and testing!

    Reported-by: Sonic Zhang
    Signed-off-by: John Stultz
    Tested-by: Sonic Zhang
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    John Stultz
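
The loop behaviour described above can be sketched in user space. This is a simplified model with hypothetical names, not the kernel loop: it only counts whole intervals, assumes interval is greater than zero, and assumes the caller passes a shift that was already clamped to the maximum allowed value.

```c
#include <stdint.h>

/* Accumulate whole intervals out of offset in power-of-two chunks.
 * Returns the leftover cycles; *ticks receives the interval count.
 * interval must be greater than zero. */
uint64_t accumulate(uint64_t offset, uint64_t interval, int shift,
                    uint64_t *ticks)
{
    *ticks = 0;
    while (offset >= interval) {
        uint64_t chunk = interval << shift;

        if (offset >= chunk) {
            offset -= chunk;
            *ticks += 1ULL << shift;
        }
        /* Step down only once the current chunk no longer fits, and
         * never below zero. */
        if (shift > 0 && offset < chunk)
            shift--;
    }
    return offset;
}
```

With offset 3705, interval 100 and a clamped shift of 3, the model takes four chunks of 800, one of 400 and one of 100, finishing in seven iterations instead of thirty-seven single-interval steps.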
     

02 Mar, 2010

1 commit


10 Feb, 2010

1 commit


05 Feb, 2010

1 commit

  • Add a clocksource suspend callback. This callback can be used by the
    clocksource driver to shut down and perform any kind of late suspend
    activities even though the clocksource driver itself is a non-sysdev
    driver.

    One example where this is useful is to fix the sh_cmt.c platform driver
    that today suspends using the platform bus and shuts down the clocksource
    too early.

    With this callback in place the sh_cmt driver will suspend using the
    clocksource and clockevent hooks and leave the platform device pm
    callbacks unused.

    Signed-off-by: Magnus Damm
    Cc: Paul Mundt
    Cc: john stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Magnus Damm
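
An optional suspend callback of the kind described above can be modeled as a nullable function pointer. The sketch below uses hypothetical names throughout and a plain array in place of the kernel's clocksource list.

```c
#include <stddef.h>

struct clocksource_model {                            /* hypothetical */
    const char *name;
    void (*suspend)(struct clocksource_model *cs);    /* optional, may be NULL */
    int suspended;
};

/* Example callback a driver might install. */
void mark_suspended(struct clocksource_model *cs)
{
    cs->suspended = 1;
}

/* Late-suspend step: call the callback of every clocksource that
 * provides one and skip the rest. */
void suspend_all(struct clocksource_model *list[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (list[i]->suspend)
            list[i]->suspend(list[i]);
}
```

Drivers that need no late suspend work leave the pointer NULL, so existing clocksources are unaffected by the new hook.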
     

23 Dec, 2009

1 commit

  • This reverts commit 7bc7d637452383d56ba4368d4336b0dde1bb476d, as
    requested by John Stultz. Quoting John:

    "Petr Titěra reported an issue where he saw odd atime regressions with
    2.6.33 where there were a full second worth of nanoseconds in the
    nanoseconds field.

    He also reviewed the time code and narrowed down the problem: unhandled
    overflow of the nanosecond field caused by rounding up the
    sub-nanosecond accumulated time.

    Details:

    * At the end of update_wall_time(), we currently round up the
    sub-nanosecond portion of accumulated time when storing it into xtime.
    This was added to avoid time inconsistencies caused when the
    sub-nanosecond portion was truncated when storing into xtime.
    Unfortunately we don't handle the possible second overflow caused by
    that rounding.

    * Previously the xtime_cache code hid this overflow by normalizing the
    xtime value when storing into the xtime_cache.

    * We could try to handle the second overflow after the rounding up, but
    since this affects the timekeeping's internal state, this would further
    complicate the next accumulation cycle, causing small errors in ntp
    steering. As much as I'd like to get rid of it, the xtime_cache code is
    known to work.

    * The correct fix is really to include the sub-nanosecond portion in the
    timekeeping accessor function, so we don't need to round up during
    accumulation. This would greatly simplify the accumulation code.
    Unfortunately, we can't do this safely until the last three
    non-GENERIC_TIME arches (sparc32, arm, cris) are converted (those
    patches are in -mm) and we kill off the spots where arches set xtime
    directly. This is all 2.6.34 material, so I think reverting the
    xtime_cache change is the best approach for now.

    Many thanks to Petr for both reporting and finding the issue!"

    Reported-by: Petr Titěra
    Requested-by: john stultz
    Cc: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Linus Torvalds