06 Dec, 2011

2 commits

  • The expiry function compares the timer against current time and does
    not expire the timer when the expiry time is >= now. That's wrong. If
    the timer is set for now, then it must expire.

    Make the condition expiry > now for breaking out of the loop.

    Signed-off-by: Thomas Gleixner
    Acked-by: John Stultz
    Cc: stable@kernel.org

    Thomas Gleixner
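
The corrected break condition can be illustrated with a small userspace sketch. The names `sketch_timer` and `run_expired` are hypothetical and not taken from the kernel source; the queue is assumed to be sorted by expiry time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timer record; the queue is assumed sorted by expiry. */
struct sketch_timer {
    int64_t expires_ns;
    int fired;
};

/*
 * Fire every timer whose expiry is at or before now_ns.
 * Returns the number of timers fired.
 */
size_t run_expired(struct sketch_timer *queue, size_t count, int64_t now_ns)
{
    size_t fired = 0;

    for (size_t i = 0; i < count; i++) {
        /* Break only when the expiry is strictly in the future. */
        if (queue[i].expires_ns > now_ns)
            break;
        queue[i].fired = 1;
        fired++;
    }
    return fired;
}
```

With the old `>=` comparison, a timer set for exactly `now_ns` would have stopped the loop and stayed queued until a later pass.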
     
  • * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    clockevents: Set noop handler in clockevents_exchange_device()
    tick-broadcast: Stop active broadcast device when replacing it
    clocksource: Fix bug with max_deferment margin calculation
    rtc: Fix some bugs that allowed accumulating time drift in suspend/resume
    rtc: Disable the alarm in the hardware

    Linus Torvalds
     

02 Dec, 2011

3 commits


29 Nov, 2011

1 commit


18 Nov, 2011

1 commit

  • ktime_get and ktime_get_ts were calling timekeeping_get_ns()
    but later they were not calling arch_gettimeoffset() so architectures
    using this mechanism returned 0 ns when calling these functions.

    This happened for example when running Busybox's ping which calls
    syscall(__NR_clock_gettime, CLOCK_MONOTONIC, ts) which eventually
    calls ktime_get. As a result the returned ping travel time was zero.

    CC: stable@kernel.org
    Signed-off-by: Hector Palacios
    Signed-off-by: John Stultz

    Hector Palacios
     

11 Nov, 2011

2 commits

  • …tz/linux into timers/core

    Conflicts:
    kernel/time/timekeeping.c

    Ingo Molnar
     
  • For some frequencies, the clocks_calc_mult_shift() function will
    unfortunately select mult values very close to 0xffffffff. This
    has the potential to overflow when NTP adjusts the clock, adding
    to the mult value.

    This patch adds a clocksource.maxadj value, which provides
    an approximation of an 11% adjustment (NTP limits adjustments to
    500ppm and the tick adjustment is limited to 10%), which could
    be made to the clocksource.mult value. This is then used to both
    check that the current mult value won't overflow/underflow, as
    well as warning us if the timekeeping_adjust() code pushes over
    that 11% boundary.

    v2: Fix max_adjustment calculation, and improve WARN_ONCE
    messages.

    v3: Don't warn before maxadj has actually been set

    CC: Yong Zhang
    CC: David Daney
    CC: Thomas Gleixner
    CC: Chen Jie
    CC: zhangfx
    CC: stable@kernel.org
    Reported-by: Chen Jie
    Reported-by: zhangfx
    Tested-by: Yong Zhang
    Signed-off-by: John Stultz

    John Stultz
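
A rough sketch of the 11% margin and the overflow check described above. The struct and function names are hypothetical, and the strategy of halving mult while decrementing shift is an assumption of this sketch rather than something stated in the commit text. It also assumes shift is greater than zero whenever the loop runs.

```c
#include <stdint.h>

/* Approximate 11% of mult, the margin described in the commit message. */
uint32_t sketch_max_adjustment(uint32_t mult)
{
    return (uint32_t)(((uint64_t)mult * 11) / 100);
}

/* Hypothetical subset of a clocksource; the real structure has more fields. */
struct sketch_clocksource {
    uint32_t mult;
    uint32_t shift;
    uint32_t maxadj;
};

/*
 * Halve mult (and drop shift by one to keep the ratio) until both
 * mult + maxadj and mult - maxadj fit in 32 bits without wrapping.
 */
void sketch_fit_mult(struct sketch_clocksource *cs)
{
    cs->maxadj = sketch_max_adjustment(cs->mult);
    while ((uint32_t)(cs->mult + cs->maxadj) < cs->mult ||
           (uint32_t)(cs->mult - cs->maxadj) > cs->mult) {
        cs->mult >>= 1;
        cs->shift--;
        cs->maxadj = sketch_max_adjustment(cs->mult);
    }
}
```

A mult value close to 0xffffffff wraps as soon as the margin is added, so the sketch trades one bit of precision for headroom.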
     

01 Nov, 2011

1 commit


28 Oct, 2011

1 commit

  • After getting a number of questions in private emails about the
    math around the admittedly very complex timekeeping_adjust() and
    timekeeping_big_adjust(), I figure the code needs some better
    comments.

    Hopefully the explanations are clear enough and don't muddy the
    water any worse.

    Still needs documentation for ntp_error, but I couldn't recall
    exactly the full explanation behind the code that's there
    (although I do recall once working it out when Roman first
    proposed it). Given a bit more time I can probably work it out,
    but I don't want to hold back this documentation until then.

    Signed-off-by: John Stultz
    Cc: Chen Jie
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1319764362-32367-1-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar

    John Stultz
     

26 Oct, 2011

2 commits

  • * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    time, s390: Get rid of compile warning
    dw_apb_timer: constify clocksource name
    time: Cleanup old CONFIG_GENERIC_TIME references that snuck in
    time: Change jiffies_to_clock_t() argument type to unsigned long
    alarmtimers: Fix error handling
    clocksource: Make watchdog reset lockless
    posix-cpu-timers: Cure SMP accounting oddities
    s390: Use direct ktime path for s390 clockevent device
    clockevents: Add direct ktime programming function
    clockevents: Make minimum delay adjustments configurable
    nohz: Remove "Switched to NOHz mode" debugging messages
    proc: Consider NO_HZ when printing idle and iowait times
    nohz: Make idle/iowait counter update conditional
    nohz: Fix update_ts_time_stat idle accounting
    cputime: Clean up cputime_to_usecs and usecs_to_cputime macros
    alarmtimers: Rework RTC device selection using class interface
    alarmtimers: Add try_to_cancel functionality
    alarmtimers: Add more refined alarm state tracking
    alarmtimers: Remove period from alarm structure
    alarmtimers: Remove interval cap limit hack
    ...

    Linus Torvalds
     
  • * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
    rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp()
    rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states
    rcu: Wire up RCU_BOOST_PRIO for rcutree
    rcu: Make rcu_torture_boost() exit loops at end of test
    rcu: Make rcu_torture_fqs() exit loops at end of test
    rcu: Permit rt_mutex_unlock() with irqs disabled
    rcu: Avoid having just-onlined CPU resched itself when RCU is idle
    rcu: Suppress NMI backtraces when stall ends before dump
    rcu: Prohibit grace periods during early boot
    rcu: Simplify unboosting checks
    rcu: Prevent early boot set_need_resched() from __rcu_pending()
    rcu: Dump local stack if cannot dump all CPUs' stacks
    rcu: Move __rcu_read_unlock()'s barrier() within if-statement
    rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation
    rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier
    rcu: Make rcu_implicit_dynticks_qs() locals be correct size
    rcu: Eliminate in_irq() checks in rcu_enter_nohz()
    nohz: Remove nohz_cpu_mask
    rcu: Document interpretation of RCU-lockdep splats
    rcu: Allow rcutorture's stat_interval parameter to be changed at runtime
    ...

    Linus Torvalds
     

29 Sep, 2011

1 commit

  • RCU no longer uses this global variable, nor does anyone else. This
    commit therefore removes this variable. This reduces memory footprint
    and also removes some atomic instructions and memory barriers from
    the dyntick-idle path.

    Signed-off-by: Alex Shi
    Signed-off-by: Paul E. McKenney

    Shi, Alex
     

14 Sep, 2011

1 commit


13 Sep, 2011

2 commits

  • The table_lock lock can be taken in atomic context and therefore
    cannot be preempted on -rt - annotate it.

    In mainline this change documents the low level nature of
    the lock - otherwise there's no functional difference. Lockdep
    and Sparse checking will work as usual.

    Reported-by: Andreas Sundebo
    Signed-off-by: Thomas Gleixner
    Tested-by: Andreas Sundebo

    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • KGDB needs to trylock watchdog_lock when trying to reset the
    clocksource watchdog after the system has been stopped to avoid a
    potential deadlock. When the trylock fails TSC usually becomes
    unstable.

    We can be more clever by using an atomic counter and checking it in
    the clocksource_watchdog callback. We restart the watchdog whenever
    the counter is > 0 and only decrement the counter when we ran through
    a full update cycle.

    Signed-off-by: Thomas Gleixner
    Cc: John Stultz
    Acked-by: Jason Wessel
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1109121326280.2723@ionos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
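
The counter scheme can be sketched in userspace with C11 atomics. All names below are hypothetical, and the real watchdog compares two clocksources against a threshold; this sketch only shows how a pending reset suppresses the comparison and is consumed by a completed pass.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the pending-reset counter. */
atomic_int sketch_reset_pending;

/* Reading taken by the previous pass; valid once sketch_have_last is set. */
static uint64_t sketch_last;
static int sketch_have_last;

/* Safe from any context, including a stopped-system debugger: no lock taken. */
void sketch_touch_watchdog(void)
{
    atomic_fetch_add(&sketch_reset_pending, 1);
}

/*
 * One periodic watchdog pass. Returns the delta since the previous pass,
 * or -1 if this pass only re-armed the baseline.
 */
int64_t sketch_watchdog_pass(uint64_t now)
{
    int pending = atomic_load(&sketch_reset_pending);
    int64_t delta = -1;

    /* Compare only when no reset is pending and a baseline exists. */
    if (pending == 0 && sketch_have_last)
        delta = (int64_t)(now - sketch_last);

    sketch_last = now;
    sketch_have_last = 1;

    /* Only a completed pass consumes one pending reset. */
    if (pending > 0)
        atomic_fetch_sub(&sketch_reset_pending, 1);
    return delta;
}
```

Because the requester only increments a counter, it never has to take (or trylock) the lock that the watchdog pass itself holds.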
     

08 Sep, 2011

5 commits

  • There is at least one architecture (s390) with a sane clockevent device
    that can be programmed with the equivalent of a ktime. No need to create
    a delta against the current time, the ktime can be used directly.

    A new clock device function 'set_next_ktime' is introduced that is called
    with the unmodified ktime for the timer if the clock event device has the
    CLOCK_EVT_FEAT_KTIME bit set.

    Signed-off-by: Martin Schwidefsky
    Cc: john stultz
    Link: http://lkml.kernel.org/r/20110823133142.815350967@de.ibm.com
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
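
The dispatch between the two programming paths might look roughly like the sketch below. Only the names set_next_ktime and CLOCK_EVT_FEAT_KTIME come from the commit message; the struct layout, the flag value, the recording callbacks and the -1 return for an already expired timer are assumptions made for illustration.

```c
#include <stdint.h>

#define SKETCH_FEAT_KTIME 0x1u /* hypothetical value for the feature bit */

struct sketch_clock_event_device {
    unsigned int features;
    /* Programs the device with a delta relative to the current time. */
    int (*set_next_event)(uint64_t delta_ns,
                          struct sketch_clock_event_device *dev);
    /* Programs the device with the absolute expiry time. */
    int (*set_next_ktime)(int64_t expires_ns,
                          struct sketch_clock_event_device *dev);
};

int sketch_program_event(struct sketch_clock_event_device *dev,
                         int64_t expires_ns, int64_t now_ns)
{
    /* Devices with the feature bit take the unmodified expiry time. */
    if (dev->features & SKETCH_FEAT_KTIME)
        return dev->set_next_ktime(expires_ns, dev);

    int64_t delta = expires_ns - now_ns;
    if (delta <= 0)
        return -1; /* already expired */
    return dev->set_next_event((uint64_t)delta, dev);
}

/* Example callbacks that record what they were asked to program. */
int64_t sketch_last_programmed;

int sketch_record_delta(uint64_t delta_ns,
                        struct sketch_clock_event_device *dev)
{
    (void)dev;
    sketch_last_programmed = (int64_t)delta_ns;
    return 0;
}

int sketch_record_ktime(int64_t expires_ns,
                        struct sketch_clock_event_device *dev)
{
    (void)dev;
    sketch_last_programmed = expires_ns;
    return 0;
}
```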
     
  • The automatic increase of the min_delta_ns of a clockevents device
    should be done in the clockevents code as the minimum delay is an
    attribute of the clockevents device.

    In addition, not all architectures want the automatic adjustment. On a
    massively virtualized system it can happen that the programming of a
    clock event fails several times in a row because the virtual cpu has
    been rescheduled quickly enough. In that case the minimum delay will
    erroneously be increased with no way back. The new config symbol
    GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic
    adjustment. The config option is selected only for x86.

    Signed-off-by: Martin Schwidefsky
    Cc: john stultz
    Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
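
A compile-time switch of this kind could be sketched as follows. The macro SKETCH_MIN_ADJUST stands in for the config symbol, and the growth rule (a 1000 ns floor, 50% increase per failure, 10 ms cap) is purely an assumption of this sketch; the commit text does not specify it.

```c
#include <stdint.h>

#define SKETCH_MIN_ADJUST 1                    /* stands in for the config symbol */
#define SKETCH_MIN_DELTA_LIMIT_NS 10000000ULL  /* hypothetical cap: 10 ms */

/* Hypothetical subset of a clock event device. */
struct sketch_ced {
    uint64_t min_delta_ns;
};

/*
 * Called after programming the device failed repeatedly.
 * Returns 0 if the minimum delay was raised, -1 if it was left alone.
 */
int sketch_increase_min_delta(struct sketch_ced *dev)
{
#if SKETCH_MIN_ADJUST
    if (dev->min_delta_ns >= SKETCH_MIN_DELTA_LIMIT_NS)
        return -1;

    if (dev->min_delta_ns < 1000)
        dev->min_delta_ns = 1000;
    else
        dev->min_delta_ns += dev->min_delta_ns >> 1;

    if (dev->min_delta_ns > SKETCH_MIN_DELTA_LIMIT_NS)
        dev->min_delta_ns = SKETCH_MIN_DELTA_LIMIT_NS;
    return 0;
#else
    /* Architectures that opt out never change the device attribute. */
    (void)dev;
    return -1;
#endif
}
```

Since the increase is never undone, an architecture that sees spurious programming failures is better served by building with the switch off.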
     
  • When performing cpu hotplug tests the kernel printk log buffer gets flooded
    with pointless "Switched to NOHz mode..." messages. This is especially
    annoying when analyzing a dump afterwards, as the messages might have pushed
    more interesting output out of the buffer.
    Assuming that switching to NOHz mode simply works, just remove the printk.

    Signed-off-by: Heiko Carstens
    Link: http://lkml.kernel.org/r/20110823112046.GB2540@osiris.boeblingen.de.ibm.com
    Signed-off-by: Thomas Gleixner

    Heiko Carstens
     
  • get_cpu_{idle,iowait}_time_us update idle/iowait counters
    unconditionally if the given CPU is in the idle loop.

    This doesn't work well outside of CPU governors which are singletons
    so nobody (except for IRQ) can race with them.

    We will need to use both functions from /proc/stat handler to properly
    handle nohz idle/iowait times.

    Make the update depend on a non-NULL last_update_time argument.

    Signed-off-by: Michal Hocko
    Cc: Dave Jones
    Cc: Arnd Bergmann
    Cc: Alexey Dobriyan
    Link: http://lkml.kernel.org/r/11f23179472635ce52e78921d47a20216b872f23.1314172057.git.mhocko@suse.cz
    Signed-off-by: Thomas Gleixner

    Michal Hocko
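
The conditional update can be pictured with a simplified reader function. The struct and function names are hypothetical and the time source is passed in explicitly; the point is only that a NULL last_update_time turns the call into a read that leaves the counters untouched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU idle bookkeeping, times in microseconds. */
struct sketch_idle_stat {
    int idle_active;            /* nonzero while the CPU sits in the idle loop */
    uint64_t idle_entrytime_us; /* start of the not yet accounted idle span */
    uint64_t idle_sleeptime_us; /* accumulated idle time */
};

uint64_t sketch_get_idle_time_us(struct sketch_idle_stat *st, uint64_t now_us,
                                 uint64_t *last_update_time)
{
    /* Only callers that pass a timestamp slot may update the counters. */
    if (last_update_time != NULL) {
        if (st->idle_active) {
            st->idle_sleeptime_us += now_us - st->idle_entrytime_us;
            st->idle_entrytime_us = now_us;
        }
        *last_update_time = now_us;
    }
    return st->idle_sleeptime_us;
}
```

A reader such as a /proc handler would pass NULL and so cannot race with the governor that owns the update.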
     
  • update_ts_time_stat currently updates idle time even if we are in
    the iowait loop at the moment. The only real users of the idle counter
    (via get_cpu_idle_time_us) are CPU governors and they expect to get
    cumulative time for both idle and iowait times.
    The value (idle_sleeptime) is also printed to userspace by print_cpu
    but it prints both idle and iowait times so the idle part is misleading.

    Let's clean this up and fix update_ts_time_stat to account both counters
    properly and update consumers of idle to consider iowait time as well.
    If we do this we might use get_cpu_{idle,iowait}_time_us from other
    contexts as well and we will get expected values.

    Signed-off-by: Michal Hocko
    Cc: Dave Jones
    Cc: Arnd Bergmann
    Cc: Alexey Dobriyan
    Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz
    Signed-off-by: Thomas Gleixner

    Michal Hocko
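
A minimal sketch of the split accounting, assuming a hypothetical per-CPU structure and an nr_iowait value supplied by the caller. Each elapsed span is charged to exactly one counter, and consumers that want the old cumulative meaning add the two.

```c
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping, times in microseconds. */
struct sketch_tick_sched {
    int idle_active;
    uint64_t idle_entrytime_us;
    uint64_t idle_sleeptime_us;
    uint64_t iowait_sleeptime_us;
};

/* Charge the time since idle entry to idle or iowait, never both. */
void sketch_update_ts_time_stat(struct sketch_tick_sched *ts,
                                uint64_t now_us, int nr_iowait)
{
    if (ts->idle_active) {
        uint64_t delta = now_us - ts->idle_entrytime_us;

        if (nr_iowait > 0)
            ts->iowait_sleeptime_us += delta;
        else
            ts->idle_sleeptime_us += delta;
        ts->idle_entrytime_us = now_us;
    }
}

/* Governor-style consumers sum both counters to get total idle time. */
uint64_t sketch_total_idle_us(const struct sketch_tick_sched *ts)
{
    return ts->idle_sleeptime_us + ts->iowait_sleeptime_us;
}
```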
     

11 Aug, 2011

9 commits


10 Aug, 2011

2 commits


23 Jul, 2011

1 commit


21 Jul, 2011

1 commit

  • Terribly embarrassing. Don't know how I committed this, but it's
    KERN_WARNING not KERN_WARN.

    This fixes the following compile error:
    kernel/time/timekeeping.c: In function ‘__timekeeping_inject_sleeptime’:
    kernel/time/timekeeping.c:608: error: ‘KERN_WARN’ undeclared (first use in this function)
    kernel/time/timekeeping.c:608: error: (Each undeclared identifier is reported only once
    kernel/time/timekeeping.c:608: error: for each function it appears in.)
    kernel/time/timekeeping.c:608: error: expected ‘)’ before string constant
    make[2]: *** [kernel/time/timekeeping.o] Error 1

    Reported-by: Ingo Molnar
    Signed-off-by: John Stultz

    John Stultz
     

22 Jun, 2011

4 commits

  • Because the read_persistent_clock interface is usually backed by
    only a second granular interface, each time we read from the persistent
    clock for suspend/resume, we introduce a half second (on average) of error.

    In order to avoid this error accumulating as the system is suspended
    over and over, this patch measures the time delta between the persistent
    clock and the system CLOCK_REALTIME.

    If the delta is less than 2 seconds from the last suspend, we compensate
    by using the previous time delta (keeping it close). If it is larger
    than 2 seconds, we assume the clock was set or has been changed, so we
    do no correction and update the delta.

    Note: If NTP is running, this could seem to "fight" with the NTP-corrected
    time: if the system time was off by 1 second, and NTP slewed the
    value in, a suspend/resume cycle could undo this correction by trying to
    restore the previous offset from the persistent clock. However, without
    this patch, since each read could cause almost a full second worth of
    error, it's possible to get almost 2 seconds of error just from the
    suspend/resume cycle alone, so this is about equal to any offset added by
    the compensation.

    Further, on systems that suspend/resume frequently, this should keep time
    closer than NTP could compensate for if the errors were allowed to
    accumulate.

    Credits to Arve Hjønnevåg for suggesting this solution.

    CC: Arve Hjønnevåg
    CC: Thomas Gleixner
    Signed-off-by: John Stultz

    John Stultz
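
The compensation rule can be sketched with plain nanosecond arithmetic. Function and variable names are hypothetical; the sketch takes both clock readings as arguments and returns the suspend timestamp that would be remembered for the later resume calculation.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000LL

/* Offset between system realtime and the persistent clock at last suspend. */
static int64_t sketch_old_delta_ns;

/*
 * Called at suspend with both clocks in nanoseconds. Returns the
 * suspend timestamp to remember, corrected when the drift is small.
 */
int64_t sketch_suspend_time(int64_t realtime_ns, int64_t persistent_ns)
{
    int64_t delta = realtime_ns - persistent_ns;
    int64_t delta_delta = delta - sketch_old_delta_ns;

    if (delta_delta >= 2 * SKETCH_NSEC_PER_SEC ||
        delta_delta <= -2 * SKETCH_NSEC_PER_SEC) {
        /* Clock was set or changed: adopt the new offset, no correction. */
        sketch_old_delta_ns = delta;
        return persistent_ns;
    }

    /* Small drift: shift the suspend time so the old offset is preserved. */
    return persistent_ns + delta_delta;
}
```

After the correction, realtime minus the returned timestamp equals the previously recorded offset, so the sub-second read error does not accumulate across cycles.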
     
  • Arve suggested making sure we catch possible negative sleep time
    intervals that could be passed into timekeeping_inject_sleeptime.

    CC: Arve Hjønnevåg
    CC: Thomas Gleixner
    Signed-off-by: John Stultz

    John Stultz
     
  • Toralf Förster and Richard Weinberger noted that if there is
    no RTC device, the alarm timers core prints out an annoying
    "ALARM timers will not wake from suspend" message.

    This warning has been removed in a previous patch, however
    the issue still remains: The original idea was to support
    alarm timers even if there was no rtc device, as long as the
    system didn't go into suspend.

    However, after further consideration, communicating to the application
    that alarmtimers are not fully functional seems like the better
    solution.

    So this patch makes it so we return -ENOTSUPP to any posix _ALARM
    clockid calls if there is no backing RTC device on the system.

    Further, this changes the behavior so that when there is no rtc device,
    we check for one on clock_getres, clock_gettime, timer_create,
    and timer_nsleep instead of on suspend.

    CC: Toralf Förster
    CC: Richard Weinberger
    CC: Thomas Gleixner
    Reported-by: Toralf Förster
    Reported-by: Richard Weinberger
    Signed-off-by: John Stultz

    John Stultz
     
  • The alarmtimers code currently picks a rtc device to use at
    late init time. However, if your rtc driver is loaded as a module,
    it may be registered after the alarmtimers late init code, leaving
    the alarmtimers nonfunctional.

    This patch moves the rtc device selection to when we actually try
    to use it, allowing us to make use of rtc modules that may have been
    loaded at any point since bootup.

    CC: Thomas Gleixner
    CC: Meelis Roos
    Reported-by: Meelis Roos
    Signed-off-by: John Stultz

    John Stultz
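
Lookup on first use, combined with the error return for a missing RTC described in the alarm timer commit above, can be sketched like this. The registry, the function names and the getres stub are hypothetical; the value 524 for ENOTSUPP matches the kernel-internal errno but is defined locally because userspace headers do not provide it.

```c
#include <stddef.h>

#define SKETCH_ENOTSUPP 524 /* kernel-internal errno value, defined locally */
#define SKETCH_MAX_RTC 4

struct sketch_rtc {
    const char *name;
};

/* Hypothetical registry, filled in whenever a driver registers. */
static struct sketch_rtc *sketch_registered[SKETCH_MAX_RTC];
static size_t sketch_nr_registered;
static struct sketch_rtc *sketch_cached;

void sketch_register_rtc(struct sketch_rtc *rtc)
{
    if (sketch_nr_registered < SKETCH_MAX_RTC)
        sketch_registered[sketch_nr_registered++] = rtc;
}

/* Look up on first use and cache the result; NULL if none registered yet. */
struct sketch_rtc *sketch_get_rtcdev(void)
{
    if (sketch_cached == NULL && sketch_nr_registered > 0)
        sketch_cached = sketch_registered[0];
    return sketch_cached;
}

/* Stub for an _ALARM clockid call: refuse when no RTC backs the clock. */
int sketch_alarm_clock_getres(void)
{
    if (sketch_get_rtcdev() == NULL)
        return -SKETCH_ENOTSUPP;
    return 0;
}
```

Because the lookup runs at call time, a driver module loaded long after boot is still picked up by the next caller.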
     

17 Jun, 2011

1 commit

  • The clocksource watchdog code is interruptible and it has been
    observed that this can trigger false positives which disable the TSC.

    The reason is that an interrupt storm or a long running interrupt
    handler between the read of the watchdog source and the read of the
    TSC brings the two far enough apart that the delta is larger than the
    unstable threshold. Move both reads into a short interrupt disabled
    region to avoid that.

    Reported-and-tested-by: Vernon Mauery
    Signed-off-by: Thomas Gleixner
    Cc: stable@kernel.org

    Thomas Gleixner