15 Jul, 2016

1 commit

  • Split out the clockevents callbacks instead of piggybacking them on
    hrtimers.

    This gets rid of a POST_DEAD user. See commit:

    54e88fad223c ("sched: Make sure timers have migrated before killing the migration_thread")

    We just move the callback state to the proper place in the state machine.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Anna-Maria Gleixner
    Reviewed-by: Sebastian Andrzej Siewior
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Rasmus Villemoes
    Cc: Rusty Russell
    Cc: rt@linutronix.de
    Link: http://lkml.kernel.org/r/20160713153337.485419196@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

18 Mar, 2016

1 commit

  • This patchset introduces a /proc/<pid>/timerslack_ns interface which
    would allow controlling processes to be able to set the timerslack value
    on other processes in order to save power by avoiding wakeups (Something
    Android currently does via out-of-tree patches).

    The first patch tries to fix the internal timer_slack_ns usage which was
    defined as a long, which limits the slack range to ~4 seconds on 32bit
    systems. It converts it to a u64, which provides the same basically
    unlimited slack (500 years) on both 32bit and 64bit machines.

    The second patch introduces the /proc/<pid>/timerslack_ns interface
    which allows the full 64bit slack range for a task to be read or set on
    both 32bit and 64bit machines.

    With these two patches, on a 32bit machine, after setting the slack on
    bash to 10 seconds:

    $ time sleep 1

    real 0m10.747s
    user 0m0.001s
    sys 0m0.005s

    The first patch is a little ugly, since I had to chase the slack delta
    arguments through a number of functions converting them to u64s. Let me
    know if it makes sense to break that up more or not.

    Other than that things are fairly straightforward.

    This patch (of 2):

    The timer_slack_ns value in the task struct is currently an unsigned
    long. This means that on 32bit applications, the maximum slack is just
    over 4 seconds. However, on 64bit machines, it's much much larger (~500
    years).

    This disparity could make application development a little confusing,
    so this patch converts the timer_slack_ns value (as well as
    the default_slack) to a u64. This means both 32bit and 64bit systems
    have the same effective internal slack range.

    Now the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK specifies
    the interface as an unsigned long, so we preserve that limitation on
    32bit systems, where SET_TIMERSLACK can only set the slack to an unsigned
    long value, and GET_TIMERSLACK will return ULONG_MAX if the slack is
    actually larger than what can be stored by an unsigned long.

    This patch also modifies hrtimer functions which specified the slack
    delta as an unsigned long.

    Signed-off-by: John Stultz
    Cc: Arjan van de Ven
    Cc: Thomas Gleixner
    Cc: Oren Laadan
    Cc: Ruchi Kandoi
    Cc: Rom Lemarchand
    Cc: Kees Cook
    Cc: Android Kernel Team
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Stultz
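
    The range arithmetic and the GET_TIMERSLACK saturation described in
    this commit can be sketched in user space. This is a minimal model,
    not the kernel code: uint32_t stands in for "unsigned long" on a 32bit
    ABI, and get_timerslack_32() and max_slack_seconds() are hypothetical
    helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the internal slack is a 64-bit nanosecond count (u64). */
typedef uint64_t slack_ns_t;

/* Stand-in for "unsigned long" on a 32bit ABI (illustrative only). */
typedef uint32_t ulong32_t;

/* Model of the GET_TIMERSLACK behaviour described above: report the
 * slack, saturating at the largest value the 32-bit type can hold. */
static ulong32_t get_timerslack_32(slack_ns_t slack)
{
	if (slack > UINT32_MAX)
		return UINT32_MAX;
	return (ulong32_t)slack;
}

/* Largest slack, in whole seconds, that fits in an unsigned
 * nanosecond counter of the given width. */
static uint64_t max_slack_seconds(unsigned int bits)
{
	uint64_t max_ns;

	assert(bits >= 1 && bits <= 64);
	max_ns = (bits == 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
	return max_ns / UINT64_C(1000000000);
}
```

    With these helpers, max_slack_seconds(32) yields 4 (just over 4
    seconds), while max_slack_seconds(64) corresponds to several hundred
    years, which matches the orders of magnitude quoted in the message.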
     

17 Jan, 2016

1 commit

  • If CONFIG_TIME_LOW_RES is enabled we add a jiffie to the relative timeout to
    prevent short sleeps, but we do not account for that in interfaces which
    retrieve the remaining time.

    Helge observed that timerfd can return a remaining time larger than the
    relative timeout. That's not expected and breaks userland test programs.

    Store the information that the timer was armed relative and provide functions
    to adjust the remaining time. To avoid bloating the hrtimer struct make state
    a u8, which as a bonus results in better code on x86 at least.

    Reported-and-tested-by: Helge Deller
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: dhowells@redhat.com
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160114164159.273328486@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
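
    The adjustment described in this commit can be illustrated with a
    small user-space model. It is a sketch under assumptions: struct
    lowres_timer, arm_relative() and remaining_adjusted() are hypothetical
    names, times are plain nanosecond integers, and the jiffy length is
    passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical low-resolution timer; all times are in nanoseconds. */
struct lowres_timer {
	int64_t expires; /* absolute expiry, including the extra jiffy */
	bool is_rel;     /* set when armed with a relative timeout */
};

/* Arm relative to "now": one jiffy is added so the sleep is never
 * shorter than requested. */
static void arm_relative(struct lowres_timer *t, int64_t now,
			 int64_t timeout, int64_t jiffy_ns)
{
	assert(timeout >= 0 && jiffy_ns > 0);
	t->expires = now + timeout + jiffy_ns;
	t->is_rel = true;
}

/* Remaining time as reported to the caller: the added jiffy is taken
 * out again for timers that were armed relative. */
static int64_t remaining_adjusted(const struct lowres_timer *t,
				  int64_t now, int64_t jiffy_ns)
{
	int64_t rem = t->expires - now;

	if (t->is_rel)
		rem -= jiffy_ns;
	return rem;
}
```

    Without the is_rel check, a timer armed for 5 ms with a 4 ms jiffy
    would report 9 ms remaining right after arming, which is the
    surprising result the report describes.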
     

19 Jun, 2015

4 commits

  • If nohz is disabled on the kernel command line the [hr]timer code
    still calls wake_up_nohz_cpu() and tick_nohz_full_cpu(), a pretty
    pointless exercise. Cache nohz_active in [hr]timer per cpu bases and
    avoid the overhead.

    Before:
    48.10% hog [.] main
    15.25% [kernel] [k] _raw_spin_lock_irqsave
    9.76% [kernel] [k] _raw_spin_unlock_irqrestore
    6.50% [kernel] [k] mod_timer
    6.44% [kernel] [k] lock_timer_base.isra.38
    3.87% [kernel] [k] detach_if_pending
    3.80% [kernel] [k] del_timer
    2.67% [kernel] [k] internal_add_timer
    1.33% [kernel] [k] __internal_add_timer
    0.73% [kernel] [k] timerfn
    0.54% [kernel] [k] wake_up_nohz_cpu

    After:
    48.73% hog [.] main
    15.36% [kernel] [k] _raw_spin_lock_irqsave
    9.77% [kernel] [k] _raw_spin_unlock_irqrestore
    6.61% [kernel] [k] lock_timer_base.isra.38
    6.42% [kernel] [k] mod_timer
    3.90% [kernel] [k] detach_if_pending
    3.76% [kernel] [k] del_timer
    2.41% [kernel] [k] internal_add_timer
    1.39% [kernel] [k] __internal_add_timer
    0.76% [kernel] [k] timerfn

    We probably should have a cached value for nohz full in the per cpu
    bases as well to avoid the cpumask check. The base cache line is hot
    already, the cpumask not necessarily.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Paul McKenney
    Cc: Frederic Weisbecker
    Cc: Eric Dumazet
    Cc: Viresh Kumar
    Cc: John Stultz
    Cc: Joonwoo Park
    Cc: Wenbo Wang
    Link: http://lkml.kernel.org/r/20150526224512.207378134@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Eric reported that the timer_migration sysctl is not really nice
    performance wise as it needs to check at every timer insertion whether
    the feature is enabled or not. Further the check does not live in the
    timer code, so we have an extra function call which checks an extra
    cache line to figure out that it is disabled.

    We can do better and store that information in the per cpu (hr)timer
    bases. I considered using a static key, but that's a nightmare to
    update from the nohz code and the timer base cache line is hot anyway
    when we select a timer base.

    The old logic enabled the timer migration unconditionally if
    CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
    line.

    With this modification, we start off with migration disabled. The user
    visible sysctl is still set to enabled. If the kernel switches to NOHZ,
    migration is enabled, provided the user did not disable it via the sysctl
    prior to the switch. If nohz=off is on the kernel command line,
    migration stays disabled no matter what.

    Before:
    47.76% hog [.] main
    14.84% [kernel] [k] _raw_spin_lock_irqsave
    9.55% [kernel] [k] _raw_spin_unlock_irqrestore
    6.71% [kernel] [k] mod_timer
    6.24% [kernel] [k] lock_timer_base.isra.38
    3.76% [kernel] [k] detach_if_pending
    3.71% [kernel] [k] del_timer
    2.50% [kernel] [k] internal_add_timer
    1.51% [kernel] [k] get_nohz_timer_target
    1.28% [kernel] [k] __internal_add_timer
    0.78% [kernel] [k] timerfn
    0.48% [kernel] [k] wake_up_nohz_cpu

    After:
    48.10% hog [.] main
    15.25% [kernel] [k] _raw_spin_lock_irqsave
    9.76% [kernel] [k] _raw_spin_unlock_irqrestore
    6.50% [kernel] [k] mod_timer
    6.44% [kernel] [k] lock_timer_base.isra.38
    3.87% [kernel] [k] detach_if_pending
    3.80% [kernel] [k] del_timer
    2.67% [kernel] [k] internal_add_timer
    1.33% [kernel] [k] __internal_add_timer
    0.73% [kernel] [k] timerfn
    0.54% [kernel] [k] wake_up_nohz_cpu

    Reported-by: Eric Dumazet
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Paul McKenney
    Cc: Frederic Weisbecker
    Cc: Viresh Kumar
    Cc: John Stultz
    Cc: Joonwoo Park
    Cc: Wenbo Wang
    Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
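
    The enable/disable rules in this commit amount to a small state
    machine. The following is a hedged user-space sketch of those rules
    only; struct demo_migration_state and the demo_* functions are
    hypothetical names, and the real per cpu bases are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state described above. */
struct demo_migration_state {
	bool nohz_active;       /* has the kernel switched to NOHZ? */
	bool sysctl_enabled;    /* user visible knob, defaults to enabled */
	bool migration_enabled; /* cached value checked on timer insertion */
};

/* Migration is only on when NOHZ is active and the user allows it. */
static void demo_recompute(struct demo_migration_state *s)
{
	assert(s != NULL);
	s->migration_enabled = s->nohz_active && s->sysctl_enabled;
}

/* Boot state: sysctl reads "enabled", migration itself is off. */
static void demo_init(struct demo_migration_state *s)
{
	s->nohz_active = false;
	s->sysctl_enabled = true;
	demo_recompute(s);
}

/* The user writes the sysctl. */
static void demo_set_sysctl(struct demo_migration_state *s, bool on)
{
	s->sysctl_enabled = on;
	demo_recompute(s);
}

/* The kernel switches to NOHZ; never happens with nohz=off. */
static void demo_switch_to_nohz(struct demo_migration_state *s)
{
	s->nohz_active = true;
	demo_recompute(s);
}
```

    The point of caching migration_enabled is that the insertion path
    reads a single flag from a structure it touches anyway, instead of
    calling out to a separate check.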
     
  • Currently an hrtimer callback function cannot free its own timer
    because __run_hrtimer() still needs to clear HRTIMER_STATE_CALLBACK
    after it. Freeing the timer would result in a clear use-after-free.

    Solve this by using a scheme similar to regular timers; track the
    current running timer in hrtimer_clock_base::running.

    Suggested-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: ktkhai@parallels.com
    Cc: rostedt@goodmis.org
    Cc: juri.lelli@gmail.com
    Cc: pang.xunlei@linaro.org
    Cc: wanpeng.li@linux.intel.com
    Cc: Al Viro
    Cc: Linus Torvalds
    Cc: Paul McKenney
    Cc: Oleg Nesterov
    Cc: umgwanakikbuti@gmail.com
    Link: http://lkml.kernel.org/r/20150611124743.471563047@infradead.org
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
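
    The idea of tracking the running timer in the base, rather than in a
    state bit inside the timer, can be sketched as follows. This is a
    simplified single-threaded model with hypothetical demo_* names; it
    leaves out locking and the memory ordering the kernel needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_timer;

/* Hypothetical clock base: remembers whose callback is executing. */
struct demo_base {
	struct demo_timer *running;
};

struct demo_timer {
	bool enqueued;
	void (*fn)(struct demo_timer *);
};

/* A timer counts as active while queued or while its callback runs. */
static bool demo_timer_active(const struct demo_base *base,
			      const struct demo_timer *t)
{
	return t->enqueued || base->running == t;
}

/* Run one expired timer. After fn() returns, the timer memory is not
 * touched again, so the callback is free to release it. */
static void demo_run_timer(struct demo_base *base, struct demo_timer *t)
{
	void (*fn)(struct demo_timer *) = t->fn;

	assert(fn != NULL);
	t->enqueued = false;
	base->running = t;
	fn(t);                /* may free t */
	base->running = NULL; /* writes the base only, never *t */
}

/* Example callback that frees its own timer. */
static void demo_free_self(struct demo_timer *t)
{
	free(t);
}
```

    Because the "callback in progress" information lives in the base,
    nothing has to be cleared inside the timer after the callback, which
    is what made freeing the timer from its own callback unsafe before.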
     
  • I do not understand HRTIMER_STATE_MIGRATE. Unless I am totally
    confused it looks buggy and simply unneeded.

    migrate_hrtimer_list() sets it to keep hrtimer_active() == T, but this
    is not enough: this can fool, say, hrtimer_is_queued() in
    dequeue_signal().

    Can't migrate_hrtimer_list() simply use HRTIMER_STATE_ENQUEUED?
    This fixes the race and we can kill STATE_MIGRATE.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: ktkhai@parallels.com
    Cc: rostedt@goodmis.org
    Cc: juri.lelli@gmail.com
    Cc: pang.xunlei@linaro.org
    Cc: wanpeng.li@linux.intel.com
    Cc: umgwanakikbuti@gmail.com
    Link: http://lkml.kernel.org/r/20150611124743.072387650@infradead.org
    Signed-off-by: Thomas Gleixner

    Oleg Nesterov
     

08 Jun, 2015

1 commit

  • ... in the !CONFIG_HIGH_RES_TIMERS case too. And thus fix warnings like
    this one:

    net/sched/sch_api.c: In function ‘psched_show’:
    net/sched/sch_api.c:1891:6: warning: format ‘%x’ expects argument of type ‘unsigned int’, but argument 6 has type ‘long int’ [-Wformat=]
    (u32)NSEC_PER_SEC / hrtimer_resolution);

    Signed-off-by: Borislav Petkov
    Link: http://lkml.kernel.org/r/1433583000-32090-1-git-send-email-bp@alien8.de
    Signed-off-by: Thomas Gleixner
    Cc: Thomas Gleixner

    Borislav Petkov
     

22 Apr, 2015

15 commits

  • No user was ever interested whether the timer was active or not when
    it was started. All abusers of the return value are gone, so get rid
    of it.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203503.483556394@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • No point for an extra export just to set the extra argument of
    hrtimer_start_range_ns() to 0.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203502.808544539@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • No more callers. Remove the leftovers.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203502.707871492@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The evaluation of the next timer in the nohz code is based on jiffies
    while all the tick internals are nanoseconds based. We also have to
    convert hrtimer nanoseconds to jiffies in the !highres case. That's
    just wrong and introduces interesting corner cases.

    Turn it around and convert the next timer wheel timer expiry and the
    rcu event to clock monotonic and base all calculations on
    nanoseconds. That identifies the case where no timer is pending
    clearly with an absolute expiry value of KTIME_MAX.

    Makes the code more readable and gets rid of the jiffies magic in the
    nohz code.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Paul E. McKenney
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Cc: Josh Triplett
    Cc: Lai Jiangshan
    Cc: John Stultz
    Cc: Marcelo Tosatti
    Link: http://lkml.kernel.org/r/20150414203502.184198593@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • hrtimer softirq is a leftover from the initial implementation and
    serves only the purpose to handle the enqueueing of already expired
    timers in the high resolution timer mode. We discussed whether we
    change the return value and force all start sites to handle that the
    timer is already expired, but that would be a Herculean task and I'm
    not sure whether it's a good idea to enforce that handling on
    everyone.

    A simpler solution is to enforce a timer interrupt instead of raising
    and scheduling a softirq. Just use the existing infrastructure to do
    so and remove all the softirq leftovers.

    The HRTIMER softirq enum is now unused, but kept around because trace
    parsers rely on the existing numbering.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203501.840834708@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • __remove_hrtimer() needs to evaluate the expiry time to figure out
    whether the timer which is removed is eventually the first expiring
    timer on the cpu. Keep a pointer to it, which is lazily updated, so we
    can avoid the evaluation dance and retrieve the information from there.

    Generates slightly better code.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203501.752838019@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • We don't use cacheline_align here because that might waste a lot of
    space on 32bit machine with 64 bytes cachelines and on 64bit machines
    with 128 bytes cachelines.

    The size of struct hrtimer_clock_base is 64byte on 64bit and 32byte on
    32bit machines. So we utilize the cache lines properly.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203501.498165771@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • We really want that data structure to start at a cache line boundary.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203501.417597627@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • No point in wasting 12 byte storage space. Generates better code as well.

    Text size reduction:
    x86_64 -64, i386 -16, ARM -132, ARM64 -0, power64 -48

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203501.227955358@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • On every tick/hrtimer interrupt we update the offset variables of the
    clock bases. That's silly because these offsets change very seldom.

    Add a sequence counter to the time keeping code which keeps track of
    the offset updates (clock_was_set()). Have a sequence cache in the
    hrtimer cpu bases to evaluate whether the offsets must be updated or
    not. This allows us later to avoid pointless cacheline pollution.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Preeti U Murthy
    Acked-by: Peter Zijlstra
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Cc: John Stultz
    Link: http://lkml.kernel.org/r/20150414203501.132820245@linutronix.de
    Signed-off-by: Thomas Gleixner
    Cc: John Stultz

    Thomas Gleixner
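
    The sequence counter scheme can be sketched in a few lines. This is a
    single-threaded illustration under assumptions: struct tk_state,
    struct cpu_base_cache and the two functions are hypothetical names,
    and the locking around the real timekeeping data is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timekeeping state: offsets plus a change counter. */
struct tk_state {
	unsigned int clock_was_set_seq;
	int64_t offs_real;
	int64_t offs_boot;
};

/* Hypothetical per cpu base: cached offsets and the last seen counter. */
struct cpu_base_cache {
	unsigned int seq;
	int64_t offs_real;
	int64_t offs_boot;
};

/* Called when the clock is set: change the offset, bump the counter. */
static void tk_set_real_offset(struct tk_state *tk, int64_t offs_real)
{
	assert(tk != NULL);
	tk->offs_real = offs_real;
	tk->clock_was_set_seq++;
}

/* Refresh the cached offsets only if the counter moved.
 * Returns true when the cache was actually rewritten. */
static bool base_update_offsets(struct cpu_base_cache *base,
				const struct tk_state *tk)
{
	if (base->seq == tk->clock_was_set_seq)
		return false;
	base->seq = tk->clock_was_set_seq;
	base->offs_real = tk->offs_real;
	base->offs_boot = tk->offs_boot;
	return true;
}
```

    In the common case the interrupt path performs one comparison and
    leaves the cached offsets untouched, which is what avoids dirtying
    the cache line on every tick.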
     
  • The softirq time field in the clock bases is an optimization from the
    early days of hrtimers. It provides a coarse "jiffies" like time
    mostly for self rearming timers.

    But that comes with a price:
    - Larger code size
    - Extra storage space
    - Duplicated functions with really small differences

    The benefit of this optimization is marginal for contemporary
    systems.

    Consolidate everything on the high resolution timer
    implementation. This makes further optimizations possible.

    Text size reduction:
    x86_64 -95, i386 -356, ARM -148, ARM64 -40, power64 -16

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203501.039977424@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • No point in having unsigned long for /proc/timer_list statistics. Make
    them unsigned int.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203500.959773467@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The resolution is directly accessible now. So it's simpler just to fill
    in the values of the timespec and be done with it.

    Text size reduction (combined with "hrtimer: Get rid of the resolution
    field in hrtimer_clock_base"):
    x86_64 -61, i386 -221, ARM -60, power64 -48

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203500.879888080@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The field has no value because all clock bases have the same
    resolution. The resolution only changes when we switch to high
    resolution timer mode. We can evaluate that from a single static
    variable as well. In the !HIGHRES case it's simply a constant.

    Export the variable, so we can simplify the usage sites.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Preeti U Murthy
    Acked-by: Peter Zijlstra
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203500.645454122@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Document the calling context conditions.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20150413210035.178751779@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

23 Jan, 2015

1 commit

  • hrtimer_interrupt() has the following subtle issue:

    hrtimer_interrupt()
      lock(cpu_base);
      expires_next = KTIME_MAX;

      expire_timers(CLOCK_MONOTONIC);
      expires = get_next_timer(CLOCK_MONOTONIC);
      if (expires < expires_next)
        expires_next = expires;

      expire_timers(CLOCK_REALTIME);
        unlock(cpu_base);
        wakeup()
        hrtimer_start(CLOCK_MONOTONIC, newtimer);
        lock(cpu_base);
      expires = get_next_timer(CLOCK_REALTIME);
      if (expires < expires_next)
        expires_next = expires;

    So because we already evaluated the next expiring timer of
    CLOCK_MONOTONIC we ignore that the expiry time of newtimer might be
    earlier than the overall next expiry time in hrtimer_interrupt().

    To solve this, remove the caching of the next expiry value from
    hrtimer_interrupt() and reevaluate all active clock bases for the next
    expiry value. To avoid another code duplication, create a shared
    evaluation function and use it for hrtimer_get_next_event(),
    hrtimer_force_reprogram() and hrtimer_interrupt().

    There is another subtlety in this mechanism:

    While hrtimer_interrupt() is running, we want to avoid touching the
    hardware device because we will reprogram it anyway at the end of
    hrtimer_interrupt(). This works nicely for hrtimers which get rearmed
    via the HRTIMER_RESTART mechanism, because we drop out when the
    callback on that CPU is running. But that fails, if a new timer gets
    enqueued like in the example above.

    This has another implication: While hrtimer_interrupt() is running we
    refuse remote enqueueing of timers - see hrtimer_interrupt() and
    hrtimer_check_target().

    hrtimer_interrupt() tries to prevent this by setting cpu_base->expires
    to KTIME_MAX, but that fails if a new timer gets queued.

    Prevent both the hardware access and the remote enqueue
    explicitly. We can loosen the restriction on the remote enqueue now
    due to reevaluation of the next expiry value, but that needs a
    separate patch.

    Folded in a fix from Vignesh Radhakrishnan.

    Reported-and-tested-by: Stanislav Fomichev
    Based-on-patch-by: Stanislav Fomichev
    Signed-off-by: Thomas Gleixner
    Cc: vigneshr@codeaurora.org
    Cc: john.stultz@linaro.org
    Cc: viresh.kumar@linaro.org
    Cc: fweisbec@gmail.com
    Cc: cl@linux.com
    Cc: stuart.w.hayes@gmail.com
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1501202049190.5526@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
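
    The fix replaces a value cached during the interrupt with a fresh
    evaluation over all clock bases. A minimal sketch of such a shared
    evaluation function follows; struct demo_clock_base, DEMO_KTIME_MAX,
    DEMO_NUM_BASES and demo_next_event() are hypothetical names, and each
    base is reduced to its earliest expiry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_KTIME_MAX INT64_MAX
#define DEMO_NUM_BASES 2 /* e.g. one monotonic and one realtime base */

/* Hypothetical clock base: earliest queued expiry in nanoseconds,
 * or DEMO_KTIME_MAX when no timer is queued. */
struct demo_clock_base {
	int64_t first_expiry;
};

/* Shared evaluation: look at every base afresh and return the
 * earliest expiry, or DEMO_KTIME_MAX if nothing is pending. */
static int64_t demo_next_event(const struct demo_clock_base *bases)
{
	int64_t expires_next = DEMO_KTIME_MAX;

	assert(bases != NULL);
	for (int i = 0; i < DEMO_NUM_BASES; i++) {
		if (bases[i].first_expiry < expires_next)
			expires_next = bases[i].first_expiry;
	}
	return expires_next;
}
```

    If a callback on the second base queues a new, earlier timer on the
    first base, calling demo_next_event() after all callbacks have run
    picks it up, whereas a value computed before that callback would
    miss it.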
     

24 Jul, 2014

3 commits

  • Right now we have time related prototypes in 3 different header
    files. Move it to a single timekeeping header file and move the core
    internal stuff into a core private header.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • With the plain nanoseconds based ktime_t we can simply use
    ktime_divns() instead of going through loops and hoops of
    timespec/timeval conversion.

    Reported-by: John Stultz
    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Rather than having two similar but totally different implementations
    that provide timekeeping state to the hrtimer code, try to unify the
    two implementations to be more similar.

    Thus this clarifies ktime_get_update_offsets to
    ktime_get_update_offsets_now and changes get_xtime... to
    ktime_get_update_offsets_tick.

    Signed-off-by: John Stultz
    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    John Stultz
     

23 Jun, 2014

1 commit

  • In lowres mode, hrtimers are serviced by the tick instead of a clock
    event. Now it works well as long as the tick stays periodic but we
    must also make sure that the hrtimers are serviced in dynticks mode.

    Part of that job consists of kicking a dynticks hrtimer target in order
    to make it reconsider the next tick to schedule to correctly handle the
    hrtimer's expiring time. And that part isn't handled by the hrtimers
    subsystem.

    To prepare for fixing this, we need __hrtimer_start_range_ns() to be
    able to resolve the CPU target associated to a hrtimer's object
    'cpu_base' so that the kick can be centralized there.

    So let's store it in the 'struct hrtimer_cpu_base' to resolve the CPU
    without overhead. It is set once at CPU's online notification.

    Signed-off-by: Viresh Kumar
    Signed-off-by: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/1403393357-2070-4-git-send-email-fweisbec@gmail.com
    Signed-off-by: Thomas Gleixner

    Viresh Kumar
     

20 Mar, 2014

1 commit


23 Mar, 2013

1 commit


12 Jul, 2012

2 commits

  • To finally fix the infamous leap second issue and other race windows
    caused by functions which change the offsets between the various time
    bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
    function which atomically gets the current monotonic time and updates
    the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
    overhead. The previous patch which provides ktime_t offsets allows us
    to make this function almost as cheap as ktime_get() which is going to
    be replaced in hrtimer_interrupt().

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar
    Acked-by: Peter Zijlstra
    Acked-by: Prarit Bhargava
    Cc: stable@vger.kernel.org
    Signed-off-by: John Stultz
    Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.com
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • clock_was_set() cannot be called from hard interrupt context because
    it calls on_each_cpu().

    For fixing the widely reported leap seconds issue it is necessary to
    call it from hard interrupt context, i.e. the timer tick code, which
    does the timekeeping updates.

    Provide a new function which denotes it in the hrtimer cpu base
    structure of the cpu on which it is called and raise the hrtimer
    softirq. We then execute the clock_was_set() notification from
    softirq context in run_hrtimer_softirq(). The hrtimer softirq is
    rarely used, so polling the flag there is not a performance issue.

    [ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get
    rid of all this ifdeffery ASAP ]

    Signed-off-by: John Stultz
    Reported-by: Jan Engelhardt
    Reviewed-by: Ingo Molnar
    Acked-by: Peter Zijlstra
    Acked-by: Prarit Bhargava
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/1341960205-56738-2-git-send-email-johnstul@us.ibm.com
    Signed-off-by: Thomas Gleixner

    John Stultz
     

28 Jun, 2011

1 commit

  • Fix 'make htmldocs' warnings:

    Warning(/include/linux/hrtimer.h:153): No description found for parameter 'clockid'
    Warning(/include/linux/device.h:604): Excess struct/union/enum/typedef member 'of_match' description in 'device'
    Warning(/include/net/sock.h:349): Excess struct/union/enum/typedef member 'sk_rmem_alloc' description in 'sock'

    Signed-off-by: Vitaliy Ivanov
    Acked-by: Grant Likely
    Acked-by: David S. Miller
    Signed-off-by: Linus Torvalds

    Vitaliy Ivanov
     

23 May, 2011

4 commits

  • The ordering of the clock bases is historical due to the
    CLOCK_REALTIME and CLOCK_MONOTONIC constants. Now the hrtimer bases
    have their own enumeration due to the gap between CLOCK_MONOTONIC and
    CLOCK_BOOTTIME. So we can be more clever as most timers end up on the
    CLOCK_MONOTONIC base due to the virtue of POSIX declaring that
    relative CLOCK_REALTIME timers are not affected by time changes. In
    desktop environments this is slowly changing as applications switch to
    absolute timers, but I've observed empty CLOCK_REALTIME bases often
    enough. There is no performance penalty or overhead when
    CLOCK_REALTIME timers are active, but in case they are not we don't
    skip over a full cache line.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
     
  • Instead of iterating over all possible timer bases avoid it by marking
    the active bases in the cpu base.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
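
    Marking active bases in the cpu base is essentially a bitmask kept in
    step with the queues. The sketch below assumes one bit per base and a
    per-base counter in place of a real timer queue; struct demo_cpu_base
    and the demo_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_BASES 4

/* Hypothetical cpu base: bit n of active_bases is set while base n
 * has at least one timer queued. */
struct demo_cpu_base {
	unsigned int active_bases;
	unsigned int queued[DEMO_MAX_BASES];
};

/* Queue a timer on base idx and mark the base active. */
static void demo_enqueue(struct demo_cpu_base *cpu, unsigned int idx)
{
	assert(idx < DEMO_MAX_BASES);
	cpu->queued[idx]++;
	cpu->active_bases |= 1u << idx;
}

/* Remove a timer; clear the bit when the base becomes empty. */
static void demo_dequeue(struct demo_cpu_base *cpu, unsigned int idx)
{
	assert(idx < DEMO_MAX_BASES && cpu->queued[idx] > 0);
	if (--cpu->queued[idx] == 0)
		cpu->active_bases &= ~(1u << idx);
}

/* Walk only the bases whose bit is set; returns how many were visited.
 * The loop ends as soon as no higher bit remains. */
static unsigned int demo_visit_active(const struct demo_cpu_base *cpu)
{
	unsigned int visited = 0;
	unsigned int mask = cpu->active_bases;

	for (unsigned int idx = 0; mask; idx++, mask >>= 1) {
		if (!(mask & 1u))
			continue;
		assert(cpu->queued[idx] > 0);
		visited++; /* a real loop would expire base idx here */
	}
	return visited;
}
```

    Empty bases cost one bit test instead of a look into their data,
    which keeps cache lines of unused bases out of the interrupt path.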
     
  • In the HIGHRES=y case we access the members at the end of struct
    hrtimer_cpu_base first and then the one at the beginning. Move the
    hrtimer data to front, so we have linear progressing access.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
     
  • Peter is concerned about the extra scan of CLOCK_REALTIME_COS in the
    timer interrupt. Yes, I did not think about it, because the solution
    was so elegant. I didn't like the extra list in timerfd when it was
    proposed some time ago, but with an RCU based list the list walk is
    less horrible than the original global lock, which was held over the
    list iteration.

    Requested-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
     

03 May, 2011

2 commits

  • Some applications must be aware of clock realtime being set
    backward. A simple example is a clock applet which arms a timer for
    the next minute display. If clock realtime is set backward then the
    applet displays a stale time for the amount of time which the clock
    was set backwards. Due to that applications poll the time because we
    don't have an interface.

    Extend the timerfd interface by adding a flag which puts the timer
    onto a different internal realtime clock. All timers on this clock are
    expired whenever the clock was set.

    The timerfd core records the monotonic offset when the timer is
    created. When the timer is armed, then the current offset is compared
    to the previous recorded offset. When it has changed, then
    timerfd_settime returns -ECANCELED. When a timer is read the offset is
    compared and if it changed, -ECANCELED is returned to user space. Periodic
    timers are not rearmed in the cancelation case.

    Signed-off-by: Thomas Gleixner
    Acked-by: John Stultz
    Cc: Chris Friesen
    Tested-by: Kay Sievers
    Cc: "Kirill A. Shutemov"
    Cc: Peter Zijlstra
    Cc: Davide Libenzi
    Reviewed-by: Alexander Shishkin
    Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104271359580.3323%40ionos%3E
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
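
    The offset comparison described in this commit can be modelled as
    follows. This is an illustration of the check only, not of the
    timerfd system calls: struct demo_timerfd and the demo_* functions
    are hypothetical names, and the offset is a plain integer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timer object: remembers the offset between the
 * monotonic and realtime clocks seen at creation time. */
struct demo_timerfd {
	int64_t recorded_offset;
};

/* Record the current offset when the timer is created. */
static void demo_timerfd_create(struct demo_timerfd *t,
				int64_t current_offset)
{
	assert(t != NULL);
	t->recorded_offset = current_offset;
}

/* Check performed on arm and on read in this model: 0 if the clock
 * was not set since the offset was recorded, -ECANCELED otherwise. */
static int demo_timerfd_check(const struct demo_timerfd *t,
			      int64_t current_offset)
{
	if (t->recorded_offset != current_offset)
		return -ECANCELED;
	return 0;
}
```

    A caller that receives -ECANCELED knows the realtime clock was set
    and can recompute its absolute expiry instead of polling the time.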
     
  • Make clock_was_set() unconditional and rename hres_timers_resume to
    hrtimers_resume. This is a preparatory patch for hrtimers which are
    cancelled when clock realtime was set.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

11 Mar, 2011

1 commit