20 Feb, 2014

1 commit

  • The generic sched_clock registration function was previously
    lockless, because it was expected to be called only once. However,
    there are now systems that may register multiple sched_clock
    sources, and for them the lack of locking has caused problems:

    If two sched_clock sources are registered, a call to sched_clock()
    may end up using the epoch cycle count from the old counter
    together with the cycle count from the new counter. This can lead
    to confusing results where sched_clock() values jump and are then
    reset to 0 (because the registration function forces epoch_ns to
    0).

    Fix this by reorganizing the registration function to hold the
    seqlock for as short a time as possible while updating the
    clock_data structure for a new counter, and by folding any
    accumulated time into epoch_ns instead of resetting the time to 0,
    so that the clock doesn't reset after each successful registration
    (see the sketch after this entry).

    [jstultz: Added extra context to the commit message]

    Reported-by: Will Deacon
    Signed-off-by: Stephen Boyd
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Will Deacon
    Cc: Josh Cartwright
    Link: http://lkml.kernel.org/r/1392662736-7803-2-git-send-email-john.stultz@linaro.org
    Signed-off-by: John Stultz
    Signed-off-by: Thomas Gleixner

    Stephen Boyd
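
    A minimal sketch of the reorganized registration flow, assuming
    the clock_data layout of kernel/time/sched_clock.c from that era
    (illustrative, not the exact patch):

        /* Do all expensive work outside the write section; hold the
         * seqcount only while publishing the new clock_data.
         */
        void sched_clock_register(u64 (*read)(void), int bits,
                                  unsigned long rate)
        {
                u64 cyc, ns, new_epoch;

                /* Fold time accumulated on the old counter into
                 * epoch_ns so the clock doesn't jump back to 0.
                 */
                new_epoch = read();
                cyc = read_sched_clock();
                ns = cd.epoch_ns +
                     cyc_to_ns((cyc - cd.epoch_cyc) & sched_clock_mask,
                               cd.mult, cd.shift);

                raw_write_seqcount_begin(&cd.seq);
                read_sched_clock = read;
                sched_clock_mask = CLOCKSOURCE_MASK(bits);
                cd.epoch_cyc = new_epoch;
                cd.epoch_ns = ns;
                raw_write_seqcount_end(&cd.seq);
        }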
     

12 Jan, 2014

1 commit

  • Unfortunately the seqlock lockdep enablement can't be used
    in sched_clock(), since the lockdep infrastructure eventually
    calls into sched_clock(), which causes a deadlock.

    Thus, this patch changes all generic sched_clock() usage to use
    the raw_* methods (sketched after this entry).

    Acked-by: Linus Torvalds
    Reviewed-by: Stephen Boyd
    Reported-by: Krzysztof Hałasa
    Signed-off-by: John Stultz
    Cc: Uwe Kleine-König
    Cc: Willy Tarreau
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1388704274-5278-2-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar

    John Stultz
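
    A minimal sketch of the reader using the lockdep-free raw_*
    seqcount API, assuming the clock_data snapshot loop from
    kernel/time/sched_clock.c of that era:

        /* raw_read_seqcount_begin() skips the lockdep hooks that
         * would otherwise recurse back into sched_clock().
         */
        unsigned long long notrace sched_clock(void)
        {
                u64 epoch_ns, epoch_cyc, cyc;
                unsigned long seq;

                do {
                        seq = raw_read_seqcount_begin(&cd.seq);
                        epoch_cyc = cd.epoch_cyc;
                        epoch_ns = cd.epoch_ns;
                } while (read_seqcount_retry(&cd.seq, seq));

                cyc = read_sched_clock();
                cyc = (cyc - epoch_cyc) & sched_clock_mask;
                return epoch_ns + cyc_to_ns(cyc, cd.mult, cd.shift);
        }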
     

31 Jul, 2013

3 commits

  • The ARM architected system counter has at least 56 usable bits.
    Add support for counters wider than 32 bits to the generic
    sched_clock implementation, so that we can increase the time
    between wrap-handling wakeups on these devices while still
    benefiting from the irqtime accounting and suspend/resume handling
    that the generic sched_clock code already has. On my system, using
    56 bits instead of 32 changes the wraparound time from a few
    minutes to an hour. For faster-running counters (GHz range) this
    is even more important, because we may not be able to run the
    timer in time to deal with the wraparound if only 32 bits are
    used.

    We choose a maxsec value of 3600 seconds because we assume no
    system will go idle for more than an hour. In the future we may
    need to increase this value.

    Note: All users should switch over to the 64-bit read function so
    we can remove setup_sched_clock() in favor of
    sched_clock_register() (example usage sketched after this entry).

    Cc: Russell King
    Signed-off-by: Stephen Boyd
    Signed-off-by: John Stultz

    Stephen Boyd
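
    A hypothetical driver-side registration of a 56-bit counter; the
    read callback and hardware accessor names are made up for
    illustration, only sched_clock_register() itself comes from the
    patch:

        /* With 56 usable bits, wrap-handling wakeups become roughly
         * hourly (bounded by the 3600 s maxsec above) instead of
         * every few minutes with a 32-bit mask.
         */
        static u64 notrace my_counter_read(void)
        {
                return my_read_56bit_counter();  /* hypothetical */
        }

        static void __init my_timer_init(unsigned long rate)
        {
                sched_clock_register(my_counter_read, 56, rate);
        }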
     
  • In the next patch we're going to increase the number of bits that
    the generic sched_clock can handle to be greater than 32. With
    more than 32 bits, the wraparound time can be larger than what
    fits into the unsigned int that msecs_to_jiffies() takes. Luckily,
    the wraparound is initially calculated in nanoseconds, which we
    can easily use with hrtimers, so switch to using an hrtimer (see
    the sketch after this entry).

    Cc: Russell King
    Signed-off-by: Stephen Boyd
    [jstultz: Fixup hrtimer initialization order issue]
    Signed-off-by: John Stultz

    Stephen Boyd
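
    A minimal sketch of the hrtimer-based wrap handling, assuming the
    variable names used in the generic sched_clock code of that era:

        /* The wrap interval stays in nanoseconds (64-bit ktime_t),
         * so it is not limited by msecs_to_jiffies()'s unsigned int.
         */
        static struct hrtimer sched_clock_timer;
        static ktime_t wrap_kt;

        static enum hrtimer_restart sched_clock_poll(struct hrtimer *hrt)
        {
                update_sched_clock();   /* refresh epoch before wrap */
                hrtimer_forward_now(hrt, wrap_kt);
                return HRTIMER_RESTART;
        }

        static void __init start_wrap_timer(u64 wrap_ns)
        {
                wrap_kt = ns_to_ktime(wrap_ns);
                hrtimer_init(&sched_clock_timer, CLOCK_MONOTONIC,
                             HRTIMER_MODE_REL);
                sched_clock_timer.function = sched_clock_poll;
                hrtimer_start(&sched_clock_timer, wrap_kt,
                              HRTIMER_MODE_REL);
        }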
     
  • We're going to increase the cyc value to 64 bits in the near
    future. Doing that is going to break the custom seqcount-like
    implementation in the sched_clock code, because 64-bit loads and
    stores aren't guaranteed to be atomic on 32-bit machines. Replace
    the cyc_copy scheme with a seqcount to avoid this problem (see the
    sketch after this entry).

    Cc: Russell King
    Acked-by: Will Deacon
    Signed-off-by: Stephen Boyd
    Signed-off-by: John Stultz

    Stephen Boyd
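
    A minimal sketch of the reader after the change, assuming the
    clock_data fields from the generic sched_clock code: the snapshot
    is taken under a seqcount instead of comparing epoch_cyc against
    its epoch_cyc_copy shadow, so it stays correct once epoch_cyc
    grows to 64 bits.

        u64 epoch_ns, epoch_cyc;
        unsigned long seq;

        do {
                seq = read_seqcount_begin(&cd.seq);
                epoch_cyc = cd.epoch_cyc;
                epoch_ns = cd.epoch_ns;
        } while (read_seqcount_retry(&cd.seq, seq));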
     

18 Jun, 2013

1 commit

  • There is a small race between when the cycle count is read from
    the hardware and when the epoch stabilizes. Consider this
    scenario:

    CPU0                            CPU1
    ----                            ----
    cyc = read_sched_clock()
    cyc_to_sched_clock()
                                    update_sched_clock()
                                     ...
                                     cd.epoch_cyc = cyc;
      epoch_cyc = cd.epoch_cyc;
      ...
      epoch_ns + cyc_to_ns(cyc - epoch_cyc)

    The cyc on CPU0 was read before the epoch changed, but we
    calculate the nanoseconds based on the new epoch, by subtracting
    the new epoch cycle count from the old cycle count. Since the new
    epoch is most likely larger than the old cycle count, the unsigned
    subtraction wraps around and yields a large number that is
    converted to nanoseconds and added to epoch_ns, causing time to
    jump forward too much.

    Fix this problem by reading the hardware after the epoch has
    stabilized (see the sketch after this entry).

    Cc: Russell King
    Signed-off-by: Stephen Boyd
    Signed-off-by: John Stultz

    Stephen Boyd
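
    A minimal sketch of the fixed read path, assuming the pre-seqcount
    epoch_cyc_copy scheme the code used at the time: the hardware
    counter is read only after the epoch snapshot has stabilized, so
    cyc and epoch_cyc always belong to the same epoch.

        do {
                epoch_cyc = cd.epoch_cyc;
                smp_rmb();
                epoch_ns = cd.epoch_ns;
                smp_rmb();
        } while (epoch_cyc != cd.epoch_cyc_copy);

        cyc = read_sched_clock();       /* moved after the loop */
        cyc = (cyc - epoch_cyc) & sched_clock_mask;
        return epoch_ns + cyc_to_ns(cyc, cd.mult, cd.shift);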
     

13 Jun, 2013

1 commit