24 Jul, 2014

40 commits

  • This adds some documentation about clock sources, clock events,
    the weak sched_clock() function and delay timers that answers
    questions that repeatedly arise on the mailing lists.

    Cc: Thomas Gleixner
    Cc: Nicolas Pitre
    Cc: Colin Cross
    Cc: John Stultz
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Signed-off-by: Linus Walleij
    Acked-by: Nicolas Pitre
    Signed-off-by: John Stultz

    Linus Walleij
     
  • By caching the ntp_tick_length() when we correct the frequency error,
    and then using that cached value to accumulate error, we avoid large
    initial errors when the tick length is changed.

    This makes convergence happen much faster in the simulator, since the
    initial error doesn't have to be slowly whittled away.

    This initially seems like an accounting error, but Miroslav pointed out
    that ntp_tick_length() can change mid-tick, so when we apply it in the
    error accumulation, we are applying any recent change to the entire tick.

    This approach applies changes to the ntp_tick_length() only from the
    next tick onward, which allows us to calculate the freq correction
    before using the new tick length, and avoids accumulating error.
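    The idea can be sketched in miniature (the names and values below are
    illustrative, not the kernel's):

```c
#include <stdint.h>

/* Illustrative only: sample the (possibly changing) tick length once,
 * when the frequency correction is computed, and use that cached value
 * for this tick's error accumulation.  A mid-tick change of the NTP
 * tick length then only takes effect on the next tick. */
static uint64_t ntp_tick_len = 1000000000ULL;	/* live NTP tick length */
static uint64_t cached_tick_len;

static void correct_frequency(void)
{
	cached_tick_len = ntp_tick_len;		/* cache for this tick */
}

static uint64_t accumulate_error(uint64_t accumulated)
{
	return accumulated + cached_tick_len;	/* use the cached value */
}
```

    A change to ntp_tick_len between correct_frequency() calls is not
    applied retroactively to the tick in progress.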

    Credit to Miroslav for pointing this out and providing the original
    patch this functionality has been pulled out from, along with the
    rationale.

    Cc: Miroslav Lichvar
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Reported-by: Miroslav Lichvar
    Signed-off-by: John Stultz

    John Stultz
     
    The existing timekeeping_adjust logic has always been complicated
    to understand. Further, since it was developed prior to NOHZ becoming
    common, it's not surprising that it performs poorly when NOHZ is
    enabled.

    Since Miroslav pointed out the problematic nature of the existing code
    in the NOHZ case, I've tried to refactor the code to perform better.

    The problem with the previous approach was that it tried to adjust
    for the total cumulative error using a scaled damping factor. This
    resulted in large errors being corrected slowly, while small errors
    were corrected quickly. With NOHZ the timekeeping code doesn't know
    how far out the next tick will be, so this results in bad
    over-correction of small errors and insufficient correction of large
    errors.

    Inspired by Miroslav's patch, I've refactored the code to try to
    address the correction in two steps.

    1) Check the future freq error for the next tick, and if the frequency
    error is large, try to make sure we correct it so it doesn't cause
    much accumulated error.

    2) Then make a small single unit adjustment to correct any cumulative
    error that has collected over time.

    This method performs fairly well in the simulator Miroslav created.
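    As a rough sketch of that two-step shape (the function name and
    thresholds here are hypothetical, not the kernel code):

```c
#include <stdint.h>

/* Hypothetical miniature of the two-step correction:
 * 1) if the per-tick frequency error is large, correct it up front so
 *    it cannot accumulate;
 * 2) then make a single unit adjustment against whatever cumulative
 *    error remains. */
static int32_t adjust_mult(int32_t mult, int64_t freq_err, int64_t cum_err)
{
	/* step 1: large frequency error -> correct it directly */
	if (freq_err > 1 || freq_err < -1)
		mult += (int32_t)freq_err;

	/* step 2: single unit nudge against accumulated error */
	if (cum_err > 0)
		mult += 1;
	else if (cum_err < 0)
		mult -= 1;

	return mult;
}
```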

    Major credit to Miroslav for pointing out the issue, providing the
    original patch to resolve this, a simulator for testing, as well as
    helping debug and resolve issues in my implementation so that it
    performed closer to his original implementation.

    Cc: Miroslav Lichvar
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Reported-by: Miroslav Lichvar
    Signed-off-by: John Stultz

    John Stultz
     
  • In the GENERIC_TIME_VSYSCALL_OLD update_vsyscall implementation,
    we take the tk_xtime() value, which returns a timespec64, and
    store it in a timespec.

    Luckily this is OK, since the only architectures that use
    GENERIC_TIME_VSYSCALL_OLD are ia64 and ppc64, which are both
    64 bit systems where timespec64 is the same as a timespec.

    Even so, for cleanliness reasons, use the conversion function
    to assign the proper type.

    Signed-off-by: John Stultz

    John Stultz
     
  • Expose the new NMI safe accessor to clock monotonic to the tracer.

    Signed-off-by: Thomas Gleixner
    Cc: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Mathieu Desnoyers
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Tracers want a correlated time between the kernel instrumentation and
    user space. We really do not want to export sched_clock() to user
    space, so we need to provide something sensible for this.

    Using separate data structures with a non-blocking sequence count
    based update mechanism allows us to do that. The data structure
    required for the readout has a sequence counter and two copies of the
    timekeeping data.

    On the update side:

    smp_wmb();
    tkf->seq++;
    smp_wmb();
    update(tkf->base[0], tk);
    smp_wmb();
    tkf->seq++;
    smp_wmb();
    update(tkf->base[1], tk);

    On the reader side:

    do {
            seq = tkf->seq;
            smp_rmb();
            idx = seq & 0x01;
            now = now(tkf->base[idx]);
            smp_rmb();
    } while (seq != tkf->seq);

    So if a NMI hits the update of base[0] it will use base[1] which is
    still consistent, but this timestamp is not guaranteed to be monotonic
    across an update.

    The timestamp is calculated by:

    now = base_mono + clock_delta * slope

    So if the update lowers the slope, readers who are forced to the
    not yet updated second array are still using the old steeper slope.

    tmono
    ^
    |    o  n
    |   o  n
    |  u
    | o
    |o
    |12345678---> reader order

    o = old slope
    u = update
    n = new slope

    So reader 6 will observe time going backwards versus reader 5.

    While other CPUs are likely to be able to observe that, the only way
    for a CPU local observation is when an NMI hits in the middle of
    the update. Timestamps taken from that NMI context might be ahead
    of the following timestamps. Callers need to be aware of that and
    deal with it.
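    The dual-copy indexing can be modeled in a single-threaded sketch
    (names are illustrative; the smp_wmb()/smp_rmb() barriers from the
    sequences above are only noted in comments):

```c
#include <stdint.h>

/* Single-threaded model of the latch: the writer increments the
 * sequence around each copy update; a reader indexes by the LSB of the
 * sequence it sampled, which selects the copy that is NOT being
 * written at that point. */
struct fast_tk {
	unsigned int seq;
	uint64_t base[2];
};

static void latch_update(struct fast_tk *tkf, uint64_t val)
{
	tkf->seq++;			/* smp_wmb() before and after */
	tkf->base[0] = val;
	tkf->seq++;			/* smp_wmb() before and after */
	tkf->base[1] = val;
}

static uint64_t latch_read(struct fast_tk *tkf)
{
	unsigned int seq;
	uint64_t now;

	do {
		seq = tkf->seq;		/* smp_rmb() after this load */
		now = tkf->base[seq & 0x01];
	} while (seq != tkf->seq);	/* smp_rmb() before the re-check */
	return now;
}
```

    A reader that samples an odd sequence (update of base[0] in flight)
    lands in base[1], and vice versa.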

    V2: Got rid of clock monotonic raw and reorganized the data
    structures. Folded in the barrier fix from Mathieu.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Mathieu Desnoyers
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • For NMI safe access to clock monotonic we use the seqcount LSB as
    index of a timekeeper array. The update sequence looks like this:

    smp_wmb();
    tkf->seq++;
    smp_wmb();
    update(tkf->base[0], tk);
    smp_wmb();
    tkf->seq++;
    smp_wmb();
    update(tkf->base[1], tk);
    Cc: John Stultz
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Mathieu Desnoyers
     
  • raw_read_seqcount opens a read critical section of the given seqcount
    without any lockdep checking and without checking or masking the
    LSB. Calling code is responsible for handling that.

    Preparatory patch to provide a NMI safe clock monotonic accessor
    function.

    Signed-off-by: Thomas Gleixner
    Cc: John Stultz
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Mathieu Desnoyers
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • All the function needs is in the tk_read_base struct. No functional
    change for the current code, just a preparatory patch for the NMI safe
    accessor to clock monotonic which will use struct tk_read_base as well.

    Signed-off-by: Thomas Gleixner
    Cc: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Mathieu Desnoyers
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    The members of the new struct are the required ones for the new NMI
    safe accessor to clock monotonic. In order to reuse the existing
    timekeeping code and to make the update of the fast NMI safe
    timekeepers a simple memcpy use the struct for the timekeeper as well
    and convert all users.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Mathieu Desnoyers
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    Access to the time requires touching two cachelines at minimum:

    1) The timekeeper data structure

    2) The clocksource data structure

    The access to the clocksource data structure can be avoided as almost
    all clocksource implementations ignore the argument to the read
    callback, which is a pointer to the clocksource.

    But the core needs to touch it to access the members @read and @mask.

    So we are better off by copying the @read function pointer and the
    @mask from the clocksource to the core data structure itself.

    For the most used ktime_get() access all required data including the
    @read and @mask copies fits together with the sequence counter into a
    single 64 byte cacheline.

    For the other time access functions we currently touch three
    cachelines in the worst case. But with the clocksource data copies we
    can reduce that to two adjacent cachelines, which is more efficient
    than disjoint cachelines.
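    An illustrative layout along those lines (field names follow this
    description; the real kernel struct may differ):

```c
#include <stdint.h>

typedef uint64_t cycle_t;
struct clocksource;	/* opaque here */

/* Illustrative only: @read and @mask are copied from the clocksource
 * so that the ktime_get() hot path, together with the sequence
 * counter, fits into a single 64 byte cacheline. */
struct tk_read_base {
	struct clocksource	*clock;
	cycle_t			(*read)(struct clocksource *cs);
	cycle_t			mask;
	cycle_t			cycle_last;
	uint32_t		mult;
	uint32_t		shift;
	uint64_t		xtime_nsec;
	uint64_t		base_mono;	/* ktime_t in the real code */
};
```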

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • cycle_last was added to the clocksource to support the TSC
    validation. We moved that to the core code, so we can get rid of the
    extra copy.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • The only user of the cycle_last validation is the x86 TSC. In order to
    provide NMI safe accessor functions for clock monotonic and
    monotonic_raw we need to do that in the core.

    We can't do the TSC specific

    if (now < cycle_last)
            now = cycle_last;

    for the other wrapping clocksources, but TSC has
    CLOCKSOURCE_MASK(64), which actually does not mask out anything, so
    if now is less than cycle_last the subtraction gives a negative
    result. So we can check for that in clocksource_delta() and return 0
    in that case.
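    A sketch of such a delta helper (the clamp is only valid for a full
    64 bit mask like the TSC's; names are illustrative):

```c
#include <stdint.h>

typedef uint64_t cycle_t;

/* With CLOCKSOURCE_MASK(64) nothing is masked out, so now < last
 * produces a "negative" delta (top bit set after the unsigned
 * subtraction); clamp it to 0.  For genuinely wrapping clocksources
 * with a smaller mask this clamp would be wrong, which is why it must
 * stay conditional in real code. */
static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
{
	cycle_t ret = (now - last) & mask;

	return (int64_t)ret < 0 ? 0 : ret;
}
```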

    Implement and enable it for x86.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • We want to move the TSC sanity check into core code to make NMI safe
    accessors to clock monotonic[_raw] possible. For this we need to
    sanity check the delta calculation. Create a helper function and
    convert all sites to use it.

    [ Build fix from jstultz ]

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • We have interfaces. Remove the open coded cruft. Reduces text size
    along with the code.

    Signed-off-by: Thomas Gleixner
    Cc: QCA ath9k Development
    Cc: John W. Linville
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • No point in converting timespecs back and forth.

    Signed-off-by: Thomas Gleixner
    Cc: Thomas Hellstrom
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Use ktime_get_raw_ns() and get rid of the back and forth timespec
    conversions.

    Signed-off-by: Thomas Gleixner
    Acked-by: Daniel Vetter
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Provide a ktime_t based interface for raw monotonic time.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    There is no point in having an S390 private implementation and there
    is no point in using the raw monotonic time. The NTP frequency
    adjustment of CLOCK_MONOTONIC really does not do any harm to the
    hang check timer.

    Use ktime_get_ns() for everything and get rid of the timespec
    conversions.

    V2: Drop the raw monotonic and the S390 special case

    Signed-off-by: Thomas Gleixner
    Cc: Arnd Bergmann
    Cc: Greg Kroah-Hartman
    Cc: Heiko Carstens
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    timekeeping_clocktai() is not used in fast paths, so the extra
    timespec conversion is not problematic.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    No more users. Remove it.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Subtracting plain nsec values and converting to timespec is simpler
    than the whole timespec math. Not really fastpath code, so the
    division is not an issue.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    get_monotonic_boottime() is not used in fast paths, so the extra
    timespec conversion is not problematic.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • No more users.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Convert the relevant base data right away to nanoseconds instead of
    doing the conversion on every readout. Reduces text size by 160 bytes.

    Signed-off-by: Thomas Gleixner
    Cc: Gleb Natapov
    Cc: kvm@vger.kernel.org
    Acked-by: Paolo Bonzini
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Use the new nanoseconds based interface and get rid of the timespec
    conversion dance.

    Signed-off-by: Thomas Gleixner
    Cc: Gleb Natapov
    Cc: kvm@vger.kernel.org
    Acked-by: Paolo Bonzini
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Use the nanoseconds based interface instead of converting from a
    timespec.

    Signed-off-by: Thomas Gleixner
    Cc: Russell King
    Cc: linux-arm-kernel@lists.infradead.org
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • No idea why iio needs wall clock based time stamps, but we can avoid
    the timespec conversion dance by using the new interfaces.

    Signed-off-by: Thomas Gleixner
    Acked-by: Jonathan Cameron
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Using the wall clock time for delta time calculations is wrong to
    begin with because wall clock time can be set from userspace and NTP.
    Such data wants to be based on clock monotonic.

    The calculations also are done on a nanosecond basis. Use the
    nanoseconds based interface right away.

    Signed-off-by: Thomas Gleixner
    Cc: Jean Delvare
    Acked-by: Jean Delvare
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    Replace the ever recurring:
        ktime_get_ts(&ts);
        ns = timespec_to_ns(&ts);
    with
        ns = ktime_get_ns();

    Signed-off-by: Thomas Gleixner
    Acked-by: Trond Myklebust
    Cc: "J. Bruce Fields"
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • This code is beyond silly:

    struct timespec ts;
    ktime_get_ts(&ts);
    ktime_t ktime = timespec_to_ktime(ts);

    Further down the code builds the delta of two ktime_t values and
    converts the result to nanoseconds.

    Use ktime_get_ns() and replace all the nonsense.

    Signed-off-by: Thomas Gleixner
    Cc: Eli Cohen
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    Replace the ever recurring:
        ktime_get_ts(&ts);
        ns = timespec_to_ns(&ts);
    with
        ns = ktime_get_ns();

    Signed-off-by: Thomas Gleixner
    Acked-by: Arnd Bergmann
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    Replace the ever recurring:
        ktime_get_ts(&ts);
        ns = timespec_to_ns(&ts);
    with
        ns = ktime_get_ns();

    Signed-off-by: Thomas Gleixner
    Acked-by: Lee Jones
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    Replace the ever recurring:
        ktime_get_ts(&ts);
        ns = timespec_to_ns(&ts);
    with
        ns = ktime_get_ns();

    Signed-off-by: Thomas Gleixner
    Cc: Evgeniy Polyakov
    Signed-off-by: John Stultz

    Thomas Gleixner
     
    Replace the ever recurring:
        ktime_get_ts(&ts);
        ns = timespec_to_ns(&ts);
    with
        ns = ktime_get_ns();

    Signed-off-by: Thomas Gleixner
    Acked-by: Arnd Bergmann
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Converting cputime to timespec and timespec to nanoseconds makes no
    sense. Use cputime_to_ns() and be done with it.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Kill the timespec juggling and calculate with plain nanoseconds.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Simplify the timespec to nsec/usec conversions.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Simplify the only user of this data by removing the timespec
    conversion.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     
  • Required for moving drivers to the nanosecond based interfaces.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner