01 Oct, 2020

1 commit

  • [ Upstream commit 4cbbc3a0eeed675449b1a4d080008927121f3da3 ]

    While unlikely, the divisor in scale64_check_overflow() could be >= 32bit.
    do_div() truncates the divisor to 32bit, at least on 32bit platforms.

    Use div64_u64() instead to avoid the truncation to 32-bit.

    [ tglx: Massaged changelog ]

    Signed-off-by: Wen Yang
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20200120100523.45656-1-wenyang@linux.alibaba.com
    Signed-off-by: Sasha Levin

    Wen Yang
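
The truncation described in this entry can be reproduced in plain user-space C. The following is a minimal sketch, not kernel code: div_truncated_divisor() and div_full_divisor() are hypothetical helpers that only model the two behaviours the changelog contrasts (a divisor cut down to its low 32 bits versus a full 64-by-64 division).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Models a division helper that only honours the low 32 bits of the
 * divisor, which is what the changelog says do_div() does on 32-bit
 * platforms. Hypothetical helper, not a kernel function.
 */
uint64_t div_truncated_divisor(uint64_t dividend, uint64_t divisor)
{
	uint32_t low = (uint32_t)divisor;	/* upper 32 bits are lost here */

	assert(low != 0);
	return dividend / low;
}

/* Full 64-by-64 division, the behaviour the fix relies on. */
uint64_t div_full_divisor(uint64_t dividend, uint64_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}
```

With a dividend of 2^40 and a divisor of 2^32 + 2, the full division yields 255, while the truncating variant silently divides by 2 and yields 2^39. For divisors that fit in 32 bits the two helpers agree.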
     

23 Aug, 2019

1 commit

  • The VDSO update for CLOCK_BOOTTIME has an overflow issue as it shifts the
    nanoseconds based boot time offset left by the clocksource shift. That
    overflows once the boot time offset becomes large enough. As a consequence
    CLOCK_BOOTTIME in the VDSO becomes a random number causing applications to
    misbehave.

    Fix it by storing a timespec64 representation of the offset when boot time
    is adjusted and add that to the MONOTONIC base time value in the vdso data
    page. Using the timespec64 representation avoids a 64bit division in the
    update code.

    Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation")
    Reported-by: Chris Clayton
    Signed-off-by: Thomas Gleixner
    Tested-by: Chris Clayton
    Tested-by: Vincenzo Frascino
    Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908221257580.1983@nanos.tec.linutronix.de

    Thomas Gleixner
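
To illustrate why shifting a nanosecond offset left by the clocksource shift eventually overflows, and why a seconds/nanoseconds pair avoids that, here is a small user-space sketch. The struct ts64_model type and both helpers are hypothetical, and the shift value of 24 used in the note below is an assumed example, not a value taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* True if (ns << shift) no longer fits into 64 bits. */
bool shifted_ns_overflows(uint64_t ns, unsigned int shift)
{
	assert(shift > 0 && shift < 64);
	return (ns >> (64 - shift)) != 0;
}

/* Illustrative stand-in for a seconds/nanoseconds time value. */
struct ts64_model {
	int64_t sec;
	int64_t nsec;	/* kept in the range 0 .. NSEC_PER_SEC - 1 */
};

/* Add two normalized values; no shift and no 64-bit division needed. */
struct ts64_model ts64_model_add(struct ts64_model a, struct ts64_model b)
{
	struct ts64_model r = { a.sec + b.sec, a.nsec + b.nsec };

	if (r.nsec >= NSEC_PER_SEC) {
		r.nsec -= NSEC_PER_SEC;
		r.sec++;
	}
	return r;
}
```

Under the assumed shift of 24, any offset of 2^40 ns or more (roughly 18 minutes) no longer survives the shift, whereas adding the offset as a seconds/nanoseconds pair only needs a single carry check.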
     

22 Jun, 2019

1 commit

  • While this doesn't actually amount to a real difference, since the macro
    evaluates to the same thing, every place else operates on ktime_t using
    these functions, so let's not break the pattern.

    Fixes: e3ff9c3678b4 ("timekeeping: Repair ktime_get_coarse*() granularity")
    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Arnd Bergmann
    Link: https://lkml.kernel.org/r/20190621203249.3909-1-Jason@zx2c4.com

    Jason A. Donenfeld
     

14 Jun, 2019

1 commit

  • Jason reported that the coarse ktime based time getters advance only once
    per second and not once per tick as advertised.

    The code reads only the monotonic base time, which advances once per
    second. The nanoseconds are accumulated on every tick in xtime_nsec up to
    a second and the regular time getters take this nanoseconds offset into
    account, but the ktime_get_coarse*() implementation fails to do so.

    Add the accumulated xtime_nsec value to the monotonic base time to get the
    proper per tick advancing coarse time.

    Fixes: b9ff604cff11 ("timekeeping: Add ktime_get_coarse_with_offset")
    Reported-by: Jason A. Donenfeld
    Signed-off-by: Thomas Gleixner
    Tested-by: Jason A. Donenfeld
    Cc: Arnd Bergmann
    Cc: Peter Zijlstra
    Cc: Clemens Ladisch
    Cc: Sultan Alsawaf
    Cc: Waiman Long
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906132136280.1791@nanos.tec.linutronix.de

    Thomas Gleixner
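
The difference between the two readings can be sketched in user-space C. struct tk_coarse_model and both functions are hypothetical; they assume, as is common for shifted fixed-point time values, that xtime_nsec holds nanoseconds scaled up by the clocksource shift.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model of the fields the changelog talks about. */
struct tk_coarse_model {
	int64_t base_ns;	/* monotonic base, advances once per second */
	uint64_t xtime_nsec;	/* shifted nanoseconds accumulated per tick */
	unsigned int shift;	/* clocksource shift (assumed scaling) */
};

/* Behaviour before the fix as described: only the base is read. */
int64_t coarse_ns_base_only(const struct tk_coarse_model *tk)
{
	return tk->base_ns;
}

/* Behaviour after the fix as described: add the accumulated part. */
int64_t coarse_ns_with_xtime(const struct tk_coarse_model *tk)
{
	assert(tk->shift < 64);
	return tk->base_ns + (int64_t)(tk->xtime_nsec >> tk->shift);
}
```

With a base of 5 s and 4 ms accumulated since the last full second, the first function still reports exactly 5 s, while the second reports 5.004 s and therefore advances on every tick.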
     

08 May, 2019

1 commit

  • Pull audit updates from Paul Moore:
    "We've got a reasonably broad set of audit patches for the v5.2 merge
    window, the highlights are below:

    - The biggest change, and the source of all the arch/* changes, is
    the patchset from Dmitry to help enable some of the work he is
    doing around PTRACE_GET_SYSCALL_INFO.

    To be honest, including this in the audit tree is a bit of a
    stretch, but it does help move audit a little further along towards
    proper syscall auditing for all arches, and everyone else seemed to
    agree that audit was a "good" spot for this to land (or maybe they
    just didn't want to merge it? dunno.).

    - We can now audit time/NTP adjustments.

    - We continue the work to connect associated audit records into a
    single event"

    * tag 'audit-pr-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: (21 commits)
    audit: fix a memory leak bug
    ntp: Audit NTP parameters adjustment
    timekeeping: Audit clock adjustments
    audit: purge unnecessary list_empty calls
    audit: link integrity evm_write_xattrs record to syscall event
    syscall_get_arch: add "struct task_struct *" argument
    unicore32: define syscall_get_arch()
    Move EM_UNICORE to uapi/linux/elf-em.h
    nios2: define syscall_get_arch()
    nds32: define syscall_get_arch()
    Move EM_NDS32 to uapi/linux/elf-em.h
    m68k: define syscall_get_arch()
    hexagon: define syscall_get_arch()
    Move EM_HEXAGON to uapi/linux/elf-em.h
    h8300: define syscall_get_arch()
    c6x: define syscall_get_arch()
    arc: define syscall_get_arch()
    Move EM_ARCOMPACT and EM_ARCV2 to uapi/linux/elf-em.h
    audit: Make audit_log_cap and audit_copy_inode static
    audit: connect LOGIN record to its syscall record
    ...

    Linus Torvalds
     

16 Apr, 2019

2 commits

  • Emit an audit record every time selected NTP parameters are modified
    from userspace (via adjtimex(2) or clock_adjtime(2)). These parameters
    may be used to indirectly change system clock, and thus their
    modifications should be audited.

    Such events will now generate records of type AUDIT_TIME_ADJNTPVAL
    containing the following fields:
    - op -- which value was adjusted:
    - offset -- corresponding to the time_offset variable
    - freq -- corresponding to the time_freq variable
    - status -- corresponding to the time_status variable
    - adjust -- corresponding to the time_adjust variable
    - tick -- corresponding to the tick_usec variable
    - tai -- corresponding to the timekeeping's TAI offset
    - old -- the old value
    - new -- the new value

    Example records:

    type=TIME_ADJNTPVAL msg=audit(1530616044.507:7): op=status old=64 new=8256
    type=TIME_ADJNTPVAL msg=audit(1530616044.511:11): op=freq old=0 new=49180377088000

    The records of this type will be associated with the corresponding
    syscall records.

    An overview of parameter changes that can be done via do_adjtimex()
    (based on information from Miroslav Lichvar) and whether they are
    audited:
    __timekeeping_set_tai_offset() -- sets the offset from the
    International Atomic Time
    (AUDITED)
    NTP variables:
    time_offset -- can adjust the clock by up to 0.5 seconds per call
    and also speed it up or slow down by up to about
    0.05% (43 seconds per day) (AUDITED)
    time_freq -- can speed up or slow down by up to about 0.05%
    (AUDITED)
    time_status -- can insert/delete leap seconds and it also enables/
    disables synchronization of the hardware real-time
    clock (AUDITED)
    time_maxerror, time_esterror -- change error estimates used to
    inform userspace applications
    (NOT AUDITED)
    time_constant -- controls the speed of the clock adjustments that
    are made when time_offset is set (NOT AUDITED)
    time_adjust -- can temporarily speed up or slow down the clock by up
    to 0.05% (AUDITED)
    tick_usec -- a more extreme version of time_freq; can speed up or
    slow down the clock by up to 10% (AUDITED)

    Signed-off-by: Ondrej Mosnacek
    Reviewed-by: Richard Guy Briggs
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Paul Moore

    Ondrej Mosnacek
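
The "0.05% (43 seconds per day)" figure quoted in the overview follows from simple arithmetic, which can be checked with a short helper. drift_ms_per_day() is a hypothetical function written for this illustration only; it expresses the rate error in parts per million (0.05% is 500 ppm, 10% is 100000 ppm).

```c
#include <assert.h>
#include <stdint.h>

#define SECONDS_PER_DAY 86400LL

/* Clock drift in milliseconds per day for a rate error given in ppm. */
int64_t drift_ms_per_day(int64_t ppm)
{
	assert(ppm >= 0);
	return SECONDS_PER_DAY * 1000 * ppm / 1000000;
}
```

A 500 ppm error accumulates 43200 ms, i.e. about 43 seconds per day, while the 10% range mentioned for tick_usec corresponds to 8640 seconds (2.4 hours) per day.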
     
  • Emit an audit record whenever the system clock is changed (i.e. shifted
    by a non-zero offset) by a syscall from userspace. The syscalls that can
    (at the time of writing) trigger such a record are:
    - settimeofday(2), stime(2), clock_settime(2) -- via
    do_settimeofday64()
    - adjtimex(2), clock_adjtime(2) -- via do_adjtimex()

    The new records have type AUDIT_TIME_INJOFFSET and contain the following
    fields:
    - sec -- the 'seconds' part of the offset
    - nsec -- the 'nanoseconds' part of the offset

    Example record (time was shifted backwards by ~15.875 seconds):

    type=TIME_INJOFFSET msg=audit(1530616049.652:13): sec=-16 nsec=124887145

    The records of this type will be associated with the corresponding
    syscall records.

    Signed-off-by: Ondrej Mosnacek
    Reviewed-by: Richard Guy Briggs
    Reviewed-by: Thomas Gleixner
    [PM: fixed a line width problem in __audit_tk_injoffset()]
    Signed-off-by: Paul Moore

    Ondrej Mosnacek
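
The example record reads as a backwards shift of about 15.875 seconds because the nsec field is always non-negative and is added to the (possibly negative) sec field. A minimal sketch of that arithmetic, using a hypothetical helper injoffset_to_ns() that is not part of the audit code:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Combine the sec and nsec fields of a record into signed nanoseconds. */
int64_t injoffset_to_ns(int64_t sec, int64_t nsec)
{
	assert(nsec >= 0 && nsec < NSEC_PER_SEC);
	return sec * NSEC_PER_SEC + nsec;
}
```

For sec=-16 and nsec=124887145 this gives -15875112855 ns, i.e. roughly -15.875 seconds.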
     

28 Mar, 2019

1 commit

  • Several people reported testing failures after setting CLOCK_REALTIME close
    to the limits of the kernel internal representation in nanoseconds,
    i.e. year 2262.

    The failures are exposed in subsequent operations, i.e. when arming timers
    or when the advancing CLOCK_MONOTONIC makes the calculation of
    CLOCK_REALTIME overflow into negative space.

    Now people start to paper over the underlying problem by clamping
    calculations to the valid range, but that's just wrong because such
    workarounds will prevent detection of real issues as well.

    It is reasonable to force an upper bound for the various methods of setting
    CLOCK_REALTIME. Year 2262 is the absolute upper bound. Assume a maximum
    uptime of 30 years which is plenty enough even for esoteric embedded
    systems. That results in an upper bound of year 2232 for setting the time.

    Once that limit is reached in reality this limit is only a small part of
    the problem space. But until then this stops people from trying to paper
    over the problem at the wrong places.

    Reported-by: Xiongfeng Wang
    Reported-by: Hongbo Yao
    Signed-off-by: Thomas Gleixner
    Cc: John Stultz
    Cc: Stephen Boyd
    Cc: Miroslav Lichvar
    Cc: Arnd Bergmann
    Cc: Richard Cochran
    Cc: Peter Zijlstra
    Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1903231125480.2157@nanos.tec.linutronix.de

    Thomas Gleixner
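
The numbers in this entry can be reproduced with a short user-space sketch. The MODEL_* macros and settod_value_ok() are hypothetical names chosen for this illustration; the sketch assumes a signed 64-bit nanosecond representation and a 365-day year for the 30-year uptime allowance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC		1000000000LL
/* Largest number of whole seconds representable in signed 64-bit ns. */
#define MODEL_KTIME_SEC_MAX	(INT64_MAX / NSEC_PER_SEC)
/* Assumed maximum uptime: 30 years of 365 days. */
#define MODEL_UPTIME_SEC_MAX	(30LL * 365 * 24 * 3600)
/* Upper bound for values accepted when setting the clock. */
#define MODEL_SETTOD_SEC_MAX	(MODEL_KTIME_SEC_MAX - MODEL_UPTIME_SEC_MAX)

/* Accept only normalized, non-negative values below the upper bound. */
bool settod_value_ok(int64_t sec, int64_t nsec)
{
	assert(MODEL_SETTOD_SEC_MAX > 0);	/* sanity check of the model */
	if (sec < 0 || nsec < 0 || nsec >= NSEC_PER_SEC)
		return false;
	return sec < MODEL_SETTOD_SEC_MAX;
}
```

9223372036 seconds after 1970 falls in the year 2262; subtracting the 946080000-second uptime allowance leaves 8277292036 seconds, which falls in the year 2232.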
     

23 Mar, 2019

1 commit

  • The timekeeping code uses a random mix of "unsigned long" and "unsigned
    int" for the seqcount snapshots (ratio 14:12). Since the seqlock.h API is
    entirely based on unsigned int, use that throughout.

    Signed-off-by: Rasmus Villemoes
    Signed-off-by: Thomas Gleixner
    Cc: Frederic Weisbecker
    Cc: John Stultz
    Cc: Stephen Boyd
    Link: https://lkml.kernel.org/r/20190318195557.20773-1-linux@rasmusvillemoes.dk

    Rasmus Villemoes
     

07 Feb, 2019

1 commit

  • struct timex is not y2038 safe.
    Replace all uses of timex with y2038 safe __kernel_timex.

    Note that struct __kernel_timex is an ABI interface definition.
    We could define a new structure based on __kernel_timex that
    is only available internally instead. Right now, there isn't
    a strong motivation for this as the structure is isolated to
    a few defined struct timex interfaces and such a structure would
    be exactly the same as struct timex.

    The patch was generated by the following coccinelle script:

    virtual patch

    @depends on patch forall@
    identifier ts;
    expression e;
    @@
    (
    - struct timex ts;
    + struct __kernel_timex ts;
    |
    - struct timex ts = {};
    + struct __kernel_timex ts = {};
    |
    - struct timex ts = e;
    + struct __kernel_timex ts = e;
    |
    - struct timex *ts;
    + struct __kernel_timex *ts;
    |
    (memset \| copy_from_user \| copy_to_user \)(...,
    - sizeof(struct timex))
    + sizeof(struct __kernel_timex))
    )

    @depends on patch forall@
    identifier ts;
    identifier fn;
    @@
    fn(...,
    - struct timex *ts,
    + struct __kernel_timex *ts,
    ...) {
    ...
    }

    @depends on patch forall@
    identifier ts;
    identifier fn;
    @@
    fn(...,
    - struct timex *ts) {
    + struct __kernel_timex *ts) {
    ...
    }

    Signed-off-by: Deepa Dinamani
    Cc: linux-alpha@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Signed-off-by: Arnd Bergmann

    Deepa Dinamani
     

29 Dec, 2018

1 commit

  • Pull y2038 updates from Arnd Bergmann:
    "More syscalls and cleanups

    This concludes the main part of the system call rework for 64-bit
    time_t, which has spread over most of year 2018, the last six system
    calls being

    - ppoll
    - pselect6
    - io_pgetevents
    - recvmmsg
    - futex
    - rt_sigtimedwait

    As before, nothing changes for 64-bit architectures, while 32-bit
    architectures gain another entry point that differs only in the layout
    of the timespec structure. Hopefully in the next release we can wire
    up all 22 of those system calls on all 32-bit architectures, which
    gives us a baseline version for glibc to start using them.

    This does not include the clock_adjtime, getrusage/waitid, and
    getitimer/setitimer system calls. I still plan to have new versions of
    those as well, but they are not required for correct operation of the
    C library since they can be emulated using the old 32-bit time_t based
    system calls.

    Aside from the system calls, there are also a few cleanups here,
    removing old kernel internal interfaces that have become unused after
    all references got removed. The arch/sh cleanups are part of this,
    they were posted several times over the past year without a reaction
    from the maintainers, while the corresponding changes made it into all
    other architectures"

    * tag 'y2038-for-4.21' of ssh://gitolite.kernel.org:/pub/scm/linux/kernel/git/arnd/playground:
    timekeeping: remove obsolete time accessors
    vfs: replace current_kernel_time64 with ktime equivalent
    timekeeping: remove timespec_add/timespec_del
    timekeeping: remove unused {read,update}_persistent_clock
    sh: remove board_time_init() callback
    sh: remove unused rtc_sh_get/set_time infrastructure
    sh: sh03: rtc: push down rtc class ops into driver
    sh: dreamcast: rtc: push down rtc class ops into driver
    y2038: signal: Add compat_sys_rt_sigtimedwait_time64
    y2038: signal: Add sys_rt_sigtimedwait_time32
    y2038: socket: Add compat_sys_recvmmsg_time64
    y2038: futex: Add support for __kernel_timespec
    y2038: futex: Move compat implementation into futex.c
    io_pgetevents: use __kernel_timespec
    pselect6: use __kernel_timespec
    ppoll: use __kernel_timespec
    signal: Add restore_user_sigmask()
    signal: Add set_user_sigmask()

    Linus Torvalds
     

18 Dec, 2018

1 commit


05 Dec, 2018

1 commit

  • tk_core.seq is initialized open coded, which fails to initialize the
    lockdep map when lockdep is enabled. Lockdep splats involving tk_core.seq
    consequently lack a name and are hard to read.

    Use the proper initializer which takes care of the lockdep map
    initialization.

    [ tglx: Massaged changelog ]

    Signed-off-by: Bart Van Assche
    Signed-off-by: Thomas Gleixner
    Cc: peterz@infradead.org
    Cc: tj@kernel.org
    Cc: johannes.berg@intel.com
    Link: https://lkml.kernel.org/r/20181128234325.110011-12-bvanassche@acm.org

    Bart Van Assche
     

23 Nov, 2018

2 commits

  • Update the time(r) core files with the correct SPDX license
    identifier based on the license text in the file itself. The SPDX
    identifier is a legally binding shorthand, which can be used instead of the
    full boiler plate text.

    This work is based on a script and data from Philippe Ombredanne, Kate
    Stewart and myself. The data has been created with two independent license
    scanners and manual inspection.

    The following files do not contain any direct license information and have
    been omitted from the big initial SPDX changes:

    timeconst.bc: The .bc files were not touched
    time.c, timer.c, timekeeping.c: Licence was deduced from EXPORT_SYMBOL_GPL

    As those files do not contain direct license references they fall under the
    project license, i.e. GPL V2 only.

    Signed-off-by: Thomas Gleixner
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Russell King
    Cc: Richard Cochran
    Cc: Nicolas Pitre
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Link: https://lkml.kernel.org/r/20181031182252.879109557@linutronix.de

    Thomas Gleixner
     
  • Remove the pointless filenames in the top level comments. They have no
    value at all and just occupy space. While at it tidy up some of the
    comments and remove a stale one.

    Signed-off-by: Thomas Gleixner
    Acked-by: Nicolas Pitre
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Peter Anvin
    Cc: Russell King
    Cc: Richard Cochran
    Cc: "Paul E. McKenney"
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Link: https://lkml.kernel.org/r/20181031182252.794898238@linutronix.de

    Thomas Gleixner
     

27 Aug, 2018

1 commit

  • get_seconds() and do_gettimeofday() are now only used by a few modules
    (waiting for the respective patches to get accepted), and they are
    among the last holdouts of code that is not y2038 safe in the core kernel.

    Move the implementation into the timekeeping32.h header to clean up
    the core kernel and isolate the old interfaces further.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

14 Aug, 2018

1 commit

  • Pull x86 timer updates from Thomas Gleixner:
    "Early TSC based time stamping to allow better boot time analysis.

    This comes with a general cleanup of the TSC calibration code which
    grew warts and duct taping over the years and removes 250 lines of
    code. Initiated and mostly implemented by Pavel with help from various
    folks"

    * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
    x86/kvmclock: Mark kvm_get_preset_lpj() as __init
    x86/tsc: Consolidate init code
    sched/clock: Disable interrupts when calling generic_sched_clock_init()
    timekeeping: Prevent false warning when persistent clock is not available
    sched/clock: Close a hole in sched_clock_init()
    x86/tsc: Make use of tsc_calibrate_cpu_early()
    x86/tsc: Split native_calibrate_cpu() into early and late parts
    sched/clock: Use static key for sched_clock_running
    sched/clock: Enable sched clock early
    sched/clock: Move sched clock initialization and merge with generic clock
    x86/tsc: Use TSC as sched clock early
    x86/tsc: Initialize cyc2ns when tsc frequency is determined
    x86/tsc: Calibrate tsc only once
    ARM/time: Remove read_boot_clock64()
    s390/time: Remove read_boot_clock64()
    timekeeping: Default boot time offset to local_clock()
    timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset()
    s390/time: Add read_persistent_wall_and_boot_offset()
    x86/xen/time: Output xen sched_clock time from 0
    x86/xen/time: Initialize pv xen time in init_hypervisor_platform()
    ...

    Linus Torvalds
     

31 Jul, 2018

1 commit

  • On arches with no persistent clock a message like this is printed during
    boot:

    [ 0.000000] Persistent clock returned invalid value

    The value is not invalid: Zero means that no persistent clock is available
    and the absence of persistent clock should be quietly accepted.

    Fixes: 3eca993740b8 ("timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset()")
    Signed-off-by: Pavel Tatashin
    Signed-off-by: Thomas Gleixner
    Cc: steven.sistare@oracle.com
    Cc: daniel.m.jordan@oracle.com
    Cc: sboyd@kernel.org
    Cc: john.stultz@linaro.org
    Link: https://lkml.kernel.org/r/20180725200018.23722-1-pasha.tatashin@oracle.com

    Pavel Tatashin
     

20 Jul, 2018

5 commits

  • On some hardware with multiple clocksources, we have coarse grained
    clocksources that support the CLOCK_SOURCE_SUSPEND_NONSTOP flag, but
    which are less than ideal for timekeeping whereas other clocksources
    can be better candidates but halt on suspend.

    Currently, the timekeeping core only supports timing suspend using
    CLOCK_SOURCE_SUSPEND_NONSTOP clocksources if that clocksource is the
    current clocksource for timekeeping.

    As a result, some architectures try to implement read_persistent_clock64()
    using those non-stop clocksources, but that isn't really ideal, as it
    introduces more duplicate code. To fix this, provide logic to allow a
    registered SUSPEND_NONSTOP clocksource, which isn't the current
    clocksource, to be used to calculate the suspend time.

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Miroslav Lichvar
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Cc: Stephen Boyd
    Cc: Daniel Lezcano
    Reviewed-by: Thomas Gleixner
    Reviewed-by: Daniel Lezcano
    Suggested-by: Thomas Gleixner
    Signed-off-by: Baolin Wang
    [jstultz: minor tweaks to merge with previous resume changes]
    Signed-off-by: John Stultz

    Baolin Wang
     
  • Currently, there is a corner case when there is only one
    clocksource, e.g. the RTC, and the system fails to enter
    suspend mode. On resume, rtc_resume() injects the sleeptime
    because timekeeping_rtc_skipresume() returns 'false' (the default
    value of sleeptime_injected), which results in a mismatch in
    timestamps.

    This issue can also occur on a system where more than one
    clocksource is present and the very first suspend fails.

    Success case:
    ------------
    {sleeptime_injected=false}
    rtc_suspend() => timekeeping_suspend() => timekeeping_resume() =>

    (sleeptime injected)
    rtc_resume()

    Failure case:
    ------------
    {failure in sleep path} {sleeptime_injected=false}
    rtc_suspend() => rtc_resume()

    {sleeptime injected again which was not required as the suspend failed}

    Fix this by handling the boolean logic properly.

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Miroslav Lichvar
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Cc: Stephen Boyd
    Originally-by: Thomas Gleixner
    Signed-off-by: Mukesh Ojha
    Signed-off-by: John Stultz

    Mukesh Ojha
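
The changelog does not spell out how the boolean logic is corrected. One possible reading, sketched here purely as an assumption, is to track whether a suspend was really entered and to inject sleep time only while that suspend is still unaccounted for. All names below (struct sleep_model, timing_needed and the model_* functions) are hypothetical and do not claim to match the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state; field names are illustrative only. */
struct sleep_model {
	bool timing_needed;	/* set only when suspend was really entered */
	int64_t injected_ns;	/* total sleep time injected so far */
};

/* Models the point where timekeeping actually suspends. */
void model_timekeeping_suspend(struct sleep_model *m)
{
	m->timing_needed = true;
}

/* Injects sleep time and marks the pending suspend as accounted for. */
void model_inject_sleeptime(struct sleep_model *m, int64_t delta_ns)
{
	assert(delta_ns >= 0);
	m->timing_needed = false;
	m->injected_ns += delta_ns;
}

/* Models the RTC resume hook: inject only if a suspend is unaccounted. */
void model_rtc_resume(struct sleep_model *m, int64_t delta_ns)
{
	if (m->timing_needed)
		model_inject_sleeptime(m, delta_ns);
}
```

In this model the failure case (resume without a preceding suspend) injects nothing, and the success case injects the sleep time exactly once even if the resume hook runs twice.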
     
  • Add 'const' to some function arguments and variables to make it easier
    to read the code.

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Miroslav Lichvar
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Cc: Stephen Boyd
    Signed-off-by: Ondrej Mosnacek
    [jstultz: Also fixup pre-existing checkpatch warnings for
    prototype arguments with no variable name]
    Signed-off-by: John Stultz

    Ondrej Mosnacek
     
  • read_persistent_wall_and_boot_offset() is called during boot to read
    both the persistent clock and also return the offset between the boot time
    and the value of persistent clock.

    Change the default boot_offset from zero to local_clock() so architectures,
    that do not have a dedicated boot_clock but have early sched_clock(), such
    as SPARCv9, x86, and possibly more will benefit from this change by getting
    a better and more consistent estimate of the boot time without need for an
    arch specific implementation.

    Signed-off-by: Pavel Tatashin
    Signed-off-by: Thomas Gleixner
    Cc: steven.sistare@oracle.com
    Cc: daniel.m.jordan@oracle.com
    Cc: linux@armlinux.org.uk
    Cc: schwidefsky@de.ibm.com
    Cc: heiko.carstens@de.ibm.com
    Cc: john.stultz@linaro.org
    Cc: sboyd@codeaurora.org
    Cc: hpa@zytor.com
    Cc: douly.fnst@cn.fujitsu.com
    Cc: peterz@infradead.org
    Cc: prarit@redhat.com
    Cc: feng.tang@intel.com
    Cc: pmladek@suse.com
    Cc: gnomes@lxorguk.ukuu.org.uk
    Cc: linux-s390@vger.kernel.org
    Cc: boris.ostrovsky@oracle.com
    Cc: jgross@suse.com
    Cc: pbonzini@redhat.com
    Link: https://lkml.kernel.org/r/20180719205545.16512-17-pasha.tatashin@oracle.com

    Pavel Tatashin
     
  • If an architecture does not support exact boot time, it is challenging to
    estimate boot time without having a reference to the current persistent
    clock value. Yet, it cannot read the persistent clock time again, because
    this may lead to math discrepancies with the caller of read_boot_clock64(),
    which has read the persistent clock at a different time.

    This is why it is better to provide two values simultaneously: the
    persistent clock value, and the boot time.

    Replace read_boot_clock64() with:
    read_persistent_wall_and_boot_offset(wall_time, boot_offset)

    Where wall_time is returned by read_persistent_clock() and boot_offset is
    wall_time - boot time, which defaults to 0.

    Signed-off-by: Pavel Tatashin
    Signed-off-by: Thomas Gleixner
    Cc: steven.sistare@oracle.com
    Cc: daniel.m.jordan@oracle.com
    Cc: linux@armlinux.org.uk
    Cc: schwidefsky@de.ibm.com
    Cc: heiko.carstens@de.ibm.com
    Cc: john.stultz@linaro.org
    Cc: sboyd@codeaurora.org
    Cc: hpa@zytor.com
    Cc: douly.fnst@cn.fujitsu.com
    Cc: peterz@infradead.org
    Cc: prarit@redhat.com
    Cc: feng.tang@intel.com
    Cc: pmladek@suse.com
    Cc: gnomes@lxorguk.ukuu.org.uk
    Cc: linux-s390@vger.kernel.org
    Cc: boris.ostrovsky@oracle.com
    Cc: jgross@suse.com
    Cc: pbonzini@redhat.com
    Link: https://lkml.kernel.org/r/20180719205545.16512-16-pasha.tatashin@oracle.com

    Pavel Tatashin
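
The relation between the two returned values can be shown with a small user-space model. struct wall_and_boot_model and both functions are hypothetical stand-ins; they only illustrate that boot time is derived from a single consistent pair of readings as wall_time - boot_offset.

```c
#include <assert.h>
#include <stdint.h>

/* Both values are taken from the same instant, so they stay consistent. */
struct wall_and_boot_model {
	int64_t wall_ns;	/* persistent clock reading */
	int64_t boot_offset_ns;	/* wall time minus boot time */
};

/* Hypothetical stand-in for an arch hook that returns both values at once. */
struct wall_and_boot_model read_wall_and_boot_model(int64_t persistent_ns,
						     int64_t since_boot_ns)
{
	struct wall_and_boot_model v = { persistent_ns, since_boot_ns };

	assert(since_boot_ns >= 0);
	return v;
}

/* boot_offset = wall_time - boot_time, hence boot_time = wall - offset. */
int64_t model_boot_time_ns(const struct wall_and_boot_model *v)
{
	return v->wall_ns - v->boot_offset_ns;
}
```

With the default offset of 0 the computed boot time simply equals the wall time read at that moment; a non-zero offset moves the boot time back by the time already spent booting.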
     

13 Jul, 2018

1 commit


11 Jul, 2018

1 commit

  • When the NTP frequency is set directly from userspace using the
    ADJ_FREQUENCY or ADJ_TICK timex mode, immediately update the
    timekeeper's multiplier instead of waiting for the next tick.

    This removes a hidden non-deterministic delay in setting of the
    frequency and allows an extremely tight control of the system clock
    with update rates close to or even exceeding the kernel HZ.

    The update is limited to archs using modern timekeeping
    (!ARCH_USES_GETTIMEOFFSET).

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Miroslav Lichvar
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Cc: Stephen Boyd
    Signed-off-by: Miroslav Lichvar
    Signed-off-by: John Stultz

    Miroslav Lichvar
     

19 Jun, 2018

1 commit


19 May, 2018

3 commits

  • I have run into a couple of drivers using current_kernel_time()
    suffering from the y2038 problem, and they could be converted
    to using ktime_t, but don't have interfaces that skip the nanosecond
    calculation at the moment.

    This introduces ktime_get_coarse_with_offset() as a simpler
    variant of ktime_get_with_offset(), and adds wrappers for the
    three time domains we support with the existing function.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Thomas Gleixner
    Cc: Stephen Boyd
    Cc: y2038@lists.linaro.org
    Cc: John Stultz
    Link: https://lkml.kernel.org/r/20180427134016.2525989-5-arnd@arndb.de

    Arnd Bergmann
     
  • The current_kernel_time64, get_monotonic_coarse64, getrawmonotonic64,
    get_monotonic_boottime64 and timekeeping_clocktai64 interfaces have
    rather inconsistent naming, and they differ in the calling conventions
    by passing the output either by reference or as a return value.

    Rename them to ktime_get_coarse_real_ts64, ktime_get_coarse_ts64,
    ktime_get_raw_ts64, ktime_get_boottime_ts64 and ktime_get_clocktai_ts64
    respectively, and provide the interfaces with macros or inline
    functions as needed.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Thomas Gleixner
    Cc: Stephen Boyd
    Cc: y2038@lists.linaro.org
    Cc: John Stultz
    Link: https://lkml.kernel.org/r/20180427134016.2525989-4-arnd@arndb.de

    Arnd Bergmann
     
  • In a move to make ktime_get_*() the preferred driver interface into the
    timekeeping code, sanitize ktime_get_real_ts64() to be a proper exported
    symbol rather than an alias for getnstimeofday64().

    The internal __getnstimeofday64() is no longer used, so remove that
    and merge it into ktime_get_real_ts64().

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Thomas Gleixner
    Cc: Stephen Boyd
    Cc: y2038@lists.linaro.org
    Cc: John Stultz
    Link: https://lkml.kernel.org/r/20180427134016.2525989-3-arnd@arndb.de

    Arnd Bergmann
     

26 Apr, 2018

1 commit

  • Revert commits

    92af4dcb4e1c ("tracing: Unify the "boot" and "mono" tracing clocks")
    127bfa5f4342 ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
    7250a4047aa6 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
    d6c7270e913d ("timekeeping: Remove boot time specific code")
    f2d6fdbfd238 ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
    d6ed449afdb3 ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
    72199320d49d ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")

    As stated in the pull request for the unification of CLOCK_MONOTONIC and
    CLOCK_BOOTTIME, it was clear that we might have to revert the change.

    As reported by several folks systemd and other applications rely on the
    documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
    changes. After resume daemons time out and other timeout related issues are
    observed. Rafael compiled this list:

    * systemd kills daemons on resume, after >WatchdogSec seconds
    of suspending (Genki Sky). [Verified that that's because systemd uses
    CLOCK_MONOTONIC and expects it to not include the suspend time.]

    * systemd-journald misbehaves after resume:
    systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
    corrupted or uncleanly shut down, renaming and replacing.
    (Mike Galbraith).

    * NetworkManager reports "networking disabled" and networking is broken
    after resume 50% of the time (Pavel). [May be because of systemd.]

    * MATE desktop dims the display and starts the screensaver right after
    system resume (Pavel).

    * Full system hang during resume (me). [May be due to systemd or NM or both.]

    That happens on Debian and openSUSE systems.

    It's sad that these problems were neither caught in -next nor by those
    folks who expressed interest in this change.

    Reported-by: Rafael J. Wysocki
    Reported-by: Genki Sky
    Reported-by: Pavel Machek
    Signed-off-by: Thomas Gleixner
    Cc: Dmitry Torokhov
    Cc: John Stultz
    Cc: Jonathan Corbet
    Cc: Kevin Easton
    Cc: Linus Torvalds
    Cc: Mark Salyzyn
    Cc: Michael Kerrisk
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: Prarit Bhargava
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt

    Thomas Gleixner
     

17 Apr, 2018

1 commit

  • The __current_kernel_time() function based on 'struct timespec' is no
    longer recommended for new code, and the only user of this function has
    been replaced by commit 6909e29fdefb ("kdb: use __ktime_get_real_seconds
    instead of __current_kernel_time").

    Remove the obsolete interface.

    Signed-off-by: Baolin Wang
    Signed-off-by: Thomas Gleixner
    Cc: arnd@arndb.de
    Cc: sboyd@kernel.org
    Cc: broonie@kernel.org
    Cc: john.stultz@linaro.org
    Link: https://lkml.kernel.org/r/1a9dbea7ee2cda7efe9ed330874075cf17fdbff6.1523596316.git.baolin.wang@linaro.org

    Baolin Wang
     

13 Mar, 2018

4 commits

  • Now that the MONOTONIC and BOOTTIME clocks are identical remove all the special
    casing.

    The user space visible interfaces still support both clocks, but their behavior
    is identical.

    Signed-off-by: Thomas Gleixner
    Cc: Dmitry Torokhov
    Cc: John Stultz
    Cc: Jonathan Corbet
    Cc: Kevin Easton
    Cc: Linus Torvalds
    Cc: Mark Salyzyn
    Cc: Michael Kerrisk
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: Prarit Bhargava
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20180301165150.410218515@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Now that the MONOTONIC and BOOTTIME clocks are the same, remove all the
    special handling from timekeeping. Keep wrappers for the existing users of
    the *boot* timekeeper interfaces.

    Signed-off-by: Thomas Gleixner
    Cc: Dmitry Torokhov
    Cc: John Stultz
    Cc: Jonathan Corbet
    Cc: Kevin Easton
    Cc: Linus Torvalds
    Cc: Mark Salyzyn
    Cc: Michael Kerrisk
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: Prarit Bhargava
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20180301165150.236279497@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • The MONOTONIC clock is not fast forwarded by the time spent in suspend on
    resume. This is only done for the BOOTTIME clock. The reason why the
    MONOTONIC clock is not forwarded is historical: the original Linux
    implementation was using jiffies as a base for the MONOTONIC clock and
    jiffies have never been advanced after resume.

    At some point when timekeeping was unified in the core code, the
    MONOTONIC clock was advanced after resume which also advanced jiffies causing
    interesting side effects. As a consequence the MONOTONIC clock forwarding
    was disabled again and the BOOTTIME clock was introduced, which allows reading
    the time since boot.

    Back then it was not possible to completely disentangle the MONOTONIC clock and
    jiffies because there were still interfaces which exposed the MONOTONIC clock
    behaviour based on the timer wheel and therefore jiffies.

    As of today none of the MONOTONIC clock facilities depends on jiffies
    anymore so the forwarding can be done separately. This is achieved by
    forwarding the variables which are used for the jiffies update after resume
    before the tick is restarted.

    In timekeeping resume, the change is rather simple. Instead of updating the
    offset between the MONOTONIC clock and the REALTIME/BOOTTIME clocks, advance the
    time keeper base for the MONOTONIC and the MONOTONIC_RAW clocks by the time
    spent in suspend.

    The MONOTONIC clock is now the same as the BOOTTIME clock and the offset between
    the REALTIME and the MONOTONIC clocks is the same as before suspend.

    There might be side effects in applications, which rely on the
    (unfortunately) well documented behaviour of the MONOTONIC clock, but the
    downsides of the existing behaviour are probably worse.

    There is one obvious issue. Up to now it was possible to retrieve the time
    spent in suspend by observing the delta between the MONOTONIC clock and the
    BOOTTIME clock. This is no longer available, but the previously introduced
    mechanism to read the active non-suspended monotonic time can mitigate that
    in a detectable fashion.

    Signed-off-by: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Dmitry Torokhov
    Cc: John Stultz
    Cc: Jonathan Corbet
    Cc: Kevin Easton
    Cc: Linus Torvalds
    Cc: Mark Salyzyn
    Cc: Michael Kerrisk
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: Prarit Bhargava
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20180301165150.062975504@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
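
The commit message above notes that the time spent in suspend used to be
observable as the delta between the BOOTTIME and MONOTONIC clocks. A minimal
user-space sketch of that measurement follows. The helper names
timespec_diff_ns() and suspend_delta_ns() are hypothetical and not part of any
kernel or libc interface; only clock_gettime() and the two clock ids are real.
On a kernel where the two clocks are unified, the delta should stay close to
zero; otherwise it should approximate the accumulated suspend time plus the
small gap between the two reads.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* makes CLOCK_BOOTTIME visible on glibc in strict modes */
#endif
#include <stdint.h>
#include <time.h>

/* Difference a - b in nanoseconds. */
int64_t timespec_diff_ns(struct timespec a, struct timespec b)
{
	return (int64_t)(a.tv_sec - b.tv_sec) * 1000000000LL +
	       (a.tv_nsec - b.tv_nsec);
}

/*
 * Hypothetical helper: store BOOTTIME - MONOTONIC in *delta_ns.
 * Returns 0 on success, -1 if either clock cannot be read.
 */
int suspend_delta_ns(int64_t *delta_ns)
{
	struct timespec mono, boot;

	/* Read MONOTONIC first so the read gap only adds to the delta. */
	if (clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
		return -1;
	if (clock_gettime(CLOCK_BOOTTIME, &boot) != 0)
		return -1;

	*delta_ns = timespec_diff_ns(boot, mono);
	return 0;
}
```

An application could call suspend_delta_ns() before and after an interval and
compare the two results to estimate how long the system was suspended in
between, assuming the clocks are not unified on the running kernel.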
     
  • The planned change to unify the behaviour of the MONOTONIC and BOOTTIME
    clocks vs. suspend removes the ability to retrieve the active
    non-suspended time of a system.

    Provide a new CLOCK_MONOTONIC_ACTIVE clock which returns the active
    non-suspended time of the system via clock_gettime().

    This preserves the old behaviour of CLOCK_MONOTONIC before the
    BOOTTIME/MONOTONIC unification.

    This new clock also allows applications to detect programmatically that
    the MONOTONIC and BOOTTIME clocks are identical.

    Signed-off-by: Thomas Gleixner
    Cc: Dmitry Torokhov
    Cc: John Stultz
    Cc: Jonathan Corbet
    Cc: Kevin Easton
    Cc: Linus Torvalds
    Cc: Mark Salyzyn
    Cc: Michael Kerrisk
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: Prarit Bhargava
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20180301165149.965235774@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

10 Mar, 2018

2 commits

  • When the length of the NTP tick changes significantly, e.g. when an
    NTP/PTP application is correcting the initial offset of the clock, a
    large value may accumulate in the NTP error before the multiplier
    converges to the correct value. It may then take a very long time (hours
    or even days) before the error is corrected. This causes the clock to
    have an unstable frequency offset, which has a negative impact on the
    stability of synchronization with precise time sources (e.g. NTP/PTP
    using hardware timestamping or the PTP KVM clock).

    Use division to determine the correct multiplier directly from the NTP
    tick length and replace the iterative approach. This removes the last
    major source of the NTP error. The only remaining source is now limited
    resolution of the multiplier, which is corrected by adding 1 to the
    multiplier when the system clock is behind the NTP time.

    Signed-off-by: Miroslav Lichvar
    Signed-off-by: John Stultz
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Prarit Bhargava
    Cc: Richard Cochran
    Cc: Stephen Boyd
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1520620971-9567-3-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar

    Miroslav Lichvar
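
The idea of computing the multiplier by division rather than by iteration can
be sketched in plain C. This is an illustration under stated assumptions, not
the kernel code: the function name mult_from_tick_length() and its parameters
are hypothetical, and it relies only on the usual clocksource convention that
nanoseconds = (cycles * mult) >> shift, so that one tick satisfies
cycles_per_tick * mult = tick_length_ns << shift.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel function: derive the multiplier
 * from the tick length with a single division.
 *
 * tick_len_shifted: desired tick length in nanoseconds, shifted left
 *                   by the clocksource shift
 * cycles_per_tick:  clocksource cycles in one tick (must be non-zero)
 * clock_behind:     non-zero if the system clock lags the NTP time
 */
uint32_t mult_from_tick_length(uint64_t tick_len_shifted,
			       uint64_t cycles_per_tick,
			       int clock_behind)
{
	uint64_t mult = tick_len_shifted / cycles_per_tick;

	/* The division truncates, so bump by one while the clock is behind. */
	if (clock_behind)
		mult += 1;

	return (uint32_t)mult;
}
```

With a shift of 24, a 1 ms tick and 1000000 cycles per tick, the division
yields exactly 1 << 24. The optional increment mirrors the remaining
correction for the limited resolution of the multiplier that the commit
message describes.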
     
  • When the timekeeping multiplier is changed, the NTP error is updated to
    correct the clock for the delay between the tick and the update of the
    clock. This error is corrected in later updates and the clock appears as
    if the frequency was changed exactly on the tick.

    Remove this correction to keep the point where the frequency is
    effectively changed at the time of the update. This removes a major
    source of the NTP error.

    Signed-off-by: Miroslav Lichvar
    Signed-off-by: John Stultz
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Prarit Bhargava
    Cc: Richard Cochran
    Cc: Stephen Boyd
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1520620971-9567-2-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar

    Miroslav Lichvar
     

14 Nov, 2017

1 commit

  • As of d4d1fc61eb38f (ia64: Update fsyscall gettime to use modern
    vsyscall_update) the last user of CONFIG_GENERIC_TIME_VSYSCALL_OLD
    has been updated, so the legacy support for old-style vsyscall
    implementations can be removed from the timekeeping code.

    (Thanks again to Tony Luck for helping remove the last user!)

    [jstultz: Commit message rework]

    Signed-off-by: Miroslav Lichvar
    Signed-off-by: John Stultz
    Signed-off-by: Thomas Gleixner
    Cc: Prarit Bhargava
    Cc: Tony Luck
    Cc: Richard Cochran
    Cc: Stephen Boyd
    Link: https://lkml.kernel.org/r/1510613491-16695-1-git-send-email-john.stultz@linaro.org

    Miroslav Lichvar
     

12 Nov, 2017

1 commit

  • __getnstimeofday() is a rather odd interface, with a number of quirks:

    - The caller may come from NMI context, but the implementation is not NMI safe.
    One way to get there from NMI is:

      NMI handler:
        something bad
          panic()
            kmsg_dump()
              pstore_dump()
                pstore_record_init()
                  __getnstimeofday()

    - The calling conventions are different from any other timekeeping functions,
    to deal with returning an error code during suspended timekeeping.

    Address the above issues by using a completely different method to get the
    time: ktime_get_real_fast_ns() is NMI safe and has a reasonable behavior
    when timekeeping is suspended: it returns the time at which it got
    suspended. As Thomas Gleixner explained, this is safe, as
    ktime_get_real_fast_ns() does not call into the clocksource driver that
    might be suspended.

    The result can easily be transformed into a timespec structure. Since
    ktime_get_real_fast_ns() was not exported to modules, add the export.

    The pstore behavior for the suspended case changes slightly, as it now
    stores the timestamp at which timekeeping was suspended instead of storing
    a zero timestamp.

    This change does not address y2038-safety; that is subject to a more
    complex follow-up patch.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Thomas Gleixner
    Acked-by: Kees Cook
    Cc: Tony Luck
    Cc: Anton Vorontsov
    Cc: Stephen Boyd
    Cc: John Stultz
    Cc: Colin Cross
    Link: https://lkml.kernel.org/r/20171110152530.1926955-1-arnd@arndb.de

    Arnd Bergmann
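
The transformation of a nanosecond count into a timespec structure mentioned
above amounts to one division and one remainder. The following is a user-space
sketch under that assumption; ns_to_timespec_sketch() is a hypothetical name,
and in-kernel code would more likely use an existing helper such as
ns_to_timespec64() rather than open-coding the arithmetic.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: split a nanosecond count into seconds and remainder. */
struct timespec ns_to_timespec_sketch(uint64_t ns)
{
	struct timespec ts;

	ts.tv_sec = (time_t)(ns / NSEC_PER_SEC);
	ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
	return ts;
}
```

Because the input is unsigned, tv_nsec always ends up in the range
0..999999999, so no normalization step is needed afterwards.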
     

01 Nov, 2017

1 commit