18 Apr, 2019
1 commit
-
Commit 0a0e0829f990 ("nohz: Fix missing tick reprogram when interrupting an
inline softirq") got backported to stable trees and now causes the NOHZ
softirq pending warning to trigger. It's not an upstream issue as the NOHZ
update logic has been changed there.The problem is when a softirq disabled section gets interrupted and on
return from interrupt the tick/nohz state is evaluated, which then can
observe pending soft interrupts. These soft interrupts are legitimately
pending because they cannot be processed as long as soft interrupts are
disabled and the interrupted code will correctly process them when soft
interrupts are reenabled.Add a check for softirqs disabled to the pending check to prevent the
warning.Reported-by: Grygorii Strashko
Reported-by: John Crispin
Signed-off-by: Thomas Gleixner
Tested-by: Grygorii Strashko
Tested-by: John Crispin
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Anna-Maria Gleixner
Cc: Greg Kroah-Hartman
Cc: stable@vger.kernel.org
Acked-by: Frederic Weisbecker
Tested-by: Geert UytterhoevenSigned-off-by: Leonard Crestez
Acked-by: Jason Liu
Signed-off-by: Arulpandiyan Vadivel
17 Apr, 2019
1 commit
-
commit 07d7e12091f4ab869cc6a4bb276399057e73b0b3 upstream.
To calculate a remaining time, it's required to subtract the current time
from the expiration time. In alarm_timer_remaining() the arguments of
ktime_sub are swapped.Fixes: d653d8457c76 ("alarmtimer: Implement remaining callback")
Signed-off-by: Andrei Vagin
Signed-off-by: Thomas Gleixner
Reviewed-by: Mukesh Ojha
Cc: Stephen Boyd
Cc: John Stultz
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190408041542.26338-1-avagin@gmail.com
Signed-off-by: Greg Kroah-Hartman
13 Feb, 2019
1 commit
-
[ Upstream commit ce10a5b3954f2514af726beb78ed8d7350c5e41c ]
tk_core.seq is initialized open coded, but that misses to initialize the
lockdep map when lockdep is enabled. Lockdep splats involving tk_core seq
consequently lack a name and are hard to read.Use the proper initializer which takes care of the lockdep map
initialization.[ tglx: Massaged changelog ]
Signed-off-by: Bart Van Assche
Signed-off-by: Thomas Gleixner
Cc: peterz@infradead.org
Cc: tj@kernel.org
Cc: johannes.berg@intel.com
Link: https://lkml.kernel.org/r/20181128234325.110011-12-bvanassche@acm.org
Signed-off-by: Sasha Levin
31 Jan, 2019
1 commit
-
commit 93ad0fc088c5b4631f796c995bdd27a082ef33a6 upstream.
The recent commit which prevented a division by 0 issue in the alarm timer
code broke posix CPU timers as an unwanted side effect.The reason is that the common rearm code checks for timer->it_interval
being 0 now. What went unnoticed is that the posix cpu timer setup does not
initialize timer->it_interval as it stores the interval in CPU timer
specific storage. The reason for the separate storage is historical as the
posix CPU timers always had a 64bit nanoseconds representation internally
while timer->it_interval is type ktime_t which used to be a modified
timespec representation on 32bit machines.Instead of reverting the offending commit and fixing the alarmtimer issue
in the alarmtimer code, store the interval in timer->it_interval at CPU
timer setup time so the common code check works. This also repairs the
existing inconistency of the posix CPU timer code which kept a single shot
timer armed despite of the interval being 0.The separate storage can be removed in mainline, but that needs to be a
separate commit as the current one has to be backported to stable kernels.Fixes: 0e334db6bb4b ("posix-timers: Fix division by zero bug")
Reported-by: H.J. Lu
Signed-off-by: Thomas Gleixner
Cc: John Stultz
Cc: Peter Zijlstra
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190111133500.840117406@linutronix.de
Signed-off-by: Greg Kroah-Hartman
29 Dec, 2018
1 commit
-
commit 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 upstream.
The signal delivery path of posix-timers can try to rearm the timer even if
the interval is zero. That's handled for the common case (hrtimer) but not
for alarm timers. In that case the forwarding function raises a division by
zero exception.The handling for hrtimer based posix timers is wrong because it marks the
timer as active despite the fact that it is stopped.Move the check from common_hrtimer_rearm() to posixtimer_rearm() to cure
both issues.Reported-by: syzbot+9d38bedac9cc77b8ad5e@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner
Cc: John Stultz
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: sboyd@kernel.org
Cc: stable@vger.kernel.org
Cc: syzkaller-bugs@googlegroups.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1812171328050.1880@nanos.tec.linutronix.de
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman
07 Sep, 2018
1 commit
-
I turns out that the silly spawn kthread from worker was actually needed.
clocksource_watchdog_kthread() cannot be called directly from
clocksource_watchdog_work(), because clocksource_select() calls
timekeeping_notify() which uses stop_machine(). One cannot use
stop_machine() from a workqueue() due lock inversions wrt CPU hotplug.Revert the patch but add a comment that explain why we jump through such
apparently silly hoops.Fixes: 7197e77abcb6 ("clocksource: Remove kthread")
Reported-by: Siegfried Metz
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Thomas Gleixner
Tested-by: Niklas Cassel
Tested-by: Kevin Shanahan
Tested-by: viktor_jaegerskuepper@freenet.de
Tested-by: Siegfried Metz
Cc: rafael.j.wysocki@intel.com
Cc: len.brown@intel.com
Cc: diego.viola@gmail.com
Cc: rui.zhang@intel.com
Cc: bjorn.andersson@linaro.org
Link: https://lkml.kernel.org/r/20180905084158.GR24124@hirez.programming.kicks-ass.net
22 Aug, 2018
1 commit
-
…iederm/user-namespace
Pull core signal handling updates from Eric Biederman:
"It was observed that a periodic timer in combination with a
sufficiently expensive fork could prevent fork from every completing.
This contains the changes to remove the need for that restart.This set of changes is split into several parts:
- The first part makes PIDTYPE_TGID a proper pid type instead
something only for very special cases. The part starts using
PIDTYPE_TGID enough so that in __send_signal where signals are
actually delivered we know if the signal is being sent to a a group
of processes or just a single process.- With that prep work out of the way the logic in fork is modified so
that fork logically makes signals received while it is running
appear to be received after the fork completes"* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (22 commits)
signal: Don't send signals to tasks that don't exist
signal: Don't restart fork when signals come in.
fork: Have new threads join on-going signal group stops
fork: Skip setting TIF_SIGPENDING in ptrace_init_task
signal: Add calculate_sigpending()
fork: Unconditionally exit if a fatal signal is pending
fork: Move and describe why the code examines PIDNS_ADDING
signal: Push pid type down into complete_signal.
signal: Push pid type down into __send_signal
signal: Push pid type down into send_signal
signal: Pass pid type into do_send_sig_info
signal: Pass pid type into send_sigio_to_task & send_sigurg_to_task
signal: Pass pid type into group_send_sig_info
signal: Pass pid and pid type into send_sigqueue
posix-timers: Noralize good_sigevent
signal: Use PIDTYPE_TGID to clearly store where file signals will be sent
pid: Implement PIDTYPE_TGID
pids: Move the pgrp and session pid pointers from task_struct to signal_struct
kvm: Don't open code task_pid in kvm_vcpu_ioctl
pids: Compute task_tgid using signal->leader_pid
...
14 Aug, 2018
3 commits
-
Pull parisc updates from Helge Deller:
- parisc now uses the generic dma_noncoherent_ops implementation
(Christoph Hellwig)- further memory barrier and spinlock improvements (John David Anglin)
- prepare removal of current_text_addr() functions (Nick Desaulniers)
- improve kernel stack unwinding on parisc (me)
- drop ENOTSUP which was defined on parisc only (me)
* 'parisc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix and improve kernel stack unwinding
parisc: Remove unnecessary barriers from spinlock.h
parisc: Remove ordered stores from syscall.S
parisc: prefer _THIS_IP_ and _RET_IP_ statement expressions
parisc: Add HAVE_REGS_AND_STACK_ACCESS_API feature
parisc: Drop architecture-specific ENOTSUP define
parisc: use generic dma_noncoherent_ops
parisc: always use flush_kernel_dcache_range for DMA cache maintainance
parisc: merge pcx_dma_ops and pcxl_dma_ops -
Pull x86 timer updates from Thomas Gleixner:
"Early TSC based time stamping to allow better boot time analysis.This comes with a general cleanup of the TSC calibration code which
grew warts and duct taping over the years and removes 250 lines of
code. Initiated and mostly implemented by Pavel with help from various
folks"* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
x86/kvmclock: Mark kvm_get_preset_lpj() as __init
x86/tsc: Consolidate init code
sched/clock: Disable interrupts when calling generic_sched_clock_init()
timekeeping: Prevent false warning when persistent clock is not available
sched/clock: Close a hole in sched_clock_init()
x86/tsc: Make use of tsc_calibrate_cpu_early()
x86/tsc: Split native_calibrate_cpu() into early and late parts
sched/clock: Use static key for sched_clock_running
sched/clock: Enable sched clock early
sched/clock: Move sched clock initialization and merge with generic clock
x86/tsc: Use TSC as sched clock early
x86/tsc: Initialize cyc2ns when tsc frequency is determined
x86/tsc: Calibrate tsc only once
ARM/time: Remove read_boot_clock64()
s390/time: Remove read_boot_clock64()
timekeeping: Default boot time offset to local_clock()
timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset()
s390/time: Add read_persistent_wall_and_boot_offset()
x86/xen/time: Output xen sched_clock time from 0
x86/xen/time: Initialize pv xen time in init_hypervisor_platform()
... -
Pull timer updates from Thomas Gleixner:
"The timers departement more or less proudly presents:- More Y2038 timekeeping work mostly in the core code. The work is
slowly, but steadily targeting the actuall syscalls.- Enhanced timekeeping suspend/resume support by utilizing
clocksources which do not stop during suspend, but are otherwise
not the main timekeeping clocksources.- Make NTP adjustmets more accurate and immediate when the frequency
is set directly and not incrementally.- Sanitize the overrung handing of posix timers
- A new timer driver for Mediatek SoCs
- The usual pile of fixes and updates all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
clockevents: Warn if cpu_all_mask is used as cpumask
tick/broadcast-hrtimer: Use cpu_possible_mask for ce_broadcast_hrtimer
clocksource/drivers/arm_arch_timer: Fix bogus cpu_all_mask usage
clocksource: ti-32k: Remove CLOCK_SOURCE_SUSPEND_NONSTOP flag
timers: Clear timer_base::must_forward_clk with timer_base::lock held
clocksource/drivers/sprd: Register one always-on timer to compensate suspend time
clocksource/drivers/timer-mediatek: Add support for system timer
clocksource/drivers/timer-mediatek: Convert the driver to timer-of
clocksource/drivers/timer-mediatek: Use specific prefix for GPT
clocksource/drivers/timer-mediatek: Rename mtk_timer to timer-mediatek
clocksource/drivers/timer-mediatek: Add system timer bindings
clocksource/drivers: Set clockevent device cpumask to cpu_possible_mask
time: Introduce one suspend clocksource to compensate the suspend time
time: Fix extra sleeptime injection when suspend fails
timekeeping/ntp: Constify some function arguments
ntp: Use kstrtos64 for s64 variable
ntp: Remove redundant arguments
timer: Fix coding style
ktime: Provide typesafe ktime_to_ns()
hrtimer: Improve kernel message printing
...
13 Aug, 2018
1 commit
-
parisc is the only Linux architecture which has defined a value for ENOTSUP.
All other architectures #define ENOTSUP as EOPNOTSUPP in their libc headers.Having an own value for ENOTSUP which is different than EOPNOTSUPP often gives
problems with userspace programs which expect both to be the same. One such
example is a build error in the libuv package, as can be seen in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900237.Since we dropped HP-UX support, there is no real benefit in keeping an own
value for ENOTSUP. This patch drops the parisc value for ENOTSUP from the
kernel sources. glibc needs no patch, it reuses the exported headers.Signed-off-by: Helge Deller
02 Aug, 2018
3 commits
-
Using cpu_all_mask in clockevents cpumask may result in issues while
comparing multiple clockevent devices to choose the preferred one.On one of the platforms with 2 system (i.e. non per-CPU) timers with
different ratings, having cpu_all_mask for one of the device resulted in a
boot hang due to a endless loop in clockevents_notify_released() as both
were clocksources were selected as preferred.In order to prevent such issues in the future, warn if any clockevent
driver sets cpu_all_mask as it's cpumask and just override it to use
cpu_possible_mask. All the existing occurrences of cpu_all_mask are already
replaced with cpu_possible_mask.Signed-off-by: Sudeep Holla
Signed-off-by: Thomas Gleixner
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lkml.kernel.org/r/1531308264-24220-3-git-send-email-sudeep.holla@arm.com -
This is the last instance of cpu_all_mask usage in the core framework.
Replace it with cpu_possible_mask like all other instances in the
clockevent drivers. This makes it possible to add a warning in the core
clockevents_register_device on usage of cpu_all_mask from any clockevent
drivers in the future.Signed-off-by: Sudeep Holla
Signed-off-by: Thomas Gleixner
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lkml.kernel.org/r/1531308264-24220-2-git-send-email-sudeep.holla@arm.com -
timer_base::must_forward_clock is indicating that the base clock might be
stale due to a long idle sleep.The forwarding of the base clock takes place in the timer softirq or when a
timer is enqueued to a base which is idle. If the enqueue of timer to an
idle base happens from a remote CPU, then the following race can happen:CPU0 CPU1
run_timer_softirq mod_timerbase = lock_timer_base(timer);
base->must_forward_clk = false
if (base->must_forward_clk)
forward(base); -> skippedenqueue_timer(base, timer, idx);
-> idx is calculated high due to
stale base
unlock_timer_base(timer);
base = lock_timer_base(timer);
forward(base);The root cause is that timer_base::must_forward_clk is cleared outside the
timer_base::lock held region, so the remote queuing CPU observes it as
cleared, but the base clock is still stale. This can cause large
granularity values for timers, i.e. the accuracy of the expiry time
suffers.Prevent this by clearing the flag with timer_base::lock held, so that the
forwarding takes place before the cleared flag is observable by a remote
CPU.Signed-off-by: Gaurav Kohli
Signed-off-by: Thomas Gleixner
Cc: john.stultz@linaro.org
Cc: sboyd@kernel.org
Cc: linux-arm-msm@vger.kernel.org
Link: https://lkml.kernel.org/r/1533199863-22748-1-git-send-email-gkohli@codeaurora.org
01 Aug, 2018
1 commit
-
local_timer_softirq_pending() checks whether the timer softirq is
pending with: local_softirq_pending() & TIMER_SOFTIRQ.This is wrong because TIMER_SOFTIRQ is the softirq number and not a
bitmask. So the test checks for the wrong bit.Use BIT(TIMER_SOFTIRQ) instead.
Fixes: 5d62c183f9e9 ("nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()")
Signed-off-by: Anna-Maria Gleixner
Signed-off-by: Thomas Gleixner
Reviewed-by: Paul E. McKenney
Reviewed-by: Daniel Bristot de Oliveira
Acked-by: Frederic Weisbecker
Cc: bigeasy@linutronix.de
Cc: peterz@infradead.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180731161358.29472-1-anna-maria@linutronix.de
31 Jul, 2018
1 commit
-
On arches with no persistent clock a message like this is printed during
boot:[ 0.000000] Persistent clock returned invalid value
The value is not invalid: Zero means that no persistent clock is available
and the absence of persistent clock should be quietly accepted.Fixes: 3eca993740b8 ("timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset()")
Signed-off-by: Pavel Tatashin
Signed-off-by: Thomas Gleixner
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: sboyd@kernel.org
Cc: john.stultz@linaro.org
Link: https://lkml.kernel.org/r/20180725200018.23722-1-pasha.tatashin@oracle.com
21 Jul, 2018
3 commits
-
Make the code more maintainable by performing more of the signal
related work in send_sigqueue.A quick inspection of do_timer_create will show that this code path
does not lookup a thread group by a thread's pid. Making it safe
to find the task pointed to by it_pid with "pid_task(it_pid, type)";This supports the changes needed in fork to tell if a signal was sent
to a single process or a group of processes.Having the pid to task transition in signal.c will also make it easier
to sort out races with de_thread and and the thread group leader
exiting when it comes time to address that.Signed-off-by: "Eric W. Biederman"
-
In good_sigevent directly compute the default return value as
"task_tgid(current)". This is exactly the same as
"task_pid(current->group_leader)" but written more clearly.In the thread case first compute the thread's pid. Then veify that
attached to that pid is a thread of the current thread group.This has the net effect of making the code a little clearer, and
making it obvious that posix timers never look up a process by a the
pid of a thread.Signed-off-by: "Eric W. Biederman"
-
Everywhere except in the pid array we distinguish between a tasks pid and
a tasks tgid (thread group id). Even in the enumeration we want that
distinction sometimes so we have added __PIDTYPE_TGID. With leader_pid
we almost have an implementation of PIDTYPE_TGID in struct signal_struct.Add PIDTYPE_TGID as a first class member of the pid_type enumeration and
into the pids array. Then remove the __PIDTYPE_TGID special case and the
leader_pid in signal_struct.The net size increase is just an extra pointer added to struct pid and
an extra pair of pointers of an hlist_node added to task_struct.The effect on code maintenance is the removal of a number of special
cases today and the potential to remove many more special cases as
PIDTYPE_TGID gets used to it's fullest. The long term potential
is allowing zombie thread group leaders to exit, which will remove
a lot more special cases in the code.Signed-off-by: "Eric W. Biederman"
20 Jul, 2018
9 commits
-
…/linux into timers/core
Pull the second set of timekeeping things for 4.19 from John Stultz
* NTP argument clenaups and constification from Ondrej Mosnacek
* Fix to avoid RTC injecting sleeptime when suspend fails from
Mukesh Ojha
* Broading suspsend-timing to include non-stop clocksources that
aren't currently used for timekeeping from Baolin Wang -
On some hardware with multiple clocksources, we have coarse grained
clocksources that support the CLOCK_SOURCE_SUSPEND_NONSTOP flag, but
which are less than ideal for timekeeping whereas other clocksources
can be better candidates but halt on suspend.Currently, the timekeeping core only supports timing suspend using
CLOCK_SOURCE_SUSPEND_NONSTOP clocksources if that clocksource is the
current clocksource for timekeeping.As a result, some architectures try to implement read_persistent_clock64()
using those non-stop clocksources, but isn't really ideal, which will
introduce more duplicate code. To fix this, provide logic to allow a
registered SUSPEND_NONSTOP clocksource, which isn't the current
clocksource, to be used to calculate the suspend time.Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Miroslav Lichvar
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Cc: Daniel Lezcano
Reviewed-by: Thomas Gleixner
Reviewed-by: Daniel Lezcano
Suggested-by: Thomas Gleixner
Signed-off-by: Baolin Wang
[jstultz: minor tweaks to merge with previous resume changes]
Signed-off-by: John Stultz -
Currently, there exists a corner case assuming when there is
only one clocksource e.g RTC, and system failed to go to
suspend mode. While resume rtc_resume() injects the sleeptime
as timekeeping_rtc_skipresume() returned 'false' (default value
of sleeptime_injected) due to which we can see mismatch in
timestamps.This issue can also come in a system where more than one
clocksource are present and very first suspend fails.Success case:
------------
{sleeptime_injected=false}
rtc_suspend() => timekeeping_suspend() => timekeeping_resume() =>(sleeptime injected)
rtc_resume()Failure case:
------------
{failure in sleep path} {sleeptime_injected=false}
rtc_suspend() => rtc_resume(){sleeptime injected again which was not required as the suspend failed}
Fix this by handling the boolean logic properly.
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Miroslav Lichvar
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Originally-by: Thomas Gleixner
Signed-off-by: Mukesh Ojha
Signed-off-by: John Stultz -
Add 'const' to some function arguments and variables to make it easier
to read the code.Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Miroslav Lichvar
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Signed-off-by: Ondrej Mosnacek
[jstultz: Also fixup pre-existing checkpatch warnings for
prototype arguments with no variable name]
Signed-off-by: John Stultz -
sched_clock_postinit() initializes a generic clock on systems where no
other clock is provided. This function may be called only after
timekeeping_init().Rename sched_clock_postinit to generic_clock_inti() and call it from
sched_clock_init(). Move the call for sched_clock_init() until after
time_init().Suggested-by: Peter Zijlstra
Signed-off-by: Pavel Tatashin
Signed-off-by: Thomas Gleixner
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-23-pasha.tatashin@oracle.com -
read_persistent_wall_and_boot_offset() is called during boot to read
both the persistent clock and also return the offset between the boot time
and the value of persistent clock.Change the default boot_offset from zero to local_clock() so architectures,
that do not have a dedicated boot_clock but have early sched_clock(), such
as SPARCv9, x86, and possibly more will benefit from this change by getting
a better and more consistent estimate of the boot time without need for an
arch specific implementation.Signed-off-by: Pavel Tatashin
Signed-off-by: Thomas Gleixner
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-17-pasha.tatashin@oracle.com -
If architecture does not support exact boot time, it is challenging to
estimate boot time without having a reference to the current persistent
clock value. Yet, it cannot read the persistent clock time again, because
this may lead to math discrepancies with the caller of read_boot_clock64()
who have read the persistent clock at a different time.This is why it is better to provide two values simultaneously: the
persistent clock value, and the boot time.Replace read_boot_clock64() with:
read_persistent_wall_and_boot_offset(wall_time, boot_offset)Where wall_time is returned by read_persistent_clock() And boot_offset is
wall_time - boot time, which defaults to 0.Signed-off-by: Pavel Tatashin
Signed-off-by: Thomas Gleixner
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-16-pasha.tatashin@oracle.com -
...instead of kstrtol with a dirty cast.
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Miroslav Lichvar
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Signed-off-by: Ondrej Mosnacek
Signed-off-by: John Stultz -
The 'ts' argument of process_adj_status() and process_adjtimex_modes()
is unused and can be safely removed.Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Miroslav Lichvar
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Signed-off-by: Ondrej Mosnacek
Signed-off-by: John Stultz
19 Jul, 2018
1 commit
-
The call to wake_up_nohz_cpu() is incorrectly indented. Remove the surplus TAB.
Signed-off-by: Yi Wang
Signed-off-by: Thomas Gleixner
Reviewed-by: Jiang Biao
Cc: john.stultz@linaro.org
Cc: sboyd@kernel.org
Cc: zhong.weidong@zte.com.cn
CC: Anna-Maria Gleixner
Link: https://lkml.kernel.org/r/1531721337-30284-1-git-send-email-wang.yi59@zte.com.cn
13 Jul, 2018
2 commits
-
Pull timekeeping updates from John Stultz:
- Make the timekeeping update more precise when NTP frequency is set
directly by updating the multiplier.- Adjust selftests
-
- Join split message for easier grepping,
- Use pr_*() instead of printk*(),
- Use %u to format unsigned cpu numbers.Signed-off-by: Geert Uytterhoeven
Signed-off-by: Thomas Gleixner
Link: https://lkml.kernel.org/r/20180712144118.8819-1-geert+renesas@glider.be
11 Jul, 2018
2 commits
-
This reverts commit 1332a90558013ae4242e3dd7934bdcdeafb06c0d.
The original issue was not because of incorrect checking of cpumask for
both new and old tick device. It was incorrectly analysed was due to the
misunderstanding of the comment and misinterpretation of the return value
from tick_check_preferred. The main issue is with the clockevent driver
that sets the cpumask to cpu_all_mask instead of cpu_possible_mask.Signed-off-by: Sudeep Holla
Signed-off-by: Thomas Gleixner
Tested-by: Kevin Hilman
Tested-by: Martin Blumenstingl
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier
Link: https://lkml.kernel.org/r/1531151136-18297-1-git-send-email-sudeep.holla@arm.com -
When the NTP frequency is set directly from userspace using the
ADJ_FREQUENCY or ADJ_TICK timex mode, immediately update the
timekeeper's multiplier instead of waiting for the next tick.This removes a hidden non-deterministic delay in setting of the
frequency and allows an extremely tight control of the system clock
with update rates close to or even exceeding the kernel HZ.The update is limited to archs using modern timekeeping
(!ARCH_USES_GETTIMEOFFSET).Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Miroslav Lichvar
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Signed-off-by: Miroslav Lichvar
Signed-off-by: John Stultz
02 Jul, 2018
3 commits
-
Air Icy reported:
UBSAN: Undefined behaviour in kernel/time/alarmtimer.c:811:7
signed integer overflow:
1529859276030040771 + 9223372036854775807 cannot be represented in type 'long long int'
Call Trace:
alarm_timer_nsleep+0x44c/0x510 kernel/time/alarmtimer.c:811
__do_sys_clock_nanosleep kernel/time/posix-timers.c:1235 [inline]
__se_sys_clock_nanosleep kernel/time/posix-timers.c:1213 [inline]
__x64_sys_clock_nanosleep+0x326/0x4e0 kernel/time/posix-timers.c:1213
do_syscall_64+0xb8/0x3a0 arch/x86/entry/common.c:290alarm_timer_nsleep() uses ktime_add() to add the current time and the
relative expiry value. ktime_add() has no sanity checks so the addition
can overflow when the relative timeout is large enough.Use ktime_add_safe() which has the necessary sanity checks in place and
limits the result to the valid range.Fixes: 9a7adcf5c6de ("timers: Posix interface for alarm-timers")
Reported-by: Team OWL337
Signed-off-by: Thomas Gleixner
Cc: John Stultz
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1807020926360.1595@nanos.tec.linutronix.de -
The posix timer overrun handling is broken because the forwarding functions
can return a huge number of overruns which does not fit in an int. As a
consequence timer_getoverrun(2) and siginfo::si_overrun can turn into
random number generators.The k_clock::timer_forward() callbacks return a 64 bit value now. Make
k_itimer::ti_overrun[_last] 64bit as well, so the kernel internal
accounting is correct. 3Remove the temporary (int) casts.Add a helper function which clamps the overrun value returned to user space
via timer_getoverrun(2) or siginfo::si_overrun limited to a positive value
between 0 and INT_MAX. INT_MAX is an indicator for user space that the
overrun value has been clamped.Reported-by: Team OWL337
Signed-off-by: Thomas Gleixner
Acked-by: John Stultz
Cc: Peter Zijlstra
Cc: Michael Kerrisk
Link: https://lkml.kernel.org/r/20180626132705.018623573@linutronix.de -
The posix timer ti_overrun handling is broken because the forwarding
functions can return a huge number of overruns which does not fit in an
int. As a consequence timer_getoverrun(2) and siginfo::si_overrun can turn
into random number generators.As a first step to address that let the timer_forward() callbacks return
the full 64 bit value.Cast it to (int) temporarily until k_itimer::ti_overrun is converted to
64bit and the conversion to user space visible values is sanitized.Reported-by: Team OWL337
Signed-off-by: Thomas Gleixner
Acked-by: John Stultz
Cc: Peter Zijlstra
Cc: Michael Kerrisk
Link: https://lkml.kernel.org/r/20180626132704.922098090@linutronix.de
24 Jun, 2018
3 commits
-
timer_set/gettime and timerfd_set/get apis use struct itimerspec at the
user interface layer. struct itimerspec is not y2038-safe. Change these
interfaces to use y2038-safe struct __kernel_itimerspec instead. This will
help define new syscalls when 32bit architectures select CONFIG_64BIT_TIME.Signed-off-by: Deepa Dinamani
Signed-off-by: Thomas Gleixner
Cc: arnd@arndb.de
Cc: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: y2038@lists.linaro.org
Link: https://lkml.kernel.org/r/20180617051144.29756-4-deepa.kernel@gmail.com -
This will aid in enabling the compat syscalls on 32-bit architectures later
on.Also move compat_itimerspec and related defines to compat_time.h. The
compat_time.h file will eventually be deleted.Signed-off-by: Deepa Dinamani
Signed-off-by: Thomas Gleixner
Cc: arnd@arndb.de
Cc: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: y2038@lists.linaro.org
Link: https://lkml.kernel.org/r/20180617051144.29756-3-deepa.kernel@gmail.com -
struct itimerspec is not y2038-safe.
Introduce a new struct __kernel_itimerspec based on the kernel internal
y2038-safe struct itimerspec64.The definition of struct __kernel_itimerspec includes two struct
__kernel_timespec.Since struct __kernel_timespec has the same representation in native and
compat modes, so does struct __kernel_itimerspec. This helps have a common
entry point for syscalls using struct __kernel_itimerspec.New y2038-safe syscalls will use this new type. Since most of the new
syscalls are just an update to the native syscalls with the type update,
place the new definition under CONFIG_64BIT_TIME. This helps architectures
that do not support the above config to keep using the old definition of
struct itimerspec.Also change the get/put_itimerspec64 to use struct__kernel_itimerspec.
This will help 32 bit architectures to use the new syscalls when
architectures select CONFIG_64BIT_TIME.Signed-off-by: Deepa Dinamani
Signed-off-by: Thomas Gleixner
Cc: arnd@arndb.de
Cc: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: y2038@lists.linaro.org
Link: https://lkml.kernel.org/r/20180617051144.29756-2-deepa.kernel@gmail.com
22 Jun, 2018
1 commit
-
For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of
1000, jiffies_to_msecs() never returns zero when passed a non-zero time
period.However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or
1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero
for small non-zero time periods. This may break code that relies on
receiving back a non-zero value.jiffies_to_usecs() does not need such a fix: one jiffy can only be less
than one µs if HZ > 1000000, and such large values of HZ are already
rejected at build time, twice:- include/linux/jiffies.h does #error if HZ >= 12288,
- kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC).Broken since forever.
Signed-off-by: Geert Uytterhoeven
Signed-off-by: Thomas Gleixner
Reviewed-by: Arnd Bergmann
Cc: John Stultz
Cc: Stephen Boyd
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.org