25 Jan, 2021

1 commit


18 Jan, 2021

1 commit

  • The broadcast device is switched to oneshot mode in
    hrtimer_switch_to_hres() -> tick_broadcast_switch_to_oneshot().
    After high resolution timers are enabled, a newly installed
    broadcast device has no chance to switch modes.

    This issue happens in the following situation:
    to build the broadcast clock source driver as a module,
    module_platform_driver() is used instead of TIMER_OF_DECLARE().
    As a result the clock source driver is probed only after high
    resolution timers have already been enabled.

    Change-Id: I5cada6507bf44162b0642bc10efd1548b1b3f68a
    Signed-off-by: Jindong Yue

    Jindong Yue
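
    For illustration, the two registration paths described above look roughly
    as follows. The driver and compatible names are hypothetical; only
    TIMER_OF_DECLARE() and module_platform_driver() are the real kernel macros:

      /* Built-in variant: the init callback runs from early timer init,
       * i.e. before hrtimer_switch_to_hres() switches to oneshot mode. */
      static int __init acme_bc_timer_init(struct device_node *np)
      {
              /* set up and register the broadcast clock_event_device */
              return 0;
      }
      TIMER_OF_DECLARE(acme_bc_timer, "acme,bc-timer", acme_bc_timer_init);

      /* Modular variant: probe runs at module load time, which can be after
       * high resolution timers are already enabled, so the newly registered
       * broadcast device misses the switch to oneshot mode. */
      static int acme_bc_timer_probe(struct platform_device *pdev)
      {
              /* same device setup, now done much later */
              return 0;
      }

      static struct platform_driver acme_bc_timer_driver = {
              .probe  = acme_bc_timer_probe,
              .driver = { .name = "acme-bc-timer" },
      };
      module_platform_driver(acme_bc_timer_driver);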
     

06 Jan, 2021

1 commit

  • [ Upstream commit ba8ea8e7dd6e1662e34e730eadfc52aa6816f9dd ]

    can_stop_idle_tick() checks whether the do_timer() duty has been taken over
    by a CPU on boot. That's silly because the boot CPU always takes over with
    the initial clockevent device.

    But even if no CPU would have installed a clockevent and taken over the
    duty then the question whether the tick on the current CPU can be stopped
    or not is moot. In that case the current CPU would have no clockevent
    either, so there would be nothing to keep ticking.

    Remove it.

    Signed-off-by: Thomas Gleixner
    Acked-by: Frederic Weisbecker
    Link: https://lore.kernel.org/r/20201206212002.725238293@linutronix.de
    Signed-off-by: Sasha Levin

    Thomas Gleixner
     

22 Dec, 2020

1 commit


14 Nov, 2020

1 commit

  • Add a vendor hook to print epoch values when the system enters and
    exits suspend/resume. These epoch values are useful for knowing how
    long the device was in the suspend state. They can be used to
    synchronize timestamps across subsystems and provide a single
    reference timestamp to correlate the various subsystems.

    Bug: 172945021
    Change-Id: I82a01e348d05a46c9c3921869cc9d2fc0fd28867
    Signed-off-by: Murali Nalajala

    Murali Nalajala
     

11 Nov, 2020

1 commit


02 Nov, 2020

1 commit


29 Oct, 2020

1 commit


26 Oct, 2020

5 commits

  • UBSAN reports:

    Undefined behaviour in ./include/linux/time64.h:127:27
    signed integer overflow:
    17179869187 * 1000000000 cannot be represented in type 'long long int'
    Call Trace:
    timespec64_to_ns include/linux/time64.h:127 [inline]
    set_cpu_itimer+0x65c/0x880 kernel/time/itimer.c:180
    do_setitimer+0x8e/0x740 kernel/time/itimer.c:245
    __x64_sys_setitimer+0x14c/0x2c0 kernel/time/itimer.c:336
    do_syscall_64+0xa1/0x540 arch/x86/entry/common.c:295

    Commit bd40a175769d ("y2038: itimer: change implementation to timespec64")
    replaced the original conversion which handled time clamping correctly with
    timespec64_to_ns() which has no overflow protection.

    Fix it in timespec64_to_ns() as this is not necessarily limited to the
    usage in itimers.

    [ tglx: Added comment and adjusted the fixes tag ]

    Fixes: 361a3bf00582 ("time64: Add time64.h header and define struct timespec64")
    Signed-off-by: Zeng Tao
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Arnd Bergmann
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/1598952616-6416-1-git-send-email-prime.zeng@hisilicon.com

    Zeng Tao
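
    The guard ends up in timespec64_to_ns() itself; a minimal sketch of the
    clamping, using the KTIME_MAX/KTIME_SEC_MAX limits already defined for
    ktime_t:

      static inline s64 timespec64_to_ns(const struct timespec64 *ts)
      {
              /* Prevent multiplication overflow before it happens. */
              if ((unsigned long long) ts->tv_sec >= KTIME_SEC_MAX)
                      return KTIME_MAX;

              return ((s64) ts->tv_sec * NSEC_PER_SEC) + ts->tv_nsec;
      }

    With the reported input (tv_sec = 17179869187) the conversion now
    saturates at KTIME_MAX instead of overflowing the 64 bit multiplication.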
     
  • There is no caller in tree, remove it.

    Signed-off-by: YueHaibing
    Signed-off-by: Thomas Gleixner
    Link: https://lore.kernel.org/r/20200909134749.32300-1-yuehaibing@huawei.com

    YueHaibing
     
  • There is no caller in tree, remove it.

    Signed-off-by: YueHaibing
    Signed-off-by: Thomas Gleixner
    Link: https://lore.kernel.org/r/20200909134850.21940-1-yuehaibing@huawei.com

    YueHaibing
     
  • Since sched_clock_read_begin() and sched_clock_read_retry() are called
    by the notrace function sched_clock(), they shouldn't be traceable
    either, or else ftrace_graph_caller will run into an endless loop on
    the path below (arm for instance):

    ftrace_graph_caller()
      prepare_ftrace_return()
        function_graph_enter()
          ftrace_push_return_trace()
            trace_clock_local()
              sched_clock()
                sched_clock_read_begin/retry()

    Fixes: 1b86abc1c645 ("sched_clock: Expose struct clock_read_data")
    Signed-off-by: Quanyang Wang
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20200929082027.16787-1-quanyang.wang@windriver.com

    Quanyang Wang
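
    The fix amounts to marking both helpers notrace so the graph tracer can
    never recurse into them via sched_clock(). A sketch of the annotated
    functions, with bodies shown roughly as in kernel/time/sched_clock.c:

      notrace struct clock_read_data *sched_clock_read_begin(unsigned int *seq)
      {
              *seq = raw_read_seqcount_latch(&cd.seq);
              return cd.read_data + (*seq & 1);
      }

      notrace int sched_clock_read_retry(unsigned int seq)
      {
              return read_seqcount_latch_retry(&cd.seq, seq);
      }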
     
  • …/linux/kernel/git/dlemoal/zonefs") into android-mainline

    Steps on the way to 5.10-rc1

    Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
    Change-Id: I520719ae5e0d992c3756e393cb299d77d650622e

    Greg Kroah-Hartman
     

25 Oct, 2020

2 commits

  • With the removal of the interrupt perturbations in previous random32
    change (random32: make prandom_u32() output unpredictable), the PRNG
    has become 100% deterministic again. While SipHash is expected to be
    way more robust against brute force than the previous Tausworthe LFSR,
    there's still the risk that whoever has even one temporary access to
    the PRNG's internal state is able to predict all subsequent draws till
    the next reseed (roughly every minute). This may happen through a side
    channel attack or any data leak.

    This patch restores the spirit of commit f227e3ec3b5c ("random32: update
    the net random state on interrupt and activity") in that it will perturb
    the internal PRNG's state using externally collected noise, except that
    it will not pick that noise from the random pool's bits nor upon
    interrupt, but will rather combine a few elements along the Tx path
    that are collectively hard to predict, such as dev, skb and txq
    pointers, packet length and jiffies values. These are combined
    using a single round of SipHash into a single long variable that is
    mixed with the net_rand_state upon each invocation.

    The operation was inlined because it produces very small and efficient
    code, typically 3 xor, 2 add and 2 rol. The performance was measured
    to be the same (even very slightly better) than before the switch to
    SipHash; on a 6-core 12-thread Core i7-8700k equipped with a 40G NIC
    (i40e), the connection rate dropped from 556k/s to 555k/s while the
    SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps.

    Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
    Cc: George Spelvin
    Cc: Amit Klein
    Cc: Eric Dumazet
    Cc: "Jason A. Donenfeld"
    Cc: Andy Lutomirski
    Cc: Kees Cook
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: tytso@mit.edu
    Cc: Florian Westphal
    Cc: Marc Plumb
    Tested-by: Sedat Dilek
    Signed-off-by: Willy Tarreau

    Willy Tarreau
     
  • Non-cryptographic PRNGs may have great statistical properties, but
    are usually trivially predictable to someone who knows the algorithm,
    given a small sample of their output. An LFSR like prandom_u32() is
    particularly simple, even if the sample is widely scattered bits.

    It turns out the network stack uses prandom_u32() for some things like
    random port numbers which it would prefer are *not* trivially predictable.
    Predictability led to a practical DNS spoofing attack. Oops.

    This patch replaces the LFSR with a homebrew cryptographic PRNG based
    on the SipHash round function, which is in turn seeded with 128 bits
    of strong random key. (The authors of SipHash have *not* been consulted
    about this abuse of their algorithm.) Speed is prioritized over security;
    attacks are rare, while performance is always wanted.

    Replacing all callers of prandom_u32() is the quick fix.
    Whether to reinstate a weaker PRNG for uses which can tolerate it
    is an open question.

    Commit f227e3ec3b5c ("random32: update the net random state on interrupt
    and activity") was an earlier attempt at a solution. This patch replaces
    it.

    Reported-by: Amit Klein
    Cc: Willy Tarreau
    Cc: Eric Dumazet
    Cc: "Jason A. Donenfeld"
    Cc: Andy Lutomirski
    Cc: Kees Cook
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: tytso@mit.edu
    Cc: Florian Westphal
    Cc: Marc Plumb
    Fixes: f227e3ec3b5c ("random32: update the net random state on interrupt and activity")
    Signed-off-by: George Spelvin
    Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
    [ willy: partial reversal of f227e3ec3b5c; moved SIPROUND definitions
    to prandom.h for later use; merged George's prandom_seed() proposal;
    inlined siprand_u32(); replaced the net_rand_state[] array with 4
    members to fix a build issue; cosmetic cleanups to make checkpatch
    happy; fixed RANDOM32_SELFTEST build ]
    Signed-off-by: Willy Tarreau

    George Spelvin
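
    A rough sketch of the SipHash-round based generator described in the two
    entries above. The names here (sip_state, SIP_ROUND, sip_prng_u32) are
    illustrative; the upstream implementation lives in lib/random32.c and
    include/linux/prandom.h:

      #include <linux/bitops.h>         /* rol64() */
      #include <linux/types.h>

      struct sip_state {
              u64 v0, v1, v2, v3;       /* seeded with 128 bits of strong randomness */
      };

      /* One SipHash round over the four state words. */
      #define SIP_ROUND(v0, v1, v2, v3) do {                                  \
              v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32);     \
              v2 += v3; v3 = rol64(v3, 16); v3 ^= v2;                         \
              v0 += v3; v3 = rol64(v3, 21); v3 ^= v0;                         \
              v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32);     \
      } while (0)

      static u32 sip_prng_u32(struct sip_state *s)
      {
              u64 v0 = s->v0, v1 = s->v1, v2 = s->v2, v3 = s->v3;

              SIP_ROUND(v0, v1, v2, v3);
              SIP_ROUND(v0, v1, v2, v3);
              s->v0 = v0; s->v1 = v1; s->v2 = v2; s->v3 = v3;
              return v1 + v3;
      }

    The Tx-path perturbation from the entry above folds a few hard-to-predict
    values (dev, skb and txq pointers, packet length, jiffies) through a
    single such round into a per-CPU noise word mixed into this state.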
     

24 Oct, 2020

1 commit


21 Oct, 2020

3 commits


19 Oct, 2020

1 commit

  • Pull RCU changes from Ingo Molnar:

    - Debugging for smp_call_function()

    - RT raw/non-raw lock ordering fixes

    - Strict grace periods for KASAN

    - New smp_call_function() torture test

    - Torture-test updates

    - Documentation updates

    - Miscellaneous fixes

    [ This doesn't actually pull the tag - I've dropped the last merge from
    the RCU branch due to questions about the series. - Linus ]

    * tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
    smp: Make symbol 'csd_bug_count' static
    kernel/smp: Provide CSD lock timeout diagnostics
    smp: Add source and destination CPUs to __call_single_data
    rcu: Shrink each possible cpu krcp
    rcu/segcblist: Prevent useless GP start if no CBs to accelerate
    torture: Add gdb support
    rcutorture: Allow pointer leaks to test diagnostic code
    rcutorture: Hoist OOM registry up one level
    refperf: Avoid null pointer dereference when buf fails to allocate
    rcutorture: Properly synchronize with OOM notifier
    rcutorture: Properly set rcu_fwds for OOM handling
    torture: Add kvm.sh --help and update help message
    rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
    torture: Update initrd documentation
    rcutorture: Replace HTTP links with HTTPS ones
    locktorture: Make function torture_percpu_rwsem_init() static
    torture: document --allcpus argument added to the kvm.sh script
    rcutorture: Output number of elapsed grace periods
    rcutorture: Remove KCSAN stubs
    rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
    ...

    Linus Torvalds
     

13 Oct, 2020

2 commits

  • Pull locking updates from Ingo Molnar:
    "These are the locking updates for v5.10:

    - Add deadlock detection for recursive read-locks.

    The rationale is outlined in commit 224ec489d3cd ("lockdep/
    Documention: Recursive read lock detection reasoning")

    The main deadlock pattern we want to detect is:

    TASK A:                     TASK B:

    read_lock(X);
                                write_lock(X);
    read_lock_2(X);

    - Add "latch sequence counters" (seqcount_latch_t):

    A sequence counter variant where the counter even/odd value is used
    to switch between two copies of protected data. This allows the
    read path, typically NMIs, to safely interrupt the write side
    critical section.

    We utilize this new variant for sched-clock, and to make x86 TSC
    handling safer.

    - Other seqlock cleanups, fixes and enhancements

    - KCSAN updates

    - LKMM updates

    - Misc updates, cleanups and fixes"

    * tag 'locking-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
    lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables"
    lockdep: Fix lockdep recursion
    lockdep: Fix usage_traceoverflow
    locking/atomics: Check atomic-arch-fallback.h too
    locking/seqlock: Tweak DEFINE_SEQLOCK() kernel doc
    lockdep: Optimize the memory usage of circular queue
    seqlock: Unbreak lockdep
    seqlock: PREEMPT_RT: Do not starve seqlock_t writers
    seqlock: seqcount_LOCKNAME_t: Introduce PREEMPT_RT support
    seqlock: seqcount_t: Implement all read APIs as statement expressions
    seqlock: Use unique prefix for seqcount_t property accessors
    seqlock: seqcount_LOCKNAME_t: Standardize naming convention
    seqlock: seqcount latch APIs: Only allow seqcount_latch_t
    rbtree_latch: Use seqcount_latch_t
    x86/tsc: Use seqcount_latch_t
    timekeeping: Use seqcount_latch_t
    time/sched_clock: Use seqcount_latch_t
    seqlock: Introduce seqcount_latch_t
    mm/swap: Do not abuse the seqcount_t latching API
    time/sched_clock: Use raw_read_seqcount_latch() during suspend
    ...

    Linus Torvalds
     
  • Pull timekeeping updates from Thomas Gleixner:
    "Updates for timekeeping, timers and related drivers:

    Core:

    - Early boot support for the NMI safe timekeeper by utilizing
    local_clock() up to the point where timekeeping is initialized.
    This allows printk() to store multiple timestamps in the ringbuffer
    which is useful for coordinating dmesg information across a fleet
    of machines.

    - Provide a multi-timestamp accessor for printk()

    - Make timer init more robust by checking for invalid timer flags.

    - Comma vs semicolon fixes

    Drivers:

    - Support for new platforms in existing drivers (SP804 and Renesas
    CMT)

    - Comma vs semicolon fixes

    * tag 'timers-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    clocksource/drivers/armada-370-xp: Use semicolons rather than commas to separate statements
    clocksource/drivers/mps2-timer: Use semicolons rather than commas to separate statements
    timers: Mask invalid flags in do_init_timer()
    clocksource/drivers/sp804: Enable Hisilicon sp804 timer 64bit mode
    clocksource/drivers/sp804: Add support for Hisilicon sp804 timer
    clocksource/drivers/sp804: Support non-standard register offset
    clocksource/drivers/sp804: Prepare for support non-standard register offset
    clocksource/drivers/sp804: Remove a mismatched comment
    clocksource/drivers/sp804: Delete the leading "__" of some functions
    clocksource/drivers/sp804: Remove unused sp804_timer_disable() and timer-sp804.h
    clocksource/drivers/sp804: Cleanup clk_get_sys()
    dt-bindings: timer: renesas,cmt: Document r8a774e1 CMT support
    dt-bindings: timer: renesas,cmt: Document r8a7742 CMT support
    alarmtimer: Convert comma to semicolon
    timekeeping: Provide multi-timestamp accessor to NMI safe timekeeper
    timekeeping: Utilize local_clock() for NMI safe timekeeper during early boot

    Linus Torvalds
     

09 Oct, 2020

2 commits


25 Sep, 2020

2 commits

  • do_init_timer() accepts any combination of timer flags handed in by the
    caller without a sanity check, but only TIMER_DEFERRABLE, TIMER_PINNED and
    TIMER_IRQSAFE are valid.

    If the supplied flags have other bits set, this could result in
    malfunction. If bits are set in TIMER_CPUMASK the first timer usage could
    dereference a cpu base which is outside the range of possible CPUs. If
    TIMER_MIGRATING is set, then switch_timer_base() will live lock.

    Prevent that with a sanity check which warns when invalid flags are
    supplied and masks them out.

    [ tglx: Made it WARN_ON_ONCE() and added context to the changelog ]

    Signed-off-by: Qianli Zhao
    Signed-off-by: Thomas Gleixner
    Link: https://lore.kernel.org/r/9d79a8aa4eb56713af7379f99f062dedabcde140.1597326756.git.zhaoqianli@xiaomi.com

    Qianli Zhao
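
    A sketch of the resulting check: the three valid flags are collected in a
    TIMER_INIT_FLAGS mask and everything else is warned about (once) and
    stripped before the flags are stored:

      #define TIMER_INIT_FLAGS (TIMER_DEFERRABLE | TIMER_PINNED | TIMER_IRQSAFE)

      static void do_init_timer(struct timer_list *timer,
                                void (*func)(struct timer_list *),
                                unsigned int flags,
                                const char *name, struct lock_class_key *key)
      {
              timer->entry.pprev = NULL;
              timer->function = func;
              /* Reject bogus flags instead of corrupting timer->flags. */
              if (WARN_ON_ONCE(flags & ~TIMER_INIT_FLAGS))
                      flags &= TIMER_INIT_FLAGS;
              timer->flags = flags | raw_smp_processor_id();
              lockdep_init_map(&timer->lockdep_map, name, key, 0);
      }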
     
  • This should make it harder for the kernel to corrupt the debug object
    descriptor, which is used to call functions that fix up state and track
    debug objects, by moving the structure to read-only memory.

    Signed-off-by: Stephen Boyd
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Kees Cook
    Link: https://lore.kernel.org/r/20200815004027.2046113-3-swboyd@chromium.org

    Stephen Boyd
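
    In practice this means descriptors such as the timer one become const and
    end up in read-only data; a sketch of the resulting definition in
    kernel/time/timer.c:

      static const struct debug_obj_descr timer_debug_descr = {
              .name                   = "timer_list",
              .debug_hint             = timer_debug_hint,
              .is_static_object       = timer_is_static_object,
              .fixup_init             = timer_fixup_init,
              .fixup_activate         = timer_fixup_activate,
              .fixup_free             = timer_fixup_free,
              .fixup_assert_init      = timer_fixup_assert_init,
      };

    The debug_object_*() helpers accept a const struct debug_obj_descr * as
    part of this series, so callers need no further changes.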
     

10 Sep, 2020

3 commits

  • Latch sequence counters are a multiversion concurrency control mechanism
    where the seqcount_t counter even/odd value is used to switch between
    two data storage copies. This allows the seqcount_t read path to safely
    interrupt its write side critical section (e.g. from NMIs).

    Initially, latch sequence counters were implemented as a single write
    function, raw_write_seqcount_latch(), above plain seqcount_t. The read
    path was expected to use plain seqcount_t raw_read_seqcount().

    A specialized read function was later added, raw_read_seqcount_latch(),
    and became the standardized way for latch read paths. Having unique read
    and write APIs meant that latch sequence counters are basically a data
    type of their own -- just inappropriately overloading plain seqcount_t.
    The seqcount_latch_t data type was thus introduced at seqlock.h.

    Use that new data type instead of seqcount_raw_spinlock_t. This ensures
    that only latch-safe APIs are to be used with the sequence counter.

    Note that the use of seqcount_raw_spinlock_t was not very useful in the
    first place. Only the "raw_" subset of seqcount_t APIs were used at
    timekeeping.c. This subset was created for contexts where lockdep cannot
    be used. seqcount_LOCKTYPE_t's raison d'être -- verifying that the
    seqcount_t writer serialization lock is held -- cannot thus be done.

    References: 0c3351d451ae ("seqlock: Use raw_ prefix instead of _no_lockdep")
    References: 55f3560df975 ("seqlock: Extend seqcount API with associated locks")
    Signed-off-by: Ahmed S. Darwish
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200827114044.11173-6-a.darwish@linutronix.de

    Ahmed S. Darwish
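
    A minimal sketch of the latch pattern with the dedicated type, loosely
    following the example in seqlock.h. The payload structure and wrappers are
    illustrative; the seqcount_latch_t accessors are the real API:

      struct my_payload { u64 a, b; };

      struct latch_data {
              seqcount_latch_t        seq;
              struct my_payload       copies[2];      /* two data versions */
      };

      static void latch_update(struct latch_data *ld, const struct my_payload *val)
      {
              raw_write_seqcount_latch(&ld->seq);     /* readers use copies[1] */
              ld->copies[0] = *val;
              raw_write_seqcount_latch(&ld->seq);     /* readers use copies[0] */
              ld->copies[1] = *val;
      }

      static struct my_payload latch_read(struct latch_data *ld)
      {
              struct my_payload res;
              unsigned int seq;

              do {
                      seq = raw_read_seqcount_latch(&ld->seq);
                      res = ld->copies[seq & 1];
              } while (read_seqcount_latch_retry(&ld->seq, seq));

              return res;
      }

    Because the reader always picks the copy that is not currently being
    written, an NMI arriving in the middle of latch_update() still reads
    consistent data.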
     
  • Latch sequence counters have unique read and write APIs, and thus
    seqcount_latch_t was recently introduced at seqlock.h.

    Use that new data type instead of plain seqcount_t. This adds the
    necessary type-safety and ensures only latching-safe seqcount APIs are
    to be used.

    Signed-off-by: Ahmed S. Darwish
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200827114044.11173-5-a.darwish@linutronix.de

    Ahmed S. Darwish
     
  • sched_clock uses seqcount_t latching to switch between two storage
    places protected by the sequence counter. This allows it to have
    interruptible, NMI-safe, seqcount_t write side critical sections.

    Since 7fc26327b756 ("seqlock: Introduce raw_read_seqcount_latch()"),
    raw_read_seqcount_latch() became the standardized way for seqcount_t
    latch read paths. Due to the dependent load, it has one read memory
    barrier less than the currently used raw_read_seqcount() API.

    Use raw_read_seqcount_latch() for the suspend path.

    Commit aadd6e5caaac ("time/sched_clock: Use raw_read_seqcount_latch()")
    missed changing that instance of raw_read_seqcount().

    References: 1809bfa44e10 ("timers, sched/clock: Avoid deadlock during read from NMI")
    Signed-off-by: Ahmed S. Darwish
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200715092345.GA231464@debian-buster-darwi.lab.linutronix.de

    Ahmed S. Darwish
     

01 Sep, 2020

1 commit


25 Aug, 2020

2 commits

  • Replace a comma between expression statements by a semicolon.

    Signed-off-by: Xu Wang
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Stephen Boyd
    Link: https://lore.kernel.org/r/20200818062651.21680-1-vulab@iscas.ac.cn

    Xu Wang
     
  • Currently, can_stop_idle_tick() prints "NOHZ: local_softirq_pending HH"
    (where "HH" is the hexadecimal softirq vector number) when one or more
    non-RCU softirq handlers are still enabled when checking to stop the
    scheduler-tick interrupt. This message is not as enlightening as one
    might hope, so this commit changes it to "NOHZ tick-stop error: Non-RCU
    local softirq work is pending, handler #HH".

    Reported-by: Andy Lutomirski
    Cc: Frederic Weisbecker
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough [1]. Also, remove fall-through
    markings where they are unnecessary.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
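
    For reference, the conversion pattern (case labels and helpers here are
    made up; fallthrough is the real pseudo-keyword, mapped to
    __attribute__((fallthrough)) on compilers that support it):

      /* Before: annotation by comment */
      switch (mode) {
      case MODE_A:
              setup_a();
              /* fall through */
      case MODE_B:
              setup_b();
              break;
      }

      /* After: explicit pseudo-keyword the compiler can verify */
      switch (mode) {
      case MODE_A:
              setup_a();
              fallthrough;
      case MODE_B:
              setup_b();
              break;
      }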
     

23 Aug, 2020

2 commits

  • printk wants to store various timestamps (MONOTONIC, REALTIME, BOOTTIME) to
    make correlation of dmesg from several systems easier.

    Provide an interface to retrieve all three timestamps in one go.

    There are some caveats:

    1) Boot time and late sleep time injection

    Boot time is a racy access on 32bit systems if the sleep time injection
    happens late during resume and not in timekeeping_resume(). That could be
    avoided by expanding struct tk_read_base with boot offset for 32bit and
    adding more overhead to the update. As this is a hard-to-observe, once-per-
    resume event which can be filtered with reasonable effort using the
    accurate mono/real timestamps, it's probably not worth the trouble.

    Aside from that, it might be possible on 32 and 64 bit to observe the
    following when the sleep time injection happens late:

    CPU 0                               CPU 1
    timekeeping_resume()
                                        ktime_get_fast_timestamps()
                                          mono, real = __ktime_get_real_fast()
      inject_sleep_time()
        update boot offset
                                          boot = mono + bootoffset;

    That means that boot time already has the sleep time adjustment, but
    real time does not. On the next readout both are in sync again.

    Preventing this for 64bit is not really feasible without destroying the
    careful cache layout of the timekeeper because the sequence count and
    struct tk_read_base would then need two cache lines instead of one.

    2) Suspend/resume timestamps

    Access to the timekeeper clock source is disabled across the innermost
    steps of suspend/resume. The accessors still work, but the timestamps
    are frozen until time keeping is resumed which happens very early.

    For regular suspend/resume there is no observable difference vs. sched
    clock, but it might affect some of the nasty low level debug printks.

    OTOH, access to sched clock is not guaranteed across suspend/resume on
    all systems either so it depends on the hardware in use.

    If that turns out to be a real problem then this could be mitigated by
    using sched clock in a similar way as during early boot. But it's not as
    trivial as on early boot because it needs some careful protection
    against the clock monotonic timestamp jumping backwards on resume.

    Signed-off-by: Thomas Gleixner
    Tested-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200814115512.159981360@linutronix.de

    Thomas Gleixner
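
    A sketch of a consumer grabbing all three timestamps in one call. The
    struct and field names follow the interface added here (mono, boot and
    real as u64 nanosecond values); the calling function is illustrative:

      #include <linux/timekeeping.h>

      static void snapshot_timestamps(void)
      {
              struct ktime_timestamps snap;

              /* NMI safe: fills monotonic, boot and real time in one go. */
              ktime_get_fast_timestamps(&snap);

              pr_debug("mono=%llu boot=%llu real=%llu\n",
                       snap.mono, snap.boot, snap.real);
      }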
     
  • During early boot the NMI safe timekeeper returns 0 until the first
    clocksource becomes available.

    This prevents it from being used for printk or other facilities which today
    use sched clock. sched clock can be available way before timekeeping is
    initialized.

    The obvious workaround for this is to utilize the early sched clock in the
    default dummy clock read function until a clocksource becomes available.

    After switching to the clocksource, clock MONOTONIC and BOOTTIME will not
    jump because timekeeping_init() bases clock MONOTONIC on sched clock
    and the offset between clock MONOTONIC and BOOTTIME is zero during boot.

    Clock REALTIME cannot provide useful timestamps during early boot up to
    the point where a persistent clock becomes available, which is either in
    timekeeping_init() or later when the RTC driver which might depend on I2C
    or other subsystems is initialized.

    There is a minor difference to sched_clock() vs. suspend/resume. As the
    timekeeper clock source might not be accessible during suspend, after
    timekeeping_suspend() timestamps freeze up to the point where
    timekeeping_resume() is invoked. OTOH this is true for some sched clock
    implementations as well.

    Signed-off-by: Thomas Gleixner
    Tested-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200814115512.041422402@linutronix.de

    Thomas Gleixner
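
    The workaround boils down to a sched clock based dummy read function that
    the fast timekeeper uses until the first real clocksource is installed,
    roughly as in kernel/time/timekeeping.c:

      static u64 cycles_at_suspend;

      static u64 dummy_clock_read(struct clocksource *cs)
      {
              if (timekeeping_suspended)
                      return cycles_at_suspend;
              return local_clock();
      }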
     

17 Aug, 2020

1 commit


15 Aug, 2020

2 commits

  • Pull timekeeping updates from Thomas Gleixner:
    "A set of timekeeping/VDSO updates:

    - Preparatory work to allow S390 to switch over to the generic VDSO
    implementation.

    S390 requires that the VDSO data pointer is handed in to the
    counter read function when time namespace support is enabled.
    Adding the pointer is a NOOP for all other architectures because
    the compiler is supposed to optimize that out when it is unused in
    the architecture specific inline. The change also solved a similar
    problem for MIPS which fortunately has time namespaces not yet
    enabled.

    S390 needs to update clock related VDSO data independent of the
    timekeeping updates. This was solved so far with yet another
    sequence counter in the S390 implementation. A better solution is
    to utilize the already existing VDSO sequence count for this. The
    core code now exposes helper functions which allow to serialize
    against the timekeeper code and against concurrent readers.

    S390 needs extra data for their clock readout function. The initial
    common VDSO data structure did not provide a way to add that. It
    now has an architecture specific struct embedded which
    defaults to an empty struct.

    Doing this now avoids tree dependencies and conflicts post rc1 and
    allows all other architectures which work on generic VDSO support
    to work from a common upstream base.

    - A trivial comment fix"

    * tag 'timers-urgent-2020-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    time: Delete repeated words in comments
    lib/vdso: Allow to add architecture-specific vdso data
    timekeeping/vsyscall: Provide vdso_update_begin/end()
    vdso/treewide: Add vdso_data pointer argument to __arch_get_hw_counter()

    Linus Torvalds
     
  • Pull more timer updates from Thomas Gleixner:
    "A set of posix CPU timer changes which allows to defer the heavy work
    of posix CPU timers into task work context. The tick interrupt is
    reduced to a quick check which queues the work which is doing the
    heavy lifting before returning to user space or going back to guest
    mode. Moving this out is deferring the signal delivery slightly but
    posix CPU timers are inaccurate by nature as they depend on the tick
    so there is no real damage. The relevant test cases all passed.

    This lifts the last offender for RT out of the hard interrupt context
    tick handler, but it also has the general benefit that the actual
    heavy work is accounted to the task/process and not to the tick
    interrupt itself.

    Further optimizations are possible to break long sighand lock hold and
    interrupt disabled (on !RT kernels) times when a massive amount of
    posix CPU timers (which are unprivileged) is armed for a
    task/process.

    This is currently only enabled for x86 because the architecture has to
    ensure that task work is handled in KVM before entering a guest, which
    was just established for x86 with the new common entry/exit code which
    got merged post 5.8 and is not the case for other KVM architectures"

    * tag 'timers-core-2020-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: Select POSIX_CPU_TIMERS_TASK_WORK
    posix-cpu-timers: Provide mechanisms to defer timer handling to task_work
    posix-cpu-timers: Split run_posix_cpu_timers()

    Linus Torvalds
     

12 Aug, 2020

1 commit

  • - Add EXPORT_SYMBOL_GPL for nsec_to_clock_t() so that drivers
    can be built as loadable modules.

    - This API is required by a loadable driver module from Samsung to
    fetch process uptime based on CPU clock ticks, i.e. the exact time
    during which an app is scheduled in user mode.

    Signed-off-by: Abhilasha Rao
    Bug: 158067689
    Change-Id: I45be5fd7873dc7c21aa583313499f48f8b10bb1b

    Abhilasha Rao
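
    The change itself is an export added below the existing helper in
    kernel/time/time.c (body shown simplified to the common case where
    NSEC_PER_SEC is a multiple of USER_HZ):

      u64 nsec_to_clock_t(u64 x)
      {
              /* existing nanoseconds -> clock_t conversion */
              return div_u64(x, NSEC_PER_SEC / USER_HZ);
      }
      /* New: allow (GPL) modules to link against the symbol. */
      EXPORT_SYMBOL_GPL(nsec_to_clock_t);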
     

11 Aug, 2020

1 commit

  • Pull locking updates from Thomas Gleixner:
    "A set of locking fixes and updates:

    - Untangle the header spaghetti which causes build failures in
    various situations caused by the lockdep additions to seqcount to
    validate that the write side critical sections are non-preemptible.

    - The seqcount associated lock debug addons which were blocked by the
    above fallout.

    seqcount writers contrary to seqlock writers must be externally
    serialized, which usually happens via locking - except for strict
    per CPU seqcounts. As the lock is not part of the seqcount, lockdep
    cannot validate that the lock is held.

    This new debug mechanism adds the concept of associated locks.
    The sequence count now has lock type variants and corresponding
    initializers which take a pointer to the associated lock used for
    writer serialization. If lockdep is enabled the pointer is stored
    and write_seqcount_begin() has a lockdep assertion to validate that
    the lock is held.

    Aside of the type and the initializer no other code changes are
    required at the seqcount usage sites. The rest of the seqcount API
    is unchanged and determines the type at compile time with the help
    of _Generic which is possible now that the minimal GCC version has
    been moved up.

    Adding this lockdep coverage unearthed a handful of seqcount bugs
    which have been addressed already independent of this.

    While generally useful this comes with a Trojan Horse twist: On RT
    kernels the write side critical section can become preemptible if
    the writers are serialized by an associated lock, which leads to
    the well known reader preempts writer livelock. RT prevents this by
    storing the associated lock pointer independent of lockdep in the
    seqcount and changing the reader side to block on the lock when a
    reader detects that a writer is in the write side critical section.

    - Conversion of seqcount usage sites to associated types and
    initializers"

    * tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
    locking/seqlock, headers: Untangle the spaghetti monster
    locking, arch/ia64: Reduce header dependencies by moving XTP bits into the new header
    x86/headers: Remove APIC headers from
    seqcount: More consistent seqprop names
    seqcount: Compress SEQCNT_LOCKNAME_ZERO()
    seqlock: Fold seqcount_LOCKNAME_init() definition
    seqlock: Fold seqcount_LOCKNAME_t definition
    seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
    hrtimer: Use sequence counter with associated raw spinlock
    kvm/eventfd: Use sequence counter with associated spinlock
    userfaultfd: Use sequence counter with associated spinlock
    NFSv4: Use sequence counter with associated spinlock
    iocost: Use sequence counter with associated spinlock
    raid5: Use sequence counter with associated spinlock
    vfs: Use sequence counter with associated spinlock
    timekeeping: Use sequence counter with associated raw spinlock
    xfrm: policy: Use sequence counters with associated lock
    netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
    netfilter: conntrack: Use sequence counter with associated spinlock
    sched: tasks: Use sequence counter with associated spinlock
    ...

    Linus Torvalds
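
    A sketch of what such a conversion looks like at a usage site like the
    timekeeper core (type and initializer names per the new seqlock API; the
    surrounding structure is abbreviated):

      static DEFINE_RAW_SPINLOCK(timekeeper_lock);

      static struct {
              seqcount_raw_spinlock_t seq;
              struct timekeeper       timekeeper;
      } tk_core ____cacheline_aligned = {
              .seq = SEQCNT_RAW_SPINLOCK_ZERO(tk_core.seq, &timekeeper_lock),
      };

      static void timekeeping_update_sketch(void)
      {
              unsigned long flags;

              raw_spin_lock_irqsave(&timekeeper_lock, flags);
              /* With lockdep, this now asserts that timekeeper_lock is held. */
              write_seqcount_begin(&tk_core.seq);
              /* ... update tk_core.timekeeper ... */
              write_seqcount_end(&tk_core.seq);
              raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
      }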