25 Jul, 2020

2 commits

  • Forbid splitting the VVAR VMA, resulting in a stricter ABI and reducing
    the number of corner cases to consider while working further on VDSO
    time namespace support.

    As the offset from the timens page to the VVAR page is computed at
    compile time, the pages in VVAR should stay together and must not be
    partially mremap()'ed.

    Signed-off-by: Andrei Vagin
    Reviewed-by: Vincenzo Frascino
    Reviewed-by: Dmitry Safonov
    Link: https://lore.kernel.org/r/20200624083321.144975-6-avagin@gmail.com
    Signed-off-by: Catalin Marinas

    Andrei Vagin
     
  • If a task belongs to a time namespace then the VVAR page which contains
    the system wide VDSO data is replaced with a namespace specific page
    which has the same layout as the VVAR page.

    Signed-off-by: Andrei Vagin
    Reviewed-by: Vincenzo Frascino
    Reviewed-by: Dmitry Safonov
    Link: https://lore.kernel.org/r/20200624083321.144975-5-avagin@gmail.com
    Signed-off-by: Catalin Marinas

    Andrei Vagin
     

24 Jul, 2020

3 commits

  • Allocate the time namespace page among VVAR pages. Provide
    __arch_get_timens_vdso_data() helper for VDSO code to get the
    code-relative position of VVARs on that special page.

    If a task belongs to a time namespace then the VVAR page which contains
    the system wide VDSO data is replaced with a namespace specific page
    which has the same layout as the VVAR page. That page has vdso_data->seq
    set to 1 to enforce the slow path and vdso_data->clock_mode set to
    VCLOCK_TIMENS to enforce the time namespace handling path.

    The extra check for the case that vdso_data->seq is odd, i.e. a
    concurrent update of the VDSO data is in progress, does not really
    affect regular tasks that are not part of a time namespace: such a task
    simply spin-waits for the update to finish and for vdso_data->seq to
    become even again.

    If a time namespace task hits that code path, it invokes the corresponding
    time getter function which retrieves the real VVAR page, reads host time
    and then adds the offset for the requested clock which is stored in the
    special VVAR page.

    The time-namespace page isn't allocated on !CONFIG_TIME_NAMESPACE, but
    the VMA has the same size, which simplifies CRIU vDSO migration between
    different kernel configs.

    Signed-off-by: Andrei Vagin
    Reviewed-by: Vincenzo Frascino
    Reviewed-by: Dmitry Safonov
    Cc: Mark Rutland
    Link: https://lore.kernel.org/r/20200624083321.144975-4-avagin@gmail.com
    Signed-off-by: Catalin Marinas

    Andrei Vagin
     
  • The order of vvar pages depends on whether a task belongs to the root
    time namespace or not. In the root time namespace, a task doesn't have a
    per-namespace page. In a non-root namespace, the VVAR page which contains
    the system-wide VDSO data is replaced with a namespace specific page
    that contains clock offsets.

    Whenever a task changes its namespace, the VVAR page tables are cleared
    and then they will be re-faulted with a corresponding layout.

    A task can switch its time namespace only if its ->mm isn't shared with
    another task.

    Signed-off-by: Andrei Vagin
    Reviewed-by: Vincenzo Frascino
    Reviewed-by: Dmitry Safonov
    Reviewed-by: Christian Brauner
    Link: https://lore.kernel.org/r/20200624083321.144975-3-avagin@gmail.com
    Signed-off-by: Catalin Marinas

    Andrei Vagin
     
  • Currently the vdso has no awareness of time namespaces, which may
    apply distinct offsets to processes in different namespaces. To handle
    this within the vdso, we'll need to expose a per-namespace data page.

    As a preparatory step, this patch separates the vdso data page from
    the code pages, and has it faulted in via its own fault callback.
    Subsequent patches will extend this to support distinct pages per time
    namespace.

    The vvar vma has to be installed with the VM_PFNMAP flag to handle
    faults via its vma fault callback.

    Signed-off-by: Andrei Vagin
    Reviewed-by: Vincenzo Frascino
    Reviewed-by: Dmitry Safonov
    Link: https://lore.kernel.org/r/20200624083321.144975-2-avagin@gmail.com
    Signed-off-by: Catalin Marinas

    Andrei Vagin
     

23 Jun, 2020

1 commit

  • In preparation for removing the signal trampoline from the compat vDSO,
    allow the sigpage and the compat vDSO to co-exist.

    For the moment the vDSO signal trampoline will still be used when built.
    Subsequent patches will move to the sigpage consistently.

    Acked-by: Dave Martin
    Reviewed-by: Vincenzo Frascino
    Reviewed-by: Ard Biesheuvel
    Reviewed-by: Mark Rutland
    Signed-off-by: Will Deacon

    Will Deacon
     

10 Jun, 2020

1 commit

  • This change converts the existing mmap_sem rwsem calls to use the new mmap
    locking API instead.

    The change is generated using coccinelle with the following rule:

    // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

    @@
    expression mm;
    @@
    (
    -init_rwsem
    +mmap_init_lock
    |
    -down_write
    +mmap_write_lock
    |
    -down_write_killable
    +mmap_write_lock_killable
    |
    -down_write_trylock
    +mmap_write_trylock
    |
    -up_write
    +mmap_write_unlock
    |
    -downgrade_write
    +mmap_write_downgrade
    |
    -down_read
    +mmap_read_lock
    |
    -down_read_killable
    +mmap_read_lock_killable
    |
    -down_read_trylock
    +mmap_read_trylock
    |
    -up_read
    +mmap_read_unlock
    )
    -(&mm->mmap_sem)
    +(mm)

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Daniel Jordan
    Reviewed-by: Laurent Dufour
    Reviewed-by: Vlastimil Babka
    Cc: Davidlohr Bueso
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

29 May, 2020

1 commit

  • Support for Branch Target Identification (BTI) in user and kernel
    (Mark Brown and others)
    * for-next/bti: (39 commits)
    arm64: vdso: Fix CFI directives in sigreturn trampoline
    arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
    arm64: bti: Fix support for userspace only BTI
    arm64: kconfig: Update and comment GCC version check for kernel BTI
    arm64: vdso: Map the vDSO text with guarded pages when built for BTI
    arm64: vdso: Force the vDSO to be linked as BTI when built for BTI
    arm64: vdso: Annotate for BTI
    arm64: asm: Provide a mechanism for generating ELF note for BTI
    arm64: bti: Provide Kconfig for kernel mode BTI
    arm64: mm: Mark executable text as guarded pages
    arm64: bpf: Annotate JITed code for BTI
    arm64: Set GP bit in kernel page tables to enable BTI for the kernel
    arm64: asm: Override SYM_FUNC_START when building the kernel with BTI
    arm64: bti: Support building kernel C code using BTI
    arm64: Document why we enable PAC support for leaf functions
    arm64: insn: Report PAC and BTI instructions as skippable
    arm64: insn: Don't assume unrecognized HINTs are skippable
    arm64: insn: Provide a better name for aarch64_insn_is_nop()
    arm64: insn: Add constants for new HINT instruction decode
    arm64: Disable old style assembly annotations
    ...

    Will Deacon
     

08 May, 2020

1 commit

  • The kernel is responsible for mapping the vDSO into userspace processes,
    including mapping the text section as executable. Handle the mapping of
    the vDSO for BTI similarly, mapping the text section as guarded pages so
    the BTI annotations in the vDSO become effective when they are present.

    This will mean that we can have BTI active for the vDSO in processes that
    do not otherwise support BTI. This should not be an issue for any expected
    use of the vDSO.

    Signed-off-by: Mark Brown
    Reviewed-by: Catalin Marinas
    Link: https://lore.kernel.org/r/20200506195138.22086-12-broonie@kernel.org
    Signed-off-by: Will Deacon

    Mark Brown
     

29 Apr, 2020

4 commits

  • The current code doesn't use a consistent naming scheme for structures,
    enums, or variables, making it harder than necessary to determine the
    relationship between these.

    Let's make this easier by consistently using 'map' nomenclature for
    mappings created in userspace, minimizing redundant comments, and
    using designated array initializers to tie indices to their respective
    elements.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Cc: Catalin Marinas
    Cc: Vincenzo Frascino
    Cc: Will Deacon
    Link: https://lore.kernel.org/r/20200428164921.41641-5-mark.rutland@arm.com
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • The current code doesn't use a consistent naming scheme for structures,
    enums, or variables, making it harder than necessary to determine the
    relationship between these.

    Let's make this easier by consistently using 'vdso_abi' nomenclature.
    The 'vdso_lookup' array is renamed to 'vdso_info' to describe what it
    contains rather than how it is consumed.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Cc: Catalin Marinas
    Cc: Vincenzo Frascino
    Cc: Will Deacon
    Link: https://lore.kernel.org/r/20200428164921.41641-4-mark.rutland@arm.com
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • Currently we have some ifdeffery to determine the number of elements in
    enum arch_vdso_type as VDSO_TYPES, rather than the usual pattern of
    having the enum define this:

    | enum foo_type {
    | FOO_TYPE_A,
    | FOO_TYPE_B,
    | #ifdef CONFIG_C
    | FOO_TYPE_C,
    | #endif
    | NR_FOO_TYPES
    | }

    ... however, given we only use this number to size the vdso_lookup[]
    array, this is redundant anyway as the compiler can automatically size
    the array to fit all defined elements.

    So let's remove the VDSO_TYPES to simplify the code.

    At the same time, let's use designated initializers for the array
    elements so that these are guaranteed to be at the expected indices,
    regardless of how we modify the structure. For clarity, the redundant
    explicit initialization of the enum elements is dropped.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Cc: Catalin Marinas
    Cc: Vincenzo Frascino
    Cc: Will Deacon
    Link: https://lore.kernel.org/r/20200428164921.41641-3-mark.rutland@arm.com
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • The aarch32_vdso_pages[] array is unnecessarily confusing. We only ever
    use the C_VECTORS and C_SIGPAGE slots, and the other slots are unused
    despite having corresponding mappings (sharing pages with the AArch64
    vDSO).

    Let's make this clearer by using separate variables for the vectors page
    and the sigreturn page. A subsequent patch will clean up the C_* naming
    and conflation of pages with mappings.

    Note that since both the vectors page and the sigpage are single pages,
    and each mapping is a single page long, their pages arrays do not need
    to be NULL-terminated (and this was not the case with the existing code
    for the sigpage, as it was the last entry in the aarch32_vdso_pages
    array).

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Cc: Catalin Marinas
    Cc: Vincenzo Frascino
    Cc: Will Deacon
    Link: https://lore.kernel.org/r/20200428164921.41641-2-mark.rutland@arm.com
    Signed-off-by: Will Deacon

    Mark Rutland
     

15 Apr, 2020

1 commit

  • The aarch32_vdso_pages[] array never has entries allocated in the C_VVAR
    or C_VDSO slots, and as the array is zero initialized these contain
    NULL.

    However in __aarch32_alloc_vdso_pages() when
    aarch32_alloc_kuser_vdso_page() fails we attempt to free the page whose
    struct page is at NULL, which is obviously nonsensical.

    This patch removes the erroneous page freeing.

    Fixes: 7c1deeeb0130 ("arm64: compat: VDSO setup for compat layer")
    Cc: # 5.3.x-
    Cc: Vincenzo Frascino
    Acked-by: Will Deacon
    Signed-off-by: Mark Rutland
    Signed-off-by: Catalin Marinas

    Mark Rutland
     

23 Jun, 2019

3 commits

  • If CONFIG_GENERIC_COMPAT_VDSO is enabled, the compat vDSO is installed
    in a compat (32-bit) process instead of the sigpage.

    Add the necessary code to setup the vDSO required pages.

    Signed-off-by: Vincenzo Frascino
    Signed-off-by: Thomas Gleixner
    Tested-by: Shijith Thotton
    Tested-by: Andre Przywara
    Cc: linux-arch@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-mips@vger.kernel.org
    Cc: linux-kselftest@vger.kernel.org
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Arnd Bergmann
    Cc: Russell King
    Cc: Ralf Baechle
    Cc: Paul Burton
    Cc: Daniel Lezcano
    Cc: Mark Salyzyn
    Cc: Peter Collingbourne
    Cc: Shuah Khan
    Cc: Dmitry Safonov
    Cc: Rasmus Villemoes
    Cc: Huw Davies
    Link: https://lkml.kernel.org/r/20190621095252.32307-13-vincenzo.frascino@arm.com

    Vincenzo Frascino
     
  • Most of the code for initializing the vDSOs in arm64 and compat will be
    shared, hence refactoring of the current code is required to avoid
    duplication and to simplify maintainability.

    No functional change.

    Signed-off-by: Vincenzo Frascino
    Signed-off-by: Thomas Gleixner
    Tested-by: Shijith Thotton
    Tested-by: Andre Przywara
    Cc: linux-arch@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-mips@vger.kernel.org
    Cc: linux-kselftest@vger.kernel.org
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Arnd Bergmann
    Cc: Russell King
    Cc: Ralf Baechle
    Cc: Paul Burton
    Cc: Daniel Lezcano
    Cc: Mark Salyzyn
    Cc: Peter Collingbourne
    Cc: Shuah Khan
    Cc: Dmitry Safonov
    Cc: Rasmus Villemoes
    Cc: Huw Davies
    Link: https://lkml.kernel.org/r/20190621095252.32307-12-vincenzo.frascino@arm.com

    Vincenzo Frascino
     
  • To take advantage of the commonly defined vDSO interface for
    gettimeofday(), the architecture code requires an adaptation.

    Re-implement the gettimeofday VDSO in C in order to use lib/vdso.

    With the new implementation arm64 gains support for CLOCK_BOOTTIME
    and CLOCK_TAI.

    [ tglx: Reformatted the function line breaks ]

    Signed-off-by: Vincenzo Frascino
    Signed-off-by: Thomas Gleixner
    Tested-by: Shijith Thotton
    Tested-by: Andre Przywara
    Cc: linux-arch@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-mips@vger.kernel.org
    Cc: linux-kselftest@vger.kernel.org
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Arnd Bergmann
    Cc: Russell King
    Cc: Ralf Baechle
    Cc: Paul Burton
    Cc: Daniel Lezcano
    Cc: Mark Salyzyn
    Cc: Peter Collingbourne
    Cc: Shuah Khan
    Cc: Dmitry Safonov
    Cc: Rasmus Villemoes
    Cc: Huw Davies
    Link: https://lkml.kernel.org/r/20190621095252.32307-5-vincenzo.frascino@arm.com

    Vincenzo Frascino
     

19 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation this program is
    distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program if not see http www gnu org
    licenses

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 503 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Alexios Zavras
    Reviewed-by: Allison Randal
    Reviewed-by: Enrico Weigelt
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

24 Apr, 2019

3 commits

  • When kuser helpers are enabled, the kernel maps the related code at a
    fixed address (0xffff0000). Making the option to disable them
    configurable means that the kernel can remove this mapping, so that any
    access to this memory area results in a segmentation fault.

    Add a KUSER_HELPERS config option that can be used to disable the
    mapping when it is turned off.

    This option can be turned off if and only if the applications are
    designed specifically for the platform and they do not make use of the
    kuser helpers code.

    Cc: Catalin Marinas
    Cc: Will Deacon
    Signed-off-by: Vincenzo Frascino
    Reviewed-by: Catalin Marinas
    [will: Use IS_ENABLED() instead of #ifdef]
    Signed-off-by: Will Deacon

    Vincenzo Frascino
     
  • aarch32_alloc_vdso_pages() needs to be refactored to make it easier to
    disable the kuser helpers.

    Divide the function into aarch32_alloc_kuser_vdso_page() and
    aarch32_alloc_sigreturn_vdso_page().

    Cc: Catalin Marinas
    Cc: Will Deacon
    Signed-off-by: Vincenzo Frascino
    Reviewed-by: Catalin Marinas
    [will: Inlined sigpage allocation to simplify error paths]
    Signed-off-by: Will Deacon

    Vincenzo Frascino
     
  • For AArch32 tasks, we install a special "[vectors]" page that contains
    the sigreturn trampolines and kuser helpers, which is mapped at a fixed
    address specified by the kuser helpers ABI.

    Having the sigreturn trampolines in the same page as the kuser helpers
    makes it impossible to disable the kuser helpers independently.

    Follow the Arm implementation, by moving the signal trampolines out of
    the "[vectors]" page and into their own "[sigpage]".

    Cc: Catalin Marinas
    Cc: Will Deacon
    Signed-off-by: Vincenzo Frascino
    Reviewed-by: Catalin Marinas
    [will: tweaked comments and fixed sparse warning]
    Signed-off-by: Will Deacon

    Vincenzo Frascino
     

17 Apr, 2019

1 commit

  • clock_getres() in the vDSO library has to preserve the same behaviour
    as posix_get_hrtimer_res().

    In particular, posix_get_hrtimer_res() does:

    sec = 0;
    ns = hrtimer_resolution;

    where 'hrtimer_resolution' depends on whether or not high resolution
    timers are enabled, which is a runtime decision.

    The vDSO incorrectly returns the constant CLOCK_REALTIME_RES. Fix this
    by exposing 'hrtimer_resolution' in the vDSO datapage and returning that
    instead.

    Reviewed-by: Catalin Marinas
    Signed-off-by: Vincenzo Frascino
    [will: Use WRITE_ONCE(), move adr off COARSE path, renumber labels, use 'w' reg]
    Signed-off-by: Will Deacon

    Vincenzo Frascino
     

03 Apr, 2019

1 commit

  • Since commit ad67b74d2469d9b8 ("printk: hash addresses printed with %p"),
    two obfuscated kernel pointers are printed at every boot:

    vdso: 2 pages (1 code @ (____ptrval____), 1 data @ (____ptrval____))

    Remove the print completely, as it's useless without the addresses.

    Fixes: ad67b74d2469d9b8 ("printk: hash addresses printed with %p")
    Acked-by: Mark Rutland
    Signed-off-by: Matteo Croce
    Signed-off-by: Will Deacon

    Matteo Croce
     

09 Aug, 2017

1 commit

  • vDSO VMA address is saved in mm_context for the purpose of using
    restorer from vDSO page to return to userspace after signal handling.

    In the Checkpoint/Restore in Userspace (CRIU) project we place the vDSO
    VMA on restore back at the address where it was at dump time.
    With the exception of x86 (where there is an API to map the vDSO with
    arch_prctl()), we move the vDSO inherited from the CRIU task to the
    restored position with mremap().

    CRIU does support the arm64 architecture, but the kernel doesn't update
    the context.vdso pointer after mremap(), which results in a translation
    fault after signal handling in the restored application:
    https://github.com/xemul/criu/issues/288

    Make vDSO code track the VMA address by supplying .mremap() fops
    the same way it's done for x86 and arm32 by:
    commit b059a453b1cf ("x86/vdso: Add mremap hook to vm_special_mapping")
    commit 280e87e98c09 ("ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO").

    Cc: Russell King
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: Cyrill Gorcunov
    Cc: Pavel Emelyanov
    Cc: Christopher Covington
    Reviewed-by: Will Deacon
    Signed-off-by: Dmitry Safonov
    Signed-off-by: Catalin Marinas

    Dmitry Safonov
     

06 Jul, 2017

1 commit

  • Pull arm64 updates from Will Deacon:

    - RAS reporting via GHES/APEI (ACPI)

    - Indirect ftrace trampolines for modules

    - Improvements to kernel fault reporting

    - Page poisoning

    - Sigframe cleanups and preparation for SVE context

    - Core dump fixes

    - Sparse fixes (mainly relating to endianness)

    - xgene SoC PMU v3 driver

    - Misc cleanups and non-critical fixes

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (75 commits)
    arm64: fix endianness annotation for 'struct jit_ctx' and friends
    arm64: cpuinfo: constify attribute_group structures.
    arm64: ptrace: Fix incorrect get_user() use in compat_vfp_set()
    arm64: ptrace: Remove redundant overrun check from compat_vfp_set()
    arm64: ptrace: Avoid setting compat FP[SC]R to garbage if get_user fails
    arm64: fix endianness annotation for __apply_alternatives()/get_alt_insn()
    arm64: fix endianness annotation in get_kaslr_seed()
    arm64: add missing conversion to __wsum in ip_fast_csum()
    arm64: fix endianness annotation in acpi_parking_protocol.c
    arm64: use readq() instead of readl() to read 64bit entry_point
    arm64: fix endianness annotation for reloc_insn_movw() & reloc_insn_imm()
    arm64: fix endianness annotation for aarch64_insn_write()
    arm64: fix endianness annotation in aarch64_insn_read()
    arm64: fix endianness annotation in call_undef_hook()
    arm64: fix endianness annotation for debug-monitors.c
    ras: mark stub functions as 'inline'
    arm64: pass endianness info to sparse
    arm64: ftrace: fix !CONFIG_ARM64_MODULE_PLTS kernels
    arm64: signal: Allow expansion of the signal frame
    acpi: apei: check for pending errors when probing GHES entries
    ...

    Linus Torvalds
     

21 Jun, 2017

1 commit

  • Now that we have fixed the sub-ns handling for CLOCK_MONOTONIC_RAW,
    remove the duplicative tk->raw_time.tv_nsec, which can be stored in
    tk->tkr_raw.xtime_nsec (similarly to how it's handled for monotonic
    time).

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Miroslav Lichvar
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Cc: Stephen Boyd
    Cc: Kevin Brodsky
    Cc: Will Deacon
    Cc: Daniel Mentz
    Tested-by: Daniel Mentz
    Signed-off-by: John Stultz

    John Stultz
     

20 Jun, 2017

1 commit

  • Recently vDSO support for CLOCK_MONOTONIC_RAW was added in
    49eea433b326 ("arm64: Add support for CLOCK_MONOTONIC_RAW in
    clock_gettime() vDSO"). Noticing that the core timekeeping code
    never set tkr_raw.xtime_nsec, the vDSO implementation didn't
    bother exposing it via the data page and instead took the
    unshifted tk->raw_time.tv_nsec value which was then immediately
    shifted left in the vDSO code.

    Unfortunately, by accelerating the MONOTONIC_RAW clockid, it
    uncovered potential 1ns time inconsistencies caused by the
    timekeeping core not handling sub-ns resolution.

    Now that the core code has been fixed and is actually setting
    tkr_raw.xtime_nsec, we need to take that into account in the
    vDSO by adding it to the shifted raw_time value, in order to
    fix the user-visible inconsistency. Rather than do that at each
    use (and expand the data page in the process), instead perform
    the shift/addition operation when populating the data page and
    remove the shift from the vDSO code entirely.

    [jstultz: minor whitespace tweak, tried to improve commit
    message to make it more clear this fixes a regression]
    Reported-by: John Stultz
    Signed-off-by: Will Deacon
    Signed-off-by: John Stultz
    Tested-by: Daniel Mentz
    Acked-by: Kevin Brodsky
    Cc: Prarit Bhargava
    Cc: Richard Cochran
    Cc: Stephen Boyd
    Cc: # 4.8+
    Cc: Miroslav Lichvar
    Link: http://lkml.kernel.org/r/1496965462-20003-4-git-send-email-john.stultz@linaro.org
    Signed-off-by: Thomas Gleixner

    Will Deacon
     

07 Jun, 2017

1 commit

  • Adjust vdso_{start|end} to be char arrays to avoid compile-time analysis
    that flags "too large" memcmp() calls with CONFIG_FORTIFY_SOURCE.

    Cc: Jisheng Zhang
    Acked-by: Catalin Marinas
    Suggested-by: Mark Rutland
    Signed-off-by: Kees Cook
    Signed-off-by: Will Deacon

    Kees Cook
     

12 Jan, 2017

1 commit

  • __pa_symbol is technically the macro that should be used for kernel
    symbols. Switch to this as a prerequisite for DEBUG_VIRTUAL, which
    will do bounds checking.

    Reviewed-by: Mark Rutland
    Tested-by: Mark Rutland
    Signed-off-by: Laura Abbott
    Signed-off-by: Will Deacon

    Laura Abbott
     


22 Aug, 2016

3 commits

  • These objects are set during initialization and are read-only
    thereafter.

    Previously I only wanted to mark vdso_pages, vdso_spec, vectors_page and
    cpu_ops as __read_mostly from a performance point of view. Then,
    inspired by Kees's patch[1] applying more __ro_after_init for arm, I
    think it's better to mark them as __ro_after_init. What's more, I found
    some more objects that are also read-only after init, so apply
    __ro_after_init to all of them.

    This patch also removes global vdso_pagelist and tries to clean up
    vdso_spec[] assignment code.

    [1] http://www.spinics.net/lists/arm-kernel/msg523188.html

    Acked-by: Mark Rutland
    Signed-off-by: Jisheng Zhang
    Signed-off-by: Will Deacon

    Jisheng Zhang
     
  • The vm_special_mapping spec which is used for aarch32 vectors page is
    never modified, so mark it as const.

    Acked-by: Mark Rutland
    Signed-off-by: Jisheng Zhang
    Signed-off-by: Will Deacon

    Jisheng Zhang
     
  • It is not needed after booting, so this patch moves the
    alloc_vectors_page function to the __init section.

    Acked-by: Mark Rutland
    Signed-off-by: Jisheng Zhang
    Signed-off-by: Will Deacon

    Jisheng Zhang
     

12 Jul, 2016

1 commit

  • So far the arm64 clock_gettime() vDSO implementation only supported
    the following clocks, falling back to the syscall for the others:
    - CLOCK_REALTIME{,_COARSE}
    - CLOCK_MONOTONIC{,_COARSE}

    This patch adds support for the CLOCK_MONOTONIC_RAW clock, taking
    advantage of the recent refactoring of the vDSO time functions. Like
    the non-_COARSE clocks, this only works when the "arch_sys_counter"
    clocksource is in use (allowing us to read the current time from the
    virtual counter register), otherwise we also have to fall back to the
    syscall.

    Most of the data is shared with CLOCK_MONOTONIC, and the algorithm is
    similar. The reference implementation in kernel/time/timekeeping.c
    shows that:
    - CLOCK_MONOTONIC = tk->wall_to_monotonic + tk->xtime_sec +
    timekeeping_get_ns(&tk->tkr_mono)
    - CLOCK_MONOTONIC_RAW = tk->raw_time + timekeeping_get_ns(&tk->tkr_raw)
    - tkr_mono and tkr_raw are identical (in particular, same
    clocksource), except these members:
    * mult (only mono's multiplier is NTP-adjusted)
    * xtime_nsec (always 0 for raw)

    Therefore, tk->raw_time and tkr_raw->mult are now also stored in the
    vDSO data page.

    Cc: Ali Saidi
    Signed-off-by: Kevin Brodsky
    Reviewed-by: Dave Martin
    Acked-by: Will Deacon
    Signed-off-by: Catalin Marinas

    Kevin Brodsky
     

24 May, 2016

1 commit

  • Most architectures rely on mmap_sem for write in their
    arch_setup_additional_pages(). If the waiting task gets killed by the
    OOM killer, it would block the oom_reaper from asynchronous address
    space reclaim and reduce the chances of timely OOM resolution. Wait
    for the lock in killable mode and return with -EINTR if the task got
    killed while waiting.

    Signed-off-by: Michal Hocko
    Acked-by: Andy Lutomirski [x86 vdso]
    Acked-by: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     


10 Aug, 2015

1 commit

  • Since 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the
    real timekeeper last") it has become possible on arm64 to:

    - Obtain a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE timestamp
    via syscall.
    - Subsequently obtain a timestamp for the same clock ID via VDSO which
    predates the first timestamp (by one jiffy).

    This is because arm64's update_vsyscall is deriving the coarse time
    using the __current_kernel_time interface, when it should really be
    using the timekeeper object provided to it by the timekeeping core.
    It happened to work before only because __current_kernel_time would
    access the same timekeeper object which had been passed to
    update_vsyscall. This is no longer the case.

    Signed-off-by: Nathan Lynch
    Acked-by: Will Deacon
    Signed-off-by: Catalin Marinas

    Nathan Lynch
     

27 Mar, 2015

1 commit

  • In preparation of adding another tkr field, rename this one to
    tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base,
    since the structure is not specific to CLOCK_MONOTONIC and the mono
    name got added to the tk_read_base instance.

    Lots of trivial churn.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: John Stultz
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra