17 Oct, 2020

1 commit

  • Fix multiple occurrences of duplicated words in kernel/.

    Fix one typo/spello on the same line as a duplicate word. Change one
    instance of "the the" to "that the". Otherwise just drop one of the
    repeated words.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/98202fa6-8919-ef63-9efe-c0fad5ca7af1@infradead.org
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

29 Aug, 2019

1 commit

  • On architectures that discard .exit.* sections at runtime, a
    warning is printed for each jump label that is used within an
    in-kernel __exit annotated function:

    can't patch jump_label at ehci_hcd_cleanup+0x8/0x3c
    WARNING: CPU: 0 PID: 1 at kernel/jump_label.c:410 __jump_label_update+0x12c/0x138

    As these functions will never be executed (they are freed along with the
    rest of initmem), we do not need to patch them and should not display any
    warnings.

    The warning is displayed because the test required to satisfy
    jump_entry_is_init is based on init_section_contains (__init_begin to
    __init_end), whereas the test in __jump_label_update is based on
    init_kernel_text (_sinittext to _einittext) via kernel_text_address().
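
    A minimal sketch of the idea behind the fix (the helper name is
    illustrative, not the upstream diff): treat an entry as non-patchable when
    its code falls in either range, so discarded __exit text no longer
    triggers the warning.

    static bool entry_in_discarded_text(const struct jump_entry *entry)
    {
        unsigned long addr = jump_entry_code(entry);

        /* __init_begin..__init_end also covers discarded .exit text */
        if (init_section_contains((void *)addr, 1))
            return true;

        /* _sinittext.._einittext is the narrower range used previously */
        return init_kernel_text(addr);
    }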

    Fixes: 19483677684b ("jump_label: Annotate entries that operate on __init code earlier")
    Signed-off-by: Andrew Murray
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Mark Rutland
    Signed-off-by: Will Deacon

    Andrew Murray
     

17 Jun, 2019

3 commits

  • If the architecture supports the batching of jump label updates, use it!

    An easy way to see the benefits of this patch is switching the
    schedstats on and off. For instance:

    -------------------------- %< ----------------------------
    #!/bin/sh
    while [ true ]; do
        sysctl -w kernel.sched_schedstats=1
        sleep 2
        sysctl -w kernel.sched_schedstats=0
        sleep 2
    done
    -------------------------- >% ----------------------------

    while watching the IPI count:

    -------------------------- %< ----------------------------
    # watch -n1 "cat /proc/interrupts | grep Function"
    -------------------------- >% ----------------------------

    With the current mode, it is possible to see around 168 IPIs every 2
    seconds, while with this patch the number of IPIs drops to 3 every 2
    seconds.

    Regarding the performance impact of this patch set, I made two measurements:

    - the time to update a key (the task that is causing the change);
    - the time to run the int3 handler (the side effect on a thread that
      hits the code being changed).

    The schedstats static key was chosen as the key to be switched on and off.
    The reason is that it is used in more than 56 places, in a hot path. The
    change in the schedstats static key will be done with the following command:

    while [ true ]; do
        sysctl -w kernel.sched_schedstats=1
        usleep 500000
        sysctl -w kernel.sched_schedstats=0
        usleep 500000
    done

    In this way, the key will be updated twice per second. To force hits in the
    int3 handler, the system will also run a kernel compilation with two jobs
    per CPU. The test machine is a two-node/24-CPU box with an Intel Xeon
    processor @2.27GHz.

    Regarding the update part: on average, the regular kernel takes 57 ms to
    update the schedstats key, while the kernel with batched updates takes just
    1.4 ms on average. Although this seems too good to be true, it makes sense:
    the schedstats key is used in 56 places, so it was expected to take around
    56 times longer to update the keys with the current implementation, as the
    IPIs are the most expensive part of the update.

    Regarding the int3 handler: the non-batch handler takes 45 ns on average,
    while the batch version takes around 180 ns. At first glance that seems
    like a high value, but it is not, considering that it is handling 56
    updates rather than one; it takes only about four times longer. This gain
    is possible because the patch uses a binary search over the vector:
    log2(56) = 5.8, so an overhead within four times was expected.

    (In infomercial voice) But that is not all! As the int3 handler stays armed
    for a shorter period (because the update part runs for a shorter time), the
    number of hits in the int3 handler decreased by 10%.

    The question then is: is it worth paying the price of "135 ns" more in the
    int3 handler?

    Considering that, in this test case, we save the handling of 53 IPIs, each
    of which costs more than those 135 ns, it seems a meager price to pay.
    Moreover, the test case was forcing hits in the int3 handler; in practice
    they do not happen that often, while the IPI is delivered to all CPUs
    whether they hit the int3 handler or not.

    For instance, on an isolated CPU with a process running in user space (the
    nohz_full use case), the chances of hitting the int3 handler are
    practically zero, while there is no way to avoid the IPIs. By bounding the
    IPIs, we improve this scenario considerably.
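
    For reference, a hedged sketch of what a batched update loop looks like,
    based on the queue/apply interfaces this series introduces and the
    jump_label_can_update()/jump_label_type() helpers from kernel/jump_label.c
    (illustrative, not the literal upstream diff):

    static void __jump_label_update_batch(struct static_key *key,
                                          struct jump_entry *entry,
                                          struct jump_entry *stop,
                                          bool init)
    {
        for (; entry < stop && jump_entry_key(entry) == key; entry++) {
            if (!jump_label_can_update(entry, init))
                continue;
            /* Queue the site; if the vector is full, flush (one IPI) and retry */
            if (!arch_jump_label_transform_queue(entry, jump_label_type(entry))) {
                arch_jump_label_transform_apply();
                arch_jump_label_transform_queue(entry, jump_label_type(entry));
            }
        }
        /* One sync IPI for the whole key instead of one per entry */
        arch_jump_label_transform_apply();
    }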

    Signed-off-by: Daniel Bristot de Oliveira
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Chris von Recklinghausen
    Cc: Clark Williams
    Cc: Greg Kroah-Hartman
    Cc: H. Peter Anvin
    Cc: Jason Baron
    Cc: Jiri Kosina
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Marcelo Tosatti
    Cc: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Scott Wood
    Cc: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Link: https://lkml.kernel.org/r/acc891dbc2dbc9fd616dd680529a2337b1d1274c.1560325897.git.bristot@redhat.com
    Signed-off-by: Ingo Molnar

    Daniel Bristot de Oliveira
     
  • In batching mode, all the entries of a given key are updated at once.
    During the update of a key, a hit in the int3 handler checks whether the
    address that hit the breakpoint belongs to one of these keys.

    To optimize the search for a given address in the vector of entries being
    updated, a binary search is used. The binary search relies on the entries
    of a key being ordered by code address, so sort the entries of a given key
    by their code address.
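
    A hedged sketch of a comparator that gives the required ordering (entries
    sorted by key, and within a key by code address), roughly as done in
    kernel/jump_label.c:

    static int jump_label_cmp(const void *a, const void *b)
    {
        const struct jump_entry *jea = a;
        const struct jump_entry *jeb = b;

        if (jump_entry_key(jea) < jump_entry_key(jeb))
            return -1;
        if (jump_entry_key(jea) > jump_entry_key(jeb))
            return 1;

        if (jump_entry_code(jea) < jump_entry_code(jeb))
            return -1;
        if (jump_entry_code(jea) > jump_entry_code(jeb))
            return 1;

        return 0;
    }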

    Signed-off-by: Daniel Bristot de Oliveira
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Chris von Recklinghausen
    Cc: Clark Williams
    Cc: Greg Kroah-Hartman
    Cc: H. Peter Anvin
    Cc: Jason Baron
    Cc: Jiri Kosina
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Marcelo Tosatti
    Cc: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Scott Wood
    Cc: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Link: https://lkml.kernel.org/r/f57ae83e0592418ba269866bb7ade570fc8632e0.1560325897.git.bristot@redhat.com
    Signed-off-by: Ingo Molnar

    Daniel Bristot de Oliveira
     
  • Move the check if a jump_entry is valid to a function. No functional
    change.
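
    A hedged sketch of what such a helper looks like (illustrative; see
    kernel/jump_label.c for the real thing):

    static bool jump_label_can_update(struct jump_entry *entry, bool init)
    {
        /* Cannot, and need not, patch entries living in discarded init text. */
        if (!init && jump_entry_is_init(entry))
            return false;

        if (!kernel_text_address(jump_entry_code(entry))) {
            WARN_ONCE(1, "can't patch jump_label at %pS",
                      (void *)jump_entry_code(entry));
            return false;
        }

        return true;
    }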

    Signed-off-by: Daniel Bristot de Oliveira
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Chris von Recklinghausen
    Cc: Clark Williams
    Cc: Greg Kroah-Hartman
    Cc: H. Peter Anvin
    Cc: Jason Baron
    Cc: Jiri Kosina
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Marcelo Tosatti
    Cc: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Scott Wood
    Cc: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Link: https://lkml.kernel.org/r/56b69bd3f8e644ed64f2dbde7c088030b8cbe76b.1560325897.git.bristot@redhat.com
    Signed-off-by: Ingo Molnar

    Daniel Bristot de Oliveira
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only
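
    For example, an affected C source file now carries the identifier as its
    first line:

    // SPDX-License-Identifier: GPL-2.0-only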

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

29 Apr, 2019

3 commits

  • Changing jump_label state is protected by jump_label_lock(). A
    rate-limited static_key_slow_dec(), however, will never directly call
    jump_label_update(); it will schedule delayed work instead. Therefore it's
    unnecessary to take both cpus_read_lock() and jump_label_lock().

    This allows static_key_slow_dec_deferred() to be called from atomic
    contexts, like socket destruction in net/tls, without the need for another
    indirection.
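
    A hedged sketch of the resulting deferred decrement path (names follow the
    series; illustrative only): no cpus_read_lock()/jump_label_lock() is taken
    here, the delayed work does the slow part later.

    void __static_key_slow_dec_deferred(struct static_key *key,
                                        struct delayed_work *work,
                                        unsigned long timeout)
    {
        STATIC_KEY_CHECK_USE(key);

        if (static_key_slow_try_dec(key))
            return;                 /* count was > 1: nothing to patch */

        schedule_delayed_work(work, timeout);
    }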

    Signed-off-by: Jakub Kicinski
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Simon Horman
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: alexei.starovoitov@gmail.com
    Cc: ard.biesheuvel@linaro.org
    Cc: oss-drivers@netronome.com
    Cc: yamada.masahiro@socionext.com
    Link: https://lkml.kernel.org/r/20190330000854.30142-4-jakub.kicinski@netronome.com
    Signed-off-by: Ingo Molnar

    Jakub Kicinski
     
  • static_key_slow_dec() checks if the atomic enable count is larger than 1,
    and if so, decrements it right there, before taking the jump_label_lock.
    Move this logic into a helper for reuse in rate-limited keys.

    Signed-off-by: Jakub Kicinski
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Simon Horman
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: alexei.starovoitov@gmail.com
    Cc: ard.biesheuvel@linaro.org
    Cc: oss-drivers@netronome.com
    Cc: yamada.masahiro@socionext.com
    Link: https://lkml.kernel.org/r/20190330000854.30142-3-jakub.kicinski@netronome.com
    Signed-off-by: Ingo Molnar

    Jakub Kicinski
     
  • Add deferred static branches. We unfortunately can't use the nice trick of
    encapsulating the entire structure in true/false variants, because the
    inner member has to be either struct static_key_true or
    struct static_key_false. Use defines to pass the appropriate members to
    the helpers separately.
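
    A hedged usage sketch (the key and function names are illustrative,
    following the interface this patch introduces):

    /* A false-by-default key whose disable is rate limited to once per HZ */
    DEFINE_STATIC_KEY_DEFERRED_FALSE(my_feature_key, HZ);

    static void my_feature_enable(void)
    {
        static_branch_deferred_inc(&my_feature_key);
    }

    static void my_feature_disable(void)
    {
        static_branch_slow_dec_deferred(&my_feature_key);
    }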

    Signed-off-by: Jakub Kicinski
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Simon Horman
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: alexei.starovoitov@gmail.com
    Cc: ard.biesheuvel@linaro.org
    Cc: oss-drivers@netronome.com
    Cc: yamada.masahiro@socionext.com
    Link: https://lkml.kernel.org/r/20190330000854.30142-2-jakub.kicinski@netronome.com
    Signed-off-by: Ingo Molnar

    Jakub Kicinski
     

03 Apr, 2019

1 commit

  • Even though the atomic_dec_and_mutex_lock() in
    __static_key_slow_dec_cpuslocked() can never see a negative value in
    key->enabled, the subsequent sanity check re-reads key->enabled, which may
    have been set to -1 in the meantime by static_key_slow_inc_cpuslocked():

    CPU A                                    CPU B

    __static_key_slow_dec_cpuslocked():      static_key_slow_inc_cpuslocked():
                                             # enabled = 1
    atomic_dec_and_mutex_lock()
    # enabled = 0
                                             atomic_read() == 0
                                             atomic_set(-1)
                                             # enabled = -1
    val = atomic_read()
    # Oops - val == -1!

    The test case is TCP's clean_acked_data_enable() / clean_acked_data_disable()
    as tickled by KTLS (net/ktls).

    Suggested-by: Jakub Kicinski
    Reported-by: Jakub Kicinski
    Tested-by: Jakub Kicinski
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: ard.biesheuvel@linaro.org
    Cc: oss-drivers@netronome.com
    Cc: pbonzini@redhat.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

06 Jan, 2019

1 commit

  • Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".

    The jump label is controlled by HAVE_JUMP_LABEL, which is defined
    like this:

    #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
    # define HAVE_JUMP_LABEL
    #endif

    We can improve this by testing 'asm goto' support in Kconfig, and then
    making JUMP_LABEL depend on CC_HAS_ASM_GOTO.

    The ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
    match the real kernel capability.
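
    Sketch of the difference for code that conditionally compiles the jump
    label implementation (illustrative, assuming the quoted definitions above):

    /* Before: an extra internal symbol derived from two conditions */
    #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
    # define HAVE_JUMP_LABEL
    #endif

    #ifdef HAVE_JUMP_LABEL
        /* jump label implementation */
    #endif

    /* After: Kconfig only sets CONFIG_JUMP_LABEL when the compiler supports
     * 'asm goto', so the config symbol alone is enough. */
    #ifdef CONFIG_JUMP_LABEL
        /* jump label implementation */
    #endif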

    Signed-off-by: Masahiro Yamada
    Acked-by: Michael Ellerman (powerpc)
    Tested-by: Sedat Dilek

    Masahiro Yamada
     

02 Oct, 2018

1 commit

  • Commit 19483677684b ("jump_label: Annotate entries that operate on
    __init code earlier") refactored the code that manages runtime
    patching of jump labels in modules that are tied to static keys
    defined in other modules or in the core kernel.

    In the latter case, we may iterate over the static_key_mod linked
    list until we hit the entry for the core kernel, whose 'mod' field
    will be NULL, and attempt to dereference it to get at its 'state'
    member.

    So let's add a non-NULL check: this forces the 'init' argument of
    __jump_label_update() to false for static keys that are defined in
    the core kernel, which is appropriate given that __init annotated
    jump_label entries in the core kernel should no longer be active
    at this point (i.e., when loading modules).
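
    A hedged sketch of the resulting check in __jump_label_mod_update()
    (illustrative, not the literal diff):

    struct module *m = mod->mod;        /* NULL for the core kernel entry    */
    struct jump_entry *stop;

    if (!m)
        stop = __stop___jump_table;     /* end of the core kernel jump table */
    else
        stop = m->jump_entries + m->num_jump_entries;

    /* 'init' is forced to false when m is NULL: core-kernel __init entries
     * are already disabled by the time modules are loaded. */
    __jump_label_update(key, mod->entries, stop,
                        m && m->state == MODULE_STATE_COMING);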

    Fixes: 19483677684b ("jump_label: Annotate entries that operate on __init code earlier")
    Reported-by: Dan Carpenter
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Kees Cook
    Cc: Jessica Yu
    Cc: Peter Zijlstra
    Link: https://lkml.kernel.org/r/20181001081324.11553-1-ard.biesheuvel@linaro.org

    Ard Biesheuvel
     

27 Sep, 2018

3 commits

  • Jump table entries are mostly read-only, with the exception of the
    init and module loader code that defuses entries that point into init
    code when the code being referred to is freed.

    For robustness, it would be better to move these entries into the
    ro_after_init section, but clearing the 'code' member of each jump
    table entry referring to init code at module load time races with the
    module_enable_ro() call that remaps the ro_after_init section read
    only, so we'd like to do it earlier.

    So given that whether such an entry refers to init code can be decided
    much earlier, we can pull this check forward. Since we may still need
    the code entry at this point, let's switch to setting a low bit in the
    'key' member just like we do to annotate the default state of a jump
    table entry.
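
    A hedged sketch of the annotation helpers (illustrative; bit 0 already
    encodes the branch default, so a second low bit of the 'key' member is
    used for "init"):

    static inline bool jump_entry_is_init(const struct jump_entry *entry)
    {
        return (unsigned long)entry->key & 2UL;
    }

    static inline void jump_entry_set_init(struct jump_entry *entry)
    {
        entry->key |= 2UL;
    }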

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Kees Cook
    Acked-by: Peter Zijlstra (Intel)
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Will Deacon
    Cc: Catalin Marinas
    Cc: Steven Rostedt
    Cc: Martin Schwidefsky
    Cc: Jessica Yu
    Link: https://lkml.kernel.org/r/20180919065144.25010-8-ard.biesheuvel@linaro.org

    Ard Biesheuvel
     
  • To reduce the size taken up by absolute references in jump label
    entries themselves and the associated relocation records in the
    .init segment, add support for emitting them as relative references
    instead.

    Note that this requires some extra care in the sorting routine, given
    that the offsets change when entries are moved around in the jump_entry
    table.
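
    A hedged sketch of the relative layout and one of its accessors
    (illustrative; each offset is relative to the address of its own field):

    struct jump_entry {
        s32 code;       /* offset from &entry->code to the branch site   */
        s32 target;     /* offset from &entry->target to the jump target */
        long key;       /* offset to the key, low bits used as flags     */
    };

    static inline unsigned long jump_entry_code(const struct jump_entry *entry)
    {
        return (unsigned long)&entry->code + entry->code;
    }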

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Kees Cook
    Cc: Will Deacon
    Cc: Catalin Marinas
    Cc: Steven Rostedt
    Cc: Martin Schwidefsky
    Cc: Jessica Yu
    Link: https://lkml.kernel.org/r/20180919065144.25010-3-ard.biesheuvel@linaro.org

    Ard Biesheuvel
     
  • In preparation of allowing architectures to use relative references
    in jump_label entries [which can dramatically reduce the memory
    footprint], introduce abstractions for references to the 'code' and
    'key' members of struct jump_entry.
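
    A hedged sketch of such an accessor for the 'key' member (illustrative;
    the low bits of the field carry flags, so they are masked off):

    static inline struct static_key *jump_entry_key(const struct jump_entry *entry)
    {
        return (struct static_key *)((unsigned long)entry->key & ~3UL);
    }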

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Kees Cook
    Cc: Will Deacon
    Cc: Catalin Marinas
    Cc: Steven Rostedt
    Cc: Martin Schwidefsky
    Cc: Jessica Yu
    Link: https://lkml.kernel.org/r/20180919065144.25010-2-ard.biesheuvel@linaro.org

    Ard Biesheuvel
     

20 Mar, 2018

1 commit

  • With the following commit:

    333522447063 ("jump_label: Explicitly disable jump labels in __init code")

    ... we explicitly disabled jump labels in __init code, so they could be
    detected and not warned about in the following commit:

    dc1dd184c2f0 ("jump_label: Warn on failed jump_label patching attempt")

    In-kernel __exit code has the same issue. It's never used, so it's
    freed along with the rest of initmem. But jump label entries in __exit
    code aren't explicitly disabled, so we get the following warning when
    enabling pr_debug() in __exit code:

    can't patch jump_label at dmi_sysfs_exit+0x0/0x2d
    WARNING: CPU: 0 PID: 22572 at kernel/jump_label.c:376 __jump_label_update+0x9d/0xb0

    Fix the warning by disabling all jump labels in initmem (which includes
    both __init and __exit code).

    Reported-and-tested-by: Li Wang
    Signed-off-by: Josh Poimboeuf
    Cc: Borislav Petkov
    Cc: Jason Baron
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: dc1dd184c2f0 ("jump_label: Warn on failed jump_label patching attempt")
    Link: http://lkml.kernel.org/r/7121e6e595374f06616c505b6e690e275c0054d1.1521483452.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

14 Mar, 2018

1 commit

  • The kbuild test robot reported the following warning on sparc64:

    kernel/jump_label.c: In function '__jump_label_update':
    kernel/jump_label.c:376:51: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    WARN_ONCE(1, "can't patch jump_label at %pS", (void *)entry->code);

    On sparc64, the jump_label entry->code field is of type u32, but
    pointers are 64-bit. Silence the warning by casting entry->code to an
    unsigned long before casting it to a pointer. This is also what the
    sparc jump label code does.

    Fixes: dc1dd184c2f0 ("jump_label: Warn on failed jump_label patching attempt")
    Reported-by: kbuild test robot
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Jason Baron
    Cc: Borislav Petkov
    Cc: "David S . Miller"
    Link: https://lkml.kernel.org/r/c966fed42be6611254a62d46579ec7416548d572.1521041026.git.jpoimboe@redhat.com

    Josh Poimboeuf
     

21 Feb, 2018

3 commits

  • Convert init_kernel_text() to a global function and use it in a few
    places instead of manually comparing _sinittext and _einittext.

    Note that kallsyms.h has a very similar function called
    is_kernel_inittext(), but its end check is inclusive. I'm not sure
    whether that's intentional behavior, so I didn't touch it.
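
    For reference, the helper boils down to a range check over the init text
    boundaries (sketch matching the description above):

    int init_kernel_text(unsigned long addr)
    {
        if (addr >= (unsigned long)_sinittext &&
            addr < (unsigned long)_einittext)
            return 1;

        return 0;
    }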

    Suggested-by: Jason Baron
    Signed-off-by: Josh Poimboeuf
    Acked-by: Peter Zijlstra
    Acked-by: Steven Rostedt (VMware)
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/4335d02be8d45ca7d265d2f174251d0b7ee6c5fd.1519051220.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     
  • Currently when the jump label code encounters an address which isn't
    recognized by kernel_text_address(), it just silently fails.

    This can be dangerous because jump labels are used in a variety of
    places, and are generally expected to work. Convert the silent failure
    to a warning.

    This won't warn about attempted writes to tracepoints in __init code
    after initmem has been freed, as those are already guarded by the
    entry->code check.

    Signed-off-by: Josh Poimboeuf
    Acked-by: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Jason Baron
    Cc: Linus Torvalds
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/de3a271c93807adb7ed48f4e946b4f9156617680.1519051220.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     
  • After initmem has been freed, any jump labels in __init code are
    prevented from being written to by the kernel_text_address() check in
    __jump_label_update(). However, this check is quite broad. If
    kernel_text_address() were to return false for any other reason, the
    jump label write would fail silently with no warning.

    For jump labels in module init code, entry->code is set to zero to
    indicate that the entry is disabled. Do the same thing for core kernel
    init code. This makes the behavior more consistent, and will also make
    it more straightforward to detect non-init jump label write failures in
    the next patch.
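
    A hedged sketch of the core-kernel counterpart (illustrative; the real
    function walks the built-in jump table once, before initmem is freed):

    void __init jump_label_invalidate_init(void)
    {
        struct jump_entry *entry = __start___jump_table;
        struct jump_entry *stop = __stop___jump_table;

        for (; entry < stop; entry++) {
            if (init_kernel_text(entry->code))
                entry->code = 0;        /* mark the entry as disabled */
        }
    }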

    Signed-off-by: Josh Poimboeuf
    Acked-by: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Jason Baron
    Cc: Linus Torvalds
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/c52825c73f3a174e8398b6898284ec20d4deb126.1519051220.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

24 Jan, 2018

1 commit

  • Tejun reported the following cpu-hotplug lock (percpu-rwsem) read recursion:

    tg_set_cfs_bandwidth()
      get_online_cpus()
        cpus_read_lock()

      cfs_bandwidth_usage_inc()
        static_key_slow_inc()
          cpus_read_lock()

    Reported-by: Tejun Heo
    Tested-by: Tejun Heo
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20180122215328.GP3397@worktop
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

14 Nov, 2017

1 commit

  • Fengguang Wu reported that running the rcuperf test during boot can cause
    the jump_label_test() to hit a WARN_ON(). The issue is that the core jump
    label code relies on kernel_text_address() to detect when it can no longer
    update branches that may be contained in __init sections. The
    kernel_text_address() in turn assumes that if the system_state variable is
    greater than or equal to SYSTEM_RUNNING then __init sections are no longer
    valid (since the assumption is that they have been freed). However, when
    rcuperf is set up to run in early boot it can call kernel_power_off(),
    which sets the system_state to SYSTEM_POWER_OFF.

    Since rcuperf initialization is invoked via a module_init(), we can make
    the dependency of jump_label_test() needing to complete before rcuperf
    explicit by calling it via early_initcall().

    Reported-by: Fengguang Wu
    Signed-off-by: Jason Baron
    Acked-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1510609727-2238-1-git-send-email-jbaron@akamai.com
    Signed-off-by: Ingo Molnar

    Jason Baron
     

19 Oct, 2017

1 commit

  • Right now it says:

    static_key_disable_cpuslocked used before call to jump_label_init
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:161 static_key_disable_cpuslocked+0x68/0x70
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.0-rc5+ #1
    Hardware name: SGI.COM C2112-4GP3/X10DRT-P-Series, BIOS 2.0a 05/09/2016
    task: ffffffff81c0e480 task.stack: ffffffff81c00000
    RIP: 0010:static_key_disable_cpuslocked+0x68/0x70
    RSP: 0000:ffffffff81c03ef0 EFLAGS: 00010096 ORIG_RAX: 0000000000000000
    RAX: 0000000000000041 RBX: ffffffff81c32680 RCX: ffffffff81c5cbf8
    RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000002
    RBP: ffff88807fffd240 R08: 726f666562206465 R09: 0000000000000136
    R10: 0000000000000000 R11: 696e695f6c656261 R12: ffffffff82158900
    R13: ffffffff8215f760 R14: 0000000000000001 R15: 0000000000000008
    FS: 0000000000000000(0000) GS:ffff883f7f400000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffff88807ffff000 CR3: 0000000001c09000 CR4: 00000000000606b0
    Call Trace:
    static_key_disable+0x16/0x20
    start_kernel+0x15a/0x45d
    ? load_ucode_intel_bsp+0x11/0x2d
    secondary_startup_64+0xa5/0xb0
    Code: 48 c7 c7 a0 15 cf 81 e9 47 53 4b 00 48 89 df e8 5f fc ff ff eb e8 48 c7 c6 \
    c0 97 83 81 48 c7 c7 d0 ff a2 81 31 c0 e8 c5 9d f5 ff ff eb a7 0f ff eb \
    b0 e8 eb a2 4b 00 53 48 89 fb e8 42 0e f0

    but it doesn't tell me which key it is. So dump the key's name too:

    static_key_disable_cpuslocked(): static key 'virt_spin_lock_key' used before call to jump_label_init()

    And that makes pinpointing which key is causing that a lot easier.

    include/linux/jump_label.h | 14 +++++++-------
    include/linux/jump_label_ratelimit.h | 6 +++---
    kernel/jump_label.c | 14 +++++++-------
    3 files changed, 17 insertions(+), 17 deletions(-)

    Signed-off-by: Borislav Petkov
    Reviewed-by: Steven Rostedt (VMware)
    Cc: Boris Ostrovsky
    Cc: Hannes Frederic Sowa
    Cc: Jason Baron
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20171018152428.ffjgak4o25f7ept6@pd.tnic
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     

10 Aug, 2017

5 commits

  • As using the normal static key API under the hotplug lock is pretty much
    impossible, let's provide a variant of some of these functions that
    requires the hotplug lock to have already been taken.

    These functions are only meant to be used in CPU hotplug callbacks.
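
    A hedged usage sketch (the key and callback names are hypothetical): from
    a cpuhp callback the hotplug lock is already held, so the _cpuslocked
    variant must be used.

    static DEFINE_STATIC_KEY_FALSE(foo_feature_key);    /* hypothetical key */

    static int foo_cpu_online(unsigned int cpu)     /* hypothetical callback */
    {
        /* cpuhp callbacks run with the hotplug lock held */
        static_branch_enable_cpuslocked(&foo_feature_key);
        return 0;
    }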

    Signed-off-by: Marc Zyngier
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Leo Yan
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/20170801080257.5056-4-marc.zyngier@arm.com
    Signed-off-by: Ingo Molnar

    Marc Zyngier
     
  • In order to later introduce an "already locked" version of some of the
    static key functions, let's split the code into the core stuff (the
    *_cpuslocked functions) and the usual helpers, which now take/release the
    hotplug lock and call into the _cpuslocked versions.
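
    A hedged sketch of the resulting split for one of the helpers
    (illustrative):

    void static_key_enable(struct static_key *key)
    {
        cpus_read_lock();
        static_key_enable_cpuslocked(key);
        cpus_read_unlock();
    }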

    Signed-off-by: Marc Zyngier
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Leo Yan
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/20170801080257.5056-3-marc.zyngier@arm.com
    Signed-off-by: Ingo Molnar

    Marc Zyngier
     
  • As we're about to rework the locking, let's move the taking and
    release of the CPU hotplug lock to locations that will make its
    reworking completely obvious.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Leo Yan
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/20170801080257.5056-2-marc.zyngier@arm.com
    Signed-off-by: Ingo Molnar

    Marc Zyngier
     
  • In the unlikely case that text modification does not fully order things,
    add some extra ordering of our own to ensure we only enable the fast path
    after all text is visible.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Jason Baron
    Cc: Linus Torvalds
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • static_key_enable/disable are trying to cap the static key count to 0/1.
    However, their use of key->enabled is outside jump_label_lock, so they do
    not really ensure that.

    Rewrite them to do a quick check for an already enabled (respectively,
    already disabled) key, and then recheck under the jump label lock. Unlike
    static_key_slow_inc/dec, a failed check under the jump label lock does not
    modify key->enabled.

    Signed-off-by: Paolo Bonzini
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Eric Dumazet
    Cc: Jason Baron
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1501601046-35683-2-git-send-email-pbonzini@redhat.com
    Signed-off-by: Ingo Molnar

    Paolo Bonzini
     

26 May, 2017

1 commit

  • The conversion of the hotplug locking to a percpu rwsem unearthed lock
    ordering issues all over the place.

    The jump_label code has two issues:

    1) Nested get_online_cpus() invocations

    2) Ordering problems vs. the cpus rwsem and the jump_label_mutex

    To cure these, the following lock order has been established:

    cpus_rwsem -> jump_label_lock -> text_mutex

    Even if not all architectures need protection against CPU hotplug, taking
    cpus_rwsem before jump_label_lock is now mandatory in code paths which
    actually modify code and therefore need text_mutex protection.

    Move the get_online_cpus() invocations into the core jump label code and
    establish the proper lock order where required.

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Acked-by: "David S. Miller"
    Cc: Paul E. McKenney
    Cc: Chris Metcalf
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Cc: Steven Rostedt
    Cc: Jason Baron
    Cc: Ralf Baechle
    Link: http://lkml.kernel.org/r/20170524081549.025830817@linutronix.de

    Thomas Gleixner
     

28 Feb, 2017

1 commit

  • Pull tracing updates from Steven Rostedt:
    "This release has no new tracing features, just clean ups, minor fixes
    and small optimizations"

    * tag 'trace-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (25 commits)
    tracing: Remove outdated ring buffer comment
    tracing/probes: Fix a warning message to show correct maximum length
    tracing: Fix return value check in trace_benchmark_reg()
    tracing: Use modern function declaration
    jump_label: Reduce the size of struct static_key
    tracing/probe: Show subsystem name in messages
    tracing/hwlat: Update old comment about migration
    timers: Make flags output in the timer_start tracepoint useful
    tracing: Have traceprobe_probes_write() not access userspace unnecessarily
    tracing: Have COMM event filter key be treated as a string
    ftrace: Have set_graph_function handle multiple functions in one write
    ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock
    tracing: Reset parser->buffer to allow multiple "puts"
    ftrace: Have set_graph_functions handle write with RDWR
    ftrace: Reset fgd->hash in ftrace_graph_write()
    ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY
    ftrace: Create a slight optimization on searching the ftrace_hash
    tracing: Add ftrace_hash_key() helper function
    ftrace: Convert graph filter to use hash tables
    ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip
    ...

    Linus Torvalds
     

15 Feb, 2017

1 commit

  • The static_key->next field goes mostly unused. The field is used for
    associating module uses with a static key. Most uses of struct static_key
    define a static key in the core kernel and make use of it entirely within
    the core kernel, or define the static key in a module and make use of it
    only from within that module. In fact, of the ~3,000 static keys defined,
    I found only about 5 or so that did not fit this pattern.

    Thus, we can remove the static_key->next field entirely and overload
    the static_key->entries field. That is, when all the static_key uses
    are contained within the same module, static_key->entries continues
    to point to those uses. However, if the static_key uses are not contained
    within the module where the static_key is defined, then we allocate a
    struct static_key_mod, store a pointer to the uses within that
    struct static_key_mod, and have the static key point at the static_key_mod.
    This does incur some extra memory usage when a static_key is used in a
    module that does not define it, but since there are only a handful of such
    cases there is a net savings.

    In order to identify if the static_key->entries pointer contains a
    struct static_key_mod or a struct jump_entry pointer, bit 1 of
    static_key->entries is set to 1 if it points to a struct static_key_mod and
    is 0 if it points to a struct jump_entry. We were already using bit 0 in a
    similar way to store the initial value of the static_key. This does mean
    that allocations of struct static_key_mod and the struct jump_entry tables
    need to be at least 4-byte aligned in memory. As far as I can tell, all
    arches meet this criterion.
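
    A hedged sketch of the resulting layout (illustrative; the union lets one
    word serve both roles, with the two low bits used as described above):

    struct static_key {
        atomic_t enabled;
        /* bit 0: initial branch value; bit 1: set if the pointer is a
         * struct static_key_mod rather than a struct jump_entry table */
        union {
            unsigned long type;
            struct jump_entry *entries;
            struct static_key_mod *next;
        };
    };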

    For my .config, the patch increased the text by 778 bytes, but reduced
    the data + bss size by 14912, for a net savings of 14,134 bytes.

       text    data    bss      dec    hex filename
    8092427 5016512 790528 13899467 d416cb vmlinux.pre
    8093205 5001600 790528 13885333 d3df95 vmlinux.post

    Link: http://lkml.kernel.org/r/1486154544-4321-1-git-send-email-jbaron@akamai.com

    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Joe Perches
    Signed-off-by: Jason Baron
    Signed-off-by: Steven Rostedt (VMware)

    Jason Baron
     

12 Jan, 2017

1 commit

  • Modules that use static_key_deferred need a way to synchronize with
    any delayed work that is still pending when the module is unloaded.
    Introduce static_key_deferred_flush() which flushes any pending
    jump label updates.
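
    The new helper essentially just flushes the key's delayed work; a hedged
    sketch:

    void static_key_deferred_flush(struct static_key_deferred *key)
    {
        STATIC_KEY_CHECK_USE();
        flush_delayed_work(&key->work);
    }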

    Signed-off-by: David Matlack
    Cc: stable@vger.kernel.org
    Acked-by: Peter Zijlstra (Intel)
    Signed-off-by: Paolo Bonzini

    David Matlack
     

05 Aug, 2016

1 commit

  • Pull more powerpc updates from Michael Ellerman:
    "These were delayed for various reasons, so I let them sit in next a
    bit longer, rather than including them in my first pull request.

    Fixes:
    - Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
    - Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
    - Move register_process_table() out of ppc_md from Michael Ellerman

    Use jump_label for [cpu|mmu]_has_feature():
    - Add mmu_early_init_devtree() from Michael Ellerman
    - Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
    - Do hash device tree scanning earlier from Michael Ellerman
    - Do radix device tree scanning earlier from Michael Ellerman
    - Do feature patching before MMU init from Michael Ellerman
    - Check features don't change after patching from Michael Ellerman
    - Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
    - Convert mmu_has_feature() to returning bool from Michael Ellerman
    - Convert cpu_has_feature() to returning bool from Michael Ellerman
    - Define radix_enabled() in one place & use static inline from Michael Ellerman
    - Add early_[cpu|mmu]_has_feature() from Michael Ellerman
    - Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
    - jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
    - Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
    - Remove mfvtb() from Kevin Hao
    - Move cpu_has_feature() to a separate file from Kevin Hao
    - Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
    - Add option to use jump label for cpu_has_feature() from Kevin Hao
    - Add option to use jump label for mmu_has_feature() from Kevin Hao
    - Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
    - Annotate jump label assembly from Michael Ellerman

    TLB flush enhancements from Aneesh Kumar K.V:
    - radix: Implement tlb mmu gather flush efficiently
    - Add helper for finding SLBE LLP encoding
    - Use hugetlb flush functions
    - Drop multiple definition of mm_is_core_local
    - radix: Add tlb flush of THP ptes
    - radix: Rename function and drop unused arg
    - radix/hugetlb: Add helper for finding page size
    - hugetlb: Add flush_hugetlb_tlb_range
    - remove flush_tlb_page_nohash

    Add new ptrace regsets from Anshuman Khandual and Simon Guo:
    - elf: Add powerpc specific core note sections
    - Add the function flush_tmregs_to_thread
    - Enable in transaction NT_PRFPREG ptrace requests
    - Enable in transaction NT_PPC_VMX ptrace requests
    - Enable in transaction NT_PPC_VSX ptrace requests
    - Adapt gpr32_get, gpr32_set functions for transaction
    - Enable support for NT_PPC_CGPR
    - Enable support for NT_PPC_CFPR
    - Enable support for NT_PPC_CVMX
    - Enable support for NT_PPC_CVSX
    - Enable support for TM SPR state
    - Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
    - Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
    - Enable support for EBB registers
    - Enable support for Performance Monitor registers"

    * tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
    powerpc/mm: Move register_process_table() out of ppc_md
    powerpc/perf: Fix incorrect event codes in power9-event-list
    powerpc/32: Fix early access to cpu_spec relocation
    powerpc/ptrace: Enable support for Performance Monitor registers
    powerpc/ptrace: Enable support for EBB registers
    powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
    powerpc/ptrace: Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
    powerpc/ptrace: Enable support for TM SPR state
    powerpc/ptrace: Enable support for NT_PPC_CVSX
    powerpc/ptrace: Enable support for NT_PPC_CVMX
    powerpc/ptrace: Enable support for NT_PPC_CFPR
    powerpc/ptrace: Enable support for NT_PPC_CGPR
    powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction
    powerpc/ptrace: Enable in transaction NT_PPC_VSX ptrace requests
    powerpc/ptrace: Enable in transaction NT_PPC_VMX ptrace requests
    powerpc/ptrace: Enable in transaction NT_PRFPREG ptrace requests
    powerpc/process: Add the function flush_tmregs_to_thread
    elf: Add powerpc specific core note sections
    powerpc/mm: remove flush_tlb_page_nohash
    powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
    ...

    Linus Torvalds
     

04 Aug, 2016

1 commit

  • Pull module updates from Rusty Russell:
    "The only interesting thing here is Jessica's patch to add
    ro_after_init support to modules. The rest are all trivia"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    extable.h: add stddef.h so "NULL" definition is not implicit
    modules: add ro_after_init support
    jump_label: disable preemption around __module_text_address().
    exceptions: fork exception table content from module.h into extable.h
    modules: Add kernel parameter to blacklist modules
    module: Do a WARN_ON_ONCE() for assert module mutex not held
    Documentation/module-signing.txt: Note need for version info if reusing a key
    module: Invalidate signatures on force-loaded modules
    module: Issue warnings when tainting kernel
    module: fix redundant test.
    module: fix noreturn attribute for __module_put_and_exit()

    Linus Torvalds