08 Apr, 2020

1 commit

  • Clang warns:

    ../kernel/extable.c:37:52: warning: array comparison always evaluates to
    a constant [-Wtautological-compare]
    if (main_extable_sort_needed && __stop___ex_table > __start___ex_table) {
    ^
    1 warning generated.

    These are not true arrays; they are linker-defined symbols, which are just
    addresses. Using the address-of operator silences the warning and does
    not change the resulting assembly with either clang/ld.lld or gcc/ld
    (tested with diff + objdump -Dr).
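
    The fix itself is a one-line change; a sketch based on the description
    above (taking the symbols' addresses so clang no longer sees an array
    comparison):

    -    if (main_extable_sort_needed && __stop___ex_table > __start___ex_table) {
    +    if (main_extable_sort_needed && &__stop___ex_table > &__start___ex_table) {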

    Suggested-by: Nick Desaulniers
    Signed-off-by: Nathan Chancellor
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://github.com/ClangBuiltLinux/linux/issues/892
    Link: http://lkml.kernel.org/r/20200219202036.45702-1-natechancellor@gmail.com
    Signed-off-by: Linus Torvalds

    Nathan Chancellor
     

14 Mar, 2020

1 commit

    Now that we have all the objects (bpf_prog, bpf_trampoline,
    bpf_dispatcher) linked in bpf_tree, there's no need to have a
    separate bpf_image tree for images.

    Removing the bpf_image tree together with struct bpf_image,
    because it's no longer needed.

    Also removing the bpf_image_alloc function and adding back the original
    bpf_jit_alloc_exec_page interface instead.

    The kernel_text_address function can now rely only on is_bpf_text_address,
    because it checks the bpf_tree that contains all the objects.

    Keeping bpf_image_ksym_add and bpf_image_ksym_del because they are
    useful wrappers with perf's ksymbol interface calls.

    Signed-off-by: Jiri Olsa
    Signed-off-by: Alexei Starovoitov
    Link: https://lore.kernel.org/bpf/20200312195610.346362-13-jolsa@kernel.org
    Signed-off-by: Alexei Starovoitov

    Jiri Olsa
     

25 Jan, 2020

1 commit

    When unwinding the stack we need to identify each address
    to successfully continue. Adding a latch tree to keep trampolines
    for quick lookup during the unwind.

    The patch uses the first 48 bytes of the page for the latch tree
    node, leaving 4048 bytes of the rest of the page for trampoline- or
    dispatcher-generated code.

    That is still enough not to affect the trampoline and dispatcher
    programs' maximum counts.
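
    A minimal sketch of the page layout this implies; the struct and macro
    are illustrative, not necessarily the exact upstream definitions:

    struct bpf_image {
        struct latch_tree_node tnode;   /* 48 bytes on 64-bit: two rb_nodes */
        u8 data[];                      /* trampoline/dispatcher code */
    };
    /* 4096 - 48 = 4048 bytes left for generated code */
    #define BPF_IMAGE_SIZE  (PAGE_SIZE - sizeof(struct bpf_image))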

    Signed-off-by: Jiri Olsa
    Signed-off-by: Alexei Starovoitov
    Link: https://lore.kernel.org/bpf/20200123161508.915203-3-jolsa@kernel.org

    Jiri Olsa
     

17 Oct, 2019

1 commit

    A pointer to a BTF object is a pointer to a kernel object or NULL.
    Such pointers can only be used by BPF_LDX instructions.
    The verifier changed their opcode from LDX|MEM|size
    to LDX|PROBE_MEM|size to make JITing easier.
    The number of entries in the extable is the number of BPF_LDX insns
    that access kernel memory via a "pointer to BTF type".
    Only these load instructions can fault.
    Since the x86 extable is relative, it has to be allocated in the same
    memory region as the JITed code.
    Allocate it prior to the last pass of JITing and let the last pass populate it.
    The pointer to the extable in bpf_prog_aux is necessary to make page fault
    handling fast.
    Page fault handling is done in two steps:
    1. bpf_prog_kallsyms_find() finds the BPF program that page faulted;
    it's done by walking the rb tree.
    2. Then the extable for the given BPF program is binary searched.
    This process is similar to how page faulting is done for kernel modules.
    The exception handler skips over the faulting x86 instruction and
    initializes the destination register with zero. This mimics the exact
    behavior of bpf_probe_read (when probe_kernel_read faults, dest is zeroed).

    JITs for other architectures can add support in a similar way.
    Until then they will reject the unknown opcode and fall back to the interpreter.
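
    A sketch of the two-step lookup, assuming the extable and its entry
    count live in bpf_prog_aux as described (exact field names are
    illustrative):

    const struct exception_table_entry *
    search_bpf_extable(unsigned long fault_addr)
    {
        const struct exception_table_entry *e = NULL;
        struct bpf_prog *prog;

        rcu_read_lock();
        /* Step 1: walk the rb tree to find the faulting BPF program. */
        prog = bpf_prog_kallsyms_find(fault_addr);
        if (prog && prog->aux->num_exentries)
            /* Step 2: binary-search that program's own extable. */
            e = search_extable(prog->aux->extable,
                               prog->aux->num_exentries, fault_addr);
        rcu_read_unlock();
        return e;
    }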

    Since the extable should be aligned and placed near the JITed code,
    make bpf_jit_binary_alloc() return a 4-byte aligned image offset,
    so that the extable-aligning formula in bpf_int_jit_compile() doesn't need
    to rely on the internal implementation of bpf_jit_binary_alloc().
    On x86, gcc defaults to 16-byte alignment for regular kernel functions
    due to better performance. JITed code may be aligned to 16 in the future,
    but it will use 4 in the meantime.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Acked-by: Andrii Nakryiko
    Acked-by: Martin KaFai Lau
    Link: https://lore.kernel.org/bpf/20191016032505.2089704-10-ast@kernel.org

    Alexei Starovoitov
     

21 Aug, 2019

1 commit

    In certain architecture-specific operating modes (e.g., the powerpc
    machine check handler, which is unable to access vmalloc memory),
    search_exception_tables() cannot be called, because it also searches the
    module exception tables when an entry is not found in the kernel
    exception table.
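
    A sketch of a kernel-only lookup that sidesteps the module tables; the
    helper name follows the description's intent but is an assumption here:

    /* Search only the core kernel's sorted __ex_table, never modules,
     * so it is safe in contexts that cannot touch vmalloc'd memory. */
    const struct exception_table_entry *
    search_kernel_exception_table(unsigned long addr)
    {
        return search_extable(__start___ex_table,
                              __stop___ex_table - __start___ex_table,
                              addr);
    }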

    Signed-off-by: Santosh Sivaraj
    Reviewed-by: Nicholas Piggin
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20190820081352.8641-5-santosh@fossix.org

    Santosh Sivaraj
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details you
    should have received a copy of the gnu general public license along
    with this program if not write to the free software foundation inc
    59 temple place suite 330 boston ma 02111 1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 1334 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Richard Fontana
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 Feb, 2018

1 commit

  • Convert init_kernel_text() to a global function and use it in a few
    places instead of manually comparing _sinittext and _einittext.

    Note that kallsyms.h has a very similar function called
    is_kernel_inittext(), but its end check is inclusive. I'm not sure
    whether that's intentional behavior, so I didn't touch it.
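
    A minimal sketch of what the now-global function checks (note the
    exclusive end bound, in contrast to is_kernel_inittext()):

    int init_kernel_text(unsigned long addr)
    {
        if (addr >= (unsigned long)_sinittext &&
            addr < (unsigned long)_einittext)   /* exclusive end check */
            return 1;
        return 0;
    }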

    Suggested-by: Jason Baron
    Signed-off-by: Josh Poimboeuf
    Acked-by: Peter Zijlstra
    Acked-by: Steven Rostedt (VMware)
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/4335d02be8d45ca7d265d2f174251d0b7ee6c5fd.1519051220.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

07 Nov, 2017

1 commit

    We use alternatives_text_reserved() to check whether an address is in
    the fixed pieces of alternative reserved, but the problem is that
    we don't hold the smp_alt mutex when calling this function. So the list
    traversal may encounter a deleted list_head if another path is doing
    alternatives_smp_module_del().

    One solution is to hold the smp_alt mutex before calling this
    function, but the difficulty is that the callers of this
    function, arch_prepare_kprobe() and arch_prepare_optimized_kprobe(),
    are called inside the text_mutex. So we would have to hold the smp_alt
    mutex before we go into this arch-dependent code. But we can't do that
    now; the smp_alt mutex is the arch-dependent part, and only x86 has it.
    Maybe we could export another arch-dependent callback to solve this.

    But there is a simpler way to handle this problem. We can reuse the
    text_mutex to protect smp_alt_modules instead of using another mutex.
    And all the arch-dependent checks of kprobes are done inside the
    text_mutex, so it's safe now.

    Signed-off-by: Zhou Chengming
    Reviewed-by: Masami Hiramatsu
    Acked-by: Steven Rostedt (VMware)
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bp@suse.de
    Fixes: 2cfa197 "ftrace/alternatives: Introducing *_text_reserved functions"
    Link: http://lkml.kernel.org/r/1509585501-79466-1-git-send-email-zhouchengming1@huawei.com
    Signed-off-by: Ingo Molnar

    Zhou Chengming
     

24 Sep, 2017

2 commits

    If kernel_text_address() is called when RCU is not watching, it can cause an
    RCU bug, because is_module_text_address(), the is_kprobe_*insn_slot()
    functions and is_bpf_text_address() require the use of RCU.

    Enable RCU, but only if it is not currently watching, before calling
    is_module_text_address(). rcu_nmi_enter() is used to enable RCU because
    kernel_text_address() can be called from pretty much anywhere, even from
    within an NMI. It is called via save_stack_trace(), which can be invoked
    by any WARN() or tracing function, and that can happen while RCU is not
    watching (for example, going to or coming from idle, or during CPU take
    down or bring up).
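
    A sketch of the resulting guard, simplified from the pattern the patch
    describes:

    int kernel_text_address(unsigned long addr)
    {
        bool no_rcu;
        int ret = 1;

        if (core_kernel_text(addr))
            return 1;

        /* Treat this like an NMI: we may be called from anywhere. */
        no_rcu = !rcu_is_watching();
        if (no_rcu)
            rcu_nmi_enter();

        if (is_module_text_address(addr))
            goto out;
        if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
            goto out;
        if (is_bpf_text_address(addr))
            goto out;
        ret = 0;
    out:
        if (no_rcu)
            rcu_nmi_exit();
        return ret;
    }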

    Cc: stable@vger.kernel.org
    Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
    Acked-by: Paul E. McKenney
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
    The functionality of kernel_text_address() and __kernel_text_address()
    is the same, except that __kernel_text_address() does a little more (that
    function needs a rename, but that can be done another time). Instead of
    having duplicate code in both, simply have __kernel_text_address() call
    kernel_text_address() instead.
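
    A sketch of the consolidated shape; the "little more" is an extra
    init-text check (assuming an init_kernel_text() helper for that range):

    int __kernel_text_address(unsigned long addr)
    {
        if (kernel_text_address(addr))
            return 1;
        /* The extra bit: accept init symbols so saved stacktraces
         * (e.g. lockdep traces) can still be printed. */
        if (init_kernel_text(addr))
            return 1;
        return 0;
    }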

    This is marked for stable because there's an RCU bug that can happen if
    one of these functions gets called while RCU is not watching. That fix
    depends on this fix to keep from having to write the fix twice.

    Cc: stable@vger.kernel.org
    Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
    Acked-by: Paul E. McKenney
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

07 Jul, 2017

1 commit

  • core_kernel_text is used by MIPS in its function graph trace processing,
    so having this method traced leads to an infinite set of recursive calls
    such as:

    Call Trace:
    ftrace_return_to_handler+0x50/0x128
    core_kernel_text+0x10/0x1b8
    prepare_ftrace_return+0x6c/0x114
    ftrace_graph_caller+0x20/0x44
    return_to_handler+0x10/0x30
    return_to_handler+0x0/0x30
    return_to_handler+0x0/0x30
    ftrace_ops_no_ops+0x114/0x1bc
    core_kernel_text+0x10/0x1b8
    core_kernel_text+0x10/0x1b8
    core_kernel_text+0x10/0x1b8
    ftrace_ops_no_ops+0x114/0x1bc
    core_kernel_text+0x10/0x1b8
    prepare_ftrace_return+0x6c/0x114
    ftrace_graph_caller+0x20/0x44
    (...)

    Mark the function notrace to avoid it being traced.
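
    A sketch of the change; the body is simplified here, and the fix is just
    the notrace attribute, which keeps ftrace from instrumenting the
    function and breaks the recursion shown above:

    int notrace core_kernel_text(unsigned long addr)
    {
        if (addr >= (unsigned long)_stext &&
            addr < (unsigned long)_etext)
            return 1;
        if (system_state < SYSTEM_RUNNING &&
            addr >= (unsigned long)_sinittext &&
            addr < (unsigned long)_einittext)
            return 1;
        return 0;
    }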

    Link: http://lkml.kernel.org/r/1498028607-6765-1-git-send-email-marcin.nowakowski@imgtec.com
    Signed-off-by: Marcin Nowakowski
    Reviewed-by: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Thomas Meyer
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Daniel Borkmann
    Cc: Paul Gortmaker
    Cc: Thomas Gleixner
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marcin Nowakowski
     

23 May, 2017

1 commit

  • To enable smp_processor_id() and might_sleep() debug checks earlier, it's
    required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

    Adjust the system_state check in core_kernel_text() to handle the extra
    states, i.e. to cover init text up to the point where the system switches
    to state RUNNING.
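
    A sketch of the adjusted check: instead of testing for equality with
    SYSTEM_BOOTING, any state before RUNNING (e.g. the new intermediate
    SYSTEM_SCHEDULING) keeps init text valid:

    if (system_state < SYSTEM_RUNNING &&
        addr >= (unsigned long)_sinittext &&
        addr < (unsigned long)_einittext)
        return 1;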

    Tested-by: Mark Rutland
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Steven Rostedt (VMware)
    Cc: Greg Kroah-Hartman
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20170516184735.949992741@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

23 Feb, 2017

1 commit

  • Pull networking updates from David Miller:
    "Highlights:

    1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
    Varadhan.

    2) Simplify classifier state on sk_buff in order to shrink it a bit.
    From Willem de Bruijn.

    3) Introduce SIPHASH and its usage for secure sequence numbers and
    syncookies. From Jason A. Donenfeld.

    4) Reduce CPU usage for ICMP replies we are going to limit or
    suppress, from Jesper Dangaard Brouer.

    5) Introduce Shared Memory Communications socket layer, from Ursula
    Braun.

    6) Add RACK loss detection and allow it to actually trigger fast
    recovery instead of just assisting after other algorithms have
    triggered it. From Yuchung Cheng.

    7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.

    8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.

    9) Export MPLS packet stats via netlink, from Robert Shearman.

    10) Significantly improve inet port bind conflict handling, especially
    when an application is restarted and changes its setting of
    reuseport. From Josef Bacik.

    11) Implement TX batching in vhost_net, from Jason Wang.

    12) Extend the dummy device so that VF (virtual function) features,
    such as configuration, can be more easily tested. From Phil
    Sutter.

    13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
    Dumazet.

    14) Add new bpf MAP, implementing a longest prefix match trie. From
    Daniel Mack.

    15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.

    16) Add new aquantia driver, from David VomLehn.

    17) Add bpf tracepoints, from Daniel Borkmann.

    18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
    Florian Fainelli.

    19) Remove custom busy polling in many drivers; it has been done in the
    core networking layer since 4.5. From Eric Dumazet.

    20) Support XDP adjust_head in virtio_net, from John Fastabend.

    21) Fix several major holes in neighbour entry confirmation, from
    Julian Anastasov.

    22) Add XDP support to bnxt_en driver, from Michael Chan.

    23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.

    24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.

    25) Support GRO in IPSEC protocols, from Steffen Klassert"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
    Revert "ath10k: Search SMBIOS for OEM board file extension"
    net: socket: fix recvmmsg not returning error from sock_error
    bnxt_en: use eth_hw_addr_random()
    bpf: fix unlocking of jited image when module ronx not set
    arch: add ARCH_HAS_SET_MEMORY config
    net: napi_watchdog() can use napi_schedule_irqoff()
    tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
    net/hsr: use eth_hw_addr_random()
    net: mvpp2: enable building on 64-bit platforms
    net: mvpp2: switch to build_skb() in the RX path
    net: mvpp2: simplify MVPP2_PRS_RI_* definitions
    net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
    net: mvpp2: remove unused register definitions
    net: mvpp2: simplify mvpp2_bm_bufs_add()
    net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
    net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
    net: mvpp2: release reference to txq_cpu[] entry after unmapping
    net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
    net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
    net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
    ...

    Linus Torvalds
     

22 Feb, 2017

1 commit

  • Pull exception table module split from Paul Gortmaker:
    "Final extable.h related changes.

    This completes the separation of exception table content from the
    module.h header file. This is achieved with the final commit that
    removes the one line back compatible change that sourced extable.h
    into the module.h file.

    The commits are unchanged since January, with the exception of a
    couple Acks that came in for the last two commits a bit later. The
    changes have been in linux-next for quite some time[1] and have got
    widespread arch coverage via toolchains I have and also from
    additional ones the kbuild bot has.

    Maintainers of the various arches were Cc'd during the postings to
    lkml[2] and informed that the intention was to take the remaining
    arch-specific changes and lump them together with the final two non-arch
    specific changes and submit for this merge window.

    The ia64 diffstat stands out and probably warrants a mention. In an
    earlier review, Al Viro made a valid comment that the original header
    separation of content left something to be desired, and that it be
    fixed as a part of this change, hence the larger diffstat"

    * tag 'extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (21 commits)
    module.h: remove extable.h include now users have migrated
    core: migrate exception table users off module.h and onto extable.h
    cris: migrate exception table users off module.h and onto extable.h
    hexagon: migrate exception table users off module.h and onto extable.h
    microblaze: migrate exception table users off module.h and onto extable.h
    unicore32: migrate exception table users off module.h and onto extable.h
    score: migrate exception table users off module.h and onto extable.h
    metag: migrate exception table users off module.h and onto extable.h
    arc: migrate exception table users off module.h and onto extable.h
    nios2: migrate exception table users off module.h and onto extable.h
    sparc: migrate exception table users onto extable.h
    openrisc: migrate exception table users off module.h and onto extable.h
    frv: migrate exception table users off module.h and onto extable.h
    sh: migrate exception table users off module.h and onto extable.h
    xtensa: migrate exception table users off module.h and onto extable.h
    mn10300: migrate exception table users off module.h and onto extable.h
    alpha: migrate exception table users off module.h and onto extable.h
    arm: migrate exception table users off module.h and onto extable.h
    m32r: migrate exception table users off module.h and onto extable.h
    ia64: ensure exception table search users include extable.h
    ...

    Linus Torvalds
     

18 Feb, 2017

1 commit

    A long-standing issue with JITed programs is that stack traces from
    function tracing check whether a given address is kernel code
    through {__,}kernel_text_address(), which checks for code in the core
    kernel, modules and dynamically allocated ftrace trampolines. But
    what is still missing is BPF JITed programs (interpreted programs
    are not an issue as __bpf_prog_run() will be attributed to them),
    thus when a stack trace is triggered, the code walking the stack
    won't see any of the JITed ones. The same goes for address correlation
    done from user space via reading /proc/kallsyms. This is read by
    tools like perf, but the latter is also useful for permanent live
    tracing with eBPF itself in combination with stack maps when other
    eBPF types are part of the callchain. See the offwaketime example on
    dumping stack from a map.

    This work tries to tackle that issue by making the addresses and
    symbols known to the kernel. The lookup from *kernel_text_address()
    is implemented through a latched RB tree that can be read under
    RCU in the fast path and that is also shared for symbol/size/offset
    lookup for a specific given address in kallsyms. The slow-path iteration
    through all symbols in the seq file is done via an RCU list, which holds
    a tiny fraction of all exported ksyms, usually below 0.1 percent.
    Function symbols are exported as bpf_prog_<tag>, in order to aid
    debugging and attribution. This facility is currently enabled for
    root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
    is active in any mode. The rationale behind this is that still a lot
    of systems ship with world read permissions on kallsyms, thus addresses
    should not get suddenly exposed for them. If that situation gets
    much better in the future, we always have the option to change the
    default on this. Likewise, unprivileged programs are not allowed
    to add entries there either, but that is less of a concern as most
    such program types relevant in this context are for root-only anyway.
    If enabled, call graphs and stack traces will then show a correct
    attribution; one example is illustrated below, where the trace is
    now visible in tooling such as perf script --kallsyms=/proc/kallsyms
    and friends.

    Before:

    7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
    f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)

    After:

    7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
    [...]
    7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

10 Feb, 2017

1 commit

  • These files were including module.h for exception table related
    functions. We've now separated that content out into its own file
    "extable.h" so now move over to that and where possible, avoid all
    the extra header content in module.h that we don't really need to
    compile these non-modular files.

    Note:
    init/main.c still needs module.h for __init_or_module
    kernel/extable.c still needs module.h for is_module_text_address

    ...and so we don't get the benefit of removing module.h from the cpp
    feed for these two files, unlike the almost universal 1:1 exchange
    of module.h for extable.h we were able to do in the arch dirs.

    Cc: Rusty Russell
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Acked-by: Jessica Yu
    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

14 Jan, 2017

1 commit

  • Improve __kernel_text_address()/kernel_text_address() to return
    true if the given address is on a kprobe's instruction slot
    trampoline.

    This can help stacktraces to determine the address is on a
    text area or not.

    To implement this atomically in is_kprobe_*_slot(), also change
    the insn_cache page list to an RCU list.

    This changes timings a bit (it delays page freeing to the RCU garbage
    collection phase), but none of that is in the hot path.

    Note: this change can add a small overhead to stack unwinders because
    it adds 2 additional checks to __kernel_text_address(). However, the
    impact should be very small, because the kprobe_insn_pages list has 1
    entry per 256 probes (on x86; on arm/arm64 it will be 1024 probes),
    and kprobe_optinsn_pages has 1 entry per 32 probes (on x86).
    In most use cases, the number of kprobe events may be less
    than 20, which means that is_kprobe_*_slot() will check just one entry.
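
    The two additional checks, sketched in place:

    /* In __kernel_text_address() (sketch): */
    if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
        return 1;   /* address is on a kprobe insn slot trampoline */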

    Tested-by: Josh Poimboeuf
    Signed-off-by: Masami Hiramatsu
    Acked-by: Peter Zijlstra
    Cc: Alexander Shishkin
    Cc: Ananth N Mavinakayanahalli
    Cc: Andrew Morton
    Cc: Andrey Konovalov
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/148388747896.6869.6354262871751682264.stgit@devbox
    [ Improved the changelog and coding style. ]
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     

20 Nov, 2014

1 commit

  • Stack traces that happen from function tracing check if the address
    on the stack is a __kernel_text_address(). That is, is the address
    kernel code. This calls core_kernel_text() which returns true
    if the address is part of the builtin kernel code. It also calls
    is_module_text_address() which returns true if the address belongs
    to module code.

    But what is missing is ftrace's dynamically allocated trampolines.
    These trampolines are allocated for individual ftrace_ops that
    call the ftrace_ops callback functions directly. But if they do a
    stack trace, the code checking the stack won't detect them, as they
    are neither core kernel code nor in module address space.

    By adding another field to ftrace_ops that also stores the size of
    the trampoline assigned to it, we can create a new function called
    is_ftrace_trampoline() that returns true if the address is a
    dynamically allocated ftrace trampoline. Note, it ignores trampolines
    that are not dynamically allocated, as they will return true with
    the core_kernel_text() function.
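
    A sketch of the lookup, assuming ftrace_ops now carries trampoline and
    trampoline_size as described; the ops-list traversal is schematic
    (upstream uses its own iterator):

    bool is_ftrace_trampoline(unsigned long addr)
    {
        struct ftrace_ops *op;
        bool ret = false;

        /* No sleeping: this runs from stack-trace context. */
        preempt_disable_notrace();
        for (op = ftrace_ops_list; op; op = op->next) {
            /* Only dynamically allocated trampolines are recorded. */
            if (op->trampoline && addr >= op->trampoline &&
                addr < op->trampoline + op->trampoline_size) {
                ret = true;
                break;
            }
        }
        preempt_enable_notrace();
        return ret;
    }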

    Link: http://lkml.kernel.org/r/20141119034829.497125839@goodmis.org

    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Acked-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

14 Feb, 2014

1 commit

  • main_extable_sort_needed is used by the build system and needs
    to be a normal ELF symbol. Make it visible so that LTO
    does not remove or mangle it.
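
    The fix amounts to annotating the definition; a sketch (the qualifiers
    are illustrative of the pattern, not verbatim):

    /* __visible forces a normal ELF symbol even under LTO */
    u32 __initdata __visible main_extable_sort_needed = 1;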

    Signed-off-by: Andi Kleen
    Link: http://lkml.kernel.org/r/1391845930-28580-8-git-send-email-ak@linux.intel.com
    Signed-off-by: H. Peter Anvin

    Andi Kleen
     

12 Sep, 2013

1 commit

    On at least ARM no-MMU the extable is empty, so there is nothing to
    sort. Add a check for the table being empty; effectively the only
    change is that the misleading pr_notice is suppressed.

    Signed-off-by: Uwe Kleine-König
    Cc: Ingo Molnar
    Cc: David Daney
    Cc: "H. Peter Anvin"
    Cc: Borislav Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Uwe Kleine-König
     

15 Apr, 2013

1 commit

  • Now that we do sort the __extable at build time, we actually are
    interested only in the case where we still do need to sort it.

    Signed-off-by: Borislav Petkov
    Cc: David Daney
    Link: http://lkml.kernel.org/r/1366023109-12098-1-git-send-email-bp@alien8.de
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     

20 May, 2011

2 commits

  • A new utility function (core_kernel_data()) is used to determine if a
    passed in address is part of core kernel data or not. It may or may not
    return true for RO data, but this utility must work for RW data.

    Thus both _sdata and _edata must be defined and contiguous,
    without .init sections that may later be freed and replaced by
    volatile memory (memory that can be freed).

    This utility function is used to determine if data is safe from
    ever being freed. Thus it should return true for all RW global
    data that is not in a module or has been allocated, or false
    otherwise.

    Also change core_kernel_data() back to the more precise _sdata condition
    and document the function.
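
    A sketch matching the documented semantics:

    int core_kernel_data(unsigned long addr)
    {
        /* RW kernel data lives between _sdata and _edata; this range
         * is never freed, unlike .init sections. */
        if (addr >= (unsigned long)_sdata &&
            addr < (unsigned long)_edata)
            return 1;
        return 0;
    }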

    Signed-off-by: Steven Rostedt
    Acked-by: Ralf Baechle
    Acked-by: Hirokazu Takata
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Geert Uytterhoeven
    Cc: Roman Zippel
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: Kyle McMartin
    Cc: Helge Deller
    Cc: JamesE.J.Bottomley
    Link: http://lkml.kernel.org/r/1305855298.1465.19.camel@gandalf.stny.rr.com
    Signed-off-by: Ingo Molnar
    ----
    arch/alpha/kernel/vmlinux.lds.S | 1 +
    arch/m32r/kernel/vmlinux.lds.S | 1 +
    arch/m68k/kernel/vmlinux-std.lds | 2 ++
    arch/m68k/kernel/vmlinux-sun3.lds | 1 +
    arch/mips/kernel/vmlinux.lds.S | 1 +
    arch/parisc/kernel/vmlinux.lds.S | 3 +++
    kernel/extable.c | 12 +++++++++++-
    7 files changed, 20 insertions(+), 1 deletion(-)

    Steven Rostedt
     
  • Some architectures such as Alpha do not define _sdata but _data:

    kernel/built-in.o: In function `core_kernel_data':
    kernel/extable.c:77: undefined reference to `_sdata'

    So expand the scope of the data range to the text addresses too;
    this might be more correct anyway, because this way we can
    cover read-only variables as well.

    Cc: Paul E. McKenney
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/n/tip-i878c8a0e0g0ep4v7i6vxnhz@git.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

19 May, 2011

1 commit

  • Now that functions may be selected individually, it only makes sense
    that we should allow dynamically allocated trace structures to
    be traced. This will allow perf to allocate a ftrace_ops structure
    at runtime and use it to pick and choose which functions that
    structure will trace.

    Note, a dynamically allocated ftrace_ops will always be called
    indirectly instead of being called directly from the mcount in
    entry.S. This is because there's no safe way to prevent mcount
    from being preempted before calling the function, unless we
    modify every entry.S to do so (not likely). Thus, dynamically allocated
    functions will now be called by the ftrace_ops_list_func() that
    loops through the ops that are allocated if there are more than
    one op allocated at a time. This loop is protected with a
    preempt_disable.

    To determine whether an ftrace_ops structure is allocated or not, a new
    utility function was added to kernel/extable.c, called
    core_kernel_data(), which returns 1 if the address is between
    _sdata and _edata.

    Cc: Paul E. McKenney
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

06 Apr, 2009

2 commits

  • * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
    tracing, net: fix net tree and tracing tree merge interaction
    tracing, powerpc: fix powerpc tree and tracing tree interaction
    ring-buffer: do not remove reader page from list on ring buffer free
    function-graph: allow unregistering twice
    trace: make argument 'mem' of trace_seq_putmem() const
    tracing: add missing 'extern' keywords to trace_output.h
    tracing: provide trace_seq_reserve()
    blktrace: print out BLK_TN_MESSAGE properly
    blktrace: extract duplidate code
    blktrace: fix memory leak when freeing struct blk_io_trace
    blktrace: fix blk_probes_ref chaos
    blktrace: make classic output more classic
    blktrace: fix off-by-one bug
    blktrace: fix the original blktrace
    blktrace: fix a race when creating blk_tree_root in debugfs
    blktrace: fix timestamp in binary output
    tracing, Text Edit Lock: cleanup
    tracing: filter fix for TRACE_EVENT_FORMAT events
    ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
    x86: kretprobe-booster interrupt emulation code fix
    ...

    Fix up trivial conflicts in
    arch/parisc/include/asm/ftrace.h
    include/linux/memory.h
    kernel/extable.c
    kernel/module.c

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param:
    module: use strstarts()
    strstarts: helper function for !strncmp(str, prefix, strlen(prefix))
    arm: allow usage of string functions in linux/string.h
    module: don't use stop_machine on module load
    module: create a request_module_nowait()
    module: include other structures in module version check
    module: remove the SHF_ALLOC flag on the __versions section.
    module: clarify the force-loading taint message.
    module: Export symbols needed for Ksplice
    Ksplice: Add functions for walking kallsyms symbols
    module: remove module_text_address()
    module: __module_address
    module: Make find_symbol return a struct kernel_symbol
    kernel/module.c: fix an unused goto label
    param: fix charp parameters set via sysfs

    Fix trivial conflicts in kernel/extable.c manually.

    Linus Torvalds
     

31 Mar, 2009

1 commit

  • Impact: Replace and remove risky (non-EXPORTed) API

    module_text_address() returns a pointer to the module, which given locking
    improvements in module.c, is useless except to test for NULL:

    1) If the module can't go away, use __module_text_address.
    2) Otherwise, just use is_module_text_address().

    Cc: linux-mtd@lists.infradead.org
    Signed-off-by: Rusty Russell

    Rusty Russell
     

23 Mar, 2009

1 commit

  • Impact: cleanup.

    The global mutex text_mutex is declared in linux/memory.h, so
    this file needs to be included into kernel/extable.c, where the
    same mutex is defined. This fixes the following sparse warning:

    kernel/extable.c:32:1: warning: symbol 'text_mutex' was not declared.
    Should it be static?

    Signed-off-by: Dmitri Vorobiev
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Dmitri Vorobiev
     

20 Mar, 2009

1 commit

  • Impact: build fix on SH !CONFIG_MMU

    Stephen Rothwell reported this linux-next build failure on the SH
    architecture:

    kernel/built-in.o: In function `disable_all_kprobes':
    kernel/kprobes.c:1382: undefined reference to `text_mutex'
    [...]

    And observed:

    | Introduced by commit 4460fdad85becd569f11501ad5b91814814335ff ("tracing,
    | Text Edit Lock - kprobes architecture independent support") from the
    | tracing tree. text_mutex is defined in mm/memory.c which is only built
    | if CONFIG_MMU is defined, which is not true for sh allmodconfig.

    Move this lock to kernel/extable.c (which is already home to various
    kernel text related routines), which file is always built-in.

    Reported-by: Stephen Rothwell
    Cc: Paul Mundt
    Cc: Mathieu Desnoyers
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

19 Mar, 2009

1 commit

  • Impact: fix incomplete stacktraces

    I noticed such weird stacktrace entries in lockdep dumps:

    [ 0.285956] {HARDIRQ-ON-W} state was registered at:
    [ 0.285956] [] mark_irqflags+0xbe/0x125
    [ 0.285956] [] __lock_acquire+0x674/0x82d
    [ 0.285956] [] lock_acquire+0xfc/0x128
    [ 0.285956] [] rt_spin_lock+0xc8/0xd0
    [ 0.285956] [] 0xffffffffffffffff

    The stacktrace entry is cut off after rt_spin_lock.

    After much debugging i found out that stacktrace entries that
    belong to init symbols dont get printed out, due to commit:

    a2da405: module: Don't report discarded init pages as kernel text.

    The reason is this check added to core_kernel_text():

    -    if (addr >= (unsigned long)_sinittext &&
    +    if (system_state == SYSTEM_BOOTING &&
    +        addr >= (unsigned long)_sinittext &&
             addr <= (unsigned long)_einittext)
    Cc: Rusty Russell
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

09 Feb, 2009

1 commit

    When the function graph tracer picks a return address, it ensures this address
    is really a kernel text one by calling __kernel_text_address().

    Actually, this path has never been taken. Its role was more likely to debug
    the tracer at the beginning of its development, but this function is wasteful
    since it is called for every traced function.

    The fault check is already sufficient.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

31 Dec, 2008

1 commit

  • * 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
    stacktrace: provide save_stack_trace_tsk() weak alias
    rcu: provide RCU options on non-preempt architectures too
    printk: fix discarding message when recursion_bug
    futex: clean up futex_(un)lock_pi fault handling
    "Tree RCU": scalable classic RCU implementation
    futex: rename field in futex_q to clarify single waiter semantics
    x86/swiotlb: add default swiotlb_arch_range_needs_mapping
    x86/swiotlb: add default physbus conversion
    x86: unify pci iommu setup and allow swiotlb to compile for 32 bit
    x86: add swiotlb allocation functions
    swiotlb: consolidate swiotlb info message printing
    swiotlb: support bouncing of HighMem pages
    swiotlb: factor out copy to/from device
    swiotlb: add arch hook to force mapping
    swiotlb: allow architectures to override physbusphys conversions
    swiotlb: add comment where we handle the overflow of a dma mask on 32 bit
    rcu: fix rcutorture behavior during reboot
    resources: skip sanity check of busy resources
    swiotlb: move some definitions to header
    swiotlb: allow architectures to override swiotlb pool allocation
    ...

    Fix up trivial conflicts in
    arch/x86/kernel/Makefile
    arch/x86/mm/init_32.c
    include/linux/hardirq.h
    as per Ingo's suggestions.

    Linus Torvalds
     

08 Dec, 2008

1 commit

  • Impact: trace more functions

    When the function graph tracer is configured, three more files are not
    traced to prevent only four functions to be traced. And this impacts the
    normal function tracer too.

    arch/x86/kernel/process_64/32.c:

    I had crashes when I let this file be traced. After some debugging, I saw
    that the "current" task pointer was changed inside __switch_to(), i.e.:
    "write_pda(pcurrent, next_p);" inside process_64.c. Since the tracer stores
    the original return address of the function inside current, we had
    crashes. Only __switch_to() has to be excluded from tracing.

    kernel/module.c and kernel/extable.c:

    Because of a function used internally by the function graph tracer:
    __kernel_text_address()

    To let the other functions inside these files to be traced, this patch
    introduces the __notrace_funcgraph function prefix which is __notrace if
    function graph tracer is configured and nothing if not.
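
    A sketch of the prefix as described:

    /* e.g. in a shared header (sketch): */
    #ifdef CONFIG_FUNCTION_GRAPH_TRACER
    #define __notrace_funcgraph    notrace
    #else
    #define __notrace_funcgraph
    #endif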

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

29 Jan, 2008

1 commit

  • Current code could cause a bug in symbol_put_addr() if an arch used
    kmalloc module text: we might think the symbol belongs to the core
    kernel.

    The downside is that this might make backtraces through (discarded)
    init functions harder to read on some arches, but we already have that
    issue for modules and no one has complained.

    Signed-off-by: Rusty Russell

    Rusty Russell