01 Sep, 2020

1 commit

  • Add the inline static call implementation for x86-64. The generated code
    is identical to the out-of-line case, except we move the trampoline into
    its own section.

    Objtool uses the trampoline naming convention to detect all the call
    sites. It then annotates those call sites in the .static_call_sites
    section.

    During boot (and module init), the call sites are patched to call
    directly into the destination function. The temporary trampoline is
    then no longer used.

    [peterz: merged trampolines, put trampoline in section]

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org

    Josh Poimboeuf
     

18 Jun, 2020

3 commits


03 Jun, 2020

1 commit

  • Currently objtool only collects information about relocations with
    addends. In recordmcount, which we are about to merge into objtool,
    some supported architectures do not use rela relocations.
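
    The difference between the two relocation flavours can be sketched as
    follows. This is a minimal illustration using the standard <elf.h>
    types, not objtool code: the helper names rela_addend() and
    rel_addend() are hypothetical, and the 32-bit host-endian addend
    field is an assumption (the real width depends on the relocation
    type).

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Explicit addend: a SHT_RELA entry carries r_addend in the entry itself. */
int64_t rela_addend(const Elf64_Rela *rela)
{
	assert(rela);
	return rela->r_addend;
}

/*
 * Implicit addend: a SHT_REL entry has no r_addend field, so the addend is
 * read from the bytes being relocated. Assumption for this sketch: the
 * relocated field is 32 bits wide and stored in host byte order.
 */
int64_t rel_addend(const Elf64_Rel *rel, const unsigned char *sec_data)
{
	int32_t addend;

	assert(rel && sec_data);
	memcpy(&addend, sec_data + rel->r_offset, sizeof(addend));
	return addend;
}
```

    A tool that only handles Elf64_Rela never has to look at the section
    contents, which is why supporting "rel" needs extra plumbing.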

    Signed-off-by: Matt Helsley
    Reviewed-by: Julien Thierry
    Reviewed-by: Kamalesh Babulal
    Signed-off-by: Josh Poimboeuf

    Matt Helsley
     

01 Jun, 2020

1 commit

  • Before supporting additional relocation types, rename the relevant
    types and functions from "rela" to "reloc". This work can be done with
    the following sed command:

    sed -e 's/struct rela/struct reloc/g' \
    -e 's/\([_\*]\)rela\(s\{0,1\}\)/\1reloc\2/g' \
    -e 's/tmprela\(s\{0,1\}\)/tmpreloc\1/g' \
    -e 's/relasec/relocsec/g' \
    -e 's/rela_list/reloc_list/g' \
    -e 's/rela_hash/reloc_hash/g' \
    -e 's/add_rela/add_reloc/g' \
    -e 's/rela->/reloc->/g' \
    -e '/rela[,\.]/{ s/\([^\.>]\)rela\([\.,]\)/\1reloc\2/g ; }' \
    -e 's/rela =/reloc =/g' \
    -e 's/relas =/relocs =/g' \
    -e 's/relas\[/relocs[/g' \
    -e 's/relaname =/relocname =/g' \
    -e 's/= rela\;/= reloc\;/g' \
    -e 's/= relas\;/= relocs\;/g' \
    -e 's/= relaname\;/= relocname\;/g' \
    -e 's/, rela)/, reloc)/g' \
    -e 's/\([ @]\)rela\([ "]\)/\1reloc\2/g' \
    -e 's/ rela$/ reloc/g' \
    -e 's/, relaname/, relocname/g' \
    -e 's/sec->rela/sec->reloc/g' \
    -e 's/(\(!\{0,1\}\)rela/(\1reloc/g' \
    -i \
    arch.h \
    arch/x86/decode.c \
    check.c \
    check.h \
    elf.c \
    elf.h \
    orc_gen.c \
    special.c

    Notable exceptions which complicate the regex include gelf_*
    library calls and standard/expected section names which still use
    "rela" because they encode the type of relocation expected. Also, keep
    "rela" in the struct because it encodes a specific type of relocation
    we currently expect.

    It will eventually turn into a member of an anonymous union when a
    subsequent patch adds implicit addend, or "rel", relocation support.

    Signed-off-by: Matt Helsley
    Signed-off-by: Josh Poimboeuf

    Matt Helsley
     

18 May, 2020

1 commit


11 May, 2020

1 commit

  • Pull x86 fixes from Thomas Gleixner:
    "A set of fixes for x86:

    - Ensure that direct mapping alias is always flushed when changing
    page attributes. The optimization for small ranges failed to do so
    when the virtual address was in the vmalloc or module space.

    - Unbreak the trace event registration for syscalls without arguments
    caused by the refactoring of the SYSCALL_DEFINE0() macro.

    - Move the printk in the TSC deadline timer code to a place where it
    is guaranteed to only be called once during boot and cannot be
    rearmed by clearing warn_once after boot. If it's invoked post boot
    then lockdep rightfully complains about a potential deadlock as the
    calling context is different.

    - A series of fixes for objtool and the ORC unwinder addressing a
    variety of small issues:

    - Stack offset tracking for indirect CFAs in objtool ignored
    subsequent pushes and pops

    - Repair the unwind hints in the register clearing entry ASM code

    - Make the unwinding in the low level exit to usermode code stop
    after switching to the trampoline stack. The unwind hint is no
    longer valid and the ORC unwinder emits a warning as it can't
    find the registers anymore.

    - Fix unwind hints in switch_to_asm() and rewind_stack_do_exit()
    which caused objtool to generate bogus ORC data.

    - Prevent unwinder warnings when dumping the stack of a
    non-current task as there is no way to be sure about the
    validity because the dumped stack can be a moving target.

    - Make the ORC unwinder behave the same way as the frame pointer
    unwinder when dumping an inactive task's stack and do not skip
    the first frame.

    - Prevent ORC unwinding before ORC data has been initialized

    - Immediately terminate unwinding when an unknown ORC entry type
    is found.

    - Prevent premature stop of the unwinder caused by IRET frames.

    - Fix another infinite loop in objtool caused by a negative
    offset which was not caught.

    - Address a few build warnings in the ORC unwinder and add
    missing static/ro_after_init annotations"

    * tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/unwind/orc: Move ORC sorting variables under !CONFIG_MODULES
    x86/apic: Move TSC deadline timer debug printk
    ftrace/x86: Fix trace event registration for syscalls without arguments
    x86/mm/cpa: Flush direct map alias during cpa
    objtool: Fix infinite loop in for_offset_range()
    x86/unwind/orc: Fix premature unwind stoppage due to IRET frames
    x86/unwind/orc: Fix error path for bad ORC entry type
    x86/unwind/orc: Prevent unwinding before ORC initialization
    x86/unwind/orc: Don't skip the first frame for inactive tasks
    x86/unwind: Prevent false warnings for non-current tasks
    x86/unwind/orc: Convert global variables to static
    x86/entry/64: Fix unwind hints in rewind_stack_do_exit()
    x86/entry/64: Fix unwind hints in __switch_to_asm()
    x86/entry/64: Fix unwind hints in kernel exit path
    x86/entry/64: Fix unwind hints in register clearing code
    objtool: Fix stack offset tracking for indirect CFAs

    Linus Torvalds
     

01 May, 2020

1 commit

  • Quoting Julien:

    "And the other suggestion in my other email was that you don't even
    need to add INSN_EXCEPTION_RETURN. You can keep IRET as
    INSN_CONTEXT_SWITCH by default and the x86 decoder looks up the symbol
    containing an iret. If it's a function symbol, it can just set the type
    to INSN_OTHER so that it carries on to the next instruction after
    having handled the stack_op."

    Suggested-by: Julien Thierry
    Signed-off-by: Miroslav Benes
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200428191659.913283807@infradead.org

    Miroslav Benes
     

26 Apr, 2020

1 commit

  • Randy reported that objtool got stuck in an infinite loop when
    processing drivers/i2c/busses/i2c-parport.o. It was caused by the
    following code:

    00000000000001fd :
    1fd: 48 b8 00 00 00 00 00 movabs $0x0,%rax
    204: 00 00 00
    1ff: R_X86_64_64 .rodata-0x8
    207: 41 55 push %r13
    209: 41 89 f5 mov %esi,%r13d
    20c: 41 54 push %r12
    20e: 49 89 fc mov %rdi,%r12
    211: 55 push %rbp
    212: 48 89 d5 mov %rdx,%rbp
    215: 53 push %rbx
    216: 0f b6 5a 01 movzbl 0x1(%rdx),%ebx
    21a: 48 8d 34 dd 00 00 00 lea 0x0(,%rbx,8),%rsi
    221: 00
    21e: R_X86_64_32S .rodata
    222: 48 89 f1 mov %rsi,%rcx
    225: 48 29 c1 sub %rax,%rcx

    find_jump_table() saw the .rodata reference and tried to find a jump
    table associated with it (though there wasn't one). The -0x8 rela
    addend is unusual. It caused find_jump_table() to send a negative
    table_offset (unsigned 0xfffffffffffffff8) to find_rela_by_dest().

    The negative offset should have been harmless, but it actually threw
    for_offset_range() for a loop... literally. When the mask value got
    incremented past the end value, it also wrapped to zero, causing the
    loop exit condition to remain true forever.

    Prevent this scenario from happening by ensuring the incremented value
    is always >= the starting value.
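
    The wrap-around and the guard against it can be reproduced in a few
    lines. The sketch below is not objtool's macro: STRIDE, STRIDE_MASK,
    count_strides() and the 16-byte stride are assumptions chosen for
    illustration. The loop condition checks both "not past the end" and
    "not below the start", which is the extra test the fix describes.

```c
#include <assert.h>
#include <stdint.h>

#define STRIDE_BITS	4
#define STRIDE		((uint64_t)1 << STRIDE_BITS)
#define STRIDE_MASK	(~(STRIDE - 1))

/* Count the stride-aligned buckets visited when scanning [start, end]. */
unsigned int count_strides(uint64_t start, uint64_t end)
{
	uint64_t first = start & STRIDE_MASK;
	uint64_t last = end & STRIDE_MASK;
	unsigned int n = 0;

	assert(start <= end);
	/*
	 * "o >= first" is the guard: once o wraps past the top of the
	 * address space it becomes smaller than first and the loop stops.
	 * With only "o <= last" the wrapped value 0 would pass the test
	 * and the loop would never terminate.
	 */
	for (uint64_t o = first; o >= first && o <= last; o += STRIDE)
		n++;
	return n;
}
```

    With start = 0xfffffffffffffff8 the first bucket is
    0xfffffffffffffff0; adding the stride wraps to 0, and the guard ends
    the loop after a single iteration.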

    Fixes: 74b873e49d92 ("objtool: Optimize find_rela_by_dest_range()")
    Reported-by: Randy Dunlap
    Tested-by: Randy Dunlap
    Acked-by: Randy Dunlap
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Ingo Molnar
    Cc: Julien Thierry
    Cc: Miroslav Benes
    Cc: Peter Zijlstra
    Link: https://lore.kernel.org/r/02b719674b031800b61e33c30b2e823183627c19.1587842122.git.jpoimboe@redhat.com

    Josh Poimboeuf
     

23 Apr, 2020

3 commits

  • 'struct elf *' handling is an open/close paradigm; make sure the naming
    matches that:

    elf_open_read()
    elf_write()
    elf_close()

    Acked-by: Josh Poimboeuf
    Signed-off-by: Ingo Molnar
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Sami Tolvanen
    Cc: Thomas Gleixner
    Link: https://lore.kernel.org/r/20200422103205.61900-3-mingo@kernel.org

    Ingo Molnar
     
  • In preparation to parallelize certain parts of objtool, map out which uses
    of various data structures are read-only vs. read-write.

    As a first step constify 'struct elf' pointer passing, most of the secondary
    uses of it in find_symbol_*() methods are read-only.

    Also, while at it, better group the 'struct elf' handling methods in elf.h.

    Acked-by: Josh Poimboeuf
    Signed-off-by: Ingo Molnar
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Sami Tolvanen
    Cc: Thomas Gleixner
    Link: https://lore.kernel.org/r/20200422103205.61900-2-mingo@kernel.org

    Ingo Molnar
     
  • Apparently there's people doing 64bit builds on 32bit machines.

    Fixes: 74b873e49d92 ("objtool: Optimize find_rela_by_dest_range()")
    Reported-by: youling257@gmail.com
    Signed-off-by: Peter Zijlstra (Intel)

    Peter Zijlstra
     

22 Apr, 2020

3 commits

  • When doing kbuild tests to see if the objtool changes affected those, I
    found that there was a measurable regression:

              pre          post

    real      1m13.594s    1m16.488s
    user      34m58.246s   35m23.947s
    sys       4m0.393s     4m27.312s

    Perf showed that for small files the increased hash-table sizes were a
    measurable difference. Since we already have -l "vmlinux" to
    distinguish between the modes, make it also use a smaller portion of
    the hash-tables.

    This flips it into a small win:

    real 1m14.143s
    user 34m49.292s
    sys 3m44.746s

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Reviewed-by: Alexandre Chartre
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200416115119.167588731@infradead.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Validate that any call out of .noinstr.text is in between
    instr_begin() and instr_end() annotations.

    This annotation is useful to ensure correct behaviour with respect to
    tracing-sensitive code like entry/exit and idle code. When we run code
    in a sensitive context we want a guarantee that no unknown code is run.

    Since this validation relies on knowing the section of call
    destination symbols, we must run it on vmlinux.o instead of on
    individual object files.
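
    The rule being validated can be sketched as a small state machine.
    This is a simplified, linear model with hypothetical names (enum
    insn_kind, struct insn, first_bad_call()); the real validation
    follows every code path rather than a flat instruction array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum insn_kind { INSN_OTHER, INSN_INSTR_BEGIN, INSN_INSTR_END, INSN_CALL };

struct insn {
	enum insn_kind kind;
	bool dest_noinstr;	/* for INSN_CALL: is the callee in .noinstr.text? */
};

/* Return the index of the first offending call, or -1 if the code is clean. */
int first_bad_call(const struct insn *insns, size_t n)
{
	int depth = 0;	/* > 0 while between instr_begin() and instr_end() */

	assert(insns || n == 0);
	for (size_t i = 0; i < n; i++) {
		switch (insns[i].kind) {
		case INSN_INSTR_BEGIN:
			depth++;
			break;
		case INSN_INSTR_END:
			depth--;
			break;
		case INSN_CALL:
			/* Leaving .noinstr.text is only allowed inside a region. */
			if (depth <= 0 && !insns[i].dest_noinstr)
				return (int)i;
			break;
		default:
			break;
		}
	}
	return -1;
}
```

    The dest_noinstr flag stands in for "the section of the call
    destination symbol", which is only known once everything is linked
    into vmlinux.o.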

    Add two options:

    -d/--duplicate "duplicate validation for vmlinux"
    -l/--vmlinux "vmlinux.o validation"

    Where the latter auto-detects when objname ends with "vmlinux.o" and
    the former will force all validations, also those already done on
    !vmlinux object files.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Reviewed-by: Alexandre Chartre
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200416115119.106268040@infradead.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Apparently there's people doing 64bit builds on 32bit machines.

    Fixes: 74b873e49d92 ("objtool: Optimize find_rela_by_dest_range()")
    Reported-by: youling257@gmail.com
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

26 Mar, 2020

8 commits

  • Perf shows there is significant time in find_rela_by_dest(); this is
    because we have to iterate the address space per byte, looking for
    relocation entries.

    Optimize this by reducing the address space granularity.

    This reduces objtool on vmlinux.o runtime from 4.8 to 4.4 seconds.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200324160924.861321325@infradead.org

    Peter Zijlstra
     
  • Perf showed that __hash_init() is a significant portion of
    read_sections(), so instead of doing a per section rela_hash, use an
    elf-wide rela_hash.

    Statistics show us there are about 1.1 million relas, so size it
    accordingly.

    This reduces the objtool on vmlinux.o runtime to a third, from 15 to 5
    seconds.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200324160924.739153726@infradead.org

    Peter Zijlstra
     
  • Perf showed that find_symbol_by_name() takes time; add a symbol name
    hash.
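
    A name hash of this kind might look like the sketch below. It is an
    illustration only: the table size, the djb2 string hash and the
    struct layout are assumptions, not objtool's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_HASH_BITS	8
#define NAME_HASH_SIZE	(1u << NAME_HASH_BITS)

struct symbol {
	const char *name;
	struct symbol *next;	/* chain of symbols sharing a bucket */
};

struct symbol_table {
	struct symbol *buckets[NAME_HASH_SIZE];
};

/* djb2 string hash, used here only as a stand-in. */
static unsigned int name_hash(const char *s)
{
	unsigned int h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h & (NAME_HASH_SIZE - 1);
}

void symbol_add(struct symbol_table *t, struct symbol *sym)
{
	unsigned int b;

	assert(t && sym && sym->name);
	b = name_hash(sym->name);
	sym->next = t->buckets[b];
	t->buckets[b] = sym;
}

/* Only the symbols in one bucket are compared, not the whole symbol list. */
struct symbol *find_symbol_by_name(const struct symbol_table *t, const char *name)
{
	assert(t && name);
	for (struct symbol *s = t->buckets[name_hash(name)]; s; s = s->next)
		if (!strcmp(s->name, name))
			return s;
	return NULL;
}
```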

    This shaves another second off of objtool on vmlinux.o runtime, down
    to 15 seconds.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200324160924.676865656@infradead.org

    Peter Zijlstra
     
  • For consistency; we have:

    find_symbol_by_offset() / find_symbol_containing()
    find_func_by_offset() / find_containing_func()

    fix that.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200324160924.558470724@infradead.org

    Peter Zijlstra
     
  • All of:

    read_symbols(), find_symbol_by_offset(), find_symbol_containing(),
    find_containing_func()

    do a linear search of the symbols. Add an RB tree to make it go
    faster.
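
    The effect of replacing the linear search with an ordered structure
    can be sketched with a sorted array and binary search. Note the
    substitution: this is not an RB tree, only a simpler structure with
    the same logarithmic lookup, and the struct layout and function
    signature are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct symbol {
	const char *name;
	uint64_t offset;
	uint64_t len;
};

/*
 * syms must be sorted by offset and must not overlap. Returns the symbol
 * whose range [offset, offset + len) contains the given offset, or NULL.
 */
const struct symbol *find_symbol_containing(const struct symbol *syms,
					    size_t n, uint64_t offset)
{
	size_t lo = 0, hi = n;

	assert(syms || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct symbol *s = &syms[mid];

		if (offset < s->offset)
			hi = mid;		/* target lies before this symbol */
		else if (offset >= s->offset + s->len)
			lo = mid + 1;		/* target lies after this symbol */
		else
			return s;
	}
	return NULL;
}
```

    A tree has the advantage over a sorted array that symbols can be
    inserted one at a time while the ELF file is being read.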

    This about halves objtool runtime on vmlinux.o, from 34s to 18s.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200324160924.499016559@infradead.org

    Peter Zijlstra
     
  • In order to avoid yet another linear search of (20k) sections, add a
    name based hash.

    This reduces objtool runtime on vmlinux.o by some 10s to around 35s.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200324160924.440174280@infradead.org

    Peter Zijlstra
     
  • In order to avoid a linear search (over 20k entries), add a
    section_hash to the elf object.

    This reduces objtool on vmlinux.o from a few minutes to around 45
    seconds.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200324160924.381249993@infradead.org

    Peter Zijlstra
     
  • The symbol index is object wide, not per section, so it makes no sense
    to have the symbol_hash be part of the section object. By moving it to
    the elf object we avoid the linear sections iteration.

    This reduces the runtime of objtool on vmlinux.o from over 3 hours (I
    gave up) to a few minutes. The defconfig vmlinux.o has around 20k
    sections.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20200324160924.261852348@infradead.org

    Peter Zijlstra
     

21 Feb, 2020

1 commit

  • A recent clang change, combined with a binutils bug, can trigger a
    situation where a ".Lprintk$local" STT_NOTYPE symbol gets created at the
    same offset as the "printk" STT_FUNC symbol. This confuses objtool:

    kernel/printk/printk.o: warning: objtool: ignore_loglevel_setup()+0x10: can't find call dest symbol at .text+0xc67

    Improve the call destination detection by looking specifically for an
    STT_FUNC symbol.
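
    A lookup that filters on the symbol type could be sketched like this.
    The struct layout and the linear scan are simplifications chosen for
    illustration; only the STT_FUNC and STT_NOTYPE constants come from
    the standard <elf.h>.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

struct symbol {
	const char *name;
	uint64_t offset;
	unsigned char type;	/* STT_FUNC, STT_NOTYPE, ... from <elf.h> */
};

/* Return the function symbol at exactly this offset, skipping other types. */
const struct symbol *find_func_by_offset(const struct symbol *syms,
					 size_t n, uint64_t offset)
{
	assert(syms || n == 0);
	for (size_t i = 0; i < n; i++)
		if (syms[i].offset == offset && syms[i].type == STT_FUNC)
			return &syms[i];
	return NULL;
}
```

    With ".Lprintk$local" (STT_NOTYPE) and "printk" (STT_FUNC) at the
    same offset, a type-blind lookup may return either one, while this
    version always returns "printk".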

    Reported-by: Nick Desaulniers
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Borislav Petkov
    Tested-by: Nick Desaulniers
    Tested-by: Nathan Chancellor
    Link: https://github.com/ClangBuiltLinux/linux/issues/872
    Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25551
    Link: https://lkml.kernel.org/r/0a7ee320bc0ea4469bd3dc450a7b4725669e0ea9.1581997059.git.jpoimboe@redhat.com

    Josh Poimboeuf
     

19 Jul, 2019

2 commits

  • This fixes objtool for both a GCC issue and a Clang issue:

    1) GCC issue:

    kernel/bpf/core.o: warning: objtool: ___bpf_prog_run()+0x8d5: sibling call from callable instruction with modified stack frame

    With CONFIG_RETPOLINE=n, GCC is doing the following optimization in
    ___bpf_prog_run().

    Before:

    select_insn:
    jmp *jumptable(,%rax,8)
    ...
    ALU64_ADD_X:
    ...
    jmp select_insn
    ALU_ADD_X:
    ...
    jmp select_insn

    After:

    select_insn:
    jmp *jumptable(, %rax, 8)
    ...
    ALU64_ADD_X:
    ...
    jmp *jumptable(, %rax, 8)
    ALU_ADD_X:
    ...
    jmp *jumptable(, %rax, 8)

    This confuses objtool. It has never seen multiple indirect jump
    sites which use the same jump table.

    For GCC switch tables, the only way of detecting the size of a table
    is by continuing to scan for more tables. The size of the previous
    table can only be determined after another switch table is found, or
    when the scan reaches the end of the function.

    That logic was reused for C jump tables, and was based on the
    assumption that each jump table only has a single jump site. The
    above optimization breaks that assumption.

    2) Clang issue:

    drivers/usb/misc/sisusbvga/sisusb.o: warning: objtool: sisusb_write_mem_bulk()+0x588: can't find switch jump table

    With clang 9, code can be generated where a function contains two
    indirect jump instructions which use the same switch table.

    The fix is the same for both issues: split the jump table parsing into
    two passes.

    In the first pass, locate the heads of all switch tables for the
    function and mark their locations.

    In the second pass, parse the switch tables and add them.
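
    The two passes can be sketched over a simplified model in which the
    jump table area is an array of slots and each indirect jump refers
    to the slot where its table starts. All names here (MAX_SLOTS,
    struct table_heads, mark_table_heads(), table_length()) are
    hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 64	/* hypothetical cap on jump table slots per function */

struct table_heads {
	bool is_head[MAX_SLOTS];
};

/* Pass 1: mark the slot each indirect jump refers to as a table head. */
void mark_table_heads(struct table_heads *th, const size_t *jump_refs,
		      size_t njumps)
{
	assert(th && (jump_refs || njumps == 0));
	for (size_t i = 0; i < njumps; i++) {
		assert(jump_refs[i] < MAX_SLOTS);
		/* Two jumps sharing a table simply mark the same head twice. */
		th->is_head[jump_refs[i]] = true;
	}
}

/* Pass 2: a table runs from its head to the next head or the last slot. */
size_t table_length(const struct table_heads *th, size_t head, size_t nslots)
{
	size_t len = 0;

	assert(th && head < nslots && nslots <= MAX_SLOTS);
	for (size_t i = head; i < nslots; i++) {
		if (i != head && th->is_head[i])
			break;
		len++;
	}
	return len;
}
```

    Because all heads are known before any table is sized, several jump
    sites sharing one table no longer affect where a table is taken to
    end.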

    Fixes: e55a73251da3 ("bpf: Fix ORC unwinding in non-JIT BPF code")
    Reported-by: Randy Dunlap
    Reported-by: Arnd Bergmann
    Signed-off-by: Jann Horn
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Tested-by: Nick Desaulniers
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/e995befaada9d4d8b2cf788ff3f566ba900d2b4d.1563413318.git.jpoimboe@redhat.com

    Co-developed-by: Josh Poimboeuf

    Jann Horn
     
  • Now that C jump tables are supported, call them "jump tables" instead of
    "switch tables". Also rename some other variables, add comments, and
    simplify the code flow a bit.

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Tested-by: Nick Desaulniers
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/cf951b0c0641628e0b9b81f7ceccd9bcabcb4bd8.1563413318.git.jpoimboe@redhat.com

    Josh Poimboeuf
     

18 Jul, 2019

1 commit

  • The elftoolchain version of libelf has a function named elf_open().

    The function name isn't quite accurate anyway, since it also reads all
    the ELF data. Rename it to elf_read(), which is more accurate.

    [ jpoimboe: rename to elf_read(); write commit description ]

    Signed-off-by: Michael Forney
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/7ce2d1b35665edf19fd0eb6fbc0b17b81a48e62f.1562793604.git.jpoimboe@redhat.com

    Michael Forney
     

21 May, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details you
    should have received a copy of the gnu general public license along
    with this program if not see http www gnu org licenses

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details [based]
    [from] [clk] [highbank] [c] you should have received a copy of the
    gnu general public license along with this program if not see http
    www gnu org licenses

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 355 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Kate Stewart
    Reviewed-by: Jilayne Lovejoy
    Reviewed-by: Steve Winslow
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

03 Apr, 2019

2 commits

  • It is important that UACCESS regions are as small as possible;
    furthermore the UACCESS state is not scheduled, so doing anything that
    might directly call into the scheduler will cause random code to be
    run with UACCESS enabled.

    Teach objtool to track UACCESS state and warn about any CALL made
    while UACCESS is enabled. This very much includes the __fentry__()
    and __preempt_schedule() calls.
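
    The tracking can be sketched as a single flag carried along the
    instruction stream. This is a linear simplification with
    hypothetical names; on x86, STAC enables and CLAC disables user
    access, and the real check follows all branches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum insn_kind { INSN_OTHER, INSN_STAC, INSN_CLAC, INSN_CALL };

/* Count the CALL instructions reached while UACCESS is enabled. */
size_t count_uaccess_calls(const enum insn_kind *insns, size_t n)
{
	bool uaccess = false;
	size_t warnings = 0;

	assert(insns || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (insns[i] == INSN_STAC)
			uaccess = true;		/* user access enabled */
		else if (insns[i] == INSN_CLAC)
			uaccess = false;	/* user access disabled */
		else if (insns[i] == INSN_CALL && uaccess)
			warnings++;		/* CALL inside a UACCESS region */
	}
	return warnings;
}
```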

    Note that exceptions _do_ save/restore the UACCESS state, and therefore
    they can drive preemption. This also means that all exception handlers
    must have an otherwise redundant UACCESS disable instruction;
    therefore ignore this warning for !STT_FUNC code (exception handlers
    are not normal functions).

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Josh Poimboeuf
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Function aliases result in different symbols for the same set of
    instructions; track a canonical symbol so there is a unique point of
    access.

    This again prepares the way for function attributes. And in particular
    the need for aliases comes from how KASAN uses them.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Josh Poimboeuf
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

08 Sep, 2018

1 commit

  • Add support for processing switch jump tables in objects with multiple
    .rodata sections, such as those created by '-ffunction-sections' and
    '-fdata-sections'. Currently, objtool always looks in .rodata for jump
    table information, which results in many "sibling call from callable
    instruction with modified stack frame" warnings with objects compiled
    using those flags.

    The fix consists of three parts:

    1. Flagging all .rodata sections when importing ELF information for
    easier checking later.

    2. Keeping a reference to the section each relocation is from in order
    to get the list_head for the other relocations in that section.

    3. Finding jump tables by following relocations to .rodata sections,
    rather than always referencing a single global .rodata section.

    The patch has been tested without data sections enabled and no
    differences in the resulting orc unwind information were seen.

    Note that as objtool adds terminators to the end of each .text section,
    the unwind information generated by a function+data sections build and
    by a normal build isn't directly comparable. Manual inspection suggests
    that objtool is now generating the correct information, or at least
    making more of an effort to do so than it did previously.

    Signed-off-by: Allan Xavier
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/099bdc375195c490dda04db777ee0b95d566ded1.1536325914.git.jpoimboe@redhat.com

    Allan Xavier
     

14 May, 2018

1 commit

  • GCC 8 moves a lot of unlikely code out of line to "cold" subfunctions in
    .text.unlikely. Properly detect the new subfunctions and treat them as
    extensions of the original functions.

    This fixes a bunch of warnings like:

    kernel/cgroup/cgroup.o: warning: objtool: parse_cgroup_root_flags()+0x33: sibling call from callable instruction with modified stack frame
    kernel/cgroup/cgroup.o: warning: objtool: cgroup_addrm_files()+0x290: sibling call from callable instruction with modified stack frame
    kernel/cgroup/cgroup.o: warning: objtool: cgroup_apply_control_enable()+0x25b: sibling call from callable instruction with modified stack frame
    kernel/cgroup/cgroup.o: warning: objtool: rebind_subsystems()+0x325: sibling call from callable instruction with modified stack frame

    Reported-and-tested-by: damian
    Reported-by: Arnd Bergmann
    Signed-off-by: Josh Poimboeuf
    Acked-by: Peter Zijlstra (Intel)
    Cc: David Laight
    Cc: Greg KH
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Randy Dunlap
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/0965e7fcfc5f31a276f0c7f298ff770c19b68706.1525923412.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

18 Jul, 2017

1 commit

  • Now that objtool knows the states of all registers on the stack for each
    instruction, it's straightforward to generate debuginfo for an unwinder
    to use.

    Instead of generating DWARF, generate a new format called ORC, which is
    more suitable for an in-kernel unwinder. See
    Documentation/x86/orc-unwinder.txt for a more detailed description of
    this new debuginfo format and why it's preferable to DWARF.

    Signed-off-by: Josh Poimboeuf
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Jiri Slaby
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: live-patching@vger.kernel.org
    Link: http://lkml.kernel.org/r/c9b9f01ba6c5ed2bdc9bb0957b78167fdbf9632e.1499786555.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

30 Jun, 2017

1 commit

  • This is a major rewrite of objtool. Instead of only tracking frame
    pointer changes, it now tracks all stack-related operations, including
    all register saves/restores.

    In addition to making stack validation more robust, this also paves the
    way for undwarf generation.

    Signed-off-by: Josh Poimboeuf
    Cc: Andy Lutomirski
    Cc: Jiri Slaby
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: live-patching@vger.kernel.org
    Link: http://lkml.kernel.org/r/678bd94c0566c6129bcc376cddb259c4c5633004.1498659915.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

07 Mar, 2017

1 commit

  • Arnd Bergmann reported a (false positive) objtool warning:

    drivers/infiniband/sw/rxe/rxe_resp.o: warning: objtool: rxe_responder()+0xfe: sibling call from callable instruction with changed frame pointer

    The issue is in find_switch_table(). It tries to find a switch
    statement's jump table by walking backwards from an indirect jump
    instruction, looking for a relocation to the .rodata section. In this
    case it stopped walking prematurely: the first .rodata relocation it
    encountered was for a variable (resp_state_name) instead of a jump
    table, so it just assumed there wasn't a jump table.

    The fix is to ignore any .rodata relocation which refers to an ELF
    object symbol. This works because the jump tables are anonymous and
    have no symbols associated with them.

    Reported-by: Arnd Bergmann
    Tested-by: Arnd Bergmann
    Signed-off-by: Josh Poimboeuf
    Cc: Denys Vlasenko
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: 3732710ff6f2 ("objtool: Improve rare switch jump table pattern detection")
    Link: http://lkml.kernel.org/r/20170302225723.3ndbsnl4hkqbne7a@treble
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

17 May, 2016

1 commit

  • The switch to elf_getshdr{num,strndx} post-dates the oldest tool chain
    the kernel is supposed to be able to build with, so try to cope with
    such an environment.

    Signed-off-by: Jan Beulich
    Signed-off-by: Josh Poimboeuf
    Cc: # for v4.6
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jan Beulich
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/732dae6872b7ff187d94f22bb699a12849d3fe04.1463430618.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Jan Beulich
     

09 Mar, 2016

2 commits

  • Use hash tables for instruction and rela lookups (and keep the linked
    lists around for sequential access).

    Also cache the section struct for the "__func_stack_frame_non_standard"
    section.

    With this change, "objtool check net/wireless/nl80211.o" goes from:

    real 0m1.168s
    user 0m1.163s
    sys 0m0.005s

    to:

    real 0m0.059s
    user 0m0.042s
    sys 0m0.017s

    for a 20x speedup.

    With the same object, it should be noted that the memory heap usage grew
    from 8MB to 62MB. Reducing the memory usage is on the TODO list.

    Reported-by: Ingo Molnar
    Signed-off-by: Josh Poimboeuf
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Arnaldo Carvalho de Melo
    Cc: Arnaldo Carvalho de Melo
    Cc: Bernd Petrovitsch
    Cc: Borislav Petkov
    Cc: Chris J Arges
    Cc: Jiri Slaby
    Cc: Linus Torvalds
    Cc: Michal Marek
    Cc: Namhyung Kim
    Cc: Pedro Alves
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: live-patching@vger.kernel.org
    Link: http://lkml.kernel.org/r/dd0d8e1449506cfa7701b4e7ba73577077c44253.1457502970.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     
  • Rename some list heads to distinguish them from hash node heads, which
    are added later in the patch series.

    Also rename the get_*() functions to add_*(), which is more descriptive:
    they "add" data to the objtool_file struct.

    Also rename rodata_rela and text_rela to be clearer:
    - text_rela refers to a rela entry in .rela.text.
    - rodata_rela refers to a rela entry in .rela.rodata.

    Signed-off-by: Josh Poimboeuf
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Arnaldo Carvalho de Melo
    Cc: Arnaldo Carvalho de Melo
    Cc: Bernd Petrovitsch
    Cc: Borislav Petkov
    Cc: Chris J Arges
    Cc: Jiri Slaby
    Cc: Linus Torvalds
    Cc: Michal Marek
    Cc: Namhyung Kim
    Cc: Pedro Alves
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: live-patching@vger.kernel.org
    Link: http://lkml.kernel.org/r/ee0eca2bba8482aa45758958c5586c00a7b71e62.1457502970.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

29 Feb, 2016

1 commit

  • This adds a host tool named objtool. Its "check" subcommand analyzes
    .o files to ensure the validity of stack metadata. It enforces a set
    of rules on asm code and C inline assembly code so that stack traces
    can be reliable.

    For each function, it recursively follows all possible code paths and
    validates the correct frame pointer state at each instruction.
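
    As a rough illustration of the path-following idea (not objtool's
    actual code), the toy validator below walks a made-up instruction
    list, recursing at conditional jumps and tracking whether a frame
    has been set up. The opcode names ("frame_begin", "frame_end",
    "jcc", ...) and the warning strings are assumptions invented for
    this sketch.

```python
def validate(insns, idx=0, has_frame=False, seen=None, warnings=None):
    """Follow every code path of a toy function and check frame state.

    insns is a list of (op, arg) tuples using hypothetical opcodes:
    frame_begin, frame_end, call, jmp, jcc, ret, nop.
    """
    seen = set() if seen is None else seen
    warnings = [] if warnings is None else warnings
    while idx < len(insns):
        if (idx, has_frame) in seen:
            return warnings              # this path/state was already checked
        seen.add((idx, has_frame))
        op, arg = insns[idx]
        if op == "frame_begin":
            has_frame = True
        elif op == "frame_end":
            has_frame = False
        elif op == "call" and not has_frame:
            warnings.append(f"{idx}: call without frame pointer setup")
        elif op == "ret":
            if has_frame:
                warnings.append(f"{idx}: return with frame still set up")
            return warnings
        elif op == "jmp":
            idx = arg                    # unconditional: only the target path
            continue
        elif op == "jcc":
            # conditional: check the taken path, then fall through below
            validate(insns, arg, has_frame, seen, warnings)
        idx += 1
    warnings.append("fell through the end of the function")
    return warnings


good = [("frame_begin", None), ("jcc", 4), ("call", "foo"),
        ("nop", None), ("frame_end", None), ("ret", None)]
# The branch at index 0 skips the frame setup at index 1.
bad = [("jcc", 3), ("frame_begin", None), ("jmp", 3),
       ("call", "foo"), ("frame_end", None), ("ret", None)]

print(validate(good))  # []
print(validate(bad))   # ['3: call without frame pointer setup']
```

    The second function only misbehaves on one of its two paths, which
    is why every path has to be followed rather than just the
    straight-line one.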

    It also follows code paths involving kernel special sections, like
    .altinstructions, __jump_table, and __ex_table, which can add
    alternative execution paths to a given instruction (or set of
    instructions). Similarly, it knows how to follow switch statements, for
    which gcc sometimes uses jump tables.

    Here are some of the benefits of validating stack metadata:

    a) More reliable stack traces for frame pointer enabled kernels

    Frame pointers are used for debugging purposes. They allow runtime
    code and debug tools to walk the stack to determine the chain of
    function call sites that led to the currently executing code.

    For some architectures, frame pointers are enabled by
    CONFIG_FRAME_POINTER. For some other architectures they may be
    required by the ABI (sometimes referred to as "backchain pointers").

    For C code, gcc automatically generates instructions for setting up
    frame pointers when the -fno-omit-frame-pointer option is used.

    But for asm code, the frame setup instructions have to be written by
    hand, which most people don't do. So the end result is that
    CONFIG_FRAME_POINTER is honored for C code but not for most asm code.

    For stack traces based on frame pointers to be reliable, all
    functions which call other functions must first create a stack frame
    and update the frame pointer. If a first function doesn't properly
    create a stack frame before calling a second function, the *caller*
    of the first function will be skipped on the stack trace.

    For example, consider the following backtrace with frame pointers
    enabled:

    dump_stack+0x4b/0x63
    cmdline_proc_show+0x12/0x30
    seq_read+0x108/0x3e0
    proc_reg_read+0x42/0x70
    __vfs_read+0x37/0x100
    vfs_read+0x86/0x130
    SyS_read+0x58/0xd0
    entry_SYSCALL_64_fastpath+0x12/0x76

    It correctly shows that the caller of cmdline_proc_show() is
    seq_read().

    If we remove the frame pointer logic from cmdline_proc_show() by
    replacing the frame pointer related instructions with nops, here's
    what it looks like instead:

    dump_stack+0x4b/0x63
    cmdline_proc_show+0x12/0x30
    proc_reg_read+0x42/0x70
    __vfs_read+0x37/0x100
    vfs_read+0x86/0x130
    SyS_read+0x58/0xd0
    entry_SYSCALL_64_fastpath+0x12/0x76

    Notice that cmdline_proc_show()'s caller, seq_read(), has been
    skipped. Instead the stack trace seems to show that
    cmdline_proc_show() was called by proc_reg_read().

    The benefit of "objtool check" here is that because it ensures that
    *all* functions honor CONFIG_FRAME_POINTER, no functions will ever[*]
    be skipped on a stack trace.

    [*] unless an interrupt or exception has occurred at the very
    beginning of a function before the stack frame has been created,
    or at the very end of the function after the stack frame has been
    destroyed. This is an inherent limitation of frame pointers.

    b) 100% reliable stack traces for DWARF enabled kernels

    This is not yet implemented. For more details about what is planned,
    see tools/objtool/Documentation/stack-validation.txt.

    c) Higher live patching compatibility rate

    This is not yet implemented. For more details about what is planned,
    see tools/objtool/Documentation/stack-validation.txt.
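
    The skipped-caller effect described under (a) can be reproduced
    with a small simulation. This is only a sketch of how a frame
    pointer chain is walked, assuming a simplified stack in which each
    call pushes a return address (represented here by the caller's
    name) and each frame setup pushes the old frame pointer. The
    function frame_pointer_backtrace() and its no_frame parameter are
    hypothetical names for this example.

```python
def frame_pointer_backtrace(call_chain, no_frame=()):
    """Simulate a call chain (outermost first) and unwind it by frame pointer.

    Functions listed in no_frame skip the frame setup, like asm code
    that does not honor CONFIG_FRAME_POINTER.
    """
    stack = []      # toy stack; list index plays the role of an address
    fp = None       # frame pointer register
    caller = None
    for func in call_chain:
        stack.append(caller)          # 'call' pushes the return address
        if func not in no_frame:
            stack.append(fp)          # save the old frame pointer
            fp = len(stack) - 1       # point the frame pointer at it
        caller = func

    trace = [call_chain[-1]]          # the function doing the walk
    while fp is not None:
        ret = stack[fp - 1]           # return address stored next to saved fp
        if ret is not None:
            trace.append(ret)
        fp = stack[fp]                # hop to the previous frame
    return trace


chain = ["entry_SYSCALL_64_fastpath", "SyS_read", "vfs_read", "__vfs_read",
         "proc_reg_read", "seq_read", "cmdline_proc_show", "dump_stack"]

full = frame_pointer_backtrace(chain)
broken = frame_pointer_backtrace(chain, no_frame={"cmdline_proc_show"})
print("seq_read" in full)    # True
print("seq_read" in broken)  # False
```

    With cmdline_proc_show() left without a frame, the walk jumps from
    dump_stack()'s frame straight to seq_read()'s frame, whose stored
    return address points into proc_reg_read(), so seq_read() itself
    never appears.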

    To achieve the validation, "objtool check" enforces the following rules:

    1. Each callable function must be annotated as such with the ELF
    function type. In asm code, this is typically done using the
    ENTRY/ENDPROC macros. If objtool finds a return instruction
    outside of a function, it flags an error since that usually indicates
    callable code which should be annotated accordingly.

    This rule is needed so that objtool can properly identify each
    callable function in order to analyze its stack metadata.

    2. Conversely, each section of code which is *not* callable should *not*
    be annotated as an ELF function. The ENDPROC macro shouldn't be used
    in this case.

    This rule is needed so that objtool can ignore non-callable code.
    Such code doesn't have to follow any of the other rules.

    3. Each callable function which calls another function must have the
    correct frame pointer logic, if required by CONFIG_FRAME_POINTER or
    the architecture's back chain rules. This can be done in asm code
    with the FRAME_BEGIN/FRAME_END macros.

    This rule ensures that frame pointer based stack traces will work as
    designed. If function A doesn't create a stack frame before calling
    function B, the _caller_ of function A will be skipped on the stack
    trace.

    4. Dynamic jumps and jumps to undefined symbols are only allowed if:

    a) the jump is part of a switch statement; or

    b) the jump matches sibling call semantics and the frame pointer has
    the same value it had on function entry.

    This rule is needed so that objtool can reliably analyze all of a
    function's code paths. If a function jumps to code in another file,
    and it's not a sibling call, objtool has no way to follow the jump
    because it only analyzes a single file at a time.

    5. A callable function may not execute kernel entry/exit instructions.
    The only code which needs such instructions is kernel entry code,
    which shouldn't be in callable functions anyway.

    This rule is just a sanity check to ensure that callable functions
    return normally.
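
    Rule 4 in particular reads naturally as a predicate. The sketch
    below restates it under the assumption that each jump has already
    been classified; the Jump fields and jump_allowed() are
    hypothetical names for this example, not objtool data structures.

```python
from dataclasses import dataclass


@dataclass
class Jump:
    dynamic: bool            # target is computed at run time
    target_defined: bool     # target symbol is defined in this object file
    in_switch: bool          # jump is part of a switch statement
    sibling_call: bool       # jump matches sibling call semantics
    fp_at_entry_value: bool  # frame pointer equals its value at function entry


def jump_allowed(j):
    """Return True if rule 4 permits this jump."""
    if not j.dynamic and j.target_defined:
        return True          # ordinary jump that can be followed in this file
    return j.in_switch or (j.sibling_call and j.fp_at_entry_value)


# A tail call to another file is fine only if the frame pointer was restored.
tail_ok = Jump(dynamic=False, target_defined=False, in_switch=False,
               sibling_call=True, fp_at_entry_value=True)
tail_bad = Jump(dynamic=False, target_defined=False, in_switch=False,
                sibling_call=True, fp_at_entry_value=False)
print(jump_allowed(tail_ok), jump_allowed(tail_bad))  # True False
```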

    It currently only supports x86_64. I tried to make the code generic so
    that support for other architectures can hopefully be plugged in
    relatively easily.

    On my Lenovo laptop with an i7-4810MQ 4-core/8-thread CPU, building the
    kernel with objtool checking every .o file adds about three seconds of
    total build time. It hasn't been optimized for performance yet, so
    there are probably some opportunities for better build performance.

    Signed-off-by: Josh Poimboeuf
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Arnaldo Carvalho de Melo
    Cc: Bernd Petrovitsch
    Cc: Borislav Petkov
    Cc: Chris J Arges
    Cc: Jiri Slaby
    Cc: Linus Torvalds
    Cc: Michal Marek
    Cc: Namhyung Kim
    Cc: Pedro Alves
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: live-patching@vger.kernel.org
    Link: http://lkml.kernel.org/r/f3efb173de43bd067b060de73f856567c0fa1174.1456719558.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf