15 Aug, 2018

2 commits

  • Currently, the addresses of PTI entry trampolines are not exported to
    user space. Kernel profiling tools need these addresses to identify the
    kernel code, so add a symbol and address for each CPU's PTI entry
    trampoline.

    Signed-off-by: Alexander Shishkin
    Acked-by: Andi Kleen
    Acked-by: Peter Zijlstra (Intel)
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Jiri Olsa
    Cc: Joerg Roedel
    Cc: Thomas Gleixner
    Cc: x86@kernel.org
    Link: http://lkml.kernel.org/r/1528289651-4113-3-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexander Shishkin
     
  • The logic in update_iter_mod() is overcomplicated and gets worse every
    time another get_ksymbol_* function is added.

    In preparation for adding another get_ksymbol_* function, simplify logic
    in update_iter_mod().

    Signed-off-by: Adrian Hunter
    Tested-by: (ftrace changes only) Steven Rostedt (VMware)
    Acked-by: Andi Kleen
    Acked-by: Peter Zijlstra (Intel)
    Cc: Alexander Shishkin
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Jiri Olsa
    Cc: Joerg Roedel
    Cc: Thomas Gleixner
    Cc: x86@kernel.org
    Link: http://lkml.kernel.org/r/1528289651-4113-2-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

02 Feb, 2018

1 commit

  • Pull printk updates from Petr Mladek:

    - Add a console_msg_format command line option:

    The value "default" keeps the old "[time stamp] text\n" format. The
    value "syslog" selects the syslog-like "<log_level>[timestamp] text" format.

    This feature was requested by people doing regression tests, for
    example, the 0day robot. They want to have both filtered and full logs
    at hand.

    - Reduce the risk of softlockup:

    Pass the console owner in a busy loop.

    This is a new approach to the old problem. It was first proposed by
    Steven Rostedt at Kernel Summit 2017. It marks a context in which
    the console_lock owner calls console drivers and cannot sleep.
    On the other side, printk() callers could detect this state and use
    a busy wait instead of a simple console_trylock(). Finally, the
    console_lock owner checks if there is a busy waiter at the end of
    the special context and eventually passes the console_lock to the
    waiter.

    The hand-off works surprisingly well and helps in many situations.
    Well, there is still a possibility of the softlockup, for example,
    when the flood of messages stops and the last owner still has too
    much to flush.

    There is an increasing number of people having problems with
    printk-related softlockups. We might eventually need a better
    solution. Anyway, this looks like a good start and a promising
    direction.

    - Do not allow scheduling in console_unlock() called from printk():

    This reverts an older controversial commit. The reschedule helped
    to avoid softlockups. But it also slowed down the console output.
    This patch is obsoleted by the new console waiter logic described
    above. In fact, the reschedule made the hand-off less effective.

    - Deprecate "%pf" and "%pF" format specifiers:

    They were needed on ia64, ppc64 and parisc64 to dereference function
    descriptors and show the real function address. This is now done
    transparently by the "%ps" and "%pS" format specifiers.

    Sergey Senozhatsky found that all the function descriptors were in
    a special elf section and could be easily detected.

    - Remove print_symbol() API:

    It has been obsoleted by the "%pS" format specifier, and this change
    helped to remove a few continuation lines and a less intuitive old API.

    - Remove redundant memsets:

    Sergey removed unnecessary memset when processing printk.devkmsg
    command line option.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (27 commits)
    printk: drop redundant devkmsg_log_str memsets
    printk: Never set console_may_schedule in console_trylock()
    printk: Hide console waiter logic into helpers
    printk: Add console owner and waiter logic to load balance console writes
    kallsyms: remove print_symbol() function
    checkpatch: add pF/pf deprecation warning
    symbol lookup: introduce dereference_symbol_descriptor()
    parisc64: Add .opd based function descriptor dereference
    powerpc64: Add .opd based function descriptor dereference
    ia64: Add .opd based function descriptor dereference
    sections: split dereference_function_descriptor()
    openrisc: Fix conflicting types for _exext and _stext
    lib: do not use print_symbol()
    irq debug: do not use print_symbol()
    sysfs: do not use print_symbol()
    drivers: do not use print_symbol()
    x86: do not use print_symbol()
    unicore32: do not use print_symbol()
    sh: do not use print_symbol()
    mn10300: do not use print_symbol()
    ...

    Linus Torvalds
     

22 Jan, 2018

1 commit


16 Jan, 2018

1 commit

  • No more print_symbol()/__print_symbol() users left, remove these
    symbols.

    It was a very old API that encouraged people to use continuation lines.
    It had been obsoleted by the %pS format specifier in a normal printk()
    call.

    Link: http://lkml.kernel.org/r/20180105102538.GC471@jagdpanzerIV
    Cc: Andrew Morton
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Mark Salter
    Cc: Tony Luck
    Cc: David Howells
    Cc: Yoshinori Sato
    Cc: Guan Xuetao
    Cc: Borislav Petkov
    Cc: Greg Kroah-Hartman
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Vineet Gupta
    Cc: Fengguang Wu
    Cc: Steven Rostedt
    Cc: LKML
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-am33-list@redhat.com
    Cc: linux-sh@vger.kernel.org
    Cc: linux-edac@vger.kernel.org
    Cc: x86@kernel.org
    Cc: linux-snps-arc@lists.infradead.org
    Cc: Sergey Senozhatsky
    Signed-off-by: Sergey Senozhatsky
    Suggested-by: Joe Perches
    [pmladek@suse.com: updated commit message]
    Signed-off-by: Petr Mladek

    Sergey Senozhatsky
     

09 Jan, 2018

1 commit

  • dereference_symbol_descriptor() invokes appropriate ARCH specific
    function descriptor dereference callbacks:
    - dereference_kernel_function_descriptor() if the pointer is a
    kernel symbol;

    - dereference_module_function_descriptor() if the pointer is a
    module symbol.

    This is the last step needed to make '%pS/%ps' smart enough to
    handle function descriptor dereference on affected ARCHs and
    to retire '%pF/%pf'.

    To recap:
    Some architectures (ia64, ppc64, parisc64) use an indirect pointer
    for C function pointers - the function pointer points to a function
    descriptor and we need to dereference it to get the actual function
    pointer.

    Function descriptors live in .opd elf section and all affected
    ARCHs (ia64, ppc64, parisc64) handle it properly for kernel and
    modules. So we, technically, can decide if the dereference is
    needed by simply looking at the pointer: if it belongs to .opd
    section then we need to dereference it.

    The kernel and modules have their own .opd sections, obviously,
    that's why we need to split dereference_function_descriptor()
    and use separate kernel and module dereference arch callbacks.
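
    The range check described above can be modelled in user space. This is
    a minimal sketch under stated assumptions, not the kernel code:
    struct func_desc, fake_opd, deref_if_descriptor() and resolve() are
    hypothetical names, and a static array stands in for a .opd section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical function descriptor: the first word holds the real entry. */
struct func_desc {
	uintptr_t entry;
	uintptr_t toc;
};

/* A static array stands in for the .opd section of the kernel or a module. */
static struct func_desc fake_opd[2] = {
	{ 0x1000, 0 },
	{ 0x2000, 0 },
};

/* Dereference ptr only if it lies inside [start, end); else return it as-is. */
static uintptr_t deref_if_descriptor(uintptr_t ptr, uintptr_t start,
				     uintptr_t end)
{
	if (ptr < start || ptr >= end)
		return ptr;	/* plain code address: nothing to dereference */
	return ((const struct func_desc *)ptr)->entry;
}

/* Resolve a pointer against the stand-in .opd range. */
uintptr_t resolve(uintptr_t ptr)
{
	return deref_if_descriptor(ptr, (uintptr_t)&fake_opd[0],
				   (uintptr_t)&fake_opd[2]);
}
```

    In this model a caller would run one such check against the kernel's
    range and another against the owning module's range, which mirrors the
    split into separate kernel and module callbacks.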

    Link: http://lkml.kernel.org/r/20171206043649.GB15885@jagdpanzerIV
    Cc: Fenghua Yu
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: James Bottomley
    Cc: Andrew Morton
    Cc: Jessica Yu
    Cc: Steven Rostedt
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Tested-by: Tony Luck #ia64
    Tested-by: Santosh Sivaraj #powerpc
    Tested-by: Helge Deller #parisc64
    Signed-off-by: Petr Mladek

    Sergey Senozhatsky
     

30 Nov, 2017

1 commit

  • The conditional kallsym hex printing used a special fixed-width '%lx'
    output (KALLSYM_FMT) in preparation for the hashing of %p, but that
    series ended up adding a %px specifier to help with the conversions.

    Use it, and avoid the "print pointer as an unsigned long" code.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

18 Nov, 2017

1 commit

  • Pull tracing updates from

    - allow module init functions to be traced

    - clean up some unused or not used by config events (saves space)

    - clean up of trace histogram code

    - add support for preempt and interrupt enable/disable events

    - other various clean ups

    * tag 'trace-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (30 commits)
    tracing, thermal: Hide cpu cooling trace events when not in use
    tracing, thermal: Hide devfreq trace events when not in use
    ftrace: Kill FTRACE_OPS_FL_PER_CPU
    perf/ftrace: Small cleanup
    perf/ftrace: Fix function trace events
    perf/ftrace: Revert ("perf/ftrace: Fix double traces of perf on ftrace:function")
    tracing, dma-buf: Remove unused trace event dma_fence_annotate_wait_on
    tracing, memcg, vmscan: Hide trace events when not in use
    tracing/xen: Hide events that are not used when X86_PAE is not defined
    tracing: mark trace_test_buffer as __maybe_unused
    printk: Remove superfluous memory barriers from printk_safe
    ftrace: Clear hashes of stale ips of init memory
    tracing: Add support for preempt and irq enable/disable events
    tracing: Prepare to add preempt and irq trace events
    ftrace/kallsyms: Have /proc/kallsyms show saved mod init functions
    ftrace: Add freeing algorithm to free ftrace_mod_maps
    ftrace: Save module init functions kallsyms symbols for tracing
    ftrace: Allow module init functions to be traced
    ftrace: Add a ftrace_free_mem() function for modules to use
    tracing: Reimplement log2
    ...

    Linus Torvalds
     

13 Nov, 2017

1 commit


09 Nov, 2017

1 commit

  • Not only is it annoying to have one single flag for all pointers, as if
    that was a global choice and all kernel pointers are the same, but %pK
    can't get the 'access' vs 'open' time check right anyway.

    So make the /proc/kallsyms pointer value code use logic specific to that
    particular file. We do continue to honor kptr_restrict, but the default
    (which is unrestricted) is changed to instead take expected users into
    account, and restrict access by default.

    Right now the only actual expected user is kernel profiling, which has a
    separate sysctl flag for kernel profile access. There may be others.
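
    The decision described above can be sketched as a small predicate.
    This is an illustration only: show_kallsyms_value() and its three
    parameters are hypothetical, and the "privileged" fallback for
    readers who are not profiling is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether real addresses are shown to a reader.
 *   kptr_restrict - value of the global restriction setting (0, 1, 2...)
 *   perf_allowed  - kernel profiling is permitted for this reader
 *   privileged    - reader is privileged (assumed fallback check)
 */
bool show_kallsyms_value(int kptr_restrict, bool perf_allowed, bool privileged)
{
	switch (kptr_restrict) {
	case 0:
		/* Unrestricted setting: still require an expected user. */
		if (perf_allowed)
			return true;
		/* fall through */
	case 1:
		if (privileged)
			return true;
		/* fall through */
	default:
		return false;	/* restricted: addresses are hidden */
	}
}
```

    With this shape, adding another expected user means adding one more
    condition to the first case rather than loosening the global setting.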

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

06 Oct, 2017

2 commits

  • If a module is loaded while tracing is enabled, then there's a possibility
    that the module init functions were traced. These functions have their name
    and address stored by ftrace such that it can translate the function address
    that is written into the buffer into a human readable function name.

    As userspace tools may be doing the same, they need a way to map function
    names to their address as well. This is done through reading /proc/kallsyms.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • If function tracing is active when the module init functions are freed, then
    store them to be referenced by kallsyms. As module init functions can now be
    traced on module load, they were useless:

    ># echo ':mod:snd_seq' > set_ftrace_filter
    ># echo function > current_tracer
    ># modprobe snd_seq
    ># cat trace
    # tracer: function
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| / delay
    # TASK-PID CPU# |||| TIMESTAMP FUNCTION
    # | | | |||| | |
    modprobe-2786 [000] .... 3189.037874: 0xffffffffa0860000
    [...]
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| / delay
    # TASK-PID CPU# |||| TIMESTAMP FUNCTION
    # | | | |||| | |
    modprobe-2463 [002] .... 174.243237: alsa_seq_init

    Steven Rostedt (VMware)
     

11 Jul, 2017

1 commit


18 Feb, 2017

1 commit

  • A long-standing issue with JITed programs is that stack traces from
    function tracing check whether a given address is kernel code
    through {__,}kernel_text_address(), which checks for code in core
    kernel, modules and dynamically allocated ftrace trampolines. But
    what is still missing is BPF JITed programs (interpreted programs
    are not an issue as __bpf_prog_run() will be attributed to them),
    thus when a stack trace is triggered, the code walking the stack
    won't see any of the JITed ones. The same for address correlation
    done from user space via reading /proc/kallsyms. This is read by
    tools like perf, but the latter is also useful for permanent live
    tracing with eBPF itself in combination with stack maps when other
    eBPF types are part of the callchain. See offwaketime example on
    dumping stack from a map.

    This work tries to tackle that issue by making the addresses and
    symbols known to the kernel. The lookup from *kernel_text_address()
    is implemented through a latched RB tree that can be read under
    RCU in fast-path that is also shared for symbol/size/offset lookup
    for a specific given address in kallsyms. The slow-path iteration
    through all symbols in the seq file is done via an RCU list, which holds
    a tiny fraction of all exported ksyms, usually below 0.1 percent.
    Function symbols are exported as bpf_prog_<tag>, in order to aid
    debugging and attribution. This facility is currently enabled for
    root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
    is active in any mode. The rationale behind this is that still a lot
    of systems ship with world read permissions on kallsyms thus addresses
    should not get suddenly exposed for them. If that situation gets
    much better in future, we always have the option to change the
    default on this. Likewise, unprivileged programs are not allowed
    to add entries there either, but that is less of a concern as most
    such program types relevant in this context are for root-only anyway.
    If enabled, call graphs and stack traces will then show a correct
    attribution; one example is illustrated below, where the trace is
    now visible in tooling such as perf script --kallsyms=/proc/kallsyms
    and friends.

    Before:

    7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
    f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)

    After:

    7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
    [...]
    7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
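
    A user-space tool that correlates addresses could pick the JITed
    programs out of /proc/kallsyms roughly as follows. This is a minimal
    sketch: struct ksym, parse_kallsyms_line() and is_bpf_prog() are
    hypothetical names, and the "address type name [module]" line layout
    is assumed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed entry; module is "" when the line has no [module] suffix. */
struct ksym {
	unsigned long long addr;
	char type;
	char name[128];
	char module[64];
};

/* Parse "addr type name [module]"; return 0 on success, -1 if malformed. */
int parse_kallsyms_line(const char *line, struct ksym *out)
{
	int n;

	out->module[0] = '\0';
	n = sscanf(line, "%llx %c %127s [%63[^]]]", &out->addr, &out->type,
		   out->name, out->module);
	return n >= 3 ? 0 : -1;
}

/* True if the entry names a JITed BPF program (name starts with bpf_prog_). */
int is_bpf_prog(const struct ksym *sym)
{
	return strncmp(sym->name, "bpf_prog_", 9) == 0;
}
```

    A tool would call parse_kallsyms_line() on each line read with
    fgets() and keep the entries for which is_bpf_prog() is true.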

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

16 Mar, 2016

1 commit

  • Similar to how relative extables are implemented, it is possible to emit
    the kallsyms table in such a way that it contains offsets relative to
    some anchor point in the kernel image rather than absolute addresses.

    On 64-bit architectures, it cuts the size of the kallsyms address table
    in half, since offsets between kernel symbols can typically be expressed
    in 32 bits. This saves several hundreds of kilobytes of permanent
    .rodata on average. In addition, the kallsyms address table is no
    longer subject to dynamic relocation when CONFIG_RELOCATABLE is in
    effect, so the relocation work done after decompression now doesn't have
    to do relocation updates for all these values. This saves up to 24
    bytes (i.e., the size of an ELF64 RELA relocation table entry) per value,
    which easily adds up to a couple of megabytes of uncompressed __init
    data on ppc64 or arm64. Even if these relocation entries typically
    compress well, the combined size reduction of 2.8 MB uncompressed for a
    ppc64_defconfig build (of which 2.4 MB is __init data) results in a ~500
    KB space saving in the compressed image.

    Since it is useful for some architectures (like x86) to retain the
    ability to emit absolute values as well, this patch also adds support
    for capturing both absolute and relative values when
    KALLSYMS_ABSOLUTE_PERCPU is in effect, by emitting absolute per-cpu
    addresses as positive 32-bit values, and addresses relative to the
    lowest encountered relative symbol as negative values, which are
    subtracted from the runtime address of this base symbol to produce the
    actual address.

    Support for the above is enabled by default for all architectures except
    IA-64 and Tile-GX, whose symbols are too far apart to capture in this
    manner.
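
    The decoding of a 32-bit table entry into a full address, as described
    above, can be sketched like this. sym_address() and its parameters are
    hypothetical names. The bias of one in the negative case, which lets
    the base symbol itself be stored as a negative value, is an assumption
    of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode one 32-bit table entry into a 64-bit address.
 *   offset          - the stored 32-bit value
 *   relative_base   - runtime address of the lowest relative symbol
 *   absolute_percpu - non-zero if absolute per-CPU values are mixed in
 */
uint64_t sym_address(int32_t offset, uint64_t relative_base,
		     int absolute_percpu)
{
	/* Without absolute per-CPU support, entries are unsigned offsets. */
	if (!absolute_percpu)
		return relative_base + (uint32_t)offset;

	/* Non-negative entries are absolute (per-CPU) addresses. */
	if (offset >= 0)
		return (uint64_t)offset;

	/* Negative entries encode a distance above the base symbol. */
	return relative_base - 1 - (int64_t)offset;
}
```

    Each entry costs four bytes instead of eight, and because only
    relative_base depends on the load address, the table itself needs no
    relocation.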

    Signed-off-by: Ard Biesheuvel
    Tested-by: Guenter Roeck
    Reviewed-by: Kees Cook
    Tested-by: Kees Cook
    Cc: Heiko Carstens
    Cc: Michael Ellerman
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Benjamin Herrenschmidt
    Cc: Michal Marek
    Cc: Rusty Russell
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ard Biesheuvel
     

14 Oct, 2014

1 commit


09 Aug, 2014

1 commit

  • __sprint_symbol() should restore original address when kallsyms_lookup()
    failed to find a symbol. It's reported when dumpstack shows an address in
    a dynamically allocated trampoline for ftrace.

    [ 1314.612287] [] dump_stack+0x45/0x56
    [ 1314.612290] [] ? meminfo_proc_open+0x30/0x30
    [ 1314.612293] [] kpatch_ftrace_handler+0x14/0xf0 [kpatch]
    [ 1314.612306] [] 0xffffffffa00160c3

    You can see a difference in the hex address - c4 and c3. Fix it.

    Signed-off-by: Namhyung Kim
    Reported-by: Masami Hiramatsu
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Josh Poimboeuf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
     

08 Apr, 2014

1 commit


15 Apr, 2013

1 commit

  • We don't export any symbols > 128 characters, but if we did then
    kallsyms_expand_symbol() would overflow the buffer handed to it.
    So we need to check the destination buffer length when copying.

    Related test:
    If we define an exported function whose name is longer than 128
    characters, the kernel panics when init_kprobes calls
    kallsyms_lookup_name during boot. With the length check from this
    patch applied, it boots fine.

    Implementation:
    Add an additional destination buffer length parameter (maxlen).
    If the uncompressed string is too long (>= maxlen), it is truncated.
    The parameters are not checked for validity, since it is a static function.
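
    The bounded copy can be illustrated with a simplified expander. This
    is a sketch, not the kernel routine: token_table and expand_symbol()
    are hypothetical, and the token scheme is reduced to one table index
    per input byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token table: each input byte indexes one short string. */
static const char *const token_table[] = {
	"kall", "syms", "_", "lookup", "name",
};

/*
 * Expand len token indexes from data into result, writing at most maxlen
 * bytes including the terminating NUL. Over-long output is truncated.
 * The caller must pass only valid indexes. Returns the stored length.
 */
size_t expand_symbol(const unsigned char *data, size_t len,
		     char *result, size_t maxlen)
{
	size_t out = 0;

	if (maxlen == 0)
		return 0;

	for (size_t i = 0; i < len; i++) {
		const char *tok = token_table[data[i]];

		for (; *tok; tok++) {
			if (out + 1 >= maxlen)	/* keep one byte for the NUL */
				goto done;
			result[out++] = *tok;
		}
	}
done:
	result[out] = '\0';
	return out;
}
```

    Checking the limit before every byte, rather than once per token,
    means a long token cannot run past the end of the buffer.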

    Signed-off-by: Chen Gang
    Signed-off-by: Rusty Russell

    Chen Gang
     

30 May, 2012

1 commit

  • Using %ps in a printk format will sometimes fail silently and print the
    empty string if the address passed in does not match a symbol that
    kallsyms knows about. But using %pS will fall back to printing the full
    address if kallsyms can't find the symbol. Make %ps act the same as %pS
    by falling back to printing the address.

    While we're here also make %ps print the module that a symbol comes from
    so that it matches what %pS already does. Take this simple function for
    example (in a module):

    static void test_printk(void)
    {
    int test;
    pr_info("with pS: %pS\n", &test);
    pr_info("with ps: %ps\n", &test);
    }

    Before this patch:

    with pS: 0xdff7df44
    with ps:

    After this patch:

    with pS: 0xdff7df44
    with ps: 0xdff7df44

    Signed-off-by: Stephen Boyd
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     

26 Mar, 2011

1 commit


24 Mar, 2011

3 commits

  • The %pB format specifier is for stack backtrace. Its handler
    sprint_backtrace() does symbol lookup using (address-1) to
    ensure the address will not point outside of the function.

    If there is a tail-call to a function marked "noreturn", gcc
    optimizes out the code after the call, so the saved return address
    points outside of the function (i.e. at the start of the next
    function), which pollutes the call trace somewhat.

    This patch adds the %pB printk mechanism that allows architecture
    call-trace printout functions to improve backtrace printouts.
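
    The effect of looking up (address-1) can be shown with a toy symbol
    table. This is a sketch only: symtab, lookup() and lookup_backtrace()
    are hypothetical, and the addresses are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sym {
	unsigned long start;
	const char *name;
};

/* Hypothetical symbol table, sorted by start address. */
static const struct sym symtab[] = {
	{ 0x1000, "first_fn" },
	{ 0x1040, "caller_fn" },	/* ends with a call to a noreturn function */
	{ 0x1080, "next_fn" },
};

/* Return the name of the last symbol whose start is <= addr, or NULL. */
const char *lookup(unsigned long addr)
{
	const char *name = NULL;

	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (symtab[i].start <= addr)
			name = symtab[i].name;
	return name;
}

/* Backtrace lookup: resolve the call site rather than the return address. */
const char *lookup_backtrace(unsigned long return_addr)
{
	return lookup(return_addr - 1);
}
```

    Here a return address of 0x1080 resolves to next_fn with a plain
    lookup but to caller_fn with the backtrace variant.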

    Signed-off-by: Namhyung Kim
    Acked-by: Steven Rostedt
    Acked-by: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: linux-arch@vger.kernel.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Namhyung Kim
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    deal with races in /proc/*/{syscall,stack,personality}
    proc: enable writing to /proc/pid/mem
    proc: make check_mem_permission() return an mm_struct on success
    proc: hold cred_guard_mutex in check_mem_permission()
    proc: disable mem_write after exec
    mm: implement access_remote_vm
    mm: factor out main logic of access_process_vm
    mm: use mm_struct to resolve gate vma's in __get_user_pages
    mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
    mm: arch: make in_gate_area take an mm_struct instead of a task_struct
    mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
    x86: mark associated mm when running a task in 32 bit compatibility mode
    x86: add context tag to mark mm when running a task in 32-bit compatibility mode
    auxv: require the target to be tracable (or yourself)
    close race in /proc/*/environ
    report errors in /proc/*/*map* sanely
    pagemap: close races with suid execve
    make sessionid permissions in /proc/*/task/* match those in /proc/*
    fix leaks in path_lookupat()

    Fix up trivial conflicts in fs/proc/base.c

    Linus Torvalds
     
  • Now that gate vma's are referenced with respect to a particular mm and not a
    particular task it only makes sense to propagate the change to this predicate as
    well.

    Signed-off-by: Stephen Wilson
    Reviewed-by: Michel Lespinasse
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: Al Viro

    Stephen Wilson
     

23 Mar, 2011

1 commit

  • In an effort to reduce kernel address leaks that might be used to help
    target kernel privilege escalation exploits, this patch uses %pK when
    displaying addresses in /proc/kallsyms, /proc/modules, and
    /sys/module/*/sections/*.

    Note that this changes %x to %p, so some legitimately 0 values in
    /proc/kallsyms would have changed from 00000000 to "(null)". To avoid
    this, "(null)" is not used when using the "K" format. Anything that was
    already successfully parsing "(null)" in addition to full hex digits
    should have no problem with this change. (Thanks to Joe Perches for the
    suggestion.) Due to the %x to %p, "void *" casts are needed since these
    addresses are already "unsigned long" everywhere internally, due to their
    starting life as ELF section offsets.

    Signed-off-by: Kees Cook
    Cc: Eugene Teo
    Cc: Dan Rosenberg
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

20 Nov, 2010

1 commit

  • This reverts commit 59365d136d205cc20fe666ca7f89b1c5001b0d5a.

    It turns out that this can break certain existing user land setups.
    Quoth Sarah Sharp:

    "On Wednesday, I updated my branch to commit 460781b from linus' tree,
    and my box would not boot. klogd segfaulted, which stalled the whole
    system.

    At first I thought it actually hung the box, but it continued booting
    after 5 minutes, and I was able to log in. It dropped back to the
    text console instead of the graphical bootup display for that period
    of time. dmesg surprisingly still works. I've bisected the problem
    down to this commit (commit 59365d136d205cc20fe666ca7f89b1c5001b0d5a)

    The box is running klogd 1.5.5ubuntu3 (from Jaunty). Yes, I know
    that's old. I read the bit in the commit about changing the
    permissions of kallsyms after boot, but if I can't boot that doesn't
    help."

    So let's just keep the old default, and encourage distributions to do
    the "chmod -r /proc/kallsyms" in their bootup scripts. This is not
    worth a kernel option to change default behavior, since it's so easily
    done in user space.

    Reported-and-bisected-by: Sarah Sharp
    Cc: Marcus Meissner
    Cc: Tejun Heo
    Cc: Eugene Teo
    Cc: Jesper Juhl
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

17 Nov, 2010

1 commit

  • Making /proc/kallsyms readable only for root by default makes it
    slightly harder for attackers to write generic kernel exploits by
    removing one source of knowledge where things are in the kernel.

    This is the second submission. Discussion of the first submission
    mostly concerned that this is just one hole in the sieve ... but
    one of the bigger ones.

    Changing the permissions of at least System.map and vmlinux is also
    required to fix the same problem, but that is a packaging issue.

    The target of this starter patch and its follow-ups is to remove any
    kind of kernel space address information leak from the kernel.

    [ Side note: the default of root-only reading is the "safe" value, and
    it's easy enough to then override at any time after boot. The /proc
    filesystem allows root to change the permissions with a regular
    chmod, so you can "revert" this at run-time by simply doing

    chmod og+r /proc/kallsyms

    as root if you really want regular users to see the kernel symbols.
    It does help some tools like "perf" figure them out without any
    setup, so it may well make sense in some situations. - Linus ]

    Signed-off-by: Marcus Meissner
    Acked-by: Tejun Heo
    Acked-by: Eugene Teo
    Reviewed-by: Jesper Juhl
    Signed-off-by: Linus Torvalds

    Marcus Meissner
     

21 May, 2010

1 commit

  • This patch contains the hooks and instrumentation into kernel which
    live outside the kernel/debug directory, which the kdb core
    will call to run commands like lsmod, dmesg, bt etc...

    CC: linux-arch@vger.kernel.org
    Signed-off-by: Jason Wessel
    Signed-off-by: Martin Hicks

    Jason Wessel
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers, which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

10 Nov, 2009

1 commit


23 Sep, 2009

1 commit

  • This allows kallsyms to locate symbols that are in arch-specific text
    sections (such as text in Blackfin on-chip SRAM regions).

    Signed-off-by: Mike Frysinger
    Cc: Ingo Molnar
    Cc: Robin Getz
    Cc: Sam Ravnborg
    Cc: Peter Zijlstra
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     

10 Jun, 2009

1 commit

  • Fix coding style whitespace issues and replace __initcall with
    device_initcall. Fixed multi-line comments as per coding style.

    Errors as reported by checkpatch.pl :-
    Before:
    total: 14 errors, 14 warnings, 487 lines checked
    After :
    total: 0 errors, 8 warnings, 507 lines checked

    Compile tested binary verified as :-
    Before:
       text    data     bss     dec     hex filename
       2405       4       0    2409     969 kernel/kallsyms.o
    After :
       text    data     bss     dec     hex filename
       2405       4       0    2409     969 kernel/kallsyms.o

    Signed-off-by: Manish Katiyar
    Signed-off-by: Andrew Morton
    Signed-off-by: Sam Ravnborg

    Manish Katiyar
     

31 Mar, 2009

1 commit

  • Impact: New API

    kallsyms_lookup_name only returns the first match that it finds. Ksplice
    needs information about all symbols with a given name in order to correctly
    resolve local symbols.

    kallsyms_on_each_symbol provides a generic mechanism for iterating over the
    kallsyms table.

    Cc: Jeff Arnold
    Cc: Tim Abbott
    Signed-off-by: Anders Kaseorg
    Signed-off-by: Rusty Russell

    Anders Kaseorg
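
    The difference between a first-match lookup and a full walk of the table
    can be sketched in user space. This is a minimal illustration only: the
    table, the mock_* names, the callback type and the addresses are all
    hypothetical, and the kernel's real kallsyms table and function
    signatures are not reproduced here. The one convention assumed is that a
    callback returning non-zero stops the walk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one symbol table entry (not the kernel layout). */
struct mock_sym {
    const char *name;
    unsigned long addr;
};

/* Made-up sample data: "init_once" appears twice, as local symbols can. */
static const struct mock_sym mock_table[] = {
    { "do_fork",   0x1000UL },
    { "init_once", 0x2000UL },
    { "schedule",  0x3000UL },
    { "init_once", 0x4000UL },
};

/* Callback type assumed for this sketch: return non-zero to stop early. */
typedef int (*sym_fn)(void *data, const char *name, unsigned long addr);

/* Hand every entry to fn; return fn's first non-zero result, else 0. */
int mock_on_each_symbol(sym_fn fn, void *data)
{
    size_t i;

    assert(fn != NULL);
    for (i = 0; i < sizeof(mock_table) / sizeof(mock_table[0]); i++) {
        int ret = fn(data, mock_table[i].name, mock_table[i].addr);
        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Shared context: the name being searched for and the addresses found. */
struct match_ctx {
    const char *wanted;
    unsigned long addrs[8];
    size_t count;
};

/* Record every matching entry and keep going, so duplicates are seen. */
int collect_match(void *data, const char *name, unsigned long addr)
{
    struct match_ctx *ctx = data;

    if (strcmp(name, ctx->wanted) == 0 && ctx->count < 8)
        ctx->addrs[ctx->count++] = addr;
    return 0;
}

/* For contrast: stop at the first match, like a first-match lookup. */
int find_first(void *data, const char *name, unsigned long addr)
{
    struct match_ctx *ctx = data;

    if (strcmp(name, ctx->wanted) != 0)
        return 0;
    ctx->addrs[0] = addr;
    ctx->count = 1;
    return 1;
}
```

    With this sample table, walking with collect_match finds both
    "init_once" entries, while find_first stops after the first one. That is
    the kind of ambiguity the commit message says a first-match lookup
    cannot resolve.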
     

15 Jan, 2009

1 commit

  • This reverts commit ad7a953c522ceb496611d127e51e278bfe0ff483.

    And commit: ("allow stripping of generated symbols under CONFIG_KALLSYMS_ALL")
    9bb482476c6c9d1ae033306440c51ceac93ea80c

    These stripping patches have caused a set of issues:

    1) People have reported compatibility issues with binutils due to
    lack of support for `--strip-unneeded-symbols' with objcopy 2.15.92.0.2
    Reported by: Wenji
    2) ccache and distcc no longer work as expected
    Reported by: Ted, Roland, + others
    3) The installed modules increased a lot in size
    Reported by: Ted, Davej + others

    Reported-by: Wenji Huang
    Reported-by: "Theodore Ts'o"
    Reported-by: Dave Jones
    Reported-by: Roland McGrath
    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
     

20 Dec, 2008

1 commit

  • Building upon parts of the module stripping patch, this patch
    introduces similar stripping for vmlinux when CONFIG_KALLSYMS_ALL=y.
    Using CONFIG_KALLSYMS_STRIP_GENERATED reduces the overhead of
    CONFIG_KALLSYMS_ALL from 245k/310k to 65k/80k for the (i386/x86-64)
    kernels I tested with.

    The patch also does away with the need to special case the kallsyms-
    internal symbols by making them available even in the first linking
    stage.

    While it is a generated file, the patch includes the changes to
    scripts/genksyms/keywords.c_shipped, as I'm unsure what the procedure
    here is.

    Signed-off-by: Jan Beulich
    Signed-off-by: Sam Ravnborg

    Jan Beulich
     

20 Nov, 2008

1 commit

  • sprint_symbol(), itself used when dumping stacks, has been wasting 128
    bytes of stack: look up the symbol directly into the buffer supplied by the
    caller, instead of using a locally declared namebuf.

    I believe the name != buffer strcpy() is obsolete: the design here dates
    from when module symbol lookup pointed into a supposedly const but sadly
    volatile table; nowadays it copies, but an uncalled strcpy() looks better
    here than the risk of a recursive BUG_ON().

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
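
    The stack saving described above can be sketched in user space: the
    lookup writes the name straight into the caller's buffer, and the
    offset/size suffix is appended in place, so no local name buffer is
    needed. Everything here is hypothetical (mock_lookup, mock_sprint_symbol,
    the symbol range and the output format are assumptions made for
    illustration), not the kernel's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MOCK_NAME_LEN 128 /* mirrors the 128-byte name buffer mentioned above */

/*
 * Hypothetical lookup: on a hit, copy the symbol name into namebuf and
 * report the offset into the symbol and the symbol size. Returns 0 on a miss.
 */
int mock_lookup(unsigned long addr, char *namebuf,
                unsigned long *offset, unsigned long *size)
{
    const unsigned long start = 0x1000UL, len = 0x40UL; /* made-up range */

    if (addr < start || addr >= start + len)
        return 0;
    strcpy(namebuf, "mock_symbol");
    *offset = addr - start;
    *size = len;
    return 1;
}

/*
 * Format "name+offset/size" directly into the caller's buffer. The name is
 * looked up into buffer itself, then the suffix is appended after it, so
 * this function needs no name buffer of its own on the stack.
 * Returns the number of characters written.
 */
int mock_sprint_symbol(char *buffer, size_t buflen, unsigned long addr)
{
    unsigned long offset, size;
    size_t len;

    assert(buffer != NULL && buflen >= MOCK_NAME_LEN + 40);
    if (!mock_lookup(addr, buffer, &offset, &size))
        return snprintf(buffer, buflen, "0x%lx", addr);

    len = strlen(buffer);
    return (int)len + snprintf(buffer + len, buflen - len,
                               "+%#lx/%#lx", offset, size);
}
```

    The caller has to supply a buffer large enough for both the name and
    the suffix; the sketch asserts that rather than truncating silently.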
     

17 Oct, 2008

1 commit

  • Commit 6dd06c9fbe025f542bce4cdb91790c0f91962722 ("module: make
    module_address_lookup safe") introduced a double return in the function
    kallsyms_lookup(), which is weird. The second one should be removed.

    Signed-off-by: WANG Cong
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    WANG Cong
     

26 Jul, 2008

1 commit


29 Apr, 2008

1 commit


07 Feb, 2008

1 commit

  • When passing a zero address to kallsyms_lookup(), the kernel thought it was
    a valid kernel address, even though it is not. This is because is_ksym_addr()
    called is_kernel_extratext() and checked against labels that don't exist on
    many archs (and which therefore default to zero). Since PPC was the only
    architecture that defined _extra_text (in 2005), and it no longer needs it,
    this patch removes _extra_text support.

    For some history (provided by Jon):
    http://ozlabs.org/pipermail/linuxppc-dev/2005-September/019734.html
    http://ozlabs.org/pipermail/linuxppc-dev/2005-September/019736.html
    http://ozlabs.org/pipermail/linuxppc-dev/2005-September/019751.html

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Robin Getz
    Cc: David Woodhouse
    Cc: Jon Loeliger
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Sam Ravnborg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Getz