19 Jun, 2009

1 commit

  • Call constructors (gcc-generated initcall-like functions) during kernel
    start and module load. Constructors are e.g. used for gcov data
    initialization.

    Disable constructor support for usermode Linux to prevent conflicts with
    host glibc.

    Signed-off-by: Peter Oberparleiter
    Acked-by: Rusty Russell
    Acked-by: WANG Cong
    Cc: Sam Ravnborg
    Cc: Jeff Dike
    Cc: Andi Kleen
    Cc: Huang Ying
    Cc: Li Wei
    Cc: Michael Ellerman
    Cc: Ingo Molnar
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Oberparleiter
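
    As a rough userspace illustration of what a "constructor" is here: GCC and Clang accept a function attribute that makes the compiler emit a pointer to the function in a special section, and the startup code calls it before main(). The sketch below is only an analogy for the kernel-side behaviour described above; the names example_ctor, ctor_ran and constructor_has_run are hypothetical.

```c
/* Hypothetical flag, set by the constructor below. */
static int ctor_ran;

/*
 * The compiler records this function in a constructor section
 * (.ctors or .init_array, depending on the toolchain). In userspace the
 * C runtime calls every recorded function before main() starts.
 */
__attribute__((constructor))
static void example_ctor(void)
{
	ctor_ran = 1;
}

/* Returns 1 if the constructor has already run, 0 otherwise. */
int constructor_has_run(void)
{
	return ctor_ran;
}
```

    Called from main(), constructor_has_run() reports that the constructor has already executed, which is the property gcov relies on for initializing its data.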
     

17 Jun, 2009

1 commit

  • Several WARN_ON() messages omit the '\n' at the end of the string, which
    is a simple (and understandable) error. The next line printed after
    that warning line is usually the current module list, and that printk
    does not have a log-level marker - resulting in one long mixed-up line.

    Adding this loglevel marker will now avoid this unreadable mess.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

12 Jun, 2009

4 commits

  • It's theoretically possible that there are exception table entries
    which point into the (freed) init text of modules. These could cause
    future problems if other modules get loaded into that memory and cause
    an exception as we'd see the wrong fixup. The only case I know of is
    kvm-intel.ko (when CONFIG_CC_OPTIMIZE_FOR_SIZE=n).

    Amerigo fixed this long-standing FIXME in the x86 version, but this
    patch is more general.

    This implements trim_init_extable(); most archs are simple since they
    use the standard lib/extable.c sort code. Alpha and IA64 use relative
    addresses in their fixups, so their trimming is a slight variation.

    Sparc32 is unique; it doesn't seem to define ARCH_HAS_SORT_EXTABLE,
    yet it defines its own sort_extable() which overrides the one in lib.
    It doesn't sort, so we have to mark deleted entries instead of
    actually trimming them.

    Inspired-by: Amerigo Wang
    Signed-off-by: Rusty Russell
    Cc: linux-alpha@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org

    Rusty Russell
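
    The trimming idea for the simple (sorted table) case can be sketched in userspace C as follows. This is a minimal sketch under stated assumptions, not the kernel code: struct ex_entry, struct ex_table, in_init_text() and trim_init_entries() are hypothetical stand-ins, and the init text is assumed to be one contiguous range that lies entirely below or entirely above the remaining entries, so that stale entries can only sit at the head or the tail of the sorted table.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ex_entry {
	unsigned long insn;   /* address of the instruction that may fault */
	unsigned long fixup;  /* address of the recovery code */
};

struct ex_table {
	struct ex_entry *entries; /* sorted by insn, ascending */
	size_t count;
};

/* True if addr lies inside [start, start + size). */
static int in_init_text(unsigned long addr, unsigned long start,
			unsigned long size)
{
	return addr >= start && addr - start < size;
}

/*
 * Drop entries whose insn points into the init range. Only the head and
 * the tail of the sorted table are examined, per the assumption above.
 */
void trim_init_entries(struct ex_table *t, unsigned long init_start,
		       unsigned long init_size)
{
	/* Trim the beginning. */
	while (t->count &&
	       in_init_text(t->entries[0].insn, init_start, init_size)) {
		t->entries++;
		t->count--;
	}
	/* Trim the end. */
	while (t->count &&
	       in_init_text(t->entries[t->count - 1].insn, init_start,
			    init_size))
		t->count--;
}
```

    An unsorted table, as described for Sparc32, cannot be handled this way, which is why that case marks deleted entries instead.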
     
  • * 'for-linus' of git://linux-arm.org/linux-2.6:
    kmemleak: Add the corresponding MAINTAINERS entry
    kmemleak: Simple testing module for kmemleak
    kmemleak: Enable the building of the memory leak detector
    kmemleak: Remove some of the kmemleak false positives
    kmemleak: Add modules support
    kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash
    kmemleak: Add the vmalloc memory allocation/freeing hooks
    kmemleak: Add the slub memory allocation/freeing hooks
    kmemleak: Add the slob memory allocation/freeing hooks
    kmemleak: Add the slab memory allocation/freeing hooks
    kmemleak: Add documentation on the memory leak detector
    kmemleak: Add the base support

    Manual conflict resolution (with the slab/earlyboot changes) in:
    drivers/char/vt.c
    init/main.c
    mm/slab.c

    Linus Torvalds
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (44 commits)
    nommu: Provide mmap_min_addr definition.
    TOMOYO: Add description of lists and structures.
    TOMOYO: Remove unused field.
    integrity: ima audit dentry_open failure
    TOMOYO: Remove unused parameter.
    security: use mmap_min_addr indepedently of security models
    TOMOYO: Simplify policy reader.
    TOMOYO: Remove redundant markers.
    SELinux: define audit permissions for audit tree netlink messages
    TOMOYO: Remove unused mutex.
    tomoyo: avoid get+put of task_struct
    smack: Remove redundant initialization.
    integrity: nfsd imbalance bug fix
    rootplug: Remove redundant initialization.
    smack: do not beyond ARRAY_SIZE of data
    integrity: move ima_counts_get
    integrity: path_check update
    IMA: Add __init notation to ima functions
    IMA: Minimal IMA policy and boot param for TCB IMA policy
    selinux: remove obsolete read buffer limit from sel_read_bool
    ...

    Linus Torvalds
     
  • This patch handles the kmemleak operations needed for modules loading so
    that memory allocations from inside a module are properly tracked.

    Signed-off-by: Catalin Marinas

    Catalin Marinas
     

08 May, 2009

1 commit


17 Apr, 2009

1 commit

  • The hooks in the module code for the function tracer must be called
    before any of that module code runs. The function tracer hooks
    modify the module (replacing calls to mcount with nops). If the code
    is executed while the change occurs, then the CPU can take a GPF.

    To handle the above with a bit of paranoia, I originally implemented
    the hooks as calls directly from the module code.

    After examining the notifier calls, it looks as though the start up
    notify is called before any of the module's code is executed. This makes
    the use of the notify safe with ftrace.

    Only the startup notify is required to be "safe". The shutdown simply
    removes the entries from the ftrace function list, and does not modify
    any code.

    This change has another benefit. It removes an issue with a reverse dependency
    in the mutexes of ftrace_lock and module_mutex.

    [ Impact: fix lock dependency bug, cleanup ]

    Cc: Rusty Russell
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

15 Apr, 2009

2 commits

  • Commit 3d43321b7015387cfebbe26436d0e9d299162ea1 ("modules: sysctl to
    block module loading") introduces a modules_disabled variable that is
    only defined if CONFIG_MODULE_UNLOAD is enabled, despite being used in
    other places. This moves it up and fixes up the build.

    CC kernel/module.o
    kernel/module.c: In function 'sys_init_module':
    kernel/module.c:2401: error: 'modules_disabled' undeclared (first use in this function)
    kernel/module.c:2401: error: (Each undeclared identifier is reported only once
    kernel/module.c:2401: error: for each function it appears in.)
    make[1]: *** [kernel/module.o] Error 1
    make: *** [kernel/module.o] Error 2

    Signed-off-by: Paul Mundt
    Signed-off-by: James Morris

    Stephen Rothwell
     
  • Impact: allow modules to add TRACE_EVENTS on load

    This patch adds the final hooks to allow modules to use the TRACE_EVENT
    macro. A notifier and a data structure are used to link the TRACE_EVENTs
    defined in the module to connect them with the ftrace event tracing system.

    It also adds the necessary automated clean ups to the trace events when a
    module is removed.

    Cc: Rusty Russell
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

12 Apr, 2009

1 commit

  • Several drivers use asynchronous work to do device discovery, and we
    synchronize with them in the compiled-in case before we actually try to
    mount root filesystems etc.

    However, when compiled as modules, that synchronization is missing - the
    module loading completes, but the driver hasn't actually finished
    probing for devices, and that means that any user mode that expects to
    use the devices after the 'insmod' is now potentially broken.

    We already saw one case of a similar issue in the ACPI battery code,
    where the kernel itself expected the module to be all done, and unmapped
    the init memory - but the async device discovery was still running.
    That got hacked around by just removing the "__init" (see commit
    5d38258ec026921a7b266f4047ebeaa75db358e5 "ACPI battery: fix async boot
    oops"), but the real fix is to just make the module loading wait for all
    async work to be completed.

    It will slow down module loading, but since common devices should be
    built in anyway, and since the bug is really annoying and hard to handle
    from user space (and caused several S3 resume regressions), the simple
    fix to wait is the right one.

    This fixes at least

    http://bugzilla.kernel.org/show_bug.cgi?id=13063

    but probably a few other bugzilla entries too (12936, for example), and
    is confirmed to fix Rafael's storage driver breakage after resume bug
    report (no bugzilla entry).

    We should also be able to now revert that ACPI battery fix.

    Reported-and-tested-by: Rafael J. Wysocki
    Tested-by: Heinz Diehl
    Acked-by: Arjan van de Ven
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

07 Apr, 2009

1 commit

  • This reverts commit 9cb610d8e35fe3ec95a2fe2030b02f85aeea83c1.

    This was an impressively stupid patch. Firstly, we reset the SHF_ALLOC
    flag lower down in the same function, so the patch was useless. Even
    better, find_sec() ignores sections with SHF_ALLOC not set, so
    it breaks CONFIG_MODVERSIONS=y with CONFIG_MODULE_FORCE_LOAD=n, which
    refuses to load the module since it can't find the __versions section.

    Signed-off-by: Rusty Russell

    Rusty Russell
     

06 Apr, 2009

1 commit

  • * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
    tracing, net: fix net tree and tracing tree merge interaction
    tracing, powerpc: fix powerpc tree and tracing tree interaction
    ring-buffer: do not remove reader page from list on ring buffer free
    function-graph: allow unregistering twice
    trace: make argument 'mem' of trace_seq_putmem() const
    tracing: add missing 'extern' keywords to trace_output.h
    tracing: provide trace_seq_reserve()
    blktrace: print out BLK_TN_MESSAGE properly
    blktrace: extract duplidate code
    blktrace: fix memory leak when freeing struct blk_io_trace
    blktrace: fix blk_probes_ref chaos
    blktrace: make classic output more classic
    blktrace: fix off-by-one bug
    blktrace: fix the original blktrace
    blktrace: fix a race when creating blk_tree_root in debugfs
    blktrace: fix timestamp in binary output
    tracing, Text Edit Lock: cleanup
    tracing: filter fix for TRACE_EVENT_FORMAT events
    ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
    x86: kretprobe-booster interrupt emulation code fix
    ...

    Fix up trivial conflicts in
    arch/parisc/include/asm/ftrace.h
    include/linux/memory.h
    kernel/extable.c
    kernel/module.c

    Linus Torvalds
     

03 Apr, 2009

1 commit

  • Implement a sysctl file that disables module-loading system-wide since
    there is no longer a viable way to remove CAP_SYS_MODULE after the system
    bounding capability set was removed in 2.6.25.

    Value can only be set to "1", and is tested only if standard capability
    checks allow CAP_SYS_MODULE. Given existing /dev/mem protections, this
    should allow administrators a one-way method to block module loading
    after initial boot-time module loading has finished.

    Signed-off-by: Kees Cook
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    Kees Cook
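
    The one-way behaviour described above can be sketched as a small latch. This is an illustrative userspace model, not the kernel implementation: write_modules_disabled() and may_load_module() are hypothetical names, and the capability check is reduced to a boolean argument.

```c
#include <errno.h>

/* 0 = module loading allowed, 1 = blocked until reboot. */
static int modules_disabled_flag;

/*
 * Models a write to the sysctl file: only the value 1 is accepted,
 * so the flag can be set but never cleared.
 */
int write_modules_disabled(int value)
{
	if (value != 1)
		return -EINVAL;
	modules_disabled_flag = 1;
	return 0;
}

/*
 * Models the check at module load time. has_cap_sys_module is a
 * hypothetical stand-in for the standard capability check, which is
 * evaluated first; the flag is only consulted if that check passes.
 */
int may_load_module(int has_cap_sys_module)
{
	if (!has_cap_sys_module)
		return -EPERM;
	if (modules_disabled_flag)
		return -EPERM;
	return 0;
}
```

    After write_modules_disabled(1) succeeds, every later may_load_module() call fails, and an attempt to write 0 is rejected rather than re-enabling loading.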
     

02 Apr, 2009

1 commit


31 Mar, 2009

12 commits

  • Impact: minor cleanup.

    I'm not going to neaten anyone else's code, but I'm happy to clean up
    my own.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Kay Sievers discovered that boot times are slowed
    by about half a second because of all the stop_machine_create() calls,
    and he only probes about 40 modules (I have 125 loaded on this laptop).

    We only do stop_machine_create() so we can unlink the module if
    something goes wrong, but it's overkill (and buggy anyway: if
    stop_machine_create() fails we still call stop_machine_destroy()).

    Since we are only protecting against kallsyms (esp. oops) walking the
    list, synchronize_sched() is sufficient (synchronize_rcu() is probably
    sufficient, but we're not in a hurry).

    Kay says of this patch:
    ... no module takes more than 40 millisecs to link now, most of
    them are between 3 and 8 millisecs.

    That looks very different to the numbers without this patch
    and the otherwise same setup, where we get heavy noise in the
    traces and many delays of up to 200 millisecs until linking,
    most of them taking 30+ millisecs.

    Tested-by: Kay Sievers
    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • With CONFIG_MODVERSIONS, we version 'struct module' using a dummy
    export, but other things matter too:

    1) 'struct modversion_info' determines the layout of the __versions section,
    2) 'struct kernel_param' determines the layout of the __params section,
    3) 'struct kernel_symbol' determines __ksymtab*.
    4) 'struct marker' determines __markers.
    5) 'struct tracepoint' determines __tracepoints.

    So we rename 'struct_module' to 'module_layout' and include these in
    the signature. Now that it's general, we can add others later on without
    confusion.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: reduce kernel memory usage

    This patch just takes off the SHF_ALLOC flag on __versions so we don't
    keep them around after module load.

    This saves about 7% of module memory if CONFIG_MODVERSIONS=y.

    Cc: Shawn Bohrer
    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: Message cleanup

    Two of the three calls to try_to_force_load() are not caused by a
    missing version, so change the messages:

    Old:
    : no version for "magic" found: kernel tainted.
    New:
    : bad vermagic: kernel tainted.

    Old:
    : no version for "nocrc" found: kernel tainted.
    New:
    : no versions for exported symbols: kernel tainted.

    Old:
    : no version for "" found: kernel tainted.
    New:
    : : kernel tainted.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: Expose some module.c symbols

    Ksplice uses several functions from module.c in order to resolve
    symbols and implement dependency handling. Calling these functions
    requires holding module_mutex, so it is exported.

    (This is just the module part of a bigger add-exports patch from Tim).

    Cc: Anders Kaseorg
    Cc: Jeff Arnold
    Signed-off-by: Tim Abbott
    Signed-off-by: Rusty Russell

    Tim Abbott
     
  • Impact: New API

    kallsyms_lookup_name only returns the first match that it finds. Ksplice
    needs information about all symbols with a given name in order to correctly
    resolve local symbols.

    kallsyms_on_each_symbol provides a generic mechanism for iterating over the
    kallsyms table.

    Cc: Jeff Arnold
    Cc: Tim Abbott
    Signed-off-by: Anders Kaseorg
    Signed-off-by: Rusty Russell

    Anders Kaseorg
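
    To illustrate why a first-match lookup is not enough when several local symbols share a name, here is a minimal userspace sketch of a callback-based iterator. The table contents, struct sym, on_each_symbol(), lookup_first() and count_match() are all hypothetical; the real kernel function has its own signature, which this sketch does not attempt to reproduce.

```c
#include <stddef.h>
#include <string.h>

struct sym {
	const char *name;
	unsigned long addr;
};

/* Hypothetical symbol table containing a duplicated local name. */
static const struct sym symtab[] = {
	{ "init_once", 0xc0100000UL },
	{ "do_open",   0xc0100400UL },
	{ "init_once", 0xc0200000UL },
};

typedef int (*sym_fn)(void *data, const char *name, unsigned long addr);

/* Call fn for every symbol; stop early if fn returns non-zero. */
int on_each_symbol(sym_fn fn, void *data)
{
	size_t i;
	int ret;

	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
		ret = fn(data, symtab[i].name, symtab[i].addr);
		if (ret)
			return ret;
	}
	return 0;
}

/* First-match lookup: returns the first address found, or 0. */
unsigned long lookup_first(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (strcmp(symtab[i].name, name) == 0)
			return symtab[i].addr;
	return 0;
}

struct match_count {
	const char *name;
	int count;
};

/* Callback: counts every symbol whose name equals m->name. */
int count_match(void *data, const char *name, unsigned long addr)
{
	struct match_count *m = data;

	(void)addr;
	if (strcmp(name, m->name) == 0)
		m->count++;
	return 0;
}
```

    With this table, lookup_first("init_once") sees only one address, while on_each_symbol() with count_match() visits both entries of that name.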
     
  • Impact: Replace and remove risky (non-EXPORTed) API

    module_text_address() returns a pointer to the module, which, given locking
    improvements in module.c, is useless except to test for NULL:

    1) If the module can't go away, use __module_text_address.
    2) Otherwise, just use is_module_text_address().

    Cc: linux-mtd@lists.infradead.org
    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: New API, cleanup

    ksplice wants to know the bounds of a module, not just the module text.

    It makes sense to have __module_address. We then implement
    is_module_address and __module_text_address in terms of this (and
    change is_module_text_address() to bool while we're at it).

    Also, add proper kerneldoc for them all.

    Cc: Anders Kaseorg
    Cc: Jeff Arnold
    Cc: Tim Abbott
    Signed-off-by: Rusty Russell

    Rusty Russell
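
    The layering described above (one general address lookup, with the boolean test and the text-only lookup built on top of it) might look roughly like this in a simplified userspace model. Everything here is an assumption made for illustration: struct fake_module has a single region with the text at its start, and locking is ignored entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of a loaded module. */
struct fake_module {
	unsigned long base;      /* start of the module's memory */
	unsigned long size;      /* total size: text followed by data */
	unsigned long text_size; /* size of the text at the start */
};

static struct fake_module modules[] = {
	{ 0x10000UL, 0x4000UL, 0x1000UL },
	{ 0x20000UL, 0x2000UL, 0x0800UL },
};

/* General lookup: the module containing addr, or NULL. */
struct fake_module *module_address(unsigned long addr)
{
	size_t i;

	for (i = 0; i < sizeof(modules) / sizeof(modules[0]); i++) {
		struct fake_module *m = &modules[i];

		if (addr >= m->base && addr - m->base < m->size)
			return m;
	}
	return NULL;
}

/* Boolean test, implemented in terms of the general lookup. */
bool is_module_address(unsigned long addr)
{
	return module_address(addr) != NULL;
}

/* Text-only lookup: reject addresses past the module's text. */
struct fake_module *module_text_address(unsigned long addr)
{
	struct fake_module *m = module_address(addr);

	if (m && addr - m->base >= m->text_size)
		m = NULL;
	return m;
}
```

    An address in a module's data area is found by module_address() but rejected by module_text_address(), which is the distinction the commit draws between module bounds and module text.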
     
  • Impact: Cleanup, internal API change

    Ksplice needs access to the kernel_symbol structure in order to support
    modifications to the exported symbol table.

    Cc: Anders Kaseorg
    Cc: Jeff Arnold
    Signed-off-by: Tim Abbott
    Signed-off-by: Rusty Russell (bugfix and style)

    Tim Abbott
     
  • Impact: cleanup

    Label 'free_init' is only used when defined(CONFIG_MODULE_UNLOAD) &&
    defined(CONFIG_SMP), so move it inside to shut up gcc.

    Signed-off-by: WANG Cong
    Cc: Rusty Russell
    Signed-off-by: Rusty Russell

    Américo Wang
     
  • Impact: fix crash on reading from /sys/module/.../ieee80211_default_rc_algo

    The module_param type "charp" simply sets a char * pointer in the
    module to the parameter in the commandline string: this is why we keep
    the (mangled) module command line around. But when set via sysfs (as
    about 11 charp parameters can be) this memory is freed on the way
    out of the write(). Future reads hit random mem.

    So we kstrdup instead: we have to check we're not in early commandline
    parsing, and we have to note when we've used it so we can reliably
    kfree the parameter when it's next overwritten, and also on module
    unload.

    (Thanks to Randy Dunlap for CONFIG_SYSFS=n fixes)

    Reported-by: Sitsofe Wheeler
    Diagnosed-by: Frederic Weisbecker
    Tested-by: Frederic Weisbecker
    Tested-by: Christof Schmitt
    Signed-off-by: Rusty Russell

    Rusty Russell
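
    The ownership change described above can be sketched in userspace C, with malloc()/free() standing in for the kernel allocator. This is a simplified illustration only: struct charp_param, set_charp() and release_charp() are hypothetical names, and the early-commandline special case mentioned in the commit is not modelled.

```c
#include <stdlib.h>
#include <string.h>

struct charp_param {
	char *value;
	int owned; /* non-zero if value was allocated by set_charp() */
};

/*
 * Store a private copy of val so the parameter never aliases the
 * caller's buffer, and free the previous copy if we allocated it.
 * Returns 0 on success, -1 if the allocation fails.
 */
int set_charp(struct charp_param *p, const char *val)
{
	size_t len = strlen(val) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, val, len);
	if (p->owned)
		free(p->value);
	p->value = copy;
	p->owned = 1;
	return 0;
}

/* Counterpart for module unload: release the copy if we own it. */
void release_charp(struct charp_param *p)
{
	if (p->owned)
		free(p->value);
	p->value = NULL;
	p->owned = 0;
}
```

    The owned flag plays the role of "noting when we've used it": a value that still points at the original command line must not be freed, while a copied value must be.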
     

28 Mar, 2009

1 commit


25 Mar, 2009

1 commit

  • This patch combines Greg Banks's dprintk() work with the existing dynamic
    printk patchset; we are now calling it 'dynamic debug'.

    The new feature of this patchset is a richer /debugfs control file interface
    (an example output from my system is at the bottom), which allows fine-grained
    control over the debug output. The output can be controlled by function,
    file, module, format string, and line number.

    For example, to enable all debug messages in module 'nf_conntrack':

    echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control

    To disable them:

    echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control

    A further explanation can be found in the documentation patch.

    Signed-off-by: Greg Banks
    Signed-off-by: Jason Baron
    Signed-off-by: Greg Kroah-Hartman

    Jason Baron
     

20 Mar, 2009

1 commit


18 Mar, 2009

1 commit

  • Impact: fix ref-after-free crash on failed module load

    Fix refptr bug: change the refptr allocation and release order so that the
    module data structure pointed to by 'mod' is not accessed after
    mod->module_core has been freed. This bug causes a kernel panic (e.g. when
    module loading fails because of undefined symbols).

    This bug was reported on systemtap bugzilla.
    http://sources.redhat.com/bugzilla/show_bug.cgi?id=9927

    Signed-off-by: Masami Hiramatsu
    Cc: Eric Dumazet
    Signed-off-by: Rusty Russell

    Masami Hiramatsu
     

10 Mar, 2009

1 commit


06 Mar, 2009

2 commits

  • Conflicts:
    arch/x86/Kconfig
    block/blktrace.c
    kernel/irq/handle.c

    Semantic conflict:
    kernel/trace/blktrace.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Impact: add reserved allocation functionality and use it for module
    percpu variables

    This patch implements reserved allocation from the first chunk. When
    setting up the first chunk, arch can ask to set aside a certain number
    of bytes right after the core static area which is available only
    through a separate reserved allocator. This will be used primarily
    for module static percpu variables on architectures with limited
    relocation range to ensure that the module percpu symbols are inside
    the relocatable range.

    If reserved area is requested, the first chunk becomes reserved and
    isn't available for regular allocation. If the first chunk also
    includes piggy-back dynamic allocation area, a separate chunk mapping
    the same region is created to serve dynamic allocation. The first one
    is called static first chunk and the second dynamic first chunk.
    Although they share the page map, their different area map
    initializations guarantee they serve disjoint areas according to their
    purposes.

    If arch doesn't set up a reserved area, reserved allocation is handled
    like any other allocation.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

20 Feb, 2009

2 commits

  • Impact: new scalable dynamic percpu allocator which allows dynamic
    percpu areas to be accessed the same way as static ones

    Implement scalable dynamic percpu allocator which can be used for both
    static and dynamic percpu areas. This will allow static and dynamic
    areas to share faster direct access methods. This feature is optional
    and enabled only when CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is defined by
    arch. Please read comment on top of mm/percpu.c for details.

    Signed-off-by: Tejun Heo
    Cc: Andrew Morton

    Tejun Heo
     
  • Impact: cleanup

    Move percpu_modinit() upwards. This is to ease further changes.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

09 Feb, 2009

1 commit

  • When the function graph tracer picks a return address, it ensures this address
    is really a kernel text one by calling __kernel_text_address().

    Actually this path has never been taken. Its role was more likely to debug the
    tracer at the beginning of its development, but this check is wasteful since it
    is called for every traced function.

    The fault check is already sufficient.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

03 Feb, 2009

1 commit

  • Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is
    using a lot of memory.

    Each 'struct module' contains an [NR_CPUS] array of full cache lines.

    This patch uses existing infrastructure (percpu_modalloc() &
    percpu_modfree()) to allocate percpu space for the refcount storage.

    Instead of wasting NR_CPUS*128 bytes (on i386), we now use
    nr_cpu_ids*sizeof(local_t) bytes.

    On a typical distro, where NR_CPUS=8, shipping 2000 modules, we reduce
    the size of module files by about 2 Mbytes. (1Kb per module)

    Instead of having all refcounters in the same memory node - with TLB misses
    because of vmalloc() - this new implementation permits better
    NUMA properties, since each CPU will use storage on its preferred node,
    thanks to percpu storage.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Rusty Russell
    Signed-off-by: Linus Torvalds

    Eric Dumazet
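
    The figures quoted above can be checked with a short calculation. The two helper functions below are hypothetical and simply encode the two formulas from the commit message; the 4-byte counter size used in the note that follows is an assumption for a 32-bit build, not a value stated in the commit.

```c
#include <stddef.h>

/*
 * Old layout: one padded cache line per possible CPU, embedded in
 * every 'struct module' (NR_CPUS * cache line size).
 */
size_t old_refcount_bytes(size_t nr_cpus, size_t cache_line)
{
	return nr_cpus * cache_line;
}

/*
 * New layout: one counter per CPU id, taken from percpu storage
 * (nr_cpu_ids * sizeof(local_t)).
 */
size_t new_refcount_bytes(size_t nr_cpu_ids, size_t counter_size)
{
	return nr_cpu_ids * counter_size;
}
```

    With NR_CPUS=8 and 128-byte lines, old_refcount_bytes(8, 128) is 1024 bytes, i.e. the 1Kb per module quoted above; multiplied by 2000 modules that is 2,048,000 bytes, or about 2 Mbytes. Assuming 8 CPU ids and a 4-byte counter, new_refcount_bytes(8, 4) is 32 bytes, and that space is allocated at load time rather than stored in the module file.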
     

14 Jan, 2009

1 commit


08 Jan, 2009

1 commit

  • Right now, most of the kernel boot is strictly synchronous, such that
    various hardware delays are done sequentially.

    In order to make the kernel boot faster, this patch introduces
    infrastructure to allow doing some of the initialization steps
    asynchronously, which will hide significant portions of the hardware delays
    in practice.

    In order to not change device order and other similar observables, this
    patch does NOT do full parallel initialization.

    Rather, it operates more in the way an out of order CPU does; the work may
    be done out of order and asynchronous, but the observable effects
    (instruction retiring for the CPU) are still done in the original sequence.

    Signed-off-by: Arjan van de Ven

    Arjan van de Ven
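
    The "out of order execution, in order retirement" idea can be modelled with sequence cookies: work may finish in any order, but a job only publishes its observable effect once every job submitted before it has finished. The sketch below is a deterministic, single-threaded model under that assumption; submit_job(), finish_job() and may_commit() are hypothetical names and do not reproduce the kernel's actual interface.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_JOBS 8

static bool done[MAX_JOBS]; /* done[c]: job with cookie c has finished */
static size_t next_cookie;  /* cookies are handed out in submission order */

/*
 * Register a job and return its sequence cookie.
 * Returns MAX_JOBS (an invalid cookie) if the table is full.
 */
size_t submit_job(void)
{
	if (next_cookie >= MAX_JOBS)
		return MAX_JOBS;
	return next_cookie++;
}

/* Record that the job's work has finished, possibly out of order. */
void finish_job(size_t cookie)
{
	if (cookie < MAX_JOBS)
		done[cookie] = true;
}

/*
 * A job may publish its observable effect (for example, registering a
 * device) only when all jobs submitted before it have finished.
 */
bool may_commit(size_t cookie)
{
	size_t i;

	for (i = 0; i < cookie && i < MAX_JOBS; i++)
		if (!done[i])
			return false;
	return true;
}
```

    If three jobs are submitted and the third finishes first, may_commit() for the third stays false until the first two have finished, so effects such as device order remain the same as in a fully sequential boot.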