08 Dec, 2008

1 commit

  • Impact: trace more functions

    When the function graph tracer is configured, three more files are
    excluded from tracing just to keep four functions from being traced.
    This impacts the normal function tracer too.

    arch/x86/kernel/process_64/32.c:

    I had crashes when I let this file be traced. After some debugging, I saw
    that the "current" task pointer was changed inside __switch_to(), i.e.
    "write_pda(pcurrent, next_p);" inside process_64.c. Since the tracer stores
    the original return address of the function inside current, we had
    crashes. Only __switch_to() has to be excluded from tracing.

    kernel/module.c and kernel/extable.c:

    Because of a function used internally by the function graph tracer:
    __kernel_text_address()

    To let the other functions inside these files be traced, this patch
    introduces the __notrace_funcgraph function prefix, which is __notrace if
    the function graph tracer is configured and nothing if not.
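
    A minimal sketch of how such a prefix can be built, assuming GCC's
    no_instrument_function attribute as the underlying "do not trace"
    marker. DEMO_NOTRACE and demo_text_address() are illustrative names,
    not the kernel's definitions:

```c
#include <assert.h>

/* Assumption: stand-in for the kernel's "do not trace" annotation. */
#define DEMO_NOTRACE __attribute__((no_instrument_function))

/*
 * Expands to the annotation only when the function graph tracer is
 * configured; otherwise it expands to nothing and the function stays
 * traceable.
 */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
#define __notrace_funcgraph DEMO_NOTRACE
#else
#define __notrace_funcgraph
#endif

/* Hypothetical helper of the kind the graph tracer calls internally. */
__notrace_funcgraph int demo_text_address(unsigned long addr,
					  unsigned long start,
					  unsigned long end)
{
	assert(start <= end);
	return addr >= start && addr < end;
}
```

    Building with -DCONFIG_FUNCTION_GRAPH_TRACER switches the annotation
    on for this one function without excluding the rest of the file.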

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

17 Nov, 2008

1 commit


16 Nov, 2008

3 commits

  • Impact: cleanup

    Use module notifiers for tracepoint updates rather than adding a hook in
    module.c.

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar

    Mathieu Desnoyers
     
  • Impact: cleanup

    Use module notifiers instead of adding a hook in module.c.

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar

    Mathieu Desnoyers
     
  • Impact: allow archs more flexibility on dynamic ftrace implementations

    Dynamic ftrace has largely been developed on x86. Since x86 does not
    have the same limitations as other architectures, the ftrace interaction
    between the generic code and the architecture specific code was not
    flexible enough to handle some of the issues that other architectures
    have.

    Most notably, module trampolines. Due to the limited branch distance
    that archs make in calling kernel core code from modules, the module
    load code must create a trampoline to jump to what will make the
    larger jump into core kernel code.

    The problem arises when this happens to a call to mcount. Ftrace checks
    all code before modifying it and makes sure the current code is what
    it expects. Right now, there is not enough information to handle modifying
    module trampolines.

    This patch changes the API between generic dynamic ftrace code and
    the arch dependent code. There are now two functions for modifying code:

    ftrace_make_nop(mod, rec, addr) - convert the code at rec->ip into
    a nop, where the original text is calling addr. (mod is the
    module struct if called by module init)

    ftrace_make_caller(rec, addr) - convert the code at rec->ip that should
    be a nop into a caller to addr.

    The record "rec" now has a new field called "arch" where the architecture
    can add any special attributes to each call site record.
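
    A user-space sketch of the check-then-modify contract described above,
    operating on a plain byte buffer instead of kernel text. The demo_*
    names, the 5-byte instruction size and the simplified call encoding
    (opcode plus absolute 32-bit address) are assumptions for
    illustration; the mod argument is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_INSN_SIZE = 5 };

/* Call-site record; "arch" is free for architecture-specific use. */
struct demo_rec {
	size_t ip;		/* offset of the call site in the buffer */
	unsigned long arch;	/* opaque per-architecture attributes */
};

static const unsigned char demo_nop[DEMO_INSN_SIZE] = {
	0x0f, 0x1f, 0x44, 0x00, 0x00	/* x86 5-byte nop */
};

/* Simplified encoding: 0xe8 plus the target (real calls are relative). */
void demo_encode_call(unsigned char out[DEMO_INSN_SIZE], uint32_t addr)
{
	out[0] = 0xe8;
	memcpy(out + 1, &addr, sizeof(addr));
}

/* Replace the site only if it currently holds the expected bytes. */
static int demo_modify(unsigned char *text, const struct demo_rec *rec,
		       const unsigned char *expect,
		       const unsigned char *replace)
{
	assert(text && rec);
	if (memcmp(text + rec->ip, expect, DEMO_INSN_SIZE) != 0)
		return -1;	/* unexpected code: refuse to touch it */
	memcpy(text + rec->ip, replace, DEMO_INSN_SIZE);
	return 0;
}

/* Turn "call addr" at rec->ip into a nop. */
int demo_make_nop(unsigned char *text, struct demo_rec *rec, uint32_t addr)
{
	unsigned char call[DEMO_INSN_SIZE];

	demo_encode_call(call, addr);
	return demo_modify(text, rec, call, demo_nop);
}

/* Turn the nop at rec->ip into "call addr". */
int demo_make_caller(unsigned char *text, struct demo_rec *rec, uint32_t addr)
{
	unsigned char call[DEMO_INSN_SIZE];

	demo_encode_call(call, addr);
	return demo_modify(text, rec, demo_nop, call);
}
```

    Both directions fail with -1 when the bytes at rec->ip are not what
    the caller claims, which mirrors the "check before modifying" rule.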

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

24 Oct, 2008

1 commit

  • * 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits)
    proc: remove fs/proc/proc_misc.c
    proc: move /proc/vmcore creation to fs/proc/vmcore.c
    proc: move pagecount stuff to fs/proc/page.c
    proc: move all /proc/kcore stuff to fs/proc/kcore.c
    proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
    proc: move /proc/modules boilerplate to kernel/module.c
    proc: move /proc/diskstats boilerplate to block/genhd.c
    proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
    proc: move /proc/vmstat boilerplate to mm/vmstat.c
    proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
    proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
    proc: move /proc/vmallocinfo to mm/vmalloc.c
    proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
    proc: move /proc/slab_allocators boilerplate to mm/slab.c
    proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
    proc: move /proc/stat to fs/proc/stat.c
    proc: move rest of /proc/partitions code to block/genhd.c
    proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
    proc: move /proc/devices code to fs/proc/devices.c
    proc: move rest of /proc/locks to fs/locks.c
    ...

    Linus Torvalds
     

23 Oct, 2008

1 commit


22 Oct, 2008

2 commits

  • Remove stop_machine during module load v2

    module loading currently does a stop_machine on each module load to insert
    the module into the global module lists. Especially on larger systems this
    can be quite expensive.

    It does that to handle concurrent lockless module list readers
    like kallsyms.

    I don't think stop_machine() is actually needed to insert something
    into a list though. There are no concurrent writers because the
    module mutex is taken. And the RCU list functions know how to insert
    a node into a list with the right memory ordering so that concurrent
    readers don't go off into the woods.

    So remove the stop_machine for the module list insert and just
    do a list_add_rcu() instead.

    Module removal will still do a stop_machine of course, it needs
    that for other reasons.
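
    The publish-with-ordering idea can be sketched in user space with C11
    atomics. This is not the kernel's list_add_rcu(); demo_module and the
    demo_* functions are hypothetical, and a single writer (the holder of
    the module mutex) is assumed:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

struct demo_module {
	const char *name;
	struct demo_module *_Atomic next;
};

static struct demo_module *_Atomic demo_modules;	/* list head */

/* Writer side: fully initialise the node, then publish it. */
void demo_module_add(struct demo_module *mod)
{
	struct demo_module *head;

	assert(mod && mod->name);
	head = atomic_load_explicit(&demo_modules, memory_order_relaxed);
	atomic_store_explicit(&mod->next, head, memory_order_relaxed);
	/* Release: a reader that sees mod also sees its name and next. */
	atomic_store_explicit(&demo_modules, mod, memory_order_release);
}

/* Reader side: lockless walk, safe against a concurrent insert. */
struct demo_module *demo_module_find(const char *name)
{
	struct demo_module *m;

	for (m = atomic_load_explicit(&demo_modules, memory_order_acquire);
	     m != NULL;
	     m = atomic_load_explicit(&m->next, memory_order_acquire)) {
		if (strcmp(m->name, name) == 0)
			return m;
	}
	return NULL;
}
```

    Removal is deliberately absent from the sketch: freeing a node while
    readers may still hold a pointer to it needs more than ordered stores.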

    v2: Revised readers based on Paul's comments. All readers that only
    rely on disabled preemption need to be changed to list_for_each_rcu().
    Done that. The others are ok because they have the modules mutex.
    Also added a possible missing preempt disable for print_modules().

    [cc Paul McKenney for review. It's not RCU, but quite similar.]

    Acked-by: Paul E. McKenney
    Signed-off-by: Rusty Russell

    Andi Kleen
     
  • Linus' recent catch of stack overflow in load_module led me to look
    at the code. A couple of helpers to get a section address and get
    objects from a section can help clean things up a little.

    (And in case you're wondering, the stack size also dropped from 328 to
    284 bytes).

    Signed-off-by: Rusty Russell

    Rusty Russell
     

21 Oct, 2008

1 commit

  • …l/git/tip/linux-2.6-tip

    * 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (131 commits)
    tracing/fastboot: improve help text
    tracing/stacktrace: improve help text
    tracing/fastboot: fix initcalls disposition in bootgraph.pl
    tracing/fastboot: fix bootgraph.pl initcall name regexp
    tracing/fastboot: fix issues and improve output of bootgraph.pl
    tracepoints: synchronize unregister static inline
    tracepoints: tracepoint_synchronize_unregister()
    ftrace: make ftrace_test_p6nop disassembler-friendly
    markers: fix synchronize marker unregister static inline
    tracing/fastboot: add better resolution to initcall debug/tracing
    trace: add build-time check to avoid overrunning hex buffer
    ftrace: fix hex output mode of ftrace
    tracing/fastboot: fix initcalls disposition in bootgraph.pl
    tracing/fastboot: fix printk format typo in boot tracer
    ftrace: return an error when setting a nonexistent tracer
    ftrace: make some tracers reentrant
    ring-buffer: make reentrant
    ring-buffer: move page indexes into page headers
    tracing/fastboot: only trace non-module initcalls
    ftrace: move pc counter in irqtrace
    ...

    Manually fix conflicts:
    - init/main.c: initcall tracing
    - kernel/module.c: verbose level vs tracepoints
    - scripts/bootgraph.pl: fallout from cherry-picking commits.

    Linus Torvalds
     

18 Oct, 2008

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (25 commits)
    staging: at76_usb wireless driver
    Staging: workaround build system bug
    Staging: Lindent sxg.c
    Staging: SLICOSS: Call pci_release_regions at driver exit
    Staging: SLICOSS: Fix remaining type names
    Staging: SLICOSS: Fix warnings due to static usage
    Staging: SLICOSS: lots of checkpatch fixes
    Staging: go7007 v4l fixes
    Staging: Fix gcc warnings in sxg
    Staging: add echo cancelation module
    Staging: add wlan-ng prism2 usb driver
    Staging: add w35und wifi driver
    Staging: USB/IP: add host driver
    Staging: USB/IP: add client driver
    Staging: USB/IP: add common functions needed
    Staging: add the go7007 video driver
    Staging: add me4000 pci data collection driver
    Staging: add me4000 firmware files
    Staging: add sxg network driver
    Staging: add Alacritech slicoss network driver
    ...

    Fixed up conflicts due to taint flags changes and MAINTAINERS cleanup in
    MAINTAINERS, include/linux/kernel.h and kernel/panic.c.

    Linus Torvalds
     

17 Oct, 2008

4 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (46 commits)
    UIO: Fix mapping of logical and virtual memory
    UIO: add automata sercos3 pci card support
    UIO: Change driver name of uio_pdrv
    UIO: Add alignment warnings for uio-mem
    Driver core: add bus_sort_breadthfirst() function
    NET: convert the phy_device file to use bus_find_device_by_name
    kobject: Cleanup kobject_rename and !CONFIG_SYSFS
    kobject: Fix kobject_rename and !CONFIG_SYSFS
    sysfs: Make dir and name args to sysfs_notify() const
    platform: add new device registration helper
    sysfs: use ilookup5() instead of ilookup5_nowait()
    PNP: create device attributes via default device attributes
    Driver core: make bus_find_device_by_name() more robust
    usb: turn dev_warn+WARN_ON combos into dev_WARN
    debug: use dev_WARN() rather than WARN_ON() in device_pm_add()
    debug: Introduce a dev_WARN() function
    sysfs: fix deadlock
    device model: Do a quickcheck for driver binding before doing an expensive check
    Driver core: Fix cleanup in device_create_vargs().
    Driver core: Clarify device cleanup.
    ...

    Linus Torvalds
     
  • It's somewhat unlikely that it happens, but right now a race window
    between interrupts or machine checks or oopses could corrupt the tainted
    bitmap because it is modified in a non atomic fashion.

    Convert the taint variable to an unsigned long and use only atomic bit
    operations on it.

    Unfortunately this means the intvec sysctl functions cannot be used on it
    anymore.

    It turned out the taint sysctl handler could actually be simplified a bit
    (since it only increases capabilities) so this patch actually removes
    code.
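
    A user-space sketch of the same idea, with C11 atomics standing in
    for the kernel's atomic bit operations. The demo_* names and the flag
    numbers used with them are illustrative only:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* One bit per taint flag, only ever modified atomically. */
static atomic_ulong demo_tainted_mask;

#define DEMO_TAINT_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Set a flag; a concurrent caller setting another flag cannot lose it. */
void demo_add_taint(unsigned int flag)
{
	assert(flag < DEMO_TAINT_BITS);
	atomic_fetch_or(&demo_tainted_mask, 1UL << flag);
}

/* Returns 1 if the flag is set, 0 otherwise. */
int demo_test_taint(unsigned int flag)
{
	assert(flag < DEMO_TAINT_BITS);
	return (int)((atomic_load(&demo_tainted_mask) >> flag) & 1UL);
}

/* Snapshot of the whole mask. */
unsigned long demo_get_taint(void)
{
	return atomic_load(&demo_tainted_mask);
}
```

    A plain "mask |= bit" is a separate load, or and store, so two
    interrupted updaters could overwrite each other; the fetch-or does
    the whole update as one step.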

    [akpm@linux-foundation.org: remove unneeded include]
    Signed-off-by: Andi Kleen
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • Base infrastructure to enable per-module debug messages.

    I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
    control of debugging statements on a per-module basis in one debugfs file,
    currently <debugfs>/dynamic_printk/modules. When CONFIG_DYNAMIC_PRINTK_DEBUG
    is not set, debugging statements can still be enabled as before, often by
    defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
    effect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.

    The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
    is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
    can be dynamically enabled/disabled on a per-module basis.

    Future plans include extending this functionality to subsystems, that define
    their own debug levels and flags.

    Usage:

    Dynamic debugging is controlled by the debugfs file,
    <debugfs>/dynamic_printk/modules. This file contains a list of the modules that
    can be enabled. The format of the file is as follows:

    <module_name> <enabled=0/1>
    .
    .
    .

    <module_name> : Name of the module in which the debug call resides
    <enabled=0/1> : whether the messages are enabled or not

    For example:

    snd_hda_intel enabled=0
    fixup enabled=1
    driver enabled=0

    Enable a module:

    $echo "set enabled=1 <module_name>" > dynamic_printk/modules

    Disable a module:

    $echo "set enabled=0 <module_name>" > dynamic_printk/modules

    Enable all modules:

    $echo "set enabled=1 all" > dynamic_printk/modules

    Disable all modules:

    $echo "set enabled=0 all" > dynamic_printk/modules

    Finally, passing "dynamic_printk" at the command line enables
    debugging for all modules. This mode can be turned off via the above
    disable command.
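
    A sketch of how commands of this shape could be parsed. The table
    contents come from the example listing above; demo_handle_cmd() and
    demo_is_enabled() are hypothetical, and placing a module name after
    "enabled=N" is an assumption modelled on the "all" form:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct demo_mod {
	const char *name;
	int enabled;
};

static struct demo_mod demo_mods[] = {
	{ "snd_hda_intel", 0 },
	{ "fixup", 1 },
	{ "driver", 0 },
};

#define DEMO_NR_MODS (sizeof(demo_mods) / sizeof(demo_mods[0]))

/* Returns 1 or 0 for a known module, -1 for an unknown one. */
int demo_is_enabled(const char *name)
{
	size_t i;

	assert(name);
	for (i = 0; i < DEMO_NR_MODS; i++)
		if (strcmp(demo_mods[i].name, name) == 0)
			return demo_mods[i].enabled;
	return -1;
}

/* Handle "set enabled=<0|1> <module|all>"; 0 on success, -1 on error. */
int demo_handle_cmd(const char *cmd)
{
	char name[64];
	int val, hits = 0;
	size_t i;

	assert(cmd);
	if (sscanf(cmd, "set enabled=%d %63s", &val, name) != 2)
		return -1;
	if (val != 0 && val != 1)
		return -1;
	for (i = 0; i < DEMO_NR_MODS; i++) {
		if (strcmp(name, "all") == 0 ||
		    strcmp(name, demo_mods[i].name) == 0) {
			demo_mods[i].enabled = val;
			hits++;
		}
	}
	return hits ? 0 : -1;
}
```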

    [gkh: minor cleanups and tweaks to make the build work quietly]

    Signed-off-by: Jason Baron
    Signed-off-by: Greg Kroah-Hartman

    Jason Baron
     
  • Fix "notes" kobject leak

    It happens every rmmod if KALLSYMS=y and SYSFS=y.

    # modprobe foo

    kobject: 'foo' (ffffffffa00743d0): kobject_add_internal: parent: 'module', set: 'module'
    kobject: 'holders' (ffff88017e7c5770): kobject_add_internal: parent: 'foo', set: ''
    kobject: 'foo' (ffffffffa00743d0): kobject_uevent_env
    kobject: 'foo' (ffffffffa00743d0): fill_kobj_path: path = '/module/foo'
    kobject: 'notes' (ffff88017fa9b668): kobject_add_internal: parent: 'foo', set: ''
    ^^^^^

    # rmmod foo

    kobject: 'holders' (ffff88017e7c5770): kobject_cleanup
    kobject: 'holders' (ffff88017e7c5770): auto cleanup kobject_del
    kobject: 'holders' (ffff88017e7c5770): calling ktype release
    kobject: (ffff88017e7c5770): dynamic_kobj_release
    kobject: 'holders': free name
    kobject: 'foo' (ffffffffa00743d0): kobject_cleanup
    kobject: 'foo' (ffffffffa00743d0): does not have a release() function, it is broken and must be fixed.
    kobject: 'foo' (ffffffffa00743d0): auto cleanup 'remove' event
    kobject: 'foo' (ffffffffa00743d0): kobject_uevent_env
    kobject: 'foo' (ffffffffa00743d0): fill_kobj_path: path = '/module/foo'
    kobject: 'foo' (ffffffffa00743d0): auto cleanup kobject_del
    kobject: 'foo': free name

    [whooops]

    Signed-off-by: Alexey Dobriyan
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Alexey Dobriyan
     

14 Oct, 2008

3 commits

  • When a mcount pointer is recorded into a table, it is used to add or
    remove calls to mcount (replacing them with nops). If the code is removed
    via removing a module, the pointers still exist. When modifying the code,
    a check is always made to make sure the code being replaced is the code
    expected. In other words, the code being replaced is compared to what
    it is expected to be before being replaced.

    There is a very small chance that the code being replaced just happens
    to look like code that calls mcount (very small since the call to mcount
    is relative). To remove this chance, this patch adds ftrace_release to
    allow module unloading to remove the pointers to mcount within the module.
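
    The effect of releasing records on unload can be sketched as pruning
    a table of recorded call-site addresses by address range. The demo_*
    names and the fixed-size table are assumptions, not the ftrace data
    structures:

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_MAX_RECS = 16 };

static unsigned long demo_recs[DEMO_MAX_RECS];	/* recorded mcount sites */
static size_t demo_nr_recs;

/* Remember one call site; -1 if the table is full. */
int demo_record(unsigned long ip)
{
	if (demo_nr_recs == DEMO_MAX_RECS)
		return -1;
	demo_recs[demo_nr_recs++] = ip;
	return 0;
}

/* Drop every record inside [start, start + size): the unloaded module. */
size_t demo_release(unsigned long start, unsigned long size)
{
	size_t i, kept = 0, dropped;

	assert(start + size >= start);	/* range must not wrap */
	for (i = 0; i < demo_nr_recs; i++) {
		unsigned long ip = demo_recs[i];

		if (ip >= start && ip < start + size)
			continue;	/* stale pointer: forget it */
		demo_recs[kept++] = ip;
	}
	dropped = demo_nr_recs - kept;
	demo_nr_recs = kept;
	return dropped;
}

size_t demo_record_count(void)
{
	return demo_nr_recs;
}
```

    Once the stale entries are gone, no later pass can mistake unrelated
    bytes at a reused address for a call to mcount.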

    Another change for init calls is made to not trace calls marked with
    __init. The tracing can not be started until after init is done anyway.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • This patch enables the loading of the __mcount_section of modules and
    changing all the callers of mcount into nops.

    The modification is done before the init_module function is called, so
    again, we do not need to use kstop_machine to make these changes.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Implementation of kernel tracepoints. Inspired by the Linux Kernel
    Markers. Allows complete typing verification by declaring both tracing
    statement inline functions and probe registration/unregistration static
    inline functions within the same macro "DEFINE_TRACE". No format string
    is required. See the tracepoint Documentation and Samples patches for
    usage examples.

    Taken from the documentation patch :

    "A tracepoint placed in code provides a hook to call a function (probe)
    that you can provide at runtime. A tracepoint can be "on" (a probe is
    connected to it) or "off" (no probe is attached). When a tracepoint is
    "off" it has no effect, except for adding a tiny time penalty (checking
    a condition for a branch) and space penalty (adding a few bytes for the
    function call at the end of the instrumented function and adds a data
    structure in a separate section). When a tracepoint is "on", the
    function you provide is called each time the tracepoint is executed, in
    the execution context of the caller. When the function provided ends its
    execution, it returns to the caller (continuing from the tracepoint
    site).

    You can put tracepoints at important locations in the code. They are
    lightweight hooks that can pass an arbitrary number of parameters, which
    prototypes are described in a tracepoint declaration placed in a header
    file."
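
    A deliberately simplified sketch of the macro idea: one macro emits
    both the tracing call and the registration functions, so the compiler
    checks the probe's prototype against the declaration.
    DEMO_DEFINE_TRACE and the demo_* names are hypothetical; only a single
    probe per tracepoint is supported here and no RCU synchronization is
    attempted:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PROTO(...) __VA_ARGS__
#define DEMO_ARGS(...)  __VA_ARGS__

/* Emits the probe slot, the trace call and (un)registration helpers. */
#define DEMO_DEFINE_TRACE(name, proto, args)				\
	static void (*demo_probe_##name)(proto);			\
	static inline void demo_trace_##name(proto)			\
	{								\
		if (demo_probe_##name)					\
			demo_probe_##name(args);			\
	}								\
	static inline int demo_register_##name(void (*probe)(proto))	\
	{								\
		assert(probe);						\
		if (demo_probe_##name)					\
			return -1;					\
		demo_probe_##name = probe;				\
		return 0;						\
	}								\
	static inline void demo_unregister_##name(void)		\
	{								\
		demo_probe_##name = NULL;				\
	}

DEMO_DEFINE_TRACE(sched_switch,
		  DEMO_PROTO(int prev_pid, int next_pid),
		  DEMO_ARGS(prev_pid, next_pid))

/* Example probe: it must match the declared prototype exactly. */
int demo_hits;
int demo_last_next = -1;

void demo_count_switch(int prev_pid, int next_pid)
{
	(void)prev_pid;
	demo_last_next = next_pid;
	demo_hits++;
}
```

    With no probe registered, demo_trace_sched_switch() costs one load
    and one branch, which is the "off" case the quoted text describes.
    Passing a probe with a different signature to
    demo_register_sched_switch() draws a compiler diagnostic.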

    Addition and removal of tracepoints is synchronized by RCU using the
    scheduler (and preempt_disable) as guarantees to find a quiescent state
    (this is really RCU "classic"). The update side uses rcu_barrier_sched()
    with call_rcu_sched() and the read/execute side uses
    "preempt_disable()/preempt_enable()".

    We make sure the previous array containing probes, which has been
    scheduled for deletion by the rcu callback, is indeed freed before we
    proceed to the next update. It therefore limits the rate of modification
    of a single tracepoint to one update per RCU period. The objective here
    is to permit fast batch add/removal of probes on _different_
    tracepoints.

    Changelog :
    - Use #name ":" #proto as string to identify the tracepoint in the
    tracepoint table. This will make sure no type mismatch happens due to
    connection of a probe with the wrong type to a tracepoint declared with
    the same name in a different header.
    - Add tracepoint_entry_free_old.
    - Change __TO_TRACE to get rid of the 'i' iterator.

    Masami Hiramatsu :
    Tested on x86-64.

    Performance impact of a tracepoint : same as markers, except that it
    adds about 70 bytes of instructions in an unlikely branch of each
    instrumented function (the for loop, the stack setup and the function
    call). It currently adds a memory read, a test and a conditional branch
    at the instrumentation site (in the hot path). Immediate values will
    eventually change this into a load immediate, test and branch, which
    removes the memory read which will make the i-cache impact smaller
    (changing the memory read for a load immediate removes 3-4 bytes per
    site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it
    also saves the d-cache hit).

    About the performance impact of tracepoints (which is comparable to
    markers), even without immediate values optimizations, tests done by
    Hideo Aoki on ia64 show no regression. His test case was using hackbench
    on a kernel where scheduler instrumentation (about 5 events in
    scheduler code) was added.

    Quoting Hideo Aoki about Markers :

    I evaluated overhead of kernel marker using linux-2.6-sched-fixes git
    tree, which includes several markers for LTTng, using an ia64 server.

    While the immediate trace mark feature isn't implemented on ia64, there
    is no major performance regression. So, I think that we don't have any
    issues to propose merging marker point patches into Linus's tree from
    the viewpoint of performance impact.

    I prepared two kernels to evaluate. The first one was compiled without
    CONFIG_MARKERS. The second one had CONFIG_MARKERS enabled.

    I downloaded the original hackbench from the following URL:
    http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c

    I ran hackbench 5 times in each condition and calculated the average and
    difference between the kernels.

    The parameter of hackbench: every 50 from 50 to 800
    The number of CPUs of the server: 2, 4, and 8

    Below are the results. As you can see, no major performance regression
    was found in any case. Even if the number of processes increases, the
    differences between the marker-enabled kernel and the marker-disabled
    kernel don't increase. Moreover, if the number of CPUs increases, the
    differences don't increase either.

    Curiously, marker-enabled kernel is better than marker-disabled kernel
    in more than half cases, although I guess it comes from the difference
    of memory access pattern.

    * 2 CPUs

    Number of | without      | with         | diff   | diff  |
    processes | Marker [Sec] | Marker [Sec] | [Sec]  | [%]   |
    ----------------------------------------------------------
           50 |        4.811 |        4.872 | +0.061 | +1.27 |
          100 |        9.854 |       10.309 | +0.454 | +4.61 |
          150 |       15.602 |       15.040 | -0.562 |  -3.6 |
          200 |       20.489 |       20.380 | -0.109 | -0.53 |
          250 |       25.798 |       25.652 | -0.146 | -0.56 |
          300 |       31.260 |       30.797 | -0.463 | -1.48 |
          350 |       36.121 |       35.770 | -0.351 | -0.97 |
          400 |       42.288 |       42.102 | -0.186 | -0.44 |
          450 |       47.778 |       47.253 | -0.526 |  -1.1 |
          500 |       51.953 |       52.278 | +0.325 | +0.63 |
          550 |       58.401 |       57.700 | -0.701 |  -1.2 |
          600 |       63.334 |       63.222 | -0.112 | -0.18 |
          650 |       68.816 |       68.511 | -0.306 | -0.44 |
          700 |       74.667 |       74.088 | -0.579 | -0.78 |
          750 |       78.612 |       79.582 | +0.970 | +1.23 |
          800 |       85.431 |       85.263 | -0.168 |  -0.2 |
    ----------------------------------------------------------

    * 4 CPUs

    Number of | without      | with         | diff   | diff  |
    processes | Marker [Sec] | Marker [Sec] | [Sec]  | [%]   |
    ----------------------------------------------------------
           50 |        2.586 |        2.584 | -0.003 |  -0.1 |
          100 |        5.254 |        5.283 | +0.030 | +0.56 |
          150 |        8.012 |        8.074 | +0.061 | +0.76 |
          200 |       11.172 |       11.000 | -0.172 | -1.54 |
          250 |       13.917 |       14.036 | +0.119 | +0.86 |
          300 |       16.905 |       16.543 | -0.362 | -2.14 |
          350 |       19.901 |       20.036 | +0.135 | +0.68 |
          400 |       22.908 |       23.094 | +0.186 | +0.81 |
          450 |       26.273 |       26.101 | -0.172 | -0.66 |
          500 |       29.554 |       29.092 | -0.461 | -1.56 |
          550 |       32.377 |       32.274 | -0.103 | -0.32 |
          600 |       35.855 |       35.322 | -0.533 | -1.49 |
          650 |       39.192 |       38.388 | -0.804 | -2.05 |
          700 |       41.744 |       41.719 | -0.025 | -0.06 |
          750 |       45.016 |       44.496 | -0.520 | -1.16 |
          800 |       48.212 |       47.603 | -0.609 | -1.26 |
    ----------------------------------------------------------

    * 8 CPUs

    Number of | without      | with         | diff   | diff  |
    processes | Marker [Sec] | Marker [Sec] | [Sec]  | [%]   |
    ----------------------------------------------------------
           50 |        2.094 |        2.072 | -0.022 | -1.07 |
          100 |        4.162 |        4.273 | +0.111 | +2.66 |
          150 |        6.485 |        6.540 | +0.055 | +0.84 |
          200 |        8.556 |        8.478 | -0.078 | -0.91 |
          250 |       10.458 |       10.258 | -0.200 | -1.91 |
          300 |       12.425 |       12.750 | +0.325 | +2.62 |
          350 |       14.807 |       14.839 | +0.032 | +0.22 |
          400 |       16.801 |       16.959 | +0.158 | +0.94 |
          450 |       19.478 |       19.009 | -0.470 | -2.41 |
          500 |       21.296 |       21.504 | +0.208 | +0.98 |
          550 |       23.842 |       23.979 | +0.137 | +0.57 |
          600 |       26.309 |       26.111 | -0.198 | -0.75 |
          650 |       28.705 |       28.446 | -0.259 |  -0.9 |
          700 |       31.233 |       31.394 | +0.161 | +0.52 |
          750 |       34.064 |       33.720 | -0.344 | -1.01 |
          800 |       36.320 |       36.114 | -0.206 | -0.57 |
    ----------------------------------------------------------

    Signed-off-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Acked-by: 'Peter Zijlstra'
    Signed-off-by: Ingo Molnar

    Mathieu Desnoyers
     

11 Oct, 2008

1 commit

  • We need to add a flag for all code that is in the drivers/staging/
    directory to prevent all other kernel developers from worrying about
    issues here, and to notify users that the drivers might not be as good
    as they are normally used to.

    Based on code from Andreas Gruenbacher and Jeff Mahoney to provide a
    TAINT flag for the support level of a kernel module in the Novell
    enterprise kernel release.

    This is the kernel portion of this feature, the ability for the flag to
    be set needs to be done in the build process and will happen in a
    follow-up patch.

    Cc: Andreas Gruenbacher
    Cc: Jeff Mahoney
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

26 Aug, 2008

1 commit

  • 'load_module()' is a complex function that contains all the ELF section
    logic, and inlining it is utterly insane. But gcc will do it, simply
    because there is only one call-site. As a result, all the stack space
    that is allocated for all the work to load the module will still be
    active when we actually call the module init sequence, and the deep call
    chain makes stack overflows happen.

    And stack overflows are really hard to debug, because they not only
    corrupt random pages below the stack, but also corrupt the thread_info
    structure that is allocated under the stack.

    In this case, Alan Brunelle reported some crazy oopses at bootup, after
    loading the processor module that ends up doing complex ACPI stuff and
    has quite a deep callchain. This should fix it, and is the sane thing
    to do regardless.

    Cc: Alan D. Brunelle
    Cc: Arjan van de Ven
    Cc: Rusty Russell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

12 Aug, 2008

1 commit

  • The kernel has this really nice facility where if you put "initcall_debug"
    on the kernel commandline, it'll print which function it's going to
    execute just before calling an initcall, and then after the call completes
    it will

    1) print whether it returned an error code,

    2) check for a few simple bugs (like leaving irqs off), and

    3) print how long the init call took in milliseconds.

    While trying to optimize the boot speed of my laptop, I have been loving
    number 3 to figure out what to optimize... and then I wished that
    the same thing was done for module loading.

    This patch makes the module loader use this exact same functionality; it's
    a logical extension in my view (since modules are just sort of late
    binding initcalls anyway) and so far I've found it quite useful in finding
    where things are too slow in my boot.
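
    A user-space sketch of the wrapper idea: time one init function, then
    report its return code and duration. demo_do_one_initcall() is a
    hypothetical name, the message wording is illustrative, and
    timespec_get() (C11) stands in for the kernel's time source:

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

typedef int (*demo_initcall_t)(void);

/* Run fn, then print what it returned and how long it took. */
int demo_do_one_initcall(const char *name, demo_initcall_t fn)
{
	struct timespec t0, t1;
	long long nsecs;
	int ret;

	assert(name && fn);
	printf("calling  %s\n", name);
	timespec_get(&t0, TIME_UTC);
	ret = fn();
	timespec_get(&t1, TIME_UTC);

	nsecs = (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
		(t1.tv_nsec - t0.tv_nsec);
	printf("initcall %s returned %d after %lld msecs\n",
	       name, ret, nsecs / 1000000);
	return ret;
}

/* Two stand-in module init functions. */
int demo_init_ok(void)
{
	return 0;
}

int demo_init_fail(void)
{
	return -1;
}
```

    Routing both built-in initcalls and module init functions through one
    wrapper of this kind is what makes the timing output uniform.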

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Rusty Russell

    Arjan van de Ven
     

28 Jul, 2008

2 commits


22 Jul, 2008

5 commits

  • This patch keeps track of the boundaries of module allocation, in
    order to speed up module_text_address().

    Inspired by Arjan's version, which required arch-specific defines:

    Various pieces of the kernel (lockdep, latencytop, etc) tend
    to store backtraces, sometimes at a relatively high
    frequency. In itself this isn't a big performance deal (after
    all you're using diagnostics features), but there have been
    some complaints from people who have over 100 modules loaded
    that this is a tad too slow.

    This is due to the new backtracer code which looks at every
    slot on the stack to see if it's a kernel/module text address,
    so that's 1024 slots. 1024 times 100 modules... that's a lot
    of list walking.
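
    A sketch of the bounds idea: remember the lowest and highest address
    ever handed out for a module and reject anything outside that window
    before walking the list. The demo_* names and the fixed region table
    are assumptions for illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct demo_region {
	unsigned long base, size;
};

enum { DEMO_MAX_REGIONS = 8 };

static struct demo_region demo_regions[DEMO_MAX_REGIONS];
static size_t demo_nr_regions;
static unsigned long demo_addr_min = ULONG_MAX, demo_addr_max;
unsigned long demo_walks;	/* how many full list walks were needed */

/* Called for each module allocation; widens the tracked window. */
int demo_module_alloc(unsigned long base, unsigned long size)
{
	assert(size > 0 && base + size > base);
	if (demo_nr_regions == DEMO_MAX_REGIONS)
		return -1;
	demo_regions[demo_nr_regions].base = base;
	demo_regions[demo_nr_regions].size = size;
	demo_nr_regions++;
	if (base < demo_addr_min)
		demo_addr_min = base;
	if (base + size > demo_addr_max)
		demo_addr_max = base + size;
	return 0;
}

/* Returns 1 if addr lies inside some module region, 0 otherwise. */
int demo_module_text_address(unsigned long addr)
{
	size_t i;

	/* Cheap reject: most stack slots are nowhere near module space. */
	if (addr < demo_addr_min || addr >= demo_addr_max)
		return 0;

	demo_walks++;
	for (i = 0; i < demo_nr_regions; i++)
		if (addr - demo_regions[i].base < demo_regions[i].size)
			return 1;
	return 0;
}
```

    Addresses in a gap between two modules still pay for a walk; the
    window only removes the common case of values far from any module.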

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • This shrinks module.o and each *.ko file.

    And finally, structure members which hold length of module
    code (four such members there) and count of symbols
    are converted from longs to ints.

    We cannot possibly have a module where 32 bits won't
    be enough to hold such counts.

    For one, module loading checks module size for sanity
    before loading, so such an insanely big module will fail
    that test first.

    Signed-off-by: Denys Vlasenko
    Signed-off-by: Rusty Russell

    Denys Vlasenko
     
  • module.c and module.h contain code for finding
    exported symbols which are declared with EXPORT_UNUSED_SYMBOL.
    This code is compiled in even if CONFIG_UNUSED_SYMBOLS is not set,
    when there can be no EXPORT_UNUSED_SYMBOLs in modules anyway
    (because EXPORT_UNUSED_SYMBOL(x) is compiled out to nothing then).

    This patch adds required #ifdefs.

    Signed-off-by: Denys Vlasenko
    Signed-off-by: Rusty Russell

    Denys Vlasenko
     
  • Introduce an each_symbol() iterator to avoid duplicating the knowledge
    about the 5 different sections containing symbols. Currently only
    used by find_symbol(), but will be used by symbol_put_addr() too.

    (Includes NULL ptr deref fix by Jiri Kosina)
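
    A sketch of a callback-driven iterator over several symbol tables.
    The two tables, their contents and all demo_* names are hypothetical;
    the real code covers five sections:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_symbol {
	const char *name;
	unsigned long value;
};

struct demo_symsearch {
	const struct demo_symbol *start, *stop;
	int gpl_only;
};

static const struct demo_symbol demo_syms[] = { { "demo_a", 0x100 } };
static const struct demo_symbol demo_gpl_syms[] = { { "demo_b", 0x200 } };

/* The only place that knows which tables exist. */
static const struct demo_symsearch demo_tables[] = {
	{ demo_syms, demo_syms + 1, 0 },
	{ demo_gpl_syms, demo_gpl_syms + 1, 1 },
};

typedef int (*demo_symfn_t)(const struct demo_symsearch *tbl,
			    const struct demo_symbol *sym, void *data);

/* Visit every symbol in every table until fn returns non-zero. */
int demo_each_symbol(demo_symfn_t fn, void *data)
{
	size_t i;
	const struct demo_symbol *s;

	assert(fn);
	for (i = 0; i < sizeof(demo_tables) / sizeof(demo_tables[0]); i++)
		for (s = demo_tables[i].start; s < demo_tables[i].stop; s++)
			if (fn(&demo_tables[i], s, data))
				return 1;
	return 0;
}

struct demo_query {
	const char *name;
	const struct demo_symbol *found;
};

static int demo_match(const struct demo_symsearch *tbl,
		      const struct demo_symbol *sym, void *data)
{
	struct demo_query *q = data;

	(void)tbl;
	if (strcmp(sym->name, q->name) != 0)
		return 0;
	q->found = sym;
	return 1;
}

/* One user of the iterator; others can reuse demo_each_symbol(). */
const struct demo_symbol *demo_find_symbol(const char *name)
{
	struct demo_query q = { name, NULL };

	return demo_each_symbol(demo_match, &q) ? q.found : NULL;
}
```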

    Signed-off-by: Rusty Russell
    Cc: Jiri Kosina

    Rusty Russell
     
  • rmmod has a little-used "-w" option, meaning that instead of failing if the
    module is in use, it should block until the module becomes unused.

    In this case, we don't need to use stop_machine: Max Krasnyansky
    indicated that would be useful for SystemTap which loads/unloads new
    modules frequently.

    Cc: Max Krasnyansky
    Signed-off-by: Rusty Russell

    Rusty Russell
     

23 May, 2008

2 commits

  • kobject: '' (ffffffffa0104050): is not initialized, yet kobject_put() is being called.
    ------------[ cut here ]------------
    WARNING: at /home/den/src/linux-netns26/lib/kobject.c:583 kobject_put+0x53/0x55()
    Modules linked in: ipv6 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ide_cd_mod cdrom button [last unloaded: pktgen]
    comm: rmmod Tainted: G W 2.6.26-rc3 #585
    Call Trace:
    [] warn_on_slowpath+0x58/0x7a
    [] ? printk+0x67/0x69
    [] ? printk+0x67/0x69
    [] kobject_put+0x53/0x55
    [] free_module+0x87/0xfa
    [] sys_delete_module+0x178/0x1e1
    [] ? lockdep_sys_exit_thunk+0x35/0x67
    [] ? trace_hardirqs_on_thunk+0x35/0x3a
    [] system_call_after_swapgs+0x7b/0x80
    ---[ end trace 8f5aafa7f6406cf8 ]---

    mod->mkobj.kobj is not initialized without CONFIG_SYSFS. Do not call
    kobject_put in this case.

    Signed-off-by: Denis V. Lunev
    Cc: Rusty Russell
    Cc: Kay Sievers
    Signed-off-by: Rusty Russell

    Denis V. Lunev
     
  • Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Andrew Morton
    Signed-off-by: Rusty Russell

    Cyrill Gorcunov
     

09 May, 2008

2 commits

  • Linus found a logic bug: we ignore the version number in a module's
    vermagic string if we have CONFIG_MODVERSIONS set, but modversions
    also lets through a module with no __versions section for modprobe
    --force (with tainting, but still).

    We should only ignore the start of the vermagic string if the module
    actually *has* crcs to check. Rather than (say) having an
    entertaining hissy fit and creating a config option to work around the
    buggy code.
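
    The intended rule can be sketched as a comparison helper: skip the
    leading version token of both vermagic strings only when the module
    really carries CRCs. demo_same_magic() and the sample strings are
    illustrative, not the kernel's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Compare vermagic strings; ignore the version prefix only with CRCs. */
bool demo_same_magic(const char *amagic, const char *bmagic, bool has_crcs)
{
	assert(amagic && bmagic);
	if (has_crcs) {
		/* The version is everything up to the first space. */
		amagic += strcspn(amagic, " ");
		bmagic += strcspn(bmagic, " ");
	}
	return strcmp(amagic, bmagic) == 0;
}
```

    For a module without CRCs, has_crcs is false and the full string,
    version included, has to match.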

    Signed-off-by: Rusty Russell
    Signed-off-by: Linus Torvalds

    Rusty Russell
     
  • We allow missing __versions sections, because modprobe --force strips
    it. It makes less sense to allow sections where there's no version
    for a specific symbol the module uses, so disallow that.

    Signed-off-by: Rusty Russell
    Signed-off-by: Linus Torvalds

    Rusty Russell
     

05 May, 2008

1 commit

  • The kernel module loader used to be much too happy to allow loading of
    modules for the wrong kernel version by default. For example, if you
    had MODVERSIONS enabled, but tried to load a module with no version
    info, it would happily load it and taint the kernel - whether it was
    likely to actually work or not!

    Generally, such forced module loading should be considered a really
    really bad idea, so make it conditional on a new config option
    (MODULE_FORCE_LOAD), and make it default to off.

    If somebody really wants to force module loads, that's their problem,
    but we should not encourage it. Especially as it happened to me by
    mistake (ie regular unversioned Fedora modules getting loaded) causing
    lots of strange behavior.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

01 May, 2008

6 commits


19 Apr, 2008

1 commit