Eric Lee / smarc-fsl-linux-kernel

28 Mar, 2009

1 commit

6e15cf048 Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2

Conflicts:
arch/parisc/kernel/irq.c
arch/x86/include/asm/fixmap_64.h
arch/x86/include/asm/setup.h
kernel/irq/handle.c

Semantic merge:
arch/x86/include/asm/fixmap.h

Signed-off-by: Ingo Molnar

Ingo Molnar
2009-03-28 00:28:43 +0800

25 Mar, 2009

1 commit

e9d376f0f dynamic debug: combine dprintk and dynamic printk

This patch combines Greg Banks's dprintk() work with the existing dynamic
printk patchset; we are now calling it 'dynamic debug'.

The new feature of this patchset is a richer /debugfs control file interface
(an example output from my system is at the bottom), which allows fine-grained
control over the debug output. The output can be controlled by function,
file, module, format string, and line number.

For example, to enable all debug messages in module 'nf_conntrack':

echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control

To disable them:

echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control

A further explanation can be found in the documentation patch.
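
As a rough illustration of what a query such as 'module nf_conntrack +p' does, the userspace sketch below models debug call sites as descriptors carrying a print flag and flips that flag on every descriptor whose module name matches. The struct, its fields, and set_module_print() are hypothetical names chosen for this example; they are not the kernel's data structures or its query parser.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one debug call site; not the kernel's struct. */
struct callsite {
    const char *module;   /* module the call site lives in */
    const char *function; /* enclosing function */
    unsigned int line;    /* source line number */
    int print;            /* nonzero when the message should be emitted */
};

/*
 * Model of a "module <name> +p" (enable = 1) or "module <name> -p"
 * (enable = 0) request: set or clear the print flag on every call
 * site that belongs to the named module.
 * Returns the number of call sites that matched.
 */
static size_t set_module_print(struct callsite *sites, size_t n,
                               const char *module, int enable)
{
    size_t matched = 0;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(sites[i].module, module) == 0) {
            sites[i].print = enable;
            matched++;
        }
    }
    return matched;
}
```

Matching on function, file, format string, or line number would follow the same pattern with a different comparison.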

Signed-off-by: Greg Banks
Signed-off-by: Jason Baron
Signed-off-by: Greg Kroah-Hartman

Jason Baron
2009-03-25 07:38:26 +0800

18 Mar, 2009

1 commit

6e2b75740 module: fix refptr allocation and release order

Impact: fix ref-after-free crash on failed module load

Fix refptr bug: change the refptr allocation and release order so that the
module data structure pointed to by 'mod' is not accessed after
mod->module_core has been freed. This bug causes a kernel panic (e.g. when
a module load fails because of undefined symbols).

This bug was reported on systemtap bugzilla.
http://sources.redhat.com/bugzilla/show_bug.cgi?id=9927

Signed-off-by: Masami Hiramatsu
Cc: Eric Dumazet
Signed-off-by: Rusty Russell

Masami Hiramatsu
2009-03-18 07:01:21 +0800

06 Mar, 2009

1 commit

edcb46399 percpu, module: implement reserved allocation and use it for module percpu variables

Impact: add reserved allocation functionality and use it for module
percpu variables

This patch implements reserved allocation from the first chunk. When
setting up the first chunk, arch can ask to set aside a certain number
of bytes right after the core static area, which is available only
through a separate reserved allocator. This will be used primarily
for module static percpu variables on architectures with limited
relocation range to ensure that the module percpu symbols are inside
the relocatable range.

If reserved area is requested, the first chunk becomes reserved and
isn't available for regular allocation. If the first chunk also
includes piggy-back dynamic allocation area, a separate chunk mapping
the same region is created to serve dynamic allocation. The first one
is called static first chunk and the second dynamic first chunk.
Although they share the page map, their different area map
initializations guarantee they serve disjoint areas according to their
purposes.

If arch doesn't set up a reserved area, reserved allocation is handled
like any other allocation.

Signed-off-by: Tejun Heo

Tejun Heo
2009-03-06 13:33:59 +0800

20 Feb, 2009

2 commits

fbf59bc9d percpu: implement new dynamic percpu allocator

Impact: new scalable dynamic percpu allocator which allows dynamic
percpu areas to be accessed the same way as static ones

Implement scalable dynamic percpu allocator which can be used for both
static and dynamic percpu areas. This will allow static and dynamic
areas to share faster direct access methods. This feature is optional
and enabled only when CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is defined by
arch. Please read the comment at the top of mm/percpu.c for details.

Signed-off-by: Tejun Heo
Cc: Andrew Morton

Tejun Heo
2009-02-20 15:29:08 +0800

6b588c18f module: reorder module pcpu related functions

Impact: cleanup

Move percpu_modinit() upwards. This is to ease further changes.

Signed-off-by: Tejun Heo

Tejun Heo
2009-02-20 15:29:07 +0800

03 Feb, 2009

1 commit

720eba31f modules: Use a better scheme for refcounting

Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is
using a lot of memory.

Each 'struct module' contains an [NR_CPUS] array of full cache lines.

This patch uses existing infrastructure (percpu_modalloc() &
percpu_modfree()) to allocate percpu space for the refcount storage.

Instead of wasting NR_CPUS*128 bytes (on i386), we now use
nr_cpu_ids*sizeof(local_t) bytes.

On a typical distro, where NR_CPUS=8 and 2000 modules are shipped, we reduce
the size of module files by about 2 Mbytes (1 KB per module).

Instead of having all refcounters in the same memory node, with TLB misses
because of vmalloc(), this new implementation has better NUMA properties,
since each CPU will use storage on its preferred node, thanks to percpu
storage.
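
The size figures quoted above can be checked with a small worked calculation. The helper names below are made up for this example, and the 4-byte counter size and 2-CPU machine used in the usage note are assumptions, since sizeof(local_t) and nr_cpu_ids depend on the architecture and the running system.

```c
#include <stddef.h>

/* Old scheme: one full cache line per possible CPU, embedded in struct module. */
static size_t old_refcount_bytes(size_t nr_cpus, size_t cacheline_bytes)
{
    return nr_cpus * cacheline_bytes;
}

/* New scheme: one counter per CPU id in use, taken from percpu storage. */
static size_t new_refcount_bytes(size_t nr_cpu_ids, size_t counter_bytes)
{
    return nr_cpu_ids * counter_bytes;
}
```

With NR_CPUS=8 and 128-byte cache lines, old_refcount_bytes(8, 128) is 1024 bytes per module, so 2000 modules account for 2,048,000 bytes, which matches the "about 2 Mbytes" above. Under the assumed values, new_refcount_bytes(2, 4) is 8 bytes per module.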

Signed-off-by: Eric Dumazet
Signed-off-by: Rusty Russell
Signed-off-by: Linus Torvalds

Eric Dumazet
2009-02-03 11:17:55 +0800

14 Jan, 2009

1 commit

17da2bd90 [CVE-2009-0029] System call wrappers part 08

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:21 +0800

08 Jan, 2009

1 commit

22a9d6456 async: Asynchronous function calls to speed up kernel boot

Right now, most of the kernel boot is strictly synchronous, such that
various hardware delays are done sequentially.

In order to make the kernel boot faster, this patch introduces
infrastructure to allow doing some of the initialization steps
asynchronously, which will hide significant portions of the hardware delays
in practice.

In order to not change device order and other similar observables, this
patch does NOT do full parallel initialization.

Rather, it operates more in the way an out of order CPU does; the work may
be done out of order and asynchronous, but the observable effects
(instruction retiring for the CPU) are still done in the original sequence.
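
The out-of-order analogy can be made concrete with a toy model: slow steps may finish in any order, but their visible effects are committed strictly in sequence. This is only an illustration of the ordering idea; struct boot_order and step_finished() are hypothetical and do not correspond to the interface this patch adds.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_STEPS 8

struct boot_order {
    bool done[MAX_STEPS];        /* slow part finished, in any order */
    size_t next_commit;          /* next step allowed to become visible */
    size_t committed[MAX_STEPS]; /* order in which effects became visible */
    size_t ncommitted;
};

/*
 * Record that step `id` has finished its slow work, then commit every
 * step that is now next in line. A step that finishes early waits here
 * until all lower-numbered steps have finished too.
 */
static void step_finished(struct boot_order *b, size_t id)
{
    if (id >= MAX_STEPS)
        return;
    b->done[id] = true;
    while (b->next_commit < MAX_STEPS && b->done[b->next_commit]) {
        b->committed[b->ncommitted++] = b->next_commit;
        b->next_commit++;
    }
}
```

If step 2 finishes first, nothing is committed until steps 0 and 1 have finished as well, so the observable order stays 0, 1, 2.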

Signed-off-by: Arjan van de Ven

Arjan van de Ven
2009-01-08 00:45:46 +0800

07 Jan, 2009

3 commits

0deddf436 module: add MODULE_STATE_LIVE notify

Add a module notifier call which notifies that the state of a module
changes from MODULE_STATE_COMING to MODULE_STATE_LIVE.

Signed-off-by: Masami Hiramatsu
Cc: Ananth N Mavinakayanahalli
Cc: Anil S Keshavamurthy
Acked-by: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masami Hiramatsu
2009-01-07 07:59:21 +0800

a06f6211e module: add within_module_core() and within_module_init()

This series of patches allows kprobes to probe a module's __init and __exit
functions. This means you can probe driver initialization and
termination.

Currently, kprobes can't probe __init functions because these functions are
freed after module initialization. It also can't probe module __exit
functions because kprobes increments the reference count of the target module,
so the user can't unload it. This means __exit functions are never called
unless the probes are removed from the module.

To solve both cases, this series of patches introduces a GONE flag and sets
it when the target code is freed (for this purpose, kprobes hooks
MODULE_STATE_* events). This also removes the refcount increment, allowing
the user to unload the target module. Users can check which probes are
GONE through the debugfs interface. To catch the moment when a module's .init
text is freed, the series also includes a patch that adds a module notifier
for the MODULE_STATE_LIVE event.

This patch:

Add within_module_core() and within_module_init() for checking whether an
address is in the module .init.text section or .text section, and replace
within() local inline functions in kernel/module.c with them.

kprobes uses these functions to check where the kprobe is inserted.
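
The kind of check these helpers perform can be sketched in userspace as a half-open range test against the two regions of a loaded module. struct module_layout and the in_module_*() names are hypothetical stand-ins for illustration; the real helpers operate on struct module and may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two memory regions of a loaded module. */
struct module_layout {
    uintptr_t init_base; /* start of the init region */
    size_t    init_size;
    uintptr_t core_base; /* start of the core region */
    size_t    core_size;
};

/* Half-open range test: base <= addr < base + size. */
static bool within(uintptr_t addr, uintptr_t base, size_t size)
{
    return addr >= base && addr - base < size;
}

/* True when addr falls inside the module's init region. */
static bool in_module_init(uintptr_t addr, const struct module_layout *m)
{
    return within(addr, m->init_base, m->init_size);
}

/* True when addr falls inside the module's core region. */
static bool in_module_core(uintptr_t addr, const struct module_layout *m)
{
    return within(addr, m->core_base, m->core_size);
}
```

A caller such as a probe manager could use the init variant to decide whether a probe address will disappear once module initialization has finished.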

Signed-off-by: Masami Hiramatsu
Cc: Ananth N Mavinakayanahalli
Cc: Anil S Keshavamurthy
Acked-by: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masami Hiramatsu
2009-01-07 07:59:20 +0800

f1883f86d Remove remaining unwinder code

Signed-off-by: Alexey Dobriyan
Cc: Gabor Gombas
Cc: Jan Beulich
Cc: Andi Kleen
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2009-01-07 07:59:11 +0800

05 Jan, 2009

4 commits

9e01892c4 module: convert to stop_machine_create/destroy.

The module code relies on a non-failing stop_machine call. So we create
the kstop threads in advance and with that make sure the call won't fail.

Signed-off-by: Heiko Carstens
Signed-off-by: Rusty Russell

Heiko Carstens
2009-01-05 06:10:15 +0800

088af9a6e module: fix module loading failure of large kernel modules for parisc

When creating the final layout of a kernel module in memory, allow the
module loader to reserve some additional memory in front of a given section.
This is currently only needed for the parisc port which needs to put the
stub entries there to fulfill the 17/22bit PCREL relocations with large
kernel modules like xfs.

Signed-off-by: Helge Deller
Signed-off-by: Rusty Russell (renamed fn)

Helge Deller
2009-01-05 06:10:13 +0800

d1e99d7ae module: fix warning of unused function when !CONFIG_PROC_FS

Fix this warning:
kernel/module.c:824: warning: ‘print_unload_info’ defined but not used
print_unload_info() is only used when CONFIG_PROC_FS is defined.
This patch marks print_unload_info() inline to solve the problem.

Signed-off-by: Jianjun Kong
Signed-off-by: Rusty Russell
CC: Ingo Molnar
CC: Américo Wang

Jianjun Kong
2009-01-05 06:10:11 +0800

ca4787b77 kernel/module.c: compare symbol values when marking symbols as exported in /proc/kallsyms.

When there are two symbols in a module with the same name, one of which is
exported, both will be marked as exported in /proc/kallsyms. There aren't
any instances of this in the current kernel, but it is easy to construct a
simple module with two compilation units that exhibits the problem.

$ objdump -j .text -t testmod.ko | grep foo
00000000 l F .text 00000032 foo
00000080 g F .text 00000001 foo
$ sudo insmod testmod.ko
$ grep "T foo" /proc/kallsyms
c28e8000 T foo [testmod]
c28e8080 T foo [testmod]

Fix this by comparing the symbol values once we've found the exported
symbol table entry matching the symbol name. Tested using Ksplice:

$ ksplice-create --patch=this_commit.patch --id=bar .
$ sudo ksplice-apply ksplice-bar.tar.gz
Done!
$ grep "T foo" /proc/kallsyms
c28e8080 T foo [testmod]
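
The comparison described above can be sketched as follows. This is a simplified userspace model, not the code in kernel/module.c: struct export_entry and symbol_is_exported() are hypothetical names, and the export table is a plain array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical exported-symbol table entry. */
struct export_entry {
    const char *name;
    unsigned long value; /* address of the exported symbol */
};

/*
 * Report a symbol as exported only if the export entry with the same
 * name also has the same value. Matching on the name alone would also
 * flag a local symbol that merely shares its name with an exported one.
 */
static bool symbol_is_exported(const struct export_entry *tab, size_t n,
                               const char *name, unsigned long value)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) == 0)
            return tab[i].value == value;
    }
    return false;
}
```

Using the addresses from the transcript, only the foo at c28e8080 matches the export entry; the local foo at c28e8000 has the same name but a different value, so it is no longer reported as exported.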

Signed-off-by: Tim Abbott
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Rusty Russell

Tim Abbott
2009-01-05 06:10:11 +0800

08 Dec, 2008

1 commit

8b96f0119 tracing/function-graph-tracer: introduce __notrace_funcgraph to filter special functions

Impact: trace more functions

When the function graph tracer is configured, three more files are excluded
from tracing just to keep four functions from being traced. This affects the
normal function tracer too.

arch/x86/kernel/process_64/32.c:

I had crashes when I let this file be traced. After some debugging, I saw
that the "current" task pointer was changed inside __switch_to(), i.e.
"write_pda(pcurrent, next_p);" inside process_64.c. Since the tracer stores
the original return address of the function inside current, we had
crashes. Only __switch_to() has to be excluded from tracing.

kernel/module.c and kernel/extable.c:

Because of a function used internally by the function graph tracer:
__kernel_text_address()

To let the other functions inside these files be traced, this patch
introduces the __notrace_funcgraph function prefix, which is __notrace if
the function graph tracer is configured and nothing if not.
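
A conditional prefix of this kind can be sketched as a macro that expands to a "do not instrument" attribute only when the tracer is configured. The sketch assumes a GCC or Clang compiler and assumes the no-trace marker maps to the no_instrument_function attribute; text_address_sketch() is a made-up stand-in for a helper the tracer calls internally.

```c
/* Assumption: the build defines CONFIG_FUNCTION_GRAPH_TRACER when the
 * function graph tracer is enabled. */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
/* Expands to an attribute that keeps the compiler from instrumenting
 * the function. */
#define __notrace_funcgraph __attribute__((no_instrument_function))
#else
/* Expands to nothing, so the function is traced like any other. */
#define __notrace_funcgraph
#endif

/* Stand-in for a helper used by the tracer itself; only this function
 * is excluded, the rest of the file stays traceable. */
static __notrace_funcgraph int text_address_sketch(unsigned long addr)
{
    return addr >= 0x1000UL && addr < 0x2000UL;
}
```

The point of the per-function prefix is granularity: excluding a whole file from tracing hides every function in it, while the prefix hides only the few that must not be traced.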

Signed-off-by: Frederic Weisbecker
Signed-off-by: Ingo Molnar

Frederic Weisbecker
2008-12-08 22:11:44 +0800

17 Nov, 2008

1 commit

3f8e402f3 Merge branches 'tracing/branch-tracer', 'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/tracepoints' and 'tracing/urgent' into tracing/core

Ingo Molnar
2008-11-17 16:36:22 +0800

16 Nov, 2008

3 commits

32f857427 tracepoints: use modules notifiers

Impact: cleanup

Use module notifiers for tracepoint updates rather than adding a hook in
module.c.

Signed-off-by: Mathieu Desnoyers
Signed-off-by: Ingo Molnar

Mathieu Desnoyers
2008-11-16 16:01:35 +0800

a419246ac markers: use module notifier

Impact: cleanup

Use module notifiers instead of adding a hook in module.c.

Signed-off-by: Mathieu Desnoyers
Signed-off-by: Ingo Molnar

Mathieu Desnoyers
2008-11-16 16:01:28 +0800

31e889098 ftrace: pass module struct to arch dynamic ftrace functions

Impact: allow archs more flexibility on dynamic ftrace implementations

Dynamic ftrace has largely been developed on x86. Since x86 does not
have the same limitations as other architectures, the ftrace interaction
between the generic code and the architecture specific code was not
flexible enough to handle some of the issues that other architectures
have.

Most notably, module trampolines. Due to the limited branch distance
that archs make in calling kernel core code from modules, the module
load code must create a trampoline to jump to what will make the
larger jump into core kernel code.

The problem arises when this happens to a call to mcount. Ftrace checks
all code before modifying it and makes sure the current code is what
it expects. Right now, there is not enough information to handle modifying
module trampolines.

This patch changes the API between generic dynamic ftrace code and
the arch dependent code. There are now two functions for modifying code:

  ftrace_make_nop(mod, rec, addr) - convert the code at rec->ip into
      a nop, where the original text is calling addr. (mod is the
      module struct if called by module init)

  ftrace_make_caller(rec, addr) - convert the code at rec->ip that should
      be a nop into a caller to addr.

The record "rec" now has a new field called "arch" where the architecture
can add any special attributes to each call site record.
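
The "check before modifying" rule behind these two functions can be modelled in userspace on a plain byte buffer. Everything here is a sketch under stated assumptions: the site length of 5 bytes and the nop encoding are x86-flavoured choices for illustration, and patch_if_expected(), make_nop_sketch(), and make_caller_sketch() are hypothetical names, not the arch functions described above.

```c
#include <stddef.h>
#include <string.h>

#define SITE_LEN 5 /* assumption: call and nop are both 5 bytes long */

/* Assumption: one 5-byte nop encoding, used only as an example pattern. */
static const unsigned char nop_bytes[SITE_LEN] = {0x0f, 0x1f, 0x44, 0x00, 0x00};

/*
 * Replace the bytes at `site` with `replacement`, but only if they
 * currently equal `expected`. Returns 0 on success, or -1 if the site
 * holds something unexpected and was left untouched.
 */
static int patch_if_expected(unsigned char *site,
                             const unsigned char *expected,
                             const unsigned char *replacement)
{
    if (memcmp(site, expected, SITE_LEN) != 0)
        return -1;
    memcpy(site, replacement, SITE_LEN);
    return 0;
}

/* Turn a call site into a nop; the site must currently hold `call`. */
static int make_nop_sketch(unsigned char *site, const unsigned char *call)
{
    return patch_if_expected(site, call, nop_bytes);
}

/* Turn a nop back into a call; the site must currently hold the nop. */
static int make_caller_sketch(unsigned char *site, const unsigned char *call)
{
    return patch_if_expected(site, nop_bytes, call);
}
```

A module trampoline changes what the "expected" bytes at a call site look like, which is why the arch code needs extra information such as the module struct to build the right comparison.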

Signed-off-by: Steven Rostedt
Signed-off-by: Ingo Molnar

Steven Rostedt
2008-11-16 14:36:02 +0800

24 Oct, 2008

1 commit

88ed86fee Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc

* 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits)
proc: remove fs/proc/proc_misc.c
proc: move /proc/vmcore creation to fs/proc/vmcore.c
proc: move pagecount stuff to fs/proc/page.c
proc: move all /proc/kcore stuff to fs/proc/kcore.c
proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
proc: move /proc/modules boilerplate to kernel/module.c
proc: move /proc/diskstats boilerplate to block/genhd.c
proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
proc: move /proc/vmstat boilerplate to mm/vmstat.c
proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
proc: move /proc/vmallocinfo to mm/vmalloc.c
proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
proc: move /proc/slab_allocators boilerplate to mm/slab.c
proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
proc: move /proc/stat to fs/proc/stat.c
proc: move rest of /proc/partitions code to block/genhd.c
proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
proc: move /proc/devices code to fs/proc/devices.c
proc: move rest of /proc/locks to fs/locks.c
...

Linus Torvalds
2008-10-24 03:04:37 +0800

23 Oct, 2008

1 commit

3b5d5c6b0 proc: move /proc/modules boilerplate to kernel/module.c

Signed-off-by: Alexey Dobriyan

Alexey Dobriyan
2008-10-23 22:03:13 +0800

22 Oct, 2008

2 commits

d72b37513 Remove stop_machine during module load v2

module loading currently does a stop_machine on each module load to insert
the module into the global module lists. Especially on larger systems this
can be quite expensive.

It does that to handle concurrent lockless module list readers
like kallsyms.

I don't think stop_machine() is actually needed to insert something
into a list though. There are no concurrent writers because the
module mutex is taken. And the RCU list functions know how to insert
a node into a list with the right memory ordering so that concurrent
readers don't go off into the wood.

So remove the stop_machine for the module list insert and just
do a list_add_rcu() instead.
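
The memory-ordering point can be illustrated with C11 atomics in userspace: the writer fully initialises the new node and then publishes it with a release store, so a lockless reader that sees the node also sees its contents. This is a sketch of the publish step only, on a hypothetical singly linked list; it is not the kernel's list_add_rcu() and it does not model removal or grace periods.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
    int value;
    struct node *_Atomic next;
};

/* Head of a singly linked list that readers may walk without a lock. */
struct list {
    struct node *_Atomic head;
};

/*
 * Writer side. Assumes the caller already holds the mutex that
 * serialises writers. The node is linked to the old head first; the
 * release store then makes it reachable by readers.
 */
static void list_publish_front(struct list *l, struct node *n)
{
    struct node *old = atomic_load_explicit(&l->head, memory_order_relaxed);

    atomic_store_explicit(&n->next, old, memory_order_relaxed);
    atomic_store_explicit(&l->head, n, memory_order_release);
}

/* Reader side: no lock, only acquire loads while following pointers. */
static int list_sum(struct list *l)
{
    int sum = 0;
    struct node *n = atomic_load_explicit(&l->head, memory_order_acquire);

    while (n) {
        sum += n->value;
        n = atomic_load_explicit(&n->next, memory_order_acquire);
    }
    return sum;
}
```

A reader running concurrently with the insert sees either the old list or the new one, never a half-initialised node, which is why insertion alone does not need to stop the machine.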

Module removal will still do a stop_machine of course, it needs
that for other reasons.

v2: Revised readers based on Paul's comments. All readers that only
rely on disabled preemption need to be changed to list_for_each_rcu().
Done that. The others are ok because they have the modules mutex.
Also added a possible missing preempt disable for print_modules().

[cc Paul McKenney for review. It's not RCU, but quite similar.]

Acked-by: Paul E. McKenney
Signed-off-by: Rusty Russell

Andi Kleen
2008-10-22 07:00:22 +0800

5e458cc0f module: simplify load_module.

Linus' recent catch of stack overflow in load_module led me to look
at the code. A couple of helpers to get a section address and get
objects from a section can help clean things up a little.

(And in case you're wondering, the stack size also dropped from 328 to
284 bytes).

Signed-off-by: Rusty Russell

Rusty Russell
2008-10-22 07:00:15 +0800

21 Oct, 2008

1 commit

92b29b86f Merge branch 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (131 commits)
tracing/fastboot: improve help text
tracing/stacktrace: improve help text
tracing/fastboot: fix initcalls disposition in bootgraph.pl
tracing/fastboot: fix bootgraph.pl initcall name regexp
tracing/fastboot: fix issues and improve output of bootgraph.pl
tracepoints: synchronize unregister static inline
tracepoints: tracepoint_synchronize_unregister()
ftrace: make ftrace_test_p6nop disassembler-friendly
markers: fix synchronize marker unregister static inline
tracing/fastboot: add better resolution to initcall debug/tracing
trace: add build-time check to avoid overrunning hex buffer
ftrace: fix hex output mode of ftrace
tracing/fastboot: fix initcalls disposition in bootgraph.pl
tracing/fastboot: fix printk format typo in boot tracer
ftrace: return an error when setting a nonexistent tracer
ftrace: make some tracers reentrant
ring-buffer: make reentrant
ring-buffer: move page indexes into page headers
tracing/fastboot: only trace non-module initcalls
ftrace: move pc counter in irqtrace
...

Manually fix conflicts:
- init/main.c: initcall tracing
- kernel/module.c: verbose level vs tracepoints
- scripts/bootgraph.pl: fallout from cherry-picking commits.

Linus Torvalds
2008-10-21 04:35:07 +0800

18 Oct, 2008

1 commit

26e9a3977 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (25 commits)
staging: at76_usb wireless driver
Staging: workaround build system bug
Staging: Lindent sxg.c
Staging: SLICOSS: Call pci_release_regions at driver exit
Staging: SLICOSS: Fix remaining type names
Staging: SLICOSS: Fix warnings due to static usage
Staging: SLICOSS: lots of checkpatch fixes
Staging: go7007 v4l fixes
Staging: Fix gcc warnings in sxg
Staging: add echo cancelation module
Staging: add wlan-ng prism2 usb driver
Staging: add w35und wifi driver
Staging: USB/IP: add host driver
Staging: USB/IP: add client driver
Staging: USB/IP: add common functions needed
Staging: add the go7007 video driver
Staging: add me4000 pci data collection driver
Staging: add me4000 firmware files
Staging: add sxg network driver
Staging: add Alacritech slicoss network driver
...

Fixed up conflicts due to taint flags changes and MAINTAINERS cleanup in
MAINTAINERS, include/linux/kernel.h and kernel/panic.c.

Linus Torvalds
2008-10-18 00:50:12 +0800

17 Oct, 2008

4 commits

c813b4e16 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (46 commits)
UIO: Fix mapping of logical and virtual memory
UIO: add automata sercos3 pci card support
UIO: Change driver name of uio_pdrv
UIO: Add alignment warnings for uio-mem
Driver core: add bus_sort_breadthfirst() function
NET: convert the phy_device file to use bus_find_device_by_name
kobject: Cleanup kobject_rename and !CONFIG_SYSFS
kobject: Fix kobject_rename and !CONFIG_SYSFS
sysfs: Make dir and name args to sysfs_notify() const
platform: add new device registration helper
sysfs: use ilookup5() instead of ilookup5_nowait()
PNP: create device attributes via default device attributes
Driver core: make bus_find_device_by_name() more robust
usb: turn dev_warn+WARN_ON combos into dev_WARN
debug: use dev_WARN() rather than WARN_ON() in device_pm_add()
debug: Introduce a dev_WARN() function
sysfs: fix deadlock
device model: Do a quickcheck for driver binding before doing an expensive check
Driver core: Fix cleanup in device_create_vargs().
Driver core: Clarify device cleanup.
...

Linus Torvalds
2008-10-17 03:40:26 +0800

25ddbb18a Make the taint flags reliable

It's somewhat unlikely that it happens, but right now a race window
between interrupts or machine checks or oopses could corrupt the tainted
bitmap because it is modified in a non atomic fashion.

Convert the taint variable to an unsigned long and use only atomic bit
operations on it.
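
The conversion can be sketched in userspace with C11 atomics standing in for the kernel's atomic bit operations. The names tainted_mask, add_taint_sketch(), and test_taint_sketch() are hypothetical and chosen for this example only.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Static storage, so the mask starts out as zero. */
static atomic_ulong tainted_mask;

/*
 * Set one taint bit. The read-modify-write is a single atomic
 * operation, so a concurrent update from another context cannot be
 * lost the way it could with a plain "mask |= bit".
 */
static void add_taint_sketch(unsigned int flag)
{
    atomic_fetch_or(&tainted_mask, 1UL << flag);
}

/* Check whether a given taint bit is set. */
static bool test_taint_sketch(unsigned int flag)
{
    return (atomic_load(&tainted_mask) >> flag) & 1UL;
}
```

Because bits are only ever set, never cleared, an atomic OR is enough to keep the mask consistent.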

Unfortunately this means the intvec sysctl functions cannot be used on it
anymore.

It turned out the taint sysctl handler could actually be simplified a bit
(since it only increases capabilities) so this patch actually removes
code.

[akpm@linux-foundation.org: remove unneeded include]
Signed-off-by: Andi Kleen
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andi Kleen
2008-10-17 02:21:31 +0800

346e15beb driver core: basic infrastructure for per-module dynamic debug messages

Base infrastructure to enable per-module debug messages.

I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
control of debugging statements on a per-module basis in one /proc file,
currently /dynamic_printk/modules. When CONFIG_DYNAMIC_PRINTK_DEBUG
is not set, debugging statements can still be enabled as before, often by
defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
effect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.

The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
can be dynamically enabled/disabled on a per-module basis.

Future plans include extending this functionality to subsystems that define
their own debug levels and flags.

Usage:

Dynamic debugging is controlled by the debugfs file,
/dynamic_printk/modules. This file contains a list of the modules that
can be enabled. The format of the file is as follows:

.
.
.

: Name of the module in which the debug call resides
: whether the messages are enabled or not

For example:

snd_hda_intel enabled=0
fixup enabled=1
driver enabled=0

Enable a module:

$echo "set enabled=1 " > dynamic_printk/modules

Disable a module:

$echo "set enabled=0 " > dynamic_printk/modules

Enable all modules:

$echo "set enabled=1 all" > dynamic_printk/modules

Disable all modules:

$echo "set enabled=0 all" > dynamic_printk/modules

Finally, passing "dynamic_printk" at the command line enables
debugging for all modules. This mode can be turned off via the above
disable command.

[gkh: minor cleanups and tweaks to make the build work quietly]

Signed-off-by: Jason Baron
Signed-off-by: Greg Kroah-Hartman

Jason Baron
2008-10-17 00:24:47 +0800

e94320939 modules: fix module "notes" kobject leak

Fix "notes" kobject leak

It happens every rmmod if KALLSYMS=y and SYSFS=y.

# modprobe foo

kobject: 'foo' (ffffffffa00743d0): kobject_add_internal: parent: 'module', set: 'module'
kobject: 'holders' (ffff88017e7c5770): kobject_add_internal: parent: 'foo', set: ''
kobject: 'foo' (ffffffffa00743d0): kobject_uevent_env
kobject: 'foo' (ffffffffa00743d0): fill_kobj_path: path = '/module/foo'
kobject: 'notes' (ffff88017fa9b668): kobject_add_internal: parent: 'foo', set: ''
^^^^^

# rmmod foo

kobject: 'holders' (ffff88017e7c5770): kobject_cleanup
kobject: 'holders' (ffff88017e7c5770): auto cleanup kobject_del
kobject: 'holders' (ffff88017e7c5770): calling ktype release
kobject: (ffff88017e7c5770): dynamic_kobj_release
kobject: 'holders': free name
kobject: 'foo' (ffffffffa00743d0): kobject_cleanup
kobject: 'foo' (ffffffffa00743d0): does not have a release() function, it is broken and must be fixed.
kobject: 'foo' (ffffffffa00743d0): auto cleanup 'remove' event
kobject: 'foo' (ffffffffa00743d0): kobject_uevent_env
kobject: 'foo' (ffffffffa00743d0): fill_kobj_path: path = '/module/foo'
kobject: 'foo' (ffffffffa00743d0): auto cleanup kobject_del
kobject: 'foo': free name

[whooops]

Signed-off-by: Alexey Dobriyan
Cc: stable
Signed-off-by: Greg Kroah-Hartman

Alexey Dobriyan
2008-10-17 00:24:41 +0800

14 Oct, 2008

3 commits

fed1939c6 ftrace: remove old pointers to mcount

When a mcount pointer is recorded into a table, it is used to add or
remove calls to mcount (replacing them with nops). If the code is removed
via removing a module, the pointers still exist. When modifying the code,
a check is always made to make sure the code being replaced is the code
expected. In other words, the code being replaced is compared to what
it is expected to be before being replaced.

There is a very small chance that the code being replaced just happens
to look like code that calls mcount (very small since the call to mcount
is relative). To remove this chance, this patch adds ftrace_release to
allow module unloading to remove the pointers to mcount within the module.

Another change for init calls is made to not trace calls marked with
__init. The tracing can not be started until after init is done anyway.

Signed-off-by: Steven Rostedt
Signed-off-by: Ingo Molnar

Steven Rostedt
2008-10-14 16:35:12 +0800

90d595fe5 ftrace: enable mcount recording for modules

This patch enables the loading of the __mcount_section of modules and
changing all the callers of mcount into nops.

The modification is done before the init_module function is called, so
again, we do not need to use kstop_machine to make these changes.

Signed-off-by: Steven Rostedt
Signed-off-by: Ingo Molnar

Steven Rostedt
2008-10-14 16:34:47 +0800

97e1c18e8 tracing: Kernel Tracepoints

Implementation of kernel tracepoints. Inspired by the Linux Kernel
Markers. Allows complete typing verification by declaring both tracing
statement inline functions and probe registration/unregistration static
inline functions within the same macro "DEFINE_TRACE". No format string
is required. See the tracepoint Documentation and Samples patches for
usage examples.

Taken from the documentation patch:

"A tracepoint placed in code provides a hook to call a function (probe)
that you can provide at runtime. A tracepoint can be "on" (a probe is
connected to it) or "off" (no probe is attached). When a tracepoint is
"off" it has no effect, except for adding a tiny time penalty (checking
a condition for a branch) and space penalty (adding a few bytes for the
function call at the end of the instrumented function and adds a data
structure in a separate section). When a tracepoint is "on", the
function you provide is called each time the tracepoint is executed, in
the execution context of the caller. When the function provided ends its
execution, it returns to the caller (continuing from the tracepoint
site).

You can put tracepoints at important locations in the code. They are
lightweight hooks that can pass an arbitrary number of parameters, which
prototypes are described in a tracepoint declaration placed in a header
file."
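
The on/off behaviour described in the quoted documentation can be modelled in a few lines of userspace C: the instrumentation site tests whether a probe is attached and calls it only if so. This is a single-threaded sketch with hypothetical names (struct tracepoint_sketch, trace_hit(), tp_register(), tp_unregister(), sum_probe()); it has no type-checking macros and none of the RCU synchronisation the real implementation relies on.

```c
#include <stddef.h>

typedef void (*probe_fn)(void *data, int value);

/* Hypothetical tracepoint: "off" while probe is NULL, "on" otherwise. */
struct tracepoint_sketch {
    probe_fn probe;
    void *data;
};

/* Instrumentation site: a cheap test, plus a call only when "on". */
static inline void trace_hit(struct tracepoint_sketch *tp, int value)
{
    if (tp->probe)
        tp->probe(tp->data, value);
}

/* Attach a probe; the tracepoint is "on" from here. */
static void tp_register(struct tracepoint_sketch *tp, probe_fn fn, void *data)
{
    tp->data = data;
    tp->probe = fn;
}

/* Detach the probe; the tracepoint is "off" again. */
static void tp_unregister(struct tracepoint_sketch *tp)
{
    tp->probe = NULL;
    tp->data = NULL;
}

/* Example probe: accumulate the traced values into an int. */
static void sum_probe(void *data, int value)
{
    *(int *)data += value;
}
```

While no probe is attached, each pass through trace_hit() costs one load and one branch, which is the "tiny time penalty" the quoted text refers to.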

Addition and removal of tracepoints is synchronized by RCU using the
scheduler (and preempt_disable) as guarantees to find a quiescent state
(this is really RCU "classic"). The update side uses rcu_barrier_sched()
with call_rcu_sched() and the read/execute side uses
"preempt_disable()/preempt_enable()".

We make sure the previous array containing probes, which has been
scheduled for deletion by the rcu callback, is indeed freed before we
proceed to the next update. It therefore limits the rate of modification
of a single tracepoint to one update per RCU period. The objective here
is to permit fast batch add/removal of probes on _different_
tracepoints.

Changelog :
- Use #name ":" #proto as string to identify the tracepoint in the
tracepoint table. This will make sure no type mismatch happens due to
connection of a probe with the wrong type to a tracepoint declared with
the same name in a different header.
- Add tracepoint_entry_free_old.
- Change __TO_TRACE to get rid of the 'i' iterator.

Masami Hiramatsu :
Tested on x86-64.

Performance impact of a tracepoint : same as markers, except that it
adds about 70 bytes of instructions in an unlikely branch of each
instrumented function (the for loop, the stack setup and the function
call). It currently adds a memory read, a test and a conditional branch
at the instrumentation site (in the hot path). Immediate values will
eventually change this into a load immediate, test and branch, which
removes the memory read which will make the i-cache impact smaller
(changing the memory read for a load immediate removes 3-4 bytes per
site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it
also saves the d-cache hit).

About the performance impact of tracepoints (which is comparable to
markers), even without immediate values optimizations, tests done by
Hideo Aoki on ia64 show no regression. His test case was using hackbench
on a kernel where scheduler instrumentation (about 5 events in the
scheduler code) was added.

Quoting Hideo Aoki about Markers :

I evaluated overhead of kernel marker using linux-2.6-sched-fixes git
tree, which includes several markers for LTTng, using an ia64 server.

While the immediate trace mark feature isn't implemented on ia64, there
is no major performance regression. So, I think that we don't have any
issues to propose merging marker point patches into Linus's tree from
the viewpoint of performance impact.

I prepared two kernels to evaluate. The first one was compiled without
CONFIG_MARKERS. The second one was compiled with CONFIG_MARKERS enabled.

I downloaded the original hackbench from the following URL:
http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c

I ran hackbench 5 times in each condition and calculated the average and
difference between the kernels.

The parameter of hackbench: every 50 from 50 to 800
The number of CPUs of the server: 2, 4, and 8

Below are the results. As you can see, no major performance regression
was found in any case. Even if the number of processes increases, the
difference between the marker-enabled kernel and the marker-disabled kernel
doesn't increase. Moreover, if the number of CPUs increases, the difference
doesn't increase either.

Curiously, the marker-enabled kernel is better than the marker-disabled kernel
in more than half of the cases, although I guess it comes from the difference
in memory access patterns.

* 2 CPUs

Number of | without      | with         | diff   | diff  |
processes | Marker [Sec] | Marker [Sec] | [Sec]  | [%]   |
----------------------------------------------------------
       50 |        4.811 |        4.872 | +0.061 | +1.27 |
      100 |        9.854 |       10.309 | +0.454 | +4.61 |
      150 |       15.602 |       15.040 | -0.562 |  -3.6 |
      200 |       20.489 |       20.380 | -0.109 | -0.53 |
      250 |       25.798 |       25.652 | -0.146 | -0.56 |
      300 |       31.260 |       30.797 | -0.463 | -1.48 |
      350 |       36.121 |       35.770 | -0.351 | -0.97 |
      400 |       42.288 |       42.102 | -0.186 | -0.44 |
      450 |       47.778 |       47.253 | -0.526 |  -1.1 |
      500 |       51.953 |       52.278 | +0.325 | +0.63 |
      550 |       58.401 |       57.700 | -0.701 |  -1.2 |
      600 |       63.334 |       63.222 | -0.112 | -0.18 |
      650 |       68.816 |       68.511 | -0.306 | -0.44 |
      700 |       74.667 |       74.088 | -0.579 | -0.78 |
      750 |       78.612 |       79.582 | +0.970 | +1.23 |
      800 |       85.431 |       85.263 | -0.168 |  -0.2 |
----------------------------------------------------------

* 4 CPUs

Number of | without | with | diff | diff |
processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] |
--------------------------------------------------------------
50 | 2.586 | 2.584 | -0.003 | -0.1 |
100 | 5.254 | 5.283 | +0.030 | +0.56 |
150 | 8.012 | 8.074 | +0.061 | +0.76 |
200 | 11.172 | 11.000 | -0.172 | -1.54 |
250 | 13.917 | 14.036 | +0.119 | +0.86 |
300 | 16.905 | 16.543 | -0.362 | -2.14 |
350 | 19.901 | 20.036 | +0.135 | +0.68 |
400 | 22.908 | 23.094 | +0.186 | +0.81 |
450 | 26.273 | 26.101 | -0.172 | -0.66 |
500 | 29.554 | 29.092 | -0.461 | -1.56 |
550 | 32.377 | 32.274 | -0.103 | -0.32 |
600 | 35.855 | 35.322 | -0.533 | -1.49 |
650 | 39.192 | 38.388 | -0.804 | -2.05 |
700 | 41.744 | 41.719 | -0.025 | -0.06 |
750 | 45.016 | 44.496 | -0.520 | -1.16 |
800 | 48.212 | 47.603 | -0.609 | -1.26 |
--------------------------------------------------------------

* 8 CPUs

Number of | without | with | diff | diff |
processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] |
--------------------------------------------------------------
50 | 2.094 | 2.072 | -0.022 | -1.07 |
100 | 4.162 | 4.273 | +0.111 | +2.66 |
150 | 6.485 | 6.540 | +0.055 | +0.84 |
200 | 8.556 | 8.478 | -0.078 | -0.91 |
250 | 10.458 | 10.258 | -0.200 | -1.91 |
300 | 12.425 | 12.750 | +0.325 | +2.62 |
350 | 14.807 | 14.839 | +0.032 | +0.22 |
400 | 16.801 | 16.959 | +0.158 | +0.94 |
450 | 19.478 | 19.009 | -0.470 | -2.41 |
500 | 21.296 | 21.504 | +0.208 | +0.98 |
550 | 23.842 | 23.979 | +0.137 | +0.57 |
600 | 26.309 | 26.111 | -0.198 | -0.75 |
650 | 28.705 | 28.446 | -0.259 | -0.9 |
700 | 31.233 | 31.394 | +0.161 | +0.52 |
750 | 34.064 | 33.720 | -0.344 | -1.01 |
800 | 36.320 | 36.114 | -0.206 | -0.57 |
--------------------------------------------------------------
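
The two diff columns above can be reproduced from the timing columns. The
following is a minimal sketch, not part of the original patch; the function
names are hypothetical, and it assumes the percentage is taken relative to
the marker-disabled time. Because the tables appear to be computed from
unrounded averages, recomputing from the printed values can differ in the
last digit (for example, 10.309 - 9.854 gives 0.455, while the table shows
+0.454).

```c
#include <stdio.h>

/* Absolute difference in seconds: marker-enabled minus marker-disabled. */
double diff_sec(double without_marker, double with_marker)
{
    return with_marker - without_marker;
}

/* Relative difference in percent, with the marker-disabled run as baseline
 * (an assumption; the original message does not state the baseline). */
double diff_pct(double without_marker, double with_marker)
{
    return 100.0 * (with_marker - without_marker) / without_marker;
}

/* Print one row in roughly the layout of the tables above. */
void print_row(int nproc, double without_marker, double with_marker)
{
    printf("%4d | %7.3f | %7.3f | %+.3f | %+.2f |\n",
           nproc, without_marker, with_marker,
           diff_sec(without_marker, with_marker),
           diff_pct(without_marker, with_marker));
}
```

For the first 2-CPU row, print_row(50, 4.811, 4.872) yields a difference of
about +0.061 s, or roughly +1.27 %.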

Signed-off-by: Mathieu Desnoyers
Acked-by: Masami Hiramatsu
Acked-by: Peter Zijlstra
Signed-off-by: Ingo Molnar

Mathieu Desnoyers
2008-10-14 16:28:28 +0800

11 Oct, 2008

1 commit

061b1bd39 Staging: add TAINT_CRAP for all drivers/staging code

We need to add a flag for all code that is in the drivers/staging/
directory to prevent all other kernel developers from worrying about
issues here, and to notify users that the drivers might not be as good
as they are normally used to.

Based on code from Andreas Gruenbacher and Jeff Mahoney to provide a
TAINT flag for the support level of a kernel module in the Novell
enterprise kernel release.

This is the kernel portion of this feature; setting the flag has to be
done in the build process and will happen in a follow-up patch.
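
As a rough user-space illustration of the idea (a taint bit that is set
once a module from a particular directory is loaded), consider the sketch
below. It is not the kernel code: every name, the bit positions, and the
from_staging field are hypothetical stand-ins chosen for this example.

```c
#include <stdbool.h>

/* Hypothetical taint bits; the real kernel values are not reproduced here. */
#define DEMO_TAINT_PROPRIETARY (1u << 0)
#define DEMO_TAINT_STAGING     (1u << 1)

/* Global taint state: bits are only ever set, never cleared. */
static unsigned int tainted_mask;

void demo_add_taint(unsigned int flag)
{
    tainted_mask |= flag;
}

bool demo_is_tainted(unsigned int flag)
{
    return (tainted_mask & flag) != 0;
}

/* Hypothetical module descriptor; from_staging stands in for whatever
 * marker the build process would attach to staging drivers. */
struct demo_module {
    const char *name;
    bool from_staging;
};

/* Loading a staging module taints the whole system, not just the module. */
void demo_load(const struct demo_module *mod)
{
    if (mod->from_staging)
        demo_add_taint(DEMO_TAINT_STAGING);
}
```

Keeping the taint state global means a later bug report shows that staging
code was loaded at some point, even if the module has since been removed.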

Cc: Andreas Gruenbacher
Cc: Jeff Mahoney
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2008-10-11 06:31:05 +0800

26 Aug, 2008

1 commit

ffb4ba76a [module] Don't let gcc inline load_module()

'load_module()' is a complex function that contains all the ELF section
logic, and inlining it is utterly insane. But gcc will do it, simply
because there is only one call-site. As a result, all the stack space
that is allocated for all the work to load the module will still be
active when we actually call the module init sequence, and the deep call
chain makes stack overflows happen.

And stack overflows are really hard to debug, because they not only
corrupt random pages below the stack, but also corrupt the thread_info
structure that is allocated under the stack.

In this case, Alan Brunelle reported some crazy oopses at bootup, after
loading the processor module that ends up doing complex ACPI stuff and
has quite a deep callchain. This should fix it, and is the sane thing
to do regardless.
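
The effect described above can be sketched in user space with the GCC/Clang
noinline attribute. This is an illustration under assumptions, not the
kernel change itself: parse_image(), load_and_init(), and demo_init() are
hypothetical stand-ins for load_module(), its single caller, and a module's
init function.

```c
#include <stddef.h>
#include <string.h>

/* GCC/Clang attribute that forbids inlining a function into its caller. */
#define NOINLINE __attribute__((noinline))

typedef int (*init_fn)(void);

/* Stand-in for load_module(): needs a large stack frame for its work.
 * Returns -1 for an empty image, otherwise a simple byte sum. */
NOINLINE int parse_image(const unsigned char *image, size_t len)
{
    unsigned char scratch[2048];
    size_t n = len < sizeof(scratch) ? len : sizeof(scratch);
    int sum = 0;

    if (len == 0)
        return -1;
    memcpy(scratch, image, n);
    for (size_t i = 0; i < n; i++)
        sum += scratch[i];
    return sum;
}

/* Stand-in for a module init function with its own deep call chain. */
int demo_init(void)
{
    return 0;
}

/* Single call site of parse_image(). Because parse_image() cannot be
 * inlined, its 2 KiB scratch buffer is released before init() runs. */
int load_and_init(const unsigned char *image, size_t len, init_fn init)
{
    if (parse_image(image, len) < 0)
        return -1;
    return init();
}
```

Without the attribute, a compiler may merge parse_image() into
load_and_init(), so the scratch buffer would stay on the stack for the
whole duration of init().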

Cc: Alan D. Brunelle
Cc: Arjan van de Ven
Cc: Rusty Russell
Signed-off-by: Linus Torvalds

Linus Torvalds
2008-08-26 02:10:26 +0800

12 Aug, 2008

1 commit

59f9415ff modules: extend initcall_debug functionality to the module loader

The kernel has this really nice facility where if you put "initcall_debug"
on the kernel commandline, it'll print which function it's going to
execute just before calling an initcall, and then after the call completes
it will:

1) print whether it returned an error code,

2) check for a few simple bugs (like leaving irqs off), and

3) print how long the initcall took in milliseconds.

While trying to optimize the boot speed of my laptop, I have been loving
number 3 to figure out what to optimize... and then I wished that the
same thing was done for module loading.

This patch makes the module loader use this exact same functionality; it's
a logical extension in my view (since modules are just sort of late
binding initcalls anyway) and so far I've found it quite useful in finding
where things are too slow in my boot.
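
The behaviour described above (announce the call, run it, then report the
return value and elapsed time) can be sketched in user space as follows.
This is a hedged illustration, not the kernel implementation: the function
names, the debug parameter, and the message format are assumptions made
for this example.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

typedef int (*initcall_t)(void);

/* Monotonic timestamp in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical init function used only to exercise the wrapper. */
int demo_init(void)
{
    return 0;
}

/* Run one init function; when debug is non-zero, print its name before
 * the call and its return value and duration in milliseconds afterwards. */
int debug_do_initcall(const char *name, initcall_t fn, int debug)
{
    long long start = 0;
    int ret;

    if (debug) {
        printf("calling %s\n", name);
        start = now_ns();
    }

    ret = fn();

    if (debug) {
        long long msecs = (now_ns() - start) / 1000000LL;

        printf("initcall %s returned %d after %lld msecs\n",
               name, ret, msecs);
    }
    return ret;
}
```

The same wrapper can serve both boot-time initcalls and module init
functions, which is the point the patch description makes about modules
being late-binding initcalls.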

Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Rusty Russell

Arjan van de Ven
2008-08-12 15:52:54 +0800

28 Jul, 2008

2 commits

9b1a4d383 stop_machine: Wean existing callers off stop_machine_run()

Signed-off-by: Rusty Russell

Rusty Russell
2008-07-28 10:16:31 +0800

15bba37d6 module: fix build warning with !CONFIG_KALLSYMS

This patch fixes the warning:

CC kernel/module.o
/home/wangcong/Projects/linux-2.6/kernel/module.c:332: warning:
‘lookup_symbol’ defined but not used

Signed-off-by: WANG Cong
Signed-off-by: Rusty Russell

WANG Cong
2008-07-28 10:16:28 +0800

22 Jul, 2008

1 commit

3a642e99b modules: Take a shortcut for checking if an address is in a module

This patch keeps track of the boundaries of module allocation, in
order to speed up module_text_address().

Inspired by Arjan's version, which required arch-specific defines:

Various pieces of the kernel (lockdep, latencytop, etc) tend
to store backtraces, sometimes at a relatively high
frequency. In itself this isn't a big performance deal (after
all you're using diagnostics features), but there have been
some complaints from people who have over 100 modules loaded
that this is a tad too slow.

This is due to the new backtracer code which looks at every
slot on the stack to see if it's a kernel/module text address,
so that's 1024 slots. 1024 times 100 modules... that's a lot
of list walking.
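
The shortcut can be sketched as a pair of global bounds that are updated on
every allocation and checked before any list walk. The code below is a
user-space illustration under assumptions, not the kernel code: the region
list, the names, and the choice never to shrink the bounds are all
specific to this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one allocated text region. */
struct region {
    uintptr_t start;
    size_t size;
    struct region *next;
};

static struct region *regions;

/* Lowest start and highest end seen so far across all regions. */
static uintptr_t addr_min = UINTPTR_MAX;
static uintptr_t addr_max = 0;

/* Register a region and widen the global bounds to cover it. */
void region_add(struct region *r)
{
    if (r->start < addr_min)
        addr_min = r->start;
    if (r->start + r->size > addr_max)
        addr_max = r->start + r->size;
    r->next = regions;
    regions = r;
}

/* Return true if addr lies inside any registered region. */
bool addr_in_region(uintptr_t addr)
{
    /* Fast path: most addresses fall outside the bounds entirely. */
    if (addr < addr_min || addr >= addr_max)
        return false;

    /* Slow path: walk the list only for plausible addresses. */
    for (struct region *r = regions; r; r = r->next)
        if (addr >= r->start && addr - r->start < r->size)
            return true;
    return false;
}
```

With 1024 stack slots and 100 loaded modules, most slots hold values
outside the bounds and are rejected by the two comparisons in the fast
path, so the list is walked only for the few values that could be module
text addresses.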

Signed-off-by: Rusty Russell

Rusty Russell
2008-07-22 17:24:28 +0800