11 Aug, 2011
1 commit
-
Copy the information needed from struct module into a local module list
held within tracepoint.c from within the module coming/going notifier.

This vastly simplifies locking of tracepoint registration /
unregistration, because we don't have to take the module mutex to
register and unregister tracepoints anymore. Steven Rostedt ran into
dependency problems related to modules mutex vs kprobes mutex vs ftrace
mutex vs tracepoint mutex that seem to be hard to fix without removing
this dependency between tracepoint and module mutex. (note: it should be
investigated whether kprobes could benefit from being dissociated from the
modules mutex too.)

This also fixes module handling of tracepoint list iterators, because it
was expecting the list to be sorted by pointer address. Given we have
control on our own list now, it's OK to sort this list which has
tracepoints as its only purpose. The reason why this sorting is required
is to handle the fact that seq files (and any read() operation from
user-space) cannot hold the tracepoint mutex across multiple calls, so
list entries may vanish between calls. With sorting, the tracepoint
iterator becomes usable even if the list doesn't contain the exact item
pointed to by the iterator anymore.

Signed-off-by: Mathieu Desnoyers
Acked-by: Jason Baron
CC: Ingo Molnar
CC: Lai Jiangshan
CC: Peter Zijlstra
CC: Thomas Gleixner
CC: Masami Hiramatsu
Link: http://lkml.kernel.org/r/20110810191839.GC8525@Krystal
Signed-off-by: Steven Rostedt
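
The point about sorting can be illustrated with a minimal userspace sketch. It assumes a singly linked list kept in ascending key order, where the key stands in for a pointer address. The names `entry` and `resume_from` are hypothetical and are not taken from tracepoint.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node; the list is kept sorted by ascending key. */
struct entry {
	uintptr_t key;          /* stands in for a pointer address */
	struct entry *next;
};

/*
 * Return the first entry whose key is >= saved, or NULL at end of list.
 * The entry that originally produced 'saved' may have been removed.
 */
static struct entry *resume_from(struct entry *head, uintptr_t saved)
{
	struct entry *e;

	for (e = head; e != NULL; e = e->next) {
		/* Precondition of this sketch: the list is sorted. */
		assert(e->next == NULL || e->key <= e->next->key);
		if (e->key >= saved)
			return e;
	}
	return NULL;
}
```

An iterator that remembers only the key of the last entry it returned can call `resume_from()` on the next read() and continue from the nearest surviving entry, which would not be well defined on an unsorted list.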
24 Jul, 2011
3 commits
-
Userspace wants to manage module parameters with udev rules.
This currently only works for loaded modules, but not for
built-in ones.

To allow access to the built-in modules we need to
re-trigger all module load events that happened before any
userspace was running. We already do the same thing for all
devices, subsystems(buses) and drivers.

This adds the currently missing /sys/module//uevent files
to all module entries.

Signed-off-by: Kay Sievers
Signed-off-by: Rusty Russell (split & trivial fix)
-
This simplifies the next patch, where we have an attribute on a
builtin module (ie. module == NULL).

Signed-off-by: Kay Sievers
Signed-off-by: Rusty Russell (split into 2)
-
The module loader code allows architectures to hook into the code by
providing a small number of entry points that each arch must implement.
This patch provides __weakly linked generic implementations of these
entry points for architectures that don't need to do anything special.

Signed-off-by: Jonas Bonn
Signed-off-by: Rusty Russell
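
As a rough illustration of weak linking, here is a minimal sketch that assumes a GCC- or Clang-compatible toolchain, where the attribute is spelled `__attribute__((weak))` (the kernel wraps it as `__weak`). The hook name `arch_module_hook` and the caller `load_image` are hypothetical and are not the real module loader entry points.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical generic default. Because it is weak, an architecture can
 * provide a normal (strong) definition with the same name in another
 * object file, and the linker will pick that one instead.
 */
__attribute__((weak)) int arch_module_hook(void *image, size_t len)
{
	(void)image;
	(void)len;
	return 0; /* nothing architecture-specific to do */
}

/* Generic code calls the hook without knowing which definition is linked. */
int load_image(void *image, size_t len)
{
	assert(image != NULL);
	return arch_module_hook(image, len);
}
```

With only the weak definition linked in, `load_image()` returns 0. An architecture that needs special handling supplies its own `arch_module_hook()` and no change to the generic caller is required.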
24 May, 2011
1 commit
-
* 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (970 commits)
staging: usbip: replace usbip_u{dbg,err,info} and printk with dev_ and pr_
staging:iio: Trivial kconfig reorganization and uniformity improvements.
staging:iio:documenation partial update.
staging:iio: use pollfunc allocation helpers in remaining drivers.
staging:iio:max1363 misc cleanups and use of for_each_bit_set to simplify event code spitting out.
staging:iio: implement an iio_info structure to take some of the constant elements out of iio_dev.
staging:iio:meter:ade7758: Use private data space from iio_allocate_device
staging:iio:accel:lis3l02dq make write_reg_8 take value not a pointer to value.
staging:iio: ring core cleanups + check if read_last available in lis3l02dq
staging:iio:core cleanup: squash tiny wrappers and use dev_set_name to handle creation of event interface name.
staging:iio: poll func allocation clean up.
staging:iio:ad7780 trivial unused header cleanup.
staging:iio:adc: AD7780: Use private data space from iio_allocate_device + trivial fixes
staging:iio:adc:AD7780: Convert to new channel registration method
staging:iio:adc: AD7606: Drop dev_data in favour of iio_priv()
staging:iio:adc: AD7606: Consitently use indio_dev
staging:iio: Rip out helper for software rings.
staging:iio:adc:AD7298: Use private data space from iio_allocate_device
staging:iio: rationalization of different buffer implementation hooks.
staging:iio:imu:adis16400 avoid allocating rx, tx, and state separately from iio_dev.
...

Fix up trivial conflicts in
- drivers/staging/intel_sst/intelmid.c: patches applied in both branches
- drivers/staging/rt2860/common/cmm_data_{pci,usb}.c: removed vs spelling
- drivers/staging/usbip/vhci_sysfs.c: trivial header file inclusion
19 May, 2011
7 commits
-
The function is_exported() and its helper function lookup_symbol() are used to
verify if a provided symbol is effectively exported by the kernel or by the
modules. Now that both have their symbols sorted we can replace a linear search
with a binary search, which provides a considerable speed-up.

This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani
Acked-by: Greg Kroah-Hartman
Signed-off-by: Rusty Russell
-
Takes advantage of the order and locates symbols using binary search.
This work was supported by a hardware donation from the CE Linux Forum.
Signed-off-by: Alessio Igor Bogani
Signed-off-by: Rusty Russell
Tested-by: Dirk Behme
-
Instead of having a callback function for each symbol in the kernel,
have a callback for each array of symbols.

This eases the logic when we move to sorted symbols and binary search.

Signed-off-by: Rusty Russell
Signed-off-by: Alessio Igor Bogani
-
Split the unprotect function into a function per section to make
the code more readable and add the missing static declaration.

Signed-off-by: Jan Glauber
Signed-off-by: Rusty Russell
-
While debugging I stumbled over two problems in the code that protects module
pages.

First issue is that disabling the protection before freeing init or unload of
a module is not symmetric with the enablement. For instance, if pages are set
to RO the page range from module_core to module_core + core_ro_size is
protected. If a module is unloaded the page range from module_core to
module_core + core_size is set back to RW.
So pages that were not set to RO are also changed to RW.
This is not critical but IMHO it should be symmetric.

Second issue is that while set_memory_rw & set_memory_ro are used for
RO/RW changes only set_memory_nx is involved for NX/X. One would expect that
the inverse function is called when the NX protection should be removed,
which is not the case here, unless I'm missing something.

Signed-off-by: Jan Glauber
Signed-off-by: Rusty Russell
-
Reset mod->init_ro_size to zero after the init part of a module is unloaded.
Otherwise we need to check if module->init is NULL in the unprotect functions
in the next patch.

Signed-off-by: Jan Glauber
Signed-off-by: Rusty Russell
-
Fix function prototype to be ANSI-C compliant, consistent with other
function prototypes, addressing a sparse warning.

Signed-off-by: Daniel J Blueman
Signed-off-by: Rusty Russell
26 Apr, 2011
1 commit
-
Driver modules from the staging directory are marked 'tainted'
by module.c. Subsequently, tainted modules are denied dynamic
debugging. This is unwanted behavior, since staging modules should
be able to use the dynamic debugging mechanism.

Please merge this also into the staging-linus branch.

Signed-off-by: Roland Vossen
Acked-by: Jason Baron
Signed-off-by: Greg Kroah-Hartman
31 Mar, 2011
1 commit
-
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi
23 Mar, 2011
1 commit
-
In an effort to reduce kernel address leaks that might be used to help
target kernel privilege escalation exploits, this patch uses %pK when
displaying addresses in /proc/kallsyms, /proc/modules, and
/sys/module/*/sections/*.

Note that this changes %x to %p, so some legitimately 0 values in
/proc/kallsyms would have changed from 00000000 to "(null)". To avoid
this, "(null)" is not used when using the "K" format. Anything that was
already successfully parsing "(null)" in addition to full hex digits
should have no problem with this change. (Thanks to Joe Perches for the
suggestion.) Due to the %x to %p, "void *" casts are needed since these
addresses are already "unsigned long" everywhere internally, due to their
starting life as ELF section offsets.

Signed-off-by: Kees Cook
Cc: Eugene Teo
Cc: Dan Rosenberg
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
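
The output behaviour described above can be approximated in userspace with a small sketch. `%pK` is a kernel-only printk extension, so this example uses plain `snprintf()` and a hypothetical `restricted` flag in place of the kernel's privilege check. The function name `format_addr` is also an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format an address as fixed-width hex. A hidden address and a genuine
 * zero both print as a run of zeros, never as "(null)", so existing
 * parsers of the file keep working.
 */
static int format_addr(char *buf, size_t size, unsigned long addr,
		       int restricted)
{
	assert(buf != NULL && size > 0);
	if (restricted)
		addr = 0; /* hide the real value from this reader */
	return snprintf(buf, size, "%0*lx", (int)(2 * sizeof(void *)), addr);
}
```

The field width is derived from the pointer size, so the column layout is the same whether or not the value is hidden.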
03 Feb, 2011
1 commit
-
Make the tracepoints more robust, making them solid enough to handle compiler
changes by not relying on anything based on compiler-specific behavior with
respect to structure alignment. Implement an approach proposed by David Miller:
use an array of const pointers to refer to the individual structures, and export
this pointer array through the linker script rather than the structures per se.
It will consume 32 extra bytes per tracepoint (24 for structure padding and 8
for the pointers), but is less likely to break due to compiler changes.

History:

commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()
added the aligned(32) type and variable attribute to the tracepoint structures
to deal with gcc happily aligning statically defined structures on 32-byte
multiples.

One attempt was to use an 8-byte alignment for tracepoint structures by applying
both the variable and type attribute to tracepoint structures definitions and
declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5.

The reason is that the "aligned" attribute only specifies the _minimum_ alignment
for a structure, leaving both the compiler and the linker free to align on
larger multiples. Because tracepoint.c expects the structures to be placed as an
array within each section, up-alignment causes NULL-pointer exceptions due to the
extra unexpected padding.

(this patch applies on top of -tip)

Signed-off-by: Mathieu Desnoyers
Acked-by: David S. Miller
LKML-Reference:
CC: Frederic Weisbecker
CC: Ingo Molnar
CC: Thomas Gleixner
CC: Andrew Morton
CC: Peter Zijlstra
CC: Rusty Russell
Signed-off-by: Steven Rostedt
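
The pointer-array approach can be sketched in ordinary C. In the kernel the pointer array is gathered by the linker script; in this sketch it is a plain array, and `struct tp`, `tp_ptrs` and `enable_all` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracepoint structure. */
struct tp {
	const char *name;
	int state;
};

/* The structures themselves may end up with any padding between them. */
static struct tp tp_a = { "a", 0 };
static struct tp tp_b = { "b", 0 };

/* Iteration walks this densely packed array of pointers instead. */
static struct tp *const tp_ptrs[] = { &tp_a, &tp_b };

/* Enable every entry in [begin, end) and return how many were visited. */
static size_t enable_all(struct tp *const *begin, struct tp *const *end)
{
	struct tp *const *iter;
	size_t n = 0;

	assert(begin <= end);
	for (iter = begin; iter < end; iter++) {
		(*iter)->state = 1;
		n++;
	}
	return n;
}
```

The loop stride is always `sizeof(struct tp *)`, so the walk does not depend on how the compiler or linker aligns the structures.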
23 Dec, 2010
2 commits
-
The commit:

 84e1c6bb38eb318e456558b610396d9f1afaabf0
 x86: Add RO/NX protection for loadable kernel modules

Broke the function tracer with this output:

------------[ cut here ]------------
WARNING: at kernel/trace/ftrace.c:1014 ftrace_bug+0x114/0x171()
Hardware name: Precision WorkStation 470
Modules linked in: i2c_core(+)
Pid: 86, comm: modprobe Not tainted 2.6.37-rc2+ #68
Call Trace:
[] warn_slowpath_common+0x85/0x9d
[] ? __process_new_adapter+0x7/0x34 [i2c_core]
[] ? __process_new_adapter+0x7/0x34 [i2c_core]
[] warn_slowpath_null+0x1a/0x1c
[] ftrace_bug+0x114/0x171
[] ? __process_new_adapter+0x7/0x34 [i2c_core]
[] ftrace_process_locs+0x1ae/0x274
[] ? __process_new_adapter+0x7/0x34 [i2c_core]
[] ftrace_module_notify+0x39/0x44
[] notifier_call_chain+0x37/0x63
[] __blocking_notifier_call_chain+0x46/0x5b
[] blocking_notifier_call_chain+0x14/0x16
[] sys_init_module+0x73/0x1f3
[] system_call_fastpath+0x16/0x1b
---[ end trace 2aff4f4ca53ec746 ]---
ftrace faulted on writing []
__process_new_adapter+0x7/0x34 [i2c_core]

The cause was that the module text was set to read only before ftrace
could convert the calls to mcount to nops. Thus, the conversions failed
due to not being able to write to the text locations.

The simple fix is to move setting the module to read only after the
module notifiers are called (where ftrace sets the module mcounts to nops).

Reported-by: Peter Zijlstra
Acked-by: Rusty Russell
Signed-off-by: Steven Rostedt
18 Nov, 2010
1 commit
-
This patch is a logical extension of the protection provided by
CONFIG_DEBUG_RODATA to LKMs. The protection is provided by
splitting module_core and module_init into three logical parts
each and setting appropriate page access permissions for each
individual section:

 1. Code: RO+X
 2. RO data: RO+NX
 3. RW data: RW+NX

In order to achieve proper protection, layout_sections() has
been modified to align each of the three parts mentioned above
onto page boundary. Next, the corresponding page access
permissions are set right before successful exit from
load_module(). Further, free_module() and sys_init_module have
been modified to set module_core and module_init as RW+NX right
before calling module_free().

By default, the original section layout and access flags are
preserved. When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y,
the patch will page-align each group of sections to ensure that
each page contains only one type of content and will enforce
RO/NX for each group of pages.

-v1: Initial proof-of-concept patch.
-v2: The patch has been re-written to reduce the number of #ifdefs
and to make it architecture-agnostic. Code formatting has also
been corrected.
-v3: Opportunistic RO/NX protection is now unconditional. Section
page-alignment is enabled when CONFIG_DEBUG_RODATA=y.
-v4: Removed most macros and improved coding style.
-v5: Changed page-alignment and RO/NX section size calculation
-v6: Fixed comments. Restricted RO/NX enforcement to x86 only
-v7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added
calls to set_all_modules_text_rw() and set_all_modules_text_ro()
in ftrace
-v8: updated for compatibility with linux 2.6.33-rc5
-v9: coding style fixes
-v10: more coding style fixes
-v11: minor adjustments for -tip
-v12: minor adjustments for v2.6.35-rc2-tip
-v13: minor adjustments for v2.6.37-rc1-tip

Signed-off-by: Siarhei Liakh
Signed-off-by: Xuxian Jiang
Acked-by: Arjan van de Ven
Reviewed-by: James Morris
Signed-off-by: H. Peter Anvin
Cc: Andi Kleen
Cc: Rusty Russell
Cc: Stephen Rothwell
Cc: Dave Jones
Cc: Kees Cook
Cc: Linus Torvalds
LKML-Reference:
[ minor cleanliness edits, -v14: build failure fix ]
Signed-off-by: Ingo Molnar
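
The page-alignment step can be sketched as simple arithmetic. This assumes a 4096-byte page and a hypothetical `struct layout` that records the end offset of each of the three parts. It is not the kernel's layout_sections().

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096UL /* assumed page size for this sketch */

/* Round x up to the next multiple of the (power-of-two) page size. */
static unsigned long page_align(unsigned long x)
{
	return (x + PAGE_SZ - 1) & ~(PAGE_SZ - 1);
}

/* Hypothetical layout record: each field is an end offset from the base. */
struct layout {
	unsigned long text_end; /* [0, text_end): code, RO+X */
	unsigned long ro_end;   /* [text_end, ro_end): read-only data, RO+NX */
	unsigned long total;    /* [ro_end, total): writable data, RW+NX */
};

/* Place the three parts back to back, each starting on a page boundary. */
static struct layout lay_out(unsigned long text, unsigned long rodata,
			     unsigned long rwdata)
{
	struct layout l;

	l.text_end = page_align(text);
	l.ro_end = page_align(l.text_end + rodata);
	l.total = page_align(l.ro_end + rwdata);
	assert(l.text_end <= l.ro_end && l.ro_end <= l.total);
	return l;
}
```

Because every boundary is page aligned, no page holds two kinds of content, so permissions can be applied to whole page ranges. The cost is up to one page of padding per part.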
11 Nov, 2010
1 commit
-
On use of trace_printk() there's a macro that determines if the format
is static or a variable. If it is static, it defaults to __trace_bprintk()
otherwise it uses __trace_printk().

A while ago, Lai Jiangshan added __trace_bprintk(). In that patch, we
discussed a way to allow modules to use it. The difference between
__trace_bprintk() and __trace_printk() is that for faster processing,
just the format and args are stored in the trace instead of running
it through a sprintf function. In order to do this, the format used
by the __trace_bprintk() had to be persistent.

See commit 1ba28e02a18cbdbea123836f6c98efb09cbf59ec

The problem comes with trace_bprintk() where the module is unloaded.
The pointer left in the buffer is still pointing to the format.

To solve this issue, the formats in the module were copied into kernel
core. If the same format was used, they would use the same copy (to prevent
memory leak). This all worked well until we tried to merge everything.

At the time this was written, Lai Jiangshan, Frederic Weisbecker,
Ingo Molnar and myself were all touching the same code. When this was
merged, we lost the part of it that was in module.c. This kept out the
copying of the formats and unloading the module could cause bad pointers
left in the ring buffer.

This patch adds back (with updates required for current kernel) the
module code that sets up the necessary pointers.

Cc: Lai Jiangshan
Cc: Rusty Russell
Signed-off-by: Steven Rostedt
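
The "copy the format once and share the copy" idea can be sketched in userspace as a small string-interning list. The names `fmt_node`, `fmt_list` and `hold_format` are hypothetical, and this sketch uses malloc() and no locking, unlike kernel code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node holding one persistent copy of a format string. */
struct fmt_node {
	struct fmt_node *next;
	char fmt[];
};

static struct fmt_node *fmt_list;

/*
 * Return a persistent copy of fmt, reusing an existing copy when the
 * same text was stored before. Returns NULL if allocation fails.
 */
static const char *hold_format(const char *fmt)
{
	struct fmt_node *n;
	size_t len;

	assert(fmt != NULL);
	for (n = fmt_list; n != NULL; n = n->next) {
		if (strcmp(n->fmt, fmt) == 0)
			return n->fmt; /* already held: share it */
	}
	len = strlen(fmt) + 1;
	n = malloc(sizeof(*n) + len);
	if (n == NULL)
		return NULL;
	memcpy(n->fmt, fmt, len);
	n->next = fmt_list;
	fmt_list = n;
	return n->fmt;
}
```

A pointer returned by `hold_format()` stays valid after the caller's own storage goes away, which is the property the ring buffer needs once a module has been unloaded.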
27 Oct, 2010
1 commit
-
Building with CONFIG_KALLSYMS=n gives following warning:
/mnt/src/linux-git/kernel/module.c: In function ‘post_relocation’:
/mnt/src/linux-git/kernel/module.c:2534:2: warning: passing argument 2 of ‘add_kallsyms’ discards qualifiers from pointer target type
/mnt/src/linux-git/kernel/module.c:2038:13: note: expected ‘struct load_info *’ but argument is of type ‘const struct load_info *’

Signed-off-by: Michał Mirosław
Signed-off-by: Rusty Russell
08 Oct, 2010
1 commit
-
Conflicts:
arch/x86/kernel/module.c

Merge reason: Resolve the conflict, pick up fixes.

Signed-off-by: Ingo Molnar
06 Oct, 2010
1 commit
-
With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.

However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling. That code was
doubly obscure because (a) the code itself lives in lib/bugs.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.

Calling it from arch-specific code makes no sense what-so-ever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.

So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.

Future fixups:
- move the module list handling code into kernel/module.c where it
belongs.
- get rid of 'module_bug_list' and just use the regular list of modules
(called 'modules' - imagine that) that we already create and maintain
for other reasons.

Reported-and-tested-by: Thomas Gleixner
Cc: Rusty Russell
Cc: Adrian Bunk
Cc: Andrew Morton
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds
23 Sep, 2010
1 commit
-
Base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
assembly gcc mechanism, we can now branch to labels from an 'asm goto'
statement. This allows us to create a 'no-op' fastpath, which can subsequently
be patched with a jump to the slowpath code. This is useful for code which
might be rarely used, but which we'd like to be able to call, if needed.
Tracepoints are the current usecase that these are being implemented for.

Acked-by: David S. Miller
Signed-off-by: Jason Baron
LKML-Reference:
[ cleaned up some formatting ]
Signed-off-by: Steven Rostedt
05 Aug, 2010
16 commits
-
On my (32-bit x86) machine, sys_init_module() uses 124 bytes of stack
once load_module() is inlined.

This effectively reverts ffb4ba76 which inlined it due to stack
pressure.

Signed-off-by: Rusty Russell
-
This simply hoists more code out of load_module; we also put the
identification of the extable and dynamic debug table in with the
others in find_module_sections().

We move the taint check to the actual add/remove of the dynamic debug
info: this is certain (find_module_sections is too early).

Signed-off-by: Rusty Russell
Cc: Yehuda Sadeh
-
Instead of copying and allocating the args and storing it in
load_info, we can just allocate them right before we need them.

Signed-off-by: Rusty Russell
-
Pass the struct load_info into all the other functions in module
loading. This neatens things and makes them more consistent.

Signed-off-by: Rusty Russell
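
What passing one info pointer looks like can be shown with a heavily reduced sketch. The field names follow the ones mentioned elsewhere in this log (`sechdrs`, `secstrings`), but `struct sec_hdr`, `num_sections` and the exact shape of `find_sec()` here are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, much-reduced section header. */
struct sec_hdr {
	unsigned int name_off; /* offset of the section name in secstrings */
};

/* Hypothetical bundle of loader state, passed around as one pointer. */
struct load_info {
	const struct sec_hdr *sechdrs;
	const char *secstrings;
	unsigned int num_sections;
};

/* Return the index of the named section, or 0 if absent (index 0 is reserved). */
static unsigned int find_sec(const struct load_info *info, const char *name)
{
	unsigned int i;

	assert(info != NULL && name != NULL);
	for (i = 1; i < info->num_sections; i++) {
		const char *sname = info->secstrings + info->sechdrs[i].name_off;

		if (strcmp(sname, name) == 0)
			return i;
	}
	return 0;
}
```

A caller then writes `find_sec(&info, "__versions")` rather than passing the header, section table and string table separately, and new state can be added to the structure without touching every signature.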
-
Restore the stub module_remove_modinfo_attrs, remove the now-unused
!CONFIG_SYSFS module_sysfs_init.

Also, rename mod_kobject_remove() to mod_sysfs_teardown() as
it is the logical counterpart to mod_sysfs_setup now.

Reported-by: Randy Dunlap
Signed-off-by: Rusty Russell
-
We change the sysfs functions to take struct load_info, and call
them all in mod_sysfs_setup().

We also clean up the #ifdefs a little.

Signed-off-by: Rusty Russell
-
layout_and_allocate() does everything up to and including the final
struct module placement inside the allocated module memory. We have
to store the symbol layout information in our struct load_info though.

This avoids the nasty code we had before where 'mod' pointed first
to the version inside the temporary allocation containing the entire
file, then later was moved to point to the real struct module: now
the main code only ever sees the final module address.

(Includes fix for the Tony Luck-found Linus-diagnosed failure path
error).

Signed-off-by: Rusty Russell
-
Andrew had the sole pleasure of tickling this bug in linux-next; when we set
up "info->strtab" it's pointing into the temporary copy of the module. For
most uses that is fine, but kallsyms keeps a pointer around during module
load (inside mod->strtab).

If we oops for some reason inside a module's init function, kallsyms will use
the mod->strtab pointer into the now-freed temporary module copy.

(Later oopses work fine: after init we overwrite mod->strtab to point to a
compacted core-only strtab).

Reported-by: Andrew "Grumpy" Morton
Signed-off-by: Rusty "Buggy" Russell
Tested-by: Andrew "Happy" Morton
Signed-off-by: Rusty Russell
-
Simple refactor causes us to lift struct definition to top of file.
Signed-off-by: Rusty Russell
-
We can't do the find_sec after removing the SHF_ALLOC flags; it won't
find the sections.

Signed-off-by: Rusty Russell
-
Put all the "rewrite and check section headers" in one place. This
adds another iteration over the sections, but it's far clearer. We
iterate once for every find_section() so we already iterate over many
times.

Signed-off-by: Rusty Russell
-
Btw, here's a patch that _looks_ large, but it's really pretty trivial, and
sets things up so that it would be way easier to split off pieces of the
module loading.

The reason it looks large is that it creates a "module_info" structure
that contains all the module state that we're building up while loading,
instead of having individual variables for all the indices etc.

So the patch ends up being large, because every "symindex" access instead
becomes "info.index.sym" etc. That may be a few characters longer, but it
then means that we can just pass a pointer to that "info" structure
around, and let all the pieces fill it in very naturally.

As an example of that, the patch also moves the initialization of all
those convenience variables into a "setup_module_info()" function. And at
this point it really does become very natural to start to peel off some of
the error labels and move them into the helper functions - now the
"truncated" case is gone, and is handled inside that setup function
instead.

So maybe you don't like this approach, and it does make the variable
accesses a bit longer, but I don't think unreadably so. And the patch
really does look big and scary, but there really should be absolutely no
semantic changes - most of it was a trivial and mindless rename.

In fact, it was so mindless that I on purpose kept the existing helper
functions looking like this:

 - err = check_modinfo(mod, sechdrs, infoindex, versindex);
 + err = check_modinfo(mod, info.sechdrs, info.index.info, info.index.vers);

rather than changing them to just take the "info" pointer. IOW, a second
phase (if you think the approach is ok) would change that calling
convention to just do

 err = check_modinfo(mod, &info);

(and same for "layout_sections()", "layout_symtabs()" etc.) Similarly,
while right now it makes things _look_ bigger, with things like this:

 versindex = find_sec(hdr, sechdrs, secstrings, "__versions");

becoming

 info->index.vers = find_sec(info->hdr, info->sechdrs, info->secstrings, "__versions");

in the new "setup_module_info()" function, that's again just a result of
it being a search-and-replace patch. By using the 'info' pointer, we could
just change the 'find_sec()' interface so that it ends up being

 info->index.vers = find_sec(info, "__versions");

instead, and then we'd actually have a shorter and more readable line. So
for a lot of those mindless variable name expansions there would be room
for separate cleanups.

I didn't move quite everything in there - if we do this to layout_symtabs,
for example, we'd want to move the percpu, symoffs, stroffs, *strmap
variables to be fields in that module_info structure too. But that's a
much smaller patch, I moved just the really core stuff that is currently
being set up and used in various parts.

But even in this rough form, it removes close to 70 lines from that
function (but adds 22 lines overall, of course - the structure definition,
the helper function declarations and call-sites etc etc).

Signed-off-by: Linus Torvalds
Signed-off-by: Rusty Russell
-
And now that I'm looking at that call-chain (to see if it would make sense
to use some other more specific lock - doesn't look like it: all the
readers are using RCU and this is the only writer), I also give you this
trivial one-liner. It changes each_symbol() to not put that constant array
on the stack, resulting in changing

 movq $C.388.31095, %rsi #, tmp85
 subq $376, %rsp #,
 movq %rdi, %rbx # fn, fn
 leaq -208(%rbp), %rdi #, tmp84
 movq %rbx, %rdx # fn,
 rep movsl
 xorl %esi, %esi #
 leaq -208(%rbp), %rdi #, tmp87
 movq %r12, %rcx # data,
 call each_symbol_in_section.clone.0 #

into

 xorl %esi, %esi #
 subq $216, %rsp #,
 movq %rdi, %rbx # fn, fn
 movq $arr.31078, %rdi #,
 call each_symbol_in_section.clone.0 #

which is not so much about being obviously shorter and simpler because we
don't unnecessarily copy that constant array around onto the stack, but
also about having a much smaller stack footprint (376 vs 216 bytes - see
the update of 'rsp').

Signed-off-by: Linus Torvalds
Signed-off-by: Rusty Russell
-
1) Extract out the relocation loop into apply_relocations
2) Extract license and version checks into check_module_license_and_versions
3) Extract icache flushing into flush_module_icache
4) Move __obsparm warning into find_module_sections
5) Move license setting into check_modinfo.

Signed-off-by: Rusty Russell
-
Allocate references inside module_unload_init(), clean up inside
module_unload_free().

This version fixed to do allocation before __this_cpu_write, thanks to
bug reports from linux-next from Dave Young
and Stephen Rothwell.

Signed-off-by: Rusty Russell
-
Extract out the allocation and copying in from userspace, and the
first set of modinfo checks.

Signed-off-by: Rusty Russell