24 Aug, 2015

1 commit

  • Poma (on the way to another bug) reported an assertion triggering:

    [] module_assert_mutex_or_preempt+0x49/0x90
    [] __module_address+0x32/0x150
    [] __module_text_address+0x16/0x70
    [] symbol_put_addr+0x29/0x40
    [] dvb_frontend_detach+0x7d/0x90 [dvb_core]

    Laura Abbott produced a patch which led us to
    inspect symbol_put_addr(). This function has a comment claiming it
    doesn't need to disable preemption around the module lookup
    because it holds a reference to the module it wants to find, which
    therefore cannot go away.

    This is wrong (and a false optimization too: preempt_disable() is really
    rather cheap, and I doubt any of this is on uber-critical paths;
    otherwise it would've retained a pointer to the actual module anyway and
    avoided the second lookup).

    While it's true that the module cannot go away while we hold a reference
    on it, the data structure we do the lookup in very much _CAN_ change
    while we do the lookup. Therefore fix the comment and add the
    required preempt_disable() (the shape of the fix is sketched below).

    Reported-by: poma
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell
    Fixes: a6e6abd575fc ("module: remove module_text_address()")
    Cc: stable@kernel.org

    Peter Zijlstra
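
    A minimal sketch of the shape of the fix (not the exact patch): only the
    lookup itself needs to run with preemption disabled.

    void symbol_put_addr(void *addr)
    {
            struct module *modaddr;
            unsigned long a = (unsigned long)dereference_function_descriptor(addr);

            if (core_kernel_text(a))
                    return;

            /*
             * Even though we hold a reference on the module, the list/tree
             * the lookup walks can change under us, so the lookup must run
             * inside an RCU-sched (preemption disabled) section.
             */
            preempt_disable();
            modaddr = __module_text_address(a);
            preempt_enable();

            module_put(modaddr);
    }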
     

29 Jul, 2015

1 commit

  • We don't actually hold the module_mutex when calling find_module_all
    from module_kallsyms_lookup_name: that's because it's used by the oops
    code and we don't want to deadlock.

    However, read-only access to the list is safe if preemption is disabled,
    so we can weaken the assertion. Keep a strong version for external
    callers though (the weaker check is sketched below).

    Fixes: 0be964be0d45 ("module: Sanitize RCU usage and locking")
    Reported-by: He Kuang
    Cc: stable@kernel.org
    Acked-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Rusty Russell
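
    The weakened check is roughly the following sketch: a lockdep-only
    assertion that accepts either holding module_mutex or being in an
    RCU-sched (preemption disabled) section.

    static void module_assert_mutex_or_preempt(void)
    {
    #ifdef CONFIG_LOCKDEP
            if (unlikely(!debug_locks))
                    return;

            WARN_ON_ONCE(!rcu_read_lock_sched_held() &&
                         !lockdep_is_held(&module_mutex));
    #endif
    }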
     

09 Jul, 2015

1 commit

  • The load_module() error path frees a module but forgets to take it out
    of the mod_tree, leaving a dangling entry in the tree and causing havoc.

    Cc: Mathieu Desnoyers
    Reported-by: Arthur Marsh
    Tested-by: Arthur Marsh
    Fixes: 93c2e105f6bc ("module: Optimize __module_address() using a latched RB-tree")
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
     

02 Jul, 2015

1 commit

  • Pull module updates from Rusty Russell:
    "Main excitement here is Peter Zijlstra's lockless rbtree optimization
    to speed module address lookup. He found some abusers of the module
    lock doing that too.

    A little bit of parameter work here too, including Dan Streetman's
    breaking up of the big param mutex so writing a parameter can load
    another module (yeah, really). Unfortunately that broke the usual
    suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were
    appended too"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits)
    modules: only use mod->param_lock if CONFIG_MODULES
    param: fix module param locks when !CONFIG_SYSFS.
    rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE()
    module: add per-module param_lock
    module: make perm const
    params: suppress unused variable error, warn once just in case code changes.
    modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'.
    kernel/module.c: avoid ifdefs for sig_enforce declaration
    kernel/workqueue.c: remove ifdefs over wq_power_efficient
    kernel/params.c: export param_ops_bool_enable_only
    kernel/params.c: generalize bool_enable_only
    kernel/module.c: use generic module param operaters for sig_enforce
    kernel/params: constify struct kernel_param_ops uses
    sysfs: tightened sysfs permission checks
    module: Rework module_addr_{min,max}
    module: Use __module_address() for module_address_lookup()
    module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING
    module: Optimize __module_address() using a latched RB-tree
    rbtree: Implement generic latch_tree
    seqlock: Introduce raw_read_seqcount_latch()
    ...

    Linus Torvalds
     

28 Jun, 2015

1 commit


27 Jun, 2015

2 commits

  • Pull driver core updates from Greg KH:
    "Here is the driver core / firmware changes for 4.2-rc1.

    A number of small changes all over the place in the driver core, and
    in the firmware subsystem. Nothing really major, full details in the
    shortlog. Some of it is a bit of churn, given that the platform
    driver probing changes were found not to work well, so they were
    reverted.

    All of these have been in linux-next for a while with no reported
    issues"

    * tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits)
    Revert "base/platform: Only insert MEM and IO resources"
    Revert "base/platform: Continue on insert_resource() error"
    Revert "of/platform: Use platform_device interface"
    Revert "base/platform: Remove code duplication"
    firmware: add missing kfree for work on async call
    fs: sysfs: don't pass count == 0 to bin file readers
    base:dd - Fix for typo in comment to function driver_deferred_probe_trigger().
    base/platform: Remove code duplication
    of/platform: Use platform_device interface
    base/platform: Continue on insert_resource() error
    base/platform: Only insert MEM and IO resources
    firmware: use const for remaining firmware names
    firmware: fix possible use after free on name on asynchronous request
    firmware: check for file truncation on direct firmware loading
    firmware: fix __getname() missing failure check
    drivers: of/base: move of_init to driver_init
    drivers/base: cacheinfo: fix annoying typo when DT nodes are absent
    sysfs: disambiguate between "error code" and "failure" in comments
    driver-core: fix build for !CONFIG_MODULES
    driver-core: make __device_attach() static
    ...

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "This patch series contains several clean ups and even a new trace
    clock "monitonic raw". Also some enhancements to make the ring buffer
    even faster. But the biggest and most noticeable change is the
    renaming of the ftrace* files, structures and variables that have to
    deal with trace events.

    Over the years I've had several developers tell me about their
    confusion with what ftrace is compared to events. Technically,
    "ftrace" is the infrastructure to do the function hooks, which include
    tracing and also helps with live kernel patching. But the trace
    events are a separate entity altogether, and the files that affect the
    trace events should not be named "ftrace". These include:

    include/trace/ftrace.h -> include/trace/trace_events.h
    include/linux/ftrace_event.h -> include/linux/trace_events.h

    Also, functions that are specific for trace events have also been renamed:

    ftrace_print_*() -> trace_print_*()
    (un)register_ftrace_event() -> (un)register_trace_event()
    ftrace_event_name() -> trace_event_name()
    ftrace_trigger_soft_disabled() -> trace_trigger_soft_disabled()
    ftrace_define_fields_##call() -> trace_define_fields_##call()
    ftrace_get_offsets_##call() -> trace_get_offsets_##call()

    Structures have been renamed:

    ftrace_event_file -> trace_event_file
    ftrace_event_{call,class} -> trace_event_{call,class}
    ftrace_event_buffer -> trace_event_buffer
    ftrace_subsystem_dir -> trace_subsystem_dir
    ftrace_event_raw_##call -> trace_event_raw_##call
    ftrace_event_data_offset_##call-> trace_event_data_offset_##call
    ftrace_event_type_funcs_##call -> trace_event_type_funcs_##call

    And a few various variables and flags have also been updated.

    This has been sitting in linux-next for some time, and I have not
    heard a single complaint about this rename breaking anything. Mostly
    because these functions, variables and structures are mostly internal
    to the tracing system and are seldom (if ever) used by anything
    external to that"

    * tag 'trace-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
    ring_buffer: Allow to exit the ring buffer benchmark immediately
    ring-buffer-benchmark: Fix the wrong type
    ring-buffer-benchmark: Fix the wrong param in module_param
    ring-buffer: Add enum names for the context levels
    ring-buffer: Remove useless unused tracing_off_permanent()
    ring-buffer: Give NMIs a chance to lock the reader_lock
    ring-buffer: Add trace_recursive checks to ring_buffer_write()
    ring-buffer: Allways do the trace_recursive checks
    ring-buffer: Move recursive check to per_cpu descriptor
    ring-buffer: Add unlikelys to make fast path the default
    tracing: Rename ftrace_get_offsets_##call() to trace_event_get_offsets_##call()
    tracing: Rename ftrace_define_fields_##call() to trace_event_define_fields_##call()
    tracing: Rename ftrace_event_type_funcs_##call to trace_event_type_funcs_##call
    tracing: Rename ftrace_data_offset_##call to trace_event_data_offset_##call
    tracing: Rename ftrace_raw_##call event structures to trace_event_raw_##call
    tracing: Rename ftrace_trigger_soft_disabled() to trace_trigger_soft_disabled()
    tracing: Rename FTRACE_EVENT_FL_* flags to EVENT_FILE_FL_*
    tracing: Rename struct ftrace_subsystem_dir to trace_subsystem_dir
    tracing: Rename ftrace_event_name() to trace_event_name()
    tracing: Rename FTRACE_MAX_EVENT to TRACE_EVENT_TYPE_MAX
    ...

    Linus Torvalds
     

23 Jun, 2015

1 commit

  • Add a "param_lock" mutex to each module, and update params.c to use
    the correct built-in or module mutex while locking kernel params.
    Remove the kparam_block_sysfs_r/w() macros, replace them with direct
    calls to kernel_param_[un]lock(module).

    The kernel param code currently uses a single mutex to protect
    modification of any and all kernel params. While this generally works,
    there is one specific problem with it; a module callback function
    cannot safely load another module, i.e. with request_module() or even
    with indirect calls such as crypto_has_alg(). If the module to be
    loaded has any of its params configured (e.g. with a /etc/modprobe.d/*
    config file), then the attempt will result in a deadlock between the
    first module param callback waiting for modprobe, and modprobe trying to
    lock the single kernel param mutex to set the new module's param.

    This fixes that by using per-module mutexes, so that each individual module
    is protected against concurrent changes in its own kernel params, but is
    not blocked by changes to other module params. All built-in modules
    continue to use the built-in mutex, since they are always already loaded
    and references to them (e.g. request_module(), crypto_has_alg()) will
    never cause load-time param changing.

    This also simplifies the interface modules use to block sysfs access to
    their params. The current functions to block and unblock sysfs param
    access are split up by read and write and expect a single kernel param
    to be passed, yet their actual operation is identical and applies to all
    params, not just the one passed to them: they simply lock and unlock the
    global param mutex. They are replaced with direct calls to
    kernel_param_[un]lock(THIS_MODULE), which locks THIS_MODULE's param_lock
    or, if the module is built-in, the built-in mutex (usage is sketched
    below).

    Suggested-by: Rusty Russell
    Signed-off-by: Dan Streetman
    Signed-off-by: Rusty Russell

    Dan Streetman
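
    A usage sketch from a module's point of view (illustrative only; the
    critical section and the old macro names in the comments are just for
    comparison):

    #include <linux/module.h>
    #include <linux/moduleparam.h>

    static void example_update_state(void)
    {
            /* Was: kparam_block_sysfs_w(some_param); */
            kernel_param_lock(THIS_MODULE);

            /* ... touch state that a sysfs write to a param could race with ... */

            /* Was: kparam_unblock_sysfs_w(some_param); */
            kernel_param_unlock(THIS_MODULE);
    }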
     

09 Jun, 2015

1 commit


28 May, 2015

8 commits

  • There's no need to require an ifdef over the declaration
    of sig_enforce as IS_ENABLED() can be used. While at it,
    there's no harm in exposing this kernel parameter outside of
    CONFIG_MODULE_SIG as it'd be a no-op on non module sig
    kernels.

    Now, technically we should also be able to remove the #ifdef'ery over
    the declaration of the module parameter, as we are already trusting the
    bool_enable_only code for CONFIG_MODULE_SIG kernels, but for now remain
    paranoid and keep it.

    With time, if no one can put a bullet through bool_enable_only and
    there are no technical reasons against exposing CONFIG_MODULE_SIG_FORCE
    with the protection bool_enable_only provides, we could remove this
    last ifdef (the resulting declaration is sketched below).

    Cc: Rusty Russell
    Cc: Andrew Morton
    Cc: Kees Cook
    Cc: Tejun Heo
    Cc: Ingo Molnar
    Cc: linux-kernel@vger.kernel.org
    Cc: cocci@systeme.lip6.fr
    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Rusty Russell

    Luis R. Rodriguez
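
    Roughly the shape of the declaration after this change (a sketch, not a
    verbatim excerpt):

    /* Always declared; fixed to true when CONFIG_MODULE_SIG_FORCE is set. */
    static bool sig_enforce = IS_ENABLED(CONFIG_MODULE_SIG_FORCE);

    #ifndef CONFIG_MODULE_SIG_FORCE
    /* The remaining ifdef: the parameter is only writable (off -> on) when
     * signature enforcement is not forced at build time. */
    module_param(sig_enforce, bool_enable_only, 0644);
    #endif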
     
  • This takes out the bool_enable_only implementation from
    the module loading code and generalizes it so that others
    can make use of it (a usage sketch follows below).

    Cc: Rusty Russell
    Cc: Jani Nikula
    Cc: Andrew Morton
    Cc: Kees Cook
    Cc: Tejun Heo
    Cc: Ingo Molnar
    Cc: linux-kernel@vger.kernel.org
    Cc: cocci@systeme.lip6.fr
    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Rusty Russell

    Luis R. Rodriguez
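
    Once generalized, other code can declare a one-way (off-to-on only) bool
    parameter; a hypothetical usage sketch (the parameter name is made up):

    #include <linux/moduleparam.h>

    /* Can be switched on via sysfs or the command line, but never back off. */
    static bool hardening_enabled;
    module_param_cb(hardening_enabled, &param_ops_bool_enable_only,
                    &hardening_enabled, 0644);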
     
  • We're directly checking and modifying sig_enforce when needed instead
    of using the generic helpers. This prevents us from generalizing this
    helper so that others can use it. Use indirect helpers to allow us
    to generalize this code a bit and to make it a bit more clear what
    this is doing.

    Cc: Rusty Russell
    Cc: Jani Nikula
    Cc: Andrew Morton
    Cc: Kees Cook
    Cc: Tejun Heo
    Cc: Ingo Molnar
    Cc: linux-kernel@vger.kernel.org
    Cc: cocci@systeme.lip6.fr
    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Rusty Russell

    Luis R. Rodriguez
     
  • __module_address() does an initial bound check before doing the
    {list/tree} iteration to find the actual module. The bound variables
    are nowhere near the mod_tree cacheline, in fact they're nowhere near
    one another.

    module_addr_min lives in .data while module_addr_max lives in .bss
    (smarty pants GCC thinks the explicit 0 assignment is a mistake).

    Rectify this by moving the two variables into a structure together
    with the latch_tree_root to guarantee they all share the same
    cacheline and avoid hitting two extra cachelines for the lookup.

    While reworking the bounds code, move the bound update from allocation
    to insertion time; this avoids updating the bounds for a few error
    paths (the resulting layout is sketched below).

    Cc: Rusty Russell
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
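
    The resulting layout is roughly the following sketch: the latch-tree root
    and both bounds packed into one cacheline-aligned structure.

    static struct mod_tree_root {
            struct latch_tree_root root;
            unsigned long addr_min;
            unsigned long addr_max;
    } mod_tree __cacheline_aligned = {
            .addr_min = -1UL,
    };

    #define module_addr_min mod_tree.addr_min
    #define module_addr_max mod_tree.addr_max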
     
  • Use the generic __module_address() addr to struct module lookup
    instead of open coding it once more.

    Cc: Rusty Russell
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
     
  • Andrew worried about the overhead on small systems; only use the fancy
    code when either perf or tracing is enabled.

    Cc: Rusty Russell
    Cc: Steven Rostedt
    Requested-by: Andrew Morton
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
     
  • Currently __module_address() is using a linear search through all
    modules in order to find the module corresponding to the provided
    address. With a lot of modules this can take a lot of time.

    One of the users of this is kernel_text_address(), which is employed
    in many stack unwinders, which in turn are used by perf-callchain and
    ftrace (possibly from NMI context).

    So by optimizing __module_address() we optimize many stack unwinders
    used by both perf and tracing in performance-sensitive code (the
    post-optimization shape is sketched below).

    Cc: Rusty Russell
    Cc: Steven Rostedt
    Cc: Mathieu Desnoyers
    Cc: Oleg Nesterov
    Cc: "Paul E. McKenney"
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
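
    A sketch of the post-optimization shape of the lookup: a cheap bounds
    check, then a lock-free latched-RB-tree lookup instead of the linear
    list walk.

    struct module *__module_address(unsigned long addr)
    {
            struct module *mod;

            if (addr < module_addr_min || addr > module_addr_max)
                    return NULL;

            module_assert_mutex_or_preempt();

            mod = mod_find(addr);           /* latched RB-tree lookup */
            if (mod) {
                    BUG_ON(!within_module(addr, mod));
                    if (mod->state == MODULE_STATE_UNFORMED)
                            mod = NULL;
            }
            return mod;
    }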
     
  • Currently the RCU usage in the module code is an inconsistent mess of
    RCU and RCU-sched; this is broken for CONFIG_PREEMPT, where
    synchronize_rcu() does not imply synchronize_sched().

    Most usage sites use preempt_{dis,en}able(), which is RCU-sched, but
    (most of) the modification sites use synchronize_rcu(), with the
    exception of the module bug list, which actually uses RCU.

    Convert everything over to RCU-sched (the resulting reader/writer
    pattern is sketched below).

    Furthermore, add lockdep asserts to all sites, because it's not at all
    clear to me that the required locking is observed, especially on
    exported functions.

    Cc: Rusty Russell
    Acked-by: "Paul E. McKenney"
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
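
    The resulting reader/writer pattern, as an illustrative sketch (not a
    verbatim excerpt; the helper names are made up):

    static struct module *lookup_example(unsigned long addr)
    {
            struct module *mod;

            /* Reader: an RCU-sched section keeps the module data stable. */
            preempt_disable();
            mod = __module_address(addr);
            preempt_enable();

            return mod;     /* NB: only the lookup itself is protected here */
    }

    static void unpublish_example(struct module *mod)
    {
            /* Writer: unpublish under module_mutex, then wait for readers. */
            mutex_lock(&module_mutex);
            list_del_rcu(&mod->list);
            mutex_unlock(&module_mutex);

            /* Not synchronize_rcu(): the readers only disable preemption. */
            synchronize_sched();
    }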
     

27 May, 2015

1 commit

  • Due to the new lockdep checks in the coming patch, we go:

    [ 9.759380] ------------[ cut here ]------------
    [ 9.759389] WARNING: CPU: 31 PID: 597 at ../kernel/module.c:216 each_symbol_section+0x121/0x130()
    [ 9.759391] Modules linked in:
    [ 9.759393] CPU: 31 PID: 597 Comm: modprobe Not tainted 4.0.0-rc1+ #65
    [ 9.759393] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
    [ 9.759396] ffffffff817d8676 ffff880424567ca8 ffffffff8157e98b 0000000000000001
    [ 9.759398] 0000000000000000 ffff880424567ce8 ffffffff8105fbc7 ffff880424567cd8
    [ 9.759400] 0000000000000000 ffffffff810ec160 ffff880424567d40 0000000000000000
    [ 9.759400] Call Trace:
    [ 9.759407] [] dump_stack+0x4f/0x7b
    [ 9.759410] [] warn_slowpath_common+0x97/0xe0
    [ 9.759412] [] ? section_objs+0x60/0x60
    [ 9.759414] [] warn_slowpath_null+0x1a/0x20
    [ 9.759415] [] each_symbol_section+0x121/0x130
    [ 9.759417] [] find_symbol+0x31/0x70
    [ 9.759420] [] load_module+0x20f/0x2660
    [ 9.759422] [] ? __do_page_fault+0x190/0x4e0
    [ 9.759426] [] ? retint_restore_args+0x13/0x13
    [ 9.759427] [] ? retint_restore_args+0x13/0x13
    [ 9.759433] [] ? trace_hardirqs_on_caller+0x11d/0x1e0
    [ 9.759437] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [ 9.759439] [] ? retint_restore_args+0x13/0x13
    [ 9.759441] [] SyS_init_module+0xce/0x100
    [ 9.759443] [] system_call_fastpath+0x12/0x17
    [ 9.759445] ---[ end trace 9294429076a9c644 ]---

    As per the comment this site should be fine, but let's wrap it in
    preempt_disable() anyhow to placate lockdep.

    Cc: Rusty Russell
    Acked-by: Paul E. McKenney
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
     

20 May, 2015

2 commits

  • Some init systems may wish to have device drivers run their probe()
    code asynchronously. This implements support for that and allows
    userspace to request async probe as a preference through a generic
    shared device driver module parameter, async_probe.

    Async probe is implemented as a module parameter because synchronous
    probe has been prevalent for years, and some userspace might exist
    which relies on the fact that the device driver will probe synchronously
    and that the devices it provides will be immediately available
    afterwards.

    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Dmitry Torokhov
    Signed-off-by: Greg Kroah-Hartman

    Luis R. Rodriguez
     
  • This adds an extra argument onto parse_args() to be used
    as a way to make the unused callback a bit more useful and
    generic, by allowing the caller to pass on a data structure
    of its choice. An example use case is to allow us to easily
    make module parameters for every module, which we will do
    next.

    @ parse @
    identifier name, args, params, num, level_min, level_max;
    identifier unknown, param, val, doing;
    type s16;
    @@
    extern char *parse_args(const char *name,
    char *args,
    const struct kernel_param *params,
    unsigned num,
    s16 level_min,
    s16 level_max,
    + void *arg,
    int (*unknown)(char *param, char *val,
    const char *doing
    + , void *arg
    ));

    @ parse_mod @
    identifier name, args, params, num, level_min, level_max;
    identifier unknown, param, val, doing;
    type s16;
    @@
    char *parse_args(const char *name,
    char *args,
    const struct kernel_param *params,
    unsigned num,
    s16 level_min,
    s16 level_max,
    + void *arg,
    int (*unknown)(char *param, char *val,
    const char *doing
    + , void *arg
    ))
    {
    ...
    }

    @ parse_args_found @
    expression R, E1, E2, E3, E4, E5, E6;
    identifier func;
    @@

    (
    R =
    parse_args(E1, E2, E3, E4, E5, E6,
    + NULL,
    func);
    |
    R =
    parse_args(E1, E2, E3, E4, E5, E6,
    + NULL,
    &func);
    |
    R =
    parse_args(E1, E2, E3, E4, E5, E6,
    + NULL,
    NULL);
    |
    parse_args(E1, E2, E3, E4, E5, E6,
    + NULL,
    func);
    |
    parse_args(E1, E2, E3, E4, E5, E6,
    + NULL,
    &func);
    |
    parse_args(E1, E2, E3, E4, E5, E6,
    + NULL,
    NULL);
    )

    @ parse_args_unused depends on parse_args_found @
    identifier parse_args_found.func;
    @@

    int func(char *param, char *val, const char *unused
    + , void *arg
    )
    {
    ...
    }

    @ mod_unused depends on parse_args_found @
    identifier parse_args_found.func;
    expression A1, A2, A3;
    @@

    - func(A1, A2, A3);
    + func(A1, A2, A3, NULL);

    Generated-by: Coccinelle SmPL
    Cc: cocci@systeme.lip6.fr
    Cc: Tejun Heo
    Cc: Arjan van de Ven
    Cc: Greg Kroah-Hartman
    Cc: Rusty Russell
    Cc: Christoph Hellwig
    Cc: Felipe Contreras
    Cc: Ewan Milne
    Cc: Jean Delvare
    Cc: Hannes Reinecke
    Cc: Jani Nikula
    Cc: linux-kernel@vger.kernel.org
    Reviewed-by: Tejun Heo
    Acked-by: Rusty Russell
    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Greg Kroah-Hartman

    Luis R. Rodriguez
     

14 May, 2015

1 commit


09 May, 2015

1 commit

  • The module notifier call chain for MODULE_STATE_COMING was moved up
    before the parsing of args, into the complete_formation() call. But if
    the module failed to load after that, the notifier call chain for
    MODULE_STATE_GOING was never called, and that prevented the users of
    those call chains from cleaning up anything that was allocated (the
    error-path fix is sketched below).

    Link: http://lkml.kernel.org/r/554C52B9.9060700@gmail.com

    Reported-by: Pontus Fuchs
    Fixes: 4982223e51e8 "module: set nx before marking module MODULE_STATE_COMING"
    Cc: stable@vger.kernel.org # 3.16+
    Signed-off-by: Steven Rostedt
    Signed-off-by: Rusty Russell

    Steven Rostedt
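
    A sketch of the shape of the fix in load_module()'s unwind path (labels
    and ordering approximate): the GOING notifier is now fired so that
    COMING/GOING stay balanced.

    bug_cleanup:
            /* module_bug_cleanup() needs module_mutex protection */
            mutex_lock(&module_mutex);
            module_bug_cleanup(mod);
            mutex_unlock(&module_mutex);

            /* Balance the MODULE_STATE_COMING sent by complete_formation(). */
            blocking_notifier_call_chain(&module_notify_list,
                                         MODULE_STATE_GOING, mod);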
     

23 Apr, 2015

1 commit

  • Pull module updates from Rusty Russell:
    "Quentin opened a can of worms by adding extable entry checking to
    modpost, but most architectures seem fixed now. Thanks to all
    involved.

    Last minute rebase because I noticed a "[PATCH]" had snuck into a
    commit message somehow"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    modpost: don't emit section mismatch warnings for compiler optimizations
    modpost: expand pattern matching to support substring matches
    modpost: do not try to match the SHT_NUL section.
    modpost: fix extable entry size calculation.
    modpost: fix inverted logic in is_extable_fault_address().
    modpost: handle -ffunction-sections
    modpost: Whitelist .text.fixup and .exception.text
    params: handle quotes properly for values not of form foo="bar".
    modpost: document the use of struct section_check.
    modpost: handle relocations mismatch in __ex_table.
    scripts: add check_extable.sh script.
    modpost: mismatch_handler: retrieve tosym information only when needed.
    modpost: factorize symbol pretty print in get_pretty_name().
    modpost: add handler function pointer to sectioncheck.
    modpost: add .sched.text and .kprobes.text to the TEXT_SECTIONS list.
    modpost: add strict white-listing when referencing sections.
    module: do not print allocation-fail warning on bogus user buffer size
    kernel/module.c: fix typos in message about unused symbols

    Linus Torvalds
     

15 Apr, 2015

1 commit

  • Pull tracing updates from Steven Rostedt:
    "Some clean ups and small fixes, but the biggest change is the addition
    of the TRACE_DEFINE_ENUM() macro that can be used by tracepoints.

    Tracepoints have helper functions for the TP_printk() called
    __print_symbolic() and __print_flags() that let a numeric value be
    displayed as human-comprehensible text. What is placed in the
    TP_printk() is also shown in the tracepoint format file such that user
    space tools like perf and trace-cmd can parse the binary data and
    express the values too. Unfortunately, the way the TRACE_EVENT()
    macro works, anything placed in the TP_printk() will be shown pretty
    much exactly as is. The problem arises when enums are used. That's
    because unlike macros, enums will not be changed into their values by
    the C pre-processor. Thus, the enum string is exported to the format
    file, and this makes it useless for user space tools.

    The TRACE_DEFINE_ENUM() solves this by converting the enum strings in
    the TP_printk() format into their number, and that is what is shown to
    user space. For example, the tracepoint tlb_flush currently has this
    in its format file:

    __print_symbolic(REC->reason,
    { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
    { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
    { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
    { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })

    After adding:

    TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
    TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
    TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
    TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);

    Its format file will contain this:

    __print_symbolic(REC->reason,
    { 0, "flush on task switch" },
    { 1, "remote shootdown" },
    { 2, "local shootdown" },
    { 3, "local mm shootdown" })"

    * tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits)
    tracing: Add enum_map file to show enums that have been mapped
    writeback: Export enums used by tracepoint to user space
    v4l: Export enums used by tracepoints to user space
    SUNRPC: Export enums in tracepoints to user space
    mm: tracing: Export enums in tracepoints to user space
    irq/tracing: Export enums in tracepoints to user space
    f2fs: Export the enums in the tracepoints to userspace
    net/9p/tracing: Export enums in tracepoints to userspace
    x86/tlb/trace: Export enums in used by tlb_flush tracepoint
    tracing/samples: Update the trace-event-sample.h with TRACE_DEFINE_ENUM()
    tracing: Allow for modules to convert their enums to values
    tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
    tracing: Update trace-event-sample with TRACE_SYSTEM_VAR documentation
    tracing: Give system name a pointer
    brcmsmac: Move each system tracepoints to their own header
    iwlwifi: Move each system tracepoints to their own header
    mac80211: Move message tracepoints to their own header
    tracing: Add TRACE_SYSTEM_VAR to xhci-hcd
    tracing: Add TRACE_SYSTEM_VAR to kvm-s390
    tracing: Add TRACE_SYSTEM_VAR to intel-sst
    ...

    Linus Torvalds
     

09 Apr, 2015

1 commit

  • Unlike most (all?) other copies from user space, kernel module loading
    is almost unlimited in size. So we do a potentially huge
    "copy_from_user()" when we copy the module data from user space to the
    kernel buffer, which can be a latency concern when preemption is
    disabled (or voluntary).

    Also, because 'copy_from_user()' clears the tail of the kernel buffer on
    failures, even a *failed* copy can end up wasting a lot of time.

    Normally neither of these is a concern in real life, but they do trigger
    when doing stress-testing with trinity. Running in a VM seems to add
    its own overhead, causing trinity module load testing to even trigger
    the watchdog.

    The simple fix is to just chunk up the module loading, so that it never
    tries to copy insanely big areas in one go. That bounds the latency,
    and also the amount of (unnecessarily, in this case) cleared memory for
    the failure case (the chunked copy is sketched below).

    Reported-by: Sasha Levin
    Signed-off-by: Linus Torvalds

    Linus Torvalds
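
    Roughly the shape of the chunked copy (a sketch; the chunk size constant
    here is illustrative):

    #define COPY_CHUNK_SIZE (16 * PAGE_SIZE)

    static int copy_chunked_from_user(void *dst, const void __user *usrc,
                                      unsigned long len)
    {
            do {
                    unsigned long n = min(len, (unsigned long)COPY_CHUNK_SIZE);

                    if (copy_from_user(dst, usrc, n) != 0)
                            return -EFAULT;
                    cond_resched();         /* bound the latency per chunk */
                    dst += n;
                    usrc += n;
                    len -= n;
            } while (len);

            return 0;
    }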
     

08 Apr, 2015

1 commit


24 Mar, 2015

2 commits


23 Mar, 2015

1 commit

  • Module unload calls lockdep_free_key_range(), which removes entries
    from the data structures. Most of the lockdep code OTOH assumes the
    data structures are append only; in specific see the comments in
    add_lock_to_list() and look_up_lock_class().

    Clearly this has only worked by accident; make it work properly. The
    actual scenario to make it go boom would involve the memory freed by
    the module unload being re-allocated and re-used for a lock inside of
    an rcu-sched grace period. This is a very unlikely scenario, but still
    better to plug the hole.

    Use RCU list iteration in all places and amend the comments.

    Change lockdep_free_key_range() to issue a sync_sched() between
    removal from the lists and returning -- which results in the memory
    being freed. Further ensure the callers are placed correctly and
    comment the requirements.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Andrey Tsyvarev
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

13 Mar, 2015

1 commit

  • The current approach to handling shadow memory for modules is broken.

    Shadow memory may be freed only after the memory it corresponds to is no
    longer used. vfree() called from interrupt context could use the memory
    it is freeing to store a 'struct llist_node' in it:

    void vfree(const void *addr)
    {
            ...
            if (unlikely(in_interrupt())) {
                    struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
                    if (llist_add((struct llist_node *)addr, &p->list))
                            schedule_work(&p->wq);

    This list node is later used in free_work(), which actually frees the
    memory. Currently module_memfree() called in interrupt context will free
    the shadow before freeing the module's memory, which can provoke a
    kernel crash.

    So the shadow memory should be freed after the module's memory. However,
    such a deallocation order could race with kasan_module_alloc() in
    module_alloc().

    Free the shadow right before releasing the vm area. At this point the
    vfree()'d memory is not used anymore, yet not available for other
    allocations. A new VM_KASAN flag is used to indicate that a vm area has
    dynamically allocated shadow memory, so kasan frees the shadow only if it
    was previously allocated.

    Signed-off-by: Andrey Ryabinin
    Acked-by: Rusty Russell
    Cc: Dmitry Vyukov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     

06 Mar, 2015

1 commit


18 Feb, 2015

1 commit

  • This provides a reliable breakpoint target, required for automatic symbol
    loading via the gdb helper command 'lx-symbols'.

    Signed-off-by: Jan Kiszka
    Acked-by: Rusty Russell
    Cc: Thomas Gleixner
    Cc: Jason Wessel
    Cc: Andi Kleen
    Cc: Ben Widawsky
    Cc: Borislav Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kiszka
     

14 Feb, 2015

1 commit

  • This feature lets us detect out-of-bounds accesses to global variables.
    It works for globals in the kernel image as well as for globals in
    modules. Currently it won't work for symbols in user-specified sections
    (e.g. __init, __read_mostly, ...).

    The idea is simple. The compiler pads each global variable with a
    redzone and adds constructors that invoke the __asan_register_globals()
    function. Information about each global variable (address, size, size
    with redzone, ...) is passed to __asan_register_globals() so that we can
    poison the variable's redzone.

    This patch also forces module_alloc() to return 8*PAGE_SIZE-aligned
    addresses, making shadow memory handling
    (kasan_module_alloc()/kasan_module_free()) simpler. Such alignment
    guarantees that each shadow page backing the modules' address space
    corresponds to only one module_alloc() allocation.

    Signed-off-by: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrey Konovalov
    Cc: Yuri Gribov
    Cc: Konstantin Khlebnikov
    Cc: Sasha Levin
    Cc: Christoph Lameter
    Cc: Joonsoo Kim
    Cc: Dave Hansen
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     

11 Feb, 2015

2 commits

  • Since the introduction of the nested sleep warning, we've established
    that the occasional sleep inside a wait_event() is fine.

    wait_event() loops are invariant wrt. spurious wakeups, and the
    occasional sleep has a similar effect on them. As long as it's
    occasional, it's harmless.

    Therefore replace the 'correct' but verbose wait_woken() thing with
    a simple annotation to shut up the warning.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
     
  • Because wait_event() loops are safe vs spurious wakeups we can allow the
    occasional sleep -- which ends up being very similar.

    Reported-by: Dave Jones
    Signed-off-by: Peter Zijlstra (Intel)
    Tested-by: Dave Jones
    Signed-off-by: Rusty Russell

    Peter Zijlstra
     

06 Feb, 2015

2 commits


22 Jan, 2015

1 commit

  • James Bottomley points out that the module refcount will be -1 during
    unload. It's only used for diagnostics, so let's not hide that, as it
    could be a clue as to what's gone wrong.

    Cc: Jason Wessel
    Acked-and-documentation-added-by: James Bottomley
    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Rusty Russell

    Rusty Russell
     

20 Jan, 2015

2 commits

  • The kallsyms routines (module_symbol_name, lookup_module_* etc) disable
    preemption to walk the modules rather than taking the module_mutex:
    this is because they are used for symbol resolution during oopses.

    This works because there are synchronize_sched() and synchronize_rcu()
    in the unload and failure paths. However, there's one case which doesn't
    have that: the normal case where module loading succeeds, and we free
    the init section.

    We don't want a synchronize_rcu() there, because it would slow down
    module loading: this bug was introduced in 2009 to speed module
    loading in the first place.

    Thus, we want to do the free in an RCU callback. We do this in the
    simplest possible way by allocating a new rcu_head: if we put it in
    the module structure we'd have to worry about that getting freed (the
    shape of this is sketched below).

    Reported-by: Rui Xiang
    Signed-off-by: Rusty Russell

    Rusty Russell
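
    The shape of this is roughly the following sketch (names approximate): a
    small, separately allocated carrier holds the rcu_head, and the init
    region is freed from an RCU-sched callback.

    struct mod_initfree {
            struct rcu_head rcu;
            void *module_init;
    };

    static void do_free_init(struct rcu_head *head)
    {
            struct mod_initfree *m = container_of(head, struct mod_initfree, rcu);

            module_memfree(m->module_init);
            kfree(m);
    }

    /*
     * In do_init_module(), instead of freeing the init section directly:
     *
     *      call_rcu_sched(&freeinit->rcu, do_free_init);
     */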
     
  • Nothing needs the module pointer any more, and the next patch will
    call it from RCU, where the module itself might no longer exist.
    Removing the arg is the safest approach.

    This just codifies the use of the module_alloc/module_free pattern
    which ftrace and bpf use (the generic fallback after the rename is
    sketched below).

    Signed-off-by: Rusty Russell
    Acked-by: Alexei Starovoitov
    Cc: Mikael Starvik
    Cc: Jesper Nilsson
    Cc: Ralf Baechle
    Cc: Ley Foon Tan
    Cc: Benjamin Herrenschmidt
    Cc: Chris Metcalf
    Cc: Steven Rostedt
    Cc: x86@kernel.org
    Cc: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Cc: Masami Hiramatsu
    Cc: linux-cris-kernel@axis.com
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: sparclinux@vger.kernel.org
    Cc: netdev@vger.kernel.org

    Rusty Russell
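
    The generic fallback after the rename is roughly (a sketch; architectures
    can still override it):

    void __weak module_memfree(void *module_region)
    {
            vfree(module_region);
    }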