10 Nov, 2015
1 commit
-
commit 275d7d44d802ef271a42dc87ac091a495ba72fc5 upstream.
Poma (on the way to another bug) reported an assertion triggering:
[] module_assert_mutex_or_preempt+0x49/0x90
[] __module_address+0x32/0x150
[] __module_text_address+0x16/0x70
[] symbol_put_addr+0x29/0x40
[] dvb_frontend_detach+0x7d/0x90 [dvb_core]Laura Abbott produced a patch which lead us to
inspect symbol_put_addr(). This function has a comment claiming it
doesn't need to disable preemption around the module lookup
because it holds a reference to the module it wants to find, which
therefore cannot go away.This is wrong (and a false optimization too, preempt_disable() is really
rather cheap, and I doubt any of this is on uber critical paths,
otherwise it would've retained a pointer to the actual module anyway and
avoided the second lookup).While its true that the module cannot go away while we hold a reference
on it, the data structure we do the lookup in very much _CAN_ change
while we do the lookup. Therefore fix the comment and add the
required preempt_disable().Reported-by: poma
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Rusty Russell
Fixes: a6e6abd575fc ("module: remove module_text_address()")
Signed-off-by: Greg Kroah-Hartman
09 May, 2015
1 commit
-
The module notifier call chain for MODULE_STATE_COMING was moved up before
the parsing of args, into the complete_formation() call. But if the module failed
to load after that, the notifier call chain for MODULE_STATE_GOING was
never called and that prevented the users of those call chains from
cleaning up anything that was allocated.Link: http://lkml.kernel.org/r/554C52B9.9060700@gmail.com
Reported-by: Pontus Fuchs
Fixes: 4982223e51e8 "module: set nx before marking module MODULE_STATE_COMING"
Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: Steven Rostedt
Signed-off-by: Rusty Russell
23 Apr, 2015
1 commit
-
Pull module updates from Rusty Russell:
"Quentin opened a can of worms by adding extable entry checking to
modpost, but most architectures seem fixed now. Thanks to all
involved.Last minute rebase because I noticed a "[PATCH]" had snuck into a
commit message somehow"* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
modpost: don't emit section mismatch warnings for compiler optimizations
modpost: expand pattern matching to support substring matches
modpost: do not try to match the SHT_NUL section.
modpost: fix extable entry size calculation.
modpost: fix inverted logic in is_extable_fault_address().
modpost: handle -ffunction-sections
modpost: Whitelist .text.fixup and .exception.text
params: handle quotes properly for values not of form foo="bar".
modpost: document the use of struct section_check.
modpost: handle relocations mismatch in __ex_table.
scripts: add check_extable.sh script.
modpost: mismatch_handler: retrieve tosym information only when needed.
modpost: factorize symbol pretty print in get_pretty_name().
modpost: add handler function pointer to sectioncheck.
modpost: add .sched.text and .kprobes.text to the TEXT_SECTIONS list.
modpost: add strict white-listing when referencing sections.
module: do not print allocation-fail warning on bogus user buffer size
kernel/module.c: fix typos in message about unused symbols
15 Apr, 2015
1 commit
-
Pull tracing updates from Steven Rostedt:
"Some clean ups and small fixes, but the biggest change is the addition
of the TRACE_DEFINE_ENUM() macro that can be used by tracepoints.Tracepoints have helper functions for the TP_printk() called
__print_symbolic() and __print_flags() that lets a numeric number be
displayed as a a human comprehensible text. What is placed in the
TP_printk() is also shown in the tracepoint format file such that user
space tools like perf and trace-cmd can parse the binary data and
express the values too. Unfortunately, the way the TRACE_EVENT()
macro works, anything placed in the TP_printk() will be shown pretty
much exactly as is. The problem arises when enums are used. That's
because unlike macros, enums will not be changed into their values by
the C pre-processor. Thus, the enum string is exported to the format
file, and this makes it useless for user space tools.The TRACE_DEFINE_ENUM() solves this by converting the enum strings in
the TP_printk() format into their number, and that is what is shown to
user space. For example, the tracepoint tlb_flush currently has this
in its format file:__print_symbolic(REC->reason,
{ TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
{ TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
{ TLB_LOCAL_SHOOTDOWN, "local shootdown" },
{ TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })After adding:
TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);Its format file will contain this:
__print_symbolic(REC->reason,
{ 0, "flush on task switch" },
{ 1, "remote shootdown" },
{ 2, "local shootdown" },
{ 3, "local mm shootdown" })"* tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits)
tracing: Add enum_map file to show enums that have been mapped
writeback: Export enums used by tracepoint to user space
v4l: Export enums used by tracepoints to user space
SUNRPC: Export enums in tracepoints to user space
mm: tracing: Export enums in tracepoints to user space
irq/tracing: Export enums in tracepoints to user space
f2fs: Export the enums in the tracepoints to userspace
net/9p/tracing: Export enums in tracepoints to userspace
x86/tlb/trace: Export enums in used by tlb_flush tracepoint
tracing/samples: Update the trace-event-sample.h with TRACE_DEFINE_ENUM()
tracing: Allow for modules to convert their enums to values
tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
tracing: Update trace-event-sample with TRACE_SYSTEM_VAR documentation
tracing: Give system name a pointer
brcmsmac: Move each system tracepoints to their own header
iwlwifi: Move each system tracepoints to their own header
mac80211: Move message tracepoints to their own header
tracing: Add TRACE_SYSTEM_VAR to xhci-hcd
tracing: Add TRACE_SYSTEM_VAR to kvm-s390
tracing: Add TRACE_SYSTEM_VAR to intel-sst
...
09 Apr, 2015
1 commit
-
Unlike most (all?) other copies from user space, kernel module loading
is almost unlimited in size. So we do a potentially huge
"copy_from_user()" when we copy the module data from user space to the
kernel buffer, which can be a latency concern when preemption is
disabled (or voluntary).Also, because 'copy_from_user()' clears the tail of the kernel buffer on
failures, even a *failed* copy can end up wasting a lot of time.Normally neither of these are concerns in real life, but they do trigger
when doing stress-testing with trinity. Running in a VM seems to add
its own overheadm causing trinity module load testing to even trigger
the watchdog.The simple fix is to just chunk up the module loading, so that it never
tries to copy insanely big areas in one go. That bounds the latency,
and also the amount of (unnecessarily, in this case) cleared memory for
the failure case.Reported-by: Sasha Levin
Signed-off-by: Linus Torvalds
08 Apr, 2015
1 commit
-
Update the infrastructure such that modules that declare TRACE_DEFINE_ENUM()
will have those enums converted into their values in the tracepoint
print fmt strings.Link: http://lkml.kernel.org/r/87vbhjp74q.fsf@rustcorp.com.au
Acked-by: Rusty Russell
Reviewed-by: Masami Hiramatsu
Tested-by: Masami Hiramatsu
Signed-off-by: Steven Rostedt
24 Mar, 2015
2 commits
-
init_module(2) passes user-specified buffer length directly to
vmalloc(). It makes warn_alloc_failed() to print out a lot of info into
dmesg if user specified insane size, like -1.Let's silence the warning. It doesn't add much value to -ENOMEM return
code. Without the patch the syscall is prohibitive noisy for testing
with trinity.Signed-off-by: Kirill A. Shutemov
Cc: Dave Jones
Cc: Sasha Levin
Signed-off-by: Rusty Russell -
Fix typos in pr_warn message about unused symbols
Signed-off-by: Yannick Guerrini
Signed-off-by: Rusty Russell
23 Mar, 2015
1 commit
-
Module unload calls lockdep_free_key_range(), which removes entries
from the data structures. Most of the lockdep code OTOH assumes the
data structures are append only; in specific see the comments in
add_lock_to_list() and look_up_lock_class().Clearly this has only worked by accident; make it work proper. The
actual scenario to make it go boom would involve the memory freed by
the module unlock being re-allocated and re-used for a lock inside of
a rcu-sched grace period. This is a very unlikely scenario, still
better plug the hole.Use RCU list iteration in all places and ammend the comments.
Change lockdep_free_key_range() to issue a sync_sched() between
removal from the lists and returning -- which results in the memory
being freed. Further ensure the callers are placed correctly and
comment the requirements.Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Andrey Tsyvarev
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Rusty Russell
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
13 Mar, 2015
1 commit
-
Current approach in handling shadow memory for modules is broken.
Shadow memory could be freed only after memory shadow corresponds it is no
longer used. vfree() called from interrupt context could use memory its
freeing to store 'struct llist_node' in it:void vfree(const void *addr)
{
...
if (unlikely(in_interrupt())) {
struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
if (llist_add((struct llist_node *)addr, &p->list))
schedule_work(&p->wq);Later this list node used in free_work() which actually frees memory.
Currently module_memfree() called in interrupt context will free shadow
before freeing module's memory which could provoke kernel crash.So shadow memory should be freed after module's memory. However, such
deallocation order could race with kasan_module_alloc() in module_alloc().Free shadow right before releasing vm area. At this point vfree()'d
memory is not used anymore and yet not available for other allocations.
New VM_KASAN flag used to indicate that vm area has dynamically allocated
shadow memory so kasan frees shadow only if it was previously allocated.Signed-off-by: Andrey Ryabinin
Acked-by: Rusty Russell
Cc: Dmitry Vyukov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Mar, 2015
1 commit
-
When CONFIG_DEBUG_SET_MODULE_RONX is enabled, the sizes of
module sections are aligned up so appropriate permissions can
be applied. Adjusting for the symbol table may cause them to
become unaligned. Make sure to re-align the sizes afterward.Signed-off-by: Laura Abbott
Acked-by: Rusty Russell
Signed-off-by: Catalin Marinas
18 Feb, 2015
1 commit
-
This provides a reliable breakpoint target, required for automatic symbol
loading via the gdb helper command 'lx-symbols'.Signed-off-by: Jan Kiszka
Acked-by: Rusty Russell
Cc: Thomas Gleixner
Cc: Jason Wessel
Cc: Andi Kleen
Cc: Ben Widawsky
Cc: Borislav Petkov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Feb, 2015
1 commit
-
This feature let us to detect accesses out of bounds of global variables.
This will work as for globals in kernel image, so for globals in modules.
Currently this won't work for symbols in user-specified sections (e.g.
__init, __read_mostly, ...)The idea of this is simple. Compiler increases each global variable by
redzone size and add constructors invoking __asan_register_globals()
function. Information about global variable (address, size, size with
redzone ...) passed to __asan_register_globals() so we could poison
variable's redzone.This patch also forces module_alloc() to return 8*PAGE_SIZE aligned
address making shadow memory handling (
kasan_module_alloc()/kasan_module_free() ) more simple. Such alignment
guarantees that each shadow page backing modules address space correspond
to only one module_alloc() allocation.Signed-off-by: Andrey Ryabinin
Cc: Dmitry Vyukov
Cc: Konstantin Serebryany
Cc: Dmitry Chernenkov
Signed-off-by: Andrey Konovalov
Cc: Yuri Gribov
Cc: Konstantin Khlebnikov
Cc: Sasha Levin
Cc: Christoph Lameter
Cc: Joonsoo Kim
Cc: Dave Hansen
Cc: Andi Kleen
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
11 Feb, 2015
2 commits
-
Since the introduction of the nested sleep warning; we've established
that the occasional sleep inside a wait_event() is fine.wait_event() loops are invariant wrt. spurious wakeups, and the
occasional sleep has a similar effect on them. As long as its occasional
its harmless.Therefore replace the 'correct' but verbose wait_woken() thing with
a simple annotation to shut up the warning.Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Rusty Russell -
Because wait_event() loops are safe vs spurious wakeups we can allow the
occasional sleep -- which ends up being very similar.Reported-by: Dave Jones
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Dave Jones
Signed-off-by: Rusty Russell
06 Feb, 2015
2 commits
-
The warning message when loading modules with a wrong signature has
two spaces in it:"module verification failed: signature and/or required key missing"
Signed-off-by: Marcel Holtmann
Signed-off-by: Rusty Russell -
parse_args call module parameters' .set handlers, which may use locks defined in the module.
So, these classes should be freed in case parse_args returns error(e.g. due to incorrect parameter passed).Signed-off-by: Andrey Tsyvarev
Signed-off-by: Rusty Russell
22 Jan, 2015
1 commit
-
James Bottomley points out that it will be -1 during unload. It's
only used for diagnostics, so let's not hide that as it could be a
clue as to what's gone wrong.Cc: Jason Wessel
Acked-and-documention-added-by: James Bottomley
Reviewed-by: Masami Hiramatsu
Signed-off-by: Rusty Russell
20 Jan, 2015
3 commits
-
The kallsyms routines (module_symbol_name, lookup_module_* etc) disable
preemption to walk the modules rather than taking the module_mutex:
this is because they are used for symbol resolution during oopses.This works because there are synchronize_sched() and synchronize_rcu()
in the unload and failure paths. However, there's one case which doesn't
have that: the normal case where module loading succeeds, and we free
the init section.We don't want a synchronize_rcu() there, because it would slow down
module loading: this bug was introduced in 2009 to speed module
loading in the first place.Thus, we want to do the free in an RCU callback. We do this in the
simplest possible way by allocating a new rcu_head: if we put it in
the module structure we'd have to worry about that getting freed.Reported-by: Rui Xiang
Signed-off-by: Rusty Russell -
Nothing needs the module pointer any more, and the next patch will
call it from RCU, where the module itself might no longer exist.
Removing the arg is the safest approach.This just codifies the use of the module_alloc/module_free pattern
which ftrace and bpf use.Signed-off-by: Rusty Russell
Acked-by: Alexei Starovoitov
Cc: Mikael Starvik
Cc: Jesper Nilsson
Cc: Ralf Baechle
Cc: Ley Foon Tan
Cc: Benjamin Herrenschmidt
Cc: Chris Metcalf
Cc: Steven Rostedt
Cc: x86@kernel.org
Cc: Ananth N Mavinakayanahalli
Cc: Anil S Keshavamurthy
Cc: Masami Hiramatsu
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: nios2-dev@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: netdev@vger.kernel.org -
Archs have been abusing module_free() to clean up their arch-specific
allocations. Since module_free() is also (ab)used by BPF and trace code,
let's keep it to simple allocations, and provide a hook called before
that.This means that avr32, ia64, parisc and s390 no longer need to implement
their own module_free() at all. avr32 doesn't need module_finalize()
either.Signed-off-by: Rusty Russell
Cc: Chris Metcalf
Cc: Haavard Skinnemoen
Cc: Hans-Christian Egtvedt
Cc: Tony Luck
Cc: Fenghua Yu
Cc: "James E.J. Bottomley"
Cc: Helge Deller
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: linux-kernel@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-s390@vger.kernel.org
19 Dec, 2014
1 commit
-
Pull module updates from Rusty Russell:
"The exciting thing here is the getting rid of stop_machine on module
removal. This is possible by using a simple atomic_t for the counter,
rather than our fancy per-cpu counter: it turns out that no one is
doing a module increment per net packet, so the slowdown should be in
the noise"* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
param: do not set store func without write perm
params: cleanup sysfs allocation
kernel:module Fix coding style errors and warnings.
module: Remove stop_machine from module unloading
module: Replace module_ref with atomic_t refcnt
lib/bug: Use RCU list ops for module_bug_list
module: Unlink module with RCU synchronizing instead of stop_machine
module: Wait for RCU synchronizing before releasing a module
11 Nov, 2014
6 commits
-
Fixed codin style errors and warnings. Changes printk with
print_debug/warn. Changed seq_printf to seq_puts.Signed-off-by: Ionut Alexa
Signed-off-by: Rusty Russell (removed bogus KERN_DEFAULT conversion) -
Remove stop_machine from module unloading by adding new reference
counting algorithm.This atomic refcounter works like a semaphore, it can get (be
incremented) only when the counter is not 0. When loading a module,
kmodule subsystem sets the counter MODULE_REF_BASE (= 1). And when
unloading the module, it subtracts MODULE_REF_BASE from the counter.
If no one refers the module, the refcounter becomes 0 and we can
remove the module safely. If someone referes it, we try to recover
the counter by adding MODULE_REF_BASE unless the counter becomes 0,
because the referrer can put the module right before recovering.
If the recovering is failed, we can get the 0 refcount and it
never be incremented again, it can be removed safely too.Note that __module_get() forcibly gets the module refcounter,
users should use try_module_get() instead of that.Signed-off-by: Masami Hiramatsu
Signed-off-by: Rusty Russell -
Replace module_ref per-cpu complex reference counter with
an atomic_t simple refcnt. This is for code simplification.Signed-off-by: Masami Hiramatsu
Signed-off-by: Rusty Russell -
Actually since module_bug_list should be used in BUG context,
we may not need this. But for someone who want to use this
from normal context, this makes module_bug_list an RCU list.Signed-off-by: Masami Hiramatsu
Signed-off-by: Rusty Russell -
Unlink module from module list with RCU synchronizing instead
of using stop_machine(). Since module list is already protected
by rcu, we don't need stop_machine() anymore.Signed-off-by: Masami Hiramatsu
Signed-off-by: Rusty Russell -
Wait for RCU synchronizing on failure path of module loading
before releasing struct module, because the memory of mod->list
can still be accessed by list walkers (e.g. kallsyms).Signed-off-by: Masami Hiramatsu
Signed-off-by: Rusty Russell
28 Oct, 2014
1 commit
-
This is a genuine bug in add_unformed_module(), we cannot use blocking
primitives inside a wait loop.So rewrite the wait_event_interruptible() usage to use the fresh
wait_woken() stuff.Reported-by: Fengguang Wu
Signed-off-by: Peter Zijlstra (Intel)
Cc: tglx@linutronix.de
Cc: ilya.dryomov@inktank.com
Cc: umgwanakikbuti@gmail.com
Cc: Rusty Russell
Cc: oleg@redhat.com
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Link: http://lkml.kernel.org/r/20140924082242.458562904@infradead.org
[ So this is probably complex to backport and the race wasn't reported AFAIK,
so not marked for -stable. ]
Signed-off-by: Ingo MolnarSigned-off-by: Ingo Molnar
19 Oct, 2014
1 commit
-
Pull module fix from Rusty Russell:
"A single panic fix for a rare race, stable CC'd"* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
modules, lock around setting of MODULE_STATE_UNFORMED
15 Oct, 2014
1 commit
-
A panic was seen in the following sitation.
There are two threads running on the system. The first thread is a system
monitoring thread that is reading /proc/modules. The second thread is
loading and unloading a module (in this example I'm using my simple
dummy-module.ko). Note, in the "real world" this occurred with the qlogic
driver module.When doing this, the following panic occurred:
------------[ cut here ]------------
kernel BUG at kernel/module.c:3739!
invalid opcode: 0000 [#1] SMP
Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
CPU: 37 PID: 186343 Comm: cat Tainted: GF O-------------- 3.10.0+ #7
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
RIP: 0010:[] [] module_flags+0xb5/0xc0
RSP: 0018:ffff88080fa7fe18 EFLAGS: 00010246
RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
FS: 00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
Call Trace:
[] m_show+0x19c/0x1e0
[] seq_read+0x16e/0x3b0
[] proc_reg_read+0x3d/0x80
[] vfs_read+0x9c/0x170
[] SyS_read+0x58/0xb0
[] system_call_fastpath+0x16/0x1b
Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
RIP [] module_flags+0xb5/0xc0
RSPConsider the two processes running on the system.
CPU 0 (/proc/modules reader)
CPU 1 (loading/unloading module)CPU 0 opens /proc/modules, and starts displaying data for each module by
traversing the modules list via fs/seq_file.c:seq_open() and
fs/seq_file.c:seq_read(). For each module in the modules list, seq_read
doesop->start() show() stop() state == MODULE_STATE_UNFORMED);
...The other thread, CPU 1, in unloading the module calls the syscall
delete_module() defined in kernel/module.c. The module_mutex is acquired
for a short time, and then released. free_module() is called without the
module_mutex. free_module() then sets mod->state = MODULE_STATE_UNFORMED,
also without the module_mutex. Some additional code is called and then the
module_mutex is reacquired to remove the module from the modules list:/* Now we can delete it from the lists */
mutex_lock(&module_mutex);
stop_machine(__unlink_module, mod, NULL);
mutex_unlock(&module_mutex);This is the sequence of events that leads to the panic.
CPU 1 is removing dummy_module via delete_module(). It acquires the
module_mutex, and then releases it. CPU 1 has NOT set dummy_module->state to
MODULE_STATE_UNFORMED yet.CPU 0, which is reading the /proc/modules, acquires the module_mutex and
acquires a pointer to the dummy_module which is still in the modules list.
CPU 0 calls m_show for dummy_module. The check in m_show() for
MODULE_STATE_UNFORMED passed for dummy_module even though it is being
torn down.Meanwhile CPU 1, which has been continuing to remove dummy_module without
holding the module_mutex, now calls free_module() and sets
dummy_module->state to MODULE_STATE_UNFORMED.CPU 0 now calls module_flags() with dummy_module and ...
static char *module_flags(struct module *mod, char *buf)
{
int bx = 0;BUG_ON(mod->state == MODULE_STATE_UNFORMED);
and BOOM.
Acquire and release the module_mutex lock around the setting of
MODULE_STATE_UNFORMED in the teardown path, which should resolve the
problem.Testing: In the unpatched kernel I can panic the system within 1 minute by
doingwhile (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done
and
while (true) do cat /proc/modules; done
in separate terminals.
In the patched kernel I was able to run just over one hour without seeing
any issues. I also verified the output of panic via sysrq-c and the output
of /proc/modules looks correct for all three states for the dummy_module.dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)Signed-off-by: Prarit Bhargava
Reviewed-by: Oleg Nesterov
Signed-off-by: Rusty Russell
Cc: stable@kernel.org
08 Oct, 2014
1 commit
-
Pull arm64 updates from Catalin Marinas:
- eBPF JIT compiler for arm64
- CPU suspend backend for PSCI (firmware interface) with standard idle
states defined in DT (generic idle driver to be merged via a
different tree)
- Support for CONFIG_DEBUG_SET_MODULE_RONX
- Support for unmapped cpu-release-addr (outside kernel linear mapping)
- set_arch_dma_coherent_ops() implemented and bus notifiers removed
- EFI_STUB improvements when base of DRAM is occupied
- Typos in KGDB macros
- Clean-up to (partially) allow kernel building with LLVM
- Other clean-ups (extern keyword, phys_addr_t usage)* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (51 commits)
arm64: Remove unneeded extern keyword
ARM64: make of_device_ids const
arm64: Use phys_addr_t type for physical address
aarch64: filter $x from kallsyms
arm64: Use DMA_ERROR_CODE to denote failed allocation
arm64: Fix typos in KGDB macros
arm64: insn: Add return statements after BUG_ON()
arm64: debug: don't re-enable debug exceptions on return from el1_dbg
Revert "arm64: dmi: Add SMBIOS/DMI support"
arm64: Implement set_arch_dma_coherent_ops() to replace bus notifiers
of: amba: use of_dma_configure for AMBA devices
arm64: dmi: Add SMBIOS/DMI support
arm64: Correct ftrace calls to aarch64_insn_gen_branch_imm()
arm64:mm: initialize max_mapnr using function set_max_mapnr
setup: Move unmask of async interrupts after possible earlycon setup
arm64: LLVMLinux: Fix inline arm64 assembly for use with clang
arm64: pageattr: Correctly adjust unaligned start addresses
net: bpf: arm64: fix module memory leak when JIT image build fails
arm64: add PSCI CPU_SUSPEND based cpu_suspend support
arm64: kernel: introduce cpu_init_idle CPU operation
...
03 Oct, 2014
1 commit
-
Similar to ARM, AArch64 is generating $x and $d syms... which isn't
terribly helpful when looking at %pF output and the like. Filter those
out in kallsyms, modpost and when looking at module symbols.Seems simplest since none of these check EM_ARM anyway, to just add it
to the strchr used, rather than trying to make things overly
complicated.initcall_debug improves:
dmesg_before.txt: initcall $x+0x0/0x154 [sg] returned 0 after 26331 usecs
dmesg_after.txt: initcall init_sg+0x0/0x154 [sg] returned 0 after 15461 usecsSigned-off-by: Kyle McMartin
Acked-by: Rusty Russell
Signed-off-by: Catalin Marinas
27 Aug, 2014
1 commit
-
Make it clear this is about kernel_param_ops, not kernel_param (which
will soon have a flags field of its own). No functional changes.Cc: Rusty Russell
Cc: Jean Delvare
Cc: Andrew Morton
Cc: Li Zhong
Cc: Jon Mason
Cc: Daniel Vetter
Signed-off-by: Jani Nikula
Signed-off-by: Rusty Russell
16 Aug, 2014
1 commit
-
The commit
4982223e51e8 module: set nx before marking module MODULE_STATE_COMING.
introduced a regression: if a module fails to parse its arguments or
if mod_sysfs_setup fails, then the module's memory will be freed
while still read-only. Anything that reuses that memory will crash
as soon as it tries to write to it.Cc: stable@vger.kernel.org # v3.16
Cc: Rusty Russell
Signed-off-by: Andy Lutomirski
Signed-off-by: Rusty Russell
11 Aug, 2014
1 commit
-
Pull module updates from Rusty Russell:
"This finally applies the stricter sysfs perms checking we pulled out
before last merge window. A few stragglers are fixed (thanks
linux-next!)"* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
arch/powerpc/platforms/powernv/opal-dump.c: fix world-writable sysfs files
arch/powerpc/platforms/powernv/opal-elog.c: fix world-writable sysfs files
drivers/video/fbdev/s3c2410fb.c: don't make debug world-writable.
ARM: avoid ARM binutils leaking ELF local symbols
scripts: modpost: Remove numeric suffix pattern matching
scripts: modpost: fix compilation warning
sysfs: disallow world-writable files.
module: return bool from within_module*()
module: add within_module() function
modules: Fix build error in moduleloader.h
27 Jul, 2014
2 commits
-
Symbols starting with .L are ELF local symbols and should not appear
in ELF symbol tables. However, unfortunately ARM binutils leaks the
.LANCHOR symbols into the symbol table, which leads kallsyms to report
these symbols rather than the real name. It is not very useful when
%pf reports symbols against these leaked .LANCHOR symbols.Arrange for kallsyms to ignore these symbols using the same mechanism
that is used for the ARM mapping symbols.Signed-off-by: Russell King
Signed-off-by: Rusty Russell -
It is just a small optimization that allows to replace few
occurrences of within_module_init() || within_module_core()
with a single call.Signed-off-by: Petr Mladek
Signed-off-by: Rusty Russell
03 Jul, 2014
1 commit
-
Per further discussion with NIST, the requirements for FIPS state that
we only need to panic the system on failed kernel module signature checks
for crypto subsystem modules. This moves the fips-mode-only module
signature check out of the generic module loading code, into the crypto
subsystem, at points where we can catch both algorithm module loads and
mode module loads. At the same time, make CONFIG_CRYPTO_FIPS dependent on
CONFIG_MODULE_SIG, as this is entirely necessary for FIPS mode.v2: remove extraneous blank line, perform checks in static inline
function, drop no longer necessary fips.h include.CC: "David S. Miller"
CC: Rusty Russell
CC: Stephan Mueller
Signed-off-by: Jarod Wilson
Acked-by: Neil Horman
Signed-off-by: Herbert Xu
12 Jun, 2014
1 commit
-
Pull module updates from Rusty Russell:
"Most of this is cleaning up various driver sysfs permissions so we can
re-add the perm check (we unified the module param and sysfs checks,
but the module ones were stronger so we weakened them temporarily).Param parsing gets documented, and also "--" now forces args to be
handed to init (and ignored by the kernel).Module NX/RO protections get tightened: we now set them before calling
parse_args()"* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: set nx before marking module MODULE_STATE_COMING.
samples/kobject/: avoid world-writable sysfs files.
drivers/hid/hid-picolcd_fb: avoid world-writable sysfs files.
drivers/staging/speakup/: avoid world-writable sysfs files.
drivers/regulator/virtual: avoid world-writable sysfs files.
drivers/scsi/pm8001/pm8001_ctl.c: avoid world-writable sysfs files.
drivers/hid/hid-lg4ff.c: avoid world-writable sysfs files.
drivers/video/fbdev/sm501fb.c: avoid world-writable sysfs files.
drivers/mtd/devices/docg3.c: avoid world-writable sysfs files.
speakup: fix incorrect perms on speakup_acntsa.c
cpumask.h: silence warning with -Wsign-compare
Documentation: Update kernel-parameters.tx
param: hand arguments after -- straight to init
modpost: Fix resource leak in read_dump()