Eric Lee / smarc-fsl-linux-kernel

22 Jul, 2010

5 commits

edd63cb6b sysrq,kdb: Use __handle_sysrq() for kdb's sysrq function ... Browse Code »

The kdb code should not toggle the sysrq state in case an end user
wants to try and resume the normal kernel execution.

Signed-off-by: Jason Wessel
Acked-by: Dmitry Torokhov

Jason Wessel
2010-07-22 08:27:07 +0800
b0679c63d debug_core,kdb: fix kgdb_connected bit set in the wrong place ... Browse Code »

Immediately following an exit from the kdb shell the kgdb_connected
variable should be set to zero, unless there are breakpoints planted.
If the kgdb_connected variable is not zeroed out with kdb, it is
impossible to turn off kdb.

This patch is merely a work around for now, the real fix will check
for the breakpoints.

Signed-off-by: Jason Wessel

Jason Wessel
2010-07-22 08:27:07 +0800
9e8b624fc Fix merge regression from external kdb to upstream kdb ... Browse Code »

In the process of merging kdb to the mainline, the kdb lsmod command
stopped printing the base load address of kernel modules. This is
needed for using kdb in conjunction with external tools such as gdb.

Simply restore the functionality by adding a kdb_printf for the base
load address of the kernel modules.

Signed-off-by: Jason Wessel

Jason Wessel
2010-07-22 08:27:06 +0800
fb82c0ff2 repair gdbstub to match the gdbserial protocol specification ... Browse Code »

The gdbserial protocol handler should return an empty packet instead
of an error string when ever it responds to a command it does not
implement.

The problem cases come from a debugger client sending
qTBuffer, qTStatus, qSearch, qSupported.

The incorrect response from the gdbstub leads the debugger clients to
not function correctly. Recent versions of gdb will not detach correctly as a result of this behavior.

Signed-off-by: Jason Wessel
Signed-off-by: Dongdong Deng

Jason Wessel
2010-07-22 08:27:05 +0800
1396a21ba kdb: break out of kdb_ll() when command is terminated ... Browse Code »

Without this patch the "ll" linked-list traversal command won't
terminate when you hit q/Q.

Signed-off-by: Martin Hicks
Signed-off-by: Jason Wessel

Martin Hicks
2010-07-22 08:27:05 +0800

19 Jul, 2010

1 commit

9078370c0 kmemleak: Add support for NO_BOOTMEM configurations ... Browse Code »

With commits 08677214 and 59be5a8e, alloc_bootmem()/free_bootmem() and
friends use the early_res functions for memory management when
NO_BOOTMEM is enabled. This patch adds the kmemleak calls in the
corresponding code paths for bootmem allocations.

Signed-off-by: Catalin Marinas
Acked-by: Pekka Enberg
Acked-by: Yinghai Lu
Cc: H. Peter Anvin
Cc: stable@kernel.org

Catalin Marinas
2010-07-19 18:54:15 +0800

05 Jul, 2010

1 commit

ff49d74ad module: initialize module dynamic debug later ... Browse Code »

We should initialize the module dynamic debug datastructures
only after determining that the module is not loaded yet. This
fixes a bug that introduced in 2.6.35-rc2, where when a trying
to load a module twice, we also load it's dynamic printing data
twice which causes all sorts of nasty issues. Also handle
the dynamic debug cleanup later on failure.

Signed-off-by: Yehuda Sadeh
Signed-off-by: Rusty Russell (removed a #ifdef)
Signed-off-by: Linus Torvalds

Yehuda Sadeh
2010-07-05 11:17:22 +0800

03 Jul, 2010

1 commit

123f94f22 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kerne… ... Browse Code »

…l/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Cure nr_iowait_cpu() users
init: Fix comment
init, sched: Fix race between init and kthreadd

Linus Torvalds
2010-07-03 00:52:58 +0800

01 Jul, 2010

2 commits

8c215bd38 sched: Cure nr_iowait_cpu() users ... Browse Code »

Commit 0224cf4c5e (sched: Intoduce get_cpu_iowait_time_us())
broke things by not making sure preemption was indeed disabled
by the callers of nr_iowait_cpu() which took the iowait value of
the current cpu.

This resulted in a heap of preempt warnings. Cure this by making
nr_iowait_cpu() take a cpu number and fix up the callers to pass
in the right number.

Signed-off-by: Peter Zijlstra
Cc: Arjan van de Ven
Cc: Sergey Senozhatsky
Cc: Rafael J. Wysocki
Cc: Maxim Levitsky
Cc: Len Brown
Cc: Pavel Machek
Cc: Jiri Slaby
Cc: linux-pm@lists.linux-foundation.org
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2010-07-01 15:39:48 +0800
7a0ea09ad futex: futex_find_get_task remove credentails check ... Browse Code »

futex_find_get_task is currently used (through lookup_pi_state) from two
contexts, futex_requeue and futex_lock_pi_atomic. None of the paths
looks it needs the credentials check, though. Different (e)uids
shouldn't matter at all because the only thing that is important for
shared futex is the accessibility of the shared memory.

The credentail check results in glibc assert failure or process hang (if
glibc is compiled without assert support) for shared robust pthread
mutex with priority inheritance if a process tries to lock already held
lock owned by a process with a different euid:

pthread_mutex_lock.c:312: __pthread_mutex_lock_full: Assertion `(-(e)) != 3 || !robust' failed.

The problem is that futex_lock_pi_atomic which is called when we try to
lock already held lock checks the current holder (tid is stored in the
futex value) to get the PI state. It uses lookup_pi_state which in turn
gets task struct from futex_find_get_task. ESRCH is returned either
when the task is not found or if credentials check fails.

futex_lock_pi_atomic simply returns if it gets ESRCH. glibc code,
however, doesn't expect that robust lock returns with ESRCH because it
should get either success or owner died.

Signed-off-by: Michal Hocko
Acked-by: Darren Hart
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Nick Piggin
Cc: Alexey Kuznetsov
Cc: Peter Zijlstra
Signed-off-by: Linus Torvalds

Michal Hocko
2010-07-01 06:43:44 +0800

30 Jun, 2010

1 commit

e05bd3367 kexec: fix Oops in crash_shrink_memory() ... Browse Code »

When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
OOPSes the kernel in crash_shrink_memory. This happens when
crash_shrink_memory tries to release the 'crashk_res' resource which are
not reserved. Also value of "/sys/kernel/kexec_crash_size" shows as 1,
which should be 0.

This patch fixes the OOPS in crash_shrink_memory and shows
"/sys/kernel/kexec_crash_size" as 0 when crash kernel memory is not
reserved.

Signed-off-by: Pavan Naregundi
Reviewed-by: WANG Cong
Cc: Simon Horman
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavan Naregundi
2010-06-30 06:29:31 +0800

29 Jun, 2010

5 commits

5904b3b81 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel… ... Browse Code »

…/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Fix undeclared ENOSYS in include/linux/tracepoint.h
perf record: prevent kill(0, SIGTERM);
perf session: Remove threads from tree on PERF_RECORD_EXIT
perf/tracing: Fix regression of perf losing kprobe events
perf_events: Fix Intel Westmere event constraints
perf record: Don't call newt functions when not initialized

Linus Torvalds
2010-06-29 03:24:43 +0800
f3866db8f Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/… ... Browse Code »

…git/tip/linux-2.6-tip

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Deal with desc->set_type() changing desc->chip

Linus Torvalds
2010-06-29 03:23:12 +0800
f014d937d Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kerne… ... Browse Code »

…l/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Prevent compiler from optimising the sched_avg_update() loop
sched: Fix over-scheduling bug
sched: Fix PROVE_RCU vs cpu_cgroup

Linus Torvalds
2010-06-29 03:18:30 +0800
cf91b415c Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kern… ... Browse Code »

…el/git/tip/linux-2.6-tip

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
nohz: Fix nohz ratelimit

Linus Torvalds
2010-06-29 03:18:02 +0800
e6cb6281e Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel… ... Browse Code »

…/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: silence PROVE_RCU in sched_fork()
idr: fix RCU lockdep splat in idr_get_next()
rcu: apply RCU protection to wake_affine()

Linus Torvalds
2010-06-29 03:17:40 +0800

25 Jun, 2010

1 commit

0d98bb265 sched: Prevent compiler from optimising the sched_avg_update() loop ... Browse Code »

GCC 4.4.1 on ARM has been observed to replace the while loop in
sched_avg_update with a call to uldivmod, resulting in the
following build failure at link-time:

kernel/built-in.o: In function `sched_avg_update':
kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
make: *** [.tmp_vmlinux1] Error 1

This patch introduces a fake data hazard to the loop body to
prevent the compiler optimising the loop away.

Signed-off-by: Will Deacon
Signed-off-by: Andrew Morton
Acked-by: Peter Zijlstra
Cc: Catalin Marinas
Cc: Russell King
Cc: Linus Torvalds
Cc:
Signed-off-by: Ingo Molnar

Will Deacon
2010-06-25 22:11:50 +0800

24 Jun, 2010

1 commit

869515996 sched: silence PROVE_RCU in sched_fork() ... Browse Code »

Because cgroup_fork() is ran before sched_fork() [ from copy_process() ]
and the child's pid is not yet visible the child is pinned to its
cgroup. Therefore we can silence this warning.

A nicer solution would be moving cgroup_fork() to right after
dup_task_struct() and exclude PF_STARTING from task_subsys_state().

Signed-off-by: Peter Zijlstra
Reviewed-by: Li Zefan
Signed-off-by: Paul E. McKenney

Peter Zijlstra
2010-06-24 06:14:09 +0800

23 Jun, 2010

1 commit

f3b577dec rcu: apply RCU protection to wake_affine() ... Browse Code »

The task_group() function returns a pointer that must be protected
by either RCU, the ->alloc_lock, or the cgroup lock (see the
rcu_dereference_check() in task_subsys_state(), which is invoked by
task_group()). The wake_affine() function currently does none of these,
which means that a concurrent update would be within its rights to free
the structure returned by task_group(). Because wake_affine() uses this
structure only to compute load-balancing heuristics, there is no reason
to acquire either of the two locks.

Therefore, this commit introduces an RCU read-side critical section that
starts before the first call to task_group() and ends after the last use
of the "tg" pointer returned from task_group(). Thanks to Li Zefan for
pointing out the need to extend the RCU read-side critical section from
that proposed by the original patch.

Signed-off-by: Daniel J Blueman
Signed-off-by: Paul E. McKenney

Daniel J Blueman
2010-06-23 21:50:44 +0800

18 Jun, 2010

2 commits

3c93717cf sched: Fix over-scheduling bug ... Browse Code »

Commit e70971591 ("sched: Optimize unused cgroup configuration") introduced
an imbalanced scheduling bug.

If we do not use CGROUP, function update_h_load won't update h_load. When the
system has a large number of tasks far more than logical CPU number, the
incorrect cfs_rq[cpu]->h_load value will cause load_balance() to pull too
many tasks to the local CPU from the busiest CPU. So the busiest CPU keeps
going in a round robin. That will hurt performance.

The issue was found originally by a scientific calculation workload that
developed by Yanmin. With that commit, the workload performance drops
about 40%.

CPU before after

00 : 2 : 7
01 : 1 : 7
02 : 11 : 6
03 : 12 : 7
04 : 6 : 6
05 : 11 : 7
06 : 10 : 6
07 : 12 : 7
08 : 11 : 6
09 : 12 : 6
10 : 1 : 6
11 : 1 : 6
12 : 6 : 6
13 : 2 : 6
14 : 2 : 6
15 : 1 : 6

Reviewed-by: Yanmin zhang
Signed-off-by: Alex Shi
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Alex,Shi
2010-06-18 16:45:25 +0800
3310d4d38 nohz: Fix nohz ratelimit ... Browse Code »

Chris Wedgwood reports that 39c0cbe (sched: Rate-limit nohz) causes a
serial console regression, unresponsiveness, and indeed it does. The
reason is that the nohz code is skipped even when the tick was already
stopped before the nohz_ratelimit(cpu) condition changed.

Move the nohz_ratelimit() check to the other conditions which prevent
long idle sleeps.

Reported-by: Chris Wedgwood
Tested-by: Brian Bloniarz
Signed-off-by: Mike Galbraith
Signed-off-by: Peter Zijlstra
Cc: Jiri Kosina
Cc: Linus Torvalds
Cc: Greg KH
Cc: Alan Cox
Cc: OGAWA Hirofumi
Cc: Jef Driesen
LKML-Reference:
Signed-off-by: Thomas Gleixner

Peter Zijlstra
2010-06-18 01:37:29 +0800

12 Jun, 2010

1 commit

42de5532f Merge branch 'bugzilla-13931-sleep-nvs' into release ... Browse Code »

Conflicts:
drivers/acpi/sleep.c

Signed-off-by: Len Brown

Len Brown
2010-06-12 13:15:40 +0800

11 Jun, 2010

2 commits

a8fb26080 perf/tracing: Fix regression of perf losing kprobe events ... Browse Code »

With the addition of the code to shrink the kernel tracepoint
infrastructure, we lost kprobes being traced by perf. The reason
is that I tested if the "tp_event->class->perf_probe" existed before
enabling it. This prevents "ftrace only" events (like the function
trace events) from being enabled by perf.

Unfortunately, kprobe events do not use perf_probe. This causes
kprobes to be missed by perf. To fix this, we add the test to
see if "tp_event->class->reg" exists as well as perf_probe.

Normal trace events have only "perf_probe" but no "reg" function,
and kprobes and syscalls have the "reg" but no "perf_probe".
The ftrace unique events do not have either, so this is a valid
test. If a kprobe or syscall is not to be probed by perf, the
"reg" function is called anyway, and will return a failure and
prevent perf from probing it.

Reported-by: Srikar Dronamraju
Tested-by: Srikar Dronamraju
Acked-by: Peter Zijlstra
Signed-off-by: Steven Rostedt

Steven Rostedt
2010-06-11 08:56:54 +0800
85ca7886f Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel… ... Browse Code »

…/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Fix null pointer deref with SEND_SIG_FORCED
perf: Fix signed comparison in perf_adjust_period()
powerpc/oprofile: fix potential buffer overrun in op_model_cell.c
perf symbols: Set the DSO long name when using symbol_conf.vmlinux_name

Linus Torvalds
2010-06-11 00:30:09 +0800

10 Jun, 2010

1 commit

dd4c4f17d suspend: Move NVS save/restore code to generic suspend functionality ... Browse Code »

Saving platform non-volatile state may be required for suspend to RAM as
well as hibernation. Move it to more generic code.

Signed-off-by: Matthew Garrett
Acked-by: Rafael J. Wysocki
Tested-by: Maxim Levitsky
Signed-off-by: Len Brown

Matthew Garrett
2010-06-10 23:02:34 +0800

09 Jun, 2010

3 commits

467324756 genirq: Deal with desc->set_type() changing desc->chip ... Browse Code »

The set_type() function can change the chip implementation when the
trigger mode changes. That might result in using an non-initialized
irq chip when called from __setup_irq() or when called via
set_irq_type() on an already enabled irq.

The set_irq_type() function should not be called on an enabled irq,
but because we forgot to put a check into it, we have a bunch of users
which grew the habit of doing that and it never blew up as the
function is serialized via desc->lock against all users of desc->chip
and they never hit the non-initialized irq chip issue.

The easy fix for the __setup_irq() issue would be to move the
irq_chip_set_defaults(desc->chip) call after the trigger setting to
make sure that a chip change is covered.

But as we have already users, which do the type setting after
request_irq(), the safe fix for now is to call irq_chip_set_defaults()
from __irq_set_trigger() when desc->set_type() changed the irq chip.

It needs a deeper analysis whether we should refuse to change the chip
on an already enabled irq, but that'd be a large scale change to fix
all the existing users. So that's neither stable nor 2.6.35 material.

Reported-by: Esben Haabendal
Signed-off-by: Thomas Gleixner
Cc: Benjamin Herrenschmidt
Cc: linuxppc-dev
Cc: stable@kernel.org

Thomas Gleixner
2010-06-09 23:05:08 +0800
dc61b1d65 sched: Fix PROVE_RCU vs cpu_cgroup ... Browse Code »

PROVE_RCU has a few issues with the cpu_cgroup because the scheduler
typically holds rq->lock around the css rcu derefs but the generic
cgroup code doesn't (and can't) know about that lock.

Provide means to add extra checks to the css dereference and use that
in the scheduler to annotate its users.

The addition of rq->lock to these checks is correct because the
cgroup_subsys::attach() method takes the rq->lock for each task it
moves, therefore by holding that lock, we ensure the task is pinned to
the current cgroup and the RCU derefence is valid.

That leaves one genuine race in __sched_setscheduler() where we used
task_group() without holding any of the required locks and thus raced
with the cgroup code. Solve this by moving the check under the
appropriate lock.

Signed-off-by: Peter Zijlstra
Cc: "Paul E. McKenney"
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2010-06-09 00:44:04 +0800
f6ab91add perf: Fix signed comparison in perf_adjust_period() ... Browse Code »

Frederic reported that frequency driven swevents didn't work properly
and even caused a division-by-zero error.

It turns out there are two bugs, the division-by-zero comes from a
failure to deal with that in perf_calculate_period().

The other was more interesting and turned out to be a wrong comparison
in perf_adjust_period(). The comparison was between an s64 and u64 and
got implicitly converted to an unsigned comparison. The problem is
that period_left is typically < 0, so it ended up being always true.

Cure this by making the local period variables s64.

Reported-by: Frederic Weisbecker
Tested-by: Frederic Weisbecker
Signed-off-by: Peter Zijlstra
Cc:
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2010-06-09 00:43:00 +0800

05 Jun, 2010

12 commits

90ec78197 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
module: fix bne2 "gave up waiting for init of module libcrc32c"
module: verify_export_symbols under the lock
module: move find_module check to end
module: make locking more fine-grained.
module: Make module sysfs functions private.
module: move sysfs exposure to end of load_module
module: fix kdb's illicit use of struct module_use.
module: Make the 'usage' lists be two-way

Linus Torvalds
2010-06-05 12:09:48 +0800
9bea7f239 module: fix bne2 "gave up waiting for init of module libcrc32c" ... Browse Code »

Problem: it's hard to avoid an init routine stumbling over a
request_module these days. And it's not clear it's always a bad idea:
for example, a module like kvm with dynamic dependencies on kvm-intel
or kvm-amd would be neater if it could simply request_module the right
one.

In this particular case, it's libcrc32c:

libcrc32c_mod_init
crypto_alloc_shash
crypto_alloc_tfm
crypto_find_alg
crypto_alg_mod_lookup
crypto_larval_lookup
request_module

If another module is waiting inside resolve_symbol() for libcrc32c to
finish initializing (ie. bne2 depends on libcrc32c) then it does so
holding the module lock, and our request_module() can't make progress
until that is released.

Waiting inside resolve_symbol() without the lock isn't all that hard:
we just need to pass the -EBUSY up the call chain so we can sleep
where we don't hold the lock. Error reporting is a bit trickier: we
need to copy the name of the unfinished module before releasing the
lock.

Other notes:
1) This also fixes a theoretical issue where a weak dependency would allow
symbol version mismatches to be ignored.
2) We rename use_module to ref_module to make life easier for the only
external user (the out-of-tree ksplice patches).

Signed-off-by: Rusty Russell
Cc: Linus Torvalds
Cc: Tim Abbot
Tested-by: Brandon Philips

Rusty Russell
2010-06-05 09:47:37 +0800
be593f4ce module: verify_export_symbols under the lock ... Browse Code »

It disabled preempt so it was "safe", but nothing stops another module
slipping in before this module is added to the global list now we don't
hold the lock the whole time.

So we check this just after we check for duplicate modules, and just
before we put the module in the global list.

(find_symbol finds symbols in coming and going modules, too).

Signed-off-by: Rusty Russell

Rusty Russell
2010-06-05 09:47:37 +0800
3bafeb624 module: move find_module check to end ... Browse Code »

I think Rusty may have made the lock a bit _too_ finegrained there, and
didn't add it to some places that needed it. It looks, for example, like
PATCH 1/2 actually drops the lock in places where it's needed
("find_module()" is documented to need it, but now load_module() didn't
hold it at all when it did the find_module()).

Rather than adding a new "module_loading" list, I think we should be able
to just use the existing "modules" list, and just fix up the locking a
bit.

In fact, maybe we could just move the "look up existing module" a bit
later - optimistically assuming that the module doesn't exist, and then
just undoing the work if it turns out that we were wrong, just before
adding ourselves to the list.

Signed-off-by: Rusty Russell

Linus Torvalds
2010-06-05 09:47:37 +0800
75676500f module: make locking more fine-grained. ... Browse Code »

Kay Sievers reports that we still have some
contention over module loading which is slowing boot.

Linus also disliked a previous "drop lock and regrab" patch to fix the
bne2 "gave up waiting for init of module libcrc32c" message.

This is more ambitious: we only grab the lock where we need it.

Signed-off-by: Rusty Russell
Cc: Brandon Philips
Cc: Kay Sievers
Cc: Linus Torvalds

Rusty Russell
2010-06-05 09:47:36 +0800
6407ebb27 module: Make module sysfs functions private. ... Browse Code »

These were placed in the header in ef665c1a06 to get the various
SYSFS/MODULE config combintations to compile.

That may have been necessary then, but it's not now. These functions
are all local to module.c.

Signed-off-by: Rusty Russell
Cc: Randy Dunlap

Rusty Russell
2010-06-05 09:47:36 +0800
80a3d1bb4 module: move sysfs exposure to end of load_module ... Browse Code »

This means a little extra work, but is more logical: we don't put
anything in sysfs until we're about to put the module into the
global list an parse its parameters.

This also gives us a logical place to put duplicate module detection
in the next patch.

Signed-off-by: Rusty Russell

Rusty Russell
2010-06-05 09:47:36 +0800
c8e21ced0 module: fix kdb's illicit use of struct module_use. ... Browse Code »

Linus changed the structure, and luckily this didn't compile any more.

Reported-by: Stephen Rothwell
Signed-off-by: Rusty Russell
Cc: Jason Wessel
Cc: Martin Hicks

Rusty Russell
2010-06-05 09:47:36 +0800
2c02dfe7f module: Make the 'usage' lists be two-way ... Browse Code »

When adding a module that depends on another one, we used to create a
one-way list of "modules_which_use_me", so that module unloading could
see who needs a module.

It's actually quite simple to make that list go both ways: so that we
not only can see "who uses me", but also see a list of modules that are
"used by me".

In fact, we always wanted that list in "module_unload_free()": when we
unload a module, we want to also release all the other modules that are
used by that module. But because we didn't have that list, we used to
first iterate over all modules, and then iterate over each "used by me"
list of that module.

By making the list two-way, we simplify module_unload_free(), and it
allows for some trivial fixes later too.

Signed-off-by: Linus Torvalds
Signed-off-by: Rusty Russell (cleaned & rebased)

Linus Torvalds
2010-06-05 09:47:35 +0800
d2dd328b7 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block ... Browse Code »

* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (27 commits)
block: make blk_init_free_list and elevator_init idempotent
block: avoid unconditionally freeing previously allocated request_queue
pipe: change /proc/sys/fs/pipe-max-pages to byte sized interface
pipe: change the privilege required for growing a pipe beyond system max
pipe: adjust minimum pipe size to 1 page
block: disable preemption before using sched_clock()
cciss: call BUG() earlier
Preparing 8.3.8rc2
drbd: Reduce verbosity
drbd: use drbd specific ratelimit instead of global printk_ratelimit
drbd: fix hang on local read errors while disconnected
drbd: Removed the now empty w_io_error() function
drbd: removed duplicated #includes
drbd: improve usage of MSG_MORE
drbd: need to set socket bufsize early to take effect
drbd: improve network latency, TCP_QUICKACK
drbd: Revert "drbd: Create new current UUID as late as possible"
brd: support discard
Revert "writeback: fix WB_SYNC_NONE writeback from umount"
Revert "writeback: ensure that WB_SYNC_NONE writeback with sb pinned is sync"
...

Linus Torvalds
2010-06-05 06:37:44 +0800
9e506f7ad kernel/: fix BUG_ON checks for cpu notifier callbacks direct call ... Browse Code »

The commit 80b5184cc537718122e036afe7e62d202b70d077 ("kernel/: convert cpu
notifier to return encapsulate errno value") changed the return value of
cpu notifier callbacks.

Those callbacks don't return NOTIFY_BAD on failures anymore. But there
are a few callbacks which are called directly at init time and checking
the return value.

I forgot to change BUG_ON checking by the direct callers in the commit.

Signed-off-by: Akinobu Mita
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Akinobu Mita
2010-06-05 06:21:45 +0800
94b3dd0f7 cgroups: alloc_css_id() increments hierarchy depth ... Browse Code »

Child groups should have a greater depth than their parents. Prior to
this change, the parent would incorrectly report zero memory usage for
child cgroups when use_hierarchy is enabled.

test script:
mount -t cgroup none /cgroups -o memory
cd /cgroups
mkdir cg1

echo 1 > cg1/memory.use_hierarchy
mkdir cg1/cg11

echo $$ > cg1/cg11/tasks
dd if=/dev/zero of=/tmp/foo bs=1M count=1

echo
echo CHILD
grep cache cg1/cg11/memory.stat

echo
echo PARENT
grep cache cg1/memory.stat

echo $$ > tasks
rmdir cg1/cg11 cg1
cd /
umount /cgroups

Using fae9c79, a recent patch that changed alloc_css_id() depth computation,
the parent incorrectly reports zero usage:
root@ubuntu:~# ./test
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0151844 s, 69.1 MB/s

CHILD
cache 1048576
total_cache 1048576

PARENT
cache 0
total_cache 0

With this patch, the parent correctly includes child usage:
root@ubuntu:~# ./test
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0136827 s, 76.6 MB/s

CHILD
cache 1052672
total_cache 1052672

PARENT
cache 0
total_cache 1052672

Signed-off-by: Greg Thelen
Acked-by: Paul Menage
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Li Zefan
Cc: [2.6.34.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Greg Thelen
2010-06-05 06:21:45 +0800