Doug / smarc-fsl-linux-kernel | Embedian Git Server

29 Oct, 2010

1 commit

e9f29c9a5 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6

* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits)
x86: allocate space within a region top-down
x86: update iomem_resource end based on CPU physical address capabilities
x86/PCI: allocate space from the end of a region, not the beginning
PCI: allocate bus resources from the top down
resources: support allocating space within a region from the top down
resources: handle overflow when aligning start of available area
resources: ensure callback doesn't allocate outside available space
resources: factor out resource_clip() to simplify find_resource()
resources: add a default alignf to simplify find_resource()
x86/PCI: MMCONFIG: fix region end calculation
PCI: Add support for polling PME state on suspended legacy PCI devices
PCI: Export some PCI PM functionality
PCI: fix message typo
PCI: log vendor/device ID always
PCI: update Intel chipset names and defines
PCI: use new ccflags variable in Makefile
PCI: add PCI_MSIX_TABLE/PBA defines
PCI: add PCI vendor id for STmicroelectronics
x86/PCI: irq and pci_ids patch for Intel Patsburg DeviceIDs
PCI: OLPC: Only enable PCI configuration type override on XO-1
...

Linus Torvalds
2010-10-29 02:59:52 +0800

28 Oct, 2010

25 commits

bdab22501 Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300

* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300: (44 commits)
MN10300: Save frame pointer in thread_info struct rather than global var
MN10300: Change "Matsushita" to "Panasonic".
MN10300: Create a defconfig for the ASB2364 board
MN10300: Update the ASB2303 defconfig
MN10300: ASB2364: Add support for SMSC911X and SMC911X
MN10300: ASB2364: Handle the IRQ multiplexer in the FPGA
MN10300: Generic time support
MN10300: Specify an ELF HWCAP flag for MN10300 Atomic Operations Unit support
MN10300: Map userspace atomic op regs as a vmalloc page
MN10300: And Panasonic AM34 subarch and implement SMP
MN10300: Delete idle_timestamp from irq_cpustat_t
MN10300: Make various interrupt priority settings configurable
MN10300: Optimise do_csum()
MN10300: Implement atomic ops using atomic ops unit
MN10300: Make the FPU operate in non-lazy mode under SMP
MN10300: SMP TLB flushing
MN10300: Use the [ID]PTEL2 registers rather than [ID]PTEL for TLB control
MN10300: Make the use of PIDR to mark TLB entries controllable
MN10300: Rename __flush_tlb*() to local_flush_tlb*()
MN10300: AM34 erratum requires MMUCTR read and write on exception entry
...

Linus Torvalds
2010-10-28 09:53:26 +0800
a042e2613 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
perf python scripting: Add futex-contention script
perf python scripting: Fixup cut'n'paste error in sctop script
perf scripting: Shut up 'perf record' final status
perf record: Remove newline character from perror() argument
perf python scripting: Support fedora 11 (audit 1.7.17)
perf python scripting: Improve the syscalls-by-pid script
perf python scripting: print the syscall name on sctop
perf python scripting: Improve the syscalls-counts script
perf python scripting: Improve the failed-syscalls-by-pid script
kprobes: Remove redundant text_mutex lock in optimize
x86/oprofile: Fix uninitialized variable use in debug printk
tracing: Fix 'faild' -> 'failed' typo
perf probe: Fix format specified for Dwarf_Off parameter
perf trace: Fix detection of script extension
perf trace: Use $PERF_EXEC_PATH in canned report scripts
perf tools: Document event modifiers
perf tools: Remove direct slang.h include
perf_events: Fix for transaction recovery in group_sched_in()
perf_events: Revert: Fix transaction recovery in group_sched_in()
perf, x86: Use NUMA aware allocations for PEBS/BTS/DS allocations
...

Linus Torvalds
2010-10-28 09:48:00 +0800
f66dd539f Merge branch 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus

* 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
NULL-terminate all pci_device_id tables
(trivial) Fix compiler warning in kernel/modules.c

Linus Torvalds
2010-10-28 09:47:39 +0800
61d8e11e5 Remove duplicate includes from many files

Signed-off-by: Zimny Lech
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Zimny Lech
2010-10-28 09:03:18 +0800
5de1cb2d0 kernel/resource.c: handle reinsertion of an already-inserted resource

If the same resource is inserted into the resource tree (maybe not on
purpose), an endless loop will be created. In this situation, the kernel
does not report any warning or error :(

The command below will print endlessly:
# cat /proc/iomem

[akpm@linux-foundation.org: add WARN_ON()]
Signed-off-by: Huang Shijie
Cc: Jesse Barnes
Cc: Bjorn Helgaas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Huang Shijie
2010-10-28 09:03:18 +0800
d57af9b21 taskstats: use real microsecond granularity for CPU times

The taskstats interface uses microsecond granularity for the user and
system time values. The conversion from cputime to the taskstats values
uses the cputime_to_msecs primitive which effectively limits the
granularity to milliseconds. Add the cputime_to_usecs primitive for
architectures that have better, more precise CPU time values. Remove
cputime_to_msecs primitive because there are no more users left.
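
The granularity loss described above can be illustrated with a small sketch. It assumes, purely for illustration, a CPU time value held in nanoseconds; the helpers `ns_to_msecs` and `ns_to_usecs` are hypothetical stand-ins and are not the kernel's cputime primitives.

```c
#include <stdint.h>

/* Hypothetical helpers: CPU time is assumed to be kept in nanoseconds. */
static uint64_t ns_to_msecs(uint64_t ns) { return ns / 1000000ULL; }
static uint64_t ns_to_usecs(uint64_t ns) { return ns / 1000ULL; }

/* Old-style path: convert to milliseconds, then scale up to microseconds. */
static uint64_t usecs_via_msecs(uint64_t ns)
{
    return ns_to_msecs(ns) * 1000ULL;
}

/* New-style path: convert straight to microseconds. */
static uint64_t usecs_direct(uint64_t ns)
{
    return ns_to_usecs(ns);
}
```

For an input of 1234567 ns, the path through milliseconds yields 1000 us while the direct conversion yields 1234 us, which is the precision the interface is documented to carry.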

Signed-off-by: Michael Holzheu
Acked-by: Balbir Singh
Cc: Luck Tony
Cc: Shailabh Nagar
Cc: Martin Schwidefsky
Cc: Oleg Nesterov
Cc: Benjamin Herrenschmidt
Cc: Heiko Carstens
Cc: Thomas Gleixner
Cc: Shailabh Nagar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Holzheu
2010-10-28 09:03:17 +0800
3d9e0cf1f taskstats: split fill_pid function

Separate the finding of a task_struct by pid or tgid from filling the
taskstats data. This makes the code more readable.

Signed-off-by: Michael Holzheu
Acked-by: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Holzheu
2010-10-28 09:03:17 +0800
932331259 taskstats: separate taskstats commands

Move each taskstats command into a single function. This makes the code
more readable and makes it easier to add new commands.

Signed-off-by: Michael Holzheu
Acked-by: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Holzheu
2010-10-28 09:03:17 +0800
858931206 delayacct: align to 8 byte boundary on 64-bit systems

prepare_reply() sets up an skb for the response. The payload contains:

+--------------------------------+
| genlmsghdr - 4 bytes           |
+--------------------------------+
| NLA header - 4 bytes           | /* Aggregate header */
+-+------------------------------+
| | NLA header - 4 bytes         | /* PID header */
| +------------------------------+
| | pid/tgid   - 4 bytes         |
| +------------------------------+
| | NLA header - 4 bytes         | /* stats header */
| +------------------------------+
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jeff Mahoney
2010-10-28 09:03:17 +0800
478735e38 /proc/stat: fix scalability of irq sum of all cpu

In /proc/stat, the number of per-IRQ events is shown by summing each
irq's events on all cpus. But we can make use of kstat_irqs().

kstat_irqs() does the same calculation. If !CONFIG_GENERIC_HARDIRQ,
it's not a big cost. (Both the number of cpus and the number of irqs are
small.)

If a system is very big and CONFIG_GENERIC_HARDIRQ is set, it does

    for_each_irq()
        for_each_cpu()
            - look up a radix tree
            - read desc->irq_stat[cpu]

This does not seem efficient. This patch adds kstat_irqs() for
CONFIG_GENERIC_HARDIRQ and changes the calculation to

    for_each_irq()
        look up radix tree
        for_each_cpu()
            - read desc->irq_stat[cpu]

This reduces cost.

A test on a (4096 cpus, 256 nodes, 4592 irqs) host (by Jack Steiner)

%time cat /proc/stat > /dev/null

Before Patch: 2.459 sec
After Patch : .561 sec
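
The loop reordering described in this message can be sketched in plain C as follows. Everything here is hypothetical: `lookup_desc` stands in for the radix tree lookup, the array sizes are arbitrary, and the `lookups` counter exists only to make the difference in work visible.

```c
#include <stddef.h>

#define NR_IRQS_SKETCH 8   /* illustrative sizes, not taken from the patch */
#define NR_CPUS_SKETCH 4

struct irq_desc_sketch {
    unsigned int irq_stat[NR_CPUS_SKETCH];   /* per-CPU event counts */
};

static struct irq_desc_sketch descs[NR_IRQS_SKETCH];
static unsigned long lookups;                /* number of descriptor lookups */

/* Hypothetical stand-in for the radix tree lookup. */
static struct irq_desc_sketch *lookup_desc(unsigned int irq)
{
    lookups++;
    return irq < NR_IRQS_SKETCH ? &descs[irq] : NULL;
}

/* Before: one lookup for every (irq, cpu) pair. */
static unsigned long sum_lookup_per_cpu(void)
{
    unsigned long sum = 0;

    for (unsigned int irq = 0; irq < NR_IRQS_SKETCH; irq++)
        for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
            sum += lookup_desc(irq)->irq_stat[cpu];
    return sum;
}

/* After: one lookup per irq, then walk the per-CPU counters. */
static unsigned long sum_lookup_per_irq(void)
{
    unsigned long sum = 0;

    for (unsigned int irq = 0; irq < NR_IRQS_SKETCH; irq++) {
        struct irq_desc_sketch *desc = lookup_desc(irq);

        for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
            sum += desc->irq_stat[cpu];
    }
    return sum;
}
```

Both functions return the same sum, but with these sizes the first performs 32 lookups and the second 8; the saving grows with the number of cpus.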

[akpm@linux-foundation.org: unexport kstat_irqs, coding-style tweaks]
[akpm@linux-foundation.org: fix unused variable 'per_irq_sum']
Signed-off-by: KAMEZAWA Hiroyuki
Tested-by: Jack Steiner
Acked-by: Jack Steiner
Cc: Yinghai Lu
Cc: Ingo Molnar
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2010-10-28 09:03:13 +0800
d16e15f5b exit: add lock context annotation on find_new_reaper()

find_new_reaper() releases and regrabs tasklist_lock but was missing
proper annotations. Add them. This removes the following sparse warning:

warning: context imbalance in 'find_new_reaper' - unexpected unlock

Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:13 +0800
9b1bf12d5 signals: move cred_guard_mutex from task_struct to signal_struct

Oleg Nesterov pointed out that we have to prevent
multiple-threads-inside-exec itself and that we can reuse
->cred_guard_mutex for it. Yes, concurrent execve() has no value.

Let's move ->cred_guard_mutex from task_struct to signal_struct. It
naturally prevents multiple-threads-inside-exec.

Signed-off-by: KOSAKI Motohiro
Reviewed-by: Oleg Nesterov
Acked-by: Roland McGrath
Acked-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2010-10-28 09:03:12 +0800
b84011508 signals: annotate lock context change on ptrace_stop()

ptrace_stop() releases and regrabs current->sighand->siglock but was
missing proper annotation. Add it.

Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Cc: Ingo Molnar
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:12 +0800
b8ed374e2 signals: annotate lock_task_sighand()

lock_task_sighand() grabs sighand->siglock when it returns non-NULL,
but unlock_task_sighand() releases it unconditionally. This leads sparse
to complain about the lock context imbalance. Rename and wrap
lock_task_sighand() using the __cond_lock() macro to make sparse happy.

Suggested-by: Eric Dumazet
Signed-off-by: Namhyung Kim
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:12 +0800
9fed81dc4 ptrace: cleanup ptrace_request()

Use new 'datavp' and 'datalp' variables to remove unnecessary casts.

Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:10 +0800
4abf98696 ptrace: change signature of sys_ptrace() and friends

Since the userspace API of the ptrace syscall defines @addr and @data as
void pointers, it would be more appropriate to define them as unsigned
long in the kernel. Related functions are changed accordingly.

'unsigned long' is typically used elsewhere in the kernel as an opaque
data type, and using it here helps clean up a lot of warnings from
sparse.

Suggested-by: Arnd Bergmann
Signed-off-by: Namhyung Kim
Acked-by: Arnd Bergmann
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:10 +0800
c4b5ed250 ptrace: annotate lock context change on exit_ptrace()

exit_ptrace() releases and regrabs tasklist_lock but was missing proper
annotation. Add it.

Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Cc: Ingo Molnar
Cc: Oleg Nesterov
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:10 +0800
45531757b cgroup: notify ns_cgroup deprecated

The ns_cgroup will be removed very soon. For this version, let's warn
that ns_cgroup is deprecated.

Make ns_cgroup and clone_children exclusive. If clone_children is set
and the ns_cgroup is mounted, let's fail with EINVAL when the ns_cgroup
subsys is created (a printk will help the user to understand why the
creation fails).

Update the feature removal schedule file with the deprecated ns_cgroup.

Signed-off-by: Daniel Lezcano
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daniel Lezcano
2010-10-28 09:03:09 +0800
f4a2589fe cgroups: add check for strcpy destination string overflow

The function "strcpy" is used without a check on the maximum allowed
source string length, which could overflow the destination string. A
check of the string length is added before "strcpy" is used. The function
now returns an error if the source string is longer than the maximum.

akpm: presently considered NotABug, but add the check for general
future-safeness and robustness.
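
A length check before "strcpy" of the kind described here might look like the sketch below. The buffer size, the function name `copy_checked` and the choice of -EINVAL as the error code are assumptions made for illustration; they are not taken from the patch.

```c
#include <errno.h>
#include <string.h>

#define NAME_MAX_SKETCH 16   /* hypothetical destination buffer size */

/* Copy src into dst only if it fits, including the terminating NUL. */
static int copy_checked(char dst[NAME_MAX_SKETCH], const char *src)
{
    if (strlen(src) >= NAME_MAX_SKETCH)
        return -EINVAL;      /* assumption: error code chosen for illustration */
    strcpy(dst, src);        /* safe here: the length was checked above */
    return 0;
}
```

The comparison uses `>=` rather than `>` so that a source string of exactly the buffer size is rejected, leaving room for the terminating NUL.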

Signed-off-by: Evgeny Kuznetsov
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Evgeny Kuznetsov
2010-10-28 09:03:09 +0800
32a8cf235 cgroup: make the mount options parsing more accurate

Current behavior:
=================

(1) When we mount a cgroup, we can specify the 'all' option which
means to enable all the cgroup subsystems. This is the default option
when no option is specified.

(2) If we want to mount a cgroup with a subset of the supported cgroup
subsystems, we have to specify a subsystems name list for the mount
option.

(3) If we specify another option like 'noprefix' or 'release_agent',
the current code also wants 'all' or a subsystem name to be specified.
Not critical, but a bit unfriendly, as we should assume (1) in this
case.

(4) Logically, the 'all' option is mutually exclusive with a subsystem
name, but this is not detected.

In other words:
succeeds : mount -t cgroup -o all,freezer cgroup /cgroup
=> is it 'all' or 'freezer' ?
fails : mount -t cgroup -o noprefix cgroup /cgroup
=> succeeds if we do '-o noprefix,all'

The following patches consolidate the mount options check somewhat.

New behavior:
=============

(1) untouched
(2) untouched
(3) the 'all' option becomes the default when only options other than
a subsystem name are specified
(4) raises an error

In other words:
fails : mount -t cgroup -o all,freezer cgroup /cgroup
succeeds : mount -t cgroup -o noprefix cgroup /cgroup

For the sake of readability, the if ... then ... else ... if ...
indentation when parsing the options has been changed to:
    if ... then
        ...
        continue
    fi
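
The new behavior and the "if ... continue" parsing style can be sketched in userspace C. This is only an illustration under stated assumptions: `struct opts_sketch` and `parse_opts_sketch` are hypothetical, only 'all' and 'noprefix' are recognized as non-subsystem options, and every other token is treated as a subsystem name.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

struct opts_sketch {
    bool all;          /* every subsystem is requested */
    bool noprefix;
    bool one_subsys;   /* at least one subsystem was named explicitly */
};

/* Parse a comma-separated option string; modifies `data` in place. */
static int parse_opts_sketch(char *data, struct opts_sketch *o)
{
    bool all_given = false;
    char *tok;

    memset(o, 0, sizeof(*o));
    for (tok = strtok(data, ","); tok; tok = strtok(NULL, ",")) {
        if (strcmp(tok, "all") == 0) {
            all_given = true;
            continue;
        }
        if (strcmp(tok, "noprefix") == 0) {
            o->noprefix = true;
            continue;
        }
        /* Anything else counts as a subsystem name in this sketch. */
        o->one_subsys = true;
    }

    /* (4) 'all' and an explicit subsystem name are mutually exclusive. */
    if (all_given && o->one_subsys)
        return -EINVAL;

    /* (3) fall back to 'all' when no subsystem was named. */
    o->all = !o->one_subsys;
    return 0;
}
```

With this sketch, "all,freezer" is rejected with -EINVAL, while "noprefix" alone succeeds and implies 'all', matching the two cases listed under the new behavior.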

Signed-off-by: Daniel Lezcano
Signed-off-by: Serge E. Hallyn
Reviewed-by: Li Zefan
Reviewed-by: Paul Menage
Cc: Eric W. Biederman
Cc: Jamal Hadi Salim
Cc: Matt Helsley
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daniel Lezcano
2010-10-28 09:03:09 +0800
97978e6d1 cgroup: add clone_children control file

The ns_cgroup is a control group interacting with the namespaces. When a
new namespace is created, a corresponding cgroup is automatically created
too. The cgroup name is the pid of the process that did 'unshare' or of
the child of 'clone'.

This cgroup is tied to the namespace because it prevents a process from
escaping the control group, and it uses the post_clone callback so that
the child cgroup inherits the values of the parent cgroup.

Unfortunately, the more we use this cgroup, the more problems we face
with it:

(1) when a process unshares, the cgroup name may conflict with a
previous cgroup with the same pid, so unshare or clone returns -EEXIST

(2) the cgroup creation is out of control because an application may
create several namespaces, for which the system automatically creates
several cgroups behind its back and leaves them on the cgroupfs (e.g. a
vrf based on the network namespace).

(3) the combination of (1) and (2) forces an administrator to regularly
check and clean up these cgroups.

This patchset removes the ns_cgroup by adding a new flag to the cgroup and
the cgroupfs mount option. It enables copying the parent cgroup when
a child cgroup is created. We can then safely remove the ns_cgroup as
this flag provides compatibility. We now have to manually create a cgroup
and add the task to it, which is consistent with the cgroup framework.

This patch:

Sent as an answer to a previous thread around the ns_cgroup.

https://lists.linux-foundation.org/pipermail/containers/2009-June/018627.html

It adds a control file 'clone_children' for a cgroup. This control file
is a boolean specifying if the child cgroup should be a clone of the
parent cgroup or not. The default value is 'false'.

This flag makes the child cgroup call the post_clone callback of all
the subsystems, if it is available.

At present, cpuset is the only subsystem that has implemented the
post_clone callback.

The option can be set at mount time by specifying the 'clone_children'
mount option.

Signed-off-by: Daniel Lezcano
Signed-off-by: Serge E. Hallyn
Cc: Eric W. Biederman
Acked-by: Paul Menage
Reviewed-by: Li Zefan
Cc: Jamal Hadi Salim
Cc: Matt Helsley
Acked-by: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daniel Lezcano
2010-10-28 09:03:09 +0800
2d3cbf8bc cgroup_freezer: update_freezer_state() does incorrect state transitions

There are 4 state transitions possible for a freezer. Only the FREEZING ->
FROZEN transition is done lazily. This patch allows update_freezer_state
to perform only this transition and renames the function to
update_if_frozen.

Moreover, the is_task_frozen_enough function is removed and every
occurrence of it is replaced with frozen(). Therefore, for a group to
become FROZEN every task must be frozen.

The previous version could trigger the following bug: when a cgroup is in
the process of freezing (but none of its tasks are frozen yet),
update_freezer_state() (called from freezer_read or freezer_write) would
incorrectly report that the group is 'THAWED' (because nfrozen = 0),
allowing the transition FREEZING -> THAWED without writing anything to
'freezer.state'. This is incorrect according to the documentation. This
could result in a 'THAWED' cgroup with frozen tasks inside.
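
The restricted transition can be sketched as a small state update. The enum, the struct and the function below are hypothetical simplifications written for illustration and are not the kernel's freezer code; the point is only that the lazy update never moves a FREEZING group anywhere except to FROZEN.

```c
enum freezer_state_sketch { THAWED_S, FREEZING_S, FROZEN_S };

struct freezer_sketch {
    enum freezer_state_sketch state;
    unsigned int ntotal;    /* tasks in the group */
    unsigned int nfrozen;   /* tasks that are already frozen */
};

/* Perform only the lazy FREEZING -> FROZEN transition. */
static void update_if_frozen_sketch(struct freezer_sketch *f)
{
    if (f->state != FREEZING_S)
        return;                 /* THAWED and FROZEN are left untouched */
    if (f->nfrozen == f->ntotal)
        f->state = FROZEN_S;    /* every task is frozen */
    /* nfrozen == 0 no longer turns a FREEZING group into THAWED */
}
```

A group in FREEZING with three tasks and none frozen yet stays in FREEZING after the update, and becomes FROZEN only once all three are frozen.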

Code to reproduce this bug is available here:
http://pentium.hopto.org/~thinred/repos/linux-misc/freezer_bug2.c

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Tomasz Buchert
Cc: Matt Helsley
Cc: Paul Menage
Cc: Li Zefan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tomasz Buchert
2010-10-28 09:03:08 +0800
0bdba580a cgroup_freezer: fix can_attach() to prohibit moving from/to freezing/frozen cgroups

It is possible to move a task from its cgroup even if this group is
'FREEZING'. This results in a nasty bug - the moved task will become
frozen OUTSIDE its original cgroup and will remain in a permanent 'D'
state.

This patch allows a task to be migrated only between THAWED cgroups.

This behavior was observed and easily reproduced on a single core laptop.
Notice that reproducibility depends highly on the machine used. A program
and instructions on how to reproduce the bug can be fetched from:
http://pentium.hopto.org/~thinred/repos/linux-misc/freezer_bug.c

Signed-off-by: Tomasz Buchert
Cc: Matt Helsley
Cc: Paul Menage
Cc: Li Zefan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tomasz Buchert
2010-10-28 09:03:08 +0800
d5de4ddb1 cgroup_freezer: unnecessary test in cgroup_freezing_or_frozen()

The root freezer_state is always CGROUP_THAWED so we can remove the
special case from the code. The test itself can be handy and is extracted
to a static function.

Signed-off-by: Tomasz Buchert
Cc: Matt Helsley
Cc: Paul Menage
Cc: Li Zefan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tomasz Buchert
2010-10-28 09:03:08 +0800
3a5f65df5 Typedef SMP call function pointer

Typedef the pointer to the function to be called by smp_call_function() and
friends:

typedef void (*smp_call_func_t)(void *info);

as it is used in a fair number of places.
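
How such a typedef shortens declarations can be shown with a userspace sketch. The typedef is the one quoted above; `call_on_each_sketch` is a hypothetical stand-in for smp_call_function() that simply calls the function once per simulated CPU, and `bump_counter` is an example callback.

```c
typedef void (*smp_call_func_t)(void *info);

/* Hypothetical stand-in for smp_call_function(): one call per "CPU". */
static void call_on_each_sketch(smp_call_func_t func, void *info, int ncpus)
{
    for (int cpu = 0; cpu < ncpus; cpu++)
        func(info);
}

/* Example callback whose signature matches the typedef. */
static void bump_counter(void *info)
{
    ++*(int *)info;
}
```

Without the typedef, the first parameter would have to be spelled `void (*func)(void *info)` in every prototype that takes such a callback.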

Signed-off-by: David Howells
cc: linux-arch@vger.kernel.org

David Howells
2010-10-28 00:28:36 +0800

27 Oct, 2010

14 commits

abbce906d (trivial) Fix compiler warning in kernel/modules.c

Building with CONFIG_KALLSYMS=n gives the following warning:

/mnt/src/linux-git/kernel/module.c: In function ‘post_relocation’:
/mnt/src/linux-git/kernel/module.c:2534:2: warning: passing argument 2 of ‘add_kallsyms’ discards qualifiers from pointer target type
/mnt/src/linux-git/kernel/module.c:2038:13: note: expected ‘struct load_info *’ but argument is of type ‘const struct load_info *’

Signed-off-by: Michał Mirosław
Signed-off-by: Rusty Russell

Michał Mirosław
2010-10-27 18:03:05 +0800
426e1f5ce Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
split invalidate_inodes()
fs: skip I_FREEING inodes in writeback_sb_inodes
fs: fold invalidate_list into invalidate_inodes
fs: do not drop inode_lock in dispose_list
fs: inode split IO and LRU lists
fs: switch bdev inode bdi's correctly
fs: fix buffer invalidation in invalidate_list
fsnotify: use dget_parent
smbfs: use dget_parent
exportfs: use dget_parent
fs: use RCU read side protection in d_validate
fs: clean up dentry lru modification
fs: split __shrink_dcache_sb
fs: improve DCACHE_REFERENCED usage
fs: use percpu counter for nr_dentry and nr_dentry_unused
fs: simplify __d_free
fs: take dcache_lock inside __d_path
fs: do not assign default i_ino in new_inode
fs: introduce a per-cpu last_ino allocator
new helper: ihold()
...

Linus Torvalds
2010-10-27 08:58:44 +0800
ee2f154a5 docbook: add more wait/wake/completion to device-drivers docbook

Add more wait, wake, and completion interfaces to the device-drivers
docbook.

Fix kernel-doc notation in the added files.

Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds

Randy Dunlap
2010-10-27 08:32:41 +0800
f5d87d851 printk: declare printk_ratelimit_state in ratelimit.h

Adding a declaration of printk_ratelimit_state in ratelimit.h removes
potential build breakage and the following sparse warning:

kernel/printk.c:1426:1: warning: symbol 'printk_ratelimit_state' was not declared. Should it be static?

[akpm@linux-foundation.org: remove unneeded ifdef]
Signed-off-by: Namhyung Kim
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-27 07:52:16 +0800
674dff650 printk: change type of 'boot_delay' to int *

get_option() takes its 2nd arg as int *, so passing boot_delay to it
caused the following warnings from sparse:

kernel/printk.c:223:27: warning: incorrect type in argument 2 (different signedness)
kernel/printk.c:223:27: expected int *pint
kernel/printk.c:223:27: got unsigned int static [toplevel] *

Since boot_delay can't grow beyond 10,000, changing it to 'int *'
will not cause any problem.

Signed-off-by: Namhyung Kim
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-27 07:52:16 +0800
8155c02a4 printk: add lock context annotation

acquire_console_semaphore_for_printk() releases logbuf_lock but
was missing proper annotation. Add it.

Signed-off-by: Namhyung Kim
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-27 07:52:16 +0800
6c095efd8 printk: fixup declaration of kmsg_reasons

Move the redundant 'const' after '*' to make the pointer itself const.
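
The effect of the placement of 'const' can be shown with a small sketch. The array name, its contents and the helper `reason_name` are hypothetical and chosen only for illustration; they are not the kernel's kmsg_reasons table.

```c
/*
 * 'const char const *' merely repeats the qualifier on the chars that are
 * pointed to. 'const char * const' additionally makes each pointer stored
 * in the array read-only.
 */
static const char * const reasons_sketch[] = { "oops", "panic" };

/* Return the name for index i, or "unknown" when i is out of range. */
static const char *reason_name(unsigned int i)
{
    return i < sizeof(reasons_sketch) / sizeof(reasons_sketch[0])
        ? reasons_sketch[i] : "unknown";
}
```

With the second 'const' in place, an assignment such as `reasons_sketch[0] = "x";` is rejected by the compiler.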

Signed-off-by: Namhyung Kim
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-27 07:52:16 +0800
4ce6494db stop_machine: convert cpu notifier to return encapsulate errno value

Since commit e6bde73b07edeb703d4c89c1daabc09c303de11f ("cpu-hotplug: return
better errno on cpu hotplug failure"), the cpu notifier can return an
encapsulated errno value.

This converts the cpu notifier to return an encapsulated errno value for
stop_machine().

Signed-off-by: Akinobu Mita
Cc: Rusty Russell
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Akinobu Mita
2010-10-27 07:52:15 +0800
ca51c5a76 kernel/stop_machine.c: fix unused variable warning

kernel/stop_machine.c: In function `cpu_stopper_thread':
kernel/stop_machine.c:265: warning: unused variable `ksym_buf'

ksym_buf[] is unused if WARN_ON() is a no-op.

Signed-off-by: Rakib Mullick
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rakib Mullick
2010-10-27 07:52:15 +0800
518de9b39 fs: allow for more than 2^31 files

Robin Holt tried to boot a 16TB system and found af_unix was overflowing
a 32bit value :

We were seeing a failure which prevented boot. The kernel was incapable
of creating either a named pipe or unix domain socket. This comes down
to a common kernel function called unix_create1() which does:

atomic_inc(&unix_nr_socks);
if (atomic_read(&unix_nr_socks) > 2 * get_max_files())
goto out;

The function get_max_files() is a simple return of files_stat.max_files.
files_stat.max_files is a signed integer and is computed in
fs/file_table.c's files_init().

n = (mempages * (PAGE_SIZE / 1024)) / 10;
files_stat.max_files = n;

In our case, mempages (total_ram_pages) is approx 3,758,096,384
(0xe0000000). That leaves max_files at approximately 1,503,238,553.
This causes 2 * get_max_files() to integer overflow.
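
The arithmetic above can be checked with a short sketch that does the computation in 64 bits. The function names are hypothetical, and a page size of 4096 bytes is an assumption: it is what makes the quoted max_files value come out of the quoted mempages value.

```c
#include <stdint.h>

/* files_init() arithmetic from the message, carried out in 64 bits. */
static int64_t max_files_for(int64_t mempages, int64_t page_size)
{
    return (mempages * (page_size / 1024)) / 10;
}

/* Returns 1 when 2 * max_files no longer fits in a signed 32-bit int. */
static int doubled_limit_overflows_int(int64_t max_files)
{
    return 2 * max_files > INT32_MAX;
}
```

With mempages = 3,758,096,384 and 4096-byte pages, `max_files_for` returns 1,503,238,553. Doubling that gives 3,006,477,106, which is above the signed 32-bit maximum of 2,147,483,647.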

The fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long
integers, and change af_unix to use an atomic_long_t instead of atomic_t.

get_max_files() is changed to return an unsigned long. get_nr_files() is
changed to return a long.

unix_nr_socks is changed from atomic_t to atomic_long_t, although this is
not strictly needed to address Robin's problem.

Before patch (on a 64bit kernel) :
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
-18446744071562067968

After patch:
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
2147483648
# cat /proc/sys/fs/file-nr
704 0 2147483648

Reported-by: Robin Holt
Signed-off-by: Eric Dumazet
Acked-by: David Miller
Reviewed-by: Robin Holt
Tested-by: Robin Holt
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Dumazet
2010-10-27 07:52:15 +0800
571428be5 kernel/user.c: add lock release annotation on free_user()

free_user() releases uidhash_lock but was missing annotation. Add it.
This removes the following sparse warnings:

include/linux/spinlock.h:339:9: warning: context imbalance in 'free_user' - unexpected unlock
kernel/user.c:120:6: warning: context imbalance in 'free_uid' - wrong count at exit

Signed-off-by: Namhyung Kim
Cc: Ingo Molnar
Cc: Dhaval Giani
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-27 07:52:15 +0800
ca1cab37d workqueues: s/ON_STACK/ONSTACK/

Silly though it is, completions and wait_queue_heads use foo_ONSTACK
(COMPLETION_INITIALIZER_ONSTACK, DECLARE_COMPLETION_ONSTACK,
__WAIT_QUEUE_HEAD_INIT_ONSTACK and DECLARE_WAIT_QUEUE_HEAD_ONSTACK) so I
guess workqueues should do the same thing.

s/INIT_WORK_ON_STACK/INIT_WORK_ONSTACK/
s/INIT_DELAYED_WORK_ON_STACK/INIT_DELAYED_WORK_ONSTACK/

Cc: Peter Zijlstra
Acked-by: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2010-10-27 07:52:14 +0800
3ecb01df3 use clear_page()/copy_page() in favor of memset()/memcpy() on whole pages

After all, that's what they are intended for.

Signed-off-by: Jan Beulich
Cc: Miklos Szeredi
Cc: "Eric W. Biederman"
Cc: "Rafael J. Wysocki"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2010-10-27 07:52:13 +0800
61ecdb801 mm: strictly nested kmap_atomic()

Ensure kmap_atomic() usage is strictly nested

Signed-off-by: Peter Zijlstra
Reviewed-by: Rik van Riel
Acked-by: Chris Metcalf
Cc: David Howells
Cc: Hugh Dickins
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Steven Rostedt
Cc: Russell King
Cc: Ralf Baechle
Cc: David Miller
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2010-10-27 07:52:08 +0800