28 Oct, 2010
40 commits
-
This takes care of leaking uninitialized kernel stack memory to
userspace from non-zeroed fields in structs in compat ipc functions.

Signed-off-by: Dan Rosenberg
Cc: Manfred Spraul
Cc: Arnd Bergmann
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The kernel currently provides no functionality to analyze the RSS and swap
space usage of each individual sysvipc shared memory segment.

This patch adds this info for each existing shm segment by extending the
output of /proc/sysvipc/shm by two columns for RSS and swap.

Since shmctl(SHM_INFO) already provides a similar calculation (it
currently sums up all RSS/swap info for all segments), I split out a
static function which is now used by both the /proc/sysvipc/shm output and
shmctl(SHM_INFO).

SAP products (esp. the SAP Netweaver ABAP Kernel) use lots of big shared
memory segments (we often have Linux systems with >= 16GB shm usage).
Sometimes we get customer reports about "slow" system responses and while
looking into their configurations we often find massive swapping activity
on the system. With this patch it's now easy to see from the command line
whether and which shm segments get swapped out (and how much), and we can
more easily give recommendations for system tuning. Without the patch it's
currently not possible to do such shm analysis at all.

Also...

Add some spaces in front of the "size" field for 64bit kernels to get the
columns correct if you cat the contents of the file. In
sysvipc_shm_proc_show() the kernel prints the size value in "SPEC_SIZE"
format, which is defined like this:

#if BITS_PER_LONG
Cc: Manfred Spraul
Acked-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Presently do_execve() turns PF_KTHREAD off before search_binary_handler().
This has a theoretical risk of PF_KTHREAD getting lost. We don't have to
turn PF_KTHREAD off in the ENOEXEC case.

This patch moves this flag modification to after the finding of the
executable file.

This is only a theoretical issue because kthreads do not call do_execve()
directly. But fixing it would be better.

Signed-off-by: KOSAKI Motohiro
Acked-by: Roland McGrath
Acked-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In /proc/stat, the number of events for each IRQ is shown by summing that
IRQ's events over all cpus. But we can make use of kstat_irqs().

kstat_irqs() does the same calculation. If !CONFIG_GENERIC_HARDIRQ,
it's not a big cost. (Both the number of cpus and the number of irqs are
small.)

If a system is very big and CONFIG_GENERIC_HARDIRQ is set, it does

for_each_irq()
    for_each_cpu()
        - look up a radix tree
        - read desc->irq_stat[cpu]

This seems inefficient. This patch adds kstat_irqs() for
CONFIG_GENERIC_HARDIRQ and changes the calculation to

for_each_irq()
    look up radix tree
    for_each_cpu()
        - read desc->irq_stat[cpu]

This reduces cost.

A test on a (4096 cpus, 256 nodes, 4592 irqs) host (by Jack Steiner):

%time cat /proc/stat > /dev/null
Before Patch: 2.459 sec
After Patch : .561 sec

[akpm@linux-foundation.org: unexport kstat_irqs, coding-style tweaks]
[akpm@linux-foundation.org: fix unused variable 'per_irq_sum']
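
The loop reordering described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `irq_desc_sketch`, `lookup_desc()`, the `lookups` counter and the tiny `NR_IRQS`/`NR_CPUS` values are hypothetical stand-ins, not the kernel's structures or its radix-tree API.

```c
#include <assert.h>

#define NR_IRQS 8   /* illustrative sizes, not the kernel's */
#define NR_CPUS 4

/* Hypothetical stand-in for an IRQ descriptor with per-CPU counters. */
struct irq_desc_sketch {
    unsigned int irq_stat[NR_CPUS];
};

static struct irq_desc_sketch descs[NR_IRQS];
static unsigned long lookups;   /* counts simulated radix-tree lookups */

/* Stand-in for the lookup that maps an IRQ number to its descriptor. */
static struct irq_desc_sketch *lookup_desc(unsigned int irq)
{
    assert(irq < NR_IRQS);
    lookups++;
    return &descs[irq];
}

/* Old shape: one descriptor lookup per (irq, cpu) pair. */
static unsigned long sum_irq_old(unsigned int irq)
{
    unsigned long sum = 0;

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += lookup_desc(irq)->irq_stat[cpu];
    return sum;
}

/* New shape: one descriptor lookup per irq, then walk the CPUs. */
static unsigned long sum_irq_new(unsigned int irq)
{
    struct irq_desc_sketch *desc = lookup_desc(irq);
    unsigned long sum = 0;

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += desc->irq_stat[cpu];
    return sum;
}
```

Both functions return the same sum; the second performs one lookup per IRQ instead of one per IRQ per CPU, which is where the saving on a machine with thousands of CPUs would come from.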
Signed-off-by: KAMEZAWA Hiroyuki
Tested-by: Jack Steiner
Acked-by: Jack Steiner
Cc: Yinghai Lu
Cc: Ingo Molnar
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
/proc/stat shows the total number of all interrupts to each cpu. But when
the number of IRQs is very large, it takes a very long time and 'cat
/proc/stat' takes more than 10 secs. This is because the sum of all irq
events is counted when /proc/stat is read. This patch adds a percpu "sum
of all irq" counter and reduces read costs.

The cost of reading /proc/stat is important because it's used by major
applications such as 'top', 'ps', 'w', etc.

A test on a machine (4096cpu, 256 nodes, 4592 irqs) shows

%time cat /proc/stat > /dev/null
Before Patch: 12.627 sec
After Patch: 2.459 sec

Signed-off-by: KAMEZAWA Hiroyuki
Tested-by: Jack Steiner
Acked-by: Jack Steiner
Cc: Yinghai Lu
Cc: Ingo Molnar
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The length of the BLOCK_IOPOLL string is making its value be printed too
far to the right. This patch fixes this and makes the output a bit
neater.

Currently:
CPU0
HI: 0
TIMER: 599792
NET_TX: 2
NET_RX: 6
BLOCK: 80807
BLOCK_IOPOLL: 0
TASKLET: 20012
SCHED: 0
HRTIMER: 63
RCU: 619279

With patch:
CPU0
HI: 0
TIMER: 585582
NET_TX: 2
NET_RX: 6
BLOCK: 80320
BLOCK_IOPOLL: 0
TASKLET: 19287
SCHED: 0
HRTIMER: 62
RCU: 604441

Signed-off-by: Davidlohr Bueso
Acked-by: Keika Kobayashi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Document /proc/pid/pagemap in Documentation/filesystems/proc.txt
Signed-off-by: Nikanth Karthikesan
Cc: Richard Guenther
Cc: Balbir Singh
Cc: KOSAKI Motohiro
Acked-by: Matt Mackall
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Export the number of anonymous pages in a mapping via smaps.

Even the private pages in a mapping backed by a file are marked as
anonymous once they are modified. Export this information to user-space via
smaps.

Exporting this count will help gdb to make a better decision on which
areas need to be dumped in its coredump; and should be useful to others
studying the memory usage of a process.

Signed-off-by: Nikanth Karthikesan
Acked-by: Hugh Dickins
Reviewed-by: KOSAKI Motohiro
Cc: Matt Mackall
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
find_new_reaper() releases and regrabs tasklist_lock but was missing
proper annotations. Add them. This removes the following sparse warning:

warning: context imbalance in 'find_new_reaper' - unexpected unlock
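
A function that drops and retakes a lock it was entered with is the pattern these annotations describe. The sketch below is a userspace illustration only: `toy_lock`, `toy_acquire()`, `toy_release()` and `drop_and_retake()` are hypothetical names, and the macro definitions mirror the usual arrangement where the annotations carry context information only when sparse (`__CHECKER__`) is running and expand to nothing otherwise.

```c
#include <assert.h>

/* Under sparse (__CHECKER__) these carry lock-context information;
 * for a normal compile they expand to nothing. */
#ifdef __CHECKER__
# define __releases(x) __attribute__((context(x, 1, 0)))
# define __acquires(x) __attribute__((context(x, 0, 1)))
#else
# define __releases(x)
# define __acquires(x)
#endif

/* Hypothetical toy lock standing in for tasklist_lock. */
struct toy_lock {
    int held;
};

static void toy_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

static void toy_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/*
 * Called with the lock held; drops it for part of the work and retakes
 * it before returning. The annotations tell the checker that the unlock
 * seen first inside the body is expected.
 */
static int drop_and_retake(struct toy_lock *l)
    __releases(l)
    __acquires(l)
{
    int was_dropped;

    toy_release(l);
    was_dropped = !l->held;   /* work done without the lock */
    toy_acquire(l);
    return was_dropped;
}
```

Without the two annotations, a checker that tracks lock context sees an unlock with no matching lock in the same function, which is the "unexpected unlock" imbalance quoted in the warning.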
Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The userland ELF tools have been coping with partial-segments core files
for a few years now. Multiple distro builds are now setting this option.
It behooves everyone who ever deals with core files to have more info
dumped in there, especially as more and more people's compilers are
producing build IDs. Make it the default.

Anyone using older tools confused by these core files can configure this
option off, or just change /proc/PID/coredump_filter after boot.

Signed-off-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We hit a parameter truncation issue. Consider the following:

> echo "|/root/core_pattern_pipe_test %p /usr/libexec/blah-blah-blah \
%s %c %p %u %g 11 12345678901234567890123456789012345678 %t" > \
/proc/sys/kernel/core_pattern

This is okay because the string is shorter than CORENAME_MAX_SIZE. "cat
/proc/sys/kernel/core_pattern" shows the whole string. But after we ran
core_pattern_pipe_test from the man page, we found the last parameter was
truncated like below:

argc[10]=

The root cause is that core_pattern allows % specifiers, which need to be
replaced at parse time, but the replacement may expand the string to
larger than CORENAME_MAX_SIZE. So if the last parameter is a % specifier,
the replace code, which uses snprintf(out_ptr, out_end - out_ptr, ...),
will write beyond the end of the corename array.

[akpm@linux-foundation.org: coding-style fixes]
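
The hazard with the snprintf(out_ptr, out_end - out_ptr, ...) pattern is that snprintf returns the length the output would have had, so blindly advancing out_ptr by the return value can move it past out_end. The following is a minimal userspace sketch of one way to guard against that; `append_bounded()` is a hypothetical helper written for this illustration and is not claimed to be what the patch itself does.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: append formatted text at *out_ptr without going
 * past out_end. Returns 0 on success, -1 if the text did not fit. The
 * buffer stays NUL-terminated either way.
 */
static int append_bounded(char **out_ptr, char *out_end, const char *fmt, ...)
{
    va_list ap;
    size_t room;
    int rc;

    assert(*out_ptr <= out_end);
    room = (size_t)(out_end - *out_ptr);
    if (room == 0)
        return -1;

    va_start(ap, fmt);
    rc = vsnprintf(*out_ptr, room, fmt, ap);
    va_end(ap);

    if (rc < 0)
        return -1;
    if ((size_t)rc >= room) {
        /* rc is the untruncated length: park the cursor on the final
         * NUL instead of advancing past the end of the buffer. */
        *out_ptr = out_end - 1;
        return -1;
    }
    *out_ptr += rc;
    return 0;
}
```

A caller expanding % specifiers could stop, or report the truncation, as soon as the helper returns -1, rather than carrying on with a cursor that no longer points inside the array.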
Signed-off-by: Xiaotian Feng
Cc: Alexander Viro
Cc: Oleg Nesterov
Cc: KOSAKI Motohiro
Reviewed-by: Neil Horman
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Oleg Nesterov pointed out we have to prevent multiple-threads-inside-exec
itself and we can reuse ->cred_guard_mutex for it. Yes, concurrent
execve() has no worth.

Let's move ->cred_guard_mutex from task_struct to signal_struct. It
naturally prevents multiple-threads-inside-exec.

Signed-off-by: KOSAKI Motohiro
Reviewed-by: Oleg Nesterov
Acked-by: Roland McGrath
Acked-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
ptrace_stop() releases and regrabs current->sighand->siglock but was
missing proper annotation. Add it.

Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Cc: Ingo Molnar
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
lock_task_sighand() grabs sighand->siglock in case of returning non-NULL
but unlock_task_sighand() releases it unconditionally. This leads sparse
to complain about the lock context imbalance. Rename and wrap
lock_task_sighand() using __cond_lock() macro to make sparse happy.

Suggested-by: Eric Dumazet
Signed-off-by: Namhyung Kim
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'datap' variable in order to remove unnecessary castings.
Signed-off-by: Namhyung Kim
Cc: Chris Zankel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove unnecessary castings using void pointer and fix copy_to_user()
return value. Also add missing __user markup on the argument of
arch_ptrctl().

Signed-off-by: Namhyung Kim
Cc: Jeff Dike
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove checking @addr less than 0 because @addr is now unsigned.
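
Why the check becomes redundant can be shown with a small userspace sketch. The names `offset_ok_signed()`, `offset_ok_unsigned()`, `check_equivalence()` and the `USER_AREA_SIZE` bound are hypothetical and chosen only for illustration; they are not taken from any architecture's ptrace code.

```c
#include <assert.h>
#include <stddef.h>

#define USER_AREA_SIZE 64UL   /* hypothetical size of the register area */

/* Before: addr was signed, so both bounds had to be checked. */
static int offset_ok_signed(long addr)
{
    return !(addr < 0 || (unsigned long)addr >= USER_AREA_SIZE);
}

/* After: addr is unsigned. An input that was negative as a long shows
 * up as a huge value and fails the upper-bound check on its own. */
static int offset_ok_unsigned(unsigned long addr)
{
    return addr < USER_AREA_SIZE;
}

/* Both versions agree, including for inputs that were negative. */
static void check_equivalence(void)
{
    const long samples[] = { -4, -1, 0, 8, 63, 64, 1000 };

    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(offset_ok_signed(samples[i]) ==
               offset_ok_unsigned((unsigned long)samples[i]));
}
```

With an unsigned parameter, a comparison such as `addr < 0` is always false, so dropping it does not change which offsets are accepted as long as an upper-bound check remains.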
Signed-off-by: Namhyung Kim
Acked-by: Chris Metcalf
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Factor out struct fps and remove redundant castings.
Signed-off-by: Namhyung Kim
Acked-by: David S. Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove unnecessary castings and get rid of dummy pointer in favor of
offsetof() macro in ptrace_32.c. Also use temporary variables and
break long lines in order to improve readability.

Signed-off-by: Namhyung Kim
Cc: Paul Mundt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove unnecessary castings.
Signed-off-by: Namhyung Kim
Cc: Chen Liqin
Cc: Lennox Wu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'datavp' and 'datalp' variables in order to remove unnecessary
castings.

Signed-off-by: Namhyung Kim
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add missing __user markup on the argument of put_user().
Signed-off-by: Namhyung Kim
Cc: Kyle McMartin
Cc: Helge Deller
Cc: "James E.J. Bottomley"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'datap' variable in order to remove unnecessary castings.
Also remove checking @addr less than 0 because @addr is now unsigned.

Signed-off-by: Namhyung Kim
Cc: David Howells
Cc: Koichi Yasutake
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'addrp', 'datavp' and 'datalp' variables in order to remove
unnecessary castings.

Signed-off-by: Namhyung Kim
Cc: Ralf Baechle
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove checking @addr less than 0 because @addr is now unsigned.
Signed-off-by: Namhyung Kim
Cc: Michal Simek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'regno', 'datap' variables in order to remove duplicated
expressions and unnecessary castings. Also remove checking @addr less
than 0 because @addr is now unsigned.

Signed-off-by: Namhyung Kim
Acked-by: Greg Ungerer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'regno', 'datap' variables in order to remove duplicated
expressions and unnecessary castings.

Signed-off-by: Namhyung Kim
Acked-by: Geert Uytterhoeven
Cc: Roman Zippel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'datap' variable in order to remove duplicated castings.
Signed-off-by: Namhyung Kim
Cc: Hirokazu Takata
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'regno', 'datap' variables in order to remove duplicated
expressions and unnecessary castings. Also remove checking @addr
less than 0 because @addr is now unsigned.

Signed-off-by: Namhyung Kim
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'regno', 'datap' variables in order to remove duplicated
expressions and unnecessary castings. Also remove checking @addr
less than 0 because @addr is now unsigned.

Signed-off-by: Namhyung Kim
Cc: David Howells
Cc: "Daniel K."
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'regno' variable in order to remove redundant expression and
remove checking @addr less than 0 because @addr is now unsigned. Also
update 'datap' on PTRACE_GET/SETREGS to fix a bug on arch-v10.

Signed-off-by: Namhyung Kim
Acked-by: Mikael Starvik
Cc: Jesper Nilsson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Change signature of get/put_reg() according to the change of arch_ptrace()
and remove unnecessary castings.

Signed-off-by: Namhyung Kim
Acked-by: Mike Frysinger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'datap' variable of void pointer type in order to remove
unnecessary castings.

Signed-off-by: Namhyung Kim
Acked-by: Haavard Skinnemoen
Cc: Hans-Christian Egtvedt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'datap' variable in order to remove unnecessary castings.
Signed-off-by: Namhyung Kim
Cc: Russell King
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove checking @addr less than 0 because @addr is now unsigned and
use new udescp variable in order to remove unnecessary castings.

[akpm@linux-foundation.org: fix unused variable 'udescp']
Signed-off-by: Namhyung Kim
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix up the arguments to arch_ptrace() to take account of the fact that
@addr and @data are now unsigned long rather than long as of a preceding
patch in this series.

Signed-off-by: Namhyung Kim
Cc:
Acked-by: Roland McGrath
Acked-by: David Howells
Acked-by: Geert Uytterhoeven
Acked-by: David S. Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use new 'datavp' and 'datalp' variables to remove unnecessary castings.
Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Since userspace API of ptrace syscall defines @addr and @data as void
pointers, it would be more appropriate to define them as unsigned long in
kernel. Therefore related functions are changed also.

'unsigned long' is typically used in other places in kernel as an opaque
data type, and using it helps clean up a lot of warnings from
sparse.

Suggested-by: Arnd Bergmann
Signed-off-by: Namhyung Kim
Acked-by: Arnd Bergmann
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
exit_ptrace() releases and regrabs tasklist_lock but was missing proper
annotation. Add it.

Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Cc: Ingo Molnar
Cc: Oleg Nesterov
Signed-off-by: Linus Torvalds -
This patch extracts the core logic from mem_cgroup_update_file_mapped() as
mem_cgroup_update_file_stat() and adds a wrapper.

As a planned future update, memory cgroup has to count dirty pages to
implement dirty_ratio/limit. Moreover, the number of dirty pages is
required to kick the flusher thread to start writeback. (Now, no kick.)

This patch is preparation for it and makes other statistics implementation
clearer. Just a clean up.

Signed-off-by: KAMEZAWA Hiroyuki
Acked-by: Balbir Singh
Reviewed-by: Greg Thelen
Cc: Daisuke Nishimura
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds