09 Feb, 2008
40 commits
-
ipc_lock_check_down(), ipc_lock_check() and ipcget() seem too large to be
inline. Besides, they give no optimization being inline as they perform
calls inside in any case.Moving them into ipc/util.c saves 500 bytes of vmlinux and shortens IPC
internal API.$ ./scripts/bloat-o-meter vmlinux-orig vmlinux
add/remove: 3/2 grow/shrink: 0/10 up/down: 490/-989 (-499)
function old new delta
ipcget - 392 +392
ipc_lock_check_down - 49 +49
ipc_lock_check - 49 +49
sys_semget 119 105 -14
sys_shmget 108 86 -22
sys_msgget 100 78 -22
do_msgsnd 665 631 -34
do_msgrcv 680 644 -36
do_shmat 771 733 -38
sys_msgctl 1302 1229 -73
ipcget_new 80 - -80
sys_semtimedop 1534 1452 -82
sys_semctl 2034 1922 -112
sys_shmctl 1919 1765 -154
ipcget_public 322 - -322The ipcget() growth is the result of gcc inlining of currently static
ipcget_new/_public.Signed-off-by: Pavel Emelyanov
Cc: Nadia Derbey
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Also removes a cflag comparison that caused some mode changes to get wrongly
ignoredSigned-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Cc: Jiri Slaby
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Get the constness right, avoid nasty cast.
Cc: Ingo Molnar
Cc: Kyle McMartin
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
module.c should not define linker variables on its own. We have an include
file for that.Signed-off-by: Christoph Lameter
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The module subsystem cannot handle symbols that are zero. If symbols are
present that have a zero value then the module resolver prints out a
message that these symbols are unresolved.[akinobu.mita@gmail.com: fix __find_symbl() error checks]
Cc: Mathieu Desnoyers
Cc: Kay Sievers
Cc: Rusty Russell
Cc: Andi Kleen
Signed-off-by: Akinobu Mita
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Backend for s390.
Acked-by: Alan Cox
Cc: Martin Schwidefsky
Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Give architectures that support the new termios2 the possibilty to overide the
user_termios_to_kernel_termios and kernel_termios_to_user_termios macros. As
soon as all architectures that use the generic variant have been converted the
ifdefs can go away again. Architectures in question are avr32, frv, powerpc
and s390.Cc: Alan Cox
Cc: Paul Mackerras
Cc: David Howells
Cc: Haavard Skinnemoen
Cc: Martin Schwidefsky
Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix an off by one bug in the fault reason string reporting function, and
clean up some of the code around this buglet.[akpm@linux-foundation.org: cleanup]
Signed-off-by: mark gross
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add support for protected memory enable bits by clearing them if they are
set at startup time. Some future boot loaders or firmware could have this
bit set after it loads the kernel, and it needs to be cleared if DMA's are
going to happen effectively.Signed-off-by: mark gross
Acked-by: Muli Ben-Yehuda
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Typical PDE creation code looks like:
pde = create_proc_entry("foo", 0, NULL);
if (pde)
pde->proc_fops = &foo_proc_fops;Notice that PDE is first created, only then ->proc_fops is set up to
final value. This is a problem because right after creation
a) PDE is fully visible in /proc , and
b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's
possible to ->read without ->open (see one class of oopses below).The fix is new API called proc_create() which makes sure ->proc_fops are
set up before gluing PDE to main tree. Typical new code looks like:pde = proc_create("foo", 0, NULL, &foo_proc_fops);
if (!pde)
return -ENOMEM;Fix most networking users for a start.
In the long run, create_proc_entry() for regular files will go.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
printing eip: c1188c1b *pdpt = 000000002929e001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/block/sda/sda1/dev
Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdromPid: 24679, comm: cat Not tainted (2.6.24-rc3-mm1 #2)
EIP: 0060:[] EFLAGS: 00210002 CPU: 0
EIP is at mutex_lock_nested+0x75/0x25d
EAX: 000006fe EBX: fffffffb ECX: 00001000 EDX: e9340570
ESI: 00000020 EDI: 00200246 EBP: e9340570 ESP: e8ea1ef8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 24679, ti=E8EA1000 task=E9340570 task.ti=E8EA1000)
Stack: 00000000 c106f7ce e8ee05b4 00000000 00000001 458003d0 f6fb6f20 fffffffb
00000000 c106f7aa 00001000 c106f7ce 08ae9000 f6db53f0 00000020 00200246
00000000 00000002 00000000 00200246 00200246 e8ee05a0 fffffffb e8ee0550
Call Trace:
[] seq_read+0x24/0x28a
[] seq_read+0x0/0x28a
[] seq_read+0x24/0x28a
[] seq_read+0x0/0x28a
[] proc_reg_read+0x60/0x73
[] proc_reg_read+0x0/0x73
[] vfs_read+0x6c/0x8b
[] sys_read+0x3c/0x63
[] sysenter_past_esp+0x5f/0xa5
[] destroy_inode+0x24/0x33
=======================
INFO: lockdep is turned off.
Code: 75 21 68 e1 1a 19 c1 68 87 00 00 00 68 b8 e8 1f c1 68 25 73 1f c1 e8 84 06 e9 ff e8 52 b8 e7 ff 83 c4 10 9c 5f fa e8 28 89 ea ff fe 4e 04 79 0a f3 90 80 7e 04 00 7e f8 eb f0 39 76 34 74 33
EIP: [] mutex_lock_nested+0x75/0x25d SS:ESP 0068:e8ea1ef8[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan
Cc: "Eric W. Biederman"
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Long ago when the CLONE_THREAD support first went it someone thought it
would be wise to point /proc/self at /proc/ instead of /proc/.Given that /proc/ can return information about a very different task
(if enough things have been unshared) then our current process /proc/
seems blatantly wrong. So far I have yet to think up an example where the
current behavior would be advantageous, and I can see several places where
it is seriously non-intuitive.We may be stuck with the current broken behavior for backwards
compatibility reasons but lets try fixing our ancient bug for the 2.6.25
time frame and see if anyone screams.Signed-off-by: Eric W. Biederman
Acked-by: Ingo Molnar
Cc: "Guillaume Chazarain"
Cc: "Pavel Emelyanov"
Cc: "Rafael J. Wysocki"
Cc: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently if you access a /proc that is not mounted with your processes
current pid namespace /proc/self will point at a completely random task.This patch fixes /proc/self to point to the current process if it is
available in the particular mount of /proc or to return -ENOENT if the
current process is not visible.Signed-off-by: Eric W. Biederman
Cc: Pavel Emelyanov
Cc: Alexey Dobriyan
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently we possibly lookup the pid in the wrong pid namespace. So
seq_file convert proc_pid_status which ensures the proper pid namespaces is
passed in.[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: another build fix]
[akpm@linux-foundation.org: s390 build fix]
[akpm@linux-foundation.org: fix task_name() output]
[akpm@linux-foundation.org: fix nommu build]
Signed-off-by: Eric W. Biederman
Cc: Andrew Morgan
Cc: Serge Hallyn
Cc: Cedric Le Goater
Cc: Pavel Emelyanov
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Paul Menage
Cc: Paul Jackson
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This conversion is just for code cleanliness, uniformity, and general safety.
Signed-off-by: Eric W. Biederman
Cc: Oleg Nesterov
Cc: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently (as pointed out by Oleg) do_task_stat has a race when calling
task_pid_nr_ns with the task exiting. In addition do_task_stat is not
currently displaying information in the context of the pid namespace that
mounted the /proc filesystem. So "cut -d' ' -f 1 /proc//stat" may not
equal .This patch fixes the problem by converting to a single_open seq_file show
method. Getting the pid namespace from the filesystem superblock instead of
current, and simply using the the struct pid from the inode instead of
attempting to get that same pid from the task.Signed-off-by: Eric W. Biederman
Cc: Oleg Nesterov
Cc: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently many /proc/pid files use a crufty precursor to the current seq_file
api, and they don't have direct access to the pid_namespace or the pid of for
which they are displaying data.So implement proc_single_file_operations to make the seq_file routines easy to
use, and to give access to the full state of the pid of we are displaying data
for.Signed-off-by: Eric W. Biederman
Cc: Oleg Nesterov
Cc: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Print a warning if PDE is registered with a name which already exists in
target directory.Bug report and a simple fix can be found here:
http://bugzilla.kernel.org/show_bug.cgi?id=8798[\n fixlet and no undescriptive variable usage --adobriyan]
[akpm@linux-foundation.org: make printk comprehensible]
Signed-off-by: Zhang Rui
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
proc symlinks always have valid ->data containing destination of symlink. No
need to check it on removal -- proc_symlink() already done it.Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Move code around so as to reduce the number of forward-declarations.
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Pseudo-code for lookup effectively is:
LOCK kernel
LOCK proc_subdir_lock
find PDE
UNLOCK proc_subdir_lockget inode
LOCK proc_subdir_lock
goto unlock
UNLOCK proc_subdir_lock
UNLOCK kernelWe can get rid of LOCK/UNLOCK pair after getting inode simply by jumping
to unlock_kernel() directly.Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
proc is not modular, so MODULE_LICENSE just expands to empty space. proc
without doubts remains GPLed.Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
There's already an option controlling the net namespaces cloning code, so make
it work the same way as all the other namespaces' options.Signed-off-by: Pavel Emelyanov
Cc: "David S. Miller"
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Just like with the user namespaces, move the namespace management code into
the separate .c file and mark the (already existing) PID_NS option as "depend
on NAMESPACES"[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make the user_namespace.o compilation depend on this option and move the
init_user_ns into user.c file to make the kernel compile and work without the
namespaces support. This make the user namespace code be organized similar to
other namespaces'.Also mask the USER_NS option as "depend on NAMESPACES".
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently the IPC namespace management code is spread over the ipc/*.c files.
I moved this code into ipc/namespace.c file which is compiled out when needed.The linux/ipc_namespace.h file is used to store the prototypes of the
functions in namespace.c and the stubs for NAMESPACES=n case. This is done
so, because the stub for copy_ipc_namespace requires the knowledge of the
CLONE_NEWIPC flag, which is in sched.h. But the linux/ipc.h file itself in
included into many many .c files via the sys.h->sem.h sequence so adding the
sched.h into it will make all these .c depend on sched.h which is not that
good. On the other hand the knowledge about the namespaces stuff is required
in 4 .c files only.Besides, this patch compiles out some auxiliary functions from ipc/sem.c,
msg.c and shm.c files. It turned out that moving these functions into
namespaces.c is not that easy because they use many other calls and macros
from the original file. Moving them would make this patch complicated. On
the other hand all these functions can be consolidated, so I will send a
separate patch doing this a bit later.Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently all the namespace management code is in the kernel/utsname.c file,
so just compile it out and make stubs in the appropriate header.The init namespace itself is in init/version.c and is in the kernel all the
time.Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds