28 Jul, 2011
1 commit
-
Since __proc_create() appends the name it is given to the end of the PDE
structure that it allocates, there isn't a need to store a name pointer.
Instead we can just replace the name pointer with a terminal char array of
_unspecified_ length. The compiler will simply append the string to statically
defined variables of PDE type overlapping any hole at the end of the structure
and, unlike specifying an explicitly _zero_ length array, won't give a warning
if you try to statically initialise it with a string of more than zero length.

Also, whilst we're at it:

(1) Move namelen to end just prior to name and reduce it to a single byte
(name shouldn't be longer than NAME_MAX).

(2) Move pde_unload_lock two places further on so that if it's four bytes in
size on a 64-bit machine, it won't cause an unused hole in the PDE struct.

Signed-off-by: David Howells
Signed-off-by: Alexey Dobriyan
Signed-off-by: Linus Torvalds
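The trailing-name layout described above can be illustrated outside the kernel with a C99 flexible array member. This is only a userspace sketch: `struct pde_sketch` and `pde_sketch_alloc()` are hypothetical names, not the kernel's `struct proc_dir_entry` or `__proc_create()`, and `malloc()` stands in for the kernel allocator.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the PDE; not the kernel's struct proc_dir_entry. */
struct pde_sketch {
    unsigned int low_ino;
    unsigned char namelen;  /* a single byte, so names are capped at UCHAR_MAX */
    char name[];            /* flexible array member: must be the last field */
};

/* Allocate the structure and its name in one block; returns NULL on failure. */
struct pde_sketch *pde_sketch_alloc(const char *name)
{
    size_t len = strlen(name);
    struct pde_sketch *pde;

    if (len > UCHAR_MAX)
        return NULL;                      /* would not fit in namelen */

    pde = malloc(sizeof(*pde) + len + 1); /* header + name + NUL */
    if (!pde)
        return NULL;

    pde->low_ino = 0;
    pde->namelen = (unsigned char)len;
    memcpy(pde->name, name, len + 1);     /* name lives right after the header */
    return pde;
}
```

Because the name is stored inside the same allocation, a single `free()` releases both, and no separate name pointer is needed.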
27 Jul, 2011
1 commit
-
This allows us to move duplicated code in
(atomic_inc_not_zero() for now) to

Signed-off-by: Arun Sharma
Reviewed-by: Eric Dumazet
Cc: Ingo Molnar
Cc: David Miller
Cc: Eric Dumazet
Acked-by: Mike Frysinger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
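For readers unfamiliar with the helper named above, the idea of "increment unless zero" can be sketched in portable C11 with a compare-and-swap loop. This is a userspace illustration only; `inc_not_zero_sketch()` is a hypothetical name and the kernel's `atomic_t` API is different.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *v unless it is zero. Returns true if the increment happened.
 * Userspace sketch using C11 atomics, not the kernel implementation.
 */
bool inc_not_zero_sketch(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;   /* the value was zero, so it is left untouched */
}
```

This is the usual building block for taking a reference only on an object that is still alive.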
27 May, 2011
1 commit
-
Setup and cleanup of mm_struct->exe_file is currently done in fs/proc/.
This was because exe_file was needed only for /proc/<pid>/exe. Since we
will need the exe_file functionality also for core dumps (so core name can
contain full binary path), build this functionality always into the
kernel.

To achieve that, move it out of proc FS to kernel/, where in fact it
should belong. By doing that we can make dup_mm_exe_file static. Also we
can drop linux/proc_fs.h inclusion in fs/exec.c and kernel/fork.c.

Signed-off-by: Jiri Slaby
Cc: Alexander Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 May, 2011
1 commit
-
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd:
net: fix get_net_ns_by_fd for !CONFIG_NET_NS
ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry.
ns: Declare sys_setns in syscalls.h
net: Allow setting the network namespace by fd
ns proc: Add support for the ipc namespace
ns proc: Add support for the uts namespace
ns proc: Add support for the network namespace.
ns: Introduce the setns syscall
ns: proc files for namespace naming policy.
25 May, 2011
1 commit
-
Now that mm/mempolicy.c is no longer implementing /proc/pid/numa_maps
there is no need to export struct proc_maps_private to the world. Move it
to fs/proc/internal.h instead.

Signed-off-by: Stephen Wilson
Reviewed-by: KOSAKI Motohiro
Cc: Hugh Dickins
Cc: David Rientjes
Cc: Lee Schermerhorn
Cc: Alexey Dobriyan
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
18 May, 2011
1 commit
-
Provide a stub for proc_mkdir_mode() when CONFIG_PROC_FS is not
enabled, just like the stub for proc_mkdir().

Fixes this linux-next build error:
drivers/net/wireless/airo.c:4504: error: implicit declaration of function 'proc_mkdir_mode'
Signed-off-by: Randy Dunlap
Cc: Stephen Rothwell
Cc: Alexey Dobriyan
Cc: "John W. Linville"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
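The stub pattern mentioned above generally looks like the following sketch: when the feature is configured out, a static inline function with the same signature returns NULL so callers still compile. The macro `CONFIG_FEATURE_SKETCH` and the names `entry_sketch` and `mkdir_mode_sketch()` are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

struct entry_sketch;    /* opaque to callers */

#ifdef CONFIG_FEATURE_SKETCH
/* Real implementation lives elsewhere when the feature is built in. */
struct entry_sketch *mkdir_mode_sketch(const char *name, unsigned int mode,
                                       struct entry_sketch *parent);
#else
/* Stub: same signature, does nothing, reports "no entry created". */
static inline struct entry_sketch *mkdir_mode_sketch(const char *name,
                                                     unsigned int mode,
                                                     struct entry_sketch *parent)
{
    (void)name;
    (void)mode;
    (void)parent;
    return NULL;
}
#endif
```

With the stub present, a caller no longer triggers an "implicit declaration" error when the feature is disabled; it simply gets NULL back.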
11 May, 2011
4 commits
-
Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman
-
Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman
-
Implementing file descriptors for the network namespace
is simple and straight forward.

Acked-by: David S. Miller
Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman
-
Create files under /proc/<pid>/ns/ to allow controlling the
namespaces of a process.

This addresses three specific problems that can make namespaces hard to
work with.
- Namespaces require a dedicated process to pin them in memory.
- It is not possible to use a namespace unless you are the child
of the original creator.
- Namespaces don't have names that userspace can use to talk about
them.

The namespace files under /proc/<pid>/ns/ can be opened and the
file descriptor can be used to talk about a specific namespace, and
to keep the specified namespace alive.

A namespace can be kept alive by either holding the file descriptor
open or bind mounting the file someplace else. aka:
mount --bind /proc/self/ns/net /some/filesystem/path
mount --bind /proc/self/fd/ /some/filesystem/path

This allows namespaces to be named with userspace policy.

It requires additional support to make use of these file descriptors
and that will be coming in the following patches.

Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman
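From userspace, "holding the file descriptor open" amounts to an ordinary open(2) on the namespace file. The sketch below assumes a Linux system that exposes such files; the helper names `ns_fd_open_sketch()` and `ns_fd_close_sketch()` are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a namespace file (for example "/proc/self/ns/net") read-only.
 * Returns a file descriptor, or -1 if the file cannot be opened.
 * While the descriptor stays open, the namespace it refers to is pinned.
 */
int ns_fd_open_sketch(const char *path)
{
    return open(path, O_RDONLY);
}

/* Drop the reference again; a negative fd is ignored. */
void ns_fd_close_sketch(int fd)
{
    if (fd >= 0)
        close(fd);
}
```

A caller would typically keep the returned descriptor for as long as the namespace must stay alive and check for -1 on kernels without these files.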
24 Mar, 2011
1 commit
-
1. namelen is declared "unsigned short" which hints for "maybe space savings".
Indeed in 2.4 struct proc_dir_entry looked like:

struct proc_dir_entry {
    unsigned short low_ino;
    unsigned short namelen;

Now, low_ino is "unsigned int", all savings were gone for a long time.
"struct proc_dir_entry" instances are not so numerous that its size is
worth worrying about, anyway.

2. converting from unsigned short to int/unsigned int can only create
problems, we better play it safe.

Space is not really conserved, because of natural alignment for the next
field. sizeof(struct proc_dir_entry) remains the same.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
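The alignment argument can be checked with a small sketch. The two structures below are hypothetical, cut-down layouts (not the real struct proc_dir_entry), and the result assumes a common ABI in which pointers are at least 4-byte aligned, so the bytes saved by the short are consumed by padding before the next field.

```c
#include <stddef.h>

/* Hypothetical layout with a 16-bit length followed by a pointer. */
struct with_short_sketch {
    unsigned int low_ino;
    unsigned short namelen;   /* padding is expected to follow here */
    const char *name;
};

/* Same layout with the length widened to unsigned int. */
struct with_uint_sketch {
    unsigned int low_ino;
    unsigned int namelen;
    const char *name;
};

/* Returns nonzero if widening namelen did not change the structure size. */
int same_size_sketch(void)
{
    return sizeof(struct with_short_sketch) == sizeof(struct with_uint_sketch);
}
```

On typical 32-bit and 64-bit targets the pointer's alignment requirement means both structures occupy the same number of bytes, which is the point the message makes.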
23 Sep, 2009
3 commits
-
Benjamin Herrenschmidt pointed out that vmemmap
range is not included in KCORE_RAM, KCORE_VMALLOC ....

This adds KCORE_VMEMMAP if SPARSEMEM_VMEMMAP is used. By this, vmemmap
can be readable via /proc/kcore

Because it's not vmalloc area, vread/vwrite cannot be used. But the range
is static against the memory layout, this patch handles vmemmap area by
the same scheme with physical memory.

This patch assumes SPARSEMEM_VMEMMAP range is not in VMALLOC range. It's
correct now.

[akpm@linux-foundation.org: fix typo]
Signed-off-by: KAMEZAWA Hiroyuki
Cc: Jiri Slaby
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: WANG Cong
Cc: Benjamin Herrenschmidt
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Presently, kclist_add() takes only a start address and a size as its arguments.
To make kclist dynamically reconfigurable, it's necessary to
know which kclists are for System RAM and which are not.

This patch adds kclist types as
KCORE_RAM
KCORE_VMALLOC
KCORE_TEXT
KCORE_OTHER

This "type" is used in a patch following this for detecting KCORE_RAM.
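As a rough illustration of tagging each region with a type, the sketch below keeps a singly linked list of entries and counts those of a given kind. All names (`kcore_type_sketch`, `kcore_entry_sketch`, `kclist_add_sketch()`, `kclist_count_sketch()`) are hypothetical and do not reproduce the kernel's data structures or the real kclist_add() signature.

```c
#include <stddef.h>

/* Hypothetical region kinds, mirroring the four types listed above. */
enum kcore_type_sketch {
    KCORE_SK_RAM,
    KCORE_SK_VMALLOC,
    KCORE_SK_TEXT,
    KCORE_SK_OTHER
};

struct kcore_entry_sketch {
    struct kcore_entry_sketch *next;
    unsigned long addr;
    size_t size;
    enum kcore_type_sketch type;
};

/* Fill in a caller-provided entry and push it onto the list head. */
void kclist_add_sketch(struct kcore_entry_sketch **head,
                       struct kcore_entry_sketch *e,
                       unsigned long addr, size_t size,
                       enum kcore_type_sketch type)
{
    e->addr = addr;
    e->size = size;
    e->type = type;
    e->next = *head;
    *head = e;
}

/* Count the entries of one type, e.g. to find all RAM regions. */
size_t kclist_count_sketch(const struct kcore_entry_sketch *head,
                           enum kcore_type_sketch type)
{
    size_t n = 0;

    for (; head; head = head->next)
        if (head->type == type)
            n++;
    return n;
}
```

Carrying the type with each entry is what lets later code pick out only the RAM regions when the list has to be rebuilt.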
Signed-off-by: KAMEZAWA Hiroyuki
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: WANG Cong
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This patchset is for /proc/kcore. With this,
- many per-arch hooks are removed.
- /proc/kcore will know really valid physical memory area.
- /proc/kcore will be aware of memory hotplug.
- /proc/kcore will be architecture independent i.e.
if an arch supports CONFIG_MMU, it can use /proc/kcore.
(if the arch uses usual memory layout.)

This patch:
/proc/kcore uses its own list handling codes. It's better to use
generic list codes.

No changes in logic. just clean up.
Signed-off-by: KAMEZAWA Hiroyuki
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: WANG Cong
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Jun, 2009
1 commit
-
Signed-off-by: Al Viro
31 Mar, 2009
1 commit
-
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning anything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as an easily manipulated field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.
So, let's nuke it.
Kudos to Jeff Layton for reminding about this, let's say, oversight.
http://bugzilla.kernel.org/show_bug.cgi?id=12454
Signed-off-by: Alexey Dobriyan
23 Oct, 2008
2 commits
-
Now that everything was moved to their more or less expected places,
apply rm(1).

Signed-off-by: Alexey Dobriyan
-
Signed-off-by: Alexey Dobriyan
07 Oct, 2008
1 commit
-
commit 14cf11af6cf608eb8c23e989ddb17a715ddce109 ("powerpc: Merge enough to
start building in arch/powerpc.") unwired /proc/ppc_htab, and commit
917f0af9e5a9ceecf9e72537fabb501254ba321d ("powerpc: Remove arch/ppc and
include/asm-ppc") removed the rest of the /proc/ppc_htab support, but there are
still a few references left. Kill them for good.

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Benjamin Herrenschmidt
27 Jul, 2008
1 commit
-
* keep references to ctl_table_head and ctl_table in /proc/sys inodes
* grab the former during operations, use the latter for access to
entry if that succeeds
* have ->d_compare() check if table should be seen for one who does lookup;
that allows us to avoid flipping inodes - if we have the same name resolve
to different things, we'll just keep several dentries and ->d_compare()
will reject the wrong ones.
* have ->lookup() and ->readdir() scan the table of our inode first, then
walk all ctl_table_header and scan ->attached_by for those that are
attached to our directory.
* implement ->getattr().
* get rid of insane amounts of tree-walking
* get rid of the need to know dentry in ->permission() and of the contortions
induced by that.

Signed-off-by: Al Viro
26 Jul, 2008
2 commits
-
Current two-stage scheme of removing PDE emphasizes one bug in proc:
open
rmmod
remove_proc_entry
close

->release won't be called because ->proc_fops were cleared. In simple
cases it's small memory leak.

For every ->open, ->release has to be done. List of openers is introduced
which is traversed at remove_proc_entry() if needed.

Discussions with Al long ago (sigh).
Signed-off-by: Alexey Dobriyan
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This patch moves the extern of struct proc_kmsg_operations to
fs/proc/internal.h and adds an #include "internal.h" to fs/proc/kmsg.c
so that the latter sees the former.

Signed-off-by: Adrian Bunk
Cc: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Jul, 2008
1 commit
-
get_proc_net() can now become static.
Signed-off-by: Adrian Bunk
Acked-by: Pavel Emelyanov
Signed-off-by: David S. Miller
13 Jun, 2008
1 commit
-
Move the forward-declaration of struct mm_struct a little way up
proc_fs.h. This fixes a bunch of "'struct mm_struct' declared inside
parameter list" warnings with CONFIG_PROC_FS=n

Signed-off-by: Ben Nizette
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
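The warning arises when a struct tag first appears inside a prototype, which gives it prototype scope only. A minimal sketch of the fix, using the hypothetical names `mm_sketch` and `uses_mm_sketch()`, is to declare the tag at file scope before any prototype mentions it:

```c
#include <stddef.h>

/* Forward declaration at file scope, placed before any prototype uses it. */
struct mm_sketch;

/* This prototype now refers to the file-scope type, so no warning is expected. */
int uses_mm_sketch(struct mm_sketch *mm);

/* The full definition can come later, or live in another header. */
struct mm_sketch {
    int users;
};

int uses_mm_sketch(struct mm_sketch *mm)
{
    return mm ? mm->users : -1;
}
```

Without the first declaration, the `struct mm_sketch` named in the prototype would be a different, incomplete type visible only inside that parameter list.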
29 Apr, 2008
7 commits
-
This set of patches fixes a proc ->open'less usage due to ->proc_fops flip in
most of the kernel code. The original OOPS is described in the
commit 2d3a4e3666325a9709cc8ea2e88151394e8f20fc:

Typical PDE creation code looks like:
pde = create_proc_entry("foo", 0, NULL);
if (pde)
    pde->proc_fops = &foo_proc_fops;

Notice that PDE is first created, only then ->proc_fops is set up to
final value. This is a problem because right after creation
a) PDE is fully visible in /proc, and
b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's
possible to ->read without ->open (see one class of oopses below).

The fix is new API called proc_create() which makes sure ->proc_fops are
set up before gluing PDE to main tree. Typical new code looks like:

pde = proc_create("foo", 0, NULL, &foo_proc_fops);
if (!pde)
    return -ENOMEM;

Fix most networking users for a start.
In the long run, create_proc_entry() for regular files will go.

In addition to this, proc_create_data is introduced to fix reading from
proc without PDE->data. The race is basically the same as above.

create_proc_entries is replaced in the entire kernel code as new method
is also simply better.

This patch:
The problem is the same as for de->proc_fops. Right now PDE becomes visible
without data set. So, the entry could be looked up without data. This, in
most cases, will simply OOPS.

proc_create_data call is created to address this issue. proc_create now
becomes a wrapper around it.

Signed-off-by: Denis V. Lunev
Cc: "Eric W. Biederman"
Cc: "J. Bruce Fields"
Cc: Alessandro Zummo
Cc: Alexey Dobriyan
Cc: Bartlomiej Zolnierkiewicz
Cc: Benjamin Herrenschmidt
Cc: Bjorn Helgaas
Cc: Chris Mason
Acked-by: David Howells
Cc: Dmitry Torokhov
Cc: Geert Uytterhoeven
Cc: Grant Grundler
Cc: Greg Kroah-Hartman
Cc: Haavard Skinnemoen
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: James Bottomley
Cc: Jaroslav Kysela
Cc: Jeff Garzik
Cc: Jeff Mahoney
Cc: Jesper Nilsson
Cc: Karsten Keil
Cc: Kyle McMartin
Cc: Len Brown
Cc: Martin Schwidefsky
Cc: Mathieu Desnoyers
Cc: Matthew Wilcox
Cc: Mauro Carvalho Chehab
Cc: Mikael Starvik
Cc: Nadia Derbey
Cc: Neil Brown
Cc: Paul Mackerras
Cc: Peter Osterlund
Cc: Pierre Peiffer
Cc: Russell King
Cc: Takashi Iwai
Cc: Tony Luck
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Now that last dozen or so users of ->get_info were removed, ditch it too.
Everyone sane should have switched to seq_file interface long ago.

P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc
is long-standing crap, BTW, thus
a) put ->read_proc/->write_proc/read_proc_entry() users on death row,
b) new such users should be rejected,
c) everyone is encouraged to convert his favourite ->read_proc user or
I'll do it, lazy bastards.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Remove proc_root export. Creation and removal works well if parent PDE is
supplied as NULL -- it worked always that way.

So, one useless export removed and consistency added, some drivers created
PDEs with &proc_root as parent but removed them as NULL and so on.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Use creation by full path: "driver/foo".

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Use creation by full path instead: "fs/foo".

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Remove proc_bus export and variable itself. Using pathnames works fine
and is slightly more understandable and greppable.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The kernel implements readlink of /proc/pid/exe by getting the file from
the first executable VMA. Then the path to the file is reconstructed and
reported as the result.

Because of the VMA walk the code is slightly different on nommu systems.
This patch avoids separate /proc/pid/exe code on nommu systems. Instead of
walking the VMAs to find the first executable file-backed VMA we store a
reference to the exec'd file in the mm_struct.

That reference would prevent the filesystem holding the executable file
from being unmounted even after unmapping the VMAs. So we track the number
of VM_EXECUTABLE VMAs and drop the new reference when the last one is
unmapped. This avoids pinning the mounted filesystem.

[akpm@linux-foundation.org: improve comments]
[yamamoto@valinux.co.jp: fix dup_mmap]
Signed-off-by: Matt Helsley
Cc: Oleg Nesterov
Cc: David Howells
Cc: "Eric W. Biederman"
Cc: Christoph Hellwig
Cc: Al Viro
Cc: Hugh Dickins
Signed-off-by: YAMAMOTO Takashi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
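The counting scheme described above (hold one reference to the executable's file, and drop it when the last executable mapping goes away) can be sketched in plain C. The types and functions here (`file_sketch`, `mm_sketch`, `set_exe_file_sketch()`, `added_exe_vma_sketch()`, `removed_exe_vma_sketch()`) are hypothetical simplifications, single-threaded, and not the kernel's implementation.

```c
#include <stddef.h>

struct file_sketch {
    int refs;                       /* simplified reference count */
};

struct mm_sketch {
    struct file_sketch *exe_file;   /* reference to the exec'd file */
    unsigned long num_exe_vmas;     /* how many executable mappings exist */
};

/* Remember the exec'd file and take one reference on it. */
void set_exe_file_sketch(struct mm_sketch *mm, struct file_sketch *f)
{
    f->refs++;
    mm->exe_file = f;
    mm->num_exe_vmas = 0;
}

/* Called when an executable mapping of the file is created. */
void added_exe_vma_sketch(struct mm_sketch *mm)
{
    mm->num_exe_vmas++;
}

/* Called when one is unmapped; the last unmap drops the file reference. */
void removed_exe_vma_sketch(struct mm_sketch *mm)
{
    if (mm->num_exe_vmas == 0)
        return;                     /* nothing to remove */
    mm->num_exe_vmas--;
    if (mm->num_exe_vmas == 0 && mm->exe_file) {
        mm->exe_file->refs--;       /* stop pinning the file (and its filesystem) */
        mm->exe_file = NULL;
    }
}
```

The counter exists purely so that the stored reference does not outlive the mappings and keep the filesystem from being unmounted.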
08 Mar, 2008
1 commit
-
Current /proc/net is done with so called "shadows", but current
implementation is broken and has little chance of getting fixed.

The problem is that dentries subtree of /proc/net directory has
fancy revalidation rules to make processes living in different
net namespaces see different entries in /proc/net subtree, but
currently, tasks see in the /proc/net subdir the contents of any
other namespace, depending on who opened the file first.

The proposed fix is to turn /proc/net into a symlink, which points
to /proc/self/net, which in turn shows what previously was in
/proc/net - the network-related info, from the net namespace the
appropriate task lives in.

# ls -l /proc/net
lrwxrwxrwx 1 root root 8 Mar 5 15:17 /proc/net -> self/net

In other words - this behaves like /proc/mounts, but unlike
"mounts", "net" is not a file, but a directory.

Changes from v2:
* Fixed discrepancy of /proc/net nlink count and selinux labeling
screwup pointed out by Stephen.

To get the correct nlink count the ->getattr callback for /proc/net
is overridden to read one from the net->proc_net entry.

To make selinux still work the net->proc_net entry is initialized
properly, i.e. with the "net" name and the proc_net parent.

Selinux fixes are
Acked-by: Stephen Smalley

Changes from v1:
* Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.

Signed-off-by: Pavel Emelyanov
Acked-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
15 Feb, 2008
1 commit
-
proc_get_link() is always called with a dentry and a vfsmount from a struct
path. Make proc_get_link() take it directly as an argument.

Signed-off-by: Jan Blunck
Acked-by: Christoph Hellwig
Cc: Al Viro
Cc: "J. Bruce Fields"
Cc: Neil Brown
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 Feb, 2008
3 commits
-
Typical PDE creation code looks like:
pde = create_proc_entry("foo", 0, NULL);
if (pde)
    pde->proc_fops = &foo_proc_fops;

Notice that PDE is first created, only then ->proc_fops is set up to
final value. This is a problem because right after creation
a) PDE is fully visible in /proc, and
b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's
possible to ->read without ->open (see one class of oopses below).

The fix is new API called proc_create() which makes sure ->proc_fops are
set up before gluing PDE to main tree. Typical new code looks like:

pde = proc_create("foo", 0, NULL, &foo_proc_fops);
if (!pde)
    return -ENOMEM;

Fix most networking users for a start.
In the long run, create_proc_entry() for regular files will go.
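The ordering that matters here (finish initialising the entry, then make it reachable) can be shown with a toy registry. This is a single-threaded userspace sketch with hypothetical names (`ops_sketch`, `entry_sketch`, `registry_sketch`, `create_sketch()`); real concurrent code would additionally need locking or memory barriers around the publish step.

```c
#include <stddef.h>
#include <stdlib.h>

struct ops_sketch {
    int (*open)(void);
};

struct entry_sketch {
    const char *name;
    const struct ops_sketch *ops;
    struct entry_sketch *next;
};

/* Head of the list that lookups would walk. */
static struct entry_sketch *registry_sketch;

/*
 * Create an entry with its operations already set, then link it in.
 * Nothing can find the entry before 'ops' has its final value.
 */
struct entry_sketch *create_sketch(const char *name,
                                   const struct ops_sketch *ops)
{
    struct entry_sketch *e = malloc(sizeof(*e));

    if (!e)
        return NULL;

    e->name = name;
    e->ops = ops;               /* fully initialised first ... */
    e->next = registry_sketch;
    registry_sketch = e;        /* ... and published last */
    return e;
}
```

Passing the operations to the constructor, instead of assigning them afterwards, removes the window in which a lookup could see a half-initialised entry.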
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
printing eip: c1188c1b *pdpt = 000000002929e001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/block/sda/sda1/dev
Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom
Pid: 24679, comm: cat Not tainted (2.6.24-rc3-mm1 #2)
EIP: 0060:[] EFLAGS: 00210002 CPU: 0
EIP is at mutex_lock_nested+0x75/0x25d
EAX: 000006fe EBX: fffffffb ECX: 00001000 EDX: e9340570
ESI: 00000020 EDI: 00200246 EBP: e9340570 ESP: e8ea1ef8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 24679, ti=E8EA1000 task=E9340570 task.ti=E8EA1000)
Stack: 00000000 c106f7ce e8ee05b4 00000000 00000001 458003d0 f6fb6f20 fffffffb
00000000 c106f7aa 00001000 c106f7ce 08ae9000 f6db53f0 00000020 00200246
00000000 00000002 00000000 00200246 00200246 e8ee05a0 fffffffb e8ee0550
Call Trace:
[] seq_read+0x24/0x28a
[] seq_read+0x0/0x28a
[] seq_read+0x24/0x28a
[] seq_read+0x0/0x28a
[] proc_reg_read+0x60/0x73
[] proc_reg_read+0x0/0x73
[] vfs_read+0x6c/0x8b
[] sys_read+0x3c/0x63
[] sysenter_past_esp+0x5f/0xa5
[] destroy_inode+0x24/0x33
=======================
INFO: lockdep is turned off.
Code: 75 21 68 e1 1a 19 c1 68 87 00 00 00 68 b8 e8 1f c1 68 25 73 1f c1 e8 84 06 e9 ff e8 52 b8 e7 ff 83 c4 10 9c 5f fa e8 28 89 ea ff fe 4e 04 79 0a f3 90 80 7e 04 00 7e f8 eb f0 39 76 34 74 33
EIP: [] mutex_lock_nested+0x75/0x25d SS:ESP 0068:e8ea1ef8

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan
Cc: "Eric W. Biederman"
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Currently we possibly lookup the pid in the wrong pid namespace. So
convert proc_pid_status to seq_file, which ensures the proper pid namespace is
passed in.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: another build fix]
[akpm@linux-foundation.org: s390 build fix]
[akpm@linux-foundation.org: fix task_name() output]
[akpm@linux-foundation.org: fix nommu build]
Signed-off-by: Eric W. Biederman
Cc: Andrew Morgan
Cc: Serge Hallyn
Cc: Cedric Le Goater
Cc: Pavel Emelyanov
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Paul Menage
Cc: Paul Jackson
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Currently many /proc/pid files use a crufty precursor to the current seq_file
api, and they don't have direct access to the pid_namespace or the pid for
which they are displaying data.

So implement proc_single_file_operations to make the seq_file routines easy to
use, and to give access to the full state of the pid we are displaying data
for.

Signed-off-by: Eric W. Biederman
Cc: Oleg Nesterov
Cc: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Feb, 2008
1 commit
-
This puts all the clear_refs code where it belongs and probably lets things
compile on MMU-less systems as well.

Signed-off-by: Matt Mackall
Cc: Jeremy Fitzhardinge
Cc: David Rientjes
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
29 Jan, 2008
1 commit
-
cat /proc/net/atm/arp causes the NULL pointer dereference in the
get_proc_net+0xc/0x3a. This happens as proc_get_net believes that the
parent proc dir entry contains struct net.

Fix this assumption for "net/atm" case.

The problem is introduced by the commit c0097b07abf5f92ab135d024dd41bd2aada1512f
from Eric W. Biederman/Daniel Lezcano.

Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller
06 Dec, 2007
1 commit
-
Creating PDEs with refcount 0 and "deleted" flag has problems (see below).
Switch to usual scheme:
* PDE is created with refcount 1
* every de_get does +1
* every de_put() and remove_proc_entry() do -1
* once refcount reaches 0, PDE is freed.

This elegantly fixes at least two following races (both observed) without
introducing new locks, without abusing old locks, without spreading
lock_kernel():

1) PDE leak

remove_proc_entry                       de_put
-----------------                       ------
                        [refcnt = 1]
if (atomic_read(&de->count) == 0)
                                        if (atomic_dec_and_test(&de->count))
                                                if (de->deleted)
                                                        /* also not taken! */
                                                        free_proc_entry(de);
else
        de->deleted = 1;
                [refcount=0, deleted=1]

2) use after free

remove_proc_entry                       de_put
-----------------                       ------
                        [refcnt = 1]

                                        if (atomic_dec_and_test(&de->count))
if (atomic_read(&de->count) == 0)
        free_proc_entry(de);
                                                /* boom! */
                                                if (de->deleted)
                                                        free_proc_entry(de);

BUG: unable to handle kernel paging request at virtual address 6b6b6b6b
printing eip: c10acdda *pdpt = 00000000338f8001 *pde = 0000000000000000
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom
Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c0863403f109a43d7000b4646da4818220d501f #4)
EIP: 0060:[] EFLAGS: 00210097 CPU: 1
EIP is at strnlen+0x6/0x18
EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe
ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000)
Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400
c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400
f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34
Call Trace:
[] vsnprintf+0x2ad/0x49b
[] vscnprintf+0x14/0x1f
[] vprintk+0xc5/0x2f9
[] handle_fasteoi_irq+0x0/0xab
[] do_IRQ+0x9f/0xb7
[] preempt_schedule_irq+0x3f/0x5b
[] need_resched+0x1f/0x21
[] printk+0x1b/0x1f
[] de_put+0x3d/0x50
[] proc_delete_inode+0x38/0x41
[] proc_delete_inode+0x0/0x41
[] generic_delete_inode+0x5e/0xc6
[] iput+0x60/0x62
[] d_kill+0x2d/0x46
[] dput+0xdc/0xe4
[] __fput+0xb0/0xcd
[] filp_close+0x48/0x4f
[] sys_close+0x67/0xa5
[] sysenter_past_esp+0x5f/0x85
=======================
Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9
EIP: [] strnlen+0x6/0x18 SS:ESP 0068:f380be44

Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds,
module is already pinned and remove_proc_entry() can't happen => nobody
can mark PDE deleted.

Dummy proc root in netns code is not marked with refcount 1. AFAICS, we
never get it, it's just for proper /proc/net removal. I double checked
CLONE_NETNS continues to work.

Patch survives many hours of modprobe/rmmod/cat loops without new bugs
which can be attributed to refcounting.

Signed-off-by: Alexey Dobriyan
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
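The "usual scheme" listed at the top of this message (created with refcount 1, every get adds one, every put subtracts one, freed at zero) can be sketched with C11 atomics. The names `pde_ref_sketch`, `pde_new_sketch()`, `pde_get_sketch()` and `pde_put_sketch()` are hypothetical; this is a userspace illustration, not the kernel's de_get()/de_put().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct pde_ref_sketch {
    atomic_int count;
};

/* A new object starts with one reference, owned by its creator. */
struct pde_ref_sketch *pde_new_sketch(void)
{
    struct pde_ref_sketch *p = malloc(sizeof(*p));

    if (p)
        atomic_init(&p->count, 1);
    return p;
}

/* Take an additional reference. */
void pde_get_sketch(struct pde_ref_sketch *p)
{
    atomic_fetch_add(&p->count, 1);
}

/*
 * Drop a reference. Whoever brings the count from 1 to 0 frees the object;
 * returns true in that case so the caller knows the pointer is now dead.
 */
bool pde_put_sketch(struct pde_ref_sketch *p)
{
    if (atomic_fetch_sub(&p->count, 1) == 1) {
        free(p);
        return true;
    }
    return false;
}
```

Because the decrement and the "was it the last one" test are a single atomic operation, there is no separate flag to check, which is what removes both races shown in the diagrams.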
01 Dec, 2007
1 commit
-
Well I clearly goofed when I added the initial network namespace support
for /proc/net. Currently things work but there are odd details visible to
user space, even when we have a single network namespace.

Since we do not cache proc_dir_entry dentries at the moment we can just
modify ->lookup to return a different directory inode depending on the
network namespace of the process looking at /proc/net, replacing the
current technique of using a magic and fragile follow_link method.

To accomplish that this patch:
- introduces a shadow_proc method to allow different dentries to
be returned from proc_lookup.
- Removes the old /proc/net follow_link magic
- Fixes a weakness in our not caching of proc generic dentries.

As shadow_proc uses a task struct to decide which dentry to return we can
go back later and fix the proc generic caching without modifying any code
that uses the shadow_proc method.

Signed-off-by: Eric W. Biederman
Cc: "Rafael J. Wysocki"
Cc: Pavel Machek
Cc: Pavel Emelyanov
Cc: "David S. Miller"
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Herbert Xu