20 Apr, 2006
1 commit
-
There are places in the kernel where we look up files in fd tables and
access the file structure without holding refereces to the file. So, we
need special care to avoid the race between looking up files in the fd
table and tearing down of the file in another CPU. Otherwise, one might
see a NULL f_dentry or such torn down version of the file. This patch
fixes those special places where such a race may happen.Signed-off-by: Dipankar Sarma
Acked-by: "Paul E. McKenney"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
11 Apr, 2006
1 commit
-
A couple of /proc/vmcore data structures overflow with 32bit systems having
memory more than 4G. This patch fixes those.Signed-off-by: Ken'ichi Ohmichi
Signed-off-by: Vivek Goyal
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Apr, 2006
2 commits
-
proc_check_chroot() does the check in a very unintuitive way (keeping a
copy of the argument, then modifying the argument), and has uncommented
sideeffects.Signed-off-by: Herbert Poetzl
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make baby-simple the code for /proc/devices. Based on the proven design
for /proc/interrupts.This also fixes the early-termination regression 2.6.16 introduced, as
demonstrated by:# dd if=/proc/devices bs=1
Character devices:
1 mem
27+0 records in
27+0 records outThis should also work (but is untested) when /proc/devices >4096 bytes,
which I believe is what the original 2.6.16 rewrite fixed.[akpm@osdl.org: cleanups, simplifications]
Signed-off-by: Joe Korty
Cc: Neil Horman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
29 Mar, 2006
4 commits
-
This is a conversion to make the various file_operations structs in fs/
const. Basically a regexp job, with a few manual fixupsThe goal is both to increase correctness (harder to accidentally write to
shared datastructures) and reducing the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read only and thus
cache clean)Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Mark the f_ops members of inodes as const, as well as fix the
ripple-through this causes by places that copy this f_ops and then "do
stuff" with it.Signed-off-by: Arjan van de Ven
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
replaces for_each_cpu with for_each_possible_cpu().
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 Mar, 2006
1 commit
-
Various dodgy firmware might give us nodes and/or properties in the device
tree with conflicting names. That's generally ok, except for when we export
the device tree via /proc, so check when we're creating the proc device tree
and munge names accordingly.Tested on a faked device tree with kexec, would be good if someone with
actual bogus firmware could try it, but just for completeness.Signed-off-by: Michael Ellerman
Signed-off-by: Paul Mackerras
27 Mar, 2006
2 commits
-
Remove the it_real_value from /proc/*/stat, during 1.2.x was the last time it
returned useful data (as it was directly maintained by the scheduler), now
it's only a waste of time to calculate it. Return 0 instead.Signed-off-by: Roman Zippel
Acked-by: Ingo Molnar
Acked-by: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It has been discovered that the remove_proc_entry has a race in the removing
of entries in the proc file system that are siblings. There's no protection
around the traversing and removing of elements that belong in the same
subdirectory.This subdirectory list is protected in other areas by the BKL. So the BKL was
at first used to protect this area too, but unfortunately, remove_proc_entry
may be called with spinlocks held. The BKL may schedule, so this was not a
solution.The final solution was to add a new global spin lock to protect this list,
called proc_subdir_lock. This lock now protects the list in
remove_proc_entry, and I also went around looking for other areas that this
list is modified and added this protection there too. Care must be taken
since these locations call several functions that may also schedule.Since I don't see any location that these functions that modify the
subdirectory list are called by interrupts, the irqsave/restore versions of
the spin lock was _not_ used.Signed-off-by: Steven Rostedt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 Mar, 2006
2 commits
-
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (103 commits)
SUNRPC,RPCSEC_GSS: spkm3--fix config dependencies
SUNRPC,RPCSEC_GSS: spkm3: import contexts using NID_cast5_cbc
LOCKD: Make nlmsvc_traverse_shares return void
LOCKD: nlmsvc_traverse_blocks return is unused
SUNRPC,RPCSEC_GSS: fix krb5 sequence numbers.
NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.
SUNRPC,RPCSEC_GSS: remove unnecessary kmalloc of a checksum
SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
SUNRPC: Fix memory barriers for req->rq_received
NFS: Fix a race in nfs_sync_inode()
NFS: Clean up nfs_flush_list()
NFS: Fix a race with PG_private and nfs_release_page()
NFSv4: Ensure the callback daemon flushes signals
SUNRPC: Fix a 'Busy inodes' error in rpc_pipefs
NFS, NLM: Allow blocking locks to respect signals
NFS: Make nfs_fhget() return appropriate error values
NFSv4: Fix an oops in nfs4_fill_super
lockd: blocks should hold a reference to the nlm_file
NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE
NFSv4: Send the delegation stateid for SETATTR calls
... -
Implement /proc/slab_allocators. It produces output like:
idr_layer_cache: 80 idr_pre_get+0x33/0x4e
buffer_head: 2555 alloc_buffer_head+0x20/0x75
mm_struct: 9 mm_alloc+0x1e/0x42
mm_struct: 20 dup_mm+0x36/0x370
vm_area_struct: 384 dup_mm+0x18f/0x370
vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
vm_area_struct: 1 split_vma+0x5a/0x10e
vm_area_struct: 11 do_brk+0x206/0x2e2
vm_area_struct: 2 copy_vma+0xda/0x142
vm_area_struct: 9 setup_arg_pages+0x99/0x214
fs_cache: 8 copy_fs_struct+0x21/0x133
fs_cache: 29 copy_process+0xf38/0x10e3
files_cache: 30 alloc_files+0x1b/0xcf
signal_cache: 81 copy_process+0xbaa/0x10e3
sighand_cache: 77 copy_process+0xe65/0x10e3
sighand_cache: 1 de_thread+0x4d/0x5f8
anon_vma: 241 anon_vma_prepare+0xd9/0xf3
size-2048: 1 add_sect_attrs+0x5f/0x145
size-2048: 2 journal_init_revoke+0x99/0x302
size-2048: 2 journal_init_revoke+0x137/0x302
size-2048: 2 journal_init_inode+0xf9/0x1c4Cc: Manfred Spraul
Cc: Alexander Nyberg
Cc: Pekka Enberg
Cc: Christoph Lameter
Cc: Ravikiran Thirumalai
Signed-off-by: Al Viro
DESC
slab-leaks3-locking-fix
EDESC
From: Andrew MortonUpdate for slab-remove-cachep-spinlock.patch
Cc: Al Viro
Cc: Manfred Spraul
Cc: Alexander Nyberg
Cc: Pekka Enberg
Cc: Christoph Lameter
Cc: Ravikiran Thirumalai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
24 Mar, 2006
3 commits
-
Rewrap the overly long source code lines resulting from the previous
patch's addition of the slab cache flag SLAB_MEM_SPREAD. This patch
contains only formatting changes, and no function change.Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Mark file system inode and similar slab caches subject to SLAB_MEM_SPREAD
memory spreading.If a slab cache is marked SLAB_MEM_SPREAD, then anytime that a task that's
in a cpuset with the 'memory_spread_slab' option enabled goes to allocate
from such a slab cache, the allocations are spread evenly over all the
memory nodes (task->mems_allowed) allowed to that task, instead of favoring
allocation on the node local to the current cpu.The following inode and similar caches are marked SLAB_MEM_SPREAD:
file cache
==== =====
fs/adfs/super.c adfs_inode_cache
fs/affs/super.c affs_inode_cache
fs/befs/linuxvfs.c befs_inode_cache
fs/bfs/inode.c bfs_inode_cache
fs/block_dev.c bdev_cache
fs/cifs/cifsfs.c cifs_inode_cache
fs/coda/inode.c coda_inode_cache
fs/dquot.c dquot
fs/efs/super.c efs_inode_cache
fs/ext2/super.c ext2_inode_cache
fs/ext2/xattr.c (fs/mbcache.c) ext2_xattr
fs/ext3/super.c ext3_inode_cache
fs/ext3/xattr.c (fs/mbcache.c) ext3_xattr
fs/fat/cache.c fat_cache
fs/fat/inode.c fat_inode_cache
fs/freevxfs/vxfs_super.c vxfs_inode
fs/hpfs/super.c hpfs_inode_cache
fs/isofs/inode.c isofs_inode_cache
fs/jffs/inode-v23.c jffs_fm
fs/jffs2/super.c jffs2_i
fs/jfs/super.c jfs_ip
fs/minix/inode.c minix_inode_cache
fs/ncpfs/inode.c ncp_inode_cache
fs/nfs/direct.c nfs_direct_cache
fs/nfs/inode.c nfs_inode_cache
fs/ntfs/super.c ntfs_big_inode_cache_name
fs/ntfs/super.c ntfs_inode_cache
fs/ocfs2/dlm/dlmfs.c dlmfs_inode_cache
fs/ocfs2/super.c ocfs2_inode_cache
fs/proc/inode.c proc_inode_cache
fs/qnx4/inode.c qnx4_inode_cache
fs/reiserfs/super.c reiser_inode_cache
fs/romfs/inode.c romfs_inode_cache
fs/smbfs/inode.c smb_inode_cache
fs/sysv/inode.c sysv_inode_cache
fs/udf/super.c udf_inode_cache
fs/ufs/super.c ufs_inode_cache
net/socket.c sock_inode_cache
net/sunrpc/rpc_pipe.c rpc_inode_cacheThe choice of which slab caches to so mark was quite simple. I marked
those already marked SLAB_RECLAIM_ACCOUNT, except for fs/xfs, dentry_cache,
inode_cache, and buffer_head, which were marked in a previous patch. Even
though SLAB_RECLAIM_ACCOUNT is for a different purpose, it marks the same
potentially large file system i/o related slab caches as we need for memory
spreading.Given that the rule now becomes "wherever you would have used a
SLAB_RECLAIM_ACCOUNT slab cache flag before (usually the inode cache), use
the SLAB_MEM_SPREAD flag too", this should be easy enough to maintain.
Future file system writers will just copy one of the existing file system
slab cache setups and tend to get it right without thinking.Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Mar, 2006
1 commit
-
Fix a duplicate block device line printed after the "Block device" header
in /proc/devices.Signed-off-by: Neil Horman
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Mar, 2006
1 commit
-
Create a new file under /proc/self, called mountstats, where mounted file
systems can export information (configuration options, performance counters,
and so on). Use a mechanism similar to /proc/mounts and s_ops->show_options.This mechanism does not violate namespace security, and is safe to use while
other processes are unmounting file systems.Thanks to Mike Waychison for his review and comments.
Test-plan:
Test concurrent mount/unmount operations while cat'ing /proc/self/mountstats.Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust
07 Mar, 2006
2 commits
-
The point of the smaps "shared" is to count the number of pages that are
mapped by more than one process, according to Mauricio Lin. However, smaps
uses page_count for this, so it will return a false positive for every page
that is mapped by just that one process, which is also in pagecache or
swapcache. There are false positive situations for anonymous pages not in
swapcache as well: - page reclaim, migration - get_user_pages (eg.
direct-io, ptrace)Use page_mapcount instead, to count the number of mappings to the page.
Use vm_normal_page so that weird things like /dev/mem aren't counted either.
Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
smaps doesn't have a hugepage pagetable walker. Skip walking hugepage
vmas.Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Feb, 2006
1 commit
-
1) it should use nr_processes(), not nr_threads; otherwise we are getting
very confused find(1) and friends, among other things.
2) better do that at stat() time than at every damn lookup in procfs root.Patch had been sitting in FC4 kernels for many months now...
Signed-off-by: Al Viro
04 Feb, 2006
1 commit
-
Don't compute and display the per-irq sums on ia64 either, too much
overhead for mostly useless figures.Cc: Olaf Hering
Acked-by: "Luck, Tony"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Jan, 2006
1 commit
-
A Christoph suggested that the /proc/devices file be converted to use the
seq_file interface. This patch does that.I've obxerved one or two installation that had sufficiently large sans that
they overran the 4k limit on /proc/devices.Signed-off-by: Neil Horman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
13 Jan, 2006
1 commit
-
Add support to the proc_device_tree file for removing
and updating properties. Remove just removes the
proc file, update changes the data pointer within
the proc file. The remainder of the device-tree
changes occur elsewhere.Signed-off-by: Dave Boutcher
Signed-off-by: Paul Mackerras
12 Jan, 2006
2 commits
-
fs: Use where capable() is used.
Signed-off-by: Randy Dunlap
Acked-by: Tim Schmielau
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
o fs/proc/vmcore.c compilation gives warnings on ppc64. The reason being
that u64 is defined as unsigned long hence u64* is not same as loff_t*
and compiler cribs.o Changed the parameter type to u64* instead of loff_t* to resolve the
conflict.Signed-off-by: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
11 Jan, 2006
3 commits
-
Header included twice.
Signed-off-by: Nicolas Kaiser
Signed-off-by: Adrian Bunk -
switch itimers to a hrtimers-based implementation
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
- Moving the crash_dump.c file to arch dependent part as kmap_atomic_pfn is
specific to i386 and highmem may not exist in other archs.- Use ioremap for x86_64 to map the previous kernel memory.
- In copy_oldmem_page(), we now directly copy to the user/kernel buffer and
avoid the unneccesary copy to a kmalloc'd page.Signed-off-by: Rachita Kothiyal
Signed-off-by: Vivek Goyal
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 Jan, 2006
3 commits
-
Function prototypes belong into header files.
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
configurable replacement for slab allocator
This adds a CONFIG_SLAB option under CONFIG_EMBEDDED. When CONFIG_SLAB is
disabled, the kernel falls back to using the 'SLOB' allocator.SLOB is a traditional K&R/UNIX allocator with a SLAB emulation layer,
similar to the original Linux kmalloc allocator that SLAB replaced. It's
signicantly smaller code and is more memory efficient. But like all
similar allocators, it scales poorly and suffers from fragmentation more
than SLAB, so it's only appropriate for small systems.It's been tested extensively in the Linux-tiny tree. I've also
stress-tested it with make -j 8 compiles on a 3G SMP+PREEMPT box (not
recommended).Here's a comparison for otherwise identical builds, showing SLOB saving
nearly half a megabyte of RAM:$ size vmlinux*
text data bss dec hex filename
3336372 529360 190812 4056544 3de5e0 vmlinux-slab
3323208 527948 190684 4041840 3dac70 vmlinux-slob$ size mm/{slab,slob}.o
text data bss dec hex filename
13221 752 48 14021 36c5 mm/slab.o
1896 52 8 1956 7a4 mm/slob.o/proc/meminfo:
SLAB SLOB delta
MemTotal: 27964 kB 27980 kB +16 kB
MemFree: 24596 kB 25092 kB +496 kB
Buffers: 36 kB 36 kB 0 kB
Cached: 1188 kB 1188 kB 0 kB
SwapCached: 0 kB 0 kB 0 kB
Active: 608 kB 600 kB -8 kB
Inactive: 808 kB 812 kB +4 kB
HighTotal: 0 kB 0 kB 0 kB
HighFree: 0 kB 0 kB 0 kB
LowTotal: 27964 kB 27980 kB +16 kB
LowFree: 24596 kB 25092 kB +496 kB
SwapTotal: 0 kB 0 kB 0 kB
SwapFree: 0 kB 0 kB 0 kB
Dirty: 4 kB 12 kB +8 kB
Writeback: 0 kB 0 kB 0 kB
Mapped: 560 kB 556 kB -4 kB
Slab: 1756 kB 0 kB -1756 kB
CommitLimit: 13980 kB 13988 kB +8 kB
Committed_AS: 4208 kB 4208 kB 0 kB
PageTables: 28 kB 28 kB 0 kB
VmallocTotal: 1007312 kB 1007312 kB 0 kB
VmallocUsed: 48 kB 48 kB 0 kB
VmallocChunk: 1007264 kB 1007264 kB 0 kB(this work has been sponsored in part by CELF)
From: Ingo Molnar
Fix 32-bitness bugs in mm/slob.c.
Signed-off-by: Matt Mackall
Signed-off-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
First discussed at http://marc.theaimsgroup.com/?t=113149255100001&r=1&w=2
- Use the check_range() in mempolicy.c to gather statistics.
- Improve the numa_maps code in general and fix some comments.
Signed-off-by: Christoph Lameter
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Jan, 2006
1 commit
-
Sanitize some s390 Kconfig options. We have ARCH_S390, ARCH_S390X,
ARCH_S390_31, 64BIT, S390_SUPPORT and COMPAT. Replace these 6 options by
S390, 64BIT and COMPAT.Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
31 Dec, 2005
1 commit
-
The old /proc interfaces were never updated to use loff_t, and are just
generally broken. Now, we should be using the seq_file interface for
all of the proc files, but converting the legacy functions is more work
than most people care for and has little upside..But at least we can make the non-LFS rules explicit, rather than just
insanely wrapping the offset or something.Signed-off-by: Linus Torvalds
29 Nov, 2005
1 commit
-
This replaces the (in my opinion horrible) VM_UNMAPPED logic with very
explicit support for a "remapped page range" aka VM_PFNMAP. It allows a
VM area to contain an arbitrary range of page table entries that the VM
never touches, and never considers to be normal pages.Any user of "remap_pfn_range()" automatically gets this new
functionality, and doesn't even have to mark the pages reserved or
indeed mark them any other way. It just works. As a side effect, doing
mmap() on /dev/mem works for arbitrary ranges.Sparc update from David in the next commit.
Signed-off-by: Linus Torvalds
14 Nov, 2005
1 commit
-
fs/proc/task_mmu.c:198:33: warning: Using plain integer as NULL pointer
Signed-off-by: Luiz Capitulino
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 Nov, 2005
3 commits
-
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds -
This patch adds the ability to the SMU driver to recover missing
calibration partitions from the SMU chip itself. It also adds some
dynamic mecanism to /proc/device-tree so that new properties are visible
to userland.Signed-off-by: Benjamin Herrenschmidt
Signed-off-by: Paul Mackerras
31 Oct, 2005
1 commit