31 Mar, 2006
3 commits
-
Woe be unto he who builds their filesystems as modules.
Signed-off-by: Jeff Garzik
[ Obscure quote from the infamous geek bible? ]
Signed-off-by: Linus Torvalds -
This enables the caller to migrate pages from one address space page
cache to another. In buzz word marketing, you can do zero-copy file
copies!

Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds -
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).

From the splice.c comments:

"splice": joining two ropes together by interweaving their strands.
This is the "extended pipe" functionality, where a pipe is used as
an arbitrary in-memory buffer. Think of a pipe as a small kernel
buffer that you can use to transfer data from one end to the other.

The traditional unix read/write is extended with a "splice()" operation
that transfers data buffers to or from a pipe buffer.

Named by Larry McVoy, original implementation from Linus, extended by
Jens to support splicing to files and fixing the initial implementation
bugs.

Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds
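To make the "pipe as transport" idea from the splice commit concrete, here is a minimal user-space sketch of a file-to-file copy through a pipe. It assumes the splice(2) wrapper as exposed by current glibc under _GNU_SOURCE; the system call interface introduced by this 2006 commit may have differed in its arguments. The helper name copy_via_splice is hypothetical and not part of any kernel or libc API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy up to `len` bytes from in_fd to out_fd, using a pipe as the
 * in-kernel transport. Returns the number of bytes copied, or -1 with
 * errno set. (copy_via_splice is an illustrative name, not a real API.)
 */
ssize_t copy_via_splice(int in_fd, int out_fd, size_t len)
{
    int pipefd[2];
    ssize_t total = 0;
    int saved_errno = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(pipefd) < 0)
        return -1;

    while (len > 0 && saved_errno == 0) {
        /* Stage 1: source file -> pipe (at most one pipe's worth). */
        ssize_t in = splice(in_fd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
        if (in < 0) {
            if (errno == EINTR)
                continue;
            saved_errno = errno;
            break;
        }
        if (in == 0)
            break;                      /* end of input */
        len -= (size_t)in;

        /* Stage 2: pipe -> destination, draining what was just queued. */
        while (in > 0) {
            ssize_t out = splice(pipefd[0], NULL, out_fd, NULL,
                                 (size_t)in, SPLICE_F_MOVE);
            if (out < 0 && errno == EINTR)
                continue;
            if (out <= 0) {
                saved_errno = out < 0 ? errno : EIO;
                break;
            }
            in -= out;
            total += out;
        }
    }

    close(pipefd[0]);
    close(pipefd[1]);
    if (saved_errno) {
        errno = saved_errno;
        return -1;
    }
    return total;
}
```

A caller would open both files, invoke copy_via_splice(in, out, size), and compare the return value with the expected size. The data never passes through a user-space buffer; only the pipe holds it between the two splice() calls.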
30 Mar, 2006
2 commits
-
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (67 commits)
[PATCH] powerpc: Remove oprofile spinlock backtrace code
[PATCH] powerpc: Add oprofile calltrace support to all powerpc cpus
[PATCH] powerpc: Add oprofile calltrace support
[PATCH] for_each_possible_cpu: ppc
[PATCH] for_each_possible_cpu: powerpc
[PATCH] lock PTE before updating it in 440/BookE page fault handler
[PATCH] powerpc: Kill _machine and hard-coded platform numbers
ppc: Fix compile error in arch/ppc/lib/strcase.c
[PATCH] git-powerpc: WARN was a dumb idea
[PATCH] powerpc: a couple of trivial compile warning fixes
powerpc: remove OCP references
powerpc: Make uImage default build output for MPC8540 ADS
powerpc: move math-emu over to arch/powerpc
powerpc: use memparse() for mem= command line parsing
ppc: fix strncasecmp prototype
[PATCH] powerpc: make ISA floppies work again
[PATCH] powerpc: Fix some initcall return values
[PATCH] powerpc: Workaround for pSeries RTAS bug
[PATCH] spufs: fix __init/__exit annotations
[PATCH] powerpc: add hvc backend for rtas
... -
* git://oss.sgi.com:8090/oss/git/xfs-2.6:
[XFS] Cleanup in XFS after recent get_block_t interface tweaks.
[XFS] Remove unused/obsoleted function: xfs_bmap_do_search_extents()
[XFS] A change to inode chunk allocation to try allocating the new chunk
Fixes a regression from the recent "remove ->get_blocks() support"
[XFS] Fix compiler warning and small code inconsistencies in compat
[XFS] We really suck at spulling. Thanks to Chris Pascoe for fixing all
29 Mar, 2006
21 commits
-
This patch borrows Hugh's clever 'struct anon_vma' trick.

Without tasklist_lock held we can't trust task->sighand until we have
locked it and re-checked that it is still the same.

But this means we don't need to defer 'kmem_cache_free(sighand)'. We can
return the memory to slab immediately; all we need is to be sure that
sighand->siglock can't disappear inside an rcu protected section.

To do so we need to initialize ->siglock inside the ctor function;
SLAB_DESTROY_BY_RCU does the rest.

Signed-off-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
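The "lock it, then re-check that it is still the same" step from the sighand commit above can be sketched in user space as follows. This is only an analogy under stated assumptions: sighand_like, task_like and the helper names are hypothetical stand-ins, a C11 atomic_flag plays the role of ->siglock, and the objects are never freed, which stands in for the guarantee that SLAB_DESTROY_BY_RCU gives inside an RCU read-side section. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for sighand_struct. */
struct sighand_like {
    atomic_flag lock;       /* plays the role of ->siglock */
    int generation;
};

/* Hypothetical stand-in for task_struct. */
struct task_like {
    _Atomic(struct sighand_like *) sighand;
};

static void sighand_spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                   /* busy-wait until the flag was clear */
}

void unlock_sighand_like(struct sighand_like *sh)
{
    atomic_flag_clear_explicit(&sh->lock, memory_order_release);
}

/*
 * Return the task's sighand with its lock held, or NULL if the task has
 * none. The pointer is read without any outer lock, so it is only
 * trusted after the lock is taken and the pointer is read again.
 */
struct sighand_like *lock_sighand_like(struct task_like *t)
{
    assert(t != NULL);
    for (;;) {
        struct sighand_like *sh = atomic_load(&t->sighand);
        if (sh == NULL)
            return NULL;
        sighand_spin_lock(&sh->lock);
        if (sh == atomic_load(&t->sighand))
            return sh;              /* unchanged: the lock is the right one */
        unlock_sighand_like(sh);    /* changed under us: retry */
    }
}
```

The retry loop is only safe if locking a stale object is harmless, which is why the commit insists that the lock be initialized once in the slab constructor rather than on every allocation.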
add_parent(p, parent) is always called with parent == p->parent, and it makes
no sense to do it differently. This patch removes this argument.

No changes in affected .o files.
Signed-off-by: Oleg Nesterov
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
switch_exec_pids is only called from de_thread by way of exec, and it is
only called when we are exec'ing from a non thread group leader.

Currently switch_exec_pids gives the leader the pid of the thread and
unhashes and rehashes all of the process groups. The leader is already in
the EXIT_DEAD state so no one cares about its pids. The only concern for
the leader is that __unhash_process called from release_task will function
correctly. If we don't touch the leader at all we know that
__unhash_process will work fine so there is no need to touch the leader.

For the task becoming the thread group leader, we just need to give it the
pid of the old thread group leader, add it to the task list, and attach it
to the session and the process group of the thread group.

Currently de_thread is also adding the task to the task list which is just
silly.

Currently the only caller of __detach_pid besides detach_pid is
switch_exec_pids because of the ugly extra work that was being
performed.

So this patch removes switch_exec_pids because it is doing too much, it is
creating an unnecessary special case in pid.c, doing work duplicated in
de_thread, and generally obscuring what is going on.

The necessary work is added to de_thread, and it seems to be a little
clearer there what is going on.

Signed-off-by: Eric W. Biederman
Cc: Oleg Nesterov
Cc: Kirill Korotaev
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
I think it is enough to take tasklist_lock for reading while changing
child_reaper:

    Reparenting needs write_lock(tasklist_lock)

    Only one thread in a thread group can do exec()

    sighand->siglock guarantees that get_signal_to_deliver()
    will not see a stale value of child_reaper.

This means that we can change child_reaper earlier, without calling
zap_other_threads() twice.

"child_reaper = current" is a NOOP when init does exec from main thread, we
don't care.

Signed-off-by: Oleg Nesterov
Acked-by: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
After looking at the problem of init calling exec some more I figured out
an easy way to make the code work.

The actual symptom without this patch is that all threads will die
except pid == 1, and the thread calling exec. The thread calling exec will
wait forever for pid == 1 to die.

Since pid == 1 does not install a handler for SIGKILL it will never die.

This modifies the tests for init from current->pid == 1 to the equivalent
current == child_reaper. And then it causes exec in the ugly case to
modify child_reaper.

The only weird symptom is that you wind up with an init process that
doesn't have the oldest start time on the box.

Signed-off-by: Eric W. Biederman
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Nathan Scott
-
SGI-PV: 951415
SGI-Modid: xfs-linux-melb:xfs-kern:208490a

Signed-off-by: Mandy Kirkconnell
Signed-off-by: Nathan Scott -
contiguous with the most recently allocated chunk. On a striped
filesystem, this will fill a stripe unit with inodes before allocating new
inodes in another stripe unit.

SGI-PV: 951416
SGI-Modid: xfs-linux-melb:xfs-kern:208488a

Signed-off-by: Glen Overby
Signed-off-by: Nathan Scott -
change. inode->i_blkbits should be used when making a get_block_t
request of a filesystem instead of dio->blkbits, as that does not
indicate the filesystem block size all the time (depends on request
alignment - see start of __blockdev_direct_IO).

Signed-off-by: Nathan Scott
Acked-by: Badari Pulavarty -
ioctl32 land.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25590a

Signed-off-by: Nathan Scott
-
these typos.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25539a

Signed-off-by: Nathan Scott
-
Fix a lot of typos. Eyeballed by jmc@ in OpenBSD.
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is a conversion to make the various file_operations structs in fs/
const. Basically a regexp job, with a few manual fixups.

The goal is both to increase correctness (harder to accidentally write to
shared data structures) and to reduce the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read only and thus
cache clean).

Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Mark the f_ops members of inodes as const, as well as fix the
ripple-through this causes by places that copy this f_ops and then "do
stuff" with it.

Signed-off-by: Arjan van de Ven
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
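The two const conversions above follow a common C pattern: an operations table is defined once as const, and the objects that use it hold a pointer-to-const. The sketch below illustrates that pattern in plain user-space C. All names (demo_file_ops, demo_inode, demo_fops and the helpers) are hypothetical and only loosely modelled on file_operations and the inode f_ops member; whether the table really lands in .rodata depends on the compiler and linker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much reduced analogue of struct file_operations. */
struct demo_file_ops {
    size_t (*read)(char *buf, size_t len);
    const char *name;
};

static size_t demo_read(char *buf, size_t len)
{
    static const char data[] = "hello";
    size_t n = len < sizeof data - 1 ? len : sizeof data - 1;

    memcpy(buf, data, n);
    return n;
}

/*
 * Declared const: the compiler rejects writes through this name and is
 * free to place the table in read-only data instead of writable .data.
 */
static const struct demo_file_ops demo_fops = {
    .read = demo_read,
    .name = "demo",
};

/* Hypothetical analogue of an inode holding a pointer to const ops. */
struct demo_inode {
    const struct demo_file_ops *i_fop;
};

void demo_inode_init(struct demo_inode *inode)
{
    assert(inode != NULL);
    inode->i_fop = &demo_fops;
}

size_t demo_inode_read(const struct demo_inode *inode, char *buf, size_t len)
{
    assert(inode != NULL && inode->i_fop != NULL && inode->i_fop->read != NULL);
    return inode->i_fop->read(buf, len);
}
```

Code that used to copy an ops table and modify the copy must now keep its own non-const copy explicitly, which is the "ripple-through" the second commit refers to.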
replaces for_each_cpu with for_each_possible_cpu().
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove an unnecessary level of indirection in allocating and freeing select
bits, as per the select_bits_alloc() and select_bits_free() functions.
Both select.c and compat.c are updated.

Signed-off-by: Vadim Lobanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Optimize select and poll by using stack space for small fd sets.

This brings back an old optimization from Linux 2.0. Using the stack is
faster than kmalloc. On an Intel P4 system it speeds up a select of a
single pty fd by about 13% (~4000 cycles -> ~3500).

It also saves memory because a daemon hanging in select or poll will
usually use one or two fewer pages. This can add up - e.g. if you have 10
daemons blocking in poll/select you save 40KB of memory.

I did a patch for this long ago, but it was never applied. This version is
a reimplementation of the old patch that tries to be less intrusive. I
only did the minimal changes needed for the stack allocation.

The cut off point before external memory is allocated is currently at
832 bytes. The system calls always allocate this much memory on the stack.

These 832 bytes are divided into 256 bytes frontend data (for the select
bitmaps or the pollfds) and the rest of the space for the wait queues used
by the low level drivers. There are some extreme cases where this won't
work out for select and it falls back to allocating memory too early -
especially with very sparse large select bitmaps - but the majority of
processes that only have a small number of file descriptors should be ok.
[TBD: 832/256 might not be the best split for select or poll]

I suspect more optimizations might be possible, but they would be more
complicated. One way would be to cache the select/poll context over
multiple system calls because typically the input values should be similar.
Problem is when to flush the file descriptors out though.

Signed-off-by: Andi Kleen
Cc: Eric Dumazet
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
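The stack-first strategy from the select/poll commit above (use a fixed on-stack buffer for small requests, fall back to an allocator only for large ones) can be sketched in user space like this. It is an illustration, not the kernel code: count_set_bits and STACK_BUF_BYTES are hypothetical names, malloc stands in for kmalloc, and the 256-byte figure is borrowed from the commit text purely as an example threshold.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative threshold, echoing the 256-byte frontend area quoted above. */
#define STACK_BUF_BYTES 256

/*
 * Copy a bitmap into a working buffer and count its set bits.
 * Small bitmaps use the on-stack buffer; larger ones fall back to the
 * heap. *used_heap reports which path ran. Returns -1 if malloc fails.
 */
long count_set_bits(const unsigned char *bits, size_t nbytes, int *used_heap)
{
    unsigned char stack_buf[STACK_BUF_BYTES];
    unsigned char *work = stack_buf;
    long count = 0;

    assert(bits != NULL && used_heap != NULL);
    *used_heap = 0;

    if (nbytes > sizeof stack_buf) {
        work = malloc(nbytes);          /* slow path: external memory */
        if (work == NULL)
            return -1;
        *used_heap = 1;
    }

    /* Stands in for copying the fd sets in from the caller. */
    memcpy(work, bits, nbytes);

    for (size_t i = 0; i < nbytes; i++)
        for (unsigned char b = work[i]; b != 0; b &= (unsigned char)(b - 1))
            count++;                    /* clear the lowest set bit each pass */

    if (work != stack_buf)
        free(work);
    return count;
}
```

The trade-off matches the one the commit describes: the stack buffer is always reserved, even when unused, in exchange for skipping the allocator in the common small case.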
Add a proper prototype for autofs4_dentry_release() to autofs_i.h.
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add proper prototypes for fat_cache_init() and fat_cache_destroy() in
msdos_fs.h.

Signed-off-by: Adrian Bunk
Acked-by: OGAWA Hirofumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 Mar, 2006
14 commits
-
This removes statically assigned platform numbers and reworks the
powerpc platform probe code to use a better mechanism. With this,
board support files can simply declare a new machine type with a
macro, and implement a probe() function that uses the flattened
device-tree to detect if they apply for a given machine.

We now have a machine_is() macro that replaces the comparisons of
_machine with the various PLATFORM_* constants. This commit also
changes various drivers to use the new macro instead of looking at
_machine.

Signed-off-by: Benjamin Herrenschmidt
Signed-off-by: Paul Mackerras -
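The probe mechanism described in the powerpc platform commit above (each board declares a machine type with a macro and supplies a probe() function; a machine_is()-style test replaces numeric platform comparisons) can be sketched in user space as below. Everything here is hypothetical: DEFINE_MACHINE, MACHINE_IS, machine_desc, the board names and the "compatible" strings are illustrative and are not the kernel's definitions, and a plain string stands in for the flattened device-tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical machine descriptor: a name plus a probe callback. */
struct machine_desc {
    const char *name;
    int (*probe)(const char *compatible);
};

/* Declare a machine type with a macro instead of a fixed number. */
#define DEFINE_MACHINE(id, probe_fn) \
    static const struct machine_desc mach_##id = { #id, probe_fn }

static int probe_alpha(const char *compatible)
{
    return strcmp(compatible, "vendor,alpha-board") == 0;
}

static int probe_beta(const char *compatible)
{
    return strcmp(compatible, "vendor,beta-board") == 0;
}

DEFINE_MACHINE(alpha, probe_alpha);
DEFINE_MACHINE(beta, probe_beta);

static const struct machine_desc *const machine_table[] = {
    &mach_alpha,
    &mach_beta,
};

/* Set by probe_machine(); NULL until a probe succeeds. */
static const struct machine_desc *current_machine;

/* Replaces comparisons against hard-coded platform numbers. */
#define MACHINE_IS(id) (current_machine == &mach_##id)

/* Ask each declared machine whether it matches; the first match wins. */
const struct machine_desc *probe_machine(const char *compatible)
{
    assert(compatible != NULL);
    current_machine = NULL;
    for (size_t i = 0; i < sizeof machine_table / sizeof machine_table[0]; i++) {
        if (machine_table[i]->probe(compatible)) {
            current_machine = machine_table[i];
            break;
        }
    }
    return current_machine;
}
```

Adding a board then means adding one probe function and one DEFINE_MACHINE line, with no central list of numbers to extend.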
Various dodgy firmware might give us nodes and/or properties in the device
tree with conflicting names. That's generally ok, except for when we export
the device tree via /proc, so check when we're creating the proc device tree
and munge names accordingly.

Tested on a faked device tree with kexec, would be good if someone with
actual bogus firmware could try it, but just for completeness.

Signed-off-by: Michael Ellerman
Signed-off-by: Paul Mackerras -
Convert bd_sem to bd_mutex
Signed-off-by: Jun'ichi Nomura
Cc: Alasdair G Kergon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Adding bd_claim_by_kobject() function which takes kobject as additional
signature of holder device and creates sysfs symlinks between holder device
and claimed device. bd_release_from_kobject() is a counterpart of
bd_claim_by_kobject.

Signed-off-by: Jun'ichi Nomura
Cc: Alasdair G Kergon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove all the CONFIG_SYSFS stuff. That's supposed to all be implemented up
in header files.

Yes, the CONFIG_SYSFS=n data structures will be a little larger than
necessary, but that's a tradeoff we can decide to make.

Cc: Jun'ichi Nomura
Cc: Alasdair G Kergon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Creating "slaves" and "holders" directories in /sys/block/ and
creating "holders" directory under /sys/block//

Signed-off-by: Jun'ichi Nomura
Cc: Alasdair G Kergon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Replace for_each_pgdat() with for_each_online_pgdat().
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We can now make some code static.
Signed-off-by: Adrian Bunk
Cc: Neil Brown
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
.. it makes some of the code nicer.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
These were an unnecessary wart. Also only have one 'DefineSimpleCache..'
instead of two.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently svc_expkey holds a pointer to the svc_export structure, so updates to
that structure have to be in-place, which is a wart on the whole cache
infrastructure. So we break that linkage and just do a second lookup.

If this became a performance issue, it would be possible to put a direct link
back in which was only used conditionally. i.e. when an object is replaced
in the cache, we set a flag in the old object. When dereferencing the link
from svc_expkey, if the flag is set, we drop the reference and do a fresh
lookup.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
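The conditional direct link that the last commit message floats as a possible later optimization (flag the old object on replacement; on dereference, drop a flagged link and look up again) could look roughly like the following user-space sketch. The commit itself does not implement this, and all names here (export_obj, key_obj, export_replace, key_get_export, the fixed-size table) are hypothetical; plain integer counters stand in for real reference counting.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXPORTS 4

/* Hypothetical cached object, standing in for an export entry. */
struct export_obj {
    int refs;       /* simplified reference count */
    int replaced;   /* set when a newer object took this one's place */
};

/* Hypothetical key entry that may cache a direct link to an export. */
struct key_obj {
    int export_id;
    struct export_obj *link;
};

/* Current object for each id; the authoritative lookup table. */
static struct export_obj *export_cache[MAX_EXPORTS];

static struct export_obj *export_lookup(int id)
{
    assert(id >= 0 && id < MAX_EXPORTS);
    if (export_cache[id] != NULL)
        export_cache[id]->refs++;
    return export_cache[id];
}

/* Install a new object and flag the one it replaces. */
void export_replace(int id, struct export_obj *fresh)
{
    assert(id >= 0 && id < MAX_EXPORTS && fresh != NULL);
    if (export_cache[id] != NULL)
        export_cache[id]->replaced = 1;
    fresh->replaced = 0;
    export_cache[id] = fresh;
}

/* Follow the direct link only while it is still current. */
struct export_obj *key_get_export(struct key_obj *k)
{
    assert(k != NULL);
    if (k->link != NULL && k->link->replaced) {
        k->link->refs--;            /* drop the stale reference */
        k->link = NULL;
    }
    if (k->link == NULL)
        k->link = export_lookup(k->export_id);  /* fresh lookup */
    return k->link;
}
```

In the common case the link is current and the second lookup is skipped; the flag check is the only extra cost.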