14 Jan, 2012

1 commit

  • Since commit 080d676de095 ("aio: allocate kiocbs in batches"), iocbs are
    allocated in a batch during processing of the first iocbs. All iocbs in a
    batch are automatically added to the ctx->active_reqs list and accounted
    in ctx->reqs_active.

    If one of the iocbs submitted by a user (not the last one) fails, further
    iocbs are not processed, but they are still present in ctx->active_reqs
    and accounted in ctx->reqs_active. This causes the process to get stuck
    in D state in wait_for_all_aios() on exit, since ctx->reqs_active will
    never go down to zero. Furthermore, since kiocb_batch_free() frees an
    iocb without removing it from the active_reqs list, the list becomes
    corrupted, which may cause an oops.

    Fix this by removing iocb from ctx->active_reqs and updating
    ctx->reqs_active in kiocb_batch_free().
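
    A sketch of the fix (illustrative only; the field and lock names are
    taken from the 3.2-era fs/aio.c and may not match the patch exactly):

        static void kiocb_batch_free(struct kioctx *ctx, struct kiocb_batch *batch)
        {
                struct kiocb *req, *n;
                unsigned long flags;

                spin_lock_irqsave(&ctx->ctx_lock, flags);
                list_for_each_entry_safe(req, n, &batch->head, ki_batch) {
                        list_del(&req->ki_batch);
                        list_del(&req->ki_list);  /* unlink from ctx->active_reqs */
                        kmem_cache_free(kiocb_cachep, req);
                        ctx->reqs_active--;       /* let wait_for_all_aios() finish */
                }
                spin_unlock_irqrestore(&ctx->ctx_lock, flags);
        }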

    Signed-off-by: Gleb Natapov
    Reviewed-by: Jeff Moyer
    Cc: stable@kernel.org # 3.2
    Signed-off-by: Linus Torvalds

    Gleb Natapov
     

03 Nov, 2011

1 commit

  • In testing aio on a fast storage device, I found that the context lock
    takes up a fair amount of cpu time in the I/O submission path. The reason
    is that we take it for every I/O submitted (see __aio_get_req). Since we
    know how many I/Os are passed to io_submit, we can preallocate the kiocbs
    in batches, reducing the number of times we take and release the lock.

    In my testing, I was able to reduce the amount of time spent in
    _raw_spin_lock_irq by 0.56% (average of 3 runs). The command I used to
    test this was:

    aio-stress -O -o 2 -o 3 -r 8 -d 128 -b 32 -i 32 -s 16384

    I also tested the patch with various numbers of events passed to
    io_submit, and I ran the xfstests aio group of tests to ensure I didn't
    break anything.
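
    For reference, a sketch of the batching structure involved (illustrative;
    the names follow the commit title rather than a verified source listing):

        struct kiocb_batch {
                struct list_head head;  /* preallocated, not-yet-used kiocbs */
                long count;             /* how many are still available */
        };

    The idea is that the submission path draws kiocbs from the batch and
    takes ctx->ctx_lock per refill rather than per request.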

    Signed-off-by: Jeff Moyer
    Cc: Daniel Ehrenberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     

01 Nov, 2011

1 commit

  • The basic idea behind cross memory attach is to allow MPI programs doing
    intra-node communication to do a single copy of the message rather than a
    double copy of the message via shared memory.

    The following patch attempts to achieve this by allowing a destination
    process, given an address and size from a source process, to copy memory
    directly from the source process into its own address space via a system
    call. There is also a symmetrical ability to copy from the current
    process's address space into a destination process's address space.

    - Use of /proc/pid/mem has been considered, but there are issues with
      using it:
      - It does not allow for specifying iovecs for both src and dest;
        assuming preadv or pwritev was implemented, either the area read
        from or written to would need to be contiguous.
      - Currently mem_read allows only processes which are currently
        ptrace'ing the target, and are still able to ptrace the target, to
        read from the target. This check could possibly be moved to the
        open call, but it's not clear exactly what race this restriction is
        stopping (the reason appears to have been lost).
      - Having to send the fd of /proc/self/mem via SCM_RIGHTS on a unix
        domain socket is a bit ugly from a userspace point of view,
        especially when you may have hundreds if not (eventually) thousands
        of processes that all need to do this with each other.
      - It doesn't allow for some future uses of the interface we would
        like to consider adding (see below).
      - Interestingly, reading from /proc/pid/mem currently actually
        involves two copies! (But this could be fixed pretty easily.)

    As mentioned previously, use of vmsplice instead was considered, but it
    has problems. Since you need the reader and writer working
    co-operatively, if the pipe is not drained then you block, which
    requires some wrapping to do non-blocking I/O on the send side or
    polling on the receive side. In all-to-all communication it requires
    ordering, otherwise you can deadlock. And in the example of many MPI
    tasks writing to one MPI task, vmsplice serialises the copying.

    There are some cases of MPI collectives where even a single-copy
    interface does not get us all the performance gain we could have. For
    example, in an MPI_Reduce, rather than copy the data from the source we
    would like to instead use it directly in a math op (say the reduce is
    doing a sum), as this would save us doing a copy. We don't need to keep
    a copy of the data from the source. I haven't implemented this, but I
    think this interface could in the future do all of this through the use
    of the flags - e.g. one could specify the math operation and type, and
    the kernel, rather than just copying the data, would apply the
    specified operation between the source and destination and store the
    result in the destination.

    Although we don't have a "second user" of the interface yet (though
    I've had some nibbles from people who may be interested in using it for
    intra-process messaging which is not MPI), this interface is something
    which hardware vendors are already doing for their custom drivers to
    implement fast local communication. So, in addition to being useful for
    OpenMPI, it would mean the driver maintainers don't have to fix things
    up when the mm changes.

    There was some discussion about how much faster a true zero copy would
    go. Here's a link back to the email with some testing I did on that:

    http://marc.info/?l=linux-mm&m=130105930902915&w=2

    There is a basic man page for the proposed interface here:

    http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt
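
    As a quick illustration, a minimal userspace sketch of the read side
    (it assumes a libc that exposes a process_vm_readv() wrapper; on older
    libcs the raw syscall(2) form works the same way, and the pid and
    remote address here are just placeholder inputs):

        #define _GNU_SOURCE
        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/uio.h>

        int main(int argc, char *argv[])
        {
                struct iovec local, remote;
                char buf[128];

                if (argc != 3) {
                        fprintf(stderr, "usage: %s <pid> <hex-addr>\n", argv[0]);
                        return 1;
                }

                local.iov_base = buf;
                local.iov_len = sizeof(buf);
                remote.iov_base = (void *)strtoul(argv[2], NULL, 16);
                remote.iov_len = sizeof(buf);

                /* single copy: straight from the other process's address
                 * space into ours, no shared-memory bounce buffer */
                if (process_vm_readv(atoi(argv[1]), &local, 1,
                                     &remote, 1, 0) < 0) {
                        perror("process_vm_readv");
                        return 1;
                }
                printf("read %zu bytes\n", sizeof(buf));
                return 0;
        }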

    This has been implemented for x86 and powerpc; other architectures
    should mainly (I think) just need to add syscall numbers for
    process_vm_readv and process_vm_writev. There are 32-bit compatibility
    versions for 64-bit kernels.

    For arch maintainers there are some simple tests to be able to quickly
    verify that the syscalls are working correctly here:

    http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

    Signed-off-by: Chris Yeoh
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Arnd Bergmann
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: James Morris
    Cc:
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christopher Yeoh
     

25 Mar, 2011

1 commit

  • * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
    Documentation/iostats.txt: bit-size reference etc.
    cfq-iosched: removing unnecessary think time checking
    cfq-iosched: Don't clear queue stats when preempt.
    blk-throttle: Reset group slice when limits are changed
    blk-cgroup: Only give unaccounted_time under debug
    cfq-iosched: Don't set active queue in preempt
    block: fix non-atomic access to genhd inflight structures
    block: attempt to merge with existing requests on plug flush
    block: NULL dereference on error path in __blkdev_get()
    cfq-iosched: Don't update group weights when on service tree
    fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
    block: Require subsystems to explicitly allocate bio_set integrity mempool
    jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    fs: make fsync_buffers_list() plug
    mm: make generic_writepages() use plugging
    blk-cgroup: Add unaccounted time to timeslice_used.
    block: fixup plugging stubs for !CONFIG_BLOCK
    block: remove obsolete comments for blkdev_issue_zeroout.
    blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
    ...

    Fix up conflicts in fs/{aio.c,super.c}

    Linus Torvalds
     

23 Mar, 2011

1 commit

  • The test program below will hang because io_getevents() uses
    add_wait_queue_exclusive(), which means the wake_up() in io_destroy() only
    wakes up one of the threads. Fix this by using wake_up_all() in the aio
    code paths where we want to make sure no one gets stuck.

    // t.c -- compile with gcc -lpthread -laio t.c

    #include <stdio.h>
    #include <unistd.h>
    #include <pthread.h>
    #include <libaio.h>

    static const int nthr = 2;

    void *getev(void *ctx)
    {
            struct io_event ev;
            io_getevents(ctx, 1, 1, &ev, NULL);
            printf("io_getevents returned\n");
            return NULL;
    }

    int main(int argc, char *argv[])
    {
            io_context_t ctx = 0;
            pthread_t thread[nthr];
            int i;

            io_setup(1024, &ctx);

            for (i = 0; i < nthr; ++i)
                    pthread_create(&thread[i], NULL, getev, ctx);

            sleep(1);

            io_destroy(ctx);

            for (i = 0; i < nthr; ++i)
                    pthread_join(thread[i], NULL);

            return 0;
    }

    Signed-off-by: Roland Dreier
    Reviewed-by: Jeff Moyer
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland Dreier
     

16 Mar, 2011

1 commit

  • * 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: fix build failure introduced by s/freezeable/freezable/
    workqueue: add system_freezeable_wq
    rds/ib: use system_wq instead of rds_ib_fmr_wq
    net/9p: replace p9_poll_task with a work
    net/9p: use system_wq instead of p9_mux_wq
    xfs: convert to alloc_workqueue()
    reiserfs: make commit_wq use the default concurrency level
    ocfs2: use system_wq instead of ocfs2_quota_wq
    ext4: convert to alloc_workqueue()
    scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path
    scsi/be2iscsi,qla2xxx: convert to alloc_workqueue()
    misc/iwmc3200top: use system_wq instead of dedicated workqueues
    i2o: use alloc_workqueue() instead of create_workqueue()
    acpi: kacpi*_wq don't need WQ_MEM_RECLAIM
    fs/aio: aio_wq isn't used in memory reclaim path
    input/tps6507x-ts: use system_wq instead of dedicated workqueue
    cpufreq: use system_wq instead of dedicated workqueues
    wireless/ipw2x00: use system_wq instead of dedicated workqueues
    arm/omap: use system_wq in mailbox
    workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER

    Linus Torvalds
     

10 Mar, 2011

4 commits


26 Feb, 2011

2 commits

  • A race can occur when io_submit() races with io_destroy():

    CPU1                                            CPU2
    io_submit()
      do_io_submit()
        ...
        ctx = lookup_ioctx(ctx_id);
                                                    io_destroy()
        Now do_io_submit() holds the last reference to ctx.
        ...
        queue new AIO
        put_ioctx(ctx) - frees ctx with active AIOs

    We solve this issue by checking whether ctx is being destroyed in the
    AIO submission path after adding the new AIO to ctx. Then we are
    guaranteed that either io_destroy() waits for the new AIO or we see
    that ctx is being destroyed and bail out.
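
    The shape of the check (a sketch under assumed 2.6.38-era names such as
    ctx->dead and ctx->ctx_lock, not the literal patch):

        /* in io_submit_one(), after the new request is queued */
        spin_lock_irq(&ctx->ctx_lock);
        if (ctx->dead) {
                /* io_destroy() won the race: back out rather than run
                 * an AIO against a context that is going away */
                spin_unlock_irq(&ctx->ctx_lock);
                ret = -EINVAL;
                goto out_put_req;
        }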

    Cc: Nick Piggin
    Reviewed-by: Jeff Moyer
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • aio-dio-invalidate-failure GPFs in aio_put_req from io_submit.

    lookup_ioctx doesn't implement the RCU lookup pattern properly.
    rcu_read_lock does not prevent the refcount from going to zero, so we
    might take a refcount on a zero-count ioctx.

    Fix the bug by atomically testing for zero refcount before incrementing.
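
    In code terms, the guard looks like this (sketch; the helper name and
    the atomic field are assumptions in the style of the era's fs/aio.c):

        static inline int try_get_ioctx(struct kioctx *kioctx)
        {
                /* fails on, rather than resurrects, a dying ioctx */
                return atomic_inc_not_zero(&kioctx->users);
        }

    lookup_ioctx() then takes its reference via such a helper inside the
    rcu_read_lock() section and skips any entry whose count already hit
    zero.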

    [jack@suse.cz: added comment into the code]
    Reviewed-by: Jeff Moyer
    Signed-off-by: Nick Piggin
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

27 Jan, 2011

1 commit

  • aio_wq isn't used during memory reclaim. Convert to alloc_workqueue()
    without WQ_MEM_RECLAIM. It would be possible to use system_wq, but
    given that the number of work items is determined from userland and the
    work items may block, enforcing a strict concurrency limit is a good
    idea.

    Also, move fput_work to system_wq so that aio_wq is used solely to
    throttle the max concurrency of aio work items and fput_work doesn't
    interact with other work items.
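
    The conversion amounts to something like this (sketch; the name string
    and the concurrency cap of 1 follow the reasoning above):

        aio_wq = alloc_workqueue("aio", 0, 1);  /* no WQ_MEM_RECLAIM */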

    Signed-off-by: Tejun Heo
    Acked-by: Jeff Moyer
    Cc: Benjamin LaHaise
    Cc: linux-aio@kvack.org

    Tejun Heo
     

17 Jan, 2011

1 commit


14 Jan, 2011

2 commits


26 Oct, 2010

2 commits

  • Clones an existing reference to inode; caller must already hold one.

    Signed-off-by: Al Viro

    Al Viro
     
  • The aio batching code is using igrab to get an extra reference on the
    inode so it can safely batch. igrab will go ahead and take the global
    inode spinlock, which can be a bottleneck on large machines doing lots
    of AIO.

    In this case, igrab isn't required because we already have a reference
    on the file handle. It is safe to just bump the i_count directly
    on the inode.

    Benchmarking shows this patch brings IOP/s on tons of flash up by about
    2.5X.
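
    The core of the change, sketched (illustrative; i_count was an atomic_t
    in this era, so the bump needs no global lock):

        /* we hold a reference on the file, so the inode can't go away;
         * no need for igrab()'s spinlock just to take another ref */
        atomic_inc(&inode->i_count);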

    Signed-off-by: Chris Mason

    Chris Mason
     

23 Sep, 2010

1 commit

  • OCFS2 can return ERESTARTSYS from its write function when the process
    is signalled while waiting for a cluster lock (and the filesystem is
    mounted with the intr mount option). Generally, it seems reasonable to
    allow filesystems to return this error code from their IO functions. As
    we must not leak ERESTARTSYS (and similar error codes) to userspace as
    the result of an AIO operation, we have to properly convert it to EINTR
    inside the AIO code (restarting the syscall isn't really an option
    because other AIOs could already have been submitted by the same
    io_submit syscall).
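
    The conversion itself is a small clamp in the completion path (sketch;
    the exact set of restart codes is an assumption based on "and similar
    error codes" above):

        if (ret == -ERESTARTSYS || ret == -ERESTARTNOINTR ||
            ret == -ERESTARTNOHAND || ret == -ERESTART_RESTARTBLOCK)
                ret = -EINTR;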

    Signed-off-by: Jan Kara
    Reviewed-by: Jeff Moyer
    Cc: Christoph Hellwig
    Cc: Zach Brown
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

15 Sep, 2010

1 commit

  • Tavis Ormandy pointed out that do_io_submit does not do proper bounds
    checking on the passed-in iocb array:

           if (unlikely(nr < 0))
                   return -EINVAL;

           if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp)))))
                   return -EFAULT;                      ^^^^^^^^^^^^^^^^^^

    The attached patch checks for overflow and, if overflow is detected,
    scales down the number of iocbs submitted to a number that fits in a
    long. This is an OK thing to do, as sys_io_submit is documented as
    returning the number of iocbs submitted, so callers should handle a
    return value of less than the 'nr' argument passed in.
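
    Roughly, the added clamp looks like this (a sketch based on the
    description above, not the literal patch):

           if (unlikely(nr > LONG_MAX/sizeof(*iocbpp)))
                   nr = LONG_MAX/sizeof(*iocbpp);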

    Reported-by: Tavis Ormandy
    Signed-off-by: Jeff Moyer
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     

06 Aug, 2010

1 commit

  • - sys_io_destroy(): actually return -EINVAL if the context pointed to
      is invalid.
    - sys_io_getevents(): the argument specifying the timeout is not
      `when', but `timeout'.
    - sys_io_getevents(): describe what is returned if this syscall
      succeeds.

    Signed-off-by: Satoru Takeuchi
    Signed-off-by: Randy Dunlap
    Reviewed-by: Jeff Moyer
    Signed-off-by: Linus Torvalds

    Satoru Takeuchi
     

28 May, 2010

2 commits

  • __aio_put_req() plays sick games with the file refcount. What it wants
    is fput() from atomic context; it's almost always done with f_count >
    1, so it only has to deal with delayed work in the rare cases when its
    reference happens to be the last one. The current code decrements
    f_count, and if it hasn't hit 0, everything is fine. Otherwise it keeps
    a pointer to the struct file (with zero f_count!) around and has
    delayed work do __fput() on it.

    A better way to do it: use atomic_long_add_unless( , -1, 1) instead of
    !atomic_long_dec_and_test(). IOW, decrement it only if it's not the
    last reference, and leave the refcount alone if it was. Then use normal
    fput() in the delayed work.

    I've made that atomic_long_add_unless call a new helper -
    fput_atomic(). It drops a reference to a file if that is safe to do in
    atomic context (i.e. if it's not the last one), and reports whether it
    was able to do so. aio.c is converted to it, and the __fput() use is
    gone. req->ki_file *always* contributes to the refcount now. And
    __fput() became static.
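
    The helper, as described (a sketch following the atomic_long_add_unless
    recipe above):

        int fput_atomic(struct file *file)
        {
                /* drop a ref only if it is not the last one; report
                 * success so callers can fall back to deferred fput() */
                return atomic_long_add_unless(&file->f_count, -1, 1);
        }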

    Signed-off-by: Al Viro

    Al Viro
     
  • The aio compat code was not converting the struct iovecs from 32-bit
    to 64-bit pointers, causing either EINVAL to be returned from
    io_getevents, or EFAULT as the result of the I/O. This patch passes a
    compat flag to io_submit to signal that pointer conversion is necessary
    for a given iocb array.

    A variant of this was tested by Michael Tokarev. I have also updated the
    libaio test harness to exercise this code path with good success.
    Further, I grabbed a copy of ltp and ran the
    testcases/kernel/syscall/readv and writev tests there (compiled with -m32
    on my 64bit system). All seems happy, but extra eyes on this would be
    welcome.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix CONFIG_COMPAT=n build]
    Signed-off-by: Jeff Moyer
    Reported-by: Michael Tokarev
    Cc: Zach Brown
    Cc: [2.6.35.1]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     

16 Dec, 2009

1 commit

  • I don't know the reason, but it appears the ki_wait field of the iocb
    never gets used.

    Signed-off-by: Shaohua Li
    Cc: Jeff Moyer
    Cc: Benjamin LaHaise
    Cc: Zach Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     

29 Oct, 2009

1 commit


28 Oct, 2009

1 commit

  • Hi,

    Some workloads issue batches of small I/O, and the performance is poor
    due to the call to blk_run_address_space for every single iocb. Nathan
    Roberts pointed this out, and suggested that by deferring this call
    until all I/Os in the iocb array are submitted to the block layer, we
    can realize some impressive performance gains (up to 30% for sequential
    4k reads in batches of 16).
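
    Schematically, the submission loop becomes (a rough sketch; the batch
    bookkeeping names here are assumptions, not the patch's exact
    interface):

        /* queue every iocb first, remembering each affected mapping */
        for (i = 0; i < nr; i++)
                io_submit_one(ctx, iocbpp[i], &iocb, &batch);

        /* then kick the block layer once per unique address_space,
         * instead of once per iocb */
        aio_batch_run(&batch);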

    Signed-off-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Jeff Moyer
     

23 Sep, 2009

1 commit


22 Sep, 2009

1 commit

  • Anyone who wants to copy to/from user space from a kernel thread needs
    use_mm (like what fs/aio has). Move that into mm/, to make reusing and
    exporting easier down the line, and make aio use it. The next intended
    user, besides aio, will be vhost-net.
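
    For context, the pattern a borrower follows (an illustrative sketch,
    not code from the patch):

        /* a kernel thread temporarily adopts a user mm so that
         * copy_{to,from}_user() resolve against that process's mappings */
        use_mm(mm);
        if (copy_from_user(kbuf, ubuf, len))
                ret = -EFAULT;
        unuse_mm(mm);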

    Acked-by: Andrea Arcangeli
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael S. Tsirkin
     

01 Jul, 2009

1 commit

  • Change the eventfd interface to decouple the eventfd memory context
    from the file pointer instance.

    Without such a change, there is no clean, race-free way to handle the
    POLLHUP event sent when the last instance of the file* goes away. Also,
    the internal eventfd APIs now use the eventfd context instead of the
    file*.

    This patch is required by KVM's IRQfd code, which is still under
    development.
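
    The decoupled, ctx-based entry points look roughly like this (sketch;
    the exact signatures in that era's include/linux/eventfd.h are
    assumptions):

        struct eventfd_ctx *eventfd_ctx_fdget(int fd);
        struct eventfd_ctx *eventfd_ctx_get(struct eventfd_ctx *ctx);
        void eventfd_ctx_put(struct eventfd_ctx *ctx);
        int eventfd_signal(struct eventfd_ctx *ctx, int n);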

    Signed-off-by: Davide Libenzi
    Cc: Gregory Haskins
    Cc: Rusty Russell
    Cc: Benjamin LaHaise
    Cc: Avi Kivity
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

20 Mar, 2009

2 commits

  • The libaio test harness turned up a problem whereby lookup_ioctx on a
    bogus io context was returning the one valid io context from the list
    (harness/cases/3.p).

    Because of that, an extra put_ioctx was done, and when the process
    exited, it hit a BUG_ON in the put_ioctx macro called from exit_aio
    (since we expect a users count of 1 and instead get 0).

    The problem was introduced by "aio: make the lookup_ioctx() lockless"
    (commit abf137dd7712132ee56d5b3143c2ff61a72a5faa).

    Thanks to Zach for pointing out that hlist_for_each_entry_rcu will not
    return with a NULL tpos at the end of the loop, even if the entry was
    not found.
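
    The resulting shape of the lookup (sketch; the list and field names
    follow the 2.6.29-era fs/aio.c):

        struct kioctx *ctx, *ret = NULL;
        struct hlist_node *n;

        rcu_read_lock();
        hlist_for_each_entry_rcu(ctx, n, &mm->ioctx_list, list) {
                if (ctx->user_id == ctx_id) {
                        get_ioctx(ctx);
                        ret = ctx;
                        break;
                }
        }
        rcu_read_unlock();
        return ret;     /* not ctx: the cursor is stale when nothing matched */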

    Signed-off-by: Jeff Moyer
    Acked-by: Zach Brown
    Acked-by: Jens Axboe
    Cc: Benjamin LaHaise
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     
  • Remove a source of fput() calls from inside IRQ context. Like Eric, I
    wasn't able to reproduce an fput() call from IRQ context myself, but
    Jeff said he was able to, with the attached test program. Independently
    from this, the bug is conceptually there, so we might be better off
    fixing it. This patch adds an optimization similar to the one we
    already do on ->ki_filp, on ->ki_eventfd. Playing with ->f_count
    directly is not pretty in general, but the alternative here would be to
    add a brand-new delayed-fput() infrastructure, and I'm not sure that is
    worth it.

    Signed-off-by: Davide Libenzi
    Cc: Benjamin LaHaise
    Cc: Trond Myklebust
    Cc: Eric Dumazet
    Signed-off-by: Jeff Moyer
    Cc: Zach Brown
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

14 Jan, 2009

1 commit


29 Dec, 2008

1 commit

  • The mm->ioctx_list is currently protected by a reader-writer lock, so
    we always grab that lock on the read side for doing ioctx lookups. As
    the workload is extremely reader-biased, turn this into an RCU hlist so
    we can make lookup_ioctx() lockless. Get rid of the rwlock and use a
    spinlock to provide update-side exclusion.

    There's usually only one entry on this list, so it doesn't make sense
    to look into fancier data structures.
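
    The update side then reduces to a plain spinlock around RCU-aware list
    primitives (sketch; the lock and list names are as described above),
    while lookup_ioctx() walks the hlist under rcu_read_lock() alone, as in
    the lookup sketch under the 20 Mar, 2009 entry:

        spin_lock(&mm->ioctx_lock);
        hlist_add_head_rcu(&ctx->list, &mm->ioctx_list);
        spin_unlock(&mm->ioctx_lock);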

    Reviewed-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Jens Axboe
     

27 Jul, 2008

1 commit


26 Jul, 2008

1 commit

  • Kill PF_BORROWED_MM. Change use_mm/unuse_mm to not play with ->flags, and
    do s/PF_BORROWED_MM/PF_KTHREAD/ for a couple of other users.

    No functional changes yet. But this allows us to do further
    fixes/cleanups.

    oom_kill/ptrace/etc. often check "p->mm != NULL" to filter out the
    kthreads; this is wrong because of use_mm(). The problem with
    PF_BORROWED_MM is that we need task_lock() to avoid races. With this
    patch we can check PF_KTHREAD directly, or use a simple lockless
    helper:

        /* The result must not be dereferenced !!! */
        struct mm_struct *__get_task_mm(struct task_struct *tsk)
        {
                if (tsk->flags & PF_KTHREAD)
                        return NULL;
                return tsk->mm;
        }

    Note also ecard_task(). It runs with ->mm != NULL, but it's a kernel
    thread without PF_BORROWED_MM.

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

07 Jun, 2008

1 commit

  • use_mm() was changed to use switch_mm() instead of activate_mm(), since
    then nobody calls (and nobody should call) activate_mm() with
    PF_BORROWED_MM bit set.

    As Jeff Dike pointed out, we can also remove the "old != new" check, it is
    always true.

    Signed-off-by: Oleg Nesterov
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

30 Apr, 2008

1 commit

  • Add calls to the generic object debugging infrastructure, and provide
    fixup functions which allow the system to be kept alive when
    recoverable problems have been detected by the object debugging core
    code.
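
    The generic entry points being wired up look roughly like this (sketch;
    obj and obj_descr are placeholders for the debugged object and its
    per-type descriptor):

        debug_object_init(obj, &obj_descr);
        debug_object_activate(obj, &obj_descr);
        debug_object_deactivate(obj, &obj_descr);
        debug_object_free(obj, &obj_descr);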

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Greg KH
    Cc: Randy Dunlap
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

29 Apr, 2008

3 commits

  • The FIXME comments are inaccurate.
    The locking comment over lookup_ioctx() is wrong.

    Signed-off-by: Jeff Moyer
    Signed-off-by: Zach Brown
    Signed-off-by: Shen Feng
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     
  • Some drivers have duplicated unlikely() macros: IS_ERR() already has
    unlikely() in itself, so wrapping it in another one is redundant.

    This patch cleans up such pointless code.
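
    The shape of the cleanup (illustrative before/after):

        /* before */
        if (unlikely(IS_ERR(req)))
                return PTR_ERR(req);

        /* after: IS_ERR() already wraps its test in unlikely() */
        if (IS_ERR(req))
                return PTR_ERR(req);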

    Signed-off-by: Hirofumi Nakagawa
    Acked-by: David S. Miller
    Acked-by: Jeff Garzik
    Cc: Paul Clements
    Cc: Richard Purdie
    Cc: Alessandro Zummo
    Cc: David Brownell
    Cc: James Bottomley
    Cc: Michael Halcrow
    Cc: Anton Altaparmakov
    Cc: Al Viro
    Cc: Carsten Otte
    Cc: Patrick McHardy
    Cc: Paul Mundt
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hirofumi Nakagawa
     
  • Make the following needlessly global functions static:

    - __put_ioctx()
    - lookup_ioctx()
    - io_submit_one()

    Signed-off-by: Adrian Bunk
    Cc: Zach Brown
    Cc: Benjamin LaHaise
    Cc: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk