Doug / smarc-fsl-linux-kernel | Embedian Git Server

30 Apr, 2013

1 commit

546ae2d2f fs/read_write.c: fix generic_file_llseek() comment

Commit ef3d0fd27e90 ("vfs: do (nearly) lockless generic_file_llseek")
has removed i_mutex from generic_file_llseek, so update the comment
accordingly.

Signed-off-by: Ming Lei
Cc: Alexander Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ming Lei
2013-04-30 06:54:28 +0800

28 Mar, 2013

1 commit

3e84f48ed vfs/splice: Fix missed checks in new __kernel_write() helper

Commit 06ae43f34bcc ("Don't bother with redoing rw_verify_area() from
default_file_splice_from()") lost the checks to test existence of the
write/aio_write methods. My apologies ;-/

Eventually, we want that in fs/splice.c side of things (no point
repeating it for every buffer, after all), but for now this is the
obvious minimal fix.

Reported-by: Dave Jones
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2013-03-28 00:24:02 +0800

22 Mar, 2013

1 commit

06ae43f34 Don't bother with redoing rw_verify_area() from default_file_splice_from()

default_file_splice_from() ends up calling vfs_write() (via a very convoluted
callchain). That is overkill, since we have already done rw_verify_area()
in the caller, and by the time we call vfs_write() we are under set_fs(KERNEL_DS),
so access_ok() is also pointless. Add a new helper (__kernel_write()) and
use it instead of kernel_write() in there.

Signed-off-by: Al Viro

Al Viro
2013-03-22 01:11:11 +0800

03 Mar, 2013

1 commit

14cc0b55b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal

Pull signal/compat fixes from Al Viro:
"Fixes for several regressions introduced in the last signal.git pile,
along with fixing bugs in truncate and ftruncate compat (on just about
anything biarch at least one of those two had been done wrong)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
compat: restore timerfd settime and gettime compat syscalls
[regression] braino in "sparc: convert to ksignal"
fix compat truncate/ftruncate
switch lseek to COMPAT_SYSCALL_DEFINE
lseek() and truncate() on sparc really need sign extension

Linus Torvalds
2013-03-03 00:34:06 +0800

24 Feb, 2013

1 commit

561c67319 switch lseek to COMPAT_SYSCALL_DEFINE

Signed-off-by: Al Viro

Al Viro
2013-02-24 23:52:26 +0800

23 Feb, 2013

1 commit

496ad9aa8 new helper: file_inode(file)

Signed-off-by: Al Viro

Al Viro
2013-02-23 12:31:31 +0800

21 Dec, 2012

1 commit

a68c2f12b sendfile: allows bypassing of notifier events

do_sendfile() in fs/read_write.c does not call the fsnotify functions,
unlike its neighbors. This manifests as a lack of inotify ACCESS events
when a file is sent using sendfile(2).

Addresses
https://bugzilla.kernel.org/show_bug.cgi?id=12812

[akpm@linux-foundation.org: use fsnotify_modify(out.file), not fsnotify_access(), per Dave]
Signed-off-by: Alan Cox
Cc: Dave Chinner
Cc: Jens Axboe
Cc: Scott Wolchok
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Scott Wolchok
2012-12-21 09:40:21 +0800

18 Dec, 2012

1 commit

965c8e59c lseek: the "whence" argument is called "whence"

But the kernel decided to call it "origin" instead. Fix most of the
sites.

Acked-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2012-12-18 09:15:12 +0800

03 Oct, 2012

1 commit

8f9c0119d compat: fs: Generic compat_sys_sendfile implementation

This function is used by sparc, powerpc and arm64 for compat support.
The patch adds a generic implementation which calls do_sendfile()
directly and avoids set_fs().

The sparc architecture has wrappers for the sign extensions while
powerpc relies on the compiler to do this. The patch adds wrappers
for powerpc to handle the u32->int type conversion.

compat_sys_sendfile64() can be replaced by a sys_sendfile() call since
compat_loff_t has the same size as off_t on a 64-bit system.

On powerpc, the patch also changes the 64-bit sendfile call from
sys_sendfile64 to sys_sendfile.

Signed-off-by: Catalin Marinas
Acked-by: David S. Miller
Cc: Arnd Bergmann
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Alexander Viro
Cc: Andrew Morton
Signed-off-by: Al Viro

Catalin Marinas
2012-10-03 09:35:55 +0800

27 Sep, 2012

1 commit

2903ff019 switch simple cases of fget_light to fdget

Signed-off-by: Al Viro

Al Viro
2012-09-27 10:20:08 +0800

23 Jul, 2012

1 commit

e8b96eb50 vfs: allow custom EOF in generic_file_llseek code

For ext3/4 htree directories, using the vfs llseek function with
SEEK_END goes to i_size like for any other file, but in reality
we want the maximum possible hash value. Recent changes
in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c,
but replicating this core code seems like a bad idea, especially
since the copy has already diverged from the vfs.

This patch updates generic_file_llseek_size to accept
both a custom maximum offset, and a custom EOF position. With this
in place, ext4_dir_llseek can pass in the appropriate maximum hash
position for both maxsize and eof, and get what it wants.
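The idea of passing both limits in from the caller can be sketched in user space as follows. This is a simplified model, not the kernel function: struct seek_model_file and sketch_llseek_size() are hypothetical names made up for the illustration, and overflow handling is left out.

```c
#include <errno.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-in for the kernel's struct file: only the position. */
struct seek_model_file {
    long long f_pos;
};

/*
 * Simplified model of a seek helper that takes both limits from the caller:
 * maxsize bounds the resulting offset, eof is what SEEK_END is relative to.
 * Returns the new position, or a negative errno value on failure.
 */
static long long sketch_llseek_size(struct seek_model_file *file,
                                    long long offset, int whence,
                                    long long maxsize, long long eof)
{
    switch (whence) {
    case SEEK_SET:
        break;
    case SEEK_CUR:
        offset += file->f_pos;
        break;
    case SEEK_END:
        offset += eof;          /* custom EOF instead of the file size */
        break;
    default:
        return -EINVAL;
    }

    if (offset < 0 || offset > maxsize)
        return -EINVAL;

    file->f_pos = offset;
    return offset;
}
```

A directory implementation in the spirit of the message would pass its maximum hash value for both maxsize and eof, so that a SEEK_END with offset 0 lands on that value rather than on the file size.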

As far as I know, this does not fix any bugs - nfs in the kernel
doesn't use SEEK_END, and I don't know of any user who does. But
some ext4 folks seem keen on doing the right thing here, and I can't
really argue.

(Patch also fixes up some comments slightly)

Signed-off-by: Eric Sandeen
Signed-off-by: Al Viro

Eric Sandeen
2012-07-23 04:00:15 +0800

01 Jun, 2012

1 commit

ac34ebb3a aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()

A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after
changes made to support CMA in an earlier patch.

Rather than having an additional check_access parameter to these
functions, the first parameter type is overloaded to allow the caller to
specify CHECK_IOVEC_ONLY, which means check that the contents of the iovec
are valid, but do not check the memory that they point to. This is used
by process_vm_readv/writev, where we need to validate that an iovec passed
to the syscall is valid but do not want to check the memory that it points
to at this point because it refers to an address space in another process.

Signed-off-by: Chris Yeoh
Reviewed-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christopher Yeoh
2012-06-01 08:49:32 +0800

29 Feb, 2012

1 commit

630d9c472 fs: reduce the use of module.h wherever possible

For files only using THIS_MODULE and/or EXPORT_SYMBOL, map
them onto including export.h -- or if the file isn't even
using those, then just delete the include. Fix up any implicit
include dependencies that were being masked by module.h along
the way.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2012-02-29 08:31:58 +0800

01 Nov, 2011

1 commit

fcf634098 Cross Memory Attach

The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call. There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.
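From user space, the read direction described above could look roughly like the sketch below. It assumes a Linux system whose C library exposes process_vm_readv() in <sys/uio.h>; copy_from_process() is a hypothetical helper name, and the call can fail (for example with EPERM) depending on the permissions of the caller.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* process_vm_readv() is a Linux-specific extension */
#endif
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Copy len bytes from address remote_src in process pid into the local
 * buffer dst with a single system call. Returns the number of bytes
 * copied, or -1 with errno set.
 */
static ssize_t copy_from_process(pid_t pid, void *dst,
                                 const void *remote_src, size_t len)
{
    struct iovec local = { .iov_base = dst, .iov_len = len };
    struct iovec remote = { .iov_base = (void *)remote_src, .iov_len = len };

    /* One local and one remote segment; the flags argument must be 0. */
    return process_vm_readv(pid, &local, 1, &remote, 1, 0);
}
```

Both sides take iovec arrays, so scattered source and destination regions can be handled in one call by passing more than one segment.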

- Use of /proc/pid/mem has been considered, but there are issues with
  using it:
  - Does not allow for specifying iovecs for both src and dest, assuming
    preadv or pwritev was implemented either the area read from or
    written to would need to be contiguous.
  - Currently mem_read allows only processes who are currently
    ptrace'ing the target and are still able to ptrace the target to read
    from the target. This check could possibly be moved to the open call,
    but it's not clear exactly what race this restriction is stopping
    (reason appears to have been lost)
  - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
    domain socket is a bit ugly from a userspace point of view,
    especially when you may have hundreds if not (eventually) thousands
    of processes that all need to do this with each other
  - Doesn't allow for some future use of the interface we would like to
    consider adding in the future (see below)
  - Interestingly reading from /proc/pid/mem currently actually
    involves two copies! (But this could be fixed pretty easily)

As mentioned previously, use of vmsplice instead was considered, but it has
problems. Since you need the reader and writer working co-operatively, if
the pipe is not drained then you block, which requires some wrapping to
do non-blocking on the send side or polling on the receive. In all-to-all
communication it requires ordering, otherwise you can deadlock. And in the
example of many MPI tasks writing to one MPI task, vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could get. For example, in an
MPI_Reduce, rather than copy the data from the source we would like to
instead use it directly in a math operation (say the reduce is doing a sum),
as this would save us doing a copy. We don't need to keep a copy of the data
from the source. I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags: e.g. one could
specify the math operation and type, and the kernel, rather than just
copying the data, would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI), this interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication. So in addition to this being useful for
OpenMPI, it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc; other architectures should
mainly (I think) just need to add syscall numbers for process_vm_readv
and process_vm_writev. There are 32-bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

Signed-off-by: Chris Yeoh
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Arnd Bergmann
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christopher Yeoh
2011-11-01 08:30:44 +0800

28 Oct, 2011

2 commits

5760495a8 vfs: add generic_file_llseek_size

Add a generic_file_llseek variant to the VFS that allows passing in
the maximum file size of the file system, instead of always
using maxbytes from the superblock.

This can be used to eliminate some cut'n'paste seek code in ext4.

Signed-off-by: Andi Kleen
Signed-off-by: Christoph Hellwig

Andi Kleen
2011-10-28 20:58:59 +0800

ef3d0fd27 vfs: do (nearly) lockless generic_file_llseek

The i_mutex lock use of generic_file_llseek hurts. Independent processes
accessing the same file synchronize over a single lock, even though
they have no need for synchronization at all.

Under high utilization this can cause llseek to scale very poorly on larger
systems.

This patch does some rethinking of the llseek locking model:

First the 64bit f_pos is not necessarily atomic without locks
on 32bit systems. This can already cause races with read() today.
This was discussed on linux-kernel in the past and deemed acceptable.
The patch does not change that.

Let's look at the different seek variants:

SEEK_SET: Doesn't really need any locking.
If there's a race one writer wins, the other loses.

For 32bit the non atomic update races against read()
stay the same. Without a lock they can also happen
against write() now. The read() race was deemed
acceptable in past discussions, and I think if it's
ok for read it's ok for write too.

=> Don't need a lock.

SEEK_END: This behaves like SEEK_SET plus it reads
the maximum size too. Reading the maximum size would have the
32bit atomic problem. But luckily we already have a way to read
the maximum size without locking (i_size_read), so we
can just use that instead.

Without i_mutex there is no synchronization with write() anymore,
however since the write() update is atomic on 64bit it just behaves
like another racy SEEK_SET. On non atomic 32bit it's the same
as SEEK_SET.

=> Don't need a lock, but need to use i_size_read()

SEEK_CUR: This has a read-modify-write race window
on the same file. One could argue that any application
doing unsynchronized seeks on the same file is already broken.
But for the sake of not adding a regression here I'm
using the file->f_lock to synchronize this. Using this
lock is much better than the inode mutex because it doesn't
synchronize between processes.

=> So still need a lock, but can use a f_lock.

This patch implements this new scheme in generic_file_llseek.
I dropped generic_file_llseek_unlocked and changed all callers.
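The per-variant locking rules above can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: struct lockless_file, model_lock() and lockless_llseek() are hypothetical names, a C11 atomic_flag spin lock stands in for file->f_lock, and a plain i_size field stands in for i_size_read().

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical model of the fields the scheme touches. */
struct lockless_file {
    long long f_pos;
    long long i_size;     /* stands in for i_size_read(inode) */
    atomic_flag f_lock;   /* stands in for the per-file f_lock */
};

static void model_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void model_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

static long long model_set_pos(struct lockless_file *f, long long offset)
{
    if (offset < 0)
        return -EINVAL;
    f->f_pos = offset;
    return offset;
}

static long long lockless_llseek(struct lockless_file *f,
                                 long long offset, int whence)
{
    long long ret;

    switch (whence) {
    case SEEK_SET:
        /* Plain store: if two seeks race, one of them simply wins. */
        return model_set_pos(f, offset);
    case SEEK_END:
        /* Like SEEK_SET, plus a lock-free read of the size. */
        return model_set_pos(f, f->i_size + offset);
    case SEEK_CUR:
        /* Read-modify-write of f_pos: the only case that takes a lock. */
        model_lock(&f->f_lock);
        ret = model_set_pos(f, f->f_pos + offset);
        model_unlock(&f->f_lock);
        return ret;
    default:
        return -EINVAL;
    }
}
```

The lock lives in the per-open-file structure rather than in the inode, so two processes that opened the same file independently never contend on it, which is the point the message makes about f_lock versus i_mutex.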

Signed-off-by: Andi Kleen
Signed-off-by: Christoph Hellwig

Andi Kleen
2011-10-28 20:58:58 +0800

27 Jul, 2011

1 commit

bacb2d816 fs: add missing unlock in default_llseek()

A recent change in linux-next, 982d816581 "fs: add SEEK_HOLE and
SEEK_DATA flags" added some direct returns on error, but it should
have been a goto out.

Signed-off-by: Dan Carpenter
Signed-off-by: Al Viro

Dan Carpenter
2011-07-27 00:57:09 +0800

21 Jul, 2011

1 commit

982d81658 fs: add SEEK_HOLE and SEEK_DATA flags

This just gets us ready to support the SEEK_HOLE and SEEK_DATA flags. Turns out
using fiemap in things like cp causes more problems than it solves, so let's try
and give userspace an interface that doesn't suck. We need to match Solaris
here, and the definitions are

*o* If /whence/ is SEEK_HOLE, the offset of the start of the
next hole greater than or equal to the supplied offset
is returned. The definition of a hole is provided near
the end of the DESCRIPTION.

*o* If /whence/ is SEEK_DATA, the file pointer is set to the
start of the next non-hole file region greater than or
equal to the supplied offset.

So in the generic case the entire file is data and there is a virtual hole at
the end. That means we will just return i_size for SEEK_HOLE and will return
the same offset for SEEK_DATA. This is how Solaris does it so we have to do it
the same way.
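The generic behaviour described above (the whole file is data, with one virtual hole at the end) can be sketched as a small pure function. The names are hypothetical, and returning -ENXIO for an offset at or beyond the end of the file is an assumption taken from the usual Solaris/Linux semantics; it is not stated in the message.

```c
#include <errno.h>

/* Local whence values for the sketch, kept distinct from libc's macros. */
enum { SKETCH_SEEK_DATA = 3, SKETCH_SEEK_HOLE = 4 };

/*
 * Generic fallback for a file system that does not track holes:
 * every byte below i_size is data and the only hole starts at i_size.
 * Returns the resulting offset, or a negative errno value.
 */
static long long sketch_seek_hole_data(long long offset, int whence,
                                       long long i_size)
{
    if (offset < 0 || offset >= i_size)
        return -ENXIO;      /* assumption: nothing to find past EOF */

    switch (whence) {
    case SKETCH_SEEK_DATA:
        return offset;      /* already inside data */
    case SKETCH_SEEK_HOLE:
        return i_size;      /* the virtual hole at end of file */
    default:
        return -EINVAL;
    }
}
```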

Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Al Viro

Josef Bacik
2011-07-21 08:47:56 +0800

13 Jan, 2011

1 commit

cccb5a1e6 fix signedness mess in rw_verify_area() on 64bit architectures

... and clean the unsigned-f_pos code, while we are at it.

Signed-off-by: Al Viro

Al Viro
2011-01-13 09:06:58 +0800

18 Nov, 2010

1 commit

451a3c24b BKL: remove extraneous #include <smp_lock.h>

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann
Signed-off-by: Linus Torvalds

Arnd Bergmann
2010-11-18 00:59:32 +0800

30 Oct, 2010

1 commit

435f49a51 readv/writev: do the same MAX_RW_COUNT truncation that read/write does

We used to protect against overflow, but rather than return an error, do
what read/write does, namely to limit the total size to MAX_RW_COUNT.
This is not only more consistent, but it also means that any broken
low-level read/write routine that still keeps counts in 'int' can't
break.
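The truncation described above can be illustrated with a user-space sketch. The constants are assumptions for the illustration: a 4096-byte page and a limit equal to INT_MAX rounded down to a page boundary; clamp_iovec_total() is a hypothetical name, not the kernel helper.

```c
#include <limits.h>
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

#define SKETCH_PAGE_SIZE    4096L   /* assumption: 4 KiB pages */
#define SKETCH_MAX_RW_COUNT ((size_t)(INT_MAX & ~(SKETCH_PAGE_SIZE - 1)))

/*
 * Walk the vector and shorten segments so that the running total never
 * exceeds SKETCH_MAX_RW_COUNT. Instead of failing on an oversized request,
 * the tail is cut off (later segments may end up with length 0).
 * Returns the clamped total length.
 */
static size_t clamp_iovec_total(struct iovec *iov, size_t nr_segs)
{
    size_t total = 0;

    for (size_t seg = 0; seg < nr_segs; seg++) {
        size_t len = iov[seg].iov_len;

        if (len > SKETCH_MAX_RW_COUNT - total) {
            len = SKETCH_MAX_RW_COUNT - total;
            iov[seg].iov_len = len;
        }
        total += len;
    }
    return total;
}
```

Because the total always fits in an int, a lower-level routine that still keeps its counts in 'int' cannot overflow, which matches the consistency argument in the message.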

Signed-off-by: Linus Torvalds

Linus Torvalds
2010-10-30 01:36:49 +0800

26 Oct, 2010

1 commit

4a3956c79 vfs: introduce FMODE_UNSIGNED_OFFSET for allowing negative f_pos

Currently, rw_verify_area() checks whether f_pos is negative and, if so,
returns -EINVAL.

But some special files, such as /dev/(k)mem and /proc/<pid>/mem, have
negative offsets, so we can't do any access via read/write to such a
file (device).

So introduce FMODE_UNSIGNED_OFFSET to allow negative file offsets.

Signed-off-by: Wu Fengguang
Signed-off-by: KAMEZAWA Hiroyuki
Cc: Al Viro
Cc: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

KAMEZAWA Hiroyuki
2010-10-26 09:18:21 +0800

15 Oct, 2010

2 commits

776c163b1 vfs: make no_llseek the default

All file operations now have an explicit .llseek
operation pointer, so we can change the default
action for future code.

This changes the default from default_llseek
to no_llseek, which always returns -ESPIPE if
a user tries to seek on a file without a .llseek
operation.

The name of the default_llseek function remains
unchanged, if anyone thinks we should change it,
please speak up.

Signed-off-by: Arnd Bergmann
Cc: Christoph Hellwig
Cc: Al Viro
Cc: linux-fsdevel@vger.kernel.org

Arnd Bergmann
2010-10-15 21:53:46 +0800

ab91261f5 vfs: don't use BKL in default_llseek

There are currently 191 users of default_llseek.
Nine of these are in device drivers that use the
big kernel lock. None of these ever touch
file->f_pos outside of llseek or file_pos_write.

Consequently, we never rely on the BKL
in the default_llseek function and can
replace that with i_mutex, which is also
used in generic_file_llseek.

Signed-off-by: Arnd Bergmann

Arnd Bergmann
2010-10-15 21:53:34 +0800

28 Jul, 2010

1 commit

2a12a9d78 fsnotify: pass a file instead of an inode to open, read, and write

fanotify, the upcoming notification system, actually needs a struct path so it can
do opens in the context of listeners, and it needs a file so it can get f_flags
from the original process. Close was the only operation that was already passing
a struct file to the notification hook. This patch passes a file for access,
modify, and open as well, as they are easily available to these hooks.

Signed-off-by: Eric Paris

Eric Paris
2010-07-28 21:58:32 +0800

28 May, 2010

1 commit

ae6afc3f5 vfs: introduce noop_llseek()

This is an implementation of ->llseek useable for the rare special case
when userspace expects the seek to succeed but the (device) file is
actually not able to perform the seek. In this case you use noop_llseek()
instead of falling back to the default implementation of ->llseek.

Signed-off-by: Jan Blunck
Cc: Frederic Weisbecker
Cc: Christoph Hellwig
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Blunck
2010-05-28 00:12:56 +0800

25 Mar, 2010

1 commit

61964eba5 do_sync_read/write() should set kiocb.ki_nbytes to be consistent

do_sync_read/write() should set kiocb.ki_nbytes to be consistent with
do_sync_readv_writev().

Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

David Howells
2010-03-25 07:43:29 +0800

04 Nov, 2009

1 commit

cc56f7de7 sendfile(): check f_op.splice_write() rather than f_op.sendpage()

sendfile(2) was reworked with the splice infrastructure, but it still
wrongly checks f_op.sendpage() instead of f_op.splice_write(). Although
f_op.splice_write() currently always exists whenever f_op.sendpage()
exists, that assumption could silently break in the future. This
patch also has a side effect: sendfile(2) can work with any output
file. Some security checks related to f_op are added too.

Signed-off-by: Changli Gao
Signed-off-by: Jens Axboe

Changli Gao
2009-11-04 16:09:52 +0800

24 Sep, 2009

1 commit

f9098980f vfs: remove redundant position check in do_sendfile

As Johannes Weiner pointed out, one of the range checks in do_sendfile
is redundant and is already checked in rw_verify_area.

Signed-off-by: Jeff Layton
Reviewed-by: Johannes Weiner
Cc: Christoph Hellwig
Cc: Al Viro
Cc: Robert Love
Cc: Mandeep Singh Baines
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

Jeff Layton
2009-09-24 19:47:34 +0800

11 May, 2009

1 commit

6818173bd splice: implement default splice_read method

If f_op->splice_read() is not implemented, fall back to a plain read.
Use vfs_readv() to read into previously allocated pages.

This will allow splice and functions using splice, such as the loop
device, to work on all filesystems. This includes "direct_io" files
in fuse which bypass the page cache.

Signed-off-by: Miklos Szeredi
Signed-off-by: Jens Axboe

Miklos Szeredi
2009-05-11 20:13:10 +0800

05 Apr, 2009

1 commit

601cc11d0 Make non-compat preadv/pwritev use native register size

Instead of always splitting the file offset into 32-bit 'high' and 'low'
parts, just split them into the largest natural word-size - which in C
terms is 'unsigned long'.

This allows 64-bit architectures to avoid the unnecessary 32-bit
shifting and masking for native format (while the compat interfaces will
obviously always have to do it).

This also changes the order of 'high' and 'low' to be "low first". Why?
Because when we have it like this, the 64-bit system calls now don't use
the "pos_high" argument at all, and it makes more sense for the native
system call to simply match the user-mode prototype.

This results in a much more natural calling convention, and allows the
compiler to generate much more straightforward code. On x86-64, we now
generate

testq %rcx, %rcx # pos_l
js .L122 #,
movq %rcx, -48(%rbp) # pos_l, pos

from the C source

loff_t pos = pos_from_hilo(pos_h, pos_l);
...
if (pos < 0)
return -EINVAL;

and the 'pos_h' register isn't even touched. It used to generate code
like

mov %r8d, %r8d # pos_low, pos_low
salq $32, %rcx #, tmp71
movq %r8, %rax # pos_low, pos.386
orq %rcx, %rax # tmp71, pos.386
js .L122 #,
movq %rax, -48(%rbp) # pos.386, pos

which isn't _that_ horrible, but it does show how the natural word size
is just a more sensible interface (same arguments will hold in the user
level glibc wrapper function, of course, so the kernel side is just half
of the equation!)

Note: in all cases the user code wrapper can again be the same. You can
just do

#define HALF_BITS (sizeof(unsigned long)*4)
__syscall(PWRITEV, fd, iov, count, offset, (offset >> HALF_BITS) >> HALF_BITS);

or something like that. That way the user mode wrapper will also be
nicely passing in a zero (it won't actually have to do the shifts, the
compiler will understand what is going on) for the last argument.
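The split and recombine steps discussed above can be sketched portably as below. The names are hypothetical stand-ins for the helpers the message refers to; the point of the sketch is the double half-width shift, which stays well defined whether unsigned long is 32 or 64 bits wide.

```c
#include <limits.h>

/* Half the width of unsigned long, in bits (16 or 32 on common ABIs). */
#define SKETCH_HALF_LONG_BITS (sizeof(unsigned long) * CHAR_BIT / 2)

/*
 * Kernel-side view: rebuild a 64-bit position from two native words.
 * Shifting twice by half the word size avoids an undefined full-width
 * shift when unsigned long is already 64 bits; there the high word
 * simply falls off the top and only 'low' contributes.
 */
static long long sketch_pos_from_hilo(unsigned long high, unsigned long low)
{
    unsigned long long pos = (unsigned long long)high;

    pos = (pos << SKETCH_HALF_LONG_BITS) << SKETCH_HALF_LONG_BITS;
    return (long long)(pos | low);
}

/*
 * User-side view: the value a wrapper would pass as the "high" argument.
 * On a 64-bit ABI this evaluates to zero for every offset.
 */
static unsigned long sketch_high_word(long long offset)
{
    unsigned long long u = (unsigned long long)offset;

    return (unsigned long)((u >> SKETCH_HALF_LONG_BITS) >> SKETCH_HALF_LONG_BITS);
}
```

Passing sketch_high_word(offset) together with the truncated low word and feeding both to sketch_pos_from_hilo() round-trips a non-negative offset on either word size.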

And that is a good idea, even if nobody will necessarily ever care: if
we ever do move to a 128-bit lloff_t, this particular system call might
be left alone. Of course, that will be the least of our worries if we
really ever need to care, so this may not be worth really caring about.

[ Fixed for lost 'loff_t' cast noticed by Andrew Morton ]

Acked-by: Gerd Hoffmann
Cc: H. Peter Anvin
Cc: Andrew Morton
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: Ingo Molnar
Cc: Ralf Baechle
Cc: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2009-04-05 05:20:34 +0800

03 Apr, 2009

1 commit

f3554f4bc preadv/pwritev: Add preadv and pwritev system calls.

This patch adds preadv and pwritev system calls. These syscalls are a
pretty straightforward combination of pread and readv (same for write).
They are quite useful for doing vectored I/O in threaded applications.
Using lseek+readv instead opens race windows you'll have to plug with
locking.
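A minimal user-space sketch of the combination described above: one call reads into two buffers at an explicit offset, without touching the descriptor's own file position. It assumes a C library that provides preadv() in <sys/uio.h>; read_two_at() is a hypothetical helper name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* makes glibc declare preadv() */
#endif
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Read len1 + len2 bytes starting at 'offset' into two separate buffers.
 * The shared file position is neither used nor modified, so concurrent
 * threads need no lock around a separate lseek() + readv() pair.
 * Returns the number of bytes read, or -1 with errno set.
 */
static ssize_t read_two_at(int fd, void *buf1, size_t len1,
                           void *buf2, size_t len2, off_t offset)
{
    struct iovec iov[2] = {
        { .iov_base = buf1, .iov_len = len1 },
        { .iov_base = buf2, .iov_len = len2 },
    };

    return preadv(fd, iov, 2, offset);
}
```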

Other systems have such system calls too, for example NetBSD, check
here: http://www.daemon-systems.org/man/preadv.2.html

The application-visible interface provided by glibc should look like
this to be compatible to the existing implementations in the *BSD family:

ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset);
ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset);

This prototype has one problem though: on 32bit archs the (64bit)
offset argument is unaligned, which the syscall ABI of several archs doesn't
allow. At least s390 needs a wrapper in glibc to handle this. As
we'll need wrappers in glibc anyway, I've decided to push the problem to
glibc entirely and use a syscall prototype which works without
arch-specific wrappers inside the kernel: the offset argument is
explicitly split into two 32bit values.

The patch sports the actual system call implementation and the windup in
the x86 system call tables. Other archs follow as separate patches.

Signed-off-by: Gerd Hoffmann
Cc: Arnd Bergmann
Cc: Al Viro
Cc: Ralf Baechle
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Gerd Hoffmann
2009-04-03 10:05:08 +0800

14 Jan, 2009

5 commits

3cdad4288 [CVE-2009-0029] System call wrappers part 20

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:26 +0800

003d7ab47 [CVE-2009-0029] System call wrappers part 19

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:26 +0800

002c8976e [CVE-2009-0029] System call wrappers part 16

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:25 +0800

6673e0c3f [CVE-2009-0029] System call wrapper special cases

System calls with an unsigned long long argument can't be converted with
the standard wrappers since that would include a cast to long, which in
turn means that we would lose the upper 32 bit on 32 bit architectures.
Also semctl can't use the standard wrapper since it has a 'union'
parameter.

So we handle them as special case and add some extra wrappers instead.

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:18 +0800

2ed7c03ec [CVE-2009-0029] Convert all system calls to return a long

Convert all system calls to return a long. This should be a NOP since all
converted types should have the same size anyway.
With the exception of sys_exit_group which returned void. But that doesn't
matter since the system call doesn't return.

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:14 +0800

06 Jan, 2009

1 commit

5b6f1eb97 vfs: lseek(fd, 0, SEEK_CUR) race condition

This patch fixes a race condition in lseek. While it is expected that
unpredictable behaviour may result while repositioning the offset of a
file descriptor concurrently with reading/writing to the same file
descriptor, this should not happen when merely *reading* the file
descriptor's offset.

Unfortunately, the only portable way in Unix to read a file
descriptor's offset is lseek(fd, 0, SEEK_CUR); however executing this
concurrently with read/write may mess up the position.
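One way to keep a pure position query from disturbing concurrent I/O is to return the current offset without writing it back. The sketch below models that idea in user space; it is an assumption about the shape of such a fix rather than a copy of the patch, and struct query_file and sketch_seek_cur() are hypothetical names. The stores counter exists only to make the absence of a write visible.

```c
#include <errno.h>

/* Hypothetical model of a file position plus a count of writes to it. */
struct query_file {
    long long f_pos;
    unsigned int stores;
};

/*
 * SEEK_CUR handling with the zero-offset query special-cased.
 * A query returns f_pos without storing anything, so it cannot overwrite
 * a position that a concurrent read() or write() has just advanced.
 */
static long long sketch_seek_cur(struct query_file *f, long long offset)
{
    long long pos;

    if (offset == 0)
        return f->f_pos;    /* pure query: nothing is written back */

    pos = f->f_pos + offset;
    if (pos < 0)
        return -EINVAL;

    f->f_pos = pos;
    f->stores++;
    return pos;
}
```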

[with fixes from akpm]

Signed-off-by: Alain Knaff
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

Alain Knaff
2009-01-06 00:53:07 +0800

23 Oct, 2008

1 commit

3a8cff4f0 [PATCH] generic_file_llseek tidyups

Add kerneldoc for generic_file_llseek and generic_file_llseek_unlocked,
use sane variable names and unclutter the code.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2008-10-23 17:12:59 +0800

03 Jul, 2008

1 commit

9465efc9e Remove BKL from remote_llseek v2

- Replace remote_llseek with generic_file_llseek_unlocked (to force compilation
failures in all users)
- Change all users to either use generic_file_llseek_unlocked directly or
take the BKL around it. I changed the file systems that don't use the BKL
for anything (CIFS, GFS) to call it directly. NCPFS, SMBFS and NFS
take the BKL, but explicitly in their own source now.

I moved them all over in a single patch to avoid unbisectable sections.

Open problem: 32bit kernels can corrupt fpos because its modification
is not atomic, but they can do that anyway because there are other paths that
modify it without the BKL.

Do we need a special lock for the pos/f_version = 0 checks?

Trond says the NFS BKL is likely not needed, but keep it for now
until his full audit.

v2: Use generic_file_llseek_unlocked instead of remote_llseek_unlocked
and factor duplicated code (suggested by hch)

Cc: Trond.Myklebust@netapp.com
Cc: swhiteho@redhat.com
Cc: sfrench@samba.org
Cc: vandrove@vc.cvut.cz

Signed-off-by: Andi Kleen
Signed-off-by: Andi Kleen
Signed-off-by: Jonathan Corbet

Andi Kleen
2008-07-03 05:06:27 +0800