Eric Lee / smarc-fsl-linux-kernel

16 Apr, 2015

1 commit

75c3cfa85 VFS: assorted weird filesystems: d_inode() annotations

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2015-04-16 03:06:58 +0800

12 Apr, 2015

1 commit

5d5d56897 make new_sync_{read,write}() static

All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.

Signed-off-by: Al Viro

Al Viro
2015-04-12 10:29:40 +0800

26 Mar, 2015

1 commit

e2e40f2c1 fs: move struct kiocb to fs.h

struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2015-03-26 08:28:11 +0800

07 May, 2014

3 commits

f0d1bec9d new helper: copy_page_from_iter()

parallel to copy_page_to_iter(). pipe_write() switched to it (and became
->write_iter()).

Signed-off-by: Al Viro

Al Viro
2014-05-07 05:39:42 +0800

fb9096a34 pipe: switch to ->read_iter()

Signed-off-by: Al Viro

Al Viro
2014-05-07 05:37:58 +0800

71d8e532b start adding the tag to iov_iter

For now, just use the same thing we pass to ->direct_IO() - it's all
iovec-based at the moment. Pass it explicitly to iov_iter_init() and
account for kvec vs. iovec in there, by the same kludge NFS ->direct_IO()
uses.

Signed-off-by: Al Viro

Al Viro
2014-05-07 05:32:49 +0800

02 Apr, 2014

2 commits

637b58c28 switch pipe_read() to copy_page_to_iter()

Signed-off-by: Al Viro

Al Viro
2014-04-02 11:19:22 +0800

fbb32750a pipe: kill ->map() and ->unmap()

all pipe_buffer_operations have the same instances of those...

Signed-off-by: Al Viro

Al Viro
2014-04-02 11:19:19 +0800

24 Jan, 2014

1 commit

7e775f46a fs/pipe.c: skip file_update_time on frozen fs

A pipe has no data associated with the fs, so it is not a good idea to block
pipe_write() if the FS is frozen, but we cannot update the file's time on such
a filesystem. Let's use the same idea as we use in touch_atime().

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=65701

Signed-off-by: Dmitry Monakhov
Reviewed-by: Jan Kara
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dmitry Monakhov
2014-01-24 08:37:00 +0800

03 Dec, 2013

1 commit

b0d8d2292 vfs: fix subtle use-after-free of pipe_inode_info

The pipe code was trying (and failing) to be very careful about freeing
the pipe info only after the last access, with a pattern like:

        spin_lock(&inode->i_lock);
        if (!--pipe->files) {
                inode->i_pipe = NULL;
                kill = 1;
        }
        spin_unlock(&inode->i_lock);
        __pipe_unlock(pipe);
        if (kill)
                free_pipe_info(pipe);

where the final freeing is done last.

HOWEVER. The above is actually broken, because while the freeing is
done at the end, if we have two racing processes releasing the pipe
inode info, the one that *doesn't* free it will decrement the ->files
count, and unlock the inode i_lock, but then still use the
"pipe_inode_info" afterwards when it does the "__pipe_unlock(pipe)".

This is *very* hard to trigger in practice, since the race window is
very small, and adding debug options seems to just hide it by slowing
things down.

Simon originally reported this way back in July as an Oops in
kmem_cache_allocate due to a single bit corruption (due to the final
"spin_unlock(pipe->mutex.wait_lock)" incrementing a field in a different
allocation that had re-used the free'd pipe-info), it's taken this long
to figure out.

Since the 'pipe->files' accesses aren't even protected by the pipe lock
(we very much use the inode lock for that), the simple solution is to
just drop the pipe lock early. And since there were two users of this
pattern, create a helper function for it.
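The ordering fix can be modelled in userspace. What follows is a minimal sketch, with pthread mutexes standing in for the kernel locks. The names struct owner, struct shared, put_shared() and release_shared() are hypothetical and chosen for illustration; this is not the kernel's code. The point is only the order of operations: the object's own lock is released while the caller still holds a reference, and nothing touches the object after the reference count has been dropped unless the caller is the one freeing it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel objects. */
struct shared {                     /* plays the role of pipe_inode_info */
	pthread_mutex_t mutex;      /* plays the role of pipe->mutex */
	int files;                  /* protected by owner->lock, not by mutex */
};

struct owner {                      /* plays the role of struct inode */
	pthread_mutex_t lock;       /* plays the role of inode->i_lock */
	struct shared *obj;         /* plays the role of inode->i_pipe */
};

/* Drop one reference. Returns 1 if this call freed the object. */
static int put_shared(struct owner *o, struct shared *s)
{
	int kill = 0;

	pthread_mutex_lock(&o->lock);
	assert(s->files > 0);
	if (!--s->files) {
		o->obj = NULL;
		kill = 1;
	}
	pthread_mutex_unlock(&o->lock);

	/* From here on, s may be touched only by the caller that frees it. */
	if (kill) {
		pthread_mutex_destroy(&s->mutex);
		free(s);
	}
	return kill;
}

/* Release path: unlock the object first, then drop the reference. */
static int release_shared(struct owner *o, struct shared *s)
{
	pthread_mutex_unlock(&s->mutex);
	return put_shared(o, s);
}
```

A caller that holds s->mutex calls release_shared() instead of dropping the count and unlocking afterwards, so a racing releaser that frees the object can never pull it out from under a pending unlock.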

Introduced by commit ba5bb147330a ("pipe: take allocation and freeing of
pipe_inode_info out of ->i_mutex").

Reported-by: Simon Kirby
Reported-by: Ian Applegate
Acked-by: Al Viro
Cc: stable@kernel.org # v3.10+
Signed-off-by: Linus Torvalds

Linus Torvalds
2013-12-03 01:44:51 +0800

08 May, 2013

1 commit

a27bb332c aio: don't include aio.h in sched.h

Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet
Cc: Zach Brown
Cc: Felipe Balbi
Cc: Greg Kroah-Hartman
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Rusty Russell
Cc: Jens Axboe
Cc: Asai Thambi S P
Cc: Selvan Mani
Cc: Sam Bradshaw
Cc: Jeff Moyer
Cc: Al Viro
Cc: Benjamin LaHaise
Reviewed-by: "Theodore Ts'o"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kent Overstreet
2013-05-08 11:16:25 +0800

10 Apr, 2013

11 commits

4b8a8f1e4 get rid of the last free_pipe_info() callers

and rename __free_pipe_info() to free_pipe_info()

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:13:02 +0800

7bee130e2 get rid of alloc_pipe_info() argument

not used anymore

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:13:01 +0800

6447a3cf1 get rid of pipe->inode

it's used only as a flag to distinguish normal pipes/FIFOs from the
internal per-task one used by file-to-file splice. And pipe->files
would work just as well for that purpose...

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:13:01 +0800

ebec73f47 introduce variants of pipe_lock/pipe_unlock for real pipes/FIFOs

fs/pipe.c file_operations methods *know* that pipe is not an internal one;
no need to check pipe->inode for those callers.

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:13:01 +0800

de32ec4cf pipe: set file->private_data to ->i_pipe

simplify get_pipe_info(), while we are at it

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:13:00 +0800

72b0d9aac pipe: don't use ->i_mutex

now it can be done - put mutex into pipe_inode_info, use it instead
of ->i_mutex

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:13:00 +0800

ba5bb1473 pipe: take allocation and freeing of pipe_inode_info out of ->i_mutex

* new field - pipe->files; number of struct file over that pipe (all
sharing the same inode, of course); protected by inode->i_lock.
* pipe_release() decrements pipe->files, clears inode->i_pipe if
the counter has reached 0 (all under ->i_lock) and, in that case,
frees pipe after having done pipe_unlock()
* fifo_open() starts with grabbing ->i_lock, and either bumps pipe->files
if ->i_pipe was non-NULL or allocates a new pipe (dropping and regaining
->i_lock) and rechecks ->i_pipe; if it's still NULL, inserts new pipe
there, otherwise bumps ->i_pipe->files and frees the one we'd allocated.
At that point we know that ->i_pipe is non-NULL and won't go away, so
we can do pipe_lock() on it and proceed as we used to. If we end up
failing, decrement pipe->files and if it reaches 0 clear ->i_pipe and
free the sucker after pipe_unlock().
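
The fifo_open() steps above follow a common "drop the lock, allocate, retake the lock and recheck" pattern. Here is a minimal userspace sketch of that pattern, assuming a pthread mutex in place of ->i_lock. The names struct owner, struct shared and get_shared() are hypothetical stand-ins for illustration and are not taken from the kernel source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct shared {                     /* stand-in for pipe_inode_info */
	int files;                  /* number of openers, protected by owner->lock */
};

struct owner {                      /* stand-in for struct inode */
	pthread_mutex_t lock;       /* stand-in for inode->i_lock */
	struct shared *obj;         /* stand-in for inode->i_pipe */
};

/* Take a reference on o->obj, creating it if needed. NULL if allocation fails. */
static struct shared *get_shared(struct owner *o)
{
	struct shared *s, *fresh;

	assert(o != NULL);
	pthread_mutex_lock(&o->lock);
	if (o->obj) {
		/* Fast path: the object already exists, just bump the count. */
		s = o->obj;
		s->files++;
		pthread_mutex_unlock(&o->lock);
		return s;
	}
	pthread_mutex_unlock(&o->lock);     /* do not allocate under the lock */

	fresh = calloc(1, sizeof(*fresh));
	if (!fresh)
		return NULL;
	fresh->files = 1;

	pthread_mutex_lock(&o->lock);
	if (o->obj) {
		/* Someone else installed an object meanwhile: use theirs. */
		s = o->obj;
		s->files++;
		pthread_mutex_unlock(&o->lock);
		free(fresh);
		return s;
	}
	o->obj = fresh;
	pthread_mutex_unlock(&o->lock);
	return fresh;
}
```

The recheck after retaking the lock is what makes the unlocked allocation safe: whichever opener loses the race discards its own allocation and shares the winner's object.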

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:12:59 +0800

18c03cfd4 pipe: preparation to new locking rules

* use the fact that file_inode(file)->i_pipe doesn't change
while the file is opened - no locks needed to access that.
* switch to pipe_lock/pipe_unlock where it's easy to do

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:12:59 +0800

fc7478a2b pipe: switch wait_for_partner() and wake_up_partner() to pipe_inode_info

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:12:59 +0800

599a0ac14 pipe: fold file_operations instances in one

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:12:58 +0800

f776c7388 fold fifo.c into pipe.c

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:12:58 +0800

12 Mar, 2013

1 commit

a930d8790 vfs: fix pipe counter breakage

If you open a pipe for neither read nor write, the pipe code will not
add any usage counters to the pipe, causing the 'struct pipe_inode_info'
to be potentially released early.

That doesn't normally matter, since you cannot actually use the pipe,
but the pipe release code - particularly fasync handling - still expects
the actual pipe infrastructure to all be there. And rather than adding
NULL pointer checks, let's just disallow this case, the same way we
already do for the named pipe ("fifo") case.

This is ancient, going back to pre-2.4 days, and until trinity, nobody
ever noticed.

Reported-by: Dave Jones
Signed-off-by: Linus Torvalds

Al Viro
2013-03-12 23:29:17 +0800

23 Feb, 2013

2 commits

39b652527 fs: Preserve error code in get_empty_filp(), part 2

Allocating a file structure in function get_empty_filp() might fail for
several reasons:
- not enough memory for file structures
- operation is not allowed
- user is over its limit

Currently the function returns NULL in all cases and we lose the exact
reason for the error. All callers of get_empty_filp() assume that the function
can fail with ENFILE only.

Return error through pointer. Change all callers to preserve this error code.

[AV: cleaned up a bit, carved the get_empty_filp() part out into a separate commit
(things remaining here deal with alloc_file()), removed pipe(2) behaviour change]

Signed-off-by: Anatol Pomozov
Reviewed-by: "Theodore Ts'o"
Signed-off-by: Al Viro

Anatol Pomozov
2013-02-23 12:31:32 +0800

496ad9aa8 new helper: file_inode(file)

Signed-off-by: Al Viro

Al Viro
2013-02-23 12:31:31 +0800

27 Sep, 2012

1 commit

5b249b1b0 pipe(2) - race-free error recovery

don't mess with sys_close() if copy_to_user() fails; just postpone
fd_install() until we know it hasn't.

Signed-off-by: Al Viro

Al Viro
2012-09-27 09:08:52 +0800

02 Aug, 2012

1 commit

a0e881b7c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull second vfs pile from Al Viro:
"The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the
deadlock reproduced by xfstests 068), symlink and hardlink restriction
patches, plus assorted cleanups and fixes.

Note that another fsfreeze deadlock (emergency thaw one) is *not*
dealt with - the series by Fernando conflicts a lot with Jan's, breaks
userland ABI (FIFREEZE semantics gets changed) and trades the deadlock
for massive vfsmount leak; this is going to be handled next cycle.
There probably will be another pull request, but that stuff won't be
in it."

Fix up trivial conflicts due to unrelated changes next to each other in
drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c}

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
delousing target_core_file a bit
Documentation: Correct s_umount state for freeze_fs/unfreeze_fs
fs: Remove old freezing mechanism
ext2: Implement freezing
btrfs: Convert to new freezing mechanism
nilfs2: Convert to new freezing mechanism
ntfs: Convert to new freezing mechanism
fuse: Convert to new freezing mechanism
gfs2: Convert to new freezing mechanism
ocfs2: Convert to new freezing mechanism
xfs: Convert to new freezing code
ext4: Convert to new freezing mechanism
fs: Protect write paths by sb_start_write - sb_end_write
fs: Skip atime update on frozen filesystem
fs: Add freezing handling to mnt_want_write() / mnt_drop_write()
fs: Improve filesystem freezing handling
switch the protection of percpu_counter list to spinlock
nfsd: Push mnt_want_write() outside of i_mutex
btrfs: Push mnt_want_write() outside of i_mutex
fat: Push mnt_want_write() outside of i_mutex
...

Linus Torvalds
2012-08-02 01:26:23 +0800

30 Jul, 2012

1 commit

e4fad8e5d consolidate pipe file creation

Signed-off-by: Al Viro

Al Viro
2012-07-30 01:24:19 +0800

24 Jul, 2012

1 commit

2164d3344 pipe: remove KM_USER0 from comments

Signed-off-by: Cong Wang

Cong Wang
2012-07-24 15:27:34 +0800

02 Jun, 2012

1 commit

c3b2da314 fs: introduce inode operation ->update_time

Btrfs has to make sure we have space to allocate new blocks in order to modify
the inode, so updating time can fail. We've gotten around this by having our
own file_update_time but this is kind of a pain, and Christoph has indicated he
would like to make xfs do something different with atime updates. So introduce
->update_time, where we will deal with i_version and a/m/c time updates and
indicate which changes need to be made. The normal version just does what it
has always done, updates the time and marks the inode dirty, and then
filesystems can choose to do something different.

I've gone through all of the users of file_update_time and made them check for
errors with the exception of the fault code since it's complicated and I wasn't
quite sure what to do there, also Jan is going to be pushing the file time
updates into page_mkwrite for those who have it so that should satisfy btrfs and
make it not a big deal to check the file_update_time() return code in the
generic fault path. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-06-02 00:07:25 +0800

31 May, 2012

1 commit

46ce341b2 pipe: return -ENOIOCTLCMD instead of -EINVAL on unknown ioctl command

As described in commit 07d106d0a ("vfs: fix up ENOIOCTLCMD error
handling"), drivers should return -ENOIOCTLCMD if they receive an ioctl
command which they don't understand. Doing so will result in -ENOTTY
being returned to userspace, which matches the behaviour of the compat
layer if it fails to translate an ioctl command.

This patch fixes the pipe ioctl to return -ENOIOCTLCMD instead of
-EINVAL when passed an unknown ioctl command.
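
The userspace-visible effect can be probed with a small program. This is a minimal sketch, assuming that TCGETS (a terminal-only request) counts as a command the pipe ioctl handler does not understand. The helper names tcgets_errno() and pipe_tcgets_errno() are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <termios.h>
#include <unistd.h>

/* Issue a terminal-only ioctl on fd; return the resulting errno, or 0 on success. */
static int tcgets_errno(int fd)
{
	struct termios t;

	assert(fd >= 0);
	return ioctl(fd, TCGETS, &t) == -1 ? errno : 0;
}

/* Run the probe against the read end of a fresh pipe; -1 if pipe() itself failed. */
static int pipe_tcgets_errno(void)
{
	int fds[2];
	int err;

	if (pipe(fds) != 0)
		return -1;
	err = tcgets_errno(fds[0]);     /* errno is captured before close() runs */
	close(fds[0]);
	close(fds[1]);
	return err;
}
```

On a kernel that includes this change, pipe_tcgets_errno() is expected to report ENOTTY; older kernels may report EINVAL instead.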

Cc: Al Viro
Cc: Andrew Morton
Signed-off-by: Will Deacon
Signed-off-by: Al Viro

Will Deacon
2012-05-31 09:04:55 +0800

30 Apr, 2012

1 commit

9883035ae pipes: add a "packetized pipe" mode for writing

The actual internal pipe implementation is already really about
individual packets (called "pipe buffers"), and this simply exposes that
as a special packetized mode.

When we are in the packetized mode (marked by O_DIRECT as suggested by
Alan Cox), a write() on a pipe will not merge the new data with previous
writes, so each write will get a pipe buffer of its own. The pipe
buffer is then marked with the PIPE_BUF_FLAG_PACKET flag, which in turn
will tell the reader side to break the read at that boundary (and throw
away any partial packet contents that do not fit in the read buffer).

End result: as long as you do writes less than PIPE_BUF in size (so that
the pipe doesn't have to split them up), you can now treat the pipe as a
packet interface, where each read() system call will read one packet at
a time. You can just use a sufficiently big read buffer (PIPE_BUF is
sufficient, since bigger than that doesn't guarantee atomicity anyway),
and the return value of the read() will naturally give you the size of
the packet.

NOTE! We do not support zero-sized packets, and zero-sized reads and
writes to a pipe continue to be no-ops. Also note that big packets will
currently be split at write time, but that the size at which that
happens is not really specified (except that it's bigger than PIPE_BUF).
Currently that limit is the system page size, but we might want to
explicitly support bigger packets some day.

The main user for this is going to be the autofs packet interface,
allowing us to stop having to care so deeply about exact packet sizes
(which have had bugs with 32/64-bit compatibility modes). But user
space can create packetized pipes with "pipe2(fd, O_DIRECT)", which will
fail with an EINVAL on kernels that do not support this interface.
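
The packet semantics described above can be tried from user space. The following is a minimal sketch, assuming a Linux system whose C library exposes pipe2() and O_DIRECT under _GNU_SOURCE. The function name packet_pipe_demo() and the payloads are invented for illustration.

```c
#define _GNU_SOURCE                 /* for pipe2() and O_DIRECT */
#include <assert.h>
#include <fcntl.h>
#include <limits.h>                 /* PIPE_BUF */
#include <unistd.h>

/*
 * Write two small packets into a packetized pipe, then read them back
 * with a PIPE_BUF-sized buffer. The two read() return values are stored
 * in got[]. Returns 0 on success, -1 if the kernel rejects O_DIRECT
 * pipes or a write fails.
 */
static int packet_pipe_demo(ssize_t got[2])
{
	int fds[2];
	char buf[PIPE_BUF];
	int rc = -1;

	assert(got != NULL);
	if (pipe2(fds, O_DIRECT) != 0)
		return -1;          /* EINVAL on kernels without packet mode */

	if (write(fds[1], "hello", 5) == 5 && write(fds[1], "abc", 3) == 3) {
		/* Each read() should stop at a packet boundary. */
		got[0] = read(fds[0], buf, sizeof(buf));
		got[1] = read(fds[0], buf, sizeof(buf));
		rc = 0;
	}
	close(fds[0]);
	close(fds[1]);
	return rc;
}
```

With an ordinary pipe() the first read() would return all eight bytes at once; in packet mode the two reads are expected to return 5 and 3 bytes, matching the two write() calls.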

Tested-by: Michael Tokarev
Cc: Alan Cox
Cc: David Miller
Cc: Ian Kent
Cc: Thomas Meyer
Cc: stable@kernel.org # needed for systemd/autofs interaction fix
Signed-off-by: Linus Torvalds

Linus Torvalds
2012-04-30 04:12:42 +0800

24 Mar, 2012

1 commit

b502bd115 magic.h: move some FS magic numbers into magic.h

- Move open-coded filesystem magic numbers into magic.h

- Rearrange magic.h so that the filesystem-related constants are grouped
together.

Signed-off-by: Muthukumar R
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Muthu Kumar
2012-03-24 07:58:31 +0800

20 Mar, 2012

1 commit

e8e3c3d66 fs: remove the second argument of k[un]map_atomic()

Acked-by: Benjamin LaHaise
Signed-off-by: Cong Wang

Cong Wang
2012-03-20 21:48:21 +0800

13 Jan, 2012

1 commit

2ccd4f4d4 pipe: fail cleanly when root tries F_SETPIPE_SZ with big size

When a user with the CAP_SYS_RESOURCE cap tries to F_SETPIPE_SZ a pipe
with a size bigger than kmalloc() can allocate, it spits out an ugly warning:

------------[ cut here ]------------
WARNING: at mm/page_alloc.c:2095 __alloc_pages_nodemask+0x5d3/0x7a0()
Pid: 733, comm: a.out Not tainted 3.2.0-rc1+ #4
Call Trace:
warn_slowpath_common+0x75/0xb0
warn_slowpath_null+0x15/0x20
__alloc_pages_nodemask+0x5d3/0x7a0
__get_free_pages+0x12/0x50
__kmalloc+0x12b/0x150
pipe_set_size+0x75/0x120
pipe_fcntl+0xf8/0x140
do_fcntl+0x2d4/0x410
sys_fcntl+0x66/0xa0
system_call_fastpath+0x16/0x1b
---[ end trace 432f702e6db7b5ee ]---

Instead, make kcalloc() handle the overflow case and fail quietly.
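
The idea of letting the array allocator deal with an oversized request has a direct userspace analogue in calloc(), which returns NULL when the element count multiplied by the element size overflows. The sketch below is only that analogue, not the kernel change itself; struct buf_slot and alloc_slots() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one pipe buffer descriptor. */
struct buf_slot {
	void *page;
	unsigned int offset;
	unsigned int len;
};

/*
 * Allocate a zeroed array of nr slots. Returns NULL if nr * sizeof(slot)
 * overflows or memory is short, instead of silently allocating a
 * wrapped-around (too small) size as malloc(nr * size) could.
 */
static struct buf_slot *alloc_slots(size_t nr)
{
	assert(nr > 0);
	return calloc(nr, sizeof(struct buf_slot));
}
```

A caller only has to check for NULL; an absurdly large count fails the same quiet way as an ordinary out-of-memory condition.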

[akpm@linux-foundation.org: switch to sizeof(*bufs) for 80-column niceness]
Signed-off-by: Sasha Levin
Cc: Alexander Viro
Acked-by: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sasha Levin
2012-01-13 12:13:04 +0800

04 Jan, 2012

1 commit

84b92d39f vfs: pipe.c is really non-modular

... so no exitcalls there. Not much would work if pipe(2) would stop
working, after all...

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:52:41 +0800

01 Nov, 2011

1 commit

d70ef97ba fs/pipe.c: add ->statfs callback for pipefs

Currently a statfs on a pipe's /proc//fd/ link returns -ENOSYS. Wire
pipefs up so that the statfs succeeds.

This is required by checkpoint-restart in the userspace to make it
possible to distinguish pipes from fifos.

When we dump information about task's open files we use the /proc/pid/fd
directory's symlinks and the fact that opening any of them gives us exactly
the same dentry->inode pair as the original process has. Now if a task
we're dumping has opened a pipe and a fifo we need to detect this and act
accordingly. Knowing that an fd with type S_ISFIFO resides on a pipefs is
the most precise way.
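
A checkpoint tool could apply that test roughly as follows. This is a minimal sketch, assuming the pipefs magic number 0x50495045 defined as PIPEFS_MAGIC in linux/magic.h; the helper name is_anonymous_pipe() is made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/vfs.h>
#include <unistd.h>

#ifndef PIPEFS_MAGIC
#define PIPEFS_MAGIC 0x50495045     /* ASCII "PIPE", as in linux/magic.h */
#endif

/*
 * Returns 1 if fd is an anonymous pipe, 0 if it is anything else
 * (including a named FIFO on a regular filesystem), -1 on error.
 */
static int is_anonymous_pipe(int fd)
{
	struct stat st;
	struct statfs sfs;

	assert(fd >= 0);
	if (fstat(fd, &st) != 0)
		return -1;
	if (!S_ISFIFO(st.st_mode))
		return 0;           /* not a pipe or FIFO at all */
	if (fstatfs(fd, &sfs) != 0)
		return -1;          /* e.g. ENOSYS without the ->statfs callback */
	return sfs.f_type == PIPEFS_MAGIC;
}
```

Both pipes and FIFOs pass the S_ISFIFO() check, so the filesystem type is what separates them: a FIFO lives on whatever filesystem holds its name, while an anonymous pipe reports pipefs.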

Signed-off-by: Pavel Emelyanov
Reviewed-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Cyrill Gorcunov
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2011-11-01 08:30:51 +0800

27 Jul, 2011

1 commit

a209dfc7b vfs: dont chain pipe/anon/socket on superblock s_inodes list

Workloads using pipes and sockets hit inode_sb_list_lock contention.

superblock s_inodes list is needed for quota, dirty, pagecache and
fsnotify management. pipe/anon/socket fs are clearly not candidates for
these.

Signed-off-by: Eric Dumazet
Reviewed-by: Christoph Hellwig
Signed-off-by: Al Viro

Eric Dumazet
2011-07-27 00:57:09 +0800

24 Jul, 2011

1 commit

423e0ab08 VFS : mount lock scalability for internal mounts

A number of file systems that don't have a mount point (e.g. sockfs
and pipefs) are not marked as long term. Therefore, in
mntput_no_expire, all locks in the vfs_mount lock are taken, instead of
just the local CPU's lock, to aggregate reference counts when we release
a reference to file objects. In fact, only the local lock needs to be
taken to update ref counts, as these file systems are in no danger of
going away until we are ready to unregister them.

The attached patch marks file systems using kern_mount without a
mount point as long term. Contention on the vfs_mount lock
is now eliminated. Before unregistering such a file system,
kern_unmount should be called to remove the long term flag and
make the mount point ready to be freed.

Signed-off-by: Tim Chen
Signed-off-by: Al Viro

Tim Chen
2011-07-24 22:08:32 +0800

21 Jan, 2011

1 commit

28e58ee8c Fix broken "pipe: use event aware wakeups" optimization

Commit e462c448fdc8 ("pipe: use event aware wakeups") optimized the pipe
event wakeup calls to avoid wakeups if the events do not match the
requested set.

However, the optimization was buggy, in that it didn't actually use the
correct sets for the events: when we make room for more data to be
written, the pipe poll() routine will return both the POLLOUT _and_
POLLWRNORM bits. Similarly for read.

And most critically, when a pipe is released, that will potentially
result in POLLHUP|POLLERR (depending on whether it was the last reader
or writer), not just the regular POLLIN|POLLOUT.

This bug showed itself as a hung gnome-screensaver-dialog process, stuck
forever (or at least until it was poked by a signal or by being traced)
in a poll() system call.
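
The release case is easy to observe from user space. Below is a minimal sketch that polls the read end of a pipe after its only writer has gone away; the function name poll_after_writer_close() is invented for this example.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Poll the read end of a fresh pipe after its only writer has closed.
 * Returns the revents bitmask, or -1 if pipe() or poll() failed.
 */
static int poll_after_writer_close(short events)
{
	int fds[2];
	struct pollfd pfd;
	int n;

	if (pipe(fds) != 0)
		return -1;
	close(fds[1]);                  /* the last writer goes away */

	pfd.fd = fds[0];
	pfd.events = events;
	pfd.revents = 0;
	n = poll(&pfd, 1, 0);           /* timeout 0: just sample the state */
	close(fds[0]);
	if (n < 0)
		return -1;
	assert(n == 0 || n == 1);       /* only one descriptor was polled */
	return pfd.revents;
}
```

Calling poll_after_writer_close(POLLIN) is expected to report POLLHUP even though only POLLIN was requested, which is why a wakeup filtered on the plain POLLIN|POLLOUT set can leave such a waiter asleep.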

Cc: Davide Libenzi
Cc: David S. Miller
Cc: Eric Dumazet
Cc: Jens Axboe
Cc: Andrew Morton
Signed-off-by: Linus Torvalds

Linus Torvalds
2011-01-21 08:21:59 +0800