Eric Lee / linux-smarc-t335x-v3.2

26 Jan, 2009

2 commits

f6d47a176 fuse: fix poll notify ... Browse Code »

Move fuse_copy_finish() to before calling fuse_notify_poll_wakeup().
This is not a big issue because fuse_notify_poll_wakeup() should be
atomic, but it's cleaner this way, and later uses of notification will
need to be able to finish the copying before performing some actions.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2009-01-26 22:00:59 +0800
26c367910 fuse: destroy bdi on umount ... Browse Code »

If a fuse filesystem is unmounted but the device file descriptor
remains open and a new mount reuses the old device number, then the
mount fails with EEXIST and the following warning is printed in the
kernel log:

WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x35/0x3d()
sysfs: duplicate filename '0:15' can not be created

The cause is that the bdi belonging to the fuse filesystem was
destoryed only after the device file was released. Fix this by
calling bdi_destroy() from fuse_put_super() instead.

Signed-off-by: Miklos Szeredi
CC: stable@kernel.org

Miklos Szeredi
2009-01-26 22:00:59 +0800

07 Jan, 2009

1 commit

5fec8bdbf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse ... Browse Code »

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: clean up annotations of fc->lock
fuse: fix sparse warning in ioctl
fuse: update interface version
fuse: add fuse_conn->release()
fuse: separate out fuse_conn_init() from new_conn()
fuse: add fuse_ prefix to several functions
fuse: implement poll support
fuse: implement unsolicited notification
fuse: add file kernel handle
fuse: implement ioctl support
fuse: don't let fuse_req->end() put the base reference
fuse: move FUSE_MINOR to miscdevice.h
fuse: style fixes

Linus Torvalds
2009-01-07 09:01:20 +0800

02 Dec, 2008

1 commit

5d9ec854b fuse: clean up annotations of fc->lock ... Browse Code »

Makes the existing annotations match the more common one per line style
and adds a few missing annotations.

Signed-off-by: Harvey Harrison
Signed-off-by: Miklos Szeredi

Harvey Harrison
2008-12-02 21:49:42 +0800

26 Nov, 2008

5 commits

b93f858ab fuse: add fuse_ prefix to several functions ... Browse Code »

Add fuse_ prefix to request_send*() and get_root_inode() as some of
those functions will be exported for CUSE. With or without CUSE
export, having the function names scoped is a good idea for
debuggability.

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:55 +0800
95668a69a fuse: implement poll support ... Browse Code »

Implement poll support. Polled files are indexed using kh in a RB
tree rooted at fuse_conn->polled_files.

Client should send FUSE_NOTIFY_POLL notification once after processing
FUSE_POLL which has FUSE_POLL_SCHEDULE_NOTIFY set. Sending
notification unconditionally after the latest poll or everytime file
content might have changed is inefficient but won't cause malfunction.

fuse_file_poll() can sleep and requires patches from the following
thread which allows f_op->poll() to sleep.

http://thread.gmane.org/gmane.linux.kernel/726176

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:55 +0800
8599396b5 fuse: implement unsolicited notification ... Browse Code »

Clients always used to write only in response to read requests. To
implement poll efficiently, clients should be able to issue
unsolicited notifications. This patch implements basic notification
support.

Zero fuse_out_header.unique is now accepted and considered unsolicited
notification and the error field contains notification code. This
patch doesn't implement any actual notification.

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:55 +0800
e9bb09dd6 fuse: don't let fuse_req->end() put the base reference ... Browse Code »

fuse_req->end() was supposed to be put the base reference but there's
no reason why it should. It only makes things more complex. Move it
out of ->end() and make it the responsibility of request_end().

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:54 +0800
1729a16c2 fuse: style fixes ... Browse Code »

Fix coding style errors reported by checkpatch and others. Uptdate
copyright date to 2008.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2008-11-26 19:03:54 +0800

14 Nov, 2008

1 commit

2186a71cb CRED: Wrap task credential accesses in the FUSE filesystem ... Browse Code »

Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells
Reviewed-by: James Morris
Acked-by: Serge Hallyn
Cc: Miklos Szeredi
Cc: fuse-devel@lists.sourceforge.net
Signed-off-by: James Morris

David Howells
2008-11-14 07:38:53 +0800

02 Nov, 2008

1 commit

233e70f42 saner FASYNC handling on file close ... Browse Code »

As it is, all instances of ->release() for files that have ->fasync()
need to remember to evict file from fasync lists; forgetting that
creates a hole and we actually have a bunch that *does* forget.

So let's keep our lives simple - let __fput() check FASYNC in
file->f_flags and call ->fasync() there if it's been set. And lose that
crap in ->release() instances - leaving it there is still valid, but we
don't have to bother anymore.

Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2008-11-02 00:49:46 +0800

30 Apr, 2008

2 commits

4dbf930ed fuse: fix sparse warnings ... Browse Code »

fs/fuse/dev.c:306:2: warning: context imbalance in 'wait_answer_interruptible' - unexpected unlock
fs/fuse/dev.c:361:2: warning: context imbalance in 'request_wait_answer' - unexpected unlock
fs/fuse/dev.c:1002:4: warning: context imbalance in 'end_io_requests' - unexpected unlock

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-04-30 23:29:51 +0800
3be5a52b3 fuse: support writable mmap ... Browse Code »

Quoting Linus (3 years ago, FUSE inclusion discussions):

"User-space filesystems are hard to get right. I'd claim that they
are almost impossible, unless you limit them somehow (shared
writable mappings are the nastiest part - if you don't have those,
you can reasonably limit your problems by limiting the number of
dirty pages you accept through normal "write()" calls)."

Instead of attempting the impossible, I've just waited for the dirty page
accounting infrastructure to materialize (thanks to Peter Zijlstra and
others). This nicely solved the biggest problem: limiting the number of pages
used for write caching.

Some small details remained, however, which this largish patch attempts to
address. It provides a page writeback implementation for fuse, which is
completely safe against VM related deadlocks. Performance may not be very
good for certain usage patterns, but generally it should be acceptable.

It has been tested extensively with fsx-linux and bash-shared-mapping.

Fuse page writeback design
--------------------------

fuse_writepage() allocates a new temporary page with GFP_NOFS|__GFP_HIGHMEM.
It copies the contents of the original page, and queues a WRITE request to the
userspace filesystem using this temp page.

The writeback is finished instantly from the MM's point of view: the page is
removed from the radix trees, and the PageDirty and PageWriteback flags are
cleared.

For the duration of the actual write, the NR_WRITEBACK_TEMP counter is
incremented. The per-bdi writeback count is not decremented until the actual
write completes.

On dirtying the page, fuse waits for a previous write to finish before
proceeding. This makes sure, there can only be one temporary page used at a
time for one cached page.

This approach is wasteful in both memory and CPU bandwidth, so why is this
complication needed?

The basic problem is that there can be no guarantee about the time in which
the userspace filesystem will complete a write. It may be buggy or even
malicious, and fail to complete WRITE requests. We don't want unrelated parts
of the system to grind to a halt in such cases.

Also a filesystem may need additional resources (particularly memory) to
complete a WRITE request. There's a great danger of a deadlock if that
allocation may wait for the writepage to finish.

Currently there are several cases where the kernel can block on page
writeback:

- allocation order is larger than PAGE_ALLOC_COSTLY_ORDER
- page migration
- throttle_vm_writeout (through NR_WRITEBACK)
- sync(2)

Of course in some cases (fsync, msync) we explicitly want to allow blocking.
So for these cases new code has to be added to fuse, since the VM is not
tracking writeback pages for us any more.

As an extra safetly measure, the maximum dirty ratio allocated to a single
fuse filesystem is set to 1% by default. This way one (or several) buggy or
malicious fuse filesystems cannot slow down the rest of the system by hogging
dirty memory.

With appropriate privileges, this limit can be raised through
'/sys/class/bdi//max_ratio'.

Signed-off-by: Miklos Szeredi
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-04-30 23:29:50 +0800

07 Feb, 2008

1 commit

d12def1bc fuse: limit queued background requests ... Browse Code »

Libfuse basically creates a new thread for each new request. This is fine for
synchronous requests, which are naturally limited. However background
requests (especially writepage) can cause a thread creation storm.

To avoid this, limit the number of background requests available to userspace.

This is done by introducing another queue for background requests, and a
counter for the number of "active" requests, which are currently available for
userspace.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-02-07 02:41:13 +0800

17 Oct, 2007

6 commits

c9c9d7df5 fuse: no ENOENT from fuse device read ... Browse Code »

Don't return -ENOENT for a read() on the fuse device when the request was
aborted. Instead return -ENODEV, meaning the filesystem has been
force-umounted or aborted.

Previously ENOENT meant that the request was interrupted, but now the
'aborted' flag is not set in case of interrupts.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2007-10-17 23:43:04 +0800
a131de0a4 fuse: no abort on interrupt ... Browse Code »

Don't set 'aborted' flag on a request if it's interrupted. We have to wait
for the answer anyway, and this would only a very little time while copying
the reply.

This means, that write() on the fuse device will not return -ENOENT during
normal operation, only if the filesystem is aborted by a forced umount or
through the fusectl interface.

This could simplify userspace code somewhat when backward compatibility with
earlier kernel versions is not required.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2007-10-17 23:43:04 +0800
819c4b3b4 fuse: cleanup in release ... Browse Code »

Move dput/mntput pair from request_end() to fuse_release_end(), because
there's no other place they are used.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2007-10-17 23:43:04 +0800
c756e0a4d fuse: add reference counting to fuse_file ... Browse Code »

Make lifetime of 'struct fuse_file' independent from 'struct file' by adding a
reference counter and destructor.

This will enable asynchronous page writeback, where it cannot be guaranteed,
that the file is not released while a request with this file handle is being
served.

The actual RELEASE request is only sent when there are no more references to
the fuse_file.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2007-10-17 23:43:03 +0800
de5e3dec4 fuse: fix reserved request wake up ... Browse Code »

Use wake_up_all instead of wake_up in put_reserved_req(), otherwise it is
possible that the right task is not woken up.

Also create a separate reserved_req_waitq in addition to the blocked_waitq,
since they fulfill totally separate functions.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2007-10-17 23:43:03 +0800
f92b99b9d fuse: update backing_dev_info congestion state ... Browse Code »

Set the read and write congestion state if the request queue is close to
blocking, and clear it when it's not.

This prevents unnecessary blocking in readahead and (when writable mmaps are
allowed) writeback.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2007-10-17 23:43:03 +0800

20 Jul, 2007

1 commit

20c2df83d mm: Remove slab destructors from kmem_cache_create(). ... Browse Code »

Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt

Paul Mundt
2007-07-20 09:11:58 +0800

08 Dec, 2006

2 commits

e18b890bb [PATCH] slab: remove kmem_cache_t ... Browse Code »

Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#

set -e

for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done

The script was run like this

sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-12-08 00:39:25 +0800
e94b17660 [PATCH] slab: remove SLAB_KERNEL ... Browse Code »

SLAB_KERNEL is an alias of GFP_KERNEL.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-12-08 00:39:24 +0800

01 Oct, 2006

1 commit

ee0b3e671 [PATCH] Remove readv/writev methods and use aio_read/aio_write instead ... Browse Code »

This patch removes readv() and writev() methods and replaces them with
aio_read()/aio_write() methods.

Signed-off-by: Badari Pulavarty
Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Badari Pulavarty
2006-10-01 15:39:28 +0800

30 Sep, 2006

1 commit

105f4d7a8 [PATCH] fuse: add lock annotations to request_end and fuse_read_interrupt ... Browse Code »

request_end and fuse_read_interrupt release fc->lock. Add lock annotations
to these two functions so that sparse can check callers for lock pairing,
and so that sparse will not complain about these functions since they
intentionally use locks in this manner.

Signed-off-by: Josh Triplett
Acked-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josh Triplett
2006-09-30 00:18:08 +0800

26 Jun, 2006

5 commits

a4d27e75f [PATCH] fuse: add request interruption ... Browse Code »

Add synchronous request interruption. This is needed for file locking
operations which have to be interruptible. However filesystem may implement
interruptibility of other operations (e.g. like NFS 'intr' mount option).

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-06-26 01:01:19 +0800
f9a2842e5 [PATCH] fuse: rename the interrupted flag ... Browse Code »

Rename the 'interrupted' flag to 'aborted', since it indicates exactly that,
and next patch will introduce an 'interrupted' flag for a

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-06-26 01:01:19 +0800
33649c91a [PATCH] fuse: ensure FLUSH reaches userspace ... Browse Code »

All POSIX locks owned by the current task are removed on close(). If the
FLUSH request resulting initiated by close() fails to reach userspace, there
might be locks remaining, which cannot be removed.

The only reason it could fail, is if allocating the request fails. In this
case use the request reserved for RELEASE, or if that is currently used by
another FLUSH, wait for it to become available.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-06-26 01:01:19 +0800
bafa96541 [PATCH] fuse: add control filesystem ... Browse Code »

Add a control filesystem to fuse, replacing the attributes currently exported
through sysfs. An empty directory '/sys/fs/fuse/connections' is still created
in sysfs, and mounting the control filesystem here provides backward
compatibility.

Advantages of the control filesystem over the previous solution:

- allows the object directory and the attributes to be owned by the
filesystem owner, hence letting unpriviled users abort the
filesystem connection

- does not suffer from module unload race

[akpm@osdl.org: fix this fs for recent dhowells depredations]
[akpm@osdl.org: fix 64-bit printk warnings]
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-06-26 01:01:19 +0800
51eb01e73 [PATCH] fuse: no backgrounding on interrupt ... Browse Code »

Don't put requests into the background when a fatal interrupt occurs while the
request is in userspace. This removes a major wart from the implementation.

Backgrounding of requests was introduced to allow breaking of deadlocks.
However now the same can be achieved by aborting the filesystem through the
'abort' sysfs attribute.

This is a change in the interface, but should not cause problems, since these
kinds of deadlocks never happen during normal operation.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-06-26 01:01:19 +0800

26 Apr, 2006

2 commits

6dbbcb120 [fuse] fix deadlock between fuse_put_super() and request_end(), try #2 ... Browse Code »

A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock. Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The solution is to move the fput() outside the region protected by
sbput_sem.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2006-04-26 16:49:06 +0800
5a5fb1ea7 Revert "[fuse] fix deadlock between fuse_put_super() and request_end()" ... Browse Code »

This reverts 73ce8355c243a434524a34c05cc417dd0467996e commit.

It was wrong, because it didn't take into account the requirement,
that iput() for background requests must be performed synchronously
with ->put_super(), otherwise active inodes may remain after unmount.

The right solution is to keep the sbput_sem and perform iput() within
the locked region, but move fput() outside sbput_sem.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2006-04-26 16:48:55 +0800

12 Apr, 2006

3 commits

4858cae4f [fuse] Don't init request twice ... Browse Code »

Request is already initialized in fuse_request_alloc() so no need to
do it again in fuse_get_req().

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2006-04-12 03:16:38 +0800
9bc5dddad [fuse] Fix accounting the number of waiting requests ... Browse Code »

Properly accounting the number of waiting requests was forgotten in
"clean up request accounting" patch.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2006-04-12 03:16:09 +0800
73ce8355c [fuse] fix deadlock between fuse_put_super() and request_end() ... Browse Code »

A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock. Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The chosen soltuion is to get rid of sbput_sem, and instead use the
spinlock to ensure the referenced inodes/file are released only once.
Since the actual release may sleep, defer these outside the locked
region, but using local variables instead of the structure members.

This is a much more rubust solution.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2006-04-12 03:14:26 +0800

11 Apr, 2006

5 commits

08a53cdce [PATCH] fuse: account background requests ... Browse Code »

The previous patch removed limiting the number of outstanding requests. This
patch adds a much simpler limiting, that is also compatible with file locking
operations.

A task may have at most one synchronous request allocated. So these requests
need not be otherwise limited.

However the number of background requests (release, forget, asynchronous
reads, interrupted requests) can grow indefinitely. This can be used by a
malicous user to cause FUSE to allocate arbitrary amounts of unswappable
kernel memory, denying service.

For this reason add a limit for the number of background requests, and block
allocations of new requests until the number goes bellow the limit.

Also use this mechanism to block all requests until the INIT reply is
received.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-04-11 21:18:49 +0800
ce1d5a491 [PATCH] fuse: clean up request accounting ... Browse Code »

FUSE allocated most requests from a fixed size pool filled at mount time.
However in some cases (release/forget) non-pool requests were used. File
locking operations aren't well served by the request pool, since they may
block indefinetly thus exhausting the pool.

This patch removes the request pool and always allocates requests on demand.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-04-11 21:18:49 +0800
a87046d82 [PATCH] fuse: consolidate device errors ... Browse Code »

Return consistent error values for the case when the opened device file has no
mount associated yet.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-04-11 21:18:48 +0800
d71331146 [PATCH] fuse: use a per-mount spinlock ... Browse Code »

Remove the global spinlock in favor of a per-mount one.

This patch is basically find & replace. The difficult part has already been
done by the previous patch.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-04-11 21:18:48 +0800
0720b3159 [PATCH] fuse: simplify locking ... Browse Code »

This is in preparation for removing the global spinlock in favor of a
per-mount one.

The only critical part is the interaction between fuse_dev_release() and
fuse_fill_super(): fuse_dev_release() must see the assignment to
file->private_data, otherwise it will leak the reference to fuse_conn.

This is ensured by the fput() operation, which will synchronize the assignment
with other CPU's that may do a final fput() soon after this.

Also redundant locking is removed from fuse_fill_super(), where exclusion is
already ensured by the BKL held for this function by the VFS.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2006-04-11 21:18:48 +0800