Eric Lee / linux-smarc-t335x-v3.2

25 Aug, 2011

1 commit

051732bcb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: check size of FUSE_NOTIFY_INVAL_ENTRY message
fuse: mark pages accessed when written to
fuse: delete dead .write_begin and .write_end aops
fuse: fix flock
fuse: fix non-ANSI void function notation

Linus Torvalds
2011-08-25 00:14:42 +0800

08 Aug, 2011

1 commit

37fb3a30b fuse: fix flock

Commit a9ff4f87 "fuse: support BSD locking semantics" overlooked a
number of issues with supporting flock locks over existing POSIX
locking infrastructure:

- it's not backward compatible, passing flock(2) calls to userspace
unconditionally (if userspace sets FUSE_POSIX_LOCKS)

- it doesn't cater for the fact that flock locks are automatically
unlocked on file release

- it doesn't take into account the fact that flock exclusive locks
(write locks) don't need an fd opened for write.

The last one invalidates the original premise of the patch that flock
locks can be emulated with POSIX locks.

This patch fixes the first two issues. The last one needs to be fixed
in userspace if the filesystem assumed that a write lock will happen
only on a file opened for write (as in the case of the current fuse
library).
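
The third issue can be observed from userspace on a local Linux filesystem. The following is a minimal sketch, not part of the patch: the helper name `probe_write_locks` is made up for illustration, and it assumes a platform where the `fcntl` module is available. It takes an exclusive flock on a descriptor opened read-only, then tries the POSIX equivalent on the same descriptor.

```python
import errno
import fcntl
import os
import tempfile


def probe_write_locks():
    """Try exclusive locks on a read-only fd.

    Returns (flock_ok, posix_errno): whether flock(LOCK_EX) succeeded, and
    the errno raised by the POSIX write lock (0 if it succeeded).
    """
    with tempfile.NamedTemporaryFile() as tmp:
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            # BSD-style exclusive lock: does not require write access.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            flock_ok = True
            fcntl.flock(fd, fcntl.LOCK_UN)

            # POSIX write lock (F_SETLK with F_WRLCK under the hood).
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                posix_errno = 0
            except OSError as exc:
                posix_errno = exc.errno
        finally:
            os.close(fd)
    return flock_ok, posix_errno


if __name__ == "__main__":
    ok, err = probe_write_locks()
    print("flock LOCK_EX on O_RDONLY fd succeeded:", ok)
    print("POSIX write lock errno:", errno.errorcode.get(err, err))
```

A filesystem that maps every exclusive flock onto a POSIX write lock would hit the second case for files opened read-only, which is why the message says this part has to be handled in userspace.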

Reported-by: Sebastian Pipping
Signed-off-by: Miklos Szeredi

Miklos Szeredi
2011-08-08 22:08:08 +0800

21 Jul, 2011

1 commit

02c24a821 fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers

Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2. For correctness' sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,

Acked-by: Jan Kara
Signed-off-by: Josef Bacik
Signed-off-by: Al Viro

Josef Bacik
2011-07-21 08:47:59 +0800

21 Mar, 2011

1 commit

07d5f69b4 fuse: reduce size of struct fuse_request

Reduce the size of struct fuse_request by removing cuse_init_out from
the request structure and allocating it dynamically instead.

CC: Tejun Heo
Signed-off-by: Miklos Szeredi

Miklos Szeredi
2011-03-21 20:58:05 +0800

25 Feb, 2011

1 commit

5a18ec176 fuse: fix hang of single threaded fuseblk filesystem

Single threaded NTFS-3G could get stuck if a delayed RELEASE reply
triggered a DESTROY request via path_put().

Fix this by

a) making RELEASE requests synchronous, whenever possible, on fuseblk
filesystems

b) if not possible (triggered by an asynchronous read/write) then do
the path_put() in a separate thread with schedule_work().

Reported-by: Oliver Neukum
Cc: stable@kernel.org
Signed-off-by: Miklos Szeredi

Miklos Szeredi
2011-02-25 21:44:58 +0800

08 Dec, 2010

2 commits

02c048b91 fuse: allow batching of FORGET requests

Terje Malmedal reports that a fuse filesystem with 32 million inodes
on a machine with lots of memory can take up to 30 minutes to process
FORGET requests when all those inodes are evicted from the icache.

To solve this, create a BATCH_FORGET request that allows up to about
8000 FORGET requests to be sent in a single message.

This request is only sent if userspace supports interface version 7.16
or later, otherwise fall back to sending individual FORGET messages.
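
The batching and the version fallback can be sketched as follows. This is an illustrative model only: the function names are hypothetical, the limit of 8000 is the approximate figure quoted above rather than the exact kernel constant, and a forget is represented as a `(nodeid, nlookup)` pair.

```python
from itertools import islice

# Approximate figure from the commit message, not the exact kernel limit.
MAX_FORGETS_PER_BATCH = 8000


def batch_forgets(forgets, max_per_batch=MAX_FORGETS_PER_BATCH):
    """Yield lists of forgets, each holding at most max_per_batch entries."""
    it = iter(forgets)
    while True:
        batch = list(islice(it, max_per_batch))
        if not batch:
            return
        yield batch


def choose_messages(forgets, minor_version):
    """Model the fallback: batch for protocol 7.16+, else one message each."""
    if minor_version >= 16:
        return [("BATCH_FORGET", batch) for batch in batch_forgets(forgets)]
    return [("FORGET", [forget]) for forget in forgets]
```

Under these assumptions, 32 million evicted inodes would need about 4000 messages instead of 32 million.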

Reported-by: Terje Malmedal
Signed-off-by: Miklos Szeredi

Miklos Szeredi
2010-12-08 03:16:56 +0800

07e77dca8 fuse: separate queue for FORGET requests

Terje Malmedal reports that a fuse filesystem with 32 million inodes
on a machine with lots of memory can go unresponsive for up to 30
minutes when all those inodes are evicted from the icache.

The reason is that FORGET messages, sent when the inode is evicted,
are queued up together with regular filesystem requests, and while the
huge queue of FORGET messages is processed no other filesystem
operation can proceed.

Since a full fuse request structure is allocated for each inode, these
take up quite a bit of memory as well.

To solve these issues, create a slim 'fuse_forget_link' structure
containing just the minimum of information required to send the FORGET
request and chain these on a separate queue.

When userspace is asking for a request make sure that FORGET and
non-FORGET requests are selected fairly: for each 8 non-FORGET allow
16 FORGET requests. This will make sure FORGETs do not pile up, yet
other requests are also allowed to proceed while the queued FORGETs
are processed.
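
The 16:8 ratio can be illustrated with a simplified round-based model. This is a sketch under assumptions, not the kernel's dequeue logic: the function name and the choice to serve FORGETs first in each round are made up for illustration.

```python
from collections import deque


def dequeue_fairly(forgets, others, forget_quota=16, other_quota=8):
    """Interleave two queues: up to 16 FORGETs, then up to 8 other requests."""
    forgets, others = deque(forgets), deque(others)
    order = []
    while forgets or others:
        # Serve a burst of FORGETs so they do not pile up...
        for _ in range(forget_quota):
            if not forgets:
                break
            order.append(forgets.popleft())
        # ...then let regular requests make progress.
        for _ in range(other_quota):
            if not others:
                break
            order.append(others.popleft())
    return order
```

When one queue is empty the other is served without waiting, so neither kind of request is delayed by an idle counterpart.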

Reported-by: Terje Malmedal
Signed-off-by: Miklos Szeredi

Miklos Szeredi
2010-12-08 03:16:56 +0800

12 Jul, 2010

2 commits

2d45ba381 fuse: add retrieve request

A userspace filesystem can request data to be retrieved from the inode's
mapping. This request is synchronous and the retrieved data is queued
as a new request. If the write to the fuse device returns an error
then the retrieve request was not completed and a reply will not be
sent.

Only present pages are returned in the retrieve reply. Retrieving
stops when it finds a non-present page and only data prior to that is
returned.
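
The stop-at-first-hole rule can be modelled at page granularity. The sketch below is illustrative only: `retrievable_bytes` is a hypothetical helper, the 4096-byte page size is an assumption, and any clamping to the file size is left out.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


def retrievable_bytes(present_pages, offset, size, page_size=PAGE_SIZE):
    """Count the bytes returned for a retrieve of `size` bytes at `offset`.

    `present_pages` is a set of page indices currently in the mapping.
    Walking stops at the first page that is not present.
    """
    total = 0
    pos = offset
    end = offset + size
    while pos < end:
        index = pos // page_size
        if index not in present_pages:
            break  # non-present page: only data prior to it is returned
        page_end = (index + 1) * page_size
        chunk = min(end, page_end) - pos
        total += chunk
        pos += chunk
    return total
```

For example, with pages 0, 1 and 3 present, a four-page retrieve from offset 0 covers only the first two pages in this model, because page 2 is missing.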

This request doesn't change the dirty state of pages.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2010-07-12 20:41:40 +0800

a1d75f258 fuse: add store request

A userspace filesystem can request data to be stored in the inode's
mapping. This request is synchronous and has no reply. If the write
to the fuse device returns an error then the store request was not
fully completed (but may have updated some pages).

If the stored data overflows the current file size, then the size is
extended, similarly to a write(2) on the filesystem.

Pages which have been completely stored are marked uptodate.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2010-07-12 20:41:40 +0800

31 May, 2010

1 commit

003386fff Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
mm: export generic_pipe_buf_*() to modules
fuse: support splice() reading from fuse device
fuse: allow splice to move pages
mm: export remove_from_page_cache() to modules
mm: export lru_cache_add_*() to modules
fuse: support splice() writing to fuse device
fuse: get page reference for readpages
fuse: use get_user_pages_fast()
fuse: remove unneeded variable

Linus Torvalds
2010-05-31 00:16:14 +0800

28 May, 2010

1 commit

7ea808591 drop unused dentry argument to ->fsync

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-05-28 10:05:02 +0800

25 May, 2010

1 commit

ce534fb05 fuse: allow splice to move pages

When splicing buffers to the fuse device with SPLICE_F_MOVE, try to
move pages from the pipe buffer into the page cache. This allows
populating the fuse filesystem's cache without ever touching the page
contents, i.e. zero copy read capability.

The following steps are performed when trying to move a page into the
page cache:

- buf->ops->confirm() to make sure the new page is uptodate
- buf->ops->steal() to try to remove the new page from its previous place
- remove_from_page_cache() on the old page
- add_to_page_cache_locked() on the new page

If any of the above steps fail (non-fatally) then the code falls back
to copying the page. In particular ->steal() will fail if there are
external references (other than the page cache and the pipe buffer) to
the page.

Also since the remove_from_page_cache() + add_to_page_cache_locked()
are non-atomic it is possible that the page cache is repopulated in
between the two and add_to_page_cache_locked() will fail. This could
be fixed by creating a new atomic replace_page_cache_page() function.

fuse_readpages_end() needed to be reworked so it works even if
page->mapping is NULL for some or all pages which can happen if the
add_to_page_cache_locked() failed.

A number of sanity checks were added to make sure the stolen pages
don't have weird flags set, etc... These could be moved into generic
splice/steal code.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2010-05-25 21:06:07 +0800

24 Sep, 2009

1 commit

c08d3b0e3 truncate: use new helpers

Update some fs code to make use of new helper functions introduced
in the previous patch. Should be no significant change in behaviour
(except CIFS now calls send_sig under i_lock, via inode_newsize_ok).

Reviewed-by: Christoph Hellwig
Acked-by: Miklos Szeredi
Cc: linux-nfs@vger.kernel.org
Cc: Trond.Myklebust@netapp.com
Cc: linux-cifs-client@lists.samba.org
Cc: sfrench@samba.org
Signed-off-by: Nick Piggin
Signed-off-by: Al Viro

npiggin@suse.de
2009-09-24 20:41:47 +0800

16 Sep, 2009

1 commit

79a9d9943 fuse: add fusectl interface to max_background

Make the max_background and congestion_threshold parameters of a FUSE
mount tunable at runtime by adding the respective knobs to its directory
within the fusectl filesystem.
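
Reading the two knobs from userspace could look like the sketch below. The helper name `read_limits` is hypothetical, and the caller is assumed to pass the connection's directory inside a mounted fusectl filesystem (commonly found under /sys/fs/fuse/connections/).

```python
from pathlib import Path

# File names of the two tunables described in this commit.
KNOBS = ("max_background", "congestion_threshold")


def read_limits(conn_dir):
    """Return the knob values found in one fusectl connection directory."""
    conn_dir = Path(conn_dir)
    limits = {}
    for knob in KNOBS:
        path = conn_dir / knob
        if path.is_file():
            limits[knob] = int(path.read_text().strip())
    return limits
```

Changing a value at runtime would be a matter of writing a new number to the same file, given sufficient privileges.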

Signed-off-by: Csaba Henk
Signed-off-by: Miklos Szeredi

Csaba Henk
2009-09-16 20:15:29 +0800

07 Jul, 2009

1 commit

7a6d3c8b3 fuse: make the number of max background requests and congestion threshold tunable

The practical values for these limits depend on the design of the
filesystem server so let userspace set them at initialization time.

Signed-off-by: Csaba Henk
Signed-off-by: Miklos Szeredi

Csaba Henk
2009-07-07 23:28:52 +0800

01 Jul, 2009

2 commits

3b463ae0c fuse: invalidation reverse calls

Add notification messages that allow the filesystem to invalidate VFS
caches.

Two notifications are added:

1) inode invalidation

- invalidate cached attributes
- invalidate a range of pages in the page cache (this is optional)

2) dentry invalidation

- try to invalidate a subtree in the dentry cache

Care must be taken while accessing the 'struct super_block' for the
mount, as it can go away while an invalidation is in progress. To
prevent this, introduce a rw-semaphore, that is taken for read during
the invalidation and taken for write in the ->kill_sb callback.

Cc: Csaba Henk
Cc: Anand Avati
Signed-off-by: Miklos Szeredi

John Muir
2009-07-01 02:12:24 +0800

e0a43ddcc fuse: allow umask processing in userspace

This patch lets filesystems handle masking the file mode on creation.
This is needed if the filesystem is using ACLs.

- The CREATE, MKDIR and MKNOD requests are extended with a "umask"
parameter.

- A new FUSE_DONT_MASK flag is added to the INIT request/reply. With
this the filesystem may request that the create mode is not masked.
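
The effect of the flag can be sketched as a small model. Both function names are hypothetical, and the ACL handling is deliberately simplified: a default ACL is reduced to a single permission mask that replaces the umask.

```python
def create_mode_sent(mode, umask, dont_mask):
    """Model the (mode, umask) pair carried by CREATE, MKDIR or MKNOD.

    Without FUSE_DONT_MASK the mode arrives already masked; with it, the
    raw mode is passed along and the filesystem decides what to do.
    """
    if dont_mask:
        return mode, umask
    return mode & ~umask, umask


def final_mode(mode, umask, default_acl_perms=None):
    """Filesystem side: a default ACL (if any) takes the place of the umask."""
    if default_acl_perms is not None:
        return mode & default_acl_perms
    return mode & ~umask
```

In this model a 0666 create with umask 022 under a directory whose default ACL allows 0660 ends up as 0660, which is only possible if the kernel has not already stripped the group write bit.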

CC: Jean-Pierre André
Signed-off-by: Miklos Szeredi

Miklos Szeredi
2009-07-01 02:12:23 +0800

28 Apr, 2009

8 commits

08cbf542b fuse: export symbols to be used by CUSE

Export the following symbols for CUSE.

fuse_conn_put()
fuse_conn_get()
fuse_conn_kill()
fuse_send_init()
fuse_do_open()
fuse_sync_release()
fuse_direct_io()
fuse_do_ioctl()
fuse_file_poll()
fuse_request_alloc()
fuse_get_req()
fuse_put_request()
fuse_request_send()
fuse_abort_conn()
fuse_dev_release()
fuse_dev_operations

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2009-04-28 22:56:42 +0800

a325f9b92 fuse: update fuse_conn_init() and separate out fuse_conn_kill()

Update fuse_conn_init() such that it doesn't take @sb and move bdi
registration into a separate function. Also separate out
fuse_conn_kill() from fuse_put_super().

These will be used to implement cuse.

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2009-04-28 22:56:41 +0800

8b0797a49 fuse: don't use inode in fuse_sync_release()

Make fuse_sync_release() a generic helper function that doesn't need a
struct inode pointer. This makes it suitable for use by CUSE.

Change return value of fuse_release_common() from int to void.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2009-04-28 22:56:39 +0800

91fe96b40 fuse: create fuse_do_open() helper for CUSE

Create a helper for sending an OPEN request that doesn't need a struct
inode pointer.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2009-04-28 22:56:37 +0800

c7b7143c6 fuse: clean up args in fuse_finish_open() and fuse_release_fill()

Move setting ff->fh, ff->nodeid and file->private_data outside
fuse_finish_open(). Add ->open_flags member to struct fuse_file.

This simplifies the argument passing to fuse_finish_open() and
fuse_release_fill(), and paves the way for creating an open helper
that doesn't need an inode pointer.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2009-04-28 22:56:37 +0800

2106cb189 fuse: don't use inode in helpers called by fuse_direct_io()

Use ff->fc and ff->nodeid instead of passing down the inode.

This prepares this function for use by CUSE, where the inode is not
owned by a fuse filesystem.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2009-04-28 22:56:37 +0800

da5e47145 fuse: add members to struct fuse_file

Add new members ->fc and ->nodeid to struct fuse_file. This will aid
in converting functions for use by CUSE, where the inode is not owned
by a fuse filesystem.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2009-04-28 22:56:36 +0800

b0be46ebf fuse: use struct path in release structure

Use struct path instead of separate dentry and vfsmount in
req->misc.release.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2009-04-28 22:56:36 +0800

28 Mar, 2009

1 commit

4269590a7 constify dentry_operations: FUSE

Signed-off-by: Al Viro

Al Viro
2009-03-28 02:44:01 +0800

26 Nov, 2008

6 commits

43901aabd fuse: add fuse_conn->release()

Add fuse_conn->release() so that fuse_conn can be embedded in other
structures.

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:56 +0800

0d179aa59 fuse: separate out fuse_conn_init() from new_conn()

Separate out fuse_conn_init() from new_conn() and while at it
initialize fuse_conn->entry during conn initialization.

This will be used by CUSE.

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:55 +0800

b93f858ab fuse: add fuse_ prefix to several functions

Add fuse_ prefix to request_send*() and get_root_inode() as some of
those functions will be exported for CUSE. With or without CUSE
export, having the function names scoped is a good idea for
debuggability.

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:55 +0800

95668a69a fuse: implement poll support

Implement poll support. Polled files are indexed using kh in a RB
tree rooted at fuse_conn->polled_files.

Client should send FUSE_NOTIFY_POLL notification once after processing
FUSE_POLL which has FUSE_POLL_SCHEDULE_NOTIFY set. Sending
notification unconditionally after the latest poll or every time the file
content might have changed is inefficient but won't cause malfunction.

fuse_file_poll() can sleep and requires patches from the following
thread which allows f_op->poll() to sleep.

http://thread.gmane.org/gmane.linux.kernel/726176

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:55 +0800

acf99433d fuse: add file kernel handle

The file handle, fuse_file->fh, is an opaque value supplied by the userland
FUSE server and its uniqueness is not guaranteed. Add a file kernel handle,
fuse_file->kh, which is allocated by the kernel on file allocation and
guaranteed to be unique.

This will be used by poll to match a notification to the respective file,
but can be used for other purposes where a unique file handle is
necessary.
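
The difference between the two handles can be sketched as below. This is a toy model with hypothetical names: it uses a counter and a dictionary where the kernel allocates kh itself and, for poll, indexes files in an RB tree.

```python
import itertools


class FileTable:
    """Toy table of open files keyed by a kernel-allocated unique handle."""

    def __init__(self):
        self._next_kh = itertools.count(1)  # never reused in this model
        self._by_kh = {}

    def open_file(self, fh):
        """Register a file; `fh` comes from userspace and may repeat."""
        kh = next(self._next_kh)
        self._by_kh[kh] = {"fh": fh, "kh": kh}
        return kh

    def lookup(self, kh):
        """Match a notification carrying `kh` to its file, if still open."""
        return self._by_kh.get(kh)

    def release(self, kh):
        self._by_kh.pop(kh, None)
```

Two files opened with the same fh still get distinct kh values, so a notification that names a kh identifies exactly one file.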

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-11-26 19:03:55 +0800

1729a16c2 fuse: style fixes

Fix coding style errors reported by checkpatch and others. Update
copyright date to 2008.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2008-11-26 19:03:54 +0800

16 Oct, 2008

1 commit

29d434b39 fuse: add include protectors

Add include protectors to include/linux/fuse.h and fs/fuse/fuse_i.h.

Signed-off-by: Tejun Heo
Signed-off-by: Miklos Szeredi

Tejun Heo
2008-10-16 22:08:57 +0800

26 Jul, 2008

2 commits

33670fa29 fuse: nfs export special lookups

Implement the get_parent export operation by sending a LOOKUP request with
".." as the name.

Implement looking up an inode by node ID after it has been evicted from
the cache. This is done by sending a LOOKUP request with "." as the name
(for all file types, not just directories).

The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to
indicate that it supports these special lookups.

Thanks to John Muir for the original implementation of this feature.

Signed-off-by: Miklos Szeredi
Cc: "J. Bruce Fields"
Cc: Trond Myklebust
Cc: Matthew Wilcox
Cc: David Teigland
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-07-26 01:53:48 +0800

dbd561d23 fuse: add export operations

Implement export_operations, to allow fuse filesystems to be exported to
NFS. This feature has been in the out-of-tree fuse module, and is widely
used and tested.

It was not originally merged into mainline, because doing the NFS
export in userspace was thought to be a cleaner and more efficient way of
doing it, than through the kernel.

While that is true, it would also have involved a lot of duplicated effort
at reimplementing NFS exporting (all the different versions of the
protocol). This effort was unfortunately not undertaken by anyone, so we
are left with doing it the easy but less efficient way.

If this feature goes in, the out-of-tree fuse module can go away,
which would have several advantages:

- not having to maintain two versions
- less confusion for users
- no bugs due to kernel API changes

Comment from hch:
- Use the same fh_type values as XFS, since we use the same fh encoding.

Signed-off-by: Miklos Szeredi
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-07-26 01:53:48 +0800

13 May, 2008

1 commit

78bb6cb9a fuse: add flag to turn on big writes

Prior to 2.6.26 fuse only supported single page write requests. In theory all
fuse filesystems should be able to support writes bigger than 4k, as there's
nothing in the API to prevent it. Unfortunately there's a known case in
NTFS-3G where big writes cause filesystem corruption. There could also be
other filesystems, where the lack of testing with big write requests would
result in bugs.

To prevent such problems on a kernel upgrade, disable big writes by default,
but let filesystems set a flag to turn it on.
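
What the flag changes for a filesystem can be sketched by splitting one write() into WRITE requests. This is a simplified model: `split_write` is a hypothetical helper, and the 4096-byte page size and 128 KiB request limit are assumptions chosen for the example.

```python
def split_write(offset, length, big_writes, page_size=4096, max_write=131072):
    """Return the sizes of the WRITE requests for one write() call.

    Without big writes each request stays within a single page; with big
    writes a request may carry up to `max_write` bytes.
    """
    chunks = []
    pos, end = offset, offset + length
    while pos < end:
        if big_writes:
            chunk = min(max_write, end - pos)
        else:
            # Stop at the next page boundary.
            chunk = min(page_size - pos % page_size, end - pos)
        chunks.append(chunk)
        pos += chunk
    return chunks
```

A filesystem that was only ever tested with the first pattern sees a request shape it has never handled once big writes are on, which is the risk the default guards against.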

Signed-off-by: Miklos Szeredi
Cc: Szabolcs Szakacsits
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-05-13 23:02:26 +0800

30 Apr, 2008

4 commits

b48badf01 fuse: fix node ID type

Node ID is 64bit but it is passed as unsigned long to some functions. This
breakage wasn't noticed, because libfuse uses unsigned long too.
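
The truncation this avoids can be shown with plain integer arithmetic. The helper name is hypothetical, and the width of unsigned long is passed in explicitly because it depends on the platform.

```python
def as_unsigned_long(nodeid, long_bits):
    """Model passing a 64-bit node ID through an unsigned long of given width."""
    return nodeid & ((1 << long_bits) - 1)


if __name__ == "__main__":
    nodeid = (1 << 32) + 5
    print(hex(as_unsigned_long(nodeid, 64)))  # value survives intact
    print(hex(as_unsigned_long(nodeid, 32)))  # upper 32 bits are lost
```

On a platform with a 32-bit unsigned long, two different node IDs that share their low 32 bits would become indistinguishable.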

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-04-30 23:29:51 +0800

5c5c5e51b fuse: update file size on short read

If the READ request returned a short count, then either

- cached size is incorrect
- filesystem is buggy, as short reads are only allowed on EOF

So assume that the size is wrong and refresh it, so that cached read() doesn't
zero fill the missing chunk.
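
The size refresh can be sketched as a pure function. This is an assumption-laden model rather than the kernel code: the helper name is hypothetical, and it assumes the size is only ever lowered to the end of the data actually returned.

```python
def size_after_read(cached_size, offset, requested, returned):
    """Return the cached file size to use after a READ reply.

    A short count is taken to mean EOF was hit at offset + returned.
    """
    if returned < requested:
        return min(cached_size, offset + returned)
    return cached_size
```

With a cached size of 8192, a 4096-byte read at offset 4096 that returns only 1000 bytes yields 5096 in this model, so a later cached read() stops there instead of zero filling up to 8192.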

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-04-30 23:29:50 +0800

3be5a52b3 fuse: support writable mmap

Quoting Linus (3 years ago, FUSE inclusion discussions):

"User-space filesystems are hard to get right. I'd claim that they
are almost impossible, unless you limit them somehow (shared
writable mappings are the nastiest part - if you don't have those,
you can reasonably limit your problems by limiting the number of
dirty pages you accept through normal "write()" calls)."

Instead of attempting the impossible, I've just waited for the dirty page
accounting infrastructure to materialize (thanks to Peter Zijlstra and
others). This nicely solved the biggest problem: limiting the number of pages
used for write caching.

Some small details remained, however, which this largish patch attempts to
address. It provides a page writeback implementation for fuse, which is
completely safe against VM related deadlocks. Performance may not be very
good for certain usage patterns, but generally it should be acceptable.

It has been tested extensively with fsx-linux and bash-shared-mapping.

Fuse page writeback design
--------------------------

fuse_writepage() allocates a new temporary page with GFP_NOFS|__GFP_HIGHMEM.
It copies the contents of the original page, and queues a WRITE request to the
userspace filesystem using this temp page.

The writeback is finished instantly from the MM's point of view: the page is
removed from the radix trees, and the PageDirty and PageWriteback flags are
cleared.

For the duration of the actual write, the NR_WRITEBACK_TEMP counter is
incremented. The per-bdi writeback count is not decremented until the actual
write completes.

On dirtying the page, fuse waits for a previous write to finish before
proceeding. This makes sure there can only be one temporary page used at a
time for one cached page.

This approach is wasteful in both memory and CPU bandwidth, so why is this
complication needed?

The basic problem is that there can be no guarantee about the time in which
the userspace filesystem will complete a write. It may be buggy or even
malicious, and fail to complete WRITE requests. We don't want unrelated parts
of the system to grind to a halt in such cases.

Also a filesystem may need additional resources (particularly memory) to
complete a WRITE request. There's a great danger of a deadlock if that
allocation may wait for the writepage to finish.

Currently there are several cases where the kernel can block on page
writeback:

- allocation order is larger than PAGE_ALLOC_COSTLY_ORDER
- page migration
- throttle_vm_writeout (through NR_WRITEBACK)
- sync(2)

Of course in some cases (fsync, msync) we explicitly want to allow blocking.
So for these cases new code has to be added to fuse, since the VM is not
tracking writeback pages for us any more.

As an extra safety measure, the maximum dirty ratio allocated to a single
fuse filesystem is set to 1% by default. This way one (or several) buggy or
malicious fuse filesystems cannot slow down the rest of the system by hogging
dirty memory.

With appropriate privileges, this limit can be raised through
'/sys/class/bdi//max_ratio'.

Signed-off-by: Miklos Szeredi
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-04-30 23:29:50 +0800

b6f2fcbcf mm: bdi: expose the BDI object in sysfs for FUSE

Register FUSE's backing_dev_info under sysfs with the name "fuse-MAJOR:MINOR"

Make the fuse control filesystem use s_dev instead of a fuse specific ID.
This makes it easier to match directories under /sys/fs/fuse/connections/ with
directories under /sys/class/bdi, and with actual mounts.
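
Matching a mount to its BDI directory could be sketched as follows. The helper names are hypothetical, and the "fuse-MAJOR:MINOR" form is taken from this commit message only; other kernel versions may name the directory differently.

```python
import os


def fuse_bdi_name(st_dev):
    """Build the BDI name described above from a device number."""
    return f"fuse-{os.major(st_dev)}:{os.minor(st_dev)}"


def fuse_bdi_dir(mountpoint):
    """Guess the /sys/class/bdi directory for a mounted fuse filesystem."""
    st_dev = os.stat(mountpoint).st_dev
    return os.path.join("/sys/class/bdi", fuse_bdi_name(st_dev))
```

Because the name is derived from s_dev, the same device number that stat(2) reports for the mount point is enough to locate the directory.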

Signed-off-by: Miklos Szeredi
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-04-30 23:29:49 +0800