Eric Lee / linux-smarc-t335x-v3.2

10 Jan, 2009

1 commit

c4be0c1dc filesystem freeze: add error handling of write_super_lockfs/unlockfs ... Browse Code »

Currently, ext3 in mainline Linux doesn't have the freeze feature which
suspends write requests. So, we cannot take a backup which keeps the
filesystem's consistency with the storage device's features (snapshot and
replication) while it is mounted.

In many case, a commercial filesystem (e.g. VxFS) has the freeze feature
and it would be used to get the consistent backup.

If Linux's standard filesystem ext3 has the freeze feature, we can do it
without a commercial filesystem.

So I have implemented the ioctls of the freeze feature.
I think we can take the consistent backup with the following steps.
1. Freeze the filesystem with the freeze ioctl.
2. Separate the replication volume or create the snapshot
with the storage device's feature.
3. Unfreeze the filesystem with the unfreeze ioctl.
4. Take the backup from the separated replication volume
or the snapshot.

This patch:

VFS:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that they can return an error.
Rename write_super_lockfs and unlockfs of the super block operation
freeze_fs and unfreeze_fs to avoid a confusion.

ext3, ext4, xfs, gfs2, jfs:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that write_super_lockfs returns an error if needed,
and unlockfs always returns 0.

reiserfs:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that they always return 0 (success) to keep a current behavior.

Signed-off-by: Takashi Sato
Signed-off-by: Masayuki Hamaguchi
Cc:
Cc:
Cc: Christoph Hellwig
Cc: Dave Kleikamp
Cc: Dave Chinner
Cc: Alasdair G Kergon
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Takashi Sato
2009-01-10 08:54:42 +0800

07 Jan, 2009

1 commit

5f820f648 poll: allow f_op->poll to sleep ... Browse Code »

f_op->poll is the only vfs operation which is not allowed to sleep. It's
because poll and select implementation used task state to synchronize
against wake ups, which doesn't have to be the case anymore as wait/wake
interface can now use custom wake up functions. The non-sleep restriction
can be a bit tricky because ->poll is not called from an atomic context
and the result of accidentally sleeping in ->poll only shows up as
temporary busy looping when the timing is right or rather wrong.

This patch converts poll/select to use custom wake up function and use
separate triggered variable to synchronize against wake up events. The
only added overhead is an extra function call during wake up and
negligible.

This patch removes the one non-sleep exception from vfs locking rules and
is beneficial to userland filesystem implementations like FUSE, 9p or
peculiar fs like spufs as it's very difficult for those to implement
non-sleeping poll method.

While at it, make the following cosmetic changes to make poll.h and
select.c checkpatch friendly.

* s/type * symbol/type *symbol/ : three places in poll.h
* remove blank line before EXPORT_SYMBOL() : two places in select.c

Oleg: spotted missing barrier in poll_schedule_timeout()
Davide: spotted missing write barrier in pollwake()

Signed-off-by: Tejun Heo
Cc: Eric Van Hensbergen
Cc: Ron Minnich
Cc: Ingo Molnar
Cc: Christoph Hellwig
Signed-off-by: Miklos Szeredi
Cc: Davide Libenzi
Cc: Brad Boyer
Cc: Al Viro
Cc: Roland McGrath
Cc: Mauro Carvalho Chehab
Signed-off-by: Andrew Morton
Cc: Davide Libenzi
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tejun Heo
2009-01-07 07:59:12 +0800

01 Jan, 2009

1 commit

6badd79bd kill ->dir_notify() ... Browse Code »

Remove the hopelessly misguided ->dir_notify(). The only instance (cifs)
has been broken by design from the very beginning; the objects it creates
are never destroyed, keep references to struct file they can outlive, nothing
that could possibly evict them exists on close(2) path *and* no locking
whatsoever is done to prevent races with close(), should the previous, er,
deficiencies someday be dealt with.

Signed-off-by: Al Viro

Al Viro
2009-01-01 07:07:43 +0800

31 Oct, 2008

1 commit

4e02ed4b4 fs: remove prepare_write/commit_write ... Browse Code »

Nothing uses prepare_write or commit_write. Remove them from the tree
completely.

[akpm@linux-foundation.org: schedule simple_prepare_write() for unexporting]
Signed-off-by: Nick Piggin
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2008-10-31 02:38:45 +0800

10 Sep, 2008

1 commit

adaae7215 update Documentation/filesystems/Locking for 2.6.27 changes ... Browse Code »

In the 2.6.27 circle ->fasync lost the BKL, and the last remaining
->open variant that takes the BKL is also gone. ->get_sb and ->kill_sb
didn't have BKL forever, so updated the entries while we're at that.

Signed-off-by: Christoph Hellwig
Signed-off-by: Linus Torvalds

Christoph Hellwig
2008-09-10 02:51:15 +0800

25 Jul, 2008

1 commit

28b2ee20c access_process_vm device memory infrastructure ... Browse Code »

In order to be able to debug things like the X server and programs using
the PPC Cell SPUs, the debugger needs to be able to access device memory
through ptrace and /proc/pid/mem.

This patch:

Add the generic_access_phys access function and put the hooks in place
to allow access_process_vm to access device or PPC Cell SPU memory.

[riel@redhat.com: Add documentation for the vm_ops->access function]
Signed-off-by: Rik van Riel
Signed-off-by: Benjamin Herrensmidt
Cc: Dave Airlie
Cc: Hugh Dickins
Cc: Paul Mackerras
Cc: Arnd Bergmann
Acked-by: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rik van Riel
2008-07-25 01:47:15 +0800

07 May, 2008

1 commit

33dcdac2d [PATCH] kill ->put_inode ... Browse Code »

And with that last patch to affs killing the last put_inode instance we
can finally, after many years of transition kill this racy and awkward
interface.

(It's kinda funny that even the description in
Documentation/filesystems/vfs.txt was entirely wrong..)

Also remove a very misleading comment above the defintion of
struct super_operations.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2008-05-07 01:45:34 +0800

28 Apr, 2008

1 commit

3c18ddd16 mm: remove nopage ... Browse Code »

Nothing in the tree uses nopage any more. Remove support for it in the
core mm code and documentation (and a few stray references to it in
comments).

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2008-04-28 23:58:18 +0800

08 Feb, 2008

1 commit

12debc424 iget: remove iget() and the read_inode() super op as being obsolete ... Browse Code »

Remove the old iget() call and the read_inode() superblock operation it uses
as these are really obsolete, and the use of read_inode() does not produce
proper error handling (no distinction between ENOMEM and EIO when marking an
inode bad).

Furthermore, this removes the temptation to use iget() to find an inode by
number in a filesystem from code outside that filesystem.

iget_locked() should be used instead. A new function is added in an earlier
patch (iget_failed) that is to be called to mark an inode as bad, unlock it
and release it should the get routine fail. Mark iget() and read_inode() as
being obsolete and remove references to them from the documentation.

Typically a filesystem will be modified such that the read_inode function
becomes an internal iget function, for example the following:

void thingyfs_read_inode(struct inode *inode)
{
...
}

would be changed into something like:

struct inode *thingyfs_iget(struct super_block *sp, unsigned long ino)
{
struct inode *inode;
int ret;

inode = iget_locked(sb, ino);
if (!inode)
return ERR_PTR(-ENOMEM);
if (!(inode->i_state & I_NEW))
return inode;

...
unlock_new_inode(inode);
return inode;
error:
iget_failed(inode);
return ERR_PTR(ret);
}

and then thingyfs_iget() would be called rather than iget(), for example:

ret = -EINVAL;
inode = iget(sb, ino);
if (!inode || is_bad_inode(inode))
goto error;

becomes:

inode = thingyfs_iget(sb, ino);
if (IS_ERR(inode)) {
ret = PTR_ERR(inode);
goto error;
}

Note that is_bad_inode() does not need to be called. The error returned by
thingyfs_iget() should render it unnecessary.

Signed-off-by: David Howells
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2008-02-08 00:42:29 +0800

20 Oct, 2007

1 commit

3a4fa0a25 Fix misspellings of "system", "controller", "interrupt" and "necessary". ... Browse Code »

Fix the various misspellings of "system", controller", "interrupt" and
"[un]necessary".

Signed-off-by: Robert P. J. Day
Signed-off-by: Adrian Bunk

Robert P. J. Day
2007-10-20 05:10:43 +0800

17 Oct, 2007

1 commit

afddba49d fs: introduce write_begin, write_end, and perform_write aops ... Browse Code »

These are intended to replace prepare_write and commit_write with more
flexible alternatives that are also able to avoid the buffered write
deadlock problems efficiently (which prepare_write is unable to do).

[mark.fasheh@oracle.com: API design contributions, code review and fixes]
[akpm@linux-foundation.org: various fixes]
[dmonakhov@sw.ru: new aop block_write_begin fix]
Signed-off-by: Nick Piggin
Signed-off-by: Mark Fasheh
Signed-off-by: Dmitriy Monakhov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-10-17 00:42:55 +0800

20 Jul, 2007

3 commits

d0217ac04 mm: fault feedback #1 ... Browse Code »

Change ->fault prototype. We now return an int, which contains
VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
FAULT_RET_ code tells the VM whether a page was found, whether it has been
locked, and potentially other things. This is not quite the way he wanted
it yet, but that's changed in the next patch (which requires changes to
arch code).

This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
that a page is locked which requires filemap_nopage to go away (because we
can no longer remain backward compatible without that flag), but we were
going to do that anyway.

struct fault_data is renamed to struct vm_fault as Linus asked. address
is now a void __user * that we should firmly encourage drivers not to use
without really good reason.

The page is now returned via a page pointer in the vm_fault struct.

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800
ed2f2f9b3 Document ->page_mkwrite() locking ... Browse Code »

There seems to be very little documentation about this callback in general.
The locking in particular is a bit tricky, so it's worth having this in
writing.

Signed-off-by: Mark Fasheh
Cc: Nick Piggin
Cc: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mark Fasheh
2007-07-20 01:04:41 +0800
54cb8821d mm: merge populate and nopage into fault (fixes nonlinear) ... Browse Code »

Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
the virtual address -> file offset differently from linear mappings.

->populate is a layering violation because the filesystem/pagecache code
should need to know anything about the virtual memory mapping. The hitch here
is that the ->nopage handler didn't pass down enough information (ie. pgoff).
But it is more logical to pass pgoff rather than have the ->nopage function
calculate it itself anyway (because that's a similar layering violation).

Having the populate handler install the pte itself is likewise a nasty thing
to be doing.

This patch introduces a new fault handler that replaces ->nopage and
->populate and (later) ->nopfn. Most of the old mechanism is still in place
so there is a lot of duplication and nice cleanups that can be removed if
everyone switches over.

The rationale for doing this in the first place is that nonlinear mappings are
subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
to duplicate the synchronisation logic rather than just consolidate the two.

After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
pagecache. Seems like a fringe functionality anyway.

NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no
users have hit mainline yet.

[akpm@linux-foundation.org: cleanup]
[randy.dunlap@oracle.com: doc. fixes for readahead]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin
Signed-off-by: Randy Dunlap
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800

09 May, 2007

2 commits

a7bc02f4f trivial: s/i_sem /i_mutex/ ... Browse Code »

This patch substitutes i_sem by i_mutex in
Documentation/filesystems/Locking.
The patch also removes a couple of trailing white-spaces.

Signed-off-by: Artem Bityutskiy
Signed-off-by: Adrian Bunk

Artem Bityutskiy
2007-05-09 14:58:17 +0800
c23fbb6bc VFS: delay the dentry name generation on sockets and pipes ... Browse Code »

1) Introduces a new method in 'struct dentry_operations'. This method
called d_dname() might be called from d_path() to build a pathname for
special filesystems. It is called without locks.

Future patches (if we succeed in having one common dentry for all
pipes/sockets) may need to change prototype of this method, but we now
use : char *d_dname(struct dentry *dentry, char *buffer, int buflen);

2) Adds a dynamic_dname() helper function that eases d_dname() implementations

3) Defines d_dname method for sockets : No more sprintf() at socket
creation. This is delayed up to the moment someone does an access to
/proc/pid/fd/...

4) Defines d_dname method for pipes : No more sprintf() at pipe
creation. This is delayed up to the moment someone does an access to
/proc/pid/fd/...

A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a
*nice* speedup on my Pentium(M) 1.6 Ghz :

3.090 s instead of 3.450 s

Signed-off-by: Eric Dumazet
Acked-by: Christoph Hellwig
Acked-by: Linus Torvalds
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Dumazet
2007-05-09 02:15:03 +0800

12 Jan, 2007

1 commit

e3db7691e [PATCH] NFS: Fix race in nfs_release_page() ... Browse Code »

NFS: Fix race in nfs_release_page()

invalidate_inode_pages2() may find the dirty bit has been set on a page
owing to the fact that the page may still be mapped after it was locked.
Only after the call to unmap_mapping_range() are we sure that the page
can no longer be dirtied.
In order to fix this, NFS has hooked the releasepage() method and tries
to write the page out between the call to unmap_mapping_range() and the
call to remove_mapping(). This, however leads to deadlocks in the page
reclaim code, where the page may be locked without holding a reference
to the inode or dentry.

Fix is to add a new address_space_operation, launder_page(), which will
attempt to write out a dirty page without releasing the page lock.

Signed-off-by: Trond Myklebust

Also, the bare SetPageDirty() can skew all sort of accounting leading to
other nasties.

[akpm@osdl.org: cleanup]
Signed-off-by: Peter Zijlstra
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Trond Myklebust
2007-01-12 10:18:21 +0800

08 Dec, 2006

1 commit

70888bd5b [PATCH] Documentation: remount_fs() needs lock_kernel ... Browse Code »

Fixed long-lived typo: remount_fs() needs BKL

Signed-off-by: Vasily Averin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vasily Averin
2006-12-08 00:39:36 +0800

01 Oct, 2006

1 commit

027445c37 [PATCH] Vectorize aio_read/aio_write fileop methods ... Browse Code »

This patch vectorizes aio_read() and aio_write() methods to prepare for
collapsing all aio & vectored operations into one interface - which is
aio_read()/aio_write().

Signed-off-by: Badari Pulavarty
Signed-off-by: Christoph Hellwig
Cc: Michael Holzheu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Badari Pulavarty
2006-10-01 15:39:28 +0800

11 Jul, 2006

1 commit

5d8b2ebfa [PATCH] VFS documentation tweak ... Browse Code »

As I was looking over the get_sb() changes, I stumbled across a little
mistake in the documentation updates. Unless we're getting into an
interesting new object-oriented realm, I doubt that get_sb() should really
return "struct int"...

Signed-off-by: Jonathan Corbet
Acked-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jonathan Corbet
2006-07-11 04:24:15 +0800

23 Jun, 2006

2 commits

726c33422 [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry ... Browse Code »

Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch. That reduced the significance of
sb->s_root, allowing NFS to place a fake root there. However, NFS does
require a dentry to use as a target for the statfs operation. This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells
Acked-by: Al Viro
Cc: Nathan Scott
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2006-06-23 22:42:45 +0800
454e2398b [PATCH] VFS: Permit filesystem to override root dentry on mount ... Browse Code »

Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers. For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing. In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

(*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
pointer argument and return an integer, so most filesystems have to change
very little.

(*) If one of the convenience function is not used, then get_sb() should
normally call simple_set_mnt() to instantiate the vfsmount. This will
always return 0, and so can be tail-called from get_sb().

(*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
dcache upon superblock destruction rather than shrink_dcache_anon().

This is required because the superblock may now have multiple trees that
aren't actually bound to s_root, but that still need to be cleaned up. The
currently called functions assume that the whole tree is rooted at s_root,
and that anonymous dentries are not the roots of trees which results in
dentries being left unculled.

However, with the way NFS superblock sharing are currently set to be
implemented, these assumptions are violated: the root of the filesystem is
simply a dummy dentry and inode (the real inode for '/' may well be
inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
with child trees.

[*] Anonymous until discovered from another tree.

(*) The documentation has been adjusted, including the additional bit of
changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells
Acked-by: Al Viro
Cc: Nathan Scott
Cc: Roland Dreier
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2006-06-23 22:42:45 +0800

01 May, 2005

1 commit

2054606ad [PATCH] doc: Locking update ... Browse Code »

Make the Locking document truer.

Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nikita Danilov
2005-05-01 23:58:37 +0800

17 Apr, 2005

1 commit

1da177e4c Linux-2.6.12-rc2 ... Browse Code »

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

Linus Torvalds
2005-04-17 06:20:36 +0800