24 Sep, 2009

1 commit

  • It's unused.

    It isn't needed -- read or write flag is already passed and sysctl
    shouldn't care about the rest.

    It _was_ used in two places at arch/frv for some reason.

    Signed-off-by: Alexey Dobriyan
    Cc: David Howells
    Cc: "Eric W. Biederman"
    Cc: Al Viro
    Cc: Ralf Baechle
    Cc: Martin Schwidefsky
    Cc: Ingo Molnar
    Cc: "David S. Miller"
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

12 Jun, 2009

2 commits

  • This function walks the s_files lock, and operates primarily on the
    files in a superblock, so it better belongs here (eg. see also
    fs_may_remount_ro).

    [AV: ... and it shouldn't be static after that move]

    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    npiggin@suse.de
     
  • This patch speeds up lmbench lat_mmap test by about another 2% after the
    first patch.

    Before:
    avg = 462.286
    std = 5.46106

    After:
    avg = 453.12
    std = 9.58257

    (50 runs of each, stddev gives a reasonable confidence)

    It does this by introducing mnt_clone_write, which avoids some heavyweight
    operations of mnt_want_write if called on a vfsmount which we know already
    has a write count; and mnt_want_write_file, which can call mnt_clone_write
    if the file is open for write.

    After these two patches, mnt_want_write and mnt_drop_write go from 7% on
    the profile down to 1.3% (including mnt_clone_write).

    [AV: mnt_want_write_file() should take file alone and derive mnt from it;
    not only all callers have that form, but that's the only mnt about which
    we know that it's already held for write if file is opened for write]

    Cc: Dave Hansen
    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    npiggin@suse.de
     

30 Mar, 2009

1 commit


27 Mar, 2009

1 commit


16 Mar, 2009

1 commit

  • This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock. For now,
    epoll remains the only user, but a future patch will use it to protect
    f_flags as well.

    Cc: Davide Libenzi
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jonathan Corbet

    Jonathan Corbet
     

06 Feb, 2009

2 commits

  • Conflicts:
    fs/namei.c

    Manually merged per:

    diff --cc fs/namei.c
    index 734f2b5,bbc15c2..0000000
    --- a/fs/namei.c
    +++ b/fs/namei.c
    @@@ -860,9 -848,8 +849,10 @@@ static int __link_path_walk(const char
    nd->flags |= LOOKUP_CONTINUE;
    err = exec_permission_lite(inode);
    if (err == -EAGAIN)
    - err = vfs_permission(nd, MAY_EXEC);
    + err = inode_permission(nd->path.dentry->d_inode,
    + MAY_EXEC);
    + if (!err)
    + err = ima_path_check(&nd->path, MAY_EXEC);
    if (err)
    break;

    @@@ -1525,14 -1506,9 +1509,14 @@@ int may_open(struct path *path, int acc
    flag &= ~O_TRUNC;
    }

    - error = vfs_permission(nd, acc_mode);
    + error = inode_permission(inode, acc_mode);
    if (error)
    return error;
    +
    - error = ima_path_check(&nd->path,
    ++ error = ima_path_check(path,
    + acc_mode & (MAY_READ | MAY_WRITE | MAY_EXEC));
    + if (error)
    + return error;
    /*
    * An append-only file must be opened in append mode for writing.
    */

    Signed-off-by: James Morris

    James Morris
     
  • This patch replaces the generic integrity hooks, for which IMA registered
    itself, with IMA integrity hooks in the appropriate places directly
    in the fs directory.

    Signed-off-by: Mimi Zohar
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    Mimi Zohar
     

01 Jan, 2009

1 commit


14 Nov, 2008

3 commits

  • Attach creds to file structs and discard f_uid/f_gid.

    file_operations::open() methods (such as hppfs_open()) should use file->f_cred
    rather than current_cred(). At the moment file->f_cred will be current_cred()
    at this point.

    Signed-off-by: David Howells
    Reviewed-by: James Morris
    Signed-off-by: James Morris

    David Howells
     
  • Wrap current->cred and a few other accessors to hide their actual
    implementation.

    Signed-off-by: David Howells
    Acked-by: James Morris
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    David Howells
     
  • Separate the task security context from task_struct. At this point, the
    security data is temporarily embedded in the task_struct with two pointers
    pointing to it.

    Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
    entry.S via asm-offsets.

    With comment fixes Signed-off-by: Marc Dionne

    Signed-off-by: David Howells
    Acked-by: James Morris
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    David Howells
     

02 Nov, 2008

1 commit

  • As it is, all instances of ->release() for files that have ->fasync()
    need to remember to evict file from fasync lists; forgetting that
    creates a hole and we actually have a bunch that *does* forget.

    So let's keep our lives simple - let __fput() check FASYNC in
    file->f_flags and call ->fasync() there if it's been set. And lose that
    crap in ->release() instances - leaving it there is still valid, but we
    don't have to bother anymore.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

21 Oct, 2008

1 commit


27 Jul, 2008

1 commit


02 May, 2008

1 commit


19 Apr, 2008

3 commits

  • There have been a few oopses caused by 'struct file's with NULL f_vfsmnts.
    There was also a set of potentially missed mnt_want_write()s from
    dentry_open() calls.

    This patch provides a very simple debugging framework to catch these kinds of
    bugs. It will WARN_ON() them, but should stop us from having any oopses or
    mnt_writer count imbalances.

    I'm quite convinced that this is a good thing because it found bugs in the
    stuff I was working on as soon as I wrote it.

    [hch: made it conditional on a debug option.
    But it's still a little bit too ugly]

    [hch: merged forced remount r/o fix from Dave and akpm's fix for the fix]

    Signed-off-by: Dave Hansen
    Acked-by: Al Viro
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Dave Hansen
     
  • This is the first really tricky patch in the series. It elevates the writer
    count on a mount each time a non-special file is opened for write.

    We used to do this in may_open(), but Miklos pointed out that __dentry_open()
    is used as well to create filps. This will cover even those cases, while a
    call in may_open() would not have.

    There is also an elevated count around the vfs_create() call in open_namei().
    See the comments for more details, but we need this to fix a 'create, remount,
    fail r/w open()' race.

    Some filesystems forego the use of normal vfs calls to create
    struct files. Make sure that these users elevate the mnt
    writer count because they will get __fput(), and we need
    to make sure they're balanced.

    Acked-by: Al Viro
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Dave Hansen
     
  • If someone decides to demote a file from r/w to just
    r/o, they can use this same code as __fput().

    NFS does just that, and will use this in the next
    patch.

    AV: drop write access in __fput() only after we evict from file list.

    Signed-off-by: Dave Hansen
    Cc: Erez Zadok
    Cc: Trond Myklebust
    Cc: "J Bruce Fields"
    Acked-by: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Dave Hansen
     

19 Mar, 2008

1 commit

  • Some new uses of get_empty_filp() have crept in; switched
    to alloc_file() to make sure that pieces of initialization
    won't be missing.

    We really need to kill get_empty_filp().

    [AV] fixed dentry leak on failure exit in anon_inode_getfd()

    Cc: Erez Zadok
    Cc: Trond Myklebust
    Cc: "J Bruce Fields"
    Acked-by: Al Viro
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Dave Hansen
    Signed-off-by: Al Viro

    Dave Hansen
     

09 Feb, 2008

1 commit


20 Oct, 2007

1 commit


17 Oct, 2007

3 commits

  • Why do we need r/o bind mounts?

    This feature allows a read-only view into a read-write filesystem. In the
    process of doing that, it also provides infrastructure for keeping track of
    the number of writers to any given mount.

    This has a number of uses. It allows chroots to have parts of filesystems
    writable. It will be useful for containers in the future because users may
    have root inside a container, but should not be allowed to write to
    somefilesystems. This also replaces patches that vserver has had out of the
    tree for several years.

    It allows security enhancement by making sure that parts of your filesystem
    read-only (such as when you don't trust your FTP server), when you don't want
    to have entire new filesystems mounted, or when you want atime selectively
    updated. I've been using the following script to test that the feature is
    working as desired. It takes a directory and makes a regular bind and a r/o
    bind mount of it. It then performs some normal filesystem operations on the
    three directories, including ones that are expected to fail, like creating a
    file on the r/o mount.

    This patch:

    Some filesystems forego the vfs and may_open() and create their own 'struct
    file's.

    This patch creates a couple of helper functions which can be used by these
    filesystems, and will provide a unified place which the r/o bind mount code
    may patch.

    Also, rename an existing, static-scope init_file() to a less generic name.

    Signed-off-by: Dave Hansen
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • Signed-off-by: Denis Cheng
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Denis Cheng
     
  • s/percpu_counter_sum/&_positive/

    Because its consitent with percpu_counter_read*

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

09 May, 2007

1 commit


09 Dec, 2006

1 commit

  • This patch changes struct file to use struct path instead of having
    independent pointers to struct dentry and struct vfsmount, and converts all
    users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}.

    Additionally, it adds two #define's to make the transition easier for users of
    the f_dentry and f_vfsmnt.

    Signed-off-by: Josef "Jeff" Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef "Jeff" Sipek
     

02 Oct, 2006

1 commit


27 Sep, 2006

1 commit


01 Jul, 2006

1 commit


23 Jun, 2006

1 commit

  • The percpu counter data type are changed in this set of patches to support
    more users like ext3 who need more than 32 bit to store the free blocks
    total in the filesystem.

    - Generic perpcu counters data type changes. The size of the global counter
    and local counter were explictly specified using s64 and s32. The global
    counter is changed from long to s64, while the local counter is changed from
    long to s32, so we could avoid doing 64 bit update in most cases.

    - Users of the percpu counters are updated to make use of the new
    percpu_counter_init() routine now taking an additional parameter to allow
    users to pass the initial value of the global counter.

    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
     

23 Mar, 2006

1 commit

  • Eliminate a handful of cache references by keeping current in a register
    instead of reloading (helps x86) and avoiding the overhead of a function
    call. Inlining eventpoll_init_file() saves 24 bytes. Also reorder file
    initialization to make writes occur more sequentially.

    Signed-off-by: Benjamin LaHaise
    Cc: Davide Libenzi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin LaHaise
     

09 Mar, 2006

1 commit

  • I have benchmarked this on an x86_64 NUMA system and see no significant
    performance difference on kernbench. Tested on both x86_64 and powerpc.

    The way we do file struct accounting is not very suitable for batched
    freeing. For scalability reasons, file accounting was
    constructor/destructor based. This meant that nr_files was decremented
    only when the object was removed from the slab cache. This is susceptible
    to slab fragmentation. With RCU based file structure, consequent batched
    freeing and a test program like Serge's, we just speed this up and end up
    with a very fragmented slab -

    llm22:~ # cat /proc/sys/fs/file-nr
    587730 0 758844

    At the same time, I see only a 2000+ objects in filp cache. The following
    patch I fixes this problem.

    This patch changes the file counting by removing the filp_count_lock.
    Instead we use a separate percpu counter, nr_files, for now and all
    accesses to it are through get_nr_files() api. In the sysctl handler for
    nr_files, we populate files_stat.nr_files before returning to user.

    Counting files as an when they are created and destroyed (as opposed to
    inside slab) allows us to correctly count open files with RCU.

    Signed-off-by: Dipankar Sarma
    Cc: "Paul E. McKenney"
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
     

12 Jan, 2006

1 commit


09 Jan, 2006

1 commit


07 Nov, 2005

1 commit


31 Oct, 2005

1 commit

  • Now that RCU applied on 'struct file' seems stable, we can place f_rcuhead
    in a memory location that is not anymore used at call_rcu(&f->f_rcuhead,
    file_free_rcu) time, to reduce the size of this critical kernel object.

    The trick I used is to move f_rcuhead and f_list in an union called f_u

    The callers are changed so that f_rcuhead becomes f_u.fu_rcuhead and f_list
    becomes f_u.f_list

    Signed-off-by: Eric Dumazet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     

10 Sep, 2005

1 commit

  • Patch to eliminate struct files_struct.file_lock spinlock on the reader side
    and use rcu refcounting rcuref_xxx api for the f_count refcounter. The
    updates to the fdtable are done by allocating a new fdtable structure and
    setting files->fdt to point to the new structure. The fdtable structure is
    protected by RCU thereby allowing lock-free lookup. For fd arrays/sets that
    are vmalloced, we use keventd to free them since RCU callbacks can't sleep. A
    global list of fdtable to be freed is not scalable, so we use a per-cpu list.
    If keventd is already handling the current cpu's work, we use a timer to defer
    queueing of that work.

    Since the last publication, this patch has been re-written to avoid using
    explicit memory barriers and use rcu_assign_pointer(), rcu_dereference()
    premitives instead. This required that the fd information is kept in a
    separate structure (fdtable) and updated atomically.

    Signed-off-by: Dipankar Sarma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
     

08 Sep, 2005

1 commit


13 Jul, 2005

1 commit

  • inotify is intended to correct the deficiencies of dnotify, particularly
    its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
    that you intend to watch. This quickly results in too many
    open files and pins removable media, preventing unmount.
    * dnotify is directory-based. You only learn about changes to
    directories. Sure, a change to a file in a directory affects
    the directory, but you are then forced to keep a cache of
    stat structures.
    * dnotify's interface to user-space is awful. Signals?

    inotify provides a more usable, simple, powerful solution to file change
    notification:

    * inotify's interface is a system call that returns a fd, not SIGIO.
    You get a single fd, which is select()-able.
    * inotify has an event that says "the filesystem that the item
    you were watching is on was unmounted."
    * inotify can watch directories or files.

    Inotify is currently used by Beagle (a desktop search infrastructure),
    Gamin (a FAM replacement), and other projects.

    See Documentation/filesystems/inotify.txt.

    Signed-off-by: Robert Love
    Cc: John McCutchan
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert Love