30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

04 Mar, 2010

2 commits


17 Dec, 2009

1 commit

  • The EXPORT_SYMBOL for d_alloc_name is in fs/libfs.c but the function
    is in fs/dcache.c. Move the EXPORT_SYMBOL to the line immediately
    after the closing function brace line in fs/dcache.c as mentioned
    in Documentation/CodingStyle.

    Signed-off-by: H Hartley Sweeten
    Signed-off-by: Al Viro

    H Hartley Sweeten
     

24 Sep, 2009

2 commits

  • Currently all simple_attr.set handlers return 0 on success and negative
    codes on error. Fix simple_attr_write() to return these error codes.

    Signed-off-by: Wu Fengguang
    Cc: Theodore Ts'o
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Wu Fengguang
     
  • Impact: have simple_read_from_buffer conform to standards

    It was brought to my attention by Andrew Morton, Theodore Tso, and H.
    Peter Anvin that a read from userspace should only return -EFAULT if
    nothing was actually read.

    Looking at the simple_read_from_buffer I noticed that this function does
    not conform to that rule. This patch fixes that function.

    [akpm@linux-foundation.org: simplification suggested by hpa]
    [hpa@zytor.com: fix count==0 handling]
    Signed-off-by: Steven Rostedt
    Cc: Al Viro
    Cc: Theodore Ts'o
    Cc: Ingo Molnar
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Steven Rostedt
     

19 Aug, 2009

1 commit


12 Jun, 2009

1 commit

  • writes associated buffers, then does sync_inode() to write
    the inode itself (and to make it clean). Depends on
    ->write_inode() honouring the second argument.

    Signed-off-by: Al Viro

    Al Viro
     

09 May, 2009

1 commit


03 Apr, 2009

1 commit

  • Impact: cleanup

    We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace
    changes), but this is not possible currently without breaking the
    build because fs.h has an implicit include file depedency: it
    uses PAGE_SIZE but does not include asm/page.h which defines it.

    This problem gets masked in practice because most fs.h using sites
    use rcupreempt.h (and other headers) which includes percpu.h which
    brings in asm/page.h indirectly.

    We cannot add asm/page.h to asm/fs.h because page.h is not an
    exported header.

    Move simple_transaction_set() to the other simple-transaction
    file helpers in fs/libfs.c.

    This removes the include file hell and also reduces
    kernel size a bit.

    Acked-by: Al Viro
    Cc: Alexey Dobriyan
    Cc: Pekka Enberg
    Cc: Eduard - Gabriel Munteanu
    Cc: paulmck@linux.vnet.ibm.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

28 Mar, 2009

2 commits

  • simple_set_mnt() is defined as returning 'int' but always returns 0.
    Callers assume simple_set_mnt() never fails and don't properly cleanup if
    it were to _ever_ fail. For instance, get_sb_single() and get_sb_nodev()
    should:

    up_write(sb->s_unmount);
    deactivate_super(sb);

    if simple_set_mnt() fails.

    Since simple_set_mnt() never fails, would be cleaner if it did not
    return anything.

    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: Serge Hallyn
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Sukadev Bhattiprolu
     
  • Signed-off-by: Al Viro

    Al Viro
     

06 Jan, 2009

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    inotify: fix type errors in interfaces
    fix breakage in reiserfs_new_inode()
    fix the treatment of jfs special inodes
    vfs: remove duplicate code in get_fs_type()
    add a vfs_fsync helper
    sys_execve and sys_uselib do not call into fsnotify
    zero i_uid/i_gid on inode allocation
    inode->i_op is never NULL
    ntfs: don't NULL i_op
    isofs check for NULL ->i_op in root directory is dead code
    affs: do not zero ->i_op
    kill suid bit only for regular files
    vfs: lseek(fd, 0, SEEK_CUR) race condition

    Linus Torvalds
     
  • ... and don't bother in callers. Don't bother with zeroing i_blocks,
    while we are at it - it's already been zeroed.

    i_mode is not worth the effort; it has no common default value.

    Signed-off-by: Al Viro

    Al Viro
     

05 Jan, 2009

1 commit

  • With the write_begin/write_end aops, page_symlink was broken because it
    could no longer pass a GFP_NOFS type mask into the point where the
    allocations happened. They are done in write_begin, which would always
    assume that the filesystem can be entered from reclaim. This bug could
    cause filesystem deadlocks.

    The funny thing with having a gfp_t mask there is that it doesn't really
    allow the caller to arbitrarily tinker with the context in which it can be
    called. It couldn't ever be GFP_ATOMIC, for example, because it needs to
    take the page lock. The only thing any callers care about is __GFP_FS
    anyway, so turn that into a single flag.

    Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on
    this flag in their write_begin function. Change __grab_cache_page to
    accept a nofs argument as well, to honour that flag (while we're there,
    change the name to grab_cache_page_write_begin which is more instructive
    and does away with random leading underscores).

    This is really a more flexible way to go in the end anyway -- if a
    filesystem happens to want any extra allocations aside from the pagecache
    ones in ints write_begin function, it may now use GFP_KERNEL (rather than
    GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a
    random example).

    [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
    [kosaki.motohiro@jp.fujitsu.com: fix fuse]
    Signed-off-by: Nick Piggin
    Reviewed-by: KOSAKI Motohiro
    Cc: [2.6.28.x]
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    [ Cleaned up the calling convention: just pass in the AOP flags
    untouched to the grab_cache_page_write_begin() function. That
    just simplifies everybody, and may even allow future expansion of the
    logic. - Linus ]
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

31 Oct, 2008

1 commit

  • Nothing uses prepare_write or commit_write. Remove them from the tree
    completely.

    [akpm@linux-foundation.org: schedule simple_prepare_write() for unexporting]
    Signed-off-by: Nick Piggin
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

23 Oct, 2008

1 commit

  • The calling conventions of d_alloc_anon are rather unfortunate for all
    users, and it's name is not very descriptive either.

    Add d_obtain_alias as a new exported helper that drops the inode
    reference in the failure case, too and allows to pass-through NULL
    pointers and inodes to allow for tail-calls in the export operations.

    Incidentally this helper already existed as a private function in
    libfs.c as exportfs_d_alloc so kill that one and switch the callers
    to d_obtain_alias.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

31 Jul, 2008

1 commit

  • This commit:

    commit ba52de123d454b57369f291348266d86f4b35070
    Author: Theodore Ts'o
    Date: Wed Sep 27 01:50:49 2006 -0700

    [PATCH] inode-diet: Eliminate i_blksize from the inode structure

    caused the block size used by pseudo-filesystems to decrease from
    PAGE_SIZE to 1024 leading to a doubling of the number of context switches
    during a kernbench run.

    Signed-off-by: Alex Nixon
    Cc: Andi Kleen
    Cc: Jeremy Fitzhardinge
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Ian Campbell
    Cc: "Theodore Ts'o"
    Cc: Alexander Viro
    Cc: Hugh Dickins
    Cc: Jens Axboe
    Cc: [2.6.25.x, 2.6.26.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Nixon
     

05 Jul, 2008

1 commit


07 Jun, 2008

1 commit

  • This patch introduces memory_read_from_buffer().

    The only difference between memory_read_from_buffer() and
    simple_read_from_buffer() is which address space the function copies to.

    simple_read_from_buffer copies to user space memory.
    memory_read_from_buffer copies to normal memory.

    Signed-off-by: Akinobu Mita
    Cc: Al Viro
    Cc: Doug Warzecha
    Cc: Zhang Rui
    Cc: Matt Domsch
    Cc: Abhay Salunke
    Cc: Greg Kroah-Hartman
    Cc: Markus Rechberger
    Cc: Kay Sievers
    Cc: Bob Moore
    Cc: Thomas Renninger
    Cc: Len Brown
    Cc: Benjamin Herrenschmidt
    Cc: "Antonino A. Daplas"
    Cc: Krzysztof Helt
    Cc: Geert Uytterhoeven
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Peter Oberparleiter
    Cc: Michael Holzheu
    Cc: Brian King
    Cc: James E.J. Bottomley
    Cc: Andrew Vasquez
    Cc: Seokmann Ju
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

09 Feb, 2008

3 commits

  • simple_attr_close implementes ->release so it should be named accordingly.

    Signed-off-by: Christoph Hellwig
    Cc:
    Cc: Arnd Bergmann
    Cc: Greg KH
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Use mutex_lock_interruptible in simple_attr_read/write.

    Signed-off-by: Christoph Hellwig
    Cc:
    Cc: Arnd Bergmann
    Cc: Greg KH
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Sometimes simple attributes might need to return an error, e.g. for
    acquiring a mutex interruptibly. In fact we have that situation in
    spufs already which is the original user of the simple attributes. This
    patch merged the temporarily forked attributes in spufs back into the
    main ones and allows to return errors.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Christoph Hellwig
    Cc:
    Cc: Arnd Bergmann
    Cc: Greg KH
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

06 Feb, 2008

1 commit

  • Simplify page cache zeroing of segments of pages through 3 functions

    zero_user_segments(page, start1, end1, start2, end2)

    Zeros two segments of the page. It takes the position where to
    start and end the zeroing which avoids length calculations and
    makes code clearer.

    zero_user_segment(page, start, end)

    Same for a single segment.

    zero_user(page, start, length)

    Length variant for the case where we know the length.

    We remove the zero_user_page macro. Issues:

    1. Its a macro. Inline functions are preferable.

    2. The KM_USER0 macro is only defined for HIGHMEM.

    Having to treat this special case everywhere makes the
    code needlessly complex. The parameter for zeroing is always
    KM_USER0 except in one single case that we open code.

    Avoiding KM_USER0 makes a lot of code not having to be dealing
    with the special casing for HIGHMEM anymore. Dealing with
    kmap is only necessary for HIGHMEM configurations. In those
    configurations we use KM_USER0 like we do for a series of other
    functions defined in highmem.h.

    Since KM_USER0 is depends on HIGHMEM the existing zero_user_page
    function could not be a macro. zero_user_* functions introduced
    here can be be inline because that constant is not used when these
    functions are called.

    Also extract the flushing of the caches to be outside of the kmap.

    [akpm@linux-foundation.org: fix nfs and ntfs build]
    [akpm@linux-foundation.org: fix ntfs build some more]
    Signed-off-by: Christoph Lameter
    Cc: Steven French
    Cc: Michael Halcrow
    Cc:
    Cc: Steven Whitehouse
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Anton Altaparmakov
    Cc: Mark Fasheh
    Cc: David Chinner
    Cc: Michael Halcrow
    Cc: Steven French
    Cc: Steven Whitehouse
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

22 Oct, 2007

1 commit

  • Add the guts for the new filesystem API to exportfs.

    There's now a fh_to_dentry method that returns a dentry for the object looked
    for given a filehandle fragment, and a fh_to_parent operation that returns the
    dentry for the encoded parent directory in case the file handle contains it.

    There are default implementations for these methods that only take a callback
    for an nfs-enhanced iget variant and implement the rest of the semantics.

    Signed-off-by: Christoph Hellwig
    Cc: Neil Brown
    Cc: "J. Bruce Fields"
    Cc:
    Cc: Dave Kleikamp
    Cc: Anton Altaparmakov
    Cc: David Chinner
    Cc: Timothy Shimmin
    Cc: OGAWA Hirofumi
    Cc: Hugh Dickins
    Cc: Chris Mason
    Cc: Jeff Mahoney
    Cc: "Vladimir V. Saveliev"
    Cc: Steven Whitehouse
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

17 Oct, 2007

2 commits

  • simple_commit_write() can now become static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • These are intended to replace prepare_write and commit_write with more
    flexible alternatives that are also able to avoid the buffered write
    deadlock problems efficiently (which prepare_write is unable to do).

    [mark.fasheh@oracle.com: API design contributions, code review and fixes]
    [akpm@linux-foundation.org: various fixes]
    [dmonakhov@sw.ru: new aop block_write_begin fix]
    Signed-off-by: Nick Piggin
    Signed-off-by: Mark Fasheh
    Signed-off-by: Dmitriy Monakhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

09 May, 2007

2 commits


05 Mar, 2007

1 commit


21 Feb, 2007

1 commit

  • simple_prepare_write leaks uninitialised kernel data. This happens because
    the it leaves an uninitialised "hole" over the part of the page that the
    write is expected to go to. This is fine, but it then marks the page
    uptodate, which means a concurrent read can come in and copy the
    uninitialised memory into userspace before it written to.

    Fix it by simply marking it uptodate in simple_commit_write instead, after
    the hole has been filled in. This could theoretically break an fs that
    uses simple_prepare_write and not simple_commit_write, and that relies on
    the incorrect simple_prepare_write behaviour. Luckily, none of those
    exists in the tree.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

13 Feb, 2007

2 commits

  • This patch is inspired by Arjan's "Patch series to mark struct
    file_operations and struct inode_operations const".

    Compile tested with gcc & sparse.

    Signed-off-by: Josef 'Jeff' Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef 'Jeff' Sipek
     
  • Many struct inode_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

09 Dec, 2006

1 commit

  • This patch changes struct file to use struct path instead of having
    independent pointers to struct dentry and struct vfsmount, and converts all
    users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}.

    Additionally, it adds two #define's to make the transition easier for users of
    the f_dentry and f_vfsmnt.

    Signed-off-by: Josef "Jeff" Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef "Jeff" Sipek
     

01 Oct, 2006

2 commits

  • This is mostly included for parity with dec_nlink(), where we will have some
    more hooks. This one should stay pretty darn straightforward for now.

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • When a filesystem decrements i_nlink to zero, it means that a write must be
    performed in order to drop the inode from the filesystem.

    We're shortly going to have keep filesystems from being remounted r/o between
    the time that this i_nlink decrement and that write occurs.

    So, add a little helper function to do the decrements. We'll tie into it in a
    bit to note when i_nlink hits zero.

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     

30 Sep, 2006

1 commit

  • Remove the unnecessary PageUptodate check from simple_readpage. The only
    two callers for ->readpage that don't have explicit PageUptodate check are
    read_cache_pages and page_cache_read which operate on newly allocated pages
    which don't have the flag set.

    [akpm: use the allegedly-faster clear_page(), too]
    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pekka J Enberg
     

27 Sep, 2006

2 commits

  • This eliminates the i_blksize field from struct inode. Filesystems that want
    to provide a per-inode st_blksize can do so by providing their own getattr
    routine instead of using the generic_fillattr() function.

    Note that some filesystems were providing pretty much random (and incorrect)
    values for i_blksize.

    [bunk@stusta.de: cleanup]
    [akpm@osdl.org: generic_fillattr() fix]
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Theodore Ts'o
     
  • The following patches reduce the size of the VFS inode structure by 28 bytes
    on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction
    in the inode size on a UP kernel that is configured in a production mode
    (i.e., with no spinlock or other debugging functions enabled; if you want to
    save memory taken up by in-core inodes, the first thing you should do is
    disable the debugging options; they are responsible for a huge amount of bloat
    in the VFS inode structure).

    This patch:

    The filesystem or device-specific pointer in the inode is inside a union,
    which is pretty pointless given that all 30+ users of this field have been
    using the void pointer. Get rid of the union and rename it to i_private, with
    a comment to explain who is allowed to use the void pointer. This is just a
    cleanup, but it allows us to reuse the union 'u' for something something where
    the union will actually be used.

    [judith@osdl.org: powerpc build fix]
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Judith Lebzelter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Theodore Ts'o
     

27 Jun, 2006

1 commit

  • This patch converts the combination of list_del(A) and list_add(A, B) to
    list_move(A, B).

    Cc: Greg Kroah-Hartman
    Cc: Ram Pai
    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita