10 Aug, 2010

2 commits

  • Make sure we check the truncate constraints early on in ->setattr by adding
    those checks to inode_change_ok. Also clean up and document inode_change_ok
    to make this obvious.

    As a fallout we don't have to call inode_newsize_ok from simple_setsize and
    simplify it down to a truncate_setsize which doesn't return an error. This
    simplifies a lot of setattr implementations and means we use truncate_setsize
    almost everywhere. Get rid of fat_setsize now that it's trivial and mark
    ext2_setsize static to make the calling convention obvious.

    Keep the inode_newsize_ok in vmtruncate for now as all callers need an
    audit for its removal anyway.

    Note: setattr code in ecryptfs doesn't call inode_change_ok at all and
    needs a deeper audit, but that is left for later.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
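
The ordering described in this commit (validate everything first, then apply a size change that cannot fail) can be sketched in userspace C. This is only an illustrative model: `sa_inode`, `sa_iattr`, `SA_ATTR_SIZE` and the `sa_*` functions are hypothetical stand-ins and not the kernel's `inode_change_ok` or `truncate_setsize`.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
#define SA_ATTR_SIZE 0x1u

struct sa_inode {
	long long size;      /* current file size */
	long long max_bytes; /* largest size the filesystem allows */
};

struct sa_iattr {
	unsigned int valid;  /* which fields the caller wants to change */
	long long size;      /* requested size, used if SA_ATTR_SIZE is set */
};

/* Plays the role of inode_change_ok(): all validation, including the
 * size limits, happens here before anything is modified. */
static int sa_change_ok(const struct sa_inode *inode,
			const struct sa_iattr *attr)
{
	if (attr->valid & SA_ATTR_SIZE) {
		if (attr->size < 0)
			return -EINVAL;
		if (attr->size > inode->max_bytes)
			return -EFBIG;
	}
	return 0;
}

/* Plays the role of truncate_setsize(): it cannot fail, so it returns void. */
static void sa_truncate_setsize(struct sa_inode *inode, long long newsize)
{
	assert(newsize >= 0 && newsize <= inode->max_bytes);
	inode->size = newsize;
}

/* A setattr implementation in the new style: check early, then apply. */
static int sa_setattr(struct sa_inode *inode, const struct sa_iattr *attr)
{
	int error = sa_change_ok(inode, attr);

	if (error)
		return error;
	if (attr->valid & SA_ATTR_SIZE)
		sa_truncate_setsize(inode, attr->size);
	return 0;
}
```

Because the size limits are checked up front, a rejected request leaves the inode untouched and the size update itself has no error path to handle.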
     
  • Replace inode_setattr with opencoded variants of it in all callers. This
    moves the remaining call to vmtruncate into the filesystem methods where it
    can be replaced with the proper truncate sequence.

    In a few cases it was obvious that we would never end up calling vmtruncate
    so it was left out in the opencoded variant:

    spufs: explicitly checks for ATTR_SIZE earlier
    btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier
    ufs: contains an opencoded simple_setattr + truncate that sets the filesize just above

    In addition to that, ncpfs called inode_setattr with handcrafted iattrs,
    which allowed trimming down the opencoded variant.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

28 May, 2010

1 commit

  • Lots of filesystems call vmtruncate despite not implementing the old
    ->truncate method. Switch them to use simple_setsize and add some
    comments about the truncate code where it seems fitting.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    npiggin@suse.de
     

04 Mar, 2010

1 commit


12 Jan, 2010

1 commit


08 Jan, 2010

1 commit

  • The rename code was taking a resource group lock in cases where
    it wasn't actually needed. This caused problems if the rename
    resulted in an inode being unlinked. The patch ensures that
    we only take the rgrp lock early if it is really needed.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

24 Sep, 2009

1 commit

  • * remove asm/atomic.h inclusion from linux/utsname.h --
    not needed after kref conversion
    * remove linux/utsname.h inclusion from files which do not need it

    NOTE: it looks like fs/binfmt_elf.c does not need utsname.h, however
    due to some personality stuff it _is_ needed -- cowardly leave ELF-related
    headers and files alone.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

27 Aug, 2009

2 commits

  • Use the more conventional name for the extended attribute
    support code. Update all the places which care.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This has been on my list for some time. We need to change the way
    in which we handle extended attributes to allow faster file creation
    times (by reducing the number of transactions required) and the
    extended attribute code is the main obstacle to this.

    In addition to that, the VFS provides a way to demultiplex the xattr
    calls which we ought to be using, rather than rolling our own. This
    patch changes the GFS2 code to use that VFS feature and as a result
    the code shrinks by a couple of hundred lines or so, and becomes
    easier to read.

    I'm planning on doing further clean up work in this area, but this
    patch is a good start. The cleaned up code also uses the more usual
    "xattr" shorthand. I plan to eliminate the use of "eattr" eventually,
    and in the meantime it serves as a flag as to which bits of the code
    have been updated.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

24 Aug, 2009

1 commit


22 May, 2009

3 commits


15 Apr, 2009

1 commit

  • In certain cases symlinks can appear to have zero size if a lookup
    on the inode occurs within a certain (very short) time after the
    symlink has been created. The symlink is correctly created on disk
    but appears to have zero size when stat()ed. This patch closes the
    race and prevents incorrect sizes appearing.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

24 Mar, 2009

1 commit

  • This is the big patch that I've been working on for some time
    now. There are many reasons for wanting to make this change
    such as:
    o Reducing overhead by eliminating duplicated fields between structures
    o Simplification of the code (reduces the code size by a fair bit)
    o The locking interface is now the DLM interface itself as proposed
    some time ago.
    o Fewer lookups of glocks when processing replies from the DLM
    o Fewer memory allocations/deallocations for each glock
    o Scope to do further optimisations in the future (but this patch is
    more than big enough for now!)

    Please note that (a) this patch relates to the lock_dlm module and
    not the DLM itself, that is still a separate module; and (b) that
    we retain the ability to build GFS2 as a standalone single node
    filesystem without requiring the DLM.

    This patch needs a lot of testing, hence my keeping it until I restarted
    my -git tree after the last merge window. That way, this has the maximum
    exposure before it's merged. This is (modulo a few minor bug fixes) the
    same patch that I've been posting on and off for the last three months
    and it's passed a number of different tests so far.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

05 Jan, 2009

5 commits

  • The final field in gfs2_dinode_host was the i_flags field. That's
    renamed to i_diskflags in order to avoid confusion with the existing
    inode flags, and moved into the inode proper at a suitable location
    to avoid creating a "hole".

    At that point struct gfs2_dinode_host is no longer needed and as
    promised (quite some time ago!) it can now be removed completely.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch moves the i_size field from the gfs2_dinode_host and,
    following the ext3 convention, renames it i_disksize.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This moves the directory entry count into the proper inode.
    Potentially we could get this to share the space used by
    something else in the future, but this is one more step
    on the way to removing the gfs2_dinode_host structure.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • Move the contents of some headers which contained very
    little into more sensible places, and remove the original
    header files. This should make it easier to find things.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch implements the FIEMAP ioctl for GFS2. We can use the generic
    code (aside from a lock order issue, solved as per Ted Tso's suggestion)
    for which I've introduced a new variant of the generic function. We also
    have one exception to deal with, namely stuffed files, so we do that
    "by hand", setting all the required flags.

    This has been tested with a modified version (I could only find an old one)
    of Eric's test program, and appears to work correctly.

    This patch does not currently support FIEMAP of xattrs, but the plan is to add
    that feature at some future point.

    Signed-off-by: Steven Whitehouse
    Cc: Theodore Tso
    Cc: Eric Sandeen

    Steven Whitehouse
     

23 Oct, 2008

1 commit


27 Aug, 2008

1 commit

  • This patch fixes a locking issue in the rename code by ensuring that we hold
    the per sb rename lock over both directory and "other" renames which involve
    different parent directories.

    At the same time, this moves the (only called from one place) function
    gfs2_ok_to_move into the file that it's called from, so we can mark it
    static. This should make the code a bit easier to follow.

    Signed-off-by: Steven Whitehouse
    Cc: Peter Staubach

    Steven Whitehouse
     

13 Aug, 2008

1 commit

  • This patch fixes a problem whereby simultaneous unlink, rmdir,
    rename and link operations (e.g. rm -fR *) from multiple nodes
    on the same GFS2 file system can cause kernel panics, hangs,
    and/or memory corruption. It also gets rid of all the non-rgrp
    calls to gfs2_glock_nq_m.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

27 Jul, 2008

2 commits


03 Jul, 2008

1 commit

  • GFS2 calls permission() to verify permissions after locks on the files
    have been taken.

    For this it's sufficient to call gfs2_permission() instead. This
    results in the following changes:

    - IS_RDONLY() check is not performed
    - IS_IMMUTABLE() check is not performed
    - devcgroup_inode_permission() is not called
    - security_inode_permission() is not called

    IS_RDONLY() should be unnecessary anyway, as the per-mount read-only
    flag should provide protection against read-only remounts during
    operations. do_gfs2_set_flags() has been fixed to perform
    mnt_want_write()/mnt_drop_write() to protect against remounting
    read-only.

    IS_IMMUTABLE has been added to gfs2_permission()

    Repeating the security checks seems to be pointless, as they don't
    normally change, and if they do, it's independent of the filesystem
    state.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Steven Whitehouse

    Miklos Szeredi
     

31 Mar, 2008

5 commits

  • This patch streamlines the quota checking in the "no quota" case by
    making the check inline in the calling function, thus reducing the
    number of function calls. Eventually we might be able to remove the
    checks from the gfs2_quota_lock() and gfs2_quota_check() functions, but
    currently we can't as there are a very few places in the code which need
    to call these functions directly still.

    Signed-off-by: Steven Whitehouse
    Cc: Abhijith Das

    Steven Whitehouse
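
The idea of moving the "no quota" test into an inline wrapper, while keeping the same test in the out-of-line function for its remaining direct callers, can be sketched as follows. This is a userspace model under stated assumptions: `qm_sb`, `qm_quota_lock_slow` and `qm_quota_lock_check` are hypothetical names, not the GFS2 functions.

```c
/* Hypothetical stand-ins for the superblock quota setting. */
enum qm_quota_mode { QM_QUOTA_OFF, QM_QUOTA_ON };

struct qm_sb {
	enum qm_quota_mode mode;
	int slow_path_calls; /* counts entries into the out-of-line function */
};

/* Out-of-line function. It keeps its own "quota off" test because a few
 * callers may still invoke it directly rather than via the wrapper. */
static int qm_quota_lock_slow(struct qm_sb *sb)
{
	sb->slow_path_calls++;
	if (sb->mode == QM_QUOTA_OFF)
		return 0;
	/* ... real quota locking and checking would happen here ... */
	return 0;
}

/* Inline wrapper: in the "no quota" case the caller pays only for one
 * comparison and never makes the function call. */
static inline int qm_quota_lock_check(struct qm_sb *sb)
{
	if (sb->mode == QM_QUOTA_OFF)
		return 0;
	return qm_quota_lock_slow(sb);
}
```

With quotas off, `qm_quota_lock_check` returns without touching `slow_path_calls`; with quotas on, each call reaches the out-of-line function exactly once.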
     
  • gfs2_alloc_get may fail so we have to check it to prevent
    NULL pointer dereference.

    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Steven Whitehouse

    Cyrill Gorcunov
     
  • struct inode_operations gfs2_dev_iops has always been the same as
    gfs2_file_iops since Jan 2006, when GFS2 was merged into the mainline kernel.

    So one of them could be removed.

    Signed-off-by: Denis Cheng
    Signed-off-by: Steven Whitehouse

    Denis Cheng
     
  • We've previously been using a "try lock" in readpage on the basis that
    it would prevent deadlocks due to the inverted lock ordering (our normal
    lock ordering is glock first and then page lock). Unfortunately tests
    have shown that this isn't enough. If the glock has a demote request
    queued such that run_queue() in the glock code tries to do a demote when
    it's called under readpage, then it will try to write out all the dirty
    pages, which requires locking them. This then deadlocks with the page
    locked by readpage.

    The solution is to always require two calls into readpage. The first
    unlocks the page, gets the glock and returns AOP_TRUNCATED_PAGE, the
    second does the actual readpage and unlocks the glock & page as
    required.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
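
The two-call readpage sequence can be modelled in userspace C as below. This is a sketch only: the structures, the `rp_*` names and the value of `RP_AOP_TRUNCATED_PAGE` are hypothetical, and the way the second call detects that the glock is already held (a plain flag) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker meaning "page was unlocked, please call again". */
#define RP_AOP_TRUNCATED_PAGE 1

struct rp_page {
	bool locked;
	bool uptodate;
};

struct rp_glock {
	bool held;
};

/* The caller always passes in a locked page, as the VFS does. */
static int rp_readpage(struct rp_glock *gl, struct rp_page *page)
{
	assert(page->locked);

	if (!gl->held) {
		/* First call: drop the page lock before taking the glock,
		 * so the normal order (glock, then page lock) is kept. */
		page->locked = false;
		gl->held = true;
		return RP_AOP_TRUNCATED_PAGE;
	}

	/* Second call: glock already held, page locked again by the caller. */
	page->uptodate = true;
	gl->held = false;
	page->locked = false;
	return 0;
}

/* Models the caller retrying for as long as it is asked to. */
static int rp_read_with_retry(struct rp_glock *gl, struct rp_page *page,
			      int *calls)
{
	int ret;

	do {
		page->locked = true;
		ret = rp_readpage(gl, page);
		(*calls)++;
	} while (ret == RP_AOP_TRUNCATED_PAGE);
	return ret;
}
```

In this model a read always takes exactly two calls, and the glock is never acquired while the page lock is held.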
     
  • The blocks counter is almost a duplicate of the i_blocks
    field in the VFS inode. The only difference is that i_blocks
    can be only 32 bits long on 32-bit architectures without large single
    file (LSF) support. Since GFS2 doesn't handle the non-large single file
    case (for 32 bit anyway) this adds a new config dependency on
    64BIT || LSF. This has always been the case, however we've never
    explicitly said so before.

    Even if we do add support for the non-LSF case, we will still
    not require this field to be duplicated since we will not be
    able to access oversized files anyway.

    So the net result of all this is that we shave 8 bytes from a gfs2_inode
    and get our config deps correct.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

08 Feb, 2008

1 commit


25 Jan, 2008

2 commits

  • It is possible to reduce the size of GFS2 inodes by taking the i_alloc
    structure out of the gfs2_inode. This patch allocates the i_alloc
    structure whenever it's needed, and frees it afterward. This decreases
    the amount of low memory we use at the expense of requiring a memory
    allocation for each page or partial page that we write. A quick test
    with postmark shows that the overhead is not measurable and I also note
    that OCFS2 uses the same approach.

    In the future I'd like to solve the problem by shrinking down the size
    of the members of the i_alloc structure, but for now, this reduces the
    immediate problem of using too much low-memory on x86 and doesn't add
    too much overhead.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
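
The allocate-on-demand pattern described here can be sketched in userspace C. The `oa_*` names and fields are hypothetical; they only illustrate keeping a pointer in the inode, allocating the structure for the duration of one write, checking for allocation failure, and freeing it afterward.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-operation allocation state. */
struct oa_alloc {
	unsigned int requested_blocks;
};

/* The inode keeps only a pointer, which is NULL when no write is active. */
struct oa_inode {
	struct oa_alloc *alloc;
};

/* Allocate the structure on demand; may return NULL on failure. */
static struct oa_alloc *oa_alloc_get(struct oa_inode *ip)
{
	ip->alloc = calloc(1, sizeof(*ip->alloc));
	return ip->alloc;
}

/* Free the structure once the operation is finished. */
static void oa_alloc_put(struct oa_inode *ip)
{
	free(ip->alloc);
	ip->alloc = NULL;
}

/* One write: the allocation lives only for the duration of the call. */
static int oa_write_page(struct oa_inode *ip, unsigned int blocks)
{
	struct oa_alloc *al = oa_alloc_get(ip);

	if (!al)
		return -ENOMEM; /* the allocation can fail, so check it */

	al->requested_blocks = blocks;
	/* ... block reservation and the actual write would go here ... */

	oa_alloc_put(ip);
	return 0;
}
```

The trade-off is the one the commit message names: the inode shrinks to a single pointer, at the cost of one allocation and one free per write.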
     
  • This patch fixes a couple of problems which affected the execution of files
    on GFS2. The first is that there was a corner case where inodes were not
    always uptodate at the point at which permissions checks were being carried
    out. This resulted in refusal of execute permission, but only on the
    first lookup; subsequent requests worked correctly. The second was a problem
    relating to incorrect updating of file sizes, which was introduced with the
    write_begin/end code for GFS2 a little while back.

    Signed-off-by: Steven Whitehouse
    Cc: Abhijith Das

    Steven Whitehouse
     

10 Oct, 2007

3 commits

  • This patch cleans up the code for writing journaled data into the log.
    It also removes the need to allocate a small "tag" structure for each
    block written into the log. Instead we just keep count of the outstanding
    I/O so that we can be sure that it's all been written at the correct time.
    Another result of this patch is that a number of ll_rw_block() calls
    have become submit_bh() calls, closing some races at the same time.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch corrects the lock ordering in unlink to be the same as
    that in the rest of GFS2, i.e. parent -> child -> rgrp.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
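
A fixed lock order such as parent -> child -> rgrp can be expressed as a rank per lock class, with the rule that ranks never decrease along an acquisition sequence. The checker below is a hypothetical userspace illustration of that rule, not code from GFS2.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ranks mirroring the order parent -> child -> rgrp. */
enum lo_rank {
	LO_RANK_PARENT = 0,
	LO_RANK_CHILD = 1,
	LO_RANK_RGRP = 2
};

/* Returns true if the locks in seq[] are taken in non-decreasing rank
 * order, which is what a single global lock order requires. */
static bool lo_order_ok(const enum lo_rank *seq, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (seq[i] < seq[i - 1])
			return false;
	}
	return true;
}
```

If every code path passes this check, two paths can never wait on each other's locks in opposite order, which is the usual argument for why a single order avoids deadlock.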
     
  • When looking at an unrelated problem, I noticed that nfsd does not
    set the nameidata pointer on create (i.e. nd is NULL). This should
    cause an oops in some cases when NFSd is mounted over GFS2.

    Signed-off-by: Steve French
    Signed-off-by: Steven Whitehouse

    Steve French
     

09 Jul, 2007

3 commits

  • This should have been part of the NFS patch #1 but somehow I missed it
    when packaging the patches. It is not as critical an issue as the others (I
    hope). RHEL 5.1 31.el5 kernel runs fine without this change.

    Our truncate code is chopped into two parts, one for vfs inode changes
    (in vmtruncate()) and one for the gfs inode (in gfs2_truncatei()). These two
    operations are, unfortunately, not atomic. So it could happen that
    vmtruncate() succeeds (inode->i_size is changed) but gfs2_truncatei
    fails (say kernel temporarily out of memory). This would leave gfs inode
    i_di.di_size out of sync with vfs inode i_size. It will later confuse
    gfs2_commit_write() if a write is issued. Last time I checked, it will
    cause file corruption.

    Signed-off-by: S. Wendy Cheng
    Signed-off-by: Steven Whitehouse

    Wendy Cheng
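
The failure mode described above, and one possible way to keep the two sizes consistent, can be modelled in userspace C. This is a sketch under stated assumptions: the `tr_*` names are hypothetical, and resetting the VFS size from the filesystem's own size on failure is only one way to restore consistency, not necessarily what the patch does.

```c
#include <errno.h>

/* Hypothetical model of an inode that tracks the size in two places. */
struct tr_inode {
	long long vfs_size;   /* plays the role of inode->i_size */
	long long disk_size;  /* plays the role of i_di.di_size */
	int fail_fs_truncate; /* test knob: make the second step fail */
};

/* Step 1: the VFS-side truncate, which updates the VFS size. */
static int tr_vfs_truncate(struct tr_inode *ip, long long size)
{
	ip->vfs_size = size;
	return 0;
}

/* Step 2: the filesystem-side truncate, which may fail independently. */
static int tr_fs_truncate(struct tr_inode *ip, long long size)
{
	if (ip->fail_fs_truncate)
		return -ENOMEM;
	ip->disk_size = size;
	return 0;
}

/* The two steps are not atomic, so on failure of step 2 the VFS size
 * is put back in line with the filesystem's size before returning. */
static int tr_setattr_size(struct tr_inode *ip, long long size)
{
	int error = tr_vfs_truncate(ip, size);

	if (error)
		return error;

	error = tr_fs_truncate(ip, size);
	if (error)
		ip->vfs_size = ip->disk_size;
	return error;
}
```

Without the resynchronisation in the error path, a failed second step would leave `vfs_size` at the new value and `disk_size` at the old one, which is the mismatch the commit message warns about.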
     
  • A typo caused us to pass a NULL pointer when renaming directories. It
    was accidentally introduced in: [GFS2] Clean up inode number handling

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This adds a nanosecond timestamp feature to the GFS2 filesystem. Due
    to the way that the on-disk format works, older filesystems will just
    appear to have this field set to zero. When mounted by an older version
    of GFS2, the filesystem will simply ignore the extra fields so that
    it will again appear to have whole second resolution, so it's
    trivially backward compatible.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
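
The compatibility argument can be illustrated with a small userspace model of an on-disk record. The layout below (an 8-byte seconds field followed by a 4-byte nanoseconds field in space that older writers leave zeroed) and all `ts_*` names are hypothetical; only the use of big-endian encoding reflects GFS2's on-disk format.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, not the real GFS2 dinode. */
#define TS_RECORD_BYTES 16
#define TS_SEC_OFF 0   /* 8 bytes: whole seconds */
#define TS_NSEC_OFF 8  /* 4 bytes: nanoseconds, zero if written by old code */

static void ts_put_be64(uint8_t *p, uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

static uint64_t ts_get_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

static void ts_put_be32(uint8_t *p, uint32_t v)
{
	for (int i = 3; i >= 0; i--) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

static uint32_t ts_get_be32(const uint8_t *p)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Old writer: knows only about seconds and leaves the rest zeroed. */
static void ts_write_old(uint8_t *buf, uint64_t sec)
{
	memset(buf, 0, TS_RECORD_BYTES);
	ts_put_be64(buf + TS_SEC_OFF, sec);
}

/* New writer: also stores the nanosecond part. */
static void ts_write_new(uint8_t *buf, uint64_t sec, uint32_t nsec)
{
	memset(buf, 0, TS_RECORD_BYTES);
	ts_put_be64(buf + TS_SEC_OFF, sec);
	ts_put_be32(buf + TS_NSEC_OFF, nsec);
}

/* Old reader: ignores the extra field and sees whole seconds only. */
static uint64_t ts_read_old(const uint8_t *buf)
{
	return ts_get_be64(buf + TS_SEC_OFF);
}

/* New reader: picks up nanoseconds, which are zero on old records. */
static void ts_read_new(const uint8_t *buf, uint64_t *sec, uint32_t *nsec)
{
	*sec = ts_get_be64(buf + TS_SEC_OFF);
	*nsec = ts_get_be32(buf + TS_NSEC_OFF);
}
```

A record written by the old writer reads back with a zero nanosecond part under the new reader, and a record written by the new writer still yields the correct whole seconds under the old reader.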