23 Aug, 2011

2 commits


02 Aug, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    xfs: Fix build breakage in xfs_iops.c when CONFIG_FS_POSIX_ACL is not set
    VFS: Reorganise shrink_dcache_for_umount_subtree() after demise of dcache_lock
    VFS: Remove dentry->d_lock locking from shrink_dcache_for_umount_subtree()
    VFS: Remove detached-dentry counter from shrink_dcache_for_umount_subtree()
    switch posix_acl_chmod() to umode_t
    switch posix_acl_from_mode() to umode_t
    switch posix_acl_equiv_mode() to umode_t *
    switch posix_acl_create() to umode_t *
    block: initialise bd_super in bdget()
    vfs: avoid call to inode_lru_list_del() if possible
    vfs: avoid taking inode_hash_lock on pipes and sockets
    vfs: conditionally call inode_wb_list_del()
    VFS: Fix automount for negative autofs dentries
    Btrfs: load the key from the dir item in readdir into a fake dentry
    devtmpfs: missing initialialization in never-hit case
    hppfs: missing include

    Linus Torvalds
     

01 Aug, 2011

2 commits


28 Jul, 2011

1 commit


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in <asm/atomic.h>
    (atomic_inc_not_zero() for now) to <linux/atomic.h>.

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
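
    For readers unfamiliar with the helper named above, the following is a
    minimal userspace sketch of the "increment unless zero" pattern, written
    with C11 atomics. It is not the kernel implementation; the name
    inc_not_zero_sketch() is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace analogue of the "increment unless zero" pattern.
 * Returns true if *v was non-zero and has been incremented,
 * false if *v was zero and has been left untouched.
 */
static bool inc_not_zero_sketch(atomic_int *v)
{
    assert(v != NULL);

    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}
```

    The usual use is taking a reference on an object whose count may already
    have dropped to zero: a false return tells the caller the object is on
    its way out and must not be touched.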
     

26 Jul, 2011

5 commits


23 Jul, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
    vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
    isofs: Remove global fs lock
    jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
    fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
    mm/truncate.c: fix build for CONFIG_BLOCK not enabled
    fs:update the NOTE of the file_operations structure
    Remove dead code in dget_parent()
    AFS: Fix silly characters in a comment
    switch d_add_ci() to d_splice_alias() in "found negative" case as well
    simplify gfs2_lookup()
    jfs_lookup(): don't bother with . or ..
    get rid of useless dget_parent() in btrfs rename() and link()
    get rid of useless dget_parent() in fs/btrfs/ioctl.c
    fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
    drivers: fix up various ->llseek() implementations
    fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
    Ext4: handle SEEK_HOLE/SEEK_DATA generically
    Btrfs: implement our own ->llseek
    fs: add SEEK_HOLE and SEEK_DATA flags
    reiserfs: make reiserfs default to barrier=flush
    ...

    Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
    shrinker callout for the inode cache, that clashed with the xfs code to
    start the periodic workers later.

    Linus Torvalds
     

21 Jul, 2011

3 commits

  • d_splice_alias() will DTRT when given NULL or ERR_PTR

    Signed-off-by: Al Viro

    Al Viro
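
    As background for the message above, the kernel encodes small negative
    errno values in the top of the pointer range, so a callee can accept
    "valid pointer, NULL, or error" through one argument. The sketch below
    is a userspace illustration under that assumption; err_ptr(), is_err(),
    ptr_err(), struct obj and splice_sketch() are hypothetical stand-ins,
    and the real d_splice_alias() does far more than this.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095   /* assumed size of the error range */

/* Encode a negative errno as a pointer, and decode it again. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* True if p lies in the last MAX_ERRNO_SKETCH values of the address range. */
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MAX_ERRNO_SKETCH);
}

struct obj { int id; };         /* stand-in for an inode */

/*
 * A callee that tolerates NULL and error pointers, so callers can pass a
 * lookup result straight through without checking it first.
 */
static struct obj *splice_sketch(struct obj *o, int *attached)
{
    assert(attached != NULL);

    if (is_err(o))
        return o;               /* hand the encoded errno back */
    if (o == NULL)
        return NULL;            /* nothing found: negative result */
    (*attached)++;              /* a real object: attach it */
    return o;
}
```

    With a callee shaped like this, a caller can write
    splice_sketch(lookup(...), &count) without a separate error branch.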
     
  • Btrfs needs to be able to control how filemap_write_and_wait_range() is called
    in fsync to make it less of a painful operation, so push down taking i_mutex and
    the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
    file systems can drop taking the i_mutex altogether it seems, like ext3 and
    ocfs2. For correctness' sake I just pushed everything down in all cases to make
    sure that we keep the current behavior the same for everybody, and then each
    individual fs maintainer can make up their mind about what to do from there.
    Thanks,

    Acked-by: Jan Kara
    Signed-off-by: Josef Bacik
    Signed-off-by: Al Viro

    Josef Bacik
     
  • Let filesystems handle waiting for direct I/O requests themselves instead
    of doing it beforehand. This means filesystem-specific locks to prevent
    new dio references from appearing can be held. This is important to allow
    generalizing i_dio_count to non-DIO_LOCKING filesystems.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

20 Jul, 2011

5 commits


15 Jul, 2011

4 commits

  • __gfs2_free_data and __gfs2_free_meta are almost identical, and
    can be trivially combined.

    [This is as per Eric's original patch minus gfs2_free_data() which had
    no callers left and plus the conversion of the bmap.c calls to these
    functions. All in all, a nice clean up]

    Signed-off-by: Eric Sandeen
    Signed-off-by: Steven Whitehouse

    Eric Sandeen
     
  • This adds S_NOSEC support to GFS2. We set/reset the flag either when
    a user calls setattr or when we have just regained the glock
    from another node. The flag is only set if there are no xattrs
    on the inode and there is no suid bit set.

    Signed-off-by: Steven Whitehouse
    Reviewed-by: Andi Kleen
    Cc: Al Viro

    Steven Whitehouse
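
    The condition described above can be sketched as a small predicate.
    This is only an illustration of the rule as stated in the message (no
    xattrs and no suid bit); may_set_nosec_sketch() and its parameters are
    hypothetical and are not taken from the GFS2 source.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Returns true when the "no security work needed on write" flag could be
 * set according to the rule in the commit message: the inode has no
 * extended attributes and the suid bit is clear.
 */
static bool may_set_nosec_sketch(mode_t mode, bool has_xattrs)
{
    return !has_xattrs && (mode & S_ISUID) == 0;
}

/* Small self-check showing the three interesting cases. */
static void nosec_selfcheck(void)
{
    assert(may_set_nosec_sketch(0644, false));    /* plain file */
    assert(!may_set_nosec_sketch(04755, false));  /* suid binary */
    assert(!may_set_nosec_sketch(0644, true));    /* file with xattrs */
}
```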
     
  • This patch is a performance improvement for GFS2 in a clustered
    environment. It makes the glock hold time self-adjusting.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • This patch adds a cache for the hash table to the directory code
    in order to help simplify the way in which the hash table is
    accessed. This is intended to be a first step towards introducing
    some performance improvements in the directory code.

    There are two follow ups that I'm hoping to see fairly shortly. One
    is to simplify the hash table reading code now that we always read the
    complete hash table, whether we want one entry or all of them. The
    other is to introduce readahead on the heads of the hash chains
    which are referred to from the table.

    The hash table is a maximum of 128k in size, so it is not worth trying
    to read it in small chunks.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

14 Jul, 2011

1 commit

  • This patch contains a few misc fixes which resolve a recently
    reported issue. This patch has been a real team effort and has
    received a lot of testing.

    The first issue is that the ail lock needs to be held over a few
    more operations. The lock that's added into gfs2_releasepage() may
    possibly be a candidate for replacing with RCU at some future
    point, but at this stage we've gone for the obvious fix.

    The second issue is that gfs2_write_inode() can end up calling
    a glock recursively when called from gfs2_evict_inode() via the
    syncing code, so it needs a guard added.

    The third issue is that we either need to not truncate the metadata
    pages of inodes which have zero link count, but which we cannot
    deallocate due to them still being in use by other nodes, or we need
    to ensure that those pages have all made it through the journal and
    ail lists first. This patch takes the former approach, but the
    latter has also been tested and there is nothing to choose between
    them performance-wise. So again, we could revise that decision
    in the future.

    Also, the inode eviction process is now better documented.

    Signed-off-by: Steven Whitehouse
    Tested-by: Bob Peterson
    Tested-by: Abhijith Das
    Reported-by: Barry J. Marson
    Reported-by: David Teigland

    Steven Whitehouse
     

12 Jul, 2011

2 commits

  • There is a potential race during filesystem mounting which has recently
    been reported. It occurs when the userland gfs_controld is able to
    process requests fast enough that it tries to use the sysfs interface
    before the lock module is properly initialised. This is a pretty
    unusual case as normally the lock module initialisation is very quick
    compared with gfs_controld.

    This patch adds an interruptible completion which is used to ensure that
    userland will wait for the initialisation of the lock module to
    complete.

    There are other potential solutions to this problem, but this is the
    quickest at this stage and has been tested both with and without
    mount.gfs2 present in the system.

    Signed-off-by: Steven Whitehouse
    Reported-by: David Booher

    Steven Whitehouse
     
  • Right now, there is nothing that forces the log to get flushed when a node
    drops its rindex glock so that another node can grow the filesystem. If the
    log doesn't get flushed, GFS2 can corrupt the sd_log_le_rg list in the
    following way.

    A node puts an rgd on the list in rg_lo_add(), and then the rindex glock is
    dropped so the other node can grow the filesystem. When the node reacquires the
    rindex glock, that rgd gets deleted in clear_rgrpdi() before ever being
    removed from the list by gfs2_log_flush().

    This code simply forces a log flush when the rindex glock is invalidated,
    solving the problem.

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
     

08 Jun, 2011

1 commit


27 May, 2011

1 commit

  • * 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
    gfs2: Drop __TIME__ usage
    isdn/diva: Drop __TIME__ usage
    atm: Drop __TIME__ usage
    dlm: Drop __TIME__ usage
    wan/pc300: Drop __TIME__ usage
    parport: Drop __TIME__ usage
    hdlcdrv: Drop __TIME__ usage
    baycom: Drop __TIME__ usage
    pmcraid: Drop __DATE__ usage
    edac: Drop __DATE__ usage
    rio: Drop __DATE__ usage
    scsi/wd33c93: Drop __TIME__ usage
    scsi/in2000: Drop __TIME__ usage
    aacraid: Drop __TIME__ usage
    media/cx231xx: Drop __TIME__ usage
    media/radio-maxiradio: Drop __TIME__ usage
    nozomi: Drop __TIME__ usage
    cyclades: Drop __TIME__ usage

    Linus Torvalds
     

26 May, 2011

1 commit

  • The kernel already prints its build timestamp during boot, no need to
    repeat it in random drivers and produce different object files each
    time.

    Cc: Steven Whitehouse
    Cc: cluster-devel@redhat.com
    Signed-off-by: Michal Marek

    Michal Marek
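
    To make the motivation concrete, here is a small sketch of the pattern
    being removed. The banner strings and function names are hypothetical;
    only __DATE__ and __TIME__ are standard C macros.

```c
#include <assert.h>

/* __DATE__ expands to "Mmm dd yyyy" and __TIME__ to "hh:mm:ss". */
static_assert(sizeof(__DATE__) == 12, "11 characters plus NUL");
static_assert(sizeof(__TIME__) == 9, "8 characters plus NUL");

/*
 * Pattern being dropped: the string changes on every compile, so the
 * resulting object file differs even when the source did not.
 */
static const char *banner_with_timestamp(void)
{
    return "demo driver, built " __DATE__ " " __TIME__;
}

/* Replacement: a constant string, identical across rebuilds. */
static const char *banner_stable(void)
{
    return "demo driver";
}
```

    Rebuilding the first variant a minute later yields a different binary,
    which defeats object-file comparison and caching; the second does not.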
     

25 May, 2011

2 commits

  • Change each shrinker's API by consolidating the existing parameters into
    a shrink_control struct. This will simplify adding further features
    without touching every file that implements a shrinker.

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: fix warning]
    [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
    [akpm@linux-foundation.org: fix xfs warning]
    [akpm@linux-foundation.org: update gfs2]
    Signed-off-by: Ying Han
    Cc: KOSAKI Motohiro
    Cc: Minchan Kim
    Acked-by: Pavel Emelyanov
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Steven Whitehouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ying Han
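
    The shape of this change can be sketched in userspace as follows. The
    field names and the "nr_to_scan == 0 means only report the count"
    convention are assumptions made for illustration, and demo_shrink() and
    nr_cached are hypothetical; consult the kernel headers for the real
    definitions.

```c
#include <assert.h>
#include <stddef.h>

/* All per-call parameters travel in one struct (fields are assumed). */
struct shrink_control {
    unsigned int gfp_mask;      /* allocation context of the caller */
    unsigned long nr_to_scan;   /* objects to try to free; 0 = just count */
};

struct shrinker {
    int (*shrink)(struct shrinker *s, struct shrink_control *sc);
    int nr_cached;              /* illustrative cache size only */
};

/*
 * Example callback. A new parameter later becomes a new struct field
 * rather than a new argument in every callback's signature.
 */
static int demo_shrink(struct shrinker *s, struct shrink_control *sc)
{
    assert(s != NULL && sc != NULL);

    if (sc->nr_to_scan > 0) {
        unsigned long n = sc->nr_to_scan;

        if (n > (unsigned long)s->nr_cached)
            n = (unsigned long)s->nr_cached;
        s->nr_cached -= (int)n;
    }
    return s->nr_cached;        /* objects still cached */
}
```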
     
  • This patch fixes a race in the GFS2 glock state machine that may
    result in lockups. The symptom is that all nodes but one will
    hang, waiting for a particular glock. All the holder records
    will have the "W" (Waiting) bit set. The other node will
    typically have the glock stuck in Exclusive mode (EX) with no
    holder records, but the dinode will be cached. In other words,
    an entry with "I:" will appear in the glock dump for that glock,
    but nothing else.

    The race has to do with the glock "Pending Demote" bit, which
    can be set, then immediately reset, thus losing the fact that
    another node needs the glock. The sequence of events is:

    1. Something schedules the glock workqueue (e.g. glock request from fs)
    2. The glock workqueue gets to the point between the test of the reply pending
    bit and the spin lock:

        if (test_and_clear_bit(GLF_REPLY_PENDING, &gl->gl_flags)) {
                finish_xmote(gl, gl->gl_reply);
                drop_ref = 1;
        }
        down_read(&gfs2_umount_flush_sem);      <---- i.e. here
        spin_lock(&gl->gl_spin);

    3. In comes (a) the reply to our EX lock request setting GLF_REPLY_PENDING and
    (b) the demote request which sets GLF_PENDING_DEMOTE

    4. The following test is executed:

        if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags) &&
            gl->gl_state != LM_ST_UNLOCKED &&
            gl->gl_demote_state != LM_ST_EXCLUSIVE) {

    This resets the pending demote flag, and gl->gl_demote_state is not equal to
    exclusive. However, because the reply from the dlm arrived after we checked
    for the GLF_REPLY_PENDING flag, gl->gl_state is still equal to unlocked. So
    although we reset the GLF_PENDING_DEMOTE flag, we didn't then set the
    GLF_DEMOTE flag or reinstate the GLF_PENDING_DEMOTE flag.

    The patch closes the timing window by only transitioning the
    "Pending demote" bit to the "demote" flag once we know the
    other conditions (not unlocked and not exclusive) are met.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
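
    The lost-flag problem described above can be reproduced in a
    single-threaded sketch. This is not the patch itself: the struct, the
    ST_* and F_* constants and both functions are hypothetical stand-ins
    for the LM_ST_* states and GLF_* bits named in the message, and plain
    integer flags replace the kernel's atomic bit operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { ST_UNLOCKED, ST_SHARED, ST_EXCLUSIVE };      /* stand-ins for LM_ST_* */
enum { F_PENDING_DEMOTE = 1, F_DEMOTE = 2 };        /* stand-ins for GLF_* */

struct glock_sketch {
    unsigned int flags;
    int state;
    int demote_state;
};

/* Single-threaded stand-in for test_and_clear_bit(). */
static bool test_and_clear(unsigned int *flags, unsigned int bit)
{
    bool was_set = (*flags & bit) != 0;

    *flags &= ~bit;
    return was_set;
}

/* Old ordering: the bit is cleared before the other conditions are known. */
static void pending_to_demote_old(struct glock_sketch *gl)
{
    assert(gl != NULL);
    if (test_and_clear(&gl->flags, F_PENDING_DEMOTE) &&
        gl->state != ST_UNLOCKED &&
        gl->demote_state != ST_EXCLUSIVE)
        gl->flags |= F_DEMOTE;
}

/* New ordering: only clear the bit once every condition holds. */
static void pending_to_demote_new(struct glock_sketch *gl)
{
    assert(gl != NULL);
    if ((gl->flags & F_PENDING_DEMOTE) != 0 &&
        gl->state != ST_UNLOCKED &&
        gl->demote_state != ST_EXCLUSIVE) {
        gl->flags &= ~(unsigned int)F_PENDING_DEMOTE;
        gl->flags |= F_DEMOTE;
    }
}
```

    Starting from a glock that is still unlocked with the pending-demote
    bit set, the old ordering leaves no flags set at all, so the demote
    request is forgotten. The new ordering keeps the pending bit until the
    state changes and then converts it to the demote flag on a later pass.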
     

22 May, 2011

1 commit

  • The ail flush code has always relied upon log flushing to prevent
    it from spinning needlessly. This fixes it to wait on the last
    I/O request submitted (we don't need to wait for all of it)
    instead of either spinning with io_schedule or sleeping.

    As a result cpu usage of gfs2_logd is much reduced with certain
    workloads.

    Reported-by: Abhijith Das
    Tested-by: Abhijith Das
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

21 May, 2011

2 commits

  • The deallocation code for directories in GFS2 is largely divided into
    two parts. The first part deallocates any directory leaf blocks and
    marks the directory as being a regular file when that is complete. The
    second stage was identical to deallocating regular files.

    Regular files have their data blocks in a different
    address space to directories, and thus what would have been normal data
    blocks in a regular file (the hash table in a GFS2 directory) were
    deallocated correctly. However, a reference to these blocks was left in the
    journal (assuming of course that some previous activity had resulted in
    those blocks being in the journal or ail list).

    This patch uses the i_depth as a test of whether the inode is an
    exhash directory (we cannot test the inode type as that has already
    been changed to a regular file at this stage in deallocation).

    The original issue was reported by Chris Hertel, who encountered it
    while running bonnie++.

    Reported-by: Christopher R. Hertel
    Cc: Abhijith Das
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (32 commits)
    GFS2: Move all locking inside the inode creation function
    GFS2: Clean up symlink creation
    GFS2: Clean up mkdir
    GFS2: Use UUID field in generic superblock
    GFS2: Rename ops_inode.c to inode.c
    GFS2: Inode.c is empty now, remove it
    GFS2: Move final part of inode.c into super.c
    GFS2: Move most of the remaining inode.c into ops_inode.c
    GFS2: Move gfs2_refresh_inode() and friends into glops.c
    GFS2: Remove gfs2_dinode_print() function
    GFS2: When adding a new dir entry, inc link count if it is a subdir
    GFS2: Make gfs2_dir_del update link count when required
    GFS2: Don't use gfs2_change_nlink in link syscall
    GFS2: Don't use a try lock when promoting to a higher mode
    GFS2: Double check link count under glock
    GFS2: Improve bug trap code in ->releasepage()
    GFS2: Fix ail list traversal
    GFS2: make sure fallocate bytes is a multiple of blksize
    GFS2: Add an AIL writeback tracepoint
    GFS2: Make writeback more responsive to system conditions
    ...

    Linus Torvalds
     

13 May, 2011

3 commits


10 May, 2011

1 commit