02 Aug, 2011

7 commits

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (60 commits)
    ext4: prevent memory leaks from ext4_mb_init_backend() on error path
    ext4: use EXT4_BAD_INO for buddy cache to avoid colliding with valid inode #
    ext4: use ext4_msg() instead of printk in mballoc
    ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_info
    ext4: introduce ext4_kvmalloc(), ext4_kzalloc(), and ext4_kvfree()
    ext4: use the correct error exit path in ext4_init_inode_table()
    ext4: add missing kfree() on error return path in add_new_gdb()
    ext4: change umode_t in tracepoint headers to be an explicit __u16
    ext4: fix races in ext4_sync_parent()
    ext4: Fix overflow caused by missing cast in ext4_fallocate()
    ext4: add action of moving index in ext4_ext_rm_idx for Punch Hole
    ext4: simplify parameters of reserve_backup_gdb()
    ext4: simplify parameters of add_new_gdb()
    ext4: remove lock_buffer in bclean() and setup_new_group_blocks()
    ext4: simplify journal handling in setup_new_group_blocks()
    ext4: let setup_new_group_blocks() set multiple bits at a time
    ext4: fix a typo in ext4_group_extend()
    ext4: let ext4_group_add_blocks() handle 0 blocks quickly
    ext4: let ext4_group_add_blocks() return an error code
    ext4: rename ext4_add_groupblocks() to ext4_group_add_blocks()
    ...

    Fix up conflict in fs/ext4/inode.c: commit aacfc19c626e ("fs: simplify
    the blockdev_direct_IO prototype") had changed the ext4_ind_direct_IO()
    function for the new simplified calling convention, while commit
    dae1e52cb126 ("ext4: move ext4_ind_* functions from inode.c to
    indirect.c") moved the function to another file.

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    xfs: Fix build breakage in xfs_iops.c when CONFIG_FS_POSIX_ACL is not set
    VFS: Reorganise shrink_dcache_for_umount_subtree() after demise of dcache_lock
    VFS: Remove dentry->d_lock locking from shrink_dcache_for_umount_subtree()
    VFS: Remove detached-dentry counter from shrink_dcache_for_umount_subtree()
    switch posix_acl_chmod() to umode_t
    switch posix_acl_from_mode() to umode_t
    switch posix_acl_equiv_mode() to umode_t *
    switch posix_acl_create() to umode_t *
    block: initialise bd_super in bdget()
    vfs: avoid call to inode_lru_list_del() if possible
    vfs: avoid taking inode_hash_lock on pipes and sockets
    vfs: conditionally call inode_wb_list_del()
    VFS: Fix automount for negative autofs dentries
    Btrfs: load the key from the dir item in readdir into a fake dentry
    devtmpfs: missing initialization in never-hit case
    hppfs: missing include

    Linus Torvalds
     
  • * 'pstore-efi' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
    efivars: Introduce PSTORE_EFI_ATTRIBUTES
    efivars: Use string functions in pstore_write
    efivars: introduce utf16_strncmp
    efivars: String functions
    efi: Add support for using efivars as a pstore backend
    pstore: Allow the user to explicitly choose a backend
    pstore: Make "part" unsigned
    pstore: Add extra context for writes and erases
    pstore: Extend API for more flexibility in new backends

    Linus Torvalds
     
  • In ext4_mb_init(), if the s_locality_group allocation fails it will
    currently cause the allocations made in ext4_mb_init_backend() to
    be leaked. Moving the ext4_mb_init_backend() allocation after the
    s_locality_group allocation avoids that problem.

    Signed-off-by: Yu Jian
    Signed-off-by: Andreas Dilger
    Signed-off-by: "Theodore Ts'o"

    Yu Jian
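
The ordering fix described in this entry can be illustrated with a small userspace sketch. Every name here (struct mb_state, init_backend(), mb_init(), the fail_locality_alloc switch) is a hypothetical stand-in and not the ext4 code. The only point is that doing the simple allocation first leaves its failure path with nothing from the backend to undo.

```c
#include <stdlib.h>

/* Hypothetical stand-in for allocator state; not the real ext4 structures. */
struct mb_state {
    void *locality_groups;   /* the single, simple allocation */
    void *backend_a;         /* allocations made by the backend init step */
    void *backend_b;
};

/* The backend step makes several allocations and undoes them on failure. */
static int init_backend(struct mb_state *s)
{
    s->backend_a = malloc(64);
    if (!s->backend_a)
        return -1;
    s->backend_b = malloc(64);
    if (!s->backend_b) {
        free(s->backend_a);
        s->backend_a = NULL;
        return -1;
    }
    return 0;
}

/*
 * Simple allocation first, backend second: if the first step fails, the
 * backend has not been touched yet, so nothing can leak.
 * fail_locality_alloc simulates that first allocation failing.
 */
static int mb_init(struct mb_state *s, int fail_locality_alloc)
{
    s->locality_groups = fail_locality_alloc ? NULL : malloc(32);
    if (!s->locality_groups)
        return -1;

    if (init_backend(s) != 0) {
        free(s->locality_groups);
        s->locality_groups = NULL;
        return -1;
    }
    return 0;
}

/* Release everything acquired by a successful mb_init(). */
static void mb_release(struct mb_state *s)
{
    free(s->backend_b);
    free(s->backend_a);
    free(s->locality_groups);
}
```

With the opposite order, the early return after a failed simple allocation would also have to free whatever the backend step had set up.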
     
  • Signed-off-by: Yu Jian
    Signed-off-by: Andreas Dilger
    Signed-off-by: "Theodore Ts'o"

    Yu Jian
     
  • Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    CIFS: Cleanup demultiplex thread exiting code
    CIFS: Move mid search to a separate function
    CIFS: Move RFC1002 check to a separate function
    CIFS: Simplify socket reading in demultiplex thread
    CIFS: Move buffer allocation to a separate function
    cifs: remove unneeded variable initialization in cifs_reconnect_tcon
    cifs: simplify refcounting for oplock breaks
    cifs: fix compiler warning in CIFSSMBQAllEAs
    cifs: fix name parsing in CIFSSMBQAllEAs
    cifs: don't start signing too early
    cifs: trivial: goto out here is unnecessary
    cifs: advertise the right receive buffer size to the server

    Linus Torvalds
     

01 Aug, 2011

33 commits

  • Reviewed-and-Tested-by: Jeff Layton
    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Pavel Shilovsky
     
  • Reviewed-and-Tested-by: Jeff Layton
    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Pavel Shilovsky
     
  • Reviewed-and-Tested-by: Jeff Layton
    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Pavel Shilovsky
     
  • Move reading to a separate function and remove the csocket variable.

    Also change the semantics a little: goto incomplete_rcv only when
    we get -EAGAIN (or a similar error) while reading the rfc1002 header.
    In this case we don't check for echo timeout when we don't get the whole
    header at once, as was done before.

    Reviewed-and-Tested-by: Jeff Layton
    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Pavel Shilovsky
     
  • Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Introduce new helper functions which try kmalloc, and then fall back
    to vmalloc if necessary, and use them for allocating and deallocating
    s_flex_groups.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
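
The "try one allocator, fall back to another" pattern from this entry can be sketched in userspace. This is not the ext4 helper code: fast_alloc(), fallback_alloc(), kv_zalloc(), kv_free() and FAST_ALLOC_LIMIT are all hypothetical names, and both paths end up in malloc() so that the example runs anywhere. The sketch records which path was taken so that the free side could pick the matching deallocator.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed size limit of the "fast" allocator, for illustration only. */
#define FAST_ALLOC_LIMIT 4096

enum alloc_kind { ALLOC_FAST, ALLOC_FALLBACK };

struct kv_buf {
    void *ptr;
    enum alloc_kind kind;   /* which allocator produced ptr */
};

/* Stand-in for an allocator that refuses large requests. */
static void *fast_alloc(size_t size)
{
    return size <= FAST_ALLOC_LIMIT ? malloc(size) : NULL;
}

/* Stand-in for a slower allocator that accepts large requests. */
static void *fallback_alloc(size_t size)
{
    return malloc(size);
}

/* Try the fast path first, fall back if it fails, and zero the result. */
static struct kv_buf kv_zalloc(size_t size)
{
    struct kv_buf b = { fast_alloc(size), ALLOC_FAST };

    if (!b.ptr) {
        b.ptr = fallback_alloc(size);
        b.kind = ALLOC_FALLBACK;
    }
    if (b.ptr)
        memset(b.ptr, 0, size);
    return b;
}

/* Both stand-ins use malloc(), so free() is correct for either kind here. */
static void kv_free(struct kv_buf *b)
{
    free(b->ptr);
    b->ptr = NULL;
}
```

A caller only sees kv_zalloc() and kv_free(), so the choice of allocator stays an internal detail of the helpers.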
     
  • Reviewed-and-Tested-by: Jeff Layton
    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Pavel Shilovsky
     
  • This patch makes ext4_init_inode_table() handle errors correctly.
    On its error paths, ext4_init_inode_table() should up_write() the
    alloc_sem it has down_write()ed and stop the journal handle it started.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
     
  • Commit 4e34e719e45, which moved the ACL checks to common code,
    accidentally broke the build when CONFIG_FS_POSIX_ACL is not set:

    CC fs/xfs/linux-2.6/xfs_iops.o
    fs/xfs/linux-2.6/xfs_iops.c:1025:14: error: ‘xfs_get_acl’ undeclared here (not in a function)

    Fix this by declaring xfs_get_acl a static inline function.

    Signed-off-by: Markus Trippelsdorf
    Signed-off-by: Al Viro

    Markus Trippelsdorf
     
  • Reorganise shrink_dcache_for_umount_subtree() in light of the demise of
    dcache_lock. Without that dcache_lock, there is no need for the batching of
    removal of dentries from the system under it (we wanted to make intensive use
    of the locked data whilst we held it, but didn't want to hold it for long at a
    time).

    This works, provided the preceding patch is correct in its removal of locking
    on dentry->d_lock on the basis that no one should be locking these dentries any
    more as the whole superblock is defunct.

    With this patch, the calls to dentry_lru_del() and __d_shrink() are placed at
    the point where each dentry is detached.

    It is possible that, as an alternative, the batching should still be done -
    but only for dentry_lru_del() of all a dentry's children in one go. In such a
    case, the batching would be done under dcache_lru_lock.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Locks of the dcache_lock were replaced by locks of dentry->d_lock in commits
    such as:

    2304450783dfde7b0b94ae234edd0dbffa865073
    2fd6b7f50797f2e993eea59e0a0b8c6399c811dc

    as part of the RCU-based pathwalk changes, despite the fact that the caller
    (shrink_dcache_for_umount()) notes in the banner comment the reasons that
    d_lock is not necessary in these functions:

    /*
     * destroy the dentries attached to a superblock on unmounting
     * - we don't need to use dentry->d_lock because:
     *   - the superblock is detached from all mountings and open files, so the
     *     dentry trees will not be rearranged by the VFS
     *   - s_umount is write-locked, so the memory pressure shrinker will ignore
     *     any dentries belonging to this superblock that it comes across
     *   - the filesystem itself is no longer permitted to rearrange the dentries
     *     in this superblock
     */

    So remove these locks. If the locks are actually necessary, then this banner
    comment should be altered instead.

    The hash table chains are protected by 1-bit locks in the hash table heads, so
    those shouldn't be a problem.

    Note that to make this work, __d_drop() has to be split so that the RCUwalk
    barrier can be avoided. This causes problems otherwise as it has an assertion
    that dentry->d_lock is locked - but there is no need for that as no one else
    can be trying to access this dentry, except to step over it (and that should
    be handled by d_free(), I think).

    Signed-off-by: David Howells
    Cc: Nick Piggin
    Signed-off-by: Al Viro

    David Howells
     
  • Remove the detached-dentry counter from shrink_dcache_for_umount_subtree() as
    the value it computes is no longer used as of commit
    312d3ca856d369bb04d0443846b85b4cdde6fa8a which made the nr_dentry counters
    summed per-CPU rather than global atomic.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • again, that's what all callers pass to it

    Signed-off-by: Al Viro

    Al Viro
     
  • ... seeing that this is what all callers pass to it anyway.

    Signed-off-by: Al Viro

    Al Viro
     
  • ... so that &inode->i_mode could be passed to it

    Signed-off-by: Al Viro

    Al Viro
     
  • so we can pass &inode->i_mode to it

    Signed-off-by: Al Viro

    Al Viro
     
  • bd_super is currently reset to NULL in kill_block_super() so we rely on previous
    users of the block_device object to initialise this value for the next user.
    This quirk was exposed on RHEL5 when a third party filesystem did not always use
    kill_block_super() and therefore bd_super wasn't being reset when a block_device
    object was recycled within the cache. This may not be a problem upstream, but
    it makes sense to be defensive.

    Signed-off-by: Lachlan McIlroy
    Reviewed-by: Eric Sandeen
    Signed-off-by: Al Viro

    Lachlan McIlroy
     
  • inode_lru_list_del() is expensive because of per-superblock lru locking,
    yet some inodes are not on the lru list at all.

    Adding a check in iput_final() can speed up pipe/socket workloads on
    SMP.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Al Viro

    Eric Dumazet
     
  • Some inodes (pipes, sockets, ...) are not hashed, so there is no need to
    take the contended inode_hash_lock at dismantle time.

    This gives a nice speedup on SMP machines for socket-intensive workloads.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Al Viro

    Eric Dumazet
     
  • Some inodes (pipes, sockets, ...) are not on the bdi writeback list.

    evict() can avoid calling inode_wb_list_del() and its expensive spinlock
    by first checking whether the inode's i_wb_list is empty.

    At this point, no other cpu/user can concurrently manipulate this inode's
    i_wb_list.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Al Viro

    Eric Dumazet
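
The "check for an empty list node before taking the lock" idea in this entry can be sketched in userspace. This is not the VFS code: struct object, object_teardown(), shared_lock() and the lock_acquisitions counter are hypothetical, and the lock is a counting stub rather than a real spinlock so that the example stays single-threaded and portable. The list helpers are loosely modelled on the kernel's circular list_head.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list node. */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *n) { n->next = n->prev = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

/* Insert n right after head. */
static void list_add(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n and leave it self-linked, i.e. empty again. */
static void list_del_init(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Stub for a contended lock; it only counts how often it is taken. */
static int lock_acquisitions;
static void shared_lock(void)   { lock_acquisitions++; }
static void shared_unlock(void) { }

/* Hypothetical object that may or may not sit on a shared list. */
struct object {
    struct list_node wb_link;
};

static void object_teardown(struct object *o)
{
    /*
     * The unlocked test is only valid because nothing else can reach
     * this object any more at teardown time.
     */
    if (list_is_empty(&o->wb_link))
        return;

    shared_lock();
    list_del_init(&o->wb_link);
    shared_unlock();
}
```

Objects that were never put on the list, like the pipes and sockets mentioned in the entry, skip the lock entirely.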
     
  • Autofs may set the DCACHE_NEED_AUTOMOUNT flag on negative dentries. These
    need attention from the automounter daemon regardless of the LOOKUP_FOLLOW flag.

    Signed-off-by: David Howells
    Acked-by: Ian Kent
    Signed-off-by: Al Viro

    David Howells
     
  • In btrfs we have 2 indexes for inodes. One is for readdir; it's in this nice
    sequential order and works out brilliantly for readdir. However, if you use ls,
    it usually stats each file it gets from readdir. This is where the second
    index comes in, which is based on a hash of the name of the file. So then the
    lookup has to look up this index, and then look up the inode. The index lookup
    is going to be in random order (since it's based on the name hash), which gives
    us less than stellar performance. Since we know the inode location from the
    readdir index, I create a dummy dentry and copy the location key into
    dentry->d_fsdata. Then on lookup, if we have d_fsdata, we use that location to
    look up the inode, avoiding looking up the other directory index. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Al Viro

    Josef Bacik
     
  • Fix two recently introduced compile problems:

    Fix a typo in fs/nfs/pnfs.h

    Move the pnfs_blksize declaration outside the CONFIG_NFS_V4 section in
    struct nfs_server.

    Reported-by: Jens Axboe
    Signed-off-by: Trond Myklebust
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     
  • Reported-and-acked-by: David Howells
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • Currently, we take a sb->s_active reference and a cifsFileInfo reference
    when an oplock break workqueue job is queued. This is unnecessary and
    more complicated than it needs to be. Also as Al points out,
    deactivate_super has non-trivial locking implications so it's best to
    avoid that if we can.

    Instead, just cancel any pending oplock breaks for this filehandle
    synchronously in cifsFileInfo_put after taking it off the lists.
    That should ensure that this job doesn't outlive the structures it
    depends on.

    Reported-by: Al Viro
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • The recent fix to the above function causes this compiler warning to pop
    up on some gcc versions:

    CC [M] fs/cifs/cifssmb.o
    fs/cifs/cifssmb.c: In function ‘CIFSSMBQAllEAs’:
    fs/cifs/cifssmb.c:5708: warning: ‘ea_name_len’ may be used uninitialized in
    this function

    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • The code that matches EA names in CIFSSMBQAllEAs is incorrect. It
    uses strncmp to do the comparison with the length limited to the
    name_len sent in the response.

    Problem: Suppose we're looking for an attribute named "foobar" and
    have an attribute before it in the EA list named "foo". The
    comparison will succeed since we're only looking at the first 3
    characters. Fix this by also comparing the length of the provided
    ea_name with the name_len in the response. If they're not equal then
    it shouldn't match.

    Reported-by: Jian Li
    Signed-off-by: Jeff Layton
    Reviewed-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Jeff Layton
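
The prefix-match problem described in this entry is easy to reproduce in plain C. The two functions below are hypothetical illustrations, not the code in CIFSSMBQAllEAs: the first compares only as many bytes as the entry's name length, the second also requires the lengths to agree.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Length-limited compare only: an entry whose name is a prefix of the
 * wanted name is wrongly reported as a match.
 */
static bool ea_match_buggy(const char *wanted, const char *entry,
                           size_t entry_len)
{
    return strncmp(wanted, entry, entry_len) == 0;
}

/* Require equal lengths as well, so "foo" no longer matches "foobar". */
static bool ea_match_fixed(const char *wanted, const char *entry,
                           size_t entry_len)
{
    return strlen(wanted) == entry_len &&
           strncmp(wanted, entry, entry_len) == 0;
}
```

Looking for "foobar" against an entry named "foo" with a length of 3 matches under the first function and is rejected by the second.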
     
  • Sniffing traffic on the wire shows that Windows clients send a zeroed
    out signature field in a NEGOTIATE request, and send "BSRSPYL" in the
    signature field during SESSION_SETUP. Make the cifs client behave the
    same way.

    It doesn't seem to make much difference in any server that I've tested
    against, but it's probably best to follow Windows behavior as closely as
    possible here.

    Signed-off-by: Jeff Layton
    Reviewed-by: Shirish Pargaonkar
    Signed-off-by: Steve French

    Jeff Layton
     
  • ...and remove some obsolete comments.

    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • Currently, we mirror the same size back to the server that it sends us.
    That makes little sense. Instead we should be sending the server the
    maximum buffer size that we can handle -- CIFSMaxBufSize minus the
    4 byte RFC1001 header.

    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • * 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (28 commits)
    pnfsblock: write_pagelist handle zero invalid extents
    pnfsblock: note written INVAL areas for layoutcommit
    pnfsblock: bl_write_pagelist
    pnfsblock: bl_read_pagelist
    pnfsblock: cleanup_layoutcommit
    pnfsblock: encode_layoutcommit
    pnfsblock: merge rw extents
    pnfsblock: add extent manipulation functions
    pnfsblock: bl_find_get_extent
    pnfsblock: xdr decode pnfs_block_layout4
    pnfsblock: call and parse getdevicelist
    pnfsblock: merge extents
    pnfsblock: lseg alloc and free
    pnfsblock: remove device operations
    pnfsblock: add device operations
    pnfsblock: basic extent code
    pnfsblock: use pageio_ops api
    pnfsblock: add blocklayout Kconfig option, Makefile, and stubs
    pnfs: cleanup_layoutcommit
    pnfs: ask for layout_blksize and save it in nfs_server
    ...

    Linus Torvalds
     
  • For invalid extents, find other pages in the same fsblock and write them out.

    [pnfsblock: write_begin]
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Peng Tao
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Peng Tao
     
  • Signed-off-by: Peng Tao
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman