07 Oct, 2010

2 commits


04 Oct, 2010

2 commits

  • We currently use struct backing_dev_info for various different purposes.
    Originally it was introduced to describe a backing device which includes
    an unplug and congestion function and various bits of readahead information
    and VM-relevant flags. We're also using it for tracking dirty inodes for
    writeback.

    To make writeback properly find all inodes we need to only access the
    per-filesystem backing_device pointed to by the superblock in ->s_bdi
    inside the writeback code, and not the instances pointed to by
    inode->i_mapping->backing_dev which can be overridden by special devices
    or might not be set at all by some filesystems.

    Long term we should split out the writeback-relevant bits of struct
    backing_device_info (which includes more than the current bdi_writeback)
    and only point to it from the superblock while leaving the traditional
    backing device as a separate structure that can be overridden by devices.

    The one exception for now is the block device filesystem which really
    wants different writeback contexts for its different (internal) inodes
    to handle the writeout more efficiently. For now we do this with
    a hack in fs-writeback.c because we're so late in the cycle, but in
    the future I plan to replace this with a superblock method that allows
    for multiple writeback contexts per filesystem.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • fs/fuse/dev.c:1357: warning: ‘total_len’ may be used uninitialized in this
    function

    Initialize total_len to zero, else its value will be undefined.

    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Miklos Szeredi

    Geert Uytterhoeven
     

02 Oct, 2010

5 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    cifs: prevent infinite recursion in cifs_reconnect_tcon
    cifs: set backing_dev_info on new S_ISREG inodes

    Linus Torvalds
     
  • Avoid recursively locking the reiserfs lock in reiserfs_unpack()
    because we may call journal_begin() that requires the lock to be taken
    only once, otherwise it won't be able to release the lock while taking
    other mutexes, ending up in inverted dependencies between the journal
    mutex and the reiserfs lock for example.

    This fixes:

    =======================================================
    [ INFO: possible circular locking dependency detected ]
    2.6.35.4.4a #3
    -------------------------------------------------------
    lilo/1620 is trying to acquire lock:
    (&journal->j_mutex){+.+...}, at: [] do_journal_begin_r+0x7f/0x340 [reiserfs]

    but task is already holding lock:
    (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28/0x40 [reiserfs]

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
    [] lock_acquire+0x67/0x80
    [] __mutex_lock_common+0x4d/0x410
    [] mutex_lock_nested+0x18/0x20
    [] reiserfs_write_lock+0x28/0x40 [reiserfs]
    [] do_journal_begin_r+0x86/0x340 [reiserfs]
    [] journal_begin+0x77/0x140 [reiserfs]
    [] reiserfs_remount+0x224/0x530 [reiserfs]
    [] do_remount_sb+0x60/0x110
    [] do_mount+0x625/0x790
    [] sys_mount+0x84/0xb0
    [] syscall_call+0x7/0xb

    -> #0 (&journal->j_mutex){+.+...}:
    [] __lock_acquire+0x1026/0x1180
    [] lock_acquire+0x67/0x80
    [] __mutex_lock_common+0x4d/0x410
    [] mutex_lock_nested+0x18/0x20
    [] do_journal_begin_r+0x7f/0x340 [reiserfs]
    [] journal_begin+0x77/0x140 [reiserfs]
    [] reiserfs_persistent_transaction+0x41/0x90 [reiserfs]
    [] reiserfs_get_block+0x22c/0x1530 [reiserfs]
    [] __block_prepare_write+0x1bb/0x3a0
    [] block_prepare_write+0x26/0x40
    [] reiserfs_prepare_write+0x88/0x170 [reiserfs]
    [] reiserfs_unpack+0xe6/0x120 [reiserfs]
    [] reiserfs_ioctl+0x272/0x320 [reiserfs]
    [] vfs_ioctl+0x28/0xa0
    [] do_vfs_ioctl+0x32d/0x5c0
    [] sys_ioctl+0x63/0x70
    [] syscall_call+0x7/0xb

    other info that might help us debug this:

    2 locks held by lilo/1620:
    #0: (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [] reiserfs_unpack+0x6a/0x120 [reiserfs]
    #1: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28/0x40 [reiserfs]

    stack backtrace:
    Pid: 1620, comm: lilo Not tainted 2.6.35.4.4a #3
    Call Trace:
    [] __lock_acquire+0x1026/0x1180
    [] lock_acquire+0x67/0x80
    [] __mutex_lock_common+0x4d/0x410
    [] mutex_lock_nested+0x18/0x20
    [] do_journal_begin_r+0x7f/0x340 [reiserfs]
    [] journal_begin+0x77/0x140 [reiserfs]
    [] reiserfs_persistent_transaction+0x41/0x90 [reiserfs]
    [] reiserfs_get_block+0x22c/0x1530 [reiserfs]
    [] __block_prepare_write+0x1bb/0x3a0
    [] block_prepare_write+0x26/0x40
    [] reiserfs_prepare_write+0x88/0x170 [reiserfs]
    [] reiserfs_unpack+0xe6/0x120 [reiserfs]
    [] reiserfs_ioctl+0x272/0x320 [reiserfs]
    [] vfs_ioctl+0x28/0xa0
    [] do_vfs_ioctl+0x32d/0x5c0
    [] sys_ioctl+0x63/0x70
    [] syscall_call+0x7/0xb

    Reported-by: Jarek Poplawski
    Tested-by: Jarek Poplawski
    Signed-off-by: Frederic Weisbecker
    Cc: Jeff Mahoney
    Cc: All since 2.6.32
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • The reiserfs mutex already depends on the inode mutex, so we can't lock
    the inode mutex in reiserfs_unpack() without using the safe locking API,
    because reiserfs_unpack() is always called with the reiserfs mutex locked.

    This fixes:

    =======================================================
    [ INFO: possible circular locking dependency detected ]
    2.6.35c #13
    -------------------------------------------------------
    lilo/1606 is trying to acquire lock:
    (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [] reiserfs_unpack+0x60/0x110 [reiserfs]

    but task is already holding lock:
    (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28/0x40 [reiserfs]

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
    [] lock_acquire+0x67/0x80
    [] __mutex_lock_common+0x4d/0x410
    [] mutex_lock_nested+0x18/0x20
    [] reiserfs_write_lock+0x28/0x40 [reiserfs]
    [] reiserfs_lookup_privroot+0x2a/0x90 [reiserfs]
    [] reiserfs_fill_super+0x941/0xe60 [reiserfs]
    [] get_sb_bdev+0x117/0x170
    [] get_super_block+0x21/0x30 [reiserfs]
    [] vfs_kern_mount+0x6a/0x1b0
    [] do_kern_mount+0x39/0xe0
    [] do_mount+0x340/0x790
    [] sys_mount+0x84/0xb0
    [] syscall_call+0x7/0xb

    -> #0 (&sb->s_type->i_mutex_key#8){+.+.+.}:
    [] __lock_acquire+0x1026/0x1180
    [] lock_acquire+0x67/0x80
    [] __mutex_lock_common+0x4d/0x410
    [] mutex_lock_nested+0x18/0x20
    [] reiserfs_unpack+0x60/0x110 [reiserfs]
    [] reiserfs_ioctl+0x272/0x320 [reiserfs]
    [] vfs_ioctl+0x28/0xa0
    [] do_vfs_ioctl+0x32d/0x5c0
    [] sys_ioctl+0x63/0x70
    [] syscall_call+0x7/0xb

    other info that might help us debug this:

    1 lock held by lilo/1606:
    #0: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28/0x40 [reiserfs]

    stack backtrace:
    Pid: 1606, comm: lilo Not tainted 2.6.35c #13
    Call Trace:
    [] __lock_acquire+0x1026/0x1180
    [] lock_acquire+0x67/0x80
    [] __mutex_lock_common+0x4d/0x410
    [] mutex_lock_nested+0x18/0x20
    [] reiserfs_unpack+0x60/0x110 [reiserfs]
    [] reiserfs_ioctl+0x272/0x320 [reiserfs]
    [] vfs_ioctl+0x28/0xa0
    [] do_vfs_ioctl+0x32d/0x5c0
    [] sys_ioctl+0x63/0x70
    [] syscall_call+0x7/0xb

    Reported-by: Jarek Poplawski
    Tested-by: Jarek Poplawski
    Signed-off-by: Frederic Weisbecker
    Cc: Jeff Mahoney
    Cc: [2.6.32 and later]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • Having the limits file world readable will ease the task of system
    management on systems where root privileges might be restricted.

    An admin whose root privileges are restricted could not otherwise check
    other users' process limits.

    Also it'd align with most of the /proc stat files.

    Signed-off-by: Jiri Olsa
    Acked-by: Neil Horman
    Cc: Eugene Teo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Olsa
     
  • cifs_reconnect_tcon is called from smb_init. After a successful
    reconnect, cifs_reconnect_tcon will call reset_cifs_unix_caps. That
    function will, in turn call CIFSSMBQFSUnixInfo and CIFSSMBSetFSUnixInfo.
    Those functions also call smb_init.

    It's possible for the session and tcon reconnect to succeed, and then
    for another cifs_reconnect to occur before CIFSSMBQFSUnixInfo or
    CIFSSMBSetFSUnixInfo is called. That'll cause those functions to call
    smb_init and cifs_reconnect_tcon again, ad infinitum...

    Break the infinite recursion by having those functions use a new
    smb_init variant that doesn't attempt to perform a reconnect.

    Reported-and-Tested-by: Michal Suchanek
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
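
A minimal userspace sketch of the recursion break described above. All names here (struct conn, request_init, request_init_no_reconnect, query_caps) are hypothetical stand-ins, not the cifs functions; the link is simulated as dropping again right after every reconnect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a connection that may need re-establishing. */
struct conn {
    bool need_reconnect;  /* set when the transport dropped */
    int reconnect_calls;  /* how many times reconnect() did real work */
};

static int reconnect(struct conn *c);

/* Normal request setup: may trigger a reconnect first. */
static int request_init(struct conn *c)
{
    return reconnect(c);
}

/* Variant for requests issued from inside reconnect(): never reconnects. */
static int request_init_no_reconnect(struct conn *c)
{
    (void)c;
    return 0;
}

/* Stand-in for the capability queries that run after a reconnect. */
static int query_caps(struct conn *c)
{
    /* Calling request_init() here instead would recurse without bound. */
    return request_init_no_reconnect(c);
}

static int reconnect(struct conn *c)
{
    assert(c != 0);
    if (!c->need_reconnect)
        return 0;
    c->reconnect_calls++;
    /* Simulate the link dropping again immediately: the flag stays set. */
    return query_caps(c);
}
```

With need_reconnect set, request_init() runs reconnect() exactly once and returns, even though the flag never clears.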
     

30 Sep, 2010

3 commits


29 Sep, 2010

1 commit

  • I have been seeing occasional pauses in transaction throughput up to
    30s long under heavy parallel workloads. The only notable thing was
    that the xfsaild was trying to be active during the pauses, but
    making no progress. It was running exactly 20 times a second (on the
    50ms no-progress backoff), and the number of pushbuf events was
    constant across this time as well. IOWs, the xfsaild appeared to be
    stuck on buffers that it could not push out.

    Further investigation indicated that it was trying to push out inode
    buffers that were pinned and/or locked. The xfsbufd was also getting
    woken at the same frequency (by the xfsaild, no doubt) to push out
    delayed write buffers. The xfsbufd was not making any progress
    because all the buffers in the delwri queue were pinned. This
    scan-and-make-no-progress dance went on in the trace for some seconds,
    before the xfssyncd came along and issued a log force, and then
    things started going again.

    However, I noticed something strange about the log force - there
    were way too many IO's issued. 516 log buffers were written, to be
    exact. That added up to 129MB of log IO, which got me very
    interested because it's almost exactly 25% of the size of the log.
    The delayed logging code is supposed to aggregate the minimum of 25%
    of the log or 8MB worth of changes before flushing. That's what
    really puzzled me - why did a log force write 129MB instead of only
    8MB?

    Essentially what has happened is that no CIL pushes had occurred
    since the previous tail push which cleared out 25% of the log space.
    That caused all the new transactions to block because there wasn't
    log space for them, but they kick the xfsaild to push the tail.
    However, the xfsaild was not making progress because there were
    buffers it could not lock and flush, and the xfsbufd could not flush
    them because they were pinned. As a result, both the xfsaild and the
    xfsbufd could not move the tail of the log forward without the CIL
    first committing.

    The cause of the problem was that the background CIL push, which
    should happen when 8MB of aggregated changes have been committed, is
    being held off by the concurrent transaction commit load. The
    background push does a down_write_trylock() which will fail if there
    is a concurrent transaction commit holding the push lock in read
    mode. With 8 CPUs all doing transactions as fast as they can, there
    were enough concurrent transaction commits to hold off the background
    push until tail-pushing could no longer free log space, and the halt
    would occur.

    It should be noted that there is no reason why it would halt at 25%
    of log space used by a single CIL checkpoint. This bug could
    definitely violate the "no transaction should be larger than half
    the log" requirement and hence result in corruption if the system
    crashed under heavy load. This sort of bug is exactly the reason why
    delayed logging was tagged as experimental....

    The fix is to start blocking background pushes once the threshold
    has been exceeded. Rework the threshold calculations to keep the
    amount of log space a CIL checkpoint can use to below that of the
    AIL push threshold to avoid the problem completely.

    Signed-off-by: Dave Chinner
    Reviewed-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Dave Chinner
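
A rough model of why a try-lock-only background push can starve, and what changes when the push waits once the threshold is exceeded. This is a sketch under assumptions: struct cil, its fields and the result codes are hypothetical, the lock is modelled as a plain reader count, and "waiting" is represented by a return code rather than real blocking.

```c
#include <assert.h>

/* Hypothetical model of the aggregated commit list state. */
struct cil {
    long space_used;   /* bytes of changes aggregated so far */
    long space_limit;  /* background push threshold */
    int readers;       /* committers currently holding the lock shared */
};

enum push_result { PUSH_NOT_NEEDED, PUSH_SKIPPED, PUSH_MUST_WAIT, PUSH_DONE };

/* Old behaviour: try-lock only, so any concurrent committer defers the push. */
static enum push_result background_push_trylock(struct cil *c)
{
    assert(c->space_limit > 0);
    if (c->space_used < c->space_limit)
        return PUSH_NOT_NEEDED;
    if (c->readers > 0)
        return PUSH_SKIPPED;      /* gives up; space_used keeps growing */
    c->space_used = 0;
    return PUSH_DONE;
}

/* Described fix: over the threshold, the push waits for the lock instead. */
static enum push_result background_push_blocking(struct cil *c)
{
    assert(c->space_limit > 0);
    if (c->space_used < c->space_limit)
        return PUSH_NOT_NEEDED;
    if (c->readers > 0)
        return PUSH_MUST_WAIT;    /* a real lock would block new committers */
    c->space_used = 0;
    return PUSH_DONE;
}
```

In the try-lock variant a steady stream of committers means the push is skipped every time; in the blocking variant the push is guaranteed to run as soon as the current committers drain.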
     

25 Sep, 2010

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    o2dlm: force free mles during dlm exit
    ocfs2: Sync inode flags with ext2.
    ocfs2: Move 'wanted' into parens of ocfs2_resmap_resv_bits.
    ocfs2: Use cpu_to_le16 for e_leaf_clusters in ocfs2_bg_discontig_add_extent.
    ocfs2: update ctime when changing the file's permission by setfacl
    ocfs2/net: fix uninitialized ret in o2net_send_message_vec()
    Ocfs2: Handle empty list in lockres_seq_start() for dlmdebug.c
    Ocfs2: Re-access the journal after ocfs2_insert_extent() in dxdir codes.
    ocfs2: Fix lockdep warning in reflink.
    ocfs2/lockdep: Move ip_xattr_sem out of ocfs2_xattr_get_nolock.

    Linus Torvalds
     

24 Sep, 2010

5 commits

  • While umounting, a block mle doesn't get freed if dlm is shut down after
    a master request is received but before the assert master. This results
    in an unclean shutdown of the dlm domain.

    This patch frees all mles that lie around after other nodes were notified about
    exiting the dlm and marking dlm state as leaving. Only block mles are expected
    to be around, so we log ERROR for other mles but still free them.

    Signed-off-by: Srinivas Eeda
    Signed-off-by: Joel Becker

    Srinivas Eeda
     
  • We sync our inode flags with ext2 and define them by hex
    values. But in commit 3669567 (4 years ago), all
    these values were moved to include/linux/fs.h. So we'd
    better use them the same way ext2 does: sync our inode
    flags with ext2 by using FS_*.

    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
     
  • The first time I read the function ocfs2_resmap_resv_bits, I wondered
    what 'wanted' was used for and puzzled over the comments.
    Then I found it is only used if the reservation is empty. ;)

    So we'd better move it into the parens so that the code is more
    readable. What's more, ocfs2_resmap_resv_bits is called so frequently
    that we should save some CPU cycles.

    Acked-by: Mark Fasheh
    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
     
  • e_leaf_clusters is a le16, so use cpu_to_le16 instead
    of cpu_to_le32.

    What's more, we change 'clusters' to unsigned int to
    signify that the size of 'clusters' isn't important here.

    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
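
A small illustration of why the conversion width matters. On a little-endian host both conversions are no-ops, so the mistake goes unnoticed; the sketch below simulates a big-endian host with explicit byte swaps. The helper names (swab16_sketch, swab32_sketch, store_wrong, store_right) are hypothetical and only stand in for the kernel macros.

```c
#include <assert.h>
#include <stdint.h>

/* Byte swaps standing in for cpu_to_le16/cpu_to_le32 on a big-endian host. */
static uint16_t swab16_sketch(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

static uint32_t swab32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Wrong: a 32-bit conversion truncated into a 16-bit on-disk field. */
static uint16_t store_wrong(uint32_t clusters)
{
    return (uint16_t)swab32_sketch(clusters);
}

/* Right: convert at the width of the field. */
static uint16_t store_right(uint32_t clusters)
{
    assert(clusters <= UINT16_MAX);
    return swab16_sketch((uint16_t)clusters);
}
```

For a cluster count of 0x12, the 32-bit swap moves the value into the top byte of the word, so the truncated 16-bit field ends up as zero, while the 16-bit swap round-trips correctly.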
     
  • ext3 fixed the missing ctime update when setfacl changes a file's
    permissions in commit 30e2bab. So change it accordingly in ocfs2.

    Steps to reproduce:
    # touch aaa
    # stat -c %Z aaa
    1283760364
    # setfacl -m 'u::x,g::x,o::x' aaa
    # stat -c %Z aaa
    1283760364

    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
     

23 Sep, 2010

5 commits

  • Currently, /proc/<pid>/smaps has wrong dirty pages accounting.
    Shared_Dirty and Private_Dirty count only pte-dirty pages and ignore the
    PG_dirty page flag. This differs from the documentation and is also
    inconsistent with the Referenced field, which checks both the pte and
    the page flags.

    This patch fixes it.

    Test program:

    large-array.c
    ---------------------------------------------------
    #include <string.h>   /* memset() */
    #include <unistd.h>   /* pause() */

    char array[1*1024*1024*1024L];

    int main(void)
    {
        memset(array, 1, sizeof(array));
        pause();

        return 0;
    }
    ---------------------------------------------------

    Test case:
    1. run ./large-array
    2. cat /proc/`pidof large-array`/smaps
    3. swapoff -a
    4. cat /proc/`pidof large-array`/smaps again

    Test result:

    00601000-40601000 rw-p 00000000 00:00 0
    Size: 1048576 kB
    Rss: 1048576 kB
    Pss: 1048576 kB
    Shared_Clean: 0 kB
    Shared_Dirty: 0 kB
    Private_Clean: 218992 kB

    00601000-40601000 rw-p 00000000 00:00 0
    Size: 1048576 kB
    Rss: 1048576 kB
    Pss: 1048576 kB
    Shared_Clean: 0 kB
    Shared_Dirty: 0 kB
    Private_Clean: 0 kB
    Private_Dirty: 1048576 kB

    Acked-by: Hugh Dickins
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
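
The accounting difference can be sketched as two counting rules over the same pages. The struct and function names below are hypothetical; the point is only that a page with a clean pte but a set PG_dirty flag is missed by the first rule and counted by the second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page view: the two places a page can be marked dirty. */
struct page_sample {
    bool pte_dirty;  /* dirty bit in the page table entry */
    bool pg_dirty;   /* PG_dirty flag on the page itself */
};

/* Old rule: only the pte dirty bit is consulted. */
static size_t count_dirty_pte_only(const struct page_sample *p, size_t n)
{
    size_t dirty = 0;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (p[i].pte_dirty)
            dirty++;
    return dirty;
}

/* Described fix: a page is dirty if either the pte or the page flag says so. */
static size_t count_dirty_pte_or_flag(const struct page_sample *p, size_t n)
{
    size_t dirty = 0;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (p[i].pte_dirty || p[i].pg_dirty)
            dirty++;
    return dirty;
}
```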
     
  • OCFS2 can return ERESTARTSYS from its write function when the process is
    signalled while waiting for a cluster lock (and the filesystem is mounted
    with intr mount option). Generally, it seems reasonable to allow
    filesystems to return this error code from their IO functions. As we must
    not leak ERESTARTSYS (and similar error codes) to userspace as a result of
    an AIO operation, we have to properly convert it to EINTR inside AIO code
    (restarting the syscall isn't really an option because other AIO could
    have been already submitted by the same io_submit syscall).

    Signed-off-by: Jan Kara
    Reviewed-by: Jeff Moyer
    Cc: Christoph Hellwig
    Cc: Zach Brown
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
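
A sketch of the conversion described above, written as a standalone helper. The helper name aio_sanitize_result is hypothetical. The restart codes are kernel-internal and not defined by userspace <errno.h>, so they are redefined locally here with the values used in include/linux/errno.h.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal restart codes, redefined locally for this sketch. */
enum {
    K_ERESTARTSYS = 512,
    K_ERESTARTNOINTR = 513,
    K_ERESTARTNOHAND = 514,
    K_ERESTART_RESTARTBLOCK = 516,
};

/*
 * Hypothetical helper: sanitize an I/O result before it is reported as an
 * AIO completion. Restart codes become -EINTR; anything else passes through.
 */
static long aio_sanitize_result(long ret)
{
    switch (ret) {
    case -K_ERESTARTSYS:
    case -K_ERESTARTNOINTR:
    case -K_ERESTARTNOHAND:
    case -K_ERESTART_RESTARTBLOCK:
        return -EINTR;
    default:
        return ret;
    }
}
```

Byte counts and ordinary errors such as -EIO are returned unchanged; only the restart family is rewritten.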
     
  • Commit 73296bc611 ("procfs: Use generic_file_llseek in /proc/vmcore")
    broke seeking on /proc/vmcore. This changes it back to use default_llseek
    in order to restore the original behaviour.

    The problem with generic_file_llseek is that it only allows seeks up to
    inode->i_sb->s_maxbytes, which is zero on procfs and some other virtual
    file systems. We should merge generic_file_llseek and default_llseek some
    day and clean this up in a proper way, but for 2.6.35/36, reverting vmcore
    is the safer solution.

    Signed-off-by: Arnd Bergmann
    Cc: Frederic Weisbecker
    Reported-by: CAI Qian
    Tested-by: CAI Qian
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
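
The seek limit problem can be shown with two simplified SEEK_SET handlers. These are not the kernel implementations; maxbytes stands in for inode->i_sb->s_maxbytes, and the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Simplified SEEK_SET with an upper bound, as a bounded llseek would apply. */
static long long seek_with_limit(long long offset, long long maxbytes)
{
    assert(maxbytes >= 0);
    if (offset < 0 || offset > maxbytes)
        return -EINVAL;
    return offset;
}

/* Simplified SEEK_SET without an upper bound. */
static long long seek_without_limit(long long offset)
{
    if (offset < 0)
        return -EINVAL;
    return offset;
}
```

With maxbytes equal to zero, as the message says is the case on procfs, the bounded variant rejects every offset except 0, while the unbounded variant accepts any non-negative offset.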
     
  • In 32-bit compatibility mode, the error handling for
    compat_do_readv_writev() may free an uninitialized pointer, potentially
    leading to all sorts of ugly memory corruption. This is reliably
    triggerable by unprivileged users by invoking the readv()/writev()
    syscalls with an invalid iovec pointer. The below patch fixes this to
    emulate the non-compat version.

    Introduced by commit b83733639a49 ("compat: factor out
    compat_rw_copy_check_uvector from compat_do_readv_writev")

    Signed-off-by: Dan Rosenberg
    Cc: stable@kernel.org (2.6.35)
    Cc: Al Viro
    Signed-off-by: Linus Torvalds

    Dan Rosenberg
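
A userspace sketch of the error-path pattern at issue: a pointer that may or may not be replaced by a heap allocation must start out pointing at the on-stack array, so the cleanup test is valid even when the copy step fails early. All names (struct seg, copy_check, do_rw, FAST_IOV) are hypothetical stand-ins, with malloc/free in place of the kernel allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define FAST_IOV 8

struct seg { void *base; size_t len; };

/*
 * Hypothetical copy/check step. Fails on a NULL source before touching *out.
 * On success *out points either at 'fast' or at heap storage.
 */
static long copy_check(const struct seg *user, size_t n,
                       struct seg *fast, struct seg **out)
{
    struct seg *dst = fast;

    if (user == NULL)
        return -EFAULT;
    if (n > FAST_IOV) {
        dst = malloc(n * sizeof(*dst));
        if (dst == NULL)
            return -ENOMEM;
    }
    for (size_t i = 0; i < n; i++)
        dst[i] = user[i];
    *out = dst;
    return (long)n;
}

static long do_rw(const struct seg *user, size_t n)
{
    struct seg stack[FAST_IOV];
    struct seg *iov = stack;  /* initialized before any error path can run */
    long ret = copy_check(user, n, stack, &iov);

    /* ... the actual I/O would happen here on success ... */

    assert(iov != NULL);
    if (iov != stack)         /* safe even when copy_check() failed early */
        free(iov);
    return ret;
}
```

Had iov been left uninitialized, the early -EFAULT return would reach the cleanup with a garbage pointer and hand it to free().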
     
  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends
    char: Mark /dev/zero and /dev/kmem as not capable of writeback
    bdi: Initialize noop_backing_dev_info properly
    cfq-iosched: fix a kernel OOPs when usb key is inserted
    block: fix blk_rq_map_kern bio direction flag
    cciss: freeing uninitialized data on error path

    Linus Torvalds
     

22 Sep, 2010

3 commits

  • Inodes of devices such as /dev/zero can get dirty for example via
    utime(2) syscall or due to atime update. Backing device of such inodes
    (zero_bdi, etc.) is however unable to handle dirty inodes and thus
    __mark_inode_dirty complains. In fact, the inode should rather be dirtied
    against the backing device of the filesystem holding it. This is generally a
    good rule except for filesystems such as 'bdev' or 'mtd_inodefs'. Inodes
    in these pseudofilesystems are referenced from ordinary filesystem
    inodes and carry mapping with real data of the device. Thus for these
    inodes we have to use inode->i_mapping->backing_dev_info as we did so
    far. We distinguish these filesystems by checking whether sb->s_bdi
    points to a non-trivial backing device or not.

    Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /.
    There's a device inode A described by a path "/dev/sdb" on this
    filesystem. This inode will be dirtied against backing device "8:0"
    after this patch. bdev filesystem contains block device inode B coupled
    with our inode A. When someone modifies a page of /dev/sdb, it's B that
    gets dirtied and the dirtying happens against the backing device "8:16".
    Thus both inodes get filed to a correct bdi list.

    Cc: stable@kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Jens Axboe

    Jan Kara
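
The selection rule and the /dev/sdb example above can be sketched in userspace as follows. The structures are stripped-down, hypothetical stand-ins for the kernel ones, and inode_to_bdi_sketch is an illustrative name, not a claim about the exact kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct bdi { const char *name; };
struct super { struct bdi *s_bdi; };
struct mapping { struct bdi *backing_dev_info; };
struct inode { struct super *i_sb; struct mapping *i_mapping; };

/* Stand-in for the trivial backing device used by pseudo filesystems. */
static struct bdi noop_bdi = { "noop" };

/*
 * Use the superblock's bdi when it is a real one; otherwise fall back to
 * the bdi of the inode's mapping (the 'bdev'-style case).
 */
static struct bdi *inode_to_bdi_sketch(const struct inode *inode)
{
    struct bdi *sb_bdi;

    assert(inode != NULL && inode->i_sb != NULL && inode->i_mapping != NULL);
    sb_bdi = inode->i_sb->s_bdi;
    if (sb_bdi != NULL && sb_bdi != &noop_bdi)
        return sb_bdi;
    return inode->i_mapping->backing_dev_info;
}
```

Modelling inode A (the device node on ext3) and inode B (the bdev inode) with these structures, A resolves to the "8:0" bdi of the filesystem and B to the "8:16" bdi of its mapping.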
     
  • These devices don't do any writeback, but their device inodes can still get
    dirty. Mark the bdi appropriately so that the bdi code does the right thing
    and files the inodes on the lists of the bdi carrying the device inodes.

    Cc: stable@kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Jens Axboe

    Jan Kara
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    ceph: select CRYPTO
    ceph: check mapping to determine if FILE_CACHE cap is used
    ceph: only send one flushsnap per cap_snap per mds session
    ceph: fix cap_snap and realm split
    ceph: stop sending FLUSHSNAPs when we hit a dirty capsnap
    ceph: correctly set 'follows' in flushsnap messages
    ceph: fix dn offset during readdir_prepopulate
    ceph: fix file offset wrapping at 4GB on 32-bit archs
    ceph: fix reconnect encoding for old servers
    ceph: fix pagelist kunmap tail
    ceph: fix null pointer deref on anon root dentry release

    Linus Torvalds
     

20 Sep, 2010

1 commit

  • Coda's REQ_* defines were renamed to avoid clashes with the block layer
    (commit 4aeefdc69f7b: "coda: fixup clash with block layer REQ_*
    defines").

    However, one was missed, so response messages are no longer matched with
    requests and waiting threads are no longer woken up. This patch fixes
    this.

    Signed-off-by: Jan Harkes
    [ Also fixed up whitespace while at it -Linus ]
    Signed-off-by: Linus Torvalds

    Jan Harkes
     

18 Sep, 2010

3 commits


17 Sep, 2010

4 commits


15 Sep, 2010

5 commits

  • * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
    SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies
    statfs() gives ESTALE error
    NFS: Fix a typo in nfs_sockaddr_match_ipaddr6
    sunrpc: increase MAX_HASHTABLE_BITS to 14
    gss:spkm3 miss returning error to caller when import security context
    gss:krb5 miss returning error to caller when import security context
    Remove incorrect do_vfs_lock message
    SUNRPC: cleanup state-machine ordering
    SUNRPC: Fix a race in rpc_info_open
    SUNRPC: Fix race corrupting rpc upcall
    Fix null dereference in call_allocate

    Linus Torvalds
     
  • Tavis Ormandy pointed out that do_io_submit does not do proper bounds
    checking on the passed-in iocb array:

           if (unlikely(nr < 0))
                   return -EINVAL;

           if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp)))))
                   return -EFAULT;                      ^^^^^^^^^^^^^^^^^^

    The attached patch checks for overflow, and if it is detected, the
    number of iocbs submitted is scaled down to a number that will fit in
    the long.  This is an ok thing to do, as sys_io_submit is documented as
    returning the number of iocbs submitted, so callers should handle a
    return value of less than the 'nr' argument passed in.

    Reported-by: Tavis Ormandy
    Signed-off-by: Jeff Moyer
    Signed-off-by: Linus Torvalds

    Jeff Moyer
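
A minimal sketch of the scaling-down idea: clamp the element count so that the later multiplication by the element size cannot overflow a long. The helper name clamp_iocb_count is hypothetical, and -1 stands in for the -EINVAL return shown in the snippet above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: clamp a caller-supplied element count so that
 * nr * elem_size cannot exceed LONG_MAX. Returns -1 for a negative count.
 */
static long clamp_iocb_count(long nr, size_t elem_size)
{
    long max_nr;

    assert(elem_size > 0 && elem_size <= (size_t)LONG_MAX);

    if (nr < 0)
        return -1;

    max_nr = LONG_MAX / (long)elem_size;
    return nr > max_nr ? max_nr : nr;
}
```

A caller would then submit at most the clamped number of entries and report how many were actually submitted, which the message notes callers already have to handle.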
     
  • cifs_get_smb_ses must be called on a server pointer on which it holds an
    active reference. It first does a search for an existing SMB session. If
    it finds one, it'll put the server reference and then try to ensure that
    the negprot is done, etc.

    If it encounters an error at that point then it'll return an error.
    There's a potential problem here though. When cifs_get_smb_ses returns
    an error, the caller will also put the TCP server reference leading to a
    double-put.

    Fix this by having cifs_get_smb_ses only put the server reference if
    it found an existing session that it could use and isn't returning an
    error.

    Cc: stable@kernel.org
    Reviewed-by: Suresh Jayaraman
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
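
The reference ownership rule in the fix can be modelled with a plain counter. This is a sketch under assumptions: the structures and function names are hypothetical, only the "existing session found" path is modelled, and session creation is out of scope.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct server { int refs; };
struct session { bool usable; };

static void server_put(struct server *s)
{
    assert(s->refs > 0);
    s->refs--;
}

/*
 * The caller holds one server reference. If the existing session is usable,
 * that reference is dropped here. On error nothing is dropped, because the
 * caller's error path will do the put.
 */
static int get_session(struct server *srv, struct session *existing)
{
    if (!existing->usable)
        return -EIO;          /* error: leave the reference to the caller */
    server_put(srv);
    return 0;
}

static int mount_like_caller(struct server *srv, struct session *existing)
{
    int rc = get_session(srv, existing);

    if (rc)
        server_put(srv);      /* the caller's error path owns the put */
    return rc;
}
```

Either way exactly one reference is dropped per call; putting the reference inside get_session() before the error return as well would drop two.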
     
  • Stop sending FLUSHSNAP messages when we hit a capsnap that has dirty_pages
    or is still writing. We'll send the newer capsnaps only after the older
    ones complete.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • The 'follows' should match the seq for the snap context for the given snap
    cap, which is the context under which we have been dirtying and writing
    data and metadata. The snapshot that _contains_ those updates thus
    _follows_ that context's seq #.

    Signed-off-by: Sage Weil

    Sage Weil