17 Apr, 2014

5 commits

  • This issue was found by Coverity (CID 1202536)

    This proposes a fix for a statement that creates dead code.
    The "rc < 0" statement is within code that is run
    with "rc > 0".

    It seems like "err < 0" was meant to be used here.
    This way, the error code is returned by the function.

    Signed-off-by: Michael Opdenacker
    Acked-by: Al Viro
    Signed-off-by: Steve French

    Michael Opdenacker
     
  • Coverity says:

    *** CID 1202537: Dereference after null check (FORWARD_NULL)
    /fs/cifs/file.c: 2873 in cifs_user_readv()
    2867 cur_len = min_t(const size_t, len - total_read, cifs_sb->rsize);
    2868 npages = DIV_ROUND_UP(cur_len, PAGE_SIZE);
    2869
    2870 /* allocate a readdata struct */
    2871 rdata = cifs_readdata_alloc(npages,
    2872 cifs_uncached_readv_complete);
    >>> CID 1202537: Dereference after null check (FORWARD_NULL)
    >>> Comparing "rdata" to null implies that "rdata" might be null.
    2873 if (!rdata) {
    2874 rc = -ENOMEM;
    2875 goto error;
    2876 }
    2877
    2878 rc = cifs_read_allocate_pages(rdata, npages);

    ...when we "goto error", rc will be non-zero, and then we end up trying
    to do a kref_put on the rdata (which is NULL). Fix this by replacing
    the "goto error" with a "break".

    Reported-by:
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • In SMB2_set_compression(), the "res_key" variable is only initialized to NULL
    and later kfreed. It is therefore useless and should be removed.

    Found with the following semantic patch:

    @@
    identifier foo;
    identifier f;
    type T;
    @@
    * f(...) {
    ...
    * T *foo = NULL;
    ... when forall
    when != foo
    * kfree(foo);
    ...
    }

    Signed-off-by: Cyril Roelandt
    Signed-off-by: Steve French

    Cyril Roelandt
     
  • xfstest 020 detected a problem with cifs xattr handling. When a file
    had an empty xattr list, we returned success (with an empty xattr value)
    on query of particular xattrs rather than returning ENODATA.
    This patch fixes it so that query of an xattr returns ENODATA when the
    xattr list is empty for the file.

    Signed-off-by: Steve French
    Reviewed-by: Jeff Layton

    Steve French
     
  • Problem reported in Red Hat bz 1040329 for strict writes where we cache
    only when we hold oplock and write direct to the server when we don't.

    When we receive an oplock break, we first change the oplock value for
    the inode in cifsInodeInfo->oplock to indicate that we no longer hold
    the oplock before we enqueue a task to flush changes to the backing
    device. Once we have completed flushing the changes, we return the
    oplock to the server.

    There are 2 ways here where we can have data corruption
    1) While we flush changes to the backing device as part of the oplock
    break, we can have processes write to the file. These writes check for
    the oplock, find none and attempt to write directly to the server.
    These direct writes made while we are flushing from cache could be
    overwritten by data being flushed from the cache causing data
    corruption.
    2) While a thread runs in cifs_strict_writev, the machine could receive
    and process an oplock break after the thread has checked the oplock and
    found that it allows us to cache and before we have made changes to the
    cache. In that case, we end up with a dirty page in cache when we
    shouldn't have any. This will be flushed later and will overwrite all
    subsequent writes to the part of the file represented by this page.

    Before making any writes to the server, we need to confirm that we are
    not in the process of flushing data to the server and if we are, we
    should wait until the process is complete before we attempt the write.
    We should also wait for existing writes to complete before we process
    an oplock break request which changes oplock values.

    We add a version specific downgrade_oplock() operation to allow for
    differences in the oplock values set for the different smb versions.

    Cc: stable@vger.kernel.org
    Signed-off-by: Sachin Prabhu
    Reviewed-by: Jeff Layton
    Reviewed-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Sachin Prabhu
     

14 Apr, 2014

1 commit

  • On 32 bit, size_t is "unsigned int", not "unsigned long", causing the
    following warning when comparing with PAGE_SIZE, which is always "unsigned
    long":

    fs/cifs/file.c: In function ‘cifs_readdata_to_iov’:
    fs/cifs/file.c:2757: warning: comparison of distinct pointer types lacks a cast

    Introduced by commit 7f25bba819a3 ("cifs_iovec_read: keep iov_iter
    between the calls of cifs_readdata_to_iov()"), which changed the
    signedness of "remaining" and the code from min_t() to min().

    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     

13 Apr, 2014

1 commit

  • Pull vfs updates from Al Viro:
    "The first vfs pile, with deep apologies for being very late in this
    window.

    Assorted cleanups and fixes, plus a large preparatory part of iov_iter
    work. There's a lot more of that, but it'll probably go into the next
    merge window - it *does* shape up nicely, removes a lot of
    boilerplate, gets rid of locking inconsistencie between aio_write and
    splice_write and I hope to get Kent's direct-io rewrite merged into
    the same queue, but some of the stuff after this point is having
    (mostly trivial) conflicts with the things already merged into
    mainline and with some I want more testing.

    This one passes LTP and xfstests without regressions, in addition to
    usual beating. BTW, readahead02 in ltp syscalls testsuite has started
    giving failures since "mm/readahead.c: fix readahead failure for
    memoryless NUMA nodes and limit readahead pages" - might be a false
    positive, might be a real regression..."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    missing bits of "splice: fix racy pipe->buffers uses"
    cifs: fix the race in cifs_writev()
    ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
    kill generic_file_buffered_write()
    ocfs2_file_aio_write(): switch to generic_perform_write()
    ceph_aio_write(): switch to generic_perform_write()
    xfs_file_buffered_aio_write(): switch to generic_perform_write()
    export generic_perform_write(), start getting rid of generic_file_buffer_write()
    generic_file_direct_write(): get rid of ppos argument
    btrfs_file_aio_write(): get rid of ppos
    kill the 5th argument of generic_file_buffered_write()
    kill the 4th argument of __generic_file_aio_write()
    lustre: don't open-code kernel_recvmsg()
    ocfs2: don't open-code kernel_recvmsg()
    drbd: don't open-code kernel_recvmsg()
    constify blk_rq_map_user_iov() and friends
    lustre: switch to kernel_sendmsg()
    ocfs2: don't open-code kernel_sendmsg()
    take iov_iter stuff to mm/iov_iter.c
    process_vm_access: tidy up a bit
    ...

    Linus Torvalds
     

12 Apr, 2014

1 commit

  • O_APPEND handling there hadn't been completely fixed by Pavel's
    patch; it checks the right value, but it's racy - we can't really
    do that until i_mutex has been taken.

    Fix by switching to __generic_file_aio_write() (open-coding
    generic_file_aio_write(), actually) and pulling mutex_lock() above
    inode_size_read().

    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Al Viro
     

08 Apr, 2014

1 commit

  • filemap_map_pages() is generic implementation of ->map_pages() for
    filesystems who uses page cache.

    It should be safe to use filemap_map_pages() for ->map_pages() if
    filesystem use filemap_fault() for ->fault().

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Linus Torvalds
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Andi Kleen
    Cc: Matthew Wilcox
    Cc: Dave Hansen
    Cc: Alexander Viro
    Cc: Dave Chinner
    Cc: Ning Qu
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

05 Apr, 2014

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Major changes for 3.14 include support for the newly added ZERO_RANGE
    and COLLAPSE_RANGE fallocate operations, and scalability improvements
    in the jbd2 layer and in xattr handling when the extended attributes
    spill over into an external block.

    Other than that, the usual clean ups and minor bug fixes"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
    ext4: fix premature freeing of partial clusters split across leaf blocks
    ext4: remove unneeded test of ret variable
    ext4: fix comment typo
    ext4: make ext4_block_zero_page_range static
    ext4: atomically set inode->i_flags in ext4_set_inode_flags()
    ext4: optimize Hurd tests when reading/writing inodes
    ext4: kill i_version support for Hurd-castrated file systems
    ext4: each filesystem creates and uses its own mb_cache
    fs/mbcache.c: doucple the locking of local from global data
    fs/mbcache.c: change block and index hash chain to hlist_bl_node
    ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
    ext4: refactor ext4_fallocate code
    ext4: Update inode i_size after the preallocation
    ext4: fix partial cluster handling for bigalloc file systems
    ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
    ext4: only call sync_filesystm() when remounting read-only
    fs: push sync_filesystem() down to the file system's remount_fs()
    jbd2: improve error messages for inconsistent journal heads
    jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
    jbd2: minimize region locked by j_list_lock in journal_get_create_access()
    ...

    Linus Torvalds
     

04 Apr, 2014

2 commits

  • Reclaim will be leaving shadow entries in the page cache radix tree upon
    evicting the real page. As those pages are found from the LRU, an
    iput() can lead to the inode being freed concurrently. At this point,
    reclaim must no longer install shadow pages because the inode freeing
    code needs to ensure the page tree is really empty.

    Add an address_space flag, AS_EXITING, that the inode freeing code sets
    under the tree lock before doing the final truncate. Reclaim will check
    for this flag before installing shadow pages.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: Minchan Kim
    Cc: Andrea Arcangeli
    Cc: Bob Liu
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Greg Thelen
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: KOSAKI Motohiro
    Cc: Luigi Semenzato
    Cc: Mel Gorman
    Cc: Metin Doslu
    Cc: Michel Lespinasse
    Cc: Ozgun Erdogan
    Cc: Peter Zijlstra
    Cc: Roman Gushchin
    Cc: Ryan Mallon
    Cc: Tejun Heo
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • cifs_init_inodecache is only called by __init init_cifs.

    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     

02 Apr, 2014

4 commits


13 Mar, 2014

1 commit

  • Previously, the no-op "mount -o mount /dev/xxx" operation when the
    file system is already mounted read-write causes an implied,
    unconditional syncfs(). This seems pretty stupid, and it's certainly
    documented or guaraunteed to do this, nor is it particularly useful,
    except in the case where the file system was mounted rw and is getting
    remounted read-only.

    However, it's possible that there might be some file systems that are
    actually depending on this behavior. In most file systems, it's
    probably fine to only call sync_filesystem() when transitioning from
    read-write to read-only, and there are some file systems where this is
    not needed at all (for example, for a pseudo-filesystem or something
    like romfs).

    Signed-off-by: "Theodore Ts'o"
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Christoph Hellwig
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: Evgeniy Dushistov
    Cc: Jan Kara
    Cc: OGAWA Hirofumi
    Cc: Anders Larsen
    Cc: Phillip Lougher
    Cc: Kees Cook
    Cc: Mikulas Patocka
    Cc: Petr Vandrovec
    Cc: xfs@oss.sgi.com
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-cifs@vger.kernel.org
    Cc: samba-technical@lists.samba.org
    Cc: codalist@coda.cs.cmu.edu
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: fuse-devel@lists.sourceforge.net
    Cc: cluster-devel@redhat.com
    Cc: linux-mtd@lists.infradead.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-nilfs@vger.kernel.org
    Cc: linux-ntfs-dev@lists.sourceforge.net
    Cc: ocfs2-devel@oss.oracle.com
    Cc: reiserfs-devel@vger.kernel.org

    Theodore Ts'o
     

01 Mar, 2014

1 commit

  • The rfc1002 length actually includes a type byte, which we aren't
    masking off. In most cases, it's not a problem since the
    RFC1002_SESSION_MESSAGE type is 0, but when doing a RFC1002 session
    establishment, the type is non-zero and that throws off the returned
    length.

    Signed-off-by: Jeff Layton
    Tested-by: Sachin Prabhu
    Signed-off-by: Steve French

    Jeff Layton
     

24 Feb, 2014

2 commits

  • We had a bug discovered recently where an upper layer function
    (cifs_iovec_write) could pass down a smb_rqst with an invalid amount of
    data in it. The length of the SMB frame would be correct, but the rqst
    struct would cause smb_send_rqst to send nearly 4GB of data.

    This should never be the case. Add some sanity checking to the beginning
    of smb_send_rqst that ensures that the amount of data we're going to
    send agrees with the length in the RFC1002 header. If it doesn't, WARN()
    and return -EIO to the upper layers.

    Signed-off-by: Jeff Layton
    Acked-by: Sachin Prabhu
    Reviewed-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Jeff Layton
     
  • and use generic_file_aio_write rather than __generic_file_aio_write
    in cifs_writev.

    Signed-off-by: Pavel Shilovsky
    Reported-by: Al Viro
    Signed-off-by: Steve French

    Pavel Shilovsky
     

18 Feb, 2014

1 commit

  • Pull CIFS fixes from Steve French:
    "Three cifs fixes, the most important fixing the problem with passing
    bogus pointers with writev (CVE-2014-0069).

    Two additional cifs fixes are still in review (including the fix for
    an append problem which Al also discovered)"

    * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
    CIFS: Fix too big maxBuf size for SMB3 mounts
    cifs: ensure that uncached writes handle unmapped areas correctly
    [CIFS] Fix cifsacl mounts over smb2 to not call cifs

    Linus Torvalds
     

15 Feb, 2014

2 commits

  • SMB3 servers can respond with MaxTransactSize of more than 4M
    that can cause a memory allocation error returned from kmalloc
    in a lock codepath. Also the client doesn't support multicredit
    requests now and allows buffer sizes of 65536 bytes only. Set
    MaxTransactSize to this maximum supported value.

    Cc: stable@vger.kernel.org # 3.7+
    Signed-off-by: Pavel Shilovsky
    Acked-by: Jeff Layton
    Signed-off-by: Steve French

    Pavel Shilovsky
     
  • It's possible for userland to pass down an iovec via writev() that has a
    bogus user pointer in it. If that happens and we're doing an uncached
    write, then we can end up getting less bytes than we expect from the
    call to iov_iter_copy_from_user. This is CVE-2014-0069

    cifs_iovec_write isn't set up to handle that situation however. It'll
    blindly keep chugging through the page array and not filling those pages
    with anything useful. Worse yet, we'll later end up with a negative
    number in wdata->tailsz, which will confuse the sending routines and
    cause an oops at the very least.

    Fix this by having the copy phase of cifs_iovec_write stop copying data
    in this situation and send the last write as a short one. At the same
    time, we want to avoid sending a zero-length write to the server, so
    break out of the loop and set rc to -EFAULT if that happens. This also
    allows us to handle the case where no address in the iovec is valid.

    [Note: Marking this for stable on v3.4+ kernels, but kernels as old as
    v2.6.38 may have a similar problem and may need similar fix]

    Cc: # v3.4+
    Reviewed-by: Pavel Shilovsky
    Reported-by: Al Viro
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     

11 Feb, 2014

2 commits

  • When mounting with smb2/smb3 (e.g. vers=2.1) and cifsacl mount option,
    it was trying to get the mode by querying the acl over the cifs
    rather than smb2 protocol. This patch makes that protocol
    independent and makes cifsacl smb2 mounts return a more intuitive
    operation not supported error (until we add a worker function
    for smb2_get_acl).

    Note that a previous patch fixed getxattr/setxattr for the CIFSACL xattr
    which would unconditionally call cifs_get_acl and cifs_set_acl (even when
    mounted smb2). I made those protocol independent last week (new protocol
    version operations "get_acl" and "set_acl" but did not add an
    smb2_get_acl and smb2_set_acl yet so those now simply return EOPNOTSUPP
    which at least is better than sending cifs requests on smb2 mount)

    The previous patches did not fix the one remaining case though ie
    mounting with "cifsacl" when getting mode from acl would unconditionally
    end up calling "cifs_get_acl_from_fid" even for smb2 - so made that protocol
    independent but to make that protocol independent had to make sure that the callers
    were passing the protocol independent handle structure (cifs_fid) instead
    of cifs specific _u16 network file handle (ie cifs_fid instead of cifs_fid->fid)

    Now mount with smb2 and cifsacl mount options will return EOPNOTSUP (instead
    of timing out) and a future patch will add smb2 operations (e.g. get_smb2_acl)
    to enable this.

    Signed-off-by: Steve French

    Steve French
     
  • Pull CIFS fixes from Steve French:
    "Small fix from Jeff for writepages leak, and some fixes for ACLs and
    xattrs when SMB2 enabled.

    Am expecting another fix from Jeff and at least one more fix (for
    mounting SMB2 with cifsacl) in the next week"

    * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
    [CIFS] clean up page array when uncached write send fails
    cifs: use a flexarray in cifs_writedata
    retrieving CIFS ACLs when mounted with SMB2 fails dropping session
    Add protocol specific operation for CIFS xattrs

    Linus Torvalds
     

10 Feb, 2014

1 commit

  • It actually goes back to 2004 ([PATCH] Concurrent O_SYNC write support)
    when sync_page_range() had been introduced; generic_file_write{,v}() correctly
    synced
    pos_after_write - written .. pos_after_write - 1
    but generic_file_aio_write() synced
    pos_before_write .. pos_before_write + written - 1
    instead. Which is not the same thing with O_APPEND, obviously.
    A couple of years later correct variant had been killed off when
    everything switched to use of generic_file_aio_write().

    All users of generic_file_aio_write() are affected, and the same bug
    has been copied into other instances of ->aio_write().

    The fix is trivial; the only subtle point is that generic_write_sync()
    ought to be inlined to avoid calculations useless for the majority of
    calls.

    Signed-off-by: Al Viro

    Al Viro
     

08 Feb, 2014

4 commits

  • In the event that a send fails in an uncached write, or we end up
    needing to reissue it (-EAGAIN case), we'll kfree the wdata but
    the pages currently leak.

    Fix this by adding a new kref release routine for uncached writedata
    that releases the pages, and have the uncached codepaths use that.

    [original patch by Jeff modified to fix minor formatting problems]

    Signed-off-by: Jeff Layton
    Reviewed-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Steve French
     
  • The cifs_writedata code uses a single element trailing array, which
    just adds unneeded complexity. Use a flexarray instead.

    Signed-off-by: Jeff Layton
    Reviewed-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Jeff Layton
     
  • The get/set ACL xattr support for CIFS ACLs attempts to send old
    cifs dialect protocol requests even when mounted with SMB2 or later
    dialects. Sending cifs requests on an smb2 session causes problems -
    the server drops the session due to the illegal request.

    This patch makes CIFS ACL operations protocol specific to fix that.

    Attempting to query/set CIFS ACLs for SMB2 will now return
    EOPNOTSUPP (until we add worker routines for sending query
    ACL requests via SMB2) instead of sending invalid (cifs)
    requests.

    A separate followon patch will be needed to fix cifs_acl_to_fattr
    (which takes a cifs specific u16 fid so can't be abstracted
    to work with SMB2 until that is changed) and will be needed
    to fix mount problems when "cifsacl" is specified on mount
    with e.g. vers=2.1

    Signed-off-by: Steve French
    Reviewed-by: Shirish Pargaonkar
    CC: Stable

    Steve French
     
  • Changeset 666753c3ef8fc88b0ddd5be4865d0aa66428ac35 added protocol
    operations for get/setxattr to avoid calling cifs operations
    on smb2/smb3 mounts for xattr operations and this changeset
    adds the calls to cifs specific protocol operations for xattrs
    (in order to reenable cifs support for xattrs which was
    temporarily disabled by the previous changeset. We do not
    have SMB2/SMB3 worker function for setting xattrs yet so
    this only enables it for cifs.

    CCing stable since without these two small changsets (its
    small coreq 666753c3ef8fc88b0ddd5be4865d0aa66428ac35 is
    also needed) calling getfattr/setfattr on smb2/smb3 mounts
    causes problems.

    Signed-off-by: Steve French
    Reviewed-by: Shirish Pargaonkar
    CC: Stable

    Steve French
     

31 Jan, 2014

1 commit

  • MF Symlinks are regular files containing content in a specified format.

    The function couldbe_mf_symlink() checks the mode for a set S_IFREG bit
    as a test to confirm that it is a regular file. This bit is also set for
    other filetypes and simply checking for this bit being set may return
    false positives.

    We ensure that we are actually checking for a regular file by using the
    S_ISREG macro to test instead.

    Signed-off-by: Sachin Prabhu
    Reviewed-by: Jeff Layton
    Reported-by: Neil Brown
    Reported-by: Dan Carpenter
    Signed-off-by: Steve French

    Sachin Prabhu
     

27 Jan, 2014

1 commit

  • When mounting with smb2 (or smb2.1 or smb3) we need to check to make
    sure that attempts to query or set extended attributes do not
    attempt to send the request with the older cifs protocol instead
    (eventually we also need to add the support in SMB2
    to query/set extended attributes but this patch prevents us from
    using the wrong protocol for extended attribute operations).

    Signed-off-by: Steve French

    Steve French
     

20 Jan, 2014

8 commits