07 Nov, 2014

3 commits

  • These functions are being open-coded in 3 different places in the driver
    core, and other driver subsystems will want to start doing this as well,
    so move it to the sysfs core to keep it all in one place, where we know
    it is written properly.

    Signed-off-by: Greg Kroah-Hartman

    Conflicts:
    drivers/base/bus.c

    (cherry picked from commit f1986282fe78586eddf3ae972a72eab7ca425aa7)

    Greg Kroah-Hartman
     
  • When only using bin_attrs instead of attrs, the kernel prints a warning
    and refuses to create the sysfs entry. This fixes that.

    Signed-off-by: Oliver Schinagl
    Signed-off-by: Greg Kroah-Hartman
    (cherry picked from commit 8f14c8fa7b2800a25fdb02be99b91104a88c4d7f)

    Oliver Schinagl
     
  • Groups should be able to support binary attributes, just like they
    support "normal" attributes. This lets us only handle one type of
    structure, groups, throughout the driver core and subsystems, making
    binary attributes a "full fledged" part of the driver model, and not
    something just "tacked on".

    Reported-by: Oliver Schinagl
    Reviewed-by: Guenter Roeck
    Tested-by: Guenter Roeck
    Signed-off-by: Greg Kroah-Hartman
    (cherry picked from commit 03829e7591389acd227a532db83d92f0bd188287)

    Greg Kroah-Hartman
     

28 Aug, 2014

1 commit


01 Aug, 2014

1 commit

  • commit aed8adb7688d5744cb484226820163af31d2499a upstream.

    Commit 079148b919d0 ("coredump: factor out the setting of PF_DUMPCORE")
    cleaned up the setting of PF_DUMPCORE by removing it from all the
    linux_binfmt->core_dump() handlers and moving it to zap_threads(). But
    this ended up clearing all the previously set flags. This causes issues
    during core generation when tsk->flags is checked again (e.g. for
    PF_USED_MATH to dump floating point registers). Fix this.

    Signed-off-by: Silesh C V
    Acked-by: Oleg Nesterov
    Cc: Mandeep Singh Baines
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Silesh C V
     

28 Jul, 2014

1 commit

  • commit 233a01fa9c4c7c41238537e8db8434667ff28a2f upstream.

    If the number in "user_id=N" or "group_id=N" mount options was larger than
    INT_MAX then fuse returned EINVAL.

    Fix this to handle all valid uid/gid values.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Greg Kroah-Hartman

    Miklos Szeredi
     

18 Jul, 2014

3 commits

  • commit 5dd214248f94d430d70e9230bda72f2654ac88a8 upstream.

    The mount manpage says of the max_batch_time option,

    This optimization can be turned off entirely
    by setting max_batch_time to 0.

    But the code doesn't do that. So fix the code to do
    that.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Eric Sandeen
     
  • commit ae0f78de2c43b6fadd007c231a352b13b5be8ed2 upstream.

    Make it clear that the values printed are times, and that the error
    shown is the error since the last fsck. Also add a note about the
    required fsck version.

    Signed-off-by: Pavel Machek
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Andreas Dilger
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     
  • commit 61c219f5814277ecb71d64cb30297028d6665979 upstream.

    The first time that we allocate from an uninitialized inode allocation
    bitmap, if the block allocation bitmap is also uninitialized, we need
    to get write access to the block group descriptor before we start
    modifying the block group descriptor flags and updating the free block
    count, etc. Otherwise, there is the potential of a bad journal
    checksum (if journal checksums are enabled), and of the file system
    becoming inconsistent if we crash at exactly the wrong time.

    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     

10 Jul, 2014

4 commits

  • commit 76f47128f9b33af1e96819746550d789054c9664 upstream.

    An NFS operation that creates a new symlink includes the symlink data,
    which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
    of zero-padding as required to reach a 4-byte boundary.

    The vfs, on the other hand, wants null-terminated data.

    The simple way to handle this would be by copying the data into a newly
    allocated buffer with space for the final null.

    The current nfsd_symlink code tries to be more clever by skipping that
    step in the (likely) case where the byte following the string is already
    0.

    But that assumes that the byte following the string is ours to look at.
    In fact, it might be the first byte of a page that we can't read, or of
    some object that another task might modify.

    Worse, the NFSv4 code tries to fix the problem by actually writing to
    that byte.

    In the NFSv2/v3 cases this actually appears to be safe:

    - nfs3svc_decode_symlinkargs explicitly null-terminates the data
      (after first checking its length and copying it to a new page).
    - NFSv2 limits symlinks to 1k. The buffer holding the rpc request is
      always at least a page, and the link data (and previous fields)
      have maximum lengths that prevent the request from reaching the
      end of a page.

    In the NFSv4 case the CREATE op is potentially just one part of a long
    compound so can end up on the end of a page if you're unlucky.

    The minimal fix here is to copy and null-terminate in the NFSv4 case.
    The nfsd_symlink() interface here seems too fragile, though. It should
    really either do the copy itself every time or just require a
    null-terminated string.

    Reported-by: Jeff Layton
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    J. Bruce Fields
     
  • commit a93cd4cf86466caa49cfe64607bea7f0bde3f916 upstream.

    Hole punching code for files with indirect blocks wrongly computed
    number of blocks which need to be cleared when traversing the indirect
    block tree. That could result in punching more blocks than actually
    requested and thus effectively cause a data loss. For example:

    fallocate -n -p 10240000 4096

    will punch the range 10240000 - 12632064 instead of the requested range
    10240000 - 10244096. Fix the calculation.

    Fixes: 8bad6fc813a3a5300f51369c39d315679fd88c72
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit c5c7b8ddfbf8cb3b2291e515a34ab1b8982f5a2d upstream.

    Error recovery in ext4_alloc_branch() calls ext4_forget() even for
    buffer corresponding to indirect block it did not allocate. This leads
    to brelse() being called twice for that buffer (once from ext4_forget()
    and once from cleanup in ext4_ind_map_blocks()) leading to buffer use
    count misaccounting. Eventually (but often much later because there
    are other users of the buffer) we will see messages like:
    VFS: brelse: Trying to free free buffer

    Another manifestation of this problem is an error:
    JBD2 unexpected failure: jbd2_journal_revoke: !buffer_revoked(bh);
    inconsistent data on disk

    The fix is easy - don't forget buffer we did not allocate. Also add an
    explanatory comment because the indexing at ext4_alloc_branch() is
    somewhat subtle.

    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit ce36d9ab3bab06b7b5522f5c8b68fac231b76ffb upstream.

    When an SMB3 share was mounted with mapchars (to allow the reserved
    characters : \ / > < * ? via the Unicode Windows-to-POSIX remap range),
    empty paths (e.g. when we open "" to query the root of the SMB3 directory
    on mount) were not null terminated, so we sent garbage as the path name
    on empty paths, which caused SMB2/SMB2.1/SMB3 mounts to fail when
    mapchars was specified. mapchars is particularly important since Unix
    Extensions for SMB3 are not supported (yet).

    Signed-off-by: Steve French
    Reviewed-by: David Disseldorp
    Signed-off-by: Greg Kroah-Hartman

    Steve French
     

07 Jul, 2014

9 commits

  • commit 22e7478ddbcb670e33fab72d0bbe7c394c3a2c84 upstream.

    Prior to commit 0e4f6a791b1e (Fix reiserfs_file_release()), reiserfs
    truncates serialized on i_mutex. They mostly still do, with the exception
    of reiserfs_file_release. That blocks out other writers via the tailpack
    mutex and the inode openers counter adjusted in reiserfs_file_open.

    However, NFS will call reiserfs_setattr without having called ->open, so
    we end up with a race when nfs is calling ->setattr while another
    process is releasing the file. Ultimately, it triggers the
    BUG_ON(inode->i_size != new_file_size) check in maybe_indirect_to_direct.

    The solution is to pull the lock into reiserfs_setattr to encompass the
    truncate_setsize call as well.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

    Jeff Mahoney
     
  • commit 1b19453d1c6abcfa7c312ba6c9f11a277568fc94 upstream.

    Currently, the DRC cache pruner will stop scanning the list when it
    hits an entry that is RC_INPROG. It's possible however for a call to
    take a *very* long time. In that case, we don't want it to block other
    entries from being pruned if they are expired or we need to trim the
    cache to get back under the limit.

    Fix the DRC cache pruner to simply skip RC_INPROG entries.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    Jeff Layton
     
  • commit a0ef5e19684f0447da9ff0654a12019c484f57ca upstream.

    Currently when we are processing a request, we try to scrape an expired
    or over-limit entry off the list in preference to allocating a new one
    from the slab.

    This is unnecessarily complicated. Just use the slab layer.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    Jeff Layton
     
  • commit 43b6535e717d2f656f71d9bd16022136b781c934 upstream.

    Fix a bug, whereby nfs_update_inode() was declaring the inode to be
    up to date despite not having checked all the attributes.
    The bug occurs because the temporary variable in which we cache
    the validity information is 'sanitised' before being reapplied to
    nfsi->cache_validity.

    Reported-by: Kinglong Mee
    Signed-off-by: Trond Myklebust
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
     
  • commit 12337901d654415d9f764b5f5ba50052e9700f37 upstream.

    Note nobody's ever noticed because the typical client probably never
    requests FILES_AVAIL without also requesting something else on the list.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    Christoph Hellwig
     
  • commit 48385408b45523d9a432c66292d47ef43efcbb94 upstream.

    27b11428b7de ("nfsd4: remove lockowner when removing lock stateid")
    introduced a memory leak.

    Reported-by: Jeff Layton
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    J. Bruce Fields
     
  • commit 6df200f5d5191bdde4d2e408215383890f956781 upstream.

    Return the NULL pointer when the allocation fails.

    Reported-by: Fengguang Wu
    Signed-off-by: Trond Myklebust
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
     
  • commit 72abc8f4b4e8574318189886de627a2bfe6cd0da upstream.

    I hit the same assertion failure that Dolev Raviv reported in kernel
    v3.10; it looks like this:

    [ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297)
    [ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G O 3.10.40 #1
    [ 9641.234116] [] (unwind_backtrace+0x0/0x12c) from [] (show_stack+0x20/0x24)
    [ 9641.234137] [] (show_stack+0x20/0x24) from [] (dump_stack+0x20/0x28)
    [ 9641.234188] [] (dump_stack+0x20/0x28) from [] (shrink_tnc_trees+0x25c/0x350 [ubifs])
    [ 9641.234265] [] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [] (ubifs_shrinker+0x25c/0x310 [ubifs])
    [ 9641.234307] [] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [] (shrink_slab+0x1d4/0x2f8)
    [ 9641.234327] [] (shrink_slab+0x1d4/0x2f8) from [] (do_try_to_free_pages+0x300/0x544)
    [ 9641.234344] [] (do_try_to_free_pages+0x300/0x544) from [] (try_to_free_pages+0x2d0/0x398)
    [ 9641.234363] [] (try_to_free_pages+0x2d0/0x398) from [] (__alloc_pages_nodemask+0x494/0x7e8)
    [ 9641.234382] [] (__alloc_pages_nodemask+0x494/0x7e8) from [] (new_slab+0x78/0x238)
    [ 9641.234400] [] (new_slab+0x78/0x238) from [] (__slab_alloc.constprop.42+0x1a4/0x50c)
    [ 9641.234419] [] (__slab_alloc.constprop.42+0x1a4/0x50c) from [] (kmem_cache_alloc_trace+0x54/0x188)
    [ 9641.234459] [] (kmem_cache_alloc_trace+0x54/0x188) from [] (do_readpage+0x168/0x468 [ubifs])
    [ 9641.234553] [] (do_readpage+0x168/0x468 [ubifs]) from [] (ubifs_readpage+0x424/0x464 [ubifs])
    [ 9641.234606] [] (ubifs_readpage+0x424/0x464 [ubifs]) from [] (filemap_fault+0x304/0x418)
    [ 9641.234638] [] (filemap_fault+0x304/0x418) from [] (__do_fault+0xd4/0x530)
    [ 9641.234665] [] (__do_fault+0xd4/0x530) from [] (handle_pte_fault+0x480/0xf54)
    [ 9641.234690] [] (handle_pte_fault+0x480/0xf54) from [] (handle_mm_fault+0x140/0x184)
    [ 9641.234716] [] (handle_mm_fault+0x140/0x184) from [] (do_page_fault+0x150/0x3ac)
    [ 9641.234737] [] (do_page_fault+0x150/0x3ac) from [] (do_DataAbort+0x3c/0xa0)
    [ 9641.234759] [] (do_DataAbort+0x3c/0xa0) from [] (__dabt_usr+0x38/0x40)

    After analyzing the code, I found a condition that can cause this
    failure during correct operation. Thus, I think this assertion is wrong
    and should be removed.

    Suppose there are two clean znodes and one dirty znode in the TNC, so
    the per-filesystem atomic_t @clean_zn_cnt is (2). If a commit starts,
    the dirty znode is set to COW_ZNODE in get_znodes_to_commit() to guard
    against concurrent operations on it. We clear the COW bit and DIRTY bit
    in write_index() without @tnc_mutex locked, and we do not increase
    @clean_zn_cnt at this point. As the comments in write_index() show, if
    another process holds @tnc_mutex and dirties this znode after we clean
    it, @clean_zn_cnt is decreased to (1). We later increase @clean_zn_cnt
    to (2) with @tnc_mutex locked in free_obsolete_znodes() to keep it
    right.

    If shrink_tnc() runs between the decrease and the increase, it will
    release the other 2 clean znodes it holds, find that @clean_zn_cnt is
    less than zero (1 - 2 = -1), and hit the assertion. Because
    free_obsolete_znodes() will soon correct @clean_zn_cnt and there is no
    harm to the fs in this case, I think this assertion can be removed.

    2 clean znodes and 1 dirty znode, @clean_zn_cnt == 2

    Thread A (commit), Thread B (write or others), Thread C (shrinker):

    A: ->write_index
    A:   ->clear_bit(DIRTY_NODE)
    A:   ->clear_bit(COW_ZNODE)

    @clean_zn_cnt == 2

    B: ->mutex_lock(&tnc_mutex)
    B:   ->dirty_cow_znode
    B:     ->!ubifs_zn_cow(znode)
    B:     ->!test_and_set_bit(DIRTY_NODE)
    B:     ->atomic_dec(&clean_zn_cnt)
    B: ->mutex_unlock(&tnc_mutex)

    @clean_zn_cnt == 1

    C: ->mutex_lock(&tnc_mutex)
    C:   ->shrink_tnc
    C:     ->destroy_tnc_subtree
    C:       ->atomic_sub(&clean_zn_cnt, 2)
    C:     ->ubifs_assert   <- hit!
    C: ->mutex_unlock(&tnc_mutex)

    @clean_zn_cnt == -1

    A: ->mutex_lock(&tnc_mutex)
    A:   ->free_obsolete_znodes
    A:     ->atomic_inc(&clean_zn_cnt)
    A: ->mutex_unlock(&tnc_mutex)

    @clean_zn_cnt == 0 (correct after shrink)

    Signed-off-by: hujianyang
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: Greg Kroah-Hartman

    hujianyang
     
  • commit 691a7c6f28ac90cccd0dbcf81348ea90b211bdd0 upstream.

    There is a race condition in UBIFS:

    Thread A (mmap), Thread B (fsync):

    A: ->__do_fault
    B: ->write_cache_pages
    A:   ->ubifs_vm_page_mkwrite
    A:     ->budget_space
    A:     ->lock_page
    A:     ->release/convert_page_budget
    A:     ->SetPagePrivate
    A:     ->TestSetPageDirty
    A:     ->unlock_page
    B:   ->lock_page
    B:     ->TestClearPageDirty
    B:     ->ubifs_writepage
    B:       ->do_writepage
    B:         ->release_budget
    B:         ->ClearPagePrivate
    B:     ->unlock_page
    A:   ->!(ret & VM_FAULT_LOCKED)
    A:   ->lock_page
    A:   ->set_page_dirty
    A:     ->ubifs_set_page_dirty
    A:       ->TestSetPageDirty (set page dirty without budgeting)
    A:   ->unlock_page

    This leads to a situation where we have a dirty page but no budget allocated for
    this page, so further write-back may fail with -ENOSPC.

    In this fix we return from page_mkwrite without performing unlock_page. We
    return VM_FAULT_LOCKED instead. After doing this, the race above will not
    happen.

    Signed-off-by: hujianyang
    Tested-by: Laurence Withers
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: Greg Kroah-Hartman

    hujianyang
     

01 Jul, 2014

15 commits

  • commit 3e2426bd0eb980648449e7a2f5a23e3cd3c7725c upstream.

    If this condition in end_extent_writepage() is false:

    if (tree->ops && tree->ops->writepage_end_io_hook)

    we will then test an uninitialized "ret" at:

    ret = ret < 0 ? ret : -EIO;

    The test for ret is for the case where ->writepage_end_io_hook
    failed, and we'd choose that ret as the error; but if
    there is no ->writepage_end_io_hook, nothing sets ret.

    Initializing ret to 0 should be sufficient; if
    writepage_end_io_hook wasn't set, (!uptodate) means
    non-zero err was passed in, so we choose -EIO in that case.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Eric Sandeen
     
  • commit 6eda71d0c030af0fc2f68aaa676e6d445600855b upstream.

    The skinny extents are interpreted incorrectly in scrub_print_warning(),
    and end up hitting the BUG() in btrfs_extent_inline_ref_size.

    Reported-by: Konstantinos Skarlatos
    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • commit cd857dd6bc2ae9ecea14e75a34e8a8fdc158e307 upstream.

    We want to make sure the pointer is still within the extent item, not to
    verify the memory it's pointing to.

    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • commit 8321cf2596d283821acc466377c2b85bcd3422b7 upstream.

    Otherwise there is a risk of a null pointer dereference.

    This was largely found by using the static code analysis tool cppcheck.

    Signed-off-by: Rickard Strandqvist
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Rickard Strandqvist
     
  • commit 1af56070e3ef9477dbc7eba3b9ad7446979c7974 upstream.

    If we are doing an incremental send and the base snapshot has a
    directory with name X that doesn't exist anymore in the second
    snapshot and a new subvolume/snapshot exists in the second snapshot
    that has the same name as the directory (name X), the incremental
    send would fail with -ENOENT error. This is because it attempts
    to lookup for an inode with a number matching the objectid of a
    root, which doesn't exist.

    Steps to reproduce:

    mkfs.btrfs -f /dev/sdd
    mount /dev/sdd /mnt

    mkdir /mnt/testdir
    btrfs subvolume snapshot -r /mnt /mnt/mysnap1

    rmdir /mnt/testdir
    btrfs subvolume create /mnt/testdir
    btrfs subvolume snapshot -r /mnt /mnt/mysnap2

    btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/send.data

    A test case for xfstests follows.

    Reported-by: Robert White
    Signed-off-by: Filipe David Borba Manana
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • commit 298658414a2f0bea1f05a81876a45c1cd96aa2e0 upstream.

    Seed device support allows us to create a new filesystem based on an
    existing filesystem.

    However, the newly created filesystem's @total_devices should include
    seed devices. This patch fixes the following problem:

    # mkfs.btrfs -f /dev/sdb
    # btrfstune -S 1 /dev/sdb
    # mount /dev/sdb /mnt
    # btrfs device add -f /dev/sdc /mnt --->fs_devices->total_devices = 1
    # umount /mnt
    # mount /dev/sdc /mnt --->fs_devices->total_devices = 2

    This is because we record the right @total_devices in the superblock,
    but @fs_devices->total_devices is reset to 0 in btrfs_prepare_sprout().

    Fix this problem by not resetting @fs_devices->total_devices.

    Signed-off-by: Wang Shilong
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Wang Shilong
     
  • commit 5dca6eea91653e9949ce6eb9e9acab6277e2f2c4 upstream.

    According to commit 865ffef3797da2cac85b3354b5b6050dc9660978
    (fs: fix fsync() error reporting), it is not reliable to just check
    error pages, because pages can be truncated or invalidated; we should
    also mark the mapping with an error flag so that a later fsync can
    catch the error.

    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • commit de348ee022175401e77d7662b7ca6e231a94e3fd upstream.

    In close_ctree(), after we have stopped all workers, there may still be
    some read requests (for example, readahead) to submit, and this *may*
    trigger an oops that a user reported before:

    kernel BUG at fs/btrfs/async-thread.c:619!

    By hacking the code, I can reproduce this problem with only one cpu
    available. We fix this potential problem by invalidating all btree inode
    pages before stopping all workers.

    Thanks to Miao for pointing out this problem.

    Signed-off-by: Wang Shilong
    Reviewed-by: David Sterba
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Wang Shilong
     
  • commit 32d6b47fe6fc1714d5f1bba1b9f38e0ab0ad58a8 upstream.

    If we fail to load a free space cache, we can rebuild it from the extent
    tree, so it is not a serious error and we should not output an error
    message that would alarm users. This patch uses a warning message
    instead.

    Signed-off-by: Miao Xie
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Miao Xie
     
  • commit 5a1972bd9fd4b2fb1bac8b7a0b636d633d8717e3 upstream.

    Btrfs sends a uevent to udev to announce a device change, but
    ctime/mtime for the block device inode is not updated. This causes
    libblkid, which is used by btrfs-progs, to miss the device change and
    use its old cache, so 'btrfs dev scan; btrfs dev remove; btrfs dev scan'
    gives an error message.

    Reported-by: Tsutomu Itoh
    Cc: Karel Zak
    Signed-off-by: Qu Wenruo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Qu Wenruo
     
  • commit 7d78874273463a784759916fc3e0b4e2eb141c70 upstream.

    We need to NULL the cached_state after freeing it, otherwise
    we might free it again if find_delalloc_range doesn't find anything.

    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Chris Mason
     
  • commit edfbbf388f293d70bf4b7c0bc38774d05e6f711a upstream.

    A kernel memory disclosure was introduced in aio_read_events_ring() in v3.10
    by commit a31ad380bed817aa25f8830ad23e1a0480fef797. The changes made to
    aio_read_events_ring() failed to correctly limit the index into
    ctx->ring_pages[], allowing an attacker to cause the subsequent kmap() of
    an arbitrary page and a copy_to_user() to copy its contents into userspace.
    This vulnerability has been assigned CVE-2014-0206. Thanks to Mateusz and
    Petr for disclosing this issue.

    This patch applies to v3.12+. A separate backport is needed for 3.10/3.11.

    [jmoyer@redhat.com: backported to 3.10]
    Signed-off-by: Benjamin LaHaise
    Signed-off-by: Jeff Moyer
    Cc: Mateusz Guzik
    Cc: Petr Matousek
    Cc: Kent Overstreet
    Signed-off-by: Greg Kroah-Hartman

    Benjamin LaHaise
     
  • commit f8567a3845ac05bb28f3c1b478ef752762bd39ef upstream.

    The aio cleanups and optimizations by kmo that were merged into the 3.10
    tree added a regression for userspace event reaping. Specifically, the
    reference counts are not decremented if the event is reaped in userspace,
    leading to the application being unable to submit further aio requests.
    This patch applies to 3.12+. A separate backport is required for 3.10/3.11.
    This issue was uncovered as part of CVE-2014-0206.

    [jmoyer@redhat.com: backported to 3.10]
    Signed-off-by: Benjamin LaHaise
    Signed-off-by: Jeff Moyer
    Cc: Kent Overstreet
    Cc: Mateusz Guzik
    Cc: Petr Matousek
    Signed-off-by: Greg Kroah-Hartman

    Benjamin LaHaise
     
  • commit b5b60778558cafad17bbcbf63e0310bd3c68eb17 upstream.

    The variable "size" is expressed as number of blocks and not as
    number of clusters, this could trigger a kernel panic when using
    ext4 with the size of a cluster different from the size of a block.

    Signed-off-by: Maurizio Lombardi
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Maurizio Lombardi
     
  • commit eeece469dedadf3918bad50ad80f4616a0064e90 upstream.

    Tail of a page straddling inode size must be zeroed when being written
    out due to POSIX requirement that modifications of mmaped page beyond
    inode size must not be written to the file. ext4_bio_write_page() did
    this only for blocks fully beyond inode size but didn't properly zero
    blocks partially beyond inode size. Fix this.

    The problem has been uncovered by mmap_11-4 test in openposix test suite
    (part of LTP).

    Reported-by: Xiaoguang Wang
    Fixes: 5a0dc7365c240
    Fixes: bd2d0210cf22f
    CC: stable@vger.kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     

17 Jun, 2014

1 commit

  • commit 23adbe12ef7d3d4195e80800ab36b37bee28cd03 upstream.

    The kernel has no concept of capabilities with respect to inodes; inodes
    exist independently of namespaces. For example, inode_capable(inode,
    CAP_LINUX_IMMUTABLE) would be nonsense.

    This patch changes inode_capable to check for uid and gid mappings and
    renames it to capable_wrt_inode_uidgid, which should make it more
    obvious what it does.

    Fixes CVE-2014-4014.

    Cc: Theodore Ts'o
    Cc: Serge Hallyn
    Cc: "Eric W. Biederman"
    Cc: Dave Chinner
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     

08 Jun, 2014

2 commits

  • commit d71f290b4e98a39f49f2595a13be3b4d5ce8e1f1 upstream.

    Specify the maximum stack size for arches where the stack grows upward
    (parisc and metag) in asm/processor.h rather than hard coding in
    fs/exec.c so that metag can specify a smaller value of 256MB rather than
    1GB.

    This fixes a BUG on metag if the RLIMIT_STACK hard limit is increased
    beyond a safe value by root. E.g. when starting a process after running
    "ulimit -H -s unlimited" it will then attempt to use a stack size of the
    maximum 1GB which is far too big for metag's limited user virtual
    address space (stack_top is usually 0x3ffff000):

    BUG: failure at fs/exec.c:589/shift_arg_pages()!

    Signed-off-by: James Hogan
    Cc: Helge Deller
    Cc: "James E.J. Bottomley"
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-metag@vger.kernel.org
    Cc: John David Anglin
    Signed-off-by: Greg Kroah-Hartman

    James Hogan
     
  • commit a1b8ff4c97b4375d21b6d6c45d75877303f61b3b upstream.

    The nfsv4 state code has always assumed a one-to-one correspondence
    between lock stateids and lockowners, even if in some places it appears
    not to.

    We may actually change that, but for now when FREE_STATEID releases a
    lock stateid it also needs to release the parent lockowner.

    Symptoms were a subsequent LOCK crashing in find_lockowner_str when it
    calls same_lockowner_ino on a lockowner that unexpectedly has an empty
    so_stateids list.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    J. Bruce Fields