01 Jul, 2014

40 commits

  • Greg Kroah-Hartman
     
  • commit 783ee43118dc773bc8b0342c5b230e017d5a04d0 upstream.

    In generic_id the long int timestamp is multiplied by 100000 and needs
    an explicit cast to u64.

    Without that the id in the resulting pstore filename is wrong and
    userspace may have problems parsing it, but more importantly files in
    pstore can never be deleted and may fill the EFI flash (brick device?).
    This happens because when generic pstore code wants to delete a file,
    it passes the id to the EFI backend, which reinterprets it and tries to
    delete a variable with the wrong name. There's no error message, but
    after remounting pstore, deleted files would reappear.

    Signed-off-by: Andrew Zaborowski
    Acked-by: David Rientjes
    Signed-off-by: Matt Fleming
    Signed-off-by: Greg Kroah-Hartman

    Andrzej Zaborowski
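
The overflow described in the pstore commit above can be illustrated with a minimal sketch. It is not the kernel's generic_id(): the function names, the `part` and `count` parameters, and the use of `uint32_t` to stand in for a 32-bit `long` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit arithmetic: the product wraps modulo 2^32 before it is widened. */
uint64_t id_truncated(uint32_t timestamp, uint32_t part, uint32_t count)
{
    return (timestamp * 100u + part) * 1000u + count;
}

/* Casting first makes the whole expression 64-bit, so no bits are lost. */
uint64_t id_widened(uint32_t timestamp, uint32_t part, uint32_t count)
{
    return ((uint64_t)timestamp * 100u + part) * 1000u + count;
}

/* Usage example (hypothetical values). */
void pstore_id_example(void)
{
    /* 1404172800 is 2014-07-01 00:00:00 UTC. */
    assert(id_widened(1404172800u, 1, 2) == UINT64_C(140417280001002));
    assert(id_truncated(1404172800u, 1, 2) != id_widened(1404172800u, 1, 2));
}
```

Because the two functions disagree for any realistic timestamp, an id computed one way cannot be matched against a name built the other way, which is consistent with the delete failure the commit describes.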
     
  • commit 6b4a144a92ab81a1f45fb9b12aebaaaee0d08120 upstream.

    In cross-build environment, we expect to use the cross-compiler objcopy
    instead of the host objcopy.

    It fixes the following build failure:
    objcopy --only-keep-debug lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko /srv/build/linux/debian/dbgtmp/usr/lib/debug/lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko
    objcopy: Unable to recognise the format of the input file `lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko'

    Signed-off-by: Fathi Boudra
    Fixes: 810e843746b7 ('deb-pkg: split debug symbols in their own package')
    Reviewed-by: Ben Hutchings
    Signed-off-by: Michal Marek
    Signed-off-by: Greg Kroah-Hartman

    Fathi Boudra
     
  • commit e33ba5fa7afce1a9f159704121d4e4d110df8185 upstream.

    Commit 0fb7a01af5b0 "random: simplify accounting code", introduced in
    v3.15, has a very nasty accounting problem when the entropy pool has
    fewer bytes of entropy than the number of requested reserved
    bytes. In that case, "have_bytes - reserved" goes negative, and since
    size_t is unsigned, the expression:

    ibytes = min_t(size_t, ibytes, have_bytes - reserved);

    ... does not do the right thing. This is rather bad, because it
    defeats the catastrophic reseeding feature in the
    xfer_secondary_pool() path.

    It also can cause the "BUG: spinlock trylock failure on UP" for some
    kernel configurations when prandom_reseed() calls get_random_bytes()
    in the early init, since when the entropy count gets corrupted,
    credit_entropy_bits() erroneously believes that the nonblocking pool
    has been fully initialized (when in fact it is not), and so it calls
    prandom_reseed(true) recursively leading to the spinlock BUG.

    The logic is *not* the same as it was originally, but in the cases where
    it matters, the behavior is the same, and the resulting code is
    hopefully easier to read and understand.

    Fixes: 0fb7a01af5b0 "random: simplify accounting code"
    Signed-off-by: Theodore Ts'o
    Cc: Greg Price
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
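
The unsigned wrap-around described in the random commit above can be sketched as follows. This is only an illustration of the arithmetic, not the kernel's fix: the helper names are hypothetical, and all quantities are modeled as `size_t`.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Problematic shape: the subtraction wraps when reserved > have_bytes. */
size_t limit_wrapping(size_t ibytes, size_t have_bytes, size_t reserved)
{
    return min_size(ibytes, have_bytes - reserved);
}

/* Clamp the difference at zero before taking the minimum. */
size_t limit_clamped(size_t ibytes, size_t have_bytes, size_t reserved)
{
    size_t avail = have_bytes > reserved ? have_bytes - reserved : 0;

    return min_size(ibytes, avail);
}

/* Usage example (hypothetical values). */
void entropy_limit_example(void)
{
    assert(limit_wrapping(64, 8, 16) == 64); /* the reserve is ignored */
    assert(limit_clamped(64, 8, 16) == 0);   /* nothing may be taken */
    assert(limit_clamped(64, 32, 16) == 16); /* normal case */
}
```

In the wrapping variant the huge intermediate value makes the minimum always pick `ibytes`, so the reserve has no effect at all.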
     
  • commit ebe06187bf2aec10d537ce4595e416035367d703 upstream.

    This fixes use-after-free of epi->fllink.next inside list loop macro.
    This loop actually releases elements in the body. The list is
    rcu-protected but here we cannot hold rcu_read_lock because we need to
    lock mutex inside.

    The obvious solution is to use list_for_each_entry_safe(). RCU-ness
    isn't essential because nobody can change this list under us, it's final
    fput for this file.

    The bug was introduced by ae10b2b4eb01 ("epoll: optimize EPOLL_CTL_DEL
    using rcu")

    Signed-off-by: Konstantin Khlebnikov
    Reported-by: Cyrill Gorcunov
    Cc: Sasha Levin
    Cc: Jason Baron
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
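
The idea behind list_for_each_entry_safe() in the epoll commit above is to read the next pointer before the current element is released. A minimal userspace sketch with a hypothetical singly linked list (not the kernel's list_head API) might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct item {
    struct item *next;
};

/* Prepend a new item; on allocation failure the list is left unchanged. */
struct item *item_push(struct item *head)
{
    struct item *it = malloc(sizeof(*it));

    if (!it)
        return head;
    it->next = head;
    return it;
}

/* Free every item and return how many were released. */
size_t release_all(struct item *head)
{
    size_t released = 0;
    struct item *cur = head;

    while (cur) {
        struct item *next = cur->next; /* read before cur is freed */

        free(cur);
        cur = next;
        released++;
    }
    return released;
}

/* Usage example. */
void release_all_example(void)
{
    struct item *head = NULL;

    for (int i = 0; i < 3; i++)
        head = item_push(head);
    assert(release_all(head) == 3);
}
```

A loop that advanced with `cur = cur->next` after `free(cur)` would read freed memory, which is the use-after-free pattern the commit removes.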
     
  • commit 554086d85e71f30abe46fc014fea31929a7c6a8a upstream.

    The bad syscall nr paths are their own incomprehensible route
    through the entry control flow. Rearrange them to work just like
    syscalls that return -ENOSYS.

    This fixes an OOPS in the audit code when fast-path auditing is
    enabled and sysenter gets a bad syscall nr (CVE-2014-4508).

    This has probably been broken since Linux 2.6.27:
    af0575bba0 i386 syscall audit fast-path

    Cc: Roland McGrath
    Reported-by: Toralf Förster
    Signed-off-by: Andy Lutomirski
    Link: http://lkml.kernel.org/r/e09c499eade6fc321266dd6b54da7beb28d6991c.1403558229.git.luto@amacapital.net
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 4148c1f67abf823099b2d7db6851e4aea407f5ee upstream.

    There is one other possible overrun in the lz4 code as implemented by
    Linux at this point in time (which differs from the upstream lz4
    codebase, but will get synced in a future kernel release). As
    pointed out by Don, we also need to check the overflow in the data
    itself.

    While we are at it, replace the odd error return value with just a
    "simple" -1 value as the return value is never used for anything other
    than a basic "did this work or not" check.

    Reported-by: "Don A. Bailey"
    Reported-by: Willy Tarreau
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • commit 61b433579b6ffecb1d3534fd482dcd48535277c8 upstream.

    In case there are new LTK types in the future we shouldn't just blindly
    assume that != MGMT_LTK_UNAUTHENTICATED means that the key is
    authenticated. This patch adds explicit checks for each allowed key type
    in the form of a switch statement and skips any key which has an unknown
    value.

    Signed-off-by: Johan Hedberg
    Signed-off-by: Marcel Holtmann
    Signed-off-by: Greg Kroah-Hartman

    Johan Hedberg
     
  • commit d7b2545023ecfde94d3ea9c03c5480ac18da96c9 upstream.

    On the mgmt level we have a key type parameter which currently accepts
    two possible values: 0x00 for unauthenticated and 0x01 for
    authenticated. However, in the internal struct smp_ltk representation we
    have an explicit "authenticated" boolean value.

    To make this distinction clear, add defines for the possible mgmt values
    and do conversion to and from the internal authenticated value.

    Signed-off-by: Johan Hedberg
    Signed-off-by: Marcel Holtmann
    Signed-off-by: Greg Kroah-Hartman

    Johan Hedberg
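
The two Bluetooth commits above describe converting the mgmt key type to the internal "authenticated" boolean and skipping unknown types. A minimal sketch under stated assumptions: the values 0x00 and 0x01 come from the commit text, MGMT_LTK_AUTHENTICATED is an assumed name mirroring MGMT_LTK_UNAUTHENTICATED, and the helper function is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MGMT_LTK_UNAUTHENTICATED 0x00
#define MGMT_LTK_AUTHENTICATED   0x01 /* assumed name */

/*
 * Returns true and stores the internal flag for a known key type.
 * Returns false for any other value so the caller can skip the key.
 */
bool ltk_type_to_authenticated(uint8_t type, bool *authenticated)
{
    switch (type) {
    case MGMT_LTK_UNAUTHENTICATED:
        *authenticated = false;
        return true;
    case MGMT_LTK_AUTHENTICATED:
        *authenticated = true;
        return true;
    default:
        return false; /* unknown type: do not guess */
    }
}

/* Usage example. */
void ltk_type_example(void)
{
    bool auth = false;

    assert(ltk_type_to_authenticated(0x01, &auth) && auth);
    assert(ltk_type_to_authenticated(0x00, &auth) && !auth);
    assert(!ltk_type_to_authenticated(0x02, &auth));
}
```

The explicit `default` branch is the point of the design: a future key type is rejected rather than silently treated as authenticated.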
     
  • commit 3e2426bd0eb980648449e7a2f5a23e3cd3c7725c upstream.

    If this condition in end_extent_writepage() is false:

    if (tree->ops && tree->ops->writepage_end_io_hook)

    we will then test an uninitialized "ret" at:

    ret = ret < 0 ? ret : -EIO;

    The test for ret is for the case where ->writepage_end_io_hook
    failed, and we'd choose that ret as the error; but if
    there is no ->writepage_end_io_hook, nothing sets ret.

    Initializing ret to 0 should be sufficient; if
    writepage_end_io_hook wasn't set, (!uptodate) means
    non-zero err was passed in, so we choose -EIO in that case.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Eric Sandeen
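
The error selection described in the end_extent_writepage() commit above can be sketched in isolation. This is a simplified model, not the btrfs function: the hook is reduced to a flag plus its return value, and the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Pick the error to report for a finished page write. */
int writepage_status(bool have_hook, int hook_ret, bool uptodate)
{
    int ret = 0; /* defined even when no hook runs */

    if (have_hook) {
        ret = hook_ret;
        if (ret)
            uptodate = false;
    }
    if (!uptodate)
        ret = ret < 0 ? ret : -EIO; /* prefer the hook's error */
    return ret;
}

/* Usage example. */
void writepage_status_example(void)
{
    assert(writepage_status(false, 0, false) == -EIO);
    assert(writepage_status(true, -ENOSPC, true) == -ENOSPC);
    assert(writepage_status(false, 0, true) == 0);
}
```

With `ret` left uninitialized, the first case would test an indeterminate value instead of reliably choosing -EIO.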
     
  • commit 6eda71d0c030af0fc2f68aaa676e6d445600855b upstream.

    The skinny extents are interpreted incorrectly in scrub_print_warning(),
    and end up hitting the BUG() in btrfs_extent_inline_ref_size.

    Reported-by: Konstantinos Skarlatos
    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • commit cd857dd6bc2ae9ecea14e75a34e8a8fdc158e307 upstream.

    We want to make sure the pointer is still within the extent item, not to verify
    the memory it's pointing to.

    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • commit 8a56457f5f8fa7c2698ffae8545214c5b96a2cb5 upstream.

    The backref code was looking at nodes as well as leaves when we tried to
    populate extent item entries. This is not good, and although we got away with it
    for the most part because we'd skip where disk_bytenr != random_memory,
    sometimes random_memory would match and suddenly boom. This fixes that problem.
    Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Josef Bacik
     
  • commit 8321cf2596d283821acc466377c2b85bcd3422b7 upstream.

    There is otherwise a risk of a possible null pointer dereference.

    Was largely found by using a static code analysis program called cppcheck.

    Signed-off-by: Rickard Strandqvist
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Rickard Strandqvist
     
  • commit c1895442be01c58449e3bf9272f22062a670e08f upstream.

    We are currently allocating space_info objects in an array when we
    allocate space_info. When a user does something like:

    # btrfs balance start -mconvert=raid1 -dconvert=raid1 /mnt
    # btrfs balance start -mconvert=single -dconvert=single /mnt -f
    # btrfs balance start -mconvert=raid1 -dconvert=raid1 /

    We can end up with memory corruption since the kobject hasn't
    been reinitialized properly and the name pointer was left set.

    The rationale behind allocating them statically was to avoid
    creating a separate kobject container that just contained the
    raid type. It used the index in the array to determine the index.

    Ultimately, though, this wastes more memory than it saves in all
    but the most complex scenarios and introduces kobject lifetime
    questions.

    This patch allocates the kobjects dynamically instead. Note that
    we also remove the kobject_get/put of the parent kobject since
    kobject_add and kobject_del do that internally.

    Signed-off-by: Jeff Mahoney
    Reported-by: David Sterba
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Jeff Mahoney
     
  • commit 7e3ae33efad1490d01040f552ef50e58ed6376ca upstream.

    We were limiting the sum of the xattr name and value lengths to PATH_MAX,
    which is not correct, especially on filesystems created with btrfs-progs
    v3.12 or higher, where the default leaf size is max(16384, PAGE_SIZE), or
    systems with page sizes larger than 4096 bytes.

    Xattrs have their own specific maximum name and value lengths, which depend
    on the leaf size, therefore use these limits to be able to send xattrs with
    sizes larger than PATH_MAX.

    A test case for xfstests follows.

    Signed-off-by: Filipe David Borba Manana
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • commit 1af56070e3ef9477dbc7eba3b9ad7446979c7974 upstream.

    If we are doing an incremental send and the base snapshot has a
    directory with name X that doesn't exist anymore in the second
    snapshot and a new subvolume/snapshot exists in the second snapshot
    that has the same name as the directory (name X), the incremental
    send would fail with an -ENOENT error. This is because it attempts
    to look up an inode with a number matching the objectid of a
    root, which doesn't exist.

    Steps to reproduce:

    mkfs.btrfs -f /dev/sdd
    mount /dev/sdd /mnt

    mkdir /mnt/testdir
    btrfs subvolume snapshot -r /mnt /mnt/mysnap1

    rmdir /mnt/testdir
    btrfs subvolume create /mnt/testdir
    btrfs subvolume snapshot -r /mnt /mnt/mysnap2

    btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/send.data

    A test case for xfstests follows.

    Reported-by: Robert White
    Signed-off-by: Filipe David Borba Manana
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • commit 298658414a2f0bea1f05a81876a45c1cd96aa2e0 upstream.

    Seeding device support allows us to create a new filesystem
    based on an existing filesystem.

    However, the newly created filesystem's @total_devices should include seed
    devices. This patch fixes the following problem:

    # mkfs.btrfs -f /dev/sdb
    # btrfstune -S 1 /dev/sdb
    # mount /dev/sdb /mnt
    # btrfs device add -f /dev/sdc /mnt --->fs_devices->total_devices = 1
    # umount /mnt
    # mount /dev/sdc /mnt --->fs_devices->total_devices = 2

    This is because we record the right @total_devices in the superblock, but
    @fs_devices->total_devices is reset to 0 in btrfs_prepare_sprout().

    Fix this problem by not resetting @fs_devices->total_devices.

    Signed-off-by: Wang Shilong
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Wang Shilong
     
  • commit 5dca6eea91653e9949ce6eb9e9acab6277e2f2c4 upstream.

    According to commit 865ffef3797da2cac85b3354b5b6050dc9660978
    (fs: fix fsync() error reporting),
    it's not reliable to just check error pages because pages can be
    truncated or invalidated; we should also mark the mapping with an error
    flag so that a later fsync can catch the error.

    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • commit 29cc83f69c8338ff8fd1383c9be263d4bdf52d73 upstream.

    Same as normal devices, seed devices should be initialized with
    fs_info->dev_root as well, otherwise we'll get a NULL pointer crash.

    Cc: Chris Murphy
    Reported-by: Chris Murphy
    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • commit de348ee022175401e77d7662b7ca6e231a94e3fd upstream.

    In close_ctree(), after we have stopped all workers, there may still be
    some read requests (for example readahead) to submit, and this *may* trigger
    an oops that a user reported before:

    kernel BUG at fs/btrfs/async-thread.c:619!

    By hacking the code, I can reproduce this problem with one CPU available.
    We fix this potential problem by invalidating all btree inode pages before
    stopping all workers.

    Thanks to Miao for pointing out this problem.

    Signed-off-by: Wang Shilong
    Reviewed-by: David Sterba
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Wang Shilong
     
  • commit c992ec94f24c3e7135d6c23860615f269f0b1d87 upstream.

    If we have directories with a pending move/rename operation, we must take into
    account any orphan directories that got created before executing the pending
    move/rename. Those orphan directories are directories with an inode number higher
    than the current send progress and that don't exist in the parent snapshot; they
    are created before current progress reaches their inode number, with a generated
    name of the form oN-M-I and at the root of the filesystem tree, and later when
    progress matches their inode number, moved/renamed to their final location.

    Reproducer:

    $ mkfs.btrfs -f /dev/sdd
    $ mount /dev/sdd /mnt

    $ mkdir -p /mnt/a/b/c/d
    $ mkdir /mnt/a/b/e
    $ mv /mnt/a/b/c /mnt/a/b/e/CC
    $ mkdir /mnt/a/b/e/CC/d/f
    $ mkdir /mnt/a/g

    $ btrfs subvolume snapshot -r /mnt /mnt/snap1
    $ btrfs send /mnt/snap1 -f /tmp/base.send

    $ mkdir /mnt/a/g/h
    $ mv /mnt/a/b/e /mnt/a/g/h/EE
    $ mv /mnt/a/g/h/EE/CC/d /mnt/a/g/h/EE/DD

    $ btrfs subvolume snapshot -r /mnt /mnt/snap2
    $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

    The second receive command failed with the following error:

    ERROR: rename a/b/e/CC/d -> o264-7-0/EE/DD failed. No such file or directory

    A test case for xfstests follows soon.

    Signed-off-by: Filipe David Borba Manana
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • commit 32d6b47fe6fc1714d5f1bba1b9f38e0ab0ad58a8 upstream.

    If we fail to load a free space cache, we can rebuild it from the extent tree,
    so it is not a serious error and we should not output an error message that
    would make users uncomfortable. This patch uses a warning message
    instead.

    Signed-off-by: Miao Xie
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Miao Xie
     
  • commit 5a1972bd9fd4b2fb1bac8b7a0b636d633d8717e3 upstream.

    Btrfs sends a uevent to udev to inform it of the device change,
    but ctime/mtime for the block device inode is not updated, which causes
    libblkid, used by btrfs-progs, to miss the device change and use its old
    cache, so that 'btrfs dev scan; btrfs dev rmove; btrfs dev scan' gives an
    error message.

    Reported-by: Tsutomu Itoh
    Cc: Karel Zak
    Signed-off-by: Qu Wenruo
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Qu Wenruo
     
  • commit a1a50f60a6bf4f861eb94793420274bc1ccd409a upstream.

    In a previous change, commit 12870f1c9b2de7d475d22e73fd7db1b418599725,
    I accidentally moved the roundup of inode->i_size to outside of the
    critical section delimited by the inode mutex, which is not atomic and
    not correct since the size can be changed by another task before we acquire
    the mutex. Therefore fix it.

    Signed-off-by: Filipe David Borba Manana
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • commit 7d78874273463a784759916fc3e0b4e2eb141c70 upstream.

    We need to NULL the cached_state after freeing it, otherwise
    we might free it again if find_delalloc_range doesn't find anything.

    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Chris Mason
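
The double-free hazard in the cached_state commit above follows a common pattern: a pointer that stays set after free() can be freed again on a later path. A minimal userspace sketch, with a hypothetical struct and helper rather than the btrfs extent state API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct cached {
    int placeholder;
};

/* Free the cached object and clear the caller's pointer. */
void cached_release(struct cached **statep)
{
    free(*statep);  /* free(NULL) is a no-op */
    *statep = NULL; /* a second release is now harmless */
}

/* Usage example. */
void cached_release_example(void)
{
    struct cached *state = malloc(sizeof(*state));

    cached_release(&state);
    cached_release(&state); /* would be a double free without the reset */
    assert(state == NULL);
}
```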
     
  • commit fc19c5e73645f95d3eca12b4e91e7b56faf1e4a4 upstream.

    While running a stress test with multiple threads writing to the same btrfs
    file system, I ended up with a situation where a leaf was corrupted in that
    it had 2 file extent item keys that had the same exact key. I was able to
    detect this quickly thanks to the following patch which triggers an assertion
    as soon as a leaf is marked dirty if there are duplicated keys or out of order
    keys:

    Btrfs: check if items are ordered when a leaf is marked dirty
    (https://patchwork.kernel.org/patch/3955431/)

    Basically while running the test, I got the following in dmesg:

    [28877.415877] WARNING: CPU: 2 PID: 10706 at fs/btrfs/file.c:553 btrfs_drop_extent_cache+0x435/0x440 [btrfs]()
    (...)
    [28877.415917] Call Trace:
    [28877.415922] [] dump_stack+0x4e/0x68
    [28877.415926] [] warn_slowpath_common+0x8c/0xc0
    [28877.415929] [] warn_slowpath_null+0x1a/0x20
    [28877.415944] [] btrfs_drop_extent_cache+0x435/0x440 [btrfs]
    [28877.415949] [] ? kmem_cache_alloc+0xfe/0x1c0
    [28877.415962] [] fill_holes+0x229/0x3e0 [btrfs]
    [28877.415972] [] ? block_rsv_add_bytes+0x55/0x80 [btrfs]
    [28877.415984] [] btrfs_fallocate+0xb6b/0xc20 [btrfs]
    (...)
    [29854.132560] BTRFS critical (device sdc): corrupt leaf, bad key order: block=955232256,root=1, slot=24
    [29854.132565] BTRFS info (device sdc): leaf 955232256 total ptrs 40 free space 778
    (...)
    [29854.132637] item 23 key (3486 108 667648) itemoff 2694 itemsize 53
    [29854.132638] extent data disk bytenr 14574411776 nr 286720
    [29854.132639] extent data offset 0 nr 286720 ram 286720
    [29854.132640] item 24 key (3486 108 954368) itemoff 2641 itemsize 53
    [29854.132641] extent data disk bytenr 0 nr 0
    [29854.132643] extent data offset 0 nr 0 ram 0
    [29854.132644] item 25 key (3486 108 954368) itemoff 2588 itemsize 53
    [29854.132645] extent data disk bytenr 8699670528 nr 77824
    [29854.132646] extent data offset 0 nr 77824 ram 77824
    [29854.132647] item 26 key (3486 108 1146880) itemoff 2535 itemsize 53
    [29854.132648] extent data disk bytenr 8699670528 nr 77824
    [29854.132649] extent data offset 0 nr 77824 ram 77824
    (...)
    [29854.132707] kernel BUG at fs/btrfs/ctree.h:3901!
    (...)
    [29854.132771] Call Trace:
    [29854.132779] [] setup_items_for_insert+0x2dc/0x400 [btrfs]
    [29854.132791] [] __btrfs_drop_extents+0xba7/0xdd0 [btrfs]
    [29854.132794] [] ? trace_hardirqs_on_caller+0x16/0x1d0
    [29854.132797] [] ? trace_hardirqs_on+0xd/0x10
    [29854.132800] [] ? kmem_cache_alloc+0xfe/0x1c0
    [29854.132810] [] insert_reserved_file_extent.constprop.66+0xab/0x310 [btrfs]
    [29854.132820] [] __btrfs_prealloc_file_range+0x116/0x340 [btrfs]
    [29854.132830] [] btrfs_prealloc_file_range+0x23/0x30 [btrfs]
    (...)

    So this is caused by getting an -ENOSPC error while punching a file hole, more
    specifically, we get -ENOSPC error from __btrfs_drop_extents in the while loop
    of file.c:btrfs_punch_hole() when it's unable to modify the btree to delete one
    or more file extent items due to lack of enough free space. When this happens,
    in btrfs_punch_hole(), we attempt to reclaim free space by switching our transaction
    block reservation object to root->fs_info->trans_block_rsv, end our transaction and
    start a new transaction basically - and, we keep increasing our current offset
    (cur_offset) as long as it's smaller than the end of the target range (lockend) -
    this makes us leave the loop with cur_offset == drop_end which in turn makes us
    call fill_holes() for inserting a file extent item that represents a 0 bytes range
    hole (and this insertion succeeds, as in the meanwhile more space became available).

    This 0 bytes file hole extent item is a problem because any subsequent caller of
    __btrfs_drop_extents (regular file writes, or fallocate calls, for example), with a
    start file offset that is equal to the offset of the hole, will not remove this
    extent item due to the following conditional in the while loop of
    __btrfs_drop_extents:

    if (extent_end <= search_start) {
            path->slots[0]++;
            goto next_slot;
    }

    This later makes the call to setup_items_for_insert() (at the very end of
    __btrfs_drop_extents), insert a new file extent item with the same offset as
    the 0 bytes file hole extent item that follows it. Needless to say, this
    causes chaos, either when reading the leaf from disk (btree_readpage_end_io_hook),
    where we perform leaf sanity checks or in subsequent operations that manipulate
    file extent items, as in the fallocate call as shown by the dmesg trace above.

    Without my other patch to perform the leaf sanity checks once a leaf is marked
    as dirty (if the integrity checker is enabled), it would have been much harder
    to debug this issue.

    This change might fix a few similar issues reported by users in the mailing
    list regarding assertion failures in btrfs_set_item_key_safe calls performed
    by __btrfs_drop_extents, such as the following report:

    http://comments.gmane.org/gmane.comp.file-systems.btrfs/32938

    Asking fill_holes() to create a 0 bytes wide file hole item also produced the
    first warning in the trace above, as we passed a range to btrfs_drop_extent_cache
    that has an end smaller (by -1) than its start.

    On 3.14 kernels this issue manifests itself through leaf corruption, as we get
    duplicated file extent item keys in a leaf when calling setup_items_for_insert(),
    but on older kernels, setup_items_for_insert() isn't called by __btrfs_drop_extents(),
    instead we have callers of __btrfs_drop_extents(), namely the functions
    inode.c:insert_inline_extent() and inode.c:insert_reserved_file_extent(), calling
    btrfs_insert_empty_item() to insert the new file extent item, which would fail with
    error -EEXIST, instead of inserting a duplicated key - which is still a serious
    issue as it would make all similar file extent item replace operations keep
    failing if they target the same file range.

    Signed-off-by: Filipe David Borba Manana
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
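
The zero-length hole described in the commit above, including the range whose end is one less than its start, can be modeled with a small guard. This sketch is an assumption about the general shape of such a check, not the actual btrfs change; the struct and function are hypothetical, with `cur_offset` and `drop_end` named after the variables in the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct byte_range {
    uint64_t start;
    uint64_t end; /* inclusive */
};

/*
 * Describe the hole [cur_offset, drop_end). Returns false when the hole
 * is empty, so the caller inserts nothing instead of a 0 byte item.
 */
bool hole_range(uint64_t cur_offset, uint64_t drop_end, struct byte_range *out)
{
    if (cur_offset >= drop_end)
        return false;
    out->start = cur_offset;
    out->end = drop_end - 1;
    return true;
}

/* Usage example. */
void hole_range_example(void)
{
    struct byte_range r = { 0, 0 };

    assert(!hole_range(4096, 4096, &r)); /* empty hole is rejected */
    assert(hole_range(0, 4096, &r));
    assert(r.start == 0 && r.end == 4095);
}
```

Without the guard, the empty case would yield `end = start - 1`, matching the inverted range that triggered the first warning in the trace.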
     
  • commit 663a962151593c69374776e8651238d0da072459 upstream.

    Signed-off-by: Pavel Shilovsky
    Reviewed-by: Shirish Pargaonkar
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

    Pavel Shilovsky
     
  • commit edfbbf388f293d70bf4b7c0bc38774d05e6f711a upstream.

    A kernel memory disclosure was introduced in aio_read_events_ring() in v3.10
    by commit a31ad380bed817aa25f8830ad23e1a0480fef797. The changes made to
    aio_read_events_ring() failed to correctly limit the index into
    ctx->ring_pages[], allowing an attacker to cause the subsequent kmap() of
    an arbitrary page with a copy_to_user() to copy the contents into userspace.
    This vulnerability has been assigned CVE-2014-0206. Thanks to Mateusz and
    Petr for disclosing this issue.

    This patch applies to v3.12+. A separate backport is needed for 3.10/3.11.

    Signed-off-by: Benjamin LaHaise
    Cc: Mateusz Guzik
    Cc: Petr Matousek
    Cc: Kent Overstreet
    Cc: Jeff Moyer
    Signed-off-by: Greg Kroah-Hartman

    Benjamin LaHaise
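
The class of bug in the aio commit above is an index derived from a user-writable ring header being used without bounds. A minimal sketch of bounding such an index, assuming a simplified ring layout (the struct, its fields, and the function are hypothetical, and the real ring also has a header offset that is omitted here):

```c
#include <assert.h>
#include <stdbool.h>

struct ring_layout {
    unsigned nr_events;       /* total event slots in the ring */
    unsigned events_per_page; /* slots that fit in one page */
    unsigned nr_pages;        /* entries in the page array */
};

/* Map an untrusted head value to a valid page index, or fail. */
bool ring_page_for(const struct ring_layout *r, unsigned head, unsigned *page)
{
    unsigned idx;

    if (r->nr_events == 0 || r->events_per_page == 0)
        return false;
    head %= r->nr_events; /* never trust the raw value */
    idx = head / r->events_per_page;
    if (idx >= r->nr_pages)
        return false;
    *page = idx;
    return true;
}

/* Usage example (hypothetical sizes). */
void ring_page_example(void)
{
    struct ring_layout r = { 128, 32, 4 };
    unsigned page = 99;

    assert(ring_page_for(&r, 5, &page) && page == 0);
    assert(ring_page_for(&r, 100000, &page) && page == 1);
}
```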
     
  • commit f8567a3845ac05bb28f3c1b478ef752762bd39ef upstream.

    The aio cleanups and optimizations by kmo that were merged into the 3.10
    tree added a regression for userspace event reaping. Specifically, the
    reference counts are not decremented if the event is reaped in userspace,
    leading to the application being unable to submit further aio requests.
    This patch applies to 3.12+. A separate backport is required for 3.10/3.11.
    This issue was uncovered as part of CVE-2014-0206.

    Signed-off-by: Benjamin LaHaise
    Cc: Kent Overstreet
    Cc: Mateusz Guzik
    Cc: Petr Matousek
    Signed-off-by: Greg Kroah-Hartman

    Benjamin LaHaise
     
  • commit 1e77d0a1ed7417d2a5a52a7b8d32aea1833faa6c upstream.

    Till reported that the spurious interrupt detection of threaded
    interrupts is broken in two ways:

    - note_interrupt() is called for each action thread of a shared
    interrupt line. That's wrong as we are only interested in whether none
    of the device drivers felt responsible for the interrupt, but by
    calling multiple times for a single interrupt line we account
    IRQ_NONE even if one of the drivers felt responsible.

    - note_interrupt() when called from the thread handler is not
    serialized. That leaves the members of irq_desc which are used for
    the spurious detection unprotected.

    To solve this we need to defer the spurious detection of a threaded
    interrupt to the next hardware interrupt context where we have
    implicit serialization.

    If note_interrupt is called with action_ret == IRQ_WAKE_THREAD, we
    check whether the previous interrupt requested a deferred check. If
    not, we request a deferred check for the next hardware interrupt and
    return.

    If set, we check whether one of the interrupt threads signaled
    success. Depending on this information we feed the result into the
    spurious detector.

    If one primary handler of a shared interrupt returns IRQ_HANDLED we
    disable the deferred check of irq threads on the same line, as we have
    found at least one device driver who cared.

    Reported-by: Till Straumann
    Signed-off-by: Thomas Gleixner
    Tested-by: Austin Schuh
    Cc: Oliver Hartkopp
    Cc: Wolfgang Grandegger
    Cc: Pavel Pisa
    Cc: Marc Kleine-Budde
    Cc: linux-can@vger.kernel.org
    Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1303071450130.22263@ionos
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit 68986c9f0f4552c34c248501eb0c690553866d6e upstream.

    This reverts commit e1edf18b20076da83dd231dbd2146cbbc31c0b14.

    This patch was a misguided attempt at fixing offb for LE ppc64
    kernels on BE qemu but is just wrong ... it breaks real LE/LE
    setups, LE with real HW, and existing mixed endian systems
    that did the right thing with the appropriate device-tree
    property. Bad reviewing on my part, sorry.

    The right fix is to either make qemu change its endian when
    the guest changes endian (working on that) or to use the
    existing foreign endian support.

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Greg Kroah-Hartman

    Benjamin Herrenschmidt
     
  • commit 0690a229c69f40a6c9c459ab455c85df49822525 upstream.

    This caused reduced performance for some users with advanced post
    processing enabled. We need a better method to pick the
    UVD state based on the amount of post processing required or tune
    the advanced post processing to fit within the lower power state
    envelope.

    This reverts commit 14a9579ddbf15dd1992a9481a4ec80b0b91656d5.

    Signed-off-by: Greg Kroah-Hartman

    Alex Deucher
     
  • commit 7fd44dacdd803c0bbf38bf478d51d280902bb0f1 upstream.

    The io_setup takes a pointer to a context id of type aio_context_t.
    This in turn is typed to a __kernel_ulong_t. We could tweak the
    exported headers to define this as a 64bit quantity for specific
    ABIs, but since we already have a 32bit compat shim for the x86 ABI,
    let's just re-use that logic. The libaio package is also written to
    expect this as a pointer type, so a compat shim would simplify that.

    The io_submit func operates on an array of pointers to iocb structs.
    Padding out the array to be 64bit aligned is a huge pain, so convert
    it over to the existing compat shim too.

    We don't convert io_getevents to the compat func as its only purpose
    is to handle the timespec struct, and the x32 ABI uses 64bit times.

    With this change, the libaio package can now pass its testsuite when
    built for the x32 ABI.

    Signed-off-by: Mike Frysinger
    Link: http://lkml.kernel.org/r/1399250595-5005-1-git-send-email-vapier@gentoo.org
    Cc: H.J. Lu
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    Mike Frysinger
     
  • commit 246f2d2ee1d715e1077fc47d61c394569c8ee692 upstream.

    It is not safe to use LAR to filter when to go down the espfix path,
    because the LDT is per-process (rather than per-thread) and another
    thread might change the descriptors behind our back. Fortunately it
    is always *safe* (if a bit slow) to go down the espfix path, and a
    32-bit LDT stack segment is extremely rare.

    Signed-off-by: H. Peter Anvin
    Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin
     
  • commit e3a920afc3482e954834a4ed95908c4bc5e4c000 upstream.

    This should be a plain old '&' and could easily lead to undefined
    behaviour if the target of a pmd_mknotpresent invocation was the same
    as the parameter.

    Fixes: 9c7e535fcc17 (arm64: mm: Route pmd thp functions through pte equivalents)
    Signed-off-by: Will Deacon
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Will Deacon
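
The difference between `&` and `&=` in a value-returning macro, as in the arm64 commit above, can be shown with a stand-in macro. The names and the mask value below are hypothetical and do not reproduce the arm64 page-table definitions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TYPE_MASK UINT64_C(0x3)

/* Plain '&': computes a new value and leaves the argument alone. */
#define demo_mknotpresent(v) ((v) & ~DEMO_TYPE_MASK)

/* Usage example. */
void mknotpresent_example(void)
{
    uint64_t entry = UINT64_C(0x1003);
    uint64_t cleared = demo_mknotpresent(entry);

    assert(cleared == UINT64_C(0x1000));
    assert(entry == UINT64_C(0x1003)); /* argument untouched */

    entry = demo_mknotpresent(entry);  /* target == parameter is fine */
    assert(entry == UINT64_C(0x1000));
}
```

Had the macro used `&=`, the last assignment would expand to `entry = (entry &= ...)`, which modifies `entry` twice in one expression, and the first call would have changed `entry` as a side effect.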
     
  • commit f3a183cb422574014538017b5b291a416396f97e upstream.

    Arm64 does not define the dma_get_required_mask() function.
    Therefore, it should not define ARCH_HAS_DMA_GET_REQUIRED_MASK.
    This causes build errors in some device drivers (e.g. mpt2sas).

    Signed-off-by: Suravee Suthikulpanit
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Suravee Suthikulpanit
     
  • commit 34c65c43f1518bf85f93526ad373adc6a683b4c5 upstream.

    Whilst native arm64 applications don't have the 16-bit UID/GID syscalls
    wired up, compat tasks can still access them. The 16-bit wrappers for
    these syscalls use __kernel_old_uid_t and __kernel_old_gid_t, which must
    be 16-bit data types to maintain compatibility with the 16-bit UIDs used
    by compat applications.

    This patch defines 16-bit __kernel_old_{gid,uid}_t types for arm64
    instead of using the 32-bit types provided by asm-generic.

    Signed-off-by: Will Deacon
    Acked-by: Arnd Bergmann
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Will Deacon
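
Why the old UID/GID types must really be 16 bits wide, as the commit above explains, can be seen in a small conversion sketch. The type name, the helper, and the overflow value 65534 are assumptions for illustration and are not copied from the arm64 headers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t demo_old_uid_t; /* stand-in for a 16-bit old uid type */

#define DEMO_OVERFLOW_UID 65534u /* assumed overflow placeholder */

/* Narrow a 32-bit UID for a 16-bit interface. */
demo_old_uid_t to_old_uid(uint32_t uid)
{
    if (uid & ~UINT32_C(0xFFFF))
        return (demo_old_uid_t)DEMO_OVERFLOW_UID; /* does not fit */
    return (demo_old_uid_t)uid;
}

/* Usage example. */
void old_uid_example(void)
{
    assert(sizeof(demo_old_uid_t) == 2);
    assert(to_old_uid(1000) == 1000);
    assert(to_old_uid(100000) == 65534);
}
```

If the old type were 32 bits wide, a compat task reading a 16-bit field from the result would see a different layout than it expects.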
     
  • commit e47043aea3853a74a9aa5726a1faa916d7462ab7 upstream.

    The OpenBlocks AX3-4 has a non-DT bootloader. It also comes with 1GB of
    soldered on RAM, and a DIMM slot for expansion.

    Unfortunately, atags_to_fdt() doesn't work in big-endian mode, so we see
    the following failure when attempting to boot a big-endian kernel:

    686 slab pages
    17 pages shared
    0 pages swap cached
    [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
    Kernel panic - not syncing: Out of memory and no killable processes...

    CPU: 1 PID: 351 Comm: kworker/u4:0 Not tainted 3.15.0-rc8-next-20140603 #1
    [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
    [] (show_stack) from [] (dump_stack+0x78/0x94)
    [] (dump_stack) from [] (panic+0x90/0x21c)
    [] (panic) from [] (out_of_memory+0x320/0x340)
    [] (out_of_memory) from [] (__alloc_pages_nodemask+0x874/0x930)
    [] (__alloc_pages_nodemask) from [] (handle_mm_fault+0x744/0x96c)
    [] (handle_mm_fault) from [] (__get_user_pages+0xd0/0x4c0)
    [] (__get_user_pages) from [] (get_arg_page+0x54/0xbc)
    [] (get_arg_page) from [] (copy_strings+0x278/0x29c)
    [] (copy_strings) from [] (copy_strings_kernel+0x20/0x28)
    [] (copy_strings_kernel) from [] (do_execve+0x3a8/0x4c8)
    [] (do_execve) from [] (____call_usermodehelper+0x15c/0x194)
    [] (____call_usermodehelper) from [] (ret_from_fork+0x14/0x3c)
    CPU0: stopping
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.15.0-rc8-next-20140603 #1
    [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
    [] (show_stack) from [] (dump_stack+0x78/0x94)
    [] (dump_stack) from [] (handle_IPI+0x138/0x174)
    [] (handle_IPI) from [] (armada_370_xp_handle_irq+0xb0/0xcc)
    [] (armada_370_xp_handle_irq) from [] (__irq_svc+0x40/0x50)
    Exception stack(0xc0b6bf68 to 0xc0b6bfb0)
    bf60: e9fad598 00000000 00f509a3 00000000 c0b6a000 c0b724c4
    bf80: c0b72458 c0b6a000 00000000 00000000 c0b66da0 c0b6a000 00000000 c0b6bfb0
    bfa0: c027bb94 c027bb24 60000313 ffffffff
    [] (__irq_svc) from [] (cpu_startup_entry+0x54/0x214)
    [] (cpu_startup_entry) from [] (start_kernel+0x318/0x37c)
    [] (start_kernel) from [] (0x208078)
    ---[ end Kernel panic - not syncing: Out of memory and no killable processes...

    A similar failure will also occur if ARM_ATAG_DTB_COMPAT isn't selected.

    Fix this by setting a sane default (1 GB) in the dts file.

    Signed-off-by: Jason Cooper
    Tested-by: Kevin Hilman
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Greg Kroah-Hartman

    Jason Cooper
     
  • commit 2aea39eca6b68d6ae7eb545332df0695f56a3d3f upstream.

    If f2fs_write_data_page is called through the reclaim path, we should submit
    the bio right away.

    This patch resolves the following issue that Marc Dietrich reported.
    "It took me a while to bisect a problem which causes my ARM (tegra2) netbook to
    frequently stall for 5-10 seconds when I enable EXA acceleration (opentegra
    experimental ddx)."
    And this patch fixes that.

    Reported-by: Marc Dietrich
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Jaegeuk Kim