24 Mar, 2019

2 commits

  • [ Upstream commit 29b00e609960ae0fcff382f4c7079dd0874a5311 ]

    When we made the shmem_reserve_inode call in shmem_link conditional, we
    forgot to update the declaration for ret so that it always has a known
    value. Dan Carpenter pointed out this deficiency in the original patch.

    Fixes: 1062af920c07 ("tmpfs: fix link accounting when a tmpfile is linked in")
    Reported-by: Dan Carpenter
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Hugh Dickins
    Cc: Matej Kupljen
    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Darrick J. Wong
     
  • [ Upstream commit 1062af920c07f5b54cf5060fde3339da6df0cf6b ]

    tmpfs has a peculiarity of accounting hard links as if they were
    separate inodes: so that when the number of inodes is limited, as it is
    by default, a user cannot soak up an unlimited amount of unreclaimable
    dcache memory just by repeatedly linking a file.

    But when v3.11 added O_TMPFILE, and the ability to use linkat() on the
    fd, we missed accommodating this new case in tmpfs: "df -i" shows that
    an extra "inode" remains accounted after the file is unlinked and the fd
    closed and the actual inode evicted. If a user repeatedly links
    tmpfiles into a tmpfs, the limit will be hit (ENOSPC) even after they
    are deleted.

    Just skip the extra reservation from shmem_link() in this case: there's
    a sense in which this first link of a tmpfile is then cheaper than a
    hard link of another file, but the accounting works out, and there's
    still good limiting, so no need to do anything more complicated.

    Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1902182134370.7035@eggly.anvils
    Fixes: f4e0c30c191 ("allow the temp files created by open() to be linked to")
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Hugh Dickins
    Reported-by: Matej Kupljen
    Acked-by: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Darrick J. Wong
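
The accounting described in the two commits above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the `toy_*` names are hypothetical, the model folds "unlink" and "evict" into one step, and it is not the kernel's implementation. It shows why the first link of a tmpfile skips the extra reservation, and why `ret` has to start at a known value once the reservation becomes conditional.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a tmpfs superblock: only the inode budget is modelled. */
struct toy_sb {
    long free_inodes;
};

/* Toy stand-in for an inode: only the link count is modelled. */
struct toy_inode {
    unsigned int nlink;
};

/* Take one unit from the inode budget, or fail with -ENOSPC. */
static int toy_reserve_inode(struct toy_sb *sb)
{
    if (sb->free_inodes <= 0)
        return -ENOSPC;
    sb->free_inodes--;
    return 0;
}

/* Creating a file always reserves one unit; a tmpfile starts with no name. */
static int toy_create(struct toy_sb *sb, struct toy_inode *inode, bool tmpfile)
{
    int ret = toy_reserve_inode(sb);

    if (ret)
        return ret;
    inode->nlink = tmpfile ? 0 : 1;
    return 0;
}

/* Linking reserves an extra unit only if the inode already has a name. */
static int toy_link(struct toy_sb *sb, struct toy_inode *inode)
{
    int ret = 0; /* must be initialised: the reservation below may be skipped */

    if (inode->nlink) {
        ret = toy_reserve_inode(sb);
        if (ret)
            return ret;
    }
    inode->nlink++;
    return ret;
}

/* Simplification: every removed name hands one unit back to the budget. */
static void toy_unlink(struct toy_sb *sb, struct toy_inode *inode)
{
    assert(inode->nlink > 0);
    inode->nlink--;
    sb->free_inodes++;
}
```

In this model, creating a tmpfile, linking it once and unlinking it returns the budget to its starting value, while repeated hard links of an ordinary file still run into `-ENOSPC`.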
     

08 Dec, 2018

3 commits

  • commit dcf7fe9d89763a28e0f43975b422ff141fe79e43 upstream.

    Set the page dirty if VM_WRITE is not set because in such case the pte
    won't be marked dirty and the page would be reclaimed without writepage
    (i.e. swapout in the shmem case).

    This was found by source review. Most apps (certainly including QEMU)
    only use UFFDIO_COPY on PROT_READ|PROT_WRITE mappings or the app can't
    modify the memory in the first place. This is for correctness and it
    could help the non cooperative use case to avoid unexpected data loss.

    Link: http://lkml.kernel.org/r/20181126173452.26955-6-aarcange@redhat.com
    Reviewed-by: Hugh Dickins
    Cc: stable@vger.kernel.org
    Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
    Reported-by: Hugh Dickins
    Signed-off-by: Andrea Arcangeli
    Cc: "Dr. David Alan Gilbert"
    Cc: Jann Horn
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Peter Xu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andrea Arcangeli
     
  • commit e2a50c1f64145a04959df2442305d57307e5395a upstream.

    With MAP_SHARED: recheck the i_size after taking the PT lock, to
    serialize against truncate with the PT lock. Delete the page from the
    pagecache if the i_size_read check fails.

    With MAP_PRIVATE: check the i_size after the PT lock before mapping
    anonymous memory or zeropages into the MAP_PRIVATE shmem mapping.

    A mostly irrelevant cleanup: just as we do the delete_from_page_cache()
    pagecache removal after dropping the PT lock, drop the PT lock (a
    spinlock) before taking the sleepable page lock.

    Link: http://lkml.kernel.org/r/20181126173452.26955-5-aarcange@redhat.com
    Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
    Signed-off-by: Andrea Arcangeli
    Reviewed-by: Mike Rapoport
    Reviewed-by: Hugh Dickins
    Reported-by: Jann Horn
    Cc: "Dr. David Alan Gilbert"
    Cc: Mike Kravetz
    Cc: Peter Xu
    Cc: stable@vger.kernel.org
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andrea Arcangeli
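
The i_size recheck described in the commit above amounts to a bounds test on the page index, performed while holding the lock that serializes against truncate. A minimal user-space sketch follows, assuming a 4096-byte page and assuming `-EFAULT` as the failure code; the `toy_*` names are hypothetical and the locking itself is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size for the sketch */

/* Number of pages needed to cover i_size bytes, rounded up. */
static uint64_t toy_max_pgoff(uint64_t i_size)
{
    return (i_size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}

/*
 * Return 0 if page index pgoff lies inside a file of i_size bytes,
 * -EFAULT otherwise. In the real code this comparison only means
 * something if it is made after taking the lock that truncate also takes.
 */
static int toy_check_pgoff(uint64_t pgoff, uint64_t i_size)
{
    /* Keep the round-up above from wrapping around. */
    assert(i_size <= UINT64_MAX - TOY_PAGE_SIZE);
    return pgoff < toy_max_pgoff(i_size) ? 0 : -EFAULT;
}
```

A file of 4097 bytes spans two pages, so page index 1 passes the check; after a truncate to 4096 bytes the same index fails it.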
     
  • commit 9e368259ad988356c4c95150fafd1a06af095d98 upstream.

    Patch series "userfaultfd shmem updates".

    Jann found two bugs in the userfaultfd shmem MAP_SHARED backend: the
    lack of the VM_MAYWRITE check and the lack of i_size checks.

    Then looking into the above we also fixed the MAP_PRIVATE case.

    Hugh by source review also found a data loss source if UFFDIO_COPY is
    used on shmem MAP_SHARED PROT_READ mappings (the production usages
    incidentally run with PROT_READ|PROT_WRITE, so the data loss couldn't
    happen in those production usages like with QEMU).

    The whole patchset is marked for stable.

    We verified that QEMU postcopy live migration with the guest running on
    shmem MAP_PRIVATE runs as well after the shmem MAP_PRIVATE fix as before.
    Regardless if it's shmem or hugetlbfs or MAP_PRIVATE or MAP_SHARED, QEMU
    unconditionally invokes a punch hole if the guest mapping is filebacked
    and a MADV_DONTNEED too (needed to get rid of the MAP_PRIVATE COWs and
    for the anon backend).

    This patch (of 5):

    We internally used EFAULT to communicate with the caller; switch to
    ENOENT so that EFAULT can be used as a non-internal retval.

    Link: http://lkml.kernel.org/r/20181126173452.26955-2-aarcange@redhat.com
    Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
    Signed-off-by: Andrea Arcangeli
    Reviewed-by: Mike Rapoport
    Reviewed-by: Hugh Dickins
    Cc: Mike Kravetz
    Cc: Jann Horn
    Cc: Peter Xu
    Cc: "Dr. David Alan Gilbert"
    Cc: stable@vger.kernel.org
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andrea Arcangeli
     

06 Dec, 2018

2 commits

  • commit c1cb20d43728aa9b5393bd8d489bc85c142949b2 upstream.

    We changed the key of swap cache tree from swp_entry_t.val to
    swp_offset. We need to do so in shmem_replace_page() as well.

    Hugh said:
    "shmem_replace_page() has been wrong since the day I wrote it: good
    enough to work on swap "type" 0, which is all most people ever use
    (especially those few who need shmem_replace_page() at all), but
    broken once there are any non-0 swp_type bits set in the higher order
    bits"

    Link: http://lkml.kernel.org/r/20181121215442.138545-1-yuzhao@google.com
    Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
    Signed-off-by: Yu Zhao
    Reviewed-by: Matthew Wilcox
    Acked-by: Hugh Dickins
    Cc: [4.9+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Yu Zhao
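
Hugh's remark about swap "type" 0 in the commit above can be made concrete with a toy encoding. The layout below (5 type bits at the top of a 64-bit value) is an assumption made for illustration, not the kernel's actual layout, and the `toy_*` names are hypothetical. The point is only that the full entry value and the offset coincide when the type is 0 and diverge otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the top TOY_TYPE_BITS bits hold the swap type. */
#define TOY_TYPE_BITS   5
#define TOY_TYPE_SHIFT  (64 - TOY_TYPE_BITS)
#define TOY_OFFSET_MASK ((UINT64_C(1) << TOY_TYPE_SHIFT) - 1)

typedef struct {
    uint64_t val;
} toy_swp_entry_t;

/* Pack a swap type and an offset into one entry value. */
static toy_swp_entry_t toy_swp_entry(unsigned int type, uint64_t offset)
{
    toy_swp_entry_t e;

    assert(type < (1u << TOY_TYPE_BITS));
    assert(offset <= TOY_OFFSET_MASK);
    e.val = ((uint64_t)type << TOY_TYPE_SHIFT) | offset;
    return e;
}

/* Extract the swap type from the high bits. */
static unsigned int toy_swp_type(toy_swp_entry_t e)
{
    return (unsigned int)(e.val >> TOY_TYPE_SHIFT);
}

/* Extract the offset, which is what the lookup key should be. */
static uint64_t toy_swp_offset(toy_swp_entry_t e)
{
    return e.val & TOY_OFFSET_MASK;
}
```

With type 0 and offset 42 the entry value is 42, so a lookup keyed by the whole value happens to work. With type 1 the value carries high bits, and only `toy_swp_offset()` yields the right key.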
     
  • commit aaa52e340073b7f4593b3c4ddafcafa70cf838b5 upstream.

    Huge tmpfs testing on a shortish file mapped into a pmd-rounded extent
    hit shmem_evict_inode()'s WARN_ON(inode->i_blocks) followed by
    clear_inode()'s BUG_ON(inode->i_data.nrpages) when the file was later
    closed and unlinked.

    khugepaged's collapse_shmem() was forgetting to update mapping->nrpages
    on the rollback path, after it had added but then needs to undo some
    holes.

    There is indeed an irritating asymmetry between shmem_charge(), whose
    callers want it to increment nrpages after successfully accounting
    blocks, and shmem_uncharge(), when __delete_from_page_cache() already
    decremented nrpages itself: oh well, just add a comment on that to them
    both.

    And shmem_recalc_inode() is supposed to be called when the accounting is
    expected to be in balance (so it can deduce from imbalance that reclaim
    discarded some pages): so change shmem_charge() to update nrpages
    earlier (though it's rare for the difference to matter at all).

    Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261523450.2275@eggly.anvils
    Fixes: 800d8c63b2e98 ("shmem: add huge pages support")
    Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
    Signed-off-by: Hugh Dickins
    Acked-by: Kirill A. Shutemov
    Cc: Jerome Glisse
    Cc: Konstantin Khlebnikov
    Cc: Matthew Wilcox
    Cc: [4.8+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Hugh Dickins
     

01 Dec, 2018

1 commit

  • [ Upstream commit 1a413646931cb14442065cfc17561e50f5b5bb44 ]

    Other filesystems such as ext4, f2fs and ubifs all return ENXIO when
    lseek (SEEK_DATA or SEEK_HOLE) requests a negative offset.

    man 2 lseek says

    : EINVAL whence is not valid. Or: the resulting file offset would be
    : negative, or beyond the end of a seekable device.
    :
    : ENXIO whence is SEEK_DATA or SEEK_HOLE, and the file offset is beyond
    : the end of the file.

    Make tmpfs return ENXIO under these circumstances as well. After this,
    tmpfs also passes xfstests's generic/448.

    [akpm@linux-foundation.org: rewrite changelog]
    Link: http://lkml.kernel.org/r/1540434176-14349-1-git-send-email-yuyufen@huawei.com
    Signed-off-by: Yufen Yu
    Reviewed-by: Andrew Morton
    Cc: Al Viro
    Cc: Hugh Dickins
    Cc: William Kucharski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Yufen Yu
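
The behaviour the commit above asks for can be written as a small validation helper. This is a user-space sketch with a hypothetical name (`toy_seek_data_hole_start`), not the tmpfs code: it only checks the starting offset of a SEEK_DATA or SEEK_HOLE request against the file size.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Validate the starting offset of a SEEK_DATA / SEEK_HOLE request on a
 * file of `size` bytes. Returns the offset if the search may start
 * there, or -ENXIO if it is negative or at/after end of file.
 */
static int64_t toy_seek_data_hole_start(int64_t offset, int64_t size)
{
    assert(size >= 0);
    if (offset < 0)
        return -ENXIO; /* the case this commit addresses */
    if (offset >= size)
        return -ENXIO; /* beyond the end of the file */
    return offset;
}
```

Both out-of-range cases report the same error, which matches what the commit message says ext4, f2fs and ubifs return.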
     

21 Sep, 2018

1 commit

  • Directories and inodes don't necessarily need to be in the same lockdep
    class. For example, hugetlbfs splits them out too to prevent false
    positives in lockdep. Annotate correctly after new inode creation. If
    it's a directory inode, it will be put into a different class.

    This should fix a lockdep splat reported by syzbot:

    > ======================================================
    > WARNING: possible circular locking dependency detected
    > 4.18.0-rc8-next-20180810+ #36 Not tainted
    > ------------------------------------------------------
    > syz-executor900/4483 is trying to acquire lock:
    > 00000000d2bfc8fe (&sb->s_type->i_mutex_key#9){++++}, at: inode_lock
    > include/linux/fs.h:765 [inline]
    > 00000000d2bfc8fe (&sb->s_type->i_mutex_key#9){++++}, at:
    > shmem_fallocate+0x18b/0x12e0 mm/shmem.c:2602
    >
    > but task is already holding lock:
    > 0000000025208078 (ashmem_mutex){+.+.}, at: ashmem_shrink_scan+0xb4/0x630
    > drivers/staging/android/ashmem.c:448
    >
    > which lock already depends on the new lock.
    >
    > -> #2 (ashmem_mutex){+.+.}:
    > __mutex_lock_common kernel/locking/mutex.c:925 [inline]
    > __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073
    > mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088
    > ashmem_mmap+0x55/0x520 drivers/staging/android/ashmem.c:361
    > call_mmap include/linux/fs.h:1844 [inline]
    > mmap_region+0xf27/0x1c50 mm/mmap.c:1762
    > do_mmap+0xa10/0x1220 mm/mmap.c:1535
    > do_mmap_pgoff include/linux/mm.h:2298 [inline]
    > vm_mmap_pgoff+0x213/0x2c0 mm/util.c:357
    > ksys_mmap_pgoff+0x4da/0x660 mm/mmap.c:1585
    > __do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
    > __se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
    > __x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
    > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
    > entry_SYSCALL_64_after_hwframe+0x49/0xbe
    >
    > -> #1 (&mm->mmap_sem){++++}:
    > __might_fault+0x155/0x1e0 mm/memory.c:4568
    > _copy_to_user+0x30/0x110 lib/usercopy.c:25
    > copy_to_user include/linux/uaccess.h:155 [inline]
    > filldir+0x1ea/0x3a0 fs/readdir.c:196
    > dir_emit_dot include/linux/fs.h:3464 [inline]
    > dir_emit_dots include/linux/fs.h:3475 [inline]
    > dcache_readdir+0x13a/0x620 fs/libfs.c:193
    > iterate_dir+0x48b/0x5d0 fs/readdir.c:51
    > __do_sys_getdents fs/readdir.c:231 [inline]
    > __se_sys_getdents fs/readdir.c:212 [inline]
    > __x64_sys_getdents+0x29f/0x510 fs/readdir.c:212
    > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
    > entry_SYSCALL_64_after_hwframe+0x49/0xbe
    >
    > -> #0 (&sb->s_type->i_mutex_key#9){++++}:
    > lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
    > down_write+0x8f/0x130 kernel/locking/rwsem.c:70
    > inode_lock include/linux/fs.h:765 [inline]
    > shmem_fallocate+0x18b/0x12e0 mm/shmem.c:2602
    > ashmem_shrink_scan+0x236/0x630 drivers/staging/android/ashmem.c:455
    > ashmem_ioctl+0x3ae/0x13a0 drivers/staging/android/ashmem.c:797
    > vfs_ioctl fs/ioctl.c:46 [inline]
    > file_ioctl fs/ioctl.c:501 [inline]
    > do_vfs_ioctl+0x1de/0x1720 fs/ioctl.c:685
    > ksys_ioctl+0xa9/0xd0 fs/ioctl.c:702
    > __do_sys_ioctl fs/ioctl.c:709 [inline]
    > __se_sys_ioctl fs/ioctl.c:707 [inline]
    > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:707
    > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
    > entry_SYSCALL_64_after_hwframe+0x49/0xbe
    >
    > other info that might help us debug this:
    >
    > Chain exists of:
    > &sb->s_type->i_mutex_key#9 --> &mm->mmap_sem --> ashmem_mutex
    >
    > Possible unsafe locking scenario:
    >
    > CPU0 CPU1
    > ---- ----
    > lock(ashmem_mutex);
    > lock(&mm->mmap_sem);
    > lock(ashmem_mutex);
    > lock(&sb->s_type->i_mutex_key#9);
    >
    > *** DEADLOCK ***
    >
    > 1 lock held by syz-executor900/4483:
    > #0: 0000000025208078 (ashmem_mutex){+.+.}, at:
    > ashmem_shrink_scan+0xb4/0x630 drivers/staging/android/ashmem.c:448

    Link: http://lkml.kernel.org/r/20180821231835.166639-1-joel@joelfernandes.org
    Signed-off-by: Joel Fernandes (Google)
    Reported-by: syzbot
    Reviewed-by: NeilBrown
    Suggested-by: NeilBrown
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Joel Fernandes (Google)
     

24 Aug, 2018

1 commit

  • Use new return type vm_fault_t for fault handler. For now, this is just
    documenting that the function returns a VM_FAULT value rather than an
    errno. Once all instances are converted, vm_fault_t will become a
    distinct type.

    See commit 1c8f422059ae ("mm: change return type to vm_fault_t")

    The aim is to change the return type of finish_fault() and
    handle_mm_fault() to vm_fault_t. As part of that cleanup, the return
    types of all other recursively called functions have been changed to
    vm_fault_t.

    The places from which handle_mm_fault() is invoked will be changed to
    vm_fault_t as well, but in a separate patch.

    vmf_error() is the newly introduced inline function in 4.17-rc6.

    [akpm@linux-foundation.org: don't shadow outer local `ret' in __do_huge_pmd_anonymous_page()]
    Link: http://lkml.kernel.org/r/20180604171727.GA20279@jordon-HP-15-Notebook-PC
    Signed-off-by: Souptick Joarder
    Reviewed-by: Matthew Wilcox
    Reviewed-by: Andrew Morton
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
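
The distinction the commit above draws between a VM_FAULT value and an errno can be sketched in user space. Everything here is a stand-in: the `toy_*` names are hypothetical, the numeric fault codes are assumptions, and the mapping (out of memory becomes an OOM fault, anything else a SIGBUS fault) reflects my understanding of what a helper like vmf_error() does rather than a copy of it.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins; the numeric values are assumptions. */
typedef unsigned int toy_vm_fault_t;
#define TOY_VM_FAULT_OOM    0x1u
#define TOY_VM_FAULT_SIGBUS 0x2u

/* Translate a negative errno into a fault code. */
static toy_vm_fault_t toy_vmf_error(int err)
{
    assert(err < 0);
    if (err == -ENOMEM)
        return TOY_VM_FAULT_OOM;
    return TOY_VM_FAULT_SIGBUS;
}

/* Hypothetical page lookup that reports failure as a negative errno. */
static int toy_get_page(int simulated_err)
{
    return simulated_err;
}

/* Fault handler: its return type documents that it never returns an errno. */
static toy_vm_fault_t toy_fault(int simulated_err)
{
    int err = toy_get_page(simulated_err);

    if (err)
        return toy_vmf_error(err);
    return 0;
}
```

Giving the handler its own return type makes an accidental `return -ENOMEM;` stand out in review, and lets the compiler flag it once the type becomes distinct.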
     

23 Aug, 2018

1 commit

  • Zero out the vma at initialization rather than in vm_area_alloc(), to
    ensure that the various oddball stack-based vmas are in a good state.
    Some of the callers were zeroing them out, others were not.

    Acked-by: Kirill A. Shutemov
    Cc: Russell King
    Cc: Dmitry Vyukov
    Cc: Oleg Nesterov
    Cc: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
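
The problem with stack-based objects that some callers zero and others do not can be shown with a toy structure. This sketch uses hypothetical `toy_*` names and a made-up field list; it only illustrates why putting the zeroing inside the common init helper gives every caller the same starting state.

```c
#include <assert.h>
#include <string.h>

struct toy_mm {
    int id;
};

/* Made-up field list standing in for a much larger structure. */
struct toy_vma {
    struct toy_mm *mm;
    unsigned long start;
    unsigned long end;
    unsigned long flags;
    unsigned long page_prot;
};

/*
 * Common init helper: zero everything first, then set the fields that
 * need a non-zero value. Callers no longer have to remember the memset.
 */
static void toy_vma_init(struct toy_vma *vma, struct toy_mm *mm)
{
    assert(vma != NULL);
    memset(vma, 0, sizeof(*vma));
    vma->mm = mm;
}
```

A caller can declare `struct toy_vma vma;` on the stack, call `toy_vma_init(&vma, &mm)`, and rely on every field it does not set explicitly being zero.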
     

18 Aug, 2018

1 commit

  • get_seconds() is deprecated because it will lead to a 32-bit overflow in
    2038 or 2106. We don't need the i_generation to be strictly monotonic
    anyway, and other file systems like ext4 and xfs just use prandom_u32(),
    so let's use the same one here.

    If this is considered too slow, we could also use ktime_get_seconds() or
    ktime_get_real_seconds() to keep the previous behavior. Both of these
    return a time64_t and are not deprecated, but only return a unique value
    once per second, and are predictable.

    Link: http://lkml.kernel.org/r/20180620082556.581543-1-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Andrew Morton
    Cc: Hugh Dickins
    Cc: Mike Kravetz
    Cc: "Kirill A. Shutemov"
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
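
The two years named in the commit above follow from the largest values a 32-bit seconds counter can hold. The sketch below checks that arithmetic in user space; the `toy_*` helpers are hypothetical and a 64-bit `time_t` is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* UTC year of a count of seconds since the epoch; -1 if conversion fails. */
static int toy_utc_year(int64_t seconds)
{
    time_t t = (time_t)seconds;
    struct tm *tm;

    assert(sizeof(time_t) >= 8); /* the sketch assumes a 64-bit time_t */
    tm = gmtime(&t);
    return tm ? tm->tm_year + 1900 : -1;
}

/* What an unsigned 32-bit seconds counter would store (wraps modulo 2^32). */
static uint32_t toy_seconds_u32(int64_t seconds)
{
    return (uint32_t)seconds;
}
```

`toy_utc_year(INT32_MAX)` gives 2038, the last year a signed 32-bit counter can represent, and `toy_utc_year(UINT32_MAX)` gives 2106 for the unsigned case. One second later, `toy_seconds_u32()` wraps back to 0.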
     

15 Aug, 2018

1 commit

  • Pull block updates from Jens Axboe:
    "First pull request for this merge window, there will also be a
    followup request with some stragglers.

    This pull request contains:

    - Fix for a thundering herd issue in the wbt block code (Anchal
    Agarwal)

    - A few NVMe pull requests:
    * Improved tracepoints (Keith)
    * Larger inline data support for RDMA (Steve Wise)
    * RDMA setup/teardown fixes (Sagi)
    * Effects log support for NVMe target (Chaitanya Kulkarni)
    * Buffered IO support for NVMe target (Chaitanya Kulkarni)
    * TP4004 (ANA) support (Christoph)
    * Various NVMe fixes

    - Block io-latency controller support. Much needed support for
    properly containing block devices. (Josef)

    - Series improving how we handle sense information on the stack
    (Kees)

    - Lightnvm fixes and updates/improvements (Mathias/Javier et al)

    - Zoned device support for null_blk (Matias)

    - AIX partition fixes (Mauricio Faria de Oliveira)

    - DIF checksum code made generic (Max Gurtovoy)

    - Add support for discard in iostats (Michael Callahan / Tejun)

    - Set of updates for BFQ (Paolo)

    - Removal of async write support for bsg (Christoph)

    - Bio page dirtying and clone fixups (Christoph)

    - Set of bcache fix/changes (via Coly)

    - Series improving blk-mq queue setup/teardown speed (Ming)

    - Series improving merging performance on blk-mq (Ming)

    - Lots of other fixes and cleanups from a slew of folks"

    * tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block: (190 commits)
    blkcg: Make blkg_root_lookup() work for queues in bypass mode
    bcache: fix error setting writeback_rate through sysfs interface
    null_blk: add lock drop/acquire annotation
    Blk-throttle: reduce tail io latency when iops limit is enforced
    block: paride: pd: mark expected switch fall-throughs
    block: Ensure that a request queue is dissociated from the cgroup controller
    block: Introduce blk_exit_queue()
    blkcg: Introduce blkg_root_lookup()
    block: Remove two superfluous #include directives
    blk-mq: count the hctx as active before allocating tag
    block: bvec_nr_vecs() returns value for wrong slab
    bcache: trivial - remove tailing backslash in macro BTREE_FLAG
    bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section
    bcache: set max writeback rate when I/O request is idle
    bcache: add code comments for bset.c
    bcache: fix mistaken comments in request.c
    bcache: fix mistaken code comments in bcache.h
    bcache: add a comment in super.c
    bcache: avoid unncessary cache prefetch bch_btree_node_get()
    bcache: display rate debug parameters to 0 when writeback is not running
    ...

    Linus Torvalds
     

14 Aug, 2018

1 commit

  • Pull vfs open-related updates from Al Viro:

    - "do we need fput() or put_filp()" rules are gone - it's always fput()
    now. We keep track of that state where it belongs - in ->f_mode.

    - int *opened mess killed - in finish_open(), in ->atomic_open()
    instances and in fs/namei.c code around do_last()/lookup_open()/atomic_open().

    - alloc_file() wrappers with saner calling conventions are introduced
    (alloc_file_clone() and alloc_file_pseudo()); callers converted, with
    much simplification.

    - while we are at it, saner calling conventions for path_init() and
    link_path_walk(), simplifying things inside fs/namei.c (both on
    open-related paths and elsewhere).

    * 'work.open3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
    few more cleanups of link_path_walk() callers
    allow link_path_walk() to take ERR_PTR()
    make path_init() unconditionally paired with terminate_walk()
    document alloc_file() changes
    make alloc_file() static
    do_shmat(): grab shp->shm_file earlier, switch to alloc_file_clone()
    new helper: alloc_file_clone()
    create_pipe_files(): switch the first allocation to alloc_file_pseudo()
    anon_inode_getfile(): switch to alloc_file_pseudo()
    hugetlb_file_setup(): switch to alloc_file_pseudo()
    ocxlflash_getfile(): switch to alloc_file_pseudo()
    cxl_getfile(): switch to alloc_file_pseudo()
    ... and switch shmem_file_setup() to alloc_file_pseudo()
    __shmem_file_setup(): reorder allocations
    new wrapper: alloc_file_pseudo()
    kill FILE_{CREATED,OPENED}
    switch atomic_open() and lookup_open() to returning 0 in all success cases
    document ->atomic_open() changes
    ->atomic_open(): return 0 in all success cases
    get rid of 'opened' in path_openat() and the helpers downstream
    ...

    Linus Torvalds
     

27 Jul, 2018

1 commit

  • Make sure to initialize all VMAs properly, not only those which come
    from vm_area_cachep.

    Link: http://lkml.kernel.org/r/20180724121139.62570-3-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov
    Acked-by: Linus Torvalds
    Reviewed-by: Andrew Morton
    Cc: Dmitry Vyukov
    Cc: Oleg Nesterov
    Cc: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

12 Jul, 2018

3 commits


09 Jul, 2018

1 commit

  • Memory allocations can induce swapping via kswapd or direct reclaim. If
    we are having IO done for us by kswapd and don't actually go into direct
    reclaim we may never get scheduled for throttling. So instead check to
    see if our cgroup is congested, and if so schedule the throttling.
    Before we return to user space the throttling stuff will only throttle
    if we actually required it.

    Signed-off-by: Tejun Heo
    Signed-off-by: Josef Bacik
    Acked-by: Johannes Weiner
    Acked-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Tejun Heo
     

15 Jun, 2018

1 commit

  • mm/*.c files use symbolic and octal styles for permissions.

    Using octal and not symbolic permissions is preferred by many as more
    readable.

    https://lkml.org/lkml/2016/8/2/1945

    Prefer the direct use of octal for permissions.

    Done using
    $ scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace mm/*.c
    and some typing.

    Before: $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
    44
    After: $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
    86

    Miscellanea:

    o Whitespace neatening around these conversions.

    Link: http://lkml.kernel.org/r/2e032ef111eebcd4c5952bae86763b541d373469.1522102887.git.joe@perches.com
    Signed-off-by: Joe Perches
    Acked-by: David Rientjes
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
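
The equivalence behind the conversion in the commit above can be checked in user space with the POSIX permission macros from `<sys/stat.h>`. The kernel's own shorthand macros are not available there, so this sketch spells the symbolic forms out; the `toy_*` helpers are hypothetical.

```c
#include <assert.h>
#include <sys/stat.h>

/* Symbolic spelling of "readable by user, group and other". */
static unsigned int toy_symbolic_read_all(void)
{
    return S_IRUSR | S_IRGRP | S_IROTH;
}

/* Symbolic spelling of "readable and writable by the owner only". */
static unsigned int toy_symbolic_rw_user(void)
{
    return S_IRUSR | S_IWUSR;
}

/* Render the nine permission bits of `mode` as an "rwxrwxrwx" string. */
static void toy_mode_string(unsigned int mode, char out[10])
{
    static const char flags[] = "rwxrwxrwx";
    int i;

    assert((mode & ~0777u) == 0);
    for (i = 0; i < 9; i++)
        out[i] = (mode & (1u << (8 - i))) ? flags[i] : '-';
    out[9] = '\0';
}
```

`toy_symbolic_read_all()` equals `0444` and `toy_symbolic_rw_user()` equals `0600`. The octal digits map one-to-one onto the user, group and other triplets, which is the readability argument made in the commit message.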
     

08 Jun, 2018

8 commits

  • shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.

    The pseudo vma doesn't have vm_page_prot set. We are going to encode
    encryption KeyID in vm_page_prot. Having garbage there causes problems.

    Zero out all unused fields in the pseudo vma.

    Link: http://lkml.kernel.org/r/20180531135602.20321-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov
    Reviewed-by: Andrew Morton
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • Use new return type vm_fault_t for fault handler. For now, this is just
    documenting that the function returns a VM_FAULT value rather than an
    errno. Once all instances are converted, vm_fault_t will become a
    distinct type.

    See commit 1c8f422059ae ("mm: change return type to vm_fault_t")

    vmf_error() is the newly introduced inline function in 4.17-rc6.

    Link: http://lkml.kernel.org/r/20180521202410.GA17912@jordon-HP-15-Notebook-PC
    Signed-off-by: Souptick Joarder
    Reviewed-by: Matthew Wilcox
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
     
  • tmpfs uses the helper d_find_alias() to find a dentry from a decoded
    inode, but d_find_alias() skips unhashed dentries, so unlinked files
    cannot be decoded from a file handle.

    This can be reproduced using xfstests test program open_by_handle:

    $ open_by_handle -c /tmp/testdir
    $ open_by_handle -dk /tmp/testdir
    open_by_handle(/tmp/testdir/file000000) returned 116 incorrectly on an unlinked open file!

    To fix this, if d_find_alias() can't find a hashed alias, call
    d_find_any_alias() to return an unhashed one.

    Link: http://lkml.kernel.org/r/CAOQ4uxg+qSLP0KwdW+h1tcPqOCQd+_pGZVXiePQB1TXCMBMctQ@mail.gmail.com
    Signed-off-by: Amir Goldstein
    Reviewed-by: NeilBrown
    Cc: Hugh Dickins
    Cc: Jeff Layton
    Cc: "J. Bruce Fields"
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Amir Goldstein
     
  • Since tmpfs THP was supported in 4.8, hugetlbfs is not the only
    filesystem with huge page support anymore. tmpfs can use huge pages via
    THP when mounted with the "huge=" mount option.

    When applications use huge pages on hugetlbfs, they just need to check
    the filesystem magic number, but that is not enough for tmpfs. Make
    stat.st_blksize return the huge page size if tmpfs is mounted with an
    appropriate "huge=" option, to give applications a hint to optimize
    their behavior with THP.

    Some applications may not use THP wisely. For example, QEMU may mmap a
    file at a non-huge-page-aligned hint address with MAP_FIXED, which
    results in no pages being PMD mapped even though THP is used. Some
    applications may mmap a file at a non-huge-page-aligned offset. Both
    behaviors make THP pointless.

    statfs.f_bsize still returns 4KB for tmpfs since THP could be split, and
    it may also fall back to 4KB pages silently if there are not enough huge
    pages. Furthermore, a different f_bsize makes the max_blocks and
    free_blocks calculations harder without much benefit. Returning the huge
    page size via stat.st_blksize sounds good enough.

    Since PUD-size huge pages for THP are not yet supported, it just
    returns HPAGE_PMD_SIZE for now.

    Hugh said:

    : Sorry, I have no enthusiasm for this patch; but do I feel strongly
    : enough to override you and everyone else to NAK it? No, I don't feel
    : that strongly, maybe st_blksize isn't worth arguing over.
    :
    : We did look at struct stat when designing huge tmpfs, to see if there
    : were any fields that should be adjusted for it; but concluded none.
    : Yes, it would sometimes be nice to have a quickly accessible indicator
    : for when tmpfs has been mounted huge (scanning /proc/mounts for options
    : can be tiresome, agreed); but since tmpfs tries to supply huge (or not)
    : pages transparently, no difference seemed right.
    :
    : So, because st_blksize is a not very useful field of struct stat, with
    : "size" in the name, we're going to put HPAGE_PMD_SIZE in there instead
    : of PAGE_SIZE, if the tmpfs was mounted with one of the huge "huge"
    : options (force or always, okay; within_size or advise, not so much).
    : Though HPAGE_PMD_SIZE is no more its "preferred I/O size" or "blocksize
    : for file system I/O" than PAGE_SIZE was.
    :
    : Which we can expect to speed up some applications and disadvantage
    : others, depending on how they interpret st_blksize: just like if we
    : changed it in the same way on non-huge tmpfs. (Did I actually try
    : changing st_blksize early on, and find it broke something? If so, I've
    : now forgotten what, and a search through commit messages didn't find
    : it; but I guess we'll find out soon enough.)
    :
    : If there were an mstat() syscall, returning a field "preferred
    : alignment", then we could certainly agree to put HPAGE_PMD_SIZE in
    : there; but in stat()'s st_blksize? And what happens when (in future)
    : mm maps this or that hard-disk filesystem's blocks with a pmd mapping -
    : should that filesystem then advertise a bigger st_blksize, despite the
    : same disk layout as before? What happens with DAX?
    :
    : And this change is not going to help the QEMU suboptimality that
    : brought you here (or does QEMU align mmaps according to st_blksize?).
    : QEMU ought to work well with kernels without this change, and kernels
    : with this change; and I hope it can easily deal with both by avoiding
    : that use of MAP_FIXED which prevented the kernel's intended alignment.

    [akpm@linux-foundation.org: remove unneeded `else']
    Link: http://lkml.kernel.org/r/1524665633-83806-1-git-send-email-yang.shi@linux.alibaba.com
    Signed-off-by: Yang Shi
    Suggested-by: Christoph Hellwig
    Reviewed-by: Christoph Hellwig
    Acked-by: Kirill A. Shutemov
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Shi
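
An application that wants to act on the st_blksize hint discussed in the commit above could read it with stat() and align its mmap offsets to it. The sketch below is one possible way to do that, with hypothetical `toy_*` names; whether st_blksize is the right field to trust is exactly what Hugh's quoted reply questions, so treat it as an illustration rather than a recommendation.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/stat.h>

/* Return st_blksize for `path`, or 0 if it cannot be determined. */
static uint64_t toy_blksize_hint(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0 || st.st_blksize <= 0)
        return 0;
    return (uint64_t)st.st_blksize;
}

/* Round `offset` up to a multiple of `align`, which must be a power of two. */
static uint64_t toy_align_up(uint64_t offset, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}
```

With a hint of 2 MiB (a common PMD size, used here only as an example), `toy_align_up(1, 2097152)` yields 2097152, so a mapping can start on a huge page boundary.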
     
  • With the addition of memfd hugetlbfs support, we now have the situation
    where memfd depends on TMPFS -or- HUGETLBFS. Previously, memfd was only
    supported on tmpfs, so it made sense that the code resided in shmem.c.
    In the current code, memfd is only functional if TMPFS is defined. If
    HUGETLBFS is defined and TMPFS is not defined, then memfd functionality
    will not be available for hugetlbfs. This does not cause BUGs, just a
    lack of potentially desired functionality.

    Code is restructured in the following way:
    - include/linux/memfd.h is a new file containing memfd specific
    definitions previously contained in shmem_fs.h.
    - mm/memfd.c is a new file containing memfd specific code previously
    contained in shmem.c.
    - memfd specific code is removed from shmem_fs.h and shmem.c.
    - A new config option MEMFD_CREATE is added that is defined if TMPFS
    or HUGETLBFS is defined.

    No functional changes are made to the code: restructuring only.

    Link: http://lkml.kernel.org/r/20180415182119.4517-4-mike.kravetz@oracle.com
    Signed-off-by: Mike Kravetz
    Reviewed-by: Khalid Aziz
    Cc: Andrea Arcangeli
    Cc: David Herrmann
    Cc: Hugh Dickins
    Cc: Marc-André Lureau
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • In preparation for memfd code restructure, update comments, definitions
    and function names dealing with file sealing to indicate that tmpfs and
    hugetlbfs are the supported filesystems. Also, change file pointer
    checks in memfd_file_seals_ptr to use defined interfaces instead of
    directly referencing file_operation structs.

    Link: http://lkml.kernel.org/r/20180415182119.4517-3-mike.kravetz@oracle.com
    Signed-off-by: Mike Kravetz
    Reviewed-by: Khalid Aziz
    Cc: Andrea Arcangeli
    Cc: David Herrmann
    Cc: Hugh Dickins
    Cc: Marc-André Lureau
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • Patch series "restructure memfd code", v4.

    This patch (of 3):

    In preparation for memfd code restructure, clean up sparse warnings.
    Most changes required adding __rcu annotations. The routine
    find_swap_entry was modified to properly dereference radix tree entries.

    Link: http://lkml.kernel.org/r/20180415182119.4517-2-mike.kravetz@oracle.com
    Signed-off-by: Mike Kravetz
    Signed-off-by: Matthew Wilcox
    Cc: Hugh Dickins
    Cc: Andrea Arcangeli
    Cc: Michal Hocko
    Cc: Marc-André Lureau
    Cc: David Herrmann
    Cc: Khalid Aziz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • Patch series "mm, memcontrol: Implement memory.swap.events", v2.

    This patchset implements memory.swap.events which contains max and fail
    events so that userland can monitor and respond to swap running out.

    This patch (of 2):

    get_swap_page() is always followed by mem_cgroup_try_charge_swap().
    This patch moves mem_cgroup_try_charge_swap() into get_swap_page() and
    makes get_swap_page() call the function even after swap allocation
    failure.

    This simplifies the callers and consolidates memcg related logic and
    will ease adding swap related memcg events.
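
    The consolidation described above can be sketched in user-space C. This
    is only an illustration: struct pool, charge() and alloc_slot() are
    hypothetical stand-ins, not the kernel's get_swap_page() or
    mem_cgroup_try_charge_swap(). The point is that the allocator calls the
    accounting hook itself, on success and on failure, so callers no longer
    need to pair the two calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical stand-ins, not kernel interfaces. */
struct pool {
    int free_slots;    /* slots still available */
    int charge_calls;  /* how often the accounting hook ran */
    int failures;      /* allocation failures seen by the hook */
};

/* Accounting hook: runs whether or not a slot was obtained. */
static void charge(struct pool *p, bool got_slot)
{
    p->charge_calls++;
    if (!got_slot)
        p->failures++;
}

/*
 * Allocate a slot and do the accounting in one place.
 * Returns the slot index, or -1 when the pool is exhausted.
 */
int alloc_slot(struct pool *p)
{
    int slot = -1;

    assert(p != NULL);
    if (p->free_slots > 0)
        slot = --p->free_slots;
    /* Called even after a failed allocation, so failures are visible. */
    charge(p, slot >= 0);
    return slot;
}
```

    Because the hook also sees failed allocations, a later change could
    count them as events without touching any caller.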

    Link: http://lkml.kernel.org/r/20180416230934.GH1911913@devbig577.frc2.facebook.com
    Signed-off-by: Tejun Heo
    Reviewed-by: Andrew Morton
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Vladimir Davydov
    Cc: Roman Gushchin
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

12 Apr, 2018

1 commit

  • Remove the address_space ->tree_lock and use the xa_lock newly added to
    the radix_tree_root. Rename the address_space ->page_tree to ->i_pages,
    since we don't really care that it's a tree.

    [willy@infradead.org: fix nds32, fs/dax.c]
    Link: http://lkml.kernel.org/r/20180406145415.GB20605@bombadil.infradead.org
    Link: http://lkml.kernel.org/r/20180313132639.17387-9-willy@infradead.org
    Signed-off-by: Matthew Wilcox
    Acked-by: Jeff Layton
    Cc: Darrick J. Wong
    Cc: Dave Chinner
    Cc: Ryusuke Konishi
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

06 Apr, 2018

1 commit

  • This patch makes do_swap_page() not need to be aware of two different
    swap readahead algorithms. It simply unifies the cluster-based and
    vma-based readahead function calls.

    Link: http://lkml.kernel.org/r/1509520520-32367-3-git-send-email-minchan@kernel.org
    Link: http://lkml.kernel.org/r/20180220085249.151400-3-minchan@kernel.org
    Signed-off-by: Minchan Kim
    Reviewed-by: Andrew Morton
    Cc: Hugh Dickins
    Cc: Huang Ying
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     

23 Mar, 2018

1 commit

  • shmem_unused_huge_shrink() gets called from the reclaim path. Waiting
    for the page lock may lead to deadlock there.

    There was a bug report that may be attributed to this:

    http://lkml.kernel.org/r/alpine.LRH.2.11.1801242349220.30642@mail.ewheeler.net

    Replace lock_page() with trylock_page() and skip the page if we failed
    to lock it. We will get to the page on the next scan.

    We can test for PageTransHuge() outside the page lock, as we only
    need protection against splitting the page under us. Holding a pin on
    the page is enough for this.
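
    The try-lock-and-skip pattern can be sketched in portable C11. This is
    a user-space analogy under stated assumptions, not the kernel code:
    struct item and scan_items() are hypothetical, and an atomic_flag
    stands in for the page lock that trylock_page() would take.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a page that carries its own lock bit. */
struct item {
    atomic_flag lock;  /* set while the item is locked */
    int processed;
};

/*
 * Scan items without ever blocking on a lock. An item whose lock is
 * busy is skipped and left for the next scan. Returns the number of
 * items processed in this pass.
 */
size_t scan_items(struct item *items, size_t n)
{
    size_t done = 0;

    assert(items != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* test_and_set returns the old value: true means already locked. */
        if (atomic_flag_test_and_set(&items[i].lock))
            continue;  /* busy: skip now, retry on the next scan */
        items[i].processed = 1;
        done++;
        atomic_flag_clear(&items[i].lock);  /* unlock */
    }
    return done;
}
```

    A scanner written this way cannot deadlock on a lock held further up
    its own call chain; the cost is that a busy item is handled one scan
    later.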

    Link: http://lkml.kernel.org/r/20180316210830.43738-1-kirill.shutemov@linux.intel.com
    Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
    Signed-off-by: Kirill A. Shutemov
    Reported-by: Eric Wheeler
    Acked-by: Michal Hocko
    Reviewed-by: Andrew Morton
    Cc: Tetsuo Handa
    Cc: Hugh Dickins
    Cc: [4.8+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

01 Feb, 2018

3 commits

  • Adapt add_seals()/get_seals() to work with hugetlbfs-backed memory.

    Teach memfd_create() to allow sealing operations on MFD_HUGETLB.

    Link: http://lkml.kernel.org/r/20171107122800.25517-6-marcandre.lureau@redhat.com
    Signed-off-by: Marc-André Lureau
    Reviewed-by: Mike Kravetz
    Cc: Andrea Arcangeli
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Cc: David Herrmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marc-André Lureau
     
  • Those functions are called for memfd files, backed by shmem or hugetlb
    (the next patches will handle hugetlb).

    Link: http://lkml.kernel.org/r/20171107122800.25517-3-marcandre.lureau@redhat.com
    Signed-off-by: Marc-André Lureau
    Reviewed-by: Mike Kravetz
    Cc: Andrea Arcangeli
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Cc: David Herrmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marc-André Lureau
     
  • Patch series "memfd: add sealing to hugetlb-backed memory", v3.

    Recently, Mike Kravetz added hugetlbfs support to memfd. However, he
    didn't add sealing support. One of the reasons to use memfd is to have
    shared memory sealing when doing IPC or sharing memory with another
    process with some extra safety. qemu uses shared memory & hugetables
    with vhost-user (used by dpdk), so it is reasonable to use memfd now
    instead for convenience and security reasons.

    This patch (of 9):

    The functions are called through shmem_fcntl() only. There is no danger
    in removing the EXPORTs, as the routines only work with shmem file structs.

    Link: http://lkml.kernel.org/r/20171107122800.25517-2-marcandre.lureau@redhat.com
    Signed-off-by: Marc-André Lureau
    Reviewed-by: Mike Kravetz
    Cc: Andrea Arcangeli
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Cc: David Herrmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marc-André Lureau
     

28 Nov, 2017

1 commit

  • This is a pure automated search-and-replace of the internal kernel
    superblock flags.

    The s_flags are now called SB_*, with the names and the values for the
    moment mirroring the MS_* flags that they're equivalent to.

    Note how the MS_xyz flags are the ones passed to the mount system call,
    while the SB_xyz flags are what we then use in sb->s_flags.

    The script to do this was:

    # places to look in; re security/*: it generally should *not* be
    # touched (that stuff parses mount(2) arguments directly), but
    # there are two places where we really deal with superblock flags.
    FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
    include/linux/fs.h include/uapi/linux/bfs_fs.h \
    security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
    # the list of MS_... constants
    SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
    DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
    POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
    I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
    ACTIVE NOUSER"

    SED_PROG=
    for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done

    # we want files that contain at least one of MS_...,
    # with fs/namespace.c and fs/pnode.c excluded.
    L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')

    for f in $L; do sed -i $f $SED_PROG; done

    Requested-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

18 Nov, 2017

1 commit

  • Fix the following warning by removing the unused variable:

    mm/shmem.c:3205:27: warning: variable 'info' set but not used [-Wunused-but-set-variable]

    Link: http://lkml.kernel.org/r/1510774029-30652-1-git-send-email-clabbe@baylibre.com
    Signed-off-by: Corentin Labbe
    Acked-by: Michal Hocko
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Corentin Labbe
     

16 Nov, 2017

4 commits

  • Pull drm updates from Dave Airlie:
    "This is the main drm pull request for v4.15.

    Core:
    - Atomic object lifetime fixes
    - Atomic iterator improvements
    - Sparse/smatch fixes
    - Legacy kms ioctls to be interruptible
    - EDID override improvements
    - fb/gem helper cleanups
    - Simple outreachy patches
    - Documentation improvements
    - Fix dma-buf rcu races
    - DRM mode object leasing for improving VR use cases.
    - vgaarb improvements for non-x86 platforms.

    New driver:
    - tve200: Faraday Technology TVE200 block.

    This "TV Encoder" encodes an ITU-T BT.656 stream and can be found in
    the StorLink SL3516 (later Cortina Systems CS3516) as well as the
    Grain Media GM8180.

    New bridges:
    - SiI9234 support

    New panels:
    - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
    LT089AC19000, Innolux AT043TN24

    i915:
    - Remove Coffeelake from alpha support
    - Cannonlake workarounds
    - Infoframe refactoring for DisplayPort
    - VBT updates
    - DisplayPort vswing/emph/buffer translation refactoring
    - CCS fixes
    - Restore GPU clock boost on missed vblanks
    - Scatter list updates for userptr allocations
    - Gen9+ transition watermarks
    - Display IPC (Isochronous Priority Control)
    - Private PAT management
    - GVT: improved error handling and pci config sanitizing
    - Execlist refactoring
    - Transparent Huge Page support
    - User defined priorities support
    - HuC/GuC firmware refactoring
    - DP MST fixes
    - eDP power sequencing fixes
    - Use RCU instead of stop_machine
    - PSR state tracking support
    - Eviction fixes
    - BDW DP aux channel timeout fixes
    - LSPCON fixes
    - Cannonlake PLL fixes

    amdgpu:
    - Per VM BO support
    - Powerplay cleanups
    - CI powerplay support
    - PASID mgr for kfd
    - SR-IOV fixes
    - initial GPU reset for vega10
    - Prime mmap support
    - TTM updates
    - Clock query interface for Raven
    - Fence to handle ioctl
    - UVD encode ring support on Polaris
    - Transparent huge page DMA support
    - Compute LRU pipe tweaks
    - BO flag to allow buffers to opt out of implicit sync
    - CTX priority setting API
    - VRAM lost infrastructure plumbing

    qxl:
    - fix flicker since atomic rework

    amdkfd:
    - Further improvements from internal AMD tree
    - Usermode events
    - Drop radeon support

    nouveau:
    - Pascal temperature sensor support
    - Improved BAR2 handling
    - MMU rework to support Pascal MMU

    exynos:
    - Improved HDMI/mixer support
    - HDMI audio interface support

    tegra:
    - Prep work for tegra186
    - Cleanup/fixes

    msm:
    - Preemption support for a5xx
    - Display fixes for 8x96 (snapdragon 820)
    - Async cursor plane fixes
    - FW loading rework
    - GPU debugging improvements

    vc4:
    - Prep for DSI panels
    - fix T-format tiling scanout
    - New madvise ioctl

    Rockchip:
    - LVDS support

    omapdrm:
    - omap4 HDMI CEC support

    etnaviv:
    - GPU performance counters groundwork

    sun4i:
    - refactor driver load + TCON backend
    - HDMI improvements
    - A31 support
    - Misc fixes

    udl:
    - Probe/EDID read fixes.

    tilcdc:
    - Misc fixes.

    pl111:
    - Support more variants

    adv7511:
    - Improve EDID handling.
    - HDMI CEC support

    sii8620:
    - Add remote control support"

    * tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
    drm/rockchip: analogix_dp: Use mutex rather than spinlock
    drm/mode_object: fix documentation for object lookups.
    drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
    drm/i915: Move init_clock_gating() back to where it was
    drm/i915: Prune the reservation shared fence array
    drm/i915: Idle the GPU before shinking everything
    drm/i915: Lock llist_del_first() vs llist_del_all()
    drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
    drm/i915: Disable lazy PPGTT page table optimization for vGPU
    drm/i915/execlists: Remove the priority "optimisation"
    drm/i915: Filter out spurious execlists context-switch interrupts
    drm/amdgpu: use irq-safe lock for kiq->ring_lock
    drm/amdgpu: bypass lru touch for KIQ ring submission
    drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
    drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
    drm/amd/powerplay: initialize a variable before using it
    drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
    drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
    drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
    drm/rockchip: add CONFIG_OF dependency for lvds
    ...

    Linus Torvalds
     
  • In preparation to enabling -Wimplicit-fallthrough, mark switch cases
    where we are expecting to fall through.
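
    As a minimal sketch of what such a marking looks like (the function
    stages_from() is a hypothetical example, not code from this patch),
    each intentional fall-through carries a comment that GCC's
    -Wimplicit-fallthrough recognizes at its default level:

```c
#include <assert.h>

/*
 * Hypothetical example: starting at `stage`, run that stage and every
 * later one. Returns the number of stages that ran.
 */
int stages_from(int stage)
{
    int ran = 0;

    assert(stage >= 0);
    switch (stage) {
    case 0:
        ran++;
        /* fall through */
    case 1:
        ran++;
        /* fall through */
    case 2:
        ran++;
        break;
    default:
        break;
    }
    return ran;
}
```

    Without the comments the behavior is identical; they only tell the
    compiler and the reader that the missing break is deliberate.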

    Link: http://lkml.kernel.org/r/20171020191058.GA24427@embeddedor.com
    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gustavo A. R. Silva
     
  • shmem_inode_cachep is created with the SLAB_PANIC flag, so
    shmem_init_inodecache() never returns non-zero. Convert this
    function to return void.

    Link: http://lkml.kernel.org/r/20170909124542.GA35224@bogon.didichuxing.com
    Signed-off-by: weiping zhang
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    weiping zhang
     
  • Every pagevec_init user claims the pages being released are hot even in
    cases where it is unlikely the pages are hot. As no one cares about the
    hotness of pages being released to the allocator, just ditch the
    parameter.

    No performance impact is expected as the overhead is marginal. The
    parameter is removed simply because it is a bit stupid to have a useless
    parameter copied everywhere.

    Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman
    Acked-by: Vlastimil Babka
    Cc: Andi Kleen
    Cc: Dave Chinner
    Cc: Dave Hansen
    Cc: Jan Kara
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman