14 Oct, 2020

40 commits

  • zhdr is already initialized at the start of the function, so remove
    the redundant initialization here.

    Signed-off-by: Xiang Chen
    Signed-off-by: Andrew Morton
    Reviewed-by: David Hildenbrand
    Cc: Seth Jennings
    Cc: Dan Streetman
    Link: https://lkml.kernel.org/r/1600419885-191907-1-git-send-email-chenxiang66@hisilicon.com
    Signed-off-by: Linus Torvalds

    Xiang Chen
     
  • alloc_slots() allocates memory for slots using kmem_cache_alloc() and
    then memsets it. We can just use kmem_cache_zalloc() instead (a short
    before/after sketch follows this entry).

    Signed-off-by: Hui Su
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200926100834.GA184671@rlk
    Signed-off-by: Linus Torvalds

    Hui Su
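
    A minimal before/after sketch of this kind of conversion; the struct and
    cache names below are placeholders, not the actual z3fold identifiers:

        /* Hypothetical illustration of the pattern, not the z3fold code. */
        #include <linux/slab.h>
        #include <linux/string.h>

        struct slots { unsigned long handle[4]; };

        static struct slots *alloc_slots_before(struct kmem_cache *c, gfp_t gfp)
        {
            struct slots *s = kmem_cache_alloc(c, gfp);

            if (s)
                memset(s, 0, sizeof(*s));   /* clear by hand */
            return s;
        }

        static struct slots *alloc_slots_after(struct kmem_cache *c, gfp_t gfp)
        {
            /* kmem_cache_zalloc() = kmem_cache_alloc() + zeroing */
            return kmem_cache_zalloc(c, gfp);
        }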
     
  • fix comments for isolate_lru_page():
    s/fundamentnal/fundamental

    Signed-off-by: Hui Su
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200927173923.GA8058@rlk
    Signed-off-by: Linus Torvalds

    Hui Su
     
  • We have observed that drop_caches can take a considerable amount of
    time, especially when many memcgs are involved, because each memcg adds
    additional overhead.

    It is quite unfortunate that the operation currently cannot be
    interrupted by a signal. Add a check for fatal signals into the main
    loop so that userspace can bail out early.

    There are two reasons for this:

    1. There are too many memcgs; even if only one object is freed per
    memcg, the total number of freed objects exceeds 10.

    2. A single pass over all memcgs takes a long time, so by the time the
    next pass starts, the memcgs traversed first have freed many objects
    again, and the freed count exceeds 10 once more.

    We can get the following info through 'ps':

    root:~# ps -aux | grep drop
    root 357956 ... R Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches
    root 1771385 ... R Aug16 21146421:17 echo 3 > /proc/sys/vm/drop_caches
    root 1986319 ... R 18:56 117:27 echo 3 > /proc/sys/vm/drop_caches
    root 2002148 ... R Aug24 5720:39 echo 3 > /proc/sys/vm/drop_caches
    root 2564666 ... R 18:59 113:58 echo 3 > /proc/sys/vm/drop_caches
    root 2639347 ... R Sep03 2383:39 echo 3 > /proc/sys/vm/drop_caches
    root 3904747 ... R 03:35 993:31 echo 3 > /proc/sys/vm/drop_caches
    root 4016780 ... R Aug21 7882:18 echo 3 > /proc/sys/vm/drop_caches

    Using bpftrace to follow the 'freed' value in drop_slab_node:

    root:~# bpftrace -e 'kprobe:drop_slab_node+70 {@ret=hist(reg("bp")); }'
    Attaching 1 probe...
    ^B^C

    @ret:
    [64, 128) 1 | |
    [128, 256) 28 | |
    [256, 512) 107 |@ |
    [512, 1K) 298 |@@@ |
    [1K, 2K) 613 |@@@@@@@ |
    [2K, 4K) 4435 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
    [4K, 8K) 442 |@@@@@ |
    [8K, 16K) 299 |@@@ |
    [16K, 32K) 100 |@ |
    [32K, 64K) 139 |@ |
    [64K, 128K) 56 | |
    [128K, 256K) 26 | |
    [256K, 512K) 2 | |

    In the while loop, we can check whether a fatal signal is pending and,
    if so, break out of the loop (a sketch of this check follows this
    entry).

    Signed-off-by: Chunxin Zang
    Signed-off-by: Muchun Song
    Signed-off-by: Andrew Morton
    Acked-by: Chris Down
    Acked-by: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Matthew Wilcox
    Link: https://lkml.kernel.org/r/20200909152047.27905-1-zangchunxin@bytedance.com
    Signed-off-by: Linus Torvalds

    Chunxin Zang
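
    A sketch of the check described above, loosely following the shape of
    the drop_slab_node() loop; treat it as an approximation rather than the
    exact upstream diff:

        /* Approximate shape of drop_slab_node() with the early bailout. */
        static void drop_slab_node(int nid)
        {
            unsigned long freed;

            do {
                struct mem_cgroup *memcg = NULL;

                /* Let "echo 3 > /proc/sys/vm/drop_caches" be killed. */
                if (fatal_signal_pending(current))
                    return;

                freed = 0;
                memcg = mem_cgroup_iter(NULL, NULL, NULL);
                do {
                    freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
                } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
            } while (freed > 10);
        }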
     
  • As a debugging aid, huge_pmd_share should make sure i_mmap_rwsem is held
    if necessary. To clarify the 'if necessary', expand the comment block at
    the beginning of huge_pmd_share.

    No functional change. The added i_mmap_assert_locked() call is only
    active when CONFIG_LOCKDEP is enabled (a sketch follows this entry).

    Ideally, this should have been included with commit 34ae204f1851
    ("hugetlbfs: remove call to huge_pte_alloc without i_mmap_rwsem").

    Signed-off-by: Mike Kravetz
    Signed-off-by: Andrew Morton
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: "Kirill A . Shutemov"
    Cc: Davidlohr Bueso
    Link: https://lkml.kernel.org/r/20200911201248.88537-1-mike.kravetz@oracle.com
    Signed-off-by: Linus Torvalds

    Mike Kravetz
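
    For reference, a sketch of what such a lockdep-only assertion looks
    like; this approximates i_mmap_assert_locked() rather than quoting it:

        /* Compiles away unless CONFIG_LOCKDEP is enabled, so there is no
         * functional change; called near the top of huge_pmd_share() as a
         * debugging aid. */
        static inline void i_mmap_assert_locked(struct address_space *mapping)
        {
            lockdep_assert_held(&mapping->i_mmap_rwsem);
        }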
     
  • Function dequeue_huge_page_node_exact() iterates the free list and
    returns the first valid free hpage.

    Instead of breaking out of the loop and then checking the loop
    variable, we can return from the loop directly, which removes a
    redundant check (a before/after sketch follows this entry).

    [mike.kravetz@oracle.com: points out a logic error]
    [richard.weiyang@linux.alibaba.com: v4]
    Link: https://lkml.kernel.org/r/20200901014636.29737-8-richard.weiyang@linux.alibaba.com

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Cc: Baoquan He
    Cc: Mike Kravetz
    Cc: Vlastimil Babka
    Link: https://lkml.kernel.org/r/20200831022351.20916-8-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
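
    A before/after sketch of the control-flow change; names approximate the
    hugetlb free-list code:

        /* Before: break out of the loop, then re-check the loop variable. */
        list_for_each_entry(page, &h->hugepage_freelists[nid], lru)
            if (!PageHWPoison(page))
                break;
        if (&h->hugepage_freelists[nid] == &page->lru)
            return NULL;
        /* ... unqueue and account the page ... */

        /* After: return from inside the loop, no extra check needed. */
        list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
            if (PageHWPoison(page))
                continue;
            /* ... unqueue and account the page ... */
            return page;
        }
        return NULL;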
     
  • set_hugetlb_cgroup_[rsvd] just manipulates page-local data, which does
    not need to be protected by hugetlb_lock.

    Let's move it out of the lock.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Mike Kravetz
    Cc: Vlastimil Babka
    Link: https://lkml.kernel.org/r/20200831022351.20916-7-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • A page just allocated from the buddy allocator is not on any list, so
    list_add() is enough.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Mike Kravetz
    Cc: Vlastimil Babka
    Link: https://lkml.kernel.org/r/20200831022351.20916-6-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • Function add_reservation_in_range() has only two cases:

    * count the file_regions needed and return the number in regions_needed
    * do the real list operation without counting

    This means it is not necessary to have two parameters to distinguish
    these two cases.

    Just use regions_needed to separate them.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Mike Kravetz
    Cc: Vlastimil Babka
    Link: https://lkml.kernel.org/r/20200831022351.20916-5-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • Instead of adding the allocated file_regions to region_cache one by
    one, we can use list_splice() to merge the two lists at once.

    Also, since we know the number of entries in the list, we can increase
    the count directly (a sketch of the pattern follows this entry).

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Mike Kravetz
    Cc: Vlastimil Babka
    Link: https://lkml.kernel.org/r/20200831022351.20916-4-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
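
    A sketch of the pattern; the helper below is illustrative and loosely
    modeled on the hugetlb file_region cache, not a copy of the actual
    function:

        /* Hypothetical helper: splice a batch of freshly allocated entries
         * into the cache at once instead of list_add()-ing one by one. */
        static int refill_region_cache(struct resv_map *resv, int to_allocate)
        {
            LIST_HEAD(allocated_regions);
            struct file_region *trg, *tmp;
            int i;

            for (i = 0; i < to_allocate; i++) {
                trg = kmalloc(sizeof(*trg), GFP_KERNEL);
                if (!trg)
                    goto cleanup;
                list_add(&trg->link, &allocated_regions);
            }

            spin_lock(&resv->lock);
            /* Merge the whole batch and bump the count in one step. */
            list_splice(&allocated_regions, &resv->region_cache);
            resv->region_cache_count += to_allocate;
            spin_unlock(&resv->lock);
            return 0;

        cleanup:
            list_for_each_entry_safe(trg, tmp, &allocated_regions, link)
                kfree(trg);
            return -ENOMEM;
        }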
     
  • We are sure to get a valid file_region, otherwise the
    VM_BUG_ON(resv->region_cache_count <= 0) at the beginning would already
    have been triggered, so the extra check on the returned entry is
    redundant.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Mike Kravetz
    Cc: Baoquan He
    Cc: Vlastimil Babka
    Link: https://lkml.kernel.org/r/20200831022351.20916-3-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • Patch series "mm/hugetlb: code refine and simplification", v4.

    Following are some cleanups for hugetlb. Simple testing with
    tools/testing/selftests/vm/map_hugetlb passes.

    This patch (of 7):

    Per my understanding, we keep the regions ordered and always coalesce
    regions properly. So keeping this property only requires coalescing a
    region with its neighbours.

    Let's simplify this.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Mike Kravetz
    Cc: Vlastimil Babka
    Link: https://lkml.kernel.org/r/20200901014636.29737-1-richard.weiyang@linux.alibaba.com
    Link: https://lkml.kernel.org/r/20200831022351.20916-1-richard.weiyang@linux.alibaba.com
    Link: https://lkml.kernel.org/r/20200831022351.20916-2-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • Change 'pecify' to 'Specify'.

    Signed-off-by: Baoquan He
    Signed-off-by: Andrew Morton
    Reviewed-by: Mike Kravetz
    Reviewed-by: David Hildenbrand
    Cc: Anshuman Khandual
    Link: https://lkml.kernel.org/r/20200723032248.24772-4-bhe@redhat.com
    Signed-off-by: Linus Torvalds

    Baoquan He
     
  • If a swap entry tests positive for either is_[migration|hwpoison]_entry(),
    then its swap_type() is among SWP_MIGRATION_READ, SWP_MIGRATION_WRITE and
    SWP_HWPOISON. All of these types are >= MAX_SWAPFILES, which is exactly
    what non_swap_entry() asserts.

    So the non_swap_entry() check in is_hugetlb_entry_migration() and
    is_hugetlb_entry_hwpoisoned() is redundant.

    Let's remove it to simplify the code (a sketch follows this entry).

    Signed-off-by: Baoquan He
    Signed-off-by: Andrew Morton
    Reviewed-by: Mike Kravetz
    Reviewed-by: David Hildenbrand
    Reviewed-by: Anshuman Khandual
    Link: https://lkml.kernel.org/r/20200723032248.24772-3-bhe@redhat.com
    Signed-off-by: Linus Torvalds

    Baoquan He
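
    A sketch of the simplification; the shape approximates
    is_hugetlb_entry_migration(), and the hwpoison variant is analogous:

        /* is_migration_entry() already implies non_swap_entry(), since the
         * migration swap types are >= MAX_SWAPFILES. */
        static bool is_hugetlb_entry_migration(pte_t pte)
        {
            swp_entry_t swp;

            if (huge_pte_none(pte) || pte_present(pte))
                return false;

            swp = pte_to_swp_entry(pte);
            /* was: if (non_swap_entry(swp) && is_migration_entry(swp)) */
            return is_migration_entry(swp);
        }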
     
  • Patch series "mm/hugetlb: Small cleanup and improvement", v2.

    This patch (of 3):

    Just like its neighbour is_hugetlb_entry_migration() has done.

    Signed-off-by: Baoquan He
    Signed-off-by: Andrew Morton
    Reviewed-by: Mike Kravetz
    Reviewed-by: David Hildenbrand
    Reviewed-by: Anshuman Khandual
    Link: https://lkml.kernel.org/r/20200723032248.24772-1-bhe@redhat.com
    Link: https://lkml.kernel.org/r/20200723032248.24772-2-bhe@redhat.com
    Signed-off-by: Linus Torvalds

    Baoquan He
     
  • There is a general understanding that GFP_ATOMIC/GFP_NOWAIT are to be used
    from atomic contexts, e.g. from within a spin lock or from IRQ context.
    This is correct, but there are some atomic contexts where the above
    doesn't hold. One of them would be an NMI context. The page allocator
    has never supported that, and the general fear of this context hasn't
    let anybody even try to use the allocator there. Good, but let's be more
    specific about that.

    Another such context, and one where people seem to be more daring, is
    raw_spin_lock. Mostly because it simply resembles a regular spin lock,
    which is supported by the allocator, and there is no implementation
    difference on !RT kernels in the first place. Be explicit that such a
    context is not supported by the allocator. The underlying reason is that
    zone->lock would have to become a raw_spin_lock as well, and that has
    turned out to be a problem for RT
    (http://lkml.kernel.org/r/87mu305c1w.fsf@nanos.tec.linutronix.de).

    Signed-off-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Reviewed-by: David Hildenbrand
    Reviewed-by: Thomas Gleixner
    Reviewed-by: Uladzislau Rezki
    Cc: "Paul E. McKenney"
    Link: https://lkml.kernel.org/r/20200929123010.5137-1-mhocko@kernel.org
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • Here is a very rare race which leaks memory:

    Page P0 is allocated to the page cache. Page P1 is free.

    Thread A: find_get_entry(): xas_load() returns P0
    Thread B: removes P0 from the page cache
    Thread B: P0 finds its buddy P1
    Thread C: alloc_pages(GFP_KERNEL, 1) returns P0; P0 has refcount 1
    Thread A: page_cache_get_speculative(P0); P0 has refcount 2
    Thread C: __free_pages(P0); P0 has refcount 1
    Thread A: put_page(P0)
    P1 is not freed

    Fix this by freeing, in __free_pages(), all the pages that won't be
    freed by the eventual call to put_page(). It's usually not a good idea
    to split a page, but this is a very unlikely scenario (a sketch of the
    fix follows this entry).

    Fixes: e286781d5f2e ("mm: speculative page references")
    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Andrew Morton
    Acked-by: Mike Rapoport
    Cc: Nick Piggin
    Cc: Hugh Dickins
    Cc: Peter Zijlstra
    Link: https://lkml.kernel.org/r/20200926213919.26642-1-willy@infradead.org
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
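
    A sketch of the fix's shape in __free_pages(): if our reference was not
    the last one and the page is not compound, free the tail subpages
    ourselves, since the eventual put_page() will only free an order-0 page.
    free_the_page() below stands in for the allocator's internal helper;
    treat this as an approximation of the change, not a verbatim copy:

        void __free_pages(struct page *page, unsigned int order)
        {
            if (put_page_testzero(page))
                free_the_page(page, order);
            else if (!PageHead(page))
                /* Someone else holds a speculative reference to the first
                 * subpage; hand back the remaining subpages ourselves. */
                while (order-- > 0)
                    free_the_page(page + (1 << order), order);
        }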
     
  • The function is_huge_zero_page() doesn't call compound_head() to make sure
    the page pointer is a head page. The call to is_huge_zero_page() in
    release_pages() is made before compound_head() is called, so the test
    would fail if release_pages() were called with a tail page of the
    huge_zero_page, and put_page_testzero() would then be called, releasing
    the page. This is unlikely to happen in normal use or we would be seeing
    all sorts of process data corruption when accessing a THP zero page.

    Looking at other places where is_huge_zero_page() is called, all seem to
    only pass a head page, so I think the right solution is to move the call
    to compound_head() in release_pages() to a point before calling
    is_huge_zero_page().

    Signed-off-by: Ralph Campbell
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Cc: Yu Zhao
    Cc: Dan Williams
    Cc: Matthew Wilcox
    Cc: Christoph Hellwig
    Link: https://lkml.kernel.org/r/20200917173938.16420-1-rcampbell@nvidia.com
    Signed-off-by: Linus Torvalds

    Ralph Campbell
     
  • Previously, the 'for_next_zone_zonelist_nodemask' macro parameter 'zlist'
    was unused, so this patch removes it.

    Signed-off-by: Mateusz Nosek
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200917211906.30059-1-mateusznosek0@gmail.com
    Signed-off-by: Linus Torvalds

    Mateusz Nosek
     
  • __perform_reclaim()'s single caller expects it to return 'unsigned long',
    hence change its return value and a local variable to 'unsigned long'.

    Suggested-by: Andrew Morton
    Signed-off-by: Yanfei Xu
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200916022138.16740-1-yanfei.xu@windriver.com
    Signed-off-by: Linus Torvalds

    Yanfei Xu
     
  • finalise_ac() is just an 'epilogue' for prepare_alloc_pages(), so there
    is no need to keep both; the contents of finalise_ac() can be merged
    into prepare_alloc_pages(). This makes __alloc_pages_nodemask() more
    readable.

    Signed-off-by: Mateusz Nosek
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Cc: Mel Gorman
    Cc: Mike Rapoport
    Link: https://lkml.kernel.org/r/20200916110118.6537-1-mateusznosek0@gmail.com
    Signed-off-by: Linus Torvalds

    Mateusz Nosek
     
  • Previously, in '__init early_init_on_alloc' and '__init early_init_on_free'
    the return values from 'kstrtobool' were not handled properly, which
    could cause a garbage value to be read from the variable 'bool_result'.
    This patch fixes the error handling (a sketch follows this entry).

    Signed-off-by: Mateusz Nosek
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200916214125.28271-1-mateusznosek0@gmail.com
    Signed-off-by: Linus Torvalds

    Mateusz Nosek
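
    A sketch of the corrected error handling; the surrounding early-param
    plumbing is simplified and the static-key handling is illustrative:

        static int __init early_init_on_alloc(char *buf)
        {
            bool bool_result;
            int ret;

            ret = kstrtobool(buf, &bool_result);
            if (ret)
                return ret;   /* don't read bool_result after a parse error */

            if (bool_result)
                static_branch_enable(&init_on_alloc);
            else
                static_branch_disable(&init_on_alloc);
            return 0;
        }
        early_param("init_on_alloc", early_init_on_alloc);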
     
  • Previously, the flags check was split into two separate checks with two
    separate branches. Since the presence of either flag has the same effect
    on the flow, the checks can be merged and one branch avoided.

    Signed-off-by: Mateusz Nosek
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200911092310.31136-1-mateusznosek0@gmail.com
    Signed-off-by: Linus Torvalds

    Mateusz Nosek
     
  • Previously, the variable 'tmp' was initialized but never read before
    being reassigned, so the initialization can be removed.

    [akpm@linux-foundation.org: remove `tmp' altogether]

    Signed-off-by: Mateusz Nosek
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200904132422.17387-1-mateusznosek0@gmail.com
    Signed-off-by: Linus Torvalds

    Mateusz Nosek
     
  • In has_unmovable_pages(), the page parameter is not always the first page
    within a pageblock (see how the page pointer is passed in from
    start_isolate_page_range() after the call to __first_valid_page()), so
    the check for unmovable pages could span two pageblocks.

    After this patch, the check is confined to a single pageblock whether or
    not the page is the first one in it, obeying the semantics of this
    function.

    This issue is found by code inspection.

    Michal said "this might lead to false negatives when an unrelated block
    would cause an isolation failure".

    Signed-off-by: Li Xinhai
    Signed-off-by: Andrew Morton
    Reviewed-by: Oscar Salvador
    Acked-by: Michal Hocko
    Cc: David Hildenbrand
    Link: https://lkml.kernel.org/r/20200824065811.383266-1-lixinhai.lxh@gmail.com
    Signed-off-by: Linus Torvalds

    Li Xinhai
     
  • Let's document what ZONE_MOVABLE means, how it's used, and which special
    cases we have regarding unmovable pages (memory offlining vs. migration /
    allocations).

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Acked-by: Mike Rapoport
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Mike Kravetz
    Cc: Pankaj Gupta
    Cc: Baoquan He
    Cc: Jason Wang
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/20200816125333.7434-7-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather
    unclear, which is why we special-cased ZONE_MOVABLE such that partially
    plugged blocks would never end up in ZONE_MOVABLE.

    Now that the semantics are much clearer (and will be documented in a
    follow-up patch including the new virtio-mem behavior), let's allow
    onlining partially plugged memory blocks to ZONE_MOVABLE and also consider
    memory blocks that were onlined to ZONE_MOVABLE when unplugging memory.
    While unplugged memory pages are, in general, unmovable, they can be
    skipped when offlining memory.

    virtio-mem only unplugs fairly big chunks (in the megabyte range) and
    tries to shrink the memory region rather than choosing memory at random.
    In theory, if all other pages in the movable zone would be movable,
    virtio-mem would only shrink that zone and not create any kind of
    fragmentation.

    In the future, we might want to remember the zone again and use the
    information when (un)plugging memory. For now, let's keep it simple.

    Note: Support for defragmentation is planned, to deal with fragmentation
    after unplug due to memory chunks within memory blocks that could not get
    unplugged before (e.g., somebody pinning pages within ZONE_MOVABLE for a
    longer time).

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Jason Wang
    Cc: Mike Kravetz
    Cc: Pankaj Gupta
    Cc: Baoquan He
    Cc: Mike Rapoport
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/20200816125333.7434-6-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Let's clean it up a bit, simplifying the exit paths.

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Pankaj Gupta
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Mike Kravetz
    Cc: Jason Wang
    Cc: Mike Rapoport
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/20200816125333.7434-5-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Inside has_unmovable_pages(), we have a comment describing how unmovable
    data could end up in ZONE_MOVABLE - via "movablecore". Also, besides
    checking if the first page in the pageblock is reserved, we don't perform
    any further checks in case of ZONE_MOVABLE.

    In case of memory offlining, we set REPORT_FAILURE, properly dump_page()
    the page and handle the error gracefully. alloc_contig_pages() users
    currently never allocate from ZONE_MOVABLE. E.g., hugetlb uses
    alloc_contig_pages() for the allocation of gigantic pages only, which will
    never end up on the MOVABLE zone (see htlb_alloc_mask()).

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Mike Kravetz
    Cc: Pankaj Gupta
    Cc: Jason Wang
    Cc: Mike Rapoport
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/20200816125333.7434-4-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Right now, if we have two isolations racing on a pageblock that's in the
    MOVABLE zone, we would trigger the WARN_ON_ONCE(). Let's just return
    directly, simplifying error handling.

    The change was introduced in commit 3d680bdf60a5 ("mm/page_isolation: fix
    potential warning from user"). As far as I can see, we currently don't
    have alloc_contig_range() users that use the ZONE_MOVABLE (anymore), so
    it's currently more a cleanup and a preparation for the future than a fix.

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Pankaj Gupta
    Acked-by: Mike Kravetz
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Qian Cai
    Cc: Jason Wang
    Cc: Mike Rapoport
    Link: http://lkml.kernel.org/r/20200816125333.7434-3-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Patch series "mm / virtio-mem: support ZONE_MOVABLE", v5.

    When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather
    unclear, which is why we special-cased ZONE_MOVABLE such that partially
    plugged blocks would never end up in ZONE_MOVABLE.

    Now that the semantics are much clearer (and are documented in patch #6),
    let's support partially plugged memory blocks in ZONE_MOVABLE, allowing
    partially plugged memory blocks to be onlined to ZONE_MOVABLE and also
    unplugging from such memory blocks. This avoids surprises when onlining
    of memory blocks suddenly fails, just because they are not completely
    populated by virtio-mem (yet).

    This is especially helpful for testing, but also paves the way for
    virtio-mem optimizations, allowing more memory to get reliably unplugged.

    Clean up has_unmovable_pages() and set_migratetype_isolate(), providing
    better documentation of how ZONE_MOVABLE interacts with different kinds
    of unmovable pages (memory offlining vs. alloc_contig_range()).

    This patch (of 6):

    Let's move the split comment regarding bootmem allocations and memory
    holes, especially in the context of ZONE_MOVABLE, to the PageReserved()
    check.

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Mike Kravetz
    Cc: Pankaj Gupta
    Cc: Jason Wang
    Cc: Mike Rapoport
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/20200816125333.7434-1-david@redhat.com
    Link: http://lkml.kernel.org/r/20200816125333.7434-2-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • KASAN errors will currently trigger a panic when panic_on_warn is set.
    This renders kasan_multishot useless, as further KASAN errors won't be
    reported if the kernel has already panicked. By making kasan_multishot
    disable this behaviour for KASAN errors, we can still have the benefits of
    panic_on_warn for non-KASAN warnings, yet be able to use kasan_multishot.

    This is particularly important when running KASAN tests, which need to
    trigger multiple KASAN errors: previously these would panic the system if
    panic_on_warn was set, now they can run (and will panic the system should
    non-KASAN warnings show up).

    Signed-off-by: David Gow
    Signed-off-by: Andrew Morton
    Tested-by: Andrey Konovalov
    Reviewed-by: Andrey Konovalov
    Reviewed-by: Brendan Higgins
    Cc: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Ingo Molnar
    Cc: Juri Lelli
    Cc: Patricia Alfonso
    Cc: Peter Zijlstra
    Cc: Shuah Khan
    Cc: Vincent Guittot
    Link: https://lkml.kernel.org/r/20200915035828.570483-6-davidgow@google.com
    Link: https://lkml.kernel.org/r/20200910070331.3358048-6-davidgow@google.com
    Signed-off-by: Linus Torvalds

    David Gow
     
  • Include documentation on how to test KASAN using CONFIG_TEST_KASAN_KUNIT
    and CONFIG_TEST_KASAN_MODULE.

    Signed-off-by: Patricia Alfonso
    Signed-off-by: David Gow
    Signed-off-by: Andrew Morton
    Tested-by: Andrey Konovalov
    Reviewed-by: Andrey Konovalov
    Reviewed-by: Dmitry Vyukov
    Acked-by: Brendan Higgins
    Cc: Andrey Ryabinin
    Cc: Ingo Molnar
    Cc: Juri Lelli
    Cc: Peter Zijlstra
    Cc: Shuah Khan
    Cc: Vincent Guittot
    Link: https://lkml.kernel.org/r/20200915035828.570483-5-davidgow@google.com
    Link: https://lkml.kernel.org/r/20200910070331.3358048-5-davidgow@google.com
    Signed-off-by: Linus Torvalds

    Patricia Alfonso
     
  • Transfer all previous tests for KASAN to KUnit so they can be run more
    easily. Using kunit_tool, developers can run these tests with their other
    KUnit tests and see "pass" or "fail" with the appropriate KASAN report
    instead of needing to parse each KASAN report to test KASAN
    functionalities. All KASAN reports are still printed to dmesg.

    Stack tests do not work properly when KASAN_STACK is enabled, so those
    tests use an "if IS_ENABLED(CONFIG_KASAN_STACK)" check and only run if
    stack instrumentation is enabled. If KASAN_STACK is not enabled, KUnit
    will print a statement to let the user know this test was not run with
    KASAN_STACK enabled.

    copy_user_test and kasan_rcu_uaf cannot be run in KUnit so there is a
    separate test file for those tests, which can be run as before as a
    module.

    [trishalfonso@google.com: v14]
    Link: https://lkml.kernel.org/r/20200915035828.570483-4-davidgow@google.com

    Signed-off-by: Patricia Alfonso
    Signed-off-by: David Gow
    Signed-off-by: Andrew Morton
    Tested-by: Andrey Konovalov
    Reviewed-by: Brendan Higgins
    Reviewed-by: Andrey Konovalov
    Reviewed-by: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: Ingo Molnar
    Cc: Juri Lelli
    Cc: Peter Zijlstra
    Cc: Shuah Khan
    Cc: Vincent Guittot
    Link: https://lkml.kernel.org/r/20200910070331.3358048-4-davidgow@google.com
    Signed-off-by: Linus Torvalds

    Patricia Alfonso
     
  • Integrate KASAN into KUnit testing framework.

    - Fail tests when KASAN reports an error that is not expected
    - Use KUNIT_EXPECT_KASAN_FAIL to expect a KASAN error in KASAN tests
      (a usage sketch follows this entry)
    - Expected KASAN reports pass tests and are still printed when run
      without kunit_tool (kunit_tool still bypasses the report due to the
      test passing)
    - A KUnit struct in the current task is used to keep track of the
      current test from KASAN code

    Make use of "[PATCH v3 kunit-next 1/2] kunit: generalize kunit_resource
    API beyond allocated resources" and "[PATCH v3 kunit-next 2/2] kunit: add
    support for named resources" from Alan Maguire [1]

    - A named resource is added to a test when a KASAN report is expected
    - This resource contains a struct kasan_data with booleans representing
      whether a KASAN report is expected and whether a KASAN report is found

    [1] (https://lore.kernel.org/linux-kselftest/1583251361-12748-1-git-send-email-alan.maguire@oracle.com/T/#t)

    Signed-off-by: Patricia Alfonso
    Signed-off-by: David Gow
    Signed-off-by: Andrew Morton
    Tested-by: Andrey Konovalov
    Reviewed-by: Andrey Konovalov
    Reviewed-by: Dmitry Vyukov
    Acked-by: Brendan Higgins
    Cc: Andrey Ryabinin
    Cc: Ingo Molnar
    Cc: Juri Lelli
    Cc: Peter Zijlstra
    Cc: Shuah Khan
    Cc: Vincent Guittot
    Link: https://lkml.kernel.org/r/20200915035828.570483-3-davidgow@google.com
    Link: https://lkml.kernel.org/r/20200910070331.3358048-3-davidgow@google.com
    Signed-off-by: Linus Torvalds

    Patricia Alfonso
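
    A sketch of what a converted test case looks like with the new macro;
    adapted from the out-of-bounds style of test, so details may differ from
    the final test_kasan.c:

        static void kmalloc_oob_right(struct kunit *test)
        {
            char *ptr;
            size_t size = 123;

            ptr = kmalloc(size, GFP_KERNEL);
            KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);

            /* The out-of-bounds write must produce a KASAN report,
             * otherwise the KUnit test case fails. */
            KUNIT_EXPECT_KASAN_FAIL(test, ptr[size] = 'x');

            kfree(ptr);
        }

        static struct kunit_case kasan_kunit_test_cases[] = {
            KUNIT_CASE(kmalloc_oob_right),
            {}
        };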
     
  • Patch series "KASAN-KUnit Integration", v14.

    This patchset contains everything needed to integrate KASAN and KUnit.

    KUnit will be able to:
    (1) Fail tests when an unexpected KASAN error occurs
    (2) Pass tests when an expected KASAN error occurs

    Convert KASAN tests to KUnit with the exception of copy_user_test because
    KUnit is unable to test those.

    Add documentation on how to run the KASAN tests with KUnit and what to
    expect when running these tests.

    This patch (of 5):

    In order to integrate debugging tools like KASAN into the KUnit framework,
    add a KUnit struct to the current task to keep track of the current KUnit
    test.

    Signed-off-by: Patricia Alfonso
    Signed-off-by: David Gow
    Signed-off-by: Andrew Morton
    Tested-by: Andrey Konovalov
    Reviewed-by: Brendan Higgins
    Cc: Brendan Higgins
    Cc: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Juri Lelli
    Cc: Vincent Guittot
    Cc: Shuah Khan
    Link: https://lkml.kernel.org/r/20200915035828.570483-1-davidgow@google.com
    Link: https://lkml.kernel.org/r/20200915035828.570483-2-davidgow@google.com
    Link: https://lkml.kernel.org/r/20200910070331.3358048-1-davidgow@google.com
    Link: https://lkml.kernel.org/r/20200910070331.3358048-2-davidgow@google.com
    Signed-off-by: Linus Torvalds

    Patricia Alfonso
     
  • In the description of the anonymous address space lifespan, the
    'mm_users' reference counter is confused with 'mm_count'. I.e. a "zombie"
    mm gets released when "mm_count" becomes zero, not "mm_users".

    Signed-off-by: Alexander Gordeev
    Signed-off-by: Andrew Morton
    Cc: Jonathan Corbet
    Link: https://lkml.kernel.org/r/1597040695-32633-1-git-send-email-agordeev@linux.ibm.com
    Signed-off-by: Linus Torvalds

    Alexander Gordeev
     
  • Fix the comments of find_vm_area() and get_vm_area().

    Signed-off-by: Hui Su
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200927153034.GA199877@rlk
    Signed-off-by: Linus Torvalds

    Hui Su
     
  • Since commit c67dc624757 ("mm/vmalloc: do not call kmemleak_free() on not
    yet accounted memory"), __vunmap() has been changed to __vfree(), so
    update the now-confusing comment.

    Signed-off-by: Hui Su
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Cc: Roman Penyaev
    Link: https://lkml.kernel.org/r/20200927155409.GA3315@rlk
    Signed-off-by: Linus Torvalds

    Hui Su
     
  • Unlike the others, we don't use the macro 'writeback', so let's remove it
    to tame the gcc warning:

    mm/memory-failure.c:827: warning: macro "writeback" is not used
    [-Wunused-macros]

    Signed-off-by: Alex Shi
    Signed-off-by: Andrew Morton
    Cc: Naoya Horiguchi
    Link: https://lkml.kernel.org/r/1599715096-20369-1-git-send-email-alex.shi@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Alex Shi