04 Sep, 2019

1 commit

  • There is no reason to print warnings when balloon page allocation fails,
    as such failures are expected and handled gracefully. Since the VMware
    balloon now uses the balloon-compaction infrastructure, and it suppressed
    these warnings before, suppressing them here also keeps the behavior the
    balloon had before.
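
    A minimal sketch of the suppression, assuming it lands in the generic
    balloon_page_alloc() allocation mask (a sketch, not necessarily the exact
    patch):

        /* mm/balloon_compaction.c (sketch) */
        #include <linux/balloon_compaction.h>

        struct page *balloon_page_alloc(void)
        {
        	/* __GFP_NOWARN: failures are expected and handled by the
        	 * caller, so keep them out of the kernel log */
        	return alloc_page(balloon_mapping_gfp_mask() |
        			  __GFP_NOMEMALLOC | __GFP_NORETRY |
        			  __GFP_NOWARN);
        }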

    Cc: Jason Wang
    Signed-off-by: Nadav Amit
    Signed-off-by: Michael S. Tsirkin
    Reviewed-by: David Hildenbrand

    Nadav Amit
     

22 Jul, 2019

2 commits

  • Lots of comments bitrotted. Fix them up.

    Fixes: 418a3ab1e778 ("mm/balloon_compaction: List interfaces")
    Reviewed-by: Wei Wang
    Signed-off-by: Michael S. Tsirkin
    Reviewed-by: Ralph Campbell
    Acked-by: Nadav Amit

    Michael S. Tsirkin
     
  • A #GP is reported in the guest when requesting balloon inflation via
    virtio-balloon. The reason is that the virtio-balloon driver has
    removed the page from its internal page list (via balloon_page_pop),
    but balloon_page_enqueue_one also calls "list_del" to do the removal.
    This is necessary when it's used from balloon_page_enqueue_list, but
    not from balloon_page_enqueue.

    Move the list_del into the list-enqueue path instead, and update the
    comments accordingly.
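
    A hedged sketch of the resulting split (function names are assumptions
    based on the message above and may differ slightly from the tree; page
    locking details trimmed). Only the list path still removes pages from
    the caller's list:

        /* mm/balloon_compaction.c (sketch) */
        #include <linux/balloon_compaction.h>

        static void balloon_page_enqueue_one(struct balloon_dev_info *b_dev_info,
        				     struct page *page)
        {
        	/* no list_del() here any more -- the caller's list handling
        	 * is the caller's business */
        	balloon_page_insert(b_dev_info, page);
        	__count_vm_event(BALLOON_INFLATE);
        }

        size_t balloon_page_list_enqueue(struct balloon_dev_info *b_dev_info,
        				 struct list_head *pages)
        {
        	struct page *page, *tmp;
        	unsigned long flags;
        	size_t n_pages = 0;

        	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
        	list_for_each_entry_safe(page, tmp, pages, lru) {
        		list_del(&page->lru);	/* still needed on the list path */
        		balloon_page_enqueue_one(b_dev_info, page);
        		n_pages++;
        	}
        	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
        	return n_pages;
        }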

    Fixes: 418a3ab1e778 ("mm/balloon_compaction: List interfaces")
    Signed-off-by: Wei Wang
    Signed-off-by: Michael S. Tsirkin

    Wei Wang
     

09 Jun, 2019

1 commit


25 May, 2019

1 commit

  • Introduce interfaces for enqueueing and dequeueing a list of balloon
    pages. These interfaces reduce the overhead of saving and restoring IRQs
    by batching the operations. In addition, they do not panic if the list
    of pages is empty.
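
    A hedged sketch of the batched dequeue side of such an interface
    (function name assumed; isolation handling trimmed). One
    irqsave/irqrestore pair covers the whole batch instead of one per page:

        /* mm/balloon_compaction.c (sketch) */
        #include <linux/balloon_compaction.h>

        size_t balloon_page_list_dequeue(struct balloon_dev_info *b_dev_info,
        				 struct list_head *pages, size_t n_req_pages)
        {
        	struct page *page, *tmp;
        	unsigned long flags;
        	size_t n_pages = 0;

        	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
        	list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) {
        		if (n_pages == n_req_pages)
        			break;
        		if (!trylock_page(page))
        			continue;	/* busy (e.g. being migrated), skip */
        		balloon_page_delete(page);
        		__count_vm_event(BALLOON_DEFLATE);
        		list_add(&page->lru, pages);	/* hand it to the caller */
        		unlock_page(page);
        		n_pages++;
        	}
        	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);

        	return n_pages;	/* may be 0 -- an empty list is not an error */
        }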

    Cc: Jason Wang
    Cc: linux-mm@kvack.org
    Cc: virtualization@lists.linux-foundation.org
    Acked-by: Michael S. Tsirkin
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

15 Nov, 2017

1 commit

  • fill_balloon() doing memory allocations under balloon_lock can cause a
    deadlock when leak_balloon() is called from virtballoon_oom_notify() and
    tries to take the same lock.

    To fix this, split the page allocation from the enqueue and do the
    allocations outside the lock.

    Here's a detailed analysis of the deadlock by Tetsuo Handa:

    In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
    serialize against fill_balloon(). But in fill_balloon(),
    alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
    called with the vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
    implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, even though
    __GFP_NORETRY is specified, this allocation attempt might indirectly
    depend on somebody else's __GFP_DIRECT_RECLAIM memory allocation. Such an
    indirect __GFP_DIRECT_RECLAIM allocation might call leak_balloon() via
    virtballoon_oom_notify() via the blocking_notifier_call_chain() callback
    via out_of_memory() when it reaches __alloc_pages_may_oom() and holds the
    oom_lock mutex. Since the vb->balloon_lock mutex is already held by
    fill_balloon(), this causes an OOM lockup.

    Thread1                                       Thread2
    fill_balloon()
      takes a balloon_lock
      balloon_page_enqueue()
        alloc_page(GFP_HIGHUSER_MOVABLE)
          direct reclaim (__GFP_FS context)       takes a fs lock
            waits for that fs lock                alloc_page(GFP_NOFS)
                                                    __alloc_pages_may_oom()
                                                      takes the oom_lock
                                                      out_of_memory()
                                                        blocking_notifier_call_chain()
                                                          leak_balloon()
                                                            tries to take that balloon_lock and deadlocks
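
    A hedged sketch of the split described above, as it might look in
    fill_balloon() (virtio-balloon field names assumed; pfn bookkeeping and
    error handling trimmed):

        /* drivers/virtio/virtio_balloon.c (sketch) */
        static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
        {
        	unsigned num_allocated_pages = 0;
        	struct page *page;
        	LIST_HEAD(pages);

        	/* 1) allocate with no locks held, so reclaim/OOM deflation
        	 *    can always take balloon_lock and make progress */
        	while (num_allocated_pages < num) {
        		page = balloon_page_alloc();
        		if (!page)
        			break;
        		balloon_page_push(&pages, page);
        		num_allocated_pages++;
        	}

        	/* 2) only the cheap enqueue runs under balloon_lock */
        	mutex_lock(&vb->balloon_lock);
        	while ((page = balloon_page_pop(&pages)))
        		balloon_page_enqueue(&vb->vb_dev_info, page);
        	if (num_allocated_pages)
        		tell_host(vb, vb->inflate_vq);
        	mutex_unlock(&vb->balloon_lock);

        	return num_allocated_pages;
        }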

    Reported-by: Tetsuo Handa
    Cc: Michal Hocko
    Cc: Wei Wang
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     

09 Sep, 2017

1 commit

  • Introduce a new migration mode that allows offloading the copy to a
    device DMA engine. This changes the workflow of migration, and not all
    address_space migratepage callbacks can support it.

    This is intended to be used by migrate_vma(), which itself is used for
    things like HMM (see include/linux/hmm.h).

    No additional per-filesystem migratepage testing is needed. I disabled
    MIGRATE_SYNC_NO_COPY in all problematic migratepage() callbacks and
    added comments there to explain why (part of this patch). To be clear:
    any callback that wishes to support this new mode needs to be aware of
    how the migration flow differs from the other modes.

    Some of these callbacks do extra locking while copying (aio, zsmalloc,
    balloon, ...), and for DMA to be effective you want to copy multiple
    pages in one DMA operation. But in the problematic cases you cannot
    easily hold the extra lock across multiple calls to this callback.

    The usual flow is:

    For each page {
      1 - lock page
      2 - call migratepage() callback
      3 - (extra locking in some migratepage() callbacks)
      4 - migrate page state (freeze refcount, update page cache, buffer
          head, ...)
      5 - copy page
      6 - (unlock any extra lock of the migratepage() callback)
      7 - return from migratepage() callback
      8 - unlock page
    }

    The new mode MIGRATE_SYNC_NO_COPY:

    1 - lock multiple pages
    For each page {
      2 - call migratepage() callback
      3 - abort in all problematic migratepage() callbacks
      4 - migrate page state (freeze refcount, update page cache, buffer
          head, ...)
    } // finished all calls to migratepage() callback
    5 - DMA copy multiple pages
    6 - unlock all the pages

    To support MIGRATE_SYNC_NO_COPY in the problematic cases we would need a
    new callback, for instance migratepages(), that deals with multiple
    pages in one transaction.

    Because the problematic cases are not important for current usage, I did
    not want to complicate this patchset even further for no good reason.
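
    A hedged sketch of the abort in step 3 of the MIGRATE_SYNC_NO_COPY flow
    above, for an illustrative (not a specific filesystem's) migratepage()
    callback:

        static int example_migratepage(struct address_space *mapping,
        			       struct page *newpage, struct page *page,
        			       enum migrate_mode mode)
        {
        	/*
        	 * This callback copies under an extra lock that cannot be
        	 * held across the batched, copy-less MIGRATE_SYNC_NO_COPY
        	 * flow, so the new mode is simply refused.
        	 */
        	if (mode == MIGRATE_SYNC_NO_COPY)
        		return -EINVAL;

        	/* ... normal handling for the other migration modes ... */
        	return migrate_page(mapping, newpage, page, mode);
        }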

    Link: http://lkml.kernel.org/r/20170817000548.32038-14-jglisse@redhat.com
    Signed-off-by: Jérôme Glisse
    Cc: Aneesh Kumar
    Cc: Balbir Singh
    Cc: Benjamin Herrenschmidt
    Cc: Dan Williams
    Cc: David Nellans
    Cc: Evgeny Baskakov
    Cc: Johannes Weiner
    Cc: John Hubbard
    Cc: Kirill A. Shutemov
    Cc: Mark Hairgrove
    Cc: Michal Hocko
    Cc: Paul E. McKenney
    Cc: Ross Zwisler
    Cc: Sherry Cheung
    Cc: Subhash Gutti
    Cc: Vladimir Davydov
    Cc: Bob Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jérôme Glisse
     

11 Aug, 2017

1 commit

  • Revert commit bb01b64cfab7 ("mm/balloon_compaction.c: enqueue zero page
    to balloon device").

    Zeroing balloon pages is rather time consuming, especially when a lot of
    pages are in flight. E.g. 7GB worth of ballooned memory takes 2.8s with
    __GFP_ZERO while it takes ~491ms without it.

    The original commit argued that zeroing will help ksmd merge these pages
    on the host, but that argument assumes the host actually marks balloon
    pages for KSM, which is not universally true. So we would pay a
    performance penalty for something that might not even be used in the
    end, which is wrong. The host can zero out pages on its own when there
    is a need.

    [mhocko@kernel.org: new changelog text]
    Link: http://lkml.kernel.org/r/1501761557-9758-1-git-send-email-wei.w.wang@intel.com
    Fixes: bb01b64cfab7 ("mm/balloon_compaction.c: enqueue zero page to balloon device")
    Signed-off-by: Wei Wang
    Acked-by: Michael S. Tsirkin
    Acked-by: Michal Hocko
    Cc: zhenwei.pi
    Cc: David Hildenbrand
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wei Wang
     

11 Jul, 2017

1 commit

  • Presently, pages in the balloon device have random values, and these
    pages will be scanned by ksmd on the host. They usually cannot be
    merged. Enqueueing zero pages resolves this problem.
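
    A hedged sketch of the mechanism (the helper name is hypothetical; the
    point is adding __GFP_ZERO so every inflated page is zero-filled and
    therefore trivially mergeable by ksmd on the host):

        /* hypothetical helper illustrating the allocation-site change */
        static struct page *balloon_alloc_zeroed_page(void)
        {
        	return alloc_page(balloon_mapping_gfp_mask() |
        			  __GFP_NOMEMALLOC | __GFP_NORETRY |
        			  __GFP_ZERO);
        }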

    Link: http://lkml.kernel.org/r/1498698637-26389-1-git-send-email-zhenwei.pi@youruncloud.com
    Signed-off-by: zhenwei.pi
    Cc: Gioh Kim
    Cc: Vlastimil Babka
    Cc: Minchan Kim
    Cc: Konstantin Khlebnikov
    Cc: Rafael Aquini
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zhenwei.pi
     

27 Jul, 2016

1 commit

  • Now that the VM has a feature to migrate non-LRU movable pages, the
    balloon doesn't need custom migration hooks in migrate.c and
    compaction.c.

    Instead, this patch implements the page->mapping->a_ops->
    {isolate|migrate|putback} functions.

    With that, we can remove the hooks for ballooning from the general
    migration functions and make balloon compaction simple.
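
    A hedged sketch of wiring those callbacks up (function names assumed
    from the balloon-compaction code of that era; a driver can also plug in
    its own .migratepage):

        #include <linux/balloon_compaction.h>
        #include <linux/fs.h>

        static const struct address_space_operations balloon_aops = {
        	.migratepage	= balloon_page_migrate,	/* move page + state */
        	.isolate_page	= balloon_page_isolate,	/* detach from balloon list */
        	.putback_page	= balloon_page_putback,	/* reattach on failure */
        };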

    [akpm@linux-foundation.org: compaction.h requires that the includer first include node.h]
    Link: http://lkml.kernel.org/r/1464736881-24886-4-git-send-email-minchan@kernel.org
    Signed-off-by: Gioh Kim
    Signed-off-by: Minchan Kim
    Acked-by: Vlastimil Babka
    Cc: Rafael Aquini
    Cc: Konstantin Khlebnikov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     

18 Mar, 2016

1 commit


15 Feb, 2016

1 commit


13 Jan, 2016

1 commit

  • In balloon_page_dequeue, pages_lock should cover the loop (ie,
    list_for_each_entry_safe). Otherwise, the cursor page could be isolated
    by compaction, and the list_del done by isolation could poison
    page->lru.{prev,next}, so the loop could end up accessing a wrong
    address, as in the oops below. This patch fixes the bug.

    general protection fault: 0000 [#1] SMP
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 2 PID: 82 Comm: vballoon Not tainted 4.4.0-rc5-mm1-access_bit+ #1906
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    task: ffff8800a7ff0000 ti: ffff8800a7fec000 task.ti: ffff8800a7fec000
    RIP: 0010:[] [] balloon_page_dequeue+0x54/0x130
    RSP: 0018:ffff8800a7fefdc0 EFLAGS: 00010246
    RAX: ffff88013fff9a70 RBX: ffffea000056fe00 RCX: 0000000000002b7d
    RDX: ffff88013fff9a70 RSI: ffffea000056fe00 RDI: ffff88013fff9a68
    RBP: ffff8800a7fefde8 R08: ffffea000056fda0 R09: 0000000000000000
    R10: ffff8800a7fefd90 R11: 0000000000000001 R12: dead0000000000e0
    R13: ffffea000056fe20 R14: ffff880138809070 R15: ffff880138809060
    FS: 0000000000000000(0000) GS:ffff88013fc40000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00007f229c10e000 CR3: 00000000b8b53000 CR4: 00000000000006a0
    Stack:
    0000000000000100 ffff880138809088 ffff880138809000 ffff880138809060
    0000000000000046 ffff8800a7fefe28 ffffffff812c86d3 ffff880138809020
    ffff880138809000 fffffffffff91900 0000000000000100 ffff880138809060
    Call Trace:
    [] leak_balloon+0x93/0x1a0
    [] balloon+0x217/0x2a0
    [] ? __schedule+0x31e/0x8b0
    [] ? abort_exclusive_wait+0xb0/0xb0
    [] ? update_balloon_stats+0xf0/0xf0
    [] kthread+0xc9/0xe0
    [] ? kthread_park+0x60/0x60
    [] ret_from_fork+0x3f/0x70
    [] ? kthread_park+0x60/0x60
    Code: 8d 60 e0 0f 84 af 00 00 00 48 8b 43 20 a8 01 75 3b 48 89 d8 f0 0f ba 28 00 72 10 48 8b 03 f6 c4 08 75 2f 48 89 df e8 8c 83 f9 ff 8b 44 24 20 4d 8d 6c 24 20 48 83 e8 20 4d 39 f5 74 7a 4c 89
    RIP [] balloon_page_dequeue+0x54/0x130
    RSP
    ---[ end trace 43cf28060d708d5f ]---
    Kernel panic - not syncing: Fatal exception
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Kernel Offset: disabled
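
    A hedged sketch of the locking change described above (isolation and
    accounting details trimmed): the whole list walk now runs under
    pages_lock.

        /* mm/balloon_compaction.c (sketch) */
        struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info)
        {
        	struct page *page, *tmp, *dequeued = NULL;
        	unsigned long flags;

        	/* hold pages_lock across the walk so compaction cannot
        	 * isolate (and list_del) the cursor page under our feet */
        	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
        	list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) {
        		if (!trylock_page(page))
        			continue;	/* being isolated/migrated, skip */
        		balloon_page_delete(page);
        		__count_vm_event(BALLOON_DEFLATE);
        		unlock_page(page);
        		dequeued = page;
        		break;
        	}
        	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);

        	return dequeued;
        }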

    Cc:
    Signed-off-by: Minchan Kim
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Rafael Aquini

    Minchan Kim
     

06 Nov, 2015

1 commit

  • Clean up page migration a little by moving the trylock of newpage from
    move_to_new_page() into __unmap_and_move(), where the old page has been
    locked. Adjust unmap_and_move_huge_page() and balloon_page_migrate()
    accordingly.

    But make one kind-of-functional change on the way: whereas trylock of
    newpage used to BUG() if it failed, now simply return -EAGAIN if so.
    Cutting out BUG()s is good, right? But, to be honest, this is really to
    extend the usefulness of the custom put_new_page feature, allowing a pool
    of new pages to be shared perhaps with racing uses.

    Use an "else" instead of that "skip_unmap" label.

    Signed-off-by: Hugh Dickins
    Cc: Christoph Lameter
    Cc: "Kirill A. Shutemov"
    Cc: Rik van Riel
    Cc: Vlastimil Babka
    Cc: Davidlohr Bueso
    Cc: Oleg Nesterov
    Cc: Sasha Levin
    Cc: Dmitry Vyukov
    Cc: KOSAKI Motohiro
    Acked-by: Rafael Aquini
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

30 Oct, 2014

1 commit

  • If CONFIG_BALLOON_COMPACTION=n, balloon_page_insert() does not link
    pages with the balloon and doesn't set the PagePrivate flag; as a
    result, balloon_page_dequeue() cannot get any pages, because it thinks
    all of them are isolated. Without balloon compaction nobody can isolate
    ballooned pages, so it is safe to remove this check.
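
    A hedged sketch of the dequeue loop body after the change (surrounding
    loop and locking elided): the "raced with isolation" check only makes
    sense when balloon compaction is compiled in.

        	if (trylock_page(page)) {
        #ifdef CONFIG_BALLOON_COMPACTION
        		if (!PagePrivate(page)) {
        			/* raced with isolation -- only possible when
        			 * compaction can isolate balloon pages */
        			unlock_page(page);
        			continue;
        		}
        #endif
        		balloon_page_delete(page);
        		__count_vm_event(BALLOON_DEFLATE);
        		unlock_page(page);
        		dequeued_page = true;
        		break;
        	}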

    Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management").
    Signed-off-by: Konstantin Khlebnikov
    Reported-by: Matt Mullins
    Cc: [3.17]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

10 Oct, 2014

3 commits

  • Always mark pages with PageBalloon, even if balloon compaction is
    disabled, and expose this mark in /proc/kpageflags as KPF_BALLOON.

    This patch also adds three counters to /proc/vmstat: "balloon_inflate",
    "balloon_deflate" and "balloon_migrate". They accumulate balloon
    activity; the current size of the balloon is (balloon_inflate -
    balloon_deflate) pages.

    All generic balloon code is now gathered under the CONFIG_MEMORY_BALLOON
    option. It should be selected by any ballooning driver that wants to use
    this feature. Currently virtio-balloon is the only user.
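
    A hedged sketch of the accounting (event names taken from the message;
    the exact placement of the enum entries and the call sites are
    assumptions):

        /* include/linux/vm_event_item.h (sketch) */
        #ifdef CONFIG_MEMORY_BALLOON
        	BALLOON_INFLATE,
        	BALLOON_DEFLATE,
        #ifdef CONFIG_BALLOON_COMPACTION
        	BALLOON_MIGRATE,
        #endif
        #endif

        /* bumped from the generic balloon paths, e.g. on inflate: */
        	__count_vm_event(BALLOON_INFLATE);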

    Signed-off-by: Konstantin Khlebnikov
    Cc: Rafael Aquini
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Ballooned pages are now detected using PageBalloon(), so the fake
    mapping is no longer required. This patch links ballooned pages to the
    balloon device via the page->private field instead of page->mapping. It
    also embeds balloon_dev_info directly into struct virtio_balloon.
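
    A hedged sketch of the new linkage, in the style of the
    include/linux/balloon_compaction.h helpers (PagePrivate handling
    abbreviated):

        static inline void balloon_page_insert(struct balloon_dev_info *balloon,
        				       struct page *page)
        {
        	__SetPageBalloon(page);
        	set_page_private(page, (unsigned long)balloon);	/* link to device */
        	list_add(&page->lru, &balloon->pages);
        }

        static inline struct balloon_dev_info *balloon_page_device(struct page *page)
        {
        	return (struct balloon_dev_info *)page_private(page);
        }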

    Signed-off-by: Konstantin Khlebnikov
    Cc: Rafael Aquini
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Sasha Levin reported a KASAN splat inside isolate_migratepages_range().
    The problem is in the function __is_movable_balloon_page(), which tests
    AS_BALLOON_MAP in page->mapping->flags. This function has no protection
    against anonymous pages, so it ended up checking address-space flags
    inside a struct anon_vma.

    Further investigation shows more problems in the current implementation:

    * The special branch in __unmap_and_move() never works:
    balloon_page_movable() checks page flags and page_count. In
    __unmap_and_move() the page is locked and its reference counter is
    elevated, so balloon_page_movable() always fails and execution goes down
    the normal migration path. virtballoon_migratepage() returns
    MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
    move_to_new_page() treats this as an error code and assigns
    newpage->mapping to NULL, and the newly migrated page loses its
    connection to the balloon and all ability for further migration.

    * lru_lock is erroneously required in isolate_migratepages_range() for
    isolating ballooned pages. Since this function releases lru_lock
    periodically, migration becomes mostly impossible for some pages.

    * balloon_page_dequeue has a tight race with balloon_page_isolate:
    balloon_page_isolate can run in parallel with the dequeue between
    picking a page from the list and taking the page lock. The race is rare
    because both use trylock_page() for locking.

    This patch fixes all of them.

    Instead of a fake mapping with a special flag, this patch uses a special
    state of page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256. The buddy
    allocator uses PAGE_BUDDY_MAPCOUNT_VALUE = -128 for a similar purpose.
    Storing the mark directly in struct page makes everything safer and
    easier.

    PagePrivate is used to mark pages present in the page list (i.e. not
    isolated, like PageLRU for normal pages). It replaces the special rules
    for the reference counter and makes balloon migration similar to the
    migration of normal pages. The flag is protected by the page lock
    together with the link to the balloon device.
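
    A hedged sketch of the _mapcount mark (mirroring how the buddy allocator
    marks its pages; as it might appear in include/linux/page-flags.h):

        #define PAGE_BALLOON_MAPCOUNT_VALUE (-256)

        static inline int PageBalloon(struct page *page)
        {
        	return atomic_read(&page->_mapcount) == PAGE_BALLOON_MAPCOUNT_VALUE;
        }

        static inline void __SetPageBalloon(struct page *page)
        {
        	VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
        	atomic_set(&page->_mapcount, PAGE_BALLOON_MAPCOUNT_VALUE);
        }

        static inline void __ClearPageBalloon(struct page *page)
        {
        	VM_BUG_ON_PAGE(!PageBalloon(page), page);
        	atomic_set(&page->_mapcount, -1);
        }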

    Signed-off-by: Konstantin Khlebnikov
    Reported-by: Sasha Levin
    Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
    Cc: Rafael Aquini
    Cc: Andrey Ryabinin
    Cc: [3.8+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

24 Jan, 2014

1 commit

  • bad_page() is cool in that it prints out a bunch of data about the page.
    But I can never remember which page flags are good and which are bad, or
    whether ->index or ->mapping is required to be NULL.

    This patch allows bad/dump_page() callers to specify a string describing
    why they are dumping the page, and adds explanation strings to a number
    of places. It also adds a 'bad_flags' argument to bad_page(), which it
    then dumps out separately from the flags which are actually set.

    This way, the messages will show specifically why the page was bad and
    specifically which flags are being complained about, if a page-flag
    combination was the problem.
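
    A hedged sketch of the resulting interface and an illustrative call site
    (the reason string shown is made up):

        void dump_page(struct page *page, const char *reason);

        /* e.g. at a call site that trips over an unexpected page state: */
        if (unlikely(page_mapped(page)))
        	dump_page(page, "page still mapped at deletion");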

    [akpm@linux-foundation.org: switch to pr_alert]
    Signed-off-by: Dave Hansen
    Reviewed-by: Christoph Lameter
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     

12 Dec, 2012

1 commit

  • Memory fragmentation introduced by ballooning can significantly reduce
    the number of 2MB contiguous memory blocks that can be used within a
    guest, thus imposing the performance penalties associated with the
    reduced number of transparent huge pages available to the guest
    workload.

    This patch introduces a common interface to help a balloon driver make
    its page set movable by compaction, thus allowing the system to better
    leverage compaction's efforts at memory defragmentation.
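
    A hedged sketch of how a driver might use such an interface to inflate
    and deflate (current-era signatures; the original 2012 interface
    differed in detail):

        #include <linux/balloon_compaction.h>

        static void inflate_one(struct balloon_dev_info *b_dev_info)
        {
        	struct page *page = balloon_page_alloc();

        	if (page)
        		/* page is now tracked by the balloon and visible
        		 * to compaction */
        		balloon_page_enqueue(b_dev_info, page);
        }

        static void deflate_one(struct balloon_dev_info *b_dev_info)
        {
        	struct page *page = balloon_page_dequeue(b_dev_info);

        	if (page)
        		__free_page(page);	/* give it back to the guest */
        }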

    [akpm@linux-foundation.org: use PAGE_FLAGS_CHECK_AT_PREP, s/__balloon_page_flags/page_flags_cleared/, small cleanups]
    [rientjes@google.com: allow balloon compaction for any system with memory compaction enabled, which is the defconfig]
    Signed-off-by: Rafael Aquini
    Acked-by: Mel Gorman
    Cc: Rusty Russell
    Cc: "Michael S. Tsirkin"
    Cc: Rik van Riel
    Cc: Andi Kleen
    Cc: Konrad Rzeszutek Wilk
    Cc: Minchan Kim
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael Aquini