17 Oct, 2020

2 commits

  • The current page_order() can only be called on pages in the buddy
    allocator. For compound pages, you have to use compound_order(). This is
    confusing and led to a bug, so rename page_order() to buddy_order().

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20201001152259.14932-2-willy@infradead.org
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
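
The distinction behind the rename above can be sketched in user space. This is a toy model, not kernel code: `struct toy_page`, its fields, and both `toy_*` helpers are hypothetical names chosen for illustration. It only shows why a helper that is valid solely for buddy-allocator pages benefits from a name that says so.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; all field names are illustrative. */
struct toy_page {
    bool buddy;                      /* page sits on a buddy free list */
    bool compound_head;              /* page is the head of a compound page */
    unsigned int order_on_free_list; /* only meaningful while buddy is set */
    unsigned int order_as_compound;  /* only meaningful for a compound head */
};

/* Valid only for pages in the buddy allocator, as the new name suggests. */
static unsigned int toy_buddy_order(const struct toy_page *page)
{
    assert(page->buddy);
    return page->order_on_free_list;
}

/* For compound pages; anything that is not a compound head is order 0. */
static unsigned int toy_compound_order(const struct toy_page *page)
{
    if (!page->compound_head)
        return 0;
    return page->order_as_compound;
}
```

Calling `toy_buddy_order()` on an allocated compound page trips the assertion in this sketch, which is the kind of misuse the rename is meant to make harder to write.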
     
  • list_for_each_entry_safe() guarantees that we will never stumble over the
    list head; "&page->lru != list" will always evaluate to true. Let's
    simplify.

    [david@redhat.com: Changelog refinements]

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: David Hildenbrand
    Reviewed-by: Alexander Duyck
    Link: http://lkml.kernel.org/r/20200818084448.33969-1-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
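
The claim in the commit above can be checked with a small user-space re-creation of the list macros. This is a sketch under stated assumptions: the list helpers below are simplified copies written for this example, `struct toy_page` and `drain_list()` are hypothetical, and `__typeof__` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = entry;
}

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The loop condition already stops before the list head is visited. */
#define list_for_each_entry_safe(pos, n, head, member)                    \
    for (pos = container_of((head)->next, __typeof__(*pos), member),      \
         n = container_of(pos->member.next, __typeof__(*pos), member);    \
         &pos->member != (head);                                          \
         pos = n, n = container_of(n->member.next, __typeof__(*n), member))

struct toy_page { int id; struct list_head lru; };

/* Remove every entry; returns how many entries the loop body saw. */
static int drain_list(struct list_head *list)
{
    struct toy_page *page, *next;
    int visited = 0;

    list_for_each_entry_safe(page, next, list, lru) {
        /* Inside the body this always holds, so an extra check is redundant. */
        assert(&page->lru != list);
        list_del(&page->lru);
        visited++;
    }
    return visited;
}
```

Because the macro's own termination test is `&pos->member != (head)`, repeating that comparison inside the loop body adds nothing.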
     

08 Apr, 2020

3 commits

  • In order to keep ourselves from reporting pages that are just going to be
    reused again in the case of heavy churn we can put a limit on how many
    total pages we will process per pass. Doing this will allow the worker
    thread to go into idle much more quickly so that we avoid competing with
    other threads that might be allocating or freeing pages.

    The logic added here will limit the worker thread to no more than one
    sixteenth of the total free pages in a given area per list. Once that
    limit is reached it will update the state so that at the end of the pass
    we will reschedule the worker to try again in 2 seconds when the memory
    churn has hopefully settled down.

    Again this optimization doesn't show much of a benefit in the standard
    case as the memory churn is minimal. However with page allocator shuffling
    enabled the gain is quite noticeable. Below are the results with a THP
    enabled version of the will-it-scale page_fault1 test showing the
    improvement in iterations for 16 processes or threads.

    Without:
    tasks  processes   processes_idle  threads     threads_idle
    16     8283274.75  0.17            5594261.00  38.15

    With:
    tasks  processes   processes_idle  threads     threads_idle
    16     8767010.50  0.21            5791312.75  36.98

    Signed-off-by: Alexander Duyck
    Signed-off-by: Andrew Morton
    Acked-by: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Hildenbrand
    Cc: Konrad Rzeszutek Wilk
    Cc: Luiz Capitulino
    Cc: Matthew Wilcox
    Cc: Michael S. Tsirkin
    Cc: Michal Hocko
    Cc: Nitesh Narayan Lal
    Cc: Oscar Salvador
    Cc: Pankaj Gupta
    Cc: Paolo Bonzini
    Cc: Rik van Riel
    Cc: Vlastimil Babka
    Cc: Wei Wang
    Cc: Yang Zhang
    Cc: wei qi
    Link: http://lkml.kernel.org/r/20200211224719.29318.72113.stgit@localhost.localdomain
    Signed-off-by: Linus Torvalds

    Alexander Duyck
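
The per-list limit described in the commit above can be sketched as a simple budget. This is an illustration only: it counts individual pages, whereas the kernel's accounting may differ (for example by counting batches), and `struct pass_result` and `report_one_list()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

struct pass_result {
    unsigned long reported; /* pages reported during this pass */
    bool reschedule;        /* budget ran out, try again later */
};

/*
 * Process one free list with a budget of one sixteenth (rounded up) of
 * the pages that were free when the pass started.
 */
static struct pass_result report_one_list(unsigned long nr_free,
                                          unsigned long nr_unreported)
{
    struct pass_result res = { 0, false };
    unsigned long budget = (nr_free + 15) / 16;

    assert(nr_unreported <= nr_free);

    while (nr_unreported > 0) {
        if (budget == 0) {
            /* Stop early and ask for the worker to be rescheduled. */
            res.reschedule = true;
            break;
        }
        budget--;
        nr_unreported--;
        res.reported++;
    }
    return res;
}
```

With 160 free pages that are all unreported, this sketch reports 10 pages and requests a reschedule; with only 5 unreported pages it finishes the list within the budget.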
     
  • Rather than walking over the same pages again and again to get to the
    pages that have yet to be reported we can save ourselves a significant
    amount of time by simply rotating the list so that when we have a full
    list of reported pages the head of the list is pointing to the next
    non-reported page. Doing this should save us some significant time when
    processing each free list.

    This doesn't gain us much in the standard case as all of the non-reported
    pages should be near the top of the list already. However in the case of
    page shuffling this results in a noticeable improvement. Below are the
    will-it-scale page_fault1 w/ THP numbers for 16 tasks with and without
    this patch.

    Without:
    tasks  processes   processes_idle  threads     threads_idle
    16     8093776.25  0.17            5393242.00  38.20

    With:
    tasks  processes   processes_idle  threads     threads_idle
    16     8283274.75  0.17            5594261.00  38.15

    Signed-off-by: Alexander Duyck
    Signed-off-by: Andrew Morton
    Acked-by: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Hildenbrand
    Cc: Konrad Rzeszutek Wilk
    Cc: Luiz Capitulino
    Cc: Matthew Wilcox
    Cc: Michael S. Tsirkin
    Cc: Michal Hocko
    Cc: Nitesh Narayan Lal
    Cc: Oscar Salvador
    Cc: Pankaj Gupta
    Cc: Paolo Bonzini
    Cc: Rik van Riel
    Cc: Vlastimil Babka
    Cc: Wei Wang
    Cc: Yang Zhang
    Cc: wei qi
    Link: http://lkml.kernel.org/r/20200211224708.29318.16862.stgit@localhost.localdomain
    Signed-off-by: Linus Torvalds

    Alexander Duyck
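
The rotation described in the commit above can be sketched on a circular doubly linked list. This is a simplified user-space model: the list helpers are rewritten for the example, `struct toy_page` and `rotate_to_first_unreported()` are hypothetical, and the scan for the first unreported page stands in for the point where the worker has to stop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };
struct toy_page { bool reported; struct list_head lru; };

#define lru_to_page(ptr) \
    ((struct toy_page *)((char *)(ptr) - offsetof(struct toy_page, lru)))

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Move the head so that entry becomes the first element of the list. */
static void list_rotate_to_front(struct list_head *entry,
                                 struct list_head *head)
{
    assert(entry != head);
    /* Unlink the head from the ring. */
    head->prev->next = head->next;
    head->next->prev = head->prev;
    /* Reinsert the head immediately before entry. */
    head->prev = entry->prev;
    head->next = entry;
    entry->prev->next = head;
    entry->prev = head;
}

/* Rotate so the first unreported page leads the list; NULL if none. */
static struct toy_page *rotate_to_first_unreported(struct list_head *list)
{
    struct list_head *pos;

    for (pos = list->next; pos != list; pos = pos->next) {
        struct toy_page *page = lru_to_page(pos);

        if (!page->reported) {
            list_rotate_to_front(pos, list);
            return page;
        }
    }
    return NULL;
}
```

After the rotation the already reported pages sit at the tail, so the next pass starts on an unreported page instead of walking over reported ones again.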
     
  • In order to pave the way for free page reporting in virtualized
    environments we will need a way to get pages out of the free lists and
    identify those pages after they have been returned. To accomplish this,
    this patch adds the concept of a Reported Buddy, which is essentially
    meant to just be the Uptodate flag used in conjunction with the Buddy page
    type.

    To prevent the reported pages from leaking outside of the buddy lists I
    added a check to clear the PageReported bit in the del_page_from_free_list
    function. As a result any reported page that is split, merged, or
    allocated will have the flag cleared prior to the PageBuddy value being
    cleared.

    The process for reporting pages is fairly simple. Once we free a page
    that meets the minimum order for page reporting we will schedule a worker
    thread to start 2s or more in the future. That worker thread will begin
    working from the lowest supported page reporting order up to MAX_ORDER - 1
    pulling unreported pages from the free list and storing them in the
    scatterlist.

    When processing each individual free list it is necessary for the worker
    thread to release the zone lock when it needs to stop and report the full
    scatterlist of pages. To reduce the work of the next iteration the worker
    thread will rotate the free list so that the first unreported page in the
    free list becomes the first entry in the list.

    It will then call a reporting function providing information on how many
    entries are in the scatterlist. Once the function completes it will
    return the pages to the free area from which they were allocated and start
    over pulling more pages from the free areas until there are no longer
    enough pages to report on to keep the worker busy, or we have processed as
    many pages as were contained in the free area when we started processing
    the list.

    The worker thread will work in a round-robin fashion making its way through
    each zone requesting reporting, and through each reportable free list
    within that zone. Once all free areas within the zone have been processed
    it will check to see if there have been any requests for reporting while
    it was processing. If so it will reschedule the worker thread to start up
    again in roughly 2s and exit.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Andrew Morton
    Acked-by: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Hildenbrand
    Cc: Konrad Rzeszutek Wilk
    Cc: Luiz Capitulino
    Cc: Matthew Wilcox
    Cc: Michael S. Tsirkin
    Cc: Michal Hocko
    Cc: Nitesh Narayan Lal
    Cc: Oscar Salvador
    Cc: Pankaj Gupta
    Cc: Paolo Bonzini
    Cc: Rik van Riel
    Cc: Vlastimil Babka
    Cc: Wei Wang
    Cc: Yang Zhang
    Cc: wei qi
    Link: http://lkml.kernel.org/r/20200211224635.29318.19750.stgit@localhost.localdomain
    Signed-off-by: Linus Torvalds

    Alexander Duyck
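
The flag handling described in the commit above, where the Reported state is dropped before the Buddy state whenever a page leaves a free list, can be modeled with two bits. This is a toy sketch: the `TOY_PG_*` bits, `struct toy_page`, and the `toy_*` functions are hypothetical and do not reflect how the kernel encodes page flags or page types.

```c
#include <assert.h>

#define TOY_PG_BUDDY    (1u << 0) /* page is on a buddy free list */
#define TOY_PG_REPORTED (1u << 1) /* page was already reported */

struct toy_page { unsigned int flags; };

/* Only a page that is currently a free buddy page may be marked reported. */
static void toy_mark_reported(struct toy_page *page)
{
    assert(page->flags & TOY_PG_BUDDY);
    page->flags |= TOY_PG_REPORTED;
}

/*
 * Taking a page off the free list (split, merge, or allocation) clears
 * the reported bit first, so the state cannot leak outside the free lists.
 */
static void toy_del_page_from_free_list(struct toy_page *page)
{
    assert(page->flags & TOY_PG_BUDDY);
    page->flags &= ~TOY_PG_REPORTED;
    page->flags &= ~TOY_PG_BUDDY;
}
```

In this model a page can never carry the reported bit without the buddy bit, which mirrors the invariant the commit message describes.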