Commit b25fd5de48e3ad079abd40a4ea946f126a8bceb0

Authored by Vlastimil Babka
Committed by Jiri Slaby
1 parent 71a5b80134

mm, compaction: properly signal and act upon lock and need_sched() contention

commit be9765722e6b7ece8263cbab857490332339bd6f upstream.

Compaction uses the compact_checklock_irqsave() function to periodically
check for lock contention and need_resched(), and to either abort async
compaction, or to release the lock, schedule and retake the lock.  When
aborting, cc->contended is set to signal the contended state to the
caller.  Two problems have been identified in this mechanism.
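
For reference, the lock-based side of this mechanism follows roughly the
pattern below.  This is a simplified sketch of compact_checklock_irqsave(),
not the verbatim source of this tree:

static bool compact_checklock_irqsave(spinlock_t *lock, unsigned long *flags,
				      bool locked, struct compact_control *cc)
{
	if (need_resched() || spin_is_contended(lock)) {
		if (locked) {
			spin_unlock_irqrestore(lock, *flags);
			locked = false;
		}

		/* async compaction aborts and reports contention upward */
		if (cc->mode == MIGRATE_ASYNC) {
			cc->contended = true;
			return false;
		}

		/* sync compaction yields the CPU instead of aborting */
		cond_resched();
	}

	/* (re)take the lock before continuing the scan */
	if (!locked)
		spin_lock_irqsave(lock, *flags);
	return true;
}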

First, compaction also calls cond_resched() directly in both scanners when
no lock is yet taken.  This call neither aborts async compaction nor sets
cc->contended appropriately.  This patch introduces a new
compact_should_abort() function to achieve both.  In isolate_freepages(),
the check frequency is reduced to once per SWAP_CLUSTER_MAX pageblocks
(i.e. once every 32 pageblocks with the usual SWAP_CLUSTER_MAX of 32) to
match what the migration scanner does in its preliminary page checks.  When
a pageblock is found suitable for calling isolate_freepages_block(), the
checks within that function run at a higher frequency.

Second, isolate_freepages() does not check whether isolate_freepages_block()
aborted due to contention, and simply advances to the next pageblock.  This
violates the principle of aborting on contention, and might result in
pageblocks not being scanned completely, since the scanning cursor is
advanced.  This problem was noticed in the code by Joonsoo Kim when
reviewing related patches.  This patch makes isolate_freepages() check the
cc->contended flag and abort.

In case isolate_freepages() has already isolated some pages before aborting
due to contention, page migration will proceed, which is OK since we do not
want to waste the work that has been done, and page migration has its own
checks for contention.  However, we do not want another isolation attempt
by either scanner, so a cc->contended flag check is also added to
compaction_alloc() and compact_finished() to make sure compaction is
aborted right after the migration.

The outcome of the patch should be reduced lock contention by async
compaction and lower latencies for higher-order allocations where direct
compaction is involved.
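
The contended flag is ultimately consumed by the direct compaction entry
path, which is how the latency improvement becomes visible to the
allocator.  A simplified sketch, with details elided and the exact
signature depending on the surrounding backports:

static unsigned long compact_zone_order(struct zone *zone, int order,
			gfp_t gfp_mask, enum migrate_mode mode, bool *contended)
{
	unsigned long ret;
	struct compact_control cc = {
		.order = order,
		.zone = zone,
		.mode = mode,
	};

	ret = compact_zone(zone, &cc);

	/* report why compaction bailed out so the allocator can back off */
	*contended = cc.contended;
	return ret;
}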

[akpm@linux-foundation.org: fix typo in comment]
Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Showing 2 changed files with 48 additions and 11 deletions

--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -222,6 +222,30 @@
 	return true;
 }
 
+/*
+ * Aside from avoiding lock contention, compaction also periodically checks
+ * need_resched() and either schedules in sync compaction or aborts async
+ * compaction. This is similar to what compact_checklock_irqsave() does, but
+ * is used where no lock is concerned.
+ *
+ * Returns false when no scheduling was needed, or sync compaction scheduled.
+ * Returns true when async compaction should abort.
+ */
+static inline bool compact_should_abort(struct compact_control *cc)
+{
+	/* async compaction aborts if contended */
+	if (need_resched()) {
+		if (cc->mode == MIGRATE_ASYNC) {
+			cc->contended = true;
+			return true;
+		}
+
+		cond_resched();
+	}
+
+	return false;
+}
+
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
@@ -495,11 +519,8 @@
 			return 0;
 	}
 
-	if (cond_resched()) {
-		/* Async terminates prematurely on need_resched() */
-		if (cc->mode == MIGRATE_ASYNC)
-			return 0;
-	}
+	if (compact_should_abort(cc))
+		return 0;
 
 	/* Time to isolate some pages for migration */
 	for (; low_pfn < end_pfn; low_pfn++) {
@@ -718,9 +739,11 @@
 		/*
 		 * This can iterate a massively long zone without finding any
 		 * suitable migration targets, so periodically check if we need
-		 * to schedule.
+		 * to schedule, or even abort async compaction.
 		 */
-		cond_resched();
+		if (!(block_start_pfn % (SWAP_CLUSTER_MAX * pageblock_nr_pages))
+						&& compact_should_abort(cc))
+			break;
 
 		if (!pfn_valid(block_start_pfn))
 			continue;
@@ -758,6 +781,13 @@
 		 */
 		if (isolated)
 			cc->finished_update_free = true;
+
+		/*
+		 * isolate_freepages_block() might have aborted due to async
+		 * compaction being contended
+		 */
+		if (cc->contended)
+			break;
 	}
 
 	/* split_free_page does not map the pages */
@@ -784,9 +814,13 @@
 	struct compact_control *cc = (struct compact_control *)data;
 	struct page *freepage;
 
-	/* Isolate free pages if necessary */
+	/*
+	 * Isolate free pages if necessary, and if we are not aborting due to
+	 * contention.
+	 */
 	if (list_empty(&cc->freepages)) {
-		isolate_freepages(cc->zone, cc);
+		if (!cc->contended)
+			isolate_freepages(cc->zone, cc);
 
 		if (list_empty(&cc->freepages))
 			return NULL;
@@ -856,7 +890,7 @@
 	unsigned int order;
 	unsigned long watermark;
 
-	if (fatal_signal_pending(current))
+	if (cc->contended || fatal_signal_pending(current))
 		return COMPACT_PARTIAL;
 
 	/* Compaction run completes if the migrate and free scanner meet */
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -145,7 +145,10 @@
 	int order;			/* order a direct compactor needs */
 	int migratetype;		/* MOVABLE, RECLAIMABLE etc */
 	struct zone *zone;
-	bool contended;			/* True if a lock was contended */
+	bool contended;			/* True if a lock was contended, or
+					 * need_resched() true during async
+					 * compaction
+					 */
 };
 
 unsigned long