Commit dc6e29da9162fa8fa2a9e798569c0f6e87975614

Authored by Linus Torvalds
1 parent 5263bf65d6

Fix balance_dirty_pages() calculations with CONFIG_HIGHMEM

This makes balance_dirty_pages() always base its calculations on the
amount of non-highmem memory in the machine, rather than trying to base
them on total memory and then falling back on non-highmem memory if the
mapping it was writing wasn't highmem capable.

This not only fixes a situation where two different writers can have
wildly different notions about what is a "balanced" dirty state, but it
also means that people with highmem machines don't run into an OOM
situation when regular memory fills up with dirty pages.
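
To make the two effects concrete, here is a minimal user-space sketch of
the available-memory rule before and after this change. It is not the
kernel code: struct mem_model and the helpers available_old(),
available_new() and dirty_limit() are hypothetical names made up for the
illustration, and the dirty limit is assumed to be a plain percentage of
available memory.

```c
#include <stdbool.h>

/* Hypothetical user-space model of one machine; not a kernel structure. */
struct mem_model {
	unsigned long total_pages;	/* stands in for vm_total_pages */
	unsigned long highmem_pages;	/* stands in for totalhigh_pages */
};

/*
 * Old rule: subtract highmem only when the mapping being written
 * cannot allocate from highmem.
 */
unsigned long available_old(const struct mem_model *m,
			    bool mapping_lowmem_only)
{
	unsigned long available = m->total_pages;

	if (mapping_lowmem_only)
		available -= m->highmem_pages;
	return available;
}

/* New rule: always subtract highmem, whatever the mapping is. */
unsigned long available_new(const struct mem_model *m)
{
	return m->total_pages - m->highmem_pages;
}

/* Dirty limit in pages for a percentage ratio (assumed simple scaling). */
unsigned long dirty_limit(unsigned long available, unsigned int ratio)
{
	return (unsigned long)ratio * available / 100;
}
```

Taking an illustrative machine with 1048576 pages in total, 819200 of
them highmem (so 229376 lowmem pages), and an assumed ratio of 40, the
old rule gives a highmem-capable writer a limit of 419430 pages and a
lowmem-only writer a limit of 91750 pages. The first figure is larger
than all of lowmem, which is the OOM scenario described above. The new
rule gives every writer 91750.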

We used to try to handle the latter case in page_writeback_init() by
scaling down the dirty_ratio if the machine had a lot of highmem pages,
but that wasn't aggressive enough for some situations, and since basing
the dirty ratio on highmem memory was broken in the first place, let's
just stop doing so.
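
As a rough illustration of why the old scaling fell short, the sketch
below reproduces the arithmetic that this commit removes from
page_writeback_init(). The helper name scale_ratio_old() is hypothetical,
and the page counts in the note that follows are made-up inputs rather
than measurements.

```c
/*
 * Hypothetical helper mirroring the removed scaling: the ratio is cut
 * back only when buffer-capable pages are less than a quarter of all
 * pages, and it is never allowed to drop below 1.
 */
int scale_ratio_old(int ratio, long buffer_pages, long total_pages)
{
	long correction = (100 * 4 * buffer_pages) / total_pages;

	if (correction < 100) {
		ratio = (int)(ratio * correction / 100);
		if (ratio <= 0)
			ratio = 1;
	}
	return ratio;
}
```

With 229376 buffer-capable pages out of 1048576 total, the correction is
87 and an assumed ratio of 40 only drops to 34. With 4194304 total pages
the correction is 21 and the ratio drops to 8, but 8% of 4194304 pages
is still 335544 pages, more than the 229376 lowmem pages in this example.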

(A variation of this theme fixed Justin Piszcz's OOM problem when
copying an 18GB file on a RAID setup).

Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 1 changed file with 18 additions and 23 deletions

@@ -133,11 +133,9 @@
 
 #ifdef CONFIG_HIGHMEM
 	/*
-	 * If this mapping can only allocate from low memory,
-	 * we exclude high memory from our count.
+	 * We always exclude high memory from our count.
 	 */
-	if (mapping && !(mapping_gfp_mask(mapping) & __GFP_HIGHMEM))
-		available_memory -= totalhigh_pages;
+	available_memory -= totalhigh_pages;
 #endif
 
 
 
@@ -526,28 +524,25 @@
 };
 
 /*
- * If the machine has a large highmem:lowmem ratio then scale back the default
- * dirty memory thresholds: allowing too much dirty highmem pins an excessive
- * number of buffer_heads.
+ * Called early on to tune the page writeback dirty limits.
+ *
+ * We used to scale dirty pages according to how total memory
+ * related to pages that could be allocated for buffers (by
+ * comparing nr_free_buffer_pages() to vm_total_pages.
+ *
+ * However, that was when we used "dirty_ratio" to scale with
+ * all memory, and we don't do that any more. "dirty_ratio"
+ * is now applied to total non-HIGHPAGE memory (by subtracting
+ * totalhigh_pages from vm_total_pages), and as such we can't
+ * get into the old insane situation any more where we had
+ * large amounts of dirty pages compared to a small amount of
+ * non-HIGHMEM memory.
+ *
+ * But we might still want to scale the dirty_ratio by how
+ * much memory the box has..
  */
 void __init page_writeback_init(void)
 {
-	long buffer_pages = nr_free_buffer_pages();
-	long correction;
-
-	correction = (100 * 4 * buffer_pages) / vm_total_pages;
-
-	if (correction < 100) {
-		dirty_background_ratio *= correction;
-		dirty_background_ratio /= 100;
-		vm_dirty_ratio *= correction;
-		vm_dirty_ratio /= 100;
-
-		if (dirty_background_ratio <= 0)
-			dirty_background_ratio = 1;
-		if (vm_dirty_ratio <= 0)
-			vm_dirty_ratio = 1;
-	}
 	mod_timer(&wb_timer, jiffies + dirty_writeback_interval);
 	writeback_set_ratelimit();
 	register_cpu_notifier(&ratelimit_nb);