27 Jul, 2008

1 commit

  • mapping->tree_lock has no read lockers: convert the lock from an rwlock
    to a spinlock.
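
    A minimal before/after sketch of the conversion at a typical call site
    (illustrative, not the full patch):

        /* before: writers took the rwlock's write side */
        write_lock_irq(&mapping->tree_lock);
        /* ... update the radix tree ... */
        write_unlock_irq(&mapping->tree_lock);

        /* after: with no read lockers left, a plain spinlock suffices */
        spin_lock_irq(&mapping->tree_lock);
        /* ... update the radix tree ... */
        spin_unlock_irq(&mapping->tree_lock);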

    Signed-off-by: Nick Piggin
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Hugh Dickins
    Cc: "Paul E. McKenney"
    Reviewed-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

15 Jul, 2008

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
    ext4: Documentation update for new ordered mode and delayed allocation
    ext4: do not set extents feature from the kernel
    ext4: Don't allow nonextents mount option for large filesystem
    ext4: Enable delalloc by default.
    ext4: delayed allocation i_blocks fix for stat
    ext4: fix delalloc i_disksize early update issue
    ext4: Handle page without buffers in ext4_*_writepage()
    ext4: Add ordered mode support for delalloc
    ext4: Invert lock ordering of page_lock and transaction start in delalloc
    mm: Add range_cont mode for writeback
    ext4: delayed allocation ENOSPC handling
    percpu_counter: new function percpu_counter_sum_and_set
    ext4: Add delayed allocation support in data=writeback mode
    vfs: add hooks for ext4's delayed allocation support
    jbd2: Remove data=ordered mode support using jbd buffer heads
    ext4: Use new framework for data=ordered mode in JBD2
    jbd2: Implement data=ordered mode handling via inodes
    vfs: export filemap_fdatawrite_range()
    ext4: Fix lock inversion in ext4_ext_truncate()
    ext4: Invert the locking order of page_lock and transaction start
    ...

    Linus Torvalds
     

12 Jul, 2008

1 commit

  • Filesystems like ext4 need to start a new transaction in
    writepages for block allocation. This happens with delayed
    allocation, and there is a limit to how many credits we can request
    from the journal layer. So we call write_cache_pages multiple
    times with wbc->nr_to_write set to the maximum possible value
    limited by the max journal credits available.

    Add a new mode to writeback that enables us to handle this
    behaviour. In the new mode we update wbc->range_start
    to point to the new offset to be written. The next call to
    write_cache_pages will start writeout from the specified
    range_start offset. In the new mode we also limit writing
    to the specified wbc->range_end.
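
    A hedged sketch of the calling pattern this enables; example_writepage()
    and max_pages_per_transaction() are assumed placeholders, not real kernel
    functions:

        /* assumed placeholders standing in for the filesystem's own code */
        long max_pages_per_transaction(void);
        int example_writepage(struct page *page, struct writeback_control *wbc,
                              void *data);

        static int example_writepages(struct address_space *mapping,
                                      struct writeback_control *wbc)
        {
                long remaining = wbc->nr_to_write;
                int ret = 0;

                wbc->range_cont = 1;                    /* the new mode */
                while (remaining > 0 && !ret) {
                        long chunk = min(remaining, max_pages_per_transaction());

                        wbc->nr_to_write = chunk;       /* journal credit limit */
                        ret = write_cache_pages(mapping, wbc,
                                                example_writepage, mapping);
                        remaining -= chunk - wbc->nr_to_write;
                        if (wbc->nr_to_write > 0)
                                break;  /* no more dirty pages in the range */
                        /* wbc->range_start now points at the next offset to
                         * write; nothing past wbc->range_end is touched */
                }
                return ret;
        }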

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Acked-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     

24 May, 2008

1 commit

  • Currently there is no protection against the root user using up all of
    memory for trace buffers. If the root user allocates too many entries,
    the OOM killer might start killing off tasks.

    This patch adds an algorithm to check the following condition:

    pages_requested > (freeable_memory + current_trace_buffer_pages) / 4

    If the above is met then the allocation fails. The above prevents more
    than 1/4th of freeable memory from being used by trace buffers.

    To determine the freeable_memory, I made determine_dirtyable_memory in
    mm/page-writeback.c global.
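
    A minimal sketch of that check (determine_dirtyable_memory() is named by
    the patch; the other identifiers are assumptions for illustration):

        static int trace_buffer_grow_allowed(unsigned long pages_requested,
                                             unsigned long current_buffer_pages)
        {
                /* determine_dirtyable_memory() was made global by this patch */
                unsigned long freeable = determine_dirtyable_memory();

                /* refuse anything beyond 1/4 of freeable memory */
                if (pages_requested > (freeable + current_buffer_pages) / 4)
                        return -ENOMEM;
                return 0;
        }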

    Special thanks goes to Peter Zijlstra for suggesting the above calculation.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Steven Rostedt
     

30 Apr, 2008

6 commits

  • Fuse will use temporary buffers to write back dirty data from memory mappings
    (normal writes are done synchronously). This is needed, because there cannot
    be any guarantee about the time in which a write will complete.

    By using temporary buffers, from the MM's point of view the page is written
    back immediately. If the writeout was due to memory pressure, this
    effectively migrates data from a full zone to a less full zone.

    This patch adds a new counter (NR_WRITEBACK_TEMP) for the number of pages used
    as temporary buffers.

    [Lee.Schermerhorn@hp.com: add vmstat_text for NR_WRITEBACK_TEMP]
    Signed-off-by: Miklos Szeredi
    Cc: Christoph Lameter
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Fuse needs this for writable mmap support.

    Signed-off-by: Miklos Szeredi
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add a new BDI capability flag: BDI_CAP_NO_ACCT_WB. If this flag is
    set, then don't update the per-bdi writeback stats from
    test_set_page_writeback() and test_clear_page_writeback().
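
    A sketch of the resulting capability test as a static inline (the flag
    name is from the patch; the helper's exact shape is an assumption):

        static inline bool bdi_cap_account_writeback(struct backing_dev_info *bdi)
        {
                return !(bdi->capabilities & BDI_CAP_NO_ACCT_WB);
        }

    test_set_page_writeback() and test_clear_page_writeback() would then skip
    the per-bdi stat updates whenever this returns false.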

    Misc cleanups:

    - convert bdi_cap_writeback_dirty() and friends to static inline functions
    - create a flag that includes all three dirty/writeback related flags,
      since almost all users will want to have them together

    Signed-off-by: Miklos Szeredi
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add "max_ratio" to /sys/class/bdi. This indicates the maximum percentage of
    the global dirty threshold allocated to this bdi.

    [mszeredi@suse.cz]

    - fix parsing in max_ratio_store().
    - export bdi_set_max_ratio() to modules
    - limit bdi_dirty with bdi->max_ratio
    - document new sysfs attribute

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Under normal circumstances each device is given a part of the total write-back
    cache that relates to its current avg writeout speed in relation to the other
    devices.

    min_ratio - allows one to assign a minimum portion of the write-back cache to
    a particular device. This is useful in situations where you might want to
    provide a minimum QoS. (One request for this feature came from flash based
    storage people who wanted to avoid writing out at all costs - they of course
    needed some pdflush hacks as well)

    max_ratio - allows one to assign a maximum portion of the dirty limit to a
    particular device. This is useful in situations where you want to avoid one
    device taking all or most of the write-back cache. E.g. an NFS mount that is
    prone to get stuck, or a FUSE mount which you don't trust to play fair.

    Add "min_ratio" to /sys/class/bdi. This indicates the minimum percentage of
    the global dirty threshold allocated to this bdi.

    [mszeredi@suse.cz]

    - fix parsing in min_ratio_store()
    - document new sysfs attribute
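
    Together with max_ratio above, the effect can be pictured as a clamp on
    the per-device share of the global dirty limit (an illustrative sketch,
    not the kernel's exact code):

        static unsigned long clamp_bdi_dirty(unsigned long bdi_dirty,
                                             unsigned long dirty_total,
                                             unsigned int min_ratio,
                                             unsigned int max_ratio)
        {
                unsigned long lo = dirty_total * min_ratio / 100;
                unsigned long hi = dirty_total * max_ratio / 100;

                return max(lo, min(bdi_dirty, hi));
        }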

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Provide a place in sysfs (/sys/class/bdi) for the backing_dev_info object.
    This allows us to see and set the various BDI specific variables.

    In particular this properly exposes the read-ahead window for all relevant
    users and /sys/block/<device>/queue/read_ahead_kb should be deprecated.

    With patient help from Kay Sievers and Greg KH

    [mszeredi@suse.cz]

    - split off NFS and FUSE changes into separate patches
    - document new sysfs attributes under Documentation/ABI
    - do bdi_class_init as a core_initcall, otherwise the "default" BDI
    won't be initialized
    - remove bdi_init_fmt macro, it's not used very much

    [akpm@linux-foundation.org: fix ia64 warning]
    Signed-off-by: Peter Zijlstra
    Cc: Kay Sievers
    Acked-by: Greg KH
    Cc: Trond Myklebust
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

06 Feb, 2008

4 commits

  • After dirtying a 100M file, the normal behavior is to start writeback
    for all data after a 30s delay. But sometimes the following happens
    instead:

    - after 30s: ~4M
    - after 5s: ~4M
    - after 5s: all remaining 92M

    Some analysis shows that the internal I/O dispatch queues go like this:

        s_io       s_more_io
        -------------------------
    1)  100M,1K    0
    2)  1K         96M
    3)  0          96M

    1) initial state with a 100M file and a 1K file
    2) 4M written, nr_to_write 0, no more writes (BUG)

    nr_to_write > 0 in (3) fools the upper layer into thinking that data have
    all been written out. The big dirty file is actually still sitting in
    s_more_io. We cannot simply splice s_more_io back to s_io as soon as s_io
    becomes empty, and let the loop in generic_sync_sb_inodes() continue: this
    may starve newly expired inodes in s_dirty. It is also not an option to
    draw inodes from both s_more_io and s_dirty, and let the loop go on: this
    might lead to livelocks, and might also starve other superblocks in sync
    time (well, kupdate may still starve some superblocks; that's another bug).

    We have to return when a full scan of s_io completes. So nr_to_write > 0
    does not necessarily mean that "all data are written". This patch
    introduces a flag, writeback_control.more_io, to indicate that more io
    should be done. With it the big dirty file no longer has to wait for the
    next kupdate invocation 5s later.
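
    A hedged sketch of how a periodic-writeback caller can consume the new
    flag (the loop shape is simplified from the kupdate-style paths, not
    quoted from them):

        for (;;) {
                struct writeback_control wbc = {
                        .nr_to_write = MAX_WRITEBACK_PAGES,
                };

                writeback_inodes(&wbc);
                /* nr_to_write > 0 alone no longer means "all written";
                 * only stop when no more io was reported pending */
                if (wbc.nr_to_write > 0 && !wbc.more_io)
                        break;
        }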

    In sync_sb_inodes() we only set more_io on super_blocks we actually
    visited. This avoids the interaction between two pdflush daemons.

    Also in __sync_single_inode() we don't blindly keep requeuing the io if the
    filesystem cannot progress. Failing to do so may lead to 100% iowait.

    Tested-by: Mike Snitzer
    Signed-off-by: Fengguang Wu
    Cc: Michael Rubin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • fastcall is always defined to be empty, remove it

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Harvey Harrison
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     
  • Add vm.highmem_is_dirtyable toggle

    A 32-bit machine with HIGHMEM64 enabled running DCC has an mmapped file of
    approximately 2GB size which contains a hash format that is written
    randomly by the dbclean process. On 2.6.16 this process took a few
    minutes. With lowmem-only accounting of dirty ratios, it takes about 12
    hours of 100% disk I/O, all random writes.

    Include a toggle in /proc/sys/vm/highmem_is_dirtyable which can be set to 1 to
    add the highmem back to the total available memory count.
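
    A minimal sketch of the accounting decision (the sysctl's purpose comes
    from the patch; both helpers here are assumed names):

        static unsigned long dirtyable_memory(void)
        {
                unsigned long x = lowmem_dirtyable_pages();     /* assumed */

                if (vm_highmem_is_dirtyable)    /* the new toggle, default 0 */
                        x += highmem_dirtyable_pages();         /* assumed */
                return x;
        }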

    [akpm@linux-foundation.org: Fix the CONFIG_DETECT_SOFTLOCKUP=y build]
    Signed-off-by: Bron Gondwana
    Cc: Ethan Solomita
    Cc: Peter Zijlstra
    Cc: WU Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bron Gondwana
     
  • task_dirty_limit() can become static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

15 Jan, 2008

1 commit

  • This reverts commit 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, as
    requested by Fengguang Wu. It's not quite fully baked yet, and while
    there are patches around to fix the problems it caused, they should get
    more testing. Says Fengguang: "I'll resend them both for -mm later on,
    in a more complete patchset".

    See

    http://bugzilla.kernel.org/show_bug.cgi?id=9738

    for some of this discussion.

    Requested-by: Fengguang Wu
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

16 Nov, 2007

1 commit

  • This code harks back to the days when we didn't count dirty mapped
    pages, which led us to try to balance the number of dirty unmapped pages
    by how much unmapped memory there was in the system.

    That makes no sense any more, since now the dirty counts include the
    mapped pages. Not to mention that the math doesn't work with HIGHMEM
    machines anyway, and causes the unmapped_ratio to potentially turn
    negative (which we do catch thanks to clamping it at a minimum value,
    but I mention that as an indication of how broken the code is).

    The code also was written at a time when the default dirty ratio was
    much larger, and the unmapped_ratio logic effectively capped that large
    dirty ratio a bit. Again, we've since lowered the dirty ratio rather
    aggressively, further lessening the point of that code.

    Acked-by: Peter Zijlstra
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

15 Nov, 2007

1 commit

  • We allow violation of bdi limits if there is a lot of room on the system.
    Once we hit half the total limit we start enforcing bdi limits and bdi
    ramp-up should happen. Doing it this way avoids many small writeouts on an
    otherwise idle system and should also speed up the ramp-up.
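
    In balance_dirty_pages() terms the idea reduces to a test like this
    sketch (variable names follow the usual locals; exact placement assumed):

        static bool enforce_bdi_limits(unsigned long nr_reclaimable,
                                       unsigned long nr_writeback,
                                       unsigned long background_thresh,
                                       unsigned long dirty_thresh)
        {
                /* below half the global limit, bdi limits may be exceeded */
                return nr_reclaimable + nr_writeback >=
                       (background_thresh + dirty_thresh) / 2;
        }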

    Signed-off-by: Peter Zijlstra
    Reviewed-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

17 Oct, 2007

10 commits

  • We don't want to introduce pointless delays in throttle_vm_writeout() when
    the writeback limits are not yet exceeded, do we?

    Cc: Nick Piggin
    Cc: OGAWA Hirofumi
    Cc: Kumar Gala
    Cc: Pete Zaitcev
    Cc: Greg KH
    Reviewed-by: Rik van Riel
    Signed-off-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • I_LOCK was used for several unrelated purposes, which caused deadlock
    situations in certain filesystems as a side effect. One of the purposes
    now uses the new I_SYNC bit.

    Also document the various bits and change their order from historical to
    logical.

    [bunk@stusta.de: make fs/inode.c:wake_up_inode() static]
    Signed-off-by: Joern Engel
    Cc: Dave Kleikamp
    Cc: David Chinner
    Cc: Anton Altaparmakov
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joern Engel
     
  • After dirtying a 100M file, the normal behavior is to start writeback
    for all data after a 30s delay. But sometimes the following happens instead:

    - after 30s: ~4M
    - after 5s: ~4M
    - after 5s: all remaining 92M

    Some analysis shows that the internal I/O dispatch queues go like this:

        s_io       s_more_io
        -------------------------
    1)  100M,1K    0
    2)  1K         96M
    3)  0          96M

    1) initial state with a 100M file and a 1K file
    2) 4M written, nr_to_write 0, no more writes (BUG)

    nr_to_write > 0 in (3) fools the upper layer into thinking that data have all
    been written out. The big dirty file is actually still sitting in s_more_io. We
    cannot simply splice s_more_io back to s_io as soon as s_io becomes empty, and
    let the loop in generic_sync_sb_inodes() continue: this may starve newly
    expired inodes in s_dirty. It is also not an option to draw inodes from both
    s_more_io and s_dirty, and let the loop go on: this might lead to livelocks,
    and might also starve other superblocks in sync time (well, kupdate may still
    starve some superblocks; that's another bug).

    We have to return when a full scan of s_io completes. So nr_to_write > 0 does
    not necessarily mean that "all data are written". This patch introduces a
    flag writeback_control.more_io to indicate this situation. With it the big
    dirty file no longer has to wait for the next kupdate invocation 5s later.

    Cc: David Chinner
    Cc: Ken Chen
    Signed-off-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • This is a writeback-internal marker but we're propagating it all the way back
    to userspace!

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Based on ideas of Andrew:
    http://marc.info/?l=linux-kernel&m=102912915020543&w=2

    Scale the bdi dirty limit inversely with the task's dirty rate.
    This makes heavy writers have a lower dirty limit than the occasional writer.

    Andrea proposed something similar:
    http://lwn.net/Articles/152277/

    The main disadvantage of his patch is that he uses an unrelated quantity to
    measure time, which leaves him with a workload-dependent tunable. Other than
    that the two approaches appear quite similar.
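
    A hedged sketch of the scaling: a task's threshold is lowered in
    proportion to its share of recent dirtying, so heavy writers hit their
    limit sooner (the 1/8 bound and the names are assumptions for
    illustration):

        static unsigned long task_scaled_dirty_limit(unsigned long dirty,
                                                     unsigned long numerator,
                                                     unsigned long denominator)
        {
                /* shave off at most dirty/8, scaled by the task's share */
                unsigned long penalty = dirty / 8 * numerator / denominator;

                return dirty - penalty;
        }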

    [akpm@linux-foundation.org: fix warning]
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Scale writeback cache per backing device, proportional to its writeout speed.

    By decoupling the BDI dirty thresholds a number of problems we currently have
    will go away, namely:

    - mutual interference starvation (for any number of BDIs);
    - deadlocks with stacked BDIs (loop, FUSE and local NFS mounts).

    It might be that all dirty pages are for a single BDI while other BDIs are
    idling. By giving each BDI a 'fair' share of the dirty limit, each one can have
    dirty pages outstanding and make progress.

    A global threshold also creates a deadlock for stacked BDIs; when A writes to
    B, and A generates enough dirty pages to get throttled, B will never start
    writeback until the dirty pages go away. Again, by giving each BDI its own
    'independent' dirty limit, this problem is avoided.

    So the problem is to determine how to distribute the total dirty limit
    across the BDIs fairly and efficiently. A BDI that has a large dirty limit
    but does not have any dirty pages outstanding is a waste.

    What is done is to keep a floating proportion between the BDIs based on
    writeback completions. This way faster/more active devices get a larger
    share than slower/idle devices.
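
    The distribution can be sketched as each BDI receiving the fraction of
    the global limit matching its fraction of recent writeback completions
    (illustrative; the real code maintains a damped floating proportion):

        static unsigned long bdi_share_of_limit(unsigned long dirty_thresh,
                                                unsigned long bdi_completions,
                                                unsigned long total_completions)
        {
                if (!total_completions)
                        return dirty_thresh;    /* no history yet */
                return dirty_thresh * bdi_completions / total_completions;
        }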

    [akpm@linux-foundation.org: fix warnings]
    [hugh@veritas.com: Fix occasional hang when a task couldn't get out of balance_dirty_pages]
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Count per BDI writeback pages.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Count per BDI reclaimable pages; nr_reclaimable = nr_dirty + nr_unstable.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Here's a cut at fixing up uses of the online node map in generic code.

    mm/shmem.c:shmem_parse_mpol()

    Ensure nodelist is subset of nodes with memory.
    Use node_states[N_HIGH_MEMORY] as default for missing
    nodelist for interleave policy.

    mm/shmem.c:shmem_fill_super()

    initialize policy_nodes to node_states[N_HIGH_MEMORY]

    mm/page-writeback.c:highmem_dirtyable_memory()

    sum over nodes with memory (see the sketch after this list)

    mm/page_alloc.c:zlc_setup()

    allowednodes - use nodes with memory.

    mm/page_alloc.c:default_zonelist_order()

    average over nodes with memory.

    mm/page_alloc.c:find_next_best_node()

    skip nodes w/o memory.
    N_HIGH_MEMORY state mask may not be initialized at this time,
    unless we want to depend on early_calculate_totalpages() [see
    below]. Will ZONE_MOVABLE ever be configurable?

    mm/page_alloc.c:find_zone_movable_pfns_for_nodes()

    spread kernelcore over nodes with memory.

    This required calling early_calculate_totalpages()
    unconditionally, and populating N_HIGH_MEMORY node
    state therein from nodes in the early_node_map[].
    If we can depend on this, we can eliminate the
    population of N_HIGH_MEMORY mask from __build_all_zonelists()
    and use the N_HIGH_MEMORY mask in find_next_best_node().

    mm/mempolicy.c:mpol_check_policy()

    Ensure nodes specified for policy are subset of
    nodes with memory.
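
    As an example, the highmem_dirtyable_memory() change boils down to
    iterating only nodes in N_HIGH_MEMORY (a sketch close to, but not
    guaranteed to match, the patched code):

        static unsigned long highmem_dirtyable_memory(void)
        {
                unsigned long x = 0;
                int node;

                for_each_node_state(node, N_HIGH_MEMORY) {
                        struct zone *z =
                                &NODE_DATA(node)->node_zones[ZONE_HIGHMEM];

                        x += zone_page_state(z, NR_FREE_PAGES)
                           + zone_page_state(z, NR_INACTIVE)
                           + zone_page_state(z, NR_ACTIVE);
                }
                return x;
        }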

    [akpm@linux-foundation.org: fix warnings]
    Signed-off-by: Lee Schermerhorn
    Acked-by: Christoph Lameter
    Cc: Shaohua Li
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • Probing pages and radix_tree_tagged are lockless operations with the lockless
    radix-tree. Convert these users to RCU locking rather than using tree_lock.
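
    The converted probe pattern, sketched:

        struct page *page;

        rcu_read_lock();
        page = radix_tree_lookup(&mapping->page_tree, index);
        /* probe only: no reference is taken, the result is advisory */
        rcu_read_unlock();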

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

09 Oct, 2007

1 commit

  • All the current page_mkwrite() implementations also set the page dirty,
    which results in set_page_dirty_balance() _not_ calling balance, because
    the page is already found dirty.

    This allows us to dirty a _lot_ of pages without ever hitting
    balance_dirty_pages(). Not good (tm).

    Force a balance call if ->page_mkwrite() was successful.
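
    A sketch of the fix: thread a page_mkwrite flag down to the balance
    decision, so a page that was already dirty still triggers balancing
    (close to the described change; the exact signature is assumed):

        void set_page_dirty_balance(struct page *page, int page_mkwrite)
        {
                if (set_page_dirty(page) || page_mkwrite) {
                        struct address_space *mapping = page_mapping(page);

                        if (mapping)
                                balance_dirty_pages_ratelimited(mapping);
                }
        }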

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

20 Jul, 2007

3 commits

  • page-writeback accounting is presently performed in the page-flags macros.
    This is inconsistent and a bit ugly and makes it awkward to implement
    per-backing_dev under-writeback page accounting.

    So move this accounting down to the callsite(s).

    Acked-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Share the same page flag bit for PG_readahead and PG_reclaim.

    One is used only on file reads, another is only for emergency writes. One
    is used mostly for fresh/young pages, another is for old pages.

    Combinations of possible interactions are:

    a) clear PG_reclaim => implicit clear of PG_readahead
    it will delay an asynchronous readahead into a synchronous one
    it actually does _good_ for readahead:
    the pages will be reclaimed soon, it's readahead thrashing!
    in this case, synchronous readahead makes more sense.

    b) clear PG_readahead => implicit clear of PG_reclaim
    one(and only one) page will not be reclaimed in time
    it can be avoided by checking PageWriteback(page) in readahead first

    c) set PG_reclaim => implicit set of PG_readahead
    will confuse readahead and make it restart the size rampup process
    it's a trivial problem, and can mostly be avoided by checking
    PageWriteback(page) first in readahead

    d) set PG_readahead => implicit set of PG_reclaim
    PG_readahead will never be set on already cached pages.
    PG_reclaim will always be cleared on dirtying a page.
    so not a problem.

    In summary,
    a) we get better behavior
    b,d) possible interactions can be avoided
    c) racy condition exists that might affect readahead, but the chance
    is _really_ low, and the hurt on readahead is trivial.

    Compound pages also use PG_reclaim, but for now they do not interact with
    reclaim/readahead code.
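
    The sharing itself is just an alias of the flag bit (sketch of the
    page-flags definition):

        #define PG_readahead    PG_reclaim      /* one bit, two roles */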

    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • Fix msync data loss and (less importantly) dirty page accounting
    inaccuracies due to the race remaining in clear_page_dirty_for_io().

    The deleted comment explains what the race was, and the added comments
    explain how it is fixed.

    Signed-off-by: Nick Piggin
    Acked-by: Linus Torvalds
    Cc: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

18 Jul, 2007

1 commit

  • It is a bug to set a page dirty if it is not uptodate unless it has
    buffers. If the page has buffers, then the page may be dirty (some buffers
    dirty) but not uptodate (some buffers not uptodate). The exception to this
    rule is if the set_page_dirty caller is racing with truncate or invalidate.

    A buffer cannot be set dirty if it is not uptodate.

    If either of these situations occurs, it indicates there could be a data
    loss problem. Some of these warnings could be harmless ones where the
    page or buffer is set uptodate immediately after it is dirtied; however,
    we should fix those up, and enforce this ordering.

    Bring the order of operations for truncate into line with those of
    invalidate. This will prevent a page from being able to go !uptodate while
    we're holding the tree_lock, which is probably a good thing anyway.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

11 May, 2007

1 commit

  • Clean up massive code duplication between mpage_writepages() and
    generic_writepages().

    The new generic function, write_cache_pages(), takes a function pointer
    argument, which will be called for each page to be written.

    Maybe cifs_writepages() too can use this infrastructure, but I'm not
    touching that with a ten-foot pole.

    The upcoming page writeback support in fuse will also want this.
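
    With the helper in place, generic_writepages() can collapse to a thin
    wrapper along these lines (a sketch; __writepage is the assumed per-page
    callback):

        int generic_writepages(struct address_space *mapping,
                               struct writeback_control *wbc)
        {
                /* deal with chardevs and other special files */
                if (!mapping->a_ops->writepage)
                        return 0;

                return write_cache_pages(mapping, wbc, __writepage, mapping);
        }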

    Signed-off-by: Miklos Szeredi
    Acked-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

08 May, 2007

1 commit

  • We can use the global ZVC counters to establish the exact size of the LRU
    and the free pages. This allows a more accurate determination of the dirty
    ratio.

    This patch will fix the broken ratio calculations if large amounts of
    memory are allocated to huge pages or other consumers that do not put the
    pages on the LRU.
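
    A sketch of deriving the dirtyable total from the ZVC counters (the
    highmem adjustment is omitted here; see also the notes below):

        unsigned long determine_dirtyable_memory(void)
        {
                return global_page_state(NR_FREE_PAGES)
                     + global_page_state(NR_INACTIVE)
                     + global_page_state(NR_ACTIVE);
        }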

    Notes:
    - I did not add NR_SLAB_RECLAIMABLE to the calculation of the
    dirtyable pages. Those may be reclaimable but they are at this
    point not dirtyable. If NR_SLAB_RECLAIMABLE would be considered
    then a huge number of reclaimable pages would stop writeback
    from occurring.

    - This patch used to be in mm as the last one in a series of patches.
    It was removed when Linus updated the treatment of highmem because
    there was a conflict. I updated the patch to follow Linus' approach.
    This patch is needed to fulfill the claims made in the beginning of the
    patchset that is now in Linus' tree.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
