14 Jan, 2011

3 commits

  • I think determine_dirtyable_memory() is a rather costly function since it
    need many atomic reads for gathering zone/global page state. But when we
    use vm_dirty_bytes && dirty_background_bytes, we don't need that costly
    calculation.

    This patch eliminates such unnecessary overhead.

    NOTE : newly added if condition might add overhead in normal path.
    But it should be _really_ small because anyway we need the
    access both vm_dirty_bytes and dirty_background_bytes so it is
    likely to hit the cache.

    [akpm@linux-foundation.org: fix used-uninitialised warning]
    Signed-off-by: Minchan Kim
    Cc: Wu Fengguang
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • __set_page_dirty_no_writeback() should return true if it actually
    transitioned the page from a clean to dirty state although it seems nobody
    uses its return value at present.

    Signed-off-by: Bob Liu
    Acked-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Liu
     
  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
    Documentation/trace/events.txt: Remove obsolete sched_signal_send.
    writeback: fix global_dirty_limits comment runtime -> real-time
    ppc: fix comment typo singal -> signal
    drivers: fix comment typo diable -> disable.
    m68k: fix comment typo diable -> disable.
    wireless: comment typo fix diable -> disable.
    media: comment typo fix diable -> disable.
    remove doc for obsolete dynamic-printk kernel-parameter
    remove extraneous 'is' from Documentation/iostats.txt
    Fix spelling milisec -> ms in snd_ps3 module parameter description
    Fix spelling mistakes in comments
    Revert conflicting V4L changes
    i7core_edac: fix typos in comments
    mm/rmap.c: fix comment
    sound, ca0106: Fix assignment to 'channel'.
    hrtimer: fix a typo in comment
    init/Kconfig: fix typo
    anon_inodes: fix wrong function name in comment
    fix comment typos concerning "consistent"
    poll: fix a typo in comment
    ...

    Fix up trivial conflicts in:
    - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
    - fs/ext4/ext4.h

    Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.

    Linus Torvalds
     

04 Jan, 2011

1 commit


23 Dec, 2010

1 commit

  • Using TASK_INTERRUPTIBLE in balance_dirty_pages() seems wrong. If it's
    going to do that then it must break out if signal_pending(), otherwise
    it's pretty much guaranteed to degenerate into a busywait loop. Plus we
    *do* want these processes to appear in D state and to contribute to load
    average.

    So it should be TASK_UNINTERRUPTIBLE. -- Andrew Morton

    Signed-off-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     

27 Oct, 2010

3 commits

  • The dirty_ratio was silently limited in global_dirty_limits() to >= 5%.
    This is not a user expected behavior. And it's inconsistent with
    calc_period_shift(), which uses the plain vm_dirty_ratio value.

    Let's remove the internal bound.

    At the same time, fix balance_dirty_pages() to work with the
    dirty_thresh=0 case. This allows applications to proceed when
    dirty+writeback pages are all cleaned.

    And ">" fits with the name "exceeded" better than ">=" does. Neil thinks
    it is an aesthetic improvement as well as a functional one :)

    Signed-off-by: Wu Fengguang
    Cc: Jan Kara
    Proposed-by: Con Kolivas
    Acked-by: Peter Zijlstra
    Reviewed-by: Rik van Riel
    Reviewed-by: Neil Brown
    Reviewed-by: KOSAKI Motohiro
    Cc: Michael Rubin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • To help developers and applications gain visibility into writeback
    behaviour adding two entries to vm_stat_items and /proc/vmstat. This will
    allow us to track the "written" and "dirtied" counts.

    # grep nr_dirtied /proc/vmstat
    nr_dirtied 3747
    # grep nr_written /proc/vmstat
    nr_written 3618

    Signed-off-by: Michael Rubin
    Reviewed-by: Wu Fengguang
    Cc: Dave Chinner
    Cc: Jens Axboe
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Rubin
     
  • To help developers and applications gain visibility into writeback
    behaviour this patch adds two counters to /proc/vmstat.

    # grep nr_dirtied /proc/vmstat
    nr_dirtied 3747
    # grep nr_written /proc/vmstat
    nr_written 3618

    These entries allow user apps to understand writeback behaviour over time
    and learn how it is impacting their performance. Currently there is no
    way to inspect dirty and writeback speed over time. It's not possible for
    nr_dirty/nr_writeback.

    These entries are necessary to give visibility into writeback behaviour.
    We have /proc/diskstats which lets us understand the io in the block
    layer. We have blktrace for more in depth understanding. We have
    e2fsprogs and debugsfs to give insight into the file systems behaviour,
    but we don't offer our users the ability understand what writeback is
    doing. There is no way to know how active it is over the whole system, if
    it's falling behind or to quantify it's efforts. With these values
    exported users can easily see how much data applications are sending
    through writeback and also at what rates writeback is processing this
    data. Comparing the rates of change between the two allow developers to
    see when writeback is not able to keep up with incoming traffic and the
    rate of dirty memory being sent to the IO back end. This allows folks to
    understand their io workloads and track kernel issues. Non kernel
    engineers at Google often use these counters to solve puzzling performance
    problems.

    Patch #4 adds a pernode vmstat file with nr_dirtied and nr_written

    Patch #5 add writeback thresholds to /proc/vmstat

    Currently these values are in debugfs. But they should be promoted to
    /proc since they are useful for developers who are writing databases
    and file servers and are not debugging the kernel.

    The output is as below:

    # grep threshold /proc/vmstat
    nr_pages_dirty_threshold 409111
    nr_pages_dirty_background_threshold 818223

    This patch:

    This allows code outside of the mm core to safely manipulate page
    writeback state and not worry about the other accounting. Not using these
    routines means that some code will lose track of the accounting and we get
    bugs.

    Modify nilfs2 to use interface.

    Signed-off-by: Michael Rubin
    Reviewed-by: KOSAKI Motohiro
    Reviewed-by: Wu Fengguang
    Cc: KONISHI Ryusuke
    Cc: Jiro SEKIBA
    Cc: Dave Chinner
    Cc: Jens Axboe
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Rubin
     

29 Aug, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    ceph: fix get_ticket_handler() error handling
    ceph: don't BUG on ENOMEM during mds reconnect
    ceph: ceph_mdsc_build_path() returns an ERR_PTR
    ceph: Fix warnings
    ceph: ceph_get_inode() returns an ERR_PTR
    ceph: initialize fields on new dentry_infos
    ceph: maintain i_head_snapc when any caps are dirty, not just for data
    ceph: fix osd request lru adjustment when sending request
    ceph: don't improperly set dir complete when holding EXCL cap
    mm: exporting account_page_dirty
    ceph: direct requests in snapped namespace based on nonsnap parent
    ceph: queue cap snap writeback for realm children on snap update
    ceph: include dirty xattrs state in snapped caps
    ceph: fix xattr cap writeback
    ceph: fix multiple mds session shutdown

    Linus Torvalds
     

24 Aug, 2010

1 commit

  • I noticed XFS writeback in 2.6.36-rc1 was much slower than it should have
    been. Enabling writeback tracing showed:

    flush-253:16-8516 [007] 1342952.351608: wbc_writepage: bdi 253:16: towrt=1024 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516 [007] 1342952.351654: wbc_writepage: bdi 253:16: towrt=1023 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516 [000] 1342952.369520: wbc_writepage: bdi 253:16: towrt=0 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516 [000] 1342952.369542: wbc_writepage: bdi 253:16: towrt=-1 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516 [000] 1342952.369549: wbc_writepage: bdi 253:16: towrt=-2 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0

    Writeback is not terminating in background writeback if ->writepage is
    returning with wbc->nr_to_write == 0, resulting in sub-optimal single page
    writeback on XFS.

    Fix the write_cache_pages loop to terminate correctly when this situation
    occurs and so prevent this sub-optimal background writeback pattern. This
    improves sustained sequential buffered write performance from around
    250MB/s to 750MB/s for a 100GB file on an XFS filesystem on my 8p test VM.

    Cc:
    Signed-off-by: Dave Chinner
    Reviewed-by: Wu Fengguang
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

23 Aug, 2010

1 commit

  • This allows code outside of the mm core to safely manipulate page state
    and not worry about the other accounting. Not using these routines means
    that some code will lose track of the accounting and we get bugs. This
    has happened once already.

    Signed-off-by: Michael Rubin
    Signed-off-by: Sage Weil

    Michael Rubin
     

21 Aug, 2010

1 commit


15 Aug, 2010

1 commit


12 Aug, 2010

4 commits

  • Document global_dirty_limits() and bdi_dirty_limit().

    Signed-off-by: Wu Fengguang
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Jens Axboe
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Split get_dirty_limits() into global_dirty_limits()+bdi_dirty_limit(), so
    that the latter can be avoided when under global dirty background
    threshold (which is the normal state for most systems).

    Signed-off-by: Wu Fengguang
    Cc: Peter Zijlstra
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Reducing the number of times balance_dirty_pages calls global_page_state
    reduces the cache references and so improves write performance on a
    variety of workloads.

    'perf stats' of simple fio write tests shows the reduction in cache
    access. Where the test is fio 'write,mmap,600Mb,pre_read' on AMD AthlonX2
    with 3Gb memory (dirty_threshold approx 600 Mb) running each test 10
    times, dropping the fasted & slowest values then taking the average &
    standard deviation

    average (s.d.) in millions (10^6)
    2.6.31-rc8 648.6 (14.6)
    +patch 620.1 (16.5)

    Achieving this reduction is by dropping clip_bdi_dirty_limit as it rereads
    the counters to apply the dirty_threshold and moving this check up into
    balance_dirty_pages where it has already read the counters.

    Also by rearrange the for loop to only contain one copy of the limit tests
    allows the pdflush test after the loop to use the local copies of the
    counters rather than rereading them.

    In the common case with no throttling it now calls global_page_state 5
    fewer times and bdi_stat 2 fewer.

    Fengguang:

    This patch slightly changes behavior by replacing clip_bdi_dirty_limit()
    with the explicit check (nr_reclaimable + nr_writeback >= dirty_thresh) to
    avoid exceeding the dirty limit. Since the bdi dirty limit is mostly
    accurate we don't need to do routinely clip. A simple dirty limit check
    would be enough.

    The check is necessary because, in principle we should throttle everything
    calling balance_dirty_pages() when we're over the total limit, as said by
    Peter.

    We now set and clear dirty_exceeded not only based on bdi dirty limits,
    but also on the global dirty limit. The global limit check is added in
    place of clip_bdi_dirty_limit() for safety and not intended as a behavior
    change. The bdi limits should be tight enough to keep all dirty pages
    under the global limit at most time; occasional small exceeding should be
    OK though. The change makes the logic more obvious: the global limit is
    the ultimate goal and shall be always imposed.

    We may now start background writeback work based on outdated conditions.
    That's safe because the bdi flush thread will (and have to) double check
    the states. It reduces overall overheads because the test based on old
    states still have good chance to be right.

    [akpm@linux-foundation.org] fix uninitialized dirty_exceeded
    Signed-off-by: Richard Kennedy
    Signed-off-by: Wu Fengguang
    Cc: Jan Kara
    Acked-by: Peter Zijlstra
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Fix a fatal kernel-doc error due to a #define coming between a function's
    kernel-doc notation and the function signature. (kernel-doc cannot handle
    this)

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

11 Aug, 2010

1 commit

  • * 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block: (149 commits)
    block: make sure that REQ_* types are seen even with CONFIG_BLOCK=n
    xen-blkfront: fix missing out label
    blkdev: fix blkdev_issue_zeroout return value
    block: update request stacking methods to support discards
    block: fix missing export of blk_types.h
    writeback: fix bad _bh spinlock nesting
    drbd: revert "delay probes", feature is being re-implemented differently
    drbd: Initialize all members of sync_conf to their defaults [Bugz 315]
    drbd: Disable delay probes for the upcomming release
    writeback: cleanup bdi_register
    writeback: add new tracepoints
    writeback: remove unnecessary init_timer call
    writeback: optimize periodic bdi thread wakeups
    writeback: prevent unnecessary bdi threads wakeups
    writeback: move bdi threads exiting logic to the forker thread
    writeback: restructure bdi forker loop a little
    writeback: move last_active to bdi
    writeback: do not remove bdi from bdi_list
    writeback: simplify bdi code a little
    writeback: do not lose wake-ups in bdi threads
    ...

    Fixed up pretty trivial conflicts in drivers/block/virtio_blk.c and
    drivers/scsi/scsi_error.c as per Jens.

    Linus Torvalds
     

10 Aug, 2010

1 commit

  • We try to avoid livelocks of writeback when some steadily creates dirty
    pages in a mapping we are writing out. For memory-cleaning writeback,
    using nr_to_write works reasonably well but we cannot really use it for
    data integrity writeback. This patch tries to solve the problem.

    The idea is simple: Tag all pages that should be written back with a
    special tag (TOWRITE) in the radix tree. This can be done rather quickly
    and thus livelocks should not happen in practice. Then we start doing the
    hard work of locking pages and sending them to disk only for those pages
    that have TOWRITE tag set.

    Note: Adding new radix tree tag grows radix tree node from 288 to 296
    bytes for 32-bit archs and from 552 to 560 bytes for 64-bit archs.
    However, the number of slab/slub items per page remains the same (13 and 7
    respectively).

    Signed-off-by: Jan Kara
    Cc: Dave Chinner
    Cc: Nick Piggin
    Cc: Chris Mason
    Cc: Theodore Ts'o
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

08 Aug, 2010

2 commits

  • Add a trace event to the ->writepage loop in write_cache_pages to give
    visibility into how the ->writepage call is changing variables within the
    writeback control structure. Of most interest is how wbc->nr_to_write changes
    from call to call, especially with filesystems that write multiple pages
    in ->writepage.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Dave Chinner
     
  • Tracing high level background writeback events is good, but it doesn't
    give the entire picture. Add visibility into write throttling to catch IO
    dispatched by foreground throttling of processing dirtying lots of pages.

    Signed-off-by: Dave Chinner
    Signed-off-by: Jens Axboe

    Dave Chinner
     

06 Jul, 2010

1 commit


11 Jun, 2010

1 commit


09 Jun, 2010

2 commits

  • sync can currently take a really long time if a concurrent writer is
    extending a file. The problem is that the dirty pages on the address
    space grow in the same direction as write_cache_pages scans, so if
    the writer keeps ahead of writeback, the writeback will not
    terminate until the writer stops adding dirty pages.

    For a data integrity sync, we only need to write the pages dirty at
    the time we start the writeback, so we can stop scanning once we get
    to the page that was at the end of the file at the time the scan
    started.

    This will prevent operations like copying a large file preventing
    sync from completing as it will not write back pages that were
    dirtied after the sync was started. This does not impact the
    existing integrity guarantees, as any dirty page (old or new)
    within the EOF range at the start of the scan will still be
    captured.

    This patch will not prevent sync from blocking on large writes into
    holes. That requires more complex intervention while this patch only
    addresses the common append-case of this sync holdoff.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Dave Chinner
     
  • If a filesystem writes more than one page in ->writepage, write_cache_pages
    fails to notice this and continues to attempt writeback when wbc->nr_to_write
    has gone negative - this trace was captured from XFS:

    wbc_writeback_start: towrt=1024
    wbc_writepage: towrt=1024
    wbc_writepage: towrt=0
    wbc_writepage: towrt=-1
    wbc_writepage: towrt=-5
    wbc_writepage: towrt=-21
    wbc_writepage: towrt=-85

    This has adverse effects on filesystem writeback behaviour. write_cache_pages()
    needs to terminate after a certain number of pages are written, not after a
    certain number of calls to ->writepage are made. This is a regression
    introduced by 17bc6c30cf6bfffd816bdc53682dd46fc34a2cf4 ("vfs: Add
    no_nrwrite_index_update writeback control flag"), but cannot be reverted
    directly due to subsequent bug fixes that have gone in on top of it.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Dave Chinner
     

01 Jun, 2010

1 commit


22 May, 2010

3 commits

  • The laptop mode timer had the nr_pages and sb_locked arguments
    mixed up.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • When CONFIG_BLOCK isn't enabled:

    mm/page-writeback.c: In function 'laptop_mode_timer_fn':
    mm/page-writeback.c:708: error: dereferencing pointer to incomplete type
    mm/page-writeback.c:709: error: dereferencing pointer to incomplete type

    Fix this by essentially eliminating the laptop sync handlers when
    CONFIG_BLOCK isn't set, as most are only used from the block layer code.
    The exception is laptop_sync_completion() which is used from sys_sync(),
    make that an empty declaration in that case.

    Reported-by: Randy Dunlap
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Commit 69b62d01 fixed up most of the places where we would enter
    busy schedule() spins when disabling the periodic background
    writeback. This fixes up the sb timer so that it doesn't get
    hammered on with the delay disabled, and ensures that it gets
    rearmed if needed when /proc/sys/vm/dirty_writeback_centisecs
    gets modified.

    bdi_forker_task() also needs to check for !dirty_writeback_centisecs
    and use schedule() appropriately, fix that up too.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

17 May, 2010

1 commit

  • When umount calls sync_filesystem(), we first do a WB_SYNC_NONE
    writeback to kick off writeback of pending dirty inodes, then follow
    that up with a WB_SYNC_ALL to wait for it. Since umount already holds
    the sb s_umount mutex, WB_SYNC_NONE ends up doing nothing and all
    writeback happens as WB_SYNC_ALL. This can greatly slow down umount,
    since WB_SYNC_ALL writeback is a data integrity operation and thus
    a bigger hammer than simple WB_SYNC_NONE. For barrier aware file systems
    it's a lot slower.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

06 Apr, 2010

1 commit

  • One of the features of laptop-mode is that it forces a writeout of dirty
    pages if something else triggers a physical read or write from a device.
    The current implementation flushes pages on all devices, rather than only
    the one that triggered the flush. This patch alters the behaviour so that
    only the recently accessed block device is flushed, preventing other
    disks being spun up for no terribly good reason.

    Signed-off-by: Matthew Garrett
    Signed-off-by: Jens Axboe

    Matthew Garrett
     

03 Dec, 2009

1 commit

  • - no one is calling wb_writeback and write_cache_pages with
    wbc.nonblocking=1 any more
    - lumpy pageout will want to do nonblocking writeback without the
    congestion wait

    So remove the congestion checks as suggested by Chris.

    Signed-off-by: Wu Fengguang
    Cc: Chris Mason
    Cc: Jens Axboe
    Cc: Trond Myklebust
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Evgeniy Polyakov
    Cc: Alex Elder
    Signed-off-by: Jens Axboe

    Wu Fengguang
     

09 Oct, 2009

1 commit

  • It makes sense to do IOWAIT when someone is blocked
    due to IO throttle, as suggested by Kame and Peter.

    There is an old comment for not doing IOWAIT on throttle,
    however it has been mismatching the code for a long time.

    If we stop accounting IOWAIT for 2.6.32, it could be an
    undesirable behavior change. So restore the io_schedule.

    CC: KAMEZAWA Hiroyuki
    CC: Peter Zijlstra
    Signed-off-by: Wu Fengguang
    Signed-off-by: Jens Axboe

    Wu Fengguang
     

26 Sep, 2009

5 commits

  • Sometimes we only want to write pages from a specific super_block,
    so allow that to be passed in.

    This fixes a problem with commit 56a131dcf7ed36c3c6e36bea448b674ea85ed5bb
    causing writeback on all super_blocks on a bdi, where we only really
    want to sync a specific sb from writeback_inodes_sb().

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • * 'writeback' of git://git.kernel.dk/linux-2.6-block:
    writeback: writeback_inodes_sb() should use bdi_start_writeback()
    writeback: don't delay inodes redirtied by a fast dirtier
    writeback: make the super_block pinning more efficient
    writeback: don't resort for a single super_block in move_expired_inodes()
    writeback: move inodes from one super_block together
    writeback: get rid to incorrect references to pdflush in comments
    writeback: improve readability of the wb_writeback() continue/break logic
    writeback: cleanup writeback_single_inode()
    writeback: kupdate writeback shall not stop when more io is possible
    writeback: stop background writeback when below background threshold
    writeback: balance_dirty_pages() shall write more than dirtied pages
    fs: Fix busyloop in wb_writeback()

    Linus Torvalds
     
  • Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Treat bdi_start_writeback(0) as a special request to do background write,
    and stop such work when we are below the background dirty threshold.

    Also simplify the (nr_pages
    CC: Jan Kara
    Acked-by: Peter Zijlstra
    Signed-off-by: Wu Fengguang
    Signed-off-by: Jens Axboe

    Wu Fengguang
     
  • Some filesystem may choose to write much more than ratelimit_pages
    before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
    determine number to write based on real number of dirtied pages.

    Otherwise it is possible that
    loop {
    btrfs_file_write(): dirty 1024 pages
    balance_dirty_pages(): write up to 48 pages (= ratelimit_pages * 1.5)
    }
    in which the writeback rate cannot keep up with dirty rate, and the
    dirty pages go all the way beyond dirty_thresh.

    The increased write_chunk may make the dirtier more bumpy.
    So filesystems shall be take care not to dirty too much at
    a time (eg. > 4MB) without checking the ratelimit.

    Signed-off-by: Wu Fengguang
    Acked-by: Peter Zijlstra
    Signed-off-by: Jens Axboe

    Wu Fengguang
     

24 Sep, 2009

2 commits

  • * 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits)
    HWPOISON: Enable error_remove_page on btrfs
    HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs
    HWPOISON: Add madvise() based injector for hardware poisoned pages v4
    HWPOISON: Enable error_remove_page for NFS
    HWPOISON: Enable .remove_error_page for migration aware file systems
    HWPOISON: The high level memory error handler in the VM v7
    HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process
    HWPOISON: shmem: call set_page_dirty() with locked page
    HWPOISON: Define a new error_remove_page address space op for async truncation
    HWPOISON: Add invalidate_inode_page
    HWPOISON: Refactor truncate to allow direct truncating of page v2
    HWPOISON: check and isolate corrupted free pages v2
    HWPOISON: Handle hardware poisoned pages in try_to_unmap
    HWPOISON: Use bitmask/action code for try_to_unmap behaviour
    HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2
    HWPOISON: Add poison check to page fault handling
    HWPOISON: Add basic support for poisoned pages in fault handler v3
    HWPOISON: Add new SIGBUS error codes for hardware poison signals
    HWPOISON: Add support for poison swap entries v2
    HWPOISON: Export some rmap vma locking to outside world
    ...

    Linus Torvalds
     
  • It's unused.

    It isn't needed -- read or write flag is already passed and sysctl
    shouldn't care about the rest.

    It _was_ used in two places at arch/frv for some reason.

    Signed-off-by: Alexey Dobriyan
    Cc: David Howells
    Cc: "Eric W. Biederman"
    Cc: Al Viro
    Cc: Ralf Baechle
    Cc: Martin Schwidefsky
    Cc: Ingo Molnar
    Cc: "David S. Miller"
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan