Eric Lee / smarc-fsl-linux-kernel

30 Apr, 2008

4 commits

76f1418b4 mm: bdi: move statistics to debugfs ... Browse Code »

Move BDI statistics to debugfs:

/sys/kernel/debug/bdi//stats

Use postcore_initcall() to initialize the sysfs class and debugfs,
because debugfs is initialized in core_initcall().

Update descriptions in ABI documentation.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-04-30 23:29:50 +0800
a42dde041 mm: bdi: allow setting a maximum for the bdi dirty limit ... Browse Code »

Add "max_ratio" to /sys/class/bdi. This indicates the maximum percentage of
the global dirty threshold allocated to this bdi.

[mszeredi@suse.cz]

- fix parsing in max_ratio_store().
- export bdi_set_max_ratio() to modules
- limit bdi_dirty with bdi->max_ratio
- document new sysfs attribute

Signed-off-by: Peter Zijlstra
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2008-04-30 23:29:50 +0800
189d3c4a9 mm: bdi: allow setting a minimum for the bdi dirty limit ... Browse Code »

Under normal circumstances each device is given a part of the total write-back
cache that relates to its current avg writeout speed in relation to the other
devices.

min_ratio - allows one to assign a minimum portion of the write-back cache to
a particular device. This is useful in situations where you might want to
provide a minimum QoS. (One request for this feature came from flash based
storage people who wanted to avoid writing out at all costs - they of course
needed some pdflush hacks as well)

max_ratio - allows one to assign a maximum portion of the dirty limit to a
particular device. This is useful in situations where you want to avoid one
device taking all or most of the write-back cache. Eg. an NFS mount that is
prone to get stuck, or a FUSE mount which you don't trust to play fair.

Add "min_ratio" to /sys/class/bdi. This indicates the minimum percentage of
the global dirty threshold allocated to this bdi.

[mszeredi@suse.cz]

- fix parsing in min_ratio_store()
- document new sysfs attribute

Signed-off-by: Peter Zijlstra
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2008-04-30 23:29:50 +0800
cf0ca9fe5 mm: bdi: export BDI attributes in sysfs ... Browse Code »

Provide a place in sysfs (/sys/class/bdi) for the backing_dev_info object.
This allows us to see and set the various BDI specific variables.

In particular this properly exposes the read-ahead window for all relevant
users and /sys/block//queue/read_ahead_kb should be deprecated.

With patient help from Kay Sievers and Greg KH

[mszeredi@suse.cz]

- split off NFS and FUSE changes into separate patches
- document new sysfs attributes under Documentation/ABI
- do bdi_class_init as a core_initcall, otherwise the "default" BDI
won't be initialized
- remove bdi_init_fmt macro, it's not used very much

[akpm@linux-foundation.org: fix ia64 warning]
Signed-off-by: Peter Zijlstra
Cc: Kay Sievers
Acked-by: Greg KH
Cc: Trond Myklebust
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2008-04-30 23:29:49 +0800

06 Dec, 2007

1 commit

4b01a0b16 mm/backing-dev.c: fix percpu_counter_destroy call bug in bdi_init ... Browse Code »

this call should use the array index j, not i. But with this approach, just
one int i is enough, int j is not needed.

Signed-off-by: Denis Cheng
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Denis Cheng
2007-12-06 01:21:18 +0800

17 Oct, 2007

3 commits

04fbfdc14 mm: per device dirty threshold ... Browse Code »

Scale writeback cache per backing device, proportional to its writeout speed.

By decoupling the BDI dirty thresholds a number of problems we currently have
will go away, namely:

- mutual interference starvation (for any number of BDIs);
- deadlocks with stacked BDIs (loop, FUSE and local NFS mounts).

It might be that all dirty pages are for a single BDI while other BDIs are
idling. By giving each BDI a 'fair' share of the dirty limit, each one can have
dirty pages outstanding and make progress.

A global threshold also creates a deadlock for stacked BDIs; when A writes to
B, and A generates enough dirty pages to get throttled, B will never start
writeback until the dirty pages go away. Again, by giving each BDI its own
'independent' dirty limit, this problem is avoided.

So the problem is to determine how to distribute the total dirty limit across
the BDIs fairly and efficiently. A DBI that has a large dirty limit but does
not have any dirty pages outstanding is a waste.

What is done is to keep a floating proportion between the DBIs based on
writeback completions. This way faster/more active devices get a larger share
than slower/idle devices.

[akpm@linux-foundation.org: fix warnings]
[hugh@veritas.com: Fix occasional hang when a task couldn't get out of balance_dirty_pages]
Signed-off-by: Peter Zijlstra
Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-10-17 23:42:45 +0800
b2e8fb6ef mm: scalable bdi statistics counters ... Browse Code »

Provide scalable per backing_dev_info statistics counters.

Signed-off-by: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-10-17 23:42:45 +0800
c4dc4beed nfs: remove congestion_end() ... Browse Code »

These patches aim to improve balance_dirty_pages() and directly address three
issues:
1) inter device starvation
2) stacked device deadlocks
3) inter process starvation

1 and 2 are a direct result from removing the global dirty limit and using
per device dirty limits. By giving each device its own dirty limit is will
no longer starve another device, and the cyclic dependancy on the dirty limit
is broken.

In order to efficiently distribute the dirty limit across the independant
devices a floating proportion is used, this will allocate a share of the total
limit proportional to the device's recent activity.

3 is done by also scaling the dirty limit proportional to the current task's
recent dirty rate.

This patch:

nfs: remove congestion_end(). It's redundant, clear_bdi_congested() already
wakes the waiters.

Signed-off-by: Peter Zijlstra
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-10-17 23:42:44 +0800

17 Jul, 2007

1 commit

8f8a68ee4 remove mm/backing-dev.c:congestion_wait_interruptible() ... Browse Code »

congestion_wait_interruptible() is no longer used.

Signed-off-by: Adrian Bunk
Acked-by: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2007-07-17 00:05:52 +0800

17 Mar, 2007

1 commit

89a09141d [PATCH] nfs: fix congestion control ... Browse Code »

The current NFS client congestion logic is severly broken, it marks the
backing device congested during each nfs_writepages() call but doesn't
mirror this in nfs_writepage() which makes for deadlocks. Also it
implements its own waitqueue.

Replace this by a more regular congestion implementation that puts a cap on
the number of active writeback pages and uses the bdi congestion waitqueue.

Also always use an interruptible wait since it makes sense to be able to
SIGKILL the process even for mounts without 'intr'.

Signed-off-by: Peter Zijlstra
Acked-by: Trond Myklebust
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-03-17 10:25:05 +0800

21 Oct, 2006

1 commit

3fcfab16c [PATCH] separate bdi congestion functions from queue congestion functions ... Browse Code »

Separate out the concept of "queue congestion" from "backing-dev congestion".
Congestion is a backing-dev concept, not a queue concept.

The blk_* congestion functions are retained, as wrappers around the core
backing-dev congestion functions.

This proper layering is needed so that NFS can cleanly use the congestion
functions, and so that CONFIG_BLOCK=n actually links.

Cc: "Thomas Maier"
Cc: "Jens Axboe"
Cc: Trond Myklebust
Cc: David Howells
Cc: Peter Osterlund
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-10-21 01:26:35 +0800