02 Jun, 2015

3 commits

  • As concurrent write sharing of an inode is expected to be very rare
    and memcg only tracks page ownership on a first-use basis, severely
    confining the usefulness of such sharing, cgroup writeback tracks
    ownership per inode. While support for concurrent write sharing of
    an inode is deemed unnecessary, an inode being written to by
    different cgroups at different points in time is a lot more common,
    and, more importantly, charging only by first use can too readily
    lead to grossly incorrect behavior (a single foreign page can cause
    gigabytes of writeback to be incorrectly attributed).

    To resolve this issue, cgroup writeback detects the majority
    dirtier of an inode and transfers ownership to it. To avoid
    unnecessary oscillation, the detection mechanism keeps track of
    history and gives the switch verdict only if the foreign usage
    pattern is stable over a certain amount of time and/or number of
    writeback attempts.

    The detection mechanism has fairly low space and computation
    overhead. It adds 8 bytes to struct inode (one int and two u16's)
    and a minimal amount of calculation per IO. When there's a clear
    majority dirtier, it usually converges to the correct answer within
    several seconds of IO time; even when there isn't, it reaches an
    acceptable answer fairly quickly under most circumstances.

    Please see wb_detach_inode() for more details.

    This patch only implements detection. Following patches will
    implement actual switching.
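
    The commit describes the mechanism's footprint but not its exact
    algorithm, so the following is only a minimal userspace C sketch of
    the majority-dirtier idea: keep a candidate foreign dirtier plus a
    stability counter (mirroring the "one int and two u16's"
    footprint), and switch ownership only after the candidate dominates
    for several consecutive writeback rounds. The field names, the
    round-based granularity and the threshold are illustrative
    assumptions, not the kernel's actual logic.

        /* Hypothetical userspace model of majority-dirtier detection;
         * names and thresholds are illustrative, not the kernel's. */
        #include <stdint.h>
        #include <stdio.h>

        #define SWITCH_THRESH 12  /* rounds a foreign cgroup must dominate */

        struct inode_wb_state {
            int      wb_owner;        /* cgroup currently charged for writeback */
            uint16_t foreign_cand;    /* candidate foreign cgroup */
            uint16_t foreign_rounds;  /* consecutive rounds candidate dominated */
        };

        /* Called once per writeback round with the cgroup that dirtied
         * the majority of pages in that round. */
        static void wb_account_round(struct inode_wb_state *st, uint16_t majority)
        {
            if (majority == st->wb_owner) {
                st->foreign_rounds = 0;          /* owner still dominates */
            } else if (majority != st->foreign_cand) {
                st->foreign_cand = majority;     /* new candidate: restart count */
                st->foreign_rounds = 1;
            } else if (++st->foreign_rounds >= SWITCH_THRESH) {
                st->wb_owner = majority;         /* stable foreign usage: switch */
                st->foreign_rounds = 0;
            }
        }

        int main(void)
        {
            struct inode_wb_state st = { .wb_owner = 1 };

            for (int round = 0; round < 16; round++) {
                wb_account_round(&st, 2);        /* cgroup 2 keeps dirtying */
                printf("round %2d: owner %d\n", round, st.wb_owner);
            }
            return 0;
        }

    With writeback rounds spaced roughly a second apart, a threshold of
    this order would match the "converges in several seconds" behavior
    described above.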

    v2: wbc_account_io() now checks whether the wbc is associated
    with a wb before dereferencing it. This can happen when pageout()
    is writing pages directly without going through the usual
    writeback path. As the pageout() path is single-threaded, we don't
    want it to be blocked behind a slow cgroup, and ultimately want it
    to delegate the actual writing to the usual writeback path.

    Signed-off-by: Tejun Heo
    Cc: Jens Axboe
    Cc: Jan Kara
    Cc: Wu Fengguang
    Cc: Greg Thelen
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Currently, for cgroup writeback, the IO submission paths directly
    associate bio's with the blkcg from inode_to_wb_blkcg_css();
    however, implementing foreign inode writeback detection will
    require keeping more writeback context. wbc (writeback_control) is
    the natural fit for the extra context - it persists throughout the
    writeback of each inode and is passed all the way down to the IO
    submission paths.

    This patch adds wbc_attach_and_unlock_inode(), wbc_detach_inode(), and
    wbc_attach_fdatawrite_inode() which are used to associate wbc with the
    inode being written back. IO submission paths now use wbc_init_bio()
    instead of directly associating bio's with blkcg themselves. This
    leaves inode_to_wb_blkcg_css() without any users, so the function
    is removed.

    wbc currently only tracks the associated wb (bdi_writeback). Future
    patches will add more for foreign inode detection. The association is
    established under i_lock which will be depended upon when migrating
    foreign inodes to other wb's.

    As the inode-to-wb association, once established, currently never
    changes, going through wbc when initializing bio's doesn't cause
    any behavior changes.
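
    As a standalone illustration of the pattern just described - the
    wbc is attached under i_lock, bio's are initialized from the wbc
    rather than the inode, and the wbc is detached when writeback of
    the inode finishes - here is a simplified userspace C sketch. The
    struct layouts and function bodies are stand-ins; only the call
    pattern mirrors the interfaces named above.

        /* Standalone model of the association pattern; structs, the
         * "lock", and bodies are stand-ins, not kernel code. */
        #include <stdio.h>

        struct bdi_writeback { int id; };
        struct inode_sketch { struct bdi_writeback *i_wb; int i_lock_held; };
        struct writeback_control_sketch { struct bdi_writeback *wb; };
        struct bio_sketch { struct bdi_writeback *assoc_wb; };

        /* Associate wbc with the inode under "i_lock", then drop it. */
        static void wbc_attach_and_unlock_inode(struct writeback_control_sketch *wbc,
                                                struct inode_sketch *inode)
        {
            wbc->wb = inode->i_wb;
            inode->i_lock_held = 0;
        }

        /* IO submission paths go through the wbc, not the inode. */
        static void wbc_init_bio(struct writeback_control_sketch *wbc,
                                 struct bio_sketch *bio)
        {
            if (wbc->wb)          /* pageout() may submit with no wb attached */
                bio->assoc_wb = wbc->wb;
        }

        static void wbc_detach_inode(struct writeback_control_sketch *wbc)
        {
            wbc->wb = 0;
        }

        int main(void)
        {
            struct bdi_writeback wb = { .id = 7 };
            struct inode_sketch inode = { .i_wb = &wb, .i_lock_held = 1 };
            struct writeback_control_sketch wbc = { 0 };
            struct bio_sketch bio = { 0 };

            wbc_attach_and_unlock_inode(&wbc, &inode);
            wbc_init_bio(&wbc, &bio);
            wbc_detach_inode(&wbc);
            printf("bio associated with wb %d\n", bio.assoc_wb->id);
            return 0;
        }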

    v2: submit_blk_blkcg() now checks whether the wbc is associated
    with a wb before dereferencing it. This can happen when pageout()
    is writing pages directly without going through the usual
    writeback path. As the pageout() path is single-threaded, we don't
    want it to be blocked behind a slow cgroup, and ultimately want it
    to delegate the actual writing to the usual writeback path.

    Signed-off-by: Tejun Heo
    Cc: Jens Axboe
    Cc: Jan Kara
    Cc: Wu Fengguang
    Cc: Greg Thelen
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • __mpage_writepage() is used to implement mpage_writepages(), which
    in turn is used for ->writepages() of various filesystems. All
    writeback logic has been updated to handle cgroup writeback: the
    block cgroup to issue IOs for is encoded in writeback_control and
    can be retrieved from the inode; however, __mpage_writepage()
    currently ignores the blkcg indicated by the inode and issues all
    bio's without explicit blkcg association.

    This patch updates __mpage_writepage() so that the issued bio's are
    associated with inode_to_writeback_blkcg_css(inode).

    v2: Updated for per-inode wb association.

    Signed-off-by: Tejun Heo
    Cc: Jens Axboe
    Cc: Jan Kara
    Cc: Andrew Morton
    Cc: Alexander Viro
    Signed-off-by: Jens Axboe

    Tejun Heo
     

10 Oct, 2014

1 commit

  • Add a guard_bio_eod() check to the mpage code in order to allow IO
    even on the odd last sectors of a device, even if the block size is
    some multiple of the physical sector size.

    Using mpage_readpages() for block devices requires this guard
    check.
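
    A userspace sketch of what such a guard has to do - clamp an I/O
    that would extend past the device's last sector instead of
    rejecting it - assuming 512-byte sectors; the names and layout are
    illustrative, not the kernel's guard_bio_eod():

        /* Clamp a page-sized I/O at the end of a device whose size is
         * not a multiple of the block size. Illustrative stand-ins. */
        #include <stdint.h>
        #include <stdio.h>

        #define SECTOR_SIZE 512u

        struct bio_sketch {
            uint64_t start_sector;  /* first 512-byte sector of the I/O */
            uint32_t size_bytes;    /* total bytes requested */
        };

        static void guard_eod_sketch(struct bio_sketch *bio, uint64_t dev_sectors)
        {
            uint64_t end = bio->start_sector
                         + (bio->size_bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;

            if (end > dev_sectors) {
                /* Truncate to the device boundary so the odd last
                 * sectors remain accessible. */
                bio->size_bytes =
                    (uint32_t)((dev_sectors - bio->start_sector) * SECTOR_SIZE);
            }
        }

        int main(void)
        {
            /* A 4 KiB request starting 2 sectors before the end of a
             * 100-sector device. */
            struct bio_sketch bio = { .start_sector = 98, .size_bytes = 4096 };

            guard_eod_sketch(&bio, 100);
            printf("truncated to %u bytes\n", bio.size_bytes);  /* 1024 */
            return 0;
        }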

    Signed-off-by: Akinobu Mita
    Cc: Jens Axboe
    Cc: Alexander Viro
    Cc: Jeff Moyer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

05 Jun, 2014

3 commits

  • A block device driver may choose to provide a rw_page operation.
    It will be called when the filesystem is attempting to do
    page-sized I/O to page cache pages (i.e. not for direct I/O). This
    does preclude I/Os that are larger than page size, so it may only
    be a performance gain for some devices.
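
    The dispatch this enables can be sketched as an optional function
    pointer with a bio-path fallback; the types and names below are
    illustrative stand-ins, not the kernel's block_device_operations:

        /* Optional-op dispatch: use the driver's rw_page if provided,
         * else fall back to the normal bio path. Stand-in types. */
        #include <stdio.h>

        struct bdev_ops_sketch {
            int (*rw_page)(int sector, char *page, int is_write); /* optional */
        };

        static int submit_via_bio(int sector, char *page, int is_write)
        {
            (void)page;                     /* bio construction elided */
            printf("bio path: sector %d (%s)\n", sector,
                   is_write ? "write" : "read");
            return 0;
        }

        static int do_page_io(const struct bdev_ops_sketch *ops,
                              int sector, char *page, int is_write)
        {
            if (ops->rw_page)               /* driver opted in */
                return ops->rw_page(sector, page, is_write);
            return submit_via_bio(sector, page, is_write);
        }

        int main(void)
        {
            char page[4096];
            struct bdev_ops_sketch no_op = { 0 };  /* driver without rw_page */

            return do_page_io(&no_op, 8, page, 0);
        }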

    Signed-off-by: Matthew Wilcox
    Tested-by: Dheeraj Reddy
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • page_endio() takes care of updating all the appropriate page flags
    once I/O to a page has finished. Switch to using
    mapping_set_error() instead of setting AS_EIO directly; this
    handles thin-provisioned devices correctly.
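
    The benefit is that mapping_set_error() keeps -ENOSPC (the
    thin-provisioning out-of-space case) distinct from generic I/O
    errors. A minimal userspace sketch of that behavior, with stand-in
    flag bits and struct:

        #include <errno.h>
        #include <stdio.h>

        enum { AS_EIO = 1 << 0, AS_ENOSPC = 1 << 1 };

        struct mapping_sketch { unsigned long flags; };

        static void mapping_set_error_sketch(struct mapping_sketch *m, int error)
        {
            if (!error)
                return;
            if (error == -ENOSPC)
                m->flags |= AS_ENOSPC;  /* thin device ran out of space */
            else
                m->flags |= AS_EIO;     /* everything else: generic I/O error */
        }

        int main(void)
        {
            struct mapping_sketch m = { 0 };

            mapping_set_error_sketch(&m, -ENOSPC);
            printf("flags=0x%lx (ENOSPC recorded, not EIO)\n", m.flags);
            return 0;
        }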

    Signed-off-by: Matthew Wilcox
    Cc: Dave Chinner
    Cc: Dheeraj Reddy
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • __mpage_writepage() is over 200 lines long, has 20 local variables, four
    goto labels and could desperately use simplification. Splitting
    clean_buffers() into a helper function improves matters a little,
    removing 20+ lines from it.

    Signed-off-by: Matthew Wilcox
    Cc: Dave Chinner
    Cc: Dheeraj Reddy
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

24 Nov, 2013

2 commits

  • Immutable biovecs are going to require an explicit iterator. To
    implement immutable bvecs, a later patch is going to add a bi_bvec_done
    member to this struct; for now, this patch effectively just renames
    things.

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Geert Uytterhoeven
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Ed L. Cashin"
    Cc: Nick Piggin
    Cc: Lars Ellenberg
    Cc: Jiri Kosina
    Cc: Matthew Wilcox
    Cc: Geoff Levand
    Cc: Yehuda Sadeh
    Cc: Sage Weil
    Cc: Alex Elder
    Cc: ceph-devel@vger.kernel.org
    Cc: Joshua Morris
    Cc: Philip Kelleher
    Cc: Rusty Russell
    Cc: "Michael S. Tsirkin"
    Cc: Konrad Rzeszutek Wilk
    Cc: Jeremy Fitzhardinge
    Cc: Neil Brown
    Cc: Alasdair Kergon
    Cc: Mike Snitzer
    Cc: dm-devel@redhat.com
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux390@de.ibm.com
    Cc: Boaz Harrosh
    Cc: Benny Halevy
    Cc: "James E.J. Bottomley"
    Cc: Greg Kroah-Hartman
    Cc: "Nicholas A. Bellinger"
    Cc: Alexander Viro
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Andreas Dilger
    Cc: Jaegeuk Kim
    Cc: Steven Whitehouse
    Cc: Dave Kleikamp
    Cc: Joern Engel
    Cc: Prasad Joshi
    Cc: Trond Myklebust
    Cc: KONISHI Ryusuke
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Ben Myers
    Cc: xfs@oss.sgi.com
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Len Brown
    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Cc: Herton Ronaldo Krzesinski
    Cc: Ben Hutchings
    Cc: Andrew Morton
    Cc: Guo Chao
    Cc: Tejun Heo
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Wei Yongjun
    Cc: "Roger Pau Monné"
    Cc: Jan Beulich
    Cc: Stefano Stabellini
    Cc: Ian Campbell
    Cc: Sebastian Ott
    Cc: Christian Borntraeger
    Cc: Minchan Kim
    Cc: Jiang Liu
    Cc: Nitin Gupta
    Cc: Jerome Marchand
    Cc: Joe Perches
    Cc: Peng Tao
    Cc: Andy Adamson
    Cc: fanchaoting
    Cc: Jie Liu
    Cc: Sunil Mushran
    Cc: "Martin K. Petersen"
    Cc: Namjae Jeon
    Cc: Pankaj Kumar
    Cc: Dan Magenheimer
    Cc: Mel Gorman

    Kent Overstreet
     
  • With immutable biovecs we don't want code accessing bi_io_vec directly -
    the uses this patch changes weren't incorrect since they all own the
    bio, but it makes the code harder to audit for no good reason - also,
    this will help with multipage bvecs later.

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Alexander Viro
    Cc: Chris Mason
    Cc: Jaegeuk Kim
    Cc: Joern Engel
    Cc: Prasad Joshi
    Cc: Trond Myklebust

    Kent Overstreet
     

29 Feb, 2012

1 commit


12 Jan, 2012

1 commit


27 May, 2011

1 commit

  • This fourth patch of eight in this cleancache series provides the
    core hooks in VFS for: initializing cleancache per filesystem;
    capturing clean pages reclaimed by page cache; attempting to get
    pages from cleancache before filesystem read; and ensuring coherency
    between pagecache, disk, and cleancache. Note that the placement
    of these hooks was stable from 2.6.18 to 2.6.38; a minor semantic
    change was required due to a patchset in 2.6.39.

    All hooks become no-ops if CONFIG_CLEANCACHE is unset, or become
    a check of a boolean global if CONFIG_CLEANCACHE is set but no
    cleancache "backend" has claimed cleancache_ops.

    Details and a FAQ can be found in Documentation/vm/cleancache.txt
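
    A minimal userspace model of the hook flow described above - try
    the backend before a real read, and hand clean reclaimed pages to
    it, with everything becoming a no-op when no backend has claimed
    the ops table. All names and types here are illustrative stand-ins:

        #include <stdbool.h>
        #include <stdio.h>

        struct cleancache_ops_sketch {
            bool (*get_page)(int key, char *page);
            void (*put_page)(int key, const char *page);
        };

        static struct cleancache_ops_sketch *cleancache_ops; /* NULL: no backend */

        static bool read_page(int key, char *page)
        {
            /* Hook: try cleancache before issuing a real disk read. */
            if (cleancache_ops && cleancache_ops->get_page(key, page))
                return true;                    /* hit: disk read avoided */
            snprintf(page, 16, "disk:%d", key); /* fall back to "disk" */
            return false;
        }

        static void reclaim_clean_page(int key, const char *page)
        {
            /* Hook: capture a clean page evicted from the page cache. */
            if (cleancache_ops)
                cleancache_ops->put_page(key, page);
        }

        int main(void)
        {
            char page[16];

            read_page(42, page);          /* no backend: plain "disk" read */
            reclaim_clean_page(42, page); /* no backend: no-op */
            printf("%s\n", page);
            return 0;
        }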

    [v8: minchan.kim@gmail.com: adapt to new remove_from_page_cache function]
    Signed-off-by: Chris Mason
    Signed-off-by: Dan Magenheimer
    Reviewed-by: Jeremy Fitzhardinge
    Reviewed-by: Konrad Rzeszutek Wilk
    Cc: Andrew Morton
    Cc: Al Viro
    Cc: Matthew Wilcox
    Cc: Nick Piggin
    Cc: Mel Gorman
    Cc: Rik Van Riel
    Cc: Jan Beulich
    Cc: Andreas Dilger
    Cc: Ted Ts'o
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Nitin Gupta

    Dan Magenheimer
     

10 Mar, 2011

1 commit


14 Jan, 2011

1 commit

  • Merge mpage_end_io_read() and mpage_end_io_write() into mpage_end_io() to
    eliminate code duplication.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Hai Shan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hai Shan
     

30 Mar, 2010

1 commit

  • …implicit slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare
    for this change by updating users of gfp and slab facilities to
    include those headers directly instead of assuming their
    availability. As this conversion needs to touch a large number of
    source files, the following script was used as the basis of the
    conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. gfp.h if only gfp is
    used, slab.h if slab is used.

    * When the script inserts a new include, it looks at the include
    blocks and tries to place the new include so that its order
    conforms to its surroundings. It's put in the include block which
    contains core kernel includes, in the same order that the rest are
    ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end
    if there doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints
    out an error message indicating which .h file needs to be added to
    the file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them, as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored, as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and
    failures were fixed. CONFIG_GCOV_KERNEL was turned off for all
    tests (as my distributed build env didn't work with gcov compiles)
    and a few more options had to be turned off depending on the arch
    to make things build (like ipr on powerpc/64, which failed due to
    missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from the tests
    in step 6, I'm fairly confident about the coverage of this
    conversion patch. If there is a breakage, it's likely to be
    something in one of the arch headers, which should be easily
    discoverable on most builds of the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

04 Feb, 2010

1 commit


14 May, 2009

1 commit

  • These struct buffer_heads are allocated on the stack (and hence are
    initialized with stack garbage). They are only used to call a
    get_blocks() function, so that's mostly OK, but b_state must be
    initialized to 0 so that we don't have any unexpected BH_* flags
    set by accident, such as BH_Unwritten or BH_Delay.
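
    A tiny illustration of the bug class, with a stand-in struct and
    flag bit for the on-stack buffer_head:

        #include <stdio.h>

        #define BH_Delay (1u << 7)     /* illustrative flag bit */

        struct bh_sketch { unsigned long b_state; unsigned long b_size; };

        int main(void)
        {
            struct bh_sketch bh;       /* b_state starts as stack garbage */

            bh.b_state = 0;            /* the fix: no stray BH_* flags */
            bh.b_size = 4096;
            printf("BH_Delay set by accident: %d\n",
                   !!(bh.b_state & BH_Delay));   /* always 0 now */
            return 0;
        }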

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     

01 Apr, 2009

1 commit

  • Commit 29a814d2ee0e43c2980f33f91c1311ec06c0aa35 (vfs: add hooks for
    ext4's delayed allocation support) exported the following functions

    mpage_bio_submit()
    __mpage_writepage()

    for the benefit of ext4's delayed allocation support. Since commit
    a1d6cc563bfdf1bf2829d3e6ce4d8b774251796b (ext4: Rework the
    ext4_da_writepages() function), these functions are not used by the
    ext4 driver anymore. However, the now unnecessary exports still
    remain, and this patch removes those. Moreover, these two functions
    can become static again.

    The issue was spotted by namespacecheck.

    Signed-off-by: Dmitri Vorobiev
    Reviewed-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Dmitri Vorobiev
     

07 Jan, 2009

2 commits

  • It is known that buffer_mapped() is false in this code path.

    Signed-off-by: Franck Bui-Huu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Franck Bui-Huu
     
  • While tracing I/O patterns with blktrace (a great tool) a few
    weeks ago, I identified a minor issue in fs/mpage.c.

    As the comment above mpage_readpages() says, a fs's get_block
    function will set BH_Boundary when it maps a block just before a
    block for which extra I/O is required.

    Since get_block() can map a range of blocks spanning several pages,
    the BH_Boundary flag will be set for all of these pages. But we
    only need to push the I/O we have accumulated at the last block of
    this range.

    This makes do_mpage_readpage() send out the largest possible bio
    instead of a bunch of page-sized ones in the BH_Boundary case.
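
    The submission policy can be sketched as follows; the names are
    illustrative, and the bio accumulation is reduced to a counter:

        #include <stdio.h>

        static int bio_blocks;          /* blocks accumulated in current bio */

        static void submit_accumulated(void)
        {
            if (bio_blocks) {
                printf("submit bio of %d blocks\n", bio_blocks);
                bio_blocks = 0;
            }
        }

        static void add_block(int is_last_of_boundary_range)
        {
            bio_blocks++;
            /* The old behavior flushed whenever BH_Boundary was set,
             * yielding page-sized bios; flushing only at the end of
             * the mapped range keeps the bio as large as possible. */
            if (is_last_of_boundary_range)
                submit_accumulated();
        }

        int main(void)
        {
            for (int blk = 0; blk < 8; blk++)
                add_block(blk == 7);    /* one 8-block bio, not eight 1-block bios */
            return 0;
        }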

    Signed-off-by: Miquel van Smoorenburg
    Cc: Nick Piggin
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miquel van Smoorenburg
     

17 Oct, 2008

1 commit


12 Jul, 2008

1 commit

  • Export mpage_bio_submit() and __mpage_writepage() for the benefit
    of ext4's delayed allocation support. Also change
    __block_write_full_page so that, for buffers that have the
    BH_Delay flag set, it will call get_block() to get the physical
    block allocated, just as in the !BH_Mapped case.

    Signed-off-by: Alex Tomas
    Signed-off-by: "Theodore Ts'o"

    Alex Tomas
     

04 Mar, 2008

1 commit


06 Feb, 2008

1 commit

  • Simplify page cache zeroing of segments of pages through 3
    functions:

    zero_user_segments(page, start1, end1, start2, end2)

    Zeros two segments of the page. It takes the positions where
    zeroing starts and ends, which avoids length calculations and
    makes the code clearer.

    zero_user_segment(page, start, end)

    Same for a single segment.

    zero_user(page, start, length)

    Length variant for the case where we know the length.

    We remove the zero_user_page macro. Issues:

    1. It's a macro. Inline functions are preferable.

    2. The KM_USER0 macro is only defined for HIGHMEM.

    Having to treat this special case everywhere makes the code
    needlessly complex. The parameter for zeroing is always KM_USER0
    except in one single case that we open code.

    Avoiding KM_USER0 means a lot of code no longer has to deal with
    the special casing for HIGHMEM. Dealing with kmap is only
    necessary for HIGHMEM configurations, and in those configurations
    we use KM_USER0 like we do for a series of other functions defined
    in highmem.h.

    Since KM_USER0 depends on HIGHMEM, the existing zero_user_page
    function could not be an inline function. The zero_user_*
    functions introduced here can be inline because that constant is
    not used when these functions are called.

    Also extract the flushing of the caches to be outside of the kmap.
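
    Userspace analogues of the three helpers, with memset() standing
    in for the kmap_atomic()/memset()/flush sequence, show how the two
    narrower calls reduce to the general one:

        #include <stdio.h>
        #include <string.h>

        #define PAGE_SIZE 4096

        static void zero_user_segments(char *page,
                                       unsigned start1, unsigned end1,
                                       unsigned start2, unsigned end2)
        {
            memset(page + start1, 0, end1 - start1);
            memset(page + start2, 0, end2 - start2);
        }

        static void zero_user_segment(char *page, unsigned start, unsigned end)
        {
            zero_user_segments(page, start, end, 0, 0);
        }

        static void zero_user(char *page, unsigned start, unsigned length)
        {
            zero_user_segment(page, start, start + length);
        }

        int main(void)
        {
            char page[PAGE_SIZE];

            memset(page, 0xff, sizeof(page));
            /* Zero everything outside the 512..1023 byte range. */
            zero_user_segments(page, 0, 512, 1024, PAGE_SIZE);
            printf("page[0]=%u page[600]=%u\n",
                   (unsigned char)page[0],       /* 0 */
                   (unsigned char)page[600]);    /* 255: left untouched */
            zero_user(page, 600, 8);             /* length variant */
            return 0;
        }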

    [akpm@linux-foundation.org: fix nfs and ntfs build]
    [akpm@linux-foundation.org: fix ntfs build some more]
    Signed-off-by: Christoph Lameter
    Cc: Steven French
    Cc: Michael Halcrow
    Cc: Steven Whitehouse
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Anton Altaparmakov
    Cc: Mark Fasheh
    Cc: David Chinner
    Cc: Michael Halcrow
    Cc: Steven French
    Cc: Steven Whitehouse
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

17 Oct, 2007

1 commit

  • Quite a bit of code is used in maintaining these "cached pages"
    that are probably pretty unlikely to get used. Creating the spare
    page would require a narrow race where the page is inserted
    concurrently while this process is allocating a page; making use
    of it would then require a multi-page write into an uncached part
    of the file.

    Next, the buffered write path (and others) uses its own LRU pagevec when it
    should be just using the per-CPU LRU pagevec (which will cut down on both data
    and code size cacheline footprint). Also, these private LRU pagevecs are
    emptied after just a very short time, in contrast with the per-CPU pagevecs
    that are persistent. Net result: 7.3 times fewer lru_lock acquisitions required
    to add the pages to pagecache for a bulk write (in 4K chunks).

    [this gets rid of some cond_resched() calls in readahead.c and mpage.c due
    to clashes in -mm. What put them there, and why? ]

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

10 Oct, 2007

1 commit

  • As bi_end_io is only called once when the request is complete,
    the 'size' argument is now redundant. Remove it.

    Now there is no need for bio_endio to subtract the size completed
    from bi_size. So don't do that either.

    While we are at it, change bi_end_io to return void.

    Signed-off-by: Neil Brown
    Signed-off-by: Jens Axboe

    NeilBrown
     

11 May, 2007

1 commit

  • Clean up massive code duplication between mpage_writepages() and
    generic_writepages().

    The new generic function, write_cache_pages(), takes a function
    pointer argument that will be called for each page to be written.

    Maybe cifs_writepages() too can use this infrastructure, but I'm not
    touching that with a ten-foot pole.

    The upcoming page writeback support in fuse will also want this.
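
    The shape of the refactoring - one shared page-walking loop
    parameterized by a per-page callback - can be sketched in
    standalone C; the types are simplified stand-ins for the real
    pagevec-based loop:

        #include <stdio.h>

        struct page_sketch { int index; };

        typedef int (*writepage_fn)(struct page_sketch *page, void *data);

        /* The shared loop: walk the dirty pages and invoke the
         * caller's writepage callback for each one. */
        static int write_cache_pages_sketch(struct page_sketch *pages, int npages,
                                            writepage_fn writepage, void *data)
        {
            for (int i = 0; i < npages; i++) {
                int ret = writepage(&pages[i], data);
                if (ret)
                    return ret;
            }
            return 0;
        }

        static int my_writepage(struct page_sketch *page, void *data)
        {
            printf("writing page %d for %s\n", page->index,
                   (const char *)data);
            return 0;
        }

        int main(void)
        {
            struct page_sketch pages[] = { {0}, {1}, {2} };

            return write_cache_pages_sketch(pages, 3, my_writepage, "myfs");
        }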

    Signed-off-by: Miklos Szeredi
    Acked-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

10 May, 2007

1 commit

  • It's very common for file systems to need to zero part or all of a page,
    the simplist way is just to use kmap_atomic() and memset(). There's
    actually a library function in include/linux/highmem.h that does exactly
    that, but it's confusingly named memclear_highpage_flush(), which is
    descriptive of *how* it does the work rather than what the *purpose* is.
    So this patchset renames the function to zero_user_page(), and calls it
    from the various places that currently open code it.

    This first patch introduces the new function call, and converts all the
    core kernel callsites, both the open-coded ones and the old
    memclear_highpage_flush() ones. Following this patch is a series of
    conversions for each file system individually, per AKPM, and finally a
    patch deprecating the old call. The diffstat below shows the entire
    patchset.

    [akpm@linux-foundation.org: fix a few things]
    Signed-off-by: Nate Diller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nate Diller
     

09 May, 2007

1 commit


01 Oct, 2006

1 commit


23 Jun, 2006

1 commit

  • When a writeback_control's `start' and `end' fields are used to
    indicate a one-byte range starting at file offset zero, the
    required values of .start=0,.end=0 mean that the ->writepages()
    implementation has no way of telling that it is being asked to
    perform a range request, because (start == 0 && end == 0) is
    currently overloaded to mean "this is not a write-a-range request".

    To make all this sane, the patch changes the range handling in
    writeback_control.

    Callers of ->writepages() now always set the range, either via
    range_start/range_end or via range_cyclic. If range_cyclic is
    true, ->writepages() treats the range as cyclic; otherwise it just
    uses range_start and range_end.

    This patch does the following (see the sketch after this list for
    the resulting convention):

    - Add LLONG_MAX, LLONG_MIN, ULLONG_MAX to include/linux/kernel.h.
    -1 is usually OK for range_end (the type is long long), but if
    someone does

    range_end += val;            /* range_end becomes "val - 1" */
    u64val = range_end >> bits;  /* u64val becomes ~(0ULL) */

    or something similar, they are wrong. So this adds LLONG_MAX to
    avoid such nasty things, and uses LLONG_MAX for range_end.

    - All callers of ->writepages() set range_start/end or range_cyclic.

    - Fix updates of ->writeback_index. The existing behavior already
    seems a bit strange: if writeback starts at 0 and is ended by the
    nr_to_write check, the saved last index may reduce the chance of
    scanning the end of the file. So this updates ->writeback_index
    only if range_cyclic is true or the whole file was scanned.
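
    Here is the sketch referred to above: a standalone C model of the
    resulting convention, with stand-in types. Callers always set the
    range, LLONG_MAX replaces the fragile -1 for range_end, and
    range_cyclic distinguishes cyclic whole-mapping writeback from a
    bounded range:

        #include <limits.h>
        #include <stdio.h>

        struct wbc_sketch {
            long long range_start;
            long long range_end;    /* LLONG_MAX means "to end of file" */
            int       range_cyclic; /* nonzero: cycle from ->writeback_index */
        };

        static void writepages_sketch(const struct wbc_sketch *wbc)
        {
            if (wbc->range_cyclic)
                printf("cyclic writeback over the whole mapping\n");
            else
                printf("range writeback: %lld..%lld\n",
                       wbc->range_start, wbc->range_end);
        }

        int main(void)
        {
            /* A one-byte range at offset 0 is now unambiguous. */
            struct wbc_sketch one_byte = { .range_start = 0, .range_end = 0 };
            struct wbc_sketch whole    = { .range_end = LLONG_MAX,
                                           .range_cyclic = 1 };

            writepages_sketch(&one_byte);
            writepages_sketch(&whole);
            return 0;
        }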

    Signed-off-by: OGAWA Hirofumi
    Cc: Nathan Scott
    Cc: Anton Altaparmakov
    Cc: Steven French
    Cc: "Vladimir V. Saveliev"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    OGAWA Hirofumi
     

27 Mar, 2006

2 commits

  • This patch changes mpage_readpages() and get_block() to get the
    disk mapping information for multiple blocks at the same time.

    On entry, b_size represents the amount of disk mapping that needs
    to be mapped; on a successful get_block(), b_size indicates the
    amount of disk mapping that was actually mapped. Only filesystems
    that care to use this information and can provide multiple disk
    blocks at a time need do so.

    No changes are needed for filesystems that want to ignore this.
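
    The contract can be sketched as follows: the caller passes the
    size it wants mapped in b_size, and the filesystem, which may map
    fewer blocks, reports back the amount actually mapped. The types
    and the 8-block limit are illustrative stand-ins:

        #include <stdio.h>

        struct bh_map { unsigned long long b_blocknr; unsigned long b_size; };

        /* A filesystem that can map at most 8 contiguous blocks. */
        static int get_block_sketch(unsigned long long iblock,
                                    struct bh_map *bh, unsigned blkbits)
        {
            unsigned long max = 8ul << blkbits;

            bh->b_blocknr = 1000 + iblock;   /* pretend disk mapping */
            if (bh->b_size > max)
                bh->b_size = max;            /* map less than requested */
            return 0;
        }

        int main(void)
        {
            struct bh_map bh = { .b_size = 16ul << 12 }; /* want 16 4K blocks */

            get_block_sketch(0, &bh, 12);
            printf("mapped %lu bytes at block %llu\n",
                   bh.b_size, bh.b_blocknr);             /* 32768 at 1000 */
            return 0;
        }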

    [akpm@osdl.org: cleanups]
    Signed-off-by: Badari Pulavarty
    Cc: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     
  • Pass the amount of disk that needs to be mapped to get_block().
    This way one can modify the fs ->get_block() functions to map
    multiple blocks at the same time.

    [akpm@osdl.org: performance tweak]
    [akpm@osdl.org: remove unneeded assignments]
    Signed-off-by: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     

09 Jan, 2006

1 commit

  • We've had two instances recently of overflows when doing

    64_bit_value = (32_bit_value << PAGE_CACHE_SHIFT)

    I did a tree-wide grep for `<< PAGE_CACHE_SHIFT' and fixed the
    call sites that could overflow.
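
    The overflow pattern and its fix, as a self-contained example (a
    PAGE_CACHE_SHIFT value of 12 is assumed):

        #include <stdint.h>
        #include <stdio.h>

        #define PAGE_CACHE_SHIFT 12

        int main(void)
        {
            uint32_t index = 0x00200000;   /* page index of an 8 GiB offset */

            /* The shift happens in 32 bits and wraps to 0 before the
             * assignment widens it. */
            uint64_t wrong = index << PAGE_CACHE_SHIFT;

            /* The fix: widen to 64 bits first, then shift. */
            uint64_t right = (uint64_t)index << PAGE_CACHE_SHIFT;

            printf("wrong=0x%llx right=0x%llx\n",
                   (unsigned long long)wrong,   /* 0x0 */
                   (unsigned long long)right);  /* 0x200000000 */
            return 0;
        }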

    Cc: Oleg Drokin
    Cc: David Howells
    Cc: David Woodhouse
    Cc: Christoph Hellwig
    Cc: Anton Altaparmakov
    Cc: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Cc: Roman Zippel
    Cc: Miklos Szeredi
    Cc: Russell King
    Cc: Trond Myklebust
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

04 Jan, 2006

1 commit

  • readpage(), prepare_write(), and commit_write() callers are updated to
    understand the special return code AOP_TRUNCATED_PAGE in the style of
    writepage() and WRITEPAGE_ACTIVATE. AOP_TRUNCATED_PAGE tells the caller that
    the callee has unlocked the page and that the operation should be tried again
    with a new page. OCFS2 uses this to detect and work around a lock inversion in
    its aop methods. There should be no change in behaviour for methods that don't
    return AOP_TRUNCATED_PAGE.

    WRITEPAGE_ACTIVATE is also prefixed with AOP_ for consistency, and
    both are made enums so that kerneldoc can be used to document their
    semantics.
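
    The caller-side retry protocol can be sketched as follows; the
    numeric value and the failing aop are illustrative stand-ins:

        #include <stdio.h>

        enum aop_ret { AOP_OK = 0, AOP_TRUNCATED_PAGE = 0x80001 };

        static int attempts;

        static enum aop_ret readpage_sketch(int page)
        {
            (void)page;
            /* Fail with the retry code once, as a lock-inversion
             * workaround (the OCFS2 case) might. */
            return attempts++ == 0 ? AOP_TRUNCATED_PAGE : AOP_OK;
        }

        int main(void)
        {
            int page = 1;
            enum aop_ret ret;

            do {
                ret = readpage_sketch(page);
                if (ret == AOP_TRUNCATED_PAGE)
                    page++;   /* callee unlocked the page; grab a new one */
            } while (ret == AOP_TRUNCATED_PAGE);

            printf("succeeded on page %d after %d attempts\n", page, attempts);
            return 0;
        }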

    Signed-off-by: Zach Brown

    Zach Brown
     

09 Oct, 2005

1 commit

  • - added typedef unsigned int __nocast gfp_t;

    - replaced __nocast uses for gfp flags with gfp_t - it gives
    exactly the same warnings as far as sparse is concerned, doesn't
    change the generated code (from gcc's point of view we replaced
    unsigned int with a typedef) and documents what's going on far
    better.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

05 Jun, 2005

1 commit

  • When fsync() runs wait_on_page_writeback_range() it only inspects pages which
    are actually under I/O (PAGECACHE_TAG_WRITEBACK). If a page completed I/O
    prior to wait_on_page_writeback_range() looking at it, it is supposed to have
    recorded its I/O error state in the address_space.

    But mpage_end_io_write() forgot to set the address_space error
    flag in this case.

    Signed-off-by: Qu Fuping
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qu Fuping
     

06 May, 2005

2 commits


01 May, 2005

1 commit