21 Jul, 2011

1 commit

  • Btrfs needs to be able to control how filemap_write_and_wait_range() is called
    in fsync to make it less of a painful operation, so push taking i_mutex and
    the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
    file systems, like ext3 and ocfs2, can apparently drop taking the i_mutex
    altogether. For correctness' sake I just pushed everything down in all cases to
    make sure that we keep the current behavior the same for everybody, and then each
    individual fs maintainer can make up their mind about what to do from there.
    Thanks,

    Acked-by: Jan Kara
    Signed-off-by: Josef Bacik
    Signed-off-by: Al Viro

    Josef Bacik
     

16 Jun, 2011

1 commit

  • afs_fill_page should read the page that is about to be written but
    the current implementation has a number of issues. If we aren't
    extending the file we always read PAGE_CACHE_SIZE at offset 0. If we
    are extending the file we try to read the entire file.

    Change afs_fill_page to read PAGE_CACHE_SIZE at the right offset,
    clamped to i_size.

    While here, avoid calling afs_fill_page when we are doing a
    PAGE_CACHE_SIZE write.

    Signed-off-by: Anton Blanchard
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Anton Blanchard
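
The clamping described in the afs_fill_page commit above can be sketched in userspace C. This is only an illustration of the arithmetic, not the kernel code: `EX_PAGE_SIZE`, `ex_fill_len()` and `ex_needs_fill()` are hypothetical names, and the 4096-byte page size is an assumption standing in for PAGE_CACHE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for PAGE_CACHE_SIZE; the real value is per-architecture. */
#define EX_PAGE_SIZE 4096u

/*
 * Hypothetical helper: number of bytes to read for the page that starts at
 * the page-aligned byte offset `pos`, clamped so we never read past i_size.
 */
static uint64_t ex_fill_len(uint64_t pos, uint64_t i_size)
{
	uint64_t left;

	assert(pos % EX_PAGE_SIZE == 0);
	if (pos >= i_size)
		return 0;		/* page lies wholly beyond EOF */
	left = i_size - pos;
	return left < EX_PAGE_SIZE ? left : EX_PAGE_SIZE;
}

/* A write that covers the whole page does not need the page read first. */
static int ex_needs_fill(unsigned int from, unsigned int len)
{
	return !(from == 0 && len == EX_PAGE_SIZE);
}
```

For a 10000-byte file, `ex_fill_len(8192, 10000)` yields the 1808 bytes that remain in the last page, while a page starting at or beyond i_size yields 0.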
     

26 Feb, 2011

1 commit

  • I'm seeing the following oops when testing afs:

    Unable to handle kernel paging request for data at address 0x00000008
    ...
    NIP [c0000000003393b0] .afs_unlink_writeback+0x38/0xc0
    LR [c00000000033987c] .afs_put_writeback+0x98/0xec
    Call Trace:
    [c00000000345f600] [c00000000033987c] .afs_put_writeback+0x98/0xec
    [c00000000345f690] [c00000000033ae80] .afs_write_begin+0x6a4/0x75c
    [c00000000345f790] [c00000000012b77c] .generic_file_buffered_write+0x148/0x320
    [c00000000345f8d0] [c00000000012e1b8] .__generic_file_aio_write+0x37c/0x3e4
    [c00000000345f9d0] [c00000000012e2a8] .generic_file_aio_write+0x88/0xfc
    [c00000000345fa90] [c0000000003390a8] .afs_file_write+0x10c/0x178
    [c00000000345fb40] [c000000000188788] .do_sync_write+0xc4/0x128
    [c00000000345fcc0] [c000000000189658] .vfs_write+0xe8/0x1d8
    [c00000000345fd70] [c000000000189884] .SyS_write+0x68/0xb0
    [c00000000345fe30] [c000000000008564] syscall_exit+0x0/0x40

    afs_write_begin hits an error and calls afs_unlink_writeback. In there
    we do list_del_init on an uninitialised list.

    The patch below initialises ->link when creating the afs_writeback struct.

    Signed-off-by: Anton Blanchard
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    Anton Blanchard
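
Why initialising ->link matters can be shown with a minimal userspace model of a circular doubly linked list. The `ex_` names below are hypothetical and only mimic the shape of the kernel's list helpers. A zero-filled node has NULL next/prev pointers, so unlinking it dereferences NULL, which would be consistent with the fault near address 0x8 in the oops above (an assumption, not something the commit states).

```c
#include <assert.h>
#include <stddef.h>

struct ex_list_head {
	struct ex_list_head *next, *prev;
};

/* An initialised node points at itself, i.e. it is an empty list. */
static void ex_init_list_head(struct ex_list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void ex_list_add(struct ex_list_head *n, struct ex_list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink the entry and leave it as an empty, reusable list. */
static void ex_list_del_init(struct ex_list_head *e)
{
	/* A zeroed, never-initialised node would fail this check. */
	assert(e->next != NULL && e->prev != NULL);
	e->prev->next = e->next;
	e->next->prev = e->prev;
	ex_init_list_head(e);
}

static int ex_list_empty(const struct ex_list_head *h)
{
	return h->next == h;
}
```

With the node initialised at creation time, calling `ex_list_del_init()` on an error path is harmless even if the node was never added to any list.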
     

27 Oct, 2010

1 commit

  • This removes more dead code that was somehow missed by commit 0d99519efef
    (writeback: remove unused nonblocking and congestion checks). There is
    no behavior change except for the removal of two entries from one of the
    ext4 tracing interfaces.

    The nonblocking checks in ->writepages are no longer used because the
    flusher now prefers to block on get_request_wait() rather than skip inodes
    on IO congestion. The latter will lead to more seeky IO.

    The nonblocking checks in ->writepage are no longer used because they are
    redundant with the WB_SYNC_NONE check.

    We no longer set ->nonblocking in VM page out and page migration, because
    a) it's effectively redundant with WB_SYNC_NONE in current code
    b) its old semantic of "Don't get stuck on request queues" is misbehavior:
    that would skip some dirty inodes on congestion and page out others, which
    is unfair in terms of LRU age.

    Inspired by Christoph Hellwig. Thanks!

    Signed-off-by: Wu Fengguang
    Cc: Theodore Ts'o
    Cc: David Howells
    Cc: Sage Weil
    Cc: Steve French
    Cc: Chris Mason
    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     

06 Jul, 2010

1 commit


28 May, 2010

1 commit


06 Mar, 2010

1 commit

  • Similar to the fsync issue fixed a while ago in commit
    2daea67e966dc0c42067ebea015ddac6834cef88 we need to wait for data to
    actually hit the disk before writing out the metadata to guarantee
    data integrity for filesystems that modify the inode in the data I/O
    completion path. Currently XFS and NFS handle this manually, and AFS
    has a write_inode method that does nothing but wait for data, while
    others are possibly missing out on this.

    Fortunately this change has a lot less impact than the fsync change
    as none of the write_inode methods starts data writeout of any form
    by itself.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

10 Dec, 2009

2 commits

  • generic_file_aio_write already calls into ->fsync to handle O_SYNC/O_DSYNC.
    Remove the duplicate manual invocation.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • While Linux provided an O_SYNC flag basically since day 1, it took until
    Linux 2.4.0-test12pre2 to actually get it implemented for filesystems.
    Since that day we have had generic_osync_around with only minor changes and
    the great "For now, when the user asks for O_SYNC, we'll actually give
    O_DSYNC" comment. This patch intends to actually give us real O_SYNC
    semantics in addition to the O_DSYNC semantics. After Jan's O_SYNC
    patches, which are required before this patch, it's actually surprisingly
    simple: we just need to figure out when to set the datasync flag to
    vfs_fsync_range and when not.

    This patch renames the existing O_SYNC flag to O_DSYNC while keeping its
    numerical value to keep binary compatibility, and adds a new real O_SYNC
    flag. To guarantee backwards compatibility it is defined as expanding to
    both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
    sure we are backwards-compatible when compiled against the new headers.

    This also means that all places that don't care about the differences can
    just check O_DSYNC and get the right behaviour for O_SYNC, too - only
    places that actually care need to check __O_SYNC in addition. Drivers and
    network filesystems have been updated in a fail safe way to always do the
    full sync magic if O_DSYNC is set. The few places setting O_SYNC for
    lower layers are kept that way for now to stay failsafe.

    We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
    to make sure we always get these sane options.

    Note that parisc really screwed up their headers as they already define a
    O_DSYNC that has always been a no-op. We try to repair it by using it for
    the new O_DSYNC and redefining O_SYNC to send both the traditional
    O_SYNC numerical value _and_ the O_DSYNC one.

    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Grant Grundler
    Cc: "David S. Miller"
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Al Viro
    Cc: Andreas Dilger
    Acked-by: Trond Myklebust
    Acked-by: Kyle McMartin
    Acked-by: Ulrich Drepper
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Jan Kara

    Christoph Hellwig
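
The flag layout described in the O_SYNC/O_DSYNC commit above can be modelled in a few lines of C. This is a sketch only: the `EX_` constants are illustrative values chosen for this example (real values differ per architecture, as the parisc note shows), and the helper names are hypothetical rather than kernel functions.

```c
#include <assert.h>

/* Illustrative values only; not taken from any particular kernel header. */
#define EX_O_DSYNC	00010000
#define EX___O_SYNC	04000000
#define EX_O_SYNC	(EX___O_SYNC | EX_O_DSYNC)

/* Code that does not care about the difference checks O_DSYNC only. */
static int ex_needs_sync(int flags)
{
	return (flags & EX_O_DSYNC) != 0;
}

/*
 * Code that does care: returns the datasync argument one would pass on,
 * 1 for data-only integrity (O_DSYNC), 0 for a full sync (O_SYNC).
 */
static int ex_datasync_arg(int flags)
{
	assert(ex_needs_sync(flags));
	return (flags & EX___O_SYNC) ? 0 : 1;
}

/* Open-path sanitising: __O_SYNC on its own is promoted to full O_SYNC. */
static int ex_sanitize_flags(int flags)
{
	if (flags & EX___O_SYNC)
		flags |= EX_O_DSYNC;
	return flags;
}
```

Because `EX_O_SYNC` contains the `EX_O_DSYNC` bit, a binary built against old headers (where the old O_SYNC value is now O_DSYNC) still gets data-sync behaviour, and new callers asking for O_SYNC pass every O_DSYNC check as well.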
     

16 Sep, 2009

1 commit


03 Apr, 2009

1 commit

  • The attached patch makes the kAFS filesystem in fs/afs/ use FS-Cache, and
    through it any attached caches. The kAFS filesystem will use caching
    automatically if it's available.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     

05 Jan, 2009

1 commit

  • With the write_begin/write_end aops, page_symlink was broken because it
    could no longer pass a GFP_NOFS type mask into the point where the
    allocations happened. They are done in write_begin, which would always
    assume that the filesystem can be entered from reclaim. This bug could
    cause filesystem deadlocks.

    The funny thing with having a gfp_t mask there is that it doesn't really
    allow the caller to arbitrarily tinker with the context in which it can be
    called. It couldn't ever be GFP_ATOMIC, for example, because it needs to
    take the page lock. The only thing any callers care about is __GFP_FS
    anyway, so turn that into a single flag.

    Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on
    this flag in their write_begin function. Change __grab_cache_page to
    accept a nofs argument as well, to honour that flag (while we're there,
    change the name to grab_cache_page_write_begin which is more instructive
    and does away with random leading underscores).

    This is really a more flexible way to go in the end anyway -- if a
    filesystem happens to want any extra allocations aside from the pagecache
    ones in its write_begin function, it may now use GFP_KERNEL (rather than
    GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a
    random example).

    [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
    [kosaki.motohiro@jp.fujitsu.com: fix fuse]
    Signed-off-by: Nick Piggin
    Reviewed-by: KOSAKI Motohiro
    Cc: [2.6.28.x]
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    [ Cleaned up the calling convention: just pass in the AOP flags
    untouched to the grab_cache_page_write_begin() function. That
    just simplifies everybody, and may even allow future expansion of the
    logic. - Linus ]
    Signed-off-by: Linus Torvalds

    Nick Piggin
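
The idea behind AOP_FLAG_NOFS in the commit above, a single flag that strips filesystem re-entry from the allocation mask, can be sketched as follows. The bit values and the `ex_write_begin_gfp()` helper are hypothetical and model only two mask bits; they are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical bit values for illustration; not the kernel's. */
#define EX_GFP_IO		0x1u
#define EX_GFP_FS		0x2u
#define EX_GFP_KERNEL		(EX_GFP_IO | EX_GFP_FS)
#define EX_AOP_FLAG_NOFS	0x4u

/*
 * Derive the allocation mask for a pagecache page in write_begin: start
 * from the mapping's mask and drop the FS bit when the caller passed NOFS.
 */
static unsigned int ex_write_begin_gfp(unsigned int mapping_gfp,
				       unsigned int aop_flags)
{
	unsigned int notmask = 0;

	/* This sketch only models the IO and FS bits. */
	assert((mapping_gfp & ~EX_GFP_KERNEL) == 0);
	if (aop_flags & EX_AOP_FLAG_NOFS)
		notmask = EX_GFP_FS;
	return mapping_gfp & ~notmask;
}
```

Passing the AOP flags through untouched, as the cleanup note in the commit describes, keeps the decision in one place instead of making every caller build its own mask.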
     

17 Oct, 2008

1 commit

  • Cannot assume writes will fully complete, so this conversion goes the easy
    way and always brings the page uptodate before the write.

    [dhowells@redhat.com: style tweaks]
    Signed-off-by: Nick Piggin
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

05 Aug, 2008

1 commit

  • Converting page lock to new locking bitops requires a change of page flag
    operation naming, so we might as well convert it to something nicer
    (!TestSetPageLocked_Lock => trylock_page, SetPageLocked => set_page_locked).

    This also facilitates lockdeping of page lock.

    Signed-off-by: Nick Piggin
    Acked-by: KOSAKI Motohiro
    Acked-by: Peter Zijlstra
    Acked-by: Andrew Morton
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

17 Oct, 2007

2 commits

  • mm.h doesn't use directly anything from mutex.h and backing-dev.h, so
    remove them and add them back to files which need them.

    Cross-compile tested on many configs and archs.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • This patch contains the following possible cleanups:
    - make the following needlessly global functions static:
    - rxrpc.c: afs_send_pages()
    - vlocation.c: afs_vlocation_queue_for_updates()
    - write.c: afs_writepages_region()
    - make the following needlessly global variables static:
    - mntpt.c: afs_mntpt_expiry_timeout
    - proc.c: afs_vlocation_states[]
    - server.c: afs_server_timeout
    - vlocation.c: afs_vlocation_timeout
    - vlocation.c: afs_vlocation_update_timeout
    - #if 0 the following unused function:
    - cell.c: afs_get_cell_maybe()
    - #if 0 the following unused variables:
    - callback.c: afs_vnode_update_timeout
    - cmservice.c: struct afs_cm_workqueue

    Signed-off-by: Adrian Bunk
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

17 May, 2007

1 commit

  • afs_prepare_write() should not mark a page up to date if it only partially
    fills it in, in expectation of the caller filling in the rest prior to calling
    commit_write(). commit_write(), however, should mark the page up to date.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

11 May, 2007

3 commits

  • Following bug was uncovered by compiling with '-W' flag:

    CC [M] fs/afs/write.o
    fs/afs/write.c: In function ‘afs_write_back_from_locked_page’:
    fs/afs/write.c:398: warning: comparison of unsigned expression >= 0 is always true

    Loop variable 'n' is unsigned, so wraps around happily as far as I can
    see. Trivial fix attached (compile tested only).

    Signed-off-by: Mika Kukkonen
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
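
The warning quoted in the commit above comes from a countdown loop whose unsigned index is tested with `>= 0`. One common way to count down safely with an unsigned variable is shown below. This is a generic idiom, not necessarily the fix the patch applied, and `ex_visit_down()` is a hypothetical name.

```c
#include <assert.h>

/*
 * Visit indices count-1 .. 0 and record the order in which they were seen.
 * Returns the number of indices visited.
 */
static unsigned int ex_visit_down(unsigned int count, unsigned int *order)
{
	unsigned int n = count, visited = 0;

	assert(order);
	/*
	 * `for (n = count - 1; n >= 0; n--)` never terminates: an unsigned n
	 * wraps to a huge value instead of going negative. Testing before
	 * the decrement avoids the problem.
	 */
	while (n-- > 0)
		order[visited++] = n;
	return visited;
}
```

With `count == 0` the loop body is skipped entirely, which is the case the `>= 0` form gets wrong.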
     
  • Further fixes for AFS write support:

    (1) The afs_send_pages() outer loop must do an extra iteration if it ends
    with 'first == last', because 'last' is inclusive in the page set;
    otherwise it fails to send the last page and complete the RxRPC op under
    some circumstances.

    (2) Similarly, the outer loop in afs_pages_written_back() must also do an
    extra iteration if it ends with 'first == last', otherwise it fails to
    clear PG_writeback on the last page under some circumstances.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
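
The off-by-one described in the commit above can be reproduced with a small model of a batched loop over an inclusive page range. The batch size and the `ex_pages_sent()` helper are assumptions made for this sketch; the kernel functions do real work per batch rather than counting.

```c
#include <assert.h>

#define EX_BATCH 8ul	/* assumed batch size for the sketch */

/*
 * Count the pages processed for the inclusive range [first, last].
 * inclusive_test selects the loop condition: nonzero uses `first <= last`
 * (correct for an inclusive `last`), zero uses `first < last` (the bug).
 */
static unsigned long ex_pages_sent(unsigned long first, unsigned long last,
				   int inclusive_test)
{
	unsigned long done = 0;

	assert(first <= last);
	do {
		unsigned long count = last - first + 1;	/* pages remaining */

		if (count > EX_BATCH)
			count = EX_BATCH;
		done += count;
		first += count;
	} while (inclusive_test ? first <= last : first < last);
	return done;
}
```

For pages 0 to 8, the first batch handles eight pages and leaves `first == last == 8`. The strict test stops there and drops page 8, while the inclusive test runs once more and handles it.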
     
  • AFS write support fixes:

    (1) Support large files using the 64-bit file access operations if available
    on the server.

    (2) Use kmap_atomic() rather than kmap() in afs_prepare_page().

    (3) Don't do stuff in afs_writepage() that's done by the caller.

    [akpm@linux-foundation.org: fix right shift count >= width of type]
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

10 May, 2007

1 commit

  • Implement support for writing to regular AFS files, including:

    (1) write

    (2) truncate

    (3) fsync, fdatasync

    (4) chmod, chown, chgrp, utime.

    AFS writeback attempts to batch writes into chunks as large as it can manage,
    up to the point that it writes back 65535 pages in one chunk or it meets a
    locked page.

    Furthermore, if a page has been written to using a particular key, then should
    another write to that page use some other key, the first write will be flushed
    before the second is allowed to take place. If the first write fails due to a
    security error, then the page will be scrapped and reread before the second
    write takes place.

    If a page is dirty and the callback on it is broken by the server, then the
    dirty data is not discarded (same behaviour as NFS).

    Shared-writable mappings are not supported by this patch.

    [akpm@linux-foundation.org: fix a bunch of warnings]
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
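
The chunking rule in the write-support commit above (extend the chunk until 65535 pages are gathered or a locked page is met) can be sketched as a simple scan. This is a simplified model under stated assumptions: pages are represented by a byte array of lock flags, dirtiness and key checks are ignored, and `ex_chunk_len()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MAX_CHUNK 65535u	/* page limit quoted in the commit message */

/*
 * locked[i] != 0 means page i is locked. Returns how many consecutive
 * pages, beginning at `start`, can go into a single writeback chunk.
 */
static size_t ex_chunk_len(const unsigned char *locked, size_t npages,
			   size_t start)
{
	size_t n = 0;

	assert(start <= npages);
	while (start + n < npages && n < EX_MAX_CHUNK && !locked[start + n])
		n++;
	return n;
}
```

A caller would write back `ex_chunk_len()` pages in one operation and then restart the scan after the locked page or at the end of the previous chunk.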