05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
    ago with promise that one day it will be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)
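
    The first two rules are safe because the shift distance is always zero.
    A minimal user-space sketch (not kernel code) of that reasoning follows.
    The value 12 for PAGE_SHIFT and the helper names index_old() and
    index_new() are assumptions made only for illustration.

```c
/* User-space model; PAGE_SHIFT = 12 (4 KiB pages) is an assumed value. */
#define PAGE_SHIFT        12
#define PAGE_SIZE         (1UL << PAGE_SHIFT)
#define PAGE_CACHE_SHIFT  PAGE_SHIFT      /* the old macros were plain aliases */
#define PAGE_CACHE_SIZE   PAGE_SIZE

/* Old style: the shift distance is PAGE_CACHE_SHIFT - PAGE_SHIFT == 0. */
unsigned long index_old(unsigned long index)
{
    return index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}

/* New style after the semantic patch has dropped the no-op shift. */
unsigned long index_new(unsigned long index)
{
    return index;
}
```

    Both helpers return the same value for every input, which is what lets
    the semantic patch drop the shift expressions outright.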

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

02 Feb, 2016

1 commit

  • Provide read-and-reset objects- and blocks-released counters for cachefilesd
    to use to work out whether there's anything new that can be culled.

    One of the problems cachefilesd has is that if all the objects in the cache
    are pinned by inodes lying dormant in the kernel inode cache, there isn't
    anything for it to cull. In such a case, it just spins around walking the
    filesystem tree and scanning for something to cull. This eats up a lot of
    CPU time.

    By telling cachefilesd if there have been any releases, the daemon can
    sleep until there is the possibility of something to do.

    cachefilesd finds this information by the following means:

    (1) When the control fd is read, the kernel presents a list of values of
    interest. "freleased=N" and "breleased=N" are added to this list to
    indicate the number of files released and number of blocks released
    since the last read call. At this point the counters are reset.

    (2) POLLIN is signalled if the number of files released becomes greater
    than 0.

    Note that by 'released' it just means that the kernel has released its
    interest in those files for the moment, not necessarily that the files
    should be deleted from the cache.
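
    On the daemon side, picking the new counters out of the status line
    could look roughly like the sketch below. Only the key names
    "freleased" and "breleased" come from the description above. The
    helper names, the space-separated layout and the hexadecimal base are
    assumptions, so adjust them to match the actual status format.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Find "key=value" in a space-separated status line and parse the value.
 * Assumption: values are printed in hexadecimal.
 * Returns 0 on success, -1 if the key is absent or has no digits.
 */
int status_value(const char *line, const char *key, unsigned long long *out)
{
    size_t klen = strlen(key);
    const char *p = line;

    while ((p = strstr(p, key)) != NULL) {
        int at_word_start = (p == line) || (p[-1] == ' ');

        if (at_word_start && p[klen] == '=') {
            char *end;
            unsigned long long v = strtoull(p + klen + 1, &end, 16);

            if (end == p + klen + 1)
                return -1;          /* "key=" with nothing after it */
            *out = v;
            return 0;
        }
        p += klen;                  /* partial match, keep scanning */
    }
    return -1;
}

/* Non-zero if the kernel reported any releases since the last read. */
int something_released(const char *line)
{
    unsigned long long files = 0, blocks = 0;

    status_value(line, "freleased", &files);
    status_value(line, "breleased", &blocks);
    return files > 0 || blocks > 0;
}
```

    A daemon could sleep in poll() on the control fd and only start a cull
    scan once something_released() reports a non-zero count.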

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Signed-off-by: Al Viro

    David Howells
     

23 Jan, 2016

1 commit

  • Add wrappers for ->i_mutex access, parallel to
    mutex_{lock,unlock,trylock,is_locked,lock_nested}:
    inode_foo(inode) is mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.
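
    The wrapper pattern can be modelled in user space as follows. The
    struct mutex and struct inode here are toy stand-ins so that the
    sketch compiles on its own. They are not the kernel definitions.

```c
/* Toy stand-ins; the kernel's struct mutex and struct inode are far richer. */
struct mutex { int locked; };
struct inode { struct mutex i_mutex; };

void mutex_lock(struct mutex *m)      { m->locked = 1; }
void mutex_unlock(struct mutex *m)    { m->locked = 0; }
int  mutex_is_locked(struct mutex *m) { return m->locked; }

/* The pattern: inode_foo(inode) is mutex_foo(&inode->i_mutex). */
void inode_lock(struct inode *inode)      { mutex_lock(&inode->i_mutex); }
void inode_unlock(struct inode *inode)    { mutex_unlock(&inode->i_mutex); }
int  inode_is_locked(struct inode *inode) { return mutex_is_locked(&inode->i_mutex); }
```

    Callers that go through the wrappers do not need to change when the
    type of the underlying lock changes.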

    Signed-off-by: Al Viro

    Al Viro
     

04 Jan, 2016

1 commit


17 Nov, 2015

1 commit

  • fs/cachefiles/rdwr.c: In function ‘cachefiles_write_page’:
    fs/cachefiles/rdwr.c:882: warning: ‘ret’ may be used uninitialized in
    this function

    If the jump to label "error" is taken, "ret" will indeed be
    uninitialized, and random stack data may be printed by the debug code.
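
    A reduced model of the pattern behind the warning is sketched below.
    The function name, its parameters and the choice of -ENOBUFS as the
    initial value are illustrative assumptions, not the actual patch.

```c
#include <errno.h>

/* Hypothetical reduction: every path to "error" must see a defined ret. */
int write_page_model(long long pos, long long eof)
{
    int ret = -ENOBUFS;     /* initialised before any jump to "error" */

    if (pos >= eof)
        goto error;         /* without the initialiser, ret is garbage here */

    ret = 0;                /* pretend the write itself succeeded */
    return ret;

error:
    /* debug code may print ret at this point */
    return ret;
}
```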

    Fixes: 102f4d900c9c8f5e ("FS-Cache: Handle a write to the page immediately beyond the EOF marker")
    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Geert Uytterhoeven
     

11 Nov, 2015

2 commits

  • Handle a write being requested to the page immediately beyond the EOF
    marker on a cache object. Currently this gets an assertion failure in
    CacheFiles because the EOF marker is used there to encode information about
    a partial page at the EOF - which could lead to an unknown blank spot in
    the file if we extend the file over it.

    The problem is actually in fscache where we check the index of the page
    being written against store_limit. store_limit is set to the number of
    pages that we're allowed to store by fscache_set_store_limit() - which
    means it's one more than the index of the last page we're allowed to store.
    The problem is that we permit writing to a page with an index _equal_ to
    the store limit - when we should reject that case.

    Whilst we're at it, change the triggered assertion in CacheFiles to just
    return -ENOBUFS instead.
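
    The off-by-one can be stated as a one-line predicate. The sketch below
    is a user-space model, and the helper name may_store() is hypothetical.

```c
/*
 * store_limit is a count of pages, so the valid indices are
 * 0 .. store_limit - 1. An index equal to store_limit must be rejected.
 */
int may_store(unsigned long index, unsigned long store_limit)
{
    return index < store_limit;   /* the buggy test also let index == store_limit through */
}
```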

    The assertion failure looks something like this:

    CacheFiles: Assertion failed
    1000 < 7b1 is false
    ------------[ cut here ]------------
    kernel BUG at fs/cachefiles/rdwr.c:962!
    ...
    RIP: 0010:[] [] cachefiles_write_page+0x273/0x2d0 [cachefiles]

    Cc: stable@vger.kernel.org # v2.6.31+; earlier - that + backport of a17754f (at least)
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • cachefiles requires that s_blocksize in the cache is not greater than
    PAGE_SIZE, and performs the check every time a block is accessed.

    Move the test to the place where the file is "opened", where other
    file-validity tests are performed.

    Signed-off-by: NeilBrown
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    NeilBrown
     

07 Nov, 2015

1 commit

  • __GFP_WAIT was used to signal that the caller was in atomic context and
    could not sleep. Now it is possible to distinguish between true atomic
    context and callers that are not willing to sleep. The latter should
    clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing
    __GFP_WAIT behaves differently, there is a risk that people will clear the
    wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
    indicate what it does -- setting it allows all reclaim activity, clearing
    them prevents it.
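
    The relationship between the flags can be sketched with stand-in bit
    values. The numeric values and helper names below are assumptions, and
    __GFP_KSWAPD_RECLAIM is the companion flag that the message above
    refers to only indirectly ("so kswapd will still wake").

```c
/* Stand-in bit values for illustration; the kernel's values differ. */
typedef unsigned int gfp_t;

#define __GFP_DIRECT_RECLAIM  0x1u   /* caller may enter direct reclaim (sleep) */
#define __GFP_KSWAPD_RECLAIM  0x2u   /* caller may wake kswapd */
#define __GFP_RECLAIM         (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)

/* A caller unwilling to sleep clears only the direct-reclaim bit. */
gfp_t gfp_no_sleep(gfp_t flags)
{
    return flags & ~__GFP_DIRECT_RECLAIM;
}

int gfp_may_sleep(gfp_t flags)    { return (flags & __GFP_DIRECT_RECLAIM) != 0; }
int gfp_wakes_kswapd(gfp_t flags) { return (flags & __GFP_KSWAPD_RECLAIM) != 0; }
```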

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Mel Gorman
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Acked-by: Johannes Weiner
    Cc: Christoph Lameter
    Acked-by: David Rientjes
    Cc: Vitaly Wool
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

24 Jun, 2015

1 commit


16 Apr, 2015

2 commits


24 Feb, 2015

1 commit


23 Feb, 2015

2 commits

  • Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack
    thereof) in cachefiles:

    (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as
    it doesn't want to deal with automounts in its cache.

    (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in
    cachefiles.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Convert the following where appropriate:

    (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

    (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

    (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
    complicated than it appears as some calls should be converted to
    d_can_lookup() instead. The difference is whether the directory in
    question is a real dir with a ->lookup op or whether it's a fake dir with
    a ->d_automount op.
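
    The distinction in (3) can be modelled as below. The enum values and
    the reduced struct dentry are stand-ins chosen for this sketch. Only
    the relationship between the two predicates is the point.

```c
/* Stand-in type codes; only the real-dir vs. automount-dir split matters. */
enum dentry_type {
    MISS_TYPE,          /* negative dentry */
    DIRECTORY_TYPE,     /* real directory with a ->lookup op */
    AUTODIR_TYPE,       /* fake directory with a ->d_automount op */
    REGULAR_TYPE,
    SYMLINK_TYPE
};

struct dentry { enum dentry_type type; };

/* True only for directories that can actually be looked up in. */
int d_can_lookup(const struct dentry *d)
{
    return d->type == DIRECTORY_TYPE;
}

/* True for anything that presents as a directory, automount points included. */
int d_is_dir(const struct dentry *d)
{
    return d_can_lookup(d) || d->type == AUTODIR_TYPE;
}
```

    Code that goes on to walk into the directory wants d_can_lookup(),
    while code that only classifies the entry can use d_is_dir().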

    In some circumstances, we can subsume checks for dentry->d_inode not being
    NULL into this, provided the code isn't in a filesystem that expects
    d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
    use d_inode() rather than d_backing_inode() to get the inode pointer).

    Note that the dentry type field may be set to something other than
    DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
    manages the fall-through from a negative dentry to a lower layer. In such a
    case, the dentry type of the negative union dentry is set to the same as the
    type of the lower dentry.

    However, if you know d_inode is not NULL at the call site, then you can use
    the d_is_xxx() functions even in a filesystem.

    There is one further complication: a 0,0 chardev dentry may be labelled
    DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
    intended for special directory entry types that don't have attached inodes.

    The following perl+coccinelle script was used:

    use strict;

    my @callers;
    my $fd;
    open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
    die "Can't grep for S_ISDIR and co. callers";
    @callers = <$fd>;
    close($fd);
    unless (@callers) {
    print "No matches\n";
    exit(0);
    }

    my @cocci = (
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISLNK(E->d_inode->i_mode)',
    '+ d_is_symlink(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISDIR(E->d_inode->i_mode)',
    '+ d_is_dir(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISREG(E->d_inode->i_mode)',
    '+ d_is_reg(E)' );

    my $coccifile = "tmp.sp.cocci";
    open($fd, ">$coccifile") || die $coccifile;
    print($fd "$_\n") || die $coccifile foreach (@cocci);
    close($fd);

    foreach my $file (@callers) {
    chomp $file;
    print "Processing ", $file, "\n";
    system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
    die "spatch failed";
    }

    [AV: overlayfs parts skipped]

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

20 Nov, 2014

1 commit


14 Oct, 2014

2 commits

  • …git/dhowells/linux-fs

    Pull fs-cache fixes from David Howells:
    "Two fixes for bugs in CacheFiles and a cleanup in FS-Cache"

    * tag 'fscache-fixes-20141013' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    fs/fscache/object-list.c: use __seq_open_private()
    CacheFiles: Fix incorrect test for in-memory object collision
    CacheFiles: Handle object being killed before being set up

    Linus Torvalds
     
  • When CacheFiles cache objects are in use, they have in-memory representations,
    as defined by the cachefiles_object struct. These are kept in a tree rooted in
    the cache and indexed by dentry pointer (since there's a unique mapping between
    object index key and dentry).

    Collisions can occur between a representation already in the tree and a new
    representation being set up because it takes time to dispose of an old
    representation - particularly if it must be unlinked or renamed.

    When such a collision occurs, cachefiles_mark_object_active() is meant to check
    to see if the old, already-present representation is in the process of being
    discarded (ie. FSCACHE_OBJECT_IS_LIVE is not set on it) - and, if so, wait for
    the representation to be removed (ie. CACHEFILES_OBJECT_ACTIVE is then
    cleared).

    However, the test for whether the old representation is still live is checking
    the new object - which always will be live at this point. This leads to an
    oops looking like:

    CacheFiles: Error: Unexpected object collision
    object: OBJ1b354
    objstate=LOOK_UP_OBJECT fl=8 wbusy=2 ev=0[0]
    ops=0 inp=0 exc=0
    parent=ffff88053f5417c0
    cookie=ffff880538f202a0 [pr=ffff8805381b7160 nd=ffff880509c6eb78 fl=27]
    key=[8] '2490000000000000'
    xobject: OBJ1a600
    xobjstate=DROP_OBJECT fl=70 wbusy=2 ev=0[0]
    xops=0 inp=0 exc=0
    xparent=ffff88053f5417c0
    xcookie=ffff88050f4cbf70 [pr=ffff8805381b7160 nd= (null) fl=12]
    ------------[ cut here ]------------
    kernel BUG at fs/cachefiles/namei.c:200!
    ...
    Workqueue: fscache_object fscache_object_work_func [fscache]
    ...
    RIP: ... cachefiles_walk_to_object+0x7ea/0x860 [cachefiles]
    ...
    Call Trace:
    [] ? cachefiles_lookup_object+0x58/0x100 [cachefiles]
    [] ? fscache_look_up_object+0xb9/0x1d0 [fscache]
    [] ? fscache_parent_ready+0x2d/0x80 [fscache]
    [] ? fscache_object_work_func+0x92/0x1f0 [fscache]
    [] ? process_one_work+0x16b/0x400
    [] ? worker_thread+0x116/0x380
    [] ? manage_workers.isra.21+0x290/0x290
    [] ? kthread+0xbc/0xe0
    [] ? flush_kthread_worker+0x80/0x80
    [] ? ret_from_fork+0x7c/0xb0
    [] ? flush_kthread_worker+0x80/0x80

    Reported-by: Manuel Schölling
    Signed-off-by: David Howells
    Acked-by: Steve Dickson

    David Howells
     

09 Oct, 2014

1 commit


30 Sep, 2014

1 commit

  • If a cache object gets killed whilst in the process of being set up - for
    instance if the netfs relinquishes the cookie that the object is associated
    with - then the object's state machine will transit to the DROP_OBJECT state
    without necessarily going through the LOOKUP_OBJECT or CREATE_OBJECT states.

    This is a problem for CacheFiles because cachefiles_drop_object() assumes that
    object->dentry will be set upon reaching the DROP_OBJECT state and has an
    ASSERT() to that effect (see the oops below) - but object->dentry doesn't get
    set until the LOOKUP_OBJECT or CREATE_OBJECT states (and not always then if
    they fail).

    To fix this, just make the dentry cleanup in cachefiles_drop_object()
    conditional on the dentry actually being set and remove the assertion.

    CacheFiles: Assertion failed
    ------------[ cut here ]------------
    kernel BUG at .../fs/cachefiles/namei.c:425!
    ...
    Workqueue: fscache_object fscache_object_work_func [fscache]
    ...
    RIP: ... cachefiles_delete_object+0xcd/0x110 [cachefiles]
    ...
    Call Trace:
    [] ? cachefiles_drop_object+0xff/0x130 [cachefiles]
    [] ? fscache_drop_object+0xd1/0x1d0 [fscache]
    [] ? fscache_object_work_func+0x87/0x210 [fscache]
    [] ? process_one_work+0x155/0x450
    [] ? worker_thread+0x114/0x370
    [] ? manage_workers.isra.21+0x2c0/0x2c0
    [] ? kthread+0xbc/0xe0
    [] ? flush_kthread_worker+0xa0/0xa0
    [] ? ret_from_fork+0x7c/0xb0
    [] ? flush_kthread_worker+0xa0/0xa0

    Reported-by: Manuel Schölling
    Signed-off-by: David Howells
    Acked-by: Steve Dickson

    David Howells
     

26 Sep, 2014

1 commit


18 Sep, 2014

2 commits

  • Not all filesystems now provide the rename i_op - ext4 for one - but rather
    provide the rename2 i_op. CacheFiles checks that the filesystem has rename
    and so will reject ext4 now with EPERM:

    CacheFiles: Failed to register: -1

    Fix this by checking for rename2 as an alternative. The call to vfs_rename()
    actually handles selection of the appropriate function, so we needn't worry
    about that.
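
    The relaxed check amounts to accepting either operation. The sketch
    below is a user-space model in which the operations table is reduced
    to the two relevant members with simplified signatures. The names
    check_rename_support() and dummy_rename() are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced stand-in for the inode operations table. */
struct inode_operations {
    int (*rename)(void);    /* simplified signature */
    int (*rename2)(void);   /* simplified signature */
};

int dummy_rename(void) { return 0; }

/* Accept a backing filesystem that provides either rename or rename2. */
int check_rename_support(const struct inode_operations *ops)
{
    if (!ops->rename && !ops->rename2)
        return -EPERM;
    return 0;
}
```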

    Turning on debugging shows:

    [cachef] ==> cachefiles_get_directory(,,cache)
    [cachef] subdir -> ffff88000b22b778 positive
    [cachef]

    David Howells
     
  • These two have been unused since

    commit c4d6d8dbf335c7fa47341654a37c53a512b519bb
    CacheFiles: Fix the marking of cached pages

    in 3.8.

    Signed-off-by: NeilBrown
    Signed-off-by: David Howells

    NeilBrown
     

07 Jun, 2014

2 commits


13 Apr, 2014

1 commit

  • Pull vfs updates from Al Viro:
    "The first vfs pile, with deep apologies for being very late in this
    window.

    Assorted cleanups and fixes, plus a large preparatory part of iov_iter
    work. There's a lot more of that, but it'll probably go into the next
    merge window - it *does* shape up nicely, removes a lot of
    boilerplate, gets rid of locking inconsistencies between aio_write and
    splice_write and I hope to get Kent's direct-io rewrite merged into
    the same queue, but some of the stuff after this point is having
    (mostly trivial) conflicts with the things already merged into
    mainline and with some I want more testing.

    This one passes LTP and xfstests without regressions, in addition to
    usual beating. BTW, readahead02 in ltp syscalls testsuite has started
    giving failures since "mm/readahead.c: fix readahead failure for
    memoryless NUMA nodes and limit readahead pages" - might be a false
    positive, might be a real regression..."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    missing bits of "splice: fix racy pipe->buffers uses"
    cifs: fix the race in cifs_writev()
    ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
    kill generic_file_buffered_write()
    ocfs2_file_aio_write(): switch to generic_perform_write()
    ceph_aio_write(): switch to generic_perform_write()
    xfs_file_buffered_aio_write(): switch to generic_perform_write()
    export generic_perform_write(), start getting rid of generic_file_buffer_write()
    generic_file_direct_write(): get rid of ppos argument
    btrfs_file_aio_write(): get rid of ppos
    kill the 5th argument of generic_file_buffered_write()
    kill the 4th argument of __generic_file_aio_write()
    lustre: don't open-code kernel_recvmsg()
    ocfs2: don't open-code kernel_recvmsg()
    drbd: don't open-code kernel_recvmsg()
    constify blk_rq_map_user_iov() and friends
    lustre: switch to kernel_sendmsg()
    ocfs2: don't open-code kernel_sendmsg()
    take iov_iter stuff to mm/iov_iter.c
    process_vm_access: tidy up a bit
    ...

    Linus Torvalds
     

05 Apr, 2014

1 commit

  • Pull renameat2 system call from Miklos Szeredi:
    "This adds a new syscall, renameat2(), which is the same as renameat()
    but with a flags argument.

    The purpose of extending rename is to add cross-rename, a symmetric
    variant of rename, which exchanges the two files. This allows
    interesting things, which were not possible before, for example
    atomically replacing a directory tree with a symlink, etc... This
    also allows overlayfs and friends to operate on whiteouts atomically.

    Andy Lutomirski also suggested a "noreplace" flag, which disables the
    overwriting behavior of rename.

    These two flags, RENAME_EXCHANGE and RENAME_NOREPLACE are only
    implemented for ext4 as an example and for testing"

    * 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
    ext4: add cross rename support
    ext4: rename: split out helper functions
    ext4: rename: move EMLINK check up
    ext4: rename: create ext4_renament structure for local vars
    vfs: add cross-rename
    vfs: lock_two_nondirectories: allow directory args
    security: add flags to rename hooks
    vfs: add RENAME_NOREPLACE flag
    vfs: add renameat2 syscall
    vfs: rename: use common code for dir and non-dir
    vfs: rename: move d_move() up
    vfs: add d_is_dir()

    Linus Torvalds
     

04 Apr, 2014

1 commit

  • This code used to have its own lru cache pagevec up until a0b8cab3 ("mm:
    remove lru parameter from __pagevec_lru_add and remove parts of pagevec
    API"). Now it's just add_to_page_cache() followed by lru_cache_add(),
    might as well use add_to_page_cache_lru() directly.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: Minchan Kim
    Cc: Andrea Arcangeli
    Cc: Bob Liu
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Greg Thelen
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: KOSAKI Motohiro
    Cc: Luigi Semenzato
    Cc: Mel Gorman
    Cc: Metin Doslu
    Cc: Michel Lespinasse
    Cc: Ozgun Erdogan
    Cc: Peter Zijlstra
    Cc: Roman Gushchin
    Cc: Ryan Mallon
    Cc: Tejun Heo
    Cc: Vlastimil Babka
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

02 Apr, 2014

1 commit


01 Apr, 2014

2 commits


13 Nov, 2013

1 commit

  • Pull vfs updates from Al Viro:
    "All kinds of stuff this time around; some more notable parts:

    - RCU'd vfsmounts handling
    - new primitives for coredump handling
    - files_lock is gone
    - Bruce's delegations handling series
    - exportfs fixes

    plus misc stuff all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
    ecryptfs: ->f_op is never NULL
    locks: break delegations on any attribute modification
    locks: break delegations on link
    locks: break delegations on rename
    locks: helper functions for delegation breaking
    locks: break delegations on unlink
    namei: minor vfs_unlink cleanup
    locks: implement delegations
    locks: introduce new FL_DELEG lock flag
    vfs: take i_mutex on renamed file
    vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
    vfs: don't use PARENT/CHILD lock classes for non-directories
    vfs: pull ext4's double-i_mutex-locking into common code
    exportfs: fix quadratic behavior in filehandle lookup
    exportfs: better variable name
    exportfs: move most of reconnect_path to helper function
    exportfs: eliminate unused "noprogress" counter
    exportfs: stop retrying once we race with rename/remove
    exportfs: clear DISCONNECTED on all parents sooner
    exportfs: more detailed comment for path_reconnect
    ...

    Linus Torvalds
     

09 Nov, 2013

3 commits

  • NFSv4 uses leases to guarantee that clients can cache metadata as well
    as data.

    Cc: Mikulas Patocka
    Cc: David Howells
    Cc: Tyler Hicks
    Cc: Dustin Kirkland
    Acked-by: Jeff Layton
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Al Viro

    J. Bruce Fields
     
  • Cc: David Howells
    Acked-by: Jeff Layton
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Al Viro

    J. Bruce Fields
     
  • We need to break delegations on any operation that changes the set of
    links pointing to an inode. Start with unlink.

    Such operations also hold the i_mutex on a parent directory. Breaking a
    delegation may require waiting for a timeout (by default 90 seconds) in
    the case of an unresponsive NFS client. To avoid blocking all directory
    operations, we therefore drop locks before waiting for the delegation.
    The logic then looks like:

    acquire locks
    ...
    test for delegation; if found:
    take reference on inode
    release locks
    wait for delegation break
    drop reference on inode
    retry

    It is possible this could never terminate. (Even if we take precautions
    to prevent another delegation being acquired on the same inode, we could
    get a different inode on each retry.) But this seems very unlikely.

    The initial test for a delegation happens after the lock on the target
    inode is acquired, but the directory inode may have been acquired
    further up the call stack. We therefore add a "struct inode **"
    argument to any intervening functions, which we use to pass the inode
    back up to the caller in the case it needs a delegation synchronously
    broken.
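
    The retry logic and the "struct inode **" hand-off can be modelled in
    user space as below. All types and helpers are toy stand-ins written
    for this sketch, and the -EWOULDBLOCK return value is an illustrative
    choice.

```c
#include <errno.h>
#include <stddef.h>

struct inode { int delegated; int refcount; };

/* Model of an operation that must not proceed while a delegation exists. */
int try_unlink(struct inode *target, struct inode **delegated_inode)
{
    /* locks would be acquired here */
    if (target->delegated) {
        target->refcount++;             /* take reference on inode */
        *delegated_inode = target;      /* hand it back up to the caller */
        return -EWOULDBLOCK;            /* locks are released on the way out */
    }
    return 0;                           /* the unlink itself would happen here */
}

/* Model of waiting for the delegation break with no locks held. */
void break_deleg_wait(struct inode **delegated_inode)
{
    (*delegated_inode)->delegated = 0;  /* the client returned the delegation */
    (*delegated_inode)->refcount--;     /* drop reference on inode */
    *delegated_inode = NULL;
}

int unlink_with_retry(struct inode *target)
{
    struct inode *delegated_inode = NULL;
    int error;

    for (;;) {
        error = try_unlink(target, &delegated_inode);
        if (!delegated_inode)
            return error;               /* success or an unrelated error */
        break_deleg_wait(&delegated_inode);
        /* retry */
    }
}
```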

    Cc: David Howells
    Cc: Tyler Hicks
    Cc: Dustin Kirkland
    Acked-by: Jeff Layton
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Al Viro

    J. Bruce Fields
     

28 Sep, 2013

1 commit

  • Provide the ability to enable and disable fscache cookies. A disabled cookie
    will reject or ignore further requests to:

    Acquire a child cookie
    Invalidate and update backing objects
    Check the consistency of a backing object
    Allocate storage for backing page
    Read backing pages
    Write to backing pages

    but still allows:

    Checks/waits on the completion of already in-progress objects
    Uncaching of pages
    Relinquishment of cookies

    Two new operations are provided:

    (1) Disable a cookie:

    void fscache_disable_cookie(struct fscache_cookie *cookie,
    bool invalidate);

    If the cookie is not already disabled, this locks the cookie against other
    dis/enablement ops, marks the cookie as being disabled, discards or
    invalidates any backing objects and waits for cessation of activity on any
    associated object.

    This is a wrapper around a chunk split out of fscache_relinquish_cookie(),
    but it reinitialises the cookie such that it can be reenabled.

    All possible failures are handled internally. The caller should consider
    calling fscache_uncache_all_inode_pages() afterwards to make sure all page
    markings are cleared up.

    (2) Enable a cookie:

    void fscache_enable_cookie(struct fscache_cookie *cookie,
    bool (*can_enable)(void *data),
    void *data)

    If the cookie is not already enabled, this locks the cookie against other
    dis/enablement ops, invokes can_enable() and, if the cookie is not an
    index cookie, will begin the procedure of acquiring backing objects.

    The optional can_enable() function is passed the data argument and returns
    a ruling as to whether or not enablement should actually be permitted to
    begin.

    All possible failures are handled internally. The cookie will only be
    marked as enabled if provisional backing objects are allocated.
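
    The control flow of the enable operation can be modelled as below.
    This is a user-space sketch with a toy cookie type. It assumes that
    acquiring backing objects always succeeds, and the names
    enable_cookie_model() and allow_if_nonzero() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy cookie; the real struct fscache_cookie carries much more state. */
struct cookie {
    bool enabled;
    bool is_index;
    int  lookups_started;
};

void enable_cookie_model(struct cookie *c,
                         bool (*can_enable)(void *data),
                         void *data)
{
    if (c->enabled)
        return;                         /* already enabled: nothing to do */

    if (can_enable && !can_enable(data))
        return;                         /* the optional callback vetoed it */

    if (!c->is_index)
        c->lookups_started++;           /* begin acquiring backing objects */

    c->enabled = true;
}

/* Example callback: permit enablement only if *data is non-zero. */
bool allow_if_nonzero(void *data)
{
    return *(int *)data != 0;
}
```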

    A later patch will introduce these to NFS. Cookie enablement during nfs_open()
    is then contingent on i_writecount ...

    David Howells
     

21 Sep, 2013

2 commits

  • Don't try to dump the index key that distinguishes an object if netfs
    data in the cookie the object refers to has been cleared (ie. the
    cookie has passed most of the way through
    __fscache_relinquish_cookie()).

    Since the netfs holds the index key, we can't get at it once the ->def
    and ->netfs_data pointers have been cleared - and a NULL pointer
    exception will ensue, usually just after a:

    CacheFiles: Error: Unexpected object collision

    error is reported.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     
  • In cachefiles_check_auxdata(), we allocate auxbuf but fail to free it if
    we determine there's an error or that the data is stale.

    Further, assigning the output of vfs_getxattr() to auxbuf->len gives
    problems with checking for errors as auxbuf->len is a u16. We don't
    actually need to set auxbuf->len, so keep the length in a variable for
    now. We shouldn't need to check the upper limit of the buffer as an
    overflow there should be indicated by -ERANGE.
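
    The u16 problem is easy to reproduce in user space. The sketch below
    uses hypothetical helper names and a plain long in place of the
    return type of vfs_getxattr().

```c
#include <errno.h>
#include <stdint.h>

/* What storing the return value straight into a u16 field amounts to. */
int stored_len(long xattr_ret)
{
    uint16_t len = (uint16_t)xattr_ret;   /* a negative errno wraps around */
    return len;                           /* never negative */
}

/* Keep the result in a signed variable and test it before narrowing. */
int checked_len(long xattr_ret, uint16_t *len_out)
{
    if (xattr_ret < 0)
        return (int)xattr_ret;            /* error is still visible */
    *len_out = (uint16_t)xattr_ret;
    return 0;
}
```

    With the first form, a test such as "len < 0" can never fire, so the
    error is silently treated as a large length.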

    While we're at it, fscache_check_aux() returns an enum value, not an
    int, so assign it to an appropriately typed variable rather than to ret.

    Signed-off-by: Josh Boyer
    Signed-off-by: David Howells
    cc: Hongyi Jia
    cc: Milosz Tanski
    Signed-off-by: Linus Torvalds

    Josh Boyer
     

06 Sep, 2013

1 commit


04 Jul, 2013

1 commit

  • Now that the LRU to add a page to is decided at LRU-add time, remove the
    misleading lru parameter from __pagevec_lru_add. A consequence of this
    is that the pagevec_lru_add_file, pagevec_lru_add_anon and similar
    helpers are misleading as the caller no longer has direct control over
    what LRU the page is added to. Unused helpers are removed by this patch
    and existing users of pagevec_lru_add_file() are converted to use
    lru_cache_add_file() directly and use the per-cpu pagevecs instead of
    creating their own pagevec.

    Signed-off-by: Mel Gorman
    Reviewed-by: Jan Kara
    Reviewed-by: Rik van Riel
    Acked-by: Johannes Weiner
    Cc: Alexey Lyahkov
    Cc: Andrew Perepechko
    Cc: Robin Dong
    Cc: Theodore Tso
    Cc: Hugh Dickins
    Cc: Rik van Riel
    Cc: Bernd Schubert
    Cc: David Howells
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

19 Jun, 2013

1 commit