31 Aug, 2016

1 commit

  • register_shrinker() in mb_cache_create() may fail when memory is
    exhausted. Check the return value of register_shrinker() and handle
    the error case; otherwise mb_cache_create() may return without an
    error even though the cache's shrinker was never registered.

    Signed-off-by: Chao Yu
    Reviewed-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Chao Yu
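
The error handling described above can be sketched in userspace C. This is a minimal model, not the kernel code: every `toy_` name, the struct layouts, and the `toy_fail_registration` switch are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-ins; none of these are the kernel's real definitions. */
struct toy_shrinker { int registered; };

struct toy_cache {
    struct toy_shrinker shrink;
    int *buckets;
};

/* Set to nonzero to simulate an out-of-memory failure during registration. */
static int toy_fail_registration;

static int toy_register_shrinker(struct toy_shrinker *s)
{
    if (toy_fail_registration)
        return -ENOMEM;
    s->registered = 1;
    return 0;
}

/* Returns a cache, or NULL if any step fails, including registration. */
static struct toy_cache *toy_cache_create(size_t bucket_count)
{
    struct toy_cache *cache = malloc(sizeof(*cache));

    if (!cache)
        return NULL;
    cache->shrink.registered = 0;
    cache->buckets = calloc(bucket_count, sizeof(*cache->buckets));
    if (!cache->buckets)
        goto err_cache;
    /* The fix: act on the return value instead of ignoring it. */
    if (toy_register_shrinker(&cache->shrink) != 0)
        goto err_buckets;
    return cache;

err_buckets:
    free(cache->buckets);
err_cache:
    free(cache);
    return NULL;
}

static void toy_cache_destroy(struct toy_cache *cache)
{
    if (!cache)
        return;
    free(cache->buckets);
    free(cache);
}
```

With the check in place, a caller sees NULL instead of receiving a cache that can never be shrunk under memory pressure.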
     

23 Feb, 2016

4 commits

  • To reduce the amount of damage caused by a single bad block, we limit
    the number of inodes sharing an xattr block to 1024. Thus there can be
    several xattr blocks with the same contents when there are lots of files
    with the same extended attributes. These xattr blocks naturally result
    in hash collisions and can form long hash chains, and we unnecessarily
    check each such block only to find out we cannot use it because it is
    already shared by too many inodes.

    Add a reusable flag to cache entries which is cleared when a cache entry
    has reached its maximum refcount. Cache entries which are not marked
    reusable are skipped by mb_cache_entry_find_{first,next}. This
    significantly speeds up mbcache when there are many same xattr blocks.
    For example for xattr-bench with 5 values and each process handling
    20000 files, the run for 64 processes is 25x faster with this patch.
    Even for 8 processes the speedup is almost 3x. We have also verified
    that for situations where there is only one xattr block of each kind,
    the patch doesn't have a measurable cost.

    [JK: Remove handling of setting the same value since it is not needed
    anymore, check for races in e_reusable setting, improve changelog,
    add measurements]

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Andreas Gruenbacher
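
The reusable-flag idea can be sketched as a lookup that skips full entries. This is a simplified userspace model under stated assumptions: the `toy_` names and the singly linked chain are hypothetical, and only the 1024 limit comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REFCOUNT 1024  /* the sharing limit named in the commit message */

struct toy_entry {
    struct toy_entry *next;   /* next entry on the same hash chain */
    unsigned int key;         /* hash of the xattr block contents */
    unsigned int refcount;    /* inodes currently sharing the block */
    bool reusable;            /* cleared once refcount reaches the limit */
};

/* Take one more reference; clear the flag when the block becomes full. */
static void toy_entry_share(struct toy_entry *e)
{
    e->refcount++;
    if (e->refcount >= TOY_MAX_REFCOUNT)
        e->reusable = false;
}

/* Return the first reusable entry with a matching key, skipping full ones. */
static struct toy_entry *toy_find(struct toy_entry *chain, unsigned int key)
{
    for (struct toy_entry *e = chain; e; e = e->next)
        if (e->key == key && e->reusable)
            return e;
    return NULL;
}
```

The saving comes from rejecting a full entry by testing one flag during the chain walk, rather than handing it to the caller to inspect and discard.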
     
  • Get rid of field _e_hash_list_head in cache entries and add bit field
    e_referenced instead.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Andreas Gruenbacher
     
  • Since old mbcache code is gone, let's rename new code to mbcache since
    number 2 is now meaningless. This is just a mechanical replacement.

    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • Both ext2 and ext4 are now converted to mbcache2. Remove the old mbcache
    code.

    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
     

26 Jun, 2014

1 commit


19 Mar, 2014

3 commits

  • This patch adds new interfaces to create and destroy the cache,
    ext4_xattr_create_cache() and ext4_xattr_destroy_cache(), and removes
    the cache creation and destruction calls from ext4_init_xattr() and
    ext4_exit_xattr() in fs/ext4/xattr.c.

    fs/ext4/super.c has been changed so that when a filesystem is mounted
    a cache is allocated and attached to its ext4_sb_info structure.

    fs/mbcache.c has been changed so that only one slab allocator is
    allocated and used by all mbcache structures.

    Signed-off-by: T. Makphaibulchoke

    T Makphaibulchoke
     
  • The patch increases the parallelism of mbcache by using the built-in
    lock in the hlist_bl_node to protect the mb_cache's local block and
    index hash chains. The global data mb_cache_lru_list and
    mb_cache_list continue to be protected by the global
    mb_cache_spinlock.

    A new block group spinlock, mb_cache_bg_lock, is also added to serialize
    accesses to mb_cache_entry's local data.

    A new member e_refcnt is added to the mb_cache_entry structure to help
    prevent an mb_cache_entry from being deallocated by a free while it
    is being referenced by either mb_cache_entry_get() or
    mb_cache_entry_find().

    Signed-off-by: T. Makphaibulchoke
    Signed-off-by: "Theodore Ts'o"

    T Makphaibulchoke
     
  • This patch changes both the block and index hash chains of each
    mb_cache to use hlist_bl_node, which contains a built-in lock. This is
    the first step in decoupling the locks that serialize accesses to
    mb_cache global data from those for each mb_cache_entry's local data.

    Signed-off-by: T. Makphaibulchoke
    Signed-off-by: "Theodore Ts'o"

    T Makphaibulchoke
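
A "built-in lock" of this kind is commonly implemented by using the lowest bit of the list head pointer as a spinlock bit. The following is a userspace sketch of that idea with C11 atomics. The `toy_` names are hypothetical and this is not the kernel's hlist_bl implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct toy_node {
    struct toy_node *next;
    int value;
};

/* Head pointer whose lowest bit doubles as a lock bit. */
struct toy_bl_head {
    atomic_uintptr_t first;
};

#define TOY_LOCK_BIT ((uintptr_t)1)

static void toy_bl_lock(struct toy_bl_head *h)
{
    /* Spin until this caller is the one that flips the bit from 0 to 1. */
    while (atomic_fetch_or_explicit(&h->first, TOY_LOCK_BIT,
                                    memory_order_acquire) & TOY_LOCK_BIT)
        ;
}

static void toy_bl_unlock(struct toy_bl_head *h)
{
    atomic_fetch_and_explicit(&h->first, ~TOY_LOCK_BIT, memory_order_release);
}

/* Mask the lock bit off to recover the real pointer. */
static struct toy_node *toy_bl_first(struct toy_bl_head *h)
{
    return (struct toy_node *)(atomic_load(&h->first) & ~TOY_LOCK_BIT);
}

/* Caller must hold the lock; the lock bit is preserved in the new head. */
static void toy_bl_add_head(struct toy_node *n, struct toy_bl_head *h)
{
    n->next = toy_bl_first(h);
    atomic_store(&h->first, (uintptr_t)n | TOY_LOCK_BIT);
}
```

Because each chain carries its own lock bit, two threads working on different hash chains never contend, and the head needs no extra storage for a lock. The trick relies on nodes being aligned so that bit 0 of their address is always zero.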
     

11 Sep, 2013

2 commits

  • Convert the filesystem shrinkers to use the new API, and standardise some
    of the behaviours of the shrinkers at the same time. For example,
    nr_to_scan means the number of objects to scan, not the number of objects
    to free.

    I refactored the CIFS idmap shrinker a little - it really needs to be
    broken up into a shrinker per tree and keep an item count with the tree
    root so that we don't need to walk the tree every time the shrinker needs
    to count the number of objects in the tree (i.e. all the time under
    memory pressure).

    [glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree]
    [assorted fixes folded in]
    Signed-off-by: Dave Chinner
    Signed-off-by: Glauber Costa
    Acked-by: Mel Gorman
    Acked-by: Artem Bityutskiy
    Acked-by: Jan Kara
    Acked-by: Steven Whitehouse
    Cc: Adrian Hunter
    Cc: "Theodore Ts'o"
    Cc: Adrian Hunter
    Cc: Al Viro
    Cc: Artem Bityutskiy
    Cc: Arve Hjønnevåg
    Cc: Carlos Maiolino
    Cc: Christoph Hellwig
    Cc: Chuck Lever
    Cc: Daniel Vetter
    Cc: David Rientjes
    Cc: Gleb Natapov
    Cc: Greg Thelen
    Cc: J. Bruce Fields
    Cc: Jan Kara
    Cc: Jerome Glisse
    Cc: John Stultz
    Cc: KAMEZAWA Hiroyuki
    Cc: Kent Overstreet
    Cc: Kirill A. Shutemov
    Cc: Marcelo Tosatti
    Cc: Mel Gorman
    Cc: Steven Whitehouse
    Cc: Thomas Hellstrom
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton

    Signed-off-by: Al Viro

    Dave Chinner
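
The nr_to_scan semantics can be illustrated with a small userspace model that separates a "count" callback from a "scan" callback, as the new shrinker API is understood to do. All `toy_` names and the fixed-size object array are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR 8

/* Hypothetical model; not the kernel's struct shrink_control. */
struct toy_shrink_control { unsigned long nr_to_scan; };

struct toy_obj { int in_use; int pinned; };
struct toy_cache { struct toy_obj objs[TOY_NR]; };

/* "count" callback: how many objects the cache currently holds. */
static unsigned long toy_count(const struct toy_cache *c)
{
    unsigned long n = 0;

    for (size_t i = 0; i < TOY_NR; i++)
        if (c->objs[i].in_use)
            n++;
    return n;
}

/* "scan" callback: examine up to nr_to_scan objects, return how many were freed. */
static unsigned long toy_scan(struct toy_cache *c,
                              const struct toy_shrink_control *sc)
{
    unsigned long scanned = 0, freed = 0;

    for (size_t i = 0; i < TOY_NR && scanned < sc->nr_to_scan; i++) {
        if (!c->objs[i].in_use)
            continue;
        scanned++;               /* counts against the budget even if not freed */
        if (c->objs[i].pinned)
            continue;
        c->objs[i].in_use = 0;
        freed++;
    }
    return freed;
}
```

In this model a scan budget of three over two pinned objects and one free object releases only one object: the budget bounds the work done, not the amount reclaimed.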
     
  • The sysctl knob sysctl_vfs_cache_pressure is used to determine which
    percentage of the shrinkable objects in our cache we should actively try
    to shrink.

    It works great in situations in which we have many objects (at least more
    than 100), because the approximation errors will be negligible. But if
    this is not the case, especially when total_objects < 100, we may end up
    concluding that we have no objects at all (total / 100 = 0, if total <
    100).

    This is certainly not the biggest killer in the world, but may matter in
    very low kernel memory situations.

    Signed-off-by: Glauber Costa
    Reviewed-by: Carlos Maiolino
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Mel Gorman
    Cc: Dave Chinner
    Cc: Al Viro
    Cc: "Theodore Ts'o"
    Cc: Adrian Hunter
    Cc: Al Viro
    Cc: Artem Bityutskiy
    Cc: Arve Hjønnevåg
    Cc: Carlos Maiolino
    Cc: Christoph Hellwig
    Cc: Chuck Lever
    Cc: Daniel Vetter
    Cc: David Rientjes
    Cc: Gleb Natapov
    Cc: Greg Thelen
    Cc: J. Bruce Fields
    Cc: Jan Kara
    Cc: Jerome Glisse
    Cc: John Stultz
    Cc: KAMEZAWA Hiroyuki
    Cc: Kent Overstreet
    Cc: Kirill A. Shutemov
    Cc: Marcelo Tosatti
    Cc: Mel Gorman
    Cc: Steven Whitehouse
    Cc: Thomas Hellstrom
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Glauber Costa
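
The rounding problem is easy to reproduce in plain C. The sketch below contrasts dividing first with a quotient-plus-remainder computation. The function names are hypothetical and the second function is one possible remedy, not necessarily the exact code of the patch.

```c
#include <assert.h>

/* Divide first: the result collapses to 0 whenever total < 100. */
static unsigned long toy_ratio_divide_first(unsigned long total,
                                            unsigned long pressure)
{
    return (total / 100) * pressure;
}

/*
 * Handle quotient and remainder separately so that small totals still
 * contribute, without forming the full product total * pressure.
 */
static unsigned long toy_ratio_mult_frac(unsigned long total,
                                         unsigned long pressure)
{
    unsigned long quot = total / 100;
    unsigned long rem = total % 100;

    return quot * pressure + (rem * pressure) / 100;
}
```

For 50 objects at a pressure of 100, the first function reports 0 objects while the second reports 50.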
     

25 May, 2011

1 commit

  • Change each shrinker's API by consolidating the existing parameters into
    a shrink_control struct. This will simplify adding further features
    without touching every file that defines a shrinker.

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: fix warning]
    [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
    [akpm@linux-foundation.org: fix xfs warning]
    [akpm@linux-foundation.org: update gfs2]
    Signed-off-by: Ying Han
    Cc: KOSAKI Motohiro
    Cc: Minchan Kim
    Acked-by: Pavel Emelyanov
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Steven Whitehouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ying Han
     

31 Mar, 2011

1 commit


11 Jan, 2011

1 commit

  • When I enable EXT2_XATTR_DEBUG in fs/ext2/xattr.c I get a build error stating
    the following:

    CC fs/ext2/xattr.o
    fs/ext2/xattr.c: In function 'ext2_xattr_cache_insert':
    fs/ext2/xattr.c:841: error: dereferencing pointer to incomplete type
    fs/ext2/xattr.c:846: error: dereferencing pointer to incomplete type
    make[2]: *** [fs/ext2/xattr.o] Error 1
    make[1]: *** [fs/ext2] Error 2
    make: *** [fs] Error 2

    These lines reference ext2_xattr_cache->c_entry_count which is defined
    in struct mb_cache. struct mb_cache is currently only defined in fs/mbcache.c.
    Moving struct mb_cache definition to include/linux/mbcache.h to resolve the
    issue.

    Signed-off-by: Josh Hunt
    Signed-off-by: Jan Kara

    Josh Hunt
     

18 Aug, 2010

1 commit

  • Limit the maximum number of mb_cache entries depending on the number of
    hash buckets: if the only limit to the number of cache entries is the
    available memory the hash chains can grow very long, taking a long time
    to search.

    At least partially solves https://bugzilla.lustre.org/show_bug.cgi?id=22771.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
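
A cap derived from the bucket count might look like the following userspace sketch. The factor of 16 entries per bucket and all `toy_` names are assumptions chosen for illustration, not values taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_ENTRIES_PER_BUCKET 16  /* hypothetical factor */

struct toy_cache {
    unsigned long bucket_count;
    unsigned long entry_count;
    unsigned long max_entries;
};

static void toy_cache_init(struct toy_cache *c, unsigned long bucket_count)
{
    c->bucket_count = bucket_count;
    c->entry_count = 0;
    /* Tie the cap to the table size so average chain length stays bounded. */
    c->max_entries = bucket_count * TOY_ENTRIES_PER_BUCKET;
}

/* Returns true if the entry was accounted, false if the cache is at its cap. */
static bool toy_cache_account_entry(struct toy_cache *c)
{
    if (c->entry_count >= c->max_entries)
        return false;   /* a caller would reclaim old entries or give up */
    c->entry_count++;
    return true;
}
```

With such a cap, the average chain length cannot exceed the chosen factor, whatever the amount of free memory.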
     

10 Aug, 2010

2 commits

  • The shrinker function is supposed to return the number of cache
    entries after shrinking, not before shrinking. Fix that.

    Based on a patch from Wang Sheng-Hui.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • The mbcache code was written to support a variable number of indexes,
    but all the existing users use exactly one index. Simplify the code to
    support only that case.

    There are also no users of the cache entry free operation, and none of
    the users keep extra data in cache entries. Remove those features as
    well.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     

19 Jul, 2010

1 commit

  • The current shrinker implementation requires the registered callback
    to have global state to work from. This makes it difficult to shrink
    caches that are not global (e.g. per-filesystem caches). Pass the shrinker
    structure to the callback so that users can embed the shrinker structure
    in the context the shrinker needs to operate on and get back to it in the
    callback via container_of().

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
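
The embedding pattern described above can be sketched in userspace C. The `toy_container_of()` macro re-creates the container_of() idea with offsetof(). The struct names and the callback signature are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-creation of the container_of() idea using offsetof(). */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_shrinker {
    /* The callback receives the shrinker itself, so no global state is needed. */
    int (*shrink)(struct toy_shrinker *self, int nr_to_scan);
};

/* A per-filesystem cache that embeds its own shrinker. */
struct toy_fs_cache {
    int nr_entries;
    struct toy_shrinker shrinker;
};

static int toy_fs_cache_shrink(struct toy_shrinker *self, int nr_to_scan)
{
    /* Walk back from the embedded member to the structure that contains it. */
    struct toy_fs_cache *cache =
        toy_container_of(self, struct toy_fs_cache, shrinker);

    if (nr_to_scan > cache->nr_entries)
        nr_to_scan = cache->nr_entries;
    cache->nr_entries -= nr_to_scan;
    return cache->nr_entries;   /* entries remaining after shrinking */
}
```

Two caches can share the same callback and each call still operates on the right instance, because the context is recovered from the pointer that was passed in.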
     

16 Apr, 2008

1 commit

  • mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL. But
    filesystems are calling this function while holding xattr_sem so possible
    recursion into the fs violates locking ordering of xattr_sem and transaction
    start / i_mutex for ext2-4. Change mb_cache_entry_alloc() so that filesystems
    can specify desired gfp mask and use GFP_NOFS from all of them.

    Signed-off-by: Jan Kara
    Reported-by: Dave Jones
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

26 Oct, 2007

1 commit

  • This patch fixes the c_entry_count counter of the mbcache. Currently
    the code increments the counter first and allocates the cache entry
    later. If the allocation fails due to insufficient memory, the counter
    is still left incremented. This patch fixes this anomaly.

    Signed-off-by: Ram Gupta
    Signed-off-by: Linus Torvalds

    Ram Gupta
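
The ordering fix can be illustrated with a small userspace sketch. The pluggable `toy_alloc` pointer exists only to simulate an allocation failure. It and the other `toy_` names are assumptions, not part of mbcache.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_cache { int entry_count; };
struct toy_entry { struct toy_cache *cache; };

/* Pluggable allocator so that a failure can be simulated (hypothetical). */
static void *(*toy_alloc)(size_t) = malloc;

static void *toy_failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

static struct toy_entry *toy_entry_alloc(struct toy_cache *cache)
{
    struct toy_entry *e = toy_alloc(sizeof(*e));

    if (!e)
        return NULL;          /* counter untouched on failure */
    e->cache = cache;
    cache->entry_count++;     /* count only entries that really exist */
    return e;
}
```

Incrementing after a successful allocation keeps the counter equal to the number of live entries on every path, so no rollback code is needed.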
     

20 Jul, 2007

1 commit

  • Slab destructors were no longer supported after Christoph's
    c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
    BUGs for both slab and slub, and slob never supported them
    either.

    This rips out support for the dtor pointer from kmem_cache_create()
    completely and fixes up every single callsite in the kernel (there were
    about 224, not including the slab allocator definitions themselves,
    or the documentation references).

    Signed-off-by: Paul Mundt

    Paul Mundt
     

18 Jul, 2007

1 commit

  • I can never remember what the function to register to receive VM pressure
    is called. I have to trace down from __alloc_pages() to find it.

    It's called "set_shrinker()", and it needs Your Help.

    1) Don't hide struct shrinker. It contains no magic.
    2) Don't allocate "struct shrinker". It's not helpful.
    3) Call them "register_shrinker" and "unregister_shrinker".
    4) Call the function "shrink" not "shrinker".
    5) Reduce the 17 lines of waffly comments to 13, but document it properly.

    Signed-off-by: Rusty Russell
    Cc: David Chinner
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rusty Russell
     

08 Dec, 2006

1 commit

  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
            quilt add $file
            sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
            mv /tmp/$$ $file
            quilt refresh
    done

    The script was run like this:

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

30 Sep, 2006

1 commit


29 Mar, 2006

1 commit


24 Mar, 2006

1 commit

  • Mark file system inode and similar slab caches subject to SLAB_MEM_SPREAD
    memory spreading.

    If a slab cache is marked SLAB_MEM_SPREAD, then anytime that a task that's
    in a cpuset with the 'memory_spread_slab' option enabled goes to allocate
    from such a slab cache, the allocations are spread evenly over all the
    memory nodes (task->mems_allowed) allowed to that task, instead of favoring
    allocation on the node local to the current cpu.

    The following inode and similar caches are marked SLAB_MEM_SPREAD:

    file                            cache
    ====                            =====
    fs/adfs/super.c                 adfs_inode_cache
    fs/affs/super.c                 affs_inode_cache
    fs/befs/linuxvfs.c              befs_inode_cache
    fs/bfs/inode.c                  bfs_inode_cache
    fs/block_dev.c                  bdev_cache
    fs/cifs/cifsfs.c                cifs_inode_cache
    fs/coda/inode.c                 coda_inode_cache
    fs/dquot.c                      dquot
    fs/efs/super.c                  efs_inode_cache
    fs/ext2/super.c                 ext2_inode_cache
    fs/ext2/xattr.c (fs/mbcache.c)  ext2_xattr
    fs/ext3/super.c                 ext3_inode_cache
    fs/ext3/xattr.c (fs/mbcache.c)  ext3_xattr
    fs/fat/cache.c                  fat_cache
    fs/fat/inode.c                  fat_inode_cache
    fs/freevxfs/vxfs_super.c        vxfs_inode
    fs/hpfs/super.c                 hpfs_inode_cache
    fs/isofs/inode.c                isofs_inode_cache
    fs/jffs/inode-v23.c             jffs_fm
    fs/jffs2/super.c                jffs2_i
    fs/jfs/super.c                  jfs_ip
    fs/minix/inode.c                minix_inode_cache
    fs/ncpfs/inode.c                ncp_inode_cache
    fs/nfs/direct.c                 nfs_direct_cache
    fs/nfs/inode.c                  nfs_inode_cache
    fs/ntfs/super.c                 ntfs_big_inode_cache_name
    fs/ntfs/super.c                 ntfs_inode_cache
    fs/ocfs2/dlm/dlmfs.c            dlmfs_inode_cache
    fs/ocfs2/super.c                ocfs2_inode_cache
    fs/proc/inode.c                 proc_inode_cache
    fs/qnx4/inode.c                 qnx4_inode_cache
    fs/reiserfs/super.c             reiser_inode_cache
    fs/romfs/inode.c                romfs_inode_cache
    fs/smbfs/inode.c                smb_inode_cache
    fs/sysv/inode.c                 sysv_inode_cache
    fs/udf/super.c                  udf_inode_cache
    fs/ufs/super.c                  ufs_inode_cache
    net/socket.c                    sock_inode_cache
    net/sunrpc/rpc_pipe.c           rpc_inode_cache

    The choice of which slab caches to so mark was quite simple. I marked
    those already marked SLAB_RECLAIM_ACCOUNT, except for fs/xfs, dentry_cache,
    inode_cache, and buffer_head, which were marked in a previous patch. Even
    though SLAB_RECLAIM_ACCOUNT is for a different purpose, it marks the same
    potentially large file system i/o related slab caches as we need for memory
    spreading.

    Given that the rule now becomes "wherever you would have used a
    SLAB_RECLAIM_ACCOUNT slab cache flag before (usually the inode cache), use
    the SLAB_MEM_SPREAD flag too", this should be easy enough to maintain.
    Future file system writers will just copy one of the existing file system
    slab cache setups and tend to get it right without thinking.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     

15 Jan, 2006

1 commit


07 Nov, 2005

1 commit

  • This is the fs/ part of the big kfree cleanup patch.

    Remove pointless checks for NULL prior to calling kfree() in fs/.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
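
The same cleanup can be shown in userspace C, where free() is likewise defined to do nothing when passed NULL. This is an analogy to kfree(), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_ctx {
    char *name;
    char *data;
};

/* Before: each pointer is tested even though free() already handles NULL. */
static void toy_ctx_release_verbose(struct toy_ctx *ctx)
{
    if (ctx->name)
        free(ctx->name);
    if (ctx->data)
        free(ctx->data);
    ctx->name = ctx->data = NULL;
}

/* After: free(NULL) is a no-op, so the checks can simply be dropped. */
static void toy_ctx_release(struct toy_ctx *ctx)
{
    free(ctx->name);
    free(ctx->data);
    ctx->name = ctx->data = NULL;
}
```

Both versions behave identically. The shorter one removes a branch per pointer and makes the cleanup path easier to read.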
     

28 Oct, 2005

1 commit

  • - ->releasepage() annotated (s/int/gfp_t), instances updated
    - missing gfp_t in fs/* added
    - fixed misannotation from the original sweep caught by bitwise checks:
      XFS used __nocast both for gfp_t and for flags used by XFS allocator.
      The latter left with unsigned int __nocast; we might want to add a
      different type for those but for now let's leave them alone. That,
      BTW, is a case when __nocast use had been actively confusing - it had
      been used in the same code for two different and similar types, with
      no way to catch misuses. Switch of gfp_t to bitwise had caught that
      immediately...

    One tricky bit is left alone to be dealt with later - mapping->flags is
    a mix of gfp_t and error indications. Left alone for now.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

28 Jul, 2005

1 commit


06 May, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds