08 May, 2007

1 commit

  • I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by
    SLAB.

    I think its purpose was to have a callback after an object has been freed
    to verify that the state is the constructor state again? The callback is
    performed before each freeing of an object.

    I would think that it is much easier to check the object state manually
    before the free. That also places the check near the code object
    manipulation of the object.

    Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
    compiled with SLAB debugging on. If there would be code in a constructor
    handling SLAB_DEBUG_INITIAL then it would have to be conditional on
    SLAB_DEBUG otherwise it would just be dead code. But there is no such code
    in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real
    use of, difficult to understand and there are easier ways to accomplish the
    same effect (i.e. add debug code before kfree).

    There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
    clear in fs inode caches. Remove the pointless checks (they would even be
    pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.

    This is the last slab flag that SLUB did not support. Remove the check for
    unimplemented flags from SLUB.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

15 Feb, 2007

1 commit

  • With this change the sysctl inodes can be cached and nothing needs to be done
    when removing a sysctl table.

    For a cost of 2K code we will save about 4K of static tables (when we remove
    de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or
    about half that on a 32bit arch.

    The speed feels about the same, even though we can now cache the sysctl
    dentries :(

    We get the core advantage that we don't need to have a 1 to 1 mapping between
    ctl table entries and proc files. Making it possible to have /proc/sys vary
    depending on the namespace you are in. The currently merged namespaces don't
    have an issue here but the network namespace under /proc/sys/net needs to have
    different directories depending on which network adapters are visible. By
    simply being a cache different directories being visible depending on who you
    are is trivial to implement.

    [akpm@osdl.org: fix uninitialised var]
    [akpm@osdl.org: fix ARM build]
    [bunk@stusta.de: make things static]
    Signed-off-by: Eric W. Biederman
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

13 Feb, 2007

1 commit


08 Dec, 2006

2 commits

  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
    quilt add $file
    sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
    mv /tmp/$$ $file
    quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • SLAB_KERNEL is an alias of GFP_KERNEL.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

16 Jul, 2006

1 commit


27 Jun, 2006

3 commits

  • Incrementally update my proc-dont-lock-task_structs-indefinitely patches so
    that they work with struct pid instead of struct task_ref.

    Mostly this is a straight 1-1 substitution.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Every inode in /proc holds a reference to a struct task_struct. If a
    directory or file is opened and remains open after the the task exits this
    pinning continues. With 8K stacks on a 32bit machine the amount pinned per
    file descriptor is about 10K.

    Normally I would figure a reasonable per user process limit is about 100
    processes. With 80 processes, with a 1000 file descriptors each I can trigger
    the 00M killer on a 32bit kernel, because I have pinned about 800MB of useless
    data.

    This patch replaces the struct task_struct pointer with a pointer to a struct
    task_ref which has a struct task_struct pointer. The so the pinning of dead
    tasks does not happen.

    The code now has to contend with the fact that the task may now exit at any
    time. Which is a little but not muh more complicated.

    With this change it takes about 1000 processes each opening up 1000 file
    descriptors before I can trigger the OOM killer. Much better.

    [mlp@google.com: task_mmu small fixes]
    Signed-off-by: Eric W. Biederman
    Cc: Trond Myklebust
    Cc: Paul Jackson
    Cc: Oleg Nesterov
    Cc: Albert Cahalan
    Signed-off-by: Prasanna Meda
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The sole renaming use of proc_inode.type is to discover the file descriptor
    number, so just store the file descriptor number and don't wory about
    processing this field. This removes any /proc limits on the maximum number of
    file descriptors, and clears the path to make the hard coded /proc inode
    numbers go away.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

24 Mar, 2006

2 commits

  • Rewrap the overly long source code lines resulting from the previous
    patch's addition of the slab cache flag SLAB_MEM_SPREAD. This patch
    contains only formatting changes, and no function change.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • Mark file system inode and similar slab caches subject to SLAB_MEM_SPREAD
    memory spreading.

    If a slab cache is marked SLAB_MEM_SPREAD, then anytime that a task that's
    in a cpuset with the 'memory_spread_slab' option enabled goes to allocate
    from such a slab cache, the allocations are spread evenly over all the
    memory nodes (task->mems_allowed) allowed to that task, instead of favoring
    allocation on the node local to the current cpu.

    The following inode and similar caches are marked SLAB_MEM_SPREAD:

    file cache
    ==== =====
    fs/adfs/super.c adfs_inode_cache
    fs/affs/super.c affs_inode_cache
    fs/befs/linuxvfs.c befs_inode_cache
    fs/bfs/inode.c bfs_inode_cache
    fs/block_dev.c bdev_cache
    fs/cifs/cifsfs.c cifs_inode_cache
    fs/coda/inode.c coda_inode_cache
    fs/dquot.c dquot
    fs/efs/super.c efs_inode_cache
    fs/ext2/super.c ext2_inode_cache
    fs/ext2/xattr.c (fs/mbcache.c) ext2_xattr
    fs/ext3/super.c ext3_inode_cache
    fs/ext3/xattr.c (fs/mbcache.c) ext3_xattr
    fs/fat/cache.c fat_cache
    fs/fat/inode.c fat_inode_cache
    fs/freevxfs/vxfs_super.c vxfs_inode
    fs/hpfs/super.c hpfs_inode_cache
    fs/isofs/inode.c isofs_inode_cache
    fs/jffs/inode-v23.c jffs_fm
    fs/jffs2/super.c jffs2_i
    fs/jfs/super.c jfs_ip
    fs/minix/inode.c minix_inode_cache
    fs/ncpfs/inode.c ncp_inode_cache
    fs/nfs/direct.c nfs_direct_cache
    fs/nfs/inode.c nfs_inode_cache
    fs/ntfs/super.c ntfs_big_inode_cache_name
    fs/ntfs/super.c ntfs_inode_cache
    fs/ocfs2/dlm/dlmfs.c dlmfs_inode_cache
    fs/ocfs2/super.c ocfs2_inode_cache
    fs/proc/inode.c proc_inode_cache
    fs/qnx4/inode.c qnx4_inode_cache
    fs/reiserfs/super.c reiser_inode_cache
    fs/romfs/inode.c romfs_inode_cache
    fs/smbfs/inode.c smb_inode_cache
    fs/sysv/inode.c sysv_inode_cache
    fs/udf/super.c udf_inode_cache
    fs/ufs/super.c ufs_inode_cache
    net/socket.c sock_inode_cache
    net/sunrpc/rpc_pipe.c rpc_inode_cache

    The choice of which slab caches to so mark was quite simple. I marked
    those already marked SLAB_RECLAIM_ACCOUNT, except for fs/xfs, dentry_cache,
    inode_cache, and buffer_head, which were marked in a previous patch. Even
    though SLAB_RECLAIM_ACCOUNT is for a different purpose, it marks the same
    potentially large file system i/o related slab caches as we need for memory
    spreading.

    Given that the rule now becomes "wherever you would have used a
    SLAB_RECLAIM_ACCOUNT slab cache flag before (usually the inode cache), use
    the SLAB_MEM_SPREAD flag too", this should be easy enough to maintain.
    Future file system writers will just copy one of the existing file system
    slab cache setups and tend to get it right without thinking.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     

19 Feb, 2006

1 commit

  • 1) it should use nr_processes(), not nr_threads; otherwise we are getting
    very confused find(1) and friends, among other things.
    2) better do that at stat() time than at every damn lookup in procfs root.

    Patch had been sitting in FC4 kernels for many months now...

    Signed-off-by: Al Viro

    Al Viro
     

09 Jan, 2006

1 commit


31 Oct, 2005

1 commit

  • This patch fixes incorrect error path in proc_get_inode(), when module
    can't be get due to being unloaded. When try_module_get() fails, this
    function puts de(!) and still returns inode with non-getted de.

    There are still unresolved known bugs in proc yet to be fixed:
    - proc_dir_entry tree is managed without any serialization
    - create_proc_entry() doesn't setup de->owner anyhow,
    so setting it later manually is inatomic.
    - looks like almost all modules do not care whether
    it's de->owner is set...

    Signed-Off-By: Denis Lunev
    Signed-Off-By: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill Korotaev
     

10 Sep, 2005

1 commit

  • Update the file systems in fs/ implementing a delete_inode() callback to
    call truncate_inode_pages(). One implementation note: In developing this
    patch I put the calls to truncate_inode_pages() at the very top of those
    filesystems delete_inode() callbacks in order to retain the previous
    behavior. I'm guessing that some of those could probably be optimized.

    Signed-off-by: Mark Fasheh
    Acked-by: Christoph Hellwig
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Fasheh
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds