09 Dec, 2006

1 commit


08 Dec, 2006

3 commits

  • We currently insert socket dentries into the global dentry hashtable. This
    is suboptimal because there is currently no way these entries can be used
    for a lookup(). (/proc/xxx/fd/xxx uses a different mechanism). Inserting
    them in dentry hashtable slows dcache lookups.

    To let __dpath() still work correctly (i.e. not append " (deleted)" after
    the dentry name), we do the following:

    - Right after d_alloc(), pretend they are hashed by clearing the
    DCACHE_UNHASHED bit.

    - Call d_instantiate() instead of d_add() : dentry is not inserted in
    hash table.

    __dpath() & friends work as intended during dentry lifetime.

    - At dismantle time, once dput() must clear the dentry, set the
    DCACHE_UNHASHED bit again inside the custom d_delete() function provided
    by the socket code, so that dput() can just go to kill_it.

    Signed-off-by: Eric Dumazet
    Cc: Al Viro
    Acked-by: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
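
The flag handling described in this commit can be modelled in user space. The following is a minimal sketch, not kernel code: the struct, the helper names and the flag value are hypothetical stand-ins chosen for illustration, and only the order of operations follows the commit message.

```c
#include <stdbool.h>

/* Hypothetical flag value for this sketch; the kernel defines its own. */
#define MODEL_DCACHE_UNHASHED 0x0010u

/* Simplified stand-in for a dentry: just the flags word plus a marker
 * recording whether it was ever inserted into a hash table. */
struct model_dentry {
    unsigned int d_flags;
    bool in_hash_table;
};

/* Models allocation: a fresh dentry starts out marked unhashed. */
static void model_alloc(struct model_dentry *d)
{
    d->d_flags = MODEL_DCACHE_UNHASHED;
    d->in_hash_table = false;
}

/* Step 1: pretend the dentry is hashed by clearing the bit,
 * without actually inserting it anywhere. */
static void model_pretend_hashed(struct model_dentry *d)
{
    d->d_flags &= ~MODEL_DCACHE_UNHASHED;
}

/* What a path printer would test before appending " (deleted)". */
static bool model_looks_deleted(const struct model_dentry *d)
{
    return (d->d_flags & MODEL_DCACHE_UNHASHED) != 0;
}

/* Final step: the custom delete hook sets the bit again so that
 * the release path can dispose of the dentry directly. */
static void model_d_delete(struct model_dentry *d)
{
    d->d_flags |= MODEL_DCACHE_UNHASHED;
}
```

During its lifetime the modelled dentry never enters a hash table, yet model_looks_deleted() reports false until the delete hook runs.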
     
  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
        quilt add $file
        sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
        mv /tmp/$$ $file
        quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • SLAB_KERNEL is an alias of GFP_KERNEL.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

03 Dec, 2006

1 commit


02 Oct, 2006

1 commit


01 Oct, 2006

2 commits


23 Sep, 2006

7 commits

  • Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
     
  • No need to set ei->socket.flags to zero twice.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • sock_register() doesn't change the family, so the protocols can
    define it read-only. No caller ever checks the return value from
    sock_unregister().

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Replace the gross custom locking done in socket code for net_family[]
    with simple RCU usage. Some reordering necessary to avoid sleep issues
    with sock_alloc.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Make socket.c conform to current style:
    * run through Lindent
    * get rid of unneeded casts
    * split assignment and comparison where possible

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This patch implements wrapper functions that provide a convenient way
    to access the sockets API for in-kernel users like sunrpc, cifs &
    ocfs2 etc and any future users.

    Signed-off-by: Sridhar Samudrala
    Acked-by: James Morris
    Signed-off-by: David S. Miller

    Sridhar Samudrala
     
  • Add NetLabel support to the SELinux LSM and modify the
    socket_post_create() LSM hook to return an error code. The most
    significant part of this patch is the addition of NetLabel hooks into
    the following SELinux LSM hooks:

    * selinux_file_permission()
    * selinux_socket_sendmsg()
    * selinux_socket_post_create()
    * selinux_socket_sock_rcv_skb()
    * selinux_socket_getpeersec_stream()
    * selinux_socket_getpeersec_dgram()
    * selinux_sock_graft()
    * selinux_inet_conn_request()

    The basic reasoning behind this patch is that outgoing packets are
    "NetLabel'd" by labeling their socket and the NetLabel security
    attributes are checked via the additional hook in
    selinux_socket_sock_rcv_skb(). NetLabel itself is only a labeling
    mechanism, similar to filesystem extended attributes; it is up to the
    SELinux enforcement mechanism to perform the actual access checks.

    In addition to the changes outlined above this patch also includes
    some changes to the extended bitmap (ebitmap) and multi-level security
    (mls) code to import and export SELinux TE/MLS attributes into and out
    of NetLabel.

    Signed-off-by: Paul Moore
    Signed-off-by: David S. Miller

    Venkat Yekkirala
     

01 Sep, 2006

1 commit


01 Jul, 2006

1 commit


23 Jun, 2006

1 commit

  • Extend the get_sb() filesystem operation to take an extra argument that
    permits the VFS to pass in the target vfsmount that defines the mountpoint.

    The filesystem is then required to manually set the superblock and root dentry
    pointers. For most filesystems, this should be done with simple_set_mnt()
    which will set the superblock pointer and then set the root dentry to the
    superblock's s_root (as per the old default behaviour).

    The get_sb() op now returns an integer as there's now no need to return the
    superblock pointer.

    This patch permits a superblock to be implicitly shared amongst several mount
    points, such as can be done with NFS to avoid potential inode aliasing. In
    such a case, simple_set_mnt() would not be called, and instead the mnt_root
    and mnt_sb would be set directly.

    The patch also makes the following changes:

    (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
    pointer argument and return an integer, so most filesystems have to change
    very little.

    (*) If one of the convenience functions is not used, then get_sb() should
    normally call simple_set_mnt() to instantiate the vfsmount. This will
    always return 0, and so can be tail-called from get_sb().

    (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
    dcache upon superblock destruction rather than shrink_dcache_anon().

    This is required because the superblock may now have multiple trees that
    aren't actually bound to s_root, but that still need to be cleaned up. The
    currently called functions assume that the whole tree is rooted at s_root,
    and that anonymous dentries are not the roots of trees which results in
    dentries being left unculled.

    However, with the way NFS superblock sharing is currently set to be
    implemented, these assumptions are violated: the root of the filesystem is
    simply a dummy dentry and inode (the real inode for '/' may well be
    inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
    with child trees.

    [*] Anonymous until discovered from another tree.

    (*) The documentation has been adjusted, including the additional bit of
    changing ext2_* into foo_* in the documentation.

    [akpm@osdl.org: convert ipath_fs, do other stuff]
    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Nathan Scott
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
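
The new calling convention can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: the m_* types, model_simple_set_mnt() and foo_get_sb_model() are hypothetical names, and only the behaviour described above is reproduced (set the superblock pointer, set the root to s_root, return 0).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the VFS objects. */
struct m_dentry { const char *name; };
struct m_super_block { struct m_dentry *s_root; };
struct m_vfsmount {
    struct m_super_block *mnt_sb;
    struct m_dentry *mnt_root;
};

/* Models the described behaviour of simple_set_mnt(): point the mount
 * at the superblock and use the superblock's s_root as the mount root.
 * It always returns 0, so a get_sb() implementation can tail-call it. */
static int model_simple_set_mnt(struct m_vfsmount *mnt,
                                struct m_super_block *sb)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = sb->s_root;
    return 0;
}

/* Models a filesystem's get_sb(): it now returns an int and fills in
 * the vfsmount instead of returning a superblock pointer. */
static int foo_get_sb_model(struct m_super_block *sb,
                            struct m_vfsmount *mnt)
{
    if (sb == NULL || sb->s_root == NULL)
        return -EINVAL;   /* illustrative error path only */
    return model_simple_set_mnt(mnt, sb);
}
```

A filesystem that shares one superblock among several mounts would skip the helper and set mnt_sb and mnt_root itself, as the commit message notes.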
     

01 May, 2006

1 commit

  • On Thursday 23 March 2006 09:08, John D. Ramsdell wrote:
    > I noticed that a socketcall(bind) and socketcall(connect) event contain a
    > record of type=SOCKADDR, but I cannot see one for a system call event
    > associated with socketcall(accept). Recording the sockaddr of an accepted
    > socket is important for cross platform information flow analys

    Thanks for pointing this out. The following patch should address this.

    Signed-off-by: Steve Grubb
    Signed-off-by: Al Viro

    Steve Grubb
     

20 Apr, 2006

1 commit


11 Apr, 2006

3 commits

  • * 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
    [PATCH] vfs: add splice_write and splice_read to documentation
    [PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
    [PATCH] splice: warning fix
    [PATCH] another round of fs/pipe.c cleanups
    [PATCH] splice: comment styles
    [PATCH] splice: add Ingo as addition copyright holder
    [PATCH] splice: unlikely() optimizations
    [PATCH] splice: speedups and optimizations
    [PATCH] pipe.c/fifo.c code cleanups
    [PATCH] get rid of the PIPE_*() macros
    [PATCH] splice: speedup __generic_file_splice_read
    [PATCH] splice: add direct fd fd splicing support
    [PATCH] splice: add optional input and output offsets
    [PATCH] introduce a "kernel-internal pipe object" abstraction
    [PATCH] splice: be smarter about calling do_page_cache_readahead()
    [PATCH] splice: optimize the splice buffer mapping
    [PATCH] splice: cleanup __generic_file_splice_read()
    [PATCH] splice: only call wake_up_interruptible() when we really have to
    [PATCH] splice: potential !page dereference
    [PATCH] splice: mark the io page as accessed

    Linus Torvalds
     
  • for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
    in the past where people were using for_each_cpu() where they should have been
    iterating across only online or present CPUs. This is inefficient and
    possibly buggy.

    We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
    future.

    This patch replaces for_each_cpu with for_each_possible_cpu under /net

    Signed-off-by: KAMEZAWA Hiroyuki
    Acked-by: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
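
Why the name matters can be seen in a user-space model where CPU sets are plain bitmasks. This is a sketch only: MODEL_NR_CPUS, the iteration macro and the masks are hypothetical and do not reproduce the kernel's cpumask API.

```c
/* Hypothetical upper bound on CPU indices for this sketch. */
#define MODEL_NR_CPUS 8

/* Visit every CPU index whose bit is set in 'mask'. */
#define model_for_each_cpu_in(cpu, mask)                 \
    for ((cpu) = 0; (cpu) < MODEL_NR_CPUS; (cpu)++)      \
        if ((mask) & (1u << (cpu)))

/* Sum per-CPU counters over the CPUs selected by 'mask'. The result
 * depends on which set the caller iterates: possible, present or
 * online CPUs. */
static long model_sum_percpu(const long *counters, unsigned int mask)
{
    int cpu;
    long total = 0;

    model_for_each_cpu_in(cpu, mask)
        total += counters[cpu];
    return total;
}
```

With four possible CPUs (mask 0x0F) of which two are online (mask 0x03), the same counters array yields different totals, which is the ambiguity the explicit for_each_possible_cpu() name is meant to remove.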
     
  • From: Andrew Morton

    net/socket.c:148: warning: initialization from incompatible pointer type

    extern declarations in .c files! Bad boy.

    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Andrew Morton
     

02 Apr, 2006

1 commit


01 Apr, 2006

1 commit

  • This regression was added by commit:
    39d8c1b6fbaeb8d6adec4a8c08365cc9eaca6ae4
    ("Do not lose accepted socket when -ENFILE/-EMFILE.")

    This is based upon a patch from Andi Kleen.

    Thanks to Adrian Bridgett for narrowing down a good test case, and to
    Andi Kleen and Andrew Morton for eyeballing this code.

    Signed-off-by: David S. Miller

    David S. Miller
     

31 Mar, 2006

1 commit

  • This adds support for the sys_splice system call. Using a pipe as a
    transport, it can connect to files or sockets (latter as output only).

    From the splice.c comments:

    "splice": joining two ropes together by interweaving their strands.

    This is the "extended pipe" functionality, where a pipe is used as
    an arbitrary in-memory buffer. Think of a pipe as a small kernel
    buffer that you can use to transfer data from one end to the other.

    The traditional unix read/write is extended with a "splice()" operation
    that transfers data buffers to or from a pipe buffer.

    Named by Larry McVoy, original implementation from Linus, extended by
    Jens to support splicing to files and fixing the initial implementation
    bugs.

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
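
From user space, the "pipe as an in-memory buffer" idea looks roughly like the following. This is a minimal sketch assuming a Linux system with glibc's splice() wrapper; splice_copy() is a hypothetical helper name, and which descriptor types are accepted depends on the kernel version.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* needed for the splice() declaration */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Move up to 'len' bytes from in_fd to out_fd through a private pipe.
 * Returns the number of bytes moved, or -1 on error.
 */
static ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;

    if (pipe(p) < 0)
        return -1;

    while (len > 0) {
        /* Fill the pipe from the input descriptor. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len, 0);
        if (n < 0) {
            total = -1;
            break;
        }
        if (n == 0)          /* end of input */
            break;
        len -= (size_t)n;

        /* Drain what was just queued into the output descriptor. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n, 0);
            if (m <= 0) {
                total = -1;
                goto out;
            }
            n -= m;
            total += m;
        }
    }
out:
    close(p[0]);
    close(p[1]);
    return total;
}
```

The data is queued in the intermediate pipe and handed to the output without passing through a user-space buffer.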
     

29 Mar, 2006

1 commit

  • This is a conversion to make the various file_operations structs in fs/
    const. Basically a regexp job, with a few manual fixups.

    The goal is both to increase correctness (harder to accidentally write to
    shared data structures) and to reduce the false sharing of cachelines with
    things that get dirty in .data (while .rodata is nicely read-only and thus
    cache clean).

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
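
The effect of the conversion can be sketched with an ordinary table of function pointers. The names below (model_file_ops, model_zero_fops and so on) are hypothetical and only illustrate the pattern; on typical toolchains a const object with static storage is placed in a read-only section, though that placement is up to the compiler and linker.

```c
#include <stddef.h>

/* Hypothetical, reduced operations table. */
struct model_file_ops {
    long (*read)(char *buf, size_t len);
};

/* A trivial read method: fills the buffer with zero bytes. */
static long model_zero_read(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = 0;
    return (long)len;
}

/* Declared const: an assignment such as
 *     model_zero_fops.read = NULL;
 * is rejected at compile time. */
static const struct model_file_ops model_zero_fops = {
    .read = model_zero_read,
};

/* Callers only need a pointer-to-const to dispatch through the table. */
static long model_call_read(const struct model_file_ops *ops,
                            char *buf, size_t len)
{
    return ops->read(buf, len);
}
```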
     

24 Mar, 2006

2 commits

  • Rewrap the overly long source code lines resulting from the previous
    patch's addition of the slab cache flag SLAB_MEM_SPREAD. This patch
    contains only formatting changes, and no function change.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • Mark file system inode and similar slab caches subject to SLAB_MEM_SPREAD
    memory spreading.

    If a slab cache is marked SLAB_MEM_SPREAD, then anytime that a task that's
    in a cpuset with the 'memory_spread_slab' option enabled goes to allocate
    from such a slab cache, the allocations are spread evenly over all the
    memory nodes (task->mems_allowed) allowed to that task, instead of favoring
    allocation on the node local to the current cpu.

    The following inode and similar caches are marked SLAB_MEM_SPREAD:

    file                            cache
    ====                            =====
    fs/adfs/super.c                 adfs_inode_cache
    fs/affs/super.c                 affs_inode_cache
    fs/befs/linuxvfs.c              befs_inode_cache
    fs/bfs/inode.c                  bfs_inode_cache
    fs/block_dev.c                  bdev_cache
    fs/cifs/cifsfs.c                cifs_inode_cache
    fs/coda/inode.c                 coda_inode_cache
    fs/dquot.c                      dquot
    fs/efs/super.c                  efs_inode_cache
    fs/ext2/super.c                 ext2_inode_cache
    fs/ext2/xattr.c (fs/mbcache.c)  ext2_xattr
    fs/ext3/super.c                 ext3_inode_cache
    fs/ext3/xattr.c (fs/mbcache.c)  ext3_xattr
    fs/fat/cache.c                  fat_cache
    fs/fat/inode.c                  fat_inode_cache
    fs/freevxfs/vxfs_super.c        vxfs_inode
    fs/hpfs/super.c                 hpfs_inode_cache
    fs/isofs/inode.c                isofs_inode_cache
    fs/jffs/inode-v23.c             jffs_fm
    fs/jffs2/super.c                jffs2_i
    fs/jfs/super.c                  jfs_ip
    fs/minix/inode.c                minix_inode_cache
    fs/ncpfs/inode.c                ncp_inode_cache
    fs/nfs/direct.c                 nfs_direct_cache
    fs/nfs/inode.c                  nfs_inode_cache
    fs/ntfs/super.c                 ntfs_big_inode_cache_name
    fs/ntfs/super.c                 ntfs_inode_cache
    fs/ocfs2/dlm/dlmfs.c            dlmfs_inode_cache
    fs/ocfs2/super.c                ocfs2_inode_cache
    fs/proc/inode.c                 proc_inode_cache
    fs/qnx4/inode.c                 qnx4_inode_cache
    fs/reiserfs/super.c             reiser_inode_cache
    fs/romfs/inode.c                romfs_inode_cache
    fs/smbfs/inode.c                smb_inode_cache
    fs/sysv/inode.c                 sysv_inode_cache
    fs/udf/super.c                  udf_inode_cache
    fs/ufs/super.c                  ufs_inode_cache
    net/socket.c                    sock_inode_cache
    net/sunrpc/rpc_pipe.c           rpc_inode_cache
    The choice of which slab caches to so mark was quite simple. I marked
    those already marked SLAB_RECLAIM_ACCOUNT, except for fs/xfs, dentry_cache,
    inode_cache, and buffer_head, which were marked in a previous patch. Even
    though SLAB_RECLAIM_ACCOUNT is for a different purpose, it marks the same
    potentially large file system i/o related slab caches as we need for memory
    spreading.

    Given that the rule now becomes "wherever you would have used a
    SLAB_RECLAIM_ACCOUNT slab cache flag before (usually the inode cache), use
    the SLAB_MEM_SPREAD flag too", this should be easy enough to maintain.
    Future file system writers will just copy one of the existing file system
    slab cache setups and tend to get it right without thinking.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     

22 Mar, 2006

1 commit


21 Mar, 2006

3 commits

  • Semaphore to mutex conversion.

    The conversion was generated via scripts, and the result was validated
    automatically via a script as well.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Arjan van de Ven
     
  • Here's an updated copy of the patch to use fget_light in net/socket.c.
    Rerunning the tests shows a drop of ~80Mbit/s on average, which looks
    bad until you see the drop in cpu usage from ~89% to ~82%. That will
    get fixed in another patch...

    Before: max 8113.70, min 8026.32, avg 8072.34
    87380 16384 16384 10.01 8045.55 87.11 87.11 1.774 1.774
    87380 16384 16384 10.01 8065.14 90.86 90.86 1.846 1.846
    87380 16384 16384 10.00 8077.76 89.85 89.85 1.822 1.822
    87380 16384 16384 10.00 8026.32 89.80 89.80 1.833 1.833
    87380 16384 16384 10.01 8108.59 89.81 89.81 1.815 1.815
    87380 16384 16384 10.01 8034.53 89.01 89.01 1.815 1.815
    87380 16384 16384 10.00 8113.70 90.45 90.45 1.827 1.827
    87380 16384 16384 10.00 8111.37 89.90 89.90 1.816 1.816
    87380 16384 16384 10.01 8077.75 87.96 87.96 1.784 1.784
    87380 16384 16384 10.00 8062.70 90.25 90.25 1.834 1.834

    After: max 8035.81, min 7963.69, avg 7998.14
    87380 16384 16384 10.01 8000.93 82.11 82.11 1.682 1.682
    87380 16384 16384 10.01 8016.17 83.67 83.67 1.710 1.710
    87380 16384 16384 10.01 7963.69 83.47 83.47 1.717 1.717
    87380 16384 16384 10.01 8014.35 81.71 81.71 1.671 1.671
    87380 16384 16384 10.00 7967.68 83.41 83.41 1.715 1.715
    87380 16384 16384 10.00 7995.22 81.00 81.00 1.660 1.660
    87380 16384 16384 10.00 8002.61 83.90 83.90 1.718 1.718
    87380 16384 16384 10.00 8035.81 81.71 81.71 1.666 1.666
    87380 16384 16384 10.01 8005.36 82.56 82.56 1.690 1.690
    87380 16384 16384 10.00 7979.61 82.50 82.50 1.694 1.694

    Signed-off-by: Benjamin LaHaise
    Signed-off-by: David S. Miller

    Benjamin LaHaise
     
  • Try to allocate the struct file and an unused file
    descriptor before we try to pull a newly accepted
    socket out of the protocol layer.

    Based upon a patch by Prassana Meda.

    Signed-off-by: David S. Miller

    David S. Miller
     

07 Feb, 2006

1 commit


06 Feb, 2006

1 commit

  • percpu_data blindly allocates bootmem memory to store NR_CPUS instances of
    cpudata, instead of allocating memory only for possible cpus.

    As a preparation for changing that, we need to convert various 0 -> NR_CPUS
    loops to use for_each_cpu().

    (The above only applies to users of asm-generic/percpu.h. powerpc has gone it
    alone and is presently only allocating memory for present CPUs, so it's
    currently corrupting memory).

    Signed-off-by: Eric Dumazet
    Cc: "David S. Miller"
    Cc: James Bottomley
    Acked-by: Ingo Molnar
    Cc: Jens Axboe
    Cc: Anton Blanchard
    Acked-by: William Irwin
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     

31 Jan, 2006

1 commit

  • This patch contains the following changes:
    - add a CONFIG_WIRELESS_EXT select'ed by NET_RADIO for conditional
    code
    - remove the now no longer required #ifdef CONFIG_NET_RADIO from some
    #include's

    Based on a patch by Jean Tourrilhes.

    Signed-off-by: Adrian Bunk
    Signed-off-by: John W. Linville

    Adrian Bunk
     

12 Jan, 2006

1 commit


04 Jan, 2006

3 commits

  • Currently all network protocols need to call dev_ioctl as the default
    fallback in their ioctl implementations. This patch adds a fallback
    to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
    This way all the protocol ioctl handlers can be simplified and we don't
    need to export dev_ioctl.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
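
The fallback described in this commit can be sketched as a small user-space dispatcher. Everything here is a hypothetical model: MODEL_ENOIOCTLCMD stands in for the kernel-internal -ENOIOCTLCMD code, the command numbers are invented, and the handlers return arbitrary values so the two paths can be told apart.

```c
#include <errno.h>

/* Stand-in for the kernel-internal "no ioctl command" error code. */
#define MODEL_ENOIOCTLCMD 515

/* Invented command numbers for this sketch. */
enum { MODEL_CMD_PROTO = 1, MODEL_CMD_DEV = 2 };

/* Protocol-level handler: handles its own command and declines the
 * rest instead of calling the device layer itself. */
static int model_proto_ioctl(unsigned int cmd)
{
    if (cmd == MODEL_CMD_PROTO)
        return 1;
    return -MODEL_ENOIOCTLCMD;
}

/* Device-level handler: the generic fallback. */
static int model_dev_ioctl(unsigned int cmd)
{
    if (cmd == MODEL_CMD_DEV)
        return 2;
    return -EINVAL;
}

/* Socket-level entry point: try the protocol first, and fall back to
 * the device handler only when the protocol declined the command. */
static int model_sock_ioctl(unsigned int cmd)
{
    int err = model_proto_ioctl(cmd);

    if (err == -MODEL_ENOIOCTLCMD)
        err = model_dev_ioctl(cmd);
    return err;
}
```

Because the fallback lives in one place, individual protocol handlers no longer need access to the device-level function.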
     
  • Mid-term I plan to restructure the file_operations so that we don't need
    to have all these duplicate aio and vectored versions. This patch is
    a small step in that direction but also a worthwhile cleanup on its own:

    (1) introduce an alloc_sock_iocb helper that encapsulates allocating a
    proper sock_iocb
    (2) add do_sock_read and do_sock_write helpers for common read/write
    code

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
     
  • sock_init() needs to return zero now that it is an initcall.

    Also, net/nonet.c no longer needs a dummy sock_init().

    Signed-off-by: David S. Miller

    David S. Miller