20 Jul, 2007
1 commit
-
Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).Signed-off-by: Paul Mundt
17 Jul, 2007
1 commit
-
Part two in the O_CLOEXEC saga: adding support for file descriptors received
through Unix domain sockets.The patch is once again pretty minimal, it introduces a new flag for recvmsg
and passes it just like the existing MSG_CMSG_COMPAT flag. I think this bit
is not used otherwise but the networking people will know better.This new flag is not recognized by recvfrom and recv. These functions cannot
be used for that purpose and the asymmetry this introduces is not worse than
the already existing MSG_CMSG_COMPAT situations.The patch must be applied on the patch which introduced O_CLOEXEC. It has to
remove static from the new get_unused_fd_flags function but since scm.c cannot
live in a module the function still hasn't to be exported.Here's a test program to make sure the code works. It's so much longer than
the actual patch...#include
#include
#include
#include
#include
#include
#include
#include#ifndef O_CLOEXEC
# define O_CLOEXEC 02000000
#endif
#ifndef MSG_CMSG_CLOEXEC
# define MSG_CMSG_CLOEXEC 0x40000000
#endifint
main (int argc, char *argv[])
{
if (argc > 1)
{
int fd = atol (argv[1]);
printf ("child: fd = %d\n", fd);
if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
{
puts ("file descriptor valid in child");
return 1;
}
return 0;}
struct sockaddr_un sun;
strcpy (sun.sun_path, "./testsocket");
sun.sun_family = AF_UNIX;char databuf[] = "hello";
struct iovec iov[1];
iov[0].iov_base = databuf;
iov[0].iov_len = sizeof (databuf);union
{
struct cmsghdr hdr;
char bytes[CMSG_SPACE (sizeof (int))];
} buf;
struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1,
.msg_control = buf.bytes,
.msg_controllen = sizeof (buf) };
struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_RIGHTS;
cmsg->cmsg_len = CMSG_LEN (sizeof (int));msg.msg_controllen = cmsg->cmsg_len;
pid_t child = fork ();
if (child == -1)
error (1, errno, "fork");
if (child == 0)
{
int sock = socket (PF_UNIX, SOCK_STREAM, 0);
if (sock < 0)
error (1, errno, "socket");if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
error (1, errno, "bind");
if (listen (sock, SOMAXCONN) < 0)
error (1, errno, "listen");int conn = accept (sock, NULL, NULL);
if (conn == -1)
error (1, errno, "accept");*(int *) CMSG_DATA (cmsg) = sock;
if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0)
error (1, errno, "sendmsg");return 0;
}/* For a test suite this should be more robust like a
barrier in shared memory. */
sleep (1);int sock = socket (PF_UNIX, SOCK_STREAM, 0);
if (sock < 0)
error (1, errno, "socket");if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
error (1, errno, "connect");
unlink (sun.sun_path);*(int *) CMSG_DATA (cmsg) = -1;
if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0)
error (1, errno, "recvmsg");int fd = *(int *) CMSG_DATA (cmsg);
if (fd == -1)
error (1, 0, "no descriptor received");char fdname[20];
snprintf (fdname, sizeof (fdname), "%d", fd);
execl ("/proc/self/exe", argv[0], fdname, NULL);
puts ("execl failed");
return 1;
}[akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Ulrich Drepper
Cc: Ingo Molnar
Cc: Michael Buesch
Cc: Michael Kerrisk
Acked-by: David S. Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
17 May, 2007
1 commit
-
SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
Signed-off-by: Christoph Lameter
Cc: David Howells
Cc: Jens Axboe
Cc: Steven French
Cc: Michael Halcrow
Cc: OGAWA Hirofumi
Cc: Miklos Szeredi
Cc: Steven Whitehouse
Cc: Roman Zippel
Cc: David Woodhouse
Cc: Dave Kleikamp
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Cc: Anton Altaparmakov
Cc: Mark Fasheh
Cc: Paul Mackerras
Cc: Christoph Hellwig
Cc: Jan Kara
Cc: David Chinner
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 May, 2007
1 commit
-
1) Introduces a new method in 'struct dentry_operations'. This method
called d_dname() might be called from d_path() to build a pathname for
special filesystems. It is called without locks.Future patches (if we succeed in having one common dentry for all
pipes/sockets) may need to change prototype of this method, but we now
use : char *d_dname(struct dentry *dentry, char *buffer, int buflen);2) Adds a dynamic_dname() helper function that eases d_dname() implementations
3) Defines d_dname method for sockets : No more sprintf() at socket
creation. This is delayed up to the moment someone does an access to
/proc/pid/fd/...4) Defines d_dname method for pipes : No more sprintf() at pipe
creation. This is delayed up to the moment someone does an access to
/proc/pid/fd/...A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a
*nice* speedup on my Pentium(M) 1.6 Ghz :3.090 s instead of 3.450 s
Signed-off-by: Eric Dumazet
Acked-by: Christoph Hellwig
Acked-by: Linus Torvalds
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 May, 2007
1 commit
-
I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by
SLAB.I think its purpose was to have a callback after an object has been freed
to verify that the state is the constructor state again? The callback is
performed before each freeing of an object.I would think that it is much easier to check the object state manually
before the free. That also places the check near the code object
manipulation of the object.Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
compiled with SLAB debugging on. If there would be code in a constructor
handling SLAB_DEBUG_INITIAL then it would have to be conditional on
SLAB_DEBUG otherwise it would just be dead code. But there is no such code
in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real
use of, difficult to understand and there are easier ways to accomplish the
same effect (i.e. add debug code before kfree).There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
clear in fs inode caches. Remove the pointless checks (they would even be
pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.This is the last slab flag that SLUB did not support. Remove the check for
unimplemented flags from SLUB.Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 Apr, 2007
3 commits
-
Kernel: arch/x86_64/boot/bzImage is ready (#2)
MODPOST 1816 modules
WARNING: "__sock_recv_timestamp" [net/sctp/sctp.ko] undefined!
WARNING: "__sock_recv_timestamp" [net/packet/af_packet.ko] undefined!
WARNING: "__sock_recv_timestamp" [net/key/af_key.ko] undefined!
WARNING: "__sock_recv_timestamp" [net/ipv6/ipv6.ko] undefined!
WARNING: "__sock_recv_timestamp" [net/atm/atm.ko] undefined!
make[2]: *** [__modpost] Error 1
make[1]: *** [modules] Error 2
make: *** [_all] Error 2Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller -
Now that network timestamps use ktime_t infrastructure, we can add a new
SOL_SOCKET sockopt SO_TIMESTAMPNS.This command is similar to SO_TIMESTAMP, but permits transmission of
a 'timespec struct' instead of a 'timeval struct' control message.
(nanosecond resolution instead of microsecond)Control message is labelled SCM_TIMESTAMPNS instead of SCM_TIMESTAMP
A socket cannot mix SO_TIMESTAMP and SO_TIMESTAMPNS : the two modes are
mutually exclusive.sock_recv_timestamp() became too big to be fully inlined so I added a
__sock_recv_timestamp() helper function.Signed-off-by: Eric Dumazet
CC: linux-arch@vger.kernel.org
Signed-off-by: David S. Miller -
Fix whitespace around keywords. Fix indentation especially of switch
statements.Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
27 Mar, 2007
1 commit
-
* d_alloc() in sock_attach_fd() fails leaving ->f_dentry of new file NULL
* bail out to out_fd label, doing fput()/__fput() on new file
* but __fput() assumes valid ->f_dentry and dereferences itSigned-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller
18 Feb, 2007
1 commit
-
Provide an audit record of the descriptor pair returned by pipe() and
socketpair(). Rewritten from the original posted to linux-audit by
John D. RamsdellSigned-off-by: Al Viro
13 Feb, 2007
1 commit
-
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
11 Feb, 2007
1 commit
-
Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller
09 Feb, 2007
2 commits
-
GCC (correctly) says:
net/socket.c: In function ‘sys_sendto’:
net/socket.c:1510: warning: ‘err’ may be used uninitialized in this function
net/socket.c: In function ‘sys_recvfrom’:
net/socket.c:1571: warning: ‘err’ may be used uninitialized in this functionsock_from_file() either returns filp->private_data or it
sets *err and returns NULL.Callers return "err" on NULL, but filp->private_data could
be NULL.Some minor rearrangements of error handling in sys_sendto
and sys_recvfrom solves the issue.Signed-off-by: David S. Miller
-
I believe dead code from sock_from_file() can be cleaned up.
All sockets are now built using sock_attach_fd(), that puts the 'sock' pointer
into file->private_data and &socket_file_ops into file->f_opI could not find a place where file->private_data could be set to NULL,
keeping opened the file.So to get 'sock' from a 'file' pointer, either :
- This is a socket file (f_op == &socket_file_ops), and we can directly get
'sock' from private_data.
- This is not a socket, we return -ENOTSOCK and dont even try to find a socket
via dentry/inode :)Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
09 Dec, 2006
1 commit
-
Signed-off-by: Josef Sipek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 Dec, 2006
3 commits
-
We currently insert socket dentries into the global dentry hashtable. This
is suboptimal because there is currently no way these entries can be used
for a lookup(). (/proc/xxx/fd/xxx uses a different mechanism). Inserting
them in dentry hashtable slows dcache lookups.To let __dpath() still work correctly (ie not adding a " (deleted)") after
dentry name, we do :- Right after d_alloc(), pretend they are hashed by clearing the
DCACHE_UNHASHED bit.- Call d_instantiate() instead of d_add() : dentry is not inserted in
hash table.__dpath() & friends work as intended during dentry lifetime.
- At dismantle time, once dput() must clear the dentry, setting again
DCACHE_UNHASHED bit inside the custom d_delete() function provided by
socket code, so that dput() can just kill_it.Signed-off-by: Eric Dumazet
Cc: Al Viro
Acked-by: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Replace all uses of kmem_cache_t with struct kmem_cache.
The patch was generated using the following script:
#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#set -e
for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
doneThe script was run like this
sh replace kmem_cache_t "struct kmem_cache"
Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
SLAB_KERNEL is an alias of GFP_KERNEL.
Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
03 Dec, 2006
1 commit
-
This patch contains the scheduled removal of the frame diverter.
Signed-off-by: Adrian Bunk
Signed-off-by: David S. Miller
02 Oct, 2006
1 commit
-
File handles can be requested to send sigio and sigurg to processes. By
tracking the destination processes using struct pid instead of pid_t we make
the interface safe from all potential pid wrap around problems.Signed-off-by: Eric W. Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Oct, 2006
2 commits
-
This patch removes readv() and writev() methods and replaces them with
aio_read()/aio_write() methods.Signed-off-by: Badari Pulavarty
Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch vectorizes aio_read() and aio_write() methods to prepare for
collapsing all aio & vectored operations into one interface - which is
aio_read()/aio_write().Signed-off-by: Badari Pulavarty
Signed-off-by: Christoph Hellwig
Cc: Michael Holzheu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Sep, 2006
7 commits
-
Signed-off-by: Brian Haley
Signed-off-by: David S. Miller -
No need to set ei->socket.flags to zero twice.
Signed-off-by: David S. Miller
-
The sock_register() doesn't change the family, so the protocols can
define it read-only. No caller ever checks return value from
sock_unregister()Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
Replace the gross custom locking done in socket code for net_family[]
with simple RCU usage. Some reordering necessary to avoid sleep issues
with sock_alloc.Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
Make socket.c conform to current style:
* run through Lindent
* get rid of unneeded casts
* split assignment and comparsion where possibleSigned-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
This patch implements wrapper functions that provide a convenient way
to access the sockets API for in-kernel users like sunrpc, cifs &
ocfs2 etc and any future users.Signed-off-by: Sridhar Samudrala
Acked-by: James Morris
Signed-off-by: David S. Miller -
Add NetLabel support to the SELinux LSM and modify the
socket_post_create() LSM hook to return an error code. The most
significant part of this patch is the addition of NetLabel hooks into
the following SELinux LSM hooks:* selinux_file_permission()
* selinux_socket_sendmsg()
* selinux_socket_post_create()
* selinux_socket_sock_rcv_skb()
* selinux_socket_getpeersec_stream()
* selinux_socket_getpeersec_dgram()
* selinux_sock_graft()
* selinux_inet_conn_request()The basic reasoning behind this patch is that outgoing packets are
"NetLabel'd" by labeling their socket and the NetLabel security
attributes are checked via the additional hook in
selinux_socket_sock_rcv_skb(). NetLabel itself is only a labeling
mechanism, similar to filesystem extended attributes, it is up to the
SELinux enforcement mechanism to perform the actual access checks.In addition to the changes outlined above this patch also includes
some changes to the extended bitmap (ebitmap) and multi-level security
(mls) code to import and export SELinux TE/MLS attributes into and out
of NetLabel.Signed-off-by: Paul Moore
Signed-off-by: David S. Miller
01 Sep, 2006
1 commit
-
This patch limits the warning messages when socket allocation failures
happen. It happens under memory pressure.Signed-off-by: Akinobu Mita
Signed-off-by: David S. Miller
01 Jul, 2006
1 commit
-
Signed-off-by: Jörn Engel
Signed-off-by: Adrian Bunk
23 Jun, 2006
1 commit
-
Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.The filesystem is then required to manually set the superblock and root dentry
pointers. For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing. In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.The patch also makes the following changes:
(*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
pointer argument and return an integer, so most filesystems have to change
very little.(*) If one of the convenience function is not used, then get_sb() should
normally call simple_set_mnt() to instantiate the vfsmount. This will
always return 0, and so can be tail-called from get_sb().(*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
dcache upon superblock destruction rather than shrink_dcache_anon().This is required because the superblock may now have multiple trees that
aren't actually bound to s_root, but that still need to be cleaned up. The
currently called functions assume that the whole tree is rooted at s_root,
and that anonymous dentries are not the roots of trees which results in
dentries being left unculled.However, with the way NFS superblock sharing are currently set to be
implemented, these assumptions are violated: the root of the filesystem is
simply a dummy dentry and inode (the real inode for '/' may well be
inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
with child trees.[*] Anonymous until discovered from another tree.
(*) The documentation has been adjusted, including the additional bit of
changing ext2_* into foo_* in the documentation.[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells
Acked-by: Al Viro
Cc: Nathan Scott
Cc: Roland Dreier
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 May, 2006
1 commit
-
On Thursday 23 March 2006 09:08, John D. Ramsdell wrote:
> I noticed that a socketcall(bind) and socketcall(connect) event contain a
> record of type=SOCKADDR, but I cannot see one for a system call event
> associated with socketcall(accept). Recording the sockaddr of an accepted
> socket is important for cross platform information flow analysThanks for pointing this out. The following patch should address this.
Signed-off-by: Steve Grubb
Signed-off-by: Al Viro
20 Apr, 2006
1 commit
-
This applies to 2.6.17-rc2.
There is a missing initialization of err in sockfd_lookup_light() that
could return random error for an invalid file handle.Signed-off-by: Hua Zhong
Signed-off-by: David S. Miller
11 Apr, 2006
3 commits
-
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] vfs: add splice_write and splice_read to documentation
[PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
[PATCH] splice: warning fix
[PATCH] another round of fs/pipe.c cleanups
[PATCH] splice: comment styles
[PATCH] splice: add Ingo as addition copyright holder
[PATCH] splice: unlikely() optimizations
[PATCH] splice: speedups and optimizations
[PATCH] pipe.c/fifo.c code cleanups
[PATCH] get rid of the PIPE_*() macros
[PATCH] splice: speedup __generic_file_splice_read
[PATCH] splice: add direct fd fd splicing support
[PATCH] splice: add optional input and output offsets
[PATCH] introduce a "kernel-internal pipe object" abstraction
[PATCH] splice: be smarter about calling do_page_cache_readahead()
[PATCH] splice: optimize the splice buffer mapping
[PATCH] splice: cleanup __generic_file_splice_read()
[PATCH] splice: only call wake_up_interruptible() when we really have to
[PATCH] splice: potential !page dereference
[PATCH] splice: mark the io page as accessed -
for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs. This is inefficient and
possibly buggy.We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.This patch replaces for_each_cpu with for_each_possible_cpu under /net
Signed-off-by: KAMEZAWA Hiroyuki
Acked-by: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
From: Andrew Morton
net/socket.c:148: warning: initialization from incompatible pointer type
extern declarations in .c files! Bad boy.
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe
02 Apr, 2006
1 commit
-
Andi Kleen was right, fput() on sock->file will end up calling
sock_release() if necessary. So here is the rest of his version
of the fix for these leaks.Signed-off-by: David S. Miller
01 Apr, 2006
1 commit
-
This regression was added by commit:
39d8c1b6fbaeb8d6adec4a8c08365cc9eaca6ae4
("Do not lose accepted socket when -ENFILE/-EMFILE.")This is based upon a patch from Andi Kleen.
Thanks to Adrian Bridgett for narrowing down a good test case, and to
Andi Kleen and Andrew Morton for eyeballing this code.Signed-off-by: David S. Miller
31 Mar, 2006
1 commit
-
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).From the splice.c comments:
"splice": joining two ropes together by interweaving their strands.
This is the "extended pipe" functionality, where a pipe is used as
an arbitrary in-memory buffer. Think of a pipe as a small kernel
buffer that you can use to transfer data from one end to the other.The traditional unix read/write is extended with a "splice()" operation
that transfers data buffers to or from a pipe buffer.Named by Larry McVoy, original implementation from Linus, extended by
Jens to support splicing to files and fixing the initial implementation
bugs.Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds