Doug / smarc-fsl-linux-kernel | Embedian Git Server

31 Dec, 2011

1 commit

885ee74d5 af_unix: Move CINQ/COUTQ code to helpers

Currently tcp diag reports rqlen and wqlen values similar to how
the CINQ/COUTQ ioctls do. To make unix diag report these values
in the same way, move the respective code into helpers.
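
For readers who have not used these ioctls from userspace, the following is a minimal sketch of what SIOCINQ reports on an AF_UNIX stream socket. The helper name unread_bytes_after_write is hypothetical and not part of the commit; the sketch assumes a Linux system where <linux/sockios.h> is available.

```c
#include <linux/sockios.h>  /* SIOCINQ, SIOCOUTQ */
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes into one end of an AF_UNIX stream
 * socketpair and ask the other end how much unread data it holds.
 * Returns the value reported by SIOCINQ, or -1 on error.
 */
int unread_bytes_after_write(const char *buf, int len)
{
    int sv[2];
    int inq = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (write(sv[0], buf, (size_t)len) == (ssize_t)len) {
        /* SIOCINQ: bytes waiting in the receive queue of sv[1]. */
        if (ioctl(sv[1], SIOCINQ, &inq) < 0)
            inq = -1;
    }

    close(sv[0]);
    close(sv[1]);
    return inq;
}
```

SIOCOUTQ can be queried on the sending end in the same way; the value it returns reflects kernel buffer accounting rather than a plain payload byte count, so the sketch does not assert anything about it.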

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-31 05:45:45 +0800

17 Dec, 2011

1 commit

fa7ff56f7 af_unix: Export stuff required for diag module

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-17 02:48:27 +0800

25 Apr, 2011

1 commit

2a9e95070 net: Remove __KERNEL__ cpp checks from include/net

These header files are never installed for user consumption, so any
__KERNEL__ cpp checks are superfluous.

Projects should also not copy these files into their userland utility
sources and try to use them there. If they insist on doing so, the
onus is on them to sanitize the headers as needed.

Signed-off-by: David S. Miller

David S. Miller
2011-04-25 01:54:56 +0800

30 Nov, 2010

1 commit

25888e303 af_unix: limit recursion level

It's easy to eat all kernel memory and trigger NMI watchdog, using an
exploit program that queues unix sockets on top of others.

lkml ref : http://lkml.org/lkml/2010/11/25/8

This mechanism is used in applications, so one choice we have is to
have a recursion limit.

Other limits might be needed as well (if we queue other types of files),
since the passfd mechanism is currently limited by socket receive queue
sizes only.

Add a recursion_level to unix socket, allowing up to 4 levels.

Each time we send a unix socket through sendfd mechanism, we copy its
recursion level (plus one) to receiver. This recursion level is cleared
when socket receive queue is emptied.
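
The bookkeeping described above can be sketched as a small userspace model. This is not the kernel code: the struct and function names (toy_unix_sock, toy_send_fd, toy_queue_emptied) are hypothetical, and both the exact boundary comparison and the -ETOOMANYREFS error code are assumptions made for illustration.

```c
#include <errno.h>

#define TOY_MAX_RECURSION_LEVEL 4   /* "up to 4 levels" per the commit message */

/* Hypothetical stand-in for the per-socket state added by the patch. */
struct toy_unix_sock {
    int recursion_level;   /* 0 while no unix socket sits in the receive queue */
};

/*
 * Model of passing the socket `passed` to `receiver` as a file descriptor.
 * The receiver inherits the passed socket's level plus one.
 * Returns 0 on success, or -ETOOMANYREFS (assumed error code) when the
 * nesting would exceed the limit.
 */
int toy_send_fd(struct toy_unix_sock *receiver,
                const struct toy_unix_sock *passed)
{
    int level = passed->recursion_level + 1;

    if (level > TOY_MAX_RECURSION_LEVEL)
        return -ETOOMANYREFS;
    if (level > receiver->recursion_level)
        receiver->recursion_level = level;
    return 0;
}

/* Model of the receive queue being drained: the level is cleared. */
void toy_queue_emptied(struct toy_unix_sock *sk)
{
    sk->recursion_level = 0;
}
```

In this model a chain of four nested sends succeeds, a fifth is rejected, and draining the queue of the deepest socket makes it sendable again.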

Reported-by: Марк Коренберг
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-11-30 01:45:15 +0800

17 Jun, 2010

1 commit

7361c36c5 af_unix: Allow credentials to work across user and pid namespaces.

In unix_skb_parms store pointers to struct pid and struct cred instead
of raw uid, gid, and pid values, then translate the credentials on
reception into values that are meaningful in the receiving process's
namespaces.

Signed-off-by: Eric W. Biederman
Acked-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Eric W. Biederman
2010-06-17 05:58:16 +0800

02 May, 2010

1 commit

438154823 net: sock_def_readable() and friends RCU conversion

sk_callback_lock rwlock actually protects sk->sk_sleep pointer, so we
need two atomic operations (and associated dirtying) per incoming
packet.

RCU conversion is pretty much needed :

1) Add a new structure, called "struct socket_wq" to hold all fields
that will need rcu_read_lock() protection (currently: a
wait_queue_head_t and a struct fasync_struct pointer).

[Future patch will add a list anchor for wakeup coalescing]

2) Attach one such structure to each "struct socket" created in
sock_alloc_inode().

3) Respect RCU grace period when freeing a "struct socket_wq"

4) Replace the sk_sleep pointer in "struct sock" with sk_wq, a pointer
to "struct socket_wq"

5) Change sk_sleep() function to use new sk->sk_wq instead of
sk->sk_sleep

6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
a rcu_read_lock() section.

7) Change all sk_has_sleeper() callers to :
- Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock)
- Use wq_has_sleeper() to eventually wakeup tasks.
- Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock)

8) sock_wake_async() is modified to use rcu protection as well.

9) Exceptions :
macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
instead of dynamically allocated ones. They don't need rcu freeing.

Some cleanups or followups are probably needed, (possible
sk_callback_lock conversion to a spinlock for example...).

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-05-02 06:00:15 +0800

27 Nov, 2008

1 commit

5f23b7349 net: Fix soft lockups/OOM issues w/ unix garbage collector

This is an implementation of David Miller's suggested fix in:
https://bugzilla.redhat.com/show_bug.cgi?id=470201

It has been updated to use wait_event() instead of
wait_event_interruptible().

Paraphrasing the description from the above report, it makes sendmsg()
block while UNIX garbage collection is in progress. This avoids a
situation where child processes continue to queue new FDs over an
AF_UNIX socket to a parent which is in the exit path and running
garbage collection on these FDs. This contention can result in soft
lockups and oom-killing of unrelated processes.

Signed-off-by: dann frazier
Signed-off-by: David S. Miller

dann frazier
2008-11-27 07:32:27 +0800

10 Nov, 2008

1 commit

6209344f5 net: unix: fix inflight counting bug in garbage collector

Previously I assumed that the receive queues of candidates don't
change during the GC. This is only half true: nothing can be received
from the queues (see comment in unix_gc()), but buffers could be added
through the other half of the socket pair, which may still have file
descriptors referring to it.

This can result in inc_inflight_move_tail() erroneously increasing the
"inflight" counter for a unix socket for which dec_inflight() wasn't
previously called. This in turn can trigger the "BUG_ON(total_refs <
inflight_refs)" in a later garbage collection run.

Fix this by only manipulating the "inflight" counter for sockets which
are candidates themselves. Duplicating the file references in
unix_attach_fds() is also needed to prevent a socket becoming a
candidate for GC while the skb that contains it is not yet queued.

Reported-by: Andrea Bittau
Signed-off-by: Miklos Szeredi
CC: stable@kernel.org
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-11-10 03:17:33 +0800

27 Jul, 2008

1 commit

516e0cc56 [PATCH] f_count may wrap around

make it atomic_long_t; while we are at it, get rid of useless checks in affs,
hfs and hpfs - ->open() always has it equal to 1, ->release() - to 0.

Signed-off-by: Al Viro

Al Viro
2008-07-27 08:53:40 +0800

29 Jan, 2008

2 commits

27147c9e6 [AF_UNIX]: Remove unused declaration of sysctl_unix_max_dgram_qlen.

Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller

Denis V. Lunev
2008-01-29 06:57:13 +0800

97577e382 [UNIX]: Extend unix_sysctl_(un)register prototypes

Add the struct net * argument to both of them to use in
the future. Also make the register one return an error code.

It is useless right now, but will make the future patches
much simpler.

Signed-off-by: Pavel Emelyanov
Acked-by: Eric W. Biederman
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Pavel Emelyanov
2008-01-29 06:55:21 +0800

11 Nov, 2007

1 commit

9305cfa44 [AF_UNIX]: Make unix_tot_inflight counter non-atomic

This counter is _always_ modified under the unix_gc_lock spinlock,
so its atomicity can be provided w/o additional efforts.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-11 14:06:01 +0800

31 Jul, 2007

1 commit

131116989 [AF_UNIX]: Make code static.

The following code can now become static:
- struct unix_socket_table
- unix_table_lock

Signed-off-by: Adrian Bunk
Signed-off-by: David S. Miller

Adrian Bunk
2007-07-31 17:28:27 +0800

12 Jul, 2007

1 commit

1fd05ba5a [AF_UNIX]: Rewrite garbage collector, fixes race.

Throw out the old mark & sweep garbage collector and put in a
refcounting cycle detecting one.

The old one had a race with recvmsg, that resulted in false positives
and hence data loss. The old algorithm operated on all unix sockets
in the system, so any additional locking would have meant performance
problems for all users of these.

The new algorithm instead only operates on "in flight" sockets, which
are very rare, and the additional locking for these doesn't negatively
impact the vast majority of users.

In fact, it's probable that there weren't *any* heavy senders of
sockets over sockets, otherwise the above race would have been
discovered long ago.

The patch works OK with the app that exposed the race with the old
code. The garbage collection has also been verified to work in a few
simple cases.
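
To make the idea of a refcounting, cycle-detecting collector concrete, here is a toy userspace model. It is only a sketch of the general technique under simplifying assumptions, not the kernel implementation: the names toy_sock and toy_gc are hypothetical, receive queues are reduced to a small matrix of reference counts, and no locking is modeled.

```c
#define TOY_MAX_SOCKS 8

/* Hypothetical, simplified view of one unix socket. */
struct toy_sock {
    int total_refs;             /* every reference to the socket's file */
    int inflight;               /* references held by queued fd-passing messages */
    int queued[TOY_MAX_SOCKS];  /* queued[j]: refs to socket j in this receive queue */
};

/*
 * Sets garbage[i] = 1 for each socket kept alive only by a cycle of
 * in-flight references. Returns the number of garbage sockets.
 * Requires n <= TOY_MAX_SOCKS.
 */
int toy_gc(const struct toy_sock *s, int n, int *garbage)
{
    int candidate[TOY_MAX_SOCKS];
    int work[TOY_MAX_SOCKS];    /* scratch copy of the inflight counters */
    int i, j, changed, count = 0;

    /* 1. Candidates: in-flight sockets with no reference outside the queues. */
    for (i = 0; i < n; i++) {
        candidate[i] = s[i].inflight > 0 && s[i].total_refs == s[i].inflight;
        work[i] = s[i].inflight;
    }

    /* 2. Subtract references that sit in the queues of candidates. */
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (candidate[i] && candidate[j])
                work[j] -= s[i].queued[j];

    /* 3. A candidate with references left is reachable from a live socket:
     *    un-mark it and give back the references it holds on the others. */
    do {
        changed = 0;
        for (i = 0; i < n; i++) {
            if (!candidate[i] || work[i] <= 0)
                continue;
            candidate[i] = 0;
            changed = 1;
            for (j = 0; j < n; j++)
                if (candidate[j])
                    work[j] += s[i].queued[j];
        }
    } while (changed);

    /* 4. Whatever is still a candidate is unreachable garbage. */
    for (i = 0; i < n; i++) {
        garbage[i] = candidate[i];
        count += candidate[i];
    }
    return count;
}
```

Only sockets with a nonzero in-flight count are ever considered, which mirrors the point made above that the collector touches "in flight" sockets rather than every unix socket in the system.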

Signed-off-by: Miklos Szeredi
Signed-off-by: David S. Miller

Miklos Szeredi
2007-07-12 05:22:39 +0800

04 Jun, 2007

1 commit

1c92b4e50 [AF_UNIX]: Make socket locking much less confusing.

The unix_state_*() locking macros imply that there is some
rwlock kind of thing going on, but the implementation is
actually a spinlock which makes the code more confusing than
it needs to be.

So use plain unix_state_lock and unix_state_unlock.

Signed-off-by: David S. Miller

David S. Miller
2007-06-04 09:08:40 +0800

03 Aug, 2006

1 commit

dc49c1f94 [AF_UNIX]: Kernel memory leak fix for af_unix datagram getpeersec patch

From: Catherine Zhang

This patch implements a cleaner fix for the memory leak problem of the
original unix datagram getpeersec patch. Instead of creating a
security context each time a unix datagram is sent, we only create the
security context when the receiver requests it.

This new design requires modification of the current
unix_getsecpeer_dgram LSM hook and addition of two new hooks, namely,
secid_to_secctx and release_secctx. The former retrieves the security
context and the latter releases it. A hook is required for releasing
the security context because it is up to the security module to decide
how that's done. In the case of SELinux, it's a simple kfree
operation.

Acked-by: Stephen Smalley
Signed-off-by: David S. Miller

Catherine Zhang
2006-08-03 05:12:06 +0800

04 Jul, 2006

1 commit

a09785a24 [PATCH] lockdep: annotate af_unix locking

Teach special (recursive) locking code to the lock validator. Also splits
af_unix's sk_receive_queue.lock class from the other networking skb-queue
locks. Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar
Signed-off-by: Arjan van de Ven
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ingo Molnar
2006-07-04 06:27:07 +0800

30 Jun, 2006

1 commit

877ce7c1b [AF_UNIX]: Datagram getpeersec

This patch implements an API whereby an application can determine the
label of its peer's Unix datagram sockets via the auxiliary data mechanism of
recvmsg.

Patch purpose:

This patch enables a security-aware application to retrieve the
security context of the peer of a Unix datagram socket. The application
can then use this security context to determine the security context for
processing on behalf of the peer who sent the packet.

Patch design and implementation:

The design and implementation is very similar to the UDP case for INET
sockets. Basically we build upon the existing Unix domain socket API for
retrieving user credentials. Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message). To retrieve the security
context, the application first indicates to the kernel such desire by
setting the SO_PASSSEC option via setsockopt. Then the application
retrieves the security context using the auxiliary data mechanism.

An example server application for Unix datagram socket should look like this:

    toggle = 1;
    toggle_len = sizeof(toggle);

    /* setsockopt() takes the option length by value, not by pointer */
    setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, toggle_len);
    recvmsg(sockfd, &msg_hdr, 0);
    if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
        cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
        /* accept only a security-context message that fits in scontext */
        if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
            cmsg_hdr->cmsg_level == SOL_SOCKET &&
            cmsg_hdr->cmsg_type == SCM_SECURITY) {
            memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
        }
    }

sock_setsockopt is enhanced with a new socket option SO_PASSSEC to allow
a server socket to receive security context of the peer.

Testing:

We have tested the patch by setting up Unix datagram client and server
applications. We verified that the server can retrieve the security context
using the auxiliary data mechanism of recvmsg.

Signed-off-by: Catherine Zhang
Acked-by: James Morris
Signed-off-by: David S. Miller

Catherine Zhang
2006-06-30 07:58:06 +0800

26 Apr, 2006

1 commit

62c4f0a2d Don't include linux/config.h from anywhere else in include/

Signed-off-by: David Woodhouse

David Woodhouse
2006-04-26 19:56:16 +0800

21 Mar, 2006

1 commit

57b47a53e [NET]: sem2mutex part 2

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Ingo Molnar
2006-03-21 14:35:41 +0800

04 Jan, 2006

2 commits

fd19f329a [AF_UNIX]: Convert to use a spinlock instead of rwlock

From: Benjamin LaHaise

In af_unix, a rwlock is used to protect internal state. At least on my
P4 with HT it is faster to use a spinlock due to the simpler memory
barrier used to unlock. This patch raises bw_unix to ~690K/s.

Signed-off-by: David S. Miller

Benjamin LaHaise
2006-01-04 06:10:46 +0800

fbe9cc4a8 [AF_UNIX]: Use spinlock for unix_table_lock

This lock is actually taken mostly as a writer,
so using a rwlock actually just makes performance
worse, especially on chips like the Intel P4.

Signed-off-by: David S. Miller

David S. Miller
2006-01-04 05:10:59 +0800

30 Aug, 2005

1 commit

20380731b [NET]: Fix sparse warnings

Of this type, mostly:

CHECK net/ipv6/netfilter.c
net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static?
net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static?

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 07:01:32 +0800

17 Apr, 2005

1 commit

1da177e4c Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

Linus Torvalds
2005-04-17 06:20:36 +0800