Eric Lee / smarc-fsl-linux-kernel

22 Jun, 2005

7 commits

047601bf7 [NETFILTER]: Fix ip6t_LOG sit tunnel logging

Sit tunnel logging is currently broken:

MAC=01:23:45:67:89:ab->01:23:45:47:89:ac TUNNEL=123.123. 0.123-> 12.123. 6.123

Apart from the broken IP address, MAC addresses are printed differently
for sit tunnels than for everything else.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2005-06-22 05:07:13 +0800
97216c799 [NETFILTER]: Missing owner-field initialization in ip6table_raw

I missed this one when fixing up iptable_raw.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2005-06-22 05:03:01 +0800
e98231858 [NETFILTER]: Restore netfilter assumptions in IPv6 multicast

Netfilter assumes that skb->data == skb->nh.ipv6h

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2005-06-22 05:02:15 +0800
18b8afc77 [NETFILTER]: Kill nf_debug

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2005-06-22 05:01:57 +0800
e45b1be8b [NETFILTER]: Kill lockhelp.h

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2005-06-22 05:01:30 +0800
c9e3e8b69 [IPV6]: multicast join and misc

Here is a simplified version of the patch to fix a bug in IPv6
multicasting. It:

1) adds existence check & EADDRINUSE error for regular joins
2) adds an exception for EADDRINUSE in the source-specific multicast
join (where a prior join is ok)
3) adds a missing/needed read_lock on sock_mc_list; would've raced
with destroying the socket on interface down without
4) adds a "leave group" in the (INCLUDE, empty) source filter case.
This frees unneeded socket buffer memory, but also prevents
an inappropriate interaction among the 8 socket options that
mess with this. Some would fail as if in the group when you
aren't really.
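
Items 1 and 2 above can be sketched in user-space C. This is only an illustration under stated assumptions: struct mc_socket, mc_join() and mc_join_source() are hypothetical names, the group is reduced to an int, and the fixed-size array stands in for the kernel's per-socket multicast list.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_GROUPS 8               /* hypothetical capacity for the sketch */

struct mc_socket {
    int groups[MAX_GROUPS];        /* stand-in for the per-socket group list */
    size_t count;
};

static int mc_is_member(const struct mc_socket *sk, int group)
{
    for (size_t i = 0; i < sk->count; i++)
        if (sk->groups[i] == group)
            return 1;
    return 0;
}

/* Regular join: joining a group twice is reported as EADDRINUSE (item 1). */
int mc_join(struct mc_socket *sk, int group)
{
    if (mc_is_member(sk, group))
        return -EADDRINUSE;
    if (sk->count == MAX_GROUPS)
        return -ENOBUFS;
    sk->groups[sk->count++] = group;
    return 0;
}

/* Source-specific join: an earlier join of the same group is fine (item 2). */
int mc_join_source(struct mc_socket *sk, int group)
{
    int err = mc_join(sk, group);

    if (err && err != -EADDRINUSE)
        return err;
    /* A real implementation would now add the source to the filter. */
    return 0;
}
```

Calling mc_join() twice with the same group yields -EADDRINUSE on the second call, while mc_join_source() accepts a group that is already joined.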

Item #4 had a locking bug in the last version of this patch; rather than
removing the idev->lock read lock only, I've simplified it to remove
all lock state in the path and treat it as a direct "leave group" call for
the (INCLUDE,empty) case it covers. Tested on an MP machine. :-)

Much thanks to HoerdtMickael who
reported the original bug.

Signed-off-by: David L Stevens
Signed-off-by: David S. Miller

David L Stevens
2005-06-22 04:58:25 +0800
0d51aa80a [IPV6]: V6 route events reported with wrong netlink PID and seq number

Essentially, netlink at the moment always reports a PID and sequence
number of 0 for v6 route activities.
To understand the repercussions of this, look at:
http://lists.quagga.net/pipermail/quagga-dev/2005-June/003507.html

While fixing this, I took the liberty of resolving the outstanding issue
of IPv6 routes inserted via ioctls so that they have the correct PIDs as well.

This patch tries to behave as closely as possible to the v4 routes, i.e.,
it maintains whatever PID the socket issuing the command owns as opposed to
the process. That made the patch a little bulky.

I have tested against both a netlink-derived utility to add/del routes and
an ioctl-derived one. The Quagga folks have tested against Quagga.
This fixes the problem and so far hasn't been found to introduce any
new issues.

Signed-off-by: Jamal Hadi Salim
Acked-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

Jamal Hadi Salim
2005-06-22 04:51:04 +0800

21 Jun, 2005

1 commit

72cb6962a [IPSEC]: Add xfrm_init_state

This patch adds xfrm_init_state which is simply a wrapper that calls
xfrm_get_type and subsequently x->type->init_state. It also gets rid
of the unused args argument.

Abstracting it out allows us to add common initialisation code, e.g.,
to set family-specific flags.

The add_time setting in xfrm_user.c was deleted because it's already
set by xfrm_state_alloc.
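
The wrapper shape described above (look the type up, then call its init hook) can be sketched in plain C. The names below (struct state, struct state_type, get_type(), state_init()) are hypothetical stand-ins, not the kernel's definitions, and the flag value is arbitrary.

```c
#include <errno.h>
#include <stddef.h>

struct state;

/* Hypothetical per-type operations table, loosely modelled on x->type. */
struct state_type {
    int proto;
    int (*init_state)(struct state *x);
};

struct state {
    int proto;
    int flags;
    const struct state_type *type;
};

static int esp_init(struct state *x)
{
    x->flags |= 0x1;               /* arbitrary flag, only for the sketch */
    return 0;
}

static const struct state_type type_table[] = {
    { 50, esp_init },              /* 50 is the IP protocol number of ESP */
};

/* Lookup step: plays the role the message assigns to xfrm_get_type. */
static const struct state_type *get_type(int proto)
{
    for (size_t i = 0; i < sizeof(type_table) / sizeof(type_table[0]); i++)
        if (type_table[i].proto == proto)
            return &type_table[i];
    return NULL;
}

/* Wrapper: one entry point that resolves the type and runs its init hook. */
int state_init(struct state *x)
{
    x->type = get_type(x->proto);
    if (!x->type)
        return -EPROTONOSUPPORT;
    /* Common initialisation shared by all callers would go here. */
    return x->type->init_state(x);
}
```

Because every caller goes through state_init(), shared setup only has to be added in one place, which is the benefit the message points at.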

Signed-off-by: Herbert Xu
Acked-by: James Morris
Signed-off-by: David S. Miller

Herbert Xu
2005-06-21 04:18:08 +0800

19 Jun, 2005

8 commits

e0f9f8586 [IPV4/IPV6]: Replace spin_lock_irq with spin_lock_bh

In light of my recent patch to net/ipv4/udp.c that replaced the
spin_lock_irq calls on the receive queue lock with spin_lock_bh,
here is a similar patch for all other occurrences of spin_lock_irq
on receive/error queue locks in IPv4 and IPv6.

In these stacks, we know that they can only be entered from user
or softirq context. Therefore it's safe to disable BH only.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2005-06-19 13:56:18 +0800
9ed19f339 [NETLINK]: Set correct pid for ioctl originating netlink events

This patch ensures that netlink events created as a result of programs
using ioctls (such as ifconfig, route, etc.) contain the correct PID of
those events.

Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller

Jamal Hadi Salim
2005-06-19 13:55:51 +0800
e431b8c00 [NETLINK]: Explicit typing

This patch converts "unsigned flags" to use more explicit types like u16
instead and incrementally introduces NLMSG_NEW().

Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller

Jamal Hadi Salim
2005-06-19 13:55:31 +0800
b6544c0b4 [NETLINK]: Correctly set NLM_F_MULTI without checking the pid

This patch rectifies some rtnetlink message builders that derive the
flags from the pid. It is now explicit like the other cases
which get it right. Also fixes half a dozen dumpers which did not
set NLM_F_MULTI at all.
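
A minimal sketch of the idea, assuming a simplified header layout: the caller passes the flags outright instead of the builder deriving them from the pid. The sk_-prefixed names are hypothetical; only the numeric value of the multi flag is taken from <linux/netlink.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_NLM_F_MULTI 0x2   /* same value as NLM_F_MULTI in <linux/netlink.h> */
#define SK_MSG_ROUTE   1     /* hypothetical message type for the sketch */

/* Simplified message header; field names are chosen for the sketch. */
struct sk_nlmsghdr {
    uint32_t len;
    uint16_t type;
    uint16_t flags;
    uint32_t seq;
    uint32_t pid;
};

/* Fill one header. The flags are an explicit argument, so a pid of 0
 * can no longer silently drop the multi flag. */
void sk_fill_hdr(struct sk_nlmsghdr *h, uint16_t type,
                 uint32_t pid, uint32_t seq, uint16_t flags)
{
    memset(h, 0, sizeof(*h));
    h->len = sizeof(*h);
    h->type = type;
    h->flags = flags;
    h->seq = seq;
    h->pid = pid;
}

/* A dump consists of several parts, so every part is marked as multi. */
size_t sk_dump_routes(struct sk_nlmsghdr *out, size_t n,
                      uint32_t pid, uint32_t seq)
{
    for (size_t i = 0; i < n; i++)
        sk_fill_hdr(&out[i], SK_MSG_ROUTE, pid, seq, SK_NLM_F_MULTI);
    return n;
}
```

A single notification would call sk_fill_hdr() with flags set to 0, while a dumper always passes the multi flag, whatever the pid is.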

Signed-off-by: Jamal Hadi Salim
Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller

Jamal Hadi Salim
2005-06-19 13:54:12 +0800
2ad69c55a [NET] rename struct tcp_listen_opt to struct listen_sock

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:48:55 +0800
0e87506fc [NET] Generalise tcp_listen_opt

This chunks out the accept_queue and tcp_listen_opt code and moves
them to net/core/request_sock.c and include/net/request_sock.h, to
make it useful for other transport protocols, DCCP being the first one
to use it.

Next patches will rename tcp_listen_opt to accept_sock and remove the
inline tcp functions that just call a reqsk_queue_ function.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:47:59 +0800
60236fdd0 [NET] Rename open_request to request_sock

Ok, this one just renames some stuff to have a better namespace and to
dissociate it from TCP:

struct open_request -> struct request_sock
tcp_openreq_alloc -> reqsk_alloc
tcp_openreq_free -> reqsk_free
tcp_openreq_fastfree -> __reqsk_free

With this most of the infrastructure closely resembles a struct
sock methods subset.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:47:21 +0800
2e6599cb8 [NET] Generalise TCP's struct open_request minisock infrastructure

Kept this first changeset minimal, without changing existing names to
ease peer review.

Basically tcp_openreq_alloc now receives the or_calltable, that in turn
has two new members:

->slab, that replaces tcp_openreq_cachep
->obj_size, to inform the size of the openreq descendant for
a specific protocol

The protocol specific fields in struct open_request were moved to a
class hierarchy, with the things that are common to all connection
oriented PF_INET protocols in struct inet_request_sock, the TCP ones
in tcp_request_sock, that is an inet_request_sock, that is an
open_request.

I.e. this uses the same approach used for the struct sock class
hierarchy, with sk_prot indicating if the protocol wants to use the
open_request infrastructure by filling in sk_prot->rsk_prot with an
or_calltable.
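
The "is a" layering and the obj_size member can be sketched in user-space C. This is an illustration only: the struct and function names are hypothetical, the fields are invented, and calloc() stands in for the slab cache the message mentions.

```c
#include <stddef.h>
#include <stdlib.h>

struct req_ops;

/* Base object: what every protocol's request has in common. */
struct open_req {
    const struct req_ops *ops;
    unsigned int retrans;
};

/* Per-protocol table; obj_size tells the allocator how large the
 * protocol's descendant of struct open_req is. */
struct req_ops {
    size_t obj_size;
};

/* "is a" through first-member embedding: inet extends the base ... */
struct inet_req {
    struct open_req req;           /* must stay the first member */
    unsigned int loc_addr, rmt_addr;
};

/* ... and tcp extends inet. */
struct tcp_req {
    struct inet_req ireq;          /* must stay the first member */
    unsigned int rcv_isn, snt_isn;
};

const struct req_ops tcp_req_ops = { sizeof(struct tcp_req) };

/* Allocate a request of whatever size the protocol asked for. */
struct open_req *req_alloc(const struct req_ops *ops)
{
    struct open_req *req = calloc(1, ops->obj_size);

    if (req)
        req->ops = ops;
    return req;
}
```

Generic code only ever sees struct open_req, while protocol code casts the same pointer back to its own type; that works because each level embeds its parent as the first member.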

Results? Performance is improved and TCP v4 now uses only 64 bytes per
open request minisock, down from 96 without this patch :-)

Next changeset will rename some of the structs, fields and functions
mentioned above: struct or_calltable is way unclear and is better named
struct request_sock_ops, s/struct open_request/struct request_sock/g,
etc.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:46:52 +0800

14 Jun, 2005

2 commits

77bd91967 [IPv6] Don't generate temporary for TUN devices

Userland layer-2 tunneling devices allocated through the TUNTAP driver
(drivers/net/tun.c) have a type of ARPHRD_NONE, and have no link-layer
address. The kernel complains at regular intervals when IPv6 Privacy
extensions are enabled because it can't find a hardware address:

Dec 29 11:02:04 auguste kernel: __ipv6_regen_rndid(idev=cb3e0c00):
cannot get EUI64 identifier; use random bytes.

IPv6 Privacy extensions should probably be disabled on that sort of
device. They won't work anyway. If userland wants a more usual
Ethernet-ish interface with usual IPv6 autoconfiguration, it will use a
TAP device with an emulated link-layer and a random hardware address
rather than a TUN device.

As far as I could find, the TUN virtual device from TUNTAP is the only
sort of device using ARPHRD_NONE as its kernel device type.

Signed-off-by: Rémi Denis-Courmont
Acked-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

Rémi Denis-Courmont
2005-06-14 06:01:34 +0800
84427d533 [IPV6]: Ensure to use icmpv6_socket in non-preemptive context.

We saw the following trace several times:

|BUG: using smp_processor_id() in preemptible [00000001] code: httpd/30137
|caller is icmpv6_send+0x23/0x540
| [] smp_processor_id+0x9b/0xb8
| [] icmpv6_send+0x23/0x540

This is because of icmpv6_socket, which is the only user of
smp_processor_id() in icmpv6_send(), AFAIK.

Since it should be used in non-preemptive context,
let's defer the dereference until after preemption is disabled
(by icmpv6_xmit_lock()).

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2005-06-14 05:59:44 +0800

09 Jun, 2005

1 commit

8181b8c1f [IPV6]: Update parm.link in ip6ip6_tnl_change()

Signed-off-by: Gabor Fekete
Signed-off-by: David S. Miller

Gabor Fekete
2005-06-09 05:54:38 +0800

03 Jun, 2005

1 commit

4fef0304e [IPV6]: Kill export of fl6_sock_lookup.

There is no usage of this EXPORT_SYMBOL in the kernel.

Signed-off-by: Adrian Bunk
Acked-by: Hideaki YOSHIFUJI
Signed-off-by: David S. Miller

Adrian Bunk
2005-06-03 04:06:36 +0800

30 May, 2005

1 commit

6c94d3611 [IPV6]: Clear up user copy warning in flowlabel code.

We are intentionally ignoring the copy_to_user() value,
make it clear to the compiler too.
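
One common way to tell the compiler that a result is ignored on purpose is to consume it in an empty branch. The sketch below assumes GCC or Clang (for the warn_unused_result attribute); copy_out() and report_best_effort() are hypothetical stand-ins, and this is not claimed to be the exact change made by the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_to_user(): returns the number of bytes
 * that could NOT be copied, and asks the compiler to warn when a caller
 * drops that value. */
__attribute__((warn_unused_result))
size_t copy_out(void *dst, size_t dst_len, const void *src, size_t n)
{
    size_t ok = n < dst_len ? n : dst_len;

    if (dst)
        memcpy(dst, src, ok);
    else
        ok = 0;
    return n - ok;
}

/* Best-effort report: a short copy is deliberately not an error here. */
void report_best_effort(void *dst, size_t dst_len, const void *src, size_t n)
{
    if (copy_out(dst, dst_len, src, n)) {
        /* Intentionally ignore a short copy. */
    }
}
```

The empty branch documents the intent for human readers and uses the return value, so the unused-result warning does not fire.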

Noted by Jeff Garzik.

Signed-off-by: David S. Miller

David S. Miller
2005-05-30 11:28:01 +0800

27 May, 2005

1 commit

92d63decc From: Kazunori Miyazawa <kazunori@miyazawa.org>

[XFRM] Call dst_check() with appropriate cookie

This fixes infinite loop issue with IPv6 tunnel mode.

Signed-off-by: Kazunori Miyazawa
Signed-off-by: Hideaki YOSHIFUJI
Signed-off-by: David S. Miller

Hideaki YOSHIFUJI
2005-05-27 03:58:04 +0800

24 May, 2005

1 commit

180e42503 [IPV6]: Fix xfrm tunnel oops with large packets

Signed-off-by: Herbert Xu
Acked-by: Hideaki YOSHIFUJI
Signed-off-by: David S. Miller

Herbert Xu
2005-05-24 04:11:07 +0800

19 May, 2005

1 commit

2fdba6b08 [IPV4/IPV6] Ensure all frag_list members have NULL sk

Having frag_list members which hold wmem of an sk leads to nightmares
with partially cloned frag skb's. The reason is that once you unleash
a skb with a frag_list that has individual sk ownerships into the stack
you can never undo those ownerships safely as they may have been cloned
by things like netfilter. Since we have to undo them in order to make
skb_linearize happy this approach leads to a dead-end.

So let's go the other way and make this an invariant:

For any skb on a frag_list, skb->sk must be NULL.

That is, the socket ownership always belongs to the head skb.
It turns out that the implementation is actually pretty simple.
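
The invariant can be expressed as a small check over a toy buffer type. This is a sketch under assumptions: struct buf and struct sock_stub are hypothetical, and frag_list_disown() ignores the memory accounting a real implementation would have to adjust.

```c
#include <stddef.h>

struct sock_stub { int id; };      /* hypothetical stand-in for a socket */

struct buf {
    struct sock_stub *sk;          /* owner, if any */
    struct buf *frag_list;         /* first fragment (used on the head) */
    struct buf *next;              /* next fragment on the list */
};

/* Returns 1 when no buffer on head's frag_list owns a socket. */
int frag_list_ok(const struct buf *head)
{
    for (const struct buf *f = head->frag_list; f; f = f->next)
        if (f->sk != NULL)
            return 0;
    return 1;
}

/* Establish the invariant: ownership stays with the head only. */
void frag_list_disown(struct buf *head)
{
    for (struct buf *f = head->frag_list; f; f = f->next)
        f->sk = NULL;
}
```

After frag_list_disown(), the head still points at its socket and frag_list_ok() returns 1 for the whole chain.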

The above invariant is actually violated in the following patch
for a short duration inside ip_fragment. This is OK because the
offending frag_list member is either destroyed at the end of the
slow path without being sent anywhere, or it is detached from
the frag_list before being sent.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2005-05-19 13:52:33 +0800

04 May, 2005

5 commits

aabc9761b [IPSEC]: Store idev entries

I found a bug that stopped IPsec/IPv6 from working. About
a month ago IPv6 started using rt6i_idev->dev on the cached socket dst
entries. If the cached socket dst entry is IPsec, then rt6i_idev will
be NULL.

Since we want to look at the rt6i_idev of the original route in this
case, the easiest fix is to store rt6i_idev in the IPsec dst entry just
as we do for a number of other IPv6 route attributes. Unfortunately
this means that we need some new code to handle the references to
rt6i_idev. That's why this patch is bigger than it would otherwise be.

I've also done the same thing for IPv4 since it is conceivable that
once these idev attributes start getting used for accounting, we
probably need to dereference them for IPv4 IPsec entries too.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2005-05-04 07:27:10 +0800
2a0a6ebee [NETLINK]: Synchronous message processing.

Let's recap the problem. The current asynchronous netlink kernel
message processing is vulnerable to these attacks:

1) Hit and run: Attacker sends one or more messages and then exits
before they're processed. This may confuse/disable the next netlink
user that gets the netlink address of the attacker since it may
receive the responses to the attacker's messages.

Proposed solutions:

a) Synchronous processing.
b) Stream mode socket.
c) Restrict/prohibit binding.

2) Starvation: Because various netlink rcv functions were written
to not return until all messages have been processed on a socket,
it is possible for these functions to execute for an arbitrarily
long period of time. If this is successfully exploited it could
also be used to hold rtnl forever.

Proposed solutions:

a) Synchronous processing.
b) Stream mode socket.

Firstly let's cross off solution c). It only solves the first
problem and it has user-visible impacts. In particular, it'll
break user space applications that expect to bind or communicate
with specific netlink addresses (pid's).

So we're left with a choice of synchronous processing versus
SOCK_STREAM for netlink.

For the moment I'm sticking with the synchronous approach as
suggested by Alexey since it's simpler and I'd rather spend
my time working on other things.

However, it does have a number of deficiencies compared to the
stream mode solution:

1) User-space to user-space netlink communication is still vulnerable.

2) Inefficient use of resources. This is especially true for rtnetlink
since the lock is shared with other users such as networking drivers.
The latter could hold the rtnl while communicating with hardware which
causes the rtnetlink user to wait when it could be doing other things.

3) It is still possible to DoS all netlink users by flooding the kernel
netlink receive queue. The attacker simply fills the receive socket
with a single netlink message that fills up the entire queue. The
attacker then continues to call sendmsg with the same message in a loop.

Point 3) can be countered by retransmissions in user-space code, however
it is pretty messy.

In light of these problems (in particular, point 3), we should implement
stream mode netlink at some point. In the mean time, here is a patch
that implements synchronous processing.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2005-05-04 05:55:09 +0800
c3924c70d [TCP]: Optimize check in port-allocation code, v6 version.

Signed-off-by: Folkert van Heusden
Signed-off-by: David S. Miller

Folkert van Heusden
2005-05-04 05:36:45 +0800
db46edc6d [RTNETLINK] Cleanup rtnetlink_link tables

Converts remaining rtnetlink_link tables to use C99 designated
initializers to make grepping a little bit easier.
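
For readers unfamiliar with the construct, here is a small sketch of a handler table built with C99 designated initializers. The enum values, struct link_entry and the handlers are hypothetical and only mimic the general shape of such a table.

```c
#include <stddef.h>

/* Hypothetical message indices for the sketch. */
enum { SK_NEWLINK, SK_DELLINK, SK_GETLINK, SK_MAX };

struct link_entry {
    int (*doit)(int arg);
    int (*dumpit)(int arg);
};

static int do_newlink(int arg)   { return arg + 1; }
static int dump_getlink(int arg) { return arg * 2; }

/* Each slot is named by its index and each member by its name, so a
 * search for "SK_GETLINK" lands directly on its handlers. Slots and
 * members that are not mentioned are zero-initialized. */
const struct link_entry link_table[SK_MAX] = {
    [SK_NEWLINK] = { .doit   = do_newlink   },
    [SK_GETLINK] = { .dumpit = dump_getlink },
};
```

Compared with positional initializers, reordering the enum or adding a member to the struct cannot silently shift handlers into the wrong slot.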

Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller

Thomas Graf
2005-05-04 05:29:39 +0800
679a87382 [IPV6]: Fix raw socket checksums with IPsec

I made a mistake in my last patch to the raw socket checksum code.
I used the value of inet->cork.length as the length of the payload.
While this works with normal packets, it breaks down when IPsec is
present since the cork length includes the extension header length.

So here is a patch to fix the length calculations.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2005-05-04 05:24:36 +0800

29 Apr, 2005

1 commit

89c8b3a11 [IPV6]: Incorrect permissions on route flush sysctl

On Mon, Apr 25, 2005 at 12:01:13PM -0400, Dave Jones wrote:
> This has been brought up before.. http://lkml.org/lkml/2000/1/21/116
> but didn't seem to get resolved. This morning I got someone
> file a bugzilla about it breaking sysctl(8).

And here's its ipv6 counterpart.

Signed-off-by: Dave Jones
Signed-off-by: David S. Miller

Dave Jones
2005-04-29 03:11:49 +0800

26 Apr, 2005

1 commit

b453257f0 [PATCH] kill gratuitous includes of major.h under net/*

A lot of places in there are including major.h for no reason whatsoever.
Removed. And yes, it still builds.

The history of that stuff is often amusing. E.g. for net/core/sock.c
the story looks so, as far as I've been able to reconstruct it: we used
to need major.h in net/socket.c circa 1.1.early. In 1.1.13 that need
had disappeared, along with register_chrdev(SOCKET_MAJOR, "socket",
&net_fops) in sock_init(). Include had not. When 1.2 -> 1.3 reorg of
net/* had moved a lot of stuff from net/socket.c to net/core/sock.c,
this crap had followed...

Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2005-04-26 09:32:13 +0800

25 Apr, 2005

2 commits

edec231a8 [IPV6]: export inet6_sock_nr

Please apply, SCTP/DCCP needs this when INET_REFCNT_DEBUG
is set.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-04-25 11:22:28 +0800
0d3d077cd [SELINUX]: Fix ipv6_skip_exthdr() invocation causing OOPS.

The SELinux hooks invoke ipv6_skip_exthdr() with an incorrect
length final argument. However, the length argument turns out
to be superfluous.

I was just reading ipv6_skip_exthdr and it occurred to me that we can
get rid of len altogether. The only place where len is used is to
check whether the skb has two bytes for ipv6_opt_hdr. This check
is done by skb_header_pointer/skb_copy_bits anyway.

Now it might appear that we've made the code slower by deferring
the check to skb_copy_bits. However, this check should not trigger
in the common case so this is OK.
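
The argument that a separate length parameter is redundant can be sketched over a flat buffer: if the accessor already refuses out-of-range reads, the walker needs no length of its own. Everything here is a simplified assumption: struct pkt, hdr_pointer() and skip_exthdr() are hypothetical, and only hop-by-hop (0), routing (43) and destination options (60) headers are treated as extension headers.

```c
#include <stddef.h>
#include <string.h>

struct pkt {
    const unsigned char *data;
    size_t len;
};

/* Copy len bytes at offset into tmp, or return NULL when the packet is
 * too short. Loosely modelled on the role of skb_header_pointer. */
const void *hdr_pointer(const struct pkt *p, size_t offset, size_t len,
                        void *tmp)
{
    if (offset > p->len || len > p->len - offset)
        return NULL;
    memcpy(tmp, p->data + offset, len);
    return tmp;
}

static int is_ext(unsigned char nh)
{
    return nh == 0 || nh == 43 || nh == 60;
}

/* Skip extension headers starting at start. Returns the offset of the
 * upper-layer header, or -1 if the packet ends early. No length
 * argument is needed: hdr_pointer() performs the bounds check. */
long skip_exthdr(const struct pkt *p, size_t start, unsigned char *nexthdr)
{
    unsigned char nh = *nexthdr;

    while (is_ext(nh)) {
        unsigned char tmp[2];
        const unsigned char *h = hdr_pointer(p, start, sizeof(tmp), tmp);

        if (!h)
            return -1;
        nh = h[0];                             /* next header type */
        start += ((size_t)h[1] + 1) * 8;       /* length in 8-octet units */
    }
    *nexthdr = nh;
    return (long)start;
}
```

With an 8-byte hop-by-hop header followed by TCP, the walker returns offset 8 and sets the next-header value to 6; on a truncated packet it returns -1 without reading past the end.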

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2005-04-25 11:16:19 +0800

20 Apr, 2005

3 commits

3320da890 [IPV6]: Replace bogus instances of inet->recverr

While looking at this problem I noticed that IPv6 was sometimes
looking at inet->recverr which is bogus. Here is a patch to
correct that and use np->recverr.

Signed-off-by: Herbert Xu
Acked-by: Hideaki YOSHIFUJI
Signed-off-by: David S. Miller

Herbert Xu
2005-04-20 13:32:22 +0800
357b40a18 [IPV6]: IPV6_CHECKSUM socket option can corrupt kernel memory

So here is a patch that introduces skb_store_bits -- the opposite of
skb_copy_bits, and uses them to read/write the csum field in rawv6.
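
The copy/store pairing can be illustrated with a toy buffer made of several segments, where a field such as a checksum may straddle a segment boundary. The types and functions below (struct seg, struct sbuf, buf_copy_bits(), buf_store_bits()) are hypothetical and far simpler than an skb; unlike a careful implementation, a failing store may already have written part of the data.

```c
#include <stddef.h>
#include <string.h>

struct seg  { unsigned char *data; size_t len; };
struct sbuf { struct seg *segs; size_t nsegs; };

/* Walk the segments covering [off, off + len) and move bytes either out
 * of the buffer (store == 0) or into it (store == 1). */
static int xfer(const struct sbuf *b, size_t off, unsigned char *mem,
                size_t len, int store)
{
    for (size_t i = 0; i < b->nsegs && len > 0; i++) {
        struct seg *s = &b->segs[i];
        size_t n;

        if (off >= s->len) {       /* range starts in a later segment */
            off -= s->len;
            continue;
        }
        n = s->len - off;
        if (n > len)
            n = len;
        if (store)
            memcpy(s->data + off, mem, n);
        else
            memcpy(mem, s->data + off, n);
        mem += n;
        len -= n;
        off = 0;
    }
    return len ? -1 : 0;           /* -1: range ran past the buffer */
}

/* Read len bytes at off into to. */
int buf_copy_bits(const struct sbuf *b, size_t off, void *to, size_t len)
{
    return xfer(b, off, to, len, 0);
}

/* Write len bytes from from at off; the cast is safe because the
 * store path only reads from mem. */
int buf_store_bits(const struct sbuf *b, size_t off, const void *from,
                   size_t len)
{
    return xfer(b, off, (unsigned char *)from, len, 1);
}
```

Writing a two-byte field at offset 2 of two three-byte segments places one byte at the end of the first segment and one at the start of the second, which a direct pointer write into a single segment would get wrong.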

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2005-04-20 13:30:14 +0800
fd92833a5 [IPV6]: Fix a branch prediction

From: Tushar Gohad

Signed-off-by: Hideaki YOSHIFUJI
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2005-04-20 13:27:09 +0800

17 Apr, 2005

1 commit

1da177e4c Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

Linus Torvalds
2005-04-17 06:20:36 +0800