Eric Lee / smarc-fsl-linux-kernel

08 Aug, 2011

1 commit

d52fbfc9e ipv4: use dst with ref during bcast/mcast loopback

Make sure skb dst has reference when moving to
another context. Currently, I don't see protocols that can
hit it when sending broadcasts/multicasts to loopback using
noref dsts, so it is just a precaution.

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2011-08-08 13:52:32 +0800

03 Aug, 2011

1 commit

f2c31e32b net: fix NULL dereferences in check_peer_redir()

Gergely Kalman reported crashes in check_peer_redir().

It appears commit f39925dbde778 (ipv4: Cache learned redirect
information in inetpeer.) added a race, leading to possible NULL ptr
dereference.

Since we can now change dst neighbour, we should make sure a reader can
safely use a neighbour.

Add RCU protection to dst neighbour, and make sure check_peer_redir()
can be called safely by different cpus in parallel.

As neighbours are already freed after one RCU grace period, this patch
should not add typical RCU penalty (cache cold effects)

Many thanks to Gergely for providing a pretty report pointing to the
bug.

Reported-by: Gergely Kalman
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-08-03 18:34:12 +0800

22 Jul, 2011

1 commit

d9be4f7a6 ipv4: Constrain UFO fragment sizes to multiples of 8 bytes

Because the ip fragment offset field counts 8-byte chunks, ip
fragments other than the last must contain a multiple of 8 bytes of
payload. ip_ufo_append_data wasn't respecting this constraint and,
depending on the MTU and ip option sizes, could create malformed
non-final fragments.

Google-Bug-Id: 5009328
Signed-off-by: Bill Sommerfeld
Signed-off-by: David S. Miller

Bill Sommerfeld
2011-07-22 12:31:41 +0800
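
The 8-byte constraint described in this commit can be illustrated with a small sketch. It is not the kernel's code: the function name and its parameters (`mtu`, `ip_header_len`) are hypothetical, and it only shows the rounding that keeps non-final fragments aligned.

```c
#include <stddef.h>

/* Largest payload, in bytes, that a non-final IP fragment may carry.
 * Hypothetical helper for illustration; not taken from ip_ufo_append_data(). */
static size_t sketch_max_nonfinal_frag_payload(size_t mtu, size_t ip_header_len)
{
    if (mtu <= ip_header_len)
        return 0;
    /* The fragment offset field counts 8-byte units, so round down to a
     * multiple of 8. */
    return (mtu - ip_header_len) & ~(size_t)7;
}
```

With a 1500-byte MTU and a 20-byte header the payload is already aligned (1480). Adding 4 bytes of IP options leaves 1476 bytes of room, which has to be trimmed to 1472.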

18 Jul, 2011

1 commit

69cce1d14 net: Abstract dst->neighbour accesses behind helpers.

dst_{get,set}_neighbour()

Signed-off-by: David S. Miller

David S. Miller
2011-07-18 14:11:35 +0800
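
A rough sketch of what a get/set helper pair of this kind looks like. All names here (`sketch_dst`, `neighbour_priv`, and the `sketch_` functions) are hypothetical stand-ins and the kernel's own definitions may differ; the point is only that callers stop touching the field directly, so its access rules can change in one place.

```c
#include <stddef.h>

struct sketch_neighbour {
    int id;   /* placeholder payload */
};

struct sketch_dst {
    /* Only the two helpers below are meant to touch this member. */
    struct sketch_neighbour *neighbour_priv;
};

static inline struct sketch_neighbour *
sketch_dst_get_neighbour(const struct sketch_dst *dst)
{
    return dst->neighbour_priv;
}

static inline void
sketch_dst_set_neighbour(struct sketch_dst *dst, struct sketch_neighbour *n)
{
    dst->neighbour_priv = n;
}
```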

17 Jul, 2011

2 commits

05e3aa094 net: Create and use new helper, neigh_output().

Signed-off-by: David S. Miller

David S. Miller
2011-07-17 08:26:00 +0800

fec8292d9 ipv4: Use calculated 'neigh' instead of re-evaluating dst->neighbour

Signed-off-by: David S. Miller

David S. Miller
2011-07-17 05:25:54 +0800

14 Jul, 2011

1 commit

f6b72b621 net: Embed hh_cache inside of struct neighbour.

Now that there is a one-to-one correspondence between neighbour
and hh_cache entries, we no longer need:

1) dynamic allocation
2) attachment to dst->hh
3) refcounting

Initialization of the hh_cache entry is indicated by hh_len
being non-zero, and such initialization is always done with
the neighbour's lock held as a writer.

Signed-off-by: David S. Miller

David S. Miller
2011-07-14 22:53:20 +0800
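
The embedding described in this commit can be sketched in user-space C as follows. The struct layouts, the buffer size, and the `sketch_` names are assumptions made for illustration, and the locking the commit mentions is deliberately left out.

```c
#include <string.h>

#define SKETCH_HH_DATA_MAX 32   /* hypothetical buffer size */

struct sketch_hh_cache {
    unsigned short hh_len;                       /* 0 means "not initialized" */
    unsigned char  hh_data[SKETCH_HH_DATA_MAX];  /* cached link-layer header */
};

struct sketch_neighbour {
    /* Embedded by value: no separate allocation, no pointer, no refcount. */
    struct sketch_hh_cache hh;
};

static int sketch_hh_is_initialized(const struct sketch_neighbour *n)
{
    return n->hh.hh_len != 0;
}

/* Returns 0 on success, -1 if the header is empty or does not fit.
 * The commit says initialization happens with the neighbour's lock held
 * as a writer; this sketch omits that lock. */
static int sketch_hh_init(struct sketch_neighbour *n,
                          const unsigned char *hdr, unsigned short len)
{
    if (len == 0 || len > SKETCH_HH_DATA_MAX)
        return -1;
    memcpy(n->hh.hh_data, hdr, len);
    n->hh.hh_len = len;   /* set last: a non-zero length marks it ready */
    return 0;
}
```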

06 Jul, 2011

1 commit

e12fe68ce Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

David S. Miller
2011-07-06 14:23:37 +0800

02 Jul, 2011

1 commit

c146066ab ipv4: Don't use ufo handling on later transformed packets

We might call ip_ufo_append_data() for packets that will be IPsec
transformed later. This function should be used just for real
udp packets. So we check for rt->dst.header_len which is only
nonzero on IPsec handling and call ip_ufo_append_data() just
if rt->dst.header_len is zero.

Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2011-07-02 08:33:19 +0800
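
A minimal sketch of the decision this commit describes. Only the `dst_header_len == 0` test comes from the commit text; the other parameters (`length`, `mtu`, `is_udp`, `dev_has_ufo`) are assumed preconditions added to make the example complete, and the function name is hypothetical.

```c
#include <stddef.h>

/* Decide whether the UFO path may be used for this send.
 * dst_header_len stands in for rt->dst.header_len, which the commit says is
 * non-zero only when IPsec will transform the packet later. */
static int sketch_should_use_ufo(size_t length, size_t mtu,
                                 int is_udp, int dev_has_ufo,
                                 size_t dst_header_len)
{
    if (dst_header_len != 0)
        return 0;   /* packet will be IPsec transformed: skip UFO */
    return is_udp && dev_has_ufo && length > mtu;
}
```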

28 Jun, 2011

2 commits

353e5c9ab ipv4: Fix IPsec slowpath fragmentation problem

ip_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path).
On IPsec the effective mtu is lower because we need to add the protocol
headers and trailers later when we do the IPsec transformations. So after
the IPsec transformations the packet might be too big, which leads to a
slowpath fragmentation then. This patch fixes this by building the packets
based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr
handling to this.

Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2011-06-28 11:34:26 +0800

33f99dc7f ipv4: Fix packet size calculation in __ip_append_data

Git commit 59104f06 (ip: take care of last fragment in ip_append_data)
added a check to see if we exceed the mtu when we add trailer_len.
However, the mtu is already reduced by the trailer length when the
xfrm transformation bundles are set up. So IPsec packets with mtu
size get fragmented, or if the DF bit is set the packets will not
be sent even though they match the mtu perfectly fine. This patch
actually reverts commit 59104f06.

Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2011-06-28 11:34:25 +0800

22 Jun, 2011

1 commit

56f8a75c1 ip: introduce ip_is_fragment helper inline function

There are enough instances of this:

iph->frag_off & htons(IP_MF | IP_OFFSET)

that a helper function is probably warranted.

Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Paul Gortmaker
2011-06-22 11:33:34 +0800
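
For readers outside the kernel tree, the helper can be approximated in user space as below. The `SKETCH_` constants mirror the usual IPv4 flag layout (more-fragments bit 0x2000, 13-bit offset mask 0x1FFF), and `sketch_iphdr` is a cut-down stand-in for the real header struct; both are assumptions of this sketch rather than the kernel's definitions.

```c
#include <stdint.h>
#include <arpa/inet.h>   /* htons() */

#define SKETCH_IP_MF     0x2000   /* "more fragments" flag */
#define SKETCH_IP_OFFSET 0x1FFF   /* fragment offset mask */

struct sketch_iphdr {
    uint16_t frag_off;   /* flags + fragment offset, network byte order */
};

/* True if the packet is a fragment: either more fragments follow, or this
 * one does not start at offset zero. */
static int sketch_ip_is_fragment(const struct sketch_iphdr *iph)
{
    return (iph->frag_off & htons(SKETCH_IP_MF | SKETCH_IP_OFFSET)) != 0;
}
```

A packet with only the don't-fragment bit (0x4000) set is not a fragment under this test, while a first fragment (MF set, offset 0) and a last fragment (MF clear, offset non-zero) both are.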

10 Jun, 2011

1 commit

96d7303e9 ipv4: Fix packet size calculation for raw IPsec packets in __ip_append_data

We assume that transhdrlen is positive on the first fragment
which is wrong for raw packets. So we don't add exthdrlen to the
packet size for raw packets. This leads to a reallocation on IPsec
because we have not enough headroom on the skb to place the IPsec
headers. This patch fixes this by adding exthdrlen to the packet
size whenever the send queue of the socket is empty. This issue was
introduced with git commit 1470ddf7 (inet: Remove explicit write
references to sk/inet in ip_append_data)

Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2011-06-10 05:49:59 +0800

14 May, 2011

1 commit

22f728f8f ipv4: Always call ip_options_build() after rest of IP header is filled in.

This will allow ip_options_build() to reliably look at the values of
iph->{daddr,saddr}

Signed-off-by: David S. Miller

David S. Miller
2011-05-14 05:21:27 +0800

11 May, 2011

1 commit

0a5ebb800 ipv4: Pass explicit daddr arg to ip_send_reply().

This eliminates an access to rt->rt_src.

Signed-off-by: David S. Miller

David S. Miller
2011-05-11 04:32:46 +0800

09 May, 2011

5 commits

f5fca6086 ipv4: Pass flow key down into ip_append_*().

This way rt->rt_dst accesses are unnecessary.

Signed-off-by: David S. Miller

David S. Miller
2011-05-09 12:24:07 +0800

77968b782 ipv4: Pass flow keys down into datagram packet building engine.

This way ip_output.c no longer needs rt->rt_{src,dst}.

We already have these keys sitting, ready and waiting, on the stack or
in a socket structure.

Signed-off-by: David S. Miller

David S. Miller
2011-05-09 12:24:06 +0800

ea4fc0d61 ipv4: Don't use rt->rt_{src,dst} in ip_queue_xmit().

Now we can pick it out of the provided flow key.

Signed-off-by: David S. Miller

David S. Miller
2011-05-09 06:28:28 +0800

d9d8da805 inet: Pass flowi to ->queue_xmit().

This allows us to acquire the exact route keying information from the
protocol, however that might be managed.

It handles all of the possibilities, from the simplest case of storing
the key in inet->cork.fl to the more complex setup SCTP has where
individual transports determine the flow.

Signed-off-by: David S. Miller

David S. Miller
2011-05-09 06:28:28 +0800

b57ae01a8 ipv4: Use cork flow in ip_queue_xmit()

All invokers of ip_queue_xmit() must make certain that the
socket is locked. All of SCTP, TCP, DCCP, and L2TP now make
sure this is the case.

Therefore we can use the cork flow during output route lookup in
ip_queue_xmit() when the socket route check fails.

Signed-off-by: David S. Miller

David S. Miller
2011-05-09 05:05:14 +0800

07 May, 2011

3 commits

706527280 ipv4: Initialize cork->opt using NULL not 0.

Noticed by Joe Perches.

Signed-off-by: David S. Miller

David S. Miller
2011-05-07 07:01:15 +0800

b80d72261 ipv4: Initialize on-stack cork more efficiently.

ip_setup_cork() explicitly initializes every member of
inet_cork except flags, addr, and opt. So we can simply
set those three members to zero instead of using a
memset() via an empty struct assignment.

Signed-off-by: David S. Miller
Acked-by: Eric Dumazet

David S. Miller
2011-05-07 06:37:57 +0800
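
The initialization pattern from this commit, sketched with a made-up struct. `demo_cork` and its members beyond `flags`, `addr`, and `opt` are hypothetical, as is `demo_setup_cork()`; the sketch only shows zeroing the three members the setup routine skips instead of clearing the whole struct first.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_cork {
    unsigned int flags;
    uint32_t     addr;
    void        *opt;
    unsigned int fragsize;   /* hypothetical member filled by setup */
    unsigned int length;     /* hypothetical member filled by setup */
};

/* Stand-in for the setup routine: fills every member except
 * flags, addr, and opt. */
static void demo_setup_cork(struct demo_cork *cork, unsigned int fragsize)
{
    cork->fragsize = fragsize;
    cork->length = 0;
}

static struct demo_cork demo_make_cork(unsigned int fragsize)
{
    struct demo_cork cork;

    /* Zero only what setup leaves untouched; no whole-struct memset. */
    cork.flags = 0;
    cork.addr = 0;
    cork.opt = NULL;
    demo_setup_cork(&cork, fragsize);
    return cork;
}
```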

bdc712b4c inet: Decrease overhead of on-stack inet_cork.

When we fast path datagram sends to avoid locking by putting
the inet_cork on the stack we use up lots of space that isn't
necessary.

This is because inet_cork contains a "struct flowi" which isn't
used in these code paths.

Split inet_cork to two parts, "inet_cork" and "inet_cork_full".
Only the latter of which has the "struct flowi" and is what is
stored in inet_sock.

Signed-off-by: David S. Miller
Acked-by: Eric Dumazet

David S. Miller
2011-05-07 06:37:57 +0800
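
The split can be pictured with the following sketch. The member lists and the 64-byte `sketch_flowi` stand-in are assumptions made for illustration; the kernel's real structs hold different fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for "struct flowi"; the size is an arbitrary assumption. */
struct sketch_flowi {
    uint8_t key[64];
};

/* Small part: what the lockless fast path keeps on the stack. */
struct sketch_cork {
    unsigned int flags;
    uint32_t     addr;
    void        *opt;
    unsigned int fragsize;
};

/* Full part: what the socket stores, including the flow key. */
struct sketch_cork_full {
    struct sketch_cork  base;
    struct sketch_flowi fl;
};

/* Stack bytes saved per call by using the small struct. */
static size_t sketch_stack_saving(void)
{
    return sizeof(struct sketch_cork_full) - sizeof(struct sketch_cork);
}
```

Because `base` is the first member, code that only needs the small part can be handed `&full.base` without any copying.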

05 May, 2011

1 commit

dd927a269 ipv4: In ip_build_and_send_pkt() use 'saddr' and 'daddr' args passed in.

Instead of rt->rt_{dst,src}

The only tricky part is source route option handling.

If the source route option is enabled we can't just use plain 'daddr',
we have to use opt->opt.faddr.

Signed-off-by: David S. Miller

David S. Miller
2011-05-05 03:03:30 +0800
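
The source-route special case mentioned above, as a small sketch. `sketch_opts` is a hypothetical reduction of the options struct (its `srr` and `faddr` members are named after the fields the commit refers to), and the function name is invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_opts {
    int      srr;     /* non-zero when a source route option is present */
    uint32_t faddr;   /* first-hop address taken from the source route */
};

/* Pick the address to put in the IP header's destination field. */
static uint32_t sketch_pick_daddr(const struct sketch_opts *opt, uint32_t daddr)
{
    if (opt != NULL && opt->srr)
        return opt->faddr;   /* source routing: send to the first hop */
    return daddr;            /* common case: use the caller's daddr */
}
```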

04 May, 2011

1 commit

31e4543db ipv4: Make caller provide on-stack flow key to ip_route_output_ports().

Signed-off-by: David S. Miller

David S. Miller
2011-05-04 11:25:42 +0800

29 Apr, 2011

1 commit

f6d8bd051 inet: add RCU protection to inet->opt

We lack proper synchronization to manipulate inet->opt ip_options

Problem is ip_make_skb() calls ip_setup_cork() and
ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options),
without any protection against another thread manipulating inet->opt.

Another thread can change inet->opt pointer and free old one under us.

Use RCU to protect inet->opt (changed to inet->inet_opt).

Instead of handling atomic refcounts, just copy ip_options when
necessary, to avoid cache line dirtying.

We can't insert an rcu_head in struct ip_options since it's included in
skb->cb[], so this patch is large because I had to introduce a new
ip_options_rcu structure.

Signed-off-by: Eric Dumazet
Cc: Herbert Xu
Signed-off-by: David S. Miller

Eric Dumazet
2011-04-29 04:16:35 +0800
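
The wrapper idea can be sketched in plain C. There is no real RCU here: `sketch_rcu_head` is a stand-in type, the field layout of `sketch_ip_options` is invented, and the names are hypothetical. The sketch shows only why a wrapper is used (the options struct keeps its size) and the copy-instead-of-refcount approach the commit describes.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's rcu_head. */
struct sketch_rcu_head {
    struct sketch_rcu_head *next;
    void (*func)(struct sketch_rcu_head *head);
};

/* Stays small because it also has to fit elsewhere (skb->cb[] per the commit). */
struct sketch_ip_options {
    unsigned char optlen;
    unsigned char data[40];   /* IPv4 options occupy at most 40 bytes */
};

/* Wrapper carrying the RCU bookkeeping next to the options. */
struct sketch_ip_options_rcu {
    struct sketch_rcu_head   rcu;
    struct sketch_ip_options opt;
};

/* Recover the wrapper from its rcu member, as a free callback would. */
static struct sketch_ip_options_rcu *
sketch_from_rcu_head(struct sketch_rcu_head *head)
{
    return (struct sketch_ip_options_rcu *)
        ((char *)head - offsetof(struct sketch_ip_options_rcu, rcu));
}

/* Take a private copy rather than a reference; in kernel code this would
 * run inside an RCU read-side critical section. */
static void sketch_copy_options(struct sketch_ip_options *dst,
                                const struct sketch_ip_options_rcu *src)
{
    memcpy(dst, &src->opt, sizeof(*dst));
}
```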

12 Apr, 2011

1 commit

1c01a80cf Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/smsc911x.c

David S. Miller
2011-04-12 04:44:25 +0800

31 Mar, 2011

2 commits

25985edce Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi

Lucas De Marchi
2011-03-31 22:26:23 +0800

538de0e01 ipv4: Use flowi4_init_output() in ip_send_reply()

Signed-off-by: David S. Miller

David S. Miller
2011-03-31 19:53:37 +0800

13 Mar, 2011

5 commits

9cce96df5 net: Put fl4_* macros to struct flowi4 and use them again.

Signed-off-by: David S. Miller

David S. Miller
2011-03-13 07:08:54 +0800

9d6ec9380 ipv4: Use flowi4 in public route lookup interfaces.

Signed-off-by: David S. Miller

David S. Miller
2011-03-13 07:08:48 +0800

6281dcc94 net: Make flowi ports AF dependent.

Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*

This will let us create AF optimal flow instances.

It will work because every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyways.

Signed-off-by: David S. Miller

David S. Miller
2011-03-13 07:08:46 +0800

1d28f42c1 net: Put flowi_* prefix on AF independent members of struct flowi

I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.

Signed-off-by: David S. Miller

David S. Miller
2011-03-13 07:08:44 +0800
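
The direction this commit announces (a union of per-family structs that each begin with a shared part) can be sketched as below. This is a guess at the shape, not the layout that was eventually merged: every type name is hypothetical and the members shown are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* AF-independent members, carrying the flowi_ prefix. */
struct sketch_flowi_common {
    int     flowi_oif;
    uint8_t flowi_proto;
};

struct sketch_flowi4 {
    struct sketch_flowi_common common;   /* must come first */
    uint32_t daddr;
    uint32_t saddr;
};

struct sketch_flowi6 {
    struct sketch_flowi_common common;   /* must come first */
    uint8_t daddr[16];
    uint8_t saddr[16];
};

union sketch_flowi {
    struct sketch_flowi_common common;
    struct sketch_flowi4       ip4;
    struct sketch_flowi6       ip6;
};

/* Works for either family because the common part is always at offset 0. */
static uint8_t sketch_flow_proto(const union sketch_flowi *fl)
{
    return fl->common.flowi_proto;
}
```

An IPv4-only caller can then keep just a `sketch_flowi4` on the stack, which is smaller than the full union.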

78fbfd8a6 ipv4: Create and use route lookup helpers.

The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.

Signed-off-by: David S. Miller

David S. Miller
2011-03-13 07:08:42 +0800

03 Mar, 2011

1 commit

b23dd4fe4 ipv4: Make output route lookup return rtable directly.

Instead of on the stack.

Signed-off-by: David S. Miller

David S. Miller
2011-03-03 06:31:35 +0800

02 Mar, 2011

5 commits

07df5294a inet: Replace left-over references to inet->cork

The patch to replace inet->cork with cork left out two spots in
__ip_append_data that can result in bogus packet construction.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2011-03-02 15:00:58 +0800

273447b35 ipv4: Kill can_sleep arg to ip_route_output_flow()

This boolean state is now available in the flow flags.

Signed-off-by: David S. Miller

David S. Miller
2011-03-02 06:27:04 +0800

420d44daa ipv4: Make final arg to ip_route_output_flow to be boolean "can_sleep"

Since that is what the current vague "flags" argument means.

Signed-off-by: David S. Miller

David S. Miller
2011-03-02 06:19:23 +0800

1c32c5ad6 inet: Add ip_make_skb and ip_finish_skb

This patch adds the helper ip_make_skb which is like ip_append_data
and ip_push_pending_frames all rolled into one, except that it does
not send the skb produced. The sending part is carried out by
ip_send_skb, which the transport protocol can call after it has
tweaked the skb.

It is meant to be called in cases where corking is not used, and should
have a one-to-one correspondence to sendmsg.

This patch also adds the helper ip_finish_skb, which is meant to
replace ip_push_pending_frames when corking is required.
Previously the protocol stack would peek at the socket write
queue and add its header to the first packet. With ip_finish_skb,
the protocol stack can directly operate on the final skb instead,
just like the non-corking case with ip_make_skb.

Signed-off-by: Herbert Xu
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Herbert Xu
2011-03-02 04:35:03 +0800

1470ddf7f inet: Remove explicit write references to sk/inet in ip_append_data

In order to allow simultaneous calls to ip_append_data on the same
socket, it must not modify any shared state in sk or inet (other
than those that are designed to allow that such as atomic counters).

This patch abstracts out write references to sk and inet_sk in
ip_append_data and its friends so that we may use the underlying
code in parallel.

Signed-off-by: Herbert Xu
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Herbert Xu
2011-03-02 04:35:02 +0800