24 Dec, 2009

1 commit

  • Add rtnetlink init_rcvwnd to set the TCP initial receive window size
    advertised by passive and active TCP connections.
    The current Linux TCP implementation limits the advertised TCP initial
    receive window to the one prescribed by slow start. For short-lived
    TCP connections used for transaction-type traffic (e.g. HTTP
    requests), bounding the advertised TCP initial receive window results
    in increased latency to complete the transaction.
    Setting the initial congestion window is already supported via
    rtnetlink init_cwnd, but that feature is of little use without the
    ability to set a larger TCP initial receive window.
    The rtnetlink init_rcvwnd allows the TCP initial receive window to be
    increased, so that a TCP connection can advertise a larger receive
    window than the one bounded by slow start.
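
    A rough userspace model of the effect (not the kernel's actual
    tcp_select_initial_window() code): the metric, expressed in segments,
    overrides the slow-start-derived default. The helper name, the 4*mss
    default, and the clamping rule below are illustrative assumptions only.

    /* Toy model only -- NOT the kernel implementation. */
    #include <stdio.h>

    static unsigned int initial_rcv_wnd(unsigned int mss,
                                        unsigned int window_clamp,
                                        unsigned int init_rcvwnd_segs)
    {
        /* default: roughly what slow start would allow (a few MSS) */
        unsigned int wnd = 4 * mss;

        /* a per-route metric, if set, requests a larger initial window */
        if (init_rcvwnd_segs)
            wnd = init_rcvwnd_segs * mss;

        if (wnd > window_clamp)
            wnd = window_clamp;
        return wnd;
    }

    int main(void)
    {
        printf("default     : %u bytes\n", initial_rcv_wnd(1460, 65535, 0));
        printf("initrwnd=16 : %u bytes\n", initial_rcv_wnd(1460, 65535, 16));
        return 0;
    }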

    Signed-off-by: Laurent Chavey
    Signed-off-by: David S. Miller

    laurent chavey
     

16 Dec, 2009

1 commit

  • The reverted series creates a regression, triggering badness for SYN_RECV
    sockets, for example:

    [19148.022102] Badness at net/ipv4/inet_connection_sock.c:293
    [19148.022570] NIP: c02a0914 LR: c02a0904 CTR: 00000000
    [19148.023035] REGS: eeecbd30 TRAP: 0700 Not tainted (2.6.32)
    [19148.023496] MSR: 00029032 CR: 24002442 XER: 00000000
    [19148.024012] TASK = eee9a820[1756] 'privoxy' THREAD: eeeca000

    This is likely caused by the change in the 'estab' parameter
    passed to tcp_parse_options() when invoked by the functions
    in net/ipv4/tcp_minisocks.c

    But even if that is fixed, the ->conn_request() changes made in
    this patch series are fundamentally wrong. They try to use the
    listening socket's 'dst' to probe the route settings. The
    listening socket doesn't even have a route, and you can't
    get the right route (the child request one) until much later,
    after we set up all of the state, and it must be done by hand.

    This stuff really isn't ready, so the best thing to do is a
    full revert. This reverts the following commits:

    f55017a93f1a74d50244b1254b9a2bd7ac9bbf7d
    022c3f7d82f0f1c68018696f2f027b87b9bb45c2
    1aba721eba1d84a2defce45b950272cee1e6c72a
    cda42ebd67ee5fdf09d7057b5a4584d36fe8a335
    345cda2fd695534be5a4494f1b59da9daed33663
    dc343475ed062e13fc260acccaab91d7d80fd5b2
    05eaade2782fb0c90d3034fd7a7d5a16266182bb
    6a2a2d6bf8581216e08be15fcb563cfd6c430e1e

    Signed-off-by: David S. Miller

    David S. Miller
     

05 Nov, 2009

1 commit


04 Nov, 2009

1 commit

  • This cleanup patch puts struct/union/enum opening braces on the
    first line to ease grep games.

    struct something
    {

    becomes:

    struct something {

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

29 Oct, 2009

1 commit


21 Oct, 2009

1 commit

  • dst_negative_advice() should check for a changed dst and reset
    sk_tx_queue_mapping accordingly. Pass the sock to
    dst_negative_advice() in its callers.

    (sk_reset_txq is defined just for use by dst_negative_advice. The
    only way I could find to get around this is to move
    dst_negative_advice() from dst.h to dst.c, include sock.h in dst.c,
    etc.)
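
    A minimal sketch of the intent (not the literal patch), assuming the
    existing __sk_dst_get() helper and the sk_reset_txq() wrapper
    mentioned above:

    static inline void dst_negative_advice(struct sock *sk)
    {
        struct dst_entry *ndst, *dst = __sk_dst_get(sk);

        if (dst && dst->ops->negative_advice) {
            ndst = dst->ops->negative_advice(dst);

            /* the advice callback may hand back a different dst; if so,
             * the cached tx queue mapping is stale as well */
            if (ndst != dst) {
                sk->sk_dst_cache = ndst;
                sk_reset_txq(sk);   /* clears sk_tx_queue_mapping */
            }
        }
    }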

    Signed-off-by: Krishna Kumar
    Signed-off-by: David S. Miller

    Krishna Kumar
     

02 Sep, 2009

1 commit

  • struct net::ipv6.ip6_dst_ops is separately dynamically allocated,
    but there is no fundamental reason for it. Embed it directly into
    struct netns_ipv6.

    For that:
    * move struct dst_ops into a separate header to fix circular
    dependencies (I honestly tried not to; it's pretty much impossible
    to do any other way)
    * drop the dynamic allocation, allocate together with netns

    For a change, remove struct dst_ops::dst_net; it's deducible
    using container_of() given the dst_ops pointer.
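
    The container_of() trick is easy to see with stub types. The
    standalone example below uses simplified stand-ins for struct net,
    netns_ipv6 and dst_ops, not the real kernel definitions:

    #include <stdio.h>
    #include <stddef.h>

    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    struct dst_ops    { int family; };
    struct netns_ipv6 { struct dst_ops ip6_dst_ops; };
    struct net        { int ns_id; struct netns_ipv6 ipv6; };

    /* once ip6_dst_ops is embedded, the owning netns is recoverable
     * from the dst_ops pointer, so no ->dst_net back-pointer is needed */
    static struct net *net_from_ip6_dst_ops(struct dst_ops *ops)
    {
        return container_of(ops, struct net, ipv6.ip6_dst_ops);
    }

    int main(void)
    {
        struct net ns = { .ns_id = 42 };

        printf("recovered ns_id = %d\n",
               net_from_ip6_dst_ops(&ns.ipv6.ip6_dst_ops)->ns_id);
        return 0;
    }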

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of:
    dst_release(skb->dst);
    skb->dst = NULL;

    Delete skb->dst field
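
    A self-contained sketch of the accessor pattern, using stub types in
    place of the real sk_buff/dst_entry and a trivial dst_release(); the
    kernel's actual definitions differ in detail:

    #include <stdio.h>
    #include <stdlib.h>

    struct dst_entry { int refcnt; };
    struct sk_buff   { struct dst_entry *dst; };

    static void dst_release(struct dst_entry *dst)
    {
        if (dst && --dst->refcnt == 0)
            free(dst);
    }

    static struct dst_entry *skb_dst(const struct sk_buff *skb)
    {
        return skb->dst;
    }

    static void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
    {
        skb->dst = dst;
    }

    /* replaces the open-coded dst_release(skb->dst); skb->dst = NULL; */
    static void skb_dst_drop(struct sk_buff *skb)
    {
        dst_release(skb->dst);
        skb->dst = NULL;
    }

    int main(void)
    {
        struct dst_entry *d = calloc(1, sizeof(*d));
        struct sk_buff skb = { .dst = NULL };

        if (!d)
            return 1;
        d->refcnt = 1;
        skb_dst_set(&skb, d);
        printf("attached: %p\n", (void *)skb_dst(&skb));
        skb_dst_drop(&skb);
        printf("dropped : %p\n", (void *)skb_dst(&skb));
        return 0;
    }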

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

26 Nov, 2008

1 commit

  • Pass netns to xfrm_lookup()/__xfrm_lookup(). For that pass netns
    to flow_cache_lookup() and resolver callback.

    Take it from socket or netdevice. Stub DECnet to init_net.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

17 Nov, 2008

1 commit

  • As found in the past (commit f1dd9c379cac7d5a76259e7dffcd5f8edc697d17
    [NET]: Fix tbench regression in 2.6.25-rc1), it is really
    important that struct dst_entry refcount is aligned on a cache line.

    We cannot use __attribute__((aligned)), so manually pad the structure
    for 32 and 64 bit arches.

    for 32bit: offsetof(struct dst_entry, __refcnt) is 0x80
    for 64bit: offsetof(struct dst_entry, __refcnt) is 0xc0

    As it is not possible to guess the cache line size at compile time,
    we use a generic value of 64 bytes, which satisfies many current arches.
    (Using 128-byte alignment on 64bit arches would waste 64 bytes.)

    Add a BUILD_BUG_ON to catch future updates to "struct dst_entry" that
    would break this alignment.

    "tbench 8" is 4.4 % faster on a dual quad core (HP BL460c G1), Intel E5450 @3.00GHz
    (2350 MB/s instead of 2250 MB/s)
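
    The padding plus BUILD_BUG_ON idea can be demonstrated with a toy
    struct. The layout below is invented for illustration (it is not
    dst_entry's real layout) and the macro is a local stand-in for the
    kernel's BUILD_BUG_ON:

    #include <stdio.h>
    #include <stddef.h>

    #define CACHE_LINE 64
    /* negative array size forces a compile error if cond is true */
    #define BUILD_BUG_ON(cond) extern char build_bug[(cond) ? -1 : 1]

    struct toy_dst {
        void          *next;
        unsigned long  metrics[14];
        /* pad so the hot reference count starts a fresh cache line */
        char           pad[CACHE_LINE -
                           ((sizeof(void *) + 14 * sizeof(unsigned long))
                            % CACHE_LINE)];
        int            __refcnt;
    };

    BUILD_BUG_ON(offsetof(struct toy_dst, __refcnt) % CACHE_LINE != 0);

    int main(void)
    {
        printf("__refcnt offset = %zu\n",
               offsetof(struct toy_dst, __refcnt));
        return 0;
    }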

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

12 Nov, 2008

1 commit


29 Oct, 2008

1 commit


05 Aug, 2008

1 commit

  • dst_input() was doing something completely absurd, looping
    on skb->dst->input() if NET_XMIT_BYPASS was seen, but these
    functions never return such an error.

    And as a result plain ole' NET_XMIT_BYPASS has no more
    references and can be completely killed off.

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Jul, 2008

1 commit

  • Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in
    kernel units (jiffies) and this leaks out through the netlink API to
    user space where the units for jiffies are unknown.

    This patch changes the kernel to convert to/from milliseconds. This
    changes the ABI, but milliseconds seemed like the most natural unit
    for these parameters. Values available via syscall in
    /proc/net/rt_cache and netlink will be in milliseconds.
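
    A trivial standalone illustration of why raw jiffies make a poor ABI
    unit: the same stored number means different real durations depending
    on HZ, whereas milliseconds are self-describing. The helper below is
    a simplified local conversion, not the kernel's jiffies_to_msecs():

    #include <stdio.h>

    static unsigned long to_msecs(unsigned long jiffies, unsigned int hz)
    {
        return jiffies * 1000UL / hz;
    }

    int main(void)
    {
        /* the same exported "25" would mean very different RTTs */
        printf("25 jiffies at HZ=100 : %lu ms\n", to_msecs(25, 100));
        printf("25 jiffies at HZ=1000: %lu ms\n", to_msecs(25, 1000));
        return 0;
    }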

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

28 Mar, 2008

1 commit

  • Codiff stats (allyesconfig, v2.6.24-mm1):
    -16420 187 funcs, 103 +, 16523 -, diff: -16420 --- dst_release

    Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
    -7257 186 funcs, 70 +, 7327 -, diff: -7257 --- dst_release
    dst_release | +40

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

13 Mar, 2008

1 commit

  • Compared with kernel 2.6.24, the tbench result has a regression with
    2.6.25-rc1.

    1) On a stoakley with 2 quad-core processors: 4%.
    2) On a tigerton with 4 quad-core processors: more than 30%.

    Bisect located the patch below.

    b4ce92775c2e7ff9cf79cca4e0a19c8c5fd6287b is first bad commit
    commit b4ce92775c2e7ff9cf79cca4e0a19c8c5fd6287b
    Author: Herbert Xu
    Date: Tue Nov 13 21:33:32 2007 -0800

    [IPV6]: Move nfheader_len into rt6_info

    The dst member nfheader_len is only used by IPv6. It's also currently
    creating a rather ugly alignment hole in struct dst. Therefore this patch
    moves it from there into struct rt6_info.

    The above patch changes the cache line alignment, especially of member
    __refcnt. I did a test by adding 2 unsigned long paddings before
    lastuse, so the 3 members lastuse/__refcnt/__use are moved to the next
    cache line. The performance is recovered.

    I created a patch to rearrange the members in struct dst_entry.

    With Eric's and Valdis Kletnieks's suggestions, I made a finer arrangement.

    1) Move tclassid under ops in case CONFIG_NET_CLS_ROUTE=y, so
    sizeof(dst_entry)=200 whether CONFIG_NET_CLS_ROUTE is y or n. I
    tested many patches on my 16-core tigerton by moving tclassid to
    different places. It looks like tclassid could also have an impact on
    performance. If tclassid is moved before metrics, or not moved at
    all, the performance isn't good. So I moved it behind metrics.

    2) Add comments before __refcnt.

    On 16-core tigerton:

    If CONFIG_NET_CLS_ROUTE=y, the result with the patch below is about 18%
    better than the one without the patch;

    If CONFIG_NET_CLS_ROUTE=n, the result with the patch below is about 30%
    better than the one without the patch.

    With 32bit 2.6.25-rc1 on 8-core stoakley, the new patch doesn't
    introduce regression.

    Thank Eric, Valdis, and David!

    Signed-off-by: Zhang Yanmin
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Zhang Yanmin
     

29 Jan, 2008

9 commits

  • On x86_64, sizeof(struct rtable) is 0x148, which is rounded up to
    0x180 bytes by SLAB allocator.

    We can reduce this to exactly 0x140 bytes, without alignment overhead,
    and store 12 struct rtable per PAGE instead of 10.

    rate_tokens is currently defined as an "unsigned long", while its
    content should not exceed 6*HZ. It can safely be converted to an
    unsigned int.

    Moving tclassid right after rate_tokens to fill the 4-byte hole
    saves 8 bytes on 'struct dst_entry', which in turn saves 8 bytes
    on 'struct rtable'.
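
    The two layout tricks (narrowing a field that never needs 64 bits,
    then sliding another field into the hole it leaves) can be seen with
    made-up structs; the printed sizes on x86_64 show the 8-byte saving.
    These structs are illustrative, not the real rtable/dst_entry:

    #include <stdio.h>

    struct before {
        void          *peer;
        unsigned long  rate_tokens;  /* only ever holds <= 6*HZ */
        unsigned int   tclassid;     /* leaves a 4-byte hole */
        void          *hh;
    };

    struct after {
        void          *peer;
        unsigned int   rate_tokens;  /* now 32 bits */
        unsigned int   tclassid;     /* fills the hole next to it */
        void          *hh;
    };

    int main(void)
    {
        printf("before: %zu bytes, after: %zu bytes\n",
               sizeof(struct before), sizeof(struct after));
        return 0;
    }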

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The network namespace pointer can be stored in the dst_ops structure.
    This is useful when there are multiple instances of the dst_ops for a
    protocol. When there is only a single instance, this field is simply
    never used by the protocol, so there is no impact on the protocols
    which do not implement network namespaces.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Daniel Lezcano
     
  • The garbage collection function receives the dst_ops structure as a
    parameter. This is useful for the next incoming patchset because it
    will need the dst_ops (there will be several instances) and the
    network namespace pointer (contained in the dst_ops).

    The protocols which do not take care of the namespaces will not be
    impacted by this change (except for the function signature); they
    just ignore the parameter.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Daniel Lezcano
     
  • The info placeholder member of dst_entry seems to be unused in the
    network stack.

    Signed-off-by: Rami Rosen
    Signed-off-by: David S. Miller

    Rami Rosen
     
  • RFC 4301 requires us to relookup ICMP traffic that does not match any
    policies using the reverse of its payload. This patch implements this
    for ICMP traffic that originates from or terminates on localhost.

    This is activated on outbound with the new policy flag XFRM_POLICY_ICMP,
    and on inbound by the new state flag XFRM_STATE_ICMP.

    On inbound the policy check is now performed by the ICMP protocol so
    that it can repeat the policy check where necessary.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch introduces an enum for bits in the flags argument of xfrm_lookup.
    This is so that we can cram more information into it later.

    Since all current users use just the values 0 and 1, XFRM_LOOKUP_WAIT has
    been added with the value 1 << 0 to represent the current meaning of flags.

    The test in __xfrm_lookup has been changed accordingly.
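
    In other words, the old boolean flags value 1 now has a name; a
    minimal sketch (the exact declaration site in the kernel headers may
    differ):

    #include <stdio.h>

    enum {
        XFRM_LOOKUP_WAIT = 1 << 0,  /* previously the literal "1" */
    };

    int main(void)
    {
        int flags = XFRM_LOOKUP_WAIT;

        if (flags & XFRM_LOOKUP_WAIT)
            printf("caller is prepared to wait for resolution\n");
        return 0;
    }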

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • As part of the work on asynchronous cryptographic operations, we need
    to be able to resume from the spot where they occur. As such, it
    helps if we isolate them to one spot.

    This patch moves most of the remaining family-specific processing into
    the common output code.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • We have a number of copies of dst_discard scattered around the place
    which all do the same thing, namely free a packet on the input or
    output paths.

    This patch deletes all of them except dst_discard and points all the
    users to it.

    The only non-trivial bit is decnet where it returns an error.
    However, conceptually this is identical to the blackhole functions
    used in IPv4 and IPv6 which do not return errors. So they should
    either all return errors or all return zero. For now I've stuck with
    the majority and picked zero as the return value.

    It doesn't really matter in practice since few if any drivers would
    react differently depending on a zero return value or NET_RX_DROP.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • The dst member nfheader_len is only used by IPv6. It's also currently
    creating a rather ugly alignment hole in struct dst. Therefore this patch
    moves it from there into struct rt6_info.

    It also reorders the fields in rt6_info to minimize holes.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

11 Nov, 2007

1 commit


11 Jul, 2007

1 commit


25 May, 2007

1 commit

  • The current IPSEC rule resolution behavior we have does not work for a
    lot of people, even though technically it's an improvement from the
    -EAGAIN business we had before.

    Right now we'll block until the key manager resolves the route. That
    works for simple cases, but many folks would rather packets get
    silently dropped until the key manager resolves the IPSEC rules.

    We can't tell these folks to "set the socket non-blocking" because
    they don't have control over the non-block setting of things like the
    sockets used to resolve DNS deep inside of the resolver libraries in
    libc.

    With that in mind I coded up the patch below with some help from
    Herbert Xu which provides packet-drop behavior during larval state
    resolution, controllable via sysctl and off by default.

    This lays the framework to either:

    1) Make this default at some point or...

    2) Move this logic into xfrm{4,6}_policy.c and implement the
    ARP-like resolution queue we've all been dreaming of.
    The idea would be to queue packets to the policy, then
    once the larval state is resolved by the key manager we
    re-resolve the route and push the packets out. The
    packets would timeout if the rule didn't get resolved
    in a certain amount of time.

    Signed-off-by: David S. Miller

    David S. Miller
     

11 Feb, 2007

2 commits

  • This last patch (but not least :) ) finally moves the next pointer to
    the end of struct dst_entry. This permits route cache lookups to be
    performed at a minimal cost of one cache line per entry, instead of
    two.

    Both 32bits and 64bits platforms benefit from this new layout.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This patch introduces an anonymous union to nicely express the fact
    that all objects inherited from struct dst_entry should access the
    generic 'next' pointer, but with appropriate type verification.

    This patch is a prerequisite for the following patches.
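
    A compilable toy version of the pattern: one piece of storage,
    several typed views of the same 'next' pointer. The derived types
    here are stand-ins, not the real rtable/rt6_info:

    #include <stdio.h>

    struct rtable;
    struct rt6_info;

    struct dst_entry {
        union {
            struct dst_entry *next;     /* generic view */
            struct rtable    *rt_next;  /* IPv4 view    */
            struct rt6_info  *rt6_next; /* IPv6 view    */
        };
    };

    struct rtable   { struct dst_entry dst; int v4_data; };
    struct rt6_info { struct dst_entry dst; int v6_data; };

    int main(void)
    {
        struct rtable a = { .v4_data = 1 }, b = { .v4_data = 2 };

        a.dst.rt_next = &b;               /* typed access ...  */
        printf("generic next = %p\n",     /* ... same storage  */
               (void *)a.dst.next);
        printf("&b           = %p\n", (void *)&b);
        return 0;
    }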

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Dec, 2006

1 commit

  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
        quilt add $file
        sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
        mv /tmp/$$ $file
        quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

29 Sep, 2006

1 commit


23 Sep, 2006

1 commit

  • For locally originated outbound IPv6 packets which will fragment,
    ip6_append_data() should know the length of the extension headers
    before sending them, and that length is carried by the dst_entry.
    IPv6 IPsec headers are subject to fragmentation, so the
    transformation was designed to place all headers after the fragment
    header. OTOH, Mobile IPv6 extension headers do not fragment, so it
    is a good idea to make the dst_entry carry the non-fragmented header
    length to tell ip6_append_data() about it.

    Signed-off-by: Masahide NAKAMURA
    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    Masahide NAKAMURA
     

26 Apr, 2006

1 commit


08 Jan, 2006

1 commit

  • Call netfilter hooks before IPsec transforms. Packets visit the
    FORWARD/LOCAL_OUT and POST_ROUTING hooks before the first
    encapsulation, and the LOCAL_OUT and POST_ROUTING hooks before each
    following tunnel-mode transform.

    Patch from Herbert Xu :

    Move the loop from dst_output into xfrm4_output/xfrm6_output since
    they're the only ones who need it. xfrm{4,6}_output_one() processes
    the first SA and all subsequent transport mode SAs, and is called in
    a loop that calls the netfilter hooks between each two calls.

    In order to avoid the tail call issue, I've added the inline function
    nf_hook which is nf_hook_slow plus the empty list check.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

04 Jan, 2006

1 commit


26 Oct, 2005

1 commit

  • Now that we've switched over to storing MTUs in the xfrm_dst entries,
    we no longer need the dst's get_mss methods. This patch gets rid of
    them.

    It also documents the fact that our MTU calculation is not optimal
    for ESP.

    Signed-off-by: Herbert Xu
    Signed-off-by: Arnaldo Carvalho de Melo

    Herbert Xu
     

20 Apr, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds