09 Mar, 2013
1 commit
-
This patch requires multicast interface-scoped addresses to supply a
sin6_scope_id. Because the sin6_scope_id is now also correctly used
in case of interface-scoped multicast traffic this enables one to use
interface scoped addresses over interfaces which are not targeted by the
default multicast route (the route has to be put there manually, though).getsockname() and getpeername() now return the correct sin6_scope_id in
case of interface-local mc addresses.v2:
a) rebased ontop of patch 1/4 (now uses ipv6_addr_props)v3:
a) reverted changes for ipv6_addr_propsv4:
a) unchangedCc: YOSHIFUJI Hideaki
Acked-by: YOSHIFUJI Hideaki
Signed-off-by: Hannes Frederic Sowa
Acked-by: YOSHIFUJI Hideaki dave
Signed-off-by: David S. Miller
06 Mar, 2013
1 commit
-
We must try harder to get unique (addr, port) pairs when
doing port autoselection for sockets with SO_REUSEADDR
option set.This is a continuation of commit aacd9289af8b82f5fb01bcdd53d0e3406d1333c7
for IPv6.Signed-off-by: Flavio Leitner
Signed-off-by: David S. Miller
28 Feb, 2013
1 commit
-
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;type T;
expression a,c,d,e;
identifier b;
statement S;
@@-T b;
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin
Acked-by: Paul E. McKenney
Signed-off-by: Sasha Levin
Cc: Wu Fengguang
Cc: Marcelo Tosatti
Cc: Gleb Natapov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
30 Jan, 2013
1 commit
-
When attempting to build linux-next with user namespaces enabled I ran
into this fun build error.CC net/ipv6/inet6_connection_sock.o
.../net/ipv6/inet6_connection_sock.c: In function ‘inet6_csk_bind_conflict’:
.../net/ipv6/inet6_connection_sock.c:37:12: error: incompatible types when initializing type ‘int’ using
type ‘kuid_t’
.../net/ipv6/inet6_connection_sock.c:54:30: error: incompatible type for argument 1 of ‘uid_eq’
.../include/linux/uidgid.h:48:20: note: expected ‘kuid_t’ but argument is of type ‘int’
make[3]: *** [net/ipv6/inet6_connection_sock.o] Error 1
make[2]: *** [net/ipv6] Error 2
make[2]: *** Waiting for unfinished jobs....Using kuid_t instead of int to hold the uid fixes this.
Cc: Tom Herbert
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
24 Jan, 2013
1 commit
-
Motivation for soreuseport would be something like a web server
binding to port 80 running with multiple threads, where each thread
might have it's own listener socket. This could be done as an
alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads. In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }, wakeup does not promote fairness
among the sockets. We have seen the disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest. With so_reusport the distribution is
uniform.Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller
21 Nov, 2012
1 commit
-
In case of error, inet6_csk_update_pmtu() should consistently
return NULL.Bug added in commit 35ad9b9cf7d8a
(ipv6: Add helper inet6_csk_update_pmtu().)Reported-by: Lluís Batlle i Rossell
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
19 Sep, 2012
1 commit
-
IPv6 dst should take care of rt_genid too. When a xfrm policy is inserted or
deleted, all dst should be invalidated.
To force the validation, dst entries should be created with ->obsolete set to
DST_OBSOLETE_FORCE_CHK. This was already the case for all functions calling
ip6_dst_alloc(), except for ip6_rt_copy().As a consequence, we can remove the specific code in inet6_connection_sock.
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
18 Jul, 2012
1 commit
-
We should provide to inet6_csk_route_socket a struct flowi6 pointer,
so that net6_csk_xmit() works correctly instead of sending garbage.Also add some consts
Signed-off-by: Eric Dumazet
Reported-by: Yuchung Cheng
Cc: Neal Cardwell
Signed-off-by: David S. Miller
17 Jul, 2012
1 commit
-
This will be used so that we can compose a full flow key.
Even though we have a route in this context, we need more. In the
future the routes will be without destination address, source address,
etc. keying. One ipv4 route will cover entire subnets, etc.In this environment we have to have a way to possess persistent storage
for redirects and PMTU information. This persistent storage will exist
in the FIB tables, and that's why we'll need to be able to rebuild a
full lookup flow key here. Using that flow key will do a fib_lookup()
and create/update the persistent entry.Signed-off-by: David S. Miller
16 Jul, 2012
1 commit
-
This is the ipv6 version of inet_csk_update_pmtu().
Signed-off-by: David S. Miller
29 Jun, 2012
2 commits
-
This commit changes inet_csk_route_req() so that it uses a pointer to
a struct flowi6, rather than allocating its own on the stack. This
brings its behavior in line with its IPv4 cousin,
inet_csk_route_req(), and allows a follow-on patch to fix a dst leak.Signed-off-by: Neal Cardwell
Signed-off-by: David S. Miller -
Fix inet6_csk_route_req() to use as the flowi6_oif the treq->iif,
which is correctly fixed up in tcp_v6_conn_request() to handle the
case of link-local addresses. This brings it in line with the
tcp_v6_send_synack() code, which is already correctly using the
treq->iif in this way.Signed-off-by: Neal Cardwell
Signed-off-by: David S. Miller
15 Apr, 2012
1 commit
-
We must try harder to get unique (addr, port) pairs when
doing port autoselection for sockets with SO_REUSEADDR
option set.We achieve this by adding a relaxation parameter to
inet_csk_bind_conflict. When 'relax' parameter is off
we return a conflict whenever the current searched
pair (addr, port) is not unique.This tries to address the problems reported in patch:
8d238b25b1ec22a73b1c2206f111df2faaff8285
Revert "tcp: bind() fix when many ports are bound"Tests where ran for creating and binding(0) many sockets
on 100 IPs. The results are, on average:* 60000 sockets, 600 ports / IP:
* 0.210 s, 620 (IP, port) duplicates without patch
* 0.219 s, no duplicates with patch
* 100000 sockets, 1000 ports / IP:
* 0.371 s, 1720 duplicates without patch
* 0.373 s, no duplicates with patch
* 200000 sockets, 2000 ports / IP:
* 0.766 s, 6900 duplicates without patch
* 0.768 s, no duplicates with patch
* 500000 sockets, 5000 ports / IP:
* 2.227 s, 41500 duplicates without patch
* 2.284 s, no duplicates with patchSigned-off-by: Alex Copot
Signed-off-by: Daniel Baluta
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
27 Nov, 2011
1 commit
-
Conflicts:
net/ipv4/inet_diag.c
24 Nov, 2011
1 commit
-
commit 72a3effaf633bc ([NET]: Size listen hash tables using backlog
hint) added a bug allowing inet6_synq_hash() to return an out of bound
array index, because of u16 overflow.Bug can happen if system admins set net.core.somaxconn &
net.ipv4.tcp_max_syn_backlog sysctls to values greater than 65536Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
23 Nov, 2011
1 commit
-
C assignment can handle struct in6_addr copying.
Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller
27 Oct, 2011
1 commit
-
commit 66b13d99d96a (ipv4: tcp: fix TOS value in ACK messages sent from
TIME_WAIT) fixed IPv4 only.This part is for the IPv6 side, adding a tclass param to ip6_xmit()
We alias tw_tclass and tw_tos, if socket family is INET6.
[ if sockets is ipv4-mapped, only IP_TOS socket option is used to fill
TOS field, TCLASS is not taken into account ]Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
01 Aug, 2011
1 commit
-
Use RCU to avoid changing dst_entry refcount in fast path.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
09 May, 2011
1 commit
-
This allows us to acquire the exact route keying information from the
protocol, however that might be managed.It handles all of the possibilities, from the simplest case of storing
the key in inet->cork.fl to the more complex setup SCTP has where
individual transports determine the flow.Signed-off-by: David S. Miller
14 Apr, 2011
1 commit
-
This reverts commit c191a836a908d1dd6b40c503741f91b914de3348.
It causes known regressions for programs that expect to be able to use
SO_REUSEADDR to shutdown a socket, then successfully rebind another
socket to the same ID.Programs such as haproxy and amavisd expect this to work.
This should fix kernel bugzilla 32832.
Signed-off-by: David S. Miller
13 Mar, 2011
4 commits
-
Signed-off-by: David S. Miller
-
Signed-off-by: David S. Miller
-
Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*This will let us to create AF optimal flow instances.
It will work because every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyways.Signed-off-by: David S. Miller
-
I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.This is the first step to move in that direction.
Signed-off-by: David S. Miller
02 Mar, 2011
1 commit
-
Route lookups follow a general pattern in the ipv6 code wherein
we first find the non-IPSEC route, potentially override the
flow destination address due to ipv6 options settings, and then
finally make an IPSEC search using either xfrm_lookup() or
__xfrm_lookup().__xfrm_lookup() is used when we want to generate a blackhole route
if the key manager needs to resolve the IPSEC rules (in this case
-EREMOTE is returned and the original 'dst' is left unchanged).Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC
resolution is necessary, we simply fail the lookup completely.All of these cases are encapsulated into two routines,
ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow. The latter of which
handles unconnected UDP datagram sockets.Signed-off-by: David S. Miller
12 Jan, 2011
1 commit
-
inet_csk_bind_conflict() logic currently disallows a bind() if
it finds a friend socket (a socket bound on same address/port)
satisfying a set of conditions :1) Current (to be bound) socket doesnt have sk_reuse set
OR
2) other socket doesnt have sk_reuse set
OR
3) other socket is in LISTEN stateWe should add the CLOSE state in the 3) condition, in order to avoid two
REUSEADDR sockets in CLOSE state with same local address/port, since
this can deny further operations.Note : a prior patch tried to address the problem in a different (and
buggy) way. (commit fda48a0d7a8412ced tcp: bind() fix when many ports
are bound).Reported-by: Gaspar Chilingarov
Reported-by: Daniel Baluta
Tested-by: Daniel Baluta
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
03 Dec, 2010
1 commit
-
Brother of ipv4's inet_csk_route_req().
Signed-off-by: David S. Miller
29 Nov, 2010
1 commit
-
jhash is widely used in the kernel and because the functions
are inlined, the cost in size is significant. Also, the new jhash
functions are slightly larger than the previous ones so better un-inline.
As a preparation step, the calls to the internal macros are replaced
with the plain jhash function calls.Signed-off-by: Jozsef Kadlecsik
Signed-off-by: David S. Miller
02 Jun, 2010
1 commit
-
There are more than a dozen occurrences of following code in the
IPv6 stack:if (opt && opt->srcrt) {
struct rt0_hdr *rt0 = (struct rt0_hdr *) opt->srcrt;
ipv6_addr_copy(&final, &fl.fl6_dst);
ipv6_addr_copy(&fl.fl6_dst, rt0->addr);
final_p = &final;
}Replace those with a helper. Note that the helper overrides final_p
in all cases. This is ok as final_p was previously initialized to
NULL when declared.Signed-off-by: Arnaud Ebalard
Signed-off-by: David S. Miller
03 May, 2010
1 commit
29 Apr, 2010
1 commit
-
This reverts two commits:
fda48a0d7a8412cedacda46a9c0bf8ef9cd13559
tcp: bind() fix when many ports are boundand a follow-on fix for it:
6443bb1fc2050ca2b6585a3fa77f7833b55329ed
ipv6: Fix inet6_csk_bind_conflict()It causes problems with binding listening sockets when time-wait
sockets from a previous instance still are alive.It's too late to keep fiddling with this so late in the -rc
series, and we'll deal with it in net-next-2.6 instead.Signed-off-by: David S. Miller
28 Apr, 2010
1 commit
-
Conflicts:
drivers/net/e100.c
drivers/net/e1000e/netdev.c
26 Apr, 2010
1 commit
-
Commit fda48a0d7a84 (tcp: bind() fix when many ports are bound)
introduced a bug on IPV6 part.
We should not call ipv6_addr_any(inet6_rcv_saddr(sk2)) but
ipv6_addr_any(inet6_rcv_saddr(sk)) because sk2 can be IPV4, while sk is
IPV6.Reported-by: Michael S. Tsirkin
Signed-off-by: Eric Dumazet
Tested-by: Michael S. Tsirkin
Signed-off-by: David S. Miller
23 Apr, 2010
1 commit
-
Port autoselection done by kernel only works when number of bound
sockets is under a threshold (typically 30000).When this threshold is over, we must check if there is a conflict before
exiting first loop in inet_csk_get_port()Change inet_csk_bind_conflict() to forbid two reuse-enabled sockets to
bind on same (address,port) tuple (with a non ANY address)Same change for inet6_csk_bind_conflict()
Reported-by: Gaspar Chilingarov
Signed-off-by: Eric Dumazet
Acked-by: Evgeniy Polyakov
Signed-off-by: David S. Miller
16 Apr, 2010
1 commit
-
As Herbert Xu said: we should be able to simply replace ipfragok
with skb->local_df. commit f88037(sctp: Drop ipfargok in sctp_xmit function)
has droped ipfragok and set local_df value properly.The patch kills the ipfragok parameter of .queue_xmit().
Signed-off-by: Shan Wei
Signed-off-by: David S. Miller
30 Mar, 2010
1 commit
-
…it slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
21 Oct, 2009
1 commit
-
IPv6: Reset sk_tx_queue_mapping when dst_cache is reset. Use existing
macro to do the work.Signed-off-by: Krishna Kumar
Signed-off-by: David S. Miller
19 Oct, 2009
1 commit
-
In order to have better cache layouts of struct sock (separate zones
for rx/tx paths), we need this preliminary patch.Goal is to transfert fields used at lookup time in the first
read-mostly cache line (inside struct sock_common) and move sk_refcnt
to a separate cache line (only written by rx path)This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
sport and id fields. This allows a future patch to define these
fields as macros, like sk_refcnt, without name clashes.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
07 Oct, 2009
1 commit
-
Atis Elsts wrote:
> Not sure if there is need to fill the mark from skb in tunnel xmit functions. In any case, it's not done for GRE or IPIP tunnels at the moment.Ok, I'll just drop that part, I'm not sure what should be done in this case.
> Also, in this patch you are doing that for SIT (v6-in-v4) tunnels only, and not doing it for v4-in-v6 or v6-in-v6 tunnels. Any reason for that?
I just sent that patch out too quickly, here's a better one with the updates.
Add support for IPv6 route lookups using sk_mark.
Signed-off-by: Brian Haley
Signed-off-by: David S. Miller
03 Jun, 2009
1 commit
-
Define three accessors to get/set dst attached to a skb
struct dst_entry *skb_dst(const struct sk_buff *skb)
void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;Delete skb->dst field
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller