13 Jan, 2012
1 commit
-
commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
y).We miss needed barriers, even on x86, when y is not NULL.
Signed-off-by: Eric Dumazet
CC: Stephen Hemminger
CC: Paul E. McKenney
Signed-off-by: David S. Miller
23 Nov, 2011
1 commit
-
C assignment can handle struct in6_addr copying.
Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller
19 Nov, 2011
1 commit
-
ipv6: Remove all uses of LL_ALLOCATED_SPACE
The macro LL_ALLOCATED_SPACE was ill-conceived. It applies the
alignment to the sum of needed_headroom and needed_tailroom. As
the amount that is then reserved for head room is needed_headroom
with alignment, this means that the tail room left may be too small.This patch replaces all uses of LL_ALLOCATED_SPACE in net/ipv6
with the macro LL_RESERVED_SPACE and direct reference to
needed_tailroom.This also fixes the problem with needed_headroom changing between
allocating the skb and reserving the head room.Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
10 Nov, 2011
1 commit
-
Le lundi 07 novembre 2011 à 15:33 +0100, Eric Dumazet a écrit :
> At least, in recent kernels we dont change dst->refcnt in forwarding
> patch (usinf NOREF skb->dst)
>
> One particular point is the atomic_inc(dst->refcnt) we have to perform
> when queuing an UDP packet if socket asked PKTINFO stuff (for example a
> typical DNS server has to setup this option)
>
> I have one patch somewhere that stores the information in skb->cb[] and
> avoid the atomic_{inc|dec}(dst->refcnt).
>OK I found it, I did some extra tests and believe its ready.
[PATCH net-next] ipv4: IP_PKTINFO doesnt need dst reference
When a socket uses IP_PKTINFO notifications, we currently force a dst
reference for each received skb. Reader has to access dst to get needed
information (rt_iif & rt_spec_dst) and must release dst reference.We also forced a dst reference if skb was put in socket backlog, even
without IP_PKTINFO handling. This happens under stress/load.We can instead store the needed information in skb->cb[], so that only
softirq handler really access dst, improving cache hit ratios.This removes two atomic operations per packet, and false sharing as
well.On a benchmark using a mono threaded receiver (doing only recvmsg()
calls), I can reach 720.000 pps instead of 570.000 pps.IP_PKTINFO is typically used by DNS servers, and any multihomed aware
UDP application.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
01 Nov, 2011
1 commit
-
These files are non modular, but need to export symbols using
the macros now living in export.h -- call out the include so
that things won't break when we remove the implicit presence
of module.h from everywhere.Signed-off-by: Paul Gortmaker
19 Oct, 2011
1 commit
-
ip6_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path).
On IPsec the effective mtu is lower because we need to add the protocol
headers and trailers later when we do the IPsec transformations. So after
the IPsec transformations the packet might be too big, which leads to a
slowpath fragmentation then. This patch fixes this by building the packets
based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr
handling to this.Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller
22 Sep, 2011
1 commit
-
Conflicts:
MAINTAINERS
drivers/net/Kconfig
drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
drivers/net/ethernet/broadcom/tg3.c
drivers/net/wireless/iwlwifi/iwl-pci.c
drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c
drivers/net/wireless/rt2x00/rt2800usb.c
drivers/net/wireless/wl12xx/main.c
31 Aug, 2011
1 commit
-
Allow transparent sockets to be less restrictive about
the source ip of ipv6 udp packets being sent.Google-Bug-Id: 5018138
Signed-off-by: Maciej Żenczykowski
CC: "Erik Kline"
CC: "Lorenzo Colitti"
Signed-off-by: David S. Miller
12 Aug, 2011
1 commit
-
RCU api had been completed and rcu_access_pointer() or
rcu_dereference_protected() are better than generic
rcu_dereference_raw()Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
02 Aug, 2011
1 commit
-
When assigning a NULL value to an RCU protected pointer, no barrier
is needed. The rcu_assign_pointer, used to handle that but will soon
change to not handle the special case.Convert all rcu_assign_pointer of NULL value.
//smpl
@@ expression P; @@- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)//
Signed-off-by: Stephen Hemminger
Acked-by: Paul E. McKenney
Signed-off-by: David S. Miller
02 Jul, 2011
1 commit
-
Make the case labels the same indent as the switch.
git diff -w shows 80 column reflowing,
removal of a useless break after return, and moving
open brace after case instead of separate line.Signed-off-by: Joe Perches
Signed-off-by: David S. Miller
24 May, 2011
1 commit
-
The %pK format specifier is designed to hide exposed kernel pointers,
specifically via /proc interfaces. Exposing these pointers provides an
easy target for kernel write vulnerabilities, since they reveal the
locations of writable structures containing easily triggerable function
pointers. The behavior of %pK depends on the kptr_restrict sysctl.If kptr_restrict is set to 0, no deviation from the standard %p behavior
occurs. If kptr_restrict is set to 1, the default, if the current user
(intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
(currently in the LSM tree), kernel pointers using %pK are printed as 0's.
If kptr_restrict is set to 2, kernel pointers using %pK are printed as
0's regardless of privileges. Replacing with 0's was chosen over the
default "(null)", which cannot be parsed by userland %p, which expects
"(nil)".The supporting code for kptr_restrict and %pK are currently in the -mm
tree. This patch converts users of %p in net/ to %pK. Cases of printing
pointers to the syslog are not covered, since this would eliminate useful
information for postmortem debugging and the reading of the syslog is
already optionally protected by the dmesg_restrict sysctl.Signed-off-by: Dan Rosenberg
Cc: James Morris
Cc: Eric Dumazet
Cc: Thomas Graf
Cc: Eugene Teo
Cc: Kees Cook
Cc: Ingo Molnar
Cc: David S. Miller
Cc: Peter Zijlstra
Cc: Eric Paris
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller
07 May, 2011
1 commit
-
When we fast path datagram sends to avoid locking by putting
the inet_cork on the stack we use up lots of space that isn't
necessary.This is because inet_cork contains a "struct flowi" which isn't
used in these code paths.Split inet_cork to two parts, "inet_cork" and "inet_cork_full".
Only the latter of which has the "struct flowi" and is what is
stored in inet_sock.Signed-off-by: David S. Miller
Acked-by: Eric Dumazet
23 Apr, 2011
1 commit
-
Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers
where possible, to make code intention more obvious.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
13 Mar, 2011
4 commits
-
Signed-off-by: David S. Miller
-
Signed-off-by: David S. Miller
-
Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*This will let us to create AF optimal flow instances.
It will work because every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyways.Signed-off-by: David S. Miller
-
I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.This is the first step to move in that direction.
Signed-off-by: David S. Miller
02 Mar, 2011
1 commit
-
Route lookups follow a general pattern in the ipv6 code wherein
we first find the non-IPSEC route, potentially override the
flow destination address due to ipv6 options settings, and then
finally make an IPSEC search using either xfrm_lookup() or
__xfrm_lookup().__xfrm_lookup() is used when we want to generate a blackhole route
if the key manager needs to resolve the IPSEC rules (in this case
-EREMOTE is returned and the original 'dst' is left unchanged).Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC
resolution is necessary, we simply fail the lookup completely.All of these cases are encapsulated into two routines,
ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow. The latter of which
handles unconnected UDP datagram sockets.Signed-off-by: David S. Miller
05 Feb, 2011
1 commit
04 Feb, 2011
1 commit
-
Signed-off-by: David S. Miller
21 Jan, 2011
1 commit
-
Remove sparse warnings, using a function typedef to be able to use __rcu
annotation on mh_filter pointer.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
26 Oct, 2010
1 commit
-
Add __rcu annotation to :
(struct sock)->sk_filterAnd use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=ySigned-off-by: Eric Dumazet
Signed-off-by: David S. Miller
24 Sep, 2010
1 commit
-
Change "return (EXPR);" to "return EXPR;"
return is not a function, parentheses are not required.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
11 Jun, 2010
1 commit
-
remove useless union keyword in rtable, rt6_info and dn_route.
Since there is only one member in a union, the union keyword isn't useful.
Signed-off-by: Changli Gao
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
07 Jun, 2010
1 commit
-
Avoid two atomic ops per raw_send_hdrinc() call
Avoid two atomic ops per raw6_send_hdrinc() call
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
02 Jun, 2010
1 commit
-
There are more than a dozen occurrences of following code in the
IPv6 stack:if (opt && opt->srcrt) {
struct rt0_hdr *rt0 = (struct rt0_hdr *) opt->srcrt;
ipv6_addr_copy(&final, &fl.fl6_dst);
ipv6_addr_copy(&fl.fl6_dst, rt0->addr);
final_p = &final;
}Replace those with a helper. Note that the helper overrides final_p
in all cases. This is ok as final_p was previously initialized to
NULL when declared.Signed-off-by: Arnaud Ebalard
Signed-off-by: David S. Miller
11 May, 2010
1 commit
-
Conflicts:
net/bridge/br_device.c
net/bridge/br_forward.cSigned-off-by: Patrick McHardy
29 Apr, 2010
1 commit
-
When queueing a skb to socket, we can immediately release its dst if
target socket do not use IP_CMSG_PKTINFO.tcp_data_queue() can drop dst too.
This to benefit from a hot cache line and avoid the receiver, possibly
on another cpu, to dirty this cache line himself.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
24 Apr, 2010
2 commits
-
Finally add support to detect a local IPV6_DONTFRAG event
and return the relevant data to the user if they've enabled
IPV6_RECVPATHMTU on the socket. The next recvmsg() will
return no data, but have an IPV6_PATHMTU as ancillary data.Signed-off-by: Brian Haley
Signed-off-by: David S. Miller -
Add dontfrag argument to relevant functions for
IPV6_DONTFRAG support, as well as allowing the value
to be passed-in via ancillary cmsg data.Signed-off-by: Brian Haley
Signed-off-by: David S. Miller
20 Apr, 2010
1 commit
-
Conflicts:
Documentation/feature-removal-schedule.txt
net/ipv6/netfilter/ip6t_REJECT.c
net/netfilter/xt_limit.cSigned-off-by: Patrick McHardy
30 Mar, 2010
1 commit
-
…it slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
25 Mar, 2010
1 commit
-
The semantic patch that was used:
//
@@
@@
(NF_HOOK
|NF_HOOK_THRESH
|nf_hook
)(
-PF_INET6,
+NFPROTO_IPV6,
...)
//Signed-off-by: Jan Engelhardt
18 Jan, 2010
1 commit
-
__net_init/__net_exit are apparently not going away, so use them
to full extent.In some cases __net_init was removed, because it was called from
__net_exit code.Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller
08 Nov, 2009
1 commit
-
Using RCU helps not touching device refcount in rawv6_bind()
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
06 Nov, 2009
1 commit
-
struct can_proto had a capability field which wasn't ever used. It is
dropped entirely.struct inet_protosw had a capability field which can be more clearly
expressed in the code by just checking if sock->type = SOCK_RAW.Signed-off-by: Eric Paris
Acked-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller
19 Oct, 2009
2 commits
-
- skb_kill_datagram() can increment sk->sk_drops itself, not callers.
- UDP on IPV4 & IPV6 dropped frames (because of bad checksum or policy checks) increment sk_drops
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
In order to have better cache layouts of struct sock (separate zones
for rx/tx paths), we need this preliminary patch.Goal is to transfert fields used at lookup time in the first
read-mostly cache line (inside struct sock_common) and move sk_refcnt
to a separate cache line (only written by rx path)This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
sport and id fields. This allows a future patch to define these
fields as macros, like sk_refcnt, without name clashes.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
15 Oct, 2009
1 commit
-
sock_queue_rcv_skb() can update sk_drops itself, removing need for
callers to take care of it. This is more consistent since
sock_queue_rcv_skb() also reads sk_drops when queueing a skb.This adds sk_drops managment to many protocols that not cared yet.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller