11 Apr, 2010
1 commit
09 Apr, 2010
1 commit
-
Commits 5051ebd275de672b807c28d93002c2fb0514a3c9 and
5051ebd275de672b807c28d93002c2fb0514a3c9 ("ipv[46]: udp: optimize unicast RX
path") broke some programs.

After upgrading an L2TP server to 2.6.33 it started to fail, with tunnels
going up and down, after the 10th tunnel came up. My modified rp-l2tp uses a
global unconnected socket bound to (INADDR_ANY, 1701) and one connected socket
per tunnel after parameter negotiation.

After ten sockets were open, and due to mixed-up parameters passed to
udp[46]_lib_lookup2(), the kernel started to drop packets.

Signed-off-by: Jorge Boncompte [DTI2]
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
30 Mar, 2010
1 commit
-
…it slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

The percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there, i.e. if only gfp is used,
  gfp.h; if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surroundings. It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have a fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions. The script emitted errors for ~400
   files.

2. Each error was manually checked. Some didn't need the inclusion,
   some needed manual addition, while adding it to the implementation .h
   or embedding .c file was more appropriate for others. This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs, requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them, as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell. Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros. Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
   were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as a bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers, which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
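
As a rough illustration of the first rule the script applies (slab.h if
slab is used, gfp.h if only gfp is used), the decision could be sketched in
C as follows. This is not the slabh-sweep.py script itself: the function
name, the enum, and the lists of symbols searched for are assumptions made
for the example, and plain substring matching is far cruder than what a
real sweep would need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Which header a source file should include directly (illustrative only). */
enum needed_header { NEED_NONE, NEED_GFP_H, NEED_SLAB_H };

/*
 * Hypothetical helper: decide from the file contents which header is
 * needed. slab.h wins over gfp.h because slab.h itself includes gfp.h.
 * The symbol lists below are assumed examples, not the script's own.
 */
static enum needed_header needed_header_for(const char *src)
{
    static const char *const slab_syms[] = {
        "kmalloc(", "kzalloc(", "kfree(", "kmem_cache_",
    };
    static const char *const gfp_syms[] = {
        "GFP_KERNEL", "GFP_ATOMIC",
    };
    size_t i;

    assert(src != NULL);

    for (i = 0; i < sizeof(slab_syms) / sizeof(slab_syms[0]); i++)
        if (strstr(src, slab_syms[i]))
            return NEED_SLAB_H;

    for (i = 0; i < sizeof(gfp_syms) / sizeof(gfp_syms[0]); i++)
        if (strstr(src, gfp_syms[i]))
            return NEED_GFP_H;

    return NEED_NONE;
}
```

A file that calls kmalloc() with GFP_KERNEL would be reported as needing
slab.h only, since gfp.h then comes along with it.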
29 Mar, 2010
1 commit
-
This is the ipv6 variant of commit 5e016cbf6.. ("ipv4: Don't drop
redirected route cache entry unless PTMU actually expired")
by Guenter Roeck.

Remove the cached route entry in ipv6_negative_advice() only if
the timer has expired.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller
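
The rule stated in this commit (drop a cached route only once its timer has
expired) can be sketched as below. The struct, flag names and helper are
hypothetical stand-ins for the example and do not reproduce the kernel's
own definitions; the time comparison is written to tolerate counter
wrap-around on the usual two's-complement platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags and route entry, for illustration only. */
#define RT_F_CACHE   0x1u   /* entry is a cached (cloned) route */
#define RT_F_EXPIRES 0x2u   /* entry carries an expiry time */

struct rt_entry {
    unsigned int  flags;
    unsigned long expires;  /* tick count at which the entry expires */
};

/* Returns 1 if a cached route should be removed on negative advice. */
static int should_drop_cached_route(const struct rt_entry *rt,
                                    unsigned long now)
{
    assert(rt != NULL);

    if (!(rt->flags & RT_F_CACHE))
        return 0;           /* non-cache entries are handled elsewhere */
    if (!(rt->flags & RT_F_EXPIRES))
        return 0;           /* no timer, so nothing can have expired */

    /* "now is after expires", written as a signed difference */
    return (long)(rt->expires - now) < 0;
}
```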
27 Mar, 2010
2 commits
-
When the cache is unresolved, c->mf[6]c_parent is set to 65535 and
minvif, maxvif are not initialized, hence we must avoid
parsing IIF and OIF.

A second problem can happen when the user dumps a cache entry
where a VIF, that was referenced at creation time, has been
removed.

Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
-
When a dump is interrupted at the last device in a hash chain and
then continued, "idx" won't get incremented past s_idx, so s_ip_idx
is not reset when moving on to the next device. This means of all
following devices only the last n - s_ip_idx addresses are dumped.

Tested-by: Pawel Staszewski
Signed-off-by: Patrick McHardy
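
The resume bug described above can be reproduced with a small stand-alone
model. This is a simplified sketch, not the kernel's dump code: the table
layout, the cursor struct and the function names are assumptions made for
the example. Only the s_idx / s_ip_idx naming follows the commit message,
and the `fixed` switch selects between the old and the corrected reset
condition.

```c
#include <assert.h>

#define NBUCKETS 2
#define MAXDEVS  2

/* One hash chain: ndevs devices, naddrs[i] addresses on device i. */
struct bucket { int ndevs; int naddrs[MAXDEVS]; };

/* Where the previous pass stopped (bucket, device, address). */
struct cursor { int h, idx, ip_idx; };

/* Two chains, each holding a single device with three addresses. */
static const struct bucket example_table[NBUCKETS] = {
    { 1, { 3, 0 } },
    { 1, { 3, 0 } },
};

/* Dump at most 'room' addresses, resuming from *cb; returns the count. */
static int dump_pass(const struct bucket *tbl, struct cursor *cb,
                     int room, int fixed)
{
    int s_h = cb->h, s_idx = cb->idx, s_ip_idx = cb->ip_idx;
    int h, idx = 0, ip_idx = 0, dumped = 0;

    for (h = s_h; h < NBUCKETS; h++, s_idx = 0) {
        for (idx = 0; idx < tbl[h].ndevs; idx++) {
            if (idx < s_idx)
                continue;
            /* Old code reset s_ip_idx only when idx moved past s_idx. */
            if (idx > s_idx || (fixed && h > s_h))
                s_ip_idx = 0;
            for (ip_idx = 0; ip_idx < tbl[h].naddrs[idx]; ip_idx++) {
                if (ip_idx < s_ip_idx)
                    continue;
                if (dumped == room)
                    goto done;  /* buffer full, resume here next time */
                dumped++;
            }
        }
    }
done:
    cb->h = h;
    cb->idx = idx;
    cb->ip_idx = ip_idx;
    return dumped;
}

/* Run passes until nothing more is dumped; returns the total count. */
static int dump_all(const struct bucket *tbl, int room, int fixed)
{
    struct cursor cb = { 0, 0, 0 };
    int total = 0, n;

    assert(room > 0);
    while ((n = dump_pass(tbl, &cb, room, fixed)) > 0)
        total += n;
    return total;
}
```

With room for two addresses per pass, the first pass stops inside the only
device of the first chain. Without the extra condition the stale s_ip_idx
then carries over to the next chain and only four of the six addresses are
reported; with it, all six are.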
26 Mar, 2010
1 commit
25 Mar, 2010
1 commit
-
The order of the IPv6 raw table is currently reversed, which makes it
impossible to use the NOTRACK target in IPv6: for example if someone enters

ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK

and if we receive fragmented packets then the first fragment will be
untracked and thus skip nf_ct_frag6_gather (and conntrack), while all
subsequent fragments enter nf_ct_frag6_gather and reassembly will never
successfully be finished.

Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
20 Mar, 2010
2 commits
-
mfc_parent of cache entries is used to index into the vif_table and is
initialised from mfcctl->mfcc_parent. This can take values of up to 2^16-1,
while the vif_table has only MAXVIFS (32) entries. The same problem
affects ip6mr.

Refuse invalid values to fix a potential out-of-bounds access. Unlike
the other validity checks, this is checked in ipmr_mfc_add() instead of
the setsockopt handler since it's unused in the delete path and might be
uninitialized.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller
-
As the only path leading to ip6_dst_check makes an indirect call
through dst->ops, dst cannot be NULL in ip6_dst_check.

This patch removes this check in case it misleads people who
come across this code.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
14 Mar, 2010
1 commit
-
If we are managing IPv6 addresses using DHCP, it would be nice
for user-space to be notified if an address configured through
DHCP fails DAD. Otherwise user-space would have to poll to see
whether DAD succeeds.

This patch uses the existing notification mechanism and simply
hooks it into the DAD failure code path.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
09 Mar, 2010
1 commit
-
Commit 6b03a53a (tcp: use limited socket backlog) added the possibility
of dropping frames when backlog queue is full.

Commit d218d111 (tcp: Generalized TTL Security Mechanism) added the
possibility of dropping frames when TTL is under a given limit.

This patch adds new SNMP MIB entries, named TCPBacklogDrop and
TCPMinTTLDrop, published in /proc/net/netstat in TcpExt: line

netstat -s | egrep "TCPBacklogDrop|TCPMinTTLDrop"
TCPBacklogDrop: 0
TCPMinTTLDrop: 0

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
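
For readers who want the two counters without going through netstat, the
TcpExt: entries in /proc/net/netstat come as a pair of lines, one listing
counter names and one listing the matching values in the same order. The
sketch below looks a counter up by name in such a pair. The helper names
are made up for the example, and the lines are passed in as strings so that
reading the file is left to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Skip separators and return the next token; *len gets its length. */
static const char *next_token(const char *s, size_t *len)
{
    s += strspn(s, " \n");
    *len = strcspn(s, " \n");
    return s;
}

/*
 * Hypothetical helper: walk the names line and the values line in
 * lockstep and store the value that sits in the same column as 'name'.
 * Returns 0 on success, -1 if the counter is not present.
 */
static int find_counter(const char *names, const char *values,
                        const char *name, unsigned long long *out)
{
    size_t nlen, vlen, want;

    assert(names != NULL && values != NULL && name != NULL && out != NULL);
    want = strlen(name);

    for (;;) {
        names = next_token(names, &nlen);
        values = next_token(values, &vlen);
        if (nlen == 0 || vlen == 0)
            return -1;              /* ran out of columns */
        if (nlen == want && strncmp(names, name, want) == 0) {
            *out = strtoull(values, NULL, 10);
            return 0;
        }
        names += nlen;
        values += vlen;
    }
}
```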
08 Mar, 2010
1 commit
-
IPV6_PREFER_SRC_xxx definitions:
| #define IPV6_PREFER_SRC_TMP 0x0001
| #define IPV6_PREFER_SRC_PUBLIC 0x0002
| #define IPV6_PREFER_SRC_COA 0x0004

RT6_LOOKUP_F_xxx definitions:
| #define RT6_LOOKUP_F_SRCPREF_TMP 0x00000008
| #define RT6_LOOKUP_F_SRCPREF_PUBLIC 0x00000010
| #define RT6_LOOKUP_F_SRCPREF_COA 0x00000020

So, we can translate between these two groups by a shift operation
instead of multiple 'if's.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller
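
Since each RT6_LOOKUP_F_SRCPREF_xxx value is the matching
IPV6_PREFER_SRC_xxx value shifted left by three bits, the translation can
be sketched as below. The constants are the ones quoted in the commit
message; the helper name is made up for this example and is not claimed to
be the one used in the patch.

```c
#include <assert.h>

#define IPV6_PREFER_SRC_TMP          0x0001
#define IPV6_PREFER_SRC_PUBLIC       0x0002
#define IPV6_PREFER_SRC_COA          0x0004

#define RT6_LOOKUP_F_SRCPREF_TMP     0x00000008
#define RT6_LOOKUP_F_SRCPREF_PUBLIC  0x00000010
#define RT6_LOOKUP_F_SRCPREF_COA     0x00000020

/* Hypothetical helper: map socket source preferences to lookup flags. */
static unsigned int srcprefs_to_lookup_flags(unsigned int srcprefs)
{
    const unsigned int mask = IPV6_PREFER_SRC_TMP |
                              IPV6_PREFER_SRC_PUBLIC |
                              IPV6_PREFER_SRC_COA;

    /* 0x0001 << 3 == 0x0008, 0x0002 << 3 == 0x0010, 0x0004 << 3 == 0x0020 */
    return (srcprefs & mask) << 3;
}
```

Masking first keeps unrelated bits in the input from leaking into other
lookup flags.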
06 Mar, 2010
3 commits
-
sk_add_backlog -> __sk_add_backlog
sk_add_backlog_limited -> sk_add_backlog

Signed-off-by: Zhu Yi
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Make udp adapt to the limited socket backlog change.
Cc: "David S. Miller"
Cc: Alexey Kuznetsov
Cc: "Pekka Savola (ipv6)"
Cc: Patrick McHardy
Signed-off-by: Zhu Yi
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Make tcp adapt to the limited socket backlog change.
Cc: "David S. Miller"
Cc: Alexey Kuznetsov
Cc: "Pekka Savola (ipv6)"
Cc: Patrick McHardy
Signed-off-by: Zhu Yi
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
04 Mar, 2010
4 commits
-
This solves a potential race problem during the cleanup process.
The issue is that addrconf_ifdown() needs to traverse the address list,
but then drop the lock to call the notifier. The version in -next
could get confused if an add/delete happened during this window.
The original code (2.6.32 and earlier) was okay because all addresses
were always deleted.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
-
My recent change in net-next to retain permanent addresses caused a
regression. The device refcount would not go to zero when the device was
unregistered because a leftover anycast reference would hold an ipv6 dev
reference which would hold device references...

The correct procedure is to call the notify chain when an address is no
longer available for use. When the interface comes back, the DAD timer will
notify that the address is available again.

Also, link local addresses should be purged when the interface is brought
down. The address might be changed.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
-
The Router Solicitation timer races with device state changes
because it doesn't lock the device. Use local variable to avoid
one repeated dereference.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
-
Timer code runs in bottom half, so there is no need for
using the _bh form of locking. Also check if the device is not ready
to avoid a race with an address that is no longer active.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
03 Mar, 2010
1 commit
-
When I merged the bundle creation code, I introduced a bogus
flowi value in the bundle. Instead of getting it from the caller,
it was instead set to the flow in the route object, which is
totally different.

The end result is that the bundles we created never match, and
we instead end up with an ever growing bundle list.

Thanks to Jamal for finding this problem.

Reported-by: Jamal Hadi Salim
Signed-off-by: Herbert Xu
Acked-by: Steffen Klassert
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
27 Feb, 2010
2 commits
-
Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy
26 Feb, 2010
3 commits
-
Clients will set their MTU to 1280 if they receive an
ICMPV6_PKT_TOOBIG message with an MTU less than 1280.

To allow encapsulating of packets over a 1280 link
we should always accept packets with a size of 1280
for forwarding even if the path has a lower MTU and
fragment the encapsulated packets afterwards.

In case a forwarded packet is not going to be encapsulated
an ICMPV6_PKT_TOOBIG msg will still be sent by ip6_fragment()
with the correct MTU.

Signed-off-by: Ulrich Weber
Signed-off-by: David S. Miller
-
RFC 4291 section 2.4 states that all uncategorized addresses
should be considered as Global Unicast.

This will remove IPV6_ADDR_RESERVED completely
and return IPV6_ADDR_UNICAST in ipv6_addr_type() instead.

Signed-off-by: Ulrich Weber
Signed-off-by: David S. Miller
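
The classification rule from RFC 4291 section 2.4 that this commit relies
on can be sketched as a small stand-alone classifier: a handful of prefixes
are special, and everything else is Global Unicast. The enum and function
below are illustrative assumptions only; the kernel's ipv6_addr_type()
returns a bitmask with more categories than are shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Address classes from the table in RFC 4291 section 2.4. */
enum addr_class {
    ADDR_UNSPECIFIED,     /* ::/128 */
    ADDR_LOOPBACK,        /* ::1/128 */
    ADDR_MULTICAST,       /* ff00::/8 */
    ADDR_LINKLOCAL,       /* fe80::/10 */
    ADDR_GLOBAL_UNICAST   /* everything else */
};

/* Hypothetical classifier over a 16-byte address in network order. */
static enum addr_class classify_ipv6(const unsigned char a[16])
{
    static const unsigned char zero[15] = { 0 };

    assert(a != NULL);

    if (a[0] == 0xff)
        return ADDR_MULTICAST;
    if (a[0] == 0xfe && (a[1] & 0xc0) == 0x80)
        return ADDR_LINKLOCAL;
    if (memcmp(a, zero, sizeof(zero)) == 0) {
        if (a[15] == 0)
            return ADDR_UNSPECIFIED;
        if (a[15] == 1)
            return ADDR_LOOPBACK;
    }
    /* No "reserved" bucket: uncategorized addresses are unicast. */
    return ADDR_GLOBAL_UNICAST;
}
```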
25 Feb, 2010
6 commits
-
Just pass in the entire repl struct. In case of a new table (e.g.
ip6t_register_table), the repldata has been previously filled with
table->name and table->size already (in ip6t_alloc_initial_table).

Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy
-
Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy
-
The macro is replaced by a list.h-like foreach loop. This makes
the code more inspectable.

Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy
-
Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy
-
The macro is replaced by a list.h-like foreach loop. This makes
the code much more inspectable.

Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy
23 Feb, 2010
1 commit
-
Pass mark to all SA lookups to prepare them for when we add code
to have them search.

Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
20 Feb, 2010
2 commits
-
Yuck. It turns out that when we restart sysctls we were restarting
with the values already changed. Which unfortunately meant that
the second time through we thought there was no change and skipped
all kinds of work, despite the fact that there was indeed a change.

I have fixed this the simplest way possible by restoring the changed
values when we restart the sysctl write.

One of my coworkers spotted this bug when, after disabling forwarding
on an interface, pings were still forwarded.

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller
-
When an ICMPV6_PKT_TOOBIG message is received with an MTU below 1280,
all further packets include a fragment header.

Unlike regular defragmentation, conntrack also needs to "reassemble"
those fragments in order to obtain a packet without the fragment
header for connection tracking. Currently nf_conntrack_reasm checks
whether a fragment has either IP6_MF set or an offset != 0, which
makes it ignore those fragments.

Remove the invalid check and make reassembly handle fragment queues
containing only a single fragment.

Reported-and-tested-by: Ulrich Weber
Signed-off-by: Patrick McHardy
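
The difference between the old and the new decision can be illustrated with
a packet that carries a fragment header but is the only fragment (offset 0,
no "more fragments" bit). The struct and the two predicates below are a
hypothetical model of the check described above, not the code in
nf_conntrack_reasm.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal, assumed view of what the fragment header tells us. */
struct frag_info {
    int      has_frag_hdr;    /* packet carries a fragment header */
    unsigned offset;          /* fragment offset */
    int      more_fragments;  /* "more fragments" (MF) bit */
};

/* Old behaviour: only MF set or a non-zero offset triggered reassembly. */
static int needs_reassembly_old(const struct frag_info *f)
{
    assert(f != NULL);
    return f->has_frag_hdr && (f->more_fragments || f->offset != 0);
}

/* New behaviour: any packet with a fragment header goes to reassembly. */
static int needs_reassembly_new(const struct frag_info *f)
{
    assert(f != NULL);
    return f->has_frag_hdr;
}
```

A single-fragment packet is skipped by the old predicate and picked up by
the new one, which is why the reassembly code also has to cope with queues
that contain only one fragment.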
19 Feb, 2010
3 commits
-
Dunno what the idea was, it wasn't used for a long time.

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller
-
Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller
-
ICMP6 MIB statistics have been per-netns for quite some time.

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller
18 Feb, 2010
1 commit
-
Only used for writing, so convert to spinlock
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
17 Feb, 2010
1 commit
-
call_rcu() will unconditionally reinitialize RCU head anyway.
Signed-off-by: Alexey Dobriyan
Acked-by: Paul E. McKenney
Signed-off-by: David S. Miller