Eric Lee / linux-smarc-t335x-v3.2

19 Oct, 2011

1 commit

58af19e38 tproxy: copy transparent flag when creating a time wait

The transparent socket option setting was not copied to the time wait
socket when an inet socket was being replaced by a time wait socket. This
broke the --transparent option of the socket match and may have caused
FIN packets belonging to sockets in FIN_WAIT2 or TIME_WAIT state to be
dropped by the packet filter.

Signed-off-by: KOVACS Krisztian
Signed-off-by: David S. Miller

KOVACS Krisztian
2011-10-19 15:21:35 +0800

05 Oct, 2011

2 commits

1e5289e12 tcp: properly update lost_cnt_hint during shifting

lost_skb_hint is used by tcp_mark_head_lost() to mark the first unhandled skb.
lost_cnt_hint is the number of packets or sacked packets before the lost_skb_hint.
When shifting a skb that is before the lost_skb_hint, if tcp_is_fack() is true,
the skb has already been counted in the lost_cnt_hint; if tcp_is_fack() is false,
tcp_sacktag_one() will increase the lost_cnt_hint. So tcp_shifted_skb() does not
need to adjust the lost_cnt_hint by itself. When shifting a skb that is equal to
lost_skb_hint, the shifted packets will not be counted by tcp_mark_head_lost().
So tcp_shifted_skb() should adjust the lost_cnt_hint even if tcp_is_fack(tp) is true.

Signed-off-by: Zheng Yan
Signed-off-by: David S. Miller

Yan, Zheng
2011-10-05 11:31:24 +0800

260fcbeb1 tcp: properly handle md5sig_pool references

tcp_v4_clear_md5_list() assumes that multiple tcp md5sig peers
only hold one reference to md5sig_pool. But tcp_v4_md5_do_add()
increases the use count of md5sig_pool for each peer. This patch
makes tcp_v4_md5_do_add() increase the use count only for the first
tcp md5sig peer.
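
The reference-counting rule described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct md5_pool`, `struct md5_peer_list`, `peer_add()` and `peers_clear()` are hypothetical stand-ins for md5sig_pool and the per-socket peer list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the shared pool and a socket's peer list. */
struct md5_pool { int users; };
struct md5_peer_list { size_t npeers; };

/* Add a peer: take a pool reference only when the list was empty. */
static void peer_add(struct md5_peer_list *l, struct md5_pool *p)
{
    if (l->npeers == 0)
        p->users++;
    l->npeers++;
}

/* Clear all peers: drop the single reference the list holds, if any. */
static void peers_clear(struct md5_peer_list *l, struct md5_pool *p)
{
    if (l->npeers > 0) {
        assert(p->users > 0);
        p->users--;
    }
    l->npeers = 0;
}

/* Three peers on one socket still account for exactly one reference. */
static void md5_pool_example(void)
{
    struct md5_pool pool = { 0 };
    struct md5_peer_list list = { 0 };

    peer_add(&list, &pool);
    peer_add(&list, &pool);
    peer_add(&list, &pool);
    assert(pool.users == 1);

    peers_clear(&list, &pool);
    assert(pool.users == 0);
}
```

With one reference per list rather than per peer, the add and clear paths stay balanced regardless of how many peers are configured.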

Signed-off-by: Zheng Yan
Signed-off-by: David S. Miller

Yan, Zheng
2011-10-05 11:31:24 +0800

19 Sep, 2011

1 commit

f779b2d60 tcp: fix validation of D-SACK

D-SACK is allowed to reside below snd_una. But the corresponding check
in tcp_is_sackblock_valid() is the exact opposite. It looks like a typo.
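
To make the inverted check concrete, here is a simplified sketch of the comparison involved. It assumes only that sequence numbers are compared with wraparound-safe arithmetic; `seq_after()` and `dsack_block_ok()` are illustrative names, and the real tcp_is_sackblock_valid() performs several other checks that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe test: is sequence number a after b? */
static bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Simplified rule: a D-SACK block for already-acknowledged data
 * must end at or below snd_una. */
static bool dsack_block_ok(uint32_t end_seq, uint32_t snd_una)
{
    /* The inverted (buggy) form would reject exactly the valid case:
     *     if (!seq_after(end_seq, snd_una)) return false;
     */
    if (seq_after(end_seq, snd_una))
        return false;
    return true;
}

static void dsack_example(void)
{
    assert(dsack_block_ok(90, 100));       /* entirely below snd_una */
    assert(dsack_block_ok(100, 100));      /* ends exactly at snd_una */
    assert(!dsack_block_ok(110, 100));     /* reaches above snd_una */
    assert(seq_after(5u, 0xFFFFFFF0u));    /* comparison survives wraparound */
}
```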

Signed-off-by: Zheng Yan
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Zheng Yan
2011-09-19 10:37:34 +0800

17 Sep, 2011

1 commit

19c1ea14c ipv4: Fix fib_info->fib_metrics leak

Commit 4670994d (net,rcu: convert call_rcu(fc_rport_free_rcu) to
kfree_rcu()) introduced a memory leak. This patch reverts it.

Signed-off-by: Zheng Yan
Signed-off-by: David S. Miller

Yan, Zheng
2011-09-17 05:42:26 +0800

16 Sep, 2011

2 commits

52b9aca7a Merge branch 'master' of ../netdev/

David S. Miller
2011-09-16 13:09:02 +0800

946cedccb tcp: Change possible SYN flooding messages

"Possible SYN flooding on port xxxx " messages can fill logs on servers.

Change logic to log the message only once per listener, and add two new
SNMP counters to track :

TCPReqQFullDoCookies : number of times a SYNCOOKIE was replied to client

TCPReqQFullDrop : number of times a SYN request was dropped because
syncookies were not enabled.
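
The log-once behaviour and the two counters can be sketched as follows. This is an illustration under assumed names: `struct listener`, `struct syn_stats` and `syn_queue_full()` are hypothetical, and the message text is only an approximation of what a kernel would print.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-listener state. */
struct listener {
    unsigned short port;
    bool warned;                /* set once the message has been logged */
};

/* Hypothetical counters modelled on the two new SNMP counters. */
struct syn_stats {
    unsigned long do_cookies;   /* models TCPReqQFullDoCookies */
    unsigned long drops;        /* models TCPReqQFullDrop */
    unsigned long log_lines;    /* how many messages were emitted */
};

/* Called when the request queue is full. Returns true if a cookie is sent. */
static bool syn_queue_full(struct listener *l, struct syn_stats *s,
                           bool syncookies_enabled)
{
    if (syncookies_enabled)
        s->do_cookies++;
    else
        s->drops++;

    if (!l->warned) {
        l->warned = true;
        s->log_lines++;
        fprintf(stderr, "Possible SYN flooding on port %u. %s.\n",
                (unsigned int)l->port,
                syncookies_enabled ? "Sending cookies" : "Dropping request");
    }
    return syncookies_enabled;
}

/* A burst of 1000 overflows is counted 1000 times but logged once. */
static void syn_flood_example(void)
{
    struct listener l = { .port = 80, .warned = false };
    struct syn_stats s = { 0 };

    for (int i = 0; i < 1000; i++)
        syn_queue_full(&l, &s, true);

    assert(s.do_cookies == 1000);
    assert(s.drops == 0);
    assert(s.log_lines == 1);
}
```

The counters keep the full picture available to monitoring while the log stays readable.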

Based on a prior patch from Tom Herbert, and suggestions from David.

Signed-off-by: Eric Dumazet
CC: Tom Herbert
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-16 02:49:43 +0800

31 Aug, 2011

2 commits

29c486df6 net: ipv4: relax AF_INET check in bind()

Commit d0733d2e29b65 (Check for mistakenly passed in non-IPv4 address)
added a regression for legacy apps that use bind() with the AF_UNSPEC family.

Relax the check, but make sure the bind() is done on INADDR_ANY
addresses, as AF_UNSPEC has probably no sane meaning for other
addresses.
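
The relaxed check can be sketched in userspace C as below. The helper name `check_bind_family()` is hypothetical and the function only models the family test described above, not the rest of the bind() path.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Returns 0 if the address family is acceptable for an IPv4 bind,
 * otherwise -EAFNOSUPPORT. */
static int check_bind_family(const struct sockaddr_in *addr)
{
    if (addr->sin_family == AF_INET)
        return 0;

    /* Legacy compatibility: AF_UNSPEC is tolerated only with INADDR_ANY. */
    if (addr->sin_family == AF_UNSPEC &&
        addr->sin_addr.s_addr == htonl(INADDR_ANY))
        return 0;

    return -EAFNOSUPPORT;
}

static void bind_family_example(void)
{
    struct sockaddr_in a;

    memset(&a, 0, sizeof(a));
    a.sin_family = AF_UNSPEC;
    a.sin_addr.s_addr = htonl(INADDR_ANY);
    assert(check_bind_family(&a) == 0);

    /* AF_UNSPEC with a specific address is still rejected. */
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    assert(check_bind_family(&a) == -EAFNOSUPPORT);

    a.sin_family = AF_INET;
    assert(check_bind_family(&a) == 0);
}
```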

Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=42012

Signed-off-by: Eric Dumazet
Reported-and-bisected-by: Rene Meier
CC: Marcus Meissner
Signed-off-by: David S. Miller

Eric Dumazet
2011-08-31 06:57:00 +0800

785824165 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6

David S. Miller
2011-08-31 05:43:56 +0800

30 Aug, 2011

1 commit

c6675233f netfilter: nf_queue: reject NF_STOLEN verdicts from userspace

A userspace listener may send a (bogus) NF_STOLEN verdict, which causes an skb leak.

This problem was previously fixed via
64507fdbc29c3a622180378210ecea8659b14e40 (netfilter:
nf_queue: fix NF_STOLEN skb leak) but this had to be reverted because
NF_STOLEN can also be returned by a netfilter hook when iterating the
rules in nf_reinject.

Reject userspace NF_STOLEN verdict, as suggested by Michal Miroslaw.

This is complementary to commit fad54440438a7c231a6ae347738423cbabc936d9
(netfilter: avoid double free in nf_reinject).
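
A minimal sketch of the validation idea follows. The enum values mirror the verdict numbers in <linux/netfilter.h> but are redefined under hypothetical names so the example stands alone; `check_user_verdict()` is an illustrative helper, not the kernel function.

```c
#include <assert.h>
#include <errno.h>

/* Verdict values mirroring <linux/netfilter.h>, redefined for the sketch. */
enum verdict {
    V_DROP = 0,
    V_ACCEPT = 1,
    V_STOLEN = 2,
    V_QUEUE = 3,
    V_REPEAT = 4,
    V_STOP = 5,
    V_MAX = V_STOP
};

/* Validate a verdict received from a userspace listener. */
static int check_user_verdict(unsigned int verdict)
{
    if (verdict > V_MAX)
        return -EINVAL;

    /* Only an in-kernel hook may claim ownership of the packet;
     * accepting this from userspace would leak the skb. */
    if (verdict == V_STOLEN)
        return -EINVAL;

    return 0;
}

static void verdict_example(void)
{
    assert(check_user_verdict(V_ACCEPT) == 0);
    assert(check_user_verdict(V_DROP) == 0);
    assert(check_user_verdict(V_STOLEN) == -EINVAL);
    assert(check_user_verdict(99) == -EINVAL);
}
```

Rejecting the value at the userspace boundary leaves in-kernel hooks free to return NF_STOLEN during reinjection.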

Cc: Julian Anastasov
Cc: Eric Dumazet
Signed-off-by: Florian Westphal
Signed-off-by: Patrick McHardy

Florian Westphal
2011-08-30 21:01:20 +0800

25 Aug, 2011

1 commit

e05c4ad3e mcast: Fix source address selection for multicast listener report

Should check use count of include mode filter instead of total number
of include mode filters.

Signed-off-by: Zheng Yan
Signed-off-by: David S. Miller

Yan, Zheng
2011-08-25 08:46:15 +0800

11 Aug, 2011

2 commits

97a804102 ipv4: some rt_iif -> rt_route_iif conversions

As rt_iif represents the input device even for packets
coming from loopback with an output route, it is not a unique
key specific to input routes. rt_route_iif now has that role
(it was fl.iif in 2.6.38), so it is better to change the checks in
some places to save CPU cycles and to restore 2.6.38 semantics.

compare_keys:
- input routes: only rt_route_iif matters, rt_iif is same
- output routes: only rt_oif matters, rt_iif is not
used for matching in __ip_route_output_key
- now we are back to 2.6.38 state

ip_route_input_common:
- matching rt_route_iif implies input route
- compared to 2.6.38 we eliminated one rth->fl.oif check
because it was not needed even for 2.6.38

compare_hash_inputs:
The change here is the only one that is not an optimization; it has
an effect only for output routes. I assume I'm restoring
the original intention to ignore oif; it was using fl.iif
- now we are back to 2.6.38 state

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2011-08-11 20:58:59 +0800

f0e3d0689 tcp: initialize variable ecn_ok in syncookies path

Using gcc 4.4.3, warnings are emitted for a possibly uninitialized use
of ecn_ok.

This can happen if cookie_check_timestamp() returns due to not having
seen a timestamp. Defaulting to ecn off seems like a reasonable thing
to do in this case, so initialize ecn_ok to false.

Signed-off-by: Mike Waychison
Signed-off-by: David S. Miller

Mike Waychison
2011-08-11 12:59:57 +0800

08 Aug, 2011

5 commits

d52fbfc9e ipv4: use dst with ref during bcast/mcast loopback

Make sure skb dst has reference when moving to
another context. Currently, I don't see protocols that can
hit it when sending broadcasts/multicasts to loopback using
noref dsts, so it is just a precaution.

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2011-08-08 13:52:32 +0800

47670b767 ipv4: route non-local sources for raw socket

Raw sockets can provide a source address for
routing, but their privileges are not considered. Since they
can provide a non-local source address, make sure the
FLOWI_FLAG_ANYSRC flag is set if the socket has privileges
for this, i.e. based on the hdrincl (IP_HDRINCL) and
transparent flags.

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2011-08-08 13:52:32 +0800

797fd3913 netfilter: TCP and raw fix for ip_route_me_harder

TCP in some cases uses a different global (raw) socket
to send RST and ACK. The transparent flag is not set there.
Currently, it is a problem for rerouting after the previous
change.

Fix it by simplifying the checks in ip_route_me_harder
and use FLOWI_FLAG_ANYSRC even for sockets. It looks safe
because the initial routing allowed this source address to
be used and now we just have to make sure the packet is rerouted.

As a side effect this also allows rerouting for normal
raw sockets that use spoofed source addresses which was not possible
even before we eliminated the ip_route_input call.

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2011-08-08 13:52:32 +0800

dd23198e5 ipv4: Fix ip_getsockopt for IP_PKTOPTIONS

IP_PKTOPTIONS is broken for 32-bit applications running
in COMPAT mode on 64-bit kernels.

This happens because msghdr's msg_flags field is always
set to zero. When running in COMPAT mode this should be
set to MSG_CMSG_COMPAT instead.

Signed-off-by: Tiberiu Szocs-Mihai
Signed-off-by: Daniel Baluta
Signed-off-by: David S. Miller

Daniel Baluta
2011-08-08 13:31:07 +0800

d547f727d ipv4: fix the reusing of routing cache entries

compare_keys and ip_route_input_common rely on
rt_oif to distinguish input and output routes
with the same key values. But sometimes an input route
shares a hash chain (keyed by iif != 0) with output
routes (keyed by orig_oif=0). The problem is visible when running
with a small number of rhash_entries.

Fix them to use rt_route_iif instead. This way an
input route cannot be returned to users that request
an output route.

The patch fixes the ip_rt_bug errors that were
reported in ip_local_out context, mostly for 255.255.255.255
destinations.

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2011-08-08 13:20:20 +0800

07 Aug, 2011

1 commit

6e5714eaf net: Compute protocol sequence numbers and fragment IDs using MD5.

Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.

MD5 is a much safer choice, and is in line with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)

Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation. So the periodic
regeneration and 8-bit counter have been removed. We compute and
use a full 32-bit sequence number.

For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 48-bits) and that is fixed here as well.

Reported-by: Dan Kaminsky
Tested-by: Willy Tarreau
Signed-off-by: David S. Miller

David S. Miller
2011-08-07 09:33:19 +0800

03 Aug, 2011

1 commit

f2c31e32b net: fix NULL dereferences in check_peer_redir()

Gergely Kalman reported crashes in check_peer_redir().

It appears commit f39925dbde778 (ipv4: Cache learned redirect
information in inetpeer.) added a race, leading to possible NULL ptr
dereference.

Since we can now change dst neighbour, we should make sure a reader can
safely use a neighbour.

Add RCU protection to dst neighbour, and make sure check_peer_redir()
can be called safely by different cpus in parallel.

As neighbours are already freed after one RCU grace period, this patch
should not add typical RCU penalty (cache cold effects)

Many thanks to Gergely for providing a pretty report pointing to the
bug.

Reported-by: Gergely Kalman
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-08-03 18:34:12 +0800

01 Aug, 2011

1 commit

a1889c0d2 net: adjust array index

Convert array index from the loop bound to the loop index.
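
The class of bug addressed here can be illustrated with a small standalone example. The function name `sum_array()` is hypothetical and the code is unrelated to the driver that was actually patched.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy pattern, shown only as a comment: indexing with the bound n
 * reads a[n] on every pass, which is one element past the end.
 *
 *     for (i = 0; i < n; i++)
 *         sum += a[n];
 */

/* Fixed pattern: index with the loop variable. */
static int sum_array(const int *a, size_t n)
{
    int sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

static void array_index_example(void)
{
    int a[] = { 1, 2, 3 };

    assert(sum_array(a, 3) == 6);
    assert(sum_array(a, 0) == 0);
}
```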

A simplified version of the semantic patch that fixes this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e1,e2,ar;
@@

for(e1 = 0; e1 < e2; e1++) { <...
  ar[
- e2
+ e1
  ]
  ...> }
// </smpl>

Signed-off-by: Julia Lawall
Signed-off-by: David S. Miller

Julia Lawall
2011-08-01 17:27:21 +0800

29 Jul, 2011

1 commit

91c66c689 netfilter: ip_queue: Fix small leak in ipq_build_packet_message()

ipq_build_packet_message() in net/ipv4/netfilter/ip_queue.c and
net/ipv6/netfilter/ip6_queue.c contain a small potential mem leak as
far as I can tell.

We allocate memory for 'skb' with alloc_skb() and then call
nlh = NLMSG_PUT(skb, 0, 0, IPQM_PACKET, size - sizeof(*nlh));

NLMSG_PUT is a macro
NLMSG_PUT(skb, pid, seq, type, len) \
NLMSG_NEW(skb, pid, seq, type, len, 0)

that expands to NLMSG_NEW, which is also a macro which expands to:
NLMSG_NEW(skb, pid, seq, type, len, flags) \
({ if (unlikely(skb_tailroom(skb) < (int)NLMSG_SPACE(len))) \
goto nlmsg_failure; \
__nlmsg_put(skb, pid, seq, type, len, flags); })

If we take the true branch of the 'if' statement and 'goto
nlmsg_failure', then we'll, at that point, return from
ipq_build_packet_message() without having assigned 'skb' to anything
and we'll leak the memory we allocated for it when it goes out of
scope.

Fix this by placing a 'kfree(skb)' at 'nlmsg_failure'.
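
The shape of the leak and of the fix can be sketched in userspace C. Everything here is a hypothetical stand-in: `struct buf` plays the role of the skb, the explicit `goto failure` plays the role of the jump hidden inside NLMSG_PUT, and `live_buffers` exists only so the example can observe leaks.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical message buffer with a fixed capacity. */
struct buf {
    size_t cap;
    size_t len;
    char data[];
};

static int live_buffers;   /* allocation counter used to observe leaks */

static struct buf *build_message(size_t cap, const char *payload)
{
    size_t need = strlen(payload) + 1;
    struct buf *b = malloc(sizeof(*b) + cap);

    if (!b)
        return NULL;
    live_buffers++;
    b->cap = cap;
    b->len = 0;

    if (cap < need)
        goto failure;      /* not enough tailroom for the message */

    memcpy(b->data, payload, need);
    b->len = need;
    return b;

failure:
    free(b);               /* the fix: release the buffer on the error path */
    live_buffers--;
    return NULL;
}

static void build_message_example(void)
{
    struct buf *ok;

    assert(build_message(2, "too long") == NULL);
    assert(live_buffers == 0);      /* nothing leaked on failure */

    ok = build_message(16, "hi");
    assert(ok != NULL && live_buffers == 1);
    free(ok);
    live_buffers--;
}
```

Without the `free()` at the label, the first call in the example would leave `live_buffers` at 1.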

I admit that I do not know how likely this is to actually happen or even
if there's something that guarantees that it will never happen - I'm
not that familiar with this code, but if that is so, I've not been
able to spot it.

Signed-off-by: Jesper Juhl
Signed-off-by: Patrick McHardy

Jesper Juhl
2011-07-29 22:38:49 +0800

28 Jul, 2011

1 commit

d5eab9152 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits)
tg3: Remove 5719 jumbo frames and TSO blocks
tg3: Break larger frags into 4k chunks for 5719
tg3: Add tx BD budgeting code
tg3: Consolidate code that calls tg3_tx_set_bd()
tg3: Add partial fragment unmapping code
tg3: Generalize tg3_skb_error_unmap()
tg3: Remove short DMA check for 1st fragment
tg3: Simplify tx bd assignments
tg3: Reintroduce tg3_tx_ring_info
ASIX: Use only 11 bits of header for data size
ASIX: Simplify condition in rx_fixup()
Fix cdc-phonet build
bonding: reduce noise during init
bonding: fix string comparison errors
net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared
net: add IFF_SKB_TX_SHARED flag to priv_flags
net: sock_sendmsg_nosec() is static
forcedeth: fix vlans
gianfar: fix bug caused by 87c288c6e9aa31720b72e2bc2d665e24e1653c3e
gro: Only reset frag0 when skb can be pulled
...

Linus Torvalds
2011-07-28 20:58:19 +0800

27 Jul, 2011

1 commit

60063497a atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma
Reviewed-by: Eric Dumazet
Cc: Ingo Molnar
Cc: David Miller
Cc: Eric Dumazet
Acked-by: Mike Frysinger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arun Sharma
2011-07-27 07:49:47 +0800

26 Jul, 2011

1 commit

b76d0789c IPv4: Send gratuitous ARP for secondary IP addresses also

If a device event generates gratuitous ARP messages, only the primary
address is used for sending. This patch iterates through the whole
list. Tested with a two-address configuration on a bonding interface.

Signed-off-by: Zoltan Kiss
Signed-off-by: David S. Miller

Zoltan Kiss
2011-07-26 07:16:00 +0800

24 Jul, 2011

2 commits

559fafb94 gre: fix improper error handling

Fix the improper protocol err_handler; the current implementation is
inapplicable and may cause a kernel crash due to a double kfree_skb.

Signed-off-by: Dmitry Kozlov
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

xeb@mail.ru
2011-07-24 11:06:00 +0800

b0fe4a318 ipv4: use RT_TOS after some rt_tos conversions

rt_tos was changed to iph->tos but it must be filtered by RT_TOS
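
The filtering step can be shown with a few lines of C. The constants mirror IPTOS_TOS_MASK and RT_TOS() from the kernel headers but are redefined under hypothetical names (`TOS_MASK`, `TOS_FILTER`) so the sketch compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirroring the kernel headers, redefined for the sketch. */
#define TOS_MASK 0x1E                        /* IPTOS_TOS_MASK */
#define TOS_FILTER(tos) ((tos) & TOS_MASK)   /* what RT_TOS() does */

/* Reduce the raw header TOS byte to the bits used as a routing key. */
static uint8_t route_tos(uint8_t iph_tos)
{
    return (uint8_t)TOS_FILTER(iph_tos);
}

static void route_tos_example(void)
{
    assert(route_tos(0xFF) == 0x1E);   /* precedence and low bit stripped */
    assert(route_tos(0x10) == 0x10);   /* a plain TOS value passes through */
    assert(route_tos(0xE1) == 0x00);   /* only masked-out bits were set */
}
```

Using the raw header byte as a key would make lookups miss whenever bits outside the mask are set.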

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2011-07-24 11:05:31 +0800

22 Jul, 2011

6 commits

415b3334a icmp: Fix regression in nexthop resolution during replies.

icmp_route_lookup() uses the wrong flow parameters if the reverse
session route lookup isn't used.

So do not commit to the re-decoded flow until we actually make a
final decision to use a real route saved in 'rt2'.

Reported-by: Florian Westphal
Signed-off-by: David S. Miller

David S. Miller
2011-07-22 21:22:10 +0800

d9be4f7a6 ipv4: Constrain UFO fragment sizes to multiples of 8 bytes

Because the ip fragment offset field counts 8-byte chunks, ip
fragments other than the last must contain a multiple of 8 bytes of
payload. ip_ufo_append_data wasn't respecting this constraint and,
depending on the MTU and ip option sizes, could create malformed
non-final fragments.
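
The constraint amounts to rounding the available payload space down to a multiple of 8. A minimal sketch, with the hypothetical helper `max_frag_payload()` and sample MTU and header sizes chosen for illustration:

```c
#include <assert.h>

/* Largest payload for a non-final fragment: the room left after the
 * IP header, rounded down to a multiple of 8 bytes. */
static unsigned int max_frag_payload(unsigned int mtu, unsigned int ip_hdr_len)
{
    assert(mtu > ip_hdr_len);
    return (mtu - ip_hdr_len) & ~7u;
}

static void frag_payload_example(void)
{
    /* No IP options: 1500 - 20 = 1480, already a multiple of 8. */
    assert(max_frag_payload(1500, 20) == 1480);

    /* 4 bytes of options: 1500 - 24 = 1476, rounded down to 1472. */
    assert(max_frag_payload(1500, 24) == 1472);
}
```

The second case is the kind of MTU and option combination where an unrounded size would produce a fragment whose successor cannot be expressed in the offset field.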

Google-Bug-Id: 5009328
Signed-off-by: Bill Sommerfeld
Signed-off-by: David S. Miller

Bill Sommerfeld
2011-07-22 12:31:41 +0800

87c48fa3b ipv6: make fragment identifications less predictable

IPv6 fragment identification generation lags far behind what we use for
IPv4: it uses a single generator. It's not scalable and allows DoS
attacks.

Now that inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide).

This patch :
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter

Reported-by: Fernando Gont
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-07-22 12:25:58 +0800

9fea03302 lro: do vlan cleanup

- remove useless vlan parameters and pointers

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2011-07-22 04:47:54 +0800

0f7257281 lro: kill lro_vlan_hwaccel_receive_frags

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2011-07-22 04:47:54 +0800

7756a96e1 lro: kill lro_vlan_hwaccel_receive_skb

no longer used

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2011-07-22 04:47:54 +0800

19 Jul, 2011

1 commit

5c74501f7 ipv4: save cpu cycles from check_leaf()

Compiler is not smart enough to avoid double BSWAP instructions in
ntohl(inet_make_mask(plen)).

Let's cache this value in struct leaf_info (this fills a hole on 64-bit arches).

With route cache disabled, this saves ~2% of cpu in udpflood bench on
x86_64 machine.
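
The caching idea can be sketched as follows: compute the host-order prefix mask once when the entry is created and reuse it on every lookup. `struct leaf_entry`, its `mask_host` field and the helper names are hypothetical, not the kernel's struct leaf_info layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical leaf entry; the field name mask_host is illustrative. */
struct leaf_entry {
    int plen;
    uint32_t mask_host;   /* prefix mask in host byte order, computed once */
};

/* Host-order mask for a prefix length of 0..32. */
static uint32_t prefix_mask_host(int plen)
{
    assert(plen >= 0 && plen <= 32);
    return plen ? ~(uint32_t)0 << (32 - plen) : 0;
}

static void leaf_init(struct leaf_entry *li, int plen)
{
    li->plen = plen;
    li->mask_host = prefix_mask_host(plen);
}

/* Lookup hot path: uses the cached mask, so no byte swapping is needed. */
static int leaf_matches(const struct leaf_entry *li, uint32_t key_host,
                        uint32_t prefix_host)
{
    return (key_host & li->mask_host) == prefix_host;
}

static void leaf_example(void)
{
    struct leaf_entry li;

    leaf_init(&li, 24);
    assert(li.mask_host == 0xFFFFFF00u);
    assert(leaf_matches(&li, 0xC0A80105u, 0xC0A80100u));   /* 192.168.1.5 */
    assert(!leaf_matches(&li, 0xC0A80205u, 0xC0A80100u));  /* 192.168.2.5 */
}
```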

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-07-19 01:41:18 +0800

18 Jul, 2011

3 commits

d3aaeb38c net: Add ->neigh_lookup() operation to dst_ops

In the future dst entries will be neigh-less. In that environment we
need to have an easy transition point for current users of
dst->neighbour outside of the packet output fast path.

Signed-off-by: David S. Miller

David S. Miller
2011-07-18 15:40:17 +0800

69cce1d14 net: Abstract dst->neighbour accesses behind helpers.

dst_{get,set}_neighbour()

Signed-off-by: David S. Miller

David S. Miller
2011-07-18 14:11:35 +0800

8f40b161d neigh: Pass neighbour entry to output ops.

This will get us closer to being able to do "neigh stuff"
completely independent of the underlying dst_entry for
protocols (ipv4/ipv6) that wish to do so.

We will also be able to make dst entries neigh-less.

Signed-off-by: David S. Miller

David S. Miller
2011-07-18 14:11:17 +0800

17 Jul, 2011

3 commits

542d4d685 neigh: Kill ndisc_ops->queue_xmit

It is always dev_queue_xmit().

Signed-off-by: David S. Miller

David S. Miller
2011-07-17 09:30:59 +0800

b23b5455b neigh: Kill hh_cache->hh_output

It's just taking on one of two possible values, either
neigh_ops->output or dev_queue_xmit(). And this is purely depending
upon whether nud_state has NUD_CONNECTED set or not.
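
Since the cached pointer only ever encoded one bit of state, the choice can be made at send time instead. The sketch below assumes a much simplified model: `STATE_CONNECTED`, `struct neigh`, `fast_xmit()`, `slow_output()` and `neigh_send()` are hypothetical names standing in for the NUD_CONNECTED bits, the neighbour entry, dev_queue_xmit() and neigh_ops->output.

```c
#include <assert.h>

#define STATE_CONNECTED 0x1   /* stands in for the NUD_CONNECTED bits */

struct packet { int id; };
typedef int (*xmit_fn)(struct packet *);

/* Hypothetical transmit paths; distinct return codes make the choice visible. */
static int fast_xmit(struct packet *p)   { (void)p; return 1; }
static int slow_output(struct packet *p) { (void)p; return 2; }

struct neigh {
    unsigned int state;
    xmit_fn output;           /* resolution path used when not connected */
};

/* Pick the path from the current state instead of a cached pointer. */
static int neigh_send(const struct neigh *n, struct packet *p)
{
    if (n->state & STATE_CONNECTED)
        return fast_xmit(p);
    return n->output(p);
}

static void neigh_send_example(void)
{
    struct packet p = { 1 };
    struct neigh n = { 0, slow_output };

    assert(neigh_send(&n, &p) == 2);   /* not connected: resolution path */
    n.state = STATE_CONNECTED;
    assert(neigh_send(&n, &p) == 1);   /* connected: direct transmit */
}
```

Deriving the path from the state removes one field that had to be kept in sync on every state change.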

Signed-off-by: David S. Miller

David S. Miller
2011-07-17 08:45:02 +0800

47ec132a4 neigh: Kill neigh_ops->hh_output

It's always dev_queue_xmit().

Signed-off-by: David S. Miller

David S. Miller
2011-07-17 08:39:57 +0800