Eric Lee / smarc-fsl-linux-kernel

27 Jun, 2014

1 commit

724d0d32f udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup

[ Upstream commit 63c6f81cdde58c41da62a8d8a209592e42a0203e ]

It's too easy to add thousands of UDP sockets on a particular bucket,
and slow down an innocent multicast receiver.

Early demux is supposed to be an optimization, we should avoid spending
too much time in it.

It is interesting to note __udp4_lib_demux_lookup() only tries to
match the first socket in the chain.

10 is the threshold we already have in __udp4_lib_lookup() to switch
to secondary hash.

Fixes: 421b3885bf6d5 ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet
Reported-by: David Held
Cc: Shawn Bohrer
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2014-06-27 03:15:40 +0800
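
The bounded lookup described in this commit can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: struct demo_sock, demo_mcast_lookup() and the explicit chain_len argument are hypothetical names. The threshold of 10 comes from the commit message, and the "exactly one receiver" rule comes from the early demux commit (421b3885b) listed further down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket sitting on a hash chain. */
struct demo_sock {
    unsigned int group;       /* multicast group this socket listens on */
    struct demo_sock *next;   /* next socket in the same bucket */
};

#define DEMO_CHAIN_LIMIT 10   /* threshold quoted in the commit message */

/*
 * Return the single socket matching `group`, or NULL when there is no
 * match, more than one match, or the chain is longer than the limit.
 */
struct demo_sock *demo_mcast_lookup(struct demo_sock *head, size_t chain_len,
                                    unsigned int group)
{
    struct demo_sock *result = NULL;
    size_t matches = 0;

    /* A non-empty chain must have a head. */
    assert(chain_len == 0 || head != NULL);

    /* Long chain: early demux is only an optimization, so give up. */
    if (chain_len > DEMO_CHAIN_LIMIT)
        return NULL;

    for (struct demo_sock *sk = head; sk != NULL; sk = sk->next) {
        if (sk->group == group) {
            result = sk;
            matches++;
        }
    }
    return matches == 1 ? result : NULL;
}
```

Returning NULL here is not an error. The caller simply falls back to the regular, slower lookup path.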

19 Jan, 2014

1 commit

342dfc306 net: add build-time checks for msg->msg_name size

This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
handler msg_name and msg_namelen logic").

DECLARE_SOCKADDR validates that the structure we use for writing the
name information to is not larger than the buffer which is reserved
for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
consistently in sendmsg code paths.

Signed-off-by: Steffen Hurrle
Suggested-by: Hannes Frederic Sowa
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Steffen Hurrle
2014-01-19 15:04:16 +0800
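
A build-time size check of this kind can be approximated in user space with C11 _Static_assert. The sketch below is an assumption-laden analogue, not the kernel's DECLARE_SOCKADDR: the macro DEMO_DECLARE_SOCKADDR, the constant DEMO_MSG_NAME_MAX and the helper demo_fill_name() are hypothetical. Only the 128-byte limit is taken from the commit message.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#define DEMO_MSG_NAME_MAX 128   /* buffer size quoted in the commit message */

/*
 * Hypothetical analogue of a checked sockaddr declaration: compilation
 * fails if the pointed-to type cannot fit in the name buffer.
 */
#define DEMO_DECLARE_SOCKADDR(type, dst, src)                        \
    _Static_assert(sizeof(*(type)0) <= DEMO_MSG_NAME_MAX,            \
                   "sockaddr type larger than msg_name buffer");     \
    type dst = (type)(src)

/* Fill `name` with an IPv4 loopback address; return the bytes written. */
size_t demo_fill_name(void *name, unsigned short port)
{
    assert(name != NULL);
    DEMO_DECLARE_SOCKADDR(struct sockaddr_in *, sin, name);

    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);
    sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return sizeof(*sin);
}
```

Because the check happens at compile time, an oversized address structure is rejected before any code runs, which is the point the commit message makes.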

15 Jan, 2014

1 commit

63862b5be net: replace macros net_random and net_srandom with direct calls to prandom

This patch removes the net_random and net_srandom macros and replaces
them with direct calls to the prandom ones. As new commits only seem to
use prandom_u32 there is no use to keep them around.
This change makes it easier to grep for users of prandom_u32.

Signed-off-by: Aruna-Hewapathirane
Suggested-by: Hannes Frederic Sowa
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Aruna-Hewapathirane
2014-01-15 07:15:25 +0800

07 Jan, 2014

1 commit

56a4342df Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
net/ipv6/ip6_tunnel.c
net/ipv6/ip6_vti.c

ipv6 tunnel statistic bug fixes conflicting with consolidation into
generic sw per-cpu net stats.

qlogic conflict between queue counting bug fix and the addition
of multiple MAC address support.

Signed-off-by: David S. Miller

David S. Miller
2014-01-07 06:37:45 +0800

03 Jan, 2014

1 commit

7a7ffbabf ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NIC

VM to VM GSO traffic is broken if it goes through VXLAN or GRE
tunnel and the physical NIC on the host supports hardware VXLAN/GRE
GSO offload (e.g. bnx2x and next-gen mlx4).

Two issues -
(VXLAN) VM traffic has SKB_GSO_DODGY and SKB_GSO_UDP_TUNNEL with
SKB_GSO_TCP/UDP set depending on the inner protocol. GSO header
integrity check fails in udp4_ufo_fragment if inner protocol is
TCP. Also gso_segs is calculated incorrectly using skb->len that
includes tunnel header. Fix: robust check should only be applied
to the inner packet.

(VXLAN & GRE) Once GSO header integrity check passes, NULL segs
is returned and the original skb is sent to hardware. However the
tunnel header is already pulled. Fix: tunnel header needs to be
restored so that hardware can perform GSO properly on the original
packet.

Signed-off-by: Wei-Chun Chao
Signed-off-by: David S. Miller

Wei-Chun Chao
2014-01-03 08:06:47 +0800

20 Dec, 2013

1 commit

1669cb985 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2013-12-19

1) Use the user supplied policy index instead of a generated one
if present. From Fan Du.

2) Make xfrm migration namespace aware. From Fan Du.

3) Make the xfrm state and policy locks namespace aware. From Fan Du.

4) Remove ancient sleeping when the SA is in acquire state,
we now queue packets to the policy instead. This replaces the
sleeping code.

5) Remove FLOWI_FLAG_CAN_SLEEP. This was used to notify xfrm about the
possibility to sleep. The sleeping code is gone, so remove it.

6) Check user specified spi for IPComp. The spi for IPComp is only
16 bit wide, so check for a valid value. From Fan Du.

7) Export verify_userspi_info to check for valid user supplied spi ranges
with pfkey and netlink. From Fan Du.

8) RFC3173 states that if the total size of a compressed payload and the IPComp
header is not smaller than the size of the original payload, the IP datagram
must be sent in the original non-compressed form. These packets are dropped
by the inbound policy check because they are not transformed. Document the need
to set 'level use' for IPcomp to receive such packets anyway. From Fan Du.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller

David S. Miller
2013-12-20 07:37:49 +0800

18 Dec, 2013

1 commit

e47eb5dfb udp: ipv4: do not use sk_dst_lock from softirq context

Using sk_dst_lock from softirq context is not supported right now.

Instead of adding BH protection everywhere,
udp_sk_rx_dst_set() can instead use xchg(), as suggested
by David.

Reported-by: Fengguang Wu
Fixes: 975022310233 ("udp: ipv4: must add synchronization in udp_sk_rx_dst_set()")
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2013-12-18 03:50:58 +0800
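
The lock-free replacement described in this commit can be illustrated with C11 atomics standing in for the kernel's xchg(). This is a user-space sketch under assumptions: struct demo_dst, struct demo_rx_sock, demo_dst_release() and demo_rx_dst_set() are hypothetical names, and the reference counting is reduced to a bare counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical reference-counted route entry. */
struct demo_dst {
    atomic_int refcnt;
};

/* Hypothetical socket holding one cached route. */
struct demo_rx_sock {
    _Atomic(struct demo_dst *) rx_dst;
};

/* Drop one reference; a NULL entry is ignored. */
void demo_dst_release(struct demo_dst *dst)
{
    if (dst != NULL)
        atomic_fetch_sub(&dst->refcnt, 1);
}

/*
 * Publish `dst` as the cached route. The swap is one atomic operation,
 * so concurrent callers each receive a distinct old pointer and every
 * replaced entry is released exactly once, without taking a lock.
 */
void demo_rx_dst_set(struct demo_rx_sock *sk, struct demo_dst *dst)
{
    assert(sk != NULL && dst != NULL);
    atomic_fetch_add(&dst->refcnt, 1);   /* reference now held by sk */
    struct demo_dst *old = atomic_exchange(&sk->rx_dst, dst);
    demo_dst_release(old);
}
```

Since no lock is taken, the same function is usable from contexts where sleeping or disabling bottom halves would be undesirable, which matches the motivation given in the commit message.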

12 Dec, 2013

2 commits

975022310 udp: ipv4: must add synchronization in udp_sk_rx_dst_set()

Unlike TCP, UDP input path does not hold the socket lock.

Before messing with sk->sk_rx_dst, we must use a spinlock, otherwise
multiple cpus could leak a refcount.

This patch also takes care of renewing a stale dst entry.
(When the sk->sk_rx_dst would not be used by IP early demux)

Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet
Cc: Shawn Bohrer
Signed-off-by: David S. Miller

Eric Dumazet
2013-12-12 09:21:10 +0800

610438b74 udp: ipv4: fix potential use after free in udp_v4_early_demux()

pskb_may_pull() can reallocate skb->head, we need to move the
initialization of iph and uh pointers after its call.

Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet
Cc: Shawn Bohrer
Signed-off-by: David S. Miller

Eric Dumazet
2013-12-12 05:10:14 +0800
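
The pointer-ordering rule in this fix (take header pointers only after the call that may reallocate) can be shown with a user-space buffer that grows via realloc(). This is a sketch under assumptions: struct demo_buf, demo_may_pull() and demo_first_byte() are hypothetical stand-ins and do not model skb internals.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer whose storage may move when it grows. */
struct demo_buf {
    unsigned char *head;
    size_t len;
};

/* Ensure at least `need` bytes are available; may reallocate `head`. */
int demo_may_pull(struct demo_buf *b, size_t need)
{
    if (need <= b->len)
        return 1;
    unsigned char *p = realloc(b->head, need);
    if (p == NULL)
        return 0;
    memset(p + b->len, 0, need - b->len);   /* zero the newly added tail */
    b->head = p;
    b->len = need;
    return 1;
}

/* Read the first header byte, taking the pointer only after the pull. */
int demo_first_byte(struct demo_buf *b, size_t hdr_len)
{
    assert(b != NULL && hdr_len > 0);
    if (!demo_may_pull(b, hdr_len))
        return -1;
    const unsigned char *hdr = b->head;     /* safe: read after the pull */
    return hdr[0];
}
```

Had hdr been initialised before demo_may_pull(), it could point at storage that realloc() already freed, which is the class of bug the commit addresses.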

11 Dec, 2013

1 commit

8afdd99a1 udp: ipv4: fix an use after free in __udp4_lib_rcv()

Dave Jones reported a use after free in UDP stack :

[ 5059.434216] =========================
[ 5059.434314] [ BUG: held lock freed! ]
[ 5059.434420] 3.13.0-rc3+ #9 Not tainted
[ 5059.434520] -------------------------
[ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there!
[ 5059.434815] (slock-AF_INET){+.-...}, at: [] udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.435012] 3 locks held by named/863:
[ 5059.435086] #0: (rcu_read_lock){.+.+..}, at: [] __netif_receive_skb_core+0x11d/0x940
[ 5059.435295] #1: (rcu_read_lock){.+.+..}, at: [] ip_local_deliver_finish+0x3e/0x410
[ 5059.435500] #2: (slock-AF_INET){+.-...}, at: [] udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.435734]
stack backtrace:
[ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ #9 [loadavg: 0.21 0.06 0.06 1/115 1365]
[ 5059.436052] Hardware name: /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010
[ 5059.436223] 0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0
[ 5059.436390] ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0
[ 5059.436554] ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48
[ 5059.436718] Call Trace:
[ 5059.436769] [] dump_stack+0x4d/0x66
[ 5059.436904] [] debug_check_no_locks_freed+0x15a/0x160
[ 5059.437037] [] ? __sk_free+0x110/0x230
[ 5059.437147] [] kmem_cache_free+0x6a/0x150
[ 5059.437260] [] __sk_free+0x110/0x230
[ 5059.437364] [] sk_free+0x19/0x20
[ 5059.437463] [] sock_edemux+0x25/0x40
[ 5059.437567] [] sock_queue_rcv_skb+0x81/0x280
[ 5059.437685] [] ? udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.437805] [] __udp_queue_rcv_skb+0x42/0x240
[ 5059.437925] [] ? _raw_spin_lock+0x65/0x70
[ 5059.438038] [] udp_queue_rcv_skb+0x26b/0x4b0
[ 5059.438155] [] __udp4_lib_rcv+0x152/0xb00
[ 5059.438269] [] udp_rcv+0x15/0x20
[ 5059.438367] [] ip_local_deliver_finish+0x10f/0x410
[ 5059.438492] [] ? ip_local_deliver_finish+0x3e/0x410
[ 5059.438621] [] ip_local_deliver+0x43/0x80
[ 5059.438733] [] ip_rcv_finish+0x140/0x5a0
[ 5059.438843] [] ip_rcv+0x296/0x3f0
[ 5059.438945] [] __netif_receive_skb_core+0x742/0x940
[ 5059.439074] [] ? __netif_receive_skb_core+0x11d/0x940
[ 5059.442231] [] ? trace_hardirqs_on+0xd/0x10
[ 5059.442231] [] __netif_receive_skb+0x13/0x60
[ 5059.442231] [] netif_receive_skb+0x1e/0x1f0
[ 5059.442231] [] napi_gro_receive+0x70/0xa0
[ 5059.442231] [] rtl8169_poll+0x166/0x700 [r8169]
[ 5059.442231] [] net_rx_action+0x129/0x1e0
[ 5059.442231] [] __do_softirq+0xed/0x240
[ 5059.442231] [] irq_exit+0x125/0x140
[ 5059.442231] [] do_IRQ+0x51/0xc0
[ 5059.442231] [] common_interrupt+0x6f/0x6f

We need to keep a reference on the socket, by using skb_steal_sock()
at the right place.

Note that another patch is needed to fix a race in
udp_sk_rx_dst_set(), as we hold no lock protecting the dst.

Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux")
Reported-by: Dave Jones
Signed-off-by: Eric Dumazet
Cc: Shawn Bohrer
Signed-off-by: David S. Miller

Eric Dumazet
2013-12-11 11:58:40 +0800

06 Dec, 2013

1 commit

0e0d44ab4 net: Remove FLOWI_FLAG_CAN_SLEEP

FLOWI_FLAG_CAN_SLEEP was used to notify xfrm about the possibility
to sleep until the needed states are resolved. This code is gone,
so FLOWI_FLAG_CAN_SLEEP is not needed anymore.

Signed-off-by: Steffen Klassert

Steffen Klassert
2013-12-06 14:24:39 +0800

30 Nov, 2013

2 commits

f1d8cba61 inet: fix possible seqlock deadlocks

In commit c9e9042994d3 ("ipv4: fix possible seqlock deadlock") I left
other places where IP_INC_STATS_BH() was improperly used.

udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from
process context, not from softirq context.

This was detected by lockdep seqlock support.

Reported-by: jongman heo
Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Signed-off-by: Eric Dumazet
Cc: Hannes Frederic Sowa
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Eric Dumazet
2013-11-30 05:37:36 +0800

d3f7d56a7 net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST

Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
added an internal flag MSG_SENDPAGE_NOTLAST, similar to
MSG_MORE.

algif_hash, algif_skcipher, and udp used MSG_MORE from tcp_sendpages()
and need to see the new flag as identical to MSG_MORE.

This fixes sendfile() on AF_ALG.

v3: also fix udp

Cc: Tom Herbert
Cc: Eric Dumazet
Cc: David S. Miller
Cc: # 3.4.x + 3.2.x
Reported-and-tested-by: Shawn Landden
Original-patch: Richard Weinberger
Signed-off-by: Shawn Landden
Signed-off-by: David S. Miller

Shawn Landden
2013-11-30 05:32:54 +0800
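
One way to express "treat the new flag as identical to MSG_MORE" is a single predicate that tests both bits. The sketch below is illustrative only: demo_more_data_follows() is a hypothetical helper, the DEMO_ constants are local stand-ins (the second flag is kernel-internal and not exported to user space), and the commit itself may implement the equivalence differently in each consumer.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the two flags; values are for illustration. */
#define DEMO_MSG_MORE              0x8000
#define DEMO_MSG_SENDPAGE_NOTLAST  0x20000

/* The two flags must occupy different bits for the mask to make sense. */
static_assert((DEMO_MSG_MORE & DEMO_MSG_SENDPAGE_NOTLAST) == 0,
              "flag bits must not overlap");

/* True when the caller signals that more data will follow this chunk. */
bool demo_more_data_follows(int flags)
{
    return (flags & (DEMO_MSG_MORE | DEMO_MSG_SENDPAGE_NOTLAST)) != 0;
}
```

A consumer that previously tested only the first flag would switch to the combined predicate, so a chunk marked "not last" is held back in the same way as one marked "more".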

24 Nov, 2013

1 commit

85fbaa750 inet: fix addr_len/msg->msg_namelen assignment in recv_error and rxpmtu functions

Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
of uninitialized memory to user in recv syscalls") conditionally updated
addr_len if the msg_name is written to. The recv_error and rxpmtu
functions relied on the recvmsg functions to set up addr_len before.

As this does not happen any more we have to pass addr_len to those
functions as well and set it to the size of the corresponding sockaddr
length.

This broke traceroute and such.

Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")
Reported-by: Brad Spengler
Reported-by: Tom Labanowski
Cc: mpb
Cc: David S. Miller
Cc: Eric Dumazet
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2013-11-24 06:46:23 +0800

20 Nov, 2013

1 commit

1ee2dcc22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Mostly these are fixes for fallout due to merge window changes, as
well as cures for problems that have been with us for a much longer
period of time"

1) Johannes Berg noticed two major deficiencies in our genetlink
registration. Some genetlink protocols were passing in constant
counts for their ops array rather than something like
ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
using fixed IDs for their multicast groups.

We have to retain these fixed IDs to keep existing userland tools
working, but reserve them so that other multicast groups used by
other protocols can not possibly conflict.

In dealing with these two problems, we actually now use less state
management for genetlink operations and multicast groups.

2) When configuring interface hardware timestamping, fix several
drivers that simply do not validate that the hwtstamp_config value
is one the driver actually supports. From Ben Hutchings.

3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.

4) In dev_forward_skb(), set the skb->protocol in the right order
relative to skb_scrub_packet(). From Alexei Starovoitov.

5) Bridge erroneously fails to use the proper wrapper functions to make
calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
Makita.

6) When detaching a bridge port, make sure to flush all VLAN IDs to
prevent them from leaking, also from Toshiaki Makita.

7) Put in a compromise for TCP Small Queues so that deep queued devices
that delay TX reclaim non-trivially don't have such a performance
decrease. One particularly problematic area is 802.11 AMPDU in
wireless. From Eric Dumazet.

8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
here. Fix from Eric Dumazet, reported by Dave Jones.

9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.

10) When computing mergeable buffer sizes, virtio-net fails to take the
virtio-net header into account. From Michael Dalton.

11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
bumping, this one has been with us for a while. From Eric Dumazet.

12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
Hugne.

13) 6lowpan bit used for traffic classification was wrong, from Jukka
Rissanen.

14) macvlan has the same issue as normal vlans did wrt. propagating LRO
disabling down to the real device, fix it the same way. From Michal
Kubecek.

15) CPSW driver needs to soft reset all slaves during suspend, from
Daniel Mack.

16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.

17) The xen-netfront RX buffer refill timer isn't properly scheduled on
partial RX allocation success, from Ma JieYue.

18) When ipv6 ping protocol support was added, the AF_INET6 protocol
initialization cleanup path on failure was borked a little. Fix
from Vlad Yasevich.

19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
blocks we can do the wrong thing with the msg_name we write back to
userspace. From Hannes Frederic Sowa. There is another fix in the
works from Hannes which will prevent future problems of this nature.

20) Fix route leak in VTI tunnel transmit, from Fan Du.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
genetlink: make multicast groups const, prevent abuse
genetlink: pass family to functions using groups
genetlink: add and use genl_set_err()
genetlink: remove family pointer from genl_multicast_group
genetlink: remove genl_unregister_mc_group()
hsr: don't call genl_unregister_mc_group()
quota/genetlink: use proper genetlink multicast APIs
drop_monitor/genetlink: use proper genetlink multicast APIs
genetlink: only pass array to genl_register_family_with_ops()
tcp: don't update snd_nxt, when a socket is switched from repair mode
atm: idt77252: fix dev refcnt leak
xfrm: Release dst if this dst is improper for vti tunnel
netlink: fix documentation typo in netlink_set_err()
be2net: Delete secondary unicast MAC addresses during be_close
be2net: Fix unconditional enabling of Rx interface options
net, virtio_net: replace the magic value
ping: prevent NULL pointer dereference on write to msg_name
bnx2x: Prevent "timeout waiting for state X"
bnx2x: prevent CFC attention
bnx2x: Prevent panic during DMAE timeout
...

Linus Torvalds
2013-11-20 07:50:47 +0800

19 Nov, 2013

1 commit

bceaa9024 inet: prevent leakage of uninitialized memory to user in recv syscalls

Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.

If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_namelen to 0.

Reported-by: mpb
Suggested-by: Eric Dumazet
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2013-11-19 04:12:03 +0800
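
The rule "only update *addr_len when the sockaddr is actually filled in" can be sketched as follows. This is a hypothetical user-space helper, demo_recv_name(), and not the kernel receive path. It only shows the length and the data being written on the same branch.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical receive helper. `name` is optional; `*addr_len` is written
 * only on the path that fills `name`, so the caller never sees a length
 * that describes bytes which were not initialised.
 */
void demo_recv_name(struct sockaddr_in *name, int *addr_len,
                    const struct sockaddr_in *peer)
{
    assert(addr_len != NULL && peer != NULL);
    if (name != NULL) {
        memcpy(name, peer, sizeof(*name));
        *addr_len = (int)sizeof(*name);
    }
}
```

With this shape, a caller that starts with a length of zero and passes no name buffer still sees zero afterwards, so nothing is copied out on its behalf.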

15 Nov, 2013

1 commit

652586df9 seq_file: remove "%n" usage from seq_file users

All seq_printf() users are using "%n" for calculating padding size,
convert them to use seq_setwidth() / seq_pad() pair.

Signed-off-by: Tetsuo Handa
Signed-off-by: Kees Cook
Cc: Joe Perches
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tetsuo Handa
2013-11-15 08:32:20 +0800
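
The idea of computing padding without "%n" can be shown in user space by using the return value of snprintf(). This is an analogue only, under assumptions: demo_print_padded() is a hypothetical function and does not reproduce the seq_setwidth() / seq_pad() interface named in the commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<label>:<value>" into `out`, then pad with spaces up to `width`
 * columns. The padding size comes from snprintf's return value, not "%n".
 * Returns the final string length, or -1 if `out` is too small.
 */
int demo_print_padded(char *out, size_t size, int width,
                      const char *label, int value)
{
    assert(out != NULL && label != NULL && width >= 0);
    int n = snprintf(out, size, "%s:%d", label, value);
    if (n < 0 || (size_t)n >= size)
        return -1;
    while (n < width) {
        if ((size_t)n + 1 >= size)   /* keep room for the terminator */
            return -1;
        out[n++] = ' ';
    }
    out[n] = '\0';
    return n;
}
```

The formatted length is known from the return value, so no format directive has to write through a pointer argument.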

20 Oct, 2013

2 commits

1bbdceef1 inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once

Initialize the ehash and ipv6_hash_secrets with net_get_random_once.

Each compilation unit gets its own secret now:
ipv4/inet_hashtables.o
ipv4/udp.o
ipv6/inet6_hashtables.o
ipv6/udp.o
rds/connection.o

The functions still get inlined into the hashing functions. In the fast
path we have at most two (needed in ipv6) if (unlikely(...)).

Cc: Eric Dumazet
Cc: "David S. Miller"
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2013-10-20 07:45:35 +0800

65cd8033f ipv4: split inet_ehashfn to hash functions per compilation unit

This duplicates a bit of code but lets us easily introduce
separate secret keys later. The separate compilation units are
ipv4/inet_hashtables.o, ipv4/udp.o and rds/connection.o.

Cc: Eric Dumazet
Cc: "David S. Miller"
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2013-10-20 07:45:34 +0800

09 Oct, 2013

4 commits

f69b923a7 udp: fix a typo in __udp4_lib_mcast_demux_lookup

At this point sk might contain garbage.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2013-10-09 13:51:57 +0800

fbf8866d6 net: ipv4 only populate IP_PKTINFO when needed

Since the removal of the routing cache, computing
fib_compute_spec_dst() does a fib_table lookup for each UDP multicast
packet received. This has introduced a performance regression for some
UDP workloads.

This change skips populating the packet info for sockets that do not have
IP_PKTINFO set.

Benchmark results from a netperf UDP_RR test:
Before 89789.68 transactions/s
After 90587.62 transactions/s

Benchmark results from a fio 1 byte UDP multicast pingpong test
(Multicast one way unicast response):
Before 12.63us RTT
After 12.48us RTT

Signed-off-by: Shawn Bohrer
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Shawn Bohrer
2013-10-09 04:27:33 +0800

421b3885b udp: ipv4: Add udp early demux

The removal of the routing cache introduced a performance regression for
some UDP workloads since a dst lookup must be done for each packet.
This change caches the dst per socket in a similar manner to what we do
for TCP by implementing early_demux.

For UDP multicast we can only cache the dst if there is only one
receiving socket on the host. Since caching only works when there is
one receiving socket we do the multicast socket lookup using RCU.

For UDP unicast we only demux sockets with an exact match in order to
not break forwarding setups. Additionally since the hash chains may be
long we only check the first socket to see if it is a match and not
waste extra time searching the whole chain when we might not find an
exact match.

Benchmark results from a netperf UDP_RR test:
Before 87961.22 transactions/s
After 89789.68 transactions/s

Benchmark results from a fio 1 byte UDP multicast pingpong test
(Multicast one way unicast response):
Before 12.97us RTT
After 12.63us RTT

Signed-off-by: Shawn Bohrer
Signed-off-by: David S. Miller

Shawn Bohrer
2013-10-09 04:27:33 +0800

005ec9743 udp: Only allow busy read/poll on connected sockets

UDP sockets can receive packets from multiple endpoints and thus may be
received on multiple receive queues. Since packets can arrive
on multiple receive queues we should not mark the napi_id for all
packets. This makes busy read/poll only work for connected UDP sockets.

This additionally enables busy read/poll for UDP multicast packets as
long as the socket is connected by moving the check into
__udp_queue_rcv_skb().

Signed-off-by: Shawn Bohrer
Suggested-by: Eric Dumazet
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Shawn Bohrer
2013-10-09 04:27:33 +0800

02 Oct, 2013

1 commit

4fbef95af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/emulex/benet/be.h
drivers/net/usb/qmi_wwan.c
drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h
include/net/netfilter/nf_conntrack_synproxy.h
include/net/secure_seq.h

The conflicts are of two varieties:

1) Conflicts with Joe Perches's 'extern' removal from header file
function declarations. Usually it's an argument signature change
or a function being added/removed. The resolutions are trivial.

2) Some overlapping changes in qmi_wwan.c and be.h, one commit adds
a new value, another changes an existing value. That sort of
thing.

Signed-off-by: David S. Miller

David S. Miller
2013-10-02 05:06:14 +0800

01 Oct, 2013

1 commit

0bbf87d85 net ipv4: Convert ipv4.ip_local_port_range to be per netns v3

- Move sysctl_local_ports from a global variable into struct netns_ipv4.
- Modify inet_get_local_port_range to take a struct net, and update all
of the callers.
- Move the initialization of sysctl_local_ports into
sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c

v2:
- Ensure indentation used tabs
- Fixed ip.h so it applies cleanly to today's net-next

v3:
- Compile fixes of strange callers of inet_get_local_port_range.
This patch now successfully passes an allmodconfig build.
Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range

Originally-by: Samya
Acked-by: Nicolas Dichtel
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2013-10-01 12:59:38 +0800

29 Sep, 2013

1 commit

aa6615814 ipv4: processing ancillary IP_TOS or IP_TTL

If IP_TOS or IP_TTL are specified as ancillary data, then sendmsg() sends out
packets with the specified TTL or TOS overriding the socket values specified
with the traditional setsockopt().

The struct inet_cork stores the values of TOS, TTL and priority that are
passed through the struct ipcm_cookie. If there are user-specified TOS
(tos != -1) or TTL (ttl != 0) in the struct ipcm_cookie, these values are
used to override the per-socket values. In case of TOS also the priority
is changed accordingly.

Two helper functions get_rttos and get_rtconn_flags are defined to take
into account the presence of a user specified TOS value when computing
RT_TOS and RT_CONN_FLAGS.

Signed-off-by: Francesco Fusco
Signed-off-by: David S. Miller

Francesco Fusco
2013-09-29 06:21:52 +0800
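
The override rule in this commit (a per-call value wins over the per-socket value when it is present) can be sketched with the sentinels quoted in the message: tos != -1 and ttl != 0. Everything else is an assumption made for illustration: struct demo_sock_opts, struct demo_cookie, demo_effective_tos() and demo_effective_ttl() are hypothetical and are not the helpers the commit defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-socket defaults, as set with setsockopt(). */
struct demo_sock_opts {
    int tos;
    int ttl;
};

/* Hypothetical per-call values parsed from ancillary data. */
struct demo_cookie {
    int tos;   /* -1 means "not specified" */
    int ttl;   /*  0 means "not specified" */
};

/* Pick the TOS for this packet: the per-call value wins when present. */
int demo_effective_tos(const struct demo_sock_opts *sk,
                       const struct demo_cookie *c)
{
    assert(sk != NULL && c != NULL);
    return c->tos != -1 ? c->tos : sk->tos;
}

/* Pick the TTL for this packet: the per-call value wins when present. */
int demo_effective_ttl(const struct demo_sock_opts *sk,
                       const struct demo_cookie *c)
{
    assert(sk != NULL && c != NULL);
    return c->ttl != 0 ? c->ttl : sk->ttl;
}
```

The two sentinels differ because 0 is a valid TOS value, so TOS needs -1 to mean "unset".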

24 Sep, 2013

1 commit

1a462d189 net: udp: do not report ICMP redirects to user space

Redirect isn't an error condition, it should leave
the error handler without touching the socket.

Signed-off-by: Duan Jiong
Signed-off-by: David S. Miller

Duan Jiong
2013-09-24 22:15:49 +0800

01 Sep, 2013

1 commit

eb3c0d83c net: unify skb_udp_tunnel_segment() and skb_udp6_tunnel_segment()

As suggested by Pravin, we can unify the code in case of duplicated
code.

Cc: Pravin Shelar
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2013-09-01 10:30:01 +0800

16 Aug, 2013

1 commit

d14c5ab6b net: proc_fs: trivial: print UIDs as unsigned int

UIDs are printed in the proc_fs as signed int, whereas
they are unsigned int.

Signed-off-by: Francesco Fusco
Signed-off-by: David S. Miller

Francesco Fusco
2013-08-16 05:37:46 +0800
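
The difference matters for large UID values. A minimal user-space illustration, with a hypothetical helper demo_format_uid() and assuming a 32-bit unsigned int, formats the value with "%u" so that UIDs above INT_MAX keep their numeric meaning:

```c
#include <assert.h>
#include <stdio.h>

/* Format a UID with the unsigned conversion; returns the string length. */
int demo_format_uid(char *out, size_t size, unsigned int uid)
{
    assert(out != NULL && size > 0);
    return snprintf(out, size, "%u", uid);
}
```

Formatting the same value through a signed conversion would show a UID such as 4294967294 as a negative number instead.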

28 Jul, 2013

1 commit

c26bf4a51 pktgen: Add UDPCSUM flag to support UDP checksums

UDP checksums are optional, hence pktgen has been omitting them in
favour of performance. The optional flag UDPCSUM enables UDP
checksumming. If the output device supports hardware checksumming
the skb is prepared and marked CHECKSUM_PARTIAL, otherwise the
checksum is generated in software.

Signed-off-by: Thomas Graf
Cc: Eric Dumazet
Cc: Ben Greear
Signed-off-by: David S. Miller

Thomas Graf
2013-07-28 13:16:36 +0800
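
For the software path, the arithmetic at the core of a UDP checksum is the ones' complement sum described in RFC 1071. The sketch below shows only that sum over a byte buffer. It is not pktgen's code, demo_inet_checksum() is a hypothetical name, and a real UDP checksum additionally covers a pseudo-header built from the IP addresses, protocol and length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Ones' complement checksum over `len` bytes, reading the data as
 * big-endian 16-bit words. Intended for packet-sized inputs, where the
 * 32-bit accumulator cannot overflow.
 */
uint16_t demo_inet_checksum(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;      /* pad the odd byte with zero */

    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

For the byte sequence 00 01 f2 03 f4 f5 f6 f7 used as the worked example in RFC 1071, the folded sum is 0xddf2 and the function returns its complement, 0x220d.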

12 Jul, 2013

1 commit

cdbaa0bb2 gso: Update tunnel segmentation to support Tx checksum offload

This change makes it so that the GRE and VXLAN tunnels can make use of Tx
checksum offload support provided by some drivers via the hw_enc_features.
Without this fix enabling GSO means sacrificing Tx checksum offload and
this actually leads to a performance regression as shown below:

Throughput    Utilization    GSO
10^6bits/s    Send local     state
              % S
6276.51       8.39           enabled
7123.52       8.42           disabled

To resolve this it was necessary to address two items. First
netif_skb_features needed to be updated so that it would correctly handle
the Trans Ether Bridging protocol without impacting the need to check for
Q-in-Q tagging. To do this it was necessary to update harmonize_features
so that it used skb_network_protocol instead of just using the outer
protocol.

Second it was necessary to update the GRE and UDP tunnel segmentation
offloads so that they would reset the encapsulation bit and inner header
offsets after the offload was complete.

As a result of this change I have seen the following results on an interface
with Tx checksum enabled for encapsulated frames:

Throughput    Utilization    GSO
10^6bits/s    Send local     state
              % S
7123.52       8.42           disabled
8321.75       5.43           enabled

v2: Instead of replacing the reference to skb->protocol with
skb_network_protocol just replace the protocol reference in
harmonize_features to allow for double VLAN tag checks.

Signed-off-by: Alexander Duyck
Signed-off-by: David S. Miller

Alexander Duyck
2013-07-12 03:18:49 +0800

11 Jul, 2013

2 commits

8b80cda53 net: rename ll methods to busy-poll

Rename ndo_ll_poll to ndo_busy_poll.
Rename sk_mark_ll to sk_mark_napi_id.
Rename skb_mark_ll to skb_mark_napi_id.
Correct all users of these functions.
Update comments and defines in include/net/busy_poll.h

Signed-off-by: Eliezer Tamir
Signed-off-by: David S. Miller

Eliezer Tamir
2013-07-11 08:08:27 +0800

076bb0c82 net: rename include/net/ll_poll.h to include/net/busy_poll.h

Rename the file and correct all the places where it is included.

Signed-off-by: Eliezer Tamir
Signed-off-by: David S. Miller

Eliezer Tamir
2013-07-11 08:08:27 +0800

03 Jul, 2013

1 commit

8822b64a0 ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data

We accidentally call down to ip6_push_pending_frames when uncorking
pending AF_INET data on an ipv6 socket. This results in the following
splat (from Dave Jones):

skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:126!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth
+netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c
CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37
task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000
RIP: 0010:[] [] skb_panic+0x63/0x65
RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282
RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006
RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520
RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800
R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800
FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4
ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6
ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0
Call Trace:
[] skb_push+0x3a/0x40
[] ip6_push_pending_frames+0x1f6/0x4d0
[] ? mark_held_locks+0xbb/0x140
[] udp_v6_push_pending_frames+0x2b9/0x3d0
[] ? udplite_getfrag+0x20/0x20
[] udp_lib_setsockopt+0x1aa/0x1f0
[] ? fget_light+0x387/0x4f0
[] udpv6_setsockopt+0x34/0x40
[] sock_common_setsockopt+0x14/0x20
[] SyS_setsockopt+0x71/0xd0
[] tracesys+0xdd/0xe2
Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55
RIP [] skb_panic+0x63/0x65
RSP

This patch adds a check if the pending data is of address family AF_INET
and directly calls udp_push_pending_frames from udp_v6_push_pending_frames
if that is the case.

This bug was found by Dave Jones with trinity.

(Also move the initialization of fl6 below the AF_INET check, even if
not strictly necessary.)

Cc: Dave Jones
Cc: YOSHIFUJI Hideaki
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2013-07-03 03:44:18 +0800

13 Jun, 2013

1 commit

7c0cadc69 udp: fix two sparse errors

commit ba418fa357a7b3c ("soreuseport: UDP/IPv4 implementation")
added following sparse errors :

net/ipv4/udp.c:433:60: warning: cast from restricted __be16
net/ipv4/udp.c:433:60: warning: incorrect type in argument 1 (different base types)
net/ipv4/udp.c:433:60: expected unsigned short [unsigned] [usertype] val
net/ipv4/udp.c:433:60: got restricted __be16 [usertype] sport
net/ipv4/udp.c:433:60: warning: cast from restricted __be16
net/ipv4/udp.c:433:60: warning: cast from restricted __be16
net/ipv4/udp.c:514:60: warning: cast from restricted __be16
net/ipv4/udp.c:514:60: warning: incorrect type in argument 1 (different base types)
net/ipv4/udp.c:514:60: expected unsigned short [unsigned] [usertype] val
net/ipv4/udp.c:514:60: got restricted __be16 [usertype] sport
net/ipv4/udp.c:514:60: warning: cast from restricted __be16
net/ipv4/udp.c:514:60: warning: cast from restricted __be16
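
The warnings report a __be16 value (sport) being passed where a plain
unsigned short was expected. A rough userspace sketch of keeping the two
byte orders apart follows; be16_t and both helpers are hypothetical and
only emulate what sparse enforces through the __be16 annotation.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htons, ntohs */

/* Hypothetical wrapper type: a network-order port is a distinct struct,
 * so passing it where a host-order integer is expected fails to compile. */
typedef struct { uint16_t raw; } be16_t;

/* Convert a host-order port to its network-order representation. */
static be16_t port_to_be16(uint16_t host_port)
{
    be16_t p = { htons(host_port) };
    return p;
}

/* Convert back to host order before doing arithmetic or comparisons. */
static uint16_t be16_to_port(be16_t p)
{
    return ntohs(p.raw);
}
```

With this split, every conversion is explicit: a be16_t can only become an
integer again through be16_to_port().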

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2013-06-13 06:03:24 +0800

12 Jun, 2013

1 commit

da5bab079 net: udp4: move GSO functions to udp_offload

Similarly to TCP offloading and UDPv6 offloading, move all related
UDPv4 functions to udp_offload.c to make things more explicit. Also,
by this, we can make those functions static.

Signed-off-by: Daniel Borkmann
Signed-off-by: David S. Miller

Daniel Borkmann
2013-06-12 15:47:25 +0800

11 Jun, 2013

1 commit

a5b50476f udp: add low latency socket poll support

Add support for busy-polling on UDP sockets.
In __udp[46]_lib_rcv add a call to sk_mark_ll() to copy the napi_id
from the skb into the sk.
This is done at the earliest possible moment, right after we identify
which socket this skb is for.
In __skb_recv_datagram, when there is no data and the user
tries to read, we busy poll.
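
The two steps (record the napi_id at receive time, then spin on the device
when a read finds the queue empty) can be modelled as below. This is a
simplified userspace sketch, not the kernel implementation: struct ll_skb,
struct ll_sock, the poll callback type, the budget parameter and demo_poll()
are all assumptions made for illustration.

```c
#include <assert.h>

struct ll_skb  { unsigned int napi_id; };
struct ll_sock { unsigned int napi_id; int queued; };

/* Models the idea of sk_mark_ll(): remember which NAPI context
 * delivered the packet for this socket. */
static void model_mark_ll(struct ll_sock *sk, const struct ll_skb *skb)
{
    sk->napi_id = skb->napi_id;
}

/* Hypothetical device poll callback: returns how many packets it queued. */
typedef int (*poll_fn)(struct ll_sock *sk);

/* Spin on the device for at most `budget` rounds while the queue is empty.
 * Returns 1 if data is available, 0 if the caller should sleep instead. */
static int model_busy_poll(struct ll_sock *sk, poll_fn poll, int budget)
{
    while (sk->queued == 0 && budget-- > 0) {
        if (sk->napi_id == 0)   /* no NAPI context recorded: cannot busy poll */
            break;
        sk->queued += poll(sk);
    }
    return sk->queued > 0;
}

/* Example callback: the packet shows up on the third poll. */
static int demo_poll(struct ll_sock *sk)
{
    static int calls;
    (void)sk;
    return ++calls == 3 ? 1 : 0;
}
```

The bounded budget is a design choice of this sketch: it keeps the caller
from spinning forever and leaves the normal blocking path as the fallback.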

Signed-off-by: Alexander Duyck
Signed-off-by: Jesse Brandeburg
Signed-off-by: Eliezer Tamir
Acked-by: Eric Dumazet
Tested-by: Willem de Bruijn
Signed-off-by: David S. Miller

Eliezer Tamir
2013-06-11 12:22:36 +0800

01 Jun, 2013

1 commit

c3f1dbaf6 net: Update RFS target at poll for tcp/udp

The current state of affairs is that read()/write() will set up
RFS (Receive Flow Steering) for internet protocol sockets while
poll()/epoll() does not.

When poll() gets called with a TCP or UDP socket, we should update
the flow target.

This permits RFS (if enabled) to select the appropriate CPU for
subsequent incoming packets.

Note: Only connected UDP sockets can benefit from RFS.
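
As a rough illustration of the behaviour described here, the flow target can
be modelled as a small table mapping a flow hash to the CPU that last touched
the socket. Everything in this sketch is an assumption made for the example:
the table size, the NO_CPU marker, the struct names, and the convention that
a socket without a usable flow hash (such as an unconnected UDP socket) has
rxhash == 0.

```c
#include <assert.h>
#include <stdint.h>

#define FLOW_TABLE_SIZE 8u      /* hypothetical size, must be a power of two */
#define NO_CPU 0xffffu          /* slot has no recorded CPU */

struct flow_table { uint16_t cpu[FLOW_TABLE_SIZE]; };

/* rxhash == 0 models a socket with no flow hash to steer on. */
struct rfs_sock { uint32_t rxhash; };

static void flow_table_init(struct flow_table *t)
{
    for (unsigned int i = 0; i < FLOW_TABLE_SIZE; i++)
        t->cpu[i] = NO_CPU;
}

/* Record that the flow behind `sk` is being consumed on `cpu`. */
static void record_flow(struct flow_table *t, const struct rfs_sock *sk,
                        uint16_t cpu)
{
    if (sk->rxhash == 0)
        return;                 /* nothing to steer on */
    t->cpu[sk->rxhash & (FLOW_TABLE_SIZE - 1)] = cpu;
}

/* read() already recorded the flow; with this change poll() does too. */
static void model_read(struct flow_table *t, const struct rfs_sock *sk,
                       uint16_t cpu)
{
    record_flow(t, sk, cpu);
}

static void model_poll(struct flow_table *t, const struct rfs_sock *sk,
                       uint16_t cpu)
{
    record_flow(t, sk, cpu);
}
```

In this model an application that only ever calls model_poll() still leaves
a CPU entry behind for its connected socket, which is the point of the patch.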

Signed-off-by: David Majnemer
Signed-off-by: Eric Dumazet
Cc: Paul Turner
Cc: Tom Herbert
Signed-off-by: David S. Miller

David Majnemer
2013-06-01 07:24:43 +0800

28 May, 2013

1 commit

0d89d2035 MPLS: Add limited GSO support

In the case where a non-MPLS packet is received and an MPLS stack is
added it may well be the case that the original skb is GSO but the
NIC used for transmit does not support GSO of MPLS packets.

The aim of this code is to provide GSO in software for MPLS packets
whose skbs are GSO.

SKB Usage:

When an implementation adds an MPLS stack to a non-MPLS packet it should do
the following to skb metadata:

* Set skb->inner_protocol to the old non-MPLS ethertype of the packet.
skb->inner_protocol is added by this patch.

* Set skb->protocol to the new MPLS ethertype of the packet.

* Set skb->network_header to correspond to the
end of the L3 header, including the MPLS label stack.

I have posted a patch, "[PATCH v3.29] datapath: Add basic MPLS support to
kernel" which adds MPLS support to the kernel datapath of Open vSwitch.
That patch implements the above requirements in datapath/actions.c:push_mpls()
and was used to exercise this code. The datapath patch is against the Open
vSwitch tree but it is intended that it be added to the Open vSwitch code
present in the mainline Linux kernel at some point.

Features:

I believe that the approach that I have taken is at least partially
consistent with the handling of other protocols. Jesse, I understand that
you have some ideas here. I am more than happy to change my implementation.

This patch adds dev->mpls_features which may be used by devices
to advertise features supported for MPLS packets.

A new NETIF_F_MPLS_GSO feature is added for devices which support
hardware MPLS GSO offload. Currently no devices support this
and MPLS GSO always falls back to software.
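
The feature check can be pictured with the following sketch. It is a model
only: struct model_dev, the MODEL_F_MPLS_GSO bit and mpls_gso_path() are
hypothetical names standing in for the device structure, NETIF_F_MPLS_GSO
and the decision the stack makes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit value; stands in for NETIF_F_MPLS_GSO. */
#define MODEL_F_MPLS_GSO (1u << 0)

/* Models a device advertising its MPLS-specific features. */
struct model_dev { uint32_t mpls_features; };

enum gso_path { GSO_HARDWARE, GSO_SOFTWARE };

/* Use hardware offload only if the device advertises it for MPLS packets;
 * otherwise segment in software. */
static enum gso_path mpls_gso_path(const struct model_dev *dev)
{
    return (dev->mpls_features & MODEL_F_MPLS_GSO) ? GSO_HARDWARE
                                                   : GSO_SOFTWARE;
}
```

A device with mpls_features == 0, which matches the situation described
above where no device supports the offload, always yields GSO_SOFTWARE.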

Alternate Implementation:

One possible alternate implementation is to teach netif_skb_features()
and skb_network_protocol() about MPLS, in a similar way to their
understanding of VLANs. I believe this would avoid the need
for net/mpls/mpls_gso.c and in particular the calls to
__skb_push() and __skb_push() in mpls_gso_segment().

I have decided on the implementation in this patch as it should
not introduce any overhead in the case where mpls_gso is not compiled
into the kernel or inserted as a module.

MPLS GSO suggested by Jesse Gross.
Based in part on "v4 GRE: Add TCP segmentation offload for GRE"
by Pravin B Shelar.

Cc: Jesse Gross
Cc: Pravin B Shelar
Signed-off-by: Simon Horman
Signed-off-by: David S. Miller

Simon Horman
2013-05-28 13:50:59 +0800

09 May, 2013

1 commit

19acc3272 gso: Handle Trans-Ether-Bridging protocol in skb_network_protocol()

Rather than having logic to calculate the inner protocol in every
tunnel gso handler, move it to the gso code. This simplifies the code.

Cc: Eric Dumazet
Cc: Cong Wang
Cc: David S. Miller
Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller

Pravin B Shelar
2013-05-09 04:13:30 +0800