Eric Lee / smarc-fsl-linux-kernel

04 Nov, 2018

1 commit

189771d69 udp6: fix encap return code for resubmitting ... Browse Code »

[ Upstream commit 84dad55951b0d009372ec21760b650634246e144 ]

The commit eb63f2964dbe ("udp6: add missing checks on edumux packet
processing") used the same return code convention of the ipv4 counterpart,
but ipv6 uses the opposite one: positive values means resubmit.

This change addresses the issue, using positive return value for
resubmitting. Also update the related comment, which was broken, too.

Fixes: eb63f2964dbe ("udp6: add missing checks on edumux packet processing")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Paolo Abeni
2018-11-04 21:52:49 +0800

29 Sep, 2018

1 commit

b13f721a3 udp6: add missing checks on edumux packet processing ... Browse Code »

[ Upstream commit eb63f2964dbe36f26deac77d3016791675821ded ]

Currently the UDPv6 early demux rx code path lacks some mandatory
checks, already implemented into the normal RX code path - namely
the checksum conversion and no_check6_rx check.

Similar to the previous commit, we move the common processing to
an UDPv6 specific helper and call it from both edemux code path
and normal code path. In respect to the UDPv4, we need to add an
explicit check for non zero csum according to no_check6_rx value.

Reported-by: Jianlin Shi
Suggested-by: Xin Long
Fixes: c9f2c1ae123a ("udp6: fix socket leak on early demux")
Fixes: 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Paolo Abeni
2018-09-29 18:06:01 +0800

26 Jun, 2018

1 commit

2e5d31688 udp: fix rx queue len reported by diag and proc interface ... Browse Code »

[ Upstream commit 6c206b20092a3623184cff9470dba75d21507874 ]

After commit 6b229cf77d68 ("udp: add batching to udp_rmem_release()")
the sk_rmem_alloc field does not measure exactly anymore the
receive queue length, because we batch the rmem release. The issue
is really apparent only after commit 0d4a6608f68c ("udp: do rmem bulk
free even if the rx sk queue is empty"): the user space can easily
check for an empty socket with not-0 queue length reported by the 'ss'
tool or the procfs interface.

We need to use a custom UDP helper to report the correct queue length,
taking into account the forward allocation deficit.

Reported-by: trevor.francis@46labs.com
Fixes: 6b229cf77d68 ("UDP: add batching to udp_rmem_release()")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Paolo Abeni
2018-06-26 08:06:28 +0800

19 May, 2018

1 commit

59afc1841 udp: fix SO_BINDTODEVICE ... Browse Code »

[ Upstream commit 69678bcd4d2dedbc3e8fcd6d7d99f283d83c531a ]

Damir reported a breakage of SO_BINDTODEVICE for UDP sockets.
In absence of VRF devices, after commit fb74c27735f0 ("net:
ipv4: add second dif to udp socket lookups") the dif mismatch
isn't fatal anymore for UDP socket lookup with non null
sk_bound_dev_if, breaking SO_BINDTODEVICE semantics.

This changeset addresses the issue making the dif match mandatory
again in the above scenario.

Reported-by: Damir Mansurov
Fixes: fb74c27735f0 ("net: ipv4: add second dif to udp socket lookups")
Fixes: 1801b570dd2a ("net: ipv6: add second dif to udp socket lookups")
Signed-off-by: Paolo Abeni
Acked-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Paolo Abeni
2018-05-19 16:20:27 +0800

19 Sep, 2017

1 commit

63ecc3d94 udpv6: Fix the checksum computation when HW checksum does not apply ... Browse Code »

While trying an ESP transport mode encryption for UDPv6 packets of
datagram size 1436 with MTU 1500, checksum error was observed in
the secondary fragment.

This error occurs due to the UDP payload checksum being missed out
when computing the full checksum for these packets in
udp6_hwcsum_outgoing().

Fixes: d39d938c8228 ("ipv6: Introduce udpv6_send_skb()")
Signed-off-by: Subash Abhinov Kasiviswanathan
Signed-off-by: David S. Miller

Subash Abhinov Kasiviswanathan
2017-09-19 02:43:03 +0800

07 Sep, 2017

1 commit

aae3dbb47 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:

1) Support ipv6 checksum offload in sunvnet driver, from Shannon
Nelson.

2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
Dumazet.

3) Allow generic XDP to work on virtual devices, from John Fastabend.

4) Add bpf device maps and XDP_REDIRECT, which can be used to build
arbitrary switching frameworks using XDP. From John Fastabend.

5) Remove UFO offloads from the tree, gave us little other than bugs.

6) Remove the IPSEC flow cache, from Florian Westphal.

7) Support ipv6 route offload in mlxsw driver.

8) Support VF representors in bnxt_en, from Sathya Perla.

9) Add support for forward error correction modes to ethtool, from
Vidya Sagar Ravipati.

10) Add time filter for packet scheduler action dumping, from Jamal Hadi
Salim.

11) Extend the zerocopy sendmsg() used by virtio and tap to regular
sockets via MSG_ZEROCOPY. From Willem de Bruijn.

12) Significantly rework value tracking in the BPF verifier, from Edward
Cree.

13) Add new jump instructions to eBPF, from Daniel Borkmann.

14) Rework rtnetlink plumbing so that operations can be run without
taking the RTNL semaphore. From Florian Westphal.

15) Support XDP in tap driver, from Jason Wang.

16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.

17) Add Huawei hinic ethernet driver.

18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
Delalande.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
i40e: avoid NVM acquire deadlock during NVM update
drivers: net: xgene: Remove return statement from void function
drivers: net: xgene: Configure tx/rx delay for ACPI
drivers: net: xgene: Read tx/rx delay for ACPI
rocker: fix kcalloc parameter order
rds: Fix non-atomic operation on shared flag variable
net: sched: don't use GFP_KERNEL under spin lock
vhost_net: correctly check tx avail during rx busy polling
net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
rxrpc: Make service connection lookup always check for retry
net: stmmac: Delete dead code for MDIO registration
gianfar: Fix Tx flow control deactivation
cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
cxgb4: Fix pause frame count in t4_get_port_stats
cxgb4: fix memory leak
tun: rename generic_xdp to skb_xdp
tun: reserve extra headroom only when XDP is set
net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
net: dsa: bcm_sf2: Advertise number of egress queues
...

Linus Torvalds
2017-09-07 05:45:08 +0800

04 Sep, 2017

1 commit

edc2988c5 Merge branch 'linus' into locking/core, to fix up conflicts ... Browse Code »

Conflicts:
mm/page_alloc.c

Signed-off-by: Ingo Molnar

Ingo Molnar
2017-09-04 17:01:18 +0800

02 Sep, 2017

1 commit

6026e043d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Three cases of simple overlapping changes.

Signed-off-by: David S. Miller

David S. Miller
2017-09-02 08:42:05 +0800

29 Aug, 2017

1 commit

a8e3bb347 net: Add comment that early_demux can change via sysctl ... Browse Code »

Twice patches trying to constify inet{6}_protocol have been reverted:
39294c3df2a8 ("Revert "ipv6: constify inet6_protocol structures"") to
revert 3a3a4e3054137 and then 03157937fe0b5 ("Revert "ipv4: make
net_protocol const"") to revert aa8db499ea67.

Add a comment that the structures can not be const because the
early_demux field can change based on a sysctl.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2017-08-29 06:17:29 +0800

26 Aug, 2017

1 commit

64f0f5d18 udp6: set rx_dst_cookie on rx_dst updates ... Browse Code »

Currently, in the udp6 code, the dst cookie is not initialized/updated
concurrently with the RX dst used by early demux.

As a result, the dst_check() in the early_demux path always fails,
the rx dst cache is always invalidated, and we can't really
leverage significant gain from the demux lookup.

Fix it adding udp6 specific variant of sk_rx_dst_set() and use it
to set the dst cookie when the dst entry is really changed.

The issue is there since the introduction of early demux for ipv6.

Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
Acked-by: Hannes Frederic Sowa
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2017-08-26 11:09:13 +0800

25 Aug, 2017

1 commit

10c9850cb Merge branch 'linus' into locking/core, to pick up fixes ... Browse Code »

Signed-off-by: Ingo Molnar

Ingo Molnar
2017-08-25 17:04:51 +0800

22 Aug, 2017

1 commit

e2a7c34fb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Browse Code »

David S. Miller
2017-08-22 08:06:42 +0800

19 Aug, 2017

1 commit

a0917e0bc datagram: When peeking datagrams with offset < 0 don't skip empty skbs ... Browse Code »

Due to commit e6afc8ace6dd5cef5e812f26c72579da8806f5ac ("udp: remove
headers from UDP packets before queueing"), when udp packets are being
peeked the requested extra offset is always 0 as there is no need to skip
the udp header. However, when the offset is 0 and the next skb is
of length 0, it is only returned once. The behaviour can be seen with
the following python script:

from socket import *;
f=socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0);
g=socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0);
f.bind(('::', 0));
addr=('::1', f.getsockname()[1]);
g.sendto(b'', addr)
g.sendto(b'b', addr)
print(f.recvfrom(10, MSG_PEEK));
print(f.recvfrom(10, MSG_PEEK));

Where the expected output should be the empty string twice.

Instead, make sk_peek_offset return negative values, and pass those values
to __skb_try_recv_datagram/__skb_try_recv_from_queue. If the passed offset
to __skb_try_recv_from_queue is negative, the checked skb is never skipped.
__skb_try_recv_from_queue will then ensure the offset is reset back to 0
if a peek is requested without an offset, unless no packets are found.

Also simplify the if condition in __skb_try_recv_from_queue. If _off is
greater then 0, and off is greater then or equal to skb->len, then
(_off || skb->len) must always be true assuming skb->len >= 0 is always
true.

Also remove a redundant check around a call to sk_peek_offset in af_unix.c,
as it double checked if MSG_PEEK was set in the flags.

V2:
- Moved the negative fixup into __skb_try_recv_from_queue, and remove now
redundant checks
- Fix peeking in udp{,v6}_recvmsg to report the right value when the
offset is 0

V3:
- Marked new branch in __skb_try_recv_from_queue as unlikely.

Signed-off-by: Matthew Dawson
Acked-by: Willem de Bruijn
Signed-off-by: David S. Miller

Matthew Dawson
2017-08-19 06:12:54 +0800

10 Aug, 2017

1 commit

7a34bcb8b jump_label: Do not use unserialized static_key_enabled() ... Browse Code »

Any use of key->enabled (that is static_key_enabled and static_key_count)
outside jump_label_lock should handle its own serialization. The only
two that are not doing so are the UDP encapsulation static keys. Change
them to use static_key_enable, which now correctly tests key->enabled under
the jump label lock.

Signed-off-by: Paolo Bonzini
Signed-off-by: Peter Zijlstra (Intel)
Cc: Eric Dumazet
Cc: Jason Baron
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1501601046-35683-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Ingo Molnar

Paolo Bonzini
2017-08-10 18:28:56 +0800

08 Aug, 2017

2 commits

4297a0ef0 net: ipv6: add second dif to inet6 socket lookups ... Browse Code »

Add a second device index, sdif, to inet6 socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.

TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the
ingress index is obtained from IPCB using inet_sdif and after tcp_v4_rcv
tcp_v4_sdif is used.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2017-08-08 02:39:22 +0800
1801b570d net: ipv6: add second dif to udp socket lookups ... Browse Code »

Add a second device index, sdif, to udp socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.

Early demux lookups are handled in the next patch as part of INET_MATCH
changes.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2017-08-08 02:39:22 +0800

01 Aug, 2017

1 commit

cb891fa6a udp6: fix jumbogram reception ... Browse Code »

Since commit 67a51780aebb ("ipv6: udp: leverage scratch area
helpers") udp6_recvmsg() read the skb len from the scratch area,
to avoid a cache miss.
But the UDP6 rx path support RFC 2675 UDPv6 jumbograms, and their
length exceeds the 16 bits available in the scratch area. As a side
effect the length returned by recvmsg() is:
% (1<len if
required, without a measurable overhead.

Fixes: 67a51780aebb ("ipv6: udp: leverage scratch area helpers")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2017-08-01 13:01:21 +0800

30 Jul, 2017

1 commit

c9f2c1ae1 udp6: fix socket leak on early demux ... Browse Code »

When an early demuxed packet reaches __udp6_lib_lookup_skb(), the
sk reference is retrieved and used, but the relevant reference
count is leaked and the socket destructor is never called.
Beyond leaking the sk memory, if there are pending UDP packets
in the receive queue, even the related accounted memory is leaked.

In the long run, this will cause persistent forward allocation errors
and no UDP skbs (both ipv4 and ipv6) will be able to reach the
user-space.

Fix this by explicitly accessing the early demux reference before
the lookup, and properly decreasing the socket reference count
after usage.

Also drop the skb_steal_sock() in __udp6_lib_lookup_skb(), and
the now obsoleted comment about "socket cache".

The newly added code is derived from the current ipv4 code for the
similar path.

v1 -> v2:
fixed the __udp6_lib_rcv() return code for resubmission,
as suggested by Eric

Reported-by: Sam Edwards
Reported-by: Marc Haber
Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Paolo Abeni
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Paolo Abeni
2017-07-30 05:19:03 +0800

01 Jul, 2017

2 commits

41c6d650f net: convert sock.sk_refcnt from atomic_t to refcount_t ... Browse Code »

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

This patch uses refcount_inc_not_zero() instead of
atomic_inc_not_zero_hint() due to absense of a _hint()
version of refcount API. If the hint() version must
be used, we might need to revisit API.

Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller

Reshetova, Elena
2017-07-01 22:39:08 +0800
b07911593 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

A set of overlapping changes in macvlan and the rocker
driver, nothing serious.

Signed-off-by: David S. Miller

David S. Miller
2017-07-01 00:43:08 +0800

28 Jun, 2017

1 commit

67a51780a ipv6: udp: leverage scratch area helpers ... Browse Code »

The commit b65ac44674dd ("udp: try to avoid 2 cache miss on dequeue")
leveraged the scratched area helpers for UDP v4 but I forgot to
update accordingly the IPv6 code path.

This change extends the scratch area usage to the IPv6 code, synching
the two implementations and giving some performance benefit.
IPv6 is again almost on the same level of IPv4, performance-wide.

Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2017-06-28 03:43:57 +0800

25 Jun, 2017

1 commit

85cb73ff9 net: ipv6: reset daddr and dport in sk if connect() fails ... Browse Code »

In __ip6_datagram_connect(), reset sk->sk_v6_daddr and inet->dport if
error occurs.
In udp_v6_early_demux(), check for sk_state to make sure it is in
TCP_ESTABLISHED state.
Together, it makes sure unconnected UDP socket won't be considered as a
valid candidate for early demux.

v3: add TCP_ESTABLISHED state check in udp_v6_early_demux()
v2: fix compilation error

Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Wei Wang
Acked-by: Maciej Żenczykowski
Signed-off-by: David S. Miller

Wei Wang
2017-06-25 23:46:56 +0800

23 Jun, 2017

1 commit

4b943faed udp/v6: prefetch rmem_alloc in udp6_queue_rcv_skb() ... Browse Code »

very similar to commit dd99e425be23 ("udp: prefetch
rmem_alloc in udp_queue_rcv_skb()"), this allows saving a cache
miss when the BH is bottle-neck for UDP over ipv6 packet
processing, e.g. for small packets when a single RX NIC ingress
queue is in use.

Performances under flood when multiple NIC RX queues used are
unaffected, but when a single NIC rx queue is in use, this
gives ~8% performance improvement.

Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2017-06-23 01:44:04 +0800

18 Jun, 2017

1 commit

d24406c85 udp: call dst_hold_safe() in udp_sk_rx_set_dst() ... Browse Code »

In udp_v4/6_early_demux() code, we try to hold dst->__refcnt for
dst with DST_NOCACHE flag. This is because later in udp_sk_rx_dst_set()
function, we will try to cache this dst in sk for connected case.
However, a better way to achieve this is to not try to hold dst in
early_demux(), but in udp_sk_rx_dst_set(), call dst_hold_safe(). This
approach is also more consistant with how tcp is handling it. And it
will make later changes simpler.

Signed-off-by: Wei Wang
Acked-by: Martin KaFai Lau
Signed-off-by: David S. Miller

Wei Wang
2017-06-18 10:53:59 +0800

19 May, 2017

1 commit

c6cd850d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Browse Code »

David S. Miller
2017-05-19 04:11:32 +0800

18 May, 2017

1 commit

a3f96c47c udp: make *udp*_queue_rcv_skb() functions static ... Browse Code »

Since the udp memory accounting refactor, we don't need any more
to export the *udp*_queue_rcv_skb(). Make them static and fix
a couple of sparse warnings:

net/ipv4/udp.c:1615:5: warning: symbol 'udp_queue_rcv_skb' was not
declared. Should it be static?
net/ipv6/udp.c:572:5: warning: symbol 'udpv6_queue_rcv_skb' was not
declared. Should it be static?

Fixes: 850cbaddb52d ("udp: use it's own memory accounting schema")
Fixes: c915fe13cbaa ("udplite: fix NULL pointer dereference")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2017-05-18 22:23:33 +0800

17 May, 2017

1 commit

2276f58ac udp: use a separate rx queue for packet reception ... Browse Code »

under udp flood the sk_receive_queue spinlock is heavily contended.
This patch try to reduce the contention on such lock adding a
second receive queue to the udp sockets; recvmsg() looks first
in such queue and, only if empty, tries to fetch the data from
sk_receive_queue. The latter is spliced into the newly added
queue every time the receive path has to acquire the
sk_receive_queue lock.

The accounting of forward allocated memory is still protected with
the sk_receive_queue lock, so udp_rmem_release() needs to acquire
both locks when the forward deficit is flushed.

On specific scenarios we can end up acquiring and releasing the
sk_receive_queue lock multiple times; that will be covered by
the next patch

Suggested-by: Eric Dumazet
Signed-off-by: Paolo Abeni
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Paolo Abeni
2017-05-17 03:41:29 +0800

21 Apr, 2017

1 commit

0bd84065b net: ipv6: Fix UDP early demux lookup with udp_l3mdev_accept=0 ... Browse Code »

David Ahern reported that 5425077d73e0c ("net: ipv6: Add early demux
handler for UDP unicast") breaks udp_l3mdev_accept=0 since early
demux for IPv6 UDP was doing a generic socket lookup which does not
require an exact match. Fix this by making UDPv6 early demux match
connected sockets only.

v1->v2: Take reference to socket after match as suggested by Eric
v2->v3: Add comment before break

Fixes: 5425077d73e0c ("net: ipv6: Add early demux handler for UDP unicast")
Reported-by: David Ahern
Signed-off-by: Subash Abhinov Kasiviswanathan
Cc: Eric Dumazet
Acked-by: David Ahern
Tested-by: David Ahern
Signed-off-by: David S. Miller

subashab@codeaurora.org
2017-04-21 03:50:27 +0800

25 Mar, 2017

1 commit

dddb64bcb net: Add sysctl to toggle early demux for tcp and udp ... Browse Code »

Certain system process significant unconnected UDP workload.
It would be preferrable to disable UDP early demux for those systems
and enable it for TCP only.

By disabling UDP demux, we see these slight gains on an ARM64 system-
782 -> 788Mbps unconnected single stream UDPv4
633 -> 654Mbps unconnected UDPv4 different sources

The performance impact can change based on CPU architecure and cache
sizes. There will not much difference seen if entire UDP hash table
is in cache.

Both sysctls are enabled by default to preserve existing behavior.

v1->v2: Change function pointer instead of adding conditional as
suggested by Stephen.

v2->v3: Read once in callers to avoid issues due to compiler
optimizations. Also update commit message with the tests.

v3->v4: Store and use read once result instead of querying pointer
again incorrectly.

v4->v5: Refactor to avoid errors due to compilation with IPV6={m,n}

Signed-off-by: Subash Abhinov Kasiviswanathan
Suggested-by: Eric Dumazet
Cc: Stephen Hemminger
Cc: Tom Herbert
Cc: David Miller
Signed-off-by: David S. Miller

subashab@codeaurora.org
2017-03-25 04:17:07 +0800

24 Mar, 2017

1 commit

16ae1f223 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Conflicts:
drivers/net/ethernet/broadcom/genet/bcmmii.c
drivers/net/hyperv/netvsc.c
kernel/bpf/hashtab.c

Almost entirely overlapping changes.

Signed-off-by: David S. Miller

David S. Miller
2017-03-24 07:41:27 +0800

23 Mar, 2017

1 commit

d515684d7 ipv6: make sure to initialize sockc.tsflags before first use ... Browse Code »

In the case udp_sk(sk)->pending is AF_INET6, udpv6_sendmsg() would
jump to do_append_data, skipping the initialization of sockc.tsflags.
Fix the problem by moving sockc.tsflags initialization earlier.

The bug was detected with KMSAN.

Fixes: c14ac9451c34 ("sock: enable timestamping using control messages")
Signed-off-by: Alexander Potapenko
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller

Alexander Potapenko
2017-03-23 03:40:22 +0800

13 Mar, 2017

1 commit

5425077d7 net: ipv6: Add early demux handler for UDP unicast ... Browse Code »

While running a single stream UDPv6 test, we observed that amount
of CPU spent in NET_RX softirq was much greater than UDPv4 for an
equivalent receive rate. The test here was run on an ARM64 based
Android system. On further analysis with perf, we found that UDPv6
was spending significant time in the statistics netfilter targets
which did socket lookup per packet. These statistics rules perform
a lookup when there is no socket associated with the skb. Since
there are multiple instances of these rules based on UID, there
will be equal number of lookups per skb.

By introducing early demux for UDPv6, we avoid the redundant lookups.
This also helped to improve the performance (800Mbps -> 870Mbps) on a
CPU limited system in a single stream UDPv6 receive test with 1450
byte sized datagrams using iperf.

v1->v2: Use IPv6 cookie to validate dst instead of 0 as suggested
by Eric

Signed-off-by: Subash Abhinov Kasiviswanathan
Signed-off-by: David S. Miller

subashab@codeaurora.org
2017-03-13 13:54:17 +0800

17 Feb, 2017

1 commit

3f64116a8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Browse Code »

David S. Miller
2017-02-17 08:34:01 +0800

15 Feb, 2017

1 commit

052d2369d ipv6: Handle IPv4-mapped src to in6addr_any dst. ... Browse Code »

This patch adds a check on the type of the source address for the case
where the destination address is in6addr_any. If the source is an
IPv4-mapped IPv6 source address, the destination is changed to
::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
is done in three locations to handle UDP calls to either connect() or
sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
handling an in6addr_any destination until very late, so the patch only
needs to handle the case where the source is an IPv4-mapped IPv6
address.

Signed-off-by: Jonathan T. Leighton
Signed-off-by: David S. Miller

Jonathan T. Leighton
2017-02-15 01:13:51 +0800

08 Feb, 2017

3 commits

3efa70d78 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

The conflict was an interaction between a bug fix in the
netvsc driver in 'net' and an optimization of the RX path
in 'net-next'.

Signed-off-by: David S. Miller

David S. Miller
2017-02-08 05:29:30 +0800
0dec879f6 net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP ... Browse Code »

When same struct dst_entry can be used for many different
neighbours we can not use it for pending confirmations.

The datagram protocols can use MSG_CONFIRM to confirm the
neighbour. When used with MSG_PROBE we do not reach the
code where neighbour is confirmed, so we have to do the
same slow lookup by using the dst_confirm_neigh() helper.
When MSG_PROBE is not used, ip_append_data/ip6_append_data
will set the skb flag dst_pending_confirm.

Reported-by: YueHaibing
Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
Signed-off-by: Julian Anastasov
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Julian Anastasov
2017-02-08 02:07:47 +0800
69629464e udp: properly cope with csum errors ... Browse Code »

Dmitry reported that UDP sockets being destroyed would trigger the
WARN_ON(atomic_read(&sk->sk_rmem_alloc)); in inet_sock_destruct()

It turns out we do not properly destroy skb(s) that have wrong UDP
checksum.

Thanks again to syzkaller team.

Fixes : 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue")
Reported-by: Dmitry Vyukov
Signed-off-by: Eric Dumazet
Cc: Paolo Abeni
Cc: Hannes Frederic Sowa
Acked-by: Paolo Abeni
Signed-off-by: David S. Miller

Eric Dumazet
2017-02-08 00:19:00 +0800

31 Jan, 2017

1 commit

63a6fff35 net: Avoid receiving packets with an l3mdev on unbound UDP sockets ... Browse Code »

Packets arriving in a VRF currently are delivered to UDP sockets that
aren't bound to any interface. TCP defaults to not delivering packets
arriving in a VRF to unbound sockets. IP route lookup and socket
transmit both assume that unbound means using the default table and
UDP applications that haven't been changed to be aware of VRFs may not
function correctly in this case since they may not be able to handle
overlapping IP address ranges, or be able to send packets back to the
original sender if required.

So add a sysctl, udp_l3mdev_accept, to control this behaviour with it
being analgous to the existing tcp_l3mdev_accept, namely to allow a
process to have a VRF-global listen socket. Have this default to off
as this is the behaviour that users will expect, given that there is
no explicit mechanism to set unmodified VRF-unaware application into a
default VRF.

Signed-off-by: Robert Shearman
Acked-by: David Ahern
Tested-by: David Ahern
Signed-off-by: David S. Miller

Robert Shearman
2017-01-31 04:00:58 +0800

19 Jan, 2017

1 commit

fe38d2a1c inet: collapse ipv4/v6 rcv_saddr_equal functions into one ... Browse Code »

We pass these per-protocol equal functions around in various places, but
we can just have one function that checks the sk->sk_family and then do
the right comparison function. I've also changed the ipv4 version to
not cast to inet_sock since it is unneeded.

Signed-off-by: Josef Bacik
Signed-off-by: David S. Miller

Josef Bacik
2017-01-19 02:04:28 +0800

25 Dec, 2016

1 commit

7c0f6ba68 Replace <asm/uaccess.h> with <linux/uaccess.h> globally ... Browse Code »

This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*'
sed -i -e "s!$PATT!#include !" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-12-25 03:46:01 +0800