Eric Lee / smarc-fsl-linux-kernel

18 Feb, 2017

9 commits

0d4c19ee6 tcp: don't annotate mark on control socket from tcp_v6_send_response()

commit 92e55f412cffd016cc245a74278cb4d7b89bb3bc upstream.

Unlike ipv4, this control socket is shared by all cpus so we cannot use
it as scratchpad area to annotate the mark that we pass to ip6_xmit().

Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket
family caches the flowi6 structure in the sctp_transport structure, so
we cannot use it to carry the mark unless we later reset it, which
I discarded since it looks ugly to me.

Fixes: bf99b4ded5f8 ("tcp: fix mark propagation with fwmark_reflect enabled")
Suggested-by: Eric Dumazet
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Pablo Neira
2017-02-18 22:11:44 +0800
7c4c32a29 tcp: fix mark propagation with fwmark_reflect enabled

commit bf99b4ded5f8a4767dbb9d180626f06c51f9881f upstream.

Otherwise, RST packets generated by the TCP stack for non-existing
sockets always have mark 0.
The mark from the original packet is assigned to the netns_ipv4/6
socket used to send the response so that it can get copied into the
response skb when the socket sends it.

Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies")
Cc: Lorenzo Colitti
Signed-off-by: Pau Espin Pedrol
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Pau Espin Pedrol
2017-02-18 22:11:43 +0800
16a3fbe52 igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()

[ Upstream commit 9c8bb163ae784be4f79ae504e78c862806087c54 ]

In function igmpv3/mld_add_delrec() we allocate pmc and put it in
idev->mc_tomb, so we should free it when we don't need it in del_delrec().
But I removed kfree(pmc) incorrectly in latest two patches. Now fix it.

Fixes: 24803f38a5c0 ("igmp: do not remove igmp souce list info when ...")
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when ...")
Reported-by: Daniel Borkmann
Signed-off-by: Hangbin Liu
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Hangbin Liu
2017-02-18 22:11:43 +0800
53a76d633 mld: do not remove mld souce list info when set link down

[ Upstream commit 1666d49e1d416fcc2cce708242a52fe3317ea8ba ]

This is an IPv6 version of commit 24803f38a5c0 ("igmp: do not remove igmp
souce list..."). In mld_del_delrec(), we restore all source filter
info instead of flushing it.

Move mld_clear_delrec() from ipv6_mc_down() to ipv6_mc_destroy_dev() since
we should not remove source list info when the link is set down. Remove
igmp6_group_dropped() in ipv6_mc_destroy_dev() since we have already called
it in ipv6_mc_down().

Also clear all source info after igmp6_group_dropped() instead of in it
because ipv6_mc_down() will call igmp6_group_dropped().

Signed-off-by: Hangbin Liu
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Hangbin Liu
2017-02-18 22:11:43 +0800
4cd036211 sit: fix a double free on error path

[ Upstream commit d7426c69a1942b2b9b709bf66b944ff09f561484 ]

Dmitry reported a double free in sit_init_net():

kernel BUG at mm/percpu.c:689!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 15692 Comm: syz-executor1 Not tainted 4.10.0-rc6-next-20170206 #1
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
task: ffff8801c9cc27c0 task.stack: ffff88017d1d8000
RIP: 0010:pcpu_free_area+0x68b/0x810 mm/percpu.c:689
RSP: 0018:ffff88017d1df488 EFLAGS: 00010046
RAX: 0000000000010000 RBX: 00000000000007c0 RCX: ffffc90002829000
RDX: 0000000000010000 RSI: ffffffff81940efb RDI: ffff8801db841d94
RBP: ffff88017d1df590 R08: dffffc0000000000 R09: 1ffffffff0bb3bdd
R10: dffffc0000000000 R11: 00000000000135dd R12: ffff8801db841d80
R13: 0000000000038e40 R14: 00000000000007c0 R15: 00000000000007c0
FS: 00007f6ea608f700(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002000aff8 CR3: 00000001c8d44000 CR4: 00000000001426f0
DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Call Trace:
free_percpu+0x212/0x520 mm/percpu.c:1264
ipip6_dev_free+0x43/0x60 net/ipv6/sit.c:1335
sit_init_net+0x3cb/0xa10 net/ipv6/sit.c:1831
ops_init+0x10a/0x530 net/core/net_namespace.c:115
setup_net+0x2ed/0x690 net/core/net_namespace.c:291
copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396
create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106
unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205
SYSC_unshare kernel/fork.c:2281 [inline]
SyS_unshare+0x64e/0xfc0 kernel/fork.c:2231
entry_SYSCALL_64_fastpath+0x1f/0xc2

This is because when tunnel->dst_cache init fails, we free dev->tstats
once in ipip6_tunnel_init() and twice in sit_init_net(). This looks
redundant but its ndo_uinit() does not seem enough to clean up everything
here. So avoid this by setting dev->tstats to NULL after the first free,
at least for -net.
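
The "set the pointer to NULL after the first free" idea can be sketched in
userspace C. This is only an illustration, not the kernel change itself:
struct fake_dev and fake_dev_free_stats() are hypothetical names, and plain
free() stands in for the per-cpu free routine, on the assumption that the
free routine ignores a NULL pointer the way free() does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device that owns a stats buffer. */
struct fake_dev {
	long *tstats;
};

/*
 * Free the stats buffer and clear the pointer. A second call then passes
 * NULL to free(), which is defined to do nothing, so no double free occurs.
 */
static void fake_dev_free_stats(struct fake_dev *dev)
{
	free(dev->tstats);
	dev->tstats = NULL;
}
```

With the pointer cleared, two cleanup paths can both call the helper without
having to know which of them ran first.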

Reported-by: Dmitry Vyukov
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

WANG Cong
2017-02-18 22:11:43 +0800
1e340bb22 ipv6: tcp: add a missing tcp_v6_restore_cb()

[ Upstream commit ebf6c9cb23d7e56eec8575a88071dec97ad5c6e2 ]

Dmitry reported use-after-free in ip6_datagram_recv_specific_ctl()

A similar bug was fixed in commit 8ce48623f0cf ("ipv6: tcp: restore
IP6CB for pktoptions skbs"), but I missed another spot.

tcp_v6_syn_recv_sock() can indeed set np->pktoptions from ireq->pktopts

Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet
Reported-by: Dmitry Vyukov
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-02-18 22:11:41 +0800
ae1768bbb ip6_gre: fix ip6gre_err() invalid reads

[ Upstream commit 7892032cfe67f4bde6fc2ee967e45a8fbaf33756 ]

Andrey Konovalov reported out of bound accesses in ip6gre_err()

If GRE flags contains GRE_KEY, the following expression
*(((__be32 *)p) + (grehlen / 4) - 1)

accesses data ~40 bytes after the expected point, since
grehlen includes the size of IPv6 headers.

Let's use a "struct gre_base_hdr *greh" pointer to make this
code more readable.

p[1] becomes greh->protocol.
grhlen is the GRE header length.
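
The size of the overshoot can be checked with a little arithmetic. The sketch
below is an illustration under stated assumptions, not the kernel code: the
function names are hypothetical, the key is assumed to be the last 32-bit
word of the GRE header (base header plus key, no sequence number), and the
IPv6 header is taken as its fixed 40 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define IPV6_HDR_LEN 40u	/* fixed IPv6 header size in bytes */

/* Buggy: the length also counts the IPv6 header, so the index overshoots. */
static size_t gre_key_offset_buggy(size_t gre_hdr_len)
{
	size_t grehlen = IPV6_HDR_LEN + gre_hdr_len;

	return (grehlen / 4 - 1) * 4;
}

/* Fixed: index from the GRE header length only. */
static size_t gre_key_offset(size_t gre_hdr_len)
{
	return (gre_hdr_len / 4 - 1) * 4;
}
```

For an 8-byte GRE header the key sits at byte offset 4, while the buggy
computation lands at offset 44, which is 40 bytes too far.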

Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Signed-off-by: Eric Dumazet
Reported-by: Andrey Konovalov
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-02-18 22:11:41 +0800
e6fbace87 ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()

[ Upstream commit 63117f09c768be05a0bf465911297dc76394f686 ]

Casting is a high precedence operation but "off" and "i" are in terms of
bytes so we need to have some parentheses here.
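
A small userspace sketch of the precedence problem follows. The struct and
function names are hypothetical; the point is only that a cast binds tighter
than "+", so a byte offset added after the cast is scaled by the size of the
pointed-to type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 2-byte option header. */
struct tlv {
	uint8_t type;
	uint8_t len;
};

/* Buggy: the cast applies first, so 'off' counts struct tlv elements. */
static const struct tlv *tlv_at_wrong(const uint8_t *base, size_t off)
{
	return (const struct tlv *)base + off;
}

/* Fixed: add the byte offset first, then cast the result. */
static const struct tlv *tlv_at(const uint8_t *base, size_t off)
{
	return (const struct tlv *)(base + off);
}
```

With a byte offset of 3, the fixed form points 3 bytes into the buffer and
the buggy form points 6 bytes in.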

Fixes: fbfa743a9d2a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
Signed-off-by: Dan Carpenter
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Dan Carpenter
2017-02-18 22:11:41 +0800
a7fe4e5d0 ipv6: fix ip6_tnl_parse_tlv_enc_lim()

[ Upstream commit fbfa743a9d2a0ffa24251764f10afc13eb21e739 ]

This function suffers from multiple issues.

First one is that pskb_may_pull() may reallocate skb->head,
so the 'raw' pointer needs either to be reloaded or not used at all.

Second issue is that NEXTHDR_DEST handling does not validate
that the options are present in skb->data, so we might read
garbage or access non existent memory.
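
The first issue has a close userspace analogue: a pointer taken before a
buffer is grown may dangle afterwards, while an offset stays valid. The
sketch below is an analogy only; struct buf, buf_grow() and buf_byte() are
hypothetical names, with realloc() playing the role of a call that may move
the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer standing in for skb->head. */
struct buf {
	unsigned char *head;
	size_t len;
};

/* Grow the buffer and zero the new tail; this may move 'head'. */
static int buf_grow(struct buf *b, size_t new_len)
{
	unsigned char *p;

	if (new_len <= b->len)
		return 0;
	p = realloc(b->head, new_len);
	if (!p)
		return -1;
	memset(p + b->len, 0, new_len - b->len);
	b->head = p;
	b->len = new_len;
	return 0;
}

/* Read by offset, recomputing the address from the current head. */
static unsigned char buf_byte(const struct buf *b, size_t off)
{
	return b->head[off];
}
```

Keeping an offset and reloading the base after every call that may
reallocate avoids holding a stale pointer.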

With help from Willem de Bruijn.

Signed-off-by: Eric Dumazet
Reported-by: Dmitry Vyukov
Cc: Willem de Bruijn
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-02-18 22:11:41 +0800

04 Feb, 2017

5 commits

89c258862 net: Specify the owning module for lwtunnel ops

[ Upstream commit 88ff7334f25909802140e690c0e16433e485b0a0 ]

Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so specify the owning
module for all lwtunnel ops.

Signed-off-by: Robert Shearman
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Robert Shearman
2017-02-04 16:47:11 +0800
79453ab88 ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lock

[ Upstream commit 03e4deff4987f79c34112c5ba4eb195d4f9382b0 ]

Just like commit 4acd4945cd1e ("ipv6: addrconf: Avoid calling
netdevice notifiers with RCU read-side lock"), it is unnecessary
to make addrconf_disable_change() use RCU iteration over the
netdev list, since it already holds the RTNL lock, or we may meet
Illegal context switch in RCU read-side critical section.

Signed-off-by: Kefeng Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Kefeng Wang
2017-02-04 16:47:10 +0800
e9db042dc lwtunnel: fix autoload of lwt modules

[ Upstream commit 9ed59592e3e379b2e9557dc1d9e9ec8fcbb33f16 ]

Trying to add an mpls encap route when the MPLS modules are not loaded
hangs. For example:

CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
CONFIG_MPLS_ROUTING=m
CONFIG_MPLS_IPTUNNEL=m

$ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2

The ip command hangs:
root 880 826 0 21:25 pts/0 00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2

$ cat /proc/880/stack
[] call_usermodehelper_exec+0xd6/0x134
[] __request_module+0x27b/0x30a
[] lwtunnel_build_state+0xe4/0x178
[] fib_create_info+0x47f/0xdd4
[] fib_table_insert+0x90/0x41f
[] inet_rtm_newroute+0x4b/0x52
...

modprobe is trying to load rtnl-lwt-MPLS:

root 881 5 0 21:25 ? 00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS

and it hangs after loading mpls_router:

$ cat /proc/881/stack
[] rtnl_lock+0x12/0x14
[] register_netdevice_notifier+0x16/0x179
[] mpls_init+0x25/0x1000 [mpls_router]
[] do_one_initcall+0x8e/0x13f
[] do_init_module+0x5a/0x1e5
[] load_module+0x13bd/0x17d6
...

The problem is that lwtunnel_build_state is called with rtnl lock
held preventing mpls_init from registering.

Given the potential references held by the time lwtunnel_build_state is
called, it cannot drop the rtnl lock to load the module. So, extract the module
loading code from lwtunnel_build_state into a new function to validate
the encap type. The new function is called while converting the user
request into a fib_config which is well before any table, device or
fib entries are examined.

Fixes: 745041e2aaf1 ("lwtunnel: autoload of lwt modules")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

David Ahern
2017-02-04 16:47:10 +0800
c7a5df92d ip6_tunnel: Account for tunnel header in tunnel MTU

[ Upstream commit 02ca0423fd65a0a9c4d70da0dbb8f4b8503f08c7 ]

With ip6gre we have a tunnel header which also makes the tunnel MTU
smaller. We need to reserve room for it. Previously we were using up
space reserved for the Tunnel Encapsulation Limit option
header (RFC 2473).

Also, after commit b05229f44228 ("gre6: Cleanup GREv6 transmit path,
call common GRE functions") our contract with the caller has
changed. Now we check if the packet length exceeds the tunnel MTU after
the tunnel header has been pushed, unlike before.

This is reflected in the check where we look at the packet length minus
the size of the tunnel header, which is already accounted for in tunnel
MTU.
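
The accounting described above can be sketched as plain arithmetic. This is
a hedged illustration, not the kernel code: the function and parameter names
are hypothetical, and the example numbers (a 1500-byte link MTU, a 40-byte
outer IPv6 header, a 4-byte tunnel header) are assumptions chosen only to
make the check concrete.

```c
#include <assert.h>
#include <stdbool.h>

/* Tunnel MTU: link MTU minus the outer header and the tunnel header. */
static int tunnel_mtu(int link_mtu, int outer_hdr_len, int tun_hlen)
{
	return link_mtu - outer_hdr_len - tun_hlen;
}

/*
 * The packet length is measured after the tunnel header was pushed, so
 * subtract that header again before comparing against the tunnel MTU,
 * which has already reserved room for it.
 */
static bool exceeds_tunnel_mtu(int len_with_tun_hdr, int tun_hlen, int mtu)
{
	return len_with_tun_hdr - tun_hlen > mtu;
}
```

With these numbers the tunnel MTU is 1456: a 1456-byte inner packet (1460
bytes with the tunnel header) passes, and a 1457-byte one does not.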

Fixes: b05229f44228 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
Signed-off-by: Jakub Sitnicki
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jakub Sitnicki
2017-02-04 16:47:09 +0800
6980c52c4 net: lwtunnel: Handle lwtunnel_fill_encap failure

[ Upstream commit ea7a80858f57d8878b1499ea0f1b8a635cc48de7 ]

Handle failure in lwtunnel_fill_encap adding attributes to skb.

Fixes: 571e722676fe ("ipv4: support for fib route lwtunnel encap attributes")
Fixes: 19e42e451506 ("ipv6: support for fib route lwtunnel encap attributes")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

David Ahern
2017-02-04 16:47:08 +0800

15 Jan, 2017

3 commits

17a561b19 gro: Disable frag0 optimization on IPv6 ext headers

[ Upstream commit 57ea52a865144aedbcd619ee0081155e658b6f7d ]

The GRO fast path caches the frag0 address. This address becomes
invalid if frag0 is modified by pskb_may_pull or its variants.
So whenever that happens we must disable the frag0 optimization.

This is usually done through the combination of gro_header_hard
and gro_header_slow, however, the IPv6 extension header path did
the pulling directly and would continue to use the GRO fast path
incorrectly.

This patch fixes it by disabling the fast path when we enter the
IPv6 extension header path.

Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address")
Reported-by: Slava Shwartsman
Signed-off-by: Herbert Xu
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Herbert Xu
2017-01-15 20:42:55 +0800
ee99e2bc5 ipv6: handle -EFAULT from skb_copy_bits

[ Upstream commit a98f91758995cb59611e61318dddd8a6956b52c3 ]

By setting certain socket options on ipv6 raw sockets, we can confuse the
length calculation in rawv6_push_pending_frames triggering a BUG_ON.

RIP: 0010:[] [] rawv6_sendmsg+0xc30/0xc40
RSP: 0018:ffff881f6c4a7c18 EFLAGS: 00010282
RAX: 00000000fffffff2 RBX: ffff881f6c681680 RCX: 0000000000000002
RDX: ffff881f6c4a7cf8 RSI: 0000000000000030 RDI: ffff881fed0f6a00
RBP: ffff881f6c4a7da8 R08: 0000000000000000 R09: 0000000000000009
R10: ffff881fed0f6a00 R11: 0000000000000009 R12: 0000000000000030
R13: ffff881fed0f6a00 R14: ffff881fee39ba00 R15: ffff881fefa93a80

Call Trace:
[] ? unmap_page_range+0x693/0x830
[] inet_sendmsg+0x67/0xa0
[] sock_sendmsg+0x38/0x50
[] SYSC_sendto+0xef/0x170
[] SyS_sendto+0xe/0x10
[] do_syscall_64+0x50/0xa0
[] entry_SYSCALL64_slow_path+0x25/0x25

Handle by jumping to the failure path if skb_copy_bits gets an EFAULT.

Reproducer:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define LEN 504

int main(int argc, char* argv[])
{
	int fd;
	int zero = 0;
	char buf[LEN];

	memset(buf, 0, LEN);

	fd = socket(AF_INET6, SOCK_RAW, 7);

	setsockopt(fd, SOL_IPV6, IPV6_CHECKSUM, &zero, 4);
	setsockopt(fd, SOL_IPV6, IPV6_DSTOPTS, &buf, LEN);

	sendto(fd, buf, 1, 0, (struct sockaddr *) buf, 110);
}

Signed-off-by: Dave Jones
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Dave Jones
2017-01-15 20:42:53 +0800
d36a1cb1e inet: fix IP(V6)_RECVORIGDSTADDR for udp sockets

[ Upstream commit 39b2dd765e0711e1efd1d1df089473a8dd93ad48 ]

Socket cmsg IP(V6)_RECVORIGDSTADDR checks that port range lies within
the packet. For sockets that have transport headers pulled, transport
offset can be negative. Use signed comparison to avoid overflow.
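
The effect of the comparison's signedness can be shown in a few lines of
userspace C. The helper names are hypothetical and the check is a simplified
sketch: it asks whether 4 bytes of port data starting at the transport
offset end within a packet of the given length.

```c
#include <assert.h>
#include <stdbool.h>

/* Buggy: as unsigned, a negative offset becomes a huge value and fails. */
static bool ports_fit_unsigned(int transport_off, unsigned int len)
{
	return !((unsigned int)(transport_off + 4) > len);
}

/* Fixed: compare as signed, so a negative offset is handled correctly. */
static bool ports_fit_signed(int transport_off, unsigned int len)
{
	return !(transport_off + 4 > (int)len);
}
```

With a transport offset of -8 and a length of 100, the unsigned check
rejects the packet while the signed check accepts it; both reject an offset
of 98.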

Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Reported-by: Nisar Jagabar
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Willem de Bruijn
2017-01-15 20:42:52 +0800

03 Dec, 2016

3 commits

6b6ebb6b0 ip6_offload: check segs for NULL in ipv6_gso_segment.

segs needs to be checked for being NULL in ipv6_gso_segment() before calling
skb_shinfo(segs), otherwise kernel can run into a NULL-pointer dereference:

[ 97.811262] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
[ 97.819112] IP: [] ipv6_gso_segment+0x119/0x2f0
[ 97.825214] PGD 0 [ 97.827047]
[ 97.828540] Oops: 0000 [#1] SMP
[ 97.831678] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 rpcsec_gss_krb5
nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
bridge stp llc snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel
snd_hda_codec edac_mce_amd snd_hda_core edac_core snd_hwdep kvm_amd snd_seq kvm snd_seq_device
snd_pcm irqbypass snd_timer ppdev parport_serial snd parport_pc k10temp pcspkr soundcore parport
sp5100_tco shpchp sg wmi i2c_piix4 acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc
ip_tables xfs libcrc32c sr_mod cdrom sd_mod ata_generic pata_acpi amdkfd amd_iommu_v2 radeon
broadcom bcm_phy_lib i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
ttm ahci serio_raw tg3 firewire_ohci libahci pata_atiixp drm ptp libata firewire_core pps_core
i2c_core crc_itu_t fjes dm_mirror dm_region_hash dm_log dm_mod
[ 97.927721] CPU: 1 PID: 3504 Comm: vhost-3495 Not tainted 4.9.0-7.el7.test.x86_64 #1
[ 97.935457] Hardware name: AMD Snook/Snook, BIOS ESK0726A 07/26/2010
[ 97.941806] task: ffff880129a1c080 task.stack: ffffc90001bcc000
[ 97.947720] RIP: 0010:[] [] ipv6_gso_segment+0x119/0x2f0
[ 97.956251] RSP: 0018:ffff88012fc43a10 EFLAGS: 00010207
[ 97.961557] RAX: 0000000000000000 RBX: ffff8801292c8700 RCX: 0000000000000594
[ 97.968687] RDX: 0000000000000593 RSI: ffff880129a846c0 RDI: 0000000000240000
[ 97.975814] RBP: ffff88012fc43a68 R08: ffff880129a8404e R09: 0000000000000000
[ 97.982942] R10: 0000000000000000 R11: ffff880129a84076 R12: 00000020002949b3
[ 97.990070] R13: ffff88012a580000 R14: 0000000000000000 R15: ffff88012a580000
[ 97.997198] FS: 0000000000000000(0000) GS:ffff88012fc40000(0000) knlGS:0000000000000000
[ 98.005280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 98.011021] CR2: 00000000000000cc CR3: 0000000126c5d000 CR4: 00000000000006e0
[ 98.018149] Stack:
[ 98.020157] 00000000ffffffff ffff88012fc43ac8 ffffffffa017ad0a 000000000000000e
[ 98.027584] 0000001300000000 0000000077d59998 ffff8801292c8700 00000020002949b3
[ 98.035010] ffff88012a580000 0000000000000000 ffff88012a580000 ffff88012fc43a98
[ 98.042437] Call Trace:
[ 98.044879] [ 98.046803] [] ? tg3_start_xmit+0x84a/0xd60 [tg3]
[ 98.053156] [] skb_mac_gso_segment+0xb0/0x130
[ 98.059158] [] __skb_gso_segment+0x73/0x110
[ 98.064985] [] validate_xmit_skb+0x12d/0x2b0
[ 98.070899] [] validate_xmit_skb_list+0x42/0x70
[ 98.077073] [] sch_direct_xmit+0xd0/0x1b0
[ 98.082726] [] __dev_queue_xmit+0x486/0x690
[ 98.088554] [] ? cpumask_next_and+0x35/0x50
[ 98.094380] [] dev_queue_xmit+0x10/0x20
[ 98.099863] [] br_dev_queue_push_xmit+0xa7/0x170 [bridge]
[ 98.106907] [] br_forward_finish+0x41/0xc0 [bridge]
[ 98.113430] [] ? nf_iterate+0x52/0x60
[ 98.118735] [] ? nf_hook_slow+0x6b/0xc0
[ 98.124216] [] __br_forward+0x14c/0x1e0 [bridge]
[ 98.130480] [] ? br_dev_queue_push_xmit+0x170/0x170 [bridge]
[ 98.137785] [] br_forward+0x9d/0xb0 [bridge]
[ 98.143701] [] br_handle_frame_finish+0x267/0x560 [bridge]
[ 98.150834] [] br_handle_frame+0x174/0x2f0 [bridge]
[ 98.157355] [] ? sched_clock+0x9/0x10
[ 98.162662] [] ? sched_clock_cpu+0x72/0xa0
[ 98.168403] [] __netif_receive_skb_core+0x1e5/0xa20
[ 98.174926] [] ? timerqueue_add+0x59/0xb0
[ 98.180580] [] __netif_receive_skb+0x18/0x60
[ 98.186494] [] process_backlog+0x95/0x140
[ 98.192145] [] net_rx_action+0x16d/0x380
[ 98.197713] [] __do_softirq+0xd1/0x283
[ 98.203106] [] do_softirq_own_stack+0x1c/0x30
[ 98.209107] [ 98.211029] [] do_softirq+0x50/0x60
[ 98.216166] [] netif_rx_ni+0x33/0x80
[ 98.221386] [] tun_get_user+0x487/0x7f0 [tun]
[ 98.227388] [] tun_sendmsg+0x4b/0x60 [tun]
[ 98.233129] [] handle_tx+0x282/0x540 [vhost_net]
[ 98.239392] [] handle_tx_kick+0x15/0x20 [vhost_net]
[ 98.245916] [] vhost_worker+0x9e/0xf0 [vhost]
[ 98.251919] [] ? vhost_umem_alloc+0x40/0x40 [vhost]
[ 98.258440] [] ? do_syscall_64+0x67/0x180
[ 98.264094] [] kthread+0xd9/0xf0
[ 98.268965] [] ? kthread_park+0x60/0x60
[ 98.274444] [] ret_from_fork+0x25/0x30
[ 98.279836] Code: 8b 93 d8 00 00 00 48 2b 93 d0 00 00 00 4c 89 e6 48 89 df 66 89 93 c2 00 00 00 ff 10 48 3d 00 f0 ff ff 49 89 c2 0f 87 52 01 00 00 8b 92 cc 00 00 00 48 8b 80 d0 00 00 00 44 0f b7 74 10 06 66
[ 98.299425] RIP [] ipv6_gso_segment+0x119/0x2f0
[ 98.305612] RSP
[ 98.309094] CR2: 00000000000000cc
[ 98.312406] ---[ end trace 726a2c7a2d2d78d0 ]---

Signed-off-by: Artem Savkov
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Artem Savkov
2016-12-03 02:34:58 +0800
80d1106ae Revert: "ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"

This reverts commit ae148b085876fa771d9ef2c05f85d4b4bf09ce0d
("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()").

skb->protocol is now set in __ip_local_out() and __ip6_local_out() before
dst_output() is called. It is no longer necessary to do it for each tunnel.

Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper
Signed-off-by: David S. Miller

Eli Cooper
2016-12-03 01:34:22 +0800
b4e479a96 ipv6: Set skb->protocol properly for local output

When xfrm is applied to TSO/GSO packets, it follows this path:

xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()

where skb_gso_segment() relies on skb->protocol to function properly.

This patch sets skb->protocol to ETH_P_IPV6 before dst_output() is called,
fixing a bug where GSO packets sent through an ipip6 tunnel are dropped
when xfrm is involved.

Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper
Signed-off-by: David S. Miller

Eli Cooper
2016-12-03 01:34:22 +0800

02 Dec, 2016

2 commits

7bbf91ce2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2016-12-01

1) Change the error value when someone tries to run 32bit
userspace on a 64bit host from -ENOTSUPP to the userspace
exported -EOPNOTSUPP. Fix from Yi Zhao.

2) On inbound, ESN sequence numbers are already in network
byte order. So don't try to convert them again; this fixes
integrity verification for ESN. Fixes from Tobias Brunner.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller

David S. Miller
2016-12-02 00:35:49 +0800
3d2dd617f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

This is a large batch of Netfilter fixes for net, they are:

1) Three patches to fix NAT conversion to rhashtable: Switch to rhlist
structure that allows several objects with the same key.
Moreover, fix wrong comparison logic in nf_nat_bysource_cmp() as this is
expected to return a value similar to memcmp(). Change location of
the nat_bysource field in the nf_conn structure to avoid zeroing
this as it breaks interaction with SLAB_DESTROY_BY_RCU and leads
to crashes. From Florian Westphal.

2) Don't allow malformed fragments to go through in IPv6; drop them,
otherwise we hit a GPF. Patch from Florian Westphal.

3) Fix crash if attributes are missing in nft_range, from Liping Zhang.

4) Fix arptables 32-bits userspace 64-bits kernel compat, from Hongxu Jia.

5) Two patches from David Ahern to fix netfilter interaction with vrf.

6) Fix element timeout calculation in nf_tables, we take milliseconds
from userspace, but we use jiffies from kernelspace. Patch from
Anders K. Pedersen.

7) Missing validation length netlink attribute for nft_hash, from
Laura Garcia.

8) Fix nf_conntrack_helper documentation, we don't default to off
anymore for a bit of time so let's get this in sync with the code.

I know it is late but I think these are important, specifically the NAT
bits, as they are mostly addressing fallout from recent changes. I also
read there are chances to have -rc8; if that is the case, that would
also give us a bit more time to test this.
====================

Signed-off-by: David S. Miller

David S. Miller
2016-12-02 00:04:41 +0800

01 Dec, 2016

1 commit

0382a25af l2tp: lock socket before checking flags in connect()

Socket flags aren't updated atomically, so the socket must be locked
while reading the SOCK_ZAPPED flag.

This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch
also brings error handling for __ip6_datagram_connect() failures.

Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller

Guillaume Nault
2016-12-01 03:14:07 +0800

30 Nov, 2016

2 commits

a55e23864 esp6: Fix integrity verification when ESN are used

When handling inbound packets, the two halves of the sequence number
stored on the skb are already in network order.
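
Why a second conversion is harmful can be seen with htonl() in userspace.
This is a minimal sketch with hypothetical helper names, not the esp6 code:
it only shows that converting a value that is already in network order
yields the host-order value again, which on a little-endian host differs
from the network-order one.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Correct: the value is already in network order, so use it as-is. */
static uint32_t seq_hi_for_icv(uint32_t seq_hi_be)
{
	return seq_hi_be;
}

/* Buggy: converting again turns the value back into host order. */
static uint32_t seq_hi_for_icv_buggy(uint32_t seq_hi_be)
{
	return htonl(seq_hi_be);
}
```

On a big-endian host both helpers return the same value, which is one reason
a bug of this kind can go unnoticed on some platforms.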

Fixes: 000ae7b2690e ("esp6: Switch to new AEAD interface")
Signed-off-by: Tobias Brunner
Acked-by: Herbert Xu
Signed-off-by: Steffen Klassert

Tobias Brunner
2016-11-30 18:10:16 +0800
9b57da063 netfilter: ipv6: nf_defrag: drop mangled skb on ream error

Dmitry Vyukov reported GPF in network stack that Andrey traced down to
negative nh offset in nf_ct_frag6_queue().

Problem is that all network headers before fragment header are pulled.
Normal ipv6 reassembly will drop the skb when errors occur further down
the line.

netfilter doesn't do this, and instead passed the original fragment
along. That was also fine back when netfilter ipv6 defrag worked with
cloned fragments, as the original, pristine fragment was passed on.

So we either have to undo the pull op, or discard such fragments.
Since they're malformed after all (e.g. overlapping fragment) it seems
preferable to just drop them.

Same for temporary errors -- it doesn't make sense to accept (and
perhaps forward!) only some fragments of same datagram.

Fixes: 029f7f3b8701cc7ac ("netfilter: ipv6: nf_defrag: avoid/free clone operations")
Reported-by: Dmitry Vyukov
Debugged-by: Andrey Konovalov
Diagnosed-by: Eric Dumazet
Signed-off-by: Florian Westphal
Acked-by: Eric Dumazet
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2016-11-30 03:23:58 +0800

29 Nov, 2016

1 commit

79dc7e3f1 net: handle no dst on skb in icmp6_send

Andrey reported the following while fuzzing the kernel with syzkaller:

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 3859 Comm: a.out Not tainted 4.9.0-rc6+ #429
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff8800666d4200 task.stack: ffff880067348000
RIP: 0010:[] []
icmp6_send+0x5fc/0x1e30 net/ipv6/icmp.c:451
RSP: 0018:ffff88006734f2c0 EFLAGS: 00010206
RAX: ffff8800666d4200 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
RBP: ffff88006734f630 R08: ffff880064138418 R09: 0000000000000003
R10: dffffc0000000000 R11: 0000000000000005 R12: 0000000000000000
R13: ffffffff84e7e200 R14: ffff880064138484 R15: ffff8800641383c0
FS: 00007fb3887a07c0(0000) GS:ffff88006cc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000006b040000 CR4: 00000000000006f0
Stack:
ffff8800666d4200 ffff8800666d49f8 ffff8800666d4200 ffffffff84c02460
ffff8800666d4a1a 1ffff1000ccdaa2f ffff88006734f498 0000000000000046
ffff88006734f440 ffffffff832f4269 ffff880064ba7456 0000000000000000
Call Trace:
[] icmpv6_param_prob+0x2c/0x40 net/ipv6/icmp.c:557
[< inline >] ip6_tlvopt_unknown net/ipv6/exthdrs.c:88
[] ip6_parse_tlv+0x555/0x670 net/ipv6/exthdrs.c:157
[] ipv6_parse_hopopts+0x199/0x460 net/ipv6/exthdrs.c:663
[] ipv6_rcv+0xfa3/0x1dc0 net/ipv6/ip6_input.c:191
...

icmp6_send / icmpv6_send is invoked for both rx and tx paths. In both
cases the dst->dev should be preferred for determining the L3 domain
if the dst has been set on the skb. Fallback to the skb->dev if it has
not. This covers the case reported here where icmp6_send is invoked on
Rx before the route lookup.
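
The "prefer dst->dev, fall back to skb->dev" rule can be sketched with
simplified stand-in types. Everything here is hypothetical (the fake_*
structs and l3_dev_for_skb() are not kernel names); the sketch only shows
the selection logic and the NULL check that avoids dereferencing a missing
dst.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for kernel structures. */
struct fake_dev { int ifindex; };
struct fake_dst { struct fake_dev *dev; };
struct fake_skb {
	struct fake_dst *dst;	/* NULL before the route lookup */
	struct fake_dev *dev;	/* device the packet arrived on */
};

/* Prefer the device of the attached dst; otherwise use the skb's device. */
static struct fake_dev *l3_dev_for_skb(const struct fake_skb *skb)
{
	if (skb->dst && skb->dst->dev)
		return skb->dst->dev;
	return skb->dev;
}
```

An skb without a dst, as on the receive path before routing, then resolves
to its ingress device instead of faulting.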

Fixes: 5d41ce29e ("net: icmp6_send should use dst dev to determine L3 domain")
Reported-by: Andrey Konovalov
Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2016-11-29 05:13:01 +0800

28 Nov, 2016

1 commit

8eb4adf60 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2016-11-25

1) Fix a refcount leak in vti6.
From Nicolas Dichtel.

2) Fix a wrong if statement in xfrm_sk_policy_lookup.
From Florian Westphal.

3) The flowcache watermarks are per cpu. Take this into
account when comparing to the threshold where we
refuse new allocations. From Miroslav Urbanek.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller

David S. Miller
2016-11-28 09:21:48 +0800

25 Nov, 2016

2 commits

30c7be26f udplite: call proper backlog handlers

In commits 93821778def10 ("udp: Fix rcv socket locking") and
f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into
__udpv6_queue_rcv_skb") UDP backlog handlers were renamed, but UDPlite
was forgotten.

This leads to crashes if UDPlite header is pulled twice, which happens
starting from commit e6afc8ace6dd ("udp: remove headers from UDP packets
before queueing")

Bug found by syzkaller team, thanks a lot guys !

Note that backlog use in UDP/UDPlite is scheduled to be removed starting
from linux-4.10, so this patch is only needed up to linux-4.9

Fixes: 93821778def1 ("udp: Fix rcv socket locking")
Fixes: f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb")
Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet
Reported-by: Andrey Konovalov
Cc: Benjamin LaHaise
Cc: Herbert Xu
Signed-off-by: David S. Miller

Eric Dumazet
2016-11-25 04:32:14 +0800
764d3be6e ipv6: bump genid when the IFA_F_TENTATIVE flag is clear

When an ipv6 address has the tentative flag set, it can't be
used as source for egress traffic, while the associated route,
if any, can be looked up and even stored into some dst_cache.

In the latter scenario, the source ipv6 address selected and
stored in the cache is most probably wrong (e.g. with
link-local scope) and the entity using the dst_cache will
experience lack of ipv6 connectivity until said cache is
cleared or invalidated.

Overall this may cause lack of connectivity over most IPv6 tunnels
(comprising geneve and vxlan), if the first egress packet reaches
the tunnel before the DaD is completed for the used ipv6
address.

This patch bumps the genid after the IFA_F_TENTATIVE flag
is cleared, so that dst_cache will be invalidated on
next lookup and ipv6 connectivity restored.

Fixes: 0c1d70af924b ("net: use dst_cache for vxlan device")
Fixes: 468dfffcd762 ("geneve: add dst caching support")
Acked-by: Hannes Frederic Sowa
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2016-11-25 01:04:10 +0800

24 Nov, 2016

1 commit

00b4422fe netfilter: Update nf_send_reset6 to consider L3 domain

nf_send_reset6 is not considering the L3 domain and lookups are sent
to the wrong table. For example consider the following output rule:

ip6tables -A OUTPUT -p tcp --dport 12345 -j REJECT --reject-with tcp-reset

using perf to analyze lookups via the fib6_table_lookup tracepoint shows:

swapper 0 [001] 248.787816: fib6:fib6_table_lookup: table 255 oif 0 iif 1 src 2100:1::3 dst 2100:1:
ffffffff81439cdc perf_trace_fib6_table_lookup ([kernel.kallsyms])
ffffffff814c1ce3 trace_fib6_table_lookup ([kernel.kallsyms])
ffffffff814c3e89 ip6_pol_route ([kernel.kallsyms])
ffffffff814c40d5 ip6_pol_route_output ([kernel.kallsyms])
ffffffff814e7b6f fib6_rule_action ([kernel.kallsyms])
ffffffff81437f60 fib_rules_lookup ([kernel.kallsyms])
ffffffff814e7c79 fib6_rule_lookup ([kernel.kallsyms])
ffffffff814c2541 ip6_route_output_flags ([kernel.kallsyms])
528 nf_send_reset6 ([nf_reject_ipv6])

The lookup is directed to table 255 rather than the table associated with
the device via the L3 domain. Update nf_send_reset6 to pull the L3 domain
from the dst currently attached to the skb.

Signed-off-by: David Ahern
Signed-off-by: Pablo Neira Ayuso

David Ahern
2016-11-24 19:47:08 +0800

18 Nov, 2016

1 commit

b5c2d4954 ip6_tunnel: disable caching when the traffic class is inherited

If an ip6 tunnel is configured to inherit the traffic class from
the inner header, the dst_cache must be disabled or it will foul
the policy routing.
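
A small userspace sketch shows why a single cached route is wrong once the traffic class varies per packet. The flag TOY_TNL_INHERIT_TCLASS, struct toy_tunnel and the toy policy_route() rule are assumptions made for illustration and do not mirror the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag: the tunnel copies the traffic class of the inner header. */
#define TOY_TNL_INHERIT_TCLASS 0x1u

struct toy_tunnel {
    unsigned int flags;
    int cached_route;       /* 0 means nothing is cached */
};

/* Toy policy routing: traffic class 0x10 uses route 2, everything else route 1. */
int policy_route(unsigned char tclass)
{
    return tclass == 0x10 ? 2 : 1;
}

int tunnel_pick_route(struct toy_tunnel *t, unsigned char inner_tclass)
{
    bool inherit = (t->flags & TOY_TNL_INHERIT_TCLASS) != 0;
    bool use_cache = !inherit;  /* the lookup key changes per packet when inheriting */
    unsigned char tclass = inherit ? inner_tclass : 0;

    if (use_cache && t->cached_route != 0)
        return t->cached_route;

    int route = policy_route(tclass);
    if (use_cache)
        t->cached_route = route;
    return route;
}
```

If the cache were consulted for an inheriting tunnel, the route chosen for the first packet would be reused for later packets with a different traffic class, bypassing the policy rule.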

The issue has apparently been there since at least Linux-2.6.12-rc2.

Reported-by: Liam McBirnie
Cc: Liam McBirnie
Acked-by: Hannes Frederic Sowa
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2016-11-18 01:08:56 +0800

16 Nov, 2016

1 commit

73e2d5e34 udp: restore UDPlite many-cast delivery

Honor udptable parameter that is passed to __udp*_lib_mcast_deliver(),
otherwise udplite broadcast/multicast use the wrong table and it breaks.
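
The effect of honoring the table argument can be shown with a toy model in which each protocol keeps its own socket table. struct toy_table and mcast_deliver() are hypothetical and far simpler than the kernel's hashed UDP tables.

```c
#include <stddef.h>

#define TOY_MAX_SOCKS 4

struct toy_table {
    int ports[TOY_MAX_SOCKS];   /* bound ports, 0 marks an empty slot */
};

/*
 * Count the sockets in the table passed by the caller that are bound to
 * the given port. The delivery routine must search this table and not a
 * fixed global one.
 */
int mcast_deliver(const struct toy_table *table, int port)
{
    int delivered = 0;

    for (size_t i = 0; i < TOY_MAX_SOCKS; i++)
        if (table->ports[i] == port)
            delivered++;
    return delivered;
}
```

A UDP-Lite datagram looked up in the plain UDP table would find no receivers in this model, which corresponds to the breakage the commit describes.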

Fixes: 2dc41cff7545 ("udp: Use hash2 for long hash1 chains in __udp*_lib_mcast_deliver.")
Signed-off-by: Pablo Neira Ayuso
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Pablo Neira
2016-11-16 11:14:27 +0800

14 Nov, 2016

1 commit

ac6e78007 tcp: take care of truncations done by sk_filter()

With syzkaller help, Marco Grassi found a bug in TCP stack,
crashing in tcp_collapse()

Root cause is that sk_filter() can truncate the incoming skb,
but TCP stack was not really expecting this to happen.
It probably was expecting a simple DROP or ACCEPT behavior.

We first need to make sure no part of TCP header could be removed.
Then we need to adjust TCP_SKB_CB(skb)->end_seq
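
The two steps can be sketched on a simplified segment. struct toy_tcp_skb and tcp_filter_fixup() are hypothetical userspace stand-ins; the keep argument plays the role of the length returned by a socket filter, with 0 meaning drop.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_tcp_skb {
    uint32_t len;       /* bytes in the segment, TCP header included */
    uint32_t hdr_len;   /* TCP header length in bytes */
    uint32_t seq;       /* first sequence number of the segment */
    uint32_t end_seq;   /* sequence number just past the segment */
};

/* Returns false when the filter drops the segment, true otherwise. */
bool tcp_filter_fixup(struct toy_tcp_skb *skb, uint32_t keep)
{
    uint32_t before = skb->len;

    if (keep == 0)
        return false;           /* plain drop, nothing to adjust */

    /* Step 1: never let the filter trim into the TCP header. */
    if (keep < skb->hdr_len)
        keep = skb->hdr_len;
    if (keep < skb->len)
        skb->len = keep;

    /* Step 2: shrink end_seq by the number of payload bytes removed. */
    skb->end_seq -= before - skb->len;
    return true;
}
```

Keeping end_seq consistent with the remaining payload is what lets later code trust the relation between sequence numbers and the data actually present in the segment.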

Many thanks to syzkaller team and Marco for giving us a reproducer.

Signed-off-by: Eric Dumazet
Reported-by: Marco Grassi
Reported-by: Vladis Dronov
Signed-off-by: David S. Miller

Eric Dumazet
2016-11-14 01:30:02 +0800

10 Nov, 2016

3 commits

9b6c14d51 net: tcp response should set oif only if it is L3 master

Lorenzo noted an Android unit test failed due to e0d56fdd7342:
"The expectation in the test was that the RST replying to a SYN sent to a
closed port should be generated with oif=0. In other words it should not
prefer the interface where the SYN came in on, but instead should follow
whatever the routing table says it should do."

Revert the change to ip_send_unicast_reply and tcp_v6_send_response such
that the oif in the flow is set to the skb_iif only if skb_iif is an L3
master.
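
The resulting rule is small enough to state as code. The sketch below is a userspace approximation with a hypothetical device list; struct toy_netdev and reply_oif() are illustrative names, not kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_netdev {
    int ifindex;
    bool is_l3_master;
};

/*
 * Choose the oif for a reply (for example a RST): the ingress interface
 * only when it is an L3 master, otherwise 0 so that the routing table
 * alone decides where the reply goes.
 */
int reply_oif(const struct toy_netdev *devs, size_t ndevs, int skb_iif)
{
    for (size_t i = 0; i < ndevs; i++)
        if (devs[i].ifindex == skb_iif && devs[i].is_l3_master)
            return skb_iif;
    return 0;
}
```

For a SYN arriving on an ordinary interface this yields oif=0, which is the behavior the quoted test expectation asks for.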

Fixes: e0d56fdd7342 ("net: l3mdev: remove redundant calls")
Reported-by: Lorenzo Colitti
Signed-off-by: David Ahern
Tested-by: Lorenzo Colitti
Acked-by: Lorenzo Colitti
Signed-off-by: David S. Miller

David Ahern
2016-11-10 11:32:10 +0800
9fa684ec8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains a larger than usual batch of Netfilter
fixes for your net tree. This series contains a mixture of old bugs and
recently introduced bugs, they are:

1) Fix a crash when using nft_dynset with nft_set_rbtree, which doesn't
support the set element updates from the packet path. From Liping
Zhang.

2) Fix leak when nft_expr_clone() fails, from Liping Zhang.

3) Fix a race when inserting new elements to the set hash from the
packet path, also from Liping.

4) Handle segmented TCP SIP packets properly: perform stricter SIP
message parsing so that an INVITE in the Allow header does not create
bogus expectations, from Ulrich Weber.

5) nft_parse_u32_check() should return signed integer for errors, from
John Linville.

6) Fix wrong allocation instead of connlabels, allocate 16 instead of
32 bytes, from Florian Westphal.

7) Fix compilation breakage when building the ip_vs_sync code with
CONFIG_OPTIMIZE_INLINING on x86, from Arnd Bergmann.

8) Destroy the new set if the transaction object cannot be allocated,
also from Liping Zhang.

9) Use device to route duplicated packets via nft_dup only when set by
the user, otherwise packets may not follow the right route, again
from Liping.

10) Fix wrong maximum genetlink attribute definition in IPVS, from
WANG Cong.

11) Ignore untracked conntrack objects from xt_connmark, from Florian
Westphal.

12) Allow to use conntrack helpers that are registered NFPROTO_UNSPEC
via CT target, otherwise we cannot use the h.245 helper, from
Florian.

13) Revisit garbage collection heuristic in the new workqueue-based
timer approach for conntrack to evict objects earlier, again from
Florian.

14) Fix crash in nf_tables when inserting an element into a verdict map,
from Liping Zhang.
====================

Signed-off-by: David S. Miller

David S. Miller
2016-11-10 09:38:18 +0800
fb56be83e net-ipv6: on device mtu change do not add mtu to mtu-less routes

Routes can specify an mtu explicitly or inherit the mtu from
the underlying device - this inheritance is implemented in
dst->ops->mtu handlers ip6_mtu() and ip6_blackhole_mtu().

Currently changing the mtu of a device adds mtu explicitly
to routes using that device.

e.g.
# ip link set dev lo mtu 65536
# ip -6 route add local 2000::1 dev lo
# ip -6 route get 2000::1
local 2000::1 dev lo table local src ... metric 1024 pref medium

# ip link set dev lo mtu 65535
# ip -6 route get 2000::1
local 2000::1 dev lo table local src ... metric 1024 mtu 65535 pref medium

# ip link set dev lo mtu 65536
# ip -6 route get 2000::1
local 2000::1 dev lo table local src ... metric 1024 mtu 65536 pref medium

# ip -6 route del local 2000::1

After this patch the route entry no longer changes unless it already has an mtu.
There is no need: this inheritance is already done in ip6_mtu()

# ip link set dev lo mtu 65536
# ip -6 route add local 2000::1 dev lo
# ip -6 route add local 2000::2 dev lo mtu 2000
# ip -6 route get 2000::1; ip -6 route get 2000::2
local 2000::1 dev lo table local src ... metric 1024 pref medium
local 2000::2 dev lo table local src ... metric 1024 mtu 2000 pref medium

# ip link set dev lo mtu 65535
# ip -6 route get 2000::1; ip -6 route get 2000::2
local 2000::1 dev lo table local src ... metric 1024 pref medium
local 2000::2 dev lo table local src ... metric 1024 mtu 2000 pref medium

# ip link set dev lo mtu 1501
# ip -6 route get 2000::1; ip -6 route get 2000::2
local 2000::1 dev lo table local src ... metric 1024 pref medium
local 2000::2 dev lo table local src ... metric 1024 mtu 1501 pref medium

# ip link set dev lo mtu 65536
# ip -6 route get 2000::1; ip -6 route get 2000::2
local 2000::1 dev lo table local src ... metric 1024 pref medium
local 2000::2 dev lo table local src ... metric 1024 mtu 65536 pref medium

# ip -6 route del local 2000::1
# ip -6 route del local 2000::2

This is desirable because changing device mtu and then resetting it
to the previous value shouldn't change the user visible routing table.
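
The behavior shown in the session after the patch can be modeled in a few lines of C. The update rule below is inferred from that output and is only an approximation; struct toy_route, route_mtu() and route_dev_mtu_change() are hypothetical names, and details such as locked metrics are left out.

```c
struct toy_route {
    unsigned int mtu;   /* explicit route mtu, 0 means inherit from the device */
};

/* Effective mtu: the explicit value when present, else the device mtu. */
unsigned int route_mtu(const struct toy_route *rt, unsigned int dev_mtu)
{
    return rt->mtu ? rt->mtu : dev_mtu;
}

/* Apply a device mtu change from old_dev_mtu to new_dev_mtu to one route. */
void route_dev_mtu_change(struct toy_route *rt,
                          unsigned int old_dev_mtu,
                          unsigned int new_dev_mtu)
{
    if (rt->mtu == 0)
        return;         /* mtu-less route: keep inheriting, store nothing */

    /*
     * Lower the route mtu when the device shrinks below it, and raise it
     * again only if it had been tracking the old device mtu.
     */
    if (rt->mtu >= new_dev_mtu || rt->mtu == old_dev_mtu)
        rt->mtu = new_dev_mtu;
}
```

Replaying the device mtu sequence 65536, 65535, 1501, 65536 against a route with mtu 2000 gives 2000, 1501 and 65536 in this model, while a route without an mtu never gains one.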

Signed-off-by: Maciej Żenczykowski
CC: Eric Dumazet
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Maciej Żenczykowski
2016-11-10 02:19:32 +0800

08 Nov, 2016

1 commit

5d41ce29e net: icmp6_send should use dst dev to determine L3 domain

icmp6_send is called in response to some event. The skb may not have
the device set (skb->dev is NULL), but it is expected to have a dst set.
Update icmp6_send to use the dst on the skb to determine L3 domain.

Fixes: ca254490c8dfd ("net: Add VRF support to IPv6 stack")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2016-11-08 09:30:19 +0800

03 Nov, 2016

1 commit

4fd19c15d ip6_udp_tunnel: remove unused IPCB related codes

Some IPCB fields are currently set in udp_tunnel6_xmit_skb(), which are
never used before it reaches ip6tunnel_xmit(), and past that point the
control buffer is no longer interpreted as IPCB.

This removes the unused IPCB-related code. Currently there is no skb
scrubbing in ip6_udp_tunnel, otherwise IPCB(skb)->opt might need to be
cleared for IPv4 packets, as shown in 5146d1f1511
("tunnel: Clear IPCB(skb)->opt before dst_link_failure called").

Signed-off-by: Eli Cooper
Signed-off-by: David S. Miller

Eli Cooper
2016-11-03 03:18:36 +0800

01 Nov, 2016

2 commits

19bda36c4 ipv6: add mtu lock check in __ip6_rt_update_pmtu

Prior to this patch, ipv6 did not do the mtu lock check in ip6_update_pmtu.
As a result, the mtu lock did not really work when receiving an
ICMPV6_PKT_TOOBIG packet.

This patch is to add mtu lock check in __ip6_rt_update_pmtu just as ipv4
did in __ip_rt_update_pmtu.
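
A minimal sketch of the check, assuming a simplified route object: struct toy_rt and rt_update_pmtu() are hypothetical, and the clamp to 1280 reflects the IPv6 minimum link MTU.

```c
#include <stdbool.h>

#define TOY_IPV6_MIN_MTU 1280u  /* IPv6 minimum link MTU */

struct toy_rt {
    unsigned int pmtu;
    bool mtu_locked;    /* set when the administrator locked the route mtu */
};

/* Handle a "packet too big" report; returns true if the path mtu changed. */
bool rt_update_pmtu(struct toy_rt *rt, unsigned int reported_mtu)
{
    if (rt->mtu_locked)
        return false;               /* a locked mtu must not be overridden */

    if (reported_mtu < TOY_IPV6_MIN_MTU)
        reported_mtu = TOY_IPV6_MIN_MTU;
    if (reported_mtu >= rt->pmtu)
        return false;               /* only ever lower the path mtu */

    rt->pmtu = reported_mtu;
    return true;
}
```

Without the first test, a too-big report would lower the path mtu even on a route whose mtu was explicitly locked.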

Acked-by: Hannes Frederic Sowa
Signed-off-by: Xin Long
Signed-off-by: David S. Miller

Xin Long
2016-11-01 02:24:24 +0800
f89c56ce7 ipv6: Don't use ufo handling on later transformed packets

Similar to commit c146066ab802 ("ipv4: Don't use ufo handling on later
transformed packets"), don't perform UFO on packets that will be IPsec
transformed. To detect it we rely on the fact that headerlen in
dst_entry is non-zero only for transformation bundles (xfrm_dst
objects).
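
The detection can be written as a single predicate. This is a simplified userspace sketch: struct toy_dst_entry and should_use_ufo() are hypothetical, and the kernel's real condition may involve further checks not mentioned in this message.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_dst_entry {
    size_t header_len;  /* non-zero only for transformation bundles */
};

/*
 * Decide whether an oversized UDP datagram is handed to the device for
 * fragmentation offload. Packets that will still be IPsec transformed
 * are excluded by testing the header length of the dst.
 */
bool should_use_ufo(size_t length, size_t mtu, bool is_udp,
                    bool dev_has_ufo, const struct toy_dst_entry *dst)
{
    return length > mtu && is_udp && dev_has_ufo && dst->header_len == 0;
}
```

When the predicate is false for a transformed packet, the datagram takes the regular path, where fragmentation happens after the transformation has been applied.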

Unwanted segmentation can be observed with a NETIF_F_UFO capable device,
such as a dummy device:

DEV=dum0 LEN=1493

ip li add $DEV type dummy
ip addr add fc00::1/64 dev $DEV nodad
ip link set $DEV up
ip xfrm policy add dir out src fc00::1 dst fc00::2 \
tmpl src fc00::1 dst fc00::2 proto esp spi 1
ip xfrm state add src fc00::1 dst fc00::2 \
proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b

tcpdump -n -nn -i $DEV -t &
socat /dev/zero,readbytes=$LEN udp6:[fc00::2]:$LEN

tcpdump output before:

IP6 fc00::1 > fc00::2: frag (0|1448) ESP(spi=0x00000001,seq=0x1), length 1448
IP6 fc00::1 > fc00::2: frag (1448|48)
IP6 fc00::1 > fc00::2: ESP(spi=0x00000001,seq=0x2), length 88

... and after:

IP6 fc00::1 > fc00::2: frag (0|1448) ESP(spi=0x00000001,seq=0x1), length 1448
IP6 fc00::1 > fc00::2: frag (1448|80)

Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")

Signed-off-by: Jakub Sitnicki
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Jakub Sitnicki
2016-11-01 01:10:41 +0800