03 Jun, 2020

1 commit

  • commit 976eba8ab596bab94b9714cd46d38d5c6a2c660d upstream.

    Commit dd9ee3444014 ("vti4: Fix a ipip packet processing bug in
    'IPCOMP' virtual tunnel") tries to receive IPIP packets in vti
    by calling xfrm_input(). This case occurs when a packet or
    fragment sent by the peer is too small to be compressed.

    However, xfrm_input() still takes the IPCOMP path, where the skb
    sec_path is set but never dropped, although it should have been
    dropped in vti_ipcomp4_protocol.cb_handler (vti_rcv_cb) because
    this is not an ipcomp4 packet. As a result, the packet can never
    pass xfrm4_policy_check() in the upper protocol rcv functions.

    So this patch calls ip_tunnel_rcv() to process IPIP packets
    instead.

    Fixes: dd9ee3444014 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
    Reported-by: Xiumei Mu
    Signed-off-by: Xin Long
    Signed-off-by: Steffen Klassert
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     

01 Apr, 2020

1 commit


06 Feb, 2020

1 commit

  • [ Upstream commit 95224166a9032ff5d08fca633d37113078ce7d01 ]

    With an ebpf program that redirects packets through a vti[6] interface,
    the packets are dropped because no dst is attached.

    This could also be reproduced with an AF_PACKET socket, with the following
    python script (vti1 is an ip_vti interface):

    import socket
    send_s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, 0)
    # scapy
    # p = IP(src='10.100.0.2', dst='10.200.0.1')/ICMP(type='echo-request')
    # raw(p)
    req = b'E\x00\x00\x1c\x00\x01\x00\x00@\x01e\xb2\nd\x00\x02\n\xc8\x00\x01\x08\x00\xf7\xff\x00\x00\x00\x00'
    send_s.sendto(req, ('vti1', 0x800, 0, 0))
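
    To see what the raw payload in the reproducer contains, the bytes can be
    decoded with the standard library alone. This is a minimal sketch and not
    part of the original commit; the helper names ones_complement_sum and
    parse_ipv4_icmp are illustrative. It checks that req is a 28-byte IPv4
    ICMP echo request from 10.100.0.2 to 10.200.0.1 with valid checksums.

```python
import socket
import struct

# Payload copied verbatim from the reproducer script.
req = b'E\x00\x00\x1c\x00\x01\x00\x00@\x01e\xb2\nd\x00\x02\n\xc8\x00\x01\x08\x00\xf7\xff\x00\x00\x00\x00'


def ones_complement_sum(data: bytes) -> int:
    """Return the folded 16-bit one's-complement sum used by IPv4 and ICMP."""
    if len(data) % 2:
        data += b'\x00'  # pad odd-length input with a zero byte
    total = sum(struct.unpack('!%dH' % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def parse_ipv4_icmp(packet: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header and the ICMP type/code after it."""
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, proto, _csum, src, dst) = struct.unpack('!BBHHHBBH4s4s', packet[:20])
    ihl = (ver_ihl & 0x0F) * 4  # header length in bytes
    icmp_type, icmp_code = struct.unpack('!BB', packet[ihl:ihl + 2])
    return {
        'version': ver_ihl >> 4,
        'ihl': ihl,
        'total_len': total_len,
        'ttl': ttl,
        'proto': proto,
        'src': socket.inet_ntoa(src),
        'dst': socket.inet_ntoa(dst),
        'icmp_type': icmp_type,
        'icmp_code': icmp_code,
    }


parsed = parse_ipv4_icmp(req)
# A checksum is valid when the sum over the data, checksum field included,
# folds to 0xFFFF.
ip_csum_ok = ones_complement_sum(req[:parsed['ihl']]) == 0xFFFF
icmp_csum_ok = ones_complement_sum(req[parsed['ihl']:]) == 0xFFFF
print(parsed, ip_csum_ok, icmp_csum_ok)
```

    Protocol 1 is ICMP and ICMP type 8 is an echo request, which matches the
    scapy expression in the comment of the reproducer.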

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin

    Nicolas Dichtel
     

05 Jan, 2020

1 commit

  • [ Upstream commit 8247a79efa2f28b44329f363272550c1738377de ]

    When an IPv6 tunnel does a PMTU update and ends up calling
    __ip6_rt_update_pmtu(), we should not call dst_confirm_neigh(), as
    there is no two-way communication.

    As Guillaume pointed out, vti and vti6 are immune to this problem
    because they are IFF_NOARP interfaces. Still, there is no sense in
    confirming the neighbour here.

    v5: Update commit description.
    v4: No change.
    v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
    v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

    Reviewed-by: Guillaume Nault
    Acked-by: David Ahern
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

03 May, 2019

1 commit


08 Apr, 2019

3 commits

  • This structure is now only 4 bytes, so it's more efficient
    to cache a copy rather than its address.

    No significant size difference in allmodconfig vmlinux.

    With non-modular kernel that has all XFRM options enabled, this
    series reduces vmlinux image size by ~11kb. All xfrm_mode
    indirections are gone and all modes are built-in.

    before (ipsec-next master):
        text    data      bss      dec filename
    21071494 7233140 11104324 39408958 vmlinux.master

    after this series:
    21066448 7226772 11104324 39397544 vmlinux.patched

    With an allmodconfig kernel, the size increase is only 362 bytes,
    even though all the xfrm config options removed in this series are
    modular.

    before:
        text    data     bss      dec filename
    15731286 6936912 4046908 26715106 vmlinux.master

    after this series:
    15731492 6937068 4046908 26715468 vmlinux
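
    The quoted figures can be cross-checked with a few lines of arithmetic.
    This is a small sketch, not part of the original commit; the Size tuple
    and the delta helper are illustrative names, and the numbers are copied
    from the size listings in this commit message.

```python
from typing import NamedTuple


class Size(NamedTuple):
    """One row of `size` output (sections in bytes)."""
    text: int
    data: int
    bss: int
    dec: int


def delta(before: Size, after: Size) -> Size:
    """Per-column difference, after minus before."""
    return Size(*(a - b for b, a in zip(before, after)))


# Non-modular kernel with all XFRM options enabled.
builtin_before = Size(21071494, 7233140, 11104324, 39408958)
builtin_after = Size(21066448, 7226772, 11104324, 39397544)

# allmodconfig kernel.
allmod_before = Size(15731286, 6936912, 4046908, 26715106)
allmod_after = Size(15731492, 6937068, 4046908, 26715468)

builtin_delta = delta(builtin_before, builtin_after)
allmod_delta = delta(allmod_before, allmod_after)
print(builtin_delta)
print(allmod_delta)
```

    The dec column shrinks by 11414 bytes in the built-in case (the "~11kb"
    above) and grows by 362 bytes in the allmodconfig case.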

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
     
  • After the previous changes, xfrm_mode contains no function pointers anymore,
    and all modules defining such a struct contain no code except init/exit
    functions to register the xfrm_mode struct with the xfrm core.

    Just place the xfrm modes in the core and remove the modules;
    the run-time xfrm_mode register/unregister functionality is removed.

    Before:

        text    data     bss      dec filename
        7523     200    2364    10087 net/xfrm/xfrm_input.o
       40003     628     440    41071 net/xfrm/xfrm_state.o
    15730338 6937080 4046908 26714326 vmlinux

    After:

        7389     200    2364     9953 net/xfrm/xfrm_input.o
       40574     656     440    41670 net/xfrm/xfrm_state.o
    15730084 6937068 4046908 26714060 vmlinux

    The xfrm*_mode_{transport,tunnel,beet} modules are gone.

    v2: replace CONFIG_INET6_XFRM_MODE_* IS_ENABLED guards with CONFIG_IPV6
    ones rather than removing them.

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
     
  • Now that we have the family available directly in the
    xfrm_mode struct, we can use that and avoid one extra dereference.

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
     

24 Mar, 2019

1 commit

  • The ipip tunnel introduced in commit dd9ee3444014 ("vti4: Fix a ipip
    packet processing bug in 'IPCOMP' virtual tunnel") largely duplicated
    the existing vti_input and vti_recv functions. Refactored to
    deduplicate the common code.

    Signed-off-by: Jeremy Sowden
    Signed-off-by: Steffen Klassert

    Jeremy Sowden
     

21 Mar, 2019

2 commits

  • Removed info log-message if ipip tunnel registration fails during
    module-initialization: it adds nothing to the error message that is
    written on all failures.

    Fixes: dd9ee3444014e ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
    Signed-off-by: Jeremy Sowden
    Signed-off-by: Steffen Klassert

    Jeremy Sowden
     
  • If tunnel registration failed during module initialization, the module
    would fail to deregister the IPPROTO_COMP protocol and would attempt to
    deregister the tunnel.

    The tunnel was not deregistered during module-exit.

    Fixes: dd9ee3444014e ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
    Signed-off-by: Jeremy Sowden
    Signed-off-by: Steffen Klassert

    Jeremy Sowden
     

09 Jan, 2019

1 commit

  • We recently ran a network test over an ipcomp virtual tunnel and found
    that if an IPv4 packet needs fragmentation, the peer can't receive
    it.

    Digging into the code, we found that when a packet needs fragmentation,
    the smaller fragment is encapsulated by ipip, not ipcomp. So when the ipip
    packet goes into xfrm, its skb->dev is not properly set. The IPv4 reassembly
    code always sets the skb's dev to the last fragment's dev. After IPv4 defrag
    processing, when the kernel rp_filter parameter is set, the skb is dropped
    with an -EXDEV error.

    This patch adds compatible support for ipip processing in the ipcomp virtual tunnel.

    Signed-off-by: Su Yanjun
    Signed-off-by: Steffen Klassert

    Su Yanjun
     

02 Jan, 2019

1 commit

  • KMSAN detected read beyond end of buffer in vti and sit devices when
    passing truncated packets with PF_PACKET. The issue affects additional
    ip tunnel devices.

    Extend commit 76c0ddd8c3a6 ("ip6_tunnel: be careful when accessing the
    inner header") and commit ccfec9e5cb2d ("ip_tunnel: be careful when
    accessing the inner header").

    Move the check to a separate helper and call at the start of each
    ndo_start_xmit function in net/ipv4 and net/ipv6.

    Minor changes:
    - convert dev_kfree_skb to kfree_skb on error path,
    as dev_kfree_skb calls consume_skb which is not for error paths.
    - use pskb_network_may_pull even though that is pedantic here,
    as it is the same as pskb_may_pull for devices without llheaders.
    - do not cache ipv6 hdrs if used only once
    (unsafe across pskb_may_pull, was more relevant to earlier patch)

    Reported-by: syzbot
    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

27 Sep, 2018

2 commits


20 Aug, 2018

1 commit

  • After setting fb_tunnels_only_for_init_net to 1, itn->fb_tunnel_dev will
    be NULL, which causes the following crash:

    [ 2742.849298] BUG: unable to handle kernel NULL pointer dereference at 0000000000000941
    [ 2742.851380] PGD 800000042c21a067 P4D 800000042c21a067 PUD 42aaed067 PMD 0
    [ 2742.852818] Oops: 0002 [#1] SMP PTI
    [ 2742.853570] CPU: 7 PID: 2484 Comm: unshare Kdump: loaded Not tainted 4.18.0-rc8+ #2
    [ 2742.855163] Hardware name: Fedora Project OpenStack Nova, BIOS seabios-1.7.5-11.el7 04/01/2014
    [ 2742.856970] RIP: 0010:vti_init_net+0x3a/0x50 [ip_vti]
    [ 2742.858034] Code: 90 83 c0 48 c7 c2 20 a1 83 c0 48 89 fb e8 6e 3b f6 ff 85 c0 75 22 8b 0d f4 19 00 00 48 8b 93 00 14 00 00 48 8b 14 ca 48 8b 12 82 41 09 00 00 04 c6 82 38 09 00 00 45 5b c3 66 0f 1f 44 00 00
    [ 2742.861940] RSP: 0018:ffff9be28207fde0 EFLAGS: 00010246
    [ 2742.863044] RAX: 0000000000000000 RBX: ffff8a71ebed4980 RCX: 0000000000000013
    [ 2742.864540] RDX: 0000000000000000 RSI: 0000000000000013 RDI: ffff8a71ebed4980
    [ 2742.866020] RBP: ffff8a71ea717000 R08: ffffffffc083903c R09: ffff8a71ea717000
    [ 2742.867505] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a71ebed4980
    [ 2742.868987] R13: 0000000000000013 R14: ffff8a71ea5b49c0 R15: 0000000000000000
    [ 2742.870473] FS: 00007f02266c9740(0000) GS:ffff8a71ffdc0000(0000) knlGS:0000000000000000
    [ 2742.872143] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 2742.873340] CR2: 0000000000000941 CR3: 000000042bc20006 CR4: 00000000001606e0
    [ 2742.874821] Call Trace:
    [ 2742.875358] ops_init+0x38/0xf0
    [ 2742.876078] setup_net+0xd9/0x1f0
    [ 2742.876789] copy_net_ns+0xb7/0x130
    [ 2742.877538] create_new_namespaces+0x11a/0x1d0
    [ 2742.878525] unshare_nsproxy_namespaces+0x55/0xa0
    [ 2742.879526] ksys_unshare+0x1a7/0x330
    [ 2742.880313] __x64_sys_unshare+0xe/0x20
    [ 2742.881131] do_syscall_64+0x5b/0x180
    [ 2742.881933] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Reproduce:
    echo 1 > /proc/sys/net/core/fb_tunnels_only_for_init_net
    modprobe ip_vti
    unshare -n

    Fixes: 79134e6ce2c9 ("net: do not create fallback tunnels for non-default namespaces")
    Cc: Eric Dumazet
    Signed-off-by: Haishuang Yan
    Signed-off-by: David S. Miller

    Haishuang Yan
     

19 Mar, 2018

2 commits

  • Don't hardcode a MTU value on vti tunnel initialization,
    ip_tunnel_newlink() is able to deal with this already. See also
    commit ffc2b6ee4174 ("ip_gre: fix IFLA_MTU ignored on NEWLINK").

    Fixes: 1181412c1a67 ("net/ipv4: VTI support new module for ip_vti.")
    Signed-off-by: Stefano Brivio
    Acked-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Stefano Brivio
     
  • This re-introduces the effect of commit a32452366b72 ("vti4:
    Don't count header length twice.") which was accidentally
    reverted by merge commit f895f0cfbb77 ("Merge branch 'master' of
    git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec").

    The commit message from Steffen Klassert said:

    We currently count the size of LL_MAX_HEADER and struct iphdr
    twice for vti4 devices, this leads to a wrong device mtu.
    The size of LL_MAX_HEADER and struct iphdr is already counted in
    ip_tunnel_bind_dev(), so don't do it again in vti_tunnel_init().

    And this is still the case now: ip_tunnel_bind_dev() already
    accounts for the header length of the link layer (not
    necessarily LL_MAX_HEADER, if the output device is found), plus
    one IP header.

    For example, with a vti device on top of veth, with MTU of 1500,
    the existing implementation would set the initial vti MTU to
    1332, accounting once for LL_MAX_HEADER (128, included in
    hard_header_len by vti) and twice for the same IP header (once
    from hard_header_len, once from ip_tunnel_bind_dev()).

    It should instead be 1480, because ip_tunnel_bind_dev() is able
    to figure out that the output device is veth, so no additional
    link layer header is attached, and will properly count one
    single IP header.
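
    The two MTU values in this example follow from simple subtraction. The
    sketch below is illustrative only: initial_vti_mtu is a hypothetical
    helper, not a kernel function, and it assumes a 20-byte IPv4 header
    without options plus the LL_MAX_HEADER value of 128 quoted above.

```python
LL_MAX_HEADER = 128    # value quoted in the commit message
IPV4_HEADER_LEN = 20   # IPv4 header without options


def initial_vti_mtu(lower_mtu: int, link_header_len: int, ip_headers: int) -> int:
    """Subtract link-layer and IP header overhead from the lower device MTU."""
    return lower_mtu - link_header_len - ip_headers * IPV4_HEADER_LEN


# Old behaviour: LL_MAX_HEADER counted once, the IP header counted twice.
old_mtu = initial_vti_mtu(1500, LL_MAX_HEADER, 2)

# Intended behaviour on top of veth: no extra link-layer header, one IP header.
new_mtu = initial_vti_mtu(1500, 0, 1)

print(old_mtu, new_mtu)
```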

    The existing issue had the side effect of avoiding PMTUD for
    most xfrm policies, by arbitrarily lowering the initial MTU.
    However, the only way to get a consistent PMTU value is to let
    the xfrm PMTU discovery do its course, and commit d6af1a31cc72
    ("vti: Add pmtu handling to vti_xmit.") now takes care of local
    delivery cases where the application ignores local socket
    notifications.

    Fixes: b9959fd3b0fa ("vti: switch to new ip tunnel code")
    Fixes: f895f0cfbb77 ("Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec")
    Signed-off-by: Stefano Brivio
    Acked-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Stefano Brivio
     

26 Jan, 2018

1 commit

  • Some dst_ops (e.g. md_dst_ops) don't set this handler, which may result in:
    "BUG: unable to handle kernel NULL pointer dereference at (null)"

    Let's add a helper to check if update_pmtu is available before calling it.
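
    The idea of such a helper can be modelled outside the kernel as a guarded
    call through an optional callback. This is only a sketch of the pattern:
    DstOps and dst_update_pmtu are hypothetical Python stand-ins, not the
    kernel's structures or the helper this patch adds.

```python
from typing import Callable, List, Optional


class DstOps:
    """Stand-in for an ops table whose update_pmtu handler may be unset."""

    def __init__(self, update_pmtu: Optional[Callable[[int], None]] = None):
        self.update_pmtu = update_pmtu


def dst_update_pmtu(ops: DstOps, mtu: int) -> bool:
    """Call the handler only if the ops table provides one.

    Returns True if the handler ran, False if it was skipped.
    """
    if ops.update_pmtu is None:
        return False
    ops.update_pmtu(mtu)
    return True


seen: List[int] = []
with_handler = DstOps(update_pmtu=seen.append)
without_handler = DstOps()  # models an ops table that leaves the handler unset

called = dst_update_pmtu(with_handler, 1400)
skipped = dst_update_pmtu(without_handler, 1400)
print(called, skipped, seen)
```

    Routing every caller through one guarded helper keeps the NULL check in a
    single place instead of repeating it at each call site.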

    Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
    Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
    CC: Roman Kapl
    CC: Xin Long
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

01 Nov, 2017

1 commit


06 Oct, 2017

1 commit


27 Sep, 2017

1 commit

  • When running LTP IPsec tests, KASan might report:

    BUG: KASAN: use-after-free in vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
    Read of size 4 at addr ffff880dc6ad1980 by task swapper/0/0
    ...
    Call Trace:

    dump_stack+0x63/0x89
    print_address_description+0x7c/0x290
    kasan_report+0x28d/0x370
    ? vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
    __asan_report_load4_noabort+0x19/0x20
    vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
    ? vti_init_net+0x190/0x190 [ip_vti]
    ? save_stack_trace+0x1b/0x20
    ? save_stack+0x46/0xd0
    dev_hard_start_xmit+0x147/0x510
    ? icmp_echo.part.24+0x1f0/0x210
    __dev_queue_xmit+0x1394/0x1c60
    ...
    Freed by task 0:
    save_stack_trace+0x1b/0x20
    save_stack+0x46/0xd0
    kasan_slab_free+0x70/0xc0
    kmem_cache_free+0x81/0x1e0
    kfree_skbmem+0xb1/0xe0
    kfree_skb+0x75/0x170
    kfree_skb_list+0x3e/0x60
    __dev_queue_xmit+0x1298/0x1c60
    dev_queue_xmit+0x10/0x20
    neigh_resolve_output+0x3a8/0x740
    ip_finish_output2+0x5c0/0xe70
    ip_finish_output+0x4ba/0x680
    ip_output+0x1c1/0x3a0
    xfrm_output_resume+0xc65/0x13d0
    xfrm_output+0x1e4/0x380
    xfrm4_output_finish+0x5c/0x70

    This can be fixed by reading skb->len before calling dst_output().
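
    The ordering problem can be illustrated with a toy model. The sketch
    below is not kernel code: Skb, dst_output, xmit_buggy and xmit_fixed are
    hypothetical stand-ins, and the "freed" flag only imitates a buffer that
    the output path has already released.

```python
class Skb:
    """Toy buffer that refuses to be read once it has been consumed."""

    def __init__(self, length: int):
        self._len = length
        self.freed = False

    @property
    def len(self) -> int:
        if self.freed:
            raise RuntimeError("use-after-free: skb read after it was consumed")
        return self._len


def dst_output(skb: Skb) -> int:
    """Stand-in for the output path, which may free the buffer before returning."""
    skb.freed = True
    return 0


def xmit_buggy(skb: Skb, stats: dict) -> None:
    err = dst_output(skb)
    if err == 0:
        stats["tx_bytes"] += skb.len  # reads the buffer after it was consumed


def xmit_fixed(skb: Skb, stats: dict) -> None:
    pkt_len = skb.len  # cache the length while the buffer is still valid
    err = dst_output(skb)
    if err == 0:
        stats["tx_bytes"] += pkt_len


stats = {"tx_bytes": 0}
xmit_fixed(Skb(84), stats)

try:
    xmit_buggy(Skb(84), stats)
    buggy_detected = False
except RuntimeError:
    buggy_detected = True

print(stats, buggy_detected)
```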

    Fixes: b9959fd3b0fa ("vti: switch to new ip tunnel code")
    Fixes: 22e1b23dafa8 ("vti6: Support inter address family tunneling.")
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller

    Alexey Kodanev
     

20 Sep, 2017

1 commit

  • Implement exit_batch() method to dismantle more devices
    per round.

    (rtnl_lock() ...
    unregister_netdevice_many() ...
    rtnl_unlock())

    Tested:
    $ cat add_del_unshare.sh
    for i in `seq 1 40`
    do
    (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) &
    done
    wait ; grep net_namespace /proc/slabinfo

    Before patch :
    $ time ./add_del_unshare.sh
    net_namespace 126 282 5504 1 2 : tunables 8 4 0 : slabdata 126 282 0

    real 1m38.965s
    user 0m0.688s
    sys 0m37.017s

    After patch:
    $ time ./add_del_unshare.sh
    net_namespace 135 291 5504 1 2 : tunables 8 4 0 : slabdata 135 291 0

    real 0m22.117s
    user 0m0.728s
    sys 0m35.328s

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Jul, 2017

1 commit


27 Jun, 2017

3 commits


09 May, 2017

1 commit


22 Apr, 2017

1 commit

  • This feature allows the administrator to set an fwmark for
    packets traversing a tunnel. This allows the use of independent
    routing tables for tunneled packets without the use of iptables.

    There is no concept of per-packet routing decisions through IPv4
    tunnels, so this implementation does not need to work with
    per-packet route lookups as the v6 implementation may
    (with IP6_TNL_F_USE_ORIG_FWMARK).

    Further, since the v4 tunnel ioctls share datastructures
    (which can not be trivially modified) with the kernel's internal
    tunnel configuration structures, the mark attribute must be stored
    in the tunnel structure itself and passed as a parameter when
    creating or changing tunnel attributes.

    Signed-off-by: Craig Gallek
    Signed-off-by: David S. Miller

    Craig Gallek
     

18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into a zero-based array and
    thus is an unsigned entity. Using a negative value is an out-of-bounds
    access by definition.

    2)
    On x86_64, unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added to or subtracted from pointers
    are preferred to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
            g(p[i]);
    }

    roughly translates to

            movsx   rsi, esi
            mov     rdi, [rsi+...]
            call    g

    MOVSX is a 3-byte instruction which isn't necessary if the variable is
    unsigned, because x86_64 zero-extends by default.
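
    The difference between the two extensions can be modelled with plain
    integer arithmetic. This sketch is illustrative only; the function names
    are made up, and it models the register contents, not the instruction
    encoding.

```python
def sign_extend_32_to_64(value: int) -> int:
    """Model MOVSX: replicate bit 31 of a 32-bit value into the upper 32 bits."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def zero_extend_32_to_64(value: int) -> int:
    """Model a 32-bit register write on x86_64: the upper 32 bits become zero."""
    return value & 0xFFFFFFFF


small_index = 19            # a typical small array index
all_ones = 0xFFFFFFFF       # -1 when read as signed 32-bit

same_for_small = sign_extend_32_to_64(small_index) == zero_extend_32_to_64(small_index)
signed_view = sign_extend_32_to_64(all_ones)
unsigned_view = zero_extend_32_to_64(all_ones)
print(same_for_small, hex(signed_view), hex(unsigned_view))
```

    For non-negative indices both extensions give the same 64-bit value, so
    the explicit sign extension is pure overhead when the index can never be
    negative.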

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
            ...
            ptr = ng->ptr[id - 1];
            ...
    }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
    messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger.
    This is a seemingly random artefact of code generation, with the register
    allocator being used differently. gcc decides that some variable
    needs to live in the new r8+ registers, and every access now requires a REX
    prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
    used, which is longer than [r8].

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function                       old     new   delta
    nfsd4_lock                    3886    3959     +73
    tipc_link_build_proto_msg     1096    1140     +44
    mac80211_hwsim_new_radio      2776    2808     +32
    tipc_mon_rcv                  1032    1058     +26
    svcauth_gss_legacy_init       1413    1429     +16
    tipc_bcbase_select_primary     379     392     +13
    nfsd4_exchange_id             1247    1260     +13
    nfsd4_setclientid_confirm      782     793     +11
    ...
    put_client_renew_locked        494     480     -14
    ip_set_sockfn_get              730     716     -14
    geneve_sock_add                829     813     -16
    nfsd4_sequence_done            721     703     -18
    nlmclnt_lookup_host            708     686     -22
    nfsd4_lockt                   1085    1063     -22
    nfs_get_client                1077    1050     -27
    tcf_bpf_init                  1106    1076     -30
    nfsd4_encode_fattr            5997    5930     -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

09 Sep, 2016

1 commit

  • In case of inter address family tunneling (IPv6 over vti4 or IPv4 over
    vti6), the inbound policy checks in vti_rcv_cb() and vti6_rcv_cb() are
    using the wrong address family. As a result, all inbound inter address
    family traffic is dropped.

    Use the xfrm_ip2inner_mode() helper, as done in xfrm_input() (i.e., also
    increment LINUX_MIB_XFRMINSTATEMODEERROR in case of error), to select the
    inner_mode that contains the right address family for the inbound policy
    checks.

    Signed-off-by: Thomas Zeitlhofer
    Signed-off-by: Steffen Klassert

    thomas.zeitlhofer+lkml@ze-it.at
     

10 Aug, 2016

1 commit

  • When executing the script included below, the netns delete operation
    hangs with the following message (repeated at 10 second intervals):

    kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

    This occurs because a reference to the lo interface in the "secure" netns
    is still held by a dst entry in the xfrm bundle cache in the init netns.

    Address this problem by garbage collecting the tunnel netns flow cache
    when a cross-namespace vti interface receives a NETDEV_DOWN notification.

    A more detailed description of the problem scenario (referencing commands
    in the script below):

    (1) ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1

    The vti_test interface is created in the init namespace. vti_tunnel_init()
    attaches a struct ip_tunnel to the vti interface's netdev_priv(dev),
    setting the tunnel net to &init_net.

    (2) ip link set vti_test netns secure

    The vti_test interface is moved to the "secure" netns. Note that
    the associated struct ip_tunnel still has tunnel->net set to &init_net.

    (3) ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1

    The first packet sent using the vti device causes xfrm_lookup() to be
    called as follows:

    dst = xfrm_lookup(tunnel->net, skb_dst(skb), fl, NULL, 0);

    Note that tunnel->net is the init namespace, while skb_dst(skb) references
    the vti_test interface in the "secure" namespace. The returned dst
    references an interface in the init namespace.

    Also note that the first parameter to xfrm_lookup() determines which flow
    cache is used to store the computed xfrm bundle, so after xfrm_lookup()
    returns there will be a cached bundle in the init namespace flow cache
    with a dst referencing a device in the "secure" namespace.

    (4) ip netns del secure

    Kernel begins to delete the "secure" namespace. At some point the
    vti_test interface is deleted, at which point dst_ifdown() changes
    the dst->dev in the cached xfrm bundle flow from vti_test to lo (still
    in the "secure" namespace however).
    Since nothing has happened to cause the init namespace's flow cache
    to be garbage collected, this dst remains attached to the flow cache,
    so the kernel loops waiting for the last reference to lo to go away.

    ip link add br1 type bridge
    ip link set dev br1 up
    ip addr add dev br1 1.1.1.1/8

    ip netns add secure
    ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
    ip link set vti_test netns secure
    ip netns exec secure ip link set vti_test up
    ip netns exec secure ip link s lo up
    ip netns exec secure ip addr add dev lo 192.168.100.1/24
    ip netns exec secure ip route add 192.168.200.0/24 dev vti_test
    ip xfrm policy flush
    ip xfrm state flush
    ip xfrm policy add dir out tmpl src 1.1.1.1 dst 1.1.1.2 \
    proto esp mode tunnel mark 1
    ip xfrm policy add dir in tmpl src 1.1.1.2 dst 1.1.1.1 \
    proto esp mode tunnel mark 1
    ip xfrm state add src 1.1.1.1 dst 1.1.1.2 proto esp spi 1 \
    mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
    ip xfrm state add src 1.1.1.2 dst 1.1.1.1 proto esp spi 1 \
    mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788

    ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1

    ip netns del secure

    Reported-by: Hangbin Liu
    Reported-by: Jan Tluka
    Signed-off-by: Lance Richardson
    Signed-off-by: David S. Miller

    Lance Richardson
     

31 Mar, 2016

1 commit

  • We currently rely on the PMTU discovery of xfrm.
    However, if a packet is sent locally, the PMTU mechanism
    of xfrm tries to do local socket notification, which
    might not work for applications like ping that don't
    check for this. So add PMTU handling to vti_xmit to
    report MTU changes immediately.

    Reported-by: Mark McKinstry
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     

26 Dec, 2015

1 commit


01 Dec, 2015

1 commit


08 Oct, 2015

1 commit


18 Sep, 2015

1 commit


28 May, 2015

2 commits

  • The vti6_rcv_cb and vti_rcv_cb calls were leaving the skb->mark modified
    after completing the function. This resulted in the original skb->mark
    value being lost. Since we only need skb->mark to be set for
    xfrm_policy_check we can pull the assignment into the rcv_cb calls and then
    just restore the original mark after xfrm_policy_check has been completed.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Steffen Klassert

    Alexander Duyck
     
  • Instead of modifying skb->mark we can simply modify the flowi_mark that is
    generated as a result of the xfrm_decode_session. By doing this we don't
    need to actually touch the skb->mark and it can be preserved as it passes
    out through the tunnel.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Steffen Klassert

    Alexander Duyck