06 Oct, 2020

1 commit


08 Aug, 2020

1 commit


31 Jul, 2020

1 commit

  • Steffen Klassert says:

    ====================
    pull request (net-next): ipsec-next 2020-07-30

    Please note that, for the first time, I did --no-ff merges
    of my testing branch into the master branch to include
    the [PATCH 0/n] message of a patchset. Please let me
    know if this is desirable, or if I should do it
    differently.

    1) Introduce an oseq-may-wrap flag to disable anti-replay
    protection for manually distributed ICVs as suggested
    in RFC 4303. From Petr Vaněk.

    2) Patchset to fully support IPCOMP for vti4, vti6 and
    xfrm interfaces. From Xin Long.

    3) Switch from a linear list to a hash list for xfrm interface
    lookups. From Eyal Birger.

    4) Fixes to not register one xfrm(6)_tunnel object twice.
    From Xin Long.

    5) Fix two compile errors that were introduced with the
    IPCOMP support for vti and xfrm interfaces.
    Also from Xin Long.

    6) Make the policy hold queue work with VTI. This was
    forgotten when VTI was implemented.

    Please pull or let me know if there are problems.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

21 Jul, 2020

1 commit


14 Jul, 2020

1 commit

  • An xfrm_tunnel object is linked into the list when registering,
    so vti_ipip_handler cannot be registered twice; otherwise its
    next pointer would be overwritten by the second registration.

    This patch therefore defines a new xfrm_tunnel object to register
    for AF_INET6.

    Fixes: e6ce64570f24 ("ip_vti: support IPIP6 tunnel processing")
    Signed-off-by: Xin Long
    Signed-off-by: Steffen Klassert

    Xin Long
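The failure mode described in this commit can be illustrated with a small model. This is a Python sketch of an intrusive singly linked list, not kernel code: `Handler`, `HandlerList`, and `demo` are hypothetical names, and head insertion is an assumption made to keep the example short.

```python
class Handler:
    """A handler that carries its own list link, like an intrusive list node."""
    def __init__(self, name):
        self.name = name
        self.next = None  # the link lives inside the object itself


class HandlerList:
    """One list per address family (assumed layout for this sketch)."""
    def __init__(self):
        self.head = None

    def register(self, handler):
        # Head insertion: this overwrites handler.next unconditionally.
        handler.next = self.head
        self.head = handler

    def names(self):
        out, node = [], self.head
        while node is not None:
            out.append(node.name)
            node = node.next
        return out


def demo(share_object):
    inet, inet6 = HandlerList(), HandlerList()
    inet.register(Handler("ipip"))
    vti = Handler("vti")
    inet.register(vti)
    # Registering the same object again rewrites vti.next and
    # silently cuts "ipip" out of the first list.
    inet6.register(vti if share_object else Handler("vti6"))
    return inet.names(), inet6.names()
```

With a shared object the first list loses its tail; with a separate object per family both lists stay intact, which is the shape of the fix described in the commit message.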
     

09 Jul, 2020

2 commits

  • For IPIP6 tunnel processing, the functions called are the
    same as those for IPIP tunnels, so reuse the handler and
    register it with family == AF_INET6.

    Signed-off-by: Xin Long
    Signed-off-by: Steffen Klassert

    Xin Long
     
  • With tunnel4_input_afinfo added, IPIP tunnel processing in
    ip_vti can be easily done with .cb_handler. So replace the
    processing by calling ip_tunnel_rcv() with it.

    v1->v2:
    - no change.
    v2->v3:
    - enable it only when CONFIG_INET_XFRM_TUNNEL is defined, to fix
    the build error, reported by kbuild test robot.

    Signed-off-by: Xin Long
    Signed-off-by: Steffen Klassert

    Xin Long
     

01 Jul, 2020

1 commit

  • Vti uses skb->protocol to determine packet type, and bails out if it's
    not set. For AF_PACKET injection, we need to support its call chain of:

    packet_sendmsg -> packet_snd -> packet_parse_headers ->
    dev_parse_header_protocol -> parse_protocol

    Without a valid parse_protocol, this returns zero, and vti rejects the
    skb. So, this wires up the ip_tunnel handler for layer 3 packets for
    that case.

    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: David S. Miller

    Jason A. Donenfeld
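To make the role of a parse_protocol callback concrete, here is a minimal Python sketch of how a layer 3 device could derive the protocol from the packet itself. It only models the idea from the commit message; `parse_l3_protocol` is a hypothetical name, and the minimum-length checks are assumptions rather than a copy of the kernel helper.

```python
ETH_P_IP = 0x0800    # EtherType for IPv4
ETH_P_IPV6 = 0x86DD  # EtherType for IPv6


def parse_l3_protocol(packet: bytes) -> int:
    """Return the EtherType implied by the IP version nibble, or 0 if unknown."""
    if not packet:
        return 0
    version = packet[0] >> 4
    # Require at least a full fixed header before trusting the version field.
    if version == 4 and len(packet) >= 20:
        return ETH_P_IP
    if version == 6 and len(packet) >= 40:
        return ETH_P_IPV6
    return 0
```

A return value of 0 corresponds to the case in which the packet is rejected because no protocol could be determined.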
     

01 Jun, 2020

1 commit

  • xdp_umem.c had overlapping changes between the 64-bit math fix
    for the calculation of npgs and the removal of the zerocopy
    memory type which got rid of the chunk_size_nohdr member.

    The mlx5 Kconfig conflict is a case where we just take the
    net-next copy of the Kconfig entry dependency as it takes on
    the ESWITCH dependency by one level of indirection which is
    what the 'net' conflicting change is trying to ensure.

    Signed-off-by: David S. Miller

    David S. Miller
     

20 May, 2020

1 commit

  • This method is used to properly allow kernel callers of the IPv4 route
    management ioctls. The existing ip_tunnel_ioctl helper is renamed to
    ip_tunnel_ctl to better reflect that it doesn't directly implement ioctls
    touching user memory, and is used for the guts of ndo_tunnel_ctl
    implementations. A new ip_tunnel_ioctl helper is added that can be wired
    up directly to the ndo_do_ioctl method and takes care of the copy to and
    from userspace.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
     

23 Apr, 2020

1 commit

  • Commit dd9ee3444014 ("vti4: Fix a ipip packet processing bug in
    'IPCOMP' virtual tunnel") tries to receive IPIP packets in vti
    by calling xfrm_input(). This case happens when a small packet or
    fragment sent by the peer is too small to be compressed.

    However, xfrm_input() will still get to the IPCOMP path where skb
    sec_path is set, but never dropped while it should have been done
    in vti_ipcomp4_protocol.cb_handler(vti_rcv_cb), as it's not an
    ipcomp4 packet. As a result, the packet can never pass
    xfrm4_policy_check() in the upper protocol rcv functions.

    So this patch calls ip_tunnel_rcv() to process IPIP packets
    instead.

    Fixes: dd9ee3444014 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
    Reported-by: Xiumei Mu
    Signed-off-by: Xin Long
    Signed-off-by: Steffen Klassert

    Xin Long
     

06 Feb, 2020

1 commit


14 Jan, 2020

1 commit

  • With an ebpf program that redirects packets through a vti[6] interface,
    the packets are dropped because no dst is attached.

    This could also be reproduced with an AF_PACKET socket, with the following
    python script (vti1 is an ip_vti interface):

    import socket
    send_s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, 0)
    # scapy
    # p = IP(src='10.100.0.2', dst='10.200.0.1')/ICMP(type='echo-request')
    # raw(p)
    req = b'E\x00\x00\x1c\x00\x01\x00\x00@\x01e\xb2\nd\x00\x02\n\xc8\x00\x01\x08\x00\xf7\xff\x00\x00\x00\x00'
    send_s.sendto(req, ('vti1', 0x800, 0, 0))

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: Steffen Klassert

    Nicolas Dichtel
     

25 Dec, 2019

1 commit

  • When an IPv6 tunnel PMTU update ends up calling __ip6_rt_update_pmtu(),
    we should not call dst_confirm_neigh(), as there is no two-way communication.

    Although vti and vti6 are immune to this problem because they are IFF_NOARP
    interfaces, as Guillaume pointed out, there is still no sense in confirming
    the neighbour here.

    v5: Update commit description.
    v4: No change.
    v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
    v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

    Reviewed-by: Guillaume Nault
    Acked-by: David Ahern
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     

10 Dec, 2019

1 commit

  • Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
    at places where these are defined. Later patches will remove the unused
    definition of FIELD_SIZEOF().

    This patch is generated using the following script:

    EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

    git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
    do

        if [[ "$file" =~ $EXCLUDE_FILES ]]; then
            continue
        fi
        sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
    done

    Signed-off-by: Pankaj Bharadiya
    Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
    Co-developed-by: Kees Cook
    Signed-off-by: Kees Cook
    Acked-by: David Miller # for net

    Pankaj Bharadiya
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

03 May, 2019

1 commit


08 Apr, 2019

3 commits

  • This structure is now only 4 bytes, so it's more efficient
    to cache a copy rather than its address.

    No significant size difference in allmodconfig vmlinux.

    With non-modular kernel that has all XFRM options enabled, this
    series reduces vmlinux image size by ~11kb. All xfrm_mode
    indirections are gone and all modes are built-in.

    before (ipsec-next master):
        text    data      bss      dec filename
    21071494 7233140 11104324 39408958 vmlinux.master

    after this series:
    21066448 7226772 11104324 39397544 vmlinux.patched

    With an allmodconfig kernel, the size increase is only 362 bytes,
    even though all the xfrm config options removed in this series are
    modular.

    before:
        text    data     bss      dec filename
    15731286 6936912 4046908 26715106 vmlinux.master

    after this series:
    15731492 6937068 4046908 26715468 vmlinux

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
     
  • After previous changes, xfrm_mode contains no function pointers anymore,
    and all modules defining such a struct contain no code except init/exit
    functions to register the xfrm_mode struct with the xfrm core.

    Just place the xfrm modes in the core and remove the modules;
    the run-time xfrm_mode register/unregister functionality is removed.

    Before:

        text    data     bss      dec filename
        7523     200    2364    10087 net/xfrm/xfrm_input.o
       40003     628     440    41071 net/xfrm/xfrm_state.o
    15730338 6937080 4046908 26714326 vmlinux

    After:

        7389     200    2364     9953 net/xfrm/xfrm_input.o
       40574     656     440    41670 net/xfrm/xfrm_state.o
    15730084 6937068 4046908 26714060 vmlinux

    The xfrm*_mode_{transport,tunnel,beet} modules are gone.

    v2: replace CONFIG_INET6_XFRM_MODE_* IS_ENABLED guards with CONFIG_IPV6
    ones rather than removing them.

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
     
  • Now that we have the family available directly in the
    xfrm_mode struct, we can use that and avoid one extra dereference.

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
     

24 Mar, 2019

1 commit

  • The ipip tunnel introduced in commit dd9ee3444014 ("vti4: Fix a ipip
    packet processing bug in 'IPCOMP' virtual tunnel") largely duplicated
    the existing vti_input and vti_recv functions. Refactored to
    deduplicate the common code.

    Signed-off-by: Jeremy Sowden
    Signed-off-by: Steffen Klassert

    Jeremy Sowden
     

21 Mar, 2019

2 commits

  • Removed info log-message if ipip tunnel registration fails during
    module-initialization: it adds nothing to the error message that is
    written on all failures.

    Fixes: dd9ee3444014e ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
    Signed-off-by: Jeremy Sowden
    Signed-off-by: Steffen Klassert

    Jeremy Sowden
     
  • If tunnel registration failed during module initialization, the module
    would fail to deregister the IPPROTO_COMP protocol and would attempt to
    deregister the tunnel.

    The tunnel was not deregistered during module-exit.

    Fixes: dd9ee3444014e ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
    Signed-off-by: Jeremy Sowden
    Signed-off-by: Steffen Klassert

    Jeremy Sowden
     

09 Jan, 2019

1 commit

  • Recently we ran a network test over an ipcomp virtual tunnel. We found
    that if an ipv4 packet needs fragmentation, the peer can't receive
    it.

    We dug into the code and found that when a packet needs fragmentation, the
    smaller fragment is encapsulated by ipip, not ipcomp. So when the ipip packet
    goes into xfrm, its skb->dev is not properly set. The ipv4 reassembly code
    always sets the skb's dev to the last fragment's dev. After ipv4 defrag
    processing, when the kernel rp_filter parameter is set, the skb will be
    dropped with an -EXDEV error.

    This patch adds compatible support for ipip processing in the ipcomp virtual tunnel.

    Signed-off-by: Su Yanjun
    Signed-off-by: Steffen Klassert

    Su Yanjun
     

02 Jan, 2019

1 commit

  • KMSAN detected read beyond end of buffer in vti and sit devices when
    passing truncated packets with PF_PACKET. The issue affects additional
    ip tunnel devices.

    Extend commit 76c0ddd8c3a6 ("ip6_tunnel: be careful when accessing the
    inner header") and commit ccfec9e5cb2d ("ip_tunnel: be careful when
    accessing the inner header").

    Move the check to a separate helper and call at the start of each
    ndo_start_xmit function in net/ipv4 and net/ipv6.

    Minor changes:
    - convert dev_kfree_skb to kfree_skb on error path,
    as dev_kfree_skb calls consume_skb which is not for error paths.
    - use pskb_network_may_pull even though that is pedantic here,
    as it is the same as pskb_may_pull for devices without llheaders.
    - do not cache ipv6 hdrs if used only once
    (unsafe across pskb_may_pull, was more relevant to earlier patch)

    Reported-by: syzbot
    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

27 Sep, 2018

2 commits


20 Aug, 2018

1 commit

  • After setting fb_tunnels_only_for_init_net to 1, itn->fb_tunnel_dev will
    be NULL, which causes the following crash:

    [ 2742.849298] BUG: unable to handle kernel NULL pointer dereference at 0000000000000941
    [ 2742.851380] PGD 800000042c21a067 P4D 800000042c21a067 PUD 42aaed067 PMD 0
    [ 2742.852818] Oops: 0002 [#1] SMP PTI
    [ 2742.853570] CPU: 7 PID: 2484 Comm: unshare Kdump: loaded Not tainted 4.18.0-rc8+ #2
    [ 2742.855163] Hardware name: Fedora Project OpenStack Nova, BIOS seabios-1.7.5-11.el7 04/01/2014
    [ 2742.856970] RIP: 0010:vti_init_net+0x3a/0x50 [ip_vti]
    [ 2742.858034] Code: 90 83 c0 48 c7 c2 20 a1 83 c0 48 89 fb e8 6e 3b f6 ff 85 c0 75 22 8b 0d f4 19 00 00 48 8b 93 00 14 00 00 48 8b 14 ca 48 8b 12 82 41 09 00 00 04 c6 82 38 09 00 00 45 5b c3 66 0f 1f 44 00 00
    [ 2742.861940] RSP: 0018:ffff9be28207fde0 EFLAGS: 00010246
    [ 2742.863044] RAX: 0000000000000000 RBX: ffff8a71ebed4980 RCX: 0000000000000013
    [ 2742.864540] RDX: 0000000000000000 RSI: 0000000000000013 RDI: ffff8a71ebed4980
    [ 2742.866020] RBP: ffff8a71ea717000 R08: ffffffffc083903c R09: ffff8a71ea717000
    [ 2742.867505] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a71ebed4980
    [ 2742.868987] R13: 0000000000000013 R14: ffff8a71ea5b49c0 R15: 0000000000000000
    [ 2742.870473] FS: 00007f02266c9740(0000) GS:ffff8a71ffdc0000(0000) knlGS:0000000000000000
    [ 2742.872143] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 2742.873340] CR2: 0000000000000941 CR3: 000000042bc20006 CR4: 00000000001606e0
    [ 2742.874821] Call Trace:
    [ 2742.875358] ops_init+0x38/0xf0
    [ 2742.876078] setup_net+0xd9/0x1f0
    [ 2742.876789] copy_net_ns+0xb7/0x130
    [ 2742.877538] create_new_namespaces+0x11a/0x1d0
    [ 2742.878525] unshare_nsproxy_namespaces+0x55/0xa0
    [ 2742.879526] ksys_unshare+0x1a7/0x330
    [ 2742.880313] __x64_sys_unshare+0xe/0x20
    [ 2742.881131] do_syscall_64+0x5b/0x180
    [ 2742.881933] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Reproduce:
    echo 1 > /proc/sys/net/core/fb_tunnels_only_for_init_net
    modprobe ip_vti
    unshare -n

    Fixes: 79134e6ce2c9 ("net: do not create fallback tunnels for non-default namespaces")
    Cc: Eric Dumazet
    Signed-off-by: Haishuang Yan
    Signed-off-by: David S. Miller

    Haishuang Yan
     

19 Mar, 2018

2 commits

  • Don't hardcode a MTU value on vti tunnel initialization,
    ip_tunnel_newlink() is able to deal with this already. See also
    commit ffc2b6ee4174 ("ip_gre: fix IFLA_MTU ignored on NEWLINK").

    Fixes: 1181412c1a67 ("net/ipv4: VTI support new module for ip_vti.")
    Signed-off-by: Stefano Brivio
    Acked-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Stefano Brivio
     
  • This re-introduces the effect of commit a32452366b72 ("vti4:
    Don't count header length twice.") which was accidentally
    reverted by merge commit f895f0cfbb77 ("Merge branch 'master' of
    git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec").

    The commit message from Steffen Klassert said:

    We currently count the size of LL_MAX_HEADER and struct iphdr
    twice for vti4 devices, this leads to a wrong device mtu.
    The size of LL_MAX_HEADER and struct iphdr is already counted in
    ip_tunnel_bind_dev(), so don't do it again in vti_tunnel_init().

    And this is still the case now: ip_tunnel_bind_dev() already
    accounts for the header length of the link layer (not
    necessarily LL_MAX_HEADER, if the output device is found), plus
    one IP header.

    For example, with a vti device on top of veth, with MTU of 1500,
    the existing implementation would set the initial vti MTU to
    1332, accounting once for LL_MAX_HEADER (128, included in
    hard_header_len by vti) and twice for the same IP header (once
    from hard_header_len, once from ip_tunnel_bind_dev()).

    It should instead be 1480, because ip_tunnel_bind_dev() is able
    to figure out that the output device is veth, so no additional
    link layer header is attached, and will properly count one
    single IP header.

    The existing issue had the side effect of avoiding PMTUD for
    most xfrm policies, by arbitrarily lowering the initial MTU.
    However, the only way to get a consistent PMTU value is to let
    the xfrm PMTU discovery do its course, and commit d6af1a31cc72
    ("vti: Add pmtu handling to vti_xmit.") now takes care of local
    delivery cases where the application ignores local socket
    notifications.

    Fixes: b9959fd3b0fa ("vti: switch to new ip tunnel code")
    Fixes: f895f0cfbb77 ("Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec")
    Signed-off-by: Stefano Brivio
    Acked-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Stefano Brivio
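The MTU figures quoted in this commit message can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the message (link MTU 1500, LL_MAX_HEADER 128) plus a 20-byte IPv4 header without options; the function names are hypothetical and exist only for this illustration.

```python
IPV4_HEADER_LEN = 20  # IPv4 header without options
LL_MAX_HEADER = 128   # value quoted in the commit message


def mtu_double_counted(link_mtu):
    """Old behaviour: hard_header_len already holds LL_MAX_HEADER plus one
    IP header, and a second IP header is subtracted on top of that."""
    hard_header_len = LL_MAX_HEADER + IPV4_HEADER_LEN
    return link_mtu - hard_header_len - IPV4_HEADER_LEN


def mtu_counted_once(link_mtu, link_layer_overhead=0):
    """Intended behaviour: the real link layer overhead (0 for the veth
    case in the message) plus exactly one IP header."""
    return link_mtu - link_layer_overhead - IPV4_HEADER_LEN
```

For a link MTU of 1500 the first function gives 1332 and the second gives 1480, matching the two values in the commit message.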
     

26 Jan, 2018

1 commit

  • Some dst_ops (e.g. md_dst_ops) don't set this handler. This may result in:
    "BUG: unable to handle kernel NULL pointer dereference at (null)"

    Let's add a helper to check if update_pmtu is available before calling it.

    Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
    Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
    CC: Roman Kapl
    CC: Xin Long
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
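The guard described in this commit is a plain "check the optional callback before calling it" helper. The following Python sketch shows the pattern only; `DstOps` and `dst_update_pmtu` are hypothetical stand-ins and do not reproduce the kernel helper's name or signature.

```python
class DstOps:
    """Operations table whose update_pmtu callback may be left unset."""
    def __init__(self, update_pmtu=None):
        self.update_pmtu = update_pmtu


def dst_update_pmtu(ops, dst, mtu):
    """Call ops.update_pmtu only if it is set; return True if it was called."""
    if ops is None or ops.update_pmtu is None:
        return False  # nothing to call; avoids the NULL dereference case
    ops.update_pmtu(dst, mtu)
    return True
```

Callers then go through the helper instead of invoking the callback directly, so an operations table without the handler is skipped rather than dereferenced.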
     

01 Nov, 2017

1 commit


06 Oct, 2017

1 commit


27 Sep, 2017

1 commit

  • When running LTP IPsec tests, KASan might report:

    BUG: KASAN: use-after-free in vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
    Read of size 4 at addr ffff880dc6ad1980 by task swapper/0/0
    ...
    Call Trace:

    dump_stack+0x63/0x89
    print_address_description+0x7c/0x290
    kasan_report+0x28d/0x370
    ? vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
    __asan_report_load4_noabort+0x19/0x20
    vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
    ? vti_init_net+0x190/0x190 [ip_vti]
    ? save_stack_trace+0x1b/0x20
    ? save_stack+0x46/0xd0
    dev_hard_start_xmit+0x147/0x510
    ? icmp_echo.part.24+0x1f0/0x210
    __dev_queue_xmit+0x1394/0x1c60
    ...
    Freed by task 0:
    save_stack_trace+0x1b/0x20
    save_stack+0x46/0xd0
    kasan_slab_free+0x70/0xc0
    kmem_cache_free+0x81/0x1e0
    kfree_skbmem+0xb1/0xe0
    kfree_skb+0x75/0x170
    kfree_skb_list+0x3e/0x60
    __dev_queue_xmit+0x1298/0x1c60
    dev_queue_xmit+0x10/0x20
    neigh_resolve_output+0x3a8/0x740
    ip_finish_output2+0x5c0/0xe70
    ip_finish_output+0x4ba/0x680
    ip_output+0x1c1/0x3a0
    xfrm_output_resume+0xc65/0x13d0
    xfrm_output+0x1e4/0x380
    xfrm4_output_finish+0x5c/0x70

    This can be fixed by reading skb->len before calling dst_output().

    Fixes: b9959fd3b0fa ("vti: switch to new ip tunnel code")
    Fixes: 22e1b23dafa8 ("vti6: Support inter address family tunneling.")
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller

    Alexey Kodanev
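The ordering issue behind this fix can be modelled outside the kernel. In the Python sketch below, `Packet`, `output`, `xmit_buggy`, and `xmit_fixed` are hypothetical names; `output` stands in for a call that takes ownership of the packet and may free it, which is the assumption the example rests on.

```python
class Packet:
    """A buffer that becomes invalid once it has been released."""
    def __init__(self, payload):
        self._payload = payload

    @property
    def length(self):
        if self._payload is None:
            raise RuntimeError("packet used after it was handed off")
        return len(self._payload)

    def release(self):
        self._payload = None


def output(packet):
    """Stand-in for the output path: consumes the packet, returns 0 on success."""
    packet.release()
    return 0


def xmit_buggy(packet, stats):
    err = output(packet)
    if err == 0:
        stats["tx_bytes"] += packet.length  # reads the packet after hand-off


def xmit_fixed(packet, stats):
    pkt_len = packet.length  # save the length while the packet is still ours
    err = output(packet)
    if err == 0:
        stats["tx_bytes"] += pkt_len
```

In this model the buggy variant raises as soon as it touches the released packet, while the fixed variant only uses the value saved beforehand.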
     

20 Sep, 2017

1 commit

  • Implement exit_batch() method to dismantle more devices
    per round.

    (rtnl_lock() ...
    unregister_netdevice_many() ...
    rtnl_unlock())

    Tested:
    $ cat add_del_unshare.sh
    for i in `seq 1 40`
    do
    (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) &
    done
    wait ; grep net_namespace /proc/slabinfo

    Before patch :
    $ time ./add_del_unshare.sh
    net_namespace 126 282 5504 1 2 : tunables 8 4 0 : slabdata 126 282 0

    real 1m38.965s
    user 0m0.688s
    sys 0m37.017s

    After patch:
    $ time ./add_del_unshare.sh
    net_namespace 135 291 5504 1 2 : tunables 8 4 0 : slabdata 135 291 0

    real 0m22.117s
    user 0m0.728s
    sys 0m35.328s

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
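The benefit of a batched exit method comes from taking the lock once per batch instead of once per namespace. The Python sketch below only counts lock acquisitions to show that difference; `CountingLock`, `exit_each`, and `exit_batch` are hypothetical names, and the lists of device names stand in for real devices.

```python
class CountingLock:
    """Context manager that records how often it is taken."""
    def __init__(self):
        self.acquisitions = 0

    def __enter__(self):
        self.acquisitions += 1
        return self

    def __exit__(self, *exc):
        return False


def exit_each(namespaces, lock):
    """One lock round trip per namespace."""
    removed = []
    for devices in namespaces:
        with lock:
            removed.extend(devices)
    return removed


def exit_batch(namespaces, lock):
    """Collect devices from every namespace, then remove them in one round."""
    with lock:
        to_remove = []
        for devices in namespaces:
            to_remove.extend(devices)
        removed = list(to_remove)  # stand-in for a single bulk unregister
    return removed
```

Both functions remove the same devices; only the number of lock acquisitions differs, which is the effect the timing figures in the commit message reflect.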
     

19 Jul, 2017

1 commit


27 Jun, 2017

3 commits


09 May, 2017

1 commit