08 Dec, 2018

1 commit

  • commit 000ade8016400d93b4d7c89970d96b8c14773d45 upstream.

    By passing a limit of 2 bytes to strncat, strncat is limited to writing
    fewer bytes than what it's supposed to append to the name here.

    Since the bounds are checked on the line above this, just remove the string
    bounds checks entirely since they're unneeded.
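
    As a rough userspace illustration of why an explicit length check makes
    the bounded append unnecessary, the sketch below builds a "<kind>%d"
    template. The helper name build_default_name and the exact form of the
    check are assumptions for illustration, not the kernel's code.

```c
#include <assert.h>
#include <string.h>

#define IFNAMSIZ 16 /* interface name buffer size, including the NUL */

/* Hypothetical helper: writes "<kind>%d" into name.
 * Returns 0 on success, -1 if kind leaves no room for "%d" and the NUL. */
static int build_default_name(char name[IFNAMSIZ], const char *kind)
{
    assert(name != NULL && kind != NULL);

    /* Explicit bounds check: kind + "%d" + '\0' must fit in IFNAMSIZ. */
    if (strlen(kind) > IFNAMSIZ - 3)
        return -1;

    /* With the check above, the plain copies cannot overflow. */
    strcpy(name, kind);
    strcat(name, "%d");
    return 0;
}
```

    Once the length has been validated, a limit on the append adds nothing
    and can only truncate the suffix if it is chosen too small.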

    Signed-off-by: Sultan Alsawaf
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Sultan Alsawaf
     

18 Oct, 2018

1 commit

  • [ Upstream commit ccfec9e5cb2d48df5a955b7bf47f7782157d3bc2 ]

    Cong noted that we need the same checks introduced by commit 76c0ddd8c3a6
    ("ip6_tunnel: be careful when accessing the inner header")
    even for ipv4 tunnels.

    Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
    Suggested-by: Cong Wang
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     

30 May, 2018

1 commit

  • [ Upstream commit 24fc79798b8ddfd46f2dd363a8d29072c083b977 ]

    Otherwise, it's possible to specify invalid MTU values directly
    on creation of a link (via 'ip link add'). This is already
    prevented on subsequent MTU changes by commit b96f9afee4eb
    ("ipv4/6: use core net MTU range checking").
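
    A minimal sketch of bounding a requested MTU at link creation follows.
    It assumes the limits listed for ipv4/ip_tunnel elsewhere in this log
    (minimum 68, maximum 0xFFF8 minus header overhead) and assumes that
    out-of-range values are clamped; the message above does not say whether
    they are clamped or rejected. All names are hypothetical.

```c
#include <assert.h>

#define TUNNEL_MIN_MTU 68u      /* assumed lower bound */
#define TUNNEL_MAX_LEN 0xFFF8u  /* assumed upper bound before overhead */

/* Hypothetical helper: returns the MTU to use for a new tunnel link. */
static unsigned int tunnel_clamp_mtu(unsigned int requested,
                                     unsigned int hard_header_len,
                                     unsigned int t_hlen)
{
    unsigned int overhead = hard_header_len + t_hlen;

    /* The overhead must leave room for at least the minimum MTU. */
    assert(overhead <= TUNNEL_MAX_LEN - TUNNEL_MIN_MTU);

    unsigned int max_mtu = TUNNEL_MAX_LEN - overhead;

    if (requested < TUNNEL_MIN_MTU)
        return TUNNEL_MIN_MTU;
    if (requested > max_mtu)
        return max_mtu;
    return requested;
}
```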

    Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
    Signed-off-by: Stefano Brivio
    Acked-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Stefano Brivio
     

09 May, 2018

1 commit

  • commit f15ca723c1ebe6c1a06bc95fda6b62cd87b44559 upstream.

    Some dst_ops (e.g. md_dst_ops) don't set this handler. This may result in:
    "BUG: unable to handle kernel NULL pointer dereference at (null)"

    Let's add a helper to check if update_pmtu is available before calling it.
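
    The idea of such a helper can be sketched in userspace as below. The
    struct layouts and the name dst_update_pmtu_checked are assumptions made
    for illustration; only the pattern (test the handler before calling it)
    comes from the message above.

```c
#include <assert.h>
#include <stddef.h>

struct dst_entry_sketch;

/* Hypothetical operations table; update_pmtu may legitimately be NULL. */
struct dst_ops_sketch {
    void (*update_pmtu)(struct dst_entry_sketch *dst, unsigned int mtu);
};

struct dst_entry_sketch {
    const struct dst_ops_sketch *ops;
    unsigned int pmtu;
};

/* Calls update_pmtu only if the handler is set.
 * Returns 1 if the handler ran, 0 if it was absent. */
static int dst_update_pmtu_checked(struct dst_entry_sketch *dst,
                                   unsigned int mtu)
{
    assert(dst != NULL && dst->ops != NULL);

    if (!dst->ops->update_pmtu)
        return 0;

    dst->ops->update_pmtu(dst, mtu);
    return 1;
}

/* Example handler that simply records the new path MTU. */
static void store_pmtu(struct dst_entry_sketch *dst, unsigned int mtu)
{
    dst->pmtu = mtu;
}
```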

    Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
    Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
    CC: Roman Kapl
    CC: Xin Long
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller
    Cc: Thomas Deutschmann
    Cc: Eddie Chapman
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Dichtel
     

12 Apr, 2018

1 commit

  • [ Upstream commit 9cb726a212a82c88c98aa9f0037fd04777cd8fe5 ]

    Use dev_valid_name() to make sure user does not provide illegal
    device name.

    syzbot caught the following bug :

    BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline]
    BUG: KASAN: stack-out-of-bounds in __ip_tunnel_create+0xca/0x6b0 net/ipv4/ip_tunnel.c:257
    Write of size 20 at addr ffff8801ac79f810 by task syzkaller268107/4482

    CPU: 0 PID: 4482 Comm: syzkaller268107 Not tainted 4.16.0+ #1
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x1b9/0x29f lib/dump_stack.c:53
    print_address_description+0x6c/0x20b mm/kasan/report.c:256
    kasan_report_error mm/kasan/report.c:354 [inline]
    kasan_report.cold.7+0xac/0x2f5 mm/kasan/report.c:412
    check_memory_region_inline mm/kasan/kasan.c:260 [inline]
    check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
    memcpy+0x37/0x50 mm/kasan/kasan.c:303
    strlcpy include/linux/string.h:300 [inline]
    __ip_tunnel_create+0xca/0x6b0 net/ipv4/ip_tunnel.c:257
    ip_tunnel_create net/ipv4/ip_tunnel.c:352 [inline]
    ip_tunnel_ioctl+0x818/0xd40 net/ipv4/ip_tunnel.c:861
    ipip_tunnel_ioctl+0x1c5/0x420 net/ipv4/ipip.c:350
    dev_ifsioc+0x43e/0xb90 net/core/dev_ioctl.c:334
    dev_ioctl+0x69a/0xcc0 net/core/dev_ioctl.c:525
    sock_ioctl+0x47e/0x680 net/socket.c:1015
    vfs_ioctl fs/ioctl.c:46 [inline]
    file_ioctl fs/ioctl.c:500 [inline]
    do_vfs_ioctl+0x1cf/0x1650 fs/ioctl.c:684
    ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
    SYSC_ioctl fs/ioctl.c:708 [inline]
    SyS_ioctl+0x24/0x30 fs/ioctl.c:706
    do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x42/0xb7
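
    The kind of validation that rules out such an overlong name can be
    approximated in userspace as follows. The function name_is_valid and its
    exact rules are assumptions modeled on a typical device-name check; this
    is not the kernel's dev_valid_name() itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ 16 /* interface name buffer size, including the NUL */

/* Hypothetical validator: true if name is non-empty, NUL-terminated within
 * IFNAMSIZ bytes, not "." or "..", and free of '/', ':' and whitespace. */
static bool name_is_valid(const char *name)
{
    assert(name != NULL);

    /* Bounded scan: never read past IFNAMSIZ bytes looking for the NUL. */
    size_t len = 0;
    while (len < IFNAMSIZ && name[len] != '\0')
        len++;
    if (len == 0 || len == IFNAMSIZ)
        return false;

    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return false;

    for (const char *p = name; *p != '\0'; p++) {
        if (*p == '/' || *p == ':' || isspace((unsigned char)*p))
            return false;
    }
    return true;
}
```

    Rejecting a name before it is copied into a fixed-size buffer keeps the
    later copy within bounds.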

    Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

03 Jan, 2018

1 commit

  • [ Upstream commit b5476022bbada3764609368f03329ca287528dc8 ]

    The IPv4 stack reacts to an MTU change to a small value by disabling
    itself under RTNL.

    But there is a window where threads not using RTNL can see a wrong
    device mtu. This can lead to surprises in igmp code, where it is
    assumed the mtu is suitable.

    Fix this by reading device mtu once and checking IPv4 minimal MTU.

    This patch adds missing IPV4_MIN_MTU define, to not abuse
    ETH_MIN_MTU anymore.
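
    The "read once, then check" pattern can be sketched in userspace with
    C11 atomics. The struct, the function name and the use of a relaxed
    atomic load are assumptions for illustration; IPV4_MIN_MTU is taken to
    be 68, the minimum datagram size every IPv4 host must accept.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define IPV4_MIN_MTU 68u

/* Hypothetical device model whose mtu may change concurrently. */
struct netdev_mtu_sketch {
    _Atomic unsigned int mtu;
};

/* Takes a single snapshot of the mtu and validates that snapshot.
 * On success the same value that was checked is handed to the caller. */
static bool ipv4_mtu_snapshot(struct netdev_mtu_sketch *dev,
                              unsigned int *mtu_out)
{
    assert(dev != NULL && mtu_out != NULL);

    unsigned int mtu = atomic_load_explicit(&dev->mtu, memory_order_relaxed);

    if (mtu < IPV4_MIN_MTU)
        return false;

    *mtu_out = mtu;
    return true;
}
```

    Because the caller only ever uses the snapshot, a concurrent change to
    the device mtu cannot invalidate the check after it has passed.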

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

13 Sep, 2017

1 commit

  • In collect_md mode, if the tun dev is down, it can still call
    ip_tunnel_rcv to receive packets, and the rx statistics increase
    improperly.

    When the md tunnel is down, it's not necessary to increase RX drops
    for the tunnel device; packets would be received on the fallback tunnel,
    and the RX drops on the fallback device will be increased as expected.

    Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
    Cc: Pravin B Shelar
    Signed-off-by: Haishuang Yan
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Haishuang Yan
     

09 Sep, 2017

1 commit


17 Jun, 2017

1 commit

  • When ip_tunnel_rcv fails, the tun_dst won't be freed, so call
    dst_release to free it in error code path.
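
    The shape of such an error-path release can be sketched as follows. The
    reference-counted struct and both function names are hypothetical; the
    sketch only shows that the drop path gives up the reference the receive
    function was handed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted metadata object. */
struct tun_dst_sketch {
    int refcnt;
};

/* Drops one reference; a NULL pointer is accepted and ignored. */
static void tun_dst_release(struct tun_dst_sketch *d)
{
    if (d)
        d->refcnt--;
}

/* Returns 0 when the packet is accepted (the reference stays with the
 * packet) and -1 when it is dropped (the reference is released here). */
static int tunnel_rcv_sketch(int packet_ok, struct tun_dst_sketch *tun_dst)
{
    assert(tun_dst == NULL || tun_dst->refcnt > 0);

    if (!packet_ok)
        goto drop;

    return 0;

drop:
    tun_dst_release(tun_dst);
    return -1;
}
```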

    Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
    Acked-by: Eric Dumazet
    Acked-by: Pravin B Shelar
    Tested-by: Zhang Shengju
    Signed-off-by: Haishuang Yan
    Signed-off-by: David S. Miller

    Haishuang Yan
     

08 Jun, 2017

1 commit

  • Network devices can allocate resources and private memory using
    netdev_ops->ndo_init(). However, the release of these resources
    can occur in one of two different places.

    Either netdev_ops->ndo_uninit() or netdev->destructor().

    The decision of which operation frees the resources depends upon
    whether it is necessary for all netdev refs to be released before it
    is safe to perform the freeing.

    netdev_ops->ndo_uninit() presumably can occur right after the
    NETDEV_UNREGISTER notifier completes and the unicast and multicast
    address lists are flushed.

    netdev->destructor(), on the other hand, does not run until the
    netdev references all go away.

    Further complicating the situation is that netdev->destructor()
    almost universally also does a free_netdev().

    This creates a problem for the logic in register_netdevice().
    Because all callers of register_netdevice() manage the freeing
    of the netdev, and invoke free_netdev(dev) if register_netdevice()
    fails.

    If netdev_ops->ndo_init() succeeds, but something else fails inside
    of register_netdevice(), it does call ndo_ops->ndo_uninit(). But
    it is not able to invoke netdev->destructor().

    This is because netdev->destructor() will do a free_netdev() and
    then the caller of register_netdevice() will do the same.

    However, this means that the resources that would normally be released
    by netdev->destructor() will not be.

    Over the years drivers have added local hacks to deal with this, by
    invoking their destructor parts by hand when register_netdevice()
    fails.

    Many drivers do not try to deal with this, and instead we have leaks.

    Let's close this hole by formalizing the distinction between what
    private things need to be freed up by netdev->destructor() and whether
    the driver needs unregister_netdevice() to perform the free_netdev().

    netdev->priv_destructor() performs all actions to free up the private
    resources that used to be freed by netdev->destructor(), except for
    free_netdev().

    netdev->needs_free_netdev is a boolean that indicates whether
    free_netdev() should be done at the end of unregister_netdevice().

    Now, register_netdevice() can sanely release all resources after
    ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
    and netdev->priv_destructor().

    And at the end of unregister_netdevice(), we invoke
    netdev->priv_destructor() and optionally call free_netdev().
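
    The split described above can be modeled in a few lines of userspace C.
    Everything here (struct netdev_model, model_register, model_unregister,
    the flag standing in for free_netdev()) is a hypothetical model of the
    control flow, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a net device and its callbacks. */
struct netdev_model {
    int (*ndo_init)(struct netdev_model *dev);
    void (*ndo_uninit)(struct netdev_model *dev);
    void (*priv_destructor)(struct netdev_model *dev);
    bool needs_free_netdev;
    int private_resources; /* stands in for driver-private allocations */
    bool freed;            /* stands in for free_netdev() having run */
};

/* Registration: on a late failure, undo ndo_init() and release private
 * resources, but leave freeing the device itself to the caller. */
static int model_register(struct netdev_model *dev, bool later_step_fails)
{
    assert(dev != NULL);

    if (dev->ndo_init && dev->ndo_init(dev) != 0)
        return -1;

    if (later_step_fails) {
        if (dev->ndo_uninit)
            dev->ndo_uninit(dev);
        if (dev->priv_destructor)
            dev->priv_destructor(dev);
        return -1;
    }
    return 0;
}

/* Unregistration: release private resources, then free the device only
 * if the driver asked for that. */
static void model_unregister(struct netdev_model *dev)
{
    assert(dev != NULL);

    if (dev->ndo_uninit)
        dev->ndo_uninit(dev);
    if (dev->priv_destructor)
        dev->priv_destructor(dev);
    if (dev->needs_free_netdev)
        dev->freed = true;
}

/* Example driver callbacks. */
static int example_init(struct netdev_model *dev)
{
    dev->private_resources = 1;
    return 0;
}

static void example_priv_destructor(struct netdev_model *dev)
{
    dev->private_resources = 0;
}
```

    In this model a failed registration no longer leaks private resources,
    and the caller can still free the device exactly once.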

    Signed-off-by: David S. Miller

    David S. Miller
     

22 Apr, 2017

1 commit

  • This feature allows the administrator to set an fwmark for
    packets traversing a tunnel. This allows the use of independent
    routing tables for tunneled packets without the use of iptables.

    There is no concept of per-packet routing decisions through IPv4
    tunnels, so this implementation does not need to work with
    per-packet route lookups as the v6 implementation may
    (with IP6_TNL_F_USE_ORIG_FWMARK).

    Further, since the v4 tunnel ioctls share datastructures
    (which can not be trivially modified) with the kernel's internal
    tunnel configuration structures, the mark attribute must be stored
    in the tunnel structure itself and passed as a parameter when
    creating or changing tunnel attributes.

    Signed-off-by: Craig Gallek
    Signed-off-by: David S. Miller

    Craig Gallek
     

18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into a zero-based array and
    thus is an unsigned entity. Using a negative value is an out-of-bounds
    access by definition.

    2)
    On x86_64 unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added or subtracted to pointers
    are preferred to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
            g(p[i]);
    }

    roughly translates to

    movsx rsi, esi
    mov rdi, [rsi+...]
    call g

    MOVSX is a 3-byte instruction which isn't necessary if the variable is
    unsigned, because x86_64 zero-extends by default.

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
            ...
            ptr = ng->ptr[id - 1];
            ...
    }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
    messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger.
    This is a seemingly random artefact of code generation with the register
    allocator being used differently. gcc decides that some variable
    needs to live in the new r8+ registers and every access now requires a REX
    prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
    used, which is longer than [r8].

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function                        old     new   delta
    nfsd4_lock                     3886    3959     +73
    tipc_link_build_proto_msg      1096    1140     +44
    mac80211_hwsim_new_radio       2776    2808     +32
    tipc_mon_rcv                   1032    1058     +26
    svcauth_gss_legacy_init        1413    1429     +16
    tipc_bcbase_select_primary      379     392     +13
    nfsd4_exchange_id              1247    1260     +13
    nfsd4_setclientid_confirm       782     793     +11
    ...
    put_client_renew_locked         494     480     -14
    ip_set_sockfn_get               730     716     -14
    geneve_sock_add                 829     813     -16
    nfsd4_sequence_done             721     703     -18
    nlmclnt_lookup_host             708     686     -22
    nfsd4_lockt                    1085    1063     -22
    nfs_get_client                 1077    1050     -27
    tcf_bpf_init                   1106    1076     -30
    nfsd4_encode_fattr             5997    5930     -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

21 Oct, 2016

1 commit

  • ipv4/ip_tunnel:
    - min_mtu = 68, max_mtu = 0xFFF8 - dev->hard_header_len - t_hlen
    - preserve all ndo_change_mtu checks for now to prevent regressions

    ipv6/ip6_tunnel:
    - min_mtu = 68, max_mtu = 0xFFF8 - dev->hard_header_len
    - preserve all ndo_change_mtu checks for now to prevent regressions

    ipv6/ip6_vti:
    - min_mtu = 1280, max_mtu = 65535
    - remove redundant vti6_change_mtu

    ipv6/sit:
    - min_mtu = 1280, max_mtu = 0xFFF8 - t_hlen
    - remove redundant ipip6_tunnel_change_mtu

    CC: netdev@vger.kernel.org
    CC: "David S. Miller"
    CC: Alexey Kuznetsov
    CC: James Morris
    CC: Hideaki YOSHIFUJI
    CC: Patrick McHardy
    Signed-off-by: Jarod Wilson
    Signed-off-by: David S. Miller

    Jarod Wilson
     

17 Sep, 2016

1 commit

  • Similar to gre, vxlan, and geneve tunnels, allow IPIP tunnels to
    operate in 'collect metadata' mode.
    bpf_skb_[gs]et_tunnel_key() helpers can make use of it right away.
    ovs can use it as well in the future (once appropriate ovs-vport
    abstractions and user apis are added).
    Note that just like in other tunnels we cannot cache the dst,
    since tunnel_info metadata can be different for every packet.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Thomas Graf
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

16 Jun, 2016

1 commit

  • In the presence of firewalls which improperly block ICMP Unreachable
    (including Fragmentation Required) messages, Path MTU Discovery is
    prevented from working.

    A workaround is to handle IPv4 payloads opaquely, ignoring the DF bit--as
    is done for other payloads like AppleTalk--and doing transparent
    fragmentation and reassembly.

    Redux includes the enforcement of mutual exclusion between this feature
    and Path MTU Discovery as suggested by Alexander Duyck.

    Cc: Alexander Duyck
    Reviewed-by: Stephen Hemminger
    Signed-off-by: Philip Prindeville

    Signed-off-by: David S. Miller

    Philip Prindeville
     

21 May, 2016

1 commit

  • Consolidate all the ip_tunnel_encap definitions in one spot in the
    header file. Also, move ip_encap_hlen and ip_tunnel_encap from
    ip_tunnel.c to ip_tunnels.h so they can be called without a dependency
    on the ip_tunnel module. Similarly, move iptun_encaps to ip_tunnel_core.c.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

30 Apr, 2016

1 commit

  • After commit e09acddf873b ("ip_tunnel: replace dst_cache with generic
    implementation"), a preemption debug warning is triggered when updating
    ip4 tunnels; the dst cache helper needs to be invoked in unpreemptible
    context.

    We don't need to load the cache on tunnel update, so this commit fixes
    the warning by replacing the load with a dst cache reset, which is
    preempt safe.

    Fixes: e09acddf873b ("ip_tunnel: replace dst_cache with generic implementation")
    Reported-by: Eric Dumazet
    Signed-off-by: Paolo Abeni
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Paolo Abeni
     

09 Mar, 2016

1 commit


24 Feb, 2016

1 commit

  • IPCB may contain data from previous layers (in the observed case the
    qdisc layer). In the observed scenario, the data was misinterpreted as
    ip header options, which later caused the ihl to be set to an invalid
    value. The opt field is cleared before dst_link_failure can be called for
    various types of tunnels. This change only applies to encapsulated ipv4
    packets.

    The code introduced in 11c21a30 which clears all of IPCB has been removed
    to be consistent with these changes, and instead the opt field is cleared
    unconditionally in ip_tunnel_xmit. The change in ip_tunnel_xmit applies to
    SIT, GRE, and IPIP tunnels.

    The relevant vti, l2tp, and pptp functions already contain similar code for
    clearing the IPCB.

    Signed-off-by: Bernie Harris
    Signed-off-by: David S. Miller

    Bernie Harris
     

23 Feb, 2016

1 commit


17 Feb, 2016

1 commit

  • The current ip_tunnel cache implementation is prone to a race
    that will cause the wrong dst to be cached on a concurrent dst cache
    miss and ip tunnel update via netlink.

    Replacing it with the generic implementation fixes the issue.

    Signed-off-by: Paolo Abeni
    Suggested-and-acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Paolo Abeni
     

10 Feb, 2016

1 commit

  • Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
    transmit vxlan packets of any size, constrained only by the ability to
    send out the resulting packets. 4.3 introduced netdevs corresponding
    to tunnel vports. These netdevs have an MTU, which limits the size of
    a packet that can be successfully encapsulated. The default MTU
    values are low (1500 or less), which is awkwardly small in the context
    of physical networks supporting jumbo frames, and leads to a
    conspicuous change in behaviour for userspace.

    Instead, set the MTU on openvswitch-created netdevs to be the relevant
    maximum (i.e. the maximum IP packet size minus any relevant overhead),
    effectively restoring the behaviour prior to 4.3.

    Signed-off-by: David Wragg
    Signed-off-by: David S. Miller

    David Wragg
     

26 Dec, 2015

1 commit


01 Dec, 2015

1 commit


11 Aug, 2015

1 commit


09 Jul, 2015

1 commit

  • Frag needed should be sent only if the inner header asked
    not to fragment. Currently fragmentation is broken if the
    tunnel has df set, but df was not asked for in the original
    packet. The tunnel's df still needs to be checked to update
    the pmtu cache internally.
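
    The decision described above can be written as a small predicate. The
    function name and parameters are hypothetical; the tunnel's own df
    setting is deliberately not an input, since per the message it only
    matters for updating the pmtu cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: should an ICMP "fragmentation needed" error be
 * sent back for this inner IPv4 packet?
 *   inner_df  - DF bit of the inner (original) IPv4 header
 *   pkt_size  - size of the packet to be sent through the tunnel
 *   mtu       - path MTU available to the tunneled payload */
static bool should_send_frag_needed(bool inner_df,
                                    unsigned int pkt_size,
                                    unsigned int mtu)
{
    assert(mtu > 0);

    /* Only a sender that asked not to fragment gets the error;
     * otherwise the packet can simply be fragmented. */
    return inner_df && pkt_size > mtu;
}
```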

    Commit 23a3647bc4f93bac broke it, and this commit fixes
    the ipv4 df check back to the way it was.

    Fixes: 23a3647bc4f93bac ("ip_tunnels: Use skb-len to PMTU check.")
    Cc: Pravin B Shelar
    Signed-off-by: Timo Teräs
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Timo Teräs
     

08 Apr, 2015

1 commit


04 Apr, 2015

2 commits

  • The ipv4 code uses a mixture of coding styles. In some instances check
    for non-NULL pointer is done as x != NULL and sometimes as x. x is
    preferred according to checkpatch and this patch makes the code
    consistent by adopting the latter form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     
  • The ipv4 code uses a mixture of coding styles. In some instances check
    for NULL pointer is done as x == NULL and sometimes as !x. !x is
    preferred according to checkpatch and this patch makes the code
    consistent by adopting the latter form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

03 Apr, 2015

1 commit


20 Jan, 2015

1 commit


17 Dec, 2014

2 commits


13 Nov, 2014

1 commit

  • Instead of calling fou and gue functions directly from ip_tunnel,
    use ops for these that were previously registered. This patch adds the
    logic to add and remove encapsulation operations for ip_tunnel,
    and modifies fou (and gue) to register with ip_tunnels.

    This patch also addresses a circular dependency between ip_tunnel
    and fou that was causing link errors when CONFIG_NET_IP_TUNNEL=y
    and CONFIG_NET_FOU=m. References to fou and gue have been removed from
    ip_tunnel.c.
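
    A registration table of this kind might look like the sketch below. The
    table size, struct and function names are assumptions for illustration,
    and the sketch is single-threaded; concurrent registration would need
    atomic updates of the slots.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENCAP_OPS 8 /* hypothetical number of encapsulation slots */

/* Hypothetical operations an encapsulation module provides. */
struct encap_ops_sketch {
    size_t (*encap_hlen)(void);
};

static const struct encap_ops_sketch *encap_table[MAX_ENCAP_OPS];

/* Registers ops in slot num. Fails if the slot is invalid or taken. */
static int encap_add_ops(const struct encap_ops_sketch *ops, unsigned int num)
{
    assert(ops != NULL);

    if (num >= MAX_ENCAP_OPS || encap_table[num] != NULL)
        return -1;
    encap_table[num] = ops;
    return 0;
}

/* Unregisters ops from slot num. Fails if ops is not the registered one. */
static int encap_del_ops(const struct encap_ops_sketch *ops, unsigned int num)
{
    if (num >= MAX_ENCAP_OPS || encap_table[num] != ops)
        return -1;
    encap_table[num] = NULL;
    return 0;
}

/* Tunnel-side lookup: calls through the table instead of naming the
 * encapsulation module directly. Returns 0 if nothing is registered. */
static size_t encap_hlen(unsigned int num)
{
    if (num >= MAX_ENCAP_OPS || encap_table[num] == NULL)
        return 0;
    return encap_table[num]->encap_hlen();
}

/* Example provider: header length of a plain UDP encapsulation. */
static size_t udp_encap_hlen_sketch(void)
{
    return 8; /* size of a UDP header in bytes */
}
```

    Because the tunnel side only refers to the table, it has no link-time
    dependency on the module that provides the operations.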

    Reported-by: Randy Dunlap
    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

06 Nov, 2014

1 commit

  • Move fou_build_header out of ip_tunnel.c and into fou.c splitting
    it up into fou_build_header, gue_build_header, and fou_build_udp.
    This allows for other users for TX of FOU or GUE. Change ip_tunnel_encap
    to call fou_build_header or gue_build_header based on the tunnel
    encapsulation type. Similarly, added fou_encap_hlen and gue_encap_hlen
    functions which are called by ip_encap_hlen. New net/fou.h has
    prototypes and defines for this.

    Added NET_FOU_IP_TUNNELS configuration. When this is set, IP tunnels
    can use FOU/GUE and fou module is also selected.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

04 Oct, 2014

2 commits


03 Oct, 2014

1 commit


26 Sep, 2014

1 commit

  • When we try to add an already existing tunnel, we don't return
    an error. Instead we continue and call ip_tunnel_update().
    This means that we can change existing tunnels by adding
    the same tunnel multiple times. It is even possible to change
    the tunnel endpoints of the fallback device.

    We fix this by returning an error if we try to add an existing
    tunnel.
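
    The behaviour after the fix can be sketched with a small table of
    tunnels. The table, the lookup by name and the function tunnel_add are
    assumptions for illustration; the point is only that adding a duplicate
    returns an error and leaves the existing entry untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_TUNNELS 4   /* hypothetical table size */
#define TUNNEL_NAME_LEN 16

/* Hypothetical tunnel record. */
struct tunnel_sketch {
    char name[TUNNEL_NAME_LEN];
    unsigned int remote; /* stands in for the tunnel endpoint */
    int in_use;
};

static struct tunnel_sketch tunnels[MAX_TUNNELS];

/* Adds a tunnel. Returns 0 on success, -EEXIST if a tunnel with the same
 * name already exists, -ENOSPC if the table is full. */
static int tunnel_add(const char *name, unsigned int remote)
{
    assert(name != NULL && strlen(name) < TUNNEL_NAME_LEN);

    struct tunnel_sketch *free_slot = NULL;

    for (size_t i = 0; i < MAX_TUNNELS; i++) {
        if (tunnels[i].in_use && strcmp(tunnels[i].name, name) == 0)
            return -EEXIST; /* do not fall through to an update */
        if (!tunnels[i].in_use && free_slot == NULL)
            free_slot = &tunnels[i];
    }

    if (free_slot == NULL)
        return -ENOSPC;

    strcpy(free_slot->name, name);
    free_slot->remote = remote;
    free_slot->in_use = 1;
    return 0;
}
```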

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     

24 Sep, 2014

1 commit