11 Aug, 2020

1 commit

  • [ Upstream commit 706ec919164622ff5ce822065472d0f30a9e9dd2 ]

    ip6_route_info_create() invokes nexthop_get(), which increases the
    refcount of the "nh".

    When ip6_route_info_create() returns, local variable "nh" becomes
    invalid, so the refcount should be decreased to keep refcount balanced.

    The reference counting issue happens in one exception handling path of
    ip6_route_info_create(). When nexthops cannot be used with source
    routing, the function forgets to decrease the refcnt increased by
    nexthop_get(), causing a refcnt leak.

    Fix this issue by pulling the source routing error handling up, so that
    the case where nexthops cannot be used with source routing is rejected
    before the reference is taken.
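
As an illustration only (this is not the kernel code), the leak pattern and the reordering can be modelled with a toy refcounted object. The class `Nexthop` and the functions `create_leaky`, `create_fixed` and `refcnt_after_failure` are hypothetical names introduced for this sketch.

```python
class Nexthop:
    """Toy stand-in for a refcounted nexthop object (hypothetical, not kernel code)."""

    def __init__(self):
        self.refcnt = 1  # reference assumed to be held by the owner of the object

    def get(self):
        self.refcnt += 1

    def put(self):
        self.refcnt -= 1


def create_leaky(nh, uses_source_routing):
    """Takes the reference first, then bails out without dropping it."""
    nh.get()
    if uses_source_routing:
        raise ValueError("nexthops can not be used with source routing")
    return nh


def create_fixed(nh, uses_source_routing):
    """Rejects the invalid combination before any reference is taken."""
    if uses_source_routing:
        raise ValueError("nexthops can not be used with source routing")
    nh.get()
    return nh


def refcnt_after_failure(create):
    """Run a create function on its error path and report the resulting refcount."""
    nh = Nexthop()
    try:
        create(nh, uses_source_routing=True)
    except ValueError:
        pass
    return nh.refcnt


print(refcnt_after_failure(create_leaky))  # 2: one reference was leaked
print(refcnt_after_failure(create_fixed))  # 1: balanced
```

Checking first keeps the error path free of cleanup work; the alternative would be an explicit put() on that path.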

    Fixes: f88d8ea67fbd ("ipv6: Plumb support for nexthop object in a fib6_info")
    Signed-off-by: Xiyu Yang
    Signed-off-by: Xin Tan
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Xiyu Yang
     

22 Jul, 2020

2 commits

  • [ Upstream commit aea23c323d89836bcdcee67e49def997ffca043b ]

    Thomas reported a regression with IPv6 and anycast using the following
    reproducer:

    echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
    ip -6 a add fc12::1/16 dev lo
    sleep 2
    echo "pinging lo"
    ping6 -c 2 fc12::

    The conversion of addrconf_f6i_alloc to use ip6_route_info_create missed
    the use of fib6_is_reject which checks addresses added to the loopback
    interface and sets the REJECT flag as needed. Update fib6_is_reject for
    loopback checks to handle RTF_ANYCAST addresses.

    Fixes: c7a1ce397ada ("ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create")
    Reported-by: thomas.gambier@nexedi.com
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Ahern
     
  • [ Upstream commit 34fe5a1cf95c3f114068fc16d919c9cf4b00e428 ]

    Brian reported a crash in IPv6 code when using rpfilter with a setup
    running FRR and external nexthop objects. The root cause of the crash
    is fib6_select_path setting fib6_nh in the result to NULL because of
    an improper check for nexthop objects.

    More specifically, rpfilter invokes ip6_route_lookup with flowi6_oif
    set causing fib6_select_path to be called with have_oif_match set.
    fib6_select_path has an early check on have_oif_match and jumps to the
    out label, which presumes a builtin fib6_nh. This path is invalid for
    nexthop objects; for external nexthops fib6_select_path needs to just
    return if the fib6_nh has already been set in the result; otherwise it
    returns after the call to nexthop_path_fib6_result. Update the check
    on have_oif_match to not bail on external nexthops.

    Update selftests for this problem.

    Fixes: f88d8ea67fbd ("ipv6: Plumb support for nexthop object in a fib6_info")
    Reported-by: Brian Rak
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Ahern
     

20 May, 2020

1 commit

  • [ Upstream commit 09454fd0a4ce23cb3d8af65066c91a1bf27120dd ]

    This reverts commit 19bda36c4299ce3d7e5bce10bebe01764a655a6d:

    | ipv6: add mtu lock check in __ip6_rt_update_pmtu
    |
    | Prior to this patch, ipv6 didn't do mtu lock check in ip6_update_pmtu.
    | As a result, the mtu lock didn't really work when receiving an
    | ICMPV6_PKT_TOOBIG packet.
    |
    | This patch is to add mtu lock check in __ip6_rt_update_pmtu just as ipv4
    | did in __ip_rt_update_pmtu.

    The above reasoning is incorrect. IPv6 *requires* icmp based pmtu to work.
    There's already a comment to this effect elsewhere in the kernel:

    $ git grep -p -B1 -A3 'RTAX_MTU lock'
    net/ipv6/route.c=4813=

    static int rt6_mtu_change_route(struct fib6_info *f6i, void *p_arg)
    ...
    /* In IPv6 pmtu discovery is not optional,
    so that RTAX_MTU lock cannot disable it.
    We still use this lock to block changes
    caused by addrconf/ndisc.
    */

    This reverts to the pre-4.9 behaviour.

    Cc: Eric Dumazet
    Cc: Willem de Bruijn
    Cc: Xin Long
    Cc: Hannes Frederic Sowa
    Signed-off-by: Maciej Żenczykowski
    Fixes: 19bda36c4299 ("ipv6: add mtu lock check in __ip6_rt_update_pmtu")
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Maciej Żenczykowski
     

14 May, 2020

1 commit

  • [ Upstream commit 8f34e53b60b337e559f1ea19e2780ff95ab2fa65 ]

    Nik reported a bug with pcpu dst cache when nexthop objects are
    used, illustrated by the following:
    $ ip netns add foo
    $ ip -netns foo li set lo up
    $ ip -netns foo addr add 2001:db8:11::1/128 dev lo
    $ ip netns exec foo sysctl net.ipv6.conf.all.forwarding=1
    $ ip li add veth1 type veth peer name veth2
    $ ip li set veth1 up
    $ ip addr add 2001:db8:10::1/64 dev veth1
    $ ip li set dev veth2 netns foo
    $ ip -netns foo li set veth2 up
    $ ip -netns foo addr add 2001:db8:10::2/64 dev veth2
    $ ip -6 nexthop add id 100 via 2001:db8:10::2 dev veth1
    $ ip -6 route add 2001:db8:11::1/128 nhid 100

    Create a pcpu entry on cpu 0:
    $ taskset -a -c 0 ip -6 route get 2001:db8:11::1

    Re-add the route entry:
    $ ip -6 ro del 2001:db8:11::1
    $ ip -6 route add 2001:db8:11::1/128 nhid 100

    Route get on cpu 0 returns the stale pcpu:
    $ taskset -a -c 0 ip -6 route get 2001:db8:11::1
    RTNETLINK answers: Network is unreachable

    While cpu 1 works:
    $ taskset -a -c 1 ip -6 route get 2001:db8:11::1
    2001:db8:11::1 from :: via 2001:db8:10::2 dev veth1 src 2001:db8:10::1 metric 1024 pref medium

    Conversion of FIB entries to work with external nexthop objects
    missed an important difference between IPv4 and IPv6 - how dst
    entries are invalidated when the FIB changes. IPv4 has a per-network
    namespace generation id (rt_genid) that is bumped on changes to the FIB.
    Checking if a dst_entry is still valid means comparing rt_genid in the
    rtable to the current value of rt_genid for the namespace.

    IPv6 also has a per network namespace counter, fib6_sernum, but the
    count is saved per fib6_node. With the per-node counter only dst_entries
    based on fib entries under the node are invalidated when changes are
    made to the routes - limiting the scope of invalidations. IPv6 uses a
    reference in the rt6_info, 'from', to track the corresponding fib entry
    used to create the dst_entry. When validating a dst_entry, the 'from'
    is used to backtrack to the fib6_node and compare its sernum to the
    cookie passed to the dst_check operation.

    With the inline format (nexthop definition inline with the fib6_info),
    dst_entries cached in the fib6_nh have a 1:1 correlation between fib
    entries, nexthop data and dst_entries. With external nexthops, IPv6
    looks more like IPv4 which means multiple fib entries across disparate
    fib6_nodes can all reference the same fib6_nh. That means validation
    of dst_entries based on external nexthops needs to use the IPv4 format
    - the per-network namespace counter.

    Add sernum to rt6_info and set it when creating a pcpu dst entry. Update
    rt6_get_cookie to return sernum if it is set and update dst_check for
    IPv6 to look for sernum set and base the check on it if so. Finally,
    rt6_get_pcpu_route needs to validate the cached entry before returning
    a pcpu entry (similar to the rt_cache_valid calls in __mkroute_input and
    __mkroute_output for IPv4).
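
The idea of validating a cached per-cpu entry against a namespace-wide counter can be sketched as follows. This is a simplified model under stated assumptions, not the kernel implementation: `Namespace`, `PcpuDst`, `dst_is_valid` and `get_pcpu_route` are hypothetical names, and a plain dict stands in for the per-cpu cache.

```python
class Namespace:
    """Hypothetical per-namespace state holding a single generation counter."""

    def __init__(self):
        self.genid = 0

    def fib_changed(self):
        # Any FIB change bumps the counter, invalidating every cached entry.
        self.genid += 1


class PcpuDst:
    """Cached entry that remembers the counter value seen at creation time."""

    def __init__(self, ns, via):
        self.via = via
        self.sernum = ns.genid


def dst_is_valid(dst, ns):
    return dst.sernum == ns.genid


def get_pcpu_route(cache, cpu, ns, via):
    """Return the cached entry for this cpu, rebuilding it if it is stale."""
    dst = cache.get(cpu)
    if dst is None or not dst_is_valid(dst, ns):
        dst = PcpuDst(ns, via)
        cache[cpu] = dst
    return dst


ns = Namespace()
cache = {}
first = get_pcpu_route(cache, 0, ns, "2001:db8:10::2")
ns.fib_changed()  # e.g. the route is deleted and re-added
second = get_pcpu_route(cache, 0, ns, "2001:db8:10::2")
print(second is first)  # False: the stale entry was not reused
```

Without the validity check in get_pcpu_route, cpu 0 would keep returning the entry built before the route was re-added, which matches the stale result in the reproducer.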

    This problem only affects routes using the new, external nexthops.

    Thanks to the kbuild test robot for catching the IS_ENABLED needed
    around rt_genid_ipv6 before I sent this out.

    Fixes: 5b98324ebe29 ("ipv6: Allow routes to use nexthop objects")
    Reported-by: Nikolay Aleksandrov
    Signed-off-by: David Ahern
    Reviewed-by: Nikolay Aleksandrov
    Tested-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Ahern
     

05 Mar, 2020

1 commit

  • [ Upstream commit afecdb376bd81d7e16578f0cfe82a1aec7ae18f3 ]

    When splitting an RTA_MULTIPATH request into multiple routes and adding the
    second and later components, we must not simply remove NLM_F_REPLACE but
    instead replace it by NLM_F_CREATE. Otherwise, it may look like the netlink
    message was malformed.

    For example,
    ip route add 2001:db8::1/128 dev dummy0
    ip route change 2001:db8::1/128 nexthop via fe80::30:1 dev dummy0 \
    nexthop via fe80::30:2 dev dummy0
    results in the following warnings:
    [ 1035.057019] IPv6: RTM_NEWROUTE with no NLM_F_CREATE or NLM_F_REPLACE
    [ 1035.057517] IPv6: NLM_F_CREATE should be set when creating new route

    This patch makes the nlmsg sequence look equivalent for __ip6_ins_rt() to
    what it would get if the multipath route had been added in multiple netlink
    operations:
    ip route add 2001:db8::1/128 dev dummy0
    ip route change 2001:db8::1/128 nexthop via fe80::30:1 dev dummy0
    ip route append 2001:db8::1/128 nexthop via fe80::30:2 dev dummy0
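
A small model of the flag handling, for illustration only. The two constants are the values from include/uapi/linux/netlink.h; the helper names are hypothetical, and `looks_malformed` only mirrors the condition named in the first warning quoted above.

```python
# Flag values as defined in include/uapi/linux/netlink.h.
NLM_F_REPLACE = 0x100
NLM_F_CREATE = 0x400


def flags_dropping_replace(flags):
    """Old behaviour as described above: simply clear NLM_F_REPLACE."""
    return flags & ~NLM_F_REPLACE


def flags_swapping_replace_for_create(flags):
    """Described fix: clear NLM_F_REPLACE and set NLM_F_CREATE in its place."""
    return (flags & ~NLM_F_REPLACE) | NLM_F_CREATE


def looks_malformed(flags):
    """True when neither NLM_F_CREATE nor NLM_F_REPLACE is set."""
    return not (flags & (NLM_F_CREATE | NLM_F_REPLACE))


change_request = NLM_F_REPLACE  # 'ip route change' sends NLM_F_REPLACE without NLM_F_CREATE

print(looks_malformed(flags_dropping_replace(change_request)))             # True
print(looks_malformed(flags_swapping_replace_for_create(change_request)))  # False
```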

    Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
    Signed-off-by: Benjamin Poirier
    Reviewed-by: Michal Kubecek
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Benjamin Poirier
     

05 Jan, 2020

1 commit

  • [ Upstream commit bd085ef678b2cc8c38c105673dfe8ff8f5ec0c57 ]

    The MTU update code is supposed to be invoked in response to real
    networking events that update the PMTU. In IPv6 PMTU update function
    __ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
    confirmed time.

    But the tunnel code updates the PMTU before xmit, along this call chain:
    - tnl_update_pmtu()
    - skb_dst_update_pmtu()
    - ip6_rt_update_pmtu()
    - __ip6_rt_update_pmtu()
    - dst_confirm_neigh()

    If the tunnel remote dst mac address changed and we still do the neigh
    confirm, we will not be able to update the neigh cache, and ping6 to the
    remote will fail.

    So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
    should not be invoking dst_confirm_neigh() as we have no evidence
    of successful two-way communication at this point.

    On the other hand it is also important to keep the neigh reachability fresh
    for TCP flows, so we cannot remove this dst_confirm_neigh() call.

    To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
    to choose whether we should do neigh update or not. I will add the parameter
    in this patch and set all the callers to true to preserve the previous
    behaviour, and fix the tunnel code one by one in later patches.

    v5: No change.
    v4: No change.
    v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
    v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

    Suggested-by: David Miller
    Reviewed-by: Guillaume Nault
    Acked-by: David Ahern
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
     

21 Nov, 2019

1 commit

  • Previously, rt6_probe() returned directly if
    (!rt || !rt->fib6_nh.fib_nh_gw_family), but after commit cc3a86c802f0
    ("ipv6: Change rt6_probe to take a fib6_nh"), the logic changed to
    return if there is a fib_nh_gw_family.

    Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh")
    Signed-off-by: Hangbin Liu
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Hangbin Liu
     

08 Nov, 2019

1 commit

  • While looking at a syzbot KCSAN report [1], I found multiple
    issues in this code:

    1) fib6_nh->last_probe has an initial value of 0.

    While probably okay on 64bit kernels, this causes an issue
    on 32bit kernels since the time_after(jiffies, 0 + interval)
    might be false ~24 days after boot (for HZ=1000)

    2) The data-race found by KCSAN
    I could use READ_ONCE() and WRITE_ONCE(), but we can also take the
    opportunity to avoid piling up too many rt6_probe_deferred() works by
    using cmpxchg() instead, so that only one cpu wins the race.
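
The 32-bit wraparound in point 1) can be reproduced with a small model of a wrap-safe "is a after b" comparison on 32-bit tick counts. This is a sketch, not kernel code: the `interval` value is hypothetical, and tick counts here are measured from counter value 0, which need not coincide exactly with boot.

```python
MASK32 = 0xFFFFFFFF
HZ = 1000  # ticks per second, as in the HZ=1000 case mentioned above


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def time_after(a, b):
    """True if tick count a is after b, using a signed 32-bit difference."""
    return to_signed32(b - a) < 0


interval = 60 * HZ  # hypothetical probe interval
last_probe = 0      # the problematic initial value

soon = 10 * 60 * HZ                    # counter ten minutes past zero
much_later = (1 << 31) + interval + 1  # counter more than 2**31 ticks past zero

print(time_after(soon, last_probe + interval))        # True
print(time_after(much_later, last_probe + interval))  # False
print(round((1 << 31) / HZ / 86400, 1))               # 24.9 (days)
```

Once the counter is more than 2**31 ticks past the stored value, the signed difference flips sign, so a last_probe that was never updated from 0 makes the comparison report "not yet".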

    [1]
    BUG: KCSAN: data-race in find_match / find_match

    write to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 1:
    rt6_probe net/ipv6/route.c:663 [inline]
    find_match net/ipv6/route.c:757 [inline]
    find_match+0x5bd/0x790 net/ipv6/route.c:733
    __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
    find_rr_leaf net/ipv6/route.c:852 [inline]
    rt6_select net/ipv6/route.c:896 [inline]
    fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
    ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
    ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
    fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
    ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
    ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
    ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
    ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
    inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
    inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
    __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
    tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
    tcp_xmit_probe_skb+0x19b/0x1d0 net/ipv4/tcp_output.c:3735

    read to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 0:
    rt6_probe net/ipv6/route.c:657 [inline]
    find_match net/ipv6/route.c:757 [inline]
    find_match+0x521/0x790 net/ipv6/route.c:733
    __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
    find_rr_leaf net/ipv6/route.c:852 [inline]
    rt6_select net/ipv6/route.c:896 [inline]
    fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
    ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
    ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
    fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
    ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
    ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
    ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
    ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
    inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
    inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
    __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 0 PID: 18894 Comm: udevd Not tainted 5.4.0-rc3+ #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

    Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh")
    Fixes: f547fac624be ("ipv6: rate-limit probes for neighbourless routes")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Sep, 2019

1 commit


12 Sep, 2019

1 commit

  • This is the equivalent of commit 2c6b55f45d53 ("ipv6: fix neighbour
    resolution with raw socket") for ip6_confirm_neigh(): we can send a
    packet with MSG_CONFIRM on a raw socket for a connected route, so the
    gateway would be :: here, and we should pick the next hop using
    rt6_nexthop() instead.

    This was found by code review and, to the best of my knowledge, doesn't
    actually fix a practical issue: the destination address from the packet
    is not considered while confirming a neighbour, as ip6_confirm_neigh()
    calls choose_neigh_daddr() without passing the packet, so there are no
    issues similar to the one fixed by said commit.

    A possible source of issues with the existing implementation might come
    from the fact that, if we have a cached dst, we won't consider it,
    while rt6_nexthop() takes care of that. I might just not be creative
    enough to find a practical problem here: the only way to affect this
    with cached routes is to have one coming from an ICMPv6 redirect, but
    if the next hop is a directly connected host, there should be no
    topology for which a redirect applies here, and tests with redirected
    routes show no differences for MSG_CONFIRM (and MSG_PROBE) packets on
    raw sockets destined to a directly connected host.

    However, directly using the dst gateway here is not consistent anymore
    with neighbour resolution, and, in general, as we want the next hop,
    using rt6_nexthop() looks like the only sane way to fetch it.

    Reported-by: Guillaume Nault
    Signed-off-by: Stefano Brivio
    Acked-by: Guillaume Nault
    Acked-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Stefano Brivio
     

07 Sep, 2019

1 commit

  • Fixes a stupid bug I recently introduced...
    ip6_route_info_create() returns an ERR_PTR(err) and not a NULL on error.

    Fixes: d55a2e374a94 ("net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)")
    Cc: David Ahern
    Cc: Lorenzo Colitti
    Cc: Eric Dumazet
    Signed-off-by: Maciej Żenczykowski
    Reported-by: syzbot
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Maciej Żenczykowski
     

05 Sep, 2019

3 commits

  • When creating a v4 route that uses a v6 nexthop from a nexthop group,
    allow the kernel to properly send the nexthop as v6 via the RTA_VIA
    attribute.

    Broken behavior:

    $ ip nexthop add via fe80::9 dev eth0
    $ ip nexthop show
    id 1 via fe80::9 dev eth0 scope link
    $ ip route add 4.5.6.7/32 nhid 1
    $ ip route show
    default via 10.0.2.2 dev eth0
    4.5.6.7 nhid 1 via 254.128.0.0 dev eth0
    10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
    $

    Fixed behavior:

    $ ip nexthop add via fe80::9 dev eth0
    $ ip nexthop show
    id 1 via fe80::9 dev eth0 scope link
    $ ip route add 4.5.6.7/32 nhid 1
    $ ip route show
    default via 10.0.2.2 dev eth0
    4.5.6.7 nhid 1 via inet6 fe80::9 dev eth0
    10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
    $
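
The address 254.128.0.0 in the broken output is what you get when the first four bytes of fe80::9 are read as an IPv4 address. The snippet below only demonstrates that arithmetic; it makes no claim about how the kernel stores the gateway, and the helper name is hypothetical.

```python
import ipaddress


def first_four_bytes_as_ipv4(addr6):
    """Interpret the leading 4 bytes of an IPv6 address as an IPv4 address."""
    packed = ipaddress.IPv6Address(addr6).packed
    return ipaddress.IPv4Address(packed[:4])


print(first_four_bytes_as_ipv4("fe80::9"))  # 254.128.0.0
```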

    v2, v3: Addresses code review comments from David Ahern

    Fixes: dcb1ecb50edf ("ipv4: Prepare for fib6_nh from a nexthop object")
    Signed-off-by: Donald Sharp
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Donald Sharp
     
  • A change to the core nla helpers was missed during the push of
    the nexthop changes. rt6_fill_node_nexthop should be calling
    nla_nest_start_noflag not nla_nest_start. Currently, iproute2
    does not print multipath data because of parsing issues with
    the attribute.

    Fixes: f88d8ea67fbd ("ipv6: Plumb support for nexthop object in a fib6_info")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • There is a subtle change in behaviour introduced by:
    commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
    'ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create'

    Before that patch /proc/net/ipv6_route includes:
    00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo

    Afterwards /proc/net/ipv6_route includes:
    00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo

    i.e., the above commit causes the ::1/128 local (automatic) route to be flagged with RTF_ADDRCONF (0x040000).

    AFAICT, this is incorrect since these routes are *not* coming from RAs.

    As such, this patch restores the old behaviour.
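
The flag difference between the two /proc/net/ipv6_route lines can be checked mechanically. This sketch assumes the flags are the second-to-last whitespace-separated field, as in the lines quoted above; `flags_field` is a hypothetical helper, and the RTF_ADDRCONF value is the one given in this message.

```python
RTF_ADDRCONF = 0x00040000  # value quoted in the commit message above

BEFORE = ("00000000000000000000000000000001 80 00000000000000000000000000000000 00 "
          "00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo")
AFTER = ("00000000000000000000000000000001 80 00000000000000000000000000000000 00 "
         "00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo")


def flags_field(line):
    """Return the flags column (second-to-last field) as an integer."""
    return int(line.split()[-2], 16)


changed_bits = flags_field(BEFORE) ^ flags_field(AFTER)
print(hex(changed_bits))                        # 0x40000
print(changed_bits == RTF_ADDRCONF)             # True
print(bool(flags_field(AFTER) & RTF_ADDRCONF))  # True
```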

    Fixes: c7a1ce397ada ("ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create")
    Cc: David Ahern
    Cc: Lorenzo Colitti
    Signed-off-by: Maciej Żenczykowski
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Maciej Żenczykowski
     

07 Aug, 2019

1 commit


06 Aug, 2019

2 commits


20 Jul, 2019

1 commit

  • Pull networking fixes from David Miller:

    1) Fix AF_XDP cq entry leak, from Ilya Maximets.

    2) Fix handling of PHY power-down on RTL8411B, from Heiner Kallweit.

    3) Add some new PCI IDs to iwlwifi, from Ihab Zhaika.

    4) Fix handling of neigh timers wrt. entries added by userspace, from
    Lorenzo Bianconi.

    5) Various cases of missing of_node_put(), from Nishka Dasgupta.

    6) The new NET_ACT_CT needs to depend upon NF_NAT, from Yue Haibing.

    7) Various RDS layer fixes, from Gerd Rausch.

    8) Fix some more fallout from TCQ_F_CAN_BYPASS generalization, from
    Cong Wang.

    9) Fix FIB source validation checks over loopback, also from Cong Wang.

    10) Use promisc for unsupported number of filters, from Justin Chen.

    11) Missing sibling route unlink on failure in ipv6, from Ido Schimmel.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
    tcp: fix tcp_set_congestion_control() use from bpf hook
    ag71xx: fix return value check in ag71xx_probe()
    ag71xx: fix error return code in ag71xx_probe()
    usb: qmi_wwan: add D-Link DWM-222 A2 device ID
    bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips.
    net: dsa: sja1105: Fix missing unlock on error in sk_buff()
    gve: replace kfree with kvfree
    selftests/bpf: fix test_xdp_noinline on s390
    selftests/bpf: fix "valid read map access into a read-only array 1" on s390
    net/mlx5: Replace kfree with kvfree
    MAINTAINERS: update netsec driver
    ipv6: Unlink sibling route in case of failure
    liquidio: Replace vmalloc + memset with vzalloc
    udp: Fix typo in net/ipv4/udp.c
    net: bcmgenet: use promisc for unsupported filters
    ipv6: rt6_check should return NULL if 'from' is NULL
    tipc: initialize 'validated' field of received packets
    selftests: add a test case for rp_filter
    fib: relax source validation check for loopback packets
    mlxsw: spectrum: Do not process learned records with a dummy FID
    ...

    Linus Torvalds
     

19 Jul, 2019

1 commit

  • In the sysctl code the proc_dointvec_minmax() function is often used to
    validate that the user-supplied value lies within an allowed range. This
    function uses the extra1 and extra2 members from struct ctl_table as
    minimum and maximum allowed value.

    On sysctl handler declaration, in every source file there are some
    readonly variables containing just an integer whose address is assigned
    to the extra1 and extra2 members, so the sysctl range is enforced.

    The special values 0, 1 and INT_MAX are very often used as range
    boundary, leading to duplication of variables like zero=0, one=1,
    int_max=INT_MAX in different source files:

    $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
    248

    Add a const int array containing the most commonly used values, some
    macros to refer more easily to the correct array member, and use them
    instead of creating a local one for every object file.

    This is the bloat-o-meter output comparing the old and new binary
    compiled with the default Fedora config:

    # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
    add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
    Data                     old     new   delta
    sysctl_vals                -      12     +12
    __kstrtab_sysctl_vals      -      12     +12
    max                       14      10      -4
    int_max                   16       -     -16
    one                       68       -     -68
    zero                     128      28    -100
    Total: Before=20583249, After=20583085, chg -0.00%

    [mcroce@redhat.com: tipc: remove two unused variables]
    Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
    [akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
    [arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
    Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
    [akpm@linux-foundation.org: fix fs/eventpoll.c]
    Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
    Signed-off-by: Matteo Croce
    Signed-off-by: Arnd Bergmann
    Acked-by: Kees Cook
    Reviewed-by: Aaron Tomlin
    Cc: Matthew Wilcox
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matteo Croce
     

18 Jul, 2019

1 commit

  • Paul reported that l2tp sessions were broken after the commit referenced
    in the Fixes tag. Prior to this commit rt6_check returned NULL if the
    rt6_info 'from' was NULL - i.e., the dst_entry was disconnected from a FIB
    entry. Restore that behavior.

    Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
    Reported-by: Paul Donohue
    Tested-by: Paul Donohue
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

09 Jul, 2019

1 commit


02 Jul, 2019

1 commit


28 Jun, 2019

2 commits

  • The new route handling in ip_mc_finish_output() from 'net' overlapped
    with the new support for returning congestion notifications from BPF
    programs.

    In order to handle this I had to take the dev_loopback_xmit() calls
    out of the switch statement.

    The aquantia driver conflicts were simple overlapping changes.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Gateway validation does not need a dst_entry, it only needs the fib
    entry to validate the gateway resolution and egress device. So,
    convert ip6_nh_lookup_table from ip6_pol_route to fib6_table_lookup
    and ip6_route_check_nh to use fib6_lookup over rt6_lookup.

    ip6_pol_route is a call to fib6_table_lookup and if successful a call
    to fib6_select_path. From there the exception cache is searched for an
    entry or a dst_entry is created to return to the caller. The exception
    entry is not relevant for gateway validation, so what matters are the
    calls to fib6_table_lookup and then fib6_select_path.

    Similarly, rt6_lookup can be replaced with a call to fib6_lookup with
    RT6_LOOKUP_F_IFACE set in flags. Again, the exception cache search is
    not relevant, only the lookup with path selection. The primary difference
    in the lookup paths is the use of rt6_select with fib6_lookup versus
    rt6_device_match with rt6_lookup. When you remove complexities in the
    rt6_select path, e.g.,
    1. saddr is not set for gateway validation, so RT6_LOOKUP_F_HAS_SADDR
    is not relevant
    2. rt6_check_neigh is not called so that removes the RT6_NUD_FAIL_DO_RR
    return and round-robin logic.

    the code paths are believed to be equivalent for the given use case -
    validate the gateway and, optionally, the given device. Furthermore, it
    aligns the validation with onlink code path and the lookup path actually
    used for rx and tx.

    Adjust the users, ip6_route_check_nh_onlink and ip6_route_check_nh to
    handle a fib6_info vs a rt6_info when performing validation checks.

    Existing selftests fib-onlink-tests.sh and fib_tests.sh are used to
    verify the changes.

    Signed-off-by: David Ahern
    Reviewed-by: Wei Wang
    Signed-off-by: David S. Miller

    David Ahern
     

27 Jun, 2019

2 commits

  • The scenario is the following: the user uses a raw socket to send an ipv6
    packet, destined to a non-connected network, and specifies a connected nh.
    Here is the corresponding python script to reproduce this scenario:

    import socket
    IPPROTO_RAW = 255
    send_s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, IPPROTO_RAW)
    # scapy
    # p = IPv6(src='fd00:100::1', dst='fd00:200::fa')/ICMPv6EchoRequest()
    # str(p)
    req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'
    send_s.sendto(req, ('fd00:175::2', 0, 0, 0))

    fd00:175::/64 is a connected route and fd00:200::fa is not a connected
    host.
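
To see which addresses the raw packet itself carries, the fixed 40-byte IPv6 header of the `req` bytes from the script can be decoded offline (no socket or root privileges needed). `parse_ipv6_header` is a hypothetical helper written for this illustration.

```python
import ipaddress
import struct

req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'


def parse_ipv6_header(packet):
    """Decode the fixed IPv6 header: version, lengths, next header, addresses."""
    first_word, payload_len, next_header, hop_limit = struct.unpack("!IHBB", packet[:8])
    return {
        "version": first_word >> 28,
        "payload_len": payload_len,
        "next_header": next_header,  # 58 is ICMPv6
        "hop_limit": hop_limit,
        "src": ipaddress.IPv6Address(packet[8:24]),
        "dst": ipaddress.IPv6Address(packet[24:40]),
    }


hdr = parse_ipv6_header(req)
print(hdr["src"], "->", hdr["dst"])  # fd00:100::1 -> fd00:200::fa
```

The destination inside the packet (fd00:200::fa) differs from the address passed to sendto() (fd00:175::2), which is the distinction the rest of this message relies on.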

    With this scenario, the kernel starts by sending a NS to resolve
    fd00:175::2. When it receives the NA, it flushes its queue and tries to send
    the initial packet. But instead of sending it, it sends another NS to
    resolve fd00:200::fa, which obviously fails, thus the packet is dropped. If
    the user sends the packet again, it now uses the right nh (fd00:175::2).

    The problem is that ip6_dst_lookup_neigh() uses the rt6i_gateway, which is
    :: because the associated route is a connected route, thus it uses the dst
    addr of the packet. Let's use rt6_nexthop() to choose the right nh.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • syzbot reminded us that rt6_nh_dump_exceptions() needs to be called
    with rcu_read_lock()

    net/ipv6/route.c:1593 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    rcu_scheduler_active = 2, debug_locks = 1
    2 locks held by syz-executor609/8966:
    #0: 00000000b7dbe288 (rtnl_mutex){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199
    #1: 00000000f2d87c21 (&(&tb->tb6_lock)->rlock){+...}, at: spin_lock_bh include/linux/spinlock.h:343 [inline]
    #1: 00000000f2d87c21 (&(&tb->tb6_lock)->rlock){+...}, at: fib6_dump_table.isra.0+0x37e/0x570 net/ipv6/ip6_fib.c:533

    stack backtrace:
    CPU: 0 PID: 8966 Comm: syz-executor609 Not tainted 5.2.0-rc5+ #43
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x172/0x1f0 lib/dump_stack.c:113
    lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5250
    fib6_nh_get_excptn_bucket+0x18e/0x1b0 net/ipv6/route.c:1593
    rt6_nh_dump_exceptions+0x45/0x4d0 net/ipv6/route.c:5541
    rt6_dump_route+0x904/0xc50 net/ipv6/route.c:5640
    fib6_dump_node+0x168/0x280 net/ipv6/ip6_fib.c:467
    fib6_walk_continue+0x4a9/0x8e0 net/ipv6/ip6_fib.c:1986
    fib6_walk+0x9d/0x100 net/ipv6/ip6_fib.c:2034
    fib6_dump_table.isra.0+0x38a/0x570 net/ipv6/ip6_fib.c:534
    inet6_dump_fib+0x93c/0xb00 net/ipv6/ip6_fib.c:624
    rtnl_dump_all+0x295/0x490 net/core/rtnetlink.c:3445
    netlink_dump+0x558/0xfb0 net/netlink/af_netlink.c:2244
    __netlink_dump_start+0x5b1/0x7d0 net/netlink/af_netlink.c:2352
    netlink_dump_start include/linux/netlink.h:226 [inline]
    rtnetlink_rcv_msg+0x73d/0xb00 net/core/rtnetlink.c:5182
    netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
    rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5237
    netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
    netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
    netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1917
    sock_sendmsg_nosec net/socket.c:646 [inline]
    sock_sendmsg+0xd7/0x130 net/socket.c:665
    sock_write_iter+0x27c/0x3e0 net/socket.c:994
    call_write_iter include/linux/fs.h:1872 [inline]
    new_sync_write+0x4d3/0x770 fs/read_write.c:483
    __vfs_write+0xe1/0x110 fs/read_write.c:496
    vfs_write+0x20c/0x580 fs/read_write.c:558
    ksys_write+0x14f/0x290 fs/read_write.c:611
    __do_sys_write fs/read_write.c:623 [inline]
    __se_sys_write fs/read_write.c:620 [inline]
    __x64_sys_write+0x73/0xb0 fs/read_write.c:620
    do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x4401b9
    Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007ffc8e134978 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401b9
    RDX: 000000000000001c RSI: 0000000020000000 RDI: 00

    Fixes: 1e47b4837f3b ("ipv6: Dump route exceptions if requested")
    Signed-off-by: Eric Dumazet
    Cc: Stefano Brivio
    Cc: David Ahern
    Reviewed-by: Stefano Brivio
    Signed-off-by: David S. Miller

    Eric Dumazet
     

26 Jun, 2019

1 commit


25 Jun, 2019

3 commits

  • Since commit 2b760fcf5cfb ("ipv6: hook up exception table to store dst
    cache"), route exceptions reside in a separate hash table, and won't be
    found by walking the FIB, so they won't be dumped to userspace on a
    RTM_GETROUTE message.

    This causes 'ip -6 route list cache' and 'ip -6 route flush cache' to
    have no function anymore:

    # ip -6 route get fc00:3::1
    fc00:3::1 via fc00:1::2 dev veth_A-R1 src fc00:1::1 metric 1024 expires 539sec mtu 1400 pref medium
    # ip -6 route get fc00:4::1
    fc00:4::1 via fc00:2::2 dev veth_A-R2 src fc00:2::1 metric 1024 expires 536sec mtu 1500 pref medium
    # ip -6 route list cache
    # ip -6 route flush cache
    # ip -6 route get fc00:3::1
    fc00:3::1 via fc00:1::2 dev veth_A-R1 src fc00:1::1 metric 1024 expires 520sec mtu 1400 pref medium
    # ip -6 route get fc00:4::1
    fc00:4::1 via fc00:2::2 dev veth_A-R2 src fc00:2::1 metric 1024 expires 519sec mtu 1500 pref medium

    because iproute2 lists cached routes using RTM_GETROUTE, and flushes them
    by listing all the routes, and deleting them with RTM_DELROUTE one by one.

    If cached routes are requested using the RTM_F_CLONED flag together with
    strict checking, or if no strict checking is requested (and hence we can't
    consistently apply filters), look up exceptions in the hash table
    associated with the current fib6_info in rt6_dump_route(), and, if present
    and not expired, add them to the dump.

    We might be unable to dump all the entries for a given node in a single
    message, so keep track of how many entries were handled for the current
    node in fib6_walker, and skip that amount in case we start from the same
    partially dumped node.

    When a partial dump restarts, as the starting node might change when
    'sernum' changes, we have no guarantee that we need to skip the same
    amount of in-node entries. Therefore, we need two counters, and we need to
    zero the in-node counter if the node from which the dump is resumed
    differs.
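
    The resume logic described above can be sketched in user space as
    follows. This is a simplified model, not the kernel code: the struct,
    the integer node id and the function name are hypothetical, and only the
    in-node counter is modelled.

```c
#include <stdbool.h>

/* Hypothetical walker state for this sketch; not the kernel's struct. */
struct dump_walker {
    int node_id;               /* node the previous pass stopped in, -1 if none */
    unsigned int skip_in_node; /* entries of that node already dumped */
};

/*
 * Dump up to `room` entries of a node holding `n_entries` entries.
 * Returns the number of entries dumped in this pass and sets *done
 * once the node has been dumped completely.
 */
static unsigned int dump_node(struct dump_walker *w, int node_id,
                              unsigned int n_entries, unsigned int room,
                              bool *done)
{
    unsigned int dumped = 0;
    unsigned int i;

    /* Resuming from a different node: the in-node counter is stale. */
    if (w->node_id != node_id) {
        w->node_id = node_id;
        w->skip_in_node = 0;
    }

    /* Skip what earlier passes already handled, then fill the message. */
    for (i = w->skip_in_node; i < n_entries && dumped < room; i++)
        dumped++;

    w->skip_in_node = i;
    *done = (i >= n_entries);
    return dumped;
}
```

    With room for three entries per message, a node with five entries takes
    two passes, and the counter is zeroed as soon as a pass starts from
    another node.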

    Note that, with the current version of iproute2, this only fixes the
    'ip -6 route list cache': on a flush command, iproute2 doesn't pass
    RTM_F_CLONED and, due to this inconsistency, 'ip -6 route flush cache' is
    still unable to fetch the routes to be flushed. This will be addressed in
    a patch for iproute2.

    To flush cached routes, a procfs entry could be introduced instead: that's
    how it works for IPv4. We already have a rt6_flush_exception() function
    ready to be wired to it. However, this would not solve the issue for
    listing.

    Versions of iproute2 and kernel tested:

                        iproute2
    kernel              4.14.0   4.15.0   4.19.0   5.0.0   5.1.0   5.1.0, patched
    3.18    list        +        +        +        +       +       +
            flush       +        +        +        +       +       +
    4.4     list        +        +        +        +       +       +
            flush       +        +        +        +       +       +
    4.9     list        +        +        +        +       +       +
            flush       +        +        +        +       +       +
    4.14    list        +        +        +        +       +       +
            flush       +        +        +        +       +       +
    4.15    list
            flush
    4.19    list
            flush
    5.0     list
            flush
    5.1     list
            flush
    with    list        +        +        +        +       +       +
    fix     flush       +        +        +                        +

    v7:
    - Explain usage of "skip" counters in commit message (suggested by
    David Ahern)

    v6:
    - Rebase onto net-next, use recently introduced nexthop walker
    - Make rt6_nh_dump_exceptions() a separate function (suggested by David
    Ahern)

    v5:
    - Use dump_routes and dump_exceptions from filter, ignore NLM_F_MATCH,
    update test results (flushing works with iproute2 < 5.0.0 now)

    v4:
    - Split NLM_F_MATCH and strict check handling in separate patches
    - Filter routes using RTM_F_CLONED: if it's not set, only return
    non-cached routes, and if it's set, only return cached routes:
    change requested by David Ahern and Martin Lau. This implies that
    iproute2 needs a separate patch to be able to flush IPv6 cached
    routes. This is not ideal because we can't fix the breakage caused
    by 2b760fcf5cfb entirely in kernel. However, two years have passed
    since then, and this makes it more tolerable

    v3:
    - More descriptive comment about expired exceptions in rt6_dump_route()
    - Swap return values of rt6_dump_route() (suggested by Martin Lau)
    - Don't zero skip_in_node in case we don't dump anything in a given pass
    (also suggested by Martin Lau)
    - Remove check on RTM_F_CLONED altogether: in the current UAPI semantic,
    it's just a flag to indicate the route was cloned, not to filter on
    routes

    v2: Add tracking of number of entries to be skipped in current node after
    a partial dump. As we restart from the same node, if not all the
    exceptions for a given node fit in a single message, the dump will
    not terminate, as suggested by Martin Lau. This is a concrete
    possibility: setting up a big number of exceptions for the same route
    actually causes the issue, as suggested by David Ahern.

    Reported-by: Jianlin Shi
    Fixes: 2b760fcf5cfb ("ipv6: hook up exception table to store dst cache")
    Signed-off-by: Stefano Brivio
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Stefano Brivio
     
  • In the next patch, we are going to add optional dump of exceptions to
    rt6_dump_route().

    Change the return code of rt6_dump_route() to accommodate partial node
    dumps: we might dump multiple routes per node, and might be able to dump
    only a given number of them, so fib6_dump_node() will need to know how
    many routes have been dumped on partial dump, to restart the dump from the
    point where it was interrupted.

    Note that fib6_dump_node() is the only caller and already handles all
    non-negative return codes as success: those become -1 to signal that we're
    done with the node. If we fail, return 0, as we were unable to dump the
    single route in the node, but we're not done with it.
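
    A minimal user-space sketch of this return convention, assuming a
    message with a fixed number of free slots. The names ROUTE_DUMP_DONE,
    dump_route() and dump_node_step() are hypothetical and only model the
    contract described above.

```c
/* Hypothetical return convention, modelled on the description above. */
#define ROUTE_DUMP_DONE (-1)

/*
 * Try to dump `n_entries` entries for one route, `skip` of which were
 * handled by an earlier pass, into a message with room for `room` more.
 * Returns ROUTE_DUMP_DONE if everything fit, otherwise the number of
 * entries handled in this pass (0 if nothing fit).
 */
static int dump_route(unsigned int n_entries, unsigned int skip,
                      unsigned int room)
{
    unsigned int left = n_entries > skip ? n_entries - skip : 0;

    if (left <= room)
        return ROUTE_DUMP_DONE;
    return (int)room;
}

/*
 * Caller side: returns 1 if the walk has to be suspended at this node,
 * 0 if it may move on to the next one.
 */
static int dump_node_step(unsigned int n_entries, unsigned int room,
                          unsigned int *skip_in_node)
{
    int res = dump_route(n_entries, *skip_in_node, room);

    if (res >= 0) {
        *skip_in_node += (unsigned int)res; /* resume after these */
        return 1;
    }
    *skip_in_node = 0; /* node finished */
    return 0;
}
```

    Returning 0 when nothing fits keeps the caller on the same node without
    advancing the skip count.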

    Signed-off-by: Stefano Brivio
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Stefano Brivio
     
  • If fc_nh_id isn't set, we shouldn't try to match against it. This
    actually matters just for the RTF_CACHE below (where this case is
    already handled): if iproute2 gets a route exception and tries to
    delete it, it won't reference it by fc_nh_id, even if a nexthop
    object might be associated to the originating route.
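
    The matching rule can be sketched as below. The structs and the helper
    name are hypothetical; the only assumption taken from the text is that
    an unset id (modelled here as 0) must not take part in the match.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical delete request: nh_id == 0 means "no nexthop id given". */
struct ex_route_cfg {
    uint32_t nh_id;
};

/* Hypothetical installed route, optionally tied to a nexthop object. */
struct ex_route {
    bool has_nh;
    uint32_t nh_id;
};

/* Only compare nexthop ids when the request actually carries one. */
static bool ex_nh_id_matches(const struct ex_route_cfg *cfg,
                             const struct ex_route *rt)
{
    if (!cfg->nh_id)
        return true;
    return rt->has_nh && rt->nh_id == cfg->nh_id;
}
```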

    Fixes: 5b98324ebe29 ("ipv6: Allow routes to use nexthop objects")
    Signed-off-by: Stefano Brivio
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Stefano Brivio
     

24 Jun, 2019

4 commits

  • For tx path, in most cases, we still have to take refcnt on the dst
    because the caller is caching the dst somewhere. But it is still
    beneficial to make use of the RT6_LOOKUP_F_DST_NOREF flag while doing the
    route lookup, because this flag prevents manipulating the refcnt on
    net->ipv6.ip6_null_entry when doing fib6_rule_lookup() to traverse each
    routing table. The null_entry is a shared object and constant updates on
    it cause false sharing.

    We converted the current major lookup function ip6_route_output_flags()
    to make use of RT6_LOOKUP_F_DST_NOREF.

    Together with the change in the rx path, we see a noticeable performance
    boost:
    I ran synflood tests between 2 hosts under the same switch. Both hosts
    have 20G mlx NIC, and 8 tx/rx queues.
    Sender sends pure SYN flood with random src IPs and ports using trafgen.
    Receiver has a simple TCP listener on the target port.
    Both hosts have multiple custom rules:
    - For incoming packets, only local table is traversed.
    - For outgoing packets, 3 tables are traversed to find the route.
    The packet processing rate on the receiver is as follows:
    - Before the fix: 3.78Mpps
    - After the fix: 5.50Mpps

    Signed-off-by: Wei Wang
    Signed-off-by: David S. Miller

    Wei Wang
     
  • ip6_route_input() is the key function to do the route lookup in the
    rx data path. All the callers to this function are already holding rcu
    lock. So it is fairly easy to convert it to not take refcnt on the dst:
    We pass in flag RT6_LOOKUP_F_DST_NOREF and do skb_dst_set_noref().
    This saves a few atomic inc or dec operations and should boost
    performance overall.
    This also makes the logic more aligned with v4.

    Signed-off-by: Wei Wang
    Acked-by: Eric Dumazet
    Acked-by: Mahesh Bandewar
    Signed-off-by: David S. Miller

    Wei Wang
     
  • Initialize rt6->rt6i_uncached on the following pre-allocated dsts:
    net->ipv6.ip6_null_entry
    net->ipv6.ip6_prohibit_entry
    net->ipv6.ip6_blk_hole_entry

    This is a preparation patch for later commits to be able to distinguish
    dst entries in uncached list by doing:
    !list_empty(rt6->rt6i_uncached)
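
    Why the initialization matters can be shown with a small user-space
    stand-in for the kernel's circular list. The helper names and the
    struct ex_route type are hypothetical; the point is that an initialized
    but unlinked head reads as empty, so the test only fires for entries
    that were actually added to a list.

```c
#include <stdbool.h>

/* Minimal user-space stand-in for the kernel's circular list head. */
struct ex_list_head {
    struct ex_list_head *next, *prev;
};

static void ex_init_list_head(struct ex_list_head *h)
{
    h->next = h;
    h->prev = h;
}

static bool ex_list_empty(const struct ex_list_head *h)
{
    return h->next == h;
}

static void ex_list_add(struct ex_list_head *n, struct ex_list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Hypothetical route entry; only the list linkage matters here. */
struct ex_route {
    struct ex_list_head uncached;
};

/* True only for entries that were linked into some list. */
static bool ex_route_is_uncached(const struct ex_route *rt)
{
    return !ex_list_empty(&rt->uncached);
}
```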

    Signed-off-by: Wei Wang
    Acked-by: Eric Dumazet
    Acked-by: Mahesh Bandewar
    Signed-off-by: David S. Miller

    Wei Wang
     
  • The new RT6_LOOKUP_F_DST_NOREF flag instructs the route lookup function
    not to take a refcnt on the dst entry. A user doing a route lookup with
    this flag must properly use rcu protection.
    ip6_pol_route() is the major route lookup function for both tx and rx
    path.
    In this function:
    Do not take refcnt on dst if RT6_LOOKUP_F_DST_NOREF flag is set, and
    directly return the route entry. The caller should be holding rcu lock
    when using this flag, and decide whether to take refcnt or not.

    One note on the dst cache in the uncached_list:
    As uncached_list does not consume refcnt, one refcnt is always returned
    back to the caller even if RT6_LOOKUP_F_DST_NOREF flag is set.
    Uncached dst is only possible in the output path. So in such call path,
    caller MUST check if the dst is in the uncached_list before assuming
    that there is no refcnt taken on the returned dst.
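
    The caller-side rule can be modelled as follows. This is a sketch under
    stated assumptions: the flag value, the struct and all function names
    are hypothetical, and rcu protection is left out entirely.

```c
#include <stdbool.h>

#define EX_LOOKUP_F_DST_NOREF 0x1 /* hypothetical stand-in for the flag */

/* Hypothetical dst entry with just the fields this sketch needs. */
struct ex_dst {
    int refcnt;
    bool on_uncached_list;
};

/* Lookup side: skip the refcnt only for NOREF lookups of cached entries. */
static struct ex_dst *ex_lookup(struct ex_dst *d, int flags)
{
    if (!(flags & EX_LOOKUP_F_DST_NOREF) || d->on_uncached_list)
        d->refcnt++;
    return d;
}

/* Caller side: true if the lookup above left a reference to drop. */
static bool ex_holds_ref(const struct ex_dst *d, int flags)
{
    return !(flags & EX_LOOKUP_F_DST_NOREF) || d->on_uncached_list;
}

static void ex_release(struct ex_dst *d, int flags)
{
    if (ex_holds_ref(d, flags))
        d->refcnt--;
}
```

    A caller that skipped the uncached check would leak one reference for
    every uncached dst returned by a NOREF lookup.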

    Signed-off-by: Wei Wang
    Acked-by: Eric Dumazet
    Acked-by: Mahesh Bandewar
    Signed-off-by: David S. Miller

    Wei Wang
     

23 Jun, 2019

1 commit

  • When user space sends invalid information in RTA_MULTIPATH, the nexthop
    list in ip6_route_multipath_add() is empty and 'rt_notif' is set to
    NULL.

    The code that emits the in-kernel notifications does not check for this
    condition, which results in a NULL pointer dereference [1].

    Fix this by bailing earlier in the function if the parsed nexthop list
    is empty. This is consistent with the corresponding IPv4 code.
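
    The shape of such an early bail-out can be sketched in user space as
    below. All names and the error text are hypothetical; the sketch only
    shows the emptiness check happening before the first nexthop is
    dereferenced.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical parsed nexthop; only the list linkage matters here. */
struct ex_nexthop {
    struct ex_nexthop *next;
    int ifindex;
};

/* Stand-in for the notifier: dereferences the first nexthop. */
static int ex_notify(const struct ex_nexthop *first)
{
    return first->ifindex;
}

/*
 * Returns 0 on success or a negative errno. An empty list is rejected
 * up front, with an error message for the caller, so ex_notify() never
 * sees a NULL pointer.
 */
static int ex_multipath_add(const struct ex_nexthop *nh_list,
                            const char **errmsg, int *notified_ifindex)
{
    if (!nh_list) {
        *errmsg = "no valid nexthops"; /* illustrative text */
        return -EINVAL;
    }
    *notified_ifindex = ex_notify(nh_list);
    return 0;
}
```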

    v2:
    * Check if parsed nexthop list is empty and bail with extack set

    [1]
    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 0 PID: 9190 Comm: syz-executor149 Not tainted 5.2.0-rc5+ #38
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    RIP: 0010:call_fib6_multipath_entry_notifiers+0xd1/0x1a0
    net/ipv6/ip6_fib.c:396
    Code: 8b b5 30 ff ff ff 48 c7 85 68 ff ff ff 00 00 00 00 48 c7 85 70 ff ff
    ff 00 00 00 00 89 45 88 4c 89 e0 48 c1 e8 03 4c 89 65 80 80 3c 28 00
    0f 85 9a 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d
    RSP: 0018:ffff88809788f2c0 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 1ffff11012f11e59 RCX: 00000000ffffffff
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: ffff88809788f390 R08: ffff88809788f8c0 R09: 000000000000000c
    R10: ffff88809788f5d8 R11: ffff88809788f527 R12: 0000000000000000
    R13: dffffc0000000000 R14: ffff88809788f8c0 R15: ffffffff89541d80
    FS: 000055555632c880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020000080 CR3: 000000009ba7c000 CR4: 00000000001406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    ip6_route_multipath_add+0xc55/0x1490 net/ipv6/route.c:5094
    inet6_rtm_newroute+0xed/0x180 net/ipv6/route.c:5208
    rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5219
    netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
    rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5237
    netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
    netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
    netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1917
    sock_sendmsg_nosec net/socket.c:646 [inline]
    sock_sendmsg+0xd7/0x130 net/socket.c:665
    ___sys_sendmsg+0x803/0x920 net/socket.c:2286
    __sys_sendmsg+0x105/0x1d0 net/socket.c:2324
    __do_sys_sendmsg net/socket.c:2333 [inline]
    __se_sys_sendmsg net/socket.c:2331 [inline]
    __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2331
    do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x4401f9
    Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7
    48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff
    ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007ffc09fd0028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401f9
    RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
    RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401a80
    R13: 0000000000401b10 R14: 0000000000000000 R15: 0000000000000000

    Reported-by: syzbot+382566d339d52cd1a204@syzkaller.appspotmail.com
    Fixes: ebee3cad835f ("ipv6: Add IPv6 multipath notifications for add / replace")
    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Ido Schimmel
     

22 Jun, 2019

1 commit


20 Jun, 2019

1 commit

  • A user reported that routes are getting installed with type 0 (RTN_UNSPEC)
    where before the routes were RTN_UNICAST. One example is from accel-ppp
    which apparently still uses the ioctl interface and does not set
    rtmsg_type. Another is the netlink interface where ipv6 does not require
    rtm_type to be set (v4 does). Prior to the commit in the Fixes tag the
    ipv6 stack converted type 0 to RTN_UNICAST, so restore that behavior.
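
    The restored behavior amounts to a default on the route type. A minimal
    sketch, assuming local constants that mirror the UAPI values of
    RTN_UNSPEC (0), RTN_UNICAST (1) and RTN_LOCAL (2); the helper name is
    hypothetical.

```c
/* Local copies of the route type values, prefixed to avoid clashes. */
enum {
    EX_RTN_UNSPEC  = 0,
    EX_RTN_UNICAST = 1,
    EX_RTN_LOCAL   = 2
};

/* Treat an unset type as unicast; leave explicit types alone. */
static unsigned char ex_normalize_route_type(unsigned char type)
{
    return type == EX_RTN_UNSPEC ? EX_RTN_UNICAST : type;
}
```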

    Fixes: e8478e80e5a7 ("net/ipv6: Save route type in rt6_info")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

19 Jun, 2019

2 commits