28 Oct, 2016

1 commit

  • rt6_add_route_info and rt6_add_dflt_router were updated to pull the FIB
    table from the device index, but the corresponding rt6_get_route_info
    and rt6_get_dflt_router functions were not, leading to a failure to
    process RAs:

    ICMPv6: RA: ndisc_router_discovery failed to add default route

    Fix the 'get' functions by using the table id associated with the
    device when applicable.

    Also, now that default routes can be added to tables other than the
    default table, rt6_purge_dflt_routers needs to be updated as well to
    look at all tables. To handle that efficiently, add a flag to the table
    denoting whether it has a default route via RA.

    Fixes: ca254490c8dfd ("net: Add VRF support to IPv6 stack")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
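
The per-table flag described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct table`, `struct route`, `table_add_dflt_router()` and `purge_dflt_routers()` are hypothetical stand-ins, not the kernel's fib6 structures or functions. It shows why the flag helps: the purge visits every table but only walks the ones marked as holding an RA-learned default route.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROUTES 8

/* Hypothetical, simplified route: only what the sketch needs. */
struct route {
    bool in_use;
    bool dflt_from_ra;    /* default route learned via Router Advertisement */
};

/* Hypothetical, simplified FIB table carrying the per-table flag. */
struct table {
    unsigned int id;
    bool has_dflt_router; /* set when an RA default route is added */
    struct route routes[MAX_ROUTES];
};

/* Add an RA-learned default route and mark the table. Returns 0 or -1. */
static int table_add_dflt_router(struct table *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        if (!t->routes[i].in_use) {
            t->routes[i].in_use = true;
            t->routes[i].dflt_from_ra = true;
            t->has_dflt_router = true;
            return 0;
        }
    }
    return -1; /* table full */
}

/* Purge RA default routes from all tables, skipping unflagged tables.
 * Returns how many tables actually had to be walked. */
static size_t purge_dflt_routers(struct table *tables, size_t ntables)
{
    size_t walked = 0;

    for (size_t i = 0; i < ntables; i++) {
        struct table *t = &tables[i];

        if (!t->has_dflt_router)
            continue; /* cheap skip thanks to the flag */
        walked++;
        for (size_t j = 0; j < MAX_ROUTES; j++) {
            if (t->routes[j].in_use && t->routes[j].dflt_from_ra)
                t->routes[j] = (struct route){0};
        }
        t->has_dflt_router = false;
    }
    return walked;
}
```

With three tables of which one holds an RA default route, `purge_dflt_routers()` walks exactly one table, and a second call walks none.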
     

16 Nov, 2015

1 commit

  • All DST_NOCACHE rt6_info used to have rt->dst.from set to
    its parent.

    After commit 8e3d5be73681 ("ipv6: Avoid double dst_free"),
    DST_NOCACHE is also set on rt6_info entries that do not have
    a parent (i.e. rt->dst.from is NULL).

    This patch catches the rt->dst.from == NULL case.

    Fixes: 8e3d5be73681 ("ipv6: Avoid double dst_free")
    Signed-off-by: Martin KaFai Lau
    Cc: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

18 Sep, 2015

1 commit


21 Aug, 2015

1 commit

  • Currently, the lwtunnel state resides in per-protocol data. This is
    a problem if we encapsulate ipv6 traffic in an ipv4 tunnel (or vice versa).
    The xmit function of the tunnel does not know whether the packet has been
    routed to it by ipv4 or ipv6, yet it needs the lwtstate data. Moving the
    lwtstate data to dst_entry makes such inter-protocol tunneling possible.

    As a bonus, this brings a nice diffstat.

    Signed-off-by: Jiri Benc
    Acked-by: Roopa Prabhu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Jiri Benc
     

22 Jul, 2015

1 commit


26 May, 2015

4 commits

  • After the patch
    'ipv6: Only create RTF_CACHE routes after encountering pmtu exception',
    we need to compensate for the performance hit (bouncing dst->__refcnt).

    Signed-off-by: Martin KaFai Lau
    Cc: Hannes Frederic Sowa
    Cc: Steffen Klassert
    Cc: Julian Anastasov
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • This patch keeps track of the DST_NOCACHE routes in a list and replaces their
    dev with loopback during the iface down/unregister event.

    Signed-off-by: Martin KaFai Lau
    Cc: Hannes Frederic Sowa
    Cc: Steffen Klassert
    Cc: Julian Anastasov
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • This patch always creates RTF_CACHE clone with DST_NOCACHE
    when FLOWI_FLAG_KNOWN_NH is set so that the rt6i_dst is set to
    the fl6->daddr.

    Signed-off-by: Martin KaFai Lau
    Acked-by: Julian Anastasov
    Tested-by: Julian Anastasov
    Cc: Hannes Frederic Sowa
    Cc: Steffen Klassert
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • Instead of doing the rt6->rt6i_node check whenever we need
    to get the route's cookie, refactor it into rt6_get_cookie().
    This is prep work to handle FLOWI_FLAG_KNOWN_NH and also
    percpu rt6_info later.

    Signed-off-by: Martin KaFai Lau
    Cc: Hannes Frederic Sowa
    Cc: Steffen Klassert
    Cc: Julian Anastasov
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

02 May, 2015

2 commits

  • _rt6i_peer is no longer needed after the last patch,
    'ipv6: Stop rt6_info from using inet_peer's metrics'.

    DST_METRICS_FORCE_OVERWRITE is added by
    commit e5fd387ad5b3 ("ipv6: do not overwrite inetpeer metrics prematurely").
    Since inetpeer is no longer used for metrics, this bit is also not needed.

    Signed-off-by: Martin KaFai Lau
    Reviewed-by: Hannes Frederic Sowa
    Cc: Michal Kubeček
    Cc: Steffen Klassert
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • inet_peer is indexed by the dst address alone. However, the fib6 tree
    could have multiple routing entries (rt6_info) for the same dst. For
    example,
    1. A /128 dst via multiple gateways.
    2. A RTF_CACHE route cloned from a /128 route.

    In the above cases, all of them will share the same metrics and
    step on each other.

    This patch will steer away from inet_peer's metrics and use
    dst_cow_metrics_generic() for everything.

    Change Highlights:
    1. Remove rt6_cow_metrics() which currently acquires metrics from
    inet_peer for DST_HOST route (i.e. /128 route).
    2. Add rt6i_pmtu to take care of the pmtu update to avoid creating a
    full size metrics just to override the RTAX_MTU.
    3. After (2), the RTF_CACHE route can also share the metrics with its
    dst.from route, by:
    dst_init_metrics(&cache_rt->dst, dst_metrics_ptr(cache_rt->dst.from), true);
    4. Stop creating RTF_CACHE route by cloning another RTF_CACHE route. Instead,
    directly clone from rt->dst.

    [ Currently, cloning from another RTF_CACHE is only possible during
    rt6_do_redirect(). Also, the old clone is removed from the tree
    immediately after the new clone is added. ]

    In case of cloning from an older redirect RTF_CACHE, it should work as
    before.

    In case of cloning from an older pmtu RTF_CACHE, this patch will forget
    the pmtu and re-learn it (if there is any) from the redirected route.

    The _rt6i_peer and DST_METRICS_FORCE_OVERWRITE will be removed
    in the next cleanup patch.

    Signed-off-by: Martin KaFai Lau
    Reviewed-by: Hannes Frederic Sowa
    Cc: Steffen Klassert
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

06 Jan, 2015

1 commit


07 Oct, 2014

2 commits


01 Oct, 2014

1 commit

  • Eric Dumazet noticed that all no-nexthop or no-gateway routes which
    are already marked DST_HOST (e.g. input routes) will always be
    invalidated during sk_dst_check. Thus per-socket dst caching and
    early demuxing had absolutely no effect.

    Thus this patch removes rt6i_genid: fn_sernum already gets modified during
    add operations, so we only have to ensure we mutate fn_sernum during ipv6
    address remove operations. This is a fairly expensive operation,
    but address removal should not happen that often. Also, our mtu update
    functions do the same and we have heard no complaints so far. xfrm policy
    changes also cause a call into fib6_flush_trees. Also plug a hole in
    rt6_info (no cacheline changes).

    I verified via tracing that this change has effect.

    Cc: Eric Dumazet
    Cc: YOSHIFUJI Hideaki
    Cc: Vlad Yasevich
    Cc: Nicolas Dichtel
    Cc: Martin Lau
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

28 Mar, 2014

1 commit

  • If an IPv6 host route with metrics exists, an attempt to add a
    new route for the same target with different metrics fails but
    rewrites the metrics anyway:

    12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000
    12sp0:~ # ip -6 route show
    fe80::/64 dev eth0 proto kernel metric 256
    fec0::1 dev eth0 metric 1024 rto_min lock 1s
    12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1500
    RTNETLINK answers: File exists
    12sp0:~ # ip -6 route show
    fe80::/64 dev eth0 proto kernel metric 256
    fec0::1 dev eth0 metric 1024 rto_min lock 1.5s

    This is caused by all IPv6 host routes using the metrics in
    their inetpeer (or the shared default). This also holds for the
    new route created in ip6_route_add() which shares the metrics
    with the already existing route and thus ip6_route_add()
    rewrites the metrics even if the new route ends up not being
    used at all.

    Another problem is that old metrics in inetpeer can reappear
    unexpectedly for a new route, e.g.

    12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000
    12sp0:~ # ip route del fec0::1
    12sp0:~ # ip route add fec0::1 dev eth0
    12sp0:~ # ip route change fec0::1 dev eth0 hoplimit 10
    12sp0:~ # ip -6 route show
    fe80::/64 dev eth0 proto kernel metric 256
    fec0::1 dev eth0 metric 1024 hoplimit 10 rto_min lock 1s

    Resolve the first problem by moving the setting of metrics down
    into fib6_add_rt2node() to the point where we are sure we are
    inserting the new route into the tree. The second problem is
    addressed by introducing a new flag, DST_METRICS_FORCE_OVERWRITE,
    which is set for a new host route in ip6_route_add() and makes
    ipv6_cow_metrics() always overwrite the metrics in inetpeer
    (even if they are not "new"); it is reset after that.

    v5: use a flag in _metrics member rather than one in flags

    v4: fix a typo making a condition always true (thanks to Hannes
    Frederic Sowa)

    v3: rewritten based on David Miller's idea to move setting the
    metrics (and allocation in non-host case) down to the point we
    already know the route is to be inserted. Also rebased to
    net-next as it is quite late in the cycle.

    Signed-off-by: Michal Kubecek
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Michal Kubeček
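
The first problem above is an ordering bug, which a small userspace model can make concrete. This is a sketch under stated assumptions: `struct host_entry`, `lookup()`, `add_route_buggy()` and `add_route_fixed()` are hypothetical names, and the single `rto_min_ms` field stands in for the shared metrics. It does not reproduce ip6_route_add() or fib6_add_rt2node(); it only contrasts writing the shared metrics before versus after the duplicate check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MAX_HOSTS 4

/* Hypothetical model: one metrics slot per destination, standing in for
 * the metrics that every host route to that destination shares. */
struct host_entry {
    bool present;
    char dst[48];
    unsigned int rto_min_ms;
};

static struct host_entry tree[MAX_HOSTS];

static struct host_entry *lookup(const char *dst)
{
    for (int i = 0; i < MAX_HOSTS; i++)
        if (tree[i].present && strcmp(tree[i].dst, dst) == 0)
            return &tree[i];
    return NULL;
}

static int insert_new(const char *dst, unsigned int rto_min_ms)
{
    for (int i = 0; i < MAX_HOSTS; i++) {
        if (!tree[i].present) {
            tree[i].present = true;
            strncpy(tree[i].dst, dst, sizeof(tree[i].dst) - 1);
            tree[i].dst[sizeof(tree[i].dst) - 1] = '\0';
            tree[i].rto_min_ms = rto_min_ms;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Buggy order: the shared metrics are written before we know whether
 * the route will be inserted, so a failed add still changes them. */
static int add_route_buggy(const char *dst, unsigned int rto_min_ms)
{
    struct host_entry *e = lookup(dst);

    if (e) {
        e->rto_min_ms = rto_min_ms; /* overwritten too early */
        return -EEXIST;
    }
    return insert_new(dst, rto_min_ms);
}

/* Fixed order: metrics are only set at the point of insertion. */
static int add_route_fixed(const char *dst, unsigned int rto_min_ms)
{
    if (lookup(dst))
        return -EEXIST; /* existing metrics left untouched */
    return insert_new(dst, rto_min_ms);
}
```

Adding fec0::1 with 1000 and then again with 1500 fails with -EEXIST in both variants, but only the buggy variant leaves the stored value at 1500, matching the "rto_min lock 1.5s" output in the transcript above.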
     

02 Jan, 2014

1 commit


05 Nov, 2013

1 commit

  • Conflicts:
    drivers/net/ethernet/emulex/benet/be.h
    drivers/net/netconsole.c
    net/bridge/br_private.h

    Three mostly trivial conflicts.

    The net/bridge/br_private.h conflict was a function signature (argument
    addition) change overlapping with the extern removals from Joe Perches.

    In drivers/net/netconsole.c we had one change adjusting a printk message
    whilst another changed "printk(KERN_INFO" into "pr_info(".

    Lastly, the emulex change was a new inline function addition overlapping
    with Joe Perches's extern removals.

    Signed-off-by: David S. Miller

    David S. Miller
     

26 Oct, 2013

1 commit

  • On receiving a packet too big icmp error we update the expire value by
    calling rt6_update_expires. This function uses dst_set_expires, which is
    implemented such that it can only reduce the expiration value of the dst
    entry.

    If we insert new non-expiring routing information into the ipv6 fib where
    we already have a matching rt6_info, we only clear the RTF_EXPIRES flag
    in rt6i_flags and leave the dst.expires value as is.

    When new mtu information arrives for that cached dst_entry we again
    call dst_set_expires. This time it won't update the dst.expires value
    because we left the old (earlier) dst.expires value intact from the
    last update.

    Fix this by resetting dst.expires when clearing the RTF_EXPIRES flag.
    dst_set_expires checks for a zero expiration and updates
    dst.expires.

    In the past this (not updating dst.expires) was necessary because
    dst.expires was placed in a union with the dst_entry *from reference
    and rt6_clean_expires did assign NULL to it. This split happened in
    ecd9883724b78cc72ed92c98bcb1a46c764fff21 ("ipv6: fix race condition
    regarding dst->expires and dst->from").

    Reported-by: Steinar H. Gunderson
    Reported-by: Valentijn Sessink
    Cc: YOSHIFUJI Hideaki
    Acked-by: Eric Dumazet
    Tested-by: Valentijn Sessink
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
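
The interaction described above can be replayed in a few lines of userspace C. This is a minimal sketch, not kernel code: `struct model_rt`, `model_set_expires()` and `model_clean_expires()` are hypothetical names, the flag bit is a stand-in rather than the kernel's RTF_EXPIRES value, the current time is passed in explicitly instead of reading jiffies, and a plain `<` replaces the wraparound-safe time comparison.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_RTF_EXPIRES 0x1u /* stand-in bit, not the kernel's value */

struct model_rt {
    unsigned int flags;
    unsigned long expires; /* 0 means "not set" */
};

/* Models the rule stated in the commit message: the stored expiry is
 * only written when it is unset (0) or when the new value is earlier. */
static void model_set_expires(struct model_rt *rt, unsigned long now,
                              unsigned long timeout)
{
    unsigned long expires = now + timeout;

    assert(rt != 0);
    if (expires == 0)
        expires = 1; /* keep 0 reserved for "not set" */
    if (rt->expires == 0 || expires < rt->expires)
        rt->expires = expires;
    rt->flags |= MODEL_RTF_EXPIRES;
}

/* Clears the flag. With reset_expires == false this models the old
 * behaviour (stale value left behind); true models the fix. */
static void model_clean_expires(struct model_rt *rt, bool reset_expires)
{
    rt->flags &= ~MODEL_RTF_EXPIRES;
    if (reset_expires)
        rt->expires = 0;
}
```

Starting from an expiry of 150, clearing the flag without resetting and then asking for 1600 leaves the value stuck at 150. Clearing with the reset lets the later update take effect.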
     

28 Sep, 2013

1 commit

  • Dumping routes on a system with lots of rt6_infos in the fibs causes up to
    11-order allocations in seq_file (which fail). While we could switch
    to vmalloc there, we can instead implement the streaming interface for
    /proc/net/ipv6_route. This patch switches /proc/net/ipv6_route from
    single_open_net to seq_open_net.

    loff_t *pos tracks dst entries.

    Also kill never used struct rt6_proc_arg and now unused function
    fib6_clean_all_ro.

    Cc: Ben Greear
    Cc: Patrick McHardy
    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

22 Sep, 2013

1 commit

  • There is a mix of function prototypes with and without extern
    in the kernel sources. Standardize on not using extern for
    function prototypes.

    Function prototypes don't need to be written with extern.
    extern is assumed by the compiler. Its use is as unnecessary as
    using auto to declare automatic/local variables in a block.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

02 Aug, 2013

1 commit

  • On a high-traffic router with many processors and many IPv6 dst
    entries, soft lockup in fib6_run_gc() can occur when number of
    entries reaches gc_thresh.

    This happens because fib6_run_gc() uses fib6_gc_lock to allow
    only one thread to run the garbage collector but ip6_dst_gc()
    doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
    returns. On a system with many entries, this can take some time
    so that in the meantime, other threads pass the tests in
    ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
    the lock. They then have to run the garbage collector one after
    another which blocks them for quite long.

    Resolve this by replacing special value ~0UL of expire parameter
    to fib6_run_gc() by explicit "force" parameter to choose between
    spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
    force=false if gc_thresh is reached but not max_size.

    Signed-off-by: Michal Kubecek
    Signed-off-by: David S. Miller

    Michal Kubeček
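
The "force" parameter described above boils down to choosing between a blocking lock and a trylock. The following userspace sketch models that choice with a C11 `atomic_flag` in place of the kernel spinlock. All names (`model_run_gc()`, `model_dst_gc()`, `gc_trylock()`, and so on) are hypothetical, the real ip6_dst_gc() performs additional checks that are omitted here, and treating "max_size reached" as `entries >= max_size` is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the gc lock; a set flag means "held". */
static atomic_flag gc_lock = ATOMIC_FLAG_INIT;
static unsigned int gc_runs;

static bool gc_trylock(void)
{
    /* test_and_set returns the previous state: false means we got it. */
    return !atomic_flag_test_and_set(&gc_lock);
}

static void gc_lock_spin(void)
{
    while (atomic_flag_test_and_set(&gc_lock))
        ; /* busy-wait until the holder releases the lock */
}

static void gc_unlock(void)
{
    atomic_flag_clear(&gc_lock);
}

/* Returns true if this caller actually ran the collector. */
static bool model_run_gc(bool force)
{
    if (force)
        gc_lock_spin();      /* must run: wait for the lock */
    else if (!gc_trylock())
        return false;        /* someone else is collecting: skip */

    gc_runs++;               /* the real work would happen here */
    gc_unlock();
    return true;
}

/* Simplified caller: only force the collector once max_size is hit. */
static bool model_dst_gc(unsigned int entries, unsigned int gc_thresh,
                         unsigned int max_size)
{
    if (entries < gc_thresh)
        return false;
    return model_run_gc(entries >= max_size);
}
```

When the lock is already held, a non-forced call returns immediately instead of queueing behind the current collector, which is the behaviour that avoids the pile-up described in the commit message.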
     

21 Feb, 2013

1 commit

  • Eric Dumazet wrote:
    | Some strange crashes happen in rt6_check_expired(), with access
    | to random addresses.
    |
    | At first glance, it looks like the RTF_EXPIRES and
    | stuff added in commit 1716a96101c49186b
    | (ipv6: fix problem with expired dst cache)
    | are racy : same dst could be manipulated at the same time
    | on different cpus.
    |
    | At some point, our stack believes rt->dst.from contains a dst pointer,
    | while its really a jiffie value (as rt->dst.expires shares the same area
    | of memory)
    |
    | rt6_update_expires() should be fixed, or am I missing something ?
    |
    | CC Neil because of https://bugzilla.redhat.com/show_bug.cgi?id=892060

    Because we do not have any locks for dst_entry, we cannot change
    essential structure in the entry; e.g., we cannot change reference
    to other entity.

    To fix this issue, split the 'from' and 'expires' fields in dst_entry
    out of the union. Once 'from' is assigned in the constructor,
    keep the reference until the very last stage of the lifetime of
    the object.

    Of course, it is unsafe to change 'from', so make rt6_set_from simple
    and use it only for fresh entries.

    Reported-by: Eric Dumazet
    Reported-by: Neil Horman
    CC: Gao Feng
    Signed-off-by: YOSHIFUJI Hideaki
    Reviewed-by: Eric Dumazet
    Reported-by: Steinar H. Gunderson
    Reviewed-by: Neil Horman
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki / 吉藤英明
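
The hazard of sharing one storage cell between a pointer and a timestamp can be shown with a tiny layout model. This is an illustrative sketch only: `struct shared_rt` and `struct split_rt` are hypothetical, single-threaded stand-ins for the before and after layouts, and the race itself is not reproduced. The point is simply that writing the expiry through the union destroys the parent pointer, while separate fields keep both.

```c
#include <assert.h>

/* Before: 'from' and 'expires' occupy the same memory. */
struct shared_rt {
    union {
        const void *from;      /* parent route */
        unsigned long expires; /* timestamp */
    } u;
};

/* After: two independent fields. */
struct split_rt {
    const void *from;
    unsigned long expires;
};

/* Writing the expiry through the union overwrites the pointer bytes. */
static void shared_set_expires(struct shared_rt *rt, unsigned long e)
{
    assert(rt != 0);
    rt->u.expires = e;
}

/* With split fields, 'from' is unaffected by expiry updates. */
static void split_set_expires(struct split_rt *rt, unsigned long e)
{
    assert(rt != 0);
    rt->expires = e;
}
```

After `shared_set_expires()` the `from` member no longer points at the parent, so any code that still believes it holds a pointer would dereference a timestamp. With the split layout the pointer survives every expiry update.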
     

18 Jan, 2013

1 commit


09 Nov, 2012

1 commit

  • 6431cbc25f (Create a mechanism for upward inetpeer propagation into routes)
    introduced this code, but the mechanism is never enabled since
    rt6i_peer_genid is always zero, whether it is unassigned or assigned by
    rt6_peer_genid(). After 5943634fc5 (ipv4: Maintain redirect and PMTU info
    in struct rtable again), the ipv4-related code of this mechanism has been
    removed, so I think we may be able to remove this code now as well.

    Signed-off-by: Li RongQing
    Signed-off-by: David S. Miller

    Li RongQing
     

04 Nov, 2012

1 commit

  • As suggested by Eric, we could introduce a helper function
    for ipv6 too, to avoid checking if rt is NULL before
    dst_release().

    Cc: Eric Dumazet
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Amerigo Wang
     

23 Oct, 2012

1 commit

  • Each nexthop is added like a single route in the routing table. All routes
    that have the same metric/weight and destination but not the same gateway
    are considered ECMP routes. They are linked together through a list called
    rt6i_siblings.

    ECMP routes can be added in one shot, with the RTA_MULTIPATH attribute, or one
    after the other (in both cases, the flag NLM_F_EXCL should not be set).

    The patch is based on previous work from Luc Saillard.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
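
The sibling rule stated above (same destination and metric, different gateway) can be written as a small predicate. This is a sketch under stated assumptions: `struct model_route`, `is_ecmp_sibling()` and `count_siblings()` are hypothetical, routes are compared as strings for brevity, and the kernel's actual rt6i_siblings list handling is not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified route: destination prefix, gateway, metric. */
struct model_route {
    char dst[48];
    char gw[48];
    unsigned int metric;
};

/* Two routes are ECMP siblings when they share destination and metric
 * but go through different gateways. */
static bool is_ecmp_sibling(const struct model_route *a,
                            const struct model_route *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a->dst, b->dst) == 0 &&
           a->metric == b->metric &&
           strcmp(a->gw, b->gw) != 0;
}

/* Count the siblings of routes[idx] among the n routes given. */
static size_t count_siblings(const struct model_route *routes, size_t n,
                             size_t idx)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (i != idx && is_ecmp_sibling(&routes[idx], &routes[i]))
            count++;
    return count;
}
```

A route to the same prefix with a different metric, or to a different prefix through the same gateway, is not counted as a sibling.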
     

29 Sep, 2012

1 commit

  • Conflicts:
    drivers/net/team/team.c
    drivers/net/usb/qmi_wwan.c
    net/batman-adv/bat_iv_ogm.c
    net/ipv4/fib_frontend.c
    net/ipv4/route.c
    net/l2tp/l2tp_netlink.c

    The team, fib_frontend, route, and l2tp_netlink conflicts were simply
    overlapping changes.

    qmi_wwan and bat_iv_ogm were of the "use HEAD" variety.

    With help from Antonio Quartulli.

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Sep, 2012

1 commit

  • IPv6 dst should take care of rt_genid too. When an xfrm policy is inserted or
    deleted, all dsts should be invalidated.
    To force the validation, dst entries should be created with ->obsolete set to
    DST_OBSOLETE_FORCE_CHK. This was already the case for all functions calling
    ip6_dst_alloc(), except for ip6_rt_copy().

    As a consequence, we can remove the specific code in inet6_connection_sock.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

06 Sep, 2012

1 commit

  • When adding a blackhole or a prohibit route, it was handled like a classic
    route. Moreover, it was only possible to add this kind of route by specifying
    an interface.

    Bug already reported here:
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=498498

    Before the patch:
    $ ip route add blackhole 2001::1/128
    RTNETLINK answers: No such device
    $ ip route add blackhole 2001::1/128 dev eth0
    $ ip -6 route | grep 2001
    2001::1 dev eth0 metric 1024

    After:
    $ ip route add blackhole 2001::1/128
    $ ip -6 route | grep 2001
    blackhole 2001::1 dev lo metric 1024 error -22

    v2: wrong patch
    v3: add a field fc_type in struct fib6_config to store RTN_* type

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

05 Jul, 2012

1 commit


16 Jun, 2012

4 commits


11 Jun, 2012

2 commits


18 Apr, 2012

2 commits

  • Functionally, this change is a NOP.

    Semantically, rt6_clean_expires() wants to do rt->dst.from = NULL instead of
    rt->dst.expires = 0. It is clearing the RTF_EXPIRES flag, so the union is going
    to be treated as a pointer (dst.from) not a long (dst.expires).

    Signed-off-by: Jiri Bohac
    Signed-off-by: David S. Miller

    Jiri Bohac
     
  • Commit 1716a961 (ipv6: fix problem with expired dst cache) broke PMTU
    discovery. rt6_update_expires() calls dst_set_expires(), which only updates
    dst->expires if it has not been set previously (expires == 0) or if the new
    expires is earlier than the current dst->expires.

    rt6_update_expires() needs to zero rt->dst.expires, otherwise it will contain
    invalid data left over from rt->dst.from and will confuse dst_set_expires().

    Signed-off-by: Jiri Bohac
    Signed-off-by: David S. Miller

    Jiri Bohac
     

14 Apr, 2012

1 commit

  • If an ipv6 dst cache entry is copied from a dst generated by an ICMPv6 RA
    packet, the copy is never checked for expiry because it has no RTF_EXPIRES
    flag. So this dst cache entry will keep being used until the dst gc runs.

    Change struct dst_entry: add a union containing a new pointer, from, and
    expires. When rt6_info.rt6i_flags has no RTF_EXPIRES flag, dst.expires is
    unused, so we can use this field to point to the dst that the cache entry
    was copied from.
    dst.from is only used in IPv6.

    rt6_check_expired checks whether rt6_info.dst.from is expired.

    ip6_rt_copy only sets dst.from when the ort has the flags RTF_ADDRCONF
    and RTF_DEFAULT, and then holds the ort.

    ip6_dst_destroy releases the ort.

    Add some functions to operate on the RTF_EXPIRES flag and expires (from)
    together, and change the code to use these newly added functions.

    Changes from v5:
    Modify ip6_route_add and ndisc_router_discovery to use the newly added
    functions.

    Only set dst.from when the ort has the flags RTF_ADDRCONF
    and RTF_DEFAULT, and then hold the ort.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng