09 Aug, 2017

1 commit

  • If the user hasn't installed any custom rules, don't go through the
    whole FIB rules layer. This is pretty similar to f4530fa574df (ipv4:
    Avoid overhead when no custom FIB rules are installed).

    Using a micro-benchmark module [1], timing ip6_route_output() with
    get_cycles(), with 40,000 routes in the main routing table, before this
    patch:

    min=606 max=12911 count=627 average=1959 95th=4903 90th=3747 50th=1602 mad=821
    table=254 avgdepth=21.8 maxdepth=39
    value │ ┊ count
    600 │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 199
    880 │▒▒▒░░░░░░░░░░░░░░░░ 43
    1160 │▒▒▒░░░░░░░░░░░░░░░░░░░░ 48
    1440 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░ 43
    1720 │▒▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░ 59
    2000 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 50
    2280 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 26
    2560 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 31
    2840 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 28
    3120 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 17
    3400 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 17
    3680 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8
    3960 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11
    4240 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 6
    4520 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 6
    4800 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 9

    After:

    min=544 max=11687 count=627 average=1776 95th=4546 90th=3585 50th=1227 mad=565
    table=254 avgdepth=21.8 maxdepth=39
    value │ ┊ count
    540 │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 201
    800 │▒▒▒▒▒░░░░░░░░░░░░░░░░ 63
    1060 │▒▒▒▒▒░░░░░░░░░░░░░░░░░░░░░ 68
    1320 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░ 39
    1580 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 32
    1840 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 32
    2100 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 34
    2360 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 33
    2620 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 26
    2880 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 22
    3140 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 9
    3400 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8
    3660 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 9
    3920 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8
    4180 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8
    4440 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 8

    At the frequency of the host during the bench (~ 3.7 GHz), this is
    about a 100 ns difference on the median value.
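
    The conversion behind that figure can be checked with a small sketch.
    It assumes the two median values reported above (1602 and 1227 cycles)
    and the stated ~3.7 GHz clock; the function name cycles_to_ns() is
    made up for this illustration and is not part of the benchmark module.

```c
#include <assert.h>

/*
 * Convert a cycle count to nanoseconds at a given clock frequency.
 * 1 GHz is one cycle per nanosecond, so ns = cycles / GHz.
 * (Hypothetical helper, not taken from kbench_mod.c.)
 */
double cycles_to_ns(double cycles, double ghz)
{
    assert(ghz > 0.0);
    return cycles / ghz;
}
```

    Calling cycles_to_ns(1602 - 1227, 3.7) gives a value slightly above
    101 ns, which matches the "about 100 ns" estimate.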

    A next step would be to collapse local and main tables, as in
    0ddcf43d5d4a (ipv4: FIB Local/MAIN table collapse).

    [1]: https://github.com/vincentbernat/network-lab/blob/master/lab-routes-ipv6/kbench_mod.c

    Signed-off-by: Vincent Bernat
    Reviewed-by: Jiri Pirko
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Vincent Bernat
     

04 Aug, 2017

2 commits

  • Allow users of the FIB notification chain to receive a complete view of
    the IPv6 FIB rules upon registration to the chain.

    The integrity of the dump is ensured by a per-family sequence counter
    that is incremented (under RTNL) whenever a rule is added or deleted.

    All the sequence counters are read (under RTNL) and summed, both before
    and after the dump. If the two sums differ, the dump is either
    restarted or the registration fails.
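
    The read, dump, re-read pattern described above could look roughly
    like the following userspace sketch. Everything here is an assumption
    made for illustration: the struct, the function names, the retry
    budget and the two demo callbacks are hypothetical, and locking (RTNL
    in the kernel) is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define DUMP_RETRIES 3 /* hypothetical retry budget */

/* Hypothetical stand-in for one family's rule set. */
struct rule_family {
    unsigned int seq; /* bumped on every rule add or delete */
};

/* Sum the sequence counters of all families. */
unsigned int rules_seq_read(const struct rule_family *fams, int n)
{
    unsigned int sum = 0;

    for (int i = 0; i < n; i++)
        sum += fams[i].seq;
    return sum;
}

/*
 * Run dump() and accept the result only if no rule changed meanwhile.
 * Returns true on a consistent dump, false once the retries run out.
 */
bool rules_dump_consistent(struct rule_family *fams, int n,
                           void (*dump)(struct rule_family *, int))
{
    for (int attempt = 0; attempt < DUMP_RETRIES; attempt++) {
        unsigned int before = rules_seq_read(fams, n);

        dump(fams, n);
        if (rules_seq_read(fams, n) == before)
            return true;
    }
    return false;
}

/* Demo callback: a dump during which nothing changes. */
void dump_quiet(struct rule_family *fams, int n)
{
    (void)fams;
    (void)n;
}

/* Demo callback: simulates a rule being added while the dump runs. */
void dump_racing(struct rule_family *fams, int n)
{
    if (n > 0)
        fams[0].seq++;
}
```

    With dump_quiet the first attempt succeeds; with dump_racing every
    attempt sees a changed sum, so the function gives up and the caller
    would fail the registration.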

    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • As explained in commit 3c71006d15fd ("ipv4: fib_rules: Check if rule is
    a default rule"), drivers supporting IPv6 FIB offload need to be able to
    sanitize the rules they don't support and potentially flush their
    tables.

    Add an IPv6 helper to check if a FIB rule is a default rule.
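
    As a rough illustration of what such a check involves, the sketch
    below treats a rule as "default" when it has no selectors and simply
    points at the local or main table. The struct layout, field names and
    the exact selector set are assumptions for this example and do not
    reproduce the kernel helper; only the table ids 254 (main) and 255
    (local) are the usual routing table numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { TABLE_MAIN = 254, TABLE_LOCAL = 255 };

/* Hypothetical, simplified view of an IPv6 policy rule. */
struct rule6 {
    int iif, oif;      /* interface selectors, 0 when unset */
    uint32_t mark;     /* fwmark selector, 0 when unset */
    uint8_t src_plen;  /* source prefix length, 0 matches everything */
    uint8_t dst_plen;  /* destination prefix length, 0 matches everything */
    uint8_t tclass;    /* traffic class selector, 0 when unset */
    bool to_table;     /* action is "lookup <table>" */
    uint32_t table;    /* table id used by that action */
};

/* True when the rule matches every packet. */
bool rule6_matchall(const struct rule6 *r)
{
    return !r->iif && !r->oif && !r->mark &&
           !r->src_plen && !r->dst_plen && !r->tclass;
}

/* True when the rule looks like one of the built-in default rules. */
bool rule6_is_default(const struct rule6 *r)
{
    return rule6_matchall(r) && r->to_table &&
           (r->table == TABLE_LOCAL || r->table == TABLE_MAIN);
}
```

    A driver could use a check of this kind to decide whether a newly
    notified rule is harmless or forces it to flush its offloaded tables.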

    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     

21 Jun, 2017

1 commit

  • While commit 73ba57bfae4a ("ipv6: fix backtracking for throw routes")
    does a good job of propagating errors to fib_rules_lookup() in the
    fib rules core framework, which also corrects throw route handling,
    it does not solve the route reference leak that happens when we
    return -EAGAIN to fib_rules_lookup() and leave the routing table
    entry referenced in arg->result.

    If the rule with the matched throw route isn't the last one matched
    in the list, we overwrite arg->result and lose the reference on the
    previously stored throw route forever.

    We also partially revert commit ab997ad40839 ("ipv6: fix the
    incorrect return value of throw route") since we never return a
    routing table entry with dst.error == -EAGAIN when
    CONFIG_IPV6_MULTIPLE_TABLES is on. Also, there is no point in
    checking for the RTF_REJECT flag since it is always set for throw
    routes.

    Fixes: 73ba57bfae4a ("ipv6: fix backtracking for throw routes")
    Signed-off-by: Serhey Popovych
    Signed-off-by: David S. Miller

    Serhey Popovych
     

11 Sep, 2016

1 commit

  • Add l3mdev hook to set FLOWI_FLAG_SKIP_NH_OIF flag and update oif/iif
    in flow struct if its oif or iif points to a device enslaved to an L3
    Master device. Only 1 needs to be converted to match the l3mdev FIB
    rule. This moves the flow adjustment for l3mdev to a single point
    catching all lookups. It is redundant for existing hooks (those are
    removed in later patches) but is needed for missed lookups such as
    PMTU updates.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

09 Jun, 2016

1 commit

  • Currently, VRFs require 1 oif and 1 iif rule per address family per
    VRF. As the number of VRF devices increases it brings scalability
    issues with the increasing rule list. All of the VRF rules have the
    same format with the exception of the specific table id to direct the
    lookup. Since the table id is available from the oif or iif in the
    lookup, the VRF rules can be consolidated to a single rule that pulls
    the table from the VRF device.

    This patch introduces a new rule attribute l3mdev. The l3mdev rule
    means the table id used for the lookup is pulled from the L3 master
    device (e.g., VRF) rather than being statically defined. With the
    l3mdev rule all of the basic VRF FIB rules are reduced to 1 l3mdev
    rule per address family (IPv4 and IPv6).

    If an admin wishes to insert higher priority rules for specific VRFs
    those rules will co-exist with the l3mdev rule. This capability means
    current VRF scripts will co-exist with this new simpler implementation.

    Currently, the rules list for both ipv4 and ipv6 look like this:
    $ ip ru ls
    1000: from all oif vrf1 lookup 1001
    1000: from all iif vrf1 lookup 1001
    1000: from all oif vrf2 lookup 1002
    1000: from all iif vrf2 lookup 1002
    1000: from all oif vrf3 lookup 1003
    1000: from all iif vrf3 lookup 1003
    1000: from all oif vrf4 lookup 1004
    1000: from all iif vrf4 lookup 1004
    1000: from all oif vrf5 lookup 1005
    1000: from all iif vrf5 lookup 1005
    1000: from all oif vrf6 lookup 1006
    1000: from all iif vrf6 lookup 1006
    1000: from all oif vrf7 lookup 1007
    1000: from all iif vrf7 lookup 1007
    1000: from all oif vrf8 lookup 1008
    1000: from all iif vrf8 lookup 1008
    ...
    32765: from all lookup local
    32766: from all lookup main
    32767: from all lookup default

    With the l3mdev rule the list is just the following regardless of the
    number of VRFs:
    $ ip ru ls
    1000: from all lookup [l3mdev table]
    32765: from all lookup local
    32766: from all lookup main
    32767: from all lookup default

    (Note: the above pretty print of the rule is based on an iproute2
    prototype. Actual verbiage may change.)
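
    The table selection described in this commit can be modelled in a few
    lines. This is a userspace sketch under stated assumptions: the
    structs, field names and both functions are hypothetical, a device
    that is not enslaved to a VRF is represented with table id 0, and 0
    is also used to mean "this rule does not apply to the flow".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device record: which VRF table, if any, it belongs to. */
struct netdev {
    int ifindex;        /* non-zero interface index */
    uint32_t vrf_table; /* table of the L3 master device, 0 if none */
};

struct flow { int oif, iif; };                  /* 0 means unset */
struct rule { bool l3mdev; uint32_t table; };   /* table used when !l3mdev */

/* Table id of the L3 master device for ifindex, or 0 if there is none. */
uint32_t dev_l3mdev_table(const struct netdev *devs, size_t n, int ifindex)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].ifindex == ifindex)
            return devs[i].vrf_table;
    return 0;
}

/*
 * Table a rule directs the lookup to. A static rule always uses its own
 * table; an l3mdev rule derives it from the flow's oif, then its iif.
 */
uint32_t rule_table(const struct rule *r, const struct flow *fl,
                    const struct netdev *devs, size_t n)
{
    uint32_t table;

    if (!r->l3mdev)
        return r->table;

    table = dev_l3mdev_table(devs, n, fl->oif);
    if (!table)
        table = dev_l3mdev_table(devs, n, fl->iif);
    return table;
}
```

    One l3mdev rule therefore covers every VRF: adding a ninth VRF only
    adds a device record, not two more rules.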

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

23 Oct, 2015

1 commit

  • The error condition -EAGAIN, which is signaled by throw routes, tells
    the rules framework to keep walking and search for further matches. If
    the walk ends with the result of a throw route, we have to translate
    the error condition to -ENETUNREACH.
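
    A simplified model of that walk is sketched below. It assumes each
    rule's outcome is already known and passed in as an array, which is
    not how the kernel is structured; the function name rules_walk() is
    hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Walk the per-rule results in order. A result of -EAGAIN (for example
 * from a throw route) means "try the next rule". Anything else, success
 * or a hard error, ends the walk. If every rule said -EAGAIN, or there
 * were no rules, report the destination as unreachable.
 */
int rules_walk(const int *results, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (results[i] != -EAGAIN)
            return results[i];
    }
    return -ENETUNREACH;
}
```

    The point of the translation is that -EAGAIN is an internal "keep
    going" signal and should never reach the caller of the lookup.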

    Signed-off-by: Xin Long
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    lucien
     

10 Sep, 2015

1 commit

  • This switches IPv6 policy routing to use the shared
    fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
    multicast routing for IPv4 as well as IPv6.

    The motivation for this patch is a complaint about iproute2 behaving
    inconsistently between IPv4 and IPv6 when adding policy rules: Formerly,
    IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
    assigned priority value was decreased with each rule added.

    Since all users of the default_pref field have now been converted to
    assign the generic function fib_default_rule_pref(), fib_nl_newrule()
    may just use it directly instead. Therefore get rid of the function
    pointer altogether and make fib_default_rule_pref() static, as it's not
    used outside fib_rules.c anymore.
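
    One way to get the "decreasing with each rule added" behaviour is to
    place a new rule just in front of the second rule in the
    priority-ordered list. The sketch below assumes that approach and a
    plain sorted array of priorities; it is an illustration, not a copy
    of fib_default_rule_pref(), and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * prefs[] holds the priorities of the installed rules in ascending
 * order (prefs[0] is typically the local rule at priority 0).
 * The new rule gets the priority just below the second rule, or 0 when
 * there is no usable second rule.
 */
uint32_t default_rule_pref(const uint32_t *prefs, size_t n)
{
    if (n > 1 && prefs[1] != 0)
        return prefs[1] - 1;
    return 0;
}
```

    Starting from rules at 0, 32766 and 32767, the first added rule would
    get 32765 and the next one 32764, instead of every IPv6 rule landing
    on the same fixed value.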

    Signed-off-by: Phil Sutter
    Signed-off-by: David S. Miller

    Phil Sutter
     

07 Apr, 2015

1 commit

  • Conflicts:
    drivers/net/ethernet/mellanox/mlx4/cmd.c
    net/core/fib_rules.c
    net/ipv4/fib_frontend.c

    The fib_rules.c and fib_frontend.c conflicts were locking adjustments
    in 'net' overlapping addition and removal of code in 'net-next'.

    The mlx4 conflict was a bug fix in 'net' happening in the same
    place a constant was being replaced with a more suitable macro.

    Signed-off-by: David S. Miller

    David S. Miller
     

03 Apr, 2015

1 commit

  • We have to hold rtnl lock for fib_rules_unregister()
    otherwise the following race could happen:

    fib_rules_unregister():         fib_nl_delrule():
    ...                             ...
    ...                             ops = lookup_rules_ops();
    list_del_rcu(&ops->list);
                                    list_for_each_entry(ops->rules) {
    fib_rules_cleanup_ops(ops);         ...
      list_del_rcu();                   list_del_rcu();
                                    }

    Note, net->rules_mod_lock is actually not needed at all,
    either upper layer netns code or rtnl lock guarantees
    we are safe.

    Cc: Alexander Duyck
    Cc: Thomas Graf
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
     

01 Apr, 2015

2 commits


30 Mar, 2015

1 commit


21 Mar, 2015

1 commit

  • For throw routes to trigger evaluation of other policy rules,
    EAGAIN needs to be propagated up to fib_rules_lookup,
    similar to how it's done for IPv4.

    A simple testcase for verification is:

    ip -6 rule add lookup 33333 priority 33333
    ip -6 route add throw 2001:db8::1
    ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333
    ip route get 2001:db8::1

    Signed-off-by: Steven Barth
    Signed-off-by: David S. Miller

    Steven Barth
     

16 Jan, 2014

1 commit


11 Dec, 2013

1 commit

  • This change ensures that the routing entry investigated by the suppress
    function actually does point to a device struct before following that pointer,
    fixing a possible kernel oops situation when verifying the interface group
    associated with a routing table entry.

    According to Daniel Golle, this Oops can be triggered by a user process trying
    to establish an outgoing IPv6 connection while having no real IPv6 connectivity
    set up (only autoassigned link-local addresses).

    Fixes: 6ef94cfafba15 ("fib_rules: add route suppression based on ifgroup")

    Reported-by: Daniel Golle
    Tested-by: Daniel Golle
    Signed-off-by: Stefan Tomanek
    Signed-off-by: David S. Miller

    Stefan Tomanek
     

12 Sep, 2013

1 commit


04 Aug, 2013

1 commit

  • This change brings the suppressor attribute names into line; it also changes
    the data types to provide a more consistent interface.

    While -1 indicates that the suppressor is not enabled, values >= 0 for
    suppress_prefixlen or suppress_ifgroup reject routing decisions violating the
    constraint.

    This changes the previously presented behaviour of suppress_prefixlen, where a
    prefix length _less_ than the attribute value was rejected. After this change,
    a prefix length less than *or* equal to the value is considered a violation of
    the rule constraint.

    It also changes the default values for default and newly added rules (disabling
    any suppression for those).
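
    The new suppress_prefixlen semantics fit in a single predicate. The
    function name below is hypothetical; the logic follows the
    description above (-1 disables the check, otherwise a prefix length
    less than or equal to the value is suppressed).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a route with prefix length route_plen is suppressed
 * by a rule whose suppress_prefixlen attribute is suppress_prefixlen.
 */
bool suppress_by_prefixlen(int route_plen, int suppress_prefixlen)
{
    if (suppress_prefixlen < 0)   /* -1: suppressor not enabled */
        return false;
    return route_plen <= suppress_prefixlen;
}
```

    With a value of 0, only the default route (/0) is suppressed, while
    any more specific route is still accepted.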

    Signed-off-by: Stefan Tomanek
    Signed-off-by: David S. Miller

    Stefan Tomanek
     

03 Aug, 2013

1 commit


01 Aug, 2013

2 commits

  • With the addition of the suppress operation
    (7764a45a8f1fe74d4f7d301eaca2e558e7e2831a ("fib_rules: add .suppress
    operation")) we rely on accurate error reporting of the fib_rules.actions.

    fib6_rule_action always returned -EAGAIN in case we could not find a
    matching route and 0 if a rule was matched. This also included a match
    for blackhole or prohibited rule actions which could get suppressed by
    the new logic.

    So adapt fib6_rule_action to always return the correct error code as
    its counterpart fib4_rule_action does. This also fixes a possibility of
    nullptr-deref where we don't find a table, thus rt == NULL. Because
    the condition rt != ip6_null_entry still holds, it seems we could later
    get a nullptr bug on dereference of rt->dst.
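
    A sketch of what "the correct error code" per action can look like
    follows. The enum and function are local stand-ins written for this
    example, and the specific errno chosen for each action is an
    assumption modelled on the usual IPv4 behaviour rather than something
    stated in this commit message.

```c
#include <assert.h>
#include <errno.h>

/* Local stand-ins for the rule actions mentioned in the text. */
enum rule_action {
    ACT_TO_TABLE,     /* look the destination up in a table */
    ACT_UNREACHABLE,
    ACT_BLACKHOLE,
    ACT_PROHIBIT,
};

/*
 * Error reported for a matched rule. Only a table lookup that finds
 * nothing asks the framework to continue (-EAGAIN); the reject-style
 * actions return their own distinct error so that a suppressor cannot
 * mistake them for a successful match.
 */
int rule_action_error(enum rule_action action, int route_found)
{
    switch (action) {
    case ACT_TO_TABLE:
        return route_found ? 0 : -EAGAIN;
    case ACT_UNREACHABLE:
        return -ENETUNREACH;
    case ACT_PROHIBIT:
        return -EACCES;
    case ACT_BLACKHOLE:
    default:
        return -EINVAL;
    }
}
```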

    v2:
    a) Fixed a brain fart in the commit msg (the rule => a table, etc). No
    changes to the patch.

    Cc: Stefan Tomanek
    Cc: Hideaki YOSHIFUJI
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • This change adds a new operation to the fib_rules_ops struct; it allows the
    suppression of routing decisions if certain criteria are not met by its
    results.

    The first implemented constraint is a minimum prefix length added to the
    structures of routing rules. If a rule is added with a minimum prefix length
    >0, only routes meeting this threshold will be considered. Any other (more
    general) routing table entries will be ignored.

    When configuring a system with multiple network uplinks and default routes, it
    is often convenient to reference the main routing table multiple times - but
    omitting the default route. Using this patch and a modified "ip" utility, this
    can be achieved by using the following command sequence:

    $ ip route add table secuplink default via 10.42.23.1

    $ ip rule add pref 100 table main prefixlength 1
    $ ip rule add pref 150 fwmark 0xA table secuplink

    With this setup, packets marked 0xA will be processed by the additional routing
    table "secuplink", but only if no suitable route in the main routing table can
    be found. By using a minimal prefixlength of 1, the default route (/0) of the
    table "main" is hidden to packets processed by rule 100; packets traveling to
    destinations with more specific routing entries are processed as usual.

    Signed-off-by: Stefan Tomanek
    Signed-off-by: David S. Miller

    Stefan Tomanek
     

04 Nov, 2012

1 commit

  • As suggested by Eric, we could introduce a helper function
    for ipv6 too, to avoid checking if rt is NULL before
    dst_release().

    Cc: Eric Dumazet
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Amerigo Wang
     

06 Oct, 2012

1 commit


02 Apr, 2012

2 commits


23 Nov, 2011

1 commit


01 Nov, 2011

1 commit


13 Mar, 2011

1 commit


17 Oct, 2010

1 commit


11 Jun, 2010

1 commit


26 Apr, 2010

2 commits

  • Decouple rtnetlink address families from real address families in socket.h to
    be able to add rtnetlink interfaces to code that is not a real address family
    without increasing AF_MAX/NPROTO.

    This will be used to add support for multicast route dumping from all tables
    as the proc interface can't be extended to support anything but the main table
    without breaking compatibility.

    This partially undoes the patch to introduce independent families for routing
    rules and converts ipmr routing rules to a new rtnetlink family. Similar to
    that patch, values up to 127 are reserved for real address families, values
    above that may be used arbitrarily.

    Signed-off-by: Patrick McHardy

    Patrick McHardy
     
  • fib_rules_register() duplicates the template passed to it without modification,
    mark the argument as const. Additionally the templates are only needed when
    instantiating a new namespace, so mark them as __net_initdata, which means
    they can be discarded when CONFIG_NET_NS=n.

    Signed-off-by: Patrick McHardy

    Patrick McHardy
     

14 Apr, 2010

2 commits

  • Decouple the address family values used for fib_rules from the real
    address families in socket.h. This allows fib_rules to be used for
    code that is not a real address family without increasing AF_MAX/NPROTO.

    Values up to 127 are reserved for real address families and map directly
    to the corresponding AF value, values starting from 128 are for other
    uses. rtnetlink is changed to invoke the AF_UNSPEC dumpit/doit handlers
    for these families.
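
    The split of the value space and the handler fallback can be
    expressed as below. The macro and function names are hypothetical;
    the boundary of 127 comes from the text above, and 0 is the
    conventional value of AF_UNSPEC.

```c
#include <assert.h>

#define FAMILY_REAL_MAX 127 /* values up to here map to real AF_* values */
#define FAMILY_UNSPEC   0   /* conventional AF_UNSPEC value */

/*
 * Family whose dumpit/doit handlers serve a request: real address
 * families use their own handlers, everything from 128 upwards falls
 * back to the unspecified family's handlers.
 */
unsigned int handler_family(unsigned int family)
{
    return family <= FAMILY_REAL_MAX ? family : FAMILY_UNSPEC;
}
```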

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • All fib_rules implementations need to set the family in their ->fill()
    functions. Since the value is available to the generic fib_nl_fill_rule()
    function, set it there.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

08 Mar, 2010

1 commit

  • IPV6_PREFER_SRC_xxx definitions:
    | #define IPV6_PREFER_SRC_TMP 0x0001
    | #define IPV6_PREFER_SRC_PUBLIC 0x0002
    | #define IPV6_PREFER_SRC_COA 0x0004

    RT6_LOOKUP_F_xxx definitions:
    | #define RT6_LOOKUP_F_SRCPREF_TMP 0x00000008
    | #define RT6_LOOKUP_F_SRCPREF_PUBLIC 0x00000010
    | #define RT6_LOOKUP_F_SRCPREF_COA 0x00000020

    So, we can translate between these two groups by shift operation
    instead of multiple 'if's.
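
    Because each RT6_LOOKUP_F_SRCPREF_xxx value is the matching
    IPV6_PREFER_SRC_xxx value shifted left by three bits, the translation
    can be written as a mask and a shift. In the sketch below the six
    constants are copied from the listing above; the mask macro, the
    shift macro and the function name are made up for the example.

```c
#include <assert.h>

#define IPV6_PREFER_SRC_TMP         0x0001
#define IPV6_PREFER_SRC_PUBLIC      0x0002
#define IPV6_PREFER_SRC_COA         0x0004

#define RT6_LOOKUP_F_SRCPREF_TMP    0x00000008
#define RT6_LOOKUP_F_SRCPREF_PUBLIC 0x00000010
#define RT6_LOOKUP_F_SRCPREF_COA    0x00000020

/* Hypothetical helpers for this example. */
#define SRCPREF_MASK  (IPV6_PREFER_SRC_TMP | IPV6_PREFER_SRC_PUBLIC | \
                       IPV6_PREFER_SRC_COA)
#define SRCPREF_SHIFT 3 /* 0x0001 << 3 == 0x0008, and so on */

/* Translate socket source preferences into route lookup flags. */
unsigned int srcprefs_to_lookup_flags(unsigned int srcprefs)
{
    return (srcprefs & SRCPREF_MASK) << SRCPREF_SHIFT;
}
```

    Masking first keeps unrelated bits in the input from leaking into
    other lookup flags.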

    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki / 吉藤英明
     

18 Jan, 2010

1 commit


04 Dec, 2009

2 commits

  • Refactor the code so fib_rules_register always takes a template instead
    of the actual fib_rules_ops structure that will be used. This is
    required for network namespace support so 2 out of the 3 callers already
    do this, it allows the error handling to be made common, and it allows
    fib_rules_unregister to free the template for the caller.

    Modify fib_rules_unregister to use call_rcu instead of synchronize_rcu
    to allow multiple namespaces to be cleaned up in the same rcu grace
    period.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • commit d124356ce314fff22a047ea334379d5105b2d834
    Author: Patrick McHardy
    Date: Thu Dec 3 12:16:35 2009 +0100

    net: fib_rules: allow to delete local rule

    Allow deleting the local rule and recreating it with a higher priority
    value. This can be used to force packets with a local destination out on
    the wire instead of routing them to loopback. Additionally this patch
    allows recreating rules with a priority of 0.

    Combined with the previous patch to allow oif classification, a socket can
    be bound to the desired interface and packets routed to the wire like this:

    # move local rule to lower priority
    ip rule add pref 1000 lookup local
    ip rule del pref 0

    # route packets of sockets bound to eth0 to the wire independent
    # of the destination address
    ip rule add pref 100 oif eth0 lookup 100
    ip route add default dev eth0 table 100

    Signed-off-by: Patrick McHardy

    Signed-off-by: David S. Miller

    Patrick McHardy
     

21 May, 2009

1 commit


18 May, 2009

1 commit