02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

04 Aug, 2017

1 commit

  • Unlike the routing tables, the FIB rules share a common core, so instead
    of replicating the same logic for each address family we can simply dump
    the rules and send notifications from the core itself.

    To protect the integrity of the dump, a rules-specific sequence counter
    is added for each address family and incremented whenever a rule is
    added or deleted (under RTNL).
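
    The sequence-counter idea can be sketched in plain C as follows. This is
    a minimal illustration only, not the kernel code: struct sketch_rules and
    the sketch_* helpers are hypothetical names, and the locking (RTNL) is
    left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-address-family rule state. */
struct sketch_rules {
	unsigned int seq;   /* bumped on every add or delete */
	size_t nr_rules;    /* number of installed rules */
};

/* Adding a rule changes the set, so the counter moves. */
static inline void sketch_rule_add(struct sketch_rules *r)
{
	assert(r != NULL);
	r->nr_rules++;
	r->seq++;
}

/* Deleting a rule also moves the counter; false if there is nothing to delete. */
static inline bool sketch_rule_del(struct sketch_rules *r)
{
	assert(r != NULL);
	if (r->nr_rules == 0)
		return false;
	r->nr_rules--;
	r->seq++;
	return true;
}

/* A dump is trustworthy only if the counter did not move while it ran. */
static inline bool sketch_dump_consistent(const struct sketch_rules *r,
					  unsigned int seq_at_start)
{
	assert(r != NULL);
	return r->seq == seq_at_start;
}
```

    A consumer would read the counter before the dump and compare it again
    afterwards; on a mismatch the dump is simply retried.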

    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     

01 Jul, 2017

1 commit

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.
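
    Why a dedicated counter type helps can be shown with a small saturating
    counter. This is a single-threaded sketch under stated assumptions, not
    the kernel's refcount_t: the sketch_refcount names are hypothetical, no
    atomics are used, and saturating at UINT_MAX is an assumption made for
    the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, non-atomic model of a saturating reference counter. */
struct sketch_refcount {
	unsigned int refs;
};

#define SKETCH_REFCOUNT_SATURATED UINT_MAX

static inline void sketch_refcount_set(struct sketch_refcount *r, unsigned int n)
{
	assert(r != NULL);
	r->refs = n;
}

/* Increment, but stick at the saturation value instead of wrapping to 0. */
static inline void sketch_refcount_inc(struct sketch_refcount *r)
{
	assert(r != NULL);
	if (r->refs == SKETCH_REFCOUNT_SATURATED)
		return;
	r->refs++;
}

/*
 * Drop one reference and report whether it was the last one.
 * A saturated counter never reaches zero: the object is leaked on
 * purpose, which is safer than freeing it while users may remain.
 */
static inline bool sketch_refcount_dec_and_test(struct sketch_refcount *r)
{
	assert(r != NULL);
	if (r->refs == SKETCH_REFCOUNT_SATURATED)
		return false;
	assert(r->refs > 0);
	return --r->refs == 0;
}
```

    With a plain wrapping counter, enough increments would bring the value
    back to a small number, and a later decrement could free an object that
    is still referenced.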

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

18 Apr, 2017

1 commit

  • Add netlink_ext_ack arg to rtnl_doit_func. Pass extack arg to nlmsg_parse
    for doit functions that call it directly.

    This is the first step to using extended error reporting in rtnetlink.
    From here individual subsystems can be updated to set netlink_ext_ack as
    needed.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

17 Mar, 2017

1 commit

  • Currently, when non-default (custom) FIB rules are used, devices capable
    of layer 3 offloading flush their tables and let the kernel do the
    forwarding instead.

    When these devices' drivers are loaded they register to the FIB
    notification chain, which lets them know about the existence of any
    custom FIB rules. This is done by sending a RULE_ADD notification based
    on the value of 'net->ipv4.fib_has_custom_rules'.

    This approach is problematic when VRF offload is taken into account, as
    upon the creation of the first VRF netdev, a l3mdev rule is programmed
    to direct skbs to the VRF's table.

    Instead of merely reading the above value and sending a single RULE_ADD
    notification, we should iterate over all the FIB rules and send a
    detailed notification for each, thereby allowing offloading drivers to
    sanitize the rules they don't support and potentially flush their
    tables.

    While l3mdev rules are uniquely marked, the default rules are not.
    Therefore, when they are being notified they might invoke offloading
    drivers to unnecessarily flush their tables.

    Solve this by adding a helper to check if a FIB rule is a default rule.
    Namely, its selector should match all packets and its action should
    point to the local, main or default tables.
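
    The check described above might look roughly like this. It is a sketch
    under stated assumptions: struct sketch_rule and its fields are
    hypothetical and cover only family-independent selectors (per-family
    selectors such as source or destination prefixes are omitted), and the
    SKETCH_* constants are local to the example, chosen to mirror the table
    ids and action value used by the rtnetlink interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local constants for the sketch (assumed to mirror the uapi values). */
#define SKETCH_ACT_TO_TBL     1u
#define SKETCH_TABLE_DEFAULT  253u
#define SKETCH_TABLE_MAIN     254u
#define SKETCH_TABLE_LOCAL    255u

/* Hypothetical, reduced view of a FIB rule. */
struct sketch_rule {
	int iifindex;            /* 0: any input interface */
	int oifindex;            /* 0: any output interface */
	unsigned int mark;       /* fwmark selector */
	unsigned int mark_mask;  /* 0: fwmark not used */
	bool l3mdev;             /* table comes from an L3 master device */
	unsigned int action;
	unsigned int table;
};

/* True when no selector restricts which packets the rule applies to. */
static inline bool sketch_rule_matchall(const struct sketch_rule *r)
{
	assert(r != NULL);
	return !r->iifindex && !r->oifindex && !r->mark &&
	       !r->mark_mask && !r->l3mdev;
}

/* A default rule matches everything and looks up local, main or default. */
static inline bool sketch_rule_is_default(const struct sketch_rule *r)
{
	if (!sketch_rule_matchall(r))
		return false;
	if (r->action != SKETCH_ACT_TO_TBL)
		return false;
	return r->table == SKETCH_TABLE_LOCAL ||
	       r->table == SKETCH_TABLE_MAIN ||
	       r->table == SKETCH_TABLE_DEFAULT;
}
```

    Note that the rule's priority plays no part in the check, so moving the
    local rule to a different priority does not change the outcome.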

    As noted by David Ahern, uniquely marking the default rules is
    insufficient. When using VRFs, it's common to avoid false hits by moving
    the rule for the local table to just before the main table:

    Default configuration:
    $ ip rule show
    0: from all lookup local
    32766: from all lookup main
    32767: from all lookup default

    Common configuration with VRFs:
    $ ip rule show
    1000: from all lookup [l3mdev-table]
    32765: from all lookup local
    32766: from all lookup main
    32767: from all lookup default

    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Ido Schimmel
     

05 Nov, 2016

1 commit

  • - Define a new FIB rule attribute, FRA_UID_RANGE, to describe a
    range of UIDs.
    - Define a RTA_UID attribute for per-UID route lookups and dumps.
    - Support passing these attributes to and from userspace via
    rtnetlink. The value INVALID_UID indicates no UID was
    specified.
    - Add a UID field to the flow structures.
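
    Matching a flow's UID against a rule's range could be sketched as below.
    The struct and helper names are hypothetical, the range is treated as
    inclusive on both ends, and using the full range to mean "no UID
    restriction" is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inclusive UID range attached to a rule. */
struct sketch_uid_range {
	unsigned int start;
	unsigned int end;
};

/* Range assumed to stand for "rule does not restrict by UID". */
static const struct sketch_uid_range sketch_uid_range_any = { 0u, ~0u };

/* True when the flow's UID falls inside the rule's range. */
static inline bool sketch_uid_match(const struct sketch_uid_range *range,
				    unsigned int flow_uid)
{
	assert(range != NULL);
	assert(range->start <= range->end);
	return flow_uid >= range->start && flow_uid <= range->end;
}
```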

    Signed-off-by: Lorenzo Colitti
    Signed-off-by: David S. Miller

    Lorenzo Colitti
     

09 Jun, 2016

1 commit

  • Currently, VRFs require 1 oif and 1 iif rule per address family per
    VRF. As the number of VRF devices increases it brings scalability
    issues with the increasing rule list. All of the VRF rules have the
    same format with the exception of the specific table id to direct the
    lookup. Since the table id is available from the oif or iif in the
    lookup, the VRF rules can be consolidated to a single rule that pulls
    the table from the VRF device.

    This patch introduces a new rule attribute l3mdev. The l3mdev rule
    means the table id used for the lookup is pulled from the L3 master
    device (e.g., VRF) rather than being statically defined. With the
    l3mdev rule all of the basic VRF FIB rules are reduced to 1 l3mdev
    rule per address family (IPv4 and IPv6).
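
    The difference between a static table rule and an l3mdev rule can be
    sketched as follows. All names are hypothetical, a table id of 0 is used
    to mean "no table", and checking the output device before the input
    device is an assumption of this sketch rather than something stated
    above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device view: table id of its L3 master, 0 if it has none. */
struct sketch_dev {
	int ifindex;
	unsigned int l3mdev_table;
};

/*
 * Resolve the table a rule points to for a given flow.
 * Static rules carry their own table id; an l3mdev rule borrows it
 * from the flow's oif or iif. Returns 0 when an l3mdev rule finds no
 * L3 master device, meaning the rule does not apply to this flow.
 */
static inline unsigned int sketch_rule_table(bool rule_is_l3mdev,
					     unsigned int static_table,
					     const struct sketch_dev *oif,
					     const struct sketch_dev *iif)
{
	if (!rule_is_l3mdev)
		return static_table;
	if (oif != NULL && oif->l3mdev_table != 0)
		return oif->l3mdev_table;
	if (iif != NULL && iif->l3mdev_table != 0)
		return iif->l3mdev_table;
	return 0;
}
```

    One such rule serves every VRF, because the table id travels with the
    device instead of being repeated in a per-VRF rule.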

    If an admin wishes to insert higher priority rules for specific VRFs
    those rules will co-exist with the l3mdev rule. This capability means
    current VRF scripts will co-exist with this new simpler implementation.

    Currently, the rules list for both ipv4 and ipv6 look like this:
    $ ip ru ls
    1000: from all oif vrf1 lookup 1001
    1000: from all iif vrf1 lookup 1001
    1000: from all oif vrf2 lookup 1002
    1000: from all iif vrf2 lookup 1002
    1000: from all oif vrf3 lookup 1003
    1000: from all iif vrf3 lookup 1003
    1000: from all oif vrf4 lookup 1004
    1000: from all iif vrf4 lookup 1004
    1000: from all oif vrf5 lookup 1005
    1000: from all iif vrf5 lookup 1005
    1000: from all oif vrf6 lookup 1006
    1000: from all iif vrf6 lookup 1006
    1000: from all oif vrf7 lookup 1007
    1000: from all iif vrf7 lookup 1007
    1000: from all oif vrf8 lookup 1008
    1000: from all iif vrf8 lookup 1008
    ...
    32765: from all lookup local
    32766: from all lookup main
    32767: from all lookup default

    With the l3mdev rule the list is just the following regardless of the
    number of VRFs:
    $ ip ru ls
    1000: from all lookup [l3mdev table]
    32765: from all lookup local
    32766: from all lookup main
    32767: from all lookup default

    (Note: the above pretty print of the rule is based on an iproute2
    prototype. Actual verbiage may change)

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

10 Sep, 2015

1 commit

  • This switches IPv6 policy routing to use the shared
    fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
    multicast routing for IPv4 as well as IPv6.

    The motivation for this patch is a complaint about iproute2 behaving
    inconsistently between IPv4 and IPv6 when adding policy rules: Formerly,
    IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
    assigned priority value was decreased with each rule added.

    Since then all users of the default_pref field have been converted to
    assign the generic function fib_default_rule_pref(), fib_nl_newrule()
    may just use it directly instead. Therefore get rid of the function
    pointer altogether and make fib_default_rule_pref() static, as it's not
    used outside fib_rules.c anymore.
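
    The "decreasing priority" behaviour described for IPv4 can be modelled
    with a short function. This is a sketch under assumptions, not a copy of
    fib_default_rule_pref(): it takes the rule priorities as a sorted array
    (a hypothetical representation) and assumes the default is one less than
    the priority of the second rule in the list, falling back to 0.

```c
#include <assert.h>
#include <stddef.h>

/*
 * prefs[] holds the priorities of the installed rules in ascending
 * order. The first entry is normally the priority 0 rule, so the
 * second entry is the lowest priority value still "in front of" new
 * rules; the new rule is placed just before it.
 */
static inline unsigned int sketch_default_pref(const unsigned int *prefs,
					       size_t n)
{
	assert(prefs != NULL || n == 0);
	if (n >= 2 && prefs[1] != 0)
		return prefs[1] - 1;
	return 0;
}
```

    With the stock rules at priorities 0, 32766 and 32767, the first rule
    added without an explicit priority gets 32765, the next one 32764, and
    so on.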

    Signed-off-by: Phil Sutter
    Signed-off-by: David S. Miller

    Phil Sutter
     

22 Jul, 2015

1 commit

  • This adds the ability to select a routing table based on the tunnel
    id, which allows maintaining separate routing tables for each virtual
    tunnel network.

    ip rule add from all tunnel-id 100 lookup 100
    ip rule add from all tunnel-id 200 lookup 200

    A new static key controls the collection of metadata at tunnel level
    upon demand.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     

24 Jun, 2015

1 commit

  • This feature is only enabled with the new per-interface or ipv4 global
    sysctls called 'ignore_routes_with_linkdown'.

    net.ipv4.conf.all.ignore_routes_with_linkdown = 0
    net.ipv4.conf.default.ignore_routes_with_linkdown = 0
    net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
    ...

    When the above sysctls are set, the kernel will report to userspace
    that a route is dead and will no longer resolve to this nexthop when
    performing a fib lookup. This will signal to userspace that the route
    will not be selected. The signalling of a RTNH_F_DEAD is only passed to
    userspace if the sysctl is enabled and link is down. This was done as
    without it the netlink listeners would have no idea whether or not a
    nexthop would be selected. The kernel only sets RTNH_F_DEAD internally
    if the interface has IFF_UP cleared.
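
    The two decisions described here, whether a nexthop may be selected and
    which flags userspace sees, can be sketched as below. The helper names
    and the SKETCH_* flag bits are hypothetical and local to the example;
    only the logic follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local flag bits for the sketch (stand-ins for RTNH_F_DEAD/LINKDOWN). */
#define SKETCH_NH_DEAD      0x01u
#define SKETCH_NH_LINKDOWN  0x10u

/* May this nexthop be chosen by a lookup? */
static inline bool sketch_nh_usable(unsigned int nh_flags,
				    bool ignore_routes_with_linkdown)
{
	if (nh_flags & SKETCH_NH_DEAD)
		return false;
	if ((nh_flags & SKETCH_NH_LINKDOWN) && ignore_routes_with_linkdown)
		return false;
	return true;
}

/* Flags as reported to userspace: "dead" is added only when the
 * sysctl is enabled and the link is down. */
static inline unsigned int sketch_nh_reported_flags(unsigned int nh_flags,
						    bool ignore_routes_with_linkdown)
{
	if ((nh_flags & SKETCH_NH_LINKDOWN) && ignore_routes_with_linkdown)
		nh_flags |= SKETCH_NH_DEAD;
	return nh_flags;
}
```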

    With the new sysctl set, the following behavior can be observed
    (interface p8p1 is link-down):

    default via 10.0.5.2 dev p9p1
    10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
    70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
    80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown
    90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 dead linkdown
    90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
    90.0.0.1 via 70.0.0.2 dev p7p1 src 70.0.0.1
    cache
    local 80.0.0.1 dev lo src 80.0.0.1
    cache
    80.0.0.2 via 10.0.5.2 dev p9p1 src 10.0.5.15
    cache

    While the route does remain in the table (so it can be modified if
    needed rather than being wiped away as it would be if IFF_UP was
    cleared), the proper next-hop is chosen automatically when the link is
    down. Now interface p8p1 is linked-up:

    default via 10.0.5.2 dev p9p1
    10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
    70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
    80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1
    90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1
    90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
    192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2
    90.0.0.1 via 80.0.0.2 dev p8p1 src 80.0.0.1
    cache
    local 80.0.0.1 dev lo src 80.0.0.1
    cache
    80.0.0.2 dev p8p1 src 80.0.0.1
    cache

    and the output changes to what one would expect.

    If the sysctl is not set, the following output would be expected when
    p8p1 is down:

    default via 10.0.5.2 dev p9p1
    10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
    70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
    80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 linkdown
    90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 linkdown
    90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2

    Since the dead flag does not appear, there should be no expectation that
    the kernel would skip using this route due to link being down.

    v2: Split kernel changes into 2 patches, this actually makes a
    behavioral change if the sysctl is set. Also took suggestion from Alex
    to simplify code by only checking sysctl during fib lookup and
    suggestion from Scott to add a per-interface sysctl.

    v3: Code clean-ups to make it more readable and efficient as well as a
    reverse path check fix.

    v4: Drop binary sysctl

    v5: Whitespace fixups from Dave

    v6: Style changes from Dave and checkpatch suggestions

    v7: One more checkpatch fixup

    Signed-off-by: Andy Gospodarek
    Signed-off-by: Dinesh Dutt
    Acked-by: Scott Feldman
    Signed-off-by: David S. Miller

    Andy Gospodarek
     

13 Mar, 2015

1 commit

  • hold_net and release_net were an idea that turned out to be useless.
    The code has been disabled since 2008. Kill the code; it is long past due.

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

12 Mar, 2015

1 commit

  • This patch is meant to collapse local and main into one by converting
    tb_data from an array to a pointer. Doing this allows us to point the
    local table into the main while maintaining the same variables in the
    table.

    As such the tb_data was converted from an array to a pointer, and a new
    array called data is added in order to still provide an object for tb_data
    to point to.

    In order to track the origin of the fib aliases a tb_id value was added in
    a hole that existed on 64b systems. Using this we can also reverse the
    merge in the event that custom FIB rules are enabled.

    With this patch I am seeing an improvement of 20ns to 30ns for routing
    lookups as long as custom rules are not enabled, with custom rules enabled
    we fall back to split tables and the original behavior.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     

21 Sep, 2013

1 commit

  • There is a mix of function prototypes with and without extern
    in the kernel sources. Standardize on not using extern for
    function prototypes.

    Function prototypes don't need to be written with extern.
    extern is assumed by the compiler. Its use is as unnecessary as
    using auto to declare automatic/local variables in a block.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

04 Aug, 2013

2 commits

  • Move refcnt, pref, suppress_ifgroup, suppress_prefixlen out of first
    cache line, as they are not used in fast path.

    Make sure ctarget & fr_net are in first cache line.

    (Assuming 64 bit arches and 64 bytes cache lines)

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This change brings the suppressor attribute names into line; it also changes
    the data types to provide a more consistent interface.

    While -1 indicates that the suppressor is not enabled, values >= 0 for
    suppress_prefixlen or suppress_ifgroup reject routing decisions violating the
    constraint.

    This changes the previously presented behaviour of suppress_prefixlen, where a
    prefix length _less_ than the attribute value was rejected. After this change,
    a prefix length less than *or* equal to the value is considered a violation of
    the rule constraint.
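
    The new comparison can be written down in a few lines. The helper name
    is hypothetical; the logic follows the description above: -1 disables
    the suppressor, and a result is rejected when its prefix length is less
    than or equal to the configured value.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a lookup result must be suppressed.
 * result_prefixlen:   prefix length of the route that matched
 * suppress_prefixlen: rule attribute, -1 when the suppressor is off
 */
static inline bool sketch_suppress_by_prefixlen(int result_prefixlen,
						int suppress_prefixlen)
{
	assert(result_prefixlen >= 0);
	if (suppress_prefixlen < 0)
		return false;
	return result_prefixlen <= suppress_prefixlen;
}
```

    With these semantics a value of 0 is enough to hide a default route
    (/0) while letting every more specific route through.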

    It also changes the default values for default and newly added rules (disabling
    any suppression for those).

    Signed-off-by: Stefan Tomanek
    Signed-off-by: David S. Miller

    Stefan Tomanek
     

03 Aug, 2013

1 commit


01 Aug, 2013

1 commit

  • This change adds a new operation to the fib_rules_ops struct; it allows the
    suppression of routing decisions if certain criteria are not met by its
    results.

    The first implemented constraint is a minimum prefix length added to the
    structures of routing rules. If a rule is added with a minimum prefix length
    >0, only routes meeting this threshold will be considered. Any other (more
    general) routing table entries will be ignored.

    When configuring a system with multiple network uplinks and default routes, it
    is often convenient to reference the main routing table multiple times - but
    omitting the default route. Using this patch and a modified "ip" utility, this
    can be achieved by using the following command sequence:

    $ ip route add table secuplink default via 10.42.23.1

    $ ip rule add pref 100 table main prefixlength 1
    $ ip rule add pref 150 fwmark 0xA table secuplink

    With this setup, packets marked 0xA will be processed by the additional routing
    table "secuplink", but only if no suitable route in the main routing table can
    be found. By using a minimal prefixlength of 1, the default route (/0) of the
    table "main" is hidden to packets processed by rule 100; packets traveling to
    destinations with more specific routing entries are processed as usual.

    Signed-off-by: Stefan Tomanek
    Signed-off-by: David S. Miller

    Stefan Tomanek
     

29 Jun, 2012

1 commit

  • If rpfilter is off (or the SKB has an IPSEC path) and there are no
    tclassid users, we don't have to do anything at all when
    fib_validate_source() is invoked besides setting the itag to zero.

    We monitor tclassid uses with a counter (modified only under RTNL and
    marked __read_mostly) and we protect the fib_validate_source() real
    work with a test against this counter and whether rpfilter is to be
    done.
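
    The gate in front of the expensive path can be sketched as a simple
    predicate. The function and parameter names are hypothetical; the
    counter stands for the tclassid user count mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True when source validation has real work to do. When this returns
 * false, the caller only needs to set the itag to zero.
 */
static inline bool sketch_validate_needs_work(bool rpfilter_enabled,
					      bool skb_has_ipsec_path,
					      unsigned int tclassid_users)
{
	/* Reverse path filtering is skipped for packets with an IPSEC path. */
	bool do_rpfilter = rpfilter_enabled && !skb_has_ipsec_path;

	return do_rpfilter || tclassid_users > 0;
}
```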

    Having a way to know whether we need tclassid processing or not
    also opens the door for future optimized rpfilter algorithms that do
    not perform full FIB lookups.

    Signed-off-by: David S. Miller

    David S. Miller
     

28 Oct, 2010

1 commit


06 Oct, 2010

1 commit

  • fib_lookup() converted to be called in RCU protected context, no
    reference taken and released on a contended cache line (fib_clntref)

    fib_table_lookup() and fib_semantic_match() get an additional parameter.

    struct fib_info gets an rcu_head field, and is freed after an rcu grace
    period.

    Stress test :
    (Sending 160.000.000 UDP frames on same neighbour,
    IP route cache disabled, dual E5540 @2.53GHz,
    32bit kernel, FIB_HASH) (about same results for FIB_TRIE)

    Before patch :

    real 1m31.199s
    user 0m13.761s
    sys 23m24.780s

    After patch:

    real 1m5.375s
    user 0m14.997s
    sys 15m50.115s

    Before patch Profile :

    13044.00 15.4% __ip_route_output_key vmlinux
    8438.00 10.0% dst_destroy vmlinux
    5983.00 7.1% fib_semantic_match vmlinux
    5410.00 6.4% fib_rules_lookup vmlinux
    4803.00 5.7% neigh_lookup vmlinux
    4420.00 5.2% _raw_spin_lock vmlinux
    3883.00 4.6% rt_set_nexthop vmlinux
    3261.00 3.9% _raw_read_lock vmlinux
    2794.00 3.3% fib_table_lookup vmlinux
    2374.00 2.8% neigh_resolve_output vmlinux
    2153.00 2.5% dst_alloc vmlinux
    1502.00 1.8% _raw_read_lock_bh vmlinux
    1484.00 1.8% kmem_cache_alloc vmlinux
    1407.00 1.7% eth_header vmlinux
    1406.00 1.7% ipv4_dst_destroy vmlinux
    1298.00 1.5% __copy_from_user_ll vmlinux
    1174.00 1.4% dev_queue_xmit vmlinux
    1000.00 1.2% ip_output vmlinux

    After patch Profile :

    13712.00 15.8% dst_destroy vmlinux
    8548.00 9.9% __ip_route_output_key vmlinux
    7017.00 8.1% neigh_lookup vmlinux
    4554.00 5.3% fib_semantic_match vmlinux
    4067.00 4.7% _raw_read_lock vmlinux
    3491.00 4.0% dst_alloc vmlinux
    3186.00 3.7% neigh_resolve_output vmlinux
    3103.00 3.6% fib_table_lookup vmlinux
    2098.00 2.4% _raw_read_lock_bh vmlinux
    2081.00 2.4% kmem_cache_alloc vmlinux
    2013.00 2.3% _raw_spin_lock vmlinux
    1763.00 2.0% __copy_from_user_ll vmlinux
    1763.00 2.0% ip_output vmlinux
    1761.00 2.0% ipv4_dst_destroy vmlinux
    1631.00 1.9% eth_header vmlinux
    1440.00 1.7% _raw_read_unlock_bh vmlinux

    Reference results, if IP route cache is enabled :

    real 0m29.718s
    user 0m10.845s
    sys 7m37.341s

    25213.00 29.5% __ip_route_output_key vmlinux
    9011.00 10.5% dst_release vmlinux
    4817.00 5.6% ip_push_pending_frames vmlinux
    4232.00 5.0% ip_finish_output vmlinux
    3940.00 4.6% udp_sendmsg vmlinux
    3730.00 4.4% __copy_from_user_ll vmlinux
    3716.00 4.4% ip_route_output_flow vmlinux
    2451.00 2.9% __xfrm_lookup vmlinux
    2221.00 2.6% ip_append_data vmlinux
    1718.00 2.0% _raw_spin_lock_bh vmlinux
    1655.00 1.9% __alloc_skb vmlinux
    1572.00 1.8% sock_wfree vmlinux
    1345.00 1.6% kfree vmlinux

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

05 Oct, 2010

1 commit


26 Apr, 2010

1 commit


14 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

04 Dec, 2009

4 commits

  • Refactor the code so fib_rules_register always takes a template instead
    of the actual fib_rules_ops structure that will be used. This is
    required for network namespace support so 2 out of the 3 callers already
    do this, it allows the error handling to be made common, and it allows
    fib_rules_unregister to free the template for the caller.

    Modify fib_rules_unregister to use call_rcu instead of synchronize_rcu
    to allow multiple namespaces to be cleaned up in the same rcu grace
    period.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • commit 68144d350f4f6c348659c825cde6a82b34c27a91
    Author: Patrick McHardy
    Date: Thu Dec 3 12:05:25 2009 +0100

    net: fib_rules: add oif classification

    Support routing table lookup based on the flow's oif. This is useful to
    classify packets originating from sockets bound to interfaces differently.

    The route cache already includes the oif and needs no changes.
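
    Interface classification for a rule could be sketched like this. The
    helper name is hypothetical, and treating an ifindex of 0 in the rule
    as "any interface" is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True when the flow's input and output interfaces satisfy the rule.
 * A rule ifindex of 0 means the rule does not care about that side.
 */
static inline bool sketch_rule_match_ifaces(int rule_iif, int rule_oif,
					    int flow_iif, int flow_oif)
{
	assert(rule_iif >= 0 && rule_oif >= 0);
	if (rule_iif != 0 && rule_iif != flow_iif)
		return false;
	if (rule_oif != 0 && rule_oif != flow_oif)
		return false;
	return true;
}
```

    A socket bound to an interface produces flows with that interface as
    oif, which is what lets an oif rule steer its traffic to a separate
    table.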

    Signed-off-by: Patrick McHardy

    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • commit 229e77eec406ad68662f18e49fda8b5d366768c5
    Author: Patrick McHardy
    Date: Thu Dec 3 12:05:23 2009 +0100

    net: fib_rules: rename ifindex/ifname/FRA_IFNAME to iifindex/iifname/FRA_IIFNAME

    The next patch will add oif classification, rename interface related members
    and attributes to reflect that they're used for iif classification.

    Signed-off-by: Patrick McHardy

    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • commit b8952893d5d86f69c4e499d191b98c6658f64b0f
    Author: Patrick McHardy
    Date: Thu Dec 3 12:05:22 2009 +0100

    net: fib_rules: rearrange struct fib_rule

    The ifname member is only used to resolve interface names and is not needed
    during rule lookups. The target and ctarget members however are used during
    rule lookups and are currently located in a second cacheline.

    Move ifname further to the end to make sure both target and ctarget are
    located in the same cacheline as other members used during rule lookups.

    The layout on 64 bit changes from:

    struct fib_rule {
            ...
            u32                 table;       /*    56     4 */
            u8                  action;      /*    60     1 */

            /* XXX 3 bytes hole, try to pack */

            /* --- cacheline 1 boundary (64 bytes) --- */
            u32                 target;      /*    64     4 */

            /* XXX 4 bytes hole, try to pack */

            struct fib_rule *   ctarget;     /*    72     8 */
            struct rcu_head     rcu;         /*    80    16 */
            struct net *        fr_net;      /*    96     8 */
    };

    to:

    struct fib_rule {
            ...
            u32                 table;       /*    40     4 */
            u8                  action;      /*    44     1 */

            /* XXX 3 bytes hole, try to pack */

            u32                 target;      /*    48     4 */

            /* XXX 4 bytes hole, try to pack */

            struct fib_rule *   ctarget;     /*    56     8 */
            /* --- cacheline 1 boundary (64 bytes) --- */
            char                ifname[16];  /*    64    16 */
            struct rcu_head     rcu;         /*    80    16 */
            struct net *        fr_net;      /*    96     8 */
    };

    Signed-off-by: Patrick McHardy

    Signed-off-by: David S. Miller

    Patrick McHardy
     

04 Nov, 2009

1 commit

  • This cleanup patch puts struct/union/enum opening braces,
    in first line to ease grep games.

    struct something
    {

    becomes :

    struct something {

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 May, 2009

1 commit


18 May, 2009

1 commit


06 Jul, 2008

1 commit


16 Apr, 2008

1 commit


29 Jan, 2008

7 commits

  • Save namespace context on the fib rule at the rule creation time and
    call routing lookup in the correct namespace.

    Signed-off-by: Denis V. Lunev
    Acked-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • Remove struct net from fib_rules_register(unregister)/notify_change
    paths and diet code size a bit.

    add/remove: 0/0 grow/shrink: 10/12 up/down: 35/-100 (-65)
    function                       old    new  delta
    notify_rule_change             273    280     +7
    trie_show_stats                471    475     +4
    fn_trie_delete                 473    477     +4
    fib_rules_unregister           144    148     +4
    fib4_rule_compare              119    123     +4
    resize                        2842   2845     +3
    fn_trie_select_default         515    518     +3
    inet_sk_rebuild_header         836    838     +2
    fib_trie_seq_show              764    766     +2
    __devinet_sysctl_register      276    278     +2
    fn_trie_lookup                1124   1123     -1
    ip_fib_check_default           133    131     -2
    devinet_conf_sysctl            223    221     -2
    snmp_fold_field                126    123     -3
    fn_trie_insert                2091   2086     -5
    inet_create                    876    870     -6
    fib4_rules_init                197    191     -6
    fib_sync_down                  452    444     -8
    inet_gso_send_check            334    325     -9
    fib_create_info               3003   2991    -12
    fib_nl_delrule                 568    553    -15
    fib_nl_newrule                 883    852    -31

    Signed-off-by: Denis V. Lunev
    Acked-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • The backward link from FIB rules operations to the network namespace
    will allow simplifying the API a bit.

    Signed-off-by: Denis V. Lunev
    Acked-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • fib_rules_unregister is called only after successful register and the
    return code is never checked.

    Signed-off-by: Denis V. Lunev
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • fib_rules_ops contains operations and the list of configured rules. ops will
    become per/namespace soon, so we need them to be known in the default_pref
    callback.

    Acked-by: Benjamin Thery
    Acked-by: Daniel Lezcano
    Signed-off-by: Denis V. Lunev
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • The patch extends the different fib rules API in order to pass the
    network namespace pointer. That will allow access to the different
    tables from a namespace relative object. As usual, the pointer to the
    init_net variable is passed as parameter so we don't break the
    network.

    Acked-by: Benjamin Thery
    Acked-by: Daniel Lezcano
    Signed-off-by: Denis V. Lunev
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • When the fib_rules initialization finished, no return code is provided
    so there is no way to know, for the caller, if the initialization has
    been successful or has failed. This patch fixes that.

    Signed-off-by: Daniel Lezcano
    Acked-by: Benjamin Thery
    Signed-off-by: David S. Miller

    Daniel Lezcano