05 Nov, 2015

1 commit

  • A bug report (https://bugzilla.kernel.org/show_bug.cgi?id=107071) noted
    that the following ip command is failing with v4.3:

    $ ip route add 10.248.5.0/24 dev bond0.250 table vlan_250 src 10.248.5.154
    RTNETLINK answers: Invalid argument

    021dd3b8a142d changed the lookup of the given preferred source address to
    use the table id passed in, but this assumes the local entries are in the
    given table, which is not necessarily true for non-VRF use cases. When
    validating the preferred source, fall back to the local table on failure.

    Fixes: 021dd3b8a142d ("net: Add routes to the table associated with the device")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
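
The fallback described above can be sketched as follows. This is a minimal illustration, not the kernel code: the `TABLES` mapping and the helper names `is_local_in` and `prefsrc_valid` are hypothetical, and tables are modelled as plain sets of local addresses.

```python
import ipaddress

# Hypothetical model: table name -> local addresses installed in that table.
TABLES = {
    "local": {ipaddress.ip_address("10.248.5.154")},
    "vlan_250": set(),  # policy-routing table without local entries
}

def is_local_in(table, addr):
    """Return True if addr is a local address in the named table."""
    return ipaddress.ip_address(addr) in TABLES.get(table, set())

def prefsrc_valid(table, addr):
    """Check the requested table first, then fall back to the local table."""
    if is_local_in(table, addr):
        return True
    return table != "local" and is_local_in("local", addr)

# Lookup restricted to the passed-in table: the address is rejected.
strict = is_local_in("vlan_250", "10.248.5.154")
# Lookup with the fallback: the address is accepted.
relaxed = prefsrc_valid("vlan_250", "10.248.5.154")
```

In this model the strict lookup corresponds to the failing command from the bug report, while the fallback accepts the same preferred source.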
     

04 Nov, 2015

1 commit


03 Nov, 2015

1 commit

  • This patch changes how the multipath hash is computed for locally
    generated flows: now the hash comprises l4 information.

    This allows better utilization of the available paths when the existing
    flows have the same source IP and the same destination IP: with l3 hash,
    even when multiple connections are in place simultaneously, a single path
    will be used, while with l4 hash we can use all the available paths.

    v2 changes:
    - use get_hash_from_flowi4() instead of implementing just another l4 hash
    function

    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
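
The difference between the two hashes can be illustrated with a small sketch. It assumes four equal-weight paths and uses SHA-256 purely as a stand-in for the kernel's flow hash; `flow_hash`, `select_l3` and `select_l4` are hypothetical names, not kernel functions.

```python
import hashlib

PATHS = ["path0", "path1", "path2", "path3"]

def flow_hash(*fields):
    # SHA-256 is only a stand-in; the kernel uses its own flow hash.
    digest = hashlib.sha256("|".join(map(str, fields)).encode()).digest()
    return int.from_bytes(digest[:4], "big")

def select_l3(src, dst, sport, dport):
    """L3 hash: only the addresses pick the path; ports are ignored."""
    return PATHS[flow_hash(src, dst) % len(PATHS)]

def select_l4(src, dst, sport, dport):
    """L4 hash: the ports take part in path selection as well."""
    return PATHS[flow_hash(src, dst, sport, dport) % len(PATHS)]

# 64 connections between the same pair of hosts, differing in source port.
flows = [("10.0.0.1", "10.0.0.2", 40000 + i, 80) for i in range(64)]
l3_paths = {select_l3(*f) for f in flows}
l4_paths = {select_l4(*f) for f in flows}
```

With the L3 hash every flow lands on the same path, so `l3_paths` has a single element; with the L4 hash the flows spread over several paths.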
     

02 Nov, 2015

2 commits

  • When nexthop is part of multipath route we should clear the
    LINKDOWN flag when link goes UP or when first address is added.
    This is needed because we always set LINKDOWN flag when DEAD flag
    was set but now on UP the nexthop is not dead anymore. Examples when
    LINKDOWN bit can be forgotten when no NETDEV_CHANGE is delivered:

    - link goes down (LINKDOWN is set), then link goes UP and device
    shows carrier OK but LINKDOWN remains set

    - last address is deleted (LINKDOWN is set), then address is
    added and device shows carrier OK but LINKDOWN remains set

    Steps to reproduce:
    modprobe dummy
    ifconfig dummy0 192.168.168.1 up

    here add a multipath route where one nexthop is for dummy0:

    ip route add 1.2.3.4 nexthop dev dummy0 nexthop dev SOME_OTHER_DEVICE
    ifconfig dummy0 down
    ifconfig dummy0 up

    now ip route shows nexthop that is not dead. Now set the sysctl var:

    echo 1 > /proc/sys/net/ipv4/conf/dummy0/ignore_routes_with_linkdown

    now ip route will show a dead nexthop because the forgotten
    RTNH_F_LINKDOWN is propagated as RTNH_F_DEAD.

    Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops")
    Signed-off-by: Julian Anastasov
    Signed-off-by: David S. Miller

    Julian Anastasov
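
The forgotten flag can be modelled in a few lines. The flag values below match include/uapi/linux/rtnetlink.h; the functions themselves are a hypothetical simplification of the behaviour described in the message, not the kernel implementation.

```python
RTNH_F_DEAD = 1
RTNH_F_LINKDOWN = 16

def on_down(flags):
    """Device goes down: the nexthop is marked dead and link-down."""
    return flags | RTNH_F_DEAD | RTNH_F_LINKDOWN

def on_up_buggy(flags):
    """Behaviour before the fix, as described: only DEAD is cleared on UP."""
    return flags & ~RTNH_F_DEAD

def on_up_fixed(flags):
    """Behaviour after the fix: LINKDOWN is cleared together with DEAD."""
    return flags & ~(RTNH_F_DEAD | RTNH_F_LINKDOWN)

def reported(flags, ignore_routes_with_linkdown):
    """With the sysctl set, LINKDOWN is propagated to user space as DEAD."""
    if ignore_routes_with_linkdown and flags & RTNH_F_LINKDOWN:
        flags |= RTNH_F_DEAD
    return flags

stale = on_up_buggy(on_down(0))   # LINKDOWN survives the down/up cycle
clean = on_up_fixed(on_down(0))   # no flags left after the cycle
```

In this model `reported(stale, True)` shows the nexthop as dead even though the device is up, which is the symptom in the reproduction steps.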
     
  • When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
    we should not delete the local routes if the local address
    is still present. The confusion comes from the fact that both
    fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
    constant. Fix it by bringing back the 'force' variable.

    Steps to reproduce:
    modprobe dummy
    ifconfig dummy0 192.168.168.1 up
    ifconfig dummy0 down
    ip route list table local | grep dummy | grep host
    local 192.168.168.1 dev dummy0 proto kernel scope host src 192.168.168.1

    Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops")
    Signed-off-by: Julian Anastasov
    Signed-off-by: David S. Miller

    Julian Anastasov
     

16 Oct, 2015

1 commit

  • This command:
    ip route add 192.168.1.0/24 nexthop via 10.2.1.5 dev eth1 nexthop via 10.2.2.5 dev eth2

    generated this suspicious RCU usage message:

    [ 63.249262]
    [ 63.249939] ===============================
    [ 63.251571] [ INFO: suspicious RCU usage. ]
    [ 63.253250] 4.3.0-rc3+ #298 Not tainted
    [ 63.254724] -------------------------------
    [ 63.256401] ../include/linux/inetdevice.h:205 suspicious rcu_dereference_check() usage!
    [ 63.259450]
    [ 63.259450] other info that might help us debug this:
    [ 63.259450]
    [ 63.262297]
    [ 63.262297] rcu_scheduler_active = 1, debug_locks = 1
    [ 63.264647] 1 lock held by ip/2870:
    [ 63.265896] #0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x12/0x14
    [ 63.268858]
    [ 63.268858] stack backtrace:
    [ 63.270409] CPU: 4 PID: 2870 Comm: ip Not tainted 4.3.0-rc3+ #298
    [ 63.272478] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
    [ 63.275745] 0000000000000001 ffff8800b8c9f8b8 ffffffff8125f73c ffff88013afcf301
    [ 63.278185] ffff8800bab7a380 ffff8800b8c9f8e8 ffffffff8107bf30 ffff8800bb728000
    [ 63.280634] ffff880139fe9a60 0000000000000000 ffff880139fe9a00 ffff8800b8c9f908
    [ 63.283177] Call Trace:
    [ 63.283959] [] dump_stack+0x4c/0x68
    [ 63.285593] [] lockdep_rcu_suspicious+0xfa/0x103
    [ 63.287500] [] __in_dev_get_rcu+0x48/0x4f
    [ 63.289169] [] fib_rebalance+0x3e/0x127
    [ 63.290753] [] ? rcu_read_unlock+0x3e/0x5f
    [ 63.292442] [] fib_create_info+0xaf9/0xdcc
    [ 63.294093] [] ? sched_clock_local+0x12/0x75
    [ 63.295791] [] fib_table_insert+0x8c/0x451
    [ 63.297493] [] ? fib_get_table+0x36/0x43
    [ 63.299109] [] inet_rtm_newroute+0x43/0x51
    [ 63.300709] [] rtnetlink_rcv_msg+0x182/0x195
    [ 63.302334] [] ? trace_hardirqs_on+0xd/0xf
    [ 63.303888] [] ? rtnl_lock+0x12/0x14
    [ 63.305346] [] ? __rtnl_unlock+0x12/0x12
    [ 63.306878] [] netlink_rcv_skb+0x3d/0x90
    [ 63.308437] [] rtnetlink_rcv+0x21/0x28
    [ 63.309916] [] netlink_unicast+0xfa/0x17f
    [ 63.311447] [] netlink_sendmsg+0x297/0x2dc
    [ 63.313029] [] sock_sendmsg_nosec+0x12/0x1d
    [ 63.314597] [] ___sys_sendmsg+0x196/0x21b
    [ 63.316125] [] ? native_sched_clock+0x1f/0x3c
    [ 63.317671] [] ? sched_clock_local+0x12/0x75
    [ 63.319185] [] ? sched_clock_cpu+0x9d/0xb6
    [ 63.320693] [] ? __lock_is_held+0x32/0x54
    [ 63.322145] [] ? __fget_light+0x4b/0x77
    [ 63.323541] [] __sys_sendmsg+0x3d/0x5b
    [ 63.324947] [] SyS_sendmsg+0xd/0x19
    [ 63.326274] [] entry_SYSCALL_64_fastpath+0x12/0x6f

    It looks like all of the code paths to fib_rebalance are under rtnl.

    Fixes: 0e884c78ee19 ("ipv4: L3 hash-based multipath")
    Cc: Peter Nørlund
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

07 Oct, 2015

1 commit


06 Oct, 2015

1 commit

  • This fixes

    net/built-in.o: In function `fib_rebalance':
    fib_semantics.c:(.text+0x9df14): undefined reference to `__divdi3'

    and

    net/built-in.o: In function `fib_rebalance':
    net/ipv4/fib_semantics.c:572: undefined reference to `__aeabi_ldivmod'

    Fixes: 0e884c78ee19 ("ipv4: L3 hash-based multipath")

    Signed-off-by: Peter Nørlund
    Signed-off-by: David S. Miller

    Peter Nørlund
     

05 Oct, 2015

1 commit


02 Sep, 2015

1 commit

  • A number of VRF patches used 'int' for table id. It should be u32 to be
    consistent with the rest of the stack.

    Fixes:
    4e3c89920cd3a ("net: Introduce VRF related flags and helpers")
    15be405eb2ea9 ("net: Add inet_addr lookup by table")
    30bbaa1950055 ("net: Fix up inet_addr_type checks")
    021dd3b8a142d ("net: Add routes to the table associated with the device")
    dc028da54ed35 ("inet: Move VRF table lookup to inlined function")
    f6d3c19274c74 ("net: FIB tracepoints")

    Signed-off-by: David Ahern
    Reviewed-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    David Ahern
     

01 Sep, 2015

3 commits

  • Currently, the following case doesn't use DCTCP, even if it should:
    A responder has e.g. Cubic as system wide default, but for a specific
    route to the initiating host, DCTCP is being set in RTAX_CC_ALGO. The
    initiating host then uses DCTCP as congestion control, but since the
    initiator sets ECT(0), tcp_ecn_create_request() doesn't set ecn_ok,
    and we have to fall back to Reno after 3WHS completes.

    We were thinking about how to solve this in a minimal, non-intrusive
    way without bloating tcp_ecn_create_request() needlessly: let's cache
    the CA ecn option flag in RTAX_FEATURES. In other words, when ECT(0)
    is set on the SYN packet, set ecn_ok=1 iff route RTAX_FEATURES
    contains the unexposed (internal-only) DST_FEATURE_ECN_CA. This allows
    doing only a single metric feature lookup inside tcp_ecn_create_request().

    Joint work with Florian Westphal.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Daniel Borkmann
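
The decision described above can be sketched as a single predicate. This is a deliberately simplified model: the value of `DST_FEATURE_ECN_CA` is an assumption, `request_ecn_ok` is a hypothetical name, and other inputs the real tcp_ecn_create_request() may consider (such as the listener's own congestion control) are left out.

```python
DST_FEATURE_ECN_CA = 1 << 31  # assumed value of the internal-only bit

def request_ecn_ok(syn_is_ect, ecn_enabled, route_features):
    """Decide ecn_ok for an incoming SYN in this simplified model.

    syn_is_ect     -- the SYN carries ECT(0) in its IP header
    ecn_enabled    -- ECN is enabled for ordinary (non-ECT) SYNs
    route_features -- RTAX_FEATURES value cached on the route
    """
    if syn_is_ect:
        # An ECT-marked SYN is accepted only if the route's congestion
        # control asked for ECN (cached as a single feature bit).
        return bool(route_features & DST_FEATURE_ECN_CA)
    return ecn_enabled
```

The point of caching the flag in RTAX_FEATURES is visible here: the ECT branch needs only one bit test on the route's feature word.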
     
  • Feature bits that are invalid should not be accepted by the kernel:
    only the lower 4 bits may be configured, but not the remaining ones.
    Even of these 4, 2 are unused.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
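
A minimal sketch of such a check, assuming the valid bits are exactly the lower four. The function name `validate_features` is hypothetical, and the Python exception stands in for the error the kernel would return.

```python
RTAX_FEATURE_MASK = 0xF  # lower 4 bits are the only configurable ones

def validate_features(value):
    """Return value unchanged if only valid feature bits are set."""
    if value & ~RTAX_FEATURE_MASK:
        # Stand-in for rejecting the request with an error code.
        raise ValueError("invalid feature bits: %#x" % value)
    return value

def is_valid(value):
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate_features(value)
    except ValueError:
        return False
    return True
```

For example, `is_valid(0x1)` holds, while `is_valid(0x10)` does not because bit 4 lies outside the mask.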
     
  • fib_create_info() is already quite large, so before adding more
    code to the metrics section move that to a helper, similar to
    ip6_convert_metrics.

    Suggested-by: Daniel Borkmann
    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
     

25 Aug, 2015

1 commit

  • Add cfg and family arguments to lwt build state functions. cfg is a void
    pointer and will either be a pointer to a fib_config or fib6_config
    structure. The family parameter indicates which one (either AF_INET
    or AF_INET6).

    LWT encapsulation implementations may use the fib configuration to build
    the LWT state.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

21 Aug, 2015

2 commits

  • Andreas reported breakage adding routes with local nexthops:
    $ ip route show table main
    ...
    172.28.0.0/24 dev vnf-xe1p0 proto kernel scope link src 172.28.0.16

    $ ip route add 10.0.0.0/8 via 172.28.0.32 table 100 dev vnf-xe1p0
    RTNETLINK answers: Resource temporarily unavailable

    3bfd847203c changed the lookup to use the passed in table but for cases like
    this the nexthop is in the local table rather than the passed in table.

    Fixes: 3bfd847203c ("net: Use passed in table for nexthop lookups")
    Reported-by: Andreas Schultz
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Make fib_encap_match() static as it isn't used outside the file.

    Signed-off-by: Ying Xue
    Reviewed-by: Jiri Benc
    Signed-off-by: David S. Miller

    Ying Xue
     

19 Aug, 2015

1 commit

  • The built lwtunnel_state struct has to be freed after comparison.

    Fixes: 571e722676fe3 ("ipv4: support for fib route lwtunnel encap attributes")
    Signed-off-by: Jiri Benc
    Acked-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Jiri Benc
     

17 Aug, 2015

1 commit

  • fib_lookup() forces FIB_LOOKUP_NOREF flag, while fib_table_lookup()
    does not.

    This patch solves the typical message at reboot time or device
    dismantle:

    unregister_netdevice: waiting for eth0 to become free. Usage count = 4

    Fixes: 3bfd847203c6 ("net: Use passed in table for nexthop lookups")
    Signed-off-by: Eric Dumazet
    Cc: David Ahern
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Eric Dumazet
     

14 Aug, 2015

3 commits

  • If a user passes in a table for new routes use that table for nexthop
    lookups. Specifically, this solves the case where a connected route does
    not exist in the main table, but only another table and then a subsequent
    route is added with a next hop using the connected route, i.e.,

    $ ip route ls
    default via 10.0.2.2 dev eth0
    10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
    169.254.0.0/16 dev eth0 scope link metric 1003
    192.168.56.0/24 dev eth1 proto kernel scope link src 192.168.56.51

    $ ip route ls table 10
    1.1.1.0/24 dev eth2 scope link

    Without this patch adding a nexthop route fails:

    $ ip route add table 10 2.2.2.0/24 via 1.1.1.10
    RTNETLINK answers: Network is unreachable

    With this patch the route is added successfully.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
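
The effect of using the passed-in table can be sketched with the example tables above. This is an illustrative model only: `TABLES` lists just the connected (scope link) prefixes, and `gateway_reachable` is a hypothetical helper, not a kernel function.

```python
import ipaddress

# Connected prefixes per table, taken from the example output above.
TABLES = {
    "main": ["10.0.2.0/24", "169.254.0.0/16", "192.168.56.0/24"],
    "10": ["1.1.1.0/24"],
}

def gateway_reachable(table, gw):
    """Return True if gw lies on a connected prefix of the given table."""
    addr = ipaddress.ip_address(gw)
    return any(addr in ipaddress.ip_network(p) for p in TABLES.get(table, []))

# Lookup that ignores the requested table and uses main: gateway not found.
before = gateway_reachable("main", "1.1.1.10")
# Lookup in the table passed by the user: gateway found via 1.1.1.0/24.
after = gateway_reachable("10", "1.1.1.10")
```

The `before` case corresponds to "Network is unreachable"; the `after` case corresponds to the route being added successfully.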
     
  • When a device associated with a VRF is brought up or down routes
    should be added to/removed from the table associated with the VRF.
    fib_magic defaults to using the main or local tables. Have it use
    the table with the device if there is one.

    A part of this is directing prefsrc validations to the correct
    table as well.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Currently inet_addr_type and inet_dev_addr_type expect local addresses
    to be in the local table. With the VRF device local routes for devices
    associated with a VRF will be in the table associated with the VRF.
    Provide an alternate inet_addr lookup to use a specific table rather
    than defaulting to the local table.

    inet_addr_type_dev_table keeps the same semantics as inet_addr_type but
    if the passed in device is enslaved to a VRF then the table for that VRF
    is used for the lookup.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

01 Aug, 2015

1 commit


27 Jul, 2015

2 commits

  • It saves some lines and simplifies the code a bit when the state is returned
    by this function. It's also useful to handle a NULL entry.

    To avoid too long lines, I've also renamed lwtunnel_state_get() and
    lwtunnel_state_put() to lwtstate_get() and lwtstate_put().

    CC: Thomas Graf
    CC: Roopa Prabhu
    Signed-off-by: Nicolas Dichtel
    Acked-by: Thomas Graf
    Acked-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • Currently, we do not notice if new alternative gateways
    are added. We can do it by checking for present neigh
    entry. Also, gateways that are currently probed (NUD_INCOMPLETE)
    can be skipped from round-robin probing.

    Suggested-by: Florian Westphal
    Signed-off-by: Julian Anastasov
    Signed-off-by: David S. Miller

    Julian Anastasov
     

25 Jul, 2015

2 commits

  • fib_select_default considers alternative routes only when
    res->fi is for the first alias in res->fa_head. In the
    common case this can happen only when the initial lookup
    matches the first alias with highest TOS value. This
    prevents the alternative routes from requiring a specific TOS.

    This patch solves the problem as follows:

    - routes that require specific TOS should be returned by
    fib_select_default only when TOS matches, as already done
    in fib_table_lookup. This rule implies that depending on the
    TOS we can have many different lists of alternative gateways
    and we have to keep the last used gateway (fa_default) in first
    alias for the TOS instead of using single tb_default value.

    - as the aliases are ordered by many keys (TOS desc,
    fib_priority asc), we restrict the possible results to
    routes with matching TOS and lowest metric (fib_priority)
    and routes that match any TOS, again with lowest metric.

    For example, packet with TOS 8 can not use gw3 (not lowest
    metric), gw4 (different TOS) and gw6 (not lowest metric),
    all other gateways can be used:

    tos 8 via gw1 metric 2 fa_head and res->fi
    tos 8 via gw2 metric 2
    tos 8 via gw3 metric 3
    tos 4 via gw4
    tos 0 via gw5
    tos 0 via gw6 metric 1

    Reported-by: Hagen Paul Pfeifer
    Signed-off-by: Julian Anastasov
    Signed-off-by: David S. Miller

    Julian Anastasov
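
The selection rule from the example can be reproduced with a short sketch. It assumes that an entry listed without a metric has metric 0; `Alias` and `candidates` are hypothetical names, and the list order mirrors the alias ordering (TOS descending, metric ascending).

```python
from collections import namedtuple

Alias = namedtuple("Alias", "tos gw metric")

# The six aliases from the example; a missing metric is assumed to be 0.
ALIASES = [
    Alias(8, "gw1", 2), Alias(8, "gw2", 2), Alias(8, "gw3", 3),
    Alias(4, "gw4", 0), Alias(0, "gw5", 0), Alias(0, "gw6", 1),
]

def candidates(aliases, tos):
    """Gateways usable for a packet with the given TOS.

    Keeps routes with matching TOS and lowest metric, plus routes that
    match any TOS (tos 0), again with lowest metric.
    """
    result = []
    for wanted in dict.fromkeys((tos, 0)):  # de-duplicates when tos == 0
        group = [a for a in aliases if a.tos == wanted]
        if group:
            best = min(a.metric for a in group)
            result += [a.gw for a in group if a.metric == best]
    return result
```

For a packet with TOS 8 this yields gw1, gw2 and gw5, which excludes gw3, gw4 and gw6 exactly as the message states.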
     
  • fib_trie starting from 4.1 can link fib aliases from
    different prefixes in same list. Make sure the alternative
    gateways are in same table and for same prefix (0) by
    checking tb_id and fa_slen.

    Fixes: 79e5ad2ceb00 ("fib_trie: Remove leaf_info")
    Signed-off-by: Julian Anastasov
    Signed-off-by: David S. Miller

    Julian Anastasov
     

22 Jul, 2015

1 commit


29 Jun, 2015

1 commit

  • The following lockdep splat was seen due to the wrong context for
    grabbing in_dev.

    ===============================
    [ INFO: suspicious RCU usage. ]
    4.1.0-next-20150626-dbg-00020-g54a6d91-dirty #244 Not tainted
    -------------------------------
    include/linux/inetdevice.h:205 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    rcu_scheduler_active = 1, debug_locks = 0
    2 locks held by ip/403:
    #0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x19
    #1: ((inetaddr_chain).rwsem){.+.+.+}, at: [] __blocking_notifier_call_chain+0x35/0x6a

    stack backtrace:
    CPU: 2 PID: 403 Comm: ip Not tainted 4.1.0-next-20150626-dbg-00020-g54a6d91-dirty #244
    0000000000000001 ffff8800b189b728 ffffffff8150a542 ffffffff8107a8b3
    ffff880037bbea40 ffff8800b189b758 ffffffff8107cb74 ffff8800379dbd00
    ffff8800bec85800 ffff8800bf9e13c0 00000000000000ff ffff8800b189b7d8
    Call Trace:
    [] dump_stack+0x4c/0x6e
    [] ? up+0x39/0x3e
    [] lockdep_rcu_suspicious+0xf7/0x100
    [] fib_dump_info+0x227/0x3e2
    [] rtmsg_fib+0xa6/0x116
    [] fib_table_insert+0x316/0x355
    [] fib_magic+0xb7/0xc7
    [] fib_add_ifaddr+0xb1/0x13b
    [] fib_inetaddr_event+0x36/0x90
    [] notifier_call_chain+0x4c/0x71
    [] __blocking_notifier_call_chain+0x4e/0x6a
    [] blocking_notifier_call_chain+0x14/0x16
    [] __inet_insert_ifa+0x1a5/0x1b3
    [] inet_rtm_newaddr+0x350/0x35f
    [] rtnetlink_rcv_msg+0x17b/0x18a
    [] ? trace_hardirqs_on+0xd/0xf
    [] ? netlink_deliver_tap+0x1cb/0x1f7
    [] ? rtnl_newlink+0x72a/0x72a
    ...

    This patch resolves that splat.

    Signed-off-by: Andy Gospodarek
    Reported-by: Sergey Senozhatsky
    Signed-off-by: David S. Miller

    Andy Gospodarek
     

24 Jun, 2015

2 commits

  • This feature is only enabled with the new per-interface or ipv4 global
    sysctls called 'ignore_routes_with_linkdown'.

    net.ipv4.conf.all.ignore_routes_with_linkdown = 0
    net.ipv4.conf.default.ignore_routes_with_linkdown = 0
    net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
    ...

    When the above sysctls are set, the kernel will report to userspace that
    a route is dead and will no longer resolve to this nexthop when performing
    a fib lookup. This will signal to userspace that the route will not be
    selected. The signalling of a RTNH_F_DEAD is only passed to userspace
    if the sysctl is enabled and link is down. This was done as without it
    the netlink listeners would have no idea whether or not a nexthop would
    be selected. The kernel only sets RTNH_F_DEAD internally if the
    interface has IFF_UP cleared.

    With the new sysctl set, the following behavior can be observed
    (interface p8p1 is link-down):

    default via 10.0.5.2 dev p9p1
    10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
    70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
    80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown
    90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 dead linkdown
    90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
    90.0.0.1 via 70.0.0.2 dev p7p1 src 70.0.0.1
    cache
    local 80.0.0.1 dev lo src 80.0.0.1
    cache
    80.0.0.2 via 10.0.5.2 dev p9p1 src 10.0.5.15
    cache

    While the route does remain in the table (so it can be modified if
    needed rather than being wiped away as it would be if IFF_UP was
    cleared), the proper next-hop is chosen automatically when the link is
    down. Now interface p8p1 is linked-up:

    default via 10.0.5.2 dev p9p1
    10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
    70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
    80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1
    90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1
    90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
    192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2
    90.0.0.1 via 80.0.0.2 dev p8p1 src 80.0.0.1
    cache
    local 80.0.0.1 dev lo src 80.0.0.1
    cache
    80.0.0.2 dev p8p1 src 80.0.0.1
    cache

    and the output changes to what one would expect.

    If the sysctl is not set, the following output would be expected when
    p8p1 is down:

    default via 10.0.5.2 dev p9p1
    10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
    70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
    80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 linkdown
    90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 linkdown
    90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2

    Since the dead flag does not appear, there should be no expectation that
    the kernel would skip using this route due to link being down.

    v2: Split kernel changes into 2 patches, this actually makes a
    behavioral change if the sysctl is set. Also took suggestion from Alex
    to simplify code by only checking sysctl during fib lookup and
    suggestion from Scott to add a per-interface sysctl.

    v3: Code clean-ups to make it more readable and efficient as well as a
    reverse path check fix.

    v4: Drop binary sysctl

    v5: Whitespace fixups from Dave

    v6: Style changes from Dave and checkpatch suggestions

    v7: One more checkpatch fixup

    Signed-off-by: Andy Gospodarek
    Signed-off-by: Dinesh Dutt
    Acked-by: Scott Feldman
    Signed-off-by: David S. Miller

    Andy Gospodarek
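
The lookup behaviour shown in the listings can be sketched for the two 90.0.0.0/24 routes. This is a simplified model: `Route` and `select` are hypothetical names, and the per-interface sysctl is reduced to a single boolean.

```python
from collections import namedtuple

Route = namedtuple("Route", "gw dev metric linkdown")

# The two routes to 90.0.0.0/24 from the example, with p8p1 link-down.
ROUTES = [
    Route("80.0.0.2", "p8p1", 1, True),
    Route("70.0.0.2", "p7p1", 2, False),
]

def select(routes, ignore_routes_with_linkdown):
    """Pick the lowest-metric route, skipping link-down ones if asked to."""
    for route in sorted(routes, key=lambda r: r.metric):
        if ignore_routes_with_linkdown and route.linkdown:
            continue  # treated as dead for the purpose of the lookup
        return route
    return None

with_sysctl = select(ROUTES, True)      # falls through to the metric 2 route
without_sysctl = select(ROUTES, False)  # still picks the metric 1 route
```

With the sysctl set the lookup resolves via 70.0.0.2 on p7p1, matching the cached route in the first listing; without it the link-down route via 80.0.0.2 is still selected.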
     
  • Add a fib flag called RTNH_F_LINKDOWN to any ipv4 nexthops that are
    reachable via an interface where carrier is off. No action is taken,
    but additional flags are passed to userspace to indicate carrier status.

    This also includes a cleanup to fib_disable_ip to more clearly indicate
    what event made the function call to replace the more cryptic force
    option previously used.

    v2: Split out kernel functionality into 2 patches, this patch simply
    sets and clears new nexthop flag RTNH_F_LINKDOWN.

    v3: Cleanups suggested by Alex as well as a bug noticed in
    fib_sync_down_dev and fib_sync_up when multipath was not enabled.

    v5: Whitespace and variable declaration fixups suggested by Dave.

    v6: Style fixups noticed by Dave; ran checkpatch to be sure I got them
    all.

    Signed-off-by: Andy Gospodarek
    Signed-off-by: Dinesh Dutt
    Acked-by: Scott Feldman
    Signed-off-by: David S. Miller

    Andy Gospodarek
     

03 May, 2015

1 commit


04 Apr, 2015

1 commit

  • The ipv4 code uses a mixture of coding styles. In some instances check
    for NULL pointer is done as x == NULL and sometimes as !x. !x is
    preferred according to checkpatch and this patch makes the code
    consistent by adopting the latter form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

01 Apr, 2015

2 commits


13 Mar, 2015

1 commit

  • hold_net and release_net were an idea that turned out to be useless.
    The code has been disabled since 2008. Kill the code; it is long past due.

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

28 Feb, 2015

1 commit


26 Jan, 2015

1 commit


18 Jan, 2015

1 commit

  • Contrary to common expectations for an "int" return, these functions
    return only a positive value -- if used correctly they cannot even
    return 0 because the message header will necessarily be in the skb.

    This makes the very common pattern of

    if (genlmsg_end(...) < 0) { ... }

    be a whole bunch of dead code. Many places also simply do

    return nlmsg_end(...);

    and the caller is expected to deal with it.

    This also commonly (at least for me) causes errors, because it is very
    common to write

    if (my_function(...))
    /* error condition */

    and if my_function() does "return nlmsg_end()" this is of course wrong.

    Additionally, there's not a single place in the kernel that actually
    needs the message length returned, and if anyone needs it later then
    it'll be very easy to just use skb->len there.

    Remove this, and make the functions void. This removes a bunch of dead
    code as described above. The patch adds lines because I did

    - return nlmsg_end(...);
    + nlmsg_end(...);
    + return 0;

    I could have preserved all the function's return values by returning
    skb->len, but instead I've audited all the places calling the affected
    functions and found that none cared. A few places actually compared
    the return value with < 0 with no change in behaviour, so I opted for the more
    efficient version.

    One instance of the error I've made numerous times now is also present
    in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
    check for
    Signed-off-by: David S. Miller

    Johannes Berg
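
The pitfall described in the message can be reproduced in a small model. All names here are hypothetical stand-ins: `nlmsg_end_old` imitates the old length-returning behaviour, and `caller` imitates the common "non-zero means error" check.

```python
def nlmsg_end_old(msg):
    """Old behaviour: return the total message length, always positive."""
    return len(msg)

def fill_info_old(msg):
    """Fill function that propagates the positive length to its caller."""
    msg.extend(b"payload")
    return nlmsg_end_old(msg)

def fill_info_new(msg):
    """Fill function after the change: finalising returns nothing."""
    msg.extend(b"payload")
    return 0  # explicit success, as in "nlmsg_end(...); return 0;"

def caller(fill):
    """Typical caller that treats any non-zero return as an error."""
    msg = bytearray(b"hdr")
    if fill(msg):
        return "error"
    return "ok"
```

Here `caller(fill_info_old)` reports an error although nothing failed, while `caller(fill_info_new)` behaves as the caller expects.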
     

06 Jan, 2015

1 commit

  • This patch adds the minimum necessary for the RTAX_CC_ALGO congestion
    control metric to be set up and dumped back to user space.

    While the internal representation of RTAX_CC_ALGO is handled as a u32
    key, we avoided exposing this implementation detail to user space, thus
    instead, we chose the netlink attribute that is being exchanged with
    user space to be the actual congestion control algorithm name, similarly
    as in the setsockopt(2) API in order to allow for maximum flexibility,
    even for 3rd party modules.

    It is a bit unfortunate that RTAX_QUICKACK used up a whole RTAX slot as
    it should have been stored in RTAX_FEATURES instead, we first thought
    about reusing it for the congestion control key, but it brings more
    complications and/or confusion than worth it.

    Joint work with Florian Westphal.

    Signed-off-by: Florian Westphal
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
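
The split between the user-visible name and the internal key can be sketched as follows. This is an assumption-laden illustration: `zlib.crc32` is only a stand-in for however the kernel derives its u32 key, and `REGISTERED`, `set_metric` and `dump_metric` are hypothetical names.

```python
import zlib

# Hypothetical set of registered congestion control algorithms.
REGISTERED = ["reno", "cubic", "dctcp"]

def key_of(name):
    """Derive a u32 key from a name (crc32 is only a stand-in hash)."""
    return zlib.crc32(name.encode()) & 0xFFFFFFFF

KEY_TO_NAME = {key_of(n): n for n in REGISTERED}

def set_metric(name):
    """User space passes a name; the route stores only the u32 key."""
    if name not in REGISTERED:
        raise LookupError("unknown congestion control: " + name)
    return key_of(name)

def dump_metric(key):
    """On dump, the stored key is translated back into the name."""
    return KEY_TO_NAME[key]
```

User space only ever sees the name in both directions, so the key derivation can change without affecting the netlink interface.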
     

15 Oct, 2014

1 commit

  • fib_nh_match does not match nexthops correctly. Example:

    ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \
    nexthop via 192.168.122.13 dev eth0
    ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \
    nexthop via 192.168.122.15 dev eth0

    The del command is successful and the route is removed. After this patch
    is applied, the route is correctly matched and the result is:
    RTNETLINK answers: No such process

    Please consider this for stable trees as well.

    Fixes: 4e902c57417c4 ("[IPv4]: FIB configuration using struct fib_config")
    Signed-off-by: Jiri Pirko
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Jiri Pirko
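
The expected matching behaviour after the fix can be sketched like this. It models only the corrected semantics, not the internals of the bug; `ROUTES` and `route_del` are hypothetical, and nexthops are compared position by position as (gateway, device) pairs.

```python
import errno

# The multipath route from the example, keyed by prefix.
ROUTES = {
    "172.16.10.0/24": [("192.168.122.12", "eth0"), ("192.168.122.13", "eth0")],
}

def route_del(table, prefix, nexthops):
    """Delete a route only if every requested nexthop matches."""
    installed = table.get(prefix)
    if installed is None or list(nexthops) != installed:
        return -errno.ESRCH  # reported as "No such process"
    del table[prefix]
    return 0

# Deleting with the wrong gateways must fail and leave the route in place.
wrong = route_del(ROUTES, "172.16.10.0/24",
                  [("192.168.122.14", "eth0"), ("192.168.122.15", "eth0")])
still_there = "172.16.10.0/24" in ROUTES
# Deleting with the gateways that were actually installed succeeds.
right = route_del(ROUTES, "172.16.10.0/24",
                  [("192.168.122.12", "eth0"), ("192.168.122.13", "eth0")])
```

The first call returns -ESRCH, which is what ip prints as "RTNETLINK answers: No such process".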