10 Jan, 2019

1 commit

  • [ Upstream commit 5648451e30a0d13d11796574919a359025d52cce ]

    vr.vifi is indirectly controlled by user-space, hence leading to
    a potential exploitation of the Spectre variant 1 vulnerability.

    This issue was detected with the help of Smatch:

    net/ipv4/ipmr.c:1616 ipmr_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)
    net/ipv4/ipmr.c:1690 ipmr_compat_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)

    Fix this by sanitizing vr.vifi before using it to index mrt->vif_table.

    Notice that given that speculation windows are large, the policy is
    to kill the speculation on the first load and not worry if it can be
    completed with a dependent load/store [1].

    [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Gustavo A. R. Silva
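
The kernel provides array_index_nospec() in <linux/nospec.h> for this kind of sanitizing. The block below is only a userspace sketch of the branch-free clamp idea, not the kernel code: the names index_mask_nospec, index_nospec, vif_model, vif_lookup and MAXVIFS_MODEL are hypothetical, and an ordinary userspace compiler gives no guarantee about speculation.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Userspace model of a branch-free index clamp (assumption: index and
 * size are both below LONG_MAX, and >> on a negative long is an
 * arithmetic shift, as with GCC and Clang).
 */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
    /* All ones when index < size, zero otherwise. */
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

static unsigned long index_nospec(unsigned long index, unsigned long size)
{
    return index & index_mask_nospec(index, size);
}

#define MAXVIFS_MODEL 32UL /* hypothetical table size */

struct vif_model {
    int in_use;
};

/* Bounds-check first, then clamp the index before it is used for the load. */
static struct vif_model *vif_lookup(struct vif_model *table, unsigned long vifi)
{
    if (vifi >= MAXVIFS_MODEL)
        return NULL;
    vifi = index_nospec(vifi, MAXVIFS_MODEL);
    return &table[vifi];
}
```

The clamp is applied after the ordinary bounds check, so an in-range index is unchanged and an out-of-range one collapses to 0 even if the branch is mispredicted.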
     

04 Nov, 2018

1 commit

  • [ Upstream commit eddf016b910486d2123675a6b5fd7d64f77cdca8 ]

    If the skb space ends in an unresolved entry while dumping, we'll miss
    some unresolved entries. The reason is that the entry counter is zeroed
    between dumping resolved and unresolved mfc entries. We should just
    keep counting until the whole table is dumped, and zero the counter only
    when we move to the next table, as we have a separate table counter.

    Reported-by: Colin Ian King
    Fixes: 8fb472c09b9d ("ipmr: improve hash scalability")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Aleksandrov
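
The counting rule described above can be sketched in userspace C. This is a simplified model under stated assumptions, not the ipmr dump code: table_model, dump_table and the cursor parameter are hypothetical stand-ins for the per-table resume state.

```c
#include <stddef.h>

/*
 * Hypothetical model: one table holds resolved entries followed by
 * unresolved ones. A dump writes at most `room` entries per call and
 * resumes from *cursor on the next call.
 */
struct table_model {
    const int *resolved;
    size_t n_resolved;
    const int *unresolved;
    size_t n_unresolved;
};

static size_t dump_table(const struct table_model *t, size_t *cursor,
                         int *out, size_t room)
{
    size_t e = 0; /* entry counter shared by both lists */
    size_t written = 0;
    size_t i;

    for (i = 0; i < t->n_resolved; i++, e++) {
        if (e < *cursor)
            continue; /* already dumped by an earlier call */
        if (written == room)
            goto done; /* buffer full, resume here next time */
        out[written++] = t->resolved[i];
    }
    /* No "e = 0" here: a reset would make the saved cursor ambiguous. */
    for (i = 0; i < t->n_unresolved; i++, e++) {
        if (e < *cursor)
            continue;
        if (written == room)
            goto done;
        out[written++] = t->unresolved[i];
    }
done:
    *cursor = e;
    return written;
}
```

Because the counter runs across both lists, a dump that stops in the unresolved list resumes at the right unresolved entry instead of skipping ahead.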
     

12 Jun, 2018

1 commit

  • [ Upstream commit 66fb33254f45df4b049f487aff1cbde1ef919390 ]

    commit 8fb472c09b9d ("ipmr: improve hash scalability")
    added a call to rhltable_init() without checking its return value.

    This problem was then later copied to IPv6 and factorized in commit
    0bbbf0e7d0e7 ("ipmr, ip6mr: Unite creation of new mr_table")

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] SMP KASAN
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 1 PID: 31552 Comm: syz-executor7 Not tainted 4.17.0-rc5+ #60
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:rht_key_hashfn include/linux/rhashtable.h:277 [inline]
    RIP: 0010:__rhashtable_lookup include/linux/rhashtable.h:630 [inline]
    RIP: 0010:rhltable_lookup include/linux/rhashtable.h:716 [inline]
    RIP: 0010:mr_mfc_find_parent+0x2ad/0xbb0 net/ipv4/ipmr_base.c:63
    RSP: 0018:ffff8801826aef70 EFLAGS: 00010203
    RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffc90001ea0000
    RDX: 0000000000000079 RSI: ffffffff8661e859 RDI: 000000000000000c
    RBP: ffff8801826af1c0 R08: ffff8801b2212000 R09: ffffed003b5e46c2
    R10: ffffed003b5e46c2 R11: ffff8801daf23613 R12: dffffc0000000000
    R13: ffff8801826af198 R14: ffff8801cf8225c0 R15: ffff8801826af658
    FS: 00007ff7fa732700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000003ffffff9c CR3: 00000001b0210000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    ip6mr_cache_find_parent net/ipv6/ip6mr.c:981 [inline]
    ip6mr_mfc_delete+0x1fe/0x6b0 net/ipv6/ip6mr.c:1221
    ip6_mroute_setsockopt+0x15c6/0x1d70 net/ipv6/ip6mr.c:1698
    do_ipv6_setsockopt.isra.9+0x422/0x4660 net/ipv6/ipv6_sockglue.c:163
    ipv6_setsockopt+0xbd/0x170 net/ipv6/ipv6_sockglue.c:922
    rawv6_setsockopt+0x59/0x140 net/ipv6/raw.c:1060
    sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039
    __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
    __do_sys_setsockopt net/socket.c:1914 [inline]
    __se_sys_setsockopt net/socket.c:1911 [inline]
    __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
    do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Fixes: 8fb472c09b9d ("ipmr: improve hash scalability")
    Fixes: 0bbbf0e7d0e7 ("ipmr, ip6mr: Unite creation of new mr_table")
    Signed-off-by: Eric Dumazet
    Cc: Nikolay Aleksandrov
    Cc: Yuval Mintz
    Reported-by: syzbot
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
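
The pattern behind this fix, checking the return value of an init routine before handing out the object, can be sketched as follows. It is a userspace illustration only: hash_model_init, table_new and table_free are hypothetical names and do not reproduce the rhltable_init() call or the mr_table code.

```c
#include <errno.h>
#include <stdlib.h>

struct hash_model {
    void **buckets;
};

struct mr_table_model {
    struct hash_model mfc_hash;
};

/* Stand-in for an init routine that can fail (hypothetical). */
static int hash_model_init(struct hash_model *h, size_t nbuckets)
{
    if (nbuckets == 0)
        return -EINVAL;
    h->buckets = calloc(nbuckets, sizeof(*h->buckets));
    return h->buckets ? 0 : -ENOMEM;
}

/* Returns NULL and sets *err if any step fails; never returns a half-built table. */
static struct mr_table_model *table_new(size_t nbuckets, int *err)
{
    struct mr_table_model *t = calloc(1, sizeof(*t));

    if (!t) {
        *err = -ENOMEM;
        return NULL;
    }
    *err = hash_model_init(&t->mfc_hash, nbuckets);
    if (*err) {
        free(t); /* undo the allocation instead of leaking or using it */
        return NULL;
    }
    return t;
}

static void table_free(struct mr_table_model *t)
{
    if (!t)
        return;
    free(t->mfc_hash.buckets);
    free(t);
}
```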
     

10 Aug, 2017

1 commit

  • This change allows us to later indicate to rtnetlink core that certain
    doit functions should be called without acquiring rtnl_mutex.

    This change should have no effect, we simply replace the last (now
    unused) calcit argument with the new flag.

    Signed-off-by: Florian Westphal
    Reviewed-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florian Westphal
     

12 Jul, 2017

1 commit


30 Jun, 2017

1 commit

  • Add to RTNL_FAMILY_IPMR, RTM_GETROUTE the ability
    to retrieve one S,G mroute from a specified table.

    *,G will return mroute information for just that
    particular mroute if it exists. This is because
    it is entirely possible to have more S's than
    can fit in one skb to return to the requesting
    process.

    Signed-off-by: Donald Sharp
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Donald Sharp
     

21 Jun, 2017

1 commit

  • Add Netlink notifications on cache reports in ipmr, in addition to the
    existing igmpmsg sent to mroute_sk.
    Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV4_MROUTE_R.

    MSGTYPE, VIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the
    same data as their equivalent fields in the igmpmsg header.
    PKT attribute is the packet sent to mroute_sk, without the added igmpmsg
    header.

    Suggested-by: Ryan Halbrook
    Signed-off-by: Julien Gomes
    Reviewed-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Julien Gomes
     

16 Jun, 2017

2 commits

  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions return void * and remove all the casts across
    the tree, adding a (u8 *) cast only where the unsigned char pointer
    was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = {
    skb_pull,
    __skb_pull,
    skb_pull_inline,
    __pskb_pull_tail,
    __pskb_pull,
    pskb_pull
    };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = {
    skb_pull,
    __skb_pull,
    skb_pull_inline,
    __pskb_pull_tail,
    __pskb_pull,
    pskb_pull
    };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions (skb_put, __skb_put and pskb_put) return void *
    and remove all the casts across the tree, adding a (u8 *) cast only
    where the unsigned char pointer was used directly, all done with the
    following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_put, __skb_put };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_put, __skb_put };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    which actually doesn't cover pskb_put since there are only three
    users overall.

    A handful of stragglers were converted manually, notably a macro in
    drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
    instances in net/bluetooth/hci_sock.c. In the former file, I also
    had to fix one whitespace problem spatch introduced.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
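
The effect of returning void * can be seen in a small userspace sketch. buf_model, buf_put, hdr_model and append_hdr_and_pad are hypothetical analogues chosen for illustration; they are not the skb API.

```c
#include <stddef.h>
#include <stdint.h>

struct buf_model {
    uint8_t data[64];
    size_t len;
};

/*
 * Hypothetical analogue of skb_put(): reserve len bytes at the tail.
 * The caller must make sure the buffer has room.
 */
static void *buf_put(struct buf_model *b, size_t len)
{
    void *tail = b->data + b->len;

    b->len += len;
    return tail;
}

struct hdr_model {
    uint8_t type;
    uint8_t code;
};

static void append_hdr_and_pad(struct buf_model *b)
{
    /* void * converts implicitly, so the assignment needs no cast. */
    struct hdr_model *h = buf_put(b, sizeof(*h));

    h->type = 1;
    h->code = 2;
    /* Dereferencing the result directly is the one case that needs a cast. */
    *(uint8_t *)buf_put(b, 1) = 0xff;
}
```

This mirrors the two spatch rules: assignments lose their casts, and direct dereferences gain a (u8 *) cast.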
     

15 Jun, 2017

1 commit


12 Jun, 2017

1 commit

  • This patch fixes two issues:

    1) When forwarding on *,G mroutes that are in a vrf, the
    kernel was dropping information about the actual incoming
    interface when calling ip_mr_forward from ip_mr_input.
    This caused ip_mr_forward to send the multicast packet
    back out the incoming interface. Fix this by
    modifying ip_mr_forward to be handed the correctly
    resolved dev.

    2) When an unresolved cache entry is created we store
    the incoming skb on the unresolved cache entry and
    upon mroute resolution from the user space daemon,
    we attempt to forward the packet. Again we were
    not resolving to the correct incoming device for
    a vrf scenario, before calling ip_mr_forward.
    Fix this by resolving to the correct interface
    and calling ip_mr_forward with the result.

    Fixes: e58e41596811 ("net: Enable support for VRF with ipv4 multicast")
    Signed-off-by: Donald Sharp
    Acked-by: David Ahern
    Acked-by: Nikolay Aleksandrov
    Reviewed-by: Yotam Gigi
    Signed-off-by: David S. Miller

    Donald Sharp
     

09 Jun, 2017

1 commit

  • Currently there's no way to dump the VIF table for an ipmr table other
    than the default (via proc). This is a major issue when debugging ipmr
    issues and in general it is good to know which interfaces are
    configured. This patch adds support for RTM_GETLINK for the ipmr family
    so we can dump the VIF table and the ipmr table's current config for
    each table. We're protected by rtnl so no need to acquire RCU or
    mrt_lock.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

08 Jun, 2017

1 commit

  • Network devices can allocate resources and private memory using
    netdev_ops->ndo_init(). However, the release of these resources
    can occur in one of two different places.

    Either netdev_ops->ndo_uninit() or netdev->destructor().

    The decision of which operation frees the resources depends upon
    whether it is necessary for all netdev refs to be released before it
    is safe to perform the freeing.

    netdev_ops->ndo_uninit() presumably can occur right after the
    NETDEV_UNREGISTER notifier completes and the unicast and multicast
    address lists are flushed.

    netdev->destructor(), on the other hand, does not run until the
    netdev references all go away.

    Further complicating the situation is that netdev->destructor()
    almost universally also does a free_netdev().

    This creates a problem for the logic in register_netdevice(),
    because all callers of register_netdevice() manage the freeing
    of the netdev, and invoke free_netdev(dev) if register_netdevice()
    fails.

    If netdev_ops->ndo_init() succeeds, but something else fails inside
    of register_netdevice(), it does call netdev_ops->ndo_uninit(). But
    it is not able to invoke netdev->destructor().

    This is because netdev->destructor() will do a free_netdev() and
    then the caller of register_netdevice() will do the same.

    However, this means that the resources that would normally be released
    by netdev->destructor() will not be.

    Over the years drivers have added local hacks to deal with this, by
    invoking their destructor parts by hand when register_netdevice()
    fails.

    Many drivers do not try to deal with this, and instead we have leaks.

    Let's close this hole by formalizing the distinction between what
    private things need to be freed up by netdev->destructor() and whether
    the driver needs unregister_netdevice() to perform the free_netdev().

    netdev->priv_destructor() performs all actions to free up the private
    resources that used to be freed by netdev->destructor(), except for
    free_netdev().

    netdev->needs_free_netdev is a boolean that indicates whether
    free_netdev() should be done at the end of unregister_netdevice().

    Now, register_netdevice() can sanely release all resources after
    netdev_ops->ndo_init() succeeds, by invoking both netdev_ops->ndo_uninit()
    and netdev->priv_destructor().

    And at the end of unregister_netdevice(), we invoke
    netdev->priv_destructor() and optionally call free_netdev().

    Signed-off-by: David S. Miller

    David S. Miller
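
The split between priv_destructor and needs_free_netdev can be modelled in userspace C. This is a sketch under assumptions: netdev_model, register_model, unregister_model and the demo_* callbacks are hypothetical, and the real register_netdevice() does far more than this.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; field names mirror the commit message. */
struct netdev_model {
    bool needs_free_netdev;
    int (*ndo_init)(struct netdev_model *dev);
    void (*ndo_uninit)(struct netdev_model *dev);
    void (*priv_destructor)(struct netdev_model *dev);
    void *priv;
};

static int uninit_calls;
static int priv_destructor_calls;

static int demo_init(struct netdev_model *dev)
{
    dev->priv = malloc(16); /* private resource owned by the driver */
    return dev->priv ? 0 : -1;
}

static void demo_uninit(struct netdev_model *dev)
{
    (void)dev;
    uninit_calls++;
}

static void demo_priv_destructor(struct netdev_model *dev)
{
    free(dev->priv); /* frees private state, never the device itself */
    dev->priv = NULL;
    priv_destructor_calls++;
}

/* On a late failure, undo ndo_init fully; the caller still frees dev. */
static int register_model(struct netdev_model *dev, bool later_step_fails)
{
    int err = dev->ndo_init ? dev->ndo_init(dev) : 0;

    if (err)
        return err;
    if (later_step_fails) {
        if (dev->ndo_uninit)
            dev->ndo_uninit(dev);
        if (dev->priv_destructor)
            dev->priv_destructor(dev);
        return -1;
    }
    return 0;
}

/* On unregister, free the device only if the driver asked for it. */
static void unregister_model(struct netdev_model *dev)
{
    if (dev->ndo_uninit)
        dev->ndo_uninit(dev);
    if (dev->priv_destructor)
        dev->priv_destructor(dev);
    if (dev->needs_free_netdev)
        free(dev);
}
```

Since priv_destructor never frees the device, the failure path can call it without risking a double free in the caller.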
     

17 May, 2017

1 commit

  • The skb->dev that is passed into ip_mr_input is
    the loX device for VRFs. When we look up a vif
    for this dev, none is found as we do not create
    vifs for loopbacks. Instead look up a vif for the
    actual device that the packet was received on,
    e.g. the vlan.

    Signed-off-by: Thomas Winter
    cc: David Ahern
    cc: Nikolay Aleksandrov
    cc: roopa
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Thomas Winter
     

20 Apr, 2017

1 commit


18 Apr, 2017

2 commits

  • Add netlink_ext_ack arg to rtnl_doit_func. Pass extack arg to nlmsg_parse
    for doit functions that call it directly.

    This is the first step to using extended error reporting in rtnetlink.
    From here individual subsystems can be updated to set netlink_ext_ack as
    needed.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Similar to commit 87e9f0315952
    ("ipv4: fix a potential deadlock in mcast getsockopt() path"),
    there is a deadlock scenario for IP_ROUTER_ALERT too:

    CPU0                    CPU1
    ----                    ----
    lock(rtnl_mutex);
                            lock(sk_lock-AF_INET);
                            lock(rtnl_mutex);
    lock(sk_lock-AF_INET);

    Fix this by always locking RTNL first on all setsockopt() paths.

    Note, after this patch ip_ra_lock is no longer needed either.

    Reported-by: Dmitry Vyukov
    Tested-by: Andrey Konovalov
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
     

14 Apr, 2017

1 commit


29 Mar, 2017

1 commit


28 Feb, 2017

1 commit

  • Now that %z is standardised in C99 there is no reason to support %Z.
    Unlike %L it doesn't even make format strings smaller.

    Use BUILD_BUG_ON in a couple ATM drivers.

    In case anyone didn't notice lib/vsprintf.o is about half of SLUB which
    in my opinion is quite an achievement. Hopefully this patch inspires
    someone else to trim vsprintf.c more.

    Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2
    Signed-off-by: Alexey Dobriyan
    Cc: Andy Shevchenko
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
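
For reference, the standard C99 length modifier is used like this in ordinary userspace code. format_size is a hypothetical helper for illustration; the kernel change itself concerns the kernel's own vsprintf implementation.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t with the C99 "z" length modifier; returns snprintf's result. */
static int format_size(char *buf, size_t buflen, size_t value)
{
    return snprintf(buf, buflen, "%zu", value);
}
```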
     

13 Jan, 2017

1 commit

  • Recently we started using ipmr with thousands of entries and easily hit
    soft lockups on smaller devices. The reason is that the hash function
    uses the high order bits from the src and dst, but those don't change in
    many common cases. Also, the hash table has only 64 elements, so with
    thousands of entries it doesn't scale at all.
    This patch migrates the hash table to rhashtable, and in particular the
    rhl interface which allows for duplicate elements to be chained because
    of the MFC_PROXY support (*,G; *,*,oif cases) which allows for multiple
    duplicate entries to be added with different interfaces (IMO wrong, but
    it's been in for a long time).

    And here are some results from tests I've run in a VM:
    mr_table size (default, allocated for all namespaces):
      Before                After
      49304 bytes           2400 bytes

    Add 65000 routes (the diff is much larger on smaller devices):
      Before                After
      1m42s                 58s

    Forwarding 256 byte packets with 65000 routes (test done in a VM):
      Before                After
      3 Mbps / ~1465 pps    122 Mbps / ~59000 pps

    As a bonus we no longer see the soft lockups on smaller devices which
    showed up even with 2000 entries before.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
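
Why a hash over high-order bits degrades can be shown with a small userspace experiment. Everything here is an illustrative assumption: hash_high_bits is a made-up hash that only looks at high-order address bits (it is not claimed to be the old kernel macro), and hash_mul is a plain multiplicative hash used for contrast, not rhashtable's hash.

```c
#include <stddef.h>
#include <stdint.h>

#define LINES 64u /* table size mentioned in the commit message */

/* Hypothetical hash that only looks at high-order address bits. */
static unsigned hash_high_bits(uint32_t src, uint32_t grp)
{
    return ((src >> 24) ^ (grp >> 26)) & (LINES - 1);
}

/* Multiplicative hash over all bits, keeping the top 6 bits (for contrast). */
static unsigned hash_mul(uint32_t src, uint32_t grp)
{
    return ((src ^ grp) * 0x9e3779b9u) >> 26;
}

/* Longest bucket chain when groups base, base+1, ... share one source. */
static size_t longest_chain(unsigned (*hash)(uint32_t, uint32_t),
                            uint32_t src, uint32_t base, size_t n)
{
    size_t count[LINES] = {0};
    size_t max = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        size_t c = ++count[hash(src, base + (uint32_t)i)];

        if (c > max)
            max = c;
    }
    return max;
}
```

With one source and 2000 consecutive group addresses, the high-bits hash places every entry in the same bucket, so each lookup walks a 2000-element chain.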
     

09 Jan, 2017

1 commit


03 Jan, 2017

1 commit

  • While working with ipmr, we noticed that it is impossible to determine
    if an entry is actually unresolved or its IIF interface has disappeared
    (e.g. virtual interface got deleted). These entries look almost
    identical to user-space when dumping or receiving notifications. So in
    order to recognize them add a new RTNH_F_UNRESOLVED flag which is set when
    sending an unresolved cache entry to user-space.

    Suggested-by: Roopa Prabhu
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

25 Dec, 2016

1 commit


15 Nov, 2016

1 commit


04 Nov, 2016

1 commit

  • Some configurations (e.g. geneve interface with default
    MTU of 1500 over an ethernet interface with 1500 MTU) result
    in the transmission of packets that exceed the configured MTU.
    While this should be considered to be a "bad" configuration,
    it is still allowed and should not result in the sending
    of packets that exceed the configured MTU.

    Fix by dropping the assumption in ip_finish_output_gso() that
    locally originated gso packets will never need fragmentation.
    Basic testing using iperf (observing CPU usage and bandwidth)
    has shown no measurable performance impact for traffic not
    requiring fragmentation.

    Fixes: c7ba65d7b649 ("net: ip: push gso skb forwarding handling down the stack")
    Reported-by: Jan Tluka
    Signed-off-by: Lance Richardson
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Lance Richardson
     

01 Nov, 2016

2 commits

  • Enable support for IPv4 multicast:
    - similar to unicast the flow struct is updated to L3 master device
    if relevant prior to calling fib_rules_lookup. The table id is saved
    to the lookup arg so the rule action for ipmr can return the table
    associated with the device.

    - ip_mr_forward needs to check for master device mismatch as well
    since the skb->dev is set to it

    - allow multicast address on VRF device for Rx by checking for the
    daddr in the VRF device as well as the original ingress device

    - on Tx need to drop to __mkroute_output when FIB lookup fails for
    multicast destination address.

    - if CONFIG_IP_MROUTE_MULTIPLE_TABLES is enabled VRF driver creates
    IPMR FIB rules on first device create similar to FIB rules. In
    addition the VRF driver does not divert IPv4 multicast packets:
    it breaks on Tx since the fib lookup fails on the mcast address.

    With this patch, ipmr forwarding and local rx/tx work.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

26 Sep, 2016

1 commit

  • Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid
    instead of the previous dst_pid which was copied from in_skb's portid.
    Since the skb is new the portid is 0 at that point so the packets are sent
    to the kernel and we get scheduling while atomic or a deadlock (depending
    on where it happens) by trying to acquire rtnl two times.
    Also since this is RTM_GETROUTE, it can be triggered by a normal user.

    Here's the sleeping while atomic trace:
    [ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
    [ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
    [ 7858.212881] 2 locks held by swapper/0/0:
    [ 7858.213013] #0: (((&mrt->ipmr_expire_timer))){+.-...}, at: [] call_timer_fn+0x5/0x350
    [ 7858.213422] #1: (mfc_unres_lock){+.....}, at: [] ipmr_expire_process+0x25/0x130
    [ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
    [ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
    [ 7858.214108] 0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
    [ 7858.214412] ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
    [ 7858.214716] 000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
    [ 7858.215251] Call Trace:
    [ 7858.215412] [] dump_stack+0x85/0xc1
    [ 7858.215662] [] ___might_sleep+0x192/0x250
    [ 7858.215868] [] __might_sleep+0x6f/0x100
    [ 7858.216072] [] mutex_lock_nested+0x33/0x4d0
    [ 7858.216279] [] ? netlink_lookup+0x25f/0x460
    [ 7858.216487] [] rtnetlink_rcv+0x1b/0x40
    [ 7858.216687] [] netlink_unicast+0x19c/0x260
    [ 7858.216900] [] rtnl_unicast+0x20/0x30
    [ 7858.217128] [] ipmr_destroy_unres+0xa9/0xf0
    [ 7858.217351] [] ipmr_expire_process+0x8f/0x130
    [ 7858.217581] [] ? ipmr_net_init+0x180/0x180
    [ 7858.217785] [] ? ipmr_net_init+0x180/0x180
    [ 7858.217990] [] call_timer_fn+0xa5/0x350
    [ 7858.218192] [] ? call_timer_fn+0x5/0x350
    [ 7858.218415] [] ? ipmr_net_init+0x180/0x180
    [ 7858.218656] [] run_timer_softirq+0x260/0x640
    [ 7858.218865] [] ? __do_softirq+0xbb/0x54f
    [ 7858.219068] [] __do_softirq+0xe8/0x54f
    [ 7858.219269] [] irq_exit+0xb8/0xc0
    [ 7858.219463] [] smp_apic_timer_interrupt+0x42/0x50
    [ 7858.219678] [] apic_timer_interrupt+0x8c/0xa0
    [ 7858.219897] [] ? native_safe_halt+0x6/0x10
    [ 7858.220165] [] ? trace_hardirqs_on+0xd/0x10
    [ 7858.220373] [] default_idle+0x23/0x190
    [ 7858.220574] [] arch_cpu_idle+0xf/0x20
    [ 7858.220790] [] default_idle_call+0x4c/0x60
    [ 7858.221016] [] cpu_startup_entry+0x39b/0x4d0
    [ 7858.221257] [] rest_init+0x135/0x140
    [ 7858.221469] [] start_kernel+0x50e/0x51b
    [ 7858.221670] [] ? early_idt_handler_array+0x120/0x120
    [ 7858.221894] [] x86_64_start_reservations+0x2a/0x2c
    [ 7858.222113] [] x86_64_start_kernel+0x13b/0x14a

    Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

21 Sep, 2016

1 commit

  • When I introduced the lastuse member I made a subtle error because it was
    returned as an absolute value, but that is meaningless to user-space as it
    doesn't show exactly how old an entry is. Let's make it similar to
    how the bridge returns such values and make it relative to "now" (jiffies).
    This allows us to show the actual age of the entries and is much more
    useful (e.g. user-space daemons can age out entries, iproute2 can display
    the lastuse properly).

    Fixes: 43b9e1274060 ("net: ipmr/ip6mr: add support for keeping an entry age")
    Reported-by: Satish Ashok
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
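
Reporting an age relative to "now" comes down to an unsigned subtraction followed by a unit conversion. The sketch below is a userspace model with assumed constants: HZ_MODEL, USER_HZ_MODEL, age_ticks and ticks_to_user are hypothetical names, and the tick rates are example values only.

```c
#include <stdint.h>

#define HZ_MODEL 250u      /* hypothetical internal tick rate */
#define USER_HZ_MODEL 100u /* hypothetical user-visible clock rate */

/* Age in ticks; unsigned subtraction stays correct across counter wraparound. */
static uint32_t age_ticks(uint32_t now, uint32_t lastuse)
{
    return now - lastuse;
}

/* Convert internal ticks to user-visible clock units. */
static uint32_t ticks_to_user(uint32_t ticks)
{
    return (uint32_t)((uint64_t)ticks * USER_HZ_MODEL / HZ_MODEL);
}
```

An absolute tick value means nothing outside the kernel, whereas the difference can be interpreted by a daemon directly as "time since last use".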
     

27 Jul, 2016

1 commit

  • Currently lastuse is updated on entry creation and cache hit, but it should
    also be updated on entry change. Since both on add and update the ttl array
    is updated we can simply update the lastuse in ipmr_update_thresholds.

    Signed-off-by: Nikolay Aleksandrov
    CC: Roopa Prabhu
    CC: Donald Sharp
    CC: David S. Miller
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

20 Jul, 2016

1 commit


17 Jul, 2016

1 commit

  • In preparation for hardware offloading of ipmr/ip6mr we need an
    interface that allows checking (and later updating) the age of entries.
    Relying on stats alone can show activity but not the actual age of the
    entry. Furthermore, when there are tens of thousands of entries, a lot of
    the hardware implementations only support "hit" bits which are cleared on
    read to denote that the entry was active and shouldn't be aged out;
    these can then be naturally translated into an age timestamp and will be
    compatible with the software forwarding age. Using a lastuse entry doesn't
    affect performance because the members in that cache line are written to
    along with the age.
    Since all new users are encouraged to use ipmr via netlink, this is
    exported via the RTA_EXPIRES attribute.
    Also do a minor local variable declaration style adjustment - arrange them
    longest to shortest.

    Signed-off-by: Nikolay Aleksandrov
    CC: Roopa Prabhu
    CC: Shrijeet Mukherjee
    CC: Satish Ashok
    CC: Donald Sharp
    CC: David S. Miller
    CC: Alexey Kuznetsov
    CC: James Morris
    CC: Hideaki YOSHIFUJI
    CC: Patrick McHardy
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
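
How a clear-on-read "hit" bit could be translated into an age timestamp is sketched below. This is a hypothetical model only: entry_model, test_and_clear_hit and poll_entry are invented names, and no real offload driver interface is implied.

```c
#include <stdbool.h>
#include <stdint.h>

struct entry_model {
    uint32_t lastuse; /* tick count of the last observed activity */
};

/* Model of a hardware hit bit that is cleared when it is read. */
static bool test_and_clear_hit(bool *hw_hit)
{
    bool hit = *hw_hit;

    *hw_hit = false;
    return hit;
}

/* Periodic poll: a set hit bit means the entry was used since the last poll. */
static void poll_entry(struct entry_model *e, bool *hw_hit, uint32_t now)
{
    if (test_and_clear_hit(hw_hit))
        e->lastuse = now;
}
```

The timestamp is only as precise as the polling interval, which is the trade-off of deriving an age from a single bit.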
     

28 Jun, 2016

1 commit


22 Apr, 2016

1 commit


04 Dec, 2015

1 commit


01 Dec, 2015

4 commits

  • This patch adds support to add and remove MFC entries. It uses the
    same attributes as the already present dump support in order to be
    consistent. There's one new entry, RTA_PREFSRC, which is used to denote an
    MFC_PROXY entry (see MRT_ADD_MFC vs MRT_ADD_MFC_PROXY).
    The already existing infrastructure is used to create and delete the
    entries, the netlink message gets converted internally to a struct mfcctl
    which is used with ipmr_mfc_add/delete.
    The other used attributes are:
    RTA_IIF - used for mfcc_parent (when adding it's required to be valid)
    RTA_SRC - used for mfcc_origin
    RTA_DST - used for mfcc_mcastgrp
    RTA_TABLE - the MRT table id
    RTA_MULTIPATH - the "oifs" ttl array

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Both errors can occur at once, and in that case we return only the
    second one. Fix it to return one error at a time, as is normal. I
    overlooked this in my previous set.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Move the inline pimsm_enabled() to pim.h and rename it to
    ipmr_pimsm_enabled to show it's for the ipv4 ipmr code since pim.h is
    used by IPv6 too.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Move the definitions of VIF_EXISTS() and struct mr_table to mroute.h

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov