25 Sep, 2015

1 commit


09 Sep, 2015

1 commit

  • Pull infiniband/rdma updates from Doug Ledford:
    "This is a fairly sizeable set of changes. I've put them through a
    decent amount of testing prior to sending the pull request due to
    that.

    There are still a few fixups that I know are coming, but I wanted to
    go ahead and get the big, sizable chunk into your hands sooner rather
    than waiting for those last few fixups.

    Of note is the fact that this creates what is intended to be a
    temporary area in the drivers/staging tree specifically for some
    cleanups and additions that are coming for the RDMA stack. We
    deprecated two drivers (ipath and amso1100) and are waiting to hear
    back if we can deprecate another one (ehca). We also put Intel's new
    hfi1 driver into this area because it needs to be refactored and a
    transfer library created out of the factored out code, and then it and
    the qib driver and the soft-roce driver should all be modified to use
    that library.

    I expect drivers/staging/rdma to be around for three or four kernel
    releases and then to go away as all of the work is completed and final
    deletions of deprecated drivers are done.

    Summary of changes for 4.3:

    - Create drivers/staging/rdma
    - Move amso1100 driver to staging/rdma and schedule for deletion
    - Move ipath driver to staging/rdma and schedule for deletion
    - Add hfi1 driver to staging/rdma and set TODO for move to regular
    tree
    - Initial support for namespaces to be used on RDMA devices
    - Add RoCE GID table handling to the RDMA core caching code
    - Infrastructure to support handling of devices with differing read
    and write scatter gather capabilities
    - Various iSER updates
    - Kill off unsafe usage of global mr registrations
    - Update SRP driver
    - Misc mlx4 driver updates
    - Support for the mr_alloc verb
    - Support for a netlink interface between kernel and user space cache
    daemon to speed path record queries and route resolution
    - Initial support for safe hot removal of verbs devices"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (136 commits)
    IB/ipoib: Suppress warning for send only join failures
    IB/ipoib: Clean up send-only multicast joins
    IB/srp: Fix possible protection fault
    IB/core: Move SM class defines from ib_mad.h to ib_smi.h
    IB/core: Remove unnecessary defines from ib_mad.h
    IB/hfi1: Add PSM2 user space header to header_install
    IB/hfi1: Add CSRs for CONFIG_SDMA_VERBOSITY
    mlx5: Fix incorrect wc pkey_index assignment for GSI messages
    IB/mlx5: avoid destroying a NULL mr in reg_user_mr error flow
    IB/uverbs: reject invalid or unknown opcodes
    IB/cxgb4: Fix if statement in pick_local_ip6adddrs
    IB/sa: Fix rdma netlink message flags
    IB/ucma: HW Device hot-removal support
    IB/mlx4_ib: Disassociate support
    IB/uverbs: Enable device removal when there are active user space applications
    IB/uverbs: Explicitly pass ib_dev to uverbs commands
    IB/uverbs: Fix race between ib_uverbs_open and remove_one
    IB/uverbs: Fix reference counting usage of event files
    IB/core: Make ib_dealloc_pd return void
    IB/srp: Create an insecure all physical rkey only if needed
    ...

    Linus Torvalds
     

31 Aug, 2015

1 commit

  • For loopback purposes, RoCE devices should have a default GID in the
    port GID table, even when the interface is down. In order to do so,
    we use the IPv6 link local address which would have been generated
    for the related Ethernet netdevice when it goes up as a default GID.

    addrconf_ifid_eui48 is used to generate this address, so export it.

    Signed-off-by: Matan Barak
    Signed-off-by: Doug Ledford

    Matan Barak
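
The derivation described above can be sketched in userspace C. This is a minimal illustration of the standard modified EUI-64 construction (RFC 4291, Appendix A), not the kernel's addrconf_ifid_eui48 itself, which may handle additional device-specific cases. The function names eui48_to_ifid and default_gid_from_mac are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Build a modified EUI-64 interface identifier from a 48-bit MAC address:
 * insert ff:fe between the two halves of the MAC and flip the
 * universal/local bit of the first octet. */
static void eui48_to_ifid(const uint8_t mac[6], uint8_t ifid[8])
{
    memcpy(ifid, mac, 3);          /* first three octets of the MAC */
    ifid[3] = 0xff;
    ifid[4] = 0xfe;
    memcpy(ifid + 5, mac + 3, 3);  /* last three octets of the MAC */
    ifid[0] ^= 0x02;               /* invert the universal/local bit */
}

/* Form the 16-byte link-local address fe80::/64 + interface identifier,
 * which is the shape a default GID would take under this assumption. */
static void default_gid_from_mac(const uint8_t mac[6], uint8_t gid[16])
{
    memset(gid, 0, 16);
    gid[0] = 0xfe;
    gid[1] = 0x80;
    eui48_to_ifid(mac, gid + 8);
}
```

Because the result depends only on the MAC address, it can be computed while the interface is still down, which is what makes it usable as a default GID for loopback.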
     

01 Aug, 2015

1 commit

  • This patch adds net argument to ipv6_stub_impl.ipv6_dst_lookup
    for use cases where sk is not available (like mpls).
    sk appears to be needed to get the namespace 'net' and is optional
    otherwise. This patch series changes ipv6_stub_impl.ipv6_dst_lookup
    to take net argument. sk remains optional.

    All callers of ipv6_stub_impl.ipv6_dst_lookup have been modified
    to pass net. I have modified them to use already available
    'net' in the scope of the call. I can change them to
    sock_net(sk) to avoid any unintended change in behaviour if sock
    namespace is different. They don't seem to be, from code inspection.

    Signed-off-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Roopa Prabhu
     

05 May, 2015

1 commit

  • With this patch, the IGMP and MLD message validation functions are moved
    from the bridge code to IPv4/IPv6 multicast files. Some small
    refactoring was done to enhance readability and to iron out some
    differences in behaviour between the IGMP and MLD parsing code (e.g. the
    skb-cloning of MLD messages is now only done if necessary, just like the
    IGMP part always did).

    Finally, these IGMP and MLD message validation functions are exported so
    that not only the bridge can use it but batman-adv later, too.

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     

06 Feb, 2015

1 commit

  • RFC 4429 ("Optimistic DAD") states that optimistic addresses
    should be treated as deprecated addresses. From section 2.1:

    Unless noted otherwise, components of the IPv6 protocol stack
    should treat addresses in the Optimistic state equivalently to
    those in the Deprecated state, indicating that the address is
    available for use but should not be used if another suitable
    address is available.

    Optimistic addresses are indeed avoided when other addresses are
    available (i.e. at source address selection time), but they have
    not heretofore been available for things like explicit bind() and
    sendmsg() with struct in6_pktinfo, etc.

    This change makes optimistic addresses treated more like
    deprecated addresses than tentative ones.

    Signed-off-by: Erik Kline
    Acked-by: Lorenzo Colitti
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Erik Kline
     

24 Sep, 2014

1 commit


14 Sep, 2014

1 commit


13 Sep, 2014

1 commit

  • If we try to rmmod the driver for an interface while sockets with
    setsockopt(JOIN_ANYCAST) are alive, some refcounts aren't cleaned up
    and we get stuck on:

    unregister_netdevice: waiting for ens3 to become free. Usage count = 1

    If we LEAVE_ANYCAST/close everything before rmmod'ing, there is no
    problem.

    We need to perform a cleanup similar to the one for multicast in
    addrconf_ifdown(how == 1).

    Signed-off-by: Sabrina Dubroca
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     

01 May, 2014

1 commit


28 Feb, 2014

1 commit

  • Avoid the following sparse __CHECK_ENDIAN__ warnings:

    include/net/addrconf.h:318:25: warning: restricted __be64 degrades to integer
    include/net/addrconf.h:318:70: warning: restricted __be64 degrades to integer
    include/net/addrconf.h:330:25: warning: restricted __be64 degrades to integer
    include/net/addrconf.h:330:70: warning: restricted __be64 degrades to integer
    include/net/addrconf.h:347:25: warning: restricted __be64 degrades to integer
    include/net/addrconf.h:348:26: warning: restricted __be64 degrades to integer
    include/net/addrconf.h:349:18: warning: restricted __be64 degrades to integer

    The warnings are false but they make it harder to spot real
    bugs.

    Signed-off-by: Bjørn Mork
    Signed-off-by: David S. Miller

    Bjørn Mork
     

23 Jan, 2014

1 commit

  • This change allows an anycast address to be considered valid as a source
    address when given via an IPV6_PKTINFO or IPV6_2292PKTINFO ancillary data item.
    So, when sending a datagram with ancillary data, the unicast and anycast
    addresses are handled in the same way.

    - Adds ipv6_chk_acast_addr_src() to check if an anycast address is link-local
    on given interface or is global.
    - Uses it in ip6_datagram_send_ctl().

    Signed-off-by: Francois-Xavier Le Bail
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    FX Le Bail
     

10 Dec, 2013

1 commit


07 Dec, 2013

1 commit


29 Sep, 2013

1 commit

  • When a router is doing DNAT for 6to4/6rd packets the latest
    anti-spoofing commit 218774dc ("ipv6: add anti-spoofing checks for
    6to4 and 6rd") will drop them because the IPv6 address embedded does
    not match the IPv4 destination. This patch will allow them to pass by
    testing if we have an address that matches on 6to4/6rd interface. I
    have been hit by this problem using Fedora and IPV6TO4_IPV4ADDR.
    Also, log the dropped packets (with rate limit).

    Signed-off-by: Catalin(ux) M. BOIE
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Catalin(ux) M. BOIE
     

01 Sep, 2013

3 commits


01 Aug, 2013

1 commit

  • There is a mix of function prototypes with and without extern
    in the kernel sources. Standardize on not using extern for
    function prototypes.

    Function prototypes don't need to be written with extern.
    extern is assumed by the compiler. Its use is as unnecessary as
    using auto to declare automatic/local variables in a block.

    Reflow modified prototypes to 80 columns.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
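
A small C sketch of the point made above: a function prototype has external linkage whether or not extern is written, so the two declarations below are equivalent and may coexist. The function name add_one is hypothetical and chosen only for illustration.

```c
/* Both prototypes declare the same function with external linkage;
 * the extern keyword adds nothing to a function declaration. */
extern int add_one(int x);
int add_one(int x);

/* A single definition satisfies both declarations. */
int add_one(int x)
{
    return x + 1;
}
```

A compiler accepts the repeated declaration without complaint, which is why dropping extern from prototypes changes nothing in the generated code.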
     

02 Jul, 2013

1 commit

  • dingtianhong reported the following deadlock detected by lockdep:

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    3.4.24.05-0.1-default #1 Not tainted
    -------------------------------------------------------
    ksoftirqd/0/3 is trying to acquire lock:
    (&ndev->lock){+.+...}, at: [] ipv6_get_lladdr+0x74/0x120

    but task is already holding lock:
    (&mc->mca_lock){+.+...}, at: [] mld_send_report+0x40/0x150

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&mc->mca_lock){+.+...}:
    [] validate_chain+0x637/0x730
    [] __lock_acquire+0x2f7/0x500
    [] lock_acquire+0x114/0x150
    [] rt_spin_lock+0x4a/0x60
    [] igmp6_group_added+0x3b/0x120
    [] ipv6_mc_up+0x38/0x60
    [] ipv6_find_idev+0x3d/0x80
    [] addrconf_notify+0x3d5/0x4b0
    [] notifier_call_chain+0x3f/0x80
    [] raw_notifier_call_chain+0x11/0x20
    [] call_netdevice_notifiers+0x32/0x60
    [] __dev_notify_flags+0x34/0x80
    [] dev_change_flags+0x40/0x70
    [] do_setlink+0x237/0x8a0
    [] rtnl_newlink+0x3ec/0x600
    [] rtnetlink_rcv_msg+0x160/0x310
    [] netlink_rcv_skb+0x89/0xb0
    [] rtnetlink_rcv+0x27/0x40
    [] netlink_unicast+0x140/0x180
    [] netlink_sendmsg+0x33e/0x380
    [] sock_sendmsg+0x112/0x130
    [] __sys_sendmsg+0x44e/0x460
    [] sys_sendmsg+0x44/0x70
    [] system_call_fastpath+0x16/0x1b

    -> #0 (&ndev->lock){+.+...}:
    [] check_prev_add+0x3de/0x440
    [] validate_chain+0x637/0x730
    [] __lock_acquire+0x2f7/0x500
    [] lock_acquire+0x114/0x150
    [] rt_read_lock+0x42/0x60
    [] ipv6_get_lladdr+0x74/0x120
    [] mld_newpack+0xb6/0x160
    [] add_grhead+0xab/0xc0
    [] add_grec+0x3ab/0x460
    [] mld_send_report+0x5a/0x150
    [] igmp6_timer_handler+0x4e/0xb0
    [] call_timer_fn+0xca/0x1d0
    [] run_timer_softirq+0x1df/0x2e0
    [] handle_pending_softirqs+0xf7/0x1f0
    [] __do_softirq_common+0x7b/0xf0
    [] __thread_do_softirq+0x1af/0x210
    [] run_ksoftirqd+0xe1/0x1f0
    [] kthread+0xae/0xc0
    [] kernel_thread_helper+0x4/0x10

    Actually, we can just hold idev->lock before taking pmc->mca_lock,
    and avoid taking idev->lock again when iterating idev->addr_list,
    since the upper callers of mld_newpack() already take
    read_lock_bh(&idev->lock).

    Reported-by: dingtianhong
    Cc: dingtianhong
    Cc: Hideaki YOSHIFUJI
    Cc: David S. Miller
    Cc: Hannes Frederic Sowa
    Tested-by: Ding Tianhong
    Tested-by: Chen Weilong
    Signed-off-by: Cong Wang
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Amerigo Wang
     

29 Jun, 2013

1 commit

  • RFC3590/RFC3810 specify that we should resend MLD reports as soon as a
    valid link-local address is available.

    We now use the valid_ll_addr_cnt to check if it is necessary to resend
    a new report.

    Changes since Flavio Leitner's version:
    a) adapt for valid_ll_addr_cnt
    b) resend first reports directly in the path and just arm the timer for
    mc_qrv-1 resends.

    Reported-by: Flavio Leitner
    Cc: Hideaki YOSHIFUJI
    Cc: David Stevens
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

23 May, 2013

1 commit

  • Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812:

    [ ip6tables -m addrtype ]
    When I tried to use in the nat/PREROUTING it messes up the
    routing cache even if the rule didn't matched at all.
    [..]
    If I remove the --limit-iface-in from the non-working scenario, so just
    use the -m addrtype --dst-type LOCAL it works!

    This happens when LOCAL type matching is requested with --limit-iface-in,
    and the default ipv6 route is via the interface the packet we test
    arrived on.

    Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation
    creates an unwanted cached entry, and the packet won't make it to the
    real/expected destination.

    Silently ignoring --limit-iface-in makes the routing work but it breaks
    rule matching (--dst-type LOCAL with limit-iface-in is supposed to only
    match if the dst address is configured on the incoming interface;
    without --limit-iface-in it will match if the address is reachable
    via lo).

    The test should call ipv6_chk_addr() instead. However, this would add
    a link-time dependency on ipv6.

    There are two possible solutions:

    1) Revert the commit that moved ipt_addrtype to xt_addrtype,
    and put ipv6 specific code into ip6t_addrtype.
    2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions.

    While the former might seem preferable, Pablo pointed out that there
    are more xt modules with link-time dependency issues regarding ipv6,
    so let's go for 2).

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
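
Option 2 above can be sketched as a table of function pointers that is filled in at run time, so the caller never references the IPv6 symbol directly at link time. This is a simplified userspace illustration under stated assumptions: every name here (demo_ipv6_ops, demo_register_ipv6_ops, demo_match_local, demo_chk_addr) is hypothetical, and the real kernel structure and its locking (for example RCU protection of the pointer) are not reproduced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table: pointers to functions implemented elsewhere. */
struct demo_ipv6_ops {
    bool (*chk_addr)(const unsigned char addr[16], int ifindex);
};

/* NULL until the providing module registers its implementation. */
static const struct demo_ipv6_ops *registered_ops;

static void demo_register_ipv6_ops(const struct demo_ipv6_ops *ops)
{
    registered_ops = ops;
}

/* Caller side: only the pointer is used, so there is no link-time
 * dependency on the implementation. Without a provider, nothing matches. */
static bool demo_match_local(const unsigned char addr[16], int ifindex)
{
    const struct demo_ipv6_ops *ops = registered_ops;

    if (ops == NULL || ops->chk_addr == NULL)
        return false;
    return ops->chk_addr(addr, ifindex);
}

/* Stand-in implementation: pretend ::1 is configured on interface 1. */
static bool demo_chk_addr(const unsigned char addr[16], int ifindex)
{
    size_t i;

    for (i = 0; i < 15; i++)
        if (addr[i] != 0)
            return false;
    return addr[15] == 1 && ifindex == 1;
}

static const struct demo_ipv6_ops demo_ops = {
    .chk_addr = demo_chk_addr,
};
```

The design choice is the usual one for optional subsystems: the consumer degrades gracefully when no provider is registered, instead of failing to link.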
     

15 Apr, 2013

1 commit

  • Tomas reported the following build error:

    net/built-in.o: In function `ieee80211_unregister_hw':
    (.text+0x10f0e1): undefined reference to `unregister_inet6addr_notifier'
    net/built-in.o: In function `ieee80211_register_hw':
    (.text+0x10f610): undefined reference to `register_inet6addr_notifier'
    make: *** [vmlinux] Error 1

    when IPv6 is built as a module.

    So we have to statically link these symbols.

    Reported-by: Tomas Melin
    Cc: Tomas Melin
    Cc: "David S. Miller"
    Cc: YOSHIFUJI Hidaki
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

30 Jan, 2013

1 commit

  • There are some use cases where a lifetime for ipv4 addresses might be helpful.
    For example:
    1) initramfs networkmanager uses a DHCP daemon to learn network
    configuration parameters
    2) initramfs networkmanager configures addresses, routes and DNS
    3) initramfs networkmanager is requested to stop
    4) initramfs networkmanager stops all daemons including dhclient
    5) there are addresses and routes configured but no daemon running. If
    the system doesn't start networkmanager for some reason, addresses and
    routes will be used forever, which violates RFC 2131.

    This patch is essentially a backport of the ipv6 address lifetime mechanism
    for ipv4 addresses.

    The current "ip" tool supports this without any patch (since it does not
    distinguish between ipv4 and ipv6 addresses in this perspective).

    Also, this should be back-compatible with all current netlink users.

    Reported-by: Pavel Šimerda
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     

21 Jan, 2013

4 commits


14 Jan, 2013

1 commit


05 Dec, 2012

1 commit


30 Aug, 2012

1 commit


19 Jul, 2012

1 commit

  • Introduce ipv6_addr_hash() helper doing a XOR on all bits
    of an IPv6 address, with an optimized x86_64 version.

    Use it in flow dissector, as suggested by Andrew McGregor,
    to reduce hash collision probabilities in fq_codel (and other
    users of flow dissector)

    Use it in ip6_tunnel.c and use more bit shuffling, as suggested
    by David Laight, as existing hash was ignoring most of them.

    Use it in sunrpc and use more bit shuffling, using hash_32().

    Use it in net/ipv6/addrconf.c, using hash_32() as well.

    As a cleanup, use it in net/ipv4/tcp_metrics.c

    Signed-off-by: Eric Dumazet
    Reported-by: Andrew McGregor
    Cc: Dave Taht
    Cc: Tom Herbert
    Cc: David Laight
    Cc: Joe Perches
    Signed-off-by: David S. Miller

    Eric Dumazet
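
The XOR folding described above can be sketched in portable C. This is an illustration of the idea only, not the kernel helper: the names addr_xor_hash and addr_xor_hash64 are hypothetical, and the second function merely mimics what a 64-bit optimized variant could look like.

```c
#include <stdint.h>
#include <string.h>

/* Fold a 128-bit IPv6 address into 32 bits by XOR-ing its four
 * 32-bit words, so every bit of the address influences the result. */
static uint32_t addr_xor_hash(const uint8_t addr[16])
{
    uint32_t w[4];

    memcpy(w, addr, sizeof(w));   /* avoids unaligned access */
    return w[0] ^ w[1] ^ w[2] ^ w[3];
}

/* Same result using two 64-bit loads: XOR the halves, then fold the
 * upper 32 bits of the result onto the lower 32 bits. */
static uint32_t addr_xor_hash64(const uint8_t addr[16])
{
    uint64_t q[2];
    uint64_t x;

    memcpy(q, addr, sizeof(q));
    x = q[0] ^ q[1];
    return (uint32_t)(x ^ (x >> 32));
}
```

A plain XOR fold is cheap but maps many addresses to the same value, which is why the commit message pairs it with further bit shuffling such as hash_32() in several callers.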
     

19 May, 2012

1 commit


16 Apr, 2012

1 commit


02 Feb, 2012

1 commit


05 Jan, 2012

1 commit

  • Recently Dave noticed that a test we did in ipv6_add_addr, to see if we have a
    next hop route for the interface we're adding an address to, was wrong (see commit
    7ffbcecbeed91e5874e9a1cfc4c0cbb07dac3069). For one, it never triggers, and two,
    it was completely wrong to begin with. This test was meant to cover this
    section of RFC 4429:

    3.3 Modifications to RFC 2462 Stateless Address Autoconfiguration

    * (modifies section 5.5) A host MAY choose to configure a new address
    as an Optimistic Address. A host that does not know the SLLAO
    of its router SHOULD NOT configure a new address as Optimistic.
    A router SHOULD NOT configure an Optimistic Address.

    This patch should bring us into proper compliance with the above clause. Since
    we only add a SLAAC address after we've received a RA which may or may not
    contain a source link layer address option, we can pass a pointer to that option
    to addrconf_prefix_rcv (which may be null if the option is not present), and
    only set the optimistic flag if the option was found in the RA.

    Change notes:
    (v2) modified the new parameter to addrconf_prefix_rcv to be a bool rather than
    a pointer to make its use more clear as per request from davem.

    Signed-off-by: Neil Horman
    CC: "David S. Miller"
    CC: Hideaki YOSHIFUJI
    Signed-off-by: David S. Miller

    Neil Horman
     

02 Aug, 2011

1 commit

  • Update the code to handle some of the differences between
    RFC 3041 and RFC 4941, which obsoletes it. Also a couple
    of janitorial fixes.

    - Allow router advertisements to increase the lifetime of
    temporary addresses. This was not allowed by RFC 3041,
    but is specified by RFC 4941. It is useful when RA
    lifetimes are lower than TEMP_{VALID,PREFERRED}_LIFETIME:
    in this case, the previous code would delete or deprecate
    addresses prematurely.

    - Change the default of MAX_RETRY to 3 per RFC 4941.

    - Add a comment to clarify that the preferred and valid
    lifetimes in inet6_ifaddr are relative to the timestamp.

    - Shorten lines to 80 characters in a couple of places.

    Signed-off-by: Lorenzo Colitti
    Signed-off-by: David S. Miller

    Lorenzo Colitti
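
The note above about lifetimes being relative to a timestamp can be illustrated with a short C sketch. It assumes all values are in seconds and that an all-ones lifetime means "infinite". The names demo_remaining_lft and DEMO_INFINITE_LFT are hypothetical and do not come from the kernel sources.

```c
#include <stdint.h>

/* Assumed sentinel for a lifetime that never expires. */
#define DEMO_INFINITE_LFT 0xffffffffu

/* A lifetime is stored relative to the time it was last updated
 * (tstamp), so the time left at 'now' is lft - (now - tstamp),
 * clamped at zero once the address has expired. */
static uint32_t demo_remaining_lft(uint32_t lft, uint64_t tstamp, uint64_t now)
{
    uint64_t age;

    if (lft == DEMO_INFINITE_LFT)
        return DEMO_INFINITE_LFT;
    age = now > tstamp ? now - tstamp : 0;
    return age >= lft ? 0 : (uint32_t)(lft - age);
}
```

Under this model, a router advertisement that extends a temporary address amounts to storing a new lifetime together with a fresh timestamp.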
     

25 Apr, 2011

1 commit

  • These header files are never installed for user consumption, so any
    __KERNEL__ cpp checks are superfluous.

    Projects should also not copy these files into their userland utility
    sources and try to use them there. If they insist on doing so, the
    onus is on them to sanitize the headers as needed.

    Signed-off-by: David S. Miller

    David S. Miller
     

23 Apr, 2011

1 commit


03 Dec, 2010

1 commit