31 Jul, 2020

1 commit

  • IPV6_ADDRFORM causes resource leaks when converting an IPv6 socket
    to IPv4, particularly struct ipv6_ac_socklist. Similar to
    struct ipv6_mc_socklist, we should just close it on this path.

    This bug can be easily reproduced with the following C program:

    #include
    #include
    #include
    #include
    #include

    int main()
    {
    int s, value;
    struct sockaddr_in6 addr;
    struct ipv6_mreq m6;

    s = socket(AF_INET6, SOCK_DGRAM, 0);
    addr.sin6_family = AF_INET6;
    addr.sin6_port = htons(5000);
    inet_pton(AF_INET6, "::ffff:192.168.122.194", &addr.sin6_addr);
    connect(s, (struct sockaddr *)&addr, sizeof(addr));

    inet_pton(AF_INET6, "fe80::AAAA", &m6.ipv6mr_multiaddr);
    m6.ipv6mr_interface = 5;
    setsockopt(s, SOL_IPV6, IPV6_JOIN_ANYCAST, &m6, sizeof(m6));

    value = AF_INET;
    setsockopt(s, SOL_IPV6, IPV6_ADDRFORM, &value, sizeof(value));

    close(s);
    return 0;
    }

    Reported-by: ch3332xr@gmail.com
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

29 Apr, 2020

1 commit


31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

09 Nov, 2018

1 commit


06 Nov, 2018

1 commit


03 Nov, 2018

1 commit

  • icmp6_send() function is expensive on systems with a large number of
    interfaces. Every time it’s called, it has to verify that the source
    address does not correspond to an existing anycast address by looping
    through every device and every anycast address on the device. This can
    result in significant delays for a CPU when there are a large number of
    neighbors and ND timers are frequently timing out and calling
    neigh_invalidate().

    Add anycast addresses to a global hashtable to allow quick searching for
    matching anycast addresses. This is based on inet6_addr_lst in addrconf.c.

    Signed-off-by: Jeff Barnhill
    Signed-off-by: David S. Miller

    Jeff Barnhill
     

07 Jun, 2018

1 commit

  • Pull networking updates from David Miller:

    1) Add Maglev hashing scheduler to IPVS, from Inju Song.

    2) Lots of new TC subsystem tests from Roman Mashak.

    3) Add TCP zero copy receive and fix delayed acks and autotuning with
    SO_RCVLOWAT, from Eric Dumazet.

    4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
    Brouer.

    5) Add ttl inherit support to vxlan, from Hangbin Liu.

    6) Properly separate ipv6 routes into their logically independant
    components. fib6_info for the routing table, and fib6_nh for sets of
    nexthops, which thus can be shared. From David Ahern.

    7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
    messages from XDP programs. From Nikita V. Shirokov.

    8) Lots of long overdue cleanups to the r8169 driver, from Heiner
    Kallweit.

    9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.

    10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.

    11) Plumb extack down into fib_rules, from Roopa Prabhu.

    12) Add Flower classifier offload support to igb, from Vinicius Costa
    Gomes.

    13) Add UDP GSO support, from Willem de Bruijn.

    14) Add documentation for eBPF helpers, from Quentin Monnet.

    15) Add TLS tx offload to mlx5, from Ilya Lesokhin.

    16) Allow applications to be given the number of bytes available to read
    on a socket via a control message returned from recvmsg(), from
    Soheil Hassas Yeganeh.

    17) Add x86_32 eBPF JIT compiler, from Wang YanQing.

    18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
    From Björn Töpel.

    19) Remove indirect load support from all of the BPF JITs and handle
    these operations in the verifier by translating them into native BPF
    instead. From Daniel Borkmann.

    20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.

    21) Allow XDP programs to do lookups in the main kernel routing tables
    for forwarding. From David Ahern.

    22) Allow drivers to store hardware state into an ELF section of kernel
    dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.

    23) Various RACK and loss detection improvements in TCP, from Yuchung
    Cheng.

    24) Add TCP SACK compression, from Eric Dumazet.

    25) Add User Mode Helper support and basic bpfilter infrastructure, from
    Alexei Starovoitov.

    26) Support ports and protocol values in RTM_GETROUTE, from Roopa
    Prabhu.

    27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
    Brouer.

    28) Add lots of forwarding selftests, from Petr Machata.

    29) Add generic network device failover driver, from Sridhar Samudrala.

    * ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
    strparser: Add __strp_unpause and use it in ktls.
    rxrpc: Fix terminal retransmission connection ID to include the channel
    net: hns3: Optimize PF CMDQ interrupt switching process
    net: hns3: Fix for VF mailbox receiving unknown message
    net: hns3: Fix for VF mailbox cannot receiving PF response
    bnx2x: use the right constant
    Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
    net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
    enic: fix UDP rss bits
    netdev-FAQ: clarify DaveM's position for stable backports
    rtnetlink: validate attributes in do_setlink()
    mlxsw: Add extack messages for port_{un, }split failures
    netdevsim: Add extack error message for devlink reload
    devlink: Add extack to reload and port_{un, }split operations
    net: metrics: add proper netlink validation
    ipmr: fix error path when ipmr_new_table fails
    ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
    net: hns3: remove unused hclgevf_cfg_func_mta_filter
    netfilter: provide udp*_lib_lookup for nf_tproxy
    qed*: Utilize FW 8.37.2.0
    ...

    Linus Torvalds
     

16 May, 2018

1 commit

  • Variants of proc_create{,_data} that directly take a struct seq_operations
    and deal with network namespaces in ->open and ->release. All callers of
    proc_create + seq_open_net converted over, and seq_{open,release}_net are
    removed entirely.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     

20 Apr, 2018

3 commits

  • aca_idev has only 1 user - inet6_fill_ifacaddr - and it only
    wants the device index which can be extracted from the fib6_info
    nexthop.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • addrconf_dst_alloc now returns a fib6_info. Update the name
    and its users to reflect the change.

    Rename only; no functional change intended.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Change the prefix for fib6_info struct elements from rt6i_ to fib6_.
    rt6i_pcpu and rt6i_exception_bucket are left as is given that they
    point to rt6_info entries.

    Rename only; not functional change intended.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

18 Apr, 2018

4 commits

  • Convert all code paths referencing a FIB entry from
    rt6_info to fib6_info.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Last step before flipping the data type for FIB entries:
    - use fib6_info_alloc to create FIB entries in ip6_route_info_create
    and addrconf_dst_alloc
    - use fib6_info_release in place of dst_release, ip6_rt_put and
    rt6_release
    - remove the dst_hold before calling __ip6_ins_rt or ip6_del_rt
    - when purging routes, drop per-cpu routes
    - replace inc and dec of rt6i_ref with fib6_info_hold and fib6_info_release
    - use rt->from since it points to the FIB entry
    - drop references to exception bucket, fib6_metrics and per-cpu from
    dst entries (those are relevant for fib entries only)

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Most FIB entries can be added using memory allocated with GFP_KERNEL.
    Add gfp_flags to ip6_route_add and addrconf_dst_alloc. Code paths that
    can be reached from the packet path (e.g., ndisc and autoconfig) or
    atomic notifiers use GFP_ATOMIC; paths from user context (adding
    addresses and routes) use GFP_KERNEL.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Pass network namespace reference into route add, delete and get
    functions.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

27 Mar, 2018

1 commit

  • Prefer the direct use of octal for permissions.

    Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace
    and some typing.

    Miscellanea:

    o Whitespace neatening around these conversions.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

16 Mar, 2018

1 commit

  • ipv6_chk_addr_and_flags determines if an address is a local address and
    optionally if it is an address on a specific device. For example, it is
    called by ip6_route_info_create to determine if a given gateway address
    is a local address. The address check currently does not consider L3
    domains and as a result does not allow a route to be added in one VRF
    if the nexthop points to an address in a second VRF. e.g.,

    $ ip route add 2001:db8:1::/64 vrf r2 via 2001:db8:102::23
    Error: Invalid gateway address.

    where 2001:db8:102::23 is an address on an interface in vrf r1.

    ipv6_chk_addr_and_flags needs to allow callers to always pass in a device
    with a separate argument to not limit the address to the specific device.
    The device is used used to determine the L3 domain of interest.

    To that end add an argument to skip the device check and update callers
    to always pass a device where possible and use the new argument to mean
    any address in the domain.

    Update a handful of users of ipv6_chk_addr with a NULL dev argument. This
    patch handles the change to these callers without adding the domain check.

    ip6_validate_gw needs to handle 2 cases - one where the device is given
    as part of the nexthop spec and the other where the device is resolved.
    There is at least 1 VRF case where deferring the check to only after
    the route lookup has resolved the device fails with an unintuitive error
    "RTNETLINK answers: No route to host" as opposed to the preferred
    "Error: Gateway can not be a local address." The 'no route to host'
    error is because of the fallback to a full lookup. The check is done
    twice to avoid this error.

    Signed-off-by: David Ahern
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    David Ahern
     

05 Mar, 2018

1 commit

  • IPv6 does path selection for multipath routes deep in the lookup
    functions. The next patch adds L4 hash option and needs the skb
    for the forward path. To get the skb to the relevant FIB lookup
    functions it needs to go through the fib rules layer, so add a
    lookup_data argument to the fib_lookup_arg struct.

    Signed-off-by: David Ahern
    Reviewed-by: Ido Schimmel
    Reviewed-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    David Ahern
     

01 Mar, 2018

1 commit

  • Ran simple script to find/remove trailing whitespace and blank lines
    at EOF because that kind of stuff git whines about and editors leave
    behind.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

17 Jan, 2018

1 commit

  • /proc has been ignoring struct file_operations::owner field for 10 years.
    Specifically, it started with commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba
    ("Fix rmmod/read/write races in /proc entries"). Notice the chunk where
    inode->i_fop is initialized with proxy struct file_operations for
    regular files:

    - if (de->proc_fops)
    - inode->i_fop = de->proc_fops;
    + if (de->proc_fops) {
    + if (S_ISREG(inode->i_mode))
    + inode->i_fop = &proc_reg_file_ops;
    + else
    + inode->i_fop = de->proc_fops;
    + }

    VFS stopped pinning module at this point.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

04 Jul, 2017

1 commit

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

01 Apr, 2015

1 commit

  • The ipv6 code uses a mixture of coding styles. In some instances check for NULL
    pointer is done as x == NULL and sometimes as !x. !x is preferred according to
    checkpatch and this patch makes the code consistent by adopting the latter
    form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

21 Mar, 2015

1 commit

  • Commit baf606d9c9b1 ("ipv4,ipv6: grab rtnl before locking the socket")
    missed to update two setsockopt options, IPV6_JOIN_ANYCAST and
    IPV6_LEAVE_ANYCAST, causing a lock inverstion regarding to the updated ones.

    As ipv6_sock_ac_join and ipv6_sock_ac_leave are only called from
    do_ipv6_setsockopt, we are good to just move the rtnl lock upper.

    Fixes: baf606d9c9b1 ("ipv4,ipv6: grab rtnl before locking the socket")
    Reported-by: Ying Huang
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

15 Oct, 2014

1 commit


24 Sep, 2014

1 commit


14 Sep, 2014

4 commits


13 Sep, 2014

1 commit

  • If we try to rmmod the driver for an interface while sockets with
    setsockopt(JOIN_ANYCAST) are alive, some refcounts aren't cleaned up
    and we get stuck on:

    unregister_netdevice: waiting for ens3 to become free. Usage count = 1

    If we LEAVE_ANYCAST/close everything before rmmod'ing, there is no
    problem.

    We need to perform a cleanup similar to the one for multicast in
    addrconf_ifdown(how == 1).

    Signed-off-by: Sabrina Dubroca
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     

08 Sep, 2014

1 commit

  • It is possible that the interface is already gone after joining
    the list of anycast on this interface as we don't hold a refcount
    for the device, in this case we are safe to ignore the error.

    What's more important, for API compatibility we should not
    change this behavior for applications even if it were correct.

    Fixes: commit a9ed4a2986e13011 ("ipv6: fix rtnl locking in setsockopt for anycast and multicast")
    Cc: Sabrina Dubroca
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    WANG Cong
     

06 Sep, 2014

1 commit

  • Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
    triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()

    ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
    take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
    ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
    calling ipv6_dev_mc_inc/dec.

    This patch moves ASSERT_RTNL() up a level in the call stack.

    Signed-off-by: Cong Wang
    Signed-off-by: Sabrina Dubroca
    Reported-by: Tommi Rantala
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     

23 Jan, 2014

1 commit

  • This change allows to consider an anycast address valid as source address
    when given via an IPV6_PKTINFO or IPV6_2292PKTINFO ancillary data item.
    So, when sending a datagram with ancillary data, the unicast and anycast
    addresses are handled in the same way.

    - Adds ipv6_chk_acast_addr_src() to check if an anycast address is link-local
    on given interface or is global.
    - Uses it in ip6_datagram_send_ctl().

    Signed-off-by: Francois-Xavier Le Bail
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    FX Le Bail
     

19 Feb, 2013

2 commits

  • proc_net_remove is only used to remove proc entries
    that under /proc/net,it's not a general function for
    removing proc entries of netns. if we want to remove
    some proc entries which under /proc/net/stat/, we still
    need to call remove_proc_entry.

    this patch use remove_proc_entry to replace proc_net_remove.
    we can remove proc_net_remove after this patch.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • Right now, some modules such as bonding use proc_create
    to create proc entries under /proc/net/, and other modules
    such as ipv4 use proc_net_fops_create.

    It looks a little chaos.this patch changes all of
    proc_net_fops_create to proc_create. we can remove
    proc_net_fops_create after this patch.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     

31 Jan, 2013

1 commit


06 Dec, 2012

1 commit

  • ipv6_sock_mc_close() is called for ipv6 sockets at close time, and most
    of them don't use multicast.

    Add a test to avoid contention on a shared spinlock.

    Same heuristic applies for ipv6_sock_ac_close(), to avoid contention
    on a shared rwlock.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Nov, 2012

1 commit

  • Allow an unpriviled user who has created a user namespace, and then
    created a network namespace to effectively use the new network
    namespace, by reducing capable(CAP_NET_ADMIN) and
    capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
    CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.

    Settings that merely control a single network device are allowed.
    Either the network device is a logical network device where
    restrictions make no difference or the network device is hardware NIC
    that has been explicity moved from the initial network namespace.

    In general policy and network stack state changes are allowed while
    resource control is left unchanged.

    Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
    Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
    Allow the SIOCADDRT ioctl to add ipv6 routes.
    Allow the SIOCDELRT ioctl to delete ipv6 routes.

    Allow creation of ipv6 raw sockets.

    Allow setting the IPV6_JOIN_ANYCAST socket option.
    Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
    socket option.

    Allow setting the IPV6_TRANSPARENT socket option.
    Allow setting the IPV6_HOPOPTS socket option.
    Allow setting the IPV6_RTHDRDSTOPTS socket option.
    Allow setting the IPV6_DSTOPTS socket option.
    Allow setting the IPV6_IPSEC_POLICY socket option.
    Allow setting the IPV6_XFRM_POLICY socket option.

    Allow sending packets with the IPV6_2292HOPOPTS control message.
    Allow sending packets with the IPV6_2292DSTOPTS control message.
    Allow sending packets with the IPV6_RTHDRDSTOPTS control message.

    Allow setting the multicast routing socket options on non multicast
    routing sockets.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
    setting up, changing and deleting tunnels over ipv6.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
    setting up, changing and deleting ipv6 over ipv4 tunnels.

    Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
    deleting, and changing the potential router list for ISATAP tunnels.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

04 Nov, 2012

1 commit

  • As suggested by Eric, we could introduce a helper function
    for ipv6 too, to avoid checking if rt is NULL before
    dst_release().

    Cc: Eric Dumazet
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Amerigo Wang
     

19 May, 2012

1 commit