19 Jul, 2018

20 commits

  • If the out ring is temporarily full and a receive completion cannot go
    out, we may still need to reschedule napi if certain conditions are met.
    Otherwise the napi poll might be stopped forever, causing a network
    disconnect.

    Fixes: 7426b1a51803 ("netvsc: optimize receive completions")
    Signed-off-by: Stephen Hemminger
    Signed-off-by: Haiyang Zhang
    Signed-off-by: David S. Miller

    Haiyang Zhang
     
  • Vitaly Bordug's email bounces ("ru.mvista.com: Name or service not
    known") and there has been no activity (ack, review, sign-off) since 2009.

    Cc: Vitaly Bordug
    Cc: Pantelis Antoniou
    Cc: "David S. Miller"
    Signed-off-by: Krzysztof Kozlowski
    Signed-off-by: David S. Miller

    Krzysztof Kozlowski
     
  • Add dependencies on PCI where necessary.

    Fixes: 7e2bc7fb65 ("net: cavium: Drop dependency of NET_VENDOR_CAVIUM on PCI")
    Signed-off-by: Alexander Sverdlin
    Signed-off-by: David S. Miller

    Alexander Sverdlin
     
  • Stefan Wahren says:

    ====================
    net: qca_spi: Minor bugfixes

    This patch series contains some minor bugfixes for
    the qca_spi driver.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • If probing fails, the messages should be logged at the error
    level.

    Signed-off-by: Stefan Wahren
    Signed-off-by: David S. Miller

    Stefan Wahren
     
  • In case the SPI thread is not running, a simple reset of sync
    state won't fix the transmit timeout. We also need to wake up the kernel
    thread.

    Signed-off-by: Stefan Wahren
    Fixes: ed7d42e24eff ("net: qca_spi: fix transmit queue timeout handling")
    Signed-off-by: David S. Miller

    Stefan Wahren
     
  • As long as the synchronization with the QCA7000 isn't finished, we
    cannot accept packets from the upper layers. So let the SPI thread
    enable the TX queue after sync and avoid unwanted packet drop.

    Signed-off-by: Stefan Wahren
    Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
    Signed-off-by: David S. Miller

    Stefan Wahren
     
  • The rol32 call is currently rotating hash but the rol'd value is
    being discarded. I believe the current code is incorrect and hash
    should be assigned the rotated value returned from rol32.

    Thanks to David Lebrun for spotting this.

    Signed-off-by: Colin Ian King
    Signed-off-by: David S. Miller

    Colin Ian King
     
  • The rol32 call is currently rotating hash but the rol'd value is
    being discarded. I believe the current code is incorrect and hash
    should be assigned the rotated value returned from rol32.

    Detected by CoverityScan, CID#1468411 ("Useless call")

    Fixes: b5facfdba14c ("ipv6: sr: Compute flowlabel for outer IPv6 header of seg6 encap mode")
    Signed-off-by: Colin Ian King
    Acked-by: dlebrun@google.com
    Signed-off-by: David S. Miller

    Colin Ian King
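
A minimal user-space sketch of the bug pattern described in the two rol32 commits above. The helper names mix_discarded() and mix_assigned() and the rotation count of 16 are illustrative assumptions, and rol32() here is a local copy of the usual rotate-left idiom, not the kernel header itself.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left; shift must be in the range 0..31. */
static uint32_t rol32(uint32_t word, unsigned int shift)
{
    assert(shift < 32);
    return (word << shift) | (word >> ((-shift) & 31));
}

/* Buggy pattern: the rotated value is computed and then thrown away. */
static uint32_t mix_discarded(uint32_t hash)
{
    (void)rol32(hash, 16);
    return hash;
}

/* Fixed pattern: the rotated value is assigned back to hash. */
static uint32_t mix_assigned(uint32_t hash)
{
    hash = rol32(hash, 16);
    return hash;
}
```

Since rol32() has no side effects, calling it without using the result leaves hash unchanged, which is why static analysis reports it as a useless call.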
     
  • Simon Wunderlich says:

    ====================
    Here are some batman-adv fixes:

    - Fix gateway refcounting in BATMAN IV and V, by Sven Eckelmann (2 patches)

    - Fix debugfs paths when renaming interfaces, by Sven Eckelmann (2 patches)

    - Fix TT flag issues, by Linus Luessing (2 patches)
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Fixes the following sparse warnings:

    net/sched/cls_api.c:1101:43: warning: Using plain integer as NULL pointer
    net/sched/cls_api.c:1492:75: warning: Using plain integer as NULL pointer

    Signed-off-by: YueHaibing
    Signed-off-by: David S. Miller

    YueHaibing
     
  • mii_nway_restart is not PM-aware, which results in an rtnl deadlock.
    Implement mii_nway_restart manually by setting BMCR_ANRESTART if
    BMCR_ANENABLE is set.

    To reproduce:
    * plug an asix based usb network interface
    * wait until the device enters PM (~5 sec)
    * `ip link set eth1 up` will never return

    Fixes: d9fe64e51114 ("net: asix: Add in_pm parameter")
    Signed-off-by: Alexander Couzens
    Signed-off-by: David S. Miller

    Alexander Couzens
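
The register logic of the asix fix above might look roughly like the following sketch. The bit values match the standard MII BMCR definitions; the function name nway_restart_bmcr() and its return convention are hypothetical, and the actual MDIO read and write (done in the driver with its PM-aware accessors) are left out.

```c
#include <assert.h>
#include <stdint.h>

#define BMCR_ANRESTART 0x0200 /* restart auto-negotiation */
#define BMCR_ANENABLE  0x1000 /* auto-negotiation enabled */

/*
 * Given the current BMCR value, compute the value to write back.
 * Returns 0 and stores the new value in *out when auto-negotiation is
 * enabled, or -1 when there is nothing to restart.
 */
static int nway_restart_bmcr(uint16_t bmcr, uint16_t *out)
{
    assert(out != 0);
    if (!(bmcr & BMCR_ANENABLE))
        return -1;
    *out = bmcr | BMCR_ANRESTART;
    return 0;
}
```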
     
  • t.qset_idx can be indirectly controlled by user-space, hence leading to
    a potential exploitation of the Spectre variant 1 vulnerability.

    This issue was detected with the help of Smatch:

    drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:2286 cxgb_extension_ioctl()
    warn: potential spectre issue 'adapter->msix_info'

    Fix this by sanitizing t.qset_idx before using it to index
    adapter->msix_info.

    Notice that given that speculation windows are large, the policy is
    to kill the speculation on the first load and not worry if it can be
    completed with a dependent load/store [1].

    [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

    Cc: stable@vger.kernel.org
    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
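
The index sanitizing described in the cxgb3 commit above is normally done in the kernel with array_index_nospec(). The following is only a user-space model of the branch-free mask idea behind it, under the assumption that right-shifting a negative long is arithmetic (true for GCC and Clang). The names index_mask_nospec() and index_nospec() are illustrative, and a plain C model like this gives no real speculation guarantees.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* All ones when index < size, zero otherwise, computed without a branch. */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
    assert(size > 0 && size <= LONG_MAX);
    /* The sign bit of (index | (size - 1 - index)) is set iff index >= size. */
    return (unsigned long)(~(long)(index | (size - 1UL - index))
                           >> (BITS_PER_LONG - 1));
}

/* Clamp an out-of-range index to 0 instead of letting it reach the array. */
static unsigned long index_nospec(unsigned long index, unsigned long size)
{
    return index & index_mask_nospec(index, size);
}
```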
     
  • rhashtable_init() currently does not take into account the user-passed
    min_size parameter unless param->nelem_hint is set as well. As such,
    the default size (number of buckets) will always be HASH_DEFAULT_SIZE
    even if the smallest allowed size is larger than that. Remediate this
    by unconditionally calling into rounded_hashtable_size() and handling
    things accordingly.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Davidlohr Bueso
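
A rough sketch of the sizing rule the rhashtable commit above describes: always compute a rounded size, then never go below min_size. The function name initial_table_size(), the value 64 for HASH_DEFAULT_SIZE, and the 4/3 scaling of the hint are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

#define HASH_DEFAULT_SIZE 64UL /* assumed default number of buckets */

/* Smallest power of two that is >= n. */
static unsigned long roundup_pow_of_two(unsigned long n)
{
    unsigned long p = 1;

    assert(n > 0 && n <= ULONG_MAX / 2 + 1);
    while (p < n)
        p <<= 1;
    return p;
}

/* Initial bucket count: honour min_size whether or not a hint was given. */
static unsigned long initial_table_size(unsigned long nelem_hint,
                                        unsigned long min_size)
{
    unsigned long size = nelem_hint
        ? roundup_pow_of_two(nelem_hint * 4 / 3)
        : HASH_DEFAULT_SIZE;

    return size > min_size ? size : min_size;
}
```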
     
  • Ursula Braun says:

    ====================
    net/smc: fixes 2018-07-18

    Here are small fixes for SMC: The first patch speeds up unidirectional
    traffic, the second patch increases security, and the third patch
    fixes a problem for fallback cases.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • During clc handshake the receive timeout is set to CLC_WAIT_TIME.
    Remember and reset the original timeout value after the receive calls,
    and remove a duplicate assignment of CLC_WAIT_TIME.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • For security reasons the return code of get_user() should always be
    checked.

    Fixes: 01d2f7e2cdd31 ("net/smc: sockopts TCP_NODELAY and TCP_CORK")
    Reported-by: Heiko Carstens
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • The SMC protocol requires sending a separate consumer cursor update
    if it cannot be piggybacked onto updates of the producer cursor.
    Currently the decision to send a separate consumer cursor update
    just considers the amount of data already received by the socket
    program. It does not consider the amount of data already arrived, but
    not yet consumed by the receiver. Basing the decision on the
    difference between already confirmed and already arrived data
    (instead of difference between already confirmed and already consumed
    data), may lead to a somewhat earlier consumer cursor update send in
    fast unidirectional traffic scenarios, and thus to better throughput.

    Signed-off-by: Ursula Braun
    Suggested-by: Thomas Richter
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • syzbot is reporting stalls at nfc_llcp_send_ui_frame() [1]. This is
    because nfc_llcp_send_ui_frame() is retrying the loop without any delay
    when nonblocking nfc_alloc_send_skb() returned NULL.

    Since there is no need to use MSG_DONTWAIT if we retry until
    sock_alloc_send_pskb() succeeds, let's use blocking call.
    Also, in case an unexpected error occurred, let's break the loop
    if blocking nfc_alloc_send_skb() failed.

    [1] https://syzkaller.appspot.com/bug?id=4a131cc571c3733e0eff6bc673f4e36ae48f19c6

    Signed-off-by: Tetsuo Handa
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Tetsuo Handa
     
  • My randconfig builds came across an old missing dependency for ILA:

    ERROR: "dst_cache_set_ip6" [net/ipv6/ila/ila.ko] undefined!
    ERROR: "dst_cache_get" [net/ipv6/ila/ila.ko] undefined!
    ERROR: "dst_cache_init" [net/ipv6/ila/ila.ko] undefined!
    ERROR: "dst_cache_destroy" [net/ipv6/ila/ila.ko] undefined!

    We almost never run into this by accident because randconfig builds
    end up selecting DST_CACHE from some other tunnel protocol, and this
    one appears to be the only one missing the explicit 'select'.

    From all I can tell, this problem first appeared in linux-4.9
    when dst_cache support got added to ILA.

    Fixes: 79ff2fc31e0f ("ila: Cache a route to translated address")
    Cc: Tom Herbert
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

18 Jul, 2018

2 commits

  • This driver can spam the kernel log with multiple messages of:

    net eth0: eth0: allmulti set

    Usually 4 or 8 at a time (probably because of using ConnMan).

    This message doesn't seem useful, so let's demote it from dev_info()
    to dev_dbg().

    Signed-off-by: David Lechner
    Signed-off-by: David S. Miller

    David Lechner
     
  • octeon_mgmt driver doesn't drop RX frames that are 1-4 bytes bigger than
    MTU set for the corresponding interface. The problem is in the
    AGL_GMX_RX0/1_FRM_MAX register setting, which should not account for VLAN
    tagging.

    According to Octeon HW manual:
    "For tagged frames, MAX increases by four bytes for each VLAN found up to a
    maximum of two VLANs, or MAX + 8 bytes."

    OCTEON_FRAME_HEADER_LEN "define" is fine for ring buffer management, but
    should not be used for AGL_GMX_RX0/1_FRM_MAX.

    The problem can be easily reproduced using the "ping" command. If the
    affected system has the default MTU of 1500, another host (with MTU >= 1504)
    can successfully "ping" the affected system with payload sizes 1473-1476,
    resulting in IP packets of size 1501-1504 being accepted by the mgmt driver.
    A fixed system still accepts IP packets of 1500 bytes even with VLAN tagging,
    because the limits are lifted in HW as expected, for every VLAN tag.

    Signed-off-by: Alexander Sverdlin
    Signed-off-by: David S. Miller

    Alexander Sverdlin
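
The arithmetic in the octeon_mgmt commit above can be checked with a small sketch. It assumes IPv4 ping (20-byte IP header without options, 8-byte ICMP echo header); the function names are hypothetical, and effective_frm_max() only models the quoted hardware rule of four extra bytes per VLAN tag, up to two tags.

```c
#include <assert.h>

#define IPV4_HDR_LEN 20 /* IPv4 header without options */
#define ICMP_HDR_LEN 8  /* ICMP echo header */
#define VLAN_HLEN    4  /* one 802.1Q tag */

/* Size of the IP packet produced by an IPv4 ping with the given payload. */
static int ping_ip_packet_len(int payload)
{
    assert(payload >= 0);
    return payload + ICMP_HDR_LEN + IPV4_HDR_LEN;
}

/* Limit the hardware applies: MAX grows by 4 per VLAN tag, at most 2 tags. */
static int effective_frm_max(int frm_max, int vlan_tags)
{
    assert(vlan_tags >= 0);
    if (vlan_tags > 2)
        vlan_tags = 2;
    return frm_max + vlan_tags * VLAN_HLEN;
}
```

Because the hardware already adds the VLAN allowance itself, programming the register with a value that includes a VLAN tag lets untagged frames exceed the MTU by up to four bytes, which matches the 1501-1504 byte packets mentioned above.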
     

17 Jul, 2018

18 commits

  • SMC ioctl processing requires the sock lock to work properly in
    all conceivable scenarios.
    The problem was found with RaceFuzzer; this fixes:
    KASAN: null-ptr-deref Read in smc_ioctl

    Reported-by: Byoungyoung Lee
    Reported-by: syzbot+35b2c5aa76fd398b9fd4@syzkaller.appspotmail.com
    Signed-off-by: Ursula Braun
    Reviewed-by: Stefano Brivio
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • Siva Reddy Kallam says:

    ====================
    tg3: Update copyright and fix for tx timeout with 5762

    First patch:
    Update copyright

    Second patch:
    Add higher cpu clock for 5762
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This patch fixes a TX timeout seen while running bi-directional
    traffic at 100 Mbps on the 5762.

    Signed-off-by: Sanjeev Bansal
    Signed-off-by: Siva Reddy Kallam
    Reviewed-by: Michael Chan
    Signed-off-by: David S. Miller

    Sanjeev Bansal
     
  • Signed-off-by: Siva Reddy Kallam
    Reviewed-by: Michael Chan
    Signed-off-by: David S. Miller

    Siva Reddy Kallam
     
  • Testing has uncovered a failure case that is not handled properly. In the
    event that a login fails and we are not able to recover on the spot, we
    return 0 from do_reset, preventing any error recovery code from being
    triggered. Additionally, the state is set to "probed" meaning that when we
    are able to trigger the error recovery, the driver always comes up in the
    probed state. To handle the case properly, we need to return a failure code
    here and set the adapter state to the state that we entered the reset in
    indicating the state that we would like to come out of the recovery reset
    in.

    Signed-off-by: John Allen
    Signed-off-by: David S. Miller

    John Allen
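
The control flow described in the do_reset commit above could be sketched as follows. The enum, the struct, and do_reset_sketch() are hypothetical stand-ins rather than the driver's own definitions, and the login step is reduced to a return code passed in by the caller.

```c
#include <assert.h>

/* Hypothetical adapter states for illustration only. */
enum adapter_state { STATE_PROBED, STATE_OPEN, STATE_CLOSED };

struct adapter {
    enum adapter_state state;
};

/* login_rc models the result of the login step: 0 on success. */
static int do_reset_sketch(struct adapter *a, int login_rc)
{
    enum adapter_state reset_state;

    assert(a != 0);
    reset_state = a->state;  /* the state we entered the reset in */
    a->state = STATE_PROBED; /* the reset tears the device down */

    if (login_rc) {
        /* Remember where a later recovery reset should bring us back to. */
        a->state = reset_state;
        return login_rc;     /* report failure so recovery can run */
    }

    a->state = reset_state;  /* rest of the reset path, simplified */
    return 0;
}
```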
     
  • The skb size calculation in lan78xx_tx_bh races with start_xmit,
    which could lead to rare kernel oopses. So protect the whole skb walk with
    a spin lock. As a benefit we can unlink the skb directly.

    This patch was tested on Raspberry Pi 3B+

    Link: https://github.com/raspberrypi/linux/issues/2608
    Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet")
    Cc: stable
    Signed-off-by: Floris Bos
    Signed-off-by: Stefan Wahren
    Signed-off-by: David S. Miller

    Stefan Wahren
     
  • Eric reported that reverting the patch that fixed and simplified IPv6
    multipath routes means reverting back to invalid userspace notifications.
    e.g.,
    $ ip -6 route add 2001:db8:1::/64 nexthop dev eth0 nexthop dev eth1

    only generates a single notification:
    2001:db8:1::/64 dev eth0 metric 1024 pref medium

    While working on a fix for this problem I found another case that is just
    broken completely - a multipath route with a gateway followed by device
    followed by gateway:
    $ ip -6 ro add 2001:db8:103::/64
          nexthop via 2001:db8:1::64
          nexthop dev dummy2
          nexthop via 2001:db8:3::64

    In this case the device-only route is dropped completely: no notification
    to userspace, but no addition to the FIB either:

    $ ip -6 ro ls
    2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium
    2001:db8:2::/64 dev dummy2 proto kernel metric 256 pref medium
    2001:db8:3::/64 dev dummy3 proto kernel metric 256 pref medium
    2001:db8:103::/64 metric 1024
          nexthop via 2001:db8:1::64 dev dummy1 weight 1
          nexthop via 2001:db8:3::64 dev dummy3 weight 1 pref medium
    fe80::/64 dev dummy1 proto kernel metric 256 pref medium
    fe80::/64 dev dummy2 proto kernel metric 256 pref medium
    fe80::/64 dev dummy3 proto kernel metric 256 pref medium

    Really, IPv6 multipath is just FUBAR'ed beyond repair when it comes to
    device-only routes, so do not allow it at all.

    This change will break any scripts relying on the mpath api for insert,
    but I don't see any other way to handle the permutations. Besides, since
    the routes are added to the FIB as standalone (non-multipath) routes the
    kernel is not doing what the user requested, so it might as well tell the
    user that.

    Reported-by: Eric Dumazet
    Signed-off-by: David Ahern
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    David Ahern
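
The policy from the multipath commit above (reject device-only nexthops outright) amounts to a validation pass like the one sketched here. The struct nexthop layout, validate_multipath(), and the choice of -EINVAL as the error are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified nexthop description. */
struct nexthop {
    bool has_gateway; /* "via <addr>" was given */
    const char *dev;  /* output device name */
};

/* Refuse the whole request if any nexthop is device-only. */
static int validate_multipath(const struct nexthop *nh, size_t n)
{
    size_t i;

    assert(nh != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!nh[i].has_gateway)
            return -EINVAL;
    }
    return 0;
}
```

Failing the request up front tells the user that the kernel will not install the route as asked, instead of silently dropping or splitting nexthops.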
     
  • Correct the previous bad attempt at allowing sockets to come out of TCP
    repair without sending window probes. To avoid changing the size of
    the repair variable in struct tcp_sock, this lets the decision whether
    to send probes be made when coming out of repair, by introducing two
    ways to turn it off.

    v2:
    * Remove erroneous comment; defines now make behavior clear

    Fixes: 70b7ff130224 ("tcp: allow user to create repair socket without window probes")
    Signed-off-by: Stefan Baranoff
    Signed-off-by: Eric Dumazet
    Acked-by: Andrei Vagin
    Signed-off-by: David S. Miller

    Stefan Baranoff
     
  • When a new rx packet arrives, the rx path will decide whether to reuse
    the remainder of the page or not according to one of the below conditions:
    1. frag_info->frag_stride == PAGE_SIZE / 2
    2. frags->page_offset + frag_info->frag_size > PAGE_SIZE;

    The first condition is not met when XDP is set.
    For XDP, page_offset is always set to priv->rx_headroom, which is
    XDP_PACKET_HEADROOM, and frag_info->frag_size is around mtu size + some
    padding. The 2nd release condition is still not met, since
    XDP_PACKET_HEADROOM + 1536 < PAGE_SIZE; as a result the page will not
    be released and will be _wrongly_ reused for the next free rx descriptor.

    In XDP there is an assumption to have a page per packet and reuse can
    break such assumption and might cause packet data corruptions.

    Fix this by adding an extra condition (!priv->rx_headroom) to the 2nd
    case to avoid page reuse when XDP is set, since rx_headroom is set to 0
    for non XDP setup and set to XDP_PACKET_HEADROOM for XDP setup.

    No additional cache line is required for the new condition.

    Fixes: 34db548bfb95 ("mlx4: add page recycling in receive path")
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Tariq Toukan
    Suggested-by: Martin KaFai Lau
    CC: Eric Dumazet
    Signed-off-by: David S. Miller

    Saeed Mahameed
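
A simplified model of the release decision discussed in the mlx4 commit above. It assumes a 4096-byte PAGE_SIZE and an XDP_PACKET_HEADROOM of 256. The struct, the page_in_use flag standing in for the page refcount checks of the first case, and the fixed switch between old and new behaviour are all hypothetical. The real driver also advances page_offset before the comparison, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE           4096u /* assumed page size */
#define XDP_PACKET_HEADROOM 256u

struct rx_frag {
    unsigned int stride;      /* frag_info->frag_stride */
    unsigned int page_offset; /* frags->page_offset */
    unsigned int frag_size;   /* frag_info->frag_size */
};

/* Returns true when the page must be released instead of reused. */
static bool release_page(const struct rx_frag *f, unsigned int rx_headroom,
                         bool page_in_use, bool fixed)
{
    bool release = true;

    assert(f != 0);
    if (f->stride == PAGE_SIZE / 2) {
        /* Case 1: half-page stride, reuse only if nobody else holds it. */
        release = page_in_use;
    } else if (!fixed || !rx_headroom) {
        /* Case 2: with the fix this is skipped whenever XDP is set. */
        release = f->page_offset + f->frag_size > PAGE_SIZE;
    }
    return release;
}
```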
     
  • CC [M] drivers/net/ethernet/freescale/fman/fman.o
    In file included from ../drivers/net/ethernet/freescale/fman/fman.c:35:
    ../include/linux/fsl/guts.h: In function 'guts_set_dmacr':
    ../include/linux/fsl/guts.h:165:2: error: implicit declaration of function 'clrsetbits_be32' [-Werror=implicit-function-declaration]
    clrsetbits_be32(&guts->dmacr, 3 << shift, device << shift);
    ^~~~~~~~~~~~~~~

    Signed-off-by: Randy Dunlap
    Cc: Madalin Bucur
    Cc: netdev@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: David S. Miller

    Randy Dunlap
     
  • The netvsc device may need to fall back to running in single queue
    mode if the host side only wants to support a single queue.

    A recent change for handling MTU broke this in the setup logic.

    Reported-by: Dan Carpenter
    Fixes: 3ffe64f1a641 ("hv_netvsc: split sub-channel setup into async and sync")
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • During a device failover, there may be latency between the loss
    of the current backing device and a notification from firmware that
    a failover has occurred. This latency can result in a large amount of
    error printouts as firmware returns outgoing traffic with a generic
    error code. These are not necessarily errors in this case as the
    firmware is busy swapping in a new backing adapter and is not ready
    to send packets yet. This patch reclassifies those error codes as
    warnings with an explanation that a failover may be pending. All
    other return codes will be considered errors.

    Signed-off-by: Thomas Falcon
    Signed-off-by: David S. Miller

    Thomas Falcon
     
  • Commit adc176c54722 ("ipv6 addrconf: Implemented enhanced DAD (RFC7527)")
    added enhanced DAD with a nonce length of 6 bytes. However, RFC7527
    doesn't specify the length of the nonce, other than being 6 + 8*k bytes,
    with integer k >= 0 (RFC3971 5.3.2). The current implementation simply
    assumes that the nonce will always be 6 bytes, but other systems are
    free to choose different sizes.

    If another system sends a nonce of different length but with the same 6
    bytes prefix, it shouldn't be considered as the same nonce. Thus, check
    that the length of the received nonce is the same as the length we sent.

    Ugly scapy test script running on veth0:

    def loop():
        pkt=sniff(iface="veth0", filter="icmp6", count=1)
        pkt = pkt[0]
        b = bytearray(pkt[Raw].load)
        b[1] += 1
        b += b'\xde\xad\xbe\xef\xde\xad\xbe\xef'
        pkt[Raw].load = bytes(b)
        pkt[IPv6].plen += 8
        # fixup checksum after modifying the payload
        pkt[IPv6].payload.cksum -= 0x3b44
        if pkt[IPv6].payload.cksum < 0:
            pkt[IPv6].payload.cksum += 0xffff
        sendp(pkt, iface="veth0")

    This should result in DAD failure for any address added to veth0's peer,
    but is currently ignored.

    Fixes: adc176c54722 ("ipv6 addrconf: Implemented enhanced DAD (RFC7527)")
    Signed-off-by: Sabrina Dubroca
    Reviewed-by: Stefano Brivio
    Signed-off-by: David S. Miller

    Sabrina Dubroca
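
The comparison rule from the enhanced DAD commit above (same length first, then same bytes) can be sketched as below. The function name nonce_matches() and its parameters are illustrative and not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Two nonces are the same only if both length and contents agree. */
static bool nonce_matches(const unsigned char *sent, size_t sent_len,
                          const unsigned char *rcvd, size_t rcvd_len)
{
    assert(sent != NULL && rcvd != NULL);
    if (sent_len != rcvd_len)
        return false;
    return memcmp(sent, rcvd, sent_len) == 0;
}
```

A received 14-byte nonce that merely starts with the 6 bytes we sent is therefore treated as a different nonce, which is the case the scapy script above constructs.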
     
  • This patch removes the following documentation warning:
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:103: warning: Excess function parameter 'priv' description in 'stmmac_axi_setup'
    It was introduced in commit afea03656add7 ("stmmac: rework DMA bus setting and introduce new platform AXI structure").

    Signed-off-by: Corentin Labbe
    Signed-off-by: David S. Miller

    Corentin Labbe
     
  • This patch fixes a typo in the word "Describe".
    Signed-off-by: Corentin Labbe
    Signed-off-by: David S. Miller

    Corentin Labbe
     
  • A KASAN:use-after-free bug was found related to ip6-erspan
    while running selftests/net/ip6_gre_headroom.sh

    It happens because of the following sequence:
    - ipv6hdr pointer is obtained from skb
    - skb_cow_head() is called, skb->head memory is reallocated
    - old data is accessed using ipv6hdr pointer

    The skb_cow_head() call was added in e41c7c68ea77 ("ip6erspan: make sure
    enough headroom at xmit."), but looking at the history a similar bug
    was already possible, because gre_handle_offloads() and pskb_trim()
    can also reallocate skb->head memory. The Fixes tag points to the commit
    which introduced the possibility of this bug.

    This patch moves ipv6hdr pointer assignment after skb_cow_head() call.

    Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
    Signed-off-by: Prashant Bhole
    Reviewed-by: Greg Rose
    Acked-by: William Tu
    Signed-off-by: David S. Miller

    Prashant Bhole
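
A user-space analogy for the ordering fix in the ip6erspan commit above, with realloc() standing in for the reallocation that skb_cow_head() may perform. struct buf, buf_grow(), and first_byte_after_grow() are hypothetical; unlike an skb, this buffer grows at the tail, so only the pointer-lifetime aspect carries over.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: owned storage plus its length. */
struct buf {
    unsigned char *head;
    size_t len;
};

/* Grow the buffer; like skb_cow_head(), this may move the storage. */
static int buf_grow(struct buf *b, size_t extra)
{
    unsigned char *p;

    assert(b != NULL);
    p = realloc(b->head, b->len + extra);
    if (!p)
        return -1;
    b->head = p;
    b->len += extra;
    return 0;
}

/* Safe ordering: take the header pointer only after the buffer may move. */
static int first_byte_after_grow(struct buf *b, size_t extra)
{
    const unsigned char *hdr;

    if (buf_grow(b, extra))
        return -1;
    hdr = b->head; /* derived after the possible reallocation */
    return hdr[0];
}
```

Taking hdr before buf_grow() and dereferencing it afterwards would be the use-after-free pattern the commit removes.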
     
  • On XDP_TX we need to free up the frame only when tun_xdp_tx() returns a
    negative value. A positive value indicates that the packet is
    successfully enqueued to the ptr_ring, so freeing the page causes
    use-after-free.

    Fixes: 735fc4054b3a ("xdp: change ndo_xdp_xmit API to support bulking")
    Signed-off-by: Toshiaki Makita
    Acked-by: Jason Wang
    Acked-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Toshiaki Makita
     
  • This patch fixes a spelling typo in bonding.txt

    Signed-off-by: Masanari Iida
    Signed-off-by: David S. Miller

    Masanari Iida