19 Mar, 2019

1 commit

  • [ Upstream commit d7cf4a3bf3a83c977a29055e1c4ffada7697b31f ]

    smc_poll() returns with mask bit EPOLLPRI if the connection urg_state
    is SMC_URG_VALID. Since SMC_URG_VALID is zero, smc_poll signals
    EPOLLPRI erroneously if called in state SMC_INIT before the connection
    is created, for instance in a non-blocking connect scenario.

    This patch switches to non-zero values for the urg states.

    Reviewed-by: Karsten Graul
    Fixes: de8474eb9d50 ("net/smc: urgent data support")
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ursula Braun
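
The zero-value pitfall described in the commit above can be sketched in plain C. This is a minimal user-space model, not the kernel code: `struct urg_conn`, the `OLD_`/`NEW_` enumerators and the helper names are hypothetical, and the non-zero values are picked for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical numbering in which the "valid" state is zero. */
enum old_urg_state { OLD_URG_VALID, OLD_URG_NOTYET, OLD_URG_READ };

/* Hypothetical numbering in which no state is zero, so zeroed
 * memory matches none of the states. */
enum new_urg_state { NEW_URG_VALID = 1, NEW_URG_NOTYET, NEW_URG_READ };

/* Stand-in for a connection that is allocated but not yet set up. */
struct urg_conn {
    int urg_state;
};

static struct urg_conn urg_conn_zeroed(void)
{
    struct urg_conn conn;

    memset(&conn, 0, sizeof(conn));
    return conn;
}

/* Poll check against the zero-based numbering. */
static bool old_poll_reports_pri(const struct urg_conn *conn)
{
    return conn->urg_state == OLD_URG_VALID;
}

/* Poll check against the non-zero numbering. */
static bool new_poll_reports_pri(const struct urg_conn *conn)
{
    return conn->urg_state == NEW_URG_VALID;
}

/* A never-initialised connection looks "urgent" only in the old scheme. */
static void urg_state_demo(void)
{
    struct urg_conn conn = urg_conn_zeroed();

    assert(old_poll_reports_pri(&conn));
    assert(!new_poll_reports_pri(&conn));
}
```

Calling `urg_state_demo()` exercises both checks on a zeroed connection; only the zero-based numbering reports urgent data that was never received.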
     

23 Jan, 2019

1 commit

  • [ Upstream commit 26d92e951fe0a44ee4aec157cabb65a818cc8151 ]

    In smc_release() we release smc->clcsock before unhashing the smc
    sock, but a parallel smc_diag_dump() may still be reading
    smc->clcsock, so this could cause a use-after-free as
    reported by syzbot.

    Reported-and-tested-by: syzbot+fbd1e5476e4c94c7b34e@syzkaller.appspotmail.com
    Fixes: 51f1de79ad8e ("net/smc: replace sock_put worker by socket refcounting")
    Cc: Ursula Braun
    Signed-off-by: Cong Wang
    Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com
    Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     

10 Jan, 2019

1 commit

  • [ Upstream commit 78abe3d0dfad196959b1246003366e2610775ea6 ]

    clcsock can be released while kernel_accept() references it in the TCP
    listen worker. Also, clcsock needs to be woken up before it is released
    if TCP fallback is used and the clcsock is blocked in accept. Add a lock
    to safely release clcsock and call kernel_sock_shutdown() to wake up
    clcsock from accept in smc_release().

    Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com
    Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com
    Signed-off-by: Myungho Jung
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Myungho Jung
     

04 Nov, 2018

2 commits

  • [ Upstream commit fb692ec4117f6fd25044cfb5720d6b79d400dc65 ]

    The pointer to the link group is unset in the smc connection structure
    right before the call to smc_buf_unuse. Provide the lgr pointer to
    smc_buf_unuse explicitly.
    And move the call to smc_lgr_schedule_free_work to the end of
    smc_conn_free.

    Fixes: a6920d1d130c ("net/smc: handle unregistered buffers")
    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Karsten Graul
     
  • [ Upstream commit 89ab066d4229acd32e323f1569833302544a4186 ]

    This reverts commit dd979b4df817e9976f18fb6f9d134d6bc4a3c317.

    This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
    internal TCP socket for the initial handshake with the remote peer.
    Whenever the SMC connection cannot be established this TCP socket is
    used as a fallback. All socket operations on the SMC socket are then
    forwarded to the TCP socket. In case of poll, the file->private_data
    pointer references the SMC socket because the TCP socket has no file
    assigned. This causes tcp_poll to wait on the wrong socket.

    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Karsten Graul
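
The first commit in this group (passing the link group to smc_buf_unuse explicitly) addresses an ordering problem that can be sketched in user-space C. Everything below is a hypothetical model: `struct model_lgr`, `struct buf_conn` and the `model_` functions are illustrative names, and the lookup through the connection returns an error here where real code would dereference an unset pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link group with a simple buffer usage counter. */
struct model_lgr {
    int bufs_in_use;
};

/* Hypothetical connection holding a pointer to its link group. */
struct buf_conn {
    struct model_lgr *lgr;
};

/* Shape before the fix: finds the link group through the connection. */
static int model_buf_unuse_via_conn(struct buf_conn *conn)
{
    if (conn->lgr == NULL)
        return -1; /* pointer already unset; the model reports the loss */
    conn->lgr->bufs_in_use--;
    return 0;
}

/* Shape after the fix: the caller hands over the link group explicitly. */
static int model_buf_unuse(struct buf_conn *conn, struct model_lgr *lgr)
{
    (void)conn; /* the connection is no longer needed for the lookup */
    assert(lgr != NULL);
    lgr->bufs_in_use--;
    return 0;
}

/* Free path: save the pointer, unset it in the connection, then release. */
static int model_conn_free(struct buf_conn *conn)
{
    struct model_lgr *lgr = conn->lgr;

    conn->lgr = NULL;
    return model_buf_unuse(conn, lgr);
}
```

Saving the pointer in a local variable before it is cleared keeps the release step independent of the order in which the connection is torn down.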
     

20 Sep, 2018

1 commit


19 Sep, 2018

5 commits

  • Comparing an int to a size, which is unsigned, causes the int to become
    unsigned, giving the wrong result. kernel_sendmsg can return a negative
    error code.

    Signed-off-by: YueHaibing
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    YueHaibing
     
  • Don't check a listen socket for pending urgent data in smc_poll().

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • If a link group has already been terminated abnormally due to a failing
    LLC CONFIRM LINK or LLC ADD LINK, fallback to TCP is still possible.
    In this case do not switch to state SMC_PEERABORTWAIT and do not set
    sk_err.

    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • For a failing smc_listen_rdma_finish() smc_listen_decline() is
    called. If fallback is possible, the new socket is already enqueued
    to be accepted in smc_listen_decline(). Avoid enqueuing a second time
    afterwards in this case, otherwise the smc_create_lgr_pending lock
    is released twice:
    [ 373.463976] WARNING: bad unlock balance detected!
    [ 373.463978] 4.18.0-rc7+ #123 Tainted: G O
    [ 373.463979] -------------------------------------
    [ 373.463980] kworker/1:1/30 is trying to release lock (smc_create_lgr_pending) at:
    [ 373.463990] [] smc_listen_work+0x22c/0x5d0 [smc]
    [ 373.463991] but there are no more locks to release!
    [ 373.463991]
    other info that might help us debug this:
    [ 373.463993] 2 locks held by kworker/1:1/30:
    [ 373.463994] #0: 00000000772cbaed ((wq_completion)"events"){+.+.}, at: process_one_work+0x1ec/0x6b0
    [ 373.464000] #1: 000000003ad0894a ((work_completion)(&new_smc->smc_listen_work)){+.+.}, at: process_one_work+0x1ec/0x6b0
    [ 373.464003]
    stack backtrace:
    [ 373.464005] CPU: 1 PID: 30 Comm: kworker/1:1 Kdump: loaded Tainted: G O 4.18.0-rc7uschi+ #123
    [ 373.464007] Hardware name: IBM 2827 H43 738 (LPAR)
    [ 373.464010] Workqueue: events smc_listen_work [smc]
    [ 373.464011] Call Trace:
    [ 373.464015] ([] show_stack+0x60/0xd8)
    [ 373.464019] [] dump_stack+0x9c/0xd8
    [ 373.464021] [] print_unlock_imbalance_bug+0xf8/0x108
    [ 373.464022] [] lock_release+0x114/0x4f8
    [ 373.464025] [] __mutex_unlock_slowpath+0x4a/0x300
    [ 373.464027] [] smc_listen_work+0x22c/0x5d0 [smc]
    [ 373.464029] [] process_one_work+0x2a8/0x6b0
    [ 373.464030] [] worker_thread+0x52/0x410
    [ 373.464033] [] kthread+0x15e/0x178
    [ 373.464035] [] kernel_thread_starter+0x6/0xc
    [ 373.464052] [] kernel_thread_starter+0x0/0xc
    [ 373.464054] INFO: lockdep is turned off.

    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • In state SMC_INIT smc_poll() delegates polling to the internal
    CLC socket. This means, once the connect worker has finished
    its kernel_connect() step, the poll wake-up may occur. This is not
    intended. The wake-up should occur from the wake up call in
    smc_connect_work() after __smc_connect() has finished.
    Thus in state SMC_INIT this patch now calls sock_poll_wait() on the
    main SMC socket.

    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
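
The first commit in this group (the int-versus-size comparison) describes a conversion rule that is easy to reproduce outside the kernel. The sketch below is a user-space illustration with hypothetical function names; it assumes a platform where `size_t` is at least as wide as `int`, and it writes the implicit conversion as an explicit cast so that the behaviour is visible in the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What `len < want` amounts to when len is an int and want is a size_t:
 * len is converted to an unsigned value before the comparison, so a
 * negative error code turns into a very large number. */
static bool short_send_unsigned_compare(int len, size_t want)
{
    return (size_t)len < want;
}

/* Check the sign first, then compare the magnitudes. */
static bool short_send_checked(int len, size_t want)
{
    return len < 0 || (size_t)len < want;
}

/* A negative return value is missed by the unsigned comparison. */
static void sign_compare_demo(void)
{
    assert(!short_send_unsigned_compare(-5, 100));
    assert(short_send_checked(-5, 100));
}
```

Here `-5` stands in for any negative error code returned by a send routine; the checked variant treats it as a failure, while the unsigned comparison silently accepts it.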
     

18 Aug, 2018

1 commit

  • All RDMA ULPs should be using rdma_get_gid_attr instead of
    ib_query_gid. Convert SMC to use the new API.

    In the process correct some confusion with gid_type - if attr->ndev is
    !NULL then gid_type can never be IB_GID_TYPE_IB by
    definition. IB_GID_TYPE_ROCE shares the same enum value and is probably
    what was intended here.

    Reviewed-by: Parav Pandit
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     

17 Aug, 2018

3 commits

  • rdma.git merge resolution for the 4.19 merge window

    Conflicts:
    drivers/infiniband/core/rdma_core.c
    - Use the rdma code and revise with the new spelling for
    atomic_fetch_add_unless
    drivers/nvme/host/rdma.c
    - Replace max_sge with max_send_sge in new blk code
    drivers/nvme/target/rdma.c
    - Use the blk code and revise to use NULL for ib_post_recv when
    appropriate
    - Replace max_sge with max_recv_sge in new blk code
    net/rds/ib_send.c
    - Use the net code and revise to use NULL for ib_post_recv when
    appropriate

    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     
  • This reverts commit ddb457c6993babbcdd41fca638b870d2a2fc3941.

    The include rdma/ib_cache.h is kept, and we have to add a memset
    to the compat wrapper to avoid compiler warnings in gcc-7

    This revert is done to avoid extensive merge conflicts with SMC
    changes in netdev during the 4.19 merge window.

    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     
  • Resolve merge conflicts from the -rc cycle against the rdma.git tree:

    Conflicts:
    drivers/infiniband/core/uverbs_cmd.c
    - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
    - Merge removal of file->ucontext in for-next with new code in -rc
    drivers/infiniband/core/uverbs_main.c
    - for-next removed code from ib_uverbs_write() that was modified
    in for-rc

    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     

11 Aug, 2018

1 commit


10 Aug, 2018

1 commit


09 Aug, 2018

3 commits

  • When an SMC socket is connecting it is decided whether fallback to
    TCP is needed. To avoid races between connect and ioctl move the
    sock lock before the use_fallback check.

    Reported-by: syzbot+5b2cece1a8ecb2ca77d8@syzkaller.appspotmail.com
    Reported-by: syzbot+19557374321ca3710990@syzkaller.appspotmail.com
    Fixes: 1992d99882af ("net/smc: take sock lock in smc_ioctl()")
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • Without setsockopt SO_SNDBUF and SO_RCVBUF settings, the sysctl
    defaults net.ipv4.tcp_wmem and net.ipv4.tcp_rmem should be the base
    for the sizes of the SMC sndbuf and rcvbuf. Any TCP buffer size
    optimizations for servers should be ignored.

    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • Invoking shutdown for a socket in state SMC_LISTEN does not make
    sense. Nevertheless programs like syzbot fuzzing the kernel may
    try to do this. For SMC this means a socket refcounting problem.
    This patch makes sure a shutdown call for an SMC socket in state
    SMC_LISTEN simply returns with -ENOTCONN.

    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
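
The early return described in the last commit of this group can be modelled as a state check that runs before any bookkeeping. This is a hypothetical user-space sketch: the `MODEL_` states, `struct shut_sock` and its `shut_done` flag are invented for illustration and do not mirror the kernel structures.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical subset of socket states. */
enum model_state { MODEL_INIT, MODEL_ACTIVE, MODEL_LISTEN };

/* Hypothetical socket: a state plus a flag recording a completed shutdown. */
struct shut_sock {
    enum model_state state;
    int shut_done;
};

/* Reject listening sockets before touching any other field. */
static int model_shutdown(struct shut_sock *sk)
{
    if (sk->state == MODEL_LISTEN)
        return -ENOTCONN;

    sk->shut_done = 1;
    return 0;
}

/* A listening socket is left untouched; an active one is shut down. */
static void shutdown_demo(void)
{
    struct shut_sock listener = { MODEL_LISTEN, 0 };
    struct shut_sock active = { MODEL_ACTIVE, 0 };

    assert(model_shutdown(&listener) == -ENOTCONN);
    assert(listener.shut_done == 0);
    assert(model_shutdown(&active) == 0);
    assert(active.shut_done == 1);
}
```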
     

06 Aug, 2018

1 commit


05 Aug, 2018

1 commit

  • If a writer blocked condition is received without data, the current
    consumer cursor is immediately sent. Servers could already receive this
    condition in state SMC_INIT without finished tx-setup. This patch
    avoids sending a consumer cursor update in this case.

    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     

31 Jul, 2018

1 commit


26 Jul, 2018

4 commits

  • Send an orderly DELETE LINK request before termination of a link group,
    and add support for client-triggered DELETE LINK processing. Also send a
    disorderly DELETE LINK before the module is unloaded.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • Remember the fallback reason code and the peer diagnosis code for
    smc sockets, and provide them in smc_diag.c to the netlink interface.
    And add more detailed reason codes.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • SMC code uses the base gid for VLAN traffic. The gids exchanged in
    the CLC handshake and the gid index used for the QP have to switch
    from the base gid to the appropriate vlan gid.

    When searching for a matching IB device port for a certain vlan
    device, it does not make sense to return an IB device port, which
    is not enabled for the used vlan_id. Add another check whether a
    vlan gid exists for a certain IB device port.

    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • Link confirmation will always be sent across the new link being
    confirmed. This allows the parameter list to be shrunk.
    No functional change.

    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     

25 Jul, 2018

2 commits


24 Jul, 2018

5 commits


21 Jul, 2018

1 commit


19 Jul, 2018

4 commits

  • Pull networking fixes from David Miller:
    "Lots of fixes, here goes:

    1) NULL deref in qtnfmac, from Gustavo A. R. Silva.

    2) Kernel oops when fw download fails in rtlwifi, from Ping-Ke Shih.

    3) Lost completion messages in AF_XDP, from Magnus Karlsson.

    4) Correct bogus self-assignment in rhashtable, from Rishabh
    Bhatnagar.

    5) Fix regression in ipv6 route append handling, from David Ahern.

    6) Fix masking in __set_phy_supported(), from Heiner Kallweit.

    7) Missing module owner set in x_tables icmp, from Florian Westphal.

    8) liquidio's timeouts are HZ dependent, fix from Nicholas Mc Guire.

    9) Link setting fixes for sh_eth and ravb, from Vladimir Zapolskiy.

    10) Fix NULL deref when using chains in act_csum, from Davide Caratti.

    11) XDP_REDIRECT needs to check if the interface is up and whether the
    MTU is sufficient. From Toshiaki Makita.

    12) Net diag can do a double free when killing TCP_NEW_SYN_RECV
    connections, from Lorenzo Colitti.

    13) nf_defrag in ipv6 can unnecessarily hold onto dst entries for a
    full minute, delaying device unregister. From Eric Dumazet.

    14) Update MAC entries in the correct order in ixgbe, from Alexander
    Duyck.

    15) Don't leave partial mangles bpf program in jit_subprogs, from
    Daniel Borkmann.

    16) Fix pfmemalloc SKB state propagation, from Stefano Brivio.

    17) Fix ACK handling in DCTCP congestion control, from Yuchung Cheng.

    18) Use after free in tun XDP_TX, from Toshiaki Makita.

    19) Stale ipv6 header pointer in ipv6 gre code, from Prashant Bhole.

    20) Don't reuse remainder of RX page when XDP is set in mlx4, from
    Saeed Mahameed.

    21) Fix window probe handling of TCP repair sockets, from Stefan
    Baranoff.

    22) Missing socket locking in smc_ioctl(), from Ursula Braun.

    23) IPV6_ILA needs DST_CACHE, from Arnd Bergmann.

    24) Spectre v1 fix in cxgb3, from Gustavo A. R. Silva.

    25) Two spots in ipv6 do a rol32() on a hash value but ignore the
    result. Fixes from Colin Ian King"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (176 commits)
    tcp: identify cryptic messages as TCP seq # bugs
    ptp: fix missing break in switch
    hv_netvsc: Fix napi reschedule while receive completion is busy
    MAINTAINERS: Drop inactive Vitaly Bordug's email
    net: cavium: Add fine-granular dependencies on PCI
    net: qca_spi: Fix log level if probe fails
    net: qca_spi: Make sure the QCA7000 reset is triggered
    net: qca_spi: Avoid packet drop during initial sync
    ipv6: fix useless rol32 call on hash
    ipv6: sr: fix useless rol32 call on hash
    net: sched: Using NULL instead of plain integer
    net: usb: asix: replace mii_nway_restart in resume path
    net: cxgb3_main: fix potential Spectre v1
    lib/rhashtable: consider param->min_size when setting initial table size
    net/smc: reset recv timeout after clc handshake
    net/smc: add error handling for get_user()
    net/smc: optimize consumer cursor updates
    net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
    ipv6: ila: select CONFIG_DST_CACHE
    net: usb: rtl8150: demote allmulti message to dev_dbg()
    ...

    Linus Torvalds
     
  • During clc handshake the receive timeout is set to CLC_WAIT_TIME.
    Remember and reset the original timeout value after the receive calls,
    and remove a duplicate assignment of CLC_WAIT_TIME.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • For security reasons the return code of get_user() should always be
    checked.

    Fixes: 01d2f7e2cdd31 ("net/smc: sockopts TCP_NODELAY and TCP_CORK")
    Reported-by: Heiko Carstens
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • The SMC protocol requires sending a separate consumer cursor update,
    if it cannot be piggybacked to updates of the producer cursor.
    Currently the decision to send a separate consumer cursor update
    just considers the amount of data already received by the socket
    program. It does not consider the amount of data already arrived, but
    not yet consumed by the receiver. Basing the decision on the
    difference between already confirmed and already arrived data
    (instead of difference between already confirmed and already consumed
    data), may lead to a somewhat earlier consumer cursor update send in
    fast unidirectional traffic scenarios, and thus to better throughput.

    Signed-off-by: Ursula Braun
    Suggested-by: Thomas Richter
    Signed-off-by: David S. Miller

    Ursula Braun
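
The difference between the two decision bases in the consumer cursor commit above can be illustrated with a simplified model. The sketch assumes monotonically increasing byte counters instead of wrapping ring-buffer cursors, and a single hypothetical threshold `limit`; the struct and function names are illustrative and not taken from the SMC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical byte counters, with confirmed <= consumed <= arrived. */
struct model_cursors {
    unsigned long confirmed; /* bytes already reported back to the peer */
    unsigned long consumed;  /* bytes already read by the socket program */
    unsigned long arrived;   /* bytes written by the peer so far */
};

/* Decision based on confirmed versus consumed data. */
static bool update_by_consumed(const struct model_cursors *c,
                               unsigned long limit)
{
    assert(c->confirmed <= c->consumed);
    return c->consumed - c->confirmed > limit;
}

/* Decision based on confirmed versus arrived data. */
static bool update_by_arrived(const struct model_cursors *c,
                              unsigned long limit)
{
    assert(c->confirmed <= c->arrived);
    return c->arrived - c->confirmed > limit;
}

/* With data queued but mostly unread, only the arrived-based rule fires. */
static void cursor_update_demo(void)
{
    struct model_cursors c = { 0, 10, 50 };

    assert(!update_by_consumed(&c, 32));
    assert(update_by_arrived(&c, 32));
}
```

Because arrived data is never less than consumed data, the arrived-based rule in this model triggers at least as early as the consumed-based one, which matches the "somewhat earlier" update the commit message describes.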
     

17 Jul, 2018

1 commit

  • SMC ioctl processing requires the sock lock to work properly in
    all conceivable scenarios.
    The problem was found with RaceFuzzer; this patch fixes:
    KASAN: null-ptr-deref Read in smc_ioctl

    Reported-by: Byoungyoung Lee
    Reported-by: syzbot+35b2c5aa76fd398b9fd4@syzkaller.appspotmail.com
    Signed-off-by: Ursula Braun
    Reviewed-by: Stefano Brivio
    Signed-off-by: David S. Miller

    Ursula Braun