13 Mar, 2020

1 commit


11 Mar, 2020

1 commit

  • During IB device removal, cancel the event worker before the device
    structure is freed.

    Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
    Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com
    Signed-off-by: Karsten Graul
    Reviewed-by: Ursula Braun
    Reviewed-by: Leon Romanovsky
    Signed-off-by: David S. Miller

    Karsten Graul
     

28 Feb, 2020

1 commit


27 Feb, 2020

4 commits

  • In smc_ib_remove_dev() check if the provided ib device was actually
    initialized for SMC before.

    Reported-by: syzbot+84484ccebdd4e5451d91@syzkaller.appspotmail.com
    Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • According to RFC 7609, all CLC messages contain a peer ID that consists
    of a unique instance ID and the MAC address of one of the host's RoCE
    devices. But if an SMC-R connection cannot be established, e.g., because
    no matching pnet table entry is found, the current implementation uses a
    zero value in the CLC decline message although the host's peer ID is set
    to a proper value.

    If no RoCE and no ISM device is usable for a connection, there is no LGR
    and the LGR check in smc_clc_send_decline() prevents the peer ID from
    being copied into the CLC decline message for both SMC-D and SMC-R. So,
    this patch modifies the check to also accept the case of no LGR. Also,
    only a valid peer ID is copied into the decline message.

    Signed-off-by: Hans Wippel
    Signed-off-by: David S. Miller

    Hans Wippel
     
  • This patch initializes the peer ID to a random instance ID and a zero
    MAC address. If a RoCE device is in the host, the MAC address part of
    the peer ID is overwritten with the respective address. Also, a function
    for checking if the peer ID is valid is added. A peer ID is considered
    valid if the MAC address part contains a non-zero MAC address.

    Signed-off-by: Hans Wippel
    Signed-off-by: David S. Miller

    Hans Wippel
     
  • If an SMC connection to a certain peer is set up for the first time,
    a new linkgroup is created. In case of setup failures, such a
    linkgroup is unusable and should disappear. As a first step the
    linkgroup is removed from the linkgroup list in smc_lgr_forget().

    There are 2 problems:
    * smc_listen_decline() might be called before linkgroup creation,
    resulting in a crash due to calling smc_lgr_forget() with
    parameter NULL.
    * If a setup failure occurs after linkgroup creation, the connection
    is never unregistered from the linkgroup, preventing linkgroup
    freeing.

    This patch introduces an enhanced smc_lgr_cleanup_early() function
    which
    * contains a linkgroup check for early smc_listen_decline()
    invocations
    * invokes smc_conn_free() to guarantee unregistering of the
    connection.
    * schedules fast linkgroup removal of the unusable linkgroup

    And the unused function smcd_conn_free() is removed from smc_core.h.

    Fixes: 3b2dec2603d5b ("net/smc: restructure client and server code in af_smc")
    Fixes: 2a0674fffb6bc ("net/smc: improve abnormal termination of link groups")
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
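
The peer ID handling described in the two Hans Wippel commits above can be sketched in user-space C. This is only an illustration, not the kernel code: the struct, the function names, and the layout (a 2-byte instance ID followed by a 6-byte MAC address) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PEER_ID_LEN 8	/* assumed layout: 2-byte instance ID + 6-byte MAC */
#define MAC_LEN     6

/* Hypothetical stand-in for the host's peer ID. */
struct peer_id {
	uint8_t bytes[PEER_ID_LEN];
};

/* Start with a random instance ID and an all-zero MAC part. */
void peer_id_init(struct peer_id *id)
{
	assert(id != NULL);
	memset(id->bytes, 0, sizeof(id->bytes));
	id->bytes[0] = (uint8_t)rand();
	id->bytes[1] = (uint8_t)rand();
}

/* Overwrite the MAC part once a RoCE device is found in the host. */
void peer_id_set_mac(struct peer_id *id, const uint8_t mac[MAC_LEN])
{
	memcpy(&id->bytes[PEER_ID_LEN - MAC_LEN], mac, MAC_LEN);
}

/* A peer ID counts as valid if its MAC part is non-zero. */
bool peer_id_is_valid(const struct peer_id *id)
{
	static const uint8_t zero_mac[MAC_LEN];

	return memcmp(&id->bytes[PEER_ID_LEN - MAC_LEN], zero_mac, MAC_LEN) != 0;
}

/* Copy the peer ID into a decline message only if it is valid. */
void decline_fill_peer_id(uint8_t out[PEER_ID_LEN], const struct peer_id *id)
{
	memset(out, 0, PEER_ID_LEN);
	if (peer_id_is_valid(id))
		memcpy(out, id->bytes, PEER_ID_LEN);
}
```

In this sketch the decline message carries zeros until a MAC address has been set, independent of whether a link group exists.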
     

18 Feb, 2020

6 commits

  • IB event handlers schedule the port event worker for further
    processing of port state changes. This patch reduces the number of
    schedules to avoid duplicate processing of the same port change.

    Reviewed-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • smc_lgr_terminate() and smc_lgr_terminate_sched() both result in soft
    link termination; smc_lgr_terminate_sched() schedules a worker for
    this task. Take out complexity by always using the termination worker
    and getting rid of smc_lgr_terminate() completely.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • The soft parameter of smc_lgr_terminate() is not used and obsolete.
    Remove it.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • When 2 callers call smc_lgr_terminate() at the same time
    for the same lgr, one gets the lgr_lock and deletes the lgr from the
    list and releases the lock. Then the second caller gets the lock and
    tries to delete it again.
    In smc_lgr_terminate() add a check if the link group lgr is already
    deleted from the link group list, to prevent deleting it a
    second time.
    And add a check if the lgr is marked as freeing, which means that a
    termination is already pending.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • smc_tx_rdma_write() is called under the send_lock and should not call
    smc_lgr_terminate() directly. Call smc_lgr_terminate_sched() instead
    which schedules a worker.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • smc_lgr_cleanup() is called during termination processing, so there is
    no need to send a DELETE_LINK at that time. A DELETE_LINK should have
    been sent before the termination is initiated, if needed.
    And remove the extra call to wake_up(&lnk->wr_reg_wait) because
    smc_llc_link_inactive() already calls the related helper function
    smc_wr_wakeup_reg_wait().

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
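
The guard against terminating the same link group twice, described in the smc_lgr_terminate() commit above, can be sketched as follows. This is a simplified single-threaded model, not the kernel code: the list helpers and struct are hypothetical, and the lgr_lock that would surround the check is only indicated by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list node (hypothetical helper). */
struct node {
	struct node *prev, *next;
};

void node_init(struct node *n)
{
	n->prev = n;
	n->next = n;
}

/* A node that points to itself is not on any list. */
bool node_unlinked(const struct node *n)
{
	return n->next == n;
}

void list_add_node(struct node *head, struct node *n)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and re-initialize, so a later node_unlinked() check succeeds. */
void list_del_node(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

struct link_group {
	struct node list;
	bool freeing;		/* a termination is already pending */
	int terminations;	/* counts how often termination really ran */
};

/* Returns true only for the caller that actually performs the termination. */
bool lgr_terminate(struct link_group *lgr)
{
	assert(lgr != NULL);
	/* the list lock would be taken here */
	if (lgr->freeing || node_unlinked(&lgr->list))
		return false;	/* already deleted or termination pending */
	lgr->freeing = true;
	list_del_node(&lgr->list);
	lgr->terminations++;
	/* the list lock would be released here */
	return true;
}
```

The second caller sees either the freeing mark or the already unlinked node and returns without touching the list again.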
     

14 Feb, 2020

2 commits

  • Only SMCR requires a CLC Peer ID; SMCD does not. The field should be
    zero for SMCD.

    Fixes: c758dfddc1b5 ("net/smc: add SMC-D support in CLC messages")
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • SMC does not work together with FASTOPEN. If sendmsg() is called with
    flag MSG_FASTOPEN in SMC_INIT state, the SMC-socket switches to
    fallback mode. To handle the previous ioctl FIOASYNC call correctly
    in this case, it is necessary to transfer the socket wait queue
    fasync_list to the internal TCP socket.

    Reported-by: syzbot+4b1fe8105f8044a26162@syzkaller.appspotmail.com
    Fixes: ee9dfbef02d18 ("net/smc: handle sockopts forcing fallback")
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     

12 Feb, 2020

1 commit

  • As nlmsg_put() does not clear the memory that is reserved,
    it is the caller's responsibility to make sure all of this
    memory will be written, in order to not reveal prior content.

    While we are at it, we can provide the socket cookie even
    if clsock is not set.

    syzbot reported:

    BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
    BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
    BUG: KMSAN: uninit-value in __swab32p include/uapi/linux/swab.h:179 [inline]
    BUG: KMSAN: uninit-value in __be32_to_cpup include/uapi/linux/byteorder/little_endian.h:82 [inline]
    BUG: KMSAN: uninit-value in get_unaligned_be32 include/linux/unaligned/access_ok.h:30 [inline]
    BUG: KMSAN: uninit-value in ____bpf_skb_load_helper_32 net/core/filter.c:240 [inline]
    BUG: KMSAN: uninit-value in ____bpf_skb_load_helper_32_no_cache net/core/filter.c:255 [inline]
    BUG: KMSAN: uninit-value in bpf_skb_load_helper_32_no_cache+0x14a/0x390 net/core/filter.c:252
    CPU: 1 PID: 5262 Comm: syz-executor.5 Not tainted 5.5.0-rc5-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1c9/0x220 lib/dump_stack.c:118
    kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
    __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
    __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
    __fswab32 include/uapi/linux/swab.h:59 [inline]
    __swab32p include/uapi/linux/swab.h:179 [inline]
    __be32_to_cpup include/uapi/linux/byteorder/little_endian.h:82 [inline]
    get_unaligned_be32 include/linux/unaligned/access_ok.h:30 [inline]
    ____bpf_skb_load_helper_32 net/core/filter.c:240 [inline]
    ____bpf_skb_load_helper_32_no_cache net/core/filter.c:255 [inline]
    bpf_skb_load_helper_32_no_cache+0x14a/0x390 net/core/filter.c:252

    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
    kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
    kmsan_kmalloc_large+0x73/0xc0 mm/kmsan/kmsan_hooks.c:128
    kmalloc_large_node_hook mm/slub.c:1406 [inline]
    kmalloc_large_node+0x282/0x2c0 mm/slub.c:3841
    __kmalloc_node_track_caller+0x44b/0x1200 mm/slub.c:4368
    __kmalloc_reserve net/core/skbuff.c:141 [inline]
    __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209
    alloc_skb include/linux/skbuff.h:1049 [inline]
    netlink_dump+0x44b/0x1ab0 net/netlink/af_netlink.c:2224
    __netlink_dump_start+0xbb2/0xcf0 net/netlink/af_netlink.c:2352
    netlink_dump_start include/linux/netlink.h:233 [inline]
    smc_diag_handler_dump+0x2ba/0x300 net/smc/smc_diag.c:242
    sock_diag_rcv_msg+0x211/0x610 net/core/sock_diag.c:256
    netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477
    sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:275
    netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
    netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328
    netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917
    sock_sendmsg_nosec net/socket.c:639 [inline]
    sock_sendmsg net/socket.c:659 [inline]
    kernel_sendmsg+0x433/0x440 net/socket.c:679
    sock_no_sendpage+0x235/0x300 net/core/sock.c:2740
    kernel_sendpage net/socket.c:3776 [inline]
    sock_sendpage+0x1e1/0x2c0 net/socket.c:937
    pipe_to_sendpage+0x38c/0x4c0 fs/splice.c:458
    splice_from_pipe_feed fs/splice.c:512 [inline]
    __splice_from_pipe+0x539/0xed0 fs/splice.c:636
    splice_from_pipe fs/splice.c:671 [inline]
    generic_splice_sendpage+0x1d5/0x2d0 fs/splice.c:844
    do_splice_from fs/splice.c:863 [inline]
    do_splice fs/splice.c:1170 [inline]
    __do_sys_splice fs/splice.c:1447 [inline]
    __se_sys_splice+0x2380/0x3350 fs/splice.c:1427
    __x64_sys_splice+0x6e/0x90 fs/splice.c:1427
    do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
    Signed-off-by: Eric Dumazet
    Cc: Ursula Braun
    Signed-off-by: David S. Miller

    Eric Dumazet
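
The idea behind the fix above, clearing reserved message memory before filling in individual fields, can be illustrated in user space. This is a sketch under assumptions: the struct layout and function name are invented for the example and do not reproduce the smc_diag message format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reply layout; compilers typically pad after 'state'. */
struct diag_msg {
	uint8_t  family;
	uint8_t  state;
	uint32_t inode;
	uint32_t cookie[2];
};

/*
 * Fill a reply that lives in memory which may still hold old content.
 * Zeroing the whole struct first means fields that are never assigned,
 * and in practice the padding bytes too, do not expose stale data.
 */
void fill_diag_msg(struct diag_msg *r, uint8_t family, uint8_t state,
		   uint32_t inode, uint64_t cookie)
{
	assert(r != NULL);
	memset(r, 0, sizeof(*r));
	r->family = family;
	r->state = state;
	r->inode = inode;
	r->cookie[0] = (uint32_t)cookie;
	r->cookie[1] = (uint32_t)(cookie >> 32);
}
```

Without the memset(), any byte of the reserved area that no assignment touches would be sent to the receiver with whatever the allocator left there.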
     

21 Jan, 2020

1 commit

  • The current flags of the SMC_PNET_GET command only allow privileged
    users to retrieve entries from the pnet table via netlink. The content
    of the pnet table may be useful for all users though, e.g., for
    debugging smc connection problems.

    This patch removes the GENL_ADMIN_PERM flag so that unprivileged users
    can read the pnet table.

    Signed-off-by: Hans Wippel
    Signed-off-by: David S. Miller

    Hans Wippel
     

23 Dec, 2019

1 commit


21 Dec, 2019

1 commit

  • In the reboot_event handler, unregister the ib devices and enable
    the IB layer to release the devices before the reboot.

    Fixes: a33a803cfe64 ("net/smc: guarantee removal of link groups in reboot")
    Signed-off-by: Karsten Graul
    Reviewed-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     

16 Dec, 2019

2 commits

  • FASTOPEN setsockopt() or sendmsg() may switch the SMC socket to fallback
    mode. Once fallback mode is active, the native TCP socket functions are
    called. Nevertheless there is a small race window when FASTOPEN
    setsockopt/sendmsg runs in parallel to a connect() and switches the
    socket into fallback mode before connect() takes the sock lock.
    Make sure the SMC-specific connect setup is omitted in this case.

    This way a syzbot-reported refcount problem is fixed, triggered by
    different threads running non-blocking connect() and FASTOPEN_KEY
    setsockopt.

    Reported-by: syzbot+96d3f9ff6a86d37e44c8@syzkaller.appspotmail.com
    Fixes: 6d6dd528d5af ("net/smc: fix refcount non-blocking connect() -part 2")
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: Jakub Kicinski

    Ursula Braun
     
  • Save a line of code by making use of ATOMIC_INIT() for lgr_cnt.

    Suggested-by: David S. Miller
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: Jakub Kicinski

    Ursula Braun
     

17 Nov, 2019

6 commits

  • Lots of overlapping changes and parallel additions, stuff
    like that.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • FASTOPEN does not work with SMC-sockets. Since SMC allows fallback to
    TCP native during connection start, the FASTOPEN setsockopts trigger
    this fallback, if the SMC-socket is still in state SMC_INIT.
    But if a FASTOPEN setsockopt is called after a non-blocking connect(),
    this is broken, and fallback does not make sense.
    This change complements
    commit cd2063604ea6 ("net/smc: avoid fallback in case of non-blocking connect")
    and fixes the syzbot reported problem "WARNING in smc_unhash_sk".

    Reported-by: syzbot+8488cc4cf1c9e09b8b86@syzkaller.appspotmail.com
    Fixes: e1bbdd570474 ("net/smc: reduce sock_put() for fallback sockets")
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • Constant SMC_CLOSE_WAIT_LISTEN_CLCSOCK_TIME is defined, but has not
    been used since
    commit 3d502067599f ("net/smc: simplify wait when closing listen socket").
    Remove it.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • Add rcu_barrier() to make sure no RCU readers or callbacks are
    pending when the module is unloaded.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • When rebooting it should be guaranteed all link groups are cleaned
    up and freed.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • If the smc module is unloaded, return control from the exit routine
    only if all link groups are freed.
    If an IB device is thrown away, return control from device removal
    only if all link groups belonging to this device are freed.
    Counters for the total number of SMCR link groups and for the total
    number of SMCR links per IB device are introduced. smc module unloading
    continues only if the total number of SMCR link groups is zero. IB device
    removal continues only if the total number of SMCR links per IB device
    has decreased to zero.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
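
The bookkeeping described in the commit above can be modelled with C11 atomics. This is a minimal single-threaded sketch with hypothetical names; the kernel sleeps until the counters reach zero, while the example only shows the counters and the conditions that are checked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Total number of link groups, statically initialized to zero. */
static atomic_int lgr_cnt = 0;

/* Hypothetical per-device state: number of links using this device. */
struct ib_dev {
	atomic_int lnk_cnt;
};

/* Creating a link group with one link bumps both counters. */
void lgr_created(struct ib_dev *dev)
{
	atomic_fetch_add(&lgr_cnt, 1);
	atomic_fetch_add(&dev->lnk_cnt, 1);
}

/* Freeing it drops both counters again. */
void lgr_freed(struct ib_dev *dev)
{
	int old = atomic_fetch_sub(&dev->lnk_cnt, 1);

	assert(old > 0);
	atomic_fetch_sub(&lgr_cnt, 1);
}

/* Module unload may continue only when no link group is left. */
bool module_unload_may_continue(void)
{
	return atomic_load(&lgr_cnt) == 0;
}

/* Device removal may continue only when no link uses the device. */
bool device_removal_may_continue(struct ib_dev *dev)
{
	return atomic_load(&dev->lnk_cnt) == 0;
}
```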
     

16 Nov, 2019

8 commits

  • If the SMC module is unloaded or an IB device is thrown away, the
    immediate link group freeing introduced for SMCD is exploited for SMCR
    as well. That means SMCR-specifics are added to smc_conn_kill().

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • Make sure all pending work requests are completed before freeing
    a link.
    Dismiss tx pending slots already when terminating a link group to
    exploit termination shortcut in tx completion queue handler.

    And kill the completion queue tasklets after destroy of the
    completion queues, otherwise there is a time window for another
    tasklet schedule of an already killed tasklet.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • For abnormal termination issue an LLC DELETE_LINK without the
    orderly flag.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • Avoid waiting for a free work request buffer, if the link group
    is already terminating.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • If the ism module is unloaded, return control from the exit routine
    only if all link groups are freed.
    If an ISM device is thrown away, return control from device removal
    only if all link groups belonging to this device are freed.
    A counter for the total number of SMCD link groups per ISM device is
    introduced. ism module unloading continues only if the total number of
    SMCD link groups for all ISM devices is zero. ISM device
    removal continues only if the total number of SMCD link groups per ISM
    device has decreased to zero.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • A final cleanup due to SMCD device removal means immediate freeing
    of all link groups belonging to this device in interrupt context.

    This patch introduces a separate SMCD link group termination routine,
    which terminates all link groups of an SMCD device.

    This new routine smcd_terminate_all() is reused if the smc module is
    unloaded.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • SMCD link group termination is called when peer signals its shutdown
    of its corresponding link group. For regular shutdowns no connections
    exist anymore. For abnormal shutdowns connections must be killed and
    their DMBs must be unregistered immediately. That means the SMCR method
    to delay the link group freeing several seconds does not fit.

    This patch adds immediate termination of a link group and its SMCD
    connections and makes sure all SMCD link group related cleanup steps
    are finished.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • If peer announces shutdown, use the link group terminate worker for
    local cleanup of link groups and connections to terminate link group
    in proper context.

    Make sure link groups are cleaned up first before destroying the
    event queue of the SMCD device, because link group cleanup may
    raise events.

    Send signal shutdown only if peer has not done it already.

    Send socket abort or close only, if peer has not already announced
    shutdown.

    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     

13 Nov, 2019

1 commit

  • If an SMC socket is immediately terminated after a non-blocking connect()
    has been called, a memory leak is possible.
    Due to the sock_hold move in
    commit 301428ea3708 ("net/smc: fix refcounting for non-blocking connect()")
    an extra sock_put() is needed in smc_connect_work(), if the internal
    TCP socket is aborted and cancels the sk_stream_wait_connect() of the
    connect worker.

    Reported-by: syzbot+4b73ad6fc767e576e275@syzkaller.appspotmail.com
    Fixes: 301428ea3708 ("net/smc: fix refcounting for non-blocking connect()")
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
     

10 Nov, 2019

1 commit


07 Nov, 2019

1 commit

  • If a pnet table entry is to be added mentioning a valid ethernet
    interface, but an invalid infiniband or ISM device, the dev_put()
    operation for the ethernet interface is called twice, resulting
    in a negative refcount for the ethernet interface, which disables
    removal of such a network interface.

    This patch removes one of the dev_put() calls.

    Fixes: 890a2cb4a966 ("net/smc: rework pnet table")
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Ursula Braun
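
The double release described above follows a common error-path pattern. The sketch below models it with a plain integer reference count; all names are hypothetical and the lookup of the IB or ISM device is reduced to a boolean parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ethernet interface with a refcount. */
struct netdev {
	int refcnt;
};

void dev_hold_sketch(struct netdev *d)
{
	d->refcnt++;
}

void dev_put_sketch(struct netdev *d)
{
	assert(d->refcnt > 0);	/* a second put on the same hold trips here */
	d->refcnt--;
}

/*
 * Add a pnet entry. The interface lookup takes one reference; if the
 * IB/ISM device turns out to be invalid, that reference is released
 * exactly once on the error path.
 * Returns 0 on success, -1 on failure.
 */
int pnet_add_entry(struct netdev *eth, bool ib_or_ism_valid)
{
	int rc = 0;

	dev_hold_sketch(eth);		/* reference from the lookup */
	if (!ib_or_ism_valid) {
		rc = -1;
		goto out_put;
	}
	return 0;			/* the new entry keeps the reference */

out_put:
	dev_put_sketch(eth);		/* single release, no second put */
	return rc;
}
```

After a failed add the count is back at its previous value, so the interface can still be removed later.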
     

03 Nov, 2019

1 commit


30 Oct, 2019

1 commit

  • If a nonblocking socket is immediately closed after connect(),
    the connect worker may not have started. This results in a refcount
    problem, since sock_hold() is called from the connect worker.
    This patch moves the sock_hold() call ahead of the connect worker
    scheduling.

    Reported-by: syzbot+4c063e6dea39e4b79f29@syzkaller.appspotmail.com
    Fixes: 50717a37db03 ("net/smc: nonblocking connect rework")
    Reviewed-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
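
The ordering fix above, taking the reference before the worker is scheduled rather than inside it, can be sketched with a simple counter. This is a single-threaded model with hypothetical names; scheduling is represented by a flag instead of a real work queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical socket with a reference count and a pending-work flag. */
struct sock_sketch {
	int refcnt;
	bool work_pending;
};

void sock_hold_sketch(struct sock_sketch *sk)
{
	sk->refcnt++;
}

void sock_put_sketch(struct sock_sketch *sk)
{
	assert(sk->refcnt > 0);
	sk->refcnt--;
}

/* Non-blocking connect: hold the socket first, then schedule the worker. */
void connect_nonblocking(struct sock_sketch *sk)
{
	sock_hold_sketch(sk);	/* reference owned by the pending worker */
	sk->work_pending = true;
}

/* The worker drops the reference that was taken on its behalf. */
void connect_worker(struct sock_sketch *sk)
{
	sk->work_pending = false;
	sock_put_sketch(sk);
}

/* Closing drops the caller's own reference. */
void close_sketch(struct sock_sketch *sk)
{
	sock_put_sketch(sk);
}
```

Because the hold happens before scheduling, a close that arrives before the worker has started cannot drop the count to zero while work is still pending.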