24 Aug, 2019

1 commit

  • Add the RDMA cookie and RX timestamp to the usercopy whitelist.

    After the introduction of hardened usercopy whitelisting
    (https://lwn.net/Articles/727322/), a warning is displayed when the
    RDMA cookie or RX timestamp is copied to userspace:

    kernel: WARNING: CPU: 3 PID: 5750 at
    mm/usercopy.c:81 usercopy_warn+0x8e/0xa6
    [...]
    kernel: Call Trace:
    kernel: __check_heap_object+0xb8/0x11b
    kernel: __check_object_size+0xe3/0x1bc
    kernel: put_cmsg+0x95/0x115
    kernel: rds_recvmsg+0x43d/0x620 [rds]
    kernel: sock_recvmsg+0x43/0x4a
    kernel: ___sys_recvmsg+0xda/0x1e6
    kernel: ? __handle_mm_fault+0xcae/0xf79
    kernel: __sys_recvmsg+0x51/0x8a
    kernel: SyS_recvmsg+0x12/0x1c
    kernel: do_syscall_64+0x79/0x1ae

    When the whitelisting feature was introduced, the memory for the RDMA
    cookie and RX timestamp in RDS was not added to the whitelist, causing
    the warning above.

    Signed-off-by: Dag Moxnes
    Tested-by: Jenny
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Dag Moxnes
     

16 Aug, 2019

1 commit


10 Jul, 2019

1 commit

  • RDS composite message(rdma + control) user notification needs to be
    triggered once the full message is delivered and such a fix was
    added as part of commit 941f8d55f6d61 ("RDS: RDMA: Fix the composite
    message user notification"). But rds_send_remove_from_sock is missing
    data part notify check and hence at times the user don't get
    notification which isn't desirable.

    One way is to fix the rds_send_remove_from_sock to check of that case
    but considering the ordering complexity with completion handler and
    rdma + control messages are always dispatched back to back in same send
    context, just delaying the signaled completion on rmda work request also
    gets the desired behaviour. i.e Notifying application only after
    RDMA + control message send completes. So patch updates the earlier
    fix with this approach. The delay signaling completions of rdma op
    till the control message send completes fix was done by Venkat
    Venkatsubra in downstream kernel.

    Reviewed-and-tested-by: Zhu Yanjun
    Reviewed-by: Gerd Rausch
    Signed-off-by: Santosh Shilimkar

    Santosh Shilimkar
     

05 Feb, 2019

3 commits

  • RDMA transport maps user tos to underline virtual lanes(VL)
    for IB or DSCP values. RDMA CM transport abstract thats for
    RDS. TCP transport makes use of default priority 0 and maps
    all user tos values to it.

    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes]
    Signed-off-by: Zhu Yanjun

    Santosh Shilimkar
     
  • RDS Service type (TOS) is user-defined and needs to be configured
    via RDS IOCTL interface. It must be set before initiating any
    traffic and once set the TOS can not be changed. All out-going
    traffic from the socket will be associated with its TOS.

    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes]
    Signed-off-by: Zhu Yanjun

    Santosh Shilimkar
     
  • Mark RDSv3.1 as compat version and add v4.1 version macro's.
    Subsequent patches enable TOS(Type of Service) feature which is
    tied with v4.1 for RDMA transport.

    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes]
    Signed-off-by: Zhu Yanjun

    Santosh Shilimkar
     

07 Jan, 2019

1 commit


20 Dec, 2018

2 commits

  • per comment from Leon in rdma mailing list
    https://lkml.org/lkml/2018/10/31/312 :

    Please don't forget to remove user triggered WARN_ON.
    https://lwn.net/Articles/769365/
    "Greg Kroah-Hartman raised the problem of core kernel API code that will
    use WARN_ON_ONCE() to complain about bad usage; that will not generate
    the desired result if WARN_ON_ONCE() is configured to crash the machine.
    He was told that the code should just call pr_warn() instead, and that
    the called function should return an error in such situations. It was
    generally agreed that any WARN_ON() or WARN_ON_ONCE() calls that can be
    triggered from user space need to be fixed."

    in addition harden rds_sendmsg to detect and overcome issues with
    invalid sg count and fail the sendmsg.

    Suggested-by: Leon Romanovsky
    Acked-by: Santosh Shilimkar
    Signed-off-by: shamir rabinovitch
    Signed-off-by: David S. Miller

    shamir rabinovitch
     
  • redundant copy_from_user in rds_sendmsg system call expose rds
    to issue where rds_rdma_extra_size walk the rds iovec and and
    calculate the number pf pages (sgs) it need to add to the tail of
    rds message and later rds_cmsg_rdma_args copy the rds iovec again
    and re calculate the same number and get different result causing
    WARN_ON in rds_message_alloc_sgs.

    fix this by doing the copy_from_user only once per rds_sendmsg
    system call.

    When issue occur the below dump is seen:

    WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316 rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
    Kernel panic - not syncing: panic_on_warn set ...
    CPU: 0 PID: 19789 Comm: syz-executor827 Not tainted 4.19.0-next-20181030+ #101
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x244/0x39d lib/dump_stack.c:113
    panic+0x2ad/0x55c kernel/panic.c:188
    __warn.cold.8+0x20/0x45 kernel/panic.c:540
    report_bug+0x254/0x2d0 lib/bug.c:186
    fixup_bug arch/x86/kernel/traps.c:178 [inline]
    do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
    do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
    invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
    RIP: 0010:rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
    Code: c0 74 04 3c 03 7e 6c 44 01 ab 78 01 00 00 e8 2b 9e 35 fa 4c 89 e0 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 14 9e 35 fa 0b 31 ff 44 89 ee e8 18 9f 35 fa 45 85 ed 75 1b e8 fe 9d 35 fa
    RSP: 0018:ffff8801c51b7460 EFLAGS: 00010293
    RAX: ffff8801bc412080 RBX: ffff8801d7bf4040 RCX: ffffffff8749c9e6
    RDX: 0000000000000000 RSI: ffffffff8749ca5c RDI: 0000000000000004
    RBP: ffff8801c51b7490 R08: ffff8801bc412080 R09: ffffed003b5c5b67
    R10: ffffed003b5c5b67 R11: ffff8801dae2db3b R12: 0000000000000000
    R13: 000000000007165c R14: 000000000007165c R15: 0000000000000005
    rds_cmsg_rdma_args+0x82d/0x1510 net/rds/rdma.c:623
    rds_cmsg_send net/rds/send.c:971 [inline]
    rds_sendmsg+0x19a2/0x3180 net/rds/send.c:1273
    sock_sendmsg_nosec net/socket.c:622 [inline]
    sock_sendmsg+0xd5/0x120 net/socket.c:632
    ___sys_sendmsg+0x7fd/0x930 net/socket.c:2117
    __sys_sendmsg+0x11d/0x280 net/socket.c:2155
    __do_sys_sendmsg net/socket.c:2164 [inline]
    __se_sys_sendmsg net/socket.c:2162 [inline]
    __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
    do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x44a859
    Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 6b cb fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f1d4710ada8 EFLAGS: 00000297 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00000000006dcc28 RCX: 000000000044a859
    RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003
    RBP: 00000000006dcc20 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000297 R12: 00000000006dcc2c
    R13: 646e732f7665642f R14: 00007f1d4710b9c0 R15: 00000000006dcd2c
    Kernel Offset: disabled
    Rebooting in 86400 seconds..

    Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com
    Acked-by: Santosh Shilimkar
    Signed-off-by: shamir rabinovitch
    Signed-off-by: David S. Miller

    shamir rabinovitch
     

02 Sep, 2018

1 commit

  • rds is the last in-kernel user of the old do_gettimeofday()
    function. Convert it over to ktime_get_real() to make it
    work more like the generic socket timestamps, and to let
    us kill off do_gettimeofday().

    A follow-up patch will have to change the user space interface
    to deal better with 32-bit tasks, which may use an incompatible
    layout for 'struct timespec'.

    Signed-off-by: Arnd Bergmann
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

03 Aug, 2018

1 commit


27 Jul, 2018

1 commit

  • Registration of a memory region(MR) through FRMR/fastreg(unlike FMR)
    needs a connection/qp. With a proxy qp, this dependency on connection
    will be removed, but that needs more infrastructure patches, which is a
    work in progress.

    As an intermediate fix, the get_mr returns EOPNOTSUPP when connection
    details are not populated. The MR registration through sendmsg() will
    continue to work even with fast registration, since connection in this
    case is formed upfront.

    This patch fixes the following crash:
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] SMP KASAN
    Modules linked in:
    CPU: 1 PID: 4244 Comm: syzkaller468044 Not tainted 4.16.0-rc6+ #361
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    RIP: 0010:rds_ib_get_mr+0x5c/0x230 net/rds/ib_rdma.c:544
    RSP: 0018:ffff8801b059f890 EFLAGS: 00010202
    RAX: dffffc0000000000 RBX: ffff8801b07e1300 RCX: ffffffff8562d96e
    RDX: 000000000000000d RSI: 0000000000000001 RDI: 0000000000000068
    RBP: ffff8801b059f8b8 R08: ffffed0036274244 R09: ffff8801b13a1200
    R10: 0000000000000004 R11: ffffed0036274243 R12: ffff8801b13a1200
    R13: 0000000000000001 R14: ffff8801ca09fa9c R15: 0000000000000000
    FS: 00007f4d050af700(0000) GS:ffff8801db300000(0000)
    knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f4d050aee78 CR3: 00000001b0d9b006 CR4: 00000000001606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    __rds_rdma_map+0x710/0x1050 net/rds/rdma.c:271
    rds_get_mr_for_dest+0x1d4/0x2c0 net/rds/rdma.c:357
    rds_setsockopt+0x6cc/0x980 net/rds/af_rds.c:347
    SYSC_setsockopt net/socket.c:1849 [inline]
    SyS_setsockopt+0x189/0x360 net/socket.c:1828
    do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x42/0xb7
    RIP: 0033:0x4456d9
    RSP: 002b:00007f4d050aedb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
    RAX: ffffffffffffffda RBX: 00000000006dac3c RCX: 00000000004456d9
    RDX: 0000000000000007 RSI: 0000000000000114 RDI: 0000000000000004
    RBP: 00000000006dac38 R08: 00000000000000a0 R09: 0000000000000000
    R10: 0000000020000380 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007fffbfb36d6f R14: 00007f4d050af9c0 R15: 0000000000000005
    Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 cc 01 00 00 4c 8b bb 80 04 00 00
    48
    b8 00 00 00 00 00 fc ff df 49 8d 7f 68 48 89 fa 48 c1 ea 03 3c 02
    00 0f
    85 9c 01 00 00 4d 8b 7f 68 48 b8 00 00 00 00 00
    RIP: rds_ib_get_mr+0x5c/0x230 net/rds/ib_rdma.c:544 RSP:
    ffff8801b059f890
    ---[ end trace 7e1cea13b85473b0 ]---

    Reported-by: syzbot+b51c77ef956678a65834@syzkaller.appspotmail.com
    Signed-off-by: Santosh Shilimkar
    Signed-off-by: Avinash Repaka

    Signed-off-by: David S. Miller

    Avinash Repaka
     

24 Jul, 2018

2 commits

  • This patch enables RDS to use IPv6 addresses. For RDS/TCP, the
    listener is now an IPv6 endpoint which accepts both IPv4 and IPv6
    connection requests. RDS/RDMA/IB uses a private data (struct
    rds_ib_connect_private) exchange between endpoints at RDS connection
    establishment time to support RDMA. This private data exchange uses a
    32 bit integer to represent an IP address. This needs to be changed in
    order to support IPv6. A new private data struct
    rds6_ib_connect_private is introduced to handle this. To ensure
    backward compatibility, an IPv6 capable RDS stack uses another RDMA
    listener port (RDS_CM_PORT) to accept IPv6 connection. And it
    continues to use the original RDS_PORT for IPv4 RDS connections. When
    it needs to communicate with an IPv6 peer, it uses the RDS_CM_PORT to
    send the connection set up request.

    v5: Fixed syntax problem (David Miller).

    v4: Changed port history comments in rds.h (Sowmini Varadhan).

    v3: Added support to set up IPv4 connection using mapped address
    (David Miller).
    Added support to set up connection between link local and non-link
    addresses.
    Various review comments from Santosh Shilimkar and Sowmini Varadhan.

    v2: Fixed bound and peer address scope mismatched issue.
    Added back rds_connect() IPv6 changes.

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
     
  • This patch changes the internal representation of an IP address to use
    struct in6_addr. IPv4 address is stored as an IPv4 mapped address.
    All the functions which take an IP address as argument are also
    changed to use struct in6_addr. But RDS socket layer is not modified
    such that it still does not accept IPv6 address from an application.
    And RDS layer does not accept nor initiate IPv6 connections.

    v2: Fixed sparse warnings.

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
     

15 Jun, 2018

1 commit

  • Loop transport which is self loopback, remote port congestion
    update isn't relevant. Infact the xmit path already ignores it.
    Receive path needs to do the same.

    Reported-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com
    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Santosh Shilimkar
     

13 Mar, 2018

1 commit


08 Mar, 2018

1 commit

  • Commit 401910db4cd4 ("rds: deliver zerocopy completion notification
    with data") removes support fo r zerocopy completion notification
    on the sk_error_queue, thus we no longer need to track the cookie
    information in sk_buff structures.

    This commit removes the struct sk_buff_head rs_zcookie_queue by
    a simpler list that results in a smaller memory footprint as well
    as more efficient memory_allocation time.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

28 Feb, 2018

1 commit

  • This commit is an optimization over commit 01883eda72bd
    ("rds: support for zcopy completion notification") for PF_RDS sockets.

    RDS applications are predominantly request-response transactions, so
    it is more efficient to reduce the number of system calls and have
    zerocopy completion notification delivered as ancillary data on the
    POLLIN channel.

    Cookies are passed up as ancillary data (at level SOL_RDS) in a
    struct rds_zcopy_cookies when the returned value of recvmsg() is
    greater than, or equal to, 0. A max of RDS_MAX_ZCOOKIES may be passed
    with each message.

    This commit removes support for zerocopy completion notification on
    MSG_ERRQUEUE for PF_RDS sockets.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Willem de Bruijn
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

17 Feb, 2018

2 commits

  • If the MSG_ZEROCOPY flag is specified with rds_sendmsg(), and,
    if the SO_ZEROCOPY socket option has been set on the PF_RDS socket,
    application pages sent down with rds_sendmsg() are pinned.

    The pinning uses the accounting infrastructure added by
    Commit a91dbff551a6 ("sock: ulimit on MSG_ZEROCOPY pages")

    The payload bytes in the message may not be modified for the
    duration that the message has been pinned. A multi-threaded
    application using this infrastructure may thus need to be notified
    about send-completion so that it can free/reuse the buffers
    passed to rds_sendmsg(). Notification of send-completion will
    identify each message-buffer by a cookie that the application
    must specify as ancillary data to rds_sendmsg().
    The ancillary data in this case has cmsg_level == SOL_RDS
    and cmsg_type == RDS_CMSG_ZCOPY_COOKIE.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • RDS removes a datagram (rds_message) from the retransmit queue when
    an ACK is received. The ACK indicates that the receiver has queued
    the RDS datagram, so that the sender can safely forget the datagram.
    When all references to the rds_message are quiesced, rds_message_purge
    is called to release resources used by the rds_message

    If the datagram to be removed had pinned pages set up, add
    an entry to the rs->rs_znotify_queue so that the notifcation
    will be sent up via rds_rm_zerocopy_callback() when the
    rds_message is eventually freed by rds_message_purge.

    rds_rm_zerocopy_callback() attempts to batch the number of cookies
    sent with each notification to a max of SO_EE_ORIGIN_MAX_ZCOOKIES.
    This is achieved by checking the tail skb in the sk_error_queue:
    if this has room for one more cookie, the cookie from the
    current notification is added; else a new skb is added to the
    sk_error_queue. Every invocation of rds_rm_zerocopy_callback() will
    trigger a ->sk_error_report to notify the application.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

09 Feb, 2018

1 commit

  • … connection/workq management

    An rds_connection can get added during netns deletion between lines 528
    and 529 of

    506 static void rds_tcp_kill_sock(struct net *net)
    :
    /* code to pull out all the rds_connections that should be destroyed */
    :
    528 spin_unlock_irq(&rds_tcp_conn_lock);
    529 list_for_each_entry_safe(tc, _tc, &tmp_list, t_tcp_node)
    530 rds_conn_destroy(tc->t_cpath->cp_conn);

    Such an rds_connection would miss out the rds_conn_destroy()
    loop (that cancels all pending work) and (if it was scheduled
    after netns deletion) could trigger the use-after-free.

    A similar race-window exists for the module unload path
    in rds_tcp_exit -> rds_tcp_destroy_conns

    Concurrency with netns deletion (rds_tcp_kill_sock()) must be handled
    by checking check_net() before enqueuing new work or adding new
    connections.

    Concurrency with module-unload is handled by maintaining a module
    specific flag that is set at the start of the module exit function,
    and must be checked before enqueuing new work or adding new connections.

    This commit refactors existing RDS_DESTROY_PENDING checks added by
    commit 3db6e0d172c9 ("rds: use RCU to synchronize work-enqueue with
    connection teardown") and consolidates all the concurrency checks
    listed above into the function rds_destroy_pending().

    Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
    Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

    Sowmini Varadhan
     

06 Jan, 2018

1 commit


02 Dec, 2017

1 commit

  • Commit 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
    introduces a regression in rds-tcp netns cleanup. The cleanup_net(),
    (and thus rds_tcp_dev_event notification) is only called from put_net()
    when all netns refcounts go to 0, but this cannot happen if the
    rds_connection itself is holding a c_net ref that it expects to
    release in rds_tcp_kill_sock.

    Instead, the rds_tcp_kill_sock callback should make sure to
    tear down state carefully, ensuring that the socket teardown
    is only done after all data-structures and workqs that depend
    on it are quiesced.

    The original motivation for commit 8edc3affc077 ("rds: tcp: Take explicit
    refcounts on struct net") was to resolve a race condition reported by
    syzkaller where workqs for tx/rx/connect were triggered after the
    namespace was deleted. Those worker threads should have been
    cancelled/flushed before socket tear-down and indeed,
    rds_conn_path_destroy() does try to sequence this by doing
    /* cancel cp_send_w */
    /* cancel cp_recv_w */
    /* flush cp_down_w */
    /* free data structures */
    Here the "flush cp_down_w" will trigger rds_conn_shutdown and thus
    invoke rds_tcp_conn_path_shutdown() to close the tcp socket, so that
    we ought to have satisfied the requirement that "socket-close is
    done after all other dependent state is quiesced". However,
    rds_conn_shutdown has a bug in that it *always* triggers the reconnect
    workq (and if connection is successful, we always restart tx/rx
    workqs so with the right timing, we risk the race conditions reported
    by syzkaller).

    Netns deletion is like module teardown- no need to restart a
    reconnect in this case. We can use the c_destroy_in_prog bit
    to avoid restarting the reconnect.

    Fixes: 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information it it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

04 Aug, 2017

1 commit

  • RDS over IB does not use multipath RDS, so the array
    of additional rds_conn_path structures is always superfluous
    in this case. Reduce the memory footprint of the rds module
    by making this a dynamic allocation predicated on whether
    the transport is mp_capable.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Tested-by: Efrain Galaviz
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

17 Jul, 2017

1 commit

  • We could end up executing rds_conn_shutdown before the rds_recv_worker
    thread, then rds_conn_shutdown -> rds_tcp_conn_shutdown can do a
    sock_release and set sock->sk to null, which may interleave in bad
    ways with rds_recv_worker, e.g., it could result in:

    "BUG: unable to handle kernel NULL pointer dereference at 0000000000000078"
    [ffff881769f6fd70] release_sock at ffffffff815f337b
    [ffff881769f6fd90] rds_tcp_recv at ffffffffa043c888 [rds_tcp]
    [ffff881769f6fdb0] rds_recv_worker at ffffffffa04a4810 [rds]
    [ffff881769f6fde0] process_one_work at ffffffff810a14c1
    [ffff881769f6fe40] worker_thread at ffffffff810a1940
    [ffff881769f6fec0] kthread at ffffffff810a6b1e

    Also, do not enqueue any new shutdown workq items when the connection is
    shutting down (this may happen for rds-tcp in softirq mode, if a FIN
    or CLOSE is received while the modules is in the middle of an unload)

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

05 Jul, 2017

3 commits

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

22 Jun, 2017

2 commits

  • If we are unloading the rds_tcp module, we can set linger to 1
    and drop pending packets to accelerate reconnect. The peer will
    end up resetting the connection based on new generation numbers
    of the new incarnation, so hanging on to unsent TCP packets via
    linger is mostly pointless in this case.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Jenny Xu
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • The RDS handshake ping probe added by commit 5916e2c1554f
    ("RDS: TCP: Enable multipath RDS for TCP") is sent from rds_sendmsg()
    before the first data packet is sent to a peer. If the conversation
    is not bidirectional (i.e., one side is always passive and never
    invokes rds_sendmsg()) and the passive side restarts its rds_tcp
    module, a new HS ping probe needs to be sent, so that the number
    of paths can be re-established.

    This patch achieves that by sending a HS ping probe from
    rds_tcp_accept_one() when c_npaths is 0 (i.e., we have not done
    a handshake probe with this peer yet).

    Signed-off-by: Sowmini Varadhan
    Tested-by: Jenny Xu
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

17 Jun, 2017

2 commits

  • Found when testing between sparc and x86 machines on different
    subnets, so the address comparison patterns hit the corner cases and
    brought out some bugs fixed by this patch.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Imanti Mendez
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • After commit 1a0e100fb2c9 ("RDS: TCP: Force every connection to be
    initiated by numerically smaller IP address") we no longer need
    the logic associated with cp_outgoing, so clean up usage of this
    field.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Imanti Mendez
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

08 Mar, 2017

1 commit

  • It is incorrect for the rds_connection to piggyback on the
    sock_net() refcount for the netns because this gives rise to
    a chicken-and-egg problem during rds_conn_destroy. Instead explicitly
    take a ref on the net, and hold the netns down till the connection
    tear-down is complete.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

05 Mar, 2017

1 commit

  • Pull networking fixes from David Miller:

    1) Fix double-free in batman-adv, from Sven Eckelmann.

    2) Fix packet stats for fast-RX path, from Joannes Berg.

    3) Netfilter's ip_route_me_harder() doesn't handle request sockets
    properly, fix from Florian Westphal.

    4) Fix sendmsg deadlock in rxrpc, from David Howells.

    5) Add missing RCU locking to transport hashtable scan, from Xin Long.

    6) Fix potential packet loss in mlxsw driver, from Ido Schimmel.

    7) Fix race in NAPI handling between poll handlers and busy polling,
    from Eric Dumazet.

    8) TX path in vxlan and geneve need proper RCU locking, from Jakub
    Kicinski.

    9) SYN processing in DCCP and TCP need to disable BH, from Eric
    Dumazet.

    10) Properly handle net_enable_timestamp() being invoked from IRQ
    context, also from Eric Dumazet.

    11) Fix crash on device-tree systems in xgene driver, from Alban Bedel.

    12) Do not call sk_free() on a locked socket, from Arnaldo Carvalho de
    Melo.

    13) Fix use-after-free in netvsc driver, from Dexuan Cui.

    14) Fix max MTU setting in bonding driver, from WANG Cong.

    15) xen-netback hash table can be allocated from softirq context, so use
    GFP_ATOMIC. From Anoob Soman.

    16) Fix MAC address change bug in bgmac driver, from Hari Vyas.

    17) strparser needs to destroy strp_wq on module exit, from WANG Cong.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
    strparser: destroy workqueue on module exit
    sfc: fix IPID endianness in TSOv2
    sfc: avoid max() in array size
    rds: remove unnecessary returned value check
    rxrpc: Fix potential NULL-pointer exception
    nfp: correct DMA direction in XDP DMA sync
    nfp: don't tell FW about the reserved buffer space
    net: ethernet: bgmac: mac address change bug
    net: ethernet: bgmac: init sequence bug
    xen-netback: don't vfree() queues under spinlock
    xen-netback: keep a local pointer for vif in backend_disconnect()
    netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() fails
    netfilter: nft_set_rbtree: incorrect assumption on lower interval lookups
    netfilter: nf_conntrack_sip: fix wrong memory initialisation
    can: flexcan: fix typo in comment
    can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer
    can: gs_usb: fix coding style
    can: gs_usb: Don't use stack memory for USB transfers
    ixgbe: Limit use of 2K buffers on architectures with 256B or larger cache lines
    ixgbe: update the rss key on h/w, when ethtool ask for it
    ...

    Linus Torvalds
     

04 Mar, 2017

1 commit


03 Mar, 2017

1 commit

  • Pull vfs sendmsg updates from Al Viro:
    "More sendmsg work.

    This is a fairly separate isolated stuff (there's a continuation
    around lustre, but that one was too late to soak in -next), thus the
    separate pull request"

    * 'work.sendmsg' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    ncpfs: switch to sock_sendmsg()
    ncpfs: don't mess with manually advancing iovec on send
    ncpfs: sendmsg does *not* bugger iovec these days
    ceph_tcp_sendpage(): use ITER_BVEC sendmsg
    afs_send_pages(): use ITER_BVEC
    rds: remove dead code
    ceph: switch to sock_recvmsg()
    usbip_recv(): switch to sock_recvmsg()
    iscsi_target: deal with short writes on the tx side
    [nbd] pass iov_iter to nbd_xmit()
    [nbd] switch sock_xmit() to sock_{send,recv}msg()
    [drbd] use sock_sendmsg()

    Linus Torvalds
     

03 Jan, 2017

3 commits