19 Apr, 2018

1 commit

  • [ Upstream commit a43cced9a348901f9015f4730b70b69e7c41a9c9 ]

    rds_sendmsg() calls rds_send_mprds_hash() to find a c_path to use to
    send a message. Suppose the RDS connection is not yet up. In
    rds_send_mprds_hash(), it does

    if (conn->c_npaths == 0)
            wait_event_interruptible(conn->c_hs_waitq,
                                     (conn->c_npaths != 0));

    If it is interrupted before the connection is set up,
    rds_send_mprds_hash() will return a non-zero hash value. Hence
    rds_sendmsg() will use a non-zero c_path to send the message. But if
    the RDS connection ends up being non-MP capable, the message will be
    lost as only the zero c_path can be used.
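
    The intended behaviour can be sketched in user-space C as follows.
    This is only an illustration under stated assumptions, not the
    upstream patch: struct conn_sketch, pick_path() and the
    wait_interrupted flag are hypothetical stand-ins for the kernel
    structures and for wait_event_interruptible() returning early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one rds_connection field used here. */
struct conn_sketch {
    unsigned int c_npaths;  /* 0 until the multipath handshake completes */
};

/*
 * Choose a path index for an outgoing message. `hash` stands in for the
 * value derived from the sending socket; `wait_interrupted` models
 * wait_event_interruptible() returning before c_npaths was set.
 */
static unsigned int pick_path(const struct conn_sketch *conn,
                              unsigned int hash, bool wait_interrupted)
{
    assert(conn != NULL);

    /* Path count still unknown: only path 0 is safe to use. */
    if (wait_interrupted || conn->c_npaths == 0)
        return 0;
    /* A non-MP-capable connection has a single path, index 0. */
    if (conn->c_npaths == 1)
        return 0;
    return hash % conn->c_npaths;
}
```

    With this shape, an interrupted sender always falls back to path 0,
    which is valid whether or not the peer turns out to be MP capable.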

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ka-Cheong Poon
     

12 Apr, 2018

1 commit

  • [ Upstream commit 7ae0c649c47f1c5d2db8cee6dd75855970af1669 ]

    If the rds_sock is not added to the bind_hash_table, we must
    reset rs_bound_addr so that rds_remove_bound will not trip on
    this rds_sock.

    rds_add_bound() does a rds_sock_put() in this failure path, so
    failing to reset rs_bound_addr will result in a socket refcount
    bug, and will trigger a WARN_ON with the stack shown below when
    the application subsequently tries to close the PF_RDS socket.

    WARNING: CPU: 20 PID: 19499 at net/rds/af_rds.c:496 \
    rds_sock_destruct+0x15/0x30 [rds]
    :
    __sk_destruct+0x21/0x190
    rds_remove_bound.part.13+0xb6/0x140 [rds]
    rds_release+0x71/0x120 [rds]
    sock_release+0x1a/0x70
    sock_close+0xe/0x20
    __fput+0xd5/0x210
    task_work_run+0x82/0xa0
    do_exit+0x2ce/0xb30
    ? syscall_trace_enter+0x1cc/0x2b0
    do_group_exit+0x39/0xa0
    SyS_exit_group+0x10/0x10
    do_syscall_64+0x61/0x1a0
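
    The refcount imbalance described above can be modelled with a small
    user-space sketch. Everything here is hypothetical (struct
    sock_sketch, add_bound(), remove_bound(), the insert_fails flag and
    the -ENOMEM return value are assumptions made for illustration); it
    only shows why the bound address must be cleared on the failure
    path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the few rds_sock fields involved. */
struct sock_sketch {
    uint32_t bound_addr;  /* 0 means "not bound" */
    int refs;             /* simplified reference count */
};

/* Try to bind; `insert_fails` models the hash table insert failing. */
static int add_bound(struct sock_sketch *rs, uint32_t addr, bool insert_fails)
{
    assert(rs != NULL && addr != 0);

    rs->refs++;              /* reference owned by the hash table */
    rs->bound_addr = addr;
    if (insert_fails) {
        rs->bound_addr = 0;  /* the reset that the fix adds */
        rs->refs--;          /* drop the table's reference again */
        return -ENOMEM;      /* assumed errno for this sketch */
    }
    return 0;
}

/* Drop the table's reference, but only if the socket is really bound. */
static void remove_bound(struct sock_sketch *rs)
{
    assert(rs != NULL);
    if (rs->bound_addr == 0)
        return;
    rs->bound_addr = 0;
    rs->refs--;
}
```

    Without the reset in the failure branch, remove_bound() would drop a
    reference that was already released, which corresponds to the
    refcount bug behind the WARN_ON.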

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sowmini Varadhan
     

25 Feb, 2018

2 commits

  • commit f10b4cff98c6977668434fbf5dd58695eeca2897 upstream.

    The rds_tcp_kill_sock() function parses the rds_tcp_conn_list
    to find the rds_connection entries marked for deletion as part
    of the netns deletion under the protection of the rds_tcp_conn_lock.
    Since the rds_tcp_conn_list tracks rds_tcp_connections (which
    have a 1:1 mapping with rds_conn_path), multiple tc entries in
    the rds_tcp_conn_list will map to a single rds_connection, and will
    be deleted as part of the rds_conn_destroy() operation that is
    done outside the rds_tcp_conn_lock.

    The rds_tcp_conn_list traversal done under the protection of
    rds_tcp_conn_lock should not leave any doomed tc entries in
    the list after the rds_tcp_conn_lock is released, else another
    concurrently executing netns delete (for a different netns) thread
    may trip on these entries.

    Reported-by: syzbot
    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Sowmini Varadhan
     
  • commit 681648e67d43cf269c5590ecf021ed481f4551fc upstream.

    Commit 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
    introduces a regression in rds-tcp netns cleanup. cleanup_net()
    (and thus the rds_tcp_dev_event notification) is only called from put_net()
    when all netns refcounts go to 0, but this cannot happen if the
    rds_connection itself is holding a c_net ref that it expects to
    release in rds_tcp_kill_sock.

    Instead, the rds_tcp_kill_sock callback should make sure to
    tear down state carefully, ensuring that the socket teardown
    is only done after all data-structures and workqs that depend
    on it are quiesced.

    The original motivation for commit 8edc3affc077 ("rds: tcp: Take explicit
    refcounts on struct net") was to resolve a race condition reported by
    syzkaller where workqs for tx/rx/connect were triggered after the
    namespace was deleted. Those worker threads should have been
    cancelled/flushed before socket tear-down and indeed,
    rds_conn_path_destroy() does try to sequence this by doing
    /* cancel cp_send_w */
    /* cancel cp_recv_w */
    /* flush cp_down_w */
    /* free data structures */
    Here the "flush cp_down_w" will trigger rds_conn_shutdown and thus
    invoke rds_tcp_conn_path_shutdown() to close the tcp socket, so that
    we ought to have satisfied the requirement that "socket-close is
    done after all other dependent state is quiesced". However,
    rds_conn_shutdown has a bug in that it *always* triggers the reconnect
    workq (and if connection is successful, we always restart tx/rx
    workqs so with the right timing, we risk the race conditions reported
    by syzkaller).

    Netns deletion is like module teardown: there is no need to restart a
    reconnect in this case. We can use the c_destroy_in_prog bit
    to avoid restarting the reconnect.
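
    As a rough illustration of that idea (not the kernel code), the
    shutdown path can consult a "destroy in progress" flag before
    queueing a reconnect. The struct and function names below are
    hypothetical, and the reconnect workq is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified connection state. */
struct conn_state_sketch {
    bool destroy_in_prog;            /* set once teardown has started */
    bool transport_open;             /* models the open TCP socket */
    unsigned int reconnects_queued;  /* models the reconnect workq */
};

/* Shut the transport down; return true if a reconnect was queued. */
static bool conn_shutdown(struct conn_state_sketch *c)
{
    assert(c != NULL);

    c->transport_open = false;

    /* Teardown in progress: nothing should restart tx/rx work. */
    if (c->destroy_in_prog)
        return false;

    c->reconnects_queued++;
    return true;
}
```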

    Fixes: 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Sowmini Varadhan
     

17 Jan, 2018

2 commits

  • [ Upstream commit 7d11f77f84b27cef452cee332f4e469503084737 ]

    Set rm->atomic.op_active to 0 when rds_pin_pages() fails or the
    user-supplied address is invalid. This prevents a NULL pointer
    dereference in rds_atomic_free_op().
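
    A minimal user-space sketch of the pattern, assuming a simplified
    operation struct: the "active" flag is cleared on every failure
    path so that the free routine never touches an unset pointer. The
    names atomic_op_sketch, op_prepare() and op_free() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an atomic operation and its pinned page. */
struct atomic_op_sketch {
    bool op_active;
    int *pinned;  /* stands in for the pinned user page */
};

/* A NULL user_page models a failed pin or an invalid user address. */
static int op_prepare(struct atomic_op_sketch *op, int *user_page)
{
    assert(op != NULL);

    op->op_active = true;
    op->pinned = user_page;
    if (user_page == NULL) {
        op->op_active = false;  /* tell op_free() there is nothing to do */
        return -EFAULT;
    }
    return 0;
}

/* Release the page if the op is active; returns pages released. */
static int op_free(struct atomic_op_sketch *op)
{
    assert(op != NULL);

    if (!op->op_active)
        return 0;
    *op->pinned = 0;  /* would dereference NULL if the flag were stale */
    op->pinned = NULL;
    op->op_active = false;
    return 1;
}
```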

    Signed-off-by: Mohamed Ghannam
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Mohamed Ghannam
     
  • [ Upstream commit c095508770aebf1b9218e77026e48345d719b17c ]

    When args->nr_local is 0, nr_pages also ends up as 0 through the size
    calculation in rds_rm_size(). That value is later used to allocate
    pages for DMA, so the bug produces a heap out-of-bounds write to a
    specific memory region.
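
    The kind of guard this calls for can be sketched as below. This is
    an assumption-laden illustration rather than the actual patch:
    iovec_sketch, SKETCH_PAGE_SIZE and count_pages() are hypothetical,
    and address overflow handling is left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the sketch */

/* Hypothetical stand-in for one local buffer description. */
struct iovec_sketch {
    uint64_t addr;
    uint64_t bytes;
};

/* Number of pages touched by [addr, addr + bytes); bytes must be > 0. */
static unsigned int pages_spanned(uint64_t addr, uint64_t bytes)
{
    uint64_t first = addr / SKETCH_PAGE_SIZE;
    uint64_t last = (addr + bytes - 1) / SKETCH_PAGE_SIZE;

    return (unsigned int)(last - first + 1);
}

/* Total page count, or -EINVAL if there is nothing valid to map. */
static int count_pages(const struct iovec_sketch *vec, size_t nr_local)
{
    int total = 0;

    /* Reject an empty vector up front instead of sizing a 0-page array. */
    if (nr_local == 0)
        return -EINVAL;
    assert(vec != NULL);

    for (size_t i = 0; i < nr_local; i++) {
        if (vec[i].bytes == 0)
            return -EINVAL;
        total += (int)pages_spanned(vec[i].addr, vec[i].bytes);
    }
    return total;
}
```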

    Signed-off-by: Mohamed Ghannam
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Mohamed Ghannam
     

03 Jan, 2018

1 commit

  • [ Upstream commit 14e138a86f6347c6199f610576d2e11c03bec5f0 ]

    RDS currently doesn't check if the length of the control message is
    large enough to hold the required data, before dereferencing the control
    message data. This results in the following crash:

    BUG: KASAN: stack-out-of-bounds in rds_rdma_bytes net/rds/send.c:1013
    [inline]
    BUG: KASAN: stack-out-of-bounds in rds_sendmsg+0x1f02/0x1f90
    net/rds/send.c:1066
    Read of size 8 at addr ffff8801c928fb70 by task syzkaller455006/3157

    CPU: 0 PID: 3157 Comm: syzkaller455006 Not tainted 4.15.0-rc3+ #161
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:53
    print_address_description+0x73/0x250 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x25b/0x340 mm/kasan/report.c:409
    __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
    rds_rdma_bytes net/rds/send.c:1013 [inline]
    rds_sendmsg+0x1f02/0x1f90 net/rds/send.c:1066
    sock_sendmsg_nosec net/socket.c:628 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:638
    ___sys_sendmsg+0x320/0x8b0 net/socket.c:2018
    __sys_sendmmsg+0x1ee/0x620 net/socket.c:2108
    SYSC_sendmmsg net/socket.c:2139 [inline]
    SyS_sendmmsg+0x35/0x60 net/socket.c:2134
    entry_SYSCALL_64_fastpath+0x1f/0x96
    RIP: 0033:0x43fe49
    RSP: 002b:00007fffbe244ad8 EFLAGS: 00000217 ORIG_RAX: 0000000000000133
    RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fe49
    RDX: 0000000000000001 RSI: 000000002020c000 RDI: 0000000000000003
    RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004017b0
    R13: 0000000000401840 R14: 0000000000000000 R15: 0000000000000000

    To fix this, we verify that the cmsg_len is large enough to hold the
    data to be read, before proceeding further.
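
    A length check of this kind can be sketched in user-space C with
    the standard CMSG_LEN() macro. The payload struct and the helper
    name are hypothetical; the real RDS control messages carry
    different structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical payload, used only to have a concrete size to check. */
struct payload_sketch {
    uint64_t remote_addr;
    uint64_t nr_bytes;
};

/*
 * Return true only if the control message is long enough to hold the
 * header plus a full payload, so that reading the payload stays within
 * the bytes the caller actually supplied.
 */
static bool cmsg_holds_payload(const struct cmsghdr *cmsg)
{
    assert(cmsg != NULL);
    return cmsg->cmsg_len >= CMSG_LEN(sizeof(struct payload_sketch));
}
```

    A caller would run this check before touching CMSG_DATA() and
    return an error such as -EINVAL when it fails.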

    Reported-by: syzbot
    Signed-off-by: Avinash Repaka
    Acked-by: Santosh Shilimkar
    Reviewed-by: Yuval Shaia
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Avinash Repaka
     

17 Dec, 2017

1 commit

  • [ Upstream commit f3069c6d33f6ae63a1668737bc78aaaa51bff7ca ]

    This is a fix for syzkaller719569, where memory registration was
    attempted without any underlying transport being loaded.

    Analysis of the case reveals that it is the setsockopt() RDS_GET_MR
    (2) and RDS_GET_MR_FOR_DEST (7) that are vulnerable.

    Here is an example stack trace when the bug is hit:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000000c0
    IP: __rds_rdma_map+0x36/0x440 [rds]
    PGD 2f93d03067 P4D 2f93d03067 PUD 2f93d02067 PMD 0
    Oops: 0000 [#1] SMP
    Modules linked in: bridge stp llc tun rpcsec_gss_krb5 nfsv4
    dns_resolver nfs fscache rds binfmt_misc sb_edac intel_powerclamp
    coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
    ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd
    iTCO_wdt mei_me sg iTCO_vendor_support ipmi_si mei ipmi_devintf nfsd
    shpchp pcspkr i2c_i801 ioatdma ipmi_msghandler wmi lpc_ich mfd_core
    auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2
    mgag200 i2c_algo_bit drm_kms_helper ixgbe syscopyarea ahci sysfillrect
    sysimgblt libahci mdio fb_sys_fops ttm ptp libata sd_mod mlx4_core drm
    crc32c_intel pps_core megaraid_sas i2c_core dca dm_mirror
    dm_region_hash dm_log dm_mod
    CPU: 48 PID: 45787 Comm: repro_set2 Not tainted 4.14.2-3.el7uek.x86_64 #2
    Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
    task: ffff882f9190db00 task.stack: ffffc9002b994000
    RIP: 0010:__rds_rdma_map+0x36/0x440 [rds]
    RSP: 0018:ffffc9002b997df0 EFLAGS: 00010202
    RAX: 0000000000000000 RBX: ffff882fa2182580 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffffc9002b997e40 RDI: ffff882fa2182580
    RBP: ffffc9002b997e30 R08: 0000000000000000 R09: 0000000000000002
    R10: ffff885fb29e3838 R11: 0000000000000000 R12: ffff882fa2182580
    R13: ffff882fa2182580 R14: 0000000000000002 R15: 0000000020000ffc
    FS: 00007fbffa20b700(0000) GS:ffff882fbfb80000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000000c0 CR3: 0000002f98a66006 CR4: 00000000001606e0
    Call Trace:
    rds_get_mr+0x56/0x80 [rds]
    rds_setsockopt+0x172/0x340 [rds]
    ? __fget_light+0x25/0x60
    ? __fdget+0x13/0x20
    SyS_setsockopt+0x80/0xe0
    do_syscall_64+0x67/0x1b0
    entry_SYSCALL64_slow_path+0x25/0x25
    RIP: 0033:0x7fbff9b117f9
    RSP: 002b:00007fbffa20aed8 EFLAGS: 00000293 ORIG_RAX: 0000000000000036
    RAX: ffffffffffffffda RBX: 00000000000c84a4 RCX: 00007fbff9b117f9
    RDX: 0000000000000002 RSI: 0000400000000114 RDI: 000000000000109b
    RBP: 00007fbffa20af10 R08: 0000000000000020 R09: 00007fbff9dd7860
    R10: 0000000020000ffc R11: 0000000000000293 R12: 0000000000000000
    R13: 00007fbffa20b9c0 R14: 00007fbffa20b700 R15: 0000000000000021

    Code: 41 56 41 55 49 89 fd 41 54 53 48 83 ec 18 8b 87 f0 02 00 00 48
    89 55 d0 48 89 4d c8 85 c0 0f 84 2d 03 00 00 48 8b 87 00 03 00 00
    83 b8 c0 00 00 00 00 0f 84 25 03 00 00 48 8b 06 48 8b 56 08

    The fix is to check the existence of an underlying transport in
    __rds_rdma_map().
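
    In outline, such a check looks like the sketch below. The types,
    the fake_get_mr() helper and the chosen errno values are
    assumptions for illustration only, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical transport with a single memory-registration hook. */
struct transport_sketch {
    int (*get_mr)(void *buf, size_t len);
};

/* Hypothetical socket; transport stays NULL until one is attached. */
struct sock_sketch {
    const struct transport_sketch *transport;
};

/* Trivial registration hook, used only to exercise the sketch. */
static int fake_get_mr(void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

/* Register a region, refusing early when no transport is attached. */
static int map_region(const struct sock_sketch *rs, void *buf, size_t len)
{
    assert(rs != NULL);

    if (rs->transport == NULL)
        return -ENOTCONN;    /* assumed errno: no transport bound yet */
    if (rs->transport->get_mr == NULL)
        return -EOPNOTSUPP;  /* transport cannot register memory */
    return rs->transport->get_mr(buf, len);
}
```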

    Signed-off-by: Håkon Bugge
    Reported-by: syzbot
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Håkon Bugge
     

10 Nov, 2017

1 commit

  • rds_ib_recv_refill() is a function that refills an IB receive
    queue. It can be called from both the CQE handler (tasklet) and a
    worker thread.

    Just after the call to ib_post_recv(), a debug message is printed with
    rdsdebug():

    ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
    rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
             recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
             (long) ib_sg_dma_address(
                    ic->i_cm_id->device,
                    &recv->r_frag->f_sg),
             ret);

    Now consider an invocation of rds_ib_recv_refill() from the worker
    thread, which is preemptible. Further, assume that the worker thread
    is preempted between the ib_post_recv() and rdsdebug() statements.

    Then, if the preemption is due to a receive CQE event, the
    rds_ib_recv_cqe_handler() will be invoked. This function processes
    receive completions, including freeing up data structures, such as the
    recv->r_frag.

    In this scenario, rds_ib_recv_cqe_handler() will process the receive
    WR posted above. That implies that the recv->r_frag has been freed
    before the above rdsdebug() statement has been executed. When it is
    later executed, we will have a NULL pointer dereference:

    [ 4088.068008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
    [ 4088.076754] IP: rds_ib_recv_refill+0x87/0x620 [rds_rdma]
    [ 4088.082686] PGD 0 P4D 0
    [ 4088.085515] Oops: 0000 [#1] SMP
    [ 4088.089015] Modules linked in: rds_rdma(OE) rds(OE) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) mlx4_ib(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_core(E) binfmt_misc(E) sb_edac(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) crypto_simd(E) iTCO_wdt(E) glue_helper(E) iTCO_vendor_support(E) sg(E) cryptd(E) pcspkr(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) shpchp(E) ioatdma(E) i2c_i801(E) wmi(E) lpc_ich(E) mei_me(E) mei(E) mfd_core(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E) fscrypto(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
    [ 4088.168486] fb_sys_fops(E) ahci(E) ixgbe(E) libahci(E) ttm(E) mdio(E) ptp(E) pps_core(E) drm(E) sd_mod(E) libata(E) crc32c_intel(E) mlx4_core(E) i2c_core(E) dca(E) megaraid_sas(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: rds]
    [ 4088.193442] CPU: 20 PID: 1244 Comm: kworker/20:2 Tainted: G OE 4.14.0-rc7.master.20171105.ol7.x86_64 #1
    [ 4088.205097] Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
    [ 4088.216074] Workqueue: ib_cm cm_work_handler [ib_cm]
    [ 4088.221614] task: ffff885fa11d0000 task.stack: ffffc9000e598000
    [ 4088.228224] RIP: 0010:rds_ib_recv_refill+0x87/0x620 [rds_rdma]
    [ 4088.234736] RSP: 0018:ffffc9000e59bb68 EFLAGS: 00010286
    [ 4088.240568] RAX: 0000000000000000 RBX: ffffc9002115d050 RCX: ffffc9002115d050
    [ 4088.248535] RDX: ffffffffa0521380 RSI: ffffffffa0522158 RDI: ffffffffa0525580
    [ 4088.256498] RBP: ffffc9000e59bbf8 R08: 0000000000000005 R09: 0000000000000000
    [ 4088.264465] R10: 0000000000000339 R11: 0000000000000001 R12: 0000000000000000
    [ 4088.272433] R13: ffff885f8c9d8000 R14: ffffffff81a0a060 R15: ffff884676268000
    [ 4088.280397] FS: 0000000000000000(0000) GS:ffff885fbec80000(0000) knlGS:0000000000000000
    [ 4088.289434] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 4088.295846] CR2: 0000000000000020 CR3: 0000000001e09005 CR4: 00000000001606e0
    [ 4088.303816] Call Trace:
    [ 4088.306557] rds_ib_cm_connect_complete+0xe0/0x220 [rds_rdma]
    [ 4088.312982] ? __dynamic_pr_debug+0x8c/0xb0
    [ 4088.317664] ? __queue_work+0x142/0x3c0
    [ 4088.321944] rds_rdma_cm_event_handler+0x19e/0x250 [rds_rdma]
    [ 4088.328370] cma_ib_handler+0xcd/0x280 [rdma_cm]
    [ 4088.333522] cm_process_work+0x25/0x120 [ib_cm]
    [ 4088.338580] cm_work_handler+0xd6b/0x17aa [ib_cm]
    [ 4088.343832] process_one_work+0x149/0x360
    [ 4088.348307] worker_thread+0x4d/0x3e0
    [ 4088.352397] kthread+0x109/0x140
    [ 4088.355996] ? rescuer_thread+0x380/0x380
    [ 4088.360467] ? kthread_park+0x60/0x60
    [ 4088.364563] ret_from_fork+0x25/0x30
    [ 4088.368548] Code: 48 89 45 90 48 89 45 98 eb 4d 0f 1f 44 00 00 48 8b 43 08 48 89 d9 48 c7 c2 80 13 52 a0 48 c7 c6 58 21 52 a0 48 c7 c7 80 55 52 a0 8b 48 20 44 89 64 24 08 48 8b 40 30 49 83 e1 fc 48 89 04 24
    [ 4088.389612] RIP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] RSP: ffffc9000e59bb68
    [ 4088.397772] CR2: 0000000000000020
    [ 4088.401505] ---[ end trace fe922e6ccf004431 ]---

    This bug was provoked by compiling rds out-of-tree with
    EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
    between the ib_post_recv() and rdsdebug() statements:

      /* XXX when can this fail? */
      ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
    + if (can_wait)
    +         usleep_range(1000, 5000);
      rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
               recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
               (long) ib_sg_dma_address(

    The fix is simply to move the rdsdebug() statement up before the
    ib_post_recv() and remove the printing of ret, which is taken care of
    anyway by the non-debug code.

    Signed-off-by: Håkon Bugge
    Reviewed-by: Knut Omang
    Reviewed-by: Wei Lin Guay
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Håkon Bugge
     

03 Nov, 2017

1 commit

  • …el/git/gregkh/driver-core

    Pull initial SPDX identifiers from Greg KH:
    "License cleanup: add SPDX license identifiers to some files

    Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the
    'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
    binding shorthand, which can be used instead of the full boiler plate
    text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart
    and Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset
    of the use cases:

    - file had no licensing information in it.

    - file was a */uapi/* one with no licensing information in it,

    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to
    license had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied
    to a file was done in a spreadsheet of side by side results from
    the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files created by Philippe Ombredanne.
    Philippe prepared the base worksheet, and did an initial spot review
    of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537
    files assessed. Kate Stewart did a file by file comparison of the
    scanner results in the spreadsheet to determine which SPDX license
    identifier(s) to be applied to the file. She confirmed any
    determination that was not immediately clear with lawyers working with
    the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:

    - Files considered eligible had to be source code files.

    - Make and config files were included as candidates if they contained
    >5 lines of source

    - File already had some variant of a license header in it (even if <5
    lines).

    All documentation files were explicitly excluded.

    The following heuristics were used to determine which SPDX license
    identifiers to apply.

    - when both scanners couldn't find any license traces, file was
    considered to have no license information in it, and the top level
    COPYING file license applied.

    For non */uapi/* files that summary was:

    SPDX license identifier                                # files
    ------------------------------------------------------|-------
    GPL-2.0                                                  11139

    and resulted in the first patch in this series.

    If that file was a */uapi/* path one, it was "GPL-2.0 WITH
    Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
    was:

    SPDX license identifier                                # files
    ------------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note                            930

    and resulted in the second patch in this series.

    - if a file had some form of licensing information in it, and was one
    of the */uapi/* ones, it was denoted with the Linux-syscall-note if
    any GPL family license was found in the file or had no licensing in
    it (per prior point). Results summary:

    SPDX license identifier                                # files
    ------------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note                            270
    GPL-2.0+ WITH Linux-syscall-note                           169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)         21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)         17
    LGPL-2.1+ WITH Linux-syscall-note                           15
    GPL-1.0+ WITH Linux-syscall-note                            14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)         5
    LGPL-2.0+ WITH Linux-syscall-note                            4
    LGPL-2.1 WITH Linux-syscall-note                             3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                   3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT)                  1

    and that resulted in the third patch in this series.

    - when the two scanners agreed on the detected license(s), that
    became the concluded license(s).

    - when there was disagreement between the two scanners (one detected
    a license but the other didn't, or they both detected different
    licenses) a manual inspection of the file occurred.

    - In most cases a manual inspection of the information in the file
    resulted in a clear resolution of the license that should apply
    (and which scanner probably needed to revisit its heuristics).

    - When it was not immediately clear, the license identifier was
    confirmed with lawyers working with the Linux Foundation.

    - If there was any question as to the appropriate license identifier,
    the file was flagged for further research and to be revisited later
    in time.

    In total, over 70 hours of logged manual review was done on the
    spreadsheet to determine the SPDX license identifiers to apply to the
    source files by Kate, Philippe, Thomas and, in some cases,
    confirmation by lawyers working with the Linux Foundation.

    Kate also obtained a third independent scan of the 4.13 code base from
    FOSSology, and compared selected files where the other two scanners
    disagreed against that SPDX file, to see if there were new insights.
    The Windriver scanner is based on an older version of FOSSology in
    part, so they are related.

    Thomas did random spot checks in about 500 files from the spreadsheets
    for the uapi headers and agreed with SPDX license identifier in the
    files he inspected. For the non-uapi files Thomas did random spot
    checks in about 15000 files.

    In the initial set of patches against 4.14-rc6, 3 files were found to have
    copy/paste license identifier errors, and have been fixed to reflect
    the correct identifier.

    Additionally Philippe spent 10 hours this week doing a detailed manual
    inspection and review of the 12,461 patched files from the initial
    patch version early this week with:

    - a full scancode scan run, collecting the matched texts, detected
    license ids and scores

    - reviewing anything where there was a license detected (about 500+
    files) to ensure that the applied SPDX license was correct

    - reviewing anything where there was no detection but the patch
    license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
    applied SPDX license was correct

    This produced a worksheet with 20 files needing minor correction. This
    worksheet was then exported into 3 different .csv files for the
    different types of files to be modified.

    These .csv files were then reviewed by Greg. Thomas wrote a script to
    parse the csv files and add the proper SPDX tag to the file, in the
    format that the file expected. This script was further refined by Greg
    based on the output to detect more types of files automatically and to
    distinguish between header and source .c files (which need different
    comment types.) Finally Greg ran the script using the .csv files to
    generate the patches.

    Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
    Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
    Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

    * tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    License cleanup: add SPDX license identifier to uapi header files with a license
    License cleanup: add SPDX license identifier to uapi header files with no license
    License cleanup: add SPDX GPL-2.0 license identifier to files with no license

    Linus Torvalds
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if <5
    lines).

    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

26 Oct, 2017

2 commits

  • The number of unsignaled work-requests posted to the IB send queue is
    tracked by a counter in the rds_ib_connection struct. When it reaches
    zero, or the caller explicitly asks for it, the send-signaled bit is
    set in send_flags and the counter is reset. This is performed by the
    rds_ib_set_wr_signal_state() function.

    However, this function is not always used which yields inaccurate
    accounting. This commit fixes this, refactors bloated code related to
    the matter, and makes the actual parameter type of the function
    consistent.
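
    The counting scheme described above can be sketched in user-space
    C. The struct, the flag value and the function name below are
    hypothetical simplifications; only the countdown-and-reset logic is
    the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SEND_SIGNALED 0x1u  /* stands in for the send-signaled bit */

/* Hypothetical per-connection accounting state. */
struct send_ring_sketch {
    unsigned int unsignaled_max;   /* WRs allowed between signals */
    unsigned int unsignaled_left;  /* countdown to the next signal */
};

/*
 * Decide whether this work request must be signaled. The caller is
 * expected to have initialized *send_flags beforehand. Returns true
 * when the signaled bit was set and the counter was reset.
 */
static bool set_wr_signal_state(struct send_ring_sketch *ring,
                                unsigned int *send_flags, bool notify)
{
    assert(ring != NULL && send_flags != NULL);

    /* Post-decrement: a wrapped counter is overwritten by the reset. */
    if (ring->unsignaled_left-- == 0 || notify) {
        ring->unsignaled_left = ring->unsignaled_max;
        *send_flags |= SKETCH_SEND_SIGNALED;
        return true;
    }
    return false;
}
```

    Routing every posted work request through one helper like this is
    what keeps the counter accurate; any send that bypasses it skews
    the accounting.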

    Signed-off-by: Håkon Bugge
    Signed-off-by: David S. Miller

    Håkon Bugge
     
  • send_flags needs to be initialized before calling
    rds_ib_set_wr_signal_state().

    Signed-off-by: Håkon Bugge
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Håkon Bugge
     

08 Sep, 2017

1 commit

  • In rds_send_xmit() there is logic to batch the sends. However, if
    another thread has acquired the lock and has incremented the send_gen,
    it is considered a race and we yield. The code incrementing the
    s_send_lock_queue_raced statistics counter did not count this event
    correctly.

    This commit counts the race condition correctly.

    Changes from v1:
    - Removed check for *someone_on_xmit()*
    - Fixed incorrect indentation

    Signed-off-by: Håkon Bugge
    Reviewed-by: Knut Omang
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Håkon Bugge
     

06 Sep, 2017

1 commit

  • The bits in m_flags in struct rds_message are used for a plurality of
    reasons, and from different contexts. To avoid any missing updates to
    m_flags, use the atomic set_bit() instead of the non-atomic equivalent.
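
    For readers outside the kernel, the difference can be illustrated
    with C11 atomics. This is an analogy, not the kernel's set_bit()
    implementation; flag_set() and flag_test() are hypothetical
    helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Set one bit with an atomic read-modify-write. A plain
 * `flags |= 1UL << bit` is a separate load and store, so two contexts
 * updating different bits at once could lose one of the updates.
 */
static void flag_set(atomic_ulong *flags, unsigned int bit)
{
    assert(bit < sizeof(unsigned long) * CHAR_BIT);
    atomic_fetch_or(flags, 1UL << bit);
}

/* Read one bit from the shared flag word. */
static bool flag_test(atomic_ulong *flags, unsigned int bit)
{
    assert(bit < sizeof(unsigned long) * CHAR_BIT);
    return ((atomic_load(flags) >> bit) & 1UL) != 0;
}
```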

    Signed-off-by: Håkon Bugge
    Reviewed-by: Knut Omang
    Reviewed-by: Wei Lin Guay
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Håkon Bugge
     

29 Aug, 2017

1 commit


10 Aug, 2017

1 commit

  • The UDP offload conflict is dealt with by simply taking what is
    in net-next where we have removed all of the UFO handling code
    entirely.

    The TCP conflict was a case of local variables in a function
    being removed from both net and net-next.

    In netvsc we had an assignment right next to where a missing
    set of u64 stats sync object inits were added.

    Signed-off-by: David S. Miller

    David S. Miller
     

09 Aug, 2017

1 commit

  • In commit 7e3f2952eeb1 ("rds: don't let RDS shutdown a connection
    while senders are present"), refilling the receive queue was removed
    from rds_ib_recv(), along with the increment of
    s_ib_rx_refill_from_thread.

    Commit 73ce4317bf98 ("RDS: make sure we post recv buffers")
    re-introduces filling the receive queue from rds_ib_recv(), but does
    not add the statistics counter. rds_ib_recv() was later renamed to
    rds_ib_recv_path().

    This commit reintroduces the statistics counting of
    s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq.

    Signed-off-by: Håkon Bugge
    Reviewed-by: Knut Omang
    Reviewed-by: Wei Lin Guay
    Reviewed-by: Shamir Rabinovitch
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Håkon Bugge
     

04 Aug, 2017

1 commit

  • RDS over IB does not use multipath RDS, so the array
    of additional rds_conn_path structures is always superfluous
    in this case. Reduce the memory footprint of the rds module
    by making this a dynamic allocation predicated on whether
    the transport is mp_capable.
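
    The allocation strategy can be sketched as follows, assuming a
    hypothetical path count of 8 for multipath transports. All names
    here (SKETCH_MPATH_WORKERS, path_sketch, mp_conn_sketch and the two
    helpers) are illustrative and not taken from the rds module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MPATH_WORKERS 8u  /* assumed path count for MP transports */

/* Hypothetical per-path state. */
struct path_sketch {
    unsigned int index;
};

/* Hypothetical connection owning a dynamically sized path array. */
struct mp_conn_sketch {
    unsigned int npaths;
    struct path_sketch *paths;
};

/* Allocate one path, or the full set when the transport is MP capable. */
static int conn_alloc_paths(struct mp_conn_sketch *conn, bool mp_capable)
{
    unsigned int n = mp_capable ? SKETCH_MPATH_WORKERS : 1u;

    assert(conn != NULL);
    conn->paths = calloc(n, sizeof(*conn->paths));
    if (conn->paths == NULL)
        return -1;
    for (unsigned int i = 0; i < n; i++)
        conn->paths[i].index = i;
    conn->npaths = n;
    return 0;
}

/* Release the path array and reset the bookkeeping. */
static void conn_free_paths(struct mp_conn_sketch *conn)
{
    assert(conn != NULL);
    free(conn->paths);
    conn->paths = NULL;
    conn->npaths = 0;
}
```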

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Tested-by: Efrain Galaviz
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

21 Jul, 2017

2 commits


17 Jul, 2017

1 commit

  • We could end up executing rds_conn_shutdown before the rds_recv_worker
    thread, then rds_conn_shutdown -> rds_tcp_conn_shutdown can do a
    sock_release and set sock->sk to null, which may interleave in bad
    ways with rds_recv_worker, e.g., it could result in:

    "BUG: unable to handle kernel NULL pointer dereference at 0000000000000078"
    [ffff881769f6fd70] release_sock at ffffffff815f337b
    [ffff881769f6fd90] rds_tcp_recv at ffffffffa043c888 [rds_tcp]
    [ffff881769f6fdb0] rds_recv_worker at ffffffffa04a4810 [rds]
    [ffff881769f6fde0] process_one_work at ffffffff810a14c1
    [ffff881769f6fe40] worker_thread at ffffffff810a1940
    [ffff881769f6fec0] kthread at ffffffff810a6b1e

    Also, do not enqueue any new shutdown workq items when the connection is
    shutting down (this may happen for rds-tcp in softirq mode, if a FIN
    or CLOSE is received while the module is in the middle of an unload).

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
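
    The "do not enqueue new shutdown work while shutting down" rule can be
    sketched with a C11 atomic flag. This is a user-space illustration under
    stated assumptions, not the kernel patch: struct path_state,
    path_state_init() and path_queue_shutdown() are hypothetical names, and
    a counter stands in for the real work queue.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-connection state; not the kernel's data structure. */
struct path_state {
    atomic_bool shutting_down;  /* set once teardown has begun */
    atomic_int shutdown_items;  /* stand-in for queued shutdown work items */
};

void path_state_init(struct path_state *p)
{
    atomic_init(&p->shutting_down, false);
    atomic_init(&p->shutdown_items, 0);
}

/*
 * Queue a shutdown work item unless a shutdown is already in progress.
 * atomic_exchange() returns the previous flag value, so only the first
 * caller sees "false" and enqueues; later callers back off.
 */
bool path_queue_shutdown(struct path_state *p)
{
    if (atomic_exchange(&p->shutting_down, true))
        return false;

    atomic_fetch_add(&p->shutdown_items, 1);
    return true;
}
```

    A late FIN or CLOSE event that calls path_queue_shutdown() after
    teardown has started gets false back and adds no further work.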
     

08 Jul, 2017

1 commit

  • There are two problems with calling sock_create_kern() from
    rds_tcp_accept_one():
    1. It sets up a new_sock->sk that is wasteful, because this ->sk
       is going to get replaced by inet_accept() in the subsequent ->accept().
    2. The new_sock->sk is a leaked reference in sock_graft(), which
       expects to find a null parent->sk.

    Avoid these problems by calling sock_create_lite().

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

05 Jul, 2017

4 commits

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcount overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
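
    To illustrate why a dedicated reference-count type helps, here is a
    user-space sketch of a saturating counter built on C11 atomics. It is
    not the kernel's refcount_t implementation: struct sat_ref and the
    sat_ref_*() functions are hypothetical names, and the exact saturation
    policy is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Value at which the counter is pinned instead of wrapping to 0. */
#define SAT_REF_SATURATED UINT_MAX

struct sat_ref {
    atomic_uint count;
};

void sat_ref_init(struct sat_ref *r, unsigned int n)
{
    atomic_init(&r->count, n);
}

/*
 * Take a reference. Fails if the count already dropped to 0 (the object
 * may be freed). Never wraps: once saturated, the count stays pinned.
 */
bool sat_ref_inc(struct sat_ref *r)
{
    unsigned int old = atomic_load(&r->count);

    for (;;) {
        if (old == 0)
            return false;
        if (old == SAT_REF_SATURATED)
            return true;
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
            return true;
    }
}

/*
 * Drop a reference. Returns true only for the caller that released the
 * last one. A saturated counter is never decremented, so the object is
 * leaked rather than freed while references may still exist.
 */
bool sat_ref_dec_and_test(struct sat_ref *r)
{
    unsigned int old = atomic_load(&r->count);

    for (;;) {
        if (old == 0 || old == SAT_REF_SATURATED)
            return false;
        if (atomic_compare_exchange_weak(&r->count, &old, old - 1))
            return old == 1;
    }
}
```

    A plain atomic integer would wrap past its maximum back to 0 and let a
    later decrement free an object that is still in use. The saturating
    variant trades that use-after-free for a memory leak.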
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcount overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcount overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcount overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

01 Jul, 2017

1 commit

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcount overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

22 Jun, 2017

2 commits

  • If we are unloading the rds_tcp module, we can set linger to 1
    and drop pending packets to accelerate reconnect. The peer will
    end up resetting the connection based on new generation numbers
    of the new incarnation, so hanging on to unsent TCP packets via
    linger is mostly pointless in this case.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Jenny Xu
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • The RDS handshake ping probe added by commit 5916e2c1554f
    ("RDS: TCP: Enable multipath RDS for TCP") is sent from rds_sendmsg()
    before the first data packet is sent to a peer. If the conversation
    is not bidirectional (i.e., one side is always passive and never
    invokes rds_sendmsg()) and the passive side restarts its rds_tcp
    module, a new HS ping probe needs to be sent, so that the number
    of paths can be re-established.

    This patch achieves that by sending a HS ping probe from
    rds_tcp_accept_one() when c_npaths is 0 (i.e., we have not done
    a handshake probe with this peer yet).

    Signed-off-by: Sowmini Varadhan
    Tested-by: Jenny Xu
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

17 Jun, 2017

3 commits

  • Each time we get an incoming SYN to the RDS_TCP_PORT, the TCP
    layer accepts the connection and then the rds_tcp_accept_one()
    callback is invoked to process the incoming connection.

    rds_tcp_accept_one() may reject the incoming SYN for a number of
    reasons, e.g., commit 1a0e100fb2c9 ("RDS: TCP: Force every connection
    to be initiated by numerically smaller IP address"), or because
    we are getting spammed by a malicious node that is triggering
    a flood of connection attempts to RDS_TCP_PORT. If the incoming
    SYN is rejected, no data will have been sent on the TCP socket,
    and we do not need to be in TIME_WAIT state, so we set linger on
    the TCP socket before closing, thereby closing the socket efficiently
    with a RST.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Imanti Mendez
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
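
    For readers unfamiliar with the linger trick, the following user-space
    sketch shows the usual way to get a RST on close: enable SO_LINGER with
    a zero timeout. It uses the standard sockets API rather than the
    in-kernel socket interface the patch itself works with, and the helper
    names set_abortive_close() and get_linger() are hypothetical.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Enable SO_LINGER with a zero timeout. A later close() on a connected
 * TCP socket then aborts the connection with a RST instead of the normal
 * FIN handshake, so the closing side does not enter TIME_WAIT.
 * Returns 0 on success, -1 on error (errno set by setsockopt()).
 */
int set_abortive_close(int fd)
{
    struct linger l = { .l_onoff = 1, .l_linger = 0 };

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &l, sizeof(l));
}

/* Read back the current SO_LINGER setting. Returns 0 on success. */
int get_linger(int fd, struct linger *out)
{
    socklen_t len = sizeof(*out);

    return getsockopt(fd, SOL_SOCKET, SO_LINGER, out, &len);
}
```

    Any data still queued on the socket is discarded when it is closed this
    way, which is acceptable here only because nothing was sent on a
    rejected connection.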
     
  • Found when testing between sparc and x86 machines on different
    subnets, so the address comparison patterns hit the corner cases and
    brought out some bugs fixed by this patch.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Imanti Mendez
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • After commit 1a0e100fb2c9 ("RDS: TCP: Force every connection to be
    initiated by numerically smaller IP address") we no longer need
    the logic associated with cp_outgoing, so clean up usage of this
    field.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Imanti Mendez
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

03 May, 2017

2 commits

  • Pull networking updates from David Miller:
    "Here are some highlights from the 2065 networking commits that
    happened this development cycle:

    1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri)

    2) Add a generic XDP driver, so that anyone can test XDP even if they
    lack a networking device whose driver has explicit XDP support
    (me).

    3) Sparc64 now has an eBPF JIT too (me)

    4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei
    Starovoitov)

    5) Make netfilter network namespace teardown less expensive (Florian
    Westphal)

    6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana)

    7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger)

    8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky)

    9) Multiqueue support in stmmac driver (Joao Pinto)

    10) Remove TCP timewait recycling, it never really could possibly work
    well in the real world and timestamp randomization really zaps any
    hint of usability this feature had (Soheil Hassas Yeganeh)

    11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay
    Aleksandrov)

    12) Add socket busy poll support to epoll (Sridhar Samudrala)

    13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso,
    and several others)

    14) IPSEC hw offload infrastructure (Steffen Klassert)"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits)
    tipc: refactor function tipc_sk_recv_stream()
    tipc: refactor function tipc_sk_recvmsg()
    net: thunderx: Optimize page recycling for XDP
    net: thunderx: Support for XDP header adjustment
    net: thunderx: Add support for XDP_TX
    net: thunderx: Add support for XDP_DROP
    net: thunderx: Add basic XDP support
    net: thunderx: Cleanup receive buffer allocation
    net: thunderx: Optimize CQE_TX handling
    net: thunderx: Optimize RBDR descriptor handling
    net: thunderx: Support for page recycling
    ipx: call ipxitf_put() in ioctl error path
    net: sched: add helpers to handle extended actions
    qed*: Fix issues in the ptp filter config implementation.
    qede: Fix concurrency issue in PTP Tx path processing.
    stmmac: Add support for SIMATIC IOT2000 platform
    net: hns: fix ethtool_get_strings overflow in hns driver
    tcp: fix wraparound issue in tcp_lp
    bpf, arm64: fix jit branch offset related to ldimm64
    bpf, arm64: implement jiting of BPF_XADD
    ...

    Linus Torvalds
     
  • Pull iov_iter updates from Al Viro:
    "Cleanups that sat in -next + -stable fodder that has just missed 4.11.

    There's more iov_iter work in my local tree, but I'd prefer to push
    the stuff that had been in -next first"

    * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    iov_iter: don't revert iov buffer if csum error
    generic_file_read_iter(): make use of iov_iter_revert()
    generic_file_direct_write(): make use of iov_iter_revert()
    orangefs: use iov_iter_revert()
    sctp: switch to copy_from_iter_full()
    net/9p: switch to copy_from_iter_full()
    switch memcpy_from_msg() to copy_from_iter_full()
    rds: make use of iov_iter_revert()

    Linus Torvalds
     

27 Apr, 2017

1 commit

  • …uaccess.avr32', 'uaccess.bfin', 'uaccess.c6x', 'uaccess.cris', 'uaccess.frv', 'uaccess.h8300', 'uaccess.hexagon', 'uaccess.ia64', 'uaccess.m32r', 'uaccess.m68k', 'uaccess.metag', 'uaccess.microblaze', 'uaccess.mips', 'uaccess.mn10300', 'uaccess.nios2', 'uaccess.openrisc', 'uaccess.parisc', 'uaccess.powerpc', 'uaccess.s390', 'uaccess.score', 'uaccess.sh', 'uaccess.sparc', 'uaccess.tile', 'uaccess.um', 'uaccess.unicore32', 'uaccess.x86' and 'uaccess.xtensa' into work.uaccess

    Al Viro
     

22 Apr, 2017

1 commit


06 Apr, 2017

1 commit


03 Apr, 2017

2 commits

  • The rds_connect_worker() has a bug in the check that enforces the
    canonical connection order described in the comments of
    rds_tcp_state_change(). The intention is to make sure that all
    the multipath connections are always initiated by the smaller IP
    address via rds_start_mprds(). To achieve this, rds_connect_worker()
    should check that cp_index > 0.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
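
    The canonical ordering rule can be sketched as a small predicate. This
    is an illustration under stated assumptions, not the kernel check:
    may_initiate_path() is a hypothetical name, addresses are assumed to be
    IPv4 values in network byte order, and they are converted with ntohl()
    so that "numerically smaller" means the same thing on big-endian and
    little-endian hosts. Path 0 is left unrestricted here because the
    message above only describes the check for cp_index > 0.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether the local side may initiate the connection for a path.
 * laddr_be and faddr_be are the local and peer IPv4 addresses in network
 * byte order. For paths with index > 0, only the side with the
 * numerically smaller address initiates.
 */
bool may_initiate_path(int path_index, uint32_t laddr_be, uint32_t faddr_be)
{
    if (path_index == 0)
        return true; /* assumption: rule is applied to index > 0 only */

    /* Compare in host byte order so the result is endian-independent. */
    return ntohl(laddr_be) < ntohl(faddr_be);
}
```

    For example, with a local address of 10.0.0.1 and a peer of 10.0.1.2,
    the local side may start path 1, while the peer evaluating the same
    predicate with the addresses swapped may not.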
     
  • …ROR by an intervening FIN

    rds_conn_shutdown() runs in workq context, and marks the rds_connection
    as DISCONNECTING before quiescing Tx/Rx paths. However, after all I/O
    has quiesced, we may still find the rds_connection state to be
    RDS_CONN_ERROR if an intervening FIN was processed in softirq context.

    This is not a fatal error: rds_conn_shutdown() should continue the
    shutdown, and there is no need to log noisy messages about this event.

    Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
    Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

    Sowmini Varadhan