18 Oct, 2018

2 commits

  • [ Upstream commit 2ab2ddd301a22ca3c5f0b743593e4ad2953dfa53 ]

    Timer handlers do not imply rcu_read_lock(), so my recent fix
    triggered a LOCKDEP warning when a SYNACK is retransmitted.

    Let's add rcu_read_lock()/rcu_read_unlock() pairs around ireq->ireq_opt
    usages instead of guessing what is done by callers, since it is
    not worth the pain.

    Get rid of ireq_opt_deref() helper since it hides the logic
    without real benefit, since it is now a standard rcu_dereference().

    Fixes: 1ad98e9d1bdf ("tcp/dccp: fix lockdep issue when SYN is backlogged")
    Signed-off-by: Eric Dumazet
    Reported-by: Willem de Bruijn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 1ad98e9d1bdf4724c0a8532fabd84bf3c457c2bc ]

    In normal SYN processing, packets are handled without the listener
    lock and in the RCU-protected ingress path.

    But syzkaller is known to be able to trick us and SYN
    packets might be processed in process context, after being
    queued into socket backlog.

    In commit 06f877d613be ("tcp/dccp: fix other lockdep splats
    accessing ireq_opt") I made a very stupid fix, that happened
    to work mostly because of the regular path being RCU protected.

    Really the thing protecting ireq->ireq_opt is RCU read lock,
    and the pseudo request refcnt is not relevant.

    This patch extends what I did in commit 449809a66c1d ("tcp/dccp:
    block BH for SYN processing") by adding an extra rcu_read_{lock|unlock}
    pair in the paths that might be taken when processing SYN from
    socket backlog (thus possibly in process context)

    Fixes: 06f877d613be ("tcp/dccp: fix other lockdep splats accessing ireq_opt")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

22 Aug, 2018

1 commit

  • [ Upstream commit 61ef4b07fcdc30535889990cf4229766502561cf ]

    The shift of 'cwnd' with '(now - hc->tx_lsndtime) / hc->tx_rto' value
    can lead to undefined behavior [1].

    In order to fix this use a gradual shift of the window with a 'while'
    loop, similar to what tcp_cwnd_restart() is doing.

    When comparing delta and RTO there is a minor difference between TCP
    and DCCP, the last one also invokes dccp_cwnd_restart() and reduces
    'cwnd' if delta equals RTO. That case is preserved in this change.
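
    As a rough userspace illustration of the gradual shift described
    above (not the kernel code itself): the function name, its
    parameters and the 'floor_cwnd' lower bound are assumptions made
    for this sketch. The window is halved once per elapsed RTO, so the
    shift count never reaches the width of the type, and the
    'delta equals RTO' case still halves once.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the congestion window after the sender
 * has been idle for 'idle' ticks. The unsafe form would be
 *     cwnd >>= idle / rto;
 * which is undefined once idle / rto >= 32 for a 32-bit cwnd.
 */
static uint32_t cwnd_after_idle(uint32_t cwnd, uint32_t floor_cwnd,
                                int64_t idle, int64_t rto)
{
    assert(rto > 0);            /* precondition of this sketch */

    /* Never raise the window: the floor cannot exceed the current cwnd. */
    if (floor_cwnd > cwnd)
        floor_cwnd = cwnd;

    /* One halving per full RTO of idle time, including idle == rto. */
    while ((idle -= rto) >= 0 && cwnd > floor_cwnd)
        cwnd >>= 1;

    return cwnd > floor_cwnd ? cwnd : floor_cwnd;
}
```

    With an idle time of 67 RTOs, as in the report below, the loop
    stops as soon as the window reaches the floor instead of shifting
    by 67.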

    [1]:
    [40850.963623] UBSAN: Undefined behaviour in net/dccp/ccids/ccid2.c:237:7
    [40851.043858] shift exponent 67 is too large for 32-bit type 'unsigned int'
    [40851.127163] CPU: 3 PID: 15940 Comm: netstress Tainted: G W E 4.18.0-rc7.x86_64 #1
    ...
    [40851.377176] Call Trace:
    [40851.408503] dump_stack+0xf1/0x17b
    [40851.451331] ? show_regs_print_info+0x5/0x5
    [40851.503555] ubsan_epilogue+0x9/0x7c
    [40851.548363] __ubsan_handle_shift_out_of_bounds+0x25b/0x2b4
    [40851.617109] ? __ubsan_handle_load_invalid_value+0x18f/0x18f
    [40851.686796] ? xfrm4_output_finish+0x80/0x80
    [40851.739827] ? lock_downgrade+0x6d0/0x6d0
    [40851.789744] ? xfrm4_prepare_output+0x160/0x160
    [40851.845912] ? ip_queue_xmit+0x810/0x1db0
    [40851.895845] ? ccid2_hc_tx_packet_sent+0xd36/0x10a0 [dccp]
    [40851.963530] ccid2_hc_tx_packet_sent+0xd36/0x10a0 [dccp]
    [40852.029063] dccp_xmit_packet+0x1d3/0x720 [dccp]
    [40852.086254] dccp_write_xmit+0x116/0x1d0 [dccp]
    [40852.142412] dccp_sendmsg+0x428/0xb20 [dccp]
    [40852.195454] ? inet_dccp_listen+0x200/0x200 [dccp]
    [40852.254833] ? sched_clock+0x5/0x10
    [40852.298508] ? sched_clock+0x5/0x10
    [40852.342194] ? inet_create+0xdf0/0xdf0
    [40852.388988] sock_sendmsg+0xd9/0x160
    ...

    Fixes: 113ced1f52e5 ("dccp ccid-2: Perform congestion-window validation")
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kodanev
     

22 Jul, 2018

2 commits

  • [ Upstream commit 0ce4e70ff00662ad7490e545ba0cd8c1fa179fca ]

    To compute delays, better not use time of the day which can
    be changed by admins or malicious programs.

    Also change ccid3_first_li() to use s64 type for delta variable
    to avoid potential overflows.
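
    A userspace analogue of the same idea, as a sketch only: the
    kernel uses its own time helpers, whereas this example assumes
    POSIX clock_gettime() with CLOCK_MONOTONIC. The helper names
    are hypothetical. The delay is kept in a signed 64-bit value, in
    the spirit of the s64 delta mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Microseconds from a clock that does not jump when the wall clock is set. */
static int64_t monotonic_us(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Signed 64-bit difference between now and an earlier monotonic_us() value. */
static int64_t elapsed_us(int64_t start_us)
{
    return monotonic_us() - start_us;
}
```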

    Signed-off-by: Eric Dumazet
    Cc: Gerrit Renker
    Cc: dccp@vger.kernel.org
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 74174fe5634ffbf645a7ca5a261571f700b2f332 ]

    On fast hosts or malicious bots, we trigger a DCCP_BUG() which
    seems excessive.

    syzbot reported:

    BUG: delta (-6195)
    Reported-by: syzbot
    Cc: Gerrit Renker
    Cc: dccp@vger.kernel.org
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

12 Jun, 2018

1 commit

  • [ Upstream commit 2677d20677314101293e6da0094ede7b5526d2b1 ]

    Syzbot reported a use-after-free in timer_is_static_object() [1].

    This can happen because the structure for the rto timer (ccid2_hc_tx_sock)
    is removed in dccp_disconnect(), and ccid2_hc_tx_rto_expire() can be
    called after that.

    The report [1] is similar to the one in commit 120e9dabaf55 ("dccp:
    defer ccid_hc_tx_delete() at dismantle time"). The fix is the same:
    delay freeing the ccid2_hc_tx_sock structure so that it is freed in
    dccp_sk_destruct().

    [1]

    ==================================================================
    BUG: KASAN: use-after-free in timer_is_static_object+0x80/0x90
    kernel/time/timer.c:607
    Read of size 8 at addr ffff8801bebb5118 by task syz-executor2/25299

    CPU: 1 PID: 25299 Comm: syz-executor2 Not tainted 4.17.0-rc5+ #54
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1b9/0x294 lib/dump_stack.c:113
    print_address_description+0x6c/0x20b mm/kasan/report.c:256
    kasan_report_error mm/kasan/report.c:354 [inline]
    kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
    __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
    timer_is_static_object+0x80/0x90 kernel/time/timer.c:607
    debug_object_activate+0x2d9/0x670 lib/debugobjects.c:508
    debug_timer_activate kernel/time/timer.c:709 [inline]
    debug_activate kernel/time/timer.c:764 [inline]
    __mod_timer kernel/time/timer.c:1041 [inline]
    mod_timer+0x4d3/0x13b0 kernel/time/timer.c:1102
    sk_reset_timer+0x22/0x60 net/core/sock.c:2742
    ccid2_hc_tx_rto_expire+0x587/0x680 net/dccp/ccids/ccid2.c:147
    call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
    expire_timers kernel/time/timer.c:1363 [inline]
    __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
    run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
    __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
    invoke_softirq kernel/softirq.c:365 [inline]
    irq_exit+0x1d1/0x200 kernel/softirq.c:405
    exiting_irq arch/x86/include/asm/apic.h:525 [inline]
    smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
    apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863

    ...
    Allocated by task 25374:
    save_stack+0x43/0xd0 mm/kasan/kasan.c:448
    set_track mm/kasan/kasan.c:460 [inline]
    kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
    kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
    kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554
    ccid_new+0x25b/0x3e0 net/dccp/ccid.c:151
    dccp_hdlr_ccid+0x27/0x150 net/dccp/feat.c:44
    __dccp_feat_activate+0x184/0x270 net/dccp/feat.c:344
    dccp_feat_activate_values+0x3a7/0x819 net/dccp/feat.c:1538
    dccp_create_openreq_child+0x472/0x610 net/dccp/minisocks.c:128
    dccp_v4_request_recv_sock+0x12c/0xca0 net/dccp/ipv4.c:408
    dccp_v6_request_recv_sock+0x125d/0x1f10 net/dccp/ipv6.c:415
    dccp_check_req+0x455/0x6a0 net/dccp/minisocks.c:197
    dccp_v4_rcv+0x7b8/0x1f3f net/dccp/ipv4.c:841
    ip_local_deliver_finish+0x2e3/0xd80 net/ipv4/ip_input.c:215
    NF_HOOK include/linux/netfilter.h:288 [inline]
    ip_local_deliver+0x1e1/0x720 net/ipv4/ip_input.c:256
    dst_input include/net/dst.h:450 [inline]
    ip_rcv_finish+0x81b/0x2200 net/ipv4/ip_input.c:396
    NF_HOOK include/linux/netfilter.h:288 [inline]
    ip_rcv+0xb70/0x143d net/ipv4/ip_input.c:492
    __netif_receive_skb_core+0x26f5/0x3630 net/core/dev.c:4592
    __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4657
    process_backlog+0x219/0x760 net/core/dev.c:5337
    napi_poll net/core/dev.c:5735 [inline]
    net_rx_action+0x7b7/0x1930 net/core/dev.c:5801
    __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285

    Freed by task 25374:
    save_stack+0x43/0xd0 mm/kasan/kasan.c:448
    set_track mm/kasan/kasan.c:460 [inline]
    __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
    kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
    __cache_free mm/slab.c:3498 [inline]
    kmem_cache_free+0x86/0x2d0 mm/slab.c:3756
    ccid_hc_tx_delete+0xc3/0x100 net/dccp/ccid.c:190
    dccp_disconnect+0x130/0xc66 net/dccp/proto.c:286
    dccp_close+0x3bc/0xe60 net/dccp/proto.c:1045
    inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
    inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
    sock_release+0x96/0x1b0 net/socket.c:594
    sock_close+0x16/0x20 net/socket.c:1149
    __fput+0x34d/0x890 fs/file_table.c:209
    ____fput+0x15/0x20 fs/file_table.c:243
    task_work_run+0x1e4/0x290 kernel/task_work.c:113
    tracehook_notify_resume include/linux/tracehook.h:191 [inline]
    exit_to_usermode_loop+0x2bd/0x310 arch/x86/entry/common.c:166
    prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
    syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
    do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    The buggy address belongs to the object at ffff8801bebb4cc0
    which belongs to the cache ccid2_hc_tx_sock of size 1240
    The buggy address is located 1112 bytes inside of
    1240-byte region [ffff8801bebb4cc0, ffff8801bebb5198)
    The buggy address belongs to the page:
    page:ffffea0006faed00 count:1 mapcount:0 mapping:ffff8801bebb41c0
    index:0xffff8801bebb5240 compound_mapcount: 0
    flags: 0x2fffc0000008100(slab|head)
    raw: 02fffc0000008100 ffff8801bebb41c0 ffff8801bebb5240 0000000100000003
    raw: ffff8801cdba3138 ffffea0007634120 ffff8801cdbaab40 0000000000000000
    page dumped because: kasan: bad access detected
    ...
    ==================================================================

    Reported-by: syzbot+5d47e9ec91a6f15dbd6f@syzkaller.appspotmail.com
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kodanev
     

19 May, 2018

1 commit

  • [ Upstream commit a8d7aa17bbc970971ccdf71988ea19230ab368b1 ]

    syzbot reported a crash in tasklet_action_common() caused by dccp.

    dccp needs to make sure the socket won't disappear before the tasklet
    handler has completed.

    This patch takes a reference on the socket when arming the tasklet,
    and moves the sock_put() from dccp_write_xmit_timer() to dccp_write_xmitlet()
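
    The reference-counting pattern can be pictured with a toy model.
    Everything here (struct toy_sock and the toy_* functions) is
    hypothetical and single-threaded; it only illustrates "hold when
    arming, put at the end of the handler" and is not the dccp code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy socket: a plain counter stands in for the real refcount. */
struct toy_sock {
    int  refcnt;
    bool tasklet_armed;
};

static void toy_hold(struct toy_sock *sk)
{
    sk->refcnt++;
}

/* Drops one reference; returns true when it was the last one. */
static bool toy_put(struct toy_sock *sk)
{
    assert(sk->refcnt > 0);
    return --sk->refcnt == 0;
}

/* Arming takes a reference on behalf of the pending handler. */
static void toy_arm_tasklet(struct toy_sock *sk)
{
    if (!sk->tasklet_armed) {
        toy_hold(sk);
        sk->tasklet_armed = true;
    }
}

/* The handler drops that reference only after its work is done. */
static bool toy_tasklet_handler(struct toy_sock *sk)
{
    sk->tasklet_armed = false;
    /* ... transmit work would run here, with sk guaranteed alive ... */
    return toy_put(sk);
}
```

    In this model, closing the socket between arming and running
    cannot drop the count to zero, so the handler never sees a freed
    object.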

    kernel BUG at kernel/softirq.c:514!
    invalid opcode: 0000 [#1] SMP KASAN
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 1 PID: 17 Comm: ksoftirqd/1 Not tainted 4.17.0-rc3+ #30
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515
    RSP: 0018:ffff8801d9b3faf8 EFLAGS: 00010246
    dccp_close: ABORT with 65423 bytes unread
    RAX: 1ffff1003b367f6b RBX: ffff8801daf1f3f0 RCX: 0000000000000000
    RDX: ffff8801cf895498 RSI: 0000000000000004 RDI: 0000000000000000
    RBP: ffff8801d9b3fc40 R08: ffffed0039f12a95 R09: ffffed0039f12a94
    dccp_close: ABORT with 65423 bytes unread
    R10: ffffed0039f12a94 R11: ffff8801cf8954a3 R12: 0000000000000000
    R13: ffff8801d9b3fc18 R14: dffffc0000000000 R15: ffff8801cf895490
    FS: 0000000000000000(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000001b2bc28000 CR3: 00000001a08a9000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    tasklet_action+0x1d/0x20 kernel/softirq.c:533
    __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
    dccp_close: ABORT with 65423 bytes unread
    run_ksoftirqd+0x86/0x100 kernel/softirq.c:646
    smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
    kthread+0x345/0x410 kernel/kthread.c:238
    ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
    Code: 48 8b 85 e8 fe ff ff 48 8b 95 f0 fe ff ff e9 94 fb ff ff 48 89 95 f0 fe ff ff e8 81 53 6e 00 48 8b 95 f0 fe ff ff e9 62 fb ff ff 0b 48 89 cf 48 89 8d e8 fe ff ff e8 64 53 6e 00 48 8b 8d e8
    RIP: tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515 RSP: ffff8801d9b3faf8

    Fixes: dc841e30eaea ("dccp: Extend CCID packet dequeueing interface")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Cc: Gerrit Renker
    Cc: dccp@vger.kernel.org
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

16 May, 2018

1 commit

  • commit b855ff827476adbdc2259e9895681d82b7b26065 upstream.

    syzbot reported an uninit-value read of skb->mark in iptable_mangle_hook()

    Thanks to the nice report, I tracked the problem to dccp not taking
    care of ireq->ir_mark for passive sessions.

    BUG: KMSAN: uninit-value in ipt_mangle_out net/ipv4/netfilter/iptable_mangle.c:66 [inline]
    BUG: KMSAN: uninit-value in iptable_mangle_hook+0x5e5/0x720 net/ipv4/netfilter/iptable_mangle.c:84
    CPU: 0 PID: 5300 Comm: syz-executor3 Not tainted 4.16.0+ #81
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x185/0x1d0 lib/dump_stack.c:53
    kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
    __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
    ipt_mangle_out net/ipv4/netfilter/iptable_mangle.c:66 [inline]
    iptable_mangle_hook+0x5e5/0x720 net/ipv4/netfilter/iptable_mangle.c:84
    nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
    nf_hook_slow+0x158/0x3d0 net/netfilter/core.c:483
    nf_hook include/linux/netfilter.h:243 [inline]
    __ip_local_out net/ipv4/ip_output.c:113 [inline]
    ip_local_out net/ipv4/ip_output.c:122 [inline]
    ip_queue_xmit+0x1d21/0x21c0 net/ipv4/ip_output.c:504
    dccp_transmit_skb+0x15eb/0x1900 net/dccp/output.c:142
    dccp_xmit_packet+0x814/0x9e0 net/dccp/output.c:281
    dccp_write_xmit+0x20f/0x480 net/dccp/output.c:363
    dccp_sendmsg+0x12ca/0x12d0 net/dccp/proto.c:818
    inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764
    sock_sendmsg_nosec net/socket.c:630 [inline]
    sock_sendmsg net/socket.c:640 [inline]
    ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
    __sys_sendmsg net/socket.c:2080 [inline]
    SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
    SyS_sendmsg+0x54/0x80 net/socket.c:2087
    do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x455259
    RSP: 002b:00007f1a4473dc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007f1a4473e6d4 RCX: 0000000000455259
    RDX: 0000000000000000 RSI: 0000000020b76fc8 RDI: 0000000000000015
    RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
    R13: 00000000000004f0 R14: 00000000006fa720 R15: 0000000000000000

    Uninit was stored to memory at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
    kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
    kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
    __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
    ip_queue_xmit+0x1e35/0x21c0 net/ipv4/ip_output.c:502
    dccp_transmit_skb+0x15eb/0x1900 net/dccp/output.c:142
    dccp_xmit_packet+0x814/0x9e0 net/dccp/output.c:281
    dccp_write_xmit+0x20f/0x480 net/dccp/output.c:363
    dccp_sendmsg+0x12ca/0x12d0 net/dccp/proto.c:818
    inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764
    sock_sendmsg_nosec net/socket.c:630 [inline]
    sock_sendmsg net/socket.c:640 [inline]
    ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
    __sys_sendmsg net/socket.c:2080 [inline]
    SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
    SyS_sendmsg+0x54/0x80 net/socket.c:2087
    do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    Uninit was stored to memory at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
    kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
    kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
    __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
    inet_csk_clone_lock+0x503/0x580 net/ipv4/inet_connection_sock.c:797
    dccp_create_openreq_child+0x7f/0x890 net/dccp/minisocks.c:92
    dccp_v4_request_recv_sock+0x22c/0xe90 net/dccp/ipv4.c:408
    dccp_v6_request_recv_sock+0x290/0x2000 net/dccp/ipv6.c:414
    dccp_check_req+0x7b9/0x8f0 net/dccp/minisocks.c:197
    dccp_v4_rcv+0x12e4/0x2630 net/dccp/ipv4.c:840
    ip_local_deliver_finish+0x6ed/0xd40 net/ipv4/ip_input.c:216
    NF_HOOK include/linux/netfilter.h:288 [inline]
    ip_local_deliver+0x43c/0x4e0 net/ipv4/ip_input.c:257
    dst_input include/net/dst.h:449 [inline]
    ip_rcv_finish+0x1253/0x16d0 net/ipv4/ip_input.c:397
    NF_HOOK include/linux/netfilter.h:288 [inline]
    ip_rcv+0x119d/0x16f0 net/ipv4/ip_input.c:493
    __netif_receive_skb_core+0x47cf/0x4a80 net/core/dev.c:4562
    __netif_receive_skb net/core/dev.c:4627 [inline]
    process_backlog+0x62d/0xe20 net/core/dev.c:5307
    napi_poll net/core/dev.c:5705 [inline]
    net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
    __do_softirq+0x56d/0x93d kernel/softirq.c:285
    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
    kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
    kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
    kmem_cache_alloc+0xaab/0xb90 mm/slub.c:2756
    reqsk_alloc include/net/request_sock.h:88 [inline]
    inet_reqsk_alloc+0xc4/0x7f0 net/ipv4/tcp_input.c:6145
    dccp_v4_conn_request+0x5cc/0x1770 net/dccp/ipv4.c:600
    dccp_v6_conn_request+0x299/0x1880 net/dccp/ipv6.c:317
    dccp_rcv_state_process+0x2ea/0x2410 net/dccp/input.c:612
    dccp_v4_do_rcv+0x229/0x340 net/dccp/ipv4.c:682
    dccp_v6_do_rcv+0x16d/0x1220 net/dccp/ipv6.c:578
    sk_backlog_rcv include/net/sock.h:908 [inline]
    __sk_receive_skb+0x60e/0xf20 net/core/sock.c:513
    dccp_v4_rcv+0x24d4/0x2630 net/dccp/ipv4.c:874
    ip_local_deliver_finish+0x6ed/0xd40 net/ipv4/ip_input.c:216
    NF_HOOK include/linux/netfilter.h:288 [inline]
    ip_local_deliver+0x43c/0x4e0 net/ipv4/ip_input.c:257
    dst_input include/net/dst.h:449 [inline]
    ip_rcv_finish+0x1253/0x16d0 net/ipv4/ip_input.c:397
    NF_HOOK include/linux/netfilter.h:288 [inline]
    ip_rcv+0x119d/0x16f0 net/ipv4/ip_input.c:493
    __netif_receive_skb_core+0x47cf/0x4a80 net/core/dev.c:4562
    __netif_receive_skb net/core/dev.c:4627 [inline]
    process_backlog+0x62d/0xe20 net/core/dev.c:5307
    napi_poll net/core/dev.c:5705 [inline]
    net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
    __do_softirq+0x56d/0x93d kernel/softirq.c:285

    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

01 Apr, 2018

1 commit

  • [ Upstream commit 67f93df79aeefc3add4e4b31a752600f834236e2 ]

    dccp_disconnect() sets 'dp->dccps_hc_tx_ccid' tx handler to NULL,
    therefore if DCCP socket is disconnected and dccp_sendmsg() is
    called after it, it will cause a NULL pointer dereference in
    dccp_write_xmit().

    This crash and the reproducer were reported by syzbot. It appears to
    be reproducible if commit 69c64866ce07 ("dccp: CVE-2017-8824:
    use-after-free in DCCP code") is applied.
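
    The message above does not spell out the fix. As a hedged sketch,
    one plausible guard is to refuse to send once the socket is closed
    or the tx handler is gone. All toy_* names below are hypothetical,
    and returning -ENOTCONN is an assumption of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum toy_state { TOY_OPEN, TOY_CLOSED };

struct toy_ccid {
    int packets_sent;
};

struct toy_dccp_sock {
    enum toy_state   state;
    struct toy_ccid *hc_tx_ccid;   /* set to NULL by disconnect */
};

/* Models the disconnect behaviour described above. */
static void toy_disconnect(struct toy_dccp_sock *dp)
{
    assert(dp != NULL);
    dp->hc_tx_ccid = NULL;
    dp->state = TOY_CLOSED;
}

/* Checks the state before touching the tx handler. */
static int toy_sendmsg(struct toy_dccp_sock *dp)
{
    assert(dp != NULL);
    if (dp->state == TOY_CLOSED || dp->hc_tx_ccid == NULL)
        return -ENOTCONN;

    dp->hc_tx_ccid->packets_sent++;
    return 0;
}
```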

    Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kodanev
     

17 Feb, 2018

1 commit

  • commit 69c64866ce072dea1d1e59a0d61e0f66c0dffb76 upstream.

    Whenever the sock object is in DCCP_CLOSED state,
    dccp_disconnect() must free dccps_hc_tx_ccid and
    dccps_hc_rx_ccid and set them to NULL.

    Signed-off-by: Mohamed Ghannam
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Mohamed Ghannam
     

31 Jan, 2018

1 commit

  • [ Upstream commit dd5684ecae3bd8e44b644f50e2c12c7e57fdfef5 ]

    ccid2_hc_tx_rto_expire() timer callback always restarts the timer
    again and can run indefinitely (unless it is stopped outside), and after
    commit 120e9dabaf55 ("dccp: defer ccid_hc_tx_delete() at dismantle time"),
    which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from
    dccp_destroy_sock() to sk_destruct(), this started to happen quite often.
    The timer prevents releasing the socket, as a result, sk_destruct() won't
    be called.
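
    To see why a self-rearming timer pins the socket, consider this
    toy model. The names are hypothetical, and the 'closed' check is
    only one plausible way to break the cycle; the message above does
    not state how the actual fix does it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy socket: a pending timer owns one reference. */
struct toy_timer_sock {
    int  refcnt;
    bool closed;
    bool timer_pending;
};

/*
 * Timer callback. Returns true if the timer was re-armed.
 * Without the 'closed' check it would always re-arm, the timer's
 * reference would never be dropped, and refcnt could not reach zero.
 */
static bool toy_rto_expire(struct toy_timer_sock *sk)
{
    assert(sk->refcnt > 0);
    sk->timer_pending = false;

    if (sk->closed) {
        sk->refcnt--;           /* release the timer's reference */
        return false;
    }

    sk->timer_pending = true;   /* re-arm, keeping the reference */
    return true;
}
```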

    Found with LTP/dccp_ipsec tests running on the bonding device,
    which later couldn't be unloaded after the tests were completed:

    unregister_netdevice: waiting for bond0 to become free. Usage count = 148

    Fixes: 2a91aa396739 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation")
    Signed-off-by: Alexey Kodanev
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kodanev
     

17 Dec, 2017

1 commit

  • [ Upstream commit cfac7f836a715b91f08c851df915d401a4d52783 ]

    Maciej Żenczykowski reported some panics in tcp_twsk_destructor()
    that might be caused by the following bug.

    The timewait timer is pinned to the cpu, because we want to transition
    the timewait refcount from 0 to 4 in one go, once everything has been
    initialized.

    At the time commit ed2e92394589 ("tcp/dccp: fix timewait races in timer
    handling") was merged, TCP was always running from the BH handler.

    After commit 5413d1babe8f ("net: do not block BH while processing
    socket backlog") we definitely can run tcp_time_wait() from process
    context.

    We need to block BH in the critical section so that the pinned timer
    still serves its purpose.

    This bug is more likely to happen under stress and when very small RTOs
    are used in datacenter flows.

    Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
    Signed-off-by: Eric Dumazet
    Reported-by: Maciej Żenczykowski
    Acked-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

03 Nov, 2017

1 commit

  • …el/git/gregkh/driver-core

    Pull initial SPDX identifiers from Greg KH:
    "License cleanup: add SPDX license identifiers to some files

    Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the
    'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
    binding shorthand, which can be used instead of the full boiler plate
    text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart
    and Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset
    of the use cases:

    - file had no licensing information in it.

    - file was a */uapi/* one with no licensing information in it,

    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to
    license had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied
    to a file was done in a spreadsheet of side by side results from
    the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files created by Philippe Ombredanne.
    Philippe prepared the base worksheet, and did an initial spot review
    of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537
    files assessed. Kate Stewart did a file by file comparison of the
    scanner results in the spreadsheet to determine which SPDX license
    identifier(s) to be applied to the file. She confirmed any
    determination that was not immediately clear with lawyers working with
    the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:

    - Files considered eligible had to be source code files.

    - Make and config files were included as candidates if they contained
    >5 lines of source

    - File already had some variant of a license header in it (even if <5
    lines).

    All documentation files were explicitly excluded.

    The following heuristics were used to determine which SPDX license
    identifiers to apply.

    - when both scanners couldn't find any license traces, file was
    considered to have no license information in it, and the top level
    COPYING file license applied.

    For non */uapi/* files that summary was:

    SPDX license identifier # files
    ---------------------------------------------------|-------
    GPL-2.0 11139

    and resulted in the first patch in this series.

    If that file was a */uapi/* path one, it was "GPL-2.0 WITH
    Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
    was:

    SPDX license identifier # files
    ---------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note 930

    and resulted in the second patch in this series.

    - if a file had some form of licensing information in it, and was one
    of the */uapi/* ones, it was denoted with the Linux-syscall-note if
    any GPL family license was found in the file or had no licensing in
    it (per prior point). Results summary:

    SPDX license identifier # files
    ---------------------------------------------------|------
    GPL-2.0 WITH Linux-syscall-note 270
    GPL-2.0+ WITH Linux-syscall-note 169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
    LGPL-2.1+ WITH Linux-syscall-note 15
    GPL-1.0+ WITH Linux-syscall-note 14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
    LGPL-2.0+ WITH Linux-syscall-note 4
    LGPL-2.1 WITH Linux-syscall-note 3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

    and that resulted in the third patch in this series.

    - when the two scanners agreed on the detected license(s), that
    became the concluded license(s).

    - when there was disagreement between the two scanners (one detected
    a license but the other didn't, or they both detected different
    licenses) a manual inspection of the file occurred.

    - In most cases a manual inspection of the information in the file
    resulted in a clear resolution of the license that should apply
    (and which scanner probably needed to revisit its heuristics).

    - When it was not immediately clear, the license identifier was
    confirmed with lawyers working with the Linux Foundation.

    - If there was any question as to the appropriate license identifier,
    the file was flagged for further research and to be revisited later
    in time.

    In total, over 70 hours of logged manual review was done on the
    spreadsheet to determine the SPDX license identifiers to apply to the
    source files by Kate, Philippe, Thomas and, in some cases,
    confirmation by lawyers working with the Linux Foundation.

    Kate also obtained a third independent scan of the 4.13 code base from
    FOSSology, and compared selected files where the other two scanners
    disagreed against that SPDX file, to see if there were new insights.
    The Windriver scanner is based on an older version of FOSSology in
    part, so they are related.

    Thomas did random spot checks in about 500 files from the spreadsheets
    for the uapi headers and agreed with SPDX license identifier in the
    files he inspected. For the non-uapi files Thomas did random spot
    checks in about 15000 files.

    In the initial set of patches against 4.14-rc6, 3 files were found to have
    copy/paste license identifier errors, and have been fixed to reflect
    the correct identifier.

    Additionally Philippe spent 10 hours this week doing a detailed manual
    inspection and review of the 12,461 patched files from the initial
    patch version early this week with:

    - a full scancode scan run, collecting the matched texts, detected
    license ids and scores

    - reviewing anything where there was a license detected (about 500+
    files) to ensure that the applied SPDX license was correct

    - reviewing anything where there was no detection but the patch
    license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
    applied SPDX license was correct

    This produced a worksheet with 20 files needing minor correction. This
    worksheet was then exported into 3 different .csv files for the
    different types of files to be modified.

    These .csv files were then reviewed by Greg. Thomas wrote a script to
    parse the csv files and add the proper SPDX tag to the file, in the
    format that the file expected. This script was further refined by Greg
    based on the output to detect more types of files automatically and to
    distinguish between header and source .c files (which need different
    comment types.) Finally Greg ran the script using the .csv files to
    generate the patches.

    Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
    Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
    Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

    * tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    License cleanup: add SPDX license identifier to uapi header files with a license
    License cleanup: add SPDX license identifier to uapi header files with no license
    License cleanup: add SPDX GPL-2.0 license identifier to files with no license

    Linus Torvalds
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

26 Oct, 2017

1 commit

  • In my first attempt to fix the lockdep splat, I forgot we could
    enter inet_csk_route_req() with a freshly allocated request socket,
    for which refcount has not yet been elevated, due to complex
    SLAB_TYPESAFE_BY_RCU rules.

    We either are in rcu_read_lock() section _or_ we own a refcount on the
    request.

    Correct RCU verb to use here is rcu_dereference_check(), although it is
    not possible to prove we actually own a reference on a shared
    refcount :/

    In v2, I added the ireq_opt_deref() helper and used it in three places, to
    fix other possible splats.

    [ 49.844590] lockdep_rcu_suspicious+0xea/0xf3
    [ 49.846487] inet_csk_route_req+0x53/0x14d
    [ 49.848334] tcp_v4_route_req+0xe/0x10
    [ 49.850174] tcp_conn_request+0x31c/0x6a0
    [ 49.851992] ? __lock_acquire+0x614/0x822
    [ 49.854015] tcp_v4_conn_request+0x5a/0x79
    [ 49.855957] ? tcp_v4_conn_request+0x5a/0x79
    [ 49.858052] tcp_rcv_state_process+0x98/0xdcc
    [ 49.859990] ? sk_filter_trim_cap+0x2f6/0x307
    [ 49.862085] tcp_v4_do_rcv+0xfc/0x145
    [ 49.864055] ? tcp_v4_do_rcv+0xfc/0x145
    [ 49.866173] tcp_v4_rcv+0x5ab/0xaf9
    [ 49.868029] ip_local_deliver_finish+0x1af/0x2e7
    [ 49.870064] ip_local_deliver+0x1b2/0x1c5
    [ 49.871775] ? inet_del_offload+0x45/0x45
    [ 49.873916] ip_rcv_finish+0x3f7/0x471
    [ 49.875476] ip_rcv+0x3f1/0x42f
    [ 49.876991] ? ip_local_deliver_finish+0x2e7/0x2e7
    [ 49.878791] __netif_receive_skb_core+0x6d3/0x950
    [ 49.880701] ? process_backlog+0x7e/0x216
    [ 49.882589] __netif_receive_skb+0x1d/0x5e
    [ 49.884122] process_backlog+0x10c/0x216
    [ 49.885812] net_rx_action+0x147/0x3df

    Fixes: a6ca7abe53633 ("tcp/dccp: fix lockdep splat in inet_csk_route_req()")
    Fixes: c92e8c02fe66 ("tcp/dccp: fix ireq->opt races")
    Signed-off-by: Eric Dumazet
    Reported-by: kernel test robot
    Reported-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Oct, 2017

1 commit

  • syzkaller found another bug in DCCP/TCP stacks [1]

    For the reasons explained in commit ce1050089c96 ("tcp/dccp: fix
    ireq->pktopts race"), we need to make sure we do not access
    ireq->opt unless we own the request sock.

    Note the opt field is renamed to ireq_opt to ease grep games.

    [1]
    BUG: KASAN: use-after-free in ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
    Read of size 1 at addr ffff8801c951039c by task syz-executor5/3295

    CPU: 1 PID: 3295 Comm: syz-executor5 Not tainted 4.14.0-rc4+ #80
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:52
    print_address_description+0x73/0x250 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x25b/0x340 mm/kasan/report.c:409
    __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427
    ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
    tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1135
    tcp_send_ack.part.37+0x3bb/0x650 net/ipv4/tcp_output.c:3587
    tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3557
    __tcp_ack_snd_check+0x2c6/0x4b0 net/ipv4/tcp_input.c:5072
    tcp_ack_snd_check net/ipv4/tcp_input.c:5085 [inline]
    tcp_rcv_state_process+0x2eff/0x4850 net/ipv4/tcp_input.c:6071
    tcp_child_process+0x342/0x990 net/ipv4/tcp_minisocks.c:816
    tcp_v4_rcv+0x1827/0x2f80 net/ipv4/tcp_ipv4.c:1682
    ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
    NF_HOOK include/linux/netfilter.h:249 [inline]
    ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
    dst_input include/net/dst.h:464 [inline]
    ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
    NF_HOOK include/linux/netfilter.h:249 [inline]
    ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
    __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
    __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
    netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
    netif_receive_skb+0xae/0x390 net/core/dev.c:4611
    tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
    tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
    tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
    call_write_iter include/linux/fs.h:1770 [inline]
    new_sync_write fs/read_write.c:468 [inline]
    __vfs_write+0x68a/0x970 fs/read_write.c:481
    vfs_write+0x18f/0x510 fs/read_write.c:543
    SYSC_write fs/read_write.c:588 [inline]
    SyS_write+0xef/0x220 fs/read_write.c:580
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x40c341
    RSP: 002b:00007f469523ec10 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 000000000040c341
    RDX: 0000000000000037 RSI: 0000000020004000 RDI: 0000000000000015
    RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
    R10: 00000000000f4240 R11: 0000000000000293 R12: 00000000004b7fd1
    R13: 00000000ffffffff R14: 0000000020000000 R15: 0000000000025000

    Allocated by task 3295:
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
    save_stack+0x43/0xd0 mm/kasan/kasan.c:447
    set_track mm/kasan/kasan.c:459 [inline]
    kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
    __do_kmalloc mm/slab.c:3725 [inline]
    __kmalloc+0x162/0x760 mm/slab.c:3734
    kmalloc include/linux/slab.h:498 [inline]
    tcp_v4_save_options include/net/tcp.h:1962 [inline]
    tcp_v4_init_req+0x2d3/0x3e0 net/ipv4/tcp_ipv4.c:1271
    tcp_conn_request+0xf6d/0x3410 net/ipv4/tcp_input.c:6283
    tcp_v4_conn_request+0x157/0x210 net/ipv4/tcp_ipv4.c:1313
    tcp_rcv_state_process+0x8ea/0x4850 net/ipv4/tcp_input.c:5857
    tcp_v4_do_rcv+0x55c/0x7d0 net/ipv4/tcp_ipv4.c:1482
    tcp_v4_rcv+0x2d10/0x2f80 net/ipv4/tcp_ipv4.c:1711
    ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
    NF_HOOK include/linux/netfilter.h:249 [inline]
    ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
    dst_input include/net/dst.h:464 [inline]
    ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
    NF_HOOK include/linux/netfilter.h:249 [inline]
    ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
    __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
    __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
    netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
    netif_receive_skb+0xae/0x390 net/core/dev.c:4611
    tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
    tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
    tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
    call_write_iter include/linux/fs.h:1770 [inline]
    new_sync_write fs/read_write.c:468 [inline]
    __vfs_write+0x68a/0x970 fs/read_write.c:481
    vfs_write+0x18f/0x510 fs/read_write.c:543
    SYSC_write fs/read_write.c:588 [inline]
    SyS_write+0xef/0x220 fs/read_write.c:580
    entry_SYSCALL_64_fastpath+0x1f/0xbe

    Freed by task 3306:
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
    save_stack+0x43/0xd0 mm/kasan/kasan.c:447
    set_track mm/kasan/kasan.c:459 [inline]
    kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
    __cache_free mm/slab.c:3503 [inline]
    kfree+0xca/0x250 mm/slab.c:3820
    inet_sock_destruct+0x59d/0x950 net/ipv4/af_inet.c:157
    __sk_destruct+0xfd/0x910 net/core/sock.c:1560
    sk_destruct+0x47/0x80 net/core/sock.c:1595
    __sk_free+0x57/0x230 net/core/sock.c:1603
    sk_free+0x2a/0x40 net/core/sock.c:1614
    sock_put include/net/sock.h:1652 [inline]
    inet_csk_complete_hashdance+0xd5/0xf0 net/ipv4/inet_connection_sock.c:959
    tcp_check_req+0xf4d/0x1620 net/ipv4/tcp_minisocks.c:765
    tcp_v4_rcv+0x17f6/0x2f80 net/ipv4/tcp_ipv4.c:1675
    ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
    NF_HOOK include/linux/netfilter.h:249 [inline]
    ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
    dst_input include/net/dst.h:464 [inline]
    ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
    NF_HOOK include/linux/netfilter.h:249 [inline]
    ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
    __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
    __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
    netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
    netif_receive_skb+0xae/0x390 net/core/dev.c:4611
    tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
    tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
    tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
    call_write_iter include/linux/fs.h:1770 [inline]
    new_sync_write fs/read_write.c:468 [inline]
    __vfs_write+0x68a/0x970 fs/read_write.c:481
    vfs_write+0x18f/0x510 fs/read_write.c:543
    SYSC_write fs/read_write.c:588 [inline]
    SyS_write+0xef/0x220 fs/read_write.c:580
    entry_SYSCALL_64_fastpath+0x1f/0xbe

    Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets")
    Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table")
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

01 Sep, 2017

1 commit


22 Aug, 2017

1 commit


17 Aug, 2017

1 commit

  • syzkaller team reported another problem in DCCP [1]

    Problem here is that the structure holding RTO timer
    (ccid2_hc_tx_rto_expire() handler) is freed too soon.

    We cannot use del_timer_sync() to cancel the timer,
    since this timer wants to grab the socket lock (that would risk a deadlock).

    The solution is to defer freeing the memory until all references to
    the socket have been released. Socket timers do own a reference, so this
    should fix the issue.

    [1]

    ==================================================================
    BUG: KASAN: use-after-free in ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
    Read of size 4 at addr ffff8801d2660540 by task kworker/u4:7/3365

    CPU: 1 PID: 3365 Comm: kworker/u4:7 Not tainted 4.13.0-rc4+ #3
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Workqueue: events_unbound call_usermodehelper_exec_work
    Call Trace:

    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:52
    print_address_description+0x73/0x250 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x24e/0x340 mm/kasan/report.c:409
    __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
    ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
    call_timer_fn+0x233/0x830 kernel/time/timer.c:1268
    expire_timers kernel/time/timer.c:1307 [inline]
    __run_timers+0x7fd/0xb90 kernel/time/timer.c:1601
    run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
    __do_softirq+0x2f5/0xba3 kernel/softirq.c:284
    invoke_softirq kernel/softirq.c:364 [inline]
    irq_exit+0x1cc/0x200 kernel/softirq.c:405
    exiting_irq arch/x86/include/asm/apic.h:638 [inline]
    smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:1044
    apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:702
    RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:824 [inline]
    RIP: 0010:__raw_write_unlock_irq include/linux/rwlock_api_smp.h:267 [inline]
    RIP: 0010:_raw_write_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:343
    RSP: 0018:ffff8801cd50eaa8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
    RAX: dffffc0000000000 RBX: ffffffff85a090c0 RCX: 0000000000000006
    RDX: 1ffffffff0b595f3 RSI: 1ffff1003962f989 RDI: ffffffff85acaf98
    RBP: ffff8801cd50eab0 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801cc96ea60
    R13: dffffc0000000000 R14: ffff8801cc96e4c0 R15: ffff8801cc96e4c0

    release_task+0xe9e/0x1a40 kernel/exit.c:220
    wait_task_zombie kernel/exit.c:1162 [inline]
    wait_consider_task+0x29b8/0x33c0 kernel/exit.c:1389
    do_wait_thread kernel/exit.c:1452 [inline]
    do_wait+0x441/0xa90 kernel/exit.c:1523
    kernel_wait4+0x1f5/0x370 kernel/exit.c:1665
    SYSC_wait4+0x134/0x140 kernel/exit.c:1677
    SyS_wait4+0x2c/0x40 kernel/exit.c:1673
    call_usermodehelper_exec_sync kernel/kmod.c:286 [inline]
    call_usermodehelper_exec_work+0x1a0/0x2c0 kernel/kmod.c:323
    process_one_work+0xbf3/0x1bc0 kernel/workqueue.c:2097
    worker_thread+0x223/0x1860 kernel/workqueue.c:2231
    kthread+0x35e/0x430 kernel/kthread.c:231
    ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:425

    Allocated by task 21267:
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
    save_stack+0x43/0xd0 mm/kasan/kasan.c:447
    set_track mm/kasan/kasan.c:459 [inline]
    kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
    kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
    kmem_cache_alloc+0x127/0x750 mm/slab.c:3561
    ccid_new+0x20e/0x390 net/dccp/ccid.c:151
    dccp_hdlr_ccid+0x27/0x140 net/dccp/feat.c:44
    __dccp_feat_activate+0x142/0x2a0 net/dccp/feat.c:344
    dccp_feat_activate_values+0x34e/0xa90 net/dccp/feat.c:1538
    dccp_rcv_request_sent_state_process net/dccp/input.c:472 [inline]
    dccp_rcv_state_process+0xed1/0x1620 net/dccp/input.c:677
    dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
    sk_backlog_rcv include/net/sock.h:911 [inline]
    __release_sock+0x124/0x360 net/core/sock.c:2269
    release_sock+0xa4/0x2a0 net/core/sock.c:2784
    inet_wait_for_connect net/ipv4/af_inet.c:557 [inline]
    __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643
    inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682
    SYSC_connect+0x204/0x470 net/socket.c:1642
    SyS_connect+0x24/0x30 net/socket.c:1623
    entry_SYSCALL_64_fastpath+0x1f/0xbe

    Freed by task 3049:
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
    save_stack+0x43/0xd0 mm/kasan/kasan.c:447
    set_track mm/kasan/kasan.c:459 [inline]
    kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
    __cache_free mm/slab.c:3503 [inline]
    kmem_cache_free+0x77/0x280 mm/slab.c:3763
    ccid_hc_tx_delete+0xc5/0x100 net/dccp/ccid.c:190
    dccp_destroy_sock+0x1d1/0x2b0 net/dccp/proto.c:225
    inet_csk_destroy_sock+0x166/0x3f0 net/ipv4/inet_connection_sock.c:833
    dccp_done+0xb7/0xd0 net/dccp/proto.c:145
    dccp_time_wait+0x13d/0x300 net/dccp/minisocks.c:72
    dccp_rcv_reset+0x1d1/0x5b0 net/dccp/input.c:160
    dccp_rcv_state_process+0x8fc/0x1620 net/dccp/input.c:663
    dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
    sk_backlog_rcv include/net/sock.h:911 [inline]
    __sk_receive_skb+0x33e/0xc00 net/core/sock.c:521
    dccp_v4_rcv+0xef1/0x1c00 net/dccp/ipv4.c:871
    ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
    NF_HOOK include/linux/netfilter.h:248 [inline]
    ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257
    dst_input include/net/dst.h:477 [inline]
    ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397
    NF_HOOK include/linux/netfilter.h:248 [inline]
    ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488
    __netif_receive_skb_core+0x19af/0x33d0 net/core/dev.c:4417
    __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4455
    process_backlog+0x203/0x740 net/core/dev.c:5130
    napi_poll net/core/dev.c:5527 [inline]
    net_rx_action+0x792/0x1910 net/core/dev.c:5593
    __do_softirq+0x2f5/0xba3 kernel/softirq.c:284

    The buggy address belongs to the object at ffff8801d2660100
    which belongs to the cache ccid2_hc_tx_sock of size 1240
    The buggy address is located 1088 bytes inside of
    1240-byte region [ffff8801d2660100, ffff8801d26605d8)
    The buggy address belongs to the page:
    page:ffffea0007499800 count:1 mapcount:0 mapping:ffff8801d2660100 index:0x0 compound_mapcount: 0
    flags: 0x200000000008100(slab|head)
    raw: 0200000000008100 ffff8801d2660100 0000000000000000 0000000100000005
    raw: ffffea00075271a0 ffffea0007538820 ffff8801d3aef9c0 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff8801d2660400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ffff8801d2660480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff8801d2660500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ^
    ffff8801d2660580: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
    ffff8801d2660600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ==================================================================

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Cc: Gerrit Renker
    Signed-off-by: David S. Miller

    Eric Dumazet
     

16 Aug, 2017

1 commit


15 Aug, 2017

1 commit

  • syzkaller reported that DCCP could have a non empty
    write queue at dismantle time.

    WARNING: CPU: 1 PID: 2953 at net/core/stream.c:199 sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 1 PID: 2953 Comm: syz-executor0 Not tainted 4.13.0-rc4+ #2
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:52
    panic+0x1e4/0x417 kernel/panic.c:180
    __warn+0x1c4/0x1d9 kernel/panic.c:541
    report_bug+0x211/0x2d0 lib/bug.c:183
    fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
    do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
    do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
    do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310
    do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
    invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:846
    RIP: 0010:sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
    RSP: 0018:ffff8801d182f108 EFLAGS: 00010297
    RAX: ffff8801d1144140 RBX: ffff8801d13cb280 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffffffff85137b00 RDI: ffff8801d13cb280
    RBP: ffff8801d182f148 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d13cb4d0
    R13: ffff8801d13cb3b8 R14: ffff8801d13cb300 R15: ffff8801d13cb3b8
    inet_csk_destroy_sock+0x175/0x3f0 net/ipv4/inet_connection_sock.c:835
    dccp_close+0x84d/0xc10 net/dccp/proto.c:1067
    inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
    sock_release+0x8d/0x1e0 net/socket.c:597
    sock_close+0x16/0x20 net/socket.c:1126
    __fput+0x327/0x7e0 fs/file_table.c:210
    ____fput+0x15/0x20 fs/file_table.c:246
    task_work_run+0x18a/0x260 kernel/task_work.c:116
    exit_task_work include/linux/task_work.h:21 [inline]
    do_exit+0xa32/0x1b10 kernel/exit.c:865
    do_group_exit+0x149/0x400 kernel/exit.c:969
    get_signal+0x7e8/0x17e0 kernel/signal.c:2330
    do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
    exit_to_usermode_loop+0x21c/0x2d0 arch/x86/entry/common.c:157
    prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
    syscall_return_slowpath+0x3a7/0x450 arch/x86/entry/common.c:263

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Aug, 2017

2 commits

  • Add a second device index, sdif, to inet6 socket lookups. sdif is the
    index for ingress devices enslaved to an l3mdev. It allows the lookups
    to consider the enslaved device as well as the L3 domain when searching
    for a socket.

    TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the
    ingress index is obtained from IPCB using inet_sdif and after tcp_v4_rcv
    tcp_v4_sdif is used.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Add a second device index, sdif, to inet socket lookups. sdif is the
    index for ingress devices enslaved to an l3mdev. It allows the lookups
    to consider the enslaved device as well as the L3 domain when searching
    for a socket.

    TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the
    ingress index is obtained from IPCB using inet_sdif and after the cb move
    in tcp_v4_rcv the tcp_v4_sdif helper is used.
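
    The matching rule described above can be sketched in plain C. This is a
    minimal illustration only: dev_match_sketch() is a hypothetical helper, not
    the kernel's lookup code, and it assumes that a binding of 0 means the
    socket is not bound to any device.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical match helper (not a kernel function).
 * dif:  device index the lookup is performed with (the L3 domain).
 * sdif: second device index, the enslaved ingress device.
 * A socket bound to a device accepts the packet if its binding equals
 * either index. An unbound socket (bound_dev_if == 0) accepts any device.
 */
static bool dev_match_sketch(int bound_dev_if, int dif, int sdif)
{
    return !bound_dev_if || bound_dev_if == dif || bound_dev_if == sdif;
}
```

    With dif = 3 and sdif = 5, a socket bound to device 5 matches through the
    second index, while a socket bound to device 4 does not match at all.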

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

27 Jul, 2017

3 commits

  • In dccp_feat_init, when ccid_get_builtin_ccids fails to alloc
    memory for rx.val, it should free tx.val before returning an
    error.
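
    A userspace sketch of the cleanup pattern this fix describes. All names
    here (feat_val_sketch, feat_init_sketch, the fail_second_alloc switch) are
    hypothetical stand-ins for illustration, not the code in dccp_feat_init.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a feature value list; not the kernel struct. */
struct feat_val_sketch {
    unsigned char *vec;
    size_t len;
};

/* Test switch (assumption): when nonzero, the second allocation fails. */
static int fail_second_alloc;

static unsigned char *alloc_list_sketch(size_t len, int is_second)
{
    if (is_second && fail_second_alloc)
        return NULL;
    return calloc(len, 1);
}

/*
 * Allocate tx, then rx. If the rx allocation fails, release tx before
 * returning the error so the first buffer is not leaked.
 * Returns 0 on success, -1 on allocation failure.
 */
static int feat_init_sketch(struct feat_val_sketch *tx,
                            struct feat_val_sketch *rx, size_t len)
{
    assert(tx && rx);

    tx->vec = alloc_list_sketch(len, 0);
    if (!tx->vec)
        return -1;
    tx->len = len;

    rx->vec = alloc_list_sketch(len, 1);
    if (!rx->vec) {
        free(tx->vec);          /* the step the fix adds */
        tx->vec = NULL;
        tx->len = 0;
        return -1;
    }
    rx->len = len;
    return 0;
}
```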

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • The patch "dccp: fix a memleak that dccp_ipv6 doesn't put reqsk
    properly" fixed reqsk refcnt leak for dccp_ipv6. The same issue
    exists on dccp_ipv4.

    This patch is to fix it for dccp_ipv4.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • In dccp_v6_conn_request, after reqsk gets alloced and hashed into
    ehash table, reqsk's refcnt is set to 3: one is for req->rsk_timer,
    one is for the hlist, and the other one is for the current user.

    The problem is when dccp_v6_conn_request returns and finishes using
    reqsk, it doesn't put reqsk. This will cause reqsk refcnt leaks and
    reqsk obj never gets freed.

    Jianlin found this issue when running dccp_memleak.c in a loop: the
    system would eventually run out of memory.

    dccp_memleak.c:
    int s1 = socket(PF_INET6, 6, IPPROTO_IP);
    bind(s1, &sa1, 0x20);
    listen(s1, 0x9);
    int s2 = socket(PF_INET6, 6, IPPROTO_IP);
    connect(s2, &sa1, 0x20);
    close(s1);
    close(s2);

    This patch is to put the reqsk before dccp_v6_conn_request returns,
    just as tcp_conn_request does.
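
    The reference accounting can be modelled in a few lines of userspace C.
    This is a sketch under stated assumptions: req_sketch and its helpers are
    hypothetical, the counter is a plain int rather than the kernel's type,
    and "freed" is a flag so that the outcome can be observed.

```c
#include <assert.h>

/* Hypothetical request object with a plain reference count. */
struct req_sketch {
    int refcnt;
    int freed;   /* set instead of calling free(), so the state is visible */
};

static void req_put_sketch(struct req_sketch *r)
{
    assert(r->refcnt > 0);
    if (--r->refcnt == 0)
        r->freed = 1;
}

/*
 * Model of a conn_request handler: the request starts with three
 * references (timer, hash list, current caller). The caller must drop
 * its own reference before returning; put_on_return == 0 models the bug.
 */
static void conn_request_sketch(struct req_sketch *r, int put_on_return)
{
    r->refcnt = 3;
    r->freed = 0;
    /* ... the request would be used here ... */
    if (put_on_return)
        req_put_sketch(r);
}

/* Later teardown drops the timer and hash-list references. */
static void teardown_sketch(struct req_sketch *r)
{
    req_put_sketch(r);  /* timer */
    req_put_sketch(r);  /* hash list */
}
```

    Without the final put, teardown leaves one reference behind and the object
    is never released, which matches the leak described above.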

    Reported-by: Jianlin Shi
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

14 Jul, 2017

1 commit

  • Don't populate array error_code on the stack but make it static. Makes
    the object code smaller by almost 250 bytes:

    Before:
    text data bss dec hex filename
    10366 983 0 11349 2c55 net/dccp/input.o

    After:
    text data bss dec hex filename
    10161 1039 0 11200 2bc0 net/dccp/input.o
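
    A short sketch of the technique, assuming a small lookup table. The
    function name and the table values are placeholders and do not reproduce
    the error_code array in net/dccp/input.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * A table declared "static const" is built at compile time and placed
 * in read-only data. Without "static", a compiler typically emits code
 * that fills the array on the stack on every call.
 * The values below are placeholders for illustration.
 */
static int code_to_errno_sketch(unsigned int code)
{
    static const int table[] = { 0, 104, 111, 32 };

    if (code >= sizeof(table) / sizeof(table[0]))
        return -1;              /* out of range */
    return table[code];
}
```

    This is consistent with the size output above, where text shrinks while
    data grows slightly.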

    Signed-off-by: Colin Ian King
    Signed-off-by: David S. Miller

    Colin Ian King
     

01 Jul, 2017

1 commit

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.
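
    To illustrate why a dedicated counter type helps, here is a userspace
    model of a counter that saturates instead of wrapping. It is only a sketch
    in the spirit of refcount_t: the struct and function names are
    hypothetical and the real kernel implementation differs in detail.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace model; not the kernel's refcount_t. */
struct refcount_sketch {
    unsigned int refs;
};

/* Increment, but stick at UINT_MAX instead of wrapping around to 0. */
static void refcount_inc_sketch(struct refcount_sketch *r)
{
    if (r->refs != UINT_MAX)
        r->refs++;
}

/*
 * Decrement and report whether the object may be freed.
 * A saturated counter is never decremented, so the object is leaked
 * rather than freed while references may still exist.
 */
static int refcount_dec_and_test_sketch(struct refcount_sketch *r)
{
    assert(r->refs > 0);
    if (r->refs == UINT_MAX)
        return 0;
    return --r->refs == 0;
}
```

    A plain wrapping counter would reach 0 after an overflow and allow a free
    while users remain. The saturating model trades that for a leak.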

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

21 Jun, 2017

2 commits

  • Patch "call inet_add_protocol after register_pernet_subsys in dccp_v4_init"
    fixed a null pointer dereference issue for dccp_ipv4 module.

    The same fix is needed for dccp_ipv6 module.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • Now dccp_ipv4 works as a kernel module. While this module is loading, if
    a dccp packet is received after inet_add_protocol but before
    register_pernet_subsys, in which v4_ctl_sk is initialized, a null pointer
    dereference may be triggered because init_net.dccp.v4_ctl_sk is still NULL.

    Jianlin found this issue when the following call trace occurred:

    [ 171.950177] BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
    [ 171.951007] IP: [] dccp_v4_ctl_send_reset+0xc4/0x220 [dccp_ipv4]
    [...]
    [ 171.984629] Call Trace:
    [ 171.984859]
    [ 171.985061]
    [ 171.985213] [] dccp_v4_rcv+0x383/0x3f9 [dccp_ipv4]
    [ 171.985711] [] ip_local_deliver_finish+0xb4/0x1f0
    [ 171.986309] [] ip_local_deliver+0x59/0xd0
    [ 171.986852] [] ? update_curr+0x104/0x190
    [ 171.986956] [] ip_rcv_finish+0x8a/0x350
    [ 171.986956] [] ip_rcv+0x2b6/0x410
    [ 171.986956] [] ? task_cputime+0x44/0x80
    [ 171.986956] [] __netif_receive_skb_core+0x572/0x7c0
    [ 171.986956] [] ? trigger_load_balance+0x61/0x1e0
    [ 171.986956] [] __netif_receive_skb+0x18/0x60
    [ 171.986956] [] process_backlog+0xae/0x180
    [ 171.986956] [] net_rx_action+0x16d/0x380
    [ 171.986956] [] __do_softirq+0xef/0x280
    [ 171.986956] [] call_softirq+0x1c/0x30

    This patch is to move inet_add_protocol after register_pernet_subsys in
    dccp_v4_init, so that v4_ctl_sk is initialized before any incoming dccp
    packets are processed.
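
    The ordering rule can be shown with a small userspace model. Every name
    below is hypothetical. The sketch only demonstrates that state used by the
    receive handler is set up before the handler becomes reachable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module state; names are illustrative only. */
static int ctl_sk_storage;
static int *ctl_sk;            /* NULL until per-net setup has run */
static int protocol_registered;

static int pernet_setup_sketch(void)
{
    ctl_sk = &ctl_sk_storage;
    return 0;
}

static int add_protocol_sketch(void)
{
    protocol_registered = 1;   /* packets may arrive from here on */
    return 0;
}

/* Receive path: only reachable once the protocol handler is registered. */
static int rcv_sketch(void)
{
    if (!protocol_registered)
        return -1;             /* no handler yet, packet never reaches us */
    assert(ctl_sk != NULL);    /* this is where the bug dereferenced NULL */
    return *ctl_sk;
}

/* Fixed order: initialise what the handler needs, then register it. */
static int module_init_sketch(void)
{
    int err = pernet_setup_sketch();

    if (err)
        return err;
    return add_protocol_sketch();
}
```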

    Reported-by: Jianlin Shi
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

16 Jun, 2017

1 commit

  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions return void * and remove all the casts across
    the tree, adding a (u8 *) cast only where the unsigned char pointer
    was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    @@
    expression SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - fn(SKB, LEN)[0]
    + *(u8 *)fn(SKB, LEN)

    Note that the last part there converts from push(...)[0] to the
    more idiomatic *(u8 *)push(...).
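
    The effect of the return type change can be seen in a miniature userspace
    buffer. buf_sketch, hdr_sketch and buf_push_sketch() are hypothetical and
    far simpler than struct sk_buff. They only illustrate how a void * return
    removes the need for casts at most call sites.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature buffer; not struct sk_buff. */
struct buf_sketch {
    unsigned char storage[64];
    unsigned char *data;   /* start of the current payload */
};

struct hdr_sketch {
    unsigned char type;
    unsigned char len;
};

static void buf_init_sketch(struct buf_sketch *b)
{
    b->data = b->storage + sizeof(b->storage);  /* empty, all headroom */
}

/* Returning void * lets callers assign to any header pointer directly. */
static void *buf_push_sketch(struct buf_sketch *b, size_t len)
{
    assert((size_t)(b->data - b->storage) >= len);
    b->data -= len;
    return b->data;
}
```

    A caller can then write "struct hdr_sketch *h = buf_push_sketch(&b,
    sizeof(*h));" with no cast, and needs an explicit (unsigned char *) cast
    only when the returned pointer is dereferenced directly.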

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

18 May, 2017

1 commit


16 May, 2017

1 commit

  • Pull networking fixes from David Miller:

    1) Track alignment in BPF verifier so that legitimate programs won't be
    rejected on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS architectures.

    2) Make tail calls work properly in arm64 BPF JIT, from Daniel
    Borkmann.

    3) Make the configuration and semantics of Generic XDP make more sense and
    don't allow both generic XDP and a driver specific instance to be
    active at the same time. Also from Daniel.

    4) Don't crash on resume in xen-netfront, from Vitaly Kuznetsov.

    5) Fix use-after-free in VRF driver, from Gao Feng.

    6) Use netdev_alloc_skb_ip_align() to avoid unaligned IP headers in
    qca_spi driver, from Stefan Wahren.

    7) Always run cleanup routines in BPF samples when we get SIGTERM, from
    Andy Gospodarek.

    8) The mdio phy code should bring PHYs out of reset using the shared
    GPIO lines before invoking bus->reset(). From Florian Fainelli.

    9) Some USB descriptor access endian fixes in various drivers from
    Johan Hovold.

    10) Handle PAUSE advertisements properly in mlx5 driver, from Gal
    Pressman.

    11) Fix reversed test in mlx5e_setup_tc(), from Saeed Mahameed.

    12) Cure netdev leak in AF_PACKET when using timestamping via control
    messages. From Douglas Caetano dos Santos.

    13) netcp doesn't support HWTSTAMP_FILTER_ALL, reject it. From Miroslav
    Lichvar.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
    ldmvsw: stop the clean timer at beginning of remove
    ldmvsw: unregistering netdev before disable hardware
    net: netcp: fix check of requested timestamping filter
    ipv6: avoid dad-failures for addresses with NODAD
    qed: Fix uninitialized data in aRFS infrastructure
    mdio: mux: fix device_node_continue.cocci warnings
    net/packet: fix missing net_device reference release
    net/mlx4_core: Use min3 to select number of MSI-X vectors
    macvlan: Fix performance issues with vlan tagged packets
    net: stmmac: use correct pointer when printing normal descriptor ring
    net/mlx5: Use underlay QPN from the root name space
    net/mlx5e: IPoIB, Only support regular RQ for now
    net/mlx5e: Fix setup TC ndo
    net/mlx5e: Fix ethtool pause support and advertise reporting
    net/mlx5e: Use the correct pause values for ethtool advertising
    vmxnet3: ensure that adapter is in proper state during force_close
    sfc: revert changes to NIC revision numbers
    net: ch9200: add missing USB-descriptor endianness conversions
    net: irda: irda-usb: fix firmware name on big-endian hosts
    net: dsa: mv88e6xxx: add default case to switch
    ...

    Linus Torvalds
     

12 May, 2017

1 commit


23 Apr, 2017

1 commit


19 Apr, 2017

1 commit

  • A group of Linux kernel hackers reported chasing a bug that resulted
    from their assumption that SLAB_DESTROY_BY_RCU provided an existence
    guarantee, that is, that no block from such a slab would be reallocated
    during an RCU read-side critical section. Of course, that is not the
    case. Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire
    slab of blocks.

    However, there is a phrase for this, namely "type safety". This commit
    therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order
    to avoid future instances of this sort of confusion.
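
    The difference between type safety and an existence guarantee can be
    sketched as a lookup that re-validates the object after taking a
    reference. This is a single-threaded model with hypothetical names. The
    "racer" callback is an assumption used to simulate a concurrent free and
    reallocation, and the plain int counter stands in for an atomic one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object living in a type-safe slab slot. */
struct obj_sketch {
    int refcnt;   /* 0 means the slot is currently free */
    int key;      /* identity of the object occupying the slot */
};

/* Simulates the slot being freed and reallocated for a different key. */
static void reuse_slot_sketch(struct obj_sketch *slot)
{
    slot->key = 99;
    slot->refcnt = 1;
}

/*
 * The slot always holds an obj_sketch (type safety), but not necessarily
 * the same object (no existence guarantee). So the identity is checked
 * again after the reference has been taken.
 */
static struct obj_sketch *lookup_sketch(struct obj_sketch *slot, int key,
                                        void (*racer)(struct obj_sketch *))
{
    if (slot->key != key)
        return NULL;
    if (racer)
        racer(slot);            /* simulated concurrent free + realloc */
    if (slot->refcnt == 0)
        return NULL;            /* object is being freed */
    slot->refcnt++;
    if (slot->key != key) {     /* slot was reused: undo and fail */
        slot->refcnt--;
        return NULL;
    }
    return slot;
}
```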

    Signed-off-by: Paul E. McKenney
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrew Morton
    Cc:
    Acked-by: Johannes Weiner
    Acked-by: Vlastimil Babka
    [ paulmck: Add comments mentioning the old name, as requested by Eric
    Dumazet, in order to help people familiar with the old name find
    the new one. ]
    Acked-by: David Rientjes

    Paul E. McKenney
     

14 Mar, 2017

2 commits

  • This patch fixes a memory leak, which happens if the connection request
    is not fulfilled between parsing the DCCP options and handling the SYN
    (because e.g. the backlog is full), because we forgot to free the
    list of ack vectors.

    Reported-by: Jianwen Ji
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • As Eric Dumazet pointed out this also needs to be fixed in IPv6.
    v2: Contains the IPv6 tcp/IPv6 dccp patches as well.

    We have seen a few incidents lately where a dst_entry has been freed
    with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
    dst_entry. If the conditions/timings are right a crash then ensues when the
    freed dst_entry is referenced later on. A common crashing back trace is:

    #8 [] page_fault at ffffffff8163e648
    [exception RIP: __tcp_ack_snd_check+74]
    .
    .
    #9 [] tcp_rcv_established at ffffffff81580b64
    #10 [] tcp_v4_do_rcv at ffffffff8158b54a
    #11 [] tcp_v4_rcv at ffffffff8158cd02
    #12 [] ip_local_deliver_finish at ffffffff815668f4
    #13 [] ip_local_deliver at ffffffff81566bd9
    #14 [] ip_rcv_finish at ffffffff8156656d
    #15 [] ip_rcv at ffffffff81566f06
    #16 [] __netif_receive_skb_core at ffffffff8152b3a2
    #17 [] __netif_receive_skb at ffffffff8152b608
    #18 [] netif_receive_skb at ffffffff8152b690
    #19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
    #20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
    #21 [] net_rx_action at ffffffff8152bac2
    #22 [] __do_softirq at ffffffff81084b4f
    #23 [] call_softirq at ffffffff8164845c
    #24 [] do_softirq at ffffffff81016fc5
    #25 [] irq_exit at ffffffff81084ee5
    #26 [] do_IRQ at ffffffff81648ff8

    Of course it may happen with other NIC drivers as well.

    The freed dst_entry is dereferenced here:

    static bool tcp_in_quickack_mode(struct sock *sk)
    {
            const struct inet_connection_sock *icsk = inet_csk(sk);
            const struct dst_entry *dst = __sk_dst_get(sk);

            return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
                   (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
    }

    But there are other backtraces attributed to the same freed dst_entry in
    netfilter code as well.

    All the vmcores showed 2 significant clues:

    - Remote hosts behind the default gateway had always been redirected to a
    different gateway. A rtable/dst_entry is added for each such host, which
    creates more dst_entries with lower reference counts and makes the race
    more probable.

    - All vmcores showed a positive LockDroppedIcmps value, e.g.:

    LockDroppedIcmps 267

    A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
    regardless of whether user space has the socket locked. This can result in a
    race condition where the reference count of the same dst_entry cached in
    sk->sk_dst_cache can be decremented twice for the same socket via:

    do_redirect()->__sk_dst_check()-> dst_release().

    Which leads to the dst_entry being prematurely freed with another socket
    pointing to it via sk->sk_dst_cache and a subsequent crash.

    To fix this, skip do_redirect() if user space has the socket locked. Instead,
    let the redirect take place later when user space does not have the socket
    locked.
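
    The check described above can be modelled in user space roughly as
    follows. This is only a sketch under stated assumptions, not the kernel
    patch: every model_* name is hypothetical, owned_by_user stands in for the
    "user space has the socket locked" test, and the single refcnt field stands
    in for the dst_entry reference count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these types exist in the kernel. */
struct model_dst {
    int refcnt;                  /* stands in for the dst_entry refcount */
};

struct model_sock {
    bool owned_by_user;          /* true while user space holds the socket lock */
    struct model_dst *dst_cache; /* stands in for sk->sk_dst_cache */
};

/* Drop one reference; a second drop on the same reference is the bug. */
static void model_dst_release(struct model_dst *dst)
{
    assert(dst->refcnt > 0);
    dst->refcnt--;
}

/* Models do_redirect() -> __sk_dst_check() -> dst_release(). */
static void model_do_redirect(struct model_sock *sk)
{
    if (sk->dst_cache) {
        model_dst_release(sk->dst_cache);
        sk->dst_cache = NULL;
    }
}

/*
 * Models the error-handler side: if user space owns the socket, skip the
 * redirect so the cached route is not released behind its back.
 * Returns true when the redirect was applied.
 */
static bool model_handle_redirect(struct model_sock *sk)
{
    if (sk->owned_by_user)
        return false;            /* skipped; a later redirect can be applied */
    model_do_redirect(sk);
    return true;
}
```

    In this model a redirect that arrives while owned_by_user is set leaves
    both dst_cache and refcnt untouched; one that arrives afterwards releases
    the cached entry exactly once.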

    The dccp/IPv6 code is very similar in this respect, so fix it there too.

    As Eric Garver pointed out, the following commit now invalidates routes,
    which can set the dst->obsolete flag so that ipv4_dst_check() returns NULL
    and triggers the dst_release().

    Fixes: ceb3320610d6 ("ipv4: Kill routes during PMTU/redirect updates.")
    Cc: Eric Garver
    Cc: Hannes Sowa
    Signed-off-by: Jon Maxwell
    Signed-off-by: David S. Miller

    Jon Maxwell
     

08 Mar, 2017

1 commit

  • Dmitry reported crashes in DCCP stack [1]

    The problem here is that when I got rid of the listener spinlock, I missed
    the fact that DCCP stores a complex state in struct dccp_request_sock,
    while TCP does not.

    Since multiple cpus could access it at the same time, we need to add
    protection.
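
    As a rough illustration of what "protection" means for per-request state
    that several CPUs can reach, here is a user-space sketch. It is an
    assumption-laden model, not the patch itself: the model_* names are
    hypothetical, and a pthread mutex is used here only as a stand-in for
    whatever locking primitive the kernel fix actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for one negotiated feature entry. */
struct model_feat {
    int value;
    struct model_feat *next;
};

/* Hypothetical stand-in for the per-request state shared between CPUs. */
struct model_request {
    pthread_mutex_t lock;       /* assumption: guards the list below */
    struct model_feat *featneg; /* pending feature entries */
};

static void model_request_init(struct model_request *req)
{
    pthread_mutex_init(&req->lock, NULL);
    req->featneg = NULL;
}

/* Add one entry under the lock; returns false on allocation failure. */
static bool model_feat_push(struct model_request *req, int value)
{
    struct model_feat *f = malloc(sizeof(*f));

    if (!f)
        return false;
    f->value = value;
    pthread_mutex_lock(&req->lock);
    f->next = req->featneg;
    req->featneg = f;
    pthread_mutex_unlock(&req->lock);
    return true;
}

/*
 * Pop and free every pending entry while holding the lock, so two callers
 * can never walk and free the same entry. Returns the number consumed.
 */
static int model_feat_activate_all(struct model_request *req)
{
    int n = 0;

    pthread_mutex_lock(&req->lock);
    while (req->featneg) {
        struct model_feat *f = req->featneg;

        req->featneg = f->next;
        free(f);
        n++;
    }
    pthread_mutex_unlock(&req->lock);
    return n;
}
```

    With the list only ever touched under the lock, a second caller of
    model_feat_activate_all() finds it empty instead of reading entries the
    first caller has already freed.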

    [1]
    BUG: KASAN: use-after-free in dccp_feat_activate_values+0x967/0xab0
    net/dccp/feat.c:1541 at addr ffff88003713be68
    Read of size 8 by task syz-executor2/8457
    CPU: 2 PID: 8457 Comm: syz-executor2 Not tainted 4.10.0-rc7+ #127
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:15 [inline]
    dump_stack+0x292/0x398 lib/dump_stack.c:51
    kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
    print_address_description mm/kasan/report.c:200 [inline]
    kasan_report_error mm/kasan/report.c:289 [inline]
    kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
    kasan_report mm/kasan/report.c:332 [inline]
    __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
    dccp_feat_activate_values+0x967/0xab0 net/dccp/feat.c:1541
    dccp_create_openreq_child+0x464/0x610 net/dccp/minisocks.c:121
    dccp_v6_request_recv_sock+0x1f6/0x1960 net/dccp/ipv6.c:457
    dccp_check_req+0x335/0x5a0 net/dccp/minisocks.c:186
    dccp_v6_rcv+0x69e/0x1d00 net/dccp/ipv6.c:711
    ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
    NF_HOOK include/linux/netfilter.h:257 [inline]
    ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
    dst_input include/net/dst.h:507 [inline]
    ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
    NF_HOOK include/linux/netfilter.h:257 [inline]
    ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
    __netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
    __netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
    process_backlog+0xe5/0x6c0 net/core/dev.c:4839
    napi_poll net/core/dev.c:5202 [inline]
    net_rx_action+0xe70/0x1900 net/core/dev.c:5267
    __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
    do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902

    do_softirq.part.17+0x1e8/0x230 kernel/softirq.c:328
    do_softirq kernel/softirq.c:176 [inline]
    __local_bh_enable_ip+0x1f2/0x200 kernel/softirq.c:181
    local_bh_enable include/linux/bottom_half.h:31 [inline]
    rcu_read_unlock_bh include/linux/rcupdate.h:971 [inline]
    ip6_finish_output2+0xbb0/0x23d0 net/ipv6/ip6_output.c:123
    ip6_finish_output+0x302/0x960 net/ipv6/ip6_output.c:148
    NF_HOOK_COND include/linux/netfilter.h:246 [inline]
    ip6_output+0x1cb/0x8d0 net/ipv6/ip6_output.c:162
    ip6_xmit+0xcdf/0x20d0 include/net/dst.h:501
    inet6_csk_xmit+0x320/0x5f0 net/ipv6/inet6_connection_sock.c:179
    dccp_transmit_skb+0xb09/0x1120 net/dccp/output.c:141
    dccp_xmit_packet+0x215/0x760 net/dccp/output.c:280
    dccp_write_xmit+0x168/0x1d0 net/dccp/output.c:362
    dccp_sendmsg+0x79c/0xb10 net/dccp/proto.c:796
    inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
    sock_sendmsg_nosec net/socket.c:635 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:645
    SYSC_sendto+0x660/0x810 net/socket.c:1687
    SyS_sendto+0x40/0x50 net/socket.c:1655
    entry_SYSCALL_64_fastpath+0x1f/0xc2
    RIP: 0033:0x4458b9
    RSP: 002b:00007f8ceb77bb58 EFLAGS: 00000282 ORIG_RAX: 000000000000002c
    RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 00000000004458b9
    RDX: 0000000000000023 RSI: 0000000020e60000 RDI: 0000000000000017
    RBP: 00000000006e1b90 R08: 00000000200f9fe1 R09: 0000000000000020
    R10: 0000000000008010 R11: 0000000000000282 R12: 00000000007080a8
    R13: 0000000000000000 R14: 00007f8ceb77c9c0 R15: 00007f8ceb77c700
    Object at ffff88003713be50, in cache kmalloc-64 size: 64
    Allocated:
    PID = 8446
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
    save_stack+0x43/0xd0 mm/kasan/kasan.c:502
    set_track mm/kasan/kasan.c:514 [inline]
    kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
    kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2738
    kmalloc include/linux/slab.h:490 [inline]
    dccp_feat_entry_new+0x214/0x410 net/dccp/feat.c:467
    dccp_feat_push_change+0x38/0x220 net/dccp/feat.c:487
    __feat_register_sp+0x223/0x2f0 net/dccp/feat.c:741
    dccp_feat_propagate_ccid+0x22b/0x2b0 net/dccp/feat.c:949
    dccp_feat_server_ccid_dependencies+0x1b3/0x250 net/dccp/feat.c:1012
    dccp_make_response+0x1f1/0xc90 net/dccp/output.c:423
    dccp_v6_send_response+0x4ec/0xc20 net/dccp/ipv6.c:217
    dccp_v6_conn_request+0xaba/0x11b0 net/dccp/ipv6.c:377
    dccp_rcv_state_process+0x51e/0x1650 net/dccp/input.c:606
    dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
    sk_backlog_rcv include/net/sock.h:893 [inline]
    __sk_receive_skb+0x36f/0xcc0 net/core/sock.c:479
    dccp_v6_rcv+0xba5/0x1d00 net/dccp/ipv6.c:742
    ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
    NF_HOOK include/linux/netfilter.h:257 [inline]
    ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
    dst_input include/net/dst.h:507 [inline]
    ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
    NF_HOOK include/linux/netfilter.h:257 [inline]
    ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
    __netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
    __netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
    process_backlog+0xe5/0x6c0 net/core/dev.c:4839
    napi_poll net/core/dev.c:5202 [inline]
    net_rx_action+0xe70/0x1900 net/core/dev.c:5267
    __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
    Freed:
    PID = 15
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
    save_stack+0x43/0xd0 mm/kasan/kasan.c:502
    set_track mm/kasan/kasan.c:514 [inline]
    kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
    slab_free_hook mm/slub.c:1355 [inline]
    slab_free_freelist_hook mm/slub.c:1377 [inline]
    slab_free mm/slub.c:2954 [inline]
    kfree+0xe8/0x2b0 mm/slub.c:3874
    dccp_feat_entry_destructor.part.4+0x48/0x60 net/dccp/feat.c:418
    dccp_feat_entry_destructor net/dccp/feat.c:416 [inline]
    dccp_feat_list_pop net/dccp/feat.c:541 [inline]
    dccp_feat_activate_values+0x57f/0xab0 net/dccp/feat.c:1543
    dccp_create_openreq_child+0x464/0x610 net/dccp/minisocks.c:121
    dccp_v6_request_recv_sock+0x1f6/0x1960 net/dccp/ipv6.c:457
    dccp_check_req+0x335/0x5a0 net/dccp/minisocks.c:186
    dccp_v6_rcv+0x69e/0x1d00 net/dccp/ipv6.c:711
    ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
    NF_HOOK include/linux/netfilter.h:257 [inline]
    ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
    dst_input include/net/dst.h:507 [inline]
    ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
    NF_HOOK include/linux/netfilter.h:257 [inline]
    ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
    __netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
    __netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
    process_backlog+0xe5/0x6c0 net/core/dev.c:4839
    napi_poll net/core/dev.c:5202 [inline]
    net_rx_action+0xe70/0x1900 net/core/dev.c:5267
    __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
    Memory state around the buggy address:
    ffff88003713bd00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff88003713bd80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    >ffff88003713be00: fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb fb
    ^

    Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table")
    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Tested-by: Dmitry Vyukov
    Signed-off-by: David S. Miller

    Eric Dumazet
     

05 Mar, 2017

1 commit

  • Pull networking fixes from David Miller:

    1) Fix double-free in batman-adv, from Sven Eckelmann.

    2) Fix packet stats for fast-RX path, from Johannes Berg.

    3) Netfilter's ip_route_me_harder() doesn't handle request sockets
    properly, fix from Florian Westphal.

    4) Fix sendmsg deadlock in rxrpc, from David Howells.

    5) Add missing RCU locking to transport hashtable scan, from Xin Long.

    6) Fix potential packet loss in mlxsw driver, from Ido Schimmel.

    7) Fix race in NAPI handling between poll handlers and busy polling,
    from Eric Dumazet.

    8) TX paths in vxlan and geneve need proper RCU locking, from Jakub
    Kicinski.

    9) SYN processing in DCCP and TCP needs to disable BH, from Eric
    Dumazet.

    10) Properly handle net_enable_timestamp() being invoked from IRQ
    context, also from Eric Dumazet.

    11) Fix crash on device-tree systems in xgene driver, from Alban Bedel.

    12) Do not call sk_free() on a locked socket, from Arnaldo Carvalho de
    Melo.

    13) Fix use-after-free in netvsc driver, from Dexuan Cui.

    14) Fix max MTU setting in bonding driver, from WANG Cong.

    15) xen-netback hash table can be allocated from softirq context, so use
    GFP_ATOMIC. From Anoob Soman.

    16) Fix MAC address change bug in bgmac driver, from Hari Vyas.

    17) strparser needs to destroy strp_wq on module exit, from WANG Cong.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
    strparser: destroy workqueue on module exit
    sfc: fix IPID endianness in TSOv2
    sfc: avoid max() in array size
    rds: remove unnecessary returned value check
    rxrpc: Fix potential NULL-pointer exception
    nfp: correct DMA direction in XDP DMA sync
    nfp: don't tell FW about the reserved buffer space
    net: ethernet: bgmac: mac address change bug
    net: ethernet: bgmac: init sequence bug
    xen-netback: don't vfree() queues under spinlock
    xen-netback: keep a local pointer for vif in backend_disconnect()
    netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() fails
    netfilter: nft_set_rbtree: incorrect assumption on lower interval lookups
    netfilter: nf_conntrack_sip: fix wrong memory initialisation
    can: flexcan: fix typo in comment
    can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer
    can: gs_usb: fix coding style
    can: gs_usb: Don't use stack memory for USB transfers
    ixgbe: Limit use of 2K buffers on architectures with 256B or larger cache lines
    ixgbe: update the rss key on h/w, when ethtool ask for it
    ...

    Linus Torvalds