18 Oct, 2018

1 commit

  • [ Upstream commit 92ef12b32feab8f277b69e9fb89ede2796777f4d ]

    In the case of implicit connect message with data > 1K, the flow
    control accounting is incorrect. At this state, the socket does not
    know the peer node's capability and falls back to legacy flow control
    by returning 1; however, the receiver of this message will perform the
    new block accounting. This leads to slack and eventually to traffic
    disturbance.

    In this commit, we perform tipc_node_get_capabilities() at implicit
    connect and perform accounting based on the peer's capability.

    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Parthasarathy Bhuvaragan
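
The accounting mismatch described in the commit above can be illustrated with a small user-space model. This is only a sketch: the capability bit, the function name, and the 1024-byte block size (picked to match the "> 1K" threshold in the message) are assumptions, not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical capability bit and block size, for illustration only. */
#define PEER_CAP_BLOCK_FLOWCTL 0x1u
#define FLOWCTL_BLOCK_SIZE     1024

/* Flow control units charged for one message of msglen bytes. */
static unsigned int flow_units(unsigned int peer_caps, int msglen)
{
    assert(msglen >= 0);
    if (peer_caps & PEER_CAP_BLOCK_FLOWCTL)
        return (unsigned int)(msglen / FLOWCTL_BLOCK_SIZE) + 1; /* block accounting */
    return 1; /* legacy accounting: one unit per message */
}
```

In this model a sender that does not yet know the peer's capability charges 1 unit for a 3000-byte message while a block-accounting receiver charges 3, so the two sides drift apart. For messages below the block size both schemes charge 1, which is consistent with the problem only showing up for data > 1K.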
     

15 Sep, 2018

1 commit

  • [ Upstream commit bd583fe30427500a2d0abe25724025b1cb5e2636 ]

    rhashtable_walk_exit() must be paired with rhashtable_walk_enter().

    Fixes: 40f9f4397060 ("tipc: Fix tipc_sk_reinit race conditions")
    Cc: Herbert Xu
    Cc: Ying Xue
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     

21 Jun, 2018

3 commits

  • [ Upstream commit 94f6a80c0c11828cb7b3d79294459dd8d761ca89 ]

    When we get link properties through netlink interface with
    tipc_nl_node_get_link(), we don't validate TIPC_NLA_LINK_NAME
    attribute at all, instead we directly use it. As a consequence,
    KMSAN detected the TIPC_NLA_LINK_NAME attribute was an uninitialized
    value, and then posted the following complaint:

    ==================================================================
    BUG: KMSAN: uninit-value in strcmp+0xf7/0x160 lib/string.c:329
    CPU: 1 PID: 4527 Comm: syz-executor655 Not tainted 4.16.0+ #87
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x185/0x1d0 lib/dump_stack.c:53
    kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
    __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
    strcmp+0xf7/0x160 lib/string.c:329
    tipc_nl_node_get_link+0x220/0x6f0 net/tipc/node.c:1881
    genl_family_rcv_msg net/netlink/genetlink.c:599 [inline]
    genl_rcv_msg+0x1686/0x1810 net/netlink/genetlink.c:624
    netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2447
    genl_rcv+0x63/0x80 net/netlink/genetlink.c:635
    netlink_unicast_kernel net/netlink/af_netlink.c:1311 [inline]
    netlink_unicast+0x166b/0x1740 net/netlink/af_netlink.c:1337
    netlink_sendmsg+0x1048/0x1310 net/netlink/af_netlink.c:1900
    sock_sendmsg_nosec net/socket.c:630 [inline]
    sock_sendmsg net/socket.c:640 [inline]
    ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
    __sys_sendmsg net/socket.c:2080 [inline]
    SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
    SyS_sendmsg+0x54/0x80 net/socket.c:2087
    do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x445589
    RSP: 002b:00007fb7ee66cdb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00000000006dac24 RCX: 0000000000445589
    RDX: 0000000000000000 RSI: 0000000020023000 RDI: 0000000000000003
    RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007fffa2bf3f3f R14: 00007fb7ee66d9c0 R15: 0000000000000001

    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
    kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
    kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
    kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
    slab_post_alloc_hook mm/slab.h:445 [inline]
    slab_alloc_node mm/slub.c:2737 [inline]
    __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
    __kmalloc_reserve net/core/skbuff.c:138 [inline]
    __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
    alloc_skb include/linux/skbuff.h:984 [inline]
    netlink_alloc_large_skb net/netlink/af_netlink.c:1183 [inline]
    netlink_sendmsg+0x9a6/0x1310 net/netlink/af_netlink.c:1875
    sock_sendmsg_nosec net/socket.c:630 [inline]
    sock_sendmsg net/socket.c:640 [inline]
    ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
    __sys_sendmsg net/socket.c:2080 [inline]
    SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
    SyS_sendmsg+0x54/0x80 net/socket.c:2087
    do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    ==================================================================

    To quiet the complaint, TIPC_NLA_LINK_NAME attribute has been
    validated in tipc_nl_node_get_link() before it's used.

    Reported-by: syzbot+df0257c92ffd4fcc58cd@syzkaller.appspotmail.com
    Signed-off-by: Ying Xue
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Ying Xue
     
  • [ Upstream commit 7dbc73e6124ce4d0cfbdd6166de388e9367c47ad ]

    Commit 36a50a989ee8 ("tipc: fix infinite loop when dumping link monitor
    summary") intended to fix a problem with user tool looping when max
    number of bearers are enabled.

    Unfortunately, the wrong version of the commit was posted, so the
    problem was not solved at all.

    This commit adds the missing part.

    Fixes: 36a50a989ee8 ("tipc: fix infinite loop when dumping link monitor summary")
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jon Maloy
     
  • [ Upstream commit 36a50a989ee8267588de520b8704b85f045a3220 ]

    When configuring the number of used bearers to MAX_BEARER and issuing
    command "tipc link monitor summary", the command enters an infinite loop
    in user space.

    This issue happens because function tipc_nl_node_dump_monitor() returns
    the wrong 'prev_bearer' value when all potential monitors have been
    scanned.

    The correct behavior is to always try to scan all monitors until either
    the netlink message is full, in which case we return the bearer identity
    of the affected monitor, or we continue through the whole bearer array
    until we can return MAX_BEARERS. This solution also caters for the case
    where there may be gaps in the bearer array.

    Signed-off-by: Tung Nguyen
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Tung Nguyen
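
The resume logic described in the commit above can be sketched in user space as follows. This is a simplified model, not the kernel function: the array of present monitors, the capacity counter standing in for "netlink message is full", and the small MAX_BEARERS value are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BEARERS 3 /* assumed small value for the example */

/*
 * Scan monitors starting at prev_bearer. Returns the bearer id to resume
 * from when the (modelled) message is full, or MAX_BEARERS when the whole
 * array has been scanned.
 */
static int dump_monitors(const bool present[MAX_BEARERS], int prev_bearer,
                         int capacity, int *emitted)
{
    int i;

    assert(prev_bearer >= 0 && capacity >= 0);
    *emitted = 0;
    for (i = prev_bearer; i < MAX_BEARERS; i++) {
        if (!present[i])
            continue;       /* gap in the bearer array */
        if (*emitted == capacity)
            return i;       /* message full: resume at this monitor */
        (*emitted)++;
    }
    return MAX_BEARERS;     /* everything scanned: the caller can stop */
}
```

A caller keeps invoking the function with the returned value until it gets MAX_BEARERS back. Because the end-of-scan value is returned regardless of how many slots are populated, the loop terminates even when every bearer slot is in use.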
     

29 Apr, 2018

1 commit

  • [ Upstream commit ec518f21cb1a1b1f8a516499ea05c60299e04963 ]

    Before syzbot/KMSAN bites, add the missing policy for TIPC_NLA_NET_ADDR

    Fixes: 27c21416727a ("tipc: add net set to new netlink api")
    Signed-off-by: Eric Dumazet
    Cc: Jon Maloy
    Cc: Ying Xue
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

03 Mar, 2018

2 commits

  • [ Upstream commit 642a8439ddd8423b92f2e71960afe21ee1f66bb6 ]

    Calling tipc_mon_delete() before the monitor has been created will oops.
    This can happen in tipc_enable_bearer() error path if tipc_disc_create()
    fails.

    [ 48.589074] BUG: unable to handle kernel paging request at 0000000000001008
    [ 48.590266] IP: tipc_mon_delete+0xea/0x270 [tipc]
    [ 48.591223] PGD 1e60c5067 P4D 1e60c5067 PUD 1eb0cf067 PMD 0
    [ 48.592230] Oops: 0000 [#1] SMP KASAN
    [ 48.595610] CPU: 5 PID: 1199 Comm: tipc Tainted: G B 4.15.0-rc4-pc64-dirty #5
    [ 48.597176] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
    [ 48.598489] RIP: 0010:tipc_mon_delete+0xea/0x270 [tipc]
    [ 48.599347] RSP: 0018:ffff8801d827f668 EFLAGS: 00010282
    [ 48.600705] RAX: ffff8801ee813f00 RBX: 0000000000000204 RCX: 0000000000000000
    [ 48.602183] RDX: 1ffffffff1de6a75 RSI: 0000000000000297 RDI: 0000000000000297
    [ 48.604373] RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1dd1533
    [ 48.605607] R10: ffffffff8eafbb05 R11: fffffbfff1dd1534 R12: 0000000000000050
    [ 48.607082] R13: dead000000000200 R14: ffffffff8e73f310 R15: 0000000000001020
    [ 48.608228] FS: 00007fc686484800(0000) GS:ffff8801f5540000(0000) knlGS:0000000000000000
    [ 48.610189] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 48.611459] CR2: 0000000000001008 CR3: 00000001dda70002 CR4: 00000000003606e0
    [ 48.612759] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 48.613831] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 48.615038] Call Trace:
    [ 48.615635] tipc_enable_bearer+0x415/0x5e0 [tipc]
    [ 48.620623] tipc_nl_bearer_enable+0x1ab/0x200 [tipc]
    [ 48.625118] genl_family_rcv_msg+0x36b/0x570
    [ 48.631233] genl_rcv_msg+0x5a/0xa0
    [ 48.631867] netlink_rcv_skb+0x1cc/0x220
    [ 48.636373] genl_rcv+0x24/0x40
    [ 48.637306] netlink_unicast+0x29c/0x350
    [ 48.639664] netlink_sendmsg+0x439/0x590
    [ 48.642014] SYSC_sendto+0x199/0x250
    [ 48.649912] do_syscall_64+0xfd/0x2c0
    [ 48.650651] entry_SYSCALL64_slow_path+0x25/0x25
    [ 48.651843] RIP: 0033:0x7fc6859848e3
    [ 48.652539] RSP: 002b:00007ffd25dff938 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    [ 48.654003] RAX: ffffffffffffffda RBX: 00007ffd25dff990 RCX: 00007fc6859848e3
    [ 48.655303] RDX: 0000000000000054 RSI: 00007ffd25dff990 RDI: 0000000000000003
    [ 48.656512] RBP: 00007ffd25dff980 R08: 00007fc685c35fc0 R09: 000000000000000c
    [ 48.657697] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000d13010
    [ 48.658840] R13: 00007ffd25e009c0 R14: 0000000000000000 R15: 0000000000000000
    [ 48.662972] RIP: tipc_mon_delete+0xea/0x270 [tipc] RSP: ffff8801d827f668
    [ 48.664073] CR2: 0000000000001008
    [ 48.664576] ---[ end trace e811818d54d5ce88 ]---

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tommi Rantala
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Tommi Rantala
     
  • [ Upstream commit 19142551b2be4a9e13838099fde1351386e5e007 ]

    Fix a memory leak in tipc_enable_bearer() if enable_media() fails, and
    clean up with bearer_disable() if tipc_mon_create() fails.

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tommi Rantala
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Tommi Rantala
     

31 Jan, 2018

1 commit

  • [ Upstream commit 59b36613e85fb16ebf9feaf914570879cd5c2a21 ]

    When tipc_node_find_by_name() fails, the nlmsg is not
    freed.

    While on it, switch to a goto label to properly
    free it.

    Fixes: be9c086715c ("tipc: narrow down exposure of struct tipc_node")
    Reported-by: Dmitry Vyukov
    Cc: Jon Maloy
    Cc: Ying Xue
    Signed-off-by: Cong Wang
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     

03 Jan, 2018

1 commit

  • [ Upstream commit 517d7c79bdb39864e617960504bdc1aa560c75c6 ]

    In commit 42b531de17d2f6 ("tipc: Fix missing connection request
    handling"), we replaced unconditional wakeup() with conditional
    wakeup for clients with flags POLLIN | POLLRDNORM | POLLRDBAND.

    This breaks the applications which do a connect followed by poll
    with POLLOUT flag. These applications are not woken when the
    connection is ESTABLISHED and hence sleep forever.

    In this commit, we fix it by including the POLLOUT event for
    sockets in TIPC_CONNECTING state.

    Fixes: 42b531de17d2f6 ("tipc: Fix missing connection request handling")
    Acked-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Parthasarathy Bhuvaragan
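
The effect of a conditional wakeup on a waiter that only asked for writability can be modelled as below. This is a sketch under assumptions: the event bits and both helper functions are hypothetical stand-ins, not the poll flags or kernel functions named in the commit above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event bits standing in for the read and write poll flags. */
#define EV_IN  0x1u
#define EV_OUT 0x4u

/* Events signalled by a wakeup; 'connecting' marks a socket still connecting. */
static unsigned int wake_key(bool connecting)
{
    unsigned int key = EV_IN;

    if (connecting)
        key |= EV_OUT; /* also wake waiters interested in writability */
    return key;
}

/* A conditional wakeup only reaches waiters whose events overlap the key. */
static bool waiter_woken(unsigned int waiting_for, unsigned int key)
{
    assert(waiting_for != 0);
    return (waiting_for & key) != 0;
}
```

With a read-only key, a waiter registered for EV_OUT is never woken, which corresponds to the hang described above. Including the write event for connecting sockets lets that waiter run once the connection completes.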
     

17 Dec, 2017

2 commits

  • [ Upstream commit c7799c067c2ae33e348508c8afec354f3257ff25 ]

    Remove the second tipc_rcv() call in tipc_udp_recv(). We have just
    checked that the bearer is not up, and calling tipc_rcv() with a bearer
    that is not up leads to a TIPC div-by-zero crash in
    tipc_node_calculate_timer(). The crash is rare in practice, but can
    happen like this:

    We're enabling a bearer, but it's not yet up and fully initialized.
    At the same time we receive a discovery packet, and in tipc_udp_recv()
    we end up calling tipc_rcv() with the not-yet-initialized bearer,
    causing later the div-by-zero crash in tipc_node_calculate_timer().

    Jon Maloy explains the impact of removing the second tipc_rcv() call:
    "link setup in the worst case will be delayed until the next arriving
    discovery messages, 1 sec later, and this is an acceptable delay."

    As the tipc_rcv() call is removed, just leave the function via the
    rcu_out label, so that we will kfree_skb().

    [ 12.590450] Own node address , network identity 1
    [ 12.668088] divide error: 0000 [#1] SMP
    [ 12.676952] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.14.2-dirty #1
    [ 12.679225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
    [ 12.682095] task: ffff8c2a761edb80 task.stack: ffffa41cc0cac000
    [ 12.684087] RIP: 0010:tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
    [ 12.686486] RSP: 0018:ffff8c2a7fc838a0 EFLAGS: 00010246
    [ 12.688451] RAX: 0000000000000000 RBX: ffff8c2a5b382600 RCX: 0000000000000000
    [ 12.691197] RDX: 0000000000000000 RSI: ffff8c2a5b382600 RDI: ffff8c2a5b382600
    [ 12.693945] RBP: ffff8c2a7fc838b0 R08: 0000000000000001 R09: 0000000000000001
    [ 12.696632] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c2a5d8949d8
    [ 12.699491] R13: ffffffff95ede400 R14: 0000000000000000 R15: ffff8c2a5d894800
    [ 12.702338] FS: 0000000000000000(0000) GS:ffff8c2a7fc80000(0000) knlGS:0000000000000000
    [ 12.705099] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 12.706776] CR2: 0000000001bb9440 CR3: 00000000bd009001 CR4: 00000000003606e0
    [ 12.708847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 12.711016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 12.712627] Call Trace:
    [ 12.713390]
    [ 12.714011] tipc_node_check_dest+0x2e8/0x350 [tipc]
    [ 12.715286] tipc_disc_rcv+0x14d/0x1d0 [tipc]
    [ 12.716370] tipc_rcv+0x8b0/0xd40 [tipc]
    [ 12.717396] ? minmax_running_min+0x2f/0x60
    [ 12.718248] ? dst_alloc+0x4c/0xa0
    [ 12.718964] ? tcp_ack+0xaf1/0x10b0
    [ 12.719658] ? tipc_udp_is_known_peer+0xa0/0xa0 [tipc]
    [ 12.720634] tipc_udp_recv+0x71/0x1d0 [tipc]
    [ 12.721459] ? dst_alloc+0x4c/0xa0
    [ 12.722130] udp_queue_rcv_skb+0x264/0x490
    [ 12.722924] __udp4_lib_rcv+0x21e/0x990
    [ 12.723670] ? ip_route_input_rcu+0x2dd/0xbf0
    [ 12.724442] ? tcp_v4_rcv+0x958/0xa40
    [ 12.725039] udp_rcv+0x1a/0x20
    [ 12.725587] ip_local_deliver_finish+0x97/0x1d0
    [ 12.726323] ip_local_deliver+0xaf/0xc0
    [ 12.726959] ? ip_route_input_noref+0x19/0x20
    [ 12.727689] ip_rcv_finish+0xdd/0x3b0
    [ 12.728307] ip_rcv+0x2ac/0x360
    [ 12.728839] __netif_receive_skb_core+0x6fb/0xa90
    [ 12.729580] ? udp4_gro_receive+0x1a7/0x2c0
    [ 12.730274] __netif_receive_skb+0x1d/0x60
    [ 12.730953] ? __netif_receive_skb+0x1d/0x60
    [ 12.731637] netif_receive_skb_internal+0x37/0xd0
    [ 12.732371] napi_gro_receive+0xc7/0xf0
    [ 12.732920] receive_buf+0x3c3/0xd40
    [ 12.733441] virtnet_poll+0xb1/0x250
    [ 12.733944] net_rx_action+0x23e/0x370
    [ 12.734476] __do_softirq+0xc5/0x2f8
    [ 12.734922] irq_exit+0xfa/0x100
    [ 12.735315] do_IRQ+0x4f/0xd0
    [ 12.735680] common_interrupt+0xa2/0xa2
    [ 12.736126]
    [ 12.736416] RIP: 0010:native_safe_halt+0x6/0x10
    [ 12.736925] RSP: 0018:ffffa41cc0cafe90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff4d
    [ 12.737756] RAX: 0000000000000000 RBX: ffff8c2a761edb80 RCX: 0000000000000000
    [ 12.738504] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    [ 12.739258] RBP: ffffa41cc0cafe90 R08: 0000014b5b9795e5 R09: ffffa41cc12c7e88
    [ 12.740118] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
    [ 12.740964] R13: ffff8c2a761edb80 R14: 0000000000000000 R15: 0000000000000000
    [ 12.741831] default_idle+0x2a/0x100
    [ 12.742323] arch_cpu_idle+0xf/0x20
    [ 12.742796] default_idle_call+0x28/0x40
    [ 12.743312] do_idle+0x179/0x1f0
    [ 12.743761] cpu_startup_entry+0x1d/0x20
    [ 12.744291] start_secondary+0x112/0x120
    [ 12.744816] secondary_startup_64+0xa5/0xa5
    [ 12.745367] Code: b9 f4 01 00 00 48 89 c2 48 c1 ea 02 48 3d d3 07 00
    00 48 0f 47 d1 49 8b 0c 24 48 39 d1 76 07 49 89 14 24 48 89 d1 31 d2 48
    89 df f7 f1 89 c6 e8 81 6e ff ff 5b 41 5c 5d c3 66 90 66 2e 0f 1f
    [ 12.747527] RIP: tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc] RSP: ffff8c2a7fc838a0
    [ 12.748555] ---[ end trace 1399ab83390650fd ]---
    [ 12.749296] Kernel panic - not syncing: Fatal exception in interrupt
    [ 12.750123] Kernel Offset: 0x13200000 from 0xffffffff82000000
    (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
    [ 12.751215] Rebooting in 60 seconds..

    Fixes: c9b64d492b1f ("tipc: add replicast peer discovery")
    Signed-off-by: Tommi Rantala
    Cc: Jon Maloy
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Tommi Rantala
     
  • [ Upstream commit a7d5f107b4978e08eeab599ee7449af34d034053 ]

    When the function tipc_accept_from_sock() fails to create an instance of
    struct tipc_subscriber it omits to free the already created instance of
    struct tipc_conn before it returns.

    We fix that with this commit.

    Reported-by: David S. Miller
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jon Maloy
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

09 Oct, 2017

2 commits

  • When a bundling message is received, the function tipc_link_input()
    calls function tipc_msg_extract() to unbundle all inner messages of
    the bundling message before adding them to input queue.

    The function tipc_msg_extract() just clones all inner skb for all
    inner messages from the bundling skb. This means that the skb
    headroom of an inner message overlaps with the data part of the
    preceding message in the bundle.

    If the message in question is a name addressed message, it may be
    subject to a secondary destination lookup, and eventually be sent out
    on one of the interfaces again. But, since what is perceived as headroom
    by the device driver in reality is the last bytes of the preceding
    message in the bundle, the latter will be overwritten by the MAC
    addresses of the L2 header. If the preceding message has not yet been
    consumed by the user, it will eventually be delivered with corrupted
    contents.

    This commit fixes this by uncloning all messages passing through the
    function tipc_msg_lookup_dest(), hence ensuring that the headroom
    is always valid when the message is passed on.

    Signed-off-by: Tung Nguyen
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • We change the initialization of the skb transmit buffer queues
    in the functions tipc_bcast_xmit() and tipc_rcast_xmit() to also
    initialize their spinlocks. This is needed because we may, during
    error conditions, need to call skb_queue_purge() on those queues
    further down the stack.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

01 Oct, 2017

1 commit

  • In commit e3a77561e7d32 ("tipc: split up function tipc_msg_eval()"),
    we have updated the function tipc_msg_lookup_dest() to set the error
    codes to negative values at destination lookup failures. Thus when
    the function sets the error code to -TIPC_ERR_NO_NAME, it is inserted
    into the 4 bit error field of the message header as 0xf instead of
    TIPC_ERR_NO_NAME (1). The value 0xf is an unknown error code.

    In this commit, we set only positive error codes.

    Fixes: e3a77561e7d32 ("tipc: split up function tipc_msg_eval()")
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
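
The truncation described in the commit above is easy to reproduce in plain C. In this sketch the macro and function names are hypothetical; only the value 1 for the error code and the 4-bit field width are taken from the message.

```c
#include <assert.h>

#define ERR_NO_NAME    1    /* value given in the commit message */
#define ERR_FIELD_MASK 0xfu /* the error field is 4 bits wide */

/* Value that ends up in a 4-bit header field when 'err' is stored into it. */
static unsigned int store_err(int err)
{
    return (unsigned int)err & ERR_FIELD_MASK;
}
```

Storing -ERR_NO_NAME keeps only the low four bits of the two's complement representation, giving 0xf, whereas the positive code is stored unchanged as 1.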
     

07 Sep, 2017

1 commit


02 Sep, 2017

1 commit


30 Aug, 2017

1 commit

  • For a bond slave device as a tipc bearer, the dev represents the bond
    interface and orig_dev represents the slave in tipc_l2_rcv_msg().
    Since we decode the tipc_ptr from bonding device (dev), we fail to
    find the bearer and thus tipc links are not established.

    In this commit, we register the tipc protocol callback per device and
    look for tipc bearer from both the devices.

    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

25 Aug, 2017

4 commits

  • If we fail to find a valid bearer in tipc_node_get_linkname(),
    node_read_unlock() is called without holding the node read lock.

    This commit fixes this error.

    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • In tipc_msg_reverse(), we assign skb attributes to local pointers
    in stack at startup. This is followed by skb_linearize() and for
    cloned buffers we perform skb relocation using pskb_expand_head().
    Both of these methods may update the skb attributes, making
    the pointers incorrect.

    In this commit, we fix this error by ensuring that the pointers
    are re-assigned after any of these skb operations.

    Fixes: 29042e19f2c60 ("tipc: let function tipc_msg_reverse() expand header when needed")
    Signed-off-by: Parthasarathy Bhuvaragan
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
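
The stale-pointer hazard in the commit above has a direct user-space analogue with realloc(). The sketch below is only a model: struct buf and both functions are hypothetical, and realloc() stands in for the skb operations that may relocate the data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct buf {
    unsigned char *data;
    size_t len;
};

/* Grow the buffer; the storage may move. Returns 0 on success. */
static int buf_expand(struct buf *b, size_t extra)
{
    unsigned char *p = realloc(b->data, b->len + extra);

    if (!p)
        return -1;
    memset(p + b->len, 0, extra);
    b->data = p;
    b->len += extra;
    return 0;
}

/* Read the first header byte after growing the buffer. */
static int first_byte_after_expand(struct buf *b, size_t extra)
{
    unsigned char *hdr;

    assert(b->len > 0);
    if (buf_expand(b, extra))
        return -1;
    hdr = b->data; /* assigned only after the call that may relocate */
    return hdr[0];
}
```

Had hdr been taken from b->data before buf_expand(), it could point at freed storage afterwards. Assigning it after the call mirrors the re-assignment the commit describes.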
     
  • In tipc_rcv(), we linearize only the header and usually the packets
    are consumed as the nodes permit direct reception. However, if the
    skb contains tunnelled message due to fail over or synchronization
    we parse it in tipc_node_check_state() without performing
    linearization. This will cause link disturbances if the skb was
    non linear.

    In this commit, we perform linearization for the above messages.

    Signed-off-by: Parthasarathy Bhuvaragan
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • In commit 9dbbfb0ab6680c6a85609041011484e6658e7d3c, function
    tipc_sk_reinit() gained logic to loop whenever function
    rhashtable_walk_next() returned -EAGAIN. No worries.

    However, if rhashtable_walk_start returns -EAGAIN, it does "continue",
    and therefore skips the call to rhashtable_walk_stop(). That has
    the effect of calling rcu_read_lock() without its paired call to
    rcu_read_unlock(). Since rcu_read_lock() may be nested, the problem
    may not be apparent for a while, especially since resize events may
    be rare. But the comments to rhashtable_walk_start() state:

    * ...Note that we take the RCU lock in all
    * cases including when we return an error. So you must always call
    * rhashtable_walk_stop to clean up.

    This patch replaces the continue with a goto and label to ensure a
    matching call to rhashtable_walk_stop().

    Signed-off-by: Bob Peterson
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Bob Peterson
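
The difference between "continue" and a goto to a cleanup label can be shown with a user-space model of the pairing rule quoted above. Everything here is hypothetical: the counters and the model_* functions only imitate a start call that takes a lock even when it fails.

```c
#include <assert.h>
#include <errno.h>

static int rcu_depth;      /* models read-lock nesting depth */
static int start_failures; /* how many times the start call should fail */
static int visits;         /* counts successful walks */

/* Takes the lock in all cases, including when it returns an error. */
static int model_walk_start(void)
{
    rcu_depth++;
    if (start_failures > 0) {
        start_failures--;
        return -EAGAIN;
    }
    return 0;
}

static void model_walk_stop(void)
{
    assert(rcu_depth > 0);
    rcu_depth--;
}

/* Retry until the walk starts cleanly; returns the number of retries. */
static int reinit_all(void)
{
    int retries = 0;
    int err;

    do {
        err = model_walk_start();
        if (err == -EAGAIN) {
            retries++;
            goto stop; /* not "continue": the lock is already held */
        }
        visits++;      /* stand-in for walking the table entries */
stop:
        model_walk_stop();
    } while (err == -EAGAIN);

    return retries;
}
```

Using "continue" in place of the goto would jump straight to the loop condition and leave rcu_depth incremented once per failed start, which is the imbalance the commit removes.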
     

24 Aug, 2017

1 commit

  • genl_ops are not supposed to change at runtime. All functions
    working with genl_ops work with const genl_ops. So mark the non-const
    structs as const.

    Signed-off-by: Arvind Yadav
    Signed-off-by: David S. Miller

    Arvind Yadav
     

23 Aug, 2017

2 commits

  • No matter whether a request is inserted into workqueue as a work item
    to cancel a subscription or to delete a subscription's subscriber
    asynchronously, the work items may be executed in different workers.
    As a result, a request raised before another one is not guaranteed
    to be handled first. If the later request is executed before the
    earlier one, the error below may happen:

    [ 656.183644] BUG: spinlock bad magic on CPU#0, kworker/u8:0/12117
    [ 656.184487] general protection fault: 0000 [#1] SMP
    [ 656.185160] Modules linked in: tipc ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 9pnet virtio_net virtio_pci virtio_ring virtio [last unloaded: ip6_udp_tunnel]
    [ 656.187003] CPU: 0 PID: 12117 Comm: kworker/u8:0 Not tainted 4.11.0-rc7+ #6
    [ 656.187920] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    [ 656.188690] Workqueue: tipc_rcv tipc_recv_work [tipc]
    [ 656.189371] task: ffff88003f5cec40 task.stack: ffffc90004448000
    [ 656.190157] RIP: 0010:spin_bug+0xdd/0xf0
    [ 656.190678] RSP: 0018:ffffc9000444bcb8 EFLAGS: 00010202
    [ 656.191375] RAX: 0000000000000034 RBX: ffff88003f8d1388 RCX: 0000000000000000
    [ 656.192321] RDX: ffff88003ba13708 RSI: ffff88003ba0cd08 RDI: ffff88003ba0cd08
    [ 656.193265] RBP: ffffc9000444bcd0 R08: 0000000000000030 R09: 000000006b6b6b6b
    [ 656.194208] R10: ffff8800bde3e000 R11: 00000000000001b4 R12: 6b6b6b6b6b6b6b6b
    [ 656.195157] R13: ffffffff81a3ca64 R14: ffff88003f8d1388 R15: ffff88003f8d13a0
    [ 656.196101] FS: 0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
    [ 656.197172] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 656.197935] CR2: 00007f0b3d2e6000 CR3: 000000003ef9e000 CR4: 00000000000006f0
    [ 656.198873] Call Trace:
    [ 656.199210] do_raw_spin_lock+0x66/0xa0
    [ 656.199735] _raw_spin_lock_bh+0x19/0x20
    [ 656.200258] tipc_subscrb_subscrp_delete+0x28/0xf0 [tipc]
    [ 656.200990] tipc_subscrb_rcv_cb+0x45/0x260 [tipc]
    [ 656.201632] tipc_receive_from_sock+0xaf/0x100 [tipc]
    [ 656.202299] tipc_recv_work+0x2b/0x60 [tipc]
    [ 656.202872] process_one_work+0x157/0x420
    [ 656.203404] worker_thread+0x69/0x4c0
    [ 656.203898] kthread+0x138/0x170
    [ 656.204328] ? process_one_work+0x420/0x420
    [ 656.204889] ? kthread_create_on_node+0x40/0x40
    [ 656.205527] ret_from_fork+0x29/0x40
    [ 656.206012] Code: 48 8b 0c 25 00 c5 00 00 48 c7 c7 f0 24 a3 81 48 81 c1 f0 05 00 00 65 8b 15 61 ef f5 7e e8 9a 4c 09 00 4d 85 e4 44 8b 4b 08 74 92 8b 84 24 40 04 00 00 49 8d 8c 24 f0 05 00 00 eb 8d 90 0f 1f
    [ 656.208504] RIP: spin_bug+0xdd/0xf0 RSP: ffffc9000444bcb8
    [ 656.209798] ---[ end trace e2a800e6eb0770be ]---

    In the above scenario, the request to delete the subscriber was
    performed before the request to cancel a subscription, although the
    latter was issued first; that is, tipc_subscrb_delete() was called
    before tipc_subscrp_cancel(). As a result, when
    tipc_subscrb_subscrp_delete(), called by tipc_subscrp_cancel(), was
    executed to cancel a subscription, the refcnt of the subscription's
    subscriber had already been decreased to 1. tipc_subscrp_delete()
    then decremented the refcnt to zero and freed the subscriber, but
    the subscriber's lock still had to be released afterwards, and the
    panic followed.

    By contrast, if we increase subscriber's refcnt before
    tipc_subscrb_subscrp_delete() is called in tipc_subscrp_cancel(),
    the panic issue can be avoided.

    Fixes: d094c4d5f5c7 ("tipc: add subscription refcount to avoid invalid delete")
    Reported-by: Parthasarathy Bhuvaragan
    Signed-off-by: Ying Xue
    Signed-off-by: David S. Miller

    Ying Xue
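
The idea of taking an extra reference around the delete can be sketched in user space. This is a model under assumptions: struct subscriber, its fields and all four functions are hypothetical, and "freed" is a flag so that the example can be inspected safely afterwards.

```c
#include <assert.h>
#include <stdbool.h>

struct subscriber {
    int refcnt;
    bool locked;
    bool freed; /* set where a real implementation would free the object */
};

static void sub_get(struct subscriber *s)
{
    assert(!s->freed);
    s->refcnt++;
}

static void sub_put(struct subscriber *s)
{
    assert(!s->freed && s->refcnt > 0);
    if (--s->refcnt == 0)
        s->freed = true;
}

/* Drops the subscription's reference while holding the subscriber lock. */
static void subscription_delete(struct subscriber *s)
{
    s->locked = true;
    sub_put(s);
    assert(!s->freed); /* unlocking a freed subscriber would be a bug */
    s->locked = false;
}

/* Holds an extra reference so the unlock above always sees a live object. */
static void subscription_cancel(struct subscriber *s)
{
    sub_get(s);
    subscription_delete(s);
    sub_put(s);
}
```

Starting from a refcnt of 1, as in the scenario above, the extra reference keeps the subscriber alive until the lock has been released; the final put then drops the count to zero.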
     
  • In commit 139bb36f754a ("tipc: advance the time of deleting
    subscription from subscriber->subscrp_list"), we delete the
    subscription from the subscriber's list and from nametable
    unconditionally. This leads to the following bug if the timer
    running tipc_subscrp_timeout() in another CPU accesses the
    subscription list after the subscription delete request.

    [39.570] general protection fault: 0000 [#1] SMP
    ::
    [39.574] task: ffffffff81c10540 task.stack: ffffffff81c00000
    [39.575] RIP: 0010:tipc_subscrp_timeout+0x32/0x80 [tipc]
    [39.576] RSP: 0018:ffff88003ba03e90 EFLAGS: 00010282
    [39.576] RAX: dead000000000200 RBX: ffff88003f0f3600 RCX: 0000000000000101
    [39.577] RDX: dead000000000100 RSI: 0000000000000201 RDI: ffff88003f0d7948
    [39.578] RBP: ffff88003ba03ea0 R08: 0000000000000001 R09: ffff88003ba03ef8
    [39.579] R10: 000000000000014f R11: 0000000000000000 R12: ffff88003f0d7948
    [39.580] R13: ffff88003f0f3618 R14: ffffffffa006c250 R15: ffff88003f0f3600
    [39.581] FS: 0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
    [39.582] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [39.583] CR2: 00007f831c6e0714 CR3: 000000003d3b0000 CR4: 00000000000006f0
    [39.584] Call Trace:
    [39.584]
    [39.585] call_timer_fn+0x3d/0x180
    [39.585] ? tipc_subscrb_rcv_cb+0x260/0x260 [tipc]
    [39.586] run_timer_softirq+0x168/0x1f0
    [39.586] ? sched_clock_cpu+0x16/0xc0
    [39.587] __do_softirq+0x9b/0x2de
    [39.587] irq_exit+0x60/0x70
    [39.588] smp_apic_timer_interrupt+0x3d/0x50
    [39.588] apic_timer_interrupt+0x86/0x90
    [39.589] RIP: 0010:default_idle+0x20/0xf0
    [39.589] RSP: 0018:ffffffff81c03e58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
    [39.590] RAX: 0000000000000000 RBX: ffffffff81c10540 RCX: 0000000000000000
    [39.591] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    [39.592] RBP: ffffffff81c03e68 R08: 0000000000000000 R09: 0000000000000000
    [39.593] R10: ffffc90001cbbe00 R11: 0000000000000000 R12: 0000000000000000
    [39.594] R13: ffffffff81c10540 R14: 0000000000000000 R15: 0000000000000000
    [39.595]
    ::
    [39.603] RIP: tipc_subscrp_timeout+0x32/0x80 [tipc] RSP: ffff88003ba03e90
    [39.604] ---[ end trace 79ce94b7216cb459 ]---

    Fixes: 139bb36f754a ("tipc: advance the time of deleting subscription from subscriber->subscrp_list")
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

22 Aug, 2017

2 commits

  • David S. Miller
     
  • When the broadcast send link after 100 attempts has failed to
    transfer a packet to all peers, we consider it stale, and reset
    it. Thereafter it needs to re-synchronize with the peers, something
    currently done by just resetting and re-establishing all links to
    all peers. This has turned out to be overkill, with potentially
    unwanted consequences for the remaining cluster.

    A closer analysis reveals that this can be done much more simply. When
    this kind of failure happens, for reasons that may lie outside the
    TIPC protocol, it is typically only one peer which is failing to
    receive and acknowledge packets. It is hence sufficient to identify
    and reset the links only to that peer to resolve the situation, without
    having to reset the broadcast link at all. This solution entails a much
    lower risk of negative consequences for the node itself as well as for
    the overall cluster.

    We implement this change in this commit.

    Reviewed-by: Parthasarathy Bhuvaragan
    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

19 Aug, 2017

1 commit

  • syzkaller reported use-after-free in tipc [1]

    When msg->rep skb is freed, set the pointer to NULL,
    so that caller does not free it again.

    [1]

    ==================================================================
    BUG: KASAN: use-after-free in skb_push+0xd4/0xe0 net/core/skbuff.c:1466
    Read of size 8 at addr ffff8801c6e71e90 by task syz-executor5/4115

    CPU: 1 PID: 4115 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #32
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:52
    print_address_description+0x73/0x250 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x24e/0x340 mm/kasan/report.c:409
    __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
    skb_push+0xd4/0xe0 net/core/skbuff.c:1466
    tipc_nl_compat_recv+0x833/0x18f0 net/tipc/netlink_compat.c:1209
    genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
    genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
    netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
    genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
    netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
    netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
    netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
    sock_sendmsg_nosec net/socket.c:633 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:643
    sock_write_iter+0x31a/0x5d0 net/socket.c:898
    call_write_iter include/linux/fs.h:1743 [inline]
    new_sync_write fs/read_write.c:457 [inline]
    __vfs_write+0x684/0x970 fs/read_write.c:470
    vfs_write+0x189/0x510 fs/read_write.c:518
    SYSC_write fs/read_write.c:565 [inline]
    SyS_write+0xef/0x220 fs/read_write.c:557
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x4512e9
    RSP: 002b:00007f3bc8184c08 EFLAGS: 00000216 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 00000000004512e9
    RDX: 0000000000000020 RSI: 0000000020fdb000 RDI: 0000000000000006
    RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004b5e76
    R13: 00007f3bc8184b48 R14: 00000000004b5e86 R15: 0000000000000000

    Allocated by task 4115:
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
    save_stack+0x43/0xd0 mm/kasan/kasan.c:447
    set_track mm/kasan/kasan.c:459 [inline]
    kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
    kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
    kmem_cache_alloc_node+0x13d/0x750 mm/slab.c:3651
    __alloc_skb+0xf1/0x740 net/core/skbuff.c:219
    alloc_skb include/linux/skbuff.h:903 [inline]
    tipc_tlv_alloc+0x26/0xb0 net/tipc/netlink_compat.c:148
    tipc_nl_compat_dumpit+0xf2/0x3c0 net/tipc/netlink_compat.c:248
    tipc_nl_compat_handle net/tipc/netlink_compat.c:1130 [inline]
    tipc_nl_compat_recv+0x756/0x18f0 net/tipc/netlink_compat.c:1199
    genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
    genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
    netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
    genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
    netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
    netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
    netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
    sock_sendmsg_nosec net/socket.c:633 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:643
    sock_write_iter+0x31a/0x5d0 net/socket.c:898
    call_write_iter include/linux/fs.h:1743 [inline]
    new_sync_write fs/read_write.c:457 [inline]
    __vfs_write+0x684/0x970 fs/read_write.c:470
    vfs_write+0x189/0x510 fs/read_write.c:518
    SYSC_write fs/read_write.c:565 [inline]
    SyS_write+0xef/0x220 fs/read_write.c:557
    entry_SYSCALL_64_fastpath+0x1f/0xbe

    Freed by task 4115:
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
    save_stack+0x43/0xd0 mm/kasan/kasan.c:447
    set_track mm/kasan/kasan.c:459 [inline]
    kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
    __cache_free mm/slab.c:3503 [inline]
    kmem_cache_free+0x77/0x280 mm/slab.c:3763
    kfree_skbmem+0x1a1/0x1d0 net/core/skbuff.c:622
    __kfree_skb net/core/skbuff.c:682 [inline]
    kfree_skb+0x165/0x4c0 net/core/skbuff.c:699
    tipc_nl_compat_dumpit+0x36a/0x3c0 net/tipc/netlink_compat.c:260
    tipc_nl_compat_handle net/tipc/netlink_compat.c:1130 [inline]
    tipc_nl_compat_recv+0x756/0x18f0 net/tipc/netlink_compat.c:1199
    genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
    genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
    netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
    genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
    netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
    netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
    netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
    sock_sendmsg_nosec net/socket.c:633 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:643
    sock_write_iter+0x31a/0x5d0 net/socket.c:898
    call_write_iter include/linux/fs.h:1743 [inline]
    new_sync_write fs/read_write.c:457 [inline]
    __vfs_write+0x684/0x970 fs/read_write.c:470
    vfs_write+0x189/0x510 fs/read_write.c:518
    SYSC_write fs/read_write.c:565 [inline]
    SyS_write+0xef/0x220 fs/read_write.c:557
    entry_SYSCALL_64_fastpath+0x1f/0xbe

    The buggy address belongs to the object at ffff8801c6e71dc0
    which belongs to the cache skbuff_head_cache of size 224
    The buggy address is located 208 bytes inside of
    224-byte region [ffff8801c6e71dc0, ffff8801c6e71ea0)
    The buggy address belongs to the page:
    page:ffffea00071b9c40 count:1 mapcount:0 mapping:ffff8801c6e71000 index:0x0
    flags: 0x200000000000100(slab)
    raw: 0200000000000100 ffff8801c6e71000 0000000000000000 000000010000000c
    raw: ffffea0007224a20 ffff8801d98caf48 ffff8801d9e79040 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff8801c6e71d80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
    ffff8801c6e71e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff8801c6e71e80: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
    ^
    ffff8801c6e71f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff8801c6e71f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ==================================================================
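
    The fix described at the top of this message follows a common ownership
    pattern, which can be sketched in userspace C as below. This is only an
    illustration under stated assumptions: struct compat_msg, dump_reply() and
    release_reply() are hypothetical names, and malloc()/free() stand in for
    the skb allocation and kfree_skb() calls seen in the trace.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the compat message; only 'rep' matters here. */
struct compat_msg {
    char *rep;   /* reply buffer; NULL means "nothing to free" */
};

/* Hypothetical dump helper. On failure it frees the reply buffer and
 * clears the pointer, so the caller cannot touch or free it again. */
int dump_reply(struct compat_msg *msg, int fail)
{
    msg->rep = malloc(64);
    if (!msg->rep)
        return -1;

    if (fail) {
        free(msg->rep);
        msg->rep = NULL;   /* the essential step: no dangling pointer */
        return -1;
    }
    return 0;
}

/* Caller-side cleanup. free(NULL) is a no-op, so this is safe whether
 * or not dump_reply() already released the buffer. */
void release_reply(struct compat_msg *msg)
{
    free(msg->rep);
    msg->rep = NULL;
}
```

    With the pointer cleared on the error path, a caller that checks msg->rep
    before using it sees NULL instead of a stale address.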

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Cc: Jon Maloy
    Cc: Ying Xue
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Aug, 2017

2 commits

  • In the function msg_reverse(), we reverse the header while trying to
    reuse the original buffer whenever possible. Those rejected/returned
    messages are always transmitted as unicast, but the msg_non_seq field
    is not explicitly set to zero as it should be.

    We have seen cases where multicast senders set the message type to
    "NOT dest_droppable", meaning that a multicast message shorter than
    one MTU will be returned, e.g., during receive buffer overflow, by
    reusing the original buffer. This has the effect that even the
    'msg_non_seq' field is inadvertently inherited by the rejected message,
    although it is now sent as a unicast message. This again leads the
    receiving unicast link endpoint to steer the packet toward the broadcast
    link receive function, where it is dropped. The affected unicast link is
    thereafter (after 100 failed retransmissions) declared 'stale' and
    reset.

    We fix this by unconditionally setting the 'msg_non_seq' flag to zero
    for all rejected/returned messages.
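
    A minimal sketch of the idea, assuming a simplified header: struct hdr
    and reverse_hdr() are hypothetical and do not reflect the real TIPC
    header layout or the actual reversal routine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified message header. */
struct hdr {
    uint32_t src;
    uint32_t dst;
    bool non_seq;   /* set when the message was sent via the broadcast link */
    int errcode;
};

/* Reverse a header in place so the message can be returned to its sender. */
void reverse_hdr(struct hdr *h, int err)
{
    uint32_t tmp = h->src;

    h->src = h->dst;
    h->dst = tmp;
    h->errcode = err;

    /* A returned message always travels as unicast, so the flag must be
     * cleared unconditionally rather than inherited from the original. */
    h->non_seq = false;
}
```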

    Reported-by: Canh Duc Luu
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • On L2 bearers, the TIPC broadcast function is sending out packets using
    the corresponding L2 broadcast address. At reception, we filter such
    packets under the assumption that they will also be delivered as
    broadcast packets.

    This assumption doesn't always hold true. Under high load, we have seen
    that a switch may convert the destination address and deliver the packet
    as a PACKET_MULTICAST, something leading to inadvertently dropped
    packets and a stale and reset broadcast link.

    We fix this by extending the reception filtering to accept packets of
    type PACKET_MULTICAST.
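
    The extended filter can be pictured roughly as follows. The enum is a
    local stand-in named after the kernel's packet type constants, and
    l2_accept() is a hypothetical helper; the sketch also assumes that
    packets addressed directly to the host were already accepted.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel's packet type classification. */
enum pkt_type {
    PKT_HOST,        /* unicast to this host */
    PKT_BROADCAST,   /* L2 broadcast */
    PKT_MULTICAST,   /* L2 multicast */
    PKT_OTHERHOST    /* unicast to some other host */
};

/* Hypothetical reception filter: a broadcast that a switch re-labelled
 * as multicast on the way is now accepted instead of dropped. */
bool l2_accept(enum pkt_type t)
{
    switch (t) {
    case PKT_HOST:
    case PKT_BROADCAST:
    case PKT_MULTICAST:
        return true;
    default:
        return false;
    }
}
```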

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

10 Aug, 2017

1 commit

  • When a link between two nodes comes up, both endpoints will initially
    send out a STATE message to the peer, to increase the probability that
    the peer endpoint also is up when the first traffic message arrives.
    Thereafter, if the establishing link is the second link between two
    nodes, this first "traffic" message is a TUNNEL_PROTOCOL/SYNCH message,
    helping the peer to perform initial synchronization between the two
    links.

    However, the initial STATE message may be lost, in which case the SYNCH
    message will be the first one arriving at the peer. This should also
    work, as the SYNCH message itself will be used to take up the link
    endpoint before initializing synchronization.

    Unfortunately the code for this case is broken. Currently, the link is
    brought up through a tipc_link_fsm_evt(ESTABLISHED) when a SYNCH
    arrives, whereupon __tipc_node_link_up() is called to distribute the
    link slots and take the link into traffic. But __tipc_node_link_up()
    itself starts with a test for whether the link is up and, if so,
    returns without action. Clearly, the tipc_link_fsm_evt(ESTABLISHED) call
    is unnecessary, since tipc_node_link_up() itself issues such an
    event, but it is also harmful, since it prevents tipc_node_link_up() from
    performing the rest of its tasks, and the link endpoint in question hence
    is never taken into traffic.

    This problem has been exposed when we set up dual links between pre-
    and post-4.4 kernels, because the former ones don't send out the
    initial STATE message described above.

    We fix this by removing the unnecessary event call.
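
    The interaction can be reproduced with a toy model. Everything below is
    a hypothetical simplification: struct link, fsm_established() and
    node_link_up() only mimic the early-return behaviour described above.

```c
#include <stdbool.h>

struct link {
    bool up;           /* FSM state: endpoint is established */
    bool in_traffic;   /* link slots assigned, link carries traffic */
};

/* Stand-in for the FSM event: marks the endpoint as established. */
void fsm_established(struct link *l)
{
    l->up = true;
}

/* Stand-in for the node-level "link up" routine. */
void node_link_up(struct link *l)
{
    if (l->up)
        return;              /* already up: returns without action */

    fsm_established(l);      /* issues the event itself */
    l->in_traffic = true;    /* the remaining tasks */
}

/* Before the fix: the premature event makes node_link_up() a no-op. */
void synch_rcv_broken(struct link *l)
{
    fsm_established(l);
    node_link_up(l);
}

/* After the fix: only the node-level routine is called. */
void synch_rcv_fixed(struct link *l)
{
    node_link_up(l);
}
```

    In the broken variant the link ends up marked as up but never enters
    traffic, which matches the symptom described in this commit.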

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

01 Jul, 2017

1 commit

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    This patch uses refcount_inc_not_zero() instead of
    atomic_inc_not_zero_hint() due to the absence of a _hint()
    version of the refcount API. If the _hint() version must
    be used, we might need to revisit the API.
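
    For readers unfamiliar with the "increment unless zero" primitive, here
    is a userspace sketch using C11 atomics. ref_inc_not_zero() is a
    hypothetical name, and the sketch deliberately omits the overflow
    protection that the kernel's refcount_t adds.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the object is still alive (count != 0).
 * Returns true when the count was incremented. */
bool ref_inc_not_zero(atomic_uint *ref)
{
    unsigned int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value and the
         * loop re-checks it against zero before retrying. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;   /* count already dropped to zero: object is dying */
}
```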

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

11 Jun, 2017

1 commit

  • The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the
    function call path is:
    tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
    tipc_rcv
    tipc_sk_rcv
    tipc_msg_reverse
    pskb_expand_head(GFP_KERNEL) --> may sleep
    tipc_node_broadcast
    tipc_node_xmit_skb
    tipc_node_xmit
    tipc_sk_rcv
    tipc_msg_reverse
    pskb_expand_head(GFP_KERNEL) --> may sleep

    To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".

    Signed-off-by: Jia-Ju Bai
    Signed-off-by: David S. Miller

    Jia-Ju Bai
     

12 May, 2017

1 commit

  • The macro tipc_wait_for_cond() is embedding the macro sk_wait_event()
    to fulfil its task. The latter, in turn, is evaluating the stated
    condition outside the socket lock context. This is problematic if
    the condition is accessing non-trivial data structures which may be
    altered by incoming interrupts, as is the case with the cong_links()
    linked list, used by socket to keep track of the current set of
    congested links. We sometimes see crashes when this list is accessed
    by a condition function at the same time as a SOCK_WAKEUP interrupt
    is removing an element from the list.

    We fix this by expanding selected parts of sk_wait_event() into the
    outer macro, while ensuring that all evaluations of a given condition
    are performed under socket lock protection.
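
    The invariant can be illustrated with a single-threaded model in which
    the lock is just a flag. All names here (struct sock_sim, wait_for_cond()
    and so on) are hypothetical; the point is only that the condition is
    evaluated exclusively while the lock is held, and the lock is dropped
    only around the sleep.

```c
#include <assert.h>
#include <stdbool.h>

struct sock_sim {
    bool locked;           /* models the socket lock */
    int cong_links;        /* models the congested-links list length */
    int wakeups_pending;   /* wakeups that will arrive while sleeping */
};

void lock_sk(struct sock_sim *sk)   { assert(!sk->locked); sk->locked = true; }
void unlock_sk(struct sock_sim *sk) { assert(sk->locked);  sk->locked = false; }

/* Condition over shared state: only legal with the lock held. */
bool no_congestion(struct sock_sim *sk)
{
    assert(sk->locked);
    return sk->cong_links == 0;
}

/* Models sleeping: the lock is released, a wakeup may modify the shared
 * state, and the lock is re-taken before returning. */
void sleep_unlocked(struct sock_sim *sk)
{
    unlock_sk(sk);
    if (sk->wakeups_pending > 0 && sk->cong_links > 0) {
        sk->wakeups_pending--;
        sk->cong_links--;
    }
    lock_sk(sk);
}

/* Called with the lock held. Returns 0 once the condition holds,
 * -1 if it still fails after max_sleeps sleeps. */
int wait_for_cond(struct sock_sim *sk, int max_sleeps)
{
    while (!no_congestion(sk)) {
        if (max_sleeps-- <= 0)
            return -1;
        sleep_unlocked(sk);
    }
    return 0;
}
```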

    Fixes: commit 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
    Reviewed-by: Parthasarathy Bhuvaragan
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

03 May, 2017

3 commits

  • Pull networking updates from David Miller:
    "Here are some highlights from the 2065 networking commits that
    happened this development cycle:

    1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri)

    2) Add a generic XDP driver, so that anyone can test XDP even if they
    lack a networking device whose driver has explicit XDP support
    (me).

    3) Sparc64 now has an eBPF JIT too (me)

    4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei
    Starovoitov)

    5) Make netfilter network namespace teardown less expensive (Florian
    Westphal)

    6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana)

    7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger)

    8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky)

    9) Multiqueue support in stmmac driver (Joao Pinto)

    10) Remove TCP timewait recycling, it never really could possibly work
    well in the real world and timestamp randomization really zaps any
    hint of usability this feature had (Soheil Hassas Yeganeh)

    11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay
    Aleksandrov)

    12) Add socket busy poll support to epoll (Sridhar Samudrala)

    13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso,
    and several others)

    14) IPSEC hw offload infrastructure (Steffen Klassert)"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits)
    tipc: refactor function tipc_sk_recv_stream()
    tipc: refactor function tipc_sk_recvmsg()
    net: thunderx: Optimize page recycling for XDP
    net: thunderx: Support for XDP header adjustment
    net: thunderx: Add support for XDP_TX
    net: thunderx: Add support for XDP_DROP
    net: thunderx: Add basic XDP support
    net: thunderx: Cleanup receive buffer allocation
    net: thunderx: Optimize CQE_TX handling
    net: thunderx: Optimize RBDR descriptor handling
    net: thunderx: Support for page recycling
    ipx: call ipxitf_put() in ioctl error path
    net: sched: add helpers to handle extended actions
    qed*: Fix issues in the ptp filter config implementation.
    qede: Fix concurrency issue in PTP Tx path processing.
    stmmac: Add support for SIMATIC IOT2000 platform
    net: hns: fix ethtool_get_strings overflow in hns driver
    tcp: fix wraparound issue in tcp_lp
    bpf, arm64: fix jit branch offset related to ldimm64
    bpf, arm64: implement jiting of BPF_XADD
    ...

    Linus Torvalds
     
  • We try to make this function more readable by improving variable names
    and comments, using more stack variables, and making some smaller changes
    to the logic. We also rename the function to make it consistent with
    naming conventions used elsewhere in the code.

    Reviewed-by: Parthasarathy Bhuvaragan
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • We try to make this function more readable by improving variable names
    and comments, plus some minor changes to the logic.

    Reviewed-by: Parthasarathy Bhuvaragan
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

29 Apr, 2017

2 commits

  • When a socket is shutting down, we notify the peer node about the
    connection termination by reusing an incoming message if possible.
    If the last received message was a connection acknowledgment
    message, we reverse this message and set the error code to
    TIPC_ERR_NO_PORT and send it to the peer.

    In tipc_sk_proto_rcv(), we never check for message errors while
    processing the connection acknowledgment or probe messages. Thus
    this message performs the usual flow control accounting and leaves
    the session hanging.

    In this commit, we terminate the connection when we receive such
    error messages.
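
    A rough model of the changed behaviour, under the assumption that error
    checking happens before any flow control accounting. The types and
    proto_rcv() below are hypothetical and far simpler than the real socket
    code.

```c
#include <stdbool.h>

enum proto_type { CONN_PROBE, CONN_ACK };

struct proto_msg {
    enum proto_type type;
    int errcode;          /* non-zero when the peer rejected/returned it */
    unsigned int acked;   /* units acknowledged by a CONN_ACK */
};

struct conn {
    bool connected;
    unsigned int snt_unacked;   /* sent but not yet acknowledged */
};

/* Handle a connection protocol message. */
void proto_rcv(struct conn *c, const struct proto_msg *m)
{
    /* An error message means the peer is gone: terminate the connection
     * instead of treating it as a normal acknowledgment or probe. */
    if (m->errcode) {
        c->connected = false;
        return;
    }

    if (m->type == CONN_ACK) {
        if (m->acked > c->snt_unacked)
            c->snt_unacked = 0;
        else
            c->snt_unacked -= m->acked;
    }
}
```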

    Signed-off-by: Parthasarathy Bhuvaragan
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Until now, the checks for sockets in CONNECTING state were based on
    the assumption that the incoming message was always from the
    peer's accepted data socket.

    However, when an application using a non-blocking socket sends an
    implicit connect, this socket, which is in CONNECTING state, can receive
    error messages from the peer's listening socket. As we discard these
    messages, the application socket hangs due to inactivity.
    In addition to this, there are other places where we process errors
    but do not notify the user.

    In this commit, we process such incoming error messages and notify
    our users about them using sk_state_change().

    Signed-off-by: Parthasarathy Bhuvaragan
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan