09 Dec, 2020

1 commit

  • `tipc_node_apply_property` does a null check on a `tipc_link_entry`
    pointer but also accesses the same pointer outside the null check block.

    This triggers a warning in the Coverity static analyzer, because the
    null check implies that `e->link` can be null.

    Move the "Update MTU for node link entry" line into the if block so
    that it never runs while `e->link` is null.
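
    The shape of the change can be sketched in plain C as follows. This is
    a minimal illustration, not the kernel code: `struct toy_link`,
    `struct toy_link_entry`, `toy_link_mtu()` and `apply_mtu()` are
    hypothetical stand-ins, and only the placement of the MTU update inside
    the null check mirrors the description above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_link {
	int mtu;
};

struct toy_link_entry {
	struct toy_link *link;	/* may be NULL when no link is attached */
	int mtu;
};

/* Stand-in for reading the MTU back from a link. */
static int toy_link_mtu(const struct toy_link *l)
{
	return l->mtu;
}

/*
 * Apply a new MTU to one entry.
 * Returns 1 if the entry was updated, 0 if it has no link.
 */
static int apply_mtu(struct toy_link_entry *e, int new_mtu)
{
	if (e->link) {
		e->link->mtu = new_mtu;
		/* Update MTU for the entry, now inside the null check. */
		e->mtu = toy_link_mtu(e->link);
		return 1;
	}
	return 0;
}
```

    With the update inside the block, an entry without a link is left
    untouched instead of being dereferenced.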

    Signed-off-by: Cengiz Can
    Signed-off-by: David S. Miller

    Cengiz Can
     

02 Dec, 2020

1 commit

  • In commit 682cd3cf946b6
    ("tipc: confgiure and apply UDP bearer MTU on running links"), we
    introduced a function to change the UDP bearer MTU and applied the new
    value to each existing link. However, we did not apply the new MTU value
    at node level. As a result, packets larger than the new MTU value were
    dropped at link level.

    To fix this issue, we also apply the new MTU value at node level.

    Fixes: 682cd3cf946b6 ("tipc: confgiure and apply UDP bearer MTU on running links")
    Acked-by: Jon Maloy
    Signed-off-by: Hoang Le
    Link: https://lore.kernel.org/r/20201130025544.3602-1-hoang.h.le@dektech.com.au
    Signed-off-by: Jakub Kicinski

    Hoang Le
     

12 Nov, 2020

1 commit

  • kmemleak reports a memory leak as follows:

    unreferenced object 0xffff88810a596800 (size 512):
    comm "ip", pid 21558, jiffies 4297568990 (age 112.120s)
    hex dump (first 32 bytes):
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
    ff ff ff ff ff ff ff ff 00 83 60 b0 ff ff ff ff ..........`.....
    backtrace:
    [] tipc_topsrv_init_net+0x1f3/0xa70
    [] ops_init+0xa8/0x3c0
    [] setup_net+0x2de/0x7e0
    [] copy_net_ns+0x27d/0x530
    [] create_new_namespaces+0x382/0xa30
    [] unshare_nsproxy_namespaces+0xa1/0x1d0
    [] ksys_unshare+0x39c/0x780
    [] __x64_sys_unshare+0x2d/0x40
    [] do_syscall_64+0x56/0xa0
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    'srv' is allocated in tipc_topsrv_start() but is not freed before
    returning from the error handling paths. We need to free it.
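
    The pattern behind this kind of fix can be sketched in user-space C as
    follows. It is a minimal illustration under stated assumptions:
    `struct server`, `server_alloc()`, `create_listener()` and
    `server_start()` are hypothetical names, and `live_allocs` exists only
    so the sketch can show that nothing is left allocated after a failure.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the server object; not the kernel's struct. */
struct server {
	int listener_fd;
};

/* Counts live allocations so the sketch can show nothing is leaked. */
static int live_allocs;

static struct server *server_alloc(void)
{
	struct server *srv = calloc(1, sizeof(*srv));

	if (srv)
		live_allocs++;
	return srv;
}

static void server_free(struct server *srv)
{
	free(srv);
	live_allocs--;
}

/* Stand-in setup step that can be forced to fail. */
static int create_listener(struct server *srv, int should_fail)
{
	if (should_fail)
		return -ENOMEM;
	srv->listener_fd = 3;
	return 0;
}

/* Allocate the server, run the setup step, and unwind on failure. */
static int server_start(struct server **out, int should_fail)
{
	struct server *srv = server_alloc();
	int ret;

	if (!srv)
		return -ENOMEM;

	ret = create_listener(srv, should_fail);
	if (ret < 0)
		goto err_free;

	*out = srv;
	return 0;

err_free:
	server_free(srv);	/* the step a leaking error path skips */
	return ret;
}
```

    Every exit after the allocation either hands the object to the caller
    or goes through the label that frees it.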

    Fixes: 5c45ab24ac77 ("tipc: make struct tipc_server private for server.c")
    Reported-by: Hulk Robot
    Signed-off-by: Wang Hai
    Link: https://lore.kernel.org/r/20201109140913.47370-1-wanghai38@huawei.com
    Signed-off-by: Jakub Kicinski

    Wang Hai
     

30 Oct, 2020

2 commits

  • Pull networking fixes from Jakub Kicinski:
    "Current release regressions:

    - r8169: fix forced threading conflicting with other shared
    interrupts; we tried to fix the use of raise_softirq_irqoff from an
    IRQ handler on RT by forcing hard irqs, but this driver shares
    legacy PCI IRQs so drop the _irqoff() instead

    - tipc: fix memory leak caused by a recent syzbot report fix to
    tipc_buf_append()

    Current release - bugs in new features:

    - devlink: Unlock on error in dumpit() and fix some error codes

    - net/smc: fix null pointer dereference in smc_listen_decline()

    Previous release - regressions:

    - tcp: Prevent low rmem stalls with SO_RCVLOWAT.

    - net: protect tcf_block_unbind with block lock

    - ibmveth: Fix use of ibmveth in a bridge; the self-imposed filtering
    to only send legal frames to the hypervisor was too strict

    - net: hns3: Clear the CMDQ registers before unmapping BAR region;
    incorrect cleanup order was leading to a crash

    - bnxt_en - handful of fixes to fixes:
    - Send HWRM_FUNC_RESET fw command unconditionally, even if there
    are PCIe errors being reported
    - Check abort error state in bnxt_open_nic().
    - Invoke cancel_delayed_work_sync() for PFs also.
    - Fix regression in workqueue cleanup logic in bnxt_remove_one().

    - mlxsw: Only advertise link modes supported by both driver and
    device, after removal of 56G support from the driver 56G was not
    cleared from advertised modes

    - net/smc: fix suppressed return code

    Previous release - always broken:

    - netem: fix zero division in tabledist, caused by integer overflow

    - bnxt_en: Re-write PCI BARs after PCI fatal error.

    - cxgb4: set up filter action after rewrites

    - net: ipa: command payloads already mapped

    Misc:

    - s390/ism: fix incorrect system EID, it's okay to change since it
    was added in current release

    - vsock: use ns_capable_noaudit() on socket create to suppress false
    positive audit messages"

    * tag 'net-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
    r8169: fix issue with forced threading in combination with shared interrupts
    netem: fix zero division in tabledist
    ibmvnic: fix ibmvnic_set_mac
    mptcp: add missing memory scheduling in the rx path
    tipc: fix memory leak caused by tipc_buf_append()
    gtp: fix an use-before-init in gtp_newlink()
    net: protect tcf_block_unbind with block lock
    ibmveth: Fix use of ibmveth in a bridge.
    net/sched: act_mpls: Add softdep on mpls_gso.ko
    ravb: Fix bit fields checking in ravb_hwtstamp_get()
    devlink: Unlock on error in dumpit()
    devlink: Fix some error codes
    chelsio/chtls: fix memory leaks in CPL handlers
    chelsio/chtls: fix deadlock issue
    net: hns3: Clear the CMDQ registers before unmapping BAR region
    bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally.
    bnxt_en: Check abort error state in bnxt_open_nic().
    bnxt_en: Re-write PCI BARs after PCI fatal error.
    bnxt_en: Invoke cancel_delayed_work_sync() for PFs also.
    bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one().
    ...

    Linus Torvalds
     
  • Commit ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
    replaced skb_unshare() with skb_copy() to intentionally avoid reducing
    the data reference counter of the original skb. This is not the correct
    way to handle the cloned skb because it causes a memory leak in the 2
    following cases:
    1/ Sending multicast messages via broadcast link
    The original skb list is cloned to the local skb list for local
    destination. After that, the data reference counter of each skb
    in the original list has the value of 2. This causes each skb not
    to be freed after receiving ACK:
    tipc_link_advance_transmq()
    {
    ...
    /* release skb */
    __skb_unlink(skb, &l->transmq);
    kfree_skb(skb);
    Reported-by: Thang Hoang Ngo
    Signed-off-by: Tung Nguyen
    Reviewed-by: Xin Long
    Acked-by: Cong Wang
    Link: https://lore.kernel.org/r/20201027032403.1823-1-tung.q.nguyen@dektech.com.au
    Signed-off-by: Jakub Kicinski

    Tung Nguyen
     

26 Oct, 2020

1 commit

  • Commit 453431a54934 ("mm, treewide: rename kzfree() to
    kfree_sensitive()") renamed kzfree() to kfree_sensitive(),
    but it left a compatibility definition of kzfree() to avoid
    being too disruptive.

    Since then a few more instances of kzfree() have slipped in.

    Just get rid of them and remove the compatibility definition
    once and for all.

    Signed-off-by: Eric Biggers
    Signed-off-by: Linus Torvalds

    Eric Biggers
     

17 Oct, 2020

2 commits

  • In commit 16ad3f4022bb
    ("tipc: introduce variable window congestion control"), we applied
    the algorithm to select the window size between the minimum window and
    the configured maximum window for unicast links. Besides, we chose
    to keep the window size for the broadcast link unchanged (i.e. a
    fixed window of 50).

    However, when setting the maximum window variable via command, the
    window variable was re-initialized to an unexpected value (i.e. 32).

    We fix this by keeping the fixed window for the broadcast link, as stated.

    Fixes: 16ad3f4022bb ("tipc: introduce variable window congestion control")
    Acked-by: Jon Maloy
    Signed-off-by: Hoang Huu Le
    Signed-off-by: Jakub Kicinski

    Hoang Huu Le
     
  • The queue limit of the broadcast link is calculated based on the initial
    MTU. However, when the MTU value changes (e.g. by manually changing the
    MTU on the NIC device, MTU negotiation, etc.), we do not re-calculate the
    queue limit. As a result, throughput does not reflect the change.

    So fix it by calling the function to re-calculate the queue limit of the
    broadcast link.

    Acked-by: Jon Maloy
    Signed-off-by: Hoang Huu Le
    Signed-off-by: Jakub Kicinski

    Hoang Huu Le
     

16 Oct, 2020

1 commit


10 Oct, 2020

2 commits

  • In the function node_lost_contact(), we call __skb_queue_purge() without
    grabbing the list->lock. This can cause a race condition while processing
    the list 'namedq' in the calling path tipc_named_rcv()->tipc_named_dequeue().

    [] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [] #PF: supervisor read access in kernel mode
    [] #PF: error_code(0x0000) - not-present page
    [] PGD 7ca63067 P4D 7ca63067 PUD 6c553067 PMD 0
    [] Oops: 0000 [#1] SMP NOPTI
    [] CPU: 1 PID: 15 Comm: ksoftirqd/1 Tainted: G O 5.9.0-rc6+ #2
    [] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS [...]
    [] RIP: 0010:tipc_named_rcv+0x103/0x320 [tipc]
    [] Code: 41 89 44 24 10 49 8b 16 49 8b 46 08 49 c7 06 00 00 00 [...]
    [] RSP: 0018:ffffc900000a7c58 EFLAGS: 00000282
    [] RAX: 00000000000012ec RBX: 0000000000000000 RCX: ffff88807bde1270
    [] RDX: 0000000000002c7c RSI: 0000000000002c7c RDI: ffff88807b38f1a8
    [] RBP: ffff88807b006288 R08: ffff88806a367800 R09: ffff88806a367900
    [] R10: ffff88806a367a00 R11: ffff88806a367b00 R12: ffff88807b006258
    [] R13: ffff88807b00628a R14: ffff888069334d00 R15: ffff88806a434600
    [] FS: 0000000000000000(0000) GS:ffff888079480000(0000) knlGS:0[...]
    [] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [] CR2: 0000000000000000 CR3: 0000000077320000 CR4: 00000000000006e0
    [] Call Trace:
    [] ? tipc_bcast_rcv+0x9a/0x1a0 [tipc]
    [] tipc_rcv+0x40d/0x670 [tipc]
    [] ? _raw_spin_unlock+0xa/0x20
    [] tipc_l2_rcv_msg+0x55/0x80 [tipc]
    [] __netif_receive_skb_one_core+0x8c/0xa0
    [] process_backlog+0x98/0x140
    [] net_rx_action+0x13a/0x420
    [] __do_softirq+0xdb/0x316
    [] ? smpboot_thread_fn+0x2f/0x1e0
    [] ? smpboot_thread_fn+0x74/0x1e0
    [] ? smpboot_thread_fn+0x14e/0x1e0
    [] run_ksoftirqd+0x1a/0x40
    [] smpboot_thread_fn+0x149/0x1e0
    [] ? sort_range+0x20/0x20
    [] kthread+0x131/0x150
    [] ? kthread_unuse_mm+0xa0/0xa0
    [] ret_from_fork+0x22/0x30
    [] Modules linked in: veth tipc(O) ip6_udp_tunnel udp_tunnel [...]
    [] CR2: 0000000000000000
    [] ---[ end trace 65c276a8e2e2f310 ]---

    To fix this, we need to grab the lock of the 'namedq' list on both
    calling paths.

    Fixes: cad2929dc432 ("tipc: update a binding service via broadcast")
    Acked-by: Jon Maloy
    Signed-off-by: Hoang Huu Le
    Signed-off-by: Jakub Kicinski

    Hoang Huu Le
     
  • skb_unshare() drops a reference count on the old skb unconditionally,
    so in the failure case, we end up freeing the skb twice here.
    And because the skb is allocated in fclone and cloned by the caller
    tipc_msg_reassemble(), the consequence is actually freeing the
    original skb too, which triggered the UAF reported by syzbot.

    Fix this by replacing this skb_unshare() with skb_cloned()+skb_copy().

    Fixes: ff48b6222e65 ("tipc: use skb_unshare() instead in tipc_buf_append()")
    Reported-and-tested-by: syzbot+e96a7ba46281824cc46a@syzkaller.appspotmail.com
    Cc: Jon Maloy
    Cc: Ying Xue
    Signed-off-by: Cong Wang
    Reviewed-by: Xin Long
    Signed-off-by: Jakub Kicinski

    Cong Wang
     

03 Oct, 2020

1 commit


23 Sep, 2020

1 commit

  • Two minor conflicts:

    1) net/ipv4/route.c, adding a new local variable while
    moving another local variable and removing its
    initial assignment.

    2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
    One pretty prints the port mode differently, whilst another
    changes the driver to try and obtain the port mode from
    the port node rather than the switch node.

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Sep, 2020

7 commits

  • If the header file containing a function's prototype isn't included by
    the sourcefile containing the associated function, the build system
    complains of missing prototypes.

    Fixes the following W=1 kernel build warning(s):

    net/tipc/udp_media.c:446:5: warning: no previous prototype for ‘tipc_udp_nl_dump_remoteip’ [-Wmissing-prototypes]
    net/tipc/udp_media.c:532:5: warning: no previous prototype for ‘tipc_udp_nl_add_bearer_data’ [-Wmissing-prototypes]
    net/tipc/udp_media.c:614:5: warning: no previous prototype for ‘tipc_udp_nl_bearer_add’ [-Wmissing-prototypes]

    Signed-off-by: Wang Hai
    Signed-off-by: David S. Miller

    Wang Hai
     
  • It is not used any more, so remove it.

    Signed-off-by: YueHaibing
    Signed-off-by: David S. Miller

    YueHaibing
     
  • Drop repeated words in net/tipc/.

    Signed-off-by: Randy Dunlap
    Cc: "David S. Miller"
    Cc: Jakub Kicinski
    Cc: Jon Maloy
    Cc: Ying Xue
    Cc: tipc-discussion@lists.sourceforge.net
    Signed-off-by: David S. Miller

    Randy Dunlap
     
  • Rekeying is required for security since a key becomes less secure when
    used for a long time. Also, a key will be detached when its nonce value
    (or seqno ...) is exhausted. We now make the rekeying process automatic
    and configurable by the user.

    Basically, TIPC will at a specific interval generate a new key by using
    the kernel 'Random Number Generator' cipher, then attach it as the node
    TX key and securely distribute to others in the cluster as RX keys (-
    the key exchange). The automatic key switching will then take over, and
    make the new key active shortly. Afterwards, the traffic from this node
    will be encrypted with the new session key. The same can happen in peer
    nodes but not necessarily at the same time.

    For simplicity, the automatically generated key will be initiated as a
    per node key. It is not too hard to also support a cluster key rekeying
    (e.g. a given node will generate a unique cluster key and update to the
    others in the cluster...), but that doesn't bring much benefit, while a
    per-node key is even more secure.

    We also enable the user to force a rekeying or change the rekeying
    interval via netlink. The new 'set key' command option
    'TIPC_NLA_NODE_REKEYING' is added for these purposes as follows:
    - A value >= 1 will be set as the rekeying interval (in minutes);
    - A value of 0 will disable the rekeying;
    - A value of 'TIPC_REKEYING_NOW' (~0) will force an immediate rekeying.

    The default rekeying interval is (60 * 24) minutes, i.e. done every day.
    There isn't any restriction on the value, but the user shouldn't set it
    too small or too large, which would result in an "ineffective" rekeying
    (that's OK for testing though).
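
    The three cases above can be sketched as a small classification
    function. This is a hedged illustration rather than the kernel
    implementation: `enum rekey_action`, `classify_rekeying()` and
    `DEFAULT_REKEY_INTERVAL_MIN` are hypothetical names, the option value
    is assumed to be a 32-bit unsigned integer, and only the value
    semantics (0, ~0, anything else) come from the description.

```c
#include <stdint.h>

/* Special value named above: all bits set forces an immediate rekeying. */
#define TIPC_REKEYING_NOW (~0U)

/* Default interval stated above: (60 * 24) minutes, i.e. one day. */
#define DEFAULT_REKEY_INTERVAL_MIN (60 * 24)

/* Hypothetical result type for this sketch only. */
enum rekey_action {
	REKEY_DISABLE,
	REKEY_NOW,
	REKEY_SET_INTERVAL
};

/*
 * Interpret a TIPC_NLA_NODE_REKEYING value.
 * *interval_min is written only when a new interval is being set.
 */
static enum rekey_action classify_rekeying(uint32_t value,
					   uint32_t *interval_min)
{
	if (value == TIPC_REKEYING_NOW)
		return REKEY_NOW;	/* must be tested before the interval case */
	if (value == 0)
		return REKEY_DISABLE;
	*interval_min = value;		/* any other value is minutes */
	return REKEY_SET_INTERVAL;
}
```

    The ~0 check comes first because it would otherwise be read as a very
    large interval.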

    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     
  • With support from the master key option in the previous commit, it
    becomes easy to make frequent updates/exchanges of session keys between
    authenticated cluster nodes.
    Basically, there are two situations where the key exchange will take
    place:

    - When a new node joins the cluster (with the master key), it will need
    to get its peer's TX key, so that it is able to decrypt further messages
    from that peer.

    - When a new session key is generated (by either user manual setting or
    later automatic rekeying feature), the key will be distributed to all
    peer nodes in the cluster.

    A key to be exchanged is encapsulated in the data part of a 'MSG_CRYPTO
    /KEY_DISTR_MSG' TIPC v2 message, then xmit-ed as usual and encrypted by
    using the master key before sending out. Upon receipt of the message it
    will be decrypted in the same way as regular messages, then attached as
    the sender's RX key in the receiver node.

    In this way, the key exchange is made reliable by the link layer, while
    security, integrity and authenticity are provided by the crypto layer.

    Also, forward security can easily be achieved by the user actively
    changing the master key, but this should not be required very frequently.

    The key exchange feature is independent of the presence of a master key.
    Note however that the master key is still needed for new nodes to be
    able to join the cluster. It is also optional, and can be turned off/on
    via the sysfs: 'net/tipc/key_exchange_enabled' [default 1: enabled].

    Backward compatibility is guaranteed because, for nodes that do not have
    master key support, any key exchange message using the master key (i.e.
    tx_key = 0) will be discarded shortly at the message validation step. In
    other words, the key exchange feature will be automatically disabled for
    those nodes.

    v2: fix the "implicit declaration of function 'tipc_crypto_key_flush'"
    error in node.c. The function only exists when built with the TIPC
    "CONFIG_TIPC_CRYPTO" option.

    v3: use 'info->extack' for a message emitted due to netlink operations
    instead (- David's comment).

    Reported-by: kernel test robot
    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     
  • In addition to the supported cluster & per-node encryption keys for the
    en/decryption of TIPC messages, we now introduce one option for user to
    set a cluster key as 'master key', which is simply a symmetric key like
    the former but has a longer life cycle. It has two purposes:

    - Authentication of new member nodes in the cluster. New nodes, having
    no knowledge of current session keys in the cluster will still be
    able to join the cluster as long as they know the master key. This is
    because all neighbor discovery (LINK_CONFIG) messages must be
    encrypted with this key.

    - Encryption of session encryption keys during automatic exchange and
    update of those. This is a feature we will introduce in a later commit
    in this series.

    We insert the new key into the currently unused slot 0 in the key array
    and start using it immediately once the user has set it.
    After joining, a node only knowing the master key should be fully
    communicable to existing nodes in the cluster, although those nodes may
    have their own session keys activated (i.e. not the master one). To
    support this, we define a 'grace period', starting from the time a node
    itself reports having no RX keys, so the existing nodes will use the
    master key for encryption instead. The grace period can be extended but
    will automatically stop after e.g. 5 seconds without a new report. This
    is also the basis for the later key exchange feature, as the new node
    would be unable to decrypt anything without the support of the master key.

    For user to set a master key, we define a new netlink flag -
    'TIPC_NLA_NODE_KEY_MASTER', so it can be added to the current 'set key'
    netlink command to specify the setting key to be a master key.

    Above all, the traditional cluster/per-node key mechanism is guaranteed
    to work when the user chooses not to use this master key option. This is
    also compatible with legacy nodes that do not support the feature.

    Even this master key can be updated without any interruption of cluster
    connectivity, but if this is needed, it has to be coordinated and set by
    the user.

    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     
  • We reduce the lasting time for a pending TX key to be active as well as
    for a passive RX key to be freed which generally helps speed up the key
    switching. It is not expected to be too fast but should not be too slow
    either. Also, the key handling logic is simplified so that a pending RX
    key will be removed automatically if it is found not working after a
    number of attempts; the probing for a pending TX key is now carried on a
    specific message user ('LINK_PROTOCOL' or 'LINK_CONFIG'), which is more
    efficient than using a timer on broadcast messages. The timer is reserved
    for later use as needed.

    The kernel logs or 'pr***()' are now made as clear as possible to the
    user. Some prints are added, removed or changed to the debug level. The
    'TIPC_CRYPTO_DEBUG' definition is removed, and 'pr_debug()' is used
    instead, which will be much more helpful at runtime.

    Besides we also optimize the code in some other places as a preparation
    for later commits.

    v2: silence more kernel logs, also use 'info->extack' for a message
    emitted due to netlink operations instead (- David's comments).

    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     

16 Sep, 2020

1 commit

  • Fix parameter description of tipc_link_bc_create()

    Reported-by: Hulk Robot
    Fixes: 16ad3f4022bb ("tipc: introduce variable window congestion control")
    Signed-off-by: Lu Wei
    Signed-off-by: David S. Miller

    Lu Wei
     

15 Sep, 2020

2 commits

  • tipc_buf_append() may change an skb's frag_list, which causes
    problems when this skb is cloned. skb_unclone() doesn't really
    make this skb's frag_list available to change.

    Shuang Li has reported a use-after-free issue because of this
    when creating quite a few macvlan devices over the same device, where
    the broadcast packets will be cloned and go up to the stack:

    [ ] BUG: KASAN: use-after-free in pskb_expand_head+0x86d/0xea0
    [ ] Call Trace:
    [ ] dump_stack+0x7c/0xb0
    [ ] print_address_description.constprop.7+0x1a/0x220
    [ ] kasan_report.cold.10+0x37/0x7c
    [ ] check_memory_region+0x183/0x1e0
    [ ] pskb_expand_head+0x86d/0xea0
    [ ] process_backlog+0x1df/0x660
    [ ] net_rx_action+0x3b4/0xc90
    [ ]
    [ ] Allocated by task 1786:
    [ ] kmem_cache_alloc+0xbf/0x220
    [ ] skb_clone+0x10a/0x300
    [ ] macvlan_broadcast+0x2f6/0x590 [macvlan]
    [ ] macvlan_process_broadcast+0x37c/0x516 [macvlan]
    [ ] process_one_work+0x66a/0x1060
    [ ] worker_thread+0x87/0xb10
    [ ]
    [ ] Freed by task 3253:
    [ ] kmem_cache_free+0x82/0x2a0
    [ ] skb_release_data+0x2c3/0x6e0
    [ ] kfree_skb+0x78/0x1d0
    [ ] tipc_recvmsg+0x3be/0xa40 [tipc]

    So fix it by using skb_unshare() instead, which would create a new
    skb for the cloned frag and it'll be safe to change its frag_list.
    The similar things were also done in sctp_make_reassembled_event(),
    which is using skb_copy().

    Reported-by: Shuang Li
    Fixes: 37e22164a8a3 ("tipc: rename and move message reassembly function")
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • tipc_group_add_to_tree() returns silently if `key` matches `nkey` of an
    existing node, causing tipc_group_create_member() to leak memory. Let
    tipc_group_add_to_tree() return an error in such a case, so that
    tipc_group_create_member() can handle it properly.
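
    The idea can be sketched in user-space C as follows. This is only an
    illustration under stated assumptions: a plain binary search tree stands
    in for the kernel's rbtree, and `struct member`, `add_to_tree()` and
    `create_member()` are hypothetical simplifications of the functions
    named above.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical member record; a plain BST stands in for the rbtree. */
struct member {
	unsigned long long key;
	struct member *left, *right;
};

/* Insert m; report a duplicate key instead of returning silently. */
static int add_to_tree(struct member **root, struct member *m)
{
	struct member **pos = root;

	while (*pos) {
		if (m->key < (*pos)->key)
			pos = &(*pos)->left;
		else if (m->key > (*pos)->key)
			pos = &(*pos)->right;
		else
			return -EEXIST;	/* key already present */
	}
	*pos = m;
	return 0;
}

/* Allocate and insert; free the allocation if the insert is rejected. */
static struct member *create_member(struct member **root,
				    unsigned long long key)
{
	struct member *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->key = key;
	if (add_to_tree(root, m) < 0) {
		free(m);	/* without the error code this would leak */
		return NULL;
	}
	return m;
}
```

    Because the insert now reports the duplicate, the caller knows the new
    object was never linked into the tree and can release it.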

    Fixes: 75da2163dbb6 ("tipc: introduce communication groups")
    Reported-and-tested-by: syzbot+f95d90c454864b3b5bc9@syzkaller.appspotmail.com
    Cc: Hillf Danton
    Link: https://syzkaller.appspot.com/bug?id=048390604fe1b60df34150265479202f10e13aff
    Signed-off-by: Peilin Ye
    Signed-off-by: David S. Miller

    Peilin Ye
     

11 Sep, 2020

1 commit

  • I confirmed that the problem fixed by commit 2a63866c8b51a3f7 ("tipc: fix
    shutdown() of connectionless socket") also applies to stream socket.

    ----------
    #include <sys/socket.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(int argc, char *argv[])
    {
    int fds[2] = { -1, -1 };
    socketpair(PF_TIPC, SOCK_STREAM /* or SOCK_DGRAM */, 0, fds);
    if (fork() == 0)
    _exit(read(fds[0], NULL, 1));
    shutdown(fds[0], SHUT_RDWR); /* This must make read() return. */
    wait(NULL); /* To be woken up by _exit(). */
    return 0;
    }
    ----------

    Since shutdown(SHUT_RDWR) should affect all processes sharing that socket,
    unconditionally setting sk->sk_shutdown to SHUTDOWN_MASK will be the right
    behavior.

    Signed-off-by: Tetsuo Handa
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Tetsuo Handa
     

08 Sep, 2020

1 commit

  • In commit fdeba99b1e58
    ("tipc: fix use-after-free in tipc_bcast_get_mode"), we tried
    to make sure the tipc_net_finalize_work work item had finished if it
    was enqueued. But calling flush_scheduled_work() affects not just that
    work item but any scheduled work. This has turned out to be overkill
    and caused a deadlock, as syzbot reported:

    ======================================================
    WARNING: possible circular locking dependency detected
    5.9.0-rc2-next-20200828-syzkaller #0 Not tainted
    ------------------------------------------------------
    kworker/u4:6/349 is trying to acquire lock:
    ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777

    but task is already holding lock:
    ffffffff8a879430 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x9b/0xb10 net/core/net_namespace.c:565

    [...]
    Possible unsafe locking scenario:

    CPU0 CPU1
    ---- ----
    lock(pernet_ops_rwsem);
    lock(&sb->s_type->i_mutex_key#13);
    lock(pernet_ops_rwsem);
    lock((wq_completion)events);

    *** DEADLOCK ***
    [...]

    v1:
    To fix the original issue, we replace the above call by introducing
    a bit flag. When a namespace is cleaned up, the bit flag is set to zero and:
    - the tipc_net_finalize function just returns immediately.
    - tipc_net_finalize_work is not enqueued into the scheduled work queue.

    v2:
    Use the cancel_work_sync() helper to make sure ONLY
    tipc_net_finalize_work() is stopped before releasing the bcbase object.

    Reported-by: syzbot+d5aa7e0385f6a5d0f4fd@syzkaller.appspotmail.com
    Fixes: fdeba99b1e58 ("tipc: fix use-after-free in tipc_bcast_get_mode")
    Acked-by: Jon Maloy
    Signed-off-by: Hoang Huu Le
    Signed-off-by: Jakub Kicinski

    Hoang Huu Le
     

05 Sep, 2020

1 commit

  • We got slightly different patches removing a double word
    in a comment in net/ipv4/raw.c - picked the version from net.

    Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
    values instead of VNIC login response buffer (following what
    commit 507ebe6444a4 ("ibmvnic: Fix use-after-free of VNIC login
    response buffer") did).

    Signed-off-by: Jakub Kicinski

    Jakub Kicinski
     

04 Sep, 2020

1 commit

  • Pull networking fixes from David Miller:

    1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi
    Kivilinna.

    2) Fix loss of RTT samples in rxrpc, from David Howells.

    3) Memory leak in hns_nic_dev_probe(), from Dinghao Liu.

    4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka.

    5) We disable BH for too long in sctp_get_port_local(), add a
    cond_resched() here as well, from Xin Long.

    6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu.

    7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from
    Yonghong Song.

    8) Missing of_node_put() in mt7530 DSA driver, from Sumera
    Priyadarsini.

    9) Fix crash in bnxt_fw_reset_task(), from Michael Chan.

    10) Fix geneve tunnel checksumming bug in hns3, from Yi Li.

    11) Memory leak in rxkad_verify_response, from Dinghao Liu.

    12) In tipc, don't use smp_processor_id() in preemptible context. From
    Tuong Lien.

    13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu.

    14) Missing clk_disable_unprepare() in gemini driver, from Dan Carpenter.

    15) Fix ABI mismatch between driver and firmware in nfp, from Louis
    Peens.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits)
    net/smc: fix sock refcounting in case of termination
    net/smc: reset sndbuf_desc if freed
    net/smc: set rx_off for SMCR explicitly
    net/smc: fix toleration of fake add_link messages
    tg3: Fix soft lockup when tg3_reset_task() fails.
    doc: net: dsa: Fix typo in config code sample
    net: dp83867: Fix WoL SecureOn password
    nfp: flower: fix ABI mismatch between driver and firmware
    tipc: fix shutdown() of connectionless socket
    ipv6: Fix sysctl max for fib_multipath_hash_policy
    drivers/net/wan/hdlc: Change the default of hard_header_len to 0
    net: gemini: Fix another missing clk_disable_unprepare() in probe
    net: bcmgenet: fix mask check in bcmgenet_validate_flow()
    amd-xgbe: Add support for new port mode
    net: usb: dm9601: Add USB ID of Keenetic Plus DSL
    vhost: fix typo in error message
    net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()
    pktgen: fix error message with wrong function name
    net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode
    cxgb4: fix thermal zone device registration
    ...

    Linus Torvalds
     

03 Sep, 2020

1 commit

  • syzbot is reporting hung task at nbd_ioctl() [1], for there are two
    problems regarding TIPC's connectionless socket's shutdown() operation.

    ----------
    #include <fcntl.h>
    #include <sys/socket.h>
    #include <sys/ioctl.h>
    #include <linux/nbd.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
    const int fd = open("/dev/nbd0", 3);
    alarm(5);
    ioctl(fd, NBD_SET_SOCK, socket(PF_TIPC, SOCK_DGRAM, 0));
    ioctl(fd, NBD_DO_IT, 0); /* To be interrupted by SIGALRM. */
    return 0;
    }
    ----------

    One problem is that wait_for_completion() from flush_workqueue() from
    nbd_start_device_ioctl() from nbd_ioctl() cannot be completed when
    nbd_start_device_ioctl() received a signal at wait_event_interruptible(),
    for tipc_shutdown() from kernel_sock_shutdown(SHUT_RDWR) from
    nbd_mark_nsock_dead() from sock_shutdown() from nbd_start_device_ioctl()
    is failing to wake up a WQ thread sleeping at wait_woken() from
    tipc_wait_for_rcvmsg() from sock_recvmsg() from sock_xmit() from
    nbd_read_stat() from recv_work() scheduled by nbd_start_device() from
    nbd_start_device_ioctl(). Fix this problem by always invoking
    sk->sk_state_change() (like inet_shutdown() does) when tipc_shutdown() is
    called.

    The other problem is that tipc_wait_for_rcvmsg() cannot return when
    tipc_shutdown() is called, for tipc_shutdown() sets sk->sk_shutdown to
    SEND_SHUTDOWN (even though "how" is SHUT_RDWR) while
    tipc_wait_for_rcvmsg() needs sk->sk_shutdown set to RCV_SHUTDOWN or
    SHUTDOWN_MASK. Fix this problem by setting sk->sk_shutdown to
    SHUTDOWN_MASK (like inet_shutdown() does) when the socket is
    connectionless.
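
    The flag arithmetic behind the second problem can be sketched as
    follows. The three constants have the values the kernel defines in
    include/net/sock.h; everything else is a hypothetical model for
    illustration (`receiver_can_return()`, `shutdown_before_fix()` and
    `shutdown_after_fix()` are not kernel functions).

```c
/* Values as defined in the kernel's include/net/sock.h. */
#define RCV_SHUTDOWN	1
#define SEND_SHUTDOWN	2
#define SHUTDOWN_MASK	3

/*
 * Hypothetical model of the receiver's exit condition: the waiter can
 * only give up once the receive direction is marked as shut down, which
 * is true for both RCV_SHUTDOWN and SHUTDOWN_MASK.
 */
static int receiver_can_return(unsigned int sk_shutdown)
{
	return (sk_shutdown & RCV_SHUTDOWN) != 0;
}

/* Behaviour described above before the fix: always SEND_SHUTDOWN. */
static unsigned int shutdown_before_fix(void)
{
	return SEND_SHUTDOWN;
}

/* Behaviour after the fix: full mask for connectionless sockets. */
static unsigned int shutdown_after_fix(int connectionless)
{
	return connectionless ? SHUTDOWN_MASK : SEND_SHUTDOWN;
}
```

    SEND_SHUTDOWN alone leaves the RCV_SHUTDOWN bit clear, so a sleeping
    receiver never sees a reason to return.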

    [1] https://syzkaller.appspot.com/bug?id=3fe51d307c1f0a845485cf1798aa059d12bf18b2

    Reported-by: syzbot
    Signed-off-by: Tetsuo Handa
    Signed-off-by: David S. Miller

    Tetsuo Handa
     

01 Sep, 2020

2 commits


31 Aug, 2020

1 commit

  • 'this_cpu_ptr()' is used to obtain the AEAD key's TFM on the current
    CPU for encryption. However, the execution can be preempted since it's
    actually user-space context, so the 'using smp_processor_id() in
    preemptible' warning has been observed.

    We fix the issue by using the 'get/put_cpu_ptr()' API, which includes
    a 'preempt_disable()', instead.

    Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     

27 Aug, 2020

1 commit

  • Syzbot has reported those issues as:

    ==================================================================
    BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759
    Read of size 1 at addr ffff88805e6b3571 by task kworker/0:6/3850

    CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Workqueue: events tipc_net_finalize_work

    Thread 1's call trace:
    [...]
    kfree+0x103/0x2c0 mm/slab.c:3757
    Signed-off-by: Hoang Huu Le
    Signed-off-by: David S. Miller

    Hoang Huu Le
     

24 Aug, 2020

2 commits


21 Aug, 2020

1 commit

  • b->media->send_msg() requires rcu_read_lock(), as we can see
    elsewhere in tipc, tipc_bearer_xmit, tipc_bearer_xmit_skb
    and tipc_bearer_bc_xmit().

    Syzbot has reported this issue as:

    net/tipc/bearer.c:466 suspicious rcu_dereference_check() usage!
    Workqueue: cryptd cryptd_queue_worker
    Call Trace:
    tipc_l2_send_msg+0x354/0x420 net/tipc/bearer.c:466
    tipc_aead_encrypt_done+0x204/0x3a0 net/tipc/crypto.c:761
    cryptd_aead_crypt+0xe8/0x1d0 crypto/cryptd.c:739
    cryptd_queue_worker+0x118/0x1b0 crypto/cryptd.c:181
    process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
    worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
    kthread+0x3b5/0x4a0 kernel/kthread.c:291
    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

    So fix it by calling rcu_read_lock() in tipc_aead_encrypt_done()
    for b->media->send_msg().

    Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
    Reported-by: syzbot+47bbc6b678d317cccbe0@syzkaller.appspotmail.com
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

19 Aug, 2020

2 commits

  • This patch does three things for ipv6_dev_find():

    As David A. noticed,

    - rt6_lookup() is not really needed. Unlike __ip_dev_find(),
    ipv6_dev_find() doesn't have a compatibility problem, so remove it.

    As Hideaki suggested,

    - "valid" (non-tentative) check for the address is also needed.
    ipv6_chk_addr() calls ipv6_chk_addr_and_flags(), which will
    traverse the address hash list, but it is too heavy to call
    inside ipv6_dev_find(). This patch reuses the code of
    ipv6_chk_addr_and_flags() for ipv6_dev_find().

    - The dev parameter is passed into ipv6_dev_find(), as link-local
    addresses from user space have sin6_scope_id set and the dev
    lookup needs it.

    Fixes: 81f6cb31222d ("ipv6: add ipv6_dev_find()")
    Suggested-by: YOSHIFUJI Hideaki
    Reported-by: David Ahern
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • Convert the uses of fallthrough comments to the fallthrough macro.

    Signed-off-by: Miaohe Lin
    Signed-off-by: David S. Miller

    Miaohe Lin
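
As an illustration of this kind of conversion, here is a minimal userspace sketch. The macro follows the shape of the kernel's definition in include/linux/compiler_attributes.h, and it degrades to an empty statement on compilers without the attribute. `cleanup_steps_from()` is a hypothetical function invented for the example and is not taken from the patch.

```c
/* Use the compiler attribute where available, otherwise an empty statement. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical example: count the cleanup steps that run from a stage.
 * Each case deliberately falls into the next one. */
static int cleanup_steps_from(int stage)
{
    int steps = 0;

    switch (stage) {
    case 3:
        steps++;
        fallthrough;    /* previously a fall-through comment */
    case 2:
        steps++;
        fallthrough;
    case 1:
        steps++;
        break;
    default:
        break;
    }
    return steps;
}
```

Unlike a comment, the macro is visible to the compiler, so -Wimplicit-fallthrough can tell an intended fall-through from a missing break.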
     

17 Aug, 2020

2 commits

  • When using ipv6_dev_find() in one module, it requires ipv6 not to
    work as a module. Otherwise, this error occurs in build:

    undefined reference to `ipv6_dev_find'.

    So fix it by adding "depends on IPV6 || IPV6=n" to tipc/Kconfig,
    as it does in sctp/Kconfig.

    Fixes: 5a6f6f579178 ("tipc: set ub->ifindex for local ipv6 address")
    Reported-by: kernel test robot
    Acked-by: Randy Dunlap
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • __tipc_nl_compat_dumpit() has two callers, and it expects them to
    pass a valid nlmsghdr via arg->data. This header is artificial and
    crafted just for __tipc_nl_compat_dumpit().

    tipc_nl_compat_publ_dump() does so by putting a genlmsghdr as well
    as some nested attribute, TIPC_NLA_SOCK. But the other caller,
    tipc_nl_compat_dumpit(), does not, which leaves arg->data uninitialized
    on this call path.

    Fix this by just adding a similar nlmsghdr without any payload in
    tipc_nl_compat_dumpit().

    This bug has existed since day 1, but the recent commit 6ea67769ff33
    ("net: tipc: prepare attrs in __tipc_nl_compat_dumpit()") makes it
    easier to trigger.

    Reported-and-tested-by: syzbot+0e7181deafa7e0b79923@syzkaller.appspotmail.com
    Fixes: d0796d1ef63d ("tipc: convert legacy nl bearer dump to nl compat")
    Cc: Jon Maloy
    Cc: Ying Xue
    Cc: Richard Alpe
    Signed-off-by: Cong Wang
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Cong Wang
     

08 Aug, 2020

1 commit

  • As said by Linus:

    A symmetric naming is only helpful if it implies symmetries in use.
    Otherwise it's actively misleading.

    In "kzalloc()", the z is meaningful and an important part of what the
    caller wants.

    In "kzfree()", the z is actively detrimental, because maybe in the
    future we really _might_ want to use that "memfill(0xdeadbeef)" or
    something. The "zero" part of the interface isn't even _relevant_.

    The main reason that kzfree() exists is to clear sensitive information
    that should not be leaked to other future users of the same memory
    objects.

    Rename kzfree() to kfree_sensitive() to follow the example of the recently
    added kvfree_sensitive() and make the intention of the API more explicit.
    In addition, memzero_explicit() is used to clear the memory to make sure
    that it won't get optimized away by the compiler.

    The renaming is done by using the command sequence:

    git grep -w --name-only kzfree |\
    xargs sed -i 's/kzfree/kfree_sensitive/'

    followed by some editing of the kfree_sensitive() kerneldoc and adding
    a kzfree backward compatibility macro in slab.h.

    [akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
    [akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]

    Suggested-by: Joe Perches
    Signed-off-by: Waiman Long
    Signed-off-by: Andrew Morton
    Acked-by: David Howells
    Acked-by: Michal Hocko
    Acked-by: Johannes Weiner
    Cc: Jarkko Sakkinen
    Cc: James Morris
    Cc: "Serge E. Hallyn"
    Cc: Joe Perches
    Cc: Matthew Wilcox
    Cc: David Rientjes
    Cc: Dan Carpenter
    Cc: "Jason A . Donenfeld"
    Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
    Signed-off-by: Linus Torvalds

    Waiman Long
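
The intent behind kfree_sensitive() can be sketched in userspace as follows. This is an analogue under stated assumptions and not the kernel implementation: `wipe_explicit()` and `free_sensitive()` are hypothetical names, and the caller must pass the buffer length because plain malloc() offers no portable way to query it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace analogue of memzero_explicit(): the volatile accesses keep the
 * compiler from discarding the stores as dead writes before free(). */
static void wipe_explicit(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--)
        *p++ = 0;
}

/* Hypothetical helper: clear a heap buffer, then release it.
 * A NULL pointer is accepted and ignored, as with free(). */
static void free_sensitive(void *buf, size_t len)
{
    if (!buf)
        return;
    wipe_explicit(buf, len);
    free(buf);
}
```

A caller holding key material in a malloc()ed buffer would call `free_sensitive(key, key_len)` in place of `free(key)`, so the bytes are cleared before the memory returns to the allocator.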