17 Dec, 2017

1 commit

  • [ Upstream commit a7d5f107b4978e08eeab599ee7449af34d034053 ]

    When the function tipc_accept_from_sock() fails to create an instance of
    struct tipc_subscriber it omits to free the already created instance of
    struct tipc_conn instance before it returns.

    We fix that with this commit.

    Reported-by: David S. Miller
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jon Maloy
     

25 Jan, 2017

4 commits

  • In tipc_server_stop(), we iterate over the connections with limiting
    factor as server's idr_in_use. We ignore the fact that this variable
    is decremented in tipc_close_conn(), leading to premature exit.

    In this commit, we iterate until the we have no connections left.

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Tested-by: John Thompson
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • In tipc_conn_sendmsg(), we first queue the request to the outqueue
    followed by the connection state check. If the connection is not
    connected, we should not queue this message.

    In this commit, we reject the messages if the connection state is
    not CF_CONNECTED.

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Tested-by: John Thompson
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Commit 333f796235a527 ("tipc: fix a race condition leading to
    subscriber refcnt bug") reveals a soft lockup while acquiring
    nametbl_lock.

    Before commit 333f796235a527, we call tipc_conn_shutdown() from
    tipc_close_conn() in the context of tipc_topsrv_stop(). In that
    context, we are allowed to grab the nametbl_lock.

    Commit 333f796235a527, moved tipc_conn_release (renamed from
    tipc_conn_shutdown) to the connection refcount cleanup. This allows
    either tipc_nametbl_withdraw() or tipc_topsrv_stop() to the cleanup.

    Since tipc_exit_net() first calls tipc_topsrv_stop() and then
    tipc_nametble_withdraw() increases the chances for the later to
    perform the connection cleanup.

    The soft lockup occurs in the call chain of tipc_nametbl_withdraw(),
    when it performs the tipc_conn_kref_release() as it tries to grab
    nametbl_lock again while holding it already.
    tipc_nametbl_withdraw() grabs nametbl_lock
    tipc_nametbl_remove_publ()
    tipc_subscrp_report_overlap()
    tipc_subscrp_send_event()
    tipc_conn_sendmsg()
    << if (con->flags != CF_CONNECTED) we do conn_put(),
    triggering the cleanup as refcount=0. >>
    tipc_conn_kref_release
    tipc_sock_release
    tipc_conn_release
    tipc_subscrb_delete
    tipc_subscrp_delete
    tipc_nametbl_unsubscribe << Soft Lockup >>

    The previous changes in this series fixes the race conditions fixed
    by commit 333f796235a527. Hence we can now revert the commit.

    Fixes: 333f796235a52727 ("tipc: fix a race condition leading to subscriber refcnt bug")
    Reported-and-Tested-by: John Thompson
    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Until now, the generic server framework maintains the connection
    id's per subscriber in server's conn_idr. At tipc_close_conn, we
    remove the connection id from the server list, but the connection is
    valid until we call the refcount cleanup. Hence we have a window
    where the server allocates the same connection to an new subscriber
    leading to inconsistent reference count. We have another refcount
    warning we grab the refcount in tipc_conn_lookup() for connections
    with flag with CF_CONNECTED not set. This usually occurs at shutdown
    when the we stop the topology server and withdraw TIPC_CFG_SRV
    publication thereby triggering a withdraw message to subscribers.

    In this commit, we:
    1. remove the connection from the server list at recount cleanup.
    2. grab the refcount for a connection only if CF_CONNECTED is set.

    Tested-by: John Thompson
    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

27 Jun, 2016

1 commit

  • Replace calls to kmalloc followed by a memcpy with a direct call to
    kmemdup.

    The Coccinelle semantic patch used to make this change is as follows:
    @@
    expression from,to,size,flag;
    statement S;
    @@

    - to = \(kmalloc\|kzalloc\)(size,flag);
    + to = kmemdup(from,size,flag);
    if (to==NULL || ...) S
    - memcpy(to, from, size);

    Signed-off-by: Amitoj Kaur Chawla
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Amitoj Kaur Chawla
     

20 May, 2016

1 commit

  • TCP stack can now run from process context.

    Use read_lock_bh(&sk->sk_callback_lock) variant to restore previous
    assumption.

    Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
    Fixes: d41a69f1d390 ("tcp: make tcp_sendmsg() aware of socket backlog")
    Signed-off-by: Eric Dumazet
    Cc: Jon Maloy
    Cc: Ying Xue
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Apr, 2016

1 commit

  • Until now, the requests sent to topology server are queued
    to a workqueue by the generic server framework.
    These messages are processed by worker threads and trigger the
    registered callbacks.
    To reduce latency on uniprocessor systems, explicit rescheduling
    is performed using cond_resched() after MAX_RECV_MSG_COUNT(25)
    messages.

    This implementation on SMP systems leads to an subscriber refcnt
    error as described below:
    When a worker thread yields by calling cond_resched() in a SMP
    system, a new worker is created on another CPU to process the
    pending workitem. Sometimes the sleeping thread wakes up before
    the new thread finishes execution.
    This breaks the assumption on ordering and being single threaded.
    The fault is more frequent when MAX_RECV_MSG_COUNT is lowered.

    If the first thread was processing subscription create and the
    second thread processing close(), the close request will free
    the subscriber and the create request oops as follows:

    [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380 [tipc]
    [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97
    [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc]
    [...]
    [31.228377] Call Trace:
    [31.228377] [] dump_stack+0x4d/0x72
    [31.228377] [] __warn+0xd1/0xf0
    [31.228377] [] warn_slowpath_null+0x1d/0x20
    [31.228377] [] tipc_subscrb_rcv_cb+0x317/0x380 [tipc]
    [31.228377] [] tipc_receive_from_sock+0xd4/0x130 [tipc]
    [31.228377] [] tipc_recv_work+0x2b/0x50 [tipc]
    [31.228377] [] process_one_work+0x145/0x3d0
    [31.246554] ---[ end trace c3882c9baa05a4fd ]---
    [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266
    [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428
    [31.249323] IP: [] spin_dump+0x5c/0xe0
    [31.249323] PGD 0
    [31.249323] Oops: 0000 [#1] SMP

    In this commit, we
    - rename tipc_conn_shutdown() to tipc_conn_release().
    - move connection release callback execution from tipc_close_conn()
    to a new function tipc_sock_release(), which is executed before
    we free the connection.
    Thus we release the subscriber during connection release procedure
    rather than connection shutdown procedure.

    Signed-off-by: Parthasarathy Bhuvaragan
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

06 Feb, 2016

1 commit

  • Until now, tipc_rcv and tipc_send workqueues in server are allocated
    with parameters WQ_UNBOUND & max_active = 1.
    This parameters passed to this function makes it equivalent to
    alloc_ordered_workqueue(). The later form is more explicit and
    can inherit future ordered_workqueue changes.

    In this commit we replace alloc_workqueue() with more readable
    alloc_ordered_workqueue().

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

15 May, 2015

1 commit


05 May, 2015

1 commit


23 Apr, 2015

1 commit

  • When a new topology server is launched in a new namespace, its
    listening socket is inserted into the "init ns" namespace's socket
    hash table rather than the one owned by the new namespace. Although
    the socket's namespace is forcedly changed to the new namespace later,
    the socket is still stored in the socket hash table of "init ns"
    namespace. When a client created in the new namespace connects
    its own topology server, the connection is failed as its server's
    socket could not be found from its own namespace's socket table.

    If __sock_create() instead of original sock_create_kern() is used
    to create the server's socket through specifying an expected namesapce,
    the socket will be inserted into the specified namespace's socket
    table, thereby avoiding to the topology server broken issue.

    Fixes: 76100a8a64bc ("tipc: fix netns refcnt leak")

    Reported-by: Erik Hugne
    Signed-off-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

18 Mar, 2015

2 commits

  • The TIPC topology server is a per namespace service associated with the
    tipc name {1, 1}. When a namespace is deleted, that name must be withdrawn
    before we call sk_release_kernel because the kernel socket release is
    done in init_net and trying to withdraw a TIPC name published in another
    namespace will fail with an error as:

    [ 170.093264] Unable to remove local publication
    [ 170.093264] (type=1, lower=1, ref=2184244004, key=2184244005)

    We fix this by breaking the association between the topology server name
    and socket before calling sk_release_kernel.

    Signed-off-by: Ying Xue
    Reviewed-by: Erik Hugne
    Signed-off-by: David S. Miller

    Ying Xue
     
  • When the TIPC module is loaded, we launch a topology server in kernel
    space, which in its turn is creating TIPC sockets for communication
    with topology server users. Because both the socket's creator and
    provider reside in the same module, it is necessary that the TIPC
    module's reference count remains zero after the server is started and
    the socket created; otherwise it becomes impossible to perform "rmmod"
    even on an idle module.

    Currently, we achieve this by defining a separate "tipc_proto_kern"
    protocol struct, that is used only for kernel space socket allocations.
    This structure has the "owner" field set to NULL, which restricts the
    module reference count from being be bumped when sk_alloc() for local
    sockets is called. Furthermore, we have defined three kernel-specific
    functions, tipc_sock_create_local(), tipc_sock_release_local() and
    tipc_sock_accept_local(), to avoid the module counter being modified
    when module local sockets are created or deleted. This has worked well
    until we introduced name space support.

    However, after name space support was introduced, we have observed that
    a reference count leak occurs, because the netns counter is not
    decremented in tipc_sock_delete_local().

    This commit remedies this problem. But instead of just modifying
    tipc_sock_delete_local(), we eliminate the whole parallel socket
    handling infrastructure, and start using the regular sk_create_kern(),
    kernel_accept() and sk_release_kernel() calls. Since those functions
    manipulate the module counter, we must now compensate for that by
    explicitly decrementing the counter after module local sockets are
    created, and increment it just before calling sk_release_kernel().

    Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace")
    Signed-off-by: Ying Xue
    Reviewed-by: Jon Maloy
    Reviewed-by: Erik Hugne
    Reported-by: Cong Wang
    Tested-by: Erik Hugne
    Signed-off-by: David S. Miller

    Ying Xue
     

13 Jan, 2015

3 commits

  • TIPC establishes one subscriber server which allows users to subscribe
    their interesting name service status. After tipc supports namespace,
    one dedicated tipc stack instance is created for each namespace, and
    each instance can be deemed as one independent TIPC node. As a result,
    subscriber server must be built for each namespace.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • TIPC name table is used to store the mapping relationship between
    TIPC service name and socket port ID. When tipc supports namespace,
    it allows users to publish service names only owned by a certain
    namespace. Therefore, every namespace must have its private name
    table to prevent service names published to one namespace from being
    contaminated by other service names in another namespace. Therefore,
    The name table global variable (ie, nametbl) and its lock must be
    moved to tipc_net structure, and a parameter of namespace must be
    added for necessary functions so that they can obtain name table
    variable defined in tipc_net structure.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Only the works of initializing and shutting down tipc module are done
    in core.h and core.c files, so all stuffs which are not closely
    associated with the two tasks should be moved to appropriate places.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

12 Apr, 2014

1 commit

  • Several spots in the kernel perform a sequence like:

    skb_queue_tail(&sk->s_receive_queue, skb);
    sk->sk_data_ready(sk, skb->len);

    But at the moment we place the SKB onto the socket receive queue it
    can be consumed and freed up. So this skb->len access is potentially
    to freed up memory.

    Furthermore, the skb->len can be modified by the consumer so it is
    possible that the value isn't accurate.

    And finally, no actual implementation of this callback actually uses
    the length argument. And since nobody actually cared about it's
    value, lots of call sites pass arbitrary values in such as '0' and
    even '1'.

    So just remove the length argument from the callback, that way there
    is no confusion whatsoever and all of these use-after-free cases get
    fixed as a side effect.

    Based upon a patch by Eric Dumazet and his suggestion to audit this
    issue tree-wide.

    Signed-off-by: David S. Miller

    David S. Miller
     

07 Mar, 2014

2 commits

  • When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a
    connection instance, its reference count value is increased if
    it's found. But subsequently if it's found that the connection is
    closed, the work of sending message is not queued into its server
    send workqueue, and the connection reference count is not decreased.
    This will cause a reference count leak. To reproduce this problem,
    an application would need to open and closes topology server
    connections with high intensity.

    We fix this by immediately decrementing the connection reference
    count if a send fails due to the connection being closed.

    Signed-off-by: Ying Xue
    Acked-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Currently connection shutdown callback function is called when
    connection instance is released in tipc_conn_kref_release(), and
    receiving packets and sending packets are running in different
    threads. Even if connection is closed by the thread of receiving
    packets, its shutdown callback may not be called immediately as
    the connection reference count is non-zero at that moment. So,
    although the connection is shut down by the thread of receiving
    packets, the thread of sending packets doesn't know it. Before
    its shutdown callback is invoked to tell the sending thread its
    connection has been closed, the sending thread may deliver
    messages by tipc_conn_sendmsg(), this is why the following error
    information appears:

    "Sending subscription event failed, no memory"

    To eliminate it, allow connection shutdown callback function to
    be called before connection id is removed in tipc_close_conn(),
    which makes the sending thread know the truth in time that its
    socket is closed so that it doesn't send message to it. We also
    remove the "Sending XXX failed..." error reporting for topology
    and config services.

    Signed-off-by: Ying Xue
    Signed-off-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

22 Feb, 2014

1 commit

  • When tipc module is inserted, many tipc components are initialized
    one by one. During the initialization period, if one of them is
    failed, tipc_core_stop() will be called to stop all components
    whatever corresponding components are created or not. To avoid to
    release uncreated ones, relevant components have to add necessary
    enabled flags indicating whether they are created or not.

    But in the initialization stage, if one component is unsuccessfully
    created, we will just destroy successfully created components before
    the failed component instead of all components. All enabled flags
    defined in components, in turn, become redundant. Additionally it's
    also unnecessary to identify whether table.types is NULL in
    tipc_nametbl_stop() because name stable has been definitely created
    successfully when tipc_nametbl_stop() is called.

    Cc: Jon Maloy
    Cc: Erik Hugne
    Signed-off-by: Ying Xue
    Reviewed-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Ying Xue
     

15 Jan, 2014

1 commit


02 Aug, 2013

1 commit

  • When creation of TIPC internal server socket fails,
    we get an oops with the following dump:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
    IP: [] tipc_close_conn+0x59/0xb0 [tipc]
    PGD 13719067 PUD 12008067 PMD 0
    Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
    Modules linked in: tipc(+)
    CPU: 4 PID: 4340 Comm: insmod Not tainted 3.10.0+ #1
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
    task: ffff880014360000 ti: ffff88001374c000 task.ti: ffff88001374c000
    RIP: 0010:[] [] tipc_close_conn+0x59/0xb0 [tipc]
    RSP: 0018:ffff88001374dc98 EFLAGS: 00010292
    RAX: 0000000000000000 RBX: ffff880012ac09d8 RCX: 0000000000000000
    RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff880014360000
    RBP: ffff88001374dcb8 R08: 0000000000000001 R09: 0000000000000001
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0016fa0
    R13: ffffffffa0017010 R14: ffffffffa0017010 R15: ffff880012ac09d8
    FS: 0000000000000000(0000) GS:ffff880016600000(0063) knlGS:00000000f76668d0
    CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
    CR2: 0000000000000020 CR3: 0000000012227000 CR4: 00000000000006e0
    Stack:
    ffff88001374dcb8 ffffffffa0016fa0 0000000000000000 0000000000000001
    ffff88001374dcf8 ffffffffa0012922 ffff88001374dce8 00000000ffffffea
    ffffffffa0017100 0000000000000000 ffff8800134241a8 ffffffffa0017150
    Call Trace:
    [] tipc_server_stop+0xa2/0x1b0 [tipc]
    [] tipc_subscr_stop+0x15/0x20 [tipc]
    [] tipc_core_stop+0x1d/0x33 [tipc]
    [] tipc_init+0xd4/0xf8 [tipc]
    [] ? 0xffffffffa001efff
    [] do_one_initcall+0x3f/0x150
    [] ? __blocking_notifier_call_chain+0x7d/0xd0
    [] load_module+0x11aa/0x19c0
    [] ? show_initstate+0x50/0x50
    [] ? retint_restore_args+0xe/0xe
    [] SyS_init_module+0xd9/0x110
    [] sysenter_dispatch+0x7/0x1f
    Code: 6c 24 70 4c 89 ef e8 b7 04 8f e1 8b 73 04 4c 89 e7 e8 7c 9e 32 e1 41 83 ac 24
    b8 00 00 00 01 4c 89 ef e8 eb 0a 8f e1 48 8b 43 08 8b 68 20 4d 8d a5 48 03 00
    00 4c 89 e7 e8 04 05 8f e1 4c 89
    RIP [] tipc_close_conn+0x59/0xb0 [tipc]
    RSP
    CR2: 0000000000000020
    ---[ end trace b02321f40e4269a3 ]---

    We have the following call chain:

    tipc_core_start()
    ret = tipc_subscr_start()
    ret = tipc_server_start(){
    server->enabled = 1;
    ret = tipc_open_listening_sock()
    }

    I.e., the server->enabled flag is unconditionally set to 1, whatever
    the return value of tipc_open_listening_sock().

    This causes a crash when tipc_core_start() tries to clean up
    resources after a failed initialization:

    if (ret == failed)
    tipc_subscr_stop()
    tipc_server_stop(){
    if (server->enabled)
    tipc_close_conn(){
    NULL reference of con->sock-sk
    OOPS!
    }
    }

    To avoid this, tipc_server_start() should only set server->enabled
    to 1 in case of a succesful socket creation. In case of failure, it
    should release all allocated resources before returning.

    Problem introduced in commit c5fa7b3cf3cb22e4ac60485fc2dc187fe012910f
    ("tipc: introduce new TIPC server infrastructure") in v3.11-rc1.
    Note that it won't be seen often; it takes a module load under memory
    constrained conditions in order to trigger the failure condition.

    Signed-off-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Ying Xue
     

18 Jun, 2013

1 commit

  • TIPC has two internal servers, one providing a subscription
    service for topology events, and another providing the
    configuration interface. These servers have previously been running
    in BH context, accessing the TIPC-port (aka native) API directly.
    Apart from these servers, even the TIPC socket implementation is
    partially built on this API.

    As this API may simultaneously be called via different paths and in
    different contexts, a complex and costly lock policiy is required
    in order to protect TIPC internal resources.

    To eliminate the need for this complex lock policiy, we introduce
    a new, generic service API that uses kernel sockets for message
    passing instead of the native API. Once the toplogy and configuration
    servers are converted to use this new service, all code pertaining
    to the native API can be removed. This entails a significant
    reduction in code amount and complexity, and opens up for a complete
    rework of the locking policy in TIPC.

    The new service also solves another problem:

    As the current topology server works in BH context, it cannot easily
    be blocked when sending of events fails due to congestion. In such
    cases events may have to be silently dropped, something that is
    unacceptable. Therefore, the new service keeps a dedicated outbound
    queue receiving messages from BH context. Once messages are
    inserted into this queue, we will immediately schedule a work from a
    special workqueue. This way, messages/events from the topology server
    are in reality sent in process context, and the server can block
    if necessary.

    Analogously, there is a new workqueue for receiving messages. Once a
    notification about an arriving message is received in BH context, we
    schedule a work from the receive workqueue to do the job of
    receiving the message in process context.

    As both sending and receive messages are now finished in processes,
    subscribed events cannot be dropped any more.

    As of this commit, this new server infrastructure is built, but
    not actually yet called by the existing TIPC code, but since the
    conversion changes required in order to use it are significant,
    the addition is kept here as a separate commit.

    Signed-off-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Ying Xue