08 Dec, 2018

1 commit

  • commit 29e270fc32192e7729057963ae7120663856c93e upstream.

    Got below warning with gcc 8.2 compiler.

    net/tipc/topsrv.c: In function ‘tipc_topsrv_start’:
    net/tipc/topsrv.c:660:2: warning: ‘strncpy’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
    strncpy(srv->name, name, strlen(name) + 1);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    net/tipc/topsrv.c:660:27: note: length computed here
    strncpy(srv->name, name, strlen(name) + 1);
    ^~~~~~~~~~~~
    So change it to correct length and use strscpy.

    Signed-off-by: Guoqing Jiang
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Guoqing Jiang
     

23 Aug, 2017

2 commits

  • No matter whether a request is inserted into workqueue as a work item
    to cancel a subscription or to delete a subscription's subscriber
    asynchronously, the work items may be executed in different workers.
    As a result, it doesn't mean that one request which is raised prior to
    another request is definitely handled before the latter. By contrast,
    if the latter request is executed before the former request, below
    error may happen:

    [ 656.183644] BUG: spinlock bad magic on CPU#0, kworker/u8:0/12117
    [ 656.184487] general protection fault: 0000 [#1] SMP
    [ 656.185160] Modules linked in: tipc ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 9pnet virtio_net virtio_pci virtio_ring virtio [last unloaded: ip6_udp_tunnel]
    [ 656.187003] CPU: 0 PID: 12117 Comm: kworker/u8:0 Not tainted 4.11.0-rc7+ #6
    [ 656.187920] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    [ 656.188690] Workqueue: tipc_rcv tipc_recv_work [tipc]
    [ 656.189371] task: ffff88003f5cec40 task.stack: ffffc90004448000
    [ 656.190157] RIP: 0010:spin_bug+0xdd/0xf0
    [ 656.190678] RSP: 0018:ffffc9000444bcb8 EFLAGS: 00010202
    [ 656.191375] RAX: 0000000000000034 RBX: ffff88003f8d1388 RCX: 0000000000000000
    [ 656.192321] RDX: ffff88003ba13708 RSI: ffff88003ba0cd08 RDI: ffff88003ba0cd08
    [ 656.193265] RBP: ffffc9000444bcd0 R08: 0000000000000030 R09: 000000006b6b6b6b
    [ 656.194208] R10: ffff8800bde3e000 R11: 00000000000001b4 R12: 6b6b6b6b6b6b6b6b
    [ 656.195157] R13: ffffffff81a3ca64 R14: ffff88003f8d1388 R15: ffff88003f8d13a0
    [ 656.196101] FS: 0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
    [ 656.197172] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 656.197935] CR2: 00007f0b3d2e6000 CR3: 000000003ef9e000 CR4: 00000000000006f0
    [ 656.198873] Call Trace:
    [ 656.199210] do_raw_spin_lock+0x66/0xa0
    [ 656.199735] _raw_spin_lock_bh+0x19/0x20
    [ 656.200258] tipc_subscrb_subscrp_delete+0x28/0xf0 [tipc]
    [ 656.200990] tipc_subscrb_rcv_cb+0x45/0x260 [tipc]
    [ 656.201632] tipc_receive_from_sock+0xaf/0x100 [tipc]
    [ 656.202299] tipc_recv_work+0x2b/0x60 [tipc]
    [ 656.202872] process_one_work+0x157/0x420
    [ 656.203404] worker_thread+0x69/0x4c0
    [ 656.203898] kthread+0x138/0x170
    [ 656.204328] ? process_one_work+0x420/0x420
    [ 656.204889] ? kthread_create_on_node+0x40/0x40
    [ 656.205527] ret_from_fork+0x29/0x40
    [ 656.206012] Code: 48 8b 0c 25 00 c5 00 00 48 c7 c7 f0 24 a3 81 48 81 c1 f0 05 00 00 65 8b 15 61 ef f5 7e e8 9a 4c 09 00 4d 85 e4 44 8b 4b 08 74 92 8b 84 24 40 04 00 00 49 8d 8c 24 f0 05 00 00 eb 8d 90 0f 1f
    [ 656.208504] RIP: spin_bug+0xdd/0xf0 RSP: ffffc9000444bcb8
    [ 656.209798] ---[ end trace e2a800e6eb0770be ]---

    In above scenario, the request of deleting subscriber was performed
    earlier than the request of canceling a subscription although the
    latter was issued before the former, which means tipc_subscrb_delete()
    was called before tipc_subscrp_cancel(). As a result, when
    tipc_subscrb_subscrp_delete() called by tipc_subscrp_cancel() was
    executed to cancel a subscription, the subscription's subscriber
    refcnt had been decreased to 1. After tipc_subscrp_delete() where
    the subscriber was freed because its refcnt was decremented to zero,
    but the subscriber's lock had to be released, as a consequence, panic
    happened.

    By contrast, if we increase subscriber's refcnt before
    tipc_subscrb_subscrp_delete() is called in tipc_subscrp_cancel(),
    the panic issue can be avoided.

    Fixes: d094c4d5f5c7 ("tipc: add subscription refcount to avoid invalid delete")
    Reported-by: Parthasarathy Bhuvaragan
    Signed-off-by: Ying Xue
    Signed-off-by: David S. Miller

    Ying Xue
     
  • In commit, 139bb36f754a ("tipc: advance the time of deleting
    subscription from subscriber->subscrp_list"), we delete the
    subscription from the subscribers list and from nametable
    unconditionally. This leads to the following bug if the timer
    running tipc_subscrp_timeout() in another CPU accesses the
    subscription list after the subscription delete request.

    [39.570] general protection fault: 0000 [#1] SMP
    ::
    [39.574] task: ffffffff81c10540 task.stack: ffffffff81c00000
    [39.575] RIP: 0010:tipc_subscrp_timeout+0x32/0x80 [tipc]
    [39.576] RSP: 0018:ffff88003ba03e90 EFLAGS: 00010282
    [39.576] RAX: dead000000000200 RBX: ffff88003f0f3600 RCX: 0000000000000101
    [39.577] RDX: dead000000000100 RSI: 0000000000000201 RDI: ffff88003f0d7948
    [39.578] RBP: ffff88003ba03ea0 R08: 0000000000000001 R09: ffff88003ba03ef8
    [39.579] R10: 000000000000014f R11: 0000000000000000 R12: ffff88003f0d7948
    [39.580] R13: ffff88003f0f3618 R14: ffffffffa006c250 R15: ffff88003f0f3600
    [39.581] FS: 0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
    [39.582] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [39.583] CR2: 00007f831c6e0714 CR3: 000000003d3b0000 CR4: 00000000000006f0
    [39.584] Call Trace:
    [39.584]
    [39.585] call_timer_fn+0x3d/0x180
    [39.585] ? tipc_subscrb_rcv_cb+0x260/0x260 [tipc]
    [39.586] run_timer_softirq+0x168/0x1f0
    [39.586] ? sched_clock_cpu+0x16/0xc0
    [39.587] __do_softirq+0x9b/0x2de
    [39.587] irq_exit+0x60/0x70
    [39.588] smp_apic_timer_interrupt+0x3d/0x50
    [39.588] apic_timer_interrupt+0x86/0x90
    [39.589] RIP: 0010:default_idle+0x20/0xf0
    [39.589] RSP: 0018:ffffffff81c03e58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
    [39.590] RAX: 0000000000000000 RBX: ffffffff81c10540 RCX: 0000000000000000
    [39.591] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    [39.592] RBP: ffffffff81c03e68 R08: 0000000000000000 R09: 0000000000000000
    [39.593] R10: ffffc90001cbbe00 R11: 0000000000000000 R12: 0000000000000000
    [39.594] R13: ffffffff81c10540 R14: 0000000000000000 R15: 0000000000000000
    [39.595]
    ::
    [39.603] RIP: tipc_subscrp_timeout+0x32/0x80 [tipc] RSP: ffff88003ba03e90
    [39.604] ---[ end trace 79ce94b7216cb459 ]---

    Fixes: 139bb36f754a ("tipc: advance the time of deleting subscription from subscriber->subscrp_list")
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

29 Mar, 2017

2 commits

  • When a new subscription object is inserted into name_seq->subscriptions
    list, it's under name_seq->lock protection; when a subscription is
    deleted from the list, it's also under the same lock protection;
    similarly, when accessing a subscription by going through subscriptions
    list, the entire process is also protected by the name_seq->lock.

    Therefore, if subscription refcount is increased before it's inserted
    into subscriptions list, and its refcount is decreased after it's
    deleted from the list, it will be unnecessary to hold refcount at all
    before accessing subscription object which is obtained by going through
    subscriptions list under name_seq->lock protection.

    Signed-off-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • After a subscription object is created, it's inserted into its
    subscriber subscrp_list list under subscriber lock protection,
    similarly, before it's destroyed, it should be first removed from
    its subscriber->subscrp_list. Since the subscription list is
    accessed with subscriber lock, all the subscriptions are valid
    during the lock duration. Hence in tipc_subscrb_subscrp_delete(), we
    remove subscription get/put and the extra subscriber unlock/lock.

    After this change, the subscriptions refcount cleanup is very simple
    and does not access any lock.

    Acked-by: Jon Maloy
    Signed-off-by: Ying Xue
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Ying Xue
     

23 Mar, 2017

1 commit

  • Until now, tipc_nametbl_unsubscribe() is called at subscriptions
    reference count cleanup. Usually the subscriptions cleanup is
    called at subscription timeout or at subscription cancel or at
    subscriber delete.

    We have ignored the possibility of this being called from other
    locations, which causes deadlock as we try to grab the
    tn->nametbl_lock while holding it already.

    CPU1: CPU2:
    ---------- ----------------
    tipc_nametbl_publish
    spin_lock_bh(&tn->nametbl_lock)
    tipc_nametbl_insert_publ
    tipc_nameseq_insert_publ
    tipc_subscrp_report_overlap
    tipc_subscrp_get
    tipc_subscrp_send_event
    tipc_close_conn
    tipc_subscrb_release_cb
    tipc_subscrb_delete
    tipc_subscrp_put
    tipc_subscrp_put
    tipc_subscrp_kref_release
    tipc_nametbl_unsubscribe
    spin_lock_bh(&tn->nametbl_lock)
    <>

    CPU1: CPU2:
    ---------- ----------------
    tipc_nametbl_stop
    spin_lock_bh(&tn->nametbl_lock)
    tipc_purge_publications
    tipc_nameseq_remove_publ
    tipc_subscrp_report_overlap
    tipc_subscrp_get
    tipc_subscrp_send_event
    tipc_close_conn
    tipc_subscrb_release_cb
    tipc_subscrb_delete
    tipc_subscrp_put
    tipc_subscrp_put
    tipc_subscrp_kref_release
    tipc_nametbl_unsubscribe
    spin_lock_bh(&tn->nametbl_lock)
    <>

    In this commit, we advance the calling of tipc_nametbl_unsubscribe()
    from the refcount cleanup to the intended callers.

    Fixes: d094c4d5f5c7 ("tipc: add subscription refcount to avoid invalid delete")
    Reported-by: John Thompson
    Acked-by: Jon Maloy
    Signed-off-by: Ying Xue
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Ying Xue
     

25 Jan, 2017

1 commit

  • Until now, the subscribers keep track of the subscriptions using
    reference count at subscriber level. At subscription cancel or
    subscriber delete, we delete the subscription only if the timer
    was pending for the subscription. This approach is incorrect as:
    1. del_timer() is not SMP safe, if on CPU0 the check for pending
    timer returns true but CPU1 might schedule the timer callback
    thereby deleting the subscription. Thus when CPU0 is scheduled,
    it deletes an invalid subscription.
    2. We export tipc_subscrp_report_overlap(), which accesses the
    subscription pointer multiple times. Meanwhile the subscription
    timer can expire thereby freeing the subscription and we might
    continue to access the subscription pointer leading to memory
    violations.

    In this commit, we introduce subscription refcount to avoid deleting
    an invalid subscription.

    Reported-and-Tested-by: John Thompson
    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

29 Apr, 2016

1 commit


15 Apr, 2016

1 commit

  • Until now, the requests sent to topology server are queued
    to a workqueue by the generic server framework.
    These messages are processed by worker threads and trigger the
    registered callbacks.
    To reduce latency on uniprocessor systems, explicit rescheduling
    is performed using cond_resched() after MAX_RECV_MSG_COUNT(25)
    messages.

    This implementation on SMP systems leads to an subscriber refcnt
    error as described below:
    When a worker thread yields by calling cond_resched() in a SMP
    system, a new worker is created on another CPU to process the
    pending workitem. Sometimes the sleeping thread wakes up before
    the new thread finishes execution.
    This breaks the assumption on ordering and being single threaded.
    The fault is more frequent when MAX_RECV_MSG_COUNT is lowered.

    If the first thread was processing subscription create and the
    second thread processing close(), the close request will free
    the subscriber and the create request oops as follows:

    [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380 [tipc]
    [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97
    [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc]
    [...]
    [31.228377] Call Trace:
    [31.228377] [] dump_stack+0x4d/0x72
    [31.228377] [] __warn+0xd1/0xf0
    [31.228377] [] warn_slowpath_null+0x1d/0x20
    [31.228377] [] tipc_subscrb_rcv_cb+0x317/0x380 [tipc]
    [31.228377] [] tipc_receive_from_sock+0xd4/0x130 [tipc]
    [31.228377] [] tipc_recv_work+0x2b/0x50 [tipc]
    [31.228377] [] process_one_work+0x145/0x3d0
    [31.246554] ---[ end trace c3882c9baa05a4fd ]---
    [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266
    [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428
    [31.249323] IP: [] spin_dump+0x5c/0xe0
    [31.249323] PGD 0
    [31.249323] Oops: 0000 [#1] SMP

    In this commit, we
    - rename tipc_conn_shutdown() to tipc_conn_release().
    - move connection release callback execution from tipc_close_conn()
    to a new function tipc_sock_release(), which is executed before
    we free the connection.
    Thus we release the subscriber during connection release procedure
    rather than connection shutdown procedure.

    Signed-off-by: Parthasarathy Bhuvaragan
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

09 Mar, 2016

1 commit


07 Mar, 2016

1 commit

  • commit 4d5cfcba2f6e ('tipc: fix connection abort during subscription
    cancel'), removes the check for a valid subscription before calling
    tipc_nametbl_subscribe().

    This will lead to a nullptr exception when we process a
    subscription cancel request. For a cancel request, a null
    subscription is passed to tipc_nametbl_subscribe() resulting
    in exception.

    In this commit, we call tipc_nametbl_subscribe() only for
    a valid subscription.

    Fixes: 4d5cfcba2f6e ('tipc: fix connection abort during subscription cancel')
    Reported-by: Anders Widell
    Signed-off-by: Parthasarathy Bhuvaragan
    Acked-by: Jon Maloy
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

06 Feb, 2016

9 commits

  • Until now, we create timers even for the subscription requests
    with timeout = TIPC_WAIT_FOREVER.
    This can be improved by avoiding timer creation when the timeout
    is set to TIPC_WAIT_FOREVER.

    In this commit, we introduce a check to creates timers only
    when timeout != TIPC_WAIT_FOREVER.

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Until now, during subscription creation the mod_time() &
    tipc_subscrb_get() are called after releasing the subscriber
    spin lock.

    In a SMP system when performing a subscription creation, if the
    subscription timeout occurs simultaneously (the timer is
    scheduled to run on another CPU) then the timer thread
    might decrement the subscribers refcount before the create
    thread increments the refcount.

    This can be simulated by creating subscription with timeout=0 and
    sometimes the timeout occurs before the create request is complete.
    This leads to the following message:
    [30.702949] BUG: spinlock bad magic on CPU#1, kworker/u8:3/87
    [30.703834] general protection fault: 0000 [#1] SMP
    [30.704826] CPU: 1 PID: 87 Comm: kworker/u8:3 Not tainted 4.4.0-rc8+ #18
    [30.704826] Workqueue: tipc_rcv tipc_recv_work [tipc]
    [30.704826] task: ffff88003f878600 ti: ffff88003fae0000 task.ti: ffff88003fae0000
    [30.704826] RIP: 0010:[] [] spin_dump+0x5c/0xe0
    [...]
    [30.704826] Call Trace:
    [30.704826] [] spin_bug+0x26/0x30
    [30.704826] [] do_raw_spin_lock+0xe5/0x120
    [30.704826] [] _raw_spin_lock_bh+0x19/0x20
    [30.704826] [] tipc_subscrb_rcv_cb+0x1d0/0x330 [tipc]
    [30.704826] [] tipc_receive_from_sock+0xc1/0x150 [tipc]
    [30.704826] [] tipc_recv_work+0x3f/0x80 [tipc]
    [30.704826] [] process_one_work+0x149/0x3c0
    [30.704826] [] worker_thread+0x66/0x460
    [30.704826] [] ? process_one_work+0x3c0/0x3c0
    [30.704826] [] ? process_one_work+0x3c0/0x3c0
    [30.704826] [] kthread+0xed/0x110
    [30.704826] [] ? kthread_create_on_node+0x190/0x190
    [30.704826] [] ret_from_fork+0x3f/0x70

    In this commit,
    1. we remove the check for the return code for mod_timer()
    2. we protect tipc_subscrb_get() using the subscriber spin lock.
    We increment the subscriber's refcount as soon as we add the
    subscription to subscriber's subscription list.

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Until now, while creating a subscription the subscriber lock
    protects only the subscribers subscription list and not the
    nametable. The call to tipc_nametbl_subscribe() is outside
    the lock. However, at subscription timeout and cancel both
    the subscribers subscription list and the nametable are
    protected by the subscriber lock.

    This asymmetric locking mechanism leads to the following problem:
    In a SMP system, the timer can be fire on another core before
    the create request is complete.
    When the timer thread calls tipc_nametbl_unsubscribe() before create
    thread calls tipc_nametbl_subscribe(), we get a nullptr exception.

    This can be simulated by creating subscription with timeout=0 and
    sometimes the timeout occurs before the create request is complete.

    The following is the oops:
    [57.569661] BUG: unable to handle kernel NULL pointer dereference at (null)
    [57.577498] IP: [] tipc_nametbl_unsubscribe+0x8a/0x120 [tipc]
    [57.584820] PGD 0
    [57.586834] Oops: 0002 [#1] SMP
    [57.685506] CPU: 14 PID: 10077 Comm: kworker/u40:1 Tainted: P OENX 3.12.48-52.27.1. 9688.1.PTF-default #1
    [57.703637] Workqueue: tipc_rcv tipc_recv_work [tipc]
    [57.708697] task: ffff88064c7f00c0 ti: ffff880629ef4000 task.ti: ffff880629ef4000
    [57.716181] RIP: 0010:[] [] tipc_nametbl_unsubscribe+0x8a/ 0x120 [tipc]
    [...]
    [57.812327] Call Trace:
    [57.814806] [] tipc_subscrp_delete+0x37/0x90 [tipc]
    [57.821357] [] tipc_subscrp_timeout+0x3f/0x70 [tipc]
    [57.827982] [] call_timer_fn+0x31/0x100
    [57.833490] [] run_timer_softirq+0x1f9/0x2b0
    [57.839414] [] __do_softirq+0xe5/0x230
    [57.844827] [] call_softirq+0x1c/0x30
    [57.850150] [] do_softirq+0x55/0x90
    [57.855285] [] irq_exit+0x95/0xa0
    [57.860290] [] smp_apic_timer_interrupt+0x45/0x60
    [57.866644] [] apic_timer_interrupt+0x6d/0x80
    [57.872686] [] tipc_subscrb_rcv_cb+0x2a5/0x3f0 [tipc]
    [57.879425] [] tipc_receive_from_sock+0x9f/0x100 [tipc]
    [57.886324] [] tipc_recv_work+0x26/0x60 [tipc]
    [57.892463] [] process_one_work+0x172/0x420
    [57.898309] [] worker_thread+0x11a/0x3c0
    [57.903871] [] kthread+0xb4/0xc0
    [57.908751] [] ret_from_fork+0x58/0x90

    In this commit, we do the following at subscription creation:
    1. set the subscription's subscriber pointer before performing
    tipc_nametbl_subscribe(), as this value is required further in
    the call chain ex: by tipc_subscrp_send_event().
    2. move tipc_nametbl_subscribe() under the scope of subscriber lock

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Until now, the subscribers endianness for a subscription
    create/cancel request is determined as:
    swap = !(s->filter & (TIPC_SUB_PORTS | TIPC_SUB_SERVICE))
    The checks are performed only for port/service subscriptions.

    The swap calculation is incorrect if the filter in the subscription
    cancellation request is set to TIPC_SUB_CANCEL (it's a malformed
    cancel request, as the corresponding subscription create filter
    is missing).
    Thus, the check if the request is for cancellation fails and the
    request is treated as a subscription create request. The
    subscription creation fails as the request is illegal, which
    terminates this connection.

    In this commit we determine the endianness by including
    TIPC_SUB_CANCEL, which will set swap correctly and the
    request is processed as a cancellation request.

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • In 'commit 7fe8097cef5f ("tipc: fix nullpointer bug when subscribing
    to events")', we terminate the connection if the subscription
    creation fails.
    In the same commit, the subscription creation result was based on
    the value of subscription pointer (set in the function) instead of
    the return code.

    Unfortunately, the same function also handles subscription
    cancellation request. For a subscription cancellation request,
    the subscription pointer cannot be set. Thus the connection is
    terminated during cancellation request.

    In this commit, we move the subcription cancel check outside
    of tipc_subscrp_create(). Hence,
    - tipc_subscrp_create() will create a subscripton
    - tipc_subscrb_rcv_cb() will subscribe or cancel a subscription.

    Fixes: 'commit 7fe8097cef5f ("tipc: fix nullpointer bug when subscribing to events")'

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • In this commit, we split tipc_subscrp_create() into two:
    1. tipc_subscrp_create() creates a subscription
    2. A new function tipc_subscrp_subscribe() adds the
    subscription to the subscriber subscription list,
    activates the subscription timer and subscribes to
    the nametable updates.

    In future commits, the purpose of tipc_subscrb_rcv_cb() will
    be to either subscribe or cancel a subscription.

    There is no functional change in this commit.

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Until now, struct tipc_subscriber has duplicate fields for
    type, upper and lower (as member of struct tipc_name_seq) at:
    1. as member seq in struct tipc_subscription
    2. as member seq in struct tipc_subscr, which is contained
    in struct tipc_event
    The former structure contains the type, upper and lower
    values in network byte order and the later contains the
    intact copy of the request.
    The struct tipc_subscription contains a field swap to
    determine if request needs network byte order conversion.
    Thus by using swap, we can convert the request when
    required instead of duplicating it.

    In this commit,
    1. we remove the references to these elements as members of
    struct tipc_subscription and replace them with elements
    from struct tipc_subscr.
    2. provide new functions to convert the user request into
    network byte order.

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Until now, struct tipc_subscription has duplicate timeout and filter
    attributes present:
    1. directly as members of struct tipc_subscription
    2. in struct tipc_subscr, which is contained in struct tipc_event

    In this commit, we remove the references to these elements as
    members of struct tipc_subscription and replace them with elements
    from struct tipc_subscr.

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     
  • Until now, during subscription creation we set sub->timeout by
    converting the timeout request value in milliseconds to jiffies.
    This is followed by setting the timeout value in the timer if
    sub->timeout != TIPC_WAIT_FOREVER.

    For a subscription create request with a timeout value of
    TIPC_WAIT_FOREVER, msecs_to_jiffies(TIPC_WAIT_FOREVER)
    returns MAX_JIFFY_OFFSET (0xfffffffe). This is not equal to
    TIPC_WAIT_FOREVER (0xffffffff).

    In this commit, we remove this check.

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

30 Jan, 2016

1 commit

  • In 'commit 7fe8097cef5f ("tipc: fix nullpointer bug when subscribing
    to events")', we terminate the connection if the subscription
    creation fails.
    In the same commit, the subscription creation result was based on
    the value of the subscription pointer (set in the function) instead
    of the return code.

    Unfortunately, the same function tipc_subscrp_create() handles
    subscription cancel request. For a subscription cancellation request,
    the subscription pointer cannot be set. Thus if a subscriber has
    several subscriptions and cancels any of them, the connection is
    terminated.

    In this commit, we terminate the connection based on the return value
    of tipc_subscrp_create().
    Fixes: commit 7fe8097cef5f ("tipc: fix nullpointer bug when subscribing to events")

    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

05 May, 2015

4 commits

  • Currently subscriber's lock protects not only subscriber's subscription
    list but also all subscriptions linked into the list. However, as all
    members of subscription are never changed after they are initialized,
    it's unnecessary for subscription to be protected under subscriber's
    lock. If the lock is used to only protect subscriber's subscription
    list, the adjustment not only makes the locking policy simpler, but
    also helps to avoid a deadlock which may happen once creating a
    subscription is failed.

    Signed-off-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • At present subscriber's lock is used to protect the subscription list
    of subscriber as well as subscriptions linked into the list. While one
    or all subscriptions are deleted through iterating the list, the
    subscriber's lock must be held. Meanwhile, as deletion of subscription
    may happen in subscription timer's handler, the lock must be grabbed
    in the function as well. When subscription's timer is terminated with
    del_timer_sync() during above iteration, subscriber's lock has to be
    temporarily released, otherwise, deadlock may occur. However, the
    temporary release may cause the double free of a subscription as the
    subscription is not disconnected from the subscription list.

    Now if a reference counter is introduced to subscriber, subscription's
    timer can be asynchronously stopped with del_timer(). As a result, the
    issue is not only able to be fixed, but also relevant code is pretty
    readable and understandable.

    Signed-off-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Introducing a new function makes the purpose of tipc_subscrb_connect_cb
    callback routine more clear.

    Signed-off-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • When a topology server accepts a connection request from its client,
    it allocates a connection instance and a tipc_subscriber structure
    object. The former is used to communicate with client, and the latter
    is often treated as a subscriber which manages all subscription events
    requested from a same client. When a topology server receives a request
    of subscribing name services from a client through the connection, it
    creates a tipc_subscription structure instance which is seen as a
    subscription recording what name services are subscribed. In order to
    manage all subscriptions from a same client, topology server links
    them into the subscrp_list of the subscriber. So subscriber and
    subscription completely represents different meanings respectively,
    but function names associated with them make us so confused that we
    are unable to easily tell which function is against subscriber and
    which is to subscription. So we want to eliminate the confusion by
    renaming them.

    Signed-off-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

28 Feb, 2015

1 commit

  • If a subscription request is sent to a topology server
    connection, and any error occurs (malformed request, oom
    or limit reached) while processing this request, TIPC should
    terminate the subscriber connection. While doing so, it tries
    to access fields in an already freed (or never allocated)
    subscription element leading to a nullpointer exception.
    We fix this by removing the subscr_terminate function and
    terminate the connection immediately upon any subscription
    failure.

    Signed-off-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Erik Hugne
     

13 Jan, 2015

3 commits

  • TIPC establishes one subscriber server which allows users to subscribe
    their interesting name service status. After tipc supports namespace,
    one dedicated tipc stack instance is created for each namespace, and
    each instance can be deemed as one independent TIPC node. As a result,
    subscriber server must be built for each namespace.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • TIPC name table is used to store the mapping relationship between
    TIPC service name and socket port ID. When tipc supports namespace,
    it allows users to publish service names only owned by a certain
    namespace. Therefore, every namespace must have its private name
    table to prevent service names published to one namespace from being
    contaminated by other service names in another namespace. Therefore,
    The name table global variable (ie, nametbl) and its lock must be
    moved to tipc_net structure, and a parameter of namespace must be
    added for necessary functions so that they can obtain name table
    variable defined in tipc_net structure.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Not only some wrapper function like k_term_timer() is empty, but also
    some others including k_start_timer() and k_cancel_timer() don't return
    back any value to its caller, what's more, there is no any component
    in the kernel world to do such thing. Therefore, these timer interfaces
    defined in tipc module should be purged.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

09 Dec, 2014

1 commit

  • When a list_head variable is seen as a new entry to be added to a
    list head, it's unnecessary to be initialized with INIT_LIST_HEAD().

    Signed-off-by: Ying Xue
    Reviewed-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Tested-by: Erik Hugne
    Signed-off-by: David S. Miller

    Ying Xue
     

24 Aug, 2014

1 commit

  • We move the inline functions in the file port.h to socket.c, and modify
    their names accordingly.

    We move struct tipc_port and some macros to socket.h.

    Finally, we remove the file port.h.

    Signed-off-by: Jon Maloy
    Reviewed-by: Erik Hugne
    Reviewed-by: Ying Xue
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

25 Mar, 2014

1 commit

  • If a topology event subscription fails for any reason, such as out
    of memory, max number reached or because we received an invalid
    request the correct behavior is to terminate the subscribers
    connection to the topology server. This is currently broken and
    produces the following oops:

    [27.953662] tipc: Subscription rejected, illegal request
    [27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6
    [27.957066] lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1
    [27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ #5
    [27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    [27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc]
    [27.961430] ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050
    [27.962292] ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a
    [27.963152] ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520
    [27.964023] Call Trace:
    [27.964292] [] dump_stack+0x45/0x56
    [27.964874] [] spin_dump+0x8c/0x91
    [27.965420] [] spin_bug+0x21/0x26
    [27.965995] [] do_raw_spin_lock+0x116/0x140
    [27.966631] [] _raw_spin_lock_bh+0x15/0x20
    [27.967256] [] subscr_conn_shutdown_event+0x20/0xa0 [tipc]
    [27.968051] [] tipc_close_conn+0xa4/0xb0 [tipc]
    [27.968722] [] tipc_conn_terminate+0x1a/0x30 [tipc]
    [27.969436] [] subscr_conn_msg_event+0x1f2/0x2f0 [tipc]
    [27.970209] [] tipc_receive_from_sock+0x90/0xf0 [tipc]
    [27.970972] [] tipc_recv_work+0x29/0x50 [tipc]
    [27.971633] [] process_one_work+0x165/0x3e0
    [27.972267] [] worker_thread+0x119/0x3a0
    [27.972896] [] ? manage_workers.isra.25+0x2a0/0x2a0
    [27.973622] [] kthread+0xdf/0x100
    [27.974168] [] ? kthread_create_on_node+0x1a0/0x1a0
    [27.974893] [] ret_from_fork+0x7c/0xb0
    [27.975466] [] ? kthread_create_on_node+0x1a0/0x1a0

    The recursion occurs when subscr_terminate tries to grab the
    subscriber lock, which is already taken by subscr_conn_msg_event.
    We fix this by checking if the request to establish a new
    subscription was successful, and if not we initiate termination of
    the subscriber after we have released the subscriber lock.

    Signed-off-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Erik Hugne
     

07 Mar, 2014

2 commits

  • When a topology server subscriber is disconnected, the associated
    connection id is set to zero. A check vs zero is then done in the
    subscription timeout function to see if the subscriber have been
    shut down. This is unnecessary, because all subscription timers
    will be cancelled when a subscriber terminates. Setting the
    connection id to zero is actually harmful because id zero is the
    identity of the topology server listening socket, and can cause a
    race that leads to this socket being closed instead.

    Signed-off-by: Erik Hugne
    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Erik Hugne
     
  • Currently connection shutdown callback function is called when
    connection instance is released in tipc_conn_kref_release(), and
    receiving packets and sending packets are running in different
    threads. Even if connection is closed by the thread of receiving
    packets, its shutdown callback may not be called immediately as
    the connection reference count is non-zero at that moment. So,
    although the connection is shut down by the thread of receiving
    packets, the thread of sending packets doesn't know it. Before
    its shutdown callback is invoked to tell the sending thread its
    connection has been closed, the sending thread may deliver
    messages by tipc_conn_sendmsg(), this is why the following error
    information appears:

    "Sending subscription event failed, no memory"

    To eliminate it, allow connection shutdown callback function to
    be called before connection id is removed in tipc_close_conn(),
    which makes the sending thread know the truth in time that its
    socket is closed so that it doesn't send message to it. We also
    remove the "Sending XXX failed..." error reporting for topology
    and config services.

    Signed-off-by: Ying Xue
    Signed-off-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

15 Jan, 2014

1 commit


18 Jun, 2013

2 commits

  • No runtime code changes here. Just a realign of the function
    arguments to start where the 1st one was, and fit as many args
    as can be put in an 80 char line.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Paul Gortmaker
     
  • As the new TIPC server infrastructure has been introduced, we can
    now convert the TIPC topology server to it. We get two benefits
    from doing this:

    1) It simplifies the topology server locking policy. In the
    original locking policy, we placed one spin lock pointer in the
    tipc_subscriber structure to reuse the lock of the subscriber's
    server port, controlling access to members of tipc_subscriber
    instance. That is, we only used one lock to ensure both
    tipc_port and tipc_subscriber members were safely accessed.

    Now we introduce another spin lock for tipc_subscriber structure
    only protecting themselves, to get a finer granularity locking
    policy. Moreover, the change will allow us to make the topology
    server code more readable and maintainable.

    2) It fixes a bug where sent subscription events may be lost when
    the topology port is congested. Using the new service, the
    topology server now queues sent events into an outgoing buffer,
    and then wakes up a sender process which has been blocked in
    workqueue context. The process will keep picking events from the
    buffer and send them to their respective subscribers, using the
    kernel socket interface, until the buffer is empty. Even if the
    socket is congested during transmission there is no risk that
    events may be dropped, since the sender process may block when
    needed.

    Some minor reordering of initialization is done, since we now
    have a scenario where the topology server must be started after
    socket initialization has taken place, as the former depends
    on the latter. And overall, we see a simplification of the
    TIPC subscriber code in making this changeover.

    Signed-off-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Ying Xue
     

08 Dec, 2012

1 commit

  • Currently we have tipc_disconnect and tipc_disconnect_port. It is
    not clear from the names alone, what they do or how they differ.
    It turns out that tipc_disconnect just deals with the port locking
    and then calls tipc_disconnect_port which does all the work.

    If we rename as follows: tipc_disconnect_port --> __tipc_disconnect
    then we will be following typical linux convention, where:

    __tipc_disconnect: "raw" function that does all the work.

    tipc_disconnect: wrapper that deals with locking and then calls
    the real core __tipc_disconnect function

    With this, the difference is immediately evident, and locking
    violations are more apt to be spotted by chance while working on,
    or even just while reading the code.

    On the connect side of things, we currently only have the single
    "tipc_connect2port" function. It does both the locking at enter/exit,
    and the core of the work. Pending changes will make it desireable to
    have the connect be a two part locking wrapper + worker function,
    just like the disconnect is already.

    Here, we make the connect look just like the updated disconnect case,
    for the above reason, and for consistency. In the process, we also
    get rid of the "2port" suffix that was on the original name, since
    it adds no descriptive value.

    On close examination, one might notice that the above connect
    changes implicitly move the call to tipc_link_get_max_pkt() to be
    within the scope of tipc_port_lock() protected region; when it was
    not previously. We don't see any issues with this, and it is in
    keeping with __tipc_connect doing the work and tipc_connect just
    handling the locking.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

20 Aug, 2012

1 commit


14 Jul, 2012

1 commit

  • All messages should go directly to the kernel log. The TIPC
    specific error, warning, info and debug trace macro's are
    removed and all references replaced with pr_err, pr_warn,
    pr_info and pr_debug.

    Commonly used sub-strings are explicitly declared as a const
    char to reduce .text size.

    Note that this means the debug messages (changed to pr_debug),
    are now enabled through dynamic debugging, instead of a TIPC
    specific Kconfig option (TIPC_DEBUG). The latter will be
    phased out completely

    Signed-off-by: Erik Hugne
    Signed-off-by: Jon Maloy
    [PG: use pr_fmt as suggested by Joe Perches ]
    Signed-off-by: Paul Gortmaker

    Erik Hugne