11 Dec, 2019

1 commit

  • The current rbtree for service ranges in the name table is built based
    on the 'lower' & 'upper' range values resulting in a flaw in the rbtree
    searching. Some issues have been observed in case of range overlapping:

    Case #1: unable to withdraw a name entry:
    After some name services are bound, all of them are withdrawn by the user
    but one remains in the name table forever. This corrupts the table, and
    that service becomes a dummy, i.e. it has no real port.
    E.g.

                     /
              {22, 22}
                /
               /
    --->  {10, 50}
           /    \
          /      \
     {10, 30}  {20, 60}

    The node {10, 30} cannot be removed since the rbtree search stops at
    the node's ancestor, i.e. {10, 50}, so a search starting from there never
    reaches the node being looked for.

    Case #2: failed to send data in some cases:
    E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for
    this service will be one of the two cases below depending on the order
    of the bindings:

    {20, 60} {10, 50}
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     

23 Nov, 2019

1 commit

  • It is observed that TIPC service binding order will not be kept in the
    publication event report to user if the service is subscribed after the
    bindings.

    For example, services are bound by application in the following order:

    Server: bound port A to {18888,66,66} scope 2
    Server: bound port A to {18888,33,33} scope 2

    Now, if a client subscribes to the service range (e.g. {18888, 0-100}),
    it will get the 'TIPC_PUBLISHED' events in that binding order only when
    the subscription is started before the bindings.
    Otherwise, if started after the bindings, the events will arrive in the
    opposite order:

    Client: received event for published {18888,33,33}
    Client: received event for published {18888,66,66}

    In the latter case, the bindings clearly already exist in the name
    table, so when reported, the events follow the order of the rbtree
    binding nodes (a node with a lesser 'lower'/'upper' range value comes
    first).

    This is correct as we provide the tracking on a specific service status
    (available or not), not the relationship between multiple services.
    However, some users expect to see the same order of arriving events
    irrespective of when the subscription is issued. This turns out to be
    easy to fix. We now add functionality to ensure that publication events
    always are issued in the same temporal order as the corresponding
    bindings were performed.

    v2: replace the unnecessary macro - 'publication_after()' with inline
    function.
    v3: reuse 'time_after32()' instead of reinventing the same exact code.

    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
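
The ordering test mentioned in the v2/v3 notes above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: the struct, the 'id' field and both function names are hypothetical, and the comparison only mirrors the arithmetic that the kernel's time_after32() is known to use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a publication item; only the binding
 * sequence number matters for the ordering. */
struct publ_sketch {
	uint32_t id;	/* assumed: incremented for every new binding */
};

/* Same arithmetic as time_after32(a, b): true if 'a' is later than 'b',
 * even when the 32-bit counter has wrapped around in between. */
static bool seq_after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* True if publication 'pa' was bound after publication 'pb'. */
static bool publ_after(const struct publ_sketch *pa,
		       const struct publ_sketch *pb)
{
	return seq_after32(pa->id, pb->id);
}
```

Comparing through a signed difference rather than with a plain '>' is what keeps the order stable across counter wraparound.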
     

28 Apr, 2019

1 commit

  • Even though the NLA_F_NESTED flag was introduced more than 11 years ago, most
    netlink based interfaces (including recently added ones) are still not
    setting it in kernel generated messages. Without the flag, message parsers
    not aware of attribute semantics (e.g. wireshark dissector or libmnl's
    mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
    the structure of their contents.

    Unfortunately we cannot just add the flag everywhere as there may be
    userspace applications which check nlattr::nla_type directly rather than
    through a helper masking out the flags. Therefore the patch renames
    nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
    as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
    are rewritten to use nla_nest_start().

    Except for changes in include/net/netlink.h, the patch was generated using
    this semantic patch:

    @@ expression E1, E2; @@
    -nla_nest_start(E1, E2)
    +nla_nest_start_noflag(E1, E2)

    @@ expression E1, E2; @@
    -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
    +nla_nest_start(E1, E2)

    Signed-off-by: Michal Kubecek
    Acked-by: Jiri Pirko
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Michal Kubecek
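
The compatibility concern described above (userspace reading nlattr::nla_type directly) can be illustrated with a small sketch. The flag values are repeated from the uapi netlink header so the example builds without kernel headers; the two helper names are hypothetical.

```c
#include <stdint.h>

/* Values as in the kernel's uapi <linux/netlink.h>, repeated here so the
 * sketch is self-contained. */
#define NLA_F_NESTED		(1 << 15)
#define NLA_F_NET_BYTEORDER	(1 << 14)
#define NLA_TYPE_MASK		(~(NLA_F_NESTED | NLA_F_NET_BYTEORDER))

/* Fragile: compares the raw field, so it stops matching as soon as the
 * kernel starts setting NLA_F_NESTED on this attribute. */
static int attr_is_raw(uint16_t nla_type, uint16_t wanted)
{
	return nla_type == wanted;
}

/* Robust: masks out the flag bits before comparing. */
static int attr_is(uint16_t nla_type, uint16_t wanted)
{
	return (nla_type & NLA_TYPE_MASK) == wanted;
}
```

An application written like attr_is_raw() is the kind that would break if the flag were added everywhere at once, which is why the old behaviour stays available as nla_nest_start_noflag().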
     

11 Apr, 2019

1 commit

  • When binding multiple services with specific types 1Ki, 2Ki, ...,
    some publication entries in the name table are missing when listed
    via 'tipc name show'.

    The problem lies in how a zero 'last_type' value provided via netlink
    is interpreted. It can mean two things: the initial 'type' when a
    name table dump starts, or a dump that continues from service type
    zero (the node state service type). In the second case the lookup
    function fails to find the node state service type in the next
    iteration.

    To solve this, we add a condition that marks the type as dirty, and
    look up the correct service type for the next iteration instead of
    selecting the first service as for the initial 'type' zero.

    Acked-by: Jon Maloy
    Signed-off-by: Hoang Le
    Signed-off-by: David S. Miller

    Hoang Le
     

23 Oct, 2018

1 commit

  • We have seen the following race scenario:
    1) named_distribute() builds a "bulk" message, containing a PUBLISH
    item for a certain publication. This is based on the contents of
    the binding table's 'cluster_scope' list.
    2) tipc_named_withdraw() removes the same publication from the list,
    builds a WITHDRAW message and distributes it to all cluster nodes.
    3) tipc_named_node_up(), which was calling named_distribute(), sends
    out the bulk message built under 1)
    4) The WITHDRAW message arrives at the just detected node, finds
    no corresponding publication, and is dropped.
    5) The PUBLISH item arrives at the same node, is added to its binding
    table, and remains there forever.

    This arrival disordering was earlier taken care of by the backlog queue,
    originally added for a different purpose, which was removed in the
    commit referred to below, but we now need a different solution.
    In this commit, we replace the rcu lock protecting the 'cluster_scope'
    list with a regular RW lock which also covers the sending of the
    bulk message. This guarantees both the list integrity and the
    message sending order. We will later add a commit which cleans up
    this code further.

    Note that this commit needs recently added commit d3092b2efca1 ("tipc:
    fix unsafe rcu locking when accessing publication list") to apply
    cleanly.

    Fixes: 37922ea4a310 ("tipc: permit overlapping service ranges in name table")
    Reported-by: Tuong Lien Tong
    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

28 Aug, 2018

1 commit

  • In function tipc_dest_push, the 32bit variables 'node' and 'port'
    are stored separately in the upper and lower parts of the 64bit 'value'.
    Then this value is assigned to dst->value, which is a union like:
    union
    {
        struct {
            u32 port;
            u32 node;
        };
        u64 value;
    }
    This works on little-endian machines like x86 but fails on big-endian
    machines.

    The fix removes the 'value' stack parameter and even the 'value'
    member of the union in tipc_dest, and assigns the 'node' and 'port'
    members directly from the input parameters to avoid the endian issue.

    Fixes: a80ae5306a73 ("tipc: improve destination linked list")
    Signed-off-by: Zhenbo Gao
    Acked-by: Jon Maloy
    Signed-off-by: Haiqing Bai
    Signed-off-by: David S. Miller

    Haiqing Bai
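
The byte order dependency described above can be reproduced in userspace. The sketch below is only an illustration: the type and function names are hypothetical, and uint32_t/uint64_t stand in for the kernel's u32/u64.

```c
#include <stdint.h>

/* Layout from the commit message: two 32-bit members overlaid on one
 * 64-bit value. */
union dest_old {
	struct {
		uint32_t port;
		uint32_t node;
	};
	uint64_t value;
};

/* Old approach: 'node' goes into the upper half and 'port' into the lower
 * half. Which struct member each half lands in depends on host byte order. */
static union dest_old pack_old(uint32_t node, uint32_t port)
{
	union dest_old d;

	d.value = (uint64_t)node << 32 | port;
	return d;
}

/* Fixed approach: no overlay, the members are assigned directly. */
struct dest_new {
	uint32_t port;
	uint32_t node;
};

static struct dest_new pack_new(uint32_t node, uint32_t port)
{
	struct dest_new d = { .port = port, .node = node };

	return d;
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const unsigned char *)&probe == 1;
}
```

With pack_old() the 'port' member only holds the port number on a little-endian host, whereas pack_new() gives the same result on any host.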
     

28 Jul, 2018

1 commit


11 May, 2018

1 commit

  • In commit be47e41d77fb ("tipc: fix use-after-free in tipc_nametbl_stop")
    we fixed a problem caused by premature release of service range items.

    That fix is correct, and solved the problem. However, it doesn't address
    the root of the problem, which is that we don't lookup the tipc_service
    -> service_range -> publication items in the correct hierarchical
    order.

    In this commit we try to make this right, and as a side effect obtain
    some code simplification.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

19 Apr, 2018

1 commit

  • When we delete a service item in tipc_nametbl_stop() we loop over
    all service ranges in the service's RB tree, and for each service
    range we loop over its pertaining publications while calling
    tipc_service_remove_publ() for each of them.

    However, tipc_service_remove_publ() has the side effect that it also
    removes the comprising service range item when there are no publications
    left. This leads to a "use-after-free" access when the inner loop
    continues to the next iteration, since the range item holding the list
    we are looping no longer exists.

    We fix this by moving the delete of the service range item outside
    the said function. Instead, we now let the two functions calling it
    test if the list is empty and perform the removal when that is the
    case.

    Reported-by: syzbot+d64b64afc55660106556@syzkaller.appspotmail.com
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
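
The ownership rule introduced by this fix, where the remove function no longer frees the container and the caller does so once the list is empty, can be sketched as follows. All types and function names here are hypothetical simplifications, not the kernel's.

```c
#include <stdlib.h>

/* Hypothetical, reduced stand-ins for publication and service range items. */
struct publ_item {
	struct publ_item *next;
};

struct range_item {
	struct publ_item *publs;	/* singly linked list of publications */
};

/* Allocate a range holding up to 'n' publications (fewer if an allocation
 * fails); returns NULL if the range itself cannot be allocated. */
static struct range_item *range_new(int n)
{
	struct range_item *r = calloc(1, sizeof(*r));

	while (r && n-- > 0) {
		struct publ_item *p = calloc(1, sizeof(*p));

		if (!p)
			break;
		p->next = r->publs;
		r->publs = p;
	}
	return r;
}

/* Unlink and free one publication. The range itself is never freed here,
 * so a caller looping over r->publs always has a valid container. */
static void range_remove_publ(struct range_item *r)
{
	struct publ_item *p = r->publs;

	if (!p)
		return;
	r->publs = p->next;
	free(p);
}

/* The caller tests for an empty list and performs the removal itself.
 * Returns the number of publications that were removed. */
static int range_destroy(struct range_item *r)
{
	int removed = 0;

	while (r->publs) {
		range_remove_publ(r);
		removed++;
	}
	free(r);	/* the list is empty, so dropping the range is safe */
	return removed;
}
```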
     

13 Apr, 2018

1 commit

  • When a topology subscription is created, we may encounter (or KASAN
    may provoke) a failure to create a corresponding service instance in
    the binding table. Instead of letting the tipc_nametbl_subscribe()
    report the failure back to the caller, the function just makes a warning
    printout and returns, without incrementing the subscription reference
    counter as expected by the caller.

    This makes the caller believe that the subscription was successful, so
    it will at a later moment try to unsubscribe the item. This involves
    a sub_put() call. Since the reference counter never was incremented
    in the first place, we get a premature delete of the subscription item,
    followed by a "use-after-free" warning.

    We fix this by adding a return value to tipc_nametbl_subscribe() and
    make the caller aware of the failure to subscribe.

    This bug seems to always have been around, but this fix only applies
    back to the commit shown below. Given the low risk of this happening
    we believe this to be sufficient.

    Fixes: commit 218527fe27ad ("tipc: replace name table service range
    array with rb tree")
    Reported-by: syzbot+aa245f26d42b8305d157@syzkaller.appspotmail.com

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

01 Apr, 2018

3 commits

  • With the new RB tree structure for service ranges it becomes possible to
    solve an old problem: we can now allow overlapping service ranges in
    the table.

    When inserting a new service range to the tree, we use 'lower' as primary
    key, and when necessary 'upper' as secondary key.

    Since there may now be multiple service ranges matching an indicated
    'lower' value, we must also add the 'upper' value to the functions
    used for removing publications, so that the correct, corresponding
    range item can be found.

    These changes guarantee that a well-formed publication/withdrawal item
    from a peer node never will be rejected, and make it possible to
    eliminate the problematic backlog functionality we currently have for
    handling such cases.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
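
The key order described above ('lower' as primary key, 'upper' as secondary key) amounts to a two-level comparison. A minimal sketch, using a hypothetical trimmed-down range type rather than the kernel's range item:

```c
#include <stdint.h>

/* Hypothetical, reduced range item; the real one carries more state. */
struct range_key {
	uint32_t lower;
	uint32_t upper;
};

/* Three-way comparison: 'lower' decides first, 'upper' breaks ties.
 * Returns a negative value, 0 or a positive value, like strcmp(). */
static int range_cmp(const struct range_key *a, const struct range_key *b)
{
	if (a->lower != b->lower)
		return a->lower < b->lower ? -1 : 1;
	if (a->upper != b->upper)
		return a->upper < b->upper ? -1 : 1;
	return 0;
}
```

Because two ranges can now share the same 'lower', a removal has to supply both values to identify the intended node, which is why 'upper' was added to the removal functions.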
     
  • The function tipc_nametbl_translate() is ugly and hard to follow.
    This can be improved somewhat by introducing a stack variable for
    holding the publication list to be used and re-ordering the
    if-clauses for selection of algorithm.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • The current design of the binding table has an unnecessarily memory
    consuming and complex data structure. It aggregates the service range
    items into an array, which is expanded by a factor two every time it
    becomes too small to hold a new item. Furthermore, the arrays never
    shrink when the number of ranges diminishes.

    We now replace this array with an RB tree that is holding the range
    items as tree nodes, each range directly holding a list of bindings.

    This, along with a few name changes, improves both readability and
    volume of the code, as well as reducing memory consumption and hopefully
    improving cache hit rate.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

24 Mar, 2018

2 commits

  • As a preparation to changing the addressing structure of TIPC we replace
    all direct accesses to the tipc_net::own_addr field with the function
    dedicated for this, tipc_own_addr().

    There are no changes to program logic in this commit.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • The removal of an internal structure of the node address has an unwanted
    side effect.
    - Currently, if a user is sending an anycast message with destination
    domain 0, the tipc_nametbl_translate() function will use the
    'closest-first' algorithm to first look for a node local destination,
    and only when no such is found, will it resort to the cluster global
    'round-robin' lookup algorithm.
    - Current users can get around this, and enforce unconditional use of
    global round-robin by indicating a destination as Z.0.0 or Z.C.0.
    - This option disappears when we make the node address flat, since the
    lookup algorithm has no way of recognizing this case. So, as long as
    there are node local destinations, the algorithm will always select
    one of those, and there is nothing the sender can do to change this.

    We solve this by eliminating the 'closest-first' option, which was never
    a good idea anyway, for non-legacy users, but only for those. To
    distinguish between legacy users and non-legacy users we introduce a new
    flag 'legacy_addr_format' in struct tipc_core, to be set when the user
    configures a legacy-style Z.C.N node address. Hence, when a legacy user
    indicates a zero lookup domain 'closest-first' is selected, and in all
    other cases we use 'round-robin'.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
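
The selection rule in the last paragraph above reduces to a single condition. The sketch below is illustrative only: the enum and function are hypothetical, and only the flag name 'legacy_addr_format' comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

enum lookup_algo {
	LOOKUP_ROUND_ROBIN,
	LOOKUP_CLOSEST_FIRST
};

/* 'legacy_addr_format' mirrors the flag named in the commit message;
 * 'domain' is the lookup domain indicated by the sender. */
static enum lookup_algo select_algo(bool legacy_addr_format, uint32_t domain)
{
	if (legacy_addr_format && domain == 0)
		return LOOKUP_CLOSEST_FIRST;
	return LOOKUP_ROUND_ROBIN;
}
```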
     

18 Mar, 2018

4 commits

  • We rename some lists and fields in struct publication both to make
    the naming more consistent and to better reflect their roles. We
    also update the descriptions of those lists.

    node_list -> local_publ
    cluster_list -> all_publ
    pport_list -> binding_sock
    ref -> port

    There are no functional changes in this commit.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • As a further consequence of the previous commits, we can also remove
    the member 'zone_list' in struct name_info and struct publication.
    Instead, we now let the member cluster_list take over the role as
    container of all publications of a given <type,lower,upper>.
    We also remove the counters for the size of those lists, since
    they don't serve any purpose.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • As a consequence of the previous commit we can now eliminate zone scope
    related lists in the name table. We start with name_table::publ_list[3],
    which can now be replaced with two lists, one for node scope publications
    and one for cluster scope publications.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • Publications for TIPC_CLUSTER_SCOPE and TIPC_ZONE_SCOPE are in all
    aspects handled the same way, both on the publishing node and on the
    receiving nodes.

    Despite previous ambitions to the contrary, this is never going to change,
    so we take the consequence of this and obsolete TIPC_ZONE_SCOPE and related
    macros/functions. Whenever a user is doing a bind() or a sendmsg() attempt
    using ZONE_SCOPE we translate this internally to CLUSTER_SCOPE, while we
    remain compatible with users and remote nodes still using ZONE_SCOPE.

    Furthermore, the non-formalized scope value 0 has always been permitted
    for use during lookup, with the same meaning as ZONE_SCOPE/CLUSTER_SCOPE.
    We now permit it even as binding scope, but for compatibility reasons we
    choose to not change the value of TIPC_CLUSTER_SCOPE.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
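
The scope translation described above can be sketched as a small mapping function. The numeric values are assumed to match the TIPC uapi header of that time, the names are hypothetical, and rejecting unknown values with -1 is an assumption of this sketch rather than something the commit message states.

```c
/* Values assumed to match the uapi <linux/tipc.h> of that time. */
enum {
	SCOPE_ZONE    = 1,	/* TIPC_ZONE_SCOPE, now obsolete */
	SCOPE_CLUSTER = 2,	/* TIPC_CLUSTER_SCOPE */
	SCOPE_NODE    = 3	/* TIPC_NODE_SCOPE */
};

/* Map a user supplied scope to the one used internally. Returns -1 for
 * values outside the known range (this rejection policy is an assumption). */
static int normalize_scope(int scope)
{
	switch (scope) {
	case 0:			/* non-formalized value, same meaning */
	case SCOPE_ZONE:
	case SCOPE_CLUSTER:
		return SCOPE_CLUSTER;
	case SCOPE_NODE:
		return SCOPE_NODE;
	default:
		return -1;
	}
}
```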
     

17 Feb, 2018

4 commits

  • In order to narrow the interface and dependencies between the topology
    server and the subscription/binding table functionality we move struct
    tipc_server inside the file server.c. This requires some code
    adaptations in other files, but those are mostly minor.

    The most important change is that we have to move the start/stop
    functions for the topology server to server.c, where they logically
    belong anyway.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • Since we now have removed struct tipc_subscriber from the code, and
    only struct tipc_subscription remains, there is no longer need for long
    and awkward prefixes to distinguish between their pertaining functions.

    We now change all tipc_subscrp_* prefixes to tipc_sub_*. This is
    a purely cosmetic change.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • Because of the requirement for total distribution transparency, users
    send subscriptions and receive topology events in their own host format.
    It is up to the topology server to determine this format and do the
    correct conversions to and from its own host format when needed.

    Until now, this has been handled in a rather non-transparent way inside
    the topology server and subscriber code, leading to unnecessary
    complexity when creating subscriptions and issuing events.

    We now improve this situation by adding two new macros, tipc_sub_read()
    and tipc_evt_write(). Both macros calculate the need for conversion
    internally before performing their respective operations. Hence, all
    handling of such conversions becomes transparent to the rest of the
    code.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
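
The idea behind the two accessors above, deciding about byte swapping inside the accessor so that callers never see it, can be sketched like this. Everything here is hypothetical: the record layout, the explicit 'swap' flag (how the need for conversion is actually detected is not described above) and the macro names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swab32_sketch(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical, reduced subscription record. */
struct sub_sketch {
	uint32_t type;
	uint32_t lower;
	uint32_t upper;
	bool swap;	/* assumed: set when the user's byte order differs */
};

/* Read a field, converting on the fly when the formats differ. */
#define sub_read(sub_, field_) \
	((sub_)->swap ? swab32_sketch((sub_)->field_) : (sub_)->field_)

/* Store a value in the user's format. */
#define evt_write(sub_, field_, val_) \
	((sub_)->field_ = (sub_)->swap ? swab32_sketch(val_) : (val_))
```

Callers simply write sub_read(s, lower) and get a value in their own host format, whichever format the record was received in.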
     
  • The message transmission and reception in the topology server is more
    generic than is currently necessary. By basing the functionality on the
    fact that we only send items of type struct tipc_event and always
    receive items of struct tipc_subscr we can make several simplifications,
    and also get rid of some unnecessary dynamic memory allocations.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

16 Jan, 2018

1 commit

  • In commit 232d07b74a33 ("tipc: improve groupcast scope handling") we
    inadvertently broke non-group multicast transmission when changing the
    parameter 'domain' to 'scope' in the function
    tipc_nametbl_lookup_dst_nodes(). We failed to make the corresponding
    change in the calling function, with the result that the lookup always
    fails.

    A closer analysis reveals that this parameter is not needed at all.
    Non-group multicast is hard coded to use CLUSTER_SCOPE, and in the
    current implementation this will be delivered to all matching
    destinations except those which are published with NODE_SCOPE on other
    nodes. Since such publications never will be visible on the sending node
    anyway, it makes no sense to discriminate by scope at all.

    We now remove this parameter altogether.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

10 Jan, 2018

3 commits

  • When a member joins a group, it also indicates a binding scope. This
    makes it possible to create both node local groups, invisible to other
    nodes, as well as cluster global groups, visible everywhere.

    In order to avoid that different members end up having permanently
    differing views of group size and membership, we must inhibit locally
    and globally bound members from joining the same group.

    We do this by using the binding scope as an additional separator between
    groups. I.e., a member must ignore all membership events from sockets
    using a different scope than itself, and all lookups for message
    destinations must require an exact match between the message's lookup
    scope and the potential target's binding scope.

    Apart from making it possible to create local groups using the same
    identity on different nodes, a side effect of this is that it now also
    becomes possible to create a cluster global group with the same identity
    across the same nodes, without interfering with the local groups.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • Currently, when a user is subscribing for binding table publications,
    he will receive a PUBLISH event for all already existing matching items
    in the binding table.

    However, a group socket making a subscription doesn't need this initial
    status update from the binding table, because it has already scanned it
    during the join operation. Worse, the multiplicatory effect of issuing
    mutual events for dozens or hundreds of group members within a short time
    frame puts a heavy load on the topology server, with the end result that
    scale out operations on a big group tend to take much longer than needed.

    We now add a new filter option, TIPC_SUB_NO_STATUS, for topology server
    subscriptions, so that this initial avalanche of events is suppressed.
    This change, along with the previous commit, significantly improves the
    range and speed of group scale out operations.

    We keep the new option internal for the tipc driver, at least for now.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • When a socket is joining a group, we look up in the binding table to
    find if there are already other members of the group present. This is
    used for being able to return EAGAIN instead of EHOSTUNREACH if the
    user proceeds directly to a send attempt.

    However, the information in the binding table can be used to directly
    set the created member in state MBR_PUBLISHED and send a JOIN message
    to the peer, instead of waiting for a topology PUBLISH event to do this.
    When there are many members in a group, the propagation time for such
    events can be significant, and we can save time during the join
    operation if we use the initial lookup result fully.

    In this commit, we eliminate the member state MBR_DISCOVERED which has
    been the result of the initial lookup, and do instead go directly to
    MBR_PUBLISHED, which initiates the setup.

    After this change, the tipc_member FSM looks as follows:

         +-----------+
    ---->| PUBLISHED |----------------------------------------------+
    PUB- +-----------+                               LEAVE/WITHDRAW |
    LISH      |JOIN                                                 |
              |     +-------------------------------------------+   |
              |     |                            LEAVE/WITHDRAW |   |
              |     |                +------------+             |   |
              |     |   +----------->|  PENDING   |---------+   |   |
              |     |   |msg/maxactv +-+---+------+  LEAVE/ |   |   |
              |     |   |              |   |       WITHDRAW |   |   |
              |     |   |   +----------+   |                |   |   |
              |     |   |   |revert/maxactv|                |   |   |
              |     |   |   V              V                V   V   V
              |   +----------+  msg  +------------+       +-----------+
              +-->|  JOINED  |------>|   ACTIVE   |------>|  LEAVING  |--->
              |   +----------+       +--- -+------+ LEAVE/+-----------+DOWN
              |        A   A               |      WITHDRAW  A   A   A  EVT
              |        |   |               |RECLAIM         |   |   |
              |        |   |REMIT          V                |   |   |
              |        |   |== adv   +------------+         |   |   |
              |        |   +---------| RECLAIMING |---------+   |   |
              |        |             +-----+------+  LEAVE/     |   |
              |        |                   |REMIT   WITHDRAW    |   |
              |        |                   |< adv               |   |
              |        |msg/               V          LEAVE/    |   |
              |        |adv==ADV_IDLE+------------+  WITHDRAW   |   |
              |        +-------------|  REMITTED  |-------------+   |
              |                      +------------+                 |
              |PUBLISH                                              |
    JOIN +-----------+                               LEAVE/WITHDRAW |
    ---->|  JOINING  |----------------------------------------------+
         +-----------+

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy

    Signed-off-by: David S. Miller

    Jon Maloy
     

26 Oct, 2017

1 commit

  • The following warning was reported by syzbot on Oct 24, 2017:
    KASAN: slab-out-of-bounds Read in tipc_nametbl_lookup_dst_nodes

    This is a harmless bug, but we still want to get rid of the warning,
    so we swap the two conditions in question.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

13 Oct, 2017

3 commits

  • In this commit, we make it possible to send connectionless unicast
    messages to any member corresponding to the given member identity,
    when there is more than one such member. The sender must use a
    TIPC_ADDR_NAME address to achieve this effect.

    We also perform load balancing between the destinations, i.e., we
    primarily select one which has advertised sufficient send window
    to not cause a block/EAGAIN delay, if any. This mechanism is
    overlaid on the always present round-robin selection.

    Anycast messages are subject to the same start synchronization
    and flow control mechanism as group broadcast messages.

    Signed-off-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • As a preparation for introducing flow control for multicast and datagram
    messaging we need a more strictly defined framework than we have now. A
    socket must be able to keep track of exactly how many and which other
    sockets it is allowed to communicate with at any moment, and keep the
    necessary state for those.

    We therefore introduce a new concept we have named Communication Group.
    Sockets can join a group via a new setsockopt() call TIPC_GROUP_JOIN.
    The call takes four parameters: 'type' serves as group identifier,
    'instance' serves as a logical member identifier, and 'scope' indicates
    the visibility of the group (node/cluster/zone). Finally, 'flags' makes
    it possible to set certain properties for the member. For now, there is
    only one flag, indicating if the creator of the socket wants to receive
    a copy of broadcast or multicast messages it is sending via the socket,
    and if it wants to be eligible as destination for its own anycasts.

    A group is closed, i.e., sockets which have not joined a group will
    not be able to send messages to or receive messages from members of
    the group, and vice versa.

    Any member of a group can send multicast ('group broadcast') messages
    to all group members, optionally including itself, using the primitive
    send(). The messages are received via the recvmsg() primitive. A socket
    can only be member of one group at a time.

    Signed-off-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • We often see a need for a linked list of destination identities,
    sometimes containing a port number, sometimes a node identity, and
    sometimes both. The currently defined struct u32_list is not generic
    enough to cover all cases, so we extend it to contain two u32 integers
    and rename it to struct tipc_dest_list.

    Signed-off-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Jon Maloy
     

29 Mar, 2017

1 commit

  • When a new subscription object is inserted into name_seq->subscriptions
    list, it's under name_seq->lock protection; when a subscription is
    deleted from the list, it's also under the same lock protection;
    similarly, when accessing a subscription by going through subscriptions
    list, the entire process is also protected by the name_seq->lock.

    Therefore, if subscription refcount is increased before it's inserted
    into subscriptions list, and its refcount is decreased after it's
    deleted from the list, it will be unnecessary to hold refcount at all
    before accessing subscription object which is obtained by going through
    subscriptions list under name_seq->lock protection.

    Signed-off-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

21 Jan, 2017

1 commit


04 Jan, 2017

1 commit

  • During multicast reception we currently use a simple linked list with
    push/pop semantics to store port numbers.

    We now see a need for a more generic list for storing values of type
    u32. We therefore make some modifications to this list, while replacing
    the prefix 'tipc_plist_' with 'u32_'. We also add a couple of new
    functions which will be used in the next commits.

    Acked-by: Parthasarathy Bhuvaragan
    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

08 Mar, 2016

1 commit


06 Feb, 2016

1 commit

  • Until now, struct tipc_subscriber has duplicate fields for
    type, upper and lower (as member of struct tipc_name_seq) in two places:
    1. as member seq in struct tipc_subscription
    2. as member seq in struct tipc_subscr, which is contained
    in struct tipc_event
    The former structure contains the type, upper and lower
    values in network byte order and the latter contains the
    intact copy of the request.
    The struct tipc_subscription contains a field swap to
    determine if request needs network byte order conversion.
    Thus by using swap, we can convert the request when
    required instead of duplicating it.

    In this commit,
    1. we remove the references to these elements as members of
    struct tipc_subscription and replace them with elements
    from struct tipc_subscr.
    2. provide new functions to convert the user request into
    network byte order.

    Acked-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

21 Nov, 2015

1 commit

  • The file name_distr.c currently contains three functions,
    named_cluster_distribute(), tipc_publ_subscribe() and
    tipc_publ_unsubscribe() that all directly access fields in
    struct tipc_node. We want to eliminate such dependencies, so
    we move those functions to the file node.c and rename them to
    tipc_node_broadcast(), tipc_node_subscribe() and tipc_node_unsubscribe()
    respectively.

    Reviewed-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

05 May, 2015

1 commit

  • When a topology server accepts a connection request from its client,
    it allocates a connection instance and a tipc_subscriber structure
    object. The former is used to communicate with the client, and the
    latter is treated as a subscriber which manages all subscription
    events requested by the same client. When the topology server receives
    a request to subscribe to name services from a client through the
    connection, it creates a tipc_subscription structure instance, which
    records what name services are subscribed to. In order to manage all
    subscriptions from the same client, the topology server links them
    into the subscrp_list of the subscriber. Subscriber and subscription
    thus mean entirely different things, but the function names associated
    with them are so confusing that it is hard to tell which functions
    operate on a subscriber and which on a subscription. We eliminate the
    confusion by renaming them.

    Signed-off-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

18 Mar, 2015

1 commit

  • [ 28.531768] =============================================
    [ 28.532322] [ INFO: possible recursive locking detected ]
    [ 28.532322] 3.19.0+ #194 Not tainted
    [ 28.532322] ---------------------------------------------
    [ 28.532322] insmod/583 is trying to acquire lock:
    [ 28.532322] (&(&nseq->lock)->rlock){+.....}, at: [] tipc_nametbl_remove_publ+0x49/0x2e0 [tipc]
    [ 28.532322]
    [ 28.532322] but task is already holding lock:
    [ 28.532322] (&(&nseq->lock)->rlock){+.....}, at: [] tipc_nametbl_stop+0xfc/0x1f0 [tipc]
    [ 28.532322]
    [ 28.532322] other info that might help us debug this:
    [ 28.532322] Possible unsafe locking scenario:
    [ 28.532322]
    [ 28.532322] CPU0
    [ 28.532322] ----
    [ 28.532322] lock(&(&nseq->lock)->rlock);
    [ 28.532322] lock(&(&nseq->lock)->rlock);
    [ 28.532322]
    [ 28.532322] *** DEADLOCK ***
    [ 28.532322]
    [ 28.532322] May be due to missing lock nesting notation
    [ 28.532322]
    [ 28.532322] 3 locks held by insmod/583:
    [ 28.532322] #0: (net_mutex){+.+.+.}, at: [] register_pernet_subsys+0x1f/0x50
    [ 28.532322] #1: (&(&tn->nametbl_lock)->rlock){+.....}, at: [] tipc_nametbl_stop+0xb1/0x1f0 [tipc]
    [ 28.532322] #2: (&(&nseq->lock)->rlock){+.....}, at: [] tipc_nametbl_stop+0xfc/0x1f0 [tipc]
    [ 28.532322]
    [ 28.532322] stack backtrace:
    [ 28.532322] CPU: 1 PID: 583 Comm: insmod Not tainted 3.19.0+ #194
    [ 28.532322] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
    [ 28.532322] ffffffff82394460 ffff8800144cb928 ffffffff81792f3e 0000000000000007
    [ 28.532322] ffffffff82394460 ffff8800144cba28 ffffffff810a8080 ffff8800144cb998
    [ 28.532322] ffffffff810a4df3 ffff880013e9cb10 ffffffff82b0d330 ffff880013e9cb38
    [ 28.532322] Call Trace:
    [ 28.532322] [] dump_stack+0x4c/0x65
    [ 28.532322] [] __lock_acquire+0x740/0x1ca0
    [ 28.532322] [] ? __bfs+0x23/0x270
    [ 28.532322] [] ? check_irq_usage+0x96/0xe0
    [ 28.532322] [] ? __lock_acquire+0x1133/0x1ca0
    [ 28.532322] [] ? tipc_nametbl_remove_publ+0x49/0x2e0 [tipc]
    [ 28.532322] [] lock_acquire+0x9c/0x140
    [ 28.532322] [] ? tipc_nametbl_remove_publ+0x49/0x2e0 [tipc]
    [ 28.532322] [] _raw_spin_lock_bh+0x3f/0x50
    [ 28.532322] [] ? tipc_nametbl_remove_publ+0x49/0x2e0 [tipc]
    [ 28.532322] [] tipc_nametbl_remove_publ+0x49/0x2e0 [tipc]
    [ 28.532322] [] tipc_nametbl_stop+0x13e/0x1f0 [tipc]
    [ 28.532322] [] ? tipc_nametbl_stop+0x5/0x1f0 [tipc]
    [ 28.532322] [] tipc_init_net+0x13b/0x150 [tipc]
    [ 28.532322] [] ? tipc_init_net+0x5/0x150 [tipc]
    [ 28.532322] [] ops_init+0x4e/0x150
    [ 28.532322] [] ? trace_hardirqs_on+0xd/0x10
    [ 28.532322] [] register_pernet_operations+0xf3/0x190
    [ 28.532322] [] register_pernet_subsys+0x2e/0x50
    [ 28.532322] [] tipc_init+0x6a/0x1000 [tipc]
    [ 28.532322] [] ? 0xffffffffa0024000
    [ 28.532322] [] do_one_initcall+0x89/0x1c0
    [ 28.532322] [] ? kmem_cache_alloc_trace+0x50/0x1b0
    [ 28.532322] [] ? do_init_module+0x2b/0x200
    [ 28.532322] [] do_init_module+0x64/0x200
    [ 28.532322] [] load_module+0x12f3/0x18e0
    [ 28.532322] [] ? show_initstate+0x50/0x50
    [ 28.532322] [] SyS_init_module+0xd9/0x110
    [ 28.532322] [] sysenter_dispatch+0x7/0x1f

    tipc_purge_publications() holds the name sequence's lock when it
    calls tipc_nametbl_remove_publ() to remove a publication from that
    name sequence. However, before tipc_nametbl_remove_publ() calls
    tipc_nameseq_remove_publ() to remove the publication, it first
    looks up the name sequence for the publication and then takes the
    lock of the name sequence it found. Because that lock may already
    have been taken in tipc_purge_publications(), a deadlock occurs, as
    the scenario above shows. tipc_nameseq_remove_publ() itself does
    not take the name sequence's lock, so the deadlock is avoided if
    tipc_purge_publications() invokes it directly.
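
    The locking pattern can be illustrated with a small userspace
    sketch. It is not the kernel code: the toy lock, the struct and all
    function names below are hypothetical, and the toy lock reports a
    failure at the point where a real non-recursive spinlock would spin
    forever.

```c
#include <stdbool.h>

/* Toy non-recursive lock (hypothetical, single-threaded model). */
struct toy_lock {
    bool held;
};

static bool toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return false;   /* a real spinlock would deadlock here */
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = false;
}

/* Hypothetical stand-in for a name sequence and its publications. */
struct name_seq {
    struct toy_lock lock;
    int publ_count;
};

/* Lockless helper: the caller must already hold seq->lock. */
static void seq_remove_publ(struct name_seq *seq)
{
    if (seq->publ_count > 0)
        seq->publ_count--;
}

/* Locking wrapper: takes seq->lock itself.
 * Returns 0 on success, -1 if the lock is already held. */
static int table_remove_publ(struct name_seq *seq)
{
    if (!toy_lock_acquire(&seq->lock))
        return -1;
    seq_remove_publ(seq);
    toy_lock_release(&seq->lock);
    return 0;
}

/* Buggy purge: holds the lock, then calls the locking wrapper. */
static int purge_buggy(struct name_seq *seq)
{
    int err = 0;

    if (!toy_lock_acquire(&seq->lock))
        return -1;
    while (seq->publ_count > 0 && !err)
        err = table_remove_publ(seq);   /* second acquire fails */
    toy_lock_release(&seq->lock);
    return err;
}

/* Fixed purge: holds the lock and calls the lockless helper. */
static int purge_fixed(struct name_seq *seq)
{
    if (!toy_lock_acquire(&seq->lock))
        return -1;
    while (seq->publ_count > 0)
        seq_remove_publ(seq);
    toy_lock_release(&seq->lock);
    return 0;
}
```

    In this model purge_buggy() fails on its first removal and leaves
    the publications in place, while purge_fixed() removes them all
    under a single acquisition of the lock.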

    Fixes: 97ede29e80ee ("tipc: convert name table read-write lock to RCU")
    Signed-off-by: Ying Xue
    Reviewed-by: Erik Hugne
    Signed-off-by: David S. Miller

    Ying Xue
     

10 Feb, 2015

1 commit