24 Mar, 2018

4 commits

  • We add a 128-bit node identity, as an alternative to the currently used
    32-bit node address.

    For the sake of compatibility and to minimize message header changes
    we retain the existing 32-bit address field. When not set explicitly by
    the user, this field will be filled with a hash value generated from the
    much longer node identity, and be used as a shorthand value for the
    latter.

    We permit either the address or the identity to be set by configuration,
    but not both, so when the address value is set by a legacy user the
    corresponding 128-bit node identity is generated based on that value.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
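
The relationship between the 128-bit identity and the 32-bit shorthand address can be illustrated with a minimal user-space sketch. The helper names and the XOR-folding hash are assumptions made for illustration; the commit text does not say which hash function is used, and this is not presented as the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NODE_ID_LEN 16 /* 128-bit node identity, per the commit text */

/* Hypothetical helper: fold a 128-bit identity into a 32-bit shorthand
 * address by XOR-ing its four 32-bit words. Illustrative hash only. */
static uint32_t node_id_to_addr(const uint8_t *id)
{
    uint32_t word, hash = 0;
    size_t i;

    assert(id != NULL);
    for (i = 0; i < NODE_ID_LEN; i += sizeof(word)) {
        memcpy(&word, id + i, sizeof(word));
        hash ^= word;
    }
    return hash;
}

/* Hypothetical helper for the legacy direction: derive an identity from
 * a configured 32-bit address by zero-padding it to 128 bits. */
static void addr_to_node_id(uint32_t addr, uint8_t *id)
{
    assert(id != NULL);
    memset(id, 0, NODE_ID_LEN);
    memcpy(id, &addr, sizeof(addr));
}
```

With these two illustrative helpers, an identity derived from a legacy address hashes back to the same address, which is the kind of consistency the commit asks for between the two fields.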
     
  • As a preparation to changing the addressing structure of TIPC we replace
    all direct accesses to the tipc_net::own_addr field with the function
    dedicated for this, tipc_own_addr().

    There are no changes to program logic in this commit.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • The removal of an internal structure of the node address has an unwanted
    side effect.
    - Currently, if a user is sending an anycast message with destination
    domain 0, the tipc_nametbl_translate() function will use the 'closest-
    first' algorithm to first look for a node local destination, and only
    when none is found will it resort to the cluster global 'round-
    robin' lookup algorithm.
    - Current users can get around this, and enforce unconditional use of
    global round-robin by indicating a destination as Z.0.0 or Z.C.0.
    - This option disappears when we make the node address flat, since the
    lookup algorithm has no way of recognizing this case. So, as long as
    there are node local destinations, the algorithm will always select
    one of those, and there is nothing the sender can do to change this.

    We solve this by eliminating the 'closest-first' option for non-legacy
    users, and only for those; it was never a good idea anyway. To
    distinguish between legacy users and non-legacy users we introduce a new
    flag 'legacy_addr_format' in struct tipc_core, to be set when the user
    configures a legacy-style Z.C.N node address. Hence, when a legacy user
    indicates a zero lookup domain, 'closest-first' is selected, and in all
    other cases we use 'round-robin'.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • Nominally, TIPC organizes network nodes into a three-level network
    hierarchy consisting of the levels 'zone', 'cluster' and 'node'. This
    hierarchy is reflected in the node address format: it is sub-divided
    into an 8-bit zone id, a 12-bit cluster id, and a 12-bit node id.

    However, the 'zone' and 'cluster' levels have in reality never been
    fully implemented, and never will be. The result of this has been
    that the first 20 bits of the node identity structure have been wasted,
    and the usable node identity range within a cluster has been limited
    to 12 bits. This is starting to become a problem.

    In the following commits, we will need to be able to connect between
    nodes which are using the whole 32-bit value space of the node address.
    We therefore remove the restrictions on which values can be assigned
    to node identity: it is from now on only a 32-bit integer with no
    assumed internal structure.

    Isolation between clusters is now achieved only by setting different
    values for the 'network id' field used during neighbor discovery, in
    practice leading to the latter becoming the new cluster identity.

    The rules for accepting discovery requests/responses from neighboring
    nodes now become:

    - If the user is using legacy address format on both peers, reception
    of discovery messages is subject to the legacy lookup domain check
    in addition to the cluster id check.

    - Otherwise, the discovery request/response is always accepted, provided
    both peers have the same network id.

    This secures backwards compatibility for users who have been using zone
    or cluster identities as cluster separators, instead of the intended
    'network id'.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

18 Mar, 2018

1 commit

  • Publications for TIPC_CLUSTER_SCOPE and TIPC_ZONE_SCOPE are in all
    aspects handled the same way, both on the publishing node and on the
    receiving nodes.

    Despite previous ambitions to the contrary, this is never going to change,
    so we take the consequence of this and obsolete TIPC_ZONE_SCOPE and related
    macros/functions. Whenever a user is doing a bind() or a sendmsg() attempt
    using ZONE_SCOPE we translate this internally to CLUSTER_SCOPE, while we
    remain compatible with users and remote nodes still using ZONE_SCOPE.

    Furthermore, the non-formalized scope value 0 has always been permitted
    for use during lookup, with the same meaning as ZONE_SCOPE/CLUSTER_SCOPE.
    We now permit it even as binding scope, but for compatibility reasons we
    choose to not change the value of TIPC_CLUSTER_SCOPE.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
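
The internal scope translation can be sketched as a small normalizing function. The enum names and numeric values below are placeholders chosen for illustration, and the function name is hypothetical; only the mapping itself (zone scope and 0 both treated as cluster scope) comes from the commit text.

```c
/* Placeholder scope values for this sketch (assumed, not authoritative) */
enum {
    SCOPE_ZONE = 1,
    SCOPE_CLUSTER = 2,
    SCOPE_NODE = 3
};

/* Map a user-supplied scope to the scope used internally.
 * Returns -1 for values that are not recognized. */
static int normalize_scope(int scope)
{
    switch (scope) {
    case 0:           /* non-formalized value, same meaning as cluster */
    case SCOPE_ZONE:  /* obsoleted, translated internally */
    case SCOPE_CLUSTER:
        return SCOPE_CLUSTER;
    case SCOPE_NODE:
        return SCOPE_NODE;
    default:
        return -1;
    }
}
```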
     

27 Jul, 2016

1 commit


16 Jun, 2016

1 commit

  • TIPC based clusters are by default set up with full-mesh link
    connectivity between all nodes. Those links are expected to provide
    a short failure detection time, by default set to 1500 ms. Because
    of this, the background load for neighbor monitoring in an N-node
    cluster increases with a factor N on each node, while the overall
    monitoring traffic through the network infrastructure increases at
    a ~(N * (N - 1)) rate. Experience has shown that such clusters don't
    scale well beyond ~100 nodes unless we significantly increase failure
    discovery tolerance.

    This commit introduces a framework and an algorithm that drastically
    reduces this background load, while basically maintaining the original
    failure detection times across the whole cluster. Using this algorithm,
    background load will now grow at a rate of ~(2 * sqrt(N)) per node, and
    at ~(2 * N * sqrt(N)) in traffic overhead. As an example, each node will
    now have to actively monitor 38 neighbors in a 400-node cluster, instead
    of 399 as before.

    This "Overlapping Ring Supervision Algorithm" is completely distributed
    and employs no centralized or coordinated state. It goes as follows:

    - Each node makes up a linearly ascending, circular list of all its N
    known neighbors, based on their TIPC node identity. This algorithm
    must be the same on all nodes.

    - The node then selects the next M = sqrt(N) - 1 nodes downstream from
    itself in the list, and chooses to actively monitor those. This is
    called its "local monitoring domain".

    - It creates a domain record describing the monitoring domain, and
    piggy-backs this in the data area of all neighbor monitoring messages
    (LINK_PROTOCOL/STATE) leaving that node. This means that all nodes in
    the cluster eventually (default within 400 ms) will learn about
    its monitoring domain.

    - Whenever a node discovers a change in its local domain, e.g., a node
    has been added or has gone down, it creates and sends out a new
    version of its domain record to inform all neighbors about the change.

    - A node receiving a domain record from anybody outside its local domain
    matches this against its own list (which may not look the same), and
    chooses to not actively monitor those members of the received domain
    record that are also present in its own list. Instead, it relies on
    indications from the direct monitoring nodes if an indirectly
    monitored node has gone up or down. If a node is indicated lost, the
    receiving node temporarily activates its own direct monitoring towards
    that node in order to confirm, or not, that it is actually gone.

    - Since each node is actively monitoring sqrt(N) downstream neighbors,
    each node is also actively monitored by the same number of upstream
    neighbors. This means that all non-direct monitoring nodes normally
    will receive sqrt(N) indications that a node is gone.

    - A major drawback with ring monitoring is how it handles failures that
    cause massive network partitionings. If both a lost node and all its
    direct monitoring neighbors are inside the lost partition, the nodes in
    the remaining partition will never receive indications about the loss.
    To overcome this, each node also chooses to actively monitor some
    nodes outside its local domain. Those nodes are called remote domain
    "heads", and are selected in such a way that no node in the cluster
    will be more than two direct monitoring hops away. Because of this,
    each node, apart from monitoring the members of its local domain, will
    also typically monitor sqrt(N) remote head nodes.

    - As an optimization, local list status, domain status and domain
    records are marked with a generation number. This saves senders from
    unnecessarily conveying unaltered domain records, and receivers from
    performing unneeded re-adaptations of their node monitoring list, such
    as re-assigning domain heads.

    - As a measure of caution we have added the possibility to disable the
    new algorithm through configuration. We do this by keeping a threshold
    value for the cluster size; a cluster that grows beyond this value
    will switch from full-mesh to ring monitoring, and vice versa when
    it shrinks below the value. This means that if the threshold is set to
    a value larger than any anticipated cluster size (default size is 32)
    the new algorithm is effectively disabled. A patch set for altering the
    threshold value and for listing the table contents will follow shortly.

    - This change is fully backwards compatible.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
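
The local-domain selection and the threshold switch described in this commit can be sketched in user space. The function names are hypothetical, and the sketch assumes that the sorted ring contains every cluster member including the node itself, so that N is the ring size; domain heads and generation numbers are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Floor of the square root, written to avoid multiplication overflow */
static unsigned int isqrt(unsigned int n)
{
    unsigned int r = 0;

    while (r + 1 <= n / (r + 1))
        r++;
    return r;
}

/* Local monitoring domain size, M = sqrt(N) - 1 */
static unsigned int local_domain_size(unsigned int n)
{
    unsigned int root = isqrt(n);

    return root ? root - 1 : 0;
}

/* Copy the M members following 'self_idx' in the ascending, circular
 * 'ring' of n node identities into 'domain'. Returns M. */
static unsigned int select_local_domain(const uint32_t *ring, unsigned int n,
                                        unsigned int self_idx,
                                        uint32_t *domain)
{
    unsigned int i, m = local_domain_size(n);

    assert(ring != NULL && domain != NULL && self_idx < n);
    for (i = 0; i < m; i++)
        domain[i] = ring[(self_idx + 1 + i) % n];
    return m;
}

/* Ring monitoring applies only once the cluster outgrows the threshold */
static int use_ring_monitoring(unsigned int cluster_size,
                               unsigned int threshold)
{
    return cluster_size > threshold;
}
```

Because every node sorts the same identities the same way, each one arrives at a consistent view of who monitors whom without any coordination, which is the property the algorithm relies on.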
     

15 May, 2015

1 commit

  • When we try to add new inline functions in the code, we sometimes
    run into circular include dependencies.

    The main problem is that the file core.h, which really should be at
    the root of the dependency chain, instead is a leaf. I.e., core.h
    includes a number of header files that themselves should be allowed
    to include core.h. In reality this is unnecessary, because core.h does
    not need to know the full signature of any of the structs it refers to,
    only their type declaration.

    In this commit, we remove all dependencies from core.h towards any
    other tipc header file.

    As a consequence of this change, we can now move the function
    tipc_own_addr(net) from addr.c to addr.h, and make it inline.

    There are no functional changes in this commit.

    Reviewed-by: Erik Hugne
    Reviewed-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

30 Mar, 2015

1 commit

  • A message sent to a node after a successful name table lookup may still
    find that the destination socket has disappeared, because distribution
    of name table updates is non-atomic. If so, the message will be rejected
    back to the sender with error code TIPC_ERR_NO_PORT. If the source
    socket of the message has disappeared in the meantime, the message
    should be dropped.

    However, in the current code, the message will instead be subject to an
    unwanted tertiary lookup, because the function tipc_msg_lookup_dest()
    doesn't check if there is an error code present in the message before
    performing the lookup. In the worst case, the message may now find the
    old destination again, and be redirected once more, instead of being
    dropped directly as it should be.

    A second bug in this function is that the "prev_node" field in the message
    is not updated after successful lookup, something that may have
    unpredictable consequences.

    The problems arising from those bugs occur very infrequently.

    The third change in this function, the test on msg_reroute_msg_cnt(), is
    purely cosmetic, reflecting that the returned value can never be negative.

    This commit corrects the two bugs described above.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

13 Jan, 2015

4 commits

  • If net namespaces are supported in tipc, each namespace will be treated
    as a separate tipc node. Therefore, every namespace must own its
    private tipc node address. This means the "tipc_own_addr" global
    variable holding the node address must be moved to the tipc_net
    structure to satisfy the requirement. As a result, users can also
    assign a node address to each namespace.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • The TIPC broadcast link is statically established and its relevant
    states are maintained in the global variables "bcbearer", "bclink" and
    "bcl". To allow different namespaces to own different broadcast link
    instances, these variables must be moved to the tipc_net structure, and
    a broadcast link instance is allocated and initialized when a
    namespace is created.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Global variables associated with node table are below:
    - node hash table (node_htable)
    - node list (tipc_node_list)
    - node table lock (node_list_lock)
    - node number counter (tipc_num_nodes)
    - node link number counter (tipc_num_links)

    To make the node table support namespaces, the above global variables
    must be moved to the tipc_net structure so that they are private to
    each namespace. As a consequence, these variables are allocated and
    initialized when a namespace is created, and deallocated when the
    namespace is destroyed. After the change, functions associated with
    these variables have to use a namespace pointer to access them, so
    adding a namespace pointer as a parameter of these functions is the
    major change made in this commit.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
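
The pattern behind these namespace commits, replacing file-scope globals with per-namespace state that is passed to every function, can be sketched in user space. The struct and function names are hypothetical and calloc()/free() stand in for whatever allocation the namespace hooks actually use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Per-namespace state that would previously have been global variables */
struct ns_state {
    uint32_t own_addr;
    unsigned int num_nodes;
    unsigned int num_links;
};

/* Allocated and initialized when a namespace is created */
static struct ns_state *ns_state_create(uint32_t own_addr)
{
    struct ns_state *ns = calloc(1, sizeof(*ns));

    if (ns)
        ns->own_addr = own_addr;
    return ns;
}

/* Deallocated when the namespace is destroyed */
static void ns_state_destroy(struct ns_state *ns)
{
    free(ns);
}

/* Functions now take the namespace state instead of touching globals */
static unsigned int ns_node_add(struct ns_state *ns)
{
    assert(ns != NULL);
    return ++ns->num_nodes;
}
```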
     
  • Only the work of initializing and shutting down the tipc module should
    be done in the core.h and core.c files, so everything that is not closely
    associated with those two tasks should be moved to appropriate places.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

14 Feb, 2014

1 commit

  • The inline functions in addr.h use tipc_own_addr which is exported by
    core.h, but addr.h never actually includes it. It works because it is
    explicitly included where this is used, but it looks a bit strange.

    Include core.h in addr.h explicitly to make the dependency clearer.

    Signed-off-by: Andreas Bofjäll
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Andreas Bofjäll
     

01 May, 2012

1 commit

  • Some of the comment blocks are floating in limbo between two
    functions, or between blocks of code. Delete the extra line
    feeds between any comment and its associated following block
    of code, to be consistent with the majority of the rest of
    the kernel. Also delete trailing newlines at EOF and fix
    a couple trivial typos in existing comments.

    This is a 100% cosmetic change with no runtime impact. We get
    rid of over 500 lines of non-code, and being blank line deletes,
    they won't even show up as noise in git blame.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

20 Apr, 2012

1 commit

  • Introduces routines that test whether a given network address is
    equal to a node's own network address or if it lies within the node's
    own network cluster, and which work properly regardless of whether
    the node is using the default network address <0.0.0> or a non-zero
    network address that is assigned later on. In essence, these routines
    ensure that address <0.0.0> is treated as an alias for "this node",
    regardless of which network address the node is actually using.

    Old users of the pre-existing, stricter match in_own_cluster()
    have accordingly been redirected to what is now called
    in_own_cluster_exact(), which does not extend matching to <0.0.0>.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
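
The difference between the alias-aware and the exact checks can be sketched as below. This is a user-space illustration that passes the node's own address as a parameter and assumes that the default address is zero and that the node id occupies the low 12 bits; the routine names follow the commit text, but the bodies are an assumption, not a copy of the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when 'addr' is this node, or the zero alias for "this node" */
static bool in_own_node(uint32_t own_addr, uint32_t addr)
{
    return addr == own_addr || addr == 0;
}

/* Strict match: zone and cluster bits must be identical */
static bool in_own_cluster_exact(uint32_t own_addr, uint32_t addr)
{
    return ((addr ^ own_addr) >> 12) == 0;
}

/* Relaxed match: also accepts the zero alias */
static bool in_own_cluster(uint32_t own_addr, uint32_t addr)
{
    return in_own_cluster_exact(own_addr, addr) || addr == 0;
}
```

The relaxed variants keep working when a node starts with the default address and is assigned a real one later, since references to the zero address continue to resolve to the local node.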
     

11 May, 2011

1 commit


14 Mar, 2011

1 commit

  • Introduces a pair of helper routines that convert the network address
    for a TIPC node into the network address for its cluster or zone.

    This is a cosmetic change designed to avoid future errors caused by
    the incorrect use of address bitmasks, and does not alter the existing
    operation of TIPC.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
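
Helper routines of this kind can be sketched as follows, assuming the 8-bit zone, 12-bit cluster, 12-bit node layout described elsewhere in this history. The function names are hypothetical; the point is that callers no longer spell out bitmasks themselves.

```c
#include <stdint.h>

/* Compose a <Z.C.N> address from its parts (assumed 8/12/12-bit layout) */
static uint32_t make_addr(uint32_t zone, uint32_t cluster, uint32_t node)
{
    return ((zone & 0xffu) << 24) | ((cluster & 0xfffu) << 12) |
           (node & 0xfffu);
}

/* Reduce a node address to the address of its zone, <Z.0.0> */
static uint32_t zone_addr(uint32_t addr)
{
    return addr & 0xff000000u;
}

/* Reduce a node address to the address of its cluster, <Z.C.0> */
static uint32_t cluster_addr(uint32_t addr)
{
    return addr & 0xfffff000u;
}
```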
     

02 Jan, 2011

2 commits

  • Eliminates routines and data structures that were intended to allow
    TIPC to route messages to other clusters. Currently, TIPC supports only
    networks consisting of a single cluster within a single zone, so this
    code is unnecessary.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Allan Stephens
     
  • Simplifies routines and data structures that were intended to allow
    TIPC to support slave nodes (i.e. nodes that did not have links to
    all of the other nodes in their cluster, forcing TIPC to route messages
    that it could not deliver directly through a non-slave node).

    Currently, TIPC supports only networks containing non-slave nodes,
    so this code is unnecessary.

    Note: The latest edition of the TIPC 2.0 Specification has eliminated
    the concept of slave nodes entirely.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Allan Stephens
     

13 May, 2010

2 commits


08 Feb, 2008

1 commit

  • All these static inlines are unused:

    in_own_zone 1 (net/tipc/addr.h)
    msg_dataoctet 1 (net/tipc/msg.h)
    msg_direct 1 (include/net/tipc/tipc_msg.h)
    msg_options 1 (include/net/tipc/tipc_msg.h)
    tipc_nmap_get 1 (net/tipc/bcast.h)

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

11 Feb, 2007

1 commit


18 Jan, 2006

1 commit


13 Jan, 2006

4 commits