11 Dec, 2019

1 commit

  • We introduce a simple variable window congestion control for links.
    The algorithm is inspired by the Reno algorithm, covering both 'slow
    start', 'congestion avoidance', and 'fast recovery' modes.

    - We introduce hard lower and upper window limits per link, still
    different and configurable per bearer type.

    - We introduce a 'slow start theshold' variable, initially set to
    the maximum window size.

    - We let a link start at the minimum congestion window, i.e. in slow
    start mode, and then let is grow rapidly (+1 per rceived ACK) until
    it reaches the slow start threshold and enters congestion avoidance
    mode.

    - In congestion avoidance mode we increment the congestion window for
    each window-size number of acked packets, up to a possible maximum
    equal to the configured maximum window.

    - For each non-duplicate NACK received, we drop back to fast recovery
    mode, by setting the both the slow start threshold to and the
    congestion window to (current_congestion_window / 2).

    - If the timeout handler finds that the transmit queue has not moved
    since the previous timeout, it drops the link back to slow start
    and forces a probe containing the last sent sequence number to the
    sent to the peer, so that this can discover the stale situation.

    This change does in reality have effect only on unicast ethernet
    transport, as we have seen that there is no room whatsoever for
    increasing the window max size for the UDP bearer.
    For now, we also choose to keep the limits for the broadcast link
    unchanged and equal.

    This algorithm seems to give a 50-100% throughput improvement for
    messages larger than MTU.

    Suggested-by: Xin Long
    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

09 Nov, 2019

2 commits

  • This commit offers an option to encrypt and authenticate all messaging,
    including the neighbor discovery messages. The currently most advanced
    algorithm supported is the AEAD AES-GCM (like IPSec or TLS). All
    encryption/decryption is done at the bearer layer, just before leaving
    or after entering TIPC.

    Supported features:
    - Encryption & authentication of all TIPC messages (header + data);
    - Two symmetric-key modes: Cluster and Per-node;
    - Automatic key switching;
    - Key-expired revoking (sequence number wrapped);
    - Lock-free encryption/decryption (RCU);
    - Asynchronous crypto, Intel AES-NI supported;
    - Multiple cipher transforms;
    - Logs & statistics;

    Two key modes:
    - Cluster key mode: One single key is used for both TX & RX in all
    nodes in the cluster.
    - Per-node key mode: Each nodes in the cluster has one specific TX key.
    For RX, a node requires its peers' TX key to be able to decrypt the
    messages from those peers.

    Key setting from user-space is performed via netlink by a user program
    (e.g. the iproute2 'tipc' tool).

    Internal key state machine:

    Attach Align(RX)
    +-+ +-+
    | V | V
    +---------+ Attach +---------+
    | IDLE |---------------->| PENDING |(user = 0)
    +---------+ +---------+
    A A Switch| A
    | | | |
    | | Free(switch/revoked) | |
    (Free)| +----------------------+ | |Timeout
    | (TX) | | |(RX)
    | | | |
    | | v |
    +---------+ Switch +---------+
    | PASSIVE |= 1)

    The number of TFMs is 10 by default and can be changed via the procfs
    'net/tipc/max_tfms'. At this moment, as for simplicity, this file is
    also used to print the crypto statistics at runtime:

    echo 0xfff1 > /proc/sys/net/tipc/max_tfms

    The patch defines a new TIPC version (v7) for the encryption message (-
    backward compatibility as well). The message is basically encapsulated
    as follows:

    +----------------------------------------------------------+
    | TIPCv7 encryption | Original TIPCv2 | Authentication |
    | header | packet (encrypted) | Tag |
    +----------------------------------------------------------+

    The throughput is about ~40% for small messages (compared with non-
    encryption) and ~9% for large messages. With the support from hardware
    crypto i.e. the Intel AES-NI CPU instructions, the throughput increases
    upto ~85% for small messages and ~55% for large messages.

    By default, the new feature is inactive (i.e. no encryption) until user
    sets a key for TIPC. There is however also a new option - "TIPC_CRYPTO"
    in the kernel configuration to enable/disable the new code when needed.

    MAINTAINERS | add two new files 'crypto.h' & 'crypto.c' in tipc

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     
  • As a need to support the crypto asynchronous operations in the later
    commits, apart from the current RCU mechanism for bearer pointer, we
    add a 'refcnt' to the bearer object as well.

    So, a bearer can be hold via 'tipc_bearer_hold()' without being freed
    even though the bearer or interface can be disabled in the meanwhile.
    If that happens, the bearer will be released then when the crypto
    operation is completed and 'tipc_bearer_put()' is called.

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     

09 Aug, 2019

1 commit

  • Since node internal messages are passed directly to the socket, it is not
    possible to observe those messages via tcpdump or wireshark.

    We now remedy this by making it possible to clone such messages and send
    the clones to the loopback interface. The clones are dropped at reception
    and have no functional role except making the traffic visible.

    The feature is enabled if network taps are active for the loopback device.
    pcap filtering restrictions require the messages to be presented to the
    receiving side of the loopback device.

    v3 - Function dev_nit_active used to check for network taps.
    - Procedure netif_rx_ni used to send cloned messages to loopback device.

    Signed-off-by: John Rutherford
    Acked-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    John Rutherford
     

20 Dec, 2018

1 commit

  • As for the sake of debugging/tracing, the commit enables tracepoints in
    TIPC along with some general trace_events as shown below. It also
    defines some 'tipc_*_dump()' functions that allow to dump TIPC object
    data whenever needed, that is, for general debug purposes, ie. not just
    for the trace_events.

    The following trace_events are now available:

    - trace_tipc_skb_dump(): allows to trace and dump TIPC msg & skb data,
    e.g. message type, user, droppable, skb truesize, cloned skb, etc.

    - trace_tipc_list_dump(): allows to trace and dump any TIPC buffers or
    queues, e.g. TIPC link transmq, socket receive queue, etc.

    - trace_tipc_sk_dump(): allows to trace and dump TIPC socket data, e.g.
    sk state, sk type, connection type, rmem_alloc, socket queues, etc.

    - trace_tipc_link_dump(): allows to trace and dump TIPC link data, e.g.
    link state, silent_intv_cnt, gap, bc_gap, link queues, etc.

    - trace_tipc_node_dump(): allows to trace and dump TIPC node data, e.g.
    node state, active links, capabilities, link entries, etc.

    How to use:
    Put the trace functions at any places where we want to dump TIPC data
    or events.

    Note:
    a) The dump functions will generate raw data only, that is, to offload
    the trace event's processing, it can require a tool or script to parse
    the data but this should be simple.

    b) The trace_tipc_*_dump() should be reserved for a failure cases only
    (e.g. the retransmission failure case) or where we do not expect to
    happen too often, then we can consider enabling these events by default
    since they will almost not take any effects under normal conditions,
    but once the rare condition or failure occurs, we get the dumped data
    fully for post-analysis.

    For other trace purposes, we can reuse these trace classes as template
    but different events.

    c) A trace_event is only effective when we enable it. To enable the
    TIPC trace_events, echo 1 to 'enable' files in the events/tipc/
    directory in the 'debugfs' file system. Normally, they are located at:

    /sys/kernel/debug/tracing/events/tipc/

    For example:

    To enable the tipc_link_dump event:

    echo 1 > /sys/kernel/debug/tracing/events/tipc/tipc_link_dump/enable

    To enable all the TIPC trace_events:

    echo 1 > /sys/kernel/debug/tracing/events/tipc/enable

    To collect the trace data:

    cat trace

    or

    cat trace_pipe > /trace.out &

    To disable all the TIPC trace_events:

    echo 0 > /sys/kernel/debug/tracing/events/tipc/enable

    To clear the trace buffer:

    echo > trace

    d) Like the other trace_events, the feature like 'filter' or 'trigger'
    is also usable for the tipc trace_events.
    For more details, have a look at:

    Documentation/trace/ftrace.txt

    MAINTAINERS | add two new files 'trace.h' & 'trace.c' in tipc

    Acked-by: Ying Xue
    Tested-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     

20 Apr, 2018

1 commit

  • In previous commit, we changed the default emulated MTU for UDP bearers
    to 14k.

    This commit adds the functionality to set/change the default value
    by configuring new MTU for UDP media. UDP bearer(s) have to be disabled
    and enabled back for the new MTU to take effect.

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: GhantaKrishnamurthy MohanKrishna
    Signed-off-by: David S. Miller

    GhantaKrishnamurthy MohanKrishna
     

24 Mar, 2018

1 commit

  • To facilitate the coming changes in the neighbor discovery functionality
    we make some renaming and refactoring of that code. The functional changes
    in this commit are trivial, e.g., that we move the message sending call in
    tipc_disc_timeout() outside the spinlock protected region.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

15 Feb, 2018

4 commits


02 Sep, 2017

1 commit


30 Aug, 2017

1 commit

  • For a bond slave device as a tipc bearer, the dev represents the bond
    interface and orig_dev represents the slave in tipc_l2_rcv_msg().
    Since we decode the tipc_ptr from bonding device (dev), we fail to
    find the bearer and thus tipc links are not established.

    In this commit, we register the tipc protocol callback per device and
    look for tipc bearer from both the devices.

    Signed-off-by: Parthasarathy Bhuvaragan
    Signed-off-by: David S. Miller

    Parthasarathy Bhuvaragan
     

22 Aug, 2017

1 commit

  • When the broadcast send link after 100 attempts has failed to
    transfer a packet to all peers, we consider it stale, and reset
    it. Thereafter it needs to re-synchronize with the peers, something
    currently done by just resetting and re-establishing all links to
    all peers. This has turned out to be overkill, with potentially
    unwanted consequences for the remaining cluster.

    A closer analysis reveals that this can be done much simpler. When
    this kind of failure happens, for reasons that may lie outside the
    TIPC protocol, it is typically only one peer which is failing to
    receive and acknowledge packets. It is hence sufficient to identify
    and reset the links only to that peer to resolve the situation, without
    having to reset the broadcast link at all. This solution entails a much
    lower risk of negative consequences for the own node as well as for
    the overall cluster.

    We implement this change in this commit.

    Reviewed-by: Parthasarathy Bhuvaragan
    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

21 Jan, 2017

1 commit

  • As a preparation for the 'replicast' functionality we are going to
    introduce in the next commits, we need the broadcast base structure to
    store whether bearer broadcast is available at all from the currently
    used bearer or bearers.

    We do this by adding a new function tipc_bearer_bcast_support() to
    the bearer layer, and letting the bearer selection function in
    bcast.c use this to give a new boolean field, 'bcast_support' the
    appropriate value.

    Reviewed-by: Parthasarathy Bhuvaragan
    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

03 Dec, 2016

1 commit

  • Qian Zhang (张谦) reported a potential socket buffer overflow in
    tipc_msg_build() which is also known as CVE-2016-8632: due to
    insufficient checks, a buffer overflow can occur if MTU is too short for
    even tipc headers. As anyone can set device MTU in a user/net namespace,
    this issue can be abused by a regular user.

    As agreed in the discussion on Ben Hutchings' original patch, we should
    check the MTU at the moment a bearer is attached rather than for each
    processed packet. We also need to repeat the check when bearer MTU is
    adjusted to new device MTU. UDP case also needs a check to avoid
    overflow when calculating bearer MTU.

    Fixes: b97bf3fd8f6a ("[TIPC] Initial merge")
    Signed-off-by: Michal Kubecek
    Reported-by: Qian Zhang (张谦)
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Michal Kubeček
     

27 Aug, 2016

1 commit

  • This patch introduces UDP replicast. A concept where we emulate
    multicast by sending multiple unicast messages to configured peers.

    The purpose of replicast is mainly to be able to use TIPC in cloud
    environments where IP multicast is disabled. Using replicas to unicast
    multicast messages is costly as we have to copy each skb and send the
    copies individually.

    Signed-off-by: Richard Alpe
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Richard Alpe
     

19 Aug, 2016

1 commit

  • In commit 5b7066c3dd24 ("tipc: stricter filtering of packets in bearer
    layer") we introduced a method of filtering out messages while a bearer
    is being reset, to avoid that links may be re-created and come back in
    working state while we are still in the process of shutting them down.

    This solution works well, but is limited to only work with L2 media, which
    is insufficient with the increasing use of UDP as carrier media.

    We now replace this solution with a more generic one, by introducing a
    new flag "up" in the generic struct tipc_bearer. This field will be set
    and reset at the same locations as with the previous solution, while
    the packet filtering is moved to the generic code for the sending side.
    On the receiving side, the filtering is still done in media specific
    code, but now including the UDP bearer.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

27 Jul, 2016

1 commit


24 Jul, 2016

1 commit


12 Jul, 2016

1 commit

  • In test situations with many nodes and a heavily stressed system we have
    observed that the transmission broadcast link may fail due to an
    excessive number of retransmissions of the same packet. In such
    situations we need to reset all unicast links to all peers, in order to
    reset and re-synchronize the broadcast link.

    In this commit, we add a new function tipc_bearer_reset_all() to be used
    in such situations. The function scans across all bearers and resets all
    their pertaining links.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

16 Jun, 2016

1 commit

  • TIPC based clusters are by default set up with full-mesh link
    connectivity between all nodes. Those links are expected to provide
    a short failure detection time, by default set to 1500 ms. Because
    of this, the background load for neighbor monitoring in an N-node
    cluster increases with a factor N on each node, while the overall
    monitoring traffic through the network infrastructure increases at
    a ~(N * (N - 1)) rate. Experience has shown that such clusters don't
    scale well beyond ~100 nodes unless we significantly increase failure
    discovery tolerance.

    This commit introduces a framework and an algorithm that drastically
    reduces this background load, while basically maintaining the original
    failure detection times across the whole cluster. Using this algorithm,
    background load will now grow at a rate of ~(2 * sqrt(N)) per node, and
    at ~(2 * N * sqrt(N)) in traffic overhead. As an example, each node will
    now have to actively monitor 38 neighbors in a 400-node cluster, instead
    of as before 399.

    This "Overlapping Ring Supervision Algorithm" is completely distributed
    and employs no centralized or coordinated state. It goes as follows:

    - Each node makes up a linearly ascending, circular list of all its N
    known neighbors, based on their TIPC node identity. This algorithm
    must be the same on all nodes.

    - The node then selects the next M = sqrt(N) - 1 nodes downstream from
    itself in the list, and chooses to actively monitor those. This is
    called its "local monitoring domain".

    - It creates a domain record describing the monitoring domain, and
    piggy-backs this in the data area of all neighbor monitoring messages
    (LINK_PROTOCOL/STATE) leaving that node. This means that all nodes in
    the cluster eventually (default within 400 ms) will learn about
    its monitoring domain.

    - Whenever a node discovers a change in its local domain, e.g., a node
    has been added or has gone down, it creates and sends out a new
    version of its node record to inform all neighbors about the change.

    - A node receiving a domain record from anybody outside its local domain
    matches this against its own list (which may not look the same), and
    chooses to not actively monitor those members of the received domain
    record that are also present in its own list. Instead, it relies on
    indications from the direct monitoring nodes if an indirectly
    monitored node has gone up or down. If a node is indicated lost, the
    receiving node temporarily activates its own direct monitoring towards
    that node in order to confirm, or not, that it is actually gone.

    - Since each node is actively monitoring sqrt(N) downstream neighbors,
    each node is also actively monitored by the same number of upstream
    neighbors. This means that all non-direct monitoring nodes normally
    will receive sqrt(N) indications that a node is gone.

    - A major drawback with ring monitoring is how it handles failures that
    cause massive network partitionings. If both a lost node and all its
    direct monitoring neighbors are inside the lost partition, the nodes in
    the remaining partition will never receive indications about the loss.
    To overcome this, each node also chooses to actively monitor some
    nodes outside its local domain. Those nodes are called remote domain
    "heads", and are selected in such a way that no node in the cluster
    will be more than two direct monitoring hops away. Because of this,
    each node, apart from monitoring the member of its local domain, will
    also typically monitor sqrt(N) remote head nodes.

    - As an optimization, local list status, domain status and domain
    records are marked with a generation number. This saves senders from
    unnecessarily conveying unaltered domain records, and receivers from
    performing unneeded re-adaptations of their node monitoring list, such
    as re-assigning domain heads.

    - As a measure of caution we have added the possibility to disable the
    new algorithm through configuration. We do this by keeping a threshold
    value for the cluster size; a cluster that grows beyond this value
    will switch from full-mesh to ring monitoring, and vice versa when
    it shrinks below the value. This means that if the threshold is set to
    a value larger than any anticipated cluster size (default size is 32)
    the new algorithm is effectively disabled. A patch set for altering the
    threshold value and for listing the table contents will follow shortly.

    - This change is fully backwards compatible.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

14 Apr, 2016

1 commit

  • We remove a couple of leftover fields in struct tipc_bearer. Those
    were used by the old broadcast implementation, and are not needed
    any longer. There is no functional changes in this commit.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

21 Nov, 2015

1 commit

  • The number of variables with Hungarian notation (l_ptr, n_ptr etc.)
    has been significantly reduced over the last couple of years.

    We now root out the last traces of this practice.
    There are no functional changes in this commit.

    Reviewed-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

24 Oct, 2015

4 commits

  • After the previous changes in this series, we can now remove some
    unused code and structures, both in the broadcast, link aggregation
    and link code.

    There are no functional changes in this commit.

    Signed-off-by: Jon Maloy
    Reviewed-by: Ying Xue
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • The neighbor discovery function currently uses the function
    tipc_bearer_send() for transmitting packets, assuming that the
    sent buffers are not consumed by the called function.

    We want to change this, in order to avoid unnecessary buffer cloning
    elswhere in the code.

    This commit introduces a new function tipc_bearer_skb() which consumes
    the sent buffers, and let the discoverer functions use this new call
    instead. The discoverer does now itself perform the cloning when
    that is necessary.

    Signed-off-by: Jon Maloy
    Reviewed-by: Ying Xue
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • Until now, we have only been supporting a fix MTU size of 1500 bytes
    for all broadcast media, irrespective of their actual capability.

    We now make the broadcast MTU adaptable to the carrying media, i.e.,
    we use the smallest MTU supported by any of the interfaces attached
    to TIPC.

    Signed-off-by: Jon Maloy
    Reviewed-by: Ying Xue
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • Until now, we have been keeping track of the exact set of broadcast
    destinations though the help structure tipc_node_map. This leads us to
    have to maintain a whole infrastructure for supporting this, including
    a pseudo-bearer and a number of functions to manipulate both the bearers
    and the node map correctly. Apart from the complexity, this approach is
    also limiting, as struct tipc_node_map only can support cluster local
    broadcast if we want to avoid it becoming excessively large. We want to
    eliminate this limitation, in order to enable introduction of scoped
    multicast in the future.

    A closer analysis reveals that it is unnecessary maintaining this "full
    set" overview; it is sufficient to keep a counter per bearer, indicating
    how many nodes can be reached via this bearer at the moment. The protocol
    is now robust enough to handle transitional discrepancies between the
    nominal number of reachable destinations, as expected by the broadcast
    protocol itself, and the number which is actually reachable at the
    moment. The initial broadcast synchronization, in conjunction with the
    retransmission mechanism, ensures that all packets will eventually be
    acknowledged by the correct set of destinations.

    This commit introduces these changes.

    Signed-off-by: Jon Maloy
    Reviewed-by: Ying Xue
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

21 Jul, 2015

1 commit

  • Currently, message sending is performed through a deep call chain,
    where the node spinlock is grabbed and held during a significant
    part of the transmission time. This is clearly detrimental to
    overall throughput performance; it would be better if we could send
    the message after the spinlock has been released.

    In this commit, we do instead let the call revert on the stack after
    the buffer chain has been added to the transmission queue, whereafter
    clones of the buffers are transmitted to the device layer outside the
    spinlock scope.

    As a further step in our effort to separate the roles of the node
    and link entities we also move the function tipc_link_xmit() to
    node.c, and rename it to tipc_node_xmit().

    Reviewed-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

15 May, 2015

1 commit

  • When we try to add new inline functions in the code, we sometimes
    run into circular include dependencies.

    The main problem is that the file core.h, which really should be at
    the root of the dependency chain, instead is a leaf. I.e., core.h
    includes a number of header files that themselves should be allowed
    to include core.h. In reality this is unnecessary, because core.h does
    not need to know the full signature of any of the structs it refers to,
    only their type declaration.

    In this commit, we remove all dependencies from core.h towards any
    other tipc header file.

    As a consequence of this change, we can now move the function
    tipc_own_addr(net) from addr.c to addr.h, and make it inline.

    There are no functional changes in this commit.

    Reviewed-by: Erik Hugne
    Reviewed-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

06 Mar, 2015

1 commit

  • The ip/udp bearer can be configured in a point-to-point
    mode by specifying both local and remote ip/hostname,
    or it can be enabled in multicast mode, where links are
    established to all tipc nodes that have joined the same
    multicast group. The multicast IP address is generated
    based on the TIPC network ID, but can be overridden by
    using another multicast address as remote ip.

    Signed-off-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Reviewed-by: Ying Xue
    Signed-off-by: David S. Miller

    Erik Hugne
     

28 Feb, 2015

2 commits

  • With the exception of infiniband media which does not use media
    offsets, the media address is always located at offset 4 in the
    media info field as defined by the protocol, so we move the
    definition to the generic bearer.h

    Signed-off-by: Erik Hugne
    Signed-off-by: David S. Miller

    Erik Hugne
     
  • The TIPC_MEDIA_ADDR_SIZE and TIPC_MEDIA_ADDR_OFFSET names
    are misleading, as they actually define the size and offset of
    the whole media info field and not the address part. This patch
    does not have any functional changes.

    Signed-off-by: Erik Hugne
    Signed-off-by: David S. Miller

    Erik Hugne
     

10 Feb, 2015

3 commits

  • Convert TIPC_CMD_GET_MEDIA_NAMES to compat dumpit.

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Introduce a framework for transcoding legacy nl action into actions
    (.doit) calls from the new nl API. This is done by converting the
    incoming TLV data into netlink data with nested netlink attributes.
    Unfortunately due to the randomness of the legacy API we can't do this
    generically so each legacy netlink command requires a specific
    transcoding recipe. In this case for bearer enable and bearer disable.

    Convert TIPC_CMD_ENABLE_BEARER and TIPC_CMD_DISABLE_BEARER into doit
    compat calls.

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Introduce a framework for dumping netlink data from the new netlink
    API and formatting it to the old legacy API format. This is done by
    looping the dump data and calling a format handler for each entity, in
    this case a bearer.

    We dump until either all data is dumped or we reach the limited buffer
    size of the legacy API. Remember, the legacy API doesn't scale.

    In this commit we convert TIPC_CMD_GET_BEARER_NAMES to use the compat
    layer.

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Richard Alpe
     

13 Jan, 2015

4 commits

  • TIPC broadcast link is statically established and its relevant states
    are maintained with the global variables: "bcbearer", "bclink" and
    "bcl". Allowing different namespace to own different broadcast link
    instances, these variables must be moved to tipc_net structure and
    broadcast link instances would be allocated and initialized when
    namespace is created.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Bearer list defined as a global variable is used to store bearer
    instances. When tipc supports net namespace, bearers created in
    one namespace must be isolated with others allocated in other
    namespaces, which requires us that the bearer list(bearer_list)
    must be moved to tipc_net structure. As a result, a net namespace
    pointer has to be passed to functions which access the bearer list.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Global variables associated with node table are below:
    - node table list (node_htable)
    - node hash table list (tipc_node_list)
    - node table lock (node_list_lock)
    - node number counter (tipc_num_nodes)
    - node link number counter (tipc_num_links)

    To make node table support namespace, above global variables must be
    moved to tipc_net structure in order to keep secret for different
    namespaces. As a consequence, these variables are allocated and
    initialized when namespace is created, and deallocated when namespace
    is destroyed. After the change, functions associated with these
    variables have to utilize a namespace pointer to access them. So
    adding namespace pointer as a parameter of these functions is the
    major change made in the commit.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Involve namespace infrastructure, make the "tipc_net_id" global
    variable aware of per namespace, and rename it to "net_id". In
    order that the conversion can be successfully done, an instance
    of networking namespace must be passed to relevant functions,
    allowing them to access the "net_id" variable of per namespace.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue