19 Sep, 2020

2 commits

  • Rekeying is required for security since a key becomes less secure when
    used for a long time. Also, a key will be detached when its nonce value
    (or seqno ...) is exhausted. We now make the rekeying process automatic
    and configurable by the user.

    Basically, TIPC will at a specific interval generate a new key using
    the kernel 'Random Number Generator' cipher, then attach it as the node
    TX key and securely distribute it to the others in the cluster as RX
    keys (the key exchange). The automatic key switching will then take
    over and make the new key active shortly. Afterwards, the traffic from
    this node
    will be encrypted with the new session key. The same can happen in peer
    nodes but not necessarily at the same time.

    For simplicity, the automatically generated key will be initiated as a
    per node key. It is not too hard to also support a cluster key rekeying
    (e.g. a given node will generate a unique cluster key and update to the
    others in the cluster...), but that doesn't bring much benefit, while a
    per-node key is even more secure.

    We also enable the user to force a rekeying or change the rekeying
    interval via netlink. The new 'set key' command option
    'TIPC_NLA_NODE_REKEYING' is added for these purposes as follows:
    - A value >= 1 will be set as the rekeying interval (in minutes);
    - A value of 0 will disable the rekeying;
    - A value of 'TIPC_REKEYING_NOW' (~0) will force an immediate rekeying;

    The default rekeying interval is (60 * 24) minutes, i.e. once a day.
    There isn't any restriction on the value, but the user shouldn't set it
    too small or too large, which results in an "ineffective" rekeying
    (that's ok for testing though).
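
    The three value ranges listed above can be sketched as a small
    classifier. This is only an illustration: the enum, the function name
    and the assumption that the value is carried as an unsigned 32-bit
    integer are not taken from the commit; only 'TIPC_REKEYING_NOW' (~0)
    and the minute-based interval are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* From the commit text: ~0 forces an immediate rekeying. */
#define TIPC_REKEYING_NOW (~0U)

/* Hypothetical action names, for illustration only. */
enum rekey_action {
	REKEY_DISABLE,      /* value 0 */
	REKEY_NOW,          /* value TIPC_REKEYING_NOW */
	REKEY_SET_INTERVAL, /* any other value: interval in minutes */
};

/*
 * Classify a rekeying value (assumed to be a u32). For
 * REKEY_SET_INTERVAL the interval is also returned in milliseconds
 * through *interval_ms; otherwise *interval_ms is set to 0.
 */
static enum rekey_action classify_rekeying(uint32_t val, uint64_t *interval_ms)
{
	assert(interval_ms != NULL);
	*interval_ms = 0;

	if (val == 0)
		return REKEY_DISABLE;
	if (val == TIPC_REKEYING_NOW)
		return REKEY_NOW;

	/* minutes -> milliseconds, widened first so it cannot overflow */
	*interval_ms = (uint64_t)val * 60u * 1000u;
	return REKEY_SET_INTERVAL;
}
```

    With the default of (60 * 24) minutes, the sketch yields an interval of
    86400000 ms, i.e. one day.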

    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     
  • In addition to the supported cluster & per-node encryption keys for the
    en/decryption of TIPC messages, we now introduce one option for user to
    set a cluster key as 'master key', which is simply a symmetric key like
    the former but has a longer life cycle. It has two purposes:

    - Authentication of new member nodes in the cluster. New nodes, having
    no knowledge of current session keys in the cluster will still be
    able to join the cluster as long as they know the master key. This is
    because all neighbor discovery (LINK_CONFIG) messages must be
    encrypted with this key.

    - Encryption of session encryption keys during automatic exchange and
    update of those. This is a feature we will introduce in a later commit
    in this series.

    We insert the new key into the currently unused slot 0 in the key array
    and start using it immediately once the user has set it.
    After joining, a node that only knows the master key should be able to
    communicate fully with existing nodes in the cluster, although those
    nodes may have their own session keys activated (i.e. not the master
    one). To support this, we define a 'grace period', starting from the
    time a node itself reports having no RX keys, so the existing nodes
    will use the master key for encryption instead. The grace period can be
    extended but will automatically stop after e.g. 5 seconds without a new
    report. This is also the basis for the later key exchange feature, as
    the new node would be unable to decrypt anything without the support of
    the master key.
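
    The grace period behaviour can be sketched as follows. This is a
    simplified model under stated assumptions, not the kernel code: the
    struct, the function names and the millisecond clock are hypothetical,
    and the 5 second value is the "e.g." figure from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GRACE_PERIOD_MS 5000u /* "e.g. 5 seconds" in the text */

/* Hypothetical per-peer state; not the kernel's actual structure. */
struct peer_grace {
	bool     active;         /* grace period currently running */
	uint64_t last_report_ms; /* time of the latest "no RX keys" report */
};

/* A peer reported having no RX keys: start or extend the grace period. */
static void peer_reports_no_rx_keys(struct peer_grace *p, uint64_t now_ms)
{
	p->active = true;
	p->last_report_ms = now_ms;
}

/* True if traffic to this peer should be encrypted with the master key. */
static bool use_master_key(struct peer_grace *p, uint64_t now_ms)
{
	/* The period stops once no new report arrived for GRACE_PERIOD_MS. */
	if (p->active && now_ms - p->last_report_ms >= GRACE_PERIOD_MS)
		p->active = false;
	return p->active;
}
```

    Each new report moves the deadline forward, which models the "can be
    extended" part; once reports stop, the sender falls back to its own
    session key.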

    For the user to set a master key, we define a new netlink flag,
    'TIPC_NLA_NODE_KEY_MASTER', which can be added to the current 'set key'
    netlink command to mark the key being set as a master key.

    Above all, the traditional cluster/per-node key mechanism is guaranteed
    to work when the user chooses not to use this master key option. This
    is also compatible with legacy nodes that do not support the feature.

    Even this master key can be updated without any interruption of cluster
    connectivity, but if this is needed, it has to be coordinated and set
    by the user.

    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     

27 May, 2020

1 commit

  • This commit enables dumping the statistics of a broadcast-receiver link
    like the traditional 'broadcast-link' one (which is for broadcast-
    sender). The link dumping can be triggered via netlink (e.g. the
    iproute2/tipc tool) by the link flag - 'TIPC_NLA_LINK_BROADCAST' as the
    indicator.

    The name of a broadcast-receiver link of a specific peer will be in the
    format: 'broadcast-link:'.

    For example:

    Link
    Window:50 packets
    RX packets:7841 fragments:2408/440 bundles:0/0
    TX packets:0 fragments:0/0 bundles:0/0
    RX naks:0 defs:124 dups:0
    TX naks:21 acks:0 retrans:0
    Congestion link:0 Send queue max:0 avg:0

    In addition, the broadcast-receiver link statistics can be reset in the
    usual way via netlink by specifying that link name in the command.

    Note: the 'tipc_link_name_ext()' is removed because the link name can
    now be retrieved simply via the 'l->name'.

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     

04 Mar, 2020

1 commit


21 Dec, 2019

1 commit

  • To enable iproute2/tipc to generate backwards compatible
    printouts and validate command parameters for nodes using a
    node address, it needs to be able to read the legacy
    address flag from the kernel. The legacy address flag records
    the way in which the node identity was originally specified.

    The legacy address flag is requested by the netlink message
    TIPC_NL_ADDR_LEGACY_GET. If the flag is set the attribute
    TIPC_NLA_NET_ADDR_LEGACY is set in the return message.

    Signed-off-by: John Rutherford
    Acked-by: Jon Maloy
    Signed-off-by: David S. Miller

    John Rutherford
     

09 Nov, 2019

1 commit

  • This commit adds two netlink commands to TIPC in order for user to be
    able to set or remove AEAD keys:
    - TIPC_NL_KEY_SET
    - TIPC_NL_KEY_FLUSH

    When the 'KEY_SET' is given along with the key data, the key will be
    initiated and attached to TIPC crypto. On the other hand, the
    'KEY_FLUSH' command will remove all existing keys if any.

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     

06 Oct, 2019

2 commits


24 Jun, 2019

1 commit


28 Apr, 2019

2 commits

  • Add options to strictly validate messages and dump messages.
    Sometimes validating dump messages non-strictly may be required,
    so add an option for that as well.

    Since none of this can really be applied to existing commands,
    set the options everywhere using the following spatch:

    @@
    identifier ops;
    expression X;
    @@
    struct genl_ops ops[] = {
    ...,
    {
    .cmd = X,
    + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
    ...
    },
    ...
    };

    For new commands one should just not copy the .validate 'opt-out'
    flags and thus get strict validation.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • We currently have two levels of strict validation:

    1) liberal (default)
    - undefined (type >= max) & NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted
    - garbage at end of message accepted
    2) strict (opt-in)
    - NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted

    Split out parsing strictness into four different options:
    * TRAILING - check that there's no trailing data after parsing
    attributes (in message or nested)
    * MAXTYPE - reject attrs > max known type
    * UNSPEC - reject attributes with NLA_UNSPEC policy entries
    * STRICT_ATTRS - strictly validate attribute size

    The default for future things should be *everything*.
    The current *_strict() is a combination of TRAILING and MAXTYPE,
    and is renamed to _deprecated_strict().
    The current regular parsing has none of this, and is renamed to
    *_parse_deprecated().
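
    The split into four options can be pictured as independent bit flags.
    The following is a minimal sketch, assuming illustrative names (the
    VALIDATE_* identifiers and the helper are made up here and are not
    claimed to be the kernel's definitions); it only encodes what the text
    states: the old strict mode is TRAILING plus MAXTYPE, the old regular
    parsing sets nothing, and the future default is everything.

```c
#include <assert.h>

/* Illustrative bit flags named after the four options in the text. */
enum validate_flags {
	VALIDATE_LIBERAL      = 0,      /* old regular parsing: no checks */
	VALIDATE_TRAILING     = 1 << 0, /* no trailing data after attributes */
	VALIDATE_MAXTYPE      = 1 << 1, /* reject attrs > max known type */
	VALIDATE_UNSPEC       = 1 << 2, /* reject NLA_UNSPEC policy entries */
	VALIDATE_STRICT_ATTRS = 1 << 3, /* strictly validate attribute size */
};

/* The old *_strict() behaviour: TRAILING and MAXTYPE only. */
#define VALIDATE_DEPRECATED_STRICT (VALIDATE_TRAILING | VALIDATE_MAXTYPE)

/* "The default for future things should be *everything*." */
#define VALIDATE_STRICT (VALIDATE_TRAILING | VALIDATE_MAXTYPE | \
			 VALIDATE_UNSPEC | VALIDATE_STRICT_ATTRS)

/* Returns 1 if an attribute of this type must be rejected, else 0. */
static int rejects_unknown_type(unsigned int validate, int type, int maxtype)
{
	return (validate & VALIDATE_MAXTYPE) && type > maxtype;
}
```

    Because each check is its own bit, a single flag such as UNSPEC can be
    switched on for an old policy without opting in to the others.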

    Additionally it allows us to selectively set one of the new flags
    even on old policies. Notably, the UNSPEC flag could be useful in
    this case, since it can be arranged (by filling in the policy) to
    not be an incompatible userspace ABI change, but would then going
    forward prevent forgetting attribute entries. Similar can apply
    to the POLICY flag.

    We end up with the following renames:
    * nla_parse -> nla_parse_deprecated
    * nla_parse_strict -> nla_parse_deprecated_strict
    * nlmsg_parse -> nlmsg_parse_deprecated
    * nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
    * nla_parse_nested -> nla_parse_nested_deprecated
    * nla_validate_nested -> nla_validate_nested_deprecated

    Using spatch, of course:
    @@
    expression TB, MAX, HEAD, LEN, POL, EXT;
    @@
    -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
    +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression TB, MAX, NLA, POL, EXT;
    @@
    -nla_parse_nested(TB, MAX, NLA, POL, EXT)
    +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)

    @@
    expression START, MAX, POL, EXT;
    @@
    -nla_validate_nested(START, MAX, POL, EXT)
    +nla_validate_nested_deprecated(START, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, MAX, POL, EXT;
    @@
    -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
    +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)

    For this patch, don't actually add the strict, non-renamed versions
    yet so that it breaks compile if I get it wrong.

    Also, while at it, make nla_validate and nla_parse go down to a
    common __nla_validate_parse() function to avoid code duplication.

    Ultimately, this allows us to have very strict validation for every
    new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
    next patch, while existing things will continue to work as is.

    In effect then, this adds fully strict validation for any new command.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

22 Mar, 2019

1 commit

  • Since maxattr is common, the policy can't really differ sanely,
    so make it common as well.

    The only user that did in fact manage to make a non-common policy
    is taskstats, which has to be really careful about it (since it's
    still using a common maxattr!). This is no longer supported, but
    we can fake it using pre_doit.

    This reduces the size of e.g. nl80211.o (which has lots of commands):

    text     data    bss    dec      hex     filename
    398745   14323   2240   415308   6564c   net/wireless/nl80211.o (before)
    397913   14331   2240   414484   65314   net/wireless/nl80211.o (after)
    --------------------------------
    -832     +8      0      -824

    Which is obviously just 8 bytes for each command, and an added 8
    bytes for the new policy pointer. I'm not sure why the ops list is
    counted as .text though.

    Most of the code transformations were done using the following spatch:
    @ops@
    identifier OPS;
    expression POLICY;
    @@
    struct genl_ops OPS[] = {
    ...,
    {
    - .policy = POLICY,
    },
    ...
    };

    @@
    identifier ops.OPS;
    expression ops.POLICY;
    identifier fam;
    expression M;
    @@
    struct genl_family fam = {
    .ops = OPS,
    .maxattr = M,
    + .policy = POLICY,
    ...
    };

    This also gets rid of devlink_nl_cmd_region_read_dumpit() accessing
    the cb->data as ops, which we want to change in a later genl patch.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

20 Mar, 2019

1 commit

  • Currently, a multicast stream uses either broadcast or replicast as
    transmission method, based on the ratio between the number of actual
    destination nodes and the cluster size.

    However, when an L2 interface (e.g., VXLAN) provides pseudo
    broadcast support, this becomes very inefficient, as it blindly
    replicates multicast packets to all cluster/subnet nodes,
    irrespective of whether they host actual target sockets or not.

    The TIPC multicast algorithm is able to distinguish real destination
    nodes from other nodes, and hence provides a smarter and more
    efficient method for transferring multicast messages than
    pseudo broadcast can do.

    Because of this, we now make it possible for users to force
    the broadcast link to permanently switch to using replicast,
    irrespective of which capabilities the bearer provides,
    or pretends to provide.
    Conversely, we also make it possible to force the broadcast link
    to always use true broadcast. While maybe less useful in
    deployed systems, this may at least be useful for testing the
    broadcast algorithm in small clusters.

    We retain the current AUTOSELECT ability, i.e., to let the broadcast link
    automatically select which algorithm to use, and to switch back and forth
    between broadcast and replicast as the ratio between destination
    node number and cluster size changes. This remains the default method.

    Furthermore, we make it possible to configure the threshold ratio for
    such switches. The default ratio is now set to 10%, down from 25% in the
    earlier implementation.
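
    The three modes and the ratio threshold can be sketched as below. This
    is a hedged illustration, not the implementation: the names are
    hypothetical, and both the direction of the comparison (few
    destinations relative to the cluster select replicast) and the
    inclusive boundary are assumptions inferred from the reasoning above.

```c
#include <assert.h>

/* Hypothetical configuration and result types, for illustration only. */
enum bc_method { BC_BROADCAST, BC_REPLICAST, BC_AUTOSELECT };
enum xmit { XMIT_BROADCAST, XMIT_REPLICAST };

#define DEFAULT_RATIO_PCT 10u /* new default from the text (was 25) */

/*
 * Pick the transmission method for one multicast message.
 * dests:        number of actual destination nodes
 * cluster_size: number of nodes in the cluster
 * ratio_pct:    configured threshold ratio, in percent
 */
static enum xmit select_xmit(enum bc_method cfg, unsigned int dests,
			     unsigned int cluster_size, unsigned int ratio_pct)
{
	if (cfg == BC_BROADCAST)
		return XMIT_BROADCAST; /* forced true broadcast */
	if (cfg == BC_REPLICAST)
		return XMIT_REPLICAST; /* forced replicast */

	/* AUTOSELECT (assumed rule): replicast while the destinations
	 * make up at most ratio_pct percent of the cluster. */
	return (dests * 100u <= cluster_size * ratio_pct) ?
		XMIT_REPLICAST : XMIT_BROADCAST;
}
```

    Under these assumptions, 2 destinations in a 100-node cluster select
    replicast with the default ratio, while 50 destinations select
    broadcast.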

    Acked-by: Jon Maloy
    Signed-off-by: Hoang Le
    Signed-off-by: David S. Miller

    Hoang Le
     

30 Aug, 2018

1 commit

  • syzbot reported a use-after-free in tipc_group_fill_sock_diag(),
    where tipc_group_fill_sock_diag() still reads tsk->group while
    tipc_group_delete() is deleting it in tipc_release().

    tipc_nl_sk_walk() aims to lock this sock when walking each sock
    in the hash table to close race conditions with sock changes like
    this one, by acquiring the tsk->sk.sk_lock.slock spinlock.
    Unfortunately, this doesn't work at all. All non-BH call paths should
    take lock_sock() instead to make it work.

    tipc_nl_sk_walk() brutally iterates with raw rht_for_each_entry_rcu(),
    where the RCU read lock is required; this is the reason why lock_sock()
    can't be taken on this path. This could be resolved by switching to
    the rhashtable iterator APIs, where taking a sleepable lock is possible.
    Also, the iterator APIs are friendly for restartable calls like
    diag dump: the last position is remembered behind the scenes, and
    all we need to do here is save the iterator into cb->args[].
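
    The restartable-dump idea can be illustrated with a much simpler
    stand-in. The sketch below does not use the rhashtable iterator at all;
    it keeps a plain array index in a hypothetical callback state that is
    loosely modelled on cb->args[], only to show how a dump resumes where
    the previous call stopped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback state, loosely modelled on cb->args[]. */
struct dump_cb {
	long args[1]; /* args[0]: index of the next entry to dump */
};

/*
 * Copy up to 'room' entries of 'table' into 'out', resuming from the
 * position saved in cb->args[0]. Returns the number of entries copied;
 * 0 means the dump is complete.
 */
static size_t dump_some(const int *table, size_t n, struct dump_cb *cb,
			int *out, size_t room)
{
	size_t pos = (size_t)cb->args[0];
	size_t copied = 0;

	while (pos < n && copied < room)
		out[copied++] = table[pos++];

	cb->args[0] = (long)pos; /* remember where to restart */
	return copied;
}
```

    A caller keeps invoking dump_some() with the same state until it
    returns 0, in the same way a netlink dump callback is invoked
    repeatedly until it has nothing more to add.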

    I tested this with parallel tipc diag dump and thousands of tipc
    socket creation and release, no crash or memory leak.

    Reported-by: syzbot+b9c8f3ab2994b7cd1625@syzkaller.appspotmail.com
    Cc: Jon Maloy
    Cc: Ying Xue
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

17 Apr, 2018

2 commits

  • syzbot reported a crash in __tipc_nl_net_set() caused by NULL dereference.

    We need to check that both TIPC_NLA_NET_NODEID and TIPC_NLA_NET_NODEID_W1
    are present.

    We also need to make sure userland provided u64 attributes.
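
    The two checks can be sketched in isolation as follows. The attribute
    table, struct and function here are hypothetical stand-ins (not the
    kernel's nlattr handling), and the exact-length test is an assumption
    about what "provided u64 attributes" should mean in a standalone
    example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute indices standing in for the two node ID words. */
enum { ATTR_NODEID, ATTR_NODEID_W1, ATTR_MAX };

/* Hypothetical parsed attribute: payload length and pointer. */
struct attr {
	size_t      len;
	const void *data;
};

/*
 * Read the 128-bit node ID as two 64-bit words. Fails with -EINVAL if
 * either attribute is missing (the NULL dereference case) or is not
 * exactly the size of a u64.
 */
static int get_node_id(const struct attr *tb[], uint64_t *w0, uint64_t *w1)
{
	const struct attr *a0 = tb[ATTR_NODEID];
	const struct attr *a1 = tb[ATTR_NODEID_W1];

	if (!a0 || !a1)
		return -EINVAL;
	if (a0->len != sizeof(uint64_t) || a1->len != sizeof(uint64_t))
		return -EINVAL;

	memcpy(w0, a0->data, sizeof(*w0));
	memcpy(w1, a1->data, sizeof(*w1));
	return 0;
}
```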

    Fixes: d50ccc2d3909 ("tipc: add 128-bit node identifier")
    Signed-off-by: Eric Dumazet
    Cc: Jon Maloy
    Cc: Ying Xue
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Before syzbot/KMSAN bites, add the missing policy for TIPC_NLA_NET_ADDR

    Fixes: 27c21416727a ("tipc: add net set to new netlink api")
    Signed-off-by: Eric Dumazet
    Cc: Jon Maloy
    Cc: Ying Xue
    Signed-off-by: David S. Miller

    Eric Dumazet
     

14 Apr, 2017

1 commit


28 Oct, 2016

4 commits

  • Now genl_register_family() is the only thing (other than the
    users themselves, perhaps, but I didn't find any doing that)
    writing to the family struct.

    In all families that I found, genl_register_family() is only
    called from __init functions (some indirectly, in which case
    I've added __init annotations to clarify things), so all can
    actually be marked __ro_after_init.

    This protects the data structure from accidental corruption.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Instead of providing macros/inline functions to initialize
    the families, make all users initialize them statically and
    get rid of the macros.

    This reduces the kernel code size by about 1.6k on x86-64
    (with allyesconfig).

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Static family IDs have never really been used, the only
    use case was the workaround I introduced for those users
    that assumed their family ID was also their multicast
    group ID.

    Additionally, because static family IDs would never be
    reserved by the generic netlink code, using a relatively
    low ID would only work for built-in families that can be
    registered immediately after generic netlink is started,
    which is basically only the control family (apart from
    the workaround code, which I also had to add code for so
    it would reserve those IDs)

    Thus, anything other than GENL_ID_GENERATE is flawed and
    luckily not used except in the cases I mentioned. Move
    those workarounds into a few lines of code, and then get
    rid of GENL_ID_GENERATE entirely, making it more robust.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • This helper function allows family implementations to access
    their family's attrbuf. This gets rid of the attrbuf usage
    in families, and also adds locking validation, since it's not
    valid to use the attrbuf with parallel_ops or outside of the
    dumpit callback.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

27 Aug, 2016

2 commits

  • When using replicast, a UDP bearer can have an arbitrary number of
    remote IP addresses associated with it. This means we cannot simply
    add all remote IP addresses to an existing bearer data message, as it
    might fill the message, leaving us with a truncated message that we
    can't safely resume. To handle this we introduce the new netlink
    command TIPC_NL_UDP_GET_REMOTEIP. This command is intended to be
    called when the bearer data message has the
    TIPC_NLA_UDP_MULTI_REMOTEIP flag set, indicating there is more than
    one remote IP (replicast).

    Signed-off-by: Richard Alpe
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • This patch introduces UDP replicast, a concept where we emulate
    multicast by sending multiple unicast messages to configured peers.

    The purpose of replicast is mainly to be able to use TIPC in cloud
    environments where IP multicast is disabled. Using replicas to unicast
    multicast messages is costly as we have to copy each skb and send the
    copies individually.
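
    The cost mentioned above comes from the per-peer copy. A minimal
    userspace sketch of the idea, assuming a hypothetical send hook and
    plain heap buffers in place of skbs (none of these names come from the
    patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical unicast hook; it takes ownership of 'copy'. */
typedef int (*unicast_fn)(int peer, void *copy, size_t len, void *ctx);

/*
 * Emulate multicast: make one private copy of the message per configured
 * peer and hand each copy to the unicast hook. Returns the number of
 * peers the hook accepted, or -1 if a copy could not be allocated.
 */
static int replicast_send(const void *msg, size_t len, const int *peers,
			  size_t npeers, unicast_fn send, void *ctx)
{
	int sent = 0;

	assert(len > 0);
	for (size_t i = 0; i < npeers; i++) {
		void *copy = malloc(len);

		if (!copy)
			return -1;
		memcpy(copy, msg, len); /* the per-peer cost */
		if (send(peers[i], copy, len, ctx) == 0)
			sent++;
	}
	return sent;
}

/* Example hook: adds up the bytes "sent" and frees the copy. */
static int count_and_free(int peer, void *copy, size_t len, void *ctx)
{
	(void)peer;
	*(size_t *)ctx += len;
	free(copy);
	return 0;
}
```

    Sending a 4-byte message to three peers through count_and_free()
    accounts for 12 bytes in total, which shows how the work grows with
    the number of configured peers.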

    Signed-off-by: Richard Alpe
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Richard Alpe
     

19 Aug, 2016

1 commit

  • Add TIPC_NL_PEER_REMOVE netlink command. This command can remove
    an offline peer node from the internal data structures.

    This will be supported by the tipc user space tool in iproute2.

    Signed-off-by: Richard Alpe
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe
     

27 Jul, 2016

3 commits


08 Mar, 2016

1 commit


21 Nov, 2015

2 commits

  • We move the definition of struct tipc_link from link.h to link.c in
    order to minimize its exposure to the rest of the code.

    When needed, we define new functions to make it possible for external
    entities to access and set data in the link.

    Apart from the above, there are no functional changes.

    Reviewed-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • In our effort to have less code and include dependencies between
    entities such as node, link and bearer, we try to narrow down
    the exposed interface towards the node as much as possible.

    In this commit, we move the definition of struct tipc_node, along
    with many of its associated function declarations, from node.h to
    node.c. We also move some function definitions from link.c and
    name_distr.c to node.c, since they access fields in struct tipc_node
    that should not be externally visible. The moved functions are renamed
    according to new location, and made static whenever possible.

    There are no functional changes in this commit.

    Reviewed-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

10 Feb, 2015

1 commit

  • The new netlink API is no longer "v2" but rather the standard API and
    the legacy API is now "nl compat". We split them into separate
    start/stop and put them in different files in order to further
    distinguish them.

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Ying Xue
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Richard Alpe
     

13 Jan, 2015

2 commits

  • Currently the tipc module only allows users in the "init_net" namespace
    to configure it through the netlink interface. But now almost every
    tipc component is aware of net namespaces, so it's time to open up
    the permission for users residing in other namespaces, allowing them
    to configure their own tipc stack instance through the netlink
    interface.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     
  • Involve the namespace infrastructure: make the "tipc_net_id" global
    variable per-namespace and rename it to "net_id". For the
    conversion to succeed, a network namespace instance must be
    passed to the relevant functions, allowing them to access the
    per-namespace "net_id" variable.

    Signed-off-by: Ying Xue
    Tested-by: Tero Aho
    Reviewed-by: Jon Maloy
    Signed-off-by: David S. Miller

    Ying Xue
     

22 Nov, 2014

7 commits

  • Add TIPC_NL_NAME_TABLE_GET command to the new tipc netlink API.

    This command supports dumping the name table of all nodes.

    Netlink logical layout of name table response message:
    -> name table
        -> publication
            -> type
            -> lower
            -> upper
            -> scope
            -> node
            -> ref
            -> key

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Add TIPC_NL_NET_SET command to the new tipc netlink API.

    This command can set the network id and network (tipc) address.

    Netlink logical layout of network set message:
    -> net
        [ -> id ]
        [ -> address ]

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Add TIPC_NL_NET_GET command to the new tipc netlink API.

    This command dumps the network id of the node.

    Netlink logical layout of returned network data:
    -> net
        -> id

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Add TIPC_NL_NODE_GET to the new tipc netlink API.

    This command can dump the address and node status of all nodes in the
    tipc cluster.

    Netlink logical layout of returned node/address data:
    -> node
        -> address
        -> up flag

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Add TIPC_NL_MEDIA_SET command to the new tipc netlink API.

    This command can set one or more link properties for a particular
    media.

    Netlink logical layout of media set message:
    -> media
        -> name
        -> link properties
            [ -> tolerance ]
            [ -> priority ]
            [ -> window ]

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Add TIPC_NL_MEDIA_GET command to the new tipc netlink API.

    This command supports dumping all information about all defined
    media as well as getting all information about a specific media.

    The information about a media includes name and link properties.

    Netlink logical layout of media get response message:
    -> media
        -> name
        -> link properties
            -> tolerance
            -> priority
            -> window

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Add TIPC_NL_LINK_RESET_STATS command to the new netlink API.

    This command resets the link statistics for a particular link.

    Netlink logical layout of link reset message:
    -> link
        -> name

    Signed-off-by: Richard Alpe
    Reviewed-by: Erik Hugne
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe