09 Apr, 2018
1 commit
-
Commit 4b2e6877b879 ("tipc: Fix namespace violation in tipc_sk_fill_sock_diag")
tried to fix the crash but failed, the crash is still 100% reproducible
with it.In tipc_sk_fill_sock_diag(), skb is the diag dump we are filling, it is not
correct to retrieve its NETLINK_CB(), instead, like other protocol diag,
we should use NETLINK_CB(cb->skb).sk here.Reported-by:
Fixes: 4b2e6877b879 ("tipc: Fix namespace violation in tipc_sk_fill_sock_diag")
Fixes: c30b70deb5f4 (tipc: implement socket diagnostics for AF_TIPC)
Cc: GhantaKrishnamurthy MohanKrishna
Cc: Jon Maloy
Cc: Ying Xue
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
04 Apr, 2018
2 commits
-
To fetch UID info for socket diagnostics, we determine the
namespace of user context using tipc socket instance. This
may cause namespace violation, as the kernel will remap based
on UID.We fix this by fetching namespace info using the calling userspace
netlink socket.Fixes: c30b70deb5f4 (tipc: implement socket diagnostics for AF_TIPC)
Reported-by: syzbot+326e587eff1074657718@syzkaller.appspotmail.com
Acked-by: Jon Maloy
Signed-off-by: GhantaKrishnamurthy MohanKrishna
Signed-off-by: David S. Miller -
When an item of struct tipc_subscription is created, we fail to
initialize the two lists aggregated into the struct. This has so far
never been a problem, since the items are just added to a root
object by list_add(), which does not require the addee list to be
pre-initialized. However, syzbot is provoking situations where this
addition fails, whereupon the attempted removal if the item from
the list causes a crash.This problem seems to always have been around, despite that the code
for creating this object was rewritten in commit 242e82cc95f6 ("tipc:
collapse subscription creation functions"), which is still in net-next.We fix this for that commit by initializing the two lists properly.
Fixes: 242e82cc95f6 ("tipc: collapse subscription creation functions")
Reported-by: syzbot+0bb443b74ce09197e970@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller
01 Apr, 2018
4 commits
-
gcc points out that the combined length of the fixed-length inputs to
l->name is larger than the destination buffer size:net/tipc/link.c: In function 'tipc_link_create':
net/tipc/link.c:465:26: error: '%s' directive writing up to 32 bytes
into a region of size between 26 and 58 [-Werror=format-overflow=]
sprintf(l->name, "%s:%s-%s:unknown", self_str, if_name, peer_str);net/tipc/link.c:465:2: note: 'sprintf' output 11 or more bytes
(assuming 75) into a destination of size 60
sprintf(l->name, "%s:%s-%s:unknown", self_str, if_name, peer_str);A detailed analysis reveals that the theoretical maximum length of
a link name is:
max self_str + 1 + max if_name + 1 + max peer_str + 1 + max if_name =
16 + 1 + 15 + 1 + 16 + 1 + 15 = 65
Since we also need space for a trailing zero we now set MAX_LINK_NAME
to 68.Just to be on the safe side we also replace the sprintf() call with
snprintf().Fixes: 25b0b9c4e835 ("tipc: handle collisions of 32-bit node address
hash values")
Reported-by: Arnd BergmannSigned-off-by: Jon Maloy
Signed-off-by: David S. Miller -
With the new RB tree structure for service ranges it becomes possible to
solve an old problem; - we can now allow overlapping service ranges in
the table.When inserting a new service range to the tree, we use 'lower' as primary
key, and when necessary 'upper' as secondary key.Since there may now be multiple service ranges matching an indicated
'lower' value, we must also add the 'upper' value to the functions
used for removing publications, so that the correct, corresponding
range item can be found.These changes guarantee that a well-formed publication/withdrawal item
from a peer node never will be rejected, and make it possible to
eliminate the problematic backlog functionality we currently have for
handling such cases.Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
The function tipc_nametbl_translate() function is ugly and hard to
follow. This can be improved somewhat by introducing a stack variable
for holding the publication list to be used and re-ordering the if-
clauses for selection of algorithm.Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
The current design of the binding table has an unnecessary memory
consuming and complex data structure. It aggregates the service range
items into an array, which is expanded by a factor two every time it
becomes too small to hold a new item. Furthermore, the arrays never
shrink when the number of ranges diminishes.We now replace this array with an RB tree that is holding the range
items as tree nodes, each range directly holding a list of bindings.This, along with a few name changes, improves both readability and
volume of the code, as well as reducing memory consumption and hopefully
improving cache hit rate.Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller
28 Mar, 2018
1 commit
-
Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.Signed-off-by: Kirill Tkhai
Signed-off-by: David S. Miller
27 Mar, 2018
2 commits
-
Fixes the following sparse warning:
net/tipc/node.c:336:18: warning:
symbol 'tipc_node_create' was not declared. Should it be static?Signed-off-by: Wei Yongjun
Acked-by: Jon Maloy
Signed-off-by: David S. Miller -
Release alloced resource before return from the error handling
case in tipc_udp_enable(), otherwise will cause memory leak.Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default")
Signed-off-by: Wei Yongjun
Acked-by: Jon Maloy
Signed-off-by: David S. Miller
26 Mar, 2018
1 commit
-
Fixes: 25b0b9c4e835 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Fengguang Wu
Acked-by: Jon Maloy jon.maloy@ericsson.com
Signed-off-by: David S. Miller
24 Mar, 2018
8 commits
-
Selecting and explicitly configuring a TIPC node identity may be
unwanted in some cases.In this commit we introduce a default setting if the identity has not
been set at the moment the first bearer is enabled. We do this by
using a raw copy of a unique identifier from the used interface: MAC
address in the case of an L2 bearer, IPv4/IPv6 address in the case
of a UDP bearer.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
When a 32-bit node address is generated from a 128-bit identifier,
there is a risk of collisions which must be discovered and handled.We do this as follows:
- We don't apply the generated address immediately to the node, but do
instead initiate a 1 sec trial period to allow other cluster members
to discover and handle such collisions.- During the trial period the node periodically sends out a new type
of message, DSC_TRIAL_MSG, using broadcast or emulated broadcast,
to all the other nodes in the cluster.- When a node is receiving such a message, it must check that the
presented 32-bit identifier either is unused, or was used by the very
same peer in a previous session. In both cases it accepts the request
by not responding to it.- If it finds that the same node has been up before using a different
address, it responds with a DSC_TRIAL_FAIL_MSG containing that
address.- If it finds that the address has already been taken by some other
node, it generates a new, unused address and returns it to the
requester.- During the trial period the requesting node must always be prepared
to accept a failure message, i.e., a message where a peer suggests a
different (or equal) address to the one tried. In those cases it
must apply the suggested value as trial address and restart the trial
period.This algorithm ensures that in the vast majority of cases a node will
have the same address before and after a reboot. If a legacy user
configures the address explicitly, there will be no trial period and
messages, so this protocol addition is completely backwards compatible.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
We add a 128-bit node identity, as an alternative to the currently used
32-bit node address.For the sake of compatibility and to minimize message header changes
we retain the existing 32-bit address field. When not set explicitly by
the user, this field will be filled with a hash value generated from the
much longer node identity, and be used as a shorthand value for the
latter.We permit either the address or the identity to be set by configuration,
but not both, so when the address value is set by a legacy user the
corresponding 128-bit node identity is generated based on the that value.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
As a preparation to changing the addressing structure of TIPC we replace
all direct accesses to the tipc_net::own_addr field with the function
dedicated for this, tipc_own_addr().There are no changes to program logics in this commit.
Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
The removal of an internal structure of the node address has an unwanted
side effect.
- Currently, if a user is sending an anycast message with destination
domain 0, the tipc_namebl_translate() function will use the 'closest-
first' algorithm to first look for a node local destination, and only
when no such is found, will it resort to the cluster global 'round-
robin' lookup algorithm.
- Current users can get around this, and enforce unconditional use of
global round-robin by indicating a destination as Z.0.0 or Z.C.0.
- This option disappears when we make the node address flat, since the
lookup algorithm has no way of recognizing this case. So, as long as
there are node local destinations, the algorithm will always select
one of those, and there is nothing the sender can do to change this.We solve this by eliminating the 'closest-first' option, which was never
a good idea anyway, for non-legacy users, but only for those. To
distinguish between legacy users and non-legacy users we introduce a new
flag 'legacy_addr_format' in struct tipc_core, to be set when the user
configures a legacy-style Z.C.N node address. Hence, when a legacy user
indicates a zero lookup domain 'closest-first' is selected, and in all
other cases we use 'round-robin'.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
Nominally, TIPC organizes network nodes into a three-level network
hierarchy consisting of the levels 'zone', 'cluster' and 'node'. This
hierarchy is reflected in the node address format, - it is sub-divided
into an 8-bit zone id, and 12 bit cluster id, and a 12-bit node id.However, the 'zone' and 'cluster' levels have in reality never been
fully implemented,and never will be. The result of this has been
that the first 20 bits the node identity structure have been wasted,
and the usable node identity range within a cluster has been limited
to 12 bits. This is starting to become a problem.In the following commits, we will need to be able to connect between
nodes which are using the whole 32-bit value space of the node address.
We therefore remove the restrictions on which values can be assigned
to node identity, -it is from now on only a 32-bit integer with no
assumed internal structure.Isolation between clusters is now achieved only by setting different
values for the 'network id' field used during neighbor discovery, in
practice leading to the latter becoming the new cluster identity.The rules for accepting discovery requests/responses from neighboring
nodes now become:- If the user is using legacy address format on both peers, reception
of discovery messages is subject to the legacy lookup domain check
in addition to the cluster id check.- Otherwise, the discovery request/response is always accepted, provided
both peers have the same network id.This secures backwards compatibility for users who have been using zone
or cluster identities as cluster separators, instead of the intended
'network id'.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
To facilitate the coming changes in the neighbor discovery functionality
we make some renaming and refactoring of that code. The functional changes
in this commit are trivial, e.g., that we move the message sending call in
tipc_disc_timeout() outside the spinlock protected region.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
As a preparation for the next commits we try to reduce the footprint of
the function tipc_enable_bearer(), while hopefully making is simpler to
follow.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller
23 Mar, 2018
3 commits
-
Currently when tipc is unable to queue a received message on a
socket, the message is rejected back to the sender with error
TIPC_ERR_OVERLOAD. However, the application on this socket
has no knowledge about these discards.In this commit, we try to step the sk_drops counter when tipc
is unable to queue a received message. Export sk_drops
using tipc socket diagnostics.Acked-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: GhantaKrishnamurthy MohanKrishna
Signed-off-by: Parthasarathy Bhuvaragan
Signed-off-by: David S. Miller -
This commit adds socket diagnostics capability for AF_TIPC in netlink
family NETLINK_SOCK_DIAG in a new kernel module (diag.ko).The following are key design considerations:
- config TIPC_DIAG has default y, like INET_DIAG.
- only requests with flag NLM_F_DUMP is supported (dump all).
- tipc_sock_diag_req message is introduced to send filter parameters.
- the response attributes are of TLV, some nested.To avoid exposing data structures between diag and tipc modules and
avoid code duplication, the following additions are required:
- export tipc_nl_sk_walk function to reuse socket iterator.
- export tipc_sk_fill_sock_diag to fill the tipc diag attributes.
- create a sock_diag response message in __tipc_add_sock_diag defined
in diag.c and use the above exported tipc_sk_fill_sock_diag
to fill response.Acked-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: GhantaKrishnamurthy MohanKrishna
Signed-off-by: Parthasarathy Bhuvaragan
Signed-off-by: David S. Miller -
The current socket iterator function tipc_nl_sk_dump, handles socket
locks and calls __tipc_nl_add_sk for each socket.
To reuse this logic in sock_diag implementation, we do minor
modifications to make these functions generic as described below.In this commit, we add a two new functions __tipc_nl_sk_walk,
__tipc_nl_add_sk_info and modify tipc_nl_sk_dump, __tipc_nl_add_sk
accordingly.In __tipc_nl_sk_walk we:
1. acquire and release socket locks
2. for each socket, execute the specified callback functionIn __tipc_nl_add_sk we:
- Move the netlink attribute insertion to __tipc_nl_add_sk_info.tipc_nl_sk_dump calls tipc_nl_sk_walk with __tipc_nl_add_sk as argument.
sock_diag will use these generic functions in a later commit.
There is no functional change in this commit.
Acked-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: GhantaKrishnamurthy MohanKrishna
Signed-off-by: Parthasarathy Bhuvaragan
Signed-off-by: David S. Miller
18 Mar, 2018
5 commits
-
We rename some lists and fields in struct publication both to make
the naming more consistent and to better reflect their roles. We
also update the descriptions of those lists.node_list -> local_publ
cluster_list -> all_publ
pport_list -> binding_sock
ref -> portThere are no functional changes in this commit.
Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
The size of struct publication can be reduced further. Membership in
lists 'nodesub_list' and 'local_list' is mutually exlusive, in that
remote publications use the former and local publications the latter.
We replace the two lists with one single, named 'binding_node' which
reflects what it really is.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
As a further consequence of the previous commits, we can also remove
the member 'zone_list 'in struct name_info and struct publication.
Instead, we now let the member cluster_list take over the role a
container of all publications of a given .
We also remove the counters for the size of those lists, since
they don't serve any purpose.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
As a consequence of the previous commit we nan now eliminate zone scope
related lists in the name table. We start with name_table::publ_list[3],
which can now be replaced with two lists, one for node scope publications
and one for cluster scope publications.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
Publications for TIPC_CLUSTER_SCOPE and TIPC_ZONE_SCOPE are in all
aspects handled the same way, both on the publishing node and on the
receiving nodes.Despite previous ambitions to the contrary, this is never going to change,
so we take the conseqeunce of this and obsolete TIPC_ZONE_SCOPE and related
macros/functions. Whenever a user is doing a bind() or a sendmsg() attempt
using ZONE_SCOPE we translate this internally to CLUSTER_SCOPE, while we
remain compatible with users and remote nodes still using ZONE_SCOPE.Furthermore, the non-formalized scope value 0 has always been permitted
for use during lookup, with the same meaning as ZONE_SCOPE/CLUSTER_SCOPE.
We now permit it even as binding scope, but for compatibility reasons we
choose to not change the value of TIPC_CLUSTER_SCOPE.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller
13 Mar, 2018
1 commit
-
TIPC looks concentrated in itself, and other pernet_operations
seem not touching its entities.tipc_net_ops look pernet-divided, and they should be safe to
be executed in parallel for several net the same time.Signed-off-by: Kirill Tkhai
Signed-off-by: David S. Miller
08 Mar, 2018
1 commit
-
Assign true or false to boolean variables instead of an integer value.
This issue was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva
Acked-by: Ying Xue
Signed-off-by: David S. Miller
06 Mar, 2018
1 commit
-
All of the conflicts were cases of overlapping changes.
In net/core/devlink.c, we have to make care that the
resouce size_params have become a struct member rather
than a pointer to such an object.Signed-off-by: David S. Miller
28 Feb, 2018
1 commit
-
In commit 60c253069632 ("tipc: fix race between poll() and
setsockopt()") we introduced a pointer from struct tipc_group to the
'group_is_connected' flag in struct tipc_sock, so that this field can
be checked without dereferencing the group pointer of the latter struct.The initial value for this flag is correctly set to 'false' when a
group is created, but we miss the case when no group is created at
all, in which case the initial value should be 'true'. This has the
effect that SOCK_RDM/DGRAM sockets sending datagrams never receive
POLLOUT if they request so.This commit corrects this bug.
Fixes: 60c253069632 ("tipc: fix race between poll() and setsockopt()")
Reported-by: Hoang Le
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller
20 Feb, 2018
3 commits
-
syzbot reported a scheduling while atomic issue at netns
destruction time:BUG: sleeping function called from invalid context at net/core/sock.c:2769
in_atomic(): 1, irqs_disabled(): 0, pid: 85, name: kworker/u4:3
5 locks held by kworker/u4:3/85:
#0: ((wq_completion)"%s""netns"){+.+.}, at: []
process_one_work+0xaaf/0x1af0 kernel/workqueue.c:2084
#1: (net_cleanup_work){+.+.}, at: []
process_one_work+0xb01/0x1af0 kernel/workqueue.c:2088
#2: (net_sem){++++}, at: [] cleanup_net+0x23f/0xd20
net/core/net_namespace.c:494
#3: (net_mutex){+.+.}, at: [] cleanup_net+0xa7d/0xd20
net/core/net_namespace.c:496
#4: (&(&srv->idr_lock)->rlock){+...}, at: []
spin_lock_bh include/linux/spinlock.h:315 [inline]
#4: (&(&srv->idr_lock)->rlock){+...}, at: []
tipc_topsrv_stop+0x231/0x610 net/tipc/topsrv.c:685
CPU: 0 PID: 85 Comm: kworker/u4:3 Not tainted 4.16.0-rc1+ #230
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
___might_sleep+0x2b2/0x470 kernel/sched/core.c:6128
__might_sleep+0x95/0x190 kernel/sched/core.c:6081
lock_sock_nested+0x37/0x110 net/core/sock.c:2769
lock_sock include/net/sock.h:1463 [inline]
tipc_release+0x103/0xff0 net/tipc/socket.c:572
sock_release+0x8d/0x1e0 net/socket.c:594
tipc_topsrv_stop+0x3c0/0x610 net/tipc/topsrv.c:696
tipc_exit_net+0x15/0x40 net/tipc/core.c:96
ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:148
cleanup_net+0x6ba/0xd20 net/core/net_namespace.c:529
process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429This is caused by tipc_topsrv_stop() releasing the listener socket
with the idr lock held. This changeset addresses the issue moving
the release operation outside such lock.Reported-and-tested-by: syzbot+749d9d87c294c00ca856@syzkaller.appspotmail.com
Fixes: 0ef897be12b8 ("tipc: separate topology server listener socket from subcsriber sockets")
Signed-off-by: Paolo Abeni
Acked-by: ///jon
Signed-off-by: David S. Miller -
In commit cc1ea9ffadf7 ("tipc: eliminate struct tipc_subscriber") we
re-introduced an old bug on the error path in the function
tipc_topsrv_kern_subscr(). We now re-introduce the correction too.Reported-by: syzbot+f62e0f2a0ef578703946@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller
17 Feb, 2018
6 commits
-
We rename struct tipc_server to struct tipc_topsrv. This reflect its now
specialized role as topology server. Accoringly, we change or add function
prefixes to make it clearer which functionality those belong to.There are no functional changes in this commit.
Acked-by: Ying.Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
We move the listener socket to struct tipc_server and give it its own
work item. This makes it easier to follow the code, and entails some
simplifications in the reception code in subscriber sockets.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
In order to narrow the interface and dependencies between the topology
server and the subscription/binding table functionality we move struct
tipc_server inside the file server.c. This requires some code
adaptations in other files, but those are mostly minor.The most important change is that we have to move the start/stop
functions for the topology server to server.c, where they logically
belong anyway.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
Since we now have removed struct tipc_subscriber from the code, and
only struct tipc_subscription remains, there is no longer need for long
and awkward prefixes to distinguish between their pertaining functions.We now change all tipc_subscrp_* prefixes to tipc_sub_*. This is
a purely cosmetic change.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
After the previous changes it becomes logical to collapse the two-level
creation of subscription instances into one. We do that here.We also rename the creation and deletion functions for more consistency.
Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
Because of the requirement for total distribution transparency, users
send subscriptions and receive topology events in their own host format.
It is up to the topology server to determine this format and do the
correct conversions to and from its own host format when needed.Until now, this has been handled in a rather non-transparent way inside
the topology server and subscriber code, leading to unnecessary
complexity when creating subscriptions and issuing events.We now improve this situation by adding two new macros, tipc_sub_read()
and tipc_evt_write(). Both those functions calculate the need for
conversion internally before performing their respective operations.
Hence, all handling of such conversions become transparent to the rest
of the code.Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller