Eric Lee / smarc-fsl-linux-kernel

19 Aug, 2019

1 commit

e654f9f53 tipc: clean up skb list lock handling on send path

The policy for handling the skb list locks on the send and receive paths
is simple.

- On the send path we never need to grab the lock on the 'xmitq' list
when the destination is an external node.

- On the receive path we always need to grab the lock on the 'inputq'
list, irrespective of source node.

However, when transmitting node local messages those will eventually
end up on the receive path of a local socket, meaning that the argument
'xmitq' in tipc_node_xmit() will become the 'inputq' argument in the
function tipc_sk_rcv(). This has been handled by always initializing
the spinlock of the 'xmitq' list at message creation, just in case it
may end up on the receive path later, and despite knowing that the lock
in most cases never will be used.

This approach is inaccurate and confusing, and has also concealed the
fact that the stated 'no lock grabbing' policy for the send path is
violated in some cases.

We now clean this up by never initializing the lock at message creation,
instead doing this at the moment we find that the message actually will
enter the receive path. At the same time we fix the four locations
where we incorrectly access the spinlock on the send/error path.

This patch also reverts commit d12cffe9329f ("tipc: ensure head->lock
is initialised") which has now become redundant.
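
A minimal userspace sketch of the policy described above. The type `msg_list` and the `lock_ready` flag are hypothetical stand-ins for `struct sk_buff_head` and its spinlock; they are not taken from the patch and only model when the lock gets initialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff_head: a length plus a lock. */
struct msg_list {
    size_t qlen;
    bool lock_ready;   /* models whether spin_lock_init() has been called */
};

/* Send path: create the list without touching the lock
 * (models the lockless __skb_queue_head_init()). */
static void msg_list_init_nolock(struct msg_list *l)
{
    l->qlen = 0;
    l->lock_ready = false;
}

/* Called only once we know the list is node local and is about to be
 * handed to the receive path (models spin_lock_init() on the list lock). */
static void msg_list_enter_rcv_path(struct msg_list *l)
{
    l->lock_ready = true;
}

/* Receive path: the lock must have been initialized by now. */
static size_t msg_list_rcv(struct msg_list *l)
{
    assert(l->lock_ready);
    return l->qlen;
}
```

Lists bound for an external node never pass through msg_list_enter_rcv_path() in this model, which mirrors the 'no lock grabbing' policy for the send path.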

CC: Eric Dumazet
Reported-by: Chris Packham
Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Reviewed-by: Xin Long
Signed-off-by: David S. Miller

Jon Maloy
2019-08-19 05:01:07 +0800

17 Jun, 2019

1 commit

5cf02612b tipc: purge deferredq list for each grp member in tipc_group_delete

Syzbot reported a memleak caused by grp members' deferredq list not
being purged when the grp is deleted.

The issue occurs when more(msg_grp_bc_seqno(hdr), m->bc_rcv_nxt) is true
in tipc_group_filter_msg(), in which case the skb stays in deferredq.

So fix it by calling __skb_queue_purge for each member's deferredq
in tipc_group_delete() when a tipc sk leaves the grp.

Fixes: b87a5ea31c93 ("tipc: guarantee group unicast doesn't bypass group broadcast")
Reported-by: syzbot+78fbe679c8ca8d264a8d@syzkaller.appspotmail.com
Signed-off-by: Xin Long
Acked-by: Ying Xue
Signed-off-by: David S. Miller

Xin Long
2019-06-17 11:42:05 +0800

28 Apr, 2019

1 commit

ae0be8de9 netlink: make nla_nest_start() add NLA_F_NESTED flag

Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
netlink based interfaces (including recently added ones) are still not
setting it in kernel generated messages. Without the flag, message parsers
not aware of attribute semantics (e.g. wireshark dissector or libmnl's
mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
the structure of their contents.

Unfortunately we cannot just add the flag everywhere as there may be
userspace applications which check nlattr::nla_type directly rather than
through a helper masking out the flags. Therefore the patch renames
nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
are rewritten to use nla_nest_start().

Except for changes in include/net/netlink.h, the patch was generated using
this semantic patch:

@@ expression E1, E2; @@
-nla_nest_start(E1, E2)
+nla_nest_start_noflag(E1, E2)

@@ expression E1, E2; @@
-nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
+nla_nest_start(E1, E2)
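
To illustrate why the flag cannot simply be added everywhere, here is a small userspace sketch. The flag and mask values match include/uapi/linux/netlink.h; the helper functions nest_type_noflag(), nest_type() and parsed_type() are hypothetical and only model what ends up in nlattr::nla_type.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits carried in the upper bits of nlattr::nla_type. */
#define NLA_F_NESTED        (1 << 15)
#define NLA_F_NET_BYTEORDER (1 << 14)
#define NLA_TYPE_MASK       (~(NLA_F_NESTED | NLA_F_NET_BYTEORDER))

/* Hypothetical: the nla_type written by the old, flagless variant. */
static uint16_t nest_type_noflag(int attrtype)
{
    return (uint16_t)attrtype;
}

/* Hypothetical: the nla_type written by the new wrapper. */
static uint16_t nest_type(int attrtype)
{
    return nest_type_noflag(attrtype | NLA_F_NESTED);
}

/* What a flag-aware parser does before comparing attribute types. */
static int parsed_type(uint16_t nla_type)
{
    return nla_type & NLA_TYPE_MASK;
}
```

A userspace program comparing nla_type directly against 5 would stop matching once the flag is set, while one that masks with NLA_TYPE_MASK keeps working.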

Signed-off-by: Michal Kubecek
Acked-by: Jiri Pirko
Acked-by: David Ahern
Signed-off-by: David S. Miller

Michal Kubecek
2019-04-28 05:03:44 +0800

17 Mar, 2019

1 commit

4589e28db net: tipc: fix a missing check of nla_nest_start

nla_nest_start could fail and requires a check. The fix returns
-EMSGSIZE if it fails.

Signed-off-by: Kangjie Lu
Signed-off-by: David S. Miller

Kangjie Lu
2019-03-17 03:09:05 +0800

19 Oct, 2018

1 commit

b06f9d9f1 tipc: fix info leak from kernel tipc_event

We initialize a struct tipc_event allocated on the kernel stack to
zero to avert info leak to user space.
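
A minimal sketch of the idea, assuming a hypothetical `struct event`; the real struct tipc_event has different fields. Clearing the whole object first means that fields the code never assigns, and any padding, carry zeroes rather than stale stack contents.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event record, not the real struct tipc_event. */
struct event {
    uint32_t type;
    uint32_t lower;
    uint32_t upper;
    char usr_handle[8];
};

static void fill_event(struct event *evt, uint32_t type,
                       uint32_t lower, uint32_t upper)
{
    assert(evt);
    memset(evt, 0, sizeof(*evt));   /* clear all bytes before filling in */
    evt->type = type;
    evt->lower = lower;
    evt->upper = upper;
    /* usr_handle is deliberately not set here: it stays zero */
}
```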

Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-10-19 07:49:53 +0800

22 Jul, 2018

1 commit

e064cce13 tipc: make some functions static

Fixes the following sparse warnings:

net/tipc/link.c:376:5: warning: symbol 'link_bc_rcv_gap' was not declared. Should it be static?
net/tipc/link.c:823:6: warning: symbol 'link_prepare_wakeup' was not declared. Should it be static?
net/tipc/link.c:959:6: warning: symbol 'tipc_link_advance_backlog' was not declared. Should it be static?
net/tipc/link.c:1009:5: warning: symbol 'tipc_link_retrans' was not declared. Should it be static?
net/tipc/monitor.c:687:5: warning: symbol '__tipc_nl_add_monitor_peer' was not declared. Should it be static?
net/tipc/group.c:230:20: warning: symbol 'tipc_group_find_member' was not declared. Should it be static?

Signed-off-by: YueHaibing
Signed-off-by: David S. Miller

YueHaibing
2018-07-22 07:23:22 +0800

19 Jul, 2018

1 commit

d81d25e66 tipc: remove unused tipc_group_size

After commit eb929a91b213 ("tipc: improve poll() for group member socket"),
it is no longer used.

Signed-off-by: YueHaibing
Acked-by: Jon Maloy
Signed-off-by: David S. Miller

YueHaibing
2018-07-19 04:49:08 +0800

30 Jun, 2018

1 commit

a1be5a20f tipc: extend sock diag for group communication

This commit extends the existing TIPC socket diagnostics framework
for information related to TIPC group communication.

Acked-by: Ying Xue
Acked-by: Jon Maloy
Signed-off-by: GhantaKrishnamurthy MohanKrishna
Signed-off-by: David S. Miller

GhantaKrishnamurthy MohanKrishna
2018-06-30 20:05:42 +0800

06 Mar, 2018

1 commit

0f3e9c97e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

All of the conflicts were cases of overlapping changes.

In net/core/devlink.c, we have to take care that the
resource size_params have become a struct member rather
than a pointer to such an object.

Signed-off-by: David S. Miller

David S. Miller
2018-03-06 14:20:46 +0800

28 Feb, 2018

1 commit

1b22bcad7 tipc: correct initial value for group congestion flag

In commit 60c253069632 ("tipc: fix race between poll() and
setsockopt()") we introduced a pointer from struct tipc_group to the
'group_is_connected' flag in struct tipc_sock, so that this field can
be checked without dereferencing the group pointer of the latter struct.

The initial value for this flag is correctly set to 'false' when a
group is created, but we miss the case when no group is created at
all, in which case the initial value should be 'true'. This has the
effect that SOCK_RDM/DGRAM sockets sending datagrams never receive
POLLOUT if they request so.

This commit corrects this bug.

Fixes: 60c253069632 ("tipc: fix race between poll() and setsockopt()")
Reported-by: Hoang Le
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-02-28 00:46:03 +0800

17 Feb, 2018

1 commit

026321c6d tipc: rename tipc_server to tipc_topsrv

We rename struct tipc_server to struct tipc_topsrv. This reflects its now
specialized role as topology server. Accordingly, we change or add function
prefixes to make it clearer which functionality those belong to.

There are no functional changes in this commit.

Acked-by: Ying.Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-02-17 04:26:34 +0800

20 Jan, 2018

1 commit

60c253069 tipc: fix race between poll() and setsockopt()

Letting tipc_poll() dereference a socket's pointer to struct tipc_group
entails a race risk, as the group item may be deleted in a concurrent
tipc_sk_join() or tipc_sk_leave() thread.

We now move the 'open' flag in struct tipc_group to struct tipc_sock,
and let the former retain only a pointer to the moved field. This will
eliminate the race risk.
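
The ownership change can be sketched as follows. The struct layouts and function names are illustrative assumptions, not the kernel definitions; the point is that the poll path reads a flag owned by the socket and never follows the group pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct group {
    bool *open;              /* points at the flag owned by the socket */
};

struct sock {
    struct group *grp;       /* may go away in a concurrent join/leave */
    bool group_is_open;      /* lives as long as the socket does */
};

static int sock_join(struct sock *sk)
{
    struct group *g = malloc(sizeof(*g));

    if (!g)
        return -1;
    g->open = &sk->group_is_open;
    *g->open = false;        /* a freshly created group starts closed */
    sk->grp = g;
    return 0;
}

static void group_set_open(struct group *g, bool open)
{
    assert(g);
    *g->open = open;
}

static void sock_leave(struct sock *sk)
{
    free(sk->grp);
    sk->grp = NULL;
}

static bool sock_poll_writable(const struct sock *sk)
{
    return sk->group_is_open;   /* no dereference of sk->grp */
}
```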

Reported-by: syzbot+799dafde0286795858ac@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-20 04:12:21 +0800

10 Jan, 2018

9 commits

eb929a91b tipc: improve poll() for group member socket

The current criteria for returning POLLOUT from a group member socket is
too simplistic. It basically returns POLLOUT as soon as the group has
external destinations, something obviously leading to a lot of spinning
during destination congestion situations. At the same time, the internal
congestion handling is unnecessarily complex.

We now change this as follows.

- We introduce an 'open' flag in struct tipc_group. This flag is used
only to help poll() get the setting of POLLOUT right, and *not* for
congestion handling as such. This means that a user can choose to
ignore an EAGAIN for a destination and go on sending messages to
other destinations in the group if he wants to.

- The flag is set to false every time we return EAGAIN on a send call.

- The flag is set to true every time any member, i.e., not necessarily
the member that caused EAGAIN, is removed from the small_win list.

- We remove the group member 'usr_pending' flag. The size of the send
window and presence in the 'small_win' list is sufficient criteria
for recognizing congestion.

This solution seems to be a reasonable compromise between 'anycast',
which is normally not waiting for POLLOUT for a specific destination,
and the other three send modes, which are.
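
The rules for the 'open' flag listed above can be sketched in a few lines. The struct and function names are hypothetical; only the flag semantics come from the description.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct group {
    bool open;           /* consulted by poll() only */
    int small_win_cnt;   /* members currently on the small_win list */
};

/* A send attempt: returning EAGAIN closes the flag, nothing else does. */
static int group_send(struct group *g, bool dest_congested)
{
    if (dest_congested) {
        g->open = false;
        return -EAGAIN;
    }
    return 0;
}

/* Any member leaving the small_win list reopens the flag. */
static void group_small_win_remove(struct group *g)
{
    assert(g->small_win_cnt > 0);
    g->small_win_cnt--;
    g->open = true;
}

static bool group_pollout(const struct group *g)
{
    return g->open;
}
```

Note that a successful send to another destination after an EAGAIN does not reopen the flag in this model; only a removal from the small_win list does.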

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:58 +0800

232d07b74 tipc: improve groupcast scope handling

When a member joins a group, it also indicates a binding scope. This
makes it possible to create both node local groups, invisible to other
nodes, as well as cluster global groups, visible everywhere.

In order to avoid that different members end up having permanently
differing views of group size and membership, we must inhibit locally
and globally bound members from joining the same group.

We do this by using the binding scope as an additional separator between
groups. I.e., a member must ignore all membership events from sockets
using a different scope than itself, and all lookups for message
destinations must require an exact match between the message's lookup
scope and the potential target's binding scope.

Apart from making it possible to create local groups using the same
identity on different nodes, a side effect of this is that it now also
becomes possible to create a cluster global group with the same identity
across the same nodes, without interfering with the local groups.
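
A small sketch of the matching rule, using hypothetical names (`enum scope`, `struct binding`, same_group()). Two bindings are treated as the same group only when both the identity and the scope agree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum scope { SCOPE_NODE, SCOPE_CLUSTER };   /* hypothetical names */

struct binding {
    uint32_t group_id;
    enum scope scope;
};

/* Membership events and destination lookups both require an exact
 * scope match in addition to the group identity. */
static bool same_group(const struct binding *self, const struct binding *other)
{
    assert(self && other);
    return self->group_id == other->group_id && self->scope == other->scope;
}
```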

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:58 +0800

8348500f8 tipc: add option to suppress PUBLISH events for pre-existing publications

Currently, when a user is subscribing for binding table publications,
he will receive a PUBLISH event for all already existing matching items
in the binding table.

However, a group socket making a subscription doesn't need this initial
status update from the binding table, because it has already scanned it
during the join operation. Worse, the multiplicatory effect of issuing
mutual events for dozens or hundreds of group members within a short time
frame puts a heavy load on the topology server, with the end result that
scale out operations on a big group tend to take much longer than needed.

We now add a new filter option, TIPC_SUB_NO_STATUS, for topology server
subscriptions, so that this initial avalanche of events is suppressed.
This change, along with the previous commit, significantly improves the
range and speed of group scale out operations.

We keep the new option internal for the tipc driver, at least for now.

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:58 +0800

d12d2e12c tipc: send out join messages as soon as new member is discovered

When a socket is joining a group, we look up in the binding table to
find if there are already other members of the group present. This is
used for being able to return EAGAIN instead of EHOSTUNREACH if the
user proceeds directly to a send attempt.

However, the information in the binding table can be used to directly
set the created member in state MBR_PUBLISHED and send a JOIN message
to the peer, instead of waiting for a topology PUBLISH event to do this.
When there are many members in a group, the propagation time for such
events can be significant, and we can save time during the join
operation if we use the initial lookup result fully.

In this commit, we eliminate the member state MBR_DISCOVERED which has
been the result of the initial lookup, and do instead go directly to
MBR_PUBLISHED, which initiates the setup.

After this change, the tipc_member FSM looks as follows:

     +-----------+
---->| PUBLISHED |-----------------------------------------------+
PUB- +-----------+                                LEAVE/WITHDRAW |
LISH       |JOIN                                                 |
           |     +-------------------------------------------+   |
           |     |                            LEAVE/WITHDRAW |   |
           |     |                +------------+             |   |
           |     |   +----------->|  PENDING   |---------+   |   |
           |     |   |msg/maxactv +-+---+------+  LEAVE/ |   |   |
           |     |   |              |   |       WITHDRAW |   |   |
           |     |   |   +----------+   |                |   |   |
           |     |   |   |revert/maxactv|                |   |   |
           |     |   |   V              V                V   V   V
           |   +----------+  msg  +------------+       +-----------+
           +-->|  JOINED  |------>|   ACTIVE   |------>|  LEAVING  |--->
           |   +----------+       +-----+------+ LEAVE/+-----------+DOWN
           |        A   A               |      WITHDRAW A   A    A   EVT
           |        |   |               |RECLAIM        |   |    |
           |        |   |REMIT          V               |   |    |
           |        |   |== adv   +------------+        |   |    |
           |        |   +---------| RECLAIMING |--------+   |    |
           |        |             +-----+------+  LEAVE/    |    |
           |        |                   |REMIT   WITHDRAW   |    |
           |        |                   |< adv              |    |
           |        |msg/               V            LEAVE/ |    |
           |        |adv==ADV_IDLE+------------+   WITHDRAW |    |
           |        +-------------|  REMITTED  |------------+    |
           |                      +------------+                 |
           |PUBLISH                                              |
JOIN +-----------+                                LEAVE/WITHDRAW |
---->|  JOINING  |-----------------------------------------------+
     +-----------+

Acked-by: Ying Xue
Signed-off-by: Jon Maloy

Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:58 +0800

c2b22bcf2 tipc: simplify group LEAVE sequence

After the changes in the previous commit the group LEAVE sequence
can be simplified.

We now let the arrival of a LEAVE message unconditionally issue a group
DOWN event to the user. When a topology WITHDRAW event is received, the
member, if it is still there, is set to state LEAVING, but we only issue a
group DOWN event when the link to the peer node is gone, so that no
LEAVE message is to be expected.

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:57 +0800

7ad32bcb7 tipc: create group member event messages when they are needed

In the current implementation, a group socket receiving topology
events about other members just converts the topology event message
into a group event message and stores it until it reaches the right
state to issue it to the user. This complicates the code unnecessarily,
and becomes impractical when we in the coming commits will need to
create and issue membership events independently.

In this commit, we change this so that we just notice the type and
origin of the incoming topology event, and then drop the buffer. Only
when it is time to actually send a group event to the user do we
explicitly create a new message and send it upwards.

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:57 +0800

0233493a5 tipc: adjustment to group member FSM

Analysis reveals that the member state MBR_QUARANTINED in reality is
unnecessary, and can be replaced by the state MBR_JOINING at all
occurrences.

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:57 +0800

4ea5dab54 tipc: let group member stay in JOINED mode if unable to reclaim

We handle a corner case in the function tipc_group_update_rcv_win().
During extreme pressure it might happen that a message receiver has all
its active senders in RECLAIMING or REMITTED mode, meaning that there
is nobody to reclaim advertisements from if an additional sender tries
to go active.

Currently we just set the new sender to ACTIVE anyway, hence at least
theoretically opening up for a receiver queue overflow by exceeding the
MAX_ACTIVE limit. The correct solution to this is to instead add the
member to the pending queue, while letting the oldest member in that
queue revert to JOINED state.

In this commit we refactor the code for handling message arrival from
a JOINED member, both to make it more comprehensible and to cover the
case described above.

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:57 +0800

8d5dee21f tipc: a couple of cleanups

- We remove the 'reclaiming' member list in struct tipc_group, since
it doesn't serve any purpose.

- We simplify the GRP_REMIT_MSG branch of tipc_group_protocol_rcv().

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-10 01:35:57 +0800

09 Jan, 2018

1 commit

a0ce09318 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

David S. Miller
2018-01-09 23:37:00 +0800

06 Jan, 2018

2 commits

d84d1b3b6 tipc: simplify small window members' sorting algorithm

We simplify the sorting algorithm in tipc_update_member(). We also make
the remaining conditional call to this function unconditional, since the
same condition now is tested for inside the said function.

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-06 02:37:03 +0800

38266ca17 tipc: some clarifying name changes

We rename some functions and variables, to make their purpose clearer.

- tipc_group::congested -> tipc_group::small_win. Members in this list
are not necessarily (and typically not) congested. Instead, they may
*potentially* be subject to congestion because their send window is
less than ADV_IDLE, and therefore need to be checked during message
transmission.

- tipc_group_is_receiver() -> tipc_group_is_sender(). This socket will
accept messages coming from members fulfilling this condition, i.e.,
they are senders from this member's viewpoint.

- tipc_group_is_enabled() -> tipc_group_is_receiver(). Members
fulfilling this condition will accept messages sent from the current
socket, i.e., they are receivers from its viewpoint.

There are no functional changes in this commit.

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-06 02:37:03 +0800

03 Jan, 2018

1 commit

f9c935db8 tipc: fix problems with multipoint-to-point flow control

In commit 04d7b574b245 ("tipc: add multipoint-to-point flow control") we
introduced a protocol for preventing buffer overflow when many group
members try to simultaneously send messages to the same receiving member.

Stress test of this mechanism has revealed a couple of related bugs:

- When the receiving member receives an advertisement REMIT message from
one of the senders, it will sometimes prematurely activate a pending
member and send it the remitted advertisement, although the upper
limit for active senders has been reached. This leads to accumulation
of illegal advertisements, and eventually to messages being dropped
because of receive buffer overflow.

- When the receiving member leaves REMITTED state while a received
message is being read, we fail to look at the pending queue to
activate the oldest pending peer. This leads to some pending senders
being starved out, and never getting the opportunity to profit from
the remitted advertisement.

We fix the former in the function tipc_group_proto_rcv() by returning
directly from the function once it becomes clear that the remitting
peer cannot leave REMITTED state at that point.

We fix the latter in the function tipc_group_update_rcv_win() by looking
up and activating the longest pending peer when it becomes clear that the
remitting peer now can leave REMITTED state.

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2018-01-03 10:52:07 +0800

27 Dec, 2017

2 commits

3a33a19bf tipc: fix memory leak of group member when peer node is lost

When a group member receives a member WITHDRAW event, this might have
two reasons: either the peer member is leaving the group, or the link
to the member's node has been lost.

In the latter case we need to issue a DOWN event to the user right away,
and let function tipc_group_filter_msg() perform delete of the member
item. However, in this case we fail to change the state of the member
item to MBR_LEAVING, so the member item is not deleted, and we have a
memory leak.

We now separate better between the four sub-cases of a WITHDRAW event
and make sure that each case is handled correctly.

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2017-12-27 02:06:36 +0800

0a3d805c9 tipc: base group replicast ack counter on number of actual receivers

In commit 2f487712b893 ("tipc: guarantee that group broadcast doesn't
bypass group unicast") we introduced a mechanism that requires the first
(replicated) broadcast sent after a unicast to be acknowledged by all
receivers before permitting sending of the next (true) broadcast.

The counter for keeping track of the number of acknowledges to expect
is based on the tipc_group::member_cnt variable. But this misses that
some of the known members may not be ready for reception, and will never
acknowledge the message, either because they haven't fully joined the
group or because they are leaving the group. Such members are identified
by not fulfilling the condition tested for in the function
tipc_group_is_enabled().

We now set the counter for the actual number of acks to receive at the
moment the message is sent, by just counting the number of recipients
satisfying the tipc_group_is_enabled() test.
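
The counting step can be sketched like this. The state set and the readiness test in member_is_ready() are assumptions made for illustration; the exact condition checked by tipc_group_is_enabled() is not spelled out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mbr_state { MBR_JOINING, MBR_JOINED, MBR_ACTIVE, MBR_LEAVING };

/* Assumed readiness test: members still joining or already leaving
 * will never acknowledge the message. */
static bool member_is_ready(enum mbr_state s)
{
    return s != MBR_JOINING && s != MBR_LEAVING;
}

/* Number of acks to expect, fixed at the moment the message is sent. */
static int expected_acks(const enum mbr_state *members, size_t n)
{
    int cnt = 0;

    assert(members || n == 0);
    for (size_t i = 0; i < n; i++)
        if (member_is_ready(members[i]))
            cnt++;
    return cnt;
}
```

With four known members of which one is joining and one is leaving, the sender waits for two acks rather than four.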

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2017-12-27 02:00:04 +0800

21 Dec, 2017

1 commit

bb25c3855 tipc: remove joining group member from congested list

When we receive a JOIN message from a peer member, the message may
contain an advertised window value ADV_IDLE that permits removing the
member in question from the tipc_group::congested list. However, since
the removal has been made conditional on that the advertised window is
*not* ADV_IDLE, we miss this case. This has the effect that a sender
sometimes may enter a state of permanent, false, broadcast congestion.

We fix this by unconditionally removing the member from the congested
list before calling tipc_member_update(), which might potentially sort
it into the list again.

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2017-12-21 03:56:48 +0800

20 Dec, 2017

1 commit

3db096011 tipc: fix list sorting bug in function tipc_group_update_member()

When, during a join operation, or during message transmission, a group
member needs to be added to the group's 'congested' list, we sort it
into the list in ascending order, according to its current advertised
window size. However, we miss the case when the member is already on
that list. This will have the result that the member, after the window
size has been decremented, might be at the wrong position in that list.
This again may have the effect that we during broadcast and multicast
transmissions miss the fact that a destination is not yet ready for
reception, and we end up sending anyway. From this point on, the
behavior during the remaining session is unpredictable, e.g., with
underflowing window sizes.

We now correct this bug by unconditionally removing the member from
the list before (re-)sorting it in.
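
A minimal sketch of the fix on a singly linked list. The kernel uses its own list primitives; `struct member`, list_remove() and list_sort_in() are hypothetical and only show why the unconditional removal keeps the list ordered.

```c
#include <assert.h>
#include <stddef.h>

struct member {
    int window;              /* current advertised window */
    struct member *next;
};

/* Unlink m if it is on the list; harmless if it is not. */
static void list_remove(struct member **head, struct member *m)
{
    for (struct member **pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == m) {
            *pp = m->next;
            m->next = NULL;
            return;
        }
    }
}

/* Insert m in ascending window order, removing it first so that a
 * member already on the list ends up at its correct new position. */
static void list_sort_in(struct member **head, struct member *m)
{
    struct member **pp = head;

    assert(head && m);
    list_remove(head, m);
    while (*pp && (*pp)->window <= m->window)
        pp = &(*pp)->next;
    m->next = *pp;
    *pp = m;
}
```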

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2017-12-20 03:10:03 +0800

19 Dec, 2017

2 commits

3f42f5fe3 tipc: remove leaving group member from all lists

A group member going into state LEAVING should never go back to any
other state before it is finally deleted. However, this might happen
if the socket needs to send out a RECLAIM message during this interval.
Since we forget to remove the leaving member from the group's 'active'
or 'pending' list, the member might be selected for reclaiming, change
state to RECLAIMING, and get stuck in this state instead of being
deleted. This might lead to suppression of the expected 'member down'
event to the receiver.

We fix this by removing the member from all lists, except the RB tree,
at the moment it goes into state LEAVING.

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2017-12-19 02:16:40 +0800

234833991 tipc: fix lost member events bug

Group messages are not supposed to be returned to sender when the
destination socket disappears. This is done correctly for regular
traffic messages, by setting the 'dest_droppable' bit in the header.
But we forget to do that in group protocol messages. This has the effect
that such messages may sometimes bounce back to the sender, be perceived
as a legitimate peer message, and wreak general havoc for the rest of
the session. In particular, we have seen that a member in state LEAVING
may go back to state RECLAIMED or REMITTED, hence causing suppression
of an otherwise expected 'member down' event to the user.

We fix this by setting the 'dest_droppable' bit even in group protocol
messages.

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2017-12-19 02:16:40 +0800

28 Nov, 2017

1 commit

2e724dca7 tipc: eliminate access after delete in group_filter_msg()

KASAN revealed another access after delete in group.c. This time
it found that we read the header of a received message after the
buffer has been released.

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2017-11-28 03:44:45 +0800

21 Nov, 2017

1 commit

e0e853ac0 tipc: fix access of released memory

When the function tipc_group_filter_msg() finds that a member event
indicates that the member is leaving the group, it first deletes the
member instance, and then purges the message queue being handled
by the call. But the message queue is an aggregated field in the
just deleted item, leading the purge call to access freed memory.

We fix this by swapping the order of the two actions.
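
The corrected ordering, sketched with hypothetical types in userspace: the queue is part of the member object, so it has to be purged while the member still exists.

```c
#include <assert.h>
#include <stdlib.h>

struct msg {
    struct msg *next;
};

struct member {
    struct msg *queue;       /* queue head embedded in the member item */
};

/* Free every queued message; returns how many were released. */
static int queue_purge(struct msg **q)
{
    int n = 0;

    while (*q) {
        struct msg *m = *q;

        *q = m->next;
        free(m);
        n++;
    }
    return n;
}

static int member_delete(struct member *m)
{
    int n;

    assert(m);
    n = queue_purge(&m->queue);   /* first: the queue lives inside *m */
    free(m);                      /* then: release the member itself */
    return n;
}
```

Calling free(m) before queue_purge(&m->queue) would make the purge read the queue head from freed memory, which is the access the commit removes.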

Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Maloy
2017-11-21 19:22:03 +0800

13 Oct, 2017

7 commits

04d7b574b tipc: add multipoint-to-point flow control

We already have point-to-multipoint flow control within a group. But
we also need the opposite: a scheme which can handle that potentially
hundreds of sources may try to send messages to the same destination
simultaneously without causing buffer overflow at the recipient. This
commit adds such a mechanism.

The algorithm works as follows:

- When a member detects a new, joining member, it initially sets its
state to JOINED and advertises a minimum window to the new member.
This window is chosen so that the new member can send exactly one
maximum sized message, or several smaller ones, to the recipient
before it must stop and wait for an additional advertisement. This
minimum window ADV_IDLE is set to 65 1kB blocks.

- When a member receives the first data message from a JOINED member,
it changes the state of the latter to ACTIVE, and advertises a larger
window ADV_ACTIVE = 12 x ADV_IDLE blocks to the sender, so it can
continue sending with minimal disturbances to the data flow.

- The active members are kept in a dedicated linked list. Each time a
message is received from an active member, it will be moved to the
tail of that list. This way, we keep a record of which members have
been most (tail) and least (head) recently active.

- There is a maximum number (16) of permitted simultaneous active
senders per receiver. When this limit is reached, the receiver will
not advertise anything immediately to a new sender, but instead put
it in a PENDING state, and add it to a corresponding queue. At the
same time, it will pick the least recently active member, send it an
advertisement RECLAIM message, and set this member to state
RECLAIMING.

- The reclaimee member has to respond with a REMIT message, meaning that
it goes back to a send window of ADV_IDLE, and returns its unused
advertised blocks beyond that value to the reclaiming member.

- When the reclaiming member receives the REMIT message, it unlinks
the reclaimee from its active list, resets its state to JOINED, and
notes that it is now back at ADV_IDLE advertised blocks to that
member. If there are still unread data messages sent out by
the reclaimee before the REMIT, the member goes into an intermediate
state REMITTED, where it stays until the said messages have been
consumed.

- The returned advertised blocks can now be re-advertised to the
pending member, which is now set to state ACTIVE and added to
the active member list.

- To be proactive, i.e., to minimize the risk that any member will
end up in the pending queue, we start reclaiming resources already
when the number of active members exceeds 3/4 of the permitted
maximum.
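
The numbers above can be put into a small decision function. This is a sketch of the receiver's reaction to the first data message from a JOINED member; the constants come from the description, while `struct receiver`, `struct decision`, RECLAIM_MARK and on_first_msg() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define ADV_IDLE     65                 /* minimum window, in 1 kB blocks */
#define ADV_ACTIVE   (12 * ADV_IDLE)    /* window for an active sender */
#define MAX_ACTIVE   16                 /* simultaneous active senders */
#define RECLAIM_MARK (MAX_ACTIVE * 3 / 4)

enum mbr_state { MBR_JOINED, MBR_ACTIVE, MBR_PENDING };

struct receiver {
    int active_cnt;          /* members currently in state ACTIVE */
};

struct decision {
    enum mbr_state state;    /* new state of the sending member */
    int window;              /* window to advertise now, 0 if none */
    bool reclaim;            /* RECLAIM from least recently active member */
};

static struct decision on_first_msg(struct receiver *r)
{
    struct decision d = { MBR_PENDING, 0, true };

    assert(r->active_cnt >= 0 && r->active_cnt <= MAX_ACTIVE);
    if (r->active_cnt == MAX_ACTIVE)
        return d;            /* limit reached: queue as pending, reclaim */
    r->active_cnt++;
    d.state = MBR_ACTIVE;
    d.window = ADV_ACTIVE;
    d.reclaim = r->active_cnt > RECLAIM_MARK;   /* proactive reclaim */
    return d;
}
```

With these values ADV_ACTIVE is 780 blocks and proactive reclaiming starts once more than 12 members are active.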

Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller

Jon Maloy
2017-10-13 23:46:01 +0800

a3bada706 tipc: guarantee delivery of last broadcast before DOWN event

The following scenario is possible:
- A user sends a broadcast message, and thereafter immediately leaves
the group.
- The LEAVE message, following a different path than the broadcast,
arrives ahead of the broadcast, and the sending member is removed
from the receiver's list.
- The broadcast message arrives, but is dropped because the sender
now is unknown to the recipient.

We fix this by sequence numbering membership events, just like ordinary
unicast messages. Currently, when a JOIN is sent to a peer, it contains
a synchronization point, namely the sequence number of the next sent
broadcast, in order to give the receiver a start synchronization point.
We now let even LEAVE messages contain such an "end synchronization"
point, so that the recipient can delay the removal of the sending member
until it knows that all messages have been received.

The received synchronization points are added as sequence numbers to the
generated membership events, making it possible to handle them almost
the same way as regular unicasts in the receiving filter function. In
particular, a DOWN event with a too high sequence number will be kept
in the reordering queue until the missing broadcast(s) arrive and have
been delivered.
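
A sketch of the deferral test, assuming 16-bit sequence numbers that wrap around. The helper names seq_more() and filter_event() are hypothetical; the kernel has its own comparison helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a is later than b, allowing for wraparound. */
static bool seq_more(uint16_t a, uint16_t b)
{
    uint16_t d = (uint16_t)(a - b);

    return d != 0 && d < 0x8000;
}

enum verdict { DELIVER, DEFER };

/* A membership event whose sequence number is ahead of the next expected
 * broadcast waits in the reordering queue until the gap is filled. */
static enum verdict filter_event(uint16_t evt_seqno, uint16_t bc_rcv_nxt)
{
    return seq_more(evt_seqno, bc_rcv_nxt) ? DEFER : DELIVER;
}
```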

Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller

Jon Maloy
2017-10-13 23:46:01 +0800

2f487712b tipc: guarantee that group broadcast doesn't bypass group unicast

We need a mechanism guaranteeing that group unicasts sent out from a
socket are not bypassed by later sent broadcasts from the same socket.
We do this as follows:

- Each time a unicast is sent, we set the broadcast method for the
socket to "replicast" and "mandatory". This forces the first
subsequent broadcast message to follow the same network and data path
as the preceding unicast to a destination, hence preventing it from
overtaking the latter.

- In order to make the 'same data path' statement above true, we let
group unicasts pass through the multicast link input queue, instead
of as previously through the unicast link input queue.

- In the first broadcast following a unicast, we set a new header flag,
requiring all recipients to immediately acknowledge its reception.

- During the period before all the expected acknowledges are received,
the socket refuses to accept any more broadcast attempts, i.e., by
blocking or returning EAGAIN. This period should typically not be
longer than a few microseconds.

- When all acknowledges have been received, the sending socket will
open up for subsequent broadcasts, this time giving the link layer
freedom to itself select the best transmission method.

- The forced and/or abrupt transmission method changes described above
may lead to broadcasts arriving out of order to the recipients. We
remedy this by introducing code that checks and if necessary
re-orders such messages at the receiving end.
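
The sender side of the steps above can be modelled as a small state machine. All names here (`struct sender`, try_broadcast(), on_ack() and the enum values) are hypothetical; the sketch leaves out the receive side reordering.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum bc_method { BC_LINK_CHOICE, BC_REPLICAST_ACKED };

struct sender {
    bool replicast_mandatory;   /* set by every unicast */
    int acks_pending;           /* acks still expected for a forced broadcast */
};

static void on_unicast_sent(struct sender *s)
{
    s->replicast_mandatory = true;
}

/* Returns 0 and the chosen method, or -EAGAIN while acks are outstanding. */
static int try_broadcast(struct sender *s, int receivers, enum bc_method *method)
{
    assert(receivers >= 0);
    if (s->acks_pending > 0)
        return -EAGAIN;
    if (s->replicast_mandatory) {
        *method = BC_REPLICAST_ACKED;   /* follows the unicast's path */
        s->acks_pending = receivers;
        s->replicast_mandatory = false;
    } else {
        *method = BC_LINK_CHOICE;       /* link layer picks the method */
    }
    return 0;
}

static void on_ack(struct sender *s)
{
    if (s->acks_pending > 0)
        s->acks_pending--;
}
```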

Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller

Jon Maloy
2017-10-13 23:46:01 +0800

b87a5ea31 tipc: guarantee group unicast doesn't bypass group broadcast

Group unicast messages don't follow the same path as broadcast messages,
and there is a high risk that unicasts sent from a socket might bypass
previously sent broadcasts from the same socket.

We fix this by letting all unicast messages carry the sequence number of
the next sent broadcast from the same node, but without updating this
number at the receiver. This way, a receiver can check and if necessary
re-order such messages before they are added to the socket receive buffer.

Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller

Jon Maloy
2017-10-13 23:46:01 +0800

5b8dddb63 tipc: introduce group multicast messaging

The previously introduced message transport to all group members is
based on the tipc multicast service, but is logically a broadcast
service within the group, and that is what we call it.

We now add functionality for sending messages to all group members
having a certain identity. Correspondingly, we call this feature 'group
multicast'. The service is using unicast when only one destination is
found, otherwise it will use the bearer broadcast service to transfer
the messages. In the latter case, the receiving members filter arriving
messages by looking at the intended destination instance. If there is
no match, the message will be dropped, while still being considered
received and read as seen by the flow control mechanism.

Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller

Jon Maloy
2017-10-13 23:46:01 +0800

ee106d7f9 tipc: introduce group anycast messaging

In this commit, we make it possible to send connectionless unicast
messages to any member corresponding to the given member identity,
when there is more than one such member. The sender must use a
TIPC_ADDR_NAME address to achieve this effect.

We also perform load balancing between the destinations, i.e., we
primarily select one which has advertised sufficient send window
to not cause a block/EAGAIN delay, if any. This mechanism is
overlaid on the always present round-robin selection.

Anycast messages are subject to the same start synchronization
and flow control mechanism as group broadcast messages.
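
One way to picture the selection is a round-robin scan that prefers a destination with enough advertised window. This is only a sketch under that assumption; pick_dest() and its parameters are hypothetical and do not mirror the kernel's lookup code.

```c
#include <assert.h>

/* Pick the next destination after 'last' in round-robin order, preferring
 * the first one whose advertised window covers 'need' blocks. If none
 * qualifies, fall back to the plain round-robin successor. */
static int pick_dest(const int *window, int n, int last, int need)
{
    assert(window && n > 0 && last >= 0 && last < n);
    for (int i = 1; i <= n; i++) {
        int k = (last + i) % n;

        if (window[k] >= need)
            return k;
    }
    return (last + 1) % n;
}
```

In the fallback case the sender would then block or get EAGAIN for the chosen destination, as for other group messages.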

Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller

Jon Maloy
2017-10-13 23:46:00 +0800

27bd9ec02 tipc: introduce group unicast messaging

We now make it possible to send connectionless unicast messages
within a communication group. To send a message, the sender can use
either a direct port address, aka port identity, or an indirect port
name to be looked up.

Messages of this type are subject to the same start synchronization
and flow control mechanism as group broadcast messages.

Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller

Jon Maloy
2017-10-13 23:46:00 +0800