28 Nov, 2020
1 commit
-
If an msk listener receives an MPJ carrying an invalid token, it
will zero the request socket msk entry. That should later
cause fallback and subflow reset - as per RFC - at
subflow_syn_recv_sock() time due to failing hmac validation.

Since commit 4cf8b7e48a09 ("subflow: introduce and use
mptcp_can_accept_new_subflow()"), we unconditionally dereference
- in mptcp_can_accept_new_subflow() - the subflow request msk
before performing hmac validation. In the above scenario we
hit a NULL ptr dereference.

Address the issue by doing the hmac validation earlier.
Fixes: 4cf8b7e48a09 ("subflow: introduce and use mptcp_can_accept_new_subflow()")
Tested-by: Davide Caratti
Signed-off-by: Paolo Abeni
Reviewed-by: Matthieu Baerts
Link: https://lore.kernel.org/r/03b2cfa3ac80d8fc18272edc6442a9ddf0b1e34e.1606400227.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski
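The reordering described in the commit above can be sketched in plain C. This is only an illustration under stated assumptions: every `*_sketch` type and function is a hypothetical stand-in rather than a kernel structure, and the HMAC check is reduced to a flag plus the "no msk, so validation fails" rule that the commit message relies on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the msk and the MP_JOIN request socket. */
struct msk_sketch {
	bool can_accept;          /* models the "can accept new subflow" state */
};

struct join_req_sketch {
	struct msk_sketch *msk;   /* NULL when the token matched no msk */
	bool hmac_matches;        /* models the outcome of the HMAC comparison */
};

/* HMAC validation: fails outright when there is no msk to validate against. */
static bool hmac_valid_sketch(const struct join_req_sketch *req)
{
	if (req->msk == NULL)
		return false;
	return req->hmac_matches;
}

/* Dereferences msk, so it must only run after a successful validation. */
static bool can_accept_sketch(const struct msk_sketch *msk)
{
	return msk->can_accept;
}

/* Returns true when the join is accepted, false when the subflow is reset. */
static bool handle_join_sketch(const struct join_req_sketch *req)
{
	if (!hmac_valid_sketch(req))            /* validate first ... */
		return false;
	return can_accept_sketch(req->msk);     /* ... dereference afterwards */
}
```

With the checks in this order, a request whose msk entry was zeroed is rejected before anything reads through the NULL pointer.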
16 Oct, 2020
1 commit
-
Minor conflicts in net/mptcp/protocol.h and
tools/testing/selftests/net/Makefile.

In both cases code was added on both sides in the same place
so just keep both.
Signed-off-by: Jakub Kicinski
11 Oct, 2020
2 commits
-
The msk can close MP_JOIN subflows if the initial handshake
fails. Currently such subflows are kept alive in the
conn_list until the msk itself is closed.

Beyond the wasted memory, we could end up sending the
DATA_FIN and the DATA_FIN ack on such sockets, even after a
reset.
Fixes: 43b54c6ee382 ("mptcp: Use full MPTCP-level disconnect state machine")
Reviewed-by: Mat Martineau
Signed-off-by: Paolo Abeni
Signed-off-by: Jakub Kicinski
-
Additional/MP_JOIN subflows that do not pass some initial handshake
tests currently cause fallback to TCP. That is an RFC violation:
we should instead reset the subflow and leave the msk untouched.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/91
Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Reviewed-by: Mat Martineau
Signed-off-by: Paolo Abeni
Signed-off-by: Jakub Kicinski
09 Oct, 2020
2 commits
-
Using packetdrill it's possible to observe the same MPTCP DSN being acked
by different subflows with DACK4 and DACK8. This is in contrast with what
is specified in RFC8684 §3.3.2: if an MPTCP endpoint transmits a 64-bit wide
DSN, it MUST be acknowledged with a 64-bit wide DACK. Fix the 'use_64bit_ack'
variable to make it a property of MPTCP sockets, not TCP subflows.
Fixes: a0c1d0eafd1e ("mptcp: Use 32-bit DATA_ACK when possible")
Acked-by: Paolo Abeni
Signed-off-by: Davide Caratti
Reviewed-by: Mat Martineau
Signed-off-by: Jakub Kicinski
-
Small conflict around locking in rxrpc_process_event() -
channel_lock moved to bundle in next, while state lock
needs _bh() from net.
Signed-off-by: Jakub Kicinski
06 Oct, 2020
2 commits
-
Currently data fin on data packets is not handled properly:
the 'rcv_data_fin_seq' field is interpreted as the last
sequence number carrying valid data, but for data fin
packets with valid maps we currently store map_seq + map_len,
that is, the next value.

The 'write_seq' field instead carries the value subsequent
to the last valid byte, so in mptcp_write_data_fin() we
never correctly detect the last DSS map.
Fixes: 7279da6145bb ("mptcp: Use MPTCP-level flag for sending DATA_FIN")
Fixes: 1a49b2c2a501 ("mptcp: Handle incoming 32-bit DATA_FIN values")
Reviewed-by: Mat Martineau
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
Rejecting non-native endian BTF overlapped with the addition
of support for it.

The rest were more simple overlapping changes, except the
renesas ravb binding update, which had to follow a file
move as well as a YAML conversion.
Signed-off-by: David S. Miller
30 Sep, 2020
1 commit
-
The peer may send a DATA_FIN mapping with either a 32-bit or 64-bit
sequence number. When a 32-bit sequence number is received for the
DATA_FIN, it must be expanded to 64 bits before comparing it to the
last acked sequence number. This expansion was missing.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/93
Fixes: 3721b9b64676 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Signed-off-by: Mat Martineau
Signed-off-by: David S. Miller
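One common way to expand a 32-bit sequence number against a 64-bit reference, as the commit above requires for DATA_FIN, is sketched below. The helper name `expand_seq32_sketch` and the "closest value within 2^31 of the reference" rule are assumptions made for illustration; the kernel's own helper may be structured differently.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the 64-bit value whose low 32 bits equal
 * seq32 and which lies within 2^31 of the 64-bit reference 'ref'.
 */
static uint64_t expand_seq32_sketch(uint64_t ref, uint32_t seq32)
{
	/* Distance from the low half of ref, modulo 2^32. */
	uint32_t diff = seq32 - (uint32_t)ref;

	if (diff < UINT32_C(0x80000000))
		return ref + diff;                        /* at or ahead of ref */
	return ref - (UINT64_C(0x100000000) - diff);  /* behind ref */
}
```

Comparing the expanded value, rather than the raw 32-bit one, against the last acked 64-bit sequence number keeps the comparison meaningful across a wrap of the low 32 bits.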
25 Sep, 2020
2 commits
-
This patch adds a new helper named mptcp_destroy_common() containing the
shared code between mptcp_destroy() and mptcp_sock_destruct().
Suggested-by: Paolo Abeni
Signed-off-by: Geliang Tang
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
This patch implements the remove announced addr and subflow logic in PM
netlink.

When the PM netlink removes an address, we traverse all the existing msk
sockets to find the relevant sockets.

We add a new list named anno_list in mptcp_pm_data, to record all the
announced addrs. During the traversal, we check whether the address has
been recorded. If it has been, we trigger the RM_ADDR signal.

We also check if this address is in conn_list. If it is, we remove the
subflow which is using this local address.

Since we call mptcp_pm_free_anno_list in mptcp_destroy, we need to move
__mptcp_init_sock before the mptcp_is_enabled check in mptcp_init_sock.
Suggested-by: Matthieu Baerts
Suggested-by: Paolo Abeni
Suggested-by: Mat Martineau
Acked-by: Paolo Abeni
Signed-off-by: Geliang Tang
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
24 Sep, 2020
1 commit
-
When receiving a DATA_FIN MPTCP option on a TCP FIN packet, the DATA_FIN
information would be stored but the MPTCP worker did not get
scheduled. In turn, the MPTCP socket state would remain in
TCP_ESTABLISHED and no blocked operations would be awakened.

TCP FIN packets are seen by the MPTCP socket when moving skbs out of the
subflow receive queues, so schedule the MPTCP worker when a skb with
DATA_FIN but no data payload is moved from a subflow queue. Other cases
(DATA_FIN on a bare TCP ACK or on a packet with data payload) are
already handled.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/84
Fixes: 43b54c6ee382 ("mptcp: Use full MPTCP-level disconnect state machine")
Acked-by: Paolo Abeni
Signed-off-by: Mat Martineau
Signed-off-by: Matthieu Baerts
Signed-off-by: David S. Miller
23 Sep, 2020
1 commit
-
Two minor conflicts:
1) net/ipv4/route.c, adding a new local variable while
   moving another local variable and removing its
   initial assignment.
2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
   One pretty prints the port mode differently, whilst another
   changes the driver to try and obtain the port mode from
   the port node rather than the switch node.
Signed-off-by: David S. Miller
18 Sep, 2020
1 commit
-
Christoph reported an infinite loop in the subflow receive path
under stress conditions.

If there are multiple subflows, each of them using a large send
buffer, the delta between the sequence number used by
MPTCP-level retransmission and the current msk->ack_seq
can be greater than INT_MAX.

In the above scenario, when calling mptcp_subflow_discard_data(),
such delta will be truncated to int, and could result in a negative
number: no bytes will be dropped, and subflow_check_data_avail()
will try again to process the same packet, looping forever.

This change addresses the issue by expanding the 'limit' size to 64
bits, so that overflows are not possible anymore.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/87
Fixes: 6719331c2f73 ("mptcp: trigger msk processing even for OoO data")
Reported-and-tested-by: Christoph Paasch
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
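The truncation described in the commit above can be reproduced with a few lines of standalone C. The two helpers are hypothetical models, not the kernel's mptcp_subflow_discard_data(); the first one spells out the usual two's-complement result of storing a 64-bit delta in a 32-bit signed variable, so the sketch does not depend on implementation-defined conversions.

```c
#include <stdint.h>

/* Models storing a 64-bit delta in a 32-bit signed 'limit' (two's complement). */
static int64_t limit_as_int32_sketch(uint64_t delta)
{
	uint32_t low = (uint32_t)delta;   /* keep only the low 32 bits */

	if (low >= UINT32_C(0x80000000))
		return (int64_t)low - INT64_C(0x100000000);   /* reads back negative */
	return (int64_t)low;
}

/* Bytes a discard helper would drop: nothing when the limit is not positive. */
static uint64_t bytes_dropped_sketch(int64_t limit)
{
	return limit > 0 ? (uint64_t)limit : 0;
}
```

For a delta just above INT_MAX the 32-bit limit turns negative and nothing is dropped, so the same packet would be examined again; keeping the limit in a 64-bit variable drops the full delta.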
15 Sep, 2020
9 commits
-
That is needed to let the subflows announce promptly when new
space is available in the receive buffer.

tcp_cleanup_rbuf() is currently a static function, drop the
scope modifier and add a declaration in the TCP header.
Reviewed-by: Mat Martineau
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
Currently the 'backup' attribute of the local endpoint
is ignored. Let's use it for the MP_JOIN handshake.
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
So that it can be accessed easily from the subflow creation
helper. No functional change intended.
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
Add a bunch of MPTCP mibs related to MPTCP OoO data
processing.
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
There is no need to use the tcp_read_sock(), we can
simply drop the skb. Additionally try to look at the
next buffer for in-order data.

This both simplifies the code and avoids unneeded indirect
calls.
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
Add an RB-tree to cope with OoO (at MPTCP level) data.
__mptcp_move_skb() inserts "future" data into the RB tree,
eventually coalescing skbs as allowed by the
MPTCP DSN.

To simplify sequence accounting, move the DSN inside
the cb.

After successfully enqueuing in-sequence data, check
if we can use any data from the RB tree.

Additionally move the data_fin check after spooling
data from the OoO tree, otherwise we could miss shutdown
events.

The RB tree code is copied as verbatim as possible
from tcp_data_queue_ofo(), with a few simplifications
due to the fact that MPTCP doesn't need to cope with
sacks. All bugs here are added by me.
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
This is a prerequisite to allow receiving data from multiple
subflows without re-injection.

Instead of dropping the OoO - "future" data in
subflow_check_data_avail(), call into __mptcp_move_skbs()
and let the msk drop that.

To avoid code duplication factor out the mptcp_subflow_discard_data()
helper.

Note that __mptcp_move_skbs() can now find multiple subflows
with data avail (comprising to-be-discarded data), so must
update the byte counter incrementally.

v1 -> v2:
- fix checkpatch issues (unsigned -> unsigned int)
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
This simplifies mptcp_subflow_data_available() and will
make follow-up patches simpler.

Additionally remove the unneeded checks on subflow copied_seq:
we always move whole skbs out of subflows.
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
Currently, when checking for the 'msk is writable' condition, we
look at the individual subflows write space.
That works well while we send data via a single subflow, but will
not as soon as we enable concurrent xmit on multiple subflows.

With this change msk becomes writable when the following conditions
hold:
- the socket has some free write space
- there is at least a subflow with write free space

Additionally we need to set the NOSPACE bit on all subflows
before blocking.
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
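The two writability conditions listed in the commit above can be expressed as a small predicate. This is a simplified model, assuming that free write space can be represented as a plain number; `subflow_sketch` and `msk_writable_sketch` are hypothetical names, not kernel symbols, and the NOSPACE handling is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-subflow view: only the free write space matters here. */
struct subflow_sketch {
	long free_write_space;
};

/*
 * The msk is writable when the msk-level socket has free write space
 * and at least one subflow has free write space as well.
 */
static bool msk_writable_sketch(long msk_free_write_space,
				const struct subflow_sketch *subflows,
				size_t count)
{
	size_t i;

	if (msk_free_write_space <= 0)
		return false;

	for (i = 0; i < count; i++) {
		if (subflows[i].free_write_space > 0)
			return true;
	}
	return false;
}
```

Checking the msk-level space first means a full msk send buffer blocks the writer even when every subflow still has room.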
11 Sep, 2020
1 commit
-
This patch sets the initial remote_id to zero, otherwise it will be a random
number.

Then it adds the missing subflow's remote_id setting code both in
__mptcp_subflow_connect and in subflow_ulp_clone.
Fixes: 01cacb00b35cb ("mptcp: add netlink-based PM")
Fixes: ec3edaa7ca6ce ("mptcp: Add handling of outgoing MP_JOIN requests")
Fixes: f296234c98a8f ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Geliang Tang
Reviewed-by: Matthieu Baerts
Signed-off-by: David S. Miller
08 Aug, 2020
1 commit
-
With commit b93df08ccda3 ("mptcp: explicitly track the fully
established status"), the status of unaccepted mptcp sockets closed in
mptcp_sock_destruct() changes from TCP_SYN_RECV to TCP_ESTABLISHED.

As a result mptcp_sock_destruct() does not perform the proper
cleanup and inet_sock_destruct() will later emit a warning.

Address the issue by updating the condition tested in mptcp_sock_destruct().
Also update the related comment.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/66
Reported-and-tested-by: Christoph Paasch
Fixes: b93df08ccda3 ("mptcp: explicitly track the fully established status")
Signed-off-by: Paolo Abeni
Reviewed-by: Matthieu Baerts
Signed-off-by: David S. Miller
06 Aug, 2020
1 commit
-
Nicolas reported the following oops:
[ 1521.392541] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[ 1521.394189] #PF: supervisor read access in kernel mode
[ 1521.395376] #PF: error_code(0x0000) - not-present page
[ 1521.396607] PGD 0 P4D 0
[ 1521.397156] Oops: 0000 [#1] SMP PTI
[ 1521.398020] CPU: 0 PID: 22986 Comm: kworker/0:2 Not tainted 5.8.0-rc4+ #109
[ 1521.399618] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 1521.401728] Workqueue: events mptcp_worker
[ 1521.402651] RIP: 0010:mptcp_subflow_create_socket+0xf1/0x1c0
[ 1521.403954] Code: 24 08 89 44 24 04 48 8b 7a 18 e8 2a 48 d4 ff 8b 44 24 04 85 c0 75 7a 48 8b 8b 78 02 00 00 48 8b 54 24 08 48 8d bb 80 00 00 00 8b 89 c0 00 00 00 48 89 8a c0 00 00 00 48 8b 8b 78 02 00 00 8b
[ 1521.408201] RSP: 0000:ffffabc4002d3c60 EFLAGS: 00010246
[ 1521.409433] RAX: 0000000000000000 RBX: ffffa0b9ad8c9a00 RCX: 0000000000000000
[ 1521.411096] RDX: ffffa0b9ae78a300 RSI: 00000000fffffe01 RDI: ffffa0b9ad8c9a80
[ 1521.412734] RBP: ffffa0b9adff2e80 R08: ffffa0b9af02d640 R09: ffffa0b9ad923a00
[ 1521.414333] R10: ffffabc4007139f8 R11: fefefefefefefeff R12: ffffabc4002d3cb0
[ 1521.415918] R13: ffffa0b9ad91fa58 R14: ffffa0b9ad8c9f9c R15: 0000000000000000
[ 1521.417592] FS: 0000000000000000(0000) GS:ffffa0b9af000000(0000) knlGS:0000000000000000
[ 1521.419490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1521.420839] CR2: 00000000000000c0 CR3: 000000002951e006 CR4: 0000000000160ef0
[ 1521.422511] Call Trace:
[ 1521.423103] __mptcp_subflow_connect+0x94/0x1f0
[ 1521.425376] mptcp_pm_create_subflow_or_signal_addr+0x200/0x2a0
[ 1521.426736] mptcp_worker+0x31b/0x390
[ 1521.431324] process_one_work+0x1fc/0x3f0
[ 1521.432268] worker_thread+0x2d/0x3b0
[ 1521.434197] kthread+0x117/0x130
[ 1521.435783] ret_from_fork+0x22/0x30

on some unconventional configuration.

The MPTCP protocol is trying to create a subflow for an
unaccepted server socket. That is allowed by the RFC, even
if subflow creation will likely fail.
Unaccepted sockets still have a NULL sk_socket field,
avoid the issue by failing earlier.
Reported-and-tested-by: Nicolas Rybowski
Fixes: 7d14b0d2b9b3 ("mptcp: set correct vfs info for subflows")
Signed-off-by: Paolo Abeni
Reviewed-by: Matthieu Baerts
Signed-off-by: David S. Miller
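The "fail earlier" approach from the commit above amounts to a guard placed before any code that reads through sk_socket. The sketch below uses hypothetical stand-in types, and the -EINVAL return value is an assumption chosen for the example rather than something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct socket and struct sock. */
struct socket_sketch {
	int placeholder;
};

struct sock_sketch {
	struct socket_sketch *sk_socket;   /* NULL until the socket is accepted */
};

/* Returns 0 on success or a negative errno value on failure. */
static int subflow_create_sketch(const struct sock_sketch *sk)
{
	/* Unaccepted server sockets have no sk_socket yet: bail out early. */
	if (sk->sk_socket == NULL)
		return -EINVAL;   /* assumed error code */

	/* ... creation steps that read sk->sk_socket would follow here ... */
	return 0;
}
```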
01 Aug, 2020
5 commits
-
JOIN requests do not work in syncookie mode -- for HMAC validation, the
peer's nonce and the mptcp token (to obtain the desired connection socket
the join is for) are required, but this information is only present in the
initial syn.

So either we need to drop all JOIN requests once a listening socket enters
syncookie mode, or we need to store enough state to reconstruct the request
socket later.

This adds a state table (1024 entries) to store the data present in the
MP_JOIN syn request and the random nonce used for the cookie syn/ack.

When a MP_JOIN ACK passes cookie validation, the table is consulted
to rebuild the request socket from it.

An alternate approach would be to "cancel" syn-cookie mode and force
MP_JOIN to always use a syn queue entry.

However, doing so brings the backlog over the configured queue limit.

v2: use req->syncookie, not (removed) want_cookie arg
Suggested-by: Paolo Abeni
Signed-off-by: Florian Westphal
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
Will be used to initialize the mptcp request socket when a MP_CAPABLE
request was handled in syncookie mode, i.e. when a TCP ACK containing a
MP_CAPABLE option is a valid syncookie value.

Normally (non-cookie case), MPTCP will generate a unique 32 bit connection
ID and store it in the MPTCP token storage to be able to retrieve the
mptcp socket for subflow joining.

In syncookie case, we do not want to store any state, so just generate the
unique ID and use it in the reply.

This means there is a small window where another connection could generate
the same token.

When Cookie ACK comes back, we check that the token has not been registered
in the meantime. If it was, the connection needs to fall back to TCP.

Changes in v2:
- use req->syncookie instead of passing 'want_cookie' arg to ->init_req()
  (Eric Dumazet)
Signed-off-by: Florian Westphal
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
syncookie code path needs to create an mptcp request sock.
Prepare for this and add mptcp prefix plus needed export of ops struct.
Signed-off-by: Florian Westphal
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
When syncookie support is added, we will need to add a variant of
subflow_init_req() helper. It will do almost the same thing except
that it will not compute/add a token to the mptcp token tree.

To avoid excess copy&paste, this commit splits away part of the
code into a new helper, __subflow_init_req, that can then be re-used
from the 'no insert' function added in a followup change.
Signed-off-by: Florian Westphal
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
-
Once syncookie support is added, no state will be stored anymore when the
syn/ack is generated in syncookie mode.

When the ACK comes back, the generated key will be taken from the TCP ACK,
the token is re-generated and inserted into the token tree.

This means we can't retry with a new key when the token is already taken
in the syncookie case.

Therefore, move the retry logic to the caller to prepare for syncookie
support in mptcp.
Signed-off-by: Florian Westphal
Reviewed-by: Mat Martineau
Signed-off-by: David S. Miller
29 Jul, 2020
2 commits
-
The MPTCP state machine handles disconnections on non-fallback connections,
but the mptcp_sock still needs to get notified when fallback subflows
disconnect.
Signed-off-by: Mat Martineau
Signed-off-by: David S. Miller
-
RFC 8684 appendix D describes the connection state machine for
MPTCP. This patch implements the DATA_FIN / DATA_ACK exchanges and
MPTCP-level socket state changes described in that appendix, rather than
simply sending DATA_FIN along with TCP FIN when disconnecting subflows.

DATA_FIN is now sent and acknowledged before shutting down the
subflows. Received DATA_FIN information (if not part of a data packet)
is written to the MPTCP socket when the incoming DSS option is parsed by
the subflow, and the MPTCP worker is scheduled to process the
flag. DATA_FIN received as part of a full DSS mapping will be handled
when the mapping is processed.

The DATA_FIN is acknowledged by the worker if the reader is caught
up. If there is still data to be moved to the MPTCP-level queue, ack_seq
will be incremented to account for the DATA_FIN when it reaches the end
of the stream and a DATA_ACK will be sent to the peer.
Signed-off-by: Mat Martineau
Signed-off-by: David S. Miller
24 Jul, 2020
6 commits
-
So that we can easily perform some basic PM-related
admission checks before creating the child socket.
Reviewed-by: Mat Martineau
Tested-by: Christoph Paasch
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
tcp_send_active_reset() is more prone to transient errors
(memory allocation or xmit queue full): in stress conditions
the kernel may drop the egress packet, and the client will be
stuck.
Reviewed-by: Mat Martineau
Tested-by: Christoph Paasch
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
When syncookies are in use, the TCP stack may feed into
subflow_syn_recv_sock() plain TCP request sockets. We can't
access mptcp_subflow_request_sock-specific fields on such
sockets. Explicitly check the rsk ops to do safe accesses.
Reviewed-by: Mat Martineau
Tested-by: Christoph Paasch
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
The mentioned function has several unneeded branches,
handle each case - MP_CAPABLE, MP_JOIN, fallback -
under a single conditional and drop quite a bit of
duplicate code.
Reviewed-by: Mat Martineau
Tested-by: Christoph Paasch
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
Currently accepted msk sockets become established only after
accept() returns the new sk to user-space.

As MP_JOIN requests are refused, as per RFC spec, on non fully
established sockets, the above causes mp_join self-tests
instabilities.

This change lets the msk enter the established status
as soon as it receives the 3rd ack and propagates the first
subflow fully established status on the msk socket.

Finally we can change the subflow acceptance condition to
take into account both the sock state and the msk fully
established flag.
Reviewed-by: Mat Martineau
Tested-by: Christoph Paasch
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
Currently we do not init the subflow write sequence for
MP_JOIN subflows. This will cause bad mappings to be
generated as soon as we use non-backup subflows.
Reviewed-by: Mat Martineau
Tested-by: Christoph Paasch
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
18 Jul, 2020
1 commit
-
Since commit d47a72152097 ("mptcp: fix race in subflow_data_ready()"), it
is possible to observe a regression in MP_JOIN kselftests. For sockets in
TCP_CLOSE state, it's not sufficient to just wake up the main socket: we
also need to ensure that received data are made available to the reader.
Silence the WARN_ON_ONCE() in these cases: it preserves the syzkaller fix
and restores kselftests when they are run as follows:

  # while true; do
  > make KBUILD_OUTPUT=/tmp/kselftest TARGETS=net/mptcp kselftest
  > done

Reported-by: Florian Westphal
Fixes: d47a72152097 ("mptcp: fix race in subflow_data_ready()")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/47
Signed-off-by: Davide Caratti
Reviewed-by: Matthieu Baerts
Signed-off-by: David S. Miller