Commit 7948f6cc9951f00aeb8edc51fafbb7450f61d62c

Authored by Florian Westphal
Committed by David S. Miller
1 parent d027236c41

mptcp: allow partial cleaning of rtx head dfrag

After adding wmem accounting for the mptcp socket we could get
into a situation where the mptcp socket can't transmit more data,
and mptcp_clean_una doesn't reduce wmem even if snd_una has advanced
because it currently will only remove entire dfrags.

Allow advancing the dfrag head sequence and reduce wmem,
even though this isn't correct (as we can't release the page).
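
To illustrate the partial cleaning described above, here is a minimal userspace sketch. `struct frag_model`, `seq_after64()` and `frag_trim_acked()` are hypothetical names, not kernel APIs. The sketch takes the trimmed length to be `snd_una - data_seq` (the acknowledged prefix) and clamps it to the fragment length. That arithmetic is an assumption of this illustration and is not the delta expression used in the diff of this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct mptcp_data_frag. */
struct frag_model {
	uint64_t data_seq;	/* sequence number of the first byte still held */
	uint64_t data_len;	/* number of bytes still held by this fragment */
};

/* Wrap-safe "a is after b" comparison on 64-bit sequence numbers. */
static bool seq_after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Trim the acknowledged prefix from the head fragment.
 * Returns the number of bytes that could be uncharged from the
 * write-memory counter, or 0 if snd_una has not passed data_seq.
 */
static uint64_t frag_trim_acked(struct frag_model *f, uint64_t snd_una)
{
	uint64_t acked;

	assert(f != NULL);

	if (!seq_after64(snd_una, f->data_seq))
		return 0;

	/* Assumption of this sketch: the acked prefix is snd_una - data_seq. */
	acked = snd_una - f->data_seq;
	if (acked > f->data_len)
		acked = f->data_len;	/* never trim more than the fragment holds */

	f->data_seq += acked;
	f->data_len -= acked;
	return acked;
}
```

With a fragment covering sequence numbers 1000 to 1499, a call with `snd_una` set to 1200 would trim 200 bytes and leave a 300-byte fragment starting at 1200. As the commit message notes, the backing page cannot be released until the whole fragment is acknowledged; only the accounting changes.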

Because we will soon block on the mptcp sk in case wmem is too large,
call sk_stream_write_space() in case we reduced the backlog, so that
a userspace task blocked in sendmsg or poll will be woken up.

This isn't an issue if the send buffer is large, but it is when
SO_SNDBUF is used to reduce it to a lower value.
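
For reference, a small send buffer of the kind mentioned above can be requested from userspace as sketched below. `shrink_sndbuf()` is a hypothetical helper name. The sketch works on any stream socket; using it on an MPTCP socket (one created with IPPROTO_MPTCP on a kernel that supports it) is an assumption and is not shown here.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Request a send buffer of `bytes` on `fd` and return the size the
 * kernel actually applied, or -1 on error. On Linux the kernel doubles
 * the requested value and enforces a minimum, so the result is
 * normally larger than `bytes`.
 */
static int shrink_sndbuf(int fd, int bytes)
{
	int applied = 0;
	socklen_t len = sizeof(applied);

	assert(bytes > 0);

	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &applied, &len) < 0)
		return -1;

	return applied;
}
```

Reading the value back with getsockopt() is useful in a test, since the size the kernel applies is what determines when a writer starts to block.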

Note we can still get a deadlock for low SO_SNDBUF values in
case both sides of the connection write to the socket: both could
be blocked due to wmem being too small -- and the current mptcp stack
will only increment the mptcp ack_seq on recv.

This doesn't happen with the selftest as it uses poll() and
will always call recv if there is data to read.
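
The pattern described above, polling and always reading when data is available, can be sketched as follows. This is not the selftest code. `send_while_draining()` is a hypothetical helper that sends a buffer while draining incoming data first, so the peer's send buffer can empty and both sides keep making progress.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Send `len` bytes from `buf` on `fd`. Whenever the socket is readable,
 * incoming data is read (and discarded) before attempting to write.
 * Returns the number of bytes received along the way, or -1 on error.
 */
static ssize_t send_while_draining(int fd, const char *buf, size_t len)
{
	char scratch[4096];
	size_t sent = 0;
	ssize_t received = 0;
	bool peer_eof = false;

	assert(buf != NULL || len == 0);

	while (sent < len) {
		struct pollfd pfd = { .fd = fd, .events = POLLOUT };

		if (!peer_eof)
			pfd.events |= POLLIN;

		if (poll(&pfd, 1, -1) < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}

		/* Only error flags were reported: nothing more can be done. */
		if (!(pfd.revents & (POLLIN | POLLOUT)))
			return -1;

		/* Read first: this is what lets a blocked peer resume writing. */
		if (pfd.revents & POLLIN) {
			ssize_t n = recv(fd, scratch, sizeof(scratch), MSG_DONTWAIT);

			if (n > 0)
				received += n;
			else if (n == 0)
				peer_eof = true;	/* stop polling for input */
			else if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
				return -1;
		}

		if (pfd.revents & POLLOUT) {
			ssize_t n = send(fd, buf + sent, len - sent,
					 MSG_DONTWAIT | MSG_NOSIGNAL);

			if (n > 0)
				sent += (size_t)n;
			else if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
				return -1;
		}
	}

	return received;
}
```

A program that instead blocks in a plain send() loop on both ends, without reading, is the case the note above warns can deadlock with a small SO_SNDBUF.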

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Showing 2 changed files with 26 additions and 0 deletions

net/mptcp/protocol.c
@@ -338,6 +338,7 @@
 static void dfrag_uncharge(struct sock *sk, int len)
 {
 	sk_mem_uncharge(sk, len);
+	sk_wmem_queued_add(sk, -len);
 }

 static void dfrag_clear(struct sock *sk, struct mptcp_data_frag *dfrag)

@@ -364,8 +365,23 @@
 		cleaned = true;
 	}

+	dfrag = mptcp_rtx_head(sk);
+	if (dfrag && after64(snd_una, dfrag->data_seq)) {
+		u64 delta = dfrag->data_seq + dfrag->data_len - snd_una;
+
+		dfrag->data_seq += delta;
+		dfrag->data_len -= delta;
+
+		dfrag_uncharge(sk, delta);
+		cleaned = true;
+	}
+
 	if (cleaned) {
 		sk_mem_reclaim_partial(sk);
+
+		/* Only wake up writers if a subflow is ready */
+		if (test_bit(MPTCP_SEND_SPACE, &msk->flags))
+			sk_stream_write_space(sk);
 	}
 }

net/mptcp/protocol.h
@@ -190,6 +190,16 @@
 	return list_last_entry(&msk->rtx_queue, struct mptcp_data_frag, list);
 }

+static inline struct mptcp_data_frag *mptcp_rtx_head(const struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
+	if (list_empty(&msk->rtx_queue))
+		return NULL;
+
+	return list_first_entry(&msk->rtx_queue, struct mptcp_data_frag, list);
+}
+
 struct mptcp_subflow_request_sock {
 	struct tcp_request_sock sk;
 	u16 mp_capable : 1,