Eric Lee / smarc-fsl-linux-kernel

09 Sep, 2020

1 commit

245500d85 rxrpc: Rewrite the client connection manager

Rewrite the rxrpc client connection manager so that it can support multiple
connections for a given security key to a peer. The following changes are
made:

(1) For each open socket, the code currently maintains an rbtree with the
connections placed into it, keyed by communications parameters. This
is tricky to maintain as connections can be culled from the tree or
replaced within it. Connections can require replacement for a number
of reasons, e.g. their IDs span too great a range for the IDR data
type to represent efficiently, the call ID numbers on that conn would
overflow or the conn got aborted.

This is changed so that there's now a connection bundle object placed
in the tree, keyed on the same parameters. The bundle, however, does
not need to be replaced.

(2) An rxrpc_bundle object can now manage the available channels for a set
of parallel connections. The lock that manages this is moved there
from the rxrpc_connection struct (channel_lock).

(3) There's a dummy bundle for all incoming connections to share so that
they have a channel_lock too. It might be better to give each
incoming connection its own bundle. This bundle is not needed to
manage which channels incoming calls are made on because that's
solely at the whim of the client.

(4) The restrictions on how many client connections are around are
removed. Instead, a previous patch limits the number of client calls
that can be allocated. Ordinarily, client connections are reaped
after 2 minutes on the idle queue, but when more than a certain number
of connections are in existence, the reaper starts reaping them after
2s of idleness instead to get the numbers back down.

It could also be made such that new call allocations are forced to
wait until the number of outstanding connections subsides.
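
The reaping policy described in (4) can be sketched as follows. This is a minimal userspace illustration, not the kernel's code: the two idle lifetimes come from the text above, while the threshold value and all identifiers are assumptions made for the example.

```c
#include <stdbool.h>

/* Idle lifetimes taken from the commit text; the names are illustrative. */
#define CONN_IDLE_EXPIRY_NORMAL_SECS 120 /* 2 minutes */
#define CONN_IDLE_EXPIRY_FAST_SECS     2 /* 2 seconds */

/* Hypothetical threshold: the commit only says "a certain number". */
#define CONN_REAP_THRESHOLD 900

/* Return how long an idle client connection may linger before reaping. */
static unsigned int conn_idle_expiry_secs(unsigned int nr_client_conns)
{
	if (nr_client_conns > CONN_REAP_THRESHOLD)
		return CONN_IDLE_EXPIRY_FAST_SECS;
	return CONN_IDLE_EXPIRY_NORMAL_SECS;
}

/* Decide whether a connection idle for idle_secs is ready to be reaped. */
static bool conn_should_reap(unsigned int nr_client_conns,
			     unsigned int idle_secs)
{
	return idle_secs >= conn_idle_expiry_secs(nr_client_conns);
}
```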

Signed-off-by: David Howells

David Howells
2020-09-09 04:11:43 +0800

21 Aug, 2020

1 commit

4700c4d80 rxrpc: Fix loss of RTT samples due to interposed ACK

The Rx protocol has a mechanism to help generate RTT samples that works by
a client transmitting a REQUESTED-type ACK when it receives a DATA packet
that has the REQUEST_ACK flag set.

The peer, however, may interpose other ACKs before transmitting the
REQUESTED-ACK, as can be seen in the following trace excerpt:

rxrpc_tx_data: c=00000044 DATA d0b5ece8:00000001 00000001 q=00000001 fl=07
rxrpc_rx_ack: c=00000044 00000001 PNG r=00000000 f=00000002 p=00000000 n=0
rxrpc_rx_ack: c=00000044 00000002 REQ r=00000001 f=00000002 p=00000001 n=0
...

DATA packet 1 (q=xx) has REQUEST_ACK set (bit 1 of fl=xx). The incoming
ping (labelled PNG) hard-acks the request DATA packet (f=xx exceeds the
sequence number of the DATA packet), causing it to be discarded from the Tx
ring. The ACK that was requested (labelled REQ, r=xx references the serial
of the DATA packet) comes after the ping, but the sk_buff holding the
timestamp has gone and the RTT sample is lost.

This is particularly noticeable on RPC calls used to probe the service
offered by the peer. A lot of peers end up with an unknown RTT because we
only ever sent a single RPC. This confuses the server rotation algorithm.

Fix this by caching the information about the outgoing packet in RTT
calculations in the rxrpc_call struct rather than looking in the Tx ring.

A four-deep buffer is maintained and both REQUEST_ACK-flagged DATA and
PING-ACK transmissions are recorded in there. When the appropriate
response ACK is received, the buffer is checked for a match and, if found,
an RTT sample is recorded.

If a received ACK refers to a packet with a later serial number than an
entry in the cache, that entry is presumed lost and the entry is made
available to record a new transmission.

ACK types other than REQUESTED-type and PING-type cause any matching
sample to be cancelled as they don't necessarily represent a useful
measurement.

If there's no space in the buffer on ping/data transmission, the sample
base is discarded.
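
The cache behaviour described above can be sketched in userspace C. This is only an illustration under stated assumptions: the structure layout, the function names and the wrap-safe serial comparison are invented for the example and are not taken from the rxrpc_call struct.

```c
#include <stdbool.h>
#include <stdint.h>

#define RTT_CACHE_SIZE 4 /* the "four-deep buffer" from the commit text */

struct rtt_slot {
	bool     in_use;
	uint32_t serial;  /* serial of the DATA or PING ACK packet we sent */
	uint64_t sent_ns; /* transmission timestamp */
};

struct rtt_cache {
	struct rtt_slot slot[RTT_CACHE_SIZE];
};

/* Record a transmission.  Returns false if the buffer is full, in which
 * case the sample base is simply discarded. */
static bool rtt_cache_record(struct rtt_cache *c, uint32_t serial,
			     uint64_t now_ns)
{
	for (int i = 0; i < RTT_CACHE_SIZE; i++) {
		if (!c->slot[i].in_use) {
			c->slot[i] = (struct rtt_slot){ true, serial, now_ns };
			return true;
		}
	}
	return false;
}

/* Process an ACK that references acked_serial.  If the ACK type is usable
 * for measurement and a slot matches, store the RTT in *rtt_ns and return
 * true.  A matching slot is freed either way (cancelling the sample for
 * other ACK types), and slots with an older serial are presumed lost. */
static bool rtt_cache_match(struct rtt_cache *c, uint32_t acked_serial,
			    bool usable, uint64_t now_ns, uint64_t *rtt_ns)
{
	bool found = false;

	for (int i = 0; i < RTT_CACHE_SIZE; i++) {
		struct rtt_slot *s = &c->slot[i];

		if (!s->in_use)
			continue;
		if (s->serial == acked_serial) {
			if (usable) {
				*rtt_ns = now_ns - s->sent_ns;
				found = true;
			}
			s->in_use = false;
		} else if ((int32_t)(acked_serial - s->serial) > 0) {
			s->in_use = false; /* older entry, presumed lost */
		}
	}
	return found;
}
```

Scanning all four slots on every ACK keeps the sketch simple; with a buffer this small a linear search is cheap.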

Fixes: 50235c4b5a2f ("rxrpc: Obtain RTT data by requesting ACKs on DATA packets")
Signed-off-by: David Howells

David Howells
2020-08-21 00:59:27 +0800

29 May, 2020

1 commit

2de569bda ipv4: add ip_sock_set_mtu_discover

Add a helper to directly set the IP_MTU_DISCOVER sockopt from kernel
space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Reviewed-by: David Howells [rxrpc bits]
Signed-off-by: David S. Miller

Christoph Hellwig
2020-05-29 02:11:45 +0800

11 May, 2020

1 commit

c410bf019 rxrpc: Fix the excessive initial retransmission timeout

rxrpc currently uses a fixed 4s retransmission timeout until the RTT is
sufficiently sampled. This can cause problems with some fileservers with
calls to the cache manager in the afs filesystem being dropped from the
fileserver because a packet goes missing and the retransmission timeout is
greater than the call expiry timeout.

Fix this by:

(1) Copying the RTT/RTO calculation code from Linux's TCP implementation
and altering it to fit rxrpc.

(2) Altering the various users of the RTT to make use of the new SRTT
value.

(3) Replacing the use of rxrpc_resend_timeout with the calculated RTO
value (which is needed in jiffies), along with a backoff.

Notes:

(1) rxrpc provides RTT samples by matching the serial numbers on outgoing
DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets
against the reference serial number in incoming REQUESTED ACK and
PING-RESPONSE ACK packets.

(2) Each packet that is transmitted on an rxrpc connection gets a new
per-connection serial number, even for retransmissions, so an ACK can
be cross-referenced to a specific trigger packet. This allows RTT
information to be drawn from retransmitted DATA packets also.

(3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than
on an rxrpc_call because many RPC calls won't live long enough to
generate more than one sample.

(4) The calculated SRTT value is in units of 8ths of a microsecond rather
than nanoseconds.

The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers.
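
As a rough illustration of the kind of calculation involved, the following sketch implements the textbook RFC 6298 estimator with the SRTT kept in eighths of a microsecond, as note (4) describes. It is not the code copied from TCP or the rxrpc implementation; the struct, the function names and the quarter-microsecond scaling of the variance are assumptions chosen for the example.

```c
#include <stdint.h>

struct rtt_state {
	uint32_t srtt_us8;   /* smoothed RTT in 1/8 us units (0 = no sample) */
	uint32_t rttvar_us4; /* RTT variance in 1/4 us units */
};

/* Fold one RTT sample (in microseconds) into the estimator. */
static void rtt_add_sample(struct rtt_state *st, uint32_t sample_us)
{
	int32_t err;
	uint32_t abs_err;

	if (st->srtt_us8 == 0) {
		/* First sample: SRTT = R, RTTVAR = R / 2. */
		st->srtt_us8 = sample_us * 8;
		st->rttvar_us4 = sample_us * 2;
		return;
	}

	err = (int32_t)sample_us - (int32_t)(st->srtt_us8 / 8);
	abs_err = err < 0 ? (uint32_t)-err : (uint32_t)err;

	/* RTTVAR = 3/4 * RTTVAR + 1/4 * |SRTT - R| (scaled by 4). */
	st->rttvar_us4 = st->rttvar_us4 - st->rttvar_us4 / 4 + abs_err;
	/* SRTT = 7/8 * SRTT + 1/8 * R (scaled by 8). */
	st->srtt_us8 = (uint32_t)((int32_t)st->srtt_us8 + err);
}

/* RTO = SRTT + 4 * RTTVAR, in microseconds. */
static uint32_t rtt_rto_us(const struct rtt_state *st)
{
	return st->srtt_us8 / 8 + st->rttvar_us4;
}
```

Keeping the state pre-scaled lets the update use only additions and shifts, which is the usual reason for storing SRTT in eighths of a unit.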

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells

David Howells
2020-05-11 23:42:28 +0800

15 Apr, 2020

1 commit

0e631eee1 rxrpc: Fix DATA Tx to disable nofrag for UDP on AF_INET6 socket

Fix the DATA packet transmission to disable nofrag for UDPv4 on an AF_INET6
socket as well as UDPv6 when trying to transmit fragmentably.

Without this, packets filled to the normal size used by the kernel AFS
client of 1412 bytes will be rejected by udp_sendmsg() with EMSGSIZE
immediately. The ->sk_error_report() notification hook is called, but
rxrpc doesn't generate a trace for it.

This is a temporary fix; a more permanent solution needs to involve
changing the size of the packets being filled in accordance with the MTU,
which isn't currently done in AF_RXRPC. The reason for not doing so was
that, barring the last packet in an rx jumbo packet, jumbos can only be
assembled out of 1412-byte packets - and the plan was to construct jumbos
on the fly at transmission time.

Also, there's no point turning on IPV6_MTU_DISCOVER, since IPv6 has to
engage in this anyway since fragmentation is only done by the sender. We
can then condense the switch-statement in rxrpc_send_data_packet().

Fixes: 75b54cb57ca3 ("rxrpc: Add IPv6 support")
Signed-off-by: David Howells
Signed-off-by: David S. Miller

David Howells
2020-04-15 07:26:47 +0800

03 Feb, 2020

1 commit

5273a191d rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect

When a call is disconnected, the connection pointer from the call is
cleared to make sure it isn't used again and to prevent further attempted
transmission for the call. Unfortunately, there might be a daemon trying
to use it at the same time to transmit a packet.

Fix this by keeping call->conn set, but setting a flag on the call to
indicate disconnection instead.

Also remove the bits in the transmission functions where the conn pointer is
checked and a ref taken under spinlock as this is now redundant.
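
The pattern can be sketched in userspace C with C11 atomics. This is an illustration of the idea only; the struct fields and function names are hypothetical and the kernel uses its own flag and locking primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct conn {
	int id;
};

struct call {
	struct conn *conn;        /* stays set for the life of the call */
	atomic_bool disconnected; /* set on disconnect instead of clearing ->conn */
};

static void call_init(struct call *call, struct conn *conn)
{
	call->conn = conn;
	atomic_init(&call->disconnected, false);
}

/* Disconnection no longer clears call->conn, so a concurrent transmitter
 * never sees a NULL pointer. */
static void call_disconnect(struct call *call)
{
	atomic_store(&call->disconnected, true);
}

/* Transmitters test the flag rather than the pointer. */
static bool call_may_transmit(struct call *call)
{
	return !atomic_load(&call->disconnected);
}
```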

Fixes: 8d94aa381dab ("rxrpc: Calls shouldn't hold socket refs")
Signed-off-by: David Howells

David Howells
2020-02-03 18:25:30 +0800

27 Aug, 2019

1 commit

987db9f7c rxrpc: Use the tx-phase skb flag to simplify tracing

Use the previously-added transmit-phase skbuff private flag to simplify the
socket buffer tracing a bit. Which phase the skbuff comes from can now be
divined from the skb rather than having to be guessed from the call state.

We can also reduce the number of rxrpc_skb_trace values by eliminating the
difference between Tx and Rx in the symbols.

Signed-off-by: David Howells

David Howells
2019-08-27 17:04:18 +0800

09 Aug, 2019

1 commit

e8c3af6bb rxrpc: Don't bother generating maxSkew in the ACK packet

Don't bother generating maxSkew in the ACK packet as it has been obsolete
since AFS 3.1.

Signed-off-by: David Howells
Reviewed-by: Jeffrey Altman

David Howells
2019-08-09 22:24:00 +0800

03 Jul, 2019

1 commit

3427beb63 rxrpc: Fix uninitialized error code in rxrpc_send_data_packet()

With gcc 4.1:

net/rxrpc/output.c: In function ‘rxrpc_send_data_packet’:
net/rxrpc/output.c:338: warning: ‘ret’ may be used uninitialized in this function

Indeed, if the first jump to the send_fragmentable label is made, and
the address family is not handled in the switch() statement, ret will be
used uninitialized.

Fix this by BUG()'ing as is done in other places in rxrpc where internal
support for future address families will need adding. It should not be
possible to reach this normally as the address families are checked
up-front.
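
A reduced userspace sketch of the shape of the fix is shown below. The function name and return values are made up for the example, and abort() stands in for the kernel's BUG(); the point is only that every path through the switch either assigns ret or never returns.

```c
#include <stdlib.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the fragmentable-send path. */
static int send_fragmentable(int family)
{
	int ret;

	switch (family) {
	case AF_INET:
		ret = 4; /* placeholder for the IPv4 transmit result */
		break;
	case AF_INET6:
		ret = 6; /* placeholder for the IPv6 transmit result */
		break;
	default:
		/* Unsupported families are rejected up-front, so reaching
		 * this point is an internal error. */
		abort();
	}

	return ret;
}
```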

Fixes: 5a924b8951f835b5 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
Reported-by: Geert Uytterhoeven
Signed-off-by: David Howells
Signed-off-by: David S. Miller

David Howells
2019-07-03 03:09:09 +0800

31 May, 2019

1 commit

2874c5fd2 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-31 02:26:32 +0800

24 Mar, 2019

1 commit

526949e87 rxrpc: avoid clang -Wuninitialized warning

clang produces a false-positive warning as it fails to notice
that "lost = true" implies that "ret" is initialized:

net/rxrpc/output.c:402:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (lost)
            ^~~~
net/rxrpc/output.c:437:6: note: uninitialized use occurs here
        if (ret >= 0) {
            ^~~
net/rxrpc/output.c:402:2: note: remove the 'if' if its condition is always false
        if (lost)
        ^~~~~~~~~
net/rxrpc/output.c:339:9: note: initialize the variable 'ret' to silence this warning
        int ret, opt;
               ^
                = 0

Rearrange the code to make that more obvious and avoid the warning.

Signed-off-by: Arnd Bergmann
Reviewed-by: Nathan Chancellor
Signed-off-by: David S. Miller

Arnd Bergmann
2019-03-24 09:48:30 +0800

03 Nov, 2018

1 commit

c7e86acfc rxrpc: Fix lockup due to no error backoff after ack transmit error

If the network becomes (partially) unavailable, say by disabling IPv6, the
background ACK transmission routine can get itself into a tizzy by
proposing immediate ACK retransmission. Since we're in the call event
processor, that happens immediately without returning to the workqueue
manager.

The condition should clear after a while when either the network comes back
or the call times out.

Fix this by:

(1) When re-proposing an ACK on failed Tx, don't schedule it immediately.
This will allow a certain amount of time to elapse before we try
again.

(2) Enforce a return to the workqueue manager after a certain number of
iterations of the call processing loop.

(3) Add a backoff delay that increases the delay on deferred ACKs by a
jiffy per failed transmission to a limit of HZ. The backoff delay is
cleared on a successful return from kernel_sendmsg().

(4) Cancel calls immediately if the opening sendmsg fails. The layer
above can arrange retransmission or rotate to another server.
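
The backoff in (3) can be sketched as follows. This is a userspace illustration with assumed names; HZ is fixed at 100 here purely for the example, whereas the kernel's tick rate is a build-time setting.

```c
#define HZ 100 /* assumed tick rate for this example */

struct ack_backoff {
	unsigned long tx_backoff; /* extra delay, in jiffies */
};

/* Update the backoff from the result of a transmission attempt: grow by
 * one jiffy per failure up to HZ, and clear it on success. */
static void ack_tx_result(struct ack_backoff *b, int ret)
{
	if (ret < 0) {
		if (b->tx_backoff < HZ)
			b->tx_backoff++;
	} else {
		b->tx_backoff = 0;
	}
}

/* Delay to apply to a deferred ACK, given its normal delay. */
static unsigned long ack_defer_delay(const struct ack_backoff *b,
				     unsigned long base_delay)
{
	return base_delay + b->tx_backoff;
}
```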

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells
Signed-off-by: David S. Miller

David Howells
2018-11-03 14:59:26 +0800

20 Oct, 2018

1 commit

2e2d6f034 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump) in 'net-next'. Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller

David S. Miller
2018-10-20 02:03:06 +0800

16 Oct, 2018

1 commit

d6672a5a9 rxrpc: use correct kvec num when sending BUSY response packet

Fixes gcc '-Wunused-but-set-variable' warning:

net/rxrpc/output.c: In function 'rxrpc_reject_packets':
net/rxrpc/output.c:527:11: warning:
variable 'ioc' set but not used [-Wunused-but-set-variable]

'ioc' is the correct kvec num when sending a BUSY (or an ABORT) response
packet.

Fixes: ece64fec164f ("rxrpc: Emit BUSY packets when supposed to rather than ABORTs")
Signed-off-by: YueHaibing
Signed-off-by: David Howells
Signed-off-by: David S. Miller

YueHaibing
2018-10-16 13:08:17 +0800

04 Oct, 2018

2 commits

5a790b737 rxrpc: Drop the local endpoint arg from rxrpc_extract_addr_from_skb()

rxrpc_extract_addr_from_skb() doesn't use the argument that points to the
local endpoint, so remove the argument.

Signed-off-by: David Howells

David Howells
2018-10-04 16:32:28 +0800

b3cfb6f56 rxrpc: Emit the data Tx trace line before transmitting

Print the data Tx trace line before transmitting so that it appears before
the trace lines indicating success or failure of the transmission. This
makes the trace log less confusing.

Signed-off-by: David Howells

David Howells
2018-10-04 16:32:27 +0800

28 Sep, 2018

2 commits

ece64fec1 rxrpc: Emit BUSY packets when supposed to rather than ABORTs

In the input path, a received sk_buff can be marked for rejection by
setting RXRPC_SKB_MARK_* in skb->mark and, if needed, some auxiliary data
(such as an abort code) in skb->priority. The rejection is handled by
queueing the sk_buff up for dealing with in process context. The output
code reads the mark and priority and, theoretically, generates an
appropriate response packet.

However, if RXRPC_SKB_MARK_BUSY is set, this isn't noticed and an ABORT
message with a random abort code is generated (since skb->priority wasn't
set to anything).

Fix this by outputting the appropriate sort of packet.

Also, whilst we're at it, most of the marks are no longer used, so remove
them and rename the remaining two to something more obvious.

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells

David Howells
2018-09-28 17:32:19 +0800

b604dd988 rxrpc: Fix RTT gathering

Fix RTT information gathering in AF_RXRPC by the following means:

(1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS.

(2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready()
collects it, set it at that point.

(3) Allow ACKs to be requested on the last packet of a client call, but
not a service call. We need to be careful lest we undo:

bf7d620abf22c321208a4da4f435e7af52551a21
Author: David Howells
Date: Thu Oct 6 08:11:51 2016 +0100
rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase

but that only really applies to service calls that we're handling,
since the client side gets to send the final ACK (or not).

(4) When about to transmit an ACK or DATA packet, record the Tx timestamp
before only; don't update the timestamp afterwards.

(5) Switch the ordering between recording the serial and recording the
timestamp to always set the serial number first. The serial number
shouldn't be seen referenced by an ACK packet until we've transmitted
the packet bearing it - so in the Rx path, we don't need the timestamp
until we've checked the serial number.

Fixes: cf1a6474f807 ("rxrpc: Add per-peer RTT tracker")
Signed-off-by: David Howells

David Howells
2018-09-28 17:32:03 +0800

10 Aug, 2018

1 commit

a736e0746 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net

Overlapping changes in RXRPC, changing to ktime_get_seconds() whilst
adding some tracepoints.

Signed-off-by: David S. Miller

David S. Miller
2018-08-10 02:52:36 +0800

09 Aug, 2018

1 commit

330bdcfad rxrpc: Fix the keepalive generator [ver #2]

AF_RXRPC has a keepalive message generator that generates a message for a
peer ~20s after the last transmission to that peer to keep firewall ports
open. The implementation is incorrect in the following ways:

(1) It mixes up ktime_t and time64_t types.

(2) It uses ktime_get_real(), the output of which may jump forward or
backward due to adjustments to the time of day.

(3) If the current time jumps forward too much or jumps backwards, the
generator function will crank the base of the time ring round one slot
at a time (i.e. a 1s period) until it catches up, spewing out VERSION
packets as it goes.

Fix the problem by:

(1) Only using time64_t. There's no need for sub-second resolution.

(2) Using ktime_get_seconds() rather than ktime_get_real() so that time
isn't perceived to go backwards.

(3) Simplifying rxrpc_peer_keepalive_worker() by splitting it into two
parts:

(a) The "worker" function that manages the buckets and the timer.

(b) The "dispatch" function that takes the pending peers and
potentially transmits a keepalive packet before putting them back
in the ring into the slot appropriate to the revised last-Tx time.

(4) Taking everything that's pending out of the ring and splicing it into
a temporary collector list for processing.

In the case that there's been a significant jump forward, the ring
gets entirely emptied and then the time base can be warped forward
before the peers are processed.

The warping can't happen if the ring isn't empty because the slot a
peer is in is keepalive-time dependent, relative to the base time.

(5) Limiting the number of iterations of the bucket array when scanning it.

(6) Setting the timer to skip any empty slots as there's no point waking up if
there's nothing to do yet.

This can be triggered by an incoming call from a server after a reboot with
AF_RXRPC and AFS built into the kernel causing a peer record to be set up
before userspace is started. The system clock is then adjusted by
userspace, thereby potentially causing the keepalive generator to have a
meltdown - which leads to a message like:

watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:23]
...
Workqueue: krxrpcd rxrpc_peer_keepalive_worker
EIP: lock_acquire+0x69/0x80
...
Call Trace:
? rxrpc_peer_keepalive_worker+0x5e/0x350
? _raw_spin_lock_bh+0x29/0x60
? rxrpc_peer_keepalive_worker+0x5e/0x350
? rxrpc_peer_keepalive_worker+0x5e/0x350
? __lock_acquire+0x3d3/0x870
? process_one_work+0x110/0x340
? process_one_work+0x166/0x340
? process_one_work+0x110/0x340
? worker_thread+0x39/0x3c0
? kthread+0xdb/0x110
? cancel_delayed_work+0x90/0x90
? kthread_stop+0x70/0x70
? ret_from_fork+0x19/0x24

Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive")
Reported-by: kernel test robot
Signed-off-by: David Howells
Signed-off-by: David S. Miller

David Howells
2018-08-09 10:10:26 +0800

01 Aug, 2018

1 commit

4764c0da6 rxrpc: Trace packet transmission

Trace successful packet transmission (kernel_sendmsg() succeeded, that is)
in AF_RXRPC. We can share the enum that defines the transmission points
with the trace_rxrpc_tx_fail() tracepoint, so rename its constants to be
applicable to both.

Also, save the internal call->debug_id in the rxrpc_channel struct so that
it can be used in retransmission trace lines.

Signed-off-by: David Howells

David Howells
2018-08-01 20:28:23 +0800

11 May, 2018

2 commits

6b47fe1d1 rxrpc: Trace UDP transmission failure

Add a tracepoint to log transmission failure from the UDP transport socket
being used by AF_RXRPC.

Signed-off-by: David Howells

David Howells
2018-05-11 06:26:01 +0800

c54e43d75 rxrpc: Fix missing start of call timeout

The expect_rx_by call timeout is supposed to be set when a call is started
to indicate that we need to receive a packet by that point. This is
currently put back every time we receive a packet, but it isn't started
when we first send a packet. Without this, the call may wait forever if
the server doesn't deign to reply.

Fix this by setting the timeout upon a successful UDP sendmsg call for the
first DATA packet. The timeout is initiated only for initial transmission
and not for subsequent retries as we don't want the retry mechanism to
extend the timeout indefinitely.
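
A minimal sketch of that rule, using assumed field and function names rather than the real rxrpc_call members: the deadline is armed on the first successful DATA transmission, pushed back on every receive, and left alone by retransmissions.

```c
#include <stdbool.h>

struct call_timeouts {
	unsigned long rx_timeout;   /* allowed gap before we expect a packet */
	unsigned long expect_rx_by; /* deadline for the next packet, 0 = unset */
};

/* Called after a successful sendmsg of a DATA packet. */
static void data_packet_sent(struct call_timeouts *t, unsigned long now,
			     bool first_packet, bool retransmission)
{
	/* Only the initial transmission starts the timeout; retries must
	 * not keep extending it. */
	if (first_packet && !retransmission)
		t->expect_rx_by = now + t->rx_timeout;
}

/* Called whenever a packet is received: push the deadline back. */
static void packet_received(struct call_timeouts *t, unsigned long now)
{
	t->expect_rx_by = now + t->rx_timeout;
}
```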

Fixes: a158bdd3247b ("rxrpc: Fix call timeouts")
Reported-by: Marc Dionne
Signed-off-by: David Howells

David Howells
2018-05-11 06:26:00 +0800

31 Mar, 2018

1 commit

ace45bec6 rxrpc: Fix firewall route keepalive

Fix the firewall route keepalive part of AF_RXRPC, which currently
functions incorrectly by replying to VERSION REPLY packets from the server
with VERSION REQUEST packets.

Instead, send VERSION REPLY packets to the peers of service connections to
act as keep-alives 20s after the latest packet was transmitted to that
peer.

Also, just discard VERSION REPLY packets rather than replying to them.

Signed-off-by: David Howells

David Howells
2018-03-31 04:04:43 +0800

23 Feb, 2018

1 commit

93c62c45e rxrpc: Fix send in rxrpc_send_data_packet()

All the kernel_sendmsg() calls in rxrpc_send_data_packet() need to send
both parts of the iov[] buffer, but one of them does not. Fix it so that
it does.

Without this, short IPv6 rxrpc DATA packets may be seen that have the rxrpc
header included, but no payload.
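
The failure mode is easy to reproduce with the userspace sendmsg(2) call, which is used here as an analogue of kernel_sendmsg(). The helper name is hypothetical; the point is that the element count passed alongside the iovec array must cover both the header and the payload.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send a header and a payload as one datagram.
 * Returns the number of bytes sent, or -1 on error. */
static ssize_t send_hdr_and_payload(int fd, const void *hdr, size_t hdr_len,
				    const void *payload, size_t payload_len)
{
	struct iovec iov[2];
	struct msghdr msg;

	iov[0].iov_base = (void *)hdr;
	iov[0].iov_len  = hdr_len;
	iov[1].iov_base = (void *)payload;
	iov[1].iov_len  = payload_len;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov    = iov;
	msg.msg_iovlen = 2; /* passing 1 here would send the header alone */

	return sendmsg(fd, &msg, 0);
}
```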

Fixes: 5a924b8951f8 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
Reported-by: Marc Dionne
Signed-off-by: David Howells
Signed-off-by: David S. Miller

David Howells
2018-02-23 04:37:47 +0800

24 Nov, 2017

2 commits

415f44e43 rxrpc: Add keepalive for a call

We need to transmit a packet every so often to act as a keepalive for the
peer (which has a timeout from the last time it received a packet) and also
to prevent any intervening firewalls from closing the route.

Do this by resetting a timer every time we transmit a packet. If the timer
ever expires, we transmit a PING ACK packet and thereby also elicit a PING
RESPONSE ACK from the other side - which prevents our last-rx timeout from
expiring.

The timer is set to 1/6 of the last-rx timeout so that we can detect the
other side going away if it misses 6 replies in a row.

This is particularly necessary for servers where the processing of the
service function may take a significant amount of time.
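
The timer handling can be sketched like this. The names are assumptions for the example; only the 1/6 ratio and the reset-on-transmit behaviour come from the text above.

```c
#include <stdbool.h>

struct call_timers {
	unsigned long last_rx_timeout; /* how long the peer waits for us */
	unsigned long keepalive_at;    /* when to send a PING ACK if idle */
};

/* Every transmission pushes the keepalive out by 1/6 of the timeout. */
static void note_packet_transmitted(struct call_timers *t, unsigned long now)
{
	t->keepalive_at = now + t->last_rx_timeout / 6;
}

/* True once the keepalive timer has expired and a PING ACK is due.
 * The signed difference keeps the comparison valid across wraparound. */
static bool keepalive_due(const struct call_timers *t, unsigned long now)
{
	return (long)(now - t->keepalive_at) >= 0;
}
```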

Signed-off-by: David Howells

David Howells
2017-11-24 18:18:42 +0800

bd1fdf8cf rxrpc: Add a timeout for detecting lost ACKs/lost DATA

Add an extra timeout that is set/updated when we send a DATA packet that
has the request-ack flag set. This allows us to detect if we don't get an
ACK in response to the latest flagged packet.

The ACK packet is adjudged to have been lost if it doesn't turn up within
2*RTT of the transmission.

If the timeout occurs, we schedule the sending of a PING ACK to find out
the state of the other side. If a new DATA packet is ready to go sooner,
we cancel the sending of the ping and set the request-ack flag on that
instead.

If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what
we had at the time of the ping transmission, we adjudge all the DATA
packets sent between the response tx_top and the ping-time tx_top to have
been lost and retransmit immediately.

Rather than sending a PING ACK, we could just pick a DATA packet and
speculatively retransmit that with request-ack set. It should result in
either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu of
the PING-RESPONSE ACK mentioned above.

Signed-off-by: David Howells

David Howells
2017-11-24 18:18:42 +0800

02 Nov, 2017

2 commits

dcbefc30f rxrpc: Fix call expiry handling

Fix call expiry handling in the following ways:

(1) If all the request data from a client call is acked, don't send a
follow up IDLE ACK with firstPacket == 1 and previousPacket == 0 as
this appears to fool some servers into thinking everything has been
accepted.

(2) Never send an abort back to the server once it has ACK'd all the
request packets; rather just try to reuse the channel for the next
call. The first request DATA packet of the next call on the same
channel will implicitly ACK the entire reply of the dead call - even
if we haven't transmitted it yet.

(3) Don't send RX_CALL_TIMEOUT in an ABORT packet; librx uses abort codes
to pass local errors to the caller in addition to remote errors, and
this is meant to be local only.

The following also need to be addressed in future patches:

(4) Service calls should send PING ACKs as 'keep alives' if the server is
still processing the call.

(5) VERSION REPLY packets should be sent to the peers of service
connections to act as keep-alives. This is used to keep firewall
routes in place. The AFS CM should enable this.

Signed-off-by: David Howells

David Howells
2017-11-02 23:20:43 +0800

1457cc4cf rxrpc: Fix a null ptr deref in rxrpc_fill_out_ack()

rxrpc_fill_out_ack() needs to be passed the connection pointer from its
caller rather than using call->conn as the call may be disconnected in
parallel with it, clearing call->conn, leading to:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: rxrpc_send_ack_packet+0x231/0x6a4

Signed-off-by: David Howells

David Howells
2017-11-02 23:20:43 +0800

29 Aug, 2017

1 commit

7b674e390 rxrpc: Fix IPv6 support

Fix IPv6 support in AF_RXRPC in the following ways:

(1) When extracting the address from a received IPv4 packet, if the local
transport socket is open for IPv6 then fill out the sockaddr_rxrpc
struct for an IPv4-mapped-to-IPv6 AF_INET6 transport address instead
of an AF_INET one.

(2) When sending CHALLENGE or RESPONSE packets, the transport length needs
to be set from the sockaddr_rxrpc::transport_len field rather than
sizeof() on the IPv4 transport address.

(3) When processing an IPv4 ICMP packet received by an IPv6 socket, set up
the address correctly before searching for the affected peer.

Signed-off-by: David Howells

David Howells
2017-08-29 17:55:20 +0800

05 Jun, 2017

1 commit

4e255721d rxrpc: Add service upgrade support for client connections

Make it possible for a client to use AuriStor's service upgrade facility.

The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
the first sendmsg() of a call. This takes no parameters.

When recvmsg() starts returning data from the call, the service ID field in
the returned msg_name will reflect the result of the upgrade attempt. If
the upgrade was ignored, srx_service will match what was set in the
sendmsg(); if the upgrade happened the srx_service will be altered to
indicate the service the server upgraded to.

Note that:

(1) The choice of upgrade service is up to the server.

(2) Further client calls to the same server that would share a connection
are blocked if an upgrade probe is in progress.

(3) This should only be used to probe the service. Clients should then
use the returned service ID in all subsequent communications with that
server (and not set the upgrade). Note that the kernel will not
retain this information should the connection expire from its cache.

(4) If a server that supports upgrading is replaced by one that doesn't,
whilst a connection is live, and if the replacement is running, say,
OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
server will not respond to packets sent to the upgraded connection.

At this point, calls will time out and the server must be reprobed.

Signed-off-by: David Howells

David Howells
2017-06-05 21:30:49 +0800

06 Oct, 2016

3 commits

bf7d620ab rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase

Don't request an ACK on the last DATA packet of a call's Tx phase as for a
client there will be a reply packet or some sort of ACK to shift phase. If
the ACK is requested, OpenAFS sends a REQUESTED-ACK ACK with soft-ACKs in
it and doesn't follow up with a hard-ACK.

If we don't set the flag, OpenAFS will send a DELAY ACK that hard-ACKs the
reply data, thereby allowing the call to terminate cleanly.

Signed-off-by: David Howells

David Howells
2016-10-06 15:11:51 +0800

a5af7e1fc rxrpc: Fix loss of PING RESPONSE ACK production due to PING ACKs

Separate the output of PING ACKs from the output of other sorts of ACK so
that if we receive a PING ACK and schedule transmission of a PING RESPONSE
ACK, the response doesn't get cancelled by a PING ACK we happen to be
scheduling transmission of at the same time.

If a PING RESPONSE gets lost, the other side might just sit there waiting
for it and refuse to proceed otherwise.

Signed-off-by: David Howells

David Howells
2016-10-06 15:11:49 +0800

26cb02aa6 rxrpc: Fix warning by splitting rxrpc_send_call_packet()

Split rxrpc_send_call_packet() to separate ACK generation (which is more
complicated) from ABORT generation. This simplifies the code a bit and
fixes the following warning:

In file included from ../net/rxrpc/output.c:20:0:
net/rxrpc/output.c: In function 'rxrpc_send_call_packet':
net/rxrpc/ar-internal.h:1187:27: error: 'top' may be used uninitialized in this function [-Werror=maybe-uninitialized]
net/rxrpc/output.c:103:24: note: 'top' was declared here
net/rxrpc/output.c:225:25: error: 'hard_ack' may be used uninitialized in this function [-Werror=maybe-uninitialized]

Reported-by: Arnd Bergmann
Signed-off-by: David Howells

David Howells
2016-10-06 15:11:49 +0800

30 Sep, 2016

2 commits

b112a6708 rxrpc: Request more ACKs in slow-start mode

Set the request-ACK on more DATA packets whilst we're in slow start mode so
that we get sufficient ACKs back to supply information to configure the
window.

Signed-off-by: David Howells

David Howells
2016-09-30 05:57:47 +0800

a1767077b rxrpc: Make Tx loss-injection go through normal return and adjust tracing

In rxrpc_send_data_packet() make the loss-injection path return through the
same code as the transmission path so that the RTT determination is
initiated and any future timer shuffling will be done, despite the packet
having been binned.

Whilst we're at it:

(1) Add to the tx_data tracepoint an indication of whether or not we're
retransmitting a data packet.

(2) When we're deciding whether or not to request an ACK, rather than
checking if we're in fast-retransmit mode check instead if we're
retransmitting.

(3) Don't invoke the lose_skb tracepoint when losing a Tx packet as we're
not altering the sk_buff refcount nor are we just seeing it after
getting it off the Tx list.

(4) The rxrpc_skb_tx_lost note is then no longer used so remove it.

(5) rxrpc_lose_skb() no longer needs to deal with rxrpc_skb_tx_lost.

Signed-off-by: David Howells

David Howells
2016-09-30 05:37:15 +0800

25 Sep, 2016

2 commits

57494343c rxrpc: Implement slow-start

Implement RxRPC slow-start, which is similar to RFC 5681 for TCP. A
tracepoint is added to log the state of the congestion management algorithm
and the decisions it makes.

Notes:

(1) Since we send fixed-size DATA packets (apart from the final packet in
each phase), counters and calculations are in terms of packets rather
than bytes.

(2) The ACK packet carries the equivalent of TCP SACK.

(3) The FLIGHT_SIZE calculation in RFC 5681 doesn't seem particularly
suited to SACK of a small number of packets. It seems that, almost
inevitably, by the time three 'duplicate' ACKs have been seen, we have
narrowed the loss down to one or two missing packets, and the
FLIGHT_SIZE calculation ends up as 2.

(4) In rxrpc_resend(), if there was no data that apparently needed
retransmission, we transmit a PING ACK to ask the peer to tell us what
its Rx window state is.
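
For orientation, here is a generic sketch of RFC 5681 window management counted in packets rather than bytes, as note (1) describes. It is not the rxrpc implementation: the struct, the function names and the handling of a retransmission timeout are assumptions based on the RFC alone.

```c
#include <stdint.h>

struct cong {
	uint32_t cwnd;       /* congestion window, in packets */
	uint32_t ssthresh;   /* slow-start threshold, in packets */
	uint32_t cumul_acks; /* ACKed packets counted toward the next increase */
};

/* Account for newly acknowledged packets. */
static void cong_on_ack(struct cong *c, uint32_t newly_acked)
{
	if (c->cwnd < c->ssthresh) {
		/* Slow start: grow by one packet per packet ACKed. */
		c->cwnd += newly_acked;
		return;
	}

	/* Congestion avoidance: grow by one packet per window of ACKs. */
	c->cumul_acks += newly_acked;
	if (c->cumul_acks >= c->cwnd) {
		c->cumul_acks -= c->cwnd;
		c->cwnd++;
	}
}

/* React to a retransmission timeout with flight_size packets outstanding. */
static void cong_on_timeout(struct cong *c, uint32_t flight_size)
{
	uint32_t half = flight_size / 2;

	c->ssthresh = half > 2 ? half : 2; /* max(FlightSize / 2, 2 packets) */
	c->cwnd = 1;                       /* restart from one packet */
	c->cumul_acks = 0;
}
```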

Signed-off-by: David Howells

David Howells
2016-09-25 06:49:46 +0800

805b21b92 rxrpc: Send an ACK after every few DATA packets we receive

Send an ACK if we haven't sent one for the last two packets we've received.
This keeps the other end apprised of where we've got to - which is
important if they're doing slow-start.

We do this in recvmsg so that we can dispatch a packet directly without the
need to wake up the background thread.

This should possibly be made configurable in future.

Signed-off-by: David Howells

David Howells
2016-09-25 01:05:26 +0800

23 Sep, 2016

2 commits

9c7ad4344 rxrpc: Add tracepoint for ACK proposal

Add a tracepoint to log proposed ACKs, including whether the proposal is
used to update a pending ACK or is discarded in favour of an earlier,
higher priority ACK.

Whilst we're at it, get rid of the rxrpc_acks() function and access the
name array directly. We do, however, need to validate the ACK reason
number given to trace_rxrpc_rx_ack() to make sure we don't overrun the
array.
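
The bounds check amounts to something like the sketch below. The table contents, the fallback string and the function name are illustrative assumptions, not the kernel's actual name array.

```c
#include <stddef.h>

/* Illustrative name table indexed by ACK reason number. */
static const char *const ack_names[] = {
	"---", "REQ", "DUP", "OOS", "WIN", "MEM", "PNG", "PNR", "DLY", "IDL",
};

#define ACK_NAMES_COUNT (sizeof(ack_names) / sizeof(ack_names[0]))

/* Map a reason number from the wire to a name without overrunning the
 * array; out-of-range values get a placeholder. */
static const char *ack_name(unsigned int reason)
{
	if (reason >= ACK_NAMES_COUNT)
		return "-?-";
	return ack_names[reason];
}
```

Because the reason number arrives in a packet from the peer, it has to be treated as untrusted input before it is used as an index.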

Signed-off-by: David Howells

David Howells
2016-09-23 22:49:19 +0800

be832aecc rxrpc: Add data Tx tracepoint and adjust Tx ACK tracepoint

Add a tracepoint to log transmission of DATA packets (including loss
injection).

Adjust the ACK transmission tracepoint to include the packet serial number
and to line this up with the DATA transmission display.

Signed-off-by: David Howells

David Howells
2016-09-23 22:49:19 +0800