09 Sep, 2020

1 commit

  • Rewrite the rxrpc client connection manager so that it can support multiple
    connections for a given security key to a peer. The following changes are
    made:

    (1) For each open socket, the code currently maintains an rbtree with the
    connections placed into it, keyed by communications parameters. This
    is tricky to maintain as connections can be culled from the tree or
    replaced within it. Connections can require replacement for a number
    of reasons, e.g. their IDs span too great a range for the IDR data
    type to represent efficiently, the call ID numbers on that conn would
    overflow or the conn got aborted.

    This is changed so that there's now a connection bundle object placed
    in the tree, keyed on the same parameters. The bundle, however, does
    not need to be replaced.

    (2) An rxrpc_bundle object can now manage the available channels for a set
    of parallel connections. The lock that manages this is moved there
    from the rxrpc_connection struct (channel_lock).

    (3) There's a dummy bundle for all incoming connections to share so that
    they have a channel_lock too. It might be better to give each
    incoming connection its own bundle. This bundle is not needed to
    manage which channels incoming calls are made on because that's
    solely at the whim of the client.

    (4) The restrictions on how many client connections are around are
    removed. Instead, a previous patch limits the number of client calls
    that can be allocated. Ordinarily, client connections are reaped
    after 2 minutes on the idle queue, but when more than a certain number
    of connections are in existence, the reaper starts reaping them after
    2s of idleness instead to get the numbers back down.

    It could also be made such that new call allocations are forced to
    wait until the number of outstanding connections subsides.
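
    As a rough illustration of point (2), the sketch below shows one way a
    bundle could track free channels across several parallel connections
    with a bitmap. It is not the kernel's implementation: struct bundle,
    BUNDLE_MAX_CONNS and the helper names are hypothetical, and the locking
    (channel_lock in the text above) is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define CHANNELS_PER_CONN 4	/* Rx carries up to four calls per connection */
#define BUNDLE_MAX_CONNS  4	/* hypothetical cap chosen for this sketch */

struct bundle {
	unsigned int nr_conns;	/* parallel connections currently in the bundle */
	uint32_t avail_chans;	/* bit n set: channel n is free */
};

/* Add a connection and mark its channels free.  Returns the connection
 * index, or -1 if the bundle is full. */
static int bundle_add_conn(struct bundle *b)
{
	if (b->nr_conns >= BUNDLE_MAX_CONNS)
		return -1;
	b->avail_chans |= (uint32_t)0xf << (b->nr_conns * CHANNELS_PER_CONN);
	return (int)b->nr_conns++;
}

/* Claim the lowest free channel across all connections, or return -1. */
static int bundle_get_channel(struct bundle *b)
{
	unsigned int i, total = b->nr_conns * CHANNELS_PER_CONN;

	for (i = 0; i < total; i++) {
		if (b->avail_chans & ((uint32_t)1 << i)) {
			b->avail_chans &= ~((uint32_t)1 << i);
			return (int)i;
		}
	}
	return -1;
}

/* Release a channel previously returned by bundle_get_channel(). */
static void bundle_put_channel(struct bundle *b, int chan)
{
	b->avail_chans |= (uint32_t)1 << chan;
}
```

    In this sketch, channel n lives on connection n / CHANNELS_PER_CONN, so
    the bundle itself never has to be replaced when a connection is.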

    Signed-off-by: David Howells

    David Howells
     

21 Aug, 2020

1 commit

  • The Rx protocol has a mechanism to help generate RTT samples that works by
    a client transmitting a REQUESTED-type ACK when it receives a DATA packet
    that has the REQUEST_ACK flag set.

    The peer, however, may interpose other ACKs before transmitting the
    REQUESTED-ACK, as can be seen in the following trace excerpt:

    rxrpc_tx_data: c=00000044 DATA d0b5ece8:00000001 00000001 q=00000001 fl=07
    rxrpc_rx_ack: c=00000044 00000001 PNG r=00000000 f=00000002 p=00000000 n=0
    rxrpc_rx_ack: c=00000044 00000002 REQ r=00000001 f=00000002 p=00000001 n=0
    ...

    DATA packet 1 (q=xx) has REQUEST_ACK set (bit 1 of fl=xx). The incoming
    ping (labelled PNG) hard-acks the request DATA packet (f=xx exceeds the
    sequence number of the DATA packet), causing it to be discarded from the Tx
    ring. The ACK that was requested (labelled REQ, r=xx references the serial
    of the DATA packet) comes after the ping, but the sk_buff holding the
    timestamp has gone and the RTT sample is lost.

    This is particularly noticeable on RPC calls used to probe the service
    offered by the peer. A lot of peers end up with an unknown RTT because we
    only ever sent a single RPC. This confuses the server rotation algorithm.

    Fix this by caching the information about the outgoing packet in RTT
    calculations in the rxrpc_call struct rather than looking in the Tx ring.

    A four-deep buffer is maintained and both REQUEST_ACK-flagged DATA and
    PING-ACK transmissions are recorded in there. When the appropriate
    response ACK is received, the buffer is checked for a match and, if found,
    an RTT sample is recorded.

    If a received ACK refers to a packet with a later serial number than an
    entry in the cache, that entry is presumed lost and the entry is made
    available to record a new transmission.

    ACK types other than REQUESTED-type and PING-type cause any matching
    sample to be cancelled as they don't necessarily represent a useful
    measurement.

    If there's no space in the buffer on ping/data transmission, the sample
    base is discarded.
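
    The buffer behaviour described above can be sketched in plain C as
    follows. This is a simplified illustration rather than the code in the
    patch: the struct and function names are hypothetical, timestamps are
    plain microsecond counters, and "response" stands for the REQUESTED and
    PING-RESPONSE ACK types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTT_SLOTS 4	/* "four-deep", per the description above */

struct rtt_slot {
	bool     in_use;
	uint32_t serial;	/* serial of the DATA or PING ACK that was sent */
	uint64_t sent_at_us;	/* transmission time in microseconds */
};

struct rtt_cache {
	struct rtt_slot slot[RTT_SLOTS];
};

/* Record an outgoing probe.  Returns false if every slot is busy, in which
 * case the sample base is simply dropped. */
static bool rtt_record_tx(struct rtt_cache *c, uint32_t serial, uint64_t now_us)
{
	for (int i = 0; i < RTT_SLOTS; i++) {
		if (!c->slot[i].in_use) {
			c->slot[i] = (struct rtt_slot){ true, serial, now_us };
			return true;
		}
	}
	return false;
}

/* Handle the serial referenced by an incoming ACK.  A matching entry yields
 * an RTT sample only for response-type ACKs; other ACK types cancel it.
 * Entries with earlier serials are presumed lost and freed. */
static bool rtt_match_ack(struct rtt_cache *c, uint32_t acked_serial,
			  bool is_response, uint64_t now_us, uint64_t *rtt_us)
{
	bool got = false;

	for (int i = 0; i < RTT_SLOTS; i++) {
		struct rtt_slot *s = &c->slot[i];

		if (!s->in_use)
			continue;
		if (s->serial == acked_serial) {
			if (is_response) {
				*rtt_us = now_us - s->sent_at_us;
				got = true;
			}
			s->in_use = false;	/* consumed or cancelled */
		} else if ((int32_t)(acked_serial - s->serial) > 0) {
			s->in_use = false;	/* earlier probe: presumed lost */
		}
	}
	return got;
}
```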

    Fixes: 50235c4b5a2f ("rxrpc: Obtain RTT data by requesting ACKs on DATA packets")
    Signed-off-by: David Howells

    David Howells
     

29 May, 2020

1 commit


11 May, 2020

1 commit

  • rxrpc currently uses a fixed 4s retransmission timeout until the RTT is
    sufficiently sampled. This can cause problems with some fileservers with
    calls to the cache manager in the afs filesystem being dropped from the
    fileserver because a packet goes missing and the retransmission timeout is
    greater than the call expiry timeout.

    Fix this by:

    (1) Copying the RTT/RTO calculation code from Linux's TCP implementation
    and altering it to fit rxrpc.

    (2) Altering the various users of the RTT to make use of the new SRTT
    value.

    (3) Replacing the use of rxrpc_resend_timeout with the calculated RTO
    value (which is needed in jiffies), along with a backoff.

    Notes:

    (1) rxrpc provides RTT samples by matching the serial numbers on outgoing
    DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets
    against the reference serial number in incoming REQUESTED ACK and
    PING-RESPONSE ACK packets.

    (2) Each packet that is transmitted on an rxrpc connection gets a new
    per-connection serial number, even for retransmissions, so an ACK can
    be cross-referenced to a specific trigger packet. This allows RTT
    information to be drawn from retransmitted DATA packets also.

    (3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than
    on an rxrpc_call because many RPC calls won't live long enough to
    generate more than one sample.

    (4) The calculated SRTT value is in units of 8ths of a microsecond rather
    than nanoseconds.

    The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers.
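
    For readers unfamiliar with the TCP-style estimator, here is a minimal
    sketch of the classic smoothed-RTT calculation with SRTT held in 8ths
    of a microsecond, as in note (4). It follows the textbook update rules
    (gain of 1/8 on SRTT, 1/4 on the variance, RTO = SRTT + 4 * variance)
    and is not the code copied from TCP, which adds clamps and other
    refinements. The struct, the function names and the doubling backoff
    are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct rtt_state {
	uint32_t srtt_us8;	/* smoothed RTT in 1/8 microsecond units */
	uint32_t rttvar_us4;	/* RTT variance in 1/4 microsecond units */
};

/* Fold one RTT sample (in microseconds) into the state.  A zero srtt_us8
 * is treated as "no samples yet". */
static void rtt_add_sample(struct rtt_state *s, uint32_t sample_us)
{
	int32_t err;

	if (s->srtt_us8 == 0) {
		s->srtt_us8 = sample_us << 3;	/* SRTT = R */
		s->rttvar_us4 = sample_us << 1;	/* RTTVAR = R/2, scaled by 4 */
		return;
	}
	err = (int32_t)sample_us - (int32_t)(s->srtt_us8 >> 3);
	s->srtt_us8 += (uint32_t)err;		/* SRTT += err / 8 */
	if (err < 0)
		err = -err;
	/* RTTVAR += (|err| - RTTVAR) / 4 */
	s->rttvar_us4 += (uint32_t)(err - (int32_t)(s->rttvar_us4 >> 2));
}

/* RTO in microseconds: SRTT + 4 * RTTVAR. */
static uint32_t rtt_rto_us(const struct rtt_state *s)
{
	return (s->srtt_us8 >> 3) + s->rttvar_us4;
}

/* Double the RTO once per retry, saturating at max_us (an assumed policy). */
static uint32_t rto_backoff(uint32_t rto_us, unsigned int retries,
			    uint32_t max_us)
{
	while (retries-- > 0) {
		if (rto_us > max_us / 2)
			return max_us;
		rto_us *= 2;
	}
	return rto_us;
}
```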

    Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
    Signed-off-by: David Howells

    David Howells
     

15 Apr, 2020

1 commit

  • Fix the DATA packet transmission to disable nofrag for UDPv4 on an AF_INET6
    socket as well as UDPv6 when trying to transmit fragmentably.

    Without this, packets filled to the normal size used by the kernel AFS
    client of 1412 bytes will be rejected by udp_sendmsg() with EMSGSIZE
    immediately. The ->sk_error_report() notification hook is called, but
    rxrpc doesn't generate a trace for it.

    This is a temporary fix; a more permanent solution needs to involve
    changing the size of the packets being filled in accordance with the MTU,
    which isn't currently done in AF_RXRPC. The reason for not doing so was
    that, barring the last packet in an rx jumbo packet, jumbos can only be
    assembled out of 1412-byte packets - and the plan was to construct jumbos
    on the fly at transmission time.

    Also, there's no point turning on IPV6_MTU_DISCOVER: IPv6 has to
    engage in this anyway, as fragmentation is only done by the sender. We
    can then condense the switch-statement in rxrpc_send_data_packet().

    Fixes: 75b54cb57ca3 ("rxrpc: Add IPv6 support")
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

03 Feb, 2020

1 commit

  • When a call is disconnected, the connection pointer from the call is
    cleared to make sure it isn't used again and to prevent further attempted
    transmission for the call. Unfortunately, there might be a daemon trying
    to use it at the same time to transmit a packet.

    Fix this by keeping call->conn set, but setting a flag on the call to
    indicate disconnection instead.

    Also remove the bits in the transmission functions where the conn pointer is
    checked and a ref taken under spinlock as this is now redundant.

    Fixes: 8d94aa381dab ("rxrpc: Calls shouldn't hold socket refs")
    Signed-off-by: David Howells

    David Howells
     

27 Aug, 2019

1 commit

  • Use the previously-added transmit-phase skbuff private flag to simplify the
    socket buffer tracing a bit. Which phase the skbuff comes from can now be
    divined from the skb rather than having to be guessed from the call state.

    We can also reduce the number of rxrpc_skb_trace values by eliminating the
    difference between Tx and Rx in the symbols.

    Signed-off-by: David Howells

    David Howells
     

09 Aug, 2019

1 commit


03 Jul, 2019

1 commit

  • With gcc 4.1:

    net/rxrpc/output.c: In function ‘rxrpc_send_data_packet’:
    net/rxrpc/output.c:338: warning: ‘ret’ may be used uninitialized in this function

    Indeed, if the first jump to the send_fragmentable label is made, and
    the address family is not handled in the switch() statement, ret will be
    used uninitialized.

    Fix this by BUG()'ing as is done in other places in rxrpc where internal
    support for future address families will need adding. It should not be
    possible to reach this normally as the address families are checked
    up-front.

    Fixes: 5a924b8951f835b5 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
    Reported-by: Geert Uytterhoeven
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

24 Mar, 2019

1 commit

  • clang produces a false-positive warning as it fails to notice
    that "lost = true" implies that "ret" is initialized:

    net/rxrpc/output.c:402:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
    if (lost)
    ^~~~
    net/rxrpc/output.c:437:6: note: uninitialized use occurs here
    if (ret >= 0) {
    ^~~
    net/rxrpc/output.c:402:2: note: remove the 'if' if its condition is always false
    if (lost)
    ^~~~~~~~~
    net/rxrpc/output.c:339:9: note: initialize the variable 'ret' to silence this warning
    int ret, opt;
    ^
    = 0

    Rearrange the code to make that more obvious and avoid the warning.

    Signed-off-by: Arnd Bergmann
    Reviewed-by: Nathan Chancellor
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

03 Nov, 2018

1 commit

  • If the network becomes (partially) unavailable, say by disabling IPv6, the
    background ACK transmission routine can get itself into a tizzy by
    proposing immediate ACK retransmission. Since we're in the call event
    processor, that happens immediately without returning to the workqueue
    manager.

    The condition should clear after a while when either the network comes back
    or the call times out.

    Fix this by:

    (1) When re-proposing an ACK on failed Tx, don't schedule it immediately.
    This will allow a certain amount of time to elapse before we try
    again.

    (2) Enforce a return to the workqueue manager after a certain number of
    iterations of the call processing loop.

    (3) Add a backoff delay that increases the delay on deferred ACKs by a
    jiffy per failed transmission to a limit of HZ. The backoff delay is
    cleared on a successful return from kernel_sendmsg().

    (4) Cancel calls immediately if the opening sendmsg fails. The layer
    above can arrange retransmission or rotate to another server.
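
    Point (3) amounts to a small piece of bookkeeping, sketched below. The
    names and the SKETCH_HZ value are hypothetical; in the kernel the limit
    is HZ and the result being checked is the return from kernel_sendmsg().

```c
#include <assert.h>

#define SKETCH_HZ 100UL	/* ticks per second; an assumed value */

struct ack_state {
	unsigned long tx_backoff;	/* extra delay in ticks, 0..SKETCH_HZ */
};

/* Update the backoff from a send result (negative means failure): one more
 * tick per failed transmission up to the limit, cleared on success. */
static void ack_note_tx_result(struct ack_state *s, int ret)
{
	if (ret >= 0)
		s->tx_backoff = 0;
	else if (s->tx_backoff < SKETCH_HZ)
		s->tx_backoff++;
}

/* Delay, in ticks, before a deferred ACK is attempted again. */
static unsigned long ack_retry_delay(const struct ack_state *s,
				     unsigned long base_delay)
{
	return base_delay + s->tx_backoff;
}
```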

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

20 Oct, 2018

1 commit

  • net/sched/cls_api.c has overlapping changes to a call to
    nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
    to the 5th argument, and another (from 'net-next') added cb->extack
    instead of NULL to the 6th argument.

    net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
    code which moved (to mr_table_dump) in 'net-next'. Thanks to David
    Ahern for the heads up.

    Signed-off-by: David S. Miller

    David S. Miller
     

16 Oct, 2018

1 commit

  • Fixes gcc '-Wunused-but-set-variable' warning:

    net/rxrpc/output.c: In function 'rxrpc_reject_packets':
    net/rxrpc/output.c:527:11: warning:
    variable 'ioc' set but not used [-Wunused-but-set-variable]

    'ioc' is the correct kvec num when sending a BUSY (or an ABORT) response
    packet.

    Fixes: ece64fec164f ("rxrpc: Emit BUSY packets when supposed to rather than ABORTs")
    Signed-off-by: YueHaibing
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    YueHaibing
     

04 Oct, 2018

2 commits


28 Sep, 2018

2 commits

  • In the input path, a received sk_buff can be marked for rejection by
    setting RXRPC_SKB_MARK_* in skb->mark and, if needed, some auxiliary data
    (such as an abort code) in skb->priority. The rejection is handled by
    queueing the sk_buff up for dealing with in process context. The output
    code reads the mark and priority and, theoretically, generates an
    appropriate response packet.

    However, if RXRPC_SKB_MARK_BUSY is set, this isn't noticed and an ABORT
    message with a random abort code is generated (since skb->priority wasn't
    set to anything).

    Fix this by outputting the appropriate sort of packet.

    Also, whilst we're at it, most of the marks are no longer used, so remove
    them and rename the remaining two to something more obvious.
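
    The intended dispatch can be sketched as below. The mark names and the
    reject_plan struct are hypothetical stand-ins for the two remaining
    skb->mark values and the output code's local state; the packet type
    numbers are the Rx wire values for BUSY and ABORT. Only an ABORT carries
    a code, so it needs one more buffer than a BUSY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum reject_mark { REJECT_BUSY, REJECT_ABORT };	/* hypothetical mark names */

enum { PKT_TYPE_BUSY = 3, PKT_TYPE_ABORT = 4 };	/* Rx packet type values */

struct reject_plan {
	int      pkt_type;
	size_t   nr_bufs;	/* buffers to send: header, plus code for ABORT */
	uint32_t abort_code;	/* only meaningful for ABORT */
};

/* Decide what to transmit for a rejected packet.  'priority' plays the role
 * of skb->priority and is only consulted for ABORT.  Returns false for an
 * unrecognised mark, in which case nothing should be sent. */
static bool plan_reject(int mark, uint32_t priority, struct reject_plan *p)
{
	switch (mark) {
	case REJECT_BUSY:
		*p = (struct reject_plan){ PKT_TYPE_BUSY, 1, 0 };
		return true;
	case REJECT_ABORT:
		*p = (struct reject_plan){ PKT_TYPE_ABORT, 2, priority };
		return true;
	default:
		return false;
	}
}
```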

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Signed-off-by: David Howells

    David Howells
     
  • Fix RTT information gathering in AF_RXRPC by the following means:

    (1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS.

    (2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready()
    collects it, set it at that point.

    (3) Allow ACKs to be requested on the last packet of a client call, but
    not a service call. We need to be careful lest we undo:

    bf7d620abf22c321208a4da4f435e7af52551a21
    Author: David Howells
    Date: Thu Oct 6 08:11:51 2016 +0100
    rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase

    but that only really applies to service calls that we're handling,
    since the client side gets to send the final ACK (or not).

    (4) When about to transmit an ACK or DATA packet, record the Tx timestamp
    only before transmission; don't update the timestamp afterwards.

    (5) Switch the ordering between recording the serial and recording the
    timestamp to always set the serial number first. The serial number
    shouldn't be seen referenced by an ACK packet until we've transmitted
    the packet bearing it - so in the Rx path, we don't need the timestamp
    until we've checked the serial number.

    Fixes: cf1a6474f807 ("rxrpc: Add per-peer RTT tracker")
    Signed-off-by: David Howells

    David Howells
     

10 Aug, 2018

1 commit


09 Aug, 2018

1 commit

  • AF_RXRPC has a keepalive message generator that generates a message for a
    peer ~20s after the last transmission to that peer to keep firewall ports
    open. The implementation is incorrect in the following ways:

    (1) It mixes up ktime_t and time64_t types.

    (2) It uses ktime_get_real(), the output of which may jump forward or
    backward due to adjustments to the time of day.

    (3) If the current time jumps forward too much or jumps backwards, the
    generator function will crank the base of the time ring round one slot
    at a time (ie. a 1s period) until it catches up, spewing out VERSION
    packets as it goes.

    Fix the problem by:

    (1) Only using time64_t. There's no need for sub-second resolution.

    (2) Use ktime_get_seconds() rather than ktime_get_real() so that time
    isn't perceived to go backwards.

    (3) Simplifying rxrpc_peer_keepalive_worker() by splitting it into two
    parts:

    (a) The "worker" function that manages the buckets and the timer.

    (b) The "dispatch" function that takes the pending peers and
    potentially transmits a keepalive packet before putting them back
    in the ring into the slot appropriate to the revised last-Tx time.

    (4) Taking everything that's pending out of the ring and splicing it into
    a temporary collector list for processing.

    In the case that there's been a significant jump forward, the ring
    gets entirely emptied and then the time base can be warped forward
    before the peers are processed.

    The warping can't happen if the ring isn't empty because the slot a
    peer is in is keepalive-time dependent, relative to the base time.

    (5) Limit the number of iterations of the bucket array when scanning it.

    (6) Set the timer to skip any empty slots as there's no point waking up if
    there's nothing to do yet.

    This can be triggered by an incoming call from a server after a reboot with
    AF_RXRPC and AFS built into the kernel causing a peer record to be set up
    before userspace is started. The system clock is then adjusted by
    userspace, thereby potentially causing the keepalive generator to have a
    meltdown - which leads to a message like:

    watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:23]
    ...
    Workqueue: krxrpcd rxrpc_peer_keepalive_worker
    EIP: lock_acquire+0x69/0x80
    ...
    Call Trace:
    ? rxrpc_peer_keepalive_worker+0x5e/0x350
    ? _raw_spin_lock_bh+0x29/0x60
    ? rxrpc_peer_keepalive_worker+0x5e/0x350
    ? rxrpc_peer_keepalive_worker+0x5e/0x350
    ? __lock_acquire+0x3d3/0x870
    ? process_one_work+0x110/0x340
    ? process_one_work+0x166/0x340
    ? process_one_work+0x110/0x340
    ? worker_thread+0x39/0x3c0
    ? kthread+0xdb/0x110
    ? cancel_delayed_work+0x90/0x90
    ? kthread_stop+0x70/0x70
    ? ret_from_fork+0x19/0x24

    Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive")
    Reported-by: kernel test robot
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

01 Aug, 2018

1 commit

  • Trace successful packet transmission (kernel_sendmsg() succeeded, that is)
    in AF_RXRPC. We can share the enum that defines the transmission points
    with the trace_rxrpc_tx_fail() tracepoint, so rename its constants to be
    applicable to both.

    Also, save the internal call->debug_id in the rxrpc_channel struct so that
    it can be used in retransmission trace lines.

    Signed-off-by: David Howells

    David Howells
     

11 May, 2018

2 commits

  • Add a tracepoint to log transmission failure from the UDP transport socket
    being used by AF_RXRPC.

    Signed-off-by: David Howells

    David Howells
     
  • The expect_rx_by call timeout is supposed to be set when a call is started
    to indicate that we need to receive a packet by that point. This is
    currently put back every time we receive a packet, but it isn't started
    when we first send a packet. Without this, the call may wait forever if
    the server doesn't deign to reply.

    Fix this by setting the timeout upon a successful UDP sendmsg call for the
    first DATA packet. The timeout is initiated only for initial transmission
    and not for subsequent retries as we don't want the retry mechanism to
    extend the timeout indefinitely.
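
    A minimal sketch of that rule, with hypothetical names and an abstract
    time unit: the timer is armed only when the first DATA packet (sequence
    number 1) is sent successfully for the first time, and retransmissions
    leave it alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct call_timer {
	bool     armed;
	uint64_t expect_rx_by;	/* deadline for hearing from the peer */
};

/* Called after each attempt to send a DATA packet.  'ret' is the send
 * result (negative on failure) and 'retrans' marks a retransmission. */
static void call_note_data_tx(struct call_timer *t, uint32_t seq, bool retrans,
			      int ret, uint64_t now, uint64_t timeout)
{
	if (ret < 0 || retrans || seq != 1)
		return;
	t->expect_rx_by = now + timeout;
	t->armed = true;
}
```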

    Fixes: a158bdd3247b ("rxrpc: Fix call timeouts")
    Reported-by: Marc Dionne
    Signed-off-by: David Howells

    David Howells
     

31 Mar, 2018

1 commit

  • Fix the firewall route keepalive part of AF_RXRPC which is currently
    functioning incorrectly by replying to VERSION REPLY packets from the
    server with VERSION REQUEST packets.

    Instead, send VERSION REPLY packets to the peers of service connections to
    act as keep-alives 20s after the latest packet was transmitted to that
    peer.

    Also, just discard VERSION REPLY packets rather than replying to them.

    Signed-off-by: David Howells

    David Howells
     

23 Feb, 2018

1 commit

  • All the kernel_sendmsg() calls in rxrpc_send_data_packet() need to send
    both parts of the iov[] buffer, but one of them does not. Fix it so that
    it does.

    Without this, short IPv6 rxrpc DATA packets may be seen that have the rxrpc
    header included, but no payload.

    Fixes: 5a924b8951f8 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
    Reported-by: Marc Dionne
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

24 Nov, 2017

2 commits

  • We need to transmit a packet every so often to act as a keepalive for the
    peer (which has a timeout from the last time it received a packet) and also
    to prevent any intervening firewalls from closing the route.

    Do this by resetting a timer every time we transmit a packet. If the timer
    ever expires, we transmit a PING ACK packet and thereby also elicit a PING
    RESPONSE ACK from the other side - which prevents our last-rx timeout from
    expiring.

    The timer is set to 1/6 of the last-rx timeout so that we can detect the
    other side going away if it misses 6 replies in a row.

    This is particularly necessary for servers where the processing of the
    service function may take a significant amount of time.

    Signed-off-by: David Howells

    David Howells
     
  • Add an extra timeout that is set/updated when we send a DATA packet that
    has the request-ack flag set. This allows us to detect if we don't get an
    ACK in response to the latest flagged packet.

    The ACK packet is adjudged to have been lost if it doesn't turn up within
    2*RTT of the transmission.

    If the timeout occurs, we schedule the sending of a PING ACK to find out
    the state of the other side. If a new DATA packet is ready to go sooner,
    we cancel the sending of the ping and set the request-ack flag on that
    instead.

    If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what
    we had at the time of the ping transmission, we adjudge all the DATA
    packets sent between the response tx_top and the ping-time tx_top to have
    been lost and retransmit immediately.

    Rather than sending a PING ACK, we could just pick a DATA packet and
    speculatively retransmit that with request-ack set. It should result in
    either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu of
    the PING-RESPONSE ACK mentioned above.

    Signed-off-by: David Howells

    David Howells
     

02 Nov, 2017

2 commits

  • Fix call expiry handling in the following ways:

    (1) If all the request data from a client call is acked, don't send a
    follow up IDLE ACK with firstPacket == 1 and previousPacket == 0 as
    this appears to fool some servers into thinking everything has been
    accepted.

    (2) Never send an abort back to the server once it has ACK'd all the
    request packets; rather just try to reuse the channel for the next
    call. The first request DATA packet of the next call on the same
    channel will implicitly ACK the entire reply of the dead call - even
    if we haven't transmitted it yet.

    (3) Don't send RX_CALL_TIMEOUT in an ABORT packet; librx uses abort codes
    to pass local errors to the caller in addition to remote errors, and
    this is meant to be local only.

    The following also need to be addressed in future patches:

    (4) Service calls should send PING ACKs as 'keep alives' if the server is
    still processing the call.

    (5) VERSION REPLY packets should be sent to the peers of service
    connections to act as keep-alives. This is used to keep firewall
    routes in place. The AFS CM should enable this.

    Signed-off-by: David Howells

    David Howells
     
  • rxrpc_fill_out_ack() needs to be passed the connection pointer from its
    caller rather than using call->conn as the call may be disconnected in
    parallel with it, clearing call->conn, leading to:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: rxrpc_send_ack_packet+0x231/0x6a4

    Signed-off-by: David Howells

    David Howells
     

29 Aug, 2017

1 commit

  • Fix IPv6 support in AF_RXRPC in the following ways:

    (1) When extracting the address from a received IPv4 packet, if the local
    transport socket is open for IPv6 then fill out the sockaddr_rxrpc
    struct for an IPv4-mapped-to-IPv6 AF_INET6 transport address instead
    of an AF_INET one.

    (2) When sending CHALLENGE or RESPONSE packets, the transport length needs
    to be set from the sockaddr_rxrpc::transport_len field rather than
    sizeof() on the IPv4 transport address.

    (3) When processing an IPv4 ICMP packet received by an IPv6 socket, set up
    the address correctly before searching for the affected peer.
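
    For point (1), the IPv4-mapped form is the IPv4 address placed after the
    ::ffff:0:0/96 prefix. The userspace sketch below shows the conversion
    using the standard socket address structs; map_v4_to_v6() is a
    hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/* Build the IPv4-mapped IPv6 equivalent of an AF_INET address, keeping the
 * port (both structs hold it in network byte order). */
static void map_v4_to_v6(const struct sockaddr_in *in4,
			 struct sockaddr_in6 *in6)
{
	memset(in6, 0, sizeof(*in6));
	in6->sin6_family = AF_INET6;
	in6->sin6_port = in4->sin_port;
	in6->sin6_addr.s6_addr[10] = 0xff;	/* ::ffff:a.b.c.d */
	in6->sin6_addr.s6_addr[11] = 0xff;
	memcpy(&in6->sin6_addr.s6_addr[12], &in4->sin_addr.s_addr, 4);
}
```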

    Signed-off-by: David Howells

    David Howells
     

05 Jun, 2017

1 commit

  • Make it possible for a client to use AuriStor's service upgrade facility.

    The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
    the first sendmsg() of a call. This takes no parameters.

    When recvmsg() starts returning data from the call, the service ID field in
    the returned msg_name will reflect the result of the upgrade attempt. If
    the upgrade was ignored, srx_service will match what was set in the
    sendmsg(); if the upgrade happened the srx_service will be altered to
    indicate the service the server upgraded to.

    Note that:

    (1) The choice of upgrade service is up to the server.

    (2) Further client calls to the same server that would share a connection
    are blocked if an upgrade probe is in progress.

    (3) This should only be used to probe the service. Clients should then
    use the returned service ID in all subsequent communications with that
    server (and not set the upgrade). Note that the kernel will not
    retain this information should the connection expire from its cache.

    (4) If a server that supports upgrading is replaced by one that doesn't,
    whilst a connection is live, and if the replacement is running, say,
    OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
    server will not respond to packets sent to the upgraded connection.

    At this point, calls will time out and the server must be reprobed.
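
    Since the control message takes no parameters, attaching it is a matter
    of appending a zero-length cmsg to the sendmsg() control buffer. The
    sketch below uses only the standard CMSG macros; append_cmsg() is a
    hypothetical helper, and the level and type are passed in so that the
    caller supplies SOL_RXRPC and RXRPC_UPGRADE_SERVICE from the system
    headers rather than this example hard-coding values.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Append one control message to msg.  msg->msg_control must point at a
 * zeroed, suitably aligned buffer of 'cap' bytes and msg->msg_controllen
 * must hold the bytes already used (0 for the first message).  Returns 0 on
 * success or -1 if the message does not fit. */
static int append_cmsg(struct msghdr *msg, size_t cap, int level, int type,
		       const void *data, size_t len)
{
	size_t used = msg->msg_controllen;
	struct cmsghdr *cmsg;

	if (used > cap || cap - used < CMSG_SPACE(len))
		return -1;
	cmsg = (struct cmsghdr *)((char *)msg->msg_control + used);
	cmsg->cmsg_level = level;
	cmsg->cmsg_type = type;
	cmsg->cmsg_len = CMSG_LEN(len);
	if (len)
		memcpy(CMSG_DATA(cmsg), data, len);
	msg->msg_controllen = used + CMSG_SPACE(len);
	return 0;
}
```

    A caller would invoke append_cmsg(&msg, cap, SOL_RXRPC,
    RXRPC_UPGRADE_SERVICE, NULL, 0) alongside whatever other control
    messages the first sendmsg() of the call needs.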

    Signed-off-by: David Howells

    David Howells
     

06 Oct, 2016

3 commits

  • Don't request an ACK on the last DATA packet of a call's Tx phase as for a
    client there will be a reply packet or some sort of ACK to shift phase. If
    the ACK is requested, OpenAFS sends a REQUESTED-ACK ACK with soft-ACKs in
    it and doesn't follow up with a hard-ACK.

    If we don't set the flag, OpenAFS will send a DELAY ACK that hard-ACKs the
    reply data, thereby allowing the call to terminate cleanly.

    Signed-off-by: David Howells

    David Howells
     
  • Separate the output of PING ACKs from the output of other sorts of ACK so
    that if we receive a PING ACK and schedule transmission of a PING RESPONSE
    ACK, the response doesn't get cancelled by a PING ACK we happen to be
    scheduling transmission of at the same time.

    If a PING RESPONSE gets lost, the other side might just sit there waiting
    for it and refuse to proceed otherwise.

    Signed-off-by: David Howells

    David Howells
     
  • Split rxrpc_send_data_packet() to separate ACK generation (which is more
    complicated) from ABORT generation. This simplifies the code a bit and
    fixes the following warning:

    In file included from ../net/rxrpc/output.c:20:0:
    net/rxrpc/output.c: In function 'rxrpc_send_call_packet':
    net/rxrpc/ar-internal.h:1187:27: error: 'top' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    net/rxrpc/output.c:103:24: note: 'top' was declared here
    net/rxrpc/output.c:225:25: error: 'hard_ack' may be used uninitialized in this function [-Werror=maybe-uninitialized]

    Reported-by: Arnd Bergmann
    Signed-off-by: David Howells

    David Howells
     

30 Sep, 2016

2 commits

  • Set the request-ACK on more DATA packets whilst we're in slow start mode so
    that we get sufficient ACKs back to supply information to configure the
    window.

    Signed-off-by: David Howells

    David Howells
     
  • In rxrpc_send_data_packet() make the loss-injection path return through the
    same code as the transmission path so that the RTT determination is
    initiated and any future timer shuffling will be done, despite the packet
    having been binned.

    Whilst we're at it:

    (1) Add to the tx_data tracepoint an indication of whether or not we're
    retransmitting a data packet.

    (2) When we're deciding whether or not to request an ACK, rather than
    checking if we're in fast-retransmit mode check instead if we're
    retransmitting.

    (3) Don't invoke the lose_skb tracepoint when losing a Tx packet as we're
    not altering the sk_buff refcount nor are we just seeing it after
    getting it off the Tx list.

    (4) The rxrpc_skb_tx_lost note is then no longer used so remove it.

    (5) rxrpc_lose_skb() no longer needs to deal with rxrpc_skb_tx_lost.

    Signed-off-by: David Howells

    David Howells
     

25 Sep, 2016

2 commits

  • Implement RxRPC slow-start, which is similar to RFC 5681 for TCP. A
    tracepoint is added to log the state of the congestion management algorithm
    and the decisions it makes.

    Notes:

    (1) Since we send fixed-size DATA packets (apart from the final packet in
    each phase), counters and calculations are in terms of packets rather
    than bytes.

    (2) The ACK packet carries the equivalent of TCP SACK.

    (3) The FLIGHT_SIZE calculation in RFC 5681 doesn't seem particularly
    suited to SACK of a small number of packets. It seems that, almost
    inevitably, by the time three 'duplicate' ACKs have been seen, we have
    narrowed the loss down to one or two missing packets, and the
    FLIGHT_SIZE calculation ends up as 2.

    (4) In rxrpc_resend(), if there was no data that apparently needed
    retransmission, we transmit a PING ACK to ask the peer to tell us what
    its Rx window state is.
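
    To make note (1) concrete, here is a packet-counted sketch of the
    RFC 5681 window rules: slow start below ssthresh, roughly one extra
    packet per window's worth of ACKs above it, and ssthresh set to
    max(flight size / 2, 2) on fast retransmit. It is an illustration under
    those textbook rules, not the algorithm as implemented in the patch;
    the struct and function names are hypothetical.

```c
#include <assert.h>

struct cong {
	unsigned int cwnd;	/* congestion window, in packets */
	unsigned int ssthresh;	/* slow-start threshold, in packets */
	unsigned int acked;	/* ACKs counted towards the next increase */
};

/* Account for newly hard-ACK'd packets. */
static void cong_on_ack(struct cong *c, unsigned int newly_acked)
{
	if (c->cwnd < c->ssthresh) {
		c->cwnd += newly_acked;		/* slow start */
		return;
	}
	/* Congestion avoidance: one more packet per full window ACK'd. */
	c->acked += newly_acked;
	if (c->acked >= c->cwnd) {
		c->acked = 0;
		c->cwnd++;
	}
}

/* React to loss detected by duplicate ACKs. */
static void cong_on_fast_retransmit(struct cong *c, unsigned int flight_size)
{
	unsigned int half = flight_size / 2;

	c->ssthresh = half > 2 ? half : 2;
	c->cwnd = c->ssthresh + 3;	/* inflate by the three duplicate ACKs */
	c->acked = 0;
}
```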

    Signed-off-by: David Howells

    David Howells
     
  • Send an ACK if we haven't sent one for the last two packets we've received.
    This keeps the other end apprised of where we've got to - which is
    important if they're doing slow-start.

    We do this in recvmsg so that we can dispatch a packet directly without the
    need to wake up the background thread.

    This should possibly be made configurable in future.

    Signed-off-by: David Howells

    David Howells
     

23 Sep, 2016

2 commits

  • Add a tracepoint to log proposed ACKs, including whether the proposal is
    used to update a pending ACK or is discarded in favour of an earlier,
    higher priority ACK.

    Whilst we're at it, get rid of the rxrpc_acks() function and access the
    name array directly. We do, however, need to validate the ACK reason
    number given to trace_rxrpc_rx_ack() to make sure we don't overrun the
    array.
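
    The validation amounts to clamping the reason to a catch-all entry
    before indexing, as in this sketch. The table and ack_name() are
    illustrative: REQ and PNG appear in rxrpc trace output, while the other
    labels here are assumed for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Short labels indexed by ACK reason; the last entry is the catch-all. */
static const char *const ack_names[] = {
	"---", "REQ", "DUP", "OOS", "WIN", "MEM", "PNG", "PNR", "DLY", "IDL",
	"-?-",
};

#define ACK_NAME_UNKNOWN (sizeof(ack_names) / sizeof(ack_names[0]) - 1)

/* Map an ACK reason taken from the wire to a label without ever reading
 * past the end of the array. */
static const char *ack_name(unsigned int reason)
{
	size_t idx = reason;

	if (idx >= ACK_NAME_UNKNOWN)
		idx = ACK_NAME_UNKNOWN;
	return ack_names[idx];
}
```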

    Signed-off-by: David Howells

    David Howells
     
  • Add a tracepoint to log transmission of DATA packets (including loss
    injection).

    Adjust the ACK transmission tracepoint to include the packet serial number
    and to line this up with the DATA transmission display.

    Signed-off-by: David Howells

    David Howells