01 Nov, 2011

1 commit


29 Oct, 2010

2 commits

  • This extends the existing wait-for-ccid routine so that it may be used with
    different types of CCID, addressing the following problems:

    1) The queue-drain mechanism only works with rate-based CCIDs. If CCID-2 for
    example has a full TX queue and becomes network-limited just as the
    application wants to close, then waiting for CCID-2 to become unblocked
    could lead to an indefinite delay (i.e., application "hangs").
    2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
    in its sending policy while the queue is being drained. This can lead to
    further delays during which the application will not be able to terminate.
    3) The minimum wait time for CCID-3/4 can be expected to be the queue length
    times the current inter-packet delay. For example if tx_qlen=100 and a delay
    of 15 ms is used for each packet, then the application would have to wait
    for a minimum of 1.5 seconds before being allowed to exit.
    4) There is no way for the user/application to control this behaviour. It would
    be good to use the timeout argument of dccp_close() as an upper bound. Then
    the maximum time that an application is willing to wait for its CCIDs can
    be set via the SO_LINGER option.

    These problems are addressed by giving the CCID a grace period of up to the
    `timeout' value.
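
    As a rough illustration of points 3) and 4), the wait on close can be
    modelled as the queue drain time capped by the linger timeout. This is a
    back-of-the-envelope sketch, not the kernel code; the function and
    parameter names below are hypothetical.

```python
def min_drain_time_ms(tx_qlen: int, inter_packet_delay_ms: float) -> float:
    """Lower bound on the time a rate-based CCID needs to drain its TX queue."""
    return tx_qlen * inter_packet_delay_ms


def close_wait_ms(tx_qlen: int, inter_packet_delay_ms: float,
                  linger_timeout_ms: float) -> float:
    """Time spent waiting at close: the drain time, capped by the timeout."""
    drain = min_drain_time_ms(tx_qlen, inter_packet_delay_ms)
    return min(drain, linger_timeout_ms)
```

    With tx_qlen=100 and 15 ms per packet the drain time is 1500 ms; a linger
    time of 500 ms would cap the wait at 500 ms, after which the remaining
    queue is purged. The real drain time can be longer, since the CCID may
    change its sending rate while draining (point 2).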

    The wait-for-ccid function is, as before, used when the application
    (a) has read all the data in its receive buffer and
    (b) if SO_LINGER was set with a non-zero linger time, or
    (c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
    state (client application closes after receiving CloseReq).

    In addition, there is a catch-all case of __skb_queue_purge() after waiting for
    the CCID. This is necessary since the write queue may still have data when
    (a) the host has been passively-closed,
    (b) abnormal termination (unread data, zero linger time),
    (c) wait-for-ccid could not finish within the given time limit.

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This extends the packet dequeuing interface of dccp_write_xmit() to allow
    1. CCIDs to take care of timing when the next packet may be sent;
    2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).

    The main purpose is to take CCID-2 out of its polling mode (when it is network-
    limited, it tries every millisecond to send, without interruption).

    The mode of operation for (2) is as follows:
    * new packet is enqueued via dccp_sendmsg() => dccp_write_xmit(),
    * ccid_hc_tx_send_packet() detects that it may not send (e.g. window full),
    * it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
    * dccp_write_xmit() returns without further action;
    * after some time the wait-condition for CCID becomes true,
    * that CCID schedules the tasklet,
    * tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
    * since the wait-condition is now true, ccid_hc_tx_send_packet() returns "send now",
    * packet is sent, and possibly more (since dccp_write_xmit() loops).

    Code reuse: the tasklet function calls dccp_write_xmit(), the timer function
    reduces to a wrapper around the same code.
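
    The mode of operation listed above can be mimicked with a small userspace
    model. It is only a toy: the class, the `Verdict' names and the
    window-based wait-condition are assumptions made for illustration, and
    only WILL_DEQUEUE_LATER corresponds to a constant named in this message.

```python
from collections import deque
from enum import Enum, auto


class Verdict(Enum):
    SEND_AT_ONCE = auto()        # hypothetical name for "send now"
    WILL_DEQUEUE_LATER = auto()  # mirrors CCID_PACKET_WILL_DEQUEUE_LATER


class WindowCcid:
    """Toy window-based CCID: may send while in_flight < window."""

    def __init__(self, window: int):
        self.window = window
        self.in_flight = 0
        self.tasklet_pending = False

    def send_packet(self) -> Verdict:
        if self.in_flight < self.window:
            return Verdict.SEND_AT_ONCE
        return Verdict.WILL_DEQUEUE_LATER

    def packet_sent(self) -> None:
        self.in_flight += 1

    def ack_received(self) -> None:
        # The wait-condition becomes true: schedule the tasklet, no polling.
        self.in_flight = max(0, self.in_flight - 1)
        self.tasklet_pending = True


def write_xmit(queue: deque, ccid: WindowCcid, wire: list) -> None:
    """Dequeue and send until the queue is empty or the CCID says 'later'."""
    while queue:
        if ccid.send_packet() is Verdict.WILL_DEQUEUE_LATER:
            return  # return without further action; the CCID calls back
        wire.append(queue.popleft())
        ccid.packet_sent()


def run_tasklet(queue: deque, ccid: WindowCcid, wire: list) -> None:
    """Tasklet body: clear the pending flag and re-enter write_xmit()."""
    if ccid.tasklet_pending:
        ccid.tasklet_pending = False
        write_xmit(queue, ccid, wire)
```

    With a window of 2 and three queued packets, write_xmit() sends two and
    returns; the third goes out only after ack_received() has scheduled the
    tasklet and run_tasklet() has been called.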

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     

13 Apr, 2010

1 commit

  • With latest CONFIG_PROVE_RCU stuff, I felt more comfortable to make this
    work.

    sk->sk_dst_cache is currently protected by a rwlock (sk_dst_lock)

    This rwlock is readlocked for a very small amount of time, and dst
    entries are already freed after RCU grace period. This calls for RCU
    again :)

    This patch converts sk_dst_lock to a spinlock, and uses RCU for readers.

    __sk_dst_get() is supposed to be called under rcu_read_lock() or with the
    socket locked by the user, so use the appropriate rcu_dereference_check()
    condition (rcu_read_lock_held() || sock_owned_by_user(sk)).

    This patch avoids two atomic ops per tx packet on UDP connected sockets,
    for example, and permits sk_dst_lock to be much less dirtied.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Oct, 2009

1 commit

  • dst_negative_advice() should check for changed dst and reset
    sk_tx_queue_mapping accordingly. Pass sock to the callers of
    dst_negative_advice.

    (sk_reset_txq is defined just for use by dst_negative_advice. The
    only way I could find to get around this is to move dst_negative_()
    from dst.h to dst.c, include sock.h in dst.c, etc)

    Signed-off-by: Krishna Kumar
    Signed-off-by: David S. Miller

    Krishna Kumar
     

12 Nov, 2008

1 commit

  • This patch limits feature (capability) negotiation to the connection setup phase:

    1. Although it is theoretically possible to perform feature negotiation at any
    time (and RFC 4340 supports this), in practice this is prohibitively complex,
    as it requires putting traffic on hold for each new negotiation.
    2. As a byproduct of restricting feature negotiation to connection setup, the
    feature-negotiation retransmit timer is no longer required. This part is now
    mapped onto the protocol-level retransmission.
    Details indicating why timers are no longer needed can be found on
    http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\
    implementation_notes.html

    This patch disables anytime negotiation, subsequent patches work out full
    feature negotiation support for connection setup.

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     

26 Jul, 2008

2 commits

  • This patch allows the sender to distinguish original and retransmitted packets,
    which is in particular needed for the retransmission of DCCP-Requests:
    * the first Request uses ISS (generated in net/dccp/ip*.c), and sets GSS = ISS;
    * all retransmitted Requests use GSS' = GSS + 1, so that the n-th retransmitted
    Request has sequence number ISS + n (mod 2^48).

    To add generic support, the patch reorganises existing code so that:
    * icsk_retransmits == 0 for the original packet and
    * icsk_retransmits = n > 0 for the n-th retransmitted packet
    at the time dccp_transmit_skb() is called, via dccp_retransmit_skb().
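
    The numbering rule can be written down as a one-liner. This is a sketch
    under the assumption of 48-bit sequence numbers as in RFC 4340; the
    function name is hypothetical and `retransmits' stands in for
    icsk_retransmits.

```python
SEQNO_MOD = 1 << 48  # assumption: 48-bit DCCP sequence numbers (RFC 4340)


def request_seqno(iss: int, retransmits: int) -> int:
    """Sequence number of a Request: ISS for the original packet
    (retransmits == 0), ISS + n for the n-th retransmission, wrapping
    around at 2^48."""
    return (iss + retransmits) % SEQNO_MOD
```

    For example, with ISS = 1000 the third retransmitted Request carries
    sequence number 1003, and an ISS of 2^48 - 1 wraps to 0 on the first
    retransmission.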

    Thanks to Wei Yongjun for pointing this problem out.

    Further changes:
    ----------------
    * removed the `skb' argument from dccp_retransmit_skb(), since sk_send_head
    is used for all retransmissions (the exception is client-Acks in PARTOPEN
    state, but these do not use sk_send_head);
    * since sk_send_head always contains the original skb (via dccp_entail()),
    skb_cloned() never evaluated to true and thus pskb_copy() was never used.

    Signed-off-by: Gerrit Renker

    Gerrit Renker
     
  • Removes legacy reinvent-the-wheel type thing. The generic
    machinery integrates much better to automated debugging aids
    such as kerneloops.org (and others), and is unambiguous due to
    better naming. Non-intuitively, BUG_TRAP() is actually equal to
    WARN_ON() rather than BUG_ON(), though some might actually be
    promoted to BUG_ON(); I left that for the future.

    I could make at least one BUILD_BUG_ON conversion.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

17 Jul, 2008

1 commit


29 Jan, 2008

1 commit

  • A lot of code in the kernel initializes timer->function
    and timer->data together with calling init_timer(timer). There
    is already a helper for this. Use it for networking code.

    The patch is HUGE, but makes the code 130 lines shorter
    (98 insertions(+), 228 deletions(-)).

    Signed-off-by: Pavel Emelyanov
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

11 Oct, 2007

1 commit

  • This provides a timesource, conveniently used for DCCP timestamps, which
    returns the elapsed time in 10s of microseconds since initialisation.
    This makes for a wrap-around time of about 11.9 hours, which should be
    sufficient for most applications.
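
    The wrap-around figure can be checked with a few lines of arithmetic.
    The 32-bit counter width is an assumption here (it is not stated above,
    but it is consistent with the 11.9 hours quoted); the names are
    hypothetical.

```python
TICK_US = 10       # one timestamp unit is 10 microseconds
COUNTER_BITS = 32  # assumption: the timestamp is a 32-bit counter


def wraparound_hours(tick_us: int = TICK_US, bits: int = COUNTER_BITS) -> float:
    """Hours until a `bits`-bit counter ticking every `tick_us` us wraps."""
    seconds = (1 << bits) * tick_us / 1_000_000
    return seconds / 3600
```

    2^32 ticks of 10 microseconds each are about 42950 seconds, i.e. roughly
    11.93 hours.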

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     

26 Mar, 2007

1 commit


10 Mar, 2007

1 commit

  • The TX CCID needs the write_xmit_timer for delaying packet sends. Previously
    this timer was only activated on active (connecting) sockets.

    This patch initialises the write_xmit_timer in sync with the other timers, i.e.
    the timer will be ready on any socket. This is used by applications with a
    listening socket which start to stream after receiving an initiation by the
    client. The write_xmit_timer is stopped when the application closes, as before.

    Was tested to work and to remove the timer bug reported on dccp@vger.

    Also moved timer initialisation into timer.c (static).

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: David S. Miller

    Gerrit Renker
     

11 Feb, 2007

1 commit


12 Dec, 2006

1 commit


03 Dec, 2006

3 commits

  • This removes 3 forward declarations by reordering 2 functions.

    No code change at all.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This adds 3 sysctls which govern the retransmission behaviour of DCCP control
    packets (3way handshake, feature negotiation).

    It removes 4 FIXMEs from the code.

    The close resemblance of sysctl variables to their TCP analogues is emphasised
    not only by their name, but also by giving them the same initial values.
    This is useful since there is not much practical experience with DCCP yet.

    Furthermore, with regard to the previous patch, it is now possible to limit
    the number of keepalive-Responses by setting net.dccp.default.request_retries
    (also a bit like in TCP).

    Lastly, added documentation of all existing DCCP sysctls.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This updates program documentation: spell out precise conditions about
    which packets are eligible for retransmission (which is actually quite
    hard to extract from RFC 4340).

    It is based on the following table derived from RFC 4340:

    +-----------+---------------------------------+---------------------+
    | Type      | Retransmit?                     | Remark              |
    +-----------+---------------------------------+---------------------+
    | Request   | in client-REQUEST state         | sec. 8.1.1          |
    | Response  | NEVER                           | SHOULD NOT, 8.1.3   |
    | Data      | NEVER                           | unreliable protocol |
    | Ack       | possible in client-PARTOPEN     | sec. 8.1.5          |
    | DataAck   | NEVER                           | unreliable protocol |
    | CloseReq  | only in server-CLOSEREQ state   | MUST, sec. 8.3      |
    | Close     | in node-CLOSING state           | MUST, sec. 8.3      |
    +-----------+-------------------------------------------------------+
    | Reset     | only in response to other packets                     |
    | Sync      | only in response to sequence-invalid packets (7.5.4)  |
    | SyncAck   | only in response to Sync packets                      |
    +-----------+-------------------------------------------------------+

    Hence the only packets eligible for retransmission are:
    * Requests in client-REQUEST state (sec. 8.1.1)
    * Acks in client-PARTOPEN state (sec. 8.1.5)
    * CloseReq in server-CLOSEREQ state (sec. 8.3)
    * Close in node-CLOSING state (sec. 8.3)

    I had meant to put in a check for these types too, but have left that
    for later.
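
    Such a check could be as simple as a lookup of (packet type, state)
    pairs. The sketch below only restates the four eligible cases; the
    names and the string encoding of types and states are hypothetical and
    do not reflect the kernel's constants.

```python
# (packet type, connection state) pairs eligible for retransmission.
RETRANSMITTABLE = {
    ("Request", "REQUEST"),    # client, sec. 8.1.1
    ("Ack", "PARTOPEN"),       # client, sec. 8.1.5
    ("CloseReq", "CLOSEREQ"),  # server, sec. 8.3
    ("Close", "CLOSING"),      # either node, sec. 8.3
}


def may_retransmit(packet_type: str, state: str) -> bool:
    """True if a packet of this type may be retransmitted in this state."""
    return (packet_type, state) in RETRANSMITTABLE
```

    Everything else (Response, Data, DataAck, and the reactive Reset, Sync
    and SyncAck types) is rejected by this lookup.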

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     

01 Jul, 2006

1 commit


21 Mar, 2006

2 commits

  • Renaming it to dccp_send_reset and moving it from the ipv4 specific
    code to the core dccp code.

    This fixes some bugs in IPV6 where timers would send v4 resets, etc.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Still needs more work, but boots and doesn't crash, even
    does some negotiation!

    18:38:52.174934 127.0.0.1.43458 > 127.0.0.1.5001: request
    18:38:52.218526 127.0.0.1.5001 > 127.0.0.1.43458: response
    18:38:52.185398 127.0.0.1.43458 > 127.0.0.1.5001:

    :-)

    Signed-off-by: Andrea Bittau
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Andrea Bittau
     

30 Aug, 2005

4 commits