16 Oct, 2007

1 commit


11 Oct, 2007

39 commits

  • This fixes two bugs in processing of connection-Requests in
    v{4,6}_conn_request:

    1. Due to using the variable `reset_code', the Reset code generated
    internally by dccp_parse_options() is overwritten with the
    initialised value ("Too Busy") of reset_code, which is not what is
    intended.

    2. When receiving a connection-Request on a multicast or broadcast
    address, no Reset should be generated, to avoid storms of such
    packets. Instead of jumping to the `drop' label, the
    v{4,6}_conn_request functions now return 0. Below is why in my
    understanding this is correct:

    When the conn_request function returns < 0, then the caller,
    dccp_rcv_state_process(), returns 1. In all instances where
    dccp_rcv_state_process is called (dccp_v4_do_rcv, dccp_v6_do_rcv,
    and dccp_child_process), a return value of != 0 from
    dccp_rcv_state_process() means that a Reset is generated.

    If on the other hand the conn_request function returns 0, the
    packet is discarded and no Reset is generated.

    Note: There may be a related problem when sending the Response, due to
    the following.

    if (dccp_v6_send_response(sk, req, NULL))
            goto drop_and_free;
    /* ... */
    drop_and_free:
            return -1;

    In this case, if send_response fails due to transmission errors, the
    next thing that is generated is a Reset with a code "Too Busy". I
    haven't been able to conjure up such a condition, but it might be good
    to change the behaviour here also (not done by this patch).

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • The elapsed time uses u32, but printk was using %d, not %u.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This

    * removes a declaration of a non-existent function
    __dccp_minisock_init;

    * shifts the initialisation function dccp_minisock_init() from
    options.c to minisocks.c, where it is more naturally expected to
    be.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This replaces several uses of standard arithmetic with the DCCP
    sequence number arithmetic functions. The problem here is that the
    sequence number wrap-around was not taken into consideration.

    * Condition "seqp->ccid2s_seq <= prev->ccid2s_seq" has been replaced
    by

    dccp_delta_seqno(seqp->ccid2s_seq, prev->ccid2s_seq) >= 0

    since if seqp is `before' prev, then the delta_seqno() is positive.

    * The test whether sequence numbers `a' and `b' are consecutive has
    the form

    dccp_delta_seqno(a, b) == 1

    * Increment of ccid2hctx_rpseq could be done using dccp_inc_seqno(),
    but since here the incremented ccid2hctx_rpseq == seqno, used
    assignment instead.
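
    The wrap-around problem can be illustrated with a minimal user-space
    sketch. This is not the kernel code: seq48_delta() and
    seq48_consecutive() are hypothetical names, and the sketch only assumes
    that sequence numbers are 48 bits wide and that the delta is positive
    when the second argument is `after' the first.

```c
#include <assert.h>
#include <stdint.h>

/* DCCP sequence numbers are 48 bits wide. */
#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/* Signed circular distance from a to b; positive when b is after a. */
static int64_t seq48_delta(uint64_t a, uint64_t b)
{
    assert(a <= SEQ48_MASK && b <= SEQ48_MASK);

    uint64_t d = (b - a) & SEQ48_MASK;      /* difference modulo 2^48 */

    /* Treat bit 47 as the sign bit of the 48-bit result. */
    if (d & (UINT64_C(1) << 47))
        return (int64_t)d - (INT64_C(1) << 48);
    return (int64_t)d;
}

/* True when b directly follows a, including across the wrap to 0. */
static int seq48_consecutive(uint64_t a, uint64_t b)
{
    return seq48_delta(a, b) == 1;
}
```

    A plain `a < b' comparison would treat 0 as coming before 2^48 - 1,
    whereas the circular delta reports it as the direct successor.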

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • skb's passed to ccid2_hc_tx_send_packet() are headerless; the packet
    type is decided later, in dccp_write_xmit(). Therefore the first test
    of the switch/case block is always true and the others are never reached.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This removes a test for `val < 1' which would only have been triggered
    when val < 0, due to a preceding test for 0. Fixed by using an
    unsigned type for cwnd (as in TCP) instead.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This removes an ugly BUG_ON which has been pointed out by Arnaldo.

    Instead of freezing up the machine, a `critical' message is now issued
    to the system log.

    There is potential of doing this more gracefully (eg. there are a few
    internal variables which could be updated despite the lack of memory),
    but that requires more complicated changes to the algorithm; thus a
    `FIXME' has been added.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This patch simplifies the interface of ccid2_hc_tx_alloc_seq():

    * ccid2_hc_tx_alloc_seq() is always called with an argument of
    CCID2_SEQBUF_LEN;

    * other code - ccid2_hc_tx_check_sanity() - even depends on the
    assumption that ccid2_hc_tx_alloc_seq() has been called with this
    particular size;

    * passing the `gfp_t' argument to ccid2_hc_tx_alloc_seq() is
    redundant with gfp_any().

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This just sets the parameter to bool, since debugging messages are
    either on or off.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This enables applications to query the current value of the Maximum
    Packet Size via a socket option, suggested as a SHOULD in (RFC 4340,
    p. 102).

    This socket option is useful to avoid the annoying bail-out via
    `-EMSGSIZE', in particular since fragmentation is not currently
    supported (and its use is partly discouraged in RFC 4340).

    With this option, it is possible to size buffers accordingly, e.g.

    int buflen = dccp_get_cur_mps(sockfd);

    /* or */
    if (msgsize > dccp_get_cur_mps(sockfd))
            die("message is too large for this path");
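
    The helper dccp_get_cur_mps() used above is not defined in this
    message. A minimal sketch of such a wrapper could look as follows,
    assuming the option is exposed as DCCP_SOCKOPT_GET_CUR_MPS at level
    SOL_DCCP and yields an int. The numeric fallbacks are assumptions based
    on the Linux headers and are only used when the system headers do not
    provide the constants.

```c
#include <assert.h>
#include <sys/socket.h>

/* Fallback values (assumed from the Linux headers). */
#ifndef SOL_DCCP
#define SOL_DCCP 269
#endif
#ifndef DCCP_SOCKOPT_GET_CUR_MPS
#define DCCP_SOCKOPT_GET_CUR_MPS 5
#endif

/* Returns the current Maximum Packet Size in bytes, or -1 on error
 * (errno is then set by getsockopt()). */
static int dccp_get_cur_mps(int sockfd)
{
    int mps = 0;
    socklen_t len = sizeof(mps);

    if (getsockopt(sockfd, SOL_DCCP, DCCP_SOCKOPT_GET_CUR_MPS,
                   &mps, &len) < 0)
        return -1;

    assert(len == sizeof(mps));     /* the option is expected to be an int */
    return mps;
}
```

    Callers should treat a return value of -1 as "size unknown" rather
    than as a usable buffer length.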

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This performs a minor optimisation: when ccid_hc_tx_send_packet
    returns a value greater than zero, the same call was previously made
    again at the beginning of the while loop in dccp_wait_for_ccid.

    This patch exploits the available information and schedules the
    timeout directly instead.

    Documentation also added.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • As suggested by DaveM.

    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Since DCCP requires both ends of a connection to be closed simultaneously,
    permission to write in state DCCP_CLOSING is removed in dccp_sendmsg():
    * if the sending end closed, it would encounter a write error anyhow;
    * if the other end has closed the connection, it accepts no more data.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This factors code common to dccp_v{4,6}_ctl_send_reset into a separate function,
    and adds support for filling in the Data 1 ... Data 3 fields from RFC 4340, 5.6.

    It is useful to have this separate, since the following Reset codes will always
    be generated from the control socket rather than via dccp_send_reset:
    * Code 3, "No Connection", cf. 8.3.1;
    * Code 4, "Packet Error" (identification for Data 1 added);
    * Code 5, "Option Error" (identification for Data 1..3 added, will be used later);
    * Code 6, "Mandatory Error" (same as Option Error);
    * Code 7, "Connection Refused" (what on Earth is the difference to "No Connection"?);
    * Code 8, "Bad Service Code";
    * Code 9, "Too Busy";
    * Code 10, "Bad Init Cookie" (not used).

    Code 0 is not recommended by the RFC. The following codes would be used in
    dccp_send_reset() instead, since they all relate to an established DCCP connection:
    * Code 1, "Closed";
    * Code 2, "Aborted";
    * Code 11, "Aggression Penalty" (12.3).

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This replaces normal addition with mod-48 addition so that sequence number
    wraparound is respected.
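
    As an illustration only (seq48_add() is a hypothetical name, not the
    kernel helper), mod-48 addition can be sketched by masking the sum to
    the 48-bit sequence number space:

```c
#include <assert.h>
#include <stdint.h>

/* DCCP sequence numbers are 48 bits wide. */
#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/* Add n to a 48-bit sequence number, wrapping at 2^48. */
static uint64_t seq48_add(uint64_t seqno, uint64_t n)
{
    assert(seqno <= SEQ48_MASK);
    return (seqno + n) & SEQ48_MASK;
}
```

    With normal addition, incrementing the largest sequence number yields
    2^48, which is not a valid sequence number; the masked sum yields 0.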

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This implements a SHOULD from RFC 4340, 7.5.4:
    "To protect against denial-of-service attacks, DCCP implementations SHOULD
    impose a rate limit on DCCP-Syncs sent in response to sequence-invalid packets,
    such as not more than eight DCCP-Syncs per second."

    The rate-limit is maintained on a per-socket basis. This is a more stringent
    policy than enforcing the rate-limit on a per-source-address basis and
    protects against attacks with forged source addresses.

    Moreover, the mechanism is deliberately kept simple. In contrast to
    xrlim_allow(), bursts of Sync packets in reply to sequence-invalid packets
    are not supported. This foils such attacks where the receipt of a Sync
    triggers further sequence-invalid packets. (I have tested this mechanism against
    xrlim_allow algorithm for Syncs, permitting bursts just increases the problems.)

    In order to keep flexibility, the timeout parameter can be set via sysctl; and
    the whole mechanism can even be disabled (which is however not recommended).

    The algorithm in this patch has been improved with regard to wrapping issues
    thanks to a suggestion by Arnaldo.

    Committer note: Rate limited the step 6 DCCP_WARN too, as it says we're
    sending a sync.
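
    The idea of a burst-free, per-socket limit can be sketched as follows.
    This is a simplified user-space model under stated assumptions, not the
    code of this patch: struct sync_limiter and sync_allowed() are
    hypothetical names, time is a free-running 32-bit tick counter, and a
    minimum gap of 0 stands for "mechanism disabled".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sync_limiter {
    uint32_t last_sent;  /* tick at which the last Sync was permitted */
    uint32_t min_gap;    /* minimum ticks between Syncs; 0 = disabled */
    bool     sent_any;   /* false until the first Sync was permitted  */
};

/* Returns true if a Sync may be sent at tick `now'. No credit is
 * accumulated while idle, so bursts are never permitted. */
static bool sync_allowed(struct sync_limiter *rl, uint32_t now)
{
    assert(rl);

    /* Unsigned subtraction keeps the comparison valid across wrap. */
    if (rl->min_gap != 0 && rl->sent_any &&
        (uint32_t)(now - rl->last_sent) < rl->min_gap)
        return false;

    rl->last_sent = now;
    rl->sent_any = true;
    return true;
}
```

    With millisecond ticks, a minimum gap of 125 would correspond to the
    "not more than eight DCCP-Syncs per second" quoted above.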

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • In this patch, duplicated code is removed for the case when a Reset packet is
    sent from a connected socket. This code duplication is between dccp_make_reset
    and dccp_transmit_skb, which already contained an (up to now entirely unused)
    switch statement to fill in the reset code from the DCCP_SKB_CB.

    The only thing that has been removed is the call to dst_clone(dst), since
    the queue_xmit functions use sk_dst_cache anyway.

    I wasn't sure which purpose inet_sk_rebuild_header served, so I left it in.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This adds fields to support the informational Data 1..3 fields of the
    DCCP-Reset packets (RFC 4340, 5.6), and makes minor cosmetic changes
    to documentation.
    Code which fills in these fields follows in subsequent patches; it is
    primarily used for reporting option-processing and feature-negotiation
    errors.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This adds a FIXME to signal that the function dccp_send_delayed_ack is not
    used anywhere in the DCCP/CCID code.

    Using a delayed Ack timer is suggested in 11.3 of RFC 4340, but it has also
    rather subtle implications for the Ack-Ratio-accounting.

    CCID2 does not use this (maybe it should).

    I think leaving the function in is good, in case someone wants to implement
    this.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This moves several instances of testing against NULL into the function which is
    used to de-reference the CCID-private data.

    Committer note: Made the BUG_ON depend on having CONFIG_IP_DCCP_CCID3_DEBUG, as it
    is too much to have this on production code. Also made sure that
    the macro is used only after checking if sk_state is not LISTEN,
    to make it equivalent to what we had before.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This fixes the code to correspond to RFC 4340, 7.5.4, which states the
    exception that a Sync received in state REQUEST generates a Reset (not
    a SyncAck).

    To achieve this, only a small change is required. Since
    dccp_rcv_request_sent_state_process() already uses the correct Reset Code
    number 4 ("Packet Error"), we only need to shift the if-statement a few
    lines further down.

    (To test this case: replace DCCP_PKT_RESPONSE with DCCP_PKT_SYNC
    in dccp_make_response.)

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • The parameter `seq' of dccp_send_sync() is in fact an acknowledgement number
    and not a sequence number - thus renamed by this patch into `ackno'.

    Secondly, a `critical' warning is added when a Sync/SyncAck could not be sent.

    Sanity: I have checked all other functions that are called in dccp_transmit_skb;
    there are no clashes with the use of dccpd_ack_seq, and no other function is
    using this slot at the same time.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This updates sequence number checking with regard to RFC 4340, 7.5.4.
    Missing in the code was an exception for sequence-invalid Reset packets,
    which get a Sync acknowledging GSR, instead of (as usual) P.seqno.

    This can lead to an oscillating ping-pong flood of Reset packets.

    In fact, it has been observed on the wire as follows:

    1. client establishes connection to server;
    2. before server can write to client, client crashes without notifying
    the server (NB: now no longer possible due to ABORT function);
    3. server sends DCCP-Data packet (has no ackno);
    4. client generates Reset "No Connection", seqno=0, increments seqno;
    5. server replies with Sync, using ackno = P.seqno;
    6. client generates Reset "No Connection" with seqno = ackno + 1;
    7. goto (5).

    The difference is that now in (5) the server uses GSR. This causes the
    Reset sent by the client in (6) to become sequence-valid, so that in (7)
    the vicious circle is broken; the Reset is then enqueued and causes the
    socket to enter TIMEWAIT state.
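
    The rule for choosing the acknowledgement number of the Sync can be
    summarised in a small sketch. The names (enum pkt_type, sync_ackno())
    are hypothetical and the function models only the choice described
    above, not the full sequence-validity check.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

enum pkt_type { PKT_DATA, PKT_RESET, PKT_OTHER };

/* Acknowledgement number for the Sync sent in reply to a
 * sequence-invalid packet P: GSR for a Reset, P.seqno otherwise. */
static uint64_t sync_ackno(enum pkt_type type, uint64_t p_seqno, uint64_t gsr)
{
    assert(p_seqno <= SEQ48_MASK && gsr <= SEQ48_MASK);
    return type == PKT_RESET ? gsr : p_seqno;
}
```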

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This patch is in part required by the next patch; it

    * replaces 6 instances of `DCCP_SKB_CB(skb)->dccpd_seq' with `seqno';
    * replaces 7 instances of `DCCP_SKB_CB(skb)->dccpd_ack_seq' with `ackno';
    * replaces 1 use of dccp_inc_seqno() by unfolding `ADD48' macro in place.

    No changes in algorithm; all changes are text replacement/substitution.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • The third parameter of dccp_sample_rtt now becomes useless and is removed.

    Also combined the subtraction of the timestamp echo and the elapsed time.
    This is safe, since (a) presence of timestamp echo is tested first and (b)
    elapsed time is either present and non-zero or it is not set and equals 0
    due to the memset in dccp_parse_options.
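
    The combined subtraction can be sketched as below. This is an
    illustration under assumptions, not the patched function: rtt_sample()
    is a hypothetical name, all values are 32-bit counters in the same
    unit, and a timestamp echo of 0 is taken to mean "not present".

```c
#include <assert.h>
#include <stdint.h>

/* RTT sample = now - (timestamp echo + elapsed time at the peer).
 * An absent elapsed-time option is passed as 0 and drops out. */
static uint32_t rtt_sample(uint32_t now, uint32_t tstamp_echo, uint32_t elapsed)
{
    assert(tstamp_echo != 0);   /* caller has tested for the echo first */

    /* Unsigned arithmetic keeps the result valid across counter wrap. */
    return (uint32_t)(now - (uint32_t)(tstamp_echo + elapsed));
}
```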

    To avoid measuring option-processing time, the timestamp for measuring the
    initial Request/Response RTT sample is taken directly when the function is
    called (the Linux implementation always adds a timestamp on the Request,
    so there is no loss in doing this).

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This provides a timesource, conveniently used for DCCP timestamps, which
    returns the elapsed time since initialisation in units of 10 microseconds.
    This makes for a wrap-around time of about 11.9 hours, which should be
    sufficient for most applications.
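
    The wrap-around figure follows from a 32-bit counter of 10-microsecond
    ticks: 2^32 ticks at 100000 ticks per second is about 42949 seconds, or
    roughly 11.9 hours. A small sketch, assuming a 32-bit counter and using
    hypothetical names (ts10us_from_ns(), ts10us_wrap_seconds()):

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_TICK 10000u    /* one tick = 10 microseconds */
#define TICKS_PER_SEC 100000u

/* Convert nanoseconds since initialisation to 10-microsecond ticks;
 * the cast to 32 bits is where the wrap-around happens. */
static uint32_t ts10us_from_ns(uint64_t elapsed_ns)
{
    return (uint32_t)(elapsed_ns / NSEC_PER_TICK);
}

/* Whole seconds until the 32-bit tick counter wraps. */
static uint64_t ts10us_wrap_seconds(void)
{
    return (UINT64_C(1) << 32) / TICKS_PER_SEC;
}

/* Self-check of the properties described above. */
static void ts10us_selfcheck(void)
{
    assert(ts10us_from_ns(25000) == 2);
    assert(ts10us_wrap_seconds() == 42949);
    assert(ts10us_from_ns((UINT64_C(1) << 32) * NSEC_PER_TICK) == 0);
}
```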

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This patch reduces the number of timestamps taken in the receive path
    for each packet.

    The ccid3_hc_tx_update_x() routine is called
    * in the receive path, for each CCID3-controlled packet;
    * by the nofeedback timer (if no feedback arrives during 4 RTT).

    Currently, when there is no loss, each packet gets timestamped twice.
    The patch resolves this by recycling the first timestamp taken on packet
    reception for RTT sampling.

    When the no_feedback_timer() is called, then the timestamp argument is
    simply set to NULL - so that ccid3_hc_tx_update_x() takes care of the logic.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This patch makes /proc/net per network namespace. It modifies the global
    variables proc_net and proc_net_stat to be per network namespace.
    The proc_net file helpers are modified to take a network namespace argument,
    and all of their callers are fixed to pass &init_net for that argument.
    This ensures that all of the /proc/net files are only visible and
    usable in the initial network namespace until the code behind them
    has been updated to handle multiple network namespaces.

    Making /proc/net per namespace is necessary as at least some files
    in /proc/net depend upon the set of network devices which is per
    network namespace, and even more files in /proc/net have contents
    that are relevant to a single network namespace.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This trivial patch removes the unneeded pointer newdp, which is never used.

    Signed-off-by: Micah Gruber
    Signed-off-by: David S. Miller

    Micah Gruber
     
  • Hopefully captured all single statement cases under net/. I'm
    not too sure if there is some policy about #includes that are
    "guaranteed" (ie., in the current tree) to be available through
    some other #included header, so I just added linux/kernel.h to
    each changed file that didn't #include it previously.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Now to convert the ackvec code to ktime_t so that we can get rid of
    dccp_timestamp and the epoch thing in dccp_sock.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo