11 Oct, 2007

40 commits

  • This replaces normal addition with mod-48 addition so that sequence number
    wraparound is respected.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
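
The mod-48 addition described in this entry can be sketched as follows. This is a minimal illustration, not the kernel's code: the names `SEQ48_MASK` and `seq48_add` are hypothetical, and the sketch assumes a 48-bit sequence number stored in the low bits of a 64-bit integer.

```c
#include <stdint.h>

/* DCCP sequence numbers are 48 bits wide (RFC 4340); here they are
 * assumed to live in the low 48 bits of a 64-bit integer. */
#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/* Hypothetical helper: add `delta` to `seqno` modulo 2^48, so that the
 * result wraps around to 0 after the largest 48-bit value. */
static inline uint64_t seq48_add(uint64_t seqno, uint64_t delta)
{
    return (seqno + delta) & SEQ48_MASK;
}
```

With plain 64-bit addition, incrementing the largest 48-bit value would yield 2^48, which is not a valid sequence number; masking makes it wrap to 0 instead.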
     
  • This implements a SHOULD from RFC 4340, 7.5.4:
    "To protect against denial-of-service attacks, DCCP implementations SHOULD
    impose a rate limit on DCCP-Syncs sent in response to sequence-invalid packets,
    such as not more than eight DCCP-Syncs per second."

    The rate-limit is maintained on a per-socket basis. This is a more stringent
    policy than enforcing the rate-limit on a per-source-address basis and
    protects against attacks with forged source addresses.

    Moreover, the mechanism is deliberately kept simple. In contrast to
    xrlim_allow(), bursts of Sync packets in reply to sequence-invalid packets
    are not supported. This foils such attacks where the receipt of a Sync
    triggers further sequence-invalid packets. (I have tested this mechanism against
    the xrlim_allow algorithm for Syncs; permitting bursts just increases the problems.)

    In order to keep flexibility, the timeout parameter can be set via sysctl; and
    the whole mechanism can even be disabled (which is however not recommended).

    The algorithm in this patch has been improved with regard to wrapping issues
    thanks to a suggestion by Arnaldo.

    Committer note: Rate limited the step 6 DCCP_WARN too, as it says we're
    sending a sync.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
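
A per-socket limiter without burst support, as described in this entry, might look roughly like the sketch below. All names (`struct sync_limiter`, `sync_allowed`) are hypothetical, time is assumed to be a free-running 32-bit tick counter, and a gap of 0 is assumed to mean "disabled". The unsigned subtraction is one common way to stay correct across counter wraparound; it is not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-socket state for rate-limiting outgoing Syncs. */
struct sync_limiter {
    uint32_t last_sent; /* tick at which the last Sync was sent */
    uint32_t min_gap;   /* minimum ticks between Syncs; 0 disables the limit */
    bool     sent_any;  /* false until the first Sync has been sent */
};

/* Returns true if a Sync may be sent at tick `now` and records the send.
 * No burst credit is accumulated: at most one Sync per `min_gap` ticks. */
static bool sync_allowed(struct sync_limiter *rl, uint32_t now)
{
    /* Unsigned subtraction gives the elapsed ticks even if `now` has
     * wrapped past zero since `last_sent`. */
    if (rl->min_gap != 0 && rl->sent_any &&
        (uint32_t)(now - rl->last_sent) < rl->min_gap)
        return false;

    rl->last_sent = now;
    rl->sent_any = true;
    return true;
}
```

With millisecond ticks, a `min_gap` of 125 would correspond to the RFC's example of not more than eight Syncs per second.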
     
  • In this patch, duplicated code is removed for the case when a Reset packet is
    sent from a connected socket. This code duplication is between dccp_make_reset
    and dccp_transmit_skb, which already contained an (up to now entirely unused)
    switch statement to fill in the reset code from the DCCP_SKB_CB.

    The only thing that has been removed is the call to dst_clone(dst), since
    the queue_xmit functions use sk_dst_cache anyway.

    I wasn't sure which purpose inet_sk_rebuild_header served, so I left it in.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This adds fields to support the informational Data 1..3 fields of the
    DCCP-Reset packets (RFC 4340, 5.6), and makes minor cosmetic changes
    to documentation.
    Code which fills in these fields follows in subsequent patches; it is
    primarily used for reporting option-processing and feature-negotiation
    errors.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This adds a FIXME to signal that the function dccp_send_delayed_ack is not
    used anywhere in the entire DCCP/CCID code.

    Using a delayed Ack timer is suggested in 11.3 of RFC 4340, but it has also
    rather subtle implications for the Ack-Ratio-accounting.

    CCID2 does not use this (maybe it should).

    I think leaving the function in is good, in case someone wants to implement
    this.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This moves several instances of testing against NULL into the function which is
    used to de-reference the CCID-private data.

    Committer note: Made the BUG_ON depend on having CONFIG_IP_DCCP_CCID3_DEBUG, as it
    is too much to have this on production code. Also made sure that
    the macro is used only after checking if sk_state is not LISTEN,
    to make it equivalent to what we had before.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
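
The idea of folding the NULL test into the accessor, with the check compiled in only for debug builds, can be sketched as below. The types and names (`struct sock_stub`, `struct ccid3_tx`, `ccid3_tx_priv`, `CCID3_DEBUG`) are stand-ins invented for this illustration, and `assert` stands in for the kernel's BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the real socket and CCID structures are much larger. */
struct ccid3_tx { int state; };
struct sock_stub { void *ccid_private; };

/* Hypothetical accessor: the single place where the private pointer is
 * dereferenced, so the NULL test lives here rather than at every caller. */
static inline struct ccid3_tx *ccid3_tx_priv(const struct sock_stub *sk)
{
#ifdef CCID3_DEBUG
    /* Only checked in debug builds, to keep production code lean. */
    assert(sk->ccid_private != NULL);
#endif
    return sk->ccid_private;
}
```

Callers are still expected to rule out states in which the private data legitimately does not exist (the entry mentions LISTEN) before using the accessor.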
     
  • This fixes the code to correspond to RFC 4340, 7.5.4, which states the
    exception that a Sync received in state REQUEST generates a Reset (not
    a SyncAck).

    To achieve this, only a small change is required. Since
    dccp_rcv_request_sent_state_process() already uses the correct Reset Code
    number 4 ("Packet Error"), we only need to shift the if-statement a few
    lines further down.

    (To test this case: replace DCCP_PKT_RESPONSE with DCCP_PKT_SYNC
    in dccp_make_response.)

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This patch:
    - makes hidp_setup_input() return int to indicate errors;
    - checks its return value to handle errors.

    And this time it is against -rc7-mm1 tree.

    Thanks to roel and Marcel Holtmann for comments.

    Signed-off-by: WANG Cong
    Signed-off-by: Marcel Holtmann
    Signed-off-by: David S. Miller

    WANG Cong
     
  • Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • In case of ACK reordering, the SACK block might be valid in its
    time but is already obsoleted since we've received another kind
    of confirmation about arrival of the segments through snd_una
    advancement of an earlier packet.

    I didn't bother to build distinguishing of valid and invalid
    SACK blocks but simply made reordered SACK blocks that are too
    old always not counted regardless of their "real" validity which
    could be determined by using the ack field of the reordered
    packet (won't be significant IMHO).

    DSACKs can very well be considered useful even in this situation,
    so won't do any of this for them.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • I previously added the check at a rather poor position, since state
    has already been adjusted quite a bit by then. Moving it above
    all state changes should be more robust, though the return should
    never ever get executed regardless of its place :-).

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • The parameter `seq' of dccp_send_sync() is in fact an acknowledgement number
    and not a sequence number - thus renamed by this patch into `ackno'.

    Secondly, a `critical' warning is added when a Sync/SyncAck could not be sent.

    Sanity: I have checked all other functions that are called in dccp_transmit_skb;
    there are no clashes with the use of dccpd_ack_seq; no other function is
    using this slot at the same time.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This updates sequence number checking with regard to RFC 4340, 7.5.4.
    Missing in the code was an exception for sequence-invalid Reset packets,
    which get a Sync acknowledging GSR, instead of (as usual) P.seqno.

    This can lead to an oscillating ping-pong flood of Reset packets.

    In fact, it has been observed on the wire as follows:

    1. client establishes connection to server;
    2. before server can write to client, client crashes without notifying
    the server (NB: now no longer possible due to ABORT function);
    3. server sends DCCP-Data packet (has no ackno);
    4. client generates Reset "No Connection", seqno=0, increments seqno;
    5. server replies with Sync, using ackno = P.seqno;
    6. client generates Reset "No Connection" with seqno = ackno + 1;
    7. goto (5).

    The difference is that now in (5) the server uses GSR. This causes the
    Reset sent by the client in (6) to become sequence-valid, so that in (7)
    the vicious circle is broken; the Reset is then enqueued and causes the
    socket to enter TIMEWAIT state.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
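
The change in step (5) of this entry comes down to which number the Sync acknowledges. A minimal sketch, with hypothetical names (`enum pkt_type`, `sync_ackno`) and GSR passed in as a plain value:

```c
#include <stdint.h>

/* Simplified packet classification for this sketch only. */
enum pkt_type { PKT_DATA, PKT_RESET, PKT_OTHER };

/* Hypothetical helper: choose the acknowledgement number for a Sync sent
 * in reply to a sequence-invalid packet. A sequence-invalid Reset is
 * answered with GSR; every other packet type with its own sequence
 * number (P.seqno). */
static uint64_t sync_ackno(enum pkt_type type, uint64_t pkt_seqno, uint64_t gsr)
{
    return type == PKT_RESET ? gsr : pkt_seqno;
}
```

Because the peer derives the sequence number of its next Reset from the ackno it receives, acknowledging GSR lets that Reset land inside the valid window instead of prompting yet another Sync.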
     
  • This patch is in part required by the next patch; it

    * replaces 6 instances of `DCCP_SKB_CB(skb)->dccpd_seq' with `seqno';
    * replaces 7 instances of `DCCP_SKB_CB(skb)->dccpd_ack_seq' with `ackno';
    * replaces 1 use of dccp_inc_seqno() by unfolding `ADD48' macro in place.

    No changes in algorithm, all changes are text replacement/substitution.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • The third parameter of dccp_sample_rtt now becomes useless and is removed.

    Also combined the subtraction of the timestamp echo and the elapsed time.
    This is safe, since (a) presence of timestamp echo is tested first and (b)
    elapsed time is either present and non-zero or it is not set and equals 0
    due to the memset in dccp_parse_options.

    To avoid measuring option-processing time, the timestamp for measuring the
    initial Request/Response RTT sample is taken directly when the function is
    called (the Linux implementation always adds a timestamp on the Request,
    so there is no loss in doing this).

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
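
The combined subtraction described in this entry relies on the elapsed-time field being zero when the option is absent. The sketch below illustrates that reasoning with hypothetical names (`struct rx_options`, `rx_options_init`, `rtt_sample`). It assumes all values share one tick unit and, purely for illustration, treats a zero echo value as "no timestamp echo present".

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the parsed options of one packet. */
struct rx_options {
    uint32_t ts_echo; /* echoed timestamp; 0 taken to mean "absent" here */
    uint32_t elapsed; /* elapsed time reported by the peer, 0 if not set */
};

/* Zero everything before parsing, so unset fields read as 0. */
static void rx_options_init(struct rx_options *opt)
{
    memset(opt, 0, sizeof(*opt));
}

/* Returns 0 and stores the sample in *rtt, or -1 if there is no echo.
 * The elapsed time is subtracted unconditionally: if the option was not
 * present it is 0 and leaves the result unchanged. */
static int rtt_sample(uint32_t now, const struct rx_options *opt, uint32_t *rtt)
{
    if (opt->ts_echo == 0)
        return -1;
    *rtt = now - (opt->ts_echo + opt->elapsed);
    return 0;
}
```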
     
  • This provides a timesource, conveniently used for DCCP timestamps, which
    returns the elapsed time in 10s of microseconds since initialisation.
    This makes for a wrap-around time of about 11.9 hours, which should be
    sufficient for most applications.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
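
The wrap-around figure in this entry follows from a 32-bit counter in units of 10 microseconds: 2^32 * 10 us is about 42950 s, or roughly 11.9 hours. The sketch below reproduces that arithmetic. The 32-bit width is an assumption (it matches the 4-byte timestamp option of RFC 4340), and the helper names are hypothetical.

```c
#include <stdint.h>

#define TS_UNIT_USEC 10u /* one timestamp unit = 10 microseconds */

/* Convert microseconds since initialisation into timestamp units;
 * the cast to 32 bits is where the wrap-around happens. */
static inline uint32_t usecs_to_ts(uint64_t elapsed_usec)
{
    return (uint32_t)(elapsed_usec / TS_UNIT_USEC);
}

/* Difference between two timestamps; unsigned arithmetic keeps this
 * correct across a single wrap of the counter. */
static inline uint32_t ts_delta(uint32_t later, uint32_t earlier)
{
    return later - earlier;
}

/* Time until the 32-bit counter wraps, in hours. */
static inline double ts_wrap_hours(void)
{
    return 4294967296.0 * TS_UNIT_USEC / 1e6 / 3600.0;
}
```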
     
  • This patch reduces the number of timestamps taken in the receive path
    for each packet.

    The ccid3_hc_tx_update_x() routine is called in
    * the receive path for each CCID3-controlled packet
    * for the nofeedback timer (if no feedback arrives during 4 RTT)

    Currently, when there is no loss, each packet gets timestamped twice.
    The patch resolves this by recycling the first timestamp taken on packet
    reception for RTT sampling.

    When the no_feedback_timer() is called, then the timestamp argument is
    simply set to NULL - so that ccid3_hc_tx_update_x() takes care of the logic.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
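
The timestamp recycling described in this entry can be sketched as an optional argument: the receive path passes the stamp it already took, while the timer path passes NULL and lets the routine read the clock itself. The names below are hypothetical, and a counting stub clock replaces the real time source so the saving is visible.

```c
#include <stddef.h>
#include <stdint.h>

/* Stub clock for this sketch: counts how often it is read. */
static unsigned int clock_reads;
static uint64_t fake_now;

static uint64_t read_clock(void)
{
    clock_reads++;
    return fake_now;
}

/* Hypothetical core of the update routine: reuse the caller's timestamp
 * if one is supplied, otherwise take a fresh one. */
static uint64_t update_x_stamp(const uint64_t *stamp)
{
    return stamp != NULL ? *stamp : read_clock();
}
```

On the receive path the clock is then read once per packet (for the RTT sample) rather than twice; only the timer path triggers a read inside the update routine.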
     
  • Might as well rename ieee80211_cfg.h to cfg.h to keep things consistent.

    Signed-off-by: Michael Wu
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Michael Wu
     
  • Each station has a vlan_id that is useless. Remove it.

    Signed-off-by: Johannes Berg
    Signed-off-by: Michael Wu
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • The parse result typedef isn't needed.

    Signed-off-by: Johannes Berg
    Signed-off-by: Michael Wu
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • It's just painful to have the extra ieee80211_ prefix.

    Signed-off-by: Johannes Berg
    Signed-off-by: Michael Wu
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • This makes mac80211 print out the wiphy name instead of the
    master device name where appropriate.

    Signed-off-by: Johannes Berg
    Signed-off-by: Michael Wu
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • This fixes a warning about NUM_IEEE80211_MODES missing
    in a switch statement. Intentionally do not add a default
    case so we get warnings at these places if we need to add
    new modes.

    Signed-off-by: Johannes Berg
    Signed-off-by: Michael Wu
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
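
The reasoning behind leaving out the default case can be illustrated with a small sketch. The enum and function names here are hypothetical, not the mac80211 ones. With GCC's `-Wswitch` (part of `-Wall`), a switch over an enum that has no default label triggers a warning for every enumerator that lacks a case, so the sentinel is listed explicitly and any mode added later is flagged at compile time.

```c
/* Hypothetical mode list with a trailing count sentinel. */
enum phy_mode { MODE_A, MODE_B, MODE_G, NUM_MODES };

static const char *mode_name(enum phy_mode mode)
{
    switch (mode) {
    case MODE_A:
        return "802.11a";
    case MODE_B:
        return "802.11b";
    case MODE_G:
        return "802.11g";
    case NUM_MODES:
        /* Sentinel, not a real mode; listed only to silence the warning. */
        break;
    /* Deliberately no default: a newly added mode without a case
     * produces a compiler warning here. */
    }
    return "unknown";
}
```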
     
  • This patch removes the key threshold stuff from mac80211.
    I have patches for later that add it as a per-key setting
    to nl/cfg80211.

    Signed-off-by: Johannes Berg
    Signed-off-by: Michael Wu
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • This patch allows drivers to indicate bad FCS/PLCP CRC to the stack and
    have the stack drop packets like that except for monitor interfaces.

    Signed-off-by: Johannes Berg
    Signed-off-by: Michael Wu
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • This adds support for disabling the radio and setting the TXpower
    through wext.
    This also fixes the prism TXpower ioctl (It always overwrote the TXpower
    value in ieee80211_hw_config())

    Signed-off-by: Michael Buesch
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Michael Buesch
     
  • It seems I was actually able to hit this deadlock; on my quad G5, softmac
    locks up more often than not. This fixes it by using a separate workqueue
    that can safely be flushed under RTNL.

    Not sure if the patch is correct with the workqueue naming. And don't
    think with the patch it doesn't continually lock up. It still does, just
    doesn't invoke lockdep warnings all the time.

    Signed-off-by: Johannes Berg
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • For N CPUs, with full throttle traffic on all N CPUs, funneling traffic
    to the same ethernet device, the device's queue lock is contended by all
    N CPUs constantly. The TX lock is only contended by a max of 2 CPUs.
    In the current mode of operation, after all the work of entering the
    dequeue region, we may end up aborting the path if we are unable to get
    the tx lock and go back to contend for the queue lock. As N goes up,
    this gets worse.

    The changes in this patch result in a small increase in performance
    on a 4-CPU (2x dual-core) machine with no irq binding. Both e1000 and tg3
    showed similar behavior.

    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Jamal Hadi Salim
     
  • This patch replaces all occurrences of the static variable
    loopback_dev with a pointer loopback_dev. That provides the
    mindless, trivial, uninteresting part of the change for the dynamic
    allocation of the loopback device.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Daniel Lezcano
    Acked-By: Kirill Korotaev
    Acked-by: Benjamin Thery
    Signed-off-by: David S. Miller

    Daniel Lezcano
     
  • Signed-off-by: Johannes Berg
    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • There's no reason to clear the sacktag skb hint when a small part
    of the rexmit queue changes. Account for changes (if any) instead when
    fragmenting/collapsing. RTO/FRTO do not touch SACKED_ACKED bits, so
    there is no need to discard the SACK tag hint at all.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • Most of the description that follows comes from my mail to
    netdev (some editing done):

    The main obstacle to FRTO use is its deployment, as it has to be on
    the sender side whereas the wireless link is often the receiver's
    access link. Take initiative on behalf of unlucky receivers and
    enable it by default in future Linux TCP senders. Also, the IETF
    seems to be interested in advancing FRTO from experimental [1].

    How does FRTO help?
    ===================

    FRTO detects spurious RTOs and avoids a number of unnecessary
    retransmissions and a couple of other problems that can arise
    due to incorrect guess made at RTO (i.e., that segments were
    lost when they actually got delayed which is likely to occur
    e.g. in wireless environments with link-layer retransmission).
    Though FRTO cannot prevent the first (potentially unnecessary)
    retransmission at RTO, I suspect that it won't cost that much
    even if you have to pay for each bit (won't be that high
    percentage out of all packets after all :-)). However, usually
    when you have a spurious RTO, not only the first segment is
    unnecessarily retransmitted but the *whole window*. It goes like
    this: all cumulative ACKs got delayed due to in-order delivery,
    then TCP will actually send 1.5*original cwnd worth of data in
    the RTO's slow-start when the delayed ACKs arrive (basically the
    original cwnd worth of it unnecessarily). In case one is
    interested in minimizing unnecessary retransmissions e.g. due to
    cost, those rexmissions must never see daylight. Besides, in the
    worst case the generated burst overloads the bottleneck buffers
    which is likely to significantly delay the further progress of
    the flow. In case of ll rexmissions, ACK compression often
    occurs at the same time making the burst very "sharp edged" (in
    that case TCP often loses most of the segments above high_seq
    => very bad performance too). When FRTO is enabled, those
    unnecessary retransmissions are fully avoided except for the
    first segment and the cwnd behavior after detected spurious RTO
    is determined by the response (one can tune that by sysctl).

    In the basic version (the non-SACK-enhanced one), FRTO can fail to
    detect a spurious RTO as spurious and falls back to conservative
    behavior. ACK lossage is much less significant than reordering,
    usually the FRTO can detect spurious RTO if at least 2
    cumulative ACKs from original window are preserved (excluding
    the ACK that advances to high_seq). With SACK-enhanced version,
    the detection is quite robust.

    FRTO should remove the need to set a high lower bound for the
    RTO estimator due to delay spikes that occur relatively commonly
    in some environments (esp. in wireless/cellular ones).

    [1] http://www1.ietf.org/mail-archive/web/tcpm/current/msg02862.html

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • Basically this change enables it, previously other undo_marker
    users were left with nothing. Reverse undo_marker logic
    completely to get it set right in CA_Loss. On the other hand,
    when a spurious RTO is detected, clear it. Clearing might be too
    heavy for some scenarios but seems a safe enough starting point
    for now and shouldn't have much effect in the majority of
    cases (if in any).

    By adding a new FLAG_ we avoid looping through write_queue when
    RTO occurs.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • Implements following cleanups:
    - Comment re-placement (CodingStyle)
    - tcp_tso_acked() local (wrapper-like) variable removal
    (readability)
    - __-types removed (IMHO they make local variables jumpy looking
    and just waste space)
    - acked -> flag (naming conventions elsewhere in TCP code)
    - linebreak adjustments (readability)
    - nested if()s combined (reduced indentation)
    - clarifying newlines added

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • The accounting code is pretty much the same, so it's a shame
    we do it in two places.

    I'm not too sure if the added fully_acked check in MTU probing is
    really what we want; perhaps the added end_seq could be used in
    the after() comparison.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • In addition, fix its function comment spacing.

    Signed-off-by: Ilpo Järvinen

    Ilpo Järvinen
     
  • Subtraction for fackets_out is unconditional when snd_una
    advances, thus there's no need to do it inside the loop. Just
    make sure correct bounds are honored.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
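
Doing the subtraction once, outside the loop, while honoring the bounds might look like the sketch below. The helper name and the clamp-at-zero interpretation of "correct bounds" are assumptions made for illustration; the actual patch may express the bound differently.

```c
/* Hypothetical helper: reduce fackets_out by the number of packets newly
 * acknowledged when snd_una advances, without letting the unsigned
 * counter wrap below zero. */
static unsigned int fackets_after_ack(unsigned int fackets_out,
                                      unsigned int pkts_acked)
{
    return fackets_out > pkts_acked ? fackets_out - pkts_acked : 0;
}
```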
     
  • In general, it should not be necessary to call tcp_fragment for
    already SACKed skbs, but it's better to be safe than sorry. And
    indeed, it can be called from sacktag when a DSACK arrives or
    some ACK (with SACK) reordering occurs (sacktag could be made
    to avoid the call in the latter case though I'm not sure if it's
    worth the trouble and added complexity to cover such a marginal
    case).

    The collapse case returns earlier for the SACKED_ACKED case, so
    just WARN_ON if an internal inconsistency is detected for some
    reason.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • This is nicer than the MAC_FMT stuff.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • Return some useful information such as the maximum listen backlog and
    the current listen backlog in the tcp_info structure and
    INET_DIAG_INFO.

    Signed-off-by: Rick Jones
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Rick Jones