05 May, 2014

1 commit


31 Oct, 2013

1 commit


14 Jan, 2011

1 commit

  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
    Documentation/trace/events.txt: Remove obsolete sched_signal_send.
    writeback: fix global_dirty_limits comment runtime -> real-time
    ppc: fix comment typo singal -> signal
    drivers: fix comment typo diable -> disable.
    m68k: fix comment typo diable -> disable.
    wireless: comment typo fix diable -> disable.
    media: comment typo fix diable -> disable.
    remove doc for obsolete dynamic-printk kernel-parameter
    remove extraneous 'is' from Documentation/iostats.txt
    Fix spelling milisec -> ms in snd_ps3 module parameter description
    Fix spelling mistakes in comments
    Revert conflicting V4L changes
    i7core_edac: fix typos in comments
    mm/rmap.c: fix comment
    sound, ca0106: Fix assignment to 'channel'.
    hrtimer: fix a typo in comment
    init/Kconfig: fix typo
    anon_inodes: fix wrong function name in comment
    fix comment typos concerning "consistent"
    poll: fix a typo in comment
    ...

    Fix up trivial conflicts in:
    - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
    - fs/ext4/ext4.h

    Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.

    Linus Torvalds
     

07 Jan, 2011

1 commit

  • The 'seq_window' sysctl sets the initial value for the DCCP Sequence Window,
    which may range from 32..2^46-1 (RFC 4340, 7.5.2). The patch sets the upper
    bound consistently to 2^32-1 on both 32 and 64 bit systems, which should be
    sufficient - with a RTT of 1sec and 1-byte packets, a seq_window of 2^32-1
    corresponds to a link speed of 34 Gbps.
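
The 34 Gbps figure can be checked with a quick calculation: a full window of
packets has to fit into one round-trip time. The sketch below is only an
illustration of that arithmetic; the helper name `link_speed_bps` is made up
and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bit rate needed to send `window_pkts` packets of
 * `pkt_bytes` bytes each within one round-trip time of `rtt_sec` seconds.
 */
static double link_speed_bps(uint64_t window_pkts, uint32_t pkt_bytes,
                             double rtt_sec)
{
    assert(rtt_sec > 0.0);
    return (double)window_pkts * pkt_bytes * 8.0 / rtt_sec;
}
```

Calling `link_speed_bps(4294967295ULL, 1, 1.0)` gives (2^32-1) * 8 bits per
second, i.e. roughly 34.4 Gbps, which matches the estimate above.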

    Signed-off-by: Gerrit Renker

    Gerrit Renker
     

07 Dec, 2010

1 commit

  • This patch adds a generic infrastructure for policy-based dequeueing of
    TX packets and provides two policies:
    * a simple FIFO policy (which is the default) and
    * a priority based policy (set via socket options).
    Both policies honour the tx_qlen sysctl for the maximum size of the write
    queue (can be overridden via socket options).

    The priority policy uses skb->priority internally to assign a u32 priority
    identifier, using the same ranking as SO_PRIORITY. The skb->priority field
    is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
    data using cmsg(3); the patch also provides the requisite parsing routines.
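
From user space, attaching a priority to an outgoing packet might look roughly
like the sketch below. It only fills in the control buffer; sending is left to
the caller. The constants are assumptions: they are meant to mirror SOL_DCCP
and DCCP_SCM_PRIORITY from the kernel headers and are given local EX_ names so
that the block compiles without <linux/dccp.h>. The helper name
`attach_priority` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values, mirroring SOL_DCCP and DCCP_SCM_PRIORITY. */
#define EX_SOL_DCCP          269
#define EX_DCCP_SCM_PRIORITY 1

/*
 * Hypothetical helper: point msg at a control buffer holding one priority
 * cmsg. `buf` must be suitably aligned for struct cmsghdr and at least
 * CMSG_SPACE(sizeof(uint32_t)) bytes long.
 */
static void attach_priority(struct msghdr *msg, void *buf, size_t buflen,
                            uint32_t prio)
{
    struct cmsghdr *cmsg;

    assert(buflen >= CMSG_SPACE(sizeof(prio)));
    memset(buf, 0, buflen);
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(prio));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = EX_SOL_DCCP;
    cmsg->cmsg_type = EX_DCCP_SCM_PRIORITY;
    cmsg->cmsg_len = CMSG_LEN(sizeof(prio));
    memcpy(CMSG_DATA(cmsg), &prio, sizeof(prio));
}
```

The prepared msghdr, with msg_iov pointing at the payload, would then be
handed to sendmsg(2) on a socket for which the priority policy has been
selected via socket option.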

    Signed-off-by: Tomasz Grobelny
    Signed-off-by: Gerrit Renker

    Tomasz Grobelny
     

16 Nov, 2010

1 commit

  • Some of the documentation refers to web pages under
    the domain `osdl.org'. However, `osdl.org' now
    redirects to `linuxfoundation.org'.

    Rather than rely on redirections, this patch updates
    the addresses appropriately; for the most part, only
    documentation that is meant to be current has been
    updated.

    The patch should be pretty quick to scan and check;
    each new web-page URL was obtained by trying out the
    original URL in a browser and then simply copying
    the redirected URL (formatting as necessary).

    There is some conflict as to which one of these domain
    names is preferred:

    linuxfoundation.org
    linux-foundation.org

    So, I wrote:

    info@linuxfoundation.org

    and got this reply:

    Message-ID:
    Date: Mon, 15 Nov 2010 10:41:42 -0800
    From: David Ames

    ...

    linuxfoundation.org is preferred. The canonical name for our web site is
    www.linuxfoundation.org. Our list site is actually
    lists.linux-foundation.org.

    Regarding email linuxfoundation.org is preferred there are a few people
    who choose to use linux-foundation.org for their own reasons.

    Consequently, I used `linuxfoundation.org' for web pages and
    `lists.linux-foundation.org' for mailing-list web pages and email addresses;
    the only personal email address I updated from `@osdl.org' was that of
    Andrew Morton, who prefers `linux-foundation.org' according to `git log'.

    Signed-off-by: Michael Witten
    Signed-off-by: Jiri Kosina

    Michael Witten
     

31 Aug, 2010

2 commits

  • This makes RTAX_RTO_MIN also available to CCID-3, replacing the compile-time
    RTO lower bound with a per-route tunable value.

    The original Kconfig option solved the problem that a very low RTT (in the
    order of HZ) can trigger too frequent and unnecessary reductions of the
    sending rate.

    This tunable does not affect the initial RTO value of 2 seconds specified in
    RFC 5348, section 4.2 and Appendix B. But like the hardcoded Kconfig value,
    it allows adapting to network conditions.

    The same effect as the original Kconfig option of 100ms is now achieved by

    > ip route replace to unicast 192.168.0.0/24 rto_min 100j dev eth0

    (assuming HZ=1000).

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • Using a fixed RTO_MIN of 0.2 seconds was found to cause problems for CCID-2
    over 802.11g: at least once per session there was a spurious timeout. It
    helped to then increase the value of RTO_MIN over this link.

    Since the problem is the same as in TCP, this patch makes the solution from
    commit "05bb1fad1cde025a864a90cfeb98dcbefe78a44a"
    "[TCP]: Allow minimum RTO to be configurable via routing metrics."
    available to DCCP.

    This avoids reinventing the wheel, so that e.g. the following works in the
    expected way now also for CCID-2:

    > ip route change 10.0.0.2 rto_min 800 dev ath0

    Luckily this useful rto_min function was recently moved to net/tcp.h,
    which simplifies sharing code originating from TCP.

    Documentation also updated (plus minor whitespace fixes).

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     

13 Feb, 2010

1 commit

  • This fixes a problem in the DCCP getsockopt() API: currently there is no way
    for a user to a priori know the number of built-in CCIDs, other than trying
    DCCP_SOCKOPT_AVAILABLE_CCIDS in a loop, incrementing the option length until
    EINVAL is no longer returned.

    This patch truncates the array to the user-provided length. No copy is made
    when the length is
    Signed-off-by: David S. Miller

    Gerrit Renker
     

22 Jan, 2009

1 commit

  • This adds full support for local/remote Sequence Window feature, from which the
    * sequence-number-validity (W) and
    * acknowledgment-number-validity (W') windows
    derive as specified in RFC 4340, 7.5.3.

    Specifically, the following is contained in this patch:
    * integrated new socket fields into dccp_sk;
    * updated the update_gsr/gss routines with regard to these fields;
    * updated handler code: the Sequence Window feature is located at the TX side,
    so the local feature is meant if the handler-rx flag is false;
    * the initialisation of `rcv_wnd' in reqsk is removed, since
    - rcv_wnd is not used by the code anywhere;
    - sequence number checks are not done in the LISTEN state (cf. 7.5.3);
    - dccp_check_req checks the Ack number validity more rigorously;
    * the `struct dccp_minisock' became empty and is now removed.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: David S. Miller

    Gerrit Renker
     

08 Dec, 2008

3 commits

  • This removes the use of the sysctl and the minisock variable for the Send Ack
    Vector feature, as it now is handled fully dynamically via feature negotiation
    (i.e. when CCID-2 is enabled, Ack Vectors are automatically enabled as per
    RFC 4341, 4.).

    Using a sysctl in parallel to this implementation would open the door to
    crashes, since much of the code relies on tests of the boolean minisock /
    sysctl variable. Thus, this patch replaces all tests of type

        if (dccp_msk(sk)->dccpms_send_ack_vector)
                /* ... */
    with
        if (dp->dccps_hc_rx_ackvec != NULL)
                /* ... */

    The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
    negotiation concluded that Ack Vectors are to be used on the half-connection.
    Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
    so that the test is a valid one.

    The activation handler for Ack Vectors is called as soon as the feature
    negotiation has concluded at the
    * server when the Ack marking the transition RESPOND => OPEN arrives;
    * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.

    Adding the sequence number of the Response packet to the Ack Vector has been
    removed, since
    (a) connection establishment implies that the Response has been received;
    (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
    this entry will always be ignored;
    (c) it can not be used for anything useful - to detect loss for instance, only
    packets received after the loss can serve as pseudo-dupacks.

    There was a FIXME to change the error code when dccp_ackvec_add() fails.
    I removed this after finding out that:
    * the check whether ackno < ISN is already made earlier,
    * this Response is likely the 1st packet with an Ackno that the client gets,
    * so when dccp_ackvec_add() fails, the reason is likely not a packet error.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • Updating the NDP count feature is handled automatically now:
    * for CCID-2 it is disabled, since the code does not use NDP counts;
    * for CCID-3 it is enabled, as NDP counts are used to determine loss lengths.

    Allowing the user to change NDP values leads to unpredictable and failing
    behaviour, since it is then possible to disable NDP counts even when they
    are needed (e.g. in CCID-3).

    This means that only those user settings are sensible that agree with the
    values for Send NDP Count implied by the choice of CCID. But those settings
    are already activated by the feature negotiation (CCID dependency tracking),
    hence this form of support is redundant.

    At startup the initialisation of the NDP count feature uses the default
    value of 0, which is done implicitly by the zeroing-out of the socket when
    it is allocated. If the choice of CCID or feature negotiation enables NDP
    count, this will then be updated via the NDP activation handler.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • The TX/RX CCIDs of the minisock are now redundant: similar to the Ack Vector
    case, their value initially equals that of the sysctl, but may differ by the
    end of feature negotiation.

    The old interface removed by this patch thus has been replaced by the newer
    interface to dynamically query the currently loaded CCIDs.

    Also removed are the constructors for the TX CCID and the RX CCID, since the
    switch "rx non-rx" is done by the handler in minisocks.c (and the handler
    is the only place in the code where CCIDs are loaded).

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: David S. Miller

    Gerrit Renker
     

24 Nov, 2008

1 commit

  • With this patch, TX/RX CCIDs can now be changed on a per-connection
    basis, which overrides the defaults set by the global sysctl variables
    for TX/RX CCIDs.

    To make full use of this facility, the remaining patches of this patch
    set are needed, which track dependencies and activate negotiated
    feature values.

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     

17 Nov, 2008

1 commit

  • This patch deprecates the Ack Ratio sysctl, since
    * Ack Ratio is entirely ignored by CCID-3 and CCID-4;
    * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1);
    * even if it worked in CCID-2, there is no point for a user to change it:
    - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2),
    - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts
    (since waiting for Acks which will never arrive in this window),
    - cwnd is not a user-configurable value.

    The only reasonable place for Ack Ratio is to print it for debugging. It is
    planned to do this later on, as part of e.g. dccp_probe.

    With this patch Ack Ratio is now under full control of feature negotiation:
    * Ack Ratio is resolved as a dependency of the selected CCID;
    * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to
    the default of 2, following RFC 4340, 11.3 - "New connections start with Ack
    Ratio 2 for both endpoints";
    * what happens then is part of another patch set, since it concerns the
    dynamic update of Ack Ratio while the connection is in full flight.

    Thanks to Tomasz Grobelny for discussion leading up to this patch.

    Signed-off-by: Gerrit Renker
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     

12 Nov, 2008

1 commit

  • This provides a data structure to record which CCIDs are locally supported
    and three accessor functions:
    - a test function for internal use which is used to validate CCID requests
    made by the user;
    - a copy function so that the list can be used for feature-negotiation;
    - documented getsockopt() support so that the user can query capabilities.

    The data structure is a table which is filled in at compile-time with the
    list of available CCIDs (which in turn depends on the Kconfig choices).

    Using the copy function for cloning the list of supported CCIDs is useful for
    feature negotiation, since the negotiation is now with the full list of available
    CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation
    will not fail if the peer requests to use CCID3 instead of CCID2.
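
A user-space query of the supported CCIDs might look like the following sketch.
The option is assumed to be DCCP_SOCKOPT_AVAILABLE_CCIDS (value 12) at level
SOL_DCCP (269), returning one byte per CCID; the values are given local EX_
names so that the block compiles without <linux/dccp.h>, and the wrapper name
`dccp_available_ccids` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Assumed values, mirroring SOL_DCCP and DCCP_SOCKOPT_AVAILABLE_CCIDS. */
#define EX_SOL_DCCP                     269
#define EX_DCCP_SOCKOPT_AVAILABLE_CCIDS 12

/*
 * Hypothetical wrapper: store up to `max` CCID numbers in ccids[].
 * Returns the number of usable entries, or -1 on error (errno is set).
 */
static int dccp_available_ccids(int fd, uint8_t *ccids, size_t max)
{
    socklen_t len = (socklen_t)max;

    assert(ccids != NULL && max > 0);
    if (getsockopt(fd, EX_SOL_DCCP, EX_DCCP_SOCKOPT_AVAILABLE_CCIDS,
                   ccids, &len) < 0)
        return -1;

    /* Clamp, in case the reported length exceeds what fit in the buffer. */
    return len > max ? (int)max : (int)len;
}
```

With both CCIDs built in, a caller would expect the array to contain {2, 3},
as in the example list above.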

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: David S. Miller

    Gerrit Renker
     

29 Jan, 2008

5 commits

  • This adds a socket option and signalling support for the case where the server
    holds timewait state on closing the connection, as described in RFC 4340, 8.3.

    Since holding timewait state at the server is the unusual case, it is enabled
    via a socket option. Documentation for this socket option has been added.

    The setsockopt statement has been made resilient against different possible cases
    of expressing boolean `true' values using a suggestion by Ian McDonald.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This patch addresses the following problems:

    1. DCCP relies for its proper functioning on having at least one CCID module
    enabled (as in TCP pluggable congestion control). Currently it is possible to
    disable both CCIDs and thus leave the DCCP module in a compiled, but entirely
    non-functional state: no sockets can be created when no CCID is available.
    Furthermore, the protocol is (again like TCP) not intended to be used without
    CCIDs. Last, a non-empty CCID list is needed for doing CCID feature negotiation.

    2. Internally the default CCID that is advertised by the Linux host is set to CCID2
    (DCCPF_INITIAL_CCID in include/linux/dccp.h). Disabling CCID2 in the Kconfig
    menu without changing the defaults leads to a failure `module not found' when
    trying to load the dccp module (which internally tries to load the default CCID).

    3. The specification (RFC 4340, sec. 10) treats CCID2 somewhat like a
    `minimum common denominator'; the specification says that:

    * "New connections start with CCID 2 for both endpoints"

    * "A DCCP implementation intended for general use, such as an implementation in a
    general-purpose operating system kernel, SHOULD implement at least CCID 2.
    The intent is to make CCID 2 broadly available for interoperability [...]"

    Providing CCID2 as minimum-required CCID (like Reno/Cubic in TCP) thus seems reasonable.

    Hence this patch automatically selects CCID2 when DCCP is enabled. Documentation also added.

    Discussions with Ian McDonald on this subject are gratefully acknowledged.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This updates the DCCP documentation, following input from Ian McDonald,
    clarifying the status of DCCP, and adding a note about the test tree.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This extends the DCCP socket API by honouring any shutdown(2) option set by the user.
    The behaviour is, as much as possible, made consistent with the API for TCP's shutdown.

    This patch exploits the information provided by the user via the socket API to reduce
    processing costs:
    * if the read end is closed (SHUT_RD), it is not necessary to deliver to input CCID;
    * if the write end is closed (SHUT_WR), the same idea applies, but with a difference -
    as long as the TX queue has not been drained, we need to receive feedback to keep
    congestion-control rates up to date. Hence SHUT_WR is honoured only after the last
    packet (under congestion control) has been sent;
    * although SHUT_RDWR seems nonsensical, it is nevertheless supported in the same manner
    as for TCP (and agrees with test for SHUTDOWN_MASK in dccp_poll() in net/dccp/proto.c).

    Furthermore, most of the code already honours the sk_shutdown flags (dccp_recvmsg() for
    instance sets the read length to 0 if SHUT_RD had been called); CCID handling is now added
    to this by the present patch.

    Delivery also stops once one of dccp_close(), dccp_fin(), or dccp_done() has
    been called, which is fine since at that stage the connection is in its
    final stages.

    Motivation and background are on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/shutdown

    A FIXME has been added to notify the other end if SHUT_RD has been set (RFC 4340, 11.7).

    Note: There is a comment in inet_shutdown() in net/ipv4/af_inet.c which asks to "make
    sure the socket is a TCP socket". This should probably be extended to mean
    `TCP or DCCP socket' (the code is also used by UDP and raw sockets).

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     

11 Oct, 2007

4 commits

  • This corrects erroneous documentation of the socket API.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This adds documentation on the use of service codes on client and
    server.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This enables applications to query the current value of the Maximum
    Packet Size via a socket option, suggested as a SHOULD in (RFC 4340,
    p. 102).

    This socket option is useful to avoid the annoying bail-out via
    `-EMSGSIZE', in particular since fragmentation is not currently
    supported (and its use is partly discouraged in RFC 4340).

    With this option, it is possible to size buffers accordingly, e.g.

        int buflen = dccp_get_cur_mps(sockfd);

        /* or */
        if (msgsize > dccp_get_cur_mps(sockfd))
                die("message is too large for this path");
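
Note that dccp_get_cur_mps() is shorthand here rather than an existing library
call. A minimal sketch of such a wrapper follows, assuming the option is
DCCP_SOCKOPT_GET_CUR_MPS (value 5) at level SOL_DCCP (269) and that it returns
an int; both values are given local EX_ names so that the block compiles
without <linux/dccp.h>. The helper `message_fits` is likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Assumed values, mirroring SOL_DCCP and DCCP_SOCKOPT_GET_CUR_MPS. */
#define EX_SOL_DCCP                 269
#define EX_DCCP_SOCKOPT_GET_CUR_MPS 5

/* Returns the current Maximum Packet Size, or -1 on error (errno is set). */
static int dccp_get_cur_mps(int sockfd)
{
    int mps = 0;
    socklen_t len = sizeof(mps);

    if (getsockopt(sockfd, EX_SOL_DCCP, EX_DCCP_SOCKOPT_GET_CUR_MPS,
                   &mps, &len) < 0)
        return -1;
    return mps;
}

/* Returns 1 if a message of msgsize bytes fits into one packet, else 0. */
static int message_fits(int mps, size_t msgsize)
{
    assert(mps > 0);    /* caller has already handled the error case */
    return msgsize <= (size_t)mps;
}
```

Checking the return value for -1 before sizing buffers avoids treating an
error as a packet size.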

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This implements a SHOULD from RFC 4340, 7.5.4:
    "To protect against denial-of-service attacks, DCCP implementations SHOULD
    impose a rate limit on DCCP-Syncs sent in response to sequence-invalid packets,
    such as not more than eight DCCP-Syncs per second."

    The rate-limit is maintained on a per-socket basis. This is a more stringent
    policy than enforcing the rate-limit on a per-source-address basis and
    protects against attacks with forged source addresses.

    Moreover, the mechanism is deliberately kept simple. In contrast to
    xrlim_allow(), bursts of Sync packets in reply to sequence-invalid packets
    are not supported. This foils such attacks where the receipt of a Sync
    triggers further sequence-invalid packets. (I have tested this mechanism against
    the xrlim_allow algorithm for Syncs; permitting bursts just increases the problems.)

    In order to keep flexibility, the timeout parameter can be set via sysctl; and
    the whole mechanism can even be disabled (which is however not recommended).

    The algorithm in this patch has been improved with regard to wrapping issues
    thanks to a suggestion by Arnaldo.

    Committer note: Rate limited the step 6 DCCP_WARN too, as it says we're
    sending a sync.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     

26 Apr, 2007

1 commit


12 Dec, 2006

1 commit

  • As Eddie Kohler points out, the RFC is a Proposed Standard, not experimental.
    Also removed documentation about deprecated socket option.

    Signed-off-by: Ian McDonald
    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Ian McDonald
     

03 Dec, 2006

4 commits

  • This one got lost on the way from Ian to Gerrit to me, fix it.

    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Ian McDonald
     
  • This patch just updates DCCP documentation a bit.

    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Ian McDonald
     
  • This adds 3 sysctls which govern the retransmission behaviour of DCCP control
    packets (3way handshake, feature negotiation).

    It removes 4 FIXMEs from the code.

    The close resemblance of sysctl variables to their TCP analogues is emphasised
    not only by their name, but also by giving them the same initial values.
    This is useful since there is not much practical experience with DCCP yet.

    Furthermore, with regard to the previous patch, it is now possible to limit
    the number of keepalive-Responses by setting net.dccp.default.request_retries
    (also a bit like in TCP).

    Lastly, added documentation of all existing DCCP sysctls.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This patch does the following:
    a) introduces variable-length checksums as specified in [RFC 4340, sec. 9.2]
    b) provides necessary socket options and documentation as to how to use them
    c) basic support and infrastructure for the Minimum Checksum Coverage feature
    [RFC 4340, sec. 9.2.1]: acceptability tests, user notification and user
    interface

    In addition, it

    (1) fixes two bugs in the DCCPv4 checksum computation:
    * pseudo-header used checksum_len instead of skb->len
    * incorrect checksum coverage calculation based on dccph_x
    (2) removes dccp_v4_verify_checksum() since it duplicates code of the
    checksum computation; code calling this function is updated accordingly.
    (3) now uses skb_checksum(), which is safer than checksum_partial() if the
    sk_buff is a non-linear buffer (has pages attached to it).
    (4) fixes an outstanding TODO item:
    * If P.CsCov is too large for the packet size, drop packet and return.
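
The coverage rule of RFC 4340, 9.2, as I read it, can be sketched as follows:
CsCov 0 covers the whole packet, while values 1..15 cover the header and
options plus the first (CsCov-1)*4 bytes of application data. The function
below is a user-space illustration with made-up names, not the kernel routine;
it also shows the "CsCov too large" test from item (4).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes covered by the checksum.
 *   doff_words - Data Offset field, in 32-bit words (header plus options)
 *   cscov      - Checksum Coverage field, 0..15
 *   pkt_len    - total length of the DCCP packet in bytes
 * Returns -1 if the coverage exceeds the packet, i.e. drop the packet.
 */
static long cscov_bytes(unsigned int doff_words, unsigned int cscov,
                        size_t pkt_len)
{
    size_t cov;

    assert(cscov <= 15);        /* CsCov is a 4-bit field */
    if (cscov == 0)
        return (long)pkt_len;   /* entire packet is covered */

    cov = ((size_t)doff_words + cscov - 1) * 4;
    return cov > pkt_len ? -1 : (long)cov;
}
```

For a 16-byte header without options (Data Offset 4), CsCov 1 covers the
header only, and CsCov 3 covers the header plus 8 bytes of payload.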

    The code has been tested with applications; the latest version of tcpdump now
    comes with support for partial DCCP checksums.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     

25 Sep, 2006

1 commit

  • This has been discussed on dccp@vger and removes the necessity for applications
    to supply service codes in each and every case.

    If an application does not want to provide a service code, that's fine, it will
    be given 0. Otherwise, service codes can be set via socket options as before.

    This patch has been tested using various client/server configurations
    (including listening on multiple service codes).

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     

11 Nov, 2005

1 commit