20 Dec, 2011

1 commit

  • Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
    limiting the autoclose value. If userspace passes in -1 on a 32-bit
    platform, the overflow check didn't work and autoclose would be set
    to 0xffffffff.

    This patch defines a max_autoclose (in seconds) for limiting the value
    and exposes it through sysctl, with the following intentions.

    1) Avoid overflowing autoclose * HZ.

    2) Keep the default autoclose bound consistent across 32- and 64-bit
    platforms (INT_MAX / HZ in this patch).

    3) Keep the autoclose value consistent between setsockopt() and
    getsockopt() calls.

    Suggested-by: Vlad Yasevich
    Signed-off-by: Xi Wang
    Signed-off-by: David S. Miller

    Xi Wang
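
The bound described in this commit can be sketched in user-space C as follows. This is a minimal illustration, not the kernel patch itself: `HZ_ASSUMED`, `max_autoclose_default()`, `clamp_autoclose()` and `autoclose_to_ticks()` are hypothetical names, and the tick rate is fixed at 1000 purely for the example.

```c
#include <limits.h>

/* Assumed tick rate for illustration; the kernel's HZ is a build-time option. */
#define HZ_ASSUMED 1000

/* Default upper bound in seconds, chosen so that seconds * HZ fits in an int. */
static unsigned int max_autoclose_default(void)
{
    return INT_MAX / HZ_ASSUMED;
}

/* Clamp a user-supplied autoclose value (in seconds) to the configured bound.
 * Storing the clamped value means a later read returns what is really in use. */
static unsigned int clamp_autoclose(unsigned int requested,
                                    unsigned int max_autoclose)
{
    return requested > max_autoclose ? max_autoclose : requested;
}

/* Convert an already clamped value to ticks; with the default bound the
 * product stays at or below INT_MAX. */
static unsigned long autoclose_to_ticks(unsigned int seconds)
{
    return (unsigned long)seconds * HZ_ASSUMED;
}
```

With this shape, a request of -1 (0xffffffff as an unsigned value) is reduced to the bound before it is multiplied, so the same limit applies on 32- and 64-bit builds.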
     

25 Aug, 2011

1 commit


02 Jun, 2011

1 commit

  • In this case, the SCTP association transmits an ASCONF packet
    including addition of the new IP address and deletion of the old
    address. This patch implements this functionality.
    In this case, the ASCONF chunk is added to the beginning of the
    queue, because the other chunks cannot be transmitted in this state.

    Signed-off-by: Michio Honda
    Signed-off-by: YOSHIFUJI Hideaki
    Acked-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Michio Honda
     

01 Jun, 2011

1 commit


26 May, 2011

1 commit


13 Apr, 2011

2 commits

  • Since we cannot update the retran path to unconfirmed transports,
    when we remove a peer the retran path may not be updated if the
    other transports are all unconfirmed, and we will still be using
    the removed transport as the retran path. This may cause a panic
    if a retransmit happens.

    Signed-off-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • commit fbdf501c9374966a56829ecca3a7f25d2b49a305
    sctp: Do no select unconfirmed transports for retransmissions

    Introduced the initial fault.

    commit d598b166ced20d9b9281ea3527c0e18405ddb803
    sctp: Make sure we always return valid retransmit path

    Solved the problem, but forgot to change the DEBUG statement.
    Thus it was still possible to dereference a NULL pointer.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

31 Mar, 2011

1 commit


08 Mar, 2011

1 commit


27 Aug, 2010

1 commit

  • Change SCTP_DEBUG_PRINTK and SCTP_DEBUG_PRINTK_IPADDR to
    use do { print } while (0) guards.
    Add SCTP_DEBUG_PRINTK_CONT to fix errors in log when
    lines were continued.
    Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
    Add a missing newline in "Failed bind hash alloc"

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
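
A short sketch of why the `do { ... } while (0)` guard matters for a debug-print macro. The names here (`MODNAME`, `DEBUG_PRINT`, `debug_enabled`, `printed`, `check_positive`) are hypothetical and stand in for the kernel macros; `MODNAME` is an assumed replacement for `KBUILD_MODNAME`, which only exists in a kernel build.

```c
#include <stdio.h>

#define MODNAME "sctp"                 /* assumed stand-in for KBUILD_MODNAME */
#define pr_fmt(fmt) MODNAME ": " fmt   /* prefix every message with the module name */

static int debug_enabled = 1;
static int printed;                    /* counts messages, for illustration only */

/* The do/while (0) wrapper turns the macro body into a single statement,
 * so it can be followed by ';' and used safely in an if/else. */
#define DEBUG_PRINT(fmt, ...)                      \
    do {                                           \
        if (debug_enabled) {                       \
            printed++;                             \
            printf(pr_fmt(fmt), __VA_ARGS__);      \
        }                                          \
    } while (0)

static int check_positive(int value)
{
    if (value > 0)
        DEBUG_PRINT("value %d is positive\n", value);
    else                               /* still pairs with the outer if */
        return -1;
    return 0;
}
```

Without the guard, a macro that expands to a bare `if (...) { ... }` followed by `;` would either fail to compile in front of the `else` or bind the `else` to the wrong `if`.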
     

17 Jun, 2010

1 commit


18 May, 2010

1 commit

  • This patch removes from net/ (but not any netfilter files)
    all the unnecessary return; statements that precede the
    last closing brace of void functions.

    It does not remove the returns that are immediately
    preceded by a label as gcc doesn't like that.

    Done via:
    $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
    xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

04 May, 2010

1 commit


01 May, 2010

4 commits

  • rwnd_press tracks the pressure on the receive window. Every
    time the receive buffer overflows, we truncate the receive
    window and then grow it back. However, if we don't track
    the cumulative pressure, it's possible to reach a situation
    where the receive buffer is empty but rwnd stays truncated.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • Right now, sctp transports are not fully initialized and when
    adding any new fields, they have to be explicitly initialized.
    This is prone to mistakes. So we switch to calling kzalloc()
    which makes things much simpler.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • commit 4951feda0c60d1ef681f1a270afdd617924ab041
    sctp: Do no select unconfirmed transports for retransmissions

    added code to make sure that we do not select unconfirmed paths
    for data transmission. This caused a problem when there are only
    2 paths, 1 unconfirmed and 1 unreachable. In that case, the next
    retransmit path returned is NULL and that causes a kernel crash.

    The solution is to only change retransmit paths if we found one to use.

    Reported-by: Frank Schuster
    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • An unconfirmed transport is one that we have not been
    able to reach since the beginning. There is no point in
    trying to retransmit data on those transports. Also, the
    specification forbids it due to security issues.

    Reported-by: Frank Schuster

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
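
The last two commits in this group (skip unconfirmed transports, and only change the retransmit path if a usable one was found) can be sketched together. This is a simplified user-space model under stated assumptions, not the kernel's selection logic: the types and the functions `find_next_retran_path()` and `update_retran_path()` are hypothetical, and the search here simply takes the first other active path.

```c
#include <stddef.h>

enum transport_state { T_ACTIVE, T_INACTIVE, T_UNCONFIRMED };

struct transport {
    enum transport_state state;
};

struct assoc {
    struct transport *paths;        /* array of peer transports */
    size_t npaths;
    struct transport *retran_path;  /* path currently used for retransmits */
};

/* Look for another path that is usable for retransmission.
 * Unconfirmed and inactive paths are never chosen. May return NULL. */
static struct transport *find_next_retran_path(const struct assoc *a)
{
    size_t i;

    for (i = 0; i < a->npaths; i++) {
        struct transport *t = &a->paths[i];

        if (t != a->retran_path && t->state == T_ACTIVE)
            return t;
    }
    return NULL;
}

/* Only switch when a candidate was found, so retran_path never becomes NULL. */
static void update_retran_path(struct assoc *a)
{
    struct transport *next = find_next_retran_path(a);

    if (next)
        a->retran_path = next;
}
```

In the two-path case from the commit message (one unconfirmed, one unreachable), the search finds nothing and the current path is kept instead of being replaced by NULL.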
     

29 Apr, 2010

1 commit


24 Nov, 2009

5 commits

  • We use the idr subsystem and always ask for an id
    at or above 1. This results in id reuse when one
    association is terminated while another is created.

    To prevent re-use, we keep track of the last id returned
    and ask for that id + 1 as a base for each query. We let
    the idr spin lock protect this base id as well.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • When setting the autoclose timeout in jiffies there is a possible
    integer overflow if the value in seconds is very large
    (e.g. for 2^22 s with HZ=1024). The problem appears even on
    64-bit due to the integer promotion rules. The fix is just a cast
    to unsigned long.

    Signed-off-by: Andrei Pelinescu-Onciul
    Signed-off-by: Vlad Yasevich

    Andrei Pelinescu-Onciul
     
  • The current implementation of max.burst ends up limiting new
    data during the cwnd decay period. The decay is happening because
    the connection is idle and we are allowed to fill the congestion
    window. The point of max.burst is to limit micro-bursts in response
    to large acks. This still happens, as max.burst is still applied
    to each transmit opportunity. It will also apply if a very large
    send is made (greater than allowed by burst).

    Tested-by: Florian Niederbacher
    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • We currently send window update SACKs every time we free up 1 PMTU
    worth of data. That is a lot more SACKs than necessary. Instead, we'll
    now send back the actual window every time we send a SACK, and do
    window-update SACKs when a fraction of the receive buffer has been
    opened. The fraction is controlled with a sysctl.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • When sctp_connectx() is used, we pick the first address as
    primary, even though it may not have worked. This results
    in excessive retransmits and poor performance. We should
    select the address that the association was established with.

    Reported-by: Thomas Dreibholz
    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
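
The autoclose overflow fixed by Andrei Pelinescu-Onciul's commit in this group can be reproduced in a few lines. The sketch below is an illustration with assumptions: `HZ_ASSUMED`, `ticks_without_cast()` and `ticks_with_cast()` are hypothetical names, and fixed-width types are used instead of `unsigned long` so the result does not depend on the platform (the commit itself casts to `unsigned long`).

```c
#include <stdint.h>

#define HZ_ASSUMED 1024u   /* tick rate from the commit's example */

/* Both operands are 32 bits wide, so the multiplication is done in 32 bits
 * and wraps before the result is widened to 64 bits. */
static uint64_t ticks_without_cast(uint32_t seconds)
{
    uint32_t hz = HZ_ASSUMED;

    return seconds * hz;
}

/* Widening one operand first makes the whole multiplication 64 bits wide. */
static uint64_t ticks_with_cast(uint32_t seconds)
{
    return (uint64_t)seconds * HZ_ASSUMED;
}
```

For 2^22 seconds the product is 2^32, which wraps to 0 in the first function and is preserved in the second. This is why the problem also shows up on 64-bit systems: the width of the operands decides the width of the arithmetic, not the width of the variable that receives the result.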
     

14 Nov, 2009

1 commit

  • Recent commit 8da645e101a8c20c6073efda3c7cc74eec01b87f
    sctp: Get rid of an extra routing lookup when adding a transport
    introduced a regression in the connection setup. The behavior was
    different between IPv4 and IPv6. The IPv4 case ended up working because the
    route lookup routine returned a NULL route, which triggered another
    route lookup later in the output path that succeeded. In the IPv6 case,
    a valid route was returned for the first call, but we could not find a valid
    source address at the time since the source addresses were not set on the
    association yet. This resulted in a hung connection.

    The solution is to set the source addresses on the association prior to
    adding peers.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

05 Sep, 2009

5 commits

  • We used to perform 2 routing lookups for a new transport: one
    just for path mtu detection, and one to actually route to destination
    and path mtu update when sending a packet. There is no point in doing
    both of them, especially since the first one just for path mtu doesn't
    take into account source address and sometimes gives the wrong route,
    causing path mtu updates anyway.

    We now do just the one call to do both route to destination and get
    path mtu updates.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • The Add-IP feature allows users to delete an active transport. If that
    transport has chunks in flight, those chunks need to be moved to another
    transport or the association may get into an unrecoverable state.

    Reported-by: Rafael Laufer
    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • We had a bug where we never stored the user-defined value for
    MAXSEG when setting the value on an association. Thus future
    PMTU events ended up re-writing the frag point and increasing
    it past the user limit. Additionally, when setting the option on
    the socket/endpoint, we affect all current associations, which
    is against spec.

    Now, we store the user 'maxseg' value along with the computed
    'frag_point'. We inherit 'maxseg' from the socket at association
    creation and use it as an upper limit for 'frag_point' when it's
    set.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • SCTP has a problem that when small chunks are used, it is possible
    to exhaust the receive buffer without fully closing the receive window.
    This happens due to all the overhead that we have to account for with small
    messages. To fix this, when the receive buffer is exceeded, we'll drop
    the window to 0 and save the 'drop' portion. When the application starts
    reading data and freeing up receive buffer space, we'll wait until
    we've reached the 'drop' window and then add back this 'drop' one
    mtu at a time. This worked well in testing and under stress produced
    rather even recovery.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • When the sctp transport is marked down, we can release the
    cached route and force a new lookup when attempting to use
    this transport for anything. This way, if a better route
    or source address is available, we'll try to use it.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
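
The receive-window recovery described in the small-chunk commit of this group can be modelled roughly as below. This is a sketch built only from the commit text, with hypothetical names (`struct rwnd_state`, `on_buffer_exceeded()`, `on_data_freed()`); the accumulation with `+=` is an assumption made so that repeated overflows are not lost, and the kernel code may differ in detail.

```c
struct rwnd_state {
    unsigned int rwnd;        /* currently advertised receive window */
    unsigned int rwnd_press;  /* window withheld after a buffer overflow */
};

/* Receive buffer exceeded: close the window and remember what was dropped. */
static void on_buffer_exceeded(struct rwnd_state *s)
{
    s->rwnd_press += s->rwnd;  /* assumed: accumulate across overflows */
    s->rwnd = 0;
}

/* Application freed 'len' bytes. Once the window has grown back to the
 * withheld amount, return the withheld part at most one MTU at a time. */
static void on_data_freed(struct rwnd_state *s, unsigned int len,
                          unsigned int mtu)
{
    s->rwnd += len;

    if (s->rwnd_press && s->rwnd >= s->rwnd_press) {
        unsigned int change = mtu < s->rwnd_press ? mtu : s->rwnd_press;

        s->rwnd += change;
        s->rwnd_press -= change;
    }
}
```

Returning the withheld window in MTU-sized steps is what gives the gradual recovery the commit message mentions, rather than reopening the full window in one jump.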
     

03 Jun, 2009

5 commits


09 Oct, 2008

1 commit

  • The tsn map currently in use is 4K large and is stuck inside
    the sctp_association structure, making memory references REALLY
    expensive. What we really need is at most 4K worth of bits,
    so the biggest map we would have is 512 bytes. Also, the
    map is only really useful when we have gaps to store and
    report. As such, starting with a minimal map of, say, 32 TSNs (bits)
    should be enough for normal low-loss operations. We can grow
    the map by some multiple of 32 along with some extra room any
    time we receive a TSN which would put us outside of the map
    boundary. As we close gaps, we can shift the map to rebase
    it on the latest TSN we've seen. This saves 4088 bytes per
    association just in the map alone, along with savings from the now
    unnecessary structure members.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
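
A growable bitmap along the lines of this commit might look like the sketch below. It is a user-space illustration under assumptions: the names (`struct tsn_map`, `map_init()`, `map_grow()`, `map_mark()`) are hypothetical, growth rounds up to the next multiple of 32 bits without the "extra room" the commit mentions, and the shift that rebases the map when gaps close is left out.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAP_INITIAL_BITS 32u
#define MAP_MAX_BITS     4096u   /* 4K bits = 512 bytes at most */

struct tsn_map {
    uint32_t base;          /* TSN represented by bit 0 */
    unsigned int nbits;     /* current capacity in bits */
    unsigned char *bits;
};

static int map_init(struct tsn_map *m, uint32_t base)
{
    m->base = base;
    m->nbits = MAP_INITIAL_BITS;
    m->bits = calloc(MAP_INITIAL_BITS / 8, 1);
    return m->bits ? 0 : -1;
}

/* Grow to hold at least need_bits, rounded up to a multiple of 32. */
static int map_grow(struct tsn_map *m, unsigned int need_bits)
{
    unsigned int new_bits = (need_bits + 31u) / 32u * 32u;
    unsigned char *p;

    if (new_bits > MAP_MAX_BITS)
        return -1;
    p = calloc(new_bits / 8, 1);
    if (!p)
        return -1;
    memcpy(p, m->bits, m->nbits / 8);
    free(m->bits);
    m->bits = p;
    m->nbits = new_bits;
    return 0;
}

/* Record a received TSN; assumes tsn is at or after base. */
static int map_mark(struct tsn_map *m, uint32_t tsn)
{
    uint32_t gap = tsn - m->base;

    if (gap >= MAP_MAX_BITS)
        return -1;                      /* outside the largest possible map */
    if (gap >= m->nbits && map_grow(m, gap + 1) < 0)
        return -1;
    m->bits[gap / 8] |= (unsigned char)(1u << (gap % 8));
    return 0;
}
```

The map starts at 4 bytes and only reaches the 512-byte ceiling when losses actually spread received TSNs that far apart.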
     

19 Sep, 2008

1 commit

  • If an INIT-ACK is received with a SupportedExtensions parameter which
    indicates that the peer does not support AUTH, the packet will be
    silently ignored, and sctp_process_init() cleans up all of the
    transports in the association.
    When the T1-Init timer expires, an OOPS happens while we try to choose
    a different init transport.

    The solution is to only clean up the non-active transports, i.e.
    the ones that the peer added. However, that introduces a problem
    with sctp_connectx(), because we don't mark the proper state for
    the transports provided by the user. So, we'll simply mark
    user-provided transports as ACTIVE. That will allow INIT
    retransmissions to work properly in the sctp_connectx() context
    and prevent the crash.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

26 Jul, 2008

1 commit

  • Removes legacy reinvent-the-wheel type thing. The generic
    machinery integrates much better to automated debugging aids
    such as kerneloops.org (and others), and is unambiguous due to
    better naming. Non-intuitively BUG_TRAP() is actually equal to
    WARN_ON() rather than BUG_ON() though some might actually be
    promoted to BUG_ON() but I left that to future.

    I could make at least one BUILD_BUG_ON conversion.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

19 Jul, 2008

1 commit

  • valgrind reports uninitialized memory accesses when running
    sctp inside the network simulation cradle simulator:

    Conditional jump or move depends on uninitialised value(s)
    at 0x570E34A: sctp_assoc_sync_pmtu (associola.c:1324)
    by 0x57427DA: sctp_packet_transmit (output.c:403)
    by 0x5710EFF: sctp_outq_flush (outqueue.c:824)
    by 0x5710B88: sctp_outq_uncork (outqueue.c:701)
    by 0x5745262: sctp_cmd_interpreter (sm_sideeffect.c:1548)
    by 0x57444B7: sctp_side_effects (sm_sideeffect.c:976)
    by 0x5744460: sctp_do_sm (sm_sideeffect.c:945)
    by 0x572157D: sctp_primitive_ASSOCIATE (primitive.c:94)
    by 0x5725C04: __sctp_connect (socket.c:1094)
    by 0x57297DC: sctp_connect (socket.c:3297)

    Conditional jump or move depends on uninitialised value(s)
    at 0x575D3A5: mod_timer (timer.c:630)
    by 0x5752B78: sctp_cmd_hb_timers_start (sm_sideeffect.c:555)
    by 0x5754133: sctp_cmd_interpreter (sm_sideeffect.c:1448)
    by 0x5753607: sctp_side_effects (sm_sideeffect.c:976)
    by 0x57535B0: sctp_do_sm (sm_sideeffect.c:945)
    by 0x571E9AE: sctp_endpoint_bh_rcv (endpointola.c:474)
    by 0x573347F: sctp_inq_push (inqueue.c:104)
    by 0x572EF93: sctp_rcv (input.c:256)
    by 0x5689623: ip_local_deliver_finish (ip_input.c:230)
    by 0x5689759: ip_local_deliver (ip_input.c:268)
    by 0x5689CAC: ip_rcv_finish (dst.h:246)

    #1 is due to "if (t->pmtu_pending)".
    8a4794914f9cf2681235ec2311e189fe307c28c7 "[SCTP] Flag a pmtu change request"
    suggests it should be initialized to 0.

    #2 is the heartbeat timer 'expires' value, which is uninitialised, but
    tested by mod_timer().
    T3_rtx_timer seems to be affected by the same problem, so initialize it, too.

    Signed-off-by: Florian Westphal
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Florian Westphal
     

21 Jun, 2008

1 commit


20 Jun, 2008

1 commit

  • RFC 4960, Section 11.4. Protection of Non-SCTP-Capable Hosts

    When an SCTP stack receives a packet containing multiple control or
    DATA chunks and the processing of the packet requires the sending of
    multiple chunks in response, the sender of the response chunk(s) MUST
    NOT send more than one packet. If bundling is supported, multiple
    response chunks that fit into a single packet MAY be bundled together
    into one single response packet. If bundling is not supported, then
    the sender MUST NOT send more than one response chunk and MUST
    discard all other responses. Note that this rule does NOT apply to a
    SACK chunk, since a SACK chunk is, in itself, a response to DATA and
    a SACK does not require a response of more DATA.

    We implement this by not servicing our outqueue until we reach the end
    of the packet. This enables maximum bundling. We also identify
    'response' chunks and make sure that we only send 1 packet when sending
    such chunks.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich