24 Nov, 2009

5 commits

  • We use the idr subsystem and always ask for an id
    at or above 1. This results in id reuse when one
    association is terminated while another is created.

    To prevent re-use, we keep track of the last id returned
    and ask for that id + 1 as a base for each query. We let
    the idr spin lock protect this base id as well.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
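
    The moving-base idea described above can be sketched in user space.
    This is a minimal illustration, not the kernel code: struct id_pool,
    pool_get_above() and the wrap back to 1 are hypothetical stand-ins
    for the idr query, and locking is omitted.

```c
#include <stdbool.h>

#define MAX_IDS 64              /* hypothetical pool size for the sketch */

struct id_pool {
    bool in_use[MAX_IDS + 1];   /* index 0 is never handed out */
    int  last_id;               /* last id returned; 0 means none yet */
};

/* Lowest free id >= base, or -1 if none; stands in for the idr query. */
static int pool_get_above(struct id_pool *p, int base)
{
    for (int id = base; id <= MAX_IDS; id++) {
        if (!p->in_use[id]) {
            p->in_use[id] = true;
            return id;
        }
    }
    return -1;
}

/* Old behaviour: always search from 1, so a just-freed id is reused at once. */
static int alloc_id_from_one(struct id_pool *p)
{
    return pool_get_above(p, 1);
}

/* New behaviour: search from last_id + 1. Wrapping to 1 when the range
 * is exhausted is an assumption of this sketch. */
static int alloc_id_after_last(struct id_pool *p)
{
    int id = pool_get_above(p, p->last_id + 1);

    if (id < 0)
        id = pool_get_above(p, 1);
    if (id > 0)
        p->last_id = id;
    return id;
}

static void free_id(struct id_pool *p, int id)
{
    if (id >= 1 && id <= MAX_IDS)
        p->in_use[id] = false;
}
```

    With ids 1 and 2 allocated and id 1 then freed, alloc_id_from_one()
    hands out 1 again, while alloc_id_after_last() moves on to 3.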
     
  • When setting the autoclose timeout in jiffies there is a possible
    integer overflow if the value in seconds is very large
    (e.g. for 2^22 s with HZ=1024). The problem appears even on
    64-bit due to the integer promotion rules. The fix is just a cast
    to unsigned long.

    Signed-off-by: Andrei Pelinescu-Onciul
    Signed-off-by: Vlad Yasevich

    Andrei Pelinescu-Onciul
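
    The promotion problem above can be reproduced in user space. The
    sketch below assumes a 32-bit seconds value and uses SKETCH_HZ as a
    hypothetical stand-in for HZ; the function names are illustrative
    and not taken from the kernel.

```c
#include <stdint.h>

#define SKETCH_HZ 1024u   /* hypothetical tick rate, matching the HZ=1024 case */

/* The product is computed in 32-bit unsigned arithmetic and only then
 * widened to unsigned long, so it wraps for large inputs. */
static unsigned long timeout_jiffies_narrow(uint32_t seconds)
{
    return seconds * SKETCH_HZ;
}

/* Widening one operand first makes the multiplication itself happen
 * in unsigned long. */
static unsigned long timeout_jiffies_wide(uint32_t seconds)
{
    return (unsigned long)seconds * SKETCH_HZ;
}
```

    For 2^22 seconds the narrow version wraps to 0, while the wide
    version yields 2^32 where unsigned long is 64 bits. Where unsigned
    long is 32 bits wide the cast alone does not help, since the result
    still does not fit.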
     
  • The current implementation of max.burst ends up limiting new
    data during the cwnd decay period. The decay happens because
    the connection is idle, and we are allowed to fill the congestion
    window. The point of max.burst is to limit micro-bursts in response
    to large acks. This still happens, as max.burst is still applied
    to each transmit opportunity. It will also apply if a very large
    send is made (greater than allowed by burst).

    Tested-by: Florian Niederbacher
    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
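
    One way to apply max.burst per transmit opportunity, without
    shrinking cwnd itself, is sketched below. The helper name and its
    parameters are hypothetical; the actual patch may structure this
    differently.

```c
#include <stdint.h>

/* Bytes that may be sent in one transmit opportunity (hypothetical helper).
 * cwnd itself is left untouched; only the room for this opportunity is capped. */
static uint32_t burst_limited_room(uint32_t cwnd, uint32_t flight_size,
                                   uint32_t max_burst, uint32_t pmtu)
{
    uint32_t burst_bytes = max_burst * pmtu;
    uint32_t limit = cwnd;

    /* Cap the usable window at flight_size plus max_burst packets. */
    if (flight_size + burst_bytes < cwnd)
        limit = flight_size + burst_bytes;

    return limit > flight_size ? limit - flight_size : 0;
}
```

    With cwnd = 60000, nothing in flight, max_burst = 4 and a 1500-byte
    PMTU, the sender may emit 6000 bytes in this opportunity even though
    the congestion window would allow far more.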
     
  • We currently send window update SACKs every time we free up 1 PMTU
    worth of data. That is a lot more SACKs than necessary. Instead, we'll
    now send back the actual window every time we send a SACK, and do
    window-update SACKs when a fraction of the receive buffer has been
    opened. The fraction is controlled with a sysctl.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
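
    The decision described above might look like the sketch below. It
    assumes the fraction is expressed as a right shift of the receive
    buffer size and that the threshold never drops below one PMTU; both
    are assumptions of this sketch, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the window has opened enough since the last advertisement
 * to justify a window-update SACK. `shift` stands in for the sysctl:
 * the threshold is rcvbuf / 2^shift, but never less than one PMTU.
 * `shift` must be smaller than 32. */
static bool should_send_window_update(uint32_t rwnd, uint32_t advertised_rwnd,
                                      uint32_t rcvbuf, uint32_t pmtu,
                                      unsigned int shift)
{
    uint32_t threshold = rcvbuf >> shift;

    if (threshold < pmtu)
        threshold = pmtu;
    if (rwnd <= advertised_rwnd)
        return false;
    return rwnd - advertised_rwnd >= threshold;
}
```

    With a 65536-byte receive buffer and shift = 2, an update goes out
    only once the window has grown by at least 16384 bytes since the
    last advertisement.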
     
  • When sctp_connectx() is used, we pick the first address as
    primary, even though it may not have worked. This results
    in excessive retransmits and poor performance. We should
    select the address that the association was established with.

    Reported-by: Thomas Dreibholz
    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     

14 Nov, 2009

1 commit

  • Recent commit 8da645e101a8c20c6073efda3c7cc74eec01b87f
    "sctp: Get rid of an extra routing lookup when adding a transport"
    introduced a regression in the connection setup. The behavior was
    different between IPv4 and IPv6. The IPv4 case ended up working because the
    route lookup routine returned a NULL route, which triggered another
    route lookup later in the output path that succeeded. In the IPv6 case,
    a valid route was returned for the first call, but we could not find a valid
    source address at the time since the source addresses were not set on the
    association yet. This resulted in a hung connection.

    The solution is to set the source addresses on the association prior to
    adding peers.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

05 Sep, 2009

5 commits

  • We used to perform 2 routing lookups for a new transport: one
    just for path mtu detection, and one to actually route to destination
    and path mtu update when sending a packet. There is no point in doing
    both of them, especially since the first one just for path mtu doesn't
    take into account source address and sometimes gives the wrong route,
    causing path mtu updates anyway.

    We now do just the one call to do both route to destination and get
    path mtu updates.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • The Add-IP feature allows users to delete an active transport. If that
    transport has chunks in flight, those chunks need to be moved to another
    transport or the association may get into an unrecoverable state.

    Reported-by: Rafael Laufer
    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • We had a bug where we never stored the user-defined value for
    MAXSEG when setting the value on an association. Thus future
    PMTU events ended up re-writing the frag point and increasing
    it past the user limit. Additionally, when setting the option on
    the socket/endpoint, we affect all current associations, which
    is against spec.

    Now, we store the user 'maxseg' value along with the computed
    'frag_point'. We inherit 'maxseg' from the socket at association
    creation and use it as an upper limit for 'frag_point' when it's
    set.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
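
    The clamping rule above can be sketched as follows. SKETCH_OVERHEAD
    is a made-up header overhead and compute_frag_point() is a
    hypothetical helper; only the "user value caps the computed value"
    logic comes from the commit message.

```c
#include <stdint.h>

#define SKETCH_OVERHEAD 48u  /* hypothetical per-packet header overhead in bytes */

/* Recompute the fragmentation point after a PMTU event.
 * user_maxseg == 0 means the user never set a limit. */
static uint32_t compute_frag_point(uint32_t pmtu, uint32_t user_maxseg)
{
    uint32_t frag = pmtu > SKETCH_OVERHEAD ? pmtu - SKETCH_OVERHEAD : 0;

    /* The stored user value is an upper bound that PMTU growth cannot exceed. */
    if (user_maxseg != 0 && frag > user_maxseg)
        frag = user_maxseg;
    return frag;
}
```

    Because the user value is stored rather than folded into the frag
    point once, a later PMTU increase (say from 1500 to 9000) still
    leaves the result at the user's limit.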
     
  • SCTP has a problem that when small chunks are used, it is possible
    to exhaust the receive buffer without fully closing the receive window.
    This happens due to all the overhead that we have to account for with small
    messages. To fix this, when the receive buffer is exceeded, we'll drop
    the window to 0 and save the 'drop' portion. When the application starts
    reading data and freeing up receive buffer space, we'll wait until
    we've reached the 'drop' window and then add back this 'drop' one
    mtu at a time. This worked well in testing and under stress produced
    rather even recovery.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
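
    A user-space sketch of the 'drop' bookkeeping described above. The
    struct and function names are hypothetical, and the sketch ignores
    the per-chunk overhead accounting that triggers the condition in
    the first place.

```c
#include <stdint.h>

struct rx_window {
    uint32_t rwnd;   /* currently advertised receive window */
    uint32_t drop;   /* window withheld when the buffer was exceeded */
};

/* Receive buffer exceeded: close the window and remember what was dropped. */
static void rx_buffer_exceeded(struct rx_window *w)
{
    w->drop += w->rwnd;
    w->rwnd = 0;
}

/* The application read `len` bytes. Once the window has climbed back to
 * the dropped amount, return the dropped portion one MTU at a time. */
static void rx_data_freed(struct rx_window *w, uint32_t len, uint32_t mtu)
{
    w->rwnd += len;
    if (w->drop != 0 && w->rwnd >= w->drop) {
        uint32_t change = w->drop < mtu ? w->drop : mtu;

        w->rwnd += change;
        w->drop -= change;
    }
}
```

    Starting from a 3000-byte window that is dropped to 0, the withheld
    3000 bytes only begin to come back after 3000 bytes have been read,
    and then in steps of at most one MTU per read.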
     
  • When the sctp transport is marked down, we can release the
    cached route and force a new lookup when attempting to use
    this transport for anything. This way, if a better route
    or source address is available, we'll try to use it.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     

03 Jun, 2009

5 commits


09 Oct, 2008

1 commit

  • The tsn map currently in use is 4K large and is stuck inside
    the sctp_association structure making memory references REALLY
    expensive. What we really need is at most 4K worth of bits
    so the biggest map we would have is 512 bytes. Also, the
    map is only really useful when we have gaps to store and
    report. As such, starting with a minimal map of say 32 TSNs (bits)
    should be enough for normal low-loss operations. We can grow
    the map by some multiple of 32 along with some extra room any
    time we receive a TSN which would put us outside of the map
    boundary. As we close gaps, we can shift the map to rebase
    it on the latest TSN we've seen. This saves 4088 bytes per
    association just in the map alone, along with savings from the now
    unnecessary structure members.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
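
    The growable bitmap described above might be sketched like this in
    user space. All names are hypothetical, the growth policy (round up
    to a multiple of 32, plus 32 bits of extra room) is one possible
    reading of the message, and the rebasing shift is left out for
    brevity.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAP_INITIAL_BITS 32u
#define MAP_MAX_BITS     4096u   /* 4096 bits = 512 bytes */

struct tsn_map {
    uint32_t base;     /* TSN represented by bit 0 */
    uint32_t nbits;    /* current capacity in bits, a multiple of 32 */
    uint32_t *words;   /* nbits / 32 words of storage */
};

static int tsn_map_init(struct tsn_map *m, uint32_t base)
{
    m->base = base;
    m->nbits = MAP_INITIAL_BITS;
    m->words = calloc(m->nbits / 32, sizeof(uint32_t));
    return m->words ? 0 : -1;
}

/* Grow to hold at least need_bits (<= MAP_MAX_BITS): round up to a
 * multiple of 32 and add 32 bits of extra room, capped at the maximum. */
static int tsn_map_grow(struct tsn_map *m, uint32_t need_bits)
{
    uint32_t new_bits = (need_bits + 31u) / 32u * 32u + 32u;
    uint32_t *w;

    if (new_bits > MAP_MAX_BITS)
        new_bits = MAP_MAX_BITS;
    w = calloc(new_bits / 32, sizeof(uint32_t));
    if (!w)
        return -1;
    memcpy(w, m->words, m->nbits / 32 * sizeof(uint32_t));
    free(m->words);
    m->words = w;
    m->nbits = new_bits;
    return 0;
}

/* Record a received TSN; returns -1 if it falls outside the maximum map. */
static int tsn_map_mark(struct tsn_map *m, uint32_t tsn)
{
    uint32_t off = tsn - m->base;   /* unsigned wrap handles TSN rollover */

    if (off >= MAP_MAX_BITS)
        return -1;
    if (off >= m->nbits && tsn_map_grow(m, off + 1) != 0)
        return -1;
    m->words[off / 32] |= 1u << (off % 32);
    return 0;
}

static int tsn_map_test(const struct tsn_map *m, uint32_t tsn)
{
    uint32_t off = tsn - m->base;

    if (off >= m->nbits)
        return 0;
    return (int)((m->words[off / 32] >> (off % 32)) & 1u);
}
```

    A map based at TSN 100 stays at 32 bits while TSNs arrive in order;
    marking TSN 140 grows it to 96 bits, and a TSN more than 4096 ahead
    of the base is rejected.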
     

19 Sep, 2008

1 commit

  • If an INIT-ACK is received with a SupportedExtensions parameter which
    indicates that the peer does not support AUTH, the packet will be
    silently ignored, and sctp_process_init() will clean up all of the
    transports in the association.
    When the T1-Init timer expires, an OOPS happens while we try to choose
    a different init transport.

    The solution is to only clean up the non-active transports, i.e.
    the ones that the peer added. However, that introduces a problem
    with sctp_connectx(), because we don't mark the proper state for
    the transports provided by the user. So, we'll simply mark
    user-provided transports as ACTIVE. That will allow INIT
    retransmissions to work properly in the sctp_connectx() context
    and prevent the crash.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

26 Jul, 2008

1 commit

  • Removes a legacy reinvent-the-wheel type thing. The generic
    machinery integrates much better with automated debugging aids
    such as kerneloops.org (and others), and is unambiguous due to
    better naming. Non-intuitively, BUG_TRAP() is actually equal to
    WARN_ON() rather than BUG_ON(), though some might actually be
    promoted to BUG_ON(); I left that for the future.

    I could make at least one BUILD_BUG_ON conversion.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

19 Jul, 2008

1 commit

  • valgrind reports uninitialized memory accesses when running
    sctp inside the network simulation cradle simulator:

    Conditional jump or move depends on uninitialised value(s)
    at 0x570E34A: sctp_assoc_sync_pmtu (associola.c:1324)
    by 0x57427DA: sctp_packet_transmit (output.c:403)
    by 0x5710EFF: sctp_outq_flush (outqueue.c:824)
    by 0x5710B88: sctp_outq_uncork (outqueue.c:701)
    by 0x5745262: sctp_cmd_interpreter (sm_sideeffect.c:1548)
    by 0x57444B7: sctp_side_effects (sm_sideeffect.c:976)
    by 0x5744460: sctp_do_sm (sm_sideeffect.c:945)
    by 0x572157D: sctp_primitive_ASSOCIATE (primitive.c:94)
    by 0x5725C04: __sctp_connect (socket.c:1094)
    by 0x57297DC: sctp_connect (socket.c:3297)

    Conditional jump or move depends on uninitialised value(s)
    at 0x575D3A5: mod_timer (timer.c:630)
    by 0x5752B78: sctp_cmd_hb_timers_start (sm_sideeffect.c:555)
    by 0x5754133: sctp_cmd_interpreter (sm_sideeffect.c:1448)
    by 0x5753607: sctp_side_effects (sm_sideeffect.c:976)
    by 0x57535B0: sctp_do_sm (sm_sideeffect.c:945)
    by 0x571E9AE: sctp_endpoint_bh_rcv (endpointola.c:474)
    by 0x573347F: sctp_inq_push (inqueue.c:104)
    by 0x572EF93: sctp_rcv (input.c:256)
    by 0x5689623: ip_local_deliver_finish (ip_input.c:230)
    by 0x5689759: ip_local_deliver (ip_input.c:268)
    by 0x5689CAC: ip_rcv_finish (dst.h:246)

    #1 is due to "if (t->pmtu_pending)".
    8a4794914f9cf2681235ec2311e189fe307c28c7 "[SCTP] Flag a pmtu change request"
    suggests it should be initialized to 0.

    #2 is the heartbeat timer 'expires' value, which is uninitialised, but
    tested by mod_timer().
    T3_rtx_timer seems to be affected by the same problem, so initialize it, too.

    Signed-off-by: Florian Westphal
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Florian Westphal
     

21 Jun, 2008

1 commit


20 Jun, 2008

1 commit

  • RFC 4960, Section 11.4. Protection of Non-SCTP-Capable Hosts

    When an SCTP stack receives a packet containing multiple control or
    DATA chunks and the processing of the packet requires the sending of
    multiple chunks in response, the sender of the response chunk(s) MUST
    NOT send more than one packet. If bundling is supported, multiple
    response chunks that fit into a single packet MAY be bundled together
    into one single response packet. If bundling is not supported, then
    the sender MUST NOT send more than one response chunk and MUST
    discard all other responses. Note that this rule does NOT apply to a
    SACK chunk, since a SACK chunk is, in itself, a response to DATA and
    a SACK does not require a response of more DATA.

    We implement this by not servicing our outqueue until we reach the end
    of the packet. This enables maximum bundling. We also identify
    'response' chunks and make sure that we only send 1 packet when sending
    such chunks.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

17 Jun, 2008

2 commits


10 Jun, 2008

1 commit


05 Jun, 2008

1 commit


10 May, 2008

1 commit


13 Apr, 2008

1 commit


06 Mar, 2008

1 commit


07 Feb, 2008

1 commit

  • When an ASCONF chunk is received with a serial number less than needed, the kernel
    will treat this chunk as a retransmitted ASCONF chunk and look up the cached
    ASCONF-ACK chunk using sctp_assoc_lookup_asconf_ack(). But this function
    will always return non-NULL, so responding with the cached ASCONF-ACK chunk
    will cause a kernel panic.
    In function sctp_assoc_lookup_asconf_ack(), if the cached ASCONF-ACK
    list asconf_ack_list is empty, or if the serial being requested does not
    exist, the function as it currently stands returns the actual
    list_head asoc->asconf_ack_list, which is not a cached ASCONF-ACK chunk
    but a bogus pointer.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Vlad Yasevich

    Wei Yongjun
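
    The failure mode above is easy to reproduce with any circular list
    that uses a sentinel head. The sketch below is a simplified
    user-space model: struct ack_node and its helpers are hypothetical,
    and the sentinel has the same type as the entries, which is not the
    case for a kernel list_head embedded in another structure.

```c
#include <stddef.h>
#include <stdint.h>

struct ack_node {
    struct ack_node *next;   /* circular: the last node points back to the head */
    uint32_t serial;         /* meaningless in the sentinel head */
};

static void ack_list_init(struct ack_node *head)
{
    head->next = head;
}

static void ack_list_add(struct ack_node *head, struct ack_node *n)
{
    n->next = head->next;
    head->next = n;
}

/* Buggy shape: when nothing matches, the cursor ends up back on the
 * sentinel and that sentinel is returned as if it were a cached entry. */
static struct ack_node *ack_lookup_buggy(struct ack_node *head, uint32_t serial)
{
    struct ack_node *pos;

    for (pos = head->next; pos != head; pos = pos->next)
        if (pos->serial == serial)
            break;
    return pos;
}

/* Fixed shape: return from inside the loop on a match, NULL otherwise. */
static struct ack_node *ack_lookup(struct ack_node *head, uint32_t serial)
{
    struct ack_node *pos;

    for (pos = head->next; pos != head; pos = pos->next)
        if (pos->serial == serial)
            return pos;
    return NULL;
}
```

    On an empty list, or for a serial that is not cached, the buggy
    lookup returns the list head itself, so a caller that only checks
    for NULL goes on to use it as a chunk.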
     

05 Feb, 2008

1 commit

  • I was notified by Randy Stewart that lksctp claims to be
    "the reference implementation". First of all, "the
    reference implementation" was the original implementation
    of SCTP in userspace written by Randy and a few others.
    Second, after looking at the definition of 'reference implementation',
    we don't really meet the requirements.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     

29 Jan, 2008

3 commits

  • The processing of the ASCONF chunks has changed a lot in the
    spec. New items are:
    1. A list of ASCONF-ACK chunks is now cached
    2. The source of the packet is used in response.
    3. New handling for unexpected ASCONF chunks.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • The Address Parameter in the parameter list of the ASCONF chunk
    may be a wildcard address. In this case special processing
    is required. For the 'add' case, the source IP of the packet is
    added. In the 'del' case, all addresses except the source IP
    of packet are removed. In the "mark primary" case, the source
    address is marked as primary.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • A lot of code in the kernel initializes timer->function
    and timer->data together with calling init_timer(timer). There
    is already a helper for this. Use it for networking code.

    The patch is HUGE, but makes the code 130 lines shorter
    (98 insertions(+), 228 deletions(-)).

    Signed-off-by: Pavel Emelyanov
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

08 Nov, 2007

2 commits


11 Oct, 2007

3 commits


17 Sep, 2007

1 commit

  • Since the sctp_sockaddr_entry is now RCU enabled as part of
    the patch to synchronize sctp_localaddr_list, it makes sense to
    change all handling of these entries to RCU. This includes the
    sctp_bind_addrs structure and its list of bound addresses.

    This list is currently protected by an external rw_lock and that
    looks like overkill. There are only 2 writers to the list:
    bind()/bindx() calls, and BH processing of ASCONF-ACK chunks.
    These are already serialized via the socket lock, so they will
    not step on each other. These are also relatively rare, so we
    should be good with RCU.

    The readers are varied and they are easily converted to RCU.

    Signed-off-by: Vlad Yasevich
    Acked-by: Paul E. McKenney
    Acked-by: Sridhar Samdurala
    Signed-off-by: David S. Miller

    Vlad Yasevich