19 Sep, 2016

2 commits

  • The return value of sctp_outq_flush is meaningless now, so this patch
    makes sctp_outq_flush return void, as well as sctp_outq_fail
    and sctp_outq_uncork.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • Once a chunk is enqueued successfully, the sctp queues can take care of it.
    Even if it fails to transmit (for example because of nomem), it should be
    put into the retransmit queue.

    If sctp reports this error to users, it confuses them: they may resend
    that msg, but the kernel sctp stack is actually already in charge of
    retransmitting it.

    Besides, this error probably is not from the failure of transmitting
    current msg, but transmitting or retransmitting another msg's chunks,
    as sctp_outq_flush just tries to send out all transports' chunks.

    This patch is to make sctp_cmd_send_msg return void, and not return the
    transmit err back to sctp_sendmsg.
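
    The intended behaviour can be modelled in a few lines. This is a
    userspace Python sketch, not kernel code: OutQueue, send_msg and the
    fail_transmit switch are hypothetical names used only for illustration.

```python
class OutQueue:
    """Toy model of an outbound queue with a retransmit list."""

    def __init__(self, fail_transmit=False):
        self.fail_transmit = fail_transmit  # simulate e.g. an allocation failure
        self.pending = []       # chunks accepted but not yet sent
        self.sent = []          # chunks that went out on the first try
        self.retransmit = []    # chunks the queue will retry on its own

    def enqueue(self, chunk):
        self.pending.append(chunk)

    def flush(self):
        # Returns nothing: a failed transmit is handled here, not by the caller.
        while self.pending:
            chunk = self.pending.pop(0)
            if self.fail_transmit:
                self.retransmit.append(chunk)
            else:
                self.sent.append(chunk)


def send_msg(queue, chunk):
    """Report success once the chunk is queued, whatever flush() does."""
    queue.enqueue(chunk)
    queue.flush()
    return 0
```

    In this model a caller never sees the transmit failure, so it has no
    reason to resend a message that the queue is already going to retry.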

    Fixes: 8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg")
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

11 Jun, 2016

1 commit

  • Now sctp doesn't change socket state upon shutdown reception. It changes
    just the assoc state, even though it's a TCP-style socket.

    For some cases, if we really need to check sk->sk_state, it's necessary to
    fix this issue; at least when we use ss or netstat to dump, we can get
    more exact information.

    As an improvement, we will change sk->sk_state when we change asoc->state
    to SHUTDOWN_RECEIVED, and also do it in sctp_shutdown to keep consistent
    with sctp_close.

    Signed-off-by: Xin Long
    Acked-by: Marcelo R. Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

02 May, 2016

1 commit

  • Dave Miller pointed out that fb586f25300f ("sctp: delay calls to
    sk_data_ready() as much as possible") may insert latency, especially if
    the receiving application is running on another CPU and that it would be
    better if we signalled as early as possible.

    This patch thus basically inverts the logic on fb586f25300f and signals
    it as early as possible, similar to what we had before.

    Fixes: fb586f25300f ("sctp: delay calls to sk_data_ready() as much as possible")
    Reported-by: Dave Miller
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

24 Apr, 2016

1 commit


14 Apr, 2016

1 commit

  • Currently processing of multiple chunks in a single SCTP packet leads to
    multiple calls to sk_data_ready, causing multiple wake up signals which
    are costly and don't make it wake up any faster.

    With this patch it will note that the wake up is pending and will do it
    before leaving the state machine interpreter, the latest place possible to
    do it reliably and cleanly.

    Note that sk_data_ready events are not dependent on asocs, unlike waking
    up writers.
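
    A minimal sketch of the "note it now, signal once later" idea, written
    as a userspace Python model rather than kernel code. Sock,
    data_ready_pending and process_packet are hypothetical names chosen for
    the example.

```python
class Sock:
    def __init__(self):
        self.wakeups = 0                 # how many times the reader was signalled
        self.data_ready_pending = False  # set while a wake-up is owed

    def data_ready(self):
        self.wakeups += 1


def process_packet(sk, chunks):
    """Handle every chunk, then signal the reader at most once."""
    delivered = []
    for chunk in chunks:
        delivered.append(chunk)
        sk.data_ready_pending = True    # note the wake-up instead of doing it
    if sk.data_ready_pending:
        sk.data_ready_pending = False
        sk.data_ready()                 # single wake-up before returning
    return delivered
```

    With three chunks in one packet the reader is woken once instead of
    three times; an empty packet causes no wake-up at all.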

    v2: series re-checked
    v3: use local vars to cleanup the code, suggested by Jakub Sitnicki
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

11 Apr, 2016

1 commit

  • Currently on high rate SCTP streams the heartbeat timer refresh can
    consume quite a lot of resources as timer updates are costly and it
    contains a random factor, which a) is also costly and b) invalidates
    mod_timer() optimization for not editing a timer to the same value.
    It may even cause the timer to be slightly advanced, for no good reason.

    As suggested by David Laight this patch now removes this timer update
    from hot path by leaving the timer on and re-evaluating upon its
    expiration if the heartbeat is still needed or not, similarly to what is
    done for TCP. If it's not needed anymore the timer is re-scheduled to
    the new timeout, considering the time already elapsed.

    For this, we now record the last tx timestamp per transport, updated in
    the same spots as hb timer was restarted on tx. Also split up
    sctp_transport_reset_timers into sctp_transport_reset_t3_rtx and
    sctp_transport_reset_hb_timer, so we can re-arm T3 without re-arming the
    heartbeat one.
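
    The expiry-time decision described above can be sketched as follows.
    This is a simplified Python model, not the kernel implementation: it
    ignores the RTO and the random factor, and on_hb_timer_expired with its
    parameters is a hypothetical helper.

```python
def on_hb_timer_expired(now, last_tx, hb_interval):
    """Decide what to do when the heartbeat timer fires.

    Returns ("send_heartbeat", next_expiry) if the path has been idle for a
    full interval, otherwise ("rearm", next_expiry) so that only the time
    not yet elapsed since the last transmission is waited for.
    """
    idle = now - last_tx
    if idle >= hb_interval:
        return ("send_heartbeat", now + hb_interval)
    return ("rearm", last_tx + hb_interval)
```

    The transmit path only has to store a timestamp; the timer itself is
    touched once per expiry instead of once per packet.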

    On loopback with MTU of 65535 and data chunks with 1636, so that we
    have a considerable amount of chunks without stressing system calls,
    netperf -t SCTP_STREAM -l 30, perf looked like this before:

    Samples: 103K of event 'cpu-clock', Event count (approx.): 25833000000
      Overhead  Command  Shared Object     Symbol
    +    6,15%  netperf  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
    -    5,43%  netperf  [kernel.vmlinux]  [k] _raw_write_unlock_irqrestore
       - _raw_write_unlock_irqrestore
          - 96,54% _raw_spin_unlock_irqrestore
             - 36,14% mod_timer
                + 97,24% sctp_transport_reset_timers
                + 2,76% sctp_do_sm
             + 33,65% __wake_up_sync_key
             + 28,77% sctp_ulpq_tail_event
             + 1,40% del_timer
          - 1,84% mod_timer
             + 99,03% sctp_transport_reset_timers
             + 0,97% sctp_do_sm
          + 1,50% sctp_ulpq_tail_event

    And after this patch, now with netperf -l 60:

    Samples: 230K of event 'cpu-clock', Event count (approx.): 57707250000
      Overhead  Command  Shared Object     Symbol
    +    5,65%  netperf  [kernel.vmlinux]  [k] memcpy_erms
    +    5,59%  netperf  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
    -    5,05%  netperf  [kernel.vmlinux]  [k] _raw_spin_unlock_irqrestore
       - _raw_spin_unlock_irqrestore
          + 49,89% __wake_up_sync_key
          + 45,68% sctp_ulpq_tail_event
          - 2,85% mod_timer
             + 76,51% sctp_transport_reset_t3_rtx
             + 23,49% sctp_do_sm
          + 1,55% del_timer
    +    2,50%  netperf  [sctp]            [k] sctp_datamsg_from_user
    +    2,26%  netperf  [sctp]            [k] sctp_sendmsg

    Throughput-wise, from 6800mbps without the patch to 7050mbps with it,
    ~3.7%.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

21 Mar, 2016

1 commit


14 Mar, 2016

1 commit

  • Currently sctp_sendmsg() triggers some calls that will allocate memory
    with GFP_ATOMIC even when not necessary. In the case of
    sctp_packet_transmit it will allocate a linear skb that will be used to
    construct the packet and this may cause sends to fail due to ENOMEM more
    often than anticipated, especially with big MTUs.

    This patch thus allows it to inherit gfp flags from upper calls so that
    it can use GFP_KERNEL if it was triggered by a sctp_sendmsg call or
    similar. All others, like retransmits or flushes started from BH, are
    still allocated using GFP_ATOMIC.

    In netperf tests this didn't result in any performance drawbacks when
    memory is not too fragmented and made it trigger ENOMEM way less often.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

29 Jan, 2016

1 commit

  • After we use refcnt to check if a transport is alive, the dead field can be
    removed from sctp_transport.

    The traversal of transport_addr_list in the procfs dump uses
    list_for_each_entry_rcu, so there is no need to check if it has been freed.

    sctp_generate_t3_rtx_event and sctp_generate_heartbeat_event are
    protected by the sock lock, so it's not necessary to check dead there either.
    Also, the timers are cancelled when sctp_transport_free() is
    called; it doesn't wait for refcnt to reach 0 to cancel them.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

12 Jan, 2016

2 commits

  • Conflicts:
    drivers/net/bonding/bond_main.c
    drivers/net/ethernet/mellanox/mlxsw/spectrum.h
    drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c

    The bond_main.c and mellanox switch conflicts were cases of
    overlapping changes.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Dmitry Vyukov reported a use-after-free in the code expanded by the
    macro debug_post_sfx, which is caused by the use of the asoc pointer
    after it was freed within sctp_side_effect() scope.

    This patch fixes it by allowing sctp_side_effect to clear that asoc
    pointer when the TCB is freed.

    As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case
    because it will trigger DELETE_TCB too on that same loop.

    Also, there were places issuing SCTP_CMD_INIT_FAILED and ASSOC_FAILED
    but returning SCTP_DISPOSITION_CONSUME, which would fool the scheme
    above. Fix it by returning SCTP_DISPOSITION_ABORT instead.

    The macro is already prepared to handle such NULL pointer.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

06 Jan, 2016

1 commit

  • The transport hashtable will replace the association hashtable,
    so the association hashtable is not used in sctp any more. Drop
    the code for it.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

16 Dec, 2015

1 commit

  • As we all know, the value of pf_retrans >= max_retrans_path can
    disable pf state. The variables of pf_retrans and max_retrans_path
    can be changed by the userspace application.

    Sometimes the user expects to disable pf state while the 2
    variables are changed to enable pf state. So it is necessary to
    introduce a new variable to disable pf state.

    According to the suggestions from Vlad Yasevich, extra1 and extra2
    are removed. The initialization of pf_enable is added.

    Acked-by: Vlad Yasevich
    Signed-off-by: Zhu Yanjun
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Zhu Yanjun
     

29 Sep, 2015

2 commits

  • A case can occur when sctp_accept() is called by the user during
    a heartbeat timeout event after the 4-way handshake. Since
    sctp_assoc_migrate() changes both assoc->base.sk and assoc->ep, the
    bh_sock_lock in sctp_generate_heartbeat_event() will be taken with
    the listening socket but released with the new association socket.
    The result is a deadlock on any future attempts to take the listening
    socket lock.

    Note that this race can occur with other SCTP timeouts that take
    the bh_lock_sock() in the event sctp_accept() is called.

    BUG: soft lockup - CPU#9 stuck for 67s! [swapper:0]
    ...
    RIP: 0010:[] [] _spin_lock+0x1e/0x30
    RSP: 0018:ffff880028323b20 EFLAGS: 00000206
    RAX: 0000000000000002 RBX: ffff880028323b20 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff880028323be0 RDI: ffff8804632c4b48
    RBP: ffffffff8100bb93 R08: 0000000000000000 R09: 0000000000000000
    R10: ffff880610662280 R11: 0000000000000100 R12: ffff880028323aa0
    R13: ffff8804383c3880 R14: ffff880028323a90 R15: ffffffff81534225
    FS: 0000000000000000(0000) GS:ffff880028320000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00000000006df528 CR3: 0000000001a85000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process swapper (pid: 0, threadinfo ffff880616b70000, task ffff880616b6cab0)
    Stack:
    ffff880028323c40 ffffffffa01c2582 ffff880614cfb020 0000000000000000
    0100000000000000 00000014383a6c44 ffff8804383c3880 ffff880614e93c00
    ffff880614e93c00 0000000000000000 ffff8804632c4b00 ffff8804383c38b8
    Call Trace:

    [] ? sctp_rcv+0x492/0xa10 [sctp]
    [] ? nf_iterate+0x69/0xb0
    [] ? ip_local_deliver_finish+0x0/0x2d0
    [] ? nf_hook_slow+0x76/0x120
    [] ? ip_local_deliver_finish+0x0/0x2d0
    [] ? ip_local_deliver_finish+0xdd/0x2d0
    [] ? ip_local_deliver+0x98/0xa0
    [] ? ip_rcv_finish+0x12d/0x440
    [] ? ip_rcv+0x275/0x350
    [] ? __netif_receive_skb+0x4ab/0x750
    ...

    With lockdep debugging:

    =====================================
    [ BUG: bad unlock balance detected! ]
    -------------------------------------
    CslRx/12087 is trying to release lock (slock-AF_INET) at:
    [] sctp_generate_timeout_event+0x40/0xe0 [sctp]
    but there are no more locks to release!

    other info that might help us debug this:
    2 locks held by CslRx/12087:
    #0: (&asoc->timers[i]){+.-...}, at: [] run_timer_softirq+0x16f/0x3e0
    #1: (slock-AF_INET){+.-...}, at: [] sctp_generate_timeout_event+0x23/0xe0 [sctp]

    Ensure the socket taken is also the same one that is released by
    saving a copy of the socket before entering the timeout event
    critical section.
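
    The fix pattern can be illustrated with a small userspace model. This is
    a Python sketch under stated assumptions, not the kernel code: Sock,
    Assoc and timeout_event are hypothetical, and the migration is simulated
    by the handler itself instead of a concurrent accept call.

```python
import threading


class Sock:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()


class Assoc:
    def __init__(self, sk):
        self.sk = sk    # may be swapped by an accept()-style migration


def timeout_event(asoc, handler):
    sk = asoc.sk            # save the socket once, before taking its lock
    sk.lock.acquire()
    try:
        handler(asoc)       # asoc.sk may point somewhere else afterwards
    finally:
        sk.lock.release()   # release the lock we took, not asoc.sk's lock
```

    Had the release used asoc.sk instead of the saved sk, the listening
    socket would stay locked and the new socket would be unlocked without
    ever having been locked, which is the imbalance lockdep reports above.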

    Signed-off-by: Karl Heiss
    Signed-off-by: David S. Miller

    Karl Heiss
     
  • Fix indentation in sctp_generate_heartbeat_event.

    Signed-off-by: Karl Heiss
    Signed-off-by: David S. Miller

    Karl Heiss
     

29 Aug, 2015

1 commit

  • When removing a non-primary transport during ASCONF
    processing, we end up traversing the transport list
    twice: once in sctp_cmd_del_non_primary, and once in
    sctp_assoc_del_peer. We can avoid the second
    search and call sctp_assoc_rm_peer() instead.
    Found by code inspection during code reviews.

    Signed-off-by: Vladislav Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

28 Aug, 2015

1 commit

  • Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown")
    fixed a problem with excessive retransmissions in the SHUTDOWN_PENDING state by
    not resetting the association overall_error_count. This allowed the association
    to better enforce the assoc.max_retrans limit.

    However, the same issue still exists when the association is in SHUTDOWN_RECEIVED
    state. In this state, HB-ACKs will continue to reset the overall_error_count
    for the association, which would extend the lifetime of the association unnecessarily.

    This patch solves this by resetting the overall_error_count whenever the current
    state is smaller than SCTP_STATE_SHUTDOWN_PENDING. As a small side-effect, we
    end up also handling SCTP_STATE_SHUTDOWN_ACK_SENT and SCTP_STATE_SHUTDOWN_SENT
    states, but they are not really impacted because we disable Heartbeats in those
    states.
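
    The state comparison can be sketched as below. This is an illustrative
    Python model: the State enum mirrors the names used in the text, its
    numeric ordering is an assumption of this sketch, and on_hb_ack is a
    hypothetical helper.

```python
from enum import IntEnum


class State(IntEnum):
    # Assumed ordering of the association states; only the relative order
    # around SHUTDOWN_PENDING matters for this example.
    CLOSED = 0
    COOKIE_WAIT = 1
    COOKIE_ECHOED = 2
    ESTABLISHED = 3
    SHUTDOWN_PENDING = 4
    SHUTDOWN_SENT = 5
    SHUTDOWN_RECEIVED = 6
    SHUTDOWN_ACK_SENT = 7


def on_hb_ack(state, overall_error_count):
    """Return the association error count after a HEARTBEAT ACK."""
    if state < State.SHUTDOWN_PENDING:
        return 0                    # normal operation: reset the counter
    return overall_error_count      # shutting down: keep counting errors
```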

    Fixes: Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown")
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    lucien
     

28 Apr, 2014

1 commit

  • Don't transition to the PF state on every strike after 'Path.Max.Retrans'.
    Per draft-ietf-tsvwg-sctp-failover-03 Section 5.1.6:

    Additional (PMR - PFMR) consecutive timeouts on a PF destination
    confirm the path failure, upon which the destination transitions to the
    Inactive state. As described in [RFC4960], the sender (i) SHOULD notify
    ULP about this state transition, and (ii) transmit heartbeats to the
    Inactive destination at a lower frequency as described in Section 8.3 of
    [RFC4960].

    This also prevents sending SCTP_ADDR_UNREACHABLE to the user as the state
    bounces between SCTP_INACTIVE and SCTP_PF for each subsequent strike.
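
    A compact way to see the intended transitions is the model below. It is
    a Python sketch, not the kernel code: next_state is hypothetical, pfmr
    and pmr stand for the PF and Path.Max.Retrans thresholds, and
    error_count is taken to be the counter value after the latest timeout.

```python
ACTIVE, PF, INACTIVE = "ACTIVE", "PF", "INACTIVE"


def next_state(state, error_count, pfmr, pmr):
    """Return the destination state after a timeout raised error_count."""
    if state == INACTIVE:
        return INACTIVE             # further strikes do not bounce back to PF
    if error_count > pmr:
        return INACTIVE             # path failure confirmed
    if state == ACTIVE and error_count > pfmr:
        return PF                   # potentially failed, entered only once
    return state


def trace(strikes, pfmr, pmr):
    """States seen over a run of consecutive timeouts, starting ACTIVE."""
    state, seen = ACTIVE, []
    for count in range(1, strikes + 1):
        state = next_state(state, count, pfmr, pmr)
        seen.append(state)
    return seen
```

    With pfmr=2 and pmr=4, six consecutive timeouts give ACTIVE, ACTIVE, PF,
    PF, INACTIVE, INACTIVE: the destination becomes inactive after
    (PMR - PFMR) additional timeouts in PF and stays there.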

    Signed-off-by: Karl Heiss
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Karl Heiss
     

21 Feb, 2014

1 commit

  • In the current implementation it is possible to reach PF state from unconfirmed.
    We can interpret sctp-failover-02 in a way that PF state is meant to be reached
    only from active state; in the end, this is when entering PF state makes sense.
    Here are a few quotes from sctp-failover-02, but regardless of these, the same
    understanding can be reached from the whole of section 5:

    Section 5.1, quickfailover guide:
    "The PF state is an intermediate state between Active and Failed states."

    "Each time the T3-rtx timer expires on an active or idle
    destination, the error counter of that destination address will
    be incremented. When the value in the error counter exceeds
    PFMR, the endpoint should mark the destination transport address as PF."

    There are several concrete reasons for such an interpretation. For a start, rfc4960
    does not take the quickfailover algorithm into account. Therefore, quickfailover
    must comply with 4960. The point where this compliance can be argued is the following
    behavior:
    When PF is entered, the association overall error counter is incremented for each
    missed HB. This contradicts rfc4960, as an address in unconfirmed
    state is subject to probing, and while it is probed, it should not increment the
    association overall error counter. As a consequence, we might end
    up dropping the association due to path failure on an unconfirmed
    address if the configuration is wrong in this way:
    Association.Max.Retrans == Path.Max.Retrans.

    Another reason is that entering PF from unconfirmed will cause the loss of the
    address confirmed event once (if) the address is confirmed. This is fine from the
    failover guide's point of view, but it is not consistent with the behavior preceding
    the failover implementation and the recommendation from 4960:

    5.4. Path Verification
    Whenever a path is confirmed, an indication MAY be given to the upper
    layer.

    Signed-off-by: Matija Glavinic Pecotic
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Matija Glavinic Pecotic
     

22 Jan, 2014

1 commit


27 Dec, 2013

1 commit


23 Dec, 2013

1 commit


07 Dec, 2013

1 commit

  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: Vlad Yasevich
    CC: Neil Horman
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     

04 Nov, 2013

1 commit

  • Introduced in f9e42b853523 ("net: sctp: sideeffect: throw BUG if
    primary_path is NULL"), we intended to find a buggy assoc that's
    part of the assoc hash table with a primary_path that is NULL.
    However, we better remove the BUG_ON for now and find a more
    suitable place to assert for these things as Mark reports that
    this also triggers the bug when duplication cookie processing
    happens, and the assoc is not part of the hash table (so all
    good in this case). Such a situation can for example easily be
    reproduced by:

    tc qdisc add dev eth0 root handle 1: prio bands 2 priomap 1 1 1 1 1 1
    tc qdisc add dev eth0 parent 1:2 handle 20: netem loss 20%
    tc filter add dev eth0 protocol ip parent 1: prio 2 u32 match ip \
    protocol 132 0xff match u8 0x0b 0xff at 32 flowid 1:2

    This drops 20% of COOKIE-ACK packets. After some follow-up
    discussion with Vlad we came to the conclusion that for now we
    should still better remove this BUG_ON() assertion, and come up
    with two follow-ups later on, that is, i) find a more suitable
    place for this assertion, and possibly ii) have a special
    allocator/initializer for such kind of temporary assocs.
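
    The constants in the filter above can be checked by hand. The Python
    snippet below is only a worked calculation; it assumes an IPv4 header
    without options, that the u32 offset is counted from the start of the IP
    header, and that COOKIE ACK is the first chunk in the packet.

```python
IPV4_HEADER_LEN = 20         # bytes, assuming no IP options
SCTP_COMMON_HEADER_LEN = 12  # ports (4), verification tag (4), checksum (4)
IPPROTO_SCTP = 132           # value matched by "protocol 132 0xff"
COOKIE_ACK = 11              # SCTP chunk type, 0x0b in the filter


def first_chunk_type_offset():
    """Byte offset of the first chunk's type field from the IP header start."""
    return IPV4_HEADER_LEN + SCTP_COMMON_HEADER_LEN
```

    The result, 32, is the "at 32" in the match, and chunk type 11 is the
    0x0b being compared there.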

    Reported-by: Mark Thomas
    Signed-off-by: Vlad Yasevich
    Signed-off-by: Daniel Borkmann
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

10 Aug, 2013

1 commit

  • With the restructuring of the lksctp.org site, we only allow bug
    reports through the SCTP mailing list linux-sctp@vger.kernel.org,
    not via SF, as SF is only used for web hosting and nothing more.
    While at it, also remove the obvious statement that bugs will be
    fixed and incorporated into the kernel.

    Signed-off-by: Daniel Borkmann
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

25 Jul, 2013

1 commit

  • The SCTP mailing list address to send patches or questions
    to is linux-sctp@vger.kernel.org and not
    lksctp-developers@lists.sourceforge.net anymore. Therefore,
    update all occurrences.

    Signed-off-by: Daniel Borkmann
    Acked-by: Neil Horman
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

10 Jul, 2013

1 commit

  • This fix has been proposed originally by Vlad Yasevich. He says:

    When SCTP makes forward progress (receives a SACK that acks new chunks,
    renegs, or answers 0-window probes) or when HB-ACK arrives, mark
    the route as confirmed so we don't unnecessarily send NUD probes.

    Having a simple SCTP client/server that exchange data chunks every 1sec,
    without this patch ARP requests are sent periodically every 40-60sec.
    With this fix applied, an ARP request is only done once right at the
    "session" beginning. Also, when clearing the related ARP cache entry
    manually during the session, a new request is correctly done. I have
    only "backported" this to net-next and tested that it works, so full
    credit goes to Vlad.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: Daniel Borkmann
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

02 Jul, 2013

1 commit

  • We should get rid of all own SCTP debug printk macros and use the ones
    that the kernel offers anyway instead. This makes the code more readable
    and conform to the kernel code, and offers all the features of dynamic
    debugging that pr_debug() et al. have, such as only turning on/off portions
    of debug messages at runtime through debugfs. The runtime cost of having
    CONFIG_DYNAMIC_DEBUG enabled, but none of the debug statements printing,
    is negligible [1]. If kernel debugging is completely turned off, then these
    statements will also compile into "empty" functions.

    While we're at it, we also need to change the Kconfig option as it /now/
    only refers to the ifdef'ed code portions in outqueue.c that enable further
    debugging/tracing of SCTP transaction fields. Also, since SCTP_ASSERT code
    was enabled with this Kconfig option and has now been removed, we
    transform those code parts into WARNs resp. where appropriate BUG_ONs so
    that those bugs can be more easily detected as probably not many people
    have SCTP debugging permanently turned on.

    To turn on all SCTP debugging, the following steps are needed:

    # mount -t debugfs none /sys/kernel/debug
    # echo -n 'module sctp +p' > /sys/kernel/debug/dynamic_debug/control

    This can be done more fine-grained on a per file, per line basis and others
    as described in [2].

    [1] https://www.kernel.org/doc/ols/2009/ols2009-pages-39-46.pdf
    [2] Documentation/dynamic-debug-howto.txt

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

15 Jun, 2013

1 commit

  • This clearly states a BUG somewhere in the SCTP code as e.g. fixed once
    in f28156335 ("sctp: Use correct sideffect command in duplicate cookie
    handling"). If this ever happens, throw a trace in the sideeffect engine
    where assocs clearly must have a primary_path assigned.

    When in sctp_seq_dump_local_addrs() also throw a WARN and bail out since
    we do not need to panic for printing this one asterisk. Also, it will
    avoid the not so obvious case when primary != NULL test passes and at a
    later point in time triggering a NULL ptr dereference caused by primary.
    While at it, also fix up the white space.

    Signed-off-by: Daniel Borkmann
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

05 Feb, 2013

1 commit


04 Dec, 2012

1 commit

  • The current SCTP stack is lacking a mechanism to have per association
    statistics. This is an implementation modeled after OpenSolaris'
    SCTP_GET_ASSOC_STATS.

    Userspace part will follow on lksctp if/when there is a general ACK on
    this.
    V4:
    - Move ipackets++ before q->immediate.func() for consistency reasons
    - Move sctp_max_rto() at the end of sctp_transport_update_rto() to avoid
    returning bogus RTO values
    - return asoc->rto_min when max_obs_rto value has not changed

    V3:
    - Increase ictrlchunks in sctp_assoc_bh_rcv() as well
    - Move ipackets++ to sctp_inq_push()
    - return 0 when no rto updates took place since the last call

    V2:
    - Implement partial retrieval of stat struct to cope for future expansion
    - Kill the rtxpackets counter as it cannot be precise anyway
    - Rename outseqtsns to outofseqtsns to make it clearer that these are out
    of sequence unexpected TSNs
    - Move asoc->ipackets++ under a lock to avoid potential miscounts
    - Fold asoc->opackets++ into the already existing asoc check
    - Kill unneeded (q->asoc) test when increasing rtxchunks
    - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0)
    - Don't count SHUTDOWNs as SACKs
    - Move SCTP_GET_ASSOC_STATS to the private space API
    - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for
    future struct growth
    - Move association statistics in their own struct
    - Update idupchunks when we send a SACK with dup TSNs
    - return min_rto in max_rto when RTO has not changed. Also return the
    transport when max_rto last changed.

    Signed-off: Michele Baldessari
    Acked-by: Vlad Yasevich

    Signed-off-by: David S. Miller

    Michele Baldessari
     

21 Nov, 2012

1 commit

  • In the event that an association exceeds its max_retrans attempts, we should
    send an ABORT chunk indicating that we are closing the association as a result.
    Because of the nature of the error, it's unlikely to be received, but it's a nice
    clean way to close the association if it does make it through, and it will give
    anyone watching via tcpdump a clue as to what happened.

    Change notes:
    v2)
    * Removed erroneous changes from sctp_make_violation_parmlen

    Signed-off-by: Neil Horman
    CC: Vlad Yasevich
    CC: "David S. Miller"
    CC: linux-sctp@vger.kernel.org
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman
     

04 Nov, 2012

1 commit

  • Lots of points in the sctp_cmd_interpreter function treat the sctp_cmd_t arg as
    a void pointer, even though they are written as various other types. There's no
    need for this as doing so just leads to possible type-punning issues that could
    cause crashes, and if we remain type-consistent we can actually just remove the
    void * member of the union entirely.

    Change Notes:

    v2)
    * Dropped chunk that modified SCTP_NULL to create a marker pattern
    should anyone try to use a SCTP_NULL() assigned sctp_arg_t, Assigning
    to .zero provides the same effect and should be faster, per Vlad Y.

    v3)
    * Reverted part of V2, opting to use memset instead of .zero, so that
    the entire union is initialized thus avoiding the i164 speculative load
    problems previously encountered, per Dave M.. Also rewrote
    SCTP_[NO]FORCE so as to use common infrastructure a little more

    Signed-off-by: Neil Horman
    CC: "David S. Miller"
    CC: linux-sctp@vger.kernel.org
    Signed-off-by: David S. Miller

    Neil Horman
     

17 Oct, 2012

1 commit


05 Oct, 2012

1 commit

  • Suppose we have an SCTP connection with two paths. After connection is
    established, path1 is not available, thus this path is marked as inactive. Then
    traffic goes through path2, but for some reasons packets are delayed (after
    rto.max). Because packets are delayed, the retransmit mechanism will switch
    again to path1. At this time, we receive a delayed SACK from path2. When we
    update the state of the path in sctp_check_transmitted(), we do not take into
    account the source address of the SACK, hence we update the wrong path.

    Signed-off-by: Nicolas Dichtel
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

15 Aug, 2012

2 commits


23 Jul, 2012

1 commit

  • I've seen several attempts recently made to do quick failover of sctp transports
    by reducing various retransmit timers and counters. While it's possible to
    implement a faster failover on multihomed sctp associations, it's not
    particularly robust, in that it can lead to unneeded retransmits, as well as
    false connection failures due to intermittent latency on a network.

    Instead, let's implement the new ietf quick failover draft found here:
    http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05

    This will let the sctp stack identify transports that have had a small number of
    errors, and avoid using them quickly until their reliability can be
    re-established. I've tested this out on two virt guests connected via multiple
    isolated virt networks and believe it's in compliance with the above draft and
    works well.

    Signed-off-by: Neil Horman
    CC: Vlad Yasevich
    CC: Sridhar Samudrala
    CC: "David S. Miller"
    CC: linux-sctp@vger.kernel.org
    CC: joe@perches.com
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman
     

01 Jul, 2012

1 commit

  • It was noticed recently that when we send data on a transport, it's possible that
    we might bundle a sack that arrived on a different transport. While this isn't
    a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
    2960:

    An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
    etc.) to the same destination transport address from which it
    received the DATA or control chunk to which it is replying. This
    rule should also be followed if the endpoint is bundling DATA chunks
    together with the reply chunk.

    This patch seeks to correct that. It restricts the bundling of sack operations
    to only those transports which have moved the ctsn of the association forward
    since the last sack. By doing this we guarantee that we only bundle outbound
    sacks on a transport that has received a chunk since the last sack. This brings
    us into stricter compliance with the RFC.

    Vlad had initially suggested that we strictly allow only sack bundling on the
    transport that last moved the ctsn forward. While this makes sense, I was
    concerned that doing so prevented us from bundling in the case where we had
    received chunks that moved the ctsn on multiple transports. In those cases, the
    RFC allows us to select any of the transports having received chunks to bundle
    the sack on. So I've modified the approach to allow for that, by adding a state
    variable to each transport that tracks whether it has moved the ctsn since the
    last sack. This I think keeps our behavior (and performance), close enough to
    our current profile that I think we can do this without a sysctl knob to
    enable/disable it.
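
    The per-transport state variable can be modelled as below. This is a
    userspace Python sketch, not the kernel change itself: Transport, Assoc,
    the moved_ctsn attribute and the method names are hypothetical.

```python
class Transport:
    def __init__(self, name):
        self.name = name
        self.moved_ctsn = False     # set when a chunk on this path advances ctsn


class Assoc:
    def __init__(self, transports):
        self.transports = transports

    def chunk_advanced_ctsn(self, transport):
        transport.moved_ctsn = True

    def may_bundle_sack(self, transport):
        # Bundle only on a path that moved the ctsn since the last sack.
        return transport.moved_ctsn

    def sack_sent(self):
        # A sack went out: every path starts over.
        for t in self.transports:
            t.moved_ctsn = False
```

    Because the flag is kept per transport, several paths can be eligible at
    once, which is the case the stricter "last transport only" rule would
    have excluded.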

    Signed-off-by: Neil Horman
    CC: Vlad Yaseivch
    CC: David S. Miller
    CC: linux-sctp@vger.kernel.org
    Reported-by: Michele Baldessari
    Reported-by: sorin serban
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman