27 Aug, 2010

1 commit

  • Change SCTP_DEBUG_PRINTK and SCTP_DEBUG_PRINTK_IPADDR to
    use do { print } while (0) guards.
    Add SCTP_DEBUG_PRINTK_CONT to fix errors in log when
    lines were continued.
    Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
    Add a missing newline in "Failed bind hash alloc"

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
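
    As a rough illustration of why the do { } while (0) guard matters, the
    sketch below wraps a multi-statement debug macro so that it behaves as a
    single statement inside an unbraced if/else. DEMO_DEBUG_PRINT,
    debug_enabled, lines_logged and demo_classify are hypothetical names for
    this example only, not the actual SCTP macros.

```c
#include <assert.h>
#include <stdio.h>

static int debug_enabled = 1;   /* hypothetical runtime switch */
static int lines_logged;        /* counts emitted lines, for the demo only */

/*
 * Hypothetical stand-in for a debug macro such as SCTP_DEBUG_PRINTK.
 * The do { } while (0) wrapper turns the expansion into one statement,
 * so "MACRO(...);" followed by "else" still parses as intended.
 * The "demo: " literal plays the role of a pr_fmt-style prefix.
 */
#define DEMO_DEBUG_PRINT(...)                           \
do {                                                    \
	if (debug_enabled) {                            \
		printf("demo: " __VA_ARGS__);           \
		lines_logged++;                         \
	}                                               \
} while (0)

/* Returns 1 for non-negative input, 0 (after logging) for negative input. */
int demo_classify(int value)
{
	if (value < 0)
		DEMO_DEBUG_PRINT("negative value %d\n", value);
	else
		return 1;
	return 0;
}
```

    With a bare { ... } body instead of the guard, the semicolon after the
    macro call would end the if statement and the else would fail to compile.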
     

17 May, 2010

1 commit


16 May, 2010

1 commit


12 May, 2010

1 commit


06 May, 2010

1 commit

  • ICMP protocol unreachable handling completely disregarded
    the fact that the user may have locked the socket. It proceeded
    to destroy the association, even though the user may have
    held the lock and had a ref on the association. This resulted
    in the following:

    Attempt to release alive inet socket f6afcc00

    =========================
    [ BUG: held lock freed! ]
    -------------------------
    somenu/2672 is freeing memory f6afcc00-f6afcfff, with a lock still held
    there!
    (sk_lock-AF_INET){+.+.+.}, at: [] sctp_connect+0x13/0x4c
    1 lock held by somenu/2672:
    #0: (sk_lock-AF_INET){+.+.+.}, at: [] sctp_connect+0x13/0x4c

    stack backtrace:
    Pid: 2672, comm: somenu Not tainted 2.6.32-telco #55
    Call Trace:
    [] ? printk+0xf/0x11
    [] debug_check_no_locks_freed+0xce/0xff
    [] kmem_cache_free+0x21/0x66
    [] __sk_free+0x9d/0xab
    [] sk_free+0x1c/0x1e
    [] sctp_association_put+0x32/0x89
    [] __sctp_connect+0x36d/0x3f4
    [] ? sctp_connect+0x13/0x4c
    [] ? autoremove_wake_function+0x0/0x33
    [] sctp_connect+0x31/0x4c
    [] inet_dgram_connect+0x4b/0x55
    [] sys_connect+0x54/0x71
    [] ? lock_release_non_nested+0x88/0x239
    [] ? might_fault+0x42/0x7c
    [] ? might_fault+0x42/0x7c
    [] sys_socketcall+0x6d/0x178
    [] ? trace_hardirqs_on_thunk+0xc/0x10
    [] syscall_call+0x7/0xb

    This was because sctp_wait_for_connect() would acquire the socket
    lock and then proceed to release the last reference count on the
    association, thus causing the full destruction path to finish freeing
    the socket.

    The simplest solution is to start a very short timer in case the socket
    is owned by user. When the timer expires, we can do some verification
    and be able to do the release properly.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
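
    The deferral idea described above can be sketched as a small user-space
    simulation. All names here (demo_assoc, demo_proto_unreach,
    demo_timer_expired) are hypothetical, and the flags stand in for the real
    socket lock, timer and reference counting; it is an illustration under
    those assumptions, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified association state for the demo. */
struct demo_assoc {
	bool sock_owned_by_user; /* user context currently holds the socket lock */
	bool timer_pending;      /* short deferral timer is armed */
	bool torn_down;          /* association has been destroyed */
	int  refcnt;             /* references held on the association */
};

static void demo_teardown(struct demo_assoc *a)
{
	a->torn_down = true;
}

/* Called when an ICMP "protocol unreachable" arrives for the association. */
void demo_proto_unreach(struct demo_assoc *a)
{
	if (a->sock_owned_by_user) {
		/* Do not destroy under the user's feet: defer via a timer. */
		if (!a->timer_pending) {
			a->timer_pending = true;
			a->refcnt++;     /* the pending timer holds a reference */
		}
		return;
	}
	demo_teardown(a);
}

/* Timer callback: returns true if the work had to be deferred again. */
bool demo_timer_expired(struct demo_assoc *a)
{
	assert(a->timer_pending);
	if (a->sock_owned_by_user)
		return true;             /* still busy, keep the timer armed */

	a->timer_pending = false;
	if (!a->torn_down)               /* verification before acting */
		demo_teardown(a);
	a->refcnt--;                     /* drop the timer's reference */
	return false;
}
```

    The point of the sketch is only the ordering: destruction never runs
    while the user owns the socket, and the extra reference keeps the object
    alive until the timer has done its check.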
     

01 May, 2010

3 commits


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
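
    The first step of the script (decide which header a file needs from what
    it uses) can be approximated in a few lines. This is a simplified sketch,
    not the slabh-sweep.py logic: demo_needed_header and the token lists are
    assumptions chosen for illustration, and plain substring matching is far
    cruder than what a real conversion would need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns the header a source fragment should include directly:
 * slab.h if any slab API is used (slab.h already pulls in gfp.h),
 * gfp.h if only gfp flags are used, NULL if neither is needed.
 * The token lists are illustrative assumptions, not an exhaustive set.
 */
const char *demo_needed_header(const char *src)
{
	static const char *const slab_tokens[] = { "kmalloc(", "kzalloc(", "kfree(" };
	static const char *const gfp_tokens[]  = { "GFP_KERNEL", "GFP_ATOMIC" };
	size_t i;

	for (i = 0; i < sizeof(slab_tokens) / sizeof(slab_tokens[0]); i++)
		if (strstr(src, slab_tokens[i]))
			return "linux/slab.h";

	for (i = 0; i < sizeof(gfp_tokens) / sizeof(gfp_tokens[0]); i++)
		if (strstr(src, gfp_tokens[i]))
			return "linux/gfp.h";

	return NULL;
}
```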
     

29 Nov, 2009

2 commits

  • Conflicts:
    drivers/ieee802154/fakehard.c
    drivers/net/e1000e/ich8lan.c
    drivers/net/e1000e/phy.c
    drivers/net/netxen/netxen_nic_init.c
    drivers/net/wireless/ath/ath9k/main.c

    David S. Miller
     
  • When retransmitting due to T3 timeout, retransmit all the
    in-flight chunks for the corresponding transport/path, including
    chunks sent less than 1 rto ago.
    This is the correct behaviour according to rfc4960 section 6.3.3
    E3 and
    "Note: Any DATA chunks that were sent to the address for which the
    T3-rtx timer expired but did not fit in one MTU (rule E3 above)
    should be marked for retransmission and sent as soon as cwnd
    allows (normally, when a SACK arrives). ".

    This fixes problems when more than one path is present and the T3
    retransmission of the first chunk that times out stops the T3 timer
    for the initial active path, leaving all the other in-flight
    chunks waiting forever or until a new chunk is transmitted on the
    same path and times out (and this will happen only if the cwnd
    allows sending new chunks, but since cwnd was dropped to MTU by
    the timeout, it will wait until the first heartbeat).

    Example: 10 packets in flight, sent at 0.1 s intervals on the
    primary path. The primary path is down and the first packet
    times out. The first packet is retransmitted on another path, the
    T3 timer for the primary path is stopped and cwnd is set to MTU.
    All the other 9 in-flight packets will not be retransmitted
    (unless more new packets are sent on the primary path, which depends
    on cwnd allowing it, and even in this case the 9 packets will be
    retransmitted only after a new packet times out, which even in the
    best case would take more than one RTO).

    This commit reverts d0ce92910bc04e107b2f3f2048f07e94f570035d and
    also removes the now unused transport->last_rto, introduced in
    b6157d8e03e1e780660a328f7183bcbfa4a93a19.

    P.S. The problem is not limited to multiple paths. It
    can also happen in a single-homed environment. If the application
    stops sending data, it is possible to end up with a hung association.

    Signed-off-by: Andrei Pelinescu-Onciul
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Andrei Pelinescu-Onciul
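
    The behaviour described above, marking every in-flight chunk on the
    timed-out path regardless of how recently it was sent, can be sketched
    as follows. demo_chunk and demo_mark_t3_rtx are hypothetical names and
    the array stands in for the real transmitted queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a queued DATA chunk. */
struct demo_chunk {
	int  transport_id; /* path the chunk was last sent on */
	bool in_flight;    /* sent and not yet acknowledged */
	bool marked_rtx;   /* selected for retransmission */
};

/*
 * On a T3-rtx timeout for transport_id, mark all in-flight chunks that
 * were sent on that transport. Note that there is no "sent at least one
 * RTO ago" filter. Returns the number of chunks marked.
 */
size_t demo_mark_t3_rtx(struct demo_chunk *chunks, size_t n, int transport_id)
{
	size_t i, marked = 0;

	for (i = 0; i < n; i++) {
		if (chunks[i].in_flight &&
		    chunks[i].transport_id == transport_id) {
			chunks[i].marked_rtx = true;
			marked++;
		}
	}
	return marked;
}
```

    Applied to the 10-packet example in the message, all 10 chunks on the
    failed path are marked at once, so the remaining 9 no longer depend on a
    later timeout.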
     

24 Nov, 2009

2 commits

  • Current implementation of max.burst ends up limiting new
    data during cwnd decay period. The decay is happening because
    the connection is idle and we are allowed to fill the congestion
    window. The point of max.burst is to limit micro-bursts in response
    to large acks. This still happens, as max.burst is still applied
    to each transmit opportunity. It will also apply if a very large
    send is made (greater than allowed by burst).

    Tested-by: Florian Niederbacher
    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
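
    For reference, the per-opportunity limit that RFC 4960 (section 6.1,
    rule D) suggests for Max.Burst can be written as a small helper. This
    is a sketch of that calculation only; demo_burst_limited_cwnd is a
    hypothetical name, and returning the value instead of overwriting cwnd
    is a choice made for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes that may be outstanding for this transmit opportunity:
 * the smaller of cwnd and (flight_size + max_burst * mtu).
 * The caller's cwnd itself is left untouched.
 */
uint32_t demo_burst_limited_cwnd(uint32_t cwnd, uint32_t flight_size,
				 uint32_t max_burst, uint32_t mtu)
{
	uint32_t burst_cap = flight_size + max_burst * mtu;

	return burst_cap < cwnd ? burst_cap : cwnd;
}
```

    For instance, with cwnd = 60000, nothing in flight, Max.Burst = 4 and an
    MTU of 1500, the helper returns 6000; with cwnd = 4000 and 3000 bytes in
    flight it returns 4000, because cwnd is already the tighter bound.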
     
  • The transport last_time_used variable is rather useless.
    It was only used when determining if CWND needs to be updated
    due to idle transport. However, idle transport detection was
    based on a Heartbeat timer and last_time_used was not incremented
    when sending Heartbeats. As a result the check for cwnd reduction
    was always true. We can get rid of the variable and just base
    our cwnd manipulation on the HB timer (like the code comment says).
    We also have to call into the cwnd manipulation function regardless
    of whether HBs are enabled or not. That way we will detect idle
    transports if the user has disabled Heartbeats.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
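
    A minimal sketch of the resulting behaviour, assuming the idle
    adjustment from RFC 4960 section 7.2.1 (cwnd becomes max(cwnd/2, 4*MTU))
    and assuming that expiry of the heartbeat timer implies the transport
    was idle. demo_transport and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified transport state. */
struct demo_transport {
	uint32_t cwnd;
	uint32_t mtu;
	bool     hb_enabled; /* user may have disabled heartbeats */
	unsigned hb_sent;    /* heartbeats sent so far */
};

/* Idle adjustment: halve cwnd, but use 4*MTU if that is larger. */
uint32_t demo_lower_cwnd_idle(uint32_t cwnd, uint32_t mtu)
{
	uint32_t halved = cwnd / 2;
	uint32_t floor  = 4 * mtu;

	return halved > floor ? halved : floor;
}

/*
 * HB timer expiry: the cwnd adjustment runs unconditionally, while the
 * heartbeat itself is only sent when heartbeats are enabled.
 */
void demo_hb_timer_expired(struct demo_transport *t)
{
	t->cwnd = demo_lower_cwnd_idle(t->cwnd, t->mtu);
	if (t->hb_enabled)
		t->hb_sent++;
}
```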
     

14 Nov, 2009

1 commit

  • Recent commits
    sctp: Get rid of an extra routing lookup when adding a transport
    and
    sctp: Set source addresses on the association before adding transports

    changed when routes are added to the sctp transports. As such,
    we didn't set the socket source address correctly when adding the first
    transport. The first transport is always the primary/active one, so
    when adding it, set the socket source address. This was causing
    regression failures in SCTP tests.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

05 Sep, 2009

1 commit


03 Mar, 2009

1 commit


16 Feb, 2009

1 commit

  • SCTP incorrectly doubles rto every time a Heartbeat chunk
    is generated. However RFC 4960 states:

    On an idle destination address that is allowed to heartbeat, it is
    recommended that a HEARTBEAT chunk is sent once per RTO of that
    destination address plus the protocol parameter 'HB.interval', with
    jittering of +/- 50% of the RTO value, and exponential backoff of the
    RTO if the previous HEARTBEAT is unanswered.

    Essentially, only if the heartbeat is unacknowledged do we double the RTO.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
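
    The rule can be sketched as a tiny state machine: back off only when the
    previous HEARTBEAT is still outstanding. demo_path, hb_outstanding and
    the two functions are hypothetical names, and the millisecond fields are
    a simplification for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-path state. */
struct demo_path {
	uint32_t rto_ms;
	uint32_t rto_max_ms;
	bool     hb_outstanding; /* a HEARTBEAT was sent and not yet acked */
};

/* HB timer expiry: double the RTO only if the last HEARTBEAT went unanswered. */
void demo_hb_timer_expired(struct demo_path *p)
{
	if (p->hb_outstanding) {
		uint32_t doubled = p->rto_ms * 2;

		p->rto_ms = doubled < p->rto_max_ms ? doubled : p->rto_max_ms;
	}
	p->hb_outstanding = true; /* a new HEARTBEAT goes out now */
}

/* HEARTBEAT ACK received: the path answered, so no backoff next time. */
void demo_hb_ack_received(struct demo_path *p)
{
	p->hb_outstanding = false;
}
```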
     

19 Jul, 2008

1 commit

  • valgrind reports uninitialized memory accesses when running
    sctp inside the network simulation cradle simulator:

    Conditional jump or move depends on uninitialised value(s)
    at 0x570E34A: sctp_assoc_sync_pmtu (associola.c:1324)
    by 0x57427DA: sctp_packet_transmit (output.c:403)
    by 0x5710EFF: sctp_outq_flush (outqueue.c:824)
    by 0x5710B88: sctp_outq_uncork (outqueue.c:701)
    by 0x5745262: sctp_cmd_interpreter (sm_sideeffect.c:1548)
    by 0x57444B7: sctp_side_effects (sm_sideeffect.c:976)
    by 0x5744460: sctp_do_sm (sm_sideeffect.c:945)
    by 0x572157D: sctp_primitive_ASSOCIATE (primitive.c:94)
    by 0x5725C04: __sctp_connect (socket.c:1094)
    by 0x57297DC: sctp_connect (socket.c:3297)

    Conditional jump or move depends on uninitialised value(s)
    at 0x575D3A5: mod_timer (timer.c:630)
    by 0x5752B78: sctp_cmd_hb_timers_start (sm_sideeffect.c:555)
    by 0x5754133: sctp_cmd_interpreter (sm_sideeffect.c:1448)
    by 0x5753607: sctp_side_effects (sm_sideeffect.c:976)
    by 0x57535B0: sctp_do_sm (sm_sideeffect.c:945)
    by 0x571E9AE: sctp_endpoint_bh_rcv (endpointola.c:474)
    by 0x573347F: sctp_inq_push (inqueue.c:104)
    by 0x572EF93: sctp_rcv (input.c:256)
    by 0x5689623: ip_local_deliver_finish (ip_input.c:230)
    by 0x5689759: ip_local_deliver (ip_input.c:268)
    by 0x5689CAC: ip_rcv_finish (dst.h:246)

    #1 is due to "if (t->pmtu_pending)".
    8a4794914f9cf2681235ec2311e189fe307c28c7 "[SCTP] Flag a pmtu change request"
    suggests it should be initialized to 0.

    #2 is the heartbeat timer 'expires' value, which is uninitialised, but
    tested by mod_timer().
    T3_rtx_timer seems to be affected by the same problem, so initialize it, too.

    Signed-off-by: Florian Westphal
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Florian Westphal
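
    In user-space terms, the fix amounts to giving every field that is later
    read a defined value at initialisation time. The structures below are
    hypothetical stand-ins that mirror only the three fields named in the
    message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced versions of the timer and transport structures. */
struct demo_timer {
	unsigned long expires;
};

struct demo_transport {
	int               pmtu_pending;
	struct demo_timer hb_timer;
	struct demo_timer t3_rtx_timer;
};

/* Give every field that is tested later a defined starting value. */
void demo_transport_init(struct demo_transport *t)
{
	t->pmtu_pending         = 0;
	t->hb_timer.expires     = 0;
	t->t3_rtx_timer.expires = 0;
}
```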
     

05 Jun, 2008

3 commits


06 Mar, 2008

1 commit


05 Feb, 2008

1 commit

  • I was notified by Randy Stewart that lksctp claims to be
    "the reference implementation". First of all, "the
    reference implementation" was the original implementation
    of SCTP in userspace written by Randy and a few others.
    Second, after looking at the definition of 'reference implementation',
    we don't really meet the requirements.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     

29 Jan, 2008

1 commit

  • A lot of code in the kernel initializes timer->function
    and timer->data together with calling init_timer(timer). There
    is already a helper for this. Use it for networking code.

    The patch is HUGE, but makes the code 130 lines shorter
    (98 insertions(+), 228 deletions(-)).

    Signed-off-by: Pavel Emelyanov
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Pavel Emelyanov
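
    The shape of the cleanup can be shown with a user-space mock. demo_timer,
    demo_init_timer and demo_setup_timer are hypothetical stand-ins written
    for this example; they only imitate the pattern of folding three
    statements into one helper call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock of a timer with a callback and an opaque argument. */
struct demo_timer {
	void (*function)(unsigned long);
	unsigned long data;
	int initialized;
};

void demo_init_timer(struct demo_timer *t)
{
	t->function = NULL;
	t->data = 0;
	t->initialized = 1;
}

/* Helper: initialise the timer and set callback and argument in one call. */
void demo_setup_timer(struct demo_timer *t,
		      void (*fn)(unsigned long), unsigned long data)
{
	demo_init_timer(t);
	t->function = fn;
	t->data = data;
}

static unsigned long last_fired; /* records the argument of the last expiry */

void demo_expired(unsigned long data)
{
	last_fired = data;
}
```

    A caller that previously wrote three lines (init, set function, set data)
    now writes a single demo_setup_timer(&timer, demo_expired, arg) call.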
     

08 Nov, 2007

1 commit

  • Commit d0ce92910bc04e107b2f3f2048f07e94f570035d broke several retransmit
    cases including fast retransmit. The reason is that we should
    only delay by rto while doing retransmits as a result of a timeout.
    Retransmits as a result of path MTU discovery, fast retransmit, or
    other events that should trigger immediate retransmissions got broken.

    Also, since rto is doubled prior to marking packets eligible for
    retransmission, we never marked the correct chunks anyway.

    The fix is to provide a reason for a given retransmission so that we
    can mark chunks appropriately, and to save the old rto value to do
    comparisons against.

    All regression tests passed with this code.

    Spotted by Wei Yongjun

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
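
    A minimal sketch of the idea in this message: the age check applies only
    to timeout-driven retransmission and compares against the RTO value saved
    before it was doubled. The enum, the function and the exact comparison
    are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical retransmission reasons. */
enum demo_rtx_reason {
	DEMO_RTX_T3_TIMEOUT, /* T3-rtx timer expired */
	DEMO_RTX_FAST,       /* fast retransmit */
	DEMO_RTX_PMTUD       /* path MTU change */
};

/*
 * Decide whether a chunk should be marked for retransmission.
 * saved_rto is the RTO as it was before the timeout doubled it.
 * All times are in the same arbitrary tick unit.
 */
bool demo_should_mark(enum demo_rtx_reason reason, unsigned long now,
		      unsigned long sent_at, unsigned long saved_rto)
{
	if (reason != DEMO_RTX_T3_TIMEOUT)
		return true;                 /* immediate retransmission */

	return now - sent_at >= saved_rto;   /* only chunks at least one RTO old */
}
```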
     

14 Jun, 2007

2 commits


26 Apr, 2007

1 commit

  • Spring cleaning time...

    There seem to be a lot of places in the network code that have
    extra bogus semicolons after conditionals. The most common is a
    bogus semicolon after: switch() { }

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

23 Mar, 2007

1 commit


20 Mar, 2007

1 commit

  • If the association has been restarted, we need to reset the
    transport congestion variables as well as accumulated error
    counts and CACC variables. If we do not, the association
    will use the wrong values and may terminate prematurely.

    This was found with a scenario where the peer restarted
    the association when lksctp was in the last HB timeout for
    its association. The restart happened, but the error counts
    have not been reset and when the timeout occurred, a newly
    restarted association was terminated due to excessive
    retransmits.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: Sridhar Samudrala
    Signed-off-by: David S. Miller

    Vlad Yasevich
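
    What such a reset might look like is sketched below. The structure and
    field names are hypothetical; the initial cwnd uses the RFC 4960 section
    7.2.1 formula min(4*MTU, max(2*MTU, 4380)), and seeding ssthresh from
    the peer's advertised window is one option that section permits, assumed
    here for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified transport state. */
struct demo_transport {
	uint32_t cwnd;
	uint32_t ssthresh;
	uint32_t error_count;
	bool     cacc_changeover_active;
};

/* Return the transport to its initial congestion and error state. */
void demo_transport_reset(struct demo_transport *t, uint32_t mtu,
			  uint32_t peer_rwnd)
{
	uint32_t lower = 2 * mtu > 4380 ? 2 * mtu : 4380;

	t->cwnd = 4 * mtu < lower ? 4 * mtu : lower;
	t->ssthresh = peer_rwnd;
	t->error_count = 0;               /* forget pre-restart failures */
	t->cacc_changeover_active = false;
}
```

    Clearing error_count is the part that addresses the scenario in the
    message: a restarted association no longer inherits a counter that is one
    timeout away from the limit.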
     

11 Feb, 2007

1 commit


23 Sep, 2006

1 commit

  • The SCTP sysctl entries are displayed in milliseconds, but stored
    internally in jiffies. This results in multiple levels of msecs to
    jiffies conversion and as a result produces a truncation error. This
    patch makes things consistent in that we store and display defaults
    in milliseconds and only convert once for use by association.
    This patch also adds some sane min/max values so that we don't go off
    the deep end.

    Signed-off-by: Vladislav Yasevich
    Signed-off-by: Sridhar Samudrala
    Signed-off-by: David S. Miller

    Vladislav Yasevich
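
    The precision loss is easy to reproduce with simplified conversions.
    DEMO_HZ = 100 and the truncating helpers below are assumptions for this
    example; they are not the kernel's msecs_to_jiffies() and
    jiffies_to_msecs(), which round differently.

```c
#include <assert.h>

#define DEMO_HZ 100UL /* assumed tick rate: one jiffy = 10 ms */

/* Simplified, truncating conversions for illustration only. */
unsigned long demo_ms_to_jiffies(unsigned long ms)
{
	return ms * DEMO_HZ / 1000UL;
}

unsigned long demo_jiffies_to_ms(unsigned long jiffies)
{
	return jiffies * 1000UL / DEMO_HZ;
}

/* Store in jiffies, display in milliseconds: the lossy round trip. */
unsigned long demo_round_trip(unsigned long ms)
{
	return demo_jiffies_to_ms(demo_ms_to_jiffies(ms));
}
```

    Under these assumptions a configured value of 15 ms is stored as 1 jiffy
    and read back as 10 ms, while 3000 ms survives unchanged. Keeping the
    value in milliseconds and converting once at the point of use avoids
    showing the user a number they did not set.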
     

22 Jul, 2006

1 commit


18 Jan, 2006

1 commit


04 Jan, 2006

1 commit


03 Dec, 2005

1 commit


09 Oct, 2005

1 commit

  • - added typedef unsigned int __nocast gfp_t;

    - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
    the same warnings as far as sparse is concerned, doesn't change
    generated code (from gcc point of view we replaced unsigned int with
    typedef) and documents what's going on far better.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
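
    Outside the kernel tree the pattern can be reproduced as below.
    DEMO_NOCAST, DEMO_GFP_WAIT and demo_may_sleep are hypothetical names and
    the flag value is made up; the attribute is only meaningful to sparse,
    which defines __CHECKER__, so for an ordinary compiler the macro expands
    to nothing.

```c
#include <assert.h>

/* Only sparse understands the nocast attribute; hide it from the compiler. */
#ifdef __CHECKER__
#define DEMO_NOCAST __attribute__((nocast))
#else
#define DEMO_NOCAST
#endif

/* For the compiler this is a plain unsigned int. */
typedef unsigned int DEMO_NOCAST gfp_t;

/* Hypothetical flag bit, chosen for the example. */
#define DEMO_GFP_WAIT ((gfp_t)0x10u)

/* The prototype now documents that the argument is a set of gfp flags. */
int demo_may_sleep(gfp_t flags)
{
	return (flags & DEMO_GFP_WAIT) != 0;
}
```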
     

12 Jul, 2005

1 commit


29 Jun, 2005

1 commit


21 Jun, 2005

1 commit