12 Dec, 2011

1 commit


17 Jun, 2011

1 commit

  • Unnecessary casts of void * clutter the code.

    These are the remaining casts after several specific
    patches to remove netdev_priv and dev_priv.

    Done via coccinelle script:

    $ cat cast_void_pointer.cocci
    @@
    type T;
    T *pt;
    void *pv;
    @@

    - pt = (T *)pv;
    + pt = pv;

    Signed-off-by: Joe Perches
    Acked-by: Paul Moore
    Signed-off-by: David S. Miller

    Joe Perches
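
As a minimal illustration of why the casts removed by the script above are unnecessary: in C, a `void *` converts implicitly to any object pointer type on assignment. The struct and function names below are hypothetical and not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private data, standing in for a driver- or
 * protocol-specific structure. */
struct demo_priv {
	int counter;
};

/* A callback that receives its context as an untyped pointer. */
int demo_bump(void *pv)
{
	struct demo_priv *pt;

	assert(pv != NULL);
	pt = pv;	/* no "(struct demo_priv *)" cast is needed in C */
	return ++pt->counter;
}
```

Note that this holds for C only; C++ would require the explicit cast.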
     

23 Apr, 2011

1 commit


20 Apr, 2011

2 commits

  • Changed the order of processing SHUTDOWN ACK and COOKIE ACK;
    refer to Section 8.4, Handle "Out of the Blue" Packets.

    SHUTDOWN ACK chunk should be processed before processing
    "Stale Cookie" ERROR or a COOKIE ACK.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Shan Wei
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Shan Wei
     
  • The 'p' member of struct sctp_paramhdr is the common part of the
    IPv4 addr parameter and the IPv6 addr parameter in union sctp_addr_param.

    For addr-related code, use the specific addr parameter.
    Otherwise, use the common header to access the type/length members.

    Signed-off-by: Shan Wei
    Signed-off-by: Vlad Yasevich
    Signed-off-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Shan Wei
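
The access rule described above can be sketched in user space as follows. All `demo_` types are simplified stand-ins and are assumptions, not the kernel definitions; the parameter type values 5 (IPv4) and 6 (IPv6) and the IPv4 parameter length of 8 come from RFC 4960.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

#define DEMO_PARAM_IPV4 5	/* IPv4 Address parameter type, RFC 4960 */
#define DEMO_PARAM_IPV6 6	/* IPv6 Address parameter type, RFC 4960 */

/* Common parameter header; both fields are in network byte order. */
struct demo_paramhdr {
	uint16_t type;
	uint16_t length;
};

struct demo_ipv4addr_param {
	struct demo_paramhdr param_hdr;
	uint8_t addr[4];
};

struct demo_ipv6addr_param {
	struct demo_paramhdr param_hdr;
	uint8_t addr[16];
};

/* 'p' overlays the header that starts both address parameters. */
union demo_addr_param {
	struct demo_paramhdr p;
	struct demo_ipv4addr_param v4;
	struct demo_ipv6addr_param v6;
};

/* Address-related code uses the specific member. */
void demo_fill_v4(union demo_addr_param *ap, const uint8_t addr[4])
{
	assert(ap != NULL && addr != NULL);
	ap->v4.param_hdr.type = htons(DEMO_PARAM_IPV4);
	ap->v4.param_hdr.length = htons(sizeof(ap->v4));
	memcpy(ap->v4.addr, addr, 4);
}

/* Generic code reads type and length through the common header. */
uint16_t demo_param_type(const union demo_addr_param *ap)
{
	return ntohs(ap->p.type);
}

uint16_t demo_param_length(const union demo_addr_param *ap)
{
	return ntohs(ap->p.length);
}
```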
     

31 Mar, 2011

1 commit


08 Mar, 2011

1 commit


06 May, 2010

1 commit

  • ICMP protocol unreachable handling completely disregarded
    the fact that the user may have locked the socket. It proceeded
    to destroy the association, even though the user may have
    held the lock and had a ref on the association. This resulted
    in the following:

    Attempt to release alive inet socket f6afcc00

    =========================
    [ BUG: held lock freed! ]
    -------------------------
    somenu/2672 is freeing memory f6afcc00-f6afcfff, with a lock still held
    there!
    (sk_lock-AF_INET){+.+.+.}, at: [] sctp_connect+0x13/0x4c
    1 lock held by somenu/2672:
    #0: (sk_lock-AF_INET){+.+.+.}, at: [] sctp_connect+0x13/0x4c

    stack backtrace:
    Pid: 2672, comm: somenu Not tainted 2.6.32-telco #55
    Call Trace:
    [] ? printk+0xf/0x11
    [] debug_check_no_locks_freed+0xce/0xff
    [] kmem_cache_free+0x21/0x66
    [] __sk_free+0x9d/0xab
    [] sk_free+0x1c/0x1e
    [] sctp_association_put+0x32/0x89
    [] __sctp_connect+0x36d/0x3f4
    [] ? sctp_connect+0x13/0x4c
    [] ? autoremove_wake_function+0x0/0x33
    [] sctp_connect+0x31/0x4c
    [] inet_dgram_connect+0x4b/0x55
    [] sys_connect+0x54/0x71
    [] ? lock_release_non_nested+0x88/0x239
    [] ? might_fault+0x42/0x7c
    [] ? might_fault+0x42/0x7c
    [] sys_socketcall+0x6d/0x178
    [] ? trace_hardirqs_on_thunk+0xc/0x10
    [] syscall_call+0x7/0xb

    This was because sctp_wait_for_connect() would acquire the socket
    lock and then proceed to release the last reference count on the
    association, thus causing the full destruction path to finish freeing
    the socket.

    The simplest solution is to start a very short timer if the socket
    is owned by the user. When the timer expires, we can do some verification
    and be able to do the release properly.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
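
The deferral idea described above can be modelled in plain C roughly as follows. This is only a sketch: the `demo_` structures, the boolean that stands in for a timer, and the choice to re-arm the timer when the socket is still owned on expiry are assumptions made for illustration, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_sock {
	bool owned_by_user;	/* true while a syscall holds the socket lock */
};

struct demo_assoc {
	struct demo_sock *sk;
	bool timer_pending;	/* stands in for a short kernel timer */
	bool aborted;		/* association has been torn down */
};

enum demo_action { DEMO_ABORT_NOW, DEMO_DEFERRED };

/* ICMP "protocol unreachable" handler: never tear the association
 * down underneath a user that currently holds the socket lock. */
enum demo_action demo_icmp_proto_unreachable(struct demo_assoc *asoc)
{
	assert(asoc != NULL && asoc->sk != NULL);
	if (asoc->sk->owned_by_user) {
		asoc->timer_pending = true;	/* retry shortly */
		return DEMO_DEFERRED;
	}
	asoc->aborted = true;
	return DEMO_ABORT_NOW;
}

/* Timer callback: verify the state again before releasing. */
void demo_timer_expired(struct demo_assoc *asoc)
{
	assert(asoc != NULL && asoc->sk != NULL);
	if (asoc->sk->owned_by_user)
		return;				/* still busy: stay armed */
	asoc->timer_pending = false;
	asoc->aborted = true;
}
```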
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

06 Mar, 2010

2 commits


09 Jun, 2009

1 commit


16 Feb, 2009

2 commits

  • The sctp crc32c checksum is always generated in little endian.
    So, we clean up the code to treat it as little endian and remove
    all the __force casts.

    Suggested by Herbert Xu.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
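
To illustrate what treating the checksum as little endian means on the wire, here is a small user-space sketch. The bitwise CRC32c routine (reflected Castagnoli polynomial 0x82F63B78) and the `demo_` helper names are assumptions for illustration and are not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32c: initial value and final XOR are both all ones. */
uint32_t demo_crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Store the value least significant byte first, regardless of the
 * host byte order, so no byte-order casts are needed afterwards. */
void demo_put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xFF);
	out[1] = (uint8_t)((v >> 8) & 0xFF);
	out[2] = (uint8_t)((v >> 16) & 0xFF);
	out[3] = (uint8_t)((v >> 24) & 0xFF);
}
```

For the standard check string "123456789" this yields 0xE3069283, stored as the bytes 83 92 06 E3.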
     
  • This is a new version of my patch, now using a module parameter instead
    of a sysctl, so that the option is harder to find. Please note that,
    once the module is loaded, it is still possible to change the value of
    the parameter in /sys/module/sctp/parameters/, which is useful if you
    want to do performance comparisons without rebooting.

    Computation of SCTP checksums significantly affects the performance of
    SCTP. For example, using two dual-Opteron 246 connected using a Gbe
    network, it was not possible to achieve more than ~730 Mbps, compared to
    941 Mbps after disabling SCTP checksums.
    Unfortunately, SCTP checksum offloading in NICs is not commonly
    available (yet).

    By default, checksums are still enabled, of course.

    Signed-off-by: Lucas Nussbaum
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Lucas Nussbaum
     

23 Jan, 2009

1 commit

  • There is a race between sctp_rcv() and sctp_accept() where we
    have moved the association from the listening socket to the
    accepted socket, but sctp_rcv() processing cached the old
    socket and continues to use it.

    The easy solution is to check for the socket mismatch once we've
    grabbed the socket lock. If we hit a mismatch, that means
    that we are currently holding the lock on the listening socket,
    but the association is referencing a newly accepted socket. We need
    to drop the lock on the old socket and grab the lock on the new one.

    A more proper solution might be to create accepted sockets when
    the new association is established, similar to TCP. That would
    eliminate the race for 1-to-1 style sockets, but it would still
    exist for 1-to-many sockets where a user wished to peel off an
    association. For now, we'll live with this easy solution as
    it addresses the problem.

    Reported-by: Michal Hocko
    Reported-by: Karsten Keil
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
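
The "lock, re-check, switch" step described above might look roughly like this. It is a single-threaded model, and the `demo_` types and the counter that stands in for a socket lock are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct demo_sock {
	int lock_depth;		/* 1 while locked, 0 otherwise */
};

struct demo_assoc {
	struct demo_sock *sk;	/* socket that currently owns the association */
};

void demo_lock(struct demo_sock *sk)
{
	assert(sk != NULL && sk->lock_depth == 0);
	sk->lock_depth++;
}

void demo_unlock(struct demo_sock *sk)
{
	assert(sk != NULL && sk->lock_depth == 1);
	sk->lock_depth--;
}

/* Lock the socket cached by the receive path, then verify that the
 * association still belongs to it. If the association has migrated
 * to an accepted socket, drop the stale lock and take the new one.
 * Returns the socket whose lock is held on return. */
struct demo_sock *demo_lock_assoc_sock(struct demo_sock *cached,
				       struct demo_assoc *asoc)
{
	struct demo_sock *sk = cached;

	assert(asoc != NULL);
	demo_lock(sk);
	if (asoc->sk != sk) {
		demo_unlock(sk);
		sk = asoc->sk;
		demo_lock(sk);
	}
	return sk;
}
```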
     

23 Oct, 2008

1 commit

  • If an ICMP "packet too big" message is received with an MTU larger than
    the current PMTU, SCTP will still accept this ICMP message and sync the
    PMTU of the assoc with the wrong MTU.

    Endpoint A                  Endpoint B
    (ESTABLISHED)               (ESTABLISHED)
    ICMP       --------->
    (packet too big, MTU too large)
    sync PMTU

    This patch fixes the problem by dropping that ICMP message.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Wei Yongjun
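
A minimal sketch of the check described above, assuming a hypothetical helper that owns the path MTU value. Treating an MTU equal to the current PMTU as a no-op is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Handle an ICMP "packet too big" report for one path.
 * Returns true if the path MTU was lowered, false if the
 * message was dropped. */
bool demo_icmp_frag_needed(uint32_t *pathmtu, uint32_t reported_mtu)
{
	assert(pathmtu != NULL);
	if (reported_mtu >= *pathmtu)
		return false;	/* would not shrink the PMTU: drop it */
	*pathmtu = reported_mtu;
	return true;
}
```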
     

19 Jul, 2008

1 commit


17 Jul, 2008

1 commit


15 Jul, 2008

1 commit


20 Jun, 2008

1 commit

  • This patch adds validation of the Initiate Tag and chunk type when the
    Verification Tag is 0 while handling an ICMP message.

    RFC 4960, Appendix C. ICMP Handling

    ICMP6) An implementation MUST validate that the Verification Tag
    contained in the ICMP message matches the Verification Tag of the peer.
    If the Verification Tag is not 0 and does NOT match, discard the ICMP
    message. If it is 0 and the ICMP message contains enough bytes to
    verify that the chunk type is an INIT chunk and that the Initiate Tag
    matches the tag of the peer, continue with ICMP7. If the ICMP message
    is too short or the chunk type or the Initiate Tag does not match,
    silently discard the packet.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Wei Yongjun
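
The ICMP6 rule quoted above can be sketched as a standalone check. The function name and signature are hypothetical; `chunk` is assumed to point at the first chunk echoed back in the ICMP payload, directly after the SCTP common header. The INIT chunk type value 1 and the position of the Initiate Tag right after the 4-byte chunk header come from RFC 4960.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CID_INIT 1		/* INIT chunk type, RFC 4960 */

enum demo_verdict { DEMO_DISCARD, DEMO_CONTINUE };

enum demo_verdict demo_icmp_check_vtag(uint32_t icmp_vtag, uint32_t peer_vtag,
				       const uint8_t *chunk, size_t chunk_len)
{
	uint32_t init_tag;

	assert(chunk != NULL || chunk_len == 0);

	/* Non-zero tag: it must match the peer's tag. */
	if (icmp_vtag != 0)
		return icmp_vtag == peer_vtag ? DEMO_CONTINUE : DEMO_DISCARD;

	/* Zero tag: need the 4-byte chunk header plus the 4-byte
	 * Initiate Tag to verify anything. */
	if (chunk_len < 8)
		return DEMO_DISCARD;
	if (chunk[0] != DEMO_CID_INIT)
		return DEMO_DISCARD;

	/* The Initiate Tag is carried in network byte order. */
	init_tag = ((uint32_t)chunk[4] << 24) | ((uint32_t)chunk[5] << 16) |
		   ((uint32_t)chunk[6] << 8) | (uint32_t)chunk[7];
	return init_tag == peer_vtag ? DEMO_CONTINUE : DEMO_DISCARD;
}
```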
     

10 Apr, 2008

1 commit


18 Mar, 2008

2 commits


06 Mar, 2008

1 commit


05 Feb, 2008

1 commit

  • I was notified by Randy Stewart that lksctp claims to be
    "the reference implementation". First of all, "the
    reference implementation" was the original implementation
    of SCTP in userspace written by Randy and a few others.
    Second, after looking at the definition of 'reference implementation',
    we don't really meet the requirements.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     

29 Jan, 2008

2 commits


10 Nov, 2007

1 commit


11 Oct, 2007

1 commit


26 Sep, 2007

1 commit

  • RFC 4460 and future RFC 4960 (2960-bis) specify that packets
    with bundled INIT chunks need to be dropped. We currently do
    that only after processing any leading chunks. For OOTB chunks,
    since we already walk the entire packet, we should discard packets
    with bundled INITs.

    There are other chunks that MUST NOT be bundled, but the spec
    is silent on their treatment. Thus, we'll leave their treatment
    alone for the moment.

    Signed-off-by: Vlad Yasevich
    Acked-by: Wei Yongjun

    Vlad Yasevich
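
A rough sketch of detecting a bundled INIT while walking a packet's chunks. The function name is hypothetical and `chunks` is assumed to point just past the SCTP common header. The chunk header layout (1-byte type, 1-byte flags, 2-byte length in network byte order, chunks padded to 4 bytes) and the INIT type value 1 come from RFC 4960. Stopping the walk at a malformed length is a simplification of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CID_INIT 1		/* INIT chunk type, RFC 4960 */

/* Returns true if the packet carries an INIT chunk together with
 * at least one other chunk. */
bool demo_has_bundled_init(const uint8_t *chunks, size_t len)
{
	size_t off = 0;
	size_t count = 0;
	bool seen_init = false;

	assert(chunks != NULL || len == 0);
	while (off + 4 <= len) {
		uint8_t type = chunks[off];
		size_t clen = ((size_t)chunks[off + 2] << 8) | chunks[off + 3];

		if (clen < 4)
			break;			/* malformed: stop walking */
		count++;
		if (type == DEMO_CID_INIT)
			seen_init = true;
		off += (clen + 3) & ~(size_t)3;	/* skip padding as well */
	}
	return seen_init && count > 1;
}
```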
     

01 Aug, 2007

1 commit


14 Jun, 2007

2 commits

  • Currently, if the socket is owned by the user, we drop the ICMP
    message. As a result, SCTP forgets that the path MTU changed and
    never adjusts its estimate. This causes all subsequent
    packets to be fragmented. With this patch, we'll flag the association
    to indicate that it needs to update its estimate based on the already
    updated routing information.

    Signed-off-by: Vlad Yasevich
    Acked-by: Sridhar Samudrala

    Vlad Yasevich
     
  • Introduce a new function, sctp_transport_update_pmtu, that updates
    the transport's and destination cache's view of the path MTU.

    Signed-off-by: Vlad Yasevich
    Acked-by: Sridhar Samudrala

    Vlad Yasevich
     

26 Apr, 2007

7 commits

  • When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
    maps to the semantics of CHECKSUM_UNNECESSARY. Therefore we should
    treat it as such in the stack.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
    on 64bit architectures, allowing us to combine the 4 bytes hole left by the
    layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
    64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
    :-)

    Many calculations that previously required that skb->{transport,network,
    mac}_header be first converted to a pointer now can be done directly, being
    meaningful as offsets or pointers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
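
The offset-instead-of-pointer idea described above can be illustrated with a small user-space model. The `demo_buff` structure and helper names are hypothetical stand-ins for sk_buff and its accessors, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header position stored as a 4-byte offset from 'head' rather than
 * as an 8-byte pointer (on 64-bit architectures). */
typedef uint32_t demo_buff_data_t;

struct demo_buff {
	unsigned char *head;	/* start of the buffer */
	unsigned char *data;	/* current start of packet data */
	demo_buff_data_t transport_header;
};

/* Convert the stored offset back into a pointer when one is needed. */
unsigned char *demo_transport_header(const struct demo_buff *b)
{
	assert(b != NULL && b->head != NULL);
	return b->head + b->transport_header;
}

/* Record the current data position as the transport header. */
void demo_reset_transport_header(struct demo_buff *b)
{
	assert(b != NULL && b->data >= b->head);
	b->transport_header = (demo_buff_data_t)(b->data - b->head);
}
```

Comparing or subtracting two such headers works directly on the offsets, with no conversion to pointers.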
     
  • With this we save 8 bytes per network packet, leaving a 4 bytes hole to be used
    in further shrinking work, likely with the offsetization of other pointers,
    such as ->{data,tail,end}, at the cost of adds, that were minimized by the
    usual practice of setting skb->{mac,nh,n}.raw to a local variable that is then
    accessed multiple times in each function, it also is not more expensive than
    before with regards to most of the handling of such headers, like setting one
    of these headers to another (transport to network, etc), or subtracting, adding
    to/from it, comparing them, etc.

    Now we have this layout for sk_buff on a x86_64 machine:

    [acme@mica net-2.6.22]$ pahole vmlinux sk_buff
    struct sk_buff {
    struct sk_buff * next; /* 0 8 */
    struct sk_buff * prev; /* 8 8 */
    struct rb_node rb; /* 16 24 */
    struct sock * sk; /* 40 8 */
    ktime_t tstamp; /* 48 8 */
    struct net_device * dev; /* 56 8 */
    /* --- cacheline 1 boundary (64 bytes) --- */
    struct net_device * input_dev; /* 64 8 */
    sk_buff_data_t transport_header; /* 72 4 */
    sk_buff_data_t network_header; /* 76 4 */
    sk_buff_data_t mac_header; /* 80 4 */

    /* XXX 4 bytes hole, try to pack */

    struct dst_entry * dst; /* 88 8 */
    struct sec_path * sp; /* 96 8 */
    char cb[48]; /* 104 48 */
    /* cacheline 2 boundary (128 bytes) was 24 bytes ago*/
    unsigned int len; /* 152 4 */
    unsigned int data_len; /* 156 4 */
    unsigned int mac_len; /* 160 4 */
    union {
    __wsum csum; /* 4 */
    __u32 csum_offset; /* 4 */
    }; /* 164 4 */
    __u32 priority; /* 168 4 */
    __u8 local_df:1; /* 172 1 */
    __u8 cloned:1; /* 172 1 */
    __u8 ip_summed:2; /* 172 1 */
    __u8 nohdr:1; /* 172 1 */
    __u8 nfctinfo:3; /* 172 1 */
    __u8 pkt_type:3; /* 173 1 */
    __u8 fclone:2; /* 173 1 */
    __u8 ipvs_property:1; /* 173 1 */

    /* XXX 2 bits hole, try to pack */

    __be16 protocol; /* 174 2 */
    void (*destructor)(struct sk_buff *); /* 176 8 */
    struct nf_conntrack * nfct; /* 184 8 */
    /* --- cacheline 3 boundary (192 bytes) --- */
    struct sk_buff * nfct_reasm; /* 192 8 */
    struct nf_bridge_info *nf_bridge; /* 200 8 */
    __u16 tc_index; /* 208 2 */
    __u16 tc_verd; /* 210 2 */
    dma_cookie_t dma_cookie; /* 212 4 */
    __u32 secmark; /* 216 4 */
    __u32 mark; /* 220 4 */
    unsigned int truesize; /* 224 4 */
    atomic_t users; /* 228 4 */
    unsigned char * head; /* 232 8 */
    unsigned char * data; /* 240 8 */
    unsigned char * tail; /* 248 8 */
    /* --- cacheline 4 boundary (256 bytes) --- */
    unsigned char * end; /* 256 8 */
    }; /* size: 264, cachelines: 5 */
    /* sum members: 260, holes: 1, sum holes: 4 */
    /* bit holes: 1, sum bit holes: 2 bits */
    /* last cacheline: 8 bytes */

    On 32 bits nothing changes, and pointers continue to be used with the compiler
    turning all this abstraction layer into dust. But there are some sk_buff
    validation tricks that are now possible, humm... :-)

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
    skb->mac to skb->mac_header, to match the names of the associated helpers
    (skb[_[re]set]_{transport,network,mac}_header).

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For consistency with all the other skb->h.raw accessors.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo