14 Jun, 2007

6 commits


04 Jun, 2007

1 commit


25 May, 2007

1 commit


11 May, 2007

3 commits


09 May, 2007

1 commit


05 May, 2007

4 commits


04 May, 2007

1 commit

  • Cleanup of dev_base list use, with the aim of simplifying making the
    device list per-namespace. In almost every case, use of the dev_base
    variable and the dev->next pointer could easily be replaced by a
    for_each_netdev loop. A few of the most complicated places were
    converted to using first_netdev()/next_netdev().

    Signed-off-by: Pavel Emelianov
    Acked-by: Kirill Korotaev
    Signed-off-by: David S. Miller

    Pavel Emelianov
     

29 Apr, 2007

1 commit


26 Apr, 2007

19 commits

  • Spring cleaning time...

    There seem to be a lot of places in the network code that have
    extra bogus semicolons after conditionals. Most common is a
    bogus semicolon after: switch () { }

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
    maps to the semantics of CHECKSUM_UNNECESSARY. Therefore we should
    treat it as such in the stack.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • As stated in the sctp socket api draft:

    sac_info: variable

    If the sac_state is SCTP_COMM_LOST and an ABORT chunk was received
    for this association, sac_info[] contains the complete ABORT chunk as
    defined in the SCTP specification RFC2960 [RFC2960] section 3.3.7.

    We now save received ABORT chunks into the sac_info field and pass that
    to the user.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • Parameters only take effect when a corresponding flag bit is set
    and a value is specified. This means we need to check the flags
    in addition to checking for non-zero value.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • This option induces partial delivery to run as soon
    as the specified amount of data has been accumulated on
    the association. However, we give preference to fully
    reassembled messages over PD messages. In either case,
    receive window and buffer space are freed up.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • This option was introduced in draft-ietf-tsvwg-sctpsocket-13. It
    prevents head-of-line blocking in the case of a one-to-many endpoint.
    Applications enabling this option really must enable the SCTP_SNDRCV
    event so that they know where the data belongs. Based on an
    earlier patch by Ivan Skytte Jørgensen.

    Additionally, this functionality now permits multiple associations
    on the same endpoint to enter partial delivery. Applications should
    be extra careful, when using this functionality, to track EOR indicators.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • So that it is also an offset from skb->head; this reduces its size from 8
    to 4 bytes on 64-bit architectures, allowing us to fill the 4-byte hole
    left by the layer-header conversion, reducing struct sk_buff to 256 bytes,
    i.e. four 64-byte cachelines, and since the sk_buff slab cache is
    SLAB_HWCACHE_ALIGN...
    :-)

    Many calculations that previously required that skb->{transport,network,
    mac}_header be first converted to a pointer now can be done directly, being
    meaningful as offsets or pointers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • With this we save 8 bytes per network packet, leaving a 4-byte hole to be
    used in further shrinking work, likely with the offsetization of other
    pointers such as ->{data,tail,end}. The cost is some additions, minimized
    by the usual practice of setting skb->{mac,nh,h}.raw to a local variable
    that is then accessed multiple times in each function. It is also no more
    expensive than before for most handling of such headers, like setting one
    of these headers from another (transport from network, etc.), subtracting
    from or adding to one, comparing them, etc.

    Now we have this layout for sk_buff on a x86_64 machine:

    [acme@mica net-2.6.22]$ pahole vmlinux sk_buff
    struct sk_buff {
            struct sk_buff *       next;              /*   0   8 */
            struct sk_buff *       prev;              /*   8   8 */
            struct rb_node         rb;                /*  16  24 */
            struct sock *          sk;                /*  40   8 */
            ktime_t                tstamp;            /*  48   8 */
            struct net_device *    dev;               /*  56   8 */
            /* --- cacheline 1 boundary (64 bytes) --- */
            struct net_device *    input_dev;         /*  64   8 */
            sk_buff_data_t         transport_header;  /*  72   4 */
            sk_buff_data_t         network_header;    /*  76   4 */
            sk_buff_data_t         mac_header;        /*  80   4 */

            /* XXX 4 bytes hole, try to pack */

            struct dst_entry *     dst;               /*  88   8 */
            struct sec_path *      sp;                /*  96   8 */
            char                   cb[48];            /* 104  48 */
            /* --- cacheline 2 boundary (128 bytes) was 24 bytes ago --- */
            unsigned int           len;               /* 152   4 */
            unsigned int           data_len;          /* 156   4 */
            unsigned int           mac_len;           /* 160   4 */
            union {
                    __wsum         csum;              /*       4 */
                    __u32          csum_offset;       /*       4 */
            };                                        /* 164   4 */
            __u32                  priority;          /* 168   4 */
            __u8                   local_df:1;        /* 172   1 */
            __u8                   cloned:1;          /* 172   1 */
            __u8                   ip_summed:2;       /* 172   1 */
            __u8                   nohdr:1;           /* 172   1 */
            __u8                   nfctinfo:3;        /* 172   1 */
            __u8                   pkt_type:3;        /* 173   1 */
            __u8                   fclone:2;          /* 173   1 */
            __u8                   ipvs_property:1;   /* 173   1 */

            /* XXX 2 bits hole, try to pack */

            __be16                 protocol;          /* 174   2 */
            void                   (*destructor)(struct sk_buff *); /* 176 8 */
            struct nf_conntrack *  nfct;              /* 184   8 */
            /* --- cacheline 3 boundary (192 bytes) --- */
            struct sk_buff *       nfct_reasm;        /* 192   8 */
            struct nf_bridge_info *nf_bridge;         /* 200   8 */
            __u16                  tc_index;          /* 208   2 */
            __u16                  tc_verd;           /* 210   2 */
            dma_cookie_t           dma_cookie;        /* 212   4 */
            __u32                  secmark;           /* 216   4 */
            __u32                  mark;              /* 220   4 */
            unsigned int           truesize;          /* 224   4 */
            atomic_t               users;             /* 228   4 */
            unsigned char *        head;              /* 232   8 */
            unsigned char *        data;              /* 240   8 */
            unsigned char *        tail;              /* 248   8 */
            /* --- cacheline 4 boundary (256 bytes) --- */
            unsigned char *        end;               /* 256   8 */
    }; /* size: 264, cachelines: 5 */
       /* sum members: 260, holes: 1, sum holes: 4 */
       /* bit holes: 1, sum bit holes: 2 bits */
       /* last cacheline: 8 bytes */

    On 32 bits nothing changes, and pointers continue to be used with the compiler
    turning all this abstraction layer into dust. But there are some sk_buff
    validation tricks that are now possible, humm... :-)

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
    skb->mac to skb->mac_header, to match the names of the associated helpers
    (skb[_[re]set]_{transport,network,mac}_header).

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For consistency with all the other skb->h.raw accessors.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the quite common 'skb->h.raw - skb->data' sequence.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Now the skb->nh union has just one member, .raw, i.e. it is just like the
    skb->mac union, strange, no? I'm just leaving it like that till the
    transport layer is done with it, when we'll rename skb->mac.raw to
    skb->mac_header (or ->mac_header_offset?), ditto for ->{h,nh}.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Now for the cases related to this form:

    skb->nh.ipv6h = (struct ipv6hdr *)skb_put(skb, length);

    That, as the others, is done when skb->tail is still equal to skb->data, making
    the conversion to skb_reset_network_header possible.

    Also one more case equivalent to skb->nh.raw = skb->data, of this form:

    iph = (struct ipv6hdr *)skb->data;

    skb->nh.ipv6h = iph;

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • This time of the type:

    skb->nh.iph = (struct iphdr *)skb->data;

    That is completely equivalent to:

    skb->nh.raw = skb->data;

    Wonder why people love casts... :-)

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     

19 Apr, 2007

1 commit

  • The way partial delivery is currently implemented, it is possible to
    interleave a message (either from another stream, or unordered) that
    is not part of the partial delivery process. The only way to do this is
    for a message not to be a fragment and to be 'in order' or unordered
    for a given stream. This will result in bypassing the reassembly/ordering
    queues where things live during partial delivery, and the
    message will be delivered to the socket in the middle of partial delivery.

    This is a two-fold problem, in that:
    1. the app now must check the stream-id and flags, which it may not
    be doing.
    2. this clears partial delivery state from the association and results
    in the ULP hanging.

    This patch is a band-aid over a much bigger problem in that we
    don't do stream interleave.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

18 Apr, 2007

2 commits

  • During the sctp_bindx() call to add additional addresses to the
    endpoint, any v4mapped addresses are converted and stored as regular
    v4 addresses. However, when trying to remove these addresses, the
    v4mapped addresses are not converted and the operation fails. This
    patch unmaps the addresses during the remove operation as well.

    Signed-off-by: Paolo Galtieri
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Paolo Galtieri
     
  • In the current implementation, LKSCTP does receive buffer accounting for
    data in sctp_receive_queue and pd_lobby. However, LKSCTP doesn't do
    accounting for data in frag_list when data is fragmented. In addition,
    LKSCTP doesn't do accounting for data in the reasm and lobby queues in
    struct sctp_ulpq.
    When there is data in these queues, an assertion-failure message is
    printed in inet_sock_destruct because sk_rmem_alloc of the old socket
    does not reach 0 when the socket is destroyed.

    Signed-off-by: Tsutomu Fujii
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Tsutomu Fujii