09 Dec, 2009

1 commit

  • First patch changes __inet_hash_nolisten() and __inet6_hash()
    to take a timewait parameter, so that the timewait socket can be
    unhashed from ehash at the same time the new socket is inserted
    into the hash.

    This makes sure the timewait socket won't be found by a concurrent
    writer in __inet_check_established().

    Reported-by: kapil dakhane
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
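
The idea in the commit above (insert the new socket and unhash the timewait socket inside one critical section) can be sketched in user space. This is a minimal sketch, not kernel code: `struct node`, `struct bucket`, `chain_del()`, `hash_nolisten()` and `count_key()` are hypothetical stand-ins, and a pthread mutex stands in for the ehash bucket lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct node {
    struct node *next;
    int key;        /* stands in for the connection 4-tuple */
    int timewait;   /* 1 if this entry models a timewait socket */
};

struct bucket {
    pthread_mutex_t lock;   /* stands in for the ehash bucket lock */
    struct node *head;
};

/* Unlink n from the chain. Caller holds b->lock. Returns 1 if n was found. */
static int chain_del(struct bucket *b, struct node *n)
{
    for (struct node **pp = &b->head; *pp; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;
            n->next = NULL;
            return 1;
        }
    }
    return 0;
}

/*
 * Insert sk and, when tw is non-NULL, unhash tw under the same lock hold.
 * No other lock holder can observe a state where both entries (or neither)
 * are present for the same key.
 */
static int hash_nolisten(struct bucket *b, struct node *sk, struct node *tw)
{
    int tw_removed = 0;

    pthread_mutex_lock(&b->lock);
    sk->next = b->head;
    b->head = sk;
    if (tw)
        tw_removed = chain_del(b, tw);
    pthread_mutex_unlock(&b->lock);
    return tw_removed;
}

/* What a concurrent writer would see: number of entries with this key. */
static int count_key(struct bucket *b, int key)
{
    int n = 0;

    pthread_mutex_lock(&b->lock);
    for (struct node *p = b->head; p; p = p->next)
        if (p->key == key)
            n++;
    pthread_mutex_unlock(&b->lock);
    return n;
}
```

Doing the two steps under separate lock holds would leave a window in which a second writer walking the chain could still match the stale timewait entry.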
     

19 Oct, 2009

1 commit

  • In order to have better cache layouts of struct sock (separate zones
    for rx/tx paths), we need this preliminary patch.

    Goal is to move fields used at lookup time into the first
    read-mostly cache line (inside struct sock_common) and move sk_refcnt
    to a separate cache line (only written by rx path).

    This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
    sport and id fields. This allows a future patch to define these
    fields as macros, like sk_refcnt, without name clashes.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
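
Why the prefix matters can be shown with a small sketch. The layouts below are hypothetical and heavily cut down (`sock_common_sketch`, `inet_sock_sketch`, `flow_sketch` and `peer_of()` are illustration names, not kernel definitions); the point is only that a prefixed name can later be turned into a macro without rewriting unrelated `daddr` members elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down layouts; the real structures have many more fields. */
struct sock_common_sketch {
    uint32_t skc_daddr;       /* lookup keys kept together up front */
    uint32_t skc_rcv_saddr;
};

struct inet_sock_sketch {
    struct sock_common_sketch common;
    uint16_t inet_id;         /* a field that stays an ordinary member */
};

/*
 * Because the names carry the inet_ prefix, they can be defined as macros
 * that redirect into the common part. An unprefixed "#define daddr ..."
 * would also rewrite every other struct's daddr member.
 */
#define inet_daddr     common.skc_daddr
#define inet_rcv_saddr common.skc_rcv_saddr

struct flow_sketch {
    uint32_t daddr;           /* unrelated member, untouched by the macros */
};

static uint32_t peer_of(const struct inet_sock_sketch *inet)
{
    return inet->inet_daddr;  /* expands to inet->common.skc_daddr */
}
```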
     

03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of:
        dst_release(skb->dst);
        skb->dst = NULL;

    Delete the skb->dst field.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
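
A user-space mock of the three accessors may help show how call sites change. This is a sketch under stated assumptions: the private field name `_dst`, the integer `refcnt`, and the simplified `dst_release()` are hypothetical and do not reflect the kernel's actual representation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration only. */
struct dst_entry { int refcnt; };
struct sk_buff   { struct dst_entry *_dst; };  /* field name is an assumption */

/* Simplified release: drop one reference if there is a dst. */
static void dst_release(struct dst_entry *dst)
{
    if (dst)
        dst->refcnt--;
}

/* Read the dst attached to the skb. */
static struct dst_entry *skb_dst(const struct sk_buff *skb)
{
    return skb->_dst;
}

/* Attach a dst to the skb (the caller already owns a reference). */
static void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
{
    skb->_dst = dst;
}

/* Replaces the open-coded pair: dst_release(skb->dst); skb->dst = NULL; */
static void skb_dst_drop(struct sk_buff *skb)
{
    dst_release(skb->_dst);
    skb->_dst = NULL;
}
```

Routing every access through accessors is what makes it possible to delete the public field: callers no longer depend on how the pointer is stored.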
     

08 Oct, 2008

2 commits


17 Jun, 2008

2 commits

  • There are many possible ways to add this "salt", thus I made this
    patch to be the last in the series to change it if required.

    Currently I propose to use the struct net pointer itself as this
    salt, but since this pointer is most often cache-line aligned, shift
    it right to eliminate the low bits that are most often zero.

    After this, simply add this mix to prepared hashfn-s.

    For CONFIG_NET_NS=n case this salt is 0 and no changes in hashfn
    appear.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
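
The salt described above can be sketched as follows. Assumptions: a 64-byte cache line (`CACHE_SHIFT_SKETCH`), a hypothetical `struct net_sketch`, an illustrative XOR-and-fold base hash, and a NULL pointer standing in for the CONFIG_NET_NS=n case where the salt is 0.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_SHIFT_SKETCH 6    /* assumption: 64-byte cache lines */

struct net_sketch { int dummy; };  /* hypothetical stand-in for struct net */

/*
 * Use the namespace pointer as the salt. Cache-line aligned pointers have
 * their low bits clear, so shift them away to keep only bits that vary.
 * A NULL pointer models CONFIG_NET_NS=n: the salt is 0.
 */
static uint32_t net_hash_mix_sketch(const struct net_sketch *net)
{
    return (uint32_t)((uintptr_t)net >> CACHE_SHIFT_SKETCH);
}

/* Illustrative base hash over the 4-tuple, with the salt added at the end. */
static uint32_t ehashfn_sketch(const struct net_sketch *net,
                               uint32_t laddr, uint16_t lport,
                               uint32_t faddr, uint16_t fport)
{
    uint32_t h = (laddr ^ lport) ^ (faddr ^ fport);

    h ^= h >> 16;
    h ^= h >> 8;
    return h + net_hash_mix_sketch(net);
}
```

Two namespaces allocated one cache line apart get salts that differ by 1, so identical 4-tuples in different namespaces tend to land in different slots.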
     
  • Same as for inet_hashfn, prepare its ipv6 incarnation.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

03 Feb, 2008

1 commit

  • This way we can remove TCP and DCCP specific versions of:

    sk->sk_prot->get_port: both v4 and v6 use inet_csk_get_port
    sk->sk_prot->hash: inet_hash is used directly, only v6 needs
    a specific version to deal with mapped sockets
    sk->sk_prot->unhash: both v4 and v6 use inet_unhash directly

    struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so
    that inet_csk_get_port can find the per family routine.

    Now only the lookup routines receive as a parameter a struct inet_hashtable.

    With this we further reuse code, reducing the difference among INET transport
    protocols.

    Eventually work has to be done on UDP and SCTP to make them share this
    infrastructure and get as a bonus inet_diag interfaces so that iproute can be
    used with these protocols.

    net-2.6/net/ipv4/inet_hashtables.c:
    struct proto | +8
    struct inet_connection_sock_af_ops | +8
    2 structs changed
    __inet_hash_nolisten | +18
    __inet_hash | -210
    inet_put_port | +8
    inet_bind_bucket_create | +1
    __inet_hash_connect | -8
    5 functions changed, 27 bytes added, 218 bytes removed, diff: -191

    net-2.6/net/core/sock.c:
    proto_seq_show | +3
    1 function changed, 3 bytes added, diff: +3

    net-2.6/net/ipv4/inet_connection_sock.c:
    inet_csk_get_port | +15
    1 function changed, 15 bytes added, diff: +15

    net-2.6/net/ipv4/tcp.c:
    tcp_set_state | -7
    1 function changed, 7 bytes removed, diff: -7

    net-2.6/net/ipv4/tcp_ipv4.c:
    tcp_v4_get_port | -31
    tcp_v4_hash | -48
    tcp_v4_destroy_sock | -7
    tcp_v4_syn_recv_sock | -2
    tcp_unhash | -179
    5 functions changed, 267 bytes removed, diff: -267

    net-2.6/net/ipv6/inet6_hashtables.c:
    __inet6_hash | +8
    1 function changed, 8 bytes added, diff: +8

    net-2.6/net/ipv4/inet_hashtables.c:
    inet_unhash | +190
    inet_hash | +242
    2 functions changed, 432 bytes added, diff: +432

    vmlinux:
    16 functions changed, 485 bytes added, 492 bytes removed, diff: -7

    /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c:
    tcp_v6_get_port | -31
    tcp_v6_hash | -7
    tcp_v6_syn_recv_sock | -9
    3 functions changed, 47 bytes removed, diff: -47

    /home/acme/git/net-2.6/net/dccp/proto.c:
    dccp_destroy_sock | -7
    dccp_unhash | -179
    dccp_hash | -49
    dccp_set_state | -7
    dccp_done | +1
    5 functions changed, 1 bytes added, 242 bytes removed, diff: -241

    /home/acme/git/net-2.6/net/dccp/ipv4.c:
    dccp_v4_get_port | -31
    dccp_v4_request_recv_sock | -2
    2 functions changed, 33 bytes removed, diff: -33

    /home/acme/git/net-2.6/net/dccp/ipv6.c:
    dccp_v6_get_port | -31
    dccp_v6_hash | -7
    dccp_v6_request_recv_sock | +5
    3 functions changed, 5 bytes added, 38 bytes removed, diff: -33

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
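
The structure this commit moves toward (one generic get_port routine shared by all protocols, with only the conflict check supplied per address family) can be sketched with function-pointer tables. All names ending in `_sketch` are hypothetical, and the conflict rules (port 80 for v4, port 443 for v6) are arbitrary placeholders chosen only to make the two families distinguishable.

```c
#include <assert.h>
#include <stddef.h>

struct sock_sketch;

/* Per address family operations; bind_conflict is the new member. */
struct af_ops_sketch {
    int (*bind_conflict)(const struct sock_sketch *sk, int port);
};

/* Per protocol operations. */
struct proto_sketch {
    const char *name;
    int (*get_port)(struct sock_sketch *sk, int port);
};

struct sock_sketch {
    const struct proto_sketch *prot;
    const struct af_ops_sketch *af_ops;
    int port;
};

/* Placeholder family-specific rules (arbitrary, for illustration). */
static int v4_bind_conflict(const struct sock_sketch *sk, int port)
{
    (void)sk;
    return port == 80;
}

static int v6_bind_conflict(const struct sock_sketch *sk, int port)
{
    (void)sk;
    return port == 443;
}

/* One generic routine: it finds the per-family check through af_ops. */
static int csk_get_port_sketch(struct sock_sketch *sk, int port)
{
    if (sk->af_ops->bind_conflict(sk, port))
        return 1;               /* non-zero: port not available */
    sk->port = port;
    return 0;
}

static const struct af_ops_sketch ipv4_ops = { v4_bind_conflict };
static const struct af_ops_sketch ipv6_ops = { v6_bind_conflict };

/* Both protocols point at the same generic get_port. */
static const struct proto_sketch tcp_sketch  = { "TCP",  csk_get_port_sketch };
static const struct proto_sketch dccp_sketch = { "DCCP", csk_get_port_sketch };
```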
     

01 Feb, 2008

1 commit


26 Apr, 2007

1 commit

  • The days are gone when this was not an issue; there are folks out
    there with huge bot networks that can be used to attack the
    established hash tables on remote systems.

    So just like the routing cache and connection tracking
    hash, use Jenkins hash with random secret input.

    Signed-off-by: David S. Miller

    David S. Miller
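
A keyed hash of the kind described above can be sketched like this. It is an illustration, not the kernel's code: the mixing steps follow Bob Jenkins' lookup2 style, the exact shifts should be treated as illustrative, and `jhash_3words_sketch()` and `ehashfn_secret_sketch()` are hypothetical names. The secret would be chosen randomly at boot so that a remote attacker cannot precompute 4-tuples that collide in one chain.

```c
#include <assert.h>
#include <stdint.h>

#define GOLDEN_RATIO_SKETCH 0x9e3779b9u   /* arbitrary non-zero start value */

/* Mix three 32-bit words plus a secret initval (lookup2-style sketch). */
static uint32_t jhash_3words_sketch(uint32_t a, uint32_t b, uint32_t c,
                                    uint32_t initval)
{
    a += GOLDEN_RATIO_SKETCH;
    b += GOLDEN_RATIO_SKETCH;
    c += initval;

    a -= b; a -= c; a ^= (c >> 13);
    b -= c; b -= a; b ^= (a << 8);
    c -= a; c -= b; c ^= (b >> 13);
    a -= b; a -= c; a ^= (c >> 12);
    b -= c; b -= a; b ^= (a << 16);
    c -= a; c -= b; c ^= (b >> 5);
    a -= b; a -= c; a ^= (c >> 3);
    b -= c; b -= a; b ^= (a << 10);
    c -= a; c -= b; c ^= (b >> 15);

    return c;
}

/* Hash a connection 4-tuple with a per-boot random secret. */
static uint32_t ehashfn_secret_sketch(uint32_t laddr, uint16_t lport,
                                      uint32_t faddr, uint16_t fport,
                                      uint32_t secret)
{
    return jhash_3words_sketch(laddr, faddr,
                               ((uint32_t)lport << 16) | fport,
                               secret);
}
```

The bucket index is then the hash masked by the table size minus one, assuming a power-of-two table.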
     

03 Dec, 2006

1 commit


26 Apr, 2006

1 commit


10 Apr, 2006

1 commit

  • Deinline a few functions which produce 200+ bytes of code.

    Size Uses Wasted Name and definition
    ===== ==== ====== ================================================
    429 3 818 __inet6_lookup include/net/inet6_hashtables.h
    404 2 384 __inet6_lookup_established include/net/inet6_hashtables.h
    206 3 372 __inet6_hash include/net/inet6_hashtables.h

    Signed-off-by: Denis Vlasenko
    Signed-off-by: David S. Miller

    Denis Vlasenko
     

04 Jan, 2006

2 commits


04 Oct, 2005

1 commit

  • Arnaldo and I agreed it could be applied now, because I have other
    pending patches depending on this one (Thank you Arnaldo)

    (The other important patch moves skc_refcnt to a separate cache line,
    so that the SMP/NUMA performance doesn't suffer from cache line ping pongs)

    1) First some performance data :
    --------------------------------

    tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()

    The most time critical code is :

    sk_for_each(sk, node, &head->chain) {
    if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
    goto hit; /* You sunk my battleship! */
    }

    The sk_for_each() does use prefetch() hints but only the beginning of
    "struct sock" is prefetched.

    As the first INET_MATCH comparison uses inet_sk(__sk)->daddr, which is
    far away from the beginning of "struct sock", it has to bring a cold
    cache line into the CPU cache. Each iteration has to use at least 2
    cache lines.

    This can be problematic if some chains are very long.

    2) The goal
    -----------

    The idea I had is to change things so that INET_MATCH() may return
    FALSE in 99% of cases only using the data already in the CPU cache,
    using one cache line per iteration.

    3) Description of the patch
    ---------------------------

    Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
    filling a 32 bits hole on 64 bits platforms.

    struct sock_common {
    unsigned short skc_family;
    volatile unsigned char skc_state;
    unsigned char skc_reuse;
    int skc_bound_dev_if;
    struct hlist_node skc_node;
    struct hlist_node skc_bind_node;
    atomic_t skc_refcnt;
    + unsigned int skc_hash;
    struct proto *skc_prot;
    };

    Store in this 32 bits field the full hash, not masked by (ehash_size -
    1). Using this full hash as the first comparison done in INET_MATCH
    permits us to immediately skip the element without touching a second
    cache line in case of a miss.

    Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
    sk_hash and tw_hash) already contains the slot number if we mask with
    (ehash_size - 1)

    File include/net/inet_hashtables.h

    64 bits platforms :
    #define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
    (((__sk)->sk_hash == (__hash)) && \
    ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie)) && \
    ((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports)) && \
    (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))

    32bits platforms:
    #define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
    (((__sk)->sk_hash == (__hash)) && \
    (inet_sk(__sk)->daddr == (__saddr)) && \
    (inet_sk(__sk)->rcv_saddr == (__daddr)) && \
    (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))

    - Adds a prefetch(head->chain.first) in
    __inet_lookup_established()/__tcp_v4_check_established() and
    __inet6_lookup_established()/__tcp_v6_check_established() and
    __dccp_v4_check_established() to bring into cache the first element of the
    list, before the {read|write}_lock(&head->lock);

    Signed-off-by: Eric Dumazet
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Eric Dumazet
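
The effect of comparing the full hash first can be sketched in user space. Assumptions: `struct sk_sketch` is a hypothetical layout in which `next` and `hash` sit at the start and the address/port fields sit past a 64-byte pad (modelling a second cache line), and `cold_reads` is an illustration-only counter of how often those far fields are touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: hot fields first, lookup keys one cache line away. */
struct sk_sketch {
    struct sk_sketch *next;
    uint32_t hash;              /* full 32-bit hash, not masked */
    char pad[64];               /* models the distance to the far fields */
    uint32_t daddr, rcv_saddr;
    uint16_t dport, num;
};

static unsigned int cold_reads; /* times the far fields were touched */

static struct sk_sketch *lookup_sketch(struct sk_sketch **ehash,
                                       uint32_t ehash_size, uint32_t hash,
                                       uint32_t saddr, uint16_t sport,
                                       uint32_t daddr, uint16_t hnum)
{
    struct sk_sketch *sk;

    /* The slot number is the full hash masked with (ehash_size - 1). */
    for (sk = ehash[hash & (ehash_size - 1)]; sk; sk = sk->next) {
        /* First test uses only data next to the chain pointer. */
        if (sk->hash != hash)
            continue;
        /* Only on a full-hash match do we touch the second cache line. */
        cold_reads++;
        if (sk->daddr == saddr && sk->rcv_saddr == daddr &&
            sk->dport == sport && sk->num == hnum)
            return sk;
    }
    return NULL;
}
```

With a 16-slot table, sockets hashed to 0x15 and 0x25 share slot 5, yet a lookup for 0x15 skips the 0x25 entry without reading its address fields.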
     

30 Aug, 2005

2 commits