13 Jun, 2009

1 commit


17 Nov, 2008

1 commit

  • hlist uses a NULL value to terminate a chain.

    The hlist_nulls variant uses the low-order bit, set to 1, to signal an end-of-list marker.

    This allows many different end markers to be stored, so that some lockless RCU
    algorithms (used in the TCP/UDP stack, for example) can save some memory barriers
    in fast paths.
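
The encoding described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names make_nulls(), is_nulls() and nulls_value() are hypothetical stand-ins rather than the helpers added by this commit, and the sketch assumes that real node addresses are always even, so the low-order bit is free to act as a tag.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal node type for the sketch; real nodes carry more fields. */
struct node {
    struct node *next;  /* either a real pointer or an odd-valued end marker */
};

/* Hypothetical helper: build an end marker carrying 'value'.
 * The value is shifted left by one and the low-order bit is set to 1. */
static struct node *make_nulls(unsigned long value)
{
    return (struct node *)(uintptr_t)((value << 1) | 1UL);
}

/* Hypothetical helper: non-zero if 'p' is an end marker, not a real node. */
static int is_nulls(const struct node *p)
{
    return (int)((uintptr_t)p & 1UL);
}

/* Hypothetical helper: recover the value stored in an end marker. */
static unsigned long nulls_value(const struct node *p)
{
    return (unsigned long)((uintptr_t)p >> 1);
}
```

With these definitions, a chain for hash slot 42 would end with make_nulls(42); a reader that reaches the end of the chain can call nulls_value() on the final pointer and compare it with the slot it started from.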

    Two new files are added:

    include/linux/list_nulls.h
    - mimics the hlist part of include/linux/list.h, adapted for the hlist_nulls variant

    include/linux/rculist_nulls.h
    - mimics the hlist part of include/linux/rculist.h, adapted for the hlist_nulls variant

    Only four helpers are declared for the moment:

    hlist_nulls_del_init_rcu(), hlist_nulls_del_rcu(),
    hlist_nulls_add_head_rcu() and hlist_nulls_for_each_entry_rcu()

    Calls to prefetch() were removed, since the end of a list is no longer a NULL value.
    Prefetching such an end marker could trigger useless (and possibly dangerous)
    memory transactions.

    Example of use (extracted from __udp4_lib_lookup()):

    struct sock *sk, *result;
    struct hlist_nulls_node *node;
    unsigned short hnum = ntohs(dport);
    unsigned int hash = udp_hashfn(net, hnum);
    struct udp_hslot *hslot = &udptable->hash[hash];
    int score, badness;

    rcu_read_lock();
    begin:
        result = NULL;
        badness = -1;
        sk_nulls_for_each_rcu(sk, node, &hslot->head) {
            score = compute_score(sk, net, saddr, hnum, sport,
                                  daddr, dport, dif);
            if (score > badness) {
                result = sk;
                badness = score;
            }
        }
        /*
         * if the nulls value we got at the end of this lookup is
         * not the expected one, we must restart lookup.
         * We probably met an item that was moved to another chain.
         */
        if (get_nulls_value(node) != hash)
            goto begin;

        if (result) {
            if (unlikely(!atomic_inc_not_zero(&result->sk_refcnt)))
                result = NULL;
            else if (unlikely(compute_score(result, net, saddr, hnum, sport,
                                            daddr, dport, dif) < badness)) {
                sock_put(result);
                goto begin;
            }
        }
        rcu_read_unlock();
        return result;
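
The restart check in the example above can be illustrated with a small, single-threaded userspace sketch. It does not use RCU or the kernel helpers: struct item, walk() and demo_detects_moved_item() are hypothetical names invented for this illustration, and the "concurrent" move of an item to another chain is simulated by rewriting its next pointer by hand. The sketch only shows why comparing the end marker with the expected slot reveals that a traversal drifted into a different chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct item {
    int key;
    struct item *next;  /* real pointer, or odd-valued end marker */
};

/* Hypothetical helpers: end marker = (slot << 1) | 1 */
static struct item *make_nulls(unsigned long slot)
{
    return (struct item *)(uintptr_t)((slot << 1) | 1UL);
}

static int is_nulls(const struct item *p)
{
    return (int)((uintptr_t)p & 1UL);
}

static unsigned long nulls_value(const struct item *p)
{
    return (unsigned long)((uintptr_t)p >> 1);
}

/* Walk from 'first' until an end marker is reached, looking for 'key'.
 * The slot number found in the end marker is stored in *end_slot. */
static struct item *walk(struct item *first, int key, unsigned long *end_slot)
{
    struct item *found = NULL;
    struct item *pos;

    for (pos = first; !is_nulls(pos); pos = pos->next)
        if (pos->key == key)
            found = pos;
    *end_slot = nulls_value(pos);
    return found;
}

/* Returns 1 if a walk that started in slot 0 is seen to end in slot 1. */
static int demo_detects_moved_item(void)
{
    struct item a = { .key = 10 }, b = { .key = 20 };
    unsigned long end;

    a.next = make_nulls(0);  /* chain 0: a */
    b.next = make_nulls(1);  /* chain 1: b */

    walk(&a, 10, &end);
    if (end != 0)            /* undisturbed walk must end on slot 0's marker */
        return 0;

    /* Simulate 'a' being moved in front of 'b' in chain 1 while a
     * reader that started in slot 0 is still standing on 'a'. */
    a.next = &b;
    walk(&a, 10, &end);

    /* The reader expected marker 0 but met marker 1: it must restart. */
    return end == 1;
}
```

In this sketch a reader that sees end != 0 would simply start again from the head of slot 0, which mirrors the "goto begin" in the excerpt.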

    Signed-off-by: Eric Dumazet
    Acked-by: Peter Zijlstra
    Signed-off-by: David S. Miller

    Eric Dumazet