27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

18 Jul, 2011

1 commit


17 Jul, 2011

4 commits


14 Jul, 2011

1 commit

  • Now that there is a one-to-one correspondence between neighbour
    and hh_cache entries, we no longer need:

    1) dynamic allocation
    2) attachment to dst->hh
    3) refcounting

    Initialization of the hh_cache entry is indicated by hh_len
    being non-zero, and such initialization is always done with
    the neighbour's lock held as a writer.

    Signed-off-by: David S. Miller

    David S. Miller
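
The embedding described above can be sketched as follows. This is a minimal illustration, not the kernel's definition: the `_sketch` type names, `HH_DATA_MAX`, and both helper functions are hypothetical, and the caller is assumed to hold the neighbour's lock as a writer when initializing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define HH_DATA_MAX 16 /* hypothetical buffer size for the cached header */

/* Hypothetical stand-in for the cached hardware header. */
struct hh_cache_sketch {
	unsigned short hh_len;            /* 0 means "not initialized yet" */
	unsigned char hh_data[HH_DATA_MAX];
};

/* The cache entry is embedded: no allocation, no pointer, no refcount. */
struct neighbour_sketch {
	struct hh_cache_sketch hh;
};

static bool hh_is_initialized(const struct neighbour_sketch *n)
{
	return n->hh.hh_len != 0;
}

/* Assumption: the caller holds the neighbour's lock as a writer. */
static void hh_init_once(struct neighbour_sketch *n,
			 const unsigned char *hdr, unsigned short len)
{
	if (n->hh.hh_len)
		return; /* already initialized, nothing to do */
	assert(len > 0 && len <= HH_DATA_MAX);
	memcpy(n->hh.hh_data, hdr, len);
	n->hh.hh_len = len; /* a non-zero length marks the entry as valid */
}
```

Because validity is encoded in `hh_len`, a reader only needs to test one field to decide whether the cached header can be used.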
     

11 Jul, 2011

1 commit

  • And mask the hash function result by simply shifting
    down the "->hash_shift" most significant bits.

    Currently which bits we use is arbitrary since jhash
    produces entropy evenly across the whole hash function
    result.

    But soon we'll be using universal hashing functions,
    and in those cases more entropy exists in the higher
    bits than the lower bits, because they use multiplies.

    Signed-off-by: David S. Miller

    David S. Miller
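
Selecting a bucket from the most significant bits can be sketched like this. The function name is hypothetical, a 32-bit hash value is assumed, and `hash_shift` is assumed to lie between 1 and 32 (a shift by 32 would be undefined in C).

```c
#include <stdint.h>

/*
 * Keep the hash_shift most significant bits of a 32-bit hash value.
 * Assumption: 1 <= hash_shift <= 32.
 */
static inline uint32_t hash_to_bucket(uint32_t hash_val, unsigned int hash_shift)
{
	return hash_val >> (32 - hash_shift);
}
```

With `hash_shift` set to 4 the table has 16 buckets, and only the top four bits of the hash decide the bucket. That is the property wanted for multiplicative hashes, whose high bits mix in more of the input than the low bits do.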
     

20 Nov, 2010

1 commit


19 Nov, 2010

1 commit

  • jiffies is defined as "volatile".

    extern unsigned long volatile __jiffy_data jiffies;

    ACCESS_ONCE() uses "volatile".
    As a result, some compilers warn about a duplicate `volatile' for ACCESS_ONCE(jiffies).

    Signed-off-by: Tetsuo Handa
    Signed-off-by: David S. Miller

    Tetsuo Handa
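
A user-space sketch of the situation, assuming GNU C (`__typeof__`). The macro mirrors the usual form of ACCESS_ONCE(); `fake_jiffies` and `plain_counter` are hypothetical stand-ins. Reading the already-volatile variable directly is one way to avoid the duplicate qualifier; the message above does not spell out the exact change made.

```c
/* Usual shape of ACCESS_ONCE(); __typeof__ is a GNU C extension. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for jiffies: already declared volatile. */
static unsigned long volatile fake_jiffies;

/* An ordinary variable, where ACCESS_ONCE() adds the qualifier. */
static unsigned long plain_counter;

static unsigned long read_plain(void)
{
	/* Expands to a read through a volatile-qualified pointer. */
	return ACCESS_ONCE(plain_counter);
}

static unsigned long read_jiffies(void)
{
	/*
	 * The object is volatile already, so a plain read is performed
	 * on every call. Wrapping it in ACCESS_ONCE() would qualify the
	 * type with volatile twice, which is what triggers the warning.
	 */
	return fake_jiffies;
}
```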
     

12 Nov, 2010

1 commit

  • It is important to move nud_state outside of the often modified cache
    line (because of refcnt), to reduce false sharing in neigh_event_send()

    This is a followup of commit 0ed8ddf4045f (neigh: Protect neigh->ha[]
    with a seqlock)

    This gives a 7% speedup on routing test with IP route cache disabled.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

12 Oct, 2010

2 commits

  • On Tuesday 12 October 2010 at 00:02 +0200, Eric Dumazet wrote:
    > Here is the followup patch.
    >
    > Thanks !
    >

    Oops, this was an old version; the up-to-date one also takes care of
    the "used" field.

    I guess it's time for some sleep, sorry again.

    [PATCH net-next V2] neigh: reorder struct neighbour fields

    (refcnt) and (ha_lock, ha, used, dev, output, ops, primary_key) should
    be placed on separate cache lines.

    refcnt can be written often, while the other fields are mostly read.

    This gave me a good result on a stress test:

    Before:

    real 0m45.570s
    user 0m15.525s
    sys 9m56.669s

    After:

    real 0m41.841s
    user 0m15.261s
    sys 8m45.949s

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
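
The layout idea can be sketched with C11 alignment specifiers. The structure below is a hypothetical, much reduced stand-in for struct neighbour, and the 64-byte cache line size is an assumption (the kernel has its own cache-line macros for this).

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache line size in bytes */

/* Hypothetical, reduced layout: mostly-read fields apart from refcnt. */
struct neigh_layout_sketch {
	/* Mostly-read group starts on its own cache line. */
	alignas(CACHE_LINE) unsigned long used;
	unsigned char ha[32];
	void *dev;

	/* Frequently written field is pushed onto a separate line. */
	alignas(CACHE_LINE) int refcnt;
};

/* True when two byte offsets fall within the same cache line. */
static int same_cache_line(size_t a, size_t b)
{
	return a / CACHE_LINE == b / CACHE_LINE;
}
```

Writes to `refcnt` then invalidate only its own line on other CPUs, so readers of `ha` or `dev` do not suffer false sharing.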
     
  • Add a seqlock in struct neighbour to protect neigh->ha[], and avoid
    dirtying neighbour in stress situation (many different flows / dsts)

    Dirtying takes place because of read_lock(&n->lock) and n->used writes.

    Switching to a seqlock, and writing n->used only on jiffies changes
    permits less dirtying.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
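
The seqlock pattern can be sketched in user space with C11 atomics. This is a simplified illustration, not the kernel's seqlock_t: the names are hypothetical, writers are assumed to be serialized by some outer lock, and the plain memcpy() of the payload is technically a data race under the C11 memory model, which a production implementation handles with explicit barriers.

```c
#include <stdatomic.h>
#include <string.h>

#define HA_LEN 6 /* length of an Ethernet hardware address */

/* Hypothetical stand-in for the seqlock-protected neigh->ha[]. */
struct ha_seq_sketch {
	atomic_uint seq;           /* odd while a write is in progress */
	unsigned char ha[HA_LEN];
};

static void ha_seq_init(struct ha_seq_sketch *s)
{
	atomic_init(&s->seq, 0);
	memset(s->ha, 0, HA_LEN);
}

/* Assumption: callers serialize writers themselves. */
static void ha_write(struct ha_seq_sketch *s, const unsigned char *src)
{
	atomic_fetch_add(&s->seq, 1); /* sequence becomes odd */
	memcpy(s->ha, src, HA_LEN);
	atomic_fetch_add(&s->seq, 1); /* sequence becomes even again */
}

/* Readers never write shared state; they retry on a concurrent write. */
static void ha_read(struct ha_seq_sketch *s, unsigned char *dst)
{
	unsigned int begin, end;

	do {
		begin = atomic_load(&s->seq);
		memcpy(dst, s->ha, HA_LEN);
		end = atomic_load(&s->seq);
	} while ((begin & 1u) || begin != end);
}
```

The point of the design is on the read side: unlike read_lock(), a reader here performs no stores to the shared structure, so the cache line stays clean when many flows read the same neighbour.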
     

07 Oct, 2010

1 commit

  • This is the second step for neighbour RCU conversion.

    (first was commit d6bf7817 : RCU conversion of neigh hash table)

    neigh_lookup() becomes lockless, but still takes a reference on the
    found neighbour (no more read_lock()/read_unlock() on tbl->lock).

    struct neighbour gets an additional rcu_head field and is freed after an
    RCU grace period.

    Future work would need to eventually not take a reference on neighbour
    for temporary dst (DST_NOCACHE), but this would need dst->_neighbour to
    use a noref bit like we did for skb->_dst.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
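
Taking a reference during a lockless lookup is commonly done with an "increment unless zero" operation, so that an object whose last reference has already been dropped is never revived. The sketch below shows that operation with C11 atomics; the function name is hypothetical, and that the lookup uses exactly this primitive is an assumption, not something the message above states.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *refcnt unless it is zero.
 * Returns true when a reference was taken, false when the object is
 * already on its way to being freed and must be treated as not found.
 */
static bool refcnt_inc_not_zero(atomic_int *refcnt)
{
	int old = atomic_load(refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
			return true;
	}
	return false;
}
```

Freeing the structure only after an RCU grace period is what makes it safe for a lookup to touch the counter of an entry that is concurrently being removed.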
     

06 Oct, 2010

1 commit

  • David

    This is the first step for RCU conversion of neigh code.

    Next patches will convert hash_buckets[] and "struct neighbour" to RCU
    protected objects.

    Thanks

    [PATCH net-next] net neigh: RCU conversion of neigh hash table

    Instead of storing hash_buckets, hash_mask and hash_rnd in "struct
    neigh_table", a new structure is defined :

    struct neigh_hash_table {
            struct neighbour **hash_buckets;
            unsigned int hash_mask;
            __u32 hash_rnd;
            struct rcu_head rcu;
    };

    And "struct neigh_table" has an RCU protected pointer to such a
    neigh_hash_table.

    This means the signature of the (*hash)() function changed: we need to
    add a third parameter with the actual hash_rnd value, since this is no
    longer a neigh_table field.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
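
A sketch of how a three-argument hash function fits together with the new structure. Everything suffixed `_sketch`, the reduced structure, and the mixing function itself are hypothetical; the only point illustrated is that the random seed is passed in by the caller, which reads it from the hash table currently in use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reduced, hypothetical version of the structure quoted above. */
struct neigh_hash_table_sketch {
	void **hash_buckets;
	unsigned int hash_mask;
	uint32_t hash_rnd;
};

/*
 * Hypothetical hash callback with the seed as third parameter.
 * Assumption: pkey points to a 4-byte key, such as an IPv4 address.
 */
static uint32_t sketch_hash(const void *pkey, const void *dev, uint32_t hash_rnd)
{
	uint32_t key;

	(void)dev; /* unused in this sketch */
	memcpy(&key, pkey, sizeof(key));
	return (key ^ hash_rnd) * 2654435761u; /* illustrative mixing only */
}

/* The caller takes the seed and the mask from the same table instance. */
static unsigned int sketch_bucket(const struct neigh_hash_table_sketch *nht,
				  const void *pkey, const void *dev)
{
	return sketch_hash(pkey, dev, nht->hash_rnd) & nht->hash_mask;
}
```

Keeping the seed and the mask in one object means a reader that has fetched the table pointer sees a consistent pair, even while a resize is publishing a replacement table.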
     

01 Oct, 2010

1 commit


01 Jul, 2010

1 commit


15 Apr, 2010

1 commit

  • - fix IP DNAT on vlan- or pppoe-encapsulated traffic: The functions
    neigh_hh_output() or dst->neighbour->output() overwrite the complete
    Ethernet header, although we only need the destination MAC address.
    For encapsulated packets, they ended up overwriting the encapsulating
    header. The new code copies the Ethernet source MAC address and
    protocol number before calling dst->neighbour->output(). The Ethernet
    source MAC and protocol number are copied back in place in
    br_nf_pre_routing_finish_bridge_slow(). This also makes the IP DNAT
    more transparent because in the old scheme the source MAC of the
    bridge was copied into the source address in the Ethernet header. We
    also let skb->protocol equal ETH_P_IP resp. ETH_P_IPV6 during the
    execution of the PF_INET resp. PF_INET6 hooks.

    - Speed up IP DNAT by calling neigh_hh_bridge() instead of
    neigh_hh_output(): if dst->hh is available, we already know the MAC
    address so we can just copy it.

    Signed-off-by: Bart De Schuymer
    Signed-off-by: Patrick McHardy

    Bart De Schuymer
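
The save-and-restore step from the first item can be sketched as follows. The header structure and both functions are hypothetical stand-ins: `overwrite_full_header()` plays the role of an output path that rewrites the whole Ethernet header, and the wrapper keeps the original source MAC and protocol number.

```c
#include <string.h>

#define ETH_ALEN 6 /* bytes in an Ethernet address */

/* Hypothetical flat view of an Ethernet header. */
struct eth_hdr_sketch {
	unsigned char dest[ETH_ALEN];
	unsigned char source[ETH_ALEN];
	unsigned char proto[2];
};

/* Stand-in for an output routine that rewrites the complete header. */
static void overwrite_full_header(struct eth_hdr_sketch *eh,
				  const struct eth_hdr_sketch *cached)
{
	*eh = *cached;
}

/* Only the destination MAC ends up changed. */
static void output_preserving_src_proto(struct eth_hdr_sketch *eh,
					const struct eth_hdr_sketch *cached)
{
	unsigned char saved[ETH_ALEN + 2];

	/* Save the source MAC and the protocol number. */
	memcpy(saved, eh->source, ETH_ALEN);
	memcpy(saved + ETH_ALEN, eh->proto, 2);

	overwrite_full_header(eh, cached);

	/* Copy them back in place after the header was rewritten. */
	memcpy(eh->source, saved, ETH_ALEN);
	memcpy(eh->proto, saved + ETH_ALEN, 2);
}
```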
     

17 Feb, 2010

2 commits

  • Add __percpu sparse annotations to net.

    These annotations are to make sparse consider percpu variables to be
    in a different address space and warn if accessed without going
    through percpu accessors. This patch doesn't affect normal builds.

    The macro and type tricks around snmp stats make things a bit
    interesting. DEFINE/DECLARE_SNMP_STAT() macros mark the target field
    as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly. All
    snmp_mib_*() users which used to cast the argument to (void **) are
    updated to cast it to (void __percpu **).

    Signed-off-by: Tejun Heo
    Acked-by: David S. Miller
    Cc: Patrick McHardy
    Cc: Arnaldo Carvalho de Melo
    Cc: Vlad Yasevich
    Cc: netdev@vger.kernel.org
    Signed-off-by: David S. Miller

    Tejun Heo
     
  • Stop computing the number of neighbour table settings we have by
    counting the number of binary sysctls. This behaviour was silly
    and meant that we could not add another neighbour table setting
    without also adding another binary sysctl.

    Don't pass the binary sysctl path for neighbour table entries
    into neigh_sysctl_register. These parameters are no longer
    used and so are just dead code.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

15 Dec, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
    m68k: rename global variable vmalloc_end to m68k_vmalloc_end
    percpu: add missing per_cpu_ptr_to_phys() definition for UP
    percpu: Fix kdump failure if booted with percpu_alloc=page
    percpu: make misc percpu symbols unique
    percpu: make percpu symbols in ia64 unique
    percpu: make percpu symbols in powerpc unique
    percpu: make percpu symbols in x86 unique
    percpu: make percpu symbols in xen unique
    percpu: make percpu symbols in cpufreq unique
    percpu: make percpu symbols in oprofile unique
    percpu: make percpu symbols in tracer unique
    percpu: make percpu symbols under kernel/ and mm/ unique
    percpu: remove some sparse warnings
    percpu: make alloc_percpu() handle array types
    vmalloc: fix use of non-existent percpu variable in put_cpu_var()
    this_cpu: Use this_cpu_xx in trace_functions_graph.c
    this_cpu: Use this_cpu_xx for ftrace
    this_cpu: Use this_cpu_xx in nmi handling
    this_cpu: Use this_cpu operations in RCU
    this_cpu: Use this_cpu ops for VM statistics
    ...

    Fix up trivial (famous last words) global per-cpu naming conflicts in
    arch/x86/kvm/svm.c
    mm/slab.c

    Linus Torvalds
     

08 Dec, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
    mac80211: fix reorder buffer release
    iwmc3200wifi: Enable wimax core through module parameter
    iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
    iwmc3200wifi: Coex table command does not expect a response
    iwmc3200wifi: Update wiwi priority table
    iwlwifi: driver version track kernel version
    iwlwifi: indicate uCode type when fail dump error/event log
    iwl3945: remove duplicated event logging code
    b43: fix two warnings
    ipw2100: fix rebooting hang with driver loaded
    cfg80211: indent regulatory messages with spaces
    iwmc3200wifi: fix NULL pointer dereference in pmkid update
    mac80211: Fix TX status reporting for injected data frames
    ath9k: enable 2GHz band only if the device supports it
    airo: Fix integer overflow warning
    rt2x00: Fix padding bug on L2PAD devices.
    WE: Fix set events not propagated
    b43legacy: avoid PPC fault during resume
    b43: avoid PPC fault during resume
    tcp: fix a timewait refcnt race
    ...

    Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
    CTL_UNNUMBERED removed) in
    kernel/sysctl_check.c
    net/ipv4/sysctl_net_ipv4.c
    net/ipv6/addrconf.c
    net/sctp/sysctl.c

    Linus Torvalds
     

12 Nov, 2009

1 commit

  • Now that sys_sysctl is a compatibility wrapper around /proc/sys,
    all sysctl strategy routines, and all ctl_name and strategy
    entries in the sysctl tables, are unused and can be
    removed.

    In addition, neigh_sysctl_register has been modified to no longer
    take a strategy argument and its callers have been modified not
    to pass one.

    Cc: "David Miller"
    Cc: Hideaki YOSHIFUJI
    Cc: netdev@vger.kernel.org
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

04 Nov, 2009

1 commit

  • This cleanup patch puts struct/union/enum opening braces,
    in first line to ease grep games.

    struct something
    {

    becomes :

    struct something {

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

03 Oct, 2009

1 commit


02 Sep, 2009

1 commit


03 Aug, 2009

1 commit

  • The current neigh_periodic_timer() function is fired by a timer IRQ
    and scans one hash bucket each round (very little work in fact).

    As we are supposed to scan whole hash table in 15 seconds, this means
    neigh_periodic_timer() can be fired very often. (depending on the number
    of concurrent hash entries we stored in this table)

    Converting this to a workqueue permits scanning the whole table,
    minimizing icache pollution, and firing this work every 15 seconds,
    independently of the hash table size.

    This 15 seconds delay is not a hard number, as work is a deferrable one.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

12 Nov, 2008

2 commits


17 Jul, 2008

1 commit

  • In __neigh_event_send(), if we have a neighbour entry which is in
    NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour
    on the neighbour's arp_queue, which is capped by default to a length
    of 3 skbs. If that queue exceeds its set length, it will drop an skb on
    the queue to enqueue the newly arrived skb. This results in a drop
    for which we have no statistics incremented. This patch adds an
    unresolved_discards stat to /proc/net/stat/ndisc_cache to track these
    lost frames.

    Signed-off-by: Neil Horman
    Signed-off-by: David S. Miller

    Neil Horman
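
The accounting can be sketched with a small fixed-size queue. The structure, the use of plain integers in place of skbs, and the choice to drop the oldest entry are assumptions made for illustration; the message above only says that an skb on the queue is dropped.

```c
#include <stddef.h>

#define ARP_QUEUE_LEN 3 /* default cap mentioned in the message above */

/* Hypothetical queue; integers stand in for queued skbs. */
struct arp_queue_sketch {
	int skbs[ARP_QUEUE_LEN];
	size_t len;
	unsigned long unresolved_discards; /* drops caused by a full queue */
};

static void arp_queue_enqueue(struct arp_queue_sketch *q, int skb)
{
	if (q->len == ARP_QUEUE_LEN) {
		/* Assumption: the oldest entry makes room for the new one. */
		for (size_t i = 1; i < q->len; i++)
			q->skbs[i - 1] = q->skbs[i];
		q->len--;
		q->unresolved_discards++; /* make the drop visible */
	}
	q->skbs[q->len++] = skb;
}
```

Without the counter the fourth frame sent to an unresolved neighbour silently displaces an earlier one; with it, the loss shows up in the statistics.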
     

28 Mar, 2008

1 commit


26 Mar, 2008

1 commit


25 Mar, 2008

1 commit

  • Proxy neighbors do not have any reference counting, so any caller
    of pneigh_lookup (unless it's a netlink triggered add/del routine)
    should _not_ perform any actions on the found proxy entry.

    There's one exception from this rule - the ipv6's ndisc_recv_ns()
    uses found entry to check the flags for NTF_ROUTER.

    This creates a race between the ndisc and pneigh_delete - after
    the pneigh is returned to the caller, the nd_tbl.lock is dropped
    and the deleting procedure may proceed.

    One fix would be to add reference counting, but this problem exists
    for ndisc only. Besides, such a patch would be too big for -rc4.

    So I propose to introduce a __pneigh_lookup() which is supposed
    to be called with the lock held, and to use it in the ndisc code to
    check the flags on a live pneigh entry.

    Changes from v2:
    As David noticed, exported __pneigh_lookup() to the ipv6 module.
    checkpatch generates a warning on it, since the EXPORT_SYMBOL
    does not follow the symbol itself, but in this file all the
    exports come at the end, so I decided not to break this harmony.

    Changes from v1:
    Fixed comments from YOSHIFUJI - indentation of prototype in header
    and the pndisc_check_router() name - and a compilation fix, pointed
    by Daniel - the is_routed was (falsely) considered as uninitialized
    by gcc.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

04 Mar, 2008

1 commit


29 Jan, 2008

5 commits

  • Make them static.

    [ Moved the inline before, instead of after, call sites. -DaveM ]

    Signed-off-by: Denis V. Lunev
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • When I studied the neighbor code I puzzled over what the NUD can mean
    for quite a long time.

    Finally I asked Alexey and he said that this was something like
    "neighbor unreachability detection".

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • seq_open_net requires the first field of the seq->private data to be
    a struct seq_net_private. In reality this is a single pointer to a
    struct net for now. The patch makes the code consistent.

    Signed-off-by: Denis V. Lunev
    Signed-off-by: David S. Miller

    Denis V. Lunev
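
The "first field" requirement can be sketched as below. All `_sketch` types and the accessor are hypothetical; the point is only that generic code may treat the private data as the embedded structure because it sits at offset zero.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct net and struct seq_net_private. */
struct net_sketch {
	int id;
};

struct seq_net_private_sketch {
	struct net_sketch *net; /* a single pointer, as noted above */
};

/* Per-file iterator state: the generic part must come first. */
struct neigh_seq_state_sketch {
	struct seq_net_private_sketch p;
	unsigned int bucket;
};

/* Compile-time check of the layout requirement. */
_Static_assert(offsetof(struct neigh_seq_state_sketch, p) == 0,
	       "seq_net_private must be the first field");

/* Generic code sees only a void pointer to the private data. */
static struct net_sketch *seq_private_net(void *private_data)
{
	return ((struct seq_net_private_sketch *)private_data)->net;
}
```

Embedding the structure instead of repeating a bare pointer keeps the layout correct if the generic structure later gains more fields.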
     
  • Signed-off-by: Rami Rosen
    Signed-off-by: David S. Miller

    Rami Rosen
     
  • I'm actually surprised at how much was involved. At first glance it
    appears that the neighbour table data structures are already split by
    network device so all that should be needed is to modify the user
    interface commands to filter the set of neighbours by the network
    namespace of their devices.

    However a couple things turned up while I was reading through the
    code. The proxy neighbour table allows entries with no network
    device, and the neighbour parms are per network device (except for the
    defaults) so they now need a per network namespace default.

    So I updated the two structures (which surprised me) with their very
    own network namespace parameter. Updated the relevant lookup and
    destroy routines with a network namespace parameter and modified the
    code that interacts with users to filter out neighbour table entries
    for devices of other namespaces.

    I'm a little concerned that we can modify and display the global table
    configuration from all network namespaces. But this appears good
    enough for now.

    I keep thinking that modifying the neighbour table to have per network
    namespace instances of each table type would be cleaner. The
    hash table is already dynamically sized, so it is not a
    limiter. The default parameter would be straightforward to take care
    of. However, when I look at how the network table is built and
    used I still find some assumptions that there is only a single
    neighbour table for each type of table in the kernel. The netlink
    operations, neigh_seq_start, the non-core network users that call
    neigh_lookup. So while it might be doable it would require more
    refactoring than my current approach of just doing a little extra
    filtering in the code.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

26 Apr, 2007

1 commit