27 Nov, 2011

1 commit


18 Jul, 2011

1 commit


27 Jan, 2011

1 commit

  • Routing metrics are now copy-on-write.

    Initially a route entry points its metrics at a read-only location.
    If a routing table entry exists, it will point there. Otherwise it
    will point at the all-zero metric placeholder called
    'dst_default_metrics'.

    The writeability state of the metrics is stored in the low bits of the
    metrics pointer; we have two bits left to spare if we want to store
    more states.

    For the initial implementation, COW is implemented simply via kmalloc.
    However future enhancements will change this to place the writable
    metrics somewhere else, in order to increase sharing. Very likely
    this "somewhere else" will be the inetpeer cache.

    Note also that this means that metrics updates may transiently fail
    if we cannot COW the metrics successfully.
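
    The scheme described so far can be sketched in user space as below.
    This is only an illustration, not the kernel code: struct route,
    the route_*() helpers, METRICS_MAX and METRICS_READ_ONLY are all
    hypothetical names, malloc() stands in for kmalloc(), and locking
    is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define METRICS_MAX 16                       /* stand-in for RTAX_MAX (illustrative value) */
#define METRICS_READ_ONLY ((uintptr_t)0x1)   /* hypothetical state flag kept in bit 0 */

/* Shared all-zero defaults, playing the role of 'dst_default_metrics'. */
static const uint32_t default_metrics[METRICS_MAX] = {0};

struct route {
    uintptr_t metrics;   /* address of a uint32_t[METRICS_MAX], OR'ed with state bits */
};

/* Point the route at shared, read-only metrics (or the defaults if none). */
static void route_init(struct route *rt, const uint32_t *shared)
{
    if (!shared)
        shared = default_metrics;
    /* uint32_t arrays are at least 2-byte aligned, so bit 0 is free. */
    assert(((uintptr_t)shared & METRICS_READ_ONLY) == 0);
    rt->metrics = (uintptr_t)shared | METRICS_READ_ONLY;
}

static int route_metrics_read_only(const struct route *rt)
{
    return (rt->metrics & METRICS_READ_ONLY) != 0;
}

/* Strip the state bit to recover the real array address. */
static const uint32_t *route_metrics_ptr(const struct route *rt)
{
    return (const uint32_t *)(rt->metrics & ~METRICS_READ_ONLY);
}

static uint32_t route_metric_get(const struct route *rt, int idx)
{
    return route_metrics_ptr(rt)[idx];
}

/* Returns 0 on success, -1 if the private copy could not be allocated. */
static int route_metric_set(struct route *rt, int idx, uint32_t val)
{
    if (route_metrics_read_only(rt)) {
        /* First write: copy the shared array into private storage. */
        uint32_t *copy = malloc(sizeof(uint32_t) * METRICS_MAX);

        if (!copy)
            return -1;   /* the transient failure mentioned above */
        memcpy(copy, route_metrics_ptr(rt), sizeof(uint32_t) * METRICS_MAX);
        rt->metrics = (uintptr_t)copy;   /* flag bit clear: writable */
    }
    ((uint32_t *)rt->metrics)[idx] = val;
    return 0;
}

/* Free a private copy, if any, and fall back to the shared defaults. */
static void route_release(struct route *rt)
{
    if (!route_metrics_read_only(rt))
        free((void *)rt->metrics);
    rt->metrics = (uintptr_t)default_metrics | METRICS_READ_ONLY;
}
```

    Routes that only read their metrics never allocate anything; the
    first route_metric_set() call on a route pays for one copy, and the
    shared array it started from is left untouched.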

    But even by itself, this patch should decrease memory usage and
    increase cache locality especially for routing workloads. In those
    cases the read-only metric copies stay in place and never get written
    to.

    TCP workloads where metrics get updated, and those rare cases where
    PMTU triggers occur, will take a very slight performance hit. But
    that hit will be alleviated when the long-term writable metrics
    move to a more sharable location.

    Since the metrics storage went from a u32 array of RTAX_MAX entries to
    what is essentially a pointer, some retooling of the dst_entry layout
    was necessary.

    Most importantly, we need to preserve the alignment of the reference
    count so that it doesn't share cache lines with the read-mostly state,
    as per Eric Dumazet's alignment assertion checks.

    The only non-trivial bit here is the move of the 'flags' member into
    the writeable cacheline. This is OK since we always access the flags
    at around the same moment we modify the reference count.
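
    The layout constraint can be illustrated with a small C11 sketch.
    The struct and its members are hypothetical stand-ins rather than
    the real dst_entry, and the 64-byte cache line size is an
    assumption; the point is only that a compile-time check can keep
    the reference count off the read-mostly cache line.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdalign.h>   /* alignas */
#include <stddef.h>     /* offsetof, size_t */
#include <stdint.h>

#define CACHELINE 64    /* assumed cache line size */

struct demo_entry {
    /* Read-mostly state: set up once, then only read. */
    void *dev;
    void *ops;
    uintptr_t metrics;

    /* Frequently written state starts on its own cache line. */
    alignas(CACHELINE) int refcnt;
    int flags;              /* touched together with refcnt */
    unsigned long lastuse;
};

/* Compile-time check in the spirit of the alignment assertions above. */
static_assert(offsetof(struct demo_entry, refcnt) % CACHELINE == 0,
              "refcnt must start a cache line");

/* Helper: do two member offsets fall into the same cache line? */
static int same_cacheline(size_t a, size_t b)
{
    return a / CACHELINE == b / CACHELINE;
}
```

    If someone later adds a member that pushes refcnt off its boundary
    without the alignas, the static_assert turns that into a build
    error instead of a silent performance regression.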

    Signed-off-by: David S. Miller

    David S. Miller
     

15 Dec, 2010

1 commit


14 Dec, 2010

1 commit

  • Make all RTAX_ADVMSS metric accesses go through a new helper function,
    dst_metric_advmss().

    Leave the actual default metric as "zero" in the real metric slot,
    and compute the actual default value dynamically via a new dst_ops
    AF specific callback.
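
    A minimal sketch of this pattern follows. The demo_* types and
    functions are hypothetical, and the IPv4-style default (MTU minus
    40 bytes of headers, with a floor of 256) is an assumption chosen
    for illustration, not a statement of what the patch computes.

```c
#include <stdint.h>

struct demo_dst;

/* Per-address-family operations; only the callback of interest is shown. */
struct demo_dst_ops {
    unsigned int (*default_advmss)(const struct demo_dst *dst);
};

struct demo_dst {
    const struct demo_dst_ops *ops;
    uint32_t advmss;     /* raw metric slot; 0 means "use the default" */
    unsigned int mtu;
};

/* Accessor: callers never read the raw slot directly. */
static unsigned int demo_metric_advmss(const struct demo_dst *dst)
{
    unsigned int advmss = dst->advmss;

    if (!advmss)
        advmss = dst->ops->default_advmss(dst);   /* computed on demand */
    return advmss;
}

/* Hypothetical IPv4-like default: MTU minus IP and TCP headers. */
static unsigned int demo_ipv4_default_advmss(const struct demo_dst *dst)
{
    unsigned int advmss = dst->mtu > 40 ? dst->mtu - 40 : 0;

    if (advmss < 256)    /* assumed minimum, for illustration only */
        advmss = 256;
    return advmss;
}
```

    Because the slot stays zero until someone sets it explicitly, the
    default never has to be written into per-route storage.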

    For stacked IPSEC routes, we use the advmss of the path, which
    preserves existing behavior.

    Unlike IPv4/IPv6, DECnet ties the advmss to the MTU and thus updates
    advmss on PMTU updates. This inconsistency in advmss handling
    results in more raw metric accesses than I would have liked.

    Signed-off-by: David S. Miller

    David S. Miller
     

08 Nov, 2010

1 commit

  • Presently the b43legacy build fails on an sh randconfig:

    In file included from include/net/dst.h:12,
    from drivers/net/wireless/b43legacy/xmit.c:32:
    include/net/dst_ops.h:28: error: expected ':', ',', ';', '}' or '__attribute__' before '____cacheline_aligned_in_smp'
    include/net/dst_ops.h: In function 'dst_entries_get_fast':
    include/net/dst_ops.h:33: error: 'struct dst_ops' has no member named 'pcpuc_entries'
    include/net/dst_ops.h: In function 'dst_entries_get_slow':
    include/net/dst_ops.h:41: error: 'struct dst_ops' has no member named 'pcpuc_entries'
    include/net/dst_ops.h: In function 'dst_entries_add':
    include/net/dst_ops.h:49: error: 'struct dst_ops' has no member named 'pcpuc_entries'
    include/net/dst_ops.h: In function 'dst_entries_init':
    include/net/dst_ops.h:55: error: 'struct dst_ops' has no member named 'pcpuc_entries'
    include/net/dst_ops.h: In function 'dst_entries_destroy':
    include/net/dst_ops.h:60: error: 'struct dst_ops' has no member named 'pcpuc_entries'
    make[5]: *** [drivers/net/wireless/b43legacy/xmit.o] Error 1
    make[5]: *** Waiting for unfinished jobs....

    Signed-off-by: Paul Mundt
    Signed-off-by: David S. Miller

    Paul Mundt
     

12 Oct, 2010

1 commit

  • struct dst_ops tracks the number of allocated dst entries in an
    atomic_t field, which is subject to high cache line contention under
    stress workloads.

    Switch to a percpu_counter, to reduce the number of times we need to
    dirty a central location. Place it on a separate cache line to avoid
    dirtying read-only fields.
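
    The idea behind such a counter can be modelled in a few lines. This
    is a single-threaded sketch with hypothetical names (demo_counter,
    NSLOTS, BATCH); a real per-CPU counter also needs per-CPU storage
    and locking, both omitted here.

```c
#define NSLOTS 4    /* stand-in for the number of CPUs */
#define BATCH  32   /* a local delta is folded into the total at this size */

struct demo_counter {
    long count;           /* central, approximate value */
    long local[NSLOTS];   /* per-slot deltas not yet folded in */
};

/* Add v on behalf of one slot; touch the central value only rarely. */
static void demo_counter_add(struct demo_counter *c, int slot, long v)
{
    long n = c->local[slot] + v;

    if (n >= BATCH || n <= -BATCH) {
        c->count += n;    /* the only write to the shared location */
        n = 0;
    }
    c->local[slot] = n;
}

/* Cheap read: may lag behind by up to NSLOTS * (BATCH - 1). */
static long demo_counter_get_fast(const struct demo_counter *c)
{
    return c->count;
}

/* Exact read: walks every slot, so it costs more. */
static long demo_counter_get_slow(const struct demo_counter *c)
{
    long sum = c->count;
    int i;

    for (i = 0; i < NSLOTS; i++)
        sum += c->local[i];
    return sum;
}
```

    Most updates stay within one slot, so the central value is dirtied
    only once per batch; callers choose between the cheap approximate
    read and the exact but slower sum.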

    Stress test :

    (Sending 160,000,000 UDP frames,
    IP route cache disabled, dual E5540 @2.53GHz,
    32bit kernel, FIB_TRIE, SLUB/NUMA)

    Before:

    real 0m51.179s
    user 0m15.329s
    sys 10m15.942s

    After:

    real 0m45.570s
    user 0m15.525s
    sys 9m56.669s

    With a small reordering of struct neighbour fields (the subject of a
    following patch, which separates refcnt from other read-mostly
    fields):

    real 0m41.841s
    user 0m15.261s
    sys 8m45.949s

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

02 Sep, 2009

1 commit

  • struct net::ipv6.ip6_dst_ops is allocated separately and dynamically,
    but there is no fundamental reason for it. Embed it directly into
    struct netns_ipv6.

    For that:
    * move struct dst_ops into a separate header to fix circular
      dependencies (I honestly tried not to, but it is pretty much
      impossible to do any other way)
    * drop the dynamic allocation and allocate it together with the netns

    Also remove struct dst_ops::dst_net; it can be deduced with
    container_of() given a dst_ops pointer.
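
    The container_of() trick can be shown with simplified stand-in
    structures. The demo_* names and the locally defined macro are
    hypothetical; only the nesting (ops embedded in the per-namespace
    IPv6 state, which is embedded in the namespace) mirrors the
    description above.

```c
#include <stddef.h>

/* Simplified local version of the container_of() idea. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_dst_ops {
    int family;
};

struct demo_netns_ipv6 {
    int placeholder;                   /* other per-namespace state */
    struct demo_dst_ops ip6_dst_ops;   /* embedded, not a pointer */
};

struct demo_net {
    int id;
    struct demo_netns_ipv6 ipv6;
};

/* Recover the owning namespace from a pointer to its embedded ops. */
static struct demo_net *demo_ops_to_net(struct demo_dst_ops *ops)
{
    return demo_container_of(ops, struct demo_net, ipv6.ip6_dst_ops);
}
```

    Since the ops structure lives at a fixed offset inside its
    namespace, a stored back pointer would carry no extra information.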

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan