05 Jul, 2012

1 commit


04 May, 2012

1 commit


06 Dec, 2011

1 commit


03 Dec, 2011

1 commit


01 Dec, 2011

1 commit

  • We need rcu_read_lock() protection before using dst_get_neighbour(), and
    we must cache its value (pass it to __teql_resolve())

    teql_master_xmit() is called under rcu_read_lock_bh() protection, which
    is not enough.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

30 Nov, 2011

1 commit

  • Create separate queue state flags so that either the stack or drivers
    can turn on XOFF. Added a set of functions used in the stack to determine
    if a queue is really stopped (either by stack or driver)

    Signed-off-by: Tom Herbert
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Tom Herbert
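
The split between stack-owned and driver-owned XOFF state described above can be sketched in plain userspace C. This is only an illustration: the flag names, the struct, and the helper names below are hypothetical and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits: one owned by the driver, one by the stack. */
enum queue_state_bit {
    QUEUE_XOFF_DRIVER = 1 << 0, /* driver asked the queue to stop */
    QUEUE_XOFF_STACK  = 1 << 1, /* stack asked the queue to stop  */
};

/* Hypothetical stand-in for a per-queue state word. */
struct tx_queue {
    unsigned int state;
};

static void queue_stop_by_driver(struct tx_queue *q) { q->state |= QUEUE_XOFF_DRIVER; }
static void queue_wake_by_driver(struct tx_queue *q) { q->state &= ~(unsigned int)QUEUE_XOFF_DRIVER; }
static void queue_stop_by_stack(struct tx_queue *q)  { q->state |= QUEUE_XOFF_STACK; }
static void queue_wake_by_stack(struct tx_queue *q)  { q->state &= ~(unsigned int)QUEUE_XOFF_STACK; }

/* The queue is really stopped if either side currently holds it off. */
static bool queue_xmit_stopped(const struct tx_queue *q)
{
    assert(q != NULL);
    return (q->state & (QUEUE_XOFF_DRIVER | QUEUE_XOFF_STACK)) != 0;
}
```

Keeping one bit per owner means a wake from the driver cannot accidentally clear a stop requested by the stack, and vice versa.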
     

18 Jul, 2011

1 commit


25 Jan, 2011

1 commit


21 Jan, 2011

1 commit

  • In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed
    a problem with pfifo_head drops that incorrectly decreased
    sch->bstats.bytes and sch->bstats.packets

    Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
    previously enqueued packet, and bstats cannot be changed afterwards, so
    bstats/rates are not accurate (overestimated).

    This patch changes the qdisc_bstats updates to be done at dequeue() time
    instead of enqueue() time. bstats counters no longer account for dropped
    frames, and rates are more correct, since enqueue() bursts don't affect
    the dequeue() rate.

    Signed-off-by: Eric Dumazet
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
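
A minimal userspace sketch of why accounting at dequeue time keeps the counters honest for a head-dropping queue. The queue layout, the limit, and all names here are assumptions made for illustration; this is not the kernel's qdisc code.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN_MAX 4 /* illustrative queue limit */

struct stats { unsigned long bytes, packets, drops; };

/* Hypothetical FIFO that stores only packet lengths, oldest first. */
struct fifo {
    unsigned int len[QLEN_MAX];
    size_t count;
    struct stats st;
};

/* Remove the oldest entry and return its length. */
static unsigned int fifo_pop_head(struct fifo *q)
{
    unsigned int pkt_len = q->len[0];

    for (size_t i = 1; i < q->count; i++)
        q->len[i - 1] = q->len[i];
    q->count--;
    return pkt_len;
}

/* Enqueue with head drop: when full, discard the oldest packet.
 * Byte and packet counters are deliberately not touched here. */
static void fifo_enqueue(struct fifo *q, unsigned int pkt_len)
{
    assert(pkt_len > 0);
    if (q->count == QLEN_MAX) {
        (void)fifo_pop_head(q);
        q->st.drops++;
    }
    q->len[q->count++] = pkt_len;
}

/* Dequeue: only packets that actually leave the queue are counted.
 * Returns the packet length, or 0 if the queue is empty. */
static unsigned int fifo_dequeue(struct fifo *q)
{
    if (q->count == 0)
        return 0;

    unsigned int pkt_len = fifo_pop_head(q);

    q->st.bytes += pkt_len;
    q->st.packets++;
    return pkt_len;
}
```

With enqueue-time accounting, a packet that is later dropped from the head would already have been counted; counting in the dequeue path avoids having to undo that.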
     

20 Jan, 2011

1 commit


14 Jan, 2011

1 commit

  • After recent changes (percpu stats on vlan/tunnels...), we no longer need
    the per struct netdev_queue tx_bytes/tx_packets/tx_dropped counters.

    The only remaining users are ixgbe, sch_teql, gianfar & macvlan:

    1) ixgbe can be converted to use existing tx_ring counters.

    2) macvlan incremented txq->tx_dropped, it can use the
    dev->stats.tx_dropped counter.

    3) sch_teql : almost revert ab35cd4b8f42 (Use net_device internal stats)
    Now we have ndo_get_stats64(), use it, even for "unsigned long"
    fields (No need to bring back a struct net_device_stats)

    4) gianfar adds a stats structure per tx queue to hold
    tx_bytes/tx_packets

    This removes a lockdep warning (and possible lockup) in rndis gadget,
    calling dev_get_stats() from hard IRQ context.

    Ref: http://www.spinics.net/lists/netdev/msg149202.html

    Reported-by: Neil Jones
    Signed-off-by: Eric Dumazet
    CC: Jarek Poplawski
    CC: Alexander Duyck
    CC: Jeff Kirsher
    CC: Sandeep Gopalpet
    CC: Michal Nazarewicz
    Signed-off-by: David S. Miller

    Eric Dumazet
     

11 Jan, 2011

1 commit

  • HTB takes into account that an skb may be segmented when updating stats.
    Generalize this to all schedulers.

    They should use qdisc_bstats_update() helper instead of manipulating
    bstats.bytes and bstats.packets

    Add bstats_update() helper too for classes that use
    gnet_stats_basic_packed fields.

    Note: Right now, the TCQ_F_CAN_BYPASS shortcut can be taken only if no
    stab is set up on the qdisc.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
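
The idea of a single stats helper that counts every segment of a segmented buffer can be sketched as follows. The struct layouts and the helper name are hypothetical userspace stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a basic byte/packet counter pair. */
struct basic_stats {
    unsigned long long bytes;
    unsigned int packets;
};

/* Hypothetical packet descriptor: gso_segs is 0 when not segmented. */
struct pkt {
    unsigned int len;
    unsigned int gso_segs;
};

/* One helper for all schedulers: a segmented buffer counts as
 * gso_segs packets, an ordinary buffer counts as one. */
static void stats_update(struct basic_stats *st, const struct pkt *p)
{
    assert(st != NULL && p != NULL);
    st->bytes += p->len;
    st->packets += p->gso_segs ? p->gso_segs : 1;
}
```

Routing every update through one helper is what lets the segment-aware rule be applied uniformly instead of being reimplemented per scheduler.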
     

29 Nov, 2010

1 commit


12 Oct, 2010

1 commit

  • Add a seqlock in struct neighbour to protect neigh->ha[], and avoid
    dirtying neighbour in stress situation (many different flows / dsts)

    Dirtying takes place because of read_lock(&n->lock) and n->used writes.

    Switching to a seqlock, and writing n->used only on jiffies changes
    permits less dirtying.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

10 Aug, 2010

1 commit


17 Jun, 2010

1 commit

  • https://bugzilla.kernel.org/show_bug.cgi?id=16183

    The sch_teql module, which can be used to load balance over a set of
    underlying interfaces, stopped working after 2.6.30 and has been
    broken in all kernels since then for any underlying interface which
    requires the addition of link level headers.

    The problem is that the transmit routine relies on being able to
    access the destination address in the skb in order to do address
    resolution once it has decided which underlying interface it is going
    to transmit through.

    In 2.6.31 the IFF_XMIT_DST_RELEASE flag was introduced, and set by
    default for all interfaces, which causes the destination address to be
    released before the transmit routine for the interface is called.

    The solution is to clear that flag for teql interfaces.

    Signed-off-by: Tom Hughes
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Tom Hughes
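
The effect of clearing the flag can be illustrated with a small userspace model. The flag value, the structs, and the function names are assumptions for the sketch only; the real flag is IFF_XMIT_DST_RELEASE and lives in the kernel's device private flags.

```c
#include <assert.h>
#include <stddef.h>

#define XMIT_DST_RELEASE 0x1u /* illustrative value, not the kernel's */

struct dev_sketch { unsigned int priv_flags; };
struct buf_sketch { const char *dst; }; /* stand-in for the attached destination */

/* By default every device lets the stack release the destination early. */
static void dev_init_default(struct dev_sketch *d)
{
    d->priv_flags = XMIT_DST_RELEASE;
}

/* A teql-like device still needs the destination in its transmit
 * routine, so it clears the flag during setup. */
static void teql_like_setup(struct dev_sketch *d)
{
    dev_init_default(d);
    d->priv_flags &= ~XMIT_DST_RELEASE;
}

/* Stand-in for the stack's pre-transmit step. */
static void stack_prepare_xmit(const struct dev_sketch *d, struct buf_sketch *b)
{
    assert(d != NULL && b != NULL);
    if (d->priv_flags & XMIT_DST_RELEASE)
        b->dst = NULL; /* destination released before the driver sees it */
}
```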
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

30 Nov, 2009

1 commit


01 Sep, 2009

1 commit


06 Jul, 2009

1 commit


13 Jun, 2009

1 commit


03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of :
    dst_release(skb->dst)
    skb->dst = NULL;

    Delete skb->dst field

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
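
A userspace sketch of the accessor pattern, showing how a drop helper folds the release and the NULL assignment into one call. The types, the plain integer refcount, and the function names are hypothetical stand-ins for the kernel structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted destination entry. */
struct dst_sketch { int refcnt; };

/* Hypothetical buffer; the field is meant to be reached only
 * through the accessors below. */
struct buf_sketch { struct dst_sketch *dst_private; };

static struct dst_sketch *buf_dst(const struct buf_sketch *b)
{
    return b->dst_private;
}

static void buf_dst_set(struct buf_sketch *b, struct dst_sketch *d)
{
    b->dst_private = d;
}

/* Drop one reference; a NULL entry is accepted and ignored. */
static void dst_put(struct dst_sketch *d)
{
    if (d) {
        assert(d->refcnt > 0);
        d->refcnt--;
    }
}

/* Release the reference and clear the pointer in one step. */
static void buf_dst_drop(struct buf_sketch *b)
{
    dst_put(b->dst_private);
    b->dst_private = NULL;
}
```

Once all callers go through accessors like these, the underlying field can be renamed or re-encoded without touching them.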
     

26 May, 2009

1 commit

  • We would like to get rid of the netdev->trans_start = jiffies; assignment
    that nearly all net drivers have to use in their start_xmit() function,
    and use txq->trans_start instead.

    This can be done generically in core network, as suggested by David.

    Some devices (particularly loopback) don't need the trans_start update,
    because they don't have a transmit watchdog. We could add a new device
    flag, or rely on the fact that txq->trans_start can be updated if
    txq->xmit_lock_owner is different from -1. Use a helper function to hide
    our choice.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
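
The helper described above might look roughly like this in userspace form. The struct and function names are hypothetical, and the timestamp is passed in explicitly instead of being read from jiffies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-queue state: lock owner is -1 when the device
 * transmits without taking the per-queue xmit lock. */
struct txq_sketch {
    int xmit_lock_owner;
    unsigned long trans_start;
};

/* Record the transmit time only for queues that take the xmit lock;
 * lockless devices such as loopback are skipped. */
static void txq_trans_update_sketch(struct txq_sketch *q, unsigned long now)
{
    assert(q != NULL);
    if (q->xmit_lock_owner != -1)
        q->trans_start = now;
}
```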
     

20 May, 2009

1 commit

  • We can slightly reduce size of teqlN structure, not duplicating stats
    structure in teql_master but using stats field from net_device.stats
    for tx_errors and from netdev_queue for tx_bytes/tx_packets/tx_dropped
    values.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 May, 2009

1 commit

  • It is illegal to dereference a skb after a successful ndo_start_xmit()
    call. We must store skb length in a local variable instead.

    Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb
    (net_sched: Add accessor function for packet length for qdiscs)

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
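
The fix pattern, reading the length before handing the buffer over, can be sketched in userspace C. Here free() stands in for the driver consuming the buffer; the types and function names are illustrative assumptions, not the kernel's.

```c
#include <assert.h>
#include <stdlib.h>

struct packet { unsigned int len; };
struct tx_stats { unsigned long bytes, packets; };

/* Stand-in for a driver transmit routine: on success it owns the
 * packet and may free it at any time. Returns 0 on success. */
static int fake_start_xmit(struct packet *pkt)
{
    free(pkt);
    return 0;
}

/* Save the length first; pkt must not be touched after a
 * successful transmit call. */
static int xmit_and_account(struct packet *pkt, struct tx_stats *st)
{
    assert(pkt != NULL && st != NULL);

    unsigned int length = pkt->len; /* read before giving up ownership */

    if (fake_start_xmit(pkt) == 0) {
        st->bytes += length; /* uses the saved copy, not pkt->len */
        st->packets++;
        return 0;
    }
    return -1;
}
```

Reading pkt->len after the call would be a use-after-free in this model, which mirrors the bug the commit describes.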
     

07 Jan, 2009

1 commit


14 Nov, 2008

1 commit

  • After implementing qdisc->ops->peek() and changing sch_netem into
    classless qdisc there are no more qdisc->ops->requeue() users. This
    patch removes this method with its wrappers (qdisc_requeue()), and
    also unused qdisc->requeue structure. There are a few minor fixes of
    warnings (htb_enqueue()) and comments btw.

    The idea to kill ->requeue() and a similar patch were first developed
    by David S. Miller.

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

31 Oct, 2008

1 commit


30 Aug, 2008

1 commit

  • Use qdisc_root_sleeping_lock() instead of qdisc_root_lock() where
    appropriate. The only difference is while dev is deactivated, when
    currently we can use a sleeping qdisc with the lock of noop_qdisc.
    This shouldn't be dangerous since after deactivation root lock could
    be used only by gen_estimator code, but looks wrong anyway.

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

01 Aug, 2008

1 commit

  • When support for multiple TX queues was added, the
    netif_tx_lock() routines were converted to iterate over
    all TX queues and grab each queue's spinlock.

    This causes heartburn for lockdep and it's not a healthy
    thing to do with lots of TX queues anyways.

    So modify this to use a top-level lock and a "frozen"
    state for the individual TX queues.

    Signed-off-by: David S. Miller

    David S. Miller
     

20 Jul, 2008

1 commit


18 Jul, 2008

3 commits

  • We can simply use the qdisc->q.lock for all of the
    qdisc tree synchronization.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This effectively "flips the switch" by making the core networking
    and multiqueue-aware drivers use the new TX multiqueue structures.

    Non-multiqueue drivers need no changes. The interfaces they use such
    as netif_stop_queue() degenerate into an operation on TX queue zero.
    So everything "just works" for them.

    Code that really wants to do "X" to all TX queues now invokes a
    routine that does so, such as netif_tx_wake_all_queues(),
    netif_tx_stop_all_queues(), etc.

    pktgen and netpoll required a little bit more surgery than the others.

    In particular the pktgen changes, whilst functional, could be largely
    improved. The initial check in pktgen_xmit() will sometimes check the
    wrong queue, which is mostly harmless. The thing to do is probably to
    invoke fill_packet() earlier.

    The bulk of the netpoll changes is to make the code operate solely on
    the TX queue indicated by the SKB queue mapping.

    Setting of the SKB queue mapping is entirely confined inside of
    net/core/dev.c:dev_pick_tx(). If we end up needing any kind of
    special semantics (drops, for example) it will be implemented here.

    Finally, we now have a "real_num_tx_queues" which is where the driver
    indicates how many TX queues are actually active.

    With IGB changes from Jeff Kirsher.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • alloc_netdev_mq() now allocates an array of netdev_queue
    structures for TX, based upon the queue_count argument.

    Furthermore, all accesses to the TX queues are now vectored
    through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
    interfaces. This makes it easy to grep the tree for all
    things that want to get to a TX queue of a net device.

    Problem spots which are not really multiqueue aware yet, and
    only work with one queue, can easily be spotted by grepping
    for all netdev_get_tx_queue() calls that pass in a zero index.

    Signed-off-by: David S. Miller

    David S. Miller
     

09 Jul, 2008

3 commits


01 Feb, 2008

1 commit


29 Jan, 2008

1 commit

  • Convert packet schedulers to use the netlink API. Unfortunately a gradual
    conversion is not possible without breaking compilation in the middle or
    adding lots of casts, so this patch converts them all in one step. The
    patch has been mostly generated automatically with some minor edits to
    at least allow separate conversion of classifiers and actions.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

07 Nov, 2007

1 commit