23 Sep, 2016

1 commit


26 Jun, 2016

1 commit

  • Qdisc performance suffers when packets are dropped at enqueue()
    time because drops (kfree_skb()) are done while qdisc lock is held,
    delaying a dequeue() draining the queue.

    Nominal throughput can be reduced by 50 % when this happens,
    at a time we would like the dequeue() to proceed as fast as possible.

    Even FQ is vulnerable to this problem, although one of FQ's goals was
    to provide some flow isolation.

    This patch adds a 'struct sk_buff **to_free' parameter to every
    qdisc->enqueue() handler and to the qdisc_drop() helper.

    I measured a performance increase of up to 12 %, but this patch
    is a prereq so that future batches in enqueue() can fly.

    Signed-off-by: Eric Dumazet
    Acked-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Eric Dumazet
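
The deferred-free idea described above can be modelled in user space. This is a minimal Python sketch, not kernel code: `ToyQdisc`, `xmit` and `free_fn` are hypothetical names invented for the illustration. A drop at enqueue time only appends the packet to a caller-provided list while the lock is held, and the release happens after the lock is dropped.

```python
import threading


class ToyQdisc:
    """Hypothetical bounded FIFO standing in for a real qdisc."""

    def __init__(self, limit):
        self.limit = limit
        self.queue = []
        self.lock = threading.Lock()

    def enqueue(self, pkt, to_free):
        # Called with self.lock held. A drop only appends to to_free;
        # the potentially costly release is left to the caller.
        if len(self.queue) >= self.limit:
            to_free.append(pkt)
            return False
        self.queue.append(pkt)
        return True


def xmit(qdisc, pkt, free_fn):
    """Enqueue under the lock, then free dropped packets outside it."""
    to_free = []
    with qdisc.lock:
        accepted = qdisc.enqueue(pkt, to_free)
    # The lock is released here, so a concurrent dequeue is not delayed
    # by the work done in free_fn.
    for dropped in to_free:
        free_fn(dropped)
    return accepted
```

With a limit of 2, the third packet passed to `xmit()` is rejected and handed to `free_fn` only once the lock is no longer held.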
     

01 Mar, 2016

2 commits

  • When the bottom qdisc decides to, for example, drop some packet,
    it calls qdisc_tree_decrease_qlen() to update the queue length
    for all its ancestors. We need to update the backlog too, to
    keep the stats on the root qdisc accurate.

    Cc: Jamal Hadi Salim
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
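
A rough user-space model of the accounting problem described above, assuming a simple parent-pointer tree. `Node`, `enqueue_leaf` and `tree_reduce` are hypothetical names for this sketch and do not mirror the kernel data structures. When a leaf drops a packet it adjusts its own counters, then walks up the tree so that every ancestor loses both the packet count and the byte count.

```python
class Node:
    """Hypothetical qdisc node with aggregate counters."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.qlen = 0      # packets queued at or below this node
        self.backlog = 0   # bytes queued at or below this node


def enqueue_leaf(leaf, nbytes):
    """Account one packet of nbytes at the leaf and at every ancestor."""
    node = leaf
    while node is not None:
        node.qlen += 1
        node.backlog += nbytes
        node = node.parent


def tree_reduce(leaf, n_pkts, n_bytes):
    """Propagate a drop made by leaf to all its ancestors.

    Updating qlen alone would leave the ancestors' backlog too high.
    """
    node = leaf.parent
    while node is not None:
        node.qlen -= n_pkts
        node.backlog -= n_bytes
        node = node.parent
```

If only `qlen` were decremented in `tree_reduce()`, the root would keep reporting bytes for a packet that no longer exists.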
     
  • Remove nearly duplicated code and prepare for the following patch.

    Cc: Jamal Hadi Salim
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
     

28 Aug, 2015

1 commit

  • For classifiers getting invoked via tc_classify(), we always need an
    extra function call into tc_classify_compat(), as both are being
    exported as symbols and tc_classify() itself doesn't do much except
    handling of reclassifications when tp->classify() returned with
    TC_ACT_RECLASSIFY.

    CBQ and ATM are the only qdiscs that call tc_classify_compat()
    directly; all others use tc_classify(). When tc actions are
    configured out of the kernel, tc_classify() effectively does nothing
    besides delegating.

    We could spare this layer and consolidate both functions. With pktgen
    on a single CPU constantly pushing skbs directly into the
    netif_receive_skb() path, and a dummy classifier attached to the
    ingress qdisc, throughput improves slightly from 22.3Mpps to 23.1Mpps.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
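
The consolidation described above can be pictured as one function that both walks the filter chain and handles reclassification, instead of a wrapper delegating to a second function. The following Python sketch is only a model: filters are plain callables, the verdict constants and the restart bound `MAX_REC_LOOP` are assumptions made for the illustration, and none of it is the kernel implementation.

```python
# Illustrative verdict values; treat them as assumptions of this sketch.
TC_ACT_UNSPEC = -1
TC_ACT_OK = 0
TC_ACT_RECLASSIFY = 1
TC_ACT_SHOT = 2
MAX_REC_LOOP = 4  # assumed bound on restarts


def classify(pkt, filters):
    """Single entry point: run the chain, restart on RECLASSIFY."""
    restarts = 0
    while True:
        verdict = TC_ACT_UNSPEC
        for classify_fn in filters:
            verdict = classify_fn(pkt)
            if verdict >= 0:
                # First filter that matches decides the verdict.
                break
        if verdict != TC_ACT_RECLASSIFY:
            return verdict
        restarts += 1
        if restarts > MAX_REC_LOOP:
            # Too many restarts: give up and drop the packet.
            return TC_ACT_SHOT
```

A chain whose filters never match yields `TC_ACT_UNSPEC`, and a filter that keeps asking for reclassification ends in `TC_ACT_SHOT` once the bound is exceeded.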
     

19 Aug, 2015

1 commit

  • Those were all workarounds for the former double meaning of
    tx_queue_len, which broke scheduling algorithms if untreated.

    Now that all in-tree drivers have been converted away from setting
    tx_queue_len = 0, it should be safe to drop these workarounds for
    categorically broken setups.

    Signed-off-by: Phil Sutter
    Cc: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Phil Sutter
     

04 May, 2015

1 commit


30 Sep, 2014

1 commit


14 Sep, 2014

1 commit

  • rcu'ify tcf_proto. This allows calling tc_classify() without holding
    any locks. Updaters are protected by RTNL.

    This patch prepares the core net_sched infrastructure for running
    the classifier/action chains without holding the qdisc lock; however,
    it does nothing to ensure cls_xxx and act_xxx types also work without
    locking. Additional patches are required to address the fallout.

    Signed-off-by: John Fastabend
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    John Fastabend
     

15 Jan, 2014

1 commit


12 Jul, 2012

1 commit


02 Apr, 2012

1 commit


10 Feb, 2012

1 commit

  • Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside
    of other data structures.

    This is intended to be used by IPoIB so that it can remember
    addressing information stored at hard_header_ops->create() time that
    it can fetch when the packet gets to the transmit routine.

    Signed-off-by: David S. Miller

    David S. Miller
     

29 Nov, 2011

1 commit

  • Current SFB double hashing does not fulfil SFB theory if two flows
    share the same rxhash value.

    Using skb_flow_dissect() gives better hash dispersion and adds
    tunnelling support as well.

    The double hashing point was raised by Florian Westphal.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
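
Why a shared rxhash defeats double hashing can be shown with a small model. In this hedged Python sketch, BLAKE2b with a key stands in for the kernel hash, `toy_rxhash` is a deliberately weak hypothetical flow hash so that collisions are easy to construct, and `SEEDS` are made-up perturbation values. Hashing the rxhash twice cannot separate two flows that already collide, whereas hashing the dissected flow keys with each seed can.

```python
import hashlib
import struct

NUM_BINS = 16
SEEDS = (0x1234, 0xBEEF)  # hypothetical perturbations of the two hash functions


def seeded_hash(data, seed):
    """32-bit keyed hash; BLAKE2b is a stand-in, not the kernel's hash."""
    digest = hashlib.blake2b(
        data, digest_size=4, key=struct.pack("<I", seed)
    ).digest()
    return struct.unpack("<I", digest)[0]


def toy_rxhash(flow):
    """Deliberately weak flow hash: XOR of (src, dst, sport, dport)."""
    src, dst, sport, dport = flow
    return (src ^ dst ^ sport ^ dport) & 0xFFFFFFFF


def hashes_from_rxhash(flow):
    """Old scheme in this model: both hashes derive from one rxhash."""
    rx = struct.pack("<I", toy_rxhash(flow))
    return tuple(seeded_hash(rx, seed) for seed in SEEDS)


def hashes_from_keys(flow):
    """New scheme in this model: both hashes cover the full flow keys."""
    keys = struct.pack("<IIHH", *flow)
    return tuple(seeded_hash(keys, seed) for seed in SEEDS)


def bins(hashes):
    """Map each 32-bit hash to a bin index."""
    return tuple(h % NUM_BINS for h in hashes)
```

The flows `(1, 2, 3, 4)` and `(2, 1, 3, 4)` have the same `toy_rxhash`, so `hashes_from_rxhash()` returns identical tuples for both and they land in the same bins under either seed; `hashes_from_keys()` sees different input bytes for each flow.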
     

27 Aug, 2011

1 commit


24 Feb, 2011

1 commit

  • This is the Stochastic Fair Blue scheduler, based on work from:

    W. Feng, D. Kandlur, D. Saha, K. Shin. Blue: A New Class of Active Queue
    Management Algorithms. U. Michigan CSE-TR-387-99, April 1999.

    http://www.thefengs.com/wuchang/blue/CSE-TR-387-99.pdf

    This implementation is based on work done by Juliusz Chroboczek.

    General SFB algorithm can be found in figure 14, page 15:

    B[l][n] : L x N array of bins (L levels, N bins per level)
      enqueue()
      Calculate hash function values h{0}, h{1}, .. h{L-1}
      Update bins at each level
      for i = 0 to L - 1
         if (B[i][h{i}].qlen > bin_size)
            B[i][h{i}].p_mark += p_increment;
         else if (B[i][h{i}].qlen == 0)
            B[i][h{i}].p_mark -= p_decrement;
      p_min = min(B[0][h{0}].p_mark ... B[L-1][h{L-1}].p_mark);
      if (p_min == 1.0)
          ratelimit();
      else
          mark/drop with probability p_min;

    I adapted Juliusz's code to meet current kernel standards and made
    various changes to address previous comments:

    http://thread.gmane.org/gmane.linux.network/90225
    http://thread.gmane.org/gmane.linux.network/90375

    Default flow classifier is the rxhash introduced by RPS in 2.6.35, but
    we can use an external flow classifier if wanted.

    tc qdisc add dev $DEV parent 1:11 handle 11: \
        est 0.5sec 2sec sfb limit 128

    tc filter add dev $DEV protocol ip parent 11: handle 3 \
        flow hash keys dst divisor 1024

    Notes:

    1) The SFB default child qdisc is pfifo_fast. It can be replaced by
    another qdisc, but a child qdisc MUST NOT drop a packet previously
    queued. This is because SFB needs to handle a dequeued packet in order
    to maintain its virtual queue states. pfifo_head_drop or CHOKe should
    not be used.

    2) ECN is enabled by default, unlike RED/CHOKe/GRED.

    With help from Patrick McHardy & Andi Kleen

    Signed-off-by: Eric Dumazet
    CC: Juliusz Chroboczek
    CC: Stephen Hemminger
    CC: Patrick McHardy
    CC: Andi Kleen
    CC: John W. Linville
    Signed-off-by: David S. Miller

    Eric Dumazet
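
The enqueue pseudocode in the message above can be turned into a small runnable model. This Python sketch makes several assumptions: `SfbSketch` and its parameter defaults are invented for the illustration, the per-level bin indices h{0}..h{L-1} are supplied by the caller rather than computed from a packet, p_mark is clamped to the range 0..1, and the "mark/drop" branch is modelled as a drop. It is not the kernel implementation.

```python
import random


class SfbSketch:
    """Toy model of the bins B[l][n]; defaults are made up."""

    def __init__(self, levels=2, bins=4, bin_size=2,
                 p_increment=0.25, p_decrement=0.125, rng=None):
        self.bin_size = bin_size
        self.p_increment = p_increment
        self.p_decrement = p_decrement
        self.rng = rng if rng is not None else random.Random()
        self.qlen = [[0] * bins for _ in range(levels)]
        self.p_mark = [[0.0] * bins for _ in range(levels)]

    def enqueue(self, h):
        """h[i] is the packet's bin index at level i.

        Returns "ratelimit", "drop" or "queue".
        """
        # Update the marking probability of the bin hit at each level.
        for i, b in enumerate(h):
            if self.qlen[i][b] > self.bin_size:
                self.p_mark[i][b] = min(1.0, self.p_mark[i][b] + self.p_increment)
            elif self.qlen[i][b] == 0:
                self.p_mark[i][b] = max(0.0, self.p_mark[i][b] - self.p_decrement)
        p_min = min(self.p_mark[i][b] for i, b in enumerate(h))
        if p_min >= 1.0:
            return "ratelimit"
        if self.rng.random() < p_min:
            return "drop"
        for i, b in enumerate(h):
            self.qlen[i][b] += 1
        return "queue"

    def dequeue(self, h):
        """Account for a packet with bin indices h leaving the queue."""
        for i, b in enumerate(h):
            self.qlen[i][b] -= 1
```

Because the decision uses the minimum p_mark over all levels, a well-behaved flow that shares a bin with an unresponsive flow at one level is still accepted as long as its bin at another level is clean.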