24 Dec, 2011

1 commit

  • Add backlog (byte count) information in hfsc classes and qdisc, so that
    "tc -s" can report it to user, instead of 0 values :

    qdisc hfsc 1: root refcnt 6 default 20
    Sent 45141660 bytes 30545 pkt (dropped 0, overlimits 91751 requeues 0)
    rate 1492Kbit 126pps backlog 103226b 74p requeues 0
    ...
    class hfsc 1:20 parent 1:1 leaf 1201: rt m1 0bit d 0us m2 400000bit ls m1 0bit d 0us m2 200000bit
    Sent 49534912 bytes 33519 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 81822b 56p requeues 0
    period 23 work 49451576 bytes rtwork 13277552 bytes level 0
    ...

    Signed-off-by: Eric Dumazet
    CC: John A. Sullivan III
    Signed-off-by: David S. Miller

    Eric Dumazet
     

25 Jan, 2011

1 commit


21 Jan, 2011

2 commits

  • In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed
    a problem with pfifo_head drops that incorrectly decreased
    sch->bstats.bytes and sch->bstats.packets

    Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
    previously enqueued packet, and bstats cannot be changed, so
    bstats/rates are not accurate (over estimated)

    This patch changes the qdisc_bstats updates to be done at dequeue() time
    instead of enqueue() time. bstats counters no longer account for dropped
    frames, and rates are more correct, since enqueue() bursts dont have
    effect on dequeue() rate.

    Signed-off-by: Eric Dumazet
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • In commit 371121057607e (net: QDISC_STATE_RUNNING dont need atomic bit
    ops) I moved QDISC_STATE_RUNNING flag to __state container, located in
    the cache line containing qdisc lock and often dirtied fields.

    I now move TCQ_F_THROTTLED bit too, so that we let first cache line read
    mostly, and shared by all cpus. This should speedup HTB/CBQ for example.

    Not using test_bit()/__clear_bit()/__test_and_set_bit allows to use an
    "unsigned int" for __state container, reducing by 8 bytes Qdisc size.

    Introduce helpers to hide implementation details.

    Signed-off-by: Eric Dumazet
    CC: Patrick McHardy
    CC: Jesper Dangaard Brouer
    CC: Jarek Poplawski
    CC: Jamal Hadi Salim
    CC: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     

20 Jan, 2011

1 commit


11 Jan, 2011

1 commit

  • HTB takes into account skb is segmented in stats updates.
    Generalize this to all schedulers.

    They should use qdisc_bstats_update() helper instead of manipulating
    bstats.bytes and bstats.packets

    Add bstats_update() helper too for classes that use
    gnet_stats_basic_packed fields.

    Note : Right now, TCQ_F_CAN_BYPASS shortcurt can be taken only if no
    stab is setup on qdisc.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Oct, 2010

1 commit


02 Sep, 2010

1 commit


18 May, 2010

2 commits

  • This patch removes from net/ (but not any netfilter files)
    all the unnecessary return; statements that precede the
    last closing brace of void functions.

    It does not remove the returns that are immediately
    preceded by a label as gcc doesn't like that.

    Done via:
    $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
    xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • When attaching filters to a class pointing to a class higher up in the
    hierarchy, classification may enter an endless loop. Currently this is
    prevented for filters that are already resolved, but not for filters
    resolved at runtime.

    Only allow filters to point downwards in the hierarchy, similar to what
    CBQ does.

    Reported-by: Pawel Staszewski
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

07 Oct, 2009

1 commit

  • Jarek Poplawski a écrit :
    >
    >
    > Hmm... So you made me to do some "real" work here, and guess what?:
    > there is one serious checkpatch warning! ;-) Plus, this new parameter
    > should be added to the function description. Otherwise:
    > Signed-off-by: Jarek Poplawski
    >
    > Thanks,
    > Jarek P.
    >
    > PS: I guess full "Don't" would show we really mean it...

    Okay :) Here is the last round, before the night !

    Thanks again

    [RFC] pkt_sched: gen_estimator: Don't report fake rate estimators

    We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
    is running.

    # tc -s -d qdisc
    qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
    rate 0bit 0pps backlog 0b 0p requeues 0

    User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
    one (because no estimator is active)

    After this patch, tc command output is :
    $ tc -s -d qdisc
    qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0

    We add a parameter to gnet_stats_copy_rate_est() function so that
    it can use gen_estimator_active(bstats, r), as suggested by Jarek.

    This parameter can be NULL if check is not necessary, (htb for
    example has a mandatory rate estimator)

    Signed-off-by: Eric Dumazet
    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Sep, 2009

1 commit


06 Sep, 2009

1 commit


18 Aug, 2009

1 commit

  • In 5e140dfc1fe87eae27846f193086724806b33c7d "net: reorder struct Qdisc
    for better SMP performance" the definition of struct gnet_stats_basic
    changed incompatibly, as copies of this struct are shipped to
    userland via netlink.

    Restoring old behavior is not welcome, for performance reason.

    Fix is to use a private structure for kernel, and
    teach gnet_stats_copy_basic() to convert from kernel to user land,
    using legacy structure (struct gnet_stats_basic)

    Based on a report and initial patch from Michael Spang.

    Reported-by: Michael Spang
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 Jun, 2009

1 commit

  • Use PSCHED_SHIFT constant instead of '10' in PSCHED_US2NS() and
    PSCHED_NS2US() macros to enable changing this value later.

    Additionally use PSCHED_SHIFT in sch_hfsc SM_SHIFT and ISM_SHIFT
    definitions. This part of the patch is based on feedback from
    Patrick McHardy .

    Reported-by: Antonio Almeida
    Tested-by: Antonio Almeida
    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

16 Mar, 2009

1 commit

  • While looking for a possible reason of bugzilla report on HTB oops:
    http://bugzilla.kernel.org/show_bug.cgi?id=12858
    I found the code in htb_delete calling htb_destroy_class on zero
    refcount is very misleading: it can suggest this is a common path, and
    destroy is called under sch_tree_lock. Actually, this can never happen
    like this because before deletion cops->get() is done, and after
    delete a class is still used by tclass_notify. The class destroy is
    always called from cops->put(), so without sch_tree_lock.

    This doesn't mean much now (since 2.6.27) because all vulnerable calls
    were moved from htb_destroy_class to htb_delete, but there was a bug
    in older kernels. The same change is done for other classful scheds,
    which, it seems, didn't have similar locking problems here.

    Reported-by: m0sia
    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

01 Feb, 2009

1 commit


26 Nov, 2008

2 commits


20 Nov, 2008

1 commit


14 Nov, 2008

1 commit

  • After implementing qdisc->ops->peek() and changing sch_netem into
    classless qdisc there are no more qdisc->ops->requeue() users. This
    patch removes this method with its wrappers (qdisc_requeue()), and
    also unused qdisc->requeue structure. There are a few minor fixes of
    warnings (htb_enqueue()) and comments btw.

    The idea to kill ->requeue() and a similar patch were first developed
    by David S. Miller.

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

31 Oct, 2008

2 commits

  • This patch adds qdisc_peek_dequeued() wrapper to emulate peek method
    with qdisc->dequeue() and storing "peeked" skb in qdisc->gso_skb until
    dequeuing. This is mainly for compatibility reasons not to break some
    strange configs because peeking is expected for non-work-conserving
    parent qdiscs to query work-conserving child qdiscs.

    This implementation requires using qdisc_dequeue_peeked() wrapper
    instead of directly calling qdisc->dequeue() for all qdiscs ever
    querried with qdisc->ops->peek() or qdisc_peek_dequeued().

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     
  • Use qdisc->ops->peek() instead of ->dequeue() & ->requeue() pair.
    After this patch the only remaining user of qdisc->ops->requeue() is
    netem_enqueue(). Based on ideas of Herbert Xu, Patrick McHardy and
    David S. Miller.

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

27 Aug, 2008

1 commit

  • While passing a qdisc root lock to gen_new_estimator() and
    gen_replace_estimator() dev could be deactivated or even before
    grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create), so
    using qdisc_root_lock() is not enough. This patch adds
    qdisc_root_sleeping_lock() for this, plus additional checks, where
    necessary.

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

05 Aug, 2008

2 commits

  • Patrick McHardy noticed that it would be nice to
    handle NET_XMIT_BYPASS by NET_XMIT_SUCCESS with an internal qdisc flag
    __NET_XMIT_BYPASS and to remove the mapping from dev_queue_xmit().

    David Miller spotted a serious bug in the first
    version of this patch.

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     
  • Patrick McHardy noticed:
    "The other problem that affects all qdiscs supporting actions is
    TC_ACT_QUEUED/TC_ACT_STOLEN getting mapped to NET_XMIT_SUCCESS
    even though the packet is not queued, corrupting upper qdiscs'
    qlen counters."

    and later explained:
    "The reason why it translates it at all seems to be to not increase
    the drops counter. Within a single qdisc this could be avoided by
    other means easily, upper qdiscs would still increase the counter
    when we return anything besides NET_XMIT_SUCCESS though.

    This means we need a new NET_XMIT return value to indicate this to
    the upper qdiscs. So I'd suggest to introduce NET_XMIT_STOLEN,
    return that to upper qdiscs and translate it to NET_XMIT_SUCCESS
    in dev_queue_xmit, similar to NET_XMIT_BYPASS."

    David Miller noticed:
    "Maybe these NET_XMIT_* values being passed around should be a set of
    bits. They could be composed of base meanings, combined with specific
    attributes.

    So you could say "NET_XMIT_DROP | __NET_XMIT_NO_DROP_COUNT"

    The attributes get masked out by the top-level ->enqueue() caller,
    such that the base meanings are the only thing that make their
    way up into the stack. If it's only about communication within the
    qdisc tree, let's simply code it that way."

    This patch is trying to realize these ideas.

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

20 Jul, 2008

2 commits


18 Jul, 2008

1 commit

  • When code wants to lock the qdisc tree state, the logic
    operation it's doing is locking the top-level qdisc that
    sits of the root of the netdev_queue.

    Add qdisc_root_lock() to represent this and convert the
    easiest cases.

    In order for this to work out in all cases, we have to
    hook up the noop_qdisc to a dummy netdev_queue.

    Signed-off-by: David S. Miller

    David S. Miller
     

09 Jul, 2008

3 commits

  • The lock is now an attribute of the device queue.

    One thing to notice is that "suspicious" places
    emerge which will need specific training about
    multiple queue handling. They are so marked with
    explicit "netdev->rx_queue" and "netdev->tx_queue"
    references.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • It can be obtained via the netdev_queue. So create a helper routine,
    qdisc_dev(), to make the transformations nicer looking.

    Now, qdisc_alloc() now no longer needs a net_device pointer argument.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • A netdev_queue is an entity managed by a qdisc.

    Currently there is one RX and one TX queue, and a netdev_queue merely
    contains a backpointer to the net_device.

    The Qdisc struct is augmented with a netdev_queue pointer as well.

    Eventually the 'dev' Qdisc member will go away and we will have the
    resulting hierarchy:

    net_device --> netdev_queue --> Qdisc

    Also, qdisc_alloc() and qdisc_create_dflt() now take a netdev_queue
    pointer argument.

    Signed-off-by: David S. Miller

    David S. Miller
     

06 Jul, 2008

1 commit


02 Jul, 2008

2 commits


04 Jun, 2008

1 commit

  • Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and
    nla_nest_cancel() void functions.

    Return -EMSGSIZE instead of -1 if the provided message buffer is not
    big enough.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     

29 Jan, 2008

4 commits